zeeshanlakhani · July 13, 2015 18:26 · Jul 13, 2015
diff --git a/eqc_statem.md b/eqc_statem.md
@@ -0,0 +1,362 @@
+## Testing Stateful Code
+
+So far, all the properties we have written have tested stateless code. Stateless
+code is made up of [pure functions](http://en.wikipedia.org/wiki/Pure_function)
+and is inherently easier to test than stateful code with side effects. The chief
+problem with testing stateful code is that the input to output mapping depends
+on the current state of the program. Previous operations can cause the same
+function to return different output given the same input. Therefore, in order to
+test stateful code, our tests must maintain some state of their own. This state
+is known as the **model state** and is updated as part of the testing process.
+In order to validate the stateful code under test, the model state is compared
+to the output of the stateful functions. Additionally, since most stateful code
+in Erlang is process state, we can query the internal state and compare it
+directly to the model state. This latter technique, while useful, comes with
+some caveats regarding symbolic and dynamic state, as described below.
+
+## Symbolic and Dynamic State
+```eqc_statem``` generated test cases are lists of **symbolic commands**.
+Symbolic commands represent functions that are going to be called during the
+test run, as well as the output of running those functions. We say that
+**symbolic variables** are **bound** to the result of **symbolic functions**.
+Symbolic commands are represented in Erlang as the following tuple where ``Var``
+is the symbolic binding of the result of the symbolic function ``Call``.
+
+```erlang
+{set, Var, Call}
+```
+
+An example of an actual symbolic call looks like the following:
+
+```erlang
+{set, {var, 1}, {call, erlang, whereis, [a]}}
+```
+
+As you can see the symbolic function call is just a tuple tagged with ``call``
+and then followed by the actual module, function, and argument parameters of the
+function to be called. The result of this function call will be stored in the
+first variable that eqc is maintaining to keep track of the test run. When the
+test is run, eqc will call the function represented by the symbolic call and
+replace the symbolic variable with the output of the function. The resulting
+function call and output are called **dynamic** values. It's important to keep
+in mind that symbolic values are generated during test **generation** and
+dynamic values are computed during test **execution**.
+
+Figuring out when to use symbolic values and dynamic values in your eqc_statem
+callbacks is tricky and at times maddening. This, of course, begs the question
+as to why we need to worry about symbolic values in the first place. Quoting the
+eqc_statem documentation: 
+
+> The reason we use symbolic representations (rather than just working with
+> their dynamic values) is to enable us to display, save, analyze, and above all
+> simplify test cases before they are run.
+
+The rest of this tutorial will attempt to steer you in the right direction and
+minimize the pain of dealing with symbolic calls.
+
+## Modelling State
+
+In order to keep track of our model state, we need to store it in an erlang
+term. Typically, model state is held inside a **state** record just like the
+process state record typically used in ``gen_xxx`` servers. Only state relevant
+to the tests needs to be included in this record. It is unnecessary to include
+extraneous state. In other words, the state does not need to match the process
+state record of the system under test. It is typical, that as your statem tests
+grow to cover more of the system, that this model state record will include new
+fields to cover the new tests.
+
+During a statem test you define the initial model state of the system and then
+the resulting state from executing a given command with the current state. The
+tricky part is that the state must be mutated during both test generation and
+execution. Therefore the state must be able to represent both symbolic and
+dynamic values. The key to this is to not inspect the mutated state during test
+generation, as the values don't exist yet. For example, if you were spawning
+processes as part of your test, you would end up with symbolic state that looks
+like this: 
+
+```erlang
+[{var,1}, {var,2}, {var,3}]
+```
+
+During test execution the dynamic state will be computed, and these variables
+will be replaced with the actual runtime pids. Below, the eqc_statem behaviour
+will be described and this will be elaborated on.
+
+## The eqc\_statem behaviour
+
+We've talked a bit about symbolic versus dynamic values, and described why you
+must maintain model state in order to test stateful code. However, we haven't
+yet begun to examine how you actually write tests using ```eqc_statem```. It
+turns out that in addition to providing helper functions, eqc_statem also
+provides an OTP behaviour that implements the generic part of the stateful test
+process. You must only write the callbacks required by this behaviour in order
+to implement your tests. There are also optional callbacks that allow further
+test customization. These callbacks will be described in the following
+subsections.
+
+#### initial\_state/0
+
+```erlang
+-spec initial_state() -> symbolic_state().
+```
+
+This callback function returns the **symbolic state** in which each test case
+starts. This symbolic state should not assume anything about the runtime of the
+system, but otherwise it's just a normal erlang term.
+
+```erlang
+initial_state() ->
+    #state{pids=[]}.
+```
+
+#### precondition/2
+
+```erlang
+-spec precondition(symbolic_state(), symbolic_call()) -> bool().
+```
+
+``precondition/2`` gets called during test generation and thus only deals with
+symbolic values. It is a predicate function that determines if a given command
+should be included in test cases. In general I've found it easier to start my
+tests without preconditions and add them in after I get them working initially.
+Buggy preconditions are a common cause of error, and this isn't always obvious.
+You can see an example of a buggy precondition
+[here](http://www.trapexit.org/BuggyPrecondition).
+
+```erlang
+%% Only allow commands that query running pids
+precondition(S#state{running=RunningPids}, {call, cache_server, query, [Pid]}) ->
+    lists:member(Pid, Running).
+```
+
+#### next\_state/3
+
+```erlang
+-spec next_state(symbolic_state(), symbolic_result(), symbolic_call()) ->
+          symbolic_state();
+      next_state(dynamic_state(), dynamic_result(), symbolic_call_with_dynamic_args()) ->
+          dynamic_state().
+```
+
+The eqc_statem behaviour is essentially an abstract state machine. You've
+already set the initial model state with ``initial_state/0``. This function
+computes the next state given the current state and the function to be called.
+It is very important to note that this callback operates during both **test
+generation and test execution**, and therefore operates on both **symbolic and
+dynamic values**. The trick to this is, once again, not to inspect the values
+inside the symbolic variables. Quoting from the eqc_statem docs:
+
+> A correctly written next_state function does not inspect symbolic inputs--it
+> just includes them as symbolic parts of the result. Thus the same code can
+> construct a dynamic state of the same shape, given actual values instead of
+> symbolic ones.
+
+```erlang
+%% For both symbolic and dynamic states, start a new cache_server and add the
+%% returned pid to the list of running pids maintained by the model state.
+%% The trick here is not to inspect NewPid, which is a symbolic result. 
+next_state(#state{running=Pids}=S, NewPid, {call, cache_server, start_link, _})
+   -> S#state{running=[NewPid | Pids]}.
+```
+
+The above state transition function wisely avoids inspecting symbolic results
+and parameters. However, this presents a challenge. What if the function being
+called returns ``{ok, Pid}`` or ``{error, Reason}`` instead of just a value or
+exception?  Your ``next_state/3`` function can't inspect those values symbolically.
+There are a few options, but the simplest is to anticpate what the model state
+should look like based on the arguments to the command, and the current state as
+shown below:
+
+```erlang
+%% Ignore the result, since the arguments tell us all we need to know
+%% Don't ever allow Current > Max.
+next_state(#state{max=Max, current=Current}=S, _, {call, _, increment, N}) ->
+    Val = Current + N,
+    case Val > Max of
+        true ->
+            S#state{current=Max};
+        false ->
+            S#state{current=Val}
+    end.
+```
+
+In some cases however, you actually want your tests to handle a success, error,
+or other variable response, because you actually have no way to anticipate the
+future state from the function arguments. You don't want your tests to fail just
+because you anticipated the wrong response randomly. In this case what you want
+to do is actually save the symbolic result in the returned model state. Then on
+the subsequent call to ``command/1`` you take the symbolic result from the model
+state and pass it as the first parameter to the next call. Typically, you'd wrap
+the calls in a local function that checks the state rather than calling the
+actual function under test directly. The example below should help clarify this.
+
+```erlang
+next_state(S, Res, {call, _, non_deterministic_fun, _}) ->
+    S#state{prev_result=Res}.
+
+command(S=#state{prev_result=Res}) ->
+    {call, ?MODULE, maybe_do_something, [Res, some_other_arg]}.
+
+%% Called in postcondition
+maybe_do_something({ok, _}, Arg2) ->
+    some_module:do_something(Arg2);
+maybe_do_something({error, _}, _) ->
+    true.
+```
+
+Finally, we have the case where you need to use the symbolic result as part of
+the model state during the test execution phase. Adding extra state variables to
+keep track of past operations gets unwieldy quickly, and in fact it doesn't work
+for everything. In cases like this you actually want to compute the model state from
+the result during test execution and then compare it to the actual system
+state via `invariant/1`.  In these cases the solution is actually to embed a
+symbolic call in your state output. The symbolic call will be translated and
+executed during test execution and acts like a normal function call at that
+time. This doesn't impact test generation and allows much more useful tests
+during execution. You can do that with code like the following:
+
+```erlang
+   next_state(S=#state{leader=L}, Result, {call, _, some_distributed_op, _}) ->
+       S#state{leader={call, ?MODULE, maybe_change_leader, [L, Result]}}. 
+
+    maybe_change_leader(Leader, {ok, _}) ->
+        Leader;
+    maybe_change_leader(_Leader, {error, {redirect, NewLeader}}) ->
+        NewLeader;
+    maybe_change_leader(_Leader, {error, _}) ->
+        unknown.
+```
+
+It's interesting to note that even though the function call is symbolic during
+test generation, the value stored in the argument named N is just an erlang
+term. While it's possible to have symbolic values inside the arguments of a
+symbolic call, it is rare. Typically the argument is either hardcoded or part of
+a generator output and is a regular erlang term. Therefore it can be calculated
+and inspected during test generation and execution, unlike symbolic results
+which can't be computed until the test is actually run. In this case N is just
+an integer, and our code works just fine.
+
+#### postcondition/3
+
+```erlang
+-spec postcondition(dynamic_state(), symbolic_call_with_dynamic_args(),
+    dynamic_result()) -> bool().
+```
+
+```postcondition/3``` is a predicate called during test execution with the
+dynamic state **before** the call. First and foremost, this is where the actual
+results of the function calls are checked against the model state. Since this
+only happens during runtime, ``postcondition/3`` is typically easier to write
+and debug than ``precondition/2`` and ``next_state/3``.  Additionally, it is
+often useful to query parts of the system in post conditions, such as process
+state and assert that the 'real world' is behaving as you expect. However, this
+definitely should not be the main goal of your tests.
+
+```erlang
+postcondition(#state{running=Peers, nodes=N}, {call, _, get_count, []}, {error, no_quorum}) ->
+    length(Peers) <= N/2;
+postcondition(#state{running=Peers, nodes=N, current=Current}, {call, _, get_count, []}, {ok, Result}) ->
+    length(Peers) > N/2 andalso Current =:= Result.
+```
+
+#### command/1
+
+```erlang
+-spec command(symbolic_state()) -> symbolic_call().
+```
+
+We've seen how to ensure valid commands are executed, implement model state
+transitions and assert that our code is correct. However, we have not yet seen
+how to tell eqc_statem what commands it should attempt to run during test cases.
+This is the purpose of ``command/1``. Test sequences are generated by repeatedly calling
+command.
+
+```erlang
+oneof([{call, cache_server, get_count, []},
+       {call, cache_server, query, [count, "<=", int()]}]).
+```
+
+#### invariant/1 (optional)
+
+```erlang
+-spec invariant(dynamic_state()) -> bool.
+```
+This is the only optional callback that will be covered in this tutorial.
+``invariant/1`` is only called during test execution after each command is
+executed with the result state of the call. It can be used to insure invariants
+in the model state are kept, such as *count is always non-negative*. It's
+other main use is to compare the model state to the actual state of the system,
+instead of doing that in postconditions.
+
+```erlang
+invariant(#state{count=Count}) when Count >= 0 ->
+    true;
+invariant(_) ->
+    false.
+```
+
+## Implementing eqc\_statem tests
+
+In order to begin writing your eqc_statem callback module you need to include a
+small amount of boilerplate.
+
+```erlang
+-include_lib("eqc/include/eqc.hrl").
+-include_lib("eqc/include/eqc_statem.hrl").
+
+-behaviour(eqc_statem).
+```
+
+Just like with other eqc tests, you must also define a property to actually run
+your tests. This property should live in the callback module, and looks similar
+to most other eqc properties.
+
+```erlang
+prop_do_something() ->
+    ?FORALL(Cmds, commands(?MODULE),
+        aggregate(command_names(Cmds),
+            begin
+                test_specific_setup(),
+                {H, S, Res} = run_commands(?MODULE, Cmds),
+                eqc_statem:pretty_commands(?MODULE, 
+                                           Cmds,
+                                           {H, S, Res},
+                                           test_specific_teardown(S))
+            end)).
+```
+
+In lesson 3 we saw the property creator ``?FORALL``. This instance of
+``?FORALL`` binds Cmds to the commands generated by the abstract state machine
+described in the callback module, ``?MODULE``. 
+
+The ``aggregate/2`` function is a passthrough property that collects a list of
+values in each test and prints their distribution at the end of the test run. It
+returns identical results of the expression or property as it's second argument,
+except for the side effect of collecting and printing the distribution of
+commands.
+
+In most tests, you want to do specific setup and teardown from within the
+property itself. The reason is that these properties get run more than once by
+quickcheck, and the initial state needs to be reset each time. There are ways
+around this that allow seeding a state in ``commands/2`` instead of using the
+``initial_state/0`` function of the module, but that is not covered in this
+tutorial.
+
+In between the specific setup and teardown functions is the actual test run
+itself, ``run_commands/2``. It returns a 3-tuple of the dynamic history of the
+test run, the final state, and the reason execution stopped. Using
+``eqc_statem:pretty_commands/4`` you can pretty print the commands and results
+of your test on failure.  This is much more effective than using the `?WHENFAIL`
+macro. The last parameter is the remainder of your property.
+
+That's all there is to eqc_statem. However, we still need a way to actually run
+the property. For this we use the ``quickcheck/1`` function. ``quickcheck/1`` is
+not specific to eqc_statem and will run any property 100 times.
+
+```erlang
+quickcheck(prop_do_something()).
+```
+
+Those are the basic steps to writing an eqc_statem test. There are many more
+functions available in quickcheck to help (and complicate) your tests. Explore
+the documentation in the install and enjoy!
No results found