Skip to content
Christoph Matheja edited this page Apr 19, 2018 · 2 revisions

This page demonstrates Attestor's state space generation by means of a small example using singly-linked lists (SLL). In particular, we cover all necessary inputs to execute the analysis and explore the generated state space step by step.

Prerequisites

Theory

We assume a basic understanding of concepts from model-checking and static program analysis. Moreover, we recommend to have a look at the architecture page and the glossary to clarify concepts and terminology. Further details are found in the publications section.

Graphical Notation

We will often graphically illustrate the heap configurations occurring within program states. Hence, it is recommended to get accustomed with our graphical notation first.

Further information on using the graphical user interface are found in the manual.

State Space Exploration

The state space obtained for our example can be graphically explored online. We recommend to follow the analysis described below while retracing the individual states in the graphical user interface.

Try it yourself!

We provide an archive containing a binary of Attestor and all example files considered on this page here. Please notice that Attestor requires Java JDK 1.8 in order to run (cf. system requirements). Moreover, please make sure that JAVA_HOME is set correctly. Further information on installing other versions of Attestor are found here.

After unpacking the archive with examples, each individual example can be executed as follows:

java -jar attestor.jar -l FILE.attestor

where FILE.attestor specifies the example that should be executed. The graphical user interface generated for the provided example file is located in a directory FILE/ after execution.

Overview

This section serves as a reference for the individual steps considered in our example. We recommend to read the first to sections of the setup section to get accustomed with the data structure and the source code that is analyzed. After that, feel free to sections and come back whenever you want to improve your understanding of some components of the analysis.

Setup

In this section we discuss all inputs passed to Attestor:

The Analysis

In this section, we explore the state space generated by Attestor and discuss a few steps in detail.

Initial setting

Step-by-step analysis

We first provide an overview of the full state space generated by Attestor. After that, we discuss individual steps:

Next steps

This section discusses a few modifications of our running example to explore more features of Attestor.

Setup

In this section, we briefly discuss the data structure that is analyzed throughout the example. Furthermore, we consider all inputs passed to Attestor to perform the analysis.

Data structure

In our running example, we consider program manipulating singly-linked lists (SLL). We will assume the following class definition of an SLL in Java:

	public class SLList {
		public SLList next;
		
		public SLList( SLList next ){
			this.next = next;
		}
	}

As explained on the page on graphical notation, we model the heap as a graph, where objects are represented by nodes and pointers, for example x.next, are represented by edges between the source object of a pointer and its target object.

In the above case, instances of class SLL will be represented by nodes and outgoing edges of such nodes correspond to class variable next. The source of an edge labeled with next is thus the object whose class variable is considered. Its target is the value of class variable edge for its source object.

An exemplary list is depicted below, where the shaded node represents the special location null, i.e. the last element of the list points to null.

singly-linked list

back to overview

Code

Throughout our example analysis, we consider the following Java class:

public class SLList {
    public SLList next;

    public SLList( SLList next ){
         this.next = next;
    }

    public static SLList prependSLList( SLList tail ){
         SLList first = new SLList( tail );
         SLList curr = first;
         for( int i = 0; i < 10; i++ ){ //for Attestor this is non-deterministic
               SLList tmp = new SLList( null ); //for demonstration: setting next to null first
               curr.next = tmp;
               curr = tmp;
         }
         return first;
    }
    
    // this is just a dummy such that the file compiles
    public static void main(String[] args) {
        prependSLList( null );
    }
} 

Our goal is to analyze the method prependSLList. It takes a singly linked list as input and prepends a number of new elements to it, starting with the new head of the list. So for example, if we start with the list [a,b,c], in the first iteration we would get [1,a,b,c] in the second [1,2,a,b,c], etc. Notice that Attestor does not account for arithmetic (yet). Hence, the guard i < 10 in the above loop will be interpreted nondeterministically.

We have to pass the classpath, the class name and the method to analyze to Attestor as follows (assuming the above class SLList.java is located in directory code):

--classpath "code"
--class "SLList"
--method "prependSLList"

back to overview

Grammar

Attestor requires a graph grammar to guide abstraction and concretization. The grammar used in our example describes all singly-linked list segments and is defined as follows:

grammar

Here, the third rule does not increase expressiveness, i.e. all graphs generated by the grammar can also be generated without the third rule. However, we need the third rule to obtain a backward confluent grammar.

Although Attestor has a built-in graph grammar for singly-linked lists, we define our own grammar for educational purposes. The grammar is then passed to Attestor as follows:

--grammar "grammar/SLList.json"

The full grammar file specifying the above grammar is found below. Further information about writing customized graph grammars are found on the corresponding page.

[
  {
    "nonterminal": "SLL",
    "rank": 2,
    "rules": [
      {
        "nodes": [
          {
            "type": "SLList",
            "number": 2
          }
        ],
        "externals": [
          0,
          1
        ],
        "variables": [],
        "selectors": [
          {
            "label": "next",
            "origin": 0,
            "target": 1
          }
        ],
        "hyperedges": []
      },
      {
        "nodes": [
          {
            "type": "SLList",
            "number": 3
          }
        ],
        "externals": [
          0,
          2
        ],
        "variables": [],
        "selectors": [
          {
            "label": "next",
            "origin": 0,
            "target": 1
          }
        ],
        "hyperedges": [
          {
            "label": "SLL",
            "tentacles": [
              1,
              2
            ]
          }
        ]
      },
      {
        "nodes": [
          {
            "type": "SLList",
            "number": 3
          }
        ],
        "externals": [
          0,
          2
        ],
        "variables": [],
        "selectors": [],
        "hyperedges": [
          {
            "label": "SLL",
            "tentacles": [
              0,
              1
            ]
          },
          {
            "label": "SLL",
            "tentacles": [
              1,
              2
            ]
          }
        ]
      }
    ]
  }
]

back to overview

Custom initial state

Since the method considered in our example takes a (relevant) parameter, we provide a custom initial state that determines the (abstract) value of this parameter. In our case, we would like the parameter to initially point to the head of an arbitrary singly-linked list. This is achieved by specifying a heap configuration where one node has an attached variable that is labeled with a parameter reference.

The heap configuration used in our example is illustrated below.

initial state

To pass such a heap configuration to Attestor, we use the option

--initial "initial/list.json"

where list.json is a description of the above heap configuration in JSON. The full specification of the initial state is provided below. Further information on writing custom initial states are found on the corresponding page.

{
    "nodes":[
        {
            "type":"NULL",
            "number":1
        },{
            "type":"SLList",
            "number":1
        }
    ],
    "externals":[],
    "variables":[
        {
            "name":"null",
            "target":0
        },{
            "name":"@parameter0:",
            "target":1
        }
    ],
    "selectors":[],
    "hyperedges":[
        {
            "label":"SLL",
            "tentacles":[1,0]
        }
    ]

} 

back to overview

Running Attestor

Putting all input files together, we end up with the following command-line options passed to Attestor:

--description "tour example 'prepend SLL'"
--export "example"
--classpath "code"
--class "SLList"
--method "prependSLList"
--grammar "grammar/SLList.json"
--initial "initial/list.json" 

Here, --description is optional and just provides a short description of the task that is, for example, also passed to the graphical user interface. Furthermore, --export tells Attestor to export a report that enables graphical state space exploration.

The remaining options have been discussed in previous sections and serve to pass the class and method to analyze to Attestor together with the grammar and initial state used within the analysis. A complete reference of all command-line options of Attestor is found here.

We can either pass these options directly to Attestor or store them in a settings file, say example.attestor, that is then fed into Attestor using the --load option:

Try it yourself!
java -jar attestor -l example.attestor

The console output should look as follows:

java -jar attestor.jar -l example.attestor
[Version]  attestor - version 0.3.8
           Analyzed method: SLList.prependSLList
           Scenario: tour example 'prepend SLL'
           +-------------------------+------------------+
[Summary]  | Generated states        | Number of states |
           +-------------------------+------------------+
           | w/ procedure calls      |               56 |
           | w/o procedure calls     |               38 |
           | final states            |                1 |
           +-------------------------+------------------+
[Summary]  Exports:
           Report exported to example
           +-----------------------------+--------------+
[Summary]  | Phase                       | Runtime      |
           +-----------------------------+--------------+
           | Command Line Interface      |      0.011 s |
           | Parse program               |      0.603 s |
           | Parse grammar               |      0.020 s |
           | Parse input                 |      0.001 s |
           | Parse contracts             |      0.000 s |
           | Marking generation          |      0.001 s |
           | Grammar Refinement          |      0.000 s |
           | Abstraction preprocessing   |      0.101 s |
           | Interprocedural Analysis    |      0.057 s |
           | Model checking              |      0.000 s |
           | Counterexample generation   |      0.000 s |
           | Report generation           |      0.065 s |
           +-----------------------------+--------------+
           | Total verification time     |      0.058 s |
           | Total runtime               |      0.861 s |
           +-----------------------------+--------------+

Moreover, the state space has been exported for graphical exploration to directory example/.

back to overview

The Analysis

In the remainder of this page, we discuss the results and individual steps performed by Attestor for our example setup.

Jimple

The program semantics of Attestor is defined for Jimple programs, an intermediate language for Java that unifies several Bytecode statements and hence requires less semantics transformers. The translation from Java (Bytecode) programs is performed using soot. When exploring a state space generated by Attestor, the statement of each state thus corresponds to a Jimple statement.

For instance, the Jimple code of procedure prependSLList is found below.

public static SLList prependSLList(SLList){
	SLList tail, temp$0, curr, temp$1;
	int i, temp$3;

1	tail := @parameter0: SLList;
2	temp$0 = new SLList;
3	specialinvoke temp$0.<SLList: void <init>(SLList)>(tail);
4	curr = temp$0;
5	i = 0;
	
	label1:
6	if i < 10 goto label2;

7	goto label3;	

	label2:
8	temp$1 = new SLList;
9	specialinvoke temp$1.<SLList: void <init>(SLList)>(null);
10	curr.<SLList: SLList next> = temp$1;
11	curr = temp$1;
12	temp$3 = i + 1;
13	i = temp$3;

14	goto label1;
	
	label3:
15	return temp$0;

State Space

In our example, we set the option --export "example" in order to generate a report (stored in directory example) that allows graphical state space exploration. For our example, the graphical user interface can also be accessed online.

Further details regarding the graphical user interface are found in the manual.

Let us have a look at the generated state space. Please find below a screenshot of the generated state space in which we grouped several states into a single block.

The topmost state is the initial state, i.e. the state of the heap before the execution of the program started. The green states correspond to the if-statement in line 6. We observe that both states have two outgoing transitions, because the condition of the if-statement has been resolved by non-deterministic branching.

The pink states before the first branching state correspond to the statements before the loop.

The two blue segments each correspond to the statements in the loop body. They start after an evaluation of the if-statement and end in another evaluation of the if-statement.

For the segment in light blue (the lower one) we see, that the evaluation results in an actual loop indicating that Attestor found a fixed point and hence a loop invariant.

Finally, the yellow statements correspond to the program statements after the loop. There is a single final state. Notice, however, that analyzing other programs could result in multiple final states.

state space

In the following, we consider a few states in detail to highlight the essential steps of the analysis.

back to overview

Initial State

The initial state (ID: 0) of the state space is depicted below.

state space

Notice that this state it is not identical to the custom initial state. This is due to the fact that Attestor has automatically inserted missing constants, such as true and false.

back to overview

New Statement

The new statement in Jimple code does not include the call to the constructor. It only creates the new object and sets its fields to the default values. The constructor is then invoked by the next instruction, see method call.

Our example contains several new statements. Let us consider the statement in line 2. It is executed on the state with ID 1, namely:

state 1

The resulting state (ID: 2) adds a new object (in the picture this is node 11), assigns it to variable temp$0 and sets the field next to the null-node. Attestor then directly canonicalizes the state resulting in an abstract list between nodes 11 and 0.

state 2

back to overview

Method Call

The method calls in this example are all constructor calls. Howeover, notice that Attestor is capable of analyzing arbitrary method calls, e.g. static, non-static, and recursive calls. Directly after the new statement described above the call specialinvoke temp$0.<SLList: void <init>(SLList)>(tail); has to be executed (state with ID 2). Here, <init> is the constructor, temp$0, the base value, is the newly created object (node 11 in the pictures) and tail is the argument (attached to node 1). After identifying tail as the argument, Attestor automatically removes the variable tail as it is no longer used (using Soot's Live Variable Analysis). This is an optimization to allow for a more aggressive canonicalization.

The constructor sets the next-field of the base value to the argument. After executing the return statement, Attestor automatically canonicalizes the state. Since the variable tail pointing to node 1 was removed, it can be abstracted. Therefore, the resulting state contains only a single list segment:

state 3

back to overview

Materialization

Attestor does only manipulate concrete parts of the heap. Whenever it needs to access a field which is hidden in a nonterminal edge, it first materializes it (in all possible ways) and then performs the semantics on the concrete elements instead. In the state space materialization is indicated by red transitions. For our running example, materialization is performed at the state with ID 11.

state 11

The next statement to be executed is curr.next = temp$1;. Node 11 is referenced by variable curr and its next-field is hidden in the adjacent nonterminal edge. Attestor thus uses its grammar to materialize the edge, resulting in states with IDs 12 and 13, respectively:

state 13 state 12

In one of the resulting states, the first grammar rule is applied such that the next field directly points to null. In the other state, the second rule is applied resulting in an additional node (3) to which the next field points and that is attached to a new nonterminal edge. We observe that the second materialization is actually spurious, because the next-field of curr was explicitly set to null in the previous statement.

In other words, the abstraction is too aggressive. Attestor performs a less aggressive abstraction strategy if the option --admissible-abstraction is enabled. We have not enabled this option to highlight the effects of materialization.

Note that Attestor does not apply unnecessary rules. This means that

  1. It does not materialize nonterminal edges when it does not need a field from the adjacent nodes (here: the edge adjacent to node 13 is not materialized) and
  2. It does not use rules which do not result in the requested field (here: the third rule is not used since it would not generate a field next at node 11 anyway).

back to overview

Assignment

Assignment statements set variables or fields. For example, the effect of line 10 curr.<SLList: SLList next> = temp$1; on the state with ID 13 is to set the next-field of node 11 from null to node 13.

state 13 after state 13

back to overview

Return

Return statements are resolved by identifying the node to be returned and attaching a special variable with a return label to it. For example in the state with ID 8, the next statement to be executed is return temp$0;.

state 8

Here, node 11 is referenced by temp$0. The resulting state (ID: 9) thus references this node via @return.

state 9

back to overview

Canonicalization

Attestor abstracts states directly after computing the effects of a statement. For instance, the state with ID 33 is canonicalized after its associated statement, goto 5, has been executed. Since this statement does not affect the heap, all changes in the heap are due to canonicalization. The state with ID 33 contains two abstract list segments and a next pointer from node 13 to 14:

state 33

Its successor state (ID 23) does not contain this next edge. The state is obtained by first embedding the first grammar rule between nodes 13 and 14 and then using the third rule to abstract the two adjacent nonterminal edges from 11 to 13 (already there) and from 13 to 14 (resulting from the previous step).

state 23

Note that, as the resulting state is merged with an existing one, the node identifiers may have changed. For example, node 13 in the resulting state should not be seen as identical to the one previous to canonicalization.

back to overview

Next steps

After following the above steps, a reader should be familiar with generating state spaces with Attestor and exploring them using the generated graphical user interface. In this section, we look present a few options that you might want to try out next.

Admissible abstractions

As we observed when considering the new statement, Attestor may abstract too much. This results in unncessary materialization steps and thus larger state spaces and run times. In particular, over-abstraction can be observed if parts of the heap are abstracted and immediately accessed in the next statement. Attestor can counter this kind of over-abstraction by performing abstraction only, if the embedding used for abstraction does not violate admissibility.

Try it yourself!

To prevent abstraction from violating admissibility, it suffices to set the option --admissible-abstraction. Run the above example again and explore the changes in the resulting state space:

java -jar attestor -l admissible.attestor

back to overview

Model checking

It is possible to supply LTL specifications to Attestor that are checked after state space generation. There are three possible outcomes of the verification process: Either Attestor finds a proof that the specification is satisfied or states that the specification is violated. In the second case, Attestor produces a counterexample which can be explored similarly to state spaces.

Notice that a specification might be violated due to the over-approximation performed by Attestor to obtain a finite state space. To check whether a property is really violated or provided counterexamples are spurious, the option --canonical can additionally be supplied. In this case, Attestor attempts to find an input that is guaranteed to violate the specification.

Try it yourself!

Specifications are supplied using the option -mc <spec>, where <spec> is a linear temporal logic formula as defined here. Please find below a few example specifications to experiment with. For each specification, the example archive contains a corresponding .attestor file.

The heap will always be a list again.

Assume we want to express that for every point in the program execution, the heap will - at some point in the future - consist of a singly-linked list again and of nothing else but constants. A corresponding formula looks as follows:

GF { L(SLL) }

Please find below the relevant output of Attestor when running our example with the above formula:

java -jar attestor -l model-checking-01.attestor

...
[Summary]  Model checking results: All provided LTL formulae are satisfied.
[SAT]      GF { L(SLL) }
...
Isn't the heap always just a list?

One might guess that the heap is not only eventually a list, but it actually consists of a list and nothing else during the whole execution of our example program. We can formalize this property as follows:

G { L(SLL) }

Please find below the relevant output of Attestor when running our example with the above formula:

java -jar attestor -l model-checking-02.attestor

...
[Warning]  Verification of potentially spurious counterexamples is disabled.
[Warning]  It is advised to rerun with option '--canonical'.
...
[Summary]  Model checking results: Some provided LTL formulae could not be verified.
[UNSAT]    G { L(SLL) }
[Summary]  Detected counterexamples are not verified.
           G { L(SLL) }
                 Counterexample trace: [0, 1, 2]
...

As we can see, Attestor fails to verify this property! In fact, it produces a counterexample, the trace [0,1,2], that leads to a state in which the heap does not consist exactly of a singly-linked list. Since Attestor over-approximates the concrete program semantics, this counterexample might be spurious. Hence, a corresponding warning is produced. We can tell Attestor to check whether our counterexample [0,1,2] is spurious, by running the above example again with option --canonical:

java -jar attestor.jar -l model-checking03.attestor  
...
[Summary]  Model checking results: Some provided LTL formulae could not be verified.
[UNSAT]    G { L(SLL) }
[Summary]  Detected a non-spurious counterexample for all violated LTL formulae.
           G { L(SLL) }
                 Counterexample trace: [0, 1, 2]
...

As stated above, Attestor verified that the counterexample is a realistic one. The heap thus does not always consist exactly of a singly-linked list.

back to overview

Other examples

We collect various running examples including source code, grammars, LTL properties, and initial states in a separate example repository. Feel free to try out the analyses specified in this repository and modify them to analyze your own code.

back to overview