diff --git a/docs/README.md b/docs/README.md index 6b462301af..7492cec320 100644 --- a/docs/README.md +++ b/docs/README.md @@ -3,7 +3,7 @@ The Apalache documentation is written in markdown files in the [./src](./src) directory and compiled using [mdbook](https://github.com/rust-lang/mdBook). -To view the documentation, visit https://apalache.informal.systems/docs/ . +To view the documentation, visit https://apalache.informal.systems/docs/. ## Building and previewing the documentation @@ -41,7 +41,7 @@ Our CI uses the [emojitsu](https://github.com/shonfeder/emojitsu#emojitsu) CLI utility. You can also use this tool to lookup the name for an emoji if you have the unicode on hand or to see how a name will render. -Note that GitHub cheats in it's rendering: it replaces emojinames with pngs, and +Note that GitHub cheats in its rendering: it replaces emoji names with pngs, and includes some emojis which are not supported by the unicode standard. We don't stand for this standardless stuff at the moment, because it would be too tedious to implement. @@ -62,7 +62,9 @@ We provide a custom TLA+ syntax highlighting plugin for [highlight.js][] in `hig #### Building -The plugin is checked in with the source code, and you generally shouldn't need to rebuild it unless you need to make a change or update the plugin. +The plugin is checked in with the source code, +and you generally shouldn't need to rebuild it +unless you need to make a change or update the plugin. To build, check out the main highlight.js into a separate directory. Symlink the TLA+ language definition into the highlight.js `extra` directory: diff --git a/docs/internal/dev/using-bloop.md b/docs/internal/dev/using-bloop.md index 53a541016c..c2497a2203 100644 --- a/docs/internal/dev/using-bloop.md +++ b/docs/internal/dev/using-bloop.md @@ -1,7 +1,7 @@ # Developing with bloop [Bloop](https://scalacenter.github.io/bloop/) is a build server that can yield -lightening fast incremental builds of Scala projects. +lightning fast incremental builds of Scala projects. It is used by [Scala Metals](https://scalameta.org/metals/), but is also very useful as a standalone to save devs from the pain of `mvn`. diff --git a/docs/internal/smt/Cardinality.md b/docs/internal/smt/Cardinality.md index 0733237ec7..975098fc76 100644 --- a/docs/internal/smt/Cardinality.md +++ b/docs/internal/smt/Cardinality.md @@ -203,7 +203,7 @@ By collecting the constraints in this section, `[DefCin]` and `[Repr]`, we have a complete set of constraints for computing `k_n = Cardinality(S)`. As one can see, the number of literals in this encoding is **linear**, that is, `O(n)`. -**NOTE:** Although the number of constraints is linear, we have to analyze complexity +**NOTE:** Although the number of constraints is linear, we have to analyze the complexity of the underlying SMT problem, which may happen to be as hard as the SMT problem that is constructed in v0.6.0. diff --git a/docs/internal/types-api.md b/docs/internal/types-api.md index 40ac6505db..e02b9efaff 100644 --- a/docs/internal/types-api.md +++ b/docs/internal/types-api.md @@ -5,7 +5,7 @@ Type reconstruction API In this note, we focus on the client interface of a type inference engine for TLA+. In the following, we refer to this interface as ``TI``. -By fixing this interface we give the users the freedom of choosing from several +By fixing this interface, we give the users the freedom of choosing from several type inference engines. Given a TLA+ expression ``ex``, the ultimate goal of type inference is to assign a type @@ -25,16 +25,16 @@ An implementation of ``TI`` _must_ support two main phases of operation: 1. Resolving operator overloading, e.g., the expressions ``<<1, 2>>`` and ``f[e]`` can be treated as expressions either over tuples, or sequences. - 1. Finding signatures for operator definitions, e.g., ``F(x) == x + 1`` should have + 2. Finding signatures for operator definitions, e.g., ``F(x) == x + 1`` should have the signature ``Int => Int``. If successful, the results of this analysis should be stored somewhere for the subsequent use in the _Type computation_ mode. Note that it is not necessary to store the types of all the intermediate - subexpressions -- that would be wasteful. This analysis should store only the results that cannot + subexpressions - that would be wasteful. This analysis should store only the results that cannot be deterministically computed in the next phase such as resolved operator signatures and types of the variables. ``TIE`` _may_ use expression identifiers to save the type information in some storage. -1. **Type computation**. In this phase, the client queries ``TIE`` by giving a ``TLA+`` expression +2. **Type computation**. In this phase, the client queries ``TIE`` by giving a ``TLA+`` expression and the types of its arguments. ``TIE`` computes and returns the resulting type of the expression. For instance, given the expression ``F(e)`` and the type ``Int`` of ``e``, ``TIE`` finds the signature ``F: 'a -> 'a`` in its internal storage and returns the type ``Int``. Given the expression ``{x, y}`` @@ -44,7 +44,7 @@ An implementation of ``TI`` _must_ support two main phases of operation: ``TIE`` _must_ return the resulting type. **Importantly**, ``TIE`` _must_ assume that it can be given expressions that have not been analyzed in the first stage. Such expressions may originate from the rewriting techniques used by the client. In this case, ``TIE`` _must_ try to compute the resulting type. Only if the resulting type - cannot be deterministically computed (e.g., there is not relevant type information in the storage), + cannot be deterministically computed (e.g., there is no relevant type information in the storage), should ``TIE`` fail. diff --git a/docs/src/HOWTOs/howto-write-type-annotations.md b/docs/src/HOWTOs/howto-write-type-annotations.md index e775bc22b5..e467f82577 100644 --- a/docs/src/HOWTOs/howto-write-type-annotations.md +++ b/docs/src/HOWTOs/howto-write-type-annotations.md @@ -21,14 +21,13 @@ were removed in version 0.23.1. - Version 0.15.0: This HOWTO discusses how to write type annotations for the type checker - [Snowcat][], which is used in Apalache since version 0.15.0 (introduced in - 2021). + [Snowcat][], which is used in Apalache since version 0.15.0 (introduced in 2021). This HOWTO gives you concrete steps to extend TLA+ specifications with type annotations. You can find the detailed syntax of type annotations in [ADR002][]. The first rule of writing type annotations: -*Do not to write any annotations at all, until the type checker [Snowcat][] is +*Do not write any annotations at all, until the type checker [Snowcat][] is asking you to write a type annotation.* Of course, there must be an exception to this rule. You have to write type @@ -128,10 +127,10 @@ Have a look at the type of `chan`: ``` The type of `chan` is a record that has three fields: field `val` of type -`DATUM`, field `rdy` of type `Int`, field `ack` of type `Int`. The record type syntax is similar to dictionary syntax from programming languages (e.g. Python). We made it different -from TLA+'s syntax for records `[ val |-> v, rdy |-> r, ack |-> a ]` -and record sets `[ val: V, rdy: R, ack: A ]`, to avoid confusion between -types and values. +`DATUM`, field `rdy` of type `Int`, field `ack` of type `Int`. +The record type syntax is similar to dictionary syntax from programming languages (e.g. Python). +We made it different from TLA+'s syntax for records `[ val |-> v, rdy |-> r, ack |-> a ]` +and record sets `[ val: V, rdy: R, ack: A ]`, to avoid confusion between types and values. Run the type checker again. You should see the following message: @@ -441,9 +440,9 @@ For more details on the design and usage, see [Type Aliases][] in ADR-002. ## Recipe 7: Multi-line annotations -A type annotation may span over multiple lines. You may use both the `(* ... -*)` syntax as well as the single-line syntax `\* ...`. All three examples below -are accepted by the parser: +A type annotation may span over multiple lines. You may use both the `(* ... *)` +syntax as well as the single-line syntax `\* ...`. +All three examples below are accepted by the parser: ```tla VARIABLES @@ -467,8 +466,8 @@ similar to how multi-line strings are treated in modern programming languages. ## Recipe 8: Comments in annotations -Sometimes, it helps to document the meaning of type components. Consider the following -example from [Recipe 5](#funAsSeq): +Sometimes, it helps to document the meaning of type components. +Consider the following example from [Recipe 5](#funAsSeq): ```tla \* @type: (Seq(Int), Int, Int) => Bool; @@ -571,7 +570,7 @@ a type annotation before the keyword `LOCAL`, not after it. For example: LOCAL LocalInc(x) == x + 1 ``` -This may change later, when the tlaplus [Issue 578][] is resolved. +This may change later when the tlaplus [Issue 578][] is resolved. [old type annotations]: ../apalache/types-and-annotations.md diff --git a/docs/src/HOWTOs/uninterpretedTypes.md b/docs/src/HOWTOs/uninterpretedTypes.md index 77ad0f8de1..1590292ba5 100644 --- a/docs/src/HOWTOs/uninterpretedTypes.md +++ b/docs/src/HOWTOs/uninterpretedTypes.md @@ -1,23 +1,32 @@ # How to use uninterpreted types -This HOWTO explains what uninterpreted types are in the context of Apalache's type system, outlined in [ADR002][], and where/how to use them. +This HOWTO explains what uninterpreted types are in the context of Apalache's type system, +outlined in [ADR002][], and where/how to use them. ## What are uninterpreted types? -It is often the case, when writing specifications, that inputs (`CONSTANTS`) describe a collection of values, where the only relevant property is that all of the values are considered unique. For instance, [TwoPhase.tla][] defines +It is often the case, when writing specifications, that inputs (`CONSTANTS`) describe a collection of values, +where the only relevant property is that all the values are considered unique. +For instance, [TwoPhase.tla][] defines ```tla CONSTANT RM \* The set of resource managers ``` -however, for the purposes of specification analysis, it does not matter if we instantiate -`RM = 1..3` or `RM = {"a","b","c"}`, because the only operators applied to elements of `RM` are polymorphic in the type of the elements of `RM`. +however, for the purposes of specification analysis, +it does not matter if we instantiate `RM = 1..3` or `RM = {"a","b","c"}`, +because the only operators applied to elements of `RM` are polymorphic in the type of the elements of `RM`. For this reason, Apalache supports a special kind of type annotation: uninterpreted types. -The type checker [Snowcat][] makes sure that a value belonging to an uninterpreted type is only ever passed to polymorphic operators, and, importantly, that it is never compared to a value of any other type. +The type checker [Snowcat][] makes sure that a value belonging to an uninterpreted type +is only ever passed to polymorphic operators, and, importantly, that it is never compared to a value of any other type. ## When to use uninterpreted types? -For efficiency reasons, you should use uninterpreted types whenever a `CONSTANT` or value represents (an element of) a collection of unique identifiers, the precise value of which does not influence the properties of the specification. +For efficiency reasons, you should use uninterpreted types whenever a `CONSTANT` or value +represents (an element of) a collection of unique identifiers, +the precise value of which does not influence the properties of the specification. -On the other hand, if, for example, the order of values matters, identifiers should likely be `1..N` and hold type `Int` instead of an uninterpreted type, since `Int` values can be passed to the non-polymorphic `<,>,>=,<=` operators. +On the other hand, if, for example, the order of values matters, +identifiers should likely be `1..N` and hold type `Int` instead of an uninterpreted type, +since `Int` values can be passed to the non-polymorphic `<,>,>=,<=` operators. ## How to annotate uninterpreted types Following [ADR002][], an annotation with an uninterpreted type looks exactly like an annotation with a type alias: @@ -36,13 +45,17 @@ where: - `TYPENAME` is the uninterpreted type to which this value belongs, matching the pattern `[A-Z_][A-Z0-9_]*`, and - `identifier` is a unique identifier within the uninterpreted type, matching the pattern `[a-zA-Z0-9_]+`. -Example: `"1_OF_UT"` is a value belonging to the uninterpreted type `UT`, as is `"2_OF_UT"`. These two values are distinct by definition. On the contrary, -`"1_OF_ut"` does _not_ meet the criteria for a value belonging to an uninterpreted type ( lowercase `ut` is not a valid identifier for an uninterpreted type), so it is treated as a string value. +Example: `"1_OF_UT"` is a value belonging to the uninterpreted type `UT`, as is `"2_OF_UT"`. +These two values are distinct by definition. +On the contrary, `"1_OF_ut"` does _not_ meet the criteria for a value belonging to an uninterpreted type +(lowercase `ut` is not a valid identifier for an uninterpreted type), so it is treated as a string value. -**Note**: Values matching the pattern `"FRESH[0-9]+_OF_TYPENAME"` are reserved for internal use, to allow Apalache to construct fresh values. +**Note**: Values matching the pattern `"FRESH[0-9]+_OF_TYPENAME"` are reserved for internal use, +to allow Apalache to construct fresh values. ## Uninterpreted types, `Str`, and comparisons -Importantly, while both strings and values belonging to uninterpreted types are introduced using the `"..."` notation, they are treated as having distinct, incomparable types. +Importantly, while both strings and values belonging to uninterpreted types are introduced using the `"..."` notation, +they are treated as having distinct, incomparable types. Examples: - The following expression is type-incorrect: ```tla diff --git a/docs/src/apalache/antipatterns.md b/docs/src/apalache/antipatterns.md index 8c6c583af0..8e84f596dc 100644 --- a/docs/src/apalache/antipatterns.md +++ b/docs/src/apalache/antipatterns.md @@ -1,8 +1,13 @@ # Antipatterns -This page collects known antipaterns (APs) when writing TLA+ for Apalache. In this context, APs are syntactic forms or specification approaches that, for one reason or another, have particularly slow/complex encodings for the target model checker. For a pattern to be an AP, there must exist a known, equivalent, efficient pattern. +This page collects known antipaterns (APs) when writing TLA+ for Apalache. +In this context, APs are syntactic forms or specification approaches that, +for one reason or another, have particularly slow/complex encodings for the target model checker. +For a pattern to be an AP, there must exist a known, equivalent, efficient pattern. -Often, APs arise from a user's past experiences with writing TLA+ for TLC, or from a direct translation of imperative OOP code into TLA+, as those follow a different paradigm, and therefore entail different cost evaluation of which expressions are slow/complex and which are not. +Often, APs arise from a user's past experiences with writing TLA+ for TLC, +or from a direct translation of imperative OOP code into TLA+, as those follow a different paradigm, +and therefore entail different cost evaluation of which expressions are slow/complex and which are not. ## `CHOOSE`-based recursion @@ -29,9 +34,11 @@ min(S) == IN IF e < minOther THEN e ELSE minOther ``` -Apalache dislikes the use of the above, for several reasons. Firstly, since the operator is `RECURSIVE`, Apalache does not support it after version 0.23.1. In earlier versions Apalache requires a predefined upper bound on unrolling, which means that the user must know, ahead of time, what the largest `|S|` is, for any set `S`, to which this operator is ever applied. +Apalache dislikes the use of the above, for several reasons. Firstly, since the operator is `RECURSIVE`, +Apalache does not support it after version 0.23.1. In earlier versions Apalache requires a predefined upper bound on unrolling, +which means that the user must know, ahead of time, what the largest `|S|` is, for any set `S`, to which this operator is ever applied. In addition, computing `F` for a set `S` of size `n = |S|` requires `n` encodings of a `CHOOSE` operation, which can be considerably expensive in Apalache. -Lastly, Apalache also needs to encode all of the the `n` intermediate sets, `S \ {e1}`, `(S \ {e1}) \ {e2}`, `((S \ {e1}) \ {e2}) \ {e3}`, and so on. +Lastly, Apalache also needs to encode all the the `n` intermediate sets, `S \ {e1}`, `(S \ {e1}) \ {e2}`, `((S \ {e1}) \ {e2}) \ {e3}`, and so on. The AP above can be replaced by a very simple pattern: ```tla @@ -41,7 +48,10 @@ F(S) == ApaFoldSet( G, v, S ) `ApaFoldSet` (and `ApaFoldSeqLeft`) were introduced precisely for these scenarios, and should be used over `RECURSIVE + CHOOSE` in most cases. ## Incremental computation -Often, users introduce an expression `Y`, which is derived from another expression `X` (`Y == F(X)`, for some `F`). Instead of defining `Y` directly, in terms of the properties it possesses, it is possible to define all the intermediate steps of transforming `X` into `Y`: "`X` is slightly changed into `X1` (e.g. by adding one element to a set, or via `EXCEPT`), which is changed into `X2`, etc. until `Xn = Y`". Doing this in Apalache is almost always a bad idea, if a direct characterization of `Y` exists. +Often, users introduce an expression `Y`, which is derived from another expression `X` (`Y == F(X)`, for some `F`). +Instead of defining `Y` directly, in terms of the properties it possesses, it is possible to define all the intermediate +steps of transforming `X` into `Y`: "`X` is slightly changed into `X1` (e.g. by adding one element to a set, or via `EXCEPT`), +which is changed into `X2`, etc. until `Xn = Y`". Doing this in Apalache is almost always a bad idea, if a direct characterization of `Y` exists. Concretely, the following constructs are APs: 1. Incremental `EXCEPT` @@ -73,7 +83,10 @@ Y == TLC likes these sorts of operations, because it manipulates programming-language objects in its own implementation. This makes it easy to construct temporary mutable objects, manipulate them (e.g. via for-loops) and garbage-collect them after they stop being useful. -For constraint-based approaches, like Apalache, the story is different. Not only are these intermediate steps not directly useful (since Apalache is not modeling TLA+ expressions as objects in Sacala), they actually hurt performance, since they can generate a significant amount of constraints, which are all about describing data structures (e.g. two functions being almost equal, except at one point). +For constraint-based approaches, like Apalache, the story is different. +Not only are these intermediate steps not directly useful (since Apalache is not modeling TLA+ expressions as objects in Scala), +they actually hurt performance, since they can generate a significant amount of constraints, +which are all about describing data structures (e.g. two functions being almost equal, except at one point). Essentially, Apalache is spending its resources not on state-space exploration, but on in-state value computation, which is not its strong suit. Below we show how to rewrite these APs. @@ -112,4 +125,3 @@ with ```tla f == [ k \in {k1,...,kn} |-> A(k) ] ``` - diff --git a/docs/src/apalache/assignments-in-depth.md b/docs/src/apalache/assignments-in-depth.md index 04a51754b8..d23ce2fb69 100644 --- a/docs/src/apalache/assignments-in-depth.md +++ b/docs/src/apalache/assignments-in-depth.md @@ -346,7 +346,7 @@ Lastly, while `E` looks almost identical to `A`, the key difference is that expressions under universal quantifiers may not contain assignments. Therefore, `y' = s` is *not* an assignment to `y` and thus violates assignment-before-use. - + ## Manual Assignments Users may choose, but aren't required, to use manual assignments `x' := e` in diff --git a/docs/src/apalache/config.md b/docs/src/apalache/config.md index facb7b7252..1cfde34e37 100644 --- a/docs/src/apalache/config.md +++ b/docs/src/apalache/config.md @@ -51,7 +51,7 @@ common { } ``` -A `~` found at the beginning of a file path will expanded into the value set for +A `~` found at the beginning of a file path will be expanded into the value set for the user's home directory. Details on the effect of these parameters can be found in [Running the diff --git a/docs/src/apalache/features.md b/docs/src/apalache/features.md index 9abd5d5865..fc14b1b817 100644 --- a/docs/src/apalache/features.md +++ b/docs/src/apalache/features.md @@ -11,33 +11,32 @@ the manual.) To check liveness/temporal properties, we employ a [liveness-to-saf ### Module-Level constructs -Construct | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -``EXTENDS module`` | ✔ | - | A few standard modules are not supported yet (Bags) -``CONSTANTS C1, C2`` | ✔ | - | Either define a ``ConstInit`` operator to initialize the constants, use a `.cfg` file, or declare operators instead of constants, e.g., C1 == 111 -``VARIABLES x, y, z`` | ✔ | - | -``ASSUME P`` | ✔ / ✖ | - | Parsed, but not propagated to the solver -``F(x1, ..., x_n) == exp`` | ✔ / ✖ | - | Every application of `F` is replaced with its body. [Recursive operators](./principles/recursive.md) not supported after 0.23.1. From 0.16.1 and later, for better performance and UX, use `ApaFoldSet` and `ApaFoldSeqLeft`. -``f[x ∈ S] == exp`` | ✔ / ✖ | - | [Recursive functions](./principles/recursive.md) not supported after 0.23.1. From 0.16.1 and later, for better performance and UX, use `ApaFoldSet` and `ApaFoldSeqLeft`. -``INSTANCE M WITH ...`` | ✔ / ✖ | - | No special treatment for ``~>``, ``\cdot``, ``ENABLED`` -``N(x1, ..., x_n) == INSTANCE M WITH...`` | ✔ / ✖ | - | Parameterized instances are not supported -``THEOREM P`` | ✔ / ✖ | - | Parsed but not used -``LOCAL def`` | ✔ | - | Replaced with local LET-IN definitions +| Construct | Supported? | Milestone | Comment | +|-------------------------------------------|:----------:|:---------:|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| ``EXTENDS module`` | ✔ | - | A few standard modules are not supported yet (Bags) | +| ``CONSTANTS C1, C2`` | ✔ | - | Either define a ``ConstInit`` operator to initialize the constants, use a `.cfg` file, or declare operators instead of constants, e.g., C1 == 111 | +| ``VARIABLES x, y, z`` | ✔ | - | | +| ``ASSUME P`` | ✔ / ✖ | - | Parsed, but not propagated to the solver | +| ``F(x1, ..., x_n) == exp`` | ✔ / ✖ | - | Every application of `F` is replaced with its body. [Recursive operators](./principles/recursive.md) not supported after 0.23.1. From 0.16.1 and later, for better performance and UX, use `ApaFoldSet` and `ApaFoldSeqLeft`. | +| ``f[x ∈ S] == exp`` | ✔ / ✖ | - | [Recursive functions](./principles/recursive.md) not supported after 0.23.1. From 0.16.1 and later, for better performance and UX, use `ApaFoldSet` and `ApaFoldSeqLeft`. | +| ``INSTANCE M WITH ...`` | ✔ / ✖ | - | No special treatment for ``~>``, ``\cdot``, ``ENABLED`` | +| ``N(x1, ..., x_n) == INSTANCE M WITH...`` | ✔ / ✖ | - | Parameterized instances are not supported | +| ``THEOREM P`` | ✔ / ✖ | - | Parsed but not used | +| ``LOCAL def`` | ✔ | - | Replaced with local LET-IN definitions | ### The constant operators #### Logic -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`/\`, `\/`, `~`, `=>`, `<=>` | ✔ | - | -``TRUE``, ``FALSE``, ``BOOLEAN`` | ✔ | - | -``\A x \in S: p``, ``\E x \in S : p`` | ✔ | - | -``CHOOSE x \in S : p`` | ✖ | - | Partial support prior to version 0.16.1. From 0.16.1 and later, use `Some`, `ApaFoldSet`, or `ApaFoldSeqLeft`. See [#841](https://github.com/informalsystems/apalache/issues/841). -``CHOOSE x : x \notin S`` | ✖ | - | Not supported. You can use records or a default value such as -1. -``\A x : p, \E x : p`` | ✖ | - | Use bounded quantifiers -``CHOOSE x : p`` | ✖ | - | - +| Operator | Supported? | Milestone | Comment | +|---------------------------------------|:----------:|:---------:|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `/\`, `\/`, `~`, `=>`, `<=>` | ✔ | - | | +| ``TRUE``, ``FALSE``, ``BOOLEAN`` | ✔ | - | | +| ``\A x \in S: p``, ``\E x \in S : p`` | ✔ | - | | +| ``CHOOSE x \in S : p`` | ✖ | - | Partial support prior to version 0.16.1. From 0.16.1 and later, use `Some`, `ApaFoldSet`, or `ApaFoldSeqLeft`. See [#841](https://github.com/informalsystems/apalache/issues/841). | +| ``CHOOSE x : x \notin S`` | ✖ | - | Not supported. You can use records or a default value such as -1. | +| ``\A x : p, \E x : p`` | ✖ | - | Use bounded quantifiers | +| ``CHOOSE x : p`` | ✖ | - | | #### Sets @@ -45,24 +44,24 @@ Operator | Supported? | Milestone | Comment quantification over `Int` and `Nat` is supported, as soon as it can be replaced with a constant. -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`=`, `/=`, `\in`, `\notin`, `\intersect`, `\union`, `\subseteq`, `\` | ✔ | - | -`{e_1, ..., e_n}` | ✔ | - | -`{x \in S : p}` | ✔ | - | -`{e : x \in S}` | ✔ | - | -`SUBSET S` | ✔ | - | Sometimes, the powersets are expanded -`UNION S` | ✔ | - | Provided that S is expanded +| Operator | Supported? | Milestone | Comment | +|----------------------------------------------------------------------|:----------:|:---------:|---------------------------------------| +| `=`, `/=`, `\in`, `\notin`, `\intersect`, `\union`, `\subseteq`, `\` | ✔ | - | | +| `{e_1, ..., e_n}` | ✔ | - | | +| `{x \in S : p}` | ✔ | - | | +| `{e : x \in S}` | ✔ | - | | +| `SUBSET S` | ✔ | - | Sometimes, the powersets are expanded | +| `UNION S` | ✔ | - | Provided that S is expanded | #### Functions -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`f[e]` | ✔ | - | -`DOMAIN f` | ✔ | - | -`[ x \in S ↦ e]` | ✔ | - | -`[ S -> T ]` | ✔ | - | Supported, provided the function can be interpreted symbolically -`[ f EXCEPT ![e1] = e2 ]` | ✔ | - | +| Operator | Supported? | Milestone | Comment | +|---------------------------|:----------:|:---------:|------------------------------------------------------------------| +| `f[e]` | ✔ | - | | +| `DOMAIN f` | ✔ | - | | +| `[ x \in S ↦ e]` | ✔ | - | | +| `[ S -> T ]` | ✔ | - | Supported, provided the function can be interpreted symbolically | +| `[ f EXCEPT ![e1] = e2 ]` | ✔ | - | | #### Records @@ -71,13 +70,13 @@ annotations](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.h to help the model checker in finding the right types.* Note that our type system distinguishes records from general functions. -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`e.h` | ✔ | - | -`r[e]` | ✔/✖ | - | Provided that `e` is a constant expression. -`[ h1 ↦ e1, ..., h_n ↦ e_n]` | ✔ | - | -`[ h1 : S1, ..., h_n : S_n]` | ✔ | - | -`[ r EXCEPT !.h = e]` | ✔ | - | +| Operator | Supported? | Milestone | Comment | +|------------------------------|:----------:|:---------:|---------------------------------------------| +| `e.h` | ✔ | - | | +| `r[e]` | ✔/✖ | - | Provided that `e` is a constant expression. | +| `[ h1 ↦ e1, ..., h_n ↦ e_n]` | ✔ | - | | +| `[ h1 : S1, ..., h_n : S_n]` | ✔ | - | | +| `[ r EXCEPT !.h = e]` | ✔ | - | | #### Tuples @@ -86,55 +85,55 @@ annotations](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.h to help the model checker in finding the right types.* Note that our type system distinguishes records from general functions. -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`e[i]` | ✔ / ✖ | - | Provided that `i` is a constant expression -`<< e1, ..., e_n >>` | ✔ | - | Use a [type annotation](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.html) to distinguish between a tuple and a sequence. -`S1 \X ... \X S_n` | ✔ | - | -`[ t EXCEPT ![i] = e]` | ✔/✖ | - | Provided that `i` is a constant expression +| Operator | Supported? | Milestone | Comment | +|------------------------|:----------:|:---------:|------------------------------------------------------------------------------------------------------------------------------------------------| +| `e[i]` | ✔ / ✖ | - | Provided that `i` is a constant expression | +| `<< e1, ..., e_n >>` | ✔ | - | Use a [type annotation](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.html) to distinguish between a tuple and a sequence. | +| `S1 \X ... \X S_n` | ✔ | - | | +| `[ t EXCEPT ![i] = e]` | ✔/✖ | - | Provided that `i` is a constant expression | #### Strings and numbers -Construct | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`"c1...c_n"` | ✔ | - | A string is always mapped to a unique uninterpreted constant -`STRING` | ✖ | - | It is an infinite set. We cannot handle infinite sets. -`d1...d_n` | ✔ | - | As long as the SMT solver (Z3) accepts that large number -`d1...d_n.d_n+1...d_m` | ✖ | - | Technical issue. We will implement it upon a user request. +| Construct | Supported? | Milestone | Comment | +|------------------------|:----------:|:---------:|--------------------------------------------------------------| +| `"c1...c_n"` | ✔ | - | A string is always mapped to a unique uninterpreted constant | +| `STRING` | ✖ | - | It is an infinite set. We cannot handle infinite sets. | +| `d1...d_n` | ✔ | - | As long as the SMT solver (Z3) accepts that large number | +| `d1...d_n.d_n+1...d_m` | ✖ | - | Technical issue. We will implement it upon a user request. | #### Miscellaneous Constructs -Construct | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`IF p THEN e1 ELSE e2` | ✔ | - | Provided that both e1 and e2 have the same type -`CASE p1 -> e1 [] ... [] p_n -> e_n [] OTHER -> e` | ✔ | - | Provided that `e1, ..., e_n, e` have the same type -`CASE p1 -> e1 [] ... [] p_n -> e_n` | ✔ | - | Provided that `e1, ..., e_n` have the same type -`LET d1 == e1 ... d_n == e_n IN e` | ✔ | | All applications of `d1`, ..., `d_n` are replaced with the expressions `e1`, ... `e_n` respectively. LET-definitions without arguments are kept in place. -multi-line `/\` and `\/` | ✔ | - | +| Construct | Supported? | Milestone | Comment | +|----------------------------------------------------|:----------:|:---------:|-----------------------------------------------------------------------------------------------------------------------------------------------------------| +| `IF p THEN e1 ELSE e2` | ✔ | - | Provided that both e1 and e2 have the same type | +| `CASE p1 -> e1 [] ... [] p_n -> e_n [] OTHER -> e` | ✔ | - | Provided that `e1, ..., e_n, e` have the same type | +| `CASE p1 -> e1 [] ... [] p_n -> e_n` | ✔ | - | Provided that `e1, ..., e_n` have the same type | +| `LET d1 == e1 ... d_n == e_n IN e` | ✔ | | All applications of `d1`, ..., `d_n` are replaced with the expressions `e1`, ... `e_n` respectively. LET-definitions without arguments are kept in place. | +| multi-line `/\` and `\/` | ✔ | - | | ### The Action Operators -Construct | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`e'` | ✔ | - | -`[A]_e` | ✔ | - | -`< A >_e` | ✔ | - | -`ENABLED A` | ✖ | - | Has to be specified manually -`UNCHANGED <>` | ✔ | - |Always replaced with `e_1' = e_1 /\ ... /\ e_k' = e_k` -`A ∙ B` | ✖ | - | +| Construct | Supported? | Milestone | Comment | +|-------------------------------|:----------:|:---------:|--------------------------------------------------------| +| `e'` | ✔ | - | | +| `[A]_e` | ✔ | - | | +| `< A >_e` | ✔ | - | | +| `ENABLED A` | ✖ | - | Has to be specified manually | +| `UNCHANGED <>` | ✔ | - | Always replaced with `e_1' = e_1 /\ ... /\ e_k' = e_k` | +| `A ∙ B` | ✖ | - | | ### The Temporal Operators -Construct | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`[]F` | ✔ | - | -`<>F` | ✔ | - | -`WF_e(A)` | ✖ | - | Has to be specified manually (see ENABLED) -`SF_e(A)` | ✖ | - | Has to be specified manually (see ENABLED) -`F ~> G` | ✔ | - | Always replaced with `[](F => <>G)` -`F -+-> G` | ✖ | - | -`\EE x: F` | ✖ | - | -`\AA x: F` | ✖ | - | +| Construct | Supported? | Milestone | Comment | +|------------|:----------:|:---------:|--------------------------------------------| +| `[]F` | ✔ | - | | +| `<>F` | ✔ | - | | +| `WF_e(A)` | ✖ | - | Has to be specified manually (see ENABLED) | +| `SF_e(A)` | ✖ | - | Has to be specified manually (see ENABLED) | +| `F ~> G` | ✔ | - | Always replaced with `[](F => <>G)` | +| `F -+-> G` | ✖ | - | | +| `\EE x: F` | ✖ | - | | +| `\AA x: F` | ✖ | - | | The model checker assumes that the specification has the form `Init /\ [][Next]_e`. Other than that, temporal operators @@ -146,39 +145,39 @@ may only appear in temporal properties, not in e.g. actions. For the moment, the model checker does not differentiate between integers and naturals. They are all translated as integers in SMT. -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`+`, `-`, `*`, `<=`, `>=`, `<`, `>` | ✔ | - | These operators are translated into integer arithmetic of the SMT solver. Linear integer arithmetic is preferred. -`\div`, `%` | ✔ | - | Integer division and modulo -`a^b` | ✔ / ✖ | - | Provided a and b are constant expressions -`a..b` | ✔ / ✖ | - | Sometimes, `a..b` needs a constant upper bound on the range. When Apalache complains, use `{x \in A..B : a <= x /\ x <= b}`, provided that `A` and `B` are constant expressions. -`Int`, `Nat` | ✔ / ✖ | - | Supported in `\E x \in Nat: p` and `\E x \in Int: p`, if the expression is not located under `\A` and `~`. We also support assignments like `f' \in [S -> Int]` and tests `f \in [S -> Nat]` -`/` | ✖ | - | Real division, not supported +| Operator | Supported? | Milestone | Comment | +|-------------------------------------|:----------:|:---------:|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `+`, `-`, `*`, `<=`, `>=`, `<`, `>` | ✔ | - | These operators are translated into integer arithmetic of the SMT solver. Linear integer arithmetic is preferred. | +| `\div`, `%` | ✔ | - | Integer division and modulo | +| `a^b` | ✔ / ✖ | - | Provided a and b are constant expressions | +| `a..b` | ✔ / ✖ | - | Sometimes, `a..b` needs a constant upper bound on the range. When Apalache complains, use `{x \in A..B : a <= x /\ x <= b}`, provided that `A` and `B` are constant expressions. | +| `Int`, `Nat` | ✔ / ✖ | - | Supported in `\E x \in Nat: p` and `\E x \in Int: p`, if the expression is not located under `\A` and `~`. We also support assignments like `f' \in [S -> Int]` and tests `f \in [S -> Nat]` | +| `/` | ✖ | - | Real division, not supported | ### Sequences -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -`<<...>>` | ✔ | | Often needs a [type annotation](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.html). -`Head`, `Tail`, `Len``, `SubSeq`, `Append`, `\o`, `f[e]` | ✔ | - | -`EXCEPT` | ✔ | | -`SelectSeq` | ✔ | - | Not as efficient, as it could be, see [#1350](https://github.com/informalsystems/apalache/issues/1350). -`Seq(S)` | ✖ | - | Use `Gen` of Apalache to produce bounded sequences +| Operator | Supported? | Milestone | Comment | +|----------------------------------------------------------|:----------:|:---------:|----------------------------------------------------------------------------------------------------------| +| `<<...>>` | ✔ | | Often needs a [type annotation](https://apalache.informal.systems/docs/tutorials/snowcat-tutorial.html). | +| `Head`, `Tail`, `Len``, `SubSeq`, `Append`, `\o`, `f[e]` | ✔ | - | | +| `EXCEPT` | ✔ | | | +| `SelectSeq` | ✔ | - | Not as efficient, as it could be, see [#1350](https://github.com/informalsystems/apalache/issues/1350). | +| `Seq(S)` | ✖ | - | Use `Gen` of Apalache to produce bounded sequences | ### FiniteSets -Operator | Supported? | Milestone | Comment -------------------------|:------------------:|:---------------:|-------------- -``IsFinite`` | ✔ | - | Always returns true, as all the supported sets are finite -``Cardinality`` | ✔ | - | Try to avoid it, as `Cardinality(S)` produces `O(n^2)` constraints in SMT for cardinality `n` +| Operator | Supported? | Milestone | Comment | +|-----------------|:----------:|:---------:|-----------------------------------------------------------------------------------------------| +| ``IsFinite`` | ✔ | - | Always returns true, as all the supported sets are finite | +| ``Cardinality`` | ✔ | - | Try to avoid it, as `Cardinality(S)` produces `O(n^2)` constraints in SMT for cardinality `n` | ### TLC -Operator | Supported? | Milestone | Comment ------------|:------------------:|:---------------:|-------------- -`a :> b` | ✔ | - | A singleton function `<>` -`f @@ g` | ✔ | - | Extends function `f` with the domain and values of function `g` but keeps the values of `f` where domains overlap -Other operators | | | Dummy definitions for spec compatibility +| Operator | Supported? | Milestone | Comment | +|-----------------|:----------:|:---------:|-------------------------------------------------------------------------------------------------------------------| +| `a :> b` | ✔ | - | A singleton function `<>` | +| `f @@ g` | ✔ | - | Extends function `f` with the domain and values of function `g` but keeps the values of `f` where domains overlap | +| Other operators | | | Dummy definitions for spec compatibility | ### Reals diff --git a/docs/src/apalache/installation/docker.md b/docs/src/apalache/installation/docker.md index 56c17f620b..d5eea19e76 100644 --- a/docs/src/apalache/installation/docker.md +++ b/docs/src/apalache/installation/docker.md @@ -6,8 +6,8 @@ configuration of the tool considerably. Unless you have a pressing need to use the docker image, we recommend using one of our [prebuilt releases](./jvm.md). -[Docker](https://www.docker.com/) lets you to run the Apalache tool from inside -an isolated container. The only dependency required to run Apalache is the a +[Docker](https://www.docker.com/) lets you run the Apalache tool from inside +an isolated container. The only dependency required to run Apalache is a suitable JVM, and the container supplies this. However, you must already have [docker installed](https://docs.docker.com/get-docker/). @@ -34,7 +34,7 @@ The following docker parameters are used: extends.** From the user perspective, it works as if Apalache was executing in ``. - In particular the tool logs are written in that directory. + In particular, the tool logs are written in that directory. When using SELinux, you might have to use the modified form of `-v` option: `-v :/var/apalache:z` @@ -59,7 +59,7 @@ $ APALACHE_TAG=foo $APALACHE_HOME/script/run-docker.sh ## Setting an alias -If you are running Apalache on Linux :penguin: or MacOS +If you are running Apalache on Linux :penguin: or macOS :green_apple:, you can define this handy alias in your rc file, which runs Apalache in docker while sharing the working directory: @@ -76,7 +76,7 @@ $ alias apalache='docker run --rm -v $(pwd):/var/apalache ghcr.io/informalsystem ## Using the development branch of Apalache -The development of Apalache proceeds at a quick pace and we cut releases weekly. +The development of Apalache proceeds at a quick pace, and we cut releases weekly. Please refer to the [changelog][] and the [manual][] on the `main` development branch for a report of the newest features. Since we cut releases weekly, you should have access to all the latest features in the last week by using the @@ -109,7 +109,7 @@ $ alias apalache='docker run --rm -v $(pwd):/var/apalache ghcr.io/informalsystem ## Building an image -For an end user there is no need to build an Apalache image. If you like to +For an end user, there is no need to build an Apalache image. If you like to produce a modified docker image, take into account that it will take about 30 minutes for the image to get built, due to compilation times of Microsoft Z3. To build a docker image of Apalache, issue the following command in diff --git a/docs/src/apalache/installation/index.md b/docs/src/apalache/installation/index.md index 419d2b3973..5216095e68 100644 --- a/docs/src/apalache/installation/index.md +++ b/docs/src/apalache/installation/index.md @@ -3,11 +3,10 @@ There are three ways to run Apalache: 1. [Prebuilt package](./jvm.md): download a prebuilt package and run it in the JVM. - 1. [Docker](./docker.md): download and run a Docker image. - 1. [Build from source](./source.md): build Apalache from sources and run the compiled package. + 2. [Docker](./docker.md): download and run a Docker image. + 3. [Build from source](./source.md): build Apalache from sources and run the compiled package. -If you just want to try the tool, we recommend using the [prebuilt -package](./jvm.md). +If you just want to try the tool, we recommend using the [prebuilt package](./jvm.md). ## System requirements diff --git a/docs/src/apalache/installation/source.md b/docs/src/apalache/installation/source.md index ad99a64f1a..f285d3d443 100644 --- a/docs/src/apalache/installation/source.md +++ b/docs/src/apalache/installation/source.md @@ -28,7 +28,7 @@ repository, you have three options after running `make`: - Add the `./bin` directory in the source repository to your `PATH`, which will make `apalche-mc` available. - Install [direnv][] and run `direnv allow`, which will put `apalche-mc` in your - path when you are inside of the Apalache repo directory. + path when you are inside the Apalache repo directory. - Run `./bin/apalache-mc` directly. diff --git a/docs/src/apalache/known-issues.md b/docs/src/apalache/known-issues.md index 4925a580da..005f48393c 100644 --- a/docs/src/apalache/known-issues.md +++ b/docs/src/apalache/known-issues.md @@ -59,7 +59,9 @@ Next == ## Integer ranges with non-constant bounds -When using an integer range `a..b`, where `a` or `b` aren't constant (or cannot be simplified to a constant), the current encoding fails (see [Issue 425][]): +When using an integer range `a..b`, where `a` or `b` aren't constant +(or cannot be simplified to a constant), +the current encoding fails (see [Issue 425][]): ```tla ---------- MODULE Example ---------- @@ -97,7 +99,8 @@ instead of `a..b`. ## Using Seq(S) -The operator `Seq(S)` produces an infinite set of unbounded sequences. Hence, Apalache is not able to do anything about +The operator `Seq(S)` produces an infinite set of unbounded sequences. +Hence, Apalache is not able to do anything about this set. Consider the following snippet: ```tla @@ -111,7 +114,8 @@ this set. Consider the following snippet: ### Workaround -If you know an upper bound on the length of sequences you need, which is often the case when checking one model, you can +If you know an upper bound on the length of sequences you need, +which is often the case when checking one model, you can work around this issue by using [Apalache.Gen](https://github.com/informalsystems/apalache/blob/0bf827c521d3992f39e085cc98ff114bfa0b1721/src/tla/Apalache.tla#L31-L39): @@ -124,8 +128,9 @@ EXTENDS Apalache /\ seq' = s ``` -In the above example, we instruct Apalache to introduce an unrestricted sequence that contains up to 10 elements; this -is done with `Gen`. We further restrict the sequence to contain the elements of `{ 1, 2, 3 }`. +In the above example, we instruct Apalache to introduce an unrestricted sequence +that contains up to 10 elements; this is done with `Gen`. +We further restrict the sequence to contain the elements of `{ 1, 2, 3 }`. However, note that our workaround only works for bounded sequences, whereas `Seq({ 1, 2, 3 })` is the set of all sequences whose elements come from `{ 1, 2, 3 }`. diff --git a/docs/src/apalache/preprocessing.md b/docs/src/apalache/preprocessing.md index 440fc1aba3..2b604c46ff 100644 --- a/docs/src/apalache/preprocessing.md +++ b/docs/src/apalache/preprocessing.md @@ -8,7 +8,7 @@ preprocessing steps: - replaces every call to a let-in defined operator of arity at least 1 with the operator's body * `PrimingPass`: adds primes to variables in `Init` and `ConstInit` (required by `TransitionPass`) * `VCGen`: extracts verification conditions from the invariant candidate. - * `Desugarer`: removes syntactic sugar like short-hand expressions in `EXCEPT`. + * `Desugarer`: removes syntactic sugar like shorthand expressions in `EXCEPT`. * `Normalizer`: rewrites all expressions in [negation-normal form](https://en.wikipedia.org/wiki/Negation_normal_form). * `Keramelizer`: translates TLA+ expressions into the kernel language [KerA](./kera.md). * `ExprOptimizer`: statically computes select expressions (e.g. record field access from a known record) @@ -16,9 +16,10 @@ preprocessing steps: ## Keramelizer - Keramelizer rewrites TLA+ expressions into [KerA](./kera.md). For many TLA+ expressions - this translation is clear, however, some expressions cannot be easily translated. Below - we discuss such expressions and the decisions that we have made. + Keramelizer rewrites TLA+ expressions into [KerA](./kera.md). + For many TLA+ expressions, this translation is clear; + however, some expressions cannot be easily translated. + Below, we discuss such expressions and the decisions that we have made. # References diff --git a/docs/src/apalache/principles/apalache-mod.md b/docs/src/apalache/principles/apalache-mod.md index cdf3df6dd8..2d35d6d7d2 100644 --- a/docs/src/apalache/principles/apalache-mod.md +++ b/docs/src/apalache/principles/apalache-mod.md @@ -3,7 +3,7 @@ Similar to the `TLC` module, we provide a module called `Apalache`, which can be found in [Apalache.tla][]. Many of the operators in that module are used internally by Apalache, when rewriting a TLA+ specification. It is useful to read the comments to the operators defined in [Apalache.tla][], as they will help you in understanding -the [detailed output](./running.md#detailed) produced by the tool. +the [detailed output](../running.md#detailed) produced by the tool. See the detailed description of the [Apalache operators][]. diff --git a/docs/src/apalache/principles/assignments.md b/docs/src/apalache/principles/assignments.md index 035a765a89..efd5a49f37 100644 --- a/docs/src/apalache/principles/assignments.md +++ b/docs/src/apalache/principles/assignments.md @@ -25,7 +25,7 @@ As you can see, the model checker did two things: 1. It has translated several expressions that look like `x' = e` into `x' := e`. For instance, you can see `year' := 80` and `hasLicense' := FALSE` in `Init_si_0000`. We call these expressions **assignments**. -1. It has factored the operator `Next` into two operators `Next_si_0000` and `Next_si_0001`. +2. It has factored the operator `Next` into two operators `Next_si_0000` and `Next_si_0001`. We call these operators **symbolic transitions**. Pure TLA+ does not have the notions of assignments and symbolic diff --git a/docs/src/apalache/principles/enumeration.md b/docs/src/apalache/principles/enumeration.md index e4d539cfeb..d45d0d980d 100644 --- a/docs/src/apalache/principles/enumeration.md +++ b/docs/src/apalache/principles/enumeration.md @@ -1,9 +1,8 @@ # Enumerating counterexamples -By default, Apalache stops whenever it finds a property violation. This is true -for the commands that are explained in the Section on [Running the -Tool](./running.md). Sometimes, we want to produce multiple counterexamples; -for instance, to generate multiple tests. +By default, Apalache stops whenever it finds a property violation. +This is true for the commands that are explained in the Section on [Running the Tool](../running.md). +Sometimes, we want to produce multiple counterexamples; for instance, to generate multiple tests. Consider the following TLA+ specification: @@ -79,9 +78,9 @@ produce complex constraints and slow down the model checker, we leave the choice to the user. Usually, the specification author has a good idea of how to partition states -into interesting equivalence classes. We let you specify this partitiong by declaring +into interesting equivalence classes. We let you specify this partitioning by declaring a view abstraction, similar to the `VIEW` configuration option in TLC. -Basically, two states are considered to be similar, if they have the same view. +Basically, two states are considered to be similar if they have the same view. In our example, we compute the state view with the operator `View1`: diff --git a/docs/src/apalache/principles/folds.md b/docs/src/apalache/principles/folds.md index 07ea6ad797..200549e852 100644 --- a/docs/src/apalache/principles/folds.md +++ b/docs/src/apalache/principles/folds.md @@ -2,12 +2,15 @@ **Folds are an efficient replacement for recursive operators and functions.** -Apalache natively implements two operators users might be familiar with from the [community modules](https://github.com/tlaplus/CommunityModules) or functional programming. Those operators are `ApaFoldSet` and `ApaFoldSeqLeft`. This brief introduction to fold operators highlights the following: +Apalache natively implements two operators users might be familiar with from the [community modules](https://github.com/tlaplus/CommunityModules) +or functional programming. +Those operators are `ApaFoldSet` and `ApaFoldSeqLeft`. +This brief introduction to fold operators highlights the following: 1. What are the semantics of fold operators? - 1. How do I use these operators in Apalache? - 1. Should I use folding or recursion? - 1. Examples of common operators defined with folds + 2. How do I use these operators in Apalache? + 3. Should I use folding or recursion? + 4. Examples of common operators defined with folds ### Syntax @@ -23,45 +26,63 @@ ApaFoldSeqLeft( operator, base, seq ) ### Semantics of fold operators -Folding refers to iterative application of a binary operator over a collection. Given an operator `Op`, a base value `b` and a collection of values `C`, the definition of folding `Op` over `C` starting with `b` depends on the type of the collection `C`. +Folding refers to iterative application of a binary operator over a collection. +Given an operator `Op`, a base value `b` and a collection of values `C`, +the definition of folding `Op` over `C` starting with `b` depends on the type of the collection `C`. #### Semantics of `ApaFoldSeqLeft` In the case of folding over sequences, `C` is a sequence `<>`. Then, `ApaFoldSeqLeft( Op, b, C )` is defined as follows: 1. If `C` is empty, then `ApaFoldSeqLeft( Op, b, <<>> ) = b`, regardless of `Op` - 1. If `C` is nonempty, we establish a recursive relation between folding over `C` and folding over `Tail(C)` in the following way: `ApaFoldSeqLeft( Op, b, C ) = ApaFoldSeqLeft( Op, Op(b, Head(C)), Tail(C) )`. + 2. If `C` is nonempty, we establish a recursive relation between folding over `C` and folding over `Tail(C)` in the following way: + `ApaFoldSeqLeft( Op, b, C ) = ApaFoldSeqLeft( Op, Op(b, Head(C)), Tail(C) )`. #### Semantics of `ApaFoldSet` In the case of folding over sets, `C` is a set `{a_1, ..., a_n}`. Then, `ApaFoldSet( Op, b, C )` is defined as follows: 1. If `C` is empty, then `ApaFoldSet( Op, b, {} ) = b`, regardless of `Op` - 1. If `C` is nonempty, we establish a recursive relation between folding over `C` and folding over some subset of `C` in the following way: `ApaFoldSet( Op, b, C ) = ApaFoldSet( Op, Op(b, x), C \ {x} )`, where `x` is some arbitrary member of `C` (e.g. `x = CHOOSE y \in C: TRUE`). Note that Apalache does not guarantee a deterministic choice of `x`, unlike what using `CHOOSE` would imply. + 2. If `C` is nonempty, we establish a recursive relation between folding over `C` and folding over some subset of `C` in the following way: + `ApaFoldSet( Op, b, C ) = ApaFoldSet( Op, Op(b, x), C \ {x} )`, where `x` is some arbitrary member of `C` (e.g. `x = CHOOSE y \in C: TRUE`). + Note that Apalache does not guarantee a deterministic choice of `x`, unlike what using `CHOOSE` would imply. Note that the above are definitions of a _left fold_ in the literature. Apalache does not implement a right fold. -For example, if `C` is the sequence `<>`, the result is equal to `Op( Op( Op(b, x), y), z)`. If `C = {x,y}`, the result is either `Op( Op(b, x), y)` or `Op( Op(b, y), x)`. Because the order of elements selected from a set is not predefined, users should be careful, as the result is only uniquely defined in the case that the operator is both associative (`Op(Op(a,b),c) = Op(a,Op(b,c))`) and commutative (`Op(a,b) = Op(b,a)`). +For example, if `C` is the sequence `<>`, the result is equal to `Op( Op( Op(b, x), y), z)`. +If `C = {x,y}`, the result is either `Op( Op(b, x), y)` or `Op( Op(b, y), x)`. +Because the order of elements selected from a set is not predefined, users should be careful, +as the result is only uniquely defined in the case that the operator is both associative +(`Op(Op(a,b),c) = Op(a,Op(b,c))`) and commutative +(`Op(a,b) = Op(b,a)`). -For example, consider the operator `Op(p,q) == 2 * p + q`, which is noncommutative, and the set `S = {1,2,3}`. The value of `ApaFoldSet(Op, 0, S)` depends on the order in which Apalache selects elements from S: +For example, consider the operator `Op(p,q) == 2 * p + q`, which is non-commutative, and the set `S = {1,2,3}`. +The value of `ApaFoldSet(Op, 0, S)` depends on the order in which Apalache selects elements from S: -| Order | ApaFoldSet value | -| --- | --- | -| 1 -> 2 -> 3 | 11 | -| 1 -> 3 -> 2 | 12 | -| 2 -> 1 -> 3 | 13 | -| 2 -> 3 -> 1 | 15 | -| 3 -> 1 -> 2 | 16 | -| 3 -> 2 -> 1 | 17 | +| Order | ApaFoldSet value | +|-------------|------------------| +| 1 -> 2 -> 3 | 11 | +| 1 -> 3 -> 2 | 12 | +| 2 -> 1 -> 3 | 13 | +| 2 -> 3 -> 1 | 15 | +| 3 -> 1 -> 2 | 16 | +| 3 -> 2 -> 1 | 17 | -Because Apalache does not guarantee deterministic choice in the order of iteration, users should treat all of the above results as possible outcomes. +Because Apalache does not guarantee deterministic choice in the order of iteration, +users should treat all the above results as possible outcomes. ### Using fold operators in Apalache -As shown by the type signature, Apalache permits a very general form of folding, where the types of the collection elements and the type of the base element/return-type of the operator do not have to match. Again, we urge users to exercise caution when using `ApaFoldSet` with an operator, for which the types `a` and `b` are different, as such operators cannot be commutative or associative, and therefore the result is not guaranteed to be unique and predictable. +As shown by the type signature, Apalache permits a very general form of folding, +where the types of the collection elements and the type of the base element/return-type of the operator do not have to match. +Again, we urge users to exercise caution when using `ApaFoldSet` with an operator, +for which the types `a` and `b` are different, +as such operators cannot be commutative or associative, +and therefore the result is not guaranteed to be unique and predictable. -The other component of note is `operator`, the _name_ (not definition) of some binary operator, which is available in this context. The following are examples of valid uses of folds: +The other component of note is `operator`, the _name_ (not definition) of some binary operator, which is available in this context. +The following are examples of valid uses of folds: ```tla PlusOne(p,q) == p + q + 1 @@ -92,7 +113,8 @@ X == A(1) \* X = 9 ### Folding VS recursion -While TLA+ allows users to write arbitrary recursive operators, they are, in our experience, mostly used to implement collection traversals. Consider the following implementations of a `Max` operator, which returns the largest element of a sequence: +While TLA+ allows users to write arbitrary recursive operators, they are, in our experience, mostly used to implement collection traversals. +Consider the following implementations of a `Max` operator, which returns the largest element of a sequence: ```tla \* Max(<<>>) = -inf, but integers are unbounded in TLA+, @@ -111,10 +133,15 @@ MaxFold(seq) == LET Max(p,q) == IF p > q THEN p ELSE q IN ApaFoldSeqLeft( Max, negInf, seq ) ``` -The first advantage of the fold implementation, we feel, is that it is much more clear and concise. It also does not require a termination condition, unlike the recursive case. +The first advantage of the fold implementation, we feel, is that it is much more clear and concise. +It also does not require a termination condition, unlike the recursive case. One inherent problem of using recursive operators with a symbolic encoding, is the inability to estimate termination. -While it may be immediately obvious to a human, that `MaxRec` terminates after no more than `Len(seq)` steps, automatic termination analysis is, in general, a rather complex and incomplete form of static analysis. -Apalache addresses this by finitely unrolling recursive operators and requires users to provide unroll limits (`UNROLL_LIMIT_MaxRec == ...`), which serve as a static upper bound to the number of recursive re-entries, because in general, recursive operators may take an unpredictable number of steps (e.g. computing the [Collatz sequence](https://en.wikipedia.org/wiki/Collatz_conjecture)) or never terminate at all. +While it may be immediately obvious to a human, that `MaxRec` terminates after no more than `Len(seq)` steps, +automatic termination analysis is, in general, a rather complex and incomplete form of static analysis. +Apalache addresses this by finitely unrolling recursive operators and requires users to provide unroll limits (`UNROLL_LIMIT_MaxRec == ...`), +which serve as a static upper bound to the number of recursive re-entries, because in general, +recursive operators may take an unpredictable number of steps +(e.g. computing the [Collatz sequence](https://en.wikipedia.org/wiki/Collatz_conjecture)) or never terminate at all. Consider a minor adaptation of the above example, where the author made a mistake in implementing the operator: ```tla @@ -127,7 +154,10 @@ MaxRec(seq) == IF seq = <<>> ELSE Head(seq) ``` -Now, `MaxRec` never terminates, but spotting this error might not be trivial at a glance. This is where we believe folds hold the second advantage: `ApaFoldSet` and `ApaFoldSeqLeft` *always terminate* in `Cardinality(set)` or `Len(seq)` steps, and each step is simple to describe, as it consists of a single operator application. +Now, `MaxRec` never terminates, but spotting this error might not be trivial at a glance. +This is where we believe folds hold the second advantage: +`ApaFoldSet` and `ApaFoldSeqLeft` *always terminate* in `Cardinality(set)` or `Len(seq)` steps, +and each step is simple to describe, as it consists of a single operator application. In fact, the vast majority of the traditionally recursive operators can be equivalently rewritten as folds, for example: ```tla @@ -141,17 +171,25 @@ CardinalityFold(set) == LET Count(p,q) == p + 1 \* the value of q, the set eleme IN ApaFoldSet( Count, 0, set ) ``` -Notice that, in the case of sets, picking an arbitrary element `x`, to remove from the set at each step, utilizes the `CHOOSE` operator. This is a common trait shared by many operators that implement recursion over sets. -Since the introduction of folds, the use of `CHOOSE` in Apalache is heavily discouraged as it is both inefficient, as well as nondeterministic (unlike how `CHOOSE` is defined in TLA+ literature). For details, see the discussion in [issue 841](https://github.com/informalsystems/apalache/issues/841). +Notice that, in the case of sets, picking an arbitrary element `x`, +to remove from the set at each step, utilizes the `CHOOSE` operator. +This is a common trait shared by many operators that implement recursion over sets. +Since the introduction of folds, the use of `CHOOSE` in Apalache is heavily discouraged as it is both inefficient, +and nondeterministic (unlike how `CHOOSE` is defined in TLA+ literature). +For details, see the discussion in [issue 841](https://github.com/informalsystems/apalache/issues/841). -So the third advantage of using folds is the ability to, almost always, avoid using the `CHOOSE` operator. +So the third advantage of using folds is the ability to, almost always, avoid using the `CHOOSE` operator. The downside of folding, compared to general recursion, is the inability to express non-primitively recursive functions. -For instance, one cannot define the [Ackermann function](https://en.wikipedia.org/wiki/Ackermann_function), as a fold. We find that in most specifications, this is not something the users would want to implement anyway, so in practice, we believe it is almost always better to use fold over recursive functions. +For instance, one cannot define the [Ackermann function](https://en.wikipedia.org/wiki/Ackermann_function), as a fold. +We find that in most specifications, this is not something the users would want to implement anyway, +so in practice, we believe it is almost always better to use fold over recursive functions. ### Folding VS quantification and `CHOOSE` -Often, folding can be used to select a value from a collection, which could alternatively be described by a predicate and selected with `CHOOSE`. Let us revisit the `MaxFold` example: +Often, folding can be used to select a value from a collection, +which could alternatively be described by a predicate and selected with `CHOOSE`. +Let us revisit the `MaxFold` example: ```tla MaxFold(seq) == LET Max(p,q) == IF p > q THEN p ELSE q @@ -166,11 +204,23 @@ MaxChoose(seq) == IN CHOOSE m \in Range : \A n \in Range : m >= n ``` -The predicate-based approach might result in a more compact specification, but that is because specifications have no notion of execution or complexity. Automatic verification tools, such as Apalache, the job of which includes finding witnesses to predicates, can work much faster with the fold approach. The reason is that _evaluating_ `CHOOSE x \in S : \A y \in S: P(x,y)` is quadratic in the size of `S` (in a symbolic approach this is w.r.t. the number of constraints). For each candidate `x`, the entire set `S` must be tested for `P(x,_)`. On the other hand, the fold approach is linear in the size of `S`, since each element is visited exactly once. +The predicate-based approach might result in a more compact specification, +but that is because specifications have no notion of execution or complexity. +Automatic verification tools, such as Apalache, +the job of which includes finding witnesses to the predicates, can work much faster with the fold approach. +The reason is that _evaluating_ `CHOOSE x \in S : \A y \in S: P(x,y)` is quadratic in the size of `S` +(in a symbolic approach this is w.r.t. the number of constraints). +For each candidate `x`, the entire set `S` must be tested for `P(x,_)`. +On the other hand, the fold approach is linear in the size of `S`, since each element is visited exactly once. -In addition, the fold approach admits no undefined behavior. If, in the above example, `seq` was an empty sequence, the value of the computed maximum depends on the value of `CHOOSE x \in {}: TRUE`, which is undefined in TLA+, while the fold-based approach allows the user to determine behavior in that scenario (via the initial value). +In addition, the fold approach admits no undefined behavior. +If, in the above example, `seq` was an empty sequence, +the value of the computed maximum depends on the value of `CHOOSE x \in {}: TRUE`, which is undefined in TLA+, +while the fold-based approach allows the user to determine behavior in that scenario (via the initial value). -Our general advice is to use folds over `CHOOSE` with quantified predicates wherever possible, if you're willing accept a very minor increase in specification size in exchange for a decrease in Apalache execution time, or, if you wish to avoid `CHOOSE` over empty sets resulting in undefined behavior. +Our general advice is to use folds over `CHOOSE` with quantified predicates wherever possible, +if you're willing accept a very minor increase in specification size in exchange for a decrease in Apalache execution time, +or, if you wish to avoid `CHOOSE` over empty sets resulting in undefined behavior. ### Examples: The versatility of folds @@ -186,6 +236,12 @@ For the sake of comparison, we rewrite the above operators using recursion, `CHO {{#include ../../../../test/tla/NonFoldDefined.tla::}} ``` -In most cases, recursive operators are much more verbose, and the operators using `CHOOSE` and/or quantification mask double iteration (and thus have quadratic complexity). -For instance, the evaluation of the fold-less `IsInjective` operator actually requires the traversal of all domain pairs, instead of the single domain traversal with fold. -In particular, `Mode`, the most verbose among the fold-defined operators, is still very readable (most LET-IN operators are introduced to improve readability, at the cost of verbosity) and quite efficient, as its complexity is linear w.r.t. the length of the sequence (the mode could also be computed directly, without a sub-call to `Range`, but the example would be more difficult to read), unlike the variant with `CHOOSE` and `\A`, which is quadratic. +In most cases, recursive operators are much more verbose, +and the operators using `CHOOSE` and/or quantification mask double iteration (and thus have quadratic complexity). +For instance, the evaluation of the fold-less `IsInjective` operator actually requires the traversal of all domain pairs, +instead of the single domain traversal with fold. +In particular, `Mode`, the most verbose among the fold-defined operators, +is still very readable (most LET-IN operators are introduced to improve readability, at the cost of verbosity) and quite efficient, +as its complexity is linear w.r.t. the length of the sequence +(the mode could also be computed directly, without a sub-call to `Range`, but the example would be more difficult to read), +unlike the variant with `CHOOSE` and `\A`, which is quadratic. diff --git a/docs/src/apalache/principles/index.md b/docs/src/apalache/principles/index.md index 0197deef3c..fb742bb2d5 100644 --- a/docs/src/apalache/principles/index.md +++ b/docs/src/apalache/principles/index.md @@ -3,16 +3,15 @@ In order to take advantage of Apalache's symbolic model checking, there are a few principles one must bear in mind when writing TLA. -Note that Apalache requires type annotations. Check the [Snowcat -tutorial](../tutorials/snowcat-tutorial.md) and [HOWTO on -annotations](../HOWTOs/howto-write-type-annotations.md). +Note that Apalache requires type annotations. Check the [Snowcat tutorial](../../tutorials/snowcat-tutorial.md) +and [HOWTO on annotations](../../HOWTOs/howto-write-type-annotations.md). Topics: - [Assignments and symbolic transitions](./assignments.md) - [Folding sets and sequences](./folds.md) - - [Invariants: State, Action, Trace](./principles/invariants.md) - - [Enumeration of counterexamples](./principles/enumeration.md) - - [The Apalache Module](./apalache/principles/apalache-mod.md) + - [Invariants: State, Action, Trace](./invariants.md) + - [Enumeration of counterexamples](./enumeration.md) + - [The Apalache Module](./apalache-mod.md) - [Naturals module](./naturals.md) diff --git a/docs/src/apalache/principles/invariants.md b/docs/src/apalache/principles/invariants.md index ad3d289e55..fe31e012a7 100644 --- a/docs/src/apalache/principles/invariants.md +++ b/docs/src/apalache/principles/invariants.md @@ -103,7 +103,7 @@ State 3: trace invariant 0 violated. Check the counterexample in: counterexample ``` Trace invariants are quite powerful. You can write down temporal properties as -trace invariants. However, we recommend to use trace invariants for testing, as +trace invariants. However, we recommend using trace invariants for testing, as they are too powerful. For verification, you should use temporal properties. [Action properties]: https://www.hillelwayne.com/post/action-properties/ diff --git a/docs/src/apalache/principles/naturals.md b/docs/src/apalache/principles/naturals.md index f857db9d2e..2982209490 100644 --- a/docs/src/apalache/principles/naturals.md +++ b/docs/src/apalache/principles/naturals.md @@ -1,8 +1,7 @@ ### Naturals -If you look carefully at the [HOWTO on -annotations](../HOWTOs/howto-write-type-annotations.md), you will find that +If you look carefully at the [HOWTO on annotations](../../HOWTOs/howto-write-type-annotations.md), you will find that there is no designated type for naturals. Indeed, one can just use the type `Int`, whenever a natural number is required. If we introduced a special type for naturals, that would cause a lot of confusion for the type checker. What diff --git a/docs/src/apalache/principles/recursive.md b/docs/src/apalache/principles/recursive.md index 50f90ec0ce..1100e8fd78 100644 --- a/docs/src/apalache/principles/recursive.md +++ b/docs/src/apalache/principles/recursive.md @@ -1,7 +1,8 @@ ## Recursive operators and functions -While TLA+ allows the use of recursive operators and functions, we have decided to no longer support them in Apalache from version `0.23.1` onward, and suggest the use of fold operators, described in [Folding sets and sequences](./folds.md) instead: +While TLA+ allows the use of recursive operators and functions, we have decided to no longer support them in Apalache from version `0.23.1` onward, +and suggest the use of fold operators, described in [Folding sets and sequences](./folds.md) instead: ```tla {{#include ../../../../src/tla/Apalache.tla:122:127}} @@ -16,26 +17,33 @@ Note that the remainder of this section discusses only recursive operators, for ### The problem with recursive operators -In the preprocessing phase, Apalache replaces every application of a user-defined operator with its body. We call this process "operator inlining". -This obviously cannot be done for recursive operators, since the process would never terminate. Additionally, even if inlining wasn't problematic, we would still face the following issues when attempting to construct a symbolic encoding: +In the preprocessing phase, Apalache replaces every application of a user-defined operator with its body. +We call this process "operator inlining". +This obviously cannot be done for recursive operators, since the process would never terminate. +Additionally, even if inlining wasn't problematic, +we would still face the following issues when attempting to construct a symbolic encoding: 1. A recursive operator may be non-terminating (although a non-terminating operator is useless in TLA+); - 1. A terminating call to an operator may take an unpredictable number of iterations. + 2. A terminating call to an operator may take an unpredictable number of iterations. -A note on (2): In practice, when one fixes specification parameters (that is, -`CONSTANTS`), it is sometimes possible to find a bound on the number of operator iterations. For instance, consider the following specification: +A note on (2): In practice, when one fixes specification parameters (that is, `CONSTANTS`), +it is sometimes possible to find a bound on the number of operator iterations. +For instance, consider the following specification: ```tla {{#include ../../../../test/tla/Rec6.tla::}} ``` -It is clear that the expression `Sum(S)` requires `Cardinality(S)` steps of recursive computation. Moreover, as the unspecified invariant `set \subseteq 1..N` always holds for this specification, every call `Sum(set)` requires up to `N` recursive steps. +It is clear that the expression `Sum(S)` requires `Cardinality(S)` steps of recursive computation. +Moreover, as the unspecified invariant `set \subseteq 1..N` always holds for this specification, +every call `Sum(set)` requires up to `N` recursive steps. ### The previously supported approach -Previously, when it was possible to find an upper bound on the number of iterations of an operator `Op`, such as `N` for `Sum` in the example above, Apalache would unroll the recursive operator up to this bound. +Previously, when it was possible to find an upper bound on the number of iterations of an operator `Op`, +such as `N` for `Sum` in the example above, Apalache would unroll the recursive operator up to this bound. Two additional operators, `UNROLL_DEFAULT_Op` and `UNROLL_TIMES_Op`, were required, for instance: ```tla @@ -46,9 +54,12 @@ With the operators `UNROLL_DEFAULT_Op` and `UNROLL_TIMES_Op`, Apalache would internally replace the definition of `Op` with an operator `OpInternal`, that had the following property: 1. `OpInternal` was non-recursive 2. If computing `Op(x)` required a recursion stack of depth at most `UNROLL_TIMES_Op`, then `OpInternal(x) = Op(x)` - 3. Otherwise, `OpInternal(x)` would return the value, which would have been produced by the computation of `Op(x)`, if all applications of `Op` while the recursion stack height was `UNROLL_TIMES_Op` returned `UNROLL_DEFAULT_Op` instead of the value produced by another recursive call to `Op` + 3. Otherwise, `OpInternal(x)` would return the value, which would have been produced by the computation of `Op(x)`, + if all applications of `Op` while the recursion stack height was `UNROLL_TIMES_Op` returned `UNROLL_DEFAULT_Op` + instead of the value produced by another recursive call to `Op` -Unsurprisingly, (3) caused a lot of confusion, particularly w.r.t. the meaning of the value `UNROLL_DEFAULT_Op`. Consider the following example: +Unsurprisingly, (3) caused a lot of confusion, particularly w.r.t. the meaning of the value `UNROLL_DEFAULT_Op`. +Consider the following example: ```tla RECURSIVE Max(_) @@ -62,7 +73,9 @@ Max(S) == ``` -As computing `Max(S)` requires `|S|` recursive calls, there is no static upper bound to the recursion stack height that works for all set sizes. Therefore, if one wanted to use this operator in Apalache, one would have to guess (or externally compute) a value `N`, such that, _in the particular specification_, `Max` would never be called on an argument, the cardinality of which exceeded `N`, e.g. +As computing `Max(S)` requires `|S|` recursive calls, there is no static upper bound to the recursion stack height that works for all set sizes. +Therefore, if one wanted to use this operator in Apalache, one would have to guess (or externally compute) a value `N`, +such that, _in the particular specification_, `Max` would never be called on an argument, the cardinality of which exceeded `N`, e.g. ```tla UNROLL_TIMES_Max = 2 ``` @@ -95,11 +108,15 @@ In this case, `MaxInternal({1,42}) = 42 = Max({1,42})`, by property (2) as expec `UNROLL_DEFAULT_Max` and the order in which elements are selected by `CHOOSE`), by property (3). So how does one select a sensible value for `UNROLL_DEFAULT_Op`? The problem is, one generally cannot. -In the `Max` case, one could pick a "very large" `N` and then assume that `Max` computation has "failed" (exceeded the `UNROLL_TIMES_Max` recursion limit) if the result was ever equal to `N`, though "very large" is of course subjective and gives absolutely no guarantees that `Max` won't be called on a set containing some element `M > N`. +In the `Max` case, one could pick a "very large" `N` and then assume that `Max` computation has "failed" +(exceeded the `UNROLL_TIMES_Max` recursion limit) if the result was ever equal to `N`, +though "very large" is of course subjective and gives absolutely no guarantees that `Max` won't be called on a set containing some element `M > N`. -As the recursion becomes more complex (e.g. non-primitive or non-tail), attempting to implement a sort of "monitor" via default values quickly becomes impractical, if not impossible. +As the recursion becomes more complex (e.g. non-primitive or non-tail), +attempting to implement a sort of "monitor" via default values quickly becomes impractical, if not impossible. -Fundamentally though, it is very easy to accidentally either introduce spurious invariant violations, or hide actual invariant violations by doing this. For instance, in a specification with +Fundamentally though, it is very easy to accidentally either introduce spurious invariant violations, +or hide actual invariant violations by doing this. For instance, in a specification with ```tla UNROLL_DEFAULT_Max = 42 UNROLL_TIMES_Max = 2 @@ -113,7 +130,8 @@ Apalache could "prove" `Inv` holds, as it would rewrite this `Inv` to \A n \in 10..20: MaxInternal(1..n) = 99 ``` -and `MaxInternal(1..n)` evaluates to `99` for all `n \in 3..99` (and might still evaluate to `99` for `n > 99`, based on the `CHOOSE` order), despite the fact that `Max(1..n) = n` in the mathematical sense. +and `MaxInternal(1..n)` evaluates to `99` for all `n \in 3..99` (and might still evaluate to `99` for `n > 99`, +based on the `CHOOSE` order), despite the fact that `Max(1..n) = n` in the mathematical sense. Consider now the much simpler alternative: ```tla @@ -122,11 +140,13 @@ NonRecursiveMax(S) == ApaFoldSet(Max2, 0, S) ``` -In this case, the user doesn't have to think about defaults (aside from the empty-set case), or recursion, as `ApaFoldSet` ensures `|S|`-step "iteration". As an additional benefit, one also doesn't need to use `CHOOSE` to select elements this way. +In this case, the user doesn't have to think about defaults (aside from the empty-set case), or recursion, +as `ApaFoldSet` ensures `|S|`-step "iteration". As an additional benefit, one also doesn't need to use `CHOOSE` to select elements this way. So ultimately, the reasons for abandoning support for recursive operators boils down to the following: - **In the vast majority of cases, equivalent functionality can be achieved by using `ApaFoldSet` or `ApaFoldSeqLeft`** - `UNROLL_TIMES_Op` is hard to determine, or doesn't exist statically, - `UNROLL_DEFAULT_Op` is hard to determine, - - Apalache doesn't have runtime evaluation of recursion, so it can't natively determine when a call to a recursive `Op` would have required more than `UNROLL_TIMES_Op` steps + - Apalache doesn't have runtime evaluation of recursion, so it can't natively determine when a call to a recursive `Op` + would have required more than `UNROLL_TIMES_Op` steps - The use of recursive operators produces unpredictable results, particularly when used in invariants \ No newline at end of file diff --git a/docs/src/apalache/profiling.md b/docs/src/apalache/profiling.md index d018c3fbbe..3abc43bda4 100644 --- a/docs/src/apalache/profiling.md +++ b/docs/src/apalache/profiling.md @@ -1,11 +1,13 @@ # Profiling Your Specification -**Warning:** Profiling only works in the incremental SMT mode, that is, when the model checker is run -with `--algo=incremental`, or without the option -`--algo` specified. - -As Apalache translates the TLA+ specification to SMT, it often defeats our intuition about the standard bottlenecks that -one learns about when running TLC. For instance, whereas TLC needs a lot of time to compute the initial states for the +**Warning:** Profiling only works in the incremental SMT mode, that is, +when the model checker is run with `--algo=incremental`, +or without the option `--algo` specified. + +As Apalache translates the TLA+ specification to SMT, +it often defeats our intuition about the standard bottlenecks that +one learns about when running TLC. +For instance, whereas TLC needs a lot of time to compute the initial states for the following specification, Apalache can check the executions of length up to ten steps in seconds: ```tla diff --git a/docs/src/apalache/running.md b/docs/src/apalache/running.md index 19b4cc4629..8f9bf86999 100644 --- a/docs/src/apalache/running.md +++ b/docs/src/apalache/running.md @@ -1,7 +1,7 @@ # Running the Tool -**Opt-in statistics programme**: if you opt-in for statistics collection (off by default), then every run of Apalache -will submit anonymized statistics to +**Opt-in statistics programme**: if you opt in for statistics collection (off by default), +then every run of Apalache will submit anonymized statistics to `tlapl.us`. See the details in [TLA+ Anonymized Execution Statistics](./statistics.md). Apalache supports several modes of execution. You can run it with the `--help` option, @@ -14,19 +14,23 @@ $ apalache-mc --help The most important commands are as follows: - `parse` reads a TLA+ specification with the SANY parser and flattens it by - instantiating all modules. It terminates successfully, if there are no parse + instantiating all modules. It terminates successfully if there are no parse errors. The input specification to `parse` may be given in standard TLA+ format, or in the [JSON serialization format][], while the outputs are produced in both formats. - `typecheck` performs all of the operations of `parse` and additionally runs the type checker Snowcat to infer - the types of all expressions in the parsed specification. It terminates successfully, if there are no type errors. + the types of all expressions in the parsed specification. It terminates successfully if there are no type errors. - - `simulate` performs all of the operations of `typecheck` and additionally runs the model checker in simulation mode, which *randomly* picks a sequence of [actions](https://apalache.informal.systems/docs/apalache/assignments-in-depth.html#slices) and checks the invariants for the subset of all executions which only admit actions in the selected order. - It terminates successfully, if there are no invariant violations. - This command usually checks randomized symbolic runs much faster than the `check` command. + - `simulate` performs all of the operations of `typecheck` and additionally runs the model checker in simulation mode, + which *randomly* picks a sequence of [actions](https://apalache.informal.systems/docs/apalache/assignments-in-depth.html#slices) + and checks the invariants for the subset of all executions which only admit actions in the selected order. + It terminates successfully if there are no invariant violations. + This command usually checks randomized symbolic runs much faster than the `check` command. - - `check` performs all of the operations of `typecheck` and then runs the model checker in bounded model checking mode, which checks invariants for *all executions*, the length of which does not exceed the value specified by the `--length` parameter. - It terminates successfully, if there are no invariant violations. + - `check` performs all of the operations of `typecheck` and then runs the model checker in bounded model checking mode, + which checks invariants for *all executions*, + the length of which does not exceed the value specified by the `--length` parameter. + It terminates successfully if there are no invariant violations. - `test` performs all of the operations of `check` in a mode that is designed to [test a single action](https://apalache.informal.systems/docs/adr/006rfc-unit-testing.html#32-testing-actions). @@ -79,8 +83,8 @@ The arguments are as follows: - `--no-deadlock` disables deadlock-checking, when `--discard-disabled=false` is on. When `--discard-disabled=true`, deadlocks are found in any case. - `--tuning-options-file` specifies a properties file that stores options for - [fine tuning](tuning.md) - - `--tuning-options` can pass and/or override these [fine tuning](tuning.md) + [fine-tuning](tuning.md) + - `--tuning-options` can pass and/or override these [fine-tuning](tuning.md) options on the command line - `--out-dir` set location for outputting any generated logs or artifacts, *`./_apalache-out` by default* @@ -177,13 +181,13 @@ To check executions of arbitrary lengths, one usually finds a formula that satis /\ TypeOK /\ IndInv /\ Next => TypeOK' /\ IndInv' ``` -In normal words: (1) The initial states satisfy the constraint `TypeOK /\ -IndInv`, and (2) whenever the specification makes a step when starting in a state that satisfies `TypeOK /\ IndInv`, it -ends up in a state that again satisfies `TypeOK /\ IndInv`. +In normal words: (1) The initial states satisfy the constraint `TypeOK /\ IndInv`, +and (2) whenever the specification makes a step when starting in a state that satisfies `TypeOK /\ IndInv`, +it ends up in a state that again satisfies `TypeOK /\ IndInv`. -Note that we usually check `IndInv` in conjunction with `TypeOK`, as we have to constrain the variable values. In -the `y2k` example, our inductive invariant is actually constraing the variables. In fact, such an inductive invariant is -usually called `TypeOK`. +Note that we usually check `IndInv` in conjunction with `TypeOK`, as we have to constrain the variable values. +In the `y2k` example, our inductive invariant is actually constraining the variables. +In fact, such an inductive invariant is usually called `TypeOK`. To check an inductive invariant ``IndInv`` in Apalache, you run two commands that check the above two formulas: @@ -244,9 +248,8 @@ $ cd test/tla apalache-mc check --cinit=ConstInit --length=20 --inv=Safety y2k_cinit.tla ``` -This command checks, whether `Safety` can be violated in 20 specification steps. The constants are initialized with the -predicate -`ConstInit`, defined in `y2k_cinit.tla` as: +This command checks, whether `Safety` can be violated in 20 specification steps. +The constants are initialized with the predicate `ConstInit`, defined in `y2k_cinit.tla` as: ```tla ConstInit == BIRTH_YEAR \in 0..99 /\ LICENSE_AGE \in 10..99 @@ -277,9 +280,9 @@ Apalache uses [the SANY parser](https://lamport.azurewebsites.net/tla/tools.html and the TLA+ Toolbox. By default, SANY is looking for modules (in this order) in 1. The current working directory. -1. The directory containing the main TLA+ file passed on the CLI. -1. A small Apalache standard library (bundled from `$APALACHE_HOME/src/tla`). -1. The Java package `tla2sany.StandardModules` (usually provided by the `tla2tools.jar` that is included in the Java +2. The directory containing the main TLA+ file passed on the CLI. +3. A small Apalache standard library (bundled from `$APALACHE_HOME/src/tla`). +4. The Java package `tla2sany.StandardModules` (usually provided by the `tla2tools.jar` that is included in the Java classpath). __Note:__ To let TLA+ Toolbox and TLC know about the Apalache modules, include @@ -298,8 +301,8 @@ Each run will produce a unique subdirectory inside its "namespace", derived from the file name of the specification, using the following convention `yyyy-MM-ddTHH-mm-ss_`. -For an example, consider using the default location of the `run-dir` for a run of -Apalache on a spec named `test.tla`. This will create a directory structuring matching following pattern: +For an example, consider using the default location of the `run-dir` for a run of Apalache on a spec named `test.tla`. +This will create a directory structuring matching the following pattern: ``` ./_apalache-out/ @@ -352,11 +355,11 @@ In this case, Apalache performs the following steps: 1. It parses the specification with [SANY](https://lamport.azurewebsites.net/tla/tools.html). -1. It translates SANY semantic nodes +2. It translates SANY semantic nodes into [Apalache IR](https://github.com/informalsystems/apalache/blob/master/tlair/src/main/scala/at/forsyte/apalache/tla/lir/package.scala) . -1. If the `--write-intermediate` flag is given, it pretty-prints the IR into +3. If the `--write-intermediate` flag is given, it pretty-prints the IR into the output directory (see [Detailed output](#detailed)). You can also write output to a specified location by using the `--output` flag. @@ -370,5 +373,5 @@ will write the IR to the file `result.json`. [alternative SMT encoding using arrays]: ../adr/011adr-smt-arrays.md [Enumeration of counterexamples]: ./principles/enumeration.md -[JSON serialization format]: ../adr/005adr-json.html +[JSON serialization format]: ../adr/005adr-json.md reads a TLA+ specification with the SANY parser and flattens it by diff --git a/docs/src/apalache/statistics.md b/docs/src/apalache/statistics.md index 42ff51cd1b..fafe1ff592 100644 --- a/docs/src/apalache/statistics.md +++ b/docs/src/apalache/statistics.md @@ -4,7 +4,7 @@ Apalache participates in the optional [anonymized statistics programme] along with [TLA+ Toolbox], TLC (which is part of the Toolbox), and [Visual Studio Code Plugin for TLA+]. -The statistics collection is **never enabled by default**. You have to **opt-in** +The statistics collection is **never enabled by default**. You have to **opt in** for the programme either in TLA+ Toolbox, or in Apalache. When statistics collection is enabled by the user, it is submitted to `tlapl.us` via the util.[ExecutionStatisticsCollector], which is part of `tla2tools.jar`. Apalache @@ -12,19 +12,19 @@ accesses this class in at.forsyte.apalache.tla.[Tool]. As explained in [anonymized statistics programme], if you never create the file `$HOME/.tlaplus/esc.txt`, then the statistics is not submitted to `tlapl.us`. -If you opt-in for the programme and later remove the file, then the statistics -will not be submitted too. +If you opt in for the programme and later remove the file, then the statistics +will not be submitted either. -## Why do we ask you to help us +## Why do we ask you to help us? There are several reasons: - Although our project is open source, developing Apalache is our main job. We are grateful to [Informal Systems] for supporting us and to [TU Wien], [Vienna Science and Technology Fund], and [Inria Nancy/LORIA], who - supported us in the past. It is easier to convince our decision makers to - continue the development, if we have clear feedback on how many people + supported us in the past. It is easier to convince our decision makers to + continue the development if we have clear feedback on how many people **use and need Apalache**. - We would like to know which features you are using most, so we can focus on @@ -35,7 +35,7 @@ There are several reasons: ## How to opt-in and opt-out -To opt-in in the statistics collection, execute the following command: +To opt in the statistics collection, execute the following command: ```sh ./apalache-mc config --enable-stats=true @@ -45,7 +45,7 @@ As a result of this command, a random identifier is written in `$HOME/.tlaplus/esc.txt`. This identifier is used by the execution statistics code. -To opt-out from the statistics collection, execute the following command: +To opt out from the statistics collection, execute the following command: ```sh ./apalache-mc config --enable-stats=false @@ -55,9 +55,9 @@ To opt-out from the statistics collection, execute the following command: You can check the daily log at [exec-stats.tlapl.us](https://exec-stats.tlapl.us/). -The following data is submitted for each run, if you have opted in: +The following data is submitted for each run if you have opted in: - - Total number of CPU cores and cores assigned + - The total number of CPU cores and cores assigned (the latter is 1 for now, but will change soon) - Java heap memory size (in Megabytes) - Apalache version (semantic version + build) diff --git a/docs/src/apalache/theory.md b/docs/src/apalache/theory.md index 3bfa4c8891..d648357c40 100644 --- a/docs/src/apalache/theory.md +++ b/docs/src/apalache/theory.md @@ -30,5 +30,5 @@ Here, `[[_]]` is the translator from TLA+ to SMT. Importantly, the values for the states `s_0`, ..., `s_k` are not enumerated as in TLC, but have to be found by the SMT solver. -If you would like to learn more about theory behind Apalache, check the [paper -delivered at OOPSLA19](https://dl.acm.org/doi/10.1145/3360549). +If you would like to learn more about the theory behind Apalache, +check the [paper delivered at OOPSLA19](https://dl.acm.org/doi/10.1145/3360549). diff --git a/docs/src/apalache/tuning.md b/docs/src/apalache/tuning.md index 3ba59f6589..d3a60c5eeb 100644 --- a/docs/src/apalache/tuning.md +++ b/docs/src/apalache/tuning.md @@ -1,7 +1,7 @@ Parameters for fine tuning ========================== -The parameters for fine tuning can be passed to the checker in a properties +The parameters for fine-tuning can be passed to the checker in a properties file. Its name is given with the command-line option `--tuning-options-file=my.properties`. This file supports variable substitution, e.g., `${x}` is replaced with the value of `x`, if it was previously declared. @@ -34,7 +34,7 @@ reports 'UNKNOWN', and the model checker reports a runtime error. `search.invariant.mode=(before|after)` defines the moment when the invariant is checked. In the `after` mode, all transitions are first translated, one of them -is picked non-deterministically and then the invariant is checked. Although this +is picked non-deterministically, and then the invariant is checked. Although this mode reduces the number of SMT queries, it usually requires more memory than the `before` mode. In the `before` mode, the invariant is checked for every enabled transition independently. The `before` mode may drastically reduce memory @@ -85,7 +85,7 @@ model checker to check Note that there is no direct correspondence between invariant numbers and the operators in a TLA+ spec. Rather, the numbers refer to *verification conditions* -(i.e., broken up parts of a TLA+ invariant operator). To find these numbers, run +(i.e., broken-up parts of a TLA+ invariant operator). To find these numbers, run Apalache with `--write-intermediate=true` and check the invariant numbers in `_apalache-out/.tla/*/intermediate/XX_OutVCGen.tla`. The 0th state invariant is called `VCInv_si_0`, the 1st state invariant is called @@ -101,7 +101,7 @@ different machines to speed up turnaround time. ## Translation to SMT -### Short circuiting +### Short-circuiting `rewriter.shortCircuit=(false|true)`. When `rewriter.shortCircuit=true`, `A \/ B` and `A /\ B` are translated to SMT as if-then-else expressions, e.g., `(ite A diff --git a/docs/src/apalache/types-and-annotations.md b/docs/src/apalache/types-and-annotations.md index 3d2439e958..defac8979c 100644 --- a/docs/src/apalache/types-and-annotations.md +++ b/docs/src/apalache/types-and-annotations.md @@ -23,7 +23,7 @@ Simply, search for the following declaration: a <: b == a ``` -If your specification contains such a declaration and it is used somewhere, +If your specification contains such a declaration, and it is used somewhere, then you are using the old type annotations. You have to remove them and write the new ones. The old and new type annotations are conceptually different, so there is no automatic upgrade process. @@ -32,7 +32,7 @@ no automatic upgrade process. Our model checker assigns types to variables, in order to encode TLA+ expressions in [Z3](https://github.com/Z3Prover/z3). Hence, the expressions that are ill-typed -(from the point of view of our type system), will be rejected right away. Some +(from the point of view of our type system) will be rejected right away. Some expressions, such as ``{}`` and ``<<>>`` require an advanced type inference algorithm, so the model checker will ask the user to provide the tool with a type annotation. To get an idea of our type system, check Section 2. In a nutshell, @@ -49,40 +49,40 @@ inference algorithm for every computation step: 1. It assumes that all operator definitions have been replaced with their bodies. (This is done automatically by Apalache.) - 1. It assumes that non-primed variables have been assigned types already. + 2. It assumes that non-primed variables have been assigned types already. As expected, the non-primed variables get their initial types by running type inference on ``Init``. - 1. It recursively computes the types of subexpressions in a TLA+ expression in + 3. It recursively computes the types of subexpressions in a TLA+ expression in a bottom-up way as follows: 1. A literal is assigned the respective basic type. That is, an integer, a Boolean, or a string gets assigned the integer, Boolean, or the constant type respectively. - 1. An assignment-like expression ``x' = e`` or ``x' \in S`` assigns to ``x'`` + 2. An assignment-like expression ``x' = e`` or ``x' \in S`` assigns to ``x'`` the type of ``e`` and the type of ``S`` elements respectively. The type checker requires that ``x'`` is assigned the same type across all formula branches. However, variables _may_ have different types at different steps. For instance, the definitions ``Init == x = 1`` and ``Next == x' = {x}`` will be processed perfectly fine: ``x`` is assigned the type ``Int`` in the initial states, and the type ``Set(...(Set(Int)))`` of _n_ nested sets at the _n_-th step, ``n >= 0``. - 1. The expressions that introduce bound variables, e.g., ``{e: x \in S}``, + 3. The expressions that introduce bound variables, e.g., ``{e: x \in S}``, are treated as usual: first, the type of ``S`` is computed and ``x`` is assigned the element type, and then the type of ``e`` is computed, which immediately gives us the type of the set expression. This approach manages to automatically compute types of many TLA+ expressions. -However, there a few problematic cases that require type annotations: +However, there are a few problematic cases that require type annotations: 1. An empty set ``{}`` gets assigned the type ``Set[Unknown]``. When it is later combined with a more precise type, e.g., as in ``{} \union {1}``, the type finder reports a type error. In this case, the user has to write a type annotation. For instance, the above-mentioned problematic expression can be fixed as follows: ``({} <: {Int}) \union {1}``. - 1. Similar to an empty set, an empty sequence ``<<>>`` gets assigned the type + 2. Similar to an empty set, an empty sequence ``<<>>`` gets assigned the type ``Seq[Unknown]``. Hence ``<<>> \o <<1>>`` produces a type error. To resolve this, the user has to write a type annotation ``(<<>> <: Seq(Int)) \o <<1>>``. - 1. It is common to mix records that have different sets of fields, e.g., + 3. It is common to mix records that have different sets of fields, e.g., see [Paxos](https://github.com/tlaplus/Examples/tree/master/specifications/Paxos). However, our type checker will report a type error on the following expression: ``{[type |-> "1a", bal |-> 1]} \union {[type |-> "2a", bal |-> 2, val |-> 3]}``. @@ -103,7 +103,7 @@ As a preliminary step, the user has to introduce the operator ``<:`` as follows: a <: b == a ``` -This operator does not nothing else but returns its first argument, so the standard TLA+ +This operator does nothing else but returns its first argument, so the standard TLA+ tools will ignore the second argument, which contains a type annotation. Our model checker interprets the second argument of the operator ``<:`` as a type annotation. (This also means that you should not assign any other meaning to ``<:`` in your specifications.) @@ -112,34 +112,34 @@ Further, the user may use ``<:`` to define types of problematic expressions, see examples in Section 1. The syntax for type annotations is given below. Note that these expressions should not be -understood as sets of values, as one would expects from type invariants such as ``TypeOK``. Rather, -they are TLA+ expressions that are parsed by the model checker, in order to construct types. +understood as sets of values, as one would expect from type invariants such as ``TypeOK``. +Rather, they are TLA+ expressions that are parsed by the model checker, in order to construct types. The syntax of type annotations is as follows: 1. ``Int`` specifies the integer type. For instance, ``x <: Int`` specifies that ``x`` is an integer, but not a set of integers. - 1. ``BOOLEAN`` specifies the Boolean type. Again, although we are using a set here, + 2. ``BOOLEAN`` specifies the Boolean type. Again, although we are using a set here, its purpose is to say that an expression is a Boolean, not a set of Booleans. - 1. ``STRING`` specifies the type of constants, e.g., ``"a"`` and ``"hello"`` + 3. ``STRING`` specifies the type of constants, e.g., ``"a"`` and ``"hello"`` are such constants. - 1. ``{T}`` specifies the set whose elements have type ``T``. For instance, + 4. ``{T}`` specifies the set whose elements have type ``T``. For instance, ``{Int}`` specifies a set of integers, whereas ``{{BOOLEAN}}`` specifies a set of sets of Booleans. Note that you should always use singleton sets in type annotations. For instance, ``{Int, BOOLEAN}`` would be immediately rejected. Hence, sets should contain the elements of the same type (there is some flexibility for records, see Section 1) - 1. ``[T_1 -> T_2]`` specifies the type of a function whose arguments have type ``T_1``, + 5. ``[T_1 -> T_2]`` specifies the type of a function whose arguments have type ``T_1``, and the results are of type ``T_2``. Hence, a function should return the values of the same type. - 1. ``<>`` specifies the type of a _k_-element tuple whose + 6. ``<>`` specifies the type of a _k_-element tuple whose elements have types ``T_1, ..., T_k`` respectively. Note that different fields - of a tuple are allowed to have different types. In these sense, we differentiate them + of a tuple are allowed to have different types. In this sense, we differentiate them from the general functions. - 1. ``[f_1 |-> T_1, ..., f_k |-> T_k]`` specifies the type of a _k_-field record, + 7. ``[f_1 |-> T_1, ..., f_k |-> T_k]`` specifies the type of a _k_-field record, whose field ``f_i`` is of the type ``T_i``. The types ``T_1, ..., T_k`` may differ. Again, that makes them different from the general functions. - 1. ``Seq(T)`` specifies the type of finite sequences, whose elements are of type ``T``. + 8. ``Seq(T)`` specifies the type of finite sequences, whose elements are of type ``T``. There are no restrictions on the sequence length, except finiteness. In theory, a sequence of type ``Seq[T]`` is no different from a function of type ``[Int -> T]``. In practice, we use different encodings for the general functions and sequences. diff --git a/docs/src/idiomatic/000keep-minimum-state-variables.md b/docs/src/idiomatic/000keep-minimum-state-variables.md index df66b1f933..8f7df002f6 100644 --- a/docs/src/idiomatic/000keep-minimum-state-variables.md +++ b/docs/src/idiomatic/000keep-minimum-state-variables.md @@ -1,13 +1,16 @@ # Idiom 0: Keep state variables to the minimum -In imperative programming, it is common to use mutable variable assignments liberally, but to exercise caution whenever mutable variables have a global scope. In TLA+, mutable variables are always global, so it is important to use them carefully and in a way that accurately reflects the global state of the system you are specifying. +In imperative programming, it is common to use mutable variable assignments liberally, +but to exercise caution whenever mutable variables have a global scope. +In TLA+, mutable variables are always global, so it is important to use them carefully and in a way +that accurately reflects the global state of the system you are specifying. ## Description _A good TLA+ specification minimizes the computation state and makes it visible_. -TLA+ does not have special syntax for variable assignment. For a good -reason. The power of TLA+ is in writing constraints on variables rather than in +TLA+ does not have a special syntax for variable assignment. For a good reason. +The power of TLA+ is in writing constraints on variables rather than in writing detailed commands. If you have been writing in languages such as C, C++, Java, Python, your first reflex would be to define a variable to store the intermediate result of a complex computation. @@ -15,8 +18,8 @@ intermediate result of a complex computation. In programming languages, we introduce temporary variables for several reasons: 1. To avoid repetitive computations of the same expression, - 1. To break down a large expression into a series of smaller expressions, - 1. To make the code concise. + 2. To break down a large expression into a series of smaller expressions, + 3. To make the code concise. Point 1 is a non-issue in TLA+, as it is mostly executed in the reader's brain, and people are probably less efficient in caching expressions than computers. @@ -40,7 +43,10 @@ Sometimes, we have to expose the internals of the computation. For instance, if we want to closely monitor the values of the computed expressions, when using the specification for model-based testing. -Sometimes, we have to break this idiom to make the specification more readable. Here is an example by Markus Kuppe. The specification of [BlockingQueue](https://github.com/lemmy/BlockingQueue/blob/3a66f46f6f5703f2863f71baaf0aedaaee58836f/BlockingQueueSplit.tla#L16-L51) that has one more variable is easier to read than [the original specification](https://github.com/lemmy/BlockingQueue/blob/3a66f46f6f5703f2863f71baaf0aedaaee58836f/BlockingQueue.tla) with a minimal number of variables. +Sometimes, we have to break this idiom to make the specification more readable. +Here is an example by Markus Kuppe. +The specification of [BlockingQueue](https://github.com/lemmy/BlockingQueue/blob/3a66f46f6f5703f2863f71baaf0aedaaee58836f/BlockingQueueSplit.tla#L16-L51) +that has one more variable is easier to read than [the original specification](https://github.com/lemmy/BlockingQueue/blob/3a66f46f6f5703f2863f71baaf0aedaaee58836f/BlockingQueue.tla) with a minimal number of variables. ## Example Consider the following implementation of [Bubble sort] in Python: diff --git a/docs/src/idiomatic/001assignments.md b/docs/src/idiomatic/001assignments.md index eaad0c137f..031d7cd344 100644 --- a/docs/src/idiomatic/001assignments.md +++ b/docs/src/idiomatic/001assignments.md @@ -2,15 +2,15 @@ ## Description -The idiom "[Keep state variables to the -minimum](000keep-minimum-state-variables.md)" tells us to store the minimum -necessary state variables. By following this idiom, we develop -the specification by writing constraints over the primed variables. +The idiom "[Keep state variables to the minimum](000keep-minimum-state-variables.md)" +tells us to store the minimum necessary state variables. +By following this idiom, we develop the specification by writing constraints over the primed variables. TLA+ comes with a great freedom of expressing constraints over variables. While we love TLA+ for that freedom, we believe that constraints over primed variables are sometimes confusing. -TLA+ uses the same glyph, `=` for three separate purposes: assignment, asserting equality, and binding variables. But these are very different operations and have different semantics. +TLA+ uses the same glyph, `=` for three separate purposes: assignment, asserting equality, and binding variables. +But these are very different operations and have different semantics. ### Issue 1 **tl;dr:** Use `:=` (supplied by the `Apalache.tla` module) instead of `=` for assignment. diff --git a/docs/src/idiomatic/002primes.md b/docs/src/idiomatic/002primes.md index b914c5a538..ee980dc4b8 100644 --- a/docs/src/idiomatic/002primes.md +++ b/docs/src/idiomatic/002primes.md @@ -98,7 +98,7 @@ see Chapter 5 and pages 312-313 of [Specifying Systems]. ## Disadvantages - - Sometimes, the operator "prime" helps us in avoiding code duplication. + - Sometimes, the operator "prime" helps us to avoid code duplication. For instance, you can write a state invariant `Inv` and later evaluate it against a next state by simply writing `Inv'`. However, you have to be careful about propagation of primes in `Inv`. diff --git a/docs/src/idiomatic/003record-sets.md b/docs/src/idiomatic/003record-sets.md index aa86a5ce97..4fa311761a 100644 --- a/docs/src/idiomatic/003record-sets.md +++ b/docs/src/idiomatic/003record-sets.md @@ -1,23 +1,38 @@ # Idiom 15: Replace sets of mixed records with disjoint unions -Message sets are canonically modeled as sets of records with mixed types. While the current type system supports this, in the future, Apalache is likely going to change support for these kinds of sets and implement stricter type-checking. See [this](https://github.com/informalsystems/apalache/issues/401) issue for a discussion. -This document aims to provide instructions for users to migrate their specs to maintain type compatibility in the future (and improve the performance and robustness of current specs in the present). +Message sets are canonically modeled as sets of records with mixed types. +While the current type system supports this, in the future, +Apalache is likely going to change support for these kinds of sets and implement stricter type-checking. +See [this](https://github.com/informalsystems/apalache/issues/401) issue for a discussion. +This document aims to provide instructions for users to migrate their specs to maintain type compatibility +in the future (and improve the performance and robustness of current specs in the present). ## The common approach -Apalache allows mixed sets of records, by defining the type of the set to be `Set(r)`, where `r` is the record type which contains all of the fields, which are held by at least one set member. For example: +Apalache allows mixed sets of records, by defining the type of the set to be `Set(r)`, +where `r` is the record type which contains all the fields, which are held by at least one set member. +For example: ```tla { [x: Int], [y: Str] } ``` -would have the type `Set([x:Int,y:Str])`. The only constraints Apalache imposes are that, if two set elements declared the same field name, the types of the fields have to match. Consequently, given +would have the type `Set([x:Int,y:Str])`. +The only constraints Apalache imposes are that, if two set elements declared the same field name, +the types of the fields have to match. +Consequently, given ```tla A == { [x: Int, z: Bool], [y: Str, z: Bool] } B == { [x: Int, z: Bool], [y: Str, z: Int] } ``` -`A` is considered well typed, and is assigned the type `Set([x:Int, y:Str, z:Bool])`, whereas `B` is rejected by the type checker. +`A` is considered well typed, and is assigned the type `Set([x:Int, y:Str, z:Bool])`, +whereas `B` is rejected by the type checker. -The treatment of record types was implemented in this fashion, to maintain backward-compatibility with specifications of message-based algorithms, which typically encoded different message types as records of the shape `[ type: Str, ... ]`, where all messages shared a disambiguation filed (commonly named `type`), the value of which described the category of the message. Additional fields depended on the value of `type`. +The treatment of record types was implemented in this fashion, +to maintain backward-compatibility with specifications of message-based algorithms, +which typically encoded different message types as records of the shape `[ type: Str, ... ]`, +where all messages shared a disambiguation filed (commonly named `type`), +the value of which described the category of the message. +Additional fields depended on the value of `type`. The bellow snippet from [Paxos.tla][] demonstrates this convention: ```tla \* The set of all possible messages @@ -28,17 +43,26 @@ Message == [type : {"1a"}, bal : Ballot] \union [type : {"2b"}, acc : Acceptor, bal : Ballot, val : Value] ``` -Ultimately, this approach both disagrees with our interpretation of the purpose of a type-system for TLA+, as well as introduces unsoundness, in the sense that it makes it impossible, at the type-checking level, to detect record-field access violations. +Ultimately, this approach both disagrees with our interpretation of the purpose of a type-system for TLA+, +as well as introduces unsoundness, in the sense that it makes it impossible, at the type-checking level, +to detect record-field access violations. Consider the following: ``` \E m \in Message: m.type = "1a" /\ m.mbal = -1 ``` -As defined above, messages for which `m.type = "1a"` do not define a field named `mbal`, however, the type of `Message` is `Set([type: Str, ..., mbal: Int, ...])`, which means, that `m` is assumed to have an `mbal` field, typed `Int`. Thus, this access error can only be caught much later in the model-checking process, instead of at the level of static analysis provided by the type-checker. +As defined above, messages for which `m.type = "1a"` do not define a field named `mbal`, +however, the type of `Message` is `Set([type: Str, ..., mbal: Int, ...])`, +which means, that `m` is assumed to have an `mbal` field, typed `Int`. +Thus, this access error can only be caught much later in the model-checking process, +instead of at the level of static analysis provided by the type-checker. ## The proposed changes -This section outlines a proposed migration strategy, to replace such sets in older specifications. The convention presented in this section works with both the current version of Apalache, as well as the next iteration of the type-checker, currently in development. +This section outlines a proposed migration strategy, to replace such sets in older specifications. +The convention presented in this section works with both the current version of Apalache, +as well as the next iteration of the type-checker, currently in development. -Suppose we use messages with types `t1,...,tn` in the specification and a message set variable `msgs`, like in the snippet below: +Suppose we use messages with types `t1,...,tn` in the specification and a message set variable `msgs`, +like in the snippet below: ```tla VARIABLE @@ -57,7 +81,8 @@ Message == [type : {"t1"}, x1: S1, ...] TypeOk: msgs \subseteq Message ``` -We propose the following substitution: Instead of modeling the union of all messages as a single set, we model their disjoint union explicitly, with a record, in the following way: +We propose the following substitution: Instead of modeling the union of all messages as a single set, +we model their disjoint union explicitly, with a record, in the following way: ```tla \* @type: [ int: Set([x: Int]), str: Set([y: Str]) ]; @@ -67,24 +92,31 @@ Messages == [ tn: [xn: Sn, ...] ] ``` -This way, `Messages.t1` represents the set of all messages `m`, for which `m.type` would have been equal to "t1" in the original implementation, that is, `[type: {"t1"}, x1: S1, ...]`. +This way, `Messages.t1` represents the set of all messages `m`, +for which `m.type` would have been equal to "t1" in the original implementation, that is, `[type: {"t1"}, x1: S1, ...]`. For example, assume the original specification included ```tla Messages == [type: {"t1"}, x: {1,2,3}] \union [type: {"t2"}, y:{"a","b","c"}] ``` -that is, defined two types of messages: "t1", with an integer-valued field "x" and "t2" with a string-valued field "y". The type of any `m \in Messages` would have been `[type: Str, x: Int, y: Str]` in the old approach. +that is, defined two types of messages: "t1", with an integer-valued field "x" and "t2" with a string-valued field "y". +The type of any `m \in Messages` would have been `[type: Str, x: Int, y: Str]` in the old approach. The rewritten version would be: ```tla Messages == [ t1: [x:{1,2,3}], t2: [y:{"a","b", "c"}] ] ``` -If we took `m: [ t1: Set([x: Int]), t2: Set([y: Str]) ]`, `m` would be a record pointing to two disjoint sets of messages (of categories "t1" and "t2" respectively). Values in `m.t1` would be records with the type `[x: Int]` and values in `m.t2` would be records with the type `[y: Str]`. +If we took `m: [ t1: Set([x: Int]), t2: Set([y: Str]) ]`, `m` +would be a record pointing to two disjoint sets of messages (of categories "t1" and "t2" respectively). +Values in `m.t1` would be records with the type `[x: Int]` and values in `m.t2` would be records with the type `[y: Str]`. -Note, however, that this approach also requires a change in the way messages are added to, or read from, the "set" of all messages (`m` is a record representing a set, but not a set itself, in the new approach). +Note, however, that this approach also requires a change in the way messages are added to, +or read from, the "set" of all messages (`m` is a record representing a set, but not a set itself, in the new approach). Previously, a message `m` would be added by writing: ``` msgs' = msgs \union {m} ``` -regardless of whether `m.type = "t1"` or `m.type = "t2"`. In the new approach, one must always specify which type of message is being added. However, the type no longer needs to be a property of the message itself, i.e. the `type` field is made redundant. +regardless of whether `m.type = "t1"` or `m.type = "t2"`. +In the new approach, one must always specify which type of message is being added. +However, the type no longer needs to be a property of the message itself, i.e. the `type` field is made redundant. To add a message `m` of the category `ti` one should write ``` @@ -114,6 +146,8 @@ The new approach: {{#include ./MsgSetNew.tla::}} ``` -Note that the new approach, in addition to being sound w.r.t. record types, also typically results in a performance improvement, since type-unification for record sets is generally expensive for the solver. +Note that the new approach, in addition to being sound w.r.t. record types, +also typically results in a performance improvement, +since type-unification for record sets is generally expensive for the solver. [Paxos.tla]: https://github.com/tlaplus/Examples/blob/master/specifications/Paxos/Paxos.tla diff --git a/docs/src/idiomatic/007if-then-else.md b/docs/src/idiomatic/007if-then-else.md index 645fc327c0..7b770e2744 100644 --- a/docs/src/idiomatic/007if-then-else.md +++ b/docs/src/idiomatic/007if-then-else.md @@ -4,15 +4,29 @@ ## Description -TLA+ provides an `IF-THEN-ELSE` operator, and it can be pretty tempting to use it for flow control, as it's done in procedural programming. However, TLA+ is about transitions over a state machine, and a transition-defining action declared with `IF-THEN-ELSE` can be more complex than 2 actions declared without it. Considering that any expression of the form `IF b THEN x ELSE y`, where `x` and `y` are Booleans, can be rewritten as `(b /\ x) \/ (~b /\ y)`, there's a pattern we can apply to get rid of some potentially troublesome `IF-THEN-ELSE` definitions. - -The `IF-THEN-ELSE` operator can be used either to define a value, or to branch some action as a sort of flow control. Defining values with `IF-THEN-ELSE` is common practice and is similar to the use of `IF` expressions in declarative programming languages. However, flow control in TLA+ can be done naturally by behavior definition through actions, making the use of `IF-THEN-ELSE` for flow control unnecessary. This idiom aims to clarify different usages of `IF-THEN-ELSE` expressions, keeping in mind the TLA+ essence of declaring actions to define transitions. +TLA+ provides an `IF-THEN-ELSE` operator, and it can be pretty tempting to use it for flow control, +as it's done in procedural programming. +However, TLA+ is about transitions over a state machine, +and a transition-defining action declared with `IF-THEN-ELSE` can be more complex than 2 actions declared without it. +Considering that any expression of the form `IF b THEN x ELSE y`, where `x` and `y` are Booleans, +can be rewritten as `(b /\ x) \/ (~b /\ y)`, +there's a pattern we can apply to get rid of some potentially troublesome `IF-THEN-ELSE` definitions. + +The `IF-THEN-ELSE` operator can be used either to define a value, or to branch some action as a sort of flow control. +Defining values with `IF-THEN-ELSE` is common practice +and is similar to the use of `IF` expressions in declarative programming languages. +However, flow control in TLA+ can be done naturally by behavior definition through actions, +making the use of `IF-THEN-ELSE` for flow control unnecessary. +This idiom aims to clarify different usages of `IF-THEN-ELSE` expressions, +keeping in mind the TLA+ essence of declaring actions to define transitions. ## When to use `IF-THEN-ELSE` ### When the result is not Boolean -When the `IF-THEN-ELSE` expression doesn't evaluate to a Boolean value, it cannot be rewritten using Boolean operators, so this idiom doesn't apply. For example: +When the `IF-THEN-ELSE` expression doesn't evaluate to a Boolean value, +it cannot be rewritten using Boolean operators, so this idiom doesn't apply. +For example: ```tla SafeDiv(x, y) == IF y /= 0 THEN x/y ELSE 0 @@ -20,7 +34,10 @@ SafeDiv(x, y) == IF y /= 0 THEN x/y ELSE 0 ### When the result is a state formula -State formulas are formulas that don't contain any action operator (e.g. primed variables, `UNCHANGED`). Using `IF-THEN-ELSE` on this type of formula can make it easier to read in some situations, and don't come with any disadvantage. This example state formula uses `IF-THEN-ELSE` to return a Boolean value: +State formulas are formulas that don't contain any action operator (e.g. primed variables, `UNCHANGED`). +Using `IF-THEN-ELSE` on this type of formula can make it easier to read in some situations, +and don't come with any disadvantage. +This example state formula uses `IF-THEN-ELSE` to return a Boolean value: ```tla ValidIdentity(person) == IF Nationalized(person) THEN ValidId(person) ELSE ValidPassport(person) @@ -37,7 +54,10 @@ ValidIdentity(person) == \/ /\ Nationalized(person) ## When there are dependent conditions -Nesting `IF-THEN-ELSE` expressions can be useful when there is a dependency between the conditions where some conditions are only relevant if other conditions are met. This is an example where using an `IF-THEN-ELSE` expressions is clearer than the Boolean operator form. Consider the following: +Nesting `IF-THEN-ELSE` expressions can be useful when there is a dependency between the conditions +where some conditions are only relevant if other conditions are met. +This is an example where using an `IF-THEN-ELSE` expressions is clearer than the Boolean operator form. +Consider the following: ```tla IF c1 @@ -63,7 +83,8 @@ The Boolean operator version is quite verbose: ## When (and how) *not* to use `IF-THEN-ELSE` -Mixing `IF-THEN-ELSE` expressions with action operators introduces unnecessary branching to definitions that could be self-contained and look more like a transition definition. +Mixing `IF-THEN-ELSE` expressions with action operators introduces unnecessary branching to definitions +that could be self-contained and look more like a transition definition. ```tla Withdraw(amount) == IF balance >= amount @@ -73,7 +94,8 @@ Withdraw(amount) == IF balance >= amount /\ response' = "FAILURE" ``` -We could separate the two branches into their own actions with clarifying names and explicit conditions, and use a disjunction between the two actions instead of the `IF-THEN-ELSE` block: +We could separate the two branches into their own actions with clarifying names and explicit conditions, +and use a disjunction between the two actions instead of the `IF-THEN-ELSE` block: ```tla WithdrawSuccess(amount) == /\ balance >= amount @@ -91,7 +113,10 @@ Withdraw(amount) == WithdrawSuccess(amount) \/ WithdrawFailure(amount) - Each action declares fewer transitions, so it's easier to reason about it - A disjunction of actions is closer to a union of transition relations than an `IF-THEN-ELSE` expression is -- Nested `IF-THEN-ELSE` expressions are an extrapolation of these problems and can over-constrain some branches if not done carefully. Using different actions defining its conditions explicitly leaves less room for implicit wrong constraints that an `ELSE` branch allows. See the example below. +- Nested `IF-THEN-ELSE` expressions are an extrapolation of these problems and can over-constrain some branches if not done carefully. + Using different actions defining its conditions explicitly leaves less room for implicit wrong constraints + that an `ELSE` branch allows. + See the example below. Assuming `C1()` is a condition for `A1()` and `C2()` is a condition for `A2()`: @@ -115,8 +140,11 @@ Next == \/ /\ C1() ``` -This second definition can allow more behaviors than the first one (depending on whether `C1()` and `C2()` overlap), and these additional behaviors can be unintentionally left out when `IF-THEN-ELSE` is used without attention. +This second definition can allow more behaviors than the first one (depending on whether `C1()` and `C2()` overlap), +and these additional behaviors can be unintentionally left out when `IF-THEN-ELSE` is used without attention. ## Disadvantages -A disjunction in TLA+ may or may not represent non-determinism, while an `IF-THEN-ELSE` is incapable of introducing non-determinism. If it's important that readers can easily differentiate deterministic and non-deterministic definitions, using `IF-THEN-ELSE` expressions can help to make determinism explicit. +A disjunction in TLA+ may or may not represent non-determinism, while an `IF-THEN-ELSE` is incapable of introducing non-determinism. +If it's important that readers can easily differentiate deterministic and non-deterministic definitions, +using `IF-THEN-ELSE` expressions can help to make determinism explicit. diff --git a/docs/src/idiomatic/index.md b/docs/src/idiomatic/index.md index 83f7d9707a..28bedb6a65 100644 --- a/docs/src/idiomatic/index.md +++ b/docs/src/idiomatic/index.md @@ -11,19 +11,19 @@ In this document, we collect specification idioms that aid us in writing TLA+ specifications that are: 1. understood by distributed system engineers, -1. understood by verification engineers, and -1. understood by automatic analysis tools such as the Apalache model checker. +2. understood by verification engineers, and +3. understood by automatic analysis tools such as the Apalache model checker. -If you believe, that the above points are contradictory when put together, it is +If you believe that the above points are contradictory when put together, it is to some extent true. TLA+ is an extremely general specification language. As a result, it is easy to write a short specification that leaves a human reader -puzzled . It is even easier to write a (syntactically correct) specification that +puzzled. It is even easier to write a (syntactically correct) specification that turns any program trying to reason about TLA+ to dust. Nevertheless, we find TLA+ quite useful when writing concise specifications of distributed protocols at [Informal Systems]. Other specification languages -- especially, those designed for software verification -- would require us to -introduce unnecessary book-keeping details that would both obfuscate the +introduce unnecessary bookkeeping details that would both obfuscate the protocols and make their verification harder. However, we do not always need _"all the power of mathematics"_, so we find it useful to introduce additional structure in TLA+ specifications. diff --git a/docs/src/index.md b/docs/src/index.md index 481a85534e..4ff19a632d 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -3,7 +3,7 @@ This book collects five related, but independent, sets of documentation: 1. [The Apalache User Manual](./apalache/index.md) -1. [Apalache Tutorials](./tutorials/index.md) -1. [Apalache HOWTOs](./HOWTOs/index.md) -1. [A TLA+ Language Reference Manual](./lang/index.md) -1. [Guidelines for Idiomatic TLA+](./idiomatic/index.md) +2. [Apalache Tutorials](./tutorials/index.md) +3. [Apalache HOWTOs](./HOWTOs/index.md) +4. [A TLA+ Language Reference Manual](./lang/index.md) +5. [Guidelines for Idiomatic TLA+](./idiomatic/index.md) diff --git a/docs/src/lang/apalache-operators.md b/docs/src/lang/apalache-operators.md index d7098b9766..7166c4fc8a 100644 --- a/docs/src/lang/apalache-operators.md +++ b/docs/src/lang/apalache-operators.md @@ -1,6 +1,10 @@ # Apalache operators -In addition to the standard TLA+ operators described in the previous section, Apalache defines a number of operators, which do not belong to the core language of TLA+, but which Apalache uses to provide clarity, efficiency, or special functionality. These operators belong to the module `Apalache`, and can be used in any specification by declaring `EXTENDS Apalache`. +In addition to the standard TLA+ operators described in the previous section, +Apalache defines a number of operators, which do not belong to the core language of TLA+, +but which Apalache uses to provide clarity, efficiency, or special functionality. +These operators belong to the module `Apalache`, +and can be used in any specification by declaring `EXTENDS Apalache`. ## Assignment @@ -13,12 +17,17 @@ In addition to the standard TLA+ operators described in the previous section, Ap **Apalache type:** `(a, a) => Bool`, for some type `a` -**Effect:** The expression `v' := e` evaluates to `v' = e`. At the level of Apalache static analysis, such expressions indicate parts of an action, where the value of a state-variable in a successor state is determined. See [here](../idiomatic/001assignments.md) for more details about assignments in Apalache. +**Effect:** The expression `v' := e` evaluates to `v' = e`. +At the level of Apalache static analysis, such expressions indicate parts of an action, +where the value of a state-variable in a successor state is determined. +See [here](../idiomatic/001assignments.md) for more details about assignments in Apalache. **Determinism:** Deterministic. **Errors:** -If the first argument is not a primed variable name, or if the assignment operator is used where assignments are prohibited, Apalache statically reports an error. +If the first argument is not a primed variable name, +or if the assignment operator is used where assignments are prohibited, +Apalache statically reports an error. **Example in TLA+:** @@ -64,7 +73,7 @@ non-empty. If `S` is empty, return some value of the proper type. calls to `Guess(S)` may return `x, y \in S` that can differ (`x /= y`) or may be equal (`x = y`). Moreover, Apalache considers all possible combinations of elements of `S` in the model checking mode. If `S` is empty, `Guess(S)` -produces the same value of proper type. +produces the same value of a proper type. **Errors:** If `S` is not a set, Apalache reports an error. @@ -146,8 +155,8 @@ The operators `ApaFoldSet` and `ApaFoldSeqLeft` are explained in more detail in **Apalache type:** `Set(<>) => (a -> b)`, for some types `a` and `b`. -**Effect:** Convert a set of pairs `S` to a function `F`, with the property that `F(x) = y => <> \in S`. Note that if `S` -contains at least two pairs `<>` and `<>`, such that `y /= z`, then `F` is not uniquely defined. +**Effect:** Convert a set of pairs `S` to a function `F`, with the property that `F(x) = y => <> \in S`. +Note that if `S` contains at least two pairs `<>` and `<>`, such that `y /= z`, then `F` is not uniquely defined. We use `CHOOSE` to resolve this ambiguity. The operator `SetAsFun` can be defined as follows: ```tla @@ -220,8 +229,9 @@ MkSeq(3, Double) = <<2, 4, 6>> \* TRUE **Arguments:** Three arguments: * A function `fn` that should be interpreted as a sequence. -* An integer `len`, denoting the length of the sequence, with the property - `1..len \subseteq DOMAIN fn`. Apalache does not check this requirement. It is up to the user to ensure that it holds. This expression is not necessarily constant. +* An integer `len`, denoting the length of the sequence, with the property `1..len \subseteq DOMAIN fn`. + Apalache does not check this requirement. + It is up to the user to ensure that it holds. This expression is not necessarily constant. * An integer constant `maxLen`, which is an upper bound on `len`, that is, `len <= maxLen`. **Apalache type:** `(Int -> a, Int, Int) => Seq(a)`, for some type `a` @@ -231,8 +241,9 @@ sequence `<< fn[1], ..., fn[Min(len, maxLen)] >>`. **Determinism:** Deterministic. -**Errors:** If the types of `fn`, `len` or `maxLen` do not match the expected types, Apalache statically reports a -type error. Additionally, if it is not the case that `1..len \subseteq DOMAIN fn`, the result is undefined. +**Errors:** If the types of `fn`, `len` or `maxLen` do not match the expected types, +Apalache statically reports a type error. +Additionally, if it is not the case that `1..len \subseteq DOMAIN fn`, the result is undefined. **Example in TLA+:** @@ -280,15 +291,20 @@ g = boundedFn(lambda x: x * x, {0, 42}) **Apalache type:** `(Bool) => Bool` -**Effect:** The expression `Skolem(\E x \in S: P)` provides a hint to Apalache, that the existential quantification may be skolemized. It evaluates to the same value as `\E x \in S: P`. +**Effect:** The expression `Skolem(\E x \in S: P)` provides a hint to Apalache, +that the existential quantification may be skolemized. +It evaluates to the same value as `\E x \in S: P`. **Determinism:** Deterministic. **Errors:** -If `e` is not a Boolean expression, throws a type error. If it is Boolean, but not an existentially quantified expression, throws a `StaticAnalysisException`. +If `e` is not a Boolean expression, throws a type error. +If it is Boolean, but not an existentially quantified expression, throws a `StaticAnalysisException`. **Note:** -This is an operator produced internally by Apalache. You may see instances of this operator, when reading the `.tla` side-outputs of various passes. Manual use of this operator is discouraged and, in many cases, not supported. +This is an operator produced internally by Apalache. +You may see instances of this operator, when reading the `.tla` side-outputs of various passes. +Manual use of this operator is discouraged and, in many cases, not supported. **Example in TLA+:** @@ -309,15 +325,21 @@ Skolem( TRUE ) \* TRUE in TLC, error in Apalache **Apalache type:** `(Set(a)) => Set(a)`, for some `a`. -**Effect:** The expression `Expand(S)` provides instructions to Apalache, that the large set `S` (powerset or set of functions) should be explicitly constructed as a finite set, overriding Apalache's optimizations for dealing with such collections. It evaluates to the same value as `S`. +**Effect:** The expression `Expand(S)` provides instructions to Apalache, +that the large set `S` (powerset or set of functions) should be explicitly constructed as a finite set, +overriding Apalache's optimizations for dealing with such collections. +It evaluates to the same value as `S`. **Determinism:** Deterministic. **Errors:** -If `e` is not a set, throws a type error. If the expression is a set, but is not of the form `SUBSET SS` or `[T1 -> T2]`, throws a `StaticAnalysisException`. +If `e` is not a set, throws a type error. If the expression is a set, +but is not of the form `SUBSET SS` or `[T1 -> T2]`, throws a `StaticAnalysisException`. **Note:** -This is an operator produced internally by Apalache. You may see instances of this operator, when reading the `.tla` side-outputs of various passes. Manual use of this operator is discouraged and, in many cases, not supported. +This is an operator produced internally by Apalache. +You may see instances of this operator, when reading the `.tla` side-outputs of various passes. +Manual use of this operator is discouraged and, in many cases, not supported. **Example in TLA+:** @@ -338,15 +360,21 @@ Expand( 1 ) \* 1 in TLC, type error in Apalache **Apalache type:** `(Bool) => Bool` -**Effect:** The expression `ConstCardinality(Cardinality(S) >= k)` provides a hint to Apalache, that `Cardinality(S)` is a constant, allowing Apalache to encode the constraint `e` without attempting to dynamically encode `Cardinality(S). It evaluates to the same value as `e`. +**Effect:** The expression `ConstCardinality(Cardinality(S) >= k)` provides a hint to Apalache, +that `Cardinality(S)` is a constant, allowing Apalache to encode the constraint `e` +without attempting to dynamically encode `Cardinality(S)`. +It evaluates to the same value as `e`. **Determinism:** Deterministic. **Errors:** -If `S` is not a Boolean expression, throws a type error. If it is Boolean, but not an existentially quantified expression, throws a `StaticAnalysisException`. +If `S` is not a Boolean expression, throws a type error. +If it is Boolean, but not an existentially quantified expression, throws a `StaticAnalysisException`. **Note:** -This is an operator produced internally by Apalache. You may see instances of this operator, when reading the `.tla` side-outputs of various passes. Manual use of this operator is discouraged and, in many cases, not supported. +This is an operator produced internally by Apalache. +You may see instances of this operator, when reading the `.tla` side-outputs of various passes. +Manual use of this operator is discouraged and, in many cases, not supported. **Example in TLA+:** @@ -355,4 +383,3 @@ Skolem( \E x \in {1,2}: x = 1 ) \* TRUE Skolem( 1 ) \* 1 in TLC, type error in Apalache Skolem( TRUE ) \* TRUE in TLC, error in Apalache ``` - diff --git a/docs/src/lang/booleans.md b/docs/src/lang/booleans.md index 7f9280ded4..9dfa9b7e09 100644 --- a/docs/src/lang/booleans.md +++ b/docs/src/lang/booleans.md @@ -259,7 +259,9 @@ True **Arguments:** Two arguments. Although they can be arbitrary expressions, the result is only defined when both arguments are evaluated to Boolean values. -**Apalache type:** `(Bool, Bool) => Bool`. Note that the `=>` operator at the type level expresses the relation of inputs types to output types for operators, and as opposed to the `=>` expressing the implication relation at the value level. +**Apalache type:** `(Bool, Bool) => Bool`. +Note that the `=>` operator at the type level expresses the relation of inputs types to output types for operators, +and as opposed to the `=>` expressing the implication relation at the value level. **Effect:** `F => G` evaluates to: diff --git a/docs/src/lang/conditionals.md b/docs/src/lang/conditionals.md index 4d2a94c2e6..5dd074b728 100644 --- a/docs/src/lang/conditionals.md +++ b/docs/src/lang/conditionals.md @@ -13,7 +13,7 @@ often default to writing expressions such as `IF A THEN B ELSE C`. We encourage those authors to use this construct more sparingly. In our experience, the use of `IF-THEN-ELSE` is rarely required. Many things can be done with [Boolean operators](./booleans.md), which provide more structure in -TLA+ code than in programming languages. We recommend using `IF-THEN-ELSE` to +TLA+ code than in programming languages. We recommend using `IF-THEN-ELSE` to compute predicate-dependent values, not to structure code. **Warning 2:** `CASE` is considered deterministic in this diff --git a/docs/src/lang/control-and-nondeterminism.md b/docs/src/lang/control-and-nondeterminism.md index d1c8684cf8..6a313b98ca 100644 --- a/docs/src/lang/control-and-nondeterminism.md +++ b/docs/src/lang/control-and-nondeterminism.md @@ -165,7 +165,7 @@ After all, people are much better at solving certain logical puzzles than computers, though people get bored much faster than computers. To understand how TLC enumerates states, check Chapter 14 of [Specifying -Systems][]. In the rest of this document, we focus on treatment of +Systems][]. In the rest of this document, we focus on the treatment of non-determinism that is close to the approach in Apalache. ## Explaining non-determinism to computers @@ -255,8 +255,8 @@ has the following properties: 1. For a non-empty set `S`, a call `GUESS S` returns some value `v \in S`. - 1. A call `GUESS {}` halts the evaluation. - 1. There are no assumptions about fairness of `GUESS`. It is free to return + 2. A call `GUESS {}` halts the evaluation. + 3. There are no assumptions about fairness of `GUESS`. It is free to return elements in any order, produce duplicates and ignore some elements. Why do we call it a device? We cannot call it a function, as functions are @@ -280,13 +280,13 @@ we can think of `GUESS S` as being one of the following implementations: we have centralized control over the distributed system, the returned value of RPC may be non-deterministic. - 1. `GUESS S` can be simply the user input. In this case, the user resolves + 2. `GUESS S` can be simply the user input. In this case, the user resolves non-determinism. - 1. `GUESS S` can be controlled by an adversary, who is trying to break the + 3. `GUESS S` can be controlled by an adversary, who is trying to break the system. - 1. `GUESS S` can pick an element by calling a pseudo-random number generator. + 4. `GUESS S` can pick an element by calling a pseudo-random number generator. However, note that RNG is a very special way of resolving non-determinism: It assumes probabilistic distribution of elements (usually, it is close to the [uniform @@ -320,8 +320,8 @@ conditions: value for the name `y'`. If the above assumptions do not hold true, the expression `\E x \in S: P` does -not have non-determinism and it can be evaluated by following the standard -deterministic semantics of exists, see [Logic](./logic.md). +not have non-determinism, and it can be evaluated by following the standard +deterministic semantics of `exists`, see [Logic](./logic.md). **Note:** We do not consider action operators like `UNCHANGED y`. They can be translated into an equivalent form, e.g., `UNCHANGED x` is equivalent to `x' = @@ -341,10 +341,10 @@ evaluated. There are three possible outcomes: In this case, `P` assigns the value of an expression `e` to `y'` as soon as the evaluator meets the expression `y' = e`. The evaluation may continue. - 1. Predicate `P` evaluates to `FALSE` when using the provided value of `x`. + 2. Predicate `P` evaluates to `FALSE` when using the provided value of `x`. Well, that was a wrong guess. According to our semantics, the evaluation halts. See the above discussion on "halting". - 1. The set `S` is empty, and `GUESS S` halts. See the above discussion on + 3. The set `S` is empty, and `GUESS S` halts. See the above discussion on "halting". **Example.** Consider the following specification: @@ -357,13 +357,12 @@ Next == i > x /\ x' = i ``` -It is easy to evaluate `Init`: It does not contain non-determinism and it +It is easy to evaluate `Init`: It does not contain non-determinism, and it produces the binding `(x -> 0)` and the state `[x |-> 0]`, respectively. When evaluating `Next` against the binding `(x -> 0)`, we have plenty of choices. -Actually, we have infinitely many choices, as the set `Int` is infinite. TLC +Actually, we have infinitely many choices, as the set `Int` is infinite. TLC would immediately fail here. But there is no reason for our evaluation to fail. -Simply ask the oracle. Below we give three examples of how the evaluation -works: +Simply ask the oracle. Below, we give three examples of how the evaluation works: ``` 1. (GUESS Int) returns 10. (LET i == 10 IN i > x /\ x' = i) is TRUE, x' is assigned 10. diff --git a/docs/src/lang/examples/choose.py b/docs/src/lang/examples/choose.py index 4e918530b0..d0e71e5811 100755 --- a/docs/src/lang/examples/choose.py +++ b/docs/src/lang/examples/choose.py @@ -9,8 +9,8 @@ def choose(s): if __name__ == "__main__": - s = frozenset({ 1, 2, 3}) + s = frozenset({1, 2, 3}) print("CHOOSE {} = {}".format(s, choose(s))) - s2 = frozenset({ frozenset({1}), frozenset({2}), frozenset({3})}) + s2 = frozenset({frozenset({1}), frozenset({2}), frozenset({3})}) print("CHOOSE {} = {}".format(s2, choose(s2))) diff --git a/docs/src/lang/examples/subset.py b/docs/src/lang/examples/subset.py index 86cfa23f0d..7969e414d5 100755 --- a/docs/src/lang/examples/subset.py +++ b/docs/src/lang/examples/subset.py @@ -2,14 +2,15 @@ import functools + def subset(s): """construct the set of all subsets, that is, SUBSET S""" - sets = [] # sets to be constructed + sets = [] # sets to be constructed carrier = list(s) nelems = len(carrier) - bits = [0] * nelems # the binary vector encodes a set under construction - current = set() # a set under cosntruction - sets.append(frozenset(current)) # add {} + bits = [0] * nelems # the binary vector encodes a set under construction + current = set() # a set under construction + sets.append(frozenset(current)) # add {} while True: # turn leading 1s into 0s, add elements until 0 is found @@ -36,9 +37,10 @@ def subset(s): def union(s): """Flatten the set, that is, implement UNION S""" - + return functools.reduce(lambda x, y: x | y, s, frozenset()) + if __name__ == "__main__": carrier = frozenset({"a", "b", "c", "d"}) print("SUBSET {}".format(carrier)) @@ -48,4 +50,3 @@ def union(s): print("UNION (SUBSET {})".format(carrier)) print(union(powerset)) - diff --git a/docs/src/lang/functions.md b/docs/src/lang/functions.md index b13ff9ac4a..500c4a2e8b 100644 --- a/docs/src/lang/functions.md +++ b/docs/src/lang/functions.md @@ -8,12 +8,13 @@ Functions are probably the second most used TLA+ data structure after sets. TLA+ functions are not like functions in programming languages. In programming languages, functions contain code that calls other functions. Although it is technically possible to use functions when constructing a function in TLA+, -functions are more often used like tables or dictionaries: they are simple maps from a set of inputs to a set of outputs. For instance, in -[Two-phase commit], the function `rmState` stores the transaction state for -each process: +functions are more often used like tables or dictionaries: they are simple maps +from a set of inputs to a set of outputs. +For instance, in [Two-phase commit], the function `rmState` +stores the transaction state for each process: | argument | rmState[argument] | -| ---------- | ----------------- | +|------------|-------------------| | "process1" | "working" | | "process2" | "aborted" | | "process3" | "prepared" | @@ -62,7 +63,7 @@ Init == rmState = f ``` -In the above example we are not talking about one function that is somehow +In the above example, we are not talking about one function that is somehow initialized "by default". Rather, we say that `rmState` can be set to an arbitrary function that receives arguments from the set `{ "process1", "process2", "process3" }` and returns values that belong to the set `{ @@ -232,11 +233,10 @@ TypeError: unhashable type: 'dict' ``` -Of course, this is an implementation detail of Python and it has nothing to do +Of course, this is an implementation detail of Python, and it has nothing to do with TLA+. This example probably demonstrates that the built-in primitives of TLA+ are more powerful than the standard primitives of many programming -languages (see [this -discussion](https://github.com/informalsystems/apalache/discussions/551)). +languages (see [this discussion](https://github.com/informalsystems/apalache/discussions/551)). Alternatively, we could represent a TLA+ function in Python as a set of pairs `(key, value)` and implement TLA+ function operators over such a @@ -440,9 +440,9 @@ the type is `((<> -> b), a_1, ..., a_n) => b`. When `e` has a type incompatible with the type of `DOMAIN f`, Apalache flags a type error. When `e \notin DOMAIN f`, Apalache assigns some type-compatible value to `f[e]`, but does not report any error. This is not a bug in Apalache, -but a feature of the SMT encoding. Usually, an illegal access surfaces -somewhere, when checking a specification. If you want to detect an access -outside of the function domain, instrument your code with an additional state +but a feature of the SMT encoding. Usually, illegal access surfaces +somewhere when checking a specification. If you want to detect access +outside the function domain, instrument your code with an additional state variable. **Example in TLA+:** @@ -583,7 +583,7 @@ function domain is not known in advance. Additionally, given a Python dictionary `f`, we write `f.items()` to quickly iterate over the pairs of keys and values. Had we wanted to follow the TLA+ semantics more precisely, we would have to enumerate over the keys in the function domain and apply the function to -each key, in order to obtain the value that is associated with the key. This +each key, in order to obtain the value that is associated with the key. This code would be less efficient than the idiomatic Python code. ```python diff --git a/docs/src/lang/index.md b/docs/src/lang/index.md index a4b825a2e3..b5a24aecb3 100644 --- a/docs/src/lang/index.md +++ b/docs/src/lang/index.md @@ -7,7 +7,7 @@ an excellent starting point, if you are new to TLA+. For a comprehensive description and philosophy of the language, check [Specifying Systems] and the [TLA+ Home Page]. There are plenty of interesting talks on TLA+ at [TLA Channel] of Markus Kuppe. This manual completely ignores Pluscal -- a -higher-level language on top of TLA+. If you are interested in learning +higher-level language on top of TLA+. If you are interested in learning Pluscal, check [LearnTla.com] by Hillel Wayne. ## Contents diff --git a/docs/src/lang/integers.md b/docs/src/lang/integers.md index 75130c1ac1..86c4edb76e 100644 --- a/docs/src/lang/integers.md +++ b/docs/src/lang/integers.md @@ -21,16 +21,16 @@ EXTENDS Integers Although you can write arbitrary expressions over integers in TLA+, Apalache translates these expressions as constraints in -[SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories). Some -expressions are easier to solve than the others. For instance, the expression -`2 * x > 5` belongs to linear integer arithmetic, which can be solved more -efficiently than general arithmetic. For state variables `x` and `y`, the -expression `x * y > 5` belongs to non-linear integer arithmetic, which is -harder to solve than linear arithmetic. +[SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories). +Some expressions are easier to solve than others. +For instance, the expression `2 * x > 5` belongs to linear integer arithmetic, +which can be solved more efficiently than general arithmetic. +For state variables `x` and `y`, the expression `x * y > 5` belongs to +non-linear integer arithmetic, which is harder to solve than linear arithmetic. When your specification is using only integer literals, e.g., `1`, `2`, `42`, but none of the operators from the `Integers` module, the integers can -be avoided altogether. For instance, you can replace the integer constants +be avoided altogether. For instance, you can replace the integer constants with string constants, e.g., `"1"`, `"2"`, `"42"`. The string constants are translated as constants in the SMT constraints. This simple trick may bring your specification into a much simpler theory. Sometimes, this trick allows z3 @@ -291,17 +291,17 @@ b` according to the TLA+ definition is different from the integer division `a / b` in the programming languages (C, Java, Scala, Rust). See the table below._ -  C (clang 12) | Scala 2.13 | Rust | Python 3.8.6 | TLA+ (TLC) | SMT (z3 4.8.8) - -- | -- | -- | -- | -- | -- - 100 / 3 == 33 | 100 / 3 == 33 | 100 / 3 == 33 | 100 // 3 == 33 | (100 \div 3) = 33 | (assert (= 33 (div 100 3))) - -100 / 3 == -33 | -100 / 3 == -33 | -100 / 3 == -33 | -100 // 3 == -34 | ((-100) \div 3) = -34 | (assert (= (- 0 34) (div (- 0 100) 3))) - 100 / (-3) == -33 | 100 / (-3) == -33 | 100 / (-3) == -33 | 100 // (-3) == -34 | (100 \div (-3)) = -34 | (assert (= (- 0 33) (div 100 (- 0 3)))) - -100 / (-3) == 33 | -100 / (-3) == 33 | -100 / (-3) == 33 | -100 // (-3) == 33 | ((-100) \div (-3)) = 33 | (assert (= 34 (div (- 0 100) (- 0 3)))) +| C (clang 12) | Scala 2.13 | Rust | Python 3.8.6 | TLA+ (TLC) | SMT (z3 4.8.8) | +|-------------------|-------------------|-------------------|--------------------|-------------------------|-----------------------------------------| +| 100 / 3 == 33 | 100 / 3 == 33 | 100 / 3 == 33 | 100 // 3 == 33 | (100 \div 3) = 33 | (assert (= 33 (div 100 3))) | +| -100 / 3 == -33 | -100 / 3 == -33 | -100 / 3 == -33 | -100 // 3 == -34 | ((-100) \div 3) = -34 | (assert (= (- 0 34) (div (- 0 100) 3))) | +| 100 / (-3) == -33 | 100 / (-3) == -33 | 100 / (-3) == -33 | 100 // (-3) == -34 | (100 \div (-3)) = -34 | (assert (= (- 0 33) (div 100 (- 0 3)))) | +| -100 / (-3) == 33 | -100 / (-3) == 33 | -100 / (-3) == 33 | -100 // (-3) == 33 | ((-100) \div (-3)) = 33 | (assert (= 34 (div (- 0 100) (- 0 3)))) | _Unfortunately, [Specifying Systems][] only gives us the definition for the case -`b > 0` (that is, cases 1-2 in our description). The implementation in SMT and -TLC produce incompatible results for `b < 0`. See [issue #331 in -Apalache](https://github.com/informalsystems/apalache/issues/331)._ +`b > 0` (that is, cases 1–2 in our description). +The implementation in SMT and TLC produce incompatible results for `b < 0`. +See [issue #331 in Apalache](https://github.com/informalsystems/apalache/issues/331)._ **Determinism:** Deterministic. @@ -402,7 +402,7 @@ are evaluated to integer values, and these values fall into one of the several cases: 1. `b > 0`, - 1. `b = 0` and `a /= 0`. + 2. `b = 0` and `a /= 0`. **Apalache type:** `(Int, Int) => Int`. @@ -640,7 +640,7 @@ True ### Equality and inequality -The operators `a = b` and `a /= b` are core operators of TLA+ and thus they are +The operators `a = b` and `a /= b` are core operators of TLA+, and thus they are not defined in the module `Integers`, see [Logic](./logic.md). diff --git a/docs/src/lang/logic.md b/docs/src/lang/logic.md index dbf00e2023..de4a82eb14 100644 --- a/docs/src/lang/logic.md +++ b/docs/src/lang/logic.md @@ -105,7 +105,7 @@ True **Arguments:** At least three arguments: a variable name, a set, and an expression. As usual in TLA+, if the second argument is not a set, the result is -undefined.You can also use multiple variables and tuples, see **Advanced +undefined. You can also use multiple variables and tuples, see **Advanced syntax**. **Apalache type:** The formal type of this operator is a bit complex. @@ -230,7 +230,7 @@ values of `e_1` and `e_2`. Let `e_1` and `e_2` evaluate to the values ``` - In other cases, `e_1 = e_2` evaluates to `FALSE` if the values have comparable types. - - TLC and Apalache report an error, if the values have incomparable types. + - TLC and Apalache report an error if the values have incomparable types. **Determinism:** Deterministic, unless `e_1` has the form `x'`, which can be interpreted as an assignment to the variable `x'`. For the non-deterministic @@ -353,10 +353,9 @@ Hence, we give an informal description: **Effect:** This operator implements a black-box algorithm that _somehow_ picks one element from the set `{x \in S: P}`. Is it an algorithm? Yes! `CHOOSE x -\in S: P` is deterministic. When you give it two equal sets and two equivalent +\in S: P` is deterministic. When you give it two equal sets and two equivalent predicates, `CHOOSE` produces the same value. Formally, the only known property -of `CHOOSE` is as follows (which is slightly more general than what we wrote -above): +of `CHOOSE` is as follows (which is slightly more general than what we wrote above): ```tla {x \in S: P} = {y \in T: Q} => @@ -384,8 +383,7 @@ There are two common use cases, where the use of `CHOOSE` is well justified: For instance, see: [ReduceSet in FiniteSetsExt][]. In other cases, we believe that `CHOOSE` is bound to do [Program synthesis]. -So TLC does some form of synthesis by brute force when it has to evaluate -`CHOOSE`. +So TLC does some form of synthesis by brute force when it has to evaluate `CHOOSE`. **Determinism:** Deterministic. Very much deterministic. Don't try to model non-determinism with `CHOOSE`. For non-determinism, see: @@ -398,7 +396,7 @@ difference for the use cases 1 and 2. If you believe that this causes a problem in your specification, [open an issue...] **Errors:** Pure TLA+ does not restrict the operator arguments. TLC flags a -model checking error, if `S` is infinite. Apalache produces a static type +model checking error, if `S` is infinite. Apalache produces a static type error, if the type of elements of `S` is not compatible with the type of `x` as expected in `P`. diff --git a/docs/src/lang/option-types.md b/docs/src/lang/option-types.md index 68680709f9..506c9dae0f 100644 --- a/docs/src/lang/option-types.md +++ b/docs/src/lang/option-types.md @@ -25,9 +25,9 @@ The module defines a type alias `$option` as However, due to the current lack of support for polymorphic aliases, this alias has limited utility, and parametric option types can only be properly expressed -by writing out the full variant type `Some(a) | None(UNIT)`. Nonetheless, in this manual -page, we will sometimes write `$option(a)` as a shorthand for the type `Some(a) -| None(UNIT)`. +by writing out the full variant type `Some(a) | None(UNIT)`. +Nonetheless, in this manual page, we will sometimes write `$option(a)` +as a shorthand for the type `Some(a) | None(UNIT)`. In the context of TLA+, our encoding of option types is generalized over "partial operators", meaning operators which return a value of type @@ -115,8 +115,8 @@ TRUE = IsSome(Some(5)) /\ IsNone(None) **LaTeX notation:** same -**Apalache type:** `(Some(a) | None(UNIT), a => b, UNIT => b) => b`, for some types `a` -and `b`. +**Apalache type:** `(Some(a) | None(UNIT), a => b, UNIT => b) => b`, +for some types `a` and `b`. **Arguments:** @@ -161,8 +161,9 @@ Some(v)`, or else `caseNone(UNIT)`. This is a way of eliminating a value of type **LaTeX notation:** same -**Apalache type:** `(a => Some(b) | None(UNIT), Some(a) | None(UNIT)) => Some(b) | None(UNIT)`, for some types `a` -and `b`. +**Apalache type:** +`(a => Some(b) | None(UNIT), Some(a) | None(UNIT)) => Some(b) | None(UNIT)`, +for some types `a` and `b`. **Arguments:** @@ -202,7 +203,8 @@ LET s == OptionFlatMap(fail, r) IN - `o` an optional value - `default` is a default value to return -**Effect:** `OptionGetOrElse(o, default)` is `v` iff `o = Some(v)`, or else `default`. +**Effect:** +`OptionGetOrElse(o, default)` is `v` iff `o = Some(v)`, or else `default`. **Determinism:** Deterministic. @@ -331,7 +333,8 @@ IN - `f` is a function - `o` is an optional value -**Effect:** `OptionFunApp(f, o)` is `Some(f[v])` if `o = Some(v)` or else `None`. +**Effect:** +`OptionFunApp(f, o)` is `Some(f[v])` if `o = Some(v)` or else `None`. **Determinism:** Deterministic. @@ -363,8 +366,8 @@ LET f == [x \in 1..3 |-> x + 1] IN **Effect:** `OptionPartialFun(f, undef)` is a function mapping each value in `undef` to `None`, and each value `x \in (DOMAIN f \ undef)` to `Some(f[x])`. -This can be used to extend a total function into a "partial function" whose domain is -extended to include the values in 'undef'. +This can be used to extend a total function into a "partial function" +whose domain is extended to include the values in 'undef'. **Determinism:** Deterministic. diff --git a/docs/src/lang/records.md b/docs/src/lang/records.md index bbbba796b4..fbb4245e08 100644 --- a/docs/src/lang/records.md +++ b/docs/src/lang/records.md @@ -49,14 +49,14 @@ distinguishes between general functions and records. When Apalache processes a record constructor, it assigns the record type to the result. This record type carries the information about the names of the record fields and their types. Similarly, Apalache assigns the type of a set of records, when it processes a -record set constructor. See the [Apalache ADR002][] on types. +record set constructor. See the [Apalache ADR002][] on types. _Owing to the type information, records are translated into SMT more efficiently by Apalache than the general functions._ Every record is assigned a type in Apalache. For instance, the record `[name |-> "A", a |-> 3]` has the type `{ name: Str, a: Int }`. In contrast to -TLC, the type checker statically flags an error, if a spec is trying to access +TLC, the type checker statically flags an error if a spec is trying to access a non-existent field. Consider the following example: ```tla diff --git a/docs/src/lang/sequences.md b/docs/src/lang/sequences.md index bbedf8425d..1cfd7f8519 100644 --- a/docs/src/lang/sequences.md +++ b/docs/src/lang/sequences.md @@ -580,4 +580,4 @@ till the end of the universe. [Apalache ADR002]: https://github.com/informalsystems/apalache/blob/main/docs/src/adr/002adr-types.md [Cartesian product]: https://en.wikipedia.org/wiki/Cartesian_product [Overriding Seq in TLC]: https://groups.google.com/g/tlaplus/c/sYx_6e3YyWk/m/4CnwPqIVAgAJ -[HOWTO write type annotations]: ../../HOWTOs/howto-write-type-annotations.md +[HOWTO write type annotations]: ../HOWTOs/howto-write-type-annotations.md diff --git a/docs/src/lang/sets.md b/docs/src/lang/sets.md index 1c2398ed53..8f0942ab39 100644 --- a/docs/src/lang/sets.md +++ b/docs/src/lang/sets.md @@ -29,7 +29,7 @@ Note that the above set is equal to the sets `{ 1, 2, 3, 4 }` and `{ 4, 3, 2, 1 }`. They are actually the same set, though they are constructed by passing various number of arguments in different orders. -The most basic set operation is the set membership that checks, whether a set +The most basic set operation is the set membership that checks whether a set contains a value: ```tla @@ -54,10 +54,10 @@ TLC restricts set elements to comparable values. See Section 14.7.2 of kinds of values in a single set, TLC would not complain about your sets: 1. Booleans, - 1. integers, - 1. strings, - 1. sets, - 1. functions, tuples, records, sequences. + 2. integers, + 3. strings, + 4. sets, + 5. functions, tuples, records, sequences. @@ -509,7 +509,7 @@ set() **Arguments:** At least three arguments: a mapping expression, a variable name (or a tuple of names, see **Advanced syntax**), - a set. Additional arguments are variables names and sets, interleaved. + a set. Additional arguments are variables' names and sets, interleaved. **Apalache type:** The formal type of this operator is a bit complex. Hence, we give an informal description for the one-argument case: @@ -526,8 +526,8 @@ For every element `e_1` of `S` and every element `e_2` of `T`: 1. Bind the element `e_1` to variable `x`, 2. Bind the element `e_2` to variable `y`, - 2. Compute the value of `e` under the binding `[x |-> e_1, y |-> e_2]`, - 3. Insert the value `e` into the set `M`. + 3. Compute the value of `e` under the binding `[x |-> e_1, y |-> e_2]`, + 4. Insert the value `e` into the set `M`. **Determinism:** Deterministic. @@ -681,7 +681,7 @@ the result is undefined. **Determinism:** Deterministic. -**Errors:** Pure TLA+ does not restrict the operator argument. TLC flags a +**Errors:** Pure TLA+ does not restrict the operator argument. TLC flags a model checking error, when it discovers that `S` is not a set, or when it is an infinite set. Apalache produces a static type error, if the type of `S` is different from a finite set. diff --git a/docs/src/lang/standard-operators.md b/docs/src/lang/standard-operators.md index 2f354bc95a..9fc1cfb5a1 100644 --- a/docs/src/lang/standard-operators.md +++ b/docs/src/lang/standard-operators.md @@ -2,22 +2,22 @@ In this document, we summarize the standard TLA+ operators in a form that is similar to manuals on programming languages. The purpose of this document is to -provide you with a quick reference, whenever you are looking at the [Summary of -TLA]. The [TLA+ Video -Course](http://lamport.azurewebsites.net/video/videos.html) by Leslie Lamport -is an excellent starting point, if you are new to TLA+. For a comprehensive -description and philosophy of the language, check [Specifying Systems] and the -[TLA+ Home Page]. You can find handy extensions of the standard library in -[Community Modules]. - -We explain the semantics of the operators under the lenses of the [Apalache -model checker]. Traditionally, the emphasis was put on the temporal operators -and action operators, as they build the foundation of TLA. We focus on the "+" -aspect of the language, which provides you with a language for writing a single -step by a state machine. This part of the language is absolutely necessary for -writing and reading system specifications. Moreover, we treat equally the -"core" operators of TLA+ and the "library" operators: This distinction is less -important to the language users than to the tool developers. +provide you with a quick reference, whenever you are looking at the [Summary of TLA]. +The [TLA+ Video Course](http://lamport.azurewebsites.net/video/videos.html) +by Leslie Lamport is an excellent starting point, if you are new to TLA+. +For a comprehensive description and philosophy of the language, +check [Specifying Systems] and the [TLA+ Home Page]. +You can find handy extensions of the standard library in [Community Modules]. + +We explain the semantics of the operators under the lenses of the [Apalache model checker]. +Traditionally, the emphasis was put on the temporal operators and action operators, +as they build the foundation of TLA. +We focus on the "+" aspect of the language, which provides you with a language +for writing a single step by a state machine. +This part of the language is absolutely necessary +for writing and reading system specifications. +Moreover, we treat equally the "core" operators of TLA+ and the "library" operators: +This distinction is less important to the language users than to the tool developers. In this document, we present the semantics of TLA+, as if it was executed on a computer that is equipped with an additional device that we call an _oracle_. @@ -25,7 +25,7 @@ Most of the TLA+ operators are understood as deterministic operators, so they can be executed on your computer. A few operators are non-deterministic, so they require the oracle to resolve non-determinism, see [Control Flow and Non-determinism]. This is one of the most important features that makes TLA+ -distinct from programming languages. Wherever possible, we complement the +distinct from programming languages. Wherever possible, we complement the English semantics with code in [Python](https://www.python.org/). Although our semantics are more restrictive than the denotational semantics in Chapter 16 of [Specifying Systems], they are very close to the treatment of TLA+ by the model diff --git a/docs/src/lang/tuples.md b/docs/src/lang/tuples.md index 87e8f8ca08..134a8efdeb 100644 --- a/docs/src/lang/tuples.md +++ b/docs/src/lang/tuples.md @@ -195,4 +195,4 @@ a_n>>, Int) => a_i`, for some types `a_1, ..., a_n`. [Paxos]: https://github.com/tlaplus/Examples/blob/master/specifications/Paxos/Paxos.tla [Apalache ADR002]: ../adr/002adr-types.md [Cartesian product]: https://en.wikipedia.org/wiki/Cartesian_product -[HOWTO write type annotations]: ../../HOWTOs/howto-write-type-annotations.md +[HOWTO write type annotations]: ../HOWTOs/howto-write-type-annotations.md diff --git a/docs/src/lang/user-operators.md b/docs/src/lang/user-operators.md index a253382f8c..8d574b825d 100644 --- a/docs/src/lang/user-operators.md +++ b/docs/src/lang/user-operators.md @@ -7,12 +7,13 @@ functions in programming languages. (Recall that [TLA+ functions](./functions.md) are more like dictionaries or hash maps, not functions in PL.) Then you realize that operators such as `Init` and `Next` are used as logic predicates. However, large specifications often contain operators -that are not predicates, but in fact are similar to pure functions in +that are not predicates, but in fact similar to pure functions in programming languages: They are computing values over the system state but pose no constraints over the system states. Recently, Leslie Lamport has extended the syntax of TLA+ operators in [TLA+ -version 2], which supports recursive operators and lambda operators. We explain why Apalache does not support those in [Recursive operators and functions]. +version 2], which supports recursive operators and lambda operators. +We explain why Apalache does not support those in [Recursive operators and functions]. The operator syntax that is described in [Specifying Systems] describes TLA+ version 1. This page summarizes the syntax of user-defined operators in @@ -21,9 +22,9 @@ versions 1 and 2. **Short digression**. The most important thing to understand about user-defined operators is that they are normally used inside `Init` and `Next`. While the operator `Init` describes the initial states, the operator `Next` describes a -single step of the system. That is, these two operators are describing the +single step of the system. That is, these two operators are describing the initial states and the possible transitions of the system, respectively. They -do not describe the whole system computation. Most of the time, we are writing +do not describe the whole system computation. Most of the time, we are writing *canonical specifications*, which are written in temporal logic as `Init /\ [][Next]_vars`. Actually, you do not have to understand temporal logic, in order to write canonical specifications. A canonical specification is saying: @@ -33,7 +34,7 @@ belongs to [Advanced topics]. After the digression, you should now see that user-defined operators in TLA+ are (normally) describing a single step of the system. Hence, they should be -terminating. That is why user operators are often understood as macros. The +terminating. That is why user operators are often understood as macros. The same applies to [Recursive operator definitions]. They have to terminate within a single system step. @@ -43,21 +44,21 @@ user-defined operators that you would probably find unexpected: 1. Some operators are used as predicates and some are used to compute values (*à la pure*). - 1. Operators may accept other operators as parameters. Such operators are + 2. Operators may accept other operators as parameters. Such operators are called [Higher-order operator definitions]. - 1. Although operators may be passed as parameters, they are not first-class + 3. Although operators may be passed as parameters, they are not first-class citizens in TLA+. For instance, an operator cannot be returned as a result of another operator. Nor can an operator be assigned to a variable (only the result of its application may be assigned to a variable). - 1. Operators do not support [Currying]. That is, you can only apply an operator + 4. Operators do not support [Currying]. That is, you can only apply an operator by providing values for all of its expected arguments. - 1. Operators can be nested. However, nested operators require a slightly + 5. Operators can be nested. However, nested operators require a slightly different syntax. They are defined with LET-IN definitions. -**Details about operators.** We go in detail about different kinds of operators +**Details about operators.** We go in detail about the different kinds of operators and recursive functions below: - [Top-level operator definitions] @@ -82,5 +83,5 @@ and recursive functions below: [Higher-order operator definitions]: ./user/higher-order-operators.md [Anonymous operator definitions]: ./user/lambdas.md [Local operator definitions]: ./user/local-operators.md -[Recursive operators and functions]: ./principles/recursive.md +[Recursive operators and functions]: ../apalache/principles/recursive.md diff --git a/docs/src/lang/user/higher-order-operators.md b/docs/src/lang/user/higher-order-operators.md index ab96938488..952869a0da 100644 --- a/docs/src/lang/user/higher-order-operators.md +++ b/docs/src/lang/user/higher-order-operators.md @@ -2,5 +2,5 @@ work in progress... -[[Back to user operators]](./user-operators.md) +[[Back to user operators]](../user-operators.md) diff --git a/docs/src/lang/user/lambdas.md b/docs/src/lang/user/lambdas.md index d9df4df7db..2f482839f8 100644 --- a/docs/src/lang/user/lambdas.md +++ b/docs/src/lang/user/lambdas.md @@ -2,5 +2,5 @@ work in progress... -[[Back to user operators]](./user-operators.md) +[[Back to user operators]](../user-operators.md) diff --git a/docs/src/lang/user/let-in.md b/docs/src/lang/user/let-in.md index f25147cf41..e60376c26f 100644 --- a/docs/src/lang/user/let-in.md +++ b/docs/src/lang/user/let-in.md @@ -2,5 +2,5 @@ work in progress... -[[Back to user operators]](./user-operators.md) +[[Back to user operators]](../user-operators.md) diff --git a/docs/src/lang/user/local-operators.md b/docs/src/lang/user/local-operators.md index bf05f172af..183b291b66 100644 --- a/docs/src/lang/user/local-operators.md +++ b/docs/src/lang/user/local-operators.md @@ -2,5 +2,5 @@ work in progress... -[[Back to user operators]](./user-operators.md) +[[Back to user operators]](../user-operators.md) diff --git a/docs/src/lang/user/top-level-operators.md b/docs/src/lang/user/top-level-operators.md index a49e84787f..c9b8649e65 100644 --- a/docs/src/lang/user/top-level-operators.md +++ b/docs/src/lang/user/top-level-operators.md @@ -1,6 +1,6 @@ # Top-level operator definitions -[[Back to user operators]](./user-operators.md) +[[Back to user operators]](../user-operators.md) ## Quick example @@ -108,7 +108,7 @@ In the above example, you see examples of four operator applications: Technically, there are more than four operator applications in our example. However, all other operators are the [standard -operators](./standard-operators.md). We do not focus on them here. +operators](../standard-operators.md). We do not focus on them here. **Note on the operator order.** As you can see, we are applying operators after they have been defined in a module. This is a general rule in TLA+: A name can @@ -170,7 +170,7 @@ in `B`: 1. Substitute the expression `A(e_1, ..., e_n)` in the definition of `B` with `body_of_uniq_A`. - 1. Substitute the names `uniq_p_1, ..., uniq_p_n` with the expressions `e_1, + 2. Substitute the names `uniq_p_1, ..., uniq_p_n` with the expressions `e_1, ..., e_n`, respectively. The above transformation is usually called [Beta reduction]. diff --git a/docs/src/lang/variants.md b/docs/src/lang/variants.md index 052492a69a..3bd83c871a 100644 --- a/docs/src/lang/variants.md +++ b/docs/src/lang/variants.md @@ -5,7 +5,9 @@ [Variants][] (also called *tagged unions* or *sum types*) are useful, when you want to combine values of different shapes in a single set or a sequence. -**Idiomatic tagged unions in untyped TLA+.** In untyped TLA+, one can construct sets, which contain records with different fields, where one filed is typically used as a disambiguation tag. +**Idiomatic tagged unions in untyped TLA+.** +In untyped TLA+, one can construct sets, which contain records with different fields, +where one filed is typically used as a disambiguation tag. For instance, we could create a set that contains two records of different shapes: ```tla @@ -335,9 +337,9 @@ type variable that captures other options in the variant type. **Effect:** The operator `VariantGetUnsafe` unconditionally returns some value that is compatible with the type of values tagged with `tagName`. If `variant` is tagged with `tagName`, the returned value is the value that was wrapped via -the `Variant` constructor. Otherwise, it is some arbitrary value of proper type. As such, -this operator does not guarantee that the retrieved value is always constructed -via `Variant`, unless the operator is used with the right tag. +the `Variant` constructor. Otherwise, it is some arbitrary value of a proper type. +As such, this operator does not guarantee that the retrieved value +is always constructed via `Variant`, unless the operator is used with the right tag. **Determinism:** Deterministic. diff --git a/docs/src/tutorials/entry-tutorial.md b/docs/src/tutorials/entry-tutorial.md index c704f66b8a..b33c9dae59 100644 --- a/docs/src/tutorials/entry-tutorial.md +++ b/docs/src/tutorials/entry-tutorial.md @@ -71,9 +71,10 @@ Binary Searches and Mergesorts are Broken][]: ``` As was found by Joshua Bloch, the addition in line 6 may throw -an out of bounds exception at line 7, due to an integer overflow. This is because `low` +an out-of-bounds exception at line 7, due to an integer overflow. This is because `low` and `high` are signed integers, with a maximum value of `2^31 - 1`. -However, the sum of two values, each smaller than `2^31-1`, may be greater than `2^31 -1`. If this is the case, `low + high` can wrap into a negative number. +However, the sum of two values, each smaller than `2^31-1`, may be greater than `2^31 -1`. +If this is the case, `low + high` can wrap into a negative number. This bug was [discussed](https://groups.google.com/g/tlaplus/c/msLltIcexF4/m/qnABiKJmDgAJ) @@ -122,7 +123,8 @@ definition: {{#include ../../../test/tla/bin-search/BinSearch0.tla:1:8}} ``` -This module does not yet specify any part of the binary search implementation. However, it contains a few important things: +This module does not yet specify any part of the binary search implementation. +However, it contains a few important things: - It imports constants and operators from three standard modules: `Integers`, `Sequences`, and `Apalache`. @@ -191,7 +193,7 @@ annotations in the comments: - `INPUT_KEY` has the type `Int`, that is, it is an integer. Recall that we wanted to specify signed and unsigned Java integers, which are -32 bit long. *TLA+ is not tuned towards any computer architecture.* Its integers +32 bits long. *TLA+ is not tuned towards any computer architecture.* Its integers are mathematical integers: always signed and arbitrarily large (unbounded). To model fixed bit-width integers, we introduce another constant `INT_WIDTH` of type `Int`: @@ -239,9 +241,10 @@ how we do it: {{#include ../../../test/tla/bin-search/BinSearch2.tla:32:38}} ``` -The variables `low` and `high` are called *state variables*. They define a state of our state machine. That is, they are never introduced and -never removed. Remember, TLA+ is not tuned towards any particular computer -architecture and thus it does not even have the notion of an execution stack. +The variables `low` and `high` are called *state variables*. They define a state of our state machine. +That is, they are never introduced and never removed. +Remember, TLA+ is not tuned towards any particular computer +architecture, and thus it does not even have the notion of an execution stack. You can think of `low` and `high` as being global variables. Yes, global variables are generally frowned upon in programming languages. However, when dealing with a specification, they are much easier to reason about than the execution stack. @@ -265,10 +268,10 @@ understand what it means for them to terminate? The variable `returnValue` will contain the result of the binary search, when the search terminates. Recall, there is no execution stack. Hence, we introduce the variable `returnValue` right away. The downside is that we have to do -book-keeping for this variable. +bookkeeping for this variable. **Initialize variables.** Having introduced the variables, we have to -initialize them. That is, we want to specify lines 2-3 of the Java code: +initialize them. That is, we want to specify lines 2–3 of the Java code: ```java 1: public static int binarySearch(int[] a, int key) { @@ -411,7 +414,7 @@ definitions from `BinSearch2.tla`. Since we are fixing the parameters with concrete values, `MC2_8.tla` looks very much like a unit test. It's a good start for debugging a few things, but since our program is entirely sequential, our specification is as good as a unit -test. Later in this tutorial we will show how to leverage Apalache to check +test. Later in this tutorial, we will show how to leverage Apalache to check properties for all possible inputs (up to some bound). Let us check `MC2_8.tla`: @@ -503,7 +506,7 @@ truth table for the Boolean connectives, which are no different from the Boolean logic in TLA+: | `A` | `B` | `~A` | `A \/ B` | `A /\ B` | `A => B` | -| ------- | ------- | ------- | -------- | -------- | -------- | +|---------|---------|---------|----------|----------|----------| | `FALSE` | `FALSE` | `TRUE` | `FALSE` | `FALSE` | `TRUE` | | `FALSE` | `TRUE` | `TRUE` | `TRUE` | `FALSE` | `TRUE` | | `TRUE` | `FALSE` | `FALSE` | `TRUE` | `FALSE` | `FALSE` | @@ -657,7 +660,7 @@ ConstInit == /\ INPUT_SEQ \in Seq(Int) ``` -This is straightforward definition. However, it does not work in Apalache: +This is a straightforward definition. However, it does not work in Apalache: ```sh $ apalache-mc check --cinit=ConstInit --inv=Postcondition MC5_8.tla @@ -693,7 +696,7 @@ In this new version, we use the Apalache operator `Gen` to: sequence is unrestricted, except its length is bounded with `MAX_INT`, which is exactly what we need in our case study. -The operator `Gen` introduces a data structure of proper type whose size is +The operator `Gen` introduces a data structure of a proper type whose size is bounded with the argument of `Gen`. For instance, the type of `INPUT_SEQ` is the sequence of integers, and thus `Gen(MAX_INT)` produces an unrestricted sequence of up to `MAX_INT` elements. This sequence is bound to the name @@ -877,8 +880,8 @@ Checker reports no error up to computation length 10 It took me 0 days 0 hours 0 min 19 sec ``` -Even if did not know precisely complexity of the binary search, we could -write a simpler property, which demonstrates progress of the search: +Even if we did not know the precise complexity of the binary search, we could +write a simpler property, which demonstrates the progress of the search: ```tla Progress == @@ -1135,7 +1138,7 @@ The final specifications can be found in and [MC10_8.tla](https://github.com/informalsystems/apalache/blob/main/test/tla/bin-search/MC10_8.tla). -In this tutorial we have shown how to: +In this tutorial, we have shown how to: - Specify the behavior of a sequential algorithm (binary search). - Specify invariants that check safety and termination. @@ -1147,7 +1150,7 @@ In this tutorial we have shown how to: We have written our specification for parameterized bit width. This lets us check the invariants relatively quickly and get fast feedback from the model checker. We chose a bit width of 8, a non-trivial value for which -Apalache terminates within reasonable time. Importantly, the specification +Apalache terminates within a reasonable time. Importantly, the specification for the bit width of 32 stays the same; we only have to change `INT_WIDTH`. Of course, Apalache reaches its limits when we set `INT_WIDTH` to 16 or 32. In these cases, it has to reason about all sequences of length up to 32,767 @@ -1163,7 +1166,7 @@ of a TLA+ specification, as we were writing it and checking it with Apalache. There are many different styles of writing TLA+ specifications. One of our goals was to demonstrate the incremental approach to specification writing. In fact, this approach is not very different from incremental development of -programs in the spirit of [Test-driven development][]. It took us 2-3 hours to +programs in the spirit of [Test-driven development][]. It took us 2 to 3 hours to iteratively develop a specification that is similar to the one demonstrated in this tutorial. @@ -1179,7 +1182,7 @@ or drop us a message on [Zulip chat]. [Apalache installation]: ../apalache/installation/index.md [Leslie Lamport]: https://lamport.azurewebsites.net/ [open an issue]: https://github.com/informalsystems/apalache/issues -[TLC Configuration Files]: ../apalache/parameters.html#tlc-configuration-file +[TLC Configuration Files]: ../apalache/parameters.md#tlc-configuration-file [Tutorial on Snowcat]: ./snowcat-tutorial.md [Nearly All Binary Searches and Mergesorts are Broken]: https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html [Binary search in ProB]: https://groups.google.com/g/tlaplus/c/msLltIcexF4/m/qnABiKJmDgAJ @@ -1193,9 +1196,9 @@ or drop us a message on [Zulip chat]. [Summary of TLA+]: https://lamport.azurewebsites.net/tla/summary-standalone.pdf [TLA+ Video Course]: http://lamport.azurewebsites.net/video/videos.html [Classical implication]: https://en.wikipedia.org/wiki/Material_conditional -[Assignments in Apalache]: ../apalache/assignments-in-depth.html -[ConstInit predicate]: ../apalache/parameters.html#constinit-predicate -[Value generators]: ../lang/apalache-operators.html#value-generators +[Assignments in Apalache]: ../apalache/assignments-in-depth.md +[ConstInit predicate]: ../apalache/parameters.md#constinit-predicate +[Value generators]: ../lang/apalache-operators.md#value-generators [QuickCheck]: https://en.wikipedia.org/wiki/QuickCheck [Arrays.java in OpenJDK]: https://github.com/openjdk/jdk/blob/d7f31d0d53bfec627edc83ceb75fc6202891e186/src/java.base/share/classes/java/util/Arrays.java#L1662-L1698 [Two's complement]: https://en.wikipedia.org/wiki/Two%27s_complement diff --git a/docs/src/tutorials/pluscal-tutorial.md b/docs/src/tutorials/pluscal-tutorial.md index b859d1e8ec..99f9746f37 100644 --- a/docs/src/tutorials/pluscal-tutorial.md +++ b/docs/src/tutorials/pluscal-tutorial.md @@ -81,7 +81,7 @@ introduce a new module called `BakeryTyped.tla` with the following contents: Due to the semantics of `INSTANCE`, the constants and variables declared in `BakeryTyped.tla` substitute the constants and variables of -`BakeryWoTlaps.tla`. By doing so we effectively introduce type annotations. +`BakeryWoTlaps.tla`. By doing so, we effectively introduce type annotations. Since we introduce a separate module, any changes in the PlusCal code do not affect our type annotations. @@ -247,7 +247,7 @@ The final specifications can be found in and [BakeryWoTlaps.tla](https://github.com/informalsystems/apalache/blob/main/test/tla/bakery-pluscal/BakeryWoTlaps.tla). -In this tutorial we have shown how to: +In this tutorial, we have shown how to: - Annotate a PlusCal spec with types by introducing an additional TLA+ module. - Check safety of Bakery for bounded executions by bounded model checking (for @@ -280,7 +280,7 @@ or drop us a message on [Zulip chat]. [TLA+ Cheatsheet in HTML]: https://mbt.informal.systems/docs/tla_basics_tutorials/tla+cheatsheet.html [Summary of TLA+]: https://lamport.azurewebsites.net/tla/summary-standalone.pdf [TLA+ Video Course]: http://lamport.azurewebsites.net/video/videos.html -[ConstInit predicate]: ../apalache/parameters.html#constinit-predicate +[ConstInit predicate]: ../apalache/parameters.md#constinit-predicate [Specifying Systems]: http://lamport.azurewebsites.net/tla/book.html?back-link=learning.html [TLC]: https://github.com/tlaplus/tlaplus/ [TLAPS]: https://tla.msr-inria.inria.fr/tlaps/content/Home.html diff --git a/docs/src/tutorials/snowcat-tutorial.md b/docs/src/tutorials/snowcat-tutorial.md index f6a4ce5ba6..b808831b72 100644 --- a/docs/src/tutorials/snowcat-tutorial.md +++ b/docs/src/tutorials/snowcat-tutorial.md @@ -4,8 +4,8 @@ **Revision:** August 24, 2022 -In this tutorial, we introduce the Snowcat :snowflake: :cat: type checker. We -give concrete steps on running the type checker and annotating a specification +In this tutorial, we introduce the Snowcat :snowflake: :cat: type checker. +We give concrete steps to running the type checker and annotating a specification with types. This tutorial uses Type System 1.2, which guarantees safe record access. To see @@ -124,7 +124,7 @@ Alternatively, we could use the multi-line comment: *) ``` -**Note**: Importantly, every type annotation must end with a semi-colon `;`. +**Note**: Importantly, every type annotation must end with a semicolon `;`. Let's run the type checker again: @@ -295,8 +295,8 @@ If you are experiencing a problem with Snowcat, feel free to [open an issue][] or drop us a message on [Zulip chat][]. -[ADR002]: ../adr/002adr-types.html -[ADR004]: ../adr/004adr-annotations.html +[ADR002]: ../adr/002adr-types.md +[ADR004]: ../adr/004adr-annotations.md [HOWTO on writing type annotations]: ../HOWTOs/howto-write-type-annotations.md [Migrating to Type System 1.2]: ../HOWTOs/howto-write-type-annotations.md#recipe9 [Apalache installation]: ../apalache/installation/index.md diff --git a/docs/src/tutorials/symbmc.md b/docs/src/tutorials/symbmc.md index 27b4234c35..0895b66e9a 100644 --- a/docs/src/tutorials/symbmc.md +++ b/docs/src/tutorials/symbmc.md @@ -3,26 +3,38 @@ This brief introduction to symbolic model checking discusses the following: 1. State-spaces and transition systems - 1. What is a symbolic state? - 1. What are symbolic traces? - 1. How do I interpret Apalache counterexamples? + 2. What is a symbolic state? + 3. What are symbolic traces? + 4. How do I interpret Apalache counterexamples? -A glossary of notation and definitions can be found [below](#notation-and-definitions) +A glossary of notations and definitions can be found [below](#notation-and-definitions) ## On state-spaces and transition systems -A TLA+ specification defines a triple \\((S,S_0,\to)\\), called a _transition system_. \\(S\\) is the _state space_, \\(S_0\\) is the set of initial states \\(\left(S_0 \subseteq S\right)\\), and \\(\to\\) is the transition relation, a subset of \\(S^2\\). +A TLA+ specification defines a triple \\((S,S_0,\to)\\), called a _transition system_. +\\(S\\) is the _state space_, \\(S_0\\) is the set of initial states \\(\left(S_0 \subseteq S\right)\\), +and \\(\to\\) is the transition relation, a subset of \\(S^2\\). ### State spaces -The structure of a single state depends on the number of variables a specification declares. For example, if a specification declares +The structure of a single state depends on the number of variables a specification declares. +For example, if a specification declares ```tla VARIABLE A1, A2, A3, ..., Ak ``` -then a _state_ is a mapping \\([A_1 \mapsto a_1, \dots, A_k \mapsto a_k]\\), where \\(a_i\\) represents the value of the variable `Ai`, for each \\(i = 1,\dots,k\\). Here, we represent TLA+ variable names as unique formal symbols, where, for example the TLA+ variable `A1` is represented by the formal symbol \\(A_1\\). By convention, we will use markdown-syntax to refer to objects in TLA+ specifications, and latex notation otherwise. -The state space \\(S\\) is then the set of all such mappings, i.e. the set of all possible combinations of values that variables may hold. -For brevity, whenever the specification defines exactly one variable, we will treat a state as a single value \\(a_1\\) instead of the mapping \\([A_1 \mapsto a_1]\\). - -In untyped TLA+, one can think of \\(S\\) as \\(U^{\\{A_1,\dots, A_k\\}}\\), that is, the set of all mappings, which assign a value in \\(U\\), the universe of all TLA+ values, to each symbol. This set is naturally isomorphic to \\(U^k\\). In typed TLA+, such as in Apalache, where variable declarations look like: +then a _state_ is a mapping \\([A_1 \mapsto a_1, \dots, A_k \mapsto a_k]\\), +where \\(a_i\\) represents the value of the variable `Ai`, for each \\(i = 1,\dots,k\\). +Here, we represent TLA+ variable names as unique formal symbols, +where, for example the TLA+ variable `A1` is represented by the formal symbol \\(A_1\\). +By convention, we will use markdown-syntax to refer to objects in TLA+ specifications, and latex notation otherwise. +The state space \\(S\\) is then the set of all such mappings, +i.e. the set of all possible combinations of values that variables may hold. +For brevity, whenever the specification defines exactly one variable, +we will treat a state as a single value \\(a_1\\) instead of the mapping \\([A_1 \mapsto a_1]\\). + +In untyped TLA+, one can think of \\(S\\) as \\(U^{\\{A_1,\dots, A_k\\}}\\), that is, +the set of all mappings, which assign a value in \\(U\\), the universe of all TLA+ values, to each symbol. +This set is naturally isomorphic to \\(U^k\\). +In typed TLA+, such as in Apalache, where variable declarations look like: ```tla VARIABLE \* @type: T1; @@ -32,7 +44,9 @@ VARIABLE Ak ``` - \\(S\\) is additionally restricted, such that for all \\(s \in S\\) each symbol \\(A_i\\) maps to a value \\(s(A_i) \in V_i\\), where \\(Vi \subset U\\) is the set of all values, which hold the type \\(T_i\\), for each \\(i = 1,\dots,k\\). +\\(S\\) is additionally restricted, such that for all \\(s \in S\\) +each symbol \\(A_i\\) maps to a value \\(s(A_i) \in V_i\\), where \\(Vi \subset U\\) +is the set of all values, which hold the type \\(T_i\\), for each \\(i = 1,\dots,k\\). For example, in the specification with ```tla VARIABLE @@ -42,18 +56,27 @@ VARIABLE A2 ``` -The state space is \\(\mathbb{B}^{\\{x,y\\}}\\) when considering types, since each variable can hold one of two boolean values. In the untyped setting, the state space is infinite, and contains states where, for example, \\(A1\\) maps to `[z \in 1..5 |-> "a"]` and \\(A2\\) maps to `CHOOSE p \in {}: TRUE`. +The state space is \\(\mathbb{B}^{\\{x,y\\}}\\) when considering types, +since each variable can hold one of two boolean values. +In the untyped setting, the state space is infinite, and contains states where, +for example, \\(A1\\) maps to `[z \in 1..5 |-> "a"]` and \\(A2\\) maps to `CHOOSE p \in {}: TRUE`. -As Apalache enforces a type system,the remainder of this document will assume the typed setting. +As Apalache enforces a type system, the remainder of this document will assume the typed setting. This assumption does not change any of the definitions. -We will also assume that every specification declares an initial-state predicate `Init`, a transition-predicate `Next` and an invariant `Inv` (if not specified, assumed to be `TRUE`). -For simplicity, we will also assume that the specification if free of constants, resp. that all of the constants have been initialized. +We will also assume that every specification declares an initial-state predicate `Init`, +a transition-predicate `Next` and an invariant `Inv` (if not specified, assumed to be `TRUE`). +For simplicity, we will also assume that the specification if free of constants, +resp. that all of the constants have been initialized. ### Initial states The second component, \\(S_0\\), the set of all initial states, is derived from \\(S\\) and `Init`. The initial state predicate is a Boolean formula, in which specification-variables appear as free logic variables. -The operator `Init` characterizes a predicate \\(P_{S_0} \in \mathbb{B}^S\\) in the following way: given a state \\(s \in S\\), the formula obtained by replacing all occurrences of variable names `Ai` in `Init` with the values \\(s(A_i)\\) is a Boolean formula with no free variables (in a well-typed, parseable specification), which evaluates to either `TRUE` or `FALSE`. We say \\(P_{S_0}(s)\\) is the evaluation of this formula. -By the subset-predicate equivalence, we identify the predicate \\(P_{S_0}\\) with a subset \\(S_0\\) of \\(S\\): \\(S_0 = \\{ s \in S\mid P_{S_0}(s) = TRUE \\}\\). +The operator `Init` characterizes a predicate \\(P_{S_0} \in \mathbb{B}^S\\) in the following way: +given a state \\(s \in S\\), the formula obtained by replacing all occurrences of variable names `Ai` in `Init` +with the values \\(s(A_i)\\) is a Boolean formula with no free variables (in a well-typed, parseable specification), +which evaluates to either `TRUE` or `FALSE`. We say \\(P_{S_0}(s)\\) is the evaluation of this formula. +By the subset-predicate equivalence, we identify the predicate \\(P_{S_0}\\) +with a subset \\(S_0\\) of \\(S\\): \\(S_0 = \\{ s \in S\mid P_{S_0}(s) = TRUE \\}\\). For example, given ```tla @@ -69,11 +92,16 @@ Init == x \in 3..5 /\ y = 2 we see that \\(S = \mathbb{Z}^{\\{x,y\\}}\\) and \\(S_0 = \\{ [x \mapsto 3, y \mapsto 2], [x \mapsto 4, y \mapsto 2], [x \mapsto 5, y \mapsto 2] \\}\\). ### Transitions -Similar to \\(S_0\\), \\(\to\\) is derived from \\(S\\) and `Next`. If \\(S_0\\) is a single-argument predicate \\(S_0 \in \mathbb{B}^S\\), then \\(\to\\) is a relation \\(\to \in \mathbb{B}^{S^2}\\). -\\(\to(s_1,s_2)\\) is the evaluation of the formula obtained by replacing all occurrences of variable names `Ai` in `Next` with the values \\(s_1(A_i)\\), and all occurrences of `Ai'` with \\(s_2(A_i)\\). +Similar to \\(S_0\\), \\(\to\\) is derived from \\(S\\) and `Next`. +If \\(S_0\\) is a single-argument predicate \\(S_0 \in \mathbb{B}^S\\), +then \\(\to\\) is a relation \\(\to \in \mathbb{B}^{S^2}\\). +\\(\to(s_1,s_2)\\) is the evaluation of the formula obtained by replacing all occurrences +of variable names `Ai` in `Next` with the values \\(s_1(A_i)\\), +and all occurrences of `Ai'` with \\(s_2(A_i)\\). By the same principle of subset-predicate equivalence, we can treat \\(\to\\) as a subset of \\(S^2\\). As mentioned in the notation section, it is generally more convenient to use the infix notation -\\(s_1 \to s_2\\) over \\(\to(s_1, s_2)\\). We say that a state \\(s_2\\) is a _successor_ of the state \\(s_1\\) if \\(s_1 \to s_2\\). +\\(s_1 \to s_2\\) over \\(\to(s_1, s_2)\\). We say that a state \\(s_2\\) +is a _successor_ of the state \\(s_1\\) if \\(s_1 \to s_2\\). For example, given ```tla @@ -88,10 +116,14 @@ Init == x \in 3..5 /\ y = 2 Next == x' \in { x, x + 1 } /\ UNCHANGED y ``` -One can deduce, for any state \\([x \mapsto a, y \mapsto b] \in S\\), that it has two successors: \\([x \mapsto a + 1, y \mapsto b]\\) and \\([x \mapsto a, y \mapsto b]\\) because the following relations hold \\([x \mapsto a, y \mapsto b] \to [x \mapsto a + 1, y \mapsto b]\\) and \\( [x \mapsto a, y \mapsto b] \to [x \mapsto a, y \mapsto b] \\). +One can deduce, for any state \\([x \mapsto a, y \mapsto b] \in S\\), that it has two successors: +\\([x \mapsto a + 1, y \mapsto b]\\) and \\([x \mapsto a, y \mapsto b]\\) +because the following relations hold \\([x \mapsto a, y \mapsto b] \to [x \mapsto a + 1, y \mapsto b]\\) +and \\( [x \mapsto a, y \mapsto b] \to [x \mapsto a, y \mapsto b] \\). Lastly, we define traces in the following way: -A _trace_ of length \\(k\\) is simply a sequence of states \\(s_0,\dots, s_k \in S\\), such that \\(s_0 \in S_0\\) and \\(s_i \to s_{i+1}\\) for all \\(i\in \\{0,\dots,k-1\\}\\). +A _trace_ of length \\(k\\) is simply a sequence of states \\(s_0,\dots, s_k \in S\\), +such that \\(s_0 \in S_0\\) and \\(s_i \to s_{i+1}\\) for all \\(i\in \\{0,\dots,k-1\\}\\). This definition naturally extends to _inifinite traces_. For example, the above specification admits the following traces of length 2 (among others): @@ -106,17 +138,20 @@ For example, the above specification admits the following traces of length 2 (am \\] ### Reachable states -Using the above definitions, we can define the set of states reachable in exactly \\(k\\)-steps, for \\(k \in \mathbb{N}\\), denoted by \\(R(k)\\). We define \\(R(0) = S_0\\) and for each \\(k \in \mathbb{N}\\), +Using the above definitions, we can define the set of states reachable in exactly \\(k\\)-steps, +for \\(k \in \mathbb{N}\\), denoted by \\(R(k)\\). We define \\(R(0) = S_0\\) and for each \\(k \in \mathbb{N}\\), \\[ R(k+1) := \\{ t \in S \mid \exists s \in R(k) \ .\ s \to t \\} \\] -Similarly, we can define the set of states reachable in _at most_ \\(k\\)-steps, denoted \\(r(k)\\), for \\(k \in \mathbb{N}\\) by +Similarly, we can define the set of states reachable in _at most_ \\(k\\)-steps, +denoted \\(r(k)\\), for \\(k \in \mathbb{N}\\) by \\[ r(k) := \bigcup_{i=0}^k R(i) \\] -Finally, we define the set of all _reachable_ states, \\(R\\), as the (infinite) union of all \\(R(k)\\), over \\(k \in \mathbb{N}\\): +Finally, we define the set of all _reachable_ states, \\(R\\), +as the (infinite) union of all \\(R(k)\\), over \\(k \in \mathbb{N}\\): \\[ R := \bigcup_{k \in \mathbb{N}} R(k) \\] @@ -155,8 +190,12 @@ and so on. We can express this compactly as: ### Finite diameters We say that a transition system has a _finite diameter_, if there exists a \\(k \in N\\), such that \\(R = r(k)\\). -If such an integer exists then the smallest integer \\(k\\), for which this holds true, is the _diameter_ of the transition system. -In other words, if the transition system \\((S,S_0,\to)\\) has a finite diameter of \\(k\\), any state that is reachable from a state in \\(S_0\\), is reachable in at most \\(k\\) transitions. The example above clearly does not have a finite diameter, since \\(R\\) is infinite, but \\(r(k)\\) is finite for each \\(k\\). +If such an integer exists then the smallest integer \\(k\\), for which this holds true, +is the _diameter_ of the transition system. +In other words, if the transition system \\((S,S_0,\to)\\) has a finite diameter of \\(k\\), +any state that is reachable from a state in \\(S_0\\) is reachable in at most \\(k\\) transitions. +The example above clearly does not have a finite diameter, since \\(R\\) is infinite, +but \\(r(k)\\) is finite for each \\(k\\). However, the spec ```tla @@ -170,22 +209,34 @@ Next == x' = (x + 1) % 7 ``` has a finite diameter (more specifically, a diameter of 6), because: - 1. \\(R = \\{0,1,\dots,6\\}\\) (the set of remainders modulo 7), since those are the only values `x'`, which is defined as a `% 7` expression, can take. - 1. for any \\(k = 0,\dots,5\\), it is the case that \\(r(k) = \\{0,\dots,k\\} \ne R\\), so the diameter is not in \\(\\{1,\dots,5\\}\\) - 1. for any \\(k \ge 6\\), \\(r(k) = r(6) = R\\) + 1. \\(R = \\{0,1,\dots,6\\}\\) (the set of remainders modulo 7), since those are the only values `x'`, + which is defined as a `% 7` expression, can take. + 2. for any \\(k = 0,\dots,5\\), it is the case that \\(r(k) = \\{0,\dots,k\\} \ne R\\), + so the diameter is not in \\(\\{1,\dots,5\\}\\) + 3. for any \\(k \ge 6\\), \\(r(k) = r(6) = R\\) ### Invariants -Much like `Init`, an invariant operator `Inv` defines a predicate. However, it is not, in general, the case that `Inv` defines a predicate over `S`. There are different cases we can consider, discussed in more detail [here](../apalache/principles/invariants.md). For the purposes of this document, we focus on _state invariants_, i.e. operators which use only unprimed variables and no temporal- or trace- operators. A state invariant operator `Inv` defines a predicate \\(I\\) over \\(S\\). -We say that the \\(I\\) is an _invariant_ in the transition system, if \\(R \subseteq I\\), that is, for every reachable state \\(s_r \in R\\), \\(I(s_r)\\) holds true. If \\(R \setminus I\\) is nonempty (i.e., there exists a state \\(s_r \in R\\), such that \\(\neg I(s_r)\\)), we refer to elements of \\(R \setminus I\\) as _witnesses_ to invariant violation. +Much like `Init`, an invariant operator `Inv` defines a predicate. +However, it is not, in general, the case that `Inv` defines a predicate over `S`. +There are different cases we can consider, discussed in more detail [here](../apalache/principles/invariants.md). +For the purposes of this document, we focus on _state invariants_, +i.e. operators which use only unprimed variables and no temporal- or trace- operators. +A state invariant operator `Inv` defines a predicate \\(I\\) over \\(S\\). +We say that the \\(I\\) is an _invariant_ in the transition system, if \\(R \subseteq I\\), that is, +for every reachable state \\(s_r \in R\\), \\(I(s_r)\\) holds true. +If \\(R \setminus I\\) is nonempty (i.e., there exists a state \\(s_r \in R\\), +such that \\(\neg I(s_r)\\)), we refer to elements of \\(R \setminus I\\) as _witnesses_ to invariant violation. ### Goals of model checking The goal of model checking is to determine whether or not \\(R \setminus I\\) contains a witness. The goal of bounded model checking is to determine, given a bound \\(k\\), whether or not \\(r(k) \setminus I\\) contains a witness. -In a transition system with a bounded diameter, one can use bounded model checking to solve the general model checking problem, since \\(R \setminus I\\) is equivalent to \\(r(k) \setminus I\\) for a sufficiently large \\(k\\). In general, if the system does not have a bounded diameter, failing to find a witness in \\(r(k) \setminus I\\) cannot be used to reason about the absence of witnesses in \\(R \setminus I\\)! +In a transition system with a bounded diameter, one can use bounded model checking +to solve the general model checking problem, since \\(R \setminus I\\) is equivalent to \\(r(k) \setminus I\\) for a sufficiently large \\(k\\). In general, if the system does not have a bounded diameter, failing to find a witness in \\(r(k) \setminus I\\) cannot be used to reason about the absence of witnesses in \\(R \setminus I\\)! ## Explicit-state model checking -The idea behind explicit-state model checking is to simply perform the following algorithm (in pseudocode, \\(\leftarrow\\) represents assignment): +The idea behind explicit-state model checking is to simply perform the following algorithm +(in pseudocode, \\(\leftarrow\\) represents assignment): Compute \\(S_0\\) and set \\(Visited \leftarrow \emptyset, ToVisit \leftarrow S_0\\) @@ -197,10 +248,12 @@ The idea behind explicit-state model checking is to simply perform the following ToVisit &\leftarrow (ToVisit \cup Successors(s)) \setminus Visited \end{align} - 1. If \\(ToVisit = \emptyset\\) terminate. \\(R = Visited\\) and \\(I\\) is an invariant. + 2. If \\(ToVisit = \emptyset\\) terminate. \\(R = Visited\\) and \\(I\\) is an invariant. While simple to describe, there are several limitations of this approach in practice. -The first limitation is the absence of a termination guarantee. More specifically, this algorithm terminates if and only if \\(R\\) is finite. For example: +The first limitation is the absence of a termination guarantee. +More specifically, this algorithm terminates if and only if \\(R\\) is finite. +For example: ```tla VARIABLE \* @type: Int; @@ -210,8 +263,9 @@ Init == x = 0 Next == x' = x + 1 ``` -defines a states space, for which \\(R = \mathbb{N}\\), so the above algorithm never terminates. -Further, in the general case it is difficult or impossible to compute \\(S_0\\) or the set \\(Successors(s)\\) defined in the algorithm. +defines a state space, for which \\(R = \mathbb{N}\\), so the above algorithm never terminates. +Further, in the general case, it is difficult or impossible to compute \\(S_0\\) +or the set \\(Successors(s)\\) defined in the algorithm. As an example, consider the following specification: ```tla VARIABLE x @@ -226,11 +280,17 @@ ReachesOne(a) == \E n \in Nat: kIter(a,n) = 1 Init == x \in { n \in Nat: ~ReachesOne(n) } ``` -The specification encodes the [Collatz conjecture](https://en.wikipedia.org/wiki/Collatz_conjecture), so computing \\(S_0\\) is equivalent to proving or disproving the conjecture, which remains an open problem at present. -It is therefore unreasonable to expect any model checker to be able to accept such input, despite the fact that the condition is easily describable in first-order logic. - -A similar problem can occur in computing \\(Successors(s)\\); the relation between variables `Ai` (\\(s(A_i)\\)) and `Ai'` (\\(s_2(A_i)\\)) may be given by means of an implicit function or uncomputable expression. -Therefore, most tools impose the following constraints, which make computing \\(S_0\\) and \\(Successors(s)\\) possible without any sort of specialized solver: +The specification encodes the [Collatz conjecture](https://en.wikipedia.org/wiki/Collatz_conjecture), +so computing \\(S_0\\) is equivalent to proving or disproving the conjecture, +which remains an open problem at present. +It is therefore unreasonable to expect any model checker to be able to accept such an input, +despite the fact that the condition is easily describable in first-order logic. + +A similar problem can occur in computing \\(Successors(s)\\); +the relation between variables `Ai` (\\(s(A_i)\\)) and `Ai'` (\\(s_2(A_i)\\)) +may be given by means of an implicit function or uncomputable expression. +Therefore, most tools impose the following constraints, +which make computing \\(S_0\\) and \\(Successors(s)\\) possible without any sort of specialized solver: The specification must have the shape ```tla VARIABLE A1,...,Ak @@ -247,7 +307,8 @@ Next == /\ CondN(A1,...,Ak) /\ Ak' \in Gk(A1,...,Ak, A1',...,A{k-1}') ``` -or some equivalent form, in which variable values in a state can be iteratively computed, one at a time, by means of an explicit formula, which uses only variables computed so far. +or some equivalent form, in which variable values in a state can be iteratively computed, +one at a time, by means of an explicit formula, which uses only variables computed so far. For instance, ```tla VARIABLE x,y @@ -263,9 +324,12 @@ Next == \/ /\ x > 5 /\ x' = x + y' ``` -allows one to compute both \\(S_0\\) as well as \\(Successors(s)\\), for any \\(s\\), by traversing the conjunctions in the syntax-imposed order. +allows one to compute both \\(S_0\\) as well as \\(Successors(s)\\), for any \\(s\\), +by traversing the conjunctions in the syntax-imposed order. -However, even in a situation where states are computable, and \\(R\\) is finite, the size of \\(R\\) itself might be an issue in practice. We can create very compact specifications with large state-space sizes: +However, even in a situation where states are computable, and \\(R\\) is finite, +the size of \\(R\\) itself might be an issue in practice. +We can create very compact specifications with large state-space sizes: ```tla VARIABLE A1,...,Ak @@ -285,31 +349,43 @@ Next == \/ /\ A1' = (A1 + 1) % C This specification will have \\(C^k\\) distinct states, despite its rather simplistic behavior. ## Explicit-state bounded model checking -Adapting the general explicit-state approach to bounded model checking is trivial, and therefore not particularly interesting. Assume a bound \\(k \in \mathbb{N}\\) on the length of the traces considered. +Adapting the general explicit-state approach to bounded model checking is trivial, +and therefore not particularly interesting. +Assume a bound \\(k \in \mathbb{N}\\) on the length of the traces considered. Compute \\(S_0\\) and set \\(Visited \leftarrow \emptyset, ToVisit \leftarrow \\{ (s,0)\mid s \in S_0 \\}\\) 1. While \\(ToVisit \ne \emptyset\\), pick some \\((s,j) \in ToVisit\\): 1. If \\(\neg I(s)\\) then terminate, since a witness is found. - 1. If \\(I(s)\\) then: + 2. If \\(I(s)\\) then: \begin{align} Visited &\leftarrow Visited \cup \\{(s,j)\\} \\\\ ToVisit &\leftarrow (ToVisit \cup T) \setminus Visited \end{align} where \\( T \\) equals \\(\\{(t,j+1)\mid t \in Successors(s)\\}\\) if \\(j < k\\) and \\(\emptyset\\) otherwise - 1. If \\(ToVisit = \emptyset\\) terminate. \\(r(k) = \\{v \mid \exists j \in \mathbb{N} \ .\ (v,j) \in Visited\\}\\) and \\(I\\) holds in all states reachable in at most \\(k\\) steps. + 2. If \\(ToVisit = \emptyset\\) terminate. \\(r(k) = \\{v \mid \exists j \in \mathbb{N} \ .\ (v,j) \in Visited\\}\\) + and \\(I\\) holds in all states reachable in at most \\(k\\) steps. -A real implementation would, for efficiency reasons, avoid entering the same state via traces of different length, but the basic idea would remain unchanged. -Bounding the execution length guarantees termination of the algorithm if \\(S_0\\) is finite and each state has finitely many successors w.r.t. \\(\to\\), even if the state space is unbounded in general. -However, this comes at a cost of guarantees: while bounded model checking might still find an invariant violation if it can occur within the bound \\(k\\), it will fail if the shortest possible trace, on which the invariant is violated has a length greater than \\(k\\). +A real implementation would, for efficiency reasons, avoid entering the same state via traces of different length, +but the basic idea would remain unchanged. +Bounding the execution length guarantees termination of the algorithm +if \\(S_0\\) is finite and each state has finitely many successors w.r.t. \\(\to\\), +even if the state space is unbounded in general. +However, this comes at a cost of guarantees: while bounded model checking might still find an invariant violation +if it can occur within the bound \\(k\\), it will fail if the shortest possible trace, +on which the invariant is violated has a length greater than \\(k\\). -If the system has a finite diameter, bounded model checking is equivalent to model checking, as long as \\(k\\) exceeds the diameter. +If the system has a finite diameter, bounded model checking is equivalent to model checking, +as long as \\(k\\) exceeds the diameter. ## Symbolic bounded model checking -For a given \\(k \in \mathbb{N}\\), we want to find a way to determine if \\(r(k) \setminus I\\) is empty, without testing every single state in \\(r(k)\\) like in the explicit-state approach. +For a given \\(k \in \mathbb{N}\\), we want to find a way to determine if \\(r(k) \setminus I\\) is empty, +without testing every single state in \\(r(k)\\) like in the explicit-state approach. -The key insight behind symbolic model checking is the following: it is often the case that the size of the reachable state space is large, not because of the properties of the specification, but simply because of the constants or sets involved. +The key insight behind symbolic model checking is the following: +it is often the case that the size of the reachable state space is large, +not because of the properties of the specification, but simply because of the constants or sets involved. Consider the example: ```tla @@ -324,17 +400,22 @@ Next2 == x' \in 1..999999999999 Inv == x < 5 ``` -The sets of reachable states defined by each `Next` have sizes proportional to the upper bounds of the ranges used. However, to find a violation of the invariant, one merely needs to identify a state \\(s\\) in which, for example, \\(s(x) = 7\\), which belongs to both sets. -It is not necessary, or efficient, to loop over elements in the range and test each one against `Inv` to find a violation. Depending on the logic fragment `Inv` belongs to, there usually exist strategies for finding such violations much faster. +The sets of reachable states defined by each `Next` have sizes proportional to the upper bounds of the ranges used. +However, to find a violation of the invariant, one merely needs to identify a state \\(s\\) +in which, for example, \\(s(x) = 7\\), which belongs to both sets. +It is not necessary, or efficient, to loop over elements in the range and test each one against `Inv` to find a violation. +Depending on the logic fragment `Inv` belongs to, there usually exist strategies for finding such violations much faster. From this perspective, if, for some \\(k\\), we succeeded in finding a predicate \\(P\\) over \\(S\\), such that: - \\(P\\) belongs to a logic fragment, for which optimizations exist - \\(P\\) has a witness iff a state reachable in at most \\(k\\) steps violates \\(I\\): \\(\left(\exists s \in S \ .\ P(s)\right) \iff r(k) \setminus I \ne \emptyset\\) -we can use specialized techniques within the logical fragment to evaluate \\(P\\) and find a witness to the violation of \\(I\\), or else conclude that \\(r(k) \subseteq I\\). +we can use specialized techniques within the logical fragment to evaluate \\(P\\) +and find a witness to the violation of \\(I\\), or else conclude that \\(r(k) \subseteq I\\). -To do this, it is sufficient to find a predicate \\(P_R^l\\) encoding \\(R(l)\\), for each \\(l \in \\{0,\dots,k\\}\\), since: +To do this, it is sufficient to find a predicate \\(P_R^l\\) encoding \\(R(l)\\), +for each \\(l \in \\{0,\dots,k\\}\\), since: \begin{align} s \in r(l) \iff& \lor s \in R(0) \\\\ &\lor s \in R(1) \\\\ @@ -369,28 +450,44 @@ Then, the formula describing invariant violation in exactly \\(k\\) steps, \\(\e \exists s_0,\dots,s_k \in S \ .\ P_{S_0}(s_0) \land \neg I(s_k) \land \bigwedge_{i=0}^{k-1} s_i \to s_{i+1} \\] -The challenge in designing a symbolic model checker is determining, given TLA+ operators `Init`, `Next` and `Inv`, the encodings of \\(P_{S_0}, \to, I\\) as formulas in logcis supported by external solvers, for example SMT. +The challenge in designing a symbolic model checker is determining, given TLA+ operators `Init`, `Next` and `Inv`, +the encodings of \\(P_{S_0}, \to, I\\) as formulas in logics supported by external solvers, for example SMT. ### Symbolic states -In an explicit approach, the basic unit of computation is a single state \\(s \in S\\). However, as demonstrated above, symbolic approaches deal with logical formulas. Recall that a state formula, such as `Init` is actually a predicate over \\(S\\), and a predicate is equivalent to a subset of \\(S\\). +In an explicit approach, the basic unit of computation is a single state \\(s \in S\\). +However, as demonstrated above, symbolic approaches deal with logical formulas. +Recall that a state formula, such as `Init` is actually a predicate over \\(S\\), +and a predicate is equivalent to a subset of \\(S\\). Predicates tend to not distinguish between certain concrete states. For instance, the formula -\\(x < 3\\) is equally false for both \\(x = 7\\) and \\(x = 70000000\\). It is useful to characterize all of the states, in which a predicate evaluates to the same value. This is because we will define symbolic states in terms of equivalence relations: -A predicate \\(P\\) over \\(S\\) naturally defines an equivalence relation \\(\circledcirc_P\\): For \\(a,b \in S\\), we say that \\(a \circledcirc_P b\\) holds if \\(P(a) = P(b)\\). +\\(x < 3\\) is equally false for both \\(x = 7\\) and \\(x = 70000000\\). +It is useful to characterize all of the states, in which a predicate evaluates to the same value. +This is because we will define symbolic states in terms of equivalence relations: +A predicate \\(P\\) over \\(S\\) naturally defines an equivalence relation \\(\circledcirc_P\\): +For \\(a,b \in S\\), we say that \\(a \circledcirc_P b\\) holds if \\(P(a) = P(b)\\). Proving that this relation satisfies the criteria for an equivalence relation is left as an exercise to the reader. This equivalence relation has only two distinct equivalence classes, since \\(P(s)\\) can only be `TRUE` or `FALSE`. We can therefore think of predicates in the following way: Each predicate \\(P\\) slices the set \\(S\\) into two disjoint subsets, i.e. the equivalence classes of \\(\circledcirc_P\\). -An equivalent formulation of the above is saying that each predicate \\(P\\) defines a quotient space \\(S / \circledcirc_P\\), of size \\(2\\). +An equivalent formulation of the above is saying that each predicate \\(P\\) +defines a quotient space \\(S / \circledcirc_P\\), of size \\(2\\). -Recall that we have expressed the set of states \\(R(l)\\) with the predicate \\(P_R^l\\), for each \\(l \in \\{0,\dots,k\\}\\). By the above, \\(P_R^l\\) defines an equivalence relation \\(\circledcirc_{P_R^l}\\) on \\(S\\), and consequently, two equivalence classes. +Recall that we have expressed the set of states \\(R(l)\\) with the predicate \\(P_R^l\\), +for each \\(l \in \\{0,\dots,k\\}\\). By the above, \\(P_R^l\\) defines an equivalence relation +\\(\circledcirc_{P_R^l}\\) on \\(S\\), and consequently, two equivalence classes. For notational clarity, we use \\(\circledcirc^l\\) instead of \\(\circledcirc_{P_R^l}\\). Each concrete state \\(s \in S\\) belongs to exactly one equivalence class \\(\lbrack s \rbrack_{ \circledcirc^l} \in S / \circledcirc^l\\). -The states in \\(R(l)\\) correspond to the equivalence class in which \\(P_R^l\\) holds true (i.e. \\(s \in R(l) \iff \lbrack s \rbrack_{\circledcirc^l} = \\{t \in S \mid P_R^l(t) = TRUE\\}\\)), and the ones in \\(S \setminus R(l)\\) correspond to the equivalence class in which \\(P_R^l\\) is false (i.e. \\(s \notin R(l) \iff \lbrack s \rbrack_{\circledcirc^l} = \\{t \in S \mid P_R^l(t) = FALSE\\}\\)). +The states in \\(R(l)\\) correspond to the equivalence class in which \\(P_R^l\\) +holds true (i.e. \\(s \in R(l) \iff \lbrack s \rbrack_{\circledcirc^l} = \\{t \in S \mid P_R^l(t) = TRUE\\}\\)), +and the ones in \\(S \setminus R(l)\\) correspond to the equivalence class in which \\(P_R^l\\) is false +(i.e. \\(s \notin R(l) \iff \lbrack s \rbrack_{\circledcirc^l} = \\{t \in S \mid P_R^l(t) = FALSE\\}\\)). -We define symbolic states in the following way: Given a predicate \\(P\\) over \\(S\\), a _symbolic state_ with respect to \\(P\\) is an element of \\(S / \circledcirc_P\\), where \\(\circledcirc_P\\) is the equivalence relation derived from \\(P\\) (i.e. \\(a \circledcirc_P b \iff P(a) = P(b)\\)). -Recall the subset-predicate equivalence: in this context, a symbolic state, w.r.t. \\(P\\) is equivalent to a predicate, specifically, either \\(P\\) or \\(\neg P\\). +We define symbolic states in the following way: Given a predicate \\(P\\) over \\(S\\), +a _symbolic state_ with respect to \\(P\\) is an element of \\(S / \circledcirc_P\\), +where \\(\circledcirc_P\\) is the equivalence relation derived from \\(P\\) (i.e. \\(a \circledcirc_P b \iff P(a) = P(b)\\)). +Recall the subset-predicate equivalence: in this context, a symbolic state, w.r.t. \\(P\\) +is equivalent to a predicate, specifically, either \\(P\\) or \\(\neg P\\). For example, given ```tla @@ -423,21 +520,33 @@ and \\{ [x \mapsto a, y \mapsto b] \mid a,b\in \mathbb{Z} \land ( a \ne 1 \lor b \ne 1 )\\} \\] -If we only care about characterizing invariant violations, the above techniques are sufficient. However, specification invariants are often composed of multiple smaller, independent invariants. -For feedback purposes, it can be beneficial to identify, whenever an invariant violation occurs, the precise sub-invariant that is the cause. +If we only care about characterizing invariant violations, the above techniques are sufficient. +However, specification invariants are often composed of multiple smaller, independent invariants. +For feedback purposes, it can be beneficial to identify, whenever an invariant violation occurs, +the precise sub-invariant that is the cause. Suppose we are given an invariant \\(s(x) > 0 \land s(y) > 0\\). -The information whether a reachable state has just \\(s(x) \le 0\\), just \\(s(y) \le 0\\), or both can help determine problems at the design level. - -More generally: often, a predicate \\(P\\) is constructed as a conjunction of other predicates, e.g. \\(P(s) \iff p_1(s) \land \dots \land p_m(s)\\). A violation of \\(P\\) means a violation of (at least) one of \\(p_1,\dots,p_m\\), but knowing which one enables additional analysis. - -A collection of predicates \\(p_1,\dots,p_m\\) over \\(S\\) define an equivalence relation \\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\)in the following way: -For \\(a,b \in S\\), we say that \\( a \circledcirc\lbrack p_1,\dots,p_m\rbrack\ b\\) holds if \\(p_1(a) = p_1(b) \land \dots \land p_m(a) = p_m(b)\\). Clearly, \\(\circledcirc\lbrack p_1\rbrack = \circledcirc_{p_1}\\). - -Since a predicate can only evaluate to one of two values, there exist only two equivalence classes for \\(\circledcirc_P\\), i.e. only two symbolic states w.r.t. \\(P\\): one is the set of all states for which \\(P\\) is `TRUE`, and the other is the set of all values for which \\(P\\) is `FALSE`. +The information whether a reachable state has just \\(s(x) \le 0\\), just \\(s(y) \le 0\\), +or both can help determine problems at the design level. + +More generally: often, a predicate \\(P\\) is constructed as a conjunction of other predicates, +e.g. \\(P(s) \iff p_1(s) \land \dots \land p_m(s)\\). A violation of \\(P\\) means a violation of (at least) one of +\\(p_1,\dots,p_m\\), but knowing which one enables additional analysis. + +A collection of predicates \\(p_1,\dots,p_m\\) over \\(S\\) define an equivalence relation +\\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\)in the following way: +For \\(a,b \in S\\), we say that \\( a \circledcirc\lbrack p_1,\dots,p_m\rbrack\ b\\) holds if +\\(p_1(a) = p_1(b) \land \dots \land p_m(a) = p_m(b)\\). +Clearly, \\(\circledcirc\lbrack p_1\rbrack = \circledcirc_{p_1}\\). + +Since a predicate can only evaluate to one of two values, there exist only two equivalence classes for +\\(\circledcirc_P\\), i.e. only two symbolic states w.r.t. \\(P\\): one is the set of all states for which +\\(P\\) is `TRUE`, and the other is the set of all values for which \\(P\\) is `FALSE`. In this sense, \\(S / \circledcirc_P\\) is isomorphic to the set \\(\mathbb{B}\\). -In the case of \\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\), there are \\(2^m\\) different \\(m\\)-tuples with values from \\(\mathbb{B}\\), so \\(S / \circledcirc\lbrack p_1,\dots,p_m\rbrack\\) is isomorphic to \\(\mathbb{B}^m\\) . +In the case of \\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\), there are \\(2^m\\) different \\(m\\)-tuples with values from +\\(\mathbb{B}\\), so \\(S / \circledcirc\lbrack p_1,\dots,p_m\rbrack\\) is isomorphic to \\(\mathbb{B}^m\\) . -What is the relation between \\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\) and \\(\circledcirc_P\\), where \\(P(s) = p_1(s) \land \dots \land p_m(s)\\)? +What is the relation between \\(\circledcirc\lbrack p_1,\dots,p_m\rbrack\\) and \\(\circledcirc_P\\), +where \\(P(s) = p_1(s) \land \dots \land p_m(s)\\)? Clearly, \\(P(s) = TRUE \iff p_1(s) = \dots = p_m(s) = TRUE\\). Consequently, there is one equivalence class in \\(S / \circledcirc_P\\), that is equal to \\[ @@ -448,9 +557,12 @@ and one equivalence class in \\(S / \circledcirc\lbrack p_1,\dots,p_m\rbrack\\) C_2 = \\{ s \in S \mid p_1(s) = TRUE \land \dots \land p_m(s) = TRUE \\} \\] -They are one and the same, i.e. \\(C_1 = C_2\\). The difference is, that splitting \\(P\\) into \\(m\\) components \\(p_1,\dots,p_m\\) splits the other (unique) equivalence class \\(C \in \\{ c \in S / \circledcirc_P \mid c \ne C_1 \\}\\) into \\(2^m - 1\\) parts, which are the equivalence classes in \\(\\{ c \in S/\circledcirc\lbrack p_1,\dots,p_m\rbrack \mid c \ne C_2 \\}\\). +They are one and the same, i.e. \\(C_1 = C_2\\). The difference is, that splitting \\(P\\) into \\(m\\) components +\\(p_1,\dots,p_m\\) splits the other (unique) equivalence class \\(C \in \\{ c \in S / \circledcirc_P \mid c \ne C_1 \\}\\) +into \\(2^m - 1\\) parts, which are the equivalence classes in \\(\\{ c \in S/\circledcirc\lbrack p_1,\dots,p_m\rbrack \mid c \ne C_2 \\}\\). -Consequently, we can also define symbolic states with respect to a set of predicates \\(p_1,\dots,p_m\\), implicitly conjoined, as elements of \\(S / \circledcirc\lbrack p_1,\dots,p_m\rbrack\\). +Consequently, we can also define symbolic states with respect to a set of predicates \\(p_1,\dots,p_m\\), +implicitly conjoined, as elements of \\(S / \circledcirc\lbrack p_1,\dots,p_m\rbrack\\). Similarly, by the subset-predicate equivalence, a symbolic state, w.r.t. \\(p_1,\dots,p_m\\) can be viewed as one of \begin{align} p_1(s) \land p_2(s) \land \dots \land p_m(s) \qquad&= P(s) \\\\ @@ -461,8 +573,13 @@ p_1(s) \land \neg p_2(s) \land \dots \land p_m(s) \qquad& | \\\ \neg p_1(s) \land \neg p_2(s) \land \dots \land \neg p_{m-1}(s) \land \neg p_m(s) \qquad& | \\\\ \end{align} -For example, take \\(p_1(s) = s \in R(k)\\) and \\(p_2(s) = \neg I(s)\\). With respect to \\(p_1(s) \land p_2(s)\\), there are two symbolic states: one corresponds to the set of all states which are both reachable and in which the invariant is violated, while the other corresponds to the set of all states, which are either not reachable, or in which the invariant holds. -Conversely, with respect to \\(p_1,p_2\\), there are four symbolic states: one corresponds to states which are both reachable and violate the invariant, one corresponds to states which are reachable, but which do not violate the invariant, one corresponds to states which are not reachable, but violate the invariant and the last one corresponds to states which are neither reachable, nor violate the invariant. +For example, take \\(p_1(s) = s \in R(k)\\) and \\(p_2(s) = \neg I(s)\\). With respect to \\(p_1(s) \land p_2(s)\\), +there are two symbolic states: one corresponds to the set of all states which are both reachable and in which the invariant is violated, +while the other corresponds to the set of all states, which are either not reachable, or in which the invariant holds. +Conversely, with respect to \\(p_1,p_2\\), there are four symbolic states: one corresponds to states +which are both reachable and violate the invariant, one corresponds to states which are reachable, +but which do not violate the invariant, one corresponds to states which are not reachable, +but violate the invariant and the last one corresponds to states which are neither reachable, nor violate the invariant. ### Symbolic traces Having defined symbolic states, what is then the meaning of a symbolic trace? @@ -474,10 +591,12 @@ C_0 \in S / \circledcirc^0 \land \dots \land C_k \in S / \circledcirc^k \\] and, for each \\(i = 0,\dots,k\\), it is the case that \\(C_i = \\{ s \in S \mid P_R^i(s) = TRUE\\}\\). -In other words, a symbolic trace is the unique sequence of symbolic states, which correspond to the set of explicit states evaluating to `TRUE` under each of \\(P_R^0,\dots,P_R^k\\) respectively. +In other words, a symbolic trace is the unique sequence of symbolic states, which correspond to the set +of explicit states evaluating to `TRUE` under each of \\(P_R^0,\dots,P_R^k\\) respectively. Recall that \\(P_R^{i+1}(s_{i+1})\\) was defined as \\(\exists s_i \in S \ .\ P_R^i(s_i) \land s_i \to s_{i+1}\\). -While, in the explicit case, we needed to enforce the condition \\(s_i \to s_{i+1}\\), in the symbolic case this is already a part of the predicate definition. +While, in the explicit case, we needed to enforce the condition \\(s_i \to s_{i+1}\\), +in the symbolic case this is already a part of the predicate definition. For example, consider: ```tla @@ -496,7 +615,9 @@ a trace of length 2 would be one of \\(0,1,2\\) or \\(1,2,3\\). A symbolic trace In the case of symbolic states, we were particularly interested in symbolic states with respect to predicates that encoded reachability. -Unlike the case of invariants, where we considered conjunctions of sub-invariants, the most interesting scenario w.r.t. traces is when a transition relation is presented as a disjunction of transitions, i.e. when +Unlike the case of invariants, where we considered conjunctions of sub-invariants, +the most interesting scenario w.r.t. traces is when a transition relation is presented as a disjunction of transitions, +i.e. when \begin{align} s_1 \to s_2 \iff& \lor t_1(s_1,s_2)\\\\ & \lor t_2(s_1,s_2)\\\\ @@ -504,13 +625,17 @@ s_1 \to s_2 \iff& \lor t_1(s_1,s_2)\\\\ & \lor t_m(s_1,s_2) \end{align} -At the specification level, this is usually the case when one can nondeterministically choose to perform one of \\(m\\) actions, and each \\(t_1,\dots,t_m\\) is an encoding of one such action, which, like \\(\to\\), translates to a binary predicate over \\(S\\). +At the specification level, this is usually the case when one can nondeterministically choose to perform one of \\(m\\) actions, +and each \\(t_1,\dots,t_m\\) is an encoding of one such action, which, like \\(\to\\), translates to a binary predicate over \\(S\\). -Instead of a single trace \\(C_1, \dots, C_k\\), where states in \\(C_{i+1}\\) are reachable from states in \\(C_i\\) via \\(\to\\), we want to separate sets of states reachable by each \\(t_i\\) individually. +Instead of a single trace \\(C_1, \dots, C_k\\), where states in \\(C_{i+1}\\) are reachable from states in +\\(C_i\\) via \\(\to\\), we want to separate sets of states reachable by each \\(t_i\\) individually. -Recall that symbolic traces are sequences of symbolic states, implicitly related by \\(\to\\), since \\(R\\) is defined in terms of \\(\to\\). +Recall that symbolic traces are sequences of symbolic states, implicitly related by \\(\to\\), +since \\(R\\) is defined in terms of \\(\to\\). We define a symbolic trace decomposition by \\(t_1,\dots,t_m\\), in the following way: -If \\(t_1,\dots,t_m\\) are relations, such that \\(s_1 \to s_2 \iff \bigvee_{i=1}^m t_i(s_1,s_2)\\), the decomposition of a symbolic trace \\(X_0,\dots,X_k\\) of length \\(k\\) w.r.t. \\(t_1,\dots,t_m\\) is a set \\( +If \\(t_1,\dots,t_m\\) are relations, such that \\(s_1 \to s_2 \iff \bigvee_{i=1}^m t_i(s_1,s_2)\\), +the decomposition of a symbolic trace \\(X_0,\dots,X_k\\) of length \\(k\\) w.r.t. \\(t_1,\dots,t_m\\) is a set \\( D = \\{ Y(\tau) \mid \tau \in \\{1,\dots,m\\}^{\\{1,\dots, k\\}} \\} @@ -529,7 +654,9 @@ X_i = \bigcup \left\\{ Y_i(\tau)\mid \tau \in \\{1,\dots,m\\}^{\\{1,\dots, k\\}} \\] Less obvious is the fact that, the larger the index \\(i\\), the finer this decomposition becomes. -Consider \\(i=1\\). Since \\(Y_0\\) is fixed, there are as many different \\(Y_1(\tau)\\) components as there are possible values of \\(\tau(1)\\), i.e. \\(m\\). As \\(Y_2\\) depends on \\(Y_1\\), there are as many different components as there are pairs \\((\tau(1),\tau(2))\\), i.e. \\(m^2\\), and so on until \\(k\\), where there are \\(m^k\\) possible \\(Y_k(\tau)\\) sets. +Consider \\(i=1\\). Since \\(Y_0\\) is fixed, there are as many different \\(Y_1(\tau)\\) components as there are possible values of +\\(\tau(1)\\), i.e. \\(m\\). As \\(Y_2\\) depends on \\(Y_1\\), there are as many different components as there are pairs +\\((\tau(1),\tau(2))\\), i.e. \\(m^2\\), and so on until \\(k\\), where there are \\(m^k\\) possible \\(Y_k(\tau)\\) sets. In practice, however, many of these sets are empty. Let us look at an example: @@ -570,8 +697,11 @@ Without considering the decomposition, the symbolic trace is equal to X_0 = \\{1,\dots,10\\}, X_1 = \\{1,\dots,10\\}, X_2 = \\{1,\dots,10\\} \\] -Under the decomposition, we have \\(m^k = 4^2 = 16\\) candidates for \\(\tau\\). Let us look at \\(\tau_1\\), for which \\(\tau_1(1) = 1, \tau_1(2) = 2\\), representing an execution where the action `A1` is followed by the action `A2`. -If \\(Y_0(\tau_1),Y_1(\tau_1),Y_2(\tau_1)\\) is a partial trace (i.e. one of the elements in the decomposition \\(D\\)), then: +Under the decomposition, we have \\(m^k = 4^2 = 16\\) candidates for \\(\tau\\). +Let us look at \\(\tau_1\\), for which \\(\tau_1(1) = 1, \tau_1(2) = 2\\), representing an execution +where the action `A1` is followed by the action `A2`. +If \\(Y_0(\tau_1),Y_1(\tau_1),Y_2(\tau_1)\\) is a partial trace (i.e. one of the elements in the decomposition \\(D\\)), +then: - \\(Y_1(\tau_1) = \\{ b \in X_1 \mid \exists a \in Y_0(\tau_1) \ .\ t_{\tau_1(1)}(a,b)\\}\\) which means \\[ @@ -592,29 +722,37 @@ so the partial trace, corresponding to the sequence of actions `Init,A1,A2` is In fact, we can draw a table, representing partial traces corresponding to sequences of actions: -| Sequence of actions (after `Init`) | Partial trace (without \\(Y_0\\)) | -|---|---| -| A1, A1 | \\(\\{4, \dots, 9\\}, \\{4, \dots, 8\\}\\) | -| A1, A2 | \\(\\{4, \dots, 9\\}, \\{5, \dots, 7\\}\\) | -| A1, A3 | \\(\\{4, \dots, 9\\}, \\{4, \dots, 9\\}\\) | -| A1, A4 | \\(\\{4, \dots, 9\\}, \emptyset\\) | -| A2, A1 | \\(\\{2, \dots, 7\\}, \\{4, \dots, 6\\}\\) | -| A2, A2 | \\(\\{2, \dots, 7\\}, \\{3, \dots, 7\\}\\) | -| A2, A3 | \\(\\{2, \dots, 7\\}, \\{2, \dots, 7\\}\\) | -| A2, A4 | \\(\\{2, \dots, 7\\}, \emptyset\\) | -| A3, A1 | \\(\\{1, \dots, 10\\}, \\{4, \dots, 9\\}\\) | -| A3, A2 | \\(\\{1, \dots, 10\\}, \\{2, \dots, 7\\}\\) | -| A3, A3 | \\(\\{1, \dots, 10\\}, \\{1, \dots, 10\\}\\) | -| A3, A4 | \\(\\{1, \dots, 10\\}, \\{10\\}\\) | -| A4, A1 | \\(\\{10\\}, \\{9\\}\\) | -| A4, A2 | \\(\\{10\\}, \emptyset\\) | -| A4, A3 | \\(\\{10\\}, \\{10\\}\\) | -| A4, A4 | \\(\\{10\\}, \emptyset\\) | - -Clearly, the elements in every column (representing the various \\(Y_i(\tau)\\)), add up to \\(X_i = \\{1,\dots,10\\}\\). Also noticeable is the fact that some actions disable others, represented by the fact that some \\(Y_2(\tau)\\) sets are empty. For example, the action `A2` disables `A4`, because after `A2`, `x` cannot hold the value \\(1\\), which is a precondition for `A4`. +| Sequence of actions (after `Init`) | Partial trace (without \\(Y_0\\)) | +|------------------------------------|----------------------------------------------| +| A1, A1 | \\(\\{4, \dots, 9\\}, \\{4, \dots, 8\\}\\) | +| A1, A2 | \\(\\{4, \dots, 9\\}, \\{5, \dots, 7\\}\\) | +| A1, A3 | \\(\\{4, \dots, 9\\}, \\{4, \dots, 9\\}\\) | +| A1, A4 | \\(\\{4, \dots, 9\\}, \emptyset\\) | +| A2, A1 | \\(\\{2, \dots, 7\\}, \\{4, \dots, 6\\}\\) | +| A2, A2 | \\(\\{2, \dots, 7\\}, \\{3, \dots, 7\\}\\) | +| A2, A3 | \\(\\{2, \dots, 7\\}, \\{2, \dots, 7\\}\\) | +| A2, A4 | \\(\\{2, \dots, 7\\}, \emptyset\\) | +| A3, A1 | \\(\\{1, \dots, 10\\}, \\{4, \dots, 9\\}\\) | +| A3, A2 | \\(\\{1, \dots, 10\\}, \\{2, \dots, 7\\}\\) | +| A3, A3 | \\(\\{1, \dots, 10\\}, \\{1, \dots, 10\\}\\) | +| A3, A4 | \\(\\{1, \dots, 10\\}, \\{10\\}\\) | +| A4, A1 | \\(\\{10\\}, \\{9\\}\\) | +| A4, A2 | \\(\\{10\\}, \emptyset\\) | +| A4, A3 | \\(\\{10\\}, \\{10\\}\\) | +| A4, A4 | \\(\\{10\\}, \emptyset\\) | + +Clearly, the elements in every column (representing the various \\(Y_i(\tau)\\)), add up to \\(X_i = \\{1,\dots,10\\}\\). +Also noticeable is the fact that some actions disable others, represented by the fact that some \\(Y_2(\tau)\\) sets are empty. +For example, the action `A2` disables `A4`, because after `A2`, `x` cannot hold the value \\(1\\), which is a precondition for `A4`. ### Counterexamples in Apalache -Finally, we can interpret Apalache counterexamples in the context of the above definitions. Given an invariant \\(I\\), a transition system \\((S, S_0, \to)\\) and an upper bound on executions \\(k\\), Apalache first finds predicates \\(t_1,\dots,t_m\\) partitioning \\(\to\\). Then, it encodes a symbolic trace \\(X_0,\dots,X_k\\) and its decomposition \\(D\\). A counterexample in Apalache defines an explicit trace \\(s_0,s_1,\dots,s_l \in S\\) for some \\(l \le k\\), as well as a sequence \\(t_{\tau(1)}, \dots, t_{\tau(l)}\\) (in the comments). The predicate sequence defines a partial trace (of length \\(l\\)) +Finally, we can interpret Apalache counterexamples in the context of the above definitions. +Given an invariant \\(I\\), a transition system \\((S, S_0, \to)\\) and an upper bound on executions \\(k\\), +Apalache first finds predicates \\(t_1,\dots,t_m\\) partitioning \\(\to\\). +Then, it encodes a symbolic trace \\(X_0,\dots,X_k\\) and its decomposition \\(D\\). +A counterexample in Apalache defines an explicit trace \\(s_0,s_1,\dots,s_l \in S\\) for some \\(l \le k\\), +as well as a sequence \\(t_{\tau(1)}, \dots, t_{\tau(l)}\\) (in the comments). +The predicate sequence defines a partial trace (of length \\(l\\)) \\(Y_0(\tau),\dots,Y_l(\tau)\\) and \\(s_0,\dots,s_l\\) are chosen such that \\(s_i \in Y_i(\tau)\\). Take the following specification and counterexample, for \\(k = 10\\): @@ -671,20 +809,31 @@ In the above case, `Transition 0` refers to the one representing `A` and `Transi ## Notation and definitions We use the following definitions and conventions: - - Common sets: We use the notation \\(\mathbb{Z}\\) to refer to the set of all integers, \\(\mathbb{B}\\) to refer to the set of Booleans \\(\\{TRUE,FALSE\\}\\), and \\(\mathbb{N}\\) to refer to the set of all naturals, i.e. \\(\mathbb{N} = \\{z \in \mathbb{Z}\mid z \ge 0\\}\\). + - Common sets: We use the notation \\(\mathbb{Z}\\) to refer to the set of all integers, \\(\mathbb{B}\\) + to refer to the set of Booleans \\(\\{TRUE,FALSE\\}\\), and \\(\mathbb{N}\\) to refer to the set of all naturals, + i.e. \\(\mathbb{N} = \\{z \in \mathbb{Z}\mid z \ge 0\\}\\). - Function sets: We denote by \\(B^A\\) the set of all functions from \\(A\\) to \\(B\\), i.e. \\(f \in B^A \iff f\colon A \to B\\). - Powersets: We denote by \\(2^A\\) the set of all subsets of a set \\(A\\), i.e. \\(B \subseteq A \iff B \in 2^A\\) - Isomorphisms: Sets \\( A \\) and \\(B\\) are called isomorphic, if there exists a bijective function \\(b\in B^A\\). - - Predicates: Given a set \\(T\\), a _predicate_ over \\(T\\) is a function \\(P \in \mathbb{B}^T\\), that is, a function \\(P\\), such that \\(P(t) \in \mathbb{B}\\) for each \\(t \in T\\). - - Relations: Predicates over \\(A \times B\\) are called _relations_. A relation \\(R\\) over \\(T \times T\\) is an _equivalence relation_, if the following holds: + - Predicates: Given a set \\(T\\), a _predicate_ over \\(T\\) is a function \\(P \in \mathbb{B}^T\\), that is, + a function \\(P\\), such that \\(P(t) \in \mathbb{B}\\) for each \\(t \in T\\). + - Relations: Predicates over \\(A \times B\\) are called _relations_. A relation \\(R\\) over \\(T \times T\\) is an _equivalence relation_, + if the following holds: - For all \\(t \in T\\), it is the case that \\(R(t,t)\\) (reflexivity). - For all \\(s,t \in T\\), \\(R(s,t)\\) holds if and only if \\(R(t,s)\\) holds (symmetry). - For all \\(r,s,t \in T\\), \\(R(r,s) \land R(s,t)\\) implies \\(R(r,t)\\) (transititvity). - - Equivalence classes: An equivalence relation \\(R\\) over \\(T \times T\\) defines a function \\(E \in (2^T)^T\\), such that, for each \\(t \in T\\), \\(E(t) = \\{ s \in T\mid R(t,s) \\}\\). \\(E(t)\\) is called the _equivalence class_ of \\(t\\) for \\(R\\), denoted as \\(\lbrack t\rbrack_R\\). - - Quotient space: An equivalence relation \\(R\\) over \\(T \times T\\) defines a _quotient space_, denoted \\(T / R\\), such that \\(T / R = \\{ \lbrack t\rbrack_R \mid t \in T \\} \subseteq 2^T\\). + - Equivalence classes: An equivalence relation \\(R\\) over \\(T \times T\\) defines a function \\(E \in (2^T)^T\\), + such that, for each \\(t \in T\\), \\(E(t) = \\{ s \in T\mid R(t,s) \\}\\). \\(E(t)\\) + is called the _equivalence class_ of \\(t\\) for \\(R\\), denoted as \\(\lbrack t\rbrack_R\\). + - Quotient space: An equivalence relation \\(R\\) over \\(T \times T\\) defines a _quotient space_, denoted \\(T / R\\), + such that \\(T / R = \\{ \lbrack t\rbrack_R \mid t \in T \\} \subseteq 2^T\\). - Subset-predicate equivalence: For any set \\(T\\), there exists a natural isomorphism between - \\(\mathbb{B}^T\\) and \\(2^T\\) (implied by the similarity in notation): Each predicate \\(P \in \mathbb{B}^T\\) corresponds to the set \\(\\{ t \in T \mid P(t) = TRUE\\} \in 2^T\\). For this reason, predicates are often directly identified with the subset they are equivalent to, and we write \\(P \subseteq T\\) for brevity. - - Infix notation: Given a relation \\(R \in \mathbb{B}^{A\times B}\\), we commonly write \\(a\ R\ b\\) instead of \\(R(a,b)\\) (e.g. \\(a > b\\) instead of \\(>(a,b)\\)). - - Cartesian product: Given a set \\(T\\), we use \\(T^2\\) to refer to \\(T \times T\\). \\(T^k\\), for \\(k > 2\\) is defined similarly. + \\(\mathbb{B}^T\\) and \\(2^T\\) (implied by the similarity in notation): Each predicate \\(P \in \mathbb{B}^T\\) + corresponds to the set \\(\\{ t \in T \mid P(t) = TRUE\\} \in 2^T\\). + For this reason, predicates are often directly identified with the subset they are equivalent to, and we write \\(P \subseteq T\\) for brevity. + - Infix notation: Given a relation \\(R \in \mathbb{B}^{A\times B}\\), we commonly write \\(a\ R\ b\\) + instead of \\(R(a,b)\\) (e.g. \\(a > b\\) instead of \\(>(a,b)\\)). + - Cartesian product: Given a set \\(T\\), we use \\(T^2\\) to refer to \\(T \times T\\). + \\(T^k\\), for \\(k > 2\\) is defined similarly. diff --git a/docs/src/tutorials/temporal-properties.md b/docs/src/tutorials/temporal-properties.md index a567325428..78f04dd21c 100644 --- a/docs/src/tutorials/temporal-properties.md +++ b/docs/src/tutorials/temporal-properties.md @@ -99,8 +99,8 @@ stuttering like this: {{#include TrafficLight.tla:stutternext}} ``` -Recall that `[Next]_vars` is shorthand for `Next \/ UNCHANGED vars`. Now, let us try to verify the property once again, -using the modified next predicate: +Recall that `[Next]_vars` is shorthand for `Next \/ UNCHANGED vars`. +Now, let us try to verify the property once again, using the modified next predicate: ``` apalache-mc check --next=StutteringNext \ @@ -164,20 +164,20 @@ Two things are notable: 1. The initial state formula appears twice, once as a comment and once in TLA+. 2. There are way more variables than the two variables we specified. -The comment and the TLA+ specification express the same state, but in the comment, some variable names from the encoding have been replaced -with more human-readable names. +The comment and the TLA+ specification express the same state, but in the comment, +some variable names from the encoding have been replaced with more human-readable names. For example, there is a variable called `☐(requestedGreen ⇒ ♢isGreen)` in the comment, which is called `__temporal_t_1` in TLA+. In the following, let's focus on the content of the comment, since it's easier to understand what's going on. -There are many additional variables in the counterexample because to check temporal formulas, Apalache uses an +There are many additional variables in the counterexample, because to check temporal formulas, Apalache uses an encoding that transforms temporal properties to invariants. If you are interested in the technical details, the encoding is described in sections 3.2 and 4 of [Biere et al.][]. However, to understand the counterexample, you don't need to go into the technical details of the encoding. We'll go explain the counterexample in the following. We will talk about traces in the following. -You can find more information about (symbolic) traces [here](/tutorials/symbmc.html?highlight=trace#symbolic-traces). +You can find more information about (symbolic) traces [here](../tutorials/symbmc.md?highlight=trace#symbolic-traces). For the purpose of this tutorial, however, it will be enough to think of a trace as a sequence of states that were encountered by Apalache, and that demonstrate a violation of the property that is checked. @@ -334,7 +334,8 @@ leading box operator. To illustrate why these are necessary, consider the formula `[]isGreen`. To decide whether this formula holds in the last state of the loop, the algorithm needs to know whether `isGreen` holds in all states of the loop. So it needs to store this information when it traverses the loop. -That's why there is an extra variable, which stores whether `isGreen` holds on all states of the loop, and Apalache can access this information when it explores the last state of the loop. +That's why there is an extra variable, which stores whether `isGreen` holds on all states of the loop, +and Apalache can access this information when it explores the last state of the loop. Similarly, the unroll-variable `♢isGreen_unroll` holds true if there is a state on the loop such that `isGreen` is true. @@ -379,7 +380,8 @@ is not true, since `☐(requestedGreen ⇒ ♢isGreen)` does not hold in state 2, which is on the loop. Similar to the `__loop_` copies for model variables, -we also introduce copies for all (temporal) subformulas, e.g., `__loop_☐(requestedGreen ⇒ ♢isGreen)` for `☐(requestedGreen ⇒ ♢isGreen)`. +we also introduce copies for all (temporal) subformulas, +e.g., `__loop_☐(requestedGreen ⇒ ♢isGreen)` for `☐(requestedGreen ⇒ ♢isGreen)`. These fulfill the same function as the `__loop_` copies for the original variables of the model, i.e., retaining the state of variables from the first state of the loop, e.g., @@ -397,8 +399,8 @@ Finally, let's discuss `RequestWillBeFulfilled_init`. This variable is an artifact of the translation for temporal properties. Intuitively, in any state, the variable will be true if the variable encoding the formula `RequestWillBeFulfilled` is true in the first state. -A trace is a counterexample if `RequestWillBeFulfilled` is false in the first state, so `RequestWillBeFulfilled_init` is false, -and a loop satisfying requirements on the auxiliary variables is found. +A trace is a counterexample if `RequestWillBeFulfilled` is false in the first state, +so `RequestWillBeFulfilled_init` is false, and a loop satisfying requirements on the auxiliary variables is found. ## Further reading diff --git a/docs/src/tutorials/trail-tips.md b/docs/src/tutorials/trail-tips.md index ac54270b4c..539403faa3 100644 --- a/docs/src/tutorials/trail-tips.md +++ b/docs/src/tutorials/trail-tips.md @@ -55,7 +55,7 @@ invariant](../apalache/principles/invariants.md#action-invariants): ``` Our version of `NextFast` is quite concise and it uses the good parts of -TLA+. However, new TLA+ users would probably write it differently. Below you +TLA+. However, new TLA+ users would probably write it differently. Below, you can see the version that is more likely to be written by a specification writer who has good experience in software engineering: