From 11fa484fd338df6f49dd3985e9665dcb756ee73e Mon Sep 17 00:00:00 2001 From: Timothy Zakian Date: Thu, 16 May 2024 11:32:22 -0700 Subject: [PATCH 1/4] [move][docs] Add section on pattern matching to Move book --- .../move/documentation/book/src/SUMMARY.md | 2 + .../documentation/book/src/control-flow.md | 4 +- .../book/src/control-flow/pattern-matching.md | 567 ++++++++++++++++++ .../move/documentation/book/src/enums.md | 3 +- 4 files changed, 574 insertions(+), 2 deletions(-) create mode 100644 external-crates/move/documentation/book/src/control-flow/pattern-matching.md diff --git a/external-crates/move/documentation/book/src/SUMMARY.md b/external-crates/move/documentation/book/src/SUMMARY.md index c91679eb041e9..14dcfad90ef77 100644 --- a/external-crates/move/documentation/book/src/SUMMARY.md +++ b/external-crates/move/documentation/book/src/SUMMARY.md @@ -22,8 +22,10 @@ - [Conditional Expressions](control-flow/conditionals.md) - [Loops](control-flow/loops.md) - [Labeled Control FLow](control-flow/labeled-control-flow.md) + - [Pattern Matching](control-flow/pattern-matching.md) - [Functions](functions.md) - [Structs](structs.md) +- [Enums](enums.md) - [Constants](constants.md) - [Generics](generics.md) - [Type Abilities](abilities.md) diff --git a/external-crates/move/documentation/book/src/control-flow.md b/external-crates/move/documentation/book/src/control-flow.md index 2bfe35731f974..f9f7a4e58e00c 100644 --- a/external-crates/move/documentation/book/src/control-flow.md +++ b/external-crates/move/documentation/book/src/control-flow.md @@ -3,8 +3,10 @@ Move offers multiple constructs for control flow based on [boolean expressions](./primitive-types/bool.md), including common programming constructs such as `if` expressions and `while` and `for` loops, along with advanced control flow structures including -labels for loops and escapable named blocks. +labels for loops and escapable named blocks. 
It also supports more complex constructs based on +structural pattern matching. - [Conditional Expressions](./control-flow/conditionals.md) - [Loops](./control-flow/loops.md) - [Labeled Control FLow](./control-flow/labeled-control-flow.md) +- [Pattern Matching](./control-flow/pattern-matching.md) diff --git a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md new file mode 100644 index 0000000000000..ed3164cbae4db --- /dev/null +++ b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md @@ -0,0 +1,567 @@ +# Pattern Matching + +A `match` expression is a powerful control structure that allows you to compare a value against a +series of patterns and then execute code based on which pattern matches first. Patterns can be +anything from simple literals to complex structures. As opposed to `if` expressions, which require a +`bool`ean expression, a `match` expression requires a value to be matched against a series of +patterns. + +A `match` expression can match by value, by immutable reference, or by mutable reference. + +A pattern is matched by a value if the value is equal to the pattern, where variables and +wildcards (e.g., `x`, `y`, `_`, or `..`) are "equal" to anything. + +For example: + +```move +fun run(x: u64): u64 { + match (x) { + 1 => 2, + 2 => 3, + x => x, + } +} + +run(1); // returns 2 +run(2); // returns 3 +run(3); // returns 3 +run(0); // returns 0 +``` + +## Syntax + +A `match` takes an expression and a non-empty series of _match arms_ delimited by commas. + +Each match arm consists of a pattern `p`, an optional guard `if (g)` where `g` is an expression of +type `bool`, followed by an arrow `=>` and an expression `e`. E.g., + +```move +match (expression) { + pattern1 if (guard_expression) => expression1, + pattern2 => expression2, + pattern3 => { expression3, expression4, ... 
}, +} +``` + +Match arms are checked in order from top to bottom, and the first match arm whose pattern matches +(and whose guard expression, if any, returns `true`) will be executed. + +Note that the series of match arms within a `match` must be exhaustive, meaning that every possible +value of the type being matched must be covered by one of the patterns in the `match`. If the series +of match arms is not exhaustive, the compiler will raise an error. + +## Patterns + +Patterns are used to match values. Patterns can be + +- literals (`true`, `2`, `@0x4`); +- constants (`MyConstant`); +- variables (`a`, `b`, `x`); +- wildcards (`_`); +- constructor patterns (`MyStruct { a, b }`, `MyEnum::Variant(x)`); +- at-patterns (`x @ MyEnum::Variant(..)`); and +- or-patterns (`MyEnum::Variant(..) | MyEnum::OtherVariant(..)`). + +Additionally, depending on the context, patterns may also include: + +- multi-arity wildcards (`..`); and +- mutable-binding patterns (`mut x`). + +Some examples of patterns are: + +```move +public enum MyEnum { + Variant(u64, bool), + OtherVariant(bool, u64), +} + +public enum OtherEnum { + V(MyEnum) +} + +public struct MyStruct { + x: u64, + y: u64, +} + +// literal pattern +1 + +// constant pattern +MyConstant + +// variable pattern +x + +// wildcard pattern +_ + +// constructor pattern that matches `MyEnum::Variant` with the fields `1` and `true` +MyEnum::Variant(1, true) + +// constructor pattern that matches `MyEnum::Variant` with the fields `1` and binds the second field's value to `x` +MyEnum::Variant(1, x) + +// multi-arity wildcard pattern that matches multiple fields within the `MyEnum::Variant` variant +MyEnum::Variant(..) + +// constructor pattern that matches the `x` field of `MyStruct` and binds the `y` field to `other_variable` +MyStruct { x, y: other_variable } + +// at-pattern that matches `MyEnum::Variant` and binds the entire value to `x` +x @ MyEnum::Variant(..) + +// or-pattern that matches either `MyEnum::Variant` or `MyEnum::OtherVariant` +MyEnum::Variant(..) | MyEnum::OtherVariant(..) 
+ +// Same as the above or-pattern, but with explicit wildcards +MyEnum::Variant(_, _) | MyEnum::OtherVariant(_, _) + +// or-pattern that matches either `MyEnum::Variant` or `MyEnum::OtherVariant` and binds the u64 field to `x` +MyEnum::Variant(x, _) | MyEnum::OtherVariant(_, x) + +// constructor pattern that matches `OtherEnum::V` and if the inner `MyEnum` is `MyEnum::Variant` +OtherEnum::V(MyEnum::Variant(..)) +``` + +More concisely we have the following grammar for patterns in Move: + +```bnf +pattern = <literal> + | <constant> + | <variable> + | _ + | C { inner-pattern, inner-pattern, ... } // where C is a struct or enum variant + | C ( inner-pattern, inner-pattern, ... ) // where C is a struct or enum variant + | C // where C is an enum variant + | <variable> @ top-level-pattern + | pattern | pattern +inner-pattern = pattern + | .. + | mut pattern +``` + +Patterns that contain variables bind them to the value being matched. These variables can then be +used either in any match guard expressions, or on the right-hand side of the match arm. For example: + +```move +public struct Wrapper(u64) + +fun add_under_wrapper_unless_equal(wrapper: Wrapper, x: u64): Wrapper { + match (wrapper) { + Wrapper(y) if (y == x) => Wrapper(y), + Wrapper(y) => Wrapper(y + x), + } +} +add_under_wrapper_unless_equal(Wrapper(1), 2); // returns Wrapper(3) +add_under_wrapper_unless_equal(Wrapper(2), 3); // returns Wrapper(5) +add_under_wrapper_unless_equal(Wrapper(3), 3); // returns Wrapper(3) +``` + +Patterns can be nested, and patterns can be or'd with other patterns. The `..` pattern is a special +pattern that matches any number of fields in a struct or enum variant, but it can only occur within +a constructor pattern. Similarly, the `mut` pattern can only be used within constructor patterns -- +this is used to specify that we want to use the variable mutably on the right-hand-side of the match +arm. + +Patterns are not expressions, but they are nevertheless typed just like expressions. 
This means that +the type of a pattern must match the type of the value it matches. For example, the pattern `1` has +type `u64`, the pattern `MyEnum::Variant(1, true)` has type `MyEnum`, and the pattern +`MyStruct { x, y }` has type `MyStruct`. If you try to match on an expression whose type differs from the +type of the pattern in the match, this will result in a type error. For example: + +```move +match (1) { + // The `true` literal pattern is of type `bool` so this is a type error. + true => 1, + // TYPE ERROR: expected type u64, found bool + _ => 2, +} +``` + +Similarly the following would also result in a type error since `MyEnum` and `MyStruct` are +different types: + +``` +match (MyStruct { x: 0, y: 0 }) { + MyEnum::Variant(..) => 1, + // TYPE ERROR: expected type MyEnum, found MyStruct +} +``` + +Additionally, there are some restrictions on when the `..` pattern, and `mut` pattern modifier can +be used in a pattern. + +A `mut` modifier can only occur within a constructor pattern, and cannot be a top-level pattern. The +value being matched on must be either a mutable reference or by value in order for a `mut` pattern +to be used; otherwise the compiler will raise an error. + +```move +public struct MyStruct(u64) + +fun top_level_mut(x: MyStruct) { + match (x) { + mut MyStruct(y) => 1, + // ERROR: cannot use mut pattern as a top-level pattern + } +} + +fun mut_on_non_mut(x: MyStruct): u64 { + match (x) { + // OK! Since `x` is matched by value, `y` is a mutable `u64` (no dereference needed) + MyStruct(mut y) => { + y = y + 1; + y + }, + } +} + +fun mut_on_mut(x: &mut MyStruct): u64 { + match (x) { + // OK! 
Since `x` is matched by mutable reference + MyStruct(mut y) => { + *y = *y + 1; + *y + }, + } +} + +let mut x = MyStruct(1); +mut_on_mut(&mut x); // returns 2 +x.0; // returns 2 + +fun mut_on_immut(x: &MyStruct): u64 { + match (x) { + MyStruct(mut y) => ..., + // ERROR: cannot use mut pattern on a non-mutable reference + } +} +``` + +The `..` pattern can only be used within a constructor pattern and: + +- It can only be used once within the constructor pattern; +- In positional arguments it can be used at the beginning, middle, or end of the patterns within the + constructor; +- In named arguments it can only be used at the end of the patterns within the constructor. + +```move +public struct MyStruct(u64, u64, u64, u64) has drop; + +public struct MyStruct2 { + x: u64, + y: u64, + z: u64, + w: u64, +} + +fun wild_match(x: MyStruct): u64 { + match (x) { + MyStruct(.., 1) => 1, + // OK! The `..` pattern can be used at the beginning of the constructor pattern + MyStruct(1, ..) => 2, + // OK! The `..` pattern can be used at the end of the constructor pattern + MyStruct(1, .., 1) => 3, + // OK! The `..` pattern can be used at the middle of the constructor pattern + MyStruct(1, .., 1, 1) => 4, + MyStruct(..) => 5, + } +} + +fun wild_match2(x: MyStruct2): u64 { + match (x) { + MyStruct2 { x: 1, .. } => 1, + MyStruct2 { x: 1, w: 2, .. } => 2, + MyStruct2 { .. } => 3, + } +} +``` + +## Matching + +Prior to delving into the specifics of pattern matching and what it means for a value to "match" a +pattern, let's examine a few examples to provide an intuition for the concept. + +```move +fun test_lit(x: u64): u8 { + match (x) { + 1 => 2, + 2 => 3, + _ => 4, + } +} +test_lit(1); // returns 2 +test_lit(2); // returns 3 +test_lit(3); // returns 4 +test_lit(10); // returns 4 + +fun test_var(x: u64): u64 { + match (x) { + y => y, + } +} +test_var(1); // returns 1 +test_var(2); // returns 2 +test_var(3); // returns 3 +... 
+ +const MyConstant: u64 = 10; +fun test_constant(x: u64): u64 { + match (x) { + MyConstant => 1, + _ => 2, + } +} +test_constant(MyConstant); // returns 1 +test_constant(10); // returns 1 +test_constant(20); // returns 2 + +fun test_or_pattern(x: u64): u64 { + match (x) { + 1 | 2 | 3 => 1, + 4 | 5 | 6 => 2, + _ => 3, + } +} +test_or_pattern(1); // returns 1 +test_or_pattern(2); // returns 1 +test_or_pattern(3); // returns 1 +test_or_pattern(4); // returns 2 +test_or_pattern(5); // returns 2 +test_or_pattern(6); // returns 2 +test_or_pattern(7); // returns 3 +test_or_pattern(70); // returns 3 + +fun test_or_at_pattern(x: u64): u64 { + match (x) { + x @ (1 | 2 | 3) => x + 1, + y @ (4 | 5 | 6) => y + 2, + z => z + 3, + } +} +test_or_at_pattern(1); // returns 2 +test_or_at_pattern(2); // returns 3 +test_or_at_pattern(3); // returns 4 +test_or_at_pattern(4); // returns 6 +test_or_at_pattern(5); // returns 7 +test_or_at_pattern(6); // returns 8 +test_or_at_pattern(7); // returns 10 +test_or_at_pattern(70); // returns 73 +``` + +The most important thing to note from these examples is that a pattern matches a value if the value +is equal to the pattern, and wildcard/variable patterns match anything. This is true for literals, +variables, and constants. For example, in the `test_lit` function, the value `1` matches the pattern +`1`, the value `2` matches the pattern `2`, and the value `3` matches the wildcard `_`. Similarly, +in the `test_var` function, the value `1` matches the pattern `y` and the value `2` matches the +pattern `y`. + +A variable `x` matches (or "equals") any value, and a wildcard `_` matches any value (but only one +value!). Or-patterns are like a logical OR, where a value matches the pattern if it matches any of +the patterns in the or-pattern, so `p1 | p2 | p3` should be read "matches p1, or p2, or p3". + +The most interesting part of pattern matching is constructor patterns. 
These patterns allow you to +inspect and access deep within both structs and enums, and are the most powerful part of pattern +matching. Constructor patterns, coupled with variable bindings, allow you to match on values by +their structure, and pull out the parts of the value you care about for usage on the right-hand side +of the match arm. + +Take the following: + +```move +fun f(x: MyEnum): u64 { + match (x) { + MyEnum::Variant(1, true) => 1, + MyEnum::OtherVariant(_, 3) => 2, + MyEnum::Variant(..) => 3, + MyEnum::OtherVariant(..) => 4, + } +} +f(MyEnum::Variant(1, true)); // returns 1 +f(MyEnum::Variant(2, true)); // returns 3 +f(MyEnum::OtherVariant(false, 3)); // returns 2 +f(MyEnum::OtherVariant(true, 3)); // returns 2 +f(MyEnum::OtherVariant(true, 2)); // returns 4 +``` + +This is saying that "if `x` is `MyEnum::Variant` with the fields `1` and `true`, then return `1`, if +it is `MyEnum::OtherVariant` with any value for the first field, and `3` for the second, then return +`2`, if it is `MyEnum::Variant` with any fields, then return `3`, and if it is +`MyEnum::OtherVariant` with any fields, then return `4`". + +You can also nest patterns, so if you wanted to match either 1, 2, or 10, instead of just matching 1 +in the `MyEnum::Variant` above, you could do so with an or-pattern: + +```move +fun f(x: MyEnum): u64 { + match (x) { + MyEnum::Variant(1 | 2 | 10, true) => 1, + MyEnum::OtherVariant(_, 3) => 2, + MyEnum::Variant(..) => 3, + MyEnum::OtherVariant(..) => 4, + } +} +f(MyEnum::Variant(1, true)); // returns 1 +f(MyEnum::Variant(2, true)); // returns 1 +f(MyEnum::Variant(10, true)); // returns 1 +f(MyEnum::Variant(10, false)); // returns 3 +``` + +Additionally, ability restrictions on the value being matched must be respected when matching. In +particular, you cannot wildcard match on a non-droppable value (when matching by value), and if you +bind a non-droppable value to a variable that variable _must_ be used in the match arm. 
However, if +you fully destructure the non-droppable value then you can wildcard match on the fields within it. + +```move +public struct NonDrop(u64) + +fun drop_nondrop(x: NonDrop): u64 { + match (x) { + NonDrop(1) => 1, + _ => 2 + // ERROR: cannot wildcard match on a non-droppable value + } +} + +fun destructure_nondrop(x: NonDrop): u64 { + match (x) { + NonDrop(1) => 1, + NonDrop(_) => 2 + // OK! + } +} + +fun use_nondrop(x: NonDrop): NonDrop { + match (x) { + NonDrop(1) => NonDrop(8), + x => x + } +} +``` + +## Exhaustiveness + +The `match` expression in Move must be _exhaustive_; every possible value of the type being matched +must be covered by one of the patterns in one of the match's arms. If the series of match arms is +not exhaustive, the compiler will raise an error. + +As an example, if we were to match on a `u8` then in order for the match to be exhaustive we would +need to match on every number from 0 to 255 inclusive, or a wildcard or variable pattern would need +to be present. Similarly, if we were to match on a `bool` then we would need to match on both `true` +and `false`, or a wildcard or variable pattern would need to be present. + +For structs, since there is only one type of constructor for the type, only one constructor needs to +be matched on, but the fields within the struct need to be matched exhaustively as well. Similarly +for enums, since there are multiple variants that can inhabit the type, each variant needs to be +matched on, and each field type within each variant needs to be matched in order for the match to be +considered exhaustive. + +Since underscores and variables match anything, they count as matching all values of the type they +are matching on in that position. Additionally, the multi-arity wildcard pattern `..` can be used to +match on multiple values within a struct or enum variant. 
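As a minimal sketch of the exhaustiveness rules above, the following shows a wildcard arm covering the rest of a `u8` and `..` covering all fields of each enum variant. The module name `example::exhaustive`, the `Shape` enum, and both functions are hypothetical names introduced only for this illustration; they are not part of the patch.

```move
module example::exhaustive {
    // Hypothetical enum used only for this illustration.
    public enum Shape has drop {
        Circle(u64),
        Rect { w: u64, h: u64 },
    }

    // Exhaustive over `u8`: two literal arms, then a wildcard for every other value.
    public fun classify(x: u8): u64 {
        match (x) {
            0 => 10,
            1 => 20,
            _ => 30,
        }
    }

    // Exhaustive over `Shape`: each variant has an arm, and `..` matches all of its fields.
    public fun is_circle(s: &Shape): bool {
        match (s) {
            Shape::Circle(..) => true,
            Shape::Rect { .. } => false,
        }
    }
}
```

Removing the `_` arm from `classify`, or either arm from `is_circle`, would leave some values uncovered and so make the match non-exhaustive.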
+ +To see some examples of _non-exhaustive_ matches, consider the following: + +```move +public enum MyEnum { + Variant(u64, bool), + OtherVariant(bool, u64), +} + +public struct Pair<T>(T, T) + +fun f(x: MyEnum): u8 { + match (x) { + MyEnum::Variant(1, true) => 1, + MyEnum::Variant(_, _) => 1, + MyEnum::OtherVariant(_, 3) => 2, + // ERROR: not exhaustive as the value `MyEnum::OtherVariant(_, 4)` is not matched. + } +} + +fun match_pair_bool(x: Pair<bool>): u8 { + match (x) { + Pair(true, true) => 1, + Pair(true, false) => 1, + Pair(false, false) => 1, + // ERROR: not exhaustive as the value `Pair(false, true)` is not matched. + } +} +``` + +These examples can then be made exhaustive by adding a wildcard pattern as the final match arm, +or by fully matching on the remaining values: + +```move +fun f(x: MyEnum): u8 { + match (x) { + MyEnum::Variant(1, true) => 1, + MyEnum::Variant(_, _) => 1, + MyEnum::OtherVariant(_, 3) => 2, + // Now exhaustive since this will match all values of MyEnum::OtherVariant + MyEnum::OtherVariant(..) => 2, + + } +} + +fun match_pair_bool(x: Pair<bool>): u8 { + match (x) { + Pair(true, true) => 1, + Pair(true, false) => 1, + Pair(false, false) => 1, + // Now exhaustive since this will match all values of Pair + Pair(false, true) => 1, + } +} +``` + +## Guards + +As mentioned above you can add a guard to a match arm by adding an `if` clause after the pattern. +This guard will run _after_ the pattern has been matched but _before_ the expression on the +right-hand-side of the arrow is evaluated. If the guard expression evaluates to `true` then the +expression on the right-hand side of the arrow will be evaluated; if it evaluates to `false` then it +will be considered a "failed match" and the next match arm in the `match` expression will be +checked. 
+ +```move +fun match_with_guard(x: u64): u64 { + match (x) { + 1 if (x == 0) => 1, + 1 => 2, + _ => 3, + } +} + +match_with_guard(1); // returns 2 +match_with_guard(0); // returns 3 +``` + +Guard expressions have access to the variables bound in the pattern, and can use them in the guard. +However, it is important to note that variables are available only by immutable reference in guards, +regardless of the pattern being matched, even if there are mutability specifiers on the variable or the +pattern is being matched by value. + +```move +fun incr(x: &mut u64) { + *x = *x + 1; +} + +fun match_with_guard_incr(x: u64): u64 { + match (x) { + x if ({ incr(&mut x); x == 1 }) => 1, + // ERROR: ^^^ invalid borrow of immutable value + _ => 2, + } +} + +fun match_with_guard_incr2(x: &mut u64): u64 { + match (x) { + x if ({ incr(&mut x); x == 1 }) => 1, + // ERROR: ^^^ invalid borrow of immutable value + _ => 2, + } +} +``` + +Additionally, it is important to note that any match arms that have guard expressions will not be +considered for exhaustiveness purposes, since the compiler has no way of evaluating the guard +expression statically. diff --git a/external-crates/move/documentation/book/src/enums.md b/external-crates/move/documentation/book/src/enums.md index f76de8f6ec9b4..31ace106e0781 100644 --- a/external-crates/move/documentation/book/src/enums.md +++ b/external-crates/move/documentation/book/src/enums.md @@ -194,7 +194,8 @@ You can pattern match on Move values by value, immutable reference, and mutable pattern matching by value, the value is moved into the match arm. When pattern matching by reference, the value is borrowed into the match arm (either immutably or mutably). We'll go through a brief description of pattern matching using `match` here, but for more information on pattern -matching using `match` in Move see the [Pattern Matching](./pattern_matching.md) section. 
+matching using `match` in Move see the [Pattern Matching](./control-flow/pattern-matching.md) +section. A `match` statement is used to pattern match on a Move value and consists of a number of _match arms_. Each match arm consists of a pattern, an arrow `=>`, and an expression, followed by a comma From 1a3414e6dc562d289b0f86569056c8085f72ffe7 Mon Sep 17 00:00:00 2001 From: Tim Zakian <2895723+tzakian@users.noreply.github.com> Date: Thu, 16 May 2024 15:16:23 -0700 Subject: [PATCH 2/4] Apply suggestions from code review Co-authored-by: Cam Swords --- .../book/src/control-flow/pattern-matching.md | 57 +++++++++---------- 1 file changed, 28 insertions(+), 29 deletions(-) diff --git a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md index ed3164cbae4db..16c5482067ffa 100644 --- a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md +++ b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md @@ -2,11 +2,11 @@ A `match` expression is a powerful control structure that allows you to compare a value against a series of patterns and then execute code based on which pattern matches first. Patterns can be -anything from simple literals to complex structures. As opposed to `if` expressions, which require a -`bool`ean expression, a `match` expression requires a value to be matched against a series of -patterns. +anything from simple literals to complex, nested struct and enum definitions. As opposed to `if` expressions, which change control flow based on a `bool`-typed test expression, a `match` expression operates over +a value of any type and selects one of many arms. -A `match` expression can match either by value, immutable reference, or mutable reference. +A `match` expression can match Move values, immutable references, or mutable references, binding +sub-patterns accordingly. 
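To illustrate how bindings follow the kind of value being matched, here is a minimal sketch under stated assumptions: the module `example::match_modes`, the `Counter` struct, and the three functions are hypothetical and exist only for this example.

```move
module example::match_modes {
    // Hypothetical type used only for this illustration.
    public struct Counter(u64) has drop;

    // Matching by value: `n` is a `u64` taken out of the wrapper.
    public fun take(c: Counter): u64 {
        match (c) {
            Counter(n) => n,
        }
    }

    // Matching an immutable reference: `n` is a `&u64`, so it is dereferenced to read it.
    public fun peek(c: &Counter): u64 {
        match (c) {
            Counter(n) => *n,
        }
    }

    // Matching a mutable reference: `n` is a `&mut u64`, so the field can be updated in place.
    public fun bump(c: &mut Counter) {
        match (c) {
            Counter(n) => { *n = *n + 1; },
        }
    }
}
```

The same constructor pattern `Counter(n)` is used in all three functions; only the type of the matched expression changes what `n` is bound to.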
A pattern is matched by a value if the value is equal to the pattern, and where variables and wildcards (e.g., `x`, `y`, `_`, or `..`) are "equal" to anything. @@ -33,7 +33,7 @@ run(0); // returns 0 A `match` takes an expression and a non-empty series of _match arms_ delimited by commas. Each match arm consists of a pattern `p`, an optional guard `if (g)` where `g` is an expression of -type `bool`, followed by an arrow `=>`and an expression `e`. E.g., +type `bool`, an arrow `=>`, and an arm expression `e` to execute when the pattern matches. E.g., ```move match (expression) { @@ -43,8 +43,8 @@ match (expression) { } ``` -Match arms are checked in order from top to bottom, and the first match arm whose pattern matches -(and whose guard expression, if any, returns `true`) will be executed. +Match arms are checked in order from top to bottom, and the first arm whose pattern matches +(and whose guard expression, if present, evaluates to `true`) will be executed. Note that the series of match arms within a `match` must be exhaustive, meaning that every possible value of the type being matched must be covered by one of the patterns in the `match`. If the series @@ -131,17 +131,18 @@ pattern = <literal> | <constant> | <variable> | _ - | C { inner-pattern, inner-pattern, ... } // where C is a struct or enum variant - | C ( inner-pattern, inner-pattern, ... ) // where C is a struct or enum variant + | C { <field> : inner-pattern ["," <field> : inner-pattern]* } // where C is a struct or enum variant + | C ( inner-pattern ["," inner-pattern]* ... ) // where C is a struct or enum variant | C // where C is an enum variant | <variable> @ top-level-pattern | pattern | pattern inner-pattern = pattern | .. - | mut pattern + | mut <variable> ``` -Patterns that contain variables bind them to the value being matched. These variables can then be +Patterns that contain variables bind them to the match subject or subject subcomponent being matched. 
+These variables can then be used either in any match guard expressions, or on the right-hand side of the match arm. For example: ```move @@ -158,13 +159,14 @@ add_under_wrapper_unless_equal(Wrapper(2), 3); // returns Wrapper(5) add_under_wrapper_unless_equal(Wrapper(3), 3); // returns Wrapper(3) ``` -Patterns can be nested, and patterns can be or'd with other patterns. The `..` pattern is a special +Patterns can be nested, and patterns can be combined using the or operator `|`, which will succeed if either +pattern matches. The `..` pattern is a special pattern that matches any number of fields in a struct or enum variant, but it can only occur within a constructor pattern, similarly the `mut` pattern can only be used within constructor patterns -- this is used to specify that we want to use the variable mutably on the right-hand-side of the match arm. -Patterns are not expressions, but they are nevertheless typed just like expressions. This means that +Patterns are not expressions, but they are nevertheless typed. This means that the type of a pattern must match the type of the value it matches. For example, the pattern `1` has type `u64`, the pattern `MyEnum::Variant(1, true)` has type `MyEnum`, and the pattern `MyStruct { x, y }` has type `MyStruct`. If you try to match on an expression which differs from the @@ -184,8 +186,8 @@ different types: ``` match (MyStruct { x: 0, y: 0 }) { - MyEnum::Variant(..) => 1, // TYPE ERROR: expected type MyEnum, found MyStruct + MyEnum::Variant(..) => 1, } ``` @@ -194,7 +196,7 @@ be used in a pattern. A `mut` modifier can only occur within a constructor pattern, and cannot be a top-level pattern. The value being matched on must be either a mutable reference or by value in order for a `mut` pattern -to be used otherwise the compiler will raise an error. +to be used. 
```move public struct MyStruct(u64) @@ -240,7 +242,7 @@ fun mut_on_immut(x: &MyStruct): u64 { The `..` pattern can only be used within a constructor pattern and: -- It can only be used once within the constructor pattern; +- It can only be used **once** within the constructor pattern; - In positional arguments it can be used at the beginning, middle, or end of the patterns within the constructor; - In named arguments it can only be used at the end of the patterns within the constructor; @@ -405,10 +407,7 @@ f(MyEnum::Variant(10, true)); // returns 1 f(MyEnum::Variant(10, false)); // returns 3 ``` -Additionally, when matching ability restrictions on the value being matched must be followed. In -particular, you cannot wildcard match on a non-droppable value (when matching by value), and if you -bind a non-droppable value to a variable that variable _must_ be used in the match arm. However, if -you fully destructure the non-droppable value then you can wildcard match on the fields within it. +Additionally, match bindings are subject to the same ability restrictions as other aspects of Move. In particular, the compiler will signal an error if you try to match a value (i.e., not a reference) without `drop` using a wildcard, as the wildcard expects to drop the value. Similarly, if you bind a non-`drop` value using a binder, it must be used in the right-hand side of the match arm. In addition, if you fully destructure that value, you have unpacked it, matching the semantics of [non-`drop` struct unpacking](link). See [ref section] for more details about the `drop` ability. ```move public struct NonDrop(u64) @@ -439,19 +438,19 @@ fun use_nondrop(x: NonDrop): NonDrop { ## Exhaustiveness -The `match` expression in Move must be _exhaustive_; every possible value of the type being matched +The `match` expression in Move must be _exhaustive_: every possible value of the type being matched must be covered by one of the patterns in one of the match's arms. 
If the series of match arms is -not exhaustive, the compiler will raise an error. +not exhaustive, the compiler will raise an error. Note that any arm with a guard expression +does not contribute to match exhaustion, as it may fail to match at runtime. As an example, if we were to match on a `u8` then in order for the match to be exhaustive we would -need to match on every number from 0 to 255 inclusive, or a wildcard or variable pattern would need +need to match on _every_ number from 0 to 255 inclusive, or a wildcard or variable pattern would need to be present. Similarly if we were to match on a `bool` then we would need to match on both `true` and `false`, or a wildcard or variable pattern would need to be present. For structs, since there is only one type of constructor for the type, only one constructor needs to -be matched on, but the fields within the struct need to be matched exhaustively as well. Similarly -for enums, since there are multiple variants that can inhabit the type, each variant needs to be -matched on, and each field type within each variant needs to be matched in order for the match to be +be matched, but the fields within the struct need to be matched exhaustively as well. Conversely, +enums may define multiple variants, and each variant must be matched (including any sub-fields) in order for the match to be considered exhaustive. Since underscores and variables match anything, they count as matching all values of the type they @@ -535,9 +534,9 @@ match_with_guard(1); // returns 2 match_with_guard(0); // returns 3 ``` -Guard expressions have access to the variables bound in the pattern, and can use them in the guard. -However, it is important to note that variables are by immutable reference only in guards regardless -of the pattern being matched -- even if there are mutability specifiers on the variable -- or if the +Guard expressions can reference variables bound in the pattern during evaluation. 
+However, note that _variables are only available as immutable references in guards_ regardless +of the pattern being matched -- even if there are mutability specifiers on the variable or if the pattern is being matched by value. ```move From 20888860dab1a67e5dbfe92f7f0362e7b2d2c8c3 Mon Sep 17 00:00:00 2001 From: Timothy Zakian Date: Thu, 16 May 2024 15:46:34 -0700 Subject: [PATCH 3/4] fixup! Apply suggestions from code review --- .../documentation/book/src/control-flow.md | 2 +- .../book/src/control-flow/pattern-matching.md | 396 ++++++++++-------- 2 files changed, 225 insertions(+), 173 deletions(-) diff --git a/external-crates/move/documentation/book/src/control-flow.md b/external-crates/move/documentation/book/src/control-flow.md index f9f7a4e58e00c..83bfb7d4683d9 100644 --- a/external-crates/move/documentation/book/src/control-flow.md +++ b/external-crates/move/documentation/book/src/control-flow.md @@ -7,6 +7,6 @@ labels for loops and escapable named blocks. It also supports more complex structural pattern matching. - [Conditional Expressions](./control-flow/conditionals.md) +- [Pattern Matching](./control-flow/pattern-matching.md) - [Loops](./control-flow/loops.md) - [Labeled Control FLow](./control-flow/labeled-control-flow.md) -- [Pattern Matching](./control-flow/pattern-matching.md) diff --git a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md index 16c5482067ffa..89eecf18c2d1f 100644 --- a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md +++ b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md @@ -2,15 +2,13 @@ A `match` expression is a powerful control structure that allows you to compare a value against a series of patterns and then execute code based on which pattern matches first. Patterns can be -anything from simple literals to complex, nested struct and enum definitions . 
As opposed to `if` expressions, which change control flow based on a `bool`-typed test expression, a `match` expression operates over -a value of any type and selects on of many arms. +anything from simple literals to complex, nested struct and enum definitions . As opposed to `if` +expressions, which change control flow based on a `bool`-typed test expression, a `match` expression +operates over a value of any type and selects on of many arms. A `match` expression can match Move values, immutable references, or mutable references, binding sub-patterns accordingly. -A pattern is matched by a value if the value is equal to the pattern, and where variables and -wildcards (e.g., `x`, `y`, `_`, or `..`) are "equal" to anything. - For example: ```move @@ -28,7 +26,7 @@ run(3); // returns 3 run(0); // returns 0 ``` -## Syntax +## `match` Syntax A `match` takes an expression and a non-empty series of _match arms_ delimited by commas. @@ -43,47 +41,52 @@ match (expression) { } ``` -Match arms are checked in order from top to bottom, and the first pattern which matches -(with a guard expression, if present, that evaluates to `true`) will be executed. +Match arms are checked in order from top to bottom, and the first pattern which matches (with a +guard expression, if present, that evaluates to `true`) will be executed. Note that the series of match arms within a `match` must be exhaustive, meaning that every possible value of the type being matched must be covered by one of the patterns in the `match`. If the series of match arms is not exhaustive, the compiler will raise an error. -## Patterns +## Pattern Syntax + +A pattern is matched by a value if the value is equal to the pattern, and where variables and +wildcards (e.g., `x`, `y`, `_`, or `..`) are "equal" to anything. Patterns are used to match values. 
Patterns can be -- literals (`true`, `2`, `@0x4`); -- constants (`MyConstant`); -- variables (`a`, `b`, `x`); -- wildcards (`_`); -- constructor patterns (`MyStruct { a, b }`, `MyEnum::Variant(x)`); -- at-patterns ` @ `; and -- or-patterns ` | `. +| Pattern | Description | +| -------------------- | ---------------------------------------------------------------------- | +| Literal | A literal value, e.g., `1`, `true`, `@0x1` | +| Constant | A constant value, e.g., `MyConstant` | +| Variable | A variable, e.g., `x`, `y`, `z` | +| Wildcard | A wildcard, e.g., `_` | +| Constructor | A constructor pattern, e.g., `MyStruct { x, y }`, `MyEnum::Variant(x)` | +| At-pattern | An at-pattern, e.g., `x @ MyEnum::Variant(..)` | +| Or-pattern | An or-pattern, e.g., `MyEnum::Variant(..) \| MyEnum::OtherVariant(..)` | +| Multi-arity wildcard | A multi-arity wildcard, e.g., `MyEnum::Variant(..)` | +| Mutable-binding | A mutable-binding pattern, e.g., `mut x` | -Additionally, depending on the context patterns may also include: +Patterns in Move have the following grammar: -- multi-arity wildcards (`..`); and -- mutable-binding patterns (`mut x`). +```bnf +pattern = <literal> + | <constant> + | <variable> + | _ + | C { <variable> : inner-pattern ["," <variable> : inner-pattern]* } // where C is a struct or enum variant + | C ( inner-pattern ["," inner-pattern]* ... ) // where C is a struct or enum variant + | C // where C is an enum variant + | <variable> @ top-level-pattern + | pattern | pattern + | mut <variable> +inner-pattern = pattern + | .. // multi-arity wildcard +``` Some examples of patterns are: ```move -public enum MyEnum { - Variant(u64, bool), - OtherVariant(bool, u64), -} - -public enum OtherEnum { - V(MyEnum) -} - -public struct MyStruct { - x: u64, - y: u64, -} - // literal pattern 1 @@ -114,7 +117,7 @@ x @ MyEnum::Variant(..) // or-pattern that matches either `MyEnum::Variant` or `MyEnum::OtherVariant` MyEnum::Variant(..) | MyEnum::OtherVariant(..)
-// Same as the above or-pattern, but with explicit wildcards +// same as the above or-pattern, but with explicit wildcards MyEnum::Variant(_, _) | MyEnum::OtherVariant(_, _) // or-pattern that matches either `MyEnum::Variant` or `MyEnum::OtherVariant` and binds the u64 field to `x` @@ -124,26 +127,11 @@ MyEnum::Variant(x, _) | MyEnum::OtherVariant(_, x) OtherEnum::V(MyEnum::Variant(..)) ``` -More concisely we have the following grammar for patterns in Move: +### Patterns and Variables -```bnf -pattern = <literal> - | <constant> - | <variable> - | _ - | C { <variable> : inner-pattern ["," <variable> : inner-pattern]* } // where C is a struct or enum variant - | C ( inner-pattern ["," inner-pattern]* ... ) // where C is a struct or enum variant - | C // where C is an enum variant - | <variable> @ top-level-pattern - | pattern | pattern -inner-pattern = pattern - | .. - | mut <variable> -``` - -Patterns that contain variables bind them to the match subject or subject subcomponent being matched. -These variables can then be -used either in any match guard expressions, or on the right-hand side of the match arm. For example: +Patterns that contain variables bind them to the match subject or subject subcomponent being +matched. These variables can then be used either in any match guard expressions, or on the +right-hand side of the match arm. For example: ```move public struct Wrapper(u64) @@ -159,123 +147,80 @@ add_under_wrapper_unless_equal(Wrapper(2), 3); // returns Wrapper(5) add_under_wrapper_unless_equal(Wrapper(3), 3); // returns Wrapper(3) ``` -Patterns can be nested, and patterns can be combined used the or operator `|` which will succeed if either -pattern matches. The `..` pattern is a special -pattern that matches any number of fields in a struct or enum variant, but it can only occur within -a constructor pattern, similarly the `mut` pattern can only be used within constructor patterns -- -this is used to specify that we want to use the variable mutably on the right-hand-side of the match -arm.
+### Combining Patterns -Patterns are not expressions, but they are nevertheless typed. This means that -the type of a pattern must match the type of the value it matches. For example, the pattern `1` has -type `u64`, the pattern `MyEnum::Variant(1, true)` has type `MyEnum`, and the pattern -`MyStruct { x, y }` has type `MyStruct`. If you try to match on an expression which differs from the -type of the pattern in the match this will result in a type error. For example: +Patterns can be nested, but patterns can also be combined using the or operator `p1 | p2` which will +succeed if either pattern `p1` or `p2` matches the subject. This pattern can occur anywhere -- +either as a top-level pattern or a sub-pattern within another pattern. ```move -match (1) { - // The `true` literal pattern is of type `bool` so this is a type error. - true => 1, - // TYPE ERROR: expected type u64, found bool - _ => 2, +public enum MyEnum has drop { + Variant(u64, bool), + OtherVariant(bool, u64), } -``` -Similarly the following would also result in a type error since `MyEnum` and `MyStruct` are -different types: - -``` -match (MyStruct { x: 0, y: 0 }) { - // TYPE ERROR: expected type MyEnum, found MyStruct - MyEnum::Variant(..) => 1, +fun test_or_pattern(x: MyEnum): u64 { + match (x) { + MyEnum::Variant(1 | 2 | 3, true) | MyEnum::OtherVariant(true, 1 | 2 | 3) => 1, + MyEnum::Variant(8, true) | MyEnum::OtherVariant(_, 6 | 7) => 2, + _ => 3, + } } -``` -Additionally, there are some restrictions on when the `..` pattern, and `mut` pattern modifier can -be used in a pattern. +test_or_pattern(MyEnum::Variant(3, true)); // returns 1 +test_or_pattern(MyEnum::OtherVariant(true, 2)); // returns 1 +test_or_pattern(MyEnum::Variant(8, true)); // returns 2 +test_or_pattern(MyEnum::OtherVariant(false, 7)); // returns 2 +test_or_pattern(MyEnum::OtherVariant(false, 80)); // returns 3 +``` -A `mut` modifier can only occur within a constructor pattern, and cannot be a top-level pattern.
The -value being matched on must be either a mutable reference or by value in order for a `mut` pattern -to be used. +### Restrictions on Some Patterns -```move -public struct MyStruct(u64) +The `mut` and `..` patterns also have specific conditions placed on when, where, and how they can be +used which we go into more detail in +[Limitations on Specific Patterns](#limitations-on-specific-patterns). At a high level, the `mut` +modifier can only be used on variable patterns, and the `..` pattern can only be used once within a +constructor pattern -- and not as a top-level pattern. -fun top_level_mut(x: MyStruct) { - match (x) { - mut MyStruct(y) => 1, - // ERROR: cannot use mut pattern as a top-level pattern - } -} +The following is an _invalid_ usage of the `..` pattern since it is used as a top-level pattern: -fun mut_on_non_mut(x: MyStruct): u64 { - match (x) { - // OK! Since `x` is matched by value - MyStruct(mut y) => { - *y = *y + 1; - *y - }, - } -} - -fun mut_on_mut(x: &mut MyStruct): u64 { - match (x) { - // OK! Since `x` is matched by mutable reference - MyStruct(mut y) => { - *y = *y + 1; - *y - }, - } +```move +match (x) { + .. => 1, + // ERROR: `..` pattern can only be used within a constructor pattern } -let mut x = MyStruct(1); -mut_on_non_mut(&mut x); // returns 2 -x.0; // returns 2 - -fun mut_on_immut(x: &MyStruct): u64 { - match (x) { - MyStruct(mut y) => ..., - // ERROR: cannot use mut pattern on a non-mutable reference - } +match (x) { + MyStruct(.., ..) 
=> 1, + // ERROR: ^^ `..` pattern can only be used once within a constructor pattern } ``` -The `..` pattern an only be used within a constructor pattern and: +### Pattern Typing -- It can only be used **once** within the constructor pattern; -- In positional arguments it can be used at the beginning, middle, or end of the patterns within the - constructor; -- In named arguments it can only be used at the end of the patterns within the constructor; +Patterns are not expressions, but they are nevertheless typed. This means that the type of a pattern +must match the type of the value it matches. For example, the pattern `1` has an integer type, the +pattern `MyEnum::Variant(1, true)` has type `MyEnum`, and the pattern `MyStruct { x, y }` has type +`MyStruct`, and `OtherStruct { x: true, y: 1}` has type `OtherStruct`. If you try to +match on an expression which differs from the type of the pattern in the match this will result in a +type error. For example: ```move -public struct MyStruct(u64, u64, u64, u64) has drop; - -public struct MyStruct2 { - x: u64, - y: u64, - z: u64, - w: u64, +match (1) { + // The `true` literal pattern is of type `bool` so this is a type error. + true => 1, + // TYPE ERROR: expected type u64, found bool + _ => 2, } +``` -fun wild_match(x: MyStruct) { - match (x) { - MyStruct(.., 1) => 1, - // OK! The `..` pattern can be used at the begining of the constructor pattern - MyStruct(1, ..) => 2, - // OK! The `..` pattern can be used at the end of the constructor pattern - MyStruct(1, .., 1) => 3, - // OK! The `..` pattern can be used at the middle of the constructor pattern - MyStruct(1, .., 1, 1) => 4, - MyStruct(..) => 5, - } -} +Similarly the following would also result in a type error since `MyEnum` and `MyStruct` are +different types: -fun wild_match2(x: MyStruct2) { - match (x) { - MyStruct2 { x: 1, .. } => 1, - MyStruct2 { x: 1, w: 2 .. } => 2, - MyStruct2 { .. } => 3, - } +``` +match (MyStruct { x: 0, y: 0 }) { + MyEnum::Variant(..) 
=> 1, + // TYPE ERROR: expected type MyEnum, found MyStruct } ``` @@ -325,13 +270,8 @@ fun test_or_pattern(x: u64): u64 { _ => 3, } } -test_or_pattern(1); // returns 1 -test_or_pattern(2); // returns 1 test_or_pattern(3); // returns 1 -test_or_pattern(4); // returns 2 test_or_pattern(5); // returns 2 -test_or_pattern(6); // returns 2 -test_or_pattern(7); // returns 3 test_or_pattern(70); // returns 3 fun test_or_at_pattern(x: u64): u64 { @@ -341,13 +281,8 @@ fun test_or_at_pattern(x: u64): u64 { z => z + 3, } } -test_or_pattern(1); // returns 2 test_or_pattern(2); // returns 3 -test_or_pattern(3); // returns 4 -test_or_pattern(4); // returns 6 test_or_pattern(5); // returns 7 -test_or_pattern(6); // returns 8 -test_or_pattern(7); // returns 10 test_or_pattern(70); // returns 73 ``` @@ -362,6 +297,8 @@ A variable `x` matches (or "equals") any value, and a wildcard `_` matches any v value!). Or-patterns are like a logical OR, where a value matches the pattern if it matches any of patterns in the or-pattern so `p1 | p2 | p3` should be read "matches p1, or p2, or p3". +### Matching Constructors + The most interesting part of pattern matching are constructor patterns. These patterns allow you inspect and access deep within both structs and enums, and are the most powerful part of pattern matching. Constructor patterns, coupled with variable bindings, allow you to match on values by @@ -390,7 +327,7 @@ it is `MyEnum::OtherVariant` with any value for the first field, and `3` for the `2`, if it is `MyEnum::Variant` with any fields, then return `3`, and if it is `MyEnum::OtherVariant` with any fields, then return `4`". 
-You can also nest patterns, so if I wanted to match either 1, 2, or 10, instead of just matching 1 +You can also nest patterns, so if you wanted to match either 1, 2, or 10, instead of just matching 1 in the `MyEnum::Variant` above, you could do so with an or-pattern: ```move @@ -407,7 +344,14 @@ f(MyEnum::Variant(10, true)); // returns 1 f(MyEnum::Variant(10, false)); // returns 3 ``` -Additionally, match bindings are subject to the same ability restrictions as other aspects of Move. In particular, the compiler will signal an error if you try to match a value (i.e., not-reference) without `drop` using a wildcard, as the wildcard expects to drop the value. Similarly, if you bind a non-`drop` value using a binder, it must be used in the right-hand side of the match arm. In addition, if you fully-destruct that value, you have unpacked it, matching the semantics of [non-`drop` struct unpacking](link). See [ref section] for more details about the `drop` capability. +### Ability Constraints + +Additionally, match bindings are subject to the same ability restrictions as other aspects of Move. +In particular, the compiler will signal an error if you try to match a value (i.e., not-reference) +without `drop` using a wildcard, as the wildcard expects to drop the value. Similarly, if you bind a +non-`drop` value using a binder, it must be used in the right-hand side of the match arm. In +addition, if you fully-destruct that value, you have unpacked it, matching the semantics of +[non-`drop` struct unpacking](link). See [ref section] for more details about the `drop` capability. ```move public struct NonDrop(u64) @@ -440,18 +384,18 @@ fun use_nondrop(x: NonDrop): NonDrop { The `match` expression in Move must be _exhaustive_: every possible value of the type being matched must be covered by one of the patterns in one of the match's arms. If the series of match arms is -not exhaustive, the compiler will raise an error. 
Note that any arm with a guard expression -does not contribute to match exhaustion, as it may fail to match at runtime. +not exhaustive, the compiler will raise an error. Note that any arm with a guard expression does not +contribute to match exhaustion, as it may fail to match at runtime. As an example, if we were to match on a `u8` then in order for the match to be exhaustive we would -need to match on _every_ number from 0 to 255 inclusive, or a wildcard or variable pattern would need -to be present. Similarly if we were to match on a `bool` then we would need to match on both `true` -and `false`, or a wildcard or variable pattern would need to be present. +need to match on _every_ number from 0 to 255 inclusive, or a wildcard or variable pattern would +need to be present. Similarly if we were to match on a `bool` then we would need to match on both +`true` and `false`, or a wildcard or variable pattern would need to be present. For structs, since there is only one type of constructor for the type, only one constructor needs to be matched, but the fields within the struct need to be matched exhaustively as well. Conversely, -enums may define multiple variants, and each variant must be matched (including any sub-fields) in order for the match to be -considered exhaustive. +enums may define multiple variants, and each variant must be matched (including any sub-fields) in +order for the match to be considered exhaustive. Since underscores and variables match anything, they count as matching all values of the type they are matching on in that position. Additionally, the multi-arity wildcard pattern `..` can be used to @@ -524,7 +468,7 @@ checked. ```move fun match_with_guard(x: u64): u64 { match (x) { - 1 if (x == 0) => 1, + 1 if (false) => 1, 1 => 2, _ => 3, } @@ -534,10 +478,10 @@ match_with_guard(1); // returns 2 match_with_guard(0); // returns 3 ``` -Guard expressions can reference variables bound in the pattern during evaluation. 
-However, note that _variables are only available as immutable reference in guards_ regardless -of the pattern being matched -- even if there are mutability specifiers on the variable or if the -pattern is being matched by value. +Guard expressions can reference variables bound in the pattern during evaluation. However, note that +_variables are only available as immutable reference in guards_ regardless of the pattern being +matched -- even if there are mutability specifiers on the variable or if the pattern is being +matched by value. ```move fun incr(x: &mut u64) { @@ -564,3 +508,111 @@ fun match_with_guard_incr2(x: &mut u64): u64 { Additionally, it is important to note any match arms that have guard expressions will not be considered either for exhaustivity purposes since the compiler has no way of evaluating the guard expression statically. + +## Limitations on Specific Patterns + +There are some restrictions on when the `..` pattern, and `mut` pattern modifier can be used in a +pattern. + +### Mutability Usage + +A `mut` modifier can be placed on a variable pattern to specify that the _variable_ is to be mutated +in the right-hand side expression of the match arm. Note that since the `mut` modifier only +signifies that the variable is to be mutated, not the underlying data, this can be used on all types +of match (by value, immutable reference, and mutable reference). + +Note that the `mut` modifier can only be applied to variables, and not other types of patterns. 
+ +```move +public struct MyStruct(u64) + +fun top_level_mut(x: MyStruct) { + match (x) { + mut MyStruct(y) => 1, + // ERROR: cannot use mut on a non-variable pattern + } +} + +fun mut_on_immut(x: &MyStruct): u64 { + match (x) { + MyStruct(mut y) => { + y = &(*y + 1); + *y + } + } +} + +fun mut_on_value(x: MyStruct): u64 { + match (x) { + MyStruct(mut y) => { + y = y + 1; + y + }, + } +} + +fun mut_on_mut(x: &mut MyStruct): u64 { + match (x) { + MyStruct(mut y) => { + *y = *y + 1; + *y + }, + } +} + +let mut x = MyStruct(1); + +mut_on_mut(&mut x); // returns 2 +x.0; // returns 2 + +mut_on_immut(&x); // returns 3 +x.0; // returns 2 + +mut_on_value(x); // returns 3 +``` + +### `..` Usage + +The `..` pattern can only be used within a constructor pattern is a wildcard that matches any number +of fields -- the +the compiler expands the `..` to inserting `_` in any missing fields in the constructor pattern (if +any). So `MyStruct(_, _, _)` is the same as `MyStruct(..)`, `MyStruct(1, _, _)` is the same +`MyStruct(1, ..)`. Because of this there are some restriction how, and where the `..` pattern can be +used: + +- It can only be used **once** within the constructor pattern; +- In positional arguments it can be used at the beginning, middle, or end of the patterns within the + constructor; +- In named arguments it can only be used at the end of the patterns within the constructor; + +```move +public struct MyStruct(u64, u64, u64, u64) has drop; + +public struct MyStruct2 { + x: u64, + y: u64, + z: u64, + w: u64, +} + +fun wild_match(x: MyStruct) { + match (x) { + MyStruct(.., 1) => 1, + // OK! The `..` pattern can be used at the beginning of the constructor pattern + MyStruct(1, ..) => 2, + // OK! The `..` pattern can be used at the end of the constructor pattern + MyStruct(1, .., 1) => 3, + // OK! The `..` pattern can be used at the middle of the constructor pattern + MyStruct(1, .., 1, 1) => 4, + MyStruct(..)
=> 5, + } +} + +fun wild_match2(x: MyStruct2) { + match (x) { + MyStruct2 { x: 1, .. } => 1, + MyStruct2 { x: 1, w: 2, .. } => 2, + MyStruct2 { .. } => 3, + } +} +``` From 92a6579c42aa4ad3e0452ab3219c79c606534c24 Mon Sep 17 00:00:00 2001 From: Tim Zakian <2895723+tzakian@users.noreply.github.com> Date: Tue, 28 May 2024 09:20:52 -0700 Subject: [PATCH 4/4] Apply suggestions from Ronny Co-authored-by: ronny-mysten <118224482+ronny-mysten@users.noreply.github.com> --- .../book/src/control-flow/pattern-matching.md | 86 +++++++++---------- 1 file changed, 43 insertions(+), 43 deletions(-) diff --git a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md index 89eecf18c2d1f..9433f050d842d 100644 --- a/external-crates/move/documentation/book/src/control-flow/pattern-matching.md +++ b/external-crates/move/documentation/book/src/control-flow/pattern-matching.md @@ -2,11 +2,11 @@ A `match` expression is a powerful control structure that allows you to compare a value against a series of patterns and then execute code based on which pattern matches first. Patterns can be -anything from simple literals to complex, nested struct and enum definitions . As opposed to `if` +anything from simple literals to complex, nested struct and enum definitions. As opposed to `if` expressions, which change control flow based on a `bool`-typed test expression, a `match` expression -operates over a value of any type and selects on of many arms. +operates over a value of any type and selects one of many arms. -A `match` expression can match Move values, immutable references, or mutable references, binding +A `match` expression can match Move values as well as mutable or immutable references, binding sub-patterns accordingly. For example: @@ -30,8 +30,8 @@ run(0); // returns 0 A `match` takes an expression and a non-empty series of _match arms_ delimited by commas.
-Each match arm consists of a pattern `p`, an optional guard `if (g)` where `g` is an expression of -type `bool`, an arrow `=>`, and an arm expression `e` to execute when the pattern matches. E.g., +Each match arm consists of a pattern (`p`), an optional guard (`if (g)` where `g` is an expression of +type `bool`), an arrow (`=>`), and an arm expression (`e`) to execute when the pattern matches. For example, ```move match (expression) { @@ -41,7 +41,7 @@ match (expression) { } ``` -Match arms are checked in order from top to bottom, and the first pattern which matches (with a +Match arms are checked in order from top to bottom, and the first pattern that matches (with a guard expression, if present, that evaluates to `true`) will be executed. Note that the series of match arms within a `match` must be exhaustive, meaning that every possible @@ -57,7 +57,7 @@ Patterns are used to match values. Patterns can be | Pattern | Description | | -------------------- | ---------------------------------------------------------------------- | -| Literal | A literal value, e.g., `1`, `true`, `@0x1` | +| Literal | A literal value, such as `1`, `true`, `@0x1` | | Constant | A constant value, e.g., `MyConstant` | | Variable | A variable, e.g., `x`, `y`, `z` | | Wildcard | A wildcard, e.g., `_` | @@ -149,8 +149,8 @@ add_under_wrapper_unless_equal(Wrapper(3), 3); // returns Wrapper(3) ### Combining Patterns -Patterns can be nested, but patterns can also be combined using the or operator `p1 | p2` which will -succeed if either pattern `p1` or `p2` matches the subject. This pattern can occur anywhere -- +Patterns can be nested, but patterns can also be combined using the or operator (`|`). For example, `p1 | p2` +succeeds if either pattern `p1` or `p2` matches the subject. This pattern can occur anywhere -- either as a top-level pattern or a sub-pattern within another pattern. 
```move @@ -177,12 +177,12 @@ test_or_pattern(MyEnum::OtherVariant(false, 80)); // returns 3 ### Restrictions on Some Patterns The `mut` and `..` patterns also have specific conditions placed on when, where, and how they can be -used which we go into more detail in +used, as detailed in [Limitations on Specific Patterns](#limitations-on-specific-patterns). At a high level, the `mut` modifier can only be used on variable patterns, and the `..` pattern can only be used once within a constructor pattern -- and not as a top-level pattern. -The following is an _invalid_ usage of the `..` pattern since it is used as a top-level pattern: +The following is an _invalid_ usage of the `..` pattern because it is used as a top-level pattern: ```move match (x) { @@ -200,9 +200,9 @@ match (x) { Patterns are not expressions, but they are nevertheless typed. This means that the type of a pattern must match the type of the value it matches. For example, the pattern `1` has an integer type, the -pattern `MyEnum::Variant(1, true)` has type `MyEnum`, and the pattern `MyStruct { x, y }` has type +pattern `MyEnum::Variant(1, true)` has type `MyEnum`, the pattern `MyStruct { x, y }` has type `MyStruct`, and `OtherStruct { x: true, y: 1}` has type `OtherStruct`. If you try to -match on an expression which differs from the type of the pattern in the match this will result in a +match on an expression that differs from the type of the pattern in the match, this will result in a type error. For example: ```move @@ -214,7 +214,7 @@ match (1) { } ``` -Similarly the following would also result in a type error since `MyEnum` and `MyStruct` are +Similarly, the following would also result in a type error because `MyEnum` and `MyStruct` are different types: ``` @@ -290,17 +290,17 @@ The most important thing to note from these examples is that a pattern matches a is equal to the pattern, and wildcard/variable patterns match anything. This is true for literals, variables, and constants. 
For example, in the `test_lit` function, the value `1` matches the pattern `1`, the value `2` matches the pattern `2`, and the value `3` matches the wildcard `_`. Similarly, -in the `test_var` function, the value `1` matches the pattern `y` and the value `2` matches the +in the `test_var` function, both the value `1` and the value `2` match the pattern `y`. A variable `x` matches (or "equals") any value, and a wildcard `_` matches any value (but only one -value!). Or-patterns are like a logical OR, where a value matches the pattern if it matches any of +value). Or-patterns are like a logical OR, where a value matches the pattern if it matches any of patterns in the or-pattern so `p1 | p2 | p3` should be read "matches p1, or p2, or p3". ### Matching Constructors -The most interesting part of pattern matching are constructor patterns. These patterns allow you -inspect and access deep within both structs and enums, and are the most powerful part of pattern +Pattern matching includes the concept of constructor patterns. These patterns allow you to +inspect and access deep within both structs and enums, and are one of the most powerful parts of pattern matching. Constructor patterns, coupled with variable bindings, allow you to match on values by their structure, and pull out the parts of the value you care about for usage on the right-hand side of the match arm. @@ -322,13 +322,13 @@ f(MyEnum::OtherVariant(true, 3)); // returns 2 f(MyEnum::OtherVariant(true, 2)); // returns 4 ``` -This is saying that "if `x` is `MyEnum::Variant` with the fields `1` and `true`, then return `1`, if +This is saying that "if `x` is `MyEnum::Variant` with the fields `1` and `true`, then return `1`. If it is `MyEnum::OtherVariant` with any value for the first field, and `3` for the second, then return -`2`, if it is `MyEnum::Variant` with any fields, then return `3`, and if it is +`2`. If it is `MyEnum::Variant` with any fields, then return `3`.
Finally, if it is `MyEnum::OtherVariant` with any fields, then return `4`". -You can also nest patterns, so if you wanted to match either 1, 2, or 10, instead of just matching 1 -in the `MyEnum::Variant` above, you could do so with an or-pattern: +You can also nest patterns. So, if you wanted to match either 1, 2, or 10, instead of just matching 1 +in the previous `MyEnum::Variant`, you could do so with an or-pattern: ```move fun f(x: MyEnum) { @@ -347,10 +347,10 @@ f(MyEnum::Variant(10, false)); // returns 3 ### Ability Constraints Additionally, match bindings are subject to the same ability restrictions as other aspects of Move. -In particular, the compiler will signal an error if you try to match a value (i.e., not-reference) +In particular, the compiler will signal an error if you try to match a value (not-reference) without `drop` using a wildcard, as the wildcard expects to drop the value. Similarly, if you bind a non-`drop` value using a binder, it must be used in the right-hand side of the match arm. In -addition, if you fully-destruct that value, you have unpacked it, matching the semantics of +addition, if you fully destruct that value, you have unpacked it, matching the semantics of [non-`drop` struct unpacking](link). See [ref section] for more details about the `drop` capability. ```move @@ -385,19 +385,19 @@ fun use_nondrop(x: NonDrop): NonDrop { The `match` expression in Move must be _exhaustive_: every possible value of the type being matched must be covered by one of the patterns in one of the match's arms. If the series of match arms is not exhaustive, the compiler will raise an error. Note that any arm with a guard expression does not -contribute to match exhaustion, as it may fail to match at runtime. +contribute to match exhaustion, as it might fail to match at runtime. 
-As an example, if we were to match on a `u8` then in order for the match to be exhaustive we would -need to match on _every_ number from 0 to 255 inclusive, or a wildcard or variable pattern would -need to be present. Similarly if we were to match on a `bool` then we would need to match on both -`true` and `false`, or a wildcard or variable pattern would need to be present. +As an example, a match on a `u8` is exhaustive only if +it matches on _every_ number from 0 to 255 inclusive, unless there is a wildcard or variable pattern present. +Similarly, a match on a `bool` would need to match on both +`true` and `false`, unless there is a wildcard or variable pattern present. -For structs, since there is only one type of constructor for the type, only one constructor needs to +For structs, because there is only one type of constructor for the type, only one constructor needs to be matched, but the fields within the struct need to be matched exhaustively as well. Conversely, -enums may define multiple variants, and each variant must be matched (including any sub-fields) in -order for the match to be considered exhaustive. +enums may define multiple variants, and each variant must be matched (including any sub-fields) +for the match to be considered exhaustive. -Since underscores and variables match anything, they count as matching all values of the type they +Because underscores and variables are wildcards that match anything, they count as matching all values of the type they are matching on in that position. Additionally, the multi-arity wildcard pattern `..` can be used to match on multiple values within a struct or enum variant. @@ -458,11 +458,11 @@ fun match_pair_bool(x: Pair): u8 { ## Guards -As mentioned above you can add a guard to a match arm by adding an `if` clause after the pattern. +As previously mentioned, you can add a guard to a match arm by adding an `if` clause after the pattern. 
This guard will run _after_ the pattern has been matched but _before_ the expression on the -right-hand-side of the arrow is evaluated. If the guard expression evaluates to `true` then the -expression on the right-hand side of the arrow will be evaluated, if it evaluates to `false` then it -will be considered a "failed match" and the next match arm in the `match` expression will be +right-hand side of the arrow is evaluated. If the guard expression evaluates to `true` then the +expression on the right-hand side of the arrow will be evaluated; if it evaluates to `false`, then it +will be considered a failed match and the next match arm in the `match` expression will be checked. ```move @@ -506,18 +506,18 @@ fun match_with_guard_incr2(x: &mut u64): u64 { ``` Additionally, it is important to note any match arms that have guard expressions will not be -considered either for exhaustivity purposes since the compiler has no way of evaluating the guard +considered for exhaustivity purposes because the compiler has no way of evaluating the guard expression statically. ## Limitations on Specific Patterns -There are some restrictions on when the `..` pattern, and `mut` pattern modifier can be used in a +There are some restrictions on when the `..` and `mut` pattern modifiers can be used in a pattern. ### Mutability Usage A `mut` modifier can be placed on a variable pattern to specify that the _variable_ is to be mutated -in the right-hand side expression of the match arm. Note that since the `mut` modifier only +in the right-hand expression of the match arm. Note that since the `mut` modifier only signifies that the variable is to be mutated, not the underlying data, this can be used on all types of match (by value, immutable reference, and mutable reference).
@@ -573,11 +573,11 @@ mut_on_value(x); // returns 3 ### `..` Usage -The `..` pattern can only be used within a constructor pattern is a wildcard that matches any number +The `..` pattern can only be used within a constructor pattern as a wildcard that matches any number of fields -- the the compiler expands the `..` to inserting `_` in any missing fields in the constructor pattern (if -any). So `MyStruct(_, _, _)` is the same as `MyStruct(..)`, `MyStruct(1, _, _)` is the same -`MyStruct(1, ..)`. Because of this there are some restriction how, and where the `..` pattern can be +any). So `MyStruct(_, _, _)` is the same as `MyStruct(..)`, `MyStruct(1, _, _)` is the same as +`MyStruct(1, ..)`. Because of this, there are some restrictions on how, and where the `..` pattern can be used: - It can only be used **once** within the constructor pattern;