We allow deconstructing an instance into its constituent properties/fields in a way paralleling how property patterns can conditionally deconstruct an instance, and how positional deconstruction can deconstruct instances with a suitable Deconstruct method.
We similarly allow deconstructing a collection into its constituent elements in a way paralleling list patterns.
Motivation
It is common to want to extract a number of fields/properties from an instance. Currently this is possible to do declaratively using property patterns, but the fields/properties are only assigned when the pattern matches. This forces you to put your code within an if statement if you want to use pattern matching to declaratively extract a number of properties from an instance. In order to keep this brief I will link to a motivating example from an earlier discussion: #3546.
Additionally there's an aspect of symmetry in the language (see #3107 for more on this theme):
There is currently a parallelism in two dimensions between positional data, nominal data, and collections on one axis, and declaration, construction, deconstruction, and pattern matching on the other.
You can declare types positionally using positional records/primary constructors. You can construct an instance positionally using a constructor, deconstruct it using positional deconstruction, and pattern match it using a positional pattern.
You can declare types nominally through properties/fields. You can construct an instance nominally through an object initializer, and you can pattern match it using property patterns.
You can construct a collection using a collection initializer, and you will likely soon be able to pattern match it using list patterns.
This proposal fills in two of the three missing squares here by introducing nominal and sequence deconstructions.
Detailed design
High level overview.
We have 3 aims which inform this design:
Make the most common cases as easy as possible.
Maintain symmetry with existing constructs (positional deconstructions and patterns).
Don't block ourselves from making enhancements in future language versions.
The most common case is simply wanting to declare a bunch of variables. Here we take a cue from positional deconstruction, which allows you to prefix a deconstruction with var to automatically declare locals for all identifiers within the deconstruction:
var { Start: { Line: startLine, Column: startColumn }, End: { Line: endLine, Column: endColumn }, } = textRange;
This declares 4 variables, startLine, startColumn, endLine, endColumn.
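For comparison, here is a minimal sketch of what the statement above would be roughly equivalent to in C# as it exists today. The anonymous-type value standing in for textRange and its concrete numbers are assumptions made for illustration; the proposal does not define the type.

```csharp
using System;

// Hypothetical stand-in for textRange (not defined in the proposal).
var textRange = new
{
    Start = new { Line = 1, Column = 5 },
    End = new { Line = 3, Column = 12 },
};

// Today each value needs its own declaration and member access.
var startLine = textRange.Start.Line;
var startColumn = textRange.Start.Column;
var endLine = textRange.End.Line;
var endColumn = textRange.End.Column;

Console.WriteLine($"{startLine}:{startColumn} - {endLine}:{endColumn}");
```

The proposed syntax would collapse the four declarations into a single statement whose shape mirrors the shape of the data.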
Positional deconstruction also allows you to specify the type explicitly, and assign to arbitrary lValues, so we allow that by leaving off the var:
Patterns can contain any arbitrary pattern, so we allow nesting any deconstruction in any other, e.g.:
var ({ A: [a, b, c] }, d) = (new { A = new[] { 1, 2, 3 } }, 4);
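As a rough sketch of what this nesting means, the same values can be extracted today by combining an existing tuple deconstruction with manual member access and indexing. The temporary names obj and arr are hypothetical and only there to make the three steps visible.

```csharp
using System;

var source = (new { A = new[] { 1, 2, 3 } }, 4);

// Positional step: existing tuple deconstruction.
var (obj, d) = source;

// Nominal step: read the property A.
var arr = obj.A;

// Sequence step: read the elements by index.
var a = arr[0];
var b = arr[1];
var c = arr[2];

Console.WriteLine($"{a} {b} {c} {d}");
```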
Patterns can assign a pattern to a variable, even if the pattern itself contains other nested patterns, so we allow that:
var { Start: { Line: startLine, Column: startColumn } start, End: { Line: endLine, Column: endColumn } end, } = textRange;
It's useful to be able to assign such a variable to an existing local, like so:
TextPoint start;
{ Start: { ... } start } = textRange;
On the other hand, we want to be able to declare a new local. We can't do so by putting var beforehand, since that makes all nested identifiers declare new locals. We don't want to do so by putting an explicit type beforehand, since that would lead to a confusing difference between var and other types. Instead we say that {} identifier declares a new local if one does not exist, and otherwise assigns to the existing local. This is very different to how C# works so far and may be reconsidered.
We apply all these principles to positional and collection deconstructions as well, so the grammar and spec for the 3 deconstructions are very similar.
Unlike patterns, deconstruction does no null checking or bounds checking, and will throw a NullReferenceException or an IndexOutOfRangeException if these are violated. As ever, the compiler will warn you if you deconstruct a maybe-null reference.
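The difference can be illustrated with constructs that already exist in C#. In this sketch, a list pattern stands for the checked behaviour, and plain indexing stands for what a sequence deconstruction would do under this proposal. The array tooShort is a made-up input.

```csharp
using System;

int[] tooShort = { 1 };

// A list pattern checks the length first, so a mismatch simply yields false.
bool matched = tooShort is [_, _];
Console.WriteLine(matched); // False

// Plain indexing performs no such check, so the second access throws.
bool threw = false;
try
{
    var x = tooShort[0];
    var y = tooShort[1]; // index 1 is out of range
    Console.WriteLine(x + y);
}
catch (IndexOutOfRangeException)
{
    threw = true;
}
Console.WriteLine(threw); // True
```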
Examples:
// Short-hand deconstruction
var (x, y) = e;
var [x, y] = e;
var { A: a } = e;
// Recursive deconstruction
(var x, var y) = e;
[var x, var y] = e;
{ A: var a } = e;
// Bind to an existing l-value
(x, y) = e;
[x, y] = e;
{ A: a } = e;
Detailed Spec
variable_designation
A var_variable_designation is lowered recursively as follows:
Every var_variable_designation has a unique target t, which is a temporary variable of type T inferred from the expression that is assigned to t.
If the var_variable_designation is the top level var_variable_designation in a declaration_statement we assign expression to t.
If the var_variable_designation is the top level var_variable_designation in a foreach_statement we assign enumerator.Current to t.
Else t is defined recursively below.
If a var_variable_designation defines an identifier i, we declare a local of type T? named i, with the same scope as the declaration_statement/foreach_statement, and assign t to i.
Assuming the var_variable_designation has n child variable_designations v0 to vn - 1, we produce a set of child temps t0 to tn - 1 as follows.
If the var_variable_designation is a parenthesized_variable_designation we look for a suitable deconstructor on T to deconstruct t into t0 to tn - 1. See the spec for more details.
If the var_variable_designation is a nominal_variable_designation, for each named_variable_designation with identifier ix, t must have an accessible property or field ix, and we assign t.ix to tx (this should match the spec for property patterns).
If the var_variable_designation is a sequence_variable_designation, t must have an indexer accepting a single parameter of type int, and we assign t[x] to tx (this should match and keep up to date with spec for collection patterns, e.g. we may allow use of GetEnumerator here).
For each child variable_designation vx:
If vx is a var_variable_designation we lower vx as specified here, using tx as t for vx.
If vx is a single_variable_designation with identifier ix we declare a local of type Tx? named ix, with the same scope as the declaration_statement/foreach_statement, and assign tx to ix.
If vx is a discard_designation we do nothing.
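The lowering rules above can be sketched by hand for one nominal example. This is an illustrative reading of the rules, not compiler output: the temp names (t, t0, t00, ...) follow the spec's notation, the textRange value is a hypothetical stand-in, and the use of a discard inside a nominal designation is an assumption about the grammar.

```csharp
using System;

// Hypothetical stand-in for textRange (not defined in the proposal).
var textRange = new
{
    Start = new { Line = 1, Column = 5 },
    End = new { Line = 3, Column = 12 },
};

// Proposed form (does not compile today):
//   var { Start: { Line: startLine, Column: _ } start, End: end } = textRange;

var t = textRange;     // the top-level target t is assigned the expression
var t0 = t.Start;      // nominal child temp for Start
var t1 = t.End;        // nominal child temp for End

var start = t0;        // the nested designation defines the identifier "start"
var t00 = t0.Line;     // nominal child temp for Start.Line
var t01 = t0.Column;   // child temp is still produced; the discard then does nothing
var startLine = t00;   // single_variable_designation declares and assigns a local
var end = t1;          // single_variable_designation declares and assigns a local

Console.WriteLine($"{start.Line}:{start.Column} {startLine} {end.Line}:{end.Column}");
```

The spec declares each local with type T?; var is used here because it infers a nullable-annotated type for reference types in a nullable context.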
deconstruction
A deconstruction is lowered recursively as follows:
Every deconstruction has a unique target t, which is a temporary variable of type T inferred from the expression that is assigned to t.
If the deconstruction is the top level deconstruction in a declaration_statement we assign expression to t.
Else t is defined recursively below.
If a deconstruction defines an identifier i:
If there is a local in scope with name i we assign t to i.
Else we declare a local of type T? named i, with the same scope as the declaration_statement, and assign t to i.
Assuming the deconstruction has n child declaration_target_or_expressions d0 to dn - 1:
If this is a top level deconstruction:
For each declaration_target_or_expression dx:
If dx is an expression, it must be a valid lValue as defined by the spec, and we evaluate as much of dx as is evaluated before the RHS of an assignment operator as defined by the spec. The result of this evaluation is stored in a temp dtx.
If dx is a deconstruction we perform this step recursively to evaluate as much of its child expressions as is necessary.
We produce a set of child temps t0 to tn - 1 as follows.
If the deconstruction is a positional_deconstruction we look for a suitable deconstructor on T to deconstruct t into t0 to tn - 1. See the spec for more details.
If the deconstruction is a nominal_deconstruction, for each nominal_deconstruction_element with identifier ix, t must have an accessible property or field ix, and we assign t.ix to tx (this should match the spec for property patterns).
If the deconstruction is a sequence_deconstruction, t must have an indexer accepting a single parameter of type int, and we assign t[x] to tx (this should match and keep up to date with spec for collection patterns, e.g. we may allow use of GetEnumerator here).
For each child declaration_target_or_expression dx:
If dx is an expression we assign tx to dtx as specified by the spec on simple assignment. The assignment must be valid according to the rules specified there.
If dx is a declaration
If the declaration is a var_variable_designation we lower vx as specified above, using tx as t for vx.
If the declaration is a single_variable_designation with identifier ix and type Tx we declare a local of type Tx and name ix and the same scope as the scope of the declaration_statement, and assign tx to ix. If type is var, Tx is inferred from tx.
If the declaration is a discard_designation we do nothing.
If dx is a deconstruction we lower dx as specified here, using tx as t for dx.
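A hand-written sketch of this lowering for a deconstruction that mixes an existing l-value with a new declaration is shown below. The ordering (l-value operands first, then the right-hand side, then the assignments) is an assumption modelled on how existing tuple deconstruction assignment behaves. The helper NextIndex, the slots array and the source value e are hypothetical.

```csharp
using System;

var slots = new int[2];
var calls = 0;
int NextIndex() => calls++;          // hypothetical helper with a visible side effect

var e = new { A = 10, B = "ten" };   // hypothetical source value

// Proposed form (does not compile today):
//   { A: slots[NextIndex()], B: var b } = e;

// 1. Evaluate the l-value operands of expression targets, left to right.
var dt0Array = slots;
var dt0Index = NextIndex();

// 2. Assign the right-hand side to the target temp t.
var t = e;

// 3. Produce the child temps from the named members.
var t0 = t.A;
var t1 = t.B;

// 4. Assign to the stored l-value and declare the new local.
dt0Array[dt0Index] = t0;
var b = t1;

Console.WriteLine($"{slots[0]} {b} {calls}");
```

Capturing the array and index in step 1 means the side effect of NextIndex happens exactly once and before any member of e is read.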
Drawbacks
This is a significant set of enhancements to deconstruction. Deconstruction is far less common than pattern matching, so it may be that the benefit from this set of enhancements is not considered sufficient to pay for itself.
Parsing ambiguities
In order to distinguish between a nominal_deconstruction and a block, we need to parse till we reach a , a ; or the closing brace (at which point we can check if it's followed by a = or not). This lookahead may be expensive. However much of the parsed syntax tree can be reused between the two cases.
In order to distinguish between a sequence_deconstruction and an attribute on a local function we need to parse till we reach the closing ] and check to see if it's followed by a = or not. This may also be expensive, although I imagine the most expensive cases will quickly run into something that will disambiguate them, such as expressions that are disallowed in attributes.
If expression blocks are added in the future, this may lead to genuine ambiguities even at a semantic level. E.g. { P : (condition ? ref a : ref b) } = e; could be a nominal deconstruction, or an assignment to an expression block containing the label P. It shouldn't be too difficult to work around this (e.g. disallow labels for the final expression of an expression block).
Alternatives
There are a number of simplifications to this spec we could consider:
Only allow the var form of the deconstructions, as it is the most common.
Don't allow mixing the different forms of deconstruction.
Don't allow declaring a local as well as a deconstruction.
etc.
There are also many axes along which the exact grammar/semantics could be adjusted. I hope I made clear in my high level overview why I made the decisions I did, but I will not be surprised if others come to different conclusions.
Unresolved questions
How do we modify the spec I've given above to allow target typing of literals in the case of tuple deconstruction?
Nominal and Collection Deconstruction
Discussion: #8707
Design meetings
https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-11-16.md#nominal-and-collection-deconstruction