From 21aed50fb11fa0ee476db866e1b8522ec2226d77 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Muska=C5=82a?= Date: Mon, 3 Feb 2025 12:01:35 +0000 Subject: [PATCH] EEP 78: Multi-valued comprehensions --- eeps/eep-0078.md | 114 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 eeps/eep-0078.md diff --git a/eeps/eep-0078.md b/eeps/eep-0078.md new file mode 100644 index 0000000..424618a --- /dev/null +++ b/eeps/eep-0078.md @@ -0,0 +1,114 @@ + Author: Michał Muskała + Status: Draft + Type: ... + Created: 16-12-2024 + Erlang-Version: ... + Post-History: ... +**** +EEP 78: Multi-valued comprehensions +---- + +Abstract +======== + +This EEP proposes enhancing the comprehension syntax to allow emitting multiple +elements in a single iteration of the comprehension loop - effectively enhancing +comprehensions to implement `flatmap` with a fixed number of elements. + +Rationale +========= + +Comprehensions in Erlang are a very flexible and convinent way of implementing many +iteration and looping patterns. However, there are some cases that end up unergonomic, +notably, and where the scope of this EEP is, where we'd like to emit multiple elements +from a single iteration. + +Existing forms of expressing this are awkward and introduce extra, unnecessary allocations. +For example: + + lists:append([[X + 1, X + 2] || X <- Xs] + [Tmp || X <- Xs, Tmp <- [X + 1, X + 2]] + +Both of those ways end up creating an extra allocation of a temporary 2-element list, introducing +inefficiency, as well as are, arguably, harder to understand than necessary introducing extra function +calls or variables. + +This EEP proposes enhancing comprehensions with the ability to do so using a very natural +syntax extension of the existing comprehension syntax. In particular for list and map +comprehensions: + + [X + 1, X + 2, ... || X <- Xs] + + #{K + 1 => V + 1, K + 2 => V + 2, ... || K := V <- Map} + +The semantics of map comprehensions where multiple keys in the same iteration would end up +with the same value, should be the same as if the keys were emitted in subsequent iterations. + +Binary comprehensions already support this, and thus there's no enhancement to their syntax +suggested in this EEP, for example: + + << <<(X + 1), (X + 2)>> || <> <= Bin>>. + +Parsing & Abstract Forms +============== + +Today, the [comprehension abstract forms](https://www.erlang.org/doc/apps/erts/absform.html#expressions) are defined as: + +- If `E` is a list comprehension `[E_0 || Q_1, ..., Q_k]`, where each `Q_i` is a + qualifier, then `Rep(E) = {lc,ANNO,Rep(E_0),[Rep(Q_1), ..., Rep(Q_k)]}`. For + `Rep(Q)`, see below. +- If `E` is a map comprehension `#{E_0 || Q_1, ..., Q_k}`, where `E_0` is an + association `K => V` and each `Q_i` is a qualifier, then `Rep(E) = + {mc,ANNO,Rep(E_0),[Rep(Q_1), ..., Rep(Q_k)]}`. For `Rep(E_0)` and `Rep(Q)`, see + below. + +This EEP proposes to change the representation, in a fairly backwards-compatible way +to include the representation of `E_0` directly, if there's just one element emitted, +or a list of elements, if there's more than one. This slightly complicates the implementation +(vs always emitting a list), but retains backwards-compatibility of the AST for code that +exists today. As such, the definition after the changes would read: + +- If `E` is a list comprehension `[E_0, ..., E_k || Q_1, ..., Q_k]`, where each `Q_i` is a + qualifier, then `Rep(E) = {lc,ANNO,Rep(Es),[Rep(Q_1), ..., Rep(Q_k)]}`. `Rep(Es) = Rep(E_0)`, + if there's just one expression or `Rep(Es) = [Rep(E_0), ..., Rep(E_k)]` if there's many. For + `Rep(Q)`, see below. +- If `E` is a map comprehension `#{E_0, ..., E_k || Q_1, ..., Q_k}`, where `E_i` is an + association `K => V` and each `Q_i` is a qualifier, then `Rep(E) = + {mc,ANNO,Rep(Es),[Rep(Q_1), ..., Rep(Q_k)]}`. `Rep(Es) = Rep(E_0)`, + if there's just one expression or `Rep(Es) = [Rep(E_0), ..., Rep(E_k)]` if there's many. + For `Rep(E_0)` and `Rep(Q)`, see below. + +For example the following expressions: + + [X || X <- Xs] + [X, X || X <- Xs] + #{K => V || K := V <- Map} + #{K => V, K => V || K := V <- Map} + +Would have the following representations (where `_` is substituted for corresponding `Anno` values): + + {lc,_,{var,_,'X'},[{generate,_,{var,_,'X'},{var,_,'Xs'}}]} + {lc,_,[{var,_,'X'},{var,_,'X'}],[{generate,_,{var,_,'X'},{var,_,'Xs'}}]} + {mc,_,{map_field_assoc,_,{var,_,'K'},{var,_,'V'}},[ + {m_generate,_,{map_field_exact,_,{var,_,'K'},{var,_,'V'}},{var,_,'Map'}}}} + ]} + {mc,_,[{map_field_assoc,_,{var,_,'K'},{var,_,'V'}},{map_field_assoc,_,{var,_,'K'},{var,_,'V'}}],[ + {m_generate,_,{map_field_exact,_,{var,_,'K'},{var,_,'V'}},{var,_,'Map'}}}} + ]} + +Backwards compatibility +======================== + +For code that does not use this new feature, nothing changes. For code that uses this new feature +parse transforms or any tools using abstract forms, would need to be updated. + +Reference Implementation +======================== + +[PR #9375](https://github.com/erlang/otp/pull/9375) + +Copyright +========= + +This document is placed in the public domain or under the CC0-1.0-Universal +license, whichever is more permissive.