Design note: Commas

Herb Sutter edited this page Mar 18, 2024 · 3 revisions

Q: Why is adding trailing commas allowed in all lists?

A: Because it lets us write more robust and maintainable code, and has no known downsides.

Here's a great article by Raymond Chen: "On the virtues of the trailing comma"

The major advantage of allowing adding an "extra" trailing delimiter is to make code more robust to change. It does that by:

  • enabling reordering entire lines of code for simpler refactoring and maintenance
  • minimizing diffs and merge conflicts for simpler maintenance

For example, given:

data1: vector = (
    111,
    -99,
    42,
);

data2: vector = (
    111,
    -99,
    42
);

  • to change the value order, data1 can always just reorder whole lines; data2 can't
  • to append a new value, data1 can just add a new whole line without changing any existing lines; data2 can't
  • if two commits each append a new value, in data1 the merge conflict is easy to resolve by accepting both edits in full; in data2 it requires hand-editing
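To make the "minimizing diffs" point concrete, here is a rough C++ sketch that counts how many lines differ when a value is appended to each form. The helper touched_lines and the two append_cost functions are hypothetical names for this illustration, and comparing the versions as multisets of lines is only a stand-in for what a real line-based diff tool does.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines that are removed plus the lines that
// are added between two versions, treating each version as a multiset of
// lines. This approximates a line-based diff; it is not a real diff algorithm.
std::size_t touched_lines(std::vector<std::string> before,
                          std::vector<std::string> after) {
    std::sort(before.begin(), before.end());
    std::sort(after.begin(), after.end());
    std::vector<std::string> changed;
    std::set_symmetric_difference(before.begin(), before.end(),
                                  after.begin(), after.end(),
                                  std::back_inserter(changed));
    return changed.size();
}

// Appending 7 to the data1 form: only the new line shows up in the diff.
std::size_t append_cost_with_trailing_comma() {
    return touched_lines({"111,", "-99,", "42,"},
                         {"111,", "-99,", "42,", "7,"});
}

// Appending 7 to the data2 form: the old last line must also gain a comma,
// so it appears as one removed line and one added line, plus the new line.
std::size_t append_cost_without_trailing_comma() {
    return touched_lines({"111,", "-99,", "42"},
                         {"111,", "-99,", "42,", "7"});
}
```

Under this model the data1 append touches 1 line and the data2 append touches 3. The extra edit to the existing last line is what makes two concurrent appends collide in the data2 form.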

Additionally, Cpp2 supports reflection and code generation, and the code generation works by emitting source code. Allowing trailing commas in lists makes that source easier to generate, because a generator can emit every element the same way without a special case for the last one.
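As a minimal sketch of the code generation point, here is a small C++ function that emits a Cpp2-style list as source text. The name emit_list and its signature are assumptions made for this illustration; this is not how cppfront itself generates code.

```cpp
#include <string>
#include <vector>

// Hypothetical generator: emits a Cpp2-style list initializer as source text.
// Every element is written as "value," on its own line, so the loop needs no
// "is this the last element?" branch.
std::string emit_list(const std::string& name, const std::vector<int>& values) {
    std::string out = name + ": vector = (\n";
    for (int v : values) {
        out += "    " + std::to_string(v) + ",\n";
    }
    out += ");\n";
    return out;
}
```

Calling emit_list("data1", {111, -99, 42}) produces the same text as the data1 example above. If trailing commas were not allowed, the loop would need to treat the last element (or the first) differently.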

In languages like today's C++ that only allow the data2 form in a given place, there's pressure to put the separator on the following line to avoid the problem:

mytype::mytype()
    : member1{ value1 }
    , member2{ value2 }
    , member3{ value3 }
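To illustrate "in a given place", here is a short sketch of how today's C++ treats a trailing comma in three kinds of list. The names make_data and color, and the placeholder values 1, 2, 3, are assumptions for illustration only.

```cpp
#include <vector>

// Braced initializer lists: today's C++ accepts a trailing comma here.
inline std::vector<int> make_data() {
    return {
        111,
        -99,
        42,
    };
}

// Enumerator lists: a trailing comma is accepted as well (since C++11).
enum class color {
    red,
    green,
    blue,
};

// Constructor member-initializer lists: no trailing comma is allowed, which
// is where the pressure toward the leading-comma layout comes from.
struct mytype {
    int member1;
    int member2;
    int member3;
    mytype()
        : member1{ 1 }
        , member2{ 2 }
        , member3{ 3 }   // writing "member3{ 3 }," here would not compile
    { }
};
```

The leading-comma layout recovers some of the benefits for every line except the first, which still has to be special-cased with the colon.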

For these reasons, Cpp2 allows adding trailing commas in all lists.

Q: But then why is omitting trailing semicolons not similarly allowed? Isn't it the same thing?

A: No, it's not. Just as the keywords "adding" and "omitting" are opposites, the benefits of the first are the liabilities of the second.

All the above advantages of allowing adding a trailing delimiter are conversely weaknesses of allowing omitting a trailing delimiter. For example (as Raymond's article points out in the last postscript):

func: () = {
    x = 1;
    y = 2;
    z = 3;
}

gunc: () = {
    x = 1;
    y = 2;
    z = 3
}

  • to change the order of lines of code, func can always just reorder whole lines; gunc can't
  • to append a new statement, func can just add a new whole line without changing any existing lines; gunc can't
  • if two commits each append a new statement, in func the merge conflict is easy to resolve by accepting both edits in full; in gunc it requires hand-editing
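The reordering point can be sketched with a toy model in C++. Assume one statement per line and a rule that only the final line may omit its ;. The functions body_is_well_formed and swap_last_two are hypothetical names for this illustration, not part of any real parser.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical check for a toy model of a function body: one statement per
// line, and only the final line may omit its ';'.
bool body_is_well_formed(const std::vector<std::string>& lines) {
    for (std::size_t i = 0; i + 1 < lines.size(); ++i) {
        if (lines[i].empty() || lines[i].back() != ';') {
            return false;   // a non-final statement lost its terminator
        }
    }
    return true;
}

// Swaps the last two lines, as a whole-line reorder in an editor would.
std::vector<std::string> swap_last_two(std::vector<std::string> lines) {
    if (lines.size() >= 2) {
        std::swap(lines[lines.size() - 1], lines[lines.size() - 2]);
    }
    return lines;
}
```

In this model, swapping the last two lines of func leaves a well-formed body. The same swap in gunc moves "z = 3" into the middle with no terminator, so the result needs a hand edit.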

There's also an impact on language evolution in omitting ; on the final statement/expression of a function body. Here's the experience I know of with that feature in other major languages that allow omitting the last ;:

  1. In languages where omitting it is innocuous (usually because the language has function bodies that contain a list of expressions, not statements), the feature isn't generally used. See Raymond Chen's article, linked at the top: he points out that Pascal allows the gunc-like form, but Pascal programmers generally don't use it.

  2. In languages where omitting it is meaningful (changes the meaning of the code), the omission typically turns the final statement into an expression that serves as the function's implicit return-expression.

Worse, (1) closes the door to (2). If we do (1) today (allowing omission as innocuous), any code that actually uses it would break if we ever changed to (2) (giving omission a meaning). So as soon as code that relies on (1) exists, we will never be allowed to do (2).

Two perspectives, subtly different

We can look at "oh look, it's optional to write the final delimiter" from two perspectives, which are (subtly) opposite:

  1. "Always allowed." From this view, the final , or ; can always be added, so we would make it allowed in grammar productions that currently don't allow it. This delivers the advantages in the first example above.

  2. "Always optional." From this view, the final , or ; can always be omitted, so we would make it optional in grammar productions that currently require it. This has the drawbacks of the second example above.

Cpp2 is pursuing the path of doing (1), "always allowed."