Named wildcards (extending valid identifier syntax) #54
Comments
As you say, the leading underscore is only a convention to be used by the "unused variable" linter. One less invasive option would be to make the SML convention be a trailing underscore. But, it seems that rather than expanding the tokens that serve as the wildcard pattern, it might be better to expand the regular expression for identifiers to allow for a leading underscore. |
What is the "unused variable" linter? I've never heard of it. This seems like a mild change (expanding the definition of the wildcard pattern symbol), though it doesn't do much (at least for me). So I would not object. My focus on language changes has shifted to my NewFrontEnd project (repository dmacqueen/NewFrontEnd) where I have proposed some micro-design changes for my own version of Successor ML (MsML, or MacQueen's successor ML). See the files language.txt and dbm-on-rwh1.txt. I could include this change with respect to the wildcard pattern, though my motivation would not be strong.
I don't think that it is any specific tool, just the general expectation that a number of tools (e.g., MLton's "unusedVariable warn" MLB annotation, an SML LSP server) provide a warning on unused variables. |
My preference here is to keep disallowing I fail to see the point of binding it if it is unused, nor the point of declaring it okay to go unused when it is in fact used.
It's important to avoid leading underscore identifiers to have maximal compatibility with systems that use underscore-prefixed identifiers for language extensions. For example, open M
val x = _import "g" : unit -> unit; would be ambiguous, as the interpretation would depend on |
Well, I would just consider |
While it doesn't really help for these existing incompatibilities, perhaps there could be a reserved prefix like Could MLton consider deprecating these existing reserved words in favor of e.g. double underscore prefixed ones, if the definition could guarantee that double underscore was carved out for such purposes? |
Given that the proposal appears to be incompatible with MLton, and given its limited utility (in my opinion), I think that this issue should be dropped (i.e. closed with no action recommended).
This proposal is not incompatible with MLton as it exists currently. |
I don't see how what I proposed is different from what you proposed in the following text:
What I was responding to was Matthew Fluet's idea
Essentially saying that I prefer your original proposal over expanding the valid list of variable names to allow a leading So in this regard, I am confused by the idea that I suggested an incompatibility. Hopefully this clears it up. |
Another reason I like this proposal is the potential for a future destructuring expansion on records:
It would be nice if Edit: Tried to clarify original comment... |
Apologies, I tagged the wrong commenter - @MatthewFluet was the one who suggested
which would cause an incompatibility with MLton, as it would have a keyword that conflicted with valid Standard ML programs. |
FWIW,
is valid SML today, so technically, MLton is already in violation of the standard, and officially allowing leading underscores would only make it reject more valid programs. On the other hand, the above also implies that, technically, this would be a breaking change, because the same program would start to lex differently and become invalid. Although I highly doubt that this breakage matters in practice. Warning about unused identifiers is super useful in my experience (e.g. in OCaml), as it regularly catches subtle mistakes quickly, like accidentally consuming
@rossberg good point about |
Good point, Andreas. This discussion sounds to me like a typical "language lawyer" discussion. My personal preference is not to introduce minor changes in syntax (or semantics) unless there is a fairly obvious and significant payoff for the programmer. The proposal of introducing new wildcard notation doesn't seem to have much of a payoff, and as a programmer I would not be likely to use it (but I may be too set in my ways and may be missing the point). But I am curious about how some "lint" process or tool would make use of this change? If you have a bound but unused variable "foo", it is clearly useful to report this to the programmer, and it seems like a straightforward feature to implement as part of the compiler static analysis (e.g. the elaboration phase in SML/NJ). I don't promise to implement this in SML/NJ, but I will certainly include it in my NewFrontEnd project. But I don't see any connection with the suggested named wildcard notation. Could someone explain it to me? |
@dmacqueen, the way this works in other languages is that unused variables are warned about, unless they start with an underscore. Thereby the programmer makes clear the intention that it's essentially a wildcard, while still writing out a name for documentation purposes. Personally, I have come to find this quite useful for readability, especially when you'd otherwise have lots of non-descriptive wildcards, like with larger tuples. I actually like @ratmice's suggestion to make these proper wildcards, although I also have to admit that I occasionally take advantage of the fact that such variables can still be used, e.g., when quickly inserting some temporary printf for debugging.
@rossberg thanks for the explanation. I think I see the point. So if one binds a variable like _x, it is implied that _x should have no applied occurrences (i.e., occurrences within its scope), and a checker for bound variables without applied occurrences would ignore it. Yet, if _x should actually have an applied occurrence, no harm done. Right?
@dmacqueen That sounds right to me, but I find the "should have no applied occurrences, but should it no harm done" description awkward. The way I would compare the two different proposals in play is this: in other languages that currently implement this, from the perspective of the linter, variable bindings are separated into one of two domains. Those that start with a leading In the Named Wildcards proposal the link to such lints is admittedly less clear, for there is no One question I have which I don't immediately see syntax conflicts with is choosing a different character than leading Edit: perhaps it isn't worth thinking about leading
@dmacqueen, yes, exactly. @ratmice, I think the status of such "named wildcards" would remain different from regular wildcards, since you'd still want to disallow multiple occurrences of the same one within a single pattern. Technically, it seems easier to still treat them as regular variables, but the linter would simply complain about any use site of a variable that starts with an underscore. That is sort of the dual to the binding site check for variables that do not start with an underscore. |
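As a rough illustration of the two dual checks described in this comment, here is a minimal sketch in Standard ML. The `binding` record, the `warning` datatype and the `lint` function are all hypothetical names made up for the example; a real tool would compute use counts from the elaborated program rather than take them as input.

```sml
(* Hypothetical miniature of a linter's view of one bound variable:
   its name and how many times it is used within its scope. *)
type binding = { name : string, uses : int }

datatype warning =
    UnusedVariable of string   (* no leading underscore, never used *)
  | UsedUnderscore of string   (* leading underscore, but used anyway *)

fun startsWithUnderscore (s : string) : bool = String.isPrefix "_" s

(* Binding-site check for ordinary names, use-site check for
   underscore-prefixed names. *)
fun lint ({ name, uses } : binding) : warning option =
  if startsWithUnderscore name then
    (if uses > 0 then SOME (UsedUnderscore name) else NONE)
  else
    (if uses = 0 then SOME (UnusedVariable name) else NONE)

val warnings =
  List.mapPartial lint
    [ { name = "idx",  uses = 0 },
      { name = "_idx", uses = 0 },
      { name = "acc",  uses = 1 },
      { name = "_tmp", uses = 2 } ]
```

In this sketch only `idx` (unused, no underscore) and `_tmp` (underscore, but used) produce warnings. Whether the second kind should be a warning at all is exactly the point under discussion here.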
@rossberg Sure, that isn't an interpretation I likely would have considered, but it seems semantically equivalent, Unless I am misunderstanding something, that would seem to preclude the following extension,
It would seem confusing and misleading, and inconsistent to me if In your example, is |
Yeah I guess to me it doesn't seem inconsistent because |
Giving names to variables is a good way of documenting your code for future readers.
For example, you may have a function
MyObj.fold : (int * int * 'a -> 'a) -> 'a -> MyObj.t -> 'a
that passes the function a triple of (index, value, accumulator).
If you want to document the fact that idx is not used, it is sometimes good practice to not bind the variable name. This way a linter can warn for unused variables.
But now there is no documentation on what that missing parameter actually meant, since we no longer give it a name.
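To make the trade-off concrete, here is a small sketch in Standard ML as it stands today. The `MyObj` structure below is a hypothetical stand-in (a plain `int list` with an index-passing fold), not the real module being described.

```sml
(* Hypothetical stand-in for MyObj: a list of ints folded with its index. *)
structure MyObj =
struct
  type t = int list

  (* fold : (int * int * 'a -> 'a) -> 'a -> t -> 'a
     The callback receives (index, value, accumulator). *)
  fun fold f init (xs : t) =
    let
      fun go (_, acc, []) = acc
        | go (i, acc, x :: rest) = go (i + 1, f (i, x, acc), rest)
    in
      go (0, init, xs)
    end
end

(* Every component is named, which documents the triple, but idx is
   never used, so an unused-variable check would flag it. *)
val sumNamed =
  MyObj.fold (fn (idx, value, acc) => value + acc) 0 [10, 20, 30]

(* A bare wildcard silences the check but no longer says what was dropped. *)
val sumWild =
  MyObj.fold (fn (_, value, acc) => value + acc) 0 [10, 20, 30]
```

Both calls compute the same sum; the only difference is what the reader and the linter can learn from the pattern.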
One option is to add a comment. This is somewhat popular in C++:
But a better option is what Haskell, OCaml, Erlang, Prolog, JavaScript and other languages allow: prefix the identifier with an underscore, and make the "unused variable" linter ignore variables with leading underscores.
This way you maintain the benefits of naming for documentation purposes, and you also improve the readability of your code, because the reader can immediately see that the variable is not used.
Changes
The simplest method to support this change is to change the wildcard syntax to _[a-zA-Z0-9']*, where _foo is considered a named wildcard no different from _.
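A minimal sketch of what matching that wildcard syntax could look like, written as a hand-rolled check in Standard ML rather than a lexer rule. The function names `isTailChar` and `isNamedWildcard` are made up for the example.

```sml
(* Characters allowed after the leading underscore: [a-zA-Z0-9'] *)
fun isTailChar (c : char) : bool =
  (#"a" <= c andalso c <= #"z") orelse
  (#"A" <= c andalso c <= #"Z") orelse
  (#"0" <= c andalso c <= #"9") orelse
  c = #"'"

(* True when the whole token matches _[a-zA-Z0-9']* *)
fun isNamedWildcard (tok : string) : bool =
  case String.explode tok of
      #"_" :: rest => List.all isTailChar rest
    | _ => false
```

Under this reading, `_` and `_foo` both match, while `foo` does not. Note that the character class as written has no `_` in it, so a token with an interior underscore such as `_build_const` would not match in this sketch.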
Potential Incompatibilities
Currently MLton uses this syntax for language extensions: _address, _build_const, _command_line_const, _const, _export, _import, _overload, _prim, _symbol. These uses seem possible to work around, because each of these extensions only makes sense in expression or definition context, and these special tokens can fall back to wildcards in pattern syntax.
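The workaround could be sketched as a context-dependent classification of underscore tokens. This is only an illustration in Standard ML: the `context` and `token` datatypes and the `classify` function are hypothetical, this is not how MLton's lexer is actually structured, and for simplicity any underscore-prefixed token is accepted as a wildcard in pattern position (an assumption that is looser than the character class proposed under Changes).

```sml
datatype context = Pattern | ExprOrDec

datatype token =
    Wildcard
  | Extension of string   (* one of MLton's underscore keywords *)
  | Identifier of string
  | Invalid of string     (* underscore token with no meaning here *)

(* The MLton extension keywords listed in the text. *)
val mltonExtensions =
  ["_address", "_build_const", "_command_line_const", "_const",
   "_export", "_import", "_overload", "_prim", "_symbol"]

fun classify (ctx : context) (tok : string) : token =
  if not (String.isPrefix "_" tok) then Identifier tok
  else
    case ctx of
        (* In a pattern, every underscore token falls back to a wildcard. *)
        Pattern => Wildcard
        (* Elsewhere, only the known extension keywords are meaningful. *)
      | ExprOrDec =>
          if List.exists (fn k => k = tok) mltonExtensions
          then Extension tok
          else Invalid tok
```

With this split, `classify Pattern "_import"` is a wildcard while `classify ExprOrDec "_import"` is still the MLton extension, which is the fallback behaviour the paragraph describes.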