Consider parsing interpolated expressions in string literals: https://langdev.stackexchange.com/questions/243.
Ideally an interpolated expression should itself be allowed to contain strings with interpolations. E.g. this works in Dart:

"a ${"b ${c}"} d"
One way to parse this would be to generate a "start interpolation" event when lexing string literals. For the string above, this would generate:
StringStart
InterpolationStart
After the second event, we want to lex not string contents but arbitrary tokens (i.e. go back to the initial state), which is easy to do.
However, after producing the `}` that terminates the interpolation, the lexer doesn't know that it's tokenizing an interpolation and so can't revert back to the "tokenize string" state.

The parser knows that the `}` terminates the interpolation, but currently we don't have a way to update the lexer state outside of a lexer semantic action function, so the parser cannot tell the lexer to go back to the top state or the string state.

We should add a public method to the generated lexers for setting the lexer state at the call site, to allow this kind of thing.
In an LR(1)/LALR(1) parser, this method would be called in the semantic action that produces an interpolated expression, as `lexer.switch_(LexerState::String)`.

Why not keep track of the nesting level in a lexer state?
This requires the lexer to know too much about the structure of the parsed format, as it would need to keep track of all nestings of parens, brackets, etc.
For example:

"asdf ${ f( { } ) } asdf"

Here the first `}` does not terminate the interpolation. The lexer would need to know about parens (and other delimiters) so that it won't go back to string lexing after that `}`.

The parser already maintains the full structure, so it's not extra work for the parser to update the lexer state when an interpolation is finished.
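The parser-driven approach can be sketched with a small hand-written lexer. This is only an illustration, not the generated lexers' real API: the `Lexer` struct, the token variants, and the public `switch_` method are all hypothetical names, mirroring the `lexer.switch_(LexerState::String)` call proposed above.

```rust
// Hypothetical sketch of a mode-based lexer with a public state-switch
// method. In the real design, the parser (not the driver code below)
// would call `switch_` from the semantic action that reduces an
// interpolated expression.

#[derive(Clone, Copy, PartialEq, Debug)]
enum LexerState {
    Init,   // lexing ordinary tokens
    String, // lexing string contents
}

#[derive(Debug, PartialEq)]
enum Token {
    StringStart,        // opening `"`
    StringEnd,          // closing `"`
    StrChunk(String),   // literal text inside the string
    InterpolationStart, // `${`
    Ident(String),
    RBrace,             // `}` — the lexer alone can't tell if this ends an interpolation
}

struct Lexer {
    chars: Vec<char>,
    pos: usize,
    state: LexerState,
}

impl Lexer {
    fn new(input: &str) -> Self {
        Lexer { chars: input.chars().collect(), pos: 0, state: LexerState::Init }
    }

    // The proposed public method: lets the *caller* (the parser) reset lexer state.
    fn switch_(&mut self, state: LexerState) {
        self.state = state;
    }

    fn next_token(&mut self) -> Option<Token> {
        match self.state {
            LexerState::Init => match *self.chars.get(self.pos)? {
                '"' => { self.pos += 1; self.state = LexerState::String; Some(Token::StringStart) }
                '}' => { self.pos += 1; Some(Token::RBrace) }
                c if c.is_alphabetic() => {
                    let start = self.pos;
                    while self.pos < self.chars.len() && self.chars[self.pos].is_alphabetic() {
                        self.pos += 1;
                    }
                    Some(Token::Ident(self.chars[start..self.pos].iter().collect()))
                }
                _ => { self.pos += 1; self.next_token() } // skip whitespace etc.
            },
            LexerState::String => match *self.chars.get(self.pos)? {
                '"' => { self.pos += 1; self.state = LexerState::Init; Some(Token::StringEnd) }
                '$' if self.chars.get(self.pos + 1) == Some(&'{') => {
                    self.pos += 2;
                    // Switching *into* the interpolation is easy: the lexer
                    // can do it itself in a semantic action.
                    self.state = LexerState::Init;
                    Some(Token::InterpolationStart)
                }
                _ => {
                    let start = self.pos;
                    while self.pos < self.chars.len()
                        && self.chars[self.pos] != '"'
                        && !(self.chars[self.pos] == '$' && self.chars.get(self.pos + 1) == Some(&'{'))
                    {
                        self.pos += 1;
                    }
                    Some(Token::StrChunk(self.chars[start..self.pos].iter().collect()))
                }
            },
        }
    }
}

fn main() {
    let mut lexer = Lexer::new("\"a ${ x } b\"");
    assert_eq!(lexer.next_token(), Some(Token::StringStart));
    assert_eq!(lexer.next_token(), Some(Token::StrChunk("a ".into())));
    assert_eq!(lexer.next_token(), Some(Token::InterpolationStart));
    assert_eq!(lexer.next_token(), Some(Token::Ident("x".into())));
    assert_eq!(lexer.next_token(), Some(Token::RBrace));
    // Only the parser knows this `}` ended the interpolation, so it
    // tells the lexer to resume string lexing:
    lexer.switch_(LexerState::String);
    assert_eq!(lexer.next_token(), Some(Token::StrChunk(" b".into())));
    assert_eq!(lexer.next_token(), Some(Token::StringEnd));
}
```

Note the asymmetry: the lexer switches out of string mode by itself on `${`, but switching back after the matching `}` only happens via the `switch_` call from outside.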
A problem with this approach to tokenizing interpolations is that the lexer can no longer tokenize full files by itself. One may accept this as the price of the syntax, or make the lexer keep track of the delimiters and nesting.
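For comparison, the delimiter-tracking alternative can be sketched as a depth counter over the text following `${`. This is a hypothetical helper, assuming only `()`, `[]`, and `{}` need balancing:

```rust
// Sketch of the rejected alternative: the lexer itself tracks delimiter
// nesting so it can tell which `}` closes the interpolation.

/// Given the text following `${`, return the byte index of the `}` that
/// closes the interpolation, skipping nested (), [], {} pairs.
/// Returns None on unbalanced input.
fn interpolation_end(input: &str) -> Option<usize> {
    let mut depth = 0usize;
    for (i, c) in input.char_indices() {
        match c {
            '(' | '[' | '{' => depth += 1,
            ')' | ']' => depth = depth.checked_sub(1)?,
            '}' => {
                if depth == 0 {
                    return Some(i);
                }
                depth -= 1;
            }
            _ => {}
        }
    }
    None
}

fn main() {
    // In "asdf ${ f( { } ) } asdf", the first `}` sits inside nested
    // braces and must not end the interpolation.
    let inner = " f( { } ) } asdf\"";
    assert_eq!(interpolation_end(inner), Some(10));
    assert_eq!(&inner[..10], " f( { } ) ");
}
```

Even this sketch is incomplete: it doesn't skip string literals (or comments) inside the interpolation, so a `}` inside a nested string such as `"a ${ "}" } b"` would confuse it. That is exactly the structural knowledge the parser already maintains for free.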