However, when used as part of a MacroVerbatim node like:
macro finished
  {% verbatim do %}
    {%
      10
      # Foo
      20
      30
      # Bar
      40
    %}
    {%
      50
      60
    %}
  {% end %}
end
NOTE: The macro finished isn't really needed to reproduce this problem, but is included for demonstration purposes to make generating the output easier.
This is because the line numbers for each node are now based on the to_s representation of the MacroExpressions within the MacroVerbatim versus the actual source code. E.g.
# There's a newline here
{% 10
20
30
40 %}
{% 50
60 %}
I'm wondering if there could be a rather trivial solution: making verbatim retain the original source code as a string instead of parsing the content into an AST and then stringifying that again.
So there would still need to be some tweaks to ToSVisitor. At which point, I'm not sure if it makes sense to have special parsing logic just to handle verbatim, while still needing to have everything else in ToSVisitor anyway for when these nodes are outside of verbatim.
Discussion
Background
Currently if you have code like:
The parser is able to properly attribute location information to each number, taking into account the newlines and comments:
However, when used as part of a MacroVerbatim node like:
The results are now not totally accurate:
This is because the line numbers for each node are now based on the to_s representation of the MacroExpressions within the MacroVerbatim versus the actual source code. E.g.
The Problem
#15305 makes things a little bit more accurate by retaining the newline after the {% when one is present. E.g.
However, things are still not quite 100% correct. We can make things a bit better by summing the node's location with the location of the MacroVerbatim node itself (removing the 0 || on line 592 in interpreter.cr in the appendix diff):
The number 10 now at least has the proper location, but all the others are still not correct. The other numbers are off because, in the stringified output, all the whitespace and comments in the macro expression have been stripped. Thus they are not taken into consideration when the code is re-parsed as part of the macro expansion logic.
Not having proper location information on the nodes in this context makes #14880 much harder, as it relies on having proper line numbers to map to the source code when generating the coverage report.
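The drift described above can be reproduced outside of macros. The following is only a rough sketch, not the code from the appendix diff: it assumes the compiler's syntax library can be required as `compiler/crystal/syntax`, and the `line_numbers` helper and the sample source string are made up for illustration. It parses a snippet, stringifies the AST, re-parses the result, and prints the line number of each number literal in both cases.

```crystal
require "compiler/crystal/syntax"

# Hypothetical helper: pairs each top-level expression with the line number
# the parser recorded for it. Assumes the code has at least two expressions,
# so that the parser returns a Crystal::Expressions node.
def line_numbers(code : String)
  exps = Crystal::Parser.parse(code).as(Crystal::Expressions).expressions
  exps.map { |exp| {exp.to_s, exp.location.try(&.line_number)} }
end

# Illustrative source with blank lines and a comment between the literals.
source = "10\n\n# Foo\n\n20\n30\n"

# Round trip through to_s, which does not keep comments or blank lines.
stringified = Crystal::Parser.parse(source).to_s

puts "original:    #{line_numbers(source)}"
puts "stringified: #{line_numbers(stringified)}"
```

If this behaves as expected, the first list keeps the gaps caused by the blank lines and the comment, while the second list has shifted line numbers because the re-parsed text no longer contains them.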
Proposed Solution
So far the most robust solution I can think of is to have the parser include MacroLiteral nodes when parsing macro expressions that represent comments and extra newlines. This would ensure that the stringified macro expression includes these newlines so the re-parsed code is able to map to the line numbers of the source.
WDYT?
EDIT: Is this maybe something we could leverage location pragmas for?
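To illustrate why keeping the newlines in the stringified expression would help, here is a small text-level sketch. It does not touch the parser or add MacroLiteral nodes; the `blank_out_comments` helper is a hypothetical stand-in for a newline-preserving `to_s`, and `line_numbers` and the sample source are likewise assumptions made for the example.

```crystal
require "compiler/crystal/syntax"

# Hypothetical helper: pairs each top-level expression with the line number
# the parser recorded for it. Assumes at least two top-level expressions.
def line_numbers(code : String)
  exps = Crystal::Parser.parse(code).as(Crystal::Expressions).expressions
  exps.map { |exp| {exp.to_s, exp.location.try(&.line_number)} }
end

# Hypothetical stand-in for a newline-preserving to_s: comment-only lines are
# replaced by empty lines, so the total number of lines stays the same.
def blank_out_comments(code : String) : String
  code.lines.map { |line| line.lstrip.starts_with?('#') ? "" : line }.join("\n")
end

source = "10\n\n# Foo\n\n20\n30\n"
preserved = blank_out_comments(source)

puts "original:  #{line_numbers(source)}"
puts "preserved: #{line_numbers(preserved)}"
```

Because the comment is dropped but its line is kept, re-parsing the preserved text should give each literal the same line number it had in the original source, which is the property the coverage report needs.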
Appendix
Diff used to generate the outputs