Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange syntax error only with zero '0' converted to int #88

Open
philip-h-dye opened this issue Apr 9, 2023 · 5 comments
Open

Strange syntax error only with zero '0' converted to int #88

philip-h-dye opened this issue Apr 9, 2023 · 5 comments

Comments

@philip-h-dye
Copy link

With this extremely simple grammar "1" succeeds but "0" fails. Can anyone shed light on this issue? Thanks

zero-fails.py

from prettyprinter import pprint as pp

from pegen.grammar_parser import GeneratedParser as GrammarParser
from pegen.utils import generate_parser, parse_string

grammar_spec = """
    start  : a=num

    num    : a=NUMBER { float(a.string) if '.' in a.string else int(a.string) }
"""

numbers = [ 1, 0 ]

for n in numbers :
    grammar = parse_string(grammar_spec, GrammarParser)
    # print() ; pp(grammar)
    grammar_parser_class = generate_parser(grammar)
    tree = parse_string(str(n), grammar_parser_class, verbose=True)
    verdict = 'PASSED' if tree is not None else 'FAILED'
    pp([ verdict, tree ]) ; print()

Output

start() ... (looking at 1.0: NUMBER:'1')
  num() ... (looking at 1.0: NUMBER:'1')
    number() ... (looking at 1.0: NUMBER:'1')
    ... number() -> TokenInfo(type=2 (NUMBER), string='1', start=(1, 0), end=(1, 1), line='1')
  ... num() -> 1
... start() -> 1
['PASSED', 1]

start() ... (looking at 1.0: NUMBER:'0')
  num() ... (looking at 1.0: NUMBER:'0')
    number() ... (looking at 1.0: NUMBER:'0')
    ... number() -> TokenInfo(type=2 (NUMBER), string='0', start=(1, 0), end=(1, 1), line='0')
  ... num() -> 0
... start() -> None
Traceback (most recent call last):
  File "/home/phdyex/src/python/grammar-tool/test-files/pegen/gunit/simple/issue/zero-syntax-error/./zero-fails.py", line 20, in <module>
    tree = parse_string(str(n), grammar_parser_class, verbose=True)
  File "/home/phdyex/.cache/pypoetry/virtualenvs/grammar-tool-srSGyud3-py3.10/lib/python3.10/site-packages/pegen/utils.py", line 66, in parse_string
    return run_parser(file, parser_class, verbose=verbose)  # type: ignore # typeshed issue #3515
  File "/home/phdyex/.cache/pypoetry/virtualenvs/grammar-tool-srSGyud3-py3.10/lib/python3.10/site-packages/pegen/utils.py", line 55, in run_parser
    raise parser.make_syntax_error("invalid syntax")
  File "<unknown>", line 1
    0
    ^
SyntaxError: invalid syntax
@0dminnimda
Copy link
Contributor

I suspect it is the same problem of testing the boolean value of the result and not actually checking if the node was matched.
If you look at the generated code you probably will see something like
if self.num() which will fail if the result is 0 as bool(0) == False
It was not taken care of before, because all ast.X are classes and always evaluate to True

The other issue caused by this behavior is #81

#82 and #84 are trying to fix this issue, but the maintainers are busy right now and unable to review/merge

In the meantime you can use values that don't evaluate to False, probably the best choice is to wrap the number in your class, i.e.

@dataclass
class NumberNode:
    value: int | float

And then create it like NumberNode(...) and use it like number.value

@0dminnimda
Copy link
Contributor

Also, personally I want to ask you to check if #82 fixes this issue and report about it if possible

@0dminnimda
Copy link
Contributor

@pablogsal if you have time, it would be nice of you to take a look at the PRs

@philip-h-dye
Copy link
Author

For the moment, I edited the loop accepting num:
was:

    @memoize
    def _loop1_1(self) -> Optional[Any]:
        # _loop1_1: num
        mark = self._mark()
        children = []
        while (
            (num := self.num())       # <== 0 and None both fail when 0 should succeed
        ):
            children.append(num)
            mark = self._mark()
        self._reset(mark)
        return children;

Fix _loop1_1 by adding is not None :

    @memoize
    def _loop1_1(self) -> Optional[Any]:
        # _loop1_1: num
        mark = self._mark()
        children = []
        while (
            (num := self.num()) is not None  # Now only None will fail and 0 will succeed
        ):
            children.append(num)
            mark = self._mark()
        self._reset(mark)
        return children;

@philip-h-dye
Copy link
Author

Making a global "fix" in src/pegen/python_generator.py sadly results in 94 newly failing tests.

Essentially, my action coercing the strings to integers or floats is the fundamental cause if PEGEN requires that actions always result in a value that is semantically True. My apologies if this is documented somewhere. If not, I would suggest adding it.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants