Parser is not stack-safe for deep records #2
Comments
@travisbrown: The Haskell implementation had the exact same issue. See this thread starting here: dhall-lang/dhall-haskell#108 (comment). That issue has an even smaller reproduction which is just

Basically, your intuition is right that the grammar needs to be left-factored. Specifically, the problematic grammar rule is the one for function types:

To see the issue, let's imagine that we're parsing the following simple expression: Natural

Until the parse hits the end of the file, the parser isn't sure whether or not it is in the middle of parsing a function's input type. For example, the parser doesn't know that the

Now, imagine that you parse the following expression: (Natural)

... and now suppose that we temporarily halt the parser after it's done parsing up to here:

(Natural…
^ At this point in the expression the parser has to entertain 4 possible branches:
More generally, you essentially double the number of possible parses every time you nest things. This leads to a slowdown that is exponential in the nesting level.

The solution is to left-factor the top-level parser so that the parser branches after parsing an expression that might be a function's input type instead of before. I don't know exactly what the Java analog of this is, but if you're familiar with Haskell it would mean changing a parser like this:

    (parseOperatorExpression *> arrow *> parseExpression) <|> parseOperatorExpression

... to instead be left-factored like this:

    parseOperatorExpression *> optional (arrow *> parseExpression)

I've oversimplified things, but that is the basic idea.
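For a rough Java analog of the two parser shapes above, here is a minimal hand-written sketch. It is not the JavaCC grammar from this project and not the Haskell implementation; the class name, the toy grammar (`N` standing in for an operator expression, `->` for the arrow), and the call counter are all assumptions made up for illustration. It only shows why branching before the shared prefix doubles the work per nesting level, while branching after it does not.

```java
// Hypothetical toy parser, for illustration only.
// Grammar (assumed): atom ::= "N" | "(" expr ")"
final class LeftFactoringSketch {
    private final String input;
    private int pos = 0;
    private long atomCalls = 0; // how many times atom() was entered

    LeftFactoringSketch(String input) {
        this.input = input;
    }

    // Consume `token` at the current position if it is there.
    private boolean eat(String token) {
        if (input.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    private boolean atom(boolean factored) {
        atomCalls++;
        int start = pos;
        if (eat("N")) return true;
        if (eat("(") && expr(factored) && eat(")")) return true;
        pos = start; // backtrack on failure
        return false;
    }

    private boolean expr(boolean factored) {
        return factored ? exprFactored() : exprNaive();
    }

    // Not left-factored: expr ::= atom "->" expr | atom
    // When no arrow follows, the atom is thrown away and parsed again.
    private boolean exprNaive() {
        int start = pos;
        if (atom(false) && eat("->") && expr(false)) return true;
        pos = start;
        return atom(false);
    }

    // Left-factored: expr ::= atom ( "->" expr )?
    // The atom is parsed once; the branch happens after it.
    private boolean exprFactored() {
        if (!atom(true)) return false;
        int afterAtom = pos;
        if (eat("->") && exprFactored()) return true;
        pos = afterAtom; // no arrow (or no right-hand side): keep just the atom
        return true;
    }

    // Parse the whole input and report how many atom() calls it took.
    static long countAtomCalls(String input, boolean factored) {
        LeftFactoringSketch p = new LeftFactoringSketch(input);
        boolean ok = p.expr(factored) && p.pos == input.length();
        if (!ok) throw new IllegalArgumentException("parse error: " + input);
        return p.atomCalls;
    }

    // Build "N" wrapped in `depth` pairs of parentheses.
    static String nested(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append('(');
        sb.append('N');
        for (int i = 0; i < depth; i++) sb.append(')');
        return sb.toString();
    }
}
```

With this sketch, `countAtomCalls(nested(10), false)` enters `atom` 4094 times, while `countAtomCalls(nested(10), true)` enters it 11 times. Note that the left-factored version is still plain recursive descent, so it removes the exponential re-parsing but does not by itself make the parser safe against running out of stack on very deep nesting.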
Note that the recent precedence change of |
All operations currently work just fine on arbitrarily long lists:
…and most things work just fine on arbitrarily deep record literals (or record types, or union types):
Note that dhall produces the same hash for this expression:

Unfortunately the parser can't handle this expression:
I think this should just be a matter of doing some more left-factoring, but I'm fairly new to JavaCC and I don't really know how much work this will be, so I've decided not to let this issue block the 0.1.0 release.