-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is the disjointness requirement for call and exit symbols really necessary? #44
Comments
The basic thing you need is, for each input token, to know whether it will cause the automaton to perform a call transition, a return transition, or an internal transition. It's technically possible, given owl's implementation, to have the same token used to perform different types of transition, as long as you can distinguish the different types of transition deterministically without any lookahead. In this example, it wouldn't be possible to do that -- And in any case, supporting stuff like this makes it much more difficult to explain what a valid grammar looks like. It would be similar to "shift-reduce conflicts" in LALR parser generators, where the error has nothing to do with the grammar itself but with some conflict within the automaton construction process. |
So, if I understood you correctly, what about the following grammar, which is rejected by owl:
An Or more generally, consider |
It would technically be possible, but I'm not interested in doing it in owl -- I don't want to write error messages that involve terms like "firstset". One of my main motivations for writing owl was frustration with LALR-type shift-reduce errors that make me have to understand how the parser generator actually works internally to diagnose. |
I completely agree that LALR conflicts are hard to debug. However, the thing about firstsets is that they are actually relatively easy to understand and explain. One only needs to be able to walk subtrees from left to right to be able to reason about firstsets. Furthermore, most implementations use a worklist-based approach for calculating them, but if one collects constraints and treats the problem of finding firstsets as a dataflow problem, it becomes possible to precisely output why the error exists in the first place. Error messages like the following become possible by walking the dataflow graph in reverse:
I think it would be really cool to see owl become more expressive like that and be able to support identical call and return symbols in cases where that is theoretically possible. |
According to #43 (comment), call and return symbols need to be disjoint.
Therefore, #38 is not possible because string literals viewed as a VPL would have an identical call and return symbol, and would not be a VPL.
However, I'm wondering, could it be that this requirement by Rajeev Alur & P. Madhusudan for VPLs is too restrictive?
I don't have an intuitive understanding of the inner workings of owl, and this might be obvious, but @ianh can you think of a counterexample where having identical call and return symbols would break any of the invariants that owl depends on?
What would happen (where would owl break) if owl ignored this requirement, and a grammar like
[ '[' ... '[' ]
would be accepted and compiled by owl?e.g.?:
The text was updated successfully, but these errors were encountered: