Provide EMFText to gracefully degrade and correct wrong rule choices #19

Open
lschuetze opened this issue Mar 27, 2013 · 2 comments

@lschuetze

Hello,

EMFText prefers specific tokens over general tokens. This leads to failures where EMFText chooses a rule that matches at the beginning of the input but does not match anything beyond that.
Here is an example of the problem:

We have two TOKENs:

DEFINE ITERATOR_NAME $ 'select' | 'reject' | 'forAll' | 'collect' | 'any' | 'exists' | 'one' | 'isUnique' | 'collectNested' | 'sortedBy' | 'closure' $;
DEFINE SIMPLE_NAME $ ('A'..'Z'|'a'..'z'|'_') ('A'..'Z'|'a'..'z'|'0'..'9'|'_')*$;

ITERATOR_NAME is more specific than SIMPLE_NAME and is therefore preferred when applying rules. Let's consider the following snippet of our DSL (which is OCL):

def: forAll = 0

Now forAll is not recognized as a SIMPLE_NAME but as an ITERATOR_NAME. This causes the following rule to fail, i.e. the rule is not applicable. But instead of looking for alternatives (e.g. SIMPLE_NAME is still applicable, so EMFText could gracefully degrade to it), EMFText continues in a bad state.

SimpleNameCS ::= simpleName[SIMPLE_NAME];
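
To make the token choice concrete, here is a minimal Python sketch of a lexer that picks the longest match and, on a tie, the definition listed first. It is only a model of "specific beats general", not EMFText's or ANTLR's actual algorithm. The two name patterns mirror the DEFINE statements above; the INT and SYMBOL definitions and the `tokenize` helper are assumptions added for illustration, and keywords such as `def` are not modeled, so `def` comes out as a plain name here.

```python
import re

ITERATOR_WORDS = ["select", "reject", "forAll", "collect", "any", "exists", "one",
                  "isUnique", "collectNested", "sortedBy", "closure"]

# Definitions in priority order: the more specific token comes first.
TOKEN_DEFS = [
    # Longest words first so the alternation prefers "collectNested" over "collect".
    ("ITERATOR_NAME", re.compile("|".join(sorted(ITERATOR_WORDS, key=len, reverse=True)))),
    ("SIMPLE_NAME", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("INT", re.compile(r"[0-9]+")),    # hypothetical, not from the issue
    ("SYMBOL", re.compile(r"[:=]")),   # hypothetical, not from the issue
]

def tokenize(text):
    """Split text into (token_type, lexeme) pairs without any parser context."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for name, pattern in TOKEN_DEFS:
            match = pattern.match(text, pos)
            # A strictly longer match wins; on a tie the earlier definition is kept.
            if match and (best is None or match.end() > best[1].end()):
                best = (name, match)
        if best is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens

print(tokenize("def: forAll = 0"))
```

In this model `forAll` ties between the two name tokens and is therefore typed ITERATOR_NAME, while a longer identifier such as `forAllItems` is still a SIMPLE_NAME. A rule that only accepts SIMPLE_NAME then has nothing to match.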

The fix is therefore to use

SimpleNameCS ::= simpleName[ITERATOR_NAME] | simpleName[SIMPLE_NAME];

because an ITERATOR_NAME is not a keyword in OCL. The situation gets worse (the first case was at least resolvable) when considering the second example:

def: iterate = 0

This triggers the following rule:

IterateExpCS ::= "iterate" #0 "(" (iteratorVariable #0 ";")? resultVariable "|" bodyExpression #0 ")";

This rule is chosen because "iterate" is more specific than

SimpleNameCS ::= simpleName[ITERATOR_NAME] | simpleName[SIMPLE_NAME];

But again, the rule is not applicable beyond the fact that it begins with "iterate" (i.e. there are no parentheses and so on). EMFText then remains in a bad state without trying to apply other rules that match.
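
The difference between the current behavior and the requested one can be sketched in Python as follows. This is an illustration of the idea only, not EMFText code: the token type names (KEYWORD, SYMBOL, INT), the `NAME_LIKE` set and all function names are hypothetical, and the body of the iterate expression is reduced to "everything up to the closing parenthesis". The sketch works on an already tokenized input and only changes which rule is tried.

```python
class ParseError(Exception):
    pass

# Token types that the fallback name rule is willing to treat as a name (assumption).
NAME_LIKE = {"SIMPLE_NAME", "ITERATOR_NAME", "KEYWORD"}

def expect(tokens, pos, text):
    """Consume one token with the given lexeme or raise ParseError."""
    if pos < len(tokens) and tokens[pos][1] == text:
        return pos + 1
    found = tokens[pos][1] if pos < len(tokens) else "end of input"
    raise ParseError(f"expected {text!r} but found {found!r}")

def parse_iterate_exp(tokens, pos):
    """Heavily simplified IterateExpCS: "iterate" "(" body ")"."""
    pos = expect(tokens, pos, "iterate")
    pos = expect(tokens, pos, "(")
    body = []
    while pos < len(tokens) and tokens[pos][1] != ")":
        body.append(tokens[pos][1])
        pos += 1
    pos = expect(tokens, pos, ")")
    return ("IterateExpCS", body), pos

def parse_name(tokens, pos):
    """Name rule that accepts any name-like token type."""
    if pos < len(tokens) and tokens[pos][0] in NAME_LIKE:
        return ("SimpleNameCS", tokens[pos][1]), pos + 1
    raise ParseError("expected a name")

def parse_committed(tokens, pos):
    """Choose the rule from the first token only and never reconsider."""
    if tokens[pos][1] == "iterate":
        return parse_iterate_exp(tokens, pos)
    return parse_name(tokens, pos)

def parse_with_fallback(tokens, pos):
    """Try each alternative from the same start position; the first success wins."""
    errors = []
    for rule in (parse_iterate_exp, parse_name):
        try:
            return rule(tokens, pos)
        except ParseError as error:
            errors.append(str(error))
    raise ParseError("; ".join(errors))

# Hypothetical token stream for the fragment "iterate = 0".
tokens = [("KEYWORD", "iterate"), ("SYMBOL", "="), ("INT", "0")]
print(parse_with_fallback(tokens, 0))
```

With the stream for `iterate = 0`, `parse_committed` raises a `ParseError` because no opening parenthesis follows, whereas `parse_with_fallback` rewinds and returns a `SimpleNameCS` node for `iterate`.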

@mirkoseifert
Member

I'm sorry, but I don't think EMFText can easily support this. The reason is that EMFText uses ANTLR as its underlying parser technology. Once ANTLR has split the input stream into tokens according to the token definitions, this assignment of characters from the input to tokens is final. There is no way to interpret characters as a different token later on, even if this might be reasonable from a user's point of view.

Can you explain why you're defining tokens for keywords? I'd expect that one would directly define the keywords in syntax rules or use enumeration attributes. Maybe we can find another solution if we know exactly what goal you're aiming for.

@lschuetze
Author

I did not design the language definition, but I am able to refactor it. I am just writing from my tablet.
The exact goal is (1) to use keywords in token definitions, as in the definition of IterateExpCS; and (2) to reuse those keywords as normal strings in other token definitions, as in SimpleNameCS.

I will answer again when I am back at my computer.
