Provide EMFText to gracefully degrade and correct wrong rule choices #19

lschuetze · 2013-03-27T11:44:09Z

Hello,

EMFText prefers specific tokens over general ones. This leads to failures where EMFText chooses a rule that matches at the beginning of the input but does not match anything beyond that.
Here is an example of the problem:

We have two TOKENs:

DEFINE ITERATOR_NAME $ 'select' | 'reject' | 'forAll' | 'collect' | 'any' | 'exists' | 'one' | 'isUnique' | 'collectNested' | 'sortedBy' | 'closure' $;
DEFINE SIMPLE_NAME $ ('A'..'Z'|'a'..'z'|'_') ('A'..'Z'|'a'..'z'|'0'..'9'|'_')*$;

ITERATOR_NAME is more specific than SIMPLE_NAME and is therefore preferred when applying rules. Let's consider the following snippet of our DSL (which is OCL):

def: forAll = 0

Now forAll is not understood as a SIMPLE_NAME but as an ITERATOR_NAME. This causes the following rule to fail, i.e. the rule is not applicable. But instead of looking for alternatives (e.g. SIMPLE_NAME is still applicable, so EMFText could gracefully degrade to it), EMFText continues in a bad state.

SimpleNameCS ::= simpleName[SIMPLE_NAME];

The fix is therefore to use

SimpleNameCS ::= simpleName[ITERATOR_NAME] | simpleName[SIMPLE_NAME];

as ITERATOR_NAME does not denote a keyword in OCL. The situation gets worse (the first case was at least resolvable) when considering the second example:

def: iterate = 0

This triggers the following rule:

IterateExpCS ::= "iterate" #0 "(" (iteratorVariable #0 ";")? resultVariable "|" bodyExpression #0 ")";

because the keyword "iterate" is more specific than

SimpleNameCS ::= simpleName[ITERATOR_NAME] | simpleName[SIMPLE_NAME];

But again, the rule is not applicable beyond the fact that it begins with "iterate" (i.e. there are no parentheses and so on). EMFText then remains in a bad state without trying to apply other matching rules.
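The workaround from the first example (letting SimpleNameCS accept both token types) can be sketched in a few lines of Python. This is only an illustration and not EMFText or ANTLR code: `parse_simple_name`, `NAME_TOKEN_TYPES` and the `(type, text)` token tuples are hypothetical names, and the token list is what I assume the lexer produces for `def: forAll = 0`.

```python
# Assumed lexer output for "def: forAll = 0" as (token type, text) pairs.
tokens = [
    ("SIMPLE_NAME", "def"),
    ("SYMBOL", ":"),
    ("ITERATOR_NAME", "forAll"),
    ("SYMBOL", "="),
    ("NUMBER", "0"),
]

# Hypothetical set of token types the fixed SimpleNameCS rule accepts.
NAME_TOKEN_TYPES = ("ITERATOR_NAME", "SIMPLE_NAME")


def parse_simple_name(tokens, pos, accepted=NAME_TOKEN_TYPES):
    """Return (name, next_pos) if tokens[pos] is an accepted name token, else None."""
    if pos < len(tokens) and tokens[pos][0] in accepted:
        return tokens[pos][1], pos + 1
    return None


# Original rule: only SIMPLE_NAME is accepted, so forAll is rejected.
print(parse_simple_name(tokens, 2, accepted=("SIMPLE_NAME",)))  # None
# Fixed rule: ITERATOR_NAME is accepted as well.
print(parse_simple_name(tokens, 2))  # ('forAll', 3)
```

Note that this only helps for names that arrive as ITERATOR_NAME tokens. It does nothing for `iterate`, which is a keyword of IterateExpCS and not an ITERATOR_NAME.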


mirkoseifert · 2013-03-30T20:11:24Z

I'm sorry, but I don't think EMFText can easily support this. The reason is that EMFText uses ANTLR as its underlying parser technology. Once ANTLR has split the input stream into tokens according to the token definitions, this assignment of characters from the input to tokens is final. There is no way to interpret characters as a different token later on, even if this might be reasonable from a user's point of view.
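To make the point about final tokenization concrete, here is a minimal Python sketch of a lexer that commits to one token type per lexeme before any syntax rule is looked at. It is not ANTLR or EMFText code. The `tokenize` function and the `NUMBER`, `SYMBOL` and `WHITESPACE` entries are hypothetical, and the policy (longest match, ties broken by definition order) is a simplifying assumption that may differ in detail from what ANTLR actually does.

```python
import re

# Hypothetical token table, more specific definitions first.
# Longer alternatives precede their prefixes (collectNested before collect).
TOKEN_SPECS = [
    ("ITERATOR_NAME",
     r"select|reject|forAll|collectNested|collect|any|exists|one"
     r"|isUnique|sortedBy|closure"),
    ("SIMPLE_NAME", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("SYMBOL", r"[:=]"),
    ("WHITESPACE", r"\s+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPECS]


def tokenize(text):
    """Split text into (type, lexeme) pairs; the type is fixed here for good."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # A strictly longer match wins; on a tie the earlier definition stays.
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if best_name != "WHITESPACE":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("def: forAll = 0")[2])  # ('ITERATOR_NAME', 'forAll')
print(tokenize("forAllItems"))         # [('SIMPLE_NAME', 'forAllItems')]
```

In this sketch the parser only ever sees the `ITERATOR_NAME` label for `forAll`. A rule that expects `SIMPLE_NAME` has no way to ask the lexer for a second opinion.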

Can you explain why you're defining tokens for keywords? I'd expect that one would directly define the keywords in syntax rules or use enumeration attributes. Maybe we can find another solution if we know the exact goal you're aiming for.

lschuetze · 2013-03-31T10:09:49Z

I did not design the language definition, but I am able to refactor it. I'm just writing from my tablet.
The exact goal is (1) to use keywords in definitions such as IterateExpCS; and (2) to reuse those keywords as normal strings in other definitions such as SimpleNameCS.

I will answer again when I am back at my computer.

lschuetze mentioned this issue May 30, 2013

Non-keywords are modeled as keywords dresden-ocl/dresdenocl#26

Open
