Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The ++ pattern is too greedy? #89

Open
EliasC opened this issue Nov 27, 2023 · 3 comments
Open

The ++ pattern is too greedy? #89

EliasC opened this issue Nov 27, 2023 · 3 comments

Comments

@EliasC
Copy link
Contributor

EliasC commented Nov 27, 2023

I was expecting the pattern Any++ * T(Bar) to match the sequence Foo Foo Foo Bar, just as the regular expression .*b matches aaab, but it seems like the Any++ matches all of Foo Foo Foo Bar, meaning the whole pattern fails to match (since there is no Bar after that sequence). I don't know if this is by design, nor what the consequences would be of making it more like the kleene star, but I thought I would bring it up for discussion.

My use case was for matching a Y inside an X, where the last child of Y is a Z (for reasonable values of X, Y, and Z):

In(X) * T(Y) << (Any++ * T(Z) * End)

If I didn't care about both the X and the Y I could just have T(Z) * End, but I now it's important that the Y is in an X and has the children specified. In this particular case I can do (!T(Z))++ * T(Z), but in general this might not be satisfactory.

@mjp41
Copy link
Member

mjp41 commented Nov 27, 2023

So I would write that as

In(X) * T(Y) << ((!T(Z))++ * T(Z) * End)

The current implementation does not backtrack the | or ++ matching based on the continuation. For some of the optimisations I applied, I actually carry the continuation around, so that could be possible to do.

There is another similar case:

((T(X) * T(Y)) | (T(X)) * T(Z)

This matches XYZ or XZ. But

((T(X) | (T(X) * T(Y))) * T(Z)

only matches XZ.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 27, 2023

Are these optimisations of the Trieste code, or optimisations written in the client code?

@mjp41
Copy link
Member

mjp41 commented Nov 29, 2023

Are these optimisations of the Trieste code, or optimisations written in the client code?

Trieste code optimisations. Before a pattern like

T(A) * T(B) * T(C)

would result in for dynamic dispatch calls before it tries to see if there is a T(A). By CPS converting the pattern it becomes one, and the continuation is a dynamic dispatch as well.

E.g. here we set_continuation

result->set_continuation(rhs.pattern);

This could form the basis for the better backtracking, but would need care given the other optimisations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants