Any plans to make a more generic version? #67

Open
StyXman opened this issue Jun 9, 2022 · 5 comments

@StyXman

StyXman commented Jun 9, 2022

I'm trying to generate a parser for a language that includes attribute names that contain -s and hex colors that begin with #.

The first could be solved by reconstructing the name from NAME ('-' NAME)*, but that would also accept a - b as a name, which it is not.
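To illustrate the problem described here, the following minimal sketch uses only Python's standard tokenize module (no pegen API is called). It shows that font-size and font - size yield the same token types and strings, so a rule like NAME ('-' NAME)* cannot tell them apart. The helper names significant_tokens, shape, and is_contiguous are hypothetical and exist only for this example; comparing token positions is one possible workaround, not something the thread proposes.

```python
import io
import tokenize


def significant_tokens(source):
    """Tokenize with Python's tokenizer, dropping line-ending bookkeeping tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    readline = io.StringIO(source).readline
    return [t for t in tokenize.generate_tokens(readline) if t.type not in skip]


def shape(tokens):
    """What a grammar rule sees: token type names and strings, no positions."""
    return [(tokenize.tok_name[t.type], t.string) for t in tokens]


def is_contiguous(tokens):
    """True when each token starts exactly where the previous one ended."""
    return all(a.end == b.start for a, b in zip(tokens, tokens[1:]))


tight = significant_tokens("font-size\n")
loose = significant_tokens("font - size\n")

# Both inputs look identical at the token level.
print(shape(tight) == shape(loose))  # True

# Only the positions reveal the whitespace around '-'.
print(is_contiguous(tight))  # True
print(is_contiguous(loose))  # False
```

A check like is_contiguous would have to run in a grammar action or a post-processing step, which is extra work that a tokenizer aware of hyphenated names would avoid.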

The second forces us to use a second parser to separate the hex portion from, for instance, any trailing comment (#abcdef // foo).
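The reason a second pass is needed is that Python's tokenizer treats everything from # to the end of the line as a single COMMENT token. The sketch below demonstrates this with the standard tokenize module and then splits the hex portion off with a regular expression. The input line and the pattern HEX_COLOR are assumptions made for illustration; the thread does not say how the second parser was actually written.

```python
import io
import re
import tokenize

source = "color: #abcdef // foo\n"

# Python's tokenizer swallows the color and the trailing comment together.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
comment = next(t for t in tokens if t.type == tokenize.COMMENT)
print(repr(comment.string))

# Hypothetical second pass: pull 3 to 8 hex digits off the front of the comment.
HEX_COLOR = re.compile(r"#([0-9a-fA-F]{3,8})\b")

match = HEX_COLOR.match(comment.string)
hex_digits = match.group(1) if match else None
rest = comment.string[match.end():].strip() if match else comment.string
print(hex_digits, "|", rest)
```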

@MatthieuDartiailh
Collaborator

#65 would allow the use of an alternate tokenizer and may solve some of your issues. It is currently pending review.

@StyXman
Author

StyXman commented Jun 9, 2022

It's way over my pay grade, so to speak. I'll wait for it, then.

@jpsnyder

Maybe I'm missing something, but can pegen be used for non-Python code? I was looking into using it for my own project. I already have a working lexer, but I would like to swap out the LALR parser for a PEG one, so this seemed promising.
Could any lexer be used provided the iterated tokens have the expected attributes?

I was looking for an example of pegen being used on something that isn't Python, just to see a proof of concept, but I can't seem to find anything. Everything uses import tokenize, which is for Python tokens.
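As a rough illustration of what "tokens with the expected attributes" could look like, here is a minimal sketch of a regex-based lexer that emits tokenize.TokenInfo objects, the same shape Python's own tokenizer produces (type, string, start, end, line). It is an assumption that a parser would accept these as-is; the thread only says that a generic lexer interface is pending in #65, and no pegen API is used here. The lex function, the TOKEN_RE pattern, and the HEXCOLOR type number are all hypothetical.

```python
import re
import tokenize
from token import ENDMARKER, NAME, OP

# Hypothetical custom token type, chosen to stay clear of Python's own numbers.
HEXCOLOR = 1000

TYPES = {"NAME": NAME, "OP": OP, "HEXCOLOR": HEXCOLOR}

TOKEN_RE = re.compile(
    r"""
    (?P<COMMENT>//[^\n]*)                                   # '//' line comment
  | (?P<HEXCOLOR>\#[0-9a-fA-F]{3,8}\b)                      # '#abcdef'
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_]*(?:-[A-Za-z0-9_]+)*)     # 'font-size'
  | (?P<OP>[:;{}])                                          # punctuation
  | (?P<WS>[ \t]+)                                          # skipped
    """,
    re.VERBOSE,
)


def lex(line, lineno=1):
    """Yield TokenInfo tuples for one line; whitespace and comments are dropped."""
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        kind = m.lastgroup
        if kind not in ("WS", "COMMENT"):
            yield tokenize.TokenInfo(
                TYPES[kind], m.group(), (lineno, m.start()), (lineno, m.end()), line
            )
        pos = m.end()
    yield tokenize.TokenInfo(ENDMARKER, "", (lineno, pos), (lineno, pos), line)


for tok in lex("font-size: #abcdef; // foo"):
    print(tok.type, repr(tok.string), tok.start, tok.end)
```

Because the hyphen is only accepted inside a name, an input such as a - b raises SyntaxError in this sketch instead of being read as a single name, and #abcdef // foo is split into a color token and a discarded comment.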

@MatthieuDartiailh
Collaborator

As mentioned in my previous answer, there is a pending PR introducing a generic lexer interface that would allow you to use a custom lexer. However, the other maintainers are part of the Python release team and do not have the bandwidth to review this PR or the other pending PRs at the moment.

@jpsnyder

Yeah, excuse my ignorance. I'm having a hard time wrapping my head around how to get started with pegen.
I might work off your branch for the time being, although I was hoping there was a super simple example of how to use pegen without the Python tokenizer to help me get started.
