Add grmtools app. #68

Merged
merged 1 commit into from
Jun 4, 2024

ltratt
Contributor

@ltratt ltratt commented Jun 4, 2024

grmtools (or, more specifically from this commit's perspective, lrpar) is a Yacc-compatible parser written for Rust. Although it hasn't really been optimised, I thought this benchmark is sufficiently interesting that users might like to know where grmtools/lrpar fits in.

I have deliberately written the lexer and parser in not only "normal/proper Lex/Yacc style" but also "full grmtools style", including good support for error recovery, because I think that's the only mode in which grmtools makes sense. This might mean that this lexer/grammar is doing a bit more work than other parsers, but since I don't expect grmtools to be especially fast anyway, I don't suppose another few percent slowdown will hurt!

@epage
Collaborator

epage commented Jun 4, 2024

Generally we've had a "notoriety | uniqueness" policy. I feel like this isn't quite there on the notoriety. For uniqueness, maybe I don't deal with this area enough, but I feel like lalrpop and pest are similar enough.

@ltratt
Contributor Author

ltratt commented Jun 4, 2024

Notoriety I will leave to you to judge and I won't be offended either way.

LALRPOP is an LR parser, but it doesn't accept Yacc syntax and, indeed, converting Yacc grammars to it can be a challenge. Since every language needs a Yacc parser (there are Yacc grammars available for pretty much every language under the sun, and consequently Yacc-compatible implementations for pretty much every language), there's a need for a Rust Yacc system. grmtools is the only viable Yacc-compatible(ish) implementation I know of.

Pest is a PEG parser, so is completely different to LALRPOP and grmtools.

@epage
Collaborator

epage commented Jun 4, 2024

> Since every language needs a Yacc parser (there are Yacc grammars available for pretty much every language under the sun, and consequently Yacc-compatible implementations for pretty much every language),

I feel I'm missing something here. The tokens file is universal but doesn't have much value-add. The grammar file though is coupled to the Yacc implementation language and so it seems like "universality of yacc" is fairly limited.

@ltratt
Contributor Author

ltratt commented Jun 4, 2024

> The tokens file is universal but doesn't have much value-add.

Yes, this is lrlex input (similar, but not identical, to classic Lex). Lexing is boring, which is why I didn't bother mentioning it.

> The grammar file though is coupled to the Yacc implementation language and so it seems like "universality of yacc" is fairly limited.

I'm not sure I've understood this part though? Yacc is the most widely used grammar specification language; only ANTLR comes close. For example, you can take a Yacc grammar from 40+ years ago and it'll probably run through a modern Yacc system on any language (modulo the "actions", which are necessarily language specific).

@epage
Collaborator

epage commented Jun 4, 2024

> I'm not sure I've understood this part though?

%start Object
%expect-unused Unmatched "UNMATCHED"

%%

Object -> Result<Value, Box<dyn Error>>:
    "{" ObjectMembersOpt "}" { Ok(Value::Object(HashMap::from_iter($2?))) }
  ;

ObjectMembersOpt -> Result<Vec<(String, Value)>, Box<dyn Error>>:
    ObjectMembers { $1 }
  | { Ok(Vec::new()) }
  ;

This is a Yacc grammar for parsing JSON and generating Rust. The grammar being parsed and the language being generated are coupled together.

Your earlier statement made it sound like this would allow leveraging a lot of existing Yacc files, which would be a value-add. It lets you leverage Yacc knowledge and reduces the porting work, but it isn't 1:1 reuse of a grammar from a C application to a Rust application.
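
To make the coupling concrete, the two actions quoted above can be written out as ordinary Rust functions. This is only a sketch: the `Value` enum here is a cut-down, hypothetical stand-in for the benchmark's type, and the names `Members`, `object_action`, and `empty_members_action` are invented for illustration (a Yacc-style tool would generate its own equivalents from the `.y` file).

```rust
use std::collections::HashMap;
use std::error::Error;

// Hypothetical, cut-down stand-in for the benchmark's `Value` type.
#[derive(Debug, PartialEq)]
pub enum Value {
    Null,
    Object(HashMap<String, Value>),
}

// Hypothetical alias for the type produced by the `ObjectMembersOpt` rule.
pub type Members = Vec<(String, Value)>;

// Mirrors the action of the `Object` rule, where `$2` is the result of
// `ObjectMembersOpt`. `collect()` is used here as an equivalent of the
// `HashMap::from_iter(...)` call in the quoted grammar.
pub fn object_action(members: Result<Members, Box<dyn Error>>) -> Result<Value, Box<dyn Error>> {
    Ok(Value::Object(members?.into_iter().collect()))
}

// Mirrors the empty alternative of `ObjectMembersOpt`.
pub fn empty_members_action() -> Result<Members, Box<dyn Error>> {
    Ok(Vec::new())
}
```

The rule structure (`"{" ObjectMembersOpt "}"`) carries over to any Yacc-style tool, whereas these function bodies are the part that would need rewriting for a C or Java target.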

@ltratt
Contributor Author

ltratt commented Jun 4, 2024

grmtools can, if you want, ignore the action code and produce a parse tree (nimbleparse uses this functionality) -- but that wouldn't be in the spirit of the benchmarks in this repo AFAICS, where a specific Value struct seems to be in keeping with the other parsers.

In practice, the "difficult" part of a .y file is normally seen as the grammar part: the action code (as in the example you quote above) tends to be highly formulaic and thus "easy". In practice, this coupling of grammar + action code is what most people think of as "Yacc". It's pretty easy to take a grammar + action-code-in-another-language and change the latter to Rust (or C or ...), so Yacc files are still considered fairly portable, even if part of them is necessarily language specific.
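
The parse-tree mode mentioned above can be pictured roughly as follows. This is a minimal sketch and not grmtools' actual API: the `Tree` type and the `pretty` function are hypothetical names, chosen only to show that a generic tree needs no per-rule action code, so the grammar part of a `.y` file can be used unchanged.

```rust
// Hypothetical generic parse tree; this is NOT lrpar's real node type.
#[derive(Debug)]
pub enum Tree {
    // A token, carrying the text it matched.
    Term(String),
    // A grammar rule, with its name and the subtrees it matched.
    Nonterm(&'static str, Vec<Tree>),
}

// Render the tree as one node per line, indented by one space per level.
pub fn pretty(tree: &Tree) -> String {
    let mut out = String::new();
    pretty_into(tree, 0, &mut out);
    out
}

fn pretty_into(tree: &Tree, depth: usize, out: &mut String) {
    out.push_str(&" ".repeat(depth));
    match tree {
        Tree::Term(text) => {
            out.push_str(text);
            out.push('\n');
        }
        Tree::Nonterm(name, children) => {
            out.push_str(name);
            out.push('\n');
            for child in children {
                pretty_into(child, depth + 1, out);
            }
        }
    }
}
```

Under these assumptions, a tree for the input `{}` would be an `Object` node with three children: the `{` token, an empty `ObjectMembersOpt` node, and the `}` token. Turning such a tree into a `Value` is then a separate, language-specific pass, which is the part that would change when porting a grammar from another language.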

@epage
Collaborator

epage commented Jun 4, 2024

Thanks for the explanations. I'll go ahead and err on the side of including it. We can always remove it later if there is a reason to.

@epage epage merged commit 57efdf9 into rosetta-rs:main Jun 4, 2024
5 of 6 checks passed