Add grmtools app. #68

ltratt · 2024-06-04T12:32:42Z

grmtools (or, more specifically from this commit's perspective, lrpar) is a Yacc-compatible parser written for Rust. Although it hasn't really been optimised, I thought this benchmark was sufficiently interesting that users might like to know where grmtools/lrpar fits in.

I have deliberately written the lexer and parser in not only "normal/proper Lex/Yacc style" but also "full grmtools style", including good support for error recovery, because I think that's the only mode in which grmtools makes sense. This might mean that this lexer/grammar is doing a bit more work than other parsers, but since I don't expect grmtools to be especially fast anyway, I don't suppose another few percent slowdown will hurt!

epage · 2024-06-04T15:47:16Z

Generally we've had a "notoriety | uniqueness" policy. I feel like this isn't quite there on the notoriety. For uniqueness, maybe I don't deal with this area enough, but I feel like lalrpop and pest are similar enough.

ltratt · 2024-06-04T16:10:15Z

Notoriety I will leave to you to judge and I won't be offended either way.

LALRPOP is an LR parser, but it doesn't accept Yacc syntax and, indeed, converting Yacc grammars to it can be a challenge. Since every language needs a Yacc parser (there are Yacc grammars available for pretty much every language under the sun, and consequently Yacc-compatible implementations for pretty much every language), there's a need for a Rust Yacc system. grmtools is the only viable Yacc-compatible(ish) implementation I know of.

Pest is a PEG parser, so is completely different to LALRPOP and grmtools.

epage · 2024-06-04T16:14:00Z

Since every language needs a Yacc parser (there are Yacc grammars available for pretty much every language under the sun, and consequently Yacc-compatible implementations for pretty much every language),

I feel I'm missing something here. The tokens file is universal but doesn't have much value-add. The grammar file though is coupled to the Yacc implementation language and so it seems like "universality of yacc" is fairly limited.

ltratt · 2024-06-04T16:27:12Z

The tokens file is universal but doesn't have much value-add.

Yes, this is lrlex input (similar, but not identical, to classic Lex). Lexing is boring, which is why I didn't bother mentioning it.

The grammar file though is coupled to the Yacc implementation language and so it seems like "universality of yacc" is fairly limited.

I'm not sure I've understood this part though? Yacc is the most widely used grammar specification language; only ANTLR comes close. For example, you can take a Yacc grammar from 40+ years ago and it'll probably run through a modern Yacc system on any language (modulo the "actions", which are necessarily language specific).

epage · 2024-06-04T19:33:09Z

I'm not sure I've understood this part though?

%start Object
%expect-unused Unmatched "UNMATCHED"

%%

Object -> Result<Value, Box<dyn Error>>:
    "{" ObjectMembersOpt "}" { Ok(Value::Object(HashMap::from_iter($2?))) }
  ;

ObjectMembersOpt -> Result<Vec<(String, Value)>, Box<dyn Error>>:
    ObjectMembers { $1 }
  | { Ok(Vec::new()) }
  ;

This is a Yacc grammar for parsing JSON and generating Rust. The grammar being parsed and the language being generated are coupled together.

Your earlier statement made it sound like this would allow leveraging a lot of existing Yacc files, which would be a value-add. It lets you leverage Yacc knowledge and reduces the porting work, but it isn't 1:1 reuse of a grammar from a C application to a Rust application.

ltratt · 2024-06-04T19:52:20Z

grmtools can, if you want, ignore the action code and produce a parse tree (nimbleparse uses this functionality) -- but that wouldn't be in the spirit of the benchmarks in this repo AFAICS, where a specific Value struct seems to be in keeping with the other parsers.

In practice, the "difficult" part of a .y file is normally seen as the grammar part: the action code (as in the example you quote above) tends to be highly formulaic and thus "easy". This coupling of grammar + action code is what most people think of as "Yacc". It's pretty easy to take a grammar + action-code-in-another-language and change the latter to Rust (or C or ...), so Yacc files are still considered fairly portable, even if part of them is necessarily language specific.
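
To make the grammar/action split concrete, here is a minimal sketch in plain Rust. It does not use grmtools at all: `Node`, `Value`, `to_value` and `sample_tree` are hypothetical names invented for illustration, and the tree is built by hand rather than by a parser. The idea is that the tree shape follows the grammar, while `to_value` plays the role of the action code, which is the only part that would need rewriting for another host language.

```rust
use std::collections::HashMap;

// Hypothetical generic parse tree, roughly the kind of structure a parser can
// yield when it ignores action code. These are NOT grmtools types.
#[derive(Debug)]
enum Node {
    // A grammar rule together with its children.
    Rule { name: &'static str, children: Vec<Node> },
    // A lexed token and its text.
    Token { kind: &'static str, text: String },
}

// Host-language result type, standing in for the benchmark's Value struct.
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Object(HashMap<String, Value>),
}

// The "action code" layer: formulaic, and the only language-specific part.
fn to_value(node: &Node) -> Result<Value, String> {
    match node {
        Node::Token { kind: "STRING", text } => Ok(Value::Str(text.clone())),
        Node::Rule { name: "Object", children } => {
            let mut map = HashMap::new();
            for child in children {
                // Children that are not Member rules (e.g. braces) are skipped.
                if let Node::Rule { name: "Member", children: kv } = child {
                    match kv.as_slice() {
                        [Node::Token { kind: "STRING", text: key }, value] => {
                            map.insert(key.clone(), to_value(value)?);
                        }
                        _ => return Err("malformed Member".to_string()),
                    }
                }
            }
            Ok(Value::Object(map))
        }
        other => Err(format!("unexpected node: {:?}", other)),
    }
}

// Hand-built tree for the input {"k": "v"}; a real parser would produce this.
fn sample_tree() -> Node {
    Node::Rule {
        name: "Object",
        children: vec![Node::Rule {
            name: "Member",
            children: vec![
                Node::Token { kind: "STRING", text: "k".to_string() },
                Node::Token { kind: "STRING", text: "v".to_string() },
            ],
        }],
    }
}
```

Calling `to_value(&sample_tree())` converts the tree into a `Value::Object`. Porting to another language would mean rewriting only the equivalent of `to_value`, leaving the rule names and tree shape untouched.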

epage · 2024-06-04T19:58:14Z

Thanks for the explanations. I'll go ahead and err on the side of including it. We can always remove it later if there is a reason to.

epage merged commit 57efdf9 into rosetta-rs:main Jun 4, 2024
5 of 6 checks passed
