grmtools is a suite of Rust libraries and binaries for parsing text, both at
compile-time, and run-time. Most users will probably be interested in the
compile-time Yacc feature, which allows traditional .y
files to be used
(mostly) unchanged in Rust.
A minimal example using this library consists of two files (in addition to the
grammar and lexing definitions). First we need to create a file build.rs
in
the root of our project with the following content:
use cfgrammar::yacc::YaccKind;
use lrlex::LexerBuilder;
use lrpar::{CTParserBuilder};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let lex_rule_ids_map = CTParserBuilder::new()
.yacckind(YaccKind::Grmtools)
.process_file_in_src("grammar.y")?;
LexerBuilder::new()
.rule_ids_map(lex_rule_ids_map)
.process_file_in_src("lexer.l")?;
Ok(())
}
This will generate and compile a parser and lexer, where the definitions for the
lexer can be found in src/calc.l
:
%%
[0-9]+ "INT"
\+ "+"
\* "*"
\( "("
\) ")"
[\t ]+ ;
And where the definitions for the parser can be found in src/calc.y
:
%start Expr
%avoid_insert "INT"
%%
Expr -> Result<u64, ()>:
Expr '+' Term { Ok($1? + $3?) }
| Term { $1 }
;
Term -> Result<u64, ()>:
Term '*' Factor { Ok($1? * $3?) }
| Factor { $1 }
;
Factor -> Result<u64, ()>:
'(' Expr ')' { $2 }
| 'INT'
{
let v = $1.map_err(|_| ())?;
parse_int($lexer.span_str(v.span()))
}
;
%%
// Any functions here are in scope for all the grammar actions above.
fn parse_int(s: &str) -> Result<u64, ()> {
match s.parse::<u64>() {
Ok(val) => Ok(val),
Err(_) => {
eprintln!("{} cannot be represented as a u64", s);
Err(())
}
}
}
We can then use the generated lexer and parser within our src/main.rs
file as
follows:
use std::env;
use lrlex::lrlex_mod;
use lrpar::lrpar_mod;
// Using `lrlex_mod!` brings the lexer for `calc.l` into scope. By default the
// module name will be `calc_l` (i.e. the file name, minus any extensions,
// with a suffix of `_l`).
lrlex_mod!("calc.l");
// Using `lrpar_mod!` brings the parser for `calc.y` into scope. By default the
// module name will be `calc_y` (i.e. the file name, minus any extensions,
// with a suffix of `_y`).
lrpar_mod!("calc.y");
fn main() {
// Get the `LexerDef` for the `calc` language.
let lexerdef = calc_l::lexerdef();
let args: Vec<String> = env::args().collect();
// Now we create a lexer with the `lexer` method with which we can lex an
// input.
let lexer = lexerdef.lexer(&args[1]);
// Pass the lexer to the parser and lex and parse the input.
let (res, errs) = calc_y::parse(&lexer);
for e in errs {
println!("{}", e.pp(&lexer, &calc_y::token_epp));
}
match res {
Some(r) => println!("Result: {:?}", r),
_ => eprintln!("Unable to evaluate expression.")
}
}
For more information on how to use this library please refer to the grmtools book, which also includes a more detailed quickstart guide.
lrpar
contains several examples on how to use the lrpar
/lrlex
libraries, showing
how to generate parse
trees
and
ASTs,
or execute
code
while parsing.