
pest. The Elegant Parser

Join the chat at https://gitter.im/pest-parser/pest

pest is a general-purpose parser written in Rust with a focus on accessibility, correctness, and performance. It takes parsing expression grammars (PEGs) as input, which are similar in spirit to regular expressions but offer the enhanced expressivity needed to parse complex languages.

Getting started

The recommended way to start parsing with pest is to read the official book.

Example

The following is an example of a grammar for a list of alphanumeric identifiers, none of which may start with a digit:

alpha = { 'a'..'z' | 'A'..'Z' }
digit = { '0'..'9' }

ident = { !digit ~ (alpha | digit)+ }

ident_list = _{ ident ~ (" " ~ ident)* }
          // ^
          // ident_list rule is silent which means it produces no tokens

Grammars are saved in separate .pest files which are never mixed with procedural code. This results in an always up-to-date formalization of a language that is easy to read and maintain.
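To make the semantics of the grammar above concrete, here is a rough hand-written equivalent in plain Rust, without pest. The function names (match_ident, match_ident_list) are made up for this illustration, and the code is only a sketch of what the rules mean, not what pest generates. It assumes ASCII input for identifiers, as the grammar does.

```rust
/// Hand-written counterpart of `alpha = { 'a'..'z' | 'A'..'Z' }`.
fn is_alpha(c: char) -> bool {
    c.is_ascii_alphabetic()
}

/// Hand-written counterpart of `digit = { '0'..'9' }`.
fn is_digit(c: char) -> bool {
    c.is_ascii_digit()
}

/// Tries to match `ident = { !digit ~ (alpha | digit)+ }` at the start of
/// `input`. Returns the number of bytes consumed, or `None` on failure.
fn match_ident(input: &str) -> Option<usize> {
    // `!digit` is a negative lookahead: it fails if a digit comes next
    // and consumes nothing when it succeeds.
    if input.chars().next().map_or(false, is_digit) {
        return None;
    }
    // `(alpha | digit)+` greedily consumes one or more characters.
    // Every accepted character is ASCII, so chars and bytes coincide.
    let len = input
        .chars()
        .take_while(|&c| is_alpha(c) || is_digit(c))
        .count();
    if len == 0 {
        None
    } else {
        Some(len)
    }
}

/// Matches `ident_list = _{ ident ~ (" " ~ ident)* }` at the start of
/// `input` and returns the identifiers it found.
fn match_ident_list(input: &str) -> Option<Vec<&str>> {
    let mut pos = match_ident(input)?;
    let mut idents = vec![&input[..pos]];
    // `(" " ~ ident)*` repeats while a space followed by an ident matches.
    while let Some(rest) = input[pos..].strip_prefix(' ') {
        match match_ident(rest) {
            Some(len) => {
                let start = pos + 1;
                idents.push(&input[start..start + len]);
                pos = start + len;
            }
            None => break,
        }
    }
    Some(idents)
}
```

Calling match_ident_list("a1 b2") yields the two identifiers "a1" and "b2", while match_ident_list("123") yields None because the lookahead rejects the leading digit. The pest grammar expresses the same logic in four lines and keeps it out of the procedural code.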

Meaningful error reporting

Based on the grammar definition, the parser also includes automatic error reporting. For the example above, the input "123" will result in:

thread 'main' panicked at ' --> 1:1
  |
1 | 123
  | ^---
  |
  = unexpected digit', src/main.rs:12

while "ab *" will result in:

thread 'main' panicked at ' --> 1:4
  |
1 | ab *
  |    ^---
  |
  = expected ident', src/main.rs:12

These error messages can be obtained from their default Display implementation, e.g. panic!("{}", parser_result.unwrap_err()) or println!("{}", e).
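As an illustration of how such a report can come out of a Display implementation, the sketch below defines a small stand-alone error type that prints the same layout. ReportSketch and its fields are hypothetical names invented for this example; pest's own error type is pest::error::Error, and its formatting logic is more elaborate than this.

```rust
use std::fmt;

/// Hypothetical error type, for illustration only.
struct ReportSketch<'a> {
    /// 1-based line number of the failure.
    line: usize,
    /// 1-based column number of the failure.
    col: usize,
    /// Text of the offending input line.
    source_line: &'a str,
    /// Human-readable description, e.g. "expected ident".
    message: &'a str,
}

impl<'a> fmt::Display for ReportSketch<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Padding as wide as the line number, so the bars line up.
        let gutter = " ".repeat(self.line.to_string().len());
        // Spaces that push the caret under the failing column.
        let indent = " ".repeat(self.col.saturating_sub(1));
        writeln!(f, "{}--> {}:{}", gutter, self.line, self.col)?;
        writeln!(f, "{} |", gutter)?;
        writeln!(f, "{} | {}", self.line, self.source_line)?;
        writeln!(f, "{} | {}^---", gutter, indent)?;
        writeln!(f, "{} |", gutter)?;
        write!(f, "{} = {}", gutter, self.message)
    }
}
```

Because the type implements Display, the report can be produced with println!("{}", e), e.to_string(), or inside a panic! message, which is the same way the errors returned by a pest parser are printed.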

Pairs API

The grammar can be used to derive a Parser implementation automatically. Parsing returns an iterator of nested token pairs:

use pest_derive::Parser;
use pest::Parser;

#[derive(Parser)]
#[grammar = "ident.pest"]
struct IdentParser;

fn main() {
    let pairs = IdentParser::parse(Rule::ident_list, "a1 b2").unwrap_or_else(|e| panic!("{}", e));

    // Because ident_list is silent, the iterator will contain idents
    for pair in pairs {
        // A pair is a combination of the rule which matched and a span of input
        println!("Rule:    {:?}", pair.as_rule());
        println!("Span:    {:?}", pair.as_span());
        println!("Text:    {}", pair.as_str());

        // A pair can be converted to an iterator of the tokens which make it up:
        for inner_pair in pair.into_inner() {
            match inner_pair.as_rule() {
                Rule::alpha => println!("Letter:  {}", inner_pair.as_str()),
                Rule::digit => println!("Digit:   {}", inner_pair.as_str()),
                _ => unreachable!()
            };
        }
    }
}

This produces the following output:

Rule:    ident
Span:    Span { start: 0, end: 2 }
Text:    a1
Letter:  a
Digit:   1
Rule:    ident
Span:    Span { start: 3, end: 5 }
Text:    b2
Letter:  b
Digit:   2

Defining multiple parsers in a single file

Deriving Parser automatically also generates a Rule enum. If several structs in the same module derive Parser, their generated Rule enums conflict by name. One way around this is to put each parser struct in its own module:

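mod a {
    // The derive macro must be in scope inside each module.
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "a.pest"]
    pub struct ParserA;
}
mod b {
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "b.pest"]
    pub struct ParserB;
}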
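The following sketch shows why the module split resolves the conflict. It does not use pest: the two Rule enums are hand-written stand-ins for the ones the derive would generate, and the rule names ident and number are hypothetical.

```rust
// Stand-ins for the enums that `#[derive(Parser)]` would generate from
// a.pest and b.pest. pest names the variants after the grammar rules,
// which is why they are lowercase here.
mod a {
    #[allow(non_camel_case_types)]
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Rule {
        ident,
    }
}

mod b {
    #[allow(non_camel_case_types)]
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Rule {
        number,
    }
}

/// Both enums are called `Rule`, but each is reached through its module
/// path, so they can be used side by side without clashing.
fn rule_names() -> (String, String) {
    (
        format!("{:?}", a::Rule::ident),
        format!("{:?}", b::Rule::number),
    )
}
```

With real parsers, the same qualification applies at the call site: each parser is invoked with the Rule enum from its own module, for example a::Rule for ParserA and b::Rule for ParserB.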

Other features

  • Precedence climbing
  • Input handling
  • Custom errors
  • Runs on stable Rust

Projects using pest

You can find more projects and ecosystem tools in the awesome-pest repo.

Minimum Supported Rust Version (MSRV)

This library should always compile with default features on Rust 1.61.0.

no_std support

The pest and pest_derive crates can be built without the Rust standard library and target embedded environments. To do so, you need to disable their default features. In your Cargo.toml, you can specify it as follows:

[dependencies]
# ...
pest = { version = "2", default-features = false }
pest_derive = { version = "2", default-features = false }

If you want to build these crates in the pest repository's workspace, you can pass the --no-default-features flag to cargo and specify these crates using the --package (-p) flag. For example:

$ cargo build --target thumbv7em-none-eabihf --no-default-features -p pest
$ cargo bootstrap
$ cargo build --target thumbv7em-none-eabihf --no-default-features -p pest_derive

Special thanks

A special round of applause goes to Prof. Marius Minea for his guidance and to all pest contributors, some of whom are none other than my friends.