
Oil Parser Generator Project

andychu edited this page Jun 7, 2022 · 19 revisions


This page introduces an important subproject of Oil (https://www.oilshell.org/).

Quick Description

Oil is developed "middle out", with an "executable spec" in Python, which is then semi-automatically translated to C++.

Much of the code works in C++, but the expression parser does not. It needs special handling.

In Python, the oil_lang/grammar_gen.py tool reads the grammar oil_lang/grammar.pgen2 and writes out a set of parse tables in Python's "marshal" format. At runtime, the pgen2/ library reads them.
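
As a rough illustration of that round trip, here is a minimal sketch using the standard `marshal` module. The `tables` dict, its keys, and the `dump_tables` / `load_tables` helpers are hypothetical; the real layout written by grammar_gen.py is not shown on this page.

```python
import marshal

# Hypothetical, simplified parse tables. The keys and nesting here are
# assumptions for illustration, not the layout grammar_gen.py writes.
tables = {
    "start": 256,
    "labels": [(0, "EMPTY"), (1, None), (2, None)],
    # One DFA with two states; each state is a list of (label, next_state) arcs.
    "dfas": [[[(2, 1), (3, 1)], [(0, 1)]]],
}


def dump_tables(tables):
    """Serialize the tables to bytes, as a build step might."""
    return marshal.dumps(tables)


def load_tables(data):
    """Deserialize the tables, as a runtime loader might."""
    return marshal.loads(data)


blob = dump_tables(tables)
restored = load_tables(blob)
```

The catch is that a C++ binary with no Python interpreter has nothing that can read this format, which is why the tables need another representation.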

So instead of outputting Python data structures in "marshal" format, we want to output C data structures, just as Python's own pgen did before Python 3.9, when CPython switched to a PEG parser.

Background

The blog post "How to Parse Shell Like a Programming Language" explains our parsing approach. The shell parser already works in Python:

$ bin/oil --ast-format text -n -c 'echo "hello $name"'
(command.Simple
  words: [
    (compound_word parts:[(Token id:Id.Lit_Chars span_id:0 val:echo)])
...

And it's already translated to C++:

$ _bin/cxx-dbg/osh_eval  -n -c 'echo "hello $name"'
(command.Simple
  words: [
    (compound_word parts:[(Token id:Id.Lit_Chars span_id:0 val:echo)])
...

This part does not use pgen2.

Data Snippets

~/git/oilshell/oil/Python-2.7.13$ head -n 15  Python/graminit.c 
/* Generated by Parser/pgen */

#include "pgenheaders.h"
#include "grammar.h"
PyAPI_DATA(grammar) _PyParser_Grammar;
static arc arcs_0_0[3] = {
    {2, 1},
    {3, 1},
    {4, 2},
};
static arc arcs_0_1[1] = {
    {0, 1},
};
static arc arcs_0_2[1] = {
    {2, 1},
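
A generator could print arrays of this shape from Python. The following is only a sketch: `emit_arc_array` and `emit_dfa` are hypothetical helpers, not functions in grammar_gen.py, and the input is assumed to be a list of states, each a list of (label, next_state) pairs. The sample data is copied from the first two arrays above.

```python
def emit_arc_array(dfa_index, state_index, arcs):
    """Return C source for one state's arcs, in the style of graminit.c."""
    name = "arcs_%d_%d" % (dfa_index, state_index)
    lines = ["static arc %s[%d] = {" % (name, len(arcs))]
    for label, next_state in arcs:
        lines.append("    {%d, %d}," % (label, next_state))
    lines.append("};")
    return "\n".join(lines)


def emit_dfa(dfa_index, states):
    """Return C source for every state of one DFA, one array per state."""
    return "\n".join(
        emit_arc_array(dfa_index, i, arcs) for i, arcs in enumerate(states)
    )


# The first two states of DFA 0, taken from the graminit.c excerpt.
dfa_0 = [
    [(2, 1), (3, 1), (4, 2)],
    [(0, 1)],
]

c_source = emit_dfa(0, dfa_0)
print(c_source)
```

A real generator would also need to emit the state, DFA, and label tables that refer to these arrays; this sketch covers only the arcs.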

Relevant Files

  • oil_lang/grammar_gen.py
  • oil_lang/grammar.pgen2
  • _devbuild/gen/grammar.marshal and _devbuild/gen/grammar_nt.py (non-terminals)
  • oil_lang/expr_parse.py -- a wrapper for the generated parser
  • The pgen2/ directory
    • parse.py and more
  • pgen-native/ dir -- this is just a copy of Python, imported by a contributor

Related
