Extending python grammar #1453

@mbhall88

So I have tried the import_lark branch out (#1446). I must admit, I am a complete beginner with Lark and so I'm not sure if my error is due to that, or due to something else.

What I am trying to do is create a grammar and parser for Snakemake, which is a DSL built on top of Python; i.e., any Python syntax is valid Snakemake syntax, and there is Snakemake-specific syntax on top of that.

I want to keep the Snakemake grammar definition separate from the Python grammar, hence why I stumbled across this issue.

Here is a small example of what I was trying to do, using Lark installed from the linked branch (import_star):

from lark import Lark

lark = Lark(
    r"""
%import python.*

start: file_input
            
ruledef: "rule" NAME ":" inputs outputs
inputs: "input:" files
outputs: "output:" files
files: (FILE_NAME)+

FILE_NAME: /[a-zA-Z0-9_\.\/]+/

"""
)

snakefile = """x = 42

rule foo:
    input: 'foo.txt'
"""


def parse_snakemake_file():
    return lark.parse(snakefile)

When I import the parse_snakemake_file function and run it, I get the following:

p = snakemake_parser.parse_snakemake_file()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/michael/Projects/snakemake-parser/src/snakemake_parser/__init__.py", line 27, in parse_snakemake_file
    return lark.parse(snakefile)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/lark.py", line 655, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parser_frontends.py", line 104, in parse
    return self.parser.parse(stream, chosen_start, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/earley.py", line 280, in parse
    to_scan = self._parse(lexer, columns, to_scan, start_symbol)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/xearley.py", line 152, in _parse
    to_scan = scan(i, to_scan)
              ^^^^^^^^^^^^^^^^
  File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/xearley.py", line 125, in scan
    raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
lark.exceptions.UnexpectedCharacters: No terminal matches ' ' in the current parser context, at line 1 col 2

x = 42
 ^
Expected one of:
        * __ANON_5
        * __ANON_13
        * LPAR
        * __ANON_18
        * EQUAL
        * VBAR
        * DOT
        * MORETHAN
        * SEMICOLON
        * __ANON_6
        * PERCENT
        * __ANON_17
        * __ANON_2
        * __ANON_16
        * COLON
        * __ANON_21
        * AMPERSAND
        * CIRCUMFLEX
        * COMMA
        * IN
        * __ANON_11
        * __ANON_10
        * SLASH
        * __ANON_12
        * LESSTHAN
        * __ANON_22
        * __ANON_8
        * IF
        * __ANON_7
        * __ANON_20
        * NOT
        * __ANON_15
        * AND
        * _NEWLINE
        * MINUS
        * __ANON_23
        * __ANON_3
        * __ANON_14
        * PLUS
        * __ANON_1
        * __ANON_9
        * LSQB
        * OR
        * STAR
        * IS
        * AT
        * __ANON_19
        * __ANON_4

Again, this could be my misunderstanding. I wasn't certain what to use for start, as that doesn't seem to be defined in the Python grammar.

Also, I am happy to move this to a separate issue so as not to clutter this issue.
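
For context on the error itself: "No terminal matches ' '" at column 2 suggests that nothing in the active grammar is allowed to consume or skip the space after `x`. In Lark, whitespace skipping is normally configured with `%ignore` directives, so one plausible reading (an assumption, not confirmed in this thread) is that the Python grammar's `%ignore` rules are not in effect after `%import python.*`. The toy tokenizer below is a stdlib-only sketch of that failure mode. It is not Lark's implementation, and the names `TERMINALS`, `WHITESPACE`, and `tokenize` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical toy terminals, loosely mirroring names from the error message.
TERMINALS = [
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("NUMBER", re.compile(r"\d+")),
    ("EQUAL", re.compile(r"=")),
]

# Plays the role of an ignore rule for spaces, tabs, and form feeds.
WHITESPACE = re.compile(r"[\t \f]+")


def tokenize(text, ignore=None):
    """Return (terminal_name, value) pairs; raise if no terminal matches."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Skip ignorable input first, if an ignore pattern was supplied.
        if ignore is not None:
            skipped = ignore.match(text, pos)
            if skipped:
                pos = skipped.end()
                continue
        # Try each terminal in order at the current position.
        for name, pattern in TERMINALS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No terminal and no ignore rule covers this character.
            raise ValueError(f"No terminal matches {text[pos]!r} at col {pos + 1}")
    return tokens


if __name__ == "__main__":
    print(tokenize("x = 42", ignore=WHITESPACE))
    try:
        tokenize("x = 42")
    except ValueError as err:
        print(err)
```

Without the ignore pattern, the sketch stops at the same position as the traceback above. If the assumption holds, declaring an `%ignore` rule for whitespace in the extending grammar may be worth trying. Separately, Lark's bundled Python grammar is usually run with `parser="lalr"` and the `PythonIndenter` post-lexer from `lark.indenter`, since it relies on indentation tokens that the grammar only declares.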

Originally posted by @mbhall88 in #1397 (comment)
