Description
So I have tried the import_lark branch out (#1446). I must admit, I am a complete beginner with Lark and so I'm not sure if my error is due to that, or due to something else.
What I am trying to do is create a grammar and parser for Snakemake, which is a DSL built on top of Python; i.e., any Python syntax is valid Snakemake syntax, and then there is Snakemake-specific syntax on top of that.
I want to keep the Snakemake grammar definition separate from the Python grammar, hence why I stumbled across this issue.
Here is a small example of what I was trying to do (using Lark installed from the linked branch (import_star)):
from lark import Lark
lark = Lark(
r"""
%import python.*
start: file_input
ruledef: "rule" NAME ":" inputs outputs
inputs: "input:" files
outputs: "output:" files
files: (FILE_NAME)+
FILE_NAME: /[a-zA-Z0-9_\.\/]+/
"""
)
snakefile = """x = 42
rule foo:
input: 'foo.txt'
"""
def parse_snakemake_file():
    return lark.parse(snakefile)
When I try to import the parse_snakemake_file function and run it, I get the following:
p = snakemake_parser.parse_snakemake_file()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/michael/Projects/snakemake-parser/src/snakemake_parser/__init__.py", line 27, in parse_snakemake_file
return lark.parse(snakefile)
^^^^^^^^^^^^^^^^^^^^^
File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/lark.py", line 655, in parse
return self.parser.parse(text, start=start, on_error=on_error)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parser_frontends.py", line 104, in parse
return self.parser.parse(stream, chosen_start, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/earley.py", line 280, in parse
to_scan = self._parse(lexer, columns, to_scan, start_symbol)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/xearley.py", line 152, in _parse
to_scan = scan(i, to_scan)
^^^^^^^^^^^^^^^^
File "/home/michael/Projects/snakemake-parser/.venv/lib/python3.12/site-packages/lark/parsers/xearley.py", line 125, in scan
raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
lark.exceptions.UnexpectedCharacters: No terminal matches ' ' in the current parser context, at line 1 col 2
x = 42
^
Expected one of:
* __ANON_5
* __ANON_13
* LPAR
* __ANON_18
* EQUAL
* VBAR
* DOT
* MORETHAN
* SEMICOLON
* __ANON_6
* PERCENT
* __ANON_17
* __ANON_2
* __ANON_16
* COLON
* __ANON_21
* AMPERSAND
* CIRCUMFLEX
* COMMA
* IN
* __ANON_11
* __ANON_10
* SLASH
* __ANON_12
* LESSTHAN
* __ANON_22
* __ANON_8
* IF
* __ANON_7
* __ANON_20
* NOT
* __ANON_15
* AND
* _NEWLINE
* MINUS
* __ANON_23
* __ANON_3
* __ANON_14
* PLUS
* __ANON_1
* __ANON_9
* LSQB
* OR
* STAR
* IS
* AT
* __ANON_19
* __ANON_4
Again, this could be my misunderstanding. I wasn't certain what to use for start, as that doesn't seem to be defined in the Python grammar.
Also, I am happy to move this to a separate issue so as not to clutter this issue.
Originally posted by @mbhall88 in #1397 (comment)