diff --git a/.gitignore b/.gitignore
index 1f13d94..c8bebed 100644
--- a/.gitignore
+++ b/.gitignore
@@ -70,6 +70,7 @@ instance/
 
 # Sphinx documentation
 docs/_build/
+docs/apidocs/
 
 # PyBuilder
 .pybuilder/
diff --git a/docs/element.md b/docs/element.md
index e6cde9a..584c115 100644
--- a/docs/element.md
+++ b/docs/element.md
@@ -2,7 +2,7 @@
 
 The [`parse_file`](#textmate_grammar.language.LanguageParser.parse_file) and [`parse_string`](#textmate_grammar.language.LanguageParser.parse_string) methods of the [`LanguageParser`](#textmate_grammar.language.LanguageParser) both return either `None` if the content could not be parsed, or a [`ContentElement`](#textmate_grammar.elements.ContentElement) or [`ContentBlockElement`](#textmate_grammar.elements.ContentBlockElement) if parsing was successful.
 
-```mermaid
+```{mermaid}
 classDiagram
     direction LR
     class ContentElement{
@@ -60,7 +60,7 @@ This representation is more akin to the output of [vscode-textmate](https://gith
 
 To find specific descendent elements, instead of indexing manually through the `children` (or `begin` and `end`) attribute, use the provided methods [`find`](#textmate_grammar.elements.ContentElement.find), which yields the found descendent elements one by one, and [`findall`](#textmate_grammar.elements.ContentElement.findall), which returns all descendents as a list.
 
-```mermaid
+```{mermaid}
 flowchart LR
     style root stroke-width:0,fill:#F94144
     style ca stroke-width:0,fill:#577590
diff --git a/docs/index.md b/docs/index.md
index 86a7eec..4b84fe9 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,3 +1,5 @@
+# Textmate Grammar Python
+
 [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
 [![Checked with mypy](https://img.shields.io/badge/mypy-checked-blue)](http://mypy-lang.org/)
 [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
@@ -8,74 +10,28 @@ A lexer and tokenizer for grammar files as defined by TextMate and used in VSCod
 
 Textmate grammars are made for [vscode-texmate](https://github.com/microsoft/vscode-textmate), allowing for syntax highlighting in VSCode after tokenization. This presents textmate-grammar-python with a large list of potentially supported languages.
 
-```mermaid
-flowchart TD
-    A[grammar file]
-    Z[code]
-    B("`vscode-textmate **js**`")
-    C("`textmate-grammar-**python**`")
-    D[tokens]
-
-    click C "https://github.com/microsoft/vscode-textmate"
-
-    Z --> B
-    Z --> C
-    A -.-> B --> D
-    A -.-> C --> D
-```
-
-## Installation
-Install the module with:
-```bash
-pip install textmate-grammar-python
-```
-
-Or, for development purposes, clone the repository and install locally with [poetry](https://python-poetry.org/), and setup [pre-commit](https://pre-commit.com/) such that code is linted and formatted with [Ruff](https://docs.astral.sh/ruff/) and checked with [mypy](https://mypy-lang.org/).
-
-```bash
-pip install poetry
-git clone https://github.com/watermarkhu/textmate-grammar-python
-cd textmate-grammar-python
-poetry install
-pre-commit install
-```
-For instructions on running the unit and regression tests see [CONTRIBUTING.md](https://github.com/watermarkhu/textmate-grammar-python/blob/main/CONTRIBUTING.md)
-
-
-## Usage
-Before tokenization is possible, a [`LanguageParser`](#textmate_grammar.language.LanguageParser) needs to be initialized using a loaded grammar.
-
-```python
-from textmate_grammar.language import LanguageParser
-from textmate_grammar.grammars import matlab
-parser = LanguageParser(matlab.GRAMMAR)
-```
-
-After this, one can either choose to call [`parser.parsing_string`](#textmate_grammar.language.LanguageParser.parse_string) to parse a input string directly, or call [`parser.parse_file`](#textmate_grammar.language.LanguageParser.parse_file) with the path to the appropiate source file as the first argument, such as in the example [`example.py`](../example.py).
-
-The parsed `element` object can be displayed directly by calling the [`print`](#textmate_grammar.elements.ContentElement.print) method. By default the element is printed as an element tree in a dictionary format.
-
-```python
->>> element = parser.parse_string("value = num2str(10);")
->>> element.print()
-
-{'token': 'source.matlab',
- 'children': [{'token': 'meta.assignment.variable.single.matlab',
-               'children': [{'token': 'variable.other.readwrite.matlab', 'content': 'value'}]},
-              {'token': 'keyword.operator.assignment.matlab', 'content': '='},
-              {'token': 'meta.function-call.parens.matlab',
-               'begin': [{'token': 'entity.name.function.matlab', 'content': 'num2str'},
-                         {'token': 'punctuation.section.parens.begin.matlab', 'content': '('}],
-               'end': [{'token': 'punctuation.section.parens.end.matlab', 'content': ')'}],
-               'children': [{'token': 'constant.numeric.decimal.matlab', 'content': '10'}]},
-              {'token': 'punctuation.terminator.semicolon.matlab', 'content': ';'}]}
-
+```{mermaid}
+flowchart LR
+    G[grammar file]
+    C[code]
+    PY("`textmate-grammar-**python**`")
+    JS("`vscode-textmate **js**`")
+    T[tokens]
+    click JS "https://github.com/microsoft/vscode-textmate"
+    C --> PY
+    C --> JS
+    G -.-> PY
+    G -.-> JS
+    PY --> T
+    JS --> T
 ```
 
+## Index
 
 ```{toctree}
 :maxdepth: 2
 
+started
 languages
 element
 apidocs/index
diff --git a/docs/started.md b/docs/started.md
new file mode 100644
index 0000000..c333e6c
--- /dev/null
+++ b/docs/started.md
@@ -0,0 +1,49 @@
+# Getting started
+
+Install the module with:
+```bash
+pip install textmate-grammar-python
+```
+
+Or, for development purposes, clone the repository and install locally with [poetry](https://python-poetry.org/), and set up [pre-commit](https://pre-commit.com/) so that code is linted and formatted with [Ruff](https://docs.astral.sh/ruff/) and checked with [mypy](https://mypy-lang.org/).
+
+```bash
+pip install poetry
+git clone https://github.com/watermarkhu/textmate-grammar-python
+cd textmate-grammar-python
+poetry install
+pre-commit install
+```
+For instructions on running the unit and regression tests, see [CONTRIBUTING.md](https://github.com/watermarkhu/textmate-grammar-python/blob/main/CONTRIBUTING.md).
+
+
+# Usage
+Before tokenization is possible, a [`LanguageParser`](#textmate_grammar.language.LanguageParser) needs to be initialized using a loaded grammar.
+
+```python
+from textmate_grammar.language import LanguageParser
+from textmate_grammar.grammars import matlab
+parser = LanguageParser(matlab.GRAMMAR)
+```
+
+After this, either call [`parser.parse_string`](#textmate_grammar.language.LanguageParser.parse_string) to parse an input string directly, or call [`parser.parse_file`](#textmate_grammar.language.LanguageParser.parse_file) with the path to the appropriate source file as the first argument, as in the example [`example.py`](../example.py).
+
+The parsed `element` object can be displayed directly by calling the [`print`](#textmate_grammar.elements.ContentElement.print) method. By default, the element is printed as an element tree in dictionary format.
+
+```python
+>>> element = parser.parse_string("value = num2str(10);")
+>>> element.print()
+
+{'token': 'source.matlab',
+ 'children': [{'token': 'meta.assignment.variable.single.matlab',
+               'children': [{'token': 'variable.other.readwrite.matlab', 'content': 'value'}]},
+              {'token': 'keyword.operator.assignment.matlab', 'content': '='},
+              {'token': 'meta.function-call.parens.matlab',
+               'begin': [{'token': 'entity.name.function.matlab', 'content': 'num2str'},
+                         {'token': 'punctuation.section.parens.begin.matlab', 'content': '('}],
+               'end': [{'token': 'punctuation.section.parens.end.matlab', 'content': ')'}],
+               'children': [{'token': 'constant.numeric.decimal.matlab', 'content': '10'}]},
+              {'token': 'punctuation.terminator.semicolon.matlab', 'content': ';'}]}
+
+```
+
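
Note on the dictionary output documented in `docs/started.md`: the printed tree nests leaf tokens under `children`, and block elements additionally carry `begin` and `end` lists. A minimal sketch of flattening that dictionary into `(token, content)` pairs in source order follows. It assumes only the dictionary layout shown in the hunk above; the helper name `iter_tokens` is hypothetical and not part of textmate-grammar-python, and the `tree` literal is copied from the documented output rather than produced by running the parser.

```python
from typing import Iterator, Tuple

# Copied from the `element.print()` output shown in docs/started.md.
tree = {
    "token": "source.matlab",
    "children": [
        {
            "token": "meta.assignment.variable.single.matlab",
            "children": [
                {"token": "variable.other.readwrite.matlab", "content": "value"}
            ],
        },
        {"token": "keyword.operator.assignment.matlab", "content": "="},
        {
            "token": "meta.function-call.parens.matlab",
            "begin": [
                {"token": "entity.name.function.matlab", "content": "num2str"},
                {"token": "punctuation.section.parens.begin.matlab", "content": "("},
            ],
            "end": [
                {"token": "punctuation.section.parens.end.matlab", "content": ")"}
            ],
            "children": [
                {"token": "constant.numeric.decimal.matlab", "content": "10"}
            ],
        },
        {"token": "punctuation.terminator.semicolon.matlab", "content": ";"},
    ],
}


def iter_tokens(node: dict) -> Iterator[Tuple[str, str]]:
    """Yield (token, content) for every leaf, visiting begin, children, end in that order.

    Hypothetical helper: it walks the printed dictionary, not a live element object.
    """
    nested = [child for key in ("begin", "children", "end") for child in node.get(key, [])]
    if not nested:
        # A leaf carries its matched text under 'content'.
        yield node["token"], node.get("content", "")
        return
    for child in nested:
        yield from iter_tokens(child)


for token, content in iter_tokens(tree):
    print(f"{content!r:12} {token}")
```

On a live `element`, the `find` and `findall` methods described in `docs/element.md` are the documented route; the sketch above only illustrates how the printed structure is organized.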