Restore code from floyd-python v0.16.0.
This reverts commit 6b4450e.
dpranke committed Nov 17, 2024
1 parent 69e87e8 commit f3d9002
Showing 44 changed files with 12,011 additions and 19 deletions.
28 changes: 28 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,28 @@
name: Python package

on:
  push:
  pull_request:
    branches:
      - main

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.8, 3.12]

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
    - name: Set up Node
      uses: actions/setup-node@v4
      with:
        node-version: 18
    - name: Run tests
      run: python run tests
75 changes: 75 additions & 0 deletions .pylintrc
@@ -0,0 +1,75 @@
# Copyright 2014 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This file is taken from the suggested pylint configuration in
# Google's Python Style Guide, specifically
# https://github.com/google/styleguide/blob/28d10d1/pylintrc
# with some changes:
# - suppress consider-using-f-string as too noisy
# - Commented out the disabling of no-self-use in order to get rid of
# the "useless option value" warning.
# - Disabled the duplicate-code check because there's too much unavoidable
# code similarity between compiler.py and printer.py.
# - Turned off the score message in the [REPORTS] section.

[MAIN]

persistent=yes

[MESSAGES CONTROL]

disable=
    broad-except,
    consider-using-f-string, # over-sensitive
    duplicate-code, # Too much is duplicated between printer and compiler.
    exec-used, # The uses of exec in this package are okay.
    fixme, # Don't want warnings for this.
    global-statement,
    locally-disabled,
    missing-docstring,
    # no-self-use, # Pylint generates a 'useless option value' warning for this
    too-many-arguments,
    too-few-public-methods,
    too-many-branches,
    too-many-instance-attributes,
    too-many-locals,
    too-many-public-methods,
    too-many-return-statements,
    unidiomatic-typecheck,

[REPORTS]

reports=no

[SCORE]

score=no # Suppress the "score" message in the output.

[BASIC]

# By default, pylint wants method names to be at most 31 chars long,
# but we want to allow up to 49 to allow for longer test names.
method-rgx=[a-zA-Z_][a-zA-Z0-9_]{0,48}$

# By default, pylint only allows UPPER_CASE constants, but we want to
# allow snake_case as well in some situations.
const-rgx=[a-zA-Z_][a-zA-Z0-9_]{0,30}$

# By default, pylint wants all parameter names to be at least two chars long,
# but we want to allow single-char parameter names as well.
argument-rgx=[a-z_][a-z0-9_]{0,30}$

# By default, pylint wants all variable names to be at least two chars long,
# but we want to allow single-char variable names as well.
variable-rgx=[a-z_][a-z0-9_]{0,30}$
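
The length limits described in the `[BASIC]` comments above can be checked directly with Python's `re` module. This is a minimal sketch, not part of the commit: the `allowed` helper is a hypothetical name, and it assumes pylint applies each naming pattern from the start of the identifier, which `re.match` mirrors.

```python
import re

# Patterns copied verbatim from the [BASIC] section above.
METHOD_RGX = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]{0,48}$')
CONST_RGX = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]{0,30}$')
ARGUMENT_RGX = re.compile(r'[a-z_][a-z0-9_]{0,30}$')


def allowed(pattern, name):
    """Return True if `name` satisfies `pattern` (hypothetical helper).

    Assumption: the pattern is anchored at the start of the name, as
    re.match does; the trailing `$` in each pattern anchors the end.
    """
    return pattern.match(name) is not None


# One leading character plus up to 48 more gives a 49-character limit.
print(allowed(METHOD_RGX, 'a' * 49))      # True
print(allowed(METHOD_RGX, 'a' * 50))      # False

# Constants may be UPPER_CASE or snake_case.
print(allowed(CONST_RGX, 'MAX_DEPTH'))    # True
print(allowed(CONST_RGX, 'max_depth'))    # True

# Single-character argument names are accepted.
print(allowed(ARGUMENT_RGX, 'x'))         # True
```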
135 changes: 131 additions & 4 deletions README.md
@@ -2,8 +2,135 @@

A parsing framework and parser generator for Python.

-2024-11-17: This project is now dead, new development is happening in a
-renamed fork called `pyfloyd`. See
-[https://github.com/dpranke/pyfloyd](https://github.com/dpranke/pyfloyd.git).
**Note the Python package name is `floyd`, not `floyd-python`.
`floyd-python` is the name on PyPI.**

-This empty repo has been tagged with version `0.17.2`.
## Getting set up

1. Install `uv` via whatever system-specific magic you need (e.g.,
   `brew install uv` on a Mac w/ Homebrew).
2. Run `./run devenv` to create a virtualenv at `//.venv` with
   all of the tools needed to do development (and with `floyd` installed
   as an editable Python project).
3. Run `source ./.venv/bin/activate` to activate the environment and pick
   up the tools.

## Running the tests

Get set up as per the above, and then run `./run tests`.

There are other commands to `run` to do other things like lint and
format the code. `./run --help` is your friend to find out more.

## Publishing a version

1. Run `./run build`.
2. Run `./run publish --test` or `./run publish --prod` to upload to PyPI.
   If you pass `--test`, the package will be uploaded to TestPyPI instead
   of the production instance.

## Version History / Release Notes

* v0.16.0 (2024-11-17)
    * Tons of cleanup, move to new syntax. I'm too lazy to document all of
      the changes from v0.15.0 to here. This is intended to be the last
      release of this project under this name; work will continue under
      the name `pyfloyd`.
* v0.15.0 (2024-07-05)
    * Add a JavaScript back end. As a part of this I've changed the GitHub
      CI config to install a specific version of Node so that the new
      JS codepaths are tested properly.
* v0.14.0 (2024-06-30)
    * Move a lot of the logic from python_compiler.py to a new pass in
      analyzer.py. Now python_compiler is very simple and you can start
      to see how it can be converted to something in a DSL like
      StringTemplate. This also will make it much easier to share logic
      between different compilers.
* v0.13 (2024-06-29)
    * Rework the way subrules are generated and named to be much less
      cryptic. Now rules result in methods named `_r_{rule}_` and
      subrules in methods named `_s_{rule}_{i}`, where `i` is one of a
      series of consecutive integers assigned in a depth-first manner.
* v0.12 (2024-06-29)
    * Significantly rework the generated parser so that it is less like
      a set of parser combinators (`self._star()` and so on) and more like
      code that you'd write inline by hand. This appears to produce a
      parser close to 2x faster for 1.33x more code.
* v0.11 (2024-06-22)
    * Rework the API significantly. Now the generated parser API is a single
      `parse()` function and a data type for the return value (`Result`),
      and the public API functions are called `compile()` instead of
      `compile_parser()` and `generate()` instead of `generate_parser()`.
    * Add typing hints to everything.
    * Add lots more documentation of the API.
* v0.10.2 (2024-06-22)
    * Update test to use 79 cols instead of 80.
* v0.10.1 (2024-06-22)
    * Change formatter to default to 79 cols and regen/reformat.
* v0.10.0 (2024-06-22)
    * Clean up compiler code, rework how inlining methods works. At this
      point the compiler code is probably about as reasonably clean and
      fast as the inlining approach can be.
* v0.9.0 (2024-05-19)
    * Get operator expressions working: you can now declare the precedence
      and associativity of different operators in an expression and they
      will be handled correctly. Note that there is a fair amount of
      special-casing logic for this so that only some of the expressions
      you might think would work will actually work. It's also unclear
      how well this will play with memoization.
* v0.8.0 (2024-05-19)
    * Get left association in expressions that are both left- and
      right-recursive working properly.
* v0.7.0 (2024-05-05)
    * Add support for `?{...}` for semantic predicates and `{...}` for
      semantic actions in addition to `?( ... )` and `-> ...`. For now,
      both syntaxes will be supported.
* v0.6.0
    * This version number was skipped.
* v0.5.0 (2024-05-04)
    * Get automatic whitespace and comment insertion working. The
      two collectively are known as "filler".
    * You can now declare "tokens" that can consist of compound expressions
      that will not have filler interleaved.
    * Add support for positional labels in addition to named labels, i.e.
      you can write an expression as `num '+' num -> $1 + $3` in addition
      to `num:l '+' num:r -> l + r`. Both syntaxes will be supported for
      now.
    * Do much more semantic analysis to catch a broader set of errors.
    * Add support for Unicode character classes via `\p{X}`.
    * Turn `--memoize` and `--main` off by default in floyd.tool.
    * Got left recursion working.
    * Added type hints to the API.
* v0.4.0
    * This version number was skipped.
* v0.3.0 (2024-04-02)
    * Changed the interpreter so that it no longer relies on runtime
      compilation.
* v0.2.0 (2024-03-31)
    * 100% test coverage.
    * Code is clean and ready for new work.
    * Add docs/goals.md to describe what I'm hoping to accomplish with
      this project.
    * Add docs/todos.md to capture everything I'm planning to fix or
      change.
* v0.1.0 (2024-03-25)
    * Copy over working code from glop v0.7.0. This copies only the code
      needed to run things, and a couple of grammars that can be used
      for hand-testing things. This does not add any tests, since I'm
      likely going to rework all of that. The code is as close to
      the working glop code as I can keep it, except for updated formatting
      and copyright info. `check` and `lint` are unhappy, the `coverage`
      numbers are terrible, and we probably need to regenerate the floyd
      parser as well.
* v0.0.5 (2024-03-24)
    * There's a pattern forming.
* v0.0.4 (2024-03-24)
    * Actually bump the version this time.
* v0.0.3 (2024-03-24)
    * Fix typos and bugs found after v0.0.2 was tagged :).
* v0.0.2 (2024-03-24)
    * Fix typos found after v0.0.1 was tagged :).
* v0.0.1 (2024-03-24)
    * Initial skeleton of the project uploaded to GitHub. There is nothing
      project-specific about this project except for the name and
      description.
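
To make the generated-parser notes above more concrete (the `parse()`/`Result` API from v0.11, the inline style from v0.12, the `_r_{rule}_`/`_s_{rule}_{i}` naming from v0.13, and the positional labels from v0.5.0), here is a hand-written sketch for the grammar `sum = num '+' num -> $1 + $3`. It is not actual floyd output: the `Result` fields, the `_Parser` class, and the `_fail` helper are assumptions made for illustration only.

```python
from typing import Any, NamedTuple, Optional


class Result(NamedTuple):
    # Hypothetical fields; the release notes only say a `Result` type exists.
    val: Any = None
    err: Optional[str] = None
    pos: int = 0


class _Parser:
    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0
        self.val: Any = None
        self.failed = False

    def _fail(self) -> None:
        self.val = None
        self.failed = True

    def _r_sum_(self) -> None:
        # sum = num '+' num -> $1 + $3
        self._r_num_()
        if self.failed:
            return
        v1 = self.val  # $1
        if self.text[self.pos:self.pos + 1] != '+':
            self._fail()
            return
        self.pos += 1  # $2 is the '+' literal; the action does not use it
        self._r_num_()
        if self.failed:
            return
        v3 = self.val  # $3
        self.val = v1 + v3

    def _r_num_(self) -> None:
        # num = digit+, written as an inline loop rather than a combinator.
        start = self.pos
        self._s_num_1()
        if self.failed:
            return
        while not self.failed:
            self._s_num_1()
        self.failed = False  # running out of digits just ends the loop
        self.val = int(self.text[start:self.pos])

    def _s_num_1(self) -> None:
        # Subrule 1 of `num`: a single ASCII digit.
        ch = self.text[self.pos:self.pos + 1]
        if ch and ch in '0123456789':
            self.pos += 1
        else:
            self._fail()


def parse(text: str) -> Result:
    p = _Parser(text)
    p._r_sum_()
    if p.failed or p.pos != len(text):
        return Result(None, f'syntax error at offset {p.pos}', p.pos)
    return Result(p.val, None, p.pos)


print(parse('12+30'))  # Result(val=42, err=None, pos=5)
```

With named labels (`num:l '+' num:r -> l + r`) the body of `_r_sum_` would be the same; only the local variable names would change.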
44 changes: 44 additions & 0 deletions docs/goals.md
@@ -0,0 +1,44 @@
# The goals I wish to research and/or achieve with Floyd

1. Build a production-ready PEG parser and generator that focuses its
action language on JSON (i.e., JSON objects are returned from the
parse).
2. Have the generated parsers be comparable to what you would write by
hand (i.e., be very readable).
3. If necessary, provide two modes of generated output: clear/simple and
fast.
4. Benchmark the generated grammars against other Python parsing
frameworks' output and hand-written parsers (e.g., the built-in JSON
or TOML parsers).
5. Have the parsers be able to handle a wide range of grammars, including
ones with left recursion and ones that are indentation-sensitive.
6. Potentially add the ability to generate non-PEG grammars, e.g. LL and
LR grammars for suitable languages.
7. Have the action language of the grammars be comparable in power to AWK
so that you can build simple parsing scripts with some of AWK's
utility. This might give us an external DSL.
8. Explore how to generate code without needing any custom Python code
(i.e., have a printing language to match the parsing language).
9. Be able to generate code for multiple languages including perhaps C,
C++, Go, Java, JavaScript, Python, and Rust (and possibly Kotlin,
Swift, and Zig).
10. Use the output of (9) to enable cross-language benchmarking of the
same problem space.
11. Reimplement everything in some of the other languages in (9) so that
we have examples of a non-trivial code base in multiple languages to
compare and benchmark. Ideally the grammars and pretty-printing
descriptions are host-language-independent, so that writing new
grammars and implementing new languages are only N+M problems and not
N*M problems.
12. Be able to use the language implemented for the parsers as a simple
embeddable DSL.
13. Have the language be dynamically typed but suitable for static typing.
14. Define a way to map objects to schemas and host data types so that you
can parse directly to host data structures and not require manual
conversion in the host from JSON.
15. Implement grammars and pretty printers for some of JSON, JSON5, HJSON,
TOML, a simplified subset of YAML, Ninja, GN, HTML, CSS, JavaScript,
Protocol Buffers (text format if possible?).
16. Define a DOM-like interface that can be used to produce round-trippable
versions of parsed docs (i.e., ones that preserve the input including
whitespace and comments).
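
The N+M versus N*M point in goal 11 can be illustrated with a toy sketch: if grammar descriptions are plain data and each back end is a single function over that data, then every (grammar, language) pair falls out for free. Everything here is hypothetical (the `GRAMMARS` table, the two `emit_*` functions, and the "grammar" format, which is just a list of literal alternatives); it says nothing about how Floyd itself represents grammars.

```python
import json

# M host-language-independent "grammars" (toy format: rule -> literals).
GRAMMARS = {
    'bool': ['true', 'false'],
    'null': ['null'],
    'sign': ['+', '-'],
    'digit': list('0123456789'),
}


# N back ends, each written once and usable with every grammar.
def emit_python(rule, alts):
    return f'def parse_{rule}(s):\n    return s in {alts!r}\n'


def emit_javascript(rule, alts):
    body = f'return {json.dumps(alts)}.includes(s);'
    return f'function parse_{rule}(s) {{ {body} }}\n'


BACKENDS = {'py': emit_python, 'js': emit_javascript}


def generate_all():
    """Generate source text for every (rule, language) combination."""
    return {
        (rule, lang): emit(rule, alts)
        for rule, alts in GRAMMARS.items()
        for lang, emit in BACKENDS.items()
    }


outputs = generate_all()
print(len(GRAMMARS) + len(BACKENDS), 'hand-written pieces')  # 6
print(len(outputs), 'generated parsers')                     # 8
```

Adding a fifth grammar or a third back end is one more hand-written piece, while the number of generated parsers grows multiplicatively.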