ur-whitelab/parser-bench

parser-bench

A dataset and benchmark for constructing file-to-file parsers with code-writing LLMs.

Structure

For each filetype there is one directory with multiple subdirectories:

myfiletype
    meta.json
    implementations
        1
            meta.json
            parser.py
    inputs
        1.your_extension
        2.your_extension
    outputs
        1.json
        2.json

See data/zeopp-sa for an example.
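
To make the layout concrete, here is a minimal sketch that walks one filetype directory and pairs each input with its expected output. It assumes that inputs/N.your_extension corresponds to outputs/N.json, which the tree suggests but does not state outright. The function name `pair_inputs_outputs` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def pair_inputs_outputs(filetype_dir):
    """Return (input_path, output_path) pairs matched by file stem.

    Assumption: inputs/<stem>.<ext> belongs to outputs/<stem>.json.
    """
    root = Path(filetype_dir)
    # Index the expected outputs by stem, e.g. "1" for outputs/1.json.
    outputs = {p.stem: p for p in (root / "outputs").glob("*.json")}
    pairs = []
    for input_path in sorted((root / "inputs").iterdir()):
        if not input_path.is_file():
            continue
        expected = outputs.get(input_path.stem)
        if expected is None:
            raise FileNotFoundError(
                f"no outputs/{input_path.stem}.json for {input_path.name}"
            )
        pairs.append((input_path, expected))
    return pairs
```

Calling `pair_inputs_outputs("data/zeopp-sa")` would then list the input/output pairs of the example entry, provided it follows the layout above.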

FAQ

What do I get from contributing?

Besides helping to advance science, a meaningful contribution (i.e., a merged PR that adds an entry) qualifies you for co-authorship on any paper that might come out of this work.

What languages/packages/frameworks can I use for the example implementation?

Please focus on implementation examples in

  • Python (preferred)
  • JavaScript
  • TypeScript

as our current infrastructure can only test code in these languages.

In the example implementations, please only use the standard libraries and the following additional dependencies:

Python:

JavaScript/TypeScript:

How do I structure the code example?

Please provide the implementation as a function that accepts the file contents as a string and returns the parsed data as a JSON string.
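
As an illustration of that shape, the sketch below parses a made-up "key: value" text format using only the Python standard library. The function name `parse`, the input format, and the field names are all assumptions for this example. The README does not prescribe a function name, so check the existing entry in data/zeopp-sa for the convention actually used.

```python
import json


def parse(file_content: str) -> str:
    """Parse 'key: value' lines (a hypothetical format) into a JSON string."""
    parsed = {}
    for line in file_content.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a separator
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        try:
            parsed[key] = float(value)  # store numeric values as numbers
        except ValueError:
            parsed[key] = value  # keep everything else as text
    return json.dumps(parsed)
```

For example, `parse("density: 1.5\nunit: g/cm^3")` returns a JSON string that decodes to `{"density": 1.5, "unit": "g/cm^3"}`.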

Validating the data

Install the package

pip install -e .

Then run the validation

parserbench.validate_dirs data/
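
The checks performed by `parserbench.validate_dirs` are not described here. Independently of that tool, the following sketch shows one way to compare your own parser with the expected outputs before opening a PR. The helper name `find_mismatches` is hypothetical, and the sketch assumes that inputs/N.your_extension pairs with outputs/N.json and that the parser takes and returns strings.

```python
import json
from pathlib import Path


def find_mismatches(parse, filetype_dir):
    """Return names of input files whose parsed result differs from the expected JSON."""
    root = Path(filetype_dir)
    mismatches = []
    for input_path in sorted((root / "inputs").iterdir()):
        expected_path = root / "outputs" / f"{input_path.stem}.json"
        expected = json.loads(expected_path.read_text())
        # Compare decoded objects rather than raw strings, so differences in
        # key order or whitespace do not count as failures.
        actual = json.loads(parse(input_path.read_text()))
        if actual != expected:
            mismatches.append(input_path.name)
    return mismatches
```

An empty list means every input produced the expected JSON under these assumptions.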

Related projects
