Integral Evaluation Dataset for LLMs

Description

Recent work on math problem solving with large language models (LLMs) has centered on improving performance through prompting or reprompting, data augmentation, or extending capabilities to multimodal settings. Although many papers evaluate LLMs on math word problems and on comprehensive benchmarks such as GSM8K or the MATH dataset, fewer benchmark models on calculus-specific tasks. This repository aims to address that gap with a robust data creation pipeline. Its main contribution is starter code that can be adapted to generate other math datasets.

Setup

Run python latex_extraction.py -m ./message.txt -p ./prompt.txt -pp /path/to/poppler/Library/bin/ to extract the LaTeX from the MIT Integration Bee PDF at ./integrationbee using Poppler.
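
For readers who want to see how the three flags in the command above fit together, here is a minimal sketch of a command-line parser that accepts them. It is not taken from latex_extraction.py: the long option names (--message, --prompt, --poppler-path), the help strings, and the build_parser function are assumptions made for illustration. Only the short flags -m, -p, and -pp come from the command itself.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags used in the Setup command.

    Only the short flags (-m, -p, -pp) appear in the README; the long
    names and help strings below are hypothetical.
    """
    parser = argparse.ArgumentParser(
        description="Illustrative CLI sketch for a LaTeX extraction script."
    )
    parser.add_argument("-m", "--message", required=True,
                        help="path to the message text file")
    parser.add_argument("-p", "--prompt", required=True,
                        help="path to the prompt text file")
    parser.add_argument("-pp", "--poppler-path", required=True,
                        help="directory containing the Poppler binaries")
    return parser


# Parse the same arguments shown in the Setup command.
args = build_parser().parse_args([
    "-m", "./message.txt",
    "-p", "./prompt.txt",
    "-pp", "/path/to/poppler/Library/bin/",
])
print(args.message, args.prompt, args.poppler_path)
```

Because argparse matches exact option strings first, -p and -pp can coexist without ambiguity.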

Discussion

The current dataset creation pipeline could be improved by extracting LaTeX directly from the PDF when the text is vectorized. However, vision models may be preferable when the math problems in the PDFs are embedded in unstructured content that is harder to extract reliably. The pipeline could also be adapted with an OCR layer to extract problems from scanned PDFs.
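
One way to picture the trade-off described above is a small routing step that decides, page by page, whether to use direct text extraction or fall back to a vision or OCR model. The sketch below is not part of this repository. The choose_extractor function, its thresholds (min_chars, max_garbled_ratio), and the use of the Unicode replacement character as a sign of garbled text are all assumptions chosen for illustration, and the step that obtains the page's text layer is left out.

```python
def choose_extractor(text_layer: str,
                     min_chars: int = 20,
                     max_garbled_ratio: float = 0.1) -> str:
    """Return "direct" if the page's text layer looks usable, else "vision".

    Hypothetical heuristic: both thresholds are arbitrary and would need
    tuning on real PDFs.
    """
    # Ignore whitespace so that blank or near-blank pages are detected.
    visible = [ch for ch in text_layer if not ch.isspace()]
    if len(visible) < min_chars:
        # Little or no text layer: likely a scanned or image-only page.
        return "vision"

    # U+FFFD marks characters that could not be decoded.
    garbled = sum(1 for ch in visible if ch == "\ufffd")
    if garbled / len(visible) > max_garbled_ratio:
        return "vision"

    return "direct"


print(choose_extractor("Evaluate the integral of x^2 dx from 0 to 1"))
print(choose_extractor(""))
```

The first call has enough clean text to be routed to direct extraction, while the empty page falls back to the vision path.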

