PDF Layout Annotation Tool
A simple, self-hosted, web-based app that allows you to annotate the layouts of PDF files to create custom datasets.
PDFLAT is powered by a SvelteKit frontend, a FastAPI backend, and a PostgreSQL database. For easy setup, consistency, and portability across environments, the application is fully dockerized.
- clone the repository
- make sure your ports 1337, 5173, and 5432 are unoccupied (or modify the configuration if needed)
- run ./start.sh (might require admin rights => sudo ./start.sh; this will take quite a while, don't worry, it's normal)
- create your datasets in the web interface on port 5173
- upload PDF files for your datasets
- annotate pages
- use the API via port 1337 to export datasets for subsequent tasks
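
Before running the start script, it can help to confirm that the three ports from the setup steps are really free. The following is a minimal sketch and not part of PDFLAT: the function names are hypothetical, the service labels are inferred from the steps above (5432 is only presumed to be PostgreSQL because it is that database's default port), and it assumes the services will be reachable on 127.0.0.1.

```python
import socket

# Ports named in the setup steps. The labels are inferred from this README,
# not read from PDFLAT's configuration.
PDFLAT_PORTS = {
    1337: "API",
    5173: "web frontend",
    5432: "PostgreSQL (assumed, default port)",
}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def occupied_ports(ports, host: str = "127.0.0.1") -> list:
    """Return the subset of ports that already have a listener."""
    return [port for port in ports if not port_is_free(port, host)]


if __name__ == "__main__":
    busy = occupied_ports(PDFLAT_PORTS)
    if busy:
        for port in busy:
            print(f"port {port} ({PDFLAT_PORTS[port]}) is already in use")
    else:
        print("all required ports look free")
```

This only detects listeners reachable on the given host, so a service bound to a different network interface could go unnoticed. If a port is reported as busy, either stop the conflicting service or change the port in the configuration, as mentioned in the steps above.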
If you use PDFLAT while creating any (published) work, please cite this repository, and feel free to drop me a message so I can see what you are working on :)
Please note that PDFLAT is in early beta: it still lacks proper documentation and some useful features. Feel free to open a pull request if you improve it.