The Smart Document Router is an open-source Document Understanding and Data Extraction tool. It is like LlamaCloud, but for enterprises using ERP systems.
- It ingests unstructured docs from faxes, email, and ERPs.
- It selects streams of docs that can be processed autonomously with LLMs and NLP, reducing the need for time-consuming, expensive manual workflows.
The Document Router is designed to work with a human in the loop and can process financial data correctly, 'on the nose'. (We are not doing RAG!)
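
As a rough illustration of the human-in-the-loop idea, here is a minimal sketch of how extracted financial fields could be checked exactly before a document is allowed onto an autonomous stream. It is not DocRouter's actual code: the `InvoiceFields` schema, the `route` function, and the stream names are hypothetical. It uses only standard-library dataclasses and `Decimal` to stay dependency-free, where the real project lists Pydantic for schemas.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass
class InvoiceFields:
    """Hypothetical schema for fields extracted from one invoice."""
    invoice_number: str
    subtotal: Decimal
    tax: Decimal
    total: Decimal


def parse_fields(raw: dict) -> Optional[InvoiceFields]:
    """Convert raw extraction output into typed fields.

    Returns None when a field is missing or is not a valid decimal.
    """
    try:
        return InvoiceFields(
            invoice_number=str(raw["invoice_number"]),
            subtotal=Decimal(str(raw["subtotal"])),
            tax=Decimal(str(raw["tax"])),
            total=Decimal(str(raw["total"])),
        )
    except (KeyError, InvalidOperation):
        return None


def route(raw: dict) -> str:
    """Pick a stream: 'autonomous' only if the amounts reconcile exactly."""
    fields = parse_fields(raw)
    if fields is None:
        return "manual_review"
    # Decimal arithmetic is exact, so this check has no float rounding slack.
    if fields.subtotal + fields.tax != fields.total:
        return "manual_review"
    return "autonomous"
```

Calling `route` on the parsed output of an LLM extraction would send any document whose amounts fail to reconcile to a person, while documents that pass go through without manual handling.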
- NextJS, NextAuth, MaterialUI, TailwindCSS
- FastAPI
- MongoDB
- Pydantic
- LiteLLM
- OpenAI, Anthropic, Gemini, Groq/DeepSeek...
The PyData Boston DocRouter Slides (Feb '24) have more details about the tech stack and how Cursor AI was used to build the DocRouter.
- Smart Document Router Slides from Boston PyData, Spring 2025
- DocRouter.AI: Adventures in CSS and AI Coding, Summer 2025
- Installation
- Development