This project demonstrates how to build an AI-powered invoice processing agent using Pydantic AI, showcasing type-safe AI interactions with structured inputs and outputs.
This example implements an invoice processing system that:
- Extracts total amounts from invoice images
- Uses OpenAI's multimodal LLM capabilities
- Provides structured, type-safe outputs
- Includes comprehensive test coverage
Key features:
- Type-safe AI interactions using Pydantic models
- Structured dependency injection with dataclasses
- Async support for API operations
- OpenAI GPT-4 Vision integration
- Tool-augmented AI agent capabilities
The main components include:
- `MultimodalLLMService`: Service for interacting with OpenAI's vision models
- `InvoiceProcessingDependencies`: Dependency container for the AI agent
- `InvoiceExtractionResult`: Structured output model for extracted data
- Custom tools for processing invoice images
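As a rough illustration of how these components might fit together, here is a minimal sketch. Only the three class names come from this project; the field names (`total_amount`, `currency`), the `extract_total` method, the `process_invoice` helper, and the canned response are assumptions made for the example. The service is a stand-in that does not call OpenAI, and the sketch uses plain Pydantic and dataclasses rather than the Pydantic AI agent API.

```python
import asyncio
from dataclasses import dataclass

from pydantic import BaseModel, Field, ValidationError


class InvoiceExtractionResult(BaseModel):
    """Structured output model. The fields are assumed for illustration."""

    total_amount: float = Field(ge=0, description="Invoice total")
    currency: str = Field(default="USD", description="Currency code (assumed field)")


class MultimodalLLMService:
    """Stand-in service: returns a canned answer instead of calling OpenAI."""

    async def extract_total(self, image_path: str) -> dict:
        # A real implementation would send the image to a vision model here.
        return {"total_amount": 123.45, "currency": "USD"}


@dataclass
class InvoiceProcessingDependencies:
    """Dependency container handed to the agent and its tools."""

    llm_service: MultimodalLLMService


async def process_invoice(
    deps: InvoiceProcessingDependencies, image_path: str
) -> InvoiceExtractionResult:
    # Hypothetical helper: fetch raw data, then validate it into the typed model.
    raw = await deps.llm_service.extract_total(image_path)
    return InvoiceExtractionResult(**raw)


deps = InvoiceProcessingDependencies(llm_service=MultimodalLLMService())
result = asyncio.run(process_invoice(deps, "invoice.png"))
print(result.total_amount, result.currency)
```

Because the result is a Pydantic model, malformed data (for example a negative total) raises `ValidationError` instead of passing silently, which is the main benefit of a typed output model.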
Before running this project, ensure you have the following prerequisites:
- Python 3.9 or higher (Pydantic AI does not support Python 3.8)
- An OpenAI API key
- Required Python packages (listed in `requirements.txt`)
To install the required packages, run:
pip install -r requirements.txt
To run the project, use the following command:
python3 app.py
To run the tests, use the following command:
pytest
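For a sense of what a test in this style could look like, the sketch below replaces the OpenAI-backed service with a test double so that no API key or network access is needed. `FakeLLMService` and `extract_invoice_total` are hypothetical names invented for this example; the project's actual tests and entry points may look different.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeLLMService:
    """Hypothetical test double that replaces the OpenAI-backed service."""

    canned_total: float

    async def extract_total(self, image_path: str) -> float:
        # Return the preset value instead of calling a vision model.
        return self.canned_total


async def extract_invoice_total(service, image_path: str) -> float:
    """Hypothetical function under test; the real entry point may differ."""
    total = await service.extract_total(image_path)
    if total < 0:
        raise ValueError("total must be non-negative")
    return round(total, 2)


def test_extract_invoice_total_returns_rounded_amount():
    service = FakeLLMService(canned_total=99.999)
    total = asyncio.run(extract_invoice_total(service, "invoice.png"))
    assert total == 100.0


def test_extract_invoice_total_rejects_negative_amount():
    service = FakeLLMService(canned_total=-1.0)
    try:
        asyncio.run(extract_invoice_total(service, "invoice.png"))
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a negative total")
```

Saved in a file whose name starts with `test_`, these functions would be collected by `pytest` automatically.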