stephenc222/example-pydantic-ai-multi-modal

Pydantic AI Example: Invoice Processing Agent

This project demonstrates how to build an AI-powered invoice processing agent using Pydantic AI, showcasing type-safe AI interactions with structured inputs and outputs.

Overview

This example implements an invoice processing system that:

  • Extracts total amounts from invoice images
  • Uses OpenAI's multimodal LLM capabilities
  • Provides structured, type-safe outputs
  • Includes comprehensive test coverage

Key Features

  • Type-safe AI interactions using Pydantic models
  • Structured dependency injection with dataclasses
  • Async support for API operations
  • OpenAI GPT-4 Vision integration
  • Tool-augmented AI agent capabilities

Code Structure

The main components include:

  • MultimodalLLMService: Service for interacting with OpenAI's vision models
  • InvoiceProcessingDependencies: Dependency container for the AI agent
  • InvoiceExtractionResult: Structured output model for extracted data
  • Custom tools for processing invoice images

Prerequisites

Before running this project, make sure you have:

  • Python 3.9 or higher (Pydantic AI does not support Python 3.8)
  • An OpenAI API key
  • Required Python packages (listed in requirements.txt)

To install the required packages, run:

pip install -r requirements.txt

Running the Project

To run the project, use the following command:

python3 app.py
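Because the app needs an OpenAI API key, a missing key is a likely cause of a failed first run. The sketch below shows one way to fail early with a clear message. It assumes the key is supplied through the `OPENAI_API_KEY` environment variable, which the OpenAI Python SDK reads by default; whether `app.py` loads the key this way is not stated here, and `require_api_key` is a hypothetical helper.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running app.py")
    return key
```

Calling `require_api_key()` at the top of a script turns an authentication failure deep inside an API call into an immediate, readable error.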

Testing

To run the tests, use the following command:

pytest
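To give a sense of what a test can look like when dependencies are injected, here is a small sketch that swaps the vision service for a test double so no API call is made. All names in it (`FakeLLMService`, `Deps`, `process_invoice`, `extract_total`) are hypothetical and do not come from the repository's test suite.

```python
import asyncio
from dataclasses import dataclass


class FakeLLMService:
    """Test double that records calls and returns a fixed total."""

    def __init__(self, total):
        self.total = total
        self.calls = []

    async def extract_total(self, image_path):
        self.calls.append(image_path)
        return self.total


@dataclass
class Deps:
    """Minimal dependency container for the test."""

    llm_service: FakeLLMService
    invoice_image_path: str


async def process_invoice(deps):
    """Code under test: delegates the extraction to the injected service."""
    return await deps.llm_service.extract_total(deps.invoice_image_path)


def test_process_invoice_uses_injected_service():
    fake = FakeLLMService(total=99.95)
    deps = Deps(llm_service=fake, invoice_image_path="invoice.png")
    total = asyncio.run(process_invoice(deps))
    assert total == 99.95
    assert fake.calls == ["invoice.png"]
```

pytest collects any function whose name starts with `test_`, so a file containing this sketch would run with the plain `pytest` command above.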

About

Example project demonstrating how to use multimodal LLMs as a tool for PydanticAI applications
