🤖 Chat with PDF locally using Ollama + LangChain

A powerful local RAG (Retrieval Augmented Generation) application that lets you chat with your PDF documents using Ollama and LangChain. This project includes both a Jupyter notebook for experimentation and a Streamlit web interface for easy interaction.

Project Structure

ollama_pdf_rag/
├── src/                      # Source code
│   ├── app/                  # Streamlit application
│   │   ├── components/       # UI components
│   │   │   ├── chat.py      # Chat interface
│   │   │   ├── pdf_viewer.py # PDF display
│   │   │   └── sidebar.py   # Sidebar controls
│   │   └── main.py          # Main app
│   └── core/                 # Core functionality
│       ├── document.py       # Document processing
│       ├── embeddings.py     # Vector embeddings
│       ├── llm.py           # LLM setup
│       └── rag.py           # RAG pipeline
├── data/                     # Data storage
│   ├── pdfs/                # PDF storage
│   │   └── sample/          # Sample PDFs
│   └── vectors/             # Vector DB storage
├── notebooks/               # Jupyter notebooks
│   └── experiments/         # Experimental notebooks
├── tests/                   # Unit tests
├── docs/                    # Documentation
└── run.py                   # Application runner

📺 Video Tutorial

Watch the video

✨ Features

  • 🔒 Fully local processing - no data leaves your machine
  • 📄 PDF processing with intelligent chunking
  • 🧠 Multi-query retrieval for better context understanding
  • 🎯 Advanced RAG implementation using LangChain
  • 🖥️ Clean Streamlit interface
  • 📓 Jupyter notebook for experimentation

🚀 Getting Started

Prerequisites

  1. Install Ollama

    • Visit Ollama's website (https://ollama.com) to download and install it
    • Pull required models (a quick verification sketch follows these setup steps):
      ollama pull llama3.2  # or your preferred model
      ollama pull nomic-embed-text
  2. Clone Repository

    git clone https://github.com/tonykipkemboi/ollama_pdf_rag.git
    cd ollama_pdf_rag
  3. Set Up Environment

    python -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate
    pip install -r requirements.txt

    Key dependencies and their versions:

    ollama==0.4.4
    streamlit==1.40.0
    pdfplumber==0.11.4
    langchain==0.1.20
    langchain-core==0.1.53
    langchain-ollama==0.0.2
    chromadb==0.4.22
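
    Before moving on, you can optionally confirm that the local setup responds end to end. This is a minimal sketch using the ollama Python client pinned above; the model names match the pull commands in step 1, and exact response fields can differ slightly between client versions.

import ollama

# The list of locally available models should include llama3.2 and nomic-embed-text
print(ollama.list())

# Round-trip a short prompt through the chat model
reply = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(reply["message"]["content"])

# Embed a short string with the embedding model used for retrieval
emb = ollama.embeddings(model="nomic-embed-text", prompt="hello world")
print(len(emb["embedding"]), "embedding dimensions")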

🎮 Running the Application

Option 1: Streamlit Interface

python run.py

Then open your browser to http://localhost:8501

Screenshot: Streamlit interface showing the PDF viewer and chat functionality

Option 2: Jupyter Notebook

jupyter notebook

Open updated_rag_notebook.ipynb to experiment with the code
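
The notebook walks through the same flow the app uses: load a PDF, split it into chunks, embed the chunks into a local Chroma store, then answer questions with multi-query retrieval. The snippet below is a condensed sketch of that pipeline built on the pinned packages above; the loader choice, file path, prompt wording, and collection name are illustrative assumptions rather than the notebook's exact code.

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_ollama import ChatOllama, OllamaEmbeddings

# 1. Load the PDF and split it into overlapping chunks (path is a placeholder)
docs = PDFPlumberLoader("data/pdfs/sample/example.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200).split_documents(docs)

# 2. Embed the chunks into a local Chroma collection using nomic-embed-text
vector_db = Chroma.from_documents(
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    collection_name="local-rag",
)

# 3. Multi-query retrieval: the LLM rewrites the question several ways and
#    the combined results become the context
llm = ChatOllama(model="llama3.2")
retriever = MultiQueryRetriever.from_llm(retriever=vector_db.as_retriever(), llm=llm)

def format_docs(retrieved):
    return "\n\n".join(doc.page_content for doc in retrieved)

# 4. Answer strictly from the retrieved context
prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the following context:\n{context}\n\nQuestion: {question}"
)
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("What is this document about?"))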

💡 Usage Tips

  1. Upload PDF: Use the file uploader in the Streamlit interface or try the sample PDF
  2. Select Model: Choose from your locally available Ollama models
  3. Ask Questions: Start chatting with your PDF through the chat interface
  4. Adjust Display: Use the zoom slider to adjust PDF visibility
  5. Clean Up: Use the "Delete Collection" button when switching documents (see the sketch below for what this does)
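
Behind the scenes, deleting the collection drops the Chroma store that holds the previous document's chunks so they cannot bleed into answers about the next PDF. A minimal sketch of that cleanup, where the collection name and embedding model are assumptions matching the pipeline sketch above:

from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings

# Reattach to the existing collection, then drop its stored chunks
vector_db = Chroma(
    collection_name="local-rag",
    embedding_function=OllamaEmbeddings(model="nomic-embed-text"),
)
vector_db.delete_collection()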

🤝 Contributing

Feel free to:

  • Open issues for bugs or suggestions
  • Submit pull requests
  • Comment on the YouTube video for questions
  • Star the repository if you find it useful!

⚠️ Troubleshooting

  • Ensure Ollama is running in the background
  • Check that required models are downloaded
  • Verify Python environment is activated
  • For Windows users, ensure WSL2 is properly configured if using Ollama

Common Errors

ONNX DLL Error

If you encounter this error:

DLL load failed while importing onnx_copy2py_export: a dynamic link Library (DLL) initialization routine failed.

Try these solutions:

  1. Install the Microsoft Visual C++ Redistributable from Microsoft's website.

  2. If the error persists, try installing ONNX Runtime manually:

    pip uninstall onnxruntime onnxruntime-gpu
    pip install onnxruntime

CPU-Only Systems

If you're running on a CPU-only system:

  1. Ensure you have the CPU version of ONNX Runtime:

    pip uninstall onnxruntime-gpu  # Remove GPU version if installed
    pip install onnxruntime  # Install CPU-only version
  2. You may need to modify the chunk size in the code to prevent memory issues (a sketch follows this list):

    • Reduce chunk_size to 500-1000 if you experience memory problems
    • Increase chunk_overlap for better context preservation
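
For example, a more CPU-friendly splitter configuration might look like this; the values are a starting point to tune, not the project's defaults:

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PDFPlumberLoader

docs = PDFPlumberLoader("data/pdfs/sample/example.pdf").load()  # placeholder path

# Smaller chunks lower peak memory while embedding on CPU; a proportionally
# larger overlap helps preserve context across chunk boundaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=750, chunk_overlap=150)
chunks = splitter.split_documents(docs)
print(f"Split into {len(chunks)} chunks")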

Note: The application will run slower on CPU-only systems, but it will still work effectively.

🧪 Testing

Running Tests

# Run all tests
python -m unittest discover tests

# Run tests verbosely
python -m unittest discover tests -v

Pre-commit Hooks

The project uses pre-commit hooks to ensure code quality. To set up:

pip install pre-commit
pre-commit install

This will:

  • Run tests before each commit
  • Run linting checks
  • Ensure code quality standards are met

Continuous Integration

The project uses GitHub Actions for CI. On every push and pull request:

  • Tests are run on multiple Python versions (3.9, 3.10, 3.11)
  • Dependencies are installed
  • Ollama models are pulled
  • Test results are uploaded as artifacts

📝 License

This project is open source and available under the MIT License.


⭐️ Star History

Star History Chart

Built with ❤️ by Tony Kipkemboi!

Follow me on X | LinkedIn | YouTube | GitHub