A Streamlit-based application for searching and retrieving information from a knowledge base built from PDF documents using Retrieval-Augmented Generation (RAG) techniques. Supports both text and image (diagram) queries.
- Multimodal Search: Query both text and images (e.g., diagrams) within PDF documents.
- Streamlit UI: Simple web interface for entering queries and viewing results.
- Knowledge Base Construction: Extracts and indexes content from PDFs for efficient retrieval.
- Extensible: Easily add more documents or customize the pipeline.
git clone https://github.com/yourusername/multimodal-rag-search.git
cd multimodal-rag-search
Place PDF files in the data/sample_reports/
directory.
Example:
mkdir -p data/sample_reports
wget -O data/sample_reports/apple_10k_2023.pdf https://www.apple.com/investor/static/pdf/10-K_2023.pdf
python src/build_kb.py
streamlit run app.py
docker build -t multimodal-rag-search .
docker run -p 8501:8501 multimodal-rag-search
- Open your browser and go to http://localhost:8501.
- Enter a query (e.g., "technical diagram of report" or "Apple revenue analysis").
- View top results, including extracted text and images from your PDFs.
├── app.py # Streamlit app entry point
├── src/
│ ├── build_kb.py # Script to build the knowledge base from PDFs
│ ├── search.py # Search logic
│ └── utils.py # Utility functions
├── data/
│ └── sample_reports/ # Directory for sample PDF files
├── requirements.txt # Python dependencies
├── Dockerfile # Docker setup
- Python 3.10+
- See
requirements.txt
for dependencies
This project is for educational and research purposes.