Ask about movies by genre, actors, plot summaries, or reviews, just like chatting with a friend.
Traditional movie search is broken. You know the feeling:
- "I want something like Inception but not sci-fi"
- "Show me action movies but make them smart"
- "Find me movies with that actor from that thing"
- Conversational Movie Search - Ask natural questions about movies by genre, cast, plot, or reviews.
- Smart Recommendations - Get personalized suggestions based on your interests using state-of-the-art AI.
- Instant Movie Info - Instantly see ratings, summaries, and reviews, with no more manual searches.
- Multi-Agent Architecture - Orchestrated agents for retrieval, recommendations, and external data integration.
- Vector-Powered Search - FAISS-based similarity search with OpenAI embeddings (no heavy downloads required).
- Lightning Fast Setup - No CUDA libraries or large ML models; just install and go!
This system gets it. Instead of matching keywords, it compares the meaning of your question against movie descriptions, so it can pick up on context and relationships between movies. Ask naturally, get relevant results.
- RAG Architecture - Combines retrieval with generation for nuanced responses
- Vector Similarity - Finds movies based on meaning, not just keywords
- Multi-Agent System - Specialized AI agents work together for complex queries
- No Setup Hell - Lightweight, fast, and CUDA-free
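To make "meaning, not just keywords" concrete, here is a minimal sketch of similarity search over embeddings. It uses plain Python and made-up 3-dimensional vectors; the project itself uses FAISS with OpenAI embeddings, so the function names and toy data below are illustrative assumptions, not the project's code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the k titles whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in index.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]

# Toy "embeddings", invented for illustration only.
toy_index = {
    "Heist thriller": [0.9, 0.1, 0.0],
    "Space opera": [0.1, 0.9, 0.1],
    "Romantic comedy": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # hypothetical embedding of "a clever heist movie"
print(top_k(query, toy_index, k=2))
```

The query shares no words with the titles; it ranks "Heist thriller" first only because the vectors point in a similar direction.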
Technology | Purpose | Version |
---|---|---|
Python | Core Language | 3.8+ |
OpenAI GPT-3.5 | Language Generation | Latest |
LangChain | LLM Framework | 0.3+ |
FAISS | Vector Similarity Search | CPU Version (No CUDA) |
OpenAI Embeddings | Text Embeddings | text-embedding-3-small |
Gradio | Web UI Framework | 5.0+ |
Pandas | Data Processing | Latest |
Jupyter | Interactive Development | Latest |
Get up and running in under 2 minutes!
# 1. Clone the repository
git clone <repository-url>
cd rag-movie-rec
# 2. Create virtual environment (recommended)
python3 -m venv movie-env
source movie-env/bin/activate # On Windows: movie-env\Scripts\activate
# 3. Install dependencies (lightweight, no CUDA!)
pip install -r requirements.txt
# 4. Set up your OpenAI API key
export OPENAI_API_KEY="your-openai-api-key"
# 5. Build the vector store
python main.py --mode build
# 6. Launch the web UI
python main.py --mode ui
Pro tip: Get your OpenAI API key at platform.openai.com/api-keys
# 1. Install dependencies
pip install -r requirements.txt
# 2. Launch Jupyter
jupyter notebook
# 3. Open and run notebooks/MovieFinder_Main.ipynb
graph TB
A[User Query] --> B[Vector Search]
B --> C[FAISS Index]
C --> D[Similar Movies]
D --> E[RAG Pipeline]
E --> F[GPT-3.5 Turbo]
F --> G[Personalized Response]
H[Movie Dataset] --> I[Text Chunking]
I --> J[OpenAI Embeddings]
J --> K[Embeddings]
K --> C
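The query path in the diagram (similar movies, then RAG pipeline, then GPT-3.5 Turbo) comes down to placing retrieved text into the prompt. The sketch below shows one way that step might look; `build_prompt`, the prompt wording, and the sample chunks are assumptions for illustration, not the project's actual pipeline.

```python
def build_prompt(question, retrieved_chunks, max_chunks=3):
    """Assemble a prompt that grounds the model in retrieved movie text."""
    context = "\n".join(
        f"[{i}] {chunk}"
        for i, chunk in enumerate(retrieved_chunks[:max_chunks], start=1)
    )
    return (
        "Answer the question using only the movie information below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Placeholder chunks standing in for vector-search results.
chunks = [
    "Movie A is a 2003 action film rated 7.9.",
    "Movie B is a 2006 action thriller rated 8.1.",
]
prompt = build_prompt("Good action movies from the 2000s?", chunks)
print(prompt)
```

Capping the number of chunks keeps the prompt within the model's context window and limits cost per query.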
- Data Ingestion - Load and clean the IMDb movie dataset with ratings, cast, genres, and descriptions.
- Description Generation - Create a natural language description for each movie, combining all metadata.
- Text Chunking - Split descriptions into overlapping chunks for better retrieval granularity.
- Embedding Creation - Convert text chunks to high-dimensional vectors using OpenAI's text-embedding-3-small model.
- Vector Store Building - Build a FAISS index for lightning-fast similarity search across 7,000+ chunks.
- Query Processing - Convert user queries to embeddings and find the most similar movie content.
- AI Generation - Use retrieved context with GPT-3.5-turbo to generate personalized responses.
- User Interface - Present results through an intuitive Gradio web interface or CLI.
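The description-generation and text-chunking steps above can be sketched in a few lines. This is a character-based illustration under assumed field names (`title`, `year`, `genre`, `cast`, `rating`, `plot`) and assumed default sizes; the project's real chunker and settings live in its own modules and may differ.

```python
def build_description(movie):
    """Combine metadata fields into one natural-language passage."""
    return (
        f"{movie['title']} ({movie['year']}) is a {movie['genre']} film "
        f"starring {', '.join(movie['cast'])}. Rated {movie['rating']}/10. "
        f"{movie['plot']}"
    )

def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

# Hypothetical record, not taken from the dataset.
movie = {
    "title": "Example Movie", "year": 2001, "genre": "drama",
    "cast": ["Actor One", "Actor Two"], "rating": 7.5,
    "plot": "Two strangers meet on a train and change each other's lives.",
}
description = build_description(movie)
for chunk in chunk_text(description, chunk_size=80, overlap=20):
    print(repr(chunk))
```

A larger overlap improves recall at chunk boundaries but increases the number of vectors to embed and store.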
Launch the web UI and try these queries:
- "Recommend documentaries about famous people"
- "What are some good action movies from the 2000s?"
- "Movies similar to The Lord of the Rings"
- "Comedy movies with high IMDb ratings"
# Interactive CLI mode
python main.py --mode cli
# Build/rebuild vector store
python main.py --mode build
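For readers curious how a launcher with `--mode build`, `--mode ui`, and `--mode cli` might be wired, here is a small `argparse` sketch. The three mode names come from the commands in this README; the default value, the `dispatch` helper, and the placeholder actions are assumptions, and the real `main.py` may be structured differently.

```python
import argparse

MODES = ("build", "ui", "cli")  # the three modes shown in this README

def parse_args(argv=None):
    """Parse the --mode flag; defaulting to "ui" is an assumption."""
    parser = argparse.ArgumentParser(description="Movie recommender launcher (sketch)")
    parser.add_argument(
        "--mode",
        choices=MODES,
        default="ui",
        help="build the vector store, launch the web UI, or start the CLI",
    )
    return parser.parse_args(argv)

def dispatch(mode):
    """Map a mode to a placeholder action; real handlers would go here."""
    actions = {
        "build": "building vector store",
        "ui": "launching web UI",
        "cli": "starting interactive CLI",
    }
    return actions[mode]

args = parse_args(["--mode", "build"])
print(dispatch(args.mode))
```

Using `choices` makes argparse reject unknown modes with a usage message instead of failing later inside the application.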
from src.movie_recommender.main import MovieRecommenderApp
app = MovieRecommenderApp()
app.setup_agents()
result = app.orchestrator.process_query("Sci-fi movies with robots")
print(result)
rag-movie-rec/
├── src/movie_recommender/            # Core application modules
│   ├── __init__.py
│   ├── config.py                     # Configuration management
│   ├── data_processor.py             # Data processing utilities
│   ├── vector_store.py               # FAISS vector operations
│   ├── agents.py                     # Multi-agent orchestration
│   ├── rag_pipeline.py               # RAG implementation
│   ├── ui.py                         # Gradio interface
│   └── main.py                       # Application entry point
├── notebooks/                        # Jupyter development notebooks
│   ├── MovieFinder_Main.ipynb        # Primary development notebook
│   └── MovieFinder_Supplemental.ipynb
├── scripts/                          # Data preparation scripts
│   ├── SetupData.py                  # Dataset downloading & merging
│   ├── format_movie.py               # Cast column formatting
│   └── ...
├── tests/                            # Test suite
├── data/                             # Generated data files
├── main.py                           # Application launcher
├── requirements.txt                  # Python dependencies
└── README.md                         # This file
# Install development dependencies
pip install -r requirements.txt
# Install in editable mode
pip install -e .
# Run tests
python -m pytest tests/
# Format code
black src/ tests/
# Lint code
flake8 src/ tests/
# 1. Download and prepare dataset (requires Kaggle API)
python scripts/SetupData.py
# 2. Format cast columns
python scripts/format_movie.py
# 3. Build vector store
python main.py --mode build
# Run all tests
python -m pytest tests/ -v
# Run specific test categories
python -m pytest tests/test_vector_store.py -v
python -m pytest tests/test_agents.py -v
# Run with coverage
python -m pytest tests/ --cov=src/movie_recommender --cov-report=html
# Required
export OPENAI_API_KEY="your-openai-api-key"
# Optional
export OMDB_API_KEY="your-omdb-api-key"
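A startup check can surface a missing key before any API call fails. The sketch below treats `OPENAI_API_KEY` as required and `OMDB_API_KEY` as optional, as listed above; the `check_environment` helper is hypothetical and not part of the project.

```python
import os

REQUIRED = ("OPENAI_API_KEY",)
OPTIONAL = ("OMDB_API_KEY",)

def check_environment(env=None):
    """Raise if a required variable is unset; report which optional ones are set."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variable(s): " + ", ".join(missing)
        )
    return {name: bool(env.get(name)) for name in OPTIONAL}

# Demonstrated with an explicit mapping so the example runs without real keys.
print(check_environment({"OPENAI_API_KEY": "your-openai-api-key"}))
```

Calling `check_environment()` with no argument reads the real process environment instead.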
Edit src/movie_recommender/config.py to customize:
- Model parameters (temperature, max tokens)
- Chunk sizes and overlap
- Vector store settings
- UI configuration
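As a rough picture of what such settings could look like, here is a dataclass sketch covering the four categories above. Every field name and default value is a placeholder chosen for illustration; check `config.py` for the real names and values.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AppConfig:
    # Model parameters (hypothetical defaults)
    temperature: float = 0.7
    max_tokens: int = 500
    # Chunk sizes and overlap, in characters (hypothetical defaults)
    chunk_size: int = 500
    chunk_overlap: int = 50
    # Vector store settings (hypothetical default)
    top_k: int = 5
    # UI configuration (hypothetical default port)
    server_port: int = 7860

    def __post_init__(self):
        # Catch settings that would break chunking or generation early.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be smaller than chunk_size")

default_config = AppConfig()
factual_config = replace(default_config, temperature=0.2)  # lower temperature, less variation
print(factual_config)
```

Keeping the config immutable and validating on construction means a bad value fails at startup rather than midway through building the vector store.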
- Dataset: 3,653 movies with 7,225 text chunks
- Vector Search: Sub-second similarity queries
- Install Footprint: ~100MB total installation (no heavy ML libraries)
- Embedding Model: 1,536-dimensional vectors (OpenAI text-embedding-3-small)
- Setup Time: Under 2 minutes vs 30+ minutes with local models
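The figures above allow a quick back-of-the-envelope estimate of the raw vector data. The calculation assumes every vector is stored uncompressed as 32-bit floats, as in a flat FAISS index; the project's actual index type is not stated here, so treat the result as an estimate.

```python
NUM_MOVIES = 3_653
NUM_CHUNKS = 7_225
EMBEDDING_DIM = 1_536
BYTES_PER_FLOAT32 = 4

def flat_index_bytes(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for num_vectors uncompressed vectors of the given dimension."""
    return num_vectors * dim * bytes_per_value

size = flat_index_bytes(NUM_CHUNKS, EMBEDDING_DIM)
print(f"{size:,} bytes ~ {size / 2**20:.1f} MiB")
print(f"{NUM_CHUNKS / NUM_MOVIES:.2f} chunks per movie on average")
```

Under that assumption the vectors take 44,390,400 bytes, about 42.3 MiB, with roughly two chunks per movie.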
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for GPT-3.5-turbo language model
- Meta AI for FAISS vector search library
- Hugging Face for Sentence Transformers
- LangChain for LLM orchestration framework
- Gradio for the beautiful web interface
- IMDb for the movie dataset
Ready to roll? Let's build a smarter way to search for movies.