A medical chatbot that answers user queries by retrieving relevant information from a collection of medical documents using LangChain, Pinecone, and Meta's Llama2 model.
The medical chatbot project employs a multi-step approach to deliver accurate and relevant responses to user queries:

**Data Ingestion and Processing**
- The project begins by ingesting and processing a collection of medical PDFs.
- Text is extracted from these PDFs and split into manageable chunks.
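
A minimal sketch of this ingestion step, assuming a classic (pre-0.1) LangChain API and a local `data/` directory of PDFs (the path, glob, and chunk sizes are assumptions, not taken from the repo):

```python
from langchain.document_loaders import DirectoryLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def load_and_split_pdfs(data_dir="data/"):
    # Load every PDF in the directory into LangChain Document objects.
    loader = DirectoryLoader(data_dir, glob="*.pdf", loader_cls=PyPDFLoader)
    documents = loader.load()
    # Split into small overlapping chunks so each one fits the embedding model.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
    return splitter.split_documents(documents)

text_chunks = load_and_split_pdfs()
```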

**Embedding Generation**
- The extracted text chunks are embedded with a pre-trained HuggingFace model to capture their semantic meaning.
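
A sketch of the embedding setup; the specific model name (`sentence-transformers/all-MiniLM-L6-v2`) is a common choice for this kind of project and is an assumption here:

```python
from langchain.embeddings import HuggingFaceEmbeddings

# Assumed model; any sentence-transformers model from HuggingFace works the same way.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Each chunk of text maps to a dense vector (384 dimensions for this model).
query_vector = embeddings.embed_query("What are the symptoms of anemia?")
```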

**Storage in Pinecone**
- The generated embeddings are stored in Pinecone for efficient and scalable vector search.
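
A sketch of the indexing step, assuming the v2 `pinecone-client` API (`pinecone.init`) and an existing index; the index name and environment are placeholders, not values from the repo:

```python
import os
import pinecone
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment="us-east-1",  # placeholder; use your Pinecone project's environment
)

# Embed each chunk and upsert it into the index in one call.
docsearch = Pinecone.from_texts(
    [chunk.page_content for chunk in text_chunks],  # chunks from the ingestion step
    embeddings,
    index_name="medical-chatbot",  # placeholder index name
)
```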

**Response Generation**
- For generating responses, the chatbot uses Meta's Llama2 model.
- For each user query, the most relevant document snippets are retrieved from Pinecone and passed to the model as context.
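
A sketch of how retrieval and generation could be wired together, assuming the quantized GGML model is run locally through `ctransformers` (consistent with the model file in the setup steps; the prompt and generation parameters are assumptions):

```python
from langchain.llms import CTransformers
from langchain.chains import RetrievalQA

# Load the quantized Llama2 chat model from the local model/ directory.
llm = CTransformers(
    model="model/llama-2-7b-chat.ggmlv3.q4_0.bin",
    model_type="llama",
    config={"max_new_tokens": 512, "temperature": 0.8},
)

# Retrieve the top-k most similar chunks from Pinecone and "stuff" them
# into the prompt as context for the LLM.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(search_kwargs={"k": 2}),
)

answer = qa.run("What are the common side effects of aspirin?")
```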

**Integration with Flask**
- The entire system is integrated into a Flask web application.
- This provides an interactive interface for real-time communication with users.
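
A minimal Flask sketch; the template name and route paths are assumptions, not necessarily what `app.py` uses:

```python
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/")
def index():
    return render_template("chat.html")  # assumed template name

@app.route("/get", methods=["POST"])
def chat():
    user_msg = request.form["msg"]
    return str(qa.run(user_msg))  # qa is the RetrievalQA chain sketched above

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # matches http://localhost:8080 below
```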
This retrieval-augmented methodology lets the chatbot draw on complex medical documents and deliver accurate, contextually relevant responses to user queries.

**Tech Stack**

- Python
- LangChain
- Flask
- Meta Llama2
- Pinecone

**Installation**

**Clone the repository**

```bash
git clone https://github.com/shikharrajat/Medical-Chatbot.git
cd Medical-Chatbot
```

**Create a conda environment**

```bash
conda create -n env python=3.8 -y
conda activate env
```

**Install the requirements**

```bash
pip install -r requirements.txt
```

**Create a `.env` file**

Create a `.env` file in the root directory and add your Pinecone credentials:

```ini
PINECONE_API_KEY="your_pinecone_api_key"
```
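
If the application loads this key with python-dotenv (an assumption; check how `app.py` and `store_index.py` actually read configuration), the pattern looks like:

```python
# Hypothetical config loading via python-dotenv; adjust to match the repo.
import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the project root
PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
```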

**Download the quantized Llama2 model**

Download the Llama2 model (`llama-2-7b-chat.ggmlv3.q4_0.bin`) from HuggingFace and place it in the `model` directory.

**Run the indexing script**

```bash
python store_index.py
```

**Start the Flask application**

```bash
python app.py
```

**Open the chatbot**

Open your browser and go to `http://localhost:8080`.