Chat with Multiple PDFs

Deployed Streamlit app: https://pdfansweringai-7wmhdc9uzgenccskbb4auy.streamlit.app/

Please follow instruction.txt to get the model working.

This project is a Streamlit application that lets users chat with multiple PDF documents. It uses FAISS for document retrieval and a T5 model to generate responses. Text is extracted from the uploaded PDFs, split into chunks, vectorized, and stored in a FAISS index. When a user asks a question, the system retrieves the most relevant chunks and uses them as context for the generated answer.

Features

Upload and process multiple PDF files.
Extract text from PDFs and divide it into chunks.
Vectorize the text chunks using SentenceTransformers.
Store and retrieve text chunks using FAISS.
Generate responses using a local T5 model.
Provide an interactive chat interface with document preview.
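
The retrieval step in the features above can be illustrated with a small, dependency-free sketch. It is not the project's code: the bag-of-words `embed` function stands in for a SentenceTransformers embedding, and the brute-force ranking in `retrieve` stands in for a FAISS index lookup. All function names here are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a SentenceTransformers embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query (brute force, no FAISS)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "FAISS stores vectors for similarity search.",
    "Streamlit renders the chat interface.",
    "The T5 model generates the final answer.",
]
top = retrieve("which model generates the answer", chunks, k=1)
print(top)
```

In the real application the retrieved chunks would then be passed to the T5 model as context for the answer.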

Prerequisites

Python 3.7 or higher

Required Python packages (see requirements.txt)

Installation

Clone the repository:

git clone https://github.com/ajaysonwani/pdf_answering_ai.git
cd pdf_answering_ai

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install the required packages:
```
pip install -r requirements.txt
```
Download the MBZUAI/LaMini-T5-738M model from Hugging Face and place the model files in a directory inside the project, e.g., models/LaMini-T5-738M.
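
One possible way to fetch the model is sketched below with the Hugging Face `transformers` library. This is an assumption about how the download could be done, not a copy of the code in trial.ipynb; the `download_model` helper and the `LOCAL_DIR` constant are hypothetical names, and the target path follows the example directory above.

```python
from pathlib import Path

MODEL_ID = "MBZUAI/LaMini-T5-738M"
# Assumed target directory, taken from the example path in this README.
LOCAL_DIR = Path("models") / "LaMini-T5-738M"


def download_model(model_id: str = MODEL_ID, target: Path = LOCAL_DIR) -> Path:
    """Download tokenizer and weights from the Hugging Face Hub and save them locally."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    target.mkdir(parents=True, exist_ok=True)
    AutoTokenizer.from_pretrained(model_id).save_pretrained(target)
    AutoModelForSeq2SeqLM.from_pretrained(model_id).save_pretrained(target)
    return target


# Call download_model() once before starting the app.
# It needs network access and downloads the full model weights.
```

After the download, the saved directory can be loaded offline by passing its path to `from_pretrained` instead of the model ID.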

Usage

Run the Streamlit app:
```
streamlit run app.py
```
Open the app in your web browser. Streamlit prints the local URL in the terminal (by default http://localhost:8501).
Upload your PDF files using the sidebar, click "Process", and wait for processing to complete.
Ask questions about the uploaded documents in the main chat interface.

Note: Download the "MBZUAI/LaMini-T5-738M" model from Hugging Face to your local folder using the code in trial.ipynb.

File Structure

app.py: Main application code.
Templates.py: HTML and CSS templates for the chat interface.
requirements.txt: List of required Python packages.

Code Explanation

app.py

FAISSRetriever Class: Handles the retrieval of relevant documents from the FAISS index.
get_pdf_text: Extracts text from uploaded PDF files.
get_text_chunks: Splits extracted text into manageable chunks.
get_vectorstore: Vectorizes text chunks and stores them in a FAISS index.
get_conversation_model: Loads the local T5 model for generating responses.
generate_response: Generates a response to the user's query using the T5 model.
main: Streamlit app main function that handles file uploads, processing, and chat interface.
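
As a rough illustration of what a chunking step such as get_text_chunks might do, here is a minimal character-based splitter with overlap. The function name `split_into_chunks`, the default sizes, and the overlap strategy are assumptions for illustration; the actual implementation in app.py may differ.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no tiny trailing chunk is produced.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


example = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
print(example)
```

The overlap keeps sentences that straddle a chunk boundary visible in both neighbouring chunks, which tends to help retrieval.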

Templates.py

CSS and HTML Templates: Contains the styles and structure for the chat interface, including user and bot message templates and the PDF preview window.
render_pdf: Function to render PDF files in an iframe for preview.
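
A PDF preview in an iframe is commonly built by embedding the file as a base64 data URI. The sketch below shows that approach under the assumption that render_pdf works along these lines; the helper name `pdf_iframe_html` and its parameters are hypothetical.

```python
import base64


def pdf_iframe_html(pdf_bytes: bytes, height: int = 600) -> str:
    """Build an <iframe> tag that embeds the PDF as a base64 data URI."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return (
        f'<iframe src="data:application/pdf;base64,{encoded}" '
        f'width="100%" height="{height}" type="application/pdf"></iframe>'
    )


html = pdf_iframe_html(b"%PDF-1.4 example bytes")
print(html[:60])
```

In a Streamlit app the resulting string could be displayed with `st.markdown(html, unsafe_allow_html=True)`.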

Acknowledgements

https://github.com/ajaysonwani/pdf_answering_ai
