ASK A PDF

This Streamlit-based application enables question answering over text extracted from PDF documents. It integrates the LangChain framework with various language models and uses the FAISS library for efficient similarity search over large vector spaces. A key feature of this application is its support for local model execution via Ollama, enabling users to process data without relying on external API calls, thus ensuring privacy.

Features

  • PDF text extraction and processing
  • Text chunking for efficient processing
  • Choice of OpenAI models, or local inference with Mistral 7B and embeddings with LLaMA2 via Ollama
  • Vector embeddings saved to disk and searched with FAISS (see the sketch below)
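
The following is a minimal sketch of this chunk-embed-search flow, assuming recent langchain / langchain-community APIs and a local Ollama server on its default port; class names and parameters here may differ from the actual app code:

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

# Text extracted from the uploaded PDF (e.g. via a PDF reader library)
raw_text = "...full PDF text..."

# 1. Split the text into overlapping chunks sized for the embedding model
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(raw_text)

# 2. Embed each chunk with LLaMA2 served by a local Ollama instance
embeddings = OllamaEmbeddings(model="llama2", base_url="http://localhost:11434")

# 3. Build a FAISS index over the chunks and persist it to disk
db = FAISS.from_texts(chunks, embeddings)
db.save_local("faiss_index")

# 4. At question time, retrieve the chunks most similar to the query
matches = db.similarity_search("What does the document say about X?", k=4)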

Running Models Locally with Ollama

To run models locally using Ollama, it must be installed first. The most straightforward method to install and start Ollama is via its official Docker image; for comprehensive installation instructions, refer to the official Ollama documentation.
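
For example, a CPU-only Ollama container can be started with the following command (assuming Docker is already installed):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama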

If you have an NVIDIA GPU on your machine, it's highly recommended to leverage it when running Ollama.
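
With the NVIDIA Container Toolkit installed, the container can be started with GPU access, and the two models this app uses can then be pulled into it (the model tags below assume the standard Ollama library names):

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull mistral
docker exec -it ollama ollama pull llama2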

If you opt to use OpenAI models, you must obtain an OPENAI_API_KEY. Visit the OpenAI API Keys page to get your key, then save it in your .env file as follows:

OPENAI_API_KEY=your_openai_api_key_here

Installation

Clone the repository to your local machine:

git clone https://github.com/rabee05/ask-a-pdf.git
cd ask-a-pdf

To run the project, first create a virtual environment. I recommend Pipenv for its simplicity and effectiveness in managing project dependencies.

Check whether Pipenv is installed by running pipenv --version. If it is not found, install it with:

pip install pipenv --user

To ensure the virtual environment is created within the project folder, set the following environment variable:

export PIPENV_VENV_IN_PROJECT=1

Now, create a virtual environment and install dependencies by running:

pipenv install
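
Once the dependencies are installed, activate the environment with pipenv shell, or run commands inside it without activating, for example:

pipenv run streamlit run app.py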

If you prefer to use venv for virtual environment management and a requirements.txt file for dependencies, follow these steps:

python3 -m venv .venv

On macOS and Linux:

source .venv/bin/activate

Install the required packages from requirements.txt:

pip install -r requirements.txt

To delete the Pipenv-managed virtual environment, either manually remove the environment directory or run:

pipenv --rm

Environment Configuration

If you want to run OpenAI models, copy and rename .env.example to .env:

cp .env.example .env

Open the .env file and include the following information:

OPENAI_API_KEY=your_openai_api_key_here
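
For reference, here is a minimal sketch of how such a key is typically read in Python, assuming the app uses the python-dotenv package (the actual loading code in the repo may differ):

import os
from dotenv import load_dotenv

load_dotenv()  # reads variables from .env into the process environment
api_key = os.getenv("OPENAI_API_KEY")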

Configuration File Updates

After setting up your project, review and update the config/config.py file to suit your environment, particularly the OLLAMA_SERVER settings.
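
As a purely hypothetical illustration (the real config/config.py defines the actual names and defaults; check the file itself):

# config/config.py (hypothetical sketch)
OLLAMA_SERVER = "http://localhost:11434"  # adjust host/port to where Ollama runs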

After installing Ollama, verify that it is running by navigating to http://localhost:11434/ in a browser (or use your server's IP address with the default port 11434). You should see "Ollama is running".
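
The same check can be done from a terminal; the response should be "Ollama is running":

curl http://localhost:11434/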

Running the Application

To run the application, execute the following command from the project root:

streamlit run app.py

Navigate to the URL Streamlit prints in the terminal (by default http://localhost:8501) to interact with the application.

App Main Page

Data Flow Diagram
