Skip to content
This repository has been archived by the owner on Mar 8, 2024. It is now read-only.
/ WiseIlmu Public archive

A Q&A knowledge base powered by LangChain and OpenAI for intuitive content retrieval.

Notifications You must be signed in to change notification settings

farithadnan/WiseIlmu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

89 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

WiseIlmu 🧠

WiseIlmu is a Python application that leverages large language model (LLM) like OpenAI and the langchain library to load and split documents into chunks. WiseIlmu also uses the sentence-transformers model (all-MiniLM-L6-v2) to convert document chunks into a vector space format, which helps LLMs to better understand the content of documents. The vector space of the documents is then stored in memory by Chroma allowing its utilization throughout the application's runtime for seamless querying.

How to Run the Project 🏃🏽‍♂️

Step 1: Clone the repo

Open Git bash and type:

  git clone https://github.com/farithadnan/WiseIlmu.git

Step 2: Installation

Install the required Python packages by running below command on your terminal:

  pip install -r requirements.txt

Step 3: Configuration Setup

Ensure that the configuration file (e.g., config.yaml) contains essential settings such as:

  • Paths to directories containing documents in app.py
  document_dir = os.path.join(current_dir, "documents")
  • LLM Model and embedding details.

Step 4: Run the Project

Activate your Python environment and execute the main Python script:

  python app.py

This will initialize the chatbot using the configured settings and launch a Gradio-based interface for interacting with the OpenAI-based chatbot.

Directory Structure 📂

Below shows the directory structure for this project:

.
├── config                  # Configuration
│   └── config.yaml
├── data		    # Vector database
├── documents               # Folder to store user's files (pdf, docs, csv and etc.)
│   ├── file1.pdf
│   ├── file2.docx
│   └── files3.csv
├── outputs                 # Log folder created by Hydra
├── venv                    # Virtual environtment
├── .gitignore              # gitignore
├── app.py                  # Main script
├── chatbot.py              # Script to handle conversation via Gradio
├── llm_openai.py           # Script to handle interaction with OpenAI
├── loader.py               # Script to handle vector database
├── README.md               # Project Info
└── requirements.txt        # List of required libraries

*Keep in mind that the folder outputs is automatically created by Hydra itself. Outputs are responsible for storing the logs when running the project.

About

A Q&A knowledge base powered by LangChain and OpenAI for intuitive content retrieval.

Topics

Resources

Stars

Watchers

Forks

Languages