Prerequisite: have python3 installed.
```bash
python3 -m venv venv        # create the venv directory
source venv/bin/activate    # enter the virtual environment
pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
```

The `--extra-index-url` points at prebuilt CPU-only wheels; remove that part if you want to run on GPU (pip will then build from source, where GPU back ends can be enabled).
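To sanity-check the install, a quick optional smoke test:

```python
# Optional smoke test: confirms the llama-cpp-python wheel imports cleanly.
import llama_cpp
print(llama_cpp.__version__)
```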
- Download the 8.6G model file `llama-2-13b.Q5_K_M.gguf` and place it in the `models/` directory (see the download sketch below).
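If you prefer to script the download, here is a hedged sketch using `huggingface_hub`. The repo id `TheBloke/Llama-2-13B-GGUF` is an assumption (one public repo that hosts a file with this exact name), not something this project prescribes; substitute your actual source.

```python
# Hypothetical download helper; requires `pip install huggingface-hub`.
# TheBloke/Llama-2-13B-GGUF is an assumed source repo, not part of this project.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/Llama-2-13B-GGUF",
    filename="llama-2-13b.Q5_K_M.gguf",
    local_dir="models",
)
```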
```bash
python3 main.py 2>error.log
```
NOTE: by default the model writes a lot of diagnostic output to STDERR. The `2>error.log` redirection filters that into a file you can inspect later. If you want to see all output, drop the redirection and just run `python3 main.py`.
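For reference, here is a minimal sketch of what a script like `main.py` typically does with this setup; the real `main.py` in this repo may differ, and the prompt text is made up. Passing `verbose=False` to `Llama` is another way to silence the STDERR logging.

```python
# Minimal sketch, not the repo's actual main.py.
from llama_cpp import Llama

# verbose=False suppresses the llama.cpp diagnostics that would
# otherwise go to STDERR (the same output 2>error.log captures).
llm = Llama(model_path="models/llama-2-13b.Q5_K_M.gguf", verbose=False)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```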
The `llama_cpp_python` library used here also supports pulling models directly from HuggingFace (link to howto), which makes it easy to experiment with other models.
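For example, `Llama.from_pretrained` fetches a GGUF file straight from a Hugging Face repo on first use and caches it locally (it requires `huggingface-hub` to be installed); the repo and file below are just an illustration:

```python
from llama_cpp import Llama

# Downloads the GGUF from Hugging Face on first use and caches it.
llm = Llama.from_pretrained(
    repo_id="TheBloke/Llama-2-13B-GGUF",   # assumed example repo
    filename="llama-2-13b.Q5_K_M.gguf",
)
```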
To run the RAG-from-PDF part, set up its environment the same way:

```bash
python3 -m venv venv        # create the venv directory
source venv/bin/activate    # enter the virtual environment
pip install -r requirements.txt
```
- Create a `data_rag_ru` directory in this project.
- Put the PDF files you want to draw answers from into it.
```bash
python3 rag_from_pdf.py
```

Then ask your questions at the prompt.
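Since `requirements.txt` isn't shown here, the following is only a rough sketch of what a script like `rag_from_pdf.py` might do, assuming `pypdf`, `sentence-transformers`, and `numpy` as dependencies; the real script may use an entirely different stack.

```python
# Rough, hypothetical sketch only; the real rag_from_pdf.py may differ.
# Assumed dependencies: pypdf, sentence-transformers, numpy, llama-cpp-python.
from pathlib import Path

import numpy as np
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama

# 1. Extract text from every PDF in data_rag_ru/ and split it into chunks.
chunks = []
for pdf in Path("data_rag_ru").glob("*.pdf"):
    text = "".join(page.extract_text() or "" for page in PdfReader(pdf).pages)
    chunks += [text[i:i + 1000] for i in range(0, len(text), 1000)]

# 2. Embed all chunks once, up front (vectors are L2-normalized, so the
#    dot product below equals cosine similarity).
embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

# 3. For each question, retrieve the closest chunks and ask the model.
llm = Llama(model_path="models/llama-2-13b.Q5_K_M.gguf", n_ctx=4096, verbose=False)
while True:
    question = input("Question> ")
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(chunk_vecs @ q_vec)[-3:]  # indices of the 3 best chunks
    context = "\n---\n".join(chunks[i] for i in top)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
    print(llm(prompt, max_tokens=256)["choices"][0]["text"].strip())
```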
Cheers.