Curiosity007 changed the title from "Unable to come out with insights on Overall Data" to "Unable to Provide with insights on Overall Data - Only Taking top 5 or 7 Documents" on May 19, 2023
Curiosity007 changed the title from "Unable to Provide with insights on Overall Data - Only Taking top 5 or 7 Documents" to "Unable to Provide insights on Overall Data - Only Taking top 5 or 7 Documents" on May 19, 2023
Curiosity007 changed the title from "Unable to Provide insights on Overall Data - Only Taking top 5 or 7 Documents" to "Unable to Provide insights on Overall Data - Only Taking top 5 or 7 chunks" on May 19, 2023
.env

# Generic
TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=false

# Ingestion
PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50
INGEST_N_THREADS=1
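For scale, here is a minimal sketch of character-based chunking with these settings (this is not necessarily the exact splitter the ingester uses). It illustrates why each chunk ends up holding roughly one row of a wide CSV:

```python
# Sketch of character chunking with INGEST_CHUNK_SIZE=500 and
# INGEST_CHUNK_OVERLAP=50. Illustrative only; the real ingester
# may split on separators rather than fixed offsets.

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` characters, each sharing
    `overlap` characters with the previous chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

# A synthetic 30-column CSV row easily runs a few hundred characters,
# so a 500-character chunk holds roughly one row.
row = ",".join(f"col{i}_value" for i in range(30))
chunks = chunk_text(row)
print(len(row), len(chunks))  # → 349 1
```

So with these settings every "document" in the vector store is on the order of a single CSV row, which matches the behavior described below.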
# Generation
MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_N_CTX=2048 # Max total size of prompt+answer
MODEL_MAX_TOKENS=1024 # Max size of answer
MODEL_STOP=[STOP]
CHAIN_TYPE=betterstuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=100 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=32
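One likely explanation for the "top 5 or 7" behavior is the context budget rather than the retrieval settings. This back-of-the-envelope sketch (the ~4 characters per token ratio and the 100-token template overhead are rough assumptions) shows how few 500-character chunks fit in a 2048-token window once 1024 tokens are reserved for the answer:

```python
# Back-of-the-envelope context budget for the settings above.
# Assumes ~4 characters per token, a rough average for English text.
n_ctx = 2048           # MODEL_N_CTX: prompt + answer combined
max_tokens = 1024      # MODEL_MAX_TOKENS: reserved for the answer
prompt_overhead = 100  # assumed tokens for prompt template + question

chunk_chars = 500                # INGEST_CHUNK_SIZE
chunk_tokens = chunk_chars // 4  # ~125 tokens per chunk

budget = n_ctx - max_tokens - prompt_overhead
max_chunks = budget // chunk_tokens
print(max_chunks)  # → 7
```

Under these assumptions only about 7 chunks fit, no matter how high N_FORWARD_DOCUMENTS is set, which is consistent with the 5-7 chunks observed in the terminal.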
Python version
Python 3.10.10
System
Ubuntu 22.04.2 LTS (jammy)
CASALIOY version
Latest Commit - ee9a4e5
Information
- The official example scripts
- My own modified scripts
Related Components
- Document ingestion
- GUI
- Prompt answering
Reproduction
I fed the system a 5000-line CSV file with 30 columns, then asked for overall insights from the data.
In the terminal I can see that only the top 5 or 7 documents are retrieved, and each of those is just a single row. The answer is therefore based on only 5 or 7 rows, so no real insight emerges.
Note: I kept only one document in the source_documents folder to avoid overlapping information.
Expected behavior
The system should be able to recognize patterns across the whole dataset and suggest insights based on them.
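For contrast, here is a hypothetical sketch (synthetic data standing in for the actual CSV) of the kind of whole-dataset aggregation that "overall insight" implies. A top-k retriever never performs this, because it only ever forwards a handful of chunks to the model:

```python
# Sketch: dataset-wide statistics require touching every row, which a
# top-k retriever never does. Synthetic 5000-row, 30-column table.
import csv
import io
from collections import Counter

header = [f"col{i}" for i in range(30)]
lines = [",".join(header)]
for n in range(5000):
    lines.append(",".join(["A" if n % 2 else "B"] + [str(n)] * 29))

rows = list(csv.DictReader(io.StringIO("\n".join(lines))))

# An aggregate like this sees all 5000 rows; forwarding 5-7 chunks cannot.
counts = Counter(r["col0"] for r in rows)
print(len(rows), counts.most_common(2))  # → 5000 [('B', 2500), ('A', 2500)]
```

This is why per-chunk retrieval over row-sized chunks cannot summarize the whole table: the aggregation has to happen over the full dataset before (or instead of) the retrieval step.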