Answer generated by a 🤖

From your description, it appears that you're having an issue with the Pinecone vector store in your LangChain application. Specifically, when you upsert a CSV file to Pinecone and query the document, only a small number of documents (rows) are being taken into account. You've mentioned that only 4 rows from a 500+ row document are being processed, even though your Pinecone index has 1536 dimensions and uses the 'cosine' metric.

One possible explanation for this issue could be related to the chunk size limit set in the upsert code. To address this issue, I suggest increasing the chunk size to 100, as recommended by Pinecone. This could potentially allow more documents to be processed at a time. Here's how you can adjust the chunk size:

// Pinecone recommends a limit of 100 vectors per upsert request
const chunkSize = 100; // increase chunk size to 100
for (let i = 0; i < pineconeVectors.length; i += chunkSize) {
  const chunk = pineconeVectors.slice(i, i + chunkSize);
  await this.pineconeIndex.upsert({
    upsertRequest: {
      vectors: chunk,
      namespace: this.namespace,
    },
  });
}

Additionally, I recommend adding error handling to the upsert operation to catch any issues that might be occurring during the process. This can help identify if there are any problems that might be causing only a portion of your documents to be added to the Pinecone index. Here's how you can add error handling:

// Pinecone recommends a limit of 100 vectors per upsert request
const chunkSize = 100; // same 100-vector chunks as above
for (let i = 0; i < pineconeVectors.length; i += chunkSize) {
  const chunk = pineconeVectors.slice(i, i + chunkSize);
  try {
    await this.pineconeIndex.upsert({
      upsertRequest: {
        vectors: chunk,
        namespace: this.namespace,
      },
    });
  } catch (error) {
    console.error('Error during upsert operation:', error);
  }
}

Please note that this is just one potential solution based on the information provided. If this solution works for you and you believe it could be helpful to others, I encourage you to make a pull request to update the LangChain documentation. This way, other users who encounter the same issue can benefit from your experience.

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
Hoping someone knows how this can be resolved.
I use the CSVLoader to upsert a CSV file to Pinecone. I add metadata to each document.
To query the document, I define a vector store over the existing Pinecone index and then use the ConversationalRetrievalQAChain to query against the specific file (a rough sketch of this kind of setup is included below).
For some reason, the vector store only appears to take a small number of documents (rows) into account (e.g. 4 rows from a 500+ row document).
The Pinecone index has 1536 dimensions and uses the 'cosine' metric.
Has anyone been able to do this successfully, or can anyone point me in the right direction?
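For reference, here is a rough, illustrative sketch of this kind of setup. It is not the exact code from the post: the index name, namespace, file path, and model choices are placeholders, and it assumes the LangChain JS CSVLoader, PineconeStore, and ConversationalRetrievalQAChain APIs together with the v0 @pinecone-database/pinecone client used in the snippets above.

import { PineconeClient } from "@pinecone-database/pinecone";
import { CSVLoader } from "langchain/document_loaders/fs/csv";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { ConversationalRetrievalQAChain } from "langchain/chains";

async function run() {
  // Connect to the existing Pinecone index (1536 dimensions, cosine metric).
  const client = new PineconeClient();
  await client.init({
    apiKey: process.env.PINECONE_API_KEY!,
    environment: process.env.PINECONE_ENVIRONMENT!,
  });
  const pineconeIndex = client.Index("my-index"); // placeholder index name

  // Load the CSV (one Document per row), attach metadata, and upsert.
  const loader = new CSVLoader("data/my-file.csv"); // placeholder path
  const docs = await loader.load();
  docs.forEach((doc) => {
    doc.metadata = { ...doc.metadata, source: "my-file.csv" };
  });
  await PineconeStore.fromDocuments(docs, new OpenAIEmbeddings(), {
    pineconeIndex,
    namespace: "my-file", // placeholder namespace for this file
  });

  // Define the vector store over the existing index and query it.
  const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
    pineconeIndex,
    namespace: "my-file",
  });
  const chain = ConversationalRetrievalQAChain.fromLLM(
    new ChatOpenAI({ temperature: 0 }),
    vectorStore.asRetriever()
  );
  const res = await chain.call({
    question: "What do the rows say about X?",
    chat_history: [],
  });
  console.log(res.text);
}

run().catch(console.error);

The namespace passed to fromExistingIndex here is assumed to match the one used when the vectors were upserted, so the query runs against the vectors for that specific file.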