[Question]: Connecting llamaindex to aoss opensearch on aws #14133
Comments
Hey @mike-wethos! 👋 I'm here to help you out with your question. Let's tackle this together! Here is an updated version of your code snippet with the correct service name:

import boto3
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.readers.s3 import S3Reader
from llama_index.vector_stores.opensearch import OpensearchVectorStore, OpensearchVectorClient
from opensearchpy import AWSV4SignerAuth
credentials = boto3.Session(
aws_access_key_id=env.aws_access_key_id,
aws_secret_access_key=env.aws_secret_access_key,
).get_credentials()
region = "us-east-2"
service = "es" # Correct service name for OpenSearch
auth = AWSV4SignerAuth(credentials, region, service)
endpoint = env.vector_store_host
idx = args.index_type
text_field = "content"
embedding_field = "embedding"
client = OpensearchVectorClient(
endpoint, idx, 768, embedding_field=embedding_field, text_field=text_field, http_auth=auth
)

If the issue persists after these checks, you may need to review the IAM policies attached to the AWS credentials and ensure they have the necessary permissions to interact with the OpenSearch service.
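A quick way to double-check the IAM side is to see which principal the credentials actually resolve to, since that exact ARN is what the collection's data access policy must grant. This is only a sketch and reuses the env object from the snippet above:

import boto3

# Resolve the configured credentials to a concrete principal ARN.
session = boto3.Session(
    aws_access_key_id=env.aws_access_key_id,
    aws_secret_access_key=env.aws_secret_access_key,
)
identity = session.client("sts").get_caller_identity()
print(identity["Arn"])  # this ARN needs to be granted in the AOSS data access policy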
@dosu
Here are some steps to address these issues:

By following these steps, you should be able to mitigate the timeout exceptions when storing documents in AWS AOSS OpenSearch [1][2][3].
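As a sketch of the kind of settings typically involved in timeout mitigation: OpensearchVectorClient passes extra keyword arguments through to the underlying opensearch-py client (the later snippets in this thread rely on this for http_auth and connection_class), so the request timeout and retry behaviour can be raised there. The values below are placeholders rather than recommendations, and endpoint, idx and auth are the variables from the snippet above:

from opensearchpy import RequestsHttpConnection
from llama_index.vector_stores.opensearch import OpensearchVectorClient

client = OpensearchVectorClient(
    endpoint, idx, 768, embedding_field="embedding", text_field="content",
    http_auth=auth,
    connection_class=RequestsHttpConnection,
    use_ssl=True,
    verify_certs=True,
    timeout=60,            # opensearch-py request timeout, in seconds
    max_retries=3,         # retry transient failures
    retry_on_timeout=True, # also retry when the failure was a timeout
)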
I am still getting the connection timeout.
To resolve the connection timeout, here is an example of how to correctly set the http_auth parameter:

from opensearchpy import AWSV4SignerAuth
import boto3
# Create the AWS credentials
credentials = boto3.Session().get_credentials()
region = 'your-aws-region'
# Create the AWSV4SignerAuth object
http_auth = AWSV4SignerAuth(credentials, region)
# Initialize the OpensearchVectorClient with the correct http_auth
client = OpensearchVectorClient(
endpoint="https://your-opensearch-endpoint",
index="your-index-name",
dim=1536,
embedding_field="embedding",
text_field="content",
http_auth=http_auth
)

Ensure that you replace your-aws-region, your-opensearch-endpoint, and your-index-name with your actual values.
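One assumption worth flagging beyond that snippet: AWSV4SignerAuth defaults to signing for the managed-domain service, so for a serverless (AOSS) collection the signer is normally built with the service name passed explicitly:

from opensearchpy import AWSV4SignerAuth
import boto3

credentials = boto3.Session().get_credentials()
region = "your-aws-region"

# "aoss" for OpenSearch Serverless collections; "es" for managed OpenSearch Service domains.
http_auth = AWSV4SignerAuth(credentials, region, "aoss")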
With this setup I am now getting a new error.
@dosu
It finally failed with
I have verified that I can successfully create an index using these credentials and host/port using opensearch-py. The issue seems to stem from the fact that LlamaIndex only supports sending the host/port as a string rather than the standard dict that opensearch-py uses in its examples.
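For comparison, a rough sketch of the two styles being contrasted here (the hostname is a placeholder, and auth is the SigV4 signer from the earlier snippets):

from opensearchpy import OpenSearch, RequestsHttpConnection

# opensearch-py examples: host and port as a structured dict
os_client = OpenSearch(
    hosts=[{"host": "your-collection-id.us-east-2.aoss.amazonaws.com", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# OpensearchVectorClient: the same information encoded in a single URL string
endpoint = "https://your-collection-id.us-east-2.aoss.amazonaws.com:443"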
Are you encoding the port in your host when you use the vector client?
To ensure compatibility, include the port in the endpoint string, for example endpoint = "https://your-opensearch-endpoint:443". Here is the updated code snippet with the port included in the endpoint string:

from opensearchpy import AWSV4SignerAuth, RequestsHttpConnection
import boto3
# AWS setup
credentials = boto3.Session(
aws_access_key_id=env.aws_access_key_id,
aws_secret_access_key=env.aws_secret_access_key,
).get_credentials()
region = "us-east-2"
service = "aoss"
auth = AWSV4SignerAuth(credentials, region, service)
endpoint = "https://your-opensearch-endpoint:443"
idx = args.index_type
text_field = "content"
embedding_field = "embedding"
client = OpensearchVectorClient(
endpoint, idx, 768, embedding_field=embedding_field, text_field=text_field, http_auth=auth,
connection_class=RequestsHttpConnection, use_ssl=True, verify_certs=True
)

This should help resolve the connection issues by ensuring the port is correctly included in the endpoint URL.
Yes, I have tried every variation of the URL with no luck. I think an ideal solution would be to update OpensearchVectorClient to better match the standard interface of opensearch-py.
Having similar issues. Any progress on solving this?
The following worked for me - sharing in case it's useful:

import boto3
from opensearchpy import AWSV4SignerAuth, RequestsHttpConnection, AWSV4SignerAsyncAuth, AsyncHttpConnection
from llama_index.vector_stores.opensearch import OpensearchVectorStore, OpensearchVectorClient
from llama_index.core import VectorStoreIndex, StorageContext
region = <your-aws-region>
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAsyncAuth(credentials, region)
endpoint = "https://<your-domain-name>:443"
idx = <your-index>
text_field = "content"
embedding_field = "embedding"
client = OpensearchVectorClient(
endpoint, idx, 1536, embedding_field=embedding_field, text_field=text_field,
http_auth=auth,
use_ssl=True,
verify_certs=True,
connection_class=AsyncHttpConnection
)
vector_store = OpensearchVectorStore(client)
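To actually write documents through that store, the usual follow-up (a sketch, assuming documents has already been loaded, for example with the S3Reader shown earlier in the thread) is:

storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Embeds the documents and persists them into the OpenSearch index via the client above.
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)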
Hmm, this works for standalone scripts, but now I'm running into event loop issues when trying to set up the client in a FastAPI service running in uvicorn. It seems to have trouble getting the loop in the LlamaIndex code, so it errors out because the loop is already running. Also, I can't just make one client at lifecycle startup; I need a way to create a client when a request comes in, since it could hit multiple indexes based on the request params.
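One workaround sometimes used for the already-running-loop problem is to patch the loop with nest_asyncio and build the client per request. This is only a rough sketch under assumptions, not a confirmed fix: the make_vector_store helper and /query route are hypothetical, the placeholders mirror the snippet above, and whether the loop patching is acceptable in your service is up to you.

import boto3
import nest_asyncio
from fastapi import FastAPI
from opensearchpy import AWSV4SignerAsyncAuth, AsyncHttpConnection
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.opensearch import OpensearchVectorStore, OpensearchVectorClient

nest_asyncio.apply()  # let LlamaIndex nest its event-loop calls inside the running uvicorn loop

region = "<your-aws-region>"
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAsyncAuth(credentials, region)
endpoint = "https://<your-domain-name>:443"

app = FastAPI()

def make_vector_store(index_name: str) -> OpensearchVectorStore:
    # Hypothetical helper: build a fresh client for whichever index the request targets.
    client = OpensearchVectorClient(
        endpoint, index_name, 1536,
        embedding_field="embedding", text_field="content",
        http_auth=auth, use_ssl=True, verify_certs=True,
        connection_class=AsyncHttpConnection,
    )
    return OpensearchVectorStore(client)

@app.get("/query")
async def query(index_name: str, q: str):
    # Hypothetical route: pick the target index from the request parameters.
    vector_store = make_vector_store(index_name)
    index = VectorStoreIndex.from_vector_store(vector_store)
    response = index.as_query_engine().query(q)
    return {"answer": str(response)}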
This worked for me:

from opensearchpy import AWSV4SignerAuth, RequestsHttpConnection
auth = AWSV4SignerAuth(credentials, "us-east-2", "es")
client = OpensearchVectorClient(
endpoint, idx, 1024, embedding_field="embedding", text_field="content", http_auth=auth, use_ssl=True, verify_certs=True, connection_class=RequestsHttpConnection
)
I've got the following error when I use RequestsHttpConnection:

Any ideas?
Question
Hello,
I am attempting to store documents in AWS AOSS OpenSearch and keep getting a 403 Forbidden when creating the OpensearchVectorClient, despite passing correct credentials. Is this a supported method?
Thank you for your time.