[Bug]: About ServerDisconnectedError(Server disconnected) in ElasticsearchStore #11641

Open
Hspix opened this issue Mar 5, 2024 · 3 comments
Labels: bug (Something isn't working), P2

Hspix commented Mar 5, 2024

Bug Description

When there are more than 10 queries running simultaneously, the AsyncElasticsearch client may raise a ServerDisconnectedError(Server disconnected) exception, as shown below:

elastic_transport.ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ServerDisconnectedError(Server disconnected))

Version

llama-index=0.10.14
llama-index-vector-stores-elasticsearch=0.1.5

Steps to Reproduce

A potential issue with the aquery method in the llama_index.vector_stores.elasticsearch.base module could be the cause of this problem, as shown below:

        async with self.client as client:
            response = await client.search(
                index=self.index_name,
                **es_query,
                size=query.similarity_top_k,
                _source={"excludes": [self.vector_field]},
            )

I tried running the code block on its own with manual queries, and it still raised the same exception.

However, no exception is raised when the async client is used directly, without async with:

            response = await self.client.search(
                index=self.index_name,
                **es_query,
                size=query.similarity_top_k,
                _source={"excludes": [self.vector_field]},
            )

So, I guess that async with is redundant?
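
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if leaving the async with block closes the shared self.client, the first query to finish would tear down connections that other in-flight queries are still using. The sketch below reproduces that pattern with a hypothetical FakeAsyncClient stand-in (not the real AsyncElasticsearch), so it runs without a server. The class, its closed flag, and the delays are all invented for illustration.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for a shared async client (not AsyncElasticsearch)."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        # Assumption: leaving the context closes the shared connections.
        self.closed = True

    async def search(self, delay):
        await asyncio.sleep(delay)  # simulated network latency
        if self.closed:
            raise ConnectionError("Server disconnected")
        return {"hits": []}


async def query_with_context(client, delay):
    # Mirrors the pattern quoted from aquery: every call enters and exits the client.
    async with client as c:
        return await c.search(delay)


async def query_direct(client, delay):
    # Mirrors the variant that uses the shared client without a context manager.
    return await client.search(delay)


async def run(query_fn):
    client = FakeAsyncClient()
    delays = [0.01, 0.05, 0.05]  # the first query finishes before the others
    results = await asyncio.gather(
        *(query_fn(client, d) for d in delays), return_exceptions=True
    )
    # True marks a query that failed with an exception.
    return [isinstance(r, Exception) for r in results]


with_ctx_failures = asyncio.run(run(query_with_context))
direct_failures = asyncio.run(run(query_direct))
print(with_ctx_failures, direct_failures)
```

In this simulation the queries that are still in flight when the first one exits the context fail, while the direct variant completes all three. Whether the real client behaves the same way would need to be checked against the installed elasticsearch package.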

Relevant Logs/Tracebacks

No response

Hspix added the bug and triage labels Mar 5, 2024

dosubot bot commented Mar 5, 2024

Hey @Hspix, back at it with another tech mystery, I see! 😄 Good to have you reaching out again.

Thank you for bringing this issue to our attention. The ServerDisconnectedError you're encountering is typically due to issues with the network connection between the client and the Elasticsearch server, or the server closing the connection unexpectedly. It's not directly related to the use of async with.

The async with statement is used here for managing the asynchronous context for the Elasticsearch client (self.client). It ensures that the client is properly acquired for the operation and then released, which is a good practice for managing resources in asynchronous programming.

To address the ServerDisconnectedError, you might want to look into the following areas:

  • Connection Pooling and Limits: Ensure that the Elasticsearch client is configured to handle a sufficient number of simultaneous connections. This might involve adjusting the connection pool size.
  • Elasticsearch Server Configuration: The server itself might have limits on the number of connections it can handle or might be configured to close idle connections after a certain period. Check the server's configuration for connection handling.
  • Network Issues: Network instability or configuration issues could also lead to such disconnections. Ensure that the network between the client and the server is stable and configured correctly.
  • Client Version and Compatibility: Ensure that the version of the Elasticsearch client you are using is compatible with the version of the Elasticsearch server. Incompatibilities might lead to unexpected behavior.
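
As a client-side complement to the connection-limit point above, the number of in-flight queries can be capped in application code. The following is a minimal sketch using asyncio.Semaphore; MAX_IN_FLIGHT, fake_search, and bounded are hypothetical names, and fake_search only simulates a query so the example runs without a server. It does not change the client's own pool size.

```python
import asyncio

MAX_IN_FLIGHT = 10  # assumed cap, chosen only for illustration


async def fake_search(i, state):
    # Stand-in for a real query; records how many calls overlap.
    state["current"] += 1
    state["peak"] = max(state["peak"], state["current"])
    await asyncio.sleep(0.01)
    state["current"] -= 1
    return i


async def bounded(sem, coro_fn, *args):
    # At most MAX_IN_FLIGHT callers hold the semaphore at once.
    async with sem:
        return await coro_fn(*args)


async def main():
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    state = {"current": 0, "peak": 0}
    results = await asyncio.gather(
        *(bounded(sem, fake_search, i, state) for i in range(25))
    )
    return results, state["peak"]


results, peak = asyncio.run(main())
print(len(results), peak)
```

With a real client, the awaited call inside bounded would be the actual query coroutine instead of fake_search.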

As for the aquery method in the llama_index.vector_stores.elasticsearch.base module, it is an asynchronous method designed for querying the Elasticsearch vector store. It is used to perform various types of queries, including basic vector similarity search, text search, hybrid queries, queries with filters, and deletion and re-querying. The method is designed to be used asynchronously, indicated by the await keyword before its calls, making it suitable for use in asynchronous Python applications or frameworks.

If the issue persists even after checking these areas, consider implementing a retry mechanism for your queries. If you need further assistance, please provide more details about your Elasticsearch server configuration and the exact steps to reproduce the issue.
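
The retry suggestion could look roughly like the sketch below: a small helper that retries an awaited call with exponential backoff. with_retries and FlakySearch are hypothetical names, and the built-in ConnectionError stands in for the exception type shown in the traceback; with the real client, the retry_on tuple would need to name that type instead.

```python
import asyncio


async def with_retries(make_call, attempts=3, base_delay=0.01,
                       retry_on=(ConnectionError,)):
    """Await make_call(), retrying on the given exceptions with backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


class FlakySearch:
    """Hypothetical query that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("Server disconnected")
        return {"hits": []}


flaky = FlakySearch(failures=2)
result = asyncio.run(with_retries(flaky))
print(result, flaky.calls)
```

Retrying only masks the disconnect; it is a fallback while the underlying cause is investigated.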


logan-markewich added the P2 label and removed the triage label Mar 5, 2024

logan-markewich (Collaborator) commented

If you know the fix or feel up to it, I highly encourage a PR :)

rzechen commented May 24, 2024

me too