OpenDistro Elasticsearch K-NN #4
Hello @mnagaya, I'm open to contributions using the two mentioned datasets, if you have time?
I have a lot of work to do this month, but I will check it out later.
Hi @jobergum, OpenDistro for Elasticsearch k-NN is 10x faster than Vespa in my environment, but it is not a comparison with Vespa ANN. Do you have any plan to add the Vespa ANN result? The Vespa ANN implementation looks very nice.
Currently, ef_search is 512 (its default value) and recall is 0.997600.
Ref: https://opendistro.github.io/for-elasticsearch-docs/docs/knn/settings/#index-settings
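For reference, these index-time settings go into the index-creation request. A minimal sketch of such a body, assuming the setting names and defaults listed on the docs page linked above (the field name "vector" and the dimension 960 are illustrative, not necessarily the benchmark's actual mapping):

```python
import json

# Sketch of an OpenDistro k-NN index-creation body. Setting names follow the
# linked docs page; values shown are the documented defaults. Note that
# ef_search is an index setting here, fixed at index time.
index_body = {
    "settings": {
        "index.knn": True,
        "index.knn.algo_param.ef_search": 512,
        "index.knn.algo_param.ef_construction": 512,
        "index.knn.algo_param.m": 16,
    },
    "mappings": {
        "properties": {
            "vector": {"type": "knn_vector", "dimension": 960}
        }
    },
}
print(json.dumps(index_body, indent=2))
```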
So I have done a run with HNSW enabled for the gist dataset with Vespa, using the mentioned 1 x Intel(R) Xeon E5-2680 v3 2.50GHz (Haswell). https://opendistro.github.io/for-elasticsearch-docs/docs/knn/settings/#index-settings lists the following default values:
Configuring Vespa with
Still, I needed to adjust ef_search for Vespa upwards from 512 to 3*512 to get comparable recall, so there might be some differences in the settings of the two engines. With Vespa, ef_search is a query parameter, so it can be tuned on a per-request basis. Since I wanted to compare performance at the same recall level, I needed to adjust this parameter upwards.
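Per-request tuning in Vespa goes through the nearestNeighbor operator's annotations in the query. A hypothetical sketch of such a query body, assuming the `hnsw.exploreAdditionalHits` annotation (the field names "vector" and "query_vector" and the widened value 1536 are illustrative, not taken from this benchmark's actual schema):

```python
import json

# Hypothetical Vespa query body: HNSW exploration depth is widened per request
# to trade latency for recall, as described in the discussion above.
explore = 1536  # illustrative, roughly the 3*512 mentioned in the thread
query_body = {
    "yql": (
        'select * from sources * where '
        f'([{{"targetHits": 10, "hnsw.exploreAdditionalHits": {explore}}}]'
        'nearestNeighbor(vector, query_vector));'
    ),
    "hits": 10,
    "ranking.features.query(query_vector)": [0.0] * 960,
}
print(json.dumps(query_body)[:100])
```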
Recall
Performance: The OpenDistro memory usage is very high, 35GB, compared to Vespa's total of less than 10GB including configuration services, the search core process, and the serving container.
Is there a way to control ef_search in the query with OpenDistro ES, @mnagaya?
Probably not.
Great, thanks for confirming @mnagaya. Odd that this parameter cannot be set at run time, as it's quite important for exploring a decent recall versus performance tradeoff. I'll capture a few more rounds and update the README with the results. Again, thank you for your contribution.
@jobergum a couple of suggestions for benchmarking with OpenDistro Elasticsearch's k-NN:
You might find this link useful for indexing/search tuning: https://medium.com/@kumon/how-to-realize-similarity-search-with-elasticsearch-3dd5641b9adb
Yes, if you look at the README, the benchmarking parameters are explained. A 20-second warmup is included, and I also run multiple times before publishing results in the README. I also welcome discussion with developers, as in #2. On merging of segments: yes, this seems to be a major factor with both ES variants and should be mentioned. We mention this in the README, but it also highlights a weakness of the ES/Lucene indexing architecture with immutable segments. I also notice that there are very large off-heap allocations for OES; 1M 960-dimensional vectors touches 55G (the heap is 16G).
Thanks @jobergum. Graphs are loaded outside the ES heap, so we could take advantage of bigger instances with more RAM. For 1M vectors with 960 dimensions, k-NN graph memory usage would be on the order of (4d + 8M) bytes/vector. Total memory usage should be ES heap + graph memory usage + Lucene.
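Plugging numbers into the (4d + 8M) bytes/vector estimate above gives a rough graph-memory figure. The HNSW M parameter is not stated in the thread, so M = 16 is assumed here as a common default:

```python
# Worked example of the (4*d + 8*M) bytes/vector estimate quoted above.
d = 960        # vector dimensions (gist dataset)
M = 16         # assumed HNSW max-links parameter; the actual value may differ
n = 1_000_000  # number of vectors

bytes_per_vector = 4 * d + 8 * M
total_gb = n * bytes_per_vector / 1024**3
print(bytes_per_vector, round(total_gb, 2))
```

Under these assumptions the graph alone accounts for only a few GB, so most of the 55G off-heap footprint observed above would have to come from elsewhere (e.g. multiple per-segment graphs or Lucene structures).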
Have you increased the value of the "-n" parameter in vespa-fbench? I got the following results in my environment.
Server
Results
Thanks, |
Before merging segments (which took more than 2 hours):
After:
16G is from the heap; the rest is off-heap. After merging segments, the recall also changes with the given index-time setting for ef_search: from 0.995500 with more segments (one HNSW graph per segment?) to 0.979600 with a single segment, which also makes the recall more comparable with Vespa at the same ef_search setting (Vespa's query-time setting "hnsw.exploreAdditionalHits").
With lower recall and fewer segments to search, given ES's single-threaded query execution over N segments, performance improves significantly:
All details: https://gist.github.com/jobergum/ff46385c44bbb1683be16d98a8eed6ba @mnagaya here, as in several ANN benchmarks (e.g. http://ann-benchmarks.com/), one reports QPS as a function of the average latency with 1 client, e.g. a 10 ms average latency gives 100 QPS. This gives a baseline, but I agree that it's also nice to prove that the technology scales with increased concurrency, as I'm pretty sure many of the libraries benchmarked in ann-benchmarks will have issues with contention at high concurrency.
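The single-client relationship used above is just the reciprocal of average latency:

```python
# With N clients each issuing requests back to back, throughput is
# clients / average latency. The thread's example: 10 ms with 1 client.
def qps(avg_latency_ms: float, clients: int = 1) -> float:
    return clients * 1000.0 / avg_latency_ms

print(qps(10))  # 10 ms average latency with 1 client -> 100 QPS
```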
@jobergum thank you. Did you run with the following suggestion in the ES query (avoid reading stored fields and just read scores and doc ids)? It looks like all we need is the neighbors for recall. Example as mentioned in the above comments.
This would improve performance according to our performance tests. It would be great if you could run with the above suggestion.
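The example itself was not preserved in this thread, but the suggestion ("avoid reading stored fields and just read scores and doc ids") can be sketched roughly as follows. This is an assumption about the intended query shape; the field name "vector" and the k/size values are illustrative:

```python
import json

# Hedged sketch of a k-NN search body that skips stored-field loading and
# returns only ids and scores, which is all a recall measurement needs.
search_body = {
    "size": 10,
    "stored_fields": "_none_",    # do not load _source / stored fields
    "docvalue_fields": ["_id"],   # ids are enough to compute recall
    "query": {
        "knn": {
            "vector": {"vector": [0.0] * 960, "k": 10}
        }
    },
}
print(json.dumps(search_body)[:80])
```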
@vamshin The last quoted run above with 85 QPS was with the suggested stored_fields and docvalue_fields, yes. I have not had time to format and re-run the results to update the README yet. When OES has merged the segments into one, the performance of the two engines is almost equal. Though the impact the number of segments has on search performance in OES is mind-blowing, as is the time it takes to merge segments, which will make a big impact in a real-time feed scenario.
@jobergum We can control the number of segments created at indexing time by disabling the refresh interval or using a large refresh interval. Once all the indexing is done, we could then call
Would love to see the results in the README. Thank you.
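The recipe above maps onto standard Elasticsearch REST calls: an index-settings update to disable refresh during bulk indexing, and a force merge once indexing finishes. A sketch (the index name "benchmark" is illustrative; requests would be sent with any HTTP client):

```python
import json

# (method, path, body) triples for the indexing-time recipe described above.
calls = [
    # 1. disable refresh while bulk indexing
    ("PUT", "/benchmark/_settings", {"index": {"refresh_interval": "-1"}}),
    # 2. re-enable refresh once indexing is done
    ("PUT", "/benchmark/_settings", {"index": {"refresh_interval": "1s"}}),
    # 3. collapse the index into a single segment (can take hours, see above)
    ("POST", "/benchmark/_forcemerge?max_num_segments=1", None),
]
for method, path, body in calls:
    print(method, path, json.dumps(body) if body else "")
```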
Is there any plan to compare with OpenDistro Elasticsearch K-NN?
https://opendistro.github.io/for-elasticsearch/features/knn.html