Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to get the high-quality knn graph without refinement as mentioned in the paper? #5

Open
RayWang96 opened this issue Feb 6, 2021 · 2 comments

Comments

@RayWang96
Copy link

RayWang96 commented Feb 6, 2021

Hello! Thanks for your excellent work. I'm trying your demo. I am confused that I can't get the high-quality knn graph without refinement as mentioned in the paper. For SIFT1M, I can only get C@10 of 0.5022 when KBuild=62 (Any larger KBuild will lead to an illegal memory error).

It is mentioned in the paper that your algorithm achieves C@10 of 0.987 without refinement. Please kindly tell me that how do I get that result. Besides, I used the getGraph function to export the graph, please let me know if I get something wrong. Thank you again!

@LukasRuppert
Copy link
Collaborator

Hi,
I've looked up the corresponding configuration.
Use K = 24, tau_build = 0.7, refinements = 0, tau_query = 1.08.
At the end of demo.cpp add query_function(1.08f);
and run the program using -tau 0.7 -refinement_iterations 0.

e.g.
./demo -base_filename sift_base.fvecs -query_filename sift_query.fvecs -groundtruth_filename sift_groundtruth.ivecs -tau 0.7 -refinement_iterations 0 -gpu_id=0 --v=4 -logtostderr

With these parameters, I get the following result:

I20210206 12:57:57.716079 10807 cuda_knn_query_layer.cuh:53] QueryKernel -- KQuery: 10 MAX_ITERATIONS: 400 CACHE_SIZE: 512
ms: 222.001144 for 10000 queries -> 0.022200 ms/query 
I20210206 12:57:57.939872 10807 cuda_knn_ggnn.cuh:137] c@1: 0.9871
I20210206 12:57:57.939880 10807 cuda_knn_ggnn.cuh:138] c@10: 0.98721
I20210206 12:57:57.939883 10807 cuda_knn_ggnn.cuh:140] mean calculated dists: 6310.51

Note that always two versions of the query are being executed per tau_query, so this will be the second last result.

The getGraph function will just give you the neighborhoods in the graph.
For our query, some more metadata is necessary (such as the distances to the nearest neighbor per point in the graph and the mean and max over these distances).
There is some dedicated serialization code, but we also noticed a bug in the currently published version.
It does not properly export the upper layers of the graph.

We will soon release an updated version of the code which also addresses the issues you experienced with KBuild > 62.
Part of it is the requirement that S > KBuild/2, but there are quite a few more changes necessary.

@RayWang96
Copy link
Author

RayWang96 commented Feb 6, 2021

Thank you very much for your reply! I mistakenly thought the results in the paper were the quality of the graph. Now I get the same result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants