-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extending simplified PQ functions #520
base: main
Are you sure you want to change the base?
Conversation
@michael-popov please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
Contributor License AgreementContribution License AgreementThis Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
|
src/pq.cpp
Outdated
{ | ||
if (num_pq_chunks == 0 || num_pq_chunks > dim || dim % num_pq_chunks != 0) | ||
if (num_pq_chunks < 2 || num_pq_chunks > dim) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why if < 2 we return -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return value -1 means functions failed for some reason.
We check input parameters:
- num_pq_chunks == 0 => there is a division by num_pq_chunks a bit later, which will crash the system
- num_pq_chunks == 1 => originally I had offsets.size() == num_pq_chunks, and then I use offsets[1] as the largest possible size for a chunk. If num_pq_chunks == 1, then access offsets[1] will crash the system. Eventually I accepted the wisdom of the original function and added one more element to offsets so offsets.size() == num_pq_chunks+1 but the check of num_pq_chunks value was left the same.
In general, num_pq_chunks==1 does not make sense anyway, right? In this case the whole vector is squashed into a single number 0..255. Too much information is lost. I can't imagine a practical case when num_pq_chunks==1 is useful.
On the other hand, there are many things that I cannot imagine but they are still valid. :)
I can change the check to if (num_pq_chunks == 0) return -1; if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say lets do what the last option is, we can allow num_pq_chunks =1.
Extending functions for in-memory PQ calculations:
For testing I modified (but not added to the PR) generate_pq utility. Code for testing loads pivot and centroids data produced by original functions and uses these data to calculate PQ with a new function. Then it compares results with PQ data produced by the original function.