Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: sort document reference by long type id #14248

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

milaGGL
Copy link
Contributor

@milaGGL milaGGL commented Dec 12, 2024

Document ID supports strings and well as long integers in the format of "id" . When sorting documents by document ID, it should be sorted in the following order:

  1. Long (numeric order)
  2. String (lexicographic order)

#no-changelog

util::ComparisonResult CompareTo(const T& rhs) const {
return util::CompareContainer(segments_, rhs.segments_);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CompareContainer is a generic method that compares items in 2 T& variables by iteration, and it is only used here in base_path. The sorting order is unique to base_path(specifically resource_path), makes more sense to add the logic here instead of modifying the CompareContainer function.

@@ -174,6 +184,31 @@ class BasePath {

private:
SegmentsT segments_;

static util::ComparisonResult compareSegments(const std::string& lhs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ style guide suggests function names to start with a capital letter (CompareSegments)


static bool isNumericId(const std::string& segment) {
return segment.substr(0, 4) == "__id" &&
segment.substr(segment.size() - 2) == "__";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

substr throws and exception if pos is larger than the string length (source). Also not sure what the behavior would be if segment.size is 1 (and therefore segment.size()-2 will be -1). Please add a unit test for these corner cases.

You may want to do:

return
  segment.size() > 6 &&
  segment.substr(0, 4) == "__id" &&
  segment.substr(segment.size() - 2) == "__";

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by adding unit tests I mean testing that the output of isNumericId is false for ("__id" or "__id__" or "") for example.

}

static int64_t extractNumericId(const std::string& segment) {
return std::stol(segment.substr(4, segment.size() - 2));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::stol throws an std::invalid_argument exception if the argument is an empty string. So it reinforces that this function should not be called if the segment length is 6 or smaller.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants