-
Notifications
You must be signed in to change notification settings - Fork 797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pagination support for ListRules #6299
Pagination support for ListRules #6299
Conversation
f51bdef
to
e710de2
Compare
Signed-off-by: Anand Rajagopal <[email protected]>
e710de2
to
4c1d435
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this looks good.
} | ||
if filter.NextToken != "" { | ||
addQueryParams(urlValues, "group_next_token", filter.NextToken) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: If we just add group_next_token=""
what will happen? It should still be a non paginated request?
Maybe we can just remove the condition check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep consistency with upstream, we should fail if group_next_token
was provided but group_limit
was not.
https://github.com/prometheus/prometheus/blob/main/web/api/v1/api.go#L1609
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see the validation in the API.
} | ||
|
||
func (r *Ruler) getLocalRules(userID string, rulesRequest RulesRequest, includeBackups bool) ([]*GroupStateDesc, error) { | ||
func (r *Ruler) getLocalRules(userID string, rulesRequest RulesRequest, includeBackups bool) (RulesResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why we don't return pointer here? since pointer is what is asked by GetRules
.
MaxRuleGroup: 20, | ||
}, | ||
resultCheckFn: func(t assert.TestingT, resultGroups []*ruler.RuleGroup, token string, iteration int) { | ||
assert.Len(t, resultGroups, 20, "Expected %d rules but got %d", 20, len(resultGroups)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: doesn't the error message from the assertion tell us what was expected vs the actual?
}{ | ||
"List Rule Groups - Equal number of rule groups per page": { | ||
filter: e2ecortex.RuleFilter{ | ||
MaxRuleGroup: 20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a test that use the next token?
pkg/ruler/merger.go
Outdated
|
||
if maxRuleGroups > 0 { | ||
//Need to sort here before we truncate | ||
sort.Sort(GroupStateDescs(groups)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should always sort the result.
We are already sorting the results in the prometheus list rules:
Line 243 in c25b18d
sort.Slice(groups, func(i, j int) bool { |
So we could sort just once here instead for all the cases. This will reduce modality and simplify the code a bit. WDYT @yeya24 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, where is the next token from the previous request being used here to generate the response starting from the rule group that we truncated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should always sort the result.
We are already sorting the results in the prometheus list rules:
Line 243 in c25b18d
sort.Slice(groups, func(i, j int) bool { So we could sort just once here instead for all the cases. This will reduce modality and simplify the code a bit. WDYT @yeya24 ?
Without pagination, the size of the list can be pretty substantial and we would sort it twice. This was the reason I did not want to sort for non-paginated requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, where is the next token from the previous request being used here to generate the response starting from the rule group that we truncated?
Each response from getLocalRules has been filtered using the next token
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each response from getLocalRules has been filtered using the next token
but what about the case you are using sharding?
My understanding is that in this case the ruler that receives the request will look into the ring for the other rulers that contain the rules and fire requests to each to collect from them. This ruler will then consolidate the list of rules in a single list before returning the response. I was expecting to see the pagination happen at this point because we need the complete list of rules to paginate properly.
My understanding is that it is not possible to create a distributed pagination algorithm because the way how we spread the rulers in rulers. This specially important if there are ring reorganization during requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each response from getLocalRules has been filtered using the next token
but what about the case you are using sharding?
getShardedRules()
calls getLocalRules()
. So the results from each ruler is already filtered using next token
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we discussed offline, and there was a misunderstanding from my part. I stand corrected.
Since we are truncating on a sorted list, where the sorting happens by the sha(namespace + group), it is safe to use the >=
as filter when we spread the requests to get the rules from other rulers. This is like a distributed filter operation.
Having said that, It would not hurt to add a bit of description to the PR to detail how the implementation was done.
pkg/ruler/ruler_pagination.go
Outdated
func GetRuleGroupNextToken(namespace string, group string) string { | ||
h := sha1.New() | ||
h.Write([]byte(namespace + ";" + group)) | ||
return hex.EncodeToString(h.Sum(nil)) | ||
} | ||
|
||
func TruncateGroups(groups []*GroupStateDesc, maxRuleGroups int) ([]*GroupStateDesc, string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these functions need to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking only about TruncateGroups and GetRuleGroupNextToken
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TruncateGroups
does not need to be public. I will fix it. GetRuleGroupNextToken
needs to be public since it is used in integration testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would actually move all the content of this file to close where they are used, since they are only used in one place. I'm not particularly strong about this but I think it would be simpler to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another suggestion, instead of truncateGroups we could name this function generatePage, createPage, makePage or similar to make it clear that this function is used to generate the page response with token etc. I think this function deserves some comments describing how the pagination works and that is supposed to receive a sorted list of groups. just my two cents.
resultingGroupDescs := make([]*GroupStateDesc, 0, len(combinedRuleStateDescs)) | ||
for _, group := range combinedRuleStateDescs { | ||
groupID := GetRuleGroupNextToken(group.Group.Namespace, group.Group.Name) | ||
if len(rulesRequest.NextToken) > 0 && rulesRequest.NextToken >= groupID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no meaning in comparing two hashes sha1() using >=
. We should use ==
only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline, this is ok because are assuming that combinedRulesStateDescs
is sorted by sha.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a small comment here detailing why it is ok to compare groupIDs.
pkg/ruler/ruler_pagination.go
Outdated
"encoding/hex" | ||
) | ||
|
||
type GroupStateDescs []*GroupStateDesc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called PaginedGroupStates or something like that since this is only used in the pagination context?
Also I don't think this type need to be public right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will now need to be public because it is used in integ tests
Signed-off-by: Anand Rajagopal <[email protected]>
pkg/ruler/ruler_pagination.go
Outdated
@@ -5,21 +5,24 @@ import ( | |||
"encoding/hex" | |||
) | |||
|
|||
type GroupStateDescs []*GroupStateDesc | |||
type PaginedGroupStates []*GroupStateDesc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo? PaginedGroupStates
-> PaginatedGroupStates
Signed-off-by: Anand Rajagopal <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
What this PR does:
This PR adds pagination support for List Rules API similar to Prometheus pagination feature for List Rules
generatePage
getShardedRules()
merges results from all thegetLocalRules()
, deduplicates, sorts using token as the key, and generates the next pageChecklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]