Additional tracking on histograms for means, std dev, p* quantile estimates #5897
Thanks for taking all this on, it's a big chunk of work! I've not been able to work through everything yet, but have some questions and comments around the guts of the P2 algorithm. Maybe we can take a pass at those, and then I'll work through the rest once we're happy? Please let me know if my questions aren't clear.
Thanks!
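For readers following the thread without the diff, here is a minimal Python sketch of the P2 (P-squared) quantile estimator in its commonly described five-marker form. It is not the code under review: the class name `P2Quantile`, its methods, and the `demo_median` helper are all hypothetical names chosen for illustration.

```python
import random


class P2Quantile:
    """Streaming estimate of a single quantile p using five markers (hypothetical sketch)."""

    def __init__(self, p):
        if not 0.0 < p < 1.0:
            raise ValueError("p must be strictly between 0 and 1")
        self.p = p
        self.heights = []                  # marker heights; buffers the first 5 samples
        self.positions = [1, 2, 3, 4, 5]   # actual marker positions (1-based)
        self.desired = [1, 1 + 2 * p, 1 + 4 * p, 3 + 2 * p, 5]
        self.increments = [0, p / 2, p, (1 + p) / 2, 1]

    def _parabolic(self, i, s):
        # Piecewise-parabolic prediction of marker i's height after moving by s.
        q, n = self.heights, self.positions
        return q[i] + s / (n[i + 1] - n[i - 1]) * (
            (n[i] - n[i - 1] + s) * (q[i + 1] - q[i]) / (n[i + 1] - n[i])
            + (n[i + 1] - n[i] - s) * (q[i] - q[i - 1]) / (n[i] - n[i - 1])
        )

    def update(self, x):
        q, n = self.heights, self.positions
        if len(q) < 5:
            # Initialization: keep the first five samples sorted.
            q.append(x)
            q.sort()
            return
        # Find the cell k with q[k] <= x < q[k+1], extending the extremes if needed.
        if x < q[0]:
            q[0] = x
            k = 0
        elif x >= q[4]:
            q[4] = x
            k = 3
        else:
            k = max(i for i in range(4) if q[i] <= x)
        # Shift positions of markers above the cell; advance desired positions.
        for i in range(k + 1, 5):
            n[i] += 1
        for i in range(5):
            self.desired[i] += self.increments[i]
        # Nudge the three interior markers toward their desired positions.
        for i in (1, 2, 3):
            d = self.desired[i] - n[i]
            if (d >= 1 and n[i + 1] - n[i] > 1) or (d <= -1 and n[i - 1] - n[i] < -1):
                s = 1 if d > 0 else -1
                candidate = self._parabolic(i, s)
                if not q[i - 1] < candidate < q[i + 1]:
                    # Fall back to linear interpolation to keep heights ordered.
                    candidate = q[i] + s * (q[i + s] - q[i]) / (n[i + s] - n[i])
                q[i] = candidate
                n[i] += s

    def estimate(self):
        q = self.heights
        if not q:
            raise ValueError("no samples observed")
        if len(q) < 5:
            # Too few samples for markers: read from the sorted buffer.
            return q[min(len(q) - 1, int(self.p * len(q)))]
        return q[2]  # the middle marker tracks the p-quantile


def demo_median(seed=0):
    """Feed a shuffled 1..1000 stream and return the estimated median."""
    data = [float(v) for v in range(1, 1001)]
    random.Random(seed).shuffle(data)
    est = P2Quantile(0.5)
    for x in data:
        est.update(x)
    return est.estimate()
```

The estimator keeps only five heights and five positions per tracked quantile, so memory stays constant regardless of how many samples are seen. The result is an approximation, not an exact quantile.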
Looks great! I really like the new docstrings and doctests, and the cleanup of the quantile internals is all awesome. I've got a few remaining comments, most of which are pretty small at this point. Hopefully things are clear, but as always let me know if not! Thanks, this is really great to see.
The last piece is the SQL updates (I will make those in the next commit).
…stimated quantiles. Closes #4913. This is part 1 of tracking more information within a histogram in order to compute min, max, means, std_dev, and estimates for p50, p90, p99 and how to store/retrieve the associated info from Clickhouse.
…ersion on histo creation
I took a look at the SQL migrations assuming that's the only thing that changed. Those LGTM, though there's one opportunity I think we should take for simplifying the definition of a distributed table on top of the local ones. That will really reduce the difficulty of changing the replicated table definitions.
Other than that, LGTM! Thanks for all the back and forth on this, it'll be cool to see this work merged!
Closes #4913.
This is part 1 of tracking more information within a histogram to compute min, max, mean, std_dev, and estimates for p50, p90, and p99, along with how to store and retrieve the associated info from ClickHouse.
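As a rough illustration of the running statistics mentioned above, the sketch below tracks min, max, mean, and sample standard deviation in a single pass using Welford's online update. It assumes nothing about this PR's actual implementation: `RunningStats` and its fields are hypothetical names, and the choice of the sample (n - 1) standard deviation is also an assumption.

```python
import math


class RunningStats:
    """Single-pass min, max, mean, and std dev (hypothetical sketch, Welford's method)."""

    def __init__(self):
        self.count = 0
        self.min = math.inf
        self.max = -math.inf
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the old and the updated mean, which keeps the update numerically stable.
        self._m2 += delta * (x - self.mean)

    def std_dev(self):
        # Sample standard deviation (divides by n - 1); assumed convention.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


def summarize(samples):
    """Fold an iterable of numbers into a RunningStats instance."""
    stats = RunningStats()
    for x in samples:
        stats.update(x)
    return stats
```

Only the count, the mean, and one accumulator need to be stored alongside the bins, which is why this style of update suits a histogram that is filled one sample at a time.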