-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/stats: more fully undo addOuterBuckets in stripOuterBuckets #135310
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @rytaft)
pkg/sql/stats/merge.go
line 109 at r1 (raw file):
// [{1, 0, 0, 2}, {1, 0, 0, 3}, {1, 0, 0, 4}] // // Partial Statistic: {row, 8, dist: 4, null: 0, size: 1}
nit: s/row, 8/row: 8/
😃
Near the end of `stats.(*histogram).adjustCounts`, if there is leftover DistinctCount after adjusting each histogram bucket, we call `addOuterBuckets`. This function creates two "outer" buckets that fully cover the lower and upper ends of the domain with the remaining DistinctCount. These "outer" buckets interfere with partial stats, so we remove them with `stripOuterBuckets` when merging partial stats. `stripOuterBuckets` was simply slicing off the two buckets, but this could potentially leave non-zero DistinctCount in the first bucket (which is added by `addOuterBuckets`). `stripOuterBuckets` needs to also zero the DistinctCount and NumRange of the first bucket to fully undo the actions of `addOuterBuckets`. We were only catching this when trying to forecast using the merged histogram, which would sometimes produce a forecasted histogram with non-zero DistinctRange in the first bucket, leading to the infamous "histogram had first bucket with non-zero NumRange or DistinctRange" Sentry failure. Unfortunately, I don't think this bug fully explains all of the Sentry failures that look like cockroachdb#93892 because many of those predate the introduction of partial stats. I believe this specific failure started with v24.3.0-alpha.0. Informs: cockroachdb#93892 Fixes: cockroachdb#134031 (No release note because automatic collection of partial stats is not yet enabled in any available release.) Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r=yuzefovich
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @yuzefovich)
pkg/sql/stats/merge.go
line 109 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/row, 8/row: 8/
😃
lol, how did you see this? nice. done.
135310: sql/stats: more fully undo addOuterBuckets in stripOuterBuckets r=yuzefovich a=michae2 Near the end of `stats.(*histogram).adjustCounts`, if there is leftover DistinctCount after adjusting each histogram bucket, we call `addOuterBuckets`. This function creates two "outer" buckets that fully cover the lower and upper ends of the domain with the remaining DistinctCount. These "outer" buckets interfere with partial stats, so we remove them with `stripOuterBuckets` when merging partial stats. `stripOuterBuckets` was simply slicing off the two buckets, but this could potentially leave non-zero DistinctCount in the first bucket (which is added by `addOuterBuckets`). `stripOuterBuckets` needs to also zero the DistinctCount and NumRange of the first bucket to fully undo the actions of `addOuterBuckets`. We were only catching this when trying to forecast using the merged histogram, which would sometimes produce a forecasted histogram with non-zero DistinctRange in the first bucket, leading to the infamous "histogram had first bucket with non-zero NumRange or DistinctRange" Sentry failure. Unfortunately, I don't think this bug fully explains all of the Sentry failures that look like #93892 because many of those predate the introduction of partial stats. I believe this specific failure started with v24.3.0-alpha.0. Informs: #93892 Fixes: #134031 (No release note because automatic collection of partial stats is not yet enabled in any available release.) Release note: None Co-authored-by: Michael Erickson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @rytaft)
pkg/sql/stats/merge.go
line 109 at r1 (raw file):
Previously, michae2 (Michael Erickson) wrote…
lol, how did you see this? nice. done.
Was reading through the comments to understand what's happening in the code :)
Build failed: |
Near the end of
stats.(*histogram).adjustCounts
, if there is leftover DistinctCount after adjusting each histogram bucket, we calladdOuterBuckets
. This function creates two "outer" buckets that fully cover the lower and upper ends of the domain with the remaining DistinctCount.These "outer" buckets interfere with partial stats, so we remove them with
stripOuterBuckets
when merging partial stats.stripOuterBuckets
was simply slicing off the two buckets, but this could potentially leave non-zero DistinctCount in the first bucket (which is added byaddOuterBuckets
).stripOuterBuckets
needs to also zero the DistinctCount and NumRange of the first bucket to fully undo the actions ofaddOuterBuckets
.We were only catching this when trying to forecast using the merged histogram, which would sometimes produce a forecasted histogram with non-zero DistinctRange in the first bucket, leading to the infamous "histogram had first bucket with non-zero NumRange or DistinctRange" Sentry failure.
Unfortunately, I don't think this bug fully explains all of the Sentry failures that look like #93892 because many of those predate the introduction of partial stats. I believe this specific failure started with v24.3.0-alpha.0.
Informs: #93892
Fixes: #134031
(No release note because automatic collection of partial stats is not yet enabled in any available release.)
Release note: None