storage: track cache disk metrics separately #24138
Merged
Prepare for addition of cache disk metrics.
Previously, if the data and cache directories were mounted on different disks, we would override the metrics with information from the cache disk only. This commit introduces a separate metric for the cache disk to fix the bug and to allow greater visibility into the underlying system.

I considered introducing a label rather than a new metric name, but that would break backwards compatibility of the metrics, since the cardinality would change from 1 to 2, and could also break dashboards and our tests. From the point of view of time series storage engines like Prometheus, there is no difference between a new metric name and a new label.

Fixes https://redpandadata.atlassian.net/browse/CORE-1609
Fixes redpanda-data#15223
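For illustration only, here is a minimal sketch of reporting the two directories under separate metric names. It is not the code from this PR: the `gauges` map, `report_disk`, and `collect` are hypothetical stand-ins for a metrics registry, and `std::filesystem::space` is used here simply as a portable way to read capacity and free bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for a metrics registry: metric name -> gauge value.
using gauges = std::map<std::string, std::uintmax_t>;

// Publish total/free bytes of the filesystem that contains `dir` under
// "<prefix>_total_bytes" and "<prefix>_free_bytes".
void report_disk(gauges& out, const std::string& prefix, const fs::path& dir) {
    assert(!prefix.empty());
    const fs::space_info si = fs::space(dir); // throws fs::filesystem_error on failure
    out[prefix + "_total_bytes"] = si.capacity;
    out[prefix + "_free_bytes"] = si.free;
}

// Each directory gets its own metric names, so the cache disk can no longer
// overwrite the values reported for the data disk.
gauges collect(const fs::path& data_dir, const fs::path& cache_dir) {
    gauges g;
    report_disk(g, "storage_disk", data_dir);
    report_disk(g, "storage_cache_disk", cache_dir);
    return g;
}
```

Calling `collect` with the same path for both arguments yields four gauges whose totals match, which mirrors the case where both directories share a mountpoint.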
Before

> ```
> storage space alert: free space at 61.734% on /var/lib/redpanda/data:
> 307.545GiB total, 189.861GiB free, min. free 0.000bytes. Please adjust
> retention policies as needed to allow writing again
> ```

After

> ```
> space alert: free space at 61.455% on /var/lib/redpanda/data: 307.545GiB
> total, 189.001GiB free, min. free for alert 0.000bytes, min. free for
> degraded 1024.000PiB. Please adjust retention policies as needed to
> allow writing again
> ```
nvartolomei force-pushed the nv/CORE-1609 branch from 52128e3 to 5abcc9e on November 16, 2024 13:17
dotnwat approved these changes on Nov 19, 2024
lgtm
human::bytes(disk.total), // NOLINT narrowing conv.
human::bytes(disk.free), // NOLINT " "
human::bytes(min_space), // NOLINT " "
human::bytes(disk.total), // NOLINT narrowing conv.
nit: you can silence a specific check with NOLINT(actual-name-of-check)
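As a sketch of that suggestion: clang-tidy accepts a check name in parentheses after `NOLINT`, so only that diagnostic is suppressed on the line. The check name below, `cppcoreguidelines-narrowing-conversions`, is an assumption about which check fires here, and `format_bytes` is a hypothetical stand-in that assumes the formatter takes a `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter taking a double; not the real human::bytes.
std::string format_bytes(double n) {
    assert(n >= 0.0);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.3fbytes", n);
    return buf;
}

std::string describe(std::uint64_t total) {
    // uint64_t -> double is an implicit conversion that can lose precision.
    // Naming the check keeps every other clang-tidy diagnostic active on
    // this line, unlike a bare NOLINT.
    return format_bytes(total); // NOLINT(cppcoreguidelines-narrowing-conversions)
}
```

An explicit `static_cast<double>(total)` is another way to make the conversion intentional without any suppression comment.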
/backport v24.3.x
Additional test in #24140
Backports Required
Release Notes
Bug Fixes
The storage_disk_{total,free}_bytes metrics now report values for the data directory mountpoint, and the new storage_cache_disk_{total,free}_bytes metrics report values for the cache directory mountpoint. The two sets of metrics are equivalent if both directories are on the same mountpoint.
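To check whether two directories are likely to report identical values, one rough approach on POSIX systems is to compare the device IDs returned by `stat`. This is a sketch under that assumption, not something the PR does; `same_filesystem` is a hypothetical helper, and matching `st_dev` values are only an approximation of "same mountpoint" (bind mounts and some filesystems can behave differently).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <sys/stat.h>

// Returns true when both paths live on the same device according to stat(2).
bool same_filesystem(const std::string& a, const std::string& b) {
    assert(!a.empty() && !b.empty());
    struct stat sa{};
    struct stat sb{};
    if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0) {
        throw std::runtime_error("stat failed for one of the paths");
    }
    return sa.st_dev == sb.st_dev;
}
```

If this returns true for the data and cache directories, the data and cache metric families would be expected to carry the same values.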