Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose metrics related to memory arbitration #7940

Conversation

bikramSingh91
Copy link
Contributor

@bikramSingh91 bikramSingh91 commented Dec 8, 2023

This adds the following stats:

"velox.arbitrator_requests_count"
"velox.arbitrator_aborted_count"
"velox.arbitrator_failures_count"
"velox.arbitrator_queue_time_ms"
"velox.arbitrator_arbitration_time_ms"
"velox.arbitrator_free_capacity_bytes"

Also fixed accounting for:
"velox.memory_non_reclaimable_count"

Copy link

netlify bot commented Dec 8, 2023

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit e40e02b
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6573a74d1bd653000857e3d6

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 8, 2023
@facebook-github-bot
Copy link
Contributor

@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bikramSingh91 LGTM % minors. Thanks!

velox/common/base/Counters.h Show resolved Hide resolved
DEFINE_METRIC(
kMetricArbitratorFailuresCount, facebook::velox::StatType::COUNT);

// Tracks the arbitration request queue times in range of [0, 100s] and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need to increase the queue time range later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

increased to match that of reclaim time since they should roughly be on the same order of magnitude


// Tracks the arbitration run times in range of [0, 100s] and reports P50,
// P90, P99, and P100.
DEFINE_HISTOGRAM_METRIC(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

increased to match that of reclaim time since they should roughly be on the same order of magnitude

@@ -94,6 +96,7 @@ uint64_t SortingWriter::reclaim(
}

if (!isRunning()) {
RECORD_METRIC_VALUE(kMetricMemoryNonReclaimableCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this. As for now, kMetricMemoryNonReclaimableCount tracks the reclaim in non-reclaimable section caused by insufficient memory reservation? thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification. I have added context to the meaning of this metric to both the metrics.rst doc and the cpp file declaring this metric. I do however feel that we would want to rename this to make it more explicit that it only counts for those kind of non-reclaimable failures. Please let me know what you think.

velox/dwio/dwrf/writer/Writer.cpp Outdated Show resolved Hide resolved
velox/dwio/dwrf/writer/Writer.cpp Show resolved Hide resolved
velox/common/base/Counters.cpp Outdated Show resolved Hide resolved
velox/exec/TableWriter.cpp Outdated Show resolved Hide resolved
velox/exec/TableWriter.cpp Outdated Show resolved Hide resolved
velox/exec/SharedArbitrator.cpp Outdated Show resolved Hide resolved
@bikramSingh91 bikramSingh91 force-pushed the createArbitratorCounters branch from d8e94b3 to 31fb677 Compare December 8, 2023 23:28
This adds the following stats:

"velox.arbitrator_requests_count"
"velox.arbitrator_aborted_count"
"velox.arbitrator_failures_count"
"velox.arbitrator_queue_time_ms"
"velox.arbitrator_arbitration_time_ms"
"velox.arbitrator_free_capacity_bytes"

Also fixed accounting for:
"velox.memory_non_reclaimable_count"
@bikramSingh91 bikramSingh91 force-pushed the createArbitratorCounters branch from 31fb677 to e40e02b Compare December 8, 2023 23:31
@facebook-github-bot
Copy link
Contributor

@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@bikramSingh91 merged this pull request in fa82b94.

Copy link

Conbench analyzed the 1 benchmark run on commit fa82b947.

There were 2 benchmark results indicating a performance regression:

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants