[C++] Add filesystem stats #38465

pitrou · 2023-10-25T18:38:48Z

Describe the enhancement requested

It would be nice to have per-filesystem stats to better analyze how filesystems are used by libraries (such as the Parquet reader).

Potentially useful stats:

- total number of IO requests issued (one IO request could be one POSIX API call for local filesystems, or one network request for S3, etc.)
- total number of IO reads issued
- total number of IO writes issued
- average size of IO reads
- average size of IO writes
- approximate quantiles of IO read sizes
- approximate quantiles of IO write sizes
- average duration of IO reads
- average duration of IO writes
- approximate quantiles of IO read durations
- approximate quantiles of IO write durations

(perhaps some of those would be too costly to record, we'll see)
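
To make the counter and average stats listed above concrete, here is a minimal sketch of what a per-filesystem collector could look like. Everything in it (`IoStats`, `IoStatsSnapshot`, `RecordRead`, `RecordWrite`, `RecordOther`) is a hypothetical name for illustration, not an existing Arrow API. It assumes relaxed atomic counters are cheap enough to update on every request, and it leaves quantiles out.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical plain-value view of the counters, used for reporting.
struct IoStatsSnapshot {
  uint64_t num_reads = 0;
  uint64_t num_writes = 0;
  uint64_t num_other = 0;  // requests that are neither reads nor writes
  uint64_t bytes_read = 0;
  uint64_t bytes_written = 0;
  uint64_t read_nanos = 0;
  uint64_t write_nanos = 0;

  uint64_t num_requests() const { return num_reads + num_writes + num_other; }
  double avg_read_size() const { return Ratio(bytes_read, num_reads); }
  double avg_write_size() const { return Ratio(bytes_written, num_writes); }
  double avg_read_nanos() const { return Ratio(read_nanos, num_reads); }
  double avg_write_nanos() const { return Ratio(write_nanos, num_writes); }

 private:
  // Returns 0 rather than dividing by zero when nothing was recorded.
  static double Ratio(uint64_t sum, uint64_t count) {
    return count == 0 ? 0.0 : static_cast<double>(sum) / static_cast<double>(count);
  }
};

// Hypothetical collector: one instance per filesystem object.
class IoStats {
 public:
  void RecordRead(uint64_t nbytes, std::chrono::nanoseconds elapsed) {
    assert(elapsed.count() >= 0);
    num_reads_.fetch_add(1, std::memory_order_relaxed);
    bytes_read_.fetch_add(nbytes, std::memory_order_relaxed);
    read_nanos_.fetch_add(static_cast<uint64_t>(elapsed.count()),
                          std::memory_order_relaxed);
  }

  void RecordWrite(uint64_t nbytes, std::chrono::nanoseconds elapsed) {
    assert(elapsed.count() >= 0);
    num_writes_.fetch_add(1, std::memory_order_relaxed);
    bytes_written_.fetch_add(nbytes, std::memory_order_relaxed);
    write_nanos_.fetch_add(static_cast<uint64_t>(elapsed.count()),
                           std::memory_order_relaxed);
  }

  // Any other request, e.g. a metadata lookup or a directory listing.
  void RecordOther() { num_other_.fetch_add(1, std::memory_order_relaxed); }

  // Each counter is loaded separately, so a snapshot taken during
  // concurrent updates is only approximately consistent.
  IoStatsSnapshot Snapshot() const {
    IoStatsSnapshot s;
    s.num_reads = num_reads_.load(std::memory_order_relaxed);
    s.num_writes = num_writes_.load(std::memory_order_relaxed);
    s.num_other = num_other_.load(std::memory_order_relaxed);
    s.bytes_read = bytes_read_.load(std::memory_order_relaxed);
    s.bytes_written = bytes_written_.load(std::memory_order_relaxed);
    s.read_nanos = read_nanos_.load(std::memory_order_relaxed);
    s.write_nanos = write_nanos_.load(std::memory_order_relaxed);
    return s;
  }

 private:
  std::atomic<uint64_t> num_reads_{0};
  std::atomic<uint64_t> num_writes_{0};
  std::atomic<uint64_t> num_other_{0};
  std::atomic<uint64_t> bytes_read_{0};
  std::atomic<uint64_t> bytes_written_{0};
  std::atomic<uint64_t> read_nanos_{0};
  std::atomic<uint64_t> write_nanos_{0};
};
```

A filesystem implementation would call `RecordRead` or `RecordWrite` once per underlying request (one POSIX call, one S3 request), and a benchmark would diff two `Snapshot()` results taken before and after the measured section.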

Component(s)

C++

pitrou · 2023-10-25T18:39:02Z

@mrocklin @lidavidm Any ideas and opinions here?

pitrou · 2023-10-25T19:20:59Z

Note: we have a T-Digest implementation for approximate quantiles, though https://github.com/HdrHistogram/HdrHistogram_c could be another efficient solution.
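
For a sense of how cheap an approximate-quantile recorder can be, here is a rough sketch using fixed power-of-two buckets. This is neither T-Digest nor HdrHistogram (both give much tighter error bounds), and `Log2Histogram` is a hypothetical name. It is also not thread-safe as written.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical histogram with one bucket per bit width:
// bucket 0 holds the value 0, bucket b holds [2^(b-1), 2^b - 1].
class Log2Histogram {
 public:
  void Record(uint64_t value) {
    ++counts_[BucketOf(value)];
    ++total_;
  }

  // Inclusive upper bound of the bucket containing the q-th quantile,
  // so the answer can overestimate the true quantile by up to about 2x.
  uint64_t QuantileUpperBound(double q) const {
    assert(q >= 0.0 && q <= 1.0);
    if (total_ == 0) return 0;
    // Number of samples that must be covered to reach fraction q (at least 1).
    uint64_t rank =
        static_cast<uint64_t>(std::ceil(q * static_cast<double>(total_)));
    if (rank < 1) rank = 1;
    uint64_t seen = 0;
    for (size_t b = 0; b < counts_.size(); ++b) {
      seen += counts_[b];
      if (seen >= rank) return UpperBound(b);
    }
    return UpperBound(counts_.size() - 1);
  }

 private:
  // Bit width of v: 0 for v == 0, otherwise floor(log2(v)) + 1.
  static size_t BucketOf(uint64_t v) {
    size_t b = 0;
    while (v != 0) {
      ++b;
      v >>= 1;
    }
    return b;
  }

  static uint64_t UpperBound(size_t b) {
    if (b == 0) return 0;
    if (b >= 64) return UINT64_MAX;
    return (uint64_t{1} << b) - 1;
  }

  std::array<uint64_t, 65> counts_{};  // zero-initialized
  uint64_t total_ = 0;
};
```

Recording a sample costs one loop over at most 64 bits plus two increments, which is the kind of overhead that would need to be measured before deciding which of the quantile stats are worth keeping.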

mapleFU · 2023-10-25T19:22:08Z

Awesome, RocksDB also has a simple histogram for this (though it only covers disk IO).

mrocklin · 2023-10-25T22:41:29Z

These things seem good to know. I think that at a high level the question I want answered is whether I'm operating efficiently. Sub-questions to this include:

- What is my overall bandwidth over time? (including overlapping / parallel reads)
- If it's low, what's going on?
  - What is my bandwidth per read?
  - If it's low, the next thing I'll ask is about the distribution of sizes
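
The difference between overall bandwidth and per-read bandwidth can be sketched as follows. `ReadEvent`, `OverallBandwidth` and `PerReadBandwidth` are hypothetical names, and the sketch assumes each read has been recorded with a start time, an end time (in seconds) and a byte count. Overall bandwidth here means total bytes divided by the wall-clock span from the first start to the last end, so parallel reads raise it while leaving per-read bandwidth unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one completed read.
struct ReadEvent {
  double start_s;  // start time in seconds
  double end_s;    // end time in seconds
  uint64_t bytes;  // bytes transferred
};

// Total bytes divided by the wall-clock span covering all reads.
// Overlapping reads share the same wall-clock time but all their bytes count.
double OverallBandwidth(const std::vector<ReadEvent>& reads) {
  if (reads.empty()) return 0.0;
  double first = reads[0].start_s;
  double last = reads[0].end_s;
  uint64_t total_bytes = 0;
  for (const ReadEvent& r : reads) {
    assert(r.end_s >= r.start_s);
    first = std::min(first, r.start_s);
    last = std::max(last, r.end_s);
    total_bytes += r.bytes;
  }
  const double span = last - first;
  return span > 0.0 ? static_cast<double>(total_bytes) / span : 0.0;
}

// Bytes divided by the duration of a single read.
double PerReadBandwidth(const ReadEvent& r) {
  const double duration = r.end_s - r.start_s;
  return duration > 0.0 ? static_cast<double>(r.bytes) / duration : 0.0;
}
```

If overall bandwidth is low while per-read bandwidth looks healthy, the reads are probably not overlapping enough; if per-read bandwidth is low too, the distribution of read sizes is the next thing to inspect.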

wjones127 · 2023-11-12T04:59:18Z

FWIW in Lance what we've found most helpful in understanding filesystem / object store use is traces rather than metrics. We use the Rust tracing library and Perfetto to visualize. Here's an example investigation: lancedb/lance#1352 (comment)

I think in Arrow C++, you could integrate OpenTelemetry tracing in the filesystems. @amoeba did some similar work integrating it with Flight C++.

That being said, collecting these kinds of metrics would be useful as additional information in benchmark comparisons. It would be nice, for example, for a benchmark to show that a certain optimization reduced the number of IO requests.

pitrou added the Type: enhancement label Oct 25, 2023

github-actions bot added the Component: C++ label Oct 25, 2023

pitrou self-assigned this Oct 25, 2023
