Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize DiskIOMeter bar by number of disks #1533

Merged
merged 2 commits into from
Dec 14, 2024

Conversation

Explorer09
Copy link
Contributor

@Explorer09 Explorer09 commented Sep 6, 2024

The maximum utilization should be "100% * number of disks". Set the bar and graph of the meter to draw with that maximum. Also remove the 100% cap on the utilization percentage display.

Resolves #1374.

@natoscott
Copy link
Member

Something like this patch will provide the PCP value you seek. You might want to consider a similar change for the NetworkIOMeter, in which case there is a "hinv.ninterface" metric that can be used similarly.

pcp/Metric.h
--- /tmp/git-blob-5igpu5/Metric.h	2024-09-06 15:32:13.776523149 +1000
+++ pcp/Metric.h	2024-09-06 15:26:46.052772959 +1000
@@ -26,6 +26,7 @@ typedef enum Metric_ {
    PCP_CONTROL_THREADS,         /* proc.control.perclient.threads */
 
    PCP_HINV_NCPU,               /* hinv.ncpu */
+   PCP_HINV_NDISK,              /* hinv.ndisk */
    PCP_HINV_CPUCLOCK,           /* hinv.cpu.clock */
    PCP_UNAME_SYSNAME,           /* kernel.uname.sysname */
    PCP_UNAME_RELEASE,           /* kernel.uname.release */
pcp/Platform.c
--- /tmp/git-blob-JCaR3p/Platform.c	2024-09-06 15:32:13.956523525 +1000
+++ pcp/Platform.c	2024-09-06 15:30:56.940361942 +1000
@@ -127,6 +127,7 @@ static const char* Platform_metricNames[
    [PCP_CONTROL_THREADS] = "proc.control.perclient.threads",
 
    [PCP_HINV_NCPU] = "hinv.ncpu",
+   [PCP_HINV_NDISK] = "hinv.ndisk",
    [PCP_HINV_CPUCLOCK] = "hinv.cpu.clock",
    [PCP_UNAME_SYSNAME] = "kernel.uname.sysname",
    [PCP_UNAME_RELEASE] = "kernel.uname.release",
@@ -385,6 +386,7 @@ bool Platform_init(void) {
    Metric_enable(PCP_PID_MAX, true);
    Metric_enable(PCP_BOOTTIME, true);
    Metric_enable(PCP_HINV_NCPU, true);
+   Metric_enable(PCP_HINV_NDISK, true);
    Metric_enable(PCP_PERCPU_SYSTEM, true);
    Metric_enable(PCP_UNAME_SYSNAME, true);
    Metric_enable(PCP_UNAME_RELEASE, true);
@@ -753,6 +755,8 @@ bool Platform_getDiskIO(DiskIOData* data
       data->totalBytesWritten = value.ull;
    if (Metric_values(PCP_DISK_ACTIVE, &value, 1, PM_TYPE_U64) != NULL)
       data->totalMsTimeSpend = value.ull;
+   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U64) != NULL)
+      data->numDisks = value.ull;
    return true;
 }
 

@Explorer09
Copy link
Contributor Author

@natoscott There is no "utilization percentage" for NetworkIOMeter at the moment. So I would consider showing the number of interfaces a feature separate from this pull request.

@natoscott
Copy link
Member

Understood & agreed. FWIW, I think this approach is not ideal (shares the same problem of the original 100% limiting, as mentioned by the reporter of #1374 - the code was fine the way it was originally). We need to be aware here of properties of some hardware described in the iostat(1) man page ...

              %util  Percentage  of elapsed time during which I/O requests were
                     issued to the device (bandwidth utilization  for  the  de‐
                     vice).  Device  saturation occurs when this value is close
                     to 100% for devices serving requests  serially.   But  for
                     devices  serving requests in parallel, such as RAID arrays
                     and modern SSDs, this number does not reflect  their  per‐
                     formance limits.

@Explorer09
Copy link
Contributor Author

@natoscott What was the ideal approach anyway?

@natoscott
Copy link
Member

Just reverting to the original htop behaviour as the issue reporter suggested.

@Explorer09
Copy link
Contributor Author

@natoscott Look in my patch carefully. I separated the utilization value into two. cached_utilisation_diff would revert to original behavior of allowing more than 100% to be shown; the bar would be drawn with cached_utilisation_norm which is the normalized value.

@BenBE BenBE added the enhancement Extension or improvement to existing feature label Sep 6, 2024
@BenBE BenBE added this to the 3.4.0 milestone Sep 6, 2024
@Explorer09 Explorer09 marked this pull request as ready for review September 6, 2024 19:18
@@ -16,6 +16,7 @@ typedef struct DiskIOData_ {
uint64_t totalBytesRead;
uint64_t totalBytesWritten;
uint64_t totalMsTimeSpend;
uint64_t numDisks;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the data type be unsigned int, size_t, uint64_t or unsigned long long?
Should the property be named "numDisks" or "nDisks"?

Other than those questions, this patch is ready.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, both uint64_t and size_t would be fine. No need to change this.

@natoscott
Copy link
Member

@natoscott Look in my patch carefully. I separated the utilization value into two. cached_utilisation_diff would revert to original behavior of allowing more than 100% to be shown; the bar would be drawn with cached_utilisation_norm which is the normalized value.

Two thumbs up - good stuff, thanks. Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :) However, systems with more than 4 billion disks are few and far between so I'm sure a 32 bit unsigned value would suffice. If noone has strong opinions, I suggest sticking with what you have now since that's consistent with the surrounding code.

Copy link
Member

@natoscott natoscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

DiskIOMeter.c Outdated Show resolved Hide resolved
@Explorer09
Copy link
Contributor Author

Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :)

For PCP this would require a downcast from uint64_t and an (numDisks <= SIZE_MAX) check that I don't think it's worth it. That's why I chose a larger data type than size_t for now.

@natoscott
Copy link
Member

Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :)

For PCP this would require a downcast from uint64_t and...

FWIW, for PCP to use 32-bit types we would change this section:

   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U64) != NULL)
      data->numDisks = value.ull;

to

   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U32) != NULL)
      data->numDisks = value.ul;

but, keep it as you have it, it's fine IMO.

@fasterit
Copy link
Member

@ShaiMagal: Have you tested the PR?

@ShaiMagal
Copy link

ShaiMagal commented Dec 8, 2024

@natoscott When this PR will be merged? It's "game changer" for overall about disk activity :) Should it be merged sooner than in 3.4.0?

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from some minor cleanup stuff the PR LGTM.

@ShaiMagal Please confirm whether you have tested these changes.

@Explorer09 Mind to include these minor cleanups? Just throw in an extra patch with the consistency cleanup stuff and I think we are fine.

@@ -16,6 +16,7 @@ typedef struct DiskIOData_ {
uint64_t totalBytesRead;
uint64_t totalBytesWritten;
uint64_t totalMsTimeSpend;
uint64_t numDisks;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, both uint64_t and size_t would be fine. No need to change this.

darwin/Platform.c Outdated Show resolved Hide resolved
freebsd/Platform.c Outdated Show resolved Hide resolved
linux/Platform.c Outdated Show resolved Hide resolved
@Explorer09 Explorer09 force-pushed the disk-io-norm-util branch 2 times, most recently from 152412b to 64efa16 Compare December 12, 2024 20:50
@BenBE BenBE force-pushed the disk-io-norm-util branch from ddc3679 to d6d8f0a Compare December 14, 2024 15:49
Explorer09 and others added 2 commits December 14, 2024 16:57
The utilization percentage of DiskIOMeter is an accumulated total of
all disks, and for multiple disks, utilization above 100% is possible.
The maximum utilization should be "100% * number of disks". Set the bar
and graph of the meter to draw with that maximum.

Thanks to Nathan Scott for providing the PCP portion of the patch.

Resolves htop-dev#1374.

Co-authored-by: Nathan Scott <[email protected]>
Signed-off-by: Kang-Che Sung <[email protected]>
Change the data type from unsigned long long to uint64_t.

Co-authored-by: Benny Baumann <[email protected]>
@BenBE BenBE force-pushed the disk-io-norm-util branch from d6d8f0a to becb032 Compare December 14, 2024 15:57
@fasterit fasterit merged commit 4ba1988 into htop-dev:main Dec 14, 2024
18 checks passed
@Explorer09 Explorer09 deleted the disk-io-norm-util branch December 14, 2024 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Extension or improvement to existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow Disk IO meter to be above 100%
5 participants