
Conversation

@Finomosec Finomosec commented Apr 30, 2024

Added missing mdadm stats:

  • node_md_disks (added the {state="down"} label value)
  • node_md_sync_time_remaining (seconds)
  • node_md_blocks_synced_speed
  • node_md_blocks_synced_pct

Notes:

  • One drive was not being shown because it was in state="down" (recovering), a state that was not reported in the output.
  • Using node_md_blocks_synced / node_md_blocks as the progress percentage produced wrong results on my system, because the array's total block count differs from the number of blocks that actually need to be synced. This may be due to the RAID level in use (RAID 5).
md0 : active raid5 sdf1[4] sde1[1] sdc1[2] sdb1[0]
      14650718208 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [===================>.]  recovery = 99.9% (4882207424/4883572736) finish=7.8min speed=2908K/sec
      bitmap: 2/37 pages [8KB], 65536KB chunk
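
To put numbers on that (a sketch only; it assumes node_md_blocks reports the array size and node_md_blocks_synced the recovery position, both in mdstat's block units):

# naive ratio against the array's total block count:
#   4882207424 / 14650718208 ≈ 0.333  (reads as ~33%, clearly wrong)
node_md_blocks_synced / node_md_blocks

# ratio against the blocks that actually need to be synced (4883572736):
#   4882207424 / 4883572736 ≈ 0.9997  (matches the 99.9% shown by mdstat)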

Finomosec and others added 7 commits May 1, 2024 11:51
changed sync_minutes_remaining to sync_time_remaining (in seconds)

Signed-off-by: Frederic <[email protected]>
fixed code formatting

Signed-off-by: Frederic <[email protected]>
@Finomosec
Author

@SuperQ
I'm done for now.
Feel free to merge it at any time.

Comment on lines +114 to +119
blockSyncedSpeedDesc = prometheus.NewDesc(
    prometheus.BuildFQName(namespace, "md", "blocks_synced_speed"),
    "current sync speed (in Kilobytes/sec)",
    []string{"device"},
    nil,
)
Member

This doesn't seem necessary, we should be able to compute this from something like rate(node_md_blocks_synced[1m]) * <blocksize>.

Suggested change
blockSyncedSpeedDesc = prometheus.NewDesc(
    prometheus.BuildFQName(namespace, "md", "blocks_synced_speed"),
    "current sync speed (in Kilobytes/sec)",
    []string{"device"},
    nil,
)
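
For illustration, a rough PromQL sketch of that idea (assuming mdstat counts 1 KiB blocks, so the result would be in bytes per second):

# approximate resync speed in bytes/sec, derived from the existing blocks-synced metric;
# the 1024 factor is an assumption about mdstat's block unit
rate(node_md_blocks_synced[1m]) * 1024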

Author

@Finomosec Finomosec May 9, 2024


I think it is useful. It is the CURRENT speed, as shown in /proc/mdstat.
I have it showing on my Grafana board.
Plus, I guess <blocksize> is not included in the data, so it would require additional configuration for each md device.

Contributor

Note that the groundwork has already been laid for #1085, and we probably should not add any new parsing functionality relating to /proc/mdstat.

@SuperQ
Member

SuperQ commented May 7, 2024

Maybe instead of exposing the sync percent, we should expose the "TODO" blocks value. This way the completion ratio can be correctly calculated as node_md_blocks_synced / node_md_blocks_synced_todo.
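
If such a metric were added, related quantities could also be derived in PromQL rather than parsed from mdstat (a sketch, using the hypothetical node_md_blocks_synced_todo gauge):

# completion ratio
node_md_blocks_synced / node_md_blocks_synced_todo

# rough estimate of the remaining sync time in seconds
(node_md_blocks_synced_todo - node_md_blocks_synced) / rate(node_md_blocks_synced[5m])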

Frederic and others added 2 commits May 9, 2024 11:23
renamed pct to percent

Co-authored-by: Ben Kochie <[email protected]>
Signed-off-by: Frederic <[email protected]>
added unit "seconds"

Co-authored-by: Ben Kochie <[email protected]>
Signed-off-by: Frederic <[email protected]>
@Finomosec
Author

Finomosec commented May 9, 2024

Maybe instead of exposing the sync percent, we should expose the "TODO" blocks value. This way the completion ratio can be correctly calculated as node_md_blocks_synced / node_md_blocks_synced_todo.

That was my first idea, too, but the data-source (https://github.com/prometheus/procfs/blob/master/mdstat.go) does not (yet) capture/expose this value.

Also, node_md_blocks_synced_todo is not a good name. todo sounds like remaining, which is not correct.
Maybe to_be_synced would suffice.

But hey! We could calculate it using blocks_synced and the percentage.
What do you think, should we do this?

... but it might be imprecise, especially for low percentage values, plus it might yield slightly different results over time, which would be kind of awkward.

So maybe better not after all.
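
(For reference, the back-calculation would look roughly like this; the metric name and the numbers are only illustrative:)

# estimate the total-to-be-synced blocks from the synced count and the reported percentage
#   e.g. 4882207424 / 0.999 ≈ 4887094518, vs. the real 4883572736
node_md_blocks_synced / (node_md_blocks_synced_percent / 100)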

I added a request to add it:
prometheus/procfs#636

@Finomosec Finomosec requested a review from SuperQ May 9, 2024 09:39
@discordianfish
Member

Yeah I agree, let's add the TODO blocks to procfs

@SuperQ
Member

SuperQ commented May 14, 2024

Released updated procfs: https://github.com/prometheus/procfs/releases/tag/v0.15.0

@JoaoPPCastelo

Hi @Finomosec

Found your dashboard and was interested in adding it to my Grafana, but saw that this PR is still open. Any plans to finish the remaining work and get the changes into the node exporter?

Thanks

@Finomosec
Author

Finomosec commented Apr 30, 2025

Hello @JoaoPPCastelo, thanks for your interest.

It seems I missed the updated procfs :-)
I will see what I can do. Maybe this week.

Greetings

P.S. I have been using my own branched version in the meantime.

@sklyfr

sklyfr commented Jul 10, 2025

Hi, I see you have made (IMO) the best Grafana chart for RAID.
Asking for this to be merged, due to the upkeep associated with maintaining a forked node_exporter outside Helm.

@discordianfish
Member

@Finomosec Sorry for dropping the ball on this. Can you update this branch?

@discordianfish
Member

See failing checks

@Finomosec
Author

I fixed the problems in prometheus/procfs#737; please merge that first so this PR can work.
