Added determining device type and use it at scrape data #205
I might be missing something -- how does this cause |
through the -d flag I think, but it's not clear to me how the device type is determined. |
I'm thinking that the PR description isn't quite what it does. |
No need to set any devices. |
There is no standalone Personally I despise HBA RAID but I have thousands of them that aren't going away anytime soon. Dell's BOSS-N1, notably, lacks any way to pass through plain drives, it only exposes VDs to the system. |
Right now? I have this situation:
where: |
But I don't see code that adds "-d megaraid" |
It's |
LGTM but I suspect that someone with more privs will need to approve/merge. You'll also need to sign-off the commits, see the Details link next to DCO above. |
/opt/index/sbin/smartctl --scan
./smartctl_exporter --help
./smartctl_exporter --smartctl.path=/opt/index/sbin/smartctl &
curl http://localhost:9633/metrics
|
|
@zxzharmlesszxz Before this can be merged, you will need to sign all commits. Visit the checks tab and follow the instructions. To help with testing, can you provide the output of
We are all volunteers. Please be patient with the review and merge process. |
a9d0470 to bec4d10
@zxzharmlesszxz something that @NiceGuyIT didn't say, but would make the process MUCH easier is rebasing this onto the upstream master, to one or more logically separate commits. #168 is an example where I fixed the SCSI/SAS metrics & labels. I'm going to include more comments inline to your most recent changes. |
Hi @anthonyeleven, I added spoilers to your comment to reduce the scrolling. |
Thank you @robbat2 for mentoring @zxzharmlesszxz with this PR. @zxzharmlesszxz It may seem daunting to submit changes to a project when you're not an experienced developer; stick with it and the result will be worth it. As you saw in PR #107, this is not an easy fix. Other people's comments are to make sure we don't break X when fixing Y. Keep with it. 👍 You'll get there. |
|
PR #107 has a method to specify the type for only one device, but how would this work for multiple devices? |
@anthonyeleven Completely understand the lack of time. This PR is ready to merge unless you find something. |
System with PERC H330 (LSI) HBA and 4x passthrough drives, using OS's smartmontools 7.1
Above system using smartmontools 7.3
System with PERC H740P mini (LSI) HBA, 2x VD drives, NVMe drives
Above system specifying smartmontools 7.3
system with 2x LSI HBAs, various HBA VDs
|
Thanks for the feedback @anthonyeleven! @zxzharmlesszxz Thank you for your contribution! Sorry for the long delay; the checks and balances are complete. |
Thank you for your contribution.
Then, with the new |
Yes, the current code is adapted for Dell hardware. If your hardware is not a Dell, then your SATA HDD will not work anymore. So it's OK because that's what this patch was trying to achieve, isn't it? |
I'm sure I do not understand something, I didn't read every single message in this pull request. I also didn't find any option to get the old behavior. That is, forcing calling smartctl without Background: |
I agree, out of the box it shouldn't break. Also, I got errors even on Dell HW, see above. |
Actually, this was merged, but not released
It doesn't exist. Currently, this is a breaking change. Let me try to explain, in case you haven't read all the messages. Here are 4 devices
But actually for
If some PromQL makes a join by So, we need to think about the design of the exporter first, then release the new code. My suggestions are:
[root@infra1:/]# cat /etc/conf.d/smartctl_exporter
OPTIONS="--smartctl.interval=600s --smartctl.device-exclude=^/dev/bus/[0-9]+$ --web.listen-address=192.168.102.254:9633"
Hope this helped clear things up 🙏 |
Let's be sure that whatever is done works on systems that have a mix of:
* Drives that are not behind an LSI HBA
* Drives that are passed through on an LSI HBA
* Drives that are not passed through, e.g. Ugood or part of a VD on an LSI HBA
* Multiple HBAs
On May 10, 2024, at 13:07, Konstantin Shalygin wrote:
You're saying this change is not intended for non-Dell hardware. It's already merged into master, which (for me) means that it should generally work for everything. But it does not work for everything, as you (and I) are stating.
Actually, this was merged, but not released.
I also didn't find any option to get the old behavior. That is, forcing smartctl to be called without --device, or forcing --device=auto, or something like that.
It doesn't exist. Currently, this is a breaking change.
Let me try to explain, in case you haven't read all the messages. Here are 4 devices:
# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
..
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
But for smartctl_exporter there are now 8 devices in total:
smartctl_device{ata_additional_product_id="DELL(tm)",ata_version="ATA8-ACS (minor revision not indicated)",device="bus_0_megaraid_disk_00",firmware_version="FL2H",form_factor="3.5 inches",interface="sat+megaraid,0",model_family="Toshiba 3.5\" MG03ACAxxx(Y) Enterprise HDD",model_name="TOSHIBA MG03ACA100",protocol="ATA",sata_version="SATA 3.0",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H7KIVAF"} 1
smartctl_device{ata_additional_product_id="DELL(tm)",ata_version="ATA8-ACS (minor revision not indicated)",device="bus_0_megaraid_disk_01",firmware_version="FL2H",form_factor="3.5 inches",interface="sat+megaraid,1",model_family="Toshiba 3.5\" MG03ACAxxx(Y) Enterprise HDD",model_name="TOSHIBA MG03ACA100",protocol="ATA",sata_version="SATA 3.0",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H3K3RTF"} 1
smartctl_device{ata_additional_product_id="DELL(tm)",ata_version="ATA8-ACS (minor revision not indicated)",device="bus_0_megaraid_disk_02",firmware_version="FL2H",form_factor="3.5 inches",interface="sat+megaraid,2",model_family="Toshiba 3.5\" MG03ACAxxx(Y) Enterprise HDD",model_name="TOSHIBA MG03ACA100",protocol="ATA",sata_version="SATA 3.0",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H2K9TRF"} 1
smartctl_device{ata_additional_product_id="DELL(tm)",ata_version="ATA8-ACS (minor revision not indicated)",device="bus_0_megaraid_disk_03",firmware_version="FL2H",form_factor="3.5 inches",interface="sat+megaraid,3",model_family="Toshiba 3.5\" MG03ACAxxx(Y) Enterprise HDD",model_name="TOSHIBA MG03ACA100",protocol="ATA",sata_version="SATA 3.0",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H7KIV0F"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="sda",firmware_version="",form_factor="3.5 inches",interface="scsi",model_family="unknown",model_name="unknown",protocol="SCSI",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H7KIVAF"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="sdb",firmware_version="",form_factor="3.5 inches",interface="scsi",model_family="unknown",model_name="unknown",protocol="SCSI",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H7KIV0F"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="sdc",firmware_version="",form_factor="3.5 inches",interface="scsi",model_family="unknown",model_name="unknown",protocol="SCSI",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H3K3RTF"} 1
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="sdd",firmware_version="",form_factor="3.5 inches",interface="scsi",model_family="unknown",model_name="unknown",protocol="SCSI",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="56H2K9TRF"} 1
If some PromQL query joins on the serial_number label, Prometheus returns a many-to-many matching error.
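To make the duplication concrete, here is a minimal Python sketch that groups `smartctl_device` samples by `serial_number` and reports any serial exposed under more than one `device` label. The helper names (`parse_labels`, `duplicated_serials`) are hypothetical and not part of the exporter; the sample lines are trimmed copies of the metrics quoted above, with most labels dropped for brevity.

```python
import re
from collections import defaultdict

# Matches label pairs such as device="sda" inside one exposition line.
# Values may contain escaped quotes (for example 3.5\" in model_family).
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_labels(line):
    """Return the label set of one smartctl_device sample as a dict."""
    return dict(LABEL_RE.findall(line))


def duplicated_serials(lines):
    """Map each serial_number seen on more than one device to those devices."""
    by_serial = defaultdict(list)
    for line in lines:
        labels = parse_labels(line)
        if "serial_number" in labels and "device" in labels:
            by_serial[labels["serial_number"]].append(labels["device"])
    return {s: sorted(d) for s, d in by_serial.items() if len(d) > 1}


# Trimmed samples taken from the thread; only three labels are kept.
SAMPLES = [
    'smartctl_device{device="bus_0_megaraid_disk_00",interface="sat+megaraid,0",serial_number="56H7KIVAF"} 1',
    'smartctl_device{device="sda",interface="scsi",serial_number="56H7KIVAF"} 1',
    'smartctl_device{device="bus_0_megaraid_disk_01",interface="sat+megaraid,1",serial_number="56H3K3RTF"} 1',
    'smartctl_device{device="sdc",interface="scsi",serial_number="56H3K3RTF"} 1',
]

if __name__ == "__main__":
    for serial, devices in duplicated_serials(SAMPLES).items():
        print(serial, devices)
```

Any serial that shows up in this report would appear on both sides of a join on `serial_number` more than once, which is the situation described above.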
So, we need to think about the design of the exporter first, then release the new code.
My suggestions are:
* Add a --megaraid option. With this option the new megaraid logic is enabled. The non-megaraid devices should be filtered by the administrator via a regex filter, for example --smartctl.device-exclude="^/dev/sd[a-z]+"
* Add a --megaraid-only option. This tells the exporter to use only megaraid devices. This will fix the doubled devices.
* Don't add new options, but apply the new device_type logic only to megaraid devices, so the device_type option is omitted from the smartmontools call by default, which is the current exporter behavior. In this case, the administrator should filter megaraid devices by hand via a regex filter. Actually, we use this filter for all exporter deployments, to avoid non-disk devices in the device counter:
[root@infra1:/]# cat /etc/conf.d/smartctl_exporter
OPTIONS="--smartctl.interval=600s --smartctl.device-exclude=^/dev/bus/[0-9]+$ --web.listen-address=192.168.102.254:9633"
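As an illustration of the suggestions above, the following Python sketch parses `smartctl --scan` text, applies an exclude regex to the device path, and passes `--device` to smartctl only for megaraid entries. It is a sketch under stated assumptions, not the exporter's implementation: `parse_scan`, `select_devices`, and `smartctl_args` are hypothetical names, and whether the exporter applies its exclude regex as a search or a full match is assumed here. The smartctl options used (`--json`, `--info`, `--health`, `--attributes`, `--device`) are real smartctl flags.

```python
import re
import shlex

# Shortened scan output in the same shape as the one quoted in the thread.
SCAN_OUTPUT = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
"""


def parse_scan(text):
    """Yield (device_path, device_type) pairs from `smartctl --scan` text."""
    for line in text.splitlines():
        # Everything after '#' is a human-readable comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) == 3 and fields[1] == "-d":
            yield fields[0], fields[2]


def select_devices(text, exclude=None):
    """Drop devices whose path matches the exclude regex (assumed: regex search)."""
    pattern = re.compile(exclude) if exclude else None
    return [
        (path, dev_type)
        for path, dev_type in parse_scan(text)
        if not (pattern and pattern.search(path))
    ]


def smartctl_args(path, dev_type):
    """Pass --device only for megaraid entries; otherwise let smartctl auto-detect."""
    args = ["smartctl", "--json", "--info", "--health", "--attributes"]
    if dev_type.startswith("megaraid,"):
        args.append("--device=" + dev_type)
    return args + [path]


if __name__ == "__main__":
    # Keep only the megaraid entries, as in the first suggestion's example filter.
    for path, dev_type in select_devices(SCAN_OUTPUT, r"^/dev/sd[a-z]+"):
        print(shlex.join(smartctl_args(path, dev_type)))
```

With the `^/dev/bus/[0-9]+$` pattern from the configuration above, only the plain `/dev/sdX` entries remain and no `--device` argument is added, which corresponds to the pre-change behavior described in the third suggestion.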
Another variant...
Hope this helped clear things up 🙏
|
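The exporter logs posted in this thread report smartctl exit statuses such as 64 and 4. smartctl's exit status is a bitmask, so a small decoder helps when comparing runs. The sketch below is illustrative only: `decode_exit_status` is a hypothetical helper, the texts for bits 2 and 6 are the warning messages that appear in the logs, and the remaining descriptions are paraphrased from the smartctl man page.

```python
# Bit position -> meaning of that bit in smartctl's exit status.
EXIT_BITS = {
    0: "Command line did not parse",
    1: "Device open failed",
    2: "Some SMART or other ATA command to the disk failed, "
       "or there was a checksum error in a SMART data structure",
    3: "SMART status check returned DISK FAILING",
    4: "Prefail attributes at or below threshold",
    5: "Some attributes were at or below threshold in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [msg for bit, msg in EXIT_BITS.items() if status & (1 << bit)]


if __name__ == "__main__":
    for status in (64, 4):
        print(status, decode_exit_status(status))
```

Status 64 sets only bit 6 (errors recorded in the device error log), while status 4 sets bit 2, meaning a SMART or ATA command itself failed.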
How it works with Broadcom HBA 9500-16i:
3b:00.0 Serial Attached SCSI controller: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx
Subsystem: Broadcom / LSI 9500-16i Tri-Mode HBA
smartctl_exporter 0.12.0
ts=2024-05-10T17:36:51.045Z caller=main.go:140 level=info msg="Starting smartctl_exporter" version="(version=0.12.0, branch=master, revision=1.el8)"
ts=2024-05-10T17:36:51.045Z caller=main.go:141 level=info msg="Build context" build_context="(go=go1.21.7 (Red Hat 1.21.7-1.module_el8+960+4060efbe), platform=linux/amd64, user=mockbuild@mockbuild, date=20240304, tags=unknown)"
ts=2024-05-10T17:36:51.045Z caller=main.go:147 level=info msg="No devices specified, trying to load them automatically"
ts=2024-05-10T17:36:51.046Z caller=readjson.go:79 level=debug msg="Scanning for devices"
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sda
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdb
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdc
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdd
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sde
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdf
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdg
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdh
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdi
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdj
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdk
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdl
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdm
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdn
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdo
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdp
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdq
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdr
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sds
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdt
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdu
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdv
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdw
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdx
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/nvme0
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/nvme1
ts=2024-05-10T17:36:51.065Z caller=main.go:149 level=info msg="Number of devices found" count=26
ts=2024-05-10T17:36:51.065Z caller=main.go:158 level=info msg="Start background scan process"
ts=2024-05-10T17:36:51.065Z caller=main.go:159 level=info msg="Rescanning for devices every" rescanInterval=10m0s
ts=2024-05-10T17:36:51.351Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sda duration=286.192113ms
ts=2024-05-10T17:36:51.351Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sda family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:51.635Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdb duration=282.917796ms
ts=2024-05-10T17:36:51.636Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdb family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:51.703Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdc duration=65.960245ms
ts=2024-05-10T17:36:51.703Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdc family=unknown model=WUS721010ALE6L4
ts=2024-05-10T17:36:52.148Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdd duration=444.575557ms
ts=2024-05-10T17:36:52.148Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdd family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:52.184Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sde duration=34.875406ms
ts=2024-05-10T17:36:52.185Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sde family=unknown model=WUS721010ALE6L4
ts=2024-05-10T17:36:52.220Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdf duration=34.298321ms
ts=2024-05-10T17:36:52.220Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdf family=unknown model="WDC WUS721010ALE6L4"
ts=2024-05-10T17:36:52.251Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdg duration=30.437211ms
ts=2024-05-10T17:36:52.251Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdg family="Toshiba MG06ACA... Enterprise Capacity HDD" model="TOSHIBA MG06ACA10TE"
ts=2024-05-10T17:36:52.550Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdh duration=297.92271ms
ts=2024-05-10T17:36:52.550Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdh family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:52.831Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdi duration=279.672304ms
ts=2024-05-10T17:36:52.831Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdi family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.126Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdj duration=292.331348ms
ts=2024-05-10T17:36:53.126Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdj family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.399Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdk duration=272.004944ms
ts=2024-05-10T17:36:53.399Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdk family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.692Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdl duration=291.55157ms
ts=2024-05-10T17:36:53.692Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdl family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.724Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdm
ts=2024-05-10T17:36:53.724Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdm
ts=2024-05-10T17:36:53.724Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdm duration=31.055839ms
ts=2024-05-10T17:36:53.724Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdm family="Toshiba MG06ACA... Enterprise Capacity HDD" model="TOSHIBA MG06ACA10TE"
ts=2024-05-10T17:36:54.012Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdn duration=286.293623ms
ts=2024-05-10T17:36:54.012Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdn family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.325Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdo duration=312.568885ms
ts=2024-05-10T17:36:54.325Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdo family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.624Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdp duration=297.93094ms
ts=2024-05-10T17:36:54.624Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdp family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.906Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdq duration=280.451251ms
ts=2024-05-10T17:36:54.906Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdq family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.186Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdr
ts=2024-05-10T17:36:55.186Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdr
ts=2024-05-10T17:36:55.186Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdr duration=279.37659ms
ts=2024-05-10T17:36:55.186Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdr family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.468Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sds duration=280.076152ms
ts=2024-05-10T17:36:55.468Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sds family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.760Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdt duration=291.624193ms
ts=2024-05-10T17:36:55.760Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdt family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.050Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdu
ts=2024-05-10T17:36:56.050Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdu
ts=2024-05-10T17:36:56.050Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdu duration=287.490146ms
ts=2024-05-10T17:36:56.050Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdu family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.319Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdv duration=267.760124ms
ts=2024-05-10T17:36:56.319Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdv family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.596Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdw duration=276.673288ms
ts=2024-05-10T17:36:56.597Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdw family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.632Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdx duration=34.654125ms
ts=2024-05-10T17:36:56.632Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdx family=unknown model="WDC WUS721010ALE6L4"
ts=2024-05-10T17:36:56.653Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/nvme0 duration=19.965395ms
ts=2024-05-10T17:36:56.653Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=nvme0 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:36:56.674Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/nvme1 duration=20.415503ms
ts=2024-05-10T17:36:56.674Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=nvme1 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:36:56.674Z caller=tls_config.go:313 level=info msg="Listening on" address=192.168.100.26:9630
ts=2024-05-10T17:36:56.674Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=192.168.100.26:9630
And the same server with the current master:
smartctl_exporter master
ts=2024-05-10T17:29:14.439Z caller=main.go:167 level=info msg="Starting smartctl_exporter" version="(version=, branch=, revision=2cc2249821d6417fcfff8ef8d302205d7b37b44c)"
ts=2024-05-10T17:29:14.439Z caller=main.go:168 level=info msg="Build context" build_context="(go=go1.22.2, platform=linux/amd64, user=, date=, tags=unknown)"
ts=2024-05-10T17:29:14.439Z caller=readjson.go:79 level=debug msg="Scanning for devices"
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sda
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdb
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdc
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdd
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sde
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdf
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdg
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdh
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdi
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdj
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdk
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdl
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdm
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdn
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdo
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdp
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdq
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdr
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sds
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdt
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdu
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdv
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdw
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=sdx
ts=2024-05-10T17:29:14.459Z caller=main.go:128 level=info msg="Found device" name=nvme0
ts=2024-05-10T17:29:14.460Z caller=main.go:128 level=info msg="Found device" name=nvme1
ts=2024-05-10T17:29:14.460Z caller=main.go:172 level=info msg="Number of devices found" count=26
ts=2024-05-10T17:29:14.460Z caller=main.go:185 level=info msg="Start background scan process"
ts=2024-05-10T17:29:14.460Z caller=main.go:186 level=info msg="Rescanning for devices every" rescanInterval=10m0s
ts=2024-05-10T17:29:16.305Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sda
ts=2024-05-10T17:29:16.305Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sda
ts=2024-05-10T17:29:16.305Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sda duration=1.845341351s
ts=2024-05-10T17:29:16.305Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sda family=unknown model=unknown
ts=2024-05-10T17:29:18.691Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdb
ts=2024-05-10T17:29:18.691Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdb
ts=2024-05-10T17:29:18.691Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdb duration=2.385952698s
ts=2024-05-10T17:29:18.691Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdb family=unknown model=unknown
ts=2024-05-10T17:29:18.905Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdc
ts=2024-05-10T17:29:18.905Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdc
ts=2024-05-10T17:29:18.905Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdc duration=213.069916ms
ts=2024-05-10T17:29:18.905Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdc family=unknown model=unknown
ts=2024-05-10T17:29:20.765Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdd
ts=2024-05-10T17:29:20.765Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdd
ts=2024-05-10T17:29:20.765Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdd duration=1.859711938s
ts=2024-05-10T17:29:20.765Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdd family=unknown model=unknown
ts=2024-05-10T17:29:21.001Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sde
ts=2024-05-10T17:29:21.001Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sde
ts=2024-05-10T17:29:21.001Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sde duration=235.850279ms
ts=2024-05-10T17:29:21.001Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sde family=unknown model=unknown
ts=2024-05-10T17:29:21.214Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdf
ts=2024-05-10T17:29:21.214Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdf
ts=2024-05-10T17:29:21.214Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdf duration=212.474777ms
ts=2024-05-10T17:29:21.214Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdf family=unknown model=unknown
ts=2024-05-10T17:29:21.264Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdg
ts=2024-05-10T17:29:21.264Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdg
ts=2024-05-10T17:29:21.264Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdg duration=50.259454ms
ts=2024-05-10T17:29:21.264Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdg family=unknown model=unknown
ts=2024-05-10T17:29:23.189Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdh
ts=2024-05-10T17:29:23.190Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdh
ts=2024-05-10T17:29:23.190Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdh duration=1.924845225s
ts=2024-05-10T17:29:23.190Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdh family=unknown model=unknown
ts=2024-05-10T17:29:25.497Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdi
ts=2024-05-10T17:29:25.497Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdi
ts=2024-05-10T17:29:25.497Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdi duration=2.306888866s
ts=2024-05-10T17:29:25.497Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdi family=unknown model=unknown
ts=2024-05-10T17:29:27.620Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdj
ts=2024-05-10T17:29:27.620Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdj
ts=2024-05-10T17:29:27.620Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdj duration=2.12262608s
ts=2024-05-10T17:29:27.620Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdj family=unknown model=unknown
ts=2024-05-10T17:29:29.708Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdk
ts=2024-05-10T17:29:29.708Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdk
ts=2024-05-10T17:29:29.708Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdk duration=2.087567042s
ts=2024-05-10T17:29:29.708Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdk family=unknown model=unknown
ts=2024-05-10T17:29:31.733Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdl
ts=2024-05-10T17:29:31.733Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdl
ts=2024-05-10T17:29:31.733Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdl duration=2.024666633s
ts=2024-05-10T17:29:31.733Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdl family=unknown model=unknown
ts=2024-05-10T17:29:31.782Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdm
ts=2024-05-10T17:29:31.782Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdm
ts=2024-05-10T17:29:31.782Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdm duration=48.867196ms
ts=2024-05-10T17:29:31.782Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdm family=unknown model=unknown
ts=2024-05-10T17:29:33.753Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdn
ts=2024-05-10T17:29:33.753Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdn
ts=2024-05-10T17:29:33.753Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdn duration=1.971290324s
ts=2024-05-10T17:29:33.754Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdn family=unknown model=unknown
ts=2024-05-10T17:29:35.801Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdo
ts=2024-05-10T17:29:35.801Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdo
ts=2024-05-10T17:29:35.801Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdo duration=2.047568673s
ts=2024-05-10T17:29:35.802Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdo family=unknown model=unknown
ts=2024-05-10T17:29:38.040Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdp
ts=2024-05-10T17:29:38.040Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdp
ts=2024-05-10T17:29:38.040Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdp duration=2.238011978s
ts=2024-05-10T17:29:38.040Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdp family=unknown model=unknown
ts=2024-05-10T17:29:39.906Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdq
ts=2024-05-10T17:29:39.906Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdq
ts=2024-05-10T17:29:39.906Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdq duration=1.865314904s
ts=2024-05-10T17:29:39.906Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdq family=unknown model=unknown
ts=2024-05-10T17:29:41.750Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdr
ts=2024-05-10T17:29:41.750Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdr
ts=2024-05-10T17:29:41.750Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdr duration=1.843800894s
ts=2024-05-10T17:29:41.750Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdr family=unknown model=unknown
ts=2024-05-10T17:29:43.605Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sds
ts=2024-05-10T17:29:43.605Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sds
ts=2024-05-10T17:29:43.605Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sds duration=1.854622712s
ts=2024-05-10T17:29:43.605Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sds family=unknown model=unknown
ts=2024-05-10T17:29:45.986Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdt
ts=2024-05-10T17:29:45.987Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdt
ts=2024-05-10T17:29:45.987Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdt duration=2.381171388s
ts=2024-05-10T17:29:45.987Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdt family=unknown model=unknown
ts=2024-05-10T17:29:48.147Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdu
ts=2024-05-10T17:29:48.147Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdu
ts=2024-05-10T17:29:48.147Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdu duration=2.16028676s
ts=2024-05-10T17:29:48.147Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdu family=unknown model=unknown
ts=2024-05-10T17:29:49.998Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdv
ts=2024-05-10T17:29:49.998Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdv
ts=2024-05-10T17:29:49.998Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdv duration=1.850612186s
ts=2024-05-10T17:29:49.998Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdv family=unknown model=unknown
ts=2024-05-10T17:29:51.899Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdw
ts=2024-05-10T17:29:51.899Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdw
ts=2024-05-10T17:29:51.899Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdw duration=1.900320681s
ts=2024-05-10T17:29:51.899Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdw family=unknown model=unknown
ts=2024-05-10T17:29:52.276Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 4" device=sdx
ts=2024-05-10T17:29:52.276Z caller=readjson.go:128 level=warn msg="Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure" device=sdx
ts=2024-05-10T17:29:52.276Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=sdx duration=377.361725ms
ts=2024-05-10T17:29:52.276Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=sdx family=unknown model=unknown
ts=2024-05-10T17:29:52.296Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=nvme0 duration=19.676399ms
ts=2024-05-10T17:29:52.296Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=nvme0 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:29:52.317Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=nvme1 duration=20.107618ms
ts=2024-05-10T17:29:52.317Z caller=smartctl.go:100 level=debug msg="Collecting metrics from" device=nvme1 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:29:52.317Z caller=tls_config.go:313 level=info msg="Listening on" address=192.168.100.26:9630
ts=2024-05-10T17:29:52.317Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=192.168.100.26:9630
[root@host]# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device
/dev/sdf -d scsi # /dev/sdf, SCSI device
/dev/sdg -d scsi # /dev/sdg, SCSI device
/dev/sdh -d scsi # /dev/sdh, SCSI device
/dev/sdi -d scsi # /dev/sdi, SCSI device
/dev/sdj -d scsi # /dev/sdj, SCSI device
/dev/sdk -d scsi # /dev/sdk, SCSI device
/dev/sdl -d scsi # /dev/sdl, SCSI device
/dev/sdm -d scsi # /dev/sdm, SCSI device
/dev/sdn -d scsi # /dev/sdn, SCSI device
/dev/sdo -d scsi # /dev/sdo, SCSI device
/dev/sdp -d scsi # /dev/sdp, SCSI device
/dev/sdq -d scsi # /dev/sdq, SCSI device
/dev/sdr -d scsi # /dev/sdr, SCSI device
/dev/sds -d scsi # /dev/sds, SCSI device
/dev/sdt -d scsi # /dev/sdt, SCSI device
/dev/sdu -d scsi # /dev/sdu, SCSI device
/dev/sdv -d scsi # /dev/sdv, SCSI device
/dev/sdw -d scsi # /dev/sdw, SCSI device
/dev/sdx -d scsi # /dev/sdx, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
/dev/nvme1 -d nvme # /dev/nvme1, NVMe device |
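For readers following how a device type could be derived from the scan output above, here is a minimal sketch in Python. It assumes the plain-text `smartctl --scan` format pasted in this thread (`<device> -d <type> # <comment>`). The function names are hypothetical and this is not the code from the PR, which may well consume the JSON scan output instead.

```python
def parse_scan_line(line):
    """Split one `smartctl --scan` line into (device, device_type).

    Returns None for blank lines. device_type is None when no -d flag is present.
    """
    # Drop the trailing "# ..." comment, then split the remaining fields.
    fields = line.split("#", 1)[0].split()
    if not fields:
        return None
    device = fields[0]
    device_type = None
    if "-d" in fields:
        idx = fields.index("-d")
        if idx + 1 < len(fields):
            device_type = fields[idx + 1]
    return device, device_type


def parse_scan(output):
    """Parse multi-line scan output into a list of (device, device_type) tuples."""
    parsed = (parse_scan_line(line) for line in output.splitlines())
    return [item for item in parsed if item is not None]


# Two lines copied from the scan output in this thread.
SAMPLE = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

if __name__ == "__main__":
    for device, device_type in parse_scan(SAMPLE):
        print(device, device_type)
```

On the host above, every `sd*` drive would come back as type `scsi`, which is the value that would then be handed to `smartctl -d` at scrape time.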
For those who might not know, Broadcom == Avago == LSI == Dell PERC
… On May 10, 2024, at 13:42, Konstantin Shalygin ***@***.***> wrote:
Drives that are passed through on an LSI HBA
How it works with Broadcom HBA 9500-16i
3b:00.0 Serial Attached SCSI controller: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx
Subsystem: Broadcom / LSI 9500-16i Tri-Mode HBA
smartctl_exporter 0.12.0
ts=2024-05-10T17:36:51.045Z caller=main.go:140 level=info msg="Starting smartctl_exporter" version="(version=0.12.0, branch=master, revision=1.el8)"
ts=2024-05-10T17:36:51.045Z caller=main.go:141 level=info msg="Build context" build_context="(go=go1.21.7 (Red Hat 1.21.7-1.module_el8+960+4060efbe), platform=linux/amd64, ***@***.***, date=20240304, tags=unknown)"
ts=2024-05-10T17:36:51.045Z caller=main.go:147 level=info msg="No devices specified, trying to load them automatically"
ts=2024-05-10T17:36:51.046Z caller=readjson.go:79 level=debug msg="Scanning for devices"
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sda
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdb
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdc
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdd
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sde
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdf
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdg
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdh
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdi
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdj
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdk
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdl
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdm
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdn
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdo
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdp
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdq
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdr
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sds
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdt
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdu
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdv
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdw
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/sdx
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/nvme0
ts=2024-05-10T17:36:51.065Z caller=main.go:120 level=info msg="Found device" name=/dev/nvme1
ts=2024-05-10T17:36:51.065Z caller=main.go:149 level=info msg="Number of devices found" count=26
ts=2024-05-10T17:36:51.065Z caller=main.go:158 level=info msg="Start background scan process"
ts=2024-05-10T17:36:51.065Z caller=main.go:159 level=info msg="Rescanning for devices every" rescanInterval=10m0s
ts=2024-05-10T17:36:51.351Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sda duration=286.192113ms
ts=2024-05-10T17:36:51.351Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sda family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:51.635Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdb duration=282.917796ms
ts=2024-05-10T17:36:51.636Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdb family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:51.703Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdc duration=65.960245ms
ts=2024-05-10T17:36:51.703Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdc family=unknown model=WUS721010ALE6L4
ts=2024-05-10T17:36:52.148Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdd duration=444.575557ms
ts=2024-05-10T17:36:52.148Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdd family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:52.184Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sde duration=34.875406ms
ts=2024-05-10T17:36:52.185Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sde family=unknown model=WUS721010ALE6L4
ts=2024-05-10T17:36:52.220Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdf duration=34.298321ms
ts=2024-05-10T17:36:52.220Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdf family=unknown model="WDC WUS721010ALE6L4"
ts=2024-05-10T17:36:52.251Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdg duration=30.437211ms
ts=2024-05-10T17:36:52.251Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdg family="Toshiba MG06ACA... Enterprise Capacity HDD" model="TOSHIBA MG06ACA10TE"
ts=2024-05-10T17:36:52.550Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdh duration=297.92271ms
ts=2024-05-10T17:36:52.550Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdh family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:52.831Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdi duration=279.672304ms
ts=2024-05-10T17:36:52.831Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdi family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.126Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdj duration=292.331348ms
ts=2024-05-10T17:36:53.126Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdj family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.399Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdk duration=272.004944ms
ts=2024-05-10T17:36:53.399Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdk family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.692Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdl duration=291.55157ms
ts=2024-05-10T17:36:53.692Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdl family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:53.724Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdm
ts=2024-05-10T17:36:53.724Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdm
ts=2024-05-10T17:36:53.724Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdm duration=31.055839ms
ts=2024-05-10T17:36:53.724Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdm family="Toshiba MG06ACA... Enterprise Capacity HDD" model="TOSHIBA MG06ACA10TE"
ts=2024-05-10T17:36:54.012Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdn duration=286.293623ms
ts=2024-05-10T17:36:54.012Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdn family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.325Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdo duration=312.568885ms
ts=2024-05-10T17:36:54.325Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdo family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.624Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdp duration=297.93094ms
ts=2024-05-10T17:36:54.624Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdp family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:54.906Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdq duration=280.451251ms
ts=2024-05-10T17:36:54.906Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdq family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.186Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdr
ts=2024-05-10T17:36:55.186Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdr
ts=2024-05-10T17:36:55.186Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdr duration=279.37659ms
ts=2024-05-10T17:36:55.186Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdr family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.468Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sds duration=280.076152ms
ts=2024-05-10T17:36:55.468Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sds family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:55.760Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdt duration=291.624193ms
ts=2024-05-10T17:36:55.760Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdt family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.050Z caller=readjson.go:69 level=warn msg="S.M.A.R.T. output reading" err="exit status 64" device=/dev/sdu
ts=2024-05-10T17:36:56.050Z caller=readjson.go:140 level=warn msg="The device error log contains records of errors" device=/dev/sdu
ts=2024-05-10T17:36:56.050Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdu duration=287.490146ms
ts=2024-05-10T17:36:56.050Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdu family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.319Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdv duration=267.760124ms
ts=2024-05-10T17:36:56.319Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdv family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.596Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdw duration=276.673288ms
ts=2024-05-10T17:36:56.597Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdw family="Toshiba X300" model="TOSHIBA HDWE160"
ts=2024-05-10T17:36:56.632Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/sdx duration=34.654125ms
ts=2024-05-10T17:36:56.632Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=sdx family=unknown model="WDC WUS721010ALE6L4"
ts=2024-05-10T17:36:56.653Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/nvme0 duration=19.965395ms
ts=2024-05-10T17:36:56.653Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=nvme0 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:36:56.674Z caller=readjson.go:74 level=debug msg="Collected S.M.A.R.T. json data" device=/dev/nvme1 duration=20.415503ms
ts=2024-05-10T17:36:56.674Z caller=smartctl.go:75 level=debug msg="Collecting metrics from" device=nvme1 family=unknown model=Micron_7300_MTFDHBE3T2TDG
ts=2024-05-10T17:36:56.674Z caller=tls_config.go:313 level=info msg="Listening on" address=192.168.100.26:9630
ts=2024-05-10T17:36:56.674Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=192.168.100.26:9630
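The `exit status 4` and `exit status 64` warnings in these logs are bits of smartctl's exit-status bitmask. A rough Python sketch of decoding it follows. The bit descriptions are paraphrased from the smartctl man page as I recall it (bits 2 and 6 match the warning text logged above), and `decode_exit_status` is a hypothetical helper, not the exporter's implementation.

```python
# Bit position -> meaning, paraphrased from the smartctl(8) "EXIT STATUS" section.
EXIT_STATUS_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, or device did not return an IDENTIFY DEVICE structure",
    2: "Some SMART or other ATA command to the disk failed, "
       "or there was a checksum error in a SMART data structure",
    3: "SMART status check returned DISK FAILING",
    4: "Prefail attributes found at or below threshold",
    5: "Some attributes have been at or below threshold in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [
        message
        for bit, message in sorted(EXIT_STATUS_BITS.items())
        if status & (1 << bit)
    ]


if __name__ == "__main__":
    for status in (4, 64):
        print(status, decode_exit_status(status))
```

Read this way, status 4 on every `sd*` device points at failed commands (consistent with `family=unknown model=unknown` in the first log), whereas status 64 only reports that the drive's own error log has entries and the scrape itself succeeded.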
And the same server with the current master
smartctl_exporter master
***@***.***# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/sde -d scsi # /dev/sde, SCSI device
/dev/sdf -d scsi # /dev/sdf, SCSI device
/dev/sdg -d scsi # /dev/sdg, SCSI device
/dev/sdh -d scsi # /dev/sdh, SCSI device
/dev/sdi -d scsi # /dev/sdi, SCSI device
/dev/sdj -d scsi # /dev/sdj, SCSI device
/dev/sdk -d scsi # /dev/sdk, SCSI device
/dev/sdl -d scsi # /dev/sdl, SCSI device
/dev/sdm -d scsi # /dev/sdm, SCSI device
/dev/sdn -d scsi # /dev/sdn, SCSI device
/dev/sdo -d scsi # /dev/sdo, SCSI device
/dev/sdp -d scsi # /dev/sdp, SCSI device
/dev/sdq -d scsi # /dev/sdq, SCSI device
/dev/sdr -d scsi # /dev/sdr, SCSI device
/dev/sds -d scsi # /dev/sds, SCSI device
/dev/sdt -d scsi # /dev/sdt, SCSI device
/dev/sdu -d scsi # /dev/sdu, SCSI device
/dev/sdv -d scsi # /dev/sdv, SCSI device
/dev/sdw -d scsi # /dev/sdw, SCSI device
/dev/sdx -d scsi # /dev/sdx, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
/dev/nvme1 -d nvme # /dev/nvme1, NVMe device
|
This change is not going to be released until it's fixed (preferred) or reverted. @k0ste or @robbat2, would one of you mind being a maintainer to help test changes like this? I don't have a system to test advanced drive setups and #227 needs implementing before I can use the |
I can test live combos of VD, passthrough, SAS, SATA, NVMe drives. Only thing I don't have -- thankfully -- is NVMe VDs.
… On May 11, 2024, at 17:55, David Randall ***@***.***> wrote:
This change is not going to be released until it's fixed (preferred) or reverted.
@k0ste <https://github.com/k0ste> or @robbat2 <https://github.com/robbat2>, would one of you mind being a maintainer to help test changes like this? I don't have a system to test advanced drive setups and #227 <#227> needs implementing before I can use the smartctl --json output others have provided. I would hate for this project to suffer due to my lack of hardware.
|
Could I suggest sticking with the 3rd solution? Because the whole design rationale of this PR seems to be that it does not require any new flags and automatically detects everything. The first and second options would break this. The third variant would only behave differently (compared to the old version) if smartctl does report a megaraid type device.
Indeed, thank you. |
My suggestions are:
1. Add a --megaraid option. With this option, the new megaraid logic is enabled. The administrator should filter out the non-megaraid devices via a regex filter, for example --smartctl.device-exclude="^/dev/sd[a-z]+"
2. Add a --megaraid-only option. This tells the exporter to use only megaraid devices, which fixes the doubled devices.
What about systems with both, though? I have e.g. systems with multiple NVMe data drives and an LSI RAID HBA that mirrors two boot drives. Not a cost-effective config for sure, but I’m stuck with them for years yet.
3. Don't add new options, but apply the new device_type logic only to megaraid devices, so that by default the device_type option is not passed to smartmontools, which is the current exporter behavior. In this case, the administrator should filter megaraid devices by hand via a regex filter. We actually use this filter for all exporter deployments, to keep non-disk devices out of the device counter.
***@***.***:/]# cat /etc/conf.d/smartctl_exporter
OPTIONS="--smartctl.interval=600s --smartctl.device-exclude=^/dev/bus/[0-9]+$ --web.listen-address=192.168.102.254:9633"
Would that handle systems with more than one HBA?
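To make the two exclude patterns quoted in this exchange concrete, here is a small Python sketch of the filtering they imply. The device list is illustrative (`/dev/bus/0` is an assumed example of a megaraid-style path), `filter_devices` is a hypothetical helper, and unanchored search semantics are an assumption. The exporter itself uses Go's regexp package, whose behaviour for these simple patterns should be the same.

```python
import re


def filter_devices(devices, exclude_pattern):
    """Return the devices that do NOT match the exclude pattern."""
    if not exclude_pattern:
        return list(devices)
    pattern = re.compile(exclude_pattern)
    return [device for device in devices if not pattern.search(device)]


# Illustrative device names; /dev/bus/0 stands in for a megaraid-style entry.
DEVICES = ["/dev/sda", "/dev/sdaa", "/dev/nvme0", "/dev/bus/0"]

if __name__ == "__main__":
    # Pattern from the conf.d example: drop the /dev/bus/N entries.
    print(filter_devices(DEVICES, r"^/dev/bus/[0-9]+$"))
    # Pattern from the first suggestion: drop the plain /dev/sd* block devices.
    print(filter_devices(DEVICES, r"^/dev/sd[a-z]+"))
```

Neither pattern depends on how many HBAs are present. They only look at the device path, so a second controller's entries would be filtered the same way as long as their paths follow the same naming.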
Another variant...
Could I suggest sticking with the 3rd solution? Because the whole design rationale of this PR seems to be that it does not require any new flags and automatically detects everything.
That would be ideal of course, but “do the desired thing in all kinds of scenarios” often is by far the hardest to implement.
So if we land on a solution that requires that I script up something to statically populate a list of devices to poll, then that isn’t the worst outcome.
The first and second options would break this. The third variant would only behave differently (compared to the old version) if smartctl does report a megaraid type device.
* Drives part of an LSI VD
* Drives on an LSI RAID HBA but passed through
* DHS, GHS, Ugood, Ubad, F (storcli.py)
* Drives hidden behind an LSI RAID HBA but not passed through or part of a VD
* Drives on a non-RAID HBA, e.g. plugged into a chipset SATA port
* NVMe drives (no HBA, thankfully)
* Dell BOSS-S1/S2/N1 drives (mvcli)
* Handful of systems with more than one HBA.
I have various combinations of the above to deal with. Solving for only one per system doesn’t do what I need. ymmv.
… Hope this helped clear things up 🙏
Indeed, thank you.
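As an illustration of the third variant discussed here (no new flags, with the device type passed to smartctl only when the scan reports a megaraid device), a minimal Python sketch might look like the following. `build_smartctl_args` is a hypothetical name and the `--json --all` flag set is a simplification. The exporter's real argument list is not shown in this thread.

```python
def build_smartctl_args(device, device_type=None):
    """Build a smartctl argument list, adding -d only for megaraid device types."""
    args = ["--json", "--all"]
    # Variant 3: keep the old behaviour (no -d) unless the scan said "megaraid,N".
    if device_type and device_type.startswith("megaraid"):
        args += ["-d", device_type]
    args.append(device)
    return args


if __name__ == "__main__":
    print(build_smartctl_args("/dev/sda", "scsi"))
    print(build_smartctl_args("/dev/bus/0", "megaraid,3"))
```

With this shape, drives reported as `scsi`, `sat`, or `nvme` are queried exactly as before, so only hosts where smartctl actually reports megaraid entries see a behaviour change.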
|
Of course
This is not difficult to implement; this is the 4th variant: when the scan finds the |
Please support the cciss (Compaq Smart Array Controller) device type; we can get devices by |
I think this patch should be reverted. It completely disengages the |
OK, seems like this also breaks SATA device scraping? I have a bunch of SATA drives connected directly to AMD X470 chipset built-in SATA controller. edit: should've probably read the thread first... 🤦 |
Definitely |
* [CHANGE] `slog` used for logging instead go logger prometheus-community#246
* [ENHANCEMENT] Added support for `megaraid` devices and device types prometheus-community#205 prometheus-community#257
* [BUGFIX] Better support for smartmontools < 7.3 prometheus-community#238
* [BUGFIX] Corrected NVMe read/write bytes to NVMe metrics prometheus-community#211
Signed-off-by: Konstantin Shalygin <[email protected]>

* [CHANGE] `slog` used for logging instead go logger #246
* [ENHANCEMENT] Added support for `megaraid` devices and device types #205 #257
* [BUGFIX] Better support for smartmontools < 7.3 #238
* [BUGFIX] Corrected NVMe read/write bytes to NVMe metrics #211
Signed-off-by: Konstantin Shalygin <[email protected]>
Removed smartctl.device param - smartctl.device-include/smartctl.device-exclude fully covers this
Added determining device type and use it at scrape data
May fix issues:
#89
#26
And there is no need for PR:
#107