Add release package for linux/arm64/v8 #267

Closed
gongyisheng opened this issue Jan 31, 2025 · 5 comments

@gongyisheng

I'm trying to run smartctl_exporter on my Ubuntu Linux machine with Docker Compose, but there is no release matching my platform.

docker compose yaml:

version: "3"

services:
  smartctl-exporter:
    image: prometheuscommunity/smartctl-exporter
    privileged: true
    user: root
    ports:
      - "9633:9633"

error message:

no matching manifest for linux/arm64/v8 in the manifest list entries
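
For context, this error means the image's manifest list has no entry for the platform of the host. Here is a minimal sketch for working out which platform string a host maps to; the function name docker_platform_for is made up for illustration and only a few common `uname -m` values are covered. Docker may append a variant such as /v8 to linux/arm64, as in the message above.

```shell
# Map the output of `uname -m` to the platform string Docker looks up
# in an image's manifest list. Hypothetical helper; common values only.
docker_platform_for() {
  case "$1" in
    x86_64)        echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    armv7l)        echo "linux/arm/v7" ;;
    armv6l)        echo "linux/arm/v6" ;;
    *)             echo "unknown" ;;
  esac
}

# Usage on the host:
#   docker_platform_for "$(uname -m)"
# The platforms an image actually ships can then be listed with:
#   docker manifest inspect prometheuscommunity/smartctl-exporter
```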
@GoingtohellYT

Hey!

I tried the same thing today on an RPi and got the same error. However, looking at the Docker Hub page, I found that version 0.7.0 supports arm64, although it is 2 years old now...

Trying to run this version with the following service at the end of a docker-compose file results in the container exiting with code 2:

  smartctl_exporter:
      image: prometheuscommunity/smartctl-exporter:v0.7.0
      container_name: smartctl
      privileged: true
      user: root
      ports:
        - "9633:9633"

Here are the returned logs:

[Warning] S.M.A.R.T. output reading error: fork/exec /usr/sbin/smartctl: no such file or directory
[Error] Device /dev/sdb unavialable
[Error] Device /dev/sdc unavialable
[Error] Device /dev/sdd unavialable
[Error] Device /dev/sde unavialable
[Error] Device /dev/sdf unavialable
[Error] Device /dev/sdg unavialable
[Error] Device /dev/sdh unavialable
[Error] Device /dev/sdi unavialable
[Error] Device /dev/sdj unavialable
[Error] Device /dev/sdk unavialable
[Error] Device /dev/sdl unavialable
panic: runtime error: index out of range [0] with length 0
goroutine 36 [running]:
main.(*SMARTctlInfo).mineVersion(0x4000113e80)
        /app/smartctlinfo.go:59 +0x748
main.(*SMARTctlInfo).Collect(...)
	/app/smartctlinfo.go:48
main.SMARTctlManagerCollector.Collect({}, 0x40001c00c0?)
	/app/main.go:51 +0x78
github.com/prometheus/client_golang/prometheus.DescribeByCollect.func1()
	/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/collector.go:90 +0x34
created by github.com/prometheus/client_golang/prometheus.DescribeByCollect
	/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/collector.go:89 +0xa0

Is it still possible to run this exporter on arm64 devices, or is this version now too old and the more recent ones incompatible?

@gongyisheng
Author

gongyisheng commented Feb 3, 2025

@GoingtohellYT I got the same error as you. I can't run prometheuscommunity/smartctl-exporter:v0.7.0 either.

@gongyisheng
Author

@GoingtohellYT Besides, I downloaded the linux-arm64 package from the releases (the 0.13.0-2024-12-20 version) and it runs successfully on my RPi. However, only a limited number of metrics are exposed. Important metrics like smartctl_device_bytes_written and smartctl_device_bytes_read are missing.

Here are the metrics I get:

# HELP smartctl_device Device info
# TYPE smartctl_device gauge
smartctl_device{ata_additional_product_id="unknown",ata_version="",device="sdb",firmware_version="ELFMH0.1",form_factor="",interface="sntrealtek",model_family="unknown",model_name="T-FORCE TM8FFT001T",protocol="NVMe",sata_version="",scsi_product="",scsi_revision="",scsi_vendor="",scsi_version="",serial_number="TPBF2311170010200049"} 1
# HELP smartctl_device_block_size Device block size
# TYPE smartctl_device_block_size gauge
smartctl_device_block_size{blocks_type="logical",device="sdb"} 512
smartctl_device_block_size{blocks_type="physical",device="sdb"} 0
# HELP smartctl_device_capacity_blocks Device capacity in blocks
# TYPE smartctl_device_capacity_blocks gauge
smartctl_device_capacity_blocks{device="sdb"} 1.953525168e+09
# HELP smartctl_device_capacity_bytes Device capacity in bytes
# TYPE smartctl_device_capacity_bytes gauge
smartctl_device_capacity_bytes{device="sdb"} 1.000204886016e+12
# HELP smartctl_device_power_cycle_count Device power cycle count
# TYPE smartctl_device_power_cycle_count counter
smartctl_device_power_cycle_count{device="sdb"} 22
# HELP smartctl_device_power_on_seconds Device power on seconds
# TYPE smartctl_device_power_on_seconds counter
smartctl_device_power_on_seconds{device="sdb"} 388800
# HELP smartctl_device_smart_status General smart status
# TYPE smartctl_device_smart_status gauge
smartctl_device_smart_status{device="sdb"} 1
# HELP smartctl_device_smartctl_exit_status Exit status of smartctl on device
# TYPE smartctl_device_smartctl_exit_status gauge
smartctl_device_smartctl_exit_status{device="sdb"} 0
# HELP smartctl_device_temperature Device temperature celsius
# TYPE smartctl_device_temperature gauge
smartctl_device_temperature{device="sdb",temperature_type="current"} 24
# HELP smartctl_devices Number of devices configured or dynamically discovered
# TYPE smartctl_devices gauge
smartctl_devices 1
# HELP smartctl_version smartctl version
# TYPE smartctl_version gauge
smartctl_version{build_info="(local build)",json_format_version="1.0",smartctl_version="7.4",svn_revision="5530"} 1

The log looks normal, though, and shows no errors.

time=2025-02-03T19:17:06.898Z level=INFO source=main.go:143 msg="Ignoring device" name=sda
time=2025-02-03T19:17:06.898Z level=INFO source=main.go:145 msg="Found device" name=sdb
time=2025-02-03T19:27:06.900Z level=INFO source=main.go:85 msg="Rescanning for devices"
time=2025-02-03T19:27:07.001Z level=INFO source=main.go:143 msg="Ignoring device" name=sda
time=2025-02-03T19:27:07.002Z level=INFO source=main.go:145 msg="Found device" name=sdb
time=2025-02-03T19:37:07.002Z level=INFO source=main.go:85 msg="Rescanning for devices"
time=2025-02-03T19:37:07.100Z level=INFO source=main.go:143 msg="Ignoring device" name=sda
time=2025-02-03T19:37:07.100Z level=INFO source=main.go:145 msg="Found device" name=sdb
time=2025-02-03T19:47:07.102Z level=INFO source=main.go:85 msg="Rescanning for devices"
time=2025-02-03T19:47:07.206Z level=INFO source=main.go:143 msg="Ignoring device" name=sda
time=2025-02-03T19:47:07.206Z level=INFO source=main.go:145 msg="Found device" name=sdb

I'm wondering whether this is also caused by an issue in the package, or by my NVMe disk?

@GoingtohellYT

GoingtohellYT commented Feb 6, 2025

@gongyisheng I couldn't get the exporter to work either, but I did find a workaround that works fine for me.

Node-exporter, even though it does not scrape SMART data itself, can read metrics from text files. Knowing this, you can create a Bash script that writes the data returned by the smartctl -a /path/to/disk command into such a file.
You might need to modify it according to your needs (I've only tried it with 1 disk), but the structure should look something like this:

#!/bin/bash

# Path to the directory read by Node-exporter's textfile collector
TEXTFILE_DIR="/path/to/directory/for/text/file"

# Path to the output file
OUTPUT_FILE="${TEXTFILE_DIR}/smartmon.prom"

# Clear the file
: > "$OUTPUT_FILE"

# List of disks (a glob is safer than parsing the output of ls)
DISKS=(/dev/sd?)

# Collect SMART data for each drive
for DISK in "${DISKS[@]}"; do
  # Skip the unexpanded pattern when no /dev/sd? device exists
  [ -e "$DISK" ] || continue
  sudo smartctl -A -j "$DISK" | jq -r '
    .ata_smart_attributes.table[]? |
    select(.id != null) |
    "\(.name | gsub("-"; "_")) \(.raw.value)"
  ' | awk '{printf "%s %.1f\n", $1, $2}' >> "$OUTPUT_FILE"
done

What the script does is basically the following.
For each disk:

  • run the smartctl -A -j /path/to/disk command
  • extract the attribute table from the JSON output and turn each entry into a plain-text line with its raw value
  • append those lines to the output file

The file should look something like this:
NAME_OF_METRIC value.as.float

However, I think that running this with multiple disks might lead to an issue, as you will have multiple lines with the same name...
To avoid this, you might want to change the script so that it writes a text file that looks more like this:
NAME_OF_METRIC_disk value.as.float

This way, you could identify each disk more easily.
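
An alternative to putting the disk in the metric name is to attach it as a Prometheus label. The sketch below is only an illustration of that idea: the function name format_smart_metrics and the metric name smartmon_attr_raw_value are made up, and it assumes the jq filter is changed to print "ID NAME RAW_VALUE" per line, for example with "\(.id) \(.name | gsub("-"; "_")) \(.raw.value)". Adding the attribute ID as a label also keeps attributes that share a name apart.

```shell
# Convert "ID NAME RAW_VALUE" lines on stdin into labelled samples.
# smartmon_attr_raw_value is a hypothetical metric name, not an
# established one.
format_smart_metrics() {
  # $1 is the short device name, e.g. sda
  awk -v disk="$1" '
    NF >= 3 {
      printf "smartmon_attr_raw_value{disk=\"%s\",id=\"%s\",name=\"%s\"} %s\n",
             disk, $1, $2, $3
    }'
}

# Usage inside the loop (sketch):
#   sudo smartctl -A -j "$DISK" | jq -r '...' \
#     | format_smart_metrics "$(basename "$DISK")" >> "$OUTPUT_FILE"
```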

Once your script is all set up, you need to make it executable: chmod u+x /path/to/file

Of course, for the data to stay up to date, you need to run the script periodically using a cron job:
Use crontab -e to edit the cron file.
Add the following line to execute it every 5 minutes:
*/5 * * * * /path/to/your/script

Lastly, you need to modify your Docker Compose file so that Node-exporter looks into that directory to retrieve the data. My Node-exporter configuration looks like this, feel free to modify it to your needs:

node_exporter:
    image: quay.io/prometheus/node-exporter:v1.5.0
    container_name: node_exporter
    ports:
      - "9100:9100"
    command:
      - "--path.rootfs=/host"
      - "--collector.textfile.directory=/path/to/dir/in/container"
    pid: host
    restart: unless-stopped
    volumes:
      - "/:/host:ro,rslave"
      - "/path/to/dir/on/host:/path/to/dir/in/container"
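
To check that the textfile collector is picking the file up, one option is to look at the node_textfile_scrape_error metric that Node-exporter exposes alongside the collected data. The helper below is a small sketch; its name textfile_scrape_ok is made up, and the URL in the usage line assumes the 9100 port mapping from the configuration above.

```shell
# Read Node-exporter's /metrics output on stdin and succeed only if
# node_textfile_scrape_error is present and equal to 0.
textfile_scrape_ok() {
  awk '
    $1 == "node_textfile_scrape_error" { found = 1; if ($2 + 0 != 0) bad = 1 }
    END { exit (found && !bad) ? 0 : 1 }
  '
}

# Usage:
#   curl -s http://localhost:9100/metrics | textfile_scrape_ok \
#     && echo "textfile collector OK"
```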

In my scenario, here are the metrics I can retrieve using this method:

Raw_Read_Error_Rate 0.0
Reallocate_NAND_Blk_Cnt 0.0
Power_On_Hours 12344.0
Power_Cycle_Count 15.0
Program_Fail_Count 0.0
Erase_Fail_Count 0.0
Ave_Block_Erase_Count 3.0
Unexpect_Power_Loss_Ct 10.0
Unused_Reserve_NAND_Blk 41.0
SATA_Interfac_Downshift 0.0
Error_Correction_Count 0.0
Reported_Uncorrect 0.0
Temperature_Celsius 206160068636.0
Reallocated_Event_Count 0.0
Current_Pending_ECC_Cnt 0.0
Offline_Uncorrectable 0.0
UDMA_CRC_Error_Count 0.0
Percent_Lifetime_Remain 0.0
Write_Error_Rate 0.0
Success_RAIN_Recov_Cnt 0.0
Total_LBAs_Written 1254920054.0
Host_Program_Page_Count 39216251.0
FTL_Program_Page_Count 47579136.0
Unkn_CrucialMicron_Attr 0.0
Read_Error_Retry_Rate 0.0
Unkn_CrucialMicron_Attr 1940210205.0
Unkn_CrucialMicron_Attr 1.0
Unkn_CrucialMicron_Attr 0.0
Unkn_CrucialMicron_Attr 103.0
Unkn_CrucialMicron_Attr 1.0

With this, you should be all set up and able to retrieve the data in Prometheus. Note that Node-exporter will probably log an error if your scrape interval is shorter than the time between two executions of the Bash script. This is normal and should not prevent you from using it. You can always increase the scrape interval in the prometheus.yml file if you want to make these errors go away.
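
Another possible source of intermittent errors, offered here as a guess rather than a diagnosis, is Node-exporter reading the file while the script is still rewriting it, since the script empties the file and then appends to it. A common way around this is to write to a temporary file and rename it into place. The function name write_atomically below is made up; the temporary file lives in the same directory and does not end in .prom, so the collector should ignore it.

```shell
# Write stdin to a temporary file next to the target, then rename it
# into place so readers never see a half-written file.
write_atomically() {
  target="$1"
  tmp="${target}.$$"
  cat > "$tmp" && mv "$tmp" "$target"
}

# Usage (sketch): wrap the whole collection loop instead of appending
# to the output file on every iteration.
#   for DISK in "${DISKS[@]}"; do ...; done | write_atomically "$OUTPUT_FILE"
```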

That should do the trick. Tell me if you need any further help or details. Hope it helps.

PS: I use a SATA drive, so I don't know how this will work with an NVMe drive...

@gongyisheng
Author

@GoingtohellYT It works for me after making some small code changes, thanks.
