n_users field in [[inputs.system]] plugin stuck at 0 on ppc64le #10003

Closed
jdmaloney opened this issue Oct 26, 2021 · 10 comments · Fixed by #15082
Labels
area/system; bug (unexpected problem or unintended behavior); upstream (bug or issues that rely on dependency fixes); waiting for response (waiting for response from contributor)

Comments

@jdmaloney

Relevant telegraf.conf

[[inputs.system]]

System info

Telegraf 1.20.2, RHEL 8.4

Docker

Not Applicable

Steps to reproduce

  1. Install telegraf-1.20.2-1.ppc64le
  2. Configure with the stock inputs.system plugin
  3. Run telegraf --config test.conf --test to see the output
    ...

Expected behavior

The number of user sessions on the node is currently 35 (2 of the lines counted below are the output header):

# w | wc -l
37

I expect Telegraf to capture that in the n_users field, for example:

# telegraf --config test.conf --test
2021-10-26T18:17:20Z I! Starting Telegraf 1.20.2
> system,host=XXXX.redacted.com load1=624.76,load15=625.04,load5=624.6,n_cpus=128i,n_users=35i 1635272240000000000
> system,host=XXXX.redacted.com uptime=5432092i 1635272240000000000
> system,host=XXXX.redacted.com uptime_format="62 days, 20:54" 1635272240000000000

Actual behavior

Telegraf records the number of user sessions as 0:

# telegraf --config test.conf --test
2021-10-26T18:17:20Z I! Starting Telegraf 1.20.2
> system,host=XXXX.redacted.com load1=624.76,load15=625.04,load5=624.6,n_cpus=128i,n_users=0i 1635272240000000000
> system,host=XXXX.redacted.com uptime=5432092i 1635272240000000000
> system,host=XXXX.redacted.com uptime_format="62 days, 20:54" 1635272240000000000

Additional info

No errors are recorded in the telegraf log complaining about not being able to retrieve that field.
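
To repeat this comparison on more hosts, the n_users value can be pulled out of the --test output and checked against the session count from w. The following is only a minimal sketch: nUsers is a hypothetical helper, not part of Telegraf, and it uses a simplified split that assumes no spaces or commas inside values, so it would not parse the uptime_format line.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nUsers extracts the n_users integer field from one line of
// `telegraf --test` output. Hypothetical helper with a simplified parser:
// it assumes no spaces or commas inside tag or field values.
func nUsers(line string) (int, error) {
	line = strings.TrimPrefix(strings.TrimSpace(line), "> ")
	// Layout: <measurement,tags> <fields> <timestamp>
	parts := strings.Split(line, " ")
	if len(parts) < 2 {
		return 0, fmt.Errorf("unexpected line: %q", line)
	}
	for _, field := range strings.Split(parts[1], ",") {
		if strings.HasPrefix(field, "n_users=") {
			value := strings.TrimPrefix(field, "n_users=")
			// Integer fields carry a trailing "i" in this output.
			return strconv.Atoi(strings.TrimSuffix(value, "i"))
		}
	}
	return 0, fmt.Errorf("n_users not found in %q", line)
}

func main() {
	// Sample line copied from the "Actual behavior" output in this issue.
	line := "> system,host=XXXX.redacted.com load1=624.76,load15=625.04,load5=624.6,n_cpus=128i,n_users=0i 1635272240000000000"
	n, err := nUsers(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("n_users reported by Telegraf:", n)
}
```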

@jdmaloney jdmaloney added the bug unexpected problem or unintended behavior label Oct 26, 2021
@powersj
Contributor

powersj commented Oct 26, 2021

Thanks for opening this over here.

It looks like the system input plugin pulls the number of users from the gopsutil library, and the value itself comes from this function.

Would you be willing to run the following Go code snippet to help narrow down whether the problem is in the gopsutil library itself or in Telegraf?

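package main

import (
	"fmt"
	"os"

	"github.com/shirou/gopsutil/host"
)

func main() {
	// Ask gopsutil for the logged-in user sessions.
	users, err := host.Users()
	if err == nil {
		// Print the number of sessions the library found.
		fmt.Println(len(users))
	} else if os.IsNotExist(err) {
		fmt.Println("Reading users: ", err.Error())
	} else if os.IsPermission(err) {
		fmt.Println(err.Error())
	} else {
		// Report any other error instead of exiting silently.
		fmt.Println(err.Error())
	}
}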

Thanks!

@jdmaloney
Author

I'm not the best with Go, so let me know if I'm doing something wrong here, but this is what I get:

# cat jd_test.go
package main

import (
	"fmt"
	"os"

	"github.com/shirou/gopsutil/host"
)

func main() {
	users, err := host.Users()
	if err == nil {
		fmt.Println(len(users))
	} else if os.IsNotExist(err) {
		fmt.Println("Reading users: ", err.Error())
	} else if os.IsPermission(err) {
		fmt.Println(err.Error())
	}
}

Running code snippet:

# chmod +x jd_test.go
# go run jd_test.go
go run: cannot run *_test.go files (jd_test.go)
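
The error above comes from the file name rather than from the code: the go tool treats files whose names end in _test.go as test files, and go run refuses them. Saving the same program under another name, for example main.go, should avoid it (the chmod +x step is not needed for go run). The naming rule can be sketched as follows; runnableWithGoRun is a hypothetical helper that mirrors only the suffix check, not the other checks the go tool performs.

```go
package main

import (
	"fmt"
	"strings"
)

// runnableWithGoRun reports whether a file name passes the _test.go rule.
// Hypothetical helper: it only mirrors the suffix check and ignores every
// other check the go tool performs.
func runnableWithGoRun(name string) bool {
	return strings.HasSuffix(name, ".go") && !strings.HasSuffix(name, "_test.go")
}

func main() {
	fmt.Println(runnableWithGoRun("jd_test.go")) // false: treated as a test file
	fmt.Println(runnableWithGoRun("main.go"))    // true
}
```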

@powersj
Contributor

powersj commented Oct 26, 2021

To make this a little easier, I added some debugging messages to Telegraf and put up a fake PR to get it to build. Can you try downloading the ppc64el.tar.gz, running it with the --debug option, and sharing the output here?

Thanks!

@jdmaloney
Author

I got the following:

# ./telegraf --config ../../etc/telegraf/test.conf --debug --test
2021-10-26T20:53:23Z I! Starting Telegraf
2021-10-26T20:53:23Z D! [agent] Initializing plugins
2021-10-26T20:53:23Z D! [agent] Starting service inputs
2021-10-26T20:53:23Z D! [inputs.system] Found %!i(int=0) number of users
2021-10-26T20:53:23Z D! [agent] Stopping service inputs
2021-10-26T20:53:23Z D! [agent] Input channel closed
2021-10-26T20:53:23Z D! [agent] Stopped Successfully
> system,host=XXXX.redacted.com load1=624.1,load15=624.25,load5=624.15,n_cpus=128i,n_users=0i 1635281603000000000
> system,host=XXXX.redacted.com uptime=5441455i 1635281603000000000
> system,host=XXXX.redacted.com uptime_format="62 days, 23:30" 1635281603000000000
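
A note on the `%!i(int=0)` in the debug line: this is a formatting artifact of the temporary debug build, not part of the bug. Go's fmt package has no %i verb, so it prints the bad verb together with the argument's type and value; the count being reported is still 0. A minimal sketch reproducing the effect follows. The function name usersMessage is hypothetical and the message text is copied from the log above; the actual debug statement is not shown in this thread.

```go
package main

import "fmt"

// usersMessage formats the debug line with the given fmt verb (without the
// leading '%'). The format string is assembled at run time so the valid and
// the invalid verb can be compared side by side.
func usersMessage(verb string, n int) string {
	return fmt.Sprintf("Found %"+verb+" number of users", n)
}

func main() {
	fmt.Println(usersMessage("i", 0)) // prints "Found %!i(int=0) number of users"
	fmt.Println(usersMessage("d", 0)) // prints "Found 0 number of users"
}
```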

@powersj
Contributor

powersj commented Oct 26, 2021

Looks like the library itself is reporting 0 users. Can you open a bug in the upstream gopsutil project and see what they say? You can reference this bug as well.

Thanks!

@radu-boboc

I got the same problem.
Is there a fix for it?

@powersj powersj added the upstream bug or issues that rely on dependency fixes label Aug 2, 2023
@misterf13

I have only seen this issue on Raspberry Pis. I modified the script from @jdmaloney into main.go:

package main

import (
        "fmt"
        "os"
        "strings"

        "github.com/shirou/gopsutil/host"
)

func main() {
        // Get host information
        users, err := host.Users()
        if err == nil {
                nUsers := len(users)
                nUniqueUsers := findUniqueUsers(users)

                fmt.Printf("Number of users: %d\n", nUsers)
                fmt.Printf("Number of unique users: %d\n", nUniqueUsers)
        } else if os.IsNotExist(err) {
                fmt.Println("Reading users: ", err.Error())
        } else if os.IsPermission(err) {
                fmt.Println("Permission error: ", err.Error())
        } else {
                fmt.Println("Other error: ", err.Error())
        }
}

func findUniqueUsers(users []host.UserStat) int {
        uniqueUsernames := make(map[string]struct{})

        for _, user := range users {
                // Normalize username (case-insensitive) for uniqueness
                normalizedUsername := strings.ToLower(user.User)
                uniqueUsernames[normalizedUsername] = struct{}{}
        }

        // Print detailed information about each user
        fmt.Println("User details:")
        for i, user := range users {
                fmt.Printf("User %d: %+v\n", i+1, user)
        }

        return len(uniqueUsernames)
}
->go run main.go
User details:
Number of users: 0
Number of unique users: 0

@misterf13

More info here: shirou/gopsutil#1129

@JosefRypacek

The bug has just been fixed (shirou/gopsutil#1129). I guess we need to wait for the next release of gopsutil, which could be at the beginning of April, and then for the dependency to be updated here in Telegraf.

@powersj
Contributor

powersj commented Apr 1, 2024

@JosefRypacek,

It looks like gopsutil did a release recently, so I've put up #15082 with the updated dependency. Could you give the artifacts in that PR a try and let me know if it resolves the issue?

Thanks!

@powersj powersj added the waiting for response waiting for response from contributor label Apr 1, 2024