xcapture-bpf error: no member named '__state' in 'struct task_struct' #51

Open
kevinbin opened this issue Sep 21, 2024 · 5 comments

@kevinbin

[root@10-35-19-16 0xtools]# uname -a
Linux 10-35-19-16 4.18.0-305.3.1.el8.x86_64 #1 SMP Tue Jun 1 16:14:33 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@10-35-19-16 0xtools]# cat /etc/redhat-release
CentOS Linux release 8.4.2105

[root@10-35-19-16 0xtools]# ./bin/xtop
=== [0x.tools] xtop 2.0.3 BETA by Tanel Poder. Centos Linux 8 4.18.0 x86_64
=== Loading BPF...
In file included from :2:
In file included from /virtual/include/bcc/bpf.h:12:
In file included from include/linux/types.h:6:
In file included from include/uapi/linux/types.h:14:
In file included from include/uapi/linux/posix_types.h:5:
In file included from include/linux/stddef.h:5:
In file included from include/uapi/linux/stddef.h:2:
In file included from include/linux/compiler_types.h:74:
include/linux/compiler-clang.h:25:9: warning: '__no_sanitize_address' macro redefined [-Wmacro-redefined]
#define __no_sanitize_address
^
include/linux/compiler-gcc.h:213:9: note: previous definition is here
#define __no_sanitize_address __attribute__((no_sanitize_address))
^
/virtual/main.c:173:29: error: no member named '__state' in 'struct task_struct'; did you mean 'state'?
t->state = curtask->__state;
^~~~~~~
state
include/linux/sched.h:424:18: note: 'state' declared here
volatile long state;
^
/virtual/main.c:312:37: error: no member named '__state' in 'struct task_struct'; did you mean 'state'?
unsigned int prev_state = prev->__state; // ctx->args[3] won't work in older configs due to breaking change in sched_switch tracepoint
^~~~~~~
state
include/linux/sched.h:424:18: note: 'state' declared here
volatile long state;
^
/virtual/main.c:371:31: error: no member named '__state' in 'struct task_struct'; did you mean 'state'?
t_next->state = next->__state;
^~~~~~~
state
include/linux/sched.h:424:18: note: 'state' declared here
volatile long state;
^
1 warning and 3 errors generated.
Traceback (most recent call last):
File "/root/0xtools/bin/xcapture-bpf", line 474, in <module>
b = BPF(text= ifdef + bpf_text)
File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 365, in __init__
raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>

@tanelpoder
Owner

tanelpoder commented Sep 21, 2024

This is due to a kernel structure change somewhere between 4.18 and 5.x (5.14 upstream), where the kernel scheduler developers renamed the task->state field to task->__state. I thought I had added a conditional to check for that so it would work on older kernels too, but apparently I removed it. In v3 this will be fixed using BTF field checks. In v2, extra #ifdef-s based on the kernel version could be added.

I have to head out (to drink beer with friends) soon, so for now you can just replace all occurrences of __state with state in the xcapture-bpf.c file on 4.18 kernels (should be just 3 locations). Please let me know if this works.

@tanelpoder tanelpoder self-assigned this Sep 21, 2024
@tanelpoder
Owner

The PR #53 should fix it.

@tanelpoder
Owner

Oh, now I remember why I had removed that kernel version check earlier.

Red Hat backported the state to __state rename to RHEL8's 4.18 kernel (along with plenty of other newer eBPF improvements from kernel 5.x). So, when running on RHEL 8 (or a clone), one needs to use the newer __state field too, despite the kernel version being < 5.14.

[tanel@rhel8 bin]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.10 (Ootpa)
[tanel@rhel8 bin]$ 
[tanel@rhel8 bin]$ uname -a
Linux rhel8.localdomain 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Fri Jun 14 03:19:37 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[tanel@rhel8 bin]$ 
[tanel@rhel8 bin]$ sudo ./xcapture-bpf
=== [0x.tools] xcapture-bpf 2.0.3 BETA by Tanel Poder.  Linux  4.18.0 x86_64
===  Loading BPF...
/virtual/main.c:181:29: error: no member named 'state' in 'struct task_struct'
  181 |         t->state = curtask->state;
      |                    ~~~~~~~  ^
/virtual/main.c:324:37: error: no member named 'state' in 'struct task_struct'
  324 |     unsigned int prev_state = prev->state; 
      |                               ~~~~  ^
/virtual/main.c:387:31: error: no member named 'state' in 'struct task_struct'
  387 |         t_next->state = next->state;
      |                         ~~~~  ^
3 errors generated.
Traceback (most recent call last):
  File "./xcapture-bpf", line 474, in <module>
    b = BPF(text= ifdef + bpf_text)
  File "/usr/lib/python3.6/site-packages/bcc/__init__.py", line 476, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>

In xcapture-next (v3), which uses BTF + CO-RE, this problem is solved by field existence checks rather than by assuming field names based on the kernel version. For the current v2, we should probably just check the kernel version (and RHEL clone vs. not) and dynamically set a #define that determines whether the __state or state name is used.

Or if there's some #define that only RHEL/clone compiled kernels add, then check for that too.
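
A minimal sketch of the "dynamically set a #define" idea, in Python since xcapture-bpf builds its BPF text there. Everything named here (`parse_release`, `state_field`, `state_define`, `EL_BACKPORT_MIN_BUILD`) is hypothetical and not taken from the 0xtools source. The thread only establishes that el8 build 4.18.0-305 still has `state` and that 4.18.0-553 has `__state`; the exact build that introduced the backport is not stated, so the table below uses 553 as a conservative placeholder.

```python
import re

# Upstream kernels use task_struct.__state from 5.14 on (per the thread).
UPSTREAM_RENAME = (5, 14)

# ASSUMPTION: minimum RHEL build number (the "553" in 4.18.0-553...) per EL
# major release that carries the __state backport. 553 is only the build the
# thread confirms; the real cutoff lies above 305 and at or below 553.
EL_BACKPORT_MIN_BUILD = {8: 553}


def parse_release(release):
    """Split a `uname -r` string into ((major, minor), build, el_major)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.\d+)?(?:-(\d+))?", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    el = re.search(r"\.el(\d+)", release)
    version = (int(m.group(1)), int(m.group(2)))
    build = int(m.group(3)) if m.group(3) else None
    el_major = int(el.group(1)) if el else None
    return version, build, el_major


def state_field(release, el_min_build=EL_BACKPORT_MIN_BUILD):
    """Return the task_struct field name to use: '__state' or 'state'."""
    version, build, el_major = parse_release(release)
    if version >= UPSTREAM_RENAME:
        return "__state"
    min_build = el_min_build.get(el_major)
    if min_build is not None and build is not None and build >= min_build:
        return "__state"
    return "state"


def state_define(release):
    """Build a #define line to prepend to the BPF program text."""
    return "#define STATE_FIELD %s\n" % state_field(release)
```

The result of `state_define(platform.release())` would be prepended to the program text before it is passed to `BPF(text=...)`, with the BPF C code reading `curtask->STATE_FIELD`. A release-string table like this needs maintenance per distro, which is the weakness the v3 field existence checks avoid.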

Reopening the issue for now.

@tanelpoder
Owner

tanelpoder commented Dec 11, 2024

I don't have time to go deeper myself right now, but at least ChatGPT told me that there are RHEL-specific defines:

RHEL_MAJOR and RHEL_MINOR:

These macros are defined in the file /usr/include/linux/version.h or /usr/include/linux/rh_version.h (depending on the kernel and version). They indicate the major and minor version of the RHEL kernel.

Looks like indeed they do exist:

https://access.redhat.com/solutions/37608

@Christoph-Lutz

Christoph-Lutz commented Dec 12, 2024

Oh my, I see, I wasn't aware of the task->__state backport to the RHEL8 4.18 kernels. Unfortunately, with #53, xcapture-bpf.c now breaks on RHEL8. So yeah, another (still quick 'n dirty) fix to get this addressed could be something like this:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 14, 0) || RHEL_MAJOR >= 8
#define STATE_FIELD __state
#else
#define STATE_FIELD state
#endif
...
t->state = curtask->STATE_FIELD;

Just managed to quickly get that tested on RHEL 8.10, 9.5 and OEL 8.10 and will submit a new PR shortly.
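
One caveat visible in this thread: the original report came from an el8 kernel (4.18.0-305, CentOS 8.4) whose sched.h still declares `state`, so a check on the RHEL major version alone would presumably pick the wrong name there. An alternative that nobody in the thread proposed is to look at the installed kernel headers before compiling. The sketch below is only an illustration of that idea: the function names are hypothetical, and the header path assumes the usual `/lib/modules/<release>/build` layout provided by kernel-devel packages.

```python
import os
import re

# Matches the field as declared in the logs above ("volatile long state;")
# and the newer integer-typed "__state" form.
_STATE_DECL = re.compile(
    r"^\s*(?:volatile\s+)?(?:unsigned\s+)?(?:long|int)\s+(__state|state)\s*;",
    re.MULTILINE,
)


def state_field_from_header(header_text):
    """Return '__state' or 'state' as declared inside struct task_struct,
    or None if the declaration cannot be found."""
    start = header_text.find("struct task_struct {")
    if start < 0:
        return None
    # First matching declaration after the struct opens.
    m = _STATE_DECL.search(header_text, start)
    return m.group(1) if m else None


def read_sched_header(release):
    """Read sched.h for the given `uname -r` string, or return None.

    ASSUMPTION: headers live under /lib/modules/<release>/build.
    """
    path = os.path.join(
        "/lib/modules", release, "build", "include", "linux", "sched.h"
    )
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None
```

If `read_sched_header()` returns None (no headers installed), the caller would still need a fallback such as the version-based #if above.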
