UEK-NEXT support #127
Kernel 6.4 adopted a new module memory layout structure which is more flexible than before. Rather than complicate the "ModuleLayout" structure which contained lots of details from the previous implementation, I've gone ahead and removed as much detail as possible. Now the module helpers simply return a list of module address ranges, which may be text, data, rodata, or anything really. In addition, to more efficiently look up module addresses, add support for the kernel module address tree. This avoids iterating over the (possibly long) module list each time you want to find out which module an address belongs to. These helpers ended up being really nice, and I upstreamed them to drgn. Each helper function which was upstreamed has a TODO entry so that we can find and remove them when we raise our minimum required version to drgn 0.0.28. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
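To illustrate why an indexed lookup beats walking the module list, here is a minimal sketch of the idea. It is not the drgn or drgn-tools API: `AddressRange`, `RangeIndex`, and the `module` field are hypothetical names, and a sorted list with binary search stands in for the kernel's module address tree. It assumes the ranges do not overlap.

```python
import bisect
from typing import Iterable, NamedTuple, Optional


class AddressRange(NamedTuple):
    """Hypothetical stand-in for one module address range."""

    start: int   # first address covered by the range
    size: int    # length of the range in bytes
    module: str  # name of the owning module (illustrative only)


class RangeIndex:
    """Binary-search index over non-overlapping address ranges."""

    def __init__(self, ranges: Iterable[AddressRange]) -> None:
        self._ranges = sorted(ranges, key=lambda r: r.start)
        self._starts = [r.start for r in self._ranges]

    def lookup(self, addr: int) -> Optional[str]:
        # Find the last range whose start is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        r = self._ranges[i]
        # The address belongs to the range only if it falls before its end.
        return r.module if addr < r.start + r.size else None
```

Each lookup costs O(log n) in the number of ranges, instead of a full pass over every module's ranges.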
This also somewhat simplifies the legacy code thanks to a new helper for iterating over all tasks in a task group. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
I'm not sure of the root cause, but some workers now have NULL task fields. I'm confident this has always been legal, but just didn't happen in tests prior to v6.9 due to some kernel internals. Now that it happens, let's fix this case by searching for a worker with a non-NULL task. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
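The search described above can be sketched as follows. This is an illustration only, not the code from the commit: `first_worker_with_task` is a hypothetical name, and the workers are plain Python objects with a `task` attribute, where a falsy value (`None` or `0`) plays the role of a NULL pointer.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def first_worker_with_task(workers: Iterable[Any]) -> Optional[Any]:
    """Return the first worker whose task is set, or None if there is none."""
    for worker in workers:
        # A falsy task stands in for a NULL task pointer.
        if worker.task:
            return worker
    return None


# Stand-in data: the first worker has no task, the second one does.
workers = [
    SimpleNamespace(name="idle", task=None),
    SimpleNamespace(name="busy", task=0xFFFF888000001000),
]
found = first_worker_with_task(workers)
```

A caller should still handle the `None` result, since nothing guarantees that any worker in a pool has a task.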
Also, "NR_UNSTABLE_NFS" was never being set non-zero because we were looking it up in mm_stats, not node_zone_stats. Fix that too. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
In 6.3 and 6.4 there was a push to make struct bus_type and struct class const. This means that the private pointers were removed, and replaced by accessor functions. Implement a drgn version of each accessor. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
Previous commits have resolved all of the compatibility issues as of UEK-next 6.9.0-2. Enable the tests so we can run them in CI. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
The minimum required drgn version is 0.0.25 for drgn-tools. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
These errors are not fully resolved in any drgn version, and they're not fatal either. Leave the documentation links, but take away the TODO since there's nothing here to fix. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
An upstreamed version is available within the standard d_path() function, starting from 0.0.29. Once this is the minimum required version of drgn, we can delete this function. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
In v6.10 there was a cleanup of struct block_device, removing the bd_inode field and combining bd_read_only, bd_partno, and others into a single __bd_flags atomic field. Fix the helpers for these changes. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
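As a rough illustration of what decoding a combined flags word looks like, here is a minimal sketch. The bit positions are assumptions based on my reading of the v6.10 headers (partition number in the low 8 bits, read-only flag in bit 8) and should be checked against the target kernel. `decode_bd_flags` and the constant names are hypothetical, not drgn-tools helpers.

```python
from typing import Tuple

# Assumption: the partition number occupies the low 8 bits of __bd_flags.
BD_PARTNO_MASK = 0xFF
# Assumption: the read-only flag is bit 8 of __bd_flags.
BD_READ_ONLY_BIT = 1 << 8


def decode_bd_flags(flags: int) -> Tuple[int, bool]:
    """Split a raw flags value into (partition number, read-only)."""
    partno = flags & BD_PARTNO_MASK
    read_only = bool(flags & BD_READ_ONLY_BIT)
    return partno, read_only
```

On kernels older than v6.10 the separate `bd_partno` and `bd_read_only` fields would be read directly instead, so a real helper needs to check which layout is present.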
Computing size and read-only is duplicated here. Instead, use the helpers from drgn_tools.block, which are already fixed for v6.10 and later. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
The information for partition info can be easily gleaned from /sys/class/block on live systems. Add unit tests to verify the information is correct, so that our test can detect block-related changes like the ones corrected recently. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
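A minimal sketch of gathering the reference data from sysfs might look like this. It is not the test code from this pull request: `BlockInfo` and `read_block_info` are hypothetical names. It relies on the sysfs attributes `partition` (present only for partitions), `ro`, and `size` (counted in 512-byte sectors), and takes the root directory as a parameter so it can be pointed at a fake tree.

```python
import os
from typing import Dict, NamedTuple


class BlockInfo(NamedTuple):
    partno: int      # 0 for whole disks, which have no "partition" file
    read_only: bool
    size_bytes: int


def _read_int(path: str, default: int = 0) -> int:
    """Read a single integer from a sysfs attribute file."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return default


def read_block_info(sysfs_root: str = "/sys/class/block") -> Dict[str, BlockInfo]:
    """Collect partition number, read-only state, and size per block device."""
    info = {}
    for name in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, name)
        info[name] = BlockInfo(
            partno=_read_int(os.path.join(dev, "partition")),
            read_only=bool(_read_int(os.path.join(dev, "ro"))),
            # sysfs reports "size" in 512-byte sectors.
            size_bytes=_read_int(os.path.join(dev, "size")) * 512,
        )
    return info
```

A test can then compare each entry against what the helper reports for the same device name on the live system.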
This matches the output of /proc/mounts, which is important because the test cases compare the output from this helper against /proc/mounts. With this, tests pass on my laptop which happens to have some FUSE mounts that populate s_subtype. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
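The commit title is not shown here, so the following is a sketch under an assumption: that the change makes the helper report the filesystem type the way /proc/mounts does, as `type.subtype` when a subtype is set (for example `fuse.sshfs`). `format_fstype` is a hypothetical name, not the actual helper.

```python
def format_fstype(fs_name: str, subtype: str = "") -> str:
    """Format a filesystem type as "type.subtype" when a subtype is set."""
    # An empty subtype stands in for a NULL s_subtype pointer.
    return f"{fs_name}.{subtype}" if subtype else fs_name
```

Filesystems that never set a subtype, such as ext4, keep their plain type name.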
The max_active field got moved into the workqueue_struct. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
This should have been done a while ago. The "smp" corelens module already disallows running on live systems. IPIs fly too fast for us to keep up in userspace, so running the full module tends to cause issues. In this case, the specific issue had to do with unwinding the stack on the running task, which raises an error in drgn. Orabug: 37296325 Signed-off-by: Stephen Brennan <[email protected]>
IO fixes look good to me.
Changes, in following files, LGTM:
- test_mm.py
- test_module.py
- test_smp.py
- test_task.py
- module.py
- task.py
- workqueue.py
- numastat.py
Changes in lsmod.py are good to go.
LGTM for changes in numastat.py and task.py.
Looks like the upstream has made quite some changes in mm. Thanks for fixing these!
This pull request begins running the GitHub CI tests against UEK-next, which is currently at 6.11. It includes a whole crop of fixes to helpers that allow the tests to pass. This is the first version of UEK-next that I'm actually doing this for, so it includes a lot of changes from the UEK7 era up to the present.
These fixes are necessary, but almost certainly incomplete. The GitHub CI tests run in very small QEMU VMs, and I also manually verified that the tests pass on my laptop. There are plenty of subsystems not exercised by my laptop or by the limited VM in the CI tests.
I'm breaking these changes down into a few categories that I believe subject matter experts should review. If you're tagged here, please take a look at the subset of changes mentioned (note the Git SHAs may not be correct, as I may need to amend things to add bug references).