Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kunit to check the longest symbol length #956

Draft
wants to merge 1 commit into
base: rust
Choose a base branch
from

Conversation

sergiocollado
Copy link

Kunit test to check the maximum longest symbol length

Signed-off-by: Sergio González Collado [email protected]

Copy link

@sulix sulix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely much cleaner than the kselftest version, and I quite like it overall.

I've left a bunch of comments, though they're all just suggestions: don't feel you have to change everything I suggest.

One thing I do miss from the kselftest version is the check with kallsyms_lookup_name, which would actually check that the symbol is actually present in the final kernel and/or module symbol table: as-is, the symbol could be inlined into the KUnit test by the compiler, and/or might not work as an exported symbol. So I'd not object to that being added back in as well (maybe as a separate test case within the same suite).

lib/Makefile Outdated
@@ -380,6 +380,7 @@ obj-$(CONFIG_OVERFLOW_KUNIT_TEST) += overflow_kunit.o
CFLAGS_stackinit_kunit.o += $(call cc-disable-warning, switch-unreachable)
obj-$(CONFIG_STACKINIT_KUNIT_TEST) += stackinit_kunit.o
obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
obj-$(CONFIG_LONGEST_SYM_KUNIT_TEST) += test_longest_symbol_kunit.o
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Does this filename need both a test_ prefix and a _kunit suffix?
How about longest_symbol_kunit.o (or longest_sym_kunit.o?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

#define DDDDDI(name) DDDDI(n##name##name)

/*Generate a symbol whose name length is 511 */
#define LONGEST_SYM_NAME_LEN DDDDDI(g1h2i3j4k5l6m7n)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments:

  1. Could we just name this LONGEST_SYM_NAME: the _LEN suggests to me that this macro resolves to the length of the symbol, not the name itself?
  2. Random idea: Would it be possible to create two very long symbols which only differ in the last few characters? I could imagine that as a failure case with very long symbols. (e.g., ancient compilers with max symbol lengths of ~30 would treat all symbols with the same first 30 characters as the same).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
I have added a larger symbol by 1 character, I was expecting failure, but actually it works :/. I don't know yet, why it works :/


int LONGEST_SYM_NAME_LEN(void)
{
return 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have this return something more interesting than 0. I could imagine a case where we accidentally end calling a different function, and a random function returning 0 is quite likely, but a random function returning 0x424242 or something would be less likely to accidentally pass.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

"Incorrect symbol length found. It was expected KSYM_NAME_LEN: " \
__stringify(KSYM_NAME_LEN) ", but found: " \
__stringify(sizeof(LONGEST_SYM_NAME_LEN)));

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe get rid of "It was" here, and just have "Expected KSYM_NAME_LEN:"?

/*Generate a symbol whose name length is 511 */
#define LONGEST_SYM_NAME_LEN DDDDDI(g1h2i3j4k5l6m7n)

int LONGEST_SYM_NAME_LEN(void)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe mark this noinline, so we're actually loading another symbol.

It doesn't really matter, as this is all resolved at compile time anyway, but I'd feel weird about the whole test getting optimized down to nothing. :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ojeda
Copy link
Member

ojeda commented Feb 1, 2023

I have totally copied your macros at #504 (comment). I will try to find a better option, but yours is a very good one.

Boqun's macros are a nice & simple approach.

If you need something more sophisticated (e.g. for David's suggestion of generating another slightly different symbol etc.), then we could call a program to the symbol (or the test, even). If done in C, one could import the length macro itself to generate the largest symbol, and thus make the test independent of the value (i.e. avoiding to have to adjust the macros etc.).

I am not sure if it is worth the complexity, really. But if you need it or want to give it a try, I can give you a commit with how to make it work, and you could plug something like ojeda@0329a98 into it.

@sergiocollado
Copy link
Author

Thanks for the comments. Those were right on the money.

Although I'm somewhat surprised that the that the longest_symbol +1 works. :/

@@ -2582,6 +2582,15 @@ config FORTIFY_KUNIT_TEST
by the str*() and mem*() family of functions. For testing runtime
traps of FORTIFY_SOURCE, see LKDTM's "FORTIFY_*" tests.

config LONGEST_SYM_KUNIT_TEST
tristate "Test the longest symbol possible" if !KUNIT_ALL_TESTS
Copy link

@sulix sulix Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't build as a module:

ERROR: modpost: missing MODULE_LICENSE() in lib/longest_symbol_kunit.o
ERROR: modpost: "kallsyms_lookup_name" [lib/longest_symbol_kunit.ko] undefined!

The easy thing to do would be to make this a bool option, but this is exactly the sort of thing we probably should test as a module, too (maybe that's why the longest+1 name is working: the compiler/linker are fine, but the module tools (e.g. modpost) aren't). Fixing it up to work as a module would lose us some of the simplicity of the KUnit port, though.

Copy link
Author

@sergiocollado sergiocollado Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting ... yes, that may be very well the reason for the longest+1 working. That is good, because I was feeling quite uneasy about it.

I will look into it and make it a module.

Thanks for pointing it out :) 👍

Copy link
Author

@sergiocollado sergiocollado Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sulix I have updated the code, so now it actually compiles and can be executed as a module (as referenced here: https://www.kernel.org/doc/html/latest/dev-tools/kunit/running_tips.html#running-tests-as-modules ).

Incredible as it sounds, the test with the longest symbol +1, when executed as a module, passes. Is not clear to me why. I have to delve into this further.

sergio@ubuntu:/lib/modules/6.1.0-rc1-188160-g3ee25d98b592/kernel/lib$ dmesg -HW &
[1] 3207
sergio@ubuntu:/lib/modules/6.1.0-rc1-188160-g3ee25d98b592/kernel/lib$ sudo modprobe longest_symbol_kunit
[feb27 21:49]     # Subtest: longest-symbol
[  +0,000002]     1..4
[  +0,000075]     ok 1 - test_longest_symbol
[  +0,028325]     ok 2 - test_longest_symbol_kallsyms
[  +0,000237]     ok 3 - test_longest_symbol_plus1
[  +0,012761]     ok 4 - test_longest_symbol_plus1_kallsyms
[  +0,000005] # longest-symbol: pass:4 fail:0 skip:0 total:4
[  +0,000001] # Totals: pass:4 fail:0 skip:0 total:4
[  +0,000001] ok 1 - longest-symbol

running kunit.py:

$ ./tools/testing/kunit/kunit.py run "longest-symbol"
...

[22:34:12] Starting KUnit Kernel (1/1)...
[22:34:12] ============================================================
[22:34:12] =============== longest-symbol (4 subtests) ================
[22:34:12] [PASSED] test_longest_symbol
[22:34:12] [PASSED] test_longest_symbol_kallsyms
[22:34:12] [PASSED] test_longest_symbol_plus1
[22:34:12] [PASSED] test_longest_symbol_plus1_kallsyms
[22:34:12] ================= [PASSED] longest-symbol ==================
[22:34:12] ============================================================
[22:34:12] Testing complete. Ran 4 tests: passed: 4
[22:34:12] Elapsed time: 38.391s total, 0.001s configuring, 38.275s building, 0.093s running

Copy link
Member

@ojeda ojeda Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have taken a look: the test passes unexpectedly because the symbol is simply LONGEST_SYM_PLUS1, not the expected one (very long).

This is due to two issues:

  • A typo in LONGEST_SYM_PLUS1 vs. LONGEST_SYM_NAME_PLUS1.

  • The preprocessor will concatenate before expanding. So you need one level of indirection more, e.g. with the __PASTE macro.

diff --git a/lib/longest_symbol_kunit.c b/lib/longest_symbol_kunit.c
index 575edca77864..10d2e9967647 100644
--- a/lib/longest_symbol_kunit.c
+++ b/lib/longest_symbol_kunit.c
@@ -18,13 +18,13 @@
 #define DDDDI(name) DDDI(n##name##name)
 #define DDDDDI(name) DDDDI(n##name##name)
 
-#define PLUS1(name) name##e
+#define PLUS1(name) __PASTE(name,e)
 
 /*Generate a symbol whose name length is 511 */
 #define LONGEST_SYM_NAME  DDDDDI(g1h2i3j4k5l6m7n)
 
 /*Generate a symbol whose name length is 512 */
-#define LONGEST_SYM_PLUS1 PLUS1(LONGEST_SYM_NAME)
+#define LONGEST_SYM_NAME_PLUS1 PLUS1(LONGEST_SYM_NAME)
 
 int noinline LONGEST_SYM_NAME(void)
 {

With this, the expected symbol is generated:

$ nm lib/longest_symbol_kunit.o | awk 'length($3) > 100 { print length($3), $3 }'
511 snnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7n
512 snnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nnng1h2i3j4k5l6m7ng1h2i3j4k5l6m7nng1h2i3j4k5l6m7ng1h2i3j4k5l6m7ne

You can play with it at https://godbolt.org/z/9ach7Wq63.

Hope that helps!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) Great! thanks a lot for having a look and pointing out the issue! I will fix this 👍

Copy link
Member

@ojeda ojeda Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My pleasure, and apologies for the delay!

@sergiocollado sergiocollado force-pushed the kunit_check_longest_symbol branch 6 times, most recently from 35ad1b1 to c52846c Compare February 28, 2023 21:54
@sergiocollado sergiocollado force-pushed the kunit_check_longest_symbol branch 3 times, most recently from 2347b85 to a42c5d3 Compare July 12, 2023 18:48
Comment on lines 57 to 64
if (register_kprobe(&kp) < 0) {
pr_info("test_longest_symbol_kallsyms: kprobe not registered\n");
kunit_warn(test, "test_longest_symbol kallsyms: kprobe not registered\n");
KUNIT_ASSERT_TRUE(test, register_kprobe(&kp) < 0);
KUNIT_FAIL(test, "test_longest_symbol kallsysms: kprobe not registered\n");
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these needed? Shouldn't be this just a single assertion? (Same for the case below).

Added a kunit test to check the longest symbol length

Signed-off-by: Sergio González Collado <[email protected]>
ojeda pushed a commit that referenced this pull request Jul 29, 2024
When using cachefiles, lockdep may emit something similar to the circular
locking dependency notice below.  The problem appears to stem from the
following:

 (1) Cachefiles manipulates xattrs on the files in its cache when called
     from ->writepages().

 (2) The setxattr() and removexattr() system call handlers get the name
     (and value) from userspace after taking the sb_writers lock, putting
     accesses of the vma->vm_lock and mm->mmap_lock inside of that.

 (3) The afs filesystem uses a per-inode lock to prevent multiple
     revalidation RPCs and in writeback vs truncate to prevent parallel
     operations from deadlocking against the server on one side and local
     page locks on the other.

Fix this by moving the getting of the name and value in {get,remove}xattr()
outside of the sb_writers lock.  This also has the minor benefits that we
don't need to reget these in the event of a retry and we never try to take
the sb_writers lock in the event we can't pull the name and value into the
kernel.

Alternative approaches that might fix this include moving the dispatch of a
write to the cache off to a workqueue or trying to do without the
validation lock in afs.  Note that this might also affect other filesystems
that use netfslib and/or cachefiles.

 ======================================================
 WARNING: possible circular locking dependency detected
 6.10.0-build2+ #956 Not tainted
 ------------------------------------------------------
 fsstress/6050 is trying to acquire lock:
 ffff888138fd82f0 (mapping.invalidate_lock#3){++++}-{3:3}, at: filemap_fault+0x26e/0x8b0

 but task is already holding lock:
 ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #4 (&vma->vm_lock->lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_write+0x3b/0x50
        vma_start_write+0x6b/0xa0
        vma_link+0xcc/0x140
        insert_vm_struct+0xb7/0xf0
        alloc_bprm+0x2c1/0x390
        kernel_execve+0x65/0x1a0
        call_usermodehelper_exec_async+0x14d/0x190
        ret_from_fork+0x24/0x40
        ret_from_fork_asm+0x1a/0x30

 -> #3 (&mm->mmap_lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        __might_fault+0x7c/0xb0
        strncpy_from_user+0x25/0x160
        removexattr+0x7f/0x100
        __do_sys_fremovexattr+0x7e/0xb0
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

 -> #2 (sb_writers#14){.+.+}-{0:0}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        percpu_down_read+0x3c/0x90
        vfs_iocb_iter_write+0xe9/0x1d0
        __cachefiles_write+0x367/0x430
        cachefiles_issue_write+0x299/0x2f0
        netfs_advance_write+0x117/0x140
        netfs_write_folio.isra.0+0x5ca/0x6e0
        netfs_writepages+0x230/0x2f0
        afs_writepages+0x4d/0x70
        do_writepages+0x1e8/0x3e0
        filemap_fdatawrite_wbc+0x84/0xa0
        __filemap_fdatawrite_range+0xa8/0xf0
        file_write_and_wait_range+0x59/0x90
        afs_release+0x10f/0x270
        __fput+0x25f/0x3d0
        __do_sys_close+0x43/0x70
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

 -> #1 (&vnode->validate_lock){++++}-{3:3}:
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_read+0x95/0x200
        afs_writepages+0x37/0x70
        do_writepages+0x1e8/0x3e0
        filemap_fdatawrite_wbc+0x84/0xa0
        filemap_invalidate_inode+0x167/0x1e0
        netfs_unbuffered_write_iter+0x1bd/0x2d0
        vfs_write+0x22e/0x320
        ksys_write+0xbc/0x130
        do_syscall_64+0x9f/0x100
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

 -> #0 (mapping.invalidate_lock#3){++++}-{3:3}:
        check_noncircular+0x119/0x160
        check_prev_add+0x195/0x430
        __lock_acquire+0xaf0/0xd80
        lock_acquire.part.0+0x103/0x280
        down_read+0x95/0x200
        filemap_fault+0x26e/0x8b0
        __do_fault+0x57/0xd0
        do_pte_missing+0x23b/0x320
        __handle_mm_fault+0x2d4/0x320
        handle_mm_fault+0x14f/0x260
        do_user_addr_fault+0x2a2/0x500
        exc_page_fault+0x71/0x90
        asm_exc_page_fault+0x22/0x30

 other info that might help us debug this:

 Chain exists of:
   mapping.invalidate_lock#3 --> &mm->mmap_lock --> &vma->vm_lock->lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   rlock(&vma->vm_lock->lock);
                                lock(&mm->mmap_lock);
                                lock(&vma->vm_lock->lock);
   rlock(mapping.invalidate_lock#3);

  *** DEADLOCK ***

 1 lock held by fsstress/6050:
  #0: ffff888113f26d18 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x165/0x250

 stack backtrace:
 CPU: 0 PID: 6050 Comm: fsstress Not tainted 6.10.0-build2+ #956
 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x80
  check_noncircular+0x119/0x160
  ? queued_spin_lock_slowpath+0x4be/0x510
  ? __pfx_check_noncircular+0x10/0x10
  ? __pfx_queued_spin_lock_slowpath+0x10/0x10
  ? mark_lock+0x47/0x160
  ? init_chain_block+0x9c/0xc0
  ? add_chain_block+0x84/0xf0
  check_prev_add+0x195/0x430
  __lock_acquire+0xaf0/0xd80
  ? __pfx___lock_acquire+0x10/0x10
  ? __lock_release.isra.0+0x13b/0x230
  lock_acquire.part.0+0x103/0x280
  ? filemap_fault+0x26e/0x8b0
  ? __pfx_lock_acquire.part.0+0x10/0x10
  ? rcu_is_watching+0x34/0x60
  ? lock_acquire+0xd7/0x120
  down_read+0x95/0x200
  ? filemap_fault+0x26e/0x8b0
  ? __pfx_down_read+0x10/0x10
  ? __filemap_get_folio+0x25/0x1a0
  filemap_fault+0x26e/0x8b0
  ? __pfx_filemap_fault+0x10/0x10
  ? find_held_lock+0x7c/0x90
  ? __pfx___lock_release.isra.0+0x10/0x10
  ? __pte_offset_map+0x99/0x110
  __do_fault+0x57/0xd0
  do_pte_missing+0x23b/0x320
  __handle_mm_fault+0x2d4/0x320
  ? __pfx___handle_mm_fault+0x10/0x10
  handle_mm_fault+0x14f/0x260
  do_user_addr_fault+0x2a2/0x500
  exc_page_fault+0x71/0x90
  asm_exc_page_fault+0x22/0x30

Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
cc: Alexander Viro <[email protected]>
cc: Christian Brauner <[email protected]>
cc: Jan Kara <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Gao Xiang <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
[brauner: fix minor issues]
Signed-off-by: Christian Brauner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants