forked from freebsd/freebsd-src
Unnecessary bootrom checking when restoring or starting a migrated VM #2
Michael-B8 added the bug ("Something isn't working") and question ("Further information is requested") labels on Mar 10, 2022.
ionut-mihalache pushed a commit that referenced this issue on Jul 14, 2022:
This LOR happens when reading from a file-backed MD device:

lock order reversal:
 1st 0xfffffe00431eaac0 pbufwait (pbufwait, lockmgr) @ /cobra/src/sys/vm/vm_pager.c:471
 2nd 0xfffff80003f17930 ufs (ufs, lockmgr) @ /cobra/src/sys/dev/md/md.c:977
lock order pbufwait -> ufs attempted at:
#0 0xffffffff80c78ead at witness_checkorder+0xbdd
#1 0xffffffff80bd6a52 at lockmgr_lock_flags+0x182
#2 0xffffffff80f52d5c at ffs_lock+0x6c
#3 0xffffffff80d0f3f4 at _vn_lock+0x54
#4 0xffffffff80708629 at mdstart_vnode+0x499
#5 0xffffffff807060ec at md_kthread+0x20c
#6 0xffffffff80bbfcd0 at fork_exit+0x80
#7 0xffffffff810b809e at fork_trampoline+0xe

This LOR was previously blessed by witness before commit 531f8cf ("Use dedicated lock name for pbufs"). Instead of blessing ufs and pbufwait, use LK_NOWAIT to prevent recording the lock order. LK_NOWAIT will be a nop here as the lock is dropped in pbuf_dtor(). This takes the same approach as 5875b94 ("buf_alloc(): lock the buffer with LK_NOWAIT").

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D34183
Michael-B8 pushed a commit that referenced this issue on Apr 25, 2023:
Under certain loads, the following panic is hit:

panic: page fault
KDB: stack backtrace:
#0 0xffffffff805db025 at kdb_backtrace+0x65
#1 0xffffffff8058e86f at vpanic+0x17f
#2 0xffffffff8058e6e3 at panic+0x43
#3 0xffffffff808adc15 at trap_fatal+0x385
#4 0xffffffff808adc6f at trap_pfault+0x4f
#5 0xffffffff80886da8 at calltrap+0x8
#6 0xffffffff80669186 at vgonel+0x186
#7 0xffffffff80669841 at vgone+0x31
#8 0xffffffff8065806d at vfs_hash_insert+0x26d
#9 0xffffffff81a39069 at sfs_vgetx+0x149
#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#11 0xffffffff8065a28c at lookup+0x45c
#12 0xffffffff806594b9 at namei+0x259
#13 0xffffffff80676a33 at kern_statat+0xf3
#14 0xffffffff8067712f at sys_fstatat+0x2f
#15 0xffffffff808ae50c at amd64_syscall+0x10c
#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following panic:

panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
cpuid = 17
KDB: stack backtrace:
#0 0xffffffff805e29c5 at kdb_backtrace+0x65
#1 0xffffffff8059620f at vpanic+0x17f
#2 0xffffffff81a27f4a at spl_panic+0x3a
#3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
#4 0xffffffff8066fdee at vinactivef+0xde
#5 0xffffffff80670b8a at vgonel+0x1ea
#6 0xffffffff806711e1 at vgone+0x31
#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
#8 0xffffffff81a39069 at sfs_vgetx+0x149
#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#10 0xffffffff80661c2c at lookup+0x45c
#11 0xffffffff80660e59 at namei+0x259
#12 0xffffffff8067e3d3 at kern_statat+0xf3
#13 0xffffffff8067eacf at sys_fstatat+0x2f
#14 0xffffffff808b5ecc at amd64_syscall+0x10c
#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <[email protected]>
Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Rob Wing <[email protected]>
Co-authored-by: Rob Wing <[email protected]>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes #14501
Michael-B8 pushed a commit that referenced this issue on Aug 8, 2023:
Avoid locking issues when if_allmulti() calls the driver's if_ioctl, because that may acquire sleepable locks (while we hold a non-sleepable rwlock). Fortunately there's no pressing need to hold the mroute lock while we do this, so we can postpone the call slightly, until after we've released the lock.

This avoids the following WITNESS warning (with iflib drivers):

lock order reversal: (sleepable after non-sleepable)
 1st 0xffffffff82f64960 IPv4 multicast forwarding (IPv4 multicast forwarding, rw) @ /usr/src/sys/netinet/ip_mroute.c:1050
 2nd 0xfffff8000480f180 iflib ctx lock (iflib ctx lock, sx) @ /usr/src/sys/net/iflib.c:4525
lock order IPv4 multicast forwarding -> iflib ctx lock attempted at:
#0 0xffffffff80bbd6ce at witness_checkorder+0xbbe
#1 0xffffffff80b56d10 at _sx_xlock+0x60
#2 0xffffffff80c9ce5c at iflib_if_ioctl+0x2dc
#3 0xffffffff80c7c395 at if_setflag+0xe5
#4 0xffffffff82f60a0e at del_vif_locked+0x9e
#5 0xffffffff82f5f0d5 at X_ip_mrouter_set+0x265
#6 0xffffffff80bfd402 at sosetopt+0xc2
#7 0xffffffff80c02105 at kern_setsockopt+0xa5
#8 0xffffffff80c02054 at sys_setsockopt+0x24
#9 0xffffffff81046be8 at amd64_syscall+0x138
#10 0xffffffff8101930b at fast_syscall_common+0xf8

See also: https://redmine.pfsense.org/issues/12079
Reviewed by: mjg
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41209
Whenever a new bhyve VM is started as a migration destination or restored from checkpoint files (without running bhyveload or having a bootrom set), the first attempt fails and the command must be run twice. This is because of the bootrom and VM-creation checks in the do_open() function in bhyverun.c. Are these checks necessary, given that the guest memory is either migrated or restored to a running state?