FPU issues when installing inkscape package #22

glaubitz · 2017-01-02T14:56:25Z

When trying to install inkscape, we're running into an issue which seems FPU-related:

(sid-m68k-sbuild)root@jessie64:/# apt install inkscape
Reading package lists... Done
Building dependency tree       
Reading state information... Done
(...)
Setting up inkscape (0.91-12+b1) ...
Sorry: OverflowError: cannot convert float infinity to integer
dpkg: error processing package inkscape (--configure):
 subprocess installed post-installation script returned error exit status 101
Setting up libgtk2.0-bin (2.24.31-1) ...
(...)
Errors were encountered while processing:
 inkscape
W: No sandbox user '_apt' on the system, can not drop privileges
E: Sub-process /usr/bin/dpkg returned an error code (1)
(sid-m68k-sbuild)root@jessie64:/#

Looking at the inkscape postinst package, the only possible culprit can only be the pycompile command:

(sid-m68k-sbuild)root@jessie64:/# cat /var/lib/dpkg/info/inkscape.postinst 
#!/bin/sh
set -e

# Automatically added by dh_python2:
if which pycompile >/dev/null 2>&1; then
        pycompile -p inkscape /usr/share/inkscape
fi

# End automatically added section
# Automatically added by dh_installdeb
dpkg-maintscript-helper rm_conffile /etc/bash_completion.d/inkscape 0.91-6~ -- "$@"
# End automatically added section
(sid-m68k-sbuild)root@jessie64:/#

which is indeed correct:

(sid-m68k-sbuild)root@jessie64:/# pycompile -p inkscape /usr/share/inkscape
Sorry: OverflowError: cannot convert float infinity to integer
(sid-m68k-sbuild)root@jessie64:/#

The text was updated successfully, but these errors were encountered:

During block job completion, nothing is preventing block_job_defer_to_main_loop_bh from being called in a nested aio_poll(), which is a trouble, such as in this code path: qmp_block_commit commit_active_start bdrv_reopen bdrv_reopen_multiple bdrv_reopen_prepare bdrv_flush aio_poll aio_bh_poll aio_bh_call block_job_defer_to_main_loop_bh stream_complete bdrv_reopen block_job_defer_to_main_loop_bh is the last step of the stream job, which should have been "paused" by the bdrv_drained_begin/end in bdrv_reopen_multiple, but it is not done because it's in the form of a main loop BH. Similar to why block jobs should be paused between drained_begin and drained_end, BHs they schedule must be excluded as well. To achieve this, this patch forces draining the BH in BDRV_POLL_WHILE. As a side effect this fixes a hang in block_job_detach_aio_context during system_reset when a block job is ready: #0 0x0000555555aa79f3 in bdrv_drain_recurse #1 0x0000555555aa825d in bdrv_drained_begin #2 0x0000555555aa8449 in bdrv_drain #3 0x0000555555a9c356 in blk_drain #4 0x0000555555aa3cfd in mirror_drain #5 0x0000555555a66e11 in block_job_detach_aio_context #6 0x0000555555a62f4d in bdrv_detach_aio_context #7 0x0000555555a63116 in bdrv_set_aio_context #8 0x0000555555a9d326 in blk_set_aio_context #9 0x00005555557e38da in virtio_blk_data_plane_stop #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd #13 0x00005555559f6a18 in virtio_pci_reset #14 0x00005555559139a9 in qdev_reset_one #15 0x0000555555916738 in qbus_walk_children #16 0x0000555555913318 in qdev_walk_children #17 0x0000555555916738 in qbus_walk_children #18 0x00005555559168ca in qemu_devices_reset #19 0x000055555581fcbb in pc_machine_reset #20 0x00005555558a4d96 in qemu_system_reset #21 0x000055555577157a in main_loop_should_exit #22 0x000055555577157a in main_loop #23 0x000055555577157a in main The rationale is that the loop in block_job_detach_aio_context cannot make any progress in pausing/completing the job, because bs->in_flight is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop BH. With this patch, it does. Reported-by: Jeff Cody <[email protected]> Signed-off-by: Fam Zheng <[email protected]> Message-Id: <[email protected]> Reviewed-by: Jeff Cody <[email protected]> Tested-by: Jeff Cody <[email protected]> Signed-off-by: Fam Zheng <[email protected]>

Direct leak of 160 byte(s) in 4 object(s) allocated from: #0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8) #1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124 #2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16 #3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19 #4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36 #5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14 #6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5 #7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11 #8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11 #9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5 #10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13 #11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15 #12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12 #13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5 #14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9 #15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9 #16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13 #17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12 #18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182 #19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847 #20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9 #21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5 #22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11 #23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9 #24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5 #25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <[email protected]> Reviewed-by: Eric Blake <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Spotted thanks to ASAN: ==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350 READ of size 1 at 0x556715a1f120 thread T0 #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219 #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294 #2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635 #3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324 #4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418 #5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109 #6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613 #7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704 #8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104 #9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131 #10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839 #11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105 #12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323 #13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373 #14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124 #15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881 #16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172 #17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184 #18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440 #19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84 #20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182 #21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847 #22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950 #26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865 #27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009) #28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9) 0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104 0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600 This fix is based on Andreas Arnez <[email protected]> upstream commit: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b 2014-08-19 Andreas Arnez <[email protected]> * s390-dis.c (init_disasm): Simplify initialization of opc_index[]. This also fixes an access after the last element of s390_opcodes[]. Signed-off-by: Marc-André Lureau <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>

Eric Auger reported the problem days ago that OOB broke ARM when running with libvirt: http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html The problem was that the monitor dispatcher bottom half was bound to qemu_aio_context now, which could be polled unexpectedly in block code. We should keep the dispatchers run in iohandler_ctx just like what we did before the Out-Of-Band series (chardev uses qio, and qio binds everything with iohandler_ctx). If without this change, QMP dispatcher might be run even before reaching main loop in block IO path, for example, in a stack like (the ARM case, "cont" command handler run even during machine init phase): #0 qmp_cont () #1 0x00000000006bd210 in qmp_marshal_cont () #2 0x0000000000ac05c4 in do_qmp_dispatch () #3 0x0000000000ac07a0 in qmp_dispatch () #4 0x0000000000472d60 in monitor_qmp_dispatch_one () #5 0x000000000047302c in monitor_qmp_bh_dispatcher () #6 0x0000000000acf374 in aio_bh_call () #7 0x0000000000acf428 in aio_bh_poll () #8 0x0000000000ad5110 in aio_poll () #9 0x0000000000a08ab8 in blk_prw () #10 0x0000000000a091c4 in blk_pread () #11 0x0000000000734f94 in pflash_cfi01_realize () #12 0x000000000075a3a4 in device_set_realized () #13 0x00000000009a26cc in property_set_bool () #14 0x00000000009a0a40 in object_property_set () #15 0x00000000009a3a08 in object_property_set_qobject () #16 0x00000000009a0c8c in object_property_set_bool () #17 0x0000000000758f94 in qdev_init_nofail () #18 0x000000000058e190 in create_one_flash () #19 0x000000000058e2f4 in create_flash () #20 0x00000000005902f0 in machvirt_init () #21 0x00000000007635cc in machine_run_board_init () #22 0x00000000006b135c in main () Actually the problem is more severe than that. After we switched to the qemu AIO handler it means the monitor dispatcher code can even be called with nested aio_poll(), then it can be an explicit aio_poll() inside another main loop aio_poll() which could be racy too; breaking code like TPM and 9p that use nested event loops. Switch to use the iohandler_ctx for monitor dispatchers. My sincere thanks to Eric Auger who offered great help during both debugging and verifying the problem. The ARM test was carried out by applying this patch upon QEMU 2.12.0-rc0 and problem is gone after the patch. A quick test of mine shows that after this patch applied we can pass all raw iotests even with OOB on by default. CC: Eric Blake <[email protected]> CC: Markus Armbruster <[email protected]> CC: Stefan Hajnoczi <[email protected]> CC: Fam Zheng <[email protected]> Reported-by: Eric Auger <[email protected]> Tested-by: Eric Auger <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Reviewed-by: Eric Blake <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Eric Blake <[email protected]>

Even for block nodes with bs->drv == NULL, we can't just ignore a bdrv_set_aio_context() call. Leaving the node in its old context can mean that it's still in an iothread context in bdrv_close_all() during shutdown, resulting in an attempted unlock of the AioContext lock which we don't hold. This is an example stack trace of a related crash: #0 0x00007ffff59da57f in raise () at /lib64/libc.so.6 #1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6 #2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97 #4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507 #5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833 #6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990 #7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990 #8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51 #9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776 The reproducer I used is a qcow2 image on gluster volume, where the virtual disk size (4 GB) is larger than the gluster volume size (64M), so we can easily trigger an ENOSPC. This backend is assigned to a virtio-blk device using an iothread, and then from the guest a 'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop because of an I/O error. qemu_gluster_co_flush_to_disk() sets bs->drv = NULL on error, so when virtio-blk stops the dataplane, the block nodes stay in the iothread AioContext. A 'quit' monitor command issued from this paused state crashes the process. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: [email protected] Signed-off-by: Kevin Wolf <[email protected]> Reviewed-by: Eric Blake <[email protected]> Reviewed-by: Max Reitz <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>

baryluk · 2019-10-25T14:45:54Z

Can't reproduce using inkscape 0.92.4-4 , pycompile 1.0 from python2-minimal 2.7.17-1 and running on latest qemu-m68k and Linux kernel 5.3.7-1 from Debian ports.

Installation proceeds without error, and pycompile -p inkscape /usr/share/inkscape also succeeded if invoked manually.

glaubitz · 2019-10-25T15:06:14Z

Indeed, this seems to have been fixed. I have to check which were the affected packages which were installing inkscape for building. These might be blacklisted because of this bug.

All paths that lead to bdrv_backup_top_drop(), except for the call from backup_clean(), imply that the BDS AioContext has already been acquired, so doing it there too can potentially lead to QEMU hanging on AIO_WAIT_WHILE(). An easy way to trigger this situation is by issuing a two actions transaction, with a proper and a bogus blockdev-backup, so the second one will trigger a rollback. This will trigger a hang with an stack trace like this one: #0 0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39 #1 0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055e743386e09 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336 #3 0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true) at util/aio-posix.c:669 #4 0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878 #5 0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017 #6 0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262 #7 0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644 #8 0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273 #9 0x000055e74331461f in backup_job_create (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478 #10 0x000055e74315bc52 in do_backup_common (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0) at blockdev.c:3580 #11 0x000055e74315c37c in do_blockdev_backup (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0) at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018) at blockdev.c:1885 #13 0x000055e743160152 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340 #14 0x000055e743287ff5 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8) at qapi/qapi-commands-transaction.c:44 #15 0x000055e74333de6c in do_qmp_dispatch (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132 #16 0x000055e74333de6c in qmp_dispatch (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>) at monitor/qmp.c:145 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459 #22 0x000055e743385742 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 Fix this by not acquiring the AioContext there, and ensuring all paths leading to it have it already acquired (backup_clean()). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111 Signed-off-by: Sergio Lopez <[email protected]> Signed-off-by: Kevin Wolf <[email protected]>

Dirty map addition and removal functions are not acquiring to BDS AioContext, while they may call to code that expects it to be acquired. This may trigger a crash with a stack trace like this one: #0 0x00007f0ef146370f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f0ef144db25 in __GI_abort () at abort.c:79 #2 0x0000565022294dce in error_exit (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x00005650222950ba in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108 #4 0x0000565022290029 in aio_context_release (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526 #5 0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718) at block/dirty-bitmap.c:542 #6 0x000056502206ae53 in qmp_block_dirty_bitmap_add (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894 #7 0x000056502206ae53 in qmp_block_dirty_bitmap_add (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856 #8 0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798) at qapi/qapi-commands-block-core.c:651 #9 0x0000565022247e6c in do_qmp_dispatch (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132 #10 0x0000565022247e6c in qmp_dispatch (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 #11 0x0000565022166061 in monitor_qmp_dispatch (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145 #12 0x00005650221666fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234 #13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0) at util/async.c:117 #14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0) at util/async.c:117 #15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0) at util/aio-posix.c:459 #16 0x000056502228f742 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 #17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40) at gmain.c:3176 #18 0x00007f0ef5ce667d in g_main_context_dispatch (context=context@entry=0x56502449aa40) at gmain.c:3829 #19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219 #20 0x0000565022291d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 #22 0x00005650220743c1 in main_loop () at vl.c:1828 #23 0x0000565021f20a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add() and qmp_block_dirty_bitmap_add(). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175 Signed-off-by: Sergio Lopez <[email protected]> Signed-off-by: Kevin Wolf <[email protected]>

external_snapshot_abort() calls to bdrv_set_backing_hd(), which returns state->old_bs to the main AioContext, as it's intended to be used then the BDS is going to be released. As that's not the case when aborting an external snapshot, return it to the AioContext it was before the call. This issue can be triggered by issuing a transaction with two actions, a proper blockdev-snapshot-sync and a bogus one, so the second will trigger a transaction abort. This results in a crash with an stack trace like this one: #0 0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fa10489ccf5 in __GI_abort () at abort.c:79 #2 0x00007fa10489cbc9 in __assert_fail_base (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92 #3 0x00007fa1048aae96 in __GI___assert_fail (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101 #4 0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240 #5 0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196 #6 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731 #7 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717 #8 0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360 #9 0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44 #10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132 #11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 #12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120 #13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209 #14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117 #15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117 #16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459 #17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 #18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176 #19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829 #20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219 #21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 #23 0x0000557223d13201 in main_loop () at vl.c:1828 #24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036 Signed-off-by: Sergio Lopez <[email protected]> Signed-off-by: Kevin Wolf <[email protected]>

Coverity reports: *** CID 1419387: Memory - illegal accesses (OVERRUN) /hw/hppa/dino.c: 267 in dino_chip_read_with_attrs() 261 val = s->ilr & s->imr & s->icr; 262 break; 263 case DINO_TOC_ADDR: 264 val = s->toc_addr; 265 break; 266 case DINO_GMASK ... DINO_TLTIM: >>> CID 1419387: Memory - illegal accesses (OVERRUN) >>> Overrunning array "s->reg800" of 12 4-byte elements at element index 12 (byte offset 48) using index "(addr - 2048UL) / 4UL" (which evaluates to 12). 267 val = s->reg800[(addr - DINO_GMASK) / 4]; 268 if (addr == DINO_PAMR) { 269 val &= ~0x01; /* LSB is hardwired to 0 */ 270 } 271 if (addr == DINO_MLTIM) { 272 val &= ~0x07; /* 3 LSB are hardwired to 0 */ *** CID 1419393: Memory - corruptions (OVERRUN) /hw/hppa/dino.c: 363 in dino_chip_write_with_attrs() 357 /* These registers are read-only. */ 358 break; 359 360 case DINO_GMASK ... DINO_TLTIM: 361 i = (addr - DINO_GMASK) / 4; 362 val &= reg800_keep_bits[i]; >>> CID 1419393: Memory - corruptions (OVERRUN) >>> Overrunning array "s->reg800" of 12 4-byte elements at element index 12 (byte offset 48) using index "i" (which evaluates to 12). 363 s->reg800[i] = val; 364 break; 365 366 default: 367 /* Controlled by dino_chip_mem_valid above. */ 368 g_assert_not_reached(); *** CID 1419394: Memory - illegal accesses (OVERRUN) /hw/hppa/dino.c: 362 in dino_chip_write_with_attrs() 356 case DINO_IRR1: 357 /* These registers are read-only. */ 358 break; 359 360 case DINO_GMASK ... DINO_TLTIM: 361 i = (addr - DINO_GMASK) / 4; >>> CID 1419394: Memory - illegal accesses (OVERRUN) >>> Overrunning array "reg800_keep_bits" of 12 4-byte elements at element index 12 (byte offset 48) using index "i" (which evaluates to 12). 362 val &= reg800_keep_bits[i]; 363 s->reg800[i] = val; 364 break; 365 366 default: 367 /* Controlled by dino_chip_mem_valid above. */ Indeed the array should contain 13 entries, the undocumented register 0x82c is missing. Fix by increasing the array size and adding the missing register. CID 1419387 can be verified with: $ echo x 0xfff80830 | hppa-softmmu/qemu-system-hppa -S -monitor stdio -display none QEMU 4.2.50 monitor - type 'help' for more information (qemu) x 0xfff80830 qemu/hw/hppa/dino.c:267:15: runtime error: index 12 out of bounds for type 'uint32_t [12]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/phil/source/qemu/hw/hppa/dino.c:267:15 in 00000000fff80830: 0x00000000 and CID 1419393/1419394 with: $ echo writeb 0xfff80830 0x69 \ | hppa-softmmu/qemu-system-hppa -S -accel qtest -qtest stdio -display none [I 1581634452.654113] OPENED [R +4.105415] writeb 0xfff80830 0x69 qemu/hw/hppa/dino.c:362:16: runtime error: index 12 out of bounds for type 'const uint32_t [12]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior qemu/hw/hppa/dino.c:362:16 in ================================================================= ==29607==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5577dae32f30 at pc 0x5577d93f2463 bp 0x7ffd97ea11b0 sp 0x7ffd97ea11a8 READ of size 4 at 0x5577dae32f30 thread T0 #0 0x5577d93f2462 in dino_chip_write_with_attrs qemu/hw/hppa/dino.c:362:16 #1 0x5577d9025664 in memory_region_write_with_attrs_accessor qemu/memory.c:503:12 #2 0x5577d9024920 in access_with_adjusted_size qemu/memory.c:539:18 #3 0x5577d9023608 in memory_region_dispatch_write qemu/memory.c:1482:13 #4 0x5577d8e3177a in flatview_write_continue qemu/exec.c:3166:23 #5 0x5577d8e20357 in flatview_write qemu/exec.c:3206:14 #6 0x5577d8e1fef4 in address_space_write qemu/exec.c:3296:18 #7 0x5577d8e20693 in address_space_rw qemu/exec.c:3306:16 #8 0x5577d9011595 in qtest_process_command qemu/qtest.c:432:13 #9 0x5577d900d19f in qtest_process_inbuf qemu/qtest.c:705:9 #10 0x5577d900ca22 in qtest_read qemu/qtest.c:717:5 #11 0x5577da8c4254 in qemu_chr_be_write_impl qemu/chardev/char.c:183:9 #12 0x5577da8c430c in qemu_chr_be_write qemu/chardev/char.c:195:9 #13 0x5577da8cf587 in fd_chr_read qemu/chardev/char-fd.c:68:9 #14 0x5577da9836cd in qio_channel_fd_source_dispatch qemu/io/channel-watch.c:84:12 #15 0x7faf44509ecc in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4fecc) #16 0x5577dab75f96 in glib_pollfds_poll qemu/util/main-loop.c:219:9 #17 0x5577dab74797 in os_host_main_loop_wait qemu/util/main-loop.c:242:5 #18 0x5577dab7435a in main_loop_wait qemu/util/main-loop.c:518:11 #19 0x5577d9514eb3 in main_loop qemu/vl.c:1682:9 #20 0x5577d950699d in main qemu/vl.c:4450:5 #21 0x7faf41a87f42 in __libc_start_main (/lib64/libc.so.6+0x23f42) #22 0x5577d8cd4d4d in _start (qemu/build/sanitizer/hppa-softmmu/qemu-system-hppa+0x1256d4d) 0x5577dae32f30 is located 0 bytes to the right of global variable 'reg800_keep_bits' defined in 'qemu/hw/hppa/dino.c:87:23' (0x5577dae32f00) of size 48 SUMMARY: AddressSanitizer: global-buffer-overflow qemu/hw/hppa/dino.c:362:16 in dino_chip_write_with_attrs Shadow bytes around the buggy address: 0x0aaf7b5be590: 00 f9 f9 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9 0x0aaf7b5be5a0: 07 f9 f9 f9 f9 f9 f9 f9 07 f9 f9 f9 f9 f9 f9 f9 0x0aaf7b5be5b0: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0aaf7b5be5c0: 00 00 00 02 f9 f9 f9 f9 00 00 00 00 00 00 00 00 0x0aaf7b5be5d0: 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9 f9 =>0x0aaf7b5be5e0: 00 00 00 00 00 00[f9]f9 f9 f9 f9 f9 00 00 00 00 0x0aaf7b5be5f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0aaf7b5be600: 00 00 01 f9 f9 f9 f9 f9 00 00 00 00 07 f9 f9 f9 0x0aaf7b5be610: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 0x0aaf7b5be620: 00 00 00 05 f9 f9 f9 f9 00 00 00 00 07 f9 f9 f9 0x0aaf7b5be630: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 07 f9 f9 f9 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==29607==ABORTING Fixes: Covertiy CID 1419387 / 1419393 / 1419394 (commit 1809259) Acked-by: Helge Deller <[email protected]> Signed-off-by: Philippe Mathieu-Daudé <[email protected]> Message-Id: <[email protected]> Signed-off-by: Richard Henderson <[email protected]>

There is a use-after-free possible: bdrv_unref_child() leaves bs->backing freed but not NULL. bdrv_attach_child may produce nested polling loop due to drain, than access of freed pointer is possible. I've produced the following crash on 30 iotest with modified code. It does not reproduce on master, but still seems possible: #0 __strcmp_avx2 () at /lib64/libc.so.6 #1 bdrv_backing_overridden (bs=0x55c9d3cc2060) at block.c:6350 #2 bdrv_refresh_filename (bs=0x55c9d3cc2060) at block.c:6404 #3 bdrv_backing_attach (c=0x55c9d48e5520) at block.c:1063 #4 bdrv_replace_child_noperm (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2290 #5 bdrv_replace_child (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2320 #6 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3c5a3d0, errp=0x7ffd117108e0) at block.c:2424 #7 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3c5a3d0, child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd117108e0) at block.c:5876 #8 in bdrv_set_backing_hd (bs=bs@entry=0x55c9d3c5a3d0, backing_hd=backing_hd@entry=0x55c9d3cc2060, errp=errp@entry=0x7ffd117108e0) at block.c:2576 #9 stream_prepare (job=0x55c9d49d84a0) at block/stream.c:150 #10 job_prepare (job=0x55c9d49d84a0) at job.c:761 #11 job_txn_apply (txn=<optimized out>, fn=<optimized out>) at job.c:145 #12 job_do_finalize (job=0x55c9d49d84a0) at job.c:778 #13 job_completed_txn_success (job=0x55c9d49d84a0) at job.c:832 #14 job_completed (job=0x55c9d49d84a0) at job.c:845 #15 job_completed (job=0x55c9d49d84a0) at job.c:836 #16 job_exit (opaque=0x55c9d49d84a0) at job.c:864 #17 aio_bh_call (bh=0x55c9d471a160) at util/async.c:117 #18 aio_bh_poll (ctx=ctx@entry=0x55c9d3c46720) at util/async.c:117 #19 aio_poll (ctx=ctx@entry=0x55c9d3c46720, blocking=blocking@entry=true) at util/aio-posix.c:728 #20 bdrv_parent_drained_begin_single (poll=true, c=0x55c9d3d558f0) at block/io.c:121 #21 bdrv_parent_drained_begin_single (c=c@entry=0x55c9d3d558f0, poll=poll@entry=true) at block/io.c:114 #22 bdrv_replace_child_noperm (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2258 #23 bdrv_replace_child (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2320 #24 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3cc2060, errp=0x7ffd11710c60) at block.c:2424 #25 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3cc2060, child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd11710c60) at block.c:5876 #26 bdrv_set_backing_hd (bs=bs@entry=0x55c9d3cc2060, backing_hd=backing_hd@entry=0x55c9d3d27300, errp=errp@entry=0x7ffd11710c60) at block.c:2576 #27 stream_prepare (job=0x55c9d495ead0) at block/stream.c:150 ... Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]> Message-Id: <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: John Snow <[email protected]> Signed-off-by: Max Reitz <[email protected]>

when s->inflight is freed, vhost_dev_free_inflight may try to access s->inflight->addr, it will retrigger the following issue. ==7309==ERROR: AddressSanitizer: heap-use-after-free on address 0x604001020d18 at pc 0x555555ce948a bp 0x7fffffffb170 sp 0x7fffffffb160 READ of size 8 at 0x604001020d18 thread T0 #0 0x555555ce9489 in vhost_dev_free_inflight /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 #1 0x555555cd86eb in virtio_reset /root/smartx/qemu-el7/qemu-test/hw/virtio/virtio.c:1214 #2 0x5555560d3eff in virtio_pci_reset hw/virtio/virtio-pci.c:1859 #3 0x555555f2ac53 in device_set_realized hw/core/qdev.c:893 #4 0x5555561d572c in property_set_bool qom/object.c:1925 #5 0x5555561de8de in object_property_set_qobject qom/qom-qobject.c:27 #6 0x5555561d99f4 in object_property_set_bool qom/object.c:1188 #7 0x555555e50ae7 in qdev_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:626 #8 0x555555e51213 in qmp_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:806 #9 0x555555e8ff40 in hmp_device_add /root/smartx/qemu-el7/qemu-test/hmp.c:1951 #10 0x555555be889a in handle_hmp_command /root/smartx/qemu-el7/qemu-test/monitor.c:3404 #11 0x555555beac8b in monitor_command_cb /root/smartx/qemu-el7/qemu-test/monitor.c:4296 #12 0x555556433eb7 in readline_handle_byte util/readline.c:393 #13 0x555555be89ec in monitor_read /root/smartx/qemu-el7/qemu-test/monitor.c:4279 #14 0x5555563285cc in tcp_chr_read chardev/char-socket.c:470 #15 0x7ffff670b968 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a968) #16 0x55555640727c in glib_pollfds_poll util/main-loop.c:215 #17 0x55555640727c in os_host_main_loop_wait util/main-loop.c:238 #18 0x55555640727c in main_loop_wait util/main-loop.c:497 #19 0x555555b2d0bf in main_loop /root/smartx/qemu-el7/qemu-test/vl.c:2013 #20 0x555555b2d0bf in main /root/smartx/qemu-el7/qemu-test/vl.c:4776 #21 0x7fffdd2eb444 in __libc_start_main (/lib64/libc.so.6+0x22444) #22 0x555555b3767a (/root/smartx/qemu-el7/qemu-test/x86_64-softmmu/qemu-system-x86_64+0x5e367a) 0x604001020d18 is located 8 bytes inside of 40-byte region [0x604001020d10,0x604001020d38) freed by thread T0 here: #0 0x7ffff6f00508 in __interceptor_free (/lib64/libasan.so.4+0xde508) #1 0x7ffff671107d in g_free (/lib64/libglib-2.0.so.0+0x5007d) previously allocated by thread T0 here: #0 0x7ffff6f00a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88) #1 0x7ffff6710fc5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x4ffc5) SUMMARY: AddressSanitizer: heap-use-after-free /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 in vhost_dev_free_inflight Shadow bytes around the buggy address: 0x0c08801fc150: fa fa 00 00 00 00 04 fa fa fa fd fd fd fd fd fa 0x0c08801fc160: fa fa fd fd fd fd fd fd fa fa 00 00 00 00 04 fa 0x0c08801fc170: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 04 fa 0x0c08801fc180: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 00 01 0x0c08801fc190: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 04 fa =>0x0c08801fc1a0: fa fa fd[fd]fd fd fd fa fa fa fd fd fd fd fd fa 0x0c08801fc1b0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c08801fc1c0: fa fa 00 00 00 00 00 fa fa fa fd fd fd fd fd fd 0x0c08801fc1d0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa 0x0c08801fc1e0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd 0x0c08801fc1f0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==7309==ABORTING Signed-off-by: Li Feng <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Raphael Norwitz <[email protected]>

In host_memory_backend_get_host_nodes, we build host_nodes list and output it to v (a StringOutputVisitor) but forget to free the list. This fixes the memory leak. The memory leak stack: Direct leak of 32 byte(s) in 2 object(s) allocated from: #0 0xfffda30b3393 in __interceptor_calloc (/usr/lib64/libasan.so.4+0xd3393) #1 0xfffda1d28b9b in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x58b9b) #2 0xaaab05ca6e43 in host_memory_backend_get_host_nodes backends/hostmem.c:94 #3 0xaaab061ddf83 in object_property_get_uint16List qom/object.c:1478 #4 0xaaab05866513 in query_memdev hw/core/machine-qmp-cmds.c:312 #5 0xaaab061d980b in do_object_child_foreach qom/object.c:1001 #6 0xaaab0586779b in qmp_query_memdev hw/core/machine-qmp-cmds.c:328 #7 0xaaab0615ed3f in qmp_marshal_query_memdev qapi/qapi-commands-machine.c:327 #8 0xaaab0632d647 in do_qmp_dispatch qapi/qmp-dispatch.c:147 #9 0xaaab0632d647 in qmp_dispatch qapi/qmp-dispatch.c:190 #10 0xaaab0610f74b in monitor_qmp_dispatch monitor/qmp.c:120 #11 0xaaab0611074b in monitor_qmp_bh_dispatcher monitor/qmp.c:209 #12 0xaaab063caefb in aio_bh_poll util/async.c:117 #13 0xaaab063d30fb in aio_dispatch util/aio-posix.c:459 #14 0xaaab063cac8f in aio_ctx_dispatch util/async.c:268 #15 0xfffda1d22a6b in g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0+0x52a6b) #16 0xaaab063d0e97 in glib_pollfds_poll util/main-loop.c:218 #17 0xaaab063d0e97 in os_host_main_loop_wait util/main-loop.c:241 #18 0xaaab063d0e97 in main_loop_wait util/main-loop.c:517 #19 0xaaab05c8bfa7 in main_loop /root/rpmbuild/BUILD/qemu-4.1.0/vl.c:1791 #20 0xaaab05713bc3 in main /root/rpmbuild/BUILD/qemu-4.1.0/vl.c:4473 #21 0xfffda0a83ebf in __libc_start_main (/usr/lib64/libc.so.6+0x23ebf) #22 0xaaab0571ed5f (aarch64-softmmu/qemu-system-aarch64+0x88ed5f) SUMMARY: AddressSanitizer: 32 byte(s) leaked in 2 allocation(s). Fixes: 4cf1b76 (hostmem: add properties for NUMA memory policy) Reported-by: Euler Robot <[email protected]> Signed-off-by: Keqian Zhu <[email protected]> Tested-by: Chen Qun <[email protected]> Reviewed-by: Igor Mammedov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Eduardo Habkost <[email protected]>

When the reconnect in NBD client is in progress, the iochannel used for NBD connection doesn't exist. Therefore an attempt to detach it from the aio_context of the parent BlockDriverState results in a NULL pointer dereference. The problem is triggerable, in particular, when an outgoing migration is about to finish, and stopping the dataplane tries to move the BlockDriverState from the iothread aio_context to the main loop. If the NBD connection is lost before this point, and the NBD client has entered the reconnect procedure, QEMU crashes: #0 qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109 #1 0x00005618034b1b68 in nbd_client_attach_aio_context_bh ( opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164 #2 0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55 #3 0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136 #4 aio_bh_poll (ctx=ctx@entry=0x5618056c7580) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164 #5 0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580, blocking=blocking@entry=true) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650 #6 0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580, cb=<optimized out>, opaque=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71 #7 0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580, bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172 #8 bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00, new_context=new_context@entry=0x5618056c7580, ignore=ignore@entry=0x7f60e1e63780) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237 #9 0x000056180345c969 in bdrv_child_try_set_aio_context ( bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580, ignore_child=<optimized out>, errp=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332 #10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0, new_context=0x5618056c7580, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989 #11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010 #12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90, running=0, state=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220 #15 0x0000561803208bfd in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275 #16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032 #17 0x00005618033e3765 in migration_completion (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914 #18 migration_iteration_run (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275 #19 migration_thread (opaque=opaque@entry=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439 #20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519 #21 0x00007f61085d06ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6 #23 0x0000000000000000 in ?? () Fix it by checking that the iochannel is non-null before trying to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan <[email protected]> Reviewed-by: Vladimir Sementsov-Ogievskiy <[email protected]> Message-Id: <[email protected]> Signed-off-by: Eric Blake <[email protected]>

Fixes this tsan crash, easy to reproduce with any large enough program: $ tests/unit/test-qht 1..2 ThreadSanitizer: CHECK failed: sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40) (tid=1821568) #0 __tsan::CheckUnwind() ../../../../src/libsanitizer/tsan/tsan_rtl.cpp:353 (libtsan.so.2+0x90034) #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 (libtsan.so.2+0xca555) #2 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::addLock(unsigned long, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:67 (libtsan.so.2+0xb3616) #3 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::addLock(unsigned long, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:59 (libtsan.so.2+0xb3616) #4 __sanitizer::DeadlockDetector<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::onLockAfter(__sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >*, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:216 (libtsan.so.2+0xb3616) #5 __sanitizer::DD::MutexAfterLock(__sanitizer::DDCallback*, __sanitizer::DDMutex*, bool, bool) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector1.cpp:169 (libtsan.so.2+0xb3616) #6 __tsan::MutexPostLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int, int) ../../../../src/libsanitizer/tsan/tsan_rtl_mutex.cpp:200 (libtsan.so.2+0xa3382) #7 __tsan_mutex_post_lock ../../../../src/libsanitizer/tsan/tsan_interface_ann.cpp:384 (libtsan.so.2+0x76bc3) #8 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:259 (test-qht+0x44a97) #9 qht_map_lock_buckets ../util/qht.c:253 (test-qht+0x44a97) #10 do_qht_iter ../util/qht.c:809 (test-qht+0x45f33) #11 qht_iter ../util/qht.c:821 (test-qht+0x45f33) #12 iter_check ../tests/unit/test-qht.c:121 (test-qht+0xe473) #13 qht_do_test ../tests/unit/test-qht.c:202 (test-qht+0xe473) #14 qht_test ../tests/unit/test-qht.c:240 (test-qht+0xe7c1) #15 test_default ../tests/unit/test-qht.c:246 (test-qht+0xe828) #16 <null> <null> (libglib-2.0.so.0+0x7daed) #17 <null> <null> (libglib-2.0.so.0+0x7d80a) #18 <null> <null> (libglib-2.0.so.0+0x7d80a) #19 g_test_run_suite <null> (libglib-2.0.so.0+0x7dfe9) #20 g_test_run <null> (libglib-2.0.so.0+0x7e055) #21 main ../tests/unit/test-qht.c:259 (test-qht+0xd2c6) #22 __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 (libc.so.6+0x29d8f) #23 __libc_start_main_impl ../csu/libc-start.c:392 (libc.so.6+0x29e3f) #24 _start <null> (test-qht+0xdb44) Signed-off-by: Emilio Cota <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Alex Bennée <[email protected]> Message-Id: <[email protected]>

After live migration with virtio block device, qemu crash at: #0 0x000055914f46f795 in object_dynamic_cast_assert (obj=0x559151b7b090, typename=0x55914f80fbc4 "qio-channel", file=0x55914f80fb90 "/images/testvfe/sw/qemu.gerrit/include/io/channel.h", line=30, func=0x55914f80fcb8 <__func__.17257> "QIO_CHANNEL") at ../qom/object.c:872 #1 0x000055914f480d68 in QIO_CHANNEL (obj=0x559151b7b090) at /images/testvfe/sw/qemu.gerrit/include/io/channel.h:29 #2 0x000055914f4812f8 in qio_net_listener_set_client_func_full (listener=0x559151b7a720, func=0x55914f580b97 <tcp_chr_accept>, data=0x5591519f4ea0, notify=0x0, context=0x0) at ../io/net-listener.c:166 #3 0x000055914f580059 in tcp_chr_update_read_handler (chr=0x5591519f4ea0) at ../chardev/char-socket.c:637 #4 0x000055914f583dca in qemu_chr_be_update_read_handlers (s=0x5591519f4ea0, context=0x0) at ../chardev/char.c:226 #5 0x000055914f57b7c9 in qemu_chr_fe_set_handlers_full (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false, sync_state=true) at ../chardev/char-fe.c:279 #6 0x000055914f57b86d in qemu_chr_fe_set_handlers (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false) at ../chardev/char-fe.c:304 #7 0x000055914f378caf in vhost_user_async_close (d=0x559152bf21a0, chardev=0x559152bf23a0, vhost=0x559152bf2420, cb=0x55914f2fb8c1 <vhost_user_blk_disconnect>) at ../hw/virtio/vhost-user.c:2725 #8 0x000055914f2fba40 in vhost_user_blk_event (opaque=0x559152bf21a0, event=CHR_EVENT_CLOSED) at ../hw/block/vhost-user-blk.c:395 #9 0x000055914f58388c in chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:61 #10 0x000055914f583905 in qemu_chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:81 #11 0x000055914f581275 in char_socket_finalize (obj=0x5591519f4ea0) at ../chardev/char-socket.c:1083 #12 0x000055914f46f073 in object_deinit (obj=0x5591519f4ea0, type=0x5591519055c0) at ../qom/object.c:680 #13 0x000055914f46f0e5 in object_finalize (data=0x5591519f4ea0) at ../qom/object.c:694 #14 0x000055914f46ff06 in object_unref (objptr=0x5591519f4ea0) at ../qom/object.c:1202 #15 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b76c50, name=0x559151b7b250 "char3", opaque=0x5591519f4ea0) at ../qom/object.c:1747 #16 0x000055914f46ee86 in object_property_del_all (obj=0x559151b76c50) at ../qom/object.c:632 #17 0x000055914f46f0d2 in object_finalize (data=0x559151b76c50) at ../qom/object.c:693 #18 0x000055914f46ff06 in object_unref (objptr=0x559151b76c50) at ../qom/object.c:1202 #19 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b6b560, name=0x559151b76630 "chardevs", opaque=0x559151b76c50) at ../qom/object.c:1747 #20 0x000055914f46ef67 in object_property_del_child (obj=0x559151b6b560, child=0x559151b76c50) at ../qom/object.c:654 #21 0x000055914f46f042 in object_unparent (obj=0x559151b76c50) at ../qom/object.c:673 #22 0x000055914f58632a in qemu_chr_cleanup () at ../chardev/char.c:1189 #23 0x000055914f16c66c in qemu_cleanup () at ../softmmu/runstate.c:830 #24 0x000055914eee7b9e in qemu_default_main () at ../softmmu/main.c:38 #25 0x000055914eee7bcc in main (argc=86, argv=0x7ffc97cb8d88) at ../softmmu/main.c:48 In char_socket_finalize after s->listener freed, event callback function vhost_user_blk_event will be called to handle CHR_EVENT_CLOSED. vhost_user_blk_event is calling qio_net_listener_set_client_func_full which is still using s->listener. Setting s->listener = NULL after object_unref(OBJECT(s->listener)) can solve this issue. Signed-off-by: Yajun Wu <[email protected]> Acked-by: Jiri Pirko <[email protected]> Message-Id: <[email protected]> Reviewed-by: Marc-André Lureau <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FPU issues when installing inkscape package #22

FPU issues when installing inkscape package #22

glaubitz commented Jan 2, 2017

baryluk commented Oct 25, 2019

glaubitz commented Oct 25, 2019

FPU issues when installing inkscape package #22

FPU issues when installing inkscape package #22

Comments

glaubitz commented Jan 2, 2017

baryluk commented Oct 25, 2019

glaubitz commented Oct 25, 2019