Bug 38672
| Summary: | KVM guest boot crashed | | |
|---|---|---|---|
| Product: | Virtualization | Reporter: | Steve (stefan.bosak) |
| Component: | kvm | Assignee: | virtualization_kvm |
| Status: | CLOSED INVALID | | |
| Severity: | high | CC: | avi, florian, maciej.rutecki, rjw |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.0.0-rc5+ | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 36912 | | |
Description
Steve
2011-07-02 06:56:16 UTC
Reply-To: jan.kiszka@web.de

On 2011-07-02 08:56, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=38672
>
> Summary: KVM guest boot crashed
> Product: Virtualization
> Version: unspecified
> Kernel Version: 3.0.0-rc5+
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: kvm
> AssignedTo: virtualization_kvm@kernel-bugs.osdl.org
> ReportedBy: stefan.bosak@gmail.com
> Regression: Yes
>
>
> Windows Server 2008 R2 KVM guest crashed during boot process.
> This situation also occurs on other Linux-based guests.

What other Linux guests precisely? None of the Windows and Linux guests I
have around expose this problem.

What is your qemu command line?

>
> Bug is in: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
> not in: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git.
>
> A qemu-kvm checkout from 9 days ago works well.
>
> Problem could be in this:
>
> author Jan Kiszka <jan.kiszka@siemens.com>
> Mon, 27 Jun 2011 10:23:35 +0000 (12:23 +0200)
> committer Avi Kivity <avi@redhat.com>
> Tue, 28 Jun 2011 08:20:08 +0000 (11:20 +0300)
> commit bcd4f22796ebda2934a980060ea704ebedb46173
> tree f16fe58d2d4120c7b94f23ddef3afb61beb30dfc
> parent 59539c913383fdd3350681301b44f02fa7ee2757
>
>
> author Jan Kiszka <jan.kiszka@siemens.com>
> Mon, 27 Jun 2011 10:22:28 +0000 (12:22 +0200)
> committer Avi Kivity <avi@redhat.com>
> Tue, 28 Jun 2011 08:18:58 +0000 (11:18 +0300)
> commit 59539c913383fdd3350681301b44f02fa7ee2757
> tree bfdf23a13004d08d04589d02d3c4c754da3dd076
> parent b7496707af10ce2827d0803c9e46ca8ddc543716
>

Can you bisect which change precisely introduced the regression?
Thanks,
Jan

(In reply to comment #1)
> Reply-To: jan.kiszka@web.de
>
> On 2011-07-02 08:56, bugzilla-daemon@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=38672
> >
> > Summary: KVM guest boot crashed
> > Product: Virtualization
> > Version: unspecified
> > Kernel Version: 3.0.0-rc5+
> > Platform: All
> > OS/Version: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: high
> > Priority: P1
> > Component: kvm
> > AssignedTo: virtualization_kvm@kernel-bugs.osdl.org
> > ReportedBy: stefan.bosak@gmail.com
> > Regression: Yes
> >
> >
> > Windows Server 2008 R2 KVM guest crashed during boot process.
> > This situation also occurs on other Linux-based guests.
>
> What other Linux guests precisely? None of the Windows and Linux guests I
> have around expose this problem.

I have several guest types on the same server:

MS Windows Server 2008 R2 (latest updates)
Ubuntu 11.04 (2.6.38-10-virtual)
Debian wheezy/sid (2.6.38-rc4-git4-vs2.3.0.37-rc4)
Gentoo (2.6.38-gentoo-r7)

>
> What is your qemu command line?

Example for one guest (Gentoo Linux):

/usr/bin/qemu-system-x86_64 --enable-kvm -name vps-25-gentoo \
    -chroot /vservers1 -runas kvm -pidfile /var/run/kvm/vps-25-gentoo.pid \
    -vnc a.b.c.d:0 -vga std --full-screen -smp 2 -m 12G -cpu host \
    -mem-path /hugepages -mem-prealloc -kvm-shadow-memory 12G \
    -daemonize -tdf -localtime -balloon virtio \
    -net nic,model=virtio,vlan=0,macaddr=XX:XX:XX:XX:XX:XX \
    -net tap,vhost=on,vlan=0,ifname=qtap0,script=/etc/kvm/kvm-ifup,downscript=/etc/kvm/kvm-ifdown \
    -drive aio=native,index=0,media=disk,cache=writeback,if=virtio,boot=on,file=/vservers1/vps-25-gentoo.img \
    -boot c

> >
> > Bug is in: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
> > not in:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git.
> >
> > A qemu-kvm checkout from 9 days ago works well.
> >
> > Problem could be in this:
> >
> > author Jan Kiszka <jan.kiszka@siemens.com>
> > Mon, 27 Jun 2011 10:23:35 +0000 (12:23 +0200)
> > committer Avi Kivity <avi@redhat.com>
> > Tue, 28 Jun 2011 08:20:08 +0000 (11:20 +0300)
> > commit bcd4f22796ebda2934a980060ea704ebedb46173
> > tree f16fe58d2d4120c7b94f23ddef3afb61beb30dfc
> > parent 59539c913383fdd3350681301b44f02fa7ee2757
> >
> >
> > author Jan Kiszka <jan.kiszka@siemens.com>
> > Mon, 27 Jun 2011 10:22:28 +0000 (12:22 +0200)
> > committer Avi Kivity <avi@redhat.com>
> > Tue, 28 Jun 2011 08:18:58 +0000 (11:18 +0300)
> > commit 59539c913383fdd3350681301b44f02fa7ee2757
> > tree bfdf23a13004d08d04589d02d3c4c754da3dd076
> > parent b7496707af10ce2827d0803c9e46ca8ddc543716
> >
>
> Can you bisect which change precisely introduced the regression?
>

Yes, of course, I'm working on this now.

> Thanks,
> Jan

Thank you for your time.

Here is the result:

6506e4f995967b1a48cc34418c77b318df92ce35 is the first bad commit
commit 6506e4f995967b1a48cc34418c77b318df92ce35
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date: Thu May 19 18:35:44 2011 +0100

    xen: remove xen_map_block and xen_unmap_block

    Replace xen_map_block with qemu_map_cache with the appropriate locking
    and size parameters.
    Replace xen_unmap_block with qemu_invalidate_entry.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Alexander Graf <agraf@suse.de>

:100644 100644 01f33bb2bca8ca69ffa03ca5170d1fce3ffd2fb4 e11c1dd97a62669255a35d1628f24fc4adf538fb M exec.c
:100644 100644 60f712b229b63f63f2fe9e8bf5c867cf4f031d71 8a2380a151978a8735c797529be871b338958b05 M xen-mapcache-stub.c
:100644 100644 57fe24de86b372775b2a0d4d7537f231626d594e fac47cd9be72bf1201f21745498625fec44c4515 M xen-mapcache.c
:100644 100644 b89b8f9653a5f58e0ea710ae0db095a7355c9eb6 6216cc3be7eb68d6c53d21c96a950abcc565a1ba M xen-mapcache.h

Could you please look at it?

Thank you for your time.
To reproduce, the KVM guest should have more than 4 GB of memory.
I tested the reported bug on several servers, also with kernel 3.0.0-rc6+: same result.

git bisect result:

6506e4f995967b1a48cc34418c77b318df92ce35 is the first bad commit
commit 6506e4f995967b1a48cc34418c77b318df92ce35
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date: Thu May 19 18:35:44 2011 +0100

    xen: remove xen_map_block and xen_unmap_block

    Replace xen_map_block with qemu_map_cache with the appropriate locking
    and size parameters.
    Replace xen_unmap_block with qemu_invalidate_entry.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Alexander Graf <agraf@suse.de>

:100644 100644 01f33bb2bca8ca69ffa03ca5170d1fce3ffd2fb4 e11c1dd97a62669255a35d1628f24fc4adf538fb M exec.c
:100644 100644 60f712b229b63f63f2fe9e8bf5c867cf4f031d71 8a2380a151978a8735c797529be871b338958b05 M xen-mapcache-stub.c
:100644 100644 57fe24de86b372775b2a0d4d7537f231626d594e fac47cd9be72bf1201f21745498625fec44c4515 M xen-mapcache.c
:100644 100644 b89b8f9653a5f58e0ea710ae0db095a7355c9eb6 6216cc3be7eb68d6c53d21c96a950abcc565a1ba M xen-mapcache.h

Complete git bisect log:

git bisect start
# good: [d32e8d0b8d9e0ef7cf7ab2e74548982972789dfc] Merge commit 'v0.14.1' into stable-0.14
git bisect good d32e8d0b8d9e0ef7cf7ab2e74548982972789dfc
# bad: [525e3df73e40290e95743d4c8f8b64d8d9cbe021] Merge branch 'master' of git://git.qemu.org/qemu into next
git bisect bad 525e3df73e40290e95743d4c8f8b64d8d9cbe021
# good: [2d2339f995d7176dcb2de10d162aed323a1ffbf3] Merge commit 'f487d6278f75f84378833b8c3a67443346d639dc' into upstream-merge
git bisect good 2d2339f995d7176dcb2de10d162aed323a1ffbf3
# good: [0e192fae3c79e7d2830f8b1fa694cd8e128084cf] Update version for 0.14.0-rc0
git bisect good 0e192fae3c79e7d2830f8b1fa694cd8e128084cf
# good: [075360945860ad9bdd491921954b383bf762b0e5] spice: don't call displaystate callbacks from spice server context.
git bisect good 075360945860ad9bdd491921954b383bf762b0e5
# good: [9047c0b40654ce3578c148f6754f878218569252] usb-ehci: move device/vendor/class id to qdev
git bisect good 9047c0b40654ce3578c148f6754f878218569252
# bad: [75ef849696830fc2ddeff8bb90eea5887ff50df6] esp: correctly fill bus id with requested lun
git bisect bad 75ef849696830fc2ddeff8bb90eea5887ff50df6
# bad: [d6034a3a61235042a0d79dcc1dfed0fbf461fb18] Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
git bisect bad d6034a3a61235042a0d79dcc1dfed0fbf461fb18
# good: [b45a9b185120a10455859341d8035cce9b441fc8] Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
git bisect good b45a9b185120a10455859341d8035cce9b441fc8
# bad: [ebed85058b6e89a5202112e9aa2abab3aa3804c3] xen: only track the linear framebuffer
git bisect bad ebed85058b6e89a5202112e9aa2abab3aa3804c3
# good: [22e1e729600dad1639329185614d094243409359] Merge branch 'cocoa-for-upstream' of git://repo.or.cz/qemu/afaerber
git bisect good 22e1e729600dad1639329185614d094243409359
# good: [c13390cd384a9564e6dded127d01ef0627b6b1c5] xen: fix qemu_map_cache with size != MCACHE_BUCKET_SIZE
git bisect good c13390cd384a9564e6dded127d01ef0627b6b1c5
# bad: [38bee5dc94ee355640b030d28f311b03ee2f13d1] exec.c: refactor cpu_physical_memory_map
git bisect bad 38bee5dc94ee355640b030d28f311b03ee2f13d1
# bad: [6506e4f995967b1a48cc34418c77b318df92ce35] xen: remove xen_map_block and xen_unmap_block
git bisect bad 6506e4f995967b1a48cc34418c77b318df92ce35
# good: [cd306087e5a9ea4091071a0a41c0ea99fac60ab0] xen: remove qemu_map_cache_unlock
git bisect good cd306087e5a9ea4091071a0a41c0ea99fac60ab0

Could you please look at it?

Thank you for your time.

Here is the bug (xen-mapcache.c):

void qemu_map_cache_init(void)
{
->  mapcache->entry = qemu_mallocz(size);

should be:

    mapcache->entry = qemu_mallocz(size*sizeof(MapCacheEntry));
}

Should somebody commit this fix?

Thank you for your time.

After applying the simple fix above, all tested guests start and run correctly.
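The kind of allocation mistake described in the comment above (an element count passed where a byte count is expected) can be illustrated with a small standalone C sketch. It does not reproduce QEMU's xen-mapcache.c: the struct fields and every `mapcache_table_*` name are assumptions made for illustration, and plain `calloc` stands in for `qemu_mallocz`. Whether `size` in the real function holds an entry count or a byte count is not shown in this report.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's MapCacheEntry; the fields are illustrative. */
typedef struct MapCacheEntry {
    uint64_t paddr_index;
    uint8_t *vaddr_base;
    struct MapCacheEntry *next;
} MapCacheEntry;

/* Bytes needed for nr_entries entries, or 0 if the product would overflow. */
size_t mapcache_table_bytes(size_t nr_entries)
{
    if (nr_entries > SIZE_MAX / sizeof(MapCacheEntry)) {
        return 0;
    }
    return nr_entries * sizeof(MapCacheEntry);
}

/*
 * Allocate a zeroed table. calloc() takes the element count and the element
 * size as separate arguments, so a count cannot be mistaken for a byte total.
 * Returns NULL if the allocation fails.
 */
MapCacheEntry *mapcache_table_alloc(size_t nr_entries)
{
    return calloc(nr_entries, sizeof(MapCacheEntry));
}

/*
 * Read every entry of the table. If the buffer had been sized as nr_entries
 * bytes instead of nr_entries * sizeof(MapCacheEntry), this loop would run
 * past the end of the allocation.
 */
void mapcache_table_check_zeroed(const MapCacheEntry *table, size_t nr_entries)
{
    for (size_t i = 0; i < nr_entries; i++) {
        assert(table[i].paddr_index == 0);
        assert(table[i].vaddr_base == NULL);
        assert(table[i].next == NULL);
    }
}
```

A caller would allocate with `mapcache_table_alloc(n)`, check the result for NULL, and release the table with `free()`. Keeping the count and the element size as separate arguments is one way to make this class of error harder to write.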
Should someone commit the fix from comment #6 that solves the reported bug?

Thank you for your time.

Please post the fix on qemu-devel@nongnu.org, with a signed-off-by line.

First-Bad-Commit : 6506e4f995967b1a48cc34418c77b318df92ce35

Sorry, this is a QEMU bug, it appears. Closing.