Created attachment 93511
guest panic after migration

Environment:
------------
Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux (e.g. RHEL6.3)
kvm.git next branch Commit:cbd29cb6e38af6119df2cdac0c58acf0e85c177e
qemu-kvm.git Commit:4d9367b76f71c6d938cf8201392abe4bfb1136cb
Hardware:SandyBridge-EP, Westmere-EP

Bug detailed description:
--------------------------
After live migration, the guest panics. This should be a KVM kernel bug.

kvm      + qemu-kvm = result
cbd29cb6 + 4d9367b7 = bad
b0da5bec + 4d9367b7 = good

Reproduce steps:
----------------
1. Start up a host with kvm (commit: cbd29cb6)
2. Start a TCP daemon for migration:
   qemu-system-x86_64 -m 1024 -smp 2 -net nic,macaddr=00:12:32:45:12:54 -net tap /root/rhel6u3.img -incoming tcp:localhost:4444
3. Create a guest:
   qemu-system-x86_64 -m 1024 -smp 2 -net nic,macaddr=00:12:32:45:12:54 -net tap /root/rhel6u3.img
4. Press "ctrl+Alt+2" to switch to the QEMU monitor
5. In the monitor: migrate tcp:localhost:4444

Current result:
----------------
After live migration, the guest panics.

Expected result:
----------------
After live migration, the guest works fine.

Basic root-causing log:
----------------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Tainted: G B W --------------- )
Hardware name: Bochs
list_add corruption. prev->next should be next (ffff88003fae0ac0), but was ffff8800365c3000. (prev=ffff8800365f9040).
Modules linked in: autofs4 sunrpc ipv6 uinput ppdev parport_pc parport microcode sg 8139too 8139cp mii i2c_piix4 i2c_core ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 12, comm: events/1 Tainted: G B W --------------- 2.6.32-279.el6.x86_64 #1
Call Trace:
 [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b836>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff8128301f>] ? __list_add+0x8f/0xa0
 [<ffffffff81163f64>] ? free_block+0x154/0x170
 [<ffffffff811641b1>] ? drain_array+0xc1/0x100
 [<ffffffff8116517e>] ? cache_reap+0x8e/0x260
 [<ffffffff81137090>] ? vmstat_update+0x0/0x40
 [<ffffffff811650f0>] ? cache_reap+0x0/0x260
 [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091d66>] ? kthread+0x96/0xa0
 [<ffffffff8100c14a>] ? child_rip+0xa/0x20
 [<ffffffff81091cd0>] ? kthread+0x0/0xa0
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
---[ end trace f17758832a0dcb5e ]---
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/irq
CPU 1
Modules linked in: autofs4 sunrpc ipv6 uinput ppdev parport_pc parport microcode sg 8139too 8139cp mii i2c_piix4 i2c_core ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]

Pid: 1173, comm: rs:main Q:Reg Tainted: G B W --------------- 2.6.32-279.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffff81282f00>] [<ffffffff81282f00>] list_del+0x10/0xa0
RSP: 0018:ffff880037547a78 EFLAGS: 00010096
RAX: dead000000200200 RBX: ffffea0000ceb940 RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffff88003edd00d0 RDI: ffffea0000ceb940
RBP: ffff880037547a88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003edd00c0
R13: ffff8800000116c0 R14: 000000000000362e R15: ffffea0000ceb918
FS: 00007fc44b7cc700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc44c5aba10 CR3: 000000003dc44000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rs:main Q:Reg (pid: 1173, threadinfo ffff880037546000, task ffff880037062ae0)
Stack:
 0000000000000282 0000000000000001 ffff880037547ba8 ffffffff811258a8
<d> ffff880037547ab8 0000000000000000 ffffffff00000001 ffff88003728b400
<d> 0000000000c7f118 00000040ffffffff 0000000000000000 ffff880000033c28
Call Trace:
 [<ffffffff811258a8>] get_page_from_freelist+0x288/0x820
 [<ffffffffa00869f6>] ? jbd2_journal_stop+0x1e6/0x2b0 [jbd2]
 [<ffffffff81126f31>] __alloc_pages_nodemask+0x111/0x940
 [<ffffffff81161d62>] kmem_getpages+0x62/0x170
 [<ffffffff811623cf>] cache_grow+0x2cf/0x320
 [<ffffffff81162622>] cache_alloc_refill+0x202/0x240
 [<ffffffff8116351f>] kmem_cache_alloc+0x15f/0x190
 [<ffffffff811b9738>] fsnotify_create_event+0x38/0x1a0
 [<ffffffff811b9430>] fsnotify+0x140/0x160
 [<ffffffff8117b0e2>] vfs_write+0x132/0x1a0
 [<ffffffff8117ba81>] sys_write+0x51/0x90
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: 89 95 fc fe ff ff e9 ab fd ff ff 4c 8b ad e8 fe ff ff e9 db fd ff ff 90 90 90 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 47 08 <4c> 8b 00 4c 39 c7 75 39 48 8b 03 4c 8b 40 08 4c 39 c3 75 4c 48
RIP [<ffffffff81282f00>] list_del+0x10/0xa0
 RSP <ffff880037547a78>
---[ end trace f17758832a0dcb5f ]---
Kernel panic - not syncing: Fatal exception
Pid: 1173, comm: rs:main Q:Reg Tainted: G B D W --------------- 2.6.32-279.el6.x86_64 #1
Call Trace:
 [<ffffffff814fd11a>] ? panic+0xa0/0x168
 [<ffffffff815012b4>] ? oops_end+0xe4/0x100
 [<ffffffff8100f26b>] ? die+0x5b/0x90
 [<ffffffff81500e22>] ? do_general_protection+0x152/0x160
 [<ffffffff815005f5>] ? general_protection+0x25/0x30
 [<ffffffff81282f00>] ? list_del+0x10/0xa0
 [<ffffffff811248d2>] ? bad_page+0x52/0x160
 [<ffffffff811258a8>] ? get_page_from_freelist+0x288/0x820
 [<ffffffffa00869f6>] ? jbd2_journal_stop+0x1e6/0x2b0 [jbd2]
 [<ffffffff81126f31>] ? __alloc_pages_nodemask+0x111/0x940
 [<ffffffff81161d62>] ? kmem_getpages+0x62/0x170
 [<ffffffff811623cf>] ? cache_grow+0x2cf/0x320
 [<ffffffff81162622>] ? cache_alloc_refill+0x202/0x240
 [<ffffffff8116351f>] ? kmem_cache_alloc+0x15f/0x190
 [<ffffffff811b9738>] ? fsnotify_create_event+0x38/0x1a0
 [<ffffffff811b9430>] ? fsnotify+0x140/0x160
 [<ffffffff8117b0e2>] ? vfs_write+0x132/0x1a0
 [<ffffffff8117ba81>] ? sys_write+0x51/0x90
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Jay, Can you please revert caf6900f2d8aaebe404c976753f6813ccd31d95e and see if that makes a difference? If that fails, please bisect to find the offending commit. Thanks.
Marcelo, yes, after reverting commit "caf6900f2d8", live migration works fine.
Reply-To: xiaoguangrong@linux.vnet.ibm.com

On 02/20/2013 04:06 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=54061
>
> --- Comment #2 from Jay Ren <yongjie.ren@intel.com> 2013-02-20 08:06:16 ---
> Marcelo,
>
> yes, after reverting commit "caf6900f2d8", live migration works fine.
>

Sorry for the wrong patch. The bug is: the fast page fault path can make a large spte writable without properly setting the dirty bitmap for all of the small pages it covers.

There are two ways to fix this:
1) Convert the large spte to small sptes and write-protect them, so that no read-only large spte exists.
2) Only allow the fast page fault path to fix #PF on small sptes.

The first way seems better, because the second can cause useless page table walking.
By reverting the buggy commit, this bug was fixed. See the following commit:

commit 6b73a96065e89dc9fa75ba4f78b1aa3a3bbd0470
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Wed Feb 20 18:52:02 2013 -0300

    Revert "KVM: MMU: lazily drop large spte"

    This reverts commit caf6900f2d8aaebe404c976753f6813ccd31d95e.

    It is causing migration failures, reference
    https://bugzilla.kernel.org/show_bug.cgi?id=54061.

    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>