Bug 54061

Summary: guest panic after live migration
Product: Virtualization Reporter: Jay Ren (yongjie.ren)
Component: kvm Assignee: virtualization_kvm
Status: CLOSED CODE_FIX    
Severity: normal CC: mtosatti
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: guest panic after migration

Description Jay Ren 2013-02-19 03:05:30 UTC
Created attachment 93511
guest panic after migration

Environment:
------------
Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):Linux (e.g. RHEL6.3)
kvm.git next branch Commit:cbd29cb6e38af6119df2cdac0c58acf0e85c177e
qemu-kvm.git Commit:4d9367b76f71c6d938cf8201392abe4bfb1136cb
Hardware:SandyBridge-EP, Westmere-EP

Bug detailed description:
--------------------------
After live migration, the guest panics.
This appears to be a KVM kernel bug: the same qemu-kvm commit is good with an older kvm commit and bad with the newer one.
kvm      + qemu-kvm   =  result
cbd29cb6 + 4d9367b7   = bad
b0da5bec + 4d9367b7   = good

Reproduce steps:
----------------
1. Start up a host with kvm (commit: cbd29cb6).
2. Start a QEMU instance that listens for the incoming migration over TCP:
qemu-system-x86_64 -m 1024 -smp 2 -net nic,macaddr=00:12:32:45:12:54 -net tap /root/rhel6u3.img -incoming tcp:localhost:4444
3. Create a guest:
qemu-system-x86_64 -m 1024 -smp 2 -net nic,macaddr=00:12:32:45:12:54 -net tap /root/rhel6u3.img
4. Press "Ctrl+Alt+2" to switch to the QEMU monitor.
5. In the monitor, run: migrate tcp:localhost:4444
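
The two command lines in the steps above differ only in the -incoming option. A minimal sketch that builds both from one base list is shown here; it only constructs and prints the argument lists and does not launch QEMU. The names BASE_ARGV and qemu_argv are hypothetical and used for illustration only; every option is taken verbatim from the steps above.

```python
import shlex

# Options shared by the source guest and the migration target,
# copied from the reproduce steps.
BASE_ARGV = [
    "qemu-system-x86_64", "-m", "1024", "-smp", "2",
    "-net", "nic,macaddr=00:12:32:45:12:54", "-net", "tap",
    "/root/rhel6u3.img",
]


def qemu_argv(incoming_uri=None):
    """Return the QEMU argument list; add -incoming for the target side."""
    argv = list(BASE_ARGV)
    if incoming_uri is not None:
        argv += ["-incoming", incoming_uri]
    return argv


if __name__ == "__main__":
    print(shlex.join(qemu_argv("tcp:localhost:4444")))  # step 2, target
    print(shlex.join(qemu_argv()))                      # step 3, source
```

The migration itself is still started by hand from the QEMU monitor, as in step 5.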

Current result:
----------------
After live migration, the guest panics.

Expected result:
----------------
After live migration, the guest works fine.

Basic root-causing log:
----------------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Tainted: G    B   W 
---------------   )

Hardware name: Bochs

list_add corruption. prev->next should be next (ffff88003fae0ac0), but was
ffff8800365c3000. (prev=ffff8800365f9040).

Modules linked in: autofs4 sunrpc ipv6 uinput ppdev parport_pc parport
microcode sg 8139too 8139cp mii i2c_piix4 i2c_core ext4 mbcache jbd2 sr_mod
cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: speedstep_lib]

Pid: 12, comm: events/1 Tainted: G    B   W  ---------------   
2.6.32-279.el6.x86_64 #1

Call Trace:

 [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0

 [<ffffffff8106b836>] ? warn_slowpath_fmt+0x46/0x50

 [<ffffffff8128301f>] ? __list_add+0x8f/0xa0

 [<ffffffff81163f64>] ? free_block+0x154/0x170

 [<ffffffff811641b1>] ? drain_array+0xc1/0x100

 [<ffffffff8116517e>] ? cache_reap+0x8e/0x260

 [<ffffffff81137090>] ? vmstat_update+0x0/0x40

 [<ffffffff811650f0>] ? cache_reap+0x0/0x260

 [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0

 [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40

 [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0

 [<ffffffff81091d66>] ? kthread+0x96/0xa0

 [<ffffffff8100c14a>] ? child_rip+0xa/0x20

 [<ffffffff81091cd0>] ? kthread+0x0/0xa0

 [<ffffffff8100c140>] ? child_rip+0x0/0x20

---[ end trace f17758832a0dcb5e ]---

general protection fault: 0000 [#1] SMP 

last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/irq

CPU 1 

Modules linked in: autofs4 sunrpc ipv6 uinput ppdev parport_pc parport
microcode sg 8139too 8139cp mii i2c_piix4 i2c_core ext4 mbcache jbd2 sr_mod
cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: speedstep_lib]



Pid: 1173, comm: rs:main Q:Reg Tainted: G    B   W  ---------------   
2.6.32-279.el6.x86_64 #1 Bochs Bochs

RIP: 0010:[<ffffffff81282f00>]  [<ffffffff81282f00>] list_del+0x10/0xa0

RSP: 0018:ffff880037547a78  EFLAGS: 00010096

RAX: dead000000200200 RBX: ffffea0000ceb940 RCX: 0000000000000000

RDX: 0000000000000010 RSI: ffff88003edd00d0 RDI: ffffea0000ceb940

RBP: ffff880037547a88 R08: 0000000000000000 R09: 0000000000000000

R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003edd00c0

R13: ffff8800000116c0 R14: 000000000000362e R15: ffffea0000ceb918

FS:  00007fc44b7cc700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000

CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: 00007fc44c5aba10 CR3: 000000003dc44000 CR4: 00000000000006e0

DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Process rs:main Q:Reg (pid: 1173, threadinfo ffff880037546000, task
ffff880037062ae0)

Stack:

 0000000000000282 0000000000000001 ffff880037547ba8 ffffffff811258a8

<d> ffff880037547ab8 0000000000000000 ffffffff00000001 ffff88003728b400

<d> 0000000000c7f118 00000040ffffffff 0000000000000000 ffff880000033c28

Call Trace:

 [<ffffffff811258a8>] get_page_from_freelist+0x288/0x820

 [<ffffffffa00869f6>] ? jbd2_journal_stop+0x1e6/0x2b0 [jbd2]

 [<ffffffff81126f31>] __alloc_pages_nodemask+0x111/0x940

 [<ffffffff81161d62>] kmem_getpages+0x62/0x170

 [<ffffffff811623cf>] cache_grow+0x2cf/0x320

 [<ffffffff81162622>] cache_alloc_refill+0x202/0x240

 [<ffffffff8116351f>] kmem_cache_alloc+0x15f/0x190

 [<ffffffff811b9738>] fsnotify_create_event+0x38/0x1a0

 [<ffffffff811b9430>] fsnotify+0x140/0x160

 [<ffffffff8117b0e2>] vfs_write+0x132/0x1a0

 [<ffffffff8117ba81>] sys_write+0x51/0x90

 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

Code: 89 95 fc fe ff ff e9 ab fd ff ff 4c 8b ad e8 fe ff ff e9 db fd ff ff 90
90 90 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 47 08 <4c> 8b 00 4c 39 c7 75
39 48 8b 03 4c 8b 40 08 4c 39 c3 75 4c 48 

RIP  [<ffffffff81282f00>] list_del+0x10/0xa0

 RSP <ffff880037547a78>

---[ end trace f17758832a0dcb5f ]---

Kernel panic - not syncing: Fatal exception

Pid: 1173, comm: rs:main Q:Reg Tainted: G    B D W  ---------------   
2.6.32-279.el6.x86_64 #1

Call Trace:

 [<ffffffff814fd11a>] ? panic+0xa0/0x168

 [<ffffffff815012b4>] ? oops_end+0xe4/0x100

 [<ffffffff8100f26b>] ? die+0x5b/0x90

 [<ffffffff81500e22>] ? do_general_protection+0x152/0x160

 [<ffffffff815005f5>] ? general_protection+0x25/0x30

 [<ffffffff81282f00>] ? list_del+0x10/0xa0

 [<ffffffff811248d2>] ? bad_page+0x52/0x160

 [<ffffffff811258a8>] ? get_page_from_freelist+0x288/0x820

 [<ffffffffa00869f6>] ? jbd2_journal_stop+0x1e6/0x2b0 [jbd2]

 [<ffffffff81126f31>] ? __alloc_pages_nodemask+0x111/0x940

 [<ffffffff81161d62>] ? kmem_getpages+0x62/0x170

 [<ffffffff811623cf>] ? cache_grow+0x2cf/0x320

 [<ffffffff81162622>] ? cache_alloc_refill+0x202/0x240

 [<ffffffff8116351f>] ? kmem_cache_alloc+0x15f/0x190

 [<ffffffff811b9738>] ? fsnotify_create_event+0x38/0x1a0

 [<ffffffff811b9430>] ? fsnotify+0x140/0x160

 [<ffffffff8117b0e2>] ? vfs_write+0x132/0x1a0

 [<ffffffff8117ba81>] ? sys_write+0x51/0x90

 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
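
A note on the register dump above: RAX holds dead000000200200 at the faulting instruction in list_del. That value matches the kernel's list poison for the prev pointer of an entry that has already been removed from a list (LIST_POISON2), and it is a non-canonical x86-64 address, so dereferencing it raises a general protection fault rather than an ordinary page fault. The sketch below only checks that arithmetic. The constants are assumed to match include/linux/poison.h in kernels of that era built with a poison pointer delta of 0xdead000000000000; is_canonical() is a hypothetical helper and assumes 48-bit virtual addresses.

```python
# Assumed values, see the note above.
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISON1 = 0x00100100 + POISON_POINTER_DELTA  # written to entry->next
LIST_POISON2 = 0x00200200 + POISON_POINTER_DELTA  # written to entry->prev


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) of addr are all zeros or all ones."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


if __name__ == "__main__":
    rax = 0xDEAD000000200200          # RAX from the oops
    print(rax == LIST_POISON2)        # True
    print(is_canonical(rax))          # False
    print(is_canonical(0xFFFF880037547A78))  # RSP from the oops: True
```

Seeing a poisoned pointer in a page that the allocator still believes is on a free list suggests that the guest's memory contents were stale or inconsistent after migration, rather than a bug in the guest's own list handling.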
Comment 1 Marcelo Tosatti 2013-02-20 02:43:34 UTC
Jay, 

Can you please revert caf6900f2d8aaebe404c976753f6813ccd31d95e and see if that makes a difference? If that fails, please bisect to find the offending commit.

Thanks.
Comment 2 Jay Ren 2013-02-20 08:06:16 UTC
Marcelo,

Yes, after reverting commit "caf6900f2d8", live migration works fine.
Comment 3 Anonymous Emailer 2013-02-25 08:26:26 UTC
Reply-To: xiaoguangrong@linux.vnet.ibm.com

On 02/20/2013 04:06 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=54061
>
> --- Comment #2 from Jay Ren <yongjie.ren@intel.com>  2013-02-20 08:06:16 ---
> Marcleo,
> 
> yes, after reverting that commit "caf6900f2d8", live-migration can work fine.
> 

Sorry for the wrong patch. The bug is that the fast page fault path
can make a large spte writable without properly setting the dirty bitmap
for all of the small pages it covers.

There are two ways to fix this:
1): convert the large spte to small sptes and write-protect them, so that
    no read-only large spte exists.

2): only allow the fast page fault path to fix #PF on small sptes.

The first way seems better because the second can cause useless
page table walking.
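
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not KVM code: the class DirtyLogModel, its fields, and the run() helper are hypothetical names, and a large page is assumed to cover 512 small pages (2 MiB over 4 KiB pages on x86-64). With split_large_pages=False, the first write fault makes the whole large mapping writable but logs only the faulting gfn, so later writes to neighbouring pages are never logged. With split_large_pages=True (option 1), every small page faults and is logged on its own.

```python
PAGES_PER_LARGE = 512  # assumed: 4 KiB pages per 2 MiB large page


class DirtyLogModel:
    """Toy model of write-protect based dirty logging (hypothetical)."""

    def __init__(self, split_large_pages):
        self.split_large_pages = split_large_pages
        self.dirty = set()           # gfns recorded in the dirty bitmap
        self.writable_large = set()  # large frames mapped writable as a whole
        self.writable_small = set()  # gfns whose small mapping is writable
        self.written = set()         # ground truth: gfns the guest wrote

    def guest_write(self, gfn):
        self.written.add(gfn)
        large = gfn // PAGES_PER_LARGE
        if large in self.writable_large or gfn in self.writable_small:
            return  # mapping already writable: no fault, nothing logged
        # Write fault on a write-protected mapping: log the faulting gfn.
        self.dirty.add(gfn)
        if self.split_large_pages:
            self.writable_small.add(gfn)     # only this small page opens up
        else:
            self.writable_large.add(large)   # whole large page opens up

    def missed_pages(self):
        """Pages that were written but never marked dirty."""
        return self.written - self.dirty


def run(split_large_pages):
    model = DirtyLogModel(split_large_pages)
    for gfn in (0, 1, 2, 600):
        model.guest_write(gfn)
    return sorted(model.missed_pages())


if __name__ == "__main__":
    print(run(False))  # large spte made writable: [1, 2]
    print(run(True))   # split into small sptes:   []
```

In the first run, pages 1 and 2 are written but never marked dirty, so a migration that copies only logged pages would leave stale copies of them on the destination. That is consistent with the memory corruption seen in the guest after migration.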
Comment 4 Jay Ren 2013-03-01 07:33:31 UTC
By reverting the buggy commit, this bug was fixed. See the following commit:

commit 6b73a96065e89dc9fa75ba4f78b1aa3a3bbd0470
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Wed Feb 20 18:52:02 2013 -0300

    Revert "KVM: MMU: lazily drop large spte"

    This reverts commit caf6900f2d8aaebe404c976753f6813ccd31d95e.

    It is causing migration failures, reference
    https://bugzilla.kernel.org/show_bug.cgi?id=54061.

    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>