Bug 69491 - Booting into a guest on Intel Haswell (bare-metal) throws soft lockups [qemu-system-x86:911]
Summary: Booting into a guest on Intel Haswell (bare-metal) throws soft lockups [qemu-...
Status: RESOLVED CODE_FIX
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-27 09:09 UTC by Kashyap Chamarthy
Modified: 2014-02-28 06:01 UTC (History)

See Also:
Kernel Version: 3.14.0-0.rc0.git9.1.fc21
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Complete stdout of dmesg (251.15 KB, text/plain)
2014-01-27 09:19 UTC, Kashyap Chamarthy
Complete stdout of dmidecode (25.00 KB, text/plain)
2014-01-27 09:20 UTC, Kashyap Chamarthy
Complete stdout of `x86info -a` (10.17 KB, text/plain)
2014-01-27 09:37 UTC, Kashyap Chamarthy
Successful stderr of `dmesg` on L0 with Kernel 3.14.0-0.rc1.git4.1.fc21.x86_64 (70.09 KB, text/plain)
2014-02-10 14:06 UTC, Kashyap Chamarthy

Description Kashyap Chamarthy 2014-01-27 09:09:49 UTC
Description of problem
----------------------

On boot on an Intel Haswell machine, I see the following soft lockups, crashes, and call traces:

   ===
   .
   .
   .  
   [  716.846912] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:911]
   
   [  716.946229] hardirqs last  enabled at (138600857): [<ffffffffa05395d3>] kvm_arch_vcpu_ioctl_run+0x363/0x1510 [kvm]
   [  716.957970] hardirqs last disabled at (138600858): [<ffffffff817a7c2d>] apic_timer_interrupt+0x6d/0x80
   [  716.968523] softirqs last  enabled at (138595310): [<ffffffff810991ce>] __do_softirq+0x1ce/0x460
   [  716.978493] softirqs last disabled at (138595305): [<ffffffff81099795>] irq_exit+0xc5/0xd0
   
   [  716.997812] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
   .
   .
   .
   [  717.146239]  [<ffffffffa053a436>] kvm_arch_vcpu_ioctl_run+0x11c6/0x1510 [kvm]
   [  717.154314]  [<ffffffffa05395c7>] ? kvm_arch_vcpu_ioctl_run+0x357/0x1510 [kvm]
   [  717.162487]  [<ffffffffa052067c>] ? vcpu_load+0x1c/0xa0 [kvm]
   [  717.168995]  [<ffffffffa05358fe>] ? kvm_arch_vcpu_load+0x4e/0x1e0 [kvm]
   [  717.176487]  [<ffffffffa05209fd>] kvm_vcpu_ioctl+0x2bd/0x670 [kvm]
   [  717.183486]  [<ffffffff810c4ee2>] ? creds_are_invalid.part.1+0x12/0x50
   [  717.190876]  [<ffffffff810c4f41>] ? creds_are_invalid+0x21/0x30
   [  717.197580]  [<ffffffff813415c6>] ? inode_has_perm.isra.51+0x26/0x80
   [  717.204775]  [<ffffffff81237b70>] do_vfs_ioctl+0x300/0x520
   [  717.210986]  [<ffffffff81341c2b>] ? selinux_file_ioctl+0x5b/0x110
   [  717.217884]  [<ffffffff81237e11>] SyS_ioctl+0x81/0xa0
   [  717.223601]  [<ffffffff817a6f29>] system_call_fastpath+0x16/0x1b
   .
   .
   .
   ====


Version
-------

On L0 (physical host):

    $ uname -r; rpm -q qemu-system-x86 libvirt-daemon-kvm libguestfs
    3.14.0-0.rc0.git9.1.fc21.x86_64
    qemu-system-x86-1.7.0-4.fc21.x86_64
    libvirt-daemon-kvm-1.2.1-1.fc21.x86_64
    libguestfs-1.25.29-1.fc21.x86_64


How reproducible
----------------
Consistently


Steps to Reproduce
------------------

1. Update the kernel, QEMU, and KVM packages from Rawhide to the versions above:

    $ yum update kernel libvirt-daemon-kvm qemu-system-x86 \
      libguestfs --enablerepo=rawhide

2. Reboot into the Kernel 3.14.0-0.rc0.git9.1.fc21

3. Observe boot messages over serial console


Actual results:
---------------

    ====
    .
    .
    .
    .
    [  690.671628] virbr1: port 2(vnet0) entered listening state
    [  690.677779] virbr1: port 2(vnet0) entered listening state
    [  692.679932] virbr1: port 2(vnet0) entered learning state
    [  694.682708] virbr1: topology change detected, propagating
    [  694.688935] virbr1: port 2(vnet0) entered forwarding state
    [  694.695360] IPv6: ADDRCONF(NETDEV_CHANGE): virbr1: link becomes ready
    [  716.846912] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:911]
    [  716.854611] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables x86_pkg_temp_thermal coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm snd_hda_intel snd_hda_codec snd_hwdep snd_seq i2c_i801 iwlwifi crct10dif_pclmul crc32_pclmul crc32c_intel snd_seq_device cfg80211 ghash_clmulni_intel ppdev snd_pcm microcode iTCO_wdt iTCO_vendor_support serio_raw winbond_cir rfkill snd_timer rc_core parport_pc sdhci_acpi snd parport e1000e ptp sdhci mmc_core nfsd auth_rpcgss i2c_hid nfs_acl lockd dw_dmac i2c_designware_platform dw_dmac_core i2c_designware_core shpchp pps_core soundcore sunrpc mei_me mei lpc_ich mfd_core i915 i2c_algo_bit drm_kms_helper drm i2c_core video usb_storage
    [  716.941989] irq event stamp: 138600858
    [  716.946229] hardirqs last  enabled at (138600857): [<ffffffffa05395d3>] kvm_arch_vcpu_ioctl_run+0x363/0x1510 [kvm]
    [  716.957970] hardirqs last disabled at (138600858): [<ffffffff817a7c2d>] apic_timer_interrupt+0x6d/0x80
    [  716.968523] softirqs last  enabled at (138595310): [<ffffffff810991ce>] __do_softirq+0x1ce/0x460
    [  716.978493] softirqs last disabled at (138595305): [<ffffffff81099795>] irq_exit+0xc5/0xd0
    [  716.987865] CPU: 1 PID: 911 Comm: qemu-system-x86 Not tainted 3.14.0-0.rc0.git9.1.fc21.x86_64 #1
    [  716.997812] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
    [  717.011991] task: ffff880446ad1ab0 ti: ffff88042cedc000 task.ti: ffff88042cedc000
    [  717.020463] RIP: 0010:[<ffffffff810f4782>]  [<ffffffff810f4782>] lock_release+0xc2/0x310
    [  717.029635] RSP: 0018:ffff88042ceddd38  EFLAGS: 00000246
    [  717.035647] RAX: ffff880446ad1ab0 RBX: 0000000000a74000 RCX: 0000000000009a60
    [  717.043726] RDX: ffff8800a18523c0 RSI: 0000000000000000 RDI: 0000000000000246
    [  717.051812] RBP: ffff88042ceddd60 R08: ffff880446ad2610 R09: 0000000000000000
    [  717.059890] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880446ad2610
    [  717.067968] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
    [  717.076047] FS:  00007f6abfacd700(0000) GS:ffff8800a1800000(0000) knlGS:0000000000000000
    [  717.085209] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  717.091713] CR2: 0000000000000000 CR3: 000000042eb48000 CR4: 00000000001427e0
    [  717.099793] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  717.107870] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  717.115948] Stack:
    [  717.118221]  ffff88042cee0000 0000000000000001 0000000000000000 ffff88042cebc280
    [  717.126632]  ffff88042cebc100 ffff88042cedde38 ffffffffa053a436 ffffffffa05395c7
    [  717.135054]  ffff88042cee0048 ffff88042ceddd90 ffff880446ad1ab0 ffff880446ad1ab0
    [  717.143468] Call Trace:
    [  717.146239]  [<ffffffffa053a436>] kvm_arch_vcpu_ioctl_run+0x11c6/0x1510 [kvm]
    [  717.154314]  [<ffffffffa05395c7>] ? kvm_arch_vcpu_ioctl_run+0x357/0x1510 [kvm]
    [  717.162487]  [<ffffffffa052067c>] ? vcpu_load+0x1c/0xa0 [kvm]
    [  717.168995]  [<ffffffffa05358fe>] ? kvm_arch_vcpu_load+0x4e/0x1e0 [kvm]
    [  717.176487]  [<ffffffffa05209fd>] kvm_vcpu_ioctl+0x2bd/0x670 [kvm]
    [  717.183486]  [<ffffffff810c4ee2>] ? creds_are_invalid.part.1+0x12/0x50
    [  717.190876]  [<ffffffff810c4f41>] ? creds_are_invalid+0x21/0x30
    [  717.197580]  [<ffffffff813415c6>] ? inode_has_perm.isra.51+0x26/0x80
    [  717.204775]  [<ffffffff81237b70>] do_vfs_ioctl+0x300/0x520
    [  717.210986]  [<ffffffff81341c2b>] ? selinux_file_ioctl+0x5b/0x110
    [  717.217884]  [<ffffffff81237e11>] SyS_ioctl+0x81/0xa0
    [  717.223601]  [<ffffffff817a6f29>] system_call_fastpath+0x16/0x1b
    [  717.230399] Code: 85 8c 00 00 00 4c 89 ea 4c 89 e6 48 89 df e8 06 fc ff ff 65 48 8b 04 25 80 c9 00 00 4c 89 f7 c7 80 24 0b 00 00 00 00 00 00 57 9d <0f> 1f 44 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d f3 c3 0f 1f 44 00 
    
    [  744.830791] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:911]
    [  744.838481] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle tun bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables x86_pkg_temp_thermal coretemp kvm_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm snd_hda_intel snd_hda_codec snd_hwdep snd_seq i2c_i801 iwlwifi crct10dif_pclmul crc32_pclmul crc32c_intel snd_seq_device cfg80211 ghash_clmulni_intel ppdev snd_pcm microcode iTCO_wdt iTCO_vendor_support serio_raw winbond_cir rfkill snd_timer rc_core parport_pc sdhci_acpi snd parport e1000e ptp sdhci mmc_core nfsd auth_rpcgss i2c_hid nfs_acl lockd dw_dmac i2c_designware_platform dw_dmac_core i2c_designware_core shpchp pps_core soundcore sunrpc mei_me mei lpc_ich mfd_core i915 i2c_algo_bit drm_kms_helper drm i2c_core video usb_storage
    [  744.925877] irq event stamp: 290621582
    [  744.930118] hardirqs last  enabled at (290621581): [<ffffffffa05395d3>] kvm_arch_vcpu_ioctl_run+0x363/0x1510 [kvm]
    [  744.941856] hardirqs last disabled at (290621582): [<ffffffff817a7c2d>] apic_timer_interrupt+0x6d/0x80
    [  744.952405] softirqs last  enabled at (290616188): [<ffffffff810991ce>] __do_softirq+0x1ce/0x460
    [  744.962372] softirqs last disabled at (290616183): [<ffffffff81099795>] irq_exit+0xc5/0xd0
    [  744.971742] CPU: 1 PID: 911 Comm: qemu-system-x86 Not tainted 3.14.0-0.rc0.git9.1.fc21.x86_64 #1
    [  744.981691] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
    [  744.995868] task: ffff880446ad1ab0 ti: ffff88042cedc000 task.ti: ffff88042cedc000
    [  745.004341] RIP: 0010:[<ffffffff810f4782>]  [<ffffffff810f4782>] lock_release+0xc2/0x310
    [  745.013514] RSP: 0018:ffff88042ceddd38  EFLAGS: 00000246
    [  745.019527] RAX: ffff880446ad1ab0 RBX: 0000000000000001 RCX: 0000000000009a60
    [  745.027605] RDX: ffff8800a18523c0 RSI: 0000000000000000 RDI: 0000000000000246
    [  745.035693] RBP: ffff88042ceddd60 R08: ffff880446ad2610 R09: 0000000000000000
    [  745.043773] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
    [  745.051852] R13: 0000000000000000 R14: ffffffff813b9ceb R15: ffff88042ceddd60
    [  745.059930] FS:  00007f6abfacd700(0000) GS:ffff8800a1800000(0000) knlGS:0000000000000000
    [  745.069092] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  745.075594] CR2: 0000000000000000 CR3: 000000042eb48000 CR4: 00000000001427e0
    [  745.083672] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  745.091751] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  745.099830] Stack:
    [  745.102102]  ffff88042cee0000 0000000000000001 0000000000000000 ffff88042cebc280
    [  745.110512]  ffff88042cebc100 ffff88042cedde38 ffffffffa053a436 ffffffffa05395c7
    [  745.118934]  ffff88042cee0048 ffff88042ceddd90 ffff880446ad1ab0 ffff880446ad1ab0
    [  745.127345] Call Trace:
    [  745.130116]  [<ffffffffa053a436>] kvm_arch_vcpu_ioctl_run+0x11c6/0x1510 [kvm]
    [  745.138190]  [<ffffffffa05395c7>] ? kvm_arch_vcpu_ioctl_run+0x357/0x1510 [kvm]
    [  745.146372]  [<ffffffffa052067c>] ? vcpu_load+0x1c/0xa0 [kvm]
    [  745.152872]  [<ffffffffa05358fe>] ? kvm_arch_vcpu_load+0x4e/0x1e0 [kvm]
    [  745.160363]  [<ffffffffa05209fd>] kvm_vcpu_ioctl+0x2bd/0x670 [kvm]
    [  745.167361]  [<ffffffff810c4ee2>] ? creds_are_invalid.part.1+0x12/0x50
    [  745.174752]  [<ffffffff810c4f41>] ? creds_are_invalid+0x21/0x30
    [  745.181456]  [<ffffffff813415c6>] ? inode_has_perm.isra.51+0x26/0x80
    [  745.188649]  [<ffffffff81237b70>] do_vfs_ioctl+0x300/0x520
    [  745.194861]  [<ffffffff81341c2b>] ? selinux_file_ioctl+0x5b/0x110
    [  745.201758]  [<ffffffff81237e11>] SyS_ioctl+0x81/0xa0
    [  745.207475]  [<ffffffff817a6f29>] system_call_fastpath+0x16/0x1b
    [  745.214274] Code: 85 8c 00 00 00 4c 89 ea 4c 89 e6 48 89 df e8 06 fc ff ff 65 48 8b 04 25 80 c9 00 00 4c 89 f7 c7 80 24 0b 00 00 00 00 00 00 57 9d <0f> 1f 44 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d f3 c3 0f 1f 44 00 
    [  757.911256] INFO: rcu_sched self-detected stall on CPU { 1}  (t=65000 jiffies g=4088 c=4087 q=343)
    [  757.921471] sending NMI to all CPUs:
    [  757.925516] NMI backtrace for cpu 1
    [  757.929474] CPU: 1 PID: 911 Comm: qemu-system-x86 Not tainted 3.14.0-0.rc0.git9.1.fc21.x86_64 #1
    [  757.939423] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
    [  757.953600] task: ffff880446ad1ab0 ti: ffff88042cedc000 task.ti: ffff88042cedc000
    [  757.962073] RIP: 0010:[<ffffffff810ee310>]  [<ffffffff810ee310>] trace_hardirqs_off+0x0/0x10
    [  757.971640] RSP: 0018:ffff8800a1803d88  EFLAGS: 00000086
    [  757.977651] RAX: 0000000000000008 RBX: 0000000000000055 RCX: 0000000000000038
    [  757.985730] RDX: 00000000000000ff RSI: 0000000000000008 RDI: 0000000000000086
    [  757.993809] RBP: ffff8800a1803de0 R08: ffff880448202f88 R09: 0000000000000001
    [  758.001889] R10: 0000000000000000 R11: ffff8800a1803b36 R12: ffff880448202f88
    [  758.009968] R13: ffff8804487ebf60 R14: 000000000000e0c0 R15: 0000000000080000
    [  758.018046] FS:  00007f6abfacd700(0000) GS:ffff8800a1800000(0000) knlGS:0000000000000000
    [  758.027207] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  758.033713] CR2: 0000000000000000 CR3: 000000042eb48000 CR4: 00000000001427e0
    [  758.041792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  758.049869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  758.057948] Stack:
    [  758.060221]  ffffffff8105385b 0000000000000086 00000002a1803df0 000000000000e0b8
    [  758.068633]  0000000800000001 0000000000000001 0000000000002710 ffffffff81c7a6c0
    [  758.077044]  ffffffff81d83720 ffffffff81c7a6c0 0000000000000001 ffff8800a1803df0
    [  758.085458] Call Trace:
    [  758.088224]  <IRQ> 
    [  758.090398]  [<ffffffff8105385b>] ? __x2apic_send_IPI_mask+0x1ab/0x1c0
    [  758.098008]  [<ffffffff8105388c>] x2apic_send_IPI_all+0x1c/0x20
    [  758.104709]  [<ffffffff8104ee44>] arch_trigger_all_cpu_backtrace+0x64/0xa0
    [  758.112494]  [<ffffffff81111424>] rcu_check_callbacks+0x584/0x850
    [  758.119392]  [<ffffffff810a3a77>] update_process_times+0x47/0x70
    [  758.126193]  [<ffffffff8111ce95>] tick_sched_handle.isra.19+0x25/0x60
    [  758.133484]  [<ffffffff8111d681>] tick_sched_timer+0x41/0x60
    [  758.139891]  [<ffffffff810c21a6>] __run_hrtimer+0x86/0x440
    [  758.146099]  [<ffffffff8111d640>] ? tick_sched_do_timer+0x40/0x40
    [  758.152997]  [<ffffffff810c2fa7>] hrtimer_interrupt+0xf7/0x240
    [  758.159601]  [<ffffffff8104ce17>] local_apic_timer_interrupt+0x37/0x60
    [  758.166991]  [<ffffffff817a92ef>] smp_apic_timer_interrupt+0x3f/0x60
    [  758.174184]  [<ffffffff817a7c32>] apic_timer_interrupt+0x72/0x80
    [  758.180983]  <EOI> 
    [  758.183157]  [<ffffffff810f425a>] ? lock_acquire+0xba/0x1d0
    [  758.189689]  [<ffffffffa053a41a>] ? kvm_arch_vcpu_ioctl_run+0x11aa/0x1510 [kvm]
    [  758.197969]  [<ffffffffa053a486>] kvm_arch_vcpu_ioctl_run+0x1216/0x1510 [kvm]
    [  758.206053]  [<ffffffffa053a41a>] ? kvm_arch_vcpu_ioctl_run+0x11aa/0x1510 [kvm]
    [  758.214323]  [<ffffffffa052067c>] ? vcpu_load+0x1c/0xa0 [kvm]
    [  758.220832]  [<ffffffffa05358fe>] ? kvm_arch_vcpu_load+0x4e/0x1e0 [kvm]
    [  758.228323]  [<ffffffffa05209fd>] kvm_vcpu_ioctl+0x2bd/0x670 [kvm]
    [  758.235319]  [<ffffffff810c4ee2>] ? creds_are_invalid.part.1+0x12/0x50
    [  758.242711]  [<ffffffff810c4f41>] ? creds_are_invalid+0x21/0x30
    [  758.249412]  [<ffffffff813415c6>] ? inode_has_perm.isra.51+0x26/0x80
    [  758.256606]  [<ffffffff81237b70>] do_vfs_ioctl+0x300/0x520
    [  758.262818]  [<ffffffff81341c2b>] ? selinux_file_ioctl+0x5b/0x110
    [  758.269715]  [<ffffffff81237e11>] SyS_ioctl+0x81/0xa0
    [  758.275434]  [<ffffffff817a6f29>] system_call_fastpath+0x16/0x1b
    [  758.282233] Code: 0f 1f 00 f3 c3 48 c7 c1 82 a7 a4 81 48 c7 c2 4d 64 a4 81 be 42 0a 00 00 48 c7 c7 53 a7 a4 81 31 c0 e8 45 44 fa ff 5d eb d5 66 90 <55> 48 89 e5 48 8b 7d 08 e8 43 ff ff ff 5d c3 90 55 48 81 ff 00 
    [  758.304183] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 378.664 msecs
    [  758.304185] NMI backtrace for cpu 3
    [  758.304186] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.14.0-0.rc0.git9.1.fc21.x86_64 #1
    [  758.304187] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0109.R03.1301282055 01/28/2013
    [  758.304188] task: ffff880448268000 ti: ffff880448264000 task.ti: ffff880448264000
    [  758.304191] RIP: 0010:[<ffffffff814278ff>]  [<ffffffff814278ff>] intel_idle+0xdf/0x160
    [  758.304191] RSP: 0018:ffff880448265e30  EFLAGS: 00000046
    [  758.304192] RAX: 0000000000000032 RBX: 0000000000000010 RCX: 0000000000000001
    [  758.304192] RDX: 0000000000000000 RSI: ffffffff81cfd560 RDI: 0000000000000003
    [  758.304193] RBP: ffff880448265e58 R08: 00000000006f9072 R09: ffffffffffffffff
    [  758.304193] R10: 000001ced94d755f R11: 0000000225c17d03 R12: 0000000000000005
    [  758.304194] R13: 0000000000000032 R14: 0000000000000004 R15: ffffffff81cfd730
    [  758.304195] FS:  0000000000000000(0000) GS:ffff8800a1c00000(0000) knlGS:0000000000000000
    [  758.304195] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  758.304196] CR2: 00007f748f00d000 CR3: 0000000001c0c000 CR4: 00000000001427e0
    [  758.304196] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  758.304197] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  758.304197] Stack:
    [  758.304198]  00000003815fde00 ffffe8ffff400680 ffffffff81cfd560 0000000000000005
    [  758.304200]  000000b04d0e578d ffff880448265e90 ffffffff815fde10 ffff880448265fd8
    [  758.304201]  ffffe8ffff400680 0000000000000005 0000000000000003 ffffffff81cfd560
    [  758.304201] Call Trace:
    [  758.304205]  [<ffffffff815fde10>] cpuidle_enter_state+0x40/0xd0
    [  758.304206]  [<ffffffff815fdf59>] cpuidle_idle_call+0xb9/0x380
    [  758.304209]  [<ffffffff81024c8e>] arch_cpu_idle+0xe/0x40
    [  758.304212]  [<ffffffff81105d0e>] cpu_startup_entry+0x9e/0x3d0
    [  758.304214]  [<ffffffff8104ae29>] start_secondary+0x1e9/0x290
    [  758.304226] Code: c9 00 00 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 65 48 8b 04 25 70 c9 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <65> 48 8b 04 25 70 c9 00 00 83 a0 3c e0 ff ff fb 0f ae f0 85 1d 
    [  758.304227] NMI backtrace for cpu 0
    .
    .
    .
    ====
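
For triage, the lockup reports in a log like the one above can be pulled out mechanically. What follows is a minimal Python sketch and not part of the original report: the name `find_soft_lockups` and the regular expression are illustrative assumptions, written only against the `BUG: soft lockup` line format visible in this dmesg output.

```python
import re

# Pattern for lines such as:
#   [  716.846912] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:911]
# This is an assumption based on the log in this report; other kernel
# versions may word the message differently.
SOFT_LOCKUP_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(dmesg_text):
    """Return one dict per soft-lockup line found in dmesg_text."""
    events = []
    for match in SOFT_LOCKUP_RE.finditer(dmesg_text):
        events.append({
            "time": float(match.group("time")),      # seconds since boot
            "cpu": int(match.group("cpu")),          # CPU that was stuck
            "stuck_secs": int(match.group("secs")),  # reported stall length
            "comm": match.group("comm"),             # task name
            "pid": int(match.group("pid")),          # task PID
        })
    return events
```

Feeding it the text of a saved dmesg dump (the file name is up to the reader) yields one entry per lockup, which makes it easy to see whether the same CPU and PID are involved each time, as they are in the trace above.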

Expected results
----------------

Successful boot without soft lockups or crashes
Comment 1 Kashyap Chamarthy 2014-01-27 09:19:25 UTC
Created attachment 123501 [details]
Complete stdout of dmesg
Comment 2 Kashyap Chamarthy 2014-01-27 09:20:03 UTC
Created attachment 123511 [details]
Complete stdout of dmidecode
Comment 3 Kashyap Chamarthy 2014-01-27 09:37:36 UTC
Created attachment 123541 [details]
Complete stdout of `x86info -a`
Comment 4 Kashyap Chamarthy 2014-01-31 16:11:36 UTC
Downstream bug -- https://bugzilla.redhat.com/show_bug.cgi?id=1058209
Comment 5 Robert Ho 2014-02-10 09:23:56 UTC
We suspect this has the same root cause as bug 69361:
https://bugzilla.kernel.org/show_bug.cgi?id=69361

We applied the same patch and tested on a Haswell machine; we no longer see this call trace, and boot completes successfully.

Please try the patch in your environment.
Comment 6 Kashyap Chamarthy 2014-02-10 14:06:32 UTC
Created attachment 125521 [details]
Successful stderr of `dmesg` on L0 with Kernel 3.14.0-0.rc1.git4.1.fc21.x86_64

I just tested Kernel 3.14.0-0.rc1.git4.1.fc21.x86_64 on my bare-metal host,
and booted L1 (running 3.14.0-0.rc1.git4.1.fc21.x86_64) and L2 (running
3.11.10-301.fc20.x86_64).

Couple of observations
----------------------

  - I no longer see the soft lockups on the host.
  - L2 is stuck at:

      Booting `Fedora, with Linux 3.11.10-301.fc20.x86_64'
  
    when booted over a serial console. I also tried booting into single-user
    mode; it gets stuck at the same point, on "Booting a command list".

  - To isolate the problematic Kernel, I also tried:
      - Shutting down L2, booting L1 into 3.14.0-0.rc1.git1.1.fc21.x86_64,
        and starting L2 with 3.11.10-301.fc20.x86_64: stuck on boot.

        Repeating the above test with L1 booted into
        3.12.8-300.fc20.x86_64: L2 is again stuck on boot.

    From these symptoms, I infer that the problem is with the current L0
    Kernel (3.14.0-0.rc1.git4.1), because booting L0 into its older Kernel
    (kernel-3.12.8-300.fc20.x86_64) brings up L2 successfully.

I'm yet to investigate further here.
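
The isolation reasoning in this comment can be laid out as a small table and grouped by layer. This is only an illustrative Python sketch: the `RUNS` rows are transcribed from the observations above, `None` marks values the comment does not state, the L0 kernel for the second and third rows is an assumption (the comment implies but does not say that L0 stayed on 3.14.0-0.rc1.git4.1), and `outcomes_by_layer` is a hypothetical helper name.

```python
from collections import defaultdict

# Each row: (L0 kernel, L1 kernel, L2 kernel, did L2 boot?).
# None = not stated in the comment. The L0 value in rows 2 and 3 is assumed.
RUNS = [
    ("3.14.0-0.rc1.git4.1", "3.14.0-0.rc1.git4.1", "3.11.10-301", False),
    ("3.14.0-0.rc1.git4.1", "3.14.0-0.rc1.git1.1", "3.11.10-301", False),
    ("3.14.0-0.rc1.git4.1", "3.12.8-300", "3.11.10-301", False),
    ("3.12.8-300", None, None, True),
]


def outcomes_by_layer(runs, layer):
    """Group L2 boot outcomes by the kernel used at one layer (0=L0, 1=L1, 2=L2)."""
    grouped = defaultdict(set)
    for run in runs:
        grouped[run[layer]].add(run[3])
    return dict(grouped)
```

Grouping by layer 0 separates the failing runs from the passing one cleanly, whereas changing the L1 kernel alone never changes the outcome, which is the basis for pointing at the L0 kernel.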
Comment 7 Kashyap Chamarthy 2014-02-26 20:32:02 UTC
(In reply to Kashyap Chamarthy from comment #6)

[. . .]

> 
> I just tested with this Kernel 3.14.0-0.rc1.git4.1.fc21.x86_64 my bare-metal,
> booted L1 (running: 3.14.0-0.rc1.git4.1.fc21.x86_64) and L2 ( running
> 3.11.10-301.fc20.x86_64')
> 
> Couple of observations
> ----------------------
> 
>   - I don't see the soft lockups any more on host
>   - L2 is just stuck on:
> 
>       Booting `Fedora, with Linux 3.11.10-301.fc20.x86_64'
>   
>     when booted over a serial console. I tried booting into single 
>     level, it's just stuck there, attempting to "Booting a command list"

I still see the same problem with Kernel 3.14.0-0.rc3.git2.1.fc21.x86_64 on both L0 and L1.
Comment 8 Kashyap Chamarthy 2014-02-28 06:01:34 UTC
This is now fixed by this patch[1] from Paolo; see the test result[2].


  [1] https://patchwork.kernel.org/patch/3736391/
  [2] http://article.gmane.org/gmane.comp.emulators.kvm.devel/119406
