Bug 196685 - System becomes unresponsive after intel_iommu kernel error
Summary: System becomes unresponsive after intel_iommu kernel error
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-08-17 04:49 UTC by Jason Hicks
Modified: 2017-08-23 07:04 UTC (History)
3 users (show)

See Also:
Kernel Version: 4.12.8
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Kernel error in /var/log/kern.log (28.77 KB, text/plain)
2017-08-17 04:49 UTC, Jason Hicks
Details
lspci -vt (8.76 KB, text/plain)
2017-08-17 04:50 UTC, Jason Hicks
Details
inxi -F (2.56 KB, text/plain)
2017-08-17 04:52 UTC, Jason Hicks
Details
iommu groups (12.32 KB, text/plain)
2017-08-17 05:00 UTC, Jason Hicks
Details
dmesg | grep DMAR (1.71 KB, text/plain)
2017-08-17 05:02 UTC, Jason Hicks
Details
dmesg Aug 16 (1018.49 KB, text/plain)
2017-08-17 05:27 UTC, Jason Hicks
Details
oops trace from logs (6.04 KB, text/plain)
2017-08-19 17:03 UTC, Jeff Cook
Details
lspci -vt | Aug 19 2017 (on 4.11.9) (16.99 KB, text/plain)
2017-08-19 17:11 UTC, Jeff Cook
Details
dmesg aug 18-aug 20 (4.90 MB, text/plain)
2017-08-20 13:03 UTC, Jason Hicks
Details
dmesg WARN vmx_deliver_posted_interrupt (123.90 KB, text/plain)
2017-08-21 05:47 UTC, Jeff Cook
Details

Description Jason Hicks 2017-08-17 04:49:05 UTC
Created attachment 257969 [details]
Kernel error in /var/log/kern.log

I using a qemu virtual machine with VGA pass through. I have two Intel NICs in use by my motherboard, and I am passing through one of them so I can network with the VM over gigabit ethernet. I have been using this without issues for several months, but recently my system started becoming unstable. Before each crash, the network connection would abruptly end, and I suspect that is related to this. My desktop environment locks up, and after logging into my user account in another terminal, it is flooded with UFW block messages and no commands using sudo return anything. A kernel error appearing to be caused by intel_iommu is left in the kernel log. After this has happened yesterday, I ran memtest86 and my memory is okay. I am using an engineering sample CPU that is a down-clocked Intel E5 2680 v4. I haven't had any issues with this chip except recently when using VT-d.
Comment 1 Jason Hicks 2017-08-17 04:50:42 UTC
Created attachment 257971 [details]
lspci -vt
Comment 2 Jason Hicks 2017-08-17 04:52:02 UTC
Created attachment 257973 [details]
inxi -F

NVIDIA Device 1b81 is a Geforce GTX 1070 which is passed through to the VM.
Comment 3 Jason Hicks 2017-08-17 05:00:30 UTC
Created attachment 257975 [details]
iommu groups

Output of `for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do echo "IOMMU group $(basename "$iommu_group")"; for device in $(\ls -1 "$iommu_group"/devices/); do echo -n $'\t'; lspci -nns "$device"; done; done`
Comment 4 Jason Hicks 2017-08-17 05:02:02 UTC
Created attachment 257977 [details]
dmesg | grep DMAR
Comment 5 Du, Changbin 2017-08-17 05:13:58 UTC
(In reply to Jason Hicks from comment #0)
> Created attachment 257969 [details]
> Kernel error in /var/log/kern.log
> 
> I using a qemu virtual machine with VGA pass through. I have two Intel NICs
> in use by my motherboard, and I am passing through one of them so I can
> network with the VM over gigabit ethernet. I have been using this without
> issues for several months, but recently my system started becoming unstable.
> Before each crash, the network connection would abruptly end, and I suspect
> that is related to this. My desktop environment locks up, and after logging
> into my user account in another terminal, it is flooded with UFW block
> messages and no commands using sudo return anything. A kernel error
> appearing to be caused by intel_iommu is left in the kernel log. After this
> has happened yesterday, I ran memtest86 and my memory is okay. I am using an
> engineering sample CPU that is a down-clocked Intel E5 2680 v4. I haven't
> had any issues with this chip except recently when using VT-d.

Checking your dmesg, there are several kernel oops from different modules. Can you share a full dmesg start from boot, to get the original error?
Comment 6 Jason Hicks 2017-08-17 05:27:33 UTC
Created attachment 257979 [details]
dmesg Aug 16

Log from boot to the error event.
Comment 7 Jason Hicks 2017-08-17 05:30:17 UTC
I should also add I am using Docker, so there are a lot of bridged network connections originating from it. I usually can work in the VM for a few hours before it crashes.
Comment 8 Du, Changbin 2017-08-18 10:14:20 UTC
If my code match yours, at least there 2 issues:
First, kvm detected a warnning condition when delivering irq to guest.
Second, the WARN caused a paging request oops when executing the trap handler of the UD0 instrunction.

[36330.176867] BUG: unable to handle kernel paging request at ffffffffc0633f92
[36330.176888] IP: report_bug+0x94/0x120
[36330.176891] PGD 168c20c067 
[36330.176893] P4D 168c20c067 
[36330.176895] PUD 168c20e067 
[36330.176898] PMD 2008194067 
[36330.176901] PTE 80000020017bc161
[36330.176904] 
[36330.176911] Oops: 0003 [#1] SMP
[36330.176913] Modules linked in: xt_recent xt_nat veth nls_utf8 udf crc_itu_t rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter overlay vmw_vsock_vmci_transport vsock vmw_vmci bridge stp llc bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi nvidia_uvm(POE) arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_intel snd_hda_codec btusb btrtl btbcm btintel aesni_intel iwlmvm bluetooth nvidia_drm(POE) aes_x86_64 nvidia_modeset(POE) snd_hda_core crypto_simd input_leds usblp snd_hwdep hid_sony glue_helper joydev ff_memless ecdh_generic cryptd nvidia(POE) mac80211 snd_pcm intel_cstate drm_kms_helper snd_seq_midi
[36330.176991]  snd_seq_midi_event intel_rapl_perf drm snd_rawmidi iwlwifi fb_sys_fops mxm_wmi snd_seq syscopyarea sysfillrect sysimgblt snd_seq_device snd_timer cfg80211 snd mei_me mei lpc_ich soundcore shpchp wmi mac_hid kvm_intel kvm ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_multiport xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c iptable_filter ip_tables x_tables parport_pc ppdev lp parport autofs4 btrfs xor raid6_pq dm_mirror dm_region_hash dm_log vhost_net vhost tap vfio_pci vfio_virqfd irqbypass vfio_iommu_type1 vfio hid_logitech_hidpp
[36330.177074]  hid_logitech_dj hid_generic usbhid hid uas usb_storage igb e1000e dca ptp ahci i2c_algo_bit pps_core libahci
[36330.177099] CPU: 13 PID: 8711 Comm: qemu-system-x86 Tainted: P           OE   4.12.7-041207-generic #201708160856
[36330.177103] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Taichi, BIOS P1.10 06/29/2016
[36330.177107] task: ffff94808147c380 task.stack: ffffa46262f7c000
[36330.177114] RIP: 0010:report_bug+0x94/0x120
[36330.177118] RSP: 0018:ffffa46262f7f7a0 EFLAGS: 00010202
[36330.177123] RAX: 0000000000000907 RBX: ffffa46262f7f908 RCX: ffffffffc0633f88
[36330.177127] RDX: 0000000000000001 RSI: 0000000000001359 RDI: 0000000000000001
[36330.177130] RBP: ffffa46262f7f7c0 R08: ffffa46262f80000 R09: 00000000000003ce
[36330.177134] R10: ffffffff83206a80 R11: fffff803741fea0f R12: ffffffffc0614073
[36330.177137] R13: ffffffffc0633a98 R14: 0000000000000004 R15: ffffa46262f7f908
[36330.177153] FS:  00007f5b08c67700(0000) GS:ffff94841f140000(0000) knlGS:00000000003f2000
[36330.177157] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[36330.177161] CR2: ffffffffc0633f92 CR3: 0000001c883de000 CR4: 00000000003426e0
[36330.177165] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[36330.177168] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[36330.177171] Call Trace:
[36330.177186]  ? vmx_deliver_posted_interrupt+0xe3/0xf0 [kvm_intel]
[36330.177196]  fixup_bug+0x2e/0x50
[36330.177203]  do_trap+0x119/0x150
[36330.177209]  do_error_trap+0x89/0x110
[36330.177218]  ? vmx_deliver_posted_interrupt+0xe3/0xf0 [kvm_intel]
[36330.177224]  ? check_preempt_curr+0x74/0x80
[36330.177229]  ? ttwu_do_wakeup+0x1e/0x140
[36330.177233]  ? ttwu_do_activate+0x77/0x80
[36330.177240]  do_invalid_op+0x20/0x30
[36330.177247]  invalid_op+0x1e/0x30
[36330.177255] RIP: 0010:vmx_deliver_posted_interrupt+0xe3/0xf0 [kvm_intel]
[36330.177259] RSP: 0018:ffffa46262f7f9b0 EFLAGS: 00010202
[36330.177263] RAX: 00000c0000f20003 RBX: ffff948081448000 RCX: 0000000000000000
[36330.177267] RDX: ffff94808814f1e0 RSI: 00000000000000d2 RDI: ffff948081448000
[36330.177270] RBP: ffffa46262f7f9b8 R08: 0000000000000000 R09: 0000000000000000
[36330.177273] R10: fffff8037400cf10 R11: fffff803741fea0f R12: 00000000000000d2
[36330.177277] R13: ffff94831fcf6700 R14: 0000000000000000 R15: ffff948081448000
[36330.177326]  __apic_accept_irq+0x25b/0x300 [kvm]
[36330.177351]  kvm_apic_set_irq+0x2a/0x30 [kvm]
[36330.177374]  kvm_irq_delivery_to_apic+0x20b/0x2a0 [kvm]
[36330.177396]  kvm_lapic_reg_write+0x112/0x640 [kvm]
[36330.177417]  kvm_hv_vapic_msr_write+0x34/0x70 [kvm]
[36330.177439]  kvm_hv_set_msr_common+0x303/0x9d0 [kvm]
[36330.177461]  kvm_set_msr_common+0x64c/0xc30 [kvm]
[36330.177484]  ? x86_emulate_insn+0x161/0xdd0 [kvm]
[36330.177491]  vmx_set_msr+0xba/0x7d0 [kvm_intel]
[36330.177510]  kvm_set_msr+0x41/0x70 [kvm]
[36330.177516]  handle_wrmsr+0x59/0x140 [kvm_intel]
[36330.177521]  vmx_handle_exit+0xab/0xb40 [kvm_intel]
[36330.177527]  ? vmx_vcpu_run+0x2d7/0x460 [kvm_intel]
[36330.177547]  vcpu_enter_guest+0x440/0x1120 [kvm]
[36330.177567]  ? x86_emulate_instruction+0x586/0x6e0 [kvm]
[36330.177573]  ? __set_task_blocked+0x3c/0xa0
[36330.177594]  kvm_arch_vcpu_ioctl_run+0x210/0x3f0 [kvm]
[36330.177612]  ? kvm_arch_vcpu_ioctl_run+0x210/0x3f0 [kvm]
[36330.177628]  kvm_vcpu_ioctl+0x2a6/0x620 [kvm]
[36330.177633]  ? do_futex+0x30a/0x510
[36330.177638]  do_vfs_ioctl+0xa5/0x600
[36330.177657]  ? kvm_on_user_return+0x6f/0xa0 [kvm]
[36330.177661]  SyS_ioctl+0x79/0x90
[36330.177665]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[36330.177668] RIP: 0033:0x7f5b27557f07
[36330.177670] RSP: 002b:00007f5b08c66908 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[36330.177674] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f5b27557f07
[36330.177676] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000012
[36330.177678] RBP: 0000009fdb7405a0 R08: 0000009fd9118550 R09: 00000000000000ff
[36330.177680] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[36330.177682] R13: 00007f5b2c6e3000 R14: 0000000000000000 R15: 0000009fdb7405a0
[36330.177685] Code: 74 59 0f b7 41 0a 4c 63 69 04 0f b7 71 08 89 c7 49 01 cd 83 e7 01 a8 02 74 15 66 85 ff 74 10 a8 04 ba 01 00 00 00 75 26 83 c8 04 <66> 89 41 0a 66 85 ff 74 49 0f b6 49 0b 4c 89 e2 45 31 c9 49 89 
[36330.177735] RIP: report_bug+0x94/0x120 RSP: ffffa46262f7f7a0
[36330.177737] CR2: ffffffffc0633f92
[36330.177740] ---[ end trace 5d67fb9cae2ae6de ]---
Comment 9 Jeff Cook 2017-08-19 16:58:34 UTC
I've been getting a very similar crash on 4.12.x, also making heavy use of PCI passthrough. Came across this bug, as I've been searching regularly looking for others encountering the same troubles.

A snippet of the log observed on 4.12.{3/5}:

kernel: BUG: unable to handle kernel paging request at ffffffffc0955f96
kernel: IP: report_bug+0x94/0x120
kernel: PGD a67a0c067 
kernel: P4D a67a0c067 
kernel: PUD a67a0e067 
kernel: PMD 495886067 
kernel: PTE 800000049bbdd161

Last night I experienced a hard lock with 4.11.9, which has heretofore been stable. The system did not log any information about that crash. This may mean that the issue existed prior to 4.12, but was just much harder to trigger.

See also my related bug report at https://bugs.archlinux.org/task/55141.
Comment 10 Jeff Cook 2017-08-19 17:03:15 UTC
Created attachment 258019 [details]
oops trace from logs

On this particular error condition, the passthrough'd VM locks, but the host system and other VMs initially remain responsive via SSH. They too fail after issuing a few commands. Messages indicating a stall are emitted regularly until the machine is rebooted:

kernel: INFO: rcu_preempt detected stalls on CPUs/tasks:
kernel: Tasks blocked on level-1 rcu_node (CPUs 0-15): P2837
kernel: (detected by 23, t=990092 jiffies, g=2933189, c=2933188, q=9740206)
Comment 11 Jeff Cook 2017-08-19 17:11:56 UTC
Created attachment 258021 [details]
lspci -vt | Aug 19 2017 (on 4.11.9)

Attaching my lspci -vt for comparison, from 4.11.9. Of note, I am also using a GeForce GTX 1070 for VGA passthrough.
Comment 12 Jason Hicks 2017-08-20 13:03:41 UTC
Created attachment 258029 [details]
dmesg aug 18-aug 20

This is an updated dmesg log from boot to kernel oops errors. This shows the system being on for a couple days with the VM running with no issues, but finally when left on over night, and then using the VM this morning (August 20), it flooded the log with call stacks for a while before my computer completely froze. In addition, hardware differences between this and my last complete dmesg log include removed USB devices - a 8GB usb drive, and a Sony controller was removed.
Comment 13 Jeff Cook 2017-08-21 05:42:36 UTC
For the record. I have been running "Linux kvm_master 4.13.0-rc5-g58d4e450a490 #1 SMP PREEMPT", a git build from a commit made on Friday (8/19/2017), for the last 36 hours or so. On this commit, it appears that I received the warning message discussed here, but nothing crashed and the system continues to function in what appears to be a normal manner.

I will attach full dmesg for reference. If this is indeed the same condition and the trap is successful, the crasher may have been fixed somewhere in the 4.13 cycle.
Comment 14 Jeff Cook 2017-08-21 05:47:58 UTC
Created attachment 258033 [details]
dmesg WARN vmx_deliver_posted_interrupt

Full output of dmesg on Linux 4.13.0-rc5-g58d4e450a490, boot to WARN and onward. There have been no apparent negative effects since the error was logged approximately 20 hours ago (caveat: have not used heavily today).
Comment 15 Du, Changbin 2017-08-22 07:47:21 UTC
(In reply to Jeff Cook from comment #13)
> For the record. I have been running "Linux kvm_master
> 4.13.0-rc5-g58d4e450a490 #1 SMP PREEMPT", a git build from a commit made on
> Friday (8/19/2017), for the last 36 hours or so. On this commit, it appears
> that I received the warning message discussed here, but nothing crashed and
> the system continues to function in what appears to be a normal manner.
> 
> I will attach full dmesg for reference. If this is indeed the same condition
> and the trap is successful, the crasher may have been fixed somewhere in the
> 4.13 cycle.

The oops has been fixed by https://py3.patchwork.dja.id.au/patch/41163/. This patch should be backported to stable branches.
Comment 16 Du, Changbin 2017-08-22 09:01:07 UTC
I have send a request to stable queue. Suppose it will be fixed in next 4.11.x and 4.12.x version. This issue is introduced from v4.11-rc3.
Comment 17 Jeff Cook 2017-08-22 15:00:45 UTC
That's great news! It sounds like that solves the first half of the problem (the crash), and that we still need to pin down the warning? Or is the warning reflective of a bad hardware condition?

Separately, I've also been experiencing https://bugzilla.kernel.org/show_bug.cgi?id=196717. Thought I'd mention it here since there may be a cross-section of subscribers here experiencing it too.

Note You need to log in before you can comment on or make changes to this bug.