Bug 194837 - VM with virtio-scsi drive often crashes during boot with kernel 4.11rc1
Summary: VM with virtio-scsi drive often crashes during boot with kernel 4.11rc1
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI (show other bugs)
Hardware: All Linux
: P1 blocking
Assignee: linux-scsi@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-09 23:53 UTC by Adam Williamson
Modified: 2017-03-17 19:02 UTC (History)
4 users (show)

See Also:
Kernel Version: 4.11rc1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments

Description Adam Williamson 2017-03-09 23:53:52 UTC
In Fedora, we use the openQA automated test system, which runs qemu VMs and does tests in them. By default, it attaches optical disc images to the test VM using a virtio-scsi drive.

When Fedora 26 and Rawhide's kernels went from kernel-4.11.0-0.rc0.git9.1 to kernel-4.11.0-0.rc1.git0.1 , many openQA tests suddenly started failing because at some point in the test, the VM would fail to boot properly, with a kernel error and traceback often displayed (sometimes the screen would just be blank). I've seen three variants on this failure so far. Two have identical-looking tracebacks but a slightly different error message:

https://openqa.fedoraproject.org/tests/60571#step/_console_wait_login/7
https://openqa.fedoraproject.org/tests/60572#step/_console_wait_login/6

note that one error is 'unable to handle kernel paging request' and the other is 'unable to handle kernel NULL pointer dereference', but the tracebacks look very similar.

Another case shows a somewhat different traceback:

https://openqa.fedoraproject.org/tests/60574#step/_console_wait_login/4

but doesn't show an error message (it may just be in the backscroll, unfortunately there's no way to recover it from that test now). The traceback is still in SCSI code, however.

I can reproduce this problem manually using virt-manager, so long as I attach a SCSI optical drive to the VM (Add Hardware, Device type -> "CDROM device", Bus type -> "SCSI"). If I use an IDE optical drive (which is the default in virt-manager), the bug does not occur. So long as a SCSI optical drive is attached, about half of the attempts to boot the system with the affected kernel fail. Usually with a traceback looking like the ones from openQA, sometimes it also just apparently hangs when enumerating SCSI devices (I think that's what it's doing, the last line is 'scsi 3:0:0:0: CD-ROM     QEMU   QEMU CD-ROM   2.0. PQ: 0 ANSI: 5' or a bit after that).
Comment 1 Adam Williamson 2017-03-09 23:58:57 UTC
Downstream bug is https://bugzilla.redhat.com/show_bug.cgi?id=1430043 , Fedora kernel maintainers can provide info on the delta between the affected kernel builds, I hope. The changelog just says:

* Mon Mar 06 2017 Laura Abbott - 4.11.0-0.rc1.git0.1
- Linux v4.11-rc1

* Mon Mar 06 2017 Laura Abbott
- Disable debugging options.

* Fri Mar 03 2017 Laura Abbott - 4.11.0-0.rc0.git9.1
- Linux v4.10-11319-gc82be9d

so the delta is to 'v4.10-11319-gc82be9d', but I dunno what that means exactly.
Comment 2 Adam Williamson 2017-03-10 00:47:12 UTC
I seem to get a different traceback every time this happens in local testing, aside from the times it just hangs - but it always seems to hang or crash at approximately the same point in the boot process, right after the "virtio-pci 0000:00:0a.0: virtio_pci: leaving for legacy driver" messages. The last one I got was:

[    1.327442] Kernel panic - not syncing: CRED: put_cred_rcu() sees ffff938bb584c000 with usage 1953655158
[    1.327442] 
[    1.328949] CPU: 1 PID: 21 Comm: rcuos/1 Not tainted 4.11.0-0.rc1.git0.1.fc26.x86_64 #1
[    1.328949] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.331157] Call Trace:
[    1.331157]  dump_stack+0x63/0x84
[    1.331157]  panic+0xe4/0x22d
[    1.331157]  ? __prepare_to_swait+0x52/0x80
[    1.331157]  put_cred_rcu+0xb0/0xb0
[    1.331157]  rcu_nocb_kthread+0x15f/0x500
[    1.335141]  ? get_state_synchronize_sched+0x20/0x20
[    1.335141]  kthread+0x11e/0x140
[    1.335141]  ? kthread_park+0x90/0x90
[    1.335141]  ret_from_fork+0x2c/0x40
[    1.335141] Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    1.335141] ---[ end Kernel panic - not syncing: CRED: put_cred_rcu() sees ffff938bb584c000 with usage 1953655158

aside from that I've seen:

[    1.419027] Call Trace:
[    1.435026]  <IRQ>
[    1.435026]  mpage_end_io+0x39/0x60
[    1.435026]  bio_endio+0x54/0x60
[    1.435026]  blk_update_request+0x8d/0x340
[    1.435026]  blk_mq_end_request+0x1a/0x80
[    1.435026]  virtblk_request_done+0x45/0x80 [virtio_blk]
[    1.435026]  __blk_mq_complete_request+0xc9/0x120
[    1.435026]  blk_mq_complete_request+0x1e/0x20
[    1.435026]  virtblk_done+0x74/0x100 [virtio_blk]
[    1.435026]  vring_interrupt+0x34/0x80 [virtio_ring]
[    1.435026]  __handle_irq_event_percpu+0x3f/0x1a0
[    1.435026]  handle_irq_event_percpu+0x32/0x80
[    1.435026]  handle_irq_event+0x2c/0x50
[    1.435026]  handle_edge_irq+0x6f/0x120
[    1.435026]  handle_irq+0xad/0x130
[    1.435026]  do_IRQ+0x46/0xd0
[    1.435026]  common_interrupt+0x93/0x93
[    1.435026] RIP: 0010:native_safe_halt+0x6/0x10
[    1.435026] RSP: 0018:ffffffffaae03de0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff7e
[    1.435026] RAX: 0000000000000000 RBX: ffffffffaae104c0 RCX: 0000000000000000
[    1.435026] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    1.435026] RBP: ffffffffaae03de0 R08: 000000001c32e304 R09: ffff883d767b1f00
[    1.435026] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    1.435026] R13: ffffffffaae104c0 R14: 0000000000000000 R15: 0000000000000000
[    1.435026]  </IRQ>
[    1.435026]  default_idle+0x20/0xe0
[    1.435026]  arch_cpu_idle+0xf/0x20
[    1.435026]  default_idle_call+0x23/0x30
[    1.435026]  do_idle+0x170/0x200
[    1.435026]  cpu_startup_entry+0x71/0x80
[    1.435026]  rest_init+0x77/0x80
[    1.435026]  start_kernel+0x45a/0x47b
[    1.435026]  ? early_idt_handler_array+0x120/0x120
[    1.435026]  x86_64_start_reservations+0x24/0x26
[    1.435026]  x86_64_start_kernel+0x13c/0x15f
[    1.435026]  start_cpu+0x14/0x14
[    1.435026] Code: ff c3 b8 f8 ff ff ff c3 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53 48 89 fb 48 83 ec 08 40 84 f6 75 12 85 d2 75 21 <48> 8b 47 20 a8 01 75 66 f0 80 0f 08 eb 2a 85 d2 75 35 48 89 df 
[    1.435026] RIP: page_endio+0x1a/0xa0 RSP: ffff883dbfc03d68
[    1.435026] CR2: 0000000000000020
[    1.435026] ---[ end trace cf261be33b5d0d05 ]---
[    1.435026] Kernel panic - not syncing: Fatal exception in interrupt
[    1.435026] Kernel Offset: 0x29000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    1.435026] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

and:

[    1.530432] general protection fault: 0000 [#1] SMP
[    1.531004] Modules linked in: virtio_scsi(+) virtio_console qxl drm_kms_helper ttm drm crc32c_intel sym53c8xx serio_raw scsi_transport_spi virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg
[    1.531004] CPU: 0 PID: 314 Comm: systemd-udevd Not tainted 4.11.0-0.rc1.git0.1.fc26.x86_64 #1
[    1.531004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.531004] task: ffff9c49760c0000 task.stack: ffffbfbcc05d8000
[    1.531004] RIP: 0010:kmem_cache_alloc+0x84/0x1b0
[    1.531004] RSP: 0018:ffffbfbcc05dbe60 EFLAGS: 00010246
[    1.531004] RAX: 2d316f6974726976 RBX: 00000000014000c0 RCX: 00007fcf9e251b30
[    1.531004] RDX: 0000000000002192 RSI: 00000000014000c0 RDI: 000000000001d780
[    1.531004] RBP: ffffbfbcc05dbe90 R08: ffff9c49bfc1d780 R09: 0000000000000077
[    1.531004] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000014000c0
[    1.531004] R13: ffff9c49bd098140 R14: 2d316f6974726976 R15: ffff9c49bd098140
[    1.531004] FS:  00007fcf9f3398c0(0000) GS:ffff9c49bfc00000(0000) knlGS:0000000000000000
[    1.531004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.531004] CR2: 00007ffe1682ffc8 CR3: 000000003617c000 CR4: 00000000000006f0
[    1.531004] Call Trace:
[    1.531004]  ? prepare_creds+0x26/0xf0
[    1.531004]  prepare_creds+0x26/0xf0
[    1.531004]  SyS_access+0x41/0x230
[    1.531004]  do_syscall_64+0x67/0x170
[    1.531004]  entry_SYSCALL64_slow_path+0x25/0x25
[    1.531004] RIP: 0033:0x7fcf9df8651a
[    1.531004] RSP: 002b:00007ffe16830f88 EFLAGS: 00000246 ORIG_RAX: 0000000000000015
[    1.531004] RAX: ffffffffffffffda RBX: 0000556016def330 RCX: 00007fcf9df8651a
[    1.531004] RDX: 00316f6974726976 RSI: 0000000000000000 RDI: 00007ffe16830f90
[    1.531004] RBP: 00007ffe16831030 R08: 0000000000001000 R09: 0000000000000077
[    1.531004] R10: 00007fcf9e251b30 R11: 0000000000000246 R12: 00007ffe16830f90
[    1.531004] R13: 0000556016dea880 R14: 00000000ffffffea R15: 0000556015e1baf8
[    1.531004] Code: 49 83 78 10 00 4d 8b 30 0f 84 fe 00 00 00 4d 85 f6 0f 84 f5 00 00 00 49 63 47 20 49 8b 3f 4c 01 f0 40 f6 c7 0f 0f 85 21 01 00 00 <48> 8b 18 48 8d 4a 01 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 74 
[    1.531004] RIP: kmem_cache_alloc+0x84/0x1b0 RSP: ffffbfbcc05dbe60
[    1.561110] ---[ end trace b9396075a1c8bec4 ]---

that one actually shows *12* general protection faults, all with the same traceback.
Comment 3 Thorsten Leemhuis 2017-03-13 15:04:36 UTC
Seeing this also (RHEL Host, KVM)

Adam, did you report this here only or also to some mailinglist?
Comment 4 Thorsten Leemhuis 2017-03-13 15:15:19 UTC
(In reply to Adam Williamson from comment #1)
> so the delta is to 'v4.10-11319-gc82be9d', but I dunno what that means
> exactly.

FWIW, most likely suspects afaics: the second batches with changes to KVM and SCSI
https://git.kernel.org/torvalds/c/2d62e0768d3c28536d4cfe4c40ba1e5e8e442a93
https://git.kernel.org/torvalds/c/a3b4924b027f9a4b95ce89a914c1e0459e76f18a
Comment 5 Adam Williamson 2017-03-13 16:01:30 UTC
Only here and downstream Bugzilla. I'm not really signed up to any kernel mailing lists. Please do feel free to forward it anywhere appropriate...
Comment 6 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-03-14 19:25:12 UTC
FWIW, the culprit is somewhere between 10851-g20b83643abbc and 11620-ga3b4924b027f
Comment 7 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-03-14 21:23:49 UTC
Leaving the keyboard now; I'm down to 1827adb11ad2...590dce2d4934 (includes e0d072250a54669dce876d8ade70e417356aae74 Merge branch 'for-linus' of git://git.kernel.dk/linux-block )
Comment 8 Omar Sandoval 2017-03-14 22:56:29 UTC
This was fixed in v4.11-rc2.
Comment 9 Thorsten Leemhuis 2017-03-15 06:57:52 UTC
(In reply to Omar Sandoval from comment #8)
> This was fixed in v4.11-rc2.

Care to provide a few more details and a commit that fixes this? Because I'm still seeing crashes with 4.11-rc2; but a few details looks fishy and I more and more wonder if I'm dealing with two bugs here :-/
Comment 10 Thorsten Leemhuis 2017-03-15 07:02:43 UTC
(In reply to Thorsten Leemhuis from comment #9)
> (In reply to Omar Sandoval from comment #8)
>> This was fixed in v4.11-rc2.
> Care to provide a few more details and a commit that fixes this?

Okay, now assuming you mean https://git.kernel.org/torvalds/c/737f98cfe7de8df7433a4d846850aa8efa44bd48 Will investigate…
Comment 11 Omar Sandoval 2017-03-15 07:07:32 UTC
The fix was https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df23de55615fa7a190a85f49a950ccecdd9102f3, which was part of this patch series:

c01228db4ba9 Revert "scsi, block: fix duplicate bdi name registration crashes"
90f16fddcc28 block: Make del_gendisk() safer for disks without queues
df23de55615f bdi: Fix use-after-free in wb_congested_put()
b6f8fec4448a block: Allow bdi re-registration

If you are still seeing crashes on -rc2, could you please report it to linux-block@vger.kernel.org?
Comment 12 Thorsten Leemhuis 2017-03-15 07:47:05 UTC
(In reply to Omar Sandoval from comment #11)

> The fix was [...]

Thx for providing details!

> If you are still seeing crashes on -rc2,

Yes, but different ones every few boot attempts. But I just noticed those crashes happen without virtio-scsi as well. So definitely something else now. 4.10 otoh works well; just retested to be sure. Sigh, I guess I need to take a closer look to find out what's wrong; that might take a day or two.
Comment 13 Thorsten Leemhuis 2017-03-16 12:49:44 UTC
(In reply to Omar Sandoval from comment #11)
>
> If you are still seeing crashes on -rc2, could you please report it to
> linux-block@vger.kernel.org?

FWIW, these are the crashes I see: https://plus.google.com/+ThorstenLeemhuis/posts/FjyyGjNtrrG But those happen without virtio-scsi as well.
Comment 14 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-03-16 16:18:34 UTC
Status update: Had a hunch and reverted the vhost/virtio changes (https://git.kernel.org/torvalds/c/54d7989f476ca57fc3c5cc71524c480ccb74c481 ) on top of master and now it's running fine. Will look closer now which of the changes is the culprit.
Comment 15 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-03-17 19:02:59 UTC
Adam or somebody else: Can you please close this Bug to avoid confusion? There clearly were some bugs in virtio_scsi, but those were already fixed. 

But there are more bugs (at least for me): I bisectected the problem I'm still seeing with todays mainline and found that "virtio_pci: use shared interrupts for virtqueues" (5c34d002dcc7; from hch and mst) was the first bad commit for me. Filed Bug 194911 to track that.

Note You need to log in before you can comment on or make changes to this bug.