Bug 207575

Summary: NULL pointer dereference in nvme reset work-queue when VMD raid mode and SecureBoot turned on simultaneously on TigerLake
Product: Drivers Reporter: You-Sheng Yang (vicamo)
Component: IOMMU Assignee: drivers_iommu
Status: NEW ---    
Severity: normal CC: jonathan.derrick, kai.heng.feng, perry_yuan
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 5.7-rc3 Subsystem:
Regression: No Bisected commit-id:
Attachments: syslog, v5.7-rc3
lspci, v5.7-rc3
allocate info for vmd child devices

Description You-Sheng Yang 2020-05-04 10:47:44 UTC
Created attachment 288893 [details]
syslog, v5.7-rc3

This was found on a Dell TigerLake platform. When VMD raid mode is turned on together with SecureBoot (in either deploy mode or audit mode), the kernel dumps warnings and NULL pointer dereference errors at boot. While this happens, systemd-udevd worker processes are blocked until they are killed on timeout. The system still boots to multi-user.target.

A kernel bisect shows that commit e3560ee4cfb2 ("iommu/vt-d: Remove VMD child device sanity check"), merged in v5.6-rc1, is the first failing commit. The issue is still reproducible on v5.7-rc3.

kernel: Secure boot disabled
...
kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 1 PID: 8 at drivers/iommu/intel-iommu.c:625 domain_get_iommu+0x4b/0x60
kernel: Modules linked in: rc_core r8169(+) intel_lpss nvme crc32_pclmul(+) psmouse intel_ish_ipc(+) i2c_hid i2c_i801(+) realtek idma64 drm virt_dma intel_ishtp vmd(+) nvme_core hid video wmi pinctrl_tigerlake pinctrl_intel
kernel: CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.7.0-050700rc3-generic #202004262131
kernel: Hardware name: Dell Inc. Vostro 5402/, BIOS 0.1.2 04/13/2020
kernel: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
kernel: RIP: 0010:domain_get_iommu+0x4b/0x60
kernel: Code: eb 22 48 8d 50 01 48 39 c8 74 1b 48 89 d0 8b 74 87 04 48 63 d0 85 f6 74 e9 48 8b 05 ef 63 63 01 48 8b 04 d0 5d c3 31 c0 5d c3 <0f> 0b 31 c0 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f
kernel: RSP: 0018:ffffb5f5c00fbcf8 EFLAGS: 00010202
kernel: RAX: ffff9ebdaf100b00 RBX: 0000000000000000 RCX: 0000000000000000
kernel: RDX: 0000000000001000 RSI: 000000036dd41000 RDI: ffff9ebdaf100b00
kernel: RBP: ffffb5f5c00fbcf8 R08: ffffffffffffffff R09: ffff9ebdadd41000
kernel: R10: ffffffff8d069060 R11: 0000000000004879 R12: ffff9ebdadd810b0
kernel: R13: 000000036dd41000 R14: ffffffffffffffff R15: ffff9ebdaf100b00
kernel: FS:  0000000000000000(0000) GS:ffff9ebdc1680000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f900d20e660 CR3: 000000036e05a003 CR4: 0000000000760ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  __intel_map_single+0x47/0x1a0
kernel:  intel_alloc_coherent+0xab/0x120
kernel:  dma_alloc_attrs+0x4d/0x60
kernel:  nvme_alloc_queue+0x63/0x180 [nvme]
kernel:  nvme_reset_work+0x31a/0xa64 [nvme]
kernel:  ? wake_up_process+0x15/0x20
kernel:  ? swake_up_locked.part.0+0x17/0x30
kernel:  process_one_work+0x1e8/0x3b0
kernel:  worker_thread+0x4d/0x400
kernel:  kthread+0x104/0x140
kernel:  ? process_one_work+0x3b0/0x3b0
kernel:  ? kthread_park+0x90/0x90
kernel:  ret_from_fork+0x1f/0x40
kernel: ---[ end trace caf06459a58aa8d4 ]---
....
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x0000) - not-present page
kernel: PGD 0 P4D 0 
kernel: Oops: 0000 [#2] SMP NOPTI
kernel: CPU: 1 PID: 254 Comm: kworker/u8:4 Tainted: G      D W         5.7.0-050700rc3-generic #202004262131
kernel: Hardware name: Dell Inc. Vostro 5402/, BIOS 0.1.2 04/13/2020
kernel: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
kernel: RIP: 0010:__intel_map_single+0xa3/0x1a0
kernel: Code: 89 d2 4c 89 55 d0 e8 ec b3 ff ff 4c 8b 55 d0 48 85 c0 49 89 c6 0f 84 e9 00 00 00 41 b9 01 00 00 00 83 fb 01 76 14 48 8b 45 c0 <4c> 8b 48 18 49 c1 e9 16 49 83 f1 01 41 83 e1 01 44 89 c8 4c 89 e9
kernel: RSP: 0018:ffffb5f5c050f878 EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff9ebdbf205140
kernel: RDX: ffff9ebdbf205bc0 RSI: 0000000000000257 RDI: ffff9ebdaf100e30
kernel: RBP: ffffb5f5c050f8c0 R08: ffff9ebdae266f00 R09: 0000000000000001
kernel: R10: 0000000000000001 R11: 0000000000000022 R12: ffff9ebdadd860b0
kernel: R13: 000000036ef0d000 R14: 00000000000ffffa R15: ffff9ebdaf100b00
kernel: FS:  0000000000000000(0000) GS:ffff9ebdc1680000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000018 CR3: 000000036efd2001 CR4: 0000000000760ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  intel_map_page+0x86/0xa0
kernel:  nvme_map_data+0x486/0x990 [nvme]
kernel:  ? fbcon_cursor+0x128/0x180
kernel:  ? bit_putcs+0x5a0/0x5a0
kernel:  nvme_queue_rq+0xa2/0x1d0 [nvme]
kernel:  blk_mq_dispatch_rq_list+0x93/0x5d0
kernel:  ? __alloc_pages_nodemask+0x161/0x2f0
kernel:  ? _find_next_bit.constprop.0+0x20/0x80
kernel:  blk_mq_sched_dispatch_requests+0xfe/0x180
kernel:  __blk_mq_run_hw_queue+0x5a/0x110
kernel:  __blk_mq_delay_run_hw_queue+0x15b/0x160
kernel:  blk_mq_run_hw_queue+0x70/0x110
kernel:  blk_mq_sched_insert_request+0xce/0x190
kernel:  ? blk_rq_append_bio+0x28/0x180
kernel:  blk_execute_rq_nowait+0x61/0x70
kernel:  blk_execute_rq+0x50/0xb0
kernel:  __nvme_submit_sync_cmd+0x92/0x1e0 [nvme_core]
kernel:  ? __cpuhp_state_add_instance_cpuslocked+0xe8/0x110
kernel:  nvme_identify_ctrl.isra.0+0x7e/0xc0 [nvme_core]
kernel:  nvme_init_identify+0x97/0x6d0 [nvme_core]
kernel:  nvme_reset_work+0x422/0xa64 [nvme]
kernel:  ? try_to_wake_up+0x65/0x690
kernel:  process_one_work+0x1e8/0x3b0
kernel:  worker_thread+0x4d/0x400
kernel:  kthread+0x104/0x140
kernel:  ? process_one_work+0x3b0/0x3b0
kernel:  ? kthread_park+0x90/0x90
kernel:  ret_from_fork+0x1f/0x40
kernel: Modules linked in: cec(+) intel_lpss_pci(+) rc_core fjes(-) r8169(+) intel_lpss nvme crc32_pclmul psmouse intel_ish_ipc(+) i2c_hid i2c_i801(+) realtek idma64 drm virt_dma intel_ishtp vmd nvme_core hid video wmi pinctrl_tigerlake pinctrl_intel
kernel: CR2: 0000000000000018
kernel: ---[ end trace caf06459a58aa8db ]---
kernel: RIP: 0010:__intel_map_single+0xa3/0x1a0
kernel: Code: 89 d2 4c 89 55 d0 e8 ec b3 ff ff 4c 8b 55 d0 48 85 c0 49 89 c6 0f 84 e9 00 00 00 41 b9 01 00 00 00 83 fb 01 76 14 48 8b 45 c0 <4c> 8b 48 18 49 c1 e9 16 49 83 f1 01 41 83 e1 01 44 89 c8 4c 89 e9
kernel: RSP: 0018:ffffb5f5c00fb878 EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff9ebdae2665c0
kernel: RDX: ffff9ebdbf205581 RSI: 0000000000000257 RDI: ffff9ebdaf100e30
kernel: RBP: ffffb5f5c00fb8c0 R08: ffff9ebdaf100e38 R09: 0000000000000001
kernel: R10: 0000000000000001 R11: 0000000000000022 R12: ffff9ebdadd810b0
kernel: R13: 000000036ef0e000 R14: 00000000000ffffd R15: ffff9ebdaf100b00
kernel: FS:  0000000000000000(0000) GS:ffff9ebdc1680000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000018 CR3: 000000036efd2001 CR4: 0000000000760ee0
kernel: PKRU: 55555554
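
A note on reading the second trace above: the byte marked <4c> in the "Code:" line begins the faulting instruction. The sketch below hand-decodes just that one instruction form and checks it against the RAX and CR2 values from the log. The function names (parse_oops, decode_mov_rax_disp8) are made up for this illustration, and the decoder handles only this single encoding; scripts/decodecode in the kernel tree is the general-purpose tool.

```python
import re

# Lines copied from the second oops in this report.
OOPS = """\
kernel: RIP: 0010:__intel_map_single+0xa3/0x1a0
kernel: Code: 89 d2 4c 89 55 d0 e8 ec b3 ff ff 4c 8b 55 d0 48 85 c0 49 89 c6 0f 84 e9 00 00 00 41 b9 01 00 00 00 83 fb 01 76 14 48 8b 45 c0 <4c> 8b 48 18 49 c1 e9 16 49 83 f1 01 41 83 e1 01 44 89 c8 4c 89 e9
kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff9ebdbf205140
kernel: CR2: 0000000000000018 CR3: 000000036efd2001 CR4: 0000000000760ee0
"""


def parse_oops(text):
    """Return (bytes starting at the faulting instruction, RAX, CR2)."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    # The kernel wraps the first byte of the faulting instruction in <>.
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    fault_bytes = bytes(int(b.strip("<>"), 16) for b in code[start:])
    rax = int(re.search(r"RAX: ([0-9a-f]{16})", text).group(1), 16)
    cr2 = int(re.search(r"CR2: ([0-9a-f]{16})", text).group(1), 16)
    return fault_bytes, rax, cr2


def decode_mov_rax_disp8(insn):
    """Decode only 'REX.W 8B /r' with ModRM mod=01, rm=000, i.e. a
    64-bit load from [rax+disp8]. Returns the signed displacement."""
    rex, opcode, modrm, disp8 = insn[0], insn[1], insn[2], insn[3]
    if rex & 0xF8 != 0x48 or rex & 0x01 or opcode != 0x8B:
        raise ValueError("not a REX.W mov r64, [rax+disp8]")
    if modrm >> 6 != 0b01 or modrm & 0b111 != 0b000:
        raise ValueError("ModRM does not select [rax+disp8]")
    return disp8 - 256 if disp8 & 0x80 else disp8


fault_bytes, rax, cr2 = parse_oops(OOPS)
disp = decode_mov_rax_disp8(fault_bytes)
print(f"load from [rax+{disp:#x}], rax={rax:#x}, cr2={cr2:#x}")
```

Decoded this way, the access is an 8-byte load from [rax+0x18] with RAX equal to 0, which matches the reported fault address 0x18. That would be consistent with domain_get_iommu() returning NULL after the earlier WARNING and the caller then reading a field at offset 0x18 from it, though this is an inference from the log rather than something stated in the report.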
Comment 1 You-Sheng Yang 2020-05-04 10:48:35 UTC
Created attachment 288895 [details]
lspci, v5.7-rc3
Comment 2 You-Sheng Yang 2020-05-04 14:06:30 UTC
This can be worked around by passing an additional "intel_iommu=on". It can also be reproduced with "intel_iommu=on iommu=pt" when secure boot is completely disabled.
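
For anyone comparing boots, a minimal sketch of checking which IOMMU-related options a kernel command line actually carries. The helper name iommu_params is hypothetical, and keeping only the last occurrence of each option is a simplification made for this sketch, not a statement about how the kernel parses them.

```python
def iommu_params(cmdline):
    """Return the IOMMU-related boot options found in a kernel command line.

    Only 'intel_iommu=' and 'iommu=' are considered. If an option appears
    more than once, this sketch keeps the last value it sees.
    """
    wanted = ("intel_iommu", "iommu")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            found[key] = value
    return found


# Sample input using the combination mentioned in this comment.
print(iommu_params("ro quiet splash intel_iommu=on iommu=pt"))
```

On a running system the input string would be the contents of /proc/cmdline.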
Comment 3 Jon Derrick 2020-05-04 14:50:22 UTC
Hi You-Sheng Yang,

Does it boot successfully with iommu=nopt?
Comment 4 You-Sheng Yang 2020-05-05 04:49:43 UTC
Hi Jon,

No, I still get the same errors with iommu=nopt passed.
Comment 5 Jon Derrick 2020-05-05 23:06:14 UTC
Created attachment 288935 [details]
allocate info for vmd child devices
Comment 6 Jon Derrick 2020-05-05 23:06:27 UTC
Hi You-Sheng Yang,

Can you try attached patch?
If this doesn't work, could you attach your kconfig?
Comment 7 You-Sheng Yang 2020-05-06 09:35:48 UTC
Hi, this works for me without any additional hack. Thank you.
Comment 8 You-Sheng Yang 2020-05-12 03:24:42 UTC
Hi, will this be available for review/merge on the iommu mailing list any time soon?
Comment 9 Jon Derrick 2020-05-12 14:21:30 UTC
Hi You-Sheng Yang,

This has been superseded by Baolu's set here:
https://lore.kernel.org/linux-iommu/7928dd48-93da-62f0-b455-6e6b248d0fae@linux.intel.com/T/#t

You can review this set instead.