Bug 214771
| Summary: | controller reset and subsystem reset cause Kernel panic | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | LiuZhouhua (liuzhouhua) |
| Component: | NVMe | Assignee: | IO/NVME Virtual Default Assignee (io_nvme) |
| Status: | NEW --- | | |
| Severity: | enhancement | CC: | carnil, jch, kbusch, sbeattie |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 4.19 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | proposed fix | | |
Description
LiuZhouhua
2021-10-20 13:23:30 UTC
I think what you're showing is the driver accessing a PCI register before it has iomapped it. I'll see if there's something we can do about it.

Created attachment 299261
proposed fix

Could you confirm if the attached patch fixes your observation?

(In reply to Keith Busch from comment #2)
> Created attachment 299261
> proposed fix
>
> Could you confirm if the attached patch fixes your observation?

I've tried this patch. It hung during insmod nvme.ko:

[Thu Oct 21 15:59:06 2021] INFO: task kworker/u193:2:67453 blocked for more than 120 seconds.
[Thu Oct 21 15:59:06 2021] Tainted: G OE 4.19.36-vhulk1907.1.0.h748.eulerosv2r8.aarch64 #1
[Thu Oct 21 15:59:06 2021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Thu Oct 21 15:59:06 2021] kworker/u193:2 D 0 67453 2 0x00000228
[Thu Oct 21 15:59:06 2021] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
[Thu Oct 21 15:59:06 2021] Call trace:
[Thu Oct 21 15:59:06 2021] __switch_to+0x94/0xe8
[Thu Oct 21 15:59:06 2021] __schedule+0x31c/0x9c8
[Thu Oct 21 15:59:06 2021] schedule+0x2c/0x88
[Thu Oct 21 15:59:06 2021] schedule_preempt_disabled+0x14/0x20
[Thu Oct 21 15:59:06 2021] __mutex_lock.isra.1+0x1fc/0x540
[Thu Oct 21 15:59:06 2021] __mutex_lock_slowpath+0x24/0x30
[Thu Oct 21 15:59:06 2021] mutex_lock+0x80/0xa8
[Thu Oct 21 15:59:06 2021] nvme_pci_reg_write32+0x3c/0x78 [nvme]
[Thu Oct 21 15:59:06 2021] nvme_disable_ctrl+0x40/0x78 [nvme]
[Thu Oct 21 15:59:06 2021] nvme_reset_work+0x260/0xea8 [nvme]
[Thu Oct 21 15:59:06 2021] process_one_work+0x1b4/0x3f8
[Thu Oct 21 15:59:06 2021] worker_thread+0x210/0x470
[Thu Oct 21 15:59:06 2021] kthread+0x134/0x138
[Thu Oct 21 15:59:06 2021] ret_from_fork+0x10/0x18

Oops, of course it's a circular lock... will rework.

(In reply to Keith Busch from comment #4)
> Oops, of course it's a circular lock... will rework.

What happened to that? The most recent set of changes (2022-11-01) don't seem to address this, unless I'm missing something.

Latest code uses the state machine so that you can't stack resets like this bug reported.

Thank you. I guess this bug can be closed now.

https://git.kernel.org/linus/1e866afd4bcdd01a70a5eddb4371158d3035ce03 refers to this bugzilla entry.
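
For readers unfamiliar with the idea, the remark that a state machine keeps resets from stacking can be illustrated with a small userspace sketch. This is not the driver's code: the type `struct ctrl`, the three states, and the functions `ctrl_change_state`, `ctrl_request_reset` and `ctrl_reset_done` are all hypothetical names made up for illustration, and the locking that a real driver would need around the state change is left out. The only point shown is that a second reset request is refused while one is already in progress, instead of being queued behind it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified controller states (illustration only). */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DEAD };

struct ctrl {
	enum ctrl_state state;
};

/*
 * Apply a state transition only if it is allowed from the current state.
 * Returns true when the state was changed, false when it was rejected.
 * Assumption: single-threaded use; a real driver would hold a lock here.
 */
static bool ctrl_change_state(struct ctrl *c, enum ctrl_state next)
{
	bool ok = false;

	switch (next) {
	case CTRL_RESETTING:
		/* A reset may start only from LIVE, never from RESETTING. */
		ok = (c->state == CTRL_LIVE);
		break;
	case CTRL_LIVE:
		/* Only a finished reset brings the controller back to LIVE. */
		ok = (c->state == CTRL_RESETTING);
		break;
	case CTRL_DEAD:
		ok = (c->state != CTRL_DEAD);
		break;
	}

	if (ok)
		c->state = next;
	return ok;
}

/* Request a reset; refuse with -EBUSY if one is already running. */
static int ctrl_request_reset(struct ctrl *c)
{
	if (!ctrl_change_state(c, CTRL_RESETTING))
		return -EBUSY;
	/* A real driver would queue the reset work here. */
	return 0;
}

/* Called by the (hypothetical) reset work when it has finished. */
static void ctrl_reset_done(struct ctrl *c)
{
	ctrl_change_state(c, CTRL_LIVE);
}
```

With this shape, a caller that issues a controller reset and then immediately another reset gets `-EBUSY` for the second request until `ctrl_reset_done()` has run, so two reset paths never touch the controller at the same time.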