Bug 216400

Summary: Processing the firmware activation starting AEN prevents further AER commands from being sent to the NVMe controller.
Product: IO/Storage
Reporter: lixingyuan (lixingyuan)
Component: NVMe
Assignee: IO/NVME Virtual Default Assignee (io_nvme)
Status: NEW ---    
Severity: high    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: v5.19
Subsystem:
Regression: No
Bisected commit-id:

Description lixingyuan 2022-08-23 01:14:50 UTC
This bug is related to these two commits:


1. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc2&id=4c75f877853cfa81b12374a07208e07b077f39b8

This commit sets the controller state to NVME_CTRL_RESETTING while handling the firmware activation starting AEN.

2. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc2&id=0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d

When submitting a new AER command to the controller, this code checks whether the controller state is NVME_CTRL_LIVE, and this check causes the problem. If a firmware activation starting AEN has already been processed, the controller state is already NVME_CTRL_RESETTING, so no new AER command is sent to the controller.
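
The interaction between the two commits can be illustrated with a small user-space model. This is only a sketch of the behavior described above, not the kernel code: every name prefixed with model_ is hypothetical, and it assumes that the state change for the firmware activation starting AEN happens before the attempt to re-arm the AER. It does not model whatever later returns the controller to the live state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NVME_CTRL_LIVE and NVME_CTRL_RESETTING. */
enum model_ctrl_state { MODEL_CTRL_LIVE, MODEL_CTRL_RESETTING };

struct model_ctrl {
    enum model_ctrl_state state;
    int outstanding_aers;   /* AER commands currently held by the controller */
};

/* Models the check from commit 2: submit an AER only in the live state. */
static bool model_submit_aer(struct model_ctrl *ctrl)
{
    if (ctrl->state != MODEL_CTRL_LIVE)
        return false;
    ctrl->outstanding_aers++;
    return true;
}

/* Models commit 1: the firmware activation starting AEN moves the
 * controller to the resetting state. */
static void model_handle_fw_act_starting(struct model_ctrl *ctrl)
{
    ctrl->state = MODEL_CTRL_RESETTING;
}

/* An AEN completion consumes one outstanding AER, handles the event,
 * then tries to re-arm. Returns true if a new AER was submitted. */
static bool model_complete_aen(struct model_ctrl *ctrl, bool fw_act_starting)
{
    assert(ctrl->outstanding_aers > 0);
    ctrl->outstanding_aers--;
    if (fw_act_starting)
        model_handle_fw_act_starting(ctrl);
    return model_submit_aer(ctrl);
}

/* Scenario from this report: one AER is armed, then the firmware
 * activation starting AEN completes it. Returns the number of AERs
 * still outstanding afterwards. */
static int model_aers_after_fw_activation(void)
{
    struct model_ctrl ctrl = { MODEL_CTRL_LIVE, 0 };

    model_submit_aer(&ctrl);
    model_complete_aen(&ctrl, true);
    return ctrl.outstanding_aers;
}
```

In this model, model_aers_after_fw_activation() returns 0: the re-arm attempt is rejected because the state is no longer live, so the controller is left with no outstanding AER command. An ordinary AEN (fw_act_starting set to false) re-arms as expected.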