Steps:
1. nvme connect to two ports.
2. Bring down the active port.
3. keep_alive expires on that controller.
4. All I/O starts timing out,
5. so failover is required.
6. error_recovery gets triggered.
7. The controller state changes to RESETTING/CONNECTING.
8. Ideally, current_path should now switch to the next path, since the earlier path is no longer valid (ctrl state changed).

But here is the problem: error_recovery or reset_controller may run on a different core, so the following race can happen. While mpath_make_request() was choosing its path, the ctrl was LIVE. Just after the SRCU dereference of current_path, the ctrl state changed to RESETTING/CONNECTING. Now the ctrl associated with current_path is no longer LIVE.

When the I/O lands in the rdma layer, the nvmf_check_if_ready() check eventually fails it with BLK_STS_ERROR, and fio bails out.

So we need to handle this in nvmf_check_if_ready(): if the ctrl state is CONNECTING/RESETTING, we need to return BLK_STS_RESOURCE for multipath requests.

~Susobhan
609c609
< if (!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
---
> if (!blk_noretry_request(rq))

The above change in fabrics.c works for me.

~Susobhan
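To make the effect of that one-line change in fabrics.c easier to see, here is a minimal user-space sketch of the condition before and after the change. It is not the kernel code: the enums, struct model_request and model_check() are hypothetical stand-ins, the two flags are treated as independent booleans (how they relate in a given kernel version is not modeled), and any other checks the real function performs are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING };

enum io_status {
    IO_OK,      /* controller is LIVE, request proceeds */
    IO_REQUEUE, /* models returning BLK_STS_RESOURCE */
    IO_FAIL     /* models the request being failed with an error */
};

struct model_request {
    bool noretry; /* models blk_noretry_request(rq) */
    bool mpath;   /* models rq->cmd_flags & REQ_NVME_MPATH */
};

/* Condition on line 609 before the change. */
static bool may_requeue_before(const struct model_request *rq)
{
    return !rq->noretry && !rq->mpath;
}

/* Condition on line 609 after the change. */
static bool may_requeue_after(const struct model_request *rq)
{
    return !rq->noretry;
}

/* Decide what happens to a request that reaches a given controller. */
static enum io_status model_check(enum ctrl_state state,
                                  const struct model_request *rq,
                                  bool patched)
{
    bool requeue;

    if (state == CTRL_LIVE)
        return IO_OK;

    requeue = patched ? may_requeue_after(rq) : may_requeue_before(rq);
    return requeue ? IO_REQUEUE : IO_FAIL;
}
```

Under these assumptions, a multipath request with noretry unset that reaches a CONNECTING controller gets IO_FAIL with the original condition and IO_REQUEUE with the changed one. That matches the behaviour requested above for the window where the path was chosen while the ctrl was still LIVE.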
See: http://lists.infradead.org/pipermail/linux-nvme/2018-September/020061.html