Bug 201039 - NVMe multipath IO failure
Summary: NVMe multipath IO failure
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Other
Hardware: All
OS: Linux
Importance: P1 high
Assignee: drivers_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-07 21:30 UTC by Susobhan
Modified: 2018-09-21 19:48 UTC
CC List: 2 users

See Also:
Kernel Version: 4.17.6
Subsystem:
Regression: No
Bisected commit-id:



Description Susobhan 2018-09-07 21:30:37 UTC
Steps:

1. nvme connect to two ports.
2. Bring down the active port.
3. keep_alive expires on that controller.
4. All IO on that controller gets timed out,
5. so failover is required.
6. error_recovery gets triggered.
7. The controller state is changed to RESETTING/CONNECTING (see the sketch after this list).
8. So ideally current_path should now change to the other path, as the earlier path is no longer valid (its ctrl state changed).
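
For context on steps 6-7, here is a minimal sketch of how error recovery
moves the controller out of LIVE. This is a simplification of the 4.17
RDMA error-recovery path (drivers/nvme/host/rdma.c), not the exact code;
only nvme_change_ctrl_state() and the state names are taken from the
kernel, and the function name below is made up for illustration:

/* Simplified sketch: error recovery takes the controller out of LIVE,
 * which is what should invalidate any path pointing at it. */
static void error_recovery_sketch(struct nvme_ctrl *ctrl)
{
	/* Leave LIVE. From here on, this controller must not be
	 * picked as current_path anymore. */
	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
		return;

	/* ... teardown of the queues happens here ... */

	/* Enter CONNECTING and try to re-establish the association. */
	nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING);
}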


But here is the problem: error_recovery or reset_controller may run on a
different core, so the following race can happen.

While the mpath make_request path (nvme_ns_head_make_request()) was
choosing its path, the ctrl was LIVE, but just after the srcu dereference
of current_path, the ctrl state changed to RESETTING/CONNECTING.

Now the problem is that the ctrl associated with current_path is no longer LIVE.
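
To make the window concrete, path selection in
drivers/nvme/host/multipath.c looks roughly like this in 4.17
(paraphrased; details may differ):

/* Paraphrased from nvme_find_path() in drivers/nvme/host/multipath.c
 * (4.17-era). */
inline struct nvme_ns *nvme_find_path(struct nvme_ns_head *head)
{
	struct nvme_ns *ns = srcu_dereference(head->current_path, &head->srcu);

	/* The LIVE check happens here, once, at submission time... */
	if (unlikely(!ns || ns->ctrl->state != NVME_CTRL_LIVE))
		ns = __nvme_find_path(head);

	/* ...but error recovery on another core can move the ctrl to
	 * RESETTING/CONNECTING right after this returns, so the bio is
	 * still submitted down the now-invalid path. */
	return ns;
}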

When the IO lands in the RDMA layer, the nvmf_check_if_ready() check
eventually fails the IO with BLK_STS_IOERR, and fio bails out.
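
The rejecting branch is roughly the following (a paraphrased fragment of
nvmf_check_if_ready() from drivers/nvme/host/fabrics.c in 4.17; the other
state-machine cases are omitted and the function name below is made up):

/* Paraphrased fragment of nvmf_check_if_ready(); most state cases
 * omitted. */
static blk_status_t check_if_ready_sketch(struct nvme_ctrl *ctrl,
					  struct request *rq)
{
	if (ctrl->state == NVME_CTRL_LIVE)
		return BLK_STS_OK;

	/* Failfast and NVMe-multipath requests are failed immediately;
	 * everything else is requeued (BLK_STS_RESOURCE) until the
	 * controller state recovers. This is why the racing multipath
	 * bio comes back as a hard error instead of being retried. */
	if (!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
		return BLK_STS_RESOURCE;

	nvme_req(rq)->status = NVME_SC_ABORT_REQ;
	return BLK_STS_IOERR;
}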

So we need to handle this in nvmf_check_if_ready(): if the ctrl state is
CONNECTING/RESETTING, we need to return BLK_STS_RESOURCE for multipath
requests as well.

~Susobhan
Comment 1 Susobhan 2018-09-07 22:13:50 UTC
609c609
<       if (!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
---
>       if (!blk_noretry_request(rq))
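
If I am reading the diff right, it drops the REQ_NVME_MPATH exclusion, so
the resulting check (assuming it is applied against 4.17's
nvmf_check_if_ready()) becomes:

	/* Multipath requests are no longer excluded: they too get
	 * BLK_STS_RESOURCE and are retried once the controller leaves
	 * RESETTING/CONNECTING, instead of erroring back to fio. */
	if (!blk_noretry_request(rq))
		return BLK_STS_RESOURCE;

One thing worth double-checking with this approach: BLK_STS_RESOURCE
requeues the request on the same bottom queue, so the IO waits for that
controller to recover rather than failing over to the other path right
away.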

The above change in fabrics.c works for me.

~Susobhan
