Bug 199077
Summary: | [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and boot failures with blk_mq_terminate_expired in backtrace | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Martin Steigerwald (Martin) |
Component: | Block Layer | Assignee: | Tejun Heo (tj) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | bvanassche, jbuchert+kbugs |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.16-rc4 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
configuration of the currently running 4.16-rc4 kernel that shows these issues
blk_mq_terminate_expired 4.16.1 quiesce + timeout patch boot failure 4.16.2 backtrace with ming v4 patches 4.16.2 backtrace with ming v4 patches [2/2] |
Description
Martin Steigerwald
2018-03-11 08:17:02 UTC
Created attachment 274651 [details]
configuration of the currently running 4.16-rc4 kernel that shows these issues
[Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts https://lkml.org/lkml/2018/3/11/15 Message-ID: <27165802.vQ9JbjrmvU@merkaba> Created attachment 274703 [details]
blk_mq_terminate_expired
This is a screenshot of a boot failure from tomorrow morning. I had this another time another day but was not in a good situation to take a screenshot.
So this hints at some issue with blk-mq. I do use blk-mq with BFQ.
The second issue is still unfixed as of 4.16 + [PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}() vom Bart Van Asche, which Jens Axboe accepted¹. [1] https://patchwork.kernel.org/patch/10294287/ Currently compiling 4.16.1, but I do not expect a change, as there is nothing about blk-mq subsystem in the changelog as far as I saw. Forget the comment about the second issue. This bug report is just about what I believe is one issue. I just copied from a mail I will send to the mailing list thread I mentioned in comment 2. This patch may help to get rid of the boot failure: "blk-mq: Directly schedule q->timeout_work when aborting a request" (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1653932.html). Thanks, Bart. Will test with a 4.16.1 with [PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}() + blk-mq: Directly schedule q->timeout_work when aborting a request on top. So the "Error updating SMART data" thing I reported here as well may still be a different issue, but I test whether that might have been fixed by it, as well. Created attachment 275279 [details]
4.16.1 quiesce + timeout patch boot failure
4.16.1 with [PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}() + blk-mq: Directly schedule q->timeout_work when aborting a request on top fails as well.
The "RIP: scsi_times_out+0x17" crash looks familiar. Please try patch "[PATCH v4] blk-mq: Fix race conditions in request timeout handling" (https://marc.info/?l=linux-block&m=152332423328778). Note: although not everyone agrees about the description of that patch I think there is agreement that that patch itself works reliably. Thank you, Bart. Will try tomorrow. On top of the two other patches or as a replacement for them? Please apply that patch on top of the other two patches. Bart, 4.16.1 with all the three patches did about 15 reboots without a hang. So I think it is good to go. As its a race that only happens sometimes I am not yet completely sure, so I am going to test this kernel during the next days with my regular suspend to ram / hibernate to disk and resume workflow. Thank you for your work on this. Created attachment 275399 [details]
4.16.2 backtrace with ming v4 patches
Ming, a 4.16.2 with the patches:
'[PATCH] blk-mq_Directly schedule q->timeout_work when aborting a request.mbox'
'[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}().mbox'
'[PATCH V4 1_2] blk-mq_set RQF_MQ_TIMEOUT_EXPIRED when the rq'\''s timeout isn'\''t handled.mbox'
'[PATCH V4 2_2] blk-mq_fix race between complete and BLK_EH_RESET_TIMER.mbox'
hung on boot 3 out of 4 times.
I attach backtraces from two different attempts.
Created attachment 275401 [details]
4.16.2 backtrace with ming v4 patches [2/2]
This is fixed meanwhile by f4560231ec42092c6662acccabb28c6cac9f5dfb which went into 4.17 and is marked to be backported/applied to various backport kernels. commit f4560231ec42092c6662acccabb28c6cac9f5dfb Author: Jianchao Wang <[…]> Date: Tue Apr 17 11:46:20 2018 +0800 blk-mq: start request gstate with gen 1 rq->gstate and rq->aborted_gstate both are zero before rqs are allocated. If we have a small timeout, when the timer fires, there could be rqs that are never allocated, and also there could be rq that has been allocated but not initialized and started. At the moment, the rq->gstate and rq->aborted_gstate both are 0, thus the blk_mq_terminate_expired will identify the rq is timed out and invoke .timeout early. For scsi, this will cause scsi_times_out to be invoked before the scsi_cmnd is not initialized, scsi_cmnd->device is still NULL at the moment, then we will get crash. |