Since 4.16-rc4 (upgraded from 4.15.2, which worked) I have an issue with SMART checks occasionally failing like this:

  smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
  udisksd[24408]: Error performing housekeeping for drive /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense data returned:
  0000: 0e 09 0c 00 00 00 ff 00 00 00 00 00 00 00 50 00 ..............P.
  0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  (g-io-error-quark, 0)
  udisksd[24408]: Error performing housekeeping for drive /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense data returned:
  0000: 01 00 1d 00 00 00 0e 09 0c 00 00 00 ff 00 00 00 ................
  0010: 00 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00 ....P...........
  (g-io-error-quark, 0)

(The Intel SSD is connected via SATA, the Crucial via mSATA, in a ThinkPad T520.)

However, when I then check manually with smartctl -a / -x / -H, the device reports SMART data just fine.

Also, at some boot attempts the boot hangs with the message "could not connect to lvmetad".

I wondered whether this has to do with the new SATA LPM policy stuff. But since I had issues with policy 3 (medium power with Device Initiated PM enabled) where the machine did not boot (which could also have been caused by me accidentally removing all TCP/IP network support from the kernel along with that setting), I set it back to CONFIG_SATA_MOBILE_LPM_POLICY=0 (keep firmware settings).

The only other significant change I am aware of is that I switched from the SLAB to the SLUB allocator, as Debian recently did with their kernels, I think.

I attach the complete configuration as xz. Please understand that I am not keen on doing a bisect, as it can take quite a while for the issue to appear and I will be giving a Linux training next week. If you have any other suggestions, please tell.

I found a thread on LKML about another Crucial SSD not working with more aggressive LPM settings ("[PATCH] libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs"), yet my current 4.16-rc4 kernel runs with LPM policy 0, which should be safe. Also, regarding policy 3 (medium power with Device Initiated PM enabled), I am not yet sure which of the two SSDs may cause trouble.

Also posted on the mailing list.
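For reference: the CHECK POWER MODE query that udisksd issues boils down to ATA command 0xE5 wrapped in a SCSI ATA PASS-THROUGH (16) CDB, sent through the SG_IO ioctl. Below is a minimal standalone sketch of such a query; this is my own illustration for clarity, not the actual udisks code, and error handling is trimmed:

  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <scsi/sg.h>

  int main(int argc, char **argv)
  {
      unsigned char cdb[16] = {0}, sense[32] = {0};
      struct sg_io_hdr io;
      int fd;

      if (argc < 2)
          return 1;
      fd = open(argv[1], O_RDONLY | O_NONBLOCK);
      if (fd < 0) {
          perror("open");
          return 1;
      }

      cdb[0]  = 0x85;     /* SCSI ATA PASS-THROUGH (16) */
      cdb[1]  = 3 << 1;   /* protocol 3: non-data */
      cdb[2]  = 0x20;     /* CK_COND=1: return ATA registers in sense data */
      cdb[14] = 0xe5;     /* ATA CHECK POWER MODE */

      memset(&io, 0, sizeof(io));
      io.interface_id = 'S';
      io.cmd_len = sizeof(cdb);
      io.cmdp = cdb;
      io.dxfer_direction = SG_DXFER_NONE;
      io.sbp = sense;
      io.mx_sb_len = sizeof(sense);
      io.timeout = 5000;  /* milliseconds */

      if (ioctl(fd, SG_IO, &io) < 0) {
          perror("SG_IO");
          return 1;
      }

      /*
       * With CK_COND set, the drive answers with descriptor-format sense
       * (response code 0x72) carrying an ATA Status Return descriptor
       * (type 0x09, length 0x0c). The power mode is reported in the
       * sector count field: 0xff = active/idle, 0x00 = standby.
       */
      if ((sense[0] & 0x7f) == 0x72 && sense[8] == 0x09)
          printf("power mode: 0x%02x\n", sense[13]);
      else
          fprintf(stderr, "unexpected sense format\n");

      close(fd);
      return 0;
  }

Interestingly, both hex dumps above contain what looks like exactly such an ATA Status Return descriptor (the 09 0c ... ff ... 50 run: sector count 0xff meaning active/idle, status 0x50 meaning DRDY), so the drive appears to answer fine; udisksd seems to stumble over the framing around it.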
Created attachment 274651 [details]
configuration of the currently running 4.16-rc4 kernel that shows these issues
[Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
https://lkml.org/lkml/2018/3/11/15
Message-ID: <27165802.vQ9JbjrmvU@merkaba>
Created attachment 274703 [details]
blk_mq_terminate_expired

This is a screenshot of a boot failure from this morning. I hit this once before on another day, but was not in a position to take a screenshot then. So this hints at some issue with blk-mq. I do use blk-mq with BFQ.
The second issue is still unfixed as of 4.16 plus "[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}()" by Bart Van Assche, which Jens Axboe accepted¹.

[1] https://patchwork.kernel.org/patch/10294287/

I am currently compiling 4.16.1, but I do not expect a change, as I saw nothing about the blk-mq subsystem in its changelog.
Please disregard my remark about a "second issue": this bug report is about what I believe to be a single issue. I copied that wording from a mail I am about to send to the mailing-list thread mentioned in comment 2.
This patch may help to get rid of the boot failure: "blk-mq: Directly schedule q->timeout_work when aborting a request" (https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1653932.html).
Thanks, Bart. Will test with a 4.16.1 with "[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}()" plus "blk-mq: Directly schedule q->timeout_work when aborting a request" on top. The "Error updating SMART data" issue I reported here may still be something different, but I will test whether these patches fixed it as well.
Created attachment 275279 [details]
4.16.1 quiesce + timeout patch boot failure

4.16.1 with "[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}()" plus "blk-mq: Directly schedule q->timeout_work when aborting a request" on top fails as well.
The "RIP: scsi_times_out+0x17" crash looks familiar. Please try patch "[PATCH v4] blk-mq: Fix race conditions in request timeout handling" (https://marc.info/?l=linux-block&m=152332423328778). Note: although not everyone agrees about the description of that patch I think there is agreement that that patch itself works reliably.
Thank you, Bart. Will try tomorrow. On top of the two other patches or as a replacement for them?
Please apply that patch on top of the other two patches.
Bart, 4.16.1 with all three patches did about 15 reboots without a hang, so I think it is good to go. As it's a race that only strikes sometimes, I am not yet completely certain; I will keep testing this kernel during the next days with my regular suspend-to-RAM / hibernate-to-disk and resume workflow. Thank you for your work on this.
Created attachment 275399 [details]
4.16.2 backtrace with ming v4 patches

Ming, a 4.16.2 with the following patches:

- [PATCH] blk-mq: Directly schedule q->timeout_work when aborting a request
- [PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}()
- [PATCH V4 1/2] blk-mq: set RQF_MQ_TIMEOUT_EXPIRED when the rq's timeout isn't handled
- [PATCH V4 2/2] blk-mq: fix race between complete and BLK_EH_RESET_TIMER

hung on boot 3 out of 4 times. I attach backtraces from two different attempts.
Created attachment 275401 [details]
4.16.2 backtrace with ming v4 patches [2/2]
This is meanwhile fixed by commit f4560231ec42092c6662acccabb28c6cac9f5dfb, which went into 4.17 and is marked to be applied to various stable/backport kernels.

commit f4560231ec42092c6662acccabb28c6cac9f5dfb
Author: Jianchao Wang <[…]>
Date:   Tue Apr 17 11:46:20 2018 +0800

    blk-mq: start request gstate with gen 1

    rq->gstate and rq->aborted_gstate are both zero before rqs are
    allocated. If we have a small timeout, then when the timer fires
    there could be rqs that were never allocated, and also rqs that
    have been allocated but not yet initialized and started. At that
    moment rq->gstate and rq->aborted_gstate are both 0, so
    blk_mq_terminate_expired will identify the rq as timed out and
    invoke .timeout early.

    For SCSI, this causes scsi_times_out to be invoked before the
    scsi_cmnd is initialized; scsi_cmnd->device is still NULL at that
    moment, and we then get a crash.
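The change itself is small: a one-liner plus comment in the request-initialization path. Sketched from the commit (this should be the gist, though it may not be character for character):

  /* in blk_mq_init_request(), after the per-request seqcount setup */
  seqcount_init(&rq->gstate_seq);
  u64_stats_init(&rq->aborted_gstate_sync);
  /*
   * Start gstate with generation 1 instead of 0. Otherwise a freshly
   * allocated, never started request has gstate == aborted_gstate == 0,
   * and blk_mq_terminate_expired() wrongly identifies it as timed out.
   */
  WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);

With the generation starting at 1, a request that was never started can no longer match an aborted_gstate of 0, so the premature .timeout invocation (and with it the NULL scsi_cmnd->device dereference in scsi_times_out) goes away.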