Created attachment 300564 [details]
dmesg log after suspend resume, io stuck

After system suspend/resume, filesystem IO stalls, producing a "kworker blocked for more than x seconds" warning in dmesg and recovering only after a long delay. See the attached dmesg-suspend-resume-nvme-stuck.txt. I have also seen the same issue right after booting and after runtime suspend transitions.

The same issue happens on multiple SKL systems in the i915 team's CI farm, see:

https://gitlab.freedesktop.org/drm/intel/-/issues/4547

I bisected the problem to

commit 4f5022453acd0f7b28012e20b7d048470f129894
Author: Jens Axboe <axboe@kernel.dk>
Date:   Mon Oct 18 08:45:39 2021 -0600

    nvme: wire up completion batching for the IRQ path

After reverting it on top of 5.17.0-rc7, I can no longer reproduce the problem. Attached dmesg-suspend-resume-nvme-ok.txt with the revert, captured after a few suspend/resume cycles.
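For reference, the revert test was roughly the following (a sketch; it assumes a plain mainline checkout, and the revert applied cleanly on v5.17-rc7 for me):

  # build a test kernel with the suspect commit reverted
  git checkout v5.17-rc7
  git revert 4f5022453acd0f7b28012e20b7d048470f129894
  make -j"$(nproc)" && sudo make modules_install install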
Created attachment 300565 [details] dmesg log after suspend resume, io ok
Thanks for the bisect, that helps a lot. It sounds like a completion is missed somewhere in the batching attempt; I'll take a look this week. I'm not sure if we're getting an rc8 or not, so any potential fix may have to wait for stable.
(In reply to Keith Busch from comment #2)
> Thanks for the bisect, that helps a lot. It sounds like a completion is
> missed somewhere in the batching attempt; I'll take a look this week. I'm
> not sure if we're getting an rc8 or not, so any potential fix may have to
> wait for stable.

So, we got an rc8; was any progress made on getting this fixed? Otherwise I'll add it to the regression tracking so that Linus becomes aware of it.
(In reply to Keith Busch from comment #2)
> Thanks for the bisect, that helps a lot. It sounds like a completion is
> missed somewhere in the batching attempt; I'll take a look this week. I'm
> not sure if we're getting an rc8 or not, so any potential fix may have to
> wait for stable.

Keith, is this still on your todo list? It looks like nothing has happened here in about a month.
Sorry for the delay. After not finding any issues from code inspection, I tried recreating this, but haven't had any luck. I'll post this to the mailing list since more people subscribe to that.
On Sun, Mar 13, 2022 at 08:13:53PM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215679
>
>         Bug ID: 215679
>        Summary: NVMe/writeback wb_workfn/blocked for more than 30
>                 seconds
>        Product: IO/Storage
>        Version: 2.5
> Kernel Version: 5.17.0-rc7
>       Hardware: x86-64
>             OS: Linux
>           Tree: Mainline
>         Status: NEW
>       Severity: high
>       Priority: P1
>      Component: NVMe
>       Assignee: io_nvme@kernel-bugs.kernel.org
>       Reporter: imre.deak@intel.com
>     Regression: No
>
> Created attachment 300564 [details]
>   --> https://bugzilla.kernel.org/attachment.cgi?id=300564&action=edit
> dmesg log after suspend resume, io stuck
>
> After system suspend/resume, filesystem IO stalls, producing a "kworker
> blocked for more than x seconds" warning in dmesg and recovering only
> after a long delay. See the attached dmesg-suspend-resume-nvme-stuck.txt.
> I have also seen the same issue right after booting and after runtime
> suspend transitions.
>
> The same issue happens on multiple SKL systems in the i915 team's CI
> farm, see:
>
> https://gitlab.freedesktop.org/drm/intel/-/issues/4547
>
> I bisected the problem to
>
> commit 4f5022453acd0f7b28012e20b7d048470f129894
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Mon Oct 18 08:45:39 2021 -0600
>
>     nvme: wire up completion batching for the IRQ path
>
> After reverting it on top of 5.17.0-rc7, I can no longer reproduce the
> problem. Attached dmesg-suspend-resume-nvme-ok.txt with the revert,
> captured after a few suspend/resume cycles.

Forwarding to linux-nvme for higher visibility.
Hi, this is your Linux kernel regression tracker. Top-posting for once, to make this easily accessible to everyone.

Jens, this regression is now a month old, and the culprit was already identified when the issue was reported; nevertheless it seems no real progress has been made to address it, at least afaics. What's up there? Should the culprit be reverted?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.

#regzbot poke

On 14.04.22 21:08, Keith Busch wrote:
> On Sun, Mar 13, 2022 at 08:13:53PM +0000, bugzilla-daemon@kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215679
>> [...]
>
> Forwarding to linux-nvme for higher visibility.
Imre, I'm pretty unhappy with how the maintainers are handling this report. Sorry, but things shouldn't take this long for a bisected regression. Before escalating the problem further, though, could you please confirm that the regression is still present in the latest mainline, aka 5.18-rc5 or later?
Take it to the list. As far as I can tell, this report isn't real, and I am pretty unhappy with how you're handling this.

On Fri, May 6, 2022, 5:55 AM <bugzilla-daemon@kernel.org> wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=215679
>
> --- Comment #8 from The Linux kernel's regression tracker (Thorsten
> Leemhuis) (regressions@leemhuis.info) ---
> Imre, I'm pretty unhappy with how the maintainers are handling this
> report. Sorry, but things shouldn't take this long for a bisected
> regression. Before escalating the problem further, though, could you
> please confirm that the regression is still present in the latest
> mainline, aka 5.18-rc5 or later?
(In reply to keith.busch from comment #9)

> Take it to the list.

Sorry, now I'm confused. Take it *off* the list of tracked regressions? Or take it to the mailing list for discussion? The latter would make sense if this is true:

> As far as I can tell, this report isn't real,

So I assume it's the former. A quick explanation of why you think this report isn't real (just a few words) would have been good (or is this mentioned somewhere in this bug report? If so, sorry, I missed it and can't see it now). But FWIW, if Imre hadn't replied, I'd likely have dropped the ball myself.

> and I am pretty unhappy with how you're handling this.

Why? From my point of view it looks a lot like this is a regression that is not being handled appropriately. If there is a reason why this report is bogus, then sorry, I just missed it.
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #10)
> (In reply to keith.busch from comment #9)
>
> > Take it to the list.
>
> Sorry, now I'm confused. Take it *off* the list of tracked regressions? Or
> take it to the mailing list for discussion? The latter would make sense if
> this is true:

The linux-nvme mailing list. Bugzilla might be okay as a starting point, but the list is the appropriate place for nvme discussion.

> > As far as I can tell, this report isn't real,
>
> So I assume it's the former. A quick explanation of why you think this
> report isn't real (just a few words) would have been good [...], but FWIW,
> if Imre hadn't replied, I'd likely have dropped the ball myself.

Because no one reported it to the official mailing list. But truthfully, I apologize for coming off as impatient in my previous message.
Created attachment 300897 [details]
dmesg with the timeout on 5.18.0-rc5

FWIW, the problem is still reproducible on 5.18.0-rc5, and with the commit I mentioned reverted it is not. Please find the logs attached.
Created attachment 300898 [details] dmesg without the timeout on 5.18.0-rc5
(In reply to Imre Deak from comment #12)
> FWIW, the problem is still reproducible on 5.18.0-rc5, and with the commit
> I mentioned reverted it is not.

As far as I can tell, it looks like you're just booting with an NVMe ext4 root partition. I have been unsuccessful at reproducing the observation. Do you have any other hints on how I might be able to recreate this?
I run 'rtcwake -m mem -s 7' right after booting to reproduce the problem. The timeout shows up in the log during the first suspend/resume cycle or, failing that, during the second one; at least in some cases (possibly all) it happens during the suspend phase (rough sketch below). If it helps, I can work on arranging remote access to this machine for you.
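In full, the sequence is roughly the following (a sketch; the dd line is just a hypothetical way to force extra write load, on this machine the plain post-boot workload is enough):

  # optionally generate some background writes, then suspend/resume
  # twice and check for hung-task warnings
  dd if=/dev/zero of=/tmp/io-load bs=1M count=1024 conv=fsync &
  for i in 1 2; do sudo rtcwake -m mem -s 7; sleep 10; done
  dmesg | grep -i 'blocked for more than'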
(In reply to Keith Busch from comment #11)
>
> The linux-nvme mailing list. Bugzilla might be okay as a starting point,
> but the list is the appropriate place for nvme discussion.

My problem is: I have to deal with a lot of subsystems and developers, and some of them prefer to keep things in bugzilla. I can't keep track of who prefers what, so I just look and try to continue in the place where they interacted (to the best of my judgement, which is sometimes wrong).

> [...]
> But truthfully, I apologize for coming off as impatient in my previous
> message.

No worries!
(In reply to Imre Deak from comment #15)
> If it helps, I can work on arranging remote access to this machine for you.

I'm not having any luck reproducing this in virtual machines, and I don't have any real hardware on which I could attempt this kind of test. If you want to send me login details privately, you can reach me at kbusch@kernel.org and I'll take a look.
Keith, ok. Before doing that, could you check the stats below and tell whether it's worth continuing to debug the issue, or whether we should just replace the module (since metrics like available_spare below could themselves be causing the problem)?

testrunner@skl-6600u:~$ sudo nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                    : 0x1
temperature                         : 34 C
available_spare                     : 44%
available_spare_threshold           : 50%
percentage_used                     : 39%
data_units_read                     : 153 585 846
data_units_written                  : 152 504 340
host_read_commands                  : 2 017 620 857
host_write_commands                 : 3 581 240 952
controller_busy_time                : 67 613
power_cycles                        : 179 437
power_on_hours                      : 36 652
unsafe_shutdowns                    : 4 206
media_errors                        : 0
num_err_log_entries                 : 30
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 34 C
Thermal Management T1 Trans Count   : 0
Thermal Management T2 Trans Count   : 0
Thermal Management T1 Total Time    : 0
Thermal Management T2 Total Time    : 0
(In reply to Imre Deak from comment #18)
> Keith, ok. Before doing that, could you check the stats below and tell
> whether it's worth continuing to debug the issue, or whether we should
> just replace the module (since metrics like available_spare below could
> themselves be causing the problem)?

That doesn't sound related to your observation. Even if the hardware is becoming flaky as it ages, that shouldn't cause a stuck task.
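(For reference, critical_warning is a bit field defined by the NVMe spec; the 0x1 above is bit 0, "available spare has fallen below the threshold", which matches the 44% vs. 50% figures in your output. Something like this pulls out just those fields:)

  # bit 0 of critical_warning = available spare below threshold (NVMe spec)
  sudo nvme smart-log /dev/nvme0n1 | grep -E 'critical_warning|available_spare'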
I'll take this hardware in for changes. The first thing to do is update the Dell BIOS. This might not be the direct cause, but it shouldn't hurt either, and it's more comfortable to work on the newest BIOS. If the issue persists after updating the BIOS, I'll see if the NVMe can be changed easily. This NVMe seems to be towards the end of its life: over half of the spare blocks are already used. That maybe shouldn't cause delays in writes, but very often it does. The cause could also be related to the NVMe firmware (I haven't checked whether Toshiba offers updates; see the commands below). Changing the NVMe will clear these problems. If the BIOS update and NVMe change don't help, we'll try to arrange remote access to this host.
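(If it comes to that, the installed firmware revision and the firmware slot log can be read with nvme-cli, roughly like this:)

  # current firmware revision ("fr") from the controller identify data
  sudo nvme id-ctrl /dev/nvme0 | grep '^fr '
  # firmware slot information log
  sudo nvme fw-log /dev/nvme0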
I've updated the BIOS and changed the NVMe; the BIOS was pretty vocal (audibly so) about the health of the old NVMe. This will probably fix the issue, as it seemed to be very NVMe-related. I still have the old, half-spares-gone Toshiba NVMe drive if there's interest in investigating how the driver behaves with a faulty drive.
The Dell XPS 13 fi-skl-6600u is back in the farm and looks all green:

https://intel-gfx-ci.01.org/tree/drm-tip/fi-skl-6600u.html

I tried to reproduce the issue with 'rtcwake -m mem -s 7', but no luck. The i915-CI shard-skl hosts mentioned in the fd.o#4547 issue are Chromebooks with eMMC; the problem there is probably a different one.
Okay then, the BIOS update and NVMe module change seem to have fixed the issue, so I'm fine with closing this ticket.