Bug 205107 - No HDD spindown/parking on shutdown
Summary: No HDD spindown/parking on shutdown
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: IDE
Hardware: x86-64 Linux
Importance: P1 high
Assignee: io_ide@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-06 23:18 UTC by Hélder
Modified: 2020-02-04 01:41 UTC
CC List: 2 users

See Also:
Kernel Version: 5.0.0-25-generic
Tree: Mainline
Regression: No


Attachments
SDB SMART log (5.20 KB, text/plain)
2019-10-06 23:18 UTC, Hélder
SDA_SMARTCTL (5.03 KB, text/plain)
2019-10-11 23:31 UTC, Hélder

Description Hélder 2019-10-06 23:18:27 UTC
Created attachment 285371
SDB SMART log

Greetings,

I've got an external Toshiba HDD in good condition (USB 3.0 cable on a USB 2.0 port) that I use to store backups and my work on.
Every time the system shuts down (whether via a terminal command or the regular OS UI), it does not wait for the HDD to park/spin down (as seen in the systemd logs) and instead cuts the power too early, so the disk makes a sharp screeching sound, as if from a hard power cutoff.

I believe the same thing is happening to my internal HDD (the second one I've used on Linux; the previous one was old and had the same problem), but I can't be sure as the sound isn't too clear.

I have reported this bug to the distribution's bug tracker (Lubuntu) and systemd, but I was requested to come here instead, as the problem must be down to the kernel.

I have tested this on Linux Mint and the same has happened. As additional information, both on Windows 10 and FreeBSD, this problem did not occur.

I'll attach smartctl logs for both HDDs. The logs say old_age and pre_fail, but this is most likely not true, as I have bought them recently and smartctl doesn't seem to be that smart at times.

I await your reply,
Hélder
Comment 1 Kaur Männamaa 2019-10-10 03:49:03 UTC
(In reply to Hélder from comment #0)

> 
> 
> I'll attach smartctl logs for both HDDs. The logs say old_age and pre_fail,
> but this is most likely not true, as I have bought them recently and
> smartctl doesn't seem to be that smart at times.
> 

old_age and pre_fail indicate the type of the data given, they are not the result of interpreting the data. Also, one of the logs seems to be missing unless I missed something myself :)

Could you unmount (umount -v your/drive*?) and then power off the external drive manually (udisksctl power-off -b /your/drive) and see if the problem remains? You might also want to check that the drive indeed is / has remained off before shutting down (hdparm -C /your/drive).
Comment 2 Hélder 2019-10-11 23:31:36 UTC
Created attachment 285477
SDA_SMARTCTL

sudo smartctl -a /dev/sda > sda_smartctl.txt
I have just run this.
Comment 3 Hélder 2019-10-11 23:35:18 UTC
The Power-Off Retract Count seems to match the number of times the computer has shut down without notice. I have had the internal HDD for ~3 months, perhaps. Manufacturing date is around March, if I remember correctly.
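
For reference, a minimal sketch of reading that counter from the shell. It assumes the drive reports it as SMART attribute 192 (commonly labelled Power-Off_Retract_Count; the ID and label vary by vendor) and that the raw value is the tenth column of 'smartctl -A' output. The function name retract_count is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# raw value of SMART attribute 192.
# Assumption: the attribute table has its usual ten columns, with
# RAW_VALUE as the last one.
retract_count() {
    awk '$1 == 192 { print $10; exit }'
}
```

Usage would be something like 'sudo smartctl -A /dev/sda | retract_count'. Drives behind a USB bridge may additionally need '-d sat', depending on the enclosure.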

As a side note I forgot to include, the external HDD had been formatted on Windows, so, unfortunately, it is still NTFS, but I don't think this influences the problem's cause.
Comment 4 Hélder 2019-10-11 23:45:09 UTC
I have just run those commands. The disk was unmounted and visibly shut down. The problem did not occur in that case, but it still does on a regular shutdown.

Side note: I'm not sure if this is expected, but even when I use "shutdown -P now", systemd appears to handle it still.
Comment 5 Hélder 2019-10-18 20:16:27 UTC
Anyone? As a temporary fix, I shut down all the devices before turning the system off manually with the following code:

#!/bin/sh
sudo umount -v /dev/sdb*
sudo umount -v /dev/sdc*
sudo umount -v /dev/sdd*
sudo udisksctl power-off -b /dev/sdb
sudo udisksctl power-off -b /dev/sdc
sudo udisksctl power-off -b /dev/sdd
sudo umount -v /dev/sda*
sudo udisksctl power-off -b /dev/sda


It is not a real fix, though.
Comment 6 Kaur Männamaa 2019-10-18 20:51:00 UTC
Apologies for the delay. I have not had time for the research my current (and rather limited) knowledge of the matter requires :) So far I have found quite a few descriptions involving external drives but the problem certainly isn't as massive as what we've had before (https://bugzilla.kernel.org/show_bug.cgi?id=7674).

We know now that in your case the drives remain off during shutdown if turned off manually beforehand. It's not much but it's a piece of the puzzle nevertheless.

You are using USB 2, whereas I found some reports of USB 3 causing this. True, I have no idea (at least off the top of my head) which kernel version was used in those cases. In other words, such a problem can have all kinds of causes.

But while we're at it, are you sure you are not overthinking the problem? I mean, from what I understood from your previous post, your other drive (sda) was not affected by the issue? What about the others? In order to do any bug hunting & get some knowledgeable people involved we must know exactly which drives are affected. Nailing down the basics is crucial in deciding how much attention (if any) the bug will get.

Some further questions:

You said you tested with Mint. I assume the kernel version was the same?
Could you perhaps play around some more? Try with a live version of something (different kernel), maybe even try different systems.


As for your temporary fix, I would say it's a start and during my brief research on the issue I've seen quite a few versions of it used. Perhaps convert it into a systemd service to make the process smoother. It turns out udisksctl supports unmounting too, so you could even make it look nicer if you wanted to:)
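
The suggestion above, unmounting and powering off through udisksctl alone, could be sketched roughly as follows. This is an illustration, not a tested fix: the helper names run and detach_disk and the DRY_RUN switch are invented here, and the partitions to unmount are passed in explicitly rather than discovered.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: detach_disk DISK PARTITION...
# Unmounts each listed partition, then powers the disk off.
# Pass only partitions that are currently mounted; the function stops at
# the first failed unmount so power is never cut under a mounted filesystem.
detach_disk() {
    disk=$1
    shift
    for part in "$@"; do
        run udisksctl unmount -b "$part" || return 1
    done
    run udisksctl power-off -b "$disk"
}
```

For example, 'detach_disk /dev/sdb /dev/sdb1' unmounts /dev/sdb1 and then powers off /dev/sdb. Hooking such a script into a systemd service that runs at shutdown is left out here, since the right ordering depends on the system.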
Comment 7 Kaur Männamaa 2019-10-18 21:38:55 UTC
I noticed that you have filed the bug under IDE. Why? The drives are SATA, as I understand.
Comment 8 Kaur Männamaa 2020-01-06 21:31:35 UTC
Any news? Have you had the chance to try newer kernels?

As a random thought, does the problem remain if you manually unmount the problematic drive before shutdown but *do not* power it off yourself?
Comment 9 Hélder 2020-01-16 22:37:11 UTC
Greetings,

I'm sorry for the delay, I have been pushing this aside for too long. I was stuck at trying out a non-systemd Linux distro as I could not download the image. The distro was going to be Antix.
This is related to another problem I have been having with Linux. 
The rtl8188ee driver makes my computer always disconnect from the Wifi and this seems to be a recurring problem for users of the same computer model as mine (Toshiba Satellite series).
I have tried so many solutions for the past few months, I can't even recall half of them.

>> 08:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8188EE
>> Wireless Network Adapter (rev 01)
I am not in a position where I can try wired connections.

My Wifi connection continuously drops (on average, every 2 mins). The problem is not the router's or the provider's, as every other device (incl. my computer, back when it ran Windows 10) seems to connect just fine. I have updated the kernel and the problem got fixed for a while, but it just started happening again after not too long. It disconnects on its own, over time, but I think it disconnects sooner when I start downloading things.
If I restart the network manager it seems to connect for a while only to drop again right away. 
As a side note, if I'm using the Internet on the phone at the same time, it seems to work slightly better. This led me to try out more solutions, including those related to disabling power-saving, but all in vain. What a strange scenario: I was only able to update the kernel by leaving my phone simultaneously playing a 10-hour BBC Earth video on YouTube, right beside my computer. Even then it kept on dropping, only less often this time.


When it comes to the HDD problem, I have been using that script ever since.
The cable IS usb3, but connected to a usb2 port. If it is connected to usb3, it would keep shutting down with every slight touch to the cord, so I ignore its "usb3 power".

I'm not overthinking the problem because it is clearly not healthy for the HDD. Additionally, it clearly seems to be a software problem; shouldn't bugs always be fixed? I may have found a fix by writing down a script, but why can't this be fixed on the kernel at once? This is still unsolved.
I've already stated that the problem does not occur on Windows or FreeBSD.

You wondered about which drives were affected.
Sda may or may not be affected, but the external hdd is on sdb/sdc and it is definitely not being shut down properly.

Mint ran the same version of the kernel, yes, but I tried it primarily to see if the problem was down to Lubuntu's code.

I filed it under IDE by mistake.
If I umount  but do not unplug the HDD, the problem does not occur.

I await your reply,
Hélder
Comment 10 Kaur Männamaa 2020-01-17 14:20:28 UTC
(In reply to Hélder from comment #9)
> Greetings,
> 
> I'm sorry for the delay, I have been pushing this aside for too long. I was
> stuck at trying out a non-systemd Linux distro as I could not download the
> image. The distro was going to be Antix.
> This is related to another problem I have been having with Linux. 
> The rtl8188ee driver makes my computer always disconnect from the Wifi and
> this seems to be a recurring problem for users of the same computer model as
> mine (Toshiba Satellite series).
> I have tried so many solutions for the past few months, I can't even recall
> half of them.
> 
> >> 08:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8188EE
> >> Wireless Network Adapter (rev 01)
> I am not in a position where I can try wired connections.
> 

Indeed, there seem to be quite a few threads about this or very similar issues. Had no time for proper research but there seemed to be some solutions as well so hopefully you'll find a way to solve this eventually. If you're stuck, feel free to email me & describe what you have tried so far. I'm no expert but I wouldn't mind looking into this.

However, as for now, let's concentrate on the problem at hand.


> When it comes to the HDD problem, I have been using that script ever since.
> The cable IS usb3, but connected to a usb2 port. If it is connected to usb3, it
> would keep shutting down with every slight touch to the cord, so I ignore
> its "usb3 power".
> 
> I'm not overthinking the problem because it is clearly not healthy for the
> HDD. Additionally, it clearly seems to be a software problem; shouldn't bugs
> always be fixed? I may have found a fix by writing down a script, but why
> can't this be fixed on the kernel at once? This is still unsolved.
> I've already stated that the problem does not occur on Windows or FreeBSD.

Don't get me wrong. I had no intention of belittling the problem. I asked as you had included multiple drives, which initially confused me a bit. As for the problem as such, it is certainly worth at least looking into. While a drive can take a considerable number of such brutal shutdowns, the problem should indeed be dealt with in the long run.

Let's assume for now that it affects only external / USB drives. As I understand, this is what your SMART data & observations seem to support as well.

I did some testing (using linux 5.4.12) with my own external drive and had rather intriguing results. In short, the problem seems to exist on shutdown (cannot read SMART to confirm but that's what my ears tell me) but not when the system is halted (shutdown -H). Halting results in the drive spinning down as it should. In my case internal drives are fine either way as expected, no shutdown issues whatsoever.


This may indeed mean that power is cut too early as you suggested but I dare not speculate here at the moment. 

Perhaps I can test with some other external drives for which SMART data is available (easy to access). Just to check once more that the problem is indeed there.

I'll contact someone more knowledgeable. Let's see what they have to say.



Kaur
Comment 11 Kaur Männamaa 2020-01-17 23:14:21 UTC
(In reply to Hélder from comment #9)


> If I umount  but do not unplug the HDD, the problem does not occur.
> 

I contacted Mark Lord, the main dev behind hdparm, and he had some thoughts. Among other things, he asked us to clarify whether just unmounting (instead of unmounting and powering off, e.g. by using 'udisksctl power-off ...') the drive before shutdown really solves the problem. I'm asking as 'unplugging' can be misinterpreted and that could lead to misunderstandings. In short, does the problem occur if you unmount the drive, and then just shut down your computer? And don't forget to disable your script while testing :)

In my case shutting down with an unmounted drive did not solve the problem. If it does indeed work in your case then we may have different scenarios here. Mark had other ideas as well but I'd like to clear the unmounting part up first to plan the next steps better.


Kaur
Comment 12 Hélder 2020-01-18 23:39:03 UTC
The networking problem makes it all much harder, since I can no longer comfortably research the problem online.


For the record:
> uname -a
>> Linux hn-pc 5.0.0-36-generic #39-Ubuntu SMP Tue Nov 12 09:46:06 UTC 2019
>> x86_64 x86_64 x86_64 GNU/Linux


I9
Comment 13 Hélder 2020-01-18 23:45:43 UTC
< That reply was sent too early >

The networking problem makes it all much harder, since I can no longer comfortably research the problem online.

I tried two things: running the script and then shutting down without unplugging the cable, and unmounting normally (with the file manager) and then shutting down without unplugging. In both cases, the problem persists.

For the record:
>>> uname -a
>> Linux hn-pc 5.0.0-36-generic #39-Ubuntu SMP Tue Nov 12 09:46:06 UTC 2019
>> x86_64 x86_64 x86_64 GNU/Linux

The Linux version I initially used was the latest (?) Lubuntu LTS version (I can't really look it up now). As mentioned previously, I updated it manually later. I believe we are experiencing the exact same problem with different Linux versions.

- Hélder
Comment 15 Hélder 2020-01-19 00:27:55 UTC
Let me fix some of the information I have sent.
I decided to retry these scenarios just to be certain. If I do leave the cable plugged in until the system is completely off (never unplugged) and (prior to shutdown):
- unmount normally (via the file manager) ->> the problem is still present (disc doesn't spin down).
- Use the script (which includes "udisksctl power-off") ->> the problem is not present most of the time, but the HDD is sometimes re-awoken by the system during shutdown, resulting in the same problem.

PS: Under normal circumstances, I execute the script, unplug the HDD manually and then shut the system down as normal.
PPS: Multiple messages were sent due to the networking problems forcing me to refresh the page until the reply is visibly sent.

- Hélder
Comment 16 Kaur Männamaa 2020-02-04 01:41:00 UTC
Did some additional testing & asked Mark's opinion and here's what we found.

First, I must say that one's ears do not seem to be a trustworthy source of information. I tested with another drive where accessing SMART data was not a problem and found that on many occasions the count didn't actually increase although I would have presumed so judging by what I heard.
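
Since what one hears is unreliable, the counter can be compared across a shutdown instead. A rough sketch, under the assumptions that the drive exposes SMART attribute 192 with the raw value in the tenth column of 'smartctl -A' output, and that a file under /var/tmp survives the reboot. The function names and STATE_FILE are invented for this example.

```shell
#!/bin/sh
# Assumed location for the saved value; any path that survives a reboot works.
STATE_FILE=${STATE_FILE:-/var/tmp/retract_count.before}

# Print the raw value of SMART attribute 192 from `smartctl -A` output on stdin.
read_count() {
    awk '$1 == 192 { print $10; exit }'
}

# Before shutdown: smartctl -A /dev/sdX | save_count
save_count() {
    read_count > "$STATE_FILE"
}

# After the next boot: smartctl -A /dev/sdX | check_count
# Returns 2 if either value is missing.
check_count() {
    before=$(cat "$STATE_FILE") || return 2
    after=$(read_count)
    [ -n "$before" ] && [ -n "$after" ] || return 2
    if [ "$after" -gt "$before" ]; then
        echo "count increased: $before -> $after"
    else
        echo "count unchanged: $after"
    fi
}
```

An increase between the two readings suggests the heads were retracted by a power loss rather than by an orderly spin-down; an unchanged value suggests the shutdown was clean, whatever it sounded like.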

Now, as for what is actually going on, it is Mark's view that the kernel is probably not to blame here. Rather, systemd or one of its services may be acting up, for example, which is where you started from...

It is worth pointing out, however, that Mark found the presence of the bug to be dependent on whether the drive had been previously mounted. This is something that differs from your observations, unless I'm mistaken.

True, it appears that you didn't test with a drive that had never been mounted after connecting but rather with one that you had unmounted. Who knows, maybe even that can make a difference but that's just speculation by me at the moment. 

Do note, however, what I said about relying on your ears. IF you have based some of your recent reporting on what you heard instead of checking the actual count then this is something to consider.

If, on the other hand, you have checked the count and can still confirm that the bug doesn't care if the drive has been previously mounted or not, then you are probably seeing something similar to what I'm seeing on my system, i.e. shutting down without ever mounting does not seem to make a difference (meaning the problem is present *sometimes*). Yet, the kernel still seems to do what is expected (at least according to the basic tests Mark helped to arrange), i.e. the drive does get flagged for start & stop management by the kernel, as it should.
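
One way to look at that flag from userspace is sketched below. It assumes the sd driver exposes it as the sysfs attribute manage_start_stop under /sys/class/scsi_disk, as it does on kernels of the 5.x series discussed here; other kernel versions may expose it differently. Whether this is what the tests mentioned above looked at is also an assumption, and the function name show_start_stop is made up.

```shell
#!/bin/sh
# Hypothetical helper: print "<SCSI address> <flag>" for every SCSI disk.
# The optional argument overrides the sysfs directory, which is only
# useful for trying the function without real hardware.
show_start_stop() {
    root=${1:-/sys/class/scsi_disk}
    for f in "$root"/*/manage_start_stop; do
        # Skip the unexpanded glob (no disks) and unreadable entries.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

Calling 'show_start_stop' with no argument lists each disk by its SCSI address; a value of 1 would mean the kernel intends to issue start/stop commands to that disk itself.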


What I ran into was similar to what is described here: https://www.toradex.com/community/questions/22373/udisk-unable-to-detach-device-after-unmount.html

In other words, I started the system, connected the drive and tried to unmount (if mounted) & power off manually, but did not always succeed in powering off. If that happens, shutdown always results in the count increasing, which makes sense, after all.

The scenario described in the report I linked to cannot be the problem in my case, as the material is old and the packages, including the kernel, I use are newer. It does indicate, however, that the problem as such has been noticed before. I have some thoughts on what could be happening in my case but I haven't had the time to test my hypothesis yet. I will let you know should I find anything relevant.

K
