Bug 72191
Summary: | Thinkpad t 440s - Setting ALPM from min_power to max_power gives ATA errors, sets the disk r/o | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Joost (joostw) |
Component: | Serial ATA | Assignee: | Tejun Heo (tj) |
Status: | NEW --- | ||
Severity: | normal | CC: | alan, bramesh.dev, dion, ganesha, halocaridina, lionghostshop, seager, zack+kernel |
Priority: | P1 | ||
Hardware: | IA-64 | ||
OS: | Linux | ||
Kernel Version: | 3.14.rc6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | Setting /sys/class/scsi_host/host[0-2]/link_power_management_policy to max_power on another terminal; dmesg; lspci |
Created attachment 129571: dmesg
Created attachment 129581: lspci
Sorry, I meant "this is at least one culprit for ...", the hibernation crashes are constant.

I can confirm this, my T440s has a Samsung SSD, MZ7TD512HAGM-0001L (512 GB) with firmware DXT05L0Q. Please note that this problem (I suspect it is the identical problem) has been noticed in Windows too, see http://forums.lenovo.com/t5/T400-T500-and-newer-T-series/T440s-is-killing-Samsung-840-pro-SSD-s/td-p/1366903 It may well be a firmware problem with Samsung drives (the MZ7TD512HAGM-0001L is an OEM version of the 840 series I believe).

Thanks for the news! I believe I got a 256 GB OEM PM841 inside; at least none of the firmware upgrades for the 840/840 Pro/840 EVO found an updatable disk. Is there any way to disable ALPM switching when hibernating? I just miss hibernation so much ...

You could try adding the device to the table of busted devices in drivers/ata/libata-core.c:

    static const struct ata_blacklist_entry ata_device_blacklist [] = {

and you'll see that it lists drives with problems. If you add a pattern for your drive with the ATA_HORKAGE_NOLPM flag and build a new kernel, that ought to do the trick properly and you can then let us know if it works.
Alan

Would this disable ALPM completely (not acceptable for me because of the increase in power usage) or only during hibernation?
Joost

It will disable LPM completely for that device, which is a good starting point for testing if LPM is the problem here and if it fixes hibernation. The detail can be refined once we know that is the case.

Right now I'm getting working hibernation with this hook:

    cat /etc/pm/sleep.d/00_powertop_autotune
    #!/bin/sh
    case "$1" in
        thaw|resume)
            # after waking up: turn power saving back on
            echo "SATA_ALPM_ENABLE=true" > /etc/pm/config.d/sata_alpm
            /usr/sbin/pm-powersave true
            /usr/sbin/powertop --auto-tune
            ;;
        hibernate)
            # before hibernating: turn power saving off
            echo "SATA_ALPM_ENABLE=false" > /etc/pm/config.d/sata_alpm
            /usr/sbin/pm-powersave
            /usr/sbin/pm-powersave false
            ;;
    esac

Don't know if "pm-powersave" or "pm-powersave false" does the magic, but with this I have ALPM enabled and working hibernation.
I don't know how to check if ALPM is enabled at all (cat'ing /sys/.../link_power_management_policy always returns the last configuration), so I have both commands running right now.

... and re-enabled on thaw/resume of course.

Ok, powertop tells me that it's probably "pm-powersave false" which turns off ALPM without crashing the drive/controller. I'm going to check if that by itself is enough to let us hibernate successfully.

So, right now I believe

    #!/bin/sh
    case "$1" in
        thaw|resume)
            # after waking up: turn power saving back on
            /usr/sbin/pm-powersave true
            /usr/sbin/powertop --auto-tune
            ;;
        hibernate)
            # before hibernating: turn power saving off
            /usr/sbin/pm-powersave false
            ;;
    esac

is sufficient for successful hibernation.

FWIW, I'm seeing this problem on another T440s, with the following disk:

    ATA device, with non-removable media
        Model Number:       SAMSUNG MZ7TD512HAGM-000L1
        Serial Number:      S151NYAF303599
        Firmware Revision:  DXT05L0Q
        Media Serial Num:   00000000000000000000
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0

I'm not using hibernation on a regular basis, but I notice the problem sporadically when plugging in the AC power connector and (very empirical evidence) using the disk immediately afterwards (e.g. by starting mutt, which checks the timestamps of a whole bunch of maildirs). I didn't realize myself that the disk was being set read-only, but I confirm that the only "way out" at that point is a hard reboot.

I have the impression that setting CONTROL_HD_POWERMGMT=0 in /etc/laptop-mode/laptop-mode.conf (Debian/testing user here, with kernel 3.14.2) has significantly reduced the incidence of the issue when plugging in the AC connector, but it has definitely not made it go away completely. (And, even with that setting in laptop-mode.conf, according to powertop SATA power management is still in use, not sure why.)

It is not clear to me if this is something which is fixable at the kernel level, or if it needs to be fixed at the firmware level anyhow. Can someone comment on that?
I've tried to update the disk firmware but:
1) I'm not sure which one from www.samsung.com/samsungssd/ is the right firmware image for the above disk (anyone?)
2) as a guess I've tried DXT09B0Q (it is the most similar-looking firmware version to the one hdparm detects on my disk), but the bootable image didn't seem to be really bootable. I guess the Win EXE would work, but I've thrown away Win altogether...
Tips welcome. And thanks a bunch for your awesome kernel work!

"I've the impression that setting CONTROL_HD_POWERMGMT=0 in /etc/laptop-mode/laptop-mode.conf (Debian/testing user here, with kernel 3.14.2) has significantly reduced the incidence of the issue when plugging the AC connector, but it has definitely not made it go away completely. (And, even with that setting in laptop-mode.conf, according to powertop SATA power management is still in use, not sure why.)"

Just a note on the above: laptop-mode does not handle ALPM properly. I'm using TLP to set "medium_power" / "max_performance" instead of "min_power". I haven't had the problem recur but I would prefer to get maximum battery saving if possible. This problem is reported solved on Windows machines with consumer versions of the Samsung 840 models via a firmware update. Unfortunately, the Samsung Magician software does not work on OEM drives. It would be interesting to see if the DXT09B0Q firmware fixes the problem.

I screwed up my courage and tried the DXT09B0Q upgrade, but it only reports "no supported SSDs found" and exits ...

I recently updated my BIOS to GJET77WW (2.27). I have re-enabled link power management and, knock on wood, have not seen this issue. Previously, I would have this issue constantly and ended up with one of my SSDs being completely killed.

I also updated the BIOS to GJET77WW (2.27), but I am experiencing further lock-ups. BTW wes33, at least in my machine there is some special kind of OEM SSD for which no firmware updates seem to be available... Though I would be glad to hear the opposite!
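Several comments above compare the model and firmware strings that hdparm reports, and the blacklist suggestion also needs the exact model string. The following is a minimal shell sketch for pulling both out of `hdparm -I` output. It assumes the output uses the "Model Number:" and "Firmware Revision:" labels exactly as quoted in this thread; the function name ata_ident is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the model and firmware strings from `hdparm -I` output
# read on stdin. Assumes the field labels shown in the thread above.
# ata_ident is a hypothetical helper name, not part of hdparm.
ata_ident() {
    sed -n \
        -e 's/^[[:space:]]*Model Number:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/model=\1/p' \
        -e 's/^[[:space:]]*Firmware Revision:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/firmware=\1/p'
}
```

Run as root it would be used like `hdparm -I /dev/sda | ata_ident`, where /dev/sda stands in for whichever device holds the SSD.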
@Joost: yep, same here, I've updated the BIOS but the lock-ups still happen, unfortunately. And I, too, am still unable to find firmware updates for the OEM SSDs. As a workaround I'm now putting the laptop on stand-by (closing the lid) every time I want to plug the AC charger in, to avoid the risk of a lock-up.

@Joost - you are right, the OEM SSD does not take standard firmware upgrades. Has anyone else used the "medium_power" setting successfully? I've had zero lock-ups since starting to use it, and my power usage is not bad (base about 5 watts).

@wes33 I have had no lock-ups with the "medium_power" setting ... and my power usage is about .8W below the "max_power" setting.

Hi, I will have a T440p. I will replace the hard drive with an 840 SSD. Does anybody have experience with BIOS 2.25 and SSD firmware DXT09B0Q? I assume SATA Active Link Power Management is on in the BIOS. Thank you.

I have tested an 840 SSD with firmware DXT09B0Q on a T440p with BIOS 2.25. I am using the 64-bit version of Fedora 20 with default settings. I have been using it for more than one week. Everything works OK.

Here to report that the "medium_power" workaround for the T440s appears to be SSD model sensitive/specific.
Currently booting off a:

    === START OF INFORMATION SECTION ===
    Model Family:     Marvell based SanDisk SSDs
    Device Model:     SanDisk SD6SB2M512G1022I
    Serial Number:    XXXXXXXXXXXXXX
    LU WWN Device Id: XXXXXXXXXXXXXX
    Firmware Version: X210400
    User Capacity:    512,110,190,592 bytes [512 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      Unknown (0x0011)
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ATA8-ACS T13/1699-D revision 6
    SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Wed Aug 27 21:30:50 2014 CDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

OS is up-to-date Arch with TLP for power management. Both "min_power" and "medium_power" pollute the journal with errors related to the SATA interface, particularly when waking from suspend and/or under heavy write activity while on battery. Only errors though, no lock-ups and/or data loss as others have reported. The saving grace in this situation is the 6+3 battery configuration; still getting ~10 hrs of mobile time under "max_performance", with other TLP features enabled and 35% backlight.

Have you updated your BIOS and SSD firmware? I think it could be a hardware bug.

Yes, BIOS and SSD firmware are the most up-to-date versions as of this posting date. Hopefully future updates to these and/or the Linux kernel overcome this current limitation.

I noticed that Lenovo offers a firmware update for several drives, including the MZ7TD512HAGM-0001L, to firmware version DXT06L0Q. It does NOT help with this bug; I still get the errors and am kicked to a read-only file system on (some) transitions from min_power to the performance setting.

It seems that the new Samsung 850 Pro SSD is OK with min_power (T440p machine):

    Device Model:     Samsung SSD 850 PRO 256GB
    LU WWN Device Id: 5 002538 8a066a5a2
    Firmware Version: EXM01B6Q

Don't know about the 840 on this particular machine.
Created attachment 129561: Setting /sys/class/scsi_host/host[0-2]/link_power_management_policy to max_power on another terminal

Hello everybody,

this is on a T440s 20AR-S0BH00. I am investigating repeated system lock-ups when hibernating, and found that when the ALPM setting is changed from min_power to max_power, there are a lot of ATA errors and the disk is set read-only, resulting in the need to hard reboot, as 'reboot' or 'poweroff' do not work anymore. Setting ALPM to medium_power does not trigger this behaviour. I believe that this is the culprit for constant failures of hibernation (at least one).

Attached are a screenshot and some other info, please tell me what else you need.

Thanks!
Joost
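For anyone trying to reproduce the transition described in this report, a minimal shell sketch follows. It assumes the policy files sit at /sys/class/scsi_host/host*/link_power_management_policy, as in the attachment title, and that the kernel accepts the values min_power, medium_power and max_performance (the last being what the report calls max_power). The function names and the SYSFS_ROOT override are hypothetical, added only so the script can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: show or set the SATA link power management policy on every host.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print "<hostN>: <policy>" for each host that exposes the attribute.
lpm_show() {
    for f in "$SYSFS_ROOT"/class/scsi_host/host*/link_power_management_policy; do
        [ -r "$f" ] || continue
        host=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$host" "$(cat "$f")"
    done
}

# Write one policy to every writable host attribute.
# Values outside the three assumed policy names are rejected up front.
lpm_set() {
    case "$1" in
        min_power|medium_power|max_performance) ;;
        *) echo "unknown policy: $1" >&2; return 1 ;;
    esac
    for f in "$SYSFS_ROOT"/class/scsi_host/host*/link_power_management_policy; do
        [ -w "$f" ] || continue
        echo "$1" > "$f"
    done
}
```

As root, `lpm_set medium_power && lpm_show` would apply and then print the policy. As noted in the comments, reading the file back only returns the last value written, so it does not prove what state the link is actually in.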