Bug 53551

Summary: Abysmal HDD/USB write speed after sleep on a UEFI system (without UEFI the same situation occurs after several suspend attempts)
Product: Platform Specific/Hardware
Reporter: Artem S. Tashkinov (aros)
Component: i386
Assignee: platform_i386
Status: CLOSED INVALID
Severity: blocking
CC: alan, bjorn, felipe.contreras, gfa, hendrik.haddorp, hpa, matt, the.ridikulus.rat
Priority: P1
Hardware: i386
OS: Linux
Kernel Version: 3.7.6 vanilla
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
dmesg
dmesg (UEFI disabled)
lspci -vvvv, dd, hdparm, /proc info on boot and after suspend
dmesg ok hp elitebook
/proc/iomem
lspci -vvvvx
/proc/mtrr
/proc/iomem when machine is slow
dmesg when the machine is slow
lspci -vvvvx when the machine is slow

Description Artem S. Tashkinov 2013-02-10 13:29:56 UTC
Created attachment 92831 [details]
dmesg

I have a P8P67 Pro motherboard made by ASUS, and I recently decided to switch to UEFI boot. Maybe it's a coincidence, or maybe Linux kernel 3.7.6 (vanilla) has some serious bug, but after waking up from sleep, write performance becomes intolerable.

On boot I have:

HDD write performance: ~120MB/sec
USB write performance: ~18MB/sec

After sleep:

HDD write performance: ~7MB/sec (i.e. 17 times slower)
USB write performance: ~0.5MB/sec (i.e. 36 times slower)

This is totally unacceptable, the computer becomes unusable.

I'm open to suggestions on how to debug this extremely serious problem.

P.S. Since I'm still using an x86 (32-bit) kernel, it switches the x86-64 UEFI support off on boot:

[ 0.000000] efi: EFI v2.31 by American Megatrends
[ 0.000000] efi: ACPI=0xdf385000 ACPI 2.0=0xdf385000 SMBIOS=0xdec28e98 MPS=0xfc9a0
[ 0.000000] efi: No EFI runtime due to 32/64-bit mismatch with kernel
...
[ 0.000000] efi: Setup done, disabling due to 32/64-bit mismatch
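
A minimal sketch of how the before/after write speeds quoted above could be measured repeatably. This is not the method used in this report (the attachments mention dd and hdparm); the function names `write_throughput_mb_s` and `slowdown` are hypothetical, and the sketch assumes the target directory sits on the disk or USB device under test. It writes a temporary file, forces it to the device with fsync so the page cache does not hide the slowdown, and reports MB/s.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, chunk_mb=1):
    """Write about total_mb of zeros to a temp file in `directory`,
    fsync it, and return the observed throughput in MB/s."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    n_chunks = total_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reached the device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (n_chunks * chunk_mb) / elapsed


def slowdown(before_mb_s, after_mb_s):
    """How many times slower the second measurement is."""
    return before_mb_s / after_mb_s


if __name__ == "__main__":
    # Run once after boot and once after resume, then compare.
    print(f"{write_throughput_mb_s('.'):.1f} MB/s")
    print(f"HDD slowdown from the report: {slowdown(120, 7):.0f}x")
    print(f"USB slowdown from the report: {slowdown(18, 0.5):.0f}x")
```

Running it in the same directory right after boot and again after each resume gives comparable numbers for the two states.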
Comment 1 Artem S. Tashkinov 2013-02-10 13:47:20 UTC
Created attachment 92841 [details]
dmesg (UEFI disabled)

OK, I've just rebooted with UEFI disabled, and everything (after suspend/resume) works as expected.
Comment 2 Artem S. Tashkinov 2013-02-12 06:56:31 UTC
Strangely, on the second or third suspend within a single boot session (i.e. without reboots/shutdowns), the speed became as awful as with a UEFI boot.

I.e. UEFI is not strictly necessary to trigger this problem; with UEFI it just happens immediately after the first suspend attempt.
Comment 3 Matt Fleming 2013-02-12 09:48:54 UTC
In that case, I suspect that it's related to the memory map layout in some way and the UEFI memory map just makes your problem occur more reliably.

Do you know of a kernel version that doesn't exhibit this bug, e.g. is this a regression? If so, doing a git bisect would be the simplest way to find out what commit introduced this slowdown.
Comment 4 Artem S. Tashkinov 2013-02-12 18:14:44 UTC
Created attachment 93161 [details]
lspci -vvvv, dd, hdparm, /proc info on boot and after suspend
Comment 5 Artem S. Tashkinov 2013-03-01 13:42:19 UTC
This time it's taken four suspend attempts for this problem to appear _without_ UEFI.

I will now try to run this test in Windows - let's see if it affects Windows as well - if I can reproduce this problem in this OS then almost certainly it's a BIOS bug.
Comment 6 Artem S. Tashkinov 2013-03-02 09:45:21 UTC
OK, Windows doesn't suffer from this bug, but it's not entirely happy either:

"The system firmware has changed the processor's memory type range registers (MTRRs) across a sleep state transition (S4). This can result in reduced resume performance."

That's kinda weird, as Linux doesn't detect this change in the MTRRs. Could it be the source of the slowdown?
Comment 7 Hendrik Haddorp 2013-05-24 16:17:46 UTC
I have an issue that looks identical, using Ubuntu 12.04 on a Lenovo ThinkPad W520. I believe it started once I upgraded to the 3.5 kernel series.

After a resume the disk write speed is super slow. Today I saw the same issue after taking my laptop out of the port replicator or once I placed it back in.
Comment 8 Alan 2013-11-18 15:43:08 UTC
BIOS bug.

You can probably mostly work around it by setting the mtrr values back sensibly using the /proc/mtrr interface and a bit of scripting.
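
A minimal sketch of the kind of scripting meant here, under stated assumptions: it parses the text format of /proc/mtrr (as shown in comment 12), compares a snapshot taken after boot with one taken after resume, and builds the `base=... size=... type=...` line that the kernel's MTRR documentation describes for writing a region back. The helper names `parse_mtrr`, `changed_registers` and `restore_line` are hypothetical. Actually writing to /proc/mtrr needs root, and a stale register may first need a `disable=N` write; this sketch only generates the strings and does not touch the system.

```python
import re

# Matches lines such as:
# reg00: base=0x0ff800000 ( 4088MB), size=    8MB, count=1: write-protect
MTRR_RE = re.compile(
    r"reg(\d+): base=(0x[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KMG])B, count=\d+: (\S+)"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Return {register: (base, size_in_bytes, type)} from /proc/mtrr text."""
    regs = {}
    for m in MTRR_RE.finditer(text):
        reg, base, size, unit, mtype = m.groups()
        regs[int(reg)] = (int(base, 16), int(size) * UNIT[unit], mtype)
    return regs


def changed_registers(before, after):
    """Registers whose entry differs between two parsed snapshots."""
    return sorted(r for r in set(before) | set(after)
                  if before.get(r) != after.get(r))


def restore_line(entry):
    """Line to write to /proc/mtrr to re-create one region (assumed format)."""
    base, size, mtype = entry
    return f"base={base:#x} size={size:#x} type={mtype}"


if __name__ == "__main__":
    with open("/proc/mtrr") as f:
        current = parse_mtrr(f.read())
    for reg, entry in sorted(current.items()):
        print(reg, restore_line(entry))
```

Saving the parsed snapshot at boot and calling `changed_registers` from a resume hook would show whether the firmware really altered anything before any value is written back.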
Comment 9 Artem S. Tashkinov 2013-11-18 16:09:57 UTC
Gotta reopen since it doesn't happen in Windows ever.

If that means Linux has to be more "Windows"-like, then so be it. Hardware vendors often don't give a damn about Linux and we have to play by their rules, unfortunately.

Otherwise there's no point in developing Linux.
Comment 10 Alan 2013-11-18 16:20:11 UTC
It's a firmware bug. I've noted how to work around it.

If you want to automate that, it's a distribution problem: a shell script in the suspend/resume scripts, not a kernel action. We can't put every workaround for every obscure firmware bug in the kernel, nor should we.

Alan
Comment 11 Artem S. Tashkinov 2013-11-18 16:47:28 UTC
I've verified that the MTRR values rarely change, so most likely it's not related to the MTRRs. So far, I haven't received any advice as to how I can even debug this issue.

> We can't put every workaround for every obscure firmware bug in the kernel
> nor should we.

Then why are there so many quirks and workarounds in the Linux kernel? I'd guess no fewer than several hundred.

How many owners of motherboards like the P8P67 use Linux? Less than 0.5%? So how do you know, and why do you think you have the right to say, that it's an obscure firmware bug? So far we haven't even found the root cause of this problem.

If you have any ideas how I can "see" what parts of the Linux kernel break in the process of a UEFI reboot, I'm here to follow and debug.

I happen to run i686 in PAE mode - the problem might be related to this fact.

I will reopen this bug report when I switch to x86-64 (I still haven't found a single serious reason to do so).
Comment 12 gfa 2015-03-18 05:52:02 UTC
I'm using amd64 kernels 3.16 and 3.19 and have the same problem: after suspend, disk I/O is very slow. I/O to tmpfs and the network is fine.

$ cat /proc/mtrr 
reg00: base=0x0ff800000 ( 4088MB), size=    8MB, count=1: write-protect
reg01: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg02: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg03: base=0x0bc000000 ( 3008MB), size=   64MB, count=1: uncachable
reg04: base=0x0bb000000 ( 2992MB), size=   16MB, count=1: uncachable
reg05: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back
reg06: base=0x200000000 ( 8192MB), size= 8192MB, count=1: write-back
reg07: base=0x400000000 (16384MB), size= 1024MB, count=1: write-back
reg08: base=0x43f000000 (17392MB), size=   16MB, count=1: uncachable

/proc/mtrr does not change after suspend.

$ uname -a
Linux io 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01) x86_64 GNU/Linux

please reopen
Comment 13 Artem S. Tashkinov 2015-03-18 09:31:39 UTC
(In reply to gfa from comment #12)

Please post your hardware configuration:

1) Motherboard and its BIOS version
2) CPU
3) RAM configuration
4) Type of storage and how your disk is connected to the motherboard (which SATA ports)
5) Please attach dmesg, lspci -vvv, /proc/iomem
Comment 14 gfa 2015-03-18 12:51:29 UTC
1) 
hp elitebook 8470p
BIOS Information
        Vendor: Hewlett-Packard
        Version: 68ICF Ver. F.31
        Release Date: 09/24/2012
2)
Intel(R) Core(TM) i5-3340M CPU @ 2.70GHz

3)
2x8G kingston ddr3 1600mhz

4) 
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
INTEL SSDSC2BW180A3H

ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
WDC WD3200BEKT-60PVMT0
Comment 15 gfa 2015-03-18 12:52:56 UTC
Created attachment 171091 [details]
dmesg ok hp elitebook

dmesg after booting the laptop, then suspend and resume, *without* the problem
Comment 16 gfa 2015-03-18 12:53:49 UTC
Created attachment 171101 [details]
/proc/iomem

/proc/iomem after suspend and resume without the problem
Comment 17 gfa 2015-03-18 12:55:17 UTC
Created attachment 171111 [details]
lspci -vvvvx

lspci -vvvvx after suspend and resume, without the problem
Comment 18 gfa 2015-03-18 12:56:41 UTC
Created attachment 171121 [details]
/proc/mtrr

after suspend and resume without the problem
Comment 19 gfa 2015-03-18 12:59:49 UTC
/proc/mtrr looks the same before and after suspend, with or without the problem.

The laptop is working OK now. Sometimes it works for a long time, suspending and resuming for almost a month.
Most of the time, though, I have to reboot after every suspend or hibernate.
Comment 20 gfa 2015-03-31 06:42:23 UTC
It's slow again now; uploading files.
Comment 21 gfa 2015-03-31 06:43:19 UTC
Created attachment 172731 [details]
/proc/iomem when machine is slow
Comment 22 gfa 2015-03-31 06:43:44 UTC
Created attachment 172741 [details]
dmesg when the machine is slow
Comment 23 gfa 2015-03-31 06:44:09 UTC
Created attachment 172751 [details]
lspci -vvvvx when the machine is slow
Comment 24 gfa 2015-03-31 06:55:37 UTC
I found this while looking at the differences in the lspci output between slow and OK:


00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

OK:
        Interrupt: pin A routed to IRQ 46
        Address: 00000000fee00398  Data: 0000
SLOW:
        Interrupt: pin A routed to IRQ 45
        Address: 00000000fee00378  Data: 0000


00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04) (prog-if 01 [AHCI 1.0])
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-

OK:
     Interrupt: pin B routed to IRQ 45
     Address: fee00378  Data: 0000

SLOW:
     Interrupt: pin B routed to IRQ 46
     Address: fee00398  Data: 0000


02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host Controller (rev 30)

OK:
    DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
SLOW:
    DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
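
A minimal sketch of how a per-device comparison like the one above could be automated, assuming two saved `lspci -vvv` dumps (one from the OK state, one from the slow state). The names `split_devices` and `diff_devices` are hypothetical. It groups each dump by PCI slot address and reports, per device, the lines present in only one of the two dumps.

```python
import re

# A device header starts at column 0 with a slot address such as "00:1f.2"
# (optionally prefixed by a PCI domain, e.g. "0000:00:1f.2").
SLOT_RE = re.compile(r"^(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\b")


def split_devices(lspci_text):
    """Return {slot: [stripped detail lines]} for one lspci dump."""
    devices = {}
    current = None
    for line in lspci_text.splitlines():
        m = SLOT_RE.match(line)
        if m:
            current = m.group(0)
            devices[current] = []
        elif current is not None and line.strip():
            devices[current].append(line.strip())
    return devices


def diff_devices(ok_text, slow_text):
    """Return {slot: (lines only in OK dump, lines only in slow dump)}."""
    ok, slow = split_devices(ok_text), split_devices(slow_text)
    report = {}
    for slot in sorted(set(ok) | set(slow)):
        a, b = ok.get(slot, []), slow.get(slot, [])
        only_ok = [line for line in a if line not in b]
        only_slow = [line for line in b if line not in a]
        if only_ok or only_slow:
            report[slot] = (only_ok, only_slow)
    return report


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as f_ok, open(sys.argv[2]) as f_slow:
        for slot, (only_ok, only_slow) in diff_devices(
                f_ok.read(), f_slow.read()).items():
            print(slot)
            for line in only_ok:
                print("  OK:  ", line)
            for line in only_slow:
                print("  SLOW:", line)
```

Note that a difference found this way is only a lead: values such as the IRQ number assigned to an MSI interrupt can legitimately differ between boots or after a resume.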
Comment 25 gfa 2015-03-31 07:54:25 UTC
I was able to get a BIOS update from my vendor. Let's hope that fixes the issue.
Comment 26 gfa 2015-04-02 08:39:56 UTC
I'm sorry, but the bug is not in the kernel; the acpi-support scripts fooled me. You can close this bug.