Bug 51771 (acjohnson) - slow and sporadic hard drive write performance on ivy bridge (Toshiba L840 Core i7-3612QM)
Status: RESOLVED INSUFFICIENT_DATA
Alias: acjohnson
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: io_other
URL: https://bugs.launchpad.net/ubuntu/+so...
Keywords:
Depends on:
Blocks:
 
Reported: 2012-12-18 01:47 UTC by Aaron Johnson
Modified: 2013-01-11 00:06 UTC

See Also:
Kernel Version: 3.7.0
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg output during write speed issues (90.12 KB, text/plain)
2012-12-18 01:47 UTC, Aaron Johnson

Description Aaron Johnson 2012-12-18 01:47:52 UTC
Created attachment 89391
dmesg output during write speed issues

This bug was originally posted here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1090715

I have been troubleshooting this problem for several months now, and the only cause I have been able to narrow it down to is the kernel version. The problem did not exist on Linux 3.2, and it has haunted me on every later kernel version I have tried (3.4 and later).

I am currently running Ubuntu 13.04 alpha (32-bit PAE, Linux 3.7.0) on a brand new Toshiba L840 laptop with an Ivy Bridge Core i7 CPU. I started with Ubuntu 12.04 LTS and experienced massive instability (constant hard lockups). With Ubuntu 12.10 everything got worse: the lockups did not stop (although they were less frequent), and hard drive performance became abysmal.

On Ubuntu 13.04 I finally have a stable system with absolutely zero hard lockups (for the past two weeks straight) but I am still having terrible write performance issues :(

When running Linux 3.2 I consistently get 80-100 MB/s write performance on my internal hard drive. On Linux 3.4-3.7 I am lucky if I get 40 MB/s, and right now it has dropped to just over 1 MB/s, so I will have to reboot to get better write performance temporarily:

owner@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 73.9852 s, 1.1 MB/s
owner@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 66.5723 s, 1.3 MB/s
owner@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 60.1286 s, 1.4 MB/s
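
A caveat on how to read these figures, offered as a sketch rather than a diagnosis: plain `dd` without a sync option reports its rate as soon as the data has been handed to the kernel, so part of an 84 MB write may still be sitting in the page cache when the number is printed. The variant below assumes GNU dd, whose `conv=fdatasync` option flushes the file data before the rate is reported. The function name `write_test` and its defaults are hypothetical and only mirror the command used above.

```shell
#!/bin/sh
# Hypothetical helper: repeat the write test from the report, but include
# the time needed to flush the data to the device.
# Assumes GNU dd (conv=fdatasync is a GNU coreutils option).
write_test() {
    target=${1:-/tmp/output}   # file to write, same default as in the report
    count=${2:-10k}            # number of 8k blocks, same default as in the report
    # dd prints its summary on stderr; redirect it so callers can capture it.
    dd if=/dev/zero of="$target" bs=8k count="$count" conv=fdatasync 2>&1
    rm -f "$target"
}
```

Calling `write_test` with no arguments repeats the 84 MB test with the flush included, which makes runs on different kernels easier to compare.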

I have experienced this problem not only on Ubuntu but on Debian and Fedora as well (all 32-bit PAE kernels).

Here is a list of troubleshooting steps I have tried to narrow down the issue:

* Ran Memtest86+ for over 24 hours on my laptop and it completed without error.
* Scanned my hard drive using mhdd and found no bad sectors.
* Replaced my hard drive with one of a different size and brand, and the issue remained.
* Did a clean install of Precise, updated to the latest packages, installed the Quantal kernel from the stable repositories, and rebooted. The bug affected my computer immediately.
* Did a clean install of 13.04 and the bug still exists.
* Installed Ubuntu (12.04 and 13.04, 32-bit) on a USB flash drive to see whether the bug would eventually appear there, and it does. Eventually the write speed, even over the USB bus, drops to 1-2 MB/s.
* Installed other distributions as well (Fedora and Debian), and both have the same issue.
* Installed the Ubuntu mainline 3.7 kernel and the problem still exists.

It also seems that certain I/O-heavy processes can trigger the terrible write performance: torrent programs, and even installing software with apt-get, have triggered it in the past.

One more thing worth noting: Ubuntu 13.04 kernel version 3.7.0-2.8 (which I believe was based on Linux 3.7-rc5) did give me better write performance right after boot (something like 60-80 MB/s), but eventually the write performance would drop to around 1 MB/s just like the others.
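
Since the slowdown appears some time after boot, one thing that could be logged while it develops is how much data the kernel is holding for writeback. The sketch below reads the `Dirty` and `Writeback` counters from `/proc/meminfo`. The function name `dirty_snapshot` is hypothetical, and nothing in this report establishes that these counters are related to the problem; it is only a starting point for gathering data.

```shell
#!/bin/sh
# Hypothetical helper: print the Dirty and Writeback counters from a
# meminfo-style file (defaults to /proc/meminfo).
dirty_snapshot() {
    # Match the exact field names so that e.g. WritebackTmp is skipped.
    awk '/^(Dirty|Writeback):/ { print $1, $2, $3 }' "${1:-/proc/meminfo}"
}
```

Running `while sleep 5; do dirty_snapshot; done` in a second terminal during a slow `dd` run would show whether these values stay high while the write rate is low.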

If someone could at least give me some ideas on where to go from here that would be great. I have been monitoring my logs but at this point I really don't know what to look for.

I've attached a dmesg output if that will help. Please let me know what else I can do to troubleshoot this issue further.

Thank you,
Aaron Johnson
Comment 1 Aaron Johnson 2012-12-30 04:26:18 UTC
Update:

I am able to get excellent write performance when running the Ubuntu mainline kernel version 3.7-rc4-raring located here:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc4-raring/

Good write speed tests:

acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.128812 s, 651 MB/s
acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.444917 s, 189 MB/s
acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.13095 s, 641 MB/s
acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.13606 s, 617 MB/s
acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.137573 s, 610 MB/s
acjohnson@Satellite-L840:~$ dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output
10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 0.373419 s, 225 MB/s


The stability of my L840 is actually better with this kernel than with any other version I have tested so far, with one exception: suspend causes lockups. I believe the lockup consistently occurs on the third attempt to suspend, and it always happens before the machine actually enters suspend.

I still would like to get to the bottom of the write performance bug though.

Why is the write performance so much better in the rc versions of Linux 3.7 than in the final release and even 3.7.1?
Comment 2 Alan 2013-01-02 14:24:10 UTC
Without knowing what configuration differences there are between the kernels it's impossible to tell.

It's certainly very interesting that there is a difference, and a starting point would be to compare

- the dmesg of the two
- the modules loaded
- the kernel build configuration options
- any boot options being used
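
The four items above could be collected with something like the following sketch, run once under each kernel. The function name `collect_kernel_info` and the output layout are hypothetical. The `/boot/config-<version>` path is where Ubuntu and Debian install the build configuration; other distributions may keep it elsewhere, in which case `config.txt` is simply not created.

```shell
#!/bin/sh
# Hypothetical helper: save dmesg, loaded modules, boot options and the
# build configuration of the running kernel so two kernels can be diffed.
collect_kernel_info() {
    outdir=${1:-./kernel-info}/$(uname -r)   # one directory per kernel version
    mkdir -p "$outdir"
    dmesg > "$outdir/dmesg.txt" 2>&1 || true                  # kernel log (may need root)
    lsmod 2>&1 | sort > "$outdir/modules.txt"                 # loaded modules, sorted for diffing
    cat /proc/cmdline > "$outdir/cmdline.txt" 2>&1 || true    # boot options
    # Build configuration as installed by Ubuntu and Debian.
    cp "/boot/config-$(uname -r)" "$outdir/config.txt" 2>/dev/null || true
    echo "$outdir"                                            # report where the files went
}
```

After running it under both kernels, `diff -u` on the matching files in the two directories shows the differences in each category.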
Comment 3 Aaron Johnson 2013-01-11 00:06:54 UTC
Update:

So I never really got to the bottom of why this is happening on the 32-bit PAE 3.7 kernel. I decided to install 64-bit Ubuntu raring (13.04), and guess what, it works flawlessly.

So apparently running a 32-bit kernel on new hardware is a bad idea, which I suppose I should have known. It's too bad, though, because a few third-party applications that don't properly support 64-bit Linux are now giving me a hard time.

I am happy to report that Ubuntu raring 64-bit runs exceptionally well on my Toshiba L840 right out of the box, even with the stock kernel. No hard drive performance issues, and no suspend lockups either ;)
