Bug 12553
Summary: | System slows down to a crawl during and after prolonged disk usage | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Matti Niemenmaa (matti.niemenmaa+kernelbugs) |
Component: | x86-64 | Assignee: | io_other |
Status: | CLOSED DOCUMENTED | ||
Severity: | high | CC: | akpm, alan, matti.niemenmaa+kernelbugs |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 2.6.28.1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
strace man man >/dev/null, normal speed
strace man man >/dev/null, slow speed kernel .config dmesg lspci -vvv |
Description
Matti Niemenmaa
2009-01-27 11:40:30 UTC
Created attachment 20011 [details]
strace man man >/dev/null, normal speed
Created attachment 20012 [details]
strace man man >/dev/null, slow speed
Created attachment 20013 [details]
kernel .config
Created attachment 20014 [details]
dmesg
Created attachment 20016 [details]
lspci -vvv
Do you know that the slowdown is IO related at all? What if you just run something like main() { int i; for (i = 0; i < 1000000000; i++) ; } when the machine is running fast and when it's running slow? Try running echo 3 > /proc/sys/vm/drop_caches to blow away cache, see if that helps. Try running `hdparm -t /dev/sda' or whatever to see if the disk is getting slower. If it's actually the CPU getting slower then check /proc/cpuinfo while running the above busy-loop, see if the CPU clock frequency has slowed down. This could even be MTRR-related. Maybe there's some memory which is running really slowly. That memory doesn't initially get used. Later, the system start to use it and you notice the effects. I don't think I've looked at the behaviour of a program that does no IO (d'oh!). I'll see how that does as well as try drop_caches when I get a chance to, thanks. I recall having tried hdparm -t and -T without seeing any slowdown. Although it's only as a result of my experiment that I noticed the behaviour of the system slowdown: I might have just run hdparm in between those spikes. For what it's worth I did have an MTRR-related problem earlier, although it didn't seem to cause any performance problems, only complaints from the proprietary fglrx ATI graphics driver, and I've managed to 'fix' it since. My original /proc/mtrr looked like the following: reg00: base=0x000000000 ( 0MB), size= 8192MB, count=1: write-back reg01: base=0x200000000 ( 8192MB), size= 512MB, count=1: write-back reg02: base=0x220000000 ( 8704MB), size= 256MB, count=1: write-back reg03: base=0x0d0000000 ( 3328MB), size= 256MB, count=1: uncachable reg04: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable Currently I just run a script on startup (much trial and error was involved in getting it to work) which turns it into the following: reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back reg01: base=0x0d0000000 ( 3328MB), size= 256MB, count=1: write-combining reg02: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable reg03: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back reg04: base=0x200000000 ( 8192MB), size= 2048MB, count=1: write-back I've noticed the slowdown behaviour with both of the above MTRR tables, though. Alright, some more info: - echo 3 > /proc/sys/vm/drop_caches is a better way of fixing it than a reboot, thanks for that. Of course, the problem recurs after some time. - /proc/cpuinfo tells me that I'm still running at around 3 GHz. - I ran that program but with a constant of only 10 million and not 1000 million; takes around 0.02 seconds normally but it took over 4 seconds when the system was going slow: $ time ./arst Command exited with non-zero status 192 4.63user 0.01system 0:04.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+124minor)pagefaults 0swaps - hdparm -t does show a slowdown, but judging from the output of 'time' it's more due to the CPU overhead on system calls: $ sudo time hdparm -t /dev/sda /dev/sda: Timing buffered disk reads: 60 MB in 3.09 seconds = 19.45 MB/sec 0.07user 3.18system 0:06.72elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k 123136inputs+0outputs (0major+1089minor)pagefaults 0swaps - I ran memtest86+ for about 12 hours in case there's a problem with my RAM, but it came up with zero errors. This may or may not be related: doing 'echo 3 > /proc/sys/vm/drop_caches' caused a GPF in the kernel: [37331.924053] general protection fault: 0000 [#5] PREEMPT SMP [37331.924057] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map [37331.924059] CPU 3 [37331.924061] Modules linked in: [37331.924064] Pid: 15141, comm: sh Tainted: G D 2.6.28.1-deewiant #20 [37331.924066] RIP: 0010:[<ffffffff80244cd1>] [<ffffffff80244cd1>] __wake_up_bit+0xc/0x2d [37331.924072] RSP: 0018:ffff88010ea33d88 EFLAGS: 00010296 [37331.924074] RAX: 57fc74959adaa508 RBX: ffffe20001f3c6e0 RCX: 0000000000000040 [37331.924076] RDX: 0000000000000000 RSI: ffffe20001f3c6e0 RDI: 57fc74959adaa500 [37331.924078] RBP: 0000000005472286 R08: c000000000000000 R09: 0000000d80349a49 [37331.924080] R10: 0000000000006000 R11: 0000000000000001 R12: 0000000000000009 [37331.924082] R13: ffff88010ea33e20 R14: 000000000000027f R15: ffff88022f70c7d0 [37331.924084] FS: 00007f12bfb0d6f0(0000) GS:ffff88022f802b00(0000) knlGS:0000000000000000 [37331.924087] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [37331.924088] CR2: 00007f12bfb28000 CR3: 000000013a99e000 CR4: 00000000000406e0 [37331.924091] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [37331.924093] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [37331.924095] Process sh (pid: 15141, threadinfo ffff88010ea32000, task ffff88016828a000) [37331.924096] Stack: [37331.924098] 0000000000000008 ffffffff00000000 ffffe20001f3c6e0 ffffffff80264687 [37331.924101] ffff88010ea33dd9 ffff88010ea33dd8 0100000100000000 ffffffffffffffff [37331.924104] 000000000000000e 0000000000000000 ffffe200053d7618 ffffe20004c011d8 [37331.924108] Call Trace: [37331.924110] [<ffffffff80264687>] ? __invalidate_mapping_pages+0xcb/0x140 [37331.924115] [<ffffffff8050bbae>] ? _spin_lock_irqsave+0x33/0x3b [37331.924119] [<ffffffff8029d162>] ? drop_caches_sysctl_handler+0xca/0x184 [37331.924122] [<ffffffff802c1eae>] ? proc_sys_call_handler+0x91/0xb8 [37331.924127] [<ffffffff8028451b>] ? vfs_write+0xad/0x156 [37331.924130] [<ffffffff80284680>] ? sys_write+0x45/0x6e [37331.924132] [<ffffffff8020b43a>] ? system_call_fastpath+0x16/0x1b [37331.924136] Code: 00 48 29 f8 2b 8a 10 d0 66 80 4c 01 c0 48 d3 e8 48 6b c0 18 48 03 82 00 d0 66 80 5a 5b 5d c3 48 83 ec 18 48 8d 47 08 89 54 24 08 <48> 39 47 08 74 16 48 89 34 24 48 89 e1 ba 01 00 00 00 be 03 00 [37331.924162] RIP [<ffffffff80244cd1>] __wake_up_bit+0xc/0x2d [37331.924165] RSP <ffff88010ea33d88> [37331.924167] ---[ end trace 4d409d3bbf01b279 ]--- The system isn't currently in its slow state but every time I run the command I get a trace like the above in dmesg. I don't know if its notable that the first trace I got said "Not tainted" and later ones say "Tainted: G D". How much RAM does the system have and how is it laid out according to the early boot messages - wrong MTRR settings for an area of RAM could cause exactly what you see The system has 8 GB of RAM, four 2 GB sticks. As for how it's laid out, I've attached my dmesg and posted my MTRR settings (both original and current, both of which exhibit the problem) in comment #7; is there any other source of info I should look at? Ok so physical memory layout is 0-639K RAM 640K->1M ISA hole RAM 1MB -> 3GB PCI and the like 3GB-4GB RAM 4GB->8GB + a bit You don't seem to have an MTRR range covering 2GB-3GB Your original BIOS MTRR looks closer You might also want to see if it occurs without ever loading fglrx in case its messing up due to the (rather odd) memory mtrr setups coming from the system I haven't used fglrx in months, this does occur without it as well. It was one of the first things I removed when trying to get rid of the problem. :-) xf86-video-radeonhd (the video driver I use currently) does complain about MTRRs if I restart X: the 256MB range completely disappears from /proc/mtrr when I close X. It doesn't seem to cause any problems, though. I'll see about having the MTRRs cover the 2-3 GB range, in case that helps; I'll post results later. Okay, it seems that it was indeed an MTRR problem. I've been running with the following settings for the past week and haven't run into the problem even once: reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back reg02: base=0x0c0000000 ( 3072MB), size= 256MB, count=1: write-back reg03: base=0x0d0000000 ( 3328MB), size= 256MB, count=1: write-combining reg04: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable reg05: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back reg06: base=0x200000000 ( 8192MB), size= 512MB, count=1: write-back reg07: base=0x220000000 ( 8704MB), size= 256MB, count=1: write-back I'm not sure if this is optimal (X still manages to lose that write-combining MTRR permanently if restarted) but it takes care of this problem, at least. I'll leave this report open for now, since it seems to me that correct MTRR setting is something that should be handled by the kernel. Bug 10508 looks similar to this (the original MTRRs described there are similar to what I started out with), but I'll leave deciding whether this is a duplicate to somebody more knowledgeable. Reassigned to Platform Specific/Hardware since I guess that's where this belongs. In case it's relevant, here's the order in which I need to disable the original MTRR settings (see comment #7 for them). I haven't tried all permutations but this is the first one I found that didn't slow the system down to pre-286 speeds: echo "disable=0" >| /proc/mtrr echo "disable=4" >| /proc/mtrr echo "disable=3" >| /proc/mtrr echo "disable=2" >| /proc/mtrr echo "disable=1" >| /proc/mtrr |