Latest working kernel version: 2.6.28-rc1
Earliest failing kernel version: 2.6.28-rc2
Distribution: Debian
Hardware Environment: 4x 1TB disks behind a Sil3726 PMP connected to a Sil3132
Software Environment: 64bit kernel + 32bit userspace, random debugging enabled in the kernel

Problem Description:
It took some time until the conversion of SATA from home-grown queuing to the block-level one became stable (bug 11898 was just fixed), but unfortunately, although things are now stable, there is quite a big performance drop: writes to disks behind the PMP are now only 50-70% of the speed before Jens's conversion.

When I revert Jens's changes, I get writes of ~48 MBps to each of the 4 disks - 192 MBps total bandwidth (after the sata_sil24 change to PCIe during 2.6.28-rc it was actually 52 MBps/disk). All 4 disks are written to concurrently, and the test completes on all 4 disks almost simultaneously.

Now, with default values, each disk gets a completely different bandwidth, and when I watch the LEDs on the disks I see that most of the time I/O goes to only one of the disks, and which disk is being used switches every ~2 seconds. The only way to get at least some part of the bandwidth back is to allow only 8 queued commands on each disk - then they mostly fit into the 31 commands on the channel, and the starvation code is almost never triggered.

The test just starts 4 concurrent 'dd' runs to ext3 filesystems, one on each of the 4 disks, writing 4 GB of data to each one.

Default setting:

gwy:~# ./x.sh
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 110.417 s, 38.0 MB/s
gwy:~# 4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 155.827 s, 26.9 MB/s
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 206.971 s, 20.3 MB/s
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 206.301 s, 20.3 MB/s

Only 8 requests per drive; there are 4 drives sharing one tag map with 31 entries:

gwy:~# for a in /sys/block/*/queue/nr_requests; do echo 8 > $a; done
gwy:~# ./x.sh
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 103.588 s, 40.5 MB/s
gwy:~# 4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 110.86 s, 37.8 MB/s
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 105.978 s, 39.6 MB/s
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 107.94 s, 38.9 MB/s
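For reference, a minimal sketch of what the x.sh driver might look like; the device/mount-point names and the dd block size are assumptions (only the ~4.2 GB per-disk write size is taken from the output above), so this is a reconstruction, not the original script:

#!/bin/sh
# Start 4 concurrent sequential writers, one per disk behind the PMP.
# 4000 x 1 MiB = 4194304000 bytes written to the ext3 fs on each drive.
for i in 1 2 3 4; do
    dd if=/dev/zero of=/mnt/disk$i/testfile bs=1M count=4000 &
done
wait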
What additional patches were applied? Is this a plain old post-2.6.27 regression?
The patch from bug 11898 comment 36, to get rid of the crashes/hangs while running the dd test above. I do not know whether James has already submitted it to you, but it is not present in Linus's kernel yet.

To be absolutely sure this is caused by Jens's changes, I'm now building the current Linus tree with these 4 changes reverted:

43a49cbdf31e812c0d8f553d433b09b421f5d52c
3070f69b66b7ab2f02d8a2500edae07039c38508
e013e13bf605b9e6b702adffbe2853cfc60e7806
2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e

It looks like the difference between nr_requests 8 and 128 disappears when hddtemp (which sends a SMART non-NCQ command every now and then) is killed.
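A sketch of how such a test tree could be prepared, assuming the commits revert cleanly on top of the then-current Linus tree (the order of the reverts below is a guess; the report does not say how the reverted kernel was actually built):

# Revert the four block-layer changes, newest first, then rebuild.
git revert 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
git revert e013e13bf605b9e6b702adffbe2853cfc60e7806
git revert 3070f69b66b7ab2f02d8a2500edae07039c38508
git revert 43a49cbdf31e812c0d8f553d433b09b421f5d52c
make && make modules_install && make install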
Um, it looks like Tejun already reverted Jens's changes and I did not notice it after syncing. In that case I'll have to figure out where else part of my bandwidth went...
Sorry, after rerunning the tests on current git with Tejun's revert, I'm back at ~50 MBps/drive.
Hmm... the performance drop is unexpected. Strange. Jens, maybe this is caused by the delay in freeing tags?