Bug 16021 - mptsas target reset under heavy duty
Summary: mptsas target reset under heavy duty
Status: RESOLVED INVALID
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 normal
Assignee: kashyap
 
Reported: 2010-05-21 09:41 UTC by dujun
Modified: 2012-07-12 13:29 UTC

See Also:
Kernel Version: 2.6.32
Subsystem:
Regression: No
Bisected commit-id:


Description dujun 2010-05-21 09:41:04 UTC
We are using the mptsas driver 4.22.00.00 from the official LSI web site to drive a 1068E chip, which is connected to two LSI 12x expander chips. Each expander connects 8 SATA disks. Under heavy load, i.e., when we connect 16 HDDs and build a Linux software RAID5 array, dmesg shows many errors like these:
[20826.611906] mptscsih: ioc0: task abort: FAILED (sc=ffff88007b348c00)
[20826.611919] mptscsih: ioc0: attempting target reset! (sc=ffff88007dc78700)
[20826.611921] sd 4:0:16:0: [sdq] CDB: Read(10): 28 00 54 94 55 00 00 00 90 00
[20827.361823] mptscsih: ioc0: target reset: SUCCESS (sc=ffff88007dc78700)
[20837.361008] mptscsih: ioc0: attempting task abort! (sc=ffff88007dc78700)
[20837.361011] sd 4:0:16:0: [sdq] CDB: Test Unit Ready: 00 00 00 00 00 00
[20837.385655] mptbase: ioc0: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
[20838.111238] mptbase: ioc0: LogInfo(0x31130000): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x0000)

This halts I/O for about two minutes; after a successful target reset, I/O resumes.
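
To make the two LogInfo values in the log above easier to read, here is a minimal Python sketch that splits a SAS LogInfo word into its fields. The bit layout (bus type in bits 31-28, originator in bits 27-24, code in bits 23-16, subcode in bits 15-0) and the originator names are assumptions based on how the mptbase driver appears to decode these values; they are consistent with the Originator and SubCode printed in the log, but the function name and the table are illustrative only.

```python
# Assumed originator names; index 1 matches the "PL" printed in the log.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_sas_loginfo(value):
    """Split a 32-bit SAS LogInfo word into its assumed bit fields."""
    return {
        "bus_type": (value >> 28) & 0xF,    # assumed: 3 means SAS
        "originator": (value >> 24) & 0xF,  # index into ORIGINATORS
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


def describe(value):
    """Format a LogInfo word in a style similar to the dmesg lines."""
    f = decode_sas_loginfo(value)
    who = ORIGINATORS.get(f["originator"], "unknown")
    return "LogInfo(0x%08x): Originator={%s}, Code=0x%02x, SubCode(0x%04x)" % (
        value, who, f["code"], f["subcode"])
```

Calling `describe(0x31112000)` and `describe(0x31130000)` gives the numeric code for the entries the driver labels "Reset" and "IO Not Yet Executed" respectively.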

If we reduce the number of HDDs to fewer than 12, or set the SCSI command queue depth of each disk to 1, the build completes without any errors.
We also tested the same scenario with an LSI 36x expander chip and saw no problems.
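
For anyone trying to reproduce the queue-depth workaround, a rough Python sketch follows. It assumes each disk exposes the standard SCSI sysfs attribute `/sys/block/<disk>/device/queue_depth` and that the script runs as root. The function name, the `sysfs_root` parameter, and the disk names in the usage comment are hypothetical, not taken from this report.

```python
from pathlib import Path


def set_queue_depth(disks, depth=1, sysfs_root="/sys/block"):
    """Write `depth` to each disk's queue_depth attribute.

    Returns a dict mapping disk name to (previous_depth, new_depth),
    so the old values can be restored later.
    """
    changed = {}
    for disk in disks:
        attr = Path(sysfs_root) / disk / "device" / "queue_depth"
        previous = int(attr.read_text().strip())
        attr.write_text("%d\n" % depth)
        changed[disk] = (previous, depth)
    return changed


# Example (hypothetical disk names, requires root):
#   set_queue_depth(["sdb", "sdc", "sdq"], depth=1)
```

The `sysfs_root` parameter exists only so the function can be tried against a scratch directory before touching real devices. The setting does not persist across reboots.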

Is this an mptsas driver bug, or a bug in the 12x expander chip?
Comment 1 dujun 2010-05-24 09:05:12 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=14831 
The patch from the bug report above seems to solve the problem. However, performance is much worse than before.
