Bug 10413 - (mptbase) Possible incompatibility between SATA and SAS
Summary: (mptbase) Possible incompatibility between SATA and SAS
Status: CLOSED OBSOLETE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 high
Assignee: linux-scsi@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-07 04:31 UTC by Arthur
Modified: 2012-05-18 10:37 UTC

See Also:
Kernel Version: 2.6.24
Subsystem:
Regression: No
Bisected commit-id:



Description Arthur 2008-04-07 04:31:47 UTC
Latest working kernel version:
Earliest failing kernel version:
Distribution: Debian Lenny
Hardware Environment: ASUS P5BV/SAS, Kingston 4 GB (2x2) DDRAM 677 MHz, 5 Seagate SATA drives, 1 TB each
Software Environment: mdadm
Problem Description: I have 4 disks connected to the SATA ports and one to the SAS controller. All of them are combined into a RAID5 array using mdadm.

The server reboots on its own from time to time; the last reboot happened during the scheduled array check. The logs show nothing unusual except this message: mptbase: ioc0: LogInfo(0x31123000). What does it mean?
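
For readers trying to interpret the value, here is a minimal sketch of how the 32-bit LogInfo word can be split into fields. It assumes the layout that matches the driver's own decoded output in the log lines of this report (top 4 bits bus type, next 4 bits originator, next 8 bits code, low 16 bits subcode). The function name decode_sas_loginfo is hypothetical, and the sketch only reproduces the fields the driver already prints; it does not explain the cause of the abort.

```python
def decode_sas_loginfo(value):
    """Split a 32-bit LogInfo word into its bit fields (assumed layout)."""
    return {
        "bus_type": (value >> 28) & 0xF,    # assumed: 0x3 denotes SAS
        "originator": (value >> 24) & 0xF,  # 0x1 is printed as {PL} in this log
        "code": (value >> 16) & 0xFF,       # 0x12 is printed as {Abort} in this log
        "subcode": value & 0xFFFF,          # printed as SubCode(0x3000) in this log
    }


if __name__ == "__main__":
    fields = decode_sas_loginfo(0x31123000)
    for name, field in fields.items():
        print(f"{name}: {field:#x}")
```

For 0x31123000 this yields originator 0x1, code 0x12, and subcode 0x3000, which lines up with the Originator={PL}, Code={Abort}, SubCode(0x3000) text the kernel prints.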

Apr 6 01:06:02 localhost kernel: md: data-check of RAID array md0
Apr 6 01:06:02 localhost kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Apr 6 01:06:02 localhost kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Apr 6 01:06:02 localhost kernel: md: using 128k window, over a total of 976759936 blocks.
Apr 6 01:06:02 localhost mdadm: RebuildStarted event detected on md device /dev/md0
Apr 6 01:09:01 localhost /USR/SBIN/CRON[1503]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 01:17:02 localhost /USR/SBIN/CRON[4439]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 6 01:17:44 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 01:17:44 localhost last message repeated 9 times
Apr 6 01:31:55 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 01:31:55 localhost last message repeated 11 times
Apr 6 01:38:58 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 01:38:59 localhost last message repeated 10 times
Apr 6 01:39:01 localhost /USR/SBIN/CRON[12372]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 01:53:32 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 01:53:33 localhost last message repeated 11 times
Apr 6 02:04:32 localhost -- MARK --
Apr 6 02:04:41 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 02:04:42 localhost last message repeated 13 times
Apr 6 02:07:02 localhost mdadm: Rebuild20 event detected on md device /dev/md0
Apr 6 02:09:02 localhost /USR/SBIN/CRON[23206]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 02:17:02 localhost /USR/SBIN/CRON[26105]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 6 02:39:01 localhost /USR/SBIN/CRON[1605]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 02:39:34 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 02:39:35 localhost last message repeated 10 times
Apr 6 03:04:02 localhost mdadm: Rebuild40 event detected on md device /dev/md0
Apr 6 03:09:01 localhost /USR/SBIN/CRON[12430]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 03:15:19 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 03:15:20 localhost last message repeated 12 times
Apr 6 03:17:01 localhost /USR/SBIN/CRON[15314]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 6 03:20:04 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 03:20:05 localhost last message repeated 10 times
Apr 6 03:34:23 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 6 03:34:24 localhost last message repeated 13 times
Apr 6 03:39:01 localhost /USR/SBIN/CRON[23220]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 03:55:02 localhost mdadm: Rebuild60 event detected on md device /dev/md0
Apr 6 04:09:01 localhost /USR/SBIN/CRON[1588]: (root) CMD ( [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -r -0 rm)
Apr 6 04:17:01 localhost /USR/SBIN/CRON[4498]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)


Steps to reproduce: run mdadm checkarray script
Comment 1 Arthur 2008-04-07 04:34:27 UTC
The distribution is not Lenny; it is Debian etch x86_64.
Comment 2 Arthur 2008-04-07 23:07:34 UTC
When I connect all 5 drives to SAS, I get a lot of the above-mentioned messages during startup and afterwards. Running checkarray causes a reboot (the percentage completed before the reboot varies, up to about ten). With only one drive connected to SAS I was able to complete the array check and even work for 1-2 days.
Comment 3 Arthur 2008-04-09 03:50:07 UTC
I recompiled the kernel (2.6.24) with the driver from the LSI download page (mptlinux-4.00.21.00-src.tar.gz). The driver version is now 4.00.21 instead of the default 3.06.05. All 5 hard drives are connected to SAS.
host1: ioc0: fw=01.23.00.00 bios=06.12.00.00 driver=4.00.21.00 mpi=105
LSISAS1068 B1: board_name=N/A assembly=ASUSTek tracer=N/A
nvdata_persistent=2b00h nvdata_default=2b00h
io_delay=00 device_delay=00
debug_level=00000000h

But the symptoms remain the same. I still get a lot of error messages:

Apr 9 13:42:31 localhost kernel: ReiserFS: dm-1: found reiserfs format "3.6" with standard journal
Apr 9 13:42:31 localhost kernel: ReiserFS: dm-1: using ordered data mode
Apr 9 13:42:31 localhost kernel: ReiserFS: dm-1: journal params: device dm-1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Apr 9 13:42:31 localhost kernel: ReiserFS: dm-1: checking transaction log (dm-1)
Apr 9 13:42:31 localhost kernel: ReiserFS: dm-1: Using r5 hash to sort names
Apr 9 13:43:23 localhost kernel: md: data-check of RAID array md0
Apr 9 13:43:23 localhost kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Apr 9 13:43:23 localhost kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Apr 9 13:43:23 localhost kernel: md: using 128k window, over a total of 976759936 blocks.
Apr 9 13:43:23 localhost mdadm: RebuildStarted event detected on md device /dev/md0
Apr 9 13:43:26 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 9 13:43:56 localhost last message repeated 79 times
Apr 9 13:44:57 localhost last message repeated 73 times
Apr 9 13:44:57 localhost last message repeated 8 times
Apr 9 13:45:26 localhost kernel: mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 9 13:45:26 localhost last message repeated 12 times

This happens when I run /usr/share/mdadm/checkarray --cron --all --quiet
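
To compare how often the message appears in different cabling setups, the log can be tallied. The following is a minimal sketch, assuming that each "last message repeated N times" line refers to the most recently logged message (the usual syslogd behaviour). The function name count_loginfo and the default search string are illustrative choices, not part of any existing tool.

```python
import re

# Matches syslogd's repeat-compression lines, e.g. "last message repeated 79 times".
REPEAT = re.compile(r"last message repeated (\d+) times")


def count_loginfo(lines, needle="mptbase: ioc0: LogInfo(0x31123000)"):
    """Count occurrences of `needle`, including compressed repeats."""
    total = 0
    previous_matched = False  # was the last real message the one we look for?
    for line in lines:
        if needle in line:
            total += 1
            previous_matched = True
            continue
        repeat = REPEAT.search(line)
        if repeat:
            # Repeat lines can chain, so previous_matched is left unchanged.
            if previous_matched:
                total += int(repeat.group(1))
        else:
            previous_matched = False
    return total


if __name__ == "__main__":
    import sys
    # Usage: python count_loginfo.py < /var/log/syslog
    print(count_loginfo(sys.stdin))
```

Applied to the excerpt above (Apr 9, 13:43:26 to 13:45:26), this counts 174 occurrences in about two minutes.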
Comment 4 Roland Kletzing 2008-05-11 15:05:20 UTC
First off, I would not recommend combining different drives, or the same disks via different paths, into a RAID volume.
see http://www.howtofixcomputers.com/forums/scsi/sas-sata-arrays-one-controller-lsisas1068-17428-3.html

Anyhow, that doesn't explain why you also have issues with all 5 drives on SAS.

I would recommend asking here:

LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI)
P:      Eric Moore
M:      Eric.Moore@lsi.com
M:      support@lsi.com
L:      DL-MPTFusionLinux@lsi.com
L:      linux-scsi@vger.kernel.org
W:      http://www.lsilogic.com/support
S:      Supported
