Created attachment 50962 [details]
dmesg

I have an HP DL380 with an Adaptec 2120S that, many moons ago, never exhibited any problems. Since lenny and squeeze I have had frequent "Host adapter abort request" messages followed by a RAID firmware panic under heavy disk I/O. In 2.6.32 it seemed impossible to recover from these; it would take the RAID offline and disable the system. In 2.6.37.3 it seems better at recovering after a delay, but I am afraid something is wrong at a low level that may corrupt data:

[572820.064043] aacraid: Host adapter abort request (2,0,0,0)
[572820.149996] aacraid: Host adapter abort request (2,0,0,0)
[572820.235786] aacraid: Host adapter abort request (2,0,0,0)
[572820.321554] aacraid: Host adapter abort request (2,0,0,0)
[572820.407418] aacraid: Host adapter reset request. SCSI hang ?
[572820.496444] AAC: Host adapter BLINK LED 0x3
[572820.567966] AAC0: adapter kernel panic'd 3.

[595047.992045] aacraid: Host adapter abort request (2,0,0,0)
[595048.078463] aacraid: Host adapter abort request (2,0,0,0)
[595048.164959] aacraid: Host adapter abort request (2,0,0,0)
[595048.250844] aacraid: Host adapter abort request (2,0,0,0)
[595048.336629] aacraid: Host adapter abort request (2,0,0,0)
[595048.336637] aacraid: Host adapter abort request (2,0,0,0)
[595048.336644] aacraid: Host adapter abort request (2,0,0,0)
[595048.336650] aacraid: Host adapter abort request (2,0,0,0)
[595048.336657] aacraid: Host adapter abort request (2,0,0,0)
[595048.336663] aacraid: Host adapter abort request (2,0,0,0)
[595048.336773] aacraid: Host adapter reset request. SCSI hang ?
[595048.336782] AAC: Host adapter BLINK LED 0x3
[595048.336916] AAC0: adapter kernel panic'd 3.

[596524.064044] aacraid: Host adapter abort request (2,0,0,0)
[596524.146368] aacraid: Host adapter abort request (2,0,0,0)
[596524.228197] aacraid: Host adapter abort request (2,0,0,0)
[596524.309288] aacraid: Host adapter abort request (2,0,0,0)
[596524.389967] aacraid: Host adapter reset request. SCSI hang ?
[596524.473168] AAC: Host adapter BLINK LED 0x3
[596524.538332] AAC0: adapter kernel panic'd 3.
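For anyone comparing their own logs against the bursts above, here is a minimal sketch (not part of the original report) that groups aacraid messages from saved dmesg text into episodes and counts the abort requests in each. The function name `summarize_aacraid` and the 60-second gap threshold are assumptions chosen for illustration; the message strings are the ones quoted in this report.

```python
import re

# Matches kernel log lines such as
# "[572820.064043] aacraid: Host adapter abort request (2,0,0,0)"
# and the "AAC:" / "AAC0:" lines that follow a reset.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(?:aacraid|AAC\d*):\s+(.*)")


def summarize_aacraid(log_text, gap_seconds=60.0):
    """Group aacraid messages into episodes separated by quiet gaps.

    gap_seconds is an arbitrary threshold (an assumption, not something
    the driver defines): messages further apart than this start a new
    episode.
    """
    episodes = []
    last_ts = None
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an aacraid line
        ts = float(match.group(1))
        msg = match.group(2)
        if last_ts is None or ts - last_ts > gap_seconds:
            episodes.append(
                {"start": ts, "aborts": 0, "reset": False, "panic": False}
            )
        last_ts = ts
        episode = episodes[-1]
        if "abort request" in msg:
            episode["aborts"] += 1
        elif "reset request" in msg:
            episode["reset"] = True
        elif "kernel panic" in msg:
            episode["panic"] = True
    return episodes
```

Feeding it the text of a saved dmesg capture returns one dictionary per burst, which makes it easier to see whether every run of abort requests ends in a reset and a firmware panic or whether some recover on their own.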
Created attachment 50972 [details] lspci -vvv
Also, I have tried controller firmware 4.2-0[8205] as well as 8208, on two different controllers; both exhibit the abort/panic problem.
Others seem to have been having similar problems with the last few years' kernels, with no clear resolution. See the following links:

http://communities.vmware.com/thread/257273
http://lkml.org/lkml/2008/1/23/170
http://forum.proxmox.com/threads/3833-2.6.32-1-pve-aacraid-Host-adapter-reset-request.-SCSI-hang
http://lists.debian.org/debian-kernel/2009/05/msg00488.html
http://lists.debian.org/debian-user/2011/03/msg00553.html
Today 2.6.37.3 hung, these were its last words on the serial console:

[661903.992040] aacraid: Host adapter abort request (2,0,0,0)
[661904.064211] aacraid: Host adapter abort request (2,0,0,0)
[661904.136081] aacraid: Host adapter reset request. SCSI hang ?
I changed to an older card with firmware 8205 and haven't seen a hang in almost 3 weeks now. Perhaps a weird firmware problem? The changelogs don't indicate that anything significant changed in the newer revs.
Just to provide more info, the older card did not solve the problem and there are still sporadic hangs. I am trying to gather more information but it is difficult to get to.
I have found that there are at least two different versions of the ASR-2120S. There is an older version, which shipped with v5xxx and v6xxx firmware and has a RAID processor that looks like a PPGA Intel Celeron. There is a newer version, which is labeled RoHS on the product label, comes with v8208 firmware (the latest), and has a "shiny" BGA-looking RAID processor. This latter version has been working since I last reported, without a single crash or other problem.

I think there may be hardware defects in the older versions, perhaps due to age or due to problems that were solved in later versions. When one of the older versions of the card crashes Linux, the card remains upset in other ways (will not POST, will not bring drives online, etc.), so I am reluctant to blame this on Linux and am therefore closing this bug.

Anyone having problems with the ASR-2120S should seek the later RoHS hardware revision, which can be identified visually by the RoHS indication on the product sticker and the shiny-surfaced RAID processor.
Created attachment 89571 [details] Server a22 kernel panic Server a22 kernel panic
Created attachment 89581 [details] Server a3 kernel panic Server a3 kernel panic
Created attachment 89591 [details] Server a20 kernel panic Server a3 kernel panic
Hi, our company has the same problems with aacraid. We are using Debian 6.0.6 x86_64 with the proxmox-ve kernel:

Linux 2.6.32-17-pve #1 SMP Wed Nov 28 07:15:55 CET 2012 x86_64 GNU/Linux

modinfo aacraid:

filename: /lib/modules/2.6.32-17-pve/kernel/drivers/scsi/aacraid/aacraid.ko
version: 1.1-7[28000]-ms
license: GPL
description: Dell PERC2, 2/Si, 3/Si, 3/Di, Adaptec Advanced Raid Products, HP NetRAID-4M, IBM ServeRAID & ICP SCSI driver
author: Red Hat Inc and Adaptec
srcversion: DAE1E62971BE24D8A2CE61F
vermagic: 2.6.32-17-pve SMP mod_unload modversions

Also, I will attach lspci info.
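When several servers are affected, it can help to compare the driver build on each one. The following is a rough sketch, not something from the reporters, that turns saved `modinfo` output into a dictionary so that fields such as `version` and `vermagic` can be compared across hosts. The helper name `parse_modinfo` is hypothetical, and the sketch assumes the plain `key: value` layout shown in the comment above.

```python
def parse_modinfo(text):
    """Parse `key: value` lines, as printed by modinfo, into a dict.

    Only the first occurrence of a repeated key is kept, which is
    enough for single-valued fields such as version and vermagic.
    """
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons
        # or slashes (for example the filename path) stay intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key:
            info.setdefault(key, value.strip())
    return info
```

Running it on the output above would give `1.1-7[28000]-ms` for the `version` key, which can then be checked against the other affected machines.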
Hi everyone, I have an Adaptec 6405E hardware card which has been driving me crazy with these errors:

[661903.992040] aacraid: Host adapter abort request (2,0,0,0)
[661904.064211] aacraid: Host adapter abort request (2,0,0,0)
[661904.136081] aacraid: Host adapter reset request. SCSI hang ?

This issue appears when you use a new kernel. I have found the solution: go to http://www.adaptec.com/fr-fr/ and upgrade the firmware of your hardware card. I have a 6405E, and with every OS I got these errors (abort request, SCSI hang). After the upgrade (yesterday) from 18668 to 19076 and 19109, it works very well now. So, upgrade the firmware of your card and it could fix this error. I hope this helps.