Since the Areca driver was upgraded to v1.30.00.04 in kernel 3.18, ARC-1110 and ARC-1120 controllers are no longer properly initialized on some systems. All kernels from 3.18 onwards are affected.

Loading the module generates the following messages in the log:

---------------------
[  634.409073] Areca RAID Controller5: Model ARC-1120, F/W V1.49 2010-12-02
[  634.410504] scsi host5: Areca SATA RAID Controller (RAID6 capable)
 arcmsr version v1.30.00.04-20140919
[  634.410832] arcmsr 0000:07:0e.0: irq 19 for MSI/MSI-X
[  634.410893] arcmsr5: msi enabled
[  655.017019] arcmsr5: abort device command of scsi id = 0 lun = 0
[  655.017032] arcmsr5: scsi id = 0 lun = 0 ccb = '0xeeb00000' poll command abort successfully
[  676.009017] arcmsr5: abort device command of scsi id = 1 lun = 0
[  676.009030] arcmsr5: scsi id = 1 lun = 0 ccb = '0xeeb00660' poll command abort successfully
[---- (the above two lines repeat for SCSI IDs 2-15) ---]
[  970.025230] scsi 5:0:16:0: Processor         Areca    RAID controller  R001 PQ: 0 ANSI: 0 CCS
---------------------

No logical or passthrough drives are detected, regardless of controller configuration.

The problem only affects certain hardware. The above was taken from a ProLiant ML370 G3 (Intel x86). The issue was reproduced on x86_64 using an AMD-based Fujitsu PC. Both systems work as expected with kernel 3.17.8, and the exact same controllers work fine under any kernel when used in other systems, for instance a ProLiant ML350 G5 (x86 or x86_64).
Created attachment 260677: .config for 4.14
Created attachment 260679: Full dmesg log (kernel 4.14)
Created attachment 260681: Output from lspci -vvv
Building the latest Areca driver (1.40.00.02 from http://www.areca.us/support/s_linux/driver/Source%20Code/arcmsr-1.40.00.02-source-only.dkms.tar.gz) against kernel 3.17.8 introduces the bug, so it's definitely the driver rather than some other issue with the kernel. The latest driver that works on the affected systems seems to be 1.20.0X.15-130619 (ftp://ftp.areca.com.tw/RaidCards/AP_Drivers/Linux/DRIVER/SourceCode/arcmsr.1.20.0X.15-130619.zip), which unfortunately doesn't compile against recent kernels.
Interrupts are handled differently by the more recent driver. From a working system running kernel 3.17.8:

--- /proc/interrupts ---
           CPU0       CPU1       CPU2       CPU3
  0:        133          0          0          0   IO-APIC-edge      timer
  1:          1         11          0          0   IO-APIC-edge      i8042
  6:          0          3          0          0   IO-APIC-edge      floppy
  7:          0          0          0          0   IO-APIC-edge      parport0
  8:          0          1          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 11:          0          0          0          0   IO-APIC-fasteoi   ohci_hcd:usb1
 12:          0        165          0          0   IO-APIC-edge      i8042
 14:          1        254          0          0   IO-APIC-edge      pata_serverworks
 15:          0          0          0          0   IO-APIC-edge      pata_serverworks
 16:          0       5658          0          0   IO-APIC  10-fasteoi   sata_sil
 17:          0          0          0          0   IO-APIC   1-fasteoi   hpilo
 18:          0        247          0          0   IO-APIC  13-fasteoi   eth0
 19:          0         35          0          0   IO-APIC   8-fasteoi   arcmsr
NMI:          0          0          0          0   Non-maskable interrupts
LOC:       8967       8566      11465       8375   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          0          1          0          0   IRQ work interrupts
RTR:          1          0          0          0   APIC ICR read retries
RES:       3182       1918       6329       2498   Rescheduling interrupts
CAL:        992         16         11       1265   Function call interrupts
TLB:        134        139        181        189   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:          1          1          1          1   Machine check polls
THR:          0          0          0          0   Hypervisor callback interrupts
ERR:          0
MIS:          0
------------------------

Unloading arcmsr v1.20.00.15 and loading v1.40.0X.02 instead results in this change:

--- /proc/interrupts ---
 20:          0          0          0          0   PCI-MSI-edge      arcmsr
------------------------

Here's the output from lspci. On a working system:

--- lspci -vvv with driver v1.20.00.15 ---
07:0e.0 RAID bus controller: Areca Technology Corp. ARC-1120 8-Port PCI-X to SATA RAID Controller
        Subsystem: Areca Technology Corp. ARC-1120 8-Port PCI-X to SATA RAID Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping+ SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 64 (32000ns min), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 19
        NUMA node: 0
        Region 0: Memory at f7ff0000 (32-bit, non-prefetchable) [size=4K]
        Region 2: Memory at f7800000 (32-bit, prefetchable) [size=4M]
        [virtual] Expansion ROM at f7f00000 [disabled] [size=64K]
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d0] MSI: Enable- Count=1/2 Maskable- 64bit+
                Address: 00000000fee0f00c  Data: 41a1
        Capabilities: [e0] PCI-X non-bridge device
                Command: DPERE+ ERO- RBC=1024 OST=8
                Status: Dev=07:0e.0 64bit+ 133MHz+ SCD- USC- DC=bridge DMMRBC=1024 DMOST=4 DMCRS=32 RSCEM- 266MHz- 533MHz-
        Kernel driver in use: arcmsr
        Kernel modules: arcmsr
------------------------------------------

On the same system after loading the most recent arcmsr driver:

--- lspci -vvv with driver v1.40.0X.02 ---
07:0e.0 RAID bus controller: Areca Technology Corp. ARC-1120 8-Port PCI-X to SATA RAID Controller
        Subsystem: Areca Technology Corp. ARC-1120 8-Port PCI-X to SATA RAID Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping+ SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 64 (32000ns min), Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 20
        NUMA node: 0
        Region 0: Memory at f7ff0000 (32-bit, non-prefetchable) [size=4K]
        Region 2: Memory at f7800000 (32-bit, prefetchable) [size=4M]
        [virtual] Expansion ROM at f7f00000 [disabled] [size=64K]
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/2 Maskable- 64bit+
                Address: 00000000fee0f00c  Data: 41a1
        Capabilities: [e0] PCI-X non-bridge device
                Command: DPERE+ ERO- RBC=1024 OST=8
                Status: Dev=07:0e.0 64bit+ 133MHz+ SCD- USC- DC=bridge DMMRBC=1024 DMOST=4 DMCRS=32 RSCEM- 266MHz- 533MHz-
        Kernel driver in use: arcmsr
        Kernel modules: arcmsr
------------------------------------------
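For anyone wanting to check their own system for the same symptom, here is a minimal Python sketch that picks the arcmsr line out of /proc/interrupts text and flags the pattern seen above (an MSI-type interrupt whose counters are all zero). The function names and the "stalled" heuristic are my own assumptions, not anything from the driver, and the parser assumes the IRQ is not shared with another device.

```python
def arcmsr_irq_lines(interrupts_text):
    """Return a list of (irq, per_cpu_counts, irq_type) for arcmsr entries."""
    results = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Assumption: arcmsr is the only handler on the line (IRQ not shared).
        if len(fields) < 3 or fields[-1] != "arcmsr":
            continue
        irq = fields[0].rstrip(":")
        counts = []
        idx = 1
        # The per-CPU counters are the purely numeric fields after the IRQ number.
        while idx < len(fields) - 1 and fields[idx].isdigit():
            counts.append(int(fields[idx]))
            idx += 1
        # Whatever sits between the counters and the handler name is the type.
        irq_type = " ".join(fields[idx:-1])
        results.append((irq, counts, irq_type))
    return results


def looks_stalled(entry):
    """Heuristic (assumption): MSI-type interrupt that has never fired."""
    _irq, counts, irq_type = entry
    return "MSI" in irq_type and sum(counts) == 0


# The two arcmsr lines quoted in this report.
WORKING = " 19:          0         35          0          0   IO-APIC   8-fasteoi   arcmsr"
BROKEN = " 20:          0          0          0          0   PCI-MSI-edge      arcmsr"

for sample in (WORKING, BROKEN):
    for entry in arcmsr_irq_lines(sample):
        print(entry, "stalled" if looks_stalled(entry) else "ok")
```

On a live system the input would be the contents of /proc/interrupts read shortly after the module is loaded. A zero count on its own is not proof of a problem on an idle controller, so treat the result as a hint only.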
This bug was caused by the arcmsr driver attempting to use MSIs on non-MSI systems. This behavior may have been fixed (see https://patchwork.kernel.org/patch/10073751/), but regardless, in recent kernels MSIs can be manually disabled with the arcmsr module parameter "msi_enable=0".
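To confirm the workaround took effect after reloading the module with the parameter (for example "modprobe arcmsr msi_enable=0"), the MSI capability line in the lspci output should read "Enable-" again, as in the working case earlier in this report. Below is a small Python sketch of that check. The function name is made up for illustration, and it only looks at the first MSI capability line in whatever text it is given.

```python
import re


def msi_enabled(lspci_text):
    """Return True/False from the first 'MSI: Enable+/-' line, None if absent."""
    # Matches the plain MSI capability only; 'MSI-X: Enable' does not match
    # because the colon must directly follow 'MSI'.
    match = re.search(r"\bMSI: Enable([+-])", lspci_text)
    if match is None:
        return None
    return match.group(1) == "+"


# Capability lines as quoted in this report.
OLD_DRIVER = "Capabilities: [d0] MSI: Enable- Count=1/2 Maskable- 64bit+"
NEW_DRIVER = "Capabilities: [d0] MSI: Enable+ Count=1/2 Maskable- 64bit+"

print("old driver:", msi_enabled(OLD_DRIVER))
print("new driver:", msi_enabled(NEW_DRIVER))
```

In practice the text would come from "lspci -vvv -s 07:0e.0" (the device address on the affected machine here; yours will differ), run as root so the capability list is readable.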