On 2.6.30.5 the MPT SAS controller worked fine, but on 2.6.31 it fails under heavy operations and starts spitting errors to dmesg (they vary). The filesystems also stop responding, and I am unable to reboot the box properly (only via sysrq or a hard reset).

Hardware: x86, Sun Fire X4100, 8 GB RAM, PAE kernel enabled, module loaded with default options.

I upgraded the BIOS and the LSI controller BIOS to the latest versions; it didn't fix the bug.

I cannot do a bisection, because this is a loaded server and a semi-embedded system. But I can test patches or revert specific commits, if you point me to the exact commit.

http://www.nuclearcat.com/files/dmesg.ok from 2.6.30.5 kernel
http://www.nuclearcat.com/files/dmesg.fail from 2.6.31.1 kernel
http://www.nuclearcat.com/files/config.gz config from 2.6.31.1 kernel

Let me know if you need any additional information.
Reassigned to scsi, cc'ed Eric.
If I just copy the fusion directory from the previous kernel, it works. The change that triggers this is most probably the following (just a diff between the kernels):

+static void
+mpt_add_sge_64bit(void *pAddr, u32 flagslength, dma_addr_t dma_addr)
+{
+	SGESimple64_t *pSge = (SGESimple64_t *) pAddr;
+	pSge->Address.Low = cpu_to_le32
+			(lower_32_bits((unsigned long)(dma_addr)));
+	pSge->Address.High = cpu_to_le32
+			(upper_32_bits((unsigned long)dma_addr));
+	pSge->FlagsLength = cpu_to_le32
+			((flagslength | MPT_SGE_FLAGS_64_BIT_ADDRESSING));
+}

-	} else {
-		SGESimple32_t *pSge = (SGESimple32_t *) pAddr;
-		pSge->FlagsLength = cpu_to_le32(flagslength);
-		pSge->Address = cpu_to_le32(dma_addr);

+/**
+ * mpt_add_sge_64bit_1078 - Place a simple 64 bit SGE at address pAddr (1078 workaround).
+ * @pAddr: virtual address for SGE
+ * @flagslength: SGE flags and data transfer length
+ * @dma_addr: Physical address
+ *
+ * This routine places a MPT request frame back on the MPT adapter's
+ * FreeQ.
+ **/
+static void
+mpt_add_sge_64bit_1078(void *pAddr, u32 flagslength, dma_addr_t dma_addr)
+{
+	SGESimple64_t *pSge = (SGESimple64_t *) pAddr;
+	u32 tmp;
+
+	pSge->Address.Low = cpu_to_le32
+			(lower_32_bits((unsigned long)(dma_addr)));
+	tmp = (u32)(upper_32_bits((unsigned long)dma_addr));
+

The following patch, which is upstream but not in the latest stable kernel, seems to fix my issue. Should it be pushed to the stable kernels?

commit c55b89fba9872ebcd5ac15cdfdad29ffb89329f0

    [SCSI] mptsas : PAE Kernel more than 4 GB kernel panic

    This patch is solving problem for PAE kernel DMA operation.
    On PAE system dma_addr and unsigned long will have different
    values. Now dma_addr is not type casted using unsigned long.
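
To illustrate why the (unsigned long) cast matters on a PAE kernel, here is a minimal userspace sketch, not the driver code. It assumes a 32-bit PAE build where dma_addr_t is 64 bits wide and unsigned long is 32 bits. The types pae_dma_addr_t and pae_ulong_t and the helpers hi32(), lo32(), sge_high_with_cast(), sge_high_without_cast() and sge_low() are hypothetical names made up for this example.

```c
#include <stdint.h>

/* Assumption: model a 32-bit PAE kernel, where dma_addr_t is 64 bits wide
 * but unsigned long is only 32 bits. Fixed-width stand-ins are used so the
 * sketch behaves the same on any host. */
typedef uint64_t pae_dma_addr_t;  /* hypothetical stand-in for dma_addr_t */
typedef uint32_t pae_ulong_t;     /* hypothetical stand-in for unsigned long */

/* Same idea as the kernel's upper_32_bits()/lower_32_bits() helpers. */
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }

/* Cast as in the quoted diff: the address is narrowed to 32 bits first,
 * so the high word placed in the SGE is always zero. */
static uint32_t sge_high_with_cast(pae_dma_addr_t dma_addr)
{
    return hi32((pae_ulong_t)dma_addr);
}

/* Without the cast, the bits above 4 GB survive. */
static uint32_t sge_high_without_cast(pae_dma_addr_t dma_addr)
{
    return hi32(dma_addr);
}

/* The low word is the same either way. */
static uint32_t sge_low(pae_dma_addr_t dma_addr)
{
    return lo32(dma_addr);
}
```

For a buffer at bus address 0x123456000 (just above 4 GB), sge_high_with_cast() yields 0 while sge_high_without_cast() yields 1, so with the cast the controller would be handed address 0x23456000 instead. That would be consistent with the failures only showing up under load on a box with 8 GB RAM, though I have not verified that this is the exact failure path.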