Created attachment 23239 [details] Kernel config file

Hi,

I have a HighPoint RocketRaid 3120 controller installed in my system with a RAID 0 array, and 16GB of RAM in the computer. Kernels built with High Memory Support = 4G work. But when I build a custom kernel with High Memory Support = 64G enabled, the system crashes during boot with "Kernel panic". I've attached the last screens of the boot process and the kernel config file.

Here is the output of lspci:
===
00:00.0 Host bridge: ATI Technologies Inc RD780 Northbridge only dual slot PCI-e_GFX and HT1 K8 part
00:02.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (external gfx0 port A)
00:03.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (external gfx0 port B)
00:0a.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (PCI express gpp port F)
00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA
00:13.0 USB Controller: ATI Technologies Inc SB600 USB (OHCI0)
00:13.1 USB Controller: ATI Technologies Inc SB600 USB (OHCI1)
00:13.2 USB Controller: ATI Technologies Inc SB600 USB (OHCI2)
00:13.3 USB Controller: ATI Technologies Inc SB600 USB (OHCI3)
00:13.4 USB Controller: ATI Technologies Inc SB600 USB (OHCI4)
00:13.5 USB Controller: ATI Technologies Inc SB600 USB Controller (EHCI)
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 14)
00:14.1 IDE interface: ATI Technologies Inc SB600 IDE
00:14.3 ISA bridge: ATI Technologies Inc SB600 PCI to LPC Bridge
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Link Control
01:00.0 VGA compatible controller: nVidia Corporation G72 [GeForce 7300 SE/7200 GS] (rev a1)
02:00.0 RAID bus controller: HighPoint Technologies, Inc. Device 3120 (rev 02)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
===

Thanks a lot for helping
Patrick
Created attachment 23240 [details] Image 1 from the boot process
Created attachment 23241 [details] Image 2 from the boot process
Created attachment 23242 [details] Image 3 from the boot process
Created attachment 23243 [details] Image 4 (last) from the boot process
Created attachment 23244 [details] sb600-32bit-only.patch Does this patch fix the problem? Also, can you please post the output of dmidecode?
Created attachment 23245 [details] Output of dmidecode from the 2.6.30 kernel with 4G HighMemorySupport
Created attachment 23247 [details] sb600-32bit-only-by-default.patch Please test this one instead.
Created attachment 23248 [details] Last screen from patched kernel boot
Oh, this one is not about the sb600 controller. My bad. Can you please post a successful kernel boot log from the 4G kernel? Also, please post the output of "lspci -nn". Thanks.
Sorry, that patch did not work here. I'm on vacation for a week now, so I can only restart the server when I'm back next Saturday. I'll keep reading this during the vacation and can provide output from the running system, but I can't reboot it until I'm back. For now, many thanks for the fast response. Greets Patrick
Created attachment 23250 [details] Kernel-logs with dmesg
Created attachment 23251 [details] Output of "lspci -nn"
Created attachment 23254 [details] hptiop-32bit.patch Please test this patch.
This patch works for me. Thank you very much!
The question now is whether it's the motherboard or the controller. Any chance you can try a sil3132 or 3124 controller in the same slot?
Sorry, but I've no such controller. The HighPoint is the only RAID controller I have.
Eh... the problem is that I can't tell which part to blacklist. Can you please attach the output of "dmidecode", "lspci -nnvvv" and "lspci -tnnv"? Also, can you be persuaded into buying a sil 3132 controller and trying it in the same slot? It'll cost between 20 and 30 USD and I can pay you via paypal if you wish. Shane, is there any known problem with 64bit DMA on these configurations? Could we be looking at a bridge / host controller problem? Thanks.
Tejun, "dmidecode" was already provided in comment #6 by maierp. Except for the SB600 SATA 64-bit DMA issue we discussed before: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2fcad9d27168b287e3db61f6694254e0afa32f8c I do NOT know of any other 64-bit DMA problem on these configurations, especially with the HighPoint RAID controller. I cannot find such a RAID controller here, nor the GA-MA790X-DS4 board. So I'm afraid I cannot help much on this issue... Shane
Shane, thanks for the input. One question tho. Are you sure that SB600 ahci DMA problem is caused by the ahci controller not by the pci host or bridge controller? Thanks.
Tejun, no, I'm not sure. As you know, although it is related to different BIOS releases on the ASUS M2A-VM, we have NOT found the root cause of the SB600 SATA 64-bit DMA issue. Our HW engineer told me that we didn't see any SB600 SATA 64-bit DMA design issue. As for a potential bridge/host controller problem, after checking with others, we have NOT heard of such an issue either. So trying a different RAID controller on the same platform should help more.
Thanks for the comment, Shane. maierp, can you please try another 64bit capable controller at that slot?
Tejun, is this a controller with the right chipset? http://www.planet4one.de/planet/wbcdirect.php?pid=74799 (DeLock SATA II PCI Express Card, 2 Port (70137)) I think this has a SIL 3121 chipset.
Yeap, that's a sil3132.
So, over the last few days I had time to check it again. I booted from a single hdd attached to the DeLock card with the sil3132 chip, using 64GB high memory support and with 16GB RAM installed. It worked. I then created a 10GB random file and copied it 3 times to the same hdd. The files all have the same MD5 checksum. So I think this works. With the HighPoint RocketRaid 3120, the system failed to boot. BUT when replacing the 4x4GB RAM with 4x2GB = 8GB RAM, the RocketRaid 3120 also boots without problems with the same kernel. The RAM is working; I ran a MemTest. Greets Patrick
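The copy-and-checksum test described above could be scripted roughly as follows. This is only a sketch of the general procedure, not the commands Patrick actually ran; the function names, file names and the small default size are assumptions made for illustration. It also assumes that reading the files back really exercises the controller: for files much smaller than RAM, reads may be served from the page cache instead of the disk.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(directory, size_bytes, copies=3):
    """Write a random file, copy it `copies` times, compare all checksums.

    Returns True if every copy has the same MD5 as the source file.
    File names are hypothetical and chosen only for this sketch.
    """
    source = os.path.join(directory, "random.bin")
    with open(source, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(remaining, 1 << 20)
            f.write(os.urandom(n))
            remaining -= n

    expected = md5_of(source)
    matches = []
    for i in range(copies):
        target = os.path.join(directory, "copy%d.bin" % i)
        shutil.copyfile(source, target)
        matches.append(md5_of(target) == expected)
    return all(matches)


if __name__ == "__main__":
    # 8 MiB keeps the demo quick; the report used a 10GB file.
    with tempfile.TemporaryDirectory() as workdir:
        print("all copies match:", copy_and_verify(workdir, 8 << 20))
```

To test a specific disk, `directory` would point at a mount on that disk and `size_bytes` would be raised to something comparable to the 10GB used in the report.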
maierp, can rocketraid copy large files without error on 8GB configuration too?
Yes, there were no errors. All copied files have the same md5sum. This check was done with 2x4GB = 8GB and 64GB high memory support and 2.6.32
Thanks for verifying. Pinging hpt again.
(In reply to comment #27)
> Thanks for verifying. Pinging hpt again.

Dear Tejun Heo, please visit the HighPoint website, www.highpoint-tech.com, and download the firmware package (v1.2.25.8) for the RocketRAID 3120 controller from the Support section of the website. Let us know the results after you have finished testing. Thank you, HighPoint
maierp, can you please try the newer firmware? Also which firmware version are you currently on? Thanks.
(In reply to comment #29)
> Also which firmware version are you currently on?

It already has this "new" firmware v1.2.25.8
Firmware v1.2.25.8 fixed the 64-bit DMA issue. However, this firmware cannot support more than 12GB of memory when 64-bit DMA is enabled.
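The arithmetic behind this limit can be checked against the configurations tested earlier in the thread. The sketch below is a simplified model, not driver code: it assumes RAM is mapped contiguously from physical address 0 (real machines have holes and remapping), and the names `FIRMWARE_LIMIT`, `dma_mask` and `config_is_safe` are made up for illustration.

```python
GIB = 1 << 30

# Limit stated by HighPoint for firmware v1.2.25.8 with 64-bit DMA enabled.
FIRMWARE_LIMIT = 12 * GIB


def dma_mask(bits):
    """Highest address reachable with a DMA mask of the given width."""
    return (1 << bits) - 1


def config_is_safe(ram_bytes, mask_bits):
    """True if the controller is never handed an address at or above 12 GiB.

    Simplifying assumption: RAM occupies physical addresses 0..ram_bytes-1.
    The DMA mask caps the highest address the driver may give the device.
    """
    highest_ram_address = ram_bytes - 1
    highest_dma_address = min(highest_ram_address, dma_mask(mask_bits))
    return highest_dma_address < FIRMWARE_LIMIT


if __name__ == "__main__":
    for ram_gib, bits in [(8, 64), (16, 64), (16, 32)]:
        print(ram_gib, "GiB RAM,", bits, "bit mask ->",
              "ok" if config_is_safe(ram_gib * GIB, bits) else "at risk")
```

Under these assumptions the model agrees with the reports above: 8GB with 64-bit DMA stays below the limit, 16GB with 64-bit DMA does not, and a 32-bit mask keeps every DMA address below 4 GiB regardless of installed RAM.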
Created attachment 24282 [details] hptiop-no-64bit-dma.patch Then the driver shouldn't mark the device as 64bit capable, because it will break on larger machines, which will become more and more common. I guess something like this patch is in order? Thanks.
Created attachment 24303 [details] Fix rr312x 64bit dma error
Comment on attachment 24303 [details] Fix rr312x 64bit dma error Only the RR312x has the 64-bit DMA issue. Please use this patch.
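The difference between the two proposals (no 64-bit DMA for any hptiop device versus restricting only the RR312x) can be modelled as two mask-selection policies. This is an illustrative sketch only and does not reproduce either attached patch; the function names and the model string "4320" are hypothetical, while "3120" is the device reported by lspci in this bug.

```python
def mask_bits_blanket(model):
    """Policy A: never advertise 64-bit DMA, whatever the model."""
    return 32


def mask_bits_targeted(model):
    """Policy B: restrict only models in the RR312x family.

    Assumption for this sketch: family membership is decided by the
    model string starting with "312".
    """
    return 32 if model.startswith("312") else 64


if __name__ == "__main__":
    for model in ("3120", "4320"):  # "4320" is a hypothetical other model
        print(model,
              "blanket:", mask_bits_blanket(model),
              "targeted:", mask_bits_targeted(model))
```

The targeted policy keeps 64-bit DMA available for controllers that are not known to be affected, at the cost of maintaining a list of affected models.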
Looks good to me but you're the maintainer of the driver. Can you guys please push the patch upstream and to -stable? Thanks.
A patch referencing this bug report has been merged in Linux v3.6-rc1:

commit 23f0bb47a4ec4c662b2bbf0221d6289e91b06ece
Author: HighPoint Linux Team <linux@highpoint-tech.com>
Date:   Thu Jun 14 08:47:07 2012 +0100

    [SCSI] hptiop: fix RR312x in hosts with >12GB