Bug 9859 - hp smart array E200i only detected with acpi enabled
Summary: hp smart array E200i only detected with acpi enabled
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer
Hardware: All Linux
Importance: P1 high
Assignee: Mike Miller
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-31 07:01 UTC by Ross O. Fomerand
Modified: 2009-03-24 07:41 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.24
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
diffs between 2.6.23.12 and 2.6.24 (10.97 KB, patch)
2008-02-07 11:06 UTC, Mike Miller
working 2.6.23.12 config (22.98 KB, application/octet-stream)
2008-02-07 11:17 UTC, Ross O. Fomerand
non-working 2.6.24 config (25.72 KB, application/octet-stream)
2008-02-07 11:18 UTC, Ross O. Fomerand

Description Ross O. Fomerand 2008-01-31 07:01:56 UTC
Latest working kernel version: 2.6.23.12
Earliest failing kernel version: 2.6.24
Distribution: gentoo
Hardware Environment: HP Proliant BL 465c 
Software Environment: x86_64
Problem Description: I ran make oldconfig from the 2.6.23.12 config file and made sure Compaq Smart Array 5xxx support was built into the kernel, but I end up getting the good old VFS: Cannot open root device "cciss/c0d0p1" or unknown-block(0,0).

Steps to reproduce: install 2.6.24 :-)
Comment 1 Mike Miller 2008-01-31 12:10:12 UTC
Please send your .config file to mikem@roadking.cca.cpqcorp.net.
Comment 2 Ross O. Fomerand 2008-02-04 21:13:41 UTC
Sent. Any thoughts?
Comment 3 Mike Miller 2008-02-05 09:14:19 UTC
I didn't get the mail. Please post to the BZ.
Comment 4 Mike Miller 2008-02-07 11:00:08 UTC
Ross, I can't recreate this failure. I've tried building into the kernel and as a loadable module. Any updates on your end?
Comment 5 Mike Miller 2008-02-07 11:06:25 UTC
Created attachment 14737
diffs between 2.6.23.12 and 2.6.24

This patch shows there are only a few _minor_ changes.
1. copyright changes
2. bio_endio changes by Jens
3. remove test for disk in deregister_disk, it was not needed
4. added error handling for sg_io stuff
5. turn off DMA refetch on P600
Comment 6 Ross O. Fomerand 2008-02-07 11:09:21 UTC
Hey Mike, I've been sick for a couple of days. Do you still need the two different
config files? I can give you more debug info if you want, but the problem is
most certainly with the 2.6.24 Compaq Smart Array 5xxx driver (which is built into
the kernel; we do not use modules in our kernels) and the HP P200i storage
controller. Let me know if you still need the two different config files.
Comment 7 Mike Miller 2008-02-07 11:11:39 UTC
Yes, please send me your config files.
Comment 8 Ross O. Fomerand 2008-02-07 11:17:42 UTC
Created attachment 14738
working 2.6.23.12 config
Comment 9 Ross O. Fomerand 2008-02-07 11:18:20 UTC
Created attachment 14739
non-working 2.6.24 config
Comment 10 Ross O. Fomerand 2008-02-07 11:28:17 UTC
(In reply to comment #6)
> Hey Mike, I've been sick for a couple of days. Do you still need the two
> different config files? I can give you more debug info if you want, but the
> problem is most certainly with the 2.6.24 Compaq Smart Array 5xxx driver
> (which is built into the kernel; we do not use modules in our kernels) and
> the HP P200i storage controller. Let me know if you still need the two
> different config files.
> 

Correction: s/P200i/E200i. Sorry for any confusion this may have caused.
Comment 11 Ross O. Fomerand 2008-02-11 07:59:44 UTC
Any status update on this issue, Mike?
Comment 12 Mike Miller 2008-02-12 08:35:56 UTC
Ross, I cannot recreate this issue in my lab but I do not have any blades. I've tested on 2 other systems with the e200 standup card with no problems. I'll ask one of the 3rd level support engineers to try this on a bl465c.
Comment 13 Fabio Coatti 2008-02-13 10:19:07 UTC
I've captured the boot sequence through the serial console, in case it helps:

Linux version 2.6.24.2 (root@xxx211) (gcc version 4.1.1 (Gentoo 4.1.1)) #2 SMP Wed Feb 13 17:28:25 CET 2008
Command line: root=/dev/cciss/c0d0p1 ro noinitrd console=uart8250,io,0x3f8,9600n8
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cfe4efc0 (usable)
 BIOS-e820: 00000000cfe4efc0 - 00000000cfe56fc0 (ACPI data)
 BIOS-e820: 00000000cfe56fc0 - 00000000cfe57fc0 (usable)
 BIOS-e820: 00000000cfe57fc0 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fed00000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffc00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 000000012ffff000 (usable)
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
end_pfn_map = 1245183
DMI 2.4 present.
No NUMA configuration found
Faking a node at 0000000000000000-000000012ffff000
Bootmem setup node 0 0000000000000000-000000012ffff000
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1245183
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
    0:        0 ->      159
    0:      256 ->   851534
    0:  1048576 ->  1245183
Intel MultiProcessor Specification v1.4
MPTABLE: OEM ID: HP
MPTABLE: Product ID: PROLIANT
MPTABLE: APIC at: 0xFEE00000
Processor #0 (Bootup-CPU)
Processor #1
Processor #2
Processor #3
I/O APIC #8 at 0xFEC00000.
I/O APIC #9 at 0xFEC01000.
Setting APIC routing to flat
Processors: 4
Allocating PCI resources starting at d4000000 (gap: d0000000:2ec00000)
PERCPU: Allocating 21344 bytes of per cpu data
Built 1 zonelists in Node order, mobility grouping on.  Total pages: 1029650
Policy zone: Normal
Kernel command line: root=/dev/cciss/c0d0p1 ro noinitrd console=uart8250,io,0x3f8,9600n8
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
TSC calibrated against PIT
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 2400.198 MHz processor.
Console: colour VGA+ 80x25
console handover: boot [uart0] -> real [ttyS0]
Checking aperture...
CPU 0: aperture @ fb76000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
Memory: 4052516k/4980732k available (3102k kernel code, 139660k reserved, 1628k data, 312k init)
Calibrating delay using timer specific routine.. 4803.22 BogoMIPS (lpj=9606443)
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
Freeing SMP alternatives: 26k freed
ExtINT not setup in hardware but reported by MP table
Using local APIC timer interrupts.
Detected 12.501 MHz APIC timer.
Booting processor 1/4 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4800.47 BogoMIPS (lpj=9600947)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
Dual-Core AMD Opteron(tm) Processor 2216 HE stepping 02
Booting processor 2/4 APIC 0x2
Initializing CPU#2
Calibrating delay using timer specific routine.. 4800.50 BogoMIPS (lpj=9601011)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 2/2 -> Node 0
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 0
Dual-Core AMD Opteron(tm) Processor 2216 HE stepping 02
Booting processor 3/4 APIC 0x3
Initializing CPU#3
Calibrating delay using timer specific routine.. 4800.45 BogoMIPS (lpj=9600918)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 3/3 -> Node 0
CPU: Physical Processor ID: 1
CPU: Processor Core ID: 1
Dual-Core AMD Opteron(tm) Processor 2216 HE stepping 02
Brought up 4 CPUs
net_namespace: 120 bytes
NET: Registered protocol family 16
PCI: Using configuration type 1
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Probing PCI hardware
PCI: HP ProLiant BL465c G1 detected, enabling pci=bfsort.
PCI->APIC IRQ transform: 0000:00:03.0[A] -> IRQ 20
PCI->APIC IRQ transform: 0000:00:04.0[A] -> IRQ 21
PCI->APIC IRQ transform: 0000:00:04.2[B] -> IRQ 21
PCI->APIC IRQ transform: 0000:00:04.4[B] -> IRQ 21
PCI->APIC IRQ transform: 0000:00:04.6[A] -> IRQ 21
PCI->APIC IRQ transform: 0000:00:07.0[A] -> IRQ 5
PCI->APIC IRQ transform: 0000:00:07.1[A] -> IRQ 5
PCI->APIC IRQ transform: 0000:00:07.2[A] -> IRQ 5
PCI->APIC IRQ transform: 0000:02:03.0[A] -> IRQ 22
PCI->APIC IRQ transform: 0000:02:04.0[A] -> IRQ 23
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 4000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:01:0d.0
  IO window: disabled.
  MEM window: f8000000-fbffffff
  PREFETCH window: d4000000-d40fffff
PCI: Bridge: 0000:00:05.0
  IO window: disabled.
  MEM window: f8000000-fbffffff
  PREFETCH window: d4000000-d40fffff
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
fuse init (API version 7.9)
SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
pci 0000:00:04.4: HCRESET not completed yet!
PCI: Found enabled HT MSI Mapping on 0000:00:05.0
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.102
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: module loaded
HP CISS Driver (v 3.6.14)
Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.6.9 (December 8, 2007)
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 100MHz found at mem fa000000, IRQ 22, node addr 00:1a:4b:de:c6:4a
eth1: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 100MHz found at mem f8000000, IRQ 23, node addr 00:1a:4b:de:77:f8
ehci_hcd 0000:00:07.2: EHCI Host Controller
ehci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:07.2: irq 5, io mem 0xf7ec0000
ehci_hcd 0000:00:07.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
ohci_hcd 0000:00:07.0: OHCI Host Controller
ohci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:07.0: irq 5, io mem 0xf7ee0000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ohci_hcd 0000:00:07.1: OHCI Host Controller
ohci_hcd 0000:00:07.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:07.1: irq 5, io mem 0xf7ed0000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
USB Universal Host Controller Interface driver v3.0
uhci_hcd 0000:00:04.4: UHCI Host Controller
uhci_hcd 0000:00:04.4: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:04.4: port count misdetected? forcing to 2 ports
uhci_hcd 0000:00:04.4: HCRESET not completed yet!
uhci_hcd 0000:00:04.4: irq 21, io base 0x00001800
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
usb 4-1: new full speed USB device using uhci_hcd and address 2
usb 4-1: configuration #1 chosen from 1 choice
usb 4-2: new full speed USB device using uhci_hcd and address 3
usb 4-2: configuration #1 chosen from 1 choice
hub 4-2:1.0: USB hub found
hub 4-2:1.0: 7 ports detected
input: HP Virtual Keyboard as /devices/pci0000:00/0000:00:04.4/usb4/4-1/4-1:1.0/input/input0
input: USB HID v1.01 Keyboard [HP Virtual Keyboard] on usb-0000:00:04.4-1
input: HP Virtual Keyboard as /devices/pci0000:00/0000:00:04.4/usb4/4-1/4-1:1.1/input/input1
input: USB HID v1.01 Mouse [HP Virtual Keyboard] on usb-0000:00:04.4-1
usbcore: registered new interface driver usbhid
drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
Netfilter messages via NETLINK v0.30.
ip_tables: (C) 2000-2006 Netfilter Core Team
arp_tables: (C) 2002 David S. Miller
TCP bic registered
TCP cubic registered
TCP westwood registered
TCP highspeed registered
TCP hybla registered
TCP htcp registered
TCP vegas registered
TCP veno registered
TCP scalable registered
TCP lp registered
TCP yeah registered
TCP illinois registered
NET: Registered protocol family 1
NET: Registered protocol family 17
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
VFS: Cannot open root device "cciss/c0d0p1" or unknown-block(0,0)
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
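A captured log like the one above can be checked mechanically for the pattern it shows: the driver banner is printed, nothing from the driver follows, and the root mount fails. The following is only a minimal sketch. The function name is hypothetical, and the idea that controller or disk messages would start with "cciss" is an assumption, not something this log confirms.

```python
def summarize_cciss_boot(log_text):
    """Summarize a boot log: driver banner seen, driver lines after it, root mount failure."""
    lines = log_text.splitlines()
    # Index of the driver banner, e.g. "HP CISS Driver (v 3.6.14)", or None if absent.
    banner_at = next(
        (i for i, line in enumerate(lines) if line.startswith("HP CISS Driver")),
        None,
    )
    after_banner = lines[banner_at + 1:] if banner_at is not None else []
    return {
        "driver_banner": banner_at is not None,
        # Assumption: controller/disk messages would begin with "cciss".
        "cciss_lines_after_banner": sum(
            1 for line in after_banner if line.lstrip().startswith("cciss")
        ),
        "root_mount_failed": any(
            "VFS: Cannot open root device" in line for line in lines
        ),
    }


if __name__ == "__main__":
    sample = "\n".join([
        "loop: module loaded",
        "HP CISS Driver (v 3.6.14)",
        "Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)",
        'VFS: Cannot open root device "cciss/c0d0p1" or unknown-block(0,0)',
    ])
    print(summarize_cciss_boot(sample))
```

A banner with zero following driver lines and a failed root mount would be consistent with a driver that loaded but found no controller. It does not say why.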
Comment 14 David Rohr 2008-02-14 06:16:12 UTC
I'm facing a problem that seems related, also using Gentoo on an x86_64 configuration (dual Opteron 270 on a Tyan K8SE).
I'm booting from a 3ware 9500s controller.
Everything went fine with 2.6.23, but starting from 2.6.24 I get the same kernel panic message: Unable to mount root fs on unknown-block(0,0)
The sd messages for sda and sdb that appear under 2.6.23 no longer appear with 2.6.24, though the 3ware driver reports that it is loaded.
Comment 15 Mike Miller 2008-02-14 07:54:16 UTC
This obviously points to something other than the cciss driver. I was finally able to get my hands on an Opteron box and recreated the problem on cciss. The driver loads but does not register the disks.
My next step is to retry using the MPT driver. Maybe then I can see what's going on.
Comment 16 Mike Miller 2008-02-14 12:50:45 UTC
I determined that the 2.6.25-rc1 tree does not have this problem. The differences in cciss are the procfs enhancements and the use of upper32 when we build our CDB in do_cciss_request. Still going through diffs elsewhere in the kernel.
Comment 17 David Rohr 2008-02-14 14:28:33 UTC
I've just tried some other kernel versions.
The problem with the 3ware card still appears with 2.6.25-rc1 and with the newest 2.6.23 (I think it was 16). Another difference that appears with all those kernels at boot is that the system hangs for about two seconds after the message that it loaded the cfq I/O scheduler.
Comment 18 Fabio Coatti 2008-09-14 09:30:02 UTC
Hi all,
A few minutes ago I tried 2.6.26.5 on exactly the same hardware and got the same behaviour described earlier in this bug. Any news? This is becoming a problem for us, as we are unable to upgrade a lot of blade servers.
Comment 19 luca pasquali 2008-09-23 12:22:32 UTC
Howdy, configuring base ACPI support makes this kernel work.
As I understand it, enabling just this should fix hardware recognition on ACPI-aware peripherals. In the non-working config attached to this bug report, power management is disabled:

# Power management options
#
# CONFIG_PM is not set
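
For anyone checking their own config against this, here is a minimal sketch that reads a kernel .config and reports which options are not enabled. The function names are hypothetical. CONFIG_BLK_CPQ_CISS_DA is assumed to be the Kconfig symbol behind "Compaq Smart Array 5xxx support", so verify it against your own tree.

```python
import re

# Matches "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" style lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; explicitly unset options map to None."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        unset = _UNSET.match(line)
        if unset:
            options[unset.group(1)] = None
            continue
        is_set = _SET.match(line)
        if is_set:
            options[is_set.group(1)] = is_set.group(2)
    return options


def missing_options(text, required):
    """List the required options that are neither built in (y) nor modular (m)."""
    options = parse_kconfig(text)
    return [name for name in required if options.get(name) not in ("y", "m")]


if __name__ == "__main__":
    sample = (
        "# Power management options\n"
        "#\n"
        "# CONFIG_PM is not set\n"
        "CONFIG_BLK_CPQ_CISS_DA=y\n"  # assumed symbol name for the cciss driver
    )
    wanted = ["CONFIG_PM", "CONFIG_ACPI", "CONFIG_BLK_CPQ_CISS_DA"]
    print(missing_options(sample, wanted))  # prints ['CONFIG_PM', 'CONFIG_ACPI']
```

An option that is absent from the file is reported the same way as one marked "is not set", which covers the case of an option silently dropped by make oldconfig.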
Comment 20 Fabio Coatti 2008-10-06 01:45:06 UTC
In fact, it seems that make oldconfig didn't correctly configure all the options related to hardware discovery.
I've lost track of all the configuration modifications that we tried, but starting from an empty config seems to solve this issue.
Now I'm wondering how some config options disappeared from the menu (using make oldconfig on the newer kernel), leaving them missing from menuconfig and thus preventing us from activating the right one.
Anyway, this ticket now seems to be solved.
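
To see which options changed or disappeared between a working and a non-working config (such as the two attached to this bug), the files can be compared option by option. This is only a sketch under stated assumptions: the function names are hypothetical, and the sample strings below are illustrative, not the contents of the attachments.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def load_options(text):
    """Return {option: value}; explicitly unset options are recorded as 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        unset = _UNSET.match(line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        is_set = _SET.match(line)
        if is_set:
            options[is_set.group(1)] = is_set.group(2)
    return options


def diff_configs(old_text, new_text):
    """Return {option: (old_value, new_value)} for every option that differs."""
    old, new = load_options(old_text), load_options(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "<absent>")
        after = new.get(name, "<absent>")
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Illustrative inputs only; in practice, read the two .config files from disk.
    working = "CONFIG_ACPI=y\nCONFIG_PM=y\nCONFIG_BLK_CPQ_CISS_DA=y\n"
    failing = "# CONFIG_PM is not set\nCONFIG_BLK_CPQ_CISS_DA=y\n"
    for name, (before, after) in diff_configs(working, failing).items():
        print(f"{name}: {before} -> {after}")
```

Reporting "<absent>" separately from "n" is deliberate: an option that vanished from the file entirely may point to a dependency that was turned off, rather than to an explicit choice.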
