Most recent kernel where this bug did not occur: Distribution: Debian unstable Hardware Environment: HP ML-350 G5 Software Environment: Problem Description: The trace written to the console: ================================= kernel BUG at drivers/block/cciss.c:2479! invalid opcode: 0000 [#1] SMP Modules linked in: mptctl sg nfsd exportfs lockd nfs_acl sunrpc ipv6 8021q raw dm_snapshot shpchp pci_hotplug psmouse serio_raw pcspkr ext3 jbd mbcache raid1 md_mod dm_mod ide_generic ide_cd usbhid hid piix tg3 cciss bnx2 generic ehci_hcd ide_core uhci_hcd usbcore thermal processor fan sr_mod cdrom mptscsih mptbase aic7xxx sym53c8xx scsi_transport_spi BusLogic scsi_mod CPU: 0 EIP: 0060:[<f894c119>] Not tainted VLI EFLAGS: 00010012 (2.6.21-1-686 #1) EIP is at do_cciss_request+0x44/0x349 [cciss] eax: f693b350 ebx: dfb7dbac ecx: 00000000 edx: 00000000 esi: 00000800 edi: f5ff99c0 ebp: f693b350 esp: dfcafbec ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process md0_raid1 (pid: 6108, ti=dfcae000 task=df981a90 task.ti=dfcae000) Stack: 00000000 dfd6004c dfb7dbac 00000c00 dfd10000 00000001 00000001 00000001 dfcafc2c c17430e0 00000c00 00000000 00000200 c16bad80 00000000 00000000 c195a24c dfebaadc 00000000 00000040 00000040 dfcafc9c dfd60250 dfc97bec Call Trace: [<c01bd25b>] elv_next_request+0x10d/0x11c [<f894a105>] start_io+0x7b/0xe1 [cciss] [<f894c413>] do_cciss_request+0x33e/0x349 [cciss] [<c016438a>] cache_alloc_refill+0x58/0x466 [<c01c6dfc>] cfq_set_request+0x299/0x315 [<c01bd54a>] elv_rb_add+0x65/0x6d [<c01c63e8>] cfq_add_rq_rb+0x5c/0x6b [<c01c6421>] cfq_insert_request+0x2a/0x3ae [<c0129e5f>] lock_timer_base+0x15/0x2f [<c012a15d>] del_timer+0x48/0x4e [<c01bfa97>] blk_remove_plug+0x57/0x63 [<c0129f73>] __mod_timer+0x9c/0xa6 [<c01bfac0>] __generic_unplug_device+0x1d/0x1f [<c01c0c02>] __make_request+0x34c/0x46c [<c01bedfa>] generic_make_request+0x1a9/0x1b9 [<c012a15d>] del_timer+0x48/0x4e [<c01bfa97>] blk_remove_plug+0x57/0x63 [<f8961ba9>] raid1d+0xbf/0xd0e [raid1] [<c0102ff1>] __switch_to+0xfe/0x131 [<c011b0fe>] __activate_task+0x1c/0x29 [<c029d6d9>] schedule_timeout+0x13/0x8d [<c0124b9c>] do_exit+0x6c2/0x6c6 [<f89a1a71>] md_thread+0xc6/0xdd [md_mod] [<c01328e5>] autoremove_wake_function+0x0/0x35 [<f89a19ab>] md_thread+0x0/0xdd [md_mod] [<c013281a>] kthread+0xb2/0xdc [<c0132768>] kthread+0x0/0xdc [<c01049a7>] kernel_thread_helper+0x7/0x10 ======================= Code: 44 24 10 8b 82 dc 00 00 00 84 c0 0f 88 0b 03 00 00 8b 44 24 08 e8 46 10 87 c7 85 c0 89 c5 0f 84 f8 02 00 00 66 83 78 68 1f 76 04 <0f> 0b eb fe 8b 44 24 10 ba 01 00 00 00 e8 a2 f9 ff ff 85 c0 89 EIP: [<f894c119>] do_cciss_request+0x44/0x349 [cciss] SS:ESP 0068:dfcafbec The bug is triggered on the line: BUG_ON(creq->nr_phys_segments > MAXSGENTRIES); in the function drivers/block/cciss.c:do_cciss_request() It looks like the raid code doen not hounour the limitations of the cciss driver. I have added a printk to dm_table_set_restrictions() and I saw the value of 31 (MAXSGENTRIES) pass by. I filed this bug at first as http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=426705, but as it seems to be de dm-driver, it looks more at place here. An other bug that looks related is: http://bugzilla.kernel.org/show_bug.cgi?id=7763, but it is in dm_crypt. But from the data in the attachment in #13, it seems to have the same cause. To add another data-point, I compiled a kernel with MAX_PHYS_SEGMENTS = MAX_HW_SEGMENTS = 16 This make the problem go away (as expected). Steps to reproduce: I have an ML-350-G5 with an E200i raid controller. There are 2 logical drives defined that map 1-1 on a physical drive. Those disks are part of a software RAID-1 array. When initializing an oracle database on the system, the system panics.
Please change 'component' field to 'MD'.
Hi Leo, Your stack trace suggests that you are using 'md' raid1, not 'dm' raid1. However the fact that a printk in dm_table_set_restrictions printed something suggests that you are using dm as well. Could you please explain exactly the configuration of devices, raid/lvm and filesystems. Both dm and md to attempt to honour the max_phys_segments limitation, so something odd must be happening. A description of your configuration might help me work out where to look. Thanks, NeilBrown
Setting to Need Info for configuration description (I cannot seem to both accept a bug, and set needinfo at the same time...)
On Thu, Nov 15, 2007 at 04:24:18PM -0800, bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=8672 > > > neilb@suse.de changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > AssignedTo|mike.miller@hp.com |neilb@suse.de > Status|NEW |ASSIGNED > > > > > ------- Comment #2 from neilb@suse.de 2007-11-15 16:24 ------- > Hi Leo, > Your stack trace suggests that you are using 'md' raid1, not 'dm' raid1. > However the fact that a printk in dm_table_set_restrictions printed > something > suggests that you are using dm as well. > > Could you please explain exactly the configuration of devices, raid/lvm and > filesystems. > > Both dm and md to attempt to honour the max_phys_segments limitation, so > something odd must be happening. A description of your configuration might > help me work out where to look. Neil, We have servers with 2 physical disks. Those disks are partitioned alike in a data, a swap and a boot partition. The data and the swap partitions are both mirrored as raid1 devices using mdadm(1). So here you have md on top of cciss. On top of the raid1 data partition, we use lvm2 to create the logical volumes for the filesystems (pvcreate -> vgcreat -> lvcreate ). Leo.
Hmmm.. nothing obvious there that would trigger a problem. Could you try the patch which I will attach? It just prints out some relevant values at various places so we can see where something goes wrong. Thanks.
Created attachment 13702 [details] tracing patch Please try this patch and report what is printed.
Ok, I tried it: <kernel boots> ... Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before set_limits : rs=31 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 In table_set_restrictions: 31 Logical volume "snap" created 27 logical volume(s) in volume group "ahraid" now active Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Logical volume "snap" successfully removed Starting raid devices: done. Starting raid devices: done. ... <starting database init> ... kernel BUG at drivers/block/cciss.c:2482! invalid opcode: 0000 [#1] SMP Modules linked in: sg ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_si ipmi_msghandler 8021q nfsd exportfs lockd nfs_acl sunrpc ipv6 raw dm_snapshot tsdev usbhid hid ehci_hcd uhci_hcd usbcore evdev psmouse serio_raw pcspkr shpchp pci_hotplug ext3 jbd mbcache dm_mod ide_generic raid1 md_mod ide_cd ata_piix libata piix generic bnx2 ide_core tg3 cciss thermal processor fan sr_mod cdrom mptscsih mptbase aic7xxx sym53c8xx scsi_transport_spi BusLogic scsi_mod CPU: 0 EIP: 0060:[<f88e9652>] Not tainted VLI EFLAGS: 00010012 (2.6.21.5-ah2 #1) EIP is at do_cciss_request+0x75/0x36c [cciss] eax: 0000001b ebx: dfe92bac ecx: 00000082 edx: 00000000 esi: 00000800 edi: f2bc0bc0 ebp: f75ace18 esp: dfef9bf0 ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process md0_raid1 (pid: 1131, ti=dfef8000 task=dfa7a580 task.ti=dfef8000) Stack: f88ecfe2 00000020 0000001f 0000001f dff5d000 00000001 dfe92bac 00000c00 dfb70000 00000001 00000001 dfef9c3c c16dff00 00000c00 00000000 00000200 00000001 00000040 dfbe0250 dfb6cbec 00000000 f75a9ed0 c01bd4a9 00000001 Call Trace: [<c01bd4a9>] elv_next_request+0x10d/0x11c [<f88e6105>] start_io+0x7b/0xe1 [cciss] [<f88e993e>] do_cciss_request+0x361/0x36c [cciss] [<c01645fa>] cache_alloc_refill+0x58/0x465 [<c01c700d>] cfq_set_request+0x299/0x315 [<c01bd798>] elv_rb_add+0x65/0x6d [<c01c65ec>] cfq_add_rq_rb+0x5c/0x6b [<c01c6625>] cfq_insert_request+0x2a/0x3ae [<c0129efc>] lock_timer_base+0x15/0x2f [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<c012a010>] __mod_timer+0x9c/0xa6 [<c01bfce2>] __generic_unplug_device+0x1d/0x1f [<c01c0e15>] __make_request+0x337/0x456 [<c01bf02a>] generic_make_request+0x1a9/0x1b9 [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<f896abb5>] raid1d+0xbe/0xd02 [raid1] [<c0102fee>] __switch_to+0xfe/0x130 [<c029d990>] schedule_timeout+0x13/0x8d [<c010475b>] common_interrupt+0x23/0x28 [<c029007b>] __find_acq_core+0xa7/0x30d [<f89dba6a>] md_thread+0xc6/0xdc [md_mod] [<c0132971>] autoremove_wake_function+0x0/0x35 [<f89db9a4>] md_thread+0x0/0xdc [md_mod] [<c01328a6>] kthread+0xb2/0xdc [<c01327f4>] kthread+0x0/0xdc [<c01049a3>] kernel_thread_helper+0x7/0x10 ======================= Code: 18 0f b7 81 40 01 00 00 c7 44 24 08 1f 00 00 00 89 54 24 04 c7 04 24 e2 cf 8e f8 89 44 24 0c e8 b8 8f 83 c7 66 83 7d 68 1f 76 04 <0f> 0b eb fe 8b 44 24 20 ba 01 00 00 00 e8 ce cc ff ff 85 c0 89 EIP: [<f88e9652>] do_cciss_request+0x75/0x36c [cciss] SS:ESP 0068:dfef9bf0
The patch should have cause a message to be printed immediate before the kernel BUG at ... message. I don't see it in the output you provide. Did you clip it, or was it really not there. If you did leave it out by mistake, please provide it. Thanks.
It really isn't there. And it _is_ in the source, just checked it. Can it be that it has not been flushed? Is there a way to make sure that the message is printed before running into the BUG_ON() macro?
On Wednesday November 28, bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=8672 > It really isn't there. And it _is_ in the source, just checked it. Can it be > that it has not been flushed? Is there a way to make sure that the message is > printed before running into the BUG_ON() macro? The BUG_ON macro uses printk to print messages, so there is no way it would interfere with any previous printk. It is possible that cciss is a module, and an old module is being loaded? The patch will have changed the line number of the BUG, and between the oops it changes from 2479 to 2482, which is a 3 line change to match the 3 lines that were added, so it seems to be OK. So that tells us that .... the BUG_ON macro is being miscompiled???. How about changing the code to read: if (creq->nr_phys_segments > MAXSGENTRIES) { printk("nps = %d, MAX=%d mps=%d\n", creq->nr_phys_segments, MAXSGENTRIES, q->max_phys_segments); BUG(); } Then we will definitely get a printk before the BUG. NeilBrown
I am out of ideas.... - I tried your suggestion above, but the was no output from the printk(). I even added an extra comment line before BUG(), so the line number would change. - I tried printk(KERN_CRIT "nps = %d.... to see if it would make a difference. It didn't. - I added an udelay() after the printk, see if that would help... No. - I added int i; for (i = 0; i < 1000; i++) usleep(1000); just before BUG(), but it does not help. Allthough it hints at the fact that our code is used, since the BUG has moved to CPU#1 and I now have a soft lockup on CPU#0. And, oh, adding 'nosmp' to the kernel bootparms does not work either, since that will make the disk probe fail...
Weird.... How about changing the code to if (creq->nr_phys_segments > MAXSGENTRIES) { printk(".....", ......); end_that_request_last(creq, -EIO); goto queue; } so that instead of calling BUG, we just fail the request...
OK, tried that one, but no printf output. I now get: ------------[ cut here ]------------ kernel BUG at block/ll_rw_blk.c:2749! invalid opcode: 0000 [#1] SMP Modules linked in: sg ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_si ipmi_msghandler 8021q nfsd exportfs lockd nfs_acl sunrpc ipv6 raw dm_snapshot tsdev usbhid hid ehci_hcd uhci_hcd evdev psmouse usbcore shpchp pci_hotplug serio_raw pcspkr ext3 jbd mbcache dm_mod ide_generic raid1 md_mod ide_cd ata_piix libata piix bnx2 generic ide_core tg3 cciss thermal processor fan sr_mod cdrom mptscsih mptbase aic7xxx sym53c8xx scsi_transport_spi BusLogic scsi_mod CPU: 0 EIP: 0060:[<c01bf118>] Not tainted VLI EFLAGS: 00010083 (2.6.21.5-ah2 #2) EIP is at __blk_put_request+0x33/0x75 eax: 00000001 ebx: dfdef858 ecx: 00000000 edx: 00000000 esi: 00018485 edi: dfbe8bac ebp: dfdef858 esp: c1b6bbbc ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process md0_raid1 (pid: 1132, ti=c1b6a000 task=dfe0da90 task.ti=c1b6a000) Stack: dfbe8bac 00000800 dfbe8bac dfdef858 f88e9657 f88ecfe2 00000020 0000001f 0000001f c1b22750 c1b22750 dfbe8bac f88e9941 dfbb0000 dfe84e00 c01c6134 dfba747c dfe84e00 00000001 dfbe87ec 00000000 dfbb0000 00000005 00000005 Call Trace: [<f88e9657>] do_cciss_request+0x7a/0x36f [cciss] [<f88e9941>] do_cciss_request+0x364/0x36f [cciss] [<c01c6134>] cfq_dispatch_insert+0x34/0x53 [<c01c700d>] cfq_set_request+0x299/0x315 [<c01bd798>] elv_rb_add+0x65/0x6d [<c0129efc>] lock_timer_base+0x15/0x2f [<c012a1fa>] del_timer+0x48/0x4e [<c0129efc>] lock_timer_base+0x15/0x2f [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<c01bfce2>] __generic_unplug_device+0x1d/0x1f [<c01bd55e>] elv_insert+0xa6/0x146 [<c012a010>] __mod_timer+0x9c/0xa6 [<c01c0e07>] __make_request+0x329/0x456 [<c01bf02a>] generic_make_request+0x1a9/0x1b9 [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<f88aebb5>] raid1d+0xbe/0xd02 [raid1] [<c0102fee>] __switch_to+0xfe/0x130 [<c029d990>] schedule_timeout+0x13/0x8d [<c010475b>] common_interrupt+0x23/0x28 [<f89dba6a>] md_thread+0xc6/0xdc [md_mod] [<c0132971>] autoremove_wake_function+0x0/0x35 [<f89db9a4>] md_thread+0x0/0xdc [md_mod] [<c01328a6>] kthread+0xb2/0xdc [<c01327f4>] kthread+0x0/0xdc [<c01049a3>] kernel_thread_helper+0x7/0x10 ======================= Code: 89 d3 74 64 8b 82 80 00 00 00 48 85 c0 89 82 80 00 00 00 75 53 89 f8 e8 25 e2 ff ff 8b 73 14 f7 c6 00 00 01 00 74 41 39 1b 74 04 <0f> 0b eb fe 83 7b 48 00 74 04 0f 0b eb fe 89 f5 81 e5 00 04 00 EIP: [<c01bf118>] __blk_put_request+0x33/0x75 SS:ESP 0068:c1b6bbbc Note that block/ll_rw_blk.c:2749 contains: BUG_ON(!list_empty(&req->queuelist));
I'm afraid I don't know what to suggest next. The fact that the printk isn't working suggests that something very odd is going on, and until that is resolve, there is not a lot of point pursuing the original symptom. I would probably start sprinkling printks all over the place and trying to figure out why some work and some don't.
OK, got it! I have modified lib/bug.c. I added: char leo_msg[1024]; EXPORT_SYMBOL(leo_msg); and to report_bug(), I added just below the 'cut here' print: if (leo_msg[0] != '\0') { printk(KERN_EMERG "%s", leo_msg); } In cciss.c I changed the debugging code block to: if (creq->nr_phys_segments > MAXSGENTRIES) { extern char leo_msg[]; printk(KERN_EMERG "nps = %d, MAX=%d mps=%d\n", creq->nr_phys_segments, MAXSGENTRIES, q->max_phys_segments); sprintf(leo_msg, "nps = %d, MAX=%d mps=%d\n", creq->nr_phys_segments, MAXSGENTRIES, q->max_phys_segments); BUG(); } And now I've got both messages... But alas, finally the needed data ;-) I pasted it from the riloe serial output: [ .... First part of kernel boot msg's deleted .... ]] hda: ATAPI 48X DVD-ROM CD-R/RW drive, 1536kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 cciss0: <0x3238> at PCI 0000:13:08.0 IRQ 216 using DAC blocks= 143305920 block_size= 512 heads=255, sectors=32, cylinders=17562 blocks= 143305920 block_size= 512 heads=255, sectors=32, cylinders=17562 blocks= 143305920 block_size= 512 heads=255, sectors=32, cylinders=17562 cciss/c0d0: p1 p2 p3 blocks= 143305920 block_size= 512 heads=255, sectors=32, cylinders=17562 cciss/c0d1: p1 p2 p3 tg3.c:v3.75.2 (June 5, 2007) ACPI: PCI Interrupt 0000:14:01.0[A] -> GSI 19 (level, low) -> IRQ 18 eth1: Tigon3 [partno(407709-001) rev 1100 PHY(5703)] (PCIX:133MHz:64-bit) 10/100/1000Base-T Ethernet 00:18:fe:83:a5:ec eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] WireSpeed[1] TSOcap[1] eth1: dma_rwctrl[769c4000] dma_mask[64-bit] Done. Begin: Mounting root file system... ... Begin: Running /scripts/local-top ... md: raid1 personality registered for level 1 md: md1 stopped. md: bind<cciss/c0d1p1> md: bind<cciss/c0d0p1> Before stack: t=128 b=31 After stack: t=31 Before stack: t=31 b=31 After stack: t=31 raid1: raid set md1 active with 2 out of 2 mirrors mdadm: /dev//md1 has been started with 2 drives. md: md0 stopped. md: bind<cciss/c0d1p3> md: bind<cciss/c0d0p3> md: md0: raid array is not clean -- starting background reconstruction Before stack: t=128 b=31 After stack: t=31 raid1: raid set md0 active with 1 out of 2 mirrors Before stack: t=31 b=31 After stack: t=31 RAID1 conf printout: --- wd:1 rd:2 disk 0, wo:0, o:1, dev:cciss/c0d0p3 disk 1, wo:1, o:1, dev:cciss/c0d1p3 md: recovery of RAID array md0 md: minimum _guaranteed_ speed: 1000 KB/sec/disk. mdadm: /dev//md0md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. has been started with 1 drive (md: using 128k window, over a total of 33794560 blocks. out of 2) and 1 spare. device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 27 logical volume(s) in volume group "ahraid" now active Done. Begin: Running /scripts/local-premount ... Done. EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds [ .... Initrd does his thing .... ] Loading kernel module raid1. Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Before set_limits : rs=0 q=31 After set_limits: 31 Before set_limits : rs=31 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Logical volume "snap" created 27 logical volume(s) in volume group "ahraid" now active Before set_limits : rs=0 q=31 After set_limits: 31 Before combine_low : lhs=0 rhs=31 After combine_low: 31 Before valid: 31 After valid: 31 In table_set_restrictions: 31 Logical volume "snap" successfully removed Starting raid devices: done. Setting up LVM Volume Groups... [ .... Rest of boot msg's deleted .... ] nps = 32, MAX=31 mps=31 ------------[ cut here ]------------ nps = 32, MAX=31 mps=31 kernel BUG at drivers/block/cciss.c:2485! invalid opcode: 0000 [#1] SMP Modules linked in: sg ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_si ipmi_msghandler 8021q nfsd exportfs lockd nfs_acl sunrpc ipv6 raw dm_snapshot tsdev usbhid hid ehci_hcd uhci_hcd shpchp pci_hotplug usbcore psmouse serio_raw evdev pcspkr ext3 jbd mbcache dm_mod ide_generic raid1 md_mod ide_cd ata_piix libata piix bnx2 generic ide_core tg3 cciss thermal processor fan sr_mod cdrom mptscsih mptbase aic7xxx sym53c8xx scsi_transport_spi BusLogic scsi_mod CPU: 1 EIP: 0060:[<f88b067e>] Not tainted VLI EFLAGS: 00010086 (2.6.21.5-ah2 #5) EIP is at do_cciss_request+0xa1/0x398 [cciss] eax: 00000018 ebx: dfe34bac ecx: c1b41bd4 edx: c1b41bd4 esi: 00000800 edi: dfe34bac ebp: f7b7be18 esp: c1b41bcc ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process md0_raid1 (pid: 1128, ti=c1b40000 task=dfb3c580 task.ti=c1b40000) Stack: c03bba00 f88b401e 00000020 0000001f 0000001f dfbe2750 029ae960 dfe34bac dfeac59c dfbd0000 c01c6134 dfeac59c dfe8ca00 dff517a0 00000001 dfe347ec 00000000 dfbd0000 00000009 00000009 c1b41cbc c1677660 00000000 00000000 Call Trace: [<c01c6134>] cfq_dispatch_insert+0x34/0x53 [<c01c700d>] cfq_set_request+0x299/0x315 [<c01bd798>] elv_rb_add+0x65/0x6d [<c0129efc>] lock_timer_base+0x15/0x2f [<c012a1fa>] del_timer+0x48/0x4e [<c0129efc>] lock_timer_base+0x15/0x2f [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<c01bfce2>] __generic_unplug_device+0x1d/0x1f [<c01bd55e>] elv_insert+0xa6/0x146 [<c012a010>] __mod_timer+0x9c/0xa6 [<c01c0e07>] __make_request+0x329/0x456 [<c01bf02a>] generic_make_request+0x1a9/0x1b9 [<c012a1fa>] del_timer+0x48/0x4e [<c01bfcb9>] blk_remove_plug+0x57/0x63 [<f89a1bb5>] raid1d+0xbe/0xd02 [raid1] [<c0102fee>] __switch_to+0xfe/0x130 [<c029d9b0>] schedule_timeout+0x13/0x8d [<c010475b>] common_interrupt+0x23/0x28 [<f89bea6a>] md_thread+0xc6/0xdc [md_mod] [<c0132971>] autoremove_wake_function+0x0/0x35 [<f89be9a4>] md_thread+0x0/0xdc [md_mod] [<c01328a6>] kthread+0xb2/0xdc [<c01327f4>] kthread+0x0/0xdc [<c01049a3>] kernel_thread_helper+0x7/0x10 ======================= Code: 01 00 00 0f b7 55 68 c7 44 24 0c 1f 00 00 00 c7 44 24 04 1e 40 8b f8 89 44 24 10 89 54 24 08 c7 04 24 00 ba 3b c0 e8 7f b6 91 c7 <0f> 0b eb fe 8b 44 24 24 ba 01 00 00 00 e8 a2 cc ff ff 85 c0 89 EIP: [<f88b067e>] do_cciss_request+0xa1/0x398 [cciss] SS:ESP 0068:c1b41bcc
So, despite max_phys_segments always being 31, we have a case where nr_phys_segments is 32. I suspect the most likely cause of this is that blk_recalc_rq_segments makes different BIOVEC_PHYS_MERGEABLE etc decisions at the cciss level than at the md or dm level. I wonder if it could be seg_boundary_mask. dm default it to '-1' and then uses min_non_zero, so it might stay at -1 which would seem wrong. Maybe try changing the -1 in 'check_for_valid_limits' in driver/md/dm-table.c to 0xffffffff (8 f's). It is just a hunch and I will try to look closer and see if there are other possibilities, but it is worth trying I think.
Neil, I recently found similar problem in dm-crypt, (bug 7763) see patch http://www2.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-use-bio_add_page.patch In raid1.c are similar code patterns directly manipulating with bvec, isn't this source of problems too ? Milan
On Friday December 7, bugme-daemon@bugzilla.kernel.org wrote: > Neil, > > I recently found similar problem in dm-crypt, (bug 7763) see patch > > http://www2.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-use-bio_add_page.patch > > In raid1.c are similar code patterns directly manipulating with bvec, > isn't this source of problems too ? Hmmm. that's a nasty problem. It could affect raid1 if --write-behind is being used, which I don't think is the case here. Fixing the --write-behind case would be rather ugly. Probably the best fix would be to set the seb_boundary_mask very low if --write-behind is in effect. NeilBrown
If anyone's still watching this, please try the patch in: http://lkml.org/lkml/2008/11/30/108 Subject: [PATCH] block: fix setting of max_segment_size and seg_boundary mask
bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=8672 I think that this problem was fixed here http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0e435ac26e3f951d83338ed3d4ab7dc0fe0055bc