Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.27-rc8
Distribution: Mandriva
Problem Description:

Oops when reading /proc/megaraid/hba0/diskdrives-ch*

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<f88729b4>] :megaraid:mega_internal_command+0x74/0x130
*pdpt = 0000000032c80001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
Modules linked in: nfsd auth_rpcgss exportfs ohci1394 ieee1394 nfs lockd nfs_acl sunrpc af_packet binfmt_misc loop ext3 jbd dm_mod rtc_cmos eepro100 shpchp pci_hotplug ipmi_msghandler e100 ide_cd_mod mii i2c_core sg sworks_agp agpgart serverworks ide_core i2o_core megaraid sd_mod scsi_mod crc_t10dif xfs uhci_hcd ohci_hcd ehci_hcd usbcore [last unloaded: scsi_wait_scan]

Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
EIP: 0060:[<f88729b4>] EFLAGS: 00010246 CPU: 0
EIP is at mega_internal_command+0x74/0x130 [megaraid]
EAX: 00000000 EBX: f7942364 ECX: 00000000 EDX: f2d96000
ESI: f7942850 EDI: f7942860 EBP: f24e1df4 ESP: f24e1dc0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process diff (pid: 2319, ti=f24e0000 task=f39f5710 task.ti=f24e0000)
Stack: 3691e000 00000000 f24e1e1c c010e4c8 f24e1e0c 00000000 f24e1dfe f79429b4
       f79428e8 f2d96000 f7942364 f24e1dfe f24e1e10 f24e1e1c f8872b51 00040000
       00000000 e0000000 00003691 00000000 f8873f00 f7930c08 c2a56c08 f24e1f08
Call Trace:
 [<c010e4c8>] ? dma_alloc_coherent+0x108/0x2c0
 [<f8872b51>] ? mega_adapinq+0x41/0x60 [megaraid]
 [<f8873f00>] ? proc_pdrv_ch0+0x0/0x20 [megaraid]
 [<f8873887>] ? proc_pdrv+0xb7/0x6d0 [megaraid]
 [<c01845fe>] ? __alloc_pages_internal+0xae/0x460
 [<c01b6d29>] ? do_filp_open+0x1c9/0x7c0
 [<f8873f00>] ? proc_pdrv_ch0+0x0/0x20 [megaraid]
 [<f8873f18>] ? proc_pdrv_ch0+0x18/0x20 [megaraid]
 [<c01ed6ab>] ? proc_file_read+0x17b/0x260
 [<c01ed530>] ? proc_file_read+0x0/0x260
 [<c01e8c9e>] ? proc_reg_read+0x5e/0x90
 [<c01ab959>] ? vfs_read+0x99/0x160
 [<c01e8c40>] ? proc_reg_read+0x0/0x90
 [<c01abadd>] ? sys_read+0x3d/0x70
 [<c0109e03>] ? sysenter_do_call+0x12/0x2f
 [<c0390000>] ? native_cpu_up+0xf0/0x743
 =======================
Code: 98 4c c0 e8 ff 39 93 c7 8d bb fc 04 00 00 89 45 f0 8b 55 f0 89 83 84 05 00 00 8b 43 48 89 02 8b 83 b8 05 00 00 89 b3 44 06 00 00 <c6> 00 e1 8b 45 ec 83 8b f0 04 00 00 01 8b 75 e4 89 83 48 05 00
EIP: [<f88729b4>] mega_internal_command+0x74/0x130 [megaraid] SS:ESP 0068:f24e1dc0
---[ end trace 18d8357732584241 ]---
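A note on reading the trace above: `Oops: 0002` is the x86 page-fault error code, and the bytes marked `<c6> 00 e1` in the `Code:` line decode to `movb $0xe1,(%eax)`. With EAX = 00000000 in the register dump, that is a one-byte store through a NULL pointer. The sketch below decodes the low three bits of the error code; the struct and function names are made up for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type, for illustration only (not a kernel structure). */
struct pf_info {
	bool protection_fault;	/* bit 0: 1 = protection violation, 0 = page not present */
	bool write;		/* bit 1: 1 = write access, 0 = read access */
	bool user_mode;		/* bit 2: 1 = fault raised in user mode, 0 = kernel mode */
};

/* Decode the low three bits of an x86 page-fault error code into *out. */
static void decode_pf_error(unsigned long code, struct pf_info *out)
{
	assert(out != NULL);
	out->protection_fault = (code & 0x1) != 0;
	out->write = (code & 0x2) != 0;
	out->user_mode = (code & 0x4) != 0;
}
```

Applied to the reported value 0x0002, this gives a write to a not-present page from kernel mode, which is consistent with the "NULL pointer dereference at 00000000" line.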
Sorry, the last working version is actually 2.6.22.19; this server never had 2.6.24.
Reassigned to scsi, marked as a regression.
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

For some reason this didn't come out on linux-scsi when I reassigned it to scsi.

On Sun, 19 Oct 2008 16:27:35 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11792
>
>            Summary: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.27-rc8
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: drivers_other@kernel-bugs.osdl.org
>         ReportedBy: pterjan@gmail.com
>
>
> Latest working kernel version: 2.6.24
> Earliest failing kernel version: 2.6.27-rc8

It's a regression.

> Distribution: Mandriva
> Problem Description:
>
> Oops when reading /proc/megaraid/hba0/diskdrives-ch*
>
> [snip]
On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
> > Latest working kernel version: 2.6.24
> > Earliest failing kernel version: 2.6.27-rc8
>
> It's a regression.
>
> > Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)

It's also a distro kernel by the looks of things. Can it be reproduced with an upstream kernel?
On Tue, Oct 21, 2008 at 9:54 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
>> > Latest working kernel version: 2.6.24
>> > Earliest failing kernel version: 2.6.27-rc8
>>
>> It's a regression.
>>
>> > Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
>
> It's also a distro kernel by the looks of things. Can it be reproduced
> with an upstream kernel?

I will try booting the server on a vanilla kernel, but I'm not sure when (we already rebooted it twice recently and users won't enjoy it).

This is a distro kernel, but I don't see any patches that could impact this:

http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/current/PATCHES/patches/

The machine is an old HP NetServer LT 6000:

04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 09) (prog-if 01)
	Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
	Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 11
	Memory at f4000000 (32-bit, prefetchable) [size=64M]
	[virtual] Expansion ROM at a8130000 [disabled] [size=32K]
	Capabilities: [80] Power Management version 2
	Kernel driver in use: megaraid_legacy
	Kernel modules: i2o_core, megaraid
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Tue, 21 Oct 2008 22:22:37 +0200 "Pascal Terjan" <pterjan@gmail.com> wrote:

> [snip]
>
> Machine is a old HP NetServer LT 6000
>
> 04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 09) (prog-if 01)
>        Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
>        Kernel driver in use: megaraid_legacy
>        Kernel modules: i2o_core, megaraid

This patch helps?


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..9294ed8 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4414,12 +4414,14 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	scmd = &adapter->int_scmd;
 	memset(scmd, 0, sizeof(Scsi_Cmnd));
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
 
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
+	scmd->cmnd = adapter->int_cdb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
 
 	scb->state |= SCB_ACTIVE;
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..5ffec15 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -889,6 +889,7 @@ typedef struct {
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
 	scb_t	int_scb;
+	unsigned char	int_cdb[MAX_COMMAND_SIZE];
 	Scsi_Cmnd	int_scmd;
 	struct mutex	int_mtx;	/* To synchronize the internal
 						commands */
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> This patch helps?
>
> [snip patch]

Hi TOMO.

This might not be enough; for example, I don't see the allocation of sense_buffer. It might be much easier to allocate using the new command allocation API James did, just for such cases. These are: scsi_allocate_command/scsi_free_command

Thanks
Boaz
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 11:04:44 +0200 Boaz Harrosh <bharrosh@panasas.com> wrote:

> [snip]
>
> Hi TOMO.
>
> This might not be enough; for example, I don't see the allocation of
> sense_buffer.
> It might be much easier to allocate using the new command allocation API James
> did, just for such cases. These are: scsi_allocate_command/scsi_free_command

Yeah, it might be. It's fine by me too. But this code path is used only for issuing internal special commands. It doesn't use most of scsi_cmnd. For example, these commands don't use the sense buffer, I think. The code path uses scsi_cmnd just for hooking scb_t, a structure that megaraid allocates per command.
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> [snip]
>
> Yeah, it might be. It's fine by me too. But this code path is used
> only for issuing internal special commands. It doesn't use most of
> scsi_cmnd. For example, these commands don't use the sense
> buffer, I think. The code path uses scsi_cmnd just for hooking scb_t,
> a structure that megaraid allocates per command.

OK Thanks.
I was not sure because it looks like in mega_cmd_done(), if the status is 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what you say, the HW will never return 0x2 in case of an Internal-Command. I just wanted to make sure.

Boaz
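Boaz's concern can be illustrated with a small userland sketch: if a completion handler copies sense data whenever the status is 0x2, a command whose sense_buffer pointer was never set up has nowhere to put it. Everything below is a hypothetical stand-in (the struct, the function, and the 14-byte length are assumptions made for this sketch) and is not the actual mega_cmd_done() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STATUS_CHECK_CONDITION 0x2	/* the status value mentioned above */
#define MOCK_SENSE_LEN 14		/* length chosen for this sketch only */

/* Hypothetical command: sense_buffer is a pointer, so it is NULL after a memset. */
struct mock_cmd {
	unsigned char *sense_buffer;
};

/*
 * Copy sense data only on CHECK_CONDITION.
 * Returns the number of bytes copied, 0 for any other status,
 * or -1 when no sense buffer was provided.
 */
static int mock_cmd_done(struct mock_cmd *cmd, int status,
			 const unsigned char *sense)
{
	assert(cmd != NULL && sense != NULL);

	if (status != STATUS_CHECK_CONDITION)
		return 0;
	if (cmd->sense_buffer == NULL)
		return -1;	/* an unguarded memcpy() here would fault */

	memcpy(cmd->sense_buffer, sense, MOCK_SENSE_LEN);
	return MOCK_SENSE_LEN;
}
```

Whether the firmware can actually return 0x2 for an internal command is the open question in the exchange above; the sketch only shows why the answer matters.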
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 12:08:27 +0200 Boaz Harrosh <bharrosh@panasas.com> wrote:

> [snip]
>
> OK Thanks.
> I was not sure because it looks like in mega_cmd_done(), if the status is
> 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what
> you say, the HW will never return 0x2 in case of an Internal-Command. I
> just wanted to make sure.

I thought that all internal commands were non-SCSI commands, but it seems that there is one exception (issuing INQUIRY as an internal command).

I'm not sure I understand the driver correctly, but anyway here is a version using scsi_allocate_command and scsi_free_command.

I guess that we need to check for kzalloc failure too, but that is supposed to be fixed by a different patch.


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb_t	*scb;
 	int	rval;
 
+	scmd = scsi_allocate_command(GFP_KERNEL);
+	if (!scmd)
+		return -ENOMEM;
+
 	/*
 	 * The internal commands share one command id and hence are
 	 * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb = &adapter->int_scb;
 	memset(scb, 0, sizeof(scb_t));
 
-	scmd = &adapter->int_scmd;
-	memset(scmd, 0, sizeof(Scsi_Cmnd));
-
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+	scmd->cmnd = adapter->int_cdb;
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	mutex_unlock(&adapter->int_mtx);
 
+	scsi_free_command(GFP_KERNEL, scmd);
+
 	return rval;
 }
 
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {
 
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
+	unsigned char int_cdb[MAX_COMMAND_SIZE];
 	scb_t	int_scb;
-	Scsi_Cmnd	int_scmd;
 	struct mutex	int_mtx;	/* To synchronize the internal
 						commands */
 	struct completion	int_waitq;	/* wait queue for internal
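The shape of the second patch (allocate the command on entry, return -ENOMEM if that fails, free it on the way out) can be sketched in plain userland C. The `mock_*` names below are hypothetical stand-ins for scsi_allocate_command()/scsi_free_command() and for the driver structures; the counter exists only so the pairing can be checked, and the 16-byte CDB size is an assumption for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real SCSI midlayer types. */
struct mock_cmd {
	unsigned char *cmnd;
};

struct mock_adapter {
	unsigned char int_cdb[16];	/* size assumed for this sketch */
};

static int live_cmds;	/* commands allocated but not yet freed */

static struct mock_cmd *mock_allocate_command(void)
{
	struct mock_cmd *c = calloc(1, sizeof(*c));

	if (c)
		live_cmds++;
	return c;
}

static void mock_free_command(struct mock_cmd *c)
{
	free(c);
	live_cmds--;
}

/* Allocate up front, bail out early on failure, free before returning. */
static int mock_internal_command(struct mock_adapter *a, unsigned char opcode)
{
	struct mock_cmd *c;
	int rval = 0;

	assert(a != NULL);

	c = mock_allocate_command();
	if (!c)
		return -ENOMEM;

	memset(a->int_cdb, 0, sizeof(a->int_cdb));
	c->cmnd = a->int_cdb;	/* CDB storage still lives in the adapter */
	c->cmnd[0] = opcode;

	/* A real driver would issue the command and wait for completion here. */

	mock_free_command(c);
	return rval;
}
```

One design point this makes visible: the command object is now per call, while the CDB bytes remain in the adapter, so the existing serialization of internal commands is still what protects int_cdb.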
I see that the latest working kernel is 2.6.24 and the first failing kernel version is 2.6.27-rc8. I understand there are lots of changes between those two kernels. Can you take a look at the changes between the kernels to find out the root cause? Also, if you believe this is a driver issue and need LSI to help, can you report this issue to LSI?

Thanks.

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Wednesday, October 22, 2008 8:34 AM
To: bharrosh@panasas.com
Cc: fujita.tomonori@lab.ntt.co.jp; pterjan@gmail.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; Yang, Bo; bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

[snip]
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 07:03:03 -0600 "Yang, Bo" <Bo.Yang@lsi.com> wrote:

> I see that the latest working kernel is 2.6.24 and the first failing kernel
> version is 2.6.27-rc8. I understand there are lots of changes between
> those two kernels. Can you take a look at the changes between the kernels to
> find out the root cause?

Sorry, I didn't explain the possible root cause.

struct scsi_cmnd in 2.6.25:

	unsigned char cmnd[MAX_COMMAND_SIZE];

struct scsi_cmnd in 2.6.26:

	unsigned char *cmnd;

In short, struct scsi_cmnd no longer has a static array for the CDB. You need to allocate memory for it (the SCSI midlayer does this for common usage). So:

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

I suspect that the driver crashes here. My patch adds an array to adapter_t and uses it here.

After 2.6.25, sense_buffer was also converted from a static array to a pointer. In general, using scsi_allocate_command/scsi_free_command is the recommended way to use struct scsi_cmnd. So my latest patch removes struct scsi_cmnd from adapter_t and uses the API in mega_internal_command().

> Also, if you believe this is a driver issue and need LSI to help,
> can you report this issue to LSI?

Yeah, I think that we need to update this driver because of the changes to the SCSI mid-layer. It would be appreciated if you could test my latest patch:

http://marc.info/?l=linux-scsi&m=122467887502481&w=2

Could you treat this thread as a bug report to LSI?
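The array-to-pointer change described in this message can be reproduced in miniature. The sketch below uses simplified stand-in structs that model only the CDB field; `MAX_COMMAND_SIZE` is assumed to be 16 and `MEGA_INTERNAL_CMD` is assumed to be 0xe1 (the byte the faulting instruction stores in the oops), and neither struct is the real struct scsi_cmnd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMMAND_SIZE 16	/* value assumed for this sketch */
#define MEGA_INTERNAL_CMD 0xe1	/* assumed: the byte stored by the faulting instruction */

/* Simplified stand-ins for struct scsi_cmnd; only the CDB field is modelled. */
struct cmnd_2_6_25 {
	unsigned char cmnd[MAX_COMMAND_SIZE];	/* storage embedded in the struct */
};

struct cmnd_2_6_26 {
	unsigned char *cmnd;			/* caller must provide storage */
};

/* Hypothetical adapter carrying the int_cdb backing array from the first patch. */
struct mock_adapter {
	unsigned char int_cdb[MAX_COMMAND_SIZE];
	struct cmnd_2_6_26 int_scmd;
};

/* Mirrors the patched setup: zero both, then point cmnd at the backing array. */
static void setup_internal_cmd(struct mock_adapter *a)
{
	assert(a != NULL);

	/* On common platforms this leaves int_scmd.cmnd as a NULL pointer. */
	memset(&a->int_scmd, 0, sizeof(a->int_scmd));
	memset(a->int_cdb, 0, sizeof(a->int_cdb));

	a->int_scmd.cmnd = a->int_cdb;			/* the line the fix adds */
	a->int_scmd.cmnd[0] = MEGA_INTERNAL_CMD;	/* now a valid store */
}
```

With the embedded-array layout, the same `cmnd[0] = ...` store after a memset is always valid, which is why the unmodified driver code worked before the field became a pointer.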
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> On Wed, 22 Oct 2008 12:08:27 +0200
> Boaz Harrosh <bharrosh@panasas.com> wrote:
>
>> FUJITA Tomonori wrote:
>>> On Wed, 22 Oct 2008 11:04:44 +0200
>>> Boaz Harrosh <bharrosh@panasas.com> wrote:
>>>
>>>> FUJITA Tomonori wrote:
>>>>> On Tue, 21 Oct 2008 22:22:37 +0200
>>>>> "Pascal Terjan" <pterjan@gmail.com> wrote:
>>>>>
>>>>>> On Tue, Oct 21, 2008 at 9:54 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>>>>>>> On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
>>>>>>>>> Latest working kernel version: 2.6.24
>>>>>>>>> Earliest failing kernel version: 2.6.27-rc8
>>>>>>>> It's a regression.
>>>>>>>>
>>>>>>>>> Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
>>>>>>> It's also a distro kernel by the looks of things. Can it be reproduced
>>>>>>> with an upstream kernel?
>>>>>> I will try booting the server on vanilla kernel but I'm not sure when
>>>>>> (we already rebooted it 2 times recently and users won't enjoy it).
>>>>>>
>>>>>> This is a distro kernel but I don't see patches that could impact this :
>>>>>>
>>>>>> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/current/PATCHES/patches/
>>>>>>
>>>>>> Machine is a old HP NetServer LT 6000
>>>>>>
>>>>>> 04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev
>>>>>> 09) (prog-if 01)
>>>>>> Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
>>>>>> Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 11
>>>>>> Memory at f4000000 (32-bit, prefetchable) [size=64M]
>>>>>> [virtual] Expansion ROM at a8130000 [disabled] [size=32K]
>>>>>> Capabilities: [80] Power Management version 2
>>>>>> Kernel driver in use: megaraid_legacy
>>>>>> Kernel modules: i2o_core, megaraid
>>>>> This patch helps?
>>>>>
>>>>>
>>>>> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
>>>>> index 28c9da7..9294ed8 100644
>>>>> --- a/drivers/scsi/megaraid.c
>>>>> +++ b/drivers/scsi/megaraid.c
>>>>> @@ -4414,12 +4414,14 @@ mega_internal_command(adapter_t *adapter,
>>>>> megacmd_t *mc, mega_passthru *pthru)
>>>>>
>>>>>  	scmd = &adapter->int_scmd;
>>>>>  	memset(scmd, 0, sizeof(Scsi_Cmnd));
>>>>> +	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
>>>>>
>>>>>  	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>>>>>  	scmd->device = sdev;
>>>>>
>>>>>  	scmd->device->host = adapter->host;
>>>>>  	scmd->host_scribble = (void *)scb;
>>>>> +	scmd->cmnd = adapter->int_cdb;
>>>>>  	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
>>>>>
>>>>>  	scb->state |= SCB_ACTIVE;
>>>>> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
>>>>> index ee70bd4..5ffec15 100644
>>>>> --- a/drivers/scsi/megaraid.h
>>>>> +++ b/drivers/scsi/megaraid.h
>>>>> @@ -889,6 +889,7 @@ typedef struct {
>>>>>  	u8	sglen;	/* f/w supported scatter-gather list length */
>>>>>
>>>>>  	scb_t	int_scb;
>>>>> +	unsigned char int_cdb[MAX_COMMAND_SIZE];
>>>>>  	Scsi_Cmnd	int_scmd;
>>>>>  	struct mutex	int_mtx;	/* To synchronize the internal
>>>>> 						commands */
>>>>>
>>>>> --
>>>> Hi TOMO.
>>>>
>>>> This might not be enough for example I don't see the allocation of
>>>> sense_buffer.
>>>> It might be much easer to allocate using the new command allocation API
>>>> James
>>>> did, just for such cases. These are:
>>>> scsi_allocate_command/scsi_free_command
>>> Yeah, it might be. It's fine by me too. But this code path is used
>>> only for issuing internal special commands. It doesn't use the great
>>> portion of scsi_cmnd. For example, these commands don't use sense
>>> buffer, I think. The code path uses scsi_cmnd just for hooking scb_t,
>>> a structure that megaraid allocates per command.
>> OK Thanks.
>> I was not sure because it looks like in mega_cmd_done(), if the status is
>> 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what
>> you say, the HW will never return 0x2 in case of an Internal-Command. I
>> Just wanted to make sure.
>
> I thought that all internal commands are non SCSI command but seems
> that there is one exception (issuing INQUIRY as an internal command).
> I'm not sure I understand correctly the driver but anyway here is a
> version using scsi_allocate_command and scsi_free_command.
>

Thanks TOMO. This is actually my bug, from the days of making
scsi_cmnd->cmnd into a pointer and skipping this driver.

Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>

> I guess that we need to check the kzalloc failure too but it is
> supposed to be fixed by a different patch.
>
>
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 28c9da7..7dc62de 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  	scb_t	*scb;
>  	int	rval;
>
> +	scmd = scsi_allocate_command(GFP_KERNEL);
> +	if (!scmd)
> +		return -ENOMEM;
> +
>  	/*
>  	 * The internal commands share one command id and hence are
>  	 * serialized. This is so because we want to reserve maximum number of
> @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  	scb = &adapter->int_scb;
>  	memset(scb, 0, sizeof(scb_t));
>
> -	scmd = &adapter->int_scmd;
> -	memset(scmd, 0, sizeof(Scsi_Cmnd));
> -
>  	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>  	scmd->device = sdev;
>
> +	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> +	scmd->cmnd = adapter->int_cdb;
>  	scmd->device->host = adapter->host;
>  	scmd->host_scribble = (void *)scb;
>  	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>
>  	mutex_unlock(&adapter->int_mtx);
>
> +	scsi_free_command(GFP_KERNEL, scmd);
> +
>  	return rval;
>  }
>
> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> index ee70bd4..795201f 100644
> --- a/drivers/scsi/megaraid.h
> +++ b/drivers/scsi/megaraid.h
> @@ -888,8 +888,8 @@ typedef struct {
>
>  	u8	sglen;	/* f/w supported scatter-gather list length */
>
> +	unsigned char int_cdb[MAX_COMMAND_SIZE];
>  	scb_t	int_scb;
> -	Scsi_Cmnd	int_scmd;
>  	struct mutex	int_mtx;	/* To synchronize the internal
> 						commands */
>  	struct completion	int_waitq;	/* wait queue for internal

TODO:
One more thing that needs to be done in this driver is a one-time
allocation of a scsi host-device, used as a proper "scmd->device = sdev"
and freed at destruction. Failing to do so leaves open the possibility of
an internal command being completed after the deletion of the host.

Thanks again
Boaz
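The TODO above, together with the remark about checking the kzalloc failure, can be sketched in userspace C roughly as follows. This is an illustration only, not a proposed patch: struct adapter_sketch, the int_sdev field and the three functions are hypothetical names, and calloc()/free() stand in for kzalloc()/kfree(). The idea is that the device object is allocated once when the adapter is set up, the allocation result is checked, every internal command reuses it, and it is released only at teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for Scsi_Host and scsi_device. */
struct host_sketch { int id; };
struct sdev_sketch { struct host_sketch *host; };

/* Hypothetical stand-in for adapter_t; int_sdev is an invented field. */
struct adapter_sketch {
	struct host_sketch  host;
	struct sdev_sketch *int_sdev;  /* lives as long as the adapter */
};

/* One-time allocation at adapter setup, with the failure check. */
static int adapter_setup(struct adapter_sketch *a)
{
	a->int_sdev = calloc(1, sizeof(*a->int_sdev));
	if (!a->int_sdev)
		return -ENOMEM;
	a->int_sdev->host = &a->host;
	return 0;
}

/* Internal commands borrow the long-lived device instead of allocating one. */
static struct sdev_sketch *internal_command_sdev(struct adapter_sketch *a)
{
	return a->int_sdev;
}

/* Freed only when the adapter itself goes away. */
static void adapter_teardown(struct adapter_sketch *a)
{
	free(a->int_sdev);
	a->int_sdev = NULL;
}
```

With this shape there is no per-command allocation that can fail halfway through mega_internal_command(), and the device a late completion might touch stays valid until teardown. Whether the real driver can be restructured this way is something only the driver maintainers can judge.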
Thanks TOMM. If this is the case, it may affect some of other drivers like our MPT and SAS driver. Is there a way kernel can fix it?

Thanks,

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Wednesday, October 22, 2008 9:39 AM
To: Yang, Bo
Cc: fujita.tomonori@lab.ntt.co.jp; bharrosh@panasas.com; pterjan@gmail.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; bugme-daemon@bugzilla.kernel.org; Austria, Winston
Subject: RE: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
Reply-To: bharrosh@panasas.com

Yang, Bo wrote:
> Thanks TOMM. If this is the case, it may affect some of other drivers like
> our
> MPT and SAS driver. Is there a way kernel can fix it?
>
> Thanks,
>
> Bo Yang
>

Hi Bo Yang

What are the source files for the MPT and SAS drivers from LSI?
I have made a system-wide search for problems like this one and could not
find any more. But I might have missed some. If you tell me the file names
I will inspect them more closely.

Thanks
Boaz
Reply-To: bharrosh@panasas.com

Boaz Harrosh wrote:
> Yang, Bo wrote:
>> Thanks TOMM. If this is the case, it may affect some of other drivers like
>> our
>> MPT and SAS driver. Is there a way kernel can fix it?
>>
>> Thanks,
>>
>> Bo Yang
>>
>
> Hi Bo Yang
>
> What are the source files for the MPT and SAS drivers from LSI?
> I have made a system wide search for such problems as below and could not
> find any more. But I might have missed them. If you tell me the file names
> I will inspect more closly.
>
> Thanks
> Boaz

In megaraid_sas.c I do not see any place where megasas_cmd->scmd is set
other than in megasas_queue_command(), which is the scsi .queue_command
vector. Commands that come from scsi-ml are guaranteed to be fully
allocated.

Other than that, I do not see any place that privately allocates a
scsi_cmnd structure.

So I would say megaraid_sas.c should be safe from this bug.

Any other files I should inspect?

Boaz
Thanks Boaz, the name for SAS is megaraid_sas and MPT is fusion.

Regards,

Bo Yang

-----Original Message-----
From: Boaz Harrosh [mailto:bharrosh@panasas.com]
Sent: Wednesday, October 22, 2008 10:20 AM
To: Yang, Bo
Cc: FUJITA Tomonori; pterjan@gmail.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; bugme-daemon@bugzilla.kernel.org; Austria, Winston
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
Reply-To: bharrosh@panasas.com

Yang, Bo wrote:
> Thanks Boaz, the name for SAS is megaraid_sas and MPT is fusion.
>
> Regards,
>
> Bo Yang
>

I have also reviewed megaraid_mbox.c and megaraid_mm.c, and all files in
drivers/message/fusion/* (see the other mail for megaraid_sas.c).

I do not see any problems with these drivers. They do not allocate private
scsi_cmnd structures.

Anything else to review?

Thanks
Boaz
On Wed, Oct 22, 2008 at 2:33 PM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
>
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 28c9da7..7dc62de 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  	scb_t	*scb;
>  	int	rval;
>
> +	scmd = scsi_allocate_command(GFP_KERNEL);
> +	if (!scmd)
> +		return -ENOMEM;
> +
>  	/*
>  	 * The internal commands share one command id and hence are
>  	 * serialized. This is so because we want to reserve maximum number of
> @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  	scb = &adapter->int_scb;
>  	memset(scb, 0, sizeof(scb_t));
>
> -	scmd = &adapter->int_scmd;
> -	memset(scmd, 0, sizeof(Scsi_Cmnd));
> -
>  	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>  	scmd->device = sdev;
>
> +	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> +	scmd->cmnd = adapter->int_cdb;
>  	scmd->device->host = adapter->host;
>  	scmd->host_scribble = (void *)scb;
>  	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>
>  	mutex_unlock(&adapter->int_mtx);
>
> +	scsi_free_command(GFP_KERNEL, scmd);
> +
>  	return rval;
>  }
>
> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> index ee70bd4..795201f 100644
> --- a/drivers/scsi/megaraid.h
> +++ b/drivers/scsi/megaraid.h
> @@ -888,8 +888,8 @@ typedef struct {
>
>  	u8	sglen;	/* f/w supported scatter-gather list length */
>
> +	unsigned char int_cdb[MAX_COMMAND_SIZE];
>  	scb_t	int_scb;
> -	Scsi_Cmnd	int_scmd;
>  	struct mutex	int_mtx;	/* To synchronize the internal
> 						commands */
>  	struct completion	int_waitq;	/* wait queue for internal
>

I confirm that this patch fixes the oops and I can now read the usual info
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Fri, 24 Oct 2008 00:49:07 +0200
"Pascal Terjan" <pterjan@gmail.com> wrote:

> I confirm that this patch fixes the oops and I can now read the usual info

Thanks!

LSI people, can I get the ack on this?

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] megaraid: fix mega_internal_command oops

scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses scsi_cmnd allocated internally so
scmd->cmnd is NULL here. This patch adds a static array for cdb to
adapter_t and uses it here. This also uses
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd since the driver might use sense_buffer in
struct scsi_cmnd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
---
 drivers/scsi/megaraid.c |   11 ++++++++---
 drivers/scsi/megaraid.h |    2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb_t	*scb;
 	int	rval;
 
+	scmd = scsi_allocate_command(GFP_KERNEL);
+	if (!scmd)
+		return -ENOMEM;
+
 	/*
 	 * The internal commands share one command id and hence are
 	 * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb = &adapter->int_scb;
 	memset(scb, 0, sizeof(scb_t));
 
-	scmd = &adapter->int_scmd;
-	memset(scmd, 0, sizeof(Scsi_Cmnd));
-
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+	scmd->cmnd = adapter->int_cdb;
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	mutex_unlock(&adapter->int_mtx);
 
+	scsi_free_command(GFP_KERNEL, scmd);
+
 	return rval;
 }
 
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {
 
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
+	unsigned char int_cdb[MAX_COMMAND_SIZE];
 	scb_t	int_scb;
-	Scsi_Cmnd	int_scmd;
 	struct mutex	int_mtx;	/* To synchronize the internal
 						commands */
 	struct completion	int_waitq;	/* wait queue for internal
Tom,

I will update you as soon as LSI verifies it. Not today, next week will be safe.

Regards,

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Thursday, October 23, 2008 8:21 PM
To: pterjan@gmail.com; James.Bottomley@hansenpartnership.com
Cc: fujita.tomonori@lab.ntt.co.jp; bharrosh@panasas.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; Yang, Bo; bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
The patch works for us.

Regards,

Bo Yang

-----Original Message-----
From: Yang, Bo
Sent: Friday, October 24, 2008 9:31 AM
To: 'FUJITA Tomonori'; pterjan@gmail.com; James.Bottomley@hansenpartnership.com
Cc: bharrosh@panasas.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; bugme-daemon@bugzilla.kernel.org; Austria, Winston
Subject: RE: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

Tom,

I will update you as soon as LSI verifies it. Not today, next week will be safe.

Regards,

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Thursday, October 23, 2008 8:21 PM
To: pterjan@gmail.com; James.Bottomley@hansenpartnership.com
Cc: fujita.tomonori@lab.ntt.co.jp; bharrosh@panasas.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; Yang, Bo; bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

On Fri, 24 Oct 2008 00:49:07 +0200
"Pascal Terjan" <pterjan@gmail.com> wrote:

> On Wed, Oct 22, 2008 at 2:33 PM, FUJITA Tomonori
> <fujita.tomonori@lab.ntt.co.jp> wrote:
> >
> > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> > index 28c9da7..7dc62de 100644
> > --- a/drivers/scsi/megaraid.c
> > +++ b/drivers/scsi/megaraid.c
> > @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
> >  	scb_t	*scb;
> >  	int	rval;
> >  
> > +	scmd = scsi_allocate_command(GFP_KERNEL);
> > +	if (!scmd)
> > +		return -ENOMEM;
> > +
> >  	/*
> >  	 * The internal commands share one command id and hence are
> >  	 * serialized. This is so because we want to reserve maximum number of
> > @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
> >  	scb = &adapter->int_scb;
> >  	memset(scb, 0, sizeof(scb_t));
> >  
> > -	scmd = &adapter->int_scmd;
> > -	memset(scmd, 0, sizeof(Scsi_Cmnd));
> > -
> >  	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
> >  	scmd->device = sdev;
> >  
> > +	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> > +	scmd->cmnd = adapter->int_cdb;
> >  	scmd->device->host = adapter->host;
> >  	scmd->host_scribble = (void *)scb;
> >  	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> > @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
> >  
> >  	mutex_unlock(&adapter->int_mtx);
> >  
> > +	scsi_free_command(GFP_KERNEL, scmd);
> > +
> >  	return rval;
> >  }
> >  
> > diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> > index ee70bd4..795201f 100644
> > --- a/drivers/scsi/megaraid.h
> > +++ b/drivers/scsi/megaraid.h
> > @@ -888,8 +888,8 @@ typedef struct {
> >  
> >  	u8	sglen;	/* f/w supported scatter-gather list length */
> >  
> > +	unsigned char int_cdb[MAX_COMMAND_SIZE];
> >  	scb_t			int_scb;
> > -	Scsi_Cmnd		int_scmd;
> >  	struct mutex		int_mtx;	/* To synchronize the internal
> >  						commands */
> >  	struct completion	int_waitq;	/* wait queue for internal
> >
>
> I confirm that this patch fixes the oops and I can now read the usual info

Thanks!

LSI people, can I get the ack on this?

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] megaraid: fix mega_internal_command oops

scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses scsi_cmnd allocated internally so
scmd->cmnd is NULL here.

This patch adds a static array for cdb to adapter_t and uses it here.

This also uses scsi_allocate_command/scsi_free_command, the
recommended way to allocate struct scsi_cmnd since the driver might
use sense_buffer in struct scsi_cmnd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
---
 drivers/scsi/megaraid.c |   11 ++++++++---
 drivers/scsi/megaraid.h |    2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb_t	*scb;
 	int	rval;
 
+	scmd = scsi_allocate_command(GFP_KERNEL);
+	if (!scmd)
+		return -ENOMEM;
+
 	/*
 	 * The internal commands share one command id and hence are
 	 * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb = &adapter->int_scb;
 	memset(scb, 0, sizeof(scb_t));
 
-	scmd = &adapter->int_scmd;
-	memset(scmd, 0, sizeof(Scsi_Cmnd));
-
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+	scmd->cmnd = adapter->int_cdb;
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	mutex_unlock(&adapter->int_mtx);
 
+	scsi_free_command(GFP_KERNEL, scmd);
+
 	return rval;
 }
 
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {
 
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
+	unsigned char int_cdb[MAX_COMMAND_SIZE];
 	scb_t			int_scb;
-	Scsi_Cmnd		int_scmd;
 	struct mutex		int_mtx;	/* To synchronize the internal
 						commands */
 	struct completion	int_waitq;	/* wait queue for internal
-- 
1.5.5.GIT
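
For anyone following the commit message, the failure it describes can be modeled in user space. The sketch below is not driver code. struct fake_scmd and struct fake_adapter are hypothetical stand-ins for struct scsi_cmnd and adapter_t, the helper names are made up, and the values given for MAX_COMMAND_SIZE and MEGA_INTERNAL_CMD are assumed for illustration. It shows that zeroing a command whose cmnd member is a pointer leaves cmnd NULL (on the usual platforms where all-zero bits is a null pointer), and that attaching a zeroed CDB buffer first, as the patch does with int_cdb, makes the cmnd[0] store safe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMMAND_SIZE  16    /* assumed value, for illustration only */
#define MEGA_INTERNAL_CMD 0xE1  /* assumed value, for illustration only */

/* Hypothetical stand-in for struct scsi_cmnd after 2.6.25:
 * cmnd is a pointer, no longer an embedded array. */
struct fake_scmd {
	unsigned char *cmnd;
	void *host_scribble;
};

/* Hypothetical stand-in for the part of adapter_t the patch touches. */
struct fake_adapter {
	unsigned char int_cdb[MAX_COMMAND_SIZE];
};

/* Models the old code path: zero the whole command structure.
 * With cmnd as a pointer this leaves cmnd == NULL, so a later
 * scmd->cmnd[0] = ... store would fault. */
static void old_style_reset(struct fake_scmd *scmd)
{
	memset(scmd, 0, sizeof(*scmd));
}

/* Models the patched sequence: clear the adapter-owned CDB buffer,
 * point cmnd at it, and only then write the opcode byte. */
static void prepare_internal_cmd(struct fake_adapter *adapter,
				 struct fake_scmd *scmd)
{
	assert(adapter != NULL && scmd != NULL);

	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
	scmd->cmnd = adapter->int_cdb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
}
```

Calling old_style_reset() and then writing scmd.cmnd[0] directly would be a store through a NULL pointer, which is the situation the commit message describes for mega_internal_command(). Calling prepare_internal_cmd() before the store avoids it.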