Bug 11792 - Oops when reading /proc/megaraid/hba0/diskdrives-ch*
Summary: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
Status: CLOSED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_other
Reported: 2008-10-19 16:27 UTC by Pascal Terjan
Modified: 2012-05-22 15:00 UTC

Kernel Version: 2.6.27-rc8
Regression: Yes

Description Pascal Terjan 2008-10-19 16:27:34 UTC
Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.27-rc8
Distribution: Mandriva
Problem Description:

Oops when reading /proc/megaraid/hba0/diskdrives-ch*

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<f88729b4>] :megaraid:mega_internal_command+0x74/0x130
*pdpt = 0000000032c80001 *pde = 0000000000000000 
Oops: 0002 [#1] SMP 
Modules linked in: nfsd auth_rpcgss exportfs ohci1394 ieee1394 nfs lockd nfs_acl sunrpc af_packet binfmt_misc loop ext3 jbd dm_mod rtc_cmos eepro100 shpchp pci_hotplug ipmi_msghandler e100 ide_cd_mod mii i2c_core sg sworks_agp agpgart serverworks ide_core i2o_core megaraid sd_mod scsi_mod crc_t10dif xfs uhci_hcd ohci_hcd ehci_hcd usbcore [last unloaded: scsi_wait_scan]

Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
EIP: 0060:[<f88729b4>] EFLAGS: 00010246 CPU: 0
EIP is at mega_internal_command+0x74/0x130 [megaraid]
EAX: 00000000 EBX: f7942364 ECX: 00000000 EDX: f2d96000
ESI: f7942850 EDI: f7942860 EBP: f24e1df4 ESP: f24e1dc0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process diff (pid: 2319, ti=f24e0000 task=f39f5710 task.ti=f24e0000)
Stack: 3691e000 00000000 f24e1e1c c010e4c8 f24e1e0c 00000000 f24e1dfe f79429b4 
       f79428e8 f2d96000 f7942364 f24e1dfe f24e1e10 f24e1e1c f8872b51 00040000 
       00000000 e0000000 00003691 00000000 f8873f00 f7930c08 c2a56c08 f24e1f08 
Call Trace:
 [<c010e4c8>] ? dma_alloc_coherent+0x108/0x2c0
 [<f8872b51>] ? mega_adapinq+0x41/0x60 [megaraid]
 [<f8873f00>] ? proc_pdrv_ch0+0x0/0x20 [megaraid]
 [<f8873887>] ? proc_pdrv+0xb7/0x6d0 [megaraid]
 [<c01845fe>] ? __alloc_pages_internal+0xae/0x460
 [<c01b6d29>] ? do_filp_open+0x1c9/0x7c0
 [<f8873f00>] ? proc_pdrv_ch0+0x0/0x20 [megaraid]
 [<f8873f18>] ? proc_pdrv_ch0+0x18/0x20 [megaraid]
 [<c01ed6ab>] ? proc_file_read+0x17b/0x260
 [<c01ed530>] ? proc_file_read+0x0/0x260
 [<c01e8c9e>] ? proc_reg_read+0x5e/0x90
 [<c01ab959>] ? vfs_read+0x99/0x160
 [<c01e8c40>] ? proc_reg_read+0x0/0x90
 [<c01abadd>] ? sys_read+0x3d/0x70
 [<c0109e03>] ? sysenter_do_call+0x12/0x2f
 [<c0390000>] ? native_cpu_up+0xf0/0x743
 =======================
Code: 98 4c c0 e8 ff 39 93 c7 8d bb fc 04 00 00 89 45 f0 8b 55 f0 89 83 84 05 00 00 8b 43 48 89 02 8b 83 b8 05 00 00 89 b3 44 06 00 00 <c6> 00 e1 8b 45 ec 83 8b f0 04 00 00 01 8b 75 e4 89 83 48 05 00 
EIP: [<f88729b4>] mega_internal_command+0x74/0x130 [megaraid] SS:ESP 0068:f24e1dc0
---[ end trace 18d8357732584241 ]---
Comment 1 Pascal Terjan 2008-10-19 16:29:25 UTC
Sorry, the last working version is actually 2.6.22.19; this server never ran 2.6.24.
Comment 2 Andrew Morton 2008-10-21 12:00:54 UTC
Reassigned to scsi, marked as a regression.
Comment 3 Anonymous Emailer 2008-10-21 12:47:33 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

For some reason this didn't come out on linux-scsi when I reassigned it
to scsi.

On Sun, 19 Oct 2008 16:27:35 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11792
> 
>            Summary: Oops when reading /proc/megaraid/hba0/diskdrives-ch*
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.27-rc8
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: drivers_other@kernel-bugs.osdl.org
>         ReportedBy: pterjan@gmail.com
> 
> 
> Latest working kernel version: 2.6.24
> Earliest failing kernel version: 2.6.27-rc8

It's a regression.

Comment 4 Matthew Wilcox 2008-10-21 12:54:41 UTC
On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
> > Latest working kernel version: 2.6.24
> > Earliest failing kernel version: 2.6.27-rc8
> 
> It's a regression.
> 
> > Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)

It's also a distro kernel by the looks of things.  Can it be reproduced
with an upstream kernel?
Comment 5 Pascal Terjan 2008-10-21 13:22:41 UTC
On Tue, Oct 21, 2008 at 9:54 PM, Matthew Wilcox <matthew@wil.cx> wrote:
> On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
>> > Latest working kernel version: 2.6.24
>> > Earliest failing kernel version: 2.6.27-rc8
>>
>> It's a regression.
>>
>> > Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
>
> It's also a distro kernel by the looks of things.  Can it be reproduced
> with an upstream kernel?

I will try booting the server on a vanilla kernel, but I'm not sure when
(we have already rebooted it twice recently and users won't enjoy another
reboot).

This is a distro kernel, but I don't see any patches that could affect this:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/current/PATCHES/patches/

The machine is an old HP NetServer LT 6000:

04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 09) (prog-if 01)
	Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
	Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 11
	Memory at f4000000 (32-bit, prefetchable) [size=64M]
	[virtual] Expansion ROM at a8130000 [disabled] [size=32K]
	Capabilities: [80] Power Management version 2
	Kernel driver in use: megaraid_legacy
	Kernel modules: i2o_core, megaraid
Comment 6 Anonymous Emailer 2008-10-21 16:09:03 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Tue, 21 Oct 2008 22:22:37 +0200
"Pascal Terjan" <pterjan@gmail.com> wrote:

> I will try booting the server on vanilla kernel but I'm not sure when
> (we already rebooted it 2 times recently and users won't enjoy it).
> 
> This is a distro kernel but I don't see patches that could impact this :
>
> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/current/PATCHES/patches/
> 
> Machine is a old HP NetServer LT 6000
> 
> 04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev
> 09) (prog-if 01)
>       Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
>       Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 11
>       Memory at f4000000 (32-bit, prefetchable) [size=64M]
>       [virtual] Expansion ROM at a8130000 [disabled] [size=32K]
>       Capabilities: [80] Power Management version 2
>       Kernel driver in use: megaraid_legacy
>       Kernel modules: i2o_core, megaraid

Does this patch help?


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..9294ed8 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4414,12 +4414,14 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	scmd = &adapter->int_scmd;
 	memset(scmd, 0, sizeof(Scsi_Cmnd));
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
 
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
+	scmd->cmnd = adapter->int_cdb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
 
 	scb->state |= SCB_ACTIVE;
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..5ffec15 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -889,6 +889,7 @@ typedef struct {
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
 	scb_t			int_scb;
+	unsigned char		int_cdb[MAX_COMMAND_SIZE];
 	Scsi_Cmnd		int_scmd;
 	struct mutex		int_mtx;	/* To synchronize the internal
 						commands */
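
Why the patch above should help can be shown with a minimal userspace sketch. It assumes, as the added `scmd->cmnd = adapter->int_cdb;` line implies, that `cmnd` is a pointer in the failing kernel, so the preceding memset() leaves it NULL and the store to `cmnd[0]` faults at address 0. That matches the oops, where the faulting bytes `c6 00 e1` are a byte store of 0xe1 through EAX and EAX is 0. The names `fake_scmd`, `fake_adapter` and `prepare_internal_command`, the value of MAX_COMMAND_SIZE, and the explicit NULL check are illustrative assumptions, not driver code.

```c
#include <stddef.h>
#include <string.h>

#define MAX_COMMAND_SIZE  16    /* name taken from the patch; value assumed */
#define MEGA_INTERNAL_CMD 0xe1  /* assumed from the "c6 00 e1" bytes in the oops */

/* Hypothetical stand-in for struct scsi_cmnd: the CDB is a pointer. */
struct fake_scmd {
	unsigned char *cmnd;
};

/* Hypothetical stand-in for adapter_t, with the backing store the patch adds. */
struct fake_adapter {
	unsigned char    int_cdb[MAX_COMMAND_SIZE];
	struct fake_scmd int_scmd;
};

/* Returns the prepared command, or NULL if the CDB pointer was never set. */
struct fake_scmd *prepare_internal_command(struct fake_adapter *adapter,
					   int apply_fix)
{
	struct fake_scmd *scmd = &adapter->int_scmd;

	memset(scmd, 0, sizeof(*scmd));	/* leaves scmd->cmnd == NULL */
	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));

	if (apply_fix)
		scmd->cmnd = adapter->int_cdb;	/* the assignment the patch adds */

	/* The driver has no such check; it would dereference NULL here. */
	if (scmd->cmnd == NULL)
		return NULL;

	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
	return scmd;
}
```

Calling `prepare_internal_command(&adapter, 0)` models the unpatched path and returns NULL where the kernel would oops; with `apply_fix` set, the opcode lands in `adapter.int_cdb[0]`.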
Comment 7 Anonymous Emailer 2008-10-22 02:05:00 UTC
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> This patch helps?
> 
> 
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 28c9da7..9294ed8 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4414,12 +4414,14 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  
>       scmd = &adapter->int_scmd;
>       memset(scmd, 0, sizeof(Scsi_Cmnd));
> +     memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
>  
>       sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>       scmd->device = sdev;
>  
>       scmd->device->host = adapter->host;
>       scmd->host_scribble = (void *)scb;
> +     scmd->cmnd = adapter->int_cdb;
>       scmd->cmnd[0] = MEGA_INTERNAL_CMD;
>  
>       scb->state |= SCB_ACTIVE;
> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> index ee70bd4..5ffec15 100644
> --- a/drivers/scsi/megaraid.h
> +++ b/drivers/scsi/megaraid.h
> @@ -889,6 +889,7 @@ typedef struct {
>       u8      sglen;  /* f/w supported scatter-gather list length */
>  
>       scb_t                   int_scb;
> +     unsigned char           int_cdb[MAX_COMMAND_SIZE];
>       Scsi_Cmnd               int_scmd;
>       struct mutex            int_mtx;        /* To synchronize the internal
>                                               commands */
> 
> --

Hi TOMO.

This might not be enough; for example, I don't see the allocation of sense_buffer.
It might be much easier to allocate the command using the new command allocation
API that James added for just such cases: scsi_allocate_command/scsi_free_command

Thanks
Boaz
Comment 8 Anonymous Emailer 2008-10-22 02:38:47 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 11:04:44 +0200
Boaz Harrosh <bharrosh@panasas.com> wrote:

> Hi TOMO.
> 
> This might not be enough for example I don't see the allocation of
> sense_buffer.
> It might be much easer to allocate using the new command allocation API James
> did, just for such cases. These are: scsi_allocate_command/scsi_free_command

Yeah, it might be. That's fine by me too. But this code path is used
only for issuing special internal commands, and it doesn't use most of
scsi_cmnd. For example, I don't think these commands use the sense
buffer. The code path uses scsi_cmnd only to hook scb_t, a structure
that megaraid allocates per command.
Comment 9 Anonymous Emailer 2008-10-22 03:09:09 UTC
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> Yeah, it might be. It's fine by me too. But this code path is used
> only for issuing internal special commands. It doesn't use the great
> portion of scsi_cmnd. For example, these commands don't use sense
> buffer, I think. The code path uses scsi_cmnd just for hooking scb_t,
> a structure that megaraid allocates per command.

OK, thanks.
I was not sure, because it looks like mega_cmd_done() sets the
sense_buffer if the status is 0x2 (CHECK_CONDITION). But from what you
say, the HW will never return 0x2 for an internal command. I just
wanted to make sure.

Boaz
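
The concern raised here can be sketched in userspace C: if a completion handler copies sense data on CHECK_CONDITION and the command's sense buffer was never allocated, the copy would hit a NULL pointer in the same way as the `cmnd` store. This is a hedged illustration only. `fake_scmd`, `complete_command`, `SENSE_LEN` and the guard are hypothetical and do not reproduce what mega_cmd_done() actually does.

```c
#include <stddef.h>
#include <string.h>

#define SENSE_LEN              14    /* hypothetical size, for illustration */
#define STATUS_CHECK_CONDITION 0x02  /* the 0x2 status mentioned above */

/* Hypothetical stand-in for the fields of struct scsi_cmnd that matter here. */
struct fake_scmd {
	unsigned char *sense_buffer;	/* NULL when nobody allocated one */
	int            result;
};

/* Records the status and returns the number of sense bytes copied. */
size_t complete_command(struct fake_scmd *scmd, int status,
			const unsigned char *hw_sense)
{
	scmd->result = status;

	if (status != STATUS_CHECK_CONDITION)
		return 0;

	/* Without this guard the memcpy() below would write through NULL. */
	if (scmd->sense_buffer == NULL)
		return 0;

	memcpy(scmd->sense_buffer, hw_sense, SENSE_LEN);
	return SENSE_LEN;
}
```

A command with an unallocated sense buffer survives a CHECK_CONDITION completion in this sketch only because of the guard.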
Comment 10 Anonymous Emailer 2008-10-22 05:33:58 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 12:08:27 +0200
Boaz Harrosh <bharrosh@panasas.com> wrote:

> OK Thanks.
> I was not sure because it looks like in mega_cmd_done(), if the status is
> 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what
> you say, the HW will never return 0x2 in case of an Internal-Command. I
> Just wanted to make sure.

I thought that all internal commands were non-SCSI commands, but it
seems there is one exception (INQUIRY is issued as an internal command).
I'm not sure I understand the driver correctly, but anyway, here is a
version using scsi_allocate_command and scsi_free_command.

I guess we need to check for kzalloc failure too, but that should be
fixed by a separate patch.


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb_t	*scb;
 	int	rval;
 
+	scmd = scsi_allocate_command(GFP_KERNEL);
+	if (!scmd)
+		return -ENOMEM;
+
 	/*
 	 * The internal commands share one command id and hence are
 	 * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb = &adapter->int_scb;
 	memset(scb, 0, sizeof(scb_t));
 
-	scmd = &adapter->int_scmd;
-	memset(scmd, 0, sizeof(Scsi_Cmnd));
-
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+	scmd->cmnd = adapter->int_cdb;
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	mutex_unlock(&adapter->int_mtx);
 
+	scsi_free_command(GFP_KERNEL, scmd);
+
 	return rval;
 }
 
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {
 
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
+	unsigned char int_cdb[MAX_COMMAND_SIZE];
 	scb_t			int_scb;
-	Scsi_Cmnd		int_scmd;
 	struct mutex		int_mtx;	/* To synchronize the internal
 						commands */
 	struct completion	int_waitq;	/* wait queue for internal
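
The allocate/free pairing in this version, together with the kzalloc check that the message defers to a separate patch, could be sketched in userspace as follows. calloc() and free() stand in for scsi_allocate_command(), kzalloc() and scsi_free_command(). `run_internal_command`, `fake_scmd`, `fake_sdev` and the `fail_sdev` switch are hypothetical names used only to show that the command is released on every exit path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct scsi_device and struct scsi_cmnd. */
struct fake_sdev { int host_id; };
struct fake_scmd { struct fake_sdev *device; };

/* fail_sdev != 0 simulates the second allocation failing. */
int run_internal_command(int fail_sdev)
{
	struct fake_scmd *scmd;
	struct fake_sdev *sdev;
	int rval = 0;

	scmd = calloc(1, sizeof(*scmd));	/* stands in for scsi_allocate_command() */
	if (!scmd)
		return -ENOMEM;

	/* stands in for kzalloc(sizeof(struct scsi_device), GFP_KERNEL) */
	sdev = fail_sdev ? NULL : calloc(1, sizeof(*sdev));
	if (!sdev) {
		rval = -ENOMEM;
		goto out_free_cmd;	/* never touch scmd->device->host via NULL */
	}
	scmd->device = sdev;

	/* ... issue the command and wait for its completion here ... */

	free(sdev);
out_free_cmd:
	free(scmd);			/* stands in for scsi_free_command() */
	return rval;
}
```

The single `out_free_cmd` label keeps the release of the command in one place, so an added failure path cannot leak it.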
Comment 11 bo yang 2008-10-22 06:03:21 UTC
I see that the latest working kernel is 2.6.24 and the first failing kernel is 2.6.27-rc8. I understand there are a lot of changes between those two kernels. Can you look at the changes between them to find the root cause? Also, if you believe this is a driver issue and need LSI's help, can you report it to LSI?

Thanks.

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Wednesday, October 22, 2008 8:34 AM
To: bharrosh@panasas.com
Cc: fujita.tomonori@lab.ntt.co.jp; pterjan@gmail.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; Yang, Bo; bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

On Wed, 22 Oct 2008 12:08:27 +0200
Boaz Harrosh <bharrosh@panasas.com> wrote:

> OK Thanks.
> I was not sure because it looks like in mega_cmd_done(), if the status is
> 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what
> you say, the HW will never return 0x2 in case of an Internal-Command. I
> Just wanted to make sure.

I thought that all internal commands were non-SCSI commands, but it
seems that there is one exception (INQUIRY is issued as an internal
command). I'm not sure I understand the driver correctly, but anyway
here is a version using scsi_allocate_command and scsi_free_command.

I guess that we need to check for kzalloc failure too, but that is
supposed to be fixed by a different patch.
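
For reference, the missing kzalloc check could look roughly like the
userspace sketch below. It only models the pattern: fake_sdev,
alloc_fn, failing_alloc and internal_command are hypothetical names
and calloc/free stand in for kzalloc/kfree. It is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scsi_device. */
struct fake_sdev {
	void *host;
};

/* Allocator hook so that a failure can be simulated in userspace. */
typedef void *(*alloc_fn)(size_t n, size_t size);

/* Always fails, modelling kzalloc() returning NULL. */
static void *failing_alloc(size_t n, size_t size)
{
	(void)n;
	(void)size;
	return NULL;
}

/* Returns 0 on success or -ENOMEM if the device cannot be allocated. */
static int internal_command(alloc_fn alloc, void *host)
{
	struct fake_sdev *sdev;

	assert(alloc != NULL);

	sdev = alloc(1, sizeof(*sdev));
	if (!sdev)
		return -ENOMEM;	/* bail out before sdev is dereferenced */

	sdev->host = host;
	/* ... the command would be issued and waited for here ... */

	free(sdev);
	return 0;
}
```

Passing calloc as the hook exercises the normal path, and passing
failing_alloc exercises the error path.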


diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
        scb_t   *scb;
        int     rval;

+       scmd = scsi_allocate_command(GFP_KERNEL);
+       if (!scmd)
+               return -ENOMEM;
+
        /*
         * The internal commands share one command id and hence are
         * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
        scb = &adapter->int_scb;
        memset(scb, 0, sizeof(scb_t));

-       scmd = &adapter->int_scmd;
-       memset(scmd, 0, sizeof(Scsi_Cmnd));
-
        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
        scmd->device = sdev;

+       memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+       scmd->cmnd = adapter->int_cdb;
        scmd->device->host = adapter->host;
        scmd->host_scribble = (void *)scb;
        scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)

        mutex_unlock(&adapter->int_mtx);

+       scsi_free_command(GFP_KERNEL, scmd);
+
        return rval;
 }

diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {

        u8      sglen;  /* f/w supported scatter-gather list length */

+       unsigned char int_cdb[MAX_COMMAND_SIZE];
        scb_t                   int_scb;
-       Scsi_Cmnd               int_scmd;
        struct mutex            int_mtx;        /* To synchronize the internal
                                                commands */
        struct completion       int_waitq;      /* wait queue for internal
Comment 12 Anonymous Emailer 2008-10-22 06:39:25 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 22 Oct 2008 07:03:03 -0600
"Yang, Bo" <Bo.Yang@lsi.com> wrote:

> I saw the latest working kernel: 2.6.24 and first failing kernel
> version: 2.6.27-rc8.  I understand there are lots of changes between
> those two kernels.  Can you take a look the changes from kernels to
> find out the root cause?

Sorry, I didn't explain the possible root cause.

struct scsi_cmnd in 2.6.25:

unsigned char cmnd[MAX_COMMAND_SIZE];


struct scsi_cmnd in 2.6.26:

unsigned char *cmnd;


In short, struct scsi_cmnd no longer has a static array for the cdb.
You need to allocate memory for it (the scsi midlayer does this for the
common case).

So

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

I suspect that the driver crashes here. My patch adds an array to
adapter_t and uses it here.

After 2.6.25, sense_buffer was also converted from a static array to a
pointer. In general, using scsi_allocate_command/scsi_free_command is
the recommended way to use struct scsi_cmnd.

So my latest patch removes struct scsi_cmnd from adapter_t and uses the
API in mega_internal_command().
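
To illustrate the failure mode, here is a small userspace sketch. All
names (fake_scsi_cmnd, fake_adapter, MAX_CDB, FAKE_INTERNAL_CMD,
set_opcode, attach_cdb) are hypothetical stand-ins, not the real kernel
definitions. It only shows why a zeroed command with a pointer cdb
cannot be written to until it is pointed at backing storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CDB 16              /* stand-in for MAX_COMMAND_SIZE (assumed value) */
#define FAKE_INTERNAL_CMD 0x80  /* hypothetical opcode, illustration only */

/* Newer layout: the cdb is a pointer, not an embedded array. */
struct fake_scsi_cmnd {
	unsigned char *cmnd;
};

/* The adapter owns the backing storage, as int_cdb does in the patch. */
struct fake_adapter {
	unsigned char int_cdb[MAX_CDB];
};

/* Returns 0 on success, -1 if the command has no cdb storage attached. */
static int set_opcode(struct fake_scsi_cmnd *scmd, unsigned char op)
{
	assert(scmd != NULL);
	if (scmd->cmnd == NULL)
		return -1;	/* the unpatched driver wrote through NULL here */
	scmd->cmnd[0] = op;
	return 0;
}

/* Clears the adapter's cdb array and points the command at it. */
static void attach_cdb(struct fake_adapter *adapter,
		       struct fake_scsi_cmnd *scmd)
{
	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
	scmd->cmnd = adapter->int_cdb;
}
```

With the old embedded array, the write to cmnd[0] always had storage
behind it; with the pointer layout, attach_cdb() (or an allocator that
sets the pointer) has to run first.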


> Also if you believe this is the driver issue and need LSI to help,
> can you report this issue to LSI?

Yeah, I think that we need to update this driver because of the
changes to SCSI mid-layer.

It would be appreciated if you could test my latest patch:

http://marc.info/?l=linux-scsi&m=122467887502481&w=2


Can you treat this thread as a bug report to LSI?
Comment 13 Anonymous Emailer 2008-10-22 06:52:40 UTC
Reply-To: bharrosh@panasas.com

FUJITA Tomonori wrote:
> On Wed, 22 Oct 2008 12:08:27 +0200
> Boaz Harrosh <bharrosh@panasas.com> wrote:
> 
>> FUJITA Tomonori wrote:
>>> On Wed, 22 Oct 2008 11:04:44 +0200
>>> Boaz Harrosh <bharrosh@panasas.com> wrote:
>>>
>>>> FUJITA Tomonori wrote:
>>>>> On Tue, 21 Oct 2008 22:22:37 +0200
>>>>> "Pascal Terjan" <pterjan@gmail.com> wrote:
>>>>>
>>>>>> On Tue, Oct 21, 2008 at 9:54 PM, Matthew Wilcox <matthew@wil.cx> wrote:
>>>>>>> On Tue, Oct 21, 2008 at 12:47:01PM -0700, Andrew Morton wrote:
>>>>>>>>> Latest working kernel version: 2.6.24
>>>>>>>>> Earliest failing kernel version: 2.6.27-rc8
>>>>>>>> It's a regression.
>>>>>>>>
>>>>>>>>> Pid: 2319, comm: diff Not tainted (2.6.27-server-0.rc8.2mnb #1)
>>>>>>> It's also a distro kernel by the looks of things.  Can it be reproduced
>>>>>>> with an upstream kernel?
>>>>>> I will try booting the server on vanilla kernel but I'm not sure when
>>>>>> (we already rebooted it 2 times recently and users won't enjoy it).
>>>>>>
>>>>>> This is a distro kernel but I don't see patches that could impact this :
>>>>>>
>>>>>> http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/kernel/current/PATCHES/patches/
>>>>>>
>>>>>> Machine is a old HP NetServer LT 6000
>>>>>>
>>>>>> 04:03.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev
>>>>>> 09) (prog-if 01)
>>>>>>  Subsystem: Hewlett-Packard Company MegaRAID, Integrated NetRAID
>>>>>>  Flags: bus master, fast Back2Back, medium devsel, latency 64, IRQ 11
>>>>>>  Memory at f4000000 (32-bit, prefetchable) [size=64M]
>>>>>>  [virtual] Expansion ROM at a8130000 [disabled] [size=32K]
>>>>>>  Capabilities: [80] Power Management version 2
>>>>>>  Kernel driver in use: megaraid_legacy
>>>>>>  Kernel modules: i2o_core, megaraid
>>>>> This patch helps?
>>>>>
>>>>>
>>>>> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
>>>>> index 28c9da7..9294ed8 100644
>>>>> --- a/drivers/scsi/megaraid.c
>>>>> +++ b/drivers/scsi/megaraid.c
>>>>> @@ -4414,12 +4414,14 @@ mega_internal_command(adapter_t *adapter,
>>>>> megacmd_t *mc, mega_passthru *pthru)
>>>>>  
>>>>>   scmd = &adapter->int_scmd;
>>>>>   memset(scmd, 0, sizeof(Scsi_Cmnd));
>>>>> + memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
>>>>>  
>>>>>   sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>>>>>   scmd->device = sdev;
>>>>>  
>>>>>   scmd->device->host = adapter->host;
>>>>>   scmd->host_scribble = (void *)scb;
>>>>> + scmd->cmnd = adapter->int_cdb;
>>>>>   scmd->cmnd[0] = MEGA_INTERNAL_CMD;
>>>>>  
>>>>>   scb->state |= SCB_ACTIVE;
>>>>> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
>>>>> index ee70bd4..5ffec15 100644
>>>>> --- a/drivers/scsi/megaraid.h
>>>>> +++ b/drivers/scsi/megaraid.h
>>>>> @@ -889,6 +889,7 @@ typedef struct {
>>>>>   u8      sglen;  /* f/w supported scatter-gather list length */
>>>>>  
>>>>>   scb_t                   int_scb;
>>>>> + unsigned char           int_cdb[MAX_COMMAND_SIZE];
>>>>>   Scsi_Cmnd               int_scmd;
>>>>>   struct mutex            int_mtx;        /* To synchronize the internal
>>>>>                                           commands */
>>>>>
>>>>> --
>>>> Hi TOMO.
>>>>
>>>> This might not be enough for example I don't see the allocation of
>>>> sense_buffer.
>>>> It might be much easer to allocate using the new command allocation API
>>>> James
>>>> did, just for such cases. These are:
>>>> scsi_allocate_command/scsi_free_command
>>> Yeah, it might be. It's fine by me too. But this code path is used
>>> only for issuing internal special commands. It doesn't use the great
>>> portion of scsi_cmnd. For example, these commands don't use sense
>>> buffer, I think. The code path uses scsi_cmnd just for hooking scb_t,
>>> a structure that megaraid allocates per command.
>> OK Thanks.
>> I was not sure because it looks like in mega_cmd_done(), if the status is
>> 0x2 (CHECK_CONDITION) then it would set the sense_buffer. But from what
>> you say, the HW will never return 0x2 in case of an Internal-Command. I
>> Just wanted to make sure.
> 
> I thought that all internal commands are non SCSI command but seems
> that there is one exception (issuing INQUIRY as an internal command).
> I'm not sure I understand correctly the driver but anyway here is a
> version using scsi_allocate_command and scsi_free_command.
> 

Thanks TOMO.
This is actually my bug, from when I made scsi_cmnd->cmnd into a pointer
and skipped this driver.

Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>

> I guess that we need to check the kzalloc failure too but it is
> supposed to be fixed by a different patch.
> 
> 
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 28c9da7..7dc62de 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>       scb_t   *scb;
>       int     rval;
>  
> +     scmd = scsi_allocate_command(GFP_KERNEL);
> +     if (!scmd)
> +             return -ENOMEM;
> +
>       /*
>        * The internal commands share one command id and hence are
>        * serialized. This is so because we want to reserve maximum number of
> @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>       scb = &adapter->int_scb;
>       memset(scb, 0, sizeof(scb_t));
>  
> -     scmd = &adapter->int_scmd;
> -     memset(scmd, 0, sizeof(Scsi_Cmnd));
> -
>       sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>       scmd->device = sdev;
>  
> +     memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> +     scmd->cmnd = adapter->int_cdb;
>       scmd->device->host = adapter->host;
>       scmd->host_scribble = (void *)scb;
>       scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>  
>       mutex_unlock(&adapter->int_mtx);
>  
> +     scsi_free_command(GFP_KERNEL, scmd);
> +
>       return rval;
>  }
>  
> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> index ee70bd4..795201f 100644
> --- a/drivers/scsi/megaraid.h
> +++ b/drivers/scsi/megaraid.h
> @@ -888,8 +888,8 @@ typedef struct {
>  
>       u8      sglen;  /* f/w supported scatter-gather list length */
>  
> +     unsigned char int_cdb[MAX_COMMAND_SIZE];
>       scb_t                   int_scb;
> -     Scsi_Cmnd               int_scmd;
>       struct mutex            int_mtx;        /* To synchronize the internal
>                                               commands */
>       struct completion       int_waitq;      /* wait queue for internal

TODO: One more thing that needs to be done in this driver is a one-time
allocation of a scsi host-device, used as a proper
"scmd->device = sdev" and freed at destruction.
Failing to do so leaves open the possibility of an internal command being
completed after the deletion of the host.
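
A rough userspace sketch of that lifecycle, with hypothetical names
throughout (fake_host, fake_sdev, fake_adapter and its int_sdev field,
adapter_init, adapter_internal_sdev, adapter_destroy) and calloc/free
standing in for the kernel allocators:

```c
#include <assert.h>
#include <stdlib.h>

struct fake_host {
	int id;
};

struct fake_sdev {
	struct fake_host *host;
};

struct fake_adapter {
	struct fake_host *host;
	struct fake_sdev *int_sdev;	/* hypothetical: allocated once per adapter */
};

/* Allocate the internal device once, when the adapter is set up. */
static int adapter_init(struct fake_adapter *a, struct fake_host *host)
{
	a->host = host;
	a->int_sdev = calloc(1, sizeof(*a->int_sdev));
	if (!a->int_sdev)
		return -1;
	a->int_sdev->host = host;
	return 0;
}

/* Every internal command reuses the same device; no per-call allocation. */
static struct fake_sdev *adapter_internal_sdev(struct fake_adapter *a)
{
	assert(a->int_sdev != NULL);
	return a->int_sdev;
}

/* Free the device only when the adapter itself is torn down. */
static void adapter_destroy(struct fake_adapter *a)
{
	free(a->int_sdev);
	a->int_sdev = NULL;
	a->host = NULL;
}
```

The point of the design is that the device's lifetime is tied to the
adapter, not to a single call of mega_internal_command().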

Thanks again
Boaz
Comment 14 bo yang 2008-10-22 07:00:09 UTC
Thanks Tomo.  If this is the case, it may affect some other drivers, such as our MPT and SAS drivers.  Is there a way the kernel can fix it?

Thanks,

Bo Yang

Comment 15 Anonymous Emailer 2008-10-22 07:20:47 UTC
Reply-To: bharrosh@panasas.com

Yang, Bo wrote:
> Thanks TOMM.  If this is the case, it may affect some of other drivers like
> our MPT and SAS driver.  Is there a way kernel can fix it?
> 
> Thanks,
> 
> Bo Yang
> 

Hi Bo Yang

What are the source files for the MPT and SAS drivers from LSI?
I did a tree-wide search for problems like this one and could not
find any more, but I might have missed some. If you tell me the file names
I will inspect them more closely.

Thanks
Boaz

Comment 16 Anonymous Emailer 2008-10-22 07:58:28 UTC
Reply-To: bharrosh@panasas.com

Boaz Harrosh wrote:
> Yang, Bo wrote:
>> Thanks TOMM.  If this is the case, it may affect some of other drivers like
>> our MPT and SAS driver.  Is there a way kernel can fix it?
>>
>> Thanks,
>>
>> Bo Yang
>>
> 
> Hi Bo Yang
> 
> What are the source files for the MPT and SAS drivers from LSI?
> I have made a system wide search for such problems as below and could not
> find any more. But I might have missed them. If you tell me the file names
> I will inspect more closly.
> 
> Thanks
> Boaz

In megaraid_sas.c I do not see any place where megasas_cmd->scmd is set
other than in megasas_queue_command(), which is the scsi .queue_command
vector. Commands that come from scsi-ml are guaranteed to be fully allocated.
Other than that, I do not see any place that privately allocates a scsi_cmnd
structure. So I would say megaraid_sas.c should be safe from this bug.

Any other files I should inspect?

Boaz
 
Comment 17 bo yang 2008-10-22 08:58:19 UTC
Thanks Boaz. The SAS driver is megaraid_sas and the MPT driver is fusion.

Regards,

Bo Yang

Comment 18 Anonymous Emailer 2008-10-22 10:31:36 UTC
Reply-To: bharrosh@panasas.com

Yang, Bo wrote:
> Thanks Boaz, the name for SAS is megaraid_sas and MPT is fusion.
> 
> Regards,
> 
> Bo Yang
> 

I have also reviewed megaraid_mbox.c, megaraid_mm.c, and all files
in drivers/message/fusion/* (see my other mail for megaraid_sas.c).
I do not see any problems with these drivers. They do not allocate
private scsi_cmnd structures.

Anything else to review?

Thanks
Boaz
Comment 19 Pascal Terjan 2008-10-23 15:49:39 UTC
On Wed, Oct 22, 2008 at 2:33 PM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
>
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 28c9da7..7dc62de 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>        scb_t   *scb;
>        int     rval;
>
> +       scmd = scsi_allocate_command(GFP_KERNEL);
> +       if (!scmd)
> +               return -ENOMEM;
> +
>        /*
>         * The internal commands share one command id and hence are
>         * serialized. This is so because we want to reserve maximum number of
> @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>        scb = &adapter->int_scb;
>        memset(scb, 0, sizeof(scb_t));
>
> -       scmd = &adapter->int_scmd;
> -       memset(scmd, 0, sizeof(Scsi_Cmnd));
> -
>        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
>        scmd->device = sdev;
>
> +       memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> +       scmd->cmnd = adapter->int_cdb;
>        scmd->device->host = adapter->host;
>        scmd->host_scribble = (void *)scb;
>        scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
>
>        mutex_unlock(&adapter->int_mtx);
>
> +       scsi_free_command(GFP_KERNEL, scmd);
> +
>        return rval;
>  }
>
> diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> index ee70bd4..795201f 100644
> --- a/drivers/scsi/megaraid.h
> +++ b/drivers/scsi/megaraid.h
> @@ -888,8 +888,8 @@ typedef struct {
>
>        u8      sglen;  /* f/w supported scatter-gather list length */
>
> +       unsigned char int_cdb[MAX_COMMAND_SIZE];
>        scb_t                   int_scb;
> -       Scsi_Cmnd               int_scmd;
>        struct mutex            int_mtx;        /* To synchronize the internal
>                                                commands */
>        struct completion       int_waitq;      /* wait queue for internal
>

I confirm that this patch fixes the oops, and I can now read the usual info.
Comment 20 Anonymous Emailer 2008-10-23 17:21:42 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Fri, 24 Oct 2008 00:49:07 +0200
"Pascal Terjan" <pterjan@gmail.com> wrote:

> On Wed, Oct 22, 2008 at 2:33 PM, FUJITA Tomonori
> <fujita.tomonori@lab.ntt.co.jp> wrote:
> >
> > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> > index 28c9da7..7dc62de 100644
> > --- a/drivers/scsi/megaraid.c
> > +++ b/drivers/scsi/megaraid.c
> > @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
> >        scb_t   *scb;
> >        int     rval;
> >
> > +       scmd = scsi_allocate_command(GFP_KERNEL);
> > +       if (!scmd)
> > +               return -ENOMEM;
> > +
> >        /*
> >         * The internal commands share one command id and hence are
> >         * serialized. This is so because we want to reserve maximum number
> of
> > @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
> >        scb = &adapter->int_scb;
> >        memset(scb, 0, sizeof(scb_t));
> >
> > -       scmd = &adapter->int_scmd;
> > -       memset(scmd, 0, sizeof(Scsi_Cmnd));
> > -
> >        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
> >        scmd->device = sdev;
> >
> > +       memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> > +       scmd->cmnd = adapter->int_cdb;
> >        scmd->device->host = adapter->host;
> >        scmd->host_scribble = (void *)scb;
> >        scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> > @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
> >
> >        mutex_unlock(&adapter->int_mtx);
> >
> > +       scsi_free_command(GFP_KERNEL, scmd);
> > +
> >        return rval;
> >  }
> >
> > diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> > index ee70bd4..795201f 100644
> > --- a/drivers/scsi/megaraid.h
> > +++ b/drivers/scsi/megaraid.h
> > @@ -888,8 +888,8 @@ typedef struct {
> >
> >        u8      sglen;  /* f/w supported scatter-gather list length */
> >
> > +       unsigned char int_cdb[MAX_COMMAND_SIZE];
> >        scb_t                   int_scb;
> > -       Scsi_Cmnd               int_scmd;
> >        struct mutex            int_mtx;        /* To synchronize the
> internal
> >                                                commands */
> >        struct completion       int_waitq;      /* wait queue for internal
> >
> 
> I confirm that this patch fixes the oops and I can now read the usual info

Thanks!

LSI people, can I get the ack on this?


=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] megaraid: fix mega_internal_command oops

scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
	scb = &adapter->int_scb;
	memset(scb, 0, sizeof(scb_t));

	scmd = &adapter->int_scmd;
	memset(scmd, 0, sizeof(Scsi_Cmnd));

	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
	scmd->device = sdev;

	scmd->device->host = adapter->host;
	scmd->host_scribble = (void *)scb;
	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses a scsi_cmnd allocated internally, so
scmd->cmnd is NULL here. This patch adds a static array for the cdb to
adapter_t and uses it here. This also uses
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd, since the driver might use sense_buffer in
struct scsi_cmnd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
---
 drivers/scsi/megaraid.c |   11 ++++++++---
 drivers/scsi/megaraid.h |    2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb_t	*scb;
 	int	rval;
 
+	scmd = scsi_allocate_command(GFP_KERNEL);
+	if (!scmd)
+		return -ENOMEM;
+
 	/*
 	 * The internal commands share one command id and hence are
 	 * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 	scb = &adapter->int_scb;
 	memset(scb, 0, sizeof(scb_t));
 
-	scmd = &adapter->int_scmd;
-	memset(scmd, 0, sizeof(Scsi_Cmnd));
-
 	sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
 	scmd->device = sdev;
 
+	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+	scmd->cmnd = adapter->int_cdb;
 	scmd->device->host = adapter->host;
 	scmd->host_scribble = (void *)scb;
 	scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
 
 	mutex_unlock(&adapter->int_mtx);
 
+	scsi_free_command(GFP_KERNEL, scmd);
+
 	return rval;
 }
 
diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {
 
 	u8	sglen;	/* f/w supported scatter-gather list length */
 
+	unsigned char int_cdb[MAX_COMMAND_SIZE];
 	scb_t			int_scb;
-	Scsi_Cmnd		int_scmd;
 	struct mutex		int_mtx;	/* To synchronize the internal
 						commands */
 	struct completion	int_waitq;	/* wait queue for internal
Comment 21 bo yang 2008-10-24 06:32:17 UTC
Tom,

I will update you as soon as LSI verifies it.  It will not be today; next week is a safe estimate.

Regards,

Bo Yang

-----Original Message-----
From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
Sent: Thursday, October 23, 2008 8:21 PM
To: pterjan@gmail.com; James.Bottomley@hansenpartnership.com
Cc: fujita.tomonori@lab.ntt.co.jp; bharrosh@panasas.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; Yang, Bo; bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

On Fri, 24 Oct 2008 00:49:07 +0200
"Pascal Terjan" <pterjan@gmail.com> wrote:

> On Wed, Oct 22, 2008 at 2:33 PM, FUJITA Tomonori
> <fujita.tomonori@lab.ntt.co.jp> wrote:
> >
> > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> > index 28c9da7..7dc62de 100644
> > --- a/drivers/scsi/megaraid.c
> > +++ b/drivers/scsi/megaraid.c
> > @@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
> >        scb_t   *scb;
> >        int     rval;
> >
> > +       scmd = scsi_allocate_command(GFP_KERNEL);
> > +       if (!scmd)
> > +               return -ENOMEM;
> > +
> >        /*
> >         * The internal commands share one command id and hence are
> >         * serialized. This is so because we want to reserve maximum number
> of
> > @@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t
> *mc, mega_passthru *pthru)
> >        scb = &adapter->int_scb;
> >        memset(scb, 0, sizeof(scb_t));
> >
> > -       scmd = &adapter->int_scmd;
> > -       memset(scmd, 0, sizeof(Scsi_Cmnd));
> > -
> >        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
> >        scmd->device = sdev;
> >
> > +       memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
> > +       scmd->cmnd = adapter->int_cdb;
> >        scmd->device->host = adapter->host;
> >        scmd->host_scribble = (void *)scb;
> >        scmd->cmnd[0] = MEGA_INTERNAL_CMD;
> > @@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
> >
> >        mutex_unlock(&adapter->int_mtx);
> >
> > +       scsi_free_command(GFP_KERNEL, scmd);
> > +
> >        return rval;
> >  }
> >
> > diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
> > index ee70bd4..795201f 100644
> > --- a/drivers/scsi/megaraid.h
> > +++ b/drivers/scsi/megaraid.h
> > @@ -888,8 +888,8 @@ typedef struct {
> >
> >        u8      sglen;  /* f/w supported scatter-gather list length */
> >
> > +       unsigned char int_cdb[MAX_COMMAND_SIZE];
> >        scb_t                   int_scb;
> > -       Scsi_Cmnd               int_scmd;
> >        struct mutex            int_mtx;        /* To synchronize the internal
> >                                                commands */
> >        struct completion       int_waitq;      /* wait queue for internal
> >
>
> I confirm that this patch fixes the oops and I can now read the usual info

Thanks!

LSI people, can I get the ack on this?


=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] megaraid: fix mega_internal_command oops

scsi_cmnd->cmnd was changed from a static array to a pointer after
2.6.25. This breaks mega_internal_command():

static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
        scb = &adapter->int_scb;
        memset(scb, 0, sizeof(scb_t));

        scmd = &adapter->int_scmd;
        memset(scmd, 0, sizeof(Scsi_Cmnd));

        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
        scmd->device = sdev;

        scmd->device->host = adapter->host;
        scmd->host_scribble = (void *)scb;
        scmd->cmnd[0] = MEGA_INTERNAL_CMD;

mega_internal_command() uses a scsi_cmnd that the driver allocates
itself, so nothing sets scmd->cmnd and it is still NULL when cmnd[0]
is written. This patch adds a fixed-size array for the CDB (int_cdb)
to adapter_t and points scmd->cmnd at it. It also switches to
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd, since the driver might use sense_buffer in
struct scsi_cmnd.
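
The failure mode described above can be illustrated with a minimal
user-space sketch in C. It is not driver code: struct model_cmd,
struct model_adapter, model_prepare(), MODEL_CDB_SIZE and
MODEL_INTERNAL_CMD are hypothetical stand-ins for scsi_cmnd, adapter_t,
the relevant lines of mega_internal_command(), MAX_COMMAND_SIZE and
MEGA_INTERNAL_CMD, and the opcode value is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_CDB_SIZE     16    /* stand-in for MAX_COMMAND_SIZE (assumed size) */
#define MODEL_INTERNAL_CMD 0x80  /* placeholder opcode, not taken from the driver */

/* Model of the command after the layout change: cmnd is a pointer, not an array. */
struct model_cmd {
	unsigned char *cmnd;
};

/* Model of the adapter state that owns the CDB storage. */
struct model_adapter {
	unsigned char int_cdb[MODEL_CDB_SIZE];
};

/*
 * Zero the command, then point it at adapter-owned storage before the
 * opcode is written. Without the pointer assignment, cmd->cmnd would
 * still be NULL after the memset and the store to cmnd[0] would fault.
 */
static void model_prepare(struct model_adapter *adapter, struct model_cmd *cmd)
{
	assert(adapter != NULL && cmd != NULL);

	memset(cmd, 0, sizeof(*cmd));  /* leaves cmd->cmnd as a null pointer */

	memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
	cmd->cmnd = adapter->int_cdb;  /* the step the old code was missing */
	cmd->cmnd[0] = MODEL_INTERNAL_CMD;
}
```

With the old array layout the memset alone was enough, because cmnd[0]
lived inside the structure; with the pointer layout the storage has to
be supplied separately.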

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
---
 drivers/scsi/megaraid.c |   11 ++++++++---
 drivers/scsi/megaraid.h |    2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 28c9da7..7dc62de 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4402,6 +4402,10 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
        scb_t   *scb;
        int     rval;

+       scmd = scsi_allocate_command(GFP_KERNEL);
+       if (!scmd)
+               return -ENOMEM;
+
        /*
         * The internal commands share one command id and hence are
         * serialized. This is so because we want to reserve maximum number of
@@ -4412,12 +4416,11 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
        scb = &adapter->int_scb;
        memset(scb, 0, sizeof(scb_t));

-       scmd = &adapter->int_scmd;
-       memset(scmd, 0, sizeof(Scsi_Cmnd));
-
        sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
        scmd->device = sdev;

+       memset(adapter->int_cdb, 0, sizeof(adapter->int_cdb));
+       scmd->cmnd = adapter->int_cdb;
        scmd->device->host = adapter->host;
        scmd->host_scribble = (void *)scb;
        scmd->cmnd[0] = MEGA_INTERNAL_CMD;
@@ -4456,6 +4459,8 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)

        mutex_unlock(&adapter->int_mtx);

+       scsi_free_command(GFP_KERNEL, scmd);
+
        return rval;
 }

diff --git a/drivers/scsi/megaraid.h b/drivers/scsi/megaraid.h
index ee70bd4..795201f 100644
--- a/drivers/scsi/megaraid.h
+++ b/drivers/scsi/megaraid.h
@@ -888,8 +888,8 @@ typedef struct {

        u8      sglen;  /* f/w supported scatter-gather list length */

+       unsigned char int_cdb[MAX_COMMAND_SIZE];
        scb_t                   int_scb;
-       Scsi_Cmnd               int_scmd;
        struct mutex            int_mtx;        /* To synchronize the internal
                                                commands */
        struct completion       int_waitq;      /* wait queue for internal
--
1.5.5.GIT
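
The allocate/free pairing that the patch introduces can be sketched in
user space as follows. This is an illustration under stated
assumptions, not the driver's code: calloc() and free() stand in for
scsi_allocate_command(), kzalloc() and scsi_free_command(), and the
names model_cmd, model_dev, model_internal_command() and the fail_dev
test hook are hypothetical. Unlike the hunks above, the sketch also
checks the second allocation; that check is an addition made for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_dev { int host_no; };
struct model_cmd { struct model_dev *device; };

/*
 * Allocate a temporary command and device, use them, and release both
 * on every exit path. Returns 0 on success or -ENOMEM if an allocation
 * fails. fail_dev simulates a failure of the second allocation.
 */
static int model_internal_command(int host_no, int fail_dev, int *seen_host)
{
	struct model_cmd *cmd;
	struct model_dev *dev;
	int rval = 0;

	assert(seen_host != NULL);

	cmd = calloc(1, sizeof(*cmd));
	if (!cmd)
		return -ENOMEM;

	dev = fail_dev ? NULL : calloc(1, sizeof(*dev));
	if (!dev) {
		rval = -ENOMEM;
		goto out_free_cmd;  /* release the command before returning */
	}

	cmd->device = dev;
	cmd->device->host_no = host_no;
	*seen_host = cmd->device->host_no;  /* stands in for issuing the command */

	free(dev);
out_free_cmd:
	free(cmd);
	return rval;
}
```

Funnelling the exits through one label keeps the release of the command
in a single place, whichever step fails.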
Comment 22 bo yang 2008-10-30 06:02:30 UTC
The patch works for us.

Regards,

Bo Yang

-----Original Message-----
From: Yang, Bo
Sent: Friday, October 24, 2008 9:31 AM
To: 'FUJITA Tomonori'; pterjan@gmail.com; James.Bottomley@hansenpartnership.com
Cc: bharrosh@panasas.com; matthew@wil.cx; akpm@linux-foundation.org; linux-scsi@vger.kernel.org; Patro, Sumant; bugme-daemon@bugzilla.kernel.org; Austria, Winston
Subject: RE: [Bugme-new] [Bug 11792] New: Oops when reading /proc/megaraid/hba0/diskdrives-ch*

Tom,

I will update you as soon as LSI verifies it.  Not today, next week will be safe.

Regards,

Bo Yang

