Bug 11529 - aacraid - WARNING: at lib/scatterlist.c
Status: RESOLVED CODE_FIX
Product: SCSI Drivers
Classification: Unclassified
Component: AACRAID
Hardware: All Linux
Importance: P1 normal
Assigned To: scsi_drivers-aacraid
Reported: 2008-09-10 01:07 UTC by Winfried Tilanus
Modified: 2008-09-10 10:35 UTC (History)

Kernel Version: 2.6.26.5
Tree: Mainline
Regression: ---


Description Winfried Tilanus 2008-09-10 01:07:50 UTC
Latest working kernel version: 2.6.25.17
Earliest failing kernel version: 2.6.26
Distribution: Debian 
Hardware Environment: Intel 686
Problem Description:

During boot I see 16 warnings about scatterlist.c, right after the 
initialization of aacraid (using an Adaptec AAR-2410SA RAID card). Boot 
seems to proceed normally. After boot I don't see these messages 
anymore:

[    5.640780] AAC0: kernel 4.2-0[8205] Aug 17 2005
[    5.640780] AAC0: monitor 4.2-0[8205]
[    5.640780] AAC0: bios 4.2-0[8205]
[    5.640780] AAC0: serial C92A77
[    5.640781] scsi2 : aacraid
[    5.641921] ------------[ cut here ]------------
[    5.641924] WARNING: at lib/scatterlist.c:316 sg_copy_buffer+0x2e/0x14f()
[    5.641926] Modules linked in: pata_marvell(+) ata_piix(+) ohci1394(+) libata ieee1394 aacraid scsi_mod dock ehci_hcd uhci_hcd usbcore e1000e thermal processor fan thermal_sys
[    5.641940] Pid: 900, comm: modprobe Not tainted 2.6.26.5-pristine #1
[    5.641942]  [<c01237c0>] warn_on_slowpath+0x40/0x79
[    5.641948]  [<c017ee31>] do_lookup+0x53/0x153
[    5.641952]  [<c0115c45>] do_page_fault+0x0/0x7e7
[    5.641956]  [<c02c00fa>] error_code+0x72/0x78
[    5.641960]  [<c0118bfb>] kmap_atomic_prot+0xc9/0xff
[    5.641963]  [<c0118c42>] kmap_atomic+0x11/0x14
[    5.641966]  [<c0118b17>] kunmap_atomic+0x4b/0x66
[    5.641969]  [<c0158acf>] file_read_actor+0x75/0xc5
[    5.641973]  [<c01275a6>] current_fs_time+0x13/0x15
[    5.641976]  [<c018bfa6>] mnt_drop_write+0x1a/0xb4
[    5.641979]  [<c015ad90>] generic_file_aio_read+0x492/0x512
[    5.641984]  [<c01e9462>] sg_copy_buffer+0x2e/0x14f
[    5.641987]  [<c0178b34>] do_sync_read+0xc0/0x107
[    5.641991]  [<c01e958e>] sg_copy_to_buffer+0xb/0xe
[    5.641994]  [<f8b98449>] get_container_name_callback+0xa2/0xf8 [aacraid]
[    5.642004]  [<c016a7a0>] mmap_region+0x38d/0x45b
[    5.642007]  [<c013364e>] autoremove_wake_function+0x0/0x2d
[    5.642012]  [<f8b9dd06>] aac_intr_normal+0x166/0x1a2 [aacraid]
[    5.642021]  [<f8b9e943>] aac_rx_intr_message+0x25/0x55 [aacraid]
[    5.642030]  [<c01545a7>] handle_IRQ_event+0x23/0x51
[    5.642034]  [<c01553b2>] handle_fasteoi_irq+0x71/0xa4
[    5.642037]  [<c0105f8b>] do_IRQ+0x4d/0x66
[    5.642040]  [<c010434b>] common_interrupt+0x23/0x28
[    5.642044]  =======================
[    5.642046] ---[ end trace bbed9608e04f0ab9 ]---

The Adaptec card is an AAR-2410SA, a PCI SATA RAID card. lspci -vvv reports on it:

04:00.0 RAID bus controller: Adaptec AAC-RAID (rev 01)
	Subsystem: Adaptec Device 0290
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32 (250ns min, 250ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 21
	Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
	Expansion ROM at e7008000 [disabled] [size=32K]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: aacraid
	Kernel modules: aacraid

Further info: Intel Core 2 Quad, 4 GB RAM. Kernel compiled with a slightly modified config taken from Debian's 686-bigmem kernel.
Comment 1 Anonymous Emailer 2008-09-10 02:01:39 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 10 Sep 2008 01:07:50 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11529
> 
>            Summary: aacraid - WARNING: at lib/scatterlist.c
>            Product: SCSI Drivers
>            Version: 2.5
>      KernelVersion: 2.6.26.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: AACRAID
>         AssignedTo: scsi_drivers-aacraid@kernel-bugs.osdl.org
>         ReportedBy: winfried@tilanus.com
> 
> 
> Latest working kernel version: 2.6.25.17
> Earliest failing kernel version: 2.6.26
> Distribution: Debian 
> Hardware Environment: Intel 686
> Problem Description:
> 
> During boot I see 16 warnings about scatterlist.c, right after the 
> initialization of aacraid (using an Adaptec AAR-2410SA RAID card). Boot 
> seems to proceed normally. After boot I don't see these messages 
> anymore:
> 
> [    5.640780] AAC0: kernel 4.2-0[8205] Aug 17 2005
> [    5.640780] AAC0: monitor 4.2-0[8205]
> [    5.640780] AAC0: bios 4.2-0[8205]
> [    5.640780] AAC0: serial C92A77
> [    5.640781] scsi2 : aacraid
> [    5.641921] ------------[ cut here ]------------
> [    5.641924] WARNING: at lib/scatterlist.c:316 sg_copy_buffer+0x2e/0x14f()
> [    5.641926] Modules linked in: pata_marvell(+) ata_piix(+) ohci1394(+)
> libata ieee1394 aacraid scsi_mod dock ehci_hcd uhci_hcd usbcore e1000e thermal
> processor fan thermal_sys

Sorry about the bug.

I thought that this function was called with local interrupts disabled,
since in 2.6.25 it uses KM_IRQ0 without disabling local
interrupts...

This fix needs to go to scsi-rc-fixes.

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] aacraid: disable local interrupts before scsi_sg_copy APIs

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
CC: stable@kernel.org
---
 drivers/scsi/aacraid/aachba.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index aa4e77c..8e60d06 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -404,13 +404,16 @@ static void get_container_name_callback(void *context, struct fib * fibptr)
 			char d[sizeof(((struct inquiry_data *)NULL)->inqd_pid)];
 			int count = sizeof(d);
 			char *dp = d;
+			unsigned long flags;
 			do {
 				*dp++ = (*sp) ? *sp++ : ' ';
 			} while (--count > 0);
 
+			local_irq_save(flags);
 			scsi_sg_copy_to_buffer(scsicmd, &inq, sizeof(inq));
 			memcpy(inq.inqd_pid, d, sizeof(d));
 			scsi_sg_copy_from_buffer(scsicmd, &inq, sizeof(inq));
+			local_irq_restore(flags);
 		}
 	}
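
For illustration, the idea behind the patch can be sketched in userspace C. This is only a model under stated assumptions: the `mock_*` names, the interrupt flag, and the warning counter are hypothetical stand-ins, not kernel APIs, and the check inside `mock_sg_copy` merely imitates a copy helper that warns when it is called with local interrupts enabled, which is what the WARNING at lib/scatterlist.c:316 appears to report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins; none of these are the real kernel implementations. */
static bool mock_irqs_enabled = true;   /* simulated local interrupt state */
static int mock_warn_count;             /* number of simulated warnings */

static void mock_local_irq_save(unsigned long *flags)
{
	*flags = mock_irqs_enabled ? 1UL : 0UL; /* remember the previous state */
	mock_irqs_enabled = false;              /* then disable "interrupts" */
}

static void mock_local_irq_restore(unsigned long flags)
{
	mock_irqs_enabled = (flags != 0);       /* put back whatever was saved */
}

/* Models a copy helper that expects to run with interrupts disabled. */
static size_t mock_sg_copy(void *dst, const void *src, size_t len)
{
	if (mock_irqs_enabled) {
		mock_warn_count++;
		fprintf(stderr, "WARNING: copy called with interrupts enabled\n");
	}
	memcpy(dst, src, len);
	return len;
}

/* Shape of the unpatched callback: copies without touching IRQ state. */
static int callback_unpatched(void)
{
	char src[8] = "AAC0", dst[8];

	mock_warn_count = 0;
	mock_irqs_enabled = true;
	mock_sg_copy(dst, src, sizeof(dst));
	return mock_warn_count;
}

/* Shape of the patched callback: save/disable, copy, restore. */
static int callback_patched(void)
{
	char src[8] = "AAC0", dst[8];
	unsigned long flags;

	mock_warn_count = 0;
	mock_irqs_enabled = true;
	mock_local_irq_save(&flags);
	mock_sg_copy(dst, src, sizeof(dst));
	mock_local_irq_restore(flags);
	assert(mock_irqs_enabled);              /* saved state was restored */
	return mock_warn_count;
}
```

In this model, `callback_unpatched()` records one simulated warning and `callback_patched()` records none. Note that the kernel's `local_irq_save()` is a macro that takes the flags variable by name, whereas the mock takes a pointer.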
 
Comment 2 Winfried Tilanus 2008-09-10 05:44:30 UTC
The patch didn't apply, but adding the lines by hand worked. It solved most of the warnings, but I am still left with two warnings at boot time:

[    5.237554] ------------[ cut here ]------------
[    5.237554] WARNING: at lib/scatterlist.c:316 sg_copy_buffer+0x2e/0x14f()
[    5.237554] Modules linked in: usbhid(+) hid ff_memless sd_mod ohci1394(+) ata_piix(+) pata_marvell(+) aacraid ieee1394 libata scsi_mod dock ehci_hcd uhci_hcd usbcore e1000e thermal processor fan thermal_sys
[    5.237554] Pid: 1025, comm: modprobe Not tainted 2.6.26.5-pristine-patched #1
[    5.237554]  [<c01237c0>] warn_on_slowpath+0x40/0x79
[    5.237554]  [<c0118bfb>] kmap_atomic_prot+0xc9/0xff
[    5.237554]  [<c0118c42>] kmap_atomic+0x11/0x14
[    5.237554]  [<c0118b17>] kunmap_atomic+0x4b/0x66
[    5.237554]  [<c0158acf>] file_read_actor+0x75/0xc5
[    5.237554]  [<c01275a6>] current_fs_time+0x13/0x15
[    5.237554]  [<c018bfa6>] mnt_drop_write+0x1a/0xb4
[    5.237554]  [<c015ad90>] generic_file_aio_read+0x492/0x512
[    5.237554]  [<c01e9462>] sg_copy_buffer+0x2e/0x14f
[    5.237554]  [<c01e959c>] sg_copy_from_buffer+0xb/0xe
[    5.237554]  [<f8bb036a>] get_container_serial_callback+0x87/0xc4 [aacraid]
[    5.237554]  [<f8bb5d32>] aac_intr_normal+0x166/0x1a2 [aacraid]
[    5.237554]  [<f8bb696f>] aac_rx_intr_message+0x25/0x55 [aacraid]
[    5.237554]  [<c01545a7>] handle_IRQ_event+0x23/0x51
[    5.237554]  [<c01553b2>] handle_fasteoi_irq+0x71/0xa4
[    5.237554]  [<c0105f8b>] do_IRQ+0x4d/0x66
[    5.237554]  [<c010434b>] common_interrupt+0x23/0x28
[    5.237554]  =======================
[    5.237554] ---[ end trace d08418442dbc698b ]---

And:

[   17.985377] ------------[ cut here ]------------
[   17.985377] WARNING: at lib/scatterlist.c:316 sg_copy_buffer+0x2e/0x14f()
[   17.985377] Modules linked in: jfs nls_base dm_mirror dm_log dm_snapshot dm_mod usb_storage sg sr_mod cdrom ide_pci_generic ide_core ata_generic ahci usbhid hid ff_memless sd_mod ohci1394 ata_piix pata_marvell aacraid ieee1394 libata scsi_mod dock ehci_hcd uhci_hcd usbcore e1000e thermal processor fan thermal_sys
[   17.985377] Pid: 1770, comm: udevd Tainted: G        W 2.6.26.5-pristine-patched #1
[   17.985377]  [<c01237c0>] warn_on_slowpath+0x40/0x79
[   17.985383]  [<c01877d5>] dput+0x16/0xdb
[   17.985383]  [<c0115fa7>] do_page_fault+0x362/0x7e7
[   17.985383]  [<c011600e>] do_page_fault+0x3c9/0x7e7
[   17.985383]  [<c018c08e>] mntput_no_expire+0x18/0xf3
[   17.985383]  [<c018c08e>] mntput_no_expire+0x18/0xf3
[   17.985383]  [<c025517d>] sys_sendto+0xfc/0x127
[   17.985383]  [<c01e9462>] sg_copy_buffer+0x2e/0x14f
[   17.985383]  [<c01befd1>] security_sk_alloc+0xd/0xf
[   17.985383]  [<c0256e6c>] sk_prot_alloc+0x36/0x83
[   17.985383]  [<c01e959c>] sg_copy_from_buffer+0xb/0xe
[   17.985383]  [<f8bb036a>] get_container_serial_callback+0x87/0xc4 [aacraid]
[   17.985383]  [<f8bb5d32>] aac_intr_normal+0x166/0x1a2 [aacraid]
[   17.985383]  [<f8bb696f>] aac_rx_intr_message+0x25/0x55 [aacraid]
[   17.985383]  [<c01545a7>] handle_IRQ_event+0x23/0x51
[   17.985383]  [<c01553b2>] handle_fasteoi_irq+0x71/0xa4
[   17.985383]  [<c0105f8b>] do_IRQ+0x4d/0x66
[   17.985383]  [<c010434b>] common_interrupt+0x23/0x28
[   17.985383]  [<c02c0000>] __reacquire_kernel_lock+0x1e/0x3d
[   17.985383]  =======================
[   17.985383] ---[ end trace d08418442dbc698b ]---
Comment 3 Anonymous Emailer 2008-09-10 07:13:24 UTC
Reply-To: fujita.tomonori@lab.ntt.co.jp

On Wed, 10 Sep 2008 05:44:31 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11529
> 
> ------- Comment #2 from winfried@tilanus.com  2008-09-10 05:44 -------
> The patch didn't apply, but adding the lines by hand worked.

Strange, the patch as delivered by the mailing list applies cleanly
to 2.6.26.5 for me...


> It solved most of
> the warnings, but I am still left with two warnings at boot time:

Ah, sorry. Can you try this patch?

Thanks,


diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
index aa4e77c..160feab 100644
--- a/drivers/scsi/aacraid/aachba.c
+++ b/drivers/scsi/aacraid/aachba.c
@@ -404,13 +404,16 @@ static void get_container_name_callback(void *context, struct fib * fibptr)
 			char d[sizeof(((struct inquiry_data *)NULL)->inqd_pid)];
 			int count = sizeof(d);
 			char *dp = d;
+			unsigned long flags;
 			do {
 				*dp++ = (*sp) ? *sp++ : ' ';
 			} while (--count > 0);
 
+			local_irq_save(flags);
 			scsi_sg_copy_to_buffer(scsicmd, &inq, sizeof(inq));
 			memcpy(inq.inqd_pid, d, sizeof(d));
 			scsi_sg_copy_from_buffer(scsicmd, &inq, sizeof(inq));
+			local_irq_restore(flags);
 		}
 	}
 
@@ -793,6 +796,7 @@ static void get_container_serial_callback(void *context, struct fib * fibptr)
 	get_serial_reply = (struct aac_get_serial_resp *) fib_data(fibptr);
 	/* Failure is irrelevant, using default value instead */
 	if (le32_to_cpu(get_serial_reply->status) == CT_OK) {
+		unsigned long flags;
 		char sp[13];
 		/* EVPD bit set */
 		sp[0] = INQD_PDT_DA;
@@ -800,7 +804,9 @@ static void get_container_serial_callback(void *context, struct fib * fibptr)
 		sp[2] = 0;
 		sp[3] = snprintf(sp+4, sizeof(sp)-4, "%08X",
 		  le32_to_cpu(get_serial_reply->uid));
+		local_irq_save(flags);
 		scsi_sg_copy_from_buffer(scsicmd, sp, sizeof(sp));
+		local_irq_restore(flags);
 	}
 
 	scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 | SAM_STAT_GOOD;

Comment 4 Winfried Tilanus 2008-09-10 10:35:28 UTC
Thanks, this patch did the trick, no warnings any more.
