Kernel Bug Tracker – Bug 17321
i386 WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e
Last modified: 2010-09-20 16:18:12 UTC
Subject : i386 WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e
Submitter : Arno Schuring <firstname.lastname@example.org>
Date : 2010-08-27 20:04
Message-ID : 4C781A3A.email@example.com
References : http://marc.info/?l=linux-kernel&m=128294076822387&w=2
This entry is being used for tracking a regression from 2.6.35. Please don't
close it until the problem is fixed in the mainline.
Created attachment 28321 [details]
Created attachment 28331 [details]
Created attachment 28341 [details]
lspci -vv output
still present in -rc3. I notice that the full backtrace is not listed yet (it is in the attached dmesg though), so here it is in full:
[ 12.675795] ------------[ cut here ]------------
[ 12.675814] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e()
[ 12.675818] Hardware name:
[ 12.675821] Modules linked in: xfs exportfs w83627hf_wdt w83627hf hwmon_vid d
m_crypt dm_mod rtc_cmos 8250_pnp processor i2c_sis96x 8250 rtc_core evdev pcspkr
button usbhid usb_storage rtc_lib serial_core hid ext4 jbd2 crc16 ohci_hcd ehci
_hcd usbcore floppy thermal sr_mod thermal_sys nls_base cdrom [last unloaded: sc
[ 12.675872] Pid: 0, comm: swapper Not tainted 2.6.36-rc3+ #20
[ 12.675875] Call Trace:
[ 12.675891] [<c1024fc1>] ? warn_slowpath_common+0x6a/0x7b
[ 12.675896] [<c102975a>] ? _local_bh_enable_ip+0x2f/0x7e
[ 12.675902] [<c1024fdf>] ? warn_slowpath_null+0xd/0x10
[ 12.675907] [<c102975a>] ? _local_bh_enable_ip+0x2f/0x7e
[ 12.675914] [<c125c832>] ? netpoll_rx+0x71/0x77
[ 12.675919] [<c125c846>] ? netif_rx+0xe/0xf7
[ 12.675928] [<c122085b>] ? sis900_interrupt+0x1c7/0x4ab
[ 12.675936] [<c10580e3>] ? handle_IRQ_event+0x54/0x102
[ 12.675943] [<c1059981>] ? handle_fasteoi_irq+0x8c/0xdf
[ 12.675948] [<c10598f5>] ? handle_fasteoi_irq+0x0/0xdf
[ 12.675951] <IRQ> [<c1003a3c>] ? do_IRQ+0x2b/0x7f
[ 12.675961] [<c1002ca9>] ? common_interrupt+0x29/0x30
[ 12.675971] [<c1007eba>] ? default_idle+0x39/0x4c
[ 12.675975] [<c1001b00>] ? cpu_idle+0x35/0xad
[ 12.675983] [<c14346c0>] ? start_kernel+0x274/0x277
[ 12.675987] ---[ end trace c042fd3b7ebacdb1 ]---
Added Paul on the cc list.
The patch at http://lkml.org/lkml/2010/9/2/292 should address this.
>Comment #6 From Paul E. McKenney 2010-09-06 15:26:36 -------
>The patch at http://lkml.org/lkml/2010/9/2/292 should address this.
Sadly, it doesn't:
aschuring@neminis:~/work/linux-GIT$ grep -A5 _bh_enable_ip kernel/softirq.c
static inline void _local_bh_enable_ip(unsigned long ip)
aschuring@neminis:~/work/linux-GIT$ ls -l kernel/softirq.c
-rw-r--r-- 1 aschuring aschuring 21733 Sep 7 11:23 kernel/softirq.c
[ 0.000000] Linux version 2.6.36-rc3+ (aschuring@neminis) (gcc version 4.4.5 20100728 (prerelease) (Debian 4.4.4-8) ) #21 PREEMPT Tue Sep 7 11:29:32 CEST 2010
[ 12.880400] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x28/0x77()
The backtrace is identical, except for the addresses in _local_bh_enable_ip. Will do a make clean and retest.
No change after a clean build:
[ 11.811728] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x28/0x77()
Closer inspection of the backtrace shows we're in a code path from handle_IRQ_event, so it shouldn't be too surprising that the patch has no effect?
Created attachment 30382 [details]
first bisection attempt
de85d99eb7b595f6751550184b94c1e2f74a828b is the first bad commit
Author: Herbert Xu <firstname.lastname@example.org>
Date: Thu Jun 10 16:12:44 2010 +0000
netpoll: Fix RCU usage
The use of RCU in netpoll is incorrect in a number of places:
1) The initial setting is lacking a write barrier.
2) The synchronize_rcu is in the wrong place.
3) Read barriers are missing.
4) Some places are even missing rcu_read_lock.
5) npinfo is zeroed after freeing.
This patch fixes those issues. As most users are in BH context,
this also converts the RCU usage to the BH variant.
Signed-off-by: Herbert Xu <email@example.com>
Acked-by: Paul E. McKenney <firstname.lastname@example.org>
Signed-off-by: David S. Miller <email@example.com>
:040000 040000 27b02b9815fedf05ce1dc42cf935b984e97abba9 0f46cb6360a8a082f5695d56c48fdf40078cd6b3 Minclude
:040000 040000 fdf4c432f33c67668c9740dde0b8c89f2806a43b 39c5f6fb0abea040a520ff2e08e26aac4689c49c Mnet
I'm not entirely convinced that this bisection result is correct, although it seems plausible. My problem with the commit is this:
$ git describe de85d99eb7b595f6751550184b94c1e2f74a828b
Doesn't this mean that I should have seen this WARNING before? I have been running 35-rc4 and 35.1 before and never saw this trace. Moreover, the bisection took me as far back as 2.6.34, even though I started out with marking 2.6.35 as good. I'm attaching the full bisect log for closer inspection, I may have confused git by forcing it to re-test v2.6.35 (5th step).
$ git tag --contains de85d99eb7b595f6751550184b94c1e2f74a828b
$ git describe just says on what it is _based_ upon. But it doesn't say when it was integrated into mainline.
Please check if the patch from:
fixes the issue for you.
Thanks for the git info :)
The patch does indeed solve the problem.
Ok, I'm closing this as a duplicate of that bug then.
Thanks for the testing and bisection.
*** This bug has been marked as a duplicate of bug 18492 ***