Bug 17321
Summary: | i386 WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Maciej Rutecki (maciej.rutecki) |
Component: | i386 | Assignee: | platform_i386 |
Status: | CLOSED DUPLICATE | ||
Severity: | normal | CC: | aelschuring, florian, maciej.rutecki, paulmck, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.36-rc2 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16444 | ||
Attachments: |
config
full dmesg lspci -vv output first bisection attempt |
Description
Maciej Rutecki
2010-08-29 13:57:35 UTC
Created attachment 28321 [details]
config
Created attachment 28331 [details]
full dmesg
Created attachment 28341 [details]
lspci -vv output
still present in -rc3. I notice that the full backtrace is not listed yet (it is in the attached dmesg though), so here it is in full: [ 12.675795] ------------[ cut here ]------------ [ 12.675814] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x2f/0x7e() [ 12.675818] Hardware name: [ 12.675821] Modules linked in: xfs exportfs w83627hf_wdt w83627hf hwmon_vid d m_crypt dm_mod rtc_cmos 8250_pnp processor i2c_sis96x 8250 rtc_core evdev pcspkr button usbhid usb_storage rtc_lib serial_core hid ext4 jbd2 crc16 ohci_hcd ehci _hcd usbcore floppy thermal sr_mod thermal_sys nls_base cdrom [last unloaded: sc si_wait_scan] [ 12.675872] Pid: 0, comm: swapper Not tainted 2.6.36-rc3+ #20 [ 12.675875] Call Trace: [ 12.675891] [<c1024fc1>] ? warn_slowpath_common+0x6a/0x7b [ 12.675896] [<c102975a>] ? _local_bh_enable_ip+0x2f/0x7e [ 12.675902] [<c1024fdf>] ? warn_slowpath_null+0xd/0x10 [ 12.675907] [<c102975a>] ? _local_bh_enable_ip+0x2f/0x7e [ 12.675914] [<c125c832>] ? netpoll_rx+0x71/0x77 [ 12.675919] [<c125c846>] ? netif_rx+0xe/0xf7 [ 12.675928] [<c122085b>] ? sis900_interrupt+0x1c7/0x4ab [ 12.675936] [<c10580e3>] ? handle_IRQ_event+0x54/0x102 [ 12.675943] [<c1059981>] ? handle_fasteoi_irq+0x8c/0xdf [ 12.675948] [<c10598f5>] ? handle_fasteoi_irq+0x0/0xdf [ 12.675951] <IRQ> [<c1003a3c>] ? do_IRQ+0x2b/0x7f [ 12.675961] [<c1002ca9>] ? common_interrupt+0x29/0x30 [ 12.675971] [<c1007eba>] ? default_idle+0x39/0x4c [ 12.675975] [<c1001b00>] ? cpu_idle+0x35/0xad [ 12.675983] [<c14346c0>] ? start_kernel+0x274/0x277 [ 12.675987] ---[ end trace c042fd3b7ebacdb1 ]--- Added Paul on the cc list. The patch at http://lkml.org/lkml/2010/9/2/292 should address this. >Comment #6 From Paul E. McKenney 2010-09-06 15:26:36 -------
>The patch at http://lkml.org/lkml/2010/9/2/292 should address this.
Sadly, it doesn't:
aschuring@neminis:~/work/linux-GIT$ grep -A5 _bh_enable_ip kernel/softirq.c
static inline void _local_bh_enable_ip(unsigned long ip)
{
WARN_ON_ONCE(in_irq());
#ifdef CONFIG_TRACE_IRQFLAGS
local_irq_disable();
#endif
--
[...]
aschuring@neminis:~/work/linux-GIT$ ls -l kernel/softirq.c
-rw-r--r-- 1 aschuring aschuring 21733 Sep 7 11:23 kernel/softirq.c
[ 0.000000] Linux version 2.6.36-rc3+ (aschuring@neminis) (gcc version 4.4.5 20100728 (prerelease) (Debian 4.4.4-8) ) #21 PREEMPT Tue Sep 7 11:29:32 CEST 2010
[...]
[ 12.880400] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x28/0x77()
The backtrace is identical, except for the addresses in _local_bh_enable_ip. Will do a make clean and retest.
No change after a clean build: [ 11.811728] WARNING: at kernel/softirq.c:143 _local_bh_enable_ip+0x28/0x77() Closer inspection of the backtrace shows we're in a code path from handle_IRQ_event, so it shouldn't be too surprising that the patch has no effect? Created attachment 30382 [details] first bisection attempt de85d99eb7b595f6751550184b94c1e2f74a828b is the first bad commit commit de85d99eb7b595f6751550184b94c1e2f74a828b Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu Jun 10 16:12:44 2010 +0000 netpoll: Fix RCU usage The use of RCU in netpoll is incorrect in a number of places: 1) The initial setting is lacking a write barrier. 2) The synchronize_rcu is in the wrong place. 3) Read barriers are missing. 4) Some places are even missing rcu_read_lock. 5) npinfo is zeroed after freeing. This patch fixes those issues. As most users are in BH context, this also converts the RCU usage to the BH variant. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> :040000 040000 27b02b9815fedf05ce1dc42cf935b984e97abba9 0f46cb6360a8a082f5695d56c48fdf40078cd6b3 Minclude :040000 040000 fdf4c432f33c67668c9740dde0b8c89f2806a43b 39c5f6fb0abea040a520ff2e08e26aac4689c49c Mnet I'm not entirely convinced that this bisection result is correct, although it seems plausible. My problem with the commit is this: $ git describe de85d99eb7b595f6751550184b94c1e2f74a828b v2.6.35-rc1-454-gde85d99 Doesn't this mean that I should have seen this WARNING before? I have been running 35-rc4 and 35.1 before and never saw this trace. Moreover, the bisection took me as far back as 2.6.34, even though I started out with marking 2.6.35 as good. I'm attaching the full bisect log for closer inspection, I may have confused git by forcing it to re-test v2.6.35 (5th step). try $ git tag --contains de85d99eb7b595f6751550184b94c1e2f74a828b $ git describe just says on what it is _based_ upon. But it doesn't say when it was integrated into mainline. Please check if the patch from: https://bugzilla.kernel.org/show_bug.cgi?id=18492 fixes the issue for you. Thanks for the git info :) The patch does indeed solve the problem. Ok, I'm closing this as a duplicate of that bug then. Thanks for the testing and bisection. *** This bug has been marked as a duplicate of bug 18492 *** |