Most recent kernel where this bug did not occur: 2.6.20+
Distribution: Gentoo
Hardware Environment: 8 different PCs
Software Environment: ?
Problem Description:
Caught by netconsole:

[91922.085864] ------------[ cut here ]------------
[91922.085975] kernel BUG at kernel/timer.c:606!
[91922.086058] invalid opcode: 0000 [#1]
[91922.086127] SMP
[91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core
[91922.086386] CPU: 1
[91922.086387] EIP: 0060:[<c0127387>] Not tainted VLI
[91922.086389] EFLAGS: 00010087 (2.6.23-gentoo-r4-fw #4)
[91922.086600] EIP is at cascade+0x34/0x4f
[91922.086669] eax: c0452200 ebx: f450408c ecx: 00000022 edx: f3c6e08c
[91922.086740] esi: 00000022 edi: c21ce000 ebp: 00000001 esp: c21c3ef8
[91922.086815] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
[91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000 task.ti=c21c2000)
[91922.086954] Stack: f3c6e08c c21bfb74 00000000 c21ce000 0000000a c012767a c21af000 00000001
[91922.087119]        c21c3f18 c0106963 c21c3f68 00000001 00000021 c03c0b08 0000000a c0124556
[91922.087285]        00000046 00000000 c21c2008 00000000 c01245ec c2015120 c0114a11 00000046
[91922.087451] Call Trace:
[91922.087586] [<c012767a>] run_timer_softirq+0x51/0x154
[91922.087669] [<c0106963>] profile_pc+0x21/0x46
[91922.087752] [<c0124556>] __do_softirq+0x5d/0xc1
[91922.087833] [<c01245ec>] do_softirq+0x32/0x36
[91922.087915] [<c0114a11>] smp_apic_timer_interrupt+0x74/0x80
[91922.087997] [<c010484c>] apic_timer_interrupt+0x28/0x30
[91922.088076] [<c0102255>] mwait_idle_with_hints+0x3b/0x3f
[91922.088162] [<c0102259>] mwait_idle+0x0/0xa
[91922.088237] [<c0102398>] cpu_idle+0x91/0xaa
[91922.088319] =======================
[91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8 74 04 <0f> 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89
[91922.088864] EIP: [<c0127387>] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8

Steps to reproduce:
Random, 1-3 times a week. Every hour I run:

echo START >> log.txt
iptables-restore < xxx.txt
tc qdisc del dev eth0 root
tc qdisc del dev eth1 root
tc -b new_rules.txt
echo END >> log.txt

The bug always occurs between START and END.
Looks like the pending timer was corrupted. Say, it was freed/reused without del_timer().

I don't know how to add myself to the CC list and send the patch at the same time. Will send the debug patch in the next comment.
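To illustrate the suspicion above, here is a minimal userspace sketch (not kernel code). It assumes that the check which fired in cascade() compares the base recorded in each pending timer with the base being processed; that reading of kernel/timer.c:606 is an assumption, not something stated in this report. The names sk_base, sk_timer, sk_add_pending, sk_reuse_without_del and sk_count_corrupted are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins, not the kernel's real structures. */
struct sk_base { int id; };

struct sk_timer {
    struct sk_timer *next;  /* pending list linkage */
    struct sk_base  *base;  /* owner, recorded when the timer is queued */
};

/* Queue a timer at the front of a base's pending list. */
static void sk_add_pending(struct sk_timer **head, struct sk_base *b,
                           struct sk_timer *t)
{
    assert(head != NULL && b != NULL && t != NULL);
    t->base = b;
    t->next = *head;
    *head = t;
}

/* Model of "freed/reused without del_timer()": the memory is
 * overwritten while the entry is still linked into the list. */
static void sk_reuse_without_del(struct sk_timer *t)
{
    memset(t, 0, sizeof(*t));
}

/* Walk the pending list and count entries whose recorded base
 * does not match the base being processed. */
static int sk_count_corrupted(const struct sk_timer *head,
                              const struct sk_base *b)
{
    int bad = 0;

    for (const struct sk_timer *t = head; t != NULL; t = t->next)
        if (t->base != b)
            bad++;
    return bad;
}
```

If two timers are queued and sk_reuse_without_del() is then called on the most recently queued one, sk_count_corrupted() reports one bad entry. The timer queued before it is no longer reachable at all, because the overwritten next pointer truncates the list.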
Created attachment 14183
debug patch to detect the corrupted pending timers

WARNING: the patch is not tested.

Still, any chance you can reproduce the bug with this patch?
Hmm.. These PCs must be HA, so I can't test patches on them =(
I have replaced BUG_ON(xxx) with if (xxx) panic();
I think that will help me while I wait for the bug to be fixed (a reboot is better than a freeze).
Hi,

Leigh Sharpe had the same(?) problem that you may be having right now. Read on here:
http://lists.openwall.net/netdev/2007/12/05/43

He solved the problem by changing the order of commands in his script. Below is his original reply to my inquiry:

----------------------------
My issue wasn't with an SMP kernel. I only had one CPU in the machine I was playing with.
I actually found the reason for the problem I was having. It seems that when adding qdiscs or classes, you need to do it in the right order. If you add filters to a qdisc, redirecting traffic to a class or qdisc which has not yet been set up, the kernel will crash. If classes are set up prior to adding the filters, it doesn't cause a problem.

Leigh
----------------------------

Good luck, Badalian.
Order:
Class add
Qdisc add
Filter add
First rule:
qdisc add dev eth1 root handle 1 htb default 7
I think this order could especially matter with ifb, which is used by Leigh and Marek, but even if changing the order helps, it's not a proper bug fix after all.

BTW, Slava, could you add some more details here, like .config and maybe a bit more about these rules (addresses masked of course)? I especially wonder if you are using something like ifb, vlan or bonding? Of course, the outcome from Oleg's patch would still be the most interesting.
One more BTW: Oleg, your patch looks very interesting (clever) and I hope you'll manage to add this to the kernel under some new or existing debugging option. It seems to be very useful considering the 'optimistic' way of deleting timers in many places. But, maybe I'm missing something, it seems you could try to add 'recovering' of these timers after the buggy one, e.g. with some backwards loop?
On 12/28, bugme-daemon@bugzilla.kernel.org wrote:
>
> ------- Comment #8 from jarkao2@gmail.com 2007-12-28 00:34 -------
> I hope you'll manage to add this to the kernel under some new or
> existing debugging option

Will try to do. The patch is simple, but unfortunately it needs a lot
of #ifdef's to minimize the impact without CONFIG_DEBUG_XXX.

> But, maybe
> I miss something, it seems you could try to add 'recovering' of
> these timers after the buggy one, e.g. with some backwards loop?

Yes Jarek. I didn't dare to do this right now, just did a minimal hack
to catch the bug. It also needs other changes. Say, __run_timers() should
also check the timer.

Btw, please look at the first attachment at

http://bugzilla.kernel.org/show_bug.cgi?id=9180

This patch is similar, and it does recover the list in run_workqueue().
But it is not "complete" as well.

Oleg.
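The 'backwards loop' recovery idea discussed above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: the list is circular and doubly linked, a corrupted entry is detected by some sanity check (modeled here by a hypothetical ok flag), and the entries behind the corrupted one still have intact prev pointers. None of the names (sk_node, sk_list_init, sk_list_add_tail, sk_run_with_recovery) come from the kernel or from either patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a circular doubly linked list;
 * these are not the kernel's list_head helpers. */
struct sk_node {
    struct sk_node *next, *prev;
    int ok;     /* 1 = entry passes the sanity check, 0 = corrupted */
    int fired;  /* set once the entry has been processed */
};

static void sk_list_init(struct sk_node *head)
{
    assert(head != NULL);
    head->next = head->prev = head;
    head->ok = 1;
    head->fired = 0;
}

static void sk_list_add_tail(struct sk_node *head, struct sk_node *n)
{
    assert(head != NULL && n != NULL);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
    n->ok = 1;
    n->fired = 0;
}

/* Process entries front to back. On a corrupted entry, stop trusting
 * its next pointer and walk back from the tail to reach the entries
 * queued behind it. Returns the number of entries processed. */
static int sk_run_with_recovery(struct sk_node *head)
{
    struct sk_node *n, *bad = NULL;
    int done = 0;

    for (n = head->next; n != head; n = n->next) {
        if (!n->ok) {
            bad = n;
            break;
        }
        n->fired = 1;
        done++;
    }
    if (bad != NULL) {
        for (n = head->prev; n != head && n != bad; n = n->prev) {
            if (!n->ok)
                break;  /* a second bad entry: give up */
            n->fired = 1;
            done++;
        }
    }
    return done;
}
```

With four entries a, b, c, d queued in that order and b corrupted, the forward pass processes a, and the backward pass processes d and then c, so the function returns 3. Note that the recovered entries run in reverse order, which a real implementation might not accept.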
> Will try to do. The patch is simple, but unfortunately it needs a lot
> of #ifdef's to minimize the impact without CONFIG_DEBUG_XXX.

Once you get the idea it may look simple... But it took me some time... I thought about using a static table for this, like in lockdep, that's why I called your way clever.

And I 'personally' like ifdefs: they make it harder to read the code at the beginning, but IMHO they make it easier e.g. to debug it later, when you can easily skip what surely doesn't matter. But it would be very bad to abstain from adding debugging (and tolerate such mysterious crashes) for such aesthetic reasons.

> This patch is similar, and it does recover the list in run_workqueue().
> But it is not "complete" as well.

Yes, I've thought about mentioning workqueue here too... Of course, this is needed as well, maybe a bit less after fixing the cancelling in workqueues...

Probably some more solutions could be reimplemented too (lockdep checks for del_timer_sync?!). I also wonder why there is no such simple thing as del_timer_last() or _exit(), which could be required in all xyz_exit() or xyz_destroy() functions to mark that the timer_list structure can't be used by mod_timer() anymore without init_timer()?!

It could also be considered whether checking only the function field in such debugging is reliable enough: maybe some checksum would be even better.

But, of course, there is no need to wait with adding basic debugging like this now (especially to -mm). Any additional features or improvements could be done later.

Thanks!
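The del_timer_last() idea above could look roughly like this in a userspace model. It is a sketch of the proposal only, not an existing kernel interface: the struct sk_dtimer and the functions sk_dtimer_init, sk_dtimer_mod and sk_dtimer_del_last are hypothetical names, and the dead flag is an assumed way of marking the structure as unusable until it is initialized again.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sk_timer_fn)(unsigned long);

/* Hypothetical model, not the kernel's struct timer_list. */
struct sk_dtimer {
    sk_timer_fn   function;
    unsigned long expires;
    int           pending;
    int           dead;  /* set by sk_dtimer_del_last(), cleared by init */
};

/* Counterpart of init_timer(): makes the structure usable (again). */
static void sk_dtimer_init(struct sk_dtimer *t, sk_timer_fn fn)
{
    assert(t != NULL);
    t->function = fn;
    t->expires = 0;
    t->pending = 0;
    t->dead = 0;
}

/* Counterpart of mod_timer(): refuses to re-arm a timer that was
 * finally deleted. Returns 0 on success, -1 if the timer is dead. */
static int sk_dtimer_mod(struct sk_dtimer *t, unsigned long expires)
{
    assert(t != NULL);
    if (t->dead)
        return -1;
    t->expires = expires;
    t->pending = 1;
    return 0;
}

/* The proposed "last" delete, meant for xyz_exit()/xyz_destroy():
 * removes the timer and forbids re-arming until the next init. */
static void sk_dtimer_del_last(struct sk_dtimer *t)
{
    assert(t != NULL);
    t->pending = 0;
    t->dead = 1;
}
```

In this model a late sk_dtimer_mod() after sk_dtimer_del_last() fails visibly instead of silently queuing a timer inside memory that is about to be freed; a debugging build could warn at that point.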
> [...] maybe some checksum would be even better [...]

As a matter of fact, the simplest and most reliable approach would be to store some pointer or key which could be verified by a call to mm to see if it's still valid (i.e. this part of memory has not been kfreed in the meantime), but I don't know how much I'm dreaming with this...
Hmmm... with recent kernels the system works normally. I think the bug can be closed; I will reopen it if the bug is still in the kernel and I can reproduce it.