Bug 9632 - kernel BUG at kernel/timer.c:606
Summary: kernel BUG at kernel/timer.c:606
Status: CLOSED CODE_FIX
Alias: None
Product: Timers
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: john stultz
Reported: 2007-12-25 01:20 UTC by Badalian Slava
Modified: 2008-01-28 22:50 UTC
Kernel Version: 2.6.23.12
Regression: ---

Attachments
debug patch to detect the corrupted pending timers (3.91 KB, patch)
2007-12-25 09:55 UTC, Oleg Nesterov

Description Badalian Slava 2007-12-25 01:20:29 UTC
Most recent kernel where this bug did not occur:
2.6.20+
Distribution:
Gentoo
Hardware Environment:
8 different PCs
Software Environment:
?
Problem Description:
Caught by netconsole:
[91922.085864] ------------[ cut here ]------------
[91922.085975] kernel BUG at kernel/timer.c:606!
[91922.086058] invalid opcode: 0000 [#1]
[91922.086127] SMP
[91922.086201] Modules linked in: netconsole cls_u32 sch_sfq sch_htb xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core
[91922.086386] CPU:    1
[91922.086387] EIP:    0060:[<c0127387>]    Not tainted VLI
[91922.086389] EFLAGS: 00010087   (2.6.23-gentoo-r4-fw #4)
[91922.086600] EIP is at cascade+0x34/0x4f
[91922.086669] eax: c0452200   ebx: f450408c   ecx: 00000022   edx: f3c6e08c
[91922.086740] esi: 00000022   edi: c21ce000   ebp: 00000001   esp: c21c3ef8
[91922.086815] ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
[91922.086885] Process swapper (pid: 0, ti=c21c2000 task=c21af000 task.ti=c21c2000)
[91922.086954] Stack: f3c6e08c c21bfb74 00000000 c21ce000 0000000a c012767a c21af000 00000001
[91922.087119]        c21c3f18 c0106963 c21c3f68 00000001 00000021 c03c0b08 0000000a c0124556
[91922.087285]        00000046 00000000 c21c2008 00000000 c01245ec c2015120 c0114a11 00000046
[91922.087451] Call Trace:
[91922.087586]  [<c012767a>] run_timer_softirq+0x51/0x154
[91922.087669]  [<c0106963>] profile_pc+0x21/0x46
[91922.087752]  [<c0124556>] __do_softirq+0x5d/0xc1
[91922.087833]  [<c01245ec>] do_softirq+0x32/0x36
[91922.087915]  [<c0114a11>] smp_apic_timer_interrupt+0x74/0x80
[91922.087997]  [<c010484c>] apic_timer_interrupt+0x28/0x30
[91922.088076]  [<c0102255>] mwait_idle_with_hints+0x3b/0x3f
[91922.088162]  [<c0102259>] mwait_idle+0x0/0xa
[91922.088237]  [<c0102398>] cpu_idle+0x91/0xaa
[91922.088319]  =======================
[91922.088390] Code: 08 8d 04 ca 8b 10 89 62 04 89 14 24 8b 50 04 89 22 89 00 89 54 24 04 8b 14 24 89 40 04 8b 1a eb 19 8b 42 14 83 e0 fe 39 f8 74 04 <0f> 0b eb fe 89 f8 e8 d8 fe ff ff 89 da 8b 1b 39 e2 75 e3 59 89
[91922.088864] EIP: [<c0127387>] cascade+0x34/0x4f SS:ESP 0068:c21c3ef8

Steps to reproduce:
Random, 1-3 times a week.

Every hour I run:
echo START >> log.txt
iptables-restore < xxx.txt
tc qdisc del dev eth0 root
tc qdisc del dev eth1 root
tc -b new_rules.txt
echo END >> log.txt

The bug always occurs between START and END.
Comment 1 Oleg Nesterov 2007-12-25 09:52:30 UTC
Looks like the pending timer was corrupted. Say it was freed/reused
without del_timer().

I don't know how to add myself to CC list and send the patch at the
same time. Will send the debug patch in the next comment.
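
The failure mode described in this comment can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_base`, `toy_timer` and every `toy_*` function are hypothetical stand-ins, not kernel code, and the "reuse" is simulated by overwriting one field so the list walk itself stays safe. The check in `toy_count_foreign()` mirrors what the trapping check in cascade() appears to do, namely compare each queued timer's base with the base being processed.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for a per-CPU timer base. */
struct toy_base { struct list_head pending; };

struct toy_timer {
    struct list_head entry;   /* links the timer into base->pending */
    struct toy_base *base;    /* owning base while the timer is pending */
};

static void toy_base_init(struct toy_base *b)
{
    assert(b != NULL);
    b->pending.next = b->pending.prev = &b->pending;
}

/* Queue a timer at the tail of the base's pending list. */
static void toy_add_timer(struct toy_base *b, struct toy_timer *t)
{
    assert(b != NULL && t != NULL);
    t->base = b;
    t->entry.next = &b->pending;
    t->entry.prev = b->pending.prev;
    b->pending.prev->next = &t->entry;
    b->pending.prev = &t->entry;
}

/* Proper removal: unlink before the memory is freed or reused. */
static void toy_del_timer(struct toy_timer *t)
{
    assert(t != NULL);
    t->entry.prev->next = t->entry.next;
    t->entry.next->prev = t->entry.prev;
    t->entry.next = t->entry.prev = &t->entry;
    t->base = NULL;
}

/* Walk the pending list and count timers that no longer point at this base. */
static int toy_count_foreign(struct toy_base *b)
{
    int bad = 0;
    struct list_head *pos;

    for (pos = b->pending.next; pos != &b->pending; pos = pos->next) {
        struct toy_timer *t =
            (struct toy_timer *)((char *)pos - offsetof(struct toy_timer, entry));
        if (t->base != b)
            bad++;
    }
    return bad;
}

/* Scenario: t1 is removed properly, t2 is "reused" while still queued. */
static int toy_demo_stale_timer(void)
{
    struct toy_base base;
    struct toy_timer t1, t2, t3;

    toy_base_init(&base);
    toy_add_timer(&base, &t1);
    toy_add_timer(&base, &t2);
    toy_add_timer(&base, &t3);

    toy_del_timer(&t1);   /* correct teardown: t1 leaves the list */
    t2.base = NULL;       /* simulated reuse without del_timer() */

    return toy_count_foreign(&base);   /* only t2 is flagged */
}
```

In this model `toy_demo_stale_timer()` reports exactly one bad entry: the timer that was unlinked first is invisible to the walk, while the one overwritten in place is still reachable from the list.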
Comment 2 Oleg Nesterov 2007-12-25 09:55:15 UTC
Created attachment 14183
debug patch to detect the corrupted pending timers

WARNING: the patch is not tested.

Still, any chance you can reproduce the bug with this patch?
Comment 3 Badalian Slava 2007-12-25 10:31:40 UTC
Hmm, these PCs must be highly available (HA), so I can't test patches on them =(

I replaced
BUG_ON(xxx)
with
if (xxx) panic();

I think this helps me while I wait for the bug to be fixed (a reboot is better than a freeze).
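
A user-space sketch of the substitution described in this comment follows. `toy_panic()` and `TOY_PANIC_ON()` are hypothetical names invented for illustration; the stub records the message instead of halting so the example can run outside the kernel. Two facts about the real interface are worth keeping in mind: the kernel's panic() takes a printf-style format string, so it needs a message argument, and whether a panic ends in a reboot depends on the panic timeout (the `panic=` boot parameter or the `kernel.panic` sysctl).

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

static char toy_last_panic[128];   /* last message passed to toy_panic() */
static int toy_panic_count;        /* how many times toy_panic() was called */

/* Hypothetical stand-in for panic(const char *fmt, ...): records, does not halt. */
static void toy_panic(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(toy_last_panic, sizeof(toy_last_panic), fmt, ap);
    va_end(ap);
    toy_panic_count++;
}

/* "if (cond) panic(...)" in place of BUG_ON(cond), with a message attached. */
#define TOY_PANIC_ON(cond)                                        \
    do {                                                          \
        if (cond)                                                 \
            toy_panic("check failed: %s at %s:%d",                \
                      #cond, __FILE__, __LINE__);                 \
    } while (0)

/* Runs one passing and one failing check; returns the number of panics. */
static int toy_demo_panic_on(void)
{
    int base_matches = 0;

    toy_panic_count = 0;
    TOY_PANIC_ON(base_matches);    /* condition false: nothing happens */
    TOY_PANIC_ON(!base_matches);   /* condition true: one recorded panic */
    return toy_panic_count;
}
```

The trade-off is the one given above: the check still stops the machine when the list is found corrupted, but a panic can be followed by an automatic restart, which suits an unattended box better than a hang.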
Comment 4 Marek Kierdelewicz 2007-12-27 16:46:04 UTC
Hi

Leigh Sharpe had what may be the same problem you are having right now. Read about it here:
http://lists.openwall.net/netdev/2007/12/05/43

He solved the problem by changing the order of commands in his script. Below is his original reply to my inquiry:

----------------------------
 My issue wasn't with an SMP kernel. I only had one CPU in the machine I was playing with.
 I actually found the reason for the problem I was having. It seems that when adding qdiscs or classes, you need to do it in the right order. If you add filters to a qdisc, redirecting traffic to a class or qdisc which has not yet been set up, the kernel will crash. If classes are set up prior to adding the filters, it doesn't cause a problem.

Leigh
----------------------------

Good luck Badalian.
Comment 5 Badalian Slava 2007-12-27 22:40:50 UTC
My order is:
Class add
Qdisc add
Filter add
Comment 6 Badalian Slava 2007-12-27 22:41:37 UTC
The first rule is:
qdisc add dev eth1 root handle 1 htb default 7
Comment 7 Jarek Poplawski 2007-12-28 00:09:50 UTC
I think this order could especially matter with ifb which is used
by Leigh and Marek, but even if changing the order helps it's not
the proper bug fix, after all.

BTW, Slava, could you add here some more details like .config and
maybe a bit more about these rules (addresses masked of course)?
I especially wonder if you are using something like ifb, vlan or
bonding? Of course the outcome from Oleg's patch should still be
the most interesting.
Comment 8 Jarek Poplawski 2007-12-28 00:34:30 UTC
One more BTW: Oleg, your patch looks very interesting (clever) and
I hope you'll manage to add this to the kernel under some new or
existing debugging option - it seems to be very useful considering
the 'optimistic' way of deleting timers in many places. But, unless
I'm missing something, it seems you could try to add 'recovering' of
the timers queued after the buggy one, e.g. with some backwards loop?
Comment 9 Oleg Nesterov 2007-12-28 03:07:24 UTC
On 12/28, bugme-daemon@bugzilla.kernel.org wrote:
>
> ------- Comment #8 from jarkao2@gmail.com  2007-12-28 00:34 -------
> I hope you'll manage to add this to the kernel under some new or
> existing debugging option

Will try to do so. The patch is simple, but unfortunately it needs a lot
of #ifdef's to minimize the impact without CONFIG_DEBUG_XXX.

> But, maybe
> I miss something, it seems you could try to add 'recovering' of
> these timers after the buggy one, e.g.  with some backwards loop?

Yes, Jarek. I didn't dare to do this right now, just did a minimal hack
to catch the bug. It also needs other changes. Say, __run_timers() should
also check the timer.

Btw, please look at the first attachment at

	http://bugzilla.kernel.org/show_bug.cgi?id=9180

This patch is similar, and it does recover the list in run_workqueue().
But it is not "complete" as well.

Oleg.
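
The #ifdef cost mentioned in this reply can be sketched as follows. This is not the attached patch, whose contents are not shown in this thread; it is a hypothetical user-space illustration in which `TOY_DEBUG_TIMERS` stands in for CONFIG_DEBUG_XXX and all `toy_*` names are invented. The debug-only field and the real helpers exist only when the switch is defined; otherwise the helpers collapse to no-ops so callers need no #ifdef of their own.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DEBUG_TIMERS 1   /* hypothetical switch; remove to compile the checks out */

typedef void (*toy_timer_fn)(unsigned long);

struct toy_timer {
    toy_timer_fn function;
    unsigned long data;
#ifdef TOY_DEBUG_TIMERS
    toy_timer_fn debug_function;   /* copy of 'function' taken when queued */
#endif
};

#ifdef TOY_DEBUG_TIMERS
/* Remember the handler at queue time. */
static void toy_debug_record(struct toy_timer *t)
{
    assert(t != NULL);
    t->debug_function = t->function;
}

/* Returns 1 if the handler still matches the recorded copy. */
static int toy_debug_ok(const struct toy_timer *t)
{
    assert(t != NULL);
    return t->function == t->debug_function;
}
#else
/* Without the switch the helpers cost nothing and always report success. */
static void toy_debug_record(struct toy_timer *t) { (void)t; }
static int toy_debug_ok(const struct toy_timer *t) { (void)t; return 1; }
#endif

static void toy_handler(unsigned long data) { (void)data; }

/* Returns before*10 + after, where each digit is the result of the check. */
static int toy_demo_debug_check(void)
{
    struct toy_timer t = { toy_handler, 0 };
    int before, after;

    toy_debug_record(&t);
    before = toy_debug_ok(&t);   /* timer untouched */
    t.function = NULL;           /* simulated partial overwrite */
    after = toy_debug_ok(&t);    /* mismatch is visible only with the switch on */
    return before * 10 + after;
}
```

A copy stored inside the same structure only catches partial overwrites; if the whole object is reused, both fields change together, so a check of this kind is a heuristic rather than a guarantee.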
Comment 10 Jarek Poplawski 2007-12-28 05:04:28 UTC
> Will try to do. The patch is simple, but unfortunately it needs a lot
> of #ifdef's to minimize the impact without CONFIG_DEBUG_XXX.

Once you get the idea it may look simple... but it took me some
time. I thought about using a static table for this, like in
lockdep, which is why I called your way clever. And I 'personally'
like ifdefs: they make the code harder to read at first, but IMHO
they make it easier e.g. to debug it later, when you can easily
skip what certainly doesn't matter. But it would be very bad to
abstain from adding debugging (and tolerate such mysterious
crashes) for such aesthetic reasons.

> This patch is similar, and it does recover the list in run_workqueue().
> But it is not "complete" as well.

Yes, I've thought about mentioning workqueue here too... Of course,
this is needed as well - maybe a bit less after fixing the cancelling
in workqueues... Probably some more solutions could be reimplemented
too (lockdep checks for del_timer_sync?!). I also wonder why there is
no such simple thing as del_timer_last() or _exit(), which could be
required in all xyz_exit() or xyz_destroy() functions to mark that the
timer_list structure can't be used by mod_timer() anymore without
init_timer()?! It could also be considered whether checking only the
function field is reliable enough for such debugging: maybe some
checksum would be even better. But, of course, there is no need to wait
with adding basic debugging like this now (especially to -mm). Any
additional features or improvements could be done later. Thanks!
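
The del_timer_last() idea raised in this comment can be sketched in user space. Everything here is hypothetical: no such function is confirmed to exist in the kernel by this thread, and the `toy_*` names and the three-state model are assumptions made for illustration. The point is only the rule being proposed: once a timer has been retired, re-arming it fails until it is initialized again.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_READY, TOY_PENDING, TOY_DEAD };

struct toy_timer {
    enum toy_state state;
    unsigned long expires;
};

/* (Re)initialize the timer; this is the only way out of TOY_DEAD. */
static void toy_init_timer(struct toy_timer *t)
{
    assert(t != NULL);
    t->state = TOY_READY;
    t->expires = 0;
}

/* Arm or re-arm the timer. Returns 0 on success, -1 if it was retired. */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    assert(t != NULL);
    if (t->state == TOY_DEAD)
        return -1;
    t->expires = expires;
    t->state = TOY_PENDING;
    return 0;
}

/* Ordinary delete: the timer may be armed again later. */
static void toy_del_timer(struct toy_timer *t)
{
    assert(t != NULL);
    if (t->state == TOY_PENDING)
        t->state = TOY_READY;
}

/* Final delete for exit/destroy paths: forbids re-arming. */
static void toy_del_timer_last(struct toy_timer *t)
{
    assert(t != NULL);
    t->state = TOY_DEAD;
}

/* Returns 1 if all three re-arm attempts behave as the rule requires. */
static int toy_demo_del_timer_last(void)
{
    struct toy_timer t;
    int after_del, after_last, after_init;

    toy_init_timer(&t);
    toy_mod_timer(&t, 100);
    toy_del_timer(&t);
    after_del = toy_mod_timer(&t, 200);    /* allowed: ordinary delete */

    toy_del_timer_last(&t);
    after_last = toy_mod_timer(&t, 300);   /* refused: timer was retired */

    toy_init_timer(&t);
    after_init = toy_mod_timer(&t, 400);   /* allowed again after init */

    return after_del == 0 && after_last == -1 && after_init == 0;
}
```

A real implementation would also have to decide what a refused mod_timer() should do (warn, return an error, or trap), which this sketch leaves as a plain error code.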
Comment 11 Jarek Poplawski 2007-12-28 05:27:51 UTC
> [...] maybe some checksum would be even better [...]

As a matter of fact, the simplest and most reliable way would be to
store some pointer or key which could be verified by a call to mm to
see if it's still valid (this part of memory not kfreed in the
meantime), but I don't know how much I'm dreaming with this...
Comment 12 Badalian Slava 2008-01-28 22:50:50 UTC
Hmm, with recent kernels the system works normally. I think the bug can be closed; I'll reopen it if the bug is still in the kernel and I can reproduce it.