Bug 10489
Summary: | Kprobe smoke test lockdep warning | ||
---|---|---|---|
Product: | Other | Reporter: | jean-marc LACROIX (jeanmarc.lacroix) |
Component: | Other | Assignee: | other_other |
Status: | CLOSED OBSOLETE | ||
Severity: | normal | CC: | alan, bunk |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.25 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
jean-marc LACROIX
2008-04-20 11:35:12 UTC
plateform is -> cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 8 model name : Pentium III (Coppermine) stepping : 3 cpu MHz : 664.519 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse up bogomips : 1334.17 clflush size : 32 http://bugzilla.kernel.org/show_bug.cgi?id=10489 Please reply via email (more information from the submitter is in the Bugzilla entry). ReportedBy: jeanmarc.lacroix@free.Fr Earliest failing kernel version: 2.6.25 Distribution: DEbian Etch Hardware Environment: PIII Software Environment: Problem Description: Warning at boot after running all embedded test Steps to reproduce: boot !!! 2.701005] Registering sys device '<NULL>' [ 2.702320] Registering sys device '<NULL>' [ 2.703711] Registering sys device '<NULL>' [ 2.704536] Initializing RT-Tester: OK [ 2.704824] device: 'snapshot': device_add [ 2.705140] PM: Adding info for No Bus:snapshot [ 2.706804] audit: initializing netlink socket (disabled) [ 2.707199] type=2000 audit(1208713807.707:1): initialized [ 2.707509] Kprobe smoke test started [ 2.709300] ------------[ cut here ]------------ [ 2.709420] WARNING: at kernel/lockdep.c:2658 check_flags+0x4d/0x12c() [ 2.709541] Modules linked in: [ 2.709588] Pid: 1, comm: swapper Not tainted 2.6.25.jml.057 #1 [ 2.709588] [<c0126acc>] warn_on_slowpath+0x41/0x51 [ 2.709588] [<c010bafc>] ? save_stack_trace+0x1d/0x3b [ 2.709588] [<c0140a83>] ? save_trace+0x37/0x89 [ 2.709588] [<c011987d>] ? kernel_map_pages+0x103/0x11c [ 2.709588] [<c0109803>] ? native_sched_clock+0xca/0xea [ 2.709588] [<c0142958>] ? mark_held_locks+0x41/0x5c [ 2.709588] [<c0382580>] ? kprobe_exceptions_notify+0x322/0x3af [ 2.709588] [<c0142aff>] ? trace_hardirqs_on+0xf1/0x119 [ 2.709588] [<c03825b3>] ? kprobe_exceptions_notify+0x355/0x3af [ 2.709588] [<c0140823>] check_flags+0x4d/0x12c [ 2.709588] [<c0143c9d>] lock_release+0x58/0x195 [ 2.709588] [<c038347c>] ? __atomic_notifier_call_chain+0x0/0x80 [ 2.709588] [<c03834d6>] __atomic_notifier_call_chain+0x5a/0x80 [ 2.709588] [<c0383508>] atomic_notifier_call_chain+0xc/0xe [ 2.709588] [<c013b6d4>] notify_die+0x2d/0x2f [ 2.709588] [<c038168a>] do_debug+0x67/0xfe [ 2.709588] [<c0381287>] debug_stack_correct+0x27/0x30 [ 2.709588] [<c01564c0>] ? kprobe_target+0x1/0x34 [ 2.709588] [<c0156572>] ? init_test_probes+0x50/0x186 [ 2.709588] [<c04fae48>] init_kprobes+0x85/0x8c [ 2.709588] [<c04e947b>] kernel_init+0x13d/0x298 [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 [ 2.709588] [<c0105ef7>] kernel_thread_helper+0x7/0x10 [ 2.709588] ======================= [ 2.709588] ---[ end trace 778e504de7e3b1e3 ]--- [ 2.709588] possible reason: unannotated irqs-off. [ 2.709588] irq event stamp: 370065 [ 2.709588] hardirqs last enabled at (370065): [<c0382580>] kprobe_exceptions_notify+0x322/0x3af [ 2.709588] hardirqs last disabled at (370064): [<c0381bb7>] do_int3+0x1d/0x7d [ 2.709588] softirqs last enabled at (370050): [<c012b464>] __do_softirq+0xfa/0x100 [ 2.709588] softirqs last disabled at (370045): [<c0107438>] do_softirq+0x74/0xd9 [ 2.714751] Kprobe smoke test passed successfully [ 2.716150] Total HugeTLB memory allocated, 0 [ 2.717288] VFS: Disk quotas dquot_6.5.1 [ 2.717689] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) [ 2.719630] device class 'bsg': registering [ 2.720568] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) [ 2.720619] io scheduler noop registered [ 2.720732] io scheduler anticipatory registered [ 2.720848] io scheduler deadline registered [ 2.721089] io scheduler cfq registered (default) [ 2.721354] pci 0000:00:01.0: Boot video device cu Adrian On Mon, Apr 21, 2008 at 5:06 PM, Adrian Bunk <bunk@kernel.org> wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=10489 > > Please reply via email (more information from the submitter is in the > Bugzilla entry). > > > ReportedBy: jeanmarc.lacroix@free.Fr > > Earliest failing kernel version: 2.6.25 > Distribution: DEbian Etch > Hardware Environment: PIII > Software Environment: > Problem Description: Warning at boot after running all > embedded test > > Steps to reproduce: boot !!! > > > 2.701005] Registering sys device '<NULL>' > [ 2.702320] Registering sys device '<NULL>' > [ 2.703711] Registering sys device '<NULL>' > [ 2.704536] Initializing RT-Tester: OK > [ 2.704824] device: 'snapshot': device_add > [ 2.705140] PM: Adding info for No Bus:snapshot > [ 2.706804] audit: initializing netlink socket (disabled) > [ 2.707199] type=2000 audit(1208713807.707:1): initialized > [ 2.707509] Kprobe smoke test started > [ 2.709300] ------------[ cut here ]------------ > [ 2.709420] WARNING: at kernel/lockdep.c:2658 check_flags+0x4d/0x12c() > [ 2.709541] Modules linked in: > [ 2.709588] Pid: 1, comm: swapper Not tainted 2.6.25.jml.057 #1 > [ 2.709588] [<c0126acc>] warn_on_slowpath+0x41/0x51 > [ 2.709588] [<c010bafc>] ? save_stack_trace+0x1d/0x3b > [ 2.709588] [<c0140a83>] ? save_trace+0x37/0x89 > [ 2.709588] [<c011987d>] ? kernel_map_pages+0x103/0x11c > [ 2.709588] [<c0109803>] ? native_sched_clock+0xca/0xea > [ 2.709588] [<c0142958>] ? mark_held_locks+0x41/0x5c > [ 2.709588] [<c0382580>] ? kprobe_exceptions_notify+0x322/0x3af > [ 2.709588] [<c0142aff>] ? trace_hardirqs_on+0xf1/0x119 > [ 2.709588] [<c03825b3>] ? kprobe_exceptions_notify+0x355/0x3af > [ 2.709588] [<c0140823>] check_flags+0x4d/0x12c > [ 2.709588] [<c0143c9d>] lock_release+0x58/0x195 > [ 2.709588] [<c038347c>] ? __atomic_notifier_call_chain+0x0/0x80 > [ 2.709588] [<c03834d6>] __atomic_notifier_call_chain+0x5a/0x80 > [ 2.709588] [<c0383508>] atomic_notifier_call_chain+0xc/0xe > [ 2.709588] [<c013b6d4>] notify_die+0x2d/0x2f > [ 2.709588] [<c038168a>] do_debug+0x67/0xfe > [ 2.709588] [<c0381287>] debug_stack_correct+0x27/0x30 > [ 2.709588] [<c01564c0>] ? kprobe_target+0x1/0x34 > [ 2.709588] [<c0156572>] ? init_test_probes+0x50/0x186 > [ 2.709588] [<c04fae48>] init_kprobes+0x85/0x8c > [ 2.709588] [<c04e947b>] kernel_init+0x13d/0x298 > [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 > [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 > [ 2.709588] [<c0105ef7>] kernel_thread_helper+0x7/0x10 > [ 2.709588] ======================= > [ 2.709588] ---[ end trace 778e504de7e3b1e3 ]--- > [ 2.709588] possible reason: unannotated irqs-off. > [ 2.709588] irq event stamp: 370065 > [ 2.709588] hardirqs last enabled at (370065): [<c0382580>] > kprobe_exceptions_notify+0x322/0x3af > [ 2.709588] hardirqs last disabled at (370064): [<c0381bb7>] > do_int3+0x1d/0x7d > [ 2.709588] softirqs last enabled at (370050): [<c012b464>] > __do_softirq+0xfa/0x100 > [ 2.709588] softirqs last disabled at (370045): [<c0107438>] > do_softirq+0x74/0xd9 > [ 2.714751] Kprobe smoke test passed successfully > [ 2.716150] Total HugeTLB memory allocated, 0 > [ 2.717288] VFS: Disk quotas dquot_6.5.1 > [ 2.717689] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) > [ 2.719630] device class 'bsg': registering > [ 2.720568] Block layer SCSI generic (bsg) driver version 0.4 loaded > (major > 253) > [ 2.720619] io scheduler noop registered > [ 2.720732] io scheduler anticipatory registered > [ 2.720848] io scheduler deadline registered > [ 2.721089] io scheduler cfq registered (default) > [ 2.721354] pci 0000:00:01.0: Boot video device > This is because kprobes (just like kmemcheck and mmiotrace) needs to disable interrupts while it single-steps the instruction. This is not dangerous, and should be fixed by doing as the warning suggests, by annotating the irqs-off. I don't remember the name of the function(s), it is probably somewhere in lockdep.h. Vegard Hi, I cannot reproduce it here, but this seems to be a likely fix. Vegard diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index 34a5912..6d58f6e 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -424,6 +424,7 @@ static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs) clear_btf(); regs->flags |= X86_EFLAGS_TF; regs->flags &= ~X86_EFLAGS_IF; + trace_hardirqs_off(); /* single step inline if the instruction is an int3 */ if (p->opcode == BREAKPOINT_INSTRUCTION) regs->ip = (unsigned long)p->addr; Reply-To: mhiramat@redhat.com Thank you for reporting. Actually, kprobes tries to fixup thread's flags in post_kprobe_handler (which is called from kprobe_exceptions_notify) by trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag is set in pt_regs->flags, true hardirq is still off until returning from do_debug. Thus, lockdep assumes that hardirq is off without annotation. IMHO, one possible solution is that fixing hardirq flags right after notify_die in do_debug instead of in post_kprobe_handler. Thanks, Reply-To: peterz@infradead.org On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote: > Thank you for reporting. > > Actually, kprobes tries to fixup thread's flags in post_kprobe_handler > (which is called from kprobe_exceptions_notify) by > trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag > is set in pt_regs->flags, true hardirq is still off until returning > from do_debug. Thus, lockdep assumes that hardirq is off without annotation. > > IMHO, one possible solution is that fixing hardirq flags right after > notify_die in do_debug instead of in post_kprobe_handler. My reply to BZ 10489: > [ 2.707509] Kprobe smoke test started > [ 2.709300] ------------[ cut here ]------------ > [ 2.709420] WARNING: at kernel/lockdep.c:2658 check_flags+0x4d/0x12c() > [ 2.709541] Modules linked in: > [ 2.709588] Pid: 1, comm: swapper Not tainted 2.6.25.jml.057 #1 > [ 2.709588] [<c0126acc>] warn_on_slowpath+0x41/0x51 > [ 2.709588] [<c010bafc>] ? save_stack_trace+0x1d/0x3b > [ 2.709588] [<c0140a83>] ? save_trace+0x37/0x89 > [ 2.709588] [<c011987d>] ? kernel_map_pages+0x103/0x11c > [ 2.709588] [<c0109803>] ? native_sched_clock+0xca/0xea > [ 2.709588] [<c0142958>] ? mark_held_locks+0x41/0x5c > [ 2.709588] [<c0382580>] ? kprobe_exceptions_notify+0x322/0x3af > [ 2.709588] [<c0142aff>] ? trace_hardirqs_on+0xf1/0x119 > [ 2.709588] [<c03825b3>] ? kprobe_exceptions_notify+0x355/0x3af > [ 2.709588] [<c0140823>] check_flags+0x4d/0x12c > [ 2.709588] [<c0143c9d>] lock_release+0x58/0x195 > [ 2.709588] [<c038347c>] ? __atomic_notifier_call_chain+0x0/0x80 > [ 2.709588] [<c03834d6>] __atomic_notifier_call_chain+0x5a/0x80 > [ 2.709588] [<c0383508>] atomic_notifier_call_chain+0xc/0xe > [ 2.709588] [<c013b6d4>] notify_die+0x2d/0x2f > [ 2.709588] [<c038168a>] do_debug+0x67/0xfe > [ 2.709588] [<c0381287>] debug_stack_correct+0x27/0x30 > [ 2.709588] [<c01564c0>] ? kprobe_target+0x1/0x34 > [ 2.709588] [<c0156572>] ? init_test_probes+0x50/0x186 > [ 2.709588] [<c04fae48>] init_kprobes+0x85/0x8c > [ 2.709588] [<c04e947b>] kernel_init+0x13d/0x298 > [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 > [ 2.709588] [<c04e933e>] ? kernel_init+0x0/0x298 > [ 2.709588] [<c0105ef7>] kernel_thread_helper+0x7/0x10 > [ 2.709588] ======================= > [ 2.709588] ---[ end trace 778e504de7e3b1e3 ]--- > [ 2.709588] possible reason: unannotated irqs-off. > [ 2.709588] irq event stamp: 370065 > [ 2.709588] hardirqs last enabled at (370065): [<c0382580>] > kprobe_exceptions_notify+0x322/0x3af > [ 2.709588] hardirqs last disabled at (370064): [<c0381bb7>] > do_int3+0x1d/0x7d > [ 2.709588] softirqs last enabled at (370050): [<c012b464>] > __do_softirq+0xfa/0x100 > [ 2.709588] softirqs last disabled at (370045): [<c0107438>] > do_softirq+0x74/0xd9 > [ 2.714751] Kprobe smoke test passed successfully how I love this stuff... Ok, do_debug() is a trap, this can happen at any time regardless of the machine's IRQ state. So the first thing we do is fix up the IRQ state. Then we call this die notifier stuff; and return with messed up IRQ state... YAY. So, kprobes fudges it.. notify_die(DIE_DEBUG) kprobe_exceptions_notify() post_kprobe_handler() modify regs->flags trace_hardirqs_fixup_flags(regs->flags); <--- must be it So what's the use of modifying flags if they're not meant to take effect at some point. /me tries to reproduce issue; enable kprobes test thingy && boot OK, that reproduces.. So the below makes it work - but I'm not getting this code; at the time I wrote that stuff I CC'ed each and every kprobe maintainer listed in the usual places but got no reposonse - can some please explain this stuff to me? Are the saved flags only for the TF bit or are they made in full effect later (and if so, where) ? Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index 34a5912..fe71e64 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -865,7 +865,6 @@ static int __kprobes post_kprobe_handler(struct pt_regs *regs) resume_execution(cur, regs, kcb); regs->flags |= kcb->kprobe_saved_flags; - trace_hardirqs_fixup_flags(regs->flags); /* Restore back the original saved kprobes variables and continue. */ if (kcb->kprobe_status == KPROBE_REENTER) { Reply-To: peterz@infradead.org On Tue, 2008-04-22 at 15:09 +0200, Peter Zijlstra wrote: > On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote: > > Thank you for reporting. > > > > Actually, kprobes tries to fixup thread's flags in post_kprobe_handler > > (which is called from kprobe_exceptions_notify) by > > trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag > > is set in pt_regs->flags, true hardirq is still off until returning > > from do_debug. Thus, lockdep assumes that hardirq is off without > annotation. Ah, can you clarrify? pt_regs->flags will only be set when returning to the original trap site? in that case we should not need a lockdep annotation I guess, unless its allowed and exptected for the int3 site to change IRQ state. Reply-To: mhiramat@redhat.com Peter Zijlstra wrote: > On Tue, 2008-04-22 at 15:09 +0200, Peter Zijlstra wrote: >> On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote: >>> Thank you for reporting. >>> >>> Actually, kprobes tries to fixup thread's flags in post_kprobe_handler >>> (which is called from kprobe_exceptions_notify) by >>> trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag >>> is set in pt_regs->flags, true hardirq is still off until returning >>> from do_debug. Thus, lockdep assumes that hardirq is off without >>> annotation. > > Ah, can you clarrify? pt_regs->flags will only be set when returning to > the original trap site? in that case we should not need a lockdep > annotation I guess, unless its allowed and exptected for the int3 site > to change IRQ state. As far as I took a look at the lockdep, your suggestion is correct. post_kprobe_handler should not set a lockdep annotation, because processor's IF is not changed yet in that time. Thanks, Reply-To: peterz@infradead.org On Tue, 2008-04-22 at 11:25 -0400, Masami Hiramatsu wrote: > Peter Zijlstra wrote: > > On Tue, 2008-04-22 at 15:09 +0200, Peter Zijlstra wrote: > >> On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote: > >>> Thank you for reporting. > >>> > >>> Actually, kprobes tries to fixup thread's flags in post_kprobe_handler > >>> (which is called from kprobe_exceptions_notify) by > >>> trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag > >>> is set in pt_regs->flags, true hardirq is still off until returning > >>> from do_debug. Thus, lockdep assumes that hardirq is off without > annotation. > > > > Ah, can you clarrify? pt_regs->flags will only be set when returning to > > the original trap site? in that case we should not need a lockdep > > annotation I guess, unless its allowed and exptected for the int3 site > > to change IRQ state. > > As far as I took a look at the lockdep, your suggestion is correct. > post_kprobe_handler should not set a lockdep annotation, because > processor's IF is not changed yet in that time. That much was clear; but when _will_ it be changed? Reply-To: mhiramat@redhat.com Hi Peter, Thank you for the patch, Peter Zijlstra wrote: > So the below makes it work - but I'm not getting this code; at the time > I wrote that stuff I CC'ed each and every kprobe maintainer listed in > the usual places but got no reposonse - can some please explain this > stuff to me? Sure, I came to the same decision. It is enough to remove unnecessary trace_hardirqs_fixup_flags() from post_kprobe_handler. pt_regs->flags is just a variable on stack -- not effective flags in that time. So we don't need to change thread's hardirq flag. > Are the saved flags only for the TF bit or are they made in full effect > later (and if so, where) ? The saved flags is for TF and IF bits. Actually, kprobes does following: --- do_int3() kprobe_exceptions_notify() kprobe_handler() prepare_singlestep() : save regs->flags, set TF and clear IF on regs->flags "iret" : pop regs->flags image to EFLAGS reg. <single step an instruction> do_debug() kprobe_exceptions_notify() post_kprobe_handler() : restore regs->flags (clearing TF and restoring IF) "iret" : pop regs->flags image to EFLAGS reg. --- While executing a single step, EFLAGS is different from the original EFLAGS. However, it will be recovered soon by post_kprobe_handler. NOTE: if the single-stepping instruction is a pushf, kprobes writes saved EFLAGS to the top of stack. > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> I tested it on x86-64/i386. Acked-by: Masami Hiramatsu <mhiramat@redhat.com> > --- > diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c > index 34a5912..fe71e64 100644 > --- a/arch/x86/kernel/kprobes.c > +++ b/arch/x86/kernel/kprobes.c > @@ -865,7 +865,6 @@ static int __kprobes post_kprobe_handler(struct pt_regs > *regs) > > resume_execution(cur, regs, kcb); > regs->flags |= kcb->kprobe_saved_flags; > - trace_hardirqs_fixup_flags(regs->flags); > > /* Restore back the original saved kprobes variables and continue. */ > if (kcb->kprobe_status == KPROBE_REENTER) { > > > Reply-To: mhiramat@redhat.com Peter Zijlstra wrote: > On Tue, 2008-04-22 at 11:25 -0400, Masami Hiramatsu wrote: >> Peter Zijlstra wrote: >>> On Tue, 2008-04-22 at 15:09 +0200, Peter Zijlstra wrote: >>>> On Mon, 2008-04-21 at 18:54 -0400, Masami Hiramatsu wrote: >>>>> Thank you for reporting. >>>>> >>>>> Actually, kprobes tries to fixup thread's flags in post_kprobe_handler >>>>> (which is called from kprobe_exceptions_notify) by >>>>> trace_hardirqs_fixup_flags(pt_regs->flags). However, even the irq flag >>>>> is set in pt_regs->flags, true hardirq is still off until returning >>>>> from do_debug. Thus, lockdep assumes that hardirq is off without >>>>> annotation. >>> Ah, can you clarrify? pt_regs->flags will only be set when returning to >>> the original trap site? in that case we should not need a lockdep >>> annotation I guess, unless its allowed and exptected for the int3 site >>> to change IRQ state. >> As far as I took a look at the lockdep, your suggestion is correct. >> post_kprobe_handler should not set a lockdep annotation, because >> processor's IF is not changed yet in that time. > > That much was clear; but when _will_ it be changed? processor's IF bit is changed when an iret(q) is issued. In other words, when returning from a debug/int3 exception. Reply-To: mhiramat@redhat.com Masami Hiramatsu wrote: > processor's IF bit is changed when an iret(q) is issued. > In other words, when returning from a debug/int3 exception. By the way, when returning from an exception on x86-64, it seems paranoidexit@entry_64.S doesn't use TRACE_IRQS_IRETQ except for it executes swapgs, does it? If not, I guess the lockdep on x86-64 may causes an warning by below code: spin_lock(...) kprobe_target_function() spin_unlock(...) Thanks, |