Bug 35532
Summary: | Caught 64-bit read from uninitialized memory in process_one_work | ||
---|---|---|---|
Product: | Memory Management | Reporter: | Christian Casteyde (casteyde.christian) |
Component: | Page Allocator | Assignee: | Tejun Heo (tj) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | florian |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.39 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Full dmesg output for 2.6.39
lockdep-clear.patch |
Description
Christian Casteyde
2011-05-21 08:52:47 UTC
Created attachment 58862 [details]
Full dmesg output for 2.6.39
I think I'll assign this one to Tejun then run away :) Update: still present in 3.0-rc1. Hmmm... I can't reproduce the problem here. Can you please determine which line the RIP corresponds to using gdb? ie. build the kernel with debug info, run gdb on vmlinux and run "l *0xffffffff810900e3". Also attaching disassembly of the offending function (process_one_work) would be helpful. Thank you. I've checked with 3.0-rc2, the problem is still there. I've got several occurrences of this warning. For this one: PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (ffff8801c6783c58) 0000000000000000ee0caa81ffffffff00000000000000000000000000000000 i i i i i i i i i i i i i i i i i i i i u u u u u u u u u u u u ^ Pid: 1, comm: swapper Not tainted 3.0.0-rc2 #8 Acer Aspire 7750G/JE70_HR RIP: 0010:[<ffffffff8108f3b3>] [<ffffffff8108f3b3>] process_one_work+0x83/0x450 RSP: 0018:ffff8801c6507d80 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff8801c7418300 RCX: 0000000000000000 RDX: ffff8801c79d5800 RSI: ffff8801c6783c10 RDI: ffff8801c7418300 RBP: ffff8801c6507e10 R08: ffff8801c64f8058 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801c6783c10 R13: ffff8801c79d5800 R14: ffff8801c780ca00 R15: 000000000000003b FS: 0000000000000000(0000) GS:ffff8801c7800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff8801c7430364 CR3: 0000000001c03000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400 [<ffffffff810927cc>] worker_thread+0x15c/0x330 [<ffffffff81097396>] kthread+0xa6/0xb0 [<ffffffff817b3c54>] kernel_thread_helper+0x4/0x10 [<ffffffffffffffff>] 0xffffffffffffffff \_SB_.PCI0:_OSC invalid UUID _OSC request data:1 8 1f I get this: gdb vmlinux GNU gdb (GDB) 7.2 Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-slackware-linux". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/src/linux-2.6.39/vmlinux...rundone. (gdb) l *0xffffffff8108f3b3 0xffffffff8108f3b3 is in process_one_work (kernel/workqueue.c:1815). 1810 * inside the function that is called from it, this we need to 1811 * take into account for lockdep too. To avoid bogus "held 1812 * lock freed" warnings as well as problems when looking into 1813 * work->lockdep_map, make a copy and use that here. 1814 */ 1815 struct lockdep_map lockdep_map = work->lockdep_map; 1816 #endif 1817 /* 1818 * A single work shouldn't be executed concurrently by 1819 * multiple workers on a single cpu. Check whether anyone is (gdb) disass 0xffffffff8108f3b3 Dump of assembler code for function process_one_work: 0xffffffff8108f330 <+0>: push %rbp 0xffffffff8108f331 <+1>: mov %rsp,%rbp 0xffffffff8108f334 <+4>: push %r15 0xffffffff8108f336 <+6>: mov %rsi,%r15 0xffffffff8108f339 <+9>: push %r14 0xffffffff8108f33b <+11>: push %r13 0xffffffff8108f33d <+13>: xor %r13d,%r13d 0xffffffff8108f340 <+16>: push %r12 0xffffffff8108f342 <+18>: mov %rsi,%r12 0xffffffff8108f345 <+21>: push %rbx 0xffffffff8108f346 <+22>: mov %rdi,%rbx 0xffffffff8108f349 <+25>: sub $0x68,%rsp 0xffffffff8108f34d <+29>: mov (%rsi),%rax 0xffffffff8108f350 <+32>: mov %rax,%rdx 0xffffffff8108f353 <+35>: and $0xfffffffffffffe00,%rdx 0xffffffff8108f35a <+42>: test $0x4,%al 0xffffffff8108f35c <+44>: mov %rsi,%rax 0xffffffff8108f35f <+47>: cmovne %rdx,%r13 0xffffffff8108f363 <+51>: shr $0x6,%rax 0xffffffff8108f367 <+55>: mov 0x0(%r13),%r14 0xffffffff8108f36b <+59>: shr $0xc,%r15 0xffffffff8108f36f <+63>: add %rax,%r15 0xffffffff8108f372 <+66>: mov 0x8(%r13),%rax 0xffffffff8108f376 <+70>: and $0x3f,%r15d 0xffffffff8108f37a <+74>: mov (%rax),%eax 0xffffffff8108f37c <+76>: add $0x8,%r15 0xffffffff8108f380 <+80>: mov %eax,-0x64(%rbp) 0xffffffff8108f383 <+83>: mov 0x18(%rsi),%rax 0xffffffff8108f387 <+87>: mov %rax,-0x78(%rbp) 0xffffffff8108f38b <+91>: mov 0x20(%rsi),%rax 0xffffffff8108f38f <+95>: mov %rax,-0x60(%rbp) 0xffffffff8108f393 <+99>: mov 0x28(%rsi),%rax 0xffffffff8108f397 <+103>: mov %rax,-0x58(%rbp) 0xffffffff8108f39b <+107>: mov 0x30(%rsi),%rax 0xffffffff8108f39f <+111>: mov %rax,-0x50(%rbp) 0xffffffff8108f3a3 <+115>: mov 0x38(%rsi),%rax 0xffffffff8108f3a7 <+119>: mov %rax,-0x48(%rbp) 0xffffffff8108f3ab <+123>: mov 0x40(%rsi),%rax 0xffffffff8108f3af <+127>: mov %rax,-0x40(%rbp) 0xffffffff8108f3b3 <+131>: mov 0x48(%rsi),%rax 0xffffffff8108f3b7 <+135>: mov 0x38(%r14,%r15,8),%rsi 0xffffffff8108f3bc <+140>: mov %rax,-0x38(%rbp) 0xffffffff8108f3c0 <+144>: test %rsi,%rsi 0xffffffff8108f3c3 <+147>: jne 0xffffffff8108f3d8 <process_one_work+168> 0xffffffff8108f3c5 <+149>: jmp 0xffffffff8108f400 <process_one_work+208> 0xffffffff8108f3c7 <+151>: nopw 0x0(%rax,%rax,1) 0xffffffff8108f3d0 <+160>: mov (%rsi),%rsi 0xffffffff8108f3d3 <+163>: test %rsi,%rsi 0xffffffff8108f3d6 <+166>: je 0xffffffff8108f400 <process_one_work+208> 0xffffffff8108f3d8 <+168>: cmp 0x10(%rsi),%r12 0xffffffff8108f3dc <+172>: jne 0xffffffff8108f3d0 <process_one_work+160> 0xffffffff8108f3de <+174>: mov %r12,%rdi 0xffffffff8108f3e1 <+177>: add $0x20,%rsi 0xffffffff8108f3e5 <+181>: xor %edx,%edx 0xffffffff8108f3e7 <+183>: callq 0xffffffff8108ee80 <move_linked_works> 0xffffffff8108f3ec <+188>: add $0x68,%rsp 0xffffffff8108f3f0 <+192>: pop %rbx 0xffffffff8108f3f1 <+193>: pop %r12 0xffffffff8108f3f3 <+195>: pop %r13 0xffffffff8108f3f5 <+197>: pop %r14 0xffffffff8108f3f7 <+199>: pop %r15 0xffffffff8108f3f9 <+201>: leaveq 0xffffffff8108f3fa <+202>: retq The others are exactly at the same RIP. Update: Still present in 3.0-rc7 Created attachment 65552 [details]
lockdep-clear.patch
Sorry, missed the last post. Can you please test whether the attached patch makes the problem go away?
Thanks.
Yes, it works well. Thanks Awesome. Patch posted upstream. http://article.gmane.org/gmane.linux.kernel/1167447 Thank you. A patch referencing this bug report has been merged in Linux v3.1-rc1: commit f59de8992aa6dc85e81aadc26b0f69e17809721d Author: Tejun Heo <tj@kernel.org> Date: Thu Jul 14 15:19:09 2011 +0200 lockdep: Clear whole lockdep_map on initialization Indeed I cannot reproduce since 3.1. Closing |