Bug 9096 - NULL pointer dereference rb_erase DWARF2 unwinder stuck... thread related?
Summary: NULL pointer dereference rb_erase DWARF2 unwinder stuck... thread related?
Status: REJECTED INVALID
Alias: None
Product: Other
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 blocking
Assignee: other_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-09-28 06:57 UTC by Pedro Fortuny Ayuso
Modified: 2007-09-30 01:34 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.18-1.2798.fc6 #1
Subsystem:
Regression: ---
Bisected commit-id:


Description Pedro Fortuny Ayuso 2007-09-28 06:57:26 UTC
Most recent kernel where this bug did not occur: --
Distribution: Fedora-core 6
Hardware Environment: Intel Xeon Dual Core (Pentium D 925 3000MHz), Dell PowerEdge SC440
Software Environment: samba-3.0.23, apache-2, openssh...
Problem Description: The system suddenly hangs. It still answers pings but nothing else (samba, ssh and apache stop answering), and a reboot is needed. dmesg shows an oops (attached at the bottom).

Steps to reproduce: Unable to say. The system has worked fine for 3+ months and this has happened just once, yesterday. Cannot test because the machine is in production.

relevant dmesg output:
Sep 27 10:45:00 svrdlv kernel: Unable to handle kernel NULL pointer dereference at 0000000000000010 RIP:
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff80222a56>] rb_erase+0x14c/0x2aa
Sep 27 10:45:00 svrdlv kernel: PGD 0
Sep 27 10:45:00 svrdlv kernel: Oops: 0000 [1] SMP
Sep 27 10:45:00 svrdlv kernel: last sysfs file: /class/net/eth0/address
Sep 27 10:45:00 svrdlv kernel: CPU 0
Sep 27 10:45:00 svrdlv kernel: Modules linked in: nls_utf8 cifs autofs4 ipv6 dm_mirror dm_multipath dm_mod video sbs i2c_ec button battery asus_acpi ac parport_pc lp parport intel_rng sg ide_cd serio_raw cdrom i2c_i801 i2c_core shpchp tg3 pcspkr ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Sep 27 10:45:00 svrdlv kernel: Pid: 8, comm: events/0 Tainted: G   M  2.6.18-1.2798.fc6 #1
Sep 27 10:45:00 svrdlv kernel: RIP: 0010:[<ffffffff80222a56>]  [<ffffffff80222a56>] rb_erase+0x14c/0x2aa
Sep 27 10:45:00 svrdlv kernel: RSP: 0018:ffff810037f49de8  EFLAGS: 00010282
Sep 27 10:45:00 svrdlv kernel: RAX: 0000000000000000 RBX: ffff810035215a48 RCX: ffff8100384f15c8
Sep 27 10:45:00 svrdlv kernel: RDX: 0000000000000000 RSI: ffffffff806e45e0 RDI: 0000000000000000
Sep 27 10:45:00 svrdlv kernel: RBP: ffffffff806e45e0 R08: ffff810035215448 R09: ffff81003fe2aa80
Sep 27 10:45:00 svrdlv kernel: R10: 0000000000000000 R11: 00007fff9d365bb0 R12: ffff81003fe2aa80
Sep 27 10:45:00 svrdlv kernel: R13: 0000000000000282 R14: 0000000000000000 R15: ffffffff803125ff
Sep 27 10:45:00 svrdlv kernel: FS:  0000000000000000(0000) GS:ffffffff80609000(0000) knlGS:0000000000000000
Sep 27 10:45:00 svrdlv kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Sep 27 10:45:00 svrdlv kernel: CR2: 0000000000000010 CR3: 0000000007083000 CR4: 00000000000006e0
Sep 27 10:45:00 svrdlv kernel: Process events/0 (pid: 8, threadinfo ffff810037f48000, task ffff810037fef7d0)
Sep 27 10:45:00 svrdlv kernel: Stack:  ffff8100384f1980 ffff8100384f1988 ffffffff80312659 0000000000000282
Sep 27 10:45:00 svrdlv kernel:  ffffffff8056c280 ffffffff8056c288 ffffffff8024bf5f ffffffff8024884a
Sep 27 10:45:00 svrdlv kernel:  ffff81003fe2aa80 ffffffff8024884a ffff81003fe15c80 ffff81003fe15cf0
Sep 27 10:45:00 svrdlv kernel: Call Trace:
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff80312659>] key_cleanup+0x5a/0xfb
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff8024bf5f>] run_workqueue+0x9a/0xed
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff8024893a>] worker_thread+0xf0/0x122
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff80232843>] kthread+0xf6/0x12a
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff8025cea5>] child_rip+0xa/0x11
Sep 27 10:45:00 svrdlv kernel: DWARF2 unwinder stuck at child_rip+0xa/0x11
Sep 27 10:45:00 svrdlv kernel: Leftover inexact backtrace:
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff8023274d>] kthread+0x0/0x12a
Sep 27 10:45:00 svrdlv kernel:  [<ffffffff8025ce9b>] child_rip+0x0/0x11
Sep 27 10:45:00 svrdlv kernel:
Sep 27 10:45:00 svrdlv kernel:
Sep 27 10:45:00 svrdlv kernel: Code: 48 8b 4f 10 48 85 c9 74 07 48 8b 01 a8 01 74 17 48 8b 47 08
Sep 27 10:45:00 svrdlv kernel: RIP  [<ffffffff80222a56>] rb_erase+0x14c/0x2aa
Sep 27 10:45:00 svrdlv kernel:  RSP <ffff810037f49de8>
Sep 27 10:45:00 svrdlv kernel: CR2: 0000000000000010
Comment 1 Dave Jones 2007-09-29 18:59:35 UTC
Tainted: G   M 

The 'M' is an indication that your CPU reported a machine check exception.
Nearly always the sign of hardware failure.
Could be bad ram, insufficient power, poor cooling etc. etc.

You're also running an *ancient* kernel with known security vulnerabilities, and many, many bugs fixed since then.  If this oops did have a non-hardware-related cause, it may even have been fixed at some point in the year since that kernel was released.  (2.6.20 fixed a security hole in the key handling code that would cause corrupted lists that may have manifested like this).
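
For reference, the taint string in the oops can be decoded mechanically. The following is only a rough sketch: the function name decode_taint and the table TAINT_FLAGS are made up for illustration, and the flag list is an assumption covering the letters printed by kernels of the 2.6.18 era (later kernels define more). It matches by letter rather than by column, so it tolerates collapsed whitespace in pasted logs.

```python
import re

# Assumed flag letters for 2.6.18-era kernels; newer kernels add more.
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "G": "only GPL-compatible modules loaded",
    "F": "module was force-loaded",
    "S": "SMP kernel on CPUs not designed for SMP",
    "R": "module was force-unloaded",
    "M": "machine check exception reported",
    "B": "bad page state encountered",
}


def decode_taint(line):
    """Return (letter, meaning) pairs for the taint field of an oops line.

    Returns an empty list for lines without a "Tainted: " field,
    which includes lines that say "Not tainted".
    """
    match = re.search(r"Tainted: (.{1,6})", line)
    if match is None:
        return []
    # Keep only characters that are known flag letters; skip padding.
    return [(ch, TAINT_FLAGS[ch]) for ch in match.group(1) if ch in TAINT_FLAGS]


if __name__ == "__main__":
    sample = "Pid: 8, comm: events/0 Tainted: G   M  2.6.18-1.2798.fc6 #1"
    for letter, meaning in decode_taint(sample):
        print(f"{letter}: {meaning}")
```

Run against the "Pid:" line from the report, this lists the G and M flags; the M entry is the one that points toward hardware.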
Comment 2 Pedro Fortuny Ayuso 2007-09-30 01:34:10 UTC
(In reply to comment #1)
> Tainted: G   M 
> 
> The 'M' is an indication that your CPU reported a machine check exception.
> Nearly always the sign of hardware failure.
> Could be bad ram, insufficient power, poor cooling etc. etc.
> 
> You're also running an *ancient* kernel with known security vulnerabilities,
> and many, many bugs fixed since then.  If this oops did have a
> non-hardware-related cause, it may even have been fixed at some point in the
> year since that kernel was released.  (2.6.20 fixed a security hole in the key
> handling code that would cause corrupted lists that may have manifested like
> this).
> 

Thanks, sorry then for the mess.

Pedro.
