LKML mail describing the problem:

http://article.gmane.org/gmane.linux.ports.ppc.embedded/33999

Kerneloops link

http://www.kerneloops.org/raw.php?rawid=2787010&msgid=http://mid.gmane.org/20100321043725.GA21566@amit-x200.redhat.com

The .config is available in the mail linked to above.

Crash log:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
PGD 3d261067 PUD 3d013067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/virtual/block/ram13/removable
CPU 0
Modules linked in:

Pid: 573, comm: console_check Not tainted 2.6.34-rc2 #102 /Bochs
RIP: 0010:[<ffffffff8101f4dc>] [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
RSP: 0018:ffff88003bdf5b48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81646e30
RDX: ffff88003bdf5b78 RSI: ffff88003bdf5ba0 RDI: 0000000000000000
RBP: ffff88003bdf5b78 R08: 0000000000000000 R09: ffffffff81646e08
R10: 0000000000000046 R11: 0000000000001130 R12: 00000000001d1d00
R13: 0000000000000000 R14: ffff88003bdf5ba0 R15: 000000000000000f
FS: 00007f330731b6f0(0000) GS:ffff880003800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000003be78000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process console_check (pid: 573, threadinfo ffff88003bdf4000, task ffff88003bc3a2d0)
Stack:
ffff88003bdf5b78 ffffffff8102058e 0000000000000000 0000000000000000
<0> 0000000000000000 0000000000000000 ffff88003bdf5bd8 ffffffff81026f03
<0> ffff88003ead8cd8 ffff88003eb10490 ffff88003bdf5bd8 ffffffff8118cea9
Call Trace:
[<ffffffff8102058e>] ? task_rq_lock+0x24/0x98
[<ffffffff81026f03>] try_to_wake_up+0x4b/0x33b
[<ffffffff8118cea9>] ? resize_console+0x25/0x95
[<ffffffff8102721f>] wake_up_process+0x10/0x12
[<ffffffff8118c18e>] hvc_kick+0x1a/0x1c
[<ffffffff8118cbb4>] hvc_open+0xf6/0x102
[<ffffffff81179f7d>] tty_open+0x369/0x4f0
[<ffffffff810a47e8>] chrdev_open+0x127/0x148
[<ffffffff810a46c1>] ? chrdev_open+0x0/0x148
[<ffffffff810a066b>] __dentry_open+0x154/0x28a
[<ffffffff810a0866>] nameidata_to_filp+0x3a/0x4b
[<ffffffff810ab9cb>] do_last+0x473/0x5ba
[<ffffffff810abd12>] do_filp_open+0x200/0x602
[<ffffffff8104e43b>] ? get_lock_stats+0x20/0x4c
[<ffffffff8124fa30>] ? _raw_spin_unlock+0x45/0x52
[<ffffffff810b4d68>] ? spin_unlock+0x9/0xb
[<ffffffff810b5455>] ? alloc_fd+0x111/0x123
[<ffffffff810a045a>] do_sys_open+0x57/0xd7
[<ffffffff810a0503>] sys_open+0x1b/0x1d
[<ffffffff81001ec2>] system_call_fastpath+0x16/0x1b
Code: c5 e8 ef 9b 09 00 48 83 c3 08 48 83 3b 00 75 c4 4c 89 e7 48 c7 c6 87 a7 4c 81 e8 69 95 09 00 41 58 31 c0 5b 41 5c 41 5d c9 c3 55 <48> 8b 17 31 c0 48 89 e5 48 81 fa 00 01 00 00 75 0b 8b 47 14 d1
RIP [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
RSP <ffff88003bdf5b48>
CR2: 0000000000000000
---[ end trace a8d89f6ae287538e ]---
note: console_check[573] exited with preempt_count 2
BUG: scheduling while atomic: console_check/573/0x10000002
INFO: lockdep is turned off.
Modules linked in:
Pid: 573, comm: console_check Tainted: G D 2.6.34-rc2 #102
Call Trace:
[<ffffffff8104f44e>] ? __debug_show_held_locks+0x1b/0x24
[<ffffffff81024f1b>] __schedule_bug+0x72/0x77
[<ffffffff8124d073>] schedule+0xcc/0x69f
[<ffffffff81027f95>] __cond_resched+0x13/0x1f
[<ffffffff8124d6e8>] _cond_resched+0x16/0x1d
[<ffffffff81085874>] unmap_vmas+0x733/0x7b1
[<ffffffff8108a474>] exit_mmap+0x88/0xdc
[<ffffffff8102b1ce>] mmput+0x43/0xb4
[<ffffffff8102ed92>] exit_mm+0x103/0x110
[<ffffffff8118b138>] ? spin_unlock_irq+0x9/0xb
[<ffffffff8103067d>] do_exit+0x1e7/0x68c
[<ffffffff8102d4ca>] ? spin_unlock_irqrestore+0x9/0xb
[<ffffffff8102e14c>] ? kmsg_dump+0x150/0x16a
[<ffffffff810059be>] ? oops_end+0x44/0x94
[<ffffffff81005a09>] oops_end+0x8f/0x94
[<ffffffff81019f80>] no_context+0x1f7/0x206
[<ffffffff81017450>] ? kvm_clock_read+0x3e/0x5c
[<ffffffff8101a11c>] __bad_area_nosemaphore+0x18d/0x1b0
[<ffffffff8101a31d>] ? do_page_fault+0xaa/0x295
[<ffffffff8101a14d>] bad_area_nosemaphore+0xe/0x10
[<ffffffff8101a3bc>] do_page_fault+0x149/0x295
[<ffffffff81250435>] page_fault+0x25/0x30
[<ffffffff8101f4dc>] ? task_is_waking+0x1/0x1f
[<ffffffff8124fa7f>] ? _raw_spin_unlock_irqrestore+0x42/0x79
[<ffffffff8102058e>] ? task_rq_lock+0x24/0x98
[<ffffffff81026f03>] try_to_wake_up+0x4b/0x33b
[<ffffffff8118cea9>] ? resize_console+0x25/0x95
[<ffffffff8102721f>] wake_up_process+0x10/0x12
[<ffffffff8118c18e>] hvc_kick+0x1a/0x1c
[<ffffffff8118cbb4>] hvc_open+0xf6/0x102
[<ffffffff81179f7d>] tty_open+0x369/0x4f0
[<ffffffff810a47e8>] chrdev_open+0x127/0x148
[<ffffffff810a46c1>] ? chrdev_open+0x0/0x148
[<ffffffff810a066b>] __dentry_open+0x154/0x28a
[<ffffffff810a0866>] nameidata_to_filp+0x3a/0x4b
[<ffffffff810ab9cb>] do_last+0x473/0x5ba
[<ffffffff810abd12>] do_filp_open+0x200/0x602
[<ffffffff8104e43b>] ? get_lock_stats+0x20/0x4c
[<ffffffff8124fa30>] ? _raw_spin_unlock+0x45/0x52
[<ffffffff810b4d68>] ? spin_unlock+0x9/0xb
[<ffffffff810b5455>] ? alloc_fd+0x111/0x123
[<ffffffff810a045a>] do_sys_open+0x57/0xd7
[<ffffffff810a0503>] sys_open+0x1b/0x1d
[<ffffffff81001ec2>] system_call_fastpath+0x16/0x1b
EXT4-fs (sda1): mounted filesystem with ordered data mode
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
PGD 3e7ab067 PUD 3ce84067 PMD 0
Oops: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/sda2/dev
CPU 1
Modules linked in:

Pid: 951, comm: agetty Tainted: G D 2.6.34-rc2 #102 /Bochs
RIP: 0010:[<ffffffff8101f4dc>] [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
RSP: 0018:ffff88003cdd9b48 EFLAGS: 00010246
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88003cdd9bb8 RSI: ffff88003cdd9ba0 RDI: 0000000000000000
RBP: ffff88003cdd9b78 R08: ffff88003f80d168 R09: ffff88003ef60018
R10: 0000000000000046 R11: 0000000000000292 R12: 00000000001d1d00
R13: 0000000000000000 R14: ffff88003cdd9ba0 R15: 000000000000000f
FS: 00007fd200b486f0(0000) GS:ffff880003a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000003f189000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process agetty (pid: 951, threadinfo ffff88003cdd8000, task ffff88003d2f8000)
Stack:
ffff88003cdd9b78 ffffffff8102058e 0000000000000000 0000000000000000
<0> 0000000000000000 0000000000000001 ffff88003cdd9bd8 ffffffff81026f03
<0> 0000000000000000 0000000000000046 ffff88003ef60018 0000000000000046
Call Trace:
[<ffffffff8102058e>] ? task_rq_lock+0x24/0x98
[<ffffffff81026f03>] try_to_wake_up+0x4b/0x33b
[<ffffffff8102721f>] wake_up_process+0x10/0x12
[<ffffffff8118c18e>] hvc_kick+0x1a/0x1c
[<ffffffff8118cb1b>] hvc_open+0x5d/0x102
[<ffffffff81179f7d>] tty_open+0x369/0x4f0
[<ffffffff810a47e8>] chrdev_open+0x127/0x148
[<ffffffff810a2899>] ? spin_unlock+0x9/0xb
[<ffffffff810a46c1>] ? chrdev_open+0x0/0x148
[<ffffffff810a066b>] __dentry_open+0x154/0x28a
[<ffffffff810a0866>] nameidata_to_filp+0x3a/0x4b
[<ffffffff810ab9cb>] do_last+0x473/0x5ba
[<ffffffff810abd12>] do_filp_open+0x200/0x602
[<ffffffff8124fa30>] ? _raw_spin_unlock+0x45/0x52
[<ffffffff810b4d68>] ? spin_unlock+0x9/0xb
[<ffffffff810b5455>] ? alloc_fd+0x111/0x123
[<ffffffff810a045a>] do_sys_open+0x57/0xd7
[<ffffffff810a0503>] sys_open+0x1b/0x1d
[<ffffffff81001ec2>] system_call_fastpath+0x16/0x1b
Code: c5 e8 ef 9b 09 00 48 83 c3 08 48 83 3b 00 75 c4 4c 89 e7 48 c7 c6 87 a7 4c 81 e8 69 95 09 00 41 58 31 c0 5b 41 5c 41 5d c9 c3 55 <48> 8b 17 31 c0 48 89 e5 48 81 fa 00 01 00 00 75 0b 8b 47 14 d1
RIP [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
RSP <ffff88003cdd9b48>
CR2: 0000000000000000
---[ end trace a8d89f6ae287538f ]---
note: agetty[951] exited with preempt_count 2
On Tue, 23 Mar 2010 04:04:16 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=15615
>
> Summary: NULL pointer deref in task_is_waking
> Product: Process Management
> Version: 2.5
> Kernel Version: 2.6.34-rc2
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Scheduler
> AssignedTo: mingo@elte.hu
> ReportedBy: shahamit@gmail.com
> CC: shahamit@gmail.com
> Regression: Yes
>
>
> LKML mail describing the problem:
>
> http://article.gmane.org/gmane.linux.ports.ppc.embedded/33999
>
> Kerneloops link
>
>
> http://www.kerneloops.org/raw.php?rawid=2787010&msgid=http://mid.gmane.org/20100321043725.GA21566@amit-x200.redhat.com

This is an hvc_console bug, methinks.

> The .config is available in the mail linked to above.
>
> Crash log:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
> PGD 3d261067 PUD 3d013067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> last sysfs file: /sys/devices/virtual/block/ram13/removable
> CPU 0
> Modules linked in:
>
> Pid: 573, comm: console_check Not tainted 2.6.34-rc2 #102 /Bochs
> RIP: 0010:[<ffffffff8101f4dc>] [<ffffffff8101f4dc>]
> task_is_waking+0x1/0x1f
> RSP: 0018:ffff88003bdf5b48 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff81646e30
> RDX: ffff88003bdf5b78 RSI: ffff88003bdf5ba0 RDI: 0000000000000000
> RBP: ffff88003bdf5b78 R08: 0000000000000000 R09: ffffffff81646e08
> R10: 0000000000000046 R11: 0000000000001130 R12: 00000000001d1d00
> R13: 0000000000000000 R14: ffff88003bdf5ba0 R15: 000000000000000f
> FS: 00007f330731b6f0(0000) GS:ffff880003800000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000003be78000 CR4: 00000000000006b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process console_check (pid: 573, threadinfo ffff88003bdf4000, task
> ffff88003bc3a2d0)
> Stack:
> ffff88003bdf5b78 ffffffff8102058e 0000000000000000 0000000000000000
> <0> 0000000000000000 0000000000000000 ffff88003bdf5bd8 ffffffff81026f03
> <0> ffff88003ead8cd8 ffff88003eb10490 ffff88003bdf5bd8 ffffffff8118cea9
> Call Trace:
> [<ffffffff8102058e>] ? task_rq_lock+0x24/0x98
> [<ffffffff81026f03>] try_to_wake_up+0x4b/0x33b
> [<ffffffff8118cea9>] ? resize_console+0x25/0x95
> [<ffffffff8102721f>] wake_up_process+0x10/0x12
> [<ffffffff8118c18e>] hvc_kick+0x1a/0x1c
> [<ffffffff8118cbb4>] hvc_open+0xf6/0x102
> [<ffffffff81179f7d>] tty_open+0x369/0x4f0
> [<ffffffff810a47e8>] chrdev_open+0x127/0x148
> [<ffffffff810a46c1>] ? chrdev_open+0x0/0x148
> [<ffffffff810a066b>] __dentry_open+0x154/0x28a
> [<ffffffff810a0866>] nameidata_to_filp+0x3a/0x4b
> [<ffffffff810ab9cb>] do_last+0x473/0x5ba
> [<ffffffff810abd12>] do_filp_open+0x200/0x602
> [<ffffffff8104e43b>] ? get_lock_stats+0x20/0x4c
> [<ffffffff8124fa30>] ? _raw_spin_unlock+0x45/0x52
> [<ffffffff810b4d68>] ? spin_unlock+0x9/0xb
> [<ffffffff810b5455>] ? alloc_fd+0x111/0x123
> [<ffffffff810a045a>] do_sys_open+0x57/0xd7
> [<ffffffff810a0503>] sys_open+0x1b/0x1d
> [<ffffffff81001ec2>] system_call_fastpath+0x16/0x1b
> Code: c5 e8 ef 9b 09 00 48 83 c3 08 48 83 3b 00 75 c4 4c 89 e7 48 c7 c6
> 87 a7 4c 81 e8 69 95 09 00 41 58 31 c0 5b 41 5c 41 5d c9 c3 55 <48> 8b
> 17 31 c0 48 89 e5 48 81 fa 00 01 00 00 75 0b 8b 47 14 d1
> RIP [<ffffffff8101f4dc>] task_is_waking+0x1/0x1f
> RSP <ffff88003bdf5b48>
> CR2: 0000000000000000

The code is now running hvc_kick() before running hvc_init(). Perhaps
as a result of changes to tty_open(), dunno.

Something dumb like this should plug the bug:

--- a/drivers/char/hvc_console.c~a
+++ a/drivers/char/hvc_console.c
@@ -285,6 +285,9 @@ EXPORT_SYMBOL_GPL(hvc_instantiate);
 /* Wake the sleeping khvcd */
 void hvc_kick(void)
 {
+	if (!hvc_task)
+		return;	/* HVC hasn't been initialised yet */
+
 	hvc_kicked = 1;
 	wake_up_process(hvc_task);
 }
_

although a) it might be more consistent to test hvc_driver and b) we
shouldn't be calling into an uninitialised driver in the first place.
Reply-To: amit.shah@redhat.com

On (Wed) Mar 24 2010 [11:06:45], Andrew Morton wrote:
>
> >
> > http://www.kerneloops.org/raw.php?rawid=2787010&msgid=http://mid.gmane.org/20100321043725.GA21566@amit-x200.redhat.com
>
> This is an hvc_console bug, methinks.

Yes, I saw that too.

> The code is now running hvc_kick() before running hvc_init(). Perhaps
> as a result of changes to tty_open(), dunno.
>
> Something dumb like this should plug the bug:

I sent a similar patch to Ben yesterday to confirm if it's the same
thing he's seeing that locks up ppc boot.

> --- a/drivers/char/hvc_console.c~a
> +++ a/drivers/char/hvc_console.c
> @@ -285,6 +285,9 @@ EXPORT_SYMBOL_GPL(hvc_instantiate);
>  /* Wake the sleeping khvcd */
>  void hvc_kick(void)
>  {
> +	if (!hvc_task)
> +		return;	/* HVC hasn't been initialised yet */
> +
>  	hvc_kicked = 1;
>  	wake_up_process(hvc_task);
>  }
> _
>
> although a) it might be more consistent to test hvc_driver and b) we
> shouldn't be calling into an uninitialised driver in the first place.

Agreed!

		Amit
Patch has been posted http://lkml.org/lkml/2010/4/6/110
Patch : https://patchwork.kernel.org/patch/90782/
Handled-By : Anton Blanchard <anton@samba.org>
Fixed by commit 320718ee074acce5ffced6506cb51af1388942aa.