Subject : PROBLEM: kernel crash on USB-modem (Huawei E1750) hangup. Submitter : O01eg <O01eg@yandex.ru> Date : 2010-09-26 19:50 Message-ID : op.vjnn1up1yohvy1@localhost References : http://marc.info/?l=linux-kernel&m=128553111709569&w=2 This entry is being used for tracking a regression from 2.6.35. Please don't close it until the problem is fixed in the mainline.
I got this bug yet at 2.6.34 but cann't catch enought information.
I found this in your logs, which may be enough information for now.... [ 5304.074589] scsi host1: __pm_runtime_resume() returns 1! [ 5305.047375] sd 18:0:0:0: __pm_runtime_resume()! [ 5305.047724] sd 18:0:0:0: __pm_runtime_resume() returns 1! [ 5308.060687] device: 'ppp1': device_add [ 5308.060984] PM: Adding info for No Bus:ppp1 [ 5310.418590] BUG: unable to handle kernel NULL pointer dereference at 00000000000002f0 [ 5310.419081] IP: [<ffffffff810847d2>] __lock_acquire+0x93/0x922 [ 5310.419562] PGD 17cb1067 PUD 742af067 PMD 0 [ 5310.420062] Oops: 0000 [#1] PREEMPT SMP [ 5310.420633] last sysfs file: /sys/devices/virtual/net/ppp1/flags [ 5310.421207] CPU 0 [ 5310.421219] Modules linked in: netconsole tun nouveau ttm drm_kms_helper [ 5310.421767] [ 5310.421767] Pid: 0, comm: swapper Not tainted 2.6.36-rc5 #1 P35T-DS3P/P35T-DS3P [ 5310.421767] RIP: 0010:[<ffffffff810847d2>] [<ffffffff810847d2>] __lock_acquire+0x93/0x922 [ 5310.421767] RSP: 0018:ffff880004c03b98 EFLAGS: 00010097 [ 5310.421767] RAX: 0000000000000046 RBX: 00000000000002f0 RCX: 0000000000000000 [ 5310.421767] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002f0 [ 5310.421767] RBP: ffff880004c03c28 R08: 0000000000000002 R09: 0000000000000001 [ 5310.421767] R10: ffffffff812fbc4f R11: 0000000000000096 R12: ffffffff81a29020 [ 5305.047724] sd 18:0:0:0: __pm_runtime_resume() returns 1! [ 5308.060687] device: 'ppp1': device_add [ 5308.060984] PM: Adding info for No Bus:ppp1 [ 5310.418590] BUG: unable to handle kernel NULL pointer dereference at 00000000000002f0 [ 5310.419081] IP: [<ffffffff810847d2>] __lock_acquire+0x93/0x922 [ 5310.419562] PGD 17cb1067 PUD 742af067 PMD 0 [ 5310.420062] Oops: 0000 [#1] PREEMPT SMP [ 5310.420633] last sysfs file: /sys/devices/virtual/net/ppp1/flags [ 5310.421207] CPU 0 [ 5310.421219] Modules linked in: netconsole tun nouveau ttm drm_kms_helper [ 5310.421767] [ 5310.421767] Pid: 0, comm: swapper Not tainted 2.6.36-rc5 #1 P35T-DS3P/P35T-DS3P [ 5310.421767] RIP: 0010:[<ffffffff810847d2>] [<ffffffff810847d2>] __lock_acquire+0x93/0x922 [ 5310.421767] RSP: 0018:ffff880004c03b98 EFLAGS: 00010097 [ 5310.421767] RAX: 0000000000000046 RBX: 00000000000002f0 RCX: 0000000000000000 [ 5310.421767] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002f0 [ 5310.421767] RBP: ffff880004c03c28 R08: 0000000000000002 R09: 0000000000000001 [ 5310.421767] R10: ffffffff812fbc4f R11: 0000000000000096 R12: ffffffff81a29020 [ 5310.421767] R13: 0000000000000096 R14: 0000000000000002 R15: 0000000000000000 [ 5310.421767] FS: 0000000000000000(0000) GS:ffff880004c00000(0000) knlGS:0000000000000000 [ 5310.421767] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 5310.421767] CR2: 00000000000002f0 CR3: 000000007909a000 CR4: 00000000000006f0 [ 5310.421767] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5310.429488] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 5310.429488] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a29020) [ 5310.429488] Stack: [ 5310.429488] ffff880004dd35c0 00000000001d35c0 ffff880004c03bd8 ffffffff81078545 [ 5310.429488] <0> ffffffff81078545 0000000000000006 0000000104c03bd8 0000000000000000 [ 5310.429488] <0> ffffffff81a29020 00000000ca4adee6 0000000000000ba7 0000000000000000 [ 5310.429488] Call Trace: [ 5310.429488] <IRQ> [ 5310.429488] [<ffffffff81078545>] ? sched_clock_cpu+0xca/0xd8 [ 5310.429488] [<ffffffff81078545>] ? sched_clock_cpu+0xca/0xd8 [ 5310.429488] [<ffffffff810850b8>] lock_acquire+0x57/0x6d [ 5310.429488] [<ffffffff812fbc4f>] ? tty_buffer_request_room+0x2d/0x152 [ 5310.429488] [<ffffffff815bcba5>] _raw_spin_lock_irqsave+0x46/0x80 [ 5310.429488] [<ffffffff812fbc4f>] ? tty_buffer_request_room+0x2d/0x152 [ 5310.429488] [<ffffffff812fbc4f>] tty_buffer_request_room+0x2d/0x152 [ 5310.429488] [<ffffffff812fbef4>] tty_insert_flip_string_fixed_flag+0x47/0xa2 [ 5310.429488] [<ffffffff813db01b>] usb_wwan_indat_callback+0xad/0x130 [ 5310.429488] [<ffffffff813aafa4>] usb_hcd_giveback_urb+0x8c/0xc0 [ 5310.429488] [<ffffffff813cb41e>] uhci_giveback_urb+0x121/0x21f [ 5310.429488] [<ffffffff813cbc64>] uhci_scan_schedule+0x632/0x8f1 [ 5310.429488] [<ffffffff813cdf84>] ? uhci_irq+0x119/0x139 [ 5310.429488] [<ffffffff813cdf8c>] uhci_irq+0x121/0x139 [ 5310.429488] [<ffffffff813aa7fd>] usb_hcd_irq+0x38/0x86 [ 5310.429488] [<ffffffff810aa16e>] handle_IRQ_event+0x20/0x9f [ 5310.429488] [<ffffffff810abe7b>] handle_fasteoi_irq+0x8d/0xcd [ 5310.429488] [<ffffffff810285d7>] handle_irq+0x83/0x8c [ 5310.429488] [<ffffffff81027b9e>] do_IRQ+0x5e/0xc4 [ 5310.429488] [<ffffffff815bd853>] ret_from_intr+0x0/0xf [ 5310.429488] <EOI> [ 5310.429488] [<ffffffff81077be3>] ? __atomic_notifier_call_chain+0x0/0x90 [ 5310.429488] [<ffffffff8102d237>] ? mwait_idle+0x7f/0x8c [ 5310.429488] [<ffffffff8102d22e>] ? mwait_idle+0x76/0x8c [ 5310.429488] [<ffffffff81024c86>] cpu_idle+0xa1/0xf7 [ 5310.429488] [<ffffffff815a5897>] rest_init+0xcb/0xd2 [ 5310.429488] [<ffffffff815a57cc>] ? rest_init+0x0/0xd2 [ 5310.429488] [<ffffffff81cc4d29>] start_kernel+0x3c5/0x3d0 [ 5310.429488] [<ffffffff81cc42a3>] x86_64_start_reservations+0xb3/0xb7 [ 5310.429488] [<ffffffff81cc438b>] x86_64_start_kernel+0xe4/0xeb [ 5310.429488] Code: 00 00 83 3d 60 83 5a 01 00 0f 85 9a 08 00 00 e9 8c 07 00 00 83 7d c8 07 76 11 e8 2a 1a 20 00 48 c7 c7 c4 ae 7e 81 e9 4d 08 00 00 <48> 8b 03 ba 01 00 00 00 48 3d 40 d1 d6 81 44 0f 44 f2 83 7d c8 [ 5310.429488] RIP [<ffffffff810847d2>] __lock_acquire+0x93/0x922 [ 5310.429488] RSP <ffff880004c03b98> [ 5310.429488] CR2: 00000000000002f0 [ 5310.429488] ---[ end trace 438588041d6254f1 ]--- [ 5310.429488] Kernel panic - not syncing: Fatal exception in interrupt [ 5310.429488] Pid: 0, comm: swapper Tainted: G D 2.6.36-rc5 #1 [ 5310.429488] Call Trace: [ 5310.429488] <IRQ> [<ffffffff815b9528>] panic+0x96/0x1a6 [ 5310.429488] [<ffffffff8105c194>] ? kmsg_dump+0x136/0x150 [ 5310.429488] [<ffffffff81029875>] oops_end+0xa9/0xb9 [ 5310.429488] [<ffffffff81045286>] no_context+0x1f7/0x206 [ 5310.429488] [<ffffffff8104541c>] __bad_area_nosemaphore+0x187/0x1aa [ 5310.456156] [<ffffffff81078415>] ? sched_clock_local+0x1c/0x82
Created attachment 31952 [details] dmesg attached on original bug report
Created attachment 31962 [details] config attached to original bug report
Created attachment 31992 [details] lspci proc_cpuinfo proc_iomem proc_ioports proc_modules proc_scsi_scsi proc_version ver_linux
What was the last kernel version without this bug?
I got this bug at 2.6.34-gentoo-r6 and 2.6.36-rc5. Other kernels were not checked. Also the bug is unstable. Some hangups occurred normal before crash.
Try doing the same thing using a kernel with CONFIG_DEBUG_DRIVER disabled. Maybe some other important messages got lost among all the driver messages. For example, I don't see anything indicating a hangup.
It's only log of kernel messages. pppd tell about hangup. Is kernel also tell about hangup? How does look such message?
The kernel certainly knows when a hangup occurs, but it might not send any messages to the log. In fact I can't tell exactly what the problem is. It's clear that usb_wwan_indat_callback() was called (meaning that some data was received from the modem) after the hangup occurred and after the tty structure was released. That is, tty_port_tty_get(&port->port) returned a NULL pointer, which it shouldn't have because the device file apparently was still open. However I don't know what's supposed to happen in this case, so I don't know what needs to be done to fix the problem. Alan Cox should have a good idea.
That doesn't appear to be a regression, dropping from the list.
(In reply to comment #10) > In fact I can't tell exactly what the problem is. It's clear that > usb_wwan_indat_callback() was called (meaning that some data was received > from > the modem) after the hangup occurred and after the tty structure was > released. > That is, tty_port_tty_get(&port->port) returned a NULL pointer, which it > shouldn't have because the device file apparently was still open. This bug was fixed by upstream commit 38237fd2be9421c104f84cc35665097bdce89013 related to the bug https://bugzilla.kernel.org/show_bug.cgi?id=24582 . I applied the patch from this commit in the kernel kernel-2.6.35.11-83.fc14 from Fedora 14 (http://koji.fedoraproject.org/koji/buildinfo?buildID=217311) and the crashes due to my 3G modem Huawei 1756 hangup have not occur more.
Thanks.