Bug 19302

Summary: PROBLEM: kernel crash on USB-modem (Huawei E1750) hangup.
Product: Drivers Reporter: Maciej Rutecki (maciej.rutecki)
Component: USBAssignee: Greg Kroah-Hartman (greg)
Status: CLOSED CODE_FIX    
Severity: normal CC: alan, florian, maciej.rutecki, o01eg, rjw, shpnft, stern
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.36-rc5 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg attached on original bug report
config attached to original bug report
lspci proc_cpuinfo proc_iomem proc_ioports proc_modules proc_scsi_scsi proc_version ver_linux

Description Maciej Rutecki 2010-09-29 18:38:28 UTC
Subject    : PROBLEM: kernel crash on USB-modem (Huawei E1750) hangup.
Submitter  : O01eg <O01eg@yandex.ru>
Date       : 2010-09-26 19:50
Message-ID : op.vjnn1up1yohvy1@localhost
References : http://marc.info/?l=linux-kernel&m=128553111709569&w=2

This entry is being used for tracking a regression from 2.6.35. Please don't
close it until the problem is fixed in the mainline.
Comment 1 O01eg 2010-09-29 18:42:23 UTC
I got this bug yet at 2.6.34 but cann't catch enought information.
Comment 2 Florian Mickler 2010-09-30 05:05:19 UTC
I found this in your logs, which may be enough information for now.... 

[ 5304.074589] scsi host1: __pm_runtime_resume() returns 1!
[ 5305.047375] sd 18:0:0:0: __pm_runtime_resume()!
[ 5305.047724] sd 18:0:0:0: __pm_runtime_resume() returns 1!
[ 5308.060687] device: 'ppp1': device_add
[ 5308.060984] PM: Adding info for No Bus:ppp1
[ 5310.418590] BUG: unable to handle kernel NULL pointer dereference at 00000000000002f0
[ 5310.419081] IP: [<ffffffff810847d2>] __lock_acquire+0x93/0x922
[ 5310.419562] PGD 17cb1067 PUD 742af067 PMD 0 
[ 5310.420062] Oops: 0000 [#1] PREEMPT SMP 
[ 5310.420633] last sysfs file: /sys/devices/virtual/net/ppp1/flags
[ 5310.421207] CPU 0 
[ 5310.421219] Modules linked in: netconsole tun nouveau ttm drm_kms_helper
[ 5310.421767] 
[ 5310.421767] Pid: 0, comm: swapper Not tainted 2.6.36-rc5 #1 P35T-DS3P/P35T-DS3P
[ 5310.421767] RIP: 0010:[<ffffffff810847d2>]  [<ffffffff810847d2>] __lock_acquire+0x93/0x922
[ 5310.421767] RSP: 0018:ffff880004c03b98  EFLAGS: 00010097
[ 5310.421767] RAX: 0000000000000046 RBX: 00000000000002f0 RCX: 0000000000000000
[ 5310.421767] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002f0
[ 5310.421767] RBP: ffff880004c03c28 R08: 0000000000000002 R09: 0000000000000001
[ 5310.421767] R10: ffffffff812fbc4f R11: 0000000000000096 R12: ffffffff81a29020
[ 5305.047724] sd 18:0:0:0: __pm_runtime_resume() returns 1!
[ 5308.060687] device: 'ppp1': device_add
[ 5308.060984] PM: Adding info for No Bus:ppp1
[ 5310.418590] BUG: unable to handle kernel NULL pointer dereference at 00000000000002f0
[ 5310.419081] IP: [<ffffffff810847d2>] __lock_acquire+0x93/0x922
[ 5310.419562] PGD 17cb1067 PUD 742af067 PMD 0 
[ 5310.420062] Oops: 0000 [#1] PREEMPT SMP 
[ 5310.420633] last sysfs file: /sys/devices/virtual/net/ppp1/flags
[ 5310.421207] CPU 0 
[ 5310.421219] Modules linked in: netconsole tun nouveau ttm drm_kms_helper
[ 5310.421767] 
[ 5310.421767] Pid: 0, comm: swapper Not tainted 2.6.36-rc5 #1 P35T-DS3P/P35T-DS3P
[ 5310.421767] RIP: 0010:[<ffffffff810847d2>]  [<ffffffff810847d2>] __lock_acquire+0x93/0x922
[ 5310.421767] RSP: 0018:ffff880004c03b98  EFLAGS: 00010097
[ 5310.421767] RAX: 0000000000000046 RBX: 00000000000002f0 RCX: 0000000000000000
[ 5310.421767] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000002f0
[ 5310.421767] RBP: ffff880004c03c28 R08: 0000000000000002 R09: 0000000000000001
[ 5310.421767] R10: ffffffff812fbc4f R11: 0000000000000096 R12: ffffffff81a29020
[ 5310.421767] R13: 0000000000000096 R14: 0000000000000002 R15: 0000000000000000
[ 5310.421767] FS:  0000000000000000(0000) GS:ffff880004c00000(0000) knlGS:0000000000000000
[ 5310.421767] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5310.421767] CR2: 00000000000002f0 CR3: 000000007909a000 CR4: 00000000000006f0
[ 5310.421767] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5310.429488] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5310.429488] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a29020)
[ 5310.429488] Stack:
[ 5310.429488]  ffff880004dd35c0 00000000001d35c0 ffff880004c03bd8 ffffffff81078545
[ 5310.429488] <0> ffffffff81078545 0000000000000006 0000000104c03bd8 0000000000000000
[ 5310.429488] <0> ffffffff81a29020 00000000ca4adee6 0000000000000ba7 0000000000000000
[ 5310.429488] Call Trace:
[ 5310.429488]  <IRQ> 
[ 5310.429488]  [<ffffffff81078545>] ? sched_clock_cpu+0xca/0xd8
[ 5310.429488]  [<ffffffff81078545>] ? sched_clock_cpu+0xca/0xd8
[ 5310.429488]  [<ffffffff810850b8>] lock_acquire+0x57/0x6d
[ 5310.429488]  [<ffffffff812fbc4f>] ? tty_buffer_request_room+0x2d/0x152
[ 5310.429488]  [<ffffffff815bcba5>] _raw_spin_lock_irqsave+0x46/0x80
[ 5310.429488]  [<ffffffff812fbc4f>] ? tty_buffer_request_room+0x2d/0x152
[ 5310.429488]  [<ffffffff812fbc4f>] tty_buffer_request_room+0x2d/0x152
[ 5310.429488]  [<ffffffff812fbef4>] tty_insert_flip_string_fixed_flag+0x47/0xa2
[ 5310.429488]  [<ffffffff813db01b>] usb_wwan_indat_callback+0xad/0x130
[ 5310.429488]  [<ffffffff813aafa4>] usb_hcd_giveback_urb+0x8c/0xc0
[ 5310.429488]  [<ffffffff813cb41e>] uhci_giveback_urb+0x121/0x21f
[ 5310.429488]  [<ffffffff813cbc64>] uhci_scan_schedule+0x632/0x8f1
[ 5310.429488]  [<ffffffff813cdf84>] ? uhci_irq+0x119/0x139
[ 5310.429488]  [<ffffffff813cdf8c>] uhci_irq+0x121/0x139
[ 5310.429488]  [<ffffffff813aa7fd>] usb_hcd_irq+0x38/0x86
[ 5310.429488]  [<ffffffff810aa16e>] handle_IRQ_event+0x20/0x9f
[ 5310.429488]  [<ffffffff810abe7b>] handle_fasteoi_irq+0x8d/0xcd
[ 5310.429488]  [<ffffffff810285d7>] handle_irq+0x83/0x8c
[ 5310.429488]  [<ffffffff81027b9e>] do_IRQ+0x5e/0xc4
[ 5310.429488]  [<ffffffff815bd853>] ret_from_intr+0x0/0xf
[ 5310.429488]  <EOI> 
[ 5310.429488]  [<ffffffff81077be3>] ? __atomic_notifier_call_chain+0x0/0x90
[ 5310.429488]  [<ffffffff8102d237>] ? mwait_idle+0x7f/0x8c
[ 5310.429488]  [<ffffffff8102d22e>] ? mwait_idle+0x76/0x8c
[ 5310.429488]  [<ffffffff81024c86>] cpu_idle+0xa1/0xf7
[ 5310.429488]  [<ffffffff815a5897>] rest_init+0xcb/0xd2
[ 5310.429488]  [<ffffffff815a57cc>] ? rest_init+0x0/0xd2
[ 5310.429488]  [<ffffffff81cc4d29>] start_kernel+0x3c5/0x3d0
[ 5310.429488]  [<ffffffff81cc42a3>] x86_64_start_reservations+0xb3/0xb7
[ 5310.429488]  [<ffffffff81cc438b>] x86_64_start_kernel+0xe4/0xeb
[ 5310.429488] Code: 00 00 83 3d 60 83 5a 01 00 0f 85 9a 08 00 00 e9 8c 07 00 00 83 7d c8 07 76 11 e8 2a 1a 20 00 48 c7 c7 c4 ae 7e 81 e9 4d 08 00 00 <48> 8b 03 ba 01 00 00 00 48 3d 40 d1 d6 81 44 0f 44 f2 83 7d c8 
[ 5310.429488] RIP  [<ffffffff810847d2>] __lock_acquire+0x93/0x922
[ 5310.429488]  RSP <ffff880004c03b98>
[ 5310.429488] CR2: 00000000000002f0
[ 5310.429488] ---[ end trace 438588041d6254f1 ]---
[ 5310.429488] Kernel panic - not syncing: Fatal exception in interrupt
[ 5310.429488] Pid: 0, comm: swapper Tainted: G      D     2.6.36-rc5 #1
[ 5310.429488] Call Trace:
[ 5310.429488]  <IRQ>  [<ffffffff815b9528>] panic+0x96/0x1a6
[ 5310.429488]  [<ffffffff8105c194>] ? kmsg_dump+0x136/0x150
[ 5310.429488]  [<ffffffff81029875>] oops_end+0xa9/0xb9
[ 5310.429488]  [<ffffffff81045286>] no_context+0x1f7/0x206
[ 5310.429488]  [<ffffffff8104541c>] __bad_area_nosemaphore+0x187/0x1aa
[ 5310.456156]  [<ffffffff81078415>] ? sched_clock_local+0x1c/0x82
Comment 3 Florian Mickler 2010-09-30 05:06:36 UTC
Created attachment 31952 [details]
dmesg attached on original bug report
Comment 4 Florian Mickler 2010-09-30 05:07:24 UTC
Created attachment 31962 [details]
config attached to original bug report
Comment 5 Florian Mickler 2010-09-30 05:17:40 UTC
Created attachment 31992 [details]
lspci  proc_cpuinfo  proc_iomem  proc_ioports  proc_modules  proc_scsi_scsi  proc_version ver_linux
Comment 6 Florian Mickler 2010-09-30 05:22:35 UTC
What was the last kernel version without this bug?
Comment 7 O01eg 2010-09-30 15:46:23 UTC
I got this bug at 2.6.34-gentoo-r6 and 2.6.36-rc5. Other kernels were not checked.
Also the bug is unstable. Some hangups occurred normal before crash.
Comment 8 Alan Stern 2010-10-19 15:56:00 UTC
Try doing the same thing using a kernel with CONFIG_DEBUG_DRIVER disabled.  Maybe some other important messages got lost among all the driver messages.  For example, I don't see anything indicating a hangup.
Comment 9 O01eg 2010-10-19 16:38:53 UTC
It's only log of kernel messages. pppd tell about hangup.
Is kernel also tell about hangup? How does look such message?
Comment 10 Alan Stern 2010-10-19 17:06:30 UTC
The kernel certainly knows when a hangup occurs, but it might not send any messages to the log.

In fact I can't tell exactly what the problem is.  It's clear that usb_wwan_indat_callback() was called (meaning that some data was received from the modem) after the hangup occurred and after the tty structure was released.  That is, tty_port_tty_get(&port->port) returned a NULL pointer, which it shouldn't have because the device file apparently was still open.

However I don't know what's supposed to happen in this case, so I don't know what needs to be done to fix the problem.  Alan Cox should have a good idea.
Comment 11 Rafael J. Wysocki 2010-10-19 20:25:03 UTC
That doesn't appear to be a regression, dropping from the list.
Comment 12 Rodrigo de Farias Gomes 2011-04-12 12:43:12 UTC
(In reply to comment #10)

> In fact I can't tell exactly what the problem is.  It's clear that
> usb_wwan_indat_callback() was called (meaning that some data was received
> from
> the modem) after the hangup occurred and after the tty structure was
> released. 
> That is, tty_port_tty_get(&port->port) returned a NULL pointer, which it
> shouldn't have because the device file apparently was still open.

This bug was fixed by upstream commit 38237fd2be9421c104f84cc35665097bdce89013 related to the bug https://bugzilla.kernel.org/show_bug.cgi?id=24582 . 

I applied the patch from this commit in the kernel kernel-2.6.35.11-83.fc14 from Fedora 14 (http://koji.fedoraproject.org/koji/buildinfo?buildID=217311) and the crashes due to my 3G modem Huawei 1756 hangup have not occur more.
Comment 13 O01eg 2011-04-12 17:03:49 UTC
Thanks.