Bug 13564
Summary: | random general protection fault at boot time caused by khubd. | ||
---|---|---|---|
Product: | Drivers | Reporter: | Pauli (suokkos) |
Component: | USB | Assignee: | Greg Kroah-Hartman (greg) |
Status: | CLOSED UNREPRODUCIBLE | ||
Severity: | normal | CC: | rjw, stern |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | git & Ubuntu 2.6.30-9 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 13070 | ||
Attachments: | dmesg from failed boot; Config that I used to compile kernel; lspci -vvvnn; lsusb -v; objdump -dS timer.o |
Created attachment 21983 [details]
Config that I used to compile kernel
Created attachment 21985 [details]
lspci -vvvnn
Created attachment 21987 [details]
lsusb -v
Can you reproduce this on a kernel.org kernel release and not an Ubuntu one?

Yes. I was able to reproduce it with Linus' git tree from Wednesday. I compiled a newer git version today which hasn't, at least so far, caused the USB bug. I didn't apply any Ubuntu patches to the git tree, so this is an almost vanilla kernel now (almost = one DRM fix patch).

I can help by trying to collect more information from my system if anything could help resolve the bug. Bisecting isn't practical because I don't know a sure way to reproduce the bug.

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Thu, 18 Jun 2009 12:44:49 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13564
>
> Summary: random general protection fault at boot time caused by khubd.
> Product: Drivers
> Version: 2.5
> Kernel Version: git & Ubuntu 2.6.30-9
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: USB
> AssignedTo: greg@kroah.com
> ReportedBy: suokkos@gmail.com
> Regression: Yes
>
> Created an attachment (id=21982)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=21982)
> dmesg from failed boot
>
> I first saw this in the kernel 2.6.30-8 from Ubuntu Karmic.
> After that I upgraded to 2.6.30-9, which has the same problem.
> Now I wanted to try a vanilla git kernel with radeon KMS, so I compiled my own
> kernel from git. Today I saw it for the first time with this kernel as well.
>
> What happens:
> 1. I make a cold boot or restart
> 2. Computer boots normally except all USB ports are dead
> 3. dmesg shows general protection error caused by khubd
>
> This problem doesn't happen on even nearly every boot. I would give around a 20%
> chance of khubd failure, without collecting statistics.

This is a regression in 2.6.30.8. However we are not told which kernel version we regressed against. Which is the latest kernel which was known to work OK?

The bad bit is:

: [ 1.604937] Write protecting the kernel read-only data: 1732k
: [ 1.672060] Clocksource tsc unstable (delta = 736134769 ns)
: [ 1.704695] usb 3-1: new low speed USB device using uhci_hcd and address 2
: [ 1.858154] general protection fault: 0000 [#1] SMP
: [ 1.858277] last sysfs file:
: [ 1.858318] Modules linked in:
: [ 1.858393]
: [ 1.858435] Pid: 231, comm: khubd Not tainted (2.6.30-9-generic #10kms5-Ubuntu) Aspire 1350
: [ 1.858485] EIP: 0060:[<c053595b>] EFLAGS: 00010202 CPU: 0
: [ 1.858541] EIP is at schedule_timeout+0x12b/0x1b0
: [ 1.858585] EAX: 000004e0 EBX: 000004e0 ECX: c07eed54 EDX: b48c4368
: [ 1.858631] ESI: df97fd14 EDI: c07ee440 EBP: df97fd54 ESP: df97fd00
: [ 1.858677] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
: [ 1.858723] Process khubd (pid: 231, ti=df97e000 task=df973e30 task.ti=df97e000)
: [ 1.858771] Stack:
: [ 1.858809] de118b60 1c800000 fffedcd6 000004e2 00000296 00000000 00200200 fffee1b8
: [ 1.859102] c0144100 df973e30 c07ee440 c05356d2 6275686b 64640064 00000000 00000000
: [ 1.859462] 000000e7 b48c4368 df97fda4 000004e2 df97fda8 df97fd8c c05356d2 df97fdb0
: [ 1.859857] Call Trace:
: [ 1.859898] [<c0144100>] ? process_timeout+0x0/0x10
: [ 1.859985] [<c05356d2>] ? wait_for_common+0xa2/0x120
: [ 1.860029] [<c05356d2>] ? wait_for_common+0xa2/0x120
: [ 1.860029] [<c0131260>] ? default_wake_function+0x0/0x10
: [ 1.860029] [<c05357cd>] ? wait_for_completion_timeout+0xd/0x10
: [ 1.860029] [<c03da8c9>] ? usb_start_wait_urb+0x99/0xb0
: [ 1.860029] [<c03daae0>] ? usb_control_msg+0xc0/0x120
: [ 1.860029] [<c03daaf0>] ? usb_control_msg+0xd0/0x120
: [ 1.860029] [<c03db2d5>] ? usb_get_descriptor+0x95/0xc0
: [ 1.860029] [<c03de702>] ? usb_get_configuration+0xa2/0x3e0
: [ 1.860029] [<c03de736>] ? usb_get_configuration+0xd6/0x3e0
: [ 1.860029] [<c03db3b4>] ? usb_get_device_descriptor+0xb4/0xc0
: [ 1.860029] [<c03d4bad>] ? usb_configure_device+0xcd/0x100
: [ 1.860029] [<c03dcc27>] ? usb_autopm_do_device+0x67/0xf0
: [ 1.860029] [<c03d4dcb>] ? usb_new_device+0x2b/0xc0
: [ 1.860029] [<c03d5627>] ? hub_port_connect_change+0x3f7/0x7e0
: [ 1.860029] [<c03daaf0>] ? usb_control_msg+0xd0/0x120
: [ 1.860029] [<c03d323b>] ? clear_port_feature+0x4b/0x60
: [ 1.860029] [<c03d6a95>] ? hub_events+0x1f5/0x500
: [ 1.860029] [<c0535251>] ? __schedule+0x431/0x750
: [ 1.860029] [<c014ef0a>] ? finish_wait+0x4a/0x70
: [ 1.860029] [<c03d6dd5>] ? hub_thread+0x35/0x150
: [ 1.860029] [<c014eda0>] ? autoremove_wake_function+0x0/0x40
: [ 1.860029] [<c03d6da0>] ? hub_thread+0x0/0x150
: [ 1.860029] [<c014ea46>] ? kthread+0x46/0x80
: [ 1.860029] [<c014ea00>] ? kthread+0x0/0x80
: [ 1.860029] [<c0103ef7>] ? kernel_thread_helper+0x7/0x10
: [ 1.860029] Code: 8b 55 bc 89 f8 e8 86 18 00 00 e8 31 fc ff ff 89 f0 e8 3a e7 c0 ff a1 00 da 72 c0 29 c3 89 da 89 d8 c1 fa 1f f7 d2 21 d0 8b 55 f0 <65> 33 15 14 00 00 00 75 46 83 c4 48 5b 5e 5f 5d c3 8d 74 26 00
: [ 1.860029] EIP: [<c053595b>] schedule_timeout+0x12b/0x1b0 SS:ESP 0068:df97fd00
: [ 1.864916] ---[ end trace f734973b361164b7 ]---

2.6.28 (the Ubuntu kernel, in regular use) and 2.6.29 (fewer than 10 boots on this machine) haven't shown this problem. I also haven't seen it since 2.6.31-rc1, but I have booted the rc kernel only a few times, so I might have been lucky. The versions where I have seen it are 2.6.30 from Ubuntu and the git version from when the merge window for 2.6.31 was open.

On Tue, Jun 30, 2009 at 2:11 AM, Andrew Morton<akpm@linux-foundation.org> wrote:

> This is a regression in 2.6.30.8. However we are not told which kernel
> version we regressed against. Which is the latest kernel which was
> known to work OK?

On Mon, 29 Jun 2009, Andrew Morton wrote:

> : [ 1.858435] Pid: 231, comm: khubd Not tainted (2.6.30-9-generic #10kms5-Ubuntu) Aspire 1350
> : [ 1.858485] EIP: 0060:[<c053595b>] EFLAGS: 00010202 CPU: 0
> : [ 1.858541] EIP is at schedule_timeout+0x12b/0x1b0

This looks like a bug in schedule_timeout, to judge by the EIP value. Which would make it a core kernel issue, unrelated to USB.

Alan Stern

On Tue, 30 Jun 2009 10:25:47 -0400 (EDT) Alan Stern <stern@rowland.harvard.edu> wrote:

> This looks like a bug in schedule_timeout, to judge by the EIP value.
> Which would make it a core kernel issue, unrelated to USB.

That sounds pretty unlikely.

The oops is a bit odd though - maybe some interrupt went off and scribbled on the stack? Or some DMA operation?

Also there is:

> 2. Computer boots normally except all USB ports are dead

On Tue, 30 Jun 2009, Andrew Morton wrote:

> > This looks like a bug in schedule_timeout, to judge by the EIP value.
> > Which would make it a core kernel issue, unrelated to USB.
>
> That sounds pretty unlikely.

Indeed, it is extremely unlikely.

> The oops is a bit odd though - maybe some interrupt went off and
> scribbled on the stack? Or some DMA operation?

If that's so, it won't be easy to track down.

> Also there is:
>
> > 2. Computer boots normally except all USB ports are dead

That's because the khubd thread died. Without it, there's nothing monitoring any USB ports.

Pauli, can you send a disassembly listing of your schedule_timeout function? It might help to know exactly what statement corresponds to offset 0x12b.

Alan Stern

Created attachment 22155 [details]
objdump -dS timer.o
This should be recompiled from same source as backtrace was. I have already upgraded to newer kernel.
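For readers following the request for a disassembly, here is a minimal Python sketch of how the numbers in the oops relate to such a listing. It is not part of the original thread. The helper names `split_symbol_offset` and `parse_code_line` are hypothetical. The inputs are copied from the oops quoted in this report, and the script only does arithmetic and string parsing, not disassembly.

```python
import re


def split_symbol_offset(location):
    """Split 'name+0xOFF/0xSIZE' (as printed in an oops) into its parts."""
    match = re.fullmatch(r"(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)", location)
    if match is None:
        raise ValueError(f"unrecognised symbol location: {location!r}")
    name, offset, size = match.groups()
    return name, int(offset, 16), int(size, 16)


def parse_code_line(code_line):
    """Return (code_bytes, index); index marks the byte the oops wrapped in <>."""
    tokens = code_line.split()
    index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    code_bytes = bytes(int(tok.strip("<>"), 16) for tok in tokens)
    return code_bytes, index


# Values copied from the oops quoted in this report.
EIP = 0xC053595B
LOCATION = "schedule_timeout+0x12b/0x1b0"
CODE = (
    "8b 55 bc 89 f8 e8 86 18 00 00 e8 31 fc ff ff 89 f0 e8 3a e7 c0 ff "
    "a1 00 da 72 c0 29 c3 89 da 89 d8 c1 fa 1f f7 d2 21 d0 8b 55 f0 "
    "<65> 33 15 14 00 00 00 75 46 83 c4 48 5b 5e 5f 5d c3 8d 74 26 00"
)

name, offset, size = split_symbol_offset(LOCATION)
function_start = EIP - offset  # address of the function in the running kernel
code_bytes, fault_index = parse_code_line(CODE)

print(f"{name} starts at {function_start:#x} in the running kernel")
print(f"fault is {offset:#x} bytes into a {size:#x}-byte function")
print("bytes from the fault onward:",
      " ".join(f"{b:02x}" for b in code_bytes[fault_index:]))
```

In an unlinked object such as timer.o the listing addresses are section-relative, so the number to look for is 0x12b added to wherever schedule_timeout starts in that listing. The byte under the marker, 0x65, is the x86 GS segment-override prefix, so the faulting instruction reads through %gs. A disassembler is still needed to map it to a source statement.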
I'm now traveling so can't test anything much, but I haven't faced this problem since updating to the rc1 kernel. So only kernel versions from when the merge window was open were affected. I can try going back to an earlier kernel version to locate what fixed the problem. If the problem was timing related, then this bug might still be just hidden somewhere.

Should this bug report be closed?

This probably should be closed because I'm not able to reproduce this any more after rc1. So I guess this was caused by some bug which was fixed.

Thanks, am closing out.
Created attachment 21982 [details]
dmesg from failed boot

I first saw this in the kernel 2.6.30-8 from Ubuntu Karmic. After that I upgraded to 2.6.30-9, which has the same problem. Now I wanted to try a vanilla git kernel with radeon KMS, so I compiled my own kernel from git. Today I saw it for the first time with this kernel as well.

What happens:
1. I make a cold boot or restart
2. Computer boots normally except all USB ports are dead
3. dmesg shows general protection error caused by khubd

This problem doesn't happen on even nearly every boot. I would give around a 20% chance of khubd failure, without collecting statistics.
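Because the failure is intermittent, a handful of clean boots says little about whether a kernel is fixed. The sketch below is not from the original report. It takes the reporter's rough 20% figure at face value and assumes each boot fails independently, which was never measured. The function names are hypothetical.

```python
import math


def chance_of_all_clean(failure_rate, boots):
    """Probability that `boots` consecutive boots all succeed by luck alone."""
    return (1.0 - failure_rate) ** boots


def clean_boots_needed(failure_rate, confidence):
    """Smallest number of consecutive clean boots that would happen by luck
    with probability at most (1 - confidence) if the bug were still present."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Assumption: 20% per-boot failure rate, as estimated (not measured) above.
RATE = 0.20
needed = clean_boots_needed(RATE, 0.95)
print(f"clean boots needed for 95% confidence: {needed}")
print(f"chance of {needed} lucky boots in a row: {chance_of_all_clean(RATE, needed):.3f}")
```

Under those assumptions, the same arithmetic applies to each step of a bisect, which is one way to see why bisecting an intermittent fault like this one is expensive.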