Bug 10989 - kernel oopses when wiggling the mouse to make it known to hidd
kernel oopses when wiggling the mouse to make it known to hidd
Status: CLOSED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: Bluetooth
All Linux
: P1 normal
Assigned To: drivers_bluetooth@kernel-bugs.osdl.org
:
Depends on:
Blocks: 10492
  Show dependency treegraph
 
Reported: 2008-06-26 10:32 UTC by Daniel Vetter
Modified: 2008-10-06 09:00 UTC (History)
7 users (show)

See Also:
Kernel Version: 2.6.26-rc7-00051-g62786b9
Tree: Mainline
Regression: Yes


Attachments
Use a more unique bus name for connections (3.72 KB, patch)
2008-06-26 13:43 UTC, Marcel Holtmann
Details | Diff
dmesg output for comment #10 (53.23 KB, text/plain)
2008-07-01 01:19 UTC, Daniel Vetter
Details

Description Daniel Vetter 2008-06-26 10:32:07 UTC
Latest working kernel version: 2.6.25-rc5
Earliest failing kernel version:
Distribution: Debian unstable
Hardware Environment: amd64 2x Dual Core Opteron
Software Environment:
Problem Description:

While booting up, when wiggling the mouse at the right moment (to make hidd notice and configure it), the kernel crashes with an oops. Sometimes the bluetooth deamons just crashes, and then the kernel oopses when trying to shutdown the system or when trying to restart the bluetooth tools.

The mouse is a Logitech MX900 with the MX900 bluetooth usb dongle.

Steps to reproduce: Crash seems to be a race, most often it works flawless. Bisecting the problem doesn't work because I have not found out a way to reliably reproduce the crash.

Relevant portions of the dmesg when crashing while booting and mouse-wiggling (via serial-console):

[  180.691199] Bluetooth: L2CAP ver 2.9
[  180.695294] Bluetooth: L2CAP socket layer initialized
[  180.763617] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[  181.272621] Bluetooth: RFCOMM socket layer initialized
[  181.294815] Bluetooth: RFCOMM TTY layer initialized
[  181.300399] Bluetooth: RFCOMM ver 1.8
[  181.792272] kobject_add_internal failed for acl0007610D7C7E with -EEXIST, do.
[  181.806802] Pid: 5158, comm: btaddconn Not tainted 2.6.26-rc7-00051-g62786b96
[  181.814798]
[  181.814799] Call Trace:
[  181.818473]  [<ffffffff8035c2f7>] ? kobject_put+0x27/0x60
[  181.825841]  [<ffffffff8035c662>] kobject_add_internal+0x112/0x1c0
[  181.833846]  [<ffffffff8035c7c4>] kobject_add_varg+0x54/0x90
[  181.841976]  [<ffffffffa0293cb0>] ? :bluetooth:add_conn+0x0/0x90
[  181.849829]  [<ffffffff8035c8f9>] kobject_add+0x69/0x90
[  181.857407]  [<ffffffff80260371>] ? trace_hardirqs_on+0x121/0x170
[  181.861516]  [<ffffffff8024d802>] ? flush_cpu_workqueue+0x82/0xa0
[  181.875179]  [<ffffffff8025e369>] ? get_lock_stats+0x19/0x70
[  181.879189]  [<ffffffff8035c4aa>] ? kobject_get+0x1a/0x30
[  181.888821]  [<ffffffff803ddb17>] ? get_device+0x17/0x20
[  181.892830]  [<ffffffff803de521>] device_add+0xe1/0x630
[  181.899005]  [<ffffffff8035abf9>] ? __next_cpu+0x19/0x30
[  181.907194]  [<ffffffffa0293cb0>] ? :bluetooth:add_conn+0x0/0x90
[  181.919084]  [<ffffffffa0293cd9>] :bluetooth:add_conn+0x29/0x90
[  181.923974]  [<ffffffff8024d3e5>] run_workqueue+0xf5/0x240
[  181.931983]  [<ffffffff8024d5d7>] worker_thread+0xa7/0x120
[  181.935951]  [<ffffffff80251ac0>] ? autoremove_wake_function+0x0/0x40
[  181.943954]  [<ffffffff8024d530>] ? worker_thread+0x0/0x120
[  181.948393]  [<ffffffff80251719>] kthread+0x49/0x80
[  181.956268]  [<ffffffff8020d478>] child_rip+0xa/0x12
[  181.964275]  [<ffffffff8020cb63>] ? restore_args+0x0/0x30
[  181.967361]  [<ffffffff802516d0>] ? kthread+0x0/0x80
[  181.972072]  [<ffffffff8020d46e>] ? child_rip+0x0/0x12
[  181.982170]
[  181.982694] add_conn: Failed to register connection device
[  181.989678] kobject_add_internal failed for acl0007610D7C7E with -EEXIST, do.
[  182.002233] Pid: 5158, comm: btaddconn Not tainted 2.6.26-rc7-00051-g62786b96
[  182.010379]
[  182.010379] Call Trace:
[  182.014406]  [<ffffffff8035c2f7>] ? kobject_put+0x27/0x60
[  182.023722]  [<ffffffff8035c662>] kobject_add_internal+0x112/0x1c0
[  182.034150]  [<ffffffff8035c7c4>] kobject_add_varg+0x54/0x90
[  182.038164]  [<ffffffffa0293cb0>] ? :bluetooth:add_conn+0x0/0x90
[  182.046994]  [<ffffffff8035c8f9>] kobject_add+0x69/0x90
[  182.049975]  [<ffffffff80260371>] ? trace_hardirqs_on+0x121/0x170
[  182.058809]  [<ffffffff8024d802>] ? flush_cpu_workqueue+0x82/0xa0
[  182.063473]  [<ffffffff8025e369>] ? get_lock_stats+0x19/0x70
[  182.072313]  [<ffffffff8035c4aa>] ? kobject_get+0x1a/0x30
[  182.076324]  [<ffffffff803ddb17>] ? get_device+0x17/0x20
[  182.082138]  [<ffffffff803de521>] device_add+0xe1/0x630
[  182.087519]  [<ffffffff8035abf9>] ? __next_cpu+0x19/0x30
[  182.096994]  [<ffffffffa0293cb0>] ? :bluetooth:add_conn+0x0/0x90
[  182.101338]  [<ffffffffa0293cd9>] :bluetooth:add_conn+0x29/0x90
[  182.110184]  [<ffffffff8024d3e5>] run_workqueue+0xf5/0x240
[  182.114398]  [<ffffffff8024d5d7>] worker_thread+0xa7/0x120
[  182.119826]  [<ffffffff80251ac0>] ? autoremove_wake_function+0x0/0x40
[  182.128673]  [<ffffffff8024d530>] ? worker_thread+0x0/0x120
[  182.138404]  [<ffffffff80251719>] kthread+0x49/0x80
[  182.142554]  [<ffffffff8020d478>] child_rip+0xa/0x12
[  182.151788]  [<ffffffff8020cb63>] ? restore_args+0x0/0x30
[  182.155803]  [<ffffffff802516d0>] ? kthread+0x0/0x80
[  182.161261]  [<ffffffff8020d46e>] ? child_rip+0x0/0x12
[  182.170818]
[  182.172391] add_conn: Failed to register connection device
[  182.306041] BUG: unable to handle kernel NULL pointer dereference at 00000008
[  182.450066] IP: [<ffffffff8031423a>] sysfs_find_dirent+0xa/0x40
[  182.450066] PGD 7c0d4067 PUD 7c0d5067 PMD 0
[  182.450066] Oops: 0000 [1] PREEMPT SMP
[  182.450066] CPU 3
[  182.450066] Modules linked in: rfcomm hidp l2cap nls_iso8859_1 nls_cp437 vfas
[  182.450066] Pid: 7725, comm: hidd Not tainted 2.6.26-rc7-00051-g62786b9 #6
[  182.450066] RIP: 0010:[<ffffffff8031423a>]  [<ffffffff8031423a>] sysfs_find_0
[  182.450066] RSP: 0018:ffff81007c171a48  EFLAGS: 00010296
[  182.450066] RAX: 2222222222222222 RBX: ffffffff8057744f RCX: 2222222222222222
[  182.450066] RDX: 2222222222222222 RSI: ffffffff8057744f RDI: 0000000000000000
[  182.450066] RBP: ffff81007c171a58 R08: 0000000000000001 R09: 0000000000000003
[  182.450066] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057744f
[  182.450066] R13: ffffffff805eb1e0 R14: ffff8101614b7910 R15: ffff8101614b7698
[  182.450066] FS:  00007f720ab0f6e0(0000) GS:ffff8101614292c0(0000) knlGS:00000
[  182.450066] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  182.450066] CR2: 0000000000000028 CR3: 000000007c04d000 CR4: 00000000000006e0
[  182.450066] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  182.450066] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  182.450066] Process hidd (pid: 7725, threadinfo ffff81007c170000, task ffff8)
[  182.450066] Stack:  ffffffff8057744f 0000000000000000 ffff81007c171a78 fffff0
[  182.450066]  0000000000000000 ffff8101614b7760 ffff81007c171aa8 ffffffff80315
[  182.450066]  ffff8101614b7760 ffff8101614b7760 ffff8101614b7000 ffff81015d058
[  182.450066] Call Trace:
[  182.450066]  [<ffffffff80314380>] sysfs_get_dirent+0x30/0x80
[  182.450066]  [<ffffffff80315c75>] sysfs_remove_group+0x25/0xe0
[  182.450066]  [<ffffffff803e3f27>] dpm_sysfs_remove+0x17/0x20
[  182.450066]  [<ffffffff803e4652>] device_pm_remove+0x22/0x60
[  182.450066]  [<ffffffff803de24a>] device_del+0x1a/0x190
[  182.450066]  [<ffffffff803de3d1>] device_unregister+0x11/0x20
[  182.450066]  [<ffffffff80403fbb>] input_unregister_device+0x19b/0x1c0
[  182.450066]  [<ffffffffa011f94d>] :hid:hidinput_disconnect+0x3d/0x60
[  182.450066]  [<ffffffffa01220ca>] :hid:hidinput_connect+0x24fa/0x2530
[  182.450066]  [<ffffffffa04d5c99>] :hidp:hidp_add_connection+0x479/0x740
[  182.450066]  [<ffffffff802b992f>] ? fget+0x7f/0xa0
[  182.450066]  [<ffffffffa04d6c0b>] :hidp:hidp_sock_ioctl+0x20b/0x260
[  182.450066]  [<ffffffff8032e959>] ? avc_has_perm+0x59/0x80
[  182.450066]  [<ffffffff80330e35>] ? inode_has_perm+0x45/0x80
[  182.450066]  [<ffffffff80412caa>] ? sys_sendto+0xea/0x120
[  182.450066]  [<ffffffff804117f6>] sock_ioctl+0x56/0x260
[  182.450066]  [<ffffffff802c74a1>] vfs_ioctl+0x31/0xa0
[  182.450066]  [<ffffffff802c7783>] do_vfs_ioctl+0x273/0x2e0
[  182.450066]  [<ffffffff802c7889>] sys_ioctl+0x99/0xa0
[  182.450066]  [<ffffffff8020c50b>] system_call_after_swapgs+0x7b/0x80
[  182.450066]
[  182.450066]
[  182.450066] Code: ff ff 0f 1f 80 00 00 00 00 48 8b 43 38 0f 1f 40 00 e9 64 f
[  182.450066] RIP  [<ffffffff8031423a>] sysfs_find_dirent+0xa/0x40
[  182.450066]  RSP <ffff81007c171a48>
[  182.450066] CR2: 0000000000000028
[  183.254096] ---[ end trace f38a864186b8e81d ]---
[  590.949898] SysRq : Emergency Sync
[  591.632111] Emergency Sync complete
[  594.045739] SysRq : Emergency Remount R/O
[  594.045739] Emergency Remount complete
Comment 1 Andrew Morton 2008-06-26 11:27:13 UTC
You sort-of imply that this is a regression, but that isn't clear?

Comment 2 Dmitry Torokhov 2008-06-26 11:38:28 UTC
Hmm, I think we had a similar issue earlier when we tear down connection that is not completely set up. Maybe Marcel has an idea.
Comment 3 Marcel Holtmann 2008-06-26 13:43:27 UTC
Created attachment 16642 [details]
Use a more unique bus name for connections
Comment 4 Marcel Holtmann 2008-06-26 13:44:20 UTC
You can test this patch if it works. Also you can pull in the bluetooth-2.6 tree since it contains something like 22 pending patches.
Comment 5 Adrian Bunk 2008-06-26 14:05:06 UTC
You said 2.6.25-rc5 worked.

Does 2.6.25 also work?
Comment 6 Dave Young 2008-06-26 19:38:45 UTC
Two things:
1. Why kobject EXIST? I remember some csr dongles need to reset firstly before being used or will cause something like this.

2. How can we retreive the work function return value? any idea?

If the device_add fails we ignore the return value now, so further bug is triggered.

Comment 7 Daniel Vetter 2008-06-27 02:41:53 UTC
> ------- Comment #5 from bunk@kernel.org  2008-06-26 14:05 -------
> You said 2.6.25-rc5 worked.
> 
> Does 2.6.25 also work?

Debian flavour 2.6.25-2 works. I only started tracking -linus very
recently, but if you want I could recheck mainline 2.6.25. The problem is
that it's sometimes very hard to hit: Yesterday I rebooted consecutively at least 20
times without hitting it, right this morning I've hitted it on the first
boot-up.
Comment 8 Daniel Vetter 2008-06-27 02:53:03 UTC
I just saw that I screwed up the version numbers. I meant 2.6.26-rc7-git doesn't work and 2.6.26-rc5 _does_ work. So s/25-rc/26-rc/ in all my comments.

Sorry for that mess-up.
Comment 9 Daniel Vetter 2008-06-28 06:52:59 UTC
> ------- Comment #4 from marcel@holtmann.org  2008-06-26 13:44 -------
> You can test this patch if it works. Also you can pull in the bluetooth-2.6
> tree since it contains something like 22 pending patches.

I tested:

- 2.6.26-rc8 vanilla, which crashed,
- 2.6.26-rc8 + bluetooth-2.6 and your patched reverted (it did not apply
  to rc8), which eventually crashed and
- 2.6.26-rc8 + bluetooth-2.6, which survived quite a lot of test-runs.

So I think your patch does prevent the oops from happening.
Comment 10 Daniel Vetter 2008-06-30 02:52:02 UTC
While retesting mainline I capture an oops for the
second case (i.e. beside the mouse everything works 'till shutdown, were
presumably a process is stuck somewhere odd).

This is on 2.6.26-rc8-00076-g1702b52 from -linus.
Comment 11 Daniel Vetter 2008-07-01 01:19:08 UTC
Created attachment 16670 [details]
dmesg output for comment #10
Comment 12 Rafael J. Wysocki 2008-07-06 10:58:55 UTC
Handled-By : Marcel Holtmann <marcel@holtmann.org>
Comment 13 Rafael J. Wysocki 2008-07-10 03:34:38 UTC
On Thursday, 10 of July 2008, Daniel Vetter wrote:
> Hi
> 
> After about a dozen rebots I managed to crash 2.6.26-rc9-00005-g1b40a89 in
> the same fashion.
> 
> Yours Daniel
> On Sun, Jul 06, 2008 at 01:45:53PM +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.25.  Please verify if it still should be listed.
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10989
> > Subject		: kernel oopses when wiggling the mouse to make it known to hidd
> > Submitter	: Daniel Vetter <daniel@ffwll.ch>
> > Date		: 2008-06-26 10:32 (11 days old)

Comment 14 Rafael J. Wysocki 2008-07-14 09:23:36 UTC
On Monday, 14 of July 2008, Daniel Vetter wrote:
> 2.6.26 crashed right away on the first boot-up.
> 
> On Sun, Jul 13, 2008 at 08:00:46PM +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.25.  Please verify if it still should be listed.
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10989
> > Subject		: kernel oopses when wiggling the mouse to make it known to hidd
> > Submitter	: Daniel Vetter <daniel@ffwll.ch>
> > Date		: 2008-06-26 10:32 (18 days old)
> > Handled-By	: Marcel Holtmann <marcel@holtmann.org>

Comment 15 Daniel Vetter 2008-10-06 09:00:10 UTC
I'm closing this (fix arrived in -linus in time for 2.6.27).

btw: is it the reporters duty to close a bug in a case like this?

Note You need to log in before you can comment on or make changes to this bug.