Bug 15758 - GENERATE_KEYMAP produces oops when attempting to start hald
Status: ASSIGNED
Alias: None
Product: Other
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: Alan
URL:
Keywords:
Duplicates: 43201
Depends on:
Blocks:
 
Reported: 2010-04-11 10:58 UTC by Nuzhna Pomoshch
Modified: 2012-06-18 21:12 UTC (History)
CC List: 4 users

See Also:
Kernel Version: 3.4
Subsystem:
Regression: Yes
Bisected commit-id:


Description Nuzhna Pomoshch 2010-04-11 10:58:43 UTC
I have spent well over a year and many hours tracking this down. I hope this proves useful.

1. Out of more than a dozen machines, this problem affects exactly one - an IBM T23 laptop.

2. The problem was not present in 2.6.23, but appeared shortly afterward and has been present in every kernel since.

3. I have some keyboard differences I prefer, and apply the following patch to drivers/char/Makefile:

--- linux-2.6.32/drivers/char/Makefile  2009-12-03 03:51:21.000000000 +0000
+++ linux-2.6.32/drivers/char/Makefile  2010-04-10 00:00:00.000000000 +0000
@@ -126,7 +126,7 @@
 # Uncomment if you're changing the keymap and have an appropriate
 # loadkeys version for the map. By default, we'll use the shipped
 # versions.
-# GENERATE_KEYMAP := 1
+GENERATE_KEYMAP := 1

 ifdef GENERATE_KEYMAP

4. In addition, I make some changes to drivers/char/defkeymap.map and include/asm-generic/termios.h.

5. The problem does not depend on any modifications to the files mentioned in statement 4. Removing just the two characters as described in statement 3 triggers the problem with no other changes. Without keymap generation, all kernels I have tested work flawlessly.

6. The oops occurs immediately when starting hald, whether as part of the boot sequence, or skipping it and starting it later manually. If I never start hald (which I need for X), the problem never appears.

7. I have done the most debugging work with 2.6.31.6, but the problem still exists in 2.6.33.2.

8. All of the kernels I have tested this with were compiled without module support.

9. The generated keymap is significantly larger than the shipped keymap.

10. I have links to various files and output (from 2.6.31.6):

Config file used to compile kernel:
http://bugs.gentoo.org/attachment.cgi?id=214685

Kernel compilation messages without generating the keymap:
http://bugs.gentoo.org/attachment.cgi?id=214687

Kernel compilation messages generating the keymap:
http://bugs.gentoo.org/attachment.cgi?id=214688

Bootup messages without a generated keymap:
http://bugs.gentoo.org/attachment.cgi?id=214690

Bootup messages with a generated keymap:
http://bugs.gentoo.org/attachment.cgi?id=214691

defkeymap.c_shipped:
http://bugs.gentoo.org/attachment.cgi?id=215384

Generated keymap with NO changes to defkeymap.map:
http://bugs.gentoo.org/attachment.cgi?id=215386

11. The oops output for 2.6.33.2 (if that is more helpful):
Apr 11 09:25:04 system kernel: BUG: unable to handle kernel NULL pointer dereference at 0000004a
Apr 11 09:25:04 system kernel: IP: [<c110736a>] strlen+0x8/0x11
Apr 11 09:25:04 system kernel: *pde = 00000000
Apr 11 09:25:04 system kernel: Oops: 0000 [#1]
Apr 11 09:25:04 system kernel: last sysfs file: /sys/devices/virtual/misc/agpgart/uevent
Apr 11 09:25:04 system kernel:
Apr 11 09:25:04 system kernel: Pid: 2291, comm: udevadm Not tainted 2.6.33.2 #1 26472TA/26472TA
Apr 11 09:25:04 system kernel: EIP: 0060:[<c110736a>] EFLAGS: 00010246 CPU: 0
Apr 11 09:25:04 system kernel: EIP is at strlen+0x8/0x11
Apr 11 09:25:04 system kernel: EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 000000d0
Apr 11 09:25:04 system kernel: ESI: 0000004a EDI: 0000004a EBP: 000000d0 ESP: ef0bded4
Apr 11 09:25:04 system kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Apr 11 09:25:04 system kernel: Process udevadm (pid: 2291, ti=ef0bc000 task=ef1bf4a0 task.ti=ef0bc000)
Apr 11 09:25:04 system kernel: Stack:
Apr 11 09:25:04 system kernel: ef92cb40 c104d040 ffffffff 00000000 ef0bdf20 ef92cb40 ef0bdf26 c1189610
Apr 11 09:25:04 system kernel: <0> ef92cb40 ef355000 ef355000 ef92cb48 c11896ac ef355000 c13c2e6e 000000af
Apr 11 09:25:04 system kernel: <0> ef355000 c13c2e65 0000000a 00000000 00be8280 c141ffe8 00000000 c1189831
Apr 11 09:25:04 system kernel: Call Trace:
Apr 11 09:25:04 system kernel: [<c104d040>] ? kstrdup+0x18/0x38
Apr 11 09:25:04 system kernel: [<c1189610>] ? device_get_devnode+0x41/0x8f
Apr 11 09:25:04 system kernel: [<c11896ac>] ? dev_uevent+0x4e/0x105
Apr 11 09:25:04 system kernel: [<c1189831>] ? show_uevent+0x64/0xa5
Apr 11 09:25:04 system kernel: [<c11897cd>] ? show_uevent+0x0/0xa5
Apr 11 09:25:04 system kernel: [<c11894ba>] ? dev_attr_show+0x16/0x32
Apr 11 09:25:04 system kernel: [<c10927bb>] ? sysfs_read_file+0x8b/0xea
Apr 11 09:25:04 system kernel: [<c1092730>] ? sysfs_read_file+0x0/0xea
Apr 11 09:25:04 system kernel: [<c1060967>] ? vfs_read+0x81/0x102
Apr 11 09:25:04 system kernel: [<c1060a80>] ? sys_read+0x3c/0x63
Apr 11 09:25:04 system kernel: [<c10025f0>] ? sysenter_do_call+0x12/0x26
Apr 11 09:25:04 system kernel: Code: eb 04 19 c0 0c 01 5e 5f c3 56 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e c3 57 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 49 89 c8 5f c3 57 31 ff 85 c9 74 0e 89 c7 89 d0 f2
Apr 11 09:25:04 system kernel: EIP: [<c110736a>] strlen+0x8/0x11 SS:ESP 0068:ef0bded4
Apr 11 09:25:04 system kernel: CR2: 000000000000004a
Apr 11 09:25:04 system kernel: ---[ end trace 6f467cd39ca0f982 ]---
Apr 11 09:25:10 system kernel: BUG: unable to handle kernel NULL pointer dereference at 00000069
Apr 11 09:25:10 system kernel: IP: [<c115c0bd>] misc_open+0x35/0xb7
Apr 11 09:25:10 system kernel: *pde = 00000000
Apr 11 09:25:10 system kernel: Oops: 0000 [#2]
Apr 11 09:25:10 system kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/boot_vga
Apr 11 09:25:10 system kernel:
Apr 11 09:25:10 system kernel: Pid: 2646, comm: X Tainted: G      D    2.6.33.2 #1 26472TA/26472TA
Apr 11 09:25:10 system kernel: EIP: 0060:[<c115c0bd>] EFLAGS: 00213293 CPU: 0
Apr 11 09:25:10 system kernel: EIP is at misc_open+0x35/0xb7
Apr 11 09:25:10 system kernel: EAX: 0000005d EBX: 0000003f ECX: ef089ea0 EDX: 00000069
Apr 11 09:25:10 system kernel: ESI: ef1dd380 EDI: 00000000 EBP: ef24ec5c ESP: ef089e88
Apr 11 09:25:10 system kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Apr 11 09:25:10 system kernel: Process X (pid: 2646, ti=ef088000 task=ef301b80 task.ti=ef088000)
Apr 11 09:25:10 system kernel: Stack:
Apr 11 09:25:10 system kernel: 00000000 ef8409c0 00000000 ef24ec5c c1061f5a ef1dd380 0000003f ef1dd380
Apr 11 09:25:10 system kernel: <0> ef24ec5c 00000000 c1061ea5 c105eca2 efbeed80 ef4cb700 ef1dd380 ef089f08
Apr 11 09:25:10 system kernel: <0> ef089f08 00000003 c105ee0c 00000000 ee031360 00000000 c1068737 c1a7ee80
Apr 11 09:25:10 system kernel: Call Trace:
Apr 11 09:25:10 system kernel: [<c1061f5a>] ? chrdev_open+0xb5/0xcb
Apr 11 09:25:10 system kernel: [<c1061ea5>] ? chrdev_open+0x0/0xcb
Apr 11 09:25:10 system kernel: [<c105eca2>] ? __dentry_open+0xd6/0x1b3
Apr 11 09:25:10 system kernel: [<c105ee0c>] ? nameidata_to_filp+0x26/0x37
Apr 11 09:25:10 system kernel: [<c1068737>] ? do_filp_open+0x445/0x866
Apr 11 09:25:10 system kernel: [<c10506d3>] ? handle_mm_fault+0x209/0x44c
Apr 11 09:25:10 system kernel: [<c106eeeb>] ? alloc_fd+0x49/0xab
Apr 11 09:25:10 system kernel: [<c105eaa1>] ? do_sys_open+0x48/0x114
Apr 11 09:25:10 system kernel: [<c105ebb1>] ? sys_open+0x1e/0x23
Apr 11 09:25:10 system kernel: [<c10025f0>] ? sysenter_do_call+0x12/0x26
Apr 11 09:25:10 system kernel: Code: 34 b8 c0 5d 41 c1 e8 c9 57 1a 00 a1 cc 5d 41 c1 81 e3 ff ff 0f 00 83 e8 0c eb 10 39 18 75 09 8b 40 08 85 c0 75 51 eb 11 8d 42 f4 <8b> 50 0c 0f 18 02 90 3d c0 5d 41 c1 75 e2 b8 c0 5d 41 c1 e8 33
Apr 11 09:25:10 system kernel: EIP: [<c115c0bd>] misc_open+0x35/0xb7 SS:ESP 0068:ef089e88
Apr 11 09:25:10 system kernel: CR2: 0000000000000069
Apr 11 09:25:10 system kernel: ---[ end trace 6f467cd39ca0f983 ]---
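
As a reading aid for the second oops above (an interpretation, not part of the original log): on 32-bit x86 the faulting instruction bytes `<8b> 50 0c` decode to a load from EAX + 0x0c, so with EAX = 0000005d the access should land on the reported CR2 of 00000069. A minimal shell check of that arithmetic follows; the variable names are illustrative only.

```shell
# Values copied from the misc_open oops; the names are illustrative only.
eax=0x5d      # EAX at the time of the fault
offset=0x0c   # displacement in the faulting load (bytes "8b 50 0c")

# Add the displacement to the register and format the result as hex.
fault_addr=$(printf '0x%02x' $(( $eax + $offset )))
echo "$fault_addr"   # compare with the CR2 value in the log (0x69)
```

Both faulting addresses (0x4a in strlen and 0x69 in misc_open) are small values rather than kernel addresses, which would be consistent with pointer fields holding unexpected data, but the log alone does not establish the cause.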

No doubt I forgot something important, but I will monitor and attempt to respond to questions as quickly as possible.

Thank you for your assistance.
Comment 1 Nuzhna Pomoshch 2010-10-24 13:44:02 UTC
This has become MUCH WORSE as of 2.6.35.

Last tested with 2.6.35.3.

Works fine without generating the keymap.

Uncommenting line 132 of /usr/src/linux-2.6.35.3/drivers/char/Makefile (no other changes) and recompiling causes the machine to crash: the screen goes black and the system becomes completely unresponsive (no ability to ssh in, for example). This now affects every computer I have, not just a single laptop, and it happens even at the console with X not running at all.

I am unable to gather any additional information or logs, since the machine goes completely dead as soon as any key is pressed.
Comment 2 Nuzhna Pomoshch 2010-12-13 23:32:27 UTC
Still exists in the 2.6.36 series.
Comment 3 Alan 2012-06-18 20:22:32 UTC
The generate keymap feature doesn't work, but there really isn't any reason to use it with modern loadable keymaps.
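
For reference, the runtime route this comment alludes to might look like the sketch below. It assumes the `dumpkeys` and `loadkeys` utilities from the kbd package are installed and that the commands are run with sufficient privileges on a virtual console; the function names and the file path are hypothetical.

```shell
# Hypothetical helpers; nothing runs until a function is called.

# Write the keymap currently used by the kernel to a file for editing.
save_keymap() {
    dumpkeys > "$1"
}

# Load an edited keymap file into the running kernel (typically needs root).
apply_keymap() {
    loadkeys "$1"
}

# Example (run on a virtual console):
#   save_keymap /tmp/custom.map
#   ...edit /tmp/custom.map...
#   apply_keymap /tmp/custom.map
```

A keymap loaded this way takes effect without rebuilding the kernel, so the compiled-in default keymap can stay at its shipped version. It does not persist across reboots unless something loads it again at boot.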
Comment 4 Alan 2012-06-18 21:12:13 UTC
*** Bug 43201 has been marked as a duplicate of this bug. ***
