Bug 32052
Summary: | Unloading wmi module causes a kernel oops. | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Coacher (itumaykin+kernel) |
Component: | x86-64 | Assignee: | other_modules |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | dmitry.torokhov, error27, florian, itumaykin+kernel, jlee |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 2.6.37 and newer | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
lspci -vvv
Boot and error trace on 2.6.37
Boot and error trace on 2.6.38
Boot and error trace on 2.6.39
Boot and error trace on 2.6.39.3
dmesg from openSUSE
DSDT table
Output of wmidump
lshw output
lsmod start
lsmod finish
Trace on openSUSE after modules unloaded
wmi.c
wmi.c
Properly clean up WMI devices
Second _WDG method
dmesg output with debug_dump_wdg
Trace from 3.1-rc4-git2 |
Created attachment 52312 [details]
Boot and error trace on 2.6.37
Created attachment 52322 [details]
Boot and error trace on 2.6.38
Running "modprobe -r acer_wmi" triggers the error described in the report. Running "rmmod acer_wmi" works normally, but a subsequent "rmmod wmi" triggers the same error. So it looks like the acer_wmi module is innocent and the bug is in the wmi module alone. I also checked the changes in acer_wmi.c from 2.6.36.4 to 2.6.37.6 and found nothing that could cause such an error. However, I am far from being a kernel developer, which may be why I can't tell whether there is anything suspicious in wmi.c. I checked the diffs for it as well, but didn't understand most of them.
Created attachment 68162 [details]
Boot and error trace on 2.6.39
Still seeing this bug in 2.6.39.3. Tested on stable branch Gentoo amd64. Can provide any necessary info. This bug is really annoying because I can't suspend my notebook without unloading acer_wmi, which in turn unloads the wmi module, and that unload is broken. Suspend is vital for a notebook.
Created attachment 68852 [details]
Boot and error trace on 2.6.39.3
As seen in the previous attachment, the kernel complains about something wlan related, but I definitely remember an error about /sys/module/wmi/refcnt. So I booted the .39 kernel and unloaded acer_wmi and all network-related modules. After unloading wmi I got the attached error trace. To me it looks like the unloading procedure expects /sys/module/wmi/refcnt after it has already been removed. If anyone can guide me through the steps to collect the needed debug info, I'll do it.
This should be reassigned to drivers_platform_x86@kernel-bugs.osdl.org. Could someone please do that?
I can't reproduce this issue on my Acer machine with openSUSE 11.4, and I have never seen the error from comment #5. I will try to reproduce it first.
Created attachment 69312 [details]
dmesg from openSUSE
Well, I have openSUSE 11.4 on my Acer too. With today's updates I still see this error. The kernel is 2.6.37 plus SUSE patches, the default shipped with the distribution. Here is the uname -a output:
Linux Photon 2.6.37.6-0.7-desktop #1 SMP PREEMPT 2011-07-21 02:17:24 +0200 x86_64 x86_64 x86_64 GNU/Linux.
The dmesg output is attached. It is the result of running "rmmod acer_wmi" and then "modprobe -r wmi" right after boot completed.
I have Acer Aspire TimelineX 1830T.
Created attachment 69322 [details]
DSDT table
May be helpful.
Created attachment 69332 [details]
Output of wmidump
Will try x86_64.
Can NOT reproduce this issue on Acer TravelMate 8572 with OpenSUSE 11.4, kernel version is 2.6.37.1-1.2-desktop and 2.6.37.6-0.7-desktop
OpenSUSE 11.4 2.6.37.1-1.2-desktop
linux-cr4d:~ # rmmod acer-wmi
linux-cr4d:~ # modprobe -r wmi
Aug 20 00:11:41 linux-cr4d kernel: [ 1832.891142] wmi: Mapper loaded
Aug 20 00:11:41 linux-cr4d kernel: [ 1832.894369] acer-wmi: Acer Laptop ACPI-WMI Extras
Aug 20 00:11:41 linux-cr4d kernel: [ 1832.894787] acer-wmi: Brightness must be controlled by generic video driver
Aug 20 00:11:41 linux-cr4d NetworkManager[7211]: <info> found WiFi radio killswitch rfkill6 (at /sys/devices/platform/acer-wmi/rfkill/rfkill6) (driver acer-wmi)
Aug 20 00:12:03 linux-cr4d NetworkManager[7211]: <info> radio killswitch /sys/devices/platform/acer-wmi/rfkill/rfkill6 disappeared
Aug 20 00:12:03 linux-cr4d kernel: [ 1855.203628] acer-wmi: Acer Laptop WMI Extras unloaded
Aug 20 00:12:12 linux-cr4d kernel: [ 1864.079536] wmi: Mapper unloaded
2.6.37.6-0.7-desktop
[ 54.402005] acer-wmi: Acer Laptop ACPI-WMI Extras
[ 54.402449] acer-wmi: Brightness must be controlled by generic video driver
[ 59.430073] acer-wmi: Acer Laptop WMI Extras unloaded
[ 61.808118] wmi: Mapper unloaded
Tracing the kernel oops in Comment#8
[ 51.666633] acer-wmi: Acer Laptop WMI Extras unloaded
[ 54.756101] general protection fault: 0000 [#1] PREEMPT SMP
[ 54.756115] last sysfs file: /sys/module/wmi/refcnt
[ 54.756122] CPU 3
[ 54.756125] Modules linked in: af_packet ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit edd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables snd_pcm_oss cpufreq_conservative snd_mixer_oss cpufreq_userspace cpufreq_powersave snd_seq snd_seq_device acpi_cpufreq mperf dm_mod arc4 ecb ath9k snd_hda_codec_hdmi snd_hda_codec_realtek mac80211 snd_hda_intel snd_hda_codec ath9k_common uvcvideo ath9k_hw snd_hwdep videodev ath v4l1_compat cfg80211 snd_pcm v4l2_compat_ioctl32 atl1c shpchp snd_timer snd pci_hotplug i2c_i801 iTCO_wdt sg soundcore snd_page_alloc rfkill iTCO_vendor_support intel_ips pcspkr wmi(-) joydev battery ac ext4 jbd2 crc16 i915 drm_kms_helper drm i2c_algo_bit button video fan processor thermal thermal_sys [last unloaded: acer_wmi]
[ 54.756277]
[ 54.756285] Pid: 3217, comm: modprobe Not tainted 2.6.37.6-0.7-desktop #1 Acer Aspire 1830T/Base Board Product Name
[ 54.756301] RIP: 0010:[<ffffffff81329306>] [<ffffffff81329306>] device_del+0x16/0x1a0
[ 54.756322] RSP: 0018:ffff88006a94be18 EFLAGS: 00010206
[ 54.756331] RAX: 554e514553004241 RBX: ffff880069185040 RCX: 000000000000598f
[ 54.756341] RDX: ffff88006cc04448 RSI: 0000000000000282 RDI: ffff880069185040
[ 54.756351] RBP: 4854415056454400 R08: 0000000000000400 R09: ffffffff817ca43c
[ 54.756360] R10: ffffffffa01d747e R11: 0000000000000003 R12: ffff88006bfdc000
[ 54.756370] R13: 00007fffdb410cb0 R14: 000000000060f080 R15: 0000000000406750
[ 54.756381] FS: 00007fb8d6de5700(0000) GS:ffff88006f0c0000(0000) knlGS:0000000000000000
[ 54.756392] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 54.756401] CR2: 00007fb8d6dfe000 CR3: 000000006a659000 CR4: 00000000000006e0
[ 54.756412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.756428] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 54.756444] Process modprobe (pid: 3217, threadinfo ffff88006a94a000, task ffff880037892840)
[ 54.756454] Stack:
[ 54.756460] ffff880069185040 ffffffffa01d7c40 ffff88006bfdc000 ffffffff813294a5
[ 54.756474] 007665647562696c ffffffffa01d647f ffff88006bfdc350 ffffffffa01d64ae
[ 54.756487] 000000000060f080 ffffffff812be3e5 ffff88006bfdc350 ffffffffa01d7d30
[ 54.756499] Call Trace:
[ 54.756517] [<ffffffff813294a5>] device_unregister+0x15/0x50
[ 54.756539] [<ffffffffa01d647f>] wmi_free_devices+0x2f/0x40 [wmi]
[ 54.756568] [<ffffffffa01d64ae>] acpi_wmi_remove+0x1e/0x30 [wmi]
[ 54.756592] [<ffffffff812be3e5>] acpi_device_remove+0x81/0xa0
[ 54.756616] [<ffffffff8132ce1f>] __device_release_driver+0x6f/0xf0
[ 54.756639] [<ffffffff8132d658>] driver_detach+0xa8/0xb0
[ 54.756658] [<ffffffff8132cc7a>] bus_remove_driver+0x7a/0xf0
[ 54.756674] [<ffffffffa01d7244>] acpi_wmi_exit+0x10/0x2e [wmi]
[ 54.756692] [<ffffffff81096965>] sys_delete_module+0x175/0x260
[ 54.756711] [<ffffffff81002f8b>] system_call_fastpath+0x16/0x1b
[ 54.756726] [<00007fb8d6731bc7>] 0x7fb8d6731bc7
[ 54.756735] Code: 24 48 8b 6c 24 08 4c 8b 64 24 10 48 83 c4 18 c3 0f 1f 44 00 00 41 54 55 53 48 8b 87 80 00 00 00 48 89 fb 48 8b 2f 48 85 c0 74 18 <48> 8b 78 60 48 89 da be 02 00 00 00 48 81 c7 c0 00 00 00 e8 92
[ 54.756796] RIP [<ffffffff81329306>] device_del+0x16/0x1a0
[ 54.756808] RSP <ffff88006a94be18>d70
[ 54.756857] ---[ end trace f145bbcd3c8a271b ]---
the device_del function in drivers/base/core.c
objdump -d core.o > core.disasm
0000000000000d70 <device_del>:
 d70: 41 54                   push %r12
 d72: 55                      push %rbp
 d73: 53                      push %rbx
 d74: 48 8b 87 80 00 00 00    mov 0x80(%rdi),%rax
 d7b: 48 89 fb                mov %rdi,%rbx
 d7e: 48 8b 2f                mov (%rdi),%rbp
 d81: 48 85 c0                test %rax,%rax
 d84: 74 18                   je d9e <device_del+0x2e>
 d86: 48 8b 78 60             mov 0x60(%rax),%rdi <===== OOPS
 d8a: 48 89 da                mov %rbx,%rdx
 d8d: be 02 00 00 00          mov $0x2,%esi
 d92: 48 81 c7 c0 00 00 00    add $0xc0,%rdi
 d99: e8 00 00 00 00          callq d9e <device_del+0x2e>
 d9e: 48 89 df                mov %rbx,%rdi
 da1: e8 00 00 00 00          callq da6 <device_del+0x36>
 da6: 48 89 df                mov %rbx,%rdi
 da9: e8 00 00 00 00          callq dae <device_del+0x3e>
 dae: 48 85 ed                test %rbp,%rbp
 db1: 74 0d                   je dc0 <device_del+0x50>
In C source code:
void device_del(struct device *dev)
{
    struct device *parent = dev->parent;
    struct class_interface *class_intf;
asm("#0");
    /* Notify clients of device removal. This call must come
     * before dpm_sysfs_remove().
     */
    if (dev->bus)
asm("#1");
        blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
                                     BUS_NOTIFY_DEL_DEVICE, dev); <===== OOPS
asm("#2");
(In reply to comment #12)
Looks like it is somehow hardware specific. Should I provide any extra info in addition to lspci?
(In reply to comment #13)
What should I do next?
Created attachment 69412 [details]
lshw output
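An aside on the register dump in the oops trace: the faulting instruction loads from 0x60(%rax), and RAX, which holds the value read from dev->bus, is 554e514553004241. That does not look like a kernel pointer. The following is a minimal userspace sketch in C for inspecting such values, assuming little-endian byte order and 48-bit virtual addresses as on this x86-64 machine. The helper names reg_to_ascii, is_canonical and print_reg are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Render the 8 bytes of a 64-bit register value in little-endian
 * memory order. Non-printable bytes are shown as '.'.
 * out must have room for 9 characters (8 bytes plus terminator).
 */
void reg_to_ascii(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(value >> (8 * i));
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}

/*
 * Assumption: 48-bit virtual addresses, so bits 63..47 of a valid
 * (canonical) address are either all zero or all one.
 */
int is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}

/* Print one register with its text rendering and canonical check. */
void print_reg(const char *name, uint64_t value)
{
    char text[9];

    reg_to_ascii(value, text);
    printf("%s=%016llx text=\"%s\" canonical=%d\n",
           name, (unsigned long long)value, text, is_canonical(value));
}
```

Applied to the dump, RAX (554e514553004241) renders as AB.SEQNU and RBP (4854415056454400) as .DEVPATH, and RAX is not a canonical address, which fits a general protection fault rather than an ordinary page fault. One possible reading, not confirmed anywhere in this thread, is that the memory behind the struct device had already been freed and reused for text data by the time device_del() ran.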
Another view from core.s:
#APP
# 1058 "drivers/base/core.c" 1
    #1
# 0 "" 2
.LVL249:
#NO_APP
.L160:
    .loc 1 1059 0
    movq 96(%rax), %rdi # D.20532_4->p, D.20532_4->p <===== OOPS
    movq %rbx, %rdx # dev,
    movl $2, %esi #,
    addq $192, %rdi #, tmp94
    call blocking_notifier_call_chain #
.LVL250:
    .loc 1 1061 0
#APP
# 1061 "drivers/base/core.c" 1
    #2
# 0 "" 2
    .loc 1 1062 0
#NO_APP
    movq %rbx, %rdi # dev,
    call device_pm_remove #
The same in source code:
void device_del(struct device *dev)
{
    struct device *parent = dev->parent;
    struct class_interface *class_intf;
asm("#0");
    /* Notify clients of device removal. This call must come
     * before dpm_sysfs_remove().
     */
    if (dev->bus)
asm("#1");
        blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
                                     BUS_NOTIFY_DEL_DEVICE, dev); <===== OOPS
asm("#2");
(In reply to comment #16)
> Created an attachment (id=69412) [details]
> lshw output
Thanks for the information. The oops happens while clients are being notified of the device removal, before the device is taken off its bus: it occurs when we call the bus_notifier chain while removing wmi (this device) from the platform bus. Sorry, I am not strong in this area. I am looking at the 2.6.37 source code and hope to find out what is wrong. But yes, as you suspected, this issue may also be machine specific, because I can NOT reproduce it on my Acer TravelMate 8572, even with the 64-bit edition of openSUSE 11.4. Does this issue also happen when you remove a device from any other bus?
(In reply to comment #4)
> Created an attachment (id=68162) [details]
> Boot and error trace on 2.6.39
>
> Still seeing this bug in 2.6.39.3. Tested on stable branch Gentoo amd64. Can
> provide any necessary info. This bug is really annoying because I can't
> suspend
> my notebook without unloading acer_wmi, which is causing unloading the wmi
> module, which is broken. And suspend is vital for notebook.
By the way, you can't suspend your machine without unloading acer-wmi? That's odd; my machine can suspend without unloading acer-wmi. How do you know that acer-wmi is the problem when you suspend (S3 or S4?) your machine?
(In reply to comment #18)
I am far from a specialist, but looking through the code from 37.4 up to 39.3 I noticed that the code of device_del() and blocking_notifier_call_chain() hasn't changed. However, the code of the WMI driver changed a lot. To me it looks like the logic of the driver changed.
My attention was drawn to acpi_wmi_exit() in wmi.c. In 36.4 it first calls wmi_class_exit() and then acpi_bus_unregister_driver(), but in 37.4 and 39.3 it first calls acpi_bus_unregister_driver() and then class_unregister(&wmi_class). So I decided to swap the order of these calls in wmi.c, but with no luck. After recompiling I was still getting the oops. Maybe this is the wrong track, but the problem only appears when unloading this module.
I tried to unload as many modules as possible on my openSUSE machine. Started with lsmod.begin, finished with lsmod.finish. No errors occurred. After unloading wmi I got the attached trace.
Thank you for your attention and help!
Created attachment 69462 [details]
lsmod start
All modules list right after boot.
Created attachment 69472 [details]
lsmod finish
Unloaded as many modules as possible
Created attachment 69482 [details]
Trace on openSUSE after modules unloaded
(In reply to comment #19)
I can successfully suspend my machine without unloading acer_wmi. However, after resume the state of my bluetooth and WiFi is set to disabled and I have to switch it manually using Fn+F3. When I include acer_wmi in SUSPEND_MODULES (I'm using pm-utils), the state of the internal devices is restored properly. pm-utils unloads the modules in SUSPEND_MODULES recursively, and this is how the wmi module gets unloaded.
Several workarounds are possible:
1. Compile wmi into the kernel itself, as I really don't need it to be unloaded, only acer_wmi.
2. Tell modprobe not to unload the wmi module using modprobe.conf.
3. Fix the pm-utils code to not unload modules recursively, or to use rmmod, or, more elegantly, to use some kind of blacklisting. This is easy to do because they are plain bash scripts.
However, the bug exists. The problem is not related to suspend or hibernate. It is related only to unloading the wmi module.
Running vanilla 2.6.39.3. Compiled wmi into the kernel and acer_wmi as a module. Well, I can unload/load acer_wmi safely now, even with "modprobe -r". But now the Fn+F3 switch, which cycles through enabling/disabling wifi and bluetooth, does nothing. Brightness controls, touchpad enable/disable, display switch and sound controls work fine. Some output:
#rfkill list
2: phy0: Wireless LAN
    Soft blocked: no
    Hard blocked: no
3: hci0: Bluetooth
    Soft blocked: no
    Hard blocked: no
4: acer-wireless: Wireless LAN
    Soft blocked: no
    Hard blocked: no
5: acer-bluetooth: Bluetooth
    Soft blocked: no
    Hard blocked: no
Pressing Fn+F3 does nothing, but on 2.6.36 the wlan became hard blocked and the bluetooth device was removed completely. When I press Fn+F3, the syslog fills with:
Aug 20 20:41:43 Photon kernel: [ 456.530888] atkbd serio0: Unknown key pressed (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:43 Photon kernel: [ 456.530899] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:43 Photon kernel: [ 456.562487] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:43 Photon kernel: [ 456.562498] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:43 Photon kernel: [ 456.562504] acer_wmi: Unknown key number - 0x4
Aug 20 20:41:43 Photon kernel: [ 456.613570] atkbd serio0: Unknown key released (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:43 Photon kernel: [ 456.613576] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:48 Photon kernel: [ 461.360482] atkbd serio0: Unknown key pressed (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:48 Photon kernel: [ 461.360492] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:48 Photon kernel: [ 461.393342] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:48 Photon kernel: [ 461.393361] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:48 Photon kernel: [ 461.393372] acer_wmi: Unknown key number - 0x4
Will try to use setkeycodes as suggested.
After running "setkeycodes e013 240" in a terminal (not in an emulator under X) the message changed to this: acer_wmi: Unknown key number - 0x4. And still no actual switching happens. As mentioned here: http://permalink.gmane.org/gmane.linux.drivers.platform.x86.devel/2373 some work is needed to support new notebooks. I wonder whether my notebook is affected? Work is still in progress as far as I can see: http://permalink.gmane.org/gmane.linux.drivers.platform.x86.devel/2398 but should this behaviour be treated as a temporary defect due to incomplete support, or as a bug?
(In reply to comment #24)
Workaround 1 works partially, see comment #25. Workaround 2 works partially, with the same issue. Workaround 3 is not tested.
(In reply to comment #26)
> I guess if my notebook concerned?
Looking through acer-wmi.c and the list of GUIDs there, my notebook is of the AMW0_V2 type. So it should work.
The problem lies here (line 1342):
    switch (return_value.function) {
    case WMID_HOTKEY_EVENT:
        if (return_value.device_state) {
            u16 device_state = return_value.device_state;
            pr_debug("deivces states: 0x%x\n", device_state);
            if (has_cap(ACER_CAP_WIRELESS))
                rfkill_set_sw_state(wireless_rfkill,
                    !(device_state & ACER_WMID3_GDS_WIRELESS));
            if (has_cap(ACER_CAP_BLUETOOTH))
                rfkill_set_sw_state(bluetooth_rfkill,
                    !(device_state & ACER_WMID3_GDS_BLUETOOTH));
            if (has_cap(ACER_CAP_THREEG))
                rfkill_set_sw_state(threeg_rfkill,
                    !(device_state & ACER_WMID3_GDS_THREEG));
        }
        if (!sparse_keymap_report_event(acer_wmi_input_dev,
                        return_value.key_num, 1, true))
            pr_warning("Unknown key number - 0x%x\n",
                   return_value.key_num);
        break;
    default:
        pr_warning("Unknown function number - %d - %d\n",
               return_value.function, return_value.key_num);
        break;
    }
(In reply to comment #28)
> (In reply to comment #26)
> > I guess if my notebook concerned?
>
> Looking through acer-wmi.c and list of GUID there my notebook is AMW0_V2
> type.
> So it should work. The problem lies here (line1342):
>     switch (return_value.function) {
>     case WMID_HOTKEY_EVENT:
>         if (return_value.device_state) {
>             u16 device_state = return_value.device_state;
>             pr_debug("deivces states: 0x%x\n", device_state);
>             if (has_cap(ACER_CAP_WIRELESS))
>                 rfkill_set_sw_state(wireless_rfkill,
>                     !(device_state & ACER_WMID3_GDS_WIRELESS));
>             if (has_cap(ACER_CAP_BLUETOOTH))
>                 rfkill_set_sw_state(bluetooth_rfkill,
>                     !(device_state & ACER_WMID3_GDS_BLUETOOTH));
>             if (has_cap(ACER_CAP_THREEG))
>                 rfkill_set_sw_state(threeg_rfkill,
>                     !(device_state & ACER_WMID3_GDS_THREEG));
>         }
>         if (!sparse_keymap_report_event(acer_wmi_input_dev,
>                         return_value.key_num, 1, true))
>             pr_warning("Unknown key number - 0x%x\n",
>                    return_value.key_num);
>         break;
>     default:
>         pr_warning("Unknown function number - %d - %d\n",
>                return_value.function, return_value.key_num);
>         break;
>     }
I suggest filing another bug for this issue because it is not related to the wmi module problem. Could you please file another bug?
On the other hand, please try Seth's patch, which might fix your problem:
commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
Author: Seth Forshee <seth.forshee@canonical.com>
Date: Tue Jun 21 12:00:32 2011 -0500
    acer-wmi: Add support for Aspire 1830 wlan hotkey
    Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
    Signed-off-by: Matthew Garrett <mjg@redhat.com>
Let's get back to the wmi module removal problem:
(In reply to comment #20)
> (In reply to comment #18)
> I am far from specialist, but looking through the code of 37.4 and up to
> 39.3 I noticed that device_del() and blocking_notifier_call_chain()
> functions'
> code hasn't changed. However, the code of WMI driver changed a lot. To me it
> looks like the logic of the driver changed.
> I stuck my attention to acpi_wmi_exit() in wmi.c. In 36.4 first called
> wmi_class_exit() and then acpi_bus_unregister_driver(), but in 37.4 and 39.3
> first called acpi_bus_unregister_driver() and then
> class_unregister(&wmi_class). So, I decided to change the order of these
> functions in wmi.c, but with no luck. After recompiling I was still
> experiencing OOPS. Maybe, this is wrong tack, but problems only when
> unloading
> this module.
> I tried to unload as many modules as possible on my openSUSE machine.
> Started with lsmod.begin, finished with lsmod.finish. No errors occurred.
> After
> unloading wmi I got the trace attached.
>
> Thank you, for your attention and help!
The change in the wmi exit function comes from this patch:
From c64eefd48c44fa8145ad1f96edabf4a053fffc49 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Thu, 26 Aug 2010 00:15:30 -0700
Subject: [PATCH] WMI: embed struct device directly into wmi_block
Instead of creating wmi_blocks and then register corresponding devices on a separate pass do it all in one shot, since lifetime rules for both objects are the same. This also takes care of leaking devices when device_create fails for one of them.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Maybe we should test wmi.c BEFORE and AFTER this patch was applied.
Created attachment 69702 [details]
wmi.c
wmi.c file as it was BEFORE c64eefd48c44fa8145ad1f96edabf4a053fffc49 was applied
Created attachment 69712 [details]
wmi.c
wmi.c file as it was AFTER c64eefd48c44fa8145ad1f96edabf4a053fffc49 was applied
I wonder how many "_WDG" methods you have. Could you please load the wmi module with the debug_dump_wdg=1 option? Thanks.
Created attachment 69732 [details]
Properly clean up WMI devices
Could you please try this patch and see if it helps?
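The patch itself is only available as the attachment and is not reproduced here. As a rough illustration of the kind of cleanup its title suggests, here is a userspace sketch in C. It rests on an assumption that this thread does not confirm: that the crash comes from a teardown routine running more than once (for example once per _WDG method) over a shared list that still points at entries it already released. All names (struct block, block_list, block_add, block_free_all) are hypothetical and do not come from wmi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-GUID block kept on a global list. */
struct block {
    struct block *next;
    int id;
};

/* Head of the shared list; every teardown pass walks this. */
struct block *block_list;

/* Allocate a block and push it on the list. Returns 0 on success. */
int block_add(int id)
{
    struct block *b = malloc(sizeof(*b));

    if (b == NULL)
        return -1;
    b->id = id;
    b->next = block_list;
    block_list = b;
    return 0;
}

/*
 * Release every block. Each node is unlinked from the list BEFORE it
 * is freed, so the list never refers to freed memory and a repeated
 * call is harmless. Returns the number of blocks released.
 */
int block_free_all(void)
{
    int released = 0;

    while (block_list != NULL) {
        struct block *b = block_list;

        block_list = b->next; /* unlink first */
        free(b);              /* then release */
        released++;
    }
    return released;
}
```

With this shape a second call to block_free_all() finds an empty list and releases nothing, instead of walking entries whose memory was already freed and possibly reused.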
(In reply to comment #32)
What kernel version should I use? There are differences between the wmi.c marked as BEFORE that you provided and the wmi.c shipped with vanilla 2.6.39.3.
(In reply to comment #34)
> Created an attachment (id=69732) [details]
> Properly clean up WMI devices
>
> Could you please try this patch and see if it helps?
After applying this patch to the wmi.c shipped with vanilla 2.6.39.3, it doesn't compile and fails with this error:
CC drivers/platform/x86/wmi.o
drivers/platform/x86/wmi.c: In function ‘wmi_free_devices’:
drivers/platform/x86/wmi.c:757: error: expected expression before ‘for’
drivers/platform/x86/wmi.c:754: warning: unused variable ‘next’
drivers/platform/x86/wmi.c:754: warning: unused variable ‘wblock’
make[3]: *** [drivers/platform/x86/wmi.o] Error 1
I guess I'm doing something wrong.
Created attachment 71282 [details]
Second _WDG method
(In reply to comment #33)
I didn't know there could be more than one. I searched through the attached DSDT table and found another one, passed it through wmidump and attached the result. No problem loading with this option, but where will I get the output?
Created attachment 71352 [details]
dmesg output with debug_dump_wdg
(In reply to comment #33)
This is the part of dmesg with all the output from the wmi module loaded with debug_dump_wdg=1.
(In reply to comment #36)
> (In reply to comment #34)
> > Created an attachment (id=69732) [details] [details]
> > Properly clean up WMI devices
> >
> > Could you please try this patch and see if it helps?
> After applying this patch to wmi.c shipped with vanilla 2.6.39.3 it won't
> compile with error:
> CC drivers/platform/x86/wmi.o
> drivers/platform/x86/wmi.c: In function ‘wmi_free_devices’:
> drivers/platform/x86/wmi.c:757: error: expected expression before ‘for’
> drivers/platform/x86/wmi.c:754: warning: unused variable ‘next’
> drivers/platform/x86/wmi.c:754: warning: unused variable ‘wblock’
> make[3]: *** [drivers/platform/x86/wmi.o] Error 1
>
> I guess I'm doing something wrong.
There haven't been any changes to the file for a while, so the patch should apply to v2.6.38 onward. I wonder if you patched an older version of the file. Please re-fetch the sources and apply the patch again. If you still get compiler errors, please post the resulting version of wmi.c. Thanks.
(In reply to comment #32)
> Created an attachment (id=69712) [details]
> wmi.c
>
> wmi.c file that was AFTER applied c64eefd48c44fa8145ad1f96edabf4a053fffc49
Well, I took the vanilla 3.1-rc4-git2 kernel, which already has the changes from c64eefd48c44fa8145ad1f96edabf4a053fffc49. However, its wmi.c differs from the one you provided. Too bad, the bug is still there somewhere. See the trace below if it helps.
Created attachment 71792 [details]
Trace from 3.1-rc4-git2
(In reply to comment #29)
> I suggest file another bug for this issue because it does not relate to wmi
> module problem.
> Could you please file another bug?
>
> On the other hand, please try Seth's patch that might fix your problem:
>
> commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
> Author: Seth Forshee <seth.forshee@canonical.com>
> Date: Tue Jun 21 12:00:32 2011 -0500
>
> acer-wmi: Add support for Aspire 1830 wlan hotkey
>
> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> Signed-off-by: Matthew Garrett <mjg@redhat.com>
About this issue: it was fixed by the commit you mentioned and will be in mainline 3.1. Thanks.
But the same problem arose with bluetooth, because on my Acer both are switched in a cycle by one button. So I guess another one-line commit will do the trick. I'll file a bug report later.
(In reply to comment #39)
Hooray! It is fixed now with the patch you provided. Tested on vanilla 2.6.39.3 again with your patch. It compiled successfully this time and unloading works. Looks like I made a mess of wmi.c last time. Sorry.
Thank you and Mr. Lee, Chun-Yi for your help and attention. I'll mark this as resolved for now. If any issues appear I'll reopen it. Waiting for this patch to make its way upstream.
(In reply to comment #42)
> (In reply to comment #29)
> > I suggest file another bug for this issue because it does not relate to wmi
> > module problem.
> > Could you please file another bug?
> >
> > On the other hand, please try Seth's patch that might fix your problem:
> >
> > commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
> > Author: Seth Forshee <seth.forshee@canonical.com>
> > Date: Tue Jun 21 12:00:32 2011 -0500
> >
> > acer-wmi: Add support for Aspire 1830 wlan hotkey
> >
> > Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> > Signed-off-by: Matthew Garrett <mjg@redhat.com>
>
> Talking about this issue. It was fixed with the commit you've mentioned and
> will be in mainline 3.1. Thanks.
> But the same problem arised with bluetooth, because on my Acer they are
> switched via one button in cycle. So, I guess another one-line-commit will do
> the trick. I'll file a bug report later.
Yes, please file another bug and share your dmesg and dmidecode output with me. Please attach the dmesg captured after pressing the bluetooth key.
A patch referencing this bug report has been merged in Linux v3.2-rc1:
commit 023b9565972a4a5e0f01b9aa32680af6e9b5c388
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Wed Sep 7 15:00:02 2011 -0700
    WMI: properly cleanup devices to avoid crashes
(In reply to comment #45)
> A patch referencing this bug report has been merged in Linux v3.2-rc1:
>
> commit 023b9565972a4a5e0f01b9aa32680af6e9b5c388
> Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Date: Wed Sep 7 15:00:02 2011 -0700
>
> WMI: properly cleanup devices to avoid crashes
In fact this patch has already been merged into every recent kernel branch: it was added in 3.0.9, and it is already present in the 3.1.5 and 3.2 branches. So, this is RESOLVED CODE_FIX. |
Created attachment 52302 [details]
lspci -vvv
Hello. I have an Acer Aspire TimelineX 1830T, so I'm using the acer_wmi module. Since 2.6.37 there is a regression that causes a NULL pointer dereference and a kernel oops.
Steps to reproduce:
1. Boot any kernel newer than 2.6.37 (tested on 2.6.37.1, .3, .4, .5 and 2.6.38.1) with acer_wmi built as a module
2. Run "modprobe -r acer_wmi"
3. Kernel oops!
On 2.6.36, .1, .2, .4 it works flawlessly. Running ArchLinux x86_64. lspci -vvv and the relevant part of everything.log are attached. Can provide any other info if needed.