Bug 32052 - Unloading wmi module causes a kernel oops.
Status: CLOSED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: other_modules
Reported: 2011-03-28 12:15 UTC by Coacher
Modified: 2012-01-15 17:23 UTC
CC List: 5 users

Kernel Version: 2.6.37 and newer
Tree: Mainline
Regression: Yes


Attachments
lspci -vvv (11.06 KB, text/plain), 2011-03-28 12:15 UTC, Coacher
Boot and error trace on 2.6.37 (59.87 KB, text/plain), 2011-03-28 12:22 UTC, Coacher
Boot and error trace on 2.6.38 (63.23 KB, text/plain), 2011-03-28 12:51 UTC, Coacher
Boot and error trace on 2.6.39 (47.01 KB, text/plain), 2011-08-09 08:57 UTC, Coacher
Boot and error trace on 2.6.39.3 (48.00 KB, text/plain), 2011-08-15 10:00 UTC, Coacher
dmesg from openSUSE (53.45 KB, text/plain), 2011-08-18 20:39 UTC, Coacher
DSDT table (529.98 KB, text/plain), 2011-08-18 20:41 UTC, Coacher
Output of wmidump (611 bytes, text/plain), 2011-08-18 20:42 UTC, Coacher
lshw output (16.63 KB, text/plain), 2011-08-19 15:21 UTC, Coacher
lsmod start (3.70 KB, text/plain), 2011-08-20 10:44 UTC, Coacher
lsmod finish (1.57 KB, text/plain), 2011-08-20 10:45 UTC, Coacher
Trace on openSUSE after modules unloaded (4.28 KB, text/plain), 2011-08-20 10:46 UTC, Coacher
wmi.c (23.28 KB, text/x-csrc), 2011-08-23 04:36 UTC, Lee, Chun-Yi
wmi.c (22.51 KB, text/x-csrc), 2011-08-23 04:37 UTC, Lee, Chun-Yi
Properly clean up WMI devices (841 bytes, patch), 2011-08-23 05:46 UTC, Dmitry Torokhov
Second _WDG method (3.24 KB, text/plain), 2011-09-01 22:32 UTC, Coacher
dmesg output with debug_dump_wdg (5.38 KB, text/plain), 2011-09-02 14:17 UTC, Coacher
Trace from 3.1-rc4-git2 (3.38 KB, text/plain), 2011-09-06 20:32 UTC, Coacher

Description Coacher 2011-03-28 12:15:42 UTC
Created attachment 52302 [details]
lspci -vvv

Hello.

I have an Acer Aspire TimelineX 1830T, so I'm using the acer_wmi module. Since 2.6.37 there has been a regression that causes a NULL pointer dereference and a kernel oops.

Steps to reproduce:
1. Boot kernel 2.6.37 or newer (tested on 2.6.37.1, .3, .4, .5 and 2.6.38.1) with acer_wmi built as a module
2. Run "modprobe -r acer_wmi"
3. Kernel oops!

On 2.6.36, 2.6.36.1, .2 and .4 it works flawlessly.
Running ArchLinux x86_64.

lspci -vvv output and the relevant part of everything.log are attached.
I can provide any other info if needed.
Comment 1 Coacher 2011-03-28 12:22:28 UTC
Created attachment 52312 [details]
Boot and error trace on 2.6.37
Comment 2 Coacher 2011-03-28 12:51:39 UTC
Created attachment 52322 [details]
Boot and error trace on 2.6.38
Comment 3 Coacher 2011-07-26 13:19:01 UTC
When using "modprobe -r acer_wmi" the error occurs as described above, but "rmmod acer_wmi" works normally; if I then run "rmmod wmi", the error occurs as described above. So it looks like the acer_wmi module is innocent and this bug concerns only the wmi module. I've also checked the changes in acer_wmi.c from 2.6.36.4 to 2.6.37.6 and found nothing that could cause such an error. However, I am far from being a kernel developer, which may be why I can't tell whether there is something suspicious in wmi.c. I've checked the diffs in it too, but didn't understand most of them.
Comment 4 Coacher 2011-08-09 08:57:19 UTC
Created attachment 68162 [details]
Boot and error trace on 2.6.39

Still seeing this bug in 2.6.39.3. Tested on the stable branch of Gentoo amd64. I can provide any necessary info. This bug is really annoying because I can't suspend my notebook without unloading acer_wmi, which in turn unloads the wmi module, which is broken. And suspend is vital for a notebook.
Comment 5 Coacher 2011-08-15 10:00:30 UTC
Created attachment 68852 [details]
Boot and error trace on 2.6.39.3

As seen in the previous attachment, the kernel complains about something wlan-related, but I definitely remember there was an error about /sys/module/wmi/refcnt. So I booted the 2.6.39 kernel and unloaded all network-related modules and acer_wmi. After unloading wmi I got the attached error trace. To me it looks like the unloading procedure expects /sys/module/wmi/refcnt while it has already been removed. If anyone could guide me through the steps to collect all the needed debug info, I'll do it.
Comment 6 Coacher 2011-08-17 22:28:39 UTC
This should be reassigned to drivers_platform_x86@kernel-bugs.osdl.org
Could someone please do that?
Comment 7 Lee, Chun-Yi 2011-08-18 00:22:58 UTC
I can't reproduce this issue on my Acer machine with openSUSE 11.4, and I have never seen the error from comment #5.

I will try to reproduce it first.
Comment 8 Coacher 2011-08-18 20:39:30 UTC
Created attachment 69312 [details]
dmesg from openSUSE

Well, I have openSUSE 11.4 on my Acer too. With today's updates I still see this error. The kernel is 2.6.37 + SUSE patches, the default one shipped with the distribution. Here is the uname -a output:
Linux Photon 2.6.37.6-0.7-desktop #1 SMP PREEMPT 2011-07-21 02:17:24 +0200 x86_64 x86_64 x86_64 GNU/Linux.
The dmesg output is attached. It is the result of running "rmmod acer_wmi" and then "modprobe -r wmi" right after boot completed.
I have Acer Aspire TimelineX 1830T.
Comment 9 Coacher 2011-08-18 20:41:17 UTC
Created attachment 69322 [details]
DSDT table

May be helpful.
Comment 10 Coacher 2011-08-18 20:42:31 UTC
Created attachment 69332 [details]
Output of wmidump
Comment 11 Lee, Chun-Yi 2011-08-19 00:00:31 UTC
Will try x86_64.
Comment 12 Lee, Chun-Yi 2011-08-19 11:03:26 UTC
Can NOT reproduce this issue on an Acer TravelMate 8572 with openSUSE 11.4; kernel versions are 2.6.37.1-1.2-desktop and 2.6.37.6-0.7-desktop.


openSUSE 11.4, 2.6.37.1-1.2-desktop

linux-cr4d:~ # rmmod acer-wmi
linux-cr4d:~ # modprobe -r wmi

Aug 20 00:11:41 linux-cr4d kernel: [ 1832.891142] wmi: Mapper loaded
Aug 20 00:11:41 linux-cr4d kernel: [ 1832.894369] acer-wmi: Acer Laptop ACPI-WMI Extras
Aug 20 00:11:41 linux-cr4d kernel: [ 1832.894787] acer-wmi: Brightness must be controlled by generic video driver
Aug 20 00:11:41 linux-cr4d NetworkManager[7211]: <info> found WiFi radio killswitch rfkill6 (at /sys/devices/platform/acer-wmi/rfkill/rfkill6) (driver acer-wmi)
Aug 20 00:12:03 linux-cr4d NetworkManager[7211]: <info> radio killswitch /sys/devices/platform/acer-wmi/rfkill/rfkill6 disappeared
Aug 20 00:12:03 linux-cr4d kernel: [ 1855.203628] acer-wmi: Acer Laptop WMI Extras unloaded
Aug 20 00:12:12 linux-cr4d kernel: [ 1864.079536] wmi: Mapper unloaded



2.6.37.6-0.7-desktop
[   54.402005] acer-wmi: Acer Laptop ACPI-WMI Extras
[   54.402449] acer-wmi: Brightness must be controlled by generic video driver
[   59.430073] acer-wmi: Acer Laptop WMI Extras unloaded
[   61.808118] wmi: Mapper unloaded
Comment 13 Lee, Chun-Yi 2011-08-19 11:05:40 UTC
Tracing the kernel oops in Comment#8

[   51.666633] acer-wmi: Acer Laptop WMI Extras unloaded
[   54.756101] general protection fault: 0000 [#1] PREEMPT SMP 
[   54.756115] last sysfs file: /sys/module/wmi/refcnt
[   54.756122] CPU 3 
[   54.756125] Modules linked in: af_packet ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit edd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables snd_pcm_oss cpufreq_conservative snd_mixer_oss cpufreq_userspace cpufreq_powersave snd_seq snd_seq_device acpi_cpufreq mperf dm_mod arc4 ecb ath9k snd_hda_codec_hdmi snd_hda_codec_realtek mac80211 snd_hda_intel snd_hda_codec ath9k_common uvcvideo ath9k_hw snd_hwdep videodev ath v4l1_compat cfg80211 snd_pcm v4l2_compat_ioctl32 atl1c shpchp snd_timer snd pci_hotplug i2c_i801 iTCO_wdt sg soundcore snd_page_alloc rfkill iTCO_vendor_support intel_ips pcspkr wmi(-) joydev battery ac ext4 jbd2 crc16 i915 drm_kms_helper drm i2c_algo_bit button video fan processor thermal thermal_sys [last unloaded: acer_wmi]
[   54.756277] 
[   54.756285] Pid: 3217, comm: modprobe Not tainted 2.6.37.6-0.7-desktop #1 Acer Aspire 1830T/Base Board Product Name
[   54.756301] RIP: 0010:[<ffffffff81329306>]  [<ffffffff81329306>] device_del+0x16/0x1a0
[   54.756322] RSP: 0018:ffff88006a94be18  EFLAGS: 00010206
[   54.756331] RAX: 554e514553004241 RBX: ffff880069185040 RCX: 000000000000598f
[   54.756341] RDX: ffff88006cc04448 RSI: 0000000000000282 RDI: ffff880069185040
[   54.756351] RBP: 4854415056454400 R08: 0000000000000400 R09: ffffffff817ca43c
[   54.756360] R10: ffffffffa01d747e R11: 0000000000000003 R12: ffff88006bfdc000
[   54.756370] R13: 00007fffdb410cb0 R14: 000000000060f080 R15: 0000000000406750
[   54.756381] FS:  00007fb8d6de5700(0000) GS:ffff88006f0c0000(0000) knlGS:0000000000000000
[   54.756392] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   54.756401] CR2: 00007fb8d6dfe000 CR3: 000000006a659000 CR4: 00000000000006e0
[   54.756412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   54.756428] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   54.756444] Process modprobe (pid: 3217, threadinfo ffff88006a94a000, task ffff880037892840)
[   54.756454] Stack:
[   54.756460]  ffff880069185040 ffffffffa01d7c40 ffff88006bfdc000 ffffffff813294a5
[   54.756474]  007665647562696c ffffffffa01d647f ffff88006bfdc350 ffffffffa01d64ae
[   54.756487]  000000000060f080 ffffffff812be3e5 ffff88006bfdc350 ffffffffa01d7d30
[   54.756499] Call Trace:
[   54.756517]  [<ffffffff813294a5>] device_unregister+0x15/0x50
[   54.756539]  [<ffffffffa01d647f>] wmi_free_devices+0x2f/0x40 [wmi]
[   54.756568]  [<ffffffffa01d64ae>] acpi_wmi_remove+0x1e/0x30 [wmi]
[   54.756592]  [<ffffffff812be3e5>] acpi_device_remove+0x81/0xa0
[   54.756616]  [<ffffffff8132ce1f>] __device_release_driver+0x6f/0xf0
[   54.756639]  [<ffffffff8132d658>] driver_detach+0xa8/0xb0
[   54.756658]  [<ffffffff8132cc7a>] bus_remove_driver+0x7a/0xf0
[   54.756674]  [<ffffffffa01d7244>] acpi_wmi_exit+0x10/0x2e [wmi]
[   54.756692]  [<ffffffff81096965>] sys_delete_module+0x175/0x260
[   54.756711]  [<ffffffff81002f8b>] system_call_fastpath+0x16/0x1b
[   54.756726]  [<00007fb8d6731bc7>] 0x7fb8d6731bc7
[   54.756735] Code: 24 48 8b 6c 24 08 4c 8b 64 24 10 48 83 c4 18 c3 0f 1f 44 00 00 41 54 55 53 48 8b 87 80 00 00 00 48 89 fb 48 8b 2f 48 85 c0 74 18 <48> 8b 78 60 48 89 da be 02 00 00 00 48 81 c7 c0 00 00 00 e8 92 
[   54.756796] RIP  [<ffffffff81329306>] device_del+0x16/0x1a0
[   54.756808]  RSP <ffff88006a94be18>d70 
[   54.756857] ---[ end trace f145bbcd3c8a271b ]---

The device_del function in drivers/base/core.c, disassembled with:
objdump -d core.o > core.disasm

0000000000000d70 <device_del>:
     d70:       41 54                   push   %r12
     d72:       55                      push   %rbp
     d73:       53                      push   %rbx
     d74:       48 8b 87 80 00 00 00    mov    0x80(%rdi),%rax
     d7b:       48 89 fb                mov    %rdi,%rbx
     d7e:       48 8b 2f                mov    (%rdi),%rbp
     d81:       48 85 c0                test   %rax,%rax
     d84:       74 18                   je     d9e <device_del+0x2e>
     d86:       48 8b 78 60             mov    0x60(%rax),%rdi		<===== OOPS
     d8a:       48 89 da                mov    %rbx,%rdx
     d8d:       be 02 00 00 00          mov    $0x2,%esi
     d92:       48 81 c7 c0 00 00 00    add    $0xc0,%rdi
     d99:       e8 00 00 00 00          callq  d9e <device_del+0x2e>
     d9e:       48 89 df                mov    %rbx,%rdi
     da1:       e8 00 00 00 00          callq  da6 <device_del+0x36>
     da6:       48 89 df                mov    %rbx,%rdi
     da9:       e8 00 00 00 00          callq  dae <device_del+0x3e>
     dae:       48 85 ed                test   %rbp,%rbp
     db1:       74 0d                   je     dc0 <device_del+0x50>


In C source code:

void device_del(struct device *dev)
{
        struct device *parent = dev->parent;
        struct class_interface *class_intf;
asm("#0");
        /* Notify clients of device removal.  This call must come
         * before dpm_sysfs_remove().
         */
        if (dev->bus)
asm("#1");
                blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
                                             BUS_NOTIFY_DEL_DEVICE, dev);	<===== OOPS
asm("#2");
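
A side note on the register dump above, offered as a reading aid rather than a diagnosis. Per the disassembly, RAX holds dev->bus at the faulting instruction. Read as bytes in memory order, its value and RBP's look like text rather than pointers, which would be consistent with the device structure having been freed and its memory reused; the thread itself does not confirm this. The helper below is a userspace sketch, and reg_to_ascii and reg_holds_text are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Render the 8 bytes of a 64-bit register value in x86-64 memory order
 * (lowest byte first). Non-printable bytes become '.'.
 * out must have room for 9 chars (8 bytes plus the terminator).
 */
void reg_to_ascii(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (size_t i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(value >> (8 * i));
        out[i] = (byte >= 0x20 && byte <= 0x7e) ? (char)byte : '.';
    }
    out[8] = '\0';
}

/* Return 1 if the rendered register contains the given text fragment. */
int reg_holds_text(uint64_t value, const char *fragment)
{
    char buf[9];

    reg_to_ascii(value, buf);
    return strstr(buf, fragment) != NULL;
}
```

Applied to the values in the trace, RAX (554e514553004241) renders as "AB.SEQNU" and RBP (4854415056454400) as ".DEVPATH"; SEQNUM and DEVPATH are uevent environment keys. The stack word 007665647562696c renders as "libudev.".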
Comment 14 Coacher 2011-08-19 13:33:01 UTC
(In reply to comment #12)
Looks like it is somehow hardware specific. Should I provide any extra info in addition to lspci?
Comment 15 Coacher 2011-08-19 13:36:07 UTC
(In reply to comment #13)
What should I do next?
Comment 16 Coacher 2011-08-19 15:21:33 UTC
Created attachment 69412 [details]
lshw output
Comment 17 Lee, Chun-Yi 2011-08-19 23:09:46 UTC
Another view, from core.s:

#APP
# 1058 "drivers/base/core.c" 1
        #1
# 0 "" 2
.LVL249:
#NO_APP
.L160: 
        .loc 1 1059 0
        movq    96(%rax), %rdi  # D.20532_4->p, D.20532_4->p	<===== OOPS
        movq    %rbx, %rdx      # dev,
        movl    $2, %esi        #,
        addq    $192, %rdi      #, tmp94
        call    blocking_notifier_call_chain    #
.LVL250:
        .loc 1 1061 0
#APP
# 1061 "drivers/base/core.c" 1
        #2
# 0 "" 2
        .loc 1 1062 0
#NO_APP
        movq    %rbx, %rdi      # dev,
        call    device_pm_remove        #


The same in source code:

void device_del(struct device *dev)
{
        struct device *parent = dev->parent;
        struct class_interface *class_intf;
asm("#0");
        /* Notify clients of device removal.  This call must come
         * before dpm_sysfs_remove().
         */
        if (dev->bus)
asm("#1");
                blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
                                             BUS_NOTIFY_DEL_DEVICE, dev);	<===== OOPS
asm("#2");
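
Why the trace in comment #13 reports a general protection fault rather than the NULL pointer dereference mentioned in the description can be illustrated with a small userspace check. On x86-64, an access through a non-canonical address raises #GP instead of a page fault. The sketch assumes 48-bit virtual addresses (4-level paging), which is an assumption about this machine, and is_canonical48 and check_trace_values are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * With 48-bit virtual addresses, bits 63..47 of a canonical address are
 * either all zero (user half) or all one (kernel half).
 */
int is_canonical48(uint64_t addr)
{
    uint64_t upper = addr >> 47;   /* bits 63..47, 17 bits in total */

    return upper == 0 || upper == 0x1ffff;
}

/* Values copied from the trace in comment #13. */
void check_trace_values(void)
{
    /* RBX/RDI: the struct device pointer passed to device_del() */
    assert(is_canonical48(0xffff880069185040ULL));
    /* RAX + 0x60: the address read by "mov 0x60(%rax),%rdi" */
    assert(!is_canonical48(0x554e514553004241ULL + 0x60));
}
```

Under that assumption the pointer to the device itself is a valid kernel address, while the value loaded from it as dev->bus is not an address at all.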
Comment 18 Lee, Chun-Yi 2011-08-19 23:25:56 UTC
(In reply to comment #16)
> Created an attachment (id=69412) [details]
> lshw output

Thanks for your information.

This issue relates to notifying clients of a device before we remove it from a bus; it happens when we try to notify clients via bus_notifier while removing wmi (this device) from the platform bus.

Sorry, I am not an expert in this area. I am looking at the 2.6.37 source code and hope to find out what is wrong.

But yes, as you suspect, this issue may also be machine-related, because I can NOT reproduce it on my Acer TravelMate 8572, even with the 64-bit edition of openSUSE 11.4.

Does this issue also happen when you remove a device from any other bus?
Comment 19 Lee, Chun-Yi 2011-08-19 23:29:01 UTC
(In reply to comment #4)
> Created an attachment (id=68162) [details]
> Boot and error trace on 2.6.39
> 
> Still seeing this bug in 2.6.39.3. Tested on stable branch Gentoo amd64. Can
> provide any necessary info. This bug is really annoying because I can't
> suspend
> my notebook without unloading acer_wmi, which is causing unloading the wmi
> module, which is broken. And suspend is vital for notebook.

By the way,
you can't suspend your machine without unloading acer-wmi? That's odd; my machine can suspend without unloading acer-wmi.

How do you know that acer-wmi is the problem when you suspend (S3 or S4?) your machine?
Comment 20 Coacher 2011-08-20 10:43:23 UTC
(In reply to comment #18)
    I am far from being a specialist, but looking through the code from 2.6.37.4 up to 2.6.39.3 I noticed that the code of device_del() and blocking_notifier_call_chain() hasn't changed. However, the code of the WMI driver has changed a lot. To me it looks like the logic of the driver changed.
    I focused on acpi_wmi_exit() in wmi.c. In 2.6.36.4, wmi_class_exit() is called first and then acpi_bus_unregister_driver(), but in 2.6.37.4 and 2.6.39.3 acpi_bus_unregister_driver() is called first and then class_unregister(&wmi_class). So I decided to swap the order of these functions in wmi.c, but with no luck. After recompiling I was still getting the oops. Maybe this is the wrong track, but the problem only appears when unloading this module.
    I tried to unload as many modules as possible on my openSUSE machine. Started with lsmod.begin, finished with lsmod.finish. No errors occurred. After unloading wmi I got the trace attached.

    Thank you, for your attention and help!
Comment 21 Coacher 2011-08-20 10:44:42 UTC
Created attachment 69462 [details]
lsmod start

All modules list right after boot.
Comment 22 Coacher 2011-08-20 10:45:26 UTC
Created attachment 69472 [details]
lsmod finish

Unloaded as many modules as possible
Comment 23 Coacher 2011-08-20 10:46:10 UTC
Created attachment 69482 [details]
Trace on openSUSE after modules unloaded
Comment 24 Coacher 2011-08-20 11:01:15 UTC
(In reply to comment #19)
    I can successfully suspend my machine without unloading acer_wmi. However, after resume the state of my bluetooth and WiFi is set to disabled and I have to switch it back manually using Fn+F3. When I include acer_wmi in SUSPEND_MODULES (I'm using pm-utils), the state of the internal devices is restored properly. pm-utils unloads the modules in SUSPEND_MODULES recursively, and this is how the wmi module gets unloaded.
    Several workarounds are possible:
        1. Compile wmi into the kernel itself, as I really don't need it to be unloaded, only acer_wmi.
        2. Tell modprobe not to unload the wmi module using modprobe.conf.
        3. Fix the pm-utils code so that it does not unload modules recursively, or uses rmmod, or, more elegantly, uses some kind of blacklisting. This is easy to do because they are plain bash scripts.
    However, the bug exists. The problem is not related to suspend or hibernate. It concerns only unloading the wmi module.
Comment 25 Coacher 2011-08-20 16:47:06 UTC
Running vanilla 2.6.39.3 with wmi compiled into the kernel and acer_wmi built as a module. I can now load and unload acer_wmi safely, even with "modprobe -r". But now the Fn+F3 switch, which cycles wifi and bluetooth on and off, does nothing. Brightness controls, touchpad enable/disable, display switch and sound controls work fine.

Some output:
#rfkill list
2: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
3: hci0: Bluetooth
        Soft blocked: no
        Hard blocked: no
4: acer-wireless: Wireless LAN
        Soft blocked: no
        Hard blocked: no
5: acer-bluetooth: Bluetooth
        Soft blocked: no
        Hard blocked: no
    Pressing Fn+F3 does nothing, but on 2.6.36 the wlan became hard blocked and the bluetooth device was removed completely.
    
Syslog full of:
Aug 20 20:41:43 Photon kernel: [  456.530888] atkbd serio0: Unknown key pressed (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:43 Photon kernel: [  456.530899] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:43 Photon kernel: [  456.562487] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:43 Photon kernel: [  456.562498] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:43 Photon kernel: [  456.562504] acer_wmi: Unknown key number - 0x4
Aug 20 20:41:43 Photon kernel: [  456.613570] atkbd serio0: Unknown key released (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:43 Photon kernel: [  456.613576] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:48 Photon kernel: [  461.360482] atkbd serio0: Unknown key pressed (translated set 2, code 0x93 on isa0060/serio0).
Aug 20 20:41:48 Photon kernel: [  461.360492] atkbd serio0: Use 'setkeycodes e013 <keycode>' to make it known.
Aug 20 20:41:48 Photon kernel: [  461.393342] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:48 Photon kernel: [  461.393361] keyboard: can't emulate rawmode for keycode 240
Aug 20 20:41:48 Photon kernel: [  461.393372] acer_wmi: Unknown key number - 0x4
This appears whenever I press Fn+F3. I will try to use setkeycodes as suggested.
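
The atkbd lines above report the same key in two notations: code 0x93 in the "Unknown key" line and e013 in the setkeycodes hint. The relation appears to be that bit 0x80 marks an e0-prefixed (extended) scancode and the low seven bits are the scancode itself. The sketch below is inferred from that pair of log lines, not copied from the driver; setkeycodes_arg and check_log_values are hypothetical helpers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the scancode argument for setkeycodes from the code value printed
 * in the atkbd "Unknown key" message. Returns what snprintf() returns.
 */
int setkeycodes_arg(unsigned int code, char *out, size_t len)
{
    return snprintf(out, len, "%s%02x",
                    (code & 0x80) ? "e0" : "", code & 0x7f);
}

/* Check the mapping against the value seen in the log above. */
void check_log_values(void)
{
    char arg[8];

    setkeycodes_arg(0x93, arg, sizeof arg);
    assert(strcmp(arg, "e013") == 0);
}
```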
Comment 26 Coacher 2011-08-20 17:04:30 UTC
    After running "setkeycodes e013 240" in a terminal (not in an emulator under X) the message changed to this: acer_wmi: Unknown key number - 0x4.
And still no actual switching happens.

As mentioned here: http://permalink.gmane.org/gmane.linux.drivers.platform.x86.devel/2373
some work is needed to support new notebooks. I wonder whether my notebook is affected.

Work is still in progress as I can see:
http://permalink.gmane.org/gmane.linux.drivers.platform.x86.devel/2398
but should this behaviour be treated as a temporary defect due to incomplete support, or as a bug?
Comment 27 Coacher 2011-08-20 17:34:21 UTC
(In reply to comment #24)
Workaround 1 is working partially, see comment #25.
Workaround 2 is working partially with the same issue.
Workaround 3 is not tested.
Comment 28 Coacher 2011-08-20 20:55:44 UTC
(In reply to comment #26)
> I guess if my notebook concerned?

Looking through acer-wmi.c and the list of GUIDs there, my notebook is of the AMW0_V2 type, so it should work. The problem lies here (line 1342):
switch (return_value.function) {
	case WMID_HOTKEY_EVENT:
		if (return_value.device_state) {
			u16 device_state = return_value.device_state;
			pr_debug("deivces states: 0x%x\n", device_state);
			if (has_cap(ACER_CAP_WIRELESS))
				rfkill_set_sw_state(wireless_rfkill,
				!(device_state & ACER_WMID3_GDS_WIRELESS));
			if (has_cap(ACER_CAP_BLUETOOTH))
				rfkill_set_sw_state(bluetooth_rfkill,
				!(device_state & ACER_WMID3_GDS_BLUETOOTH));
			if (has_cap(ACER_CAP_THREEG))
				rfkill_set_sw_state(threeg_rfkill,
				!(device_state & ACER_WMID3_GDS_THREEG));
		}
		if (!sparse_keymap_report_event(acer_wmi_input_dev,
				return_value.key_num, 1, true))
			pr_warning("Unknown key number - 0x%x\n",
				return_value.key_num);
		break;
	default:
		pr_warning("Unknown function number - %d - %d\n",
			return_value.function, return_value.key_num);
		break;
	}
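
To make the inverted test in the snippet above easier to read: each radio is reported to rfkill as soft blocked when its bit in device_state is clear, and the whole update is skipped when device_state is zero. The userspace sketch below mirrors only that bit logic. The bit positions are assumptions chosen for illustration (the thread does not show the ACER_WMID3_GDS_* values), and struct soft_block, decode_device_state and check_decode are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit positions, for illustration only; not shown in this thread. */
#define GDS_WIRELESS   (1u << 0)
#define GDS_THREEG     (1u << 6)
#define GDS_BLUETOOTH  (1u << 11)

struct soft_block {
    bool wireless;
    bool bluetooth;
    bool threeg;
};

/* Mirrors the !(device_state & MASK) expressions: a clear bit means blocked. */
struct soft_block decode_device_state(uint16_t device_state)
{
    struct soft_block sb;

    sb.wireless  = !(device_state & GDS_WIRELESS);
    sb.bluetooth = !(device_state & GDS_BLUETOOTH);
    sb.threeg    = !(device_state & GDS_THREEG);
    return sb;
}

/* Only the wireless bit set: wireless is unblocked, the others are blocked. */
void check_decode(void)
{
    struct soft_block sb = decode_device_state(GDS_WIRELESS);

    assert(!sb.wireless && sb.bluetooth && sb.threeg);
}
```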
Comment 29 Lee, Chun-Yi 2011-08-23 04:01:41 UTC
(In reply to comment #28)
> (In reply to comment #26)
> > I guess if my notebook concerned?
> 
> Looking through acer-wmi.c and list of GUID there my notebook is AMW0_V2
> type.
> So it should work. The problem lies here (line1342):
> switch (return_value.function) {
>     case WMID_HOTKEY_EVENT:
>         if (return_value.device_state) {
>             u16 device_state = return_value.device_state;
>             pr_debug("deivces states: 0x%x\n", device_state);
>             if (has_cap(ACER_CAP_WIRELESS))
>                 rfkill_set_sw_state(wireless_rfkill,
>                 !(device_state & ACER_WMID3_GDS_WIRELESS));
>             if (has_cap(ACER_CAP_BLUETOOTH))
>                 rfkill_set_sw_state(bluetooth_rfkill,
>                 !(device_state & ACER_WMID3_GDS_BLUETOOTH));
>             if (has_cap(ACER_CAP_THREEG))
>                 rfkill_set_sw_state(threeg_rfkill,
>                 !(device_state & ACER_WMID3_GDS_THREEG));
>         }
>         if (!sparse_keymap_report_event(acer_wmi_input_dev,
>                 return_value.key_num, 1, true))
>             pr_warning("Unknown key number - 0x%x\n",
>                 return_value.key_num);
>         break;
>     default:
>         pr_warning("Unknown function number - %d - %d\n",
>             return_value.function, return_value.key_num);
>         break;
>     }

I suggest filing another bug for this issue because it is not related to the wmi module problem.
Could you please file another bug?

In the meantime, please try Seth's patch, which might fix your problem:

commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
Author: Seth Forshee <seth.forshee@canonical.com>
Date:   Tue Jun 21 12:00:32 2011 -0500

    acer-wmi: Add support for Aspire 1830 wlan hotkey
    
    Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
    Signed-off-by: Matthew Garrett <mjg@redhat.com>
Comment 30 Lee, Chun-Yi 2011-08-23 04:26:07 UTC
Let's get back to the wmi module removal problem:

(In reply to comment #20)
> (In reply to comment #18)
>     I am far from specialist, but looking through the code of 37.4 and up to
> 39.3 I noticed that device_del() and blocking_notifier_call_chain()
> functions'
> code hasn't changed. However, the code of WMI driver changed a lot. To me it
> looks like the logic of the driver changed.
>     I stuck my attention to acpi_wmi_exit() in wmi.c. In 36.4 first called
> wmi_class_exit() and then acpi_bus_unregister_driver(), but in 37.4 and 39.3
> first called acpi_bus_unregister_driver() and then
> class_unregister(&wmi_class). So, I decided to change the order of these
> functions in wmi.c, but with no luck. After recompiling I was stiil
> experiencing OOPS. Maybe, this is wrong tack, but problems only when
> unloading
> this module.
>     I tried to unload as many modules as possible on my openSUSE machine.
> Started with lsmod.begin, finished with lsmod.finish. No errors occured.
> After
> unloading wmi I got the trace attached.
> 
>     Thank you, for your attention and help!

The change in the wmi exit function comes from this patch:

From c64eefd48c44fa8145ad1f96edabf4a053fffc49 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Thu, 26 Aug 2010 00:15:30 -0700
Subject: [PATCH] WMI: embed struct device directly into wmi_block

Instead of creating wmi_blocks and then register corresponding devices
on a separate pass do it all in one shot, since lifetime rules for both
objects are the same. This also takes care of leaking devices when
device_create fails for one of them.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Matthew Garrett <mjg@redhat.com>

Maybe we should test wmi.c BEFORE and AFTER applying this patch.
Comment 31 Lee, Chun-Yi 2011-08-23 04:36:21 UTC
Created attachment 69702 [details]
wmi.c

wmi.c as it was BEFORE applying c64eefd48c44fa8145ad1f96edabf4a053fffc49
Comment 32 Lee, Chun-Yi 2011-08-23 04:37:16 UTC
Created attachment 69712 [details]
wmi.c

wmi.c as it was AFTER applying c64eefd48c44fa8145ad1f96edabf4a053fffc49
Comment 33 Dmitry Torokhov 2011-08-23 05:26:08 UTC
I wonder how many "_WDG" methods you have. Could you please load the wmi module with the debug_dump_wdg=1 option?

Thanks.
Comment 34 Dmitry Torokhov 2011-08-23 05:46:55 UTC
Created attachment 69732 [details]
Properly clean up WMI devices

Could you please try this patch and see if it helps?
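
The attached patch is not reproduced in this thread, so the following is only a userspace model of one failure mode that fits the symptoms and the question about multiple "_WDG" methods in comment #33: a cleanup routine that releases every block on a shared list without unlinking it, invoked once per device. All names (struct block, struct registry, remove_without_unlink, remove_with_unlink, simulate) are hypothetical, and a released flag stands in for freeing memory so that the model stays safe to run.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 8

struct block {
    struct block *next;
    int released;            /* set instead of actually freeing memory */
};

struct registry {
    struct block *head;      /* one list shared by all remove calls */
    int stale_visits;        /* times an already released block was touched */
};

static void release_block(struct registry *reg, struct block *b)
{
    if (b->released)
        reg->stale_visits++; /* in a kernel this would be a use-after-free */
    b->released = 1;
}

/* Releases every block but leaves all of them linked on the list. */
void remove_without_unlink(struct registry *reg)
{
    for (struct block *b = reg->head; b != NULL; b = b->next)
        release_block(reg, b);
}

/* Unlinks each block before releasing it, so a later call sees an empty list. */
void remove_with_unlink(struct registry *reg)
{
    while (reg->head != NULL) {
        struct block *b = reg->head;

        reg->head = b->next;
        b->next = NULL;
        release_block(reg, b);
    }
}

/* Build a list of nblocks entries, run the remove path remove_calls times. */
int simulate(int nblocks, int remove_calls, int do_unlink)
{
    struct block blocks[MAX_BLOCKS] = {{0}};
    struct registry reg = {0};

    assert(nblocks >= 0 && nblocks <= MAX_BLOCKS);
    for (int i = 0; i < nblocks; i++) {
        blocks[i].next = reg.head;
        reg.head = &blocks[i];
    }
    for (int c = 0; c < remove_calls; c++) {
        if (do_unlink)
            remove_with_unlink(&reg);
        else
            remove_without_unlink(&reg);
    }
    return reg.stale_visits;
}
```

In this model simulate(3, 1, 0) reports no stale visits, simulate(3, 2, 0) reports 3, and simulate(3, 2, 1) reports none. That would match a machine with a single "_WDG" method not reproducing the crash, though that link is an inference.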
Comment 35 Coacher 2011-09-01 20:34:07 UTC
(In reply to comment #32)
What kernel version should I use? 
There are differences between the wmi.c you provided, marked as BEFORE, and the wmi.c shipped with vanilla 2.6.39.3.
Comment 36 Coacher 2011-09-01 21:03:11 UTC
(In reply to comment #34)
> Created an attachment (id=69732) [details]
> Properly clean up WMI devices
> 
> Could you please try this patch and see if it helps?
After applying this patch to the wmi.c shipped with vanilla 2.6.39.3, it fails to compile with this error:
  CC      drivers/platform/x86/wmi.o
drivers/platform/x86/wmi.c: In function ‘wmi_free_devices’:
drivers/platform/x86/wmi.c:757: error: expected expression before ‘for’
drivers/platform/x86/wmi.c:754: warning: unused variable ‘next’
drivers/platform/x86/wmi.c:754: warning: unused variable ‘wblock’
make[3]: *** [drivers/platform/x86/wmi.o] Error 1

I guess I'm doing something wrong.
Comment 37 Coacher 2011-09-01 22:32:56 UTC
Created attachment 71282 [details]
Second _WDG method

(In reply to comment #33)

I didn't know there could be more than one. I searched through the attached DSDT table and found another one, passed it through wmidump and attached the result.
No problem loading with this option, but where will I get the output?
Comment 38 Coacher 2011-09-02 14:17:31 UTC
Created attachment 71352 [details]
dmesg output with debug_dump_wdg

(In reply to comment #33)

This is the part of dmesg with all output from the wmi module loaded with debug_dump_wdg=1
Comment 39 Dmitry Torokhov 2011-09-06 18:28:29 UTC
(In reply to comment #36)
> (In reply to comment #34)
> > Created an attachment (id=69732) [details]
> > Properly clean up WMI devices
> > 
> > Could you please try this patch and see if it helps?
> After applying this patch to wmi.c shipped with vanilla 2.6.39.3 it won't
> compile with error:
>   CC      drivers/platform/x86/wmi.o
> drivers/platform/x86/wmi.c: In function ‘wmi_free_devices’:
> drivers/platform/x86/wmi.c:757: error: expected expression before ‘for’
> drivers/platform/x86/wmi.c:754: warning: unused variable ‘next’
> drivers/platform/x86/wmi.c:754: warning: unused variable ‘wblock’
> make[3]: *** [drivers/platform/x86/wmi.o] Error 1
> 
> I guess I'm doing something wrong.

There haven't been any changes to the file for a while, so the patch should apply to v2.6.38 onward. I wonder if you patched an older version of the file. Please re-fetch the sources and apply the patch again. If you still get compiler errors, please post the resulting version of wmi.c.

Thanks.
Comment 40 Coacher 2011-09-06 20:29:01 UTC
(In reply to comment #32)
> Created an attachment (id=69712) [details]
> wmi.c
> 
> wmi.c file that was AFTER applied c64eefd48c44fa8145ad1f96edabf4a053fffc49

Well, I've taken the vanilla 3.1-rc4-git2 kernel, which already contains the changes from c64eefd48c44fa8145ad1f96edabf4a053fffc49. However, its wmi.c differs from the one you provided. Too bad, the bug is still in there somewhere. See the trace below if it helps.
Comment 41 Coacher 2011-09-06 20:32:28 UTC
Created attachment 71792 [details]
Trace from 3.1-rc4-git2
Comment 42 Coacher 2011-09-06 20:36:57 UTC
(In reply to comment #29)
> I suggest file another bug for this issue because it does not relate to wmi
> module problem.
> Could you please file another bug?
> 
> On the other hand, please try Seth's patch that might fix your problem:
> 
> commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
> Author: Seth Forshee <seth.forshee@canonical.com>
> Date:   Tue Jun 21 12:00:32 2011 -0500
> 
>     acer-wmi: Add support for Aspire 1830 wlan hotkey
> 
>     Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
>     Signed-off-by: Matthew Garrett <mjg@redhat.com>

As for this issue: it was fixed by the commit you mentioned, which will be in mainline 3.1. Thanks.
But the same problem arose with bluetooth, because on my Acer both are switched by one button that cycles through them. So I guess another one-line commit will do the trick. I'll file a bug report later.
Comment 43 Coacher 2011-09-06 21:58:15 UTC
(In reply to comment #39)

Hooray! It is fixed now with the patch you provided.
Tested on vanilla 2.6.39.3 again with your patch. It compiled successfully this time and unloading works. It looks like I made a mess of wmi.c last time. Sorry.

Thank you and Mr. Lee, Chun-Yi for your help and attention.

I'll mark this as resolved for now. If any issues appear I'll reopen it. Waiting for this patch to make its way upstream.
Comment 44 Lee, Chun-Yi 2011-09-07 02:56:00 UTC
(In reply to comment #42)
> (In reply to comment #29)
> > I suggest file another bug for this issue because it does not relate to wmi
> > module problem.
> > Could you please file another bug?
> > 
> > On the other hand, please try Seth's patch that might fix your problem:
> > 
> > commit 1a04d8ffc04c10fc50124f311d4c8c391f9a04ca
> > Author: Seth Forshee <seth.forshee@canonical.com>
> > Date:   Tue Jun 21 12:00:32 2011 -0500
> > 
> >     acer-wmi: Add support for Aspire 1830 wlan hotkey
> > 
> >     Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
> >     Signed-off-by: Matthew Garrett <mjg@redhat.com>
> 
> Talking about this issue. It was fixed with the commit you've mentioned and
> will be in mainline 3.1. Thanks.
> But the same problem arised with bluetooth, because on my Acer they are
> switched via one button in cycle. So, I guess another one-line-commit will do
> the trick. I'll file a bug report later.

Yes, please file another bug and share your dmesg and dmidecode output with me.
Please attach the dmesg output captured after pressing the bluetooth key.
Comment 45 Florian Mickler 2012-01-12 21:20:07 UTC
A patch referencing this bug report has been merged in Linux v3.2-rc1:

commit 023b9565972a4a5e0f01b9aa32680af6e9b5c388
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Wed Sep 7 15:00:02 2011 -0700

    WMI: properly cleanup devices to avoid crashes
Comment 46 Coacher 2012-01-14 00:32:33 UTC
(In reply to comment #45)
> A patch referencing this bug report has been merged in Linux v3.2-rc1:
> 
> commit 023b9565972a4a5e0f01b9aa32680af6e9b5c388
> Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Date:   Wed Sep 7 15:00:02 2011 -0700
> 
>     WMI: properly cleanup devices to avoid crashes

In fact this patch has already been merged into every recent kernel branch: it was added in 3.0.9 and is already present in the 3.1.5 and 3.2 branches. So, this is RESOLVED CODE_FIX.
