Bug 13458

Summary: general protection fault - sony VGN-Z540 laptop
Product: Drivers
Reporter: Reinette Chatre (reinette.chatre)
Component: Platform
Assignee: acpi_platform-drivers (acpi_platform-drivers)
Status: CLOSED CODE_FIX
Severity: normal
CC: akpm, johannes, lenb, malattia, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.30-rc8-wl
Subsystem:
Regression: No
Bisected commit-id:
Bug Depends on:
Bug Blocks: 56331

Description Reinette Chatre 2009-06-04 18:11:32 UTC
I am running kernel 2.6.30-rc8-wl, which is the wireless-testing kernel (Linus's kernel + latest wireless bits). On my system (a Sony VGN-Z540) I see a gpf when I enable rfkill. Even though the gpf appears, rfkill works fine (wireless stops working). The gpf appeared only once, when I first enabled rfkill; when I then disabled and enabled rfkill a few more times, I did not see it again.

Here it is:

[  709.366404] general protection fault: 0000 [#1] SMP 
[  709.366551] last sysfs file: /sys/class/rfkill/rfkill5/state
[  709.366616] CPU 1 
[  709.366707] Modules linked in: iwlagn iwlcore led_class mac80211 cfg80211 i915 drm i2c_algo_bit i2c_core ppdev ipv6 acpi_cpufreq cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table container sbs sbshc lp parport arc4 ecb pcmcia joydev yenta_socket rsrc_nonstatic af_packet tpm_infineon sony_laptop psmouse iTCO_wdt iTCO_vendor_support pcmcia_core intel_agp video output tpm tpm_bios rfkill pcspkr serio_raw processor battery button ac evdev ext3 jbd mbcache sg sr_mod sd_mod cdrom ahci libata scsi_mod ehci_hcd uhci_hcd usbcore thermal fan thermal_sys fuse [last unloaded: cfg80211]
[  709.369182] Pid: 66, comm: kacpi_notify Not tainted 2.6.30-rc8-wl #40 VGN-Z540N
[  709.369182] RIP: 0010:[<ffffffffa0200dda>]  [<ffffffffa0200dda>] sony_nc_rfkill_set+0xa/0x40 [sony_laptop]
[  709.369182] RSP: 0018:ffff8800bb663dc0  EFLAGS: 00010286
[  709.369182] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
[  709.369182] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff880036f74480
[  709.369182] RBP: ffff8800bb663dd0 R08: 0000000000000000 R09: 0000000000000000
[  709.369182] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
[  709.369182] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8800bb663df0
[  709.369182] FS:  0000000000000000(0000) GS:ffff88000105e000(0000) knlGS:0000000000000000
[  709.369182] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  709.369182] CR2: 00007ffff2051000 CR3: 00000000ba6b6000 CR4: 00000000000006e0
[  709.369182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  709.369182] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  709.369182] Process kacpi_notify (pid: 66, threadinfo ffff8800bb662000, task ffff8800bb658f40)
[  709.369182] Stack:
[  709.369182]  0000000000000000 0000000000000000 ffff8800bb663e20 ffffffffa0203292
[  709.369182]  ffff8800bb659548 0000000000000246 0000000000000202 ffff8800b729e2f8
[  709.369182]  ffff8800b5371898 ffff8800bb658f40 ffff880001058ec0 ffffffff803b5596
[  709.369182] Call Trace:
[  709.369182]  [<ffffffffa0203292>] sony_nc_notify+0x222/0x260 [sony_laptop]
[  709.369182]  [<ffffffff803b5596>] ? acpi_os_execute_deferred+0x0/0x39
[  709.369182]  [<ffffffff803b8192>] acpi_device_notify+0x14/0x16
[  709.369182]  [<ffffffff803c5690>] acpi_ev_notify_dispatch+0x5f/0x6b
[  709.369182]  [<ffffffff803b55c2>] acpi_os_execute_deferred+0x2c/0x39
[  709.369182]  [<ffffffff8024d720>] worker_thread+0x1f0/0x340
[  709.369182]  [<ffffffff8024d6cd>] ? worker_thread+0x19d/0x340
[  709.369182]  [<ffffffff802525a0>] ? autoremove_wake_function+0x0/0x40
[  709.369182]  [<ffffffff80263a8d>] ? trace_hardirqs_on+0xd/0x10
[  709.369182]  [<ffffffff8024d530>] ? worker_thread+0x0/0x340
[  709.369182]  [<ffffffff8024d530>] ? worker_thread+0x0/0x340
[  709.369182]  [<ffffffff80252156>] kthread+0x56/0x90
[  709.369182]  [<ffffffff8020ce7a>] child_rip+0xa/0x20
[  709.369182]  [<ffffffff8020c87c>] ? restore_args+0x0/0x30
[  709.369182]  [<ffffffff80252100>] ? kthread+0x0/0x90
[  709.369182]  [<ffffffff8020ce70>] ? child_rip+0x0/0x20
[  709.369182] Code: e1 89 c2 48 c7 c6 36 4e 20 a0 e8 b2 fe ff ff 89 c2 89 d0 48 8b 1c 24 4c 8b 64 24 08 c9 c3 0f 1f 00 55 89 f2 48 89 e5 48 83 ec 10 <8b> 34 bd 80 3e 20 a0 bf 24 01 00 00 81 c6 00 01 00 00 89 f0 0d 
[  709.369182] RIP  [<ffffffffa0200dda>] sony_nc_rfkill_set+0xa/0x40 [sony_laptop]
[  709.369182]  RSP <ffff8800bb663dc0>
[  709.375883] ---[ end trace 6c116875da7d6347 ]---
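
One possible reading of the faulting instruction, offered as a sketch rather than a confirmed diagnosis: the bytes at the marked position in the Code: line (8b 34 bd 80 3e 20 a0) appear to decode as a 32-bit load from a fixed base plus RDI*4, that is, an array lookup that uses the first argument as an index. With RDI holding what looks like a kernel pointer instead of a small integer, the computed address would be non-canonical on x86-64, which raises a general protection fault rather than a page fault. The helper names below are hypothetical, the constants come from the register dump above, and 48-bit virtual addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all be equal. */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}

/* Address of element "index" in an array of 4-byte entries at "base". */
static uint64_t indexed_load_address(uint64_t base, uint64_t index)
{
	return base + index * 4;	/* wraps modulo 2^64, like the CPU does */
}
```

Calling indexed_load_address(0xffffffffa0203e80ULL, 0xffff880036f74480ULL) with the displacement (sign-extended) and the RDI value from the oops gives 0xfffe20007bfd5080, for which is_canonical() returns false. A small index such as 1 stays inside the module's canonical address range.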
Comment 1 Andrew Morton 2009-06-04 21:57:35 UTC
(cc  malattia@linux.it)
Comment 2 Zhang Rui 2009-06-05 03:35:19 UTC
Q1. Can this gpf be reproduced consistently? I.e., does it happen every time you enable rfkill for the first time after a fresh boot?
Q2. Is this a regression? Did you see the same problem happen on an earlier kernel release?
Q3. Can this bug be reproduced in a vanilla kernel?
Comment 3 Zhang Rui 2009-06-05 03:40:09 UTC
I have a sony-z540 on hand.
please give a detailed description on how to reproduce this bug. :)
Comment 4 Mattia Dongili 2009-06-05 04:46:19 UTC
Oooh!! Zhang, excellent.
From the first report it looks like just booting and enabling rfkill (I'd say using the physical switch) should trigger the bug.
If that is not it, then we will have to wait for the original reporter to tell us how.

Thanks
Comment 5 Reinette Chatre 2009-06-05 18:25:46 UTC
Adding Johannes because this problem may be related to new rfkill changes ...

(In reply to comment #2)
> Q1. Can this gpf be reproduced consistently? I.e., does it happen every time
> you enable rfkill for the first time after a fresh boot?

It can. I tried the following a few times now:
* reboot machine
* disable/enable rfkill

In all of these tests I saw the gpf when I enabled rfkill (wireless switch to "off") for the first time after booting the machine.

> Q2. is this a regression? did you see the same problem happen on an earlier
> kernel release?

This is not a regression. 

> Q3. Can this bug be reproduced in a vanilla kernel?

No.

Your last question prompted some more testing and I have found the following:

When I run with the latest wireless-testing kernel I see the problem every time. Even when I unload all wireless drivers the problem still appears. In these tests the rfkill module was loaded.

When I repeat the test in a fresh pull of Linus's kernel then the problem is not present. Here I also unload wireless modules (keeping rfkill and sony-laptop loaded) and there is no gpf.

So, it seems that there could be a problem with the new rfkill and the current sony-laptop. The new rfkill will be in 2.6.31 so this issue will probably show up once this kernel is released.
Comment 6 Reinette Chatre 2009-06-05 18:27:28 UTC
(In reply to comment #3)
> I have a sony-z540 on hand.
> please give a detailed description on how to reproduce this bug. :)

I reproduce the bug as follows:
* boot the machine into console mode with rfkill disabled (wireless switch in the "on" position)
* enable rfkill after machine is up (no need to log into machine)

The gpf appears at this point.
Comment 7 Johannes Berg 2009-06-06 18:31:56 UTC
This should fix it, Reinette can you verify please?

Sorry, stupid mistake here passing the wrong argument to the call.

--- wireless-testing.orig/drivers/platform/x86/sony-laptop.c	2009-06-06 20:29:47.000000000 +0200
+++ wireless-testing/drivers/platform/x86/sony-laptop.c	2009-06-06 20:31:05.000000000 +0200
@@ -1135,8 +1135,7 @@ static void sony_nc_rfkill_update()
 
 		if (hwblock) {
 			if (rfkill_set_hw_state(sony_rfkill_devices[i], true))
-				sony_nc_rfkill_set(sony_rfkill_devices[i],
-						   true);
+				sony_nc_rfkill_set((void *)i, true);
 			continue;
 		}
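
The "wrong argument" mentioned above can be illustrated with a small userspace sketch, which is not the driver code: a callback that treats its void *data as a small integer index has to be handed (void *)i, not a pointer to the device object. All names below (demo_rfkill_set, demo_address, DEMO_N) and the table values are hypothetical, and the bounds check exists only so that the sketch reports the bad index instead of faulting.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_N 4

/* Hypothetical per-device values, indexed by device number. */
static const int demo_address[DEMO_N] = { 10, 20, 30, 40 };

/*
 * Callback in the style of the patch: data carries an index, not a pointer.
 * Returns the looked-up value, or -1 if the index is out of range.
 */
static int demo_rfkill_set(void *data, bool blocked)
{
	uintptr_t idx = (uintptr_t)data;

	(void)blocked;	/* unused in this sketch */
	if (idx >= DEMO_N)
		return -1;	/* an unchecked lookup would read far outside the array */
	return demo_address[idx];
}
```

Passing (void *)(uintptr_t)2 selects the third entry, while passing the address of any object is rejected by the range check, because a real address is never a valid small index.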
Comment 8 Reinette Chatre 2009-06-08 18:30:25 UTC
Thanks Johannes.

This patch does make the gpf disappear, but unfortunately the rfkill does not behave as expected anymore.

When I boot with rfkill disabled (wireless is on) then things work as expected. I then enable rfkill and this now works well too (previously there was the gpf). Now, when I disable rfkill again after this it does not take effect .... rfkill remains enabled.

I tried with the other sony-laptop rfkill patch [1] recently submitted and saw the same behavior.

To highlight this ... without your patch rfkill does behave as expected, it just has the gpf the very first time you enable rfkill.


[1] http://marc.info/?l=linux-wireless&m=124445745725419&w=2
Comment 9 Johannes Berg 2009-06-08 22:42:11 UTC
Hmm. I can't find anything wrong. Can you grab http://git.sipsolutions.net/rfkill.git/ and run 'rfkill event' while you push the button?

Or try this, maybe:

diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c
index aec0b27..536a63c 100644
--- a/drivers/platform/x86/sony-laptop.c
+++ b/drivers/platform/x86/sony-laptop.c
@@ -1134,9 +1134,7 @@ static void sony_nc_rfkill_update()
 			continue;
 
 		if (hwblock) {
-			if (rfkill_set_hw_state(sony_rfkill_devices[i], true))
-				sony_nc_rfkill_set(sony_rfkill_devices[i],
-						   true);
+			rfkill_set_hw_state(sony_rfkill_devices[i], true);
 			continue;
 		}
 

when going into hw block there's no need to set the sw state too...
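
The remark above can be sketched with a toy model. It assumes that a hard block and a soft block are tracked separately and that the radio is off when either one is set. This is not kernel code: struct demo_rfkill and both update functions are hypothetical, and the model is not claimed to reproduce the exact behavior seen on the laptop.

```c
#include <stdbool.h>

struct demo_rfkill {
	bool hard;	/* blocked by the physical switch */
	bool soft;	/* blocked by software */
};

/* The radio is off if either kind of block is set. */
static bool demo_blocked(const struct demo_rfkill *r)
{
	return r->hard || r->soft;
}

/* Variant A: also force a soft block when the switch blocks the radio. */
static void demo_update_forcing_soft(struct demo_rfkill *r, bool hwblock)
{
	r->hard = hwblock;
	if (hwblock)
		r->soft = true;
}

/* Variant B: record only the hard block. */
static void demo_update_hw_only(struct demo_rfkill *r, bool hwblock)
{
	r->hard = hwblock;
}
```

In this model, flipping the switch to "off" and back to "on" leaves a soft block behind under variant A, so the radio stays off until something clears it. Under variant B the radio comes back as soon as the hard block is lifted.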
Comment 10 Reinette Chatre 2009-06-10 23:18:16 UTC
This is it.

I ran "rfkill event" without your patch and saw:

Start "rfkill event" ...
RFKILL event: idx 0 type 1 op 0 soft 0 hard 0
RFKILL event: idx 1 type 2 op 0 soft 0 hard 0
RFKILL event: idx 2 type 1 op 0 soft 0 hard 0
Now enable rfkill ... 
RFKILL event: idx 2 type 1 op 2 soft 0 hard 1
RFKILL event: idx 0 type 1 op 2 soft 0 hard 1
RFKILL event: idx 1 type 2 op 2 soft 0 hard 1
Now disable rfkill ...
RFKILL event: idx 0 type 1 op 2 soft 1 hard 1
RFKILL event: idx 1 type 2 op 2 soft 1 hard 1
Now enable rfkill ...
(nothing)
Now disable rfkill ...
RFKILL event: idx 0 type 1 op 2 soft 1 hard 1
RFKILL event: idx 1 type 2 op 2 soft 1 hard 1

With your patch I saw this:

rfkill event
RFKILL event: idx 0 type 1 op 0 soft 0 hard 0
RFKILL event: idx 1 type 2 op 0 soft 0 hard 0
RFKILL event: idx 2 type 1 op 0 soft 0 hard 0
Now enable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 1
RFKILL event: idx 0 type 1 op 2 soft 0 hard 1
RFKILL event: idx 1 type 2 op 2 soft 0 hard 1
Now disable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 0
RFKILL event: idx 0 type 1 op 2 soft 0 hard 1
RFKILL event: idx 1 type 2 op 2 soft 0 hard 1
Now enable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 1
Now disable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 0
RFKILL event: idx 0 type 1 op 2 soft 0 hard 1
RFKILL event: idx 1 type 2 op 2 soft 0 hard 1
Now enable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 1
Now disable rfkill...
RFKILL event: idx 2 type 1 op 2 soft 0 hard 0
RFKILL event: idx 0 type 1 op 2 soft 0 hard 1
RFKILL event: idx 1 type 2 op 2 soft 0 hard 1
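
For readers comparing the two logs, here is a minimal sketch of a parser for these lines. It assumes the numeric fields match the RFKILL_TYPE_* and RFKILL_OP_* constants in <linux/rfkill.h> (type 1 = WLAN, type 2 = Bluetooth; op 0 = add, op 1 = delete, op 2 = change). The struct and function names are hypothetical and are not part of the rfkill tool.

```c
#include <stdio.h>

struct demo_event {
	unsigned int idx, type, op, soft, hard;
};

/* Parse one "RFKILL event: ..." line; returns 1 on success, 0 otherwise. */
static int demo_parse_event(const char *line, struct demo_event *ev)
{
	return sscanf(line, "RFKILL event: idx %u type %u op %u soft %u hard %u",
		      &ev->idx, &ev->type, &ev->op, &ev->soft, &ev->hard) == 5;
}

/* Human-readable name for the "type" field. */
static const char *demo_type_name(unsigned int type)
{
	switch (type) {
	case 1: return "wlan";
	case 2: return "bluetooth";
	default: return "other";
	}
}

/* Human-readable name for the "op" field. */
static const char *demo_op_name(unsigned int op)
{
	switch (op) {
	case 0: return "add";
	case 1: return "del";
	case 2: return "change";
	default: return "other";
	}
}
```

Read this way, the line "idx 2 type 1 op 2 soft 0 hard 1" is a change event for a WLAN device that is hard-blocked but not soft-blocked.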

With your patch rfkill keeps working as expected. Could you please send this patch to 2.6.31? Without this patch rfkill does not behave well on this platform.

Thank you very much
Comment 11 Len Brown 2009-08-28 14:53:14 UTC
commit e1f8a19e6fc4f6d4267f6d3fe465553c3688f28e
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Thu Jun 11 12:08:15 2009 +0200

    sony: fix rfkill code again

shipped in 2.6.31-rc1
closed