Summary: Kernel ignores BIOS setting to NOT use Airplane Switch
Severity: normal
CC: fin4478, gabriele.mzt, lenb, Mario.Limonciello, mario_limonciello, pali, pepijndevos
debugfs output for all possible states of ethernet and wifi switch
Git bisect log of /drivers/platfrom/x86/dell*
possible patch to fix issue
Patch for v4.15 (The kernel version on Ubuntu 18.04 LTS)
Description Chris 2018-10-14 19:37:15 UTC
Hi,

My Dell laptop has stopped honoring my BIOS settings for the Airplane Switch. This started somewhere around Linux kernel 4.13, on the last version of Mint. For reference, I still have a Linux Mint 18.3 "live DVD" on a flash drive, which uses an earlier kernel, 4.10. When I plug that in and boot into it, the switch still works correctly, as it did before. That rules out any hardware or BIOS changes which might otherwise cause this.

justme@travel:~$ inxi -Fxz
System:    Host: travel Kernel: 4.15.0-36-generic x86_64 bits: 64 gcc: 7.3.0
           Desktop: Cinnamon 3.8.9 (Gtk 3.22.30-1ubuntu1) Distro: Linux Mint 19 Tara
Machine:   Device: laptop System: Dell product: Latitude E6420 v: 01 serial: N/A
           Mobo: Dell model: 0X8R3Y serial: N/A BIOS: Dell v: A25 date: 03/06/2018
Battery    BAT0: charge: 73.3 Wh 100.0% condition: 73.3/73.3 Wh (100%)
           model: DP-LGC53 DELL 7M0N51C status: Full
CPU:       Dual core Intel Core i7-2640M (-MCP-) arch: Sandy Bridge rev.7 cache: 4096 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 11174
           clock speeds: max: 3500 MHz 1: 1331 MHz 2: 1131 MHz
Graphics:  Card: Intel 2nd Generation Core Processor Family Integrated Graphics Controller bus-ID: 00:02.0
           Display Server: x11 (X.Org 1.19.6 ) drivers: modesetting (unloaded: fbdev,vesa)
           Resolution: email@example.com
           OpenGL: renderer: Mesa DRI Intel Sandybridge Mobile version: 3.3 Mesa 18.0.5 Direct Render: Yes
Audio:     Card Intel 6 Series/C200 Series Family High Definition Audio Controller
           driver: snd_hda_intel bus-ID: 00:1b.0
           Sound: Advanced Linux Sound Architecture v: k4.15.0-36-generic
Network:   Card-1: Intel 82579LM Gigabit Network Connection
           driver: e1000e v: 3.2.6-k port: 3080 bus-ID: 00:19.0
           IF: enp0s25 state: down mac: <filter>
           Card-2: Intel Centrino Advanced-N 6205 [Taylor Peak]
           driver: iwlwifi bus-ID: 02:00.0
           IF: wlp2s0 state: up mac: <filter>
Drives:    HDD Total Size: 256.1GB (29.1% used)
           ID-1: /dev/sda model: Samsung_SSD_860 size: 256.1GB
Partition: ID-1: / size: 233G used: 68G (31%) fs: ext4 dev: /dev/sda1
           ID-2: swap-1 size: 2.56GB used: 0.00GB (0%) fs: swap dev: /dev/sda2
RAID:      No RAID devices: /proc/mdstat, md_mod kernel module present
Sensors:   System Temperatures: cpu: 53.0C mobo: N/A
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 194 Uptime: 1:17 Memory: 1924.8/7855.8MB
           Init: systemd runlevel: 5 Gcc sys: 7.3.0
           Client: Shell (bash 4.4.191) inxi: 2.3.56
justme@travel:~$

I have found this file, which is used to output kernel info:

/sys/kernel/debug/dell_laptop/rfkill

Its contents show this:

return: 0
status: 0x1011D
Bit 0 : Hardware switch supported: 1
Bit 1 : Wifi locator supported: 0
Bit 2 : Wifi is supported: 1
Bit 3 : Bluetooth is supported: 1
Bit 4 : WWAN is supported: 1
Bit 5 : Wireless keyboard supported: 0
Bit 6 : UWB supported: 0
Bit 7 : WiGig supported: 0
Bit 8 : Wifi is installed: 1
Bit 9 : Bluetooth is installed: 0
Bit 10: WWAN is installed: 0
Bit 11: UWB installed: 0
Bit 12: WiGig installed: 0
Bit 16: Hardware switch is on: 1
Bit 17: Wifi is blocked: 0
Bit 18: Bluetooth is blocked: 0
Bit 19: WWAN is blocked: 0
Bit 20: UWB is blocked: 0
Bit 21: WiGig is blocked: 0

hwswitch_return: 0
hwswitch_state: 0x1011D
Bit 0 : Wifi controlled by switch: 1
Bit 1 : Bluetooth controlled by switch: 0
Bit 2 : WWAN controlled by switch: 1
Bit 3 : UWB controlled by switch: 1
Bit 4 : WiGig controlled by switch: 1
Bit 7 : Wireless switch config locked: 0
Bit 8 : Wifi locator enabled: 1
Bit 15: Wifi locator setting locked: 0

There is also another file with data which might be related:

/sys/kernel/debug/iwlwifi/0000:02:00.0/trans/rfkill

debug: 0
hw: 0

The first file shows what I am experiencing: the Hardware Switch (aka Airplane Switch) is reported as "enabled" for WiFi. But that is NOT how it is set in my BIOS. There I have it DESELECTED, and that is the behavior I am expecting.
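For readers who want to check the debugfs listing by hand, the following is a minimal Python sketch that decodes the two hexadecimal values into the per-bit lines shown above. The bit labels are copied from the debugfs output in this report; the names `STATUS_BITS`, `HWSWITCH_BITS` and `decode` are made up for this illustration and are not part of the driver.

```python
# Bit labels copied from the dell_laptop debugfs output in this report.
# The dictionaries and the helper below are illustrative names only.
STATUS_BITS = {
    0: "Hardware switch supported",
    1: "Wifi locator supported",
    2: "Wifi is supported",
    3: "Bluetooth is supported",
    4: "WWAN is supported",
    5: "Wireless keyboard supported",
    6: "UWB supported",
    7: "WiGig supported",
    8: "Wifi is installed",
    9: "Bluetooth is installed",
    10: "WWAN is installed",
    11: "UWB installed",
    12: "WiGig installed",
    16: "Hardware switch is on",
    17: "Wifi is blocked",
    18: "Bluetooth is blocked",
    19: "WWAN is blocked",
    20: "UWB is blocked",
    21: "WiGig is blocked",
}

HWSWITCH_BITS = {
    0: "Wifi controlled by switch",
    1: "Bluetooth controlled by switch",
    2: "WWAN controlled by switch",
    3: "UWB controlled by switch",
    4: "WiGig controlled by switch",
    7: "Wireless switch config locked",
    8: "Wifi locator enabled",
    15: "Wifi locator setting locked",
}


def decode(value, labels):
    """Return {label: 0 or 1} for every labelled bit of value."""
    return {label: (value >> bit) & 1 for bit, label in labels.items()}


if __name__ == "__main__":
    for name, value, labels in (
        ("status", 0x1011D, STATUS_BITS),
        ("hwswitch_state", 0x1011D, HWSWITCH_BITS),
    ):
        print(f"{name}: {value:#x}")
        for label, bit in decode(value, labels).items():
            print(f"  {label}: {bit}")
```

Decoding the same number against both label tables reproduces both halves of the listing, which is worth keeping in mind when the two fields report an identical value.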
From this online list:

linux/Documentation/admin-guide/kernel-parameters.txt

I have tried adding the kernel parameter rfkill.master_switch_mode=0, saving it, then running "sudo update-grub" and rebooting a couple of times, to no avail:

rfkill.master_switch_mode=
    0  The "airplane mode" button does nothing.
    1  The "airplane mode" button toggles between everything blocked and the previous configuration.
    2  The "airplane mode" button toggles between everything blocked and everything unblocked.

Systemd doesn't seem to have anything to do with this. I just don't know of anywhere else to look where this might be configurable.

The Airplane Switch on my computer is on the right-hand side, near where I use the mouse. It is very sensitive: any little bump on that side and the WiFi becomes unavailable until I toggle it. I would like to be able to disable it, as I could before.

In diagnostic testing, I have even tried newer kernels on other distros, and this is not distro-specific. It is something that was broken earlier in 2018, during the time Linux Mint was using the 4.13 kernel (I know, not much help here). This was also around the time that patches were being rolled out for the Intel Management Engine vulnerabilities.

Any ideas out there? Somehow I don't see going into my computer and breaking the damn switch off with wire cutters as being a good option. :)

Thanks
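One thing that can be checked after editing the GRUB configuration is whether the parameter actually reached the running kernel. The sketch below, in Python, parses a kernel command line as found in /proc/cmdline and looks up a parameter. The helper names and the sample command line in the test at the bottom are hypothetical; this only confirms that the parameter was passed at boot, not that it has any effect on the Dell hardware switch.

```python
def kernel_param(cmdline, name):
    """Return the value of name=value on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    mode = kernel_param(read_cmdline(), "rfkill.master_switch_mode")
    if mode is None:
        print("rfkill.master_switch_mode was not passed at boot")
    else:
        print(f"rfkill.master_switch_mode={mode}")
```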
Comment 1 Chris 2018-10-14 19:46:14 UTC
I meant to also mention that this happens with the current Linux Mint "live DVD" (using the 4.15 kernel). That is the same as what I have installed on my HDD, except without any possible customizations by me.
Comment 2 fin4478 2018-10-20 08:07:05 UTC
You can build a custom kernel with the driver for your laptop buttons disabled.

To build a faster and stable custom kernel, install the build dependencies:

sudo apt install build-essential kernel-package qt5-default qt5-qmake qtbase5-dev qtbase5-dev-tools bison flex libelf-dev libssl-dev pkg-config

Download the kernel source from kernel.org. If you have AMD graphics, download the AMD drm-next-4.21-wip kernel source.

The kernel configuration files of the official Debian kernels are available in /boot, named after the kernel release. Copy the one you want into the linux source directory as .config. Connect all your devices and run the command:

make localmodconfig

Create a custom Debian kernel package. Either export CONCURRENCY_LEVEL=2 or use -j 2 with make-kpkg (use the number of threads in your CPU):

export CONCURRENCY_LEVEL=2
fakeroot make-kpkg --initrd kernel_image

Add kernel_headers to the fakeroot command if you need headers. Install the kernel package with Gdebi.

To make the custom kernel boot, add this line to /etc/initramfs-tools/modules:

unix

And run:

sudo update-initramfs -u

Reboot.
Comment 3 Chris 2018-10-22 20:41:07 UTC
Thanks, but I would need to compile the kernel like this each time I upgrade it on my machine, so at best it is a temporary solution. It remains broken across the board. I can't believe I am the only one who finds this annoying; this is a very common computer.
Comment 4 Pepijn de Vos 2019-03-18 19:44:02 UTC
Created attachment 281895 [details]
debugfs output for all possible states of ethernet and wifi switch

I have my bios set up to ignore the switch but disable wifi when Ethernet is connected.
Comment 5 Pepijn de Vos 2019-03-18 19:54:57 UTC
It seems this behaviour is controlled by `static int dell_rfkill_set(void *data, bool blocked)` in `/drivers/platform/x86/dell-laptop.c` so I added the authors to the CC list. For me it works perfectly for disabling wifi when I connect ethernet as set in my bios, but it indeed does not ignore the wifi switch. What raises my suspicion is that status and hwswitch_state are always identical, while their bits seem to have different meaning and they come from a different location. Is this normal? I would be willing to try to fix this issue myself. I'm proficient with C but new to kernel development, so I would appreciate if someone could point me in the right direction.
Comment 6 Gabriele Mazzotta 2019-03-23 08:46:39 UTC
The switch is controlled by dell-rbtn. On my laptop, the WiFi button is entirely controlled by the BIOS when this module is not available; on some other systems, WiFi buttons/switches stop working altogether. So maybe try to blacklist/unload the module and see if anything changes. This driver has been available for quite some time and I don't see any significant change near release 4.13 in either dell-laptop or dell-rbtn, so I don't know why you are having issues only now.
Comment 7 Gabriele Mazzotta 2019-03-23 14:21:20 UTC
I wanted to add that the laptops of the Latitude and Precision series have a fallback mechanism to handle the WiFi switch/button which is used in case dell-rbtn is not available. This fallback mechanism is actually all there was before dell-rbtn superseded it. You can force this fallback mechanism by removing/unloading/blacklisting dell-rbtn. If we find that without dell-rbtn everything works as you expect, then one possibility is that your distro started to include dell-rbtn only recently. If this is not the case, then we need to find other causes of this unwanted behavior.
Comment 8 Pepijn de Vos 2019-03-23 15:59:33 UTC
On my laptop, disabling dell_rbtn does not disable the wifi switch. Disabling dell_laptop does disable the switch, but freezes my laptop after a minute or so. My suspicion is still directed at hwswitch_state, which read identical to status. This means that "Hardware switch supported" in status also controls "Wifi controlled by switch". I'm pretty sure the value for hwswitch_state is bogus. Status says "WiGig supported" is 0, but "WiGig controlled by switch" is 1 in hwswitch_state because it is just reading the status value. I tried to read the code where hwswitch_state comes from, but it just puts some opaque constants in dell_fill_request and dell_send_request. Is there documentation on these methods? In case these come directly from the bios, this might point to a bios bug? I did update my bios at some point, not sure when the switch started acting up.
Comment 9 Gabriele Mazzotta 2019-03-23 16:46:09 UTC
(In reply to Pepijn de Vos from comment #8)
> Disabling dell_laptop does disable the switch, but freezes my laptop after a
> minute or so.

I don't think this should happen.

> My suspicion is still directed at hwswitch_state, which read identical to
> status. This means that "Hardware switch supported" in status also controls
> "Wifi controlled by switch".
>
> I'm pretty sure the value for hwswitch_state is bogus. Status says "WiGig
> supported" is 0, but "WiGig controlled by switch" is 1 in hwswitch_state
> because it is just reading the status value.
>
> I tried to read the code where hwswitch_state comes from, but it just puts
> some opaque constants in dell_fill_request and dell_send_request. Is there
> documentation on these methods?

Most of the information was taken from the libsmbios that DELL released, as also mentioned in the header of the source file. Part of the libsmbios comments is also available in dell-laptop.c. dell_send_request() uses SMM under the hood, so the data should come from the BIOS.

> In case these come directly from the bios, this might point to a bios bug? I
> did update my bios at some point, not sure when the switch started acting up.

I don't know if the issue is due to some BIOS update or something else, but that's definitely a possibility. However, this means we have two different bugs, because Chris observed a regression: everything works as expected when using an older kernel. 'git bisect' can be very handy in situations like this one.

A BIOS update can definitely cause issues, but I can't say if it's the actual source of the issue.
Comment 10 Chris 2019-03-23 17:50:55 UTC
Thanks for looking into this, as it seems it would otherwise persist into the future. This was repeatable for me between the 4.10 kernel and the 4.13 kernel (it started in one of the intermediate 4.13 point releases). For positive test results, try switching between 4.10 and maybe 4.15 just to be sure.

The easiest way to demonstrate this is to boot using the ISO for Linux Mint 18.3. It natively uses the 4.10 kernel, before any kernel changes or upgrades, and the BIOS function to disable the HW switch works as expected.

https://linuxmint.com/edition.php?id=246

With Linux Mint 18.3 installed, I could break/fix this functionality just by switching between the different kernel versions. It actually became broken somewhere in the middle of the 4.13 point releases. Afterwards, all kernel releases and point releases exhibited that behavior. I am now using Linux Mint 19.1, and it only allows for 4.15 or 4.18 kernel versions, so it is always broken now.
Comment 11 Pepijn de Vos 2019-03-23 18:09:41 UTC
For me this also worked at some previous time, could be the upgrade from Ubuntu 16.04 to 18.04. I'll try the Mint ISO when I have time. The freeze is something completely unrelated that also happens in Windows when I wake from hibernate, seemingly related to graphics drivers... But it definitely appears dell_laptop is controlling more than just a switch. Chris, could you share the output of the debugfs for the working kernel version? In your initial post I see the same thing, where hwswitch_state==status. I wonder if this is also the case for the working kernel.
Comment 12 Chris 2019-03-23 18:54:31 UTC
Okay, here is the same info for the bootable USB "live session" (uninstalled) for Linux Mint 18.3, as per my last post:

mint@mint ~ $ inxi -Fxz
System:    Host: mint Kernel: 4.10.0-38-generic x86_64 (64 bit gcc: 5.4.0)
           Desktop: Cinnamon 3.6.6 (Gtk 3.18.9-1ubuntu3.3) Distro: Linux Mint 18.3 Sylvia
Machine:   System: Dell (portable) product: Latitude E6420 v: 01
           Mobo: Dell model: 0X8R3Y Bios: Dell v: A25 date: 03/06/2018
CPU:       Dual core Intel Core i7-2640M (-MCP-) cache: 4096 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 11174
           clock speeds: max: 3500 MHz 1: 1040 MHz 2: 3139 MHz
Graphics:  Card: Intel 2nd Generation Core Processor Family Integrated Graphics Controller bus-ID: 00:02.0
           Display Server: X.Org 1.18.4 drivers: intel (unloaded: fbdev,vesa)
           Resolution: firstname.lastname@example.org
           GLX Renderer: Mesa DRI Intel Sandybridge Mobile GLX Version: 3.0 Mesa 17.0.7 Direct Rendering: Yes
Audio:     Card Intel 6 Series/C200 Series Family High Definition Audio Controller
           driver: snd_hda_intel bus-ID: 00:1b.0
           Sound: Advanced Linux Sound Architecture v: k4.10.0-38-generic
Network:   Card-1: Intel 82579LM Gigabit Network Connection
           driver: e1000e v: 3.2.6-k port: 3080 bus-ID: 00:19.0
           IF: enp0s25 state: down mac: <filter>
           Card-2: Intel Centrino Advanced-N 6205 [Taylor Peak]
           driver: iwlwifi bus-ID: 02:00.0
           IF: wlp2s0 state: down mac: <filter>
Drives:    HDD Total Size: 287.3GB (2.7% used)
           ID-1: /dev/sda model: Samsung_SSD_860 size: 256.1GB temp: 0C
           ID-2: USB /dev/sdb model: Cruzer_Glide size: 15.6GB temp: 0C
           ID-3: USB /dev/sdc model: Cruzer_Glide size: 15.7GB temp: 0C
Partition: ID-1: swap-1 size: 2.56GB used: 0.00GB (0%) fs: swap dev: /dev/sda2
RAID:      No RAID devices: /proc/mdstat, md_mod kernel module present
Sensors:   System Temperatures: cpu: 65.0C mobo: N/A
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 192 Uptime: 1 min Memory: 564.7/7860.9MB
           Init: systemd runlevel: 5 Gcc sys: 5.4.0
           Client: Shell (bash 4.3.481) inxi: 2.2.35
mint@mint ~ $

------------------------------------
/sys/kernel/debug/dell_laptop/rfkill

return: 0
status: 0x1011D
Bit 0 : Hardware switch supported: 1
Bit 1 : Wifi locator supported: 0
Bit 2 : Wifi is supported: 1
Bit 3 : Bluetooth is supported: 1
Bit 4 : WWAN is supported: 1
Bit 5 : Wireless keyboard supported: 0
Bit 6 : UWB supported: 0
Bit 7 : WiGig supported: 0
Bit 8 : Wifi is installed: 1
Bit 9 : Bluetooth is installed: 0
Bit 10: WWAN is installed: 0
Bit 11: UWB installed: 0
Bit 12: WiGig installed: 0
Bit 16: Hardware switch is on: 1
Bit 17: Wifi is blocked: 0
Bit 18: Bluetooth is blocked: 0
Bit 19: WWAN is blocked: 0
Bit 20: UWB is blocked: 0
Bit 21: WiGig is blocked: 0

hwswitch_return: 0
hwswitch_state: 0x8
Bit 0 : Wifi controlled by switch: 0
Bit 1 : Bluetooth controlled by switch: 0
Bit 2 : WWAN controlled by switch: 0
Bit 3 : UWB controlled by switch: 1
Bit 4 : WiGig controlled by switch: 0
Bit 7 : Wireless switch config locked: 0
Bit 8 : Wifi locator enabled: 0
Bit 15: Wifi locator setting locked: 0

---------------------------------------
/sys/kernel/debug/iwlwifi/0000:02:00.0/trans/rfkill

[file doesn't exist in directory for USB drive 18.3 "live session"]

Chris
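The difference between the two kernels can be stated compactly. The Python sketch below uses the values reported in this thread (status 0x1011D on both kernels, hwswitch_state 0x8 on 4.10 and 0x1011D on 4.15) and flags the case where hwswitch_state is just a copy of status. The names `READINGS`, `set_bits` and `hwswitch_mirrors_status` are invented for this illustration, and an identical value is only a hint of a problem, not proof.

```python
# Values as reported in this thread for the same Latitude E6420.
READINGS = {
    "4.10.0-38-generic": {"status": 0x1011D, "hwswitch_state": 0x8},
    "4.15.0-36-generic": {"status": 0x1011D, "hwswitch_state": 0x1011D},
}


def set_bits(value):
    """Return the positions of all bits that are set in value."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def hwswitch_mirrors_status(reading):
    """True when hwswitch_state carries exactly the same value as status."""
    return reading["hwswitch_state"] == reading["status"]


if __name__ == "__main__":
    for kernel, reading in READINGS.items():
        note = "identical to status" if hwswitch_mirrors_status(reading) else "distinct"
        print(f"{kernel}: hwswitch_state bits {set_bits(reading['hwswitch_state'])} ({note})")
```

On the 4.10 reading only bit 3 is set in hwswitch_state, so Wifi is not under switch control, which matches the BIOS setting described in the report.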
Comment 13 Pepijn de Vos 2019-03-25 21:58:20 UTC
Right, so working kernels have working hwswitch_state. Can confirm that 4.13 works for me too. I've successfully compiled my first kernel from source. Currently bisecting through it to find the offending commit. Only a dozen steps left, so bear with me...
Comment 14 Pepijn de Vos 2019-03-26 11:03:50 UTC
Created attachment 282031 [details]
Git bisect log of /drivers/platfrom/x86/dell*

I bisected my way from 4.13 to 4.15, selecting only the files related to the dell platform. I ended up at commit 5246741a3f2e0285394cf74f3105cb252b8f38ad, which seems kinda trivial and unrelated: it just moves an allocation around.

Worth noting is that for some revisions the file /sys/kernel/debug/dell_laptop/rfkill did not exist or returned a "no such device" error. In these cases I tested with the switch itself rather than with the debug output. This means that I may have bisected my way to another, unrelated bug. It could be that the module just failed and the fallback mechanism was used.

I can either try again with "git bisect skip" in cases where the debugfs fails, or widen the scope to include other files, because I'm not 100% sure the problem is in the dell platform files either.
Comment 15 Chris 2019-03-26 16:34:28 UTC
I had a similar thought: whether this applies only to my generation of the Dell Latitude, the E6420 and related units (E6400, E6500 series, etc.), or perhaps also to later models, which may have more significant changes. Back around 2017 or so, libSMBIOS was a package from Dell which needed to be compiled and installed. I wanted to extend the time the keyboard backlight stays on beyond 30 seconds or so. These days it is the same as the monitor settings. This HW switch setting at least seems to be a regression of some sort. Thanks for your time...
Comment 16 Gabriele Mazzotta 2019-03-26 21:18:05 UTC
Commit 5246741a3f2e0285394cf74f3105cb252b8f38ad is right after a major refactoring of the driver (https://www.spinics.net/lists/platform-driver-x86/msg13672.html). That buffer is the one used to send the smbios request, so either the request was failing and you were getting out of dell_setup_rfkill() with an error or you were getting garbage in return. This would explain why you were not seeing the rfkill file.
Comment 17 Pepijn de Vos 2019-03-26 22:14:43 UTC
You are right. I tried again using "git bisect skip", and I narrowed it down to a range of commits just before that fix. This was a lengthy process because it more or less degraded to a linear search.

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
f2645fa317b8905b8934f06a0601d5b7fa66aba0
1f8543a5d602b816b9b64a62cafd6caae2af4ca6
ce7ff1cffdaf82354aca5f4c8691e5c85474fbde
307ab2a99d190d3a7949258b8551b66887ce8cf4
da1f607ed6e6a904463396bb6a28bf96584c61cc
1a258e670434f404a4500b65ba1afea2c2b29bba
8b9528a6d9a901b9f933231505fef5630e80ce5a
549b4930f057658dc50d8010e66219233119a4d8
868b8d33f91e431b1961a35baa6b5022639067f3
5246741a3f2e0285394cf74f3105cb252b8f38ad
We cannot bisect more!

Then I did a magic trick: I used "git bisect log", edited out all the skips, and replayed the bisection. But this time I cherry-picked the fix from 5246741a3f2e0285394cf74f3105cb252b8f38ad onto the broken commits, allowing me to narrow down the real culprit.

Ladies, gentlemen, and other beings, without further ado, the offending commit is

549b4930f057658dc50d8010e66219233119a4d8
platform/x86: dell-smbios: Introduce dispatcher for SMM calls

Now I badly need some sleep, but the next step is figuring out what's actually wrong. It'd be so cool to land a patch in the kernel. I'm completely new to kernel debugging though. Where is my gdb and printf?? Any guidance appreciated.
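Why skipped commits leave a whole range of suspects can be shown with a small model. The Python sketch below is a simplification that assumes a linear history and a single good-to-bad transition; it is not git's actual algorithm, and the commit names in the example are placeholders rather than hashes from this bug. Every commit after the newest known-good one, up to and including the oldest known-bad one, remains a candidate for "first bad".

```python
def first_bad_candidates(commits, results):
    """List the commits that could still be the first bad one.

    commits: commit ids ordered oldest to newest (linear history assumed).
    results: dict mapping a commit id to "good", "bad" or "skip";
             commits missing from the dict count as untested.
    """
    last_good = -1      # index of the newest commit known to be good
    first_bad = None    # index of the oldest commit known to be bad
    for index, commit in enumerate(commits):
        verdict = results.get(commit)
        if verdict == "good":
            last_good = index
        elif verdict == "bad" and first_bad is None:
            first_bad = index
    if first_bad is None:
        raise ValueError("no commit is known to be bad")
    if last_good > first_bad:
        raise ValueError("a good commit follows a bad one; history is not monotonic")
    return commits[last_good + 1:first_bad + 1]


if __name__ == "__main__":
    # Placeholder history: c1 and c2 cannot be tested (e.g. debugfs file missing).
    history = ["c0", "c1", "c2", "c3", "c4"]
    verdicts = {"c0": "good", "c1": "skip", "c2": "skip", "c3": "bad", "c4": "bad"}
    print(first_bad_candidates(history, verdicts))
```

Cherry-picking a fix that makes the skipped commits testable, as described in the comment above, turns those "skip" verdicts into "good" or "bad" and shrinks the candidate list to a single commit.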
Comment 18 Mario Limonciello 2019-03-27 03:00:15 UTC
Created attachment 282045 [details]
possible patch to fix issue

Can you see if this patch helps to fix the issue for you? I believe there might have been a logic error when the driver was converted in that commit.
Comment 19 Pepijn de Vos 2019-03-27 07:29:28 UTC
Looking at the commit with fresh eyes, this seems indeed the error. I tried your patch and it works! I'm glad we solved the problem, but honestly a bit salty because I was so excited to debug and fix my first kernel bug. So close...
Comment 20 Pali Rohár 2019-03-27 08:18:12 UTC
When you send that patch to LKML, please add a Fixes: tag with the commit which broke the driver, so that the patch gets propagated to -stable releases. Anyway, nice catch! You can add my Acked-by.
Comment 21 Pepijn de Vos 2019-03-27 09:46:01 UTC
Created attachment 282047 [details] Patch for v4.15 (The kernel version on Ubuntu 18.04 LTS)
Comment 22 Mario Limonciello 2019-03-27 15:03:38 UTC
Thanks guys, glad to hear the good news. I've sent it to the ML with these extra comments Pali. https://email@example.com/T/#u
Comment 23 Chris 2019-03-27 16:47:20 UTC
Thanks to all. That was the first time I reported a kernel-level issue. Pepijn, I will try your patched kernel.
Comment 24 Pepijn de Vos 2019-04-29 05:27:33 UTC
What is the expected path forward from here? It seems there has not been much action recently.
Comment 25 Mario Limonciello 2019-04-29 12:16:48 UTC