Created attachment 152231 [details]
The output of `dmesg` from a boot after using alt+sysrq+reisub, showing what happens when the network card doesn't work (it fails to enter state 4, as the log says)

Sometimes I have to use the alt+sysrq+reisub combination to restart a frozen machine, but after it boots back up (directly after the combination, with no restarts or shutdowns in between) I am not able to use my wireless card. My dmesg gets littered with:

[   38.195828] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000068]
[   39.297104] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000068]
[   39.297114] ieee80211 phy0: rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5)

After shutting down the system via `shutdown -h now` and starting it up again, the card works as expected. I am attaching the dmesg log from a boot where the issue happens.

What `lspci -v` says about my wireless card:

03:00.0 Network controller: Ralink corp. RT3290 Wireless 802.11n 1T/1R PCIe
        Subsystem: Hewlett-Packard Company Ralink RT3290LE 802.11bgn 1x1 Wi-Fi and Bluetooth 4.0 Combo Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at d0610000 (32-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-00-d5-1d-9d-31-17-a4
        Kernel driver in use: rt2800pci
        Kernel modules: rt2800pci
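For anyone who wants to check a saved dmesg log for the same signature, here is a minimal sketch in Python. The function names and the regular expression are illustrative assumptions based only on the three log lines quoted in this report; they are not part of the driver or of any existing tool.

```python
import re

# Matches kernel log lines of the shape quoted in the report, e.g.
# [   38.195828] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000068]
# The pattern is an assumption derived from those lines only.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+ieee80211 (?P<phy>phy\d+): "
    r"(?P<func>rt2800\w+): Error - (?P<msg>.+)"
)

# The three lines from the original report, used as sample input.
SAMPLE = """\
[   38.195828] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000068]
[   39.297104] ieee80211 phy0: rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x00000068]
[   39.297114] ieee80211 phy0: rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5)
"""


def find_rt2800_errors(dmesg_text):
    """Return (timestamp, phy, function, message) for each rt2800 error line."""
    hits = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            hits.append(
                (float(m.group("ts")), m.group("phy"),
                 m.group("func"), m.group("msg").strip())
            )
    return hits


def has_failure_signature(dmesg_text):
    """True if the log has both the WPDMA busy error and the failed state change."""
    hits = find_rt2800_errors(dmesg_text)
    busy = any(func == "rt2800_wait_wpdma_ready" for _, _, func, _ in hits)
    state = any(
        func == "rt2800pci_set_device_state" and "failed to enter state" in msg
        for _, _, func, msg in hits
    )
    return busy and state


if __name__ == "__main__":
    print(has_failure_signature(SAMPLE))
```

Feeding it the contents of a saved log file instead of `SAMPLE` should tell a working boot from a failing one, provided the messages keep the format shown above.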
Sorry, I forgot to add that this also happens with the vanilla Arch Linux kernel, version 3.16.3-1. It seems that only shutting down and starting up again fixes this; `sudo reboot` does not.
Using connman this happens after a simple restart. NetworkManager somehow only manages to trigger this bug with sysrq+reisub.
(In reply to Giedrius Statkevičius from comment #2)
> Using connman this happens after a simple restart. NetworkManager somehow
> only manages to trigger this bug with sysrq+reisub.

On my machine this happens after any restart. I'm using NetworkManager, but the bug also reproduces when no connman, wicd, NetworkManager or any other network manager is installed.

This reproduces on these kernels too: 4.8.0-2 (debian), 4.8.13-Arch, 4.8.16 (fedora), 4.10.0rc3
(In reply to Mike Wortin from comment #3)
> This reproduces on these kernels too: 4.8.0-2 (debian), 4.8.13-Arch, 4.8.16
> (fedora), 4.10.0rc3

Did it work on previous versions? If so, what is the latest working kernel version?
(In reply to Stanislaw Gruszka from comment #4)
> (In reply to Mike Wortin from comment #3)
> > This reproduces on these kernels too: 4.8.0-2 (debian), 4.8.13-Arch, 4.8.16
> > (fedora), 4.10.0rc3
>
> Did it work on previous versions? If so, what is the latest working
> kernel version?

Yes, it works on previous versions. As far as I remember, the latest kernel version that does not have this bug is 3.16 (debian stable).
These are the rt2x00 changes between 3.16 and 4.8:

$ git log v3.16..v4.8 --no-merges --oneline -- drivers/net/wireless/ralink/
2557654 rt2800lib: enable MFP if hw crypt is disabled
57fbcce cfg80211: remove enum ieee80211_band
8b4c000 rt2x00usb: Use usb anchor to manage URB
f36f299 rt2x00: add new rt2800usb device Buffalo WLI-UC-G450
9cc3fdc rt2x00: unterminated strlen of user data
5b45171 net: wireless: rt2x00: Space Required
b2cc2dd net: wireless: rt2x00: Space issue
ac2b335 net: wireless: rt2x00: Fixed Spacing issues
262c741 rt2x00: fix monitor mode regression
50ea05e mac80211: pass block ack session timeout to to driver
7683fe0 rt2x00pci: Disable memory-write-invalidate when the driver exits
952348a rt2x00: type bug in _rt2500usb_register_read()
33aca94 rt2x00: move under ralink vendor directory

$ git log v3.16..v4.8 --no-merges --oneline -- drivers/net/wireless/rt2x00/
33aca94 rt2x00: move under ralink vendor directory
4a733ef mac80211: remove PM-QoS listener
910367e rt2800usb: add usb ID 1b75:3070 for Airlive WT-2000USB
e3abc8f mac80211: allow to transmit A-MSDU within A-MPDU
11ab35e rt2x00: use DECLARE_EWMA
f10746f rt2x00: adjust EEPROM_SIZE for rt2500usb
ed8e0ed rt2800: fix assigning same WCID for different stations
30686bf mac80211: convert HW flags to unsigned long bitmap
9352c19 mac80211: extend get_tkip_seq to all keys
df14046 mac80211: remove support for IFF_PROMISC
01fbd4e rt2800usb: check Autorun mode on FW load only once
ea345c1 rt2x00: add new rt2800usb device DWA 130
7daa54b rt2x00usb: drop rt2x00usb_disable_radio() from rt2800usb_disable_radio()
92d5e24 rt2x00usb: check USB's request error code in rt2800usb_autorun_detect()
e4fcfaf rt2x00usb: initialize the read value in case of failure
4ed20be cfg80211: remove "channel" from survey names
6341e62 kconfig: use bool instead of boolean for type definition attributes
b9d305c rt2x00: use helper to check capability/requirement
dc50a52 Revert "rt2x00: Endless loop on hub port power down"
14bc8bd rt2x00: change REGISTER_TIMEOUT
7a5a735 rt2x00: change REGISTER_BUSY_COUNT for USB
ad92bc9 rt2x00: use timeout in rt2x00usb_vendor_request
87dd2d7 rt2800: calculate tx power temperature compensation on selected chips
a344d67 mac80211: allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR
cfd9167 rt2x00: do not align payload on modern H/W
664d6a7 wireless: rt2x00: add new rt2800usb device
e9dc51a rt2x00: tune multi-registers I/O timeout
f853e9b net: wireless: rt2x00: drop owner assignment from platform_drivers
01f7fee rt2800: correct BBP1_TX_POWER_CTRL mask
ac0372a rt2x00: support Ralink 5362.
9baa3c3 PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
6a06e55 wireless: rt2x00: add new rt2800usb devices
d4150246 drivers/net/wireless/rt2x00/rt2x00dev.c: remove null test before kfree
df6e633 rt2x00: Use dma_zalloc_coherent
19dcb76 rt2x00: do not initialize BCN_OFFSET registers
ddb4055 rt2x00: change order when stop beaconing
88ff2f4 rt2x00: change default MAC_BSSID_DW1_BSS_BCN_NUM
ba08910 rt2x00: change beaconing setup on RT2800
283dafa rt2x00: change beaconing locking
57eaeb6 net: wireless: rt2x00: rt2x00mac.c: Cleaning up uninitialized variables

I do not see a commit among them that could possibly cause the problem. Perhaps the issue was caused by a change in the PCI subsystem, not in the rt2x00 driver.
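A list of this length is easier to review when narrowed to the subjects that mention the bus in question. The following is a minimal sketch in Python, assuming the `git log --oneline` output has been saved as text; the helper names and the keyword choice are hypothetical, and a keyword match says nothing about whether a commit is actually related to the bug.

```python
# A few lines copied from the `git log --oneline` output above, as sample input.
SAMPLE_LOG = """\
2557654 rt2800lib: enable MFP if hw crypt is disabled
7683fe0 rt2x00pci: Disable memory-write-invalidate when the driver exits
8b4c000 rt2x00usb: Use usb anchor to manage URB
9baa3c3 PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
"""


def parse_oneline(log_text):
    """Split `git log --oneline` output into (abbreviated hash, subject) pairs."""
    commits = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, _, subject = line.partition(" ")
        commits.append((sha, subject))
    return commits


def shortlist(commits, keywords):
    """Keep commits whose subject contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [
        (sha, subject)
        for sha, subject in commits
        if any(k in subject.lower() for k in lowered)
    ]


if __name__ == "__main__":
    for sha, subject in shortlist(parse_oneline(SAMPLE_LOG), ["pci"]):
        print(sha, subject)
```

This only filters by subject text, so changes elsewhere in the PCI subsystem (outside the two driver directories passed to `git log`) would not appear in the input at all.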
If I understand correctly, the problem does not happen after a power-off and power-on, only when a software reset is performed. Hence the card is perhaps not being reset properly by the PCI sub-layer.
Created attachment 251941 [details]
enable rt3290 unconditionally

Does the patch make things better?
I updated the BIOS to the latest version and this bug no longer affects me. My BIOS version is now F35 (Insyde H2O). I have an HP 15-r047er laptop.
Pity you didn't test the patch before BIOS update ...
*** Bug 192581 has been marked as a duplicate of this bug. ***
Giedrius, could you test the patch from comment 8?
(In reply to Stanislaw Gruszka from comment #12)
> Giedrius, could you test the patch from comment 8?

I could give it a try this weekend. Due to this bug I actually switched to an Atheros card, but I will put the old card back in just for this.
Created attachment 253271 [details]
rt3290_enable_disable.patch

I'm not sure the previous patch will be sufficient, so please also test this one, which also disables RT3290 in the PCI remove callback. (However, I'm not sure whether a reset via sysrq runs the PCI device remove callbacks; if it does not, the RT3290 disable should be done at initialization, before the enabling code.)
(In reply to Stanislaw Gruszka from comment #8)
> Created attachment 251941 [details]
> enable rt3290 unconditionally
>
> Does the patch make things better?

Applied this on next-20170125. With pure next-20170125 the bug is still reproducible on the same hardware, and the same messages are printed as described in the original report. It seems the best way to reproduce this is to get some network activity going and type alt+sysrq+reisub quickly. With this patch I was unable to reproduce the bug in 10 tries or so.
(In reply to Stanislaw Gruszka from comment #14)
> Created attachment 253271 [details]
> rt3290_enable_disable.patch
>
> I'm not sure the previous patch will be sufficient, so please also test
> this one, which also disables RT3290 in the PCI remove callback. (However,
> I'm not sure whether a reset via sysrq runs the PCI device remove
> callbacks; if it does not, the RT3290 disable should be done at
> initialization, before the enabling code.)

Reproduced the bug on the second or third try with this patch. The patch was applied on next-20170125. The error messages were identical to the ones in the original report.
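To put the "10 tries or so" without a failure (first patch) next to "the second or third try" (second patch) in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes each reboot attempt reproduces the bug independently with a fixed probability, which is a simplification, and the per-attempt rates used (1/3 and 1/2) are guesses read off the "second or third try" remark, not measurements.

```python
def chance_of_clean_run(p_repro, attempts):
    """Probability of seeing no failure in `attempts` independent tries,
    if each try would reproduce the bug with probability `p_repro`."""
    if not 0.0 <= p_repro <= 1.0:
        raise ValueError("p_repro must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return (1.0 - p_repro) ** attempts


if __name__ == "__main__":
    # Assumed per-attempt reproduction rates (hypothetical, see lead-in).
    for p in (1 / 3, 1 / 2):
        print(f"p={p:.2f}: chance of 10 clean tries = {chance_of_clean_run(p, 10):.4f}")
```

Under these assumptions, ten clean tries by luck alone would be unlikely (under 2% even at the lower rate), which is consistent with the first patch having a real effect, though it does not prove it.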
(In reply to Giedrius Statkevičius from comment #16)
> (In reply to Stanislaw Gruszka from comment #14)
> > Created attachment 253271 [details]
> > rt3290_enable_disable.patch
> >
> > I'm not sure the previous patch will be sufficient, so please also test
> > this one, which also disables RT3290 in the PCI remove callback. (However,
> > I'm not sure whether a reset via sysrq runs the PCI device remove
> > callbacks; if it does not, the RT3290 disable should be done at
> > initialization, before the enabling code.)
>
> Reproduced the bug on the second or third try with this patch. The patch was
> applied on next-20170125. The error messages were identical to the ones in
> the original report.

Also, this patch introduces some compilation warnings.
(In reply to Giedrius Statkevičius from comment #15)
> alt+sysrq+reisub quickly. With this patch I was unable to reproduce the bug
> in 10 tries or so.

I assume the patch fixes the originally reported bug. I will post it soon.
The patch was committed to wireless-drivers-next:

https://git.kernel.org/cgit/linux/kernel/git/kvalo/wireless-drivers-next.git/commit/?id=6715208d0a95ae417203f8e4a7937c1b4c4947f2

I'm closing the bug. Thanks for reporting and testing, Giedrius!