Bug 12818

Summary: iwlagn broken after suspend to RAM (iwlagn: MAC is in deep sleep!)
Product: Drivers
Reporter: Stefan Seyfried (stefan.seyfried)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: RESOLVED CODE_FIX
Severity: normal
CC: alan, dmueller, helmut.schaa, linville, markus.zimmermann, reinette.chatre, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.30-rc2, 2.6.29 SUSE
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 11808
Attachments: /var/log/messages with debug-enabled iwlagn
             force restart of fw

Description Stefan Seyfried 2009-03-04 08:32:01 UTC
Latest working kernel version: Tested with 2.6.27, not sure about 2.6.28
Earliest failing kernel version: not sure, did not test before 2.6.29-rc5 :-(
Distribution: openSUSE FACTORY
Hardware Environment: HP Compaq 2510p
Software Environment: current openSUSE FACTORY
Problem Description: after resume from suspend to RAM, the iwlagn driver does not work anymore. It looks like the hardware is confused.

Steps to reproduce: suspend to RAM, resume. Try to use wireless.
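
A minimal sketch of the reproduction steps, assuming a kernel that supports suspend to RAM through /sys/power/state and a root shell. The function name s2ram_cycle and the RUN dry-run prefix are made up for illustration; a distribution tool such as s2ram or pm-suspend may be what is actually used.

```shell
# Sketch only: needs root. Set RUN=echo to print the commands instead of
# executing them (RUN is a hypothetical dry-run switch, not a kernel feature).
RUN=${RUN:-}

s2ram_cycle() {
    # Writing "mem" to /sys/power/state requests suspend to RAM; the write
    # returns once the machine has resumed.
    $RUN sh -c 'echo mem > /sys/power/state'
    # Show the most recent iwlagn lines from the kernel log after resume.
    $RUN sh -c 'dmesg | grep iwlagn | tail -n 20'
}
```

Calling s2ram_cycle with RUN=echo first is a cheap way to check what would be executed before actually suspending the machine.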

This is in the log after resume:

[ 4797.909541] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
[ 4797.909548] iwlagn: Copyright(c) 2003-2008 Intel Corporation
[ 4797.909755] iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 4797.909774] iwlagn 0000:10:00.0: setting latency timer to 64
[ 4797.909875] iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0xFFFFFFFF
[ 4797.909960] iwlagn: MAC is in deep sleep!
[ 4797.910001] iwlagn 0000:10:00.0: PCI INT A disabled
[ 4797.910029] iwlagn: probe of 0000:10:00.0 failed with error -5

Unloading and reloading the module after this has happened does not help, nor does a suspend to disk. The only way to make the wireless work again is to reboot.

WORKAROUND:
If I unload iwlagn before suspend and reload it after resume, it still works.
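
The workaround could be automated roughly as follows. This is a sketch that assumes pm-utils is in use, where hooks are called with one of suspend, hibernate, resume or thaw as the first argument. The file name (for example /etc/pm/sleep.d/50-iwlagn), the function name and the RUN dry-run prefix are assumptions, not something taken from this report.

```shell
#!/bin/sh
# Hypothetical pm-utils sleep hook: unload iwlagn before sleeping and load it
# again afterwards. Set RUN=echo to print the commands instead of running them.
MODULE=iwlagn
RUN=${RUN:-}

iwlagn_sleep_hook() {
    case "$1" in
        suspend|hibernate)
            # Going to sleep: remove the driver so it never sees the resume.
            $RUN modprobe -r "$MODULE"
            ;;
        resume|thaw)
            # Waking up: load the driver again so it probes the device afresh.
            $RUN modprobe "$MODULE"
            ;;
    esac
}

# Any other (or missing) argument is ignored.
iwlagn_sleep_hook "${1:-}"
```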
Comment 1 John W. Linville 2009-03-04 08:42:54 UTC
I thought this was supposed to be fixed by this commit:

commit 42802d71dd14dd0e435a8da59d817d0c6f8a2866
Author: Zhu, Yi <yi.zhu@intel.com>
Date:   Fri Dec 5 07:58:39 2008 -0800

    iwlwifi: fix "MAC in deep sleep" error
    
    This patch fixes the misuse of CSR_GP_CNTRL with CSR_RESET address
    in polling the CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY bit in
    iwl4965_apm_reset(). This causes "MAC in deep sleep" error sometimes.
    The patch also fixes the timeout value and the iwl_poll_bit() return
    value check.
    
    Signed-off-by: Zhu Yi <yi.zhu@intel.com>
    Acked-by: Tomas Winkler <tomas.winkler@intel.com>
    Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
Comment 2 Reinette Chatre 2009-03-04 11:35:25 UTC
Could you please compile your driver with debugging enabled (CONFIG_IWLWIFI_DEBUG)?

When you have done so, could you please modify your module load parameters (modprobe.conf?) to load the module like "modprobe iwlagn debug=0x8043fff"?
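
One way to make that debug mask persistent is an "options" line in the modprobe configuration. The sketch below only writes that line; the helper name is made up, and the right file depends on the distribution (modprobe.conf, modprobe.conf.local or a file under /etc/modprobe.d/), so the path is left to the caller.

```shell
# Write an "options" line for iwlagn to the given modprobe config file.
# $1: config file path (chosen by the caller; location is distro-specific)
# $2: debug mask, defaulting to the value requested in this comment
write_iwlagn_debug_opts() {
    conf=$1
    mask=${2:-0x8043fff}
    # Note: this overwrites the file, so point it at a dedicated file.
    printf 'options iwlagn debug=%s\n' "$mask" > "$conf"
}
```

After writing the line, the module has to be unloaded and loaded again for the parameter to take effect, and the driver must have been built with CONFIG_IWLWIFI_DEBUG.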
Comment 3 Reinette Chatre 2009-03-13 16:28:17 UTC
Could you please try to disable AMT in your BIOS?
Comment 4 Stefan Seyfried 2009-03-17 01:54:52 UTC
(In reply to comment #3)
> Could you please try to disable AMT in your BIOS?

I tried (selected "unconfigure AMT on next boot" and rebooted). However, I'm not sure if I was successful. Is there a way to check that?

It did not help, still

[  290.203264] iwlagn: MAC is in deep sleep!

after resume, rmmod/modprobe yields:

[  320.578922] iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0xFFFFFFFF
[  320.579006] iwlagn: MAC is in deep sleep!

I will try newer kernels and enabling debug stuff this week, I was on vacation last week and am just catching up.
Comment 5 Stefan Seyfried 2009-03-18 07:18:56 UTC
Created attachment 20584 [details]
/var/log/messages with debug-enabled iwlagn

This is with the openSUSE FACTORY kernel-debug package, which has CONFIG_IWLWIFI_DEBUG=y and with
modprobe iwlagn debug=0x8043fff

from the rmmod before until after resume from s2ram.
Comment 6 Reinette Chatre 2009-04-10 22:18:38 UTC
Created attachment 20933 [details]
force restart of fw 

This patch applies to 2.6.27 but can easily be ported to other kernel versions. Could you please give it a try? In this patch we force a restart of the firmware.
Comment 7 Stefan Seyfried 2009-04-10 22:59:49 UTC
I actually have updated to kernel 2.6.29 (as current in openSUSE FACTORY).

There the failure is different: after resume from s2ram, I get

[78817.168081] iwlagn: START_ALIVE timeout after 4000ms.

(unload module)

[78826.272220] iwlagn 0000:10:00.0: PCI INT A disabled

(reload module)

[78829.854420] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
[78829.854427] iwlagn: Copyright(c) 2003-2008 Intel Corporation
[78829.855659] iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[78829.855679] iwlagn 0000:10:00.0: setting latency timer to 64
[78829.855838] iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0x4
[78829.862885] iwlagn: Time out reading EEPROM[30]
[78829.862890] iwlagn: Unable to init EEPROM
[78829.862909] iwlagn 0000:10:00.0: PCI INT A disabled
[78829.862955] iwlagn: probe of 0000:10:00.0 failed with error -110

But interesting: after a suspend to disk, and unloading/reloading the module again, it works:

[187174.596759] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
[187174.596766] iwlagn: Copyright(c) 2003-2008 Intel Corporation
[187174.596921] iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[187174.596940] iwlagn 0000:10:00.0: setting latency timer to 64
[187174.597046] iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0x4
[187174.636730] iwlagn: Tunable channels: 13 802.11bg, 19 802.11a channels
[187174.636827] iwlagn 0000:10:00.0: irq 27 for MSI/MSI-X
[187174.666521] wmaster0 (iwlagn): not using net_device_ops yet
[187174.668471] phy3: Selected rate control algorithm 'iwl-agn-rs'
[187174.668564] wlan0 (iwlagn): not using net_device_ops yet
[187174.678483] wlan0 renamed to air by udevd [29436]
[187174.694850] udev: renamed network interface wlan0 to air
[187179.078897] iwlagn 0000:10:00.0: firmware: requesting iwlwifi-4965-2.ucode
[187179.140960] iwlagn loaded firmware version 228.57.2.23
[187179.347983] Registered led device: iwl-phy3::radio
[187179.348034] Registered led device: iwl-phy3::assoc
[187179.348075] Registered led device: iwl-phy3::RX
[187179.348113] Registered led device: iwl-phy3::TX
[187179.391988] ADDRCONF(NETDEV_UP): air: link is not ready
[187185.301338] air: authenticate with AP 00:09:5b:5b:0a:03
[187185.304000] air: authenticated
[187185.304005] air: associate with AP 00:09:5b:5b:0a:03
[187185.307685] air: RX AssocResp from 00:09:5b:5b:0a:03 (capab=0x431 status=0 aid=1)
[187185.307692] air: associated
[187185.330587] ADDRCONF(NETDEV_CHANGE): air: link becomes ready
[187185.401549] cfg80211: Calling CRDA for country: GB
[187185.508063] cfg80211: Current regulatory domain updated by AP to: GB
[187185.508079]         (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[187185.508086]         (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[187195.865067] air: no IPv6 routers present

Now should I still test the patch against 2.6.27? Or is there something else to test against 2.6.29? (I'm also able to test current git snapshots, it will just take a little more time.)
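
Since several different failures show up in this report, a small helper can tell them apart in a saved kernel log. This is only a sketch: the strings matched are the ones quoted in the logs above, while the function name and the short labels it prints are informal names invented here, not driver terminology.

```shell
# Read kernel log text on stdin and print a space-separated list of the
# failure signatures from this report that occur in it (empty if none).
classify_iwlagn_log() {
    log=$(cat)
    found=
    case "$log" in *"MAC is in deep sleep"*)    found="$found deep-sleep" ;; esac
    case "$log" in *"START_ALIVE timeout"*)     found="$found alive-timeout" ;; esac
    case "$log" in *"Time out reading EEPROM"*) found="$found eeprom-timeout" ;; esac
    case "$log" in *"REV=0xFFFFFFFF"*)          found="$found rev-all-ones" ;; esac
    # Strip the leading space left by the first match.
    echo "${found# }"
}
```

For example, "dmesg | classify_iwlagn_log" would report which of these signatures are present after a resume.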
Comment 8 Reinette Chatre 2009-04-10 23:23:17 UTC
Does not seem as though you encountered the original problem. It does not seem as though the device resumes well after s2r. I will have to see if I can reproduce that here.

In the s2disk case ... is an unload/reload of the module required for you to get wireless back? What happens if you do not reload the module?
Comment 9 Stefan Seyfried 2009-04-11 08:16:24 UTC
Actually, even after s2disk it is not reliably working after unloading / reloading. I tried the following:
- suspend to disk without unloading
- resume. Network is gone. "ifconfig up" gives timeout, kernel log shows "iwlagn: START_ALIVE timeout after 4000ms."
- unload/reload iwlagn.
[322385.073834] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
[322385.073840] iwlagn: Copyright(c) 2003-2008 Intel Corporation
[322385.074082] iwlagn 0000:10:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[322385.074099] __ratelimit: 3 callbacks suppressed
[322385.074104] do_IRQ: 0.99 No irq handler for vector
[322385.074131] iwlagn 0000:10:00.0: setting latency timer to 64
[322385.074248] iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0x4
[322385.081452] iwlagn: Time out reading EEPROM[30]
[322385.081460] iwlagn: Unable to init EEPROM
[322385.081485] iwlagn 0000:10:00.0: PCI INT A disabled
[322385.096054] iwlagn: probe of 0000:10:00.0 failed with error -110
- "ifconfig up": START_ALIVE timeout
- I wanted to try s2ram, unload/reload, then s2disk, unload/reload, because this was the original sequence that did work. However, after s2ram, unloading and trying to reload the module hung the machine hard (no sysrq, had to power off hard).

Before trying the s2ram, I had the following trace in the log twice, after loading the module, I'm not sure if it's related or a different issue:
[322427.805681] iwlagn 0000:10:00.0: firmware: requesting iwlwifi-4965-2.ucode
[322427.921024] iwlagn loaded firmware version 228.57.2.23
[322428.096372] ------------[ cut here ]------------
[322428.096380] WARNING: at drivers/net/wireless/iwlwifi/iwl-tx.c:1254 iwl_tx_cmd_complete+0x5e/0x1fc [iwlcore]()
[322428.096386] Hardware name: HP Compaq 2510p Notebook PC
[322428.096391] wrong command queue 25, sequence 0x3932 readp=0 writep=0
[322428.096396] Modules linked in: iwlagn iwlcore mac80211 cfg80211 cpufreq_stats hp_wmi rfkill tun usb_storage ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc hidp nfs lockd nfs_acl auth_rpcgss sunrpc autofs4 af_packet i915 drm i2c_algo_bit i2c_core sco bridge stp rfcomm bnep l2cap snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device ipv6 cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq fuse sha256_generic aes_x86_64 aes_generic cbc usbhid hid ohci_hcd dm_crypt loop dm_mod snd_hda_codec_analog(N) pcmcia arc4 ecb iTCO_wdt snd_hda_intel snd_hda_codec(N) kvm_intel(N) sr_mod sdhci_pci sdhci yenta_socket rsrc_nonstatic snd_hwdep kvm(N) sierra joydev serio_raw pcspkr cdrom sg iTCO_vendor_support ricoh_mmc mmc_core ieee1394 pcmcia_core snd_pcm snd_timer snd soundcore snd_page_alloc rtc_cmos rtc_core usbserial btusb tpm_infineon bluetooth e1000e intel_agp wmi battery rtc_lib container tpm tpm_bios video output ac hp_accel(N) led_class lis3lv02d(N) button uhci_hcd sd_mod crc_t10dif ehci_hcd usbcore edd ext3 mbcache jbd fan ide_pci_generic piix ide_core ata_piix ahci thermal processor thermal_sys hwmon ata_generic libata scsi_mod [last unloaded: iwlagn]
[322428.096622] Supported: Yes
[322428.096629] Pid: 0, comm: swapper Tainted: G        W N  2.6.29-6-default #1
[322428.096634] Call Trace:
[322428.096659]  [<ffffffff8020ff31>] try_stack_unwind+0x70/0x127
[322428.096672]  [<ffffffff8020f0c0>] dump_trace+0x9a/0x2a6
[322428.096684]  [<ffffffff8020fc82>] show_trace_log_lvl+0x4c/0x58
[322428.096696]  [<ffffffff8020fc9e>] show_trace+0x10/0x12
[322428.096707]  [<ffffffff804f5777>] dump_stack+0x72/0x7b
[322428.096720]  [<ffffffff80248353>] warn_slowpath+0xb1/0xed
[322428.096748]  [<ffffffffa03b6778>] iwl_tx_cmd_complete+0x5e/0x1fc [iwlcore]
[322428.096812]  [<ffffffffa03d0d73>] iwl_rx_handle+0x12e/0x231 [iwlagn]
[322428.096847]  [<ffffffffa03d1092>] iwl_irq_tasklet+0x21c/0x2d0 [iwlagn]
[322428.096873]  [<ffffffff8024d086>] tasklet_action+0xb1/0x13b
[322428.096885]  [<ffffffff8024de8b>] __do_softirq+0xd6/0x1f3
[322428.096897]  [<ffffffff8020d83c>] call_softirq+0x1c/0x30
[322428.096908]  [<ffffffff8020ea10>] do_softirq+0x44/0x8f
[322428.096919]  [<ffffffff8024db36>] irq_exit+0x3f/0x7e
[322428.096929]  [<ffffffff8020ecb4>] do_IRQ+0xc3/0xe7
[322428.096939]  [<ffffffff8020cf93>] ret_from_intr+0x0/0x29
[322428.096968]  [<ffffffffa00815bc>] acpi_idle_enter_simple+0x172/0x1ed [processor]
[322428.096988]  [<ffffffff8045119d>] cpuidle_idle_call+0x8c/0xc7
[322428.097000]  [<ffffffff8020b4ab>] cpu_idle+0x59/0x9a
[322428.097013]  [<ffffffff804f0280>] start_secondary+0xbd/0xbf
[322428.097021] ---[ end trace 008890f8b610dcbc ]---
[322428.097046] ------------[ cut here ]------------
Comment 10 Rafael J. Wysocki 2009-04-11 10:41:17 UTC
FWIW, I have iwlagn in my test box (Toshiba Portege R500) and it works well after resume from suspend to RAM and from hibernation.

Stefan, have you tested the current mainline?
Comment 11 Stefan Seyfried 2009-04-11 16:02:20 UTC
I'm using openSUSE Factory, which should be pretty close to mainline. It does lag behind, though: a kernel usually gets checked into Factory only once it is in the RC phase, so there are no git snapshots.
Comment 12 Rafael J. Wysocki 2009-04-11 18:20:52 UTC
In that case it's better to wait for .30-rc2.  There are quite a few fixes related to suspend in current -git.
Comment 13 Dirk Mueller 2009-04-17 12:59:31 UTC
I've tried 2.6.30-rc2 and it doesn't seem to help in that regard. I also run into that issue.
Comment 14 Reinette Chatre 2009-05-24 17:47:02 UTC
There are a few issues in this bug report. The warning containing "wrong command queue" will be fixed when you update to the latest ucode.

For the "MAC is in deep sleep" issue, please try out the latest code from wireless-testing. We just pushed a few patches there that address this problem. These patches are significant and cannot be backported.
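
To check which ucode version the driver actually loaded, the "loaded firmware version" line shown in comment 7 can be pulled out of the kernel log. A minimal sketch, assuming the message keeps the wording seen there; the function name is made up.

```shell
# Read kernel log text on stdin and print the last ucode version that
# iwlagn reported loading (prints nothing if no such line is present).
iwlagn_fw_version() {
    sed -n 's/.*iwlagn loaded firmware version \([0-9.]*\).*/\1/p' | tail -n 1
}
```

Typical use would be "dmesg | iwlagn_fw_version" after bringing the interface up, since the firmware is only requested at that point.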
Comment 15 Reinette Chatre 2009-06-19 21:06:26 UTC
Stefan or Dirk - have either of you been able to test the latest code?
Comment 16 Markus Zimmermann 2009-06-19 23:06:50 UTC
I have a thinkpad x200s with a WiFi Link 5100 and I'm using the openSUSE 2.6.30-41-default factory kernel.

I can suspend to disk and to RAM with wlan enabled or disabled, and after resume I can connect to an AP without any issues. With a 2.6.28 or 2.6.29 kernel this didn't work for me and I had the same issues as posted in this bug report.

But now everything just works! I think you can close this bug (?)
Comment 17 Stefan Seyfried 2009-06-20 00:08:58 UTC
Same for me - running FACTORY, with all the latest stuff (I guess latest ucode and recent 2.6.30-rc kernels), the problem vanished a few weeks ago. I removed the "unload iwlagn during suspend" hack from my system at least two weeks ago.

Sorry for the late answer.