Bug 17722

Summary: 2.6.36-rc3: WARNING: at net/mac80211/scan.c:269 ieee80211_scan_completed
Product: Drivers Reporter: Maciej Rutecki (maciej.rutecki)
Component: network-wirelessAssignee: drivers_network-wireless (drivers_network-wireless)
Status: CLOSED CODE_FIX    
Severity: normal CC: florian, linville, maciej.rutecki, rjw, rockorequin, thomas, wey-yi.w.guy, zdenek.kabelac
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.36-rc3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 16444    
Attachments: Patch to fix abort_scan case in request_scan

Description Maciej Rutecki 2010-09-03 18:41:03 UTC
Subject    : 2.6.36-rc3: WARNING: at net/mac80211/scan.c:269 ieee80211_scan_completed
Submitter  : Thomas Meyer <thomas@m3y3r.de>
Date       : 2010-08-31 20:14
Message-ID : 201008312214.52473.thomas@m3y3r.de
References : http://marc.info/?l=linux-kernel&m=128328580504227&w=2

This entry is being used for tracking a regression from 2.6.35. Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2010-09-20 20:07:19 UTC
References : http://www.spinics.net/lists/netdev/msg140769.html
Comment 2 Rafael J. Wysocki 2010-09-23 20:20:06 UTC
*** Bug 18692 has been marked as a duplicate of this bug. ***
Comment 3 Zdenek Kabelac 2010-09-24 10:07:59 UTC
iwl3945 0000:03:00.0: Error: Response NULL in 'REPLY_ADD_STA'
iwl3945 0000:03:00.0: Adding station ff:ff:ff:ff:ff:ff failed.
------------[ cut here ]------------
WARNING: at net/mac80211/scan.c:266 ieee80211_scan_completed+0x13f/0x1e0 [mac80211]()
Hardware name: 6464CTO
Modules linked in: hidp ip6_tables ebtable_nat ebtables xt_tcpudp bridge stp llc nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
systemd[1]: Job atd.service/stop finished, success=yes
 rfcomm sco bnep l2cap ipt_REJECT xt_physdev xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc loop dm_mirror dm_region_hash dm_log kvm_intel kvm i915 drm_kms_helper drm i2c_algo_bit uinput snd_hda_codec_analog arc4 ecb snd_hda_intel crypto_blkcipher cryptomgr snd_hda_codec aead crypto_algapi iwl3945 snd_seq iwlcore snd_seq_device btusb snd_pcm mac80211 thinkpad_acpi snd_timer bluetooth snd psmouse i2c_i801 cfg80211 e1000e soundcore iTCO_wdt i2c_core wmi nvram serio_raw snd_page_alloc iTCO_vendor_support evdev autofs4 usbhid hid pcmcia sdhci_pci ehci_hcd uhci_hcd sr_mod sdhci cdrom usbcore yenta_socket mmc_core led_class video backlight output dm_mod [last unloaded: scsi_wait_scan]
Pid: 2235, comm: kworker/u:0 Not tainted 2.6.36-rc5-00074-gbdcf046 #104
Call Trace:
 [<ffffffff8105194f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff810519aa>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa01f54af>] ieee80211_scan_completed+0x13f/0x1e0 [mac80211]
 [<ffffffffa02b33c2>] iwl_bg_scan_completed+0x162/0x220 [iwlcore]
 [<ffffffff8106e77a>] process_one_work+0x19a/0x560
 [<ffffffff8106e70e>] ? process_one_work+0x12e/0x560
 [<ffffffff8147cea9>] ? _raw_spin_lock_irq+0x19/0x80
 [<ffffffffa02b3260>] ? iwl_bg_scan_completed+0x0/0x220 [iwlcore]
 [<ffffffff81070109>] worker_thread+0x169/0x340
 [<ffffffff8106ffa0>] ? worker_thread+0x0/0x340
 [<ffffffff81075266>] kthread+0xa6/0xb0
 [<ffffffff81003fd4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147da2b>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff8147e040>] ? restore_args+0x0/0x30
 [<ffffffff810751c0>] ? kthread+0x0/0xb0
 [<ffffffff81003fd0>] ? kernel_thread_helper+0x0/0x10
---[ end trace 9af1b6c7b17205a3 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/scan.c:269 ieee80211_scan_completed+0x1cf/0x1e0 [mac80211]()
Hardware name: 6464CTO
Modules linked in: hidp ip6_tables ebtable_nat ebtables xt_tcpudp bridge stp llc nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc rfcomm sco bnep l2cap ipt_REJECT xt_physdev xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc loop dm_mirror dm_region_hash dm_log kvm_intel kvm i915 drm_kms_helper drm i2c_algo_bit uinput snd_hda_codec_analog arc4 ecb snd_hda_intel crypto_blkcipher cryptomgr snd_hda_codec aead crypto_algapi iwl3945 snd_seq iwlcore snd_seq_device btusb snd_pcm mac80211 thinkpad_acpi snd_timer bluetooth snd psmouse i2c_i801 cfg80211 e1000e soundcore iTCO_wdt i2c_core wmi nvram serio_raw snd_page_alloc iTCO_vendor_support evdev autofs4 usbhid hid pcmcia sdhci_pci ehci_hcd uhci_hcd sr_mod sdhci cdrom usbcore yenta_socket mmc_core led_class video backlight output dm_mod [last unloaded: scsi_wait_scan]
Pid: 2235, comm: kworker/u:0 Tainted: G        W   2.6.36-rc5-00074-gbdcf046 #104
Call Trace:
 [<ffffffff8105194f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff810519aa>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa01f553f>] ieee80211_scan_completed+0x1cf/0x1e0 [mac80211]
 [<ffffffffa02b33c2>] iwl_bg_scan_completed+0x162/0x220 [iwlcore]
 [<ffffffff8106e77a>] process_one_work+0x19a/0x560
 [<ffffffff8106e70e>] ? process_one_work+0x12e/0x560
 [<ffffffff8147cea9>] ? _raw_spin_lock_irq+0x19/0x80
 [<ffffffffa02b3260>] ? iwl_bg_scan_completed+0x0/0x220 [iwlcore]
 [<ffffffff81070109>] worker_thread+0x169/0x340
 [<ffffffff8106ffa0>] ? worker_thread+0x0/0x340
 [<ffffffff81075266>] kthread+0xa6/0xb0
 [<ffffffff81003fd4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147da2b>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff8147e040>] ? restore_args+0x0/0x30
 [<ffffffff810751c0>] ? kthread+0x0/0xb0
 [<ffffffff81003fd0>] ? kernel_thread_helper+0x0/0x10
---[ end trace 9af1b6c7b17205a4 ]---
Comment 4 Florian Mickler 2010-09-24 10:52:48 UTC
In the referenced thread above, Johannes Berg wrote the following:
> On Tue, 2010-09-14 at 17:14 +0200, Michael S. Tsirkin wrote:
> > > No, it shouldn't happen. But there are various races in iwlwifi that can
> > > cause this. Stanislaw has fixed them recently, but it's a fairly large
> > > patchset. I'm not sure, but I don't think this is really new in .36.
> 
> > So ... get back to you after 2.6.37?
> 
> Err, yes, I guess so, though I suppose it doesn't really solve the
> problem. This particular instance should be harmless though, I believe,
> other than the warning.
> 
> johannes
Comment 5 Florian Mickler 2010-09-24 11:13:50 UTC
Oh, and I'm not shure, but shouldn't iwl3945_request_scan queue an "abort_scan" work, instead of a "scan_completed" work item in it's done: branch?

3011  done:
3012         /* can not perform scan make sure we clear scanning
3013          * bits from status so next scan request can be performed.
3014          * if we dont clear scanning status bit here all next scan
3015          * will fail
3016         */
3017         clear_bit(STATUS_SCAN_HW, &priv->status);
3018         clear_bit(STATUS_SCANNING, &priv->status);
3019 
3020         /* inform mac80211 scan aborted */
3021         queue_work(priv->workqueue, &priv->scan_completed);
3022 }
Comment 6 Florian Mickler 2010-09-28 05:26:26 UTC
Created attachment 31671 [details]
Patch to fix abort_scan case in request_scan

Can you check this patch? It is queued for 2.6.36.
Comment 7 Florian Mickler 2010-09-28 05:52:24 UTC
*** Bug 18992 has been marked as a duplicate of this bug. ***
Comment 8 John W. Linville 2010-09-28 13:37:48 UTC
Umm...where is it queued for 2.6.36?
Comment 9 John W. Linville 2010-09-28 20:24:45 UTC
I have now queued it in wireless-2.6 -- I hope that it isn't queued elsewhere...
Comment 10 Florian Mickler 2010-09-28 20:43:58 UTC
Hm, it probably was only queued in my imagination. Sorry for the confusion. 
(I confused it with the Acked-By by Wey-Yi Guy [1]) 

Also, I closed the other Bug as a duplicate, because it was the same backtrace.

I think it would be good, if the patch could be tested though, because I wasn't able to reproduce the issue on my laptop. I could only verify that it works like before.

Regards,
Flo


[1] http://www.spinics.net/lists/netdev/msg142085.html
Comment 11 rocko 2010-09-29 00:13:02 UTC
I've run with the patch for about 12 hours and so far the slowpath hasn't reoccurred. It's not a conclusive test yet because (eg) yesterday it happened only twice (and in rapid succession). I'll report back later.
Comment 12 rocko 2010-09-30 04:31:31 UTC
It's still working fine. I haven't seen the warn_slowpath since I installed the patch. I'm running 2.6.36-rc6 now with the patch installed.
Comment 13 Florian Mickler 2010-09-30 05:32:54 UTC
Thx for testing!
Comment 14 Rafael J. Wysocki 2010-10-03 21:06:53 UTC
Patch : https://bugzilla.kernel.org/attachment.cgi?id=31671
Handled-By : Florian Mickler <florian@mickler.org>
Comment 15 Rafael J. Wysocki 2010-10-10 17:31:16 UTC
Fixed by commit e7ee762cf074b0fd8eec483d0cef8fdbf0d04b81 .
Comment 16 Zdenek Kabelac 2010-10-18 19:33:16 UTC
looking at git log from my kernel - it looks like patch e7ee762cf074b  is really in - but I still could see this trace with 2.6.36-rc8.

Happens during NetworkManager shutdown.

iwl3945 0000:03:00.0: Error: Response NULL in 'REPLY_ADD_STA'
iwl3945 0000:03:00.0: Adding station ff:ff:ff:ff:ff:ff failed.
------------[ cut here ]------------
WARNING: at net/mac80211/scan.c:266 ieee80211_scan_completed+0x13f/0x1e0 [mac80211]()
Hardware name: 6464CTO
Modules linked in: hidp ip6_tables ebtable_nat ebtables iptable_mangle xt_tcpudp bridge stp llc sunrpc rfcomm sco bnep l2cap ipt_REJECT xt_physdev xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc dm_mirror dm_region_hash dm_log kvm_intel kvm i915 drm_kms_helper drm i2c_algo_bit uinput snd_hda_codec_analog arc4 ecb crypto_blkcipher cryptomgr aead snd_hda_intel crypto_algapi snd_hda_codec iwl3945 snd_seq iwlcore snd_seq_device snd_pcm thinkpad_acpi btusb mac80211 snd_timer e1000e snd psmouse i2c_i801 iTCO_wdt i2c_core nvram wmi soundcore cfg80211 bluetooth serio_raw iTCO_vendor_support snd_page_alloc evdev autofs4 usbhid hid pcmcia sdhci_pci ehci_hcd sdhci uhci_hcd sr_mod mmc_core yenta_socket cdrom led_class usbcore video backlight output dm_mod [last unloaded: scsi_wait_scan]
Pid: 2336, comm: kworker/u:1 Not tainted 2.6.36-rc8-00024-ga7ac73b #6
Call Trace:
 [<ffffffff810519af>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff81051a0a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa02784af>] ieee80211_scan_completed+0x13f/0x1e0 [mac80211]
 [<ffffffffa02d75a4>] iwl_bg_scan_completed+0x164/0x220 [iwlcore]
 [<ffffffff8106e7aa>] process_one_work+0x19a/0x560
 [<ffffffff8106e73e>] ? process_one_work+0x12e/0x560
 [<ffffffff8147ca69>] ? _raw_spin_lock_irq+0x19/0x80
 [<ffffffffa02d7440>] ? iwl_bg_scan_completed+0x0/0x220 [iwlcore]
 [<ffffffff81070139>] worker_thread+0x169/0x340
 [<ffffffff8106ffd0>] ? worker_thread+0x0/0x340
 [<ffffffff81075296>] kthread+0xa6/0xb0
 [<ffffffff81004014>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147d5eb>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff8147dc40>] ? restore_args+0x0/0x30
 [<ffffffff810751f0>] ? kthread+0x0/0xb0
 [<ffffffff81004010>] ? kernel_thread_helper+0x0/0x10
---[ end trace aeee1448e57d79cf ]---
------------[ cut here ]------------
WARNING: at net/mac80211/scan.c:269 ieee80211_scan_completed+0x1cf/0x1e0 [mac80211]()
Hardware name: 6464CTO
Modules linked in: hidp ip6_tables ebtable_nat ebtables iptable_mangle xt_tcpudp bridge stp llc sunrpc rfcomm sco bnep l2cap ipt_REJECT xt_physdev xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc dm_mirror dm_region_hash dm_log kvm_intel kvm i915 drm_kms_helper drm i2c_algo_bit uinput snd_hda_codec_analog arc4 ecb crypto_blkcipher cryptomgr aead snd_hda_intel crypto_algapi snd_hda_codec iwl3945 snd_seq iwlcore snd_seq_device snd_pcm thinkpad_acpi btusb mac80211 snd_timer e1000e snd psmouse i2c_i801 iTCO_wdt i2c_core nvram wmi soundcore cfg80211 bluetooth serio_raw iTCO_vendor_support snd_page_alloc evdev autofs4 usbhid hid pcmcia sdhci_pci ehci_hcd sdhci uhci_hcd sr_mod mmc_core yenta_socket cdrom led_class usbcore video backlight output dm_mod [last unloaded: scsi_wait_scan]
Pid: 2336, comm: kworker/u:1 Tainted: G        W   2.6.36-rc8-00024-ga7ac73b #6
Call Trace:
 [<ffffffff810519af>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff81051a0a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa027853f>] ieee80211_scan_completed+0x1cf/0x1e0 [mac80211]
 [<ffffffffa02d75a4>] iwl_bg_scan_completed+0x164/0x220 [iwlcore]
 [<ffffffff8106e7aa>] process_one_work+0x19a/0x560
 [<ffffffff8106e73e>] ? process_one_work+0x12e/0x560
 [<ffffffff8147ca69>] ? _raw_spin_lock_irq+0x19/0x80
 [<ffffffffa02d7440>] ? iwl_bg_scan_completed+0x0/0x220 [iwlcore]
 [<ffffffff81070139>] worker_thread+0x169/0x340
 [<ffffffff8106ffd0>] ? worker_thread+0x0/0x340
 [<ffffffff81075296>] kthread+0xa6/0xb0
 [<ffffffff81004014>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147d5eb>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff8147dc40>] ? restore_args+0x0/0x30
 [<ffffffff810751f0>] ? kthread+0x0/0xb0
 [<ffffffff81004010>] ? kernel_thread_helper+0x0/0x10
---[ end trace aeee1448e57d79d0 ]---
Comment 17 Florian Mickler 2010-10-19 07:48:38 UTC
Thanks for the information.

In 2.6.37 there will be a thorough rework of the iwl scanning and this is what will be backported to 2.6.36 stable: 
 http://www.spinics.net/lists/netdev/msg143086.html

Can you check with this on top of current mainline?
Comment 18 rocko 2010-10-26 05:32:07 UTC
FWIW, I've been running with that patch for five days now without any adverse effects, but I wasn't getting the slowpath message with the stock 2.6.36 either.
Comment 19 Zdenek Kabelac 2010-11-03 12:38:14 UTC
(In reply to comment #17)
> Thanks for the information.
> 
> In 2.6.37 there will be a thorough rework of the iwl scanning and this is
> what
> will be backported to 2.6.36 stable: 
>  http://www.spinics.net/lists/netdev/msg143086.html
> 
> Can you check with this on top of current mainline?

I'm using patch from given msg and I'm still able to see posted warning from my comment 16.

This time it happened during some  suspend/resume cycle.
On this other hand it does not appear as frequently as before (i.e. each reboot).
I've seen this message just once during suspend/resume.
Comment 20 Florian Mickler 2010-11-16 14:31:55 UTC
This should be fixed now in mainline.