Bug 11326

Summary: iwl3945: Error: Response NULL in 'REPLY_ADD_STA', followed by fault
Product: Drivers
Reporter: Jim Paris (jim)
Component: network-wireless
Assignee: Zhu Yi (yi.zhu)
Status: RESOLVED CODE_FIX
Severity: normal
CC: marcus, raa.lkml
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.27-rc3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: patch to try

Description Jim Paris 2008-08-13 21:36:41 UTC
I'm playing with aircrack-ng, trying to find a kernel version that works well with airodump and my iwl3945 card.  On 2.6.27-rc3, it captures packets very slowly (it must be missing a lot). Eventually, while playing around with iwconfig options and running airodump, I got this:

[19991.393753] iwl3945: Error: Response NULL in 'REPLY_ADD_STA'
[19991.393753] BUG: unable to handle kernel paging request at ffff8801ffff8801
[19991.393753] IP: [<ffffffff80293d8a>] unmap_vmas+0x7ea/0x990
[19991.393753] PGD 202063 PUD 0 
[19991.393753] Oops: 0000 [1] PREEMPT SMP 
[19991.393753] CPU 1 
[19991.393753] Modules linked in: iwl3945 i915 drm binfmt_misc rfcomm l2cap kvm_intel kvm ppdev parport_pc lp parport ipv6 pci_slot sbs sbshc container acpi_cpufreq cpufreq_stats cpufreq_powersave cpufreq_conservative cpufreq_ondemand freq_table cpufreq_userspace fuse input_polldev loop firewire_sbp2 hci_usb pcmcia bluetooth snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm arc4 ecb snd_timer snd_page_alloc snd_hwdep serio_raw sierra snd psmouse usbserial pcspkr mac80211 i2c_i801 yenta_socket rsrc_nonstatic pcmcia_core i2c_core cfg80211 ac battery soundcore video output button thinkpad_acpi intel_agp rfkill led_class nvram evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc dm_crypt crypto_blkcipher dm_mirror dm_log dm_snapshot dm_mod pata_acpi ata_generic sd_mod crc_t10dif firewire_ohci ata_piix ahci firewire_core crc_itu_t libata scsi_mod ehci_hcd e1000e uhci_hcd dock thermal processor fan thermal_sys [last unloaded: iwl3945]
[19991.393753] Pid: 583, comm: airodump-ng Not tainted 2.6.27-rc3 #1
[19991.393753] RIP: 0010:[<ffffffff80293d8a>]  [<ffffffff80293d8a>] unmap_vmas+0x7ea/0x990
[19991.393753] RSP: 0018:ffff880101f45ae8  EFLAGS: 00010246
[19991.393753] RAX: ffff8801ffff8801 RBX: ffff880028031440 RCX: 0000000000000000
[19991.393753] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000001
[19991.393753] RBP: ffff880101f45bd8 R08: 0000000000000000 R09: 0000000000000000
[19991.393753] R10: 0000000000000000 R11: 0000000000000246 R12: ffffe20004515560
[19991.393753] R13: ffff88013959adb0 R14: 00007fec9a3b7000 R15: ffffffffffffff97
[19991.393753] FS:  0000000000000000(0000) GS:ffff88013b892b40(0000) knlGS:0000000000000000
[19991.393753] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[19991.393753] CR2: ffff8801ffff8801 CR3: 0000000000201000 CR4: 00000000000026a0
[19991.393753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[19991.393753] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[19991.393753] Process airodump-ng (pid: 583, threadinfo ffff880101f44000, task ffff880101f52140)
[19991.393753] Stack:  ffff880101f45b58 00007fec9a3ecfff 0000000000000000 ffff880101f45be8
[19991.393753]  ffffffffffffffff 31dc588000000000 ffff8801ffff8801 ffff880101f45bf0
[19991.393753]  0000000100000082 0000000000000000 ffff880139521940 00007fec9a3ed000
[19991.393753] Call Trace:
[19991.393753]  [<ffffffff8029878e>] exit_mmap+0x9e/0x160
[19991.393753]  [<ffffffff8023c23d>] mmput+0x2d/0xd0
[19991.393753]  [<ffffffff80240598>] exit_mm+0x108/0x140
[19991.393753]  [<ffffffff804bbe61>] ? _spin_unlock_irq+0x11/0x40
[19991.393753]  [<ffffffff8024288a>] do_exit+0x80a/0x960
[19991.393753]  [<ffffffff804a7de2>] ? wireless_send_event+0x172/0x310
[19991.393753]  [<ffffffff80430900>] ? netdev_run_todo+0x1d0/0x270
[19991.393753]  [<ffffffff80242a1e>] do_group_exit+0x3e/0xb0
[19991.393753]  [<ffffffff8024de4d>] get_signal_to_deliver+0x27d/0x3c0
[19991.393753]  [<ffffffff8020b6b9>] do_notify_resume+0x99/0x9c0
[19991.393753]  [<ffffffff804bbea2>] ? _spin_unlock_irqrestore+0x12/0x40
[19991.393753]  [<ffffffff80259821>] ? hrtimer_start+0x101/0x200
[19991.393753]  [<ffffffff80259b09>] ? ktime_get_ts+0x59/0x60
[19991.393753]  [<ffffffff8022fc23>] ? hrtick_start_fair+0x183/0x1a0
[19991.393753]  [<ffffffff804bbe61>] ? _spin_unlock_irq+0x11/0x40
[19991.393753]  [<ffffffff804b974c>] ? thread_return+0xa3/0x717
[19991.393753]  [<ffffffff802bbae1>] ? vfs_ioctl+0x31/0xa0
[19991.393753]  [<ffffffff804ba7a7>] ? do_nanosleep+0x77/0xc0
[19991.393753]  [<ffffffff8020c62c>] ? sysret_signal+0x14/0x1f
[19991.393753]  [<ffffffff8020c8a7>] ptregscall_common+0x67/0xb0
[19991.393753] 
[19991.393753] 
[19991.393753] Code: 80 38 e0 ff ff 08 0f 84 e5 00 00 00 48 83 bd 58 ff ff ff 00 0f 85 00 01 00 00 e8 f2 60 22 00 48 8b 85 40 ff ff ff bf 01 00 00 00 <4c> 8b 20 e8 4e b5 22 00 48 8b 1d ef 29 3b 00 48 c7 c2 40 74 69 
[19991.393753] RIP  [<ffffffff80293d8a>] unmap_vmas+0x7ea/0x990
[19991.393753]  RSP <ffff880101f45ae8>
[19991.393753] CR2: ffff8801ffff8801
[19991.393753] ---[ end trace 10018cf841eb794c ]---
[19991.393753] Fixing recursive fault but reboot is needed!
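
One way to read the register dump above: the faulting address in CR2 is also the value held in RAX, and the bytes marked `<4c> 8b 20` on the Code: line decode to a 64-bit load through RAX into R12, so the kernel most likely faulted while dereferencing a bad pointer in RAX. The minimal Python sketch below checks this against an excerpt of the log (timestamps dropped). The helper names `parse_registers` and `registers_holding` are illustrative only and are not part of any kernel tool.

```python
import re

# Excerpt copied from the oops above, with the timestamps dropped.
OOPS = """\
BUG: unable to handle kernel paging request at ffff8801ffff8801
RAX: ffff8801ffff8801 RBX: ffff880028031440 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000001
RBP: ffff880101f45bd8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: ffffe20004515560
R13: ffff88013959adb0 R14: 00007fec9a3b7000 R15: ffffffffffffff97
CR2: ffff8801ffff8801 CR3: 0000000000201000 CR4: 00000000000026a0
"""


def parse_registers(text):
    """Map register names (RAX, R08, CR2, ...) to their integer values."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}):\s+([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def registers_holding(regs, address):
    """List the general-purpose registers whose value equals the address."""
    return sorted(
        name for name, value in regs.items()
        if value == address and not name.startswith("CR")
    )


regs = parse_registers(OOPS)
fault = regs["CR2"]                      # CR2 holds the faulting address
suspects = registers_holding(regs, fault)
high, low = fault >> 32, fault & 0xFFFFFFFF

print("registers equal to CR2:", suspects)
print("upper/lower 32 bits:", hex(high), hex(low))
```

Both 32-bit halves of the address are 0xffff8801, which looks more like overwritten memory than a valid kernel pointer. That is a guess from the pattern, not a confirmed diagnosis.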
Comment 1 John W. Linville 2008-08-14 11:08:02 UTC
Hmmm...assigning to Zhu Yi based on iwl3945 error message...
Comment 2 Zhu Yi 2008-08-14 22:58:05 UTC
Created attachment 17258
patch to try

Please see if this patch fixes the problem.
Comment 3 Alex Riesen 2008-12-18 13:42:55 UTC
I had a very similar crash and tried your patch. After a day of light use,
I couldn't reproduce it. It may have helped. I'll keep using it...
Comment 4 Alex Riesen 2008-12-19 12:39:36 UTC
It certainly helped with the crash: I had the error message once, but
no crash. What does the patch actually do? What is canceled (according
to the label's name)?
Comment 5 Reinette Chatre 2009-01-09 13:22:33 UTC
(In reply to comment #4)
> It certainly helped with the crash: I had the error message once, but
> no crash. What does the patch actually do? What is canceled (according
> to the label's name)?

The patch has been submitted upstream. The commit message explains the details you are interested in.
http://marc.info/?l=linux-wireless&m=123143878214020&w=2

Could you please close this bug?
Comment 6 Alex Riesen 2009-01-10 02:02:52 UTC
According to the last message (attached) from Zhu Yi <yi.zhu@intel.com> it is
not fixed, just worked around:
From: Zhu Yi <yi.zhu@intel.com>
To: Alex Riesen <raa.lkml@gmail.com>
Cc: "Chatre, Reinette" <reinette.chatre@intel.com>, "linville@tuxdriver.com" <linville@tuxdriver.com>, "ipw3945-devel@lists.sourceforge.net" <ipw3945-devel@lists.sourceforge.net>, Jim Paris <jim@jtan.com>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Linus Torvalds <torvalds@linux-foundation.org>
In-Reply-To: <81b0412b0812250640t5d6180dct31d8cd941e06ef50@mail.gmail.com>
References: <alpine.LFD.2.00.0812241540170.3535@localhost.localdomain>
	 <81b0412b0812250640t5d6180dct31d8cd941e06ef50@mail.gmail.com>
Date: Fri, 26 Dec 2008 10:27:02 +0800
Message-Id: <1230258422.27521.9.camel@debian>

On Thu, 2008-12-25 at 22:40 +0800, Alex Riesen wrote:
> iwl3945 is still broken as described in
> 
>   http://bugzilla.kernel.org/show_bug.cgi?id=11326
> 
> It is actually quite annoying, as it prevents shutdown when crashed.
> The bug discussion mentions a fix (which looks like a workaround,
> because there is still a scary message in the log). But at least
> the system can be safely shut down.

The patch should be a valid fix for the symptom. But we haven't found the
root cause yet. Will submit the patch to wireless-testing.

Thanks,
-yi

May I suggest _not_ to close the _bug_ until it is actually _fixed_?
Comment 7 Alex Riesen 2009-01-10 02:11:58 UTC
(OK, not worked around: the crash that followed the error was fixed. That still does not fix the underlying "Response NULL in 'REPLY_ADD_STA'" error.)
Comment 8 Alex Riesen 2009-01-11 04:25:52 UTC
OTOH, how about opening another bug specifically for this
"NULL in REPLY_ADD_STA" problem? (assuming the root cause has not been found yet)
Comment 9 Reinette Chatre 2009-01-16 14:35:24 UTC
Jim or admin,

The patch fixing the BUG is now in wireless-testing. Could you please close this bug?
Comment 10 Jim Paris 2009-05-13 20:50:55 UTC
Sorry for the delay -- yes, this does fix the crash so I think the bug can be closed.  Thanks.