iwlagn makes my t500 misserable ever since early .34 Johill made me run with debug=0x802, output in attachment. the GPF splat also here: [19442.770407] general protection fault: 0000 [#1] SMP [19442.770415] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/PNP0C09:00/PNP0C0A:00/power_supply/BAT0/energy_full [19442.770422] CPU 0 [19442.770426] Modules linked in: nfs lockd nfs_acl auth_rpcgss sunrpc cryptd aes_x86_64 aes_generic rfcomm binfmt_misc sco bridge stp ppdev bnep l2cap snd_hda_codec_ conexant joydev fbcon tileblit font bitblit softcursor kvm_intel autofs4 kvm arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi pcmcia snd_rawmidi snd_seq_midi_event iwlagn snd_seq thinkpad_acpi iwlcore snd_timer i915 snd_seq_device drm_kms_helper mac80211 drm yenta_socket sdhci_pci sdhci psmouse btusb rsrc_nonstatic i2c_algo_bit bluetooth nvram snd serio_raw led_class pcmcia_core video cfg80211 soundcore snd_page_alloc output lp intel_agp parport xfs exportfs ohci1394 ieee1394 ahci e1000e [19442.770555] [19442.770563] Pid: 863, comm: iwlagn Not tainted 2.6.34-rc3-tip+ #5 2242CTO/2242CTO [19442.770569] RIP: 0010:[<ffffffff812a4d20>] [<ffffffff812a4d20>] memcpy+0x10/0xb0 [19442.770584] RSP: 0018:ffff880135215cf8 EFLAGS: 00010202 [19442.770590] RAX: ffff880130628316 RBX: ffff8801354c97a0 RCX: 0000000000000001 [19442.770596] RDX: 0000000000000001 RSI: 000c0199000020de RDI: ffff880130628316 [19442.770602] RBP: ffff880135215d10 R08: 00000000000000ea R09: 00000000ffffffff [19442.770608] R10: 000000000a6a2100 R11: 00000000000000c8 R12: ffff8801354c1360 [19442.770614] R13: ffff8801354c1db8 R14: 0000000000000070 R15: ffff880130628000 [19442.770621] FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000 [19442.770628] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [19442.770634] CR2: 00007ffc65782a18 CR3: 0000000001a29000 CR4: 00000000000006f0 [19442.770640] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [19442.770646] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [19442.770653] Process iwlagn (pid: 863, threadinfo ffff880135214000, task ffff880134008000) [19442.770658] Stack: [19442.770662] ffffffffa0370e21 ffffffffa037eec0 0000000000000001 ffff880135215e30 [19442.770671] <0> ffffffffa0371acf ffffe8ff0000000a ffff880135215fd8 ffff880135215d70 [19442.770682] <0> ffffffffa035c32b ffff880135215d50 ffffffff02800000 ffff880100000055 [19442.770693] Call Trace: [19442.770727] [<ffffffffa0370e21>] ? iwl_fill_probe_req+0x81/0xb0 [iwlcore] [19442.770755] [<ffffffffa0371acf>] iwl_bg_request_scan+0x63f/0x1390 [iwlcore] [19442.770783] [<ffffffffa035c32b>] ? iwl_send_cmd_async+0x5b/0xe0 [iwlcore] [19442.770809] [<ffffffffa035c3ea>] ? iwl_send_cmd_pdu_async+0x3a/0x40 [iwlcore] [19442.770836] [<ffffffffa035c3f0>] ? iwl_generic_cmd_callback+0x0/0x150 [iwlcore] [19442.770858] [<ffffffffa03f95b8>] ? iwl5000_send_tx_power+0x58/0x60 [iwlagn] [19442.770884] [<ffffffffa0351096>] ? iwl_set_tx_power+0x96/0x160 [iwlcore] [19442.770910] [<ffffffffa0371490>] ? iwl_bg_request_scan+0x0/0x1390 [iwlcore] [19442.770921] [<ffffffff810770dd>] worker_thread+0x17d/0x2b0 [19442.770932] [<ffffffff8107be70>] ? autoremove_wake_function+0x0/0x40 [19442.770941] [<ffffffff81076f60>] ? worker_thread+0x0/0x2b0 [19442.770948] [<ffffffff8107b936>] kthread+0x96/0xa0 [19442.770957] [<ffffffff8100ae24>] kernel_thread_helper+0x4/0x10 [19442.770965] [<ffffffff8107b8a0>] ? kthread+0x0/0xa0 [19442.770972] [<ffffffff8100ae20>] ? kernel_thread_helper+0x0/0x10 [19442.770977] Code: c0 49 89 70 58 41 c6 40 4c 04 83 e0 fc 83 c0 08 41 88 40 4d c9 c3 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 f3 48 a5 89 d1 <f3> a4 c3 8b 1e 4c 8b 46 08 4c 89 1f 4c 89 47 08 4c 8b 4e 10 4c [19442.771060] RIP [<ffffffff812a4d20>] memcpy+0x10/0xb0 [19442.771068] RSP <ffff880135215cf8> [19442.771084] ---[ end trace d26f7b1d5bd2cfb3 ]---
Created attachment 25800 [details] dmesg output dmesg output
Please add http://paste.pocoo.org/raw/196316/ --- wireless-testing.orig/drivers/net/wireless/iwlwifi/iwl-scan.c 2010-04-01 16:04:48.000000000 +0200 +++ wireless-testing/drivers/net/wireless/iwlwifi/iwl-scan.c 2010-04-01 16:06:15.000000000 +0200 @@ -264,6 +264,7 @@ static void iwl_rx_scan_complete_notif(s reschedule: priv->scan_pass_start = jiffies; + printk(KERN_DEBUG "queueing request_scan in complete, bands=%x\n", priv->scan_bands); queue_work(priv->workqueue, &priv->request_scan); } @@ -475,6 +476,7 @@ static int iwl_scan_initiate(struct iwl_ priv->scan_start = jiffies; priv->scan_pass_start = priv->scan_start; + printk(KERN_DEBUG "queueing request_scan in initiate, bands=%x\n", priv->scan_bands); queue_work(priv->workqueue, &priv->request_scan); return 0; @@ -575,6 +577,7 @@ int iwl_internal_short_hw_scan(struct iw IWL_DEBUG_SCAN(priv, "Start internal short scan...\n"); set_bit(STATUS_SCANNING, &priv->status); priv->is_internal_short_scan = true; + printk(KERN_DEBUG "queueing request_scan in internal, bands=%x\n", priv->scan_bands); queue_work(priv->workqueue, &priv->request_scan); out:
Created attachment 25820 [details] dmesg output with printk patch Another splat, with debug=0x802 and the printk patch applied. Please let me know if you need more information.
Either bugzilla or I am confused, or you attached the same file again?
Created attachment 25836 [details] dmesg with printk output for real now
Well, ok, clear now what's going on... is_internal_short_scan starts out as true ieee80211 phy0: U iwl_bg_scan_completed SCAN complete scan ieee80211 phy0: I iwl_internal_short_hw_scan Start internal short scan... queueing request_scan in internal, bands=1 still set to true ieee80211 phy0: U iwl_bg_scan_completed internal short scan completed this sets it to false, but the next scan is already in progress!! Now iwl_bg_request_scan will check the variable and find it false, even though it should be true. is_internal_short_scan needs to be protected by a lock, probably priv->mutex.
No, it's not that simple ... all this is really racy.
Created attachment 25881 [details] patch to fix internal scan race Let me know what happens.
Yep, that seems to have cured it, machine has been happy for days now, whereas it used to die several times a day before this.