Bug 15388
Summary: | After suspend to ram, laptop doesn't connect to the wired network. Sky2 error. Marvell 88E8036 | ||
---|---|---|---|
Product: | Drivers | Reporter: | Eduardo (aberkoke) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | RESOLVED INSUFFICIENT_DATA | ||
Severity: | normal | CC: | aberkoke, akpm, alan, auxsvr, bj.cardon, bjorn, jbarnes, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | http://bbs.archlinux.org/viewtopic.php?id=91833 | ||
Kernel Version: | 2.6.38.10 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216 | ||
Attachments: | lspci output, everything.log, lsmod output, git bisect log, proposed patch (reverts c82f63e411) | ||
Description
Eduardo
2010-02-24 18:54:55 UTC
Created attachment 25195 [details]
lspci output
Created attachment 25196 [details]
everything.log
Created attachment 25197 [details]
lsmod output
Comment on attachment 25195 [details]
lspci output
from archlinux 2.6.32
Comment on attachment 25196 [details]
everything.log
from archlinux 2.6.32
Comment on attachment 25197 [details]
lsmod output
from archlinux 2.6.32
The problem appeared between 2.6.31-rc7 and 2.6.31-rc8.

Hi. I hope someone can read this. With git bisect, I found the commit that causes the problem.

    eduardo@eduardo-laptop:~/linux-git$ git bisect bad
    c82f63e411f1b58427c103bd95af2863b1c96dd1 is first bad commit
    commit c82f63e411f1b58427c103bd95af2863b1c96dd1
    Author: Alek Du <alek.du@intel.com>
    Date:   Sat Aug 8 08:46:19 2009 +0800

        PCI: check saved state before restore

        Without the check, the config space may be filled with zeros. Though
        the driver should try to avoid call restoring before saving, but the
        pci layer also should check this.

        Also removes the existing check in pci_restore_standard_config, since
        it's superfluous with the new check in restore_state.

        Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
        Signed-off-by: Alek Du <alek.du@intel.com>
        Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

    :040000 040000 b363995a162a427fdf907059d38882036d68109d 6aca235abde6bf4545e479a87b7f7171e934a988 M drivers

The content of the BISECT_LOG file is attached. Because my problem occurs after suspend, I think this commit is responsible for it. Can I do anything else? Thanks

Created attachment 25302 [details]
git bisect log
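To make the quoted commit message easier to follow, here is a minimal sketch of the check it describes, written as a toy Python model rather than kernel code. The names `FakePciDev`, `save_state`, `restore_state` and the `check_saved` switch are all hypothetical and exist only for illustration; the only things taken from the commit message are the idea of a "saved" flag and the observation that restoring without a prior save can fill the config space with zeros.

```python
from dataclasses import dataclass, field

# Size of the toy config space: 16 dwords, matching the 64-byte
# standard PCI configuration header.
CONFIG_WORDS = 16


@dataclass
class FakePciDev:
    """Hypothetical stand-in for a PCI device; not a kernel structure."""
    config: list = field(default_factory=lambda: [0xDEADBEEF] * CONFIG_WORDS)
    saved_config: list = field(default_factory=lambda: [0] * CONFIG_WORDS)
    state_saved: bool = False


def save_state(dev):
    """Copy the live config space into the save buffer and mark it valid."""
    dev.saved_config = list(dev.config)
    dev.state_saved = True


def restore_state(dev, check_saved=True):
    """Write the save buffer back to the device.

    With check_saved=True (the behaviour the commit message describes),
    the restore is skipped when nothing was saved. Returns True if the
    config space was actually rewritten.
    """
    if check_saved and not dev.state_saved:
        return False
    dev.config = list(dev.saved_config)
    return True
```

In this model, calling `restore_state` on a device that was never saved leaves `config` untouched when the check is on, and overwrites it with zeros when the check is off. The flip side is that any path on which the flag is not set at resume time would skip the restore entirely; the thread does not establish whether that is what happens on the reporter's hardware.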
I can't check if problem is present in 2.6.33 and 2.6.34rc1 because suspend to ram doesn't work with those kernels. I will test with 2.6.33.1 and 2.6.34rc2 soon.

The problem is also present in kernel 2.6.33.3

Did you try to revert the commit you found via 'git bisect' from 2.6.33.3, for example?

Of course. With 2.6.33.3 without c82f63e411f1b58427c103bd95af2863b1c96dd1, problem is solved!!

Is the problem still present in 2.6.37 (works for me)?

I am on 2.6.38.10 and I still have this issue. I'm using this:

    02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8040 PCI-E Fast Ethernet Controller
            Subsystem: Hewlett-Packard Company Device 361a
            Kernel driver in use: sky2
            Kernel modules: sky2

I don't know what other information is useful.

Eduardo, bj, does this problem still occur on 3.5? Sorry to just ask without any actual debugging on my part, but this has fallen through the cracks for a long time and I don't want to waste time if it's been accidentally fixed in the meantime.

At this moment I don't know. I will test with kernel 3.5.3 and i will post here with the results. Thank you.

Ping, Eduardo, bj, any update?

Bjorn, I tested this on 3.5.0 (built on Kubuntu 12.10 Beta) and it seems to be working correctly finally. My netbook was virtually useless because you couldn't plug in ethernet on the fly or suspend and get your ethernet back. Both of those issues seem fixed on 3.5. Thanks!

Hi Bjorn. I'm so sorry for the delay. I tested with latest stable 3.5 kernel (3.5.4) and the problem is still present.

For 3.6 the problem is still present in my hardware. Thanks

Alek, Rafael, any ideas? Eduardo, I assume that reverting c82f63e411f1 still fixes 3.6 on your hardware? Can you attach the complete dmesg log (covering initial boot, suspend to RAM, and resume) from both 3.6 and 3.6 with c82f63e411f1 reverted?

Eduardo, you said v3.6 still has the problem on your hardware. I haven't heard any defense of c82f63e411f1, so if you confirm that v3.6 with that change reverted it fixes your hardware, I'll push that revert upstream. The last test was from 2.6.33.3, which is getting a bit old.

Created attachment 84911 [details]
proposed patch (reverts c82f63e411)
It's no longer trivial to revert c82f63e411 because of other changes in that area. This patch applies to v3.7-rc2 and effectively reverts c82f63e411.
Eduardo, can you test this and verify that it fixes the problem?
I have been having the same problem for quite some time, even received the following warning once:

    ------------[ cut here ]------------
    WARNING: at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.6.0/linux-3.6/net/sched/sch_generic.c:255 dev_watchdog+0x1e0/0x1f0()
    Hardware name: HP Mini 5102
    NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
    Modules linked in: cpufreq_stats nls_utf8 loop rfcomm bnep btusb bluetooth arc4 brcmsmac mac80211 bcma brcmutil cfg80211 cordic af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave fuse hp_wmi snd_hda_codec_idt snd_hda_intel sparse_keymap rfkill snd_hda_codec iTCO_wdt iTCO_vendor_support uvcvideo videobuf2_core videodev videobuf2_vmalloc snd_hwdep snd_pcm videobuf2_memops sg acpi_cpufreq mperf coretemp microcode hp_accel container lis3lv02d snd_timer wmi joydev input_polldev snd lpc_ich battery serio_raw soundcore mfd_core snd_page_alloc edd ac sky2 autofs4 i915 drm_kms_helper drm i2c_algo_bit button video scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh fan processor thermal thermal_sys
    Pid: 0, comm: swapper/1 Tainted: G W 3.6.0-1-desktop #1
    Call Trace:
     [<c02054a9>] try_stack_unwind+0x199/0x1b0
     [<c02041c7>] dump_trace+0x47/0xf0
     [<c020550b>] show_trace_log_lvl+0x4b/0x60
     [<c0205538>] show_trace+0x18/0x20
     [<c0714c48>] dump_stack+0x6d/0x72
     [<c0237b18>] warn_slowpath_common+0x78/0xb0
     [<c0237be3>] warn_slowpath_fmt+0x33/0x40
     [<c065a020>] dev_watchdog+0x1e0/0x1f0
     [<c0246e53>] run_timer_softirq+0x103/0x310
     [<c023fd99>] __do_softirq+0x99/0x1e0
     [<c02040a6>] do_softirq+0x76/0xb0
     [<00000003>] 0x2
    ---[ end trace 35cf0f09d1d8830f ]---
    sky2 0000:43:00.0: eth0: tx timeout
    sky2 0000:43:00.0: eth0: transmit ring 43 .. 54 report=43 done=43

If I reload sky2, then the NIC works fine.

auxsvr, can you test the patch in comment #24 and see whether it resolves the problem?

In my case the problem is triggered by ethtool -s eth0 autoneg off which is executed by the laptop-mode scripts when on battery. I will check the patch later, I'm too busy at the moment.

Just tried the patch on 3.7-rc5 and the connection stops with ethtool -s eth0 autoneg off duplex full speed 100, yet it is restored with ethtool -s eth0 autoneg on, which wouldn't work before (no module reload necessary). Also, there is no error message in the log. This looks fixed to me.

Well, I just tried 3.6.3 without the patch and it has the same behaviour, i.e. with autoneg off the connection stops and resumes with autoneg on, without errors. Hopefully, it won't take many suspend-resume cycles to observe the problem reported originally.

I'm closing this for lack of information. If the problem still occurs on a recent kernel (v3.11), please reopen the bug and attach the complete dmesg and "lspci -vv" logs.
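For anyone collecting the requested dmesg output, a small helper can confirm whether a saved log contains the transmit-queue timeout quoted earlier in this report. The following is only a sketch: the function name `find_tx_timeouts` and the `SAMPLE` text are illustrative, and the regular expression is based solely on the single `NETDEV WATCHDOG` line pasted above, so other kernels may word the message differently.

```python
import re

# Pattern modelled on the one watchdog line quoted in this report:
#   NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)

# Two lines copied from the warning in this report, used as sample input.
SAMPLE = (
    "NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out\n"
    "sky2 0000:43:00.0: eth0: tx timeout\n"
)


def find_tx_timeouts(log_text):
    """Return (interface, driver, queue) for each watchdog timeout found."""
    return [
        (m["iface"], m["driver"], int(m["queue"]))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    print(find_tx_timeouts(SAMPLE))
```

Run against `SAMPLE`, this returns `[('eth0', 'sky2', 0)]`. To check a real log, pass the text of a file saved with `dmesg > dmesg.txt` instead of `SAMPLE`.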