|Summary:||After suspend to ram, laptop doesn't connect to the wired network. Sky2 error. Marvell 88E8036|
|Severity:||normal||CC:||aberkoke, akpm, alan, auxsvr, bj.cardon, bjorn, jbarnes, rjw|
|Bug Depends on:|
git bisect log
proposed patch (reverts c82f63e411)
Description Eduardo 2010-02-24 18:54:55 UTC
Hi. I think there is a kernel problem on my laptop. After suspend to RAM, the wired network does not come back (eth0 doesn't connect). The problem occurs on Arch Linux 32-bit (kernels 2.6.31, 2.6.32 and 2.6.33-rc8) and on Ubuntu 9.10 Karmic (2.6.31). I have not checked 2.6.33-rc1 to -rc7, but I suppose the problem is there too. Arch Linux with kernel 2.6.30 and Ubuntu 9.04 (kernel 2.6.28) do not show the problem.

Because the problem appears on both Ubuntu and Arch Linux from 2.6.31 onward, I think the bug is in the kernel or something related to it. I suspect the Marvell 88E8036, which handles the wired network, and the sky2 module. The log file shows:

kernel: sky2 0000:02:00.0: eth0: phy I/O error

Wifi works fine after suspend. My computer is a Toshiba M40-285 laptop.

Attached files: everything.log and the output of lsmod and lspci, all taken after suspend to RAM. I can help to solve this problem; if you want more logs or more information, tell me. Thanks,
Comment 4 Eduardo 2010-02-24 18:59:03 UTC
Comment on attachment 25195 [details] lspci output from archlinux 2.6.32
Comment 5 Eduardo 2010-02-24 18:59:38 UTC
Comment on attachment 25196 [details] everything.log from archlinux 2.6.32
Comment 6 Eduardo 2010-02-24 19:00:03 UTC
Comment on attachment 25197 [details] lsmod output from archlinux 2.6.32
Comment 7 Eduardo 2010-02-27 02:38:25 UTC
The problem was introduced between 2.6.31-rc7 and 2.6.31-rc8.
Comment 8 Eduardo 2010-03-01 23:12:51 UTC
Hi. I hope someone can read this. With git bisect, I found the commit that causes the problem.

eduardo@eduardo-laptop:~/linux-git$ git bisect bad
c82f63e411f1b58427c103bd95af2863b1c96dd1 is first bad commit
commit c82f63e411f1b58427c103bd95af2863b1c96dd1
Author: Alek Du <email@example.com>
Date: Sat Aug 8 08:46:19 2009 +0800

    PCI: check saved state before restore

    Without the check, the config space may be filled with zeros. Though the driver should try to avoid call restoring before saving, but the pci layer also should check this. Also removes the existing check in pci_restore_standard_config, since it's superfluous with the new check in restore_state.

    Acked-by: Rafael J. Wysocki <firstname.lastname@example.org>
    Signed-off-by: Alek Du <email@example.com>
    Signed-off-by: Jesse Barnes <firstname.lastname@example.org>

:040000 040000 b363995a162a427fdf907059d38882036d68109d 6aca235abde6bf4545e479a87b7f7171e934a988 M drivers

Attached is the content of the BISECT_LOG file. Since my problem appears after suspend, I think this commit is responsible for it. Can I do anything else? Thanks
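The search that git bisect performs in comment 8 can be pictured as a binary search over the history between a known-good and a known-bad kernel. The sketch below is a simplification under stated assumptions: the history is treated as a linear list (real git history is a graph), the integers and the `resume_breaks_eth0` helper are hypothetical stand-ins for commits and for one build, suspend and resume test cycle, and nothing here is git's actual implementation.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in a linear, oldest-to-newest list.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1      # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid                  # culprit is mid or earlier
        else:
            lo = mid                  # culprit is after mid
    return commits[hi]


# Hypothetical history: integers stand in for commits, 6 is the culprit.
history = list(range(10))
tested = []


def resume_breaks_eth0(commit):
    # Each call stands for one manual build/suspend/resume test.
    tested.append(commit)
    return commit >= 6


culprit = first_bad(history, resume_breaks_eth0)
print(culprit, tested)
```

Each call to the test function halves the remaining range, which is why a window as wide as 2.6.31-rc7 to 2.6.31-rc8 can be narrowed to one commit in a manageable number of test boots.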
Comment 10 Eduardo 2010-03-25 01:06:17 UTC
I can't check whether the problem is present in 2.6.33 and 2.6.34-rc1 because suspend to RAM doesn't work with those kernels. I will test with 126.96.36.199 and 2.6.34-rc2 soon.
Comment 11 Eduardo 2010-05-04 10:47:28 UTC
The problem is also present in kernel 188.8.131.52
Comment 12 Rafael J. Wysocki 2010-05-04 19:46:14 UTC
Did you try to revert the commit you found via 'git bisect' from 184.108.40.206, for example?
Comment 13 Eduardo 2010-05-05 16:59:45 UTC
Of course. With 220.127.116.11 and c82f63e411f1b58427c103bd95af2863b1c96dd1 reverted, the problem is solved!
Comment 14 Rafael J. Wysocki 2011-01-16 22:24:42 UTC
Is the problem still present in 2.6.37 (works for me)?
Comment 15 bj.cardon 2011-08-16 21:07:17 UTC
I am on 18.104.22.168 and I still have this issue. I'm using this:

02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8040 PCI-E Fast Ethernet Controller
    Subsystem: Hewlett-Packard Company Device 361a
    Kernel driver in use: sky2
    Kernel modules: sky2

I don't know what other information is useful.
Comment 16 Bjorn Helgaas 2012-08-23 23:31:38 UTC
Eduardo, bj, does this problem still occur on 3.5? Sorry to just ask without any actual debugging on my part, but this has fallen through the cracks for a long time and I don't want to waste time if it's been accidentally fixed in the meantime.
Comment 17 Eduardo 2012-08-26 20:37:43 UTC
At the moment I don't know. I will test with kernel 3.5.3 and post the results here. Thank you.
Comment 18 Bjorn Helgaas 2012-10-01 20:03:38 UTC
Ping, Eduardo, bj, any update?
Comment 19 bj.cardon 2012-10-02 16:51:47 UTC
Bjorn, I tested this on 3.5.0 (built on Kubuntu 12.10 Beta) and it seems to be working correctly finally. My netbook was virtually useless because you couldn't plug in ethernet on the fly or suspend and get your ethernet back. Both of those issues seem fixed on 3.5. Thanks!
Comment 20 Eduardo 2012-10-02 23:24:43 UTC
Hi Bjorn. I'm so sorry for the delay. I tested with latest stable 3.5 kernel (3.5.4) and the problem is still present.
Comment 21 Eduardo 2012-10-02 23:37:35 UTC
For 3.6 the problem is still present in my hardware. Thanks
Comment 22 Bjorn Helgaas 2012-10-03 19:57:04 UTC
Alek, Rafael, any ideas? Eduardo, I assume that reverting c82f63e411f1 still fixes 3.6 on your hardware? Can you attach the complete dmesg log (covering initial boot, suspend to RAM, and resume) from both 3.6 and 3.6 with c82f63e411f1 reverted?
Comment 23 Bjorn Helgaas 2012-10-25 20:06:45 UTC
Eduardo, you said v3.6 still has the problem on your hardware. I haven't heard any defense of c82f63e411f1, so if you confirm that v3.6 with that change reverted fixes the problem on your hardware, I'll push that revert upstream. The last test was from 22.214.171.124, which is getting a bit old.
Comment 24 Bjorn Helgaas 2012-10-26 01:25:24 UTC
Created attachment 84911 [details] proposed patch (reverts c82f63e411)

It's no longer trivial to revert c82f63e411 because of other changes in that area. This patch applies to v3.7-rc2 and effectively reverts c82f63e411. Eduardo, can you test this and verify that it fixes the problem?
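For readers who have not looked at c82f63e411, its commit message (quoted in comment 8) describes a guard: skip the restore when no state was saved, so that the config space is not filled with zeros. The toy model below only illustrates that idea. `ToyDevice`, its fields and the `check_saved` flag are hypothetical names invented for this sketch, not kernel code, and the thread never establishes why the guard interferes with sky2 on resume.

```python
class ToyDevice:
    """Hypothetical stand-in for a PCI device; not kernel code."""

    def __init__(self, config):
        self.config = list(config)         # live config registers
        self.saved = [0] * len(config)     # save buffer starts zeroed
        self.state_saved = False

    def save_state(self):
        self.saved = list(self.config)
        self.state_saved = True

    def restore_state(self, check_saved):
        # With the check, restoring without a prior save is a no-op.
        if check_saved and not self.state_saved:
            return False
        self.config = list(self.saved)
        return True


# No check: restoring an unsaved device copies the zeroed buffer.
a = ToyDevice([0x11, 0x22])
a.restore_state(check_saved=False)

# With check: the restore is skipped and the live registers stay as they are.
b = ToyDevice([0x11, 0x22])
skipped = not b.restore_state(check_saved=True)

# With check, after a save: the restore goes through as usual.
c = ToyDevice([0x11, 0x22])
c.save_state()
c.config = [0xFF, 0xFF]                    # pretend the device lost its state
c.restore_state(check_saved=True)

print(a.config, b.config, skipped, c.config)
```

In this model the guard changes behavior only for a device whose state was never saved. Reverting the guard, as the proposed patch effectively does, brings back the unconditional copy.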
Comment 25 auxsvr 2012-10-27 19:20:27 UTC
I have been having the same problem for quite some time, even received the following warning once:

------------[ cut here ]------------
WARNING: at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.6.0/linux-3.6/net/sched/sch_generic.c:255 dev_watchdog+0x1e0/0x1f0()
Hardware name: HP Mini 5102
NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
Modules linked in: cpufreq_stats nls_utf8 loop rfcomm bnep btusb bluetooth arc4 brcmsmac mac80211 bcma brcmutil cfg80211 cordic af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave fuse hp_wmi snd_hda_codec_idt snd_hda_intel sparse_keymap rfkill snd_hda_codec iTCO_wdt iTCO_vendor_support uvcvideo videobuf2_core videodev videobuf2_vmalloc snd_hwdep snd_pcm videobuf2_memops sg acpi_cpufreq mperf coretemp microcode hp_accel container lis3lv02d snd_timer wmi joydev input_polldev snd lpc_ich battery serio_raw soundcore mfd_core snd_page_alloc edd ac sky2 autofs4 i915 drm_kms_helper drm i2c_algo_bit button video scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh fan processor thermal thermal_sys
Pid: 0, comm: swapper/1 Tainted: G W 3.6.0-1-desktop #1
Call Trace:
 [<c02054a9>] try_stack_unwind+0x199/0x1b0
 [<c02041c7>] dump_trace+0x47/0xf0
 [<c020550b>] show_trace_log_lvl+0x4b/0x60
 [<c0205538>] show_trace+0x18/0x20
 [<c0714c48>] dump_stack+0x6d/0x72
 [<c0237b18>] warn_slowpath_common+0x78/0xb0
 [<c0237be3>] warn_slowpath_fmt+0x33/0x40
 [<c065a020>] dev_watchdog+0x1e0/0x1f0
 [<c0246e53>] run_timer_softirq+0x103/0x310
 [<c023fd99>] __do_softirq+0x99/0x1e0
 [<c02040a6>] do_softirq+0x76/0xb0
 [<00000003>] 0x2
---[ end trace 35cf0f09d1d8830f ]---
sky2 0000:43:00.0: eth0: tx timeout
sky2 0000:43:00.0: eth0: transmit ring 43 .. 54 report=43 done=43

If I reload sky2, then the NIC works fine.
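When comparing logs from several machines, it can help to pull out only the sky2 driver lines. The sketch below is one possible way to do that, assuming the messages follow the "sky2 <PCI address>: <interface>: <message>" shape seen in this thread. The `sky2_messages` helper and the regular expression are illustrative, not part of any kernel tooling, and the sample text is assembled from lines quoted in the comments above.

```python
import re

# Matches lines such as "sky2 0000:02:00.0: eth0: phy I/O error".
SKY2_LINE = re.compile(
    r"sky2 (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<iface>\w+): (?P<msg>.+)"
)


def sky2_messages(log_text):
    """Return (pci_address, interface, message) for each sky2 driver line."""
    found = []
    for line in log_text.splitlines():
        m = SKY2_LINE.search(line)
        if m:
            found.append((m.group("pci"), m.group("iface"), m.group("msg").strip()))
    return found


sample = """\
kernel: sky2 0000:02:00.0: eth0: phy I/O error
NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
sky2 0000:43:00.0: eth0: tx timeout
sky2 0000:43:00.0: eth0: transmit ring 43 .. 54 report=43 done=43
"""

found = sky2_messages(sample)
for entry in found:
    print(entry)
```

The watchdog line is deliberately not matched, because it comes from the network core rather than from the driver itself.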
Comment 26 Bjorn Helgaas 2012-10-29 23:30:13 UTC
auxsvr, can you test the patch in comment #24 and see whether it resolves the problem?
Comment 27 auxsvr 2012-10-30 09:25:52 UTC
In my case the problem is triggered by "ethtool -s eth0 autoneg off", which the laptop-mode scripts execute when on battery. I will check the patch later; I'm too busy at the moment.
Comment 28 auxsvr 2012-11-18 18:05:35 UTC
Just tried the patch on 3.7-rc5 and the connection stops with "ethtool -s eth0 autoneg off duplex full speed 100", yet it is restored with "ethtool -s eth0 autoneg on", which wouldn't work before (no module reload necessary). Also, there is no error message in the log. This looks fixed to me.
Comment 29 auxsvr 2012-11-19 07:53:00 UTC
Well, I just tried 3.6.3 without the patch and it has the same behaviour, i.e. with autoneg off the connection stops and resumes with autoneg on, without errors. Hopefully, it won't take many suspend-resume cycles to observe the problem reported originally.
Comment 30 Bjorn Helgaas 2013-09-10 22:38:32 UTC
I'm closing this for lack of information. If the problem still occurs on a recent kernel (v3.11), please reopen the bug and attach the complete dmesg and "lspci -vv" logs.