Bug 10130

Summary: r8169 doesn't negotiate 1000baseT mode after suspend/resume
Product: Drivers Reporter: Bas Zoetekouw (bas)
Component: Network Assignee: Francois Romieu (romieu)
Status: RESOLVED CODE_FIX    
Severity: normal CC: bugzilla, bunk, centaur, erbrochendes, jnelson-kernel-bugzilla, pizza, rjw, stefan.andreas.bauer
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24.2 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: PHY init after resume
[2.6.33.5] Fix random mdio write failures
[2.6.33.5] Fix mdio read and update mdio write after Realtek's spec
(Hand-patch as applied to F13 kernel)
incorrect identifier for a 8168dp

Description Bas Zoetekouw 2008-02-28 07:10:47 UTC
Latest working kernel version: unknown
Earliest failing kernel version: unknown (but 2.6.23 is broken, too)
Distribution: Debian sid (but not using distro kernel)
Hardware Environment: Intel E6550 x86_64
Software Environment: Debian sid amd64
Problem Description: 

After a suspend/resume cycle (ACPI S3), r8169 doesn't seem to renegotiate the speed it should use, and it falls back to 100Mbit.  After I manually force the NIC to Gb mode, everything works fine again.
After a normal cold boot, it correctly negotiates 1Gbit speed.

I have the following nic:
04:00.0 0200: 10ec:8168 (rev 01)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
	Subsystem: Giga-byte Technology Unknown device e000
	Flags: bus master, fast devsel, latency 0, IRQ 16
	I/O ports at b000 [size=256]
	Memory at f8000000 (64-bit, non-prefetchable) [size=4K]
	[virtual] Expansion ROM at 80000000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: r8169
	Kernel modules: r8169
Comment 1 Francois Romieu 2008-02-28 15:38:16 UTC
Can you send the output of 'mii-tool -vv' :
- before the suspend
- after the resume (broken)
- after the mode is forced

Thanks.

-- 
Ueimor
Comment 2 Bas Zoetekouw 2008-02-29 01:17:25 UTC
Before suspending:

Using SIOCGMIIPHY=0x8947
eth0: negotiated 1000baseT-FD flow-control, link ok
  registers for MII PHY 32: 
    1000 796d 001c c912 0de1 c5e1 000d 2001
    4ce1 0300 3800 0000 1007 f880 0000 3000
    0060 acc0 0000 0000 1060 0000 d00c 2108
    2740 8c00 0040 0106 097c 8000 0123 0000
  product info: vendor 00:07:32, model 17 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 1000baseT-HD 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control


After suspend/resume cycle:

Using SIOCGMIIPHY=0x8947
eth0: negotiated 100baseTx-FD flow-control, link ok
  registers for MII PHY 32: 
    1000 796d 001c c912 0de1 c5e1 000f 2001
    c5e1 0000 0000 0000 1007 f880 0000 3000
    0060 6c00 0000 6c42 1060 0000 870c 2108
    2740 8c00 0040 0162 846c 8000 0123 0000
  product info: vendor 00:07:32, model 17 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control


After forcing Gb speed with ethtool:

Using SIOCGMIIPHY=0x8947
eth0: negotiated 1000baseT-FD flow-control, link ok
  registers for MII PHY 32: 
    1000 796d 001c c912 0de1 c5e1 000f 2001
    4d68 0300 3800 0000 1007 f880 0000 3000
    0060 ac80 0000 6c42 1060 0000 e00c 2108
    2740 8c00 0040 0106 097c 8000 0123 0000
  product info: vendor 00:07:32, model 17 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 1000baseT-HD 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 1000baseT-HD 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
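
The three dumps can also be compared register by register. What follows is only a sketch, assuming that mii-tool prints registers 0-31 in order, eight per row, and using the register number and bit masks from include/linux/mii.h (MII_CTRL1000, ADVERTISE_1000FULL, ADVERTISE_1000HALF). The helper names parse_dump and gigabit_advertised are made up for illustration. Only the first two rows of each dump are needed, since the 1000BASE-T control register is register 9.

```python
# Register number and bit masks as defined in include/linux/mii.h.
MII_CTRL1000 = 0x09           # 1000BASE-T control register
ADVERTISE_1000FULL = 0x0200   # advertise 1000BASE-T full duplex
ADVERTISE_1000HALF = 0x0100   # advertise 1000BASE-T half duplex


def parse_dump(text):
    """Turn the hex words of a 'mii-tool -vv' register dump into a list of ints."""
    return [int(word, 16) for word in text.split()]


def gigabit_advertised(regs):
    """Report which 1000BASE-T modes register 9 says the PHY advertises."""
    ctrl = regs[MII_CTRL1000]
    return {
        "1000baseT-FD": bool(ctrl & ADVERTISE_1000FULL),
        "1000baseT-HD": bool(ctrl & ADVERTISE_1000HALF),
    }


# Registers 0-15 copied from the three dumps above.
dumps = {
    "before suspend": parse_dump(
        "1000 796d 001c c912 0de1 c5e1 000d 2001 "
        "4ce1 0300 3800 0000 1007 f880 0000 3000"
    ),
    "after resume": parse_dump(
        "1000 796d 001c c912 0de1 c5e1 000f 2001 "
        "c5e1 0000 0000 0000 1007 f880 0000 3000"
    ),
    "after forcing": parse_dump(
        "1000 796d 001c c912 0de1 c5e1 000f 2001 "
        "4d68 0300 3800 0000 1007 f880 0000 3000"
    ),
}

for name, regs in dumps.items():
    print(name, hex(regs[MII_CTRL1000]), gigabit_advertised(regs))
```

Register 9 reads 0x0300 before the suspend and after forcing the speed, but 0x0000 right after the resume, which would be consistent with the gigabit advertisement being lost across the suspend/resume cycle.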
Comment 3 Francois Romieu 2008-02-29 15:10:12 UTC
Created attachment 15100 [details]
PHY init after resume

Can you give this patch a try on top of 2.6.24 ?

Please add the updated mii-tool outputs (and keep the old ones).

-- 
Ueimor
Comment 4 Bas Zoetekouw 2008-02-29 15:49:20 UTC
Just tried it, and it doesn't seem to make any difference...
Comment 5 Francois Romieu 2008-03-05 14:12:06 UTC
bas@debian.org  2008-02-29 15:49 :
> Just tried it, and it doesn't seem to make any difference...

Can you send the mii-tool output of the network interface just after
the resume ?

The r8169 does not seem to see the expected advertisement from the
link partner after resume. Is there some way for you to check that
the switch still advertises 1000 Mbps after the resume ?

I'll welcome the command that you use to force the Gb mode too.
Comment 6 Bas Zoetekouw 2008-03-05 14:25:11 UTC
> Can you send the mii-tool output of the network interface just after the
> resume ?

That's quoted above already in comment #2 ("After suspend/resume cycle").  Note in particular that according to mii-tool, the 1000baseT _capability_ of the card is missing at that point.

To switch the card back to 1000Mbit, I simply use "ethtool -s eth0 speed 1000".
Comment 7 Solomon Peachy 2009-05-15 03:44:36 UTC
For what it's worth, I have this problem too.

uname -r:

2.6.29.3-60.fc10.x86_64

(bug exists in 2.6.27 too, and possibly older)

lspci -v:

06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
        Subsystem: ASUSTeK Computer Inc. U6V laptop
        Flags: bus master, fast devsel, latency 0, IRQ 17
        I/O ports at e800 [size=256]
        Memory at f8fff000 (64-bit, prefetchable) [size=4K]
        Memory at f8fe0000 (64-bit, prefetchable) [size=64K]
        Expansion ROM at feaf0000 [disabled] [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: r8169
        Kernel modules: r8169

ethtool pre-suspend:  (nothing plugged in)

        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised auto-negotiation: Yes
        Speed: 10Mb/s
        Duplex: Half
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000033 (51)
        Link detected: no

ethtool post-suspend:

        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
        Advertised auto-negotiation: Yes
        Speed: 10Mb/s
        Duplex: Half
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000033 (51)
        Link detected: no

Interestingly, if I remove the module and re-insert it, things work fine.
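
To make the difference between the two ethtool outputs explicit, the advertised modes can be diffed. This is a rough sketch in Python, assuming the output layout shown above (a "Advertised link modes:" key followed by continuation lines until the next "key: value" line). The function name advertised_modes is hypothetical.

```python
def advertised_modes(ethtool_output):
    """Collect the modes listed under 'Advertised link modes:' as a set."""
    modes = []
    collecting = False
    for line in ethtool_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Advertised link modes:"):
            collecting = True
            stripped = stripped.split(":", 1)[1]
        elif collecting and ":" in stripped:
            # The next "key: value" line ends the list of modes.
            break
        if collecting:
            modes.extend(stripped.split())
    return set(modes)


# Relevant lines copied from the pre-suspend and post-suspend outputs above.
PRE_SUSPEND = """\
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised auto-negotiation: Yes
"""

POST_SUSPEND = """\
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised auto-negotiation: Yes
"""

lost = advertised_modes(PRE_SUSPEND) - advertised_modes(POST_SUSPEND)
print(sorted(lost))  # ['1000baseT/Full', '1000baseT/Half']
```

The supported modes are identical in both outputs. Only the advertised 1000baseT modes disappear after the suspend.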
Comment 8 Jon Nelson 2009-08-25 18:08:37 UTC
I can confirm all of the above.

openSUSE 11.1 kernel, 2.6.27.29
Comment 9 James Ettle 2010-01-26 20:41:43 UTC
Any progress on this? Still seen in 2.6.31.
Comment 10 Solomon Peachy 2010-02-05 16:59:01 UTC
Bug still present in 2.6.32.7
Comment 11 Solomon Peachy 2010-06-08 13:57:32 UTC
Bug still present in 2.6.33.5
Comment 12 Carlos 2010-06-13 21:49:51 UTC
Bug still present in 2.6.34
I can confirm this bug on my Debian box.
Please fix this!
Comment 13 Per 2010-07-03 23:49:10 UTC
Confirmed still present in F13 with kernel 2.6.33.5-124.fc13.x86_64
Comment 14 Francois Romieu 2010-07-08 21:09:58 UTC
Created attachment 27048 [details]
[2.6.33.5] Fix random mdio write failures
Comment 15 Francois Romieu 2010-07-08 21:12:57 UTC
Created attachment 27049 [details]
[2.6.33.5] Fix mdio read and update mdio write after Realtek's spec

Can someone check the two patches above against a stable 2.6.33.5 ?

Thanks.

-- 
Ueimor
Comment 16 James Ettle 2010-07-14 11:19:05 UTC
Created attachment 27099 [details]
(Hand-patch as applied to F13 kernel)

(In reply to comment #15)
> Created an attachment (id=27049) [details]
> [2.6.33.5] Fix mdio read and update mdio write after Realtek's spec
> 
> Can someone check the two patches above against a stable 2.6.33.5 ?
> 
> Thanks.

Well, I tried it against Fedora 13's kernel-2.6.34.1-11.fc13, but had to do so by hand (see the attached diff). Had no effect. Maybe I botched the patch?
Comment 17 Francois Romieu 2010-07-15 21:27:13 UTC
Created attachment 27123 [details]
incorrect identifier for a 8168dp
Comment 18 Francois Romieu 2010-07-15 21:35:31 UTC
James Ettle :
[...]
> Well, I tried it against Fedora 13's kernel-2.6.34.1-11.fc13, but had to do so
> by hand (see the attached diff). Had no effect. Maybe I botched the patch?

Apparently not. Your diff seems fine.

Can you grep the xid debug line from the kernel log when the module
is inserted ? There are quite a few different 8168 chips.

-- 
Ueimor
Comment 19 James Ettle 2010-07-16 05:10:58 UTC
(In reply to comment #18)
> Can you grep the xid debug line from the kernel log when the module
> is inserted ? There are quite a few different 8168 chips.

$ dmesg | grep 8169
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:08:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
r8169 0000:08:00.0: setting latency timer to 64
r8169 0000:08:00.0: irq 32 for MSI/MSI-X
r8169 0000:08:00.0: eth0: RTL8168b/8111b at 0xffffc9002474c000, 00:90:f5:69:d5:9e, XID 18000000 IRQ 32
r8169 0000:08:00.0: eth0: link down

(It's really not plugged in at the moment.)
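
For anyone collecting the same information from several machines, the chip name and XID can be pulled out of that line with a small script. This is a sketch only: the regular expression is written against the exact line format shown above and may not match the message printed by other driver versions, and parse_xid_line is a made-up helper name.

```python
import re

# Matches lines of the form shown above:
#   "... eth0: RTL8168b/8111b at 0x<addr>, <mac>, XID <hex> IRQ <n>"
XID_LINE = re.compile(
    r"(?P<iface>\w+): (?P<chip>\S+) at 0x[0-9a-f]+, "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}), XID (?P<xid>[0-9a-f]+)"
)


def parse_xid_line(line):
    """Return interface, chip name, MAC and XID from an r8169 probe line, or None."""
    match = XID_LINE.search(line)
    return match.groupdict() if match else None


line = (
    "r8169 0000:08:00.0: eth0: RTL8168b/8111b at 0xffffc9002474c000, "
    "00:90:f5:69:d5:9e, XID 18000000 IRQ 32"
)
info = parse_xid_line(line)
print(info["chip"], info["xid"])  # RTL8168b/8111b 18000000
```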
Comment 20 James Ettle 2010-11-08 09:10:16 UTC
Has anything happened since? (I'm not currently near a Gigabit network, but I'll be able to test with 2.6.36 some time next week.)
Comment 21 Rafael J. Wysocki 2010-12-29 23:51:57 UTC
It looks like this should be fixed in 2.6.37-rc8, any chance to verify?
Comment 22 Stefan Bauer 2011-01-02 14:01:50 UTC
(In reply to comment #21)
> It looks like this should be fixed in 2.6.37-rc8, any chance to verify?

I just tested STR and STD on 2.6.37-rc8 and it seems to be fixed for me.

(This is the on-board NIC of an Asus M3A78)

03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
        Subsystem: ASUSTeK Computer Inc. Device 82c6
        Flags: bus master, fast devsel, latency 0, IRQ 41
        I/O ports at e800 [size=256]
        Memory at fbeff000 (64-bit, non-prefetchable) [size=4K]
        Memory at faff0000 (64-bit, prefetchable) [size=64K]
        Expansion ROM at fbec0000 [disabled] [size=128K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Count=2 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel <?>
        Capabilities: [160] Device Serial Number 00-00-00-00-ec-10-68-81
        Kernel driver in use: r8169
        Kernel modules: r8169
Comment 23 Rafael J. Wysocki 2011-01-02 14:57:37 UTC
Great, closing.