Bug 7924
Summary: | same issue as closed bug 7555 with r8169 and slow transfer | ||
---|---|---|---|
Product: | Drivers | Reporter: | Tom Van den Eynde (tom) |
Component: | Network | Assignee: | Francois Romieu (romieu) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | bunk, nord73, Roel.Teuwen, romieu |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.19.2 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
.config 2.6.19.2
dmesg 2.6.19.2
ifconfig 2.6.19.2
interupts 2.6.19.2
lsmod 2.6.19.2
lspci 2.6.19.2
r8168 driver version 8.001.00 + compilation fixes
lspci -vvx from 2.6.20
.config for 2.6.21.1
PHY power-on change |
Description
Tom Van den Eynde
2007-02-02 07:29:17 UTC
Please attach the following information to the current PR:
- complete (untruncated) dmesg output (add the kernel version in the description of the attachment);
- /sbin/lspci -vvx
- /sbin/lsmod
- cat /proc/interrupts (add the kernel version in the description of the attachment)
- /sbin/ifconfig
- kernel .config (add the kernel version in the description of the attachment)
- version of the BIOS
--
Ueimor

Created attachment 10261 [details]
.config 2.6.19.2
Created attachment 10262 [details]
dmesg 2.6.19.2
Created attachment 10263 [details]
ifconfig 2.6.19.2
ifconfig with sanitized IP addresses
Created attachment 10264 [details]
interupts 2.6.19.2
output of cat /proc/interrupts
Created attachment 10265 [details]
lsmod 2.6.19.2
output of lsmod
Created attachment 10266 [details]
lspci 2.6.19.2
output of lspci
I will provide you with a BIOS version as soon as I can reboot the box.
Thanks in advance,
Tom

BIOS revision is 0307

I just installed the latest kernel 2.6.20 and upgraded the BIOS to the latest 0405 release. Same issue occurs

Created attachment 10297 [details]
r8168 driver version 8.001.00 + compilation fixes
Can you give the attached driver a try ?
Just drop it as a replacement to the current drivers/net/r8169.c file.
--
Ueimor
Hi,

I tried the driver in 2.6.20 but with no luck. The issue is still the same.

I suffered from a disk drive crash, so I had to reinstall the box; it now runs x86_64 2.6.20. I also tried the driver there, but the same issue remains.

Can you try the latest patch attached to http://bugzilla.kernel.org/show_bug.cgi?id=5137 ?
It should not eat babies but it may be a bit rough.
--
Ueimor

I tried the patch but it didn't solve the issue

bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> :
[...]
> I tried the patch but it didn't solve the issue

Just to be sure: you tried attachments 10512 + 10515, right ?

Oh, I didn't see that. I compiled the kernel on the 20th, so I only used nr 10465. So I have to do 10512 + 10515. I will get on it this afternoon.

I tried with the 2 suggested patches (against 2.6.21-rc1) but no luck. The issue is still there.

Can you try 2.6.21-rc2 (or later)
+ http://www.fr.zoreil.com/people/francois/misc/20070228-2.6.21-rc2-r8169-test.patch
+ http://bugzilla.kernel.org/attachment.cgi?id=10628
--
Ueimor

Hello,

I tried with 2.6.21-rc4 but the issue is still there.

Kind regards,
Tom

Tom Van den Eynde 2007-03-22 10:04:
[...]
> I tried with 2.6.21-rc4 but the issue is still there.

It is not too surprising as the patches in #18 are not in.

Can you try:
http://www.fr.zoreil.com/people/francois/misc/20070316-2.6.21-rc4-r8169-test.patch
--
Ueimor

I tried the patch you suggested but the same issue still occurs.

tom@vandeneynde.net:
> I tried the patch you suggested but the same issue still occurs.

Ok, thanks.

From now on, please work with the last rc candidate + the aforementioned patch _without_ NAPI. It should still suck.

I would then welcome a pcap (tcpdump/tethereal) dump of a few seconds of traffic for both the r8169 and the working network card. The more you use the same sequence for both tests, the easier the comparison.

Please send the detailed + registers output of mii-tool for both too. It could give a hint.
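A minimal sketch of one way to compare the two requested captures, assuming they are classic libpcap files (not pcapng) with microsecond timestamps. The helper name `summarize_pcap` is hypothetical and not part of any tool mentioned in this report; it only reads the 24-byte file header and the 16-byte per-packet headers to report packet count, on-wire bytes, and capture duration.

```python
import struct


def summarize_pcap(data: bytes) -> dict:
    """Summarize a classic pcap blob: packets, on-wire bytes, duration, rate.

    Assumption: classic libpcap format with microsecond timestamps.
    pcapng and nanosecond-resolution files are rejected.
    """
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    offset = 24  # skip the global file header
    packets = 0
    total_bytes = 0
    first_ts = last_ts = None

    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        if offset + 16 + incl_len > len(data):
            break  # truncated final record, stop counting
        offset += 16 + incl_len
        ts = ts_sec + ts_usec / 1e6
        if first_ts is None:
            first_ts = ts
        last_ts = ts
        packets += 1
        total_bytes += orig_len  # length on the wire, not the captured length

    duration = (last_ts - first_ts) if packets > 1 else 0.0
    rate = total_bytes / duration if duration > 0 else None
    return {
        "packets": packets,
        "bytes": total_bytes,
        "duration": duration,
        "bytes_per_sec": rate,
    }
```

Running it once on the capture from the r8169 and once on the capture from the working card, for example `summarize_pcap(open(path, "rb").read())`, gives two small dictionaries that can be compared directly. It does not look inside the packets, so retransmissions or stalls still need tcpdump or tethereal.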
OK, I will take the pcaps this weekend. Should I test with rc-5 + the patch you provided? What do you mean with NAPI?

tom@vandeneynde.net:
> OK, I will take the pcaps this weekend.

Excellent.

> Should I test with rc-5 + the patch you provided?

Or latest git at your convenience.

> What do you mean with NAPI?

Disable CONFIG_R8169_NAPI

Hello,

You can download the requested debug info at http://www.vandeneynde.net/debugr8169.tar.bz2

The archive contains the following:
-rwx------ 1 tvde tvde 111M 2007-03-31 00:59 e100.cap
-rwx------ 1 tvde tvde  394 2007-03-31 00:59 e100.mii
-rwx------ 1 tvde tvde 6.9K 2007-03-31 01:01 e100.png
-rwx------ 1 tvde tvde  223 2007-03-31 01:19 kernel.txt
-rwx------ 1 tvde tvde  408 2007-03-31 00:33 r8169.mii
-rwx------ 1 tvde tvde 8.6M 2007-03-31 01:04 realtek.cap
-rwx------ 1 tvde tvde 7.8K 2007-03-31 01:05 realtek.png

The .cap files are full-snaplength pcaps taken when trying to copy a 750Mb file over SMB. The .mii files are the mii-tool output and the .png files are screenshots taken to show the end users' problem.

If you need more info, just let me know.

Kind regards,
Tom

I'm seeing the exact same problem on 2.6.21.1 using either of the two onboard interfaces :

03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
        Subsystem: ABIT Computer Corp. Unknown device 1073
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 17
        Region 0: I/O ports at de00 [size=256]
        Region 2: Memory at fdeff000 (64-bit, non-prefetchable) [size=4K]
        [virtual] Expansion ROM at fdd00000 [disabled] [size=128K]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
        Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
                Address: 0000000000000000 Data: 0000
        Capabilities: [60] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 1024 bytes, PhantFunc 0, ExtTag+
                Device: Latency L0s <1us, L1 unlimited
                Device: AtnBtn+ AtnInd+ PwrInd+
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s, Port 0
                Link: Latency L0s unlimited, L1 unlimited
                Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch-
                Link: Speed 2.5Gb/s, Width x1
        Capabilities: [84] Vendor Specific Information
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [12c] Virtual Channel
        Capabilities: [148] Device Serial Number <snipped>
        Capabilities: [154] Power Budgeting
00: ec 10 68 81 07 00 10 00 01 00 00 02 10 00 00 00
10: 01 de 00 00 00 00 00 00 04 f0 ef fd 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 7b 14 73 10
30: 00 00 00 00 40 00 00 00 00 00 00 00 0a 01 00 00

Roel, can you:
- try with/without NAPI
  http://www.fr.zoreil.com/people/francois/misc/20070510-2.6.21.1-r8169-bz7924.patch
- attach lspci -vvx and .config
- describe which card does work correctly or which kernel did not exhibit this behavior.
--
Ueimor

Hello Francois,

I already tried with/without NAPI, it makes no difference. I'm currently running without NAPI.
Everything works fine with a different card on the same infrastructure.
I've compiled the driver with the patch (several hunks applied with 1 line offset) but can only try it this evening.

Created attachment 11479 [details]
lspci -vvx from 2.6.20
Created attachment 11480 [details]
.config for 2.6.21.1
This is a new machine, no kernel worked before. Though, I should note that with 2.6.21.1 I can very occasionally manage high speeds (12-20MB/s) for a few (10) seconds, but this isn't easily repeatable. After those few seconds, it dies off again and gets the bad 40kb/s speeds I'm seeing usually with older kernels. When doing scp / sftp I can manage about 1mb/s with any kernel. As a workaround I've now attached an usb2 100mbit adapter, which I can max out using samba or sftp.
I've replaced every component in the network path, with the onboard ports the speed remains bad, any other usb/pci card is working fine.

Rebooted with the patch applied, no change.

In case I wasn't clear enough. The machine that got replaced by this one had working gigabit ethernet with a realtek addon card. Network infrastructure is cat5e with gigabit switches.

Is there anything else I can test / check ?

bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> :
Roel.Teuwen@advalvas.be 2007-05-18 01:39:
> In case I wasn't clear enough. The machine that got replaced by this one had
> working gigabit ethernet with a realtek addon card. Network infrastructure is
> cat5e with gigabit switches.
>
> Is there anything else I can test / check ?

Not much so far. A mii-tool -vv and the brand name of your motherboard could help.

There are several different 8168 bugs. At least they really seem to go along the 8168.
mii-tool -vv output for both r8168 interfaces (eth1 is not connected) :

Using SIOCGMIIPHY=0x8947
eth1: no link
  registers for MII PHY 32:
    1000 7949 001c c912 0de1 0000 0004 2001
    0000 0300 0000 0000 1007 f880 0000 3000
    0060 4000 0000 0040 1060 0000 080d 2108
    2740 8c00 0040 4013 8409 8000 0123 0000
  product info: vendor 00:07:32, model 17 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
----
Using SIOCGMIIPHY=0x8947
eth2: negotiated 100baseTx-FD flow-control, link ok
  registers for MII PHY 32:
    1000 796d 001c c912 0de1 cde1 000f 2001
    4780 0300 3800 0000 1007 f880 0000 3000
    0060 ac80 0000 6c42 1060 0000 441c 2108
    2740 8c00 0040 0106 097c 8000 0123 0000
  product info: vendor 00:07:32, model 17 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

Curiously, "capabilities", "negotiated", etc. seem wrong.
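One possible reading of the register dump above, offered as a sketch rather than a diagnosis: the summary lines printed by mii-tool list only 10/100 modes, while the gigabit abilities are held in MII registers 9 (1000BASE-T control) and 10 (1000BASE-T status). The bit masks below are the values defined in the Linux `<linux/mii.h>` header; the function names and the `ETH1_REGS`/`ETH2_REGS` constants are hypothetical and simply hold the first eleven words of each dump.

```python
# Bit masks as defined in the Linux <linux/mii.h> header.
BMSR_LSTATUS = 0x0004        # register 1: link is up
BMSR_ANEGCOMPLETE = 0x0020   # register 1: autonegotiation complete
ADVERTISE_1000FULL = 0x0200  # register 9: we advertise 1000BASE-T full duplex
LPA_1000FULL = 0x0800        # register 10: partner is 1000BASE-T FD capable

# First eleven register words copied from the mii-tool -vv output above.
ETH1_REGS = "1000 7949 001c c912 0de1 0000 0004 2001 0000 0300 0000"
ETH2_REGS = "1000 796d 001c c912 0de1 cde1 000f 2001 4780 0300 3800"


def parse_regs(text):
    """Turn a whitespace-separated hex dump into a list of integers."""
    return [int(word, 16) for word in text.split()]


def decode_gigabit(regs):
    """Decode link state and 1000BASE-T full-duplex abilities.

    Needs at least registers 0 through 10 of the PHY.
    """
    bmsr, ctrl1000, stat1000 = regs[1], regs[9], regs[10]
    link = bool(bmsr & BMSR_LSTATUS)
    advertised = bool(ctrl1000 & ADVERTISE_1000FULL)
    partner = bool(stat1000 & LPA_1000FULL)
    return {
        "link": link,
        "autoneg_complete": bool(bmsr & BMSR_ANEGCOMPLETE),
        "advertise_1000full": advertised,
        "partner_1000full": partner,
        "negotiated_1000full": link and advertised and partner,
    }


if __name__ == "__main__":
    for name, dump in (("eth1", ETH1_REGS), ("eth2", ETH2_REGS)):
        print(name, decode_gigabit(parse_regs(dump)))
```

Under these assumptions the eth2 words decode to a link with 1000BASE-T full duplex advertised by both ends, and the eth1 words decode to no link. This only shows that the registers carry gigabit information the summary lines omit; it says nothing about the cause of the slow transfers.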
ethtool output :

Settings for eth2:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000033 (51)
        Link detected: yes
-----
Settings for eth1:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: Unknown! (0)
        Duplex: Half
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000033 (51)
        Link detected: no

Motherboard is :
Base Board Information
        Manufacturer: http://www.abit.com.tw/
        Product Name: AB9/AB9RPO(Intel965+ICH8)
        Version: 1.x (BIOS:15)

I have the AB9Pro with the two onboard nics.

Roel, can you try 2.6.22-rc3 with http://www.fr.zoreil.com/people/francois/misc/20070527-2.6.22-rc3-r8169.patch and attach the output of 'ethtool -e eth1', 'ethtool -e eth2' ?

Thanks in advance.

tested -rc3 and the patch : same problems, getting 100kb/s now, and a peak of 16mb/s during one second somewhere 5 seconds after starting the transfer.

eth2 is cabled and configured, eth1 is down. Even though eth1 is not cabled, eth1 shows 'link detected : yes'

Settings for eth1:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: Unknown! (0)
        Duplex: Half
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000033 (51)
        Link detected: yes

# ethtool -e eth1
Offset  Values
------  ------
0x0000  a4 04 a4 04 a4 04 a4 04 b0 43 b0 43 b0 43 b0 43
0x0010  a0 05 a0 05 a0 05 a0 05 ec 51 ec 51 ec 51 ec 51
0x0020  cc 41 cc 41 cc 41 cc 41 10 04 10 04 10 04 10 04
0x0030  00 80 00 80 00 80 00 80 00 40 00 40 00 40 00 40
0x0040  34 5e 34 5e 34 5e 34 5e 00 a0 00 a0 00 a0 00 a0
0x0050  14 7c 14 7c 14 7c 14 7c 08 df 08 df 08 df 08 df
0x0060  08 01 08 01 08 01 08 01 8c fc 8c fc 8c fc 8c fc
0x0070  00 40 00 40 00 40 00 40 10 0c 10 0c 10 0c 10 0c
0x0080  a0 05 a0 05 a0 05 a0 05 b0 43 b0 43 b0 43 b0 43
0x0090  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00a0  fc ff fc ff fc ff fc ff fc ff fc ff fc ff fc ff
0x00b0  fc ff fc ff fc ff fc ff 7c 00 7c 00 7c 00 7c 00
0x00c0  00 1c 00 1c 00 1c 00 1c 5c fb 5c fb 5c fb 5c fb
0x00d0  40 c0 40 c0 40 c0 40 c0 c0 07 c0 07 c0 07 c0 07
0x00e0  fc 06 fc 06 fc 06 fc 06 00 00 00 00 00 00 00 00
0x00f0  80 01 80 01 80 01 80 01 00 c0 00 c0 00 c0 00 c0

# ethtool -e eth2
Offset  Values
------  ------
0x0000  a4 04 a4 04 a4 04 a4 04 b0 43 b0 43 b0 43 b0 43
0x0010  a0 05 a0 05 a0 05 a0 05 ec 51 ec 51 ec 51 ec 51
0x0020  cc 41 cc 41 cc 41 cc 41 10 04 10 04 10 04 10 04
0x0030  00 80 00 80 00 80 00 80 00 40 00 40 00 40 00 40
0x0040  34 5e 34 5e 34 5e 34 5e 00 a4 00 a4 00 a4 00 a4
0x0050  14 7c 14 7c 14 7c 14 7c 08 df 08 df 08 df 08 df
0x0060  08 01 08 01 08 01 08 01 8c fc 8c fc 8c fc 8c fc
0x0070  00 40 00 40 00 40 00 40 10 0c 10 0c 10 0c 10 0c
0x0080  a0 05 a0 05 a0 05 a0 05 b0 43 b0 43 b0 43 b0 43
0x0090  00 00 00 00 00 00 00 00 00 04 00 04 00 04 00 04
0x00a0  fc ff fc ff fc ff fc ff fc ff fc ff fc ff fc ff
0x00b0  fc ff fc ff fc ff fc ff 7c 00 7c 00 7c 00 7c 00
0x00c0  00 1c 00 1c 00 1c 00 1c 5c fb 5c fb 5c fb 5c fb
0x00d0  40 c0 40 c0 40 c0 40 c0 c0 07 c0 07 c0 07 c0 07
0x00e0  fc 06 fc 06 fc 06 fc 06 00 00 00 00 00 00 00 00
0x00f0  80 01 80 01 80 01 00 c0 00 c0 00 c0 00 c0

Ok, it seems I have found a way to have slow transfers and fast transfers completely repeatable now. Things are starting to get strange... it appears to depend on the file that I try to transfer. When transferring an Ubuntu .iso file, things are slow; when transferring a TV recording in mpeg, everything is fast. With a different network card, both are fast.

Hope this helps somehow... :-/

Created attachment 12216 [details]
PHY power-on change
Roel, can you try the attached patch on top of 2.6.23-rc1 (or above) ?
Thanks in advance.
--
Ueimor
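For reference, the two `ethtool -e` dumps posted earlier in this report are nearly identical. A minimal sketch for locating the differing bytes follows; the function names are hypothetical, and the sample rows are the 0x0040 and 0x0090 lines copied from those dumps. It assumes each data line starts with a `0x` offset followed by hex bytes, as in the output above.

```python
def parse_dump(text):
    """Parse 'ethtool -e' style lines into a {offset: byte} mapping.

    Lines that do not start with a 0x offset (headers, rulers) are skipped.
    """
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("0x"):
            continue
        base = int(parts[0], 16)
        for index, byte in enumerate(parts[1:]):
            values[base + index] = int(byte, 16)
    return values


def diff_dumps(dump_a, dump_b):
    """Return (offset, byte_a, byte_b) for every offset where the dumps differ."""
    a, b = parse_dump(dump_a), parse_dump(dump_b)
    return [
        (offset, a.get(offset), b.get(offset))
        for offset in sorted(set(a) | set(b))
        if a.get(offset) != b.get(offset)
    ]


# Rows 0x0040 and 0x0090 as posted for eth1 and eth2.
ETH1_ROWS = """Offset Values
------ ------
0x0040 34 5e 34 5e 34 5e 34 5e 00 a0 00 a0 00 a0 00 a0
0x0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""
ETH2_ROWS = """Offset Values
------ ------
0x0040 34 5e 34 5e 34 5e 34 5e 00 a4 00 a4 00 a4 00 a4
0x0090 00 00 00 00 00 00 00 00 00 04 00 04 00 04 00 04
"""

if __name__ == "__main__":
    for offset, byte_a, byte_b in diff_dumps(ETH1_ROWS, ETH2_ROWS):
        print(f"0x{offset:04x}: eth1={byte_a:02x} eth2={byte_b:02x}")
```

On these rows the only differences are `a0` versus `a4` in the second half of row 0x0040 and `00` versus `04` in the second half of row 0x0090. Whether those bytes matter for the slow transfers is not established anywhere in this report.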
No change in the symptoms, I'm afraid. Most files I tested transferred at 100KB/s, but I successfully transferred 1 file at high speed (50MB/s). Transferring the same file a second time is slow again, though.

Francois,

Excellent news. I'm now running 2.6.23-rc5-git1 with 20070903-2.6.23-rc5-r8169-test.patch applied on top, and the transfer speed is now always around 40MB/s.

I will keep monitoring the status, but it seems the issue has been solved.

Best regards,
Roel

Thanks for the news Roel.

Can you narrow the fix and check if patches #0001 and #0002 are enough ?

The patch kit is located at:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.23-rc5/r8169-20070903/

Just to gain a little background:
- was this with NAPI enabled ?
- do the 40MB/s stand in either direction ?
--
Ueimor

Speeds are ok in both directions. NAPI has been disabled since the problems began. I will perform some tests with and without NAPI, and with just 0001 and 0002 with and without NAPI, as soon as I can reboot the machine.

I've not tested 2.6.23-rc5 vanilla without any of your patches, should I try that as well ?

Tests seem fine in both directions with or without NAPI with 0001 and 0002 applied. Not tested without them. Rebooted between tests.

Best regards,
Roel

Fixed in 2.6.23 as of commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d78ae2dcc2acebb9a1048278f47f762c069db75c
--
Ueimor |