Most recent kernel where this bug did not occur:
Distribution: Debian
Hardware Environment: MB Kontron 986LCD, Core 2 Duo T7600, 2GB DDR, plug + cable Cat6 shielded (SFTP)
Software Environment: Debian Unstable
Problem Description:
When using the r8169 module, I must wait a random amount of time to get the full listing of a big NFS-shared directory. The time ranges from 0.5 to 13s. After the first call, the second is almost immediate. Looking at wireshark/tcpdump, it seems to happen every time I get a "Reassembled PDU" (the time between the first packet and this PDU determines how long, between 0.5 and 13s, the listing takes). The situation seems to get worse with time. This is a bit annoying because the "big" directory is my home dir!

Using r1000 (Realtek driver) 1.05 (not 1.05a, which oopses), I get a constant time of 0.005s.

I have no speed or latency problem in other cases.

Steps to reproduce:
Computer A has a RTL8111/8168B PCI Express Gigabit Ethernet controller
Computer B is a standard computer
1. create a /dir/ with a lot of hidden files (.XXX) on A
2. export /dir/ with NFS from A to B
3. mount /dir/ && umount /dir/ && time ls /dir/ on B

It can also be useful to "touch /dir/test", which seems to trigger a new RPC call to A.
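The client-side part of the reproduction steps can be scripted. The following is a minimal sketch and not part of the original report: the function names `make_hidden_files` and `time_listing`, the file count, and the local temporary directory are illustrative assumptions. On a real setup the files would be created on A and the listing timed on the NFS mount point on B, right after a fresh mount so the result is not served from the client cache.

```python
import os
import tempfile
import time


def make_hidden_files(directory, count):
    """Create `count` empty hidden files (.f00000, .f00001, ...) in `directory`."""
    for i in range(count):
        with open(os.path.join(directory, ".f%05d" % i), "w"):
            pass


def time_listing(directory):
    """Return (number of entries, elapsed seconds) for one full directory listing."""
    start = time.perf_counter()
    entries = os.listdir(directory)  # includes hidden files, excludes "." and ".."
    elapsed = time.perf_counter() - start
    return len(entries), elapsed


if __name__ == "__main__":
    # Local temporary directory as a stand-in (assumption);
    # point this at the NFS mount to measure the real latency.
    with tempfile.TemporaryDirectory() as d:
        make_hidden_files(d, 2000)
        n, secs = time_listing(d)
        print("%d entries listed in %.3fs" % (n, secs))
```

Run against a local directory this only checks that the script works; the figures that matter are the ones obtained over NFS through the r8169 interface.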
Created attachment 11897 [details] config-2.6.21.5-core2
Created attachment 11898 [details] dmesg-2.6.21.5-core2
Created attachment 11899 [details] ifconfig-2.6.21.5-core2
Created attachment 11900 [details] interrupts-2.6.21.5-core2
Created attachment 11901 [details] lsmod-2.6.21.5-core2
Created attachment 11902 [details] lspci-2.6.21.5-core2
Can you send your r1000 driver (version 1.05 and 1.05a)? Thanks in advance.

--
Ueimor
Where should I put the source? (Attach it to this bugzilla, or send it to you by email?)
Subject: Re: r8169: high latency when packet fragmentation occurs (NFS)

sylvain@le-gall.net 2007-07-04 15:26:
> Where should i put the source ? (attached to this bugzilla/send to you by
> email).

Please add it to bugzilla.
Created attachment 11943 [details] r1000 driver (works)
Created attachment 11944 [details] r1000 driver (oops)
I have a similar problem with the r8169 driver:

Machine A:
SuSE Linux 9.3 64-Bit
Kernel 2.6.21.6 or even 2.6.20.4 (from kernel.org, unmodified)
Mainboard: ASUS P5B with on-board Realtek PCI-E Gigabit LAN controller, reported by the kernel driver as:
"eth1: RTL8168b/8111b at 0xffffc20000030000, 00:18:f3:51:dd:16, IRQ 19"

Machine B:
Windows XP System

Machine A connects as a Samba client to Machine B.

The problem is: when I edit a large file located on machine B in a text editor on machine A (via Samba) and save it, the save always hangs for many minutes. The same happens when editing an image file with a paint program.

Usually the save stops hanging and finishes when I play with a VNC client on machine A that is also connected to machine B (just moving the mouse across the VNC window is enough). Copying files over the Samba connection in a file manager or shell works without any problems but, unlike VNC, does not let the editors continue saving. (Yes, this is somehow unbelievable.)

When using another network card and driver, everything works fine.
I have switched to 2.6.22.1 with the vserver patch (sorry, if really needed I can remove this patch). The problem is still here, but I have additional data:
* the problem gets worse with time!

Yesterday, I had just rebooted my computer to pick up the new kernel and I thought the bug was gone (almost no problem). Today, I have:

gildor@grand:~$ time ls
bin Desktop download images news programmation teleir debian documents GNUstep mail-archive playlist public_html tmp

real 0m3.410s
user 0m0.000s
sys 0m0.000s

Which is not good at all (i.e. the two computers are one 3Com switch apart and connected with Cat 6 cable... which should work fine).

Now, the next problem is: r1000 doesn't compile on this new kernel version. So I really need this driver ;-) (which is far better than the r1000).

Let me know if you have problems reproducing it, I can do some test cases.

Thanks and regards
Sylvain Le Gall
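Since the latency seems to grow with uptime, one way to document the drift is to sample the listing time at intervals and keep the timestamps. This is only a sketch, assuming Python is available on the client; `sample_latency`, its parameters, and the injectable `sleep` argument are hypothetical names chosen for illustration. Repeated listings may be answered from the client cache (the second call was reported as almost immediate), so a real run would need a remount, or a `touch` on the server, between samples.

```python
import os
import time


def sample_latency(directory, samples, interval, sleep=time.sleep):
    """List `directory` `samples` times, pausing `interval` seconds between runs.

    Returns a list of (wall-clock timestamp, elapsed seconds) tuples.
    """
    results = []
    for i in range(samples):
        start = time.perf_counter()
        os.listdir(directory)
        results.append((time.time(), time.perf_counter() - start))
        if i < samples - 1:
            sleep(interval)  # no pause after the last sample
    return results


if __name__ == "__main__":
    # "." is a placeholder (assumption); use the NFS-mounted directory for real measurements.
    for stamp, elapsed in sample_latency(".", samples=3, interval=1.0):
        clock = time.strftime("%H:%M:%S", time.localtime(stamp))
        print("%s  %.3fs" % (clock, elapsed))
```

Logging a few samples per hour after a reboot would show whether the delay really climbs from near zero toward several seconds.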
Hello,

Using the same computers and switch, but adding a new Gigabit NIC (DGE 530T/skge), gives me a stable result, without the latency of the r8169:

gildor@grand:~$ time ls
bin debian Desktop documents download GNUstep images mail-archive news playlist programmation public_html teleir tmp unison.log

real 0m0.045s
user 0m0.000s
sys 0m0.004s

Even though I have a way to work around this problem, I would really like to solve it. Let me know if you need more data to solve it.

Regards
Sylvain Le Gall
bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> :
[...]
> Even if i have a solution to circumvent this problem, i would really like to
> solve it. Let me know if you need more data to solve it.

If you have some spare time, can you try against 2.6.23-rc3:

http://www.fr.zoreil.com/people/francois/misc/20070818-2.6.23-rc3-r8169-test.patch

or (tarball sits one level higher):

http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.23-rc3/r8169-20070818/

(do not hurry, I am far away from my usual computers until 31/08).
After exchanging some mail with Kontron support (the provider of my motherboard), they told me that there is an alignment problem in the RX buffer. If you are interested, I have both the Realtek driver and their driver, which differ in the file r1000_n.c.

It could be a solution to the problem, but the driver doesn't compile (too old). The Kontron people told me that this problem is not related to their motherboard but can be found with any Realtek chipset...

I attach the diff.

Please let me know if you still want me to test 20070818-2.6.23-rc3-r8169-test.patch

Regards
Sylvain Le Gall
Created attachment 12716 [details] Difference between r1000 v1.05 and v1.05a_20070227 (kontron)

Shows the difference in RX buffer alignment in the r1000 driver
Hi Sylvain,

1. Please try 2.6.23-rc5 (without highres timer).

2. If it still does not work correctly, please try the patches at
   http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.23-rc5/r8169-20070903
   Each patch of the series applies on top of the previous one.

If you need to go through 2., I would expect a change of behavior as soon as patch #0002 of the series is applied.

--
Ueimor
Please reopen this bug if:
- it is still present with kernel 2.6.23-rc9 and
- you have done the further requested testing.
The bug is fixed by d78ae2dcc2acebb9a1048278f47f762c069db75c, which was merged during 2.6.23-rc, and Sylvain has done further testing.

Adrian, I'd appreciate it if you sent a short notice before rejecting any bug that I follow. Thanks.

--
Ueimor