Bug 5947

Summary: r8169 Losing some ticks
Product: Drivers
Reporter: Olivier Mondoloni (elgrande71)
Component: Network
Assignee: Francois Romieu (romieu)
Status: CLOSED CODE_FIX
Severity: normal
CC: akpm, romieu
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.15.1, 2.6.16-rc1, 2.6.16-rc1-mm2
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: kernel config file for 2.6.16-rc1-mm2 compilation
lspci from my computer
/proc/interrupts before eth1 is start
/proc/interrupts after eth1 started
lower the maximum time spent polling the MII registers
dmesg from 2.6.15.1 with r8169 patch
dmesg from 2.6.15.1 with r8169 napi
shorten the time spent polling the mii registers
2.6.15.1 dmesg with r8169 patch2

Description Olivier Mondoloni 2006-01-24 01:16:05 UTC
Most recent kernel where this bug did not occur: 2.6.16-rc1-mm2
Distribution: Gentoo Linux 2005.1
Hardware Environment: AMD64 + Yukon Gigabit Ethernet + RTL8169 Gigabit Ethernet
Software Environment: Linux (console or X)
Problem Description: With my RTL8169 network interface, dmesg shows 'Losing some
ticks... checking if CPU frequency changed' when the link is down. When the link
is up, no such message is logged in dmesg.
To be more precise, my Yukon network interface is used for Internet access and
my RTL8169 for the fast LAN.

Steps to reproduce: compile a kernel with the skge and r8169 drivers built in
and with loadable module support disabled. In my network setup, the two
interfaces have different IP addresses, broadcast addresses and networks. The
same network configuration has been tested on Windows XP without any problems.
Comment 1 Olivier Mondoloni 2006-01-24 01:17:49 UTC
Sorry, but in the 2.6.16-rc1-mm2 kernel, the bug is still present.
Comment 2 Olivier Mondoloni 2006-01-24 01:21:32 UTC
Created attachment 7113 [details]
kernel config file for 2.6.16-rc1-mm2 compilation
Comment 3 Andrew Morton 2006-01-24 01:25:29 UTC
I suppose that r8169 is disabling interrupts for too long when the
link is down.
Comment 4 Olivier Mondoloni 2006-01-24 01:26:00 UTC
Created attachment 7114 [details]
lspci from my computer
Comment 5 Olivier Mondoloni 2006-01-24 01:43:38 UTC
Created attachment 7117 [details]
/proc/interrupts before eth1 is start
Comment 6 Olivier Mondoloni 2006-01-24 01:45:43 UTC
Created attachment 7118 [details]
/proc/interrupts after eth1 started
Comment 7 Francois Romieu 2006-01-24 16:03:23 UTC
Created attachment 7125 [details]
lower the maximum time spent polling the MII registers

> I suppose that r8169 is disabling interrupts for too long when the
> link is down.

Possible workaround attached. I'll check the 802.3 spec for the upper bound but
this stuff should be moved to a workqueue context anyway.

--
Ueimor
Comment 8 Andrew Morton 2006-01-24 16:17:28 UTC
If mdio_read/write are actually causing loss of ticks, either they really
are timing out, or they're taking unexpectedly long.

If the latter is true, this patch could cause the driver to malfunction,
couldn't it?
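
The concern above can be illustrated with a minimal userspace sketch. It is not
the driver code: poll_until_ready() is a hypothetical stand-in for a bounded
MII polling loop, and the iteration counts and delays used with it are
illustrative assumptions, not values taken from the attached patch. It only
shows that a read which is slow but would eventually succeed is reported as a
timeout once the polling budget is made smaller than the device's response
time.

```c
#include <assert.h>

/*
 * Hypothetical model of a bounded polling loop (not the r8169 code).
 * The simulated device becomes ready `ready_after_us` microseconds after
 * the request. The loop performs at most `max_iters` checks, spaced
 * `step_us` apart. Returns the simulated time spent polling, or -1 if
 * every check failed (the caller would treat this as a timeout).
 */
static long poll_until_ready(long ready_after_us, long step_us, int max_iters)
{
    long elapsed_us = 0;

    assert(step_us > 0 && max_iters > 0);

    for (int i = 0; i < max_iters; i++) {
        if (elapsed_us >= ready_after_us)
            return elapsed_us;      /* device answered within the budget */
        elapsed_us += step_us;      /* stands in for udelay(step_us) */
    }
    return -1;                      /* budget exhausted */
}
```

With these assumed numbers, a device that needs 1000 us is served by a budget
of 2000 checks at 100 us, but a budget of 20 checks at 25 us gives up first.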

Comment 9 Olivier Mondoloni 2006-01-25 01:56:30 UTC
Created attachment 7127 [details]
dmesg from 2.6.15.1 with r8169 patch

I tried the patch posted above, but nothing changed. The 'Losing some ticks'
message still appears.
Comment 10 Francois Romieu 2006-01-25 04:20:50 UTC
akpm:
[...]
> If the latter is true, this patch could cause the driver to malfunction,
> couldn't it?

Yes but the link is supposed to be down when the issue happens.

Anyway, it is apparently not the usual suspect. Olivier, can you give it a quick
try with NAPI enabled (just wondering)?
-- 
Ueimor
Comment 11 Olivier Mondoloni 2006-01-25 08:39:36 UTC
Created attachment 7140 [details]
dmesg from 2.6.15.1 with r8169 napi

Same message with NAPI enabled at compile time.
If r8169 (module or built-in) is not enabled in the .config file, the 'Losing
some ticks' message does not appear in dmesg.
Comment 12 Francois Romieu 2006-01-25 11:38:08 UTC
Created attachment 7143 [details]
shorten the time spent polling the mii registers

Olivier:
[...]
> If r8169 (module or kernel built-in) is not present on .config settings
> compilation file, Losing ticks does not appear in dmesg.

Ok but look closer at http://bugzilla.kernel.org/show_bug.cgi?id=5947:
[...]
> SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
> SCSI device sda: drive cache: write back
> SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
> Losing some ticks... checking if CPU frequency changed.
> SCSI device sda: drive cache: write back

-> The r8169 module was not loaded at this time (it appears later in the
   dmesg). Something else hurts latency as well.

The patch should fix whatever excess delay is due to the r8169 driver. It will
not spend more than 0.5 ms with irq disabled.
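
As a rough illustration of why a 0.5 ms bound matters, the arithmetic can be
sketched as below. The helper names are hypothetical, and the inputs in the
note after the block (20 checks of 25 us, a 200 ms window, a 250 Hz timer) are
assumptions chosen for the example, not figures read from the patch or from
the reporter's kernel configuration.

```c
#include <assert.h>

/* Upper bound, in microseconds, on a polling loop that performs at most
 * `max_iters` checks with a fixed `step_us` delay after each one. */
static long worst_case_poll_us(int max_iters, long step_us)
{
    assert(max_iters >= 0 && step_us >= 0);
    return (long)max_iters * step_us;
}

/* Number of whole timer periods that fit inside a window of `irq_off_us`
 * microseconds with interrupts disabled, for a timer running at `hz` Hz.
 * A result of 0 means the window is shorter than one timer period. */
static long tick_periods_covered(long irq_off_us, int hz)
{
    assert(hz > 0);
    long tick_us = 1000000L / hz;   /* length of one timer period */
    return irq_off_us / tick_us;
}
```

Under these assumptions, worst_case_poll_us(20, 25) is 500 us, which is
shorter than the 4000 us timer period at 250 Hz, whereas a 200000 us window
would span 50 timer periods.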

-- 
Ueimor
Comment 13 Olivier Mondoloni 2006-01-25 13:30:35 UTC
Created attachment 7148 [details]
2.6.15.1 dmesg with r8169 patch2

It seems that the 'Losing some ticks' message is gone.
Thank you.
Comment 14 Olivier Mondoloni 2006-01-25 13:32:16 UTC
I am going to test this kernel with this patch for a few days to be sure.
Comment 15 Olivier Mondoloni 2006-01-25 23:07:24 UTC
Your patch seems to work well today.
Thank you again.