Bug 10600 - e1000 updates rx_bytes infrequently
Summary: e1000 updates rx_bytes infrequently
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Jesse Brandeburg
Reported: 2008-05-03 14:55 UTC by Ben Liblit
Modified: 2008-05-14 13:11 UTC

Kernel Version: 2.6.24.4

Description Ben Liblit 2008-05-03 14:55:07 UTC
Latest working kernel version: not known
Earliest failing kernel version: 2.6.24.4 for sure, but it's definitely not new
Distribution: Fedora 8
Hardware Environment: Intel 82566DC Gigabit Network Connection
Software Environment: e1000 driver version 7.3.20-k2-NAPI
Problem Description:

The count of received bytes reported in "/sys/class/net/eth0/statistics/rx_bytes" updates approximately once every two seconds, even when data is streaming in smoothly at 300K/s.  This makes transfer rates appear to be quite bursty when viewed in network monitoring tools such as the GNOME System Monitor applet.

Data is not *actually* arriving in once-every-two-second bursts.  For example, if I am storing a large download in a file, I am definitely seeing data arriving continuously.  It's only the "rx_bytes" count that makes things seem bursty because it updates so infrequently.

Is this frequency tunable in some way?

Steps to reproduce:

1. Run the following command to check the "rx_bytes" count every quarter second and highlight any changes from the previous count:

    watch -d -n 0.25 'cat /sys/class/net/eth0/statistics/rx_bytes'

2. Start a large download from a site that you know will give you a good, smooth, continuous flow of data.

3. Observe that "rx_bytes" only updates about once every two seconds, even though data is actually arriving continuously.
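
The same check can be scripted to measure the gap between updates rather than eyeballing it. This is only a minimal sketch: the helper names `poll_counter` and `change_intervals` are made up for illustration, and the 0.25 second period simply mirrors the `watch` command above.

```python
import time
from pathlib import Path


def change_intervals(samples):
    """Given (timestamp, counter) samples, return the gaps in seconds
    between the moments at which the counter was seen to change."""
    change_times = []
    previous = None
    for timestamp, value in samples:
        if previous is not None and value != previous:
            change_times.append(timestamp)
        previous = value
    return [later - earlier
            for earlier, later in zip(change_times, change_times[1:])]


def poll_counter(path, period=0.25, duration=10.0):
    """Read an integer counter file every `period` seconds for `duration`
    seconds and return the list of (timestamp, value) samples."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), int(Path(path).read_text())))
        time.sleep(period)
    return samples
```

Calling `change_intervals(poll_counter("/sys/class/net/eth0/statistics/rx_bytes"))` during a download should return gaps close to the polling period if the counter is updated continuously, and gaps near 2 seconds if it behaves as described in this report.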

Additional information:

Originally reported as a GNOME System Monitor bug: <http://bugzilla.gnome.org/show_bug.cgi?id=518355>.  The conclusion there was that this is a kernel issue and therefore should be transferred here.
Comment 1 Anonymous Emailer 2008-05-05 15:55:37 UTC
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sat,  3 May 2008 14:55:08 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10600
> 
>            Summary: e1000 updates rx_bytes infrequently
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.24.4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: liblit@acm.org
> 
> 
> Latest working kernel version: not known
> Earliest failing kernel version: 2.6.24.4 for sure, but it's definitely not new
> Distribution: Fedora 8
> Hardware Environment: Intel 82566DC Gigabit Network Connection
> Software Environment: e1000 driver version 7.3.20-k2-NAPI
> Problem Description:
> 
> The count of received bytes reported in
> "/sys/class/net/eth0/statistics/rx_bytes" updates approximately once every
> two seconds, even when data is streaming in smoothly at 300K/s.  This makes
> transfer rates appear to be quite bursty when viewed in network monitoring
> tools such as the GNOME System Monitor applet.
> 
> Data is not *actually* arriving in once-every-two-second bursts.  For
> example, if I am storing a large download in a file, I am definitely seeing
> data arriving continuously.  It's only the "rx_bytes" count that makes things
> seem bursty because it updates so infrequently.
> 
> Is this frequency tunable in some way?
> 
> Steps to reproduce:
> 
> 1. Run the following command to check the "rx_bytes" count every quarter
> second and highlight any changes from the previous count:
> 
>     watch -d -n 0.25 'cat /sys/class/net/eth0/statistics/rx_bytes'
> 
> 2. Start a large download from a site that you know will give you a good,
> smooth, continuous flow of data.
> 
> 3. Observe that "rx_bytes" only updates about once every two seconds, even
> though data is actually arriving continuously.
> 
> Additional information:
> 
> Originally reported as a GNOME System Monitor bug:
> <http://bugzilla.gnome.org/show_bug.cgi?id=518355>.  The conclusion there was
> that this is a kernel issue and therefore should be transferred here.
> 

Yes, that is a bit obnoxious.

I've noticed that when I'm downloading stuff at home, gkrellm will display
eth0 as consuming 0 kbytes/sec, then 400 kbytes/sec, then 0, then 400 ad
nauseam.  I always assumed that gkrellm was busted.  Perhaps wrongly...
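
A rough illustration of how a monitor could end up showing that 0/400 pattern. It assumes, hypothetically, a steady 200 kbytes/sec download, a visible counter that is refreshed only every 2 seconds, and a monitor that samples once per second. The function name and all numbers are illustrative and are not taken from gkrellm or the driver.

```python
def displayed_rates(true_rate, update_period, sample_period, samples):
    """Rates a monitor would display when polling a counter that is only
    refreshed every `update_period` seconds (all times in whole seconds)."""

    def counter_at(t):
        # The visible counter holds whatever was stored at the last refresh.
        last_refresh = (t // update_period) * update_period
        return true_rate * last_refresh

    rates = []
    for i in range(1, samples + 1):
        now = i * sample_period
        before = (i - 1) * sample_period
        rates.append((counter_at(now) - counter_at(before)) // sample_period)
    return rates


# Assumed numbers: 200 kbytes/sec, refreshed every 2 s, sampled every 1 s.
print(displayed_rates(200, 2, 1, 6))  # [0, 400, 0, 400, 0, 400]
# Same traffic with a counter that is current at every sample.
print(displayed_rates(200, 1, 1, 6))  # [200, 200, 200, 200, 200, 200]
```

The average over time is correct in both cases; only the per-sample rate is distorted when the refresh period is longer than the sampling period.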
Comment 2 David S. Miller 2008-05-05 16:18:36 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 5 May 2008 15:55:02 -0700

> I've noticed that when I'm downloading stuff at home, gkrellm will display
> eth0 as consuming 0 kbytes/sec, then 400 kbytes/sec, then 0, then 400 ad
> nauseam.  I always assumed that gkrellm was busted.  Perhaps wrongly...

It's a tradeoff between excess DMA traffic updating the statistics,
and having them updated more frequently.

Actually, the thing that matters is when ->get_stats() is called.

So if a driver can trigger a statistics DMA update at ->get_stats()
time, that's probably what it should do.  But this could get
expensive and make the "do DMA less often" optimization less useful.
Comment 3 Ben Liblit 2008-05-05 16:38:56 UTC
Andrew Morton wrote:
> I've noticed that when I'm downloading stuff at home, gkrellm will display
> eth0 as consuming 0 kbytes/sec, then 400 kbytes/sec, then 0, then 400 ad
> nauseam.  I always assumed that gkrellm was busted.  Perhaps wrongly...

Likewise, my initial assumption was that GNOME System Monitor was 
faulty.  You might try the command that I suggested in my original report:

    watch -d -n 0.25 'cat /sys/class/net/eth0/statistics/rx_bytes'

Run that during a big download and it should be pretty clear whether 
it's being updated frequently or not.  If not, then gkrellm is not to blame.
Comment 4 Michael Chan 2008-05-05 16:51:17 UTC
On Mon, 2008-05-05 at 16:18 -0700, David Miller wrote:
> Actually, the thing that matters is when ->get_stats() is called.
> 
> So if a driver can trigger a statistics DMA update at ->get_stats()
> time, that's probably what it should do.  But this could get
> expensive and make the "do DMA less often" optimization less useful.
> 
> 

The width of hardware counters may also be limited and so periodic
fetching or DMA may be needed to prevent overflow.
Comment 5 David S. Miller 2008-05-05 16:54:29 UTC
From: "Michael Chan" <mchan@broadcom.com>
Date: Mon, 05 May 2008 17:55:35 -0700

> The width of hardware counters may also be limited and so periodic
> fetching or DMA may be needed to prevent overflow.

To the best of my knowledge, they are all 64-bit on e1000, so I don't
think it's a real issue in this specific case.

But yes, in general, this is a concern.
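
For a sense of scale on the overflow concern, here is a back-of-the-envelope calculation. It assumes a byte counter running at the full line rate of a 1 Gbit/s link and ignores framing overhead; the function name is made up for illustration.

```python
def seconds_until_wrap(counter_bits, link_bits_per_second):
    """Time in seconds for a byte counter of the given width to wrap
    around when bytes arrive at the full line rate."""
    bytes_per_second = link_bits_per_second / 8
    return 2 ** counter_bits / bytes_per_second


GIGABIT = 1e9  # assumed link speed in bits per second

print(seconds_until_wrap(32, GIGABIT))  # roughly 34 seconds
print(seconds_until_wrap(64, GIGABIT) / (365 * 24 * 3600))  # thousands of years
```

So a 32-bit byte counter would have to be fetched at least every half minute or so at gigabit speed, while a 64-bit counter is effectively immune to wrapping.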
Comment 6 Ben Liblit 2008-05-05 16:55:47 UTC
Perhaps userspace needs a way to tweak the update frequency?  Or perhaps 
the driver needs a way to tell userspace what update frequency it should 
expect?  Without some kind of coordination, we have the present silly 
situation: userspace network activity monitors are checking rx_bytes 
more frequently than could possibly be useful.

Do different Ethernet drivers update rx_bytes at the same rate?  Or is 
this completely ad hoc?
Comment 7 Michael Chan 2008-05-05 17:12:42 UTC
On Mon, 2008-05-05 at 18:55 -0500, Ben Liblit wrote:
> Perhaps userspace needs a way to tweak the update frequency?  Or perhaps 
> the driver needs a way to tell userspace what update frequency it should 
> expect?  Without some kind of coordination, we have the present silly 
> situation: userspace network activity monitors are checking rx_bytes 
> more frequently than could possibly be useful.
> 
> Do different Ethernet drivers update rx_bytes at the same rate?  Or is 
> this completely ad hoc?

"ethtool -C eth0 stats-block-usecs" can be used to control statistics
update frequency on most tg3 and bnx2 devices.  The default is 1 second
on these devices.
Comment 8 Krzysztof Oledzki 2008-05-05 17:43:57 UTC

On Mon, 5 May 2008, Michael Chan wrote:

> On Mon, 2008-05-05 at 18:55 -0500, Ben Liblit wrote:
>> Perhaps userspace needs a way to tweak the update frequency?  Or perhaps
>> the driver needs a way to tell userspace what update frequency it should
>> expect?  Without some kind of coordination, we have the present silly
>> situation: userspace network activity monitors are checking rx_bytes
>> more frequently than could possibly be useful.
>>
>> Do different Ethernet drivers update rx_bytes at the same rate?  Or is
>> this completely ad hoc?
>
> "ethtool -C eth0 stats-block-usecs" can be used to control statistics
> update frequency on most tg3 and bnx2 devices.  The default is 1 second
> on these devices.

Hm... strange - I tested it on 2.6.23.17 with:
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)

Setting both:
  # ethtool -C eth0 stats-block-usecs 50
and:
  # ethtool -C eth0 stats-block-usecs 5000000
gives exactly the same situation - statistics update frequency is still ~1s.

# ethtool -c eth1 |grep stats-block-usecs
stats-block-usecs: 999936

Best regards,

 				Krzysztof Ol
Comment 9 Michael Chan 2008-05-05 17:51:59 UTC
On Tue, 2008-05-06 at 02:43 +0200, Krzysztof Oledzki wrote:
> Hm... strange - I tested it on 2.6.23.17 with:
> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708
> Gigabit Ethernet (rev 12)
> 
> Setting both:
>   # ethtool -C eth0 stats-block-usecs 50
> and:
>   # ethtool -C eth0 stats-block-usecs 5000000
> gives exactly the same situation - statistics update frequency is
> still ~1s.
> 
> # ethtool -c eth1 |grep stats-block-usecs
> stats-block-usecs: 999936

Unfortunately, 5708 has a bug in the statistics DMA engine that was
reported here on netdev about 1 year ago.  To work around it, we have to
disable the statistics block DMA and rely on the driver's timer function
which runs at a fixed 1 second interval.  So only 0 and 999936 are
allowed values on the 5708.  Newer chips or the older 5706 do not have
this problem.
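
One way to notice this kind of silent clamping from a script is to read the value back after setting it and compare. The sketch below only parses text in the "name: value" form shown in the `ethtool -c` output quoted above; the function name is hypothetical, and actually running ethtool is left out.

```python
def parse_coalesce(output):
    """Turn 'name: value' lines from `ethtool -c` output into a dict,
    keeping only the entries whose value is an integer."""
    settings = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            settings[name.strip()] = int(value)
    return settings


# Sample text modelled on the output quoted in this thread.
sample = "Coalesce parameters for eth1:\nstats-block-usecs: 999936\n"
requested = 50
actual = parse_coalesce(sample)["stats-block-usecs"]
print(actual == requested)  # False: the requested value was not applied
```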
Comment 10 Ben Liblit 2008-05-05 18:24:56 UTC
Michael Chan wrote:
> "ethtool -C eth0 stats-block-usecs" can be used to control statistics
> update frequency on most tg3 and bnx2 devices.

But not on my e1000-driven Intel 82566DC, sadly:

	# ethtool -C eth0 stats-block-usecs 50
	Cannot get device coalesce settings: Operation not supported

	# ethtool -c eth0
	Coalesce parameters for eth0:
	Cannot get device coalesce settings: Operation not supported
Comment 11 Jesse Brandeburg 2008-05-06 10:22:54 UTC
Ben Liblit wrote:
> Michael Chan wrote:
>> "ethtool -C eth0 stats-block-usecs" can be used to control statistics
>> update frequency on most tg3 and bnx2 devices.
> 
> But not on my e1000-driven Intel 82566DC, sadly:
> 
>       # ethtool -C eth0 stats-block-usecs 50
>       Cannot get device coalesce settings: Operation not supported
> 
>       # ethtool -c eth0
>       Coalesce parameters for eth0:
>       Cannot get device coalesce settings: Operation not supported

e1000/e1000e already has patches upstream that count bytes and packets
"on the fly" to fix this sort of issue.

see commit:
commit ef90e4eca9fcade05dd03f853df75cf459a75422
Author: Auke Kok <auke-jan.h.kok@intel.com>
Date:   Tue Nov 13 20:49:15 2007 -0800

    [E1000]: update netstats traffic counters realtime

    formerly e1000/e1000e only updated traffic counters once every
    2 seconds with the register values of bytes/packets. With newer
    code however in the interrupt and polling code we can real-time
    fill in these values in the netstats struct for users to see.

    Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

and for e1000e (which you should be using for 82566)
commit 419886927796dfeca87c1fd11d1fe2ed442103cc
Author: Auke Kok <auke-jan.h.kok@intel.com>
Date:   Tue Nov 13 20:48:36 2007 -0800

    [E1000E]: update netstats traffic counters realtime

    formerly e1000/e1000e only updated traffic counters once every
    2 seconds with the register values of bytes/packets. With newer
    code however in the interrupt and polling code we can real-time
    fill in these values in the netstats struct for users to see.

    Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

I'm not sure if this patch is in 2.6.24 and my git-fu is lacking...
Is this patch not working for you?  If so, we need to figure out why.

Jesse
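
To make the difference between the two approaches concrete, here is a toy model. It is not the driver's code: the class and method names are invented, and the "hardware register" is just a Python attribute. It only contrasts copying a register on a timer with counting in the receive path.

```python
class PeriodicStats:
    """Visible counter is copied from the hardware register only when
    the (hypothetical) periodic timer fires."""

    def __init__(self):
        self.hw_register = 0
        self.rx_bytes = 0

    def receive(self, length):
        self.hw_register += length  # hardware counts; visible stats unchanged

    def timer_tick(self):
        self.rx_bytes = self.hw_register


class RealtimeStats:
    """Visible counter is bumped in the receive path for every packet."""

    def __init__(self):
        self.rx_bytes = 0

    def receive(self, length):
        self.rx_bytes += length

    def timer_tick(self):
        pass  # nothing to catch up on


periodic, realtime = PeriodicStats(), RealtimeStats()
for stats in (periodic, realtime):
    for _ in range(3):
        stats.receive(1500)

print(periodic.rx_bytes, realtime.rx_bytes)  # 0 4500
periodic.timer_tick()
print(periodic.rx_bytes, realtime.rx_bytes)  # 4500 4500
```

Between timer ticks, a reader of the periodic model sees a stale value, which matches the symptom in the original report.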
Comment 12 Jesse Brandeburg 2008-05-06 13:49:31 UTC
Brandeburg, Jesse wrote:
>Ben Liblit wrote:
>> Michael Chan wrote:
>>> "ethtool -C eth0 stats-block-usecs" can be used to control statistics
>>> update frequency on most tg3 and bnx2 devices.
>> 
>> But not on my e1000-driven Intel 82566DC, sadly:

> e1000 commit ef90e4eca9fcade05dd03f853df75cf459a75422

> and for e1000e (which you should be using for 82566)
> commit 419886927796dfeca87c1fd11d1fe2ed442103cc

> I'm not sure if this patch is in 2.6.24 and my git foo is lacking...
> is this patch not working for you?  if so we need to figure out why.

I found that these commits (to do real-time stats updates) are in
2.6.25-rc1, not 2.6.24.  It would be great if you were able to try a
newer kernel version to verify that these fixes work for you.

An ethtool -C parameter implementation to support a modifiable polling
frequency for the other stats is in progress.

Jesse
Comment 13 Roland Kletzing 2008-05-11 14:44:59 UTC
What about giving e1000 8.0.1 from the SourceForge project site a try?
http://sourceforge.net/project/showfiles.php?group_id=42302
Comment 14 Ben Liblit 2008-05-11 22:30:59 UTC
Roland Kletzing wrote:
> what about giving e1000 8.0.1 from sourceforge project site a try ?
> http://sourceforge.net/project/showfiles.php?group_id=42302

That driver fails utterly for me.  The module loads without complaint, 
but seems to behave as though it found no hardware to drive.  "ip link" 
mentions only the loopback interface, nothing else.  This in turn causes 
Fedora's networking scripts to report "Device eth0 does not seem to be 
present, delaying initialization."  Similarly, "/sys/class/net" shows 
only "lo", no "eth0".  I'm completely unable to bring up an ethernet 
network interface using e1000 driver version 8.0.1.
Comment 15 Ben Liblit 2008-05-11 22:44:50 UTC
Jesse Brandeburg wrote:
> and for e1000e (which you should be using for 82566)

That seems unlikely.  When I replace "alias eth0 e1000" with "alias eth0 
e100e" in my module configuration I lose the network interface entirely. 
  "ip link" does not mention it and it does not appear in 
"/sys/class/net".  Switching back to "alias eth0 e1000" restores the 
ability to use the network interface.

It seems pretty clear from my end that the e1000e driver is *not* 
recognizing and driving my Intel 82566DC ethernet device.
Comment 16 David S. Miller 2008-05-11 23:18:15 UTC
From: Ben Liblit <liblit@acm.org>
Date: Mon, 12 May 2008 00:44:39 -0500

> Jesse Brandeburg wrote:
> > and for e1000e (which you should be using for 82566)
> 
> That seems unlikely.  When I replace "alias eth0 e1000" with "alias eth0 
> e100e" in my module configuration I lose the network interface entirely. 

You're missing a zero in that string, it's "e1000e" not "e100e", and
did you make sure to enable CONFIG_E1000E in your kernel config?
Comment 17 Ben Liblit 2008-05-11 23:26:54 UTC
David Miller wrote:
> You're missing a zero in that string, it's "e1000e" not "e100e", and
> did you make sure to enable CONFIG_E1000E in your kernel config?

Sorry, that was a typo in my comment but not in my actual experiments. 
I used "e1000e", not "e100e".  CONFIG_E1000E is enabled, and "lsmod" 
confirmed that the e1000e module was definitely loaded into the kernel. 
  "The kernel", for the record, is the current Fedora 8 kernel, 
2.6.24.5-85.fc8.

I realize that pristine, self-built kernels are preferred for bug 
reporting.  I am currently building my own 2.6.25.3 so that I can affirm 
or refute Jesse Brandeburg's claim (comment #12) that this bug is fixed 
in 2.6.25-rc1 and later.  I'll report my findings when available, 
assuming I still remember how to install my own kernels.  (Yes, yes, 
I've gotten soft and lazy.)
Comment 18 Ben Liblit 2008-05-11 23:42:52 UTC
Jesse Brandeburg wrote:
> I found that these commits (to do real time stats update) are in
> 2.6.25-rc1, not 2.6.24, it would be great if you were able to try a
> newer kernel version to verify that these fixes work for you.

I'm now running a self-built 2.6.25.3 kernel.  I confirm that rx_bytes 
updates as frequently as I'm reasonably able to check it.

Furthermore, the e1000e driver included with 2.6.25.3 works on my
hardware when using this kernel.  Both e1000 and e1000e work, and 
rx_bytes is updated promptly when using either driver.

Problem solved.
Comment 19 Jesse Brandeburg 2008-05-14 13:10:58 UTC
Closing due to the user's feedback that this has been solved upstream.
