Bug 8106

Summary: tx_errors and tx_fifo_errors not updated consistantly
Product: Drivers Reporter: Chongfeng Hu (loveminix)
Component: NetworkAssignee: Mark Brown (broonie)
Status: CLOSED WILL_NOT_FIX    
Severity: normal CC: broonie, jgarzik, protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.20 Subsystem:
Regression: --- Bisected commit-id:

Description Chongfeng Hu 2007-03-01 10:55:47 UTC
Distribution: Kernel 2.6.20

Problem Description:

In natsemi.c, a Linux PCI Ethernet driver for the NatSemi DP8381x series, I 
found that two fields in network device statistics structure, namely tx_errors 
and tx_fifo_errors, are not updated consistantly. In struct net_device_stats, 
tx_errors indicates packet transmit problems, and tx_fifo_errors indicates 
transmitter's fifo overrun. tx_fifo_errors is a sub-category of tx_errors, and 
whenever tx_fifo_errors is incremented, tx_errors should also be incremented 
accordingly. This is conformed in most places except one. In 
drivers/net/natsemi.c, in function static void netdev_error(struct net_device 
*dev, int intr_status), the suspicious code segment is as follows:

if (intr_status & RxStatusFIFOOver) {
        if (netif_msg_rx_err(np) && netif_msg_intr(np)) {
            printk(KERN_NOTICE "%s: Rx status FIFO overrun\n",
                dev->name);
        }
        np->stats.rx_fifo_errors++;
    }
    /* Hmmmmm, it's not clear how to recover from PCI faults. */
    if (intr_status & IntrPCIErr) {
        printk(KERN_NOTICE "%s: PCI error %#08x\n", dev->name,
            intr_status & IntrPCIErr);
        np->stats.tx_fifo_errors++;
        np->stats.rx_fifo_errors++;
    }

Here, tx_fifo_errors is incremented, but there's no tx_errors updates in this 
function overall. This might cause inaccurate transmit error statistics.

Steps to reproduce:

I found this bug by using a code-checking tool.
Comment 1 Olaf Kirch 2007-03-09 08:07:16 UTC
I don't think there is a general rule that tx_fifo_errors is a subset
of tx_errors. Different drivers handle this differently, and while some
increment tx_errors for every kind of transmit error, some will just
increment say tx_fifo_errors and leave tx_errors for unspecified problems.
Comment 2 Mark Brown 2007-03-12 03:50:30 UTC
In the case of PCI errors it's not particularly clear why the FIFO error counts
are being incremented in the first place.
Comment 3 Natalie Protasevich 2007-07-20 00:32:38 UTC
Any updates on this discussion? Needs to be discussed on netdev maybe?
Thanks.
Comment 4 Mark Brown 2007-07-20 11:10:16 UTC
As far as I remember the discussion on netdev about this sort of issue has always suggested that as Olaf says there's no particular standard.

In any case, I expect that for most FIFO errors there will also be an error reported via the descriptor which would increment the overall error counter - if that happens incrementing it again here would be redundant.
Comment 5 Mark Brown 2008-09-24 11:00:07 UTC
FWIW a change implementing the requested behaviour was written by Andrew Morton and merged some time ago.