Bug 10301

Summary: r8169 doesn't work anymore
Product: Drivers
Component: Network
Reporter: Laurent Goujon (laurent.goujon)
Assignee: Francois Romieu (romieu)
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P1
CC: jm, romieu
Hardware: All
OS: Linux
Kernel Version: 2.6.24.3-custom
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: dmesg, lspci

Description Laurent Goujon 2008-03-21 16:33:41 UTC
Latest working kernel version:
Earliest failing kernel version: 2.6.24
Distribution: Ubuntu Hardy Heron
Hardware Environment:
CPU AMD Turion 64 X2 Mobile Technology TL-50
RAM: 1GB
Chipset: nVidia Corporation MCP51
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller

Problem Description:

Starting with 2.6.24, the wired network interface can no longer obtain an IP address via DHCP. Setting a static IP doesn't work either. No error is displayed; it is as if reply packets are never received.

Reverting to 2.6.22 restores networking.

Adding pci=nomsi to the boot command line solves the problem.
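
When comparing results between machines, it helps to confirm that the workaround is actually in effect on the running kernel. The following is a minimal sketch, not part of the original report: it parses a kernel command line string (such as the contents of /proc/cmdline) and reports whether "nomsi" appears among the comma-separated pci= options. The function names pci_options and msi_disabled are hypothetical.

```python
def pci_options(cmdline: str) -> list:
    """Collect every comma-separated option passed via pci=... tokens."""
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            options.extend(token[len("pci="):].split(","))
    return options


def msi_disabled(cmdline: str) -> bool:
    """True if the command line asks the kernel to disable MSI globally."""
    return "nomsi" in pci_options(cmdline)


# Example use on a live system (assumes a Linux /proc filesystem):
#   with open("/proc/cmdline") as f:
#       print(msi_disabled(f.read()))
```

Matching on whole options rather than substrings avoids false positives from unrelated tokens that merely contain the text "nomsi".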

This bug is also reported in Ubuntu and Fedora bugs management systems:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/181081
https://bugzilla.redhat.com/show_bug.cgi?id=434629

It was also reproduced with a vanilla kernel built from git (following these instructions: https://wiki.ubuntu.com/KernelTeam/GitKernelBuild)

Steps to reproduce:
1) assign a valid static IP or use DHCP
2) try to ping another host
3) no response is received
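
The steps above can be backed with a counter check to test the "packets are never received" hypothesis. This is a rough sketch under the assumption that the kernel exposes per-interface counters under /sys/class/net/<iface>/statistics/; the helper names read_rx_packets and rx_packets_during are hypothetical, as are the interface name and address in the usage comment.

```python
from pathlib import Path
from typing import Callable


def read_rx_packets(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the received-packet counter for one interface from sysfs."""
    counter = Path(sysfs_root) / iface / "statistics" / "rx_packets"
    return int(counter.read_text().strip())


def rx_packets_during(iface: str, action: Callable[[], object],
                      sysfs_root: str = "/sys/class/net") -> int:
    """Run `action` (e.g. a ping) and return how many packets arrived meanwhile."""
    before = read_rx_packets(iface, sysfs_root)
    action()
    after = read_rx_packets(iface, sysfs_root)
    return after - before


# Example use (interface name and address are placeholders):
#   import subprocess
#   delta = rx_packets_during(
#       "eth0", lambda: subprocess.run(["ping", "-c", "3", "192.168.1.1"]))
#   print("packets received during ping:", delta)
```

A delta of zero while the link is up would be consistent with receive interrupts not being delivered, though it does not prove that on its own.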
Comment 1 Laurent Goujon 2008-03-21 16:36:20 UTC
Created attachment 15385
dmesg
Comment 2 Laurent Goujon 2008-03-21 16:37:23 UTC
Created attachment 15386
lspci
Comment 3 Matt Fischer 2008-04-09 09:01:36 UTC
I'm seeing this as well on different hardware.  The box we have is a NAS-like box running an Intel single-core CPU, Intel chipset, and 256MB of RAM. 

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)

We are using 2.6.24.3 + customizations like the original submitter.

The work-around suggested (pci=nomsi) works for us too.  

We've also found that if we PXE boot the machine and let the BIOS configure the device, it works every time, but it fails in Linux between 10% and 30% of the time.  We were unable to reproduce the issue in Windows.
Comment 4 Laurent Goujon 2008-04-11 13:57:49 UTC
After a lot of kernel builds, it seems the bug has been fixed (for me at least) since linux-2.6.25-rc3.
According to git bisect, the following patch fixed the bug:

commit 9dc625e72309e1c919ea3e7f51d0ffca96123787
Author: Peer Chen <pchen@nvidia.com>
Date:   Mon Feb 4 23:50:13 2008 -0800

    PCI: quirks: set 'En' bit of MSI Mapping for devices onHT-based nvidia platform
    
    According to HT spec, to get message interrupt from devices mapped to HT
    interrupt message, the 'En' bit of MSI Mapping capability need to be set.
    The patch do this setting in quirks code for the devices on HT-based nvidia
    platform.
    
    [akpm@linux-foundation.org: coding-style fixes]
    
    Signed-off-by: Andy Currid <acurrid@nvidia.com>
    Signed-off-by: Peer Chen <pchen@nvidia.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Andi Kleen <ak@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

:040000 040000 479224d3d9b51c6554b70f00224963ec124cb6a7 a0e3e966c5b27a7508cc63423d477285cd52278f M	drivers

This patch seems to fix MSI handling on NVIDIA chipsets. A similar issue might occur on Intel chipsets.
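
To illustrate what the 'En' bit in the commit message refers to, here is a hedged sketch that walks the capability list in a raw PCI configuration space dump and reports whether a HyperTransport MSI Mapping capability has its enable bit set. It is not the kernel's quirk code. The constants are written from memory of the Linux pci_regs.h header and should be verified against it, and the function names are hypothetical. The dump would typically come from a device's sysfs config file, read as root so that the full space is returned.

```python
# Offsets and masks believed to match include/linux/pci_regs.h (please verify).
PCI_STATUS = 0x06
PCI_STATUS_CAP_LIST = 0x10
PCI_CAPABILITY_LIST = 0x34
PCI_CAP_ID_HT = 0x08
HT_5BIT_CAP_MASK = 0xF8
HT_CAPTYPE_MSI_MAPPING = 0xA8
HT_MSI_FLAGS = 0x02
HT_MSI_FLAGS_ENABLE = 0x01


def find_ht_msi_mapping(cfg: bytes):
    """Return the offset of the HT MSI Mapping capability, or None."""
    if len(cfg) < 0x40:
        return None
    status = cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    # Bound the walk so a malformed, looping list cannot hang us.
    for _ in range(48):
        if pos < 0x40 or pos + 3 >= len(cfg):
            return None
        is_ht = cfg[pos] == PCI_CAP_ID_HT
        if is_ht and (cfg[pos + 3] & HT_5BIT_CAP_MASK) == HT_CAPTYPE_MSI_MAPPING:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return None


def msi_mapping_enabled(cfg: bytes):
    """True/False for the 'En' bit, or None if the capability is absent."""
    pos = find_ht_msi_mapping(cfg)
    if pos is None:
        return None
    return bool(cfg[pos + HT_MSI_FLAGS] & HT_MSI_FLAGS_ENABLE)
```

A result of False on an HT-based platform would match the situation the commit describes, where the capability exists but message interrupts are not forwarded until the bit is set.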
Comment 5 Francois Romieu 2008-09-10 13:33:27 UTC
Matt, did you reproduce the problem with a recent (2.6.27-rc) kernel?

-- 
Ueimor