Bug 4049

Summary: Problems with built-in RealTek 8169 Ethernet
Product: ACPI
Reporter: Richard Dawe (rich)
Component: Config-Interrupts
Assignee: acpi_config-interrupts
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: Hugo.Vandeputte, mgg+kernelorg, romieu
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.9,2.6.10,2.6.11-rc1
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: /proc/interrupts with 2.6.11-rc1
dmesg with 2.6.11-rc1
lspci -vv with 2.6.11-rc1
dmidecode with 2.6.11-rc1
acpidmp with 2.6.11-rc1
Fixed DSDT
/proc/interrupts with 2.6.11-rc1 and acpi=noirq
dmesg with 2.6.11-rc1 and acpi=noirq
Patch to fix r8169's slow performance

Description Richard Dawe 2005-01-16 08:49:44 UTC
Distribution: Fedora Core 3
Hardware Environment: Acer 1524WLMi
Software Environment: Standard install, with a few rpms patched to work on x86_64
Problem Description:

I have an Acer 1524WLMi laptop, which is an x86_64-based laptop using a VIA chipset
and a RealTek 8169 Gigabit Ethernet chip. The problem I have is that the
networking performance is terrible.

When I connect it to my Netgear 10/100 hub, it autodetects 100Mbps full-duplex.
But there are so many errors that TCP backs off massively, resulting in an
effective data rate of 4Kbps. I can get the data rate up by forcing 10Mbps, but
that's hardly ideal. Also, when I connect it to a switch at work, I have to
force 100Mbps half-duplex and set the MTU to 576 bytes.
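
As a rough illustration of the workaround described above (the report does not give the exact commands used), the forced link settings can be expressed as `ethtool` and `ip` invocations. The Python sketch below only builds the command lines and does not run them. The interface name `eth0` and the helper name `link_commands` are assumptions made for the example, and actually running the commands needs root.

```python
import shlex


def link_commands(iface, speed, duplex, mtu=None):
    """Build (but do not run) commands that force link speed/duplex and MTU.

    iface is an assumed interface name; speed is in Mb/s; duplex is
    "half" or "full". Autonegotiation is switched off so the forced
    values stick.
    """
    if duplex not in ("half", "full"):
        raise ValueError("duplex must be 'half' or 'full'")
    cmds = [["ethtool", "-s", iface, "speed", str(speed),
             "duplex", duplex, "autoneg", "off"]]
    if mtu is not None:
        cmds.append(["ip", "link", "set", "dev", iface, "mtu", str(mtu)])
    return cmds


if __name__ == "__main__":
    # Settings needed on the work switch: 100Mbps half-duplex, MTU 576.
    for argv in link_commands("eth0", 100, "half", mtu=576):
        print(shlex.join(argv))
```

Each returned list can be passed to `subprocess.run` unchanged. Forcing these values only hides the symptom; it does not address whatever is corrupting traffic at the autodetected rate.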

I've tried 2.6.9, various 2.6.10 (2.6.10-rc1, -rc2, -rc3, -rc2-mm2, -rc3-mm1)
and now 2.6.11-rc1. They all behave the same.

I wrote to the Linux netdev people and they suggested that I might need to fix
up my ACPI DSDT:

http://oss.sgi.com/projects/netdev/archive/2004-11/msg00764.html

There were a couple of warnings and an error, when I tried to recompile it with
iasl. I fixed them, but it made no difference. 

If I boot with pci=noacpi, the 8169 gets assigned an IRQ of 0 and doesn't work.
If I boot with acpi=off, I see messages about "agpgart" and then the box hangs
solidly - I have to reboot.

There are a couple of reasons I think this may be an ACPI problem:

* I see there are various PCI quirks in the kernel sources for VIA boxes, so I'm
wondering if there is an extra quirk this box needs.

* The Ethernet is sharing an IRQ with the sound chip and (with 2.6.11-rc1) the
modem chip. I previously built a 2.6.10 kernel with no sound drivers (and there
was no modem driver either), but this did not solve the problem.

* One of the PCI devices (the IDE controller) gets assigned an IRQ of 0, because
there is no GSI for it, according to the dmesg output. (I have no idea what that
message means, by the way.) This may be bogus, since it later assigns an IRQ for
the primary and secondary controllers.

Steps to reproduce:

Just try to use the built-in Ethernet on an Acer 1524WLMi.
Comment 1 Richard Dawe 2005-01-16 08:50:39 UTC
Created attachment 4414 [details]
/proc/interrupts with 2.6.11-rc1
Comment 2 Richard Dawe 2005-01-16 08:51:13 UTC
Created attachment 4415 [details]
dmesg with 2.6.11-rc1
Comment 3 Richard Dawe 2005-01-16 08:51:52 UTC
Created attachment 4416 [details]
lspci -vv with 2.6.11-rc1
Comment 4 Richard Dawe 2005-01-16 08:52:49 UTC
Created attachment 4417 [details]
dmidecode with 2.6.11-rc1

I have no idea how accurate this is. For instance, I'd like someone to tell me
where the two COM ports are. They're not physically visible. ;)
Comment 5 Richard Dawe 2005-01-16 08:53:19 UTC
Created attachment 4418 [details]
acpidmp with 2.6.11-rc1
Comment 6 Richard Dawe 2005-01-16 08:54:55 UTC
I have a page describing how well the Acer 1524WLMi works with Linux here:

http://homepages.nildram.co.uk/~phekda/richdawe/fedora/FC3/acer-1542wlmi.html
Comment 7 Richard Dawe 2005-01-16 08:56:11 UTC
Created attachment 4419 [details]
Fixed DSDT
Comment 8 Shaohua 2005-01-16 18:46:22 UTC
>One of the PCI devices (the IDE controller) gets assigned an IRQ of 0
It doesn't matter. It's an IDE device, so no GSI is ok.
Did any PCI device work in the system, such as the sound card, USB device?
Comment 9 Richard Dawe 2005-01-17 11:49:41 UTC
Yeah, everything else seems to work. The built-in sound works fine. I use a USB
mouse from time to time. I have an Atheros-based wireless PC Card that seems to
work OK. It's just the Gig Ethernet that's broken.

It has built-in wireless too, but that's based on a Cisco chipset that requires
ndiswrapper. ndiswrapper doesn't work under x86_64. :(
Comment 10 Richard Dawe 2005-01-18 15:11:44 UTC
Another user of this laptop reported that booting with acpi=noirq worked for
him. Sadly it doesn't work for me. I've asked for more details, like whether it
has another OS installed and what the kernel version & Linux distro is.

Is there any way of forcing the IRQ for the Ethernet chip (e.g.: hack the DSDT
or kernel) or is that hard-wired in the hardware?
Comment 11 Shaohua 2005-01-18 17:04:35 UTC
The BIOS told us the irq should be 22 if you use ACPI. It would be great to 
see the IRQ assignments in the success case (with acpi=noirq).
Comment 12 Mike Grant 2005-01-20 12:52:20 UTC
I recently got a very similar laptop (1522WLMi), which differs only in the CPU
speed as far as I know.  My r8169 works fine in most situations, but I've been
seeing a number of problems that appear to be related to ACPI/IRQ routing.  I
swapped out the non-Linux internal wireless card for an ipw2100, which led me
to investigate further while trying to get it to work.  I've been mailing
Richard Dawe regarding this laptop and he suggested attaching to this bug. 
I've put a bunch of dmesgs, etc relating to the tests above on
http://nobodymuch.org/laptop/acpi/ - didn't want to make lots of attachments
until I was sure this was related/helpful. 

When I boot the machine with the latest FC3 kernel (x86-32), it goes through the
normal allocation processes, then catches a bad IRQ and disables the interrupt,
breaking things.  Attempting to load the ipw2100
driver causes another bad IRQ and more failure.  Loading various combinations of
modules that seem to use the same IRQ always seems to end up with a bad IRQ error.

If I load with pci=noacpi or acpi=noirq, I get some warnings about routing
conflicts, but everything seems to work fine.  It looks like the IRQ assignments
were:
(success, pci=noacpi)  
  9:          0          XT-PIC  acpi
 10:        109          XT-PIC  uhci_hcd, uhci_hcd, yenta, eth1(ipw2100)
 11:         77          XT-PIC  VIA8233, ehci_hcd, uhci_hcd, yenta, ohci1394, eth0(r8169)
(failure, no additional parameters)
  9:          0          XT-PIC  acpi, VIA8233, uhci_hcd, yenta
 10:     100000          XT-PIC  uhci_hcd, ohci1394
 11:     100000          XT-PIC  ehci_hcd, uhci_hcd, yenta
Reading the dmesg, I think ipw2100 and r8169 were both allocated IRQ11.
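
To make the difference between the two listings above easier to see, here is a small Python sketch that parses them and reports which IRQ each device ended up on. The listings are copied from this comment. The function names are made up for illustration, and treating a count of exactly 100000 as "the kernel gave up on this IRQ" is an inference from the bad IRQ messages described above, not something the listing itself states.

```python
SUCCESS = """\
  9:          0          XT-PIC  acpi
 10:        109          XT-PIC  uhci_hcd, uhci_hcd, yenta, eth1(ipw2100)
 11:         77          XT-PIC  VIA8233, ehci_hcd, uhci_hcd, yenta, ohci1394, eth0(r8169)
"""

FAILURE = """\
  9:          0          XT-PIC  acpi, VIA8233, uhci_hcd, yenta
 10:     100000          XT-PIC  uhci_hcd, ohci1394
 11:     100000          XT-PIC  ehci_hcd, uhci_hcd, yenta
"""

# Assumption: a count stuck at exactly this value marks a disabled IRQ.
STUCK_COUNT = 100000


def parse_interrupts(text):
    """Return {irq: (count, [devices])} for single-CPU /proc/interrupts lines."""
    table = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        if not sep or not irq.strip().isdigit():
            continue  # skip headers and named rows such as NMI/ERR
        fields = rest.split(None, 2)  # count, controller, device list
        if len(fields) < 3:
            continue
        count, _controller, devices = fields
        names = [d.strip() for d in devices.split(",") if d.strip()]
        table[int(irq)] = (int(count), names)
    return table


def irqs_by_device(table):
    """Invert the table: {device: sorted list of IRQs it is registered on}."""
    result = {}
    for irq, (_count, names) in table.items():
        for name in names:
            result.setdefault(name, set()).add(irq)
    return {name: sorted(irqs) for name, irqs in result.items()}


def stuck_irqs(table):
    """IRQs whose count equals STUCK_COUNT."""
    return sorted(irq for irq, (count, _names) in table.items()
                  if count == STUCK_COUNT)


if __name__ == "__main__":
    good, bad = parse_interrupts(SUCCESS), parse_interrupts(FAILURE)
    print("stuck IRQs in failing boot:", stuck_irqs(bad))
    good_map, bad_map = irqs_by_device(good), irqs_by_device(bad)
    for dev in sorted(set(good_map) | set(bad_map)):
        print(f"{dev:15} {good_map.get(dev, [])} -> {bad_map.get(dev, [])}")
```

Run on these two listings, the comparison shows VIA8233 moving from IRQ 11 to IRQ 9 and neither network driver appearing in the failing boot, which fits the drivers failing to load after the bad IRQ.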

With an x86-64 FC3 install and kernel (which I believe Richard is using),
similar things happen, but the pci=noacpi and acpi=noirq options don't help -
instead they cause it to apparently assign IRQ 0.  It looks like the x86-64
kernel forces use of APIC, which may have something to do with it.

2.6.11-rc1 gives the same results on x86-32, but I've not built it for x86-64 yet.

Finally, it all works fine in Win/XP Home, which is also on the laptop at the
minute (assigning the ipw2100 to IRQ21 & the r8169 to IRQ22).

I have FC3-32, FC3-64 & Win/XP currently on the laptop, so can test with any of
those systems.
Comment 13 Richard Dawe 2005-01-22 05:03:03 UTC
Created attachment 4439 [details]
/proc/interrupts with 2.6.11-rc1 and acpi=noirq
Comment 14 Richard Dawe 2005-01-22 05:04:34 UTC
Created attachment 4440 [details]
dmesg with 2.6.11-rc1 and acpi=noirq
Comment 15 Richard Dawe 2005-01-22 05:07:06 UTC
When I boot 2.6.11-rc1 with acpi=noirq, I have to switch init into interactive
mode (press "I" just after the banner) and skip starting networking. If I try to
bring networking up, the box hangs. Presumably this is because the 8169 has been
assigned IRQ 0.
Comment 16 Richard Dawe 2005-01-22 07:12:25 UTC
Created attachment 4441 [details]
Patch to fix r8169's slow performance

The RealTek 8169 in the Acer laptop must be an embedded version of the 8169.
These have embedded PHYs which are different from the ones used normally with
the 8169, according to the source of the FreeBSD driver.

Looking at the latest driver from RealTek, I see that the PHY version for the
chip I have in my laptop has some specific initialisation code. I ported that
to the driver in 2.6.11-rc1 and, voila, I can now get ~7.1Mbps.

I'll work with the Linux driver's maintainer to get this into the kernel.

/me is very happy!

PS: Sorry I filed this in the ACPI category. At the time I thought it was an
ACPI issue.
Comment 17 Francois Romieu 2005-02-26 16:11:10 UTC
Mike, can you give 2.6.11-rc5 x86-32 a try and report the result ?

--
Ueimor
Comment 18 Mike Grant 2005-03-01 11:39:43 UTC
I tried 2.6.11-rc5 and have similar results - a normal boot with ACPI seems to
have problems with IRQ routing resulting in a bad_irq and a boot with pci=noacpi
gives a working machine.

I've put some more dmesg output, etc up for 2.6.11-rc5 at
http://nobodymuch.org/laptop/acpi/x86-32/ again.  I also put a diff of the
dmesgs from a working (pci=noacpi) boot and non-working (ACPI) boot, which shows
different IRQs being assigned, if that's relevant.

I think this problem is probably unrelated to the r8169 (I see Rich's patch
seems to have gone in) - seems to be to do with ACPI vs PCI IRQ routing. 
Perhaps I should open a new bug and attach files there?
Comment 19 Mike Grant 2005-06-29 14:45:23 UTC
I put FC4-64 on the machine to see if that helped and it all now seems to be
working ok in 64 bit mode (no problems observed with using both network
devices/other hardware).  The kernel used was 2.6.11-1.1369_FC4.

I do see a machine check event or two, but it doesn't seem to be related.  Also,
"pci=noacpi" mode doesn't work (though I guess that's not surprising either).

If anyone would like further information, I'm happy to provide it, otherwise
it's probably safe to close this bug :)
Comment 20 Richard Dawe 2005-07-01 00:48:14 UTC
This was fixed in 2.6.11.