Bug 955 - Nvidia Nforce2 interrupt handling problems
Summary: Nvidia Nforce2 interrupt handling problems
Status: REJECTED DUPLICATE of bug 10
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: i386 Linux
Importance: P2 normal
Assignee: Andy Grover
URL:
Keywords:
Duplicates: 929
Depends on:
Blocks:
 
Reported: 2003-07-17 22:28 UTC by Andy Dustman
Modified: 2003-07-18 10:55 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.0-test1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
boot test 1 (16.70 KB, text/plain), 2003-07-17 22:34 UTC, Andy Dustman
boot test 2 (15.83 KB, text/plain), 2003-07-17 22:39 UTC, Andy Dustman
boot test 3 (14.78 KB, text/plain), 2003-07-17 22:42 UTC, Andy Dustman
boot test 4 (10.66 KB, text/plain), 2003-07-17 22:47 UTC, Andy Dustman
boot test 5 (10.46 KB, text/plain), 2003-07-17 22:48 UTC, Andy Dustman
boot test 6 (11.80 KB, text/plain), 2003-07-17 22:51 UTC, Andy Dustman
boot test 7 (9.84 KB, text/plain), 2003-07-17 22:53 UTC, Andy Dustman
boot test 8 (10.10 KB, text/plain), 2003-07-17 22:54 UTC, Andy Dustman
.config (24.46 KB, text/plain), 2003-07-17 22:58 UTC, Andy Dustman

Description Andy Dustman 2003-07-17 22:28:31 UTC
Distribution: Gentoo

Hardware Environment: Athlon-XP, Nvidia Nforce2 chipset, Radeon 8500 (r200) AGP,
3com 3c920

Software Environment:
AWARD BIOS
gcc version 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r1, propolice)
GNU ld version 2.14.90.0.2 20030515 <-- binutils
XFree-4.3.0

Problem Description:

PCI interrupt assignments get weird. With certain boot parameters, you can get
from mostly non-functional to mostly functional. dmesg output from several
combinations of parameters will be attached, but in a nutshell, with local APIC,
IO-APIC, and ACPI enabled in the kernel, a lot of strange things happen, even if
you use noapic and pci=noacpi. Disabling the APIC helps a little, but to be
functional at all, pci=noacpi must be used.

Steps to reproduce:

Will be detailed in the following attachments.
Comment 1 Andy Dustman 2003-07-17 22:34:15 UTC
Created attachment 558
boot test 1

In this test, local APIC, IO-APIC, and ACPI are all enabled in the kernel. ACPI
reports IRQs 20-22 are disabled, but assigns devices there anyway. This later
produces some "irq xx: nobody cared!" messages and backtraces for the ALSA
intel8x0 driver (for Nvidia Nforce2 sound) and radeon. In addition, the 3c59x
doesn't work, though this is not obvious from this dmesg output.
Comment 2 Andy Dustman 2003-07-17 22:39:11 UTC
Created attachment 559
boot test 2

This uses the same kernel as before, but adds pci=noacpi. This kernel runs for
a while, i.e. sound and network and DRI on the radeon all work, but after a few
minutes of running something that exercises all three (e.g. Enemy Territory),
the network connection is lost, and the kernel reports:

NETDEV WATCHDOG: eth0: transmit timed out
eth0: transmit timed out, tx_status 00 status e601.
  diagnostics: net 0cc0 media 8080 dma 0000003a fifo 0000
eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
  Flags; bus-master 1, dirty 5780(4) current 5780(4)
  Transmit list 00000000 vs. de955480.

Note that the intel8x0 audio and 3c59x both share the same interrupt.
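
One way to confirm which devices share an interrupt line is to scan /proc/interrupts for rows that list more than one device. The following is only a rough sketch: the sample text is invented for illustration (it is not taken from the attached dmesg output), and it assumes the usual layout of an IRQ number, per-CPU counters, a controller name, and a comma-separated device list.

```python
import re

# Invented sample, NOT copied from this bug's attachments.
SAMPLE = """\
           CPU0
  0:     123456          XT-PIC  timer
  5:       4321          XT-PIC  NVidia nForce2, eth0
 11:        987          XT-PIC  radeon@PCI:3:0:0
"""


def shared_irqs(interrupts_text):
    """Return {irq: "controller device, device"} for rows naming several devices."""
    shared = {}
    for line in interrupts_text.splitlines():
        m = re.match(r"\s*(\d+):\s+(.*)$", line)
        if not m:
            continue  # header row, or non-numeric rows such as NMI/LOC/ERR
        irq, rest = int(m.group(1)), m.group(2)
        tokens = rest.split()
        # Drop the leading per-CPU interrupt counters.
        while tokens and tokens[0].isdigit():
            tokens.pop(0)
        tail = " ".join(tokens)
        # Devices sharing a line are separated by ", " in the last column.
        if ", " in tail:
            shared[irq] = tail
    return shared


if __name__ == "__main__":
    # On a live system, read the real file instead of SAMPLE:
    # with open("/proc/interrupts") as f: text = f.read()
    for irq, tail in shared_irqs(SAMPLE).items():
        print(f"IRQ {irq} is shared: {tail}")
```

The controller name is left attached to the first device because device names may contain spaces, so splitting them apart reliably would need more assumptions about the format.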
Comment 3 Andy Dustman 2003-07-17 22:42:21 UTC
Created attachment 560
boot test 3

In this test, noapic is used in an attempt to turn off the APIC. This doesn't
work. It causes a spew of this message:

APIC error on CPU0: 40(40)
Comment 4 Andy Dustman 2003-07-17 22:47:34 UTC
Created attachment 561
boot test 4

This test uses noapic and pci=noacpi in a further attempt to disable the ACPI.
Note that some of the APIC initialization messages occur before the kernel
command line is printed, perhaps before it is parsed. The system locks up hard
after a minute or two of Enemy Territory, i.e. SysRq is unresponsive.
Comment 5 Andy Dustman 2003-07-17 22:48:41 UTC
Created attachment 562
boot test 5

noapic pci=noacpi mem=nopentium. Same results as #4.
Comment 6 Andy Dustman 2003-07-17 22:51:42 UTC
Created attachment 563
boot test 6

At this point, I decide that the only way to really disable the APIC is to turn
it off in the kernel. Build, install, reboot with no additional parameters.
Once again the 3c59x is broken (transmit timed out, IRQ blocked by another
device?).
Comment 7 Andy Dustman 2003-07-17 22:53:18 UTC
Created attachment 564
boot test 7

As with the previous test but adding pci=noacpi. System locks hard after a few
minutes of game play.
Comment 8 Andy Dustman 2003-07-17 22:54:48 UTC
Created attachment 565
boot test 8

Like previous test, but using mem=nopentium pci=noacpi. Another lock-up after a
brief moment of game play. I'm out of ideas at this point.
Comment 9 Andy Dustman 2003-07-17 22:58:51 UTC
Created attachment 566
.config

Kernel config used for tests 6-8. The one for 1-5 is the same except with APIC
and IO-APIC enabled.
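
To check that two kernel configs really differ only in the APIC options, the two .config files can be diffed option by option. This is a minimal sketch: the fragments below are hypothetical and not copied from the attached .config, and the option names (CONFIG_X86_LOCAL_APIC, CONFIG_X86_IO_APIC, CONFIG_ACPI) are the ones I would expect in an i386 config of this era rather than names confirmed from the attachment.

```python
import re

# Hypothetical fragments, NOT copied from the attached .config.
WITH_APIC = """\
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_ACPI=y
"""
WITHOUT_APIC = """\
# CONFIG_X86_LOCAL_APIC is not set
# CONFIG_X86_IO_APIC is not set
CONFIG_ACPI=y
"""


def parse_config(text):
    """Map option name to value; '# CONFIG_X is not set' is recorded as 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value
    return opts


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    An option missing from one file is treated as 'n', since derived
    symbols can simply disappear when the option they depend on is off.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k, "n"), b.get(k, "n"))
            for k in keys if a.get(k, "n") != b.get(k, "n")}


if __name__ == "__main__":
    changes = diff_configs(parse_config(WITH_APIC), parse_config(WITHOUT_APIC))
    for name, (old, new) in changes.items():
        print(f"{name}: {old} -> {new}")
```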

Additional note: With kernel 2.4.21-pre6 (with patches, it's Gentoo gs-sources)
and ATI's binary driver, it works pretty well, with just some video
artifacts/rendering errors, and the occasional soft lockup (sometimes it frees
up on its own, other times I have to kill, sometimes with sig 9).
Comment 10 Alistair Strachan 2003-07-18 10:04:52 UTC
I can verify at least part of this bug report. I noted it on LKML in a message titled
"APIC & ACPI on EPoX 8RDA+ (nForce 2)", to which I got no response:
 
http://marc.theaimsgroup.com/?l=linux-kernel&m=105802788025190&w=2 
 
However, this machine is completely stable with a recent BIOS and the proprietary NVIDIA 
drivers. I haven't played ET yet, but games like Quake3, UT2003 and Tribes2 all run fine. 
 
The symptoms are identical, except that with APIC the kernel simply doesn't boot at all. With 
ACPI & pci=noacpi everything seems to work fine. Otherwise, devices such as my USB, 
IEEE1394 and USB 2.0 onboard do not get assigned an IRQ and fail to initialise. 
 
Andy, make sure your board's BIOS is the most recent available and try again. I did not find
that ACPI causes any problems except when it is given control of PCI routing. "nForce 2
chipset" is too vague; who retails your board? If it's EPoX, maybe this board should be
blacklisted until this issue is resolved.
Comment 11 Andy Grover 2003-07-18 10:52:05 UTC
I'm pretty sure this is an ACPI bug.
Comment 12 Greg Kroah-Hartman 2003-07-18 10:53:09 UTC

*** This bug has been marked as a duplicate of 10 ***
Comment 13 Greg Kroah-Hartman 2003-07-18 10:55:26 UTC
*** Bug 929 has been marked as a duplicate of this bug. ***
