Bug 6082

Summary: when using the drm r300 module, all the devices that share the same interrupt stop working
Product: Drivers Reporter: Diego Gonz (diego)
Component: Video(DRI - non Intel) Assignee: Dave Airlie (airlied)
Status: RESOLVED CODE_FIX    
Severity: high CC: acpi-bugzilla, diegocg, vsu
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.15.1 Subsystem:
Regression: --- Bisected commit-id:
Attachments: output of kernel.logs for this kernel (running with the radeon module)
patch to invoke pci_enable_device() when using old PCI detect

Description Diego Gonz 2006-02-15 18:13:39 UTC
Most recent kernel where this bug did not occur:

2.6.15.1 is affected; I tried some previous versions and it also happens there.

Distribution:

Debian; I compiled the kernel, Mesa and XOrg myself.

Hardware Environment:

Dell Latitude D610.
Pentium Centrino 2 GHz, 1 GB RAM
wired network interface: tg3
wireless network interface: ipw2200
video card: ATI M300

Output of lspci:

0000:00:00.0 Host bridge: Intel Corporation Mobile 915GM/PM/GMS/910GML Express Processor to DRAM Controller (rev 03)
0000:00:01.0 PCI bridge: Intel Corporation Mobile 915GM/PM Express PCI Express Root Port (rev 03)
0000:00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
0000:00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
0000:00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
0000:00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
0000:00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
0000:00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
0000:00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev d3)
0000:00:1e.2 Multimedia audio controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) AC'97 Audio Controller (rev 03)
0000:00:1e.3 Modem: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) AC'97 Modem Controller (rev 03)
0000:00:1f.0 ISA bridge: Intel Corporation 82801FBM (ICH6M) LPC Interface Bridge (rev 03)
0000:00:1f.2 IDE interface: Intel Corporation 82801FBM (ICH6M) SATA Controller (rev 03)
0000:00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
0000:01:00.0 VGA compatible controller: ATI Technologies Inc M22 [Radeon Mobility M300]
0000:02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 01)
0000:03:01.0 CardBus bridge: Texas Instruments PCI6515 Cardbus Controller
0000:03:01.5 Communication controller: Texas Instruments PCI6515 SmartCard Controller
0000:03:03.0 Network controller: Intel Corporation PRO/Wireless 2915ABG MiniPCI Adapter (rev 05)


Software Environment:

XOrg CVS HEAD
MESA CVS HEAD
Linux 2.6.15.1

Problem Description:

When I run an X server with the glx module so that it can use direct rendering,
the wireless network interface stops working. The wired network interface works OK.

I get the following output on the screen: Disabling IRQ #169

I get this in the logs:

Jan 30 01:17:59 calisto kernel: irq 169: nobody cared (try booting with the "irqpoll" option)
Jan 30 01:17:59 calisto kernel:  [__report_bad_irq+43/105] __report_bad_irq+0x2b/0x69
Jan 30 01:17:59 calisto kernel:  [note_interrupt+107/143] note_interrupt+0x6b/0x8f
Jan 30 01:17:59 calisto kernel:  [__do_IRQ+104/143] __do_IRQ+0x68/0x8f
Jan 30 01:17:59 calisto kernel:  [do_IRQ+29/40] do_IRQ+0x1d/0x28
Jan 30 01:17:59 calisto kernel:  [common_interrupt+26/32] common_interrupt+0x1a/0x20
Jan 30 01:17:59 calisto kernel:  [pg0+949345105/1069827072] acpi_processor_idle+0x153/0x2bd [processor]
Jan 30 01:17:59 calisto kernel:  [cpu_idle+63/87] cpu_idle+0x3f/0x57
Jan 30 01:17:59 calisto kernel:  [start_kernel+315/317] start_kernel+0x13b/0x13d
Jan 30 01:17:59 calisto kernel: handlers:
Jan 30 01:17:59 calisto kernel: [usb_hcd_irq+0/79] (usb_hcd_irq+0x0/0x4f)
Jan 30 01:17:59 calisto kernel: [usb_hcd_irq+0/79] (usb_hcd_irq+0x0/0x4f)
Jan 30 01:17:59 calisto kernel: [pg0+947762531/1069827072] (snd_intel8x0_interrupt+0x0/0x1b7 [snd_intel8x0])
Jan 30 01:17:59 calisto kernel: [pg0+944433224/1069827072] (tg3_interrupt_tagged+0x0/0xba [tg3])
Jan 30 01:17:59 calisto kernel: Disabling IRQ #169
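
When triaging a log like the one above, it can help to pull out the disabled IRQ number and the handlers the kernel lists for it. The following is a minimal Python sketch, not a kernel tool: the function name `summarize_bad_irq` and both regular expressions are assumptions based only on the message format shown in this report, and other kernel versions may print the trace differently.

```python
import re

# Patterns derived from the log lines in this report (an assumption,
# not a documented kernel format).
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_bad_irq(log_text):
    """Return (irq_number, [handler names]) for the last 'nobody cared' report."""
    irq = None
    handlers = []
    in_handlers = False
    for line in log_text.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            # A new report starts: reset the handler list.
            irq = int(match.group(1))
            handlers = []
            in_handlers = False
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            match = HANDLER.search(line)
            if match:
                handlers.append(match.group(1))
            else:
                # First non-handler line ends the handler list.
                in_handlers = False
    return irq, handlers


SAMPLE_LOG = """\
irq 169: nobody cared (try booting with the "irqpoll" option)
 [__do_IRQ+104/143] __do_IRQ+0x68/0x8f
handlers:
[usb_hcd_irq+0/79] (usb_hcd_irq+0x0/0x4f)
[usb_hcd_irq+0/79] (usb_hcd_irq+0x0/0x4f)
[pg0+947762531/1069827072] (snd_intel8x0_interrupt+0x0/0x1b7 [snd_intel8x0])
[pg0+944433224/1069827072] (tg3_interrupt_tagged+0x0/0xba [tg3])
Disabling IRQ #169
"""

irq, handlers = summarize_bad_irq(SAMPLE_LOG)
print(irq, handlers)
```

Because the patterns are matched with `search`, the syslog prefix ("Jan 30 01:17:59 calisto kernel:") does not need to be stripped first.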

I compiled the kernel with and without CONFIG_PREEMPT_NONE=y; the result is the
same.

Steps to reproduce: Just run an Xserver with the dri and glx modules
Comment 1 Diego Calleja 2006-02-16 03:50:37 UTC
I think people will need the dmesg and /proc/interrupts output from your
machine.
Comment 2 Diego Gonz 2006-02-16 06:33:33 UTC
/proc/interrupts

           CPU0
  0:    1061648    IO-APIC-edge  timer
  1:      15411    IO-APIC-edge  i8042
  8:          4    IO-APIC-edge  rtc
  9:          2   IO-APIC-level  acpi
 12:        114    IO-APIC-edge  i8042
 14:      24644    IO-APIC-edge  libata
 15:      10583    IO-APIC-edge  libata
169:          2   IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb2, Intel ICH6, eth0
177:          1   IO-APIC-level  uhci_hcd:usb5, yenta
209:     171405   IO-APIC-level  uhci_hcd:usb3, Intel ICH6 Modem, ipw2200
217:      72378   IO-APIC-level  uhci_hcd:usb4
NMI:          0
LOC:     125846
ERR:          0
MIS:          0
Comment 3 Diego Gonz 2006-02-16 06:51:21 UTC
With the DRM module loaded I get this /proc/interrupts:

           CPU0
  0:      98559    IO-APIC-edge  timer
  1:       1939    IO-APIC-edge  i8042
  8:          4    IO-APIC-edge  rtc
  9:          2   IO-APIC-level  acpi
 12:        114    IO-APIC-edge  i8042
 14:      10739    IO-APIC-edge  libata
 15:        967    IO-APIC-edge  libata
169:      19046   IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb2, Intel ICH6, radeon@pci:0000:01:00.0, eth0
177:          1   IO-APIC-level  uhci_hcd:usb5, yenta
209:      16065   IO-APIC-level  uhci_hcd:usb3, ipw2200, Intel ICH6 Modem
217:       2188   IO-APIC-level  uhci_hcd:usb4
NMI:          0
LOC:      21519
ERR:          0
MIS:          0
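
To see at a glance which devices share an interrupt line with the radeon device in output like the above, the table can be parsed mechanically. This is a minimal Python sketch under stated assumptions: the helper names `parse_interrupts` and `sharing_with` are hypothetical, and the parser assumes a single CPU column as in this report.

```python
def parse_interrupts(text):
    """Map IRQ label -> (count, [handler names]) for a single-CPU table."""
    table = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        fields = rest.split(None, 2)
        # Device rows look like "<count> <controller> <name>, <name>, ...".
        # Header and summary rows (CPU0, NMI, ERR, ...) have fewer fields.
        if not sep or len(fields) < 3 or not fields[0].isdigit():
            continue
        handlers = [name.strip() for name in fields[2].split(",")]
        table[label.strip()] = (int(fields[0]), handlers)
    return table


def sharing_with(table, needle):
    """Return (irq, other handlers) for the first IRQ whose handler list
    contains a name with the given substring."""
    for irq, (_count, handlers) in table.items():
        if any(needle in name for name in handlers):
            return irq, [name for name in handlers if needle not in name]
    return None, []


# Rows copied from the output above, with the wrapped IRQ 169 row rejoined.
SAMPLE = """\
           CPU0
  9:          2   IO-APIC-level  acpi
169:      19046   IO-APIC-level  ehci_hcd:usb1, uhci_hcd:usb2, Intel ICH6, radeon@pci:0000:01:00.0, eth0
209:      16065   IO-APIC-level  uhci_hcd:usb3, ipw2200, Intel ICH6 Modem
NMI:          0
"""

irq, others = sharing_with(parse_interrupts(SAMPLE), "radeon")
print(irq, others)
```

On a live system the same functions could be fed the contents of /proc/interrupts, provided any wrapped rows are rejoined and the machine has a single CPU column.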
Comment 4 Diego Gonz 2006-02-16 06:57:43 UTC
Sorry, I made a mistake: it is the wired interface that doesn't work.
Actually it seems that everything that shares the same interrupt is not
working (the sound doesn't work either).

I'm changing the summary of the bug to indicate this.
Comment 5 Diego Gonz 2006-02-16 07:00:00 UTC
Created attachment 7367 [details]
output of kernel.logs for this kernel (running with the radeon module)
Comment 6 Michal Pytasz 2006-02-16 08:16:32 UTC
Hi,

I have the same problem with r300 and a Realtek r8169 (onboard) network card.
Here is dmesg after X has started (with the drm and radeon modules probed):
     
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.     
agpgart: X tried to set rate=x12. Setting to AGP3 x8 mode.     
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode     
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode     
[drm] Loading R300 Microcode     
irq 16: nobody cared (try booting with the "irqpoll" option)     
     
Call Trace: <IRQ> <ffffffff8014ec70>{__report_bad_irq+48}      
<ffffffff8014ef0e>{note_interrupt+593}     
       <ffffffff8014e74c>{__do_IRQ+206} <ffffffff80110eec>{do_IRQ+45}     
       <ffffffff8010ed1c>{ret_from_intr+0}  <EOI>      
<ffffffff803ee07b>{thread_return+0}     
       <ffffffff8010cb47>{default_idle+0} <ffffffff8010cb7b>{default_idle+52}     
       <ffffffff8010ccc5>{cpu_idle+63} <ffffffff8054a78a>{start_kernel+465}     
       <ffffffff8054a24c>{_sinittext+588}     
handlers:     
[<ffffffff80310ba0>] (rtl8169_interrupt+0x0/0x2af)     
Disabling IRQ #16     
     
What helps is pci=routeirq as a boot parameter. The problem exists only with the
CVS drm module; the one from the kernel does not cause it. The system is amd64
running natively, with a 2.6.15 kernel and X11R7. gcc is 3.4.5.
   
Comment 7 Diego Calleja 2006-02-16 09:04:11 UTC
I find it very weird that a system uses IRQ 169; CC'ing the ACPI guys, who may
know more about this.
Comment 8 Sergey Vlasov 2006-02-17 02:55:23 UTC
Created attachment 7381 [details]
patch to invoke pci_enable_device() when using old PCI detect

If pci=routeirq helps, this means that the driver did not call
pci_enable_device(), and is therefore using a bogus IRQ number (recent kernels
perform IRQ routing in pci_enable_device(), and pdev->irq does not have the
correct value until that function is called).

The drm core currently does not call pci_enable_device() when a framebuffer
driver (including vesafb) is active.  If a proper framebuffer driver (e.g.,
radeonfb) is used, drm still works fine, because the framebuffer driver calls
pci_enable_device() itself, and drm gets proper pdev->irq.  However, when drm
is used together with vesafb, pci_enable_device() is not performed, and the IRQ
problem shows up.

This patch makes the drm core call pci_enable_device() even if a framebuffer
driver was loaded before; this fixes the IRQ problem.

---

drm: fix IRQ problem when used together with vesafb

Even if drm uses old PCI probing (because a framebuffer driver is active), it
still needs to call pci_enable_device() to get correct IRQ.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Comment 9 Diego Gonz 2006-02-17 05:12:36 UTC
I just tried using pci=routeirq and it works.
Comment 10 Dave Airlie 2006-02-17 20:22:50 UTC
This isn't a kernel bug if you are using drivers from DRM CVS. If you are using
DRM CVS drivers, please close this bug; I've checked an always-enable-device
change into DRM CVS.
Comment 11 Boris Peterbarg 2006-02-18 01:50:03 UTC
Hi, I'm having a similar issue, even with latest drm, except my problem is with
a CDRW device. It gets disconnected each time a 3d program runs.

I've got Slackware current, kernel 2.6.15.4, XOrg 6.9 (both self-compiled),
latest drm and mesa.

Hardware Environment:
Athlon XP 2500+, 1 GB RAM
video card: ATI Radeon R350

lspci output:
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev c1)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev c1)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev c1)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev c1)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev c1)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev c1)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a4)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:02.0 USB Controller: nVidia Corporation nForce2 USB Controller (rev a4)
00:02.1 USB Controller: nVidia Corporation nForce2 USB Controller (rev a4)
00:02.2 USB Controller: nVidia Corporation nForce2 USB Controller (rev a4)
00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet Controller (rev a1)
00:05.0 Multimedia audio controller: nVidia Corporation nForce Audio Processing Unit (rev a2)
00:06.0 Multimedia audio controller: nVidia Corporation nForce2 AC97 Audio Controler (MCP) (rev a1)
00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:0d.0 FireWire (IEEE 1394): nVidia Corporation nForce2 FireWire (IEEE 1394) Controller (rev a3)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev c1)
01:0b.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
01:0c.0 RAID bus controller: Integrated Technology Express, Inc. IT/ITE8212 Dual channel ATA RAID controller (PCI version seems to be IT8212, embedded seems (rev 10)
01:0d.0 Mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
03:00.0 VGA compatible controller: ATI Technologies Inc Radeon R350 [Radeon 9800 Pro]
03:00.1 Display controller: ATI Technologies Inc Radeon R350 [Radeon 9800 Pro] (Secondary)

My /proc/interrupts with drm loaded:
           CPU0
  0:   74255552          XT-PIC  timer
  1:     110676          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:    7360005          XT-PIC  libata, ohci_hcd:usb2, radeon@pci:0000:03:00.0
  7:          3          XT-PIC  parport0
  8:          0          XT-PIC  rtc
  9:     323178          XT-PIC  acpi, ehci_hcd:usb1, eth0, NVidia nForce2
 11:   15840418          XT-PIC  ide2, ide3, ohci_hcd:usb3, eth1
 12:        102          XT-PIC  i8042
 14:     223958          XT-PIC  ide0
 15:      37171          XT-PIC  ide1
NMI:          0
ERR:          0

If I mount a cdrom, then start a 3d program, the CDRW read light starts flashing.
dmesg output (repeated a couple of times):
hda: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
hda: status error: error=0x7f { IllegalLengthIndication EndOfMedia AbortedCommand MediaChangeRequested LastFailedSense=0x07 }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: ATAPI reset complete
hda: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
hda: status error: error=0x7f { IllegalLengthIndication EndOfMedia AbortedCommand MediaChangeRequested LastFailedSense=0x07 }
ide: failed opcode was: unknown
hda: drive not ready for command
hda: ATAPI reset complete
hda: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
hda: status error: error=0x7f { IllegalLengthIndication EndOfMedia AbortedCommand MediaChangeRequested LastFailedSense=0x07 }
ide: failed opcode was: unknown
hda: drive not ready for command

and after it:
ide: failed opcode was: unknown
hda: drive not ready for command
hda: drive not ready for command
VFS: busy inodes on changed media.
VFS: busy inodes on changed media.

dmesg output when I exit the program and try to access the CDRW:
hda: lost interrupt
VFS: busy inodes on changed media.

and after a couple of seconds, the CDRW starts working normally again, until the
next time I activate a 3d program.
Comment 12 Diego Calleja 2006-08-03 13:21:56 UTC
Is this bug fixed? Shouldn't it be closed?

Boris, in your setup the video card is not sharing the IRQ with the IDE device,
and you are using an nvidia binary module as well, so it looks different from
this bug. If the problem persists (without the nvidia driver loaded) with the
latest stable kernel, please report the issue in a new bug.
Comment 13 Dave Airlie 2006-09-21 20:50:05 UTC
This has been fixed in git as far as I'm aware.