Bug 4921

Summary: AGP bridge causes bizarre failures after resume - incorrect aperture address?
Product: Drivers
Reporter: Matthew Garrett (mjg59-kernel)
Component: Video(AGP)
Assignee: Dave Jones (davej)
Status: RESOLVED CODE_FIX
Severity: normal
CC: airlied, akpm, davej
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.12
Subsystem:
Regression: ---
Bisected commit-id:

Description Matthew Garrett 2005-07-21 11:23:47 UTC
Distribution: Ubuntu
Hardware Environment: HP TC1105 with Intel 855PM chipset
Software Environment: N/A
Problem Description:

If intel-agp is loaded, then after a system suspend/resume cycle (using ACPI)
other PCI drivers fail to work correctly. Setting byte 51 of the PCI
configuration space of device 00:00.0 to 0 disables AGP but allows other
drivers to start working again.

After resume, loading the b44 driver with byte 51 set to 02 gives:

b44.c:v0.95 (Aug 3, 2004)
ACPI: PCI interrupt 0000:02:07.0[A] -> GSI 12 (level, low) -> IRQ 12
eth0: Broadcom 4400 10/100BaseT Ethernet f0:00:00:01:00:00

Setting byte 51 to 0, rmmodding b44 and then re-modprobing it gives:

b44.c:v0.95 (Aug 3, 2004)
ACPI: PCI interrupt 0000:02:07.0[A] -> GSI 12 (level, low) -> IRQ 12
eth0: Broadcom 4400 10/100BaseT Ethernet 00:0b:cd:ad:28:4d

Setting byte 51 back to 02 restores the original failure mode. Possibly
significant is that testgart reports the aperture base at 0xe2000000 before
suspend and 0xe0000000 after resume. An aperture at 0xe0000000 overlaps the
memory base of various other PCI devices. Could it just be that the aperture
base is not maintained over suspend/resume?
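
The overlap concern above can be checked with simple range arithmetic. The
following is only a sketch: the 32 MB aperture size and the 1 MB device window
at 0xe0000000 are hypothetical values chosen for illustration, not taken from
this machine, and addr_range / ranges_overlap are made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, start + size). Hypothetical helper type. */
struct addr_range {
    uint64_t start;
    uint64_t size;
};

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
    return a.start < b.start + b.size && b.start < a.start + a.size;
}
```

With a 32 MB aperture, ranges_overlap() would report no conflict against such a
device window for a base of 0xe2000000, and a conflict for a base of
0xe0000000.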

Comment 1 Andrew Morton 2005-07-21 19:35:18 UTC
Did any earlier versions of 2.6 handle this correctly?  If so, which versions?

Thanks.

Comment 2 Matthew Garrett 2005-07-21 19:42:06 UTC
I don't believe so, but I've found the problem. The resume code writes APBASE
first; however, APBASE can't be set to a value that conflicts with APSIZE.
APSIZE is then written, but APBASE is not written again afterwards. As a
result, we get an aperture of the correct size but at a different address from
where it was initially - in my case, overlapping with the memory base of
various PCI devices. I've hacked around this by storing and explicitly setting
APSIZE before setting APBASE, but I'm sure there's a nicer way of doing it.
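
The ordering problem described above can be illustrated with a toy model. This
is a sketch, not the intel-agp resume code: it assumes the bridge silently
drops any base bits that are not aligned to the current aperture size, and the
256 MB post-resume size and 32 MB saved size used in the note below are
assumed values picked because they reproduce the reported addresses. All toy_*
names are hypothetical.

```c
#include <stdint.h>

/* Toy model of the bridge's aperture registers (hypothetical). */
struct toy_bridge {
    uint32_t apsize_bytes; /* current aperture size, a power of two */
    uint32_t apbase;       /* current aperture base address */
};

/* Assumption: only base bits aligned to the current size are kept. */
static void toy_write_apbase(struct toy_bridge *b, uint32_t base)
{
    b->apbase = base & ~(b->apsize_bytes - 1u);
}

/* Changing the size does not bring back base bits dropped earlier. */
static void toy_write_apsize(struct toy_bridge *b, uint32_t size_bytes)
{
    b->apsize_bytes = size_bytes;
    b->apbase &= ~(size_bytes - 1u);
}

/* Restore order described as failing: base first, then size. */
static uint32_t toy_resume_base_first(uint32_t reset_size,
                                      uint32_t saved_base,
                                      uint32_t saved_size)
{
    struct toy_bridge b = { reset_size, 0 };

    toy_write_apbase(&b, saved_base);
    toy_write_apsize(&b, saved_size);
    return b.apbase;
}

/* Restore order used by the workaround: size first, then base. */
static uint32_t toy_resume_size_first(uint32_t reset_size,
                                      uint32_t saved_base,
                                      uint32_t saved_size)
{
    struct toy_bridge b = { reset_size, 0 };

    toy_write_apsize(&b, saved_size);
    toy_write_apbase(&b, saved_base);
    return b.apbase;
}
```

Under those assumptions, restoring a saved base of 0xe2000000 with a reset size
of 0x10000000 and a saved size of 0x02000000 leaves the base at 0xe0000000 when
APBASE is written first, and at 0xe2000000 when APSIZE is written first.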

Comment 3 Andrew Morton 2005-07-21 20:09:11 UTC
ooh, I take hacks ;)

Please email me the patch+description and Cc Dave Airlie <airlied@linux.ie> and
Dave Jones <davej@codemonkey.org.uk> and we'll get this fixed up, thanks.

Comment 4 Andrew Morton 2005-08-04 14:38:33 UTC
So I'll assume that Matthew's patch fixed all this up.

If not, please reopen the bug.