Bug 14176

Summary: i915: External VGA EDID detection broken on 945 chip
Product: Drivers Reporter: Kevin Mitchell (kevmitch)
Component: Video(DRI - Intel)Assignee: Eric Anholt (eric)
Status: RESOLVED CODE_FIX    
Severity: normal CC: gordon.jin, jbarnes, remur, yakui.zhao
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31-rc9 Subsystem:
Regression: Yes Bisected commit-id:

Description Kevin Mitchell 2009-09-14 09:30:43 UTC
This regression is a result of commit db54501900ad3665dd669f5708ecd04fc5aed495. It is therefore present in 2.6.31-rc9 onwards. 

On a Thinkpad T60 using KMS, an external display connected via the VGA port is no longer correctly detected. The external display will activate only once X is started, and with incorrect resolution. Furthermore the correct resolution is not available via xrandr. This has been tested to happen with two separate displays (one CRT and the other LCD).

Relevant printks from boot up include :
[    0.537910] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
[    0.543686] Linux agpgart interface v0.103
[    0.543718] agpgart-intel 0000:00:00.0: Intel 945GM Chipset
[    0.544710] agpgart-intel 0000:00:00.0: detected 7932K stolen memory
[    0.549928] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
[    0.550124] [drm] Initialized drm 1.1.0 20060810
[    0.550248] i915 0000:00:02.0: power state changed by ACPI to D0
[    0.550264] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    0.550275] i915 0000:00:02.0: setting latency timer to 64
[    0.562179] i915 0000:00:02.0: BAR 6: can't allocate resource (bogus alignment) [0x0-0x0] flags 0x0
[    0.562198] [drm] failed to find VBIOS tables
[    0.562244] render error detected, EIR: 0x00000010
[    0.562251] page table error
[    0.562257]   PGTBL_ER: 0x00000001
[    0.562264] [drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
[    0.562275] render error detected, EIR: 0x00000010
[    0.562282] page table error
[    0.562286]   PGTBL_ER: 0x00000001
[    1.014476] i2c-adapter i2c-0: unable to read EDID block.
[    1.014486] i915 0000:00:02.0: VGA-1: no EDID data
[    1.488586] [drm] LVDS-8: set mode 1280x800 c
[    1.671458] Console: switching to colour frame buffer device 160x50
[    1.676884] [drm] fb0: inteldrmfb frame buffer device
[    1.676940] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

When X starts I also get a bunch more
[   17.517073] i2c-adapter i2c-0: unable to read EDID block.
[   17.517081] i915 0000:00:02.0: VGA-1: no EDID data

On reverting the above mentioned commit from 2.6.31, I get the expected behaviour. Namely, the external display activates long before X and at the same time that the LVDS frame buffer goes to native resolution. When X does start, both displays have the correct resolution. I also don't get worrying "unable to read EDID block" or "PGTBL_ER" messages.

Kevin
Comment 1 Gordon Jin 2009-09-16 06:05:52 UTC
so the regressin happens at:

commit db54501900ad3665dd669f5708ecd04fc5aed495
Author: David Müller (ELSOFT AG) <d.mueller@elsoft.ch>
Date:   Sat Aug 29 08:54:45 2009 +0200

    drm/i915: Improve CRTDDC mapping by using VBT info
    
    Use VBT information to determine which DDC bus to use for CRTDCC.
    Fall back to GPIOA if VBT info is not available.
    
    Signed-off-by: David Müller <d.mueller@elsoft.ch>
    Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>
    Tested on: 855 (David), and 945GM, 965GM, GM45, and G45 (anholt)
Comment 2 Kevin Mitchell 2009-12-10 01:56:46 UTC
I've done some more digging around and have the additional information about what exactly is going wrong. 

It looks like I'm bailing out of intel_init_bios early at drivers/gpu/drm/i915/intel_bios.c:396:

    bios = pci_map_rom(pdev, &size);
    if (!bios)
        return -1;

It is around this time, that I see
[    0.447220] i915 0000:00:02.0: BAR 6: can't allocate resource (bogus alignment) [0x0-0x0] flags 0x0
[    0.447227] [drm] failed to find VBIOS tables

It looks like this was happening even before the offending commit, but I never saw any symptoms that bothered me.

This commit however introduces the new variable dev->dev_priv->crt_ddc_bus which is supposed to be initialised to -1 in a call to parse_general_definitions down at 418, but that never actually happens because I bail out before then. It therefore looks like dev->dev_priv->crt_ddc_bus=0 instead.

The real problem with this occurs at drivers/gpu/drm/i915/intel_crt.c:552:

    /* Set up the DDC bus. */
    if (IS_IGDNG(dev))
        i2c_reg = PCH_GPIOA;
    else {
        i2c_reg = GPIOA;
        /* Use VBT information for CRT DDC if available */
        if (dev_priv->crt_ddc_bus != -1)
            i2c_reg = dev_priv->crt_ddc_bus;
    }

The nested if statment was added in this commit. Before, I was walking away with i2c_reg=GPIOA, which gave me the desired behaviour. Now that's getting overwritten with 0 since that's the erroneous value of dev_priv->crt_ddc_bus which is not -1.

In conclusion, we would seemingly need one or both of the following 

1) make sure that dev_priv->crt_ddc_bus = -1 gets set even if intel_init_bios doesn't complete.
2) fix the call to pci_map_rom so that intel_init_bios does complete.

The first one seems like the easiest. I really have no idea what's happening with pci_map_rom.

Kevin
Comment 3 Jesse Barnes 2009-12-10 18:16:46 UTC
Good catch.  I'd like to figure out why your ROM isn't getting mapped; not having VBT data available makes many things more difficult as we're depending on it for various things these days.

Based on the dmesg, it sounds like we're not actually allocating any resource space for it?  If you enable the PCI debug output you may get more info about why the video ROM BAR isn't getting space.
Comment 4 Kevin Mitchell 2009-12-11 01:06:46 UTC
Unfortunately, it doesn't appear that pci has as nice a debugging infrastructure as drm. However, with a few printks, I was able to trace the following codepath

drivers/pci/rom.c:pci_map_rom


    if (res->flags & IORESOURCE_ROM_SHADOW) {
    ...
    } else {
        if (res->flags &
             (IORESOURCE_ROM_COPY | IORESOURCE_ROM_BIOS_COPY)) {

        } else {
    }
Comment 5 Kevin Mitchell 2009-12-11 01:41:27 UTC
Oops, ignore the above.

I traced it down to falling off the end of the following switch statement in kernel/resource.c:579

resource_size_t resource_alignment(struct resource *res)
{
        switch (res->flags & (IORESOURCE_SIZEALIGN | IORESOURCE_STARTALIGN)) {
        case IORESOURCE_SIZEALIGN:
                return resource_size(res);
        case IORESOURCE_STARTALIGN:
                return res->start;
        default: 
                return 0;
        }
}

So it looks like pdev->res->flags has neither IORESOURCE_SIZEALIGN  nor IORESOURCE_STARTALIGN set. Where is this supposed to happen?
Comment 6 ykzhao 2009-12-28 16:01:08 UTC
Hi, Kevin
    Please try the latest linus kernel(2.6.33-rc1/rc2) and see whether the issue still exists?
    Maybe this issue can be fixed by the following commit:
    >commit 29874f44fbcbc24b231b42c9956f8f9de9407231
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Wed Nov 18 15:15:02 2009 +0800
 
    drm/i915: fix gpio register detection logic for BIOS without VBT

Thanks.
Comment 7 Kevin Mitchell 2009-12-29 00:14:45 UTC
This looks like it should make things work a bit better. Unfortunately I won't have an external vga display to test it for an other week or so. In any case, I hate to be a spoiler, but doesn't this rely on the uninitialised value of crt_ddc_bus being 0? Nominally this appears to be the case, but is this a safe assumption?
Comment 8 Kevin Mitchell 2009-12-31 00:09:17 UTC
I've been digging around to try and determine why I'm not getting space for my vbt and am having trouble pinning anything down. 

I have confirmed that drm_get_dev() at drivers/gpu/drm/drm_stub.c: 400 is returning successfully. I'm guessing however that it somewhere around here that I should be getting space allocated for my ROM. In particular pci_enable_device() looks like a good place. This is just a wrapper to __pci_enable_device() (drivers/pci/pci.c: 930) where flags=(IORESOURCE_MEM | IORESOURCE_IO) and

	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
		if (dev->resource[i].flags & flags)
			bars |= (1 << i);

	err = do_pci_enable_device(dev, bars);

However my kernel doesn't boot if i stick a printk in there to see what "bars" is likely because this function is used very early on before such printing is possible. Am I even on the right track here?

I do however note that the first line in the above posted printks has video ACPI saying "rom: no". Is it possible that my hardware is just set up to supply the necessary information?
Comment 9 Kevin Mitchell 2009-12-31 03:32:52 UTC
I have confirmed that 29874f44fbcbc24b231b42c9956f8f9de9407231 restores the expected behaviour in my case, however my previous question about the generality of this fix remains.
Comment 10 Jesse Barnes 2010-07-23 19:57:42 UTC
Marking as fixed since you no longer have the problem.  If you still have the ROM allocation failure and want to track it down you could try the linux-pci@vger.kernel.org list and/or file a PCI bug.