Bug 4965

Summary: Dual-head not working correctly with Matrox g550 in 2.6.13-rc[34]
Product: Drivers
Reporter: Jarmo Ilonen (trewas)
Component: Video(Other)
Assignee: drivers_video-other
Status: CLOSED CODE_FIX
Severity: blocking
CC: airlied, akpm, bunk, greg, infos
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.13-rc[34]
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  dmesg -s 100000 from 2.6.13-rc1
  dmesg -s 100000 from 2.6.13-rc3

Description Jarmo Ilonen 2005-07-29 08:17:42 UTC
Distribution:
Debian Unstable

Hardware Environment:
AMD Sempron 2800+, Matrox G550

Software Environment:
X related packages are version 6.8.2.dfsg.1-4

Problem Description:
I have a dual-head setup with two monitors on a Matrox G550. The primary
monitor runs at 1400x1050 and the secondary, to the right of the primary, at
1024x768. With 2.6.13-rc1 (and 2.6.12.1) the system works fine. With 2.6.13-rc3
and -rc4 only the primary monitor is active, and it functions as the secondary
monitor (at 1024x768, and the mouse cursor has to be moved to the right from
its initial position to "get there").

Initially I thought that the problem was caused by the new Xorg packages in
Debian unstable because I could not fathom how the kernel could have anything to
do with such a problem. But then I tried the same X setup with other kernel
versions, and the setup works consistently with 2.6.13-rc1 and always fails with
2.6.13-rc3 and -rc4 (-rc2 failed to compile).

If I modify xorg.conf to use only the primary monitor, then it works correctly
with -rc[34].

Steps to reproduce:
Boot with 2.6.13-rc[34], run startx and only the primary monitor gets an image,
and that is the image that should be on the secondary monitor.
Comment 1 Adrian Bunk 2005-07-29 09:49:48 UTC
Please attach the output of "dmesg -s 1000000" of both the -rc1 and the -rc3 kernel.

Is the problem still present in 2.6.13-rc3-mm3?
Comment 2 Dave Jones 2005-07-29 11:29:18 UTC
This problem has actually been around for at least a year.

Fedora users saw this as soon as we cut over to Xorg too, so it's interesting
that you are seeing it now that Debian has also moved away from XFree86.

There's an Xorg bug open on this.  See
https://bugs.freedesktop.org/show_bug.cgi?id=615

I'm unconvinced that this is a kernel bug.
Comment 3 Andrew Morton 2005-07-29 12:12:43 UTC
But Dave, Jarmo says that 2.6.13-rc1 works OK and 2.6.13-rc3
does not.

Is there anything interesting in the X logs? (/var/log/XFree86.0.log
on my old machine)
Comment 4 Dave Jones 2005-07-29 12:25:02 UTC
Diddling with card registers means we don't get the same state across reboots,
so it might just be working purely by chance.

Either that, or it's a separate bug from the one I'm thinking of.

Is it repeatably working OK on -rc1? Even from a cold power-on?
Comment 5 Jarmo Ilonen 2005-07-30 06:24:47 UTC
I tried cold-booting with both rc1 and rc3, and the results were the same as
before: dual-head works ok with rc1 and fails (as previously described) with rc3.
So far the same has happened with every reboot, cold or warm, so at least the
bug seems to occur only with rc3 whether it is an actual kernel bug or a bug in
Xorg.

/var/log/Xorg.0.log had one obvious difference between rc1 and rc3. Line "(II)
Truncating PCI BIOS Length to 36864" appears only when booted with rc3 (after
line "MGA(0): BIOS at 0xE7020000", and the same for the second head MGA(1)).

BTW I was using xorg from ubuntu before upgrading to the Debian version and I
never saw the same problem. The last kernel I used with Ubuntu xorg was 2.6.12.1.

I can try 2.6.13-rc3-mm3 or any other kernel version on Monday if necessary.
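
The log difference described in this comment can be checked mechanically. What follows is a minimal Python sketch, not something from the original report: it scans Xorg log text for the "Truncating PCI BIOS Length" message and returns the reported lengths. The function name `find_bios_truncations` is hypothetical, and the sample text reuses only the two log lines quoted above.

```python
import re

# Matches the message quoted above, e.g. "(II) Truncating PCI BIOS Length to 36864".
TRUNCATION_RE = re.compile(r"Truncating PCI BIOS Length to (\d+)")


def find_bios_truncations(log_text):
    """Return the truncated BIOS lengths (in bytes) reported in Xorg log text."""
    return [int(m.group(1)) for m in TRUNCATION_RE.finditer(log_text)]


if __name__ == "__main__":
    # Illustrative sample assembled from the lines quoted in this comment;
    # a real run would read the contents of /var/log/Xorg.0.log instead.
    sample = (
        "MGA(0): BIOS at 0xE7020000\n"
        "(II) Truncating PCI BIOS Length to 36864\n"
    )
    for length in find_bios_truncations(sample):
        print(f"BIOS length truncated to {length} bytes ({length // 1024} KiB)")
```

An empty list for the log saved under rc1 and a non-empty list for the log saved under rc3 would match the difference reported here.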
Comment 6 Jarmo Ilonen 2005-07-30 06:26:03 UTC
Created attachment 5420 [details]
dmesg -s 100000 from 2.6.13-rc1
Comment 7 Jarmo Ilonen 2005-07-30 06:27:00 UTC
Created attachment 5421 [details]
dmesg -s 100000 from 2.6.13-rc3

Not that there seem to be any interesting differences between rc1 and
rc3...
Comment 8 Dave Airlie 2005-07-30 23:42:17 UTC
any changes in that area to mmap or /dev/mem?

It's not anything in the graphics drivers or AGP drivers, the problem is that
the MGA gets some information from the BIOS.. but the truncation stuff is
causing it to be unable to read that flag....
Comment 9 Jarmo Ilonen 2005-08-04 03:25:34 UTC
I just tried 2.6.13-rc5 and 2.6.13-rc4-mm1, and dual-head is not working
correctly with either of them, wherever the real problem is. If there is
anything I can do to pinpoint/fix the problem, leave a comment...
Comment 10 Dave Airlie 2005-08-05 03:28:00 UTC
Can you run the following as root with -rc1 and -rc3:

lspci -v -s 3:0.0

and maybe lspci -v -H1 -s 3:0.0 and see if there are any differences in what it
tells you?
Comment 11 Dave Airlie 2005-08-05 03:56:13 UTC
*** Bug 4971 has been marked as a duplicate of this bug. ***
Comment 12 Jarmo Ilonen 2005-08-05 07:27:01 UTC
lspci -v -s 3:0.0 gives the following with -rc3:

0000:03:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G550 AGP (rev
01) (prog-if 00 [VGA])
        Subsystem: Matrox Graphics, Inc. Millennium G550 Dual Head DDR 32Mb
        Flags: bus master, medium devsel, latency 32, IRQ 11
        Memory at e4000000 (32-bit, prefetchable) [size=32M]
        Memory at e7000000 (32-bit, non-prefetchable) [size=16K]
        Memory at e8000000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at e7020000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
        Capabilities: [f0] AGP version 2.0

The output is identical to -rc1 except for line "Expansion ROM at e7020000
[disabled] [size=128K]", which does not exist with -rc1.
With lspci -v -H1 -s 3:0.0 both -rc1 and -rc3 give exactly the same output.
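
A comparison like this is easy to repeat if each kernel's lspci output is saved to a file. The sketch below is only an illustration under that assumption: it uses Python's standard `difflib` module to list lines that appear in the newer output but not in the older one. The helper names `diff_lspci` and `added_lines` are hypothetical, and the sample input is taken from the -rc3 output quoted above.

```python
import difflib


def diff_lspci(old_text, new_text, old_name="rc1", new_name="rc3"):
    """Return unified-diff lines between two saved lspci outputs."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


def added_lines(diff_lines):
    """Return lines present only in the newer output, without the '+' marker."""
    # The first two diff lines are the '---' / '+++' file headers, so skip them.
    return [line[1:].strip() for line in diff_lines[2:] if line.startswith("+")]


if __name__ == "__main__":
    # Two lines from the -rc3 output above; per this comment, -rc1 lacks the ROM line.
    rc1 = "        Memory at e8000000 (32-bit, non-prefetchable) [size=8M]\n"
    rc3 = rc1 + "        Expansion ROM at e7020000 [disabled] [size=128K]\n"
    for line in added_lines(diff_lspci(rc1, rc3)):
        print(line)
```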
Comment 13 Dave Airlie 2005-08-05 16:14:15 UTC
My guess is that the PCI resource assignment code is now putting the ROM
somewhere and pissing off X .... X is probably doing something broken but I'd
hate to have to try and fix it all ..

Greg, any ideas? Perhaps we could avoid assigning addresses to VGA Expansion ROMs...
Comment 14 Greg Kroah-Hartman 2005-08-05 16:41:14 UTC
This patch might have caused it:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=10f4338ca8534823bc6c843edbbe42fd4e73d258

Could you try reverting it (apply it with the -R option to patch) and see
if that solves this issue?
Comment 15 Jarmo Ilonen 2005-08-06 04:46:50 UTC
The mentioned patch is apparently not yet in -rc3, only in -rc5 (and -rc4-mm1).
According to the log it was applied to Linus' tree 7 days ago. Anyway, reverting
the patch from -rc4-mm1 or -rc5 does not change anything.
Comment 16 Dave Airlie 2005-08-06 19:20:32 UTC
how about reverting

http://www.kernel.org/git/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=299de0343c7d18448a69c635378342e9214b14af;hp=90b54929b626c80056262d9d99b3f48522e404d0

I'd love to know what X is doing so badly.. maybe before the ROM wasn't showing
up and the code works but now it is and the X code breaks.. bad X..

Comment 17 Jarmo Ilonen 2005-08-08 01:09:14 UTC
Reverting that patch ([PATCH] PCI: pci_assign_unassigned_resources() on x86)
helped; dual-head now works OK with -rc3. lspci -v -s 3:0.0 also gives output
identical to -rc1, with no "Expansion ROM at e7020000 [disabled] [size=128K]"
line anymore.
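
Checking for the ROM line after each kernel change can also be scripted. The following is a rough Python sketch, assuming the line has the same layout as the one quoted in this thread: it extracts the address, size and disabled flag of the "Expansion ROM" line from lspci -v output, or returns None when the line is absent. The function name `parse_expansion_rom` and the regular expression are illustrative, not part of lspci.

```python
import re

# Layout assumed from the line quoted in this thread:
#   Expansion ROM at e7020000 [disabled] [size=128K]
ROM_RE = re.compile(
    r"Expansion ROM at ([0-9a-fA-F]+)(?: \[(disabled)\])? \[size=(\d+)([KMG]?)\]"
)
UNIT_BYTES = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_expansion_rom(lspci_text):
    """Return (address, size_in_bytes, disabled) or None if no ROM line exists."""
    m = ROM_RE.search(lspci_text)
    if m is None:
        return None
    address = int(m.group(1), 16)
    disabled = m.group(2) is not None
    size = int(m.group(3)) * UNIT_BYTES[m.group(4)]
    return address, size, disabled


if __name__ == "__main__":
    line = "        Expansion ROM at e7020000 [disabled] [size=128K]"
    print(parse_expansion_rom(line))
```

For the line quoted here the size comes out as 131072 bytes, whereas the Xorg log message from comment 5 reported a BIOS length truncated to 36864 bytes.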
Comment 18 Dave Airlie 2005-08-08 01:14:52 UTC
ah nuts.. I thought that might be it .. there is a bug in the Xorg BIOS handling
by the looks of it.. it shouldn't matter to Xorg if the kernel assigns the BIOS
or not .. I'll see if I can track down what might be happening..

Greg, any ideas on this? An X.org fix will take a while to propagate and it would
be nice not to break working dualhead..
Comment 19 Jarmo Ilonen 2005-08-29 03:43:13 UTC
Considering Linus' comments about 2.6.13 I was quite surprised that dual-head is
actually working with it. Then again, X related packages have been updated
in Debian unstable to version 6.8.2.dfsg.1-5, though there did not seem to be
any relevant (mga or pci/bios) changes reported. The "Expansion ROM at e7020000
[disabled] [size=128K]" line is still in lspci output, but there are no
"Truncating PCI BIOS Length to 36864" lines in Xorg.0.log anymore. 

I will close this bug now that dual-head works fine with kernel 2.6.13 and X
6.8.2.dfsg.1-5. I'll mark the resolution as CODE_FIX, though maybe the actual fix
is in Xorg.