Bug 9709 - MGA video modes garbled
Summary: MGA video modes garbled
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(Other)
Hardware: All Linux
Importance: P1 normal
Assignee: Dave Airlie
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-07 16:24 UTC by Damon
Modified: 2010-01-19 22:21 UTC
CC List: 8 users

See Also:
Kernel Version: 2.6.23-rc3 through 2.6.24-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
grep MGA from the Xorg startup log (17.74 KB, text/plain)
2008-01-07 16:26 UTC, Damon
/proc/config.gz contents (48.21 KB, text/plain)
2008-01-07 16:26 UTC, Damon
Reverse commit e798bd95b61918e653f3d28f9176237236f2d103 (3.74 KB, patch)
2008-01-30 21:06 UTC, Damon

Description Damon 2008-01-07 16:24:17 UTC
Latest working kernel version:

2.6.23-rc2

Earliest failing kernel version:

2.6.23-rc3

Latest failing kernel version tried:

2.6.24-rc7

Distribution:

None (self-built)

Hardware Environment:

- Athlon XP
- Matrox Millennium G550 AGP video
- LCD monitor which needs to be driven at a vertical refresh rate of 60Hz.

Software Environment:

- Xorg server 1.2.0
- Kernel drivers mga, drm, via_agp, agpgart.

Problem Description:

I have had the following problem since I first upgraded to the 2.6.23 series.  I have since gone back and determined that the relevant change occurred between 2.6.23-rc2 (works) and 2.6.23-rc3 (doesn't work).  I have also tested 2.6.24-rc7 and the problem still exists there.

I have an LCD monitor that needs to be driven at a vertical refresh rate of 60Hz.  I have Xorg configured to require VertRefresh 60 for the monitor, and in the past it has always successfully auto-detected appropriate, compatible modes on the matrox card as follows (as reported by the monitor's info panel):

1600x1200: horiz 75kHz, vert 60Hz, pixclock 162MHz
1280x1024: horiz 63kHz, vert 60Hz, pixclock 109MHz
1024x768:  horiz 48kHz, vert 60Hz, pixclock 65MHz
800x600:   horiz 37kHz, vert 60Hz, pixclock 40MHz
640x480:   horiz 31kHz, vert 60Hz, pixclock 25MHz

Since 2.6.23-rc3, when I start Xorg, my monitor reports that the signal is out of range.  If I use the Ctrl+Alt+keypad combinations to select a different video mode, most are still out of range, one is shaky, and only one is solid.  My monitor won't report anything for the ones that are out of range, but here are the two I can see:

1280x1024: horiz 57kHz, vert 54Hz, pixclock 98MHz (shaky)
1024x768:  horiz 48kHz, vert 60Hz, pixclock 65MHz (same as above)
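
As a rough cross-check of the two 1280x1024 readings, the sketch below derives the frame geometry each one implies. It relies only on the usual relations (scanlines per frame = horizontal rate / vertical rate, pixel clocks per line = pixel clock / horizontal rate). The struct and function names are made up for this illustration, and the panel rounds its values, so treat the result as approximate.

```c
#include <assert.h>

/* One reading from the monitor's info panel (values are rounded by the panel). */
struct panel_reading {
    double horiz_khz;     /* horizontal sync rate */
    double vert_hz;       /* vertical refresh rate */
    double pixclock_mhz;  /* pixel clock */
};

/* Scanlines per frame, including blanking: hsync rate / vsync rate. */
static double total_lines(struct panel_reading r)
{
    assert(r.vert_hz > 0.0);
    return r.horiz_khz * 1000.0 / r.vert_hz;
}

/* Pixel clocks per scanline, including blanking: pixel clock / hsync rate. */
static double total_pixels_per_line(struct panel_reading r)
{
    assert(r.horiz_khz > 0.0);
    return r.pixclock_mhz * 1000.0 / r.horiz_khz;
}

/* Absolute relative difference between two positive values. */
static double rel_diff(double a, double b)
{
    double d = (a - b) / a;
    return d < 0.0 ? -d : d;
}

/* 1280x1024 as reported before the change (works) and since 2.6.23-rc3 (shaky). */
static const struct panel_reading mode_working = { 63.0, 60.0, 109.0 };
static const struct panel_reading mode_shaky   = { 57.0, 54.0, 98.0 };
```

With these readings, total_lines() gives 1050 for the working mode and about 1056 for the shaky one, and total_pixels_per_line() gives about 1730 and 1719. The geometry agrees to within the panel's rounding while the pixel clock is about 10% low, which is consistent with unchanged mode timings being driven by a wrong clock, although it does not prove it.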

I have also witnessed similar effects on my other machines, all of which have a similar setup. Most of them have more forgiving display devices, but I still see problems with the refresh rate.

The Xorg log files are essentially identical between the two kernels.  I'm attaching the relevant parts of Xorg's log (grep MGA).  Xorg seems to be finding the correct values on both kernels, but what comes out the other end changes radically from one RC to the next.

I'm also attaching the contents of /proc/config.gz under 2.6.23-rc2. The same configuration was also used for 2.6.23-rc3.

I see from the incremental patch that a bunch of changes were made in the MGA drivers relating to pixel clocks; unfortunately it's meaningless to me.  Please let me know if (and how) there's any more useful information I can provide.

Thanks!
Comment 1 Damon 2008-01-07 16:26:00 UTC
Created attachment 14337
grep MGA from the Xorg startup log
Comment 2 Damon 2008-01-07 16:26:29 UTC
Created attachment 14338
/proc/config.gz contents
Comment 3 Damon 2008-01-15 00:41:36 UTC
Further to this bug report, I had a little time on my hands so I tried simply reversing all changes to matrox-related files included in 2.6.23-rc3.  This worked perfectly: I am now running 2.6.23.14 (I had unrelated problems with the 2.6.24-rc series, but doubtless would get the same result there).

According to the changelog, those changes were committed by Paul A. Clarke <pc@us.ibm.com> in commit e798bd95b61918e653f3d28f9176237236f2d103 on 2007-08-10, to `rectify jitter'.  However, for me they make the matrox driver completely unusable, as described above.

I will use my reverse patch locally, but please let me know if there is anything else I can do to help determine the cause.
Comment 4 Damon 2008-01-30 21:06:03 UTC
Created attachment 14661
Reverse commit e798bd95b61918e653f3d28f9176237236f2d103

This is the patch that fixes the matrox driver for me, simply by reversing commit e798bd95b61918e653f3d28f9176237236f2d103.
Comment 5 Ruud van Melick 2008-03-31 12:25:44 UTC
I think I have the same problem that Damon describes.

Last working kernel version: 
linux-image-2.6.22-3-k7 (version 2.6.22-6, Debian testing)

Failing kernel version: 
linux-image-2.6.24-1-686 (version 2.6.24-4, Debian testing)

Hardware Environment:
- AMD Sempron
- Matrox G450, AGP, dual D-SUB (no DVI!)
- CRT: Iiyama 404 (S704HT, sync freq: 27-96kHz, 50-160Hz)

Instead of varying the display resolution, I varied the refresh rate (60Hz, 70Hz, 75Hz, 85Hz, set through Gnome preferences) and wrote down the video mode info as shown on the monitor itself. These are all for 1024x768 resolution:

Kernel 2.6.22:
vertical 60Hz, horizontal 48.4kHz
vertical 70Hz, horizontal 56.5kHz
vertical 75Hz, horizontal 60.1kHz
vertical 85Hz, horizontal 68.7kHz

Kernel 2.6.24:
vertical 57Hz, horizontal 46.3kHz [1]
vertical 70Hz, horizontal 56.5kHz (solid, same as with 2.6.22 kernel)
vertical 74Hz, horizontal 59.4kHz [1]
vertical 83Hz, horizontal 66.8kHz [1]

[1] shaky, unstable. I notice it most strongly on the right side of the screen. It seems to be vibrating fast (<->), making the right border of the screen look noisy (the right border is no longer a straight line, but fluctuates rapidly). And at unpredictable intervals the CRT goes black for a moment as if it's switching between video modes (which perhaps it is).

The only difference is the kernel used. Xorg log files show no differences.

Similar to what Damon did, I recompiled the 2.6.24 Debian kernel with a small patch, commenting out one single line in /drivers/video/matrox/g450_pll.c (line 344):

  matroxfb_DAC_out(PMINFO M1064_XDVICLKCTRL,tmp);

That solved the problem for me. If more info or testing is needed, let me know.
Comment 6 Natalie Protasevich 2008-06-04 23:40:17 UTC
Hmm, commit in #4 is still in main line, cc-ing to maintainer.
Comment 7 Damon 2008-06-05 09:25:03 UTC
Just to confirm, I am still applying my patch from comment #4 (reversing the changes from 2.6.23-rc3) to the latest kernels (2.6.25.4) and it is still working for me.  I have meant to try the smaller patch from comment #5 but keep forgetting. I've no doubt it would work, though, as that line was central to the change in 2.6.23-rc3.
Comment 8 Petr Vandrovec 2008-06-05 23:40:06 UTC
I have no idea.  There is no documentation available (apparently IBM has some, but I do not), so apparently the choice is to break either your system or IBM's.  Can you put

printk(KERN_DEBUG "Max DVI pixclock is %u, tmp is %02X\n",
MINFO->max_pixel_clock_panellink, tmp);

in place of the DAC_out line you removed, and tell me what is being programmed for your 4 video modes?

Thanks.
Comment 9 Ruud van Melick 2008-06-06 14:49:40 UTC
Petr, that shows "Max DVI pixclock is 112000, tmp is BB" for the 1024x768 resolution, regardless of which vertical refresh rate (see comment #5) is set.
Comment 10 Ian Romanick 2008-06-10 14:59:58 UTC
Could you also upload your /etc/X11/xorg.conf?  I suspect that 'Option "UseFBDev" "true"' is set.  Have you tried setting that to false?  While not a solution, it might provide a temporary work-around for the time being.
Comment 11 Damon 2008-06-10 15:35:07 UTC
Hi Petr -

I have also tried the patch you gave above, and tried experimenting with the UseFBDev option in the Device section for my card.  Here's what I discovered:

On boot, when the framebuffer is first initialised, I can see output from your printk mixed in with the FB/console initialisation as follows:

matroxfb: Matrox G550 detected
PInS memtype = 5
matroxfb: MTRR's turned on
matroxfb: 1024x768x8bpp (virtual: 1024x16384)
matroxfb: framebuffer at 0xD8000000, mapped to 0xf8880000, size 33554432
Max DVI pixclock is 112000, tmp is BB
Console: switching to colour frame buffer device 128x48
fb0: MATROX frame buffer device
matroxfb_crtc2: secondary head of fb0 was registered as fb1

Originally, UseFBDev was *not* turned on (not mentioned in my xorg.conf file at all).  While that was the case, I never saw another message, either when X started up or when switching video modes, whether with ctrl+alt+plus/minus, via xrandr, or via the xfce display preferences (which I assume also uses RandR).

Is it possible that X's direct access is somehow bypassing your code when it shouldn't be?

When I then tried turning UseFBDev on, I started getting messages logged.  All modes are working correctly, since the offending line in g450_pll.c has been replaced with your printk.

First, when X starts up at 1600x1200, then Xfce, I get the following sequence:

Max DVI pixclock is 112000, tmp is BB
Max DVI pixclock is 112000, tmp is BB
Max DVI pixclock is 112000, tmp is BB
mtrr: no MTRR for d8000000,1000000 found
Max DVI pixclock is 112000, tmp is BB
Max DVI pixclock is 112000, tmp is 00

Then, here's what I get as I switch modes with ctrl+alt+plus/minus:

to 1280x1024x60: Max DVI pixclock is 112000, tmp is BB
to 1024x768x60: Max DVI pixclock is 112000, tmp is BB
to 800x600x60: Max DVI pixclock is 112000, tmp is BB
to 640x480x60: Max DVI pixclock is 112000, tmp is BB
back to 1600x1200: Max DVI pixclock is 112000, tmp is 00

Anything else i can do to help debug?
Comment 12 Petr Vandrovec 2008-06-11 03:17:51 UTC
(In reply to comment #11)
> Anything else i can do to help debug?

Probably not.  If I don't hear from Paul, I'll just create a patch to hide the M1064_XDVICLKCTRL access behind some kernel option, defaulting it to false on anything other than PPC.
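
The workaround proposed here could look roughly like the user-space sketch below. The names program_dvi_clkctrl, fake_dac_out and set_dvi_clkctrl are hypothetical and exist only for this illustration. In the driver the write in question is the matroxfb_DAC_out(PMINFO M1064_XDVICLKCTRL,tmp) line in g450_pll.c, and the switch would presumably be a module parameter or Kconfig option. This is an illustration of the idea, not a patch from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical default: keep the register write only on PowerPC builds. */
#if defined(__powerpc__) || defined(__powerpc64__)
#define DVI_CLKCTRL_DEFAULT true
#else
#define DVI_CLKCTRL_DEFAULT false
#endif

/* Hypothetical switch; a real driver would expose this as an option. */
static bool program_dvi_clkctrl = DVI_CLKCTRL_DEFAULT;

/* Stand-in for the DAC write: logs values instead of touching hardware. */
#define MAX_LOGGED_WRITES 16
static unsigned char dac_log[MAX_LOGGED_WRITES];
static int dac_writes;

static void fake_dac_out(unsigned char value)
{
    assert(dac_writes < MAX_LOGGED_WRITES);
    dac_log[dac_writes++] = value;
}

/* Program the DVI clock control value only when the option is enabled. */
static void set_dvi_clkctrl(unsigned char tmp)
{
    if (program_dvi_clkctrl)
        fake_dac_out(tmp);
}
```

With the switch off, set_dvi_clkctrl(0xBB) leaves dac_writes at 0, which matches the behaviour of the kernels with the line commented out; with it on, the value is written as in the unpatched driver.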
Comment 13 Paul A. Clarke 2008-06-16 09:38:29 UTC
Sorry, just returning from holiday.  Let me see if I can put together a setup that shows the problem.  Apologies for the troubles.  In the meantime, Petr, you are free (of course) to hack a workaround that makes mainline work.

In trying to grok the debug that's been reported here, is there any association between the "tmp is BB", "tmp is 00", and whether the jitter is present?
Comment 14 Damon 2008-06-16 09:52:41 UTC
Hi Paul -

No association.  By my understanding, it's this one line that causes the problem:

matroxfb_DAC_out(PMINFO M1064_XDVICLKCTRL,tmp);

Removing that line solves the problem completely. I *replaced* that line with Petr's printk statement, so I haven't experienced any jitter, regardless of what `tmp' was.

Admittedly, I haven't actually tried a kernel without that line removed since 2.6.24-rc... :)
Comment 15 Paul A. Clarke 2008-06-19 19:26:58 UTC
I'm having trouble finding a usable combination of machine and PCI-Express slot for the G550 I have. I'm also leaving tomorrow, and will be away until the 30th.  I'll pick this up again upon my return.  It would probably be wise to push Petr's suggested workaround upstream in the meantime.
Comment 16 Alan 2008-09-24 03:44:51 UTC
What is the status on this one, Petr?
Comment 17 Ruud van Melick 2009-11-29 00:05:04 UTC
I just upgraded to a 2.6.30 kernel and the problem still exists. 

Fortunately commenting out this line in g450_pll.c still works:
matroxfb_DAC_out(PMINFO M1064_XDVICLKCTRL,tmp);

Any news on this one?
Comment 18 Alan 2009-11-30 08:27:59 UTC
Given a complete lack of feedback from the notional maintainer I'm simply going to submit a patch to revert that line and see what happens.
Comment 19 Petr Vandrovec 2009-12-01 09:53:35 UTC
Sorry Alan.  The original change was driven by Paul, and at that time on my hardware it worked both with and without that line.  Since then I've dumped all my Matrox hardware, and I never had any docs for the G450/G550, so I have no opinion about the patch: there is at least one piece of hardware on which this write is necessary, at least one on which it is harmful, and at least one on which it does not matter.

I promise to get around to sending a patch to mark matroxfb & ncpfs orphaned before the end of the year.
