Bug 25732
Description
Tõnu Raitviir
2010-12-27 22:14:40 UTC
Created attachment 41912 [details]
Prep the HDMI encoder for AVI format
Can you please test this patch? From the documentation it looks like this is required whenever using the HDMI encoding and not just for audio signals.
No good, picture still goes green.
Did you have an SDVO attached HDMI? I seem to recall you hit an earlier bug in that configuration...
Yes I'm using HDMI output, it's all integrated on the motherboard.
# lspci -vv
00:02.0 VGA compatible controller: Intel Corporation 82G35 Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. Device 8276
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 43
    Region 0: Memory at fe900000 (32-bit, non-prefetchable) [size=1M]
    Region 2: Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Region 4: I/O ports at ec00 [size=8]
    Expansion ROM at <unassigned> [disabled]
    Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Address: fee0100c Data: 4189
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: i915
    Kernel modules: i915
00:02.1 Display controller: Intel Corporation 82G35 Express Integrated Graphics Controller (rev 03)
    Subsystem: ASUSTeK Computer Inc. Device 8276
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Region 0: Memory at fea00000 (32-bit, non-prefetchable) [size=1M]
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
First-Bad-Commit : 3c17fe4b8f40a112a85758a9ab2aebf772bdd647
Handled-By : Chris Wilson <chris@chris-wilson.co.uk>
Still present in 2.6.37 release. Any plans to fix it?
Step 1: Work out how to fix it.
What's the state of progress on this?
Tõnu, is this still a problem in 2.6.38?
Sorry, I don't have the hardware running daily any more, so it took some time. I can now confirm that the problem _is_ still present in 2.6.38.5. I'll test latest 2.6.39-rc soon.
Well, 2.6.39-rc5-git6 has the same problem. Basically anything newer than 2.6.36 is unusable on this setup. I've done the bisect, is there anything else I can help with?
Does reverting the bisected commit on top of current linus kernel help?
I see three options:
a) revert the patch (causing other regressions)
b) add a module option (if the AVI infoframe can indeed confuse some HW)
c) go through the AVI infoframe spec (CEA-861-D) and check if changing some flags in the infoframe would make your TV behave (there are some, like IT mode, which might make sense)
Who prepares some patches regarding option c) for Tõnu to check out? Do you have a pointer to the specs? Else option a) is normally the modus operandi, because of the no regressions rule... (meaning, don't fix stuff (or worse: introduce new features) while breaking other stuff)... but in the end, the i915 maintainers are the ones in charge here...
I am also having this problem. Same motherboard and Linux 2.6.38.
Florian: I tried reverting the patch on my kernel and the patch would not apply. I guess there are too many other changes since then although I am not a linux kernel expert.
*** Bug 39812 has been marked as a duplicate of this bug. ***
Removing the ecc field from the dip_infoframe struct in intel_drv.h (and the corresponding assignment in intel_hdmi.c) fixes the issue for me - tested with Linux 3.0. I'm not familiar with the relevant specifications, but I suspect that the ECC byte must not be included for HDMI using SDVO hardware. My mainboard is an Asus P5E-VM HDMI (G35).
Jesse, can you check whether the infoframe structure really did change between SDVO-HDMI and HDMI, and which one is correct?
I have tested Jürg's patch on my Kernel (2.6.39) and it works a charm.
I can try to come up with a proper patch (removing ECC only in the SDVO case), but I'd like to wait until someone familiar with SDVO confirms that this is the right approach.
We're not supposed to write that byte apparently, but the specs aren't clear on whether we need to skip the write by manually writing the index field or simply ignore its existence as Jurg's patch does... AFAICT the formats are the same between SDVO HDMI and regular HDMI.
Wonder if Jurg's patch will help https://bugs.freedesktop.org/show_bug.cgi?id=39314 as well.
Jurg's fix (removing ecc from the two files mentioned) resolved the issue for me as well. I'm using an Intel GM965/GL960 controller (on a Dell Studio Hybrid) running 3.0.3 with Gentoo patches.
I can also confirm that Jürg's patch fixes this issue on my Intel GM965 embedded on a Gigabyte GA-6KIEH-RH. This is with Ubuntu 2.6.38-11-generic kernel.
Has anyone established how this should be handled correctly so we can get this submitted as a proper patch for inclusion?
Sorry for losing track of this bug altogether. This is what I wrote to Peter Ross a few minutes ago:
I did test infoframes without the ecc field for "hdmi" hardware and confirmed that it wasn't working as expected. I never tried with or without the ecc field for SDVO hardware since I don't have any.
I think this info was included in the commit message:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3c17fe4b8f40a112a85758a9ab2aebf772bdd647
quote:
I'm assuming that the sdvo hardware also stores a header ECC byte in the MSB of the first dword - is this correct?
Please have a look at these threads:
http://lists.freedesktop.org/archives/intel-gfx/2010-September/008042.html
http://lists.freedesktop.org/archives/intel-gfx/2010-September/008052.html
http://lists.freedesktop.org/archives/intel-gfx/2010-September/008161.html
http://lists.freedesktop.org/archives/intel-gfx/2010-September/008215.html
Basically, I did submit a version first which didn't touch the SDVO case, then Chris Wilson asked me to merge the two, which I did. My guess is that SDVO hardware doesn't need the ECC byte and that "hdmi" hardware does.
Oh, and I also seem to remember that there was something funky about the SDVO avi infoframe code that predated my code... it wrote multiples of 8 bytes no matter what the size of the infoframe struct was (something to keep in mind if you want to revert the SDVO behaviour).
Just to be clear: I no longer have this issue on latest kernel versions since about 3.0 so it appears to be fixed for me and at least one other person I know who had the same issue.
Oh, and also check this reference:
http://intellinuxgraphics.org/VOL_3_display_registers_updated.pdf
page 108, section 2.8.9, the DIP write table, MSB of DW0
IIRC, I did a readout of the AVI infoframe after the write (where ECC was simply set to zero) and the MSB of DW0 did contain a correct ECC value.
Sorry, I'm lost. Tõnu, can you verify that your issue is fixed too in v3.2?
I'm the other person Tom Hunt was referring to, however he was incorrect, this issue is still present in 3.2 for me.
Please do this:
- git clone git://people.freedesktop.org/~pzanoni/intel-gpu-tools
- compile it (autogen.sh, make)
- sudo tools/intel_infoframes -d
This tool will print the InfoFrames your machine is sending. Run the command when the screen is in a "bad state", then run the command when the screen is in a "good state". Is there any difference?
If there's no difference, you should try to compare the differences between the output of tools/intel_reg_dumper in both the "bad" and the "good" state. The intel_infoframes tool allows you to change the infoframes too: read the --help output. This might help the debugging process...
Unfortunately I don't have access to the hardware anymore and cannot test this.
I have Intel DH67GD with Sandy Bridge CPU. Diffs are after a good boot with Fedora kernel 3.1.6-1.fc16.x86_64 and after a suspend & resume cycle where I get a green tint.
diff -u good_infoframes.txt bad_infoframes.txt
--- good_infoframes.txt 2012-02-26 09:05:40.649830784 +0100
+++ bad_infoframes.txt 2012-02-26 09:06:44.971543280 +0100
@@ -22,29 +22,31 @@
 AVI InfoFrame:
 - frequency: reserved (invalid)
 - raw:
- e40d0282 0003006c 00000000 00000000
- 00000000 00000000 000d0282 00000000
- 3a36000a 00000000 00000000 00000000
+ 000d0282 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 82, version: 2, length: d, ecc: e4, checksum: 6c
+ 00000000 00000000 00000000 00000000
+ 00000000 00000000 00000000 00000000
+- type: 82, version: 2, length: d, ecc: 0, checksum: 0
 - S: 0, B: 0, A: 0, Y: 0, Rsvd0: 0
-- R: 3, M: 0, C: 0
+- R: 0, M: 0, C: 0
 - SC: 0, Q: 0, EC: 0, ITC: 0
 - VIC: 0, Rsvd1: 0
 - PR: 0, Rsvd2: 0
 - top: 0, bottom: 0, left: 0, right: 0
-- Rsvd3: 0, Rsvd4[0]: 0, Rsvd4[1]: d0282, Rsvd4[2]: 0
+- Rsvd3: 0, Rsvd4[0]: 0, Rsvd4[1]: 0, Rsvd4[2]: 0
+Invalid InfoFrame checksum!
 SPD InfoFrame:
 - frequency: reserved (invalid)
 - raw:
- 00000000 00000000 00000000 00000000
- 00000000 00000000 00190183 00004976
+ 00190183 746e4976 00006c65 746e4900
+ 61726765 20646574 00786667 00000900
  00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 0, version: 0, length: 0, ecc: 0, checksum: 0
-- vendor:
-- description:
-- source: reserved
+- type: 83, version: 1, length: 19, ecc: 0, checksum: 76
+- vendor: Intel
+- description: Integrated gfx
+- source: pc general
+Invalid InfoFrame checksum!
 Transcoder B:
 - disabled
With the 3.2 series it gets worse. When I boot Fedora kernel 3.2.7-1.fc16.x86_64 I constantly get a pink tint (could this be due to vsync)
diff -u good_infoframes.txt 1_pink_start.txt
--- good_infoframes.txt 2012-02-26 17:07:17.932101147 +0100
+++ 1_pink_start.txt 2012-02-26 17:07:17.952100818 +0100
@@ -20,31 +20,31 @@
 - enabled
 - GCP: disabled
 AVI InfoFrame:
-- frequency: reserved (invalid)
+- frequency: every vsync
 - raw:
- e40d0282 0003006c 00000000 00000000
- 00000000 00000000 000d0282 00000000
- 3a36000a 00000000 00000000 00000000
+ e40d0282 0000006f 00000000 00000000
  00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 82, version: 2, length: d, ecc: e4, checksum: 6c
+ 0000000a 00000000 00000000 00000000
+ 00000000 00000000 00000000 00000000
+- type: 82, version: 2, length: d, ecc: e4, checksum: 6f
 - S: 0, B: 0, A: 0, Y: 0, Rsvd0: 0
-- R: 3, M: 0, C: 0
+- R: 0, M: 0, C: 0
 - SC: 0, Q: 0, EC: 0, ITC: 0
 - VIC: 0, Rsvd1: 0
 - PR: 0, Rsvd2: 0
 - top: 0, bottom: 0, left: 0, right: 0
-- Rsvd3: 0, Rsvd4[0]: 0, Rsvd4[1]: d0282, Rsvd4[2]: 0
+- Rsvd3: 0, Rsvd4[0]: 0, Rsvd4[1]: 0, Rsvd4[2]: 0
 SPD InfoFrame:
-- frequency: reserved (invalid)
+- frequency: every vsync
 - raw:
- 00000000 00000000 00000000 00000000
- 00000000 00000000 00190183 00004976
+ 00190183 746e49f2 00006c65 746e4900
+ 61726765 20646574 00786667 00000900
  00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 0, version: 0, length: 0, ecc: 0, checksum: 0
-- vendor:
-- description:
-- source: reserved
+- type: 83, version: 1, length: 19, ecc: 0, checksum: f2
+- vendor: Intel
+- description: Integrated gfx
+- source: pc general
 Transcoder B:
 - disabled
However, after running the intel_infoframes tool everything is back to normal. When I suspend & resume this I get a green tint back. Below are diffs from the before and after resume:
diff -u 2_good_start.txt 3_bad_resume.txt
--- 2_good_start.txt 2012-02-26 17:07:17.932101147 +0100
+++ 3_bad_resume.txt 2012-02-26 17:07:17.915101430 +0100
@@ -22,11 +22,11 @@
 AVI InfoFrame:
 - frequency: every vsync
 - raw:
- e40d0282 0000006f 00000000 00000000
+ 000d0282 0000006f 00000000 00000000
  00000000 00000000 00000000 00000000
- 0000005f 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 82, version: 2, length: d, ecc: e4, checksum: 6f
+ 00000000 00000000 00000000 00000000
+- type: 82, version: 2, length: d, ecc: 0, checksum: 6f
 - S: 0, B: 0, A: 0, Y: 0, Rsvd0: 0
 - R: 0, M: 0, C: 0
 - SC: 0, Q: 0, EC: 0, ITC: 0
@@ -37,11 +37,11 @@
 SPD InfoFrame:
 - frequency: every vsync
 - raw:
- 64190183 746e49f2 00006c65 746e4900
+ 00190183 746e49f2 00006c65 746e4900
  61726765 20646574 00786667 00000900
- 4856d5e5 00000000 00000000 00000000
  00000000 00000000 00000000 00000000
-- type: 83, version: 1, length: 19, ecc: 64, checksum: f2
+ 00000000 00000000 00000000 00000000
+- type: 83, version: 1, length: 19, ecc: 0, checksum: f2
 - vendor: Intel
 - description: Integrated gfx
 - source: pc general
And again, running the intel_infoframes tool fixes everything again.
(Please note that running this tool in comment #31 also fixed the output. I just forgot to mention this)
Hi
The diff got a little bit hard to read. Could you please paste/attach the full outputs?
Things go back to normal just by running "intel_infoframes -d"? Or did you use any other parameters to change the infoframe values?
Created attachment 72487 [details]
After a reboot where everything looks fine
3.1.6-1.fc16.x86_64
Created attachment 72488 [details]
After a pm-suspend/resume cycle where I get a green tint
3.1.6-1.fc16.x86_64
Also note that everything gets fine as soon as "intel_infoframes -d" is executed.
When executing it when the screen is in a bad state it takes about 5 seconds for the command to complete.
Created attachment 72489 [details]
pink tint on boot with 3.2 kernels
3.2.7-1.fc16.x86_64 (but tried various other 3.2 kernels with same symptom)
Directly after a boot where the screen has a pink tint.
1_pink_start.txt
When "intel_infoframes -d" command completes the screen looks good
2_good_start.txt -
Run again when the screen looked good. Now I do a pm-suspend and then wake the system. Now I observe a green tint.
3_bad_resume.txt
After running intel_infoframes again it's back to normal.
4_good_resume.txt
Finally running again when things look good
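The "Invalid InfoFrame checksum!" lines in the dumps quoted earlier in this report can be re-checked by hand. The following is a minimal sketch, not the code of intel_infoframes itself. It assumes the DW0 layout that the tool's own output suggests (type in the low byte, then version, then length, with the ECC byte in the MSB), that the payload starts at DW1 with the checksum byte first, and the usual InfoFrame rule that type + version + length + checksum + data bytes sum to 0 modulo 256 (the ECC byte is not part of that sum). The names dip_header, decode_dw0 and infoframe_checksum_ok are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header fields as packed into DW0 (layout inferred from the dumps, an assumption). */
struct dip_header {
    uint8_t type;
    uint8_t version;
    uint8_t length; /* number of data bytes, not counting the checksum byte */
    uint8_t ecc;    /* MSB of DW0 */
};

/* Split the first raw dword into its four header bytes. */
struct dip_header decode_dw0(uint32_t dw0)
{
    struct dip_header h;

    h.type    = dw0 & 0xff;
    h.version = (dw0 >> 8) & 0xff;
    h.length  = (dw0 >> 16) & 0xff;
    h.ecc     = (dw0 >> 24) & 0xff;
    return h;
}

/*
 * Return 1 if the InfoFrame in raw[0..ndw-1] has a valid checksum, 0 if not
 * (or if the buffer is too short for the length announced in DW0).
 */
int infoframe_checksum_ok(const uint32_t *raw, size_t ndw)
{
    struct dip_header h;
    unsigned int sum;
    size_t i;

    assert(raw != NULL);
    if (ndw == 0)
        return 0;

    h = decode_dw0(raw[0]);
    sum = h.type + h.version + h.length;

    /* Payload byte 0 is the checksum itself, followed by h.length data bytes. */
    for (i = 0; i <= h.length; i++) {
        size_t dw = 1 + i / 4;

        if (dw >= ndw)
            return 0;
        sum += (raw[dw] >> (8 * (i % 4))) & 0xff;
    }
    return (sum & 0xff) == 0;
}
```

Fed with the AVI frame from the good 3.1 boot (e40d0282 0003006c followed by zeros) this check passes, while the frame dumped after resume (000d0282 followed by zeros) fails it, which is consistent with the message the tool printed.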
I just get "This program still only supports ILK or newer.", I presume I need to be running a later kernel to test with this? Currently using 3.2.
(In reply to comment #37)
> I just get "This program still only supports ILK or newer.", I presume I need
> to be running a later kernel to test with this? Currently using 3.2.
The tool I sent is still under development and I did not implement support for all possible hardware yet. I'll add support for more hardware later.
The attachments from John gave us a clue. I'll try to provide a patch based on it.
Three debug patches to be attached. Please try 0001 first. If that doesn't fix the problem, try 0002 (apply it on top of 0001) and then 0003. If you're using Ironlake (core i3/5/7) or newer then patch 0002 shouldn't make a difference.
Created attachment 72693 [details]
patch 0001: first blind attempt to fix the problem...
Created attachment 72694 [details]
patch 2 (for gen4 and before)
Created attachment 72695 [details]
patch 3: may have the same effect as running intel_infoframes?
(In reply to comment #35)
> When executing it when the screen is in a bad state it takes about 5 seconds
> for the command to complete.
Could you please try to discover which line inside intel_infoframes.c takes so much time to finish? I'd add a few gettimeofday() inside load_infoframe()...
Patch 1 didn't make a difference
Patch 2 wasn't applied since I have SNB i5-2400S
Patch 3 didn't make a difference
intel_infoframes -d fixes things once again
Please note that my setup is Computer <--HDMI--> Receiver <--HDMI--> 1080p TV if that makes any difference!?
I ran intel_infoframes under callgrind but it seems most of the time was spent in LD. Don't know why, I'll try to see what is taking time in load_infoframe() instead.
Perhaps I should try with a 3.1 kernel instead since someone threw in several buckets with pink color in the 3.2 series.
Created attachment 72786 [details]
handle sdvo hdmi input timings correctly
This patch fixes an issue with how we set up the encoder input timings and might help for your configuration. Please test it.
I've tested the above patch from Daniel Vetter and can confirm that this doesn't fix the issue for me (Intel G35, vanilla 3.3.1). Also tested the other three patches above with the same results as John.
Daniel, the "handle sdvo hdmi input timings" doesn't help me either (SNB).
Created attachment 73069 [details]
proposed patches
Hi
Could you please test the 9 patches in the attached .tar.gz? They fix many different possible theoretical bugs, and one of them might be the cause.
First of all, please apply all the patches and test the whole set.
Then, if the problem is fixed and you're a patient person with a lot of free time to help Kernel development, you could try to "reverse bisect" and discover which of the patches really fixes the problem. Then I could maybe think about submitting this specific patch to the stable kernels.
Thanks,
Paulo
Hello,
Unfortunately this doesn't change anything.
This is what I did:
git clone git://people.freedesktop.org/~danvet/drm-intel
cd drm-intel
git apply ~/patches/gen4-infoframes/000*
I've done drm.debug=0xe, regdump, infoframes, dmesg, xrandr logs of the 3.4.0-rc2 kernel where I get a pink.jpg boot. I've also added the same info with 3.1.0 kernel that boots fine.
Also note, with 3.1.0, somehow I wasn't able to recover with "intel_infoframes -d" after a pm-suspend. I think I was able to do this before with this kernel.
-- john
Created attachment 73103 [details]
Tests with 3.1.0 (last kernel to boot fine)
Created attachment 73104 [details]
Tests with 3.4.0-rc4 ( git://people.freedesktop.org/~danvet/drm-intel)
Created attachment 73105 [details]
good colors
Created attachment 73106 [details]
R+B is pink
Created attachment 73107 [details]
R is pink, G green, B black
(In reply to comment #49)
> Hello,
>
> Unfortunately this doesn't change anything.
>
> This is what I did:
> git clone git://people.freedesktop.org/~danvet/drm-intel
> cd drm-intel
> git apply ~/patches/gen4-infoframes/000*
Did you switch to the drm-intel-next-queued branch?
I'll provide a new round of patches soon: I found a bug related to InfoFrames I can reproduce on one of my machines. It also gets fixed when I run intel_infoframes -d, so I really hope it is the same problem you're having :)
bahh, forgot to switch to the drm-intel-next-queued branch. Should I redo the tests for this branch or can I wait for your new patches?
Created attachment 73112 [details]
patch 1/2 of the fix
Ignore the previous patches. Please test this and the next one against the drm-intel-next-queued branch of danvet's drm-intel tree.
These two patches solved the following problem:
- Boot IVB with just a single HDMI monitor (no LVDS), the infoframes are not being sent.
Created attachment 73113 [details]
patch 2/2 of the fix
Boot still shows pink screen for me. *BUT*, pm-suspend and a resume finally looks great!
I've captured logs for this (3.4.0-rc3+ Tests)
- dmesg, xrandr, regdump, infoframes after a pink boot which infoframes -d fixes
- then logs after a pm-suspend/resume cycle
Then I did a reboot to get to pink display. Then straight to pm-suspend/resume and captured logs "resume2".
So now the only problem for me is the pink screen on boot that was introduced between 3.1.0 and 3.2.0. Any ideas for this?
Created attachment 73119 [details]
3.4.0-rc3+ Tests after Paulo's fix
Created attachment 73152 [details]
New set of patches
New set of patches to test. Patch 1 is for another bug, but should be harmless.
My machine is fixed with these (too). Please test.
Thank you,
Paulo
Kernel won't boot for me (I get an error about "unable to process initqueue"). I had this while testing the previous patch as well. But the screen was pink all the way to the crash. But perhaps the patches didn't do anything regarding the boot?
Okay, I've just tried it again on current drm-intel-next-queued branch and now it boots fine.
Still pink boot. But pm-suspend & resume once again gets me back to a good looking desktop.
(In reply to comment #63)
> Okay, I've just tried it again on current drm-intel-next-queued branch and now
> it boots fine.
>
> Still pink boot. But pm-suspend & resume once again gets me back to a good
> looking desktop.
Running intel_infoframes -d still fixes your problem?
I have an even newer set of patches to test...
I'm sending these patches to the intel-gfx mailing list. I'll attach them here. Remove all the previous patches, apply these and test, please. This should still apply on top of danvet's drm-intel-next-queued branch.
Created attachment 73188 [details]
New set of patches
Apply on drm-intel-next-queued. Remove all the previous patches.
Same as before. Pink screen on boot which infoframes -d fixes. Display comes back fine after a pm-suspend/resume cycle.
Created attachment 73260 [details]
dmesg from 3.4.0-rc3
For easier reference, the latest dmesg as a simple text/plain attachment.
Ok, this bug is a giant mess where tons of different reporters mix in tons of different issues. John Obaterspok, can you please file a new bug report for your problem? According to your dmesg, you have a sandybridge whereas the original reporter which filed this regression has a G35. Note that hdmi support works _completely_ different on these two chips, so there's zero chance you have the same problem. Because this tracks a regression, I'll keep this bug report around and just kick out any attachments not by the original reporter (or patches not for the original reporter).
To everyone else: Please file a separate bug report to avoid further confusion, thanks.
Small note: If you _have_ a g35 with hdmi issues, you can stay on this obviously until we've figured it out.
Created attachment 73263 [details]
fixup sdvo avi infoframe support
Ok, I've fixed this mess up, tested on my g33 and it seems to work. Tested-by on other sdvo hdmi machines still highly appreciated. Note that the patch applies on top of drm-intel-next-queued branch at http://cgit.freedesktop.org/~danvet/drm-intel/
Sorry that it took so long to properly fix this regression.
Given that the patch works for me and I could reproduce the green monitor issue, I'll mark this as 'patch available'. Please yell if this doesn't work for you.
Tested-by: Peter Ross <pross@xvid.org> (G35 SDVO-HDMI)
All those that don't have their systems fixed by Daniel's patch, please move to this bug: https://bugzilla.kernel.org/show_bug.cgi?id=43256
This patch solves the problem for me on G35 (same as initial reporter). Sorry if spamming, just wanted to add a third party confirmation.
A patch referencing this bug report has been merged in Linux v3.5-rc1:
commit 81014b9d0b55fb0b48f26cd2a943359750d532db
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat May 12 20:22:00 2012 +0200
    drm/i915: fixup infoframe support for sdvo
I'm still seeing the same issue with Linux 3.6.2 on my Asus P5E-VM HDMI. The TV screen is tinted green at all times.
Created attachment 83441 [details]
drm/i915/sdvo: do not send uninitialized memory in infoframe
The attached patch fixes the issue for me. As far as I can tell, the last SDVO_CMD_SET_HBUF_DATA command used uninitialized memory as sdvo_data was not a multiple of 8. According to the HDMI spec, the reserved bytes (PB14-PB27) must be set to 0.
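The failure described here (a frame whose size is not a multiple of 8 being sent in 8-byte chunks, so the last chunk picks up stale bytes) can be illustrated outside the driver. The sketch below is not the attached patch. It assumes a hypothetical send_chunk callback standing in for one SDVO_CMD_SET_HBUF_DATA write, an assumed upper bound HBUF_MAX for the buffer, and made-up names throughout (write_infoframe_padded, capture, capture_chunk). The point it shows is only the technique: copy the frame into a zero-filled buffer rounded up to the chunk size before sending, so the padding is always 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HBUF_CHUNK 8   /* bytes per write, as described in the comment above */
#define HBUF_MAX   64  /* assumed upper bound for this sketch */

/* Hypothetical stand-in for one 8-byte write to the encoder; returns 0 on success. */
typedef int (*send_chunk_fn)(const uint8_t *chunk, void *ctx);

/*
 * Send `len` bytes of `frame` in HBUF_CHUNK-sized pieces. The frame is first
 * copied into a zeroed buffer, so the tail of the last chunk is always 0
 * rather than whatever happened to follow the frame in memory.
 * Returns the number of chunks sent, or -1 on error.
 */
int write_infoframe_padded(const uint8_t *frame, size_t len,
                           send_chunk_fn send, void *ctx)
{
    uint8_t buf[HBUF_MAX];
    size_t padded = (len + HBUF_CHUNK - 1) / HBUF_CHUNK * HBUF_CHUNK;
    size_t off;

    assert(frame != NULL && send != NULL);
    if (len == 0 || padded > sizeof(buf))
        return -1;

    memset(buf, 0, sizeof(buf));   /* clear the entire buffer first */
    memcpy(buf, frame, len);

    for (off = 0; off < padded; off += HBUF_CHUNK) {
        if (send(buf + off, ctx) != 0)
            return -1;
    }
    return (int)(padded / HBUF_CHUNK);
}

/* Test helper: records every chunk so the bytes "on the wire" can be inspected. */
struct capture {
    uint8_t data[HBUF_MAX];
    size_t len;
};

int capture_chunk(const uint8_t *chunk, void *ctx)
{
    struct capture *cap = ctx;

    if (cap->len + HBUF_CHUNK > sizeof(cap->data))
        return -1;
    memcpy(cap->data + cap->len, chunk, HBUF_CHUNK);
    cap->len += HBUF_CHUNK;
    return 0;
}
```

With a 17-byte frame (4 header bytes plus 13 payload bytes) this sends three chunks, and bytes 17 to 23 of what was sent are zero regardless of what the caller's memory contained.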
Created attachment 83901 [details]
patch for Jürg to clear the entire infoframe buffer
(In reply to comment #78)
> Created an attachment (id=83901) [details]
> patch for Jürg to clear the entire infoframe buffer
Works fine on top of 3.6.2, thanks.
Tested-by: Jürg Billeter <j@bitron.ch>
Thanks for testing. Out of curiosity, can you please attach full dmesg with drm.debug=0xe, so that I can check the buffer sizes?
Created attachment 84131 [details]
dmesg with debug information
A patch referencing this bug report has been merged in Linux v3.7-rc4:
commit b6e0e543f75729f207b9c72b0162ae61170635b2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sun Oct 21 12:52:39 2012 +0200
    drm/i915: clear the entire sdvo infoframe buffer