Bug 46761
Description
Sudaraka Wijesinghe
2012-08-31 10:20:00 UTC
Created attachment 78881 [details]
xrandr output from 3.4.9 where everything works fine
Created attachment 78891 [details]
lspci output
Two things:
- Can you please boot with drm.debug=0xe on both 3.4.x and 3.5.y and attach the full dmesg?
- Can you try to bisect this issue? Knowing the commit that introduced a problem usually helps enormously in tracking down a regression.

Created attachment 78941 [details]
dmesg output from 3.4.9 with drm-debug=0xe where monitor gets a signal
Created attachment 78951 [details]
dmesg output from 3.6-rc3 with drm-debug=0xe where monitor is not getting a signal
Daniel,

I have attached the two dmesg outputs as you asked.

For the bisect thing, I will need more assistance on how to do it. I'm using the source downloaded from kernel.org; do I need to switch to the git version? (which is fine)

Thanks.

> --- Comment #6 from Sudaraka Wijesinghe <sudaraka.wijesinghe@gmail.com> 2012-08-31 16:50:52 ---
> For the bisect thing, I will need more assistance on how to do it. I'm using
> the source downloaded from kernel.org; do I need to switch to the git
> version? (which is fine)

Yes. See

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

for a nice howto.

Thank you for the info. Here is the bisect result:

dc257cf154be708ecc47b8b89c12ad8cd2cc35e4 is the first bad commit

Hm, that's strange that you've hit this merge commit. Can you pls double-check whether the two immediate parents really both work? You can check them out from your git repo with

$ git checkout 5bc69bf

and

$ git checkout d48b97b

Maybe also double-check that dc257cf154be708ecc47b8b89c12ad8cd2cc35e4 really is bad. If this is indeed the case, we're pretty much back to square one :( The merge doesn't touch the hdmi support, and hence would only inflict a timing change.

It's possible that I took a wrong turn in the first bisect, during which I hit the commit b3daeaef559d87b974c13a096582c5c70dc11061, where output was fine via HDMI but the output on the laptop screen was messed up; the pixels looked out of sync with each other. The first time I marked this as bad and ended up with the commit I mentioned above. This time I marked it as good (since output on the external monitor was fine) and moved on to commit 64172ccbe22ad85f9e6ac7ec47af2a350f00e0da.

When building at this commit, I get a build error which seems to come from an unrelated module but keeps me from continuing.

ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

> --- Comment #10 from Sudaraka Wijesinghe <sudaraka.wijesinghe@gmail.com> 2012-09-01 20:21:05 ---
> It's possible that I took a wrong turn in the first bisect, during which I
> hit the commit b3daeaef559d87b974c13a096582c5c70dc11061, where output was
> fine via HDMI but the output on the laptop screen was messed up; the pixels
> looked out of sync with each other.

Yeah, if you do larger bisects it sometimes happens that other issues (like this one or the compile issue below) get in the way. It's important that you ignore these as much as possible. If disabling other modules (like below) or ignoring unrelated issues isn't possible, you can tell git bisect that you can't test this commit with

$ git bisect skip

git will then look for another commit to test.

> When building at this commit, I get a build error which seems to come from an
> unrelated module but keeps me from continuing.
>
> ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2

You can work around this specific issue by disabling a.out support (which you very likely don't need - it's backwards compat stuff for ~15 year old binaries ...). Just disable CONFIG_BINFMT_AOUT.

Thanks for the info, I hope the following will be useful. This is the bad commit I got from the bisect.

4e89ee174bb2da341bf90a84321c7008a3c9210d is the first bad commit
commit 4e89ee174bb2da341bf90a84321c7008a3c9210d
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date:   Fri May 4 17:18:26 2012 -0300

    drm/i915: set the DIP port on ibx_write_infoframe

    Just like Gen 4, IBX has a "Port Select" field on the DIP register, but the ports are different.

    Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 dbe82f98dd2286daa19238da4d5a025962fcfc60 431089b262fa4b2e7fb4ad673e0b264285227597 M drivers

Created attachment 78981 [details]
fix-up revert
Yeah, that commit makes much more sense. To double-check, can you please test whether the attached manual revert (there was a small conflict) applied on top of 3.5 makes hdmi work for you again?
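The bisect procedure discussed above, including what `git bisect skip` does to the result, can be illustrated with a small simulation. This is only a sketch, not how git itself is implemented: it assumes a linear history in which every commit after the first bad one is also bad, and the name `bisect_first_bad` and the "good"/"bad"/"skip" verdict strings are made up for illustration.

```python
from typing import Callable, List


def bisect_first_bad(commits: List[str], test: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit by repeated halving.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. `test(commit)` returns "good", "bad" or
    "skip" (the commit cannot be tested, e.g. it does not build).
    Returns the remaining candidates for the first bad commit.
    """
    good, bad = 0, len(commits) - 1  # indices of the known-good / known-bad ends
    skipped = set()
    while True:
        # Commits strictly between the boundaries that can still be tested.
        untested = [i for i in range(good + 1, bad) if i not in skipped]
        if not untested:
            break
        mid = (good + bad) // 2
        # Test the untested commit closest to the midpoint.
        pick = min(untested, key=lambda i: abs(i - mid))
        verdict = test(commits[pick])
        if verdict == "good":
            good = pick
        elif verdict == "bad":
            bad = pick
        else:
            skipped.add(pick)
    # One entry means the culprit is pinned down; more than one means
    # skipped commits next to the boundary leave the answer ambiguous.
    return commits[good + 1 : bad + 1]


# Hypothetical history c0 (oldest) .. c9 (newest); the regression lands in c6.
history = [f"c{i}" for i in range(10)]
print(bisect_first_bad(history, lambda c: "bad" if int(c[1:]) >= 6 else "good"))
```

Marking a commit "bad" because of an unrelated problem (as happened with b3daeaef in the first bisect run) moves the bad boundary too early, which is how a bisect can end on an innocent commit.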
Yes, hdmi works with 3.5.y after applying that patch.

Reassigning the bug to me since I'm the author of the "regression".

1 - Is this an HDMI or DVI monitor? Is your laptop port an HDMI port or a DVI port? Is there any kind of HDMI/DVI converter/adaptor being used? Any man-in-the-middle devices between the monitor and the computer?

2 - After booting the "bad kernel", is there any difference if you turn your monitor off and then on?

3 - Please download and compile intel-gpu-tools (http://cgit.freedesktop.org/xorg/app/intel-gpu-tools).

4 - Please boot the "bad kernel", then run "sudo ./tools/intel_infoframes -d". Please paste the output here.

5 - Then run "sudo ./tools/intel_infoframes -t A -x -t B -x". Then turn the monitor off, then turn the monitor on. Does it solve anything?

6 - Since you already have the "good kernel" compiled (with Daniel's patch), can you please also boot it, run "sudo ./intel_infoframes -d" and paste the output here?

It smells like maybe your monitor does not like/want infoframes. It took me many hours to make infoframes work :)

> 1 - Is this an HDMI or DVI monitor? Is your laptop port an HDMI port or a DVI
> port? Is there any kind of HDMI/DVI converter/adaptor being used? Any
> man-in-the-middle devices between the monitor and the computer?

The monitor supports HDMI, DVI and VGA. In this case it is connected directly to my laptop's HDMI port via cable (no intermediate devices). (My monitor is an Acer S240HL Abid, if the specifics help.)

> 2 - After booting the "bad kernel", is there any difference if you turn your
> monitor off and then on?

Tried this with no luck.

> 4 - Please boot the "bad kernel", then run "sudo ./tools/intel_infoframes -d".
> Please paste the output here.

Please see the attached intel_infoframes-linux-3.5.3-with-4e89ee-commit.txt

> 5 - Then run "sudo ./tools/intel_infoframes -t A -x -t B -x". Then turn the
> monitor off, then turn the monitor on. Does it solve anything?

Yes. Once I ran this, the display came up on the monitor (without turning it off and on again).

> 6 - Since you already have the "good kernel" compiled (with Daniel's patch),
> can you please also boot it, run "sudo ./intel_infoframes -d" and paste the
> output here?

Please see the attached intel_infoframes-linux-3.5.3-without-4e89ee-commit.txt

> It smells like maybe your monitor does not like/want infoframes. It took me
> many hours to make infoframes work :)

Appreciate all your work, hope we can get this resolved soon. Thanks.

Created attachment 79241 [details]
intel_infoframes output when no signal on monitor
Created attachment 79251 [details]
intel_infoframes-linux-3.5.3-with-4e89ee-commit.txt
Created attachment 79261 [details]
intel_infoframes-linux-3.5.3-without-4e89ee-commit.txt
So, if I understood things correctly, it seems your monitor does not like HDMI infoframes. This creates a problem because, as far as we have checked, in some use cases we are required to send infoframes, and in your case we are required not to send them :(

One idea would be to check whether sending extra infoframe information makes your monitor start working, but this requires some time playing with the intel_infoframes tool on your specific monitor, which I don't have.

Another idea is to check whether there's something in the EDID saying "please don't send infoframes". A quick check yesterday did not show anything.

If we just revert the patch we'll have other users reporting that the "revert" is causing them problems because now there are no infoframes and so they get black screens :(

> One idea would be to check whether sending extra infoframe information makes
> your monitor start working, but this requires some time playing with the
> intel_infoframes tool on your specific monitor, which I don't have.

I understand. Would it be possible for me to attempt this, given that I have no idea what infoframes are? (I don't mind reading some documentation if it can be done that way.)

> If we just revert the patch we'll have other users reporting that the "revert"
> is causing them problems because now there are no infoframes and so they get
> black screens :(

I agree. I also haven't met anyone having the same problem in any of the forums where I posted about this issue, so I don't think reverting is an option. However, can we have a config parameter to stop sending the infoframes so people like me can turn it off? Or is that too much trouble for nothing?

(In reply to comment #21)
> I understand. Would it be possible for me to attempt this, given that I have
> no idea what infoframes are? (I don't mind reading some documentation if it
> can be done that way.)

It would be necessary to read the CEA-861 and the HDMI specifications, but they cost money to obtain.

> However, can we have a config parameter to stop sending the infoframes
> so people like me can turn it off? Or is that too much trouble for nothing?

Having a parameter wouldn't prevent the black screens. First you get a black screen and then you learn about the parameter and use it. We should work on a solution that lets us decide automagically whether or not to send infoframes. The parameter should really be the last thing to attempt.

Testing the same machine on different monitors and the same monitor on different machines would also help.

Can you please test the patch at

https://patchwork.kernel.org/patch/1386831/

Thanks, Daniel

> Can you please test the patch at
>
> https://patchwork.kernel.org/patch/1386831/
This patch did not solve the issue.
I did reverse the last patch you gave me before applying this. I hope
that's the correct way to do it?
On the "bad kernel", *none* of the modes work?

Please boot the machine on the bad Kernel, then use xrandr to try to switch between the different modes. Do all modes give you a black screen?

Thanks,
Paulo

Another test:

Boot the "bad Kernel", then put your monitor on the native 1920x1080 mode (the one that should be the default), then run:

sudo ./intel_infoframes -t B -f AVI -c 'VIC 16'

Does it work?

After this, please run:

sudo ./intel_infoframes -t B -f SPD -n

Does it work?

Then:

sudo ./intel_infoframes -t B -f AVI -c 'ITC 1'

Does it work?

Then:

sudo ./intel_infoframes -t B -f AVI -n

Does it work?

Thanks,
Paulo

> Please boot the machine on the bad Kernel, then use xrandr to try to switch
> between the different modes. Do all modes give you a black screen?
>
Yes, all modes get the blank screen.
> Boot the "bad Kernel", then put your monitor on the native 1920x1080 mode (the
> one that should be the default), then run:
>
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 16'

This works (monitor turns on).

> After this, please run:
>
> sudo ./intel_infoframes -t B -f SPD -n

This keeps the monitor in the same (working) condition as above.

> Then:
>
> sudo ./intel_infoframes -t B -f AVI -c 'ITC 1'

This keeps the monitor in the same (working) condition as above.

> Then:
>
> sudo ./intel_infoframes -t B -f AVI -n

This keeps the monitor in the same (working) condition as above.

(In reply to comment #28)
> > sudo ./intel_infoframes -t B -f AVI -c 'VIC 16'
>
> This works (monitor turns on).

This is interesting news :)

So previously we discovered that if we turn off the infoframes your monitor works. Now we discovered that if we set the VIC field of the AVI infoframe to non-zero your monitor works. Your monitor is the first one I have seen that requires the VIC to be non-zero, so I'll ask for some more tests just to confirm.

So, after you set the VIC to 16, can you please run these commands and tell me what happens after each of them?

sudo ./intel_infoframes -t B -f AVI -c 'VIC 0'
sudo ./intel_infoframes -t B -f AVI -c 'VIC 1'
sudo ./intel_infoframes -t B -f AVI -c 'VIC 15'
sudo ./intel_infoframes -t B -f AVI -c 'VIC 16'

(you can also play with other values if you want... feel free)

So basically the CEA-861 specification defines dozens of modes and gives each mode a VIC (Video Identification Code), which is basically an ID for the mode. The spec says that if you're using one of the modes provided by the CEA-861 spec you "shall" (mandatory) set its VIC in the VIC field of the AVI Infoframe. But the spec also says that setting the VIC to 0 means you're using some other mode not specified by the CEA-861 spec. So far, all monitors we saw work with the VIC always set to 0. It also seems that your xrandr shows some modes that are not defined by CEA-861; I wonder why they don't work when we set VIC to 0.

So basically what we need to do to at least fix some of your modes is to write code that correctly sets the VIC value according to the spec.

> > sudo ./intel_infoframes -t B -f SPD -n
>
> This keeps the monitor in the same (working) condition as above.

Oh, I was not exactly expecting the last command to work. So, what happens if you're in the "bad state" and run this?

> --- Comment #29 from Paulo Zanoni <przanoni@gmail.com> 2012-09-15 19:45:54 ---
> So, after you set the VIC to 16, can you please run these commands and
> tell me what happens after each of them?
>
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 0'
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 1'
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 15'
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 16'
>
> (you can also play with other values if you want... feel free)

I ran the above command with different random values for VIC between 0 and 79, and the monitor turned on for all of them (including VIC 0).

I checked the value of VIC after booting with the "bad kernel" (while the monitor shows no signal) and it was 0; setting the VIC to 0 again with the command above turned the monitor on. I don't know if this means anything, but it feels weird to me.

Also, I noticed that even with the "bad kernel", after I "fix" the monitor with the command above (or the ones that work below), rebooting the laptop still keeps the monitor in the good state. It goes to the bad state after I turn the laptop off and on again.

> Oh, I was not exactly expecting the last command to work. So, what happens if
> you're in the "bad state" and run this?

You were right about this: when I ran this command while the monitor was in the bad state, it did not change anything.

> > sudo ./intel_infoframes -t B -f AVI -c 'ITC 1'
> > sudo ./intel_infoframes -t B -f AVI -n

Both of the above commands also worked when I ran them while the monitor was in the bad state.

Oh, maybe I was wrong in comment #29. This time your bug looks more like the bugs I saw in the past :)

What I saw in the past was that just running "intel_infoframes -d" would make things work. In your case, as far as I understood, just running it won't help, but setting some value of the AVI infoframe (even if it's a value that's already being used) will solve it.

So a final question just to confirm:

After booting into the "bad state", running

sudo ./intel_infoframes -d

*won't* make the monitor work, but running:

sudo ./intel_infoframes -t B -f AVI -c 'VIC 0'

will make it work?

If I'm right, please save the output of the first "intel_infoframes -d" (in the bad state), then after the second command (where you set VIC to 0) can you please run "intel_infoframes -d" again and attach it here?

Thanks a lot for your help! Having bug reporters like you makes debugging much more fun and easier :)

Created attachment 80321 [details]
intel_infoframes output right after booting into "bad kernel"
Created attachment 80331 [details]
intel_infoframes output right after running 'VIC 0' command
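For readers unfamiliar with the VIC and ITC fields discussed in the comments above, here is a minimal sketch of an AVI InfoFrame as laid out by CEA-861: a three-byte header (type 0x82, version 2, length 13), a checksum byte, and 13 data bytes. The function name `build_avi_infoframe` is made up for illustration, all fields other than VIC and ITC are left at zero, and this is the generic packet layout, not the i915 register layout (where, per the discussion in this bug, the hardware generates an ECC byte).

```python
AVI_TYPE = 0x82     # InfoFrame type code for AVI
AVI_VERSION = 0x02  # AVI InfoFrame version 2
AVI_LENGTH = 13     # number of data bytes (0x0d)


def build_avi_infoframe(vic: int, itc: bool = False) -> bytes:
    """Build an AVI InfoFrame with only the VIC and ITC fields filled in."""
    if not 0 <= vic <= 0x7F:
        raise ValueError("VIC must fit in 7 bits")
    payload = bytearray(AVI_LENGTH)  # data bytes 1..13, all zero
    if itc:
        payload[2] |= 0x80           # data byte 3, bit 7: IT content flag
    payload[3] = vic                 # data byte 4, bits 0-6: VIC
    header = bytes([AVI_TYPE, AVI_VERSION, AVI_LENGTH])
    # The checksum makes the sum of every byte in the packet 0 modulo 256.
    checksum = -(sum(header) + sum(payload)) & 0xFF
    return header + bytes([checksum]) + bytes(payload)


# VIC 16 is the CEA-861 code for 1920x1080p at 60 Hz, the native mode here.
frame = build_avi_infoframe(vic=16)
print(frame.hex())
```

The first three bytes (82 02 0d) correspond to the low bytes of the e40d0282 value quoted later in this report.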
> After booting into the "bad state", running
>
> sudo ./intel_infoframes -d

Attached: https://bugzilla.kernel.org/attachment.cgi?id=80321

> *won't* make the monitor work, but running:
>
> sudo ./intel_infoframes -t B -f AVI -c 'VIC 0'

Attached: https://bugzilla.kernel.org/attachment.cgi?id=80331

> will make it work?

Yes, I can confirm this. Every time I run intel_infoframes with the VIC parameter (even if it has the current value), the monitor turns on from its "no signal" state.

> Thanks a lot for your help! Having bug reporters like you makes debugging much
> more fun and easier :)

You are welcome. I'm glad to help out, and it also gave me a chance to learn a few tricks :)

And thank you both, Paulo and Daniel, for your time and effort in solving my issue and guiding me through the process so I could provide you with the information you need.

Created attachment 80591 [details]
Fix attempt and debug info

Can you please try this patch? It should apply against Torvalds' 3.6 tree.

Ok, so here are the current theories:

- By looking at the diff between the attachments in comments #32 and #33 it seems that for some reason your ECC is wrong and your monitor does not like the wrong ECC, which makes sense.
- The ECC is generated by the hardware and should be read-only, so it's not like we are setting the wrong ECC value.
- On my computer (where infoframes are supposed to work), when I dump the infoframes, byte 0 is e40d0282 and byte 8 is 0000005f, which matches attachment 80331 [details] from comment #33, which should be correct.
- From the fact that running 'intel_infoframes -d' does not fix your problem but running 'intel_infoframes -t B -f AVI -c 'VIC 0'' does, it seems there's a difference between the implementation in intel_infoframes and the implementation in the Kernel.

So this patch tries to do a few things:

- Add some debug messages
- Make sure you're not leaving the function on the "default" case
- Write 8 infoframe bytes instead of "only the necessary". This is what the intel_infoframes tool does, and maybe it is needed in order to generate the correct ECC value?

Can you please boot this patch with drm.debug=0xe and attach the dmesg?

Let's hope this works..

Created attachment 80651 [details]
dmesg with patch from comment #35

dmesg output attached.
https://bugzilla.kernel.org/attachment.cgi?id=80651

The monitor is working when I boot with this patch.

Created attachment 80661 [details]
Final patch 1/2
Created attachment 80671 [details]
Final patch 2/2
Hi
Can you please remove any patches you may have previously applied and then apply these 2 patches to the 3.6 tree and test them?
These are the patches I plan to send to the official Kernel trees, so having them tested by you is the final step.
Thank you for your patience and your help!
Paulo
Yes, this patch solves the issue I had with my monitor. Thank you.

Marking this bug as Resolved as the fix/patch is now available in the 3.6.2 stable kernel. Thanks Paulo, Daniel and others.

A patch referencing this bug report has been merged in Linux v3.7-rc1:

commit adf00b26d18e1b3570451296e03bcb20e4798cdd
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date:   Tue Sep 25 13:23:34 2012 -0300

    drm/i915: make sure we write all the DIP data bytes

A patch referencing a commit referencing this bug report has been merged in Linux v3.7-rc4:

commit b6e0e543f75729f207b9c72b0162ae61170635b2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sun Oct 21 12:52:39 2012 +0200

    drm/i915: clear the entire sdvo infoframe buffer
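The idea behind the fix discussed in this report (write the whole infoframe data buffer rather than only the bytes the infoframe occupies) can be sketched as follows. This is an illustration only, not the kernel code: the 32-byte buffer size, the little-endian 32-bit packing, and the name `pack_dip_dwords` are assumptions made for the example.

```python
import struct

DIP_BUFFER_SIZE = 32  # assumed buffer size in bytes, for illustration only


def pack_dip_dwords(frame: bytes, size: int = DIP_BUFFER_SIZE) -> list:
    """Pack infoframe bytes into 32-bit words, zero-filling the whole buffer.

    Writing zeros for the unused tail means no stale data is left behind
    from whatever was in the buffer before.
    """
    if size % 4 != 0:
        raise ValueError("buffer size must be a multiple of 4 bytes")
    if len(frame) > size:
        raise ValueError("infoframe does not fit in the buffer")
    padded = frame.ljust(size, b"\x00")  # zero-fill up to the full size
    return list(struct.unpack("<%dI" % (size // 4), padded))


# Header bytes 82 02 0d followed by e4 give the e40d0282 word quoted above.
words = pack_dip_dwords(bytes([0x82, 0x02, 0x0D, 0xE4]))
print([f"{w:08x}" for w in words])
```

Every word in the buffer is produced, so a caller that writes the returned list to hardware always touches the full buffer regardless of the infoframe length.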