Bug 43154 - [965GM] 2px line on external HDMI-connected monitor (SDVO hdmi/audio woes)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri-intel@kernel-bugs.osdl.org
Reported: 2012-04-23 21:36 UTC by Karolis
Modified: 2013-01-08 20:58 UTC

Regression: Yes


Attachments
dmesg output (81.49 KB, text/plain)
2012-04-23 21:36 UTC, Karolis
dmidecode output (12.00 KB, text/plain)
2012-04-23 21:36 UTC, Karolis
lspci -vvnn output (15.75 KB, text/plain)
2012-04-23 21:37 UTC, Karolis
picture of issue (759.69 KB, image/jpeg)
2012-04-24 08:27 UTC, Daniel Vetter
dmesg output with drm.debug=0xe (65.22 KB, text/plain)
2012-04-24 23:38 UTC, Karolis
xrandr output (4.85 KB, text/plain)
2012-04-24 23:39 UTC, Karolis
dmesg output with drm.debug=0xe (152.42 KB, text/plain)
2012-05-06 08:47 UTC, Karolis
clear the entire infoframe buffer (4.24 KB, patch)
2012-10-20 14:01 UTC, Daniel Vetter
clear the entire infoframe buffer: updated version (4.75 KB, patch)
2012-10-21 14:50 UTC, Daniel Vetter
Patch log (2.80 KB, text/plain)
2012-10-21 17:56 UTC, Karolis

Description Karolis 2012-04-23 21:36:29 UTC
Created attachment 73053
dmesg output

Since upgrading to the 3.2.0-23-generic kernel (tried on Ubuntu 11.10 and 12.04), my external LG IPS235 monitor, connected to a Dell XPS M1330 laptop, displays a 2px purple line (I believe the colour is Ubuntu-specific).

I have also tried upgrading to 3.4 RC3 and got the same issue there. The issue was not present on the same hardware using Ubuntu 11.10 with 3.0 kernel.

Photo of the issue https://launchpadlibrarian.net/102636061/Purple_line.jpg

Also reported here https://bugs.launchpad.net/ubuntu/+source/linux/+bug/985949
Comment 1 Karolis 2012-04-23 21:36:58 UTC
Created attachment 73054
dmidecode output
Comment 2 Karolis 2012-04-23 21:37:34 UTC
Created attachment 73055
lspci -vvnn output
Comment 3 Daniel Vetter 2012-04-24 08:26:17 UTC
Which is the last kernel version that worked? Also a bisect would be highly appreciated if you can do it.

Also, please boot with drm.debug=0xe appended to your kernel cmdline and attach the complete dmesg, along with the output of xrandr --verbose.
Comment 4 Daniel Vetter 2012-04-24 08:27:10 UTC
Created attachment 73061
picture of issue
Comment 5 Karolis 2012-04-24 23:38:53 UTC
Created attachment 73070
dmesg output with drm.debug=0xe
Comment 6 Karolis 2012-04-24 23:39:19 UTC
Created attachment 73071
xrandr output
Comment 7 Karolis 2012-04-24 23:39:53 UTC
The latest working version I used is what Ubuntu reports as 3.0.0-17-generic.

Requested logs attached. Will now attempt bisect.
Comment 8 Karolis 2012-05-05 22:03:01 UTC
384a48d71520ca569a63f1e61e51a538bedb16df is the first bad commit
commit 384a48d71520ca569a63f1e61e51a538bedb16df
Author: Stephen Warren <swarren@nvidia.com>
Date:   Wed Jun 1 11:14:21 2011 -0600

    ALSA: hda: HDMI: Support codecs with fewer cvts than pins
    
    The general concept of this change is to create a PCM device for each
    pin widget instead of each converter widget. Whenever a PCM is opened,
    a converter is dynamically selected to drive that pin based on those
    available for muxing into the pin.
    
    The one thing this model doesn't support is a single PCM/converter
    sending audio to multiple pin widgets at once.
    
    Note that this means that a struct hda_pcm_stream's nid variable is
    set to 0 except between a stream's open and cleanup calls. The dynamic
    de-assignment of converters to PCMs occurs within cleanup, not close,
    in order for it to co-incide with when controller stream IDs are
    cleaned up from converters.
    
    While the PCM for a pin is not open, the pin is disabled (its widget
    control's PIN_OUT bit is cleared) so that if the currently routed
    converter is used to drive a different PCM/pin, that audio does not
    leak out over a disabled pin.
    
    We use the recently added SPDIF virtualization feature in order to
    create SPDIF controls for each pin widget instead of each converter
    widget, so that state is specific to a PCM.
    
    In order to support this, a number of more mechanical changes are made:
    
    * s/nid/pin_nid/ or s/nid/cvt_nid/ in many places in order to make it
      clear exactly what the code is dealing with.
    
    * We now have per_pin and per_cvt arrays in hdmi_spec to store relevant
      data. In particular, we store a converter's capabilities in the per_cvt
      entry, rather than relying on a combination of codec_pcm_pars and
      the struct hda_pcm_stream.
    
    * ELD-related workarounds were removed from hdmi_channel_allocation
      into hdmi_instrinsic in order to simplifiy infoframe calculations and
      remove HW dependencies.
    
    * Various functions only apply to a single pin, since there is now
      only 1 pin per PCM. For example, hdmi_setup_infoframe,
      hdmi_setup_stream.
    
    * hdmi_add_pin and hdmi_add_cvt are more oriented at pure codec parsing
      and data retrieval, rather than determining which pins/converters
      are to be used for creating PCMs.
    
    This is quite a large change; it may be appropriate to simply read the
    result of the patch rather than the diffs. Some small parts of the change
    might be separable into different patches, but I think the bulk of the
    change will probably always be one large patch. Hopefully the change
    isn't too opaque!
    
    This has been tested on:
    
    * NVIDIA GeForce 400 series discrete graphics card. This model has the
      classical 1:1:1 codec:converter:pcm widget model. Tested stereo PCM
      audio to a PC monitor that supports audio.
    
    * NVIDIA GeForce 520 discrete graphics card. This model is the new
      1 codec n converters m pins m>n model. Tested stereo PCM audio to a
      PC monitor that supports audio.
    
    * NVIDIA GeForce 400 series laptop graphics chip. This model has the
      classical 1:1:1 codec:converter:pcm widget model. Tested stereo PCM,
      multi-channel PCM, and AC3 pass-through to an AV receiver.
    
    * Intel Ibex Peak laptop. This model is the new 1 codec n converters m
      pins m>n model. Tested stereo PCM, multi-channel PCM, and AC3 pass-
      through to an AV receiver.
    
    Note that I'm not familiar at all with AC3 pass-through. Hence, I may
    not have covered all possible mechanisms that are applicable here. I do
    know that my receiver definitely received AC3, not decoded PCM. I tested
    with mplayer's "-afm hwac3" and/or "-af lavcac3enc" options, and alsa a
    WAV file that I believe has AC3 content rather than PCM.
    
    I also tested:
    * Play a stream
    * Mute while playing
    * Stop stream
    * Play some other streams to re-assign the converter to a different
      pin, PCM, set of SPDIF controls, ... hence hopefully triggering
      cleanup for the original PCM.
    * Unmute original stream while not playing
    * Play a stream on the original pin/PCM.
    
    This was to test SPDIF control virtualization.
    
    Signed-off-by: Stephen Warren <swarren@nvidia.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>

:040000 040000 894370c6534b1bf03df9a8a8c7d85c2eeffc7555 98cb8a73a0ed46f034e25bd35002930bc22376ef M	sound
Comment 9 Daniel Vetter 2012-05-05 22:44:12 UTC
Hm, the bisect is a fun one, but not really the first one where audio-over-hdmi wreaks havoc. Also, can you please reattach the dmesg - the one you've attached doesn't have drm.debug=0xe set.
Comment 10 Daniel Vetter 2012-05-05 22:45:41 UTC
Also added the folks from that commit, maybe they have an insight.
Comment 11 Karolis 2012-05-06 08:47:33 UTC
Created attachment 73203
dmesg output with drm.debug=0xe

I must have attached the wrong file before. This should be what you asked for.
Comment 12 Takashi Iwai 2012-05-06 10:28:43 UTC
In order to exclude the possible side-effect of HDMI audio, just try to remove snd-hda-codec-hdmi.ko and reboot.  If the problem persists, it has nothing to do with the bisected commit.
Comment 13 Karolis 2012-05-06 10:52:16 UTC
I have blacklisted snd_hda_codec_hdmi and the problem is still there.
Comment 14 Takashi Iwai 2012-05-06 13:52:50 UTC
Just to be sure: the blacklisting via modprobe config might not work for hd-audio codec drivers (since it's loaded from snd-hda-intel in the kernel side), so make sure that it really doesn't appear in lsmod output.
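
A minimal sketch of that check, assuming a POSIX shell. The helper name module_loaded is made up for illustration; it scans a /proc/modules-style listing (the same data lsmod prints) and converts dashes to underscores, since the kernel lists module names with underscores.

```shell
# module_loaded NAME [LISTING]
# Hypothetical helper: succeeds (exit 0) if NAME appears as a loaded module.
# LISTING defaults to /proc/modules; the module name is the first field of each line.
module_loaded() {
    # snd-hda-codec-hdmi.ko shows up as snd_hda_codec_hdmi in the listing
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example use after rebooting with the module blacklisted:
#   module_loaded snd-hda-codec-hdmi && echo "still loaded"
```

Matching on the first field only avoids a false positive from lines that merely list the codec in their "used by" column, which a plain grep for "hdmi" would also match.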
Comment 15 Karolis 2012-05-06 14:13:57 UTC
What I did was add snd_hda_codec_hdmi.blacklist=yes to GRUB_CMDLINE_LINUX; after rebooting I ran "lsmod | grep hdmi" and nothing came up.

Before blacklisting I got:

snd_hda_codec_hdmi     32474  1 
snd_hda_codec         127706  3 snd_hda_codec_hdmi,snd_hda_codec_idt,snd_hda_intel
snd_pcm                97188  4 snd_hda_codec_hdmi,snd_usb_audio,snd_hda_intel,snd_hda_codec
snd                    78855  22 snd_hda_codec_hdmi,snd_hda_codec_idt,snd_usb_audio,snd_usbmidi_lib,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_rawmidi,snd_seq,snd_timer,snd_seq_device

So the only thing for sure is that the issue appeared between 3.0.0-rc1 and 3.0.0-rc2
Comment 16 Daniel Vetter 2012-05-06 14:24:31 UTC
Ok, so we're back to square one for bisecting :( Can you re-try the bisect with the hda module always blacklisted? Hopefully it points to something more interesting ...
Comment 17 Karolis 2012-05-06 14:43:12 UTC
Sure. Can you just confirm whether you have snd_hda_codec_hdmi in mind or all snd_hda_* modules?
Comment 18 Daniel Vetter 2012-05-06 15:47:15 UTC
Blacklisting snd_hda_codec_hdmi alone should be good enough. Just make sure that even with the hdmi codec blacklisted, things still work as expected on -rc1.
Comment 19 Karolis 2012-05-07 12:17:28 UTC
Is blacklisting via kernel boot parameters good enough, or is there a way to blacklist a module at compile time? Sorry if it's a stupid question, but I couldn't find the answer by googling.
Comment 20 Daniel Vetter 2012-05-07 12:28:15 UTC
You can disable all sound modules in the kernel configuration by setting CONFIG_SND=n. Be sure to retest both the good and the bad kernel first, though, to ensure that this doesn't change the behavior.
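
One way to double-check which setting a given build actually used, sketched under the assumption that the kernel's .config file is at hand. config_state is a hypothetical helper; it relies on the usual .config convention that a disabled option appears as a "# CONFIG_... is not set" comment rather than as "=n".

```shell
# config_state OPTION CONFIG_FILE
# Hypothetical helper: prints the value of CONFIG_<OPTION> (e.g. y or m),
# or n when the option is absent or only present as "# CONFIG_... is not set".
config_state() {
    # Only lines of the exact form CONFIG_<OPTION>=<value> count as enabled.
    val=$(sed -n "s/^CONFIG_$1=\(.*\)$/\1/p" "$2" | tail -n 1)
    if [ -n "$val" ]; then
        printf '%s\n' "$val"
    else
        printf 'n\n'
    fi
}

# Example use inside a kernel build tree:
#   config_state SND .config
#   config_state SND_HDA_CODEC_HDMI .config
```

Running this against both the known-good and the known-bad build before comparing them makes it easier to tell a real behaviour change from a configuration mix-up.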
Comment 21 Karolis 2012-05-07 22:36:13 UTC
This is confusing. After setting CONFIG_SND_HDA_CODEC_HDMI=n and compiling 3.0.0-rc1, I get the same issue, even though rc1 was fine before.

Bisecting further...
Comment 22 Karolis 2012-05-12 00:56:38 UTC
v2.6.39-rc1 has the same problem when compiled with CONFIG_SND_HDA_CODEC_HDMI=n

I could not compile v2.6.38 since it throws the following error:

/tmp/ccrooehp.s: Assembler messages:
/tmp/ccrooehp.s: Error: .size expression for do_hypervisor_callback does not evaluate to a constant
make[3]: *** [arch/x86/kernel/entry_64.o] Error 1
make[2]: *** [arch/x86/kernel] Error 2
make[1]: *** [arch/x86] Error 2
make[1]: *** Waiting for unfinished jobs....
  LD      init/mounts.o
  LD      init/built-in.o
make[1]: Leaving directory `/home/karolis/Downloads/linux'
make: *** [debian/stamp/build/kernel] Error 2

It looks to me like SND_HDA_CODEC_HDMI is related to this issue after all.
Comment 23 Daniel Vetter 2012-05-12 14:48:24 UTC
Ok, it sounds like the hdmi sound subsystem is just the messenger and not the problem. Paulo and I are looking into fixing up our sdvo hdmi support - we miss out completely on handling audio properly.

Don't hold your breath though for patches, at our current workload it might take a while :(
Comment 24 Paulo Zanoni 2012-05-21 17:21:28 UTC
Karolis: is your monitor DVI? Are you using some kind of dvi/hdmi cable/converter?

I started getting these pink lines on my machine after I started sending InfoFrames to my DVI monitor... This was an accidental bug in a patch I was writing, and it was not on SDVO.
Comment 25 Karolis 2012-05-21 17:32:59 UTC
I'm not using any converter - it's HDMI on both ends. I tried connecting my laptop to a TV using a different HDMI cable and I get the same result.
Comment 26 Chris Wilson 2012-05-22 15:52:30 UTC
Different hw but similar symptoms: https://bugzilla.kernel.org/show_bug.cgi?id=43272
Comment 27 Daniel Vetter 2012-10-20 14:01:22 UTC
Created attachment 84121
clear the entire infoframe buffer

Sorry for the extremely long delay in any updates for this bug. Can you please test this attached patch? If that doesn't help, I have some further ideas for sdvo hdmi infoframe support, but need a tester with a broken system first ...
Comment 28 Karolis 2012-10-20 15:12:43 UTC
I'm more than happy to help, but I'm not very experienced at this, so you'll have to bear with my simple questions. To begin with: which version of kernel should I apply this patch against?
Comment 29 Daniel Vetter 2012-10-20 15:33:21 UTC
Should apply to 3.6.x series. If it doesn't, I can do a quick backport.
Comment 30 Karolis 2012-10-21 11:24:04 UTC
I've tried applying the patch to 3.6 and 3.6.2 - in both cases the problem still exists. Unless I didn't patch it correctly.
Comment 31 Daniel Vetter 2012-10-21 14:50:17 UTC
Created attachment 84171
clear the entire infoframe buffer: updated version

There was a bug in the first patch, can you please retry with this one here?
Comment 32 Karolis 2012-10-21 17:56:13 UTC
Created attachment 84191
Patch log

Still getting the issue with patched 3.6.2, but I'm not quite confident I applied the patch correctly, so here's the log, just in case.
Comment 33 Daniel Vetter 2012-10-21 21:22:58 UTC
Can you check what happens when you disable the audio output? I.e.

xrandr --output HDMIx --auto --set audio off
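
HDMIx above is a placeholder for the actual output name. A small sketch for finding it, assuming the plain xrandr query format in which each output line starts with "<name> connected" or "<name> disconnected"; connected_hdmi_outputs is a hypothetical helper, and the HDMI name prefix is an assumption that may not hold for every driver.

```shell
# connected_hdmi_outputs
# Hypothetical helper: reads xrandr query output on stdin and prints the
# names of outputs that start with "HDMI" and are reported as connected.
connected_hdmi_outputs() {
    awk '$2 == "connected" && $1 ~ /^HDMI/ { print $1 }'
}

# Example use in a running X session:
#   xrandr | connected_hdmi_outputs
#   xrandr --output HDMI1 --auto --set audio off   # with the name printed above
```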
Comment 34 Karolis 2012-10-21 21:48:54 UTC
On an unpatched 3.5.0-18-generic (ubuntu 12.10) kernel I get the same problem. Do you want me to try the same on a patched kernel?

Also, maybe there's something wrong with my testing method. Since the issue only happens after the monitor wakes up from sleep (2px line is not present upon boot), I send it to sleep using 'xset dpms force off'. The issue also happens if I just wait for it to turn off automatically.
Comment 35 Daniel Vetter 2012-10-21 23:56:44 UTC
On Sun, Oct 21, 2012 at 11:48 PM,  <bugzilla-daemon@bugzilla.kernel.org> wrote:
> --- Comment #34 from Karolis <kpocius@gmail.com>  2012-10-21 21:48:54 ---
> On an unpatched 3.5.0-18-generic (ubuntu 12.10) kernel I get the same
> problem.
> Do you want me to try the same on a patched kernel?

Nope, I need to write more patches first.

> Also, maybe there's something wrong with my testing method. Since the issue
> only happens after the monitor wakes up from sleep (2px line is not present
> upon boot), I send it to sleep using 'xset dpms force off'. The issue also
> happens if I just wait for it to turn off automatically.

Underneath the exact same code gets run, so this is expected. Rather
interesting that the dpms cycle causes the issue, and not just the
modeset at boot-up ...
Comment 36 Daniel Vetter 2012-11-01 22:14:30 UTC
New idea to test: Please grab the latest intel-gpu-tools and run

# intel_reg_read 0x61170

If it's not 0x0, then please clear it with

# intel_reg_write 0x61170 0x0
Comment 37 Karolis 2012-11-01 23:43:25 UTC
I've killed X, then from the login screen switched to shell and ran:

# intel_reg_write 0x61170 0x0

I get:

Value before: 0x600
Value after: 0x600

Doesn't look like I can clear the value. Or am I doing it wrong?
Comment 38 Daniel Vetter 2012-11-02 08:01:23 UTC
(In reply to comment #37)
> Doesn't look like I can clear the value. Or am I doing it wrong?

You did everything right; those bits can't be cleared. In any case, the bios doesn't leave this enabled (bit31 would have indicated that). Thanks for quickly checking this.
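
The bit31 reasoning can be checked with shell arithmetic. This is only a sketch: bit_set is a hypothetical helper, and the sole facts taken from this thread are the value 0x600 read from register 0x61170 and the statement that bit 31 would indicate "enabled".

```shell
# bit_set VALUE BIT
# Hypothetical helper: succeeds (exit 0) if bit number BIT (0 = least
# significant) is set in VALUE. VALUE may be given in hex, e.g. 0x600.
bit_set() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

# 0x600 has only bits 9 and 10 set, so the bit 31 test fails:
if bit_set 0x600 31; then
    echo "bit 31 set"
else
    echo "bit 31 clear"
fi
```

With the value reported in comment 37 this prints "bit 31 clear", which matches the conclusion that the BIOS did not leave the feature enabled.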
Comment 39 Daniel Vetter 2012-12-18 10:58:31 UTC
Can you please test the patch at

https://bugs.freedesktop.org/attachment.cgi?id=71493

quickly? Alternatively you can use the alsa tools to change the settings at runtime, see

https://bugs.freedesktop.org/show_bug.cgi?id=55556
Comment 40 Karolis 2012-12-18 11:09:41 UTC
I'll try to test the patch in the next couple of days, but first, a standard question: which version of kernel should I apply this to?
Comment 41 Daniel Vetter 2012-12-18 11:46:36 UTC
On Tue, Dec 18, 2012 at 12:09 PM,  <bugzilla-daemon@bugzilla.kernel.org> wrote:
> --- Comment #40 from Karolis <kpocius@gmail.com>  2012-12-18 11:09:41 ---
> I'll try to test the patch in the next couple of days, but first, a standard
> question: which version of kernel should I apply this to?

It's an alsa patch, so I dunno how far back this will apply to. Should
work on 3.7 though.
Comment 42 Daniel Vetter 2013-01-07 17:24:48 UTC
Ok, alsa patch has landed as:

commit 6169b673618bf0b2518ce413b54925782a603f06
Author: Takashi Iwai <tiwai@suse.de>
Date:   Fri Dec 14 10:22:35 2012 +0100

    ALSA: hda - Always turn on pins for HDMI/DP

Please retest.
Comment 43 Karolis 2013-01-08 19:42:37 UTC
The issue seems to be resolved.

Do you want me to run any particular tests besides just a general observation?
Comment 44 Daniel Vetter 2013-01-08 20:58:54 UTC
(In reply to comment #43)
> The issue seems to be resolved.
> 
> Do you want me to run any particular tests besides just a general
> observation?

Nah, sounds good enough - the patch fixed similar issues for other people after all. Thanks a lot for reporting this bug and sticking around with us for that long.
