Bug 48911

Summary: Intel HD: Sound is distorted at beginning of stream
Product: Drivers
Reporter: Ralf (post+kernel)
Component: Sound(ALSA)
Assignee: Jaroslav Kysela (perex)
Status: RESOLVED CODE_FIX
Severity: normal
CC: alan, florian, kurt, tiwai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.6.2
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
  dmesg output of affected kernel with CONFIG_SND_DEBUG_VERBOSE=y
  Fix patch for disabling LPIB delay counting for invalid values

Description Ralf 2012-10-16 09:40:32 UTC
Using the 3.6.2 kernel on my Debian Wheezy system, I am having sound issues: The first few seconds of every stream are distorted. If I seek in an audio/video file, it is again distorted for a few seconds before going back to normal. I am using PulseAudio 2.0.

I can work around the issue by passing "tsched=0" to the PulseAudio udev device detector, but then the following error appears in the PA log:

E: [alsa-sink] alsa-sink.c: ALSA woke us up to write new data to the device, but there was actually nothing to write!
E: [alsa-sink] alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_hda_intel'. Please report this issue to the ALSA developers.
E: [alsa-sink] alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

I do not know if that's the same or an independent bug.

None of this (distortion with tsched=1, error message with tsched=0) happens when using the 3.6 kernel, so this is a regression.

All this is on an Asus X53SM. The relevant part of the lspci -v output is:
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 05)
        Subsystem: ASUSTeK Computer Inc. Device 1ac3
        Flags: bus master, fast devsel, latency 0, IRQ 49
        Memory at df000000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [50] Power Management version 2
        Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [130] Root Complex Link
        Kernel driver in use: snd_hda_intel

There was a similar sound issue on my system earlier in the 3.6 development process, see https://bugzilla.kernel.org/show_bug.cgi?id=47461 - maybe that's related.
Comment 1 Ralf 2012-10-16 12:28:23 UTC
Some more testing showed that the following commit introduced this regression:


commit fbd15b54708f20d25e70d40c4035db37fa7c6c2a
Author: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Date:   Fri Sep 21 18:39:06 2012 -0500

    ALSA: hda - use LPIB for delay estimation
Comment 2 Takashi Iwai 2012-10-16 13:11:42 UTC
OK, it's good to know that there is still buggy hardware.
Could you check whether you get a kernel message ("delay xxx > period_bytes yyy") when built with CONFIG_SND_DEBUG_VERBOSE=y?
Comment 3 Ralf 2012-10-16 15:14:37 UTC
Created attachment 83661
dmesg output of affected kernel with CONFIG_SND_DEBUG_VERBOSE=y

> OK, it's good to know that there is still buggy hardware.
I think if there's one thing you can rely on, that's a constant supply of buggy hardware ;-)

> Could you check whether you get a kernel message ("delay xxx > period_bytes
> yyy") when built with CONFIG_SND_DEBUG_VERBOSE=y?
I compiled fbd15b54 with CONFIG_SND_DEBUG_VERBOSE=y and used pulseaudio both with timer-based scheduling (up to [  138.166718]) and interrupt-based scheduling.
In the first case, there's indeed a whole bunch of messages like

delay 65532 > period_bytes 32768

There were no such messages when I used interrupt-based scheduling. The full dmesg output is attached. Is the PulseAudio output of any interest to you?
Comment 4 Takashi Iwai 2012-10-16 15:22:38 UTC
Thanks. This implies that the LPIB value on your machine is not what the delay calculation expects.  Could you try the attached patch?  It should disable the delay calculation once such an unexpected value is hit. It won't avoid the very first problem, but subsequent operation should work, so it should be an improvement.
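
The behaviour described in comment 4 can be sketched in a few lines of C. This is a minimal userspace illustration, not the attached patch: the struct, its fields, and the function name `lpib_delay` are hypothetical, and the modulo-buffer delay formula for playback is an assumption. The warning text follows the kernel message quoted in comment 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-stream driver state (not the real driver struct). */
struct stream_state {
    unsigned int bufsize;      /* ring buffer size in bytes */
    unsigned int period_bytes; /* bytes per period */
    bool count_lpib_delay;     /* cleared once LPIB proves unreliable */
};

/*
 * Return the estimated playback delay in bytes.
 * Assumption: delay is the distance from the LPIB value to the reported
 * position, taken modulo the ring buffer size.
 * Returns 0 when LPIB delay counting is disabled or the value is implausible.
 */
static unsigned int lpib_delay(struct stream_state *s,
                               unsigned int pos, unsigned int lpib)
{
    assert(s->bufsize > 0 && pos < s->bufsize && lpib < s->bufsize);

    if (!s->count_lpib_delay)
        return 0;

    unsigned int delay = (pos + s->bufsize - lpib) % s->bufsize;

    /* A delay of a full period or more is treated as a bogus LPIB reading. */
    if (delay >= s->period_bytes) {
        fprintf(stderr,
                "Unstable LPIB (%u >= %u); disabling LPIB delay counting\n",
                delay, s->period_bytes);
        s->count_lpib_delay = false; /* latch: never trust LPIB again */
        return 0;
    }
    return delay;
}
```

With an assumed 65536-byte buffer and a 32768-byte period, a position of 100 with LPIB at 104 wraps around to a delay of 65532, which trips the check; every later call then returns 0. That is the same magnitude as the value reported in comment 3, but the actual buffer size on that machine is not stated in this report.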
Comment 5 Takashi Iwai 2012-10-16 15:26:26 UTC
Created attachment 83671
Fix patch for disabling LPIB delay counting for invalid values
Comment 6 Ralf 2012-10-16 16:23:02 UTC
Indeed this fixes the distortions when using timer-based mode (I couldn't even hear the one distortion that supposedly still happened).

The PA error when using tsched=0, however, is still present.
Comment 7 Takashi Iwai 2012-10-16 16:28:09 UTC
Good to hear it fixes one side.  I guess the PA error with tsched=0 is not a regression in the 3.6 kernel. There is no big change there except for the recent COMBO mode change and the LPIB delay counting.  Could you check it?  If that's the case, I'm going to push the fix patch for inclusion in 3.7-rc2 (and backport it to the 3.6 stable tree).
Comment 8 Ralf 2012-10-16 19:06:46 UTC
(In reply to comment #7)
> Good to hear it fixes one side.  I guess the PA error with tsched=0 is not a
> regression in the 3.6 kernel. There is no big change there except for the
> recent COMBO mode change and the LPIB delay counting.  Could you check it?
You are right - I thought I checked that, but now I get that error even when booting the 3.2 kernel shipped by Debian. Should I report a bug for this one?

> If that's the case, I'm going to push the fix patch for inclusion in
> 3.7-rc2 (and backport it to the 3.6 stable tree).
Just for the fun of it, I tried reverting your patch for the old sound issue (https://bugzilla.kernel.org/show_bug.cgi?id=47461) on top of 3.6.2, instead of the patch you posted here - which also seems to fix the issue here. Could that be? Have there been changes in that area?
I am currently compiling without the debug options, just to be sure.
Comment 9 Takashi Iwai 2012-10-16 19:21:10 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Good to hear it fixes one side.  I guess the PA error with tsched=0 is not a
> > regression in the 3.6 kernel. There is no big change there except for the
> > recent COMBO mode change and the LPIB delay counting.  Could you check it?
> You are right - I thought I checked that, but now I get that error even when
> booting the 3.2 kernel shipped by Debian. Should I report a bug for this one?

Well, the tsched=0 setup isn't really well supported, and I'll have little time for it, honestly :)

> > If that's the case, I'm going to push the fix patch for inclusion in
> > 3.7-rc2 (and backport it to the 3.6 stable tree).
> Just for the fun of it, I tried reverting your patch for the old sound issue
> (https://bugzilla.kernel.org/show_bug.cgi?id=47461) on top of 3.6.2, instead
> of the patch you posted here - which also seems to fix the issue here. Could
> that be? Have there been changes in that area?
> I am currently compiling without the debug options, just to be sure.

Reverting that commit doesn't change the behavior much for 3.7.
For 3.6, it's a bit different, but I'm afraid it'll make things more complicated.  So, I'd say leave it as is.

The new fix patch is good anyway, not only for your case.  I queued it now.
Comment 10 Ralf 2012-10-16 19:25:15 UTC
(In reply to comment #9)
> > Just for the fun of it, I tried reverting your patch for the old sound issue
> > (https://bugzilla.kernel.org/show_bug.cgi?id=47461) on top of 3.6.2, instead
> > of the patch you posted here - which also seems to fix the issue here. Could
> > that be? Have there been changes in that area?
> > I am currently compiling without the debug options, just to be sure.
> 
> Reverting that commit doesn't change the behavior much for 3.7.
> For 3.6, it's a bit different, but I'm afraid it'll make things more
> complicated.  So, I'd say leave it as is.
Okay. I just thought "removing a quirk can never be a bad thing", but I don't know at all what all this code really does ;-)


> The new fix patch is good anyway, not only for your case.  I queued it now.
Thanks a lot!
Comment 11 Florian Mickler 2012-10-23 20:56:30 UTC
A patch referencing this bug report has been merged in Linux v3.7-rc2:

commit 1f04661fde9deda4a2cd5845258715a22d8af197
Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Oct 16 16:52:26 2012 +0200

    ALSA: hda - Stop LPIB delay counting on broken hardware
Comment 12 Kurt Roeckx 2012-12-17 17:22:53 UTC
I just found this in my kernel log after an upgrade to 3.6.9:
hda-intel: Unstable LPIB (65496 >= 8192); disabling LPIB delay counting

I guess that's normal and I should just ignore it?


Kurt
Comment 13 Takashi Iwai 2012-12-17 19:17:01 UTC
Yes, this message is harmless.