Bug 73801 - snd-hda-intel hangs the machine
Summary: snd-hda-intel hangs the machine
Status: NEEDINFO
Alias: None
Product: Drivers
Classification: Unclassified
Component: Sound(ALSA) (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Jaroslav Kysela
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-10 21:04 UTC by Adrian Knoth
Modified: 2016-12-24 18:42 UTC (History)
3 users (show)

See Also:
Kernel Version: 3.14
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
camshot of hanging boot process (450.75 KB, image/jpeg)
2014-04-10 21:04 UTC, Adrian Knoth
Details

Description Adrian Knoth 2014-04-10 21:04:28 UTC
Created attachment 131891 [details]
camshot of hanging boot process

Hi!

Booting a freshly compiled 3.14 hangs (no problems with 3.13.x) at device initialisation with some lines of

   hda-intel: spurious response 0x0:0x0, last cmd=0x0f0000

(shown in attached jpg)

Blacklisting snd-hda-intel makes the machine boot.

If I then

    modprobe snd-atiixp
    modprobe snd-hda-intel

the machine will lock up when loading snd-hda-intel. Loading snd-hda-intel without having snd-atiixp loaded still works but cannot find no codec.

I haven't bisected it yet, hence I cannot tell if it's hda-intel or snd-atiixp.



Cheers
Comment 1 Adrian Knoth 2014-04-10 21:07:16 UTC
JFTR:

00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] IXP SB4x0 High Definition Audio Controller (rev 01)

00:14.2 0403: 1002:437b (rev 01)
Comment 2 Takashi Iwai 2014-04-11 05:30:58 UTC
It's something insane if snd-atiixp interferes.  It's for AC97 controllers, thus it must not be coexist with HD-audio.  As 3.13 kernel worked, could you bisect?
Comment 3 Adrian Knoth 2014-04-20 22:15:26 UTC
snd-atiixp removed from the equation, it's not to blame.

Bisecting this bug is very hard, I've tried 60 kernel versions so far with different scope (limited to sound/pci/hda, then broader sound/, then entire v3.13..v3.14 range)

I haven't identified a single commit that I could revert to fix the problem.

For a start, we're talking about AD1981. And occasionally, pulseaudio complains:

[alsa-sink-AD1981 Analog] alsa-util.c: snd_pcm_avail() returned a value that is exceptionally large: 655360 bytes (3715 ms
[alsa-sink-AD1981 Analog] alsa-util.c: Most likely this is a bug in the ALSA driver 'snd_hda_intel'. Please report this i
[alsa-sink-AD1981 Analog] alsa-util.c: snd_pcm_dump():
[alsa-sink-AD1981 Analog] alsa-util.c: Soft volume PCM
[alsa-sink-AD1981 Analog] alsa-util.c: Control: PCM Playback Volume
[alsa-sink-AD1981 Analog] alsa-util.c: min_dB: -51
[alsa-sink-AD1981 Analog] alsa-util.c: max_dB: 0
[alsa-sink-AD1981 Analog] alsa-util.c: resolution: 256
[alsa-sink-AD1981 Analog] alsa-util.c: Its setup is:
[alsa-sink-AD1981 Analog] alsa-util.c:   stream       : PLAYBACK
[alsa-sink-AD1981 Analog] alsa-util.c:   access       : MMAP_INTERLEAVED
[alsa-sink-AD1981 Analog] alsa-util.c:   format       : S16_LE
[alsa-sink-AD1981 Analog] alsa-util.c:   subformat    : STD
[alsa-sink-AD1981 Analog] alsa-util.c:   channels     : 2
[alsa-sink-AD1981 Analog] alsa-util.c:   rate         : 44100
[alsa-sink-AD1981 Analog] alsa-util.c:   exact rate   : 44100 (44100/1)
[alsa-sink-AD1981 Analog] alsa-util.c:   msbits       : 16 
[alsa-sink-AD1981 Analog] alsa-util.c:   buffer_size  : 16384
[alsa-sink-AD1981 Analog] alsa-util.c:   period_size  : 8192
[alsa-sink-AD1981 Analog] alsa-util.c:   period_time  : 185759
[alsa-sink-AD1981 Analog] alsa-util.c:   tstamp_mode  : ENABLE
[alsa-sink-AD1981 Analog] alsa-util.c:   period_step  : 1
[alsa-sink-AD1981 Analog] alsa-util.c:   avail_min    : 15502
[alsa-sink-AD1981 Analog] alsa-util.c:   period_event : 0
[alsa-sink-AD1981 Analog] alsa-util.c:   start_threshold  : -1
[alsa-sink-AD1981 Analog] alsa-util.c:   stop_threshold   : 4611686018427387904   
[alsa-sink-AD1981 Analog] alsa-util.c:   silence_threshold: 0
[alsa-sink-AD1981 Analog] alsa-util.c:   silence_size : 0
[alsa-sink-AD1981 Analog] alsa-util.c:   boundary     : 4611686018427387904
[alsa-sink-AD1981 Analog] alsa-util.c: Slave: Hardware PCM card 0 'HDA ATI SB' device 0 subdevice 0 
[alsa-sink-AD1981 Analog] alsa-util.c: Its setup is:
[alsa-sink-AD1981 Analog] alsa-util.c:   stream       : PLAYBACK
[alsa-sink-AD1981 Analog] alsa-util.c:   access       : MMAP_INTERLEAVED
[alsa-sink-AD1981 Analog] alsa-util.c:   format       : S16_LE
[alsa-sink-AD1981 Analog] alsa-util.c:   subformat    : STD
[alsa-sink-AD1981 Analog] alsa-util.c:   channels     : 2
[alsa-sink-AD1981 Analog] alsa-util.c:   rate         : 44100
[alsa-sink-AD1981 Analog] alsa-util.c:   exact rate   : 44100 (44100/1)
[alsa-sink-AD1981 Analog] alsa-util.c:   msbits       : 16 
[alsa-sink-AD1981 Analog] alsa-util.c:   buffer_size  : 16384
[alsa-sink-AD1981 Analog] alsa-util.c:   period_size  : 8192
[alsa-sink-AD1981 Analog] alsa-util.c:   period_time  : 185759
[alsa-sink-AD1981 Analog] alsa-util.c:   tstamp_mode  : ENABLE
[alsa-sink-AD1981 Analog] alsa-util.c:   period_step  : 1
[alsa-sink-AD1981 Analog] alsa-util.c:   avail_min    : 15502
[alsa-sink-AD1981 Analog] alsa-util.c:   period_event : 0
[alsa-sink-AD1981 Analog] alsa-util.c:   start_threshold  : -1
[alsa-sink-AD1981 Analog] alsa-util.c:   stop_threshold   : 4611686018427387904   
[alsa-sink-AD1981 Analog] alsa-util.c:   silence_threshold: 0
[alsa-sink-AD1981 Analog] alsa-util.c:   silence_size : 0
[alsa-sink-AD1981 Analog] alsa-util.c:   boundary     : 4611686018427387904
[alsa-sink-AD1981 Analog] alsa-util.c:   appl_ptr     : 32768
[alsa-sink-AD1981 Analog] alsa-util.c:   hw_ptr       : 180224


Some revisions are impossible to judge, since they're only showing

                                dev_warn(chip->card->dev,
                                         "Codec #%d probe error; disabling it...\n", c);

and later

                dev_err(chip->card->dev, "no codecs initialized\n");


(hda_controller.c)


Then, commit 595fe1b702c3fa739ab395ab0471054875f20b90 seems to need
f8f1becfa4ac3231da55de68698cf7facf089646 and ef8e39b51f24c8ac907e1f1995910ee3083375e0 applied (the Kconfig thing). I've done this manually whenever possible, still not leading to clear results.


Here are two independent git bisect runs:

# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect start 'v3.14' 'v3.13' '--' 'sound/pci/hda/'
# good: [43a8e50a46a4e1dd1451e4a4ffa1f7695fb7d287] ALSA: hda - Don't create duplicated ctls for loopback paths
git bisect good 43a8e50a46a4e1dd1451e4a4ffa1f7695fb7d287
# good: [c20f31ec421ea4fabea5e95a6afd46c5f41e5599] ALSA: hda - Fix missing VREF setup for Mac Pro 1,1
git bisect good c20f31ec421ea4fabea5e95a6afd46c5f41e5599
# bad: [28fba95087a7f3d107a3a6728aef7dbfaf3fd782] ALSA: hda/ca0132 - setup/cleanup streams
git bisect bad 28fba95087a7f3d107a3a6728aef7dbfaf3fd782
# bad: [f47e5dc464251f661da9495fcbf003a0d22c1360] ALSA: hda - Add a headset quirk for Dell XPS 13
git bisect bad f47e5dc464251f661da9495fcbf003a0d22c1360
# good: [f8f1becfa4ac3231da55de68698cf7facf089646] ALSA: hda - Fix leftover ifdef checks after modularization
git bisect good f8f1becfa4ac3231da55de68698cf7facf089646
# bad: [7fe307117db5bd7ec6efb93c563dcf44577b6d2b] ALSA: hda - Fix inconsistent Mic mute LED
git bisect bad 7fe307117db5bd7ec6efb93c563dcf44577b6d2b
# first bad commit: [7fe307117db5bd7ec6efb93c563dcf44577b6d2b] ALSA: hda - Fix inconsistent Mic mute LED


Another run:

# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect start 'v3.14' 'v3.13' '--' 'sound/'
# good: [e240a46965dd14db6e6bdf8974ebb347f3db860b] ALSA: ctxfi - Work around emu20k1 glitch to prevent buffered sound data loss
git bisect good e240a46965dd14db6e6bdf8974ebb347f3db860b
# good: [efe2ab9b0274b60327ca148e101177ba9708b9ba] ASoC: fsl_ssi: Do not print 'baud clock' error message all the time
git bisect good efe2ab9b0274b60327ca148e101177ba9708b9ba
# good: [c20f31ec421ea4fabea5e95a6afd46c5f41e5599] ALSA: hda - Fix missing VREF setup for Mac Pro 1,1
git bisect good c20f31ec421ea4fabea5e95a6afd46c5f41e5599
# skip: [fce0a0c72618e021e29ed4e051ce6b42b218c5e6] ALSA: hda/realtek - Add more entry for enable HP mute led
git bisect skip fce0a0c72618e021e29ed4e051ce6b42b218c5e6
# skip: [e3947ecb4e8adf260b6323eac43b80eeb26911cf] ASoC: blackfin: Fix machine driver Kconfig dependencies
git bisect skip e3947ecb4e8adf260b6323eac43b80eeb26911cf
# skip: [e3947ecb4e8adf260b6323eac43b80eeb26911cf] ASoC: blackfin: Fix machine driver Kconfig dependencies
git bisect skip e3947ecb4e8adf260b6323eac43b80eeb26911cf
# bad: [28fba95087a7f3d107a3a6728aef7dbfaf3fd782] ALSA: hda/ca0132 - setup/cleanup streams
git bisect bad 28fba95087a7f3d107a3a6728aef7dbfaf3fd782
# bad: [28fba95087a7f3d107a3a6728aef7dbfaf3fd782] ALSA: hda/ca0132 - setup/cleanup streams
git bisect bad 28fba95087a7f3d107a3a6728aef7dbfaf3fd782
# bad: [f88abaa0d0dc0d1f1a9ae21f8e822918e5aadfdf] ALSA: hda - Fix mic capture on Sony VAIO Pro 11
git bisect bad f88abaa0d0dc0d1f1a9ae21f8e822918e5aadfdf
# bad: [7fe307117db5bd7ec6efb93c563dcf44577b6d2b] ALSA: hda - Fix inconsistent Mic mute LED
git bisect bad 7fe307117db5bd7ec6efb93c563dcf44577b6d2b
# bad: [276ab336b4c6e483d12fd46cbf24f97f71867710] ALSA: hda - Improve loopback path lookups for AD1983
git bisect bad 276ab336b4c6e483d12fd46cbf24f97f71867710
# first bad commit: [276ab336b4c6e483d12fd46cbf24f97f71867710] ALSA: hda - Improve loopback path lookups for AD1983


At least 28fba95 shows up consistently in both paths, though neither of the resulting final commits is really the culprit.

Maybe AD1981 rings a bell? I'm open for suggestions how to further debug the problem. Do you recall a range of HDA fixes that might cause the trouble? If in doubt, I can try them all individually.
Comment 4 Raymond 2014-04-21 05:31:39 UTC
set start_threshold to -1 won't ensure the stream only be started by application using  snd_pcm_start


in function snd_pcm_write_areas ,  pcm->start_threshold is casted as signed and pcm is started



frames = err;
6822                 if (state == SND_PCM_STATE_PREPARED) {
6823                         snd_pcm_sframes_t hw_avail = pcm->buffer_size - avail;
6824                         hw_avail += frames;
6825                         /* some plugins might automatically start the stream */
6826                         state = snd_pcm_state(pcm);
6827                         if (state == SND_PCM_STATE_PREPARED &&
6828                             hw_avail >= (snd_pcm_sframes_t) pcm->start_threshold) {
6829                                 err = snd_pcm_start(pcm);
6830                                 if (err < 0)
6831                                         goto _end;
6832                         }
6833                 }
Comment 5 Raymond 2014-04-21 12:51:28 UTC
alsa-sink-AD1981 Analog] alsa-util.c:   appl_ptr     : 32768
[alsa-sink-AD1981 Analog] alsa-util.c:   hw_ptr       : 180224

hw_ptr is ahead of appl_ptr for (180224-32768)/44100 ≈3.3 seconds
Comment 6 Takashi Iwai 2014-04-29 16:26:55 UTC
Well, we just want to spot the first bad commit.  It's not necessarily to be reversible.

And, please concentrate on a single problem.  This bug was opened for a hang at boot.  Your original description was:
  Booting a freshly compiled 3.14 hangs (no problems with 3.13.x) at device initialisation with some lines of

   hda-intel: spurious response 0x0:0x0, last cmd=0x0f0000

We need to find the culprit of this regression at first.  The rest issues (e.g. from PulseAudio) can be addressed later, preferably in another bug report.
Comment 7 Adrian Knoth 2014-04-29 22:11:37 UTC
Sorry, my bad. This time strictly only the boot hang, yet it doesn't look particularly useful to me:

adi@foh:/opt/adi-src/sound-2.6$ git bisect log
# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect start 'v3.14' 'v3.13' '--' 'sound/'
# good: [e240a46965dd14db6e6bdf8974ebb347f3db860b] ALSA: ctxfi - Work around emu20k1 glitch to prevent buffered sound data loss
git bisect good e240a46965dd14db6e6bdf8974ebb347f3db860b
# good: [efe2ab9b0274b60327ca148e101177ba9708b9ba] ASoC: fsl_ssi: Do not print 'baud clock' error message all the time
git bisect good efe2ab9b0274b60327ca148e101177ba9708b9ba
# good: [c20f31ec421ea4fabea5e95a6afd46c5f41e5599] ALSA: hda - Fix missing VREF setup for Mac Pro 1,1
git bisect good c20f31ec421ea4fabea5e95a6afd46c5f41e5599
# bad: [fce0a0c72618e021e29ed4e051ce6b42b218c5e6] ALSA: hda/realtek - Add more entry for enable HP mute led
git bisect bad fce0a0c72618e021e29ed4e051ce6b42b218c5e6
# bad: [d05d7803405f18b0f923cc395676a10a2b5e7201] Merge remote-tracking branches 'asoc/fix/blackfin', 'asoc/fix/da9055', 'asoc/fix/davinci', 'asoc/fix/fsl', 'asoc/fix/fsl-esai', 'asoc/fix/max98090', 'asoc/fix/rt5640', 'asoc/fix/samsung' and 'asoc/fix/txx9aclc-ac97' into asoc-linus
git bisect bad d05d7803405f18b0f923cc395676a10a2b5e7201
# good: [2078600b1f8fc68b02f6ebdda1759fb0c9f51afb] ALSA: Revert "ALSA: hda/realtek - Avoid invalid COEFs for ALC271X"
git bisect good 2078600b1f8fc68b02f6ebdda1759fb0c9f51afb
# good: [fb6d208d54de2791d6d361ef258ea7d5d3427d01] ASoC: davinci-evm: Add pm callbacks to platform driver
git bisect good fb6d208d54de2791d6d361ef258ea7d5d3427d01
# good: [25e7e34805a39a1b05b659acbed878b3187a2707] ASoC: samsung: Fix trivial typo
git bisect good 25e7e34805a39a1b05b659acbed878b3187a2707
# good: [e126a646f77fdd66978785cb0a3a5e46b07aee2e] ASoC: max98090: make REVISION_ID readable
git bisect good e126a646f77fdd66978785cb0a3a5e46b07aee2e
# bad: [b31b2b6d5de71c569413d8dc4f7b050cbe25a09e] ASoC: rt5640: Add ACPI ID for Intel Baytrail
git bisect bad b31b2b6d5de71c569413d8dc4f7b050cbe25a09e
# first bad commit: [b31b2b6d5de71c569413d8dc4f7b050cbe25a09e] ASoC: rt5640: Add ACPI ID for Intel Baytrail
Comment 8 Takashi Iwai 2014-04-30 05:26:36 UTC
Could you do the full bisection instead of limiting to sound/ ?
Comment 9 Adrian Knoth 2014-06-02 20:27:37 UTC
OK, took me a while, yet no useful bisect log:

# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect start 'v3.14' 'v3.13'
# good: [82c477669a4665eb4e52030792051e0559ee2a36] Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 82c477669a4665eb4e52030792051e0559ee2a36
# good: [ca2a650f3dfdc30d71d21bcbb04d2d057779f3f9] Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma
git bisect good ca2a650f3dfdc30d71d21bcbb04d2d057779f3f9
# good: [f2de3a159937bfb1ab1ca671e0f2d06cda286a24] Merge tag 'sound-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good f2de3a159937bfb1ab1ca671e0f2d06cda286a24
# good: [9b3e7c9b9ab5c2827c1ecd45327b851a1bd01c2a] Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 9b3e7c9b9ab5c2827c1ecd45327b851a1bd01c2a
# good: [52a4c6404f91f2d2c5592ee6365a8418c4565f53] selinux: add gfp argument to security_xfrm_policy_alloc and fix callers
git bisect good 52a4c6404f91f2d2c5592ee6365a8418c4565f53
# good: [53611c0ce9f6e2fa2e31f9ab4ad8c08c512085ba] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect good 53611c0ce9f6e2fa2e31f9ab4ad8c08c512085ba
# bad: [b37199e626b31e1175fb06764c5d1d687723aac2] rcuwalk: recheck mount_lock after mountpoint crossing attempts
git bisect bad b37199e626b31e1175fb06764c5d1d687723aac2
# good: [004e5cf743086990e5fc04a14437b3966d7fa9a2] Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
git bisect good 004e5cf743086990e5fc04a14437b3966d7fa9a2
# good: [7c3895383fea48dab2374b04a936de4717a85a81] Merge branch 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
git bisect good 7c3895383fea48dab2374b04a936de4717a85a81
# bad: [477cc48484ea0bfdee9e8319259b35a0ec03f332] Merge tag 'trace-fixes-v3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
git bisect bad 477cc48484ea0bfdee9e8319259b35a0ec03f332
# good: [92f99731e61a3487781762d4e6ae8b1576e68e07] Merge tag 'sound-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 92f99731e61a3487781762d4e6ae8b1576e68e07
# bad: [2eddb708d83ead02b5d41c65bfb26bab5afc8210] MIPS: Octeon: Fix warning in of_device_alloc on cn3xxx
git bisect bad 2eddb708d83ead02b5d41c65bfb26bab5afc8210
# good: [4fe2169acecb6e62821dfe14bc5c5852870b516f] MIPS: BCM47XX: Check all (32) GPIOs when looking for a pin
git bisect good 4fe2169acecb6e62821dfe14bc5c5852870b516f
# bad: [b08ac66b4026f0151d712903695bf266042fbe2c] MIPS: ftrace: Tweak safe_load()/safe_store() macros
git bisect bad b08ac66b4026f0151d712903695bf266042fbe2c
# first bad commit: [b08ac66b4026f0151d712903695bf266042fbe2c] MIPS: ftrace: Tweak safe_load()/safe_store() macros

Note You need to log in before you can comment on or make changes to this bug.