Bug 43069

Summary: [Bisected] genirq,mtpav: Genirq patch + SND_MTPAV=y makes Compaq Presario V6000 hang on boot
Product: Drivers
Reporter: Ronald (ronald645)
Component: Sound(ALSA)
Assignee: other_other
Status: RESOLVED INSUFFICIENT_DATA
Severity: normal
CC: alan, perex, tglx, tiwai
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.2
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 42644

Description Ronald 2012-04-07 19:16:46 UTC
Hello kernel community,

I'm reporting a bug: recent kernel versions cause my laptop to hang during boot, rendering it practically useless.

Stable version 3.2.8 works.
Stable version 3.2.9 fails.

Mainline 3.3.0 fails as well.

Reverting: [b4bc724e82e80478cba5fe9825b62e71ddf78757] genirq: Handle pending irqs in irq_startup() on top of 3.4-rc1 fixes the issue and makes the kernel boot again.

I did a git bisect initially, which yielded this patch:
[aa0eb3474beae8f6d9dcc2311dc02bea50cfd7b7] genirq: Unmask oneshot irqs when thread was not woken
But that was wrong (probably my fault). I'm posting the git bisect log here anyway, as it might be useful:

Charlie linux-git-stable # git bisect log
# bad: [44fb3170ae46f8de964a4bb5b0504e865a6dd7da] Linux 3.2.9
# good: [1de504ea25617f701ac3a246a1c9dfd2246d4900] Linux 3.2.8
git bisect start 'v3.2.9' 'v3.2.8'
# good: [4cc383ba35be70d24fb7d43dd67f15f6ec3c7ebc] USB: option: cleanup zte 3g-dongle's pid in option.c
git bisect good 4cc383ba35be70d24fb7d43dd67f15f6ec3c7ebc
# good: [758e4d3da5bc2a30a7618cb8f1710e096dac0e53] ARM: omap: fix oops in arch/arm/mach-omap2/vp.c when pmic is not found
git bisect good 758e4d3da5bc2a30a7618cb8f1710e096dac0e53
# bad: [df9a5f8f94f3276aaa8c960a46f6838f7fdab974] davinci_emac: Do not free all rx dma descriptors during init
git bisect bad df9a5f8f94f3276aaa8c960a46f6838f7fdab974
# bad: [37ef0e621b065f2d9e1c37ff42a37d6bd74bf039] genirq: Handle pending irqs in irq_startup()
git bisect bad 37ef0e621b065f2d9e1c37ff42a37d6bd74bf039
# good: [72633f08ad74b93530b8e038041c450492a00ed5] ath9k: stop on rates with idx -1 in ath9k rate control's .tx_status
git bisect good 72633f08ad74b93530b8e038041c450492a00ed5
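
For anyone re-checking the bisection, the verdicts recorded in a saved log can be listed with a small shell helper. This is only a sketch: bisect_verdicts and the file name bisect.log are made up for illustration and are not part of git.

```shell
# Sketch: summarize the verdicts recorded in a saved `git bisect log`.
# bisect_verdicts is a hypothetical helper, not a git command.
# $1 is the path to the saved log (bisect.log is an assumed name).
bisect_verdicts() {
    # Comment lines in the log look like: "# bad: [<sha>] <subject>".
    # Print the verdict, the first 12 hex digits of the sha, and the subject.
    sed -n 's/^# \([a-z]*\): \[\([0-9a-f]\{12\}\)[0-9a-f]*\] \(.*\)$/\1 \2 \3/p' "$1"
}

# Usage (assumed file name):
#   bisect_verdicts bisect.log
```

Inside a kernel git tree, `git bisect replay bisect.log` re-applies the same good/bad marks, so the remaining candidates can be retested from there.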
Comment 1 Ronald 2012-04-07 21:47:33 UTC
I also just recently tested the latest git master (f4e52e7ffde). Same problem, same fix.
Comment 2 Ronald 2012-04-13 07:00:59 UTC
(from https://lkml.org/lkml/2012/4/12/46)

I have attached my .config [1] and a photo of the latest hanging kernel [2]
which is post 3.4 rc-2 by now (f549e088b80). I also attached a correctly
working dmesg of current head (again f549e088b80) with the offending
patch reverted [3].

[1] http://pastebin.com/fvWAqmqv
[2] http://tinypic.com/view.php?pic=2uif21j&s=5
[3] http://pastebin.com/QmMA6hGS
Comment 3 Ronald 2012-04-26 22:25:09 UTC
Current head: 2300fd67b4f29eec19addb15a8571837228f63fc (post 3.4-rc4)

I did a clean build of my kernel image with an x86_64 defconfig and eventually found out that this is a combination of:

CONFIG_SND_MTPAV=y and commit b4bc724e82e8047

So:

CONFIG_SND_MTPAV=y + commit b4bc724e82e8047 => Hang
CONFIG_SND_MTPAV=y + revert commit b4bc724e => Works!
CONFIG_SND_MTPAV=n + commit b4bc724e82e8047 => Works!

So the combination of both causes the hang during boot.
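
A quick way to tell which of the three cases above a given build falls into is to check how CONFIG_SND_MTPAV is set in its .config. A minimal sketch, assuming a POSIX shell; mtpav_state is a hypothetical helper name.

```shell
# Sketch: report whether the MTPAV driver is built in, a module, or disabled.
# mtpav_state is a hypothetical helper; $1 is the path to a kernel .config.
mtpav_state() {
    config_file="$1"
    if grep -q '^CONFIG_SND_MTPAV=y$' "$config_file"; then
        echo builtin      # driver is compiled into the kernel image
    elif grep -q '^CONFIG_SND_MTPAV=m$' "$config_file"; then
        echo module       # driver is built as a loadable module
    else
        echo disabled     # option unset or "is not set"
    fi
}

# Usage:
#   mtpav_state .config
```

Only the "builtin" case matches the hanging configuration reported here; the behaviour with =m was not tested in this report.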
Comment 4 Takashi Iwai 2012-05-08 10:19:37 UTC
I don't think you are testing with this device, right?

If the commit above really matters, it must have something to do with the interrupt handler in mtpav.c, but the question is why the interrupt is kicked off at all.  This is an old ISA-like device without irq sharing.
Comment 5 Ronald 2012-05-09 17:28:42 UTC
I *do* have this device. I have no idea what 'ISA-like' implies, however over here it says:

 #1: MTPAV on parallel port at 0x378

I also had to supply these as boot parameters:

snd_hda_intel.index=0 snd_mtpav.index=1

In order to prevent vlc playing PCM through this MIDI device ;)
Comment 6 Takashi Iwai 2012-05-09 19:25:12 UTC
Ah, I see.  I thought you were testing just the kernel without the actual hardware (e.g. for QA), but you have the real one.

Does the crash happen even if you don't connect the device?  Or is it only with the device connected?
Comment 7 Ronald 2012-05-09 19:37:14 UTC
It's an internal device, I cannot disconnect it without a screwdriver I guess. I do not 'dock' it or something like that.

These:

[    4.671773] ALSA device list:
[    4.673076]   #0: HDA NVidia at 0xb0000000 irq 21
[    4.674382]   #1: MTPAV on parallel port at 0x378

Are all internal. However, doing lspci -v -v -v -v, I have a hard time finding the device that should correspond with the midi card (since I don't use the driver anymore I don't know 'where' it binds to). Will this info help you?
Comment 8 Takashi Iwai 2012-05-10 05:52:21 UTC
Hm?  MTPAV is a device connected via a parallel port, as you see in the list.   The driver controls the parallel port at 0x378, where the box over there has up to 8 MIDI I/O ports.  Is this really what you have?

I guess you are using the wrong driver in the first place.  The MTPAV driver might work somehow, but it's not the right choice.  For a generic MPU401 interface on a mobo, snd-mpu401 (with proper parameters) is the correct one.

Of course, this doesn't justify the kernel crashing at boot, but we need to understand the test situation first.
Comment 9 Alan 2012-05-12 01:09:54 UTC
Seems to crash because the IRQ handler is enabled too early by mtpav_get_ISA....

which would explain the rest
Comment 10 Takashi Iwai 2012-05-12 08:06:42 UTC
As far as I see, all the necessary initialization has been done before mtpav_get_ISA().

I guess it's in the interrupt handler that assumes some special case.  Maybe snd_mtpav_read_bytes() got stuck...
Comment 11 Takashi Iwai 2013-11-15 15:12:13 UTC
snd-mtpav must be a wrong driver.
Comment 12 Ronald 2013-11-15 16:24:36 UTC
Yes, I guess. How to find out?

It's a Compaq Presario V6221eu.
Comment 13 Takashi Iwai 2013-11-15 16:42:05 UTC
As I mentioned, MTP-AV is an external MIDI box connected via a parallel port.  If you're using a laptop, this cannot be it.

The device has a fixed port and a fixed irq number like a good old ISA device.  If you build the driver into the kernel, it'll be started even if no device is actually connected, so it'll poke some ioports and handle the bogus irq.
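
One way to see what a built-in driver has grabbed is to look up which driver holds the I/O range starting at 0x378 in /proc/ioports. A minimal sketch; ioport_claims is a hypothetical helper, and on newer kernels the addresses in /proc/ioports may read as zeros unless the file is read as root.

```shell
# Sketch: show which driver has claimed an I/O port range.
# ioport_claims is a hypothetical helper.
# $1: start address as it appears in /proc/ioports (4 hex digits, e.g. 0378)
# $2: file to search, normally /proc/ioports
ioport_claims() {
    # Entries look like "  0378-037a : <owner>"; nested entries are indented.
    grep -i "^ *$1-" "$2"
}

# Usage:
#   ioport_claims 0378 /proc/ioports
```

If the owner shown for that range is the MTPAV driver on a machine with no MTP-AV box attached, that is consistent with the explanation above.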