Subject     : [parisc] 2.6.24-rc3 (64-bit, smp) fails to boot on 9000/785/J5600
Submitter   : Frans Pop <elendil@planet.nl>
References  : http://lkml.org/lkml/2007/11/27/72
Handled-By  : Kyle McMartin <kyle@mcmartin.ca>
Is there any way on HP-PARISC to figure out where such hangs occur (like nmi_watchdog=1/2 on x86)? If there's no such mechanism and you've got time, you could try the latency tracer and its print_functions feature:

http://people.redhat.com/mingo/latency-tracing-patches/latency-tracing-v2.6.24-rc3.combo.patch

but this would need some hacking from an HP-PARISC developer: the mcount stub is needed to make boot-hang debugging functional. With that in place, mcount_enabled=1 and print_functions=1 together can help debug the location of such hangs.
Yes, rather easily. All PA-RISC machines have a TOC switch (Transfer of Control) that boots control back to firmware, saves registers to nvram and reboots. Won't provide a stack trace, but at least provides the program counter.

In this case, the problem was that a recent commit added a check for IRQ_DISABLED to the IRQ_PER_CPU codepath, and for some reason, we were accidentally doing |= IRQ_PER_CPU instead of setting it, so IRQ_DISABLED leaked through.

(Actually, you've inspired me to go check to see if I can get a high priority interrupt on TOC, so I could dump the stack trace...)
http://git.kernel.org/?p=linux/kernel/git/kyle/parisc-2.6.git;a=commit;h=f882ea80f4ef4b71ad5d788c10a714ff4fb62d35
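The flag mix-up described above can be sketched in plain C. This is only an illustration of the pattern, not the actual parisc code: the bit values are made up for the example, and `struct demo_irq_desc` and the three `demo_*` helpers are hypothetical names that do not exist in the kernel.

```c
/* Illustrative bit values only (assumption); the real definitions
 * live in the kernel headers and are not reproduced here. */
#define IRQ_DISABLED 0x00200u
#define IRQ_PER_CPU  0x10000u

/* Hypothetical stand-in for a per-IRQ descriptor. */
struct demo_irq_desc {
    unsigned int status;
};

/* Buggy pattern: OR-ing the flag in keeps every bit that was
 * already set, including a stale IRQ_DISABLED. */
void demo_mark_per_cpu_or(struct demo_irq_desc *desc)
{
    desc->status |= IRQ_PER_CPU;
}

/* Fixed pattern: plain assignment discards the stale bits. */
void demo_mark_per_cpu_assign(struct demo_irq_desc *desc)
{
    desc->status = IRQ_PER_CPU;
}

/* Hypothetical check modeled on the description: a per-CPU IRQ
 * that still carries IRQ_DISABLED is treated as unusable. */
int demo_per_cpu_irq_usable(const struct demo_irq_desc *desc)
{
    return (desc->status & IRQ_PER_CPU) != 0 &&
           (desc->status & IRQ_DISABLED) == 0;
}
```

Starting from a descriptor whose status is IRQ_DISABLED, the OR variant leaves both bits set and the check rejects the IRQ, while the assignment variant leaves only IRQ_PER_CPU and the check passes.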
> Yes, rather easily. All PA-RISC machines have a TOC switch (Transfer
> of Control) that boots control back to firmware, saves registers to
> nvram and reboots. Won't provide a stack trace, but at least provides
> the program counter.

heh, that's easier than x86, which has an NMI watchdog only after some time (so we cannot easily debug early-bootup hangs).

> In this case, the problem was that a recent commit adds a check for
> IRQ_DISABLED to the IRQ_PER_CPU codepath, and for some reason, we were
> accidentally |= IRQ_PER_CPU instead of setting it, so IRQ_DISABLED
> leaked through.
>
> (Actually, you've inspired me to go check to see if I can get a high
> priority interrupt on TOC, so I could dump the stack trace...)

yeah, would be useful i guess.
i guess this fix will hit 2.6.24, right? Haven't seen it in -rc4 yet.
This is in mainline now and can be closed again: 2421ba5b57ddbc3a972b9d6fb884817c39d2fff7
great. Fix will first show up in 2.6.24-rc5.