Kernel Bug Tracker – Bug 18402
2.6.36-rc4 KPs whenever booting on a Pentium 4 HT system
Last modified: 2010-09-23 20:07:57 UTC
Created attachment 29732 [details]
Kernel configuration I compiled with
Sometime between 2.6.36-rc3 and 2.6.36-rc4, a regression was introduced that prevents the kernel from booting on a Pentium 4 HT system with an i875/ICH5 chipset. Instead, it panics with a message indicating that the system timer is not ticking. (I don't have the exact message because the system in question is headless and I will not have access to a monitor for it for quite some time.) I normally have "clocksource=hpet" on the kernel command line, but I get the same problem without it. I have attached some info about the system.
Created attachment 29742 [details]
Created attachment 29752 [details]
Created attachment 29762 [details]
Created attachment 29772 [details]
Created attachment 29782 [details]
Created attachment 29792 [details]
Please attach boot log from the last known good kernel.
It would be good to know the point in the boot sequence where the panic happens.
Created attachment 29842 [details]
dmesg from last good kernel (2.6.36-rc3)
Here is the dmesg from 2.6.36-rc3, the last known-good kernel.
I don't know the exact point at which the KP occurs because I don't have a monitor for the system, but I can say that it always occurs about half a second after the system begins to boot. During a normal boot, the network LED flashes about 1 second into the bootup, but this never happens with 2.6.36-rc4.
Any progress on this?
Judging from the dmesg of the successful boot, it panics around the time it would normally print:
[ 0.341986] hpet clockevent registered
[ 0.341986] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 0.341986] hpet0: 3 comparators, 64-bit 14.318180 MHz counter
[ 0.345025] Switching to clocksource hpet
I now seem to remember that the panic said "Timer not working", so I googled that, but only came up with a bunch of APIC-related results. I already tried booting with "noapic", and that had no effect.
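For anyone trying to narrow down a similar early-boot failure, the approach above (matching the observed failure time against the timestamps of a good boot) can be sketched in Python. This is only an illustration: the function names `parse_dmesg` and `messages_between` are made up for this example, and the 0.3 to 0.5 second window is an assumption based on the "about half a second" estimate, not a measured value.

```python
import re

# Matches kernel log lines such as "[ 0.341986] hpet clockevent registered".
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_dmesg(text):
    """Return (timestamp_seconds, message) pairs for timestamped lines."""
    entries = []
    for line in text.splitlines():
        match = DMESG_LINE.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def messages_between(entries, start, end):
    """Return messages whose timestamp lies in [start, end] seconds."""
    return [msg for ts, msg in entries if start <= ts <= end]


# Excerpt quoted above from the good 2.6.36-rc3 boot.
GOOD_BOOT = """\
[ 0.341986] hpet clockevent registered
[ 0.341986] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 0.341986] hpet0: 3 comparators, 64-bit 14.318180 MHz counter
[ 0.345025] Switching to clocksource hpet
"""

if __name__ == "__main__":
    # Assumed window around the estimated failure time.
    for message in messages_between(parse_dmesg(GOOD_BOOT), 0.3, 0.5):
        print(message)
```

The messages listed for the window are only candidates; the log of a good boot cannot show which step actually fails in the bad one.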
Hard to tell w/o any hint what kind of panic it runs into. I just went
through the x86 changes between -rc3 and -rc4 and I can't see an
obvious candidate. It might be something which was hidden due to
different timing though. Any chance to hook up a serial console?
I have secured a monitor to borrow for the system, but I won't be able to hook that up until tomorrow evening. I will post the exact error then.
Created attachment 30642 [details]
Kernel panic message
Here is the exact output from the kernel panic. I tried rebooting with apic=debug, but that had no effect.
Created attachment 30652 [details]
Boot with "noapic"
If I boot with "noapic," the kernel just freezes. Here is what it looks like after freezing.
I got this message:
"This message has been generated automatically as a part of a summary report
of recent regressions.
The following bug entry is on the current list of known regressions
from 2.6.35. Please verify if it still should be listed and let the tracking team
know (either way)."
The bug is a regression from 2.6.36-rc3 to 2.6.36-rc4.
No surprise, but -rc5 does not fix the issue; it produces exactly the same kernel panic. Any progress on finding out what caused it?
> --- Comment #17 from Michael Marley <firstname.lastname@example.org> 2010-09-21 01:54:45 ---
> No surprise, but -rc5 does not fix the issue; it produces exactly the same
> kernel panic. Any progress on finding out what caused it?
Hmm, I have to admit that I have no clue at all. The delta in the
related areas (ioapic, apic, hpet, 8259, 8253) from rc3 to rc4 is
exactly zero. Any chance that you can bisect it?
I will try.
OK, I officially have no idea what is going on here. My bisect completely failed: all 8 kernels I built during it failed in exactly the same way. So I tried building 2.6.36-rc3 again, and it now fails with the same panic. Then I tried compiling both -rc3 and -rc5 with gcc-4.4 instead of 4.5, and still got the same error on both.
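A bisect relies on the "good" endpoint still testing good when it is rebuilt today. The result above, where every step fails and a fresh -rc3 build fails too, breaks that assumption. A minimal Python sketch of the search logic shows where the check belongs. It is not how git implements bisection; `first_bad` and the `is_bad` callback (standing in for "build, boot, see whether it panics") are hypothetical names, and the commit labels are placeholders.

```python
def first_bad(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    commits[0] is claimed good and commits[-1] is claimed bad. Both claims
    are re-tested first, because the search is meaningless if they fail.
    """
    if is_bad(commits[0]):
        raise ValueError("known-good endpoint fails: suspect the build environment")
    if not is_bad(commits[-1]):
        raise ValueError("known-bad endpoint passes: nothing to bisect")

    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = ["rc3", "c1", "c2", "c3", "rc4"]  # placeholder labels
    broken = {"c2", "c3", "rc4"}
    print(first_bad(history, lambda c: c in broken))  # prints "c2"
```

If the first check raises, the regression is probably not in the source range at all, and the compiler, linker, or configuration used for the build becomes the next thing to examine.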
*Which* gcc 4.4 and 4.5 are you using? In particular, are you using stock gcc or a distro build?
Stock gcc 4.4.4 is known to miscompile the kernel.
I am using the GCC 4.4.4 and 4.5.1 builds from Ubuntu Maverick. I had been using 4.5.0 and 4.5.1 successfully for a while (keeping them up-to-date, of course), but my builds stopped working sometime between -rc3 and -rc4. Now I cannot build a working 2.6.36 kernel with any compiler on my system.
I found the problem. It is not a kernel issue or a compiler issue, but a linker issue. Ubuntu shipped a binutils update not long after I compiled 2.6.36-rc3, and every kernel I have built since then has been broken. I will file a bug there. Sorry to bother you.
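A toolchain change like the linker update described above is easier to spot if the tool versions are recorded alongside each build. The following Python sketch is one possible way to do that. The helper names (`tool_version`, `toolchain_snapshot`, `changed_tools`) are invented for this example, and it assumes each tool accepts a `--version` flag, as gcc and GNU ld do.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `command --version`, or None if unavailable."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=False,
        timeout=10,
    )
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


def toolchain_snapshot(commands=("gcc", "ld")):
    """Map each tool name to its reported version line (None if missing)."""
    return {name: tool_version(name) for name in commands}


def changed_tools(old, new):
    """Return the sorted names of tools whose version line differs."""
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))


if __name__ == "__main__":
    # Save this output next to each kernel build and compare it later.
    for name, version in toolchain_snapshot().items():
        print(f"{name}: {version}")
```

Comparing the snapshot saved with the last good build against the current one with `changed_tools` would list the linker as changed in a situation like this one, before any time is spent bisecting the kernel source.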