Subject : Only three cores found on quad-core machine. Submitter : Dave Jones <davej@redhat.com> Date : 2008-08-01 18:15 References : http://marc.info/?l=linux-kernel&m=121761475224719&w=4 This entry is being used for tracking a regression from 2.6.26. Please don't close it until the problem is fixed in the mainline.
Dave, still not fixed ?
0000000000000000 <.text>: 0: 4a 8b 14 ea mov (%rdx,%r13,8),%rdx rdx: ffff880001027f80 r13: 000a81a000000002 Non-canonical address?
Looks like r13 contains crap, there...
Looks like CPU2 never finished its initialization: Booting processor 2/1 ip 6000 Initializing CPU#2 Stuck ?? Inquiring remote APIC #1... ... APIC #1 ID: failed ... APIC #1 VERSION: failed ... APIC #1 SPIV: failed migration/2 used greatest stack depth: 7224 bytes left
And then later it oopsed in the CPU init code, probably due to ftrace. After that things go wrong because the CPU init never finished but that processor is trying to allocate memory to handle the oops.
On Friday, 10 of October 2008, Dave Jones wrote: > On Wed, Oct 08, 2008 at 03:24:40PM +1100, Nick Piggin wrote: > > On Wednesday 08 October 2008 02:32, Dave Jones wrote: > > > On Wed, Oct 08, 2008 at 02:18:18AM +1100, Nick Piggin wrote: > > > > On Sunday 05 October 2008 04:32, Rafael J. Wysocki wrote: > > > > > This message has been generated automatically as a part of a report > > > > > of recent regressions. > > > > > > > > > > The following bug entry is on the current list of known regressions > > > > > from 2.6.26. Please verify if it still should be listed and let me > > > > > know (either way). > > > > > > > > > > > > > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11224 > > > > > Subject : Only three cores found on quad-core machine. > > > > > Submitter : Dave Jones <davej@redhat.com> > > > > > Date : 2008-08-01 18:15 (65 days old) > > > > > References : > http://marc.info/?l=linux-kernel&m=121761475224719&w=4 > > > > > > > > Dave, is your CPU2 getting stuck in calibrate_delay? Can you still > > > > reproduce the bug? What if you boot the kernel with lpj=<something > sane > > > > like 2666785> in order to skip the calibrate_delay code? > > > > > > > > If that helps, can you try adding some printks to narrow down where > it > > > > is getting stuck? > > > > > > This is going to sound strange, but I've forgotten which box that was. > > > I'm due to reinstall all my test boxes now that the next Fedora beta > > > has come out anyway, so I'll hopefully figure out by the end of the week > :) > > > > That does sound strange ;) But I can't believe that! It would be really > > interesting if it was stuck in calibrate_delay, but maybe the backtrace > > is just messed up... > > > > If you do manage to reproduce it, I would be quite interested. Otherwise, > > I think we can say it's not a showstopper for 2.6.27. > > Ok, this can be closed out. 2.6.27 seems to work fine on all the > quad core machines I have in my cube.