Created attachment 134091
.config file

This bug is introduced with commit d2f7cbe7b26a74dbbbf8f325b2a6fd01bc34032c. After we identified the bug, Boris (bp@alien8.de) added a workaround to quirk out UV and allow us to boot (commit 95648c0e9fdd1cb1199ef387025d684704a8e62e). While this does work to get things booting, it does not address the underlying issue.

For reference, below is the output from the failed boot, on the latest kernel, built a few minutes ago. Config file is attached.

<snip>
Enabled IRQ remapping in x2apic mode
Enabling x2apic
Enabled x2apic
Switched APIC routing to cluster x2apic.
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
smpboot: CPU0: Genuine Intel(R) CPU @ 2.60GHz (fam: 06, model: 2d, stepping: 06)
UV: Found UV2 hub
------------[ cut here ]------------
kernel BUG at arch/x86/mm/init_64.c:351!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc3-athorlton-dirty #846
Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
task: ffff880ff977a010 ti: ffff880ff977c000 task.ti: ffff880ff977c000
RIP: 0010:[<ffffffff818ca862>] [<ffffffff818ca862>] __init_extra_mapping+0x111/0x143
RSP: 0000:ffff880ff977dd18 EFLAGS: 00010206
RAX: 0000000000000f00 RBX: ffff880001c6b018 RCX: 0000000000000002
RDX: ffff880fff8d7f00 RSI: 0000000002000000 RDI: 00000000fc000000
RBP: ffff880ff977dd48 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88ef7e7f5000 R11: 0000000000000000 R12: 00000000fc000000
R13: 0000000002000000 R14: ffff8800fc000000 R15: 0000000080000000
FS: 0000000000000000(0000) GS:ffff880fffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88ef7efff000 CR3: 00000000017f4000 CR4: 00000000000406f0
Stack:
 80000000000001fb 0000000000000000 0000000000000080 000000000000b018
 000000000000b010 000000000000b008 ffff880ff977dd58 ffffffff818ca8a7
 ffff880ff977de28 ffffffff818c600b ffff880fffc0cc80 0000000000000080
Call Trace:
 [<ffffffff818ca8a7>] init_extra_mapping_uc+0x13/0x15
 [<ffffffff818c600b>] uv_system_init+0x102/0x111d
 [<ffffffff8108c3f2>] ? clockevents_config_and_register+0x21/0x25
 [<ffffffff81029283>] ? setup_APIC_timer+0xbb/0xc7
 [<ffffffff8154ee04>] ? printk+0x72/0x74
 [<ffffffff818c3da6>] ? setup_boot_APIC_clock+0x4a8/0x4b7
 [<ffffffff8154ee04>] ? printk+0x72/0x74
 [<ffffffff818c1a6e>] native_smp_prepare_cpus+0x389/0x3d6
 [<ffffffff818b57c6>] kernel_init_freeable+0xb7/0x1fb
 [<ffffffff81546900>] ? rest_init+0x74/0x74
 [<ffffffff81546909>] kernel_init+0x9/0xd5
 [<ffffffff81552f7c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81546900>] ? rest_init+0x74/0x74
Code: ff ff ff 3f 00 00 48 23 13 48 b8 00 00 00 00 00 88 ff ff 48 01 c2 4c 89 e0 48 c1 e8 12 25 f8 0f 00 00 48 01 c2 48 83 3a 00 74 04 <0f> 0b eb fe 48 8b 45 d0 49 81 ed 00 00 20 00 4c 09 e0 49 81 c4
RIP [<ffffffff818ca862>] __init_extra_mapping+0x111/0x143
 RSP <ffff880ff977dd18>
---[ end trace d3716733eb04969d ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
</snip>

I made the following changes to remove the workaround and re-expose the bug (did it this way, as I wasn't sure of the implications of reverting the entire commit containing the WAR):

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 3781dd3..aa8d237 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -1336,6 +1336,6 @@ void __init efi_apply_memmap_quirks(void)
 	/*
 	 * UV doesn't support the new EFI pagetable mapping yet.
 	 */
-	if (is_uv_system())
-		set_bit(EFI_OLD_MEMMAP, &efi.flags);
+//	if (is_uv_system())
+//		set_bit(EFI_OLD_MEMMAP, &efi.flags);
 }

Original lkml discussion of the bug can be found here:

http://www.gossamer-threads.com/lists/linux/kernel/1855555

We (SGI) are beginning to investigate and attempt to resolve the bug, but wanted to track our progress here in the community, in case others run into similar issues.

- Alex
Alan, I don't think it's correct to label this as a regression. There *was* a regression, before commit 95648c0e9fdd1cb1199ef387025d684704a8e62e, but every config that used to work for SGI UV should still work after that commit without user tweaks. It's just that SGI UV doesn't take advantage of the new code.
(In reply to Matt Fleming from comment #1)
> It's just that SGI UV doesn't take advantage of the new code.

And we're working on fixing that too - it is just not trivial and the quirk in a5d90c923bcf ("x86/efi: Quirk out SGI UV") is for the interim.
This is supported in recent kernels (with recent firmware). Closing.