The v3.19-rc1 kernel fails to boot on a VMware ESXi 5.1 host. The kernel only prints "Probing EDD (edd=off to disable)...ok" and then hangs: no backtrace, warning, or error, not even a VGA mode change. (It may not even have switched to protected mode.) On qemu-kvm, VirtualBox, or a bare machine it boots fine.

Bisecting points to the following commit as the cause:

commit f5b2831d654167d77da8afbef4d2584897b12d0c
Author: Juergen Gross <jgross@suse.com>
Date:   Mon Nov 3 14:02:02 2014 +0100

    x86: Respect PAT bit when copying pte values between large and normal pages

Thanks,
Qu

Hi,

I saw this problem, too. Full boot log:

> Loading /kernel... ok
> Loading /initramfs...ok
> early console in setup code
> Probing EDD (edd=off to disable)... ok
> early console in decompress_kernel
> KASLR using RDRAND RDTSC...
>
> Decompressing Linux...
>
> [ 0.000000] Initializing cgroup subsys cpuset
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Initializing cgroup subsys cpuacct
> [ 0.000000] Linux version 3.19.0-rc1 (root@bare42) (gcc version 4.8.4
> (Gentoo 4.8.4 p1.0, pie-0.6.1) ) #1 SMP Mon Dec 29 15:55:00 CET 2014
> [ 0.000000] Command line: BOOT_IMAGE=/kernel dolvm root=UID=...
> rootfs=ext4 initrd=/initramfs debug LOGLEVEL=8 earlyprintk=vga,keep
> [ 0.000000] KERNEL supported cpus:
> [ 0.000000] Intel GenuineIntel
> [ 0.000000] Disabled fast string operations
> [ 0.000000] e820: BIOS-provided physical RAM map:
> [ Skipping the RAM map - please request if needed ]
> [ 0.000000] debug: ignoring loglevel setting.
> [ 0.000000] console [earlyvga0] enabled
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] SMBIOS 2.4 present.
> [ 0.000000] DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop
> Reference Platform, BIOS 6.00 07/31/2013
> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [ 0.000000] e820: last_pfn = 0x140000 max_arch_pfn = 0x400000000
> [ 0.000000] MTRR default type: uncachable
> [ 0.000000] MTRR fixed ranges enabled:
> [ 0.000000] 00000-9FFFF write-back
> [ 0.000000] A0000-BFFFF uncachable
> [ 0.000000] C0000-CBFFF write-protect
> [ 0.000000] CC000-EFFFF uncachable
> [ 0.000000] F0000-FFFFF write-protect
> [ 0.000000] MTRR variable ranges enabled:
> [ 0.000000] 0 base 00C0000000 mask FFC0000000 uncachable
> [ 0.000000] 1 base 0000000000 mask FF00000000 write-back
> [ 0.000000] 2 base 0100000000 mask FFC0000000 write-back
> [ 0.000000] 3 disabled
> [ 0.000000] 4 disabled
> [ 0.000000] 5 disabled
> [ 0.000000] 6 disabled
> [ 0.000000] 7 disabled
> PANIC: early exception 06 rip 10:ffffffff8d03e852 error 0 cr2
> ffff88000dd4cff8
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-rc1 #1
> [ 0.000000] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> Desktop Reference Platform, BIOS 6.00 07/31/2013
> [ 0.000000] ffffffff8dc03e1b ffffffff8dc03d70 ffffffff8d6a9d63
> 000000000000004d
> [ 0.000000] 00000000ffffffff ffffffff8dc03e08 ffffffff8dd011a1
> 0000000000000030
> [ 0.000000] 000000000000002c 0000000000000019 0000000000000018
> 0000000000000000
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff8d6a9d63>] dump_stack+0x45/0x57
> [ 0.000000] [<ffffffff8dd011a1>] early_idt_handler+0x81/0xa8
> [ 0.000000] [<ffffffff8d03e852>] ? update_cache_mode_entry+0x42/0x50
> [ 0.000000] [<ffffffff8d043d2c>] pat_init_cache_modes+0x7c/0xc0
> [ 0.000000] [<ffffffff8d043df4>] pat_init+0x84/0xa0
> [ 0.000000] [<ffffffff8dd0af26>] get_mtrr_state+0x284/0x296
> [ 0.000000] [<ffffffff8dd0aae3>] mtrr_bp_init+0x134/0x157
> [ 0.000000] [<ffffffff8dd047f1>] setup_arch+0x56e/0xc6f
> [ 0.000000] [<ffffffff8d11b69f>] ? vprintk_default+0x1f/0x30
> [ 0.000000] [<ffffffff8dd01c87>] start_kernel+0xd3/0x471
> [ 0.000000] [<ffffffff8dd015ad>] x86_64_start_reservations+0x2a/0x2c
> [ 0.000000] [<ffffffff8dd016a6>] x86_64_start_kernel+0xf7/0xfb
> [ 0.000000] RIP 0x3

But this is already known and fixed, see
https://lkml.org/lkml/2014/12/28/35

I can confirm that the patch is working for me. Now booting looks like

> [ 0.000000] MTRR default type: uncachable
> [ 0.000000] MTRR fixed ranges enabled:
> [ 0.000000] 00000-9FFFF write-back
> [ 0.000000] A0000-BFFFF uncachable
> [ 0.000000] C0000-CBFFF write-protect
> [ 0.000000] CC000-EFFFF uncachable
> [ 0.000000] F0000-FFFFF write-protect
> [ 0.000000] MTRR variable ranges enabled:
> [ 0.000000] 0 base 00C0000000 mask FFC0000000 uncachable
> [ 0.000000] 1 base 0000000000 mask FF00000000 write-back
> [ 0.000000] 2 base 0100000000 mask FFC0000000 write-back
> [ 0.000000] 3 disabled
> [ 0.000000] 4 disabled
> [ 0.000000] 5 disabled
> [ 0.000000] 6 disabled
> [ 0.000000] 7 disabled
> [ 0.000000] PAT read returns always zero, disabled.
> [ 0.000000] original variable MTRRs
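
For anyone reading the "MTRR variable ranges" lines in the log above, here is a rough sketch of how a base/mask pair translates into an address range. It assumes a 40-bit physical address width (inferred only from the 10-hex-digit masks in this log) and a contiguous mask; the helper name `mtrr_range` is made up for illustration and is not a kernel interface.

```python
# Sketch: decode "base ... mask ..." pairs from the MTRR lines of the boot log.
# Assumption: 40 physical address bits, guessed from the 10-hex-digit masks.
PHYS_BITS = 40


def mtrr_range(base_hex, mask_hex, phys_bits=PHYS_BITS):
    """Return (start, end_inclusive, size) for a contiguous MTRR mask."""
    base = int(base_hex, 16)
    mask = int(mask_hex, 16)
    full = (1 << phys_bits) - 1
    # The cleared low bits of the mask give the size of the range.
    size = ((~mask) & full) + 1
    return base, base + size - 1, size


if __name__ == "__main__":
    # The three enabled variable ranges, copied from the log.
    entries = [
        ("00C0000000", "FFC0000000", "uncachable"),
        ("0000000000", "FF00000000", "write-back"),
        ("0100000000", "FFC0000000", "write-back"),
    ]
    for base, mask, mem_type in entries:
        start, end, size = mtrr_range(base, mask)
        print(f"{start:#012x}-{end:#012x} {size >> 20:5d} MiB {mem_type}")
```

Under those assumptions, range 2 ends at 0x13fffffff, which lines up with `last_pfn = 0x140000` (0x140000 pages of 4 KiB) in the same log.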

Thanks for the info.

I'm currently using the nopat kernel option to avoid the bug. The fix is not in v3.19-rc1, so I still hit the bug without the nopat option.

Since the bug is already known and a fix has already been sent, I will close the BZ.

Thanks,
Qu
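
As a quick way to confirm the nopat workaround is actually in effect on a booted system, something like the following could be used. It is only a sketch: it checks the kernel command line for the bare `nopat` token, and the helper name `has_param` is hypothetical. `/proc/cmdline` is the standard Linux file exposing the boot command line.

```python
# Sketch: check whether a bare boot parameter (e.g. "nopat") was passed.
# has_param is a hypothetical helper, not part of any kernel tooling.


def has_param(cmdline, name):
    """True if `name` appears as a standalone token on the command line."""
    return name in cmdline.split()


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is not mounted
    print("nopat active" if has_param(cmdline, "nopat") else "nopat not set")
```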