Bug 35602

Summary: Oops on resume enabling CPU1, setup_disablecpuid
Product: Platform Specific/Hardware
Reporter: Jukka Ollila (jiiksteri)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: CLOSED CODE_FIX
Severity: normal
CC: fenghua.yu, florian, maciej.rutecki, parag.lkml, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.39
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 32012

Description Jukka Ollila 2011-05-22 13:23:54 UTC
Kernel 2.6.39 reliably oopses on resume from suspend. The system resumes and is usable but one of the cores is missing.

The machine is an Acer Ferrari 1000 laptop (dual-core amd64).

This didn't happen in 2.6.39-rc2; I haven't bisected it yet.

Kernel log snippet below in case the problem's obvious to someone. I can try patches or provide additional info on request.

May 22 14:26:30 monza kernel: [   80.218559] ACPI: Low-level resume complete
May 22 14:26:30 monza kernel: [   80.218559] PM: Restoring platform NVS memory
May 22 14:26:30 monza kernel: [   80.218559] Enabling non-boot CPUs ...
May 22 14:26:30 monza kernel: [   80.230864] Booting Node 0 Processor 1 APIC 0x1
May 22 14:26:30 monza kernel: [   80.230867] smpboot cpu 1: start_ip = 98000
May 22 14:26:30 monza kernel: [   80.217599] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
May 22 14:26:30 monza kernel: [   80.217599] BUG: unable to handle kernel paging request at ffffffff815b8c59
May 22 14:26:30 monza kernel: [   80.217599] IP: [<ffffffff815b8c59>] setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599] PGD 1549067 PUD 154d063 PMD 6c31c063 PTE 80000000015b8163
May 22 14:26:30 monza kernel: [   80.217599] Oops: 0011 [#1] PREEMPT SMP 
May 22 14:26:30 monza kernel: [   80.217599] CPU 1 
May 22 14:26:30 monza kernel: [   80.217599] Modules linked in: cpufreq_stats cpufreq_conservative cpufreq_userspace cpufreq_powersave cpufreq_ondemand ipv6 bluetooth uinput fuse powernow_k8 freq_table mperf firewire_sbp2 loop radeon ttm snd_hda_codec_realtek joydev snd_hda_intel snd_hda_codec snd_hwdep drm_kms_helper drm snd_pcm pcmcia snd_seq snd_timer snd_seq_device snd irda yenta_socket i2c_piix4 pcmcia_rsrc i2c_algo_bit processor ac container psmouse battery acer_wmi evdev serio_raw sparse_keymap pcspkr k8temp i2c_core crc_ccitt button soundcore rfkill snd_page_alloc pcmcia_core wmi ext3 jbd mbcache usbhid hid sd_mod ohci_hcd ata_generic sata_sil thermal libata tg3 firewire_ohci firewire_core crc_itu_t libphy scsi_mod ehci_hcd [last unloaded: scsi_wait_scan]
May 22 14:26:30 monza kernel: [   80.217599] 
May 22 14:26:30 monza kernel: [   80.217599] Pid: 0, comm: kworker/0:0 Not tainted 2.6.39+ #2 Acer, inc. Ferrari 1000    /Ferrari6        
May 22 14:26:30 monza kernel: [   80.217599] RIP: 0010:[<ffffffff815b8c59>]  [<ffffffff815b8c59>] setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599] RSP: 0000:ffff88006fbb7f10  EFLAGS: 00010012
May 22 14:26:30 monza kernel: [   80.217599] RAX: 0000000000040f82 RBX: 0000000000000001 RCX: 0000000000002001
May 22 14:26:30 monza kernel: [   80.217599] RDX: 00000000178bfbff RSI: ffffffff813a95b4 RDI: ffff88006fd11380
May 22 14:26:30 monza kernel: [   80.217599] RBP: ffff88006fd11380 R08: 000000008000000a R09: 0000000000000001
May 22 14:26:30 monza kernel: [   80.217599] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88006fd11394
May 22 14:26:30 monza kernel: [   80.217599] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
May 22 14:26:30 monza kernel: [   80.217599] FS:  0000000000000000(0000) GS:ffff88006fd00000(0000) knlGS:0000000000000000
May 22 14:26:30 monza kernel: [   80.217599] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 22 14:26:30 monza kernel: [   80.217599] CR2: ffffffff815b8c59 CR3: 0000000001547000 CR4: 00000000000006a0
May 22 14:26:30 monza kernel: [   80.217599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 22 14:26:30 monza kernel: [   80.217599] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 22 14:26:30 monza kernel: [   80.217599] Process kworker/0:0 (pid: 0, threadinfo ffff88006fbb6000, task ffff88006fb5e450)
May 22 14:26:30 monza kernel: [   80.217599] Stack:
May 22 14:26:30 monza kernel: [   80.217599]  ffffffff812a1073 0000000000000001 0000000000000000 0000000000000000
May 22 14:26:30 monza kernel: [   80.217599]  ffffffff812a1282 0000000000000000 ffffffff812a3d0d 000000000000059f
May 22 14:26:30 monza kernel: [   80.217599]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
May 22 14:26:30 monza kernel: [   80.217599] Call Trace:
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a1073>] ? identify_cpu+0xb4/0x2af
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a1282>] ? identify_secondary_cpu+0x14/0x1d
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a3d0d>] ? start_secondary+0x107/0x188
May 22 14:26:30 monza kernel: [   80.217599] Code: ff 85 c0 0f 89 3a ff ff ff 48 8b 3d ca 38 20 00 e8 bd a4 ff ff 48 c7 05 ba 38 20 00 00 00 00 00 83 c8 ff e9 35 ff ff ff 66 2e 0f <1f> 84 00 00 00 00 00 e8 5b a4 ff ff 83 38 0b 0f 95 c0 0f b6 c0 
May 22 14:26:30 monza kernel: [   80.217599] RIP  [<ffffffff815b8c59>] setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599]  RSP <ffff88006fbb7f10>
May 22 14:26:30 monza kernel: [   80.217599] CR2: ffffffff815b8c59
May 22 14:26:30 monza kernel: [   80.217599] ---[ end trace 0cda1761d71a2783 ]---
May 22 14:26:30 monza kernel: [   80.217599] Kernel panic - not syncing: Attempted to kill the idle task!
May 22 14:26:30 monza kernel: [   80.217599] Pid: 0, comm: kworker/0:0 Tainted: G      D     2.6.39+ #2
May 22 14:26:30 monza kernel: [   80.217599] Call Trace:
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a718b>] ? panic+0xa1/0x1aa
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff81039b91>] ? do_exit+0xa8/0x718
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff81038170>] ? kmsg_dump+0x84/0xd4
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff81005362>] ? oops_end+0xa9/0xae
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff815b8c59>] ? setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff8102356b>] ? no_context+0x1ed/0x1fa
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff815b8c59>] ? setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff81023a42>] ? do_page_fault+0x152/0x348
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a99f9>] ? _raw_spin_unlock_irq+0x11/0x30
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff8102d293>] ? finish_task_switch+0x42/0x91
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a7d3f>] ? schedule+0x74d/0x833
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a9f8f>] ? page_fault+0x1f/0x30
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff815b8c59>] ? setup_disablecpuid+0x43/0x43
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a1073>] ? identify_cpu+0xb4/0x2af
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a1282>] ? identify_secondary_cpu+0x14/0x1d
May 22 14:26:30 monza kernel: [   80.217599]  [<ffffffff812a3d0d>] ? start_secondary+0x107/0x188
May 22 14:26:30 monza kernel: [   85.287094] CPU1: Stuck ??
May 22 14:26:30 monza kernel: [   85.287162] Error taking CPU1 up: -5
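
For readers decoding the log above: the "Oops: 0011" value is the x86 page-fault error code, and the PTE value on the "PGD ... PTE" line carries the no-execute flag in bit 63. The following is a minimal sketch of that decoding, assuming the standard x86-64 bit layout; the function names are hypothetical helpers, not kernel APIs.

```python
# Page-fault error code bits as defined by the x86 architecture.
PF_BITS = {
    0: "present",      # fault on a present page (protection violation)
    1: "write",        # access was a write
    2: "user",         # access came from user mode
    3: "reserved",     # reserved bit set in a paging structure
    4: "instr_fetch",  # access was an instruction fetch
}

def decode_pf_error(code):
    """Return the set of flag names set in a page-fault error code."""
    return {name for bit, name in PF_BITS.items() if code & (1 << bit)}

def pte_is_nx(pte):
    """Return True if bit 63 (NX, no-execute) is set in a page table entry."""
    return bool(pte & (1 << 63))

# Values copied from the oops in this report.
err = decode_pf_error(0x0011)
nx = pte_is_nx(0x80000000015b8163)
print(sorted(err), nx)  # ['instr_fetch', 'present'] True
```

Read this way, the error code says the CPU tried to fetch an instruction from a page that is mapped but not executable, which matches the "kernel tried to execute NX-protected page" line.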
Comment 1 Rafael J. Wysocki 2011-05-22 13:53:59 UTC
Well, this is not reproducible on my Acer Ferrari One, so please bisect if you can.
Comment 2 Jukka Ollila 2011-05-22 22:31:15 UTC
Duh, turns out the broken tree is not exactly 2.6.39 but some random upstream tree after that :(

The bisect result is below, but since this isn't in anything that has actually been released, should this report just be closed and reopened if the problem appears in something that resembles a release, like .40-rc1?

Bisect result:

de5397ad5b9ad22e2401c4dacdf1bb3b19c05679 is the first bad commit
commit de5397ad5b9ad22e2401c4dacdf1bb3b19c05679
Author: Fenghua Yu <fenghua.yu@intel.com>
Date:   Wed May 11 16:51:05 2011 -0700

    x86, cpu: Enable/disable Supervisor Mode Execution Protection
    
    Enable/disable newly documented SMEP (Supervisor Mode Execution Protection) CPU
    feature in kernel. CR4.SMEP (bit 20) is 0 at power-on. If the feature is
    supported by CPU (X86_FEATURE_SMEP), enable SMEP by setting CR4.SMEP. New kernel
    option nosmep disables the feature even if the feature is supported by CPU.
    
    [ hpa: moved the call to setup_smep() until after the vendor-specific
      initialization; that ensures that CPUID features are unmasked.  We
      will still run it before we have userspace (never mind uncontrolled
      userspace). ]
    
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    LKML-Reference: <1305157865-31727-1-git-send-email-fenghua.yu@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

:040000 040000 d8ad18a82278c0f4cea62fd4f98617aeb6da4799 b53858c394d5de9f8d2fdc799bb3061b34d0d7d7 M      Documentation
:040000 040000 fac275c02a6f2b7ce3e80b6eca2e51271cb730d2 58a10ce4381cfbe69b035bda4abae3ebb22d322b M      arch
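
The behaviour described in the commit message can be modelled in a few lines. This is only an illustrative sketch, not the kernel's code: `setup_smep` here is a plain Python function that takes and returns a CR4 value, the constant name `X86_CR4_SMEP` and the parameters `cpu_has_smep` and `nosmep` are assumptions made for the example, and the starting CR4 value is the one printed in the oops above.

```python
# CR4.SMEP is bit 20, per the commit message.
X86_CR4_SMEP = 1 << 20

def setup_smep(cr4, cpu_has_smep, nosmep=False):
    """Model of the described policy, returning the new CR4 value.

    - CPU lacks the feature: CR4 is left unchanged.
    - CPU has the feature and nosmep was given: the SMEP bit is cleared.
    - CPU has the feature otherwise: the SMEP bit is set.
    """
    if not cpu_has_smep:
        return cr4
    if nosmep:
        return cr4 & ~X86_CR4_SMEP
    return cr4 | X86_CR4_SMEP

# CR4 from the oops in this report (00000000000006a0): SMEP bit is clear.
print(hex(setup_smep(0x6a0, cpu_has_smep=True)))   # 0x1006a0
print(hex(setup_smep(0x6a0, cpu_has_smep=False)))  # 0x6a0
```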
Comment 3 Rafael J. Wysocki 2011-05-22 22:44:59 UTC
No need to close, the issue is there, right? :-)

Thanks for the bisection, we'll close the bug when the problem is fixed.
Comment 4 Jukka Ollila 2011-05-22 23:35:32 UTC
Yes, it's still there; I tried 71a8638480eb8fb6cfabe2ee9ca3fbc6e3453a14.

Reverting the bisected commit on top of that fixes the problem.

Thanks
Comment 5 Rafael J. Wysocki 2011-05-23 21:28:36 UTC
Please check whether Linus' current tree fixes the problem for you.
Comment 6 Fenghua Yu 2011-05-23 21:32:17 UTC
See if commit 1d487624fcc17a40aa67acaa9e8f3815fb7cd0f0 in Linus' tree fixes the issue.
Comment 7 paragw 2011-05-23 22:47:01 UTC
Should be fixed by Linus' commit 82da65dab5f438ac7df28eeb43e2f5b742aa00ef.
Comment 8 Jukka Ollila 2011-05-23 23:18:40 UTC
I can confirm 1d487624fcc17a40aa67acaa9e8f3815fb7cd0f0 (identical to 82da65dab5f438ac7df28eeb43e2f5b742aa00ef) fixes this issue for me.

Thank you.