Bug 15385

Summary:	CPU loses support for mwait instruction following suspend/resume
Product:	Platform Specific/Hardware	Reporter:	Alan Stern (stern)
Component:	i386	Assignee:	drivers_video-dri-intel (drivers_video-dri-intel)
Status:	CLOSED CODE_FIX
Severity:	normal	CC:	hpa, mingo, rjw, shaohua.li, suresh.b.siddha, tglx, venki
Priority:	P1
Hardware:	IA-32
OS:	Linux
Kernel Version:	2.6.33-rc8	Subsystem:
Regression:	No	Bisected commit-id:
Bug Depends on:
Bug Blocks:	7216
Attachments:	Dmesg for boot followed by suspend/resume

Description Alan Stern 2010-02-24 16:57:04 UTC

On my desktop system (Intel ICH5 chipset), the i915 video driver hangs during resume from RAM.  The screen remains blank and the CapsLock and ScrollLock LEDs on the keyboard start blinking.  The guilty driver was confirmed with CONFIG_PM_TRACE_RTC.  Other relevant config entries include:

    CONFIG_DRM=y
    CONFIG_DRM_KMS_HELPER=y
    CONFIG_DRM_I915=y
    CONFIG_DRM_I915_KMS=y
    CONFIG_FB=y
    CONFIG_FRAMEBUFFER_CONSOLE=y

When I boot with "vga=0 acpi_sleep=sci_force_enable init=/bin/bash" and no initramdisk, the screen does switch to the framebuffer console (with a nearly-unreadable 160x60 character display) but the system still hangs during resume.

CONFIG_X86_CHECK_BIOS_CORRUPTION does not report any problems.

The graphics adapter is:

00:02.0 VGA compatible controller: Intel Corporation 82865G Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
        Subsystem: GVC/BCM Advanced Research Device 2181
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at f0000000 (32-bit, prefetchable) [size=128M]
        Memory at fe780000 (32-bit, non-prefetchable) [size=512K]
        I/O ports at ec00 [size=8]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915

Possibly related are these log messages that appear during every startup:

[drm] Initialized drm 1.1.0 20060810
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: setting latency timer to 64
fbcon: inteldrmfb (fb0) is primary device
render error detected, EIR: 0x00000010
[drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
render error detected, EIR: 0x00000010
[drm] DAC-5: set mode 640x480 16

Comment 1 Rafael J. Wysocki 2010-02-24 19:27:39 UTC

I think this is an i915 issue, so reassigning.

Comment 2 Jesse Barnes 2010-02-26 23:08:33 UTC

Ouch, 865...  it sounds like it may have panicked at resume?  Does it suspend/resume correctly if you boot with i915.modeset=0 (i.e. use the old suspend/resume code)?

Comment 3 Alan Stern 2010-03-01 17:14:26 UTC

Thanks Jesse, that suggestion was a big help!  The system still panics, but with i915.modeset=0 the screen comes back so I can see what's going on.

It turns out the panic is caused by an invalid opcode exception in mwait_idle().  The bad instruction is the assembler __monitor() call; the offending IP points directly to the 0x0f,0x01,0xc8 bytes in the instruction stream.

For the record, I have set CONFIG_M686=y and:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Celeron(R) CPU 2.53GHz
stepping        : 9
cpu MHz         : 2533.270
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc up pebs bts pni dtes64 monitor ds_cpl cid xtpr lahf_lm
bogomips        : 5053.30
clflush size    : 64
power management:

From dmesg:

CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 256K
CPU: Hyper-Threading is disabled
mce: CPU supports 4 MCE banks
CPU0: Thermal monitoring enabled (TM1)
using mwait in idle threads.

So it looks like the idle-routine selection logic is messed up.  Accordingly, I am reclassifying this bug and adding hpa to the CC list in the hope that he can figure out what's going wrong.

Comment 4 Alan Stern 2010-03-01 21:10:49 UTC

Confirmed.  With the following patch applied, the resume worked correctly:

Index: 2.6.33-rc8/arch/x86/kernel/process.c
===================================================================
--- 2.6.33-rc8.orig/arch/x86/kernel/process.c
+++ 2.6.33-rc8/arch/x86/kernel/process.c
@@ -507,6 +507,9 @@ static int __cpuinit mwait_usable(const 
 		return 0;
 
 	cpuid(MWAIT_INFO, &eax, &ebx, &ecx, &edx);
+	printk(KERN_INFO "cpuid: ecx %x edx %x\n", ecx, edx);
+	return 0;
+
 	/* Check, whether EDX has extended info about MWAIT */
 	if (!(ecx & MWAIT_ECX_EXTENDED_INFO))
 		return 1;

The output from the printk was:

[    0.032024] cpuid: ecx 0 edx 0

This suggests the "return 1" in the last line above really should be "return 0".  But I don't know enough about these processors to tell if that's the right solution.

Comment 5 H. Peter Anvin 2010-03-01 22:48:47 UTC

No, this is clearly wrong.  If ECX = EDX = 0, it simply means there are no extensions to MONITOR/MWAIT.  The reason your patch "works" is because you're disabling MWAIT.

You're saying you have the "offending IP", but you're not giving any register values.  In particular, if ECX != 0 at the point we're invoking MONITOR, that is buggy for this CPU, and would trigger a #GP(0).
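
For reference, the decision being discussed can be modelled in user space roughly as follows. This is a minimal sketch, not the kernel source: mwait_usable_model() is a hypothetical stand-in that takes the CPUID results as arguments, and the constant values are quoted from memory of the 2.6.33 headers, so treat them as assumptions.

```c
#include <stdint.h>

/* Assumed values, believed to match the 2.6.33 x86 headers. */
#define MWAIT_INFO              0x05u /* CPUID leaf describing MONITOR/MWAIT */
#define MWAIT_ECX_EXTENDED_INFO 0x01u /* leaf 5 ECX bit 0: EDX lists extensions */
#define MWAIT_EDX_C1            0xf0u /* leaf 5 EDX bits 7:4: C1 sub-states */

/*
 * Hypothetical model of the check in mwait_usable().
 * cpuid_level is the highest supported basic CPUID leaf;
 * ecx and edx are the values returned by CPUID leaf 5.
 */
int mwait_usable_model(uint32_t cpuid_level, uint32_t ecx, uint32_t edx)
{
    /* Leaf 5 does not exist, so nothing is known about MWAIT. */
    if (cpuid_level < MWAIT_INFO)
        return 0;

    /* No extended info: plain MONITOR/MWAIT is still assumed to work. */
    if (!(ecx & MWAIT_ECX_EXTENDED_INFO))
        return 1;

    /* Extended info present: require at least one C1 sub-state. */
    return (edx & MWAIT_EDX_C1) != 0;
}
```

With the values from comment #3 and comment #4 (cpuid level 5, ECX = 0, EDX = 0) the model returns 1, which is why the unpatched kernel selects mwait_idle() at boot.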

Comment 6 Alan Stern 2010-03-02 16:01:31 UTC

The panic message lists the register contents as follows:

    EAX = 0xC1319008, EBX = 0xC13B0780, ECX = 0, EDX = 0

Also, the PID is 0.  By the way, do you have any idea why this exception should trigger during system resume and not during normal operation?

Comment 7 Venkatesh Pallipadi 2010-03-02 18:10:49 UTC

return 1; in comment #4 is correct. The check is for mwait extended info, and basic mwait is supposed to work without that as well. And as you said, things are working OK on bootup. Something seems to be broken on resume. Maybe something related to microcode? Can you check whether CPUID.1.ECX.bit3 is still set after resume? That bit says whether monitor/mwait is supported or not.

Adding Suresh.

Comment 8 Alan Stern 2010-03-02 20:03:18 UTC

Good guess!  Before the suspend, cpuid(1) gives ECX = 0x441d, but afterward it gives ECX = 0x4415.  What could cause this sort of thing?
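
The two ECX values can be compared bit by bit. The sketch below is only an illustration of that arithmetic; the helper names has_monitor_mwait() and changed_bits() are made up for this example and do not exist in the kernel.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX bit 3: MONITOR/MWAIT supported. */
#define CPUID1_ECX_MONITOR (1u << 3)

/* Hypothetical helper: nonzero if the MONITOR/MWAIT feature bit is set. */
int has_monitor_mwait(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_MONITOR) != 0;
}

/* Hypothetical helper: mask of the bits that differ between two readings. */
uint32_t changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

Here changed_bits(0x441d, 0x4415) is 0x8, so the MONITOR/MWAIT bit is the only one that differs: it is set before the suspend and clear afterward.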

Comment 9 H. Peter Anvin 2010-03-02 20:05:58 UTC

Usually failure to load microcode on the way out of suspend, OR control registers being changed.

In this case, your CPU had monitor/mwait coming in, and not on the way out!

Comment 10 Venkatesh Pallipadi 2010-03-02 20:10:18 UTC

Not sure whether the microcode loaded by the BIOS or the one loaded by the OS is at fault here. Do you see any microcode-related messages in dmesg after boot? If yes, do you see a similar message after resume with the workaround in comment #4?

Comment 11 Alan Stern 2010-03-02 21:04:01 UTC

Created attachment 25327
Dmesg for boot followed by suspend/resume

I'm not sure what to look for, but attached is the complete dmesg log, starting from bootup and showing a successful suspend/resume transition.  The only change to the vanilla 2.6.33-rc8 kernel was "return 0;" added near the start of mwait_usable().

Comment 12 Venkatesh Pallipadi 2010-03-02 21:11:51 UTC

I was expecting to see an "updated to revision" kind of message from arch/x86/kernel/microcode_intel.c. I don't see such a message in dmesg.

So I think the BIOS is loading a microcode update for this CPU at boot time and not loading it during resume.

Comment 13 Alan Stern 2010-03-03 21:18:36 UTC

There are no BIOS updates available from HP more recent than the one I have now.  So it will be necessary to work around this problem somehow.  Should I simply boot with "idle=poll" always?

Comment 14 H. Peter Anvin 2010-03-03 21:21:37 UTC

I wonder if you can register the appropriate microcode with the microcode driver, even if it is the one currently loaded into the CPU; my understanding is that the microcode driver will check and re-load the microcode on resume.

Comment 15 Alan Stern 2010-03-03 21:44:45 UTC

My previous testing was all done without CONFIG_MICROCODE enabled, so anything that was loaded had to have been by the BIOS.  I just tried again with CONFIG_MICROCODE and CONFIG_MICROCODE_INTEL set (and with "idle=poll").  I also added "#define DEBUG" at the start of microcode_intel.c.  It worked, but there's no indication that any microcode was actually loaded.  The only new lines in the dmesg log are:

[    0.607123] microcode: CPU0 sig=0xf49, pf=0x4, revision=0x3
[    0.608874] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Comment 16 H. Peter Anvin 2010-03-03 21:48:09 UTC

Well, you could check what CPUID does after suspend/resume.
You should be able to *not* use idle=poll with the microcode driver.

Comment 17 Alan Stern 2010-03-03 22:49:11 UTC

Even with the microcode stuff enabled, the ecx value from cpuid(1) still changes from 0x441d to 0x4415.  Not surprising, since the microcode driver never loads anything into the CPU.

Comment 18 Alan Stern 2010-03-05 14:59:14 UTC

I'm not at all familiar with the microcode driver.  Are you saying that after it loads its data into the CPU, it stores a copy of all the resident microcode?  And then it reloads that copy back into the CPU during a system resume?

What functions should I look at to find where this is supposed to happen and fix it?

Comment 19 Alan Stern 2010-03-11 16:16:37 UTC

It turns out that microcode is not the answer.

I finally got the microcode driver to do something.  The code in the file supplied by Intel was not getting loaded because the revision level in the file was the same as the CPU's current revision level.  I changed the driver to force it to load the microcode anyway:

[   47.985453] microcode: CPU0 updated to revision 0x3, date = 2005-04-21

It didn't help.  The mwait-support flag (bit 0x8 of ecx following cpuid(1)) was completely unaffected:

    Loading the microcode before doing a suspend left the flag turned on.

    Then after a suspend the flag was off, even though the microcode driver
    did reload the data into the CPU during early resume.

    After rebooting, not loading any microcode, and doing a suspend, loading
    the microcode by hand left the flag turned off.

I'm at a loss for ideas as to the cause.  Does anybody at Intel have a suggestion?

Booting with "idle=halt" seems like a reasonable workaround, but it would be nice to actually fix the problem.

Comment 20 Alan Stern 2010-06-28 19:24:04 UTC

Fixed by commit 85a0e7539781dad4bfcffd98e72fa9f130f4e40d (PM / x86: Save/restore MISC_ENABLE register).  The problem was caused by the fact that the MISC_ENABLE register was not getting saved and restored across the suspend by either the system or the BIOS.
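
The effect of that fix can be illustrated with a small user-space simulation. This is a sketch under stated assumptions, not the code from the commit: the MSR is replaced by an ordinary variable (fake_misc_enable), and struct saved_ctx, save_state(), firmware_resume() and restore_state() are hypothetical names. The one hardware detail relied on is that bit 18 of IA32_MISC_ENABLE enables MONITOR/MWAIT, and that CPUID.1:ECX bit 3 reads as 0 while it is clear.

```c
#include <stdint.h>

/* IA32_MISC_ENABLE bit 18: MONITOR/MWAIT enable. */
#define MISC_ENABLE_MWAIT (1ull << 18)

/* Stand-in for the real MSR; a real kernel would use rdmsr/wrmsr. */
uint64_t fake_misc_enable;

/* Hypothetical slice of the saved CPU context. */
struct saved_ctx {
    uint64_t misc_enable;
    int misc_enable_saved;
};

/* Before suspend: remember the register contents. */
void save_state(struct saved_ctx *ctx)
{
    ctx->misc_enable = fake_misc_enable;
    ctx->misc_enable_saved = 1;
}

/* Models the bug: the machine comes back with the MWAIT bit cleared. */
void firmware_resume(void)
{
    fake_misc_enable &= ~MISC_ENABLE_MWAIT;
}

/* After resume: write the remembered contents back. */
void restore_state(const struct saved_ctx *ctx)
{
    if (ctx->misc_enable_saved)
        fake_misc_enable = ctx->misc_enable;
}

/* Nonzero if MONITOR/MWAIT is currently enabled in the simulated MSR. */
int mwait_enabled(void)
{
    return (fake_misc_enable & MISC_ENABLE_MWAIT) != 0;
}
```

In this model, calling save_state() before firmware_resume() and restore_state() afterward brings the bit back, whereas skipping the save/restore pair leaves MWAIT disabled, matching the invalid opcode exception seen in mwait_idle().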