Bug 3586

Summary: S3 resume: push $0 causes reboots Athlon XP 2600+: Fujitsu-Siemens laptop
Product: ACPI Reporter: Alberto Piai (albeclemit)
Component: Power-Sleep-Wake Assignee: Luming Yu (luming.yu)
Status: REJECTED INSUFFICIENT_DATA    
Severity: high CC: acpi-bugzilla, dagit
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.9-rc4 Subsystem:
Regression: --- Bisected commit-id:
Attachments: lspci from my laptop
lspci -V from my laptop
contents of /proc/cpuinfo

Description Alberto Piai 2004-10-18 04:50:57 UTC
Distribution: Debian Testing
Hardware Environment: VIA KT400/KT400A/KT600
Software Environment:
Problem Description: waking up from s3 state leads to instant reboot

Steps to reproduce:
echo -n "mem" > /sys/power/state
then lid button or power button to wake up the laptop.
Comment 1 Alberto Piai 2004-10-18 04:56:03 UTC
Created attachment 3849 [details]
lspci from my laptop
Comment 2 Alberto Piai 2004-10-18 04:59:34 UTC
Created attachment 3850 [details]
lspci -V from my laptop
Comment 3 Alberto Piai 2004-10-18 05:11:29 UTC
A long discussion and debugging session on the acpi-devel mailing list pointed out the following:

the reboot is issued from a pushl $0 in arch/i386/kernel/acpi/wakeup.S

Code from arch/i386/kernel/acpi/wakeup.S:
[...]
	mov	$(wakeup_stack - wakeup_code), %sp		# Private stack is needed for ASUS board
	movw	$0x0e00 + 'S', %fs:(0x12)

	pushl	$0	###THIS LINE CAUSES REBOOT!		# Kill any dangerous flags
	popfl

	movl	real_magic - wakeup_code, %eax
[...]

With pushl $0 and popfl commented out, the wakeup code goes on until another issue
blocks it later (more on this will be added as soon as we pin something down on the
acpi-devel mailing list).

Two patches were proposed on the ml (wakeup_address.patch and wakeup_gdt.patch),
but they don't change anything on this laptop (though they may fix other wakeup
problems).
Comment 4 Len Brown 2004-11-18 21:47:48 UTC
So the suspend works, but when you wake up the system does it reboot instantly,
or does it start to resume and then reboot?
Comment 5 Alberto Piai 2004-11-20 17:40:23 UTC
it starts resuming, enters wakeup.S and the pushl $0 issues instant reboot;
beeping dead-loops were used to narrow it down to a single line.

NOTE: a workaround other than removing those two lines has to be found, but
other issues freeze (not reboot) this unlucky laptop upon s3 resume. As soon as
i understand a bit more about this i'll file a new bugzilla entry.
Comment 6 Luming Yu 2004-11-30 03:19:00 UTC
How could pushl	$0 cause a reboot in real mode?
I guess the CPU is in protected mode.



Comment 7 Luming Yu 2004-11-30 03:33:57 UTC
I suggest you try the following patch to see if "pushl $0" still causes a
reboot.

--- linux-2.6.10-rc2/arch/i386/kernel/acpi/wakeup.S.orig	2004-12-02 18:18:01.132147992 -0800
+++ linux-2.6.10-rc2/arch/i386/kernel/acpi/wakeup.S	2004-12-02 18:17:34.872140120 -0800
@@ -20,6 +20,8 @@
 	wakeup_code_start = .
 	.code16
 
+	xor	%eax, %eax
+	movl	%eax, %cr0
  	movw	$0xb800, %ax
 	movw	%ax,%fs
 	movw	$0x0e00 + 'L', %fs:(0x10)
Comment 8 Alberto Piai 2004-12-07 11:15:31 UTC
Sorry for the late answer, but university has been pressing in the last weeks..
Luming, this patch doesn't help, reboot occurs as usual :(
could i post other useful info about the system? maybe the DSDT?
Comment 9 Luming Yu 2004-12-09 04:21:34 UTC
Hmm, strange. Would you try replacing

  pushl	$0	###THIS LINE CAUSES REBOOT! # Kill any dangerous flags

with 

  xorw %ax, %ax
  pushw  %ax

and see what happens?
I'm not sure it will help; I just wonder why my objdump shows that the GNU
assembler emits 6A instead of 68 for "pushl $0".

PS:
6A Push imm8
68 Push imm16/imm32


Comment 10 Alberto Piai 2004-12-22 17:55:23 UTC
no.. usual hated reboot...

by the way objdump gives this code... (this is the object file without the last
xorw %ax %ax push %ax modification you suggested)
**************
linux-2.6.10-rc2-bk9/arch/i386/kernel/acpi/wakeup.o:     file format elf32-i386

Disassembly of section .text:

00000000 <wakeup_start>:

    ...

      22:       53                      push   %ebx
      23:       0e                      push   %cs
      24:       66 6a 00                pushw  $0x0
      27:       66 9d                   popfw
    ...
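
The bytes 66 6a 00 in the listing above can be read two ways, depending on the default operand size. The sketch below is a minimal model, not a full decoder: decode_push is a hypothetical helper that handles only the two PUSH-immediate opcodes named in comment 9 (6A and 68) plus the 0x66 operand-size prefix. It suggests why objdump, which presumably treats this elf32-i386 object as 32-bit code, prints pushw, while the same bytes executed as 16-bit code (.code16) would be a 32-bit push.

```python
# Hypothetical helper for illustration; it is not part of binutils or the kernel.
def decode_push(code: bytes, default_operand_bits: int):
    """Decode a PUSH-immediate instruction, returning (mnemonic, immediate)."""
    operand_bits = default_operand_bits
    pos = 0
    if code[pos] == 0x66:
        # The operand-size prefix toggles between 16-bit and 32-bit operands.
        operand_bits = 48 - operand_bits
        pos += 1
    opcode = code[pos]
    pos += 1
    if opcode == 0x6A:
        # PUSH imm8: one immediate byte, sign-extended to the operand size.
        imm = int.from_bytes(code[pos:pos + 1], "little", signed=True)
    elif opcode == 0x68:
        # PUSH imm16/imm32: the immediate is as wide as the operand.
        width = operand_bits // 8
        imm = int.from_bytes(code[pos:pos + width], "little", signed=True)
    else:
        raise ValueError("not a PUSH-immediate opcode: %#x" % opcode)
    mnemonic = "pushw" if operand_bits == 16 else "pushl"
    return mnemonic, imm


listing_bytes = bytes([0x66, 0x6A, 0x00])
# Decoded as 32-bit code (what the objdump output above appears to do).
print(decode_push(listing_bytes, 32))
# Decoded as 16-bit code (how the wakeup code is assembled under .code16).
print(decode_push(listing_bytes, 16))
```

Under this model the assembler's choice of 6A over 68 is only the shorter encoding of the same push, since the immediate is zero either way.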
Comment 11 Jason Dagit 2005-03-22 22:10:09 UTC
Could this bug be related to bug 3691?  Diagnostic information about my machine
is posted in that bug.

I have an Averatec laptop and removing those lines worked for a while (about 20
reboots) and I could resume from sleep.  But now my laptop has started to show
other weird behavior if the pushl/popfl are missing.  Sometimes it reboots,
sometimes it hangs during/after resume.  And the behavior seems to change
depending on whether or not I boot into WinXP (reboot of course) and then try to
suspend/resume in Linux.

I was asking around about what could cause a reboot, and someone suggested that
there may be a triple fault triggering a reboot.

How can we debug this further?  I'm willing to help in any way I can.  But I lack
the technical experience to troubleshoot this myself.

Thanks!
Comment 12 Luming Yu 2005-03-23 07:35:41 UTC
What is the CPU type?

It is rather strange that a single PUSH $0 could cause a problem in real mode.
According to the IA-32 manual, in real-address mode the processor shuts down
due to a lack of stack space if the ESP or SP register is 1 when the PUSH
instruction is executed. PUSH can also fail if the new value of the SP or ESP
register is outside the stack segment limit.

I would like to see two things:
1. add printk(" virt_to_phys: acpi_wakeup_address %p \n", acpi_wakeup_address);
2. comment out "mov $(wakeup_stack - wakeup_code), %sp" in wakeup_code

Comment 13 Alberto Piai 2005-03-23 11:43:01 UTC
CPU is a mobile Athlon XP 2600+

i added the printk right after the point where it reports supported sleep
states, and i get the following in dmesg:

virt_to_phys: acpi_wakeup_address c0001000

commenting out
mov $(wakeup_stack - wakeup_code), %sp
nothing changes, i still get a reboot

removing the pushl statement i'm able to resume (boot with vga=normal and
vbetool to get screen back to life), but problems with the hard disk occur
then.. more following on the mailing list (but that's a different bug)
Comment 14 Jason Dagit 2005-03-23 13:34:51 UTC
Created attachment 4790 [details]
contents of /proc/cpuinfo
Comment 15 Jason Dagit 2005-03-23 13:41:42 UTC
I've been playing with suspend/resume and booting different ways before I do
that.  It would appear that my laptop doesn't always run the wakeup code, but if
I boot into WinXP do a suspend/resume and then boot into linux and try the
suspend/resume it almost always runs the wakeup code.  So I think my machine is
experiencing two bugs here.  1) It doesn't always run the wakeup code 2) It
reboots on pushl.

I added the printout of the acpi_wakeup_address to my code and I get the same
address, which becomes 00001000 after virt_to_phys.

Unfortunately I cannot get my laptop to make sound through the PC speaker so
using that debugging trick will not help on my system.
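
The address arithmetic above can be checked with a short sketch. It assumes the default i386 layout, where lowmem kernel virtual addresses are offset from physical addresses by PAGE_OFFSET = 0xC0000000; that constant and both helper names are assumptions for illustration, not values taken from this report's kernel configuration.

```python
PAGE_OFFSET = 0xC0000000  # assumed default 3G/1G split on i386


def virt_to_phys(vaddr: int) -> int:
    """Model of the lowmem translation: subtract the kernel's base offset."""
    return vaddr - PAGE_OFFSET


def real_mode_segment(paddr: int) -> int:
    """Segment value whose base (segment * 16) equals the physical address."""
    if paddr % 16 != 0 or paddr >= (1 << 20):
        raise ValueError("not addressable as a real-mode segment base")
    return paddr >> 4


phys = virt_to_phys(0xC0001000)
print(hex(phys))                     # physical address of the wakeup code
print(hex(real_mode_segment(phys)))  # segment a real-mode CPU would use
```

With these assumptions the printed virtual address c0001000 corresponds to physical 0x1000, which in real mode is reached through segment 0x100.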
Comment 16 Luming Yu 2005-03-23 18:48:44 UTC
Could you check PE bit at CR0? I suspect the CPU is NOT in real mode.
Comment 17 Jason Dagit 2005-03-23 21:48:41 UTC
"Could you check PE bit at CR0? I suspect the CPU is NOT in real mode."

Forgive my ignorance, but I don't know how to do that.  

If your solution involves something of the form "if PE then BEEP", keep in mind
that my laptop can't beep for some reason.  But perhaps an infinite loop or
something else would be helpful.

I'm really thinking that the machine is just not being prepared properly.  Why
else would booting into WinXP before the suspend/resume help?  I wonder if that
must be causing some sort of initialization or something is getting left over in
some weird place in memory.  If that were the case how could we figure out what
is going on?  Maybe there is a fairly standard plan of attack?  Some way to rule
it out maybe?
Comment 18 Luming Yu 2005-03-23 22:50:24 UTC
How about this: if not in real mode, then skip pushl $0; popfl?
Comment 19 Jason Dagit 2005-03-23 23:28:12 UTC
I changed the pushl $0; popfl lines to the following:

movl  %cr0, %eax
and $0x0001, %ax
test $0x01, %ax
je skip_push_pop

pushl $0
popfl
skip_push_pop:


If I understand that code it will only skip that if PE is set in cr0.  I don't
know asm so I could be wrong.  Please check that code.

Here are the results of testing this:
1) Boot into WinXP and do a suspend/resume/reboot.  This makes sure that my machine
will enter the wakeup code.
2) Boot into the modified kernel, echo mem >/sys/power/state
3) Wait for the machine to sleep and then hit the power button.
The machine resumes correctly.  So this means the laptop is in protected mode, right?
Comment 20 Shaohua 2005-03-23 23:31:30 UTC
Please check if below change fixes the issue:
-        mov     $(wakeup_stack - wakeup_code), %sp
+        movl     $(wakeup_stack - wakeup_code), %esp
I suspect the high bits of esp aren't cleared, which causes any kind of 'push'
(Jason said 'pushw 0' can't survive either) to reboot the system.
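
The difference between the two instructions in this suggestion can be modelled in a few lines. This is only a sketch of the register arithmetic: the stale ESP value and the stack offset are made-up numbers, and write_sp / write_esp are hypothetical helper names.

```python
def write_sp(esp: int, value16: int) -> int:
    """Model 'mov $imm, %sp': only the low 16 bits of ESP are replaced."""
    return (esp & 0xFFFF0000) | (value16 & 0xFFFF)


def write_esp(esp: int, value32: int) -> int:
    """Model 'movl $imm, %esp': the whole 32-bit register is replaced."""
    return value32 & 0xFFFFFFFF


stale_esp = 0xC0123456      # hypothetical leftover value from before suspend
stack_offset = 0x0400       # hypothetical (wakeup_stack - wakeup_code)

print(hex(write_sp(stale_esp, stack_offset)))   # high half survives
print(hex(write_esp(stale_esp, stack_offset)))  # high half is cleared
```

Whether the stale high half matters depends on whether the CPU uses SP or ESP for stack accesses at that point, which is exactly what the suggested change is meant to test.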
Comment 21 Luming Yu 2005-03-23 23:40:56 UTC
If so, could you disable PE at the beginning of wakeup_code?
Comment 22 Jason Dagit 2005-03-23 23:42:35 UTC
Please check if below change fixes the issue:
-        mov     $(wakeup_stack - wakeup_code), %sp
+        movl     $(wakeup_stack - wakeup_code), %esp

David, I tried that code with the same 3 step test and the machine rebooted.
Comment 23 Jason Dagit 2005-03-23 23:44:18 UTC
Luming, I don't know how to disable PE.  I looked it up on the web and found it
to be quite a confusing process for someone with no experience.

See section 14.5:
http://library.n0i.net/hardware/intel80386-programmer-manual/Chap14.html
Comment 24 Luming Yu 2005-03-23 23:57:03 UTC
movl  %cr0, %eax
xorl $0xFFFFFFFE, %eax
movl %eax, %cr0
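
For anyone following along, the effect of the mask 0xFFFFFFFE depends on the operator. The sketch below compares XOR (as in the three lines above) with AND on a made-up CR0 value; the sample value and helper names are assumptions for illustration, and the model only covers the arithmetic, not what the hardware in this report actually did.

```python
CR0_PE = 0x00000001   # protection-enable bit (bit 0)
CR0_PG = 0x80000000   # paging bit (bit 31)
MASK = 0xFFFFFFFE


def xor_mask(cr0: int) -> int:
    """Model 'xorl $0xFFFFFFFE, %eax': flips bits 1..31, leaves bit 0 as is."""
    return (cr0 ^ MASK) & 0xFFFFFFFF


def and_mask(cr0: int) -> int:
    """Model 'andl $0xFFFFFFFE, %eax': clears bit 0, keeps bits 1..31."""
    return cr0 & MASK


sample_cr0 = 0x00000011   # hypothetical value with PE set

for name, result in (("xor", xor_mask(sample_cr0)), ("and", and_mask(sample_cr0))):
    print(name, hex(result),
          "PE set" if result & CR0_PE else "PE clear",
          "PG set" if result & CR0_PG else "PG clear")
```

In this model the XOR form leaves PE untouched and inverts every other bit, while the AND form is the one that clears only PE.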
Comment 25 Jason Dagit 2005-03-24 00:05:19 UTC
I reverted wakeup.S back to the original version again, and made just the change
to clear the PE bit.  Then I did my 3 step test and the machine reboots.
Comment 26 Luming Yu 2005-03-24 00:11:03 UTC
Please post your changes before push and pop
Comment 27 Jason Dagit 2005-03-24 00:13:43 UTC
Right now the only change is adding those 3 lines:

wakeup_code:
       wakeup_code_start = .
       .code16
       
       movl  %cr0, %eax
       xorl $0xFFFFFFFE, %eax
       movl %eax, %cr0

[...]

The rest is unchanged from linux-2.6.12-rc1.
Comment 28 Jason Dagit 2005-03-24 00:18:07 UTC
Perhaps this would be easier to do over IRC.  If so, I'm currently logged into
#kernel on irc.freenode.net.  My nick is lispy.
Comment 29 Jason Dagit 2005-03-24 02:38:48 UTC
I tried the modifications that Luming suggested on irc.  But my machine still
rebooted.  As a sanity check I changed the code back to the version that skips
the push/pop if PE is set and the machine suspends/resumes again.

Hardcoding CS and so on to 0x100 and forcing PE to be disabled does not help. 
This is quite odd.

Is it possible that the bios does not set the machine to real-mode as it should?
 Perhaps the machine expects to be put into real-mode before sleep?  Speaking of
sleep it is very late in my timezone so I will work on this more another time.

Thanks again.
Comment 30 Luming Yu 2005-03-24 05:49:46 UTC
Did you find out where the reboot happens?
Comment 31 Jason Dagit 2005-03-24 14:23:38 UTC
wakeup_code:
	wakeup_code_start = .
	.code16
                
 	movw	$0xb800, %ax
	movw	%ax,%fs
	movw	$0x0e00 + 'L', %fs:(0x10)

	cli
	cld

	# setup data segment
        movw    $0x100, %ax
        movw    %ax, %cs       #### The reboot happens on this line
        movw    %ax, %ds
        movw    %ax, %ss

	mov	$(wakeup_stack - wakeup_code), %sp		# Private stack is needed for ASUS board
       	movw	$0x0e00 + 'S', %fs:(0x12)

        # Make sure the PE bit is not set
        movl %cr0, %eax
        xorl $0xFFFFFFFE, %eax
        movl %eax, %cr0

	pushl	$0						# Kill any dangerous flags
	popfl
Comment 32 Jason Dagit 2005-03-24 16:00:21 UTC
From what I understand, movw    %ax, %cs causes a reboot because the machine is
in protected mode.  So I tried moving the code that disables PE bit to before
setting up the data segment.  But the machine reboots on the line that loads
into %cr0 if I do that.

I think we are not understanding the problem.  I think a couple things could be
happening 1) The machine is not being put to sleep properly for my bios so it
does not resume properly.  This could mean that we need to force the machine to
real-mode before sleeping.  2) My bios violates the ACPI spec, and we need to
either check if we are in protected mode and deal with that case or try a
different DSDT.
Comment 33 Luming Yu 2005-03-24 17:49:35 UTC
Could you try this:

   IF PE mode then 
 ljmpl	$__KERNEL_CS,$wakeup_pmode_return

Comment 34 Luming Yu 2005-03-27 17:58:41 UTC
Regarding comment #19: the code is wrong. As written, the push/pop is skipped when the CPU is in real mode.
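
The branch in comment #19 can be traced with a small model of the flags. This is a sketch under the usual x86 semantics (test sets ZF when the AND result is zero, je jumps when ZF is set); push_pop_executed is a hypothetical name, and the CR0 values are examples only.

```python
def push_pop_executed(cr0: int) -> bool:
    """Follow the and/test/je sequence from comment #19 for a given CR0."""
    ax = cr0 & 0xFFFF            # low half of %eax after 'movl %cr0, %eax'
    ax &= 0x0001                 # and  $0x0001, %ax
    zero_flag = (ax & 0x01) == 0 # test $0x01, %ax  (ZF = 1 when result is 0)
    jump_taken = zero_flag       # je skip_push_pop jumps when ZF = 1
    return not jump_taken        # push/pop runs only if the jump is not taken


print(push_pop_executed(0x00000010))  # PE clear (real mode)
print(push_pop_executed(0x00000011))  # PE set (protected mode)
```

Read this way, the sequence skips the push/pop when PE is clear and executes it when PE is set, so a successful resume with that change would be consistent with the CPU being in real mode and never reaching the push.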
Comment 35 Luming Yu 2005-08-22 00:57:12 UTC
Please test SMP kernel (see bug 5107) 
Please test UP kernel with lapic. 
 
 
Comment 36 Adrian Bunk 2006-02-13 14:42:06 UTC
I'm assuming this issue is already fixed.

Please reopen this bug if:
- it is still present in recent 2.6 kernels and
- you can provide the requested information.
Comment 37 Jason Dagit 2006-02-13 18:21:30 UTC
This has not been fixed.  Please reopen the bug unless you can  
demonstrate a fix.

Thanks,
Jason
