Bug 7995 - S4 restores reserved RAM - confusing BIOS AML
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks: 7216
Reported: 2007-02-12 21:17 UTC by Andrey Borzenkov
Modified: 2007-08-08 09:20 UTC

See Also:
Kernel Version: 2.6.20
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
minimal kernel config where this bug happens (20.17 KB, text/plain)
2007-02-12 21:18 UTC, Andrey Borzenkov
dmesg from booting current kernel where bug happens too (13.25 KB, text/plain)
2007-02-12 21:19 UTC, Andrey Borzenkov
DSDT from affected system (26.21 KB, application/octet-stream)
2007-02-12 21:19 UTC, Andrey Borzenkov
lspci from affected system (15.47 KB, text/plain)
2007-02-12 21:20 UTC, Andrey Borzenkov
files under /proc/acpi (3.89 KB, text/plain)
2007-02-12 21:21 UTC, Andrey Borzenkov
acpidump output (122.62 KB, text/plain)
2007-02-13 10:05 UTC, Andrey Borzenkov
acpidump from 2.6.20-git boot (122.62 KB, text/plain)
2007-02-16 23:48 UTC, Andrey Borzenkov
.config for 2.6.20-git (20.95 KB, text/plain)
2007-02-16 23:49 UTC, Andrey Borzenkov
Proposed fix patch, against 2.6.22-rc3 (2.39 KB, patch)
2007-05-31 10:27 UTC, Rafael J. Wysocki

Description Andrey Borzenkov 2007-02-12 21:17:58 UTC
Most recent kernel where this bug did *NOT* occur:
Unknown. The bug goes back at least as far as 2.6.17.

Distribution:
Mandriva cooker

Hardware Environment:
Toshiba Portege 4000

Software Environment:
Kernel 2.6.17 .. 2.6.20 using either swsusp or user level suspend. Currently 
using 2.6.20

Problem Description:
If the AC adapter state changes during suspend to disk, the change is not 
reflected by ACPI after resume. As a result, all user-level tools continue to 
believe we are running on AC power, which ends in a hard power-off instead of 
a clean shutdown when the battery is empty.

Steps to reproduce:
Plug in AC. Check /proc/acpi/ac_adapter/ADP1/state - it is on-line. Suspend to 
disk, remove power cord, resume, check /proc/acpi/ac_adapter/ADP1/state - it 
is still on-line. Replug, unplug power cord - state correctly changes to 
off-line

Kernel config, dmesg, lspci, DSDT (binary) and list of files under /proc/acpi 
are attached.
Comment 1 Andrey Borzenkov 2007-02-12 21:18:43 UTC
Created attachment 10395 [details]
minimal kernel config where this bug happens
Comment 2 Andrey Borzenkov 2007-02-12 21:19:23 UTC
Created attachment 10396 [details]
dmesg from booting current kernel where bug happens too
Comment 3 Andrey Borzenkov 2007-02-12 21:19:53 UTC
Created attachment 10397 [details]
DSDT from affected system
Comment 4 Andrey Borzenkov 2007-02-12 21:20:32 UTC
Created attachment 10398 [details]
lspci from affected system
Comment 5 Andrey Borzenkov 2007-02-12 21:21:02 UTC
Created attachment 10399 [details]
files under /proc/acpi
Comment 6 Vladimir Lebedev 2007-02-13 08:55:59 UTC
What is the AC state 10 minutes after resume?
Please also repeat the test with the battery: remove the battery instead 
of the power cord.
Comment 7 Vladimir Lebedev 2007-02-13 09:10:11 UTC
Is it a stable bug?
Comment 8 Andrey Borzenkov 2007-02-13 10:05:15 UTC
Created attachment 10406 [details]
acpidump output

Re. comment #7 - yes, it is stable; it has been this way for quite some time.
Rafael also confirmed it.
Comment 9 Vladimir Lebedev 2007-02-13 10:25:38 UTC
And what about comment #6 ?
Comment 10 Andrey Borzenkov 2007-02-13 10:33:45 UTC
Re comment #6.

Yes, it persists also 10 minutes after resume. Moreover, both with power cord 
*or* battery removed it shows exactly the same information:

{pts/0}% cat /proc/acpi/ac_adapter/ADP1/state
state:                   on-line
{pts/0}% cat /proc/acpi/battery/BAT1/state
present:                 yes
capacity state:          ok
charging state:          charging
present rate:            11124 mW
remaining capacity:      34678 mWh
present voltage:         11340 mV

(Well, with the battery present, the remaining capacity really does go down 
when I check several times.)
Comment 11 Vladimir Lebedev 2007-02-13 11:03:29 UTC

The same question applies to thermal devices: does the temperature change 
under varying CPU load?
Comment 12 Andrey Borzenkov 2007-02-13 11:09:43 UTC
yes, it does.
Comment 13 Vladimir Lebedev 2007-02-14 14:34:48 UTC
Please try the latest linux-2.6.20-git*; we will start the bug investigation 
with the latest kernel.
Please post the acpidump output, .config file, and boot options.
Comment 14 Andrey Borzenkov 2007-02-16 21:59:09 UTC
It fails the same way (with the minimal config I posted) as of commit 
7de970e11fb832a56c897276967fb0e49f59b313
Comment 15 Andrey Borzenkov 2007-02-16 23:48:32 UTC
Created attachment 10439 [details]
acpidump from 2.6.20-git boot
Comment 16 Andrey Borzenkov 2007-02-16 23:49:22 UTC
Created attachment 10440 [details]
.config for 2.6.20-git
Comment 17 Andrey Borzenkov 2007-02-16 23:51:37 UTC
And I boot using grub with this entry (no extra options):

title str
kernel (hd0,1)/home/bor/build/str-bisect/arch/i386/boot/bzImage BOOT_IMAGE=str root=/dev/hda2 init=/bin/sh vga=791

Then

mount -t proc none /proc
mount -t sysfs none /sys
/sbin/swapon -a
echo disk > /sys/power/state

unplug power cord
boot into the same entry again

mount -t sysfs none /sys
echo 3:1 > /sys/power/resume

check AC state ...
Comment 18 Vladimir Lebedev 2007-02-19 09:10:59 UTC
What is the AC/battery behavior if you reboot or boot the system?
What is the AC/battery behavior if you remove/insert the AC cable or battery 
after resume or boot?

Also, there are some fundamental changes in the -mm tree; can you try the 
latest -mm?
Comment 19 Vladimir Lebedev 2007-02-19 09:23:28 UTC
Please look at bug #7689 and verify the problem without psmouse loaded.
Comment 20 Vladimir Lebedev 2007-02-21 14:08:13 UTC
Please test rc1 with and without psmouse loaded.

Comment 21 Andrey Borzenkov 2007-02-23 02:57:57 UTC
No change without psmouse
No change with 2.6.20-mm2
No change with current git as of b5bf28cde894b3bb3bd25c13a7647020562f9ea0
Comment 22 Andrey Borzenkov 2007-02-23 03:01:16 UTC
Oh, missed earlier questions.

AC state is correctly detected during boot. Replugging power cord after resume 
also works correctly (i.e. unplugging after that shows AC as unplugged). I 
guess I already mentioned this.
Comment 23 Andrey Borzenkov 2007-02-23 05:39:37 UTC
Frankly speaking (not knowing the internals), the most probable cause seems to 
be that part of system memory gets restored on resume. The AC status method 
simply returns a value from memory (which apparently gets updated only on an 
actual status change):

            Method (_PSR, 0, NotSerialized)
            {
                Return (\_SB.MEM.ACST)
            }

where ACST should be located somewhere in lower memory:

            OperationRegion (SRAM, SystemMemory, 0x000EE800, 0x1800)
                     ...
            Field (SRAM, AnyAcc, NoLock, Preserve)
                     ...
                Offset (0xFF),
                ACST,   1,
                     ...

This would explain what happens. This part of memory is marked as reserved in 
the e820 map:

 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 00000000000eee00 (reserved)
 BIOS-e820: 00000000000eee00 - 00000000000ef000 (ACPI NVS)
 BIOS-e820: 00000000000ef000 - 0000000000100000 (reserved)
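
As a cross-check of the reasoning above, here is a minimal sketch that parses 
the e820 lines quoted in this comment and looks up which region contains the 
byte holding ACST (SRAM base 0x000EE800 plus field offset 0xFF). The helper 
names are made up for illustration, and the sketch assumes the end address on 
each line is exclusive, as in the boot messages of kernels from this period.

```python
import re

# e820 lines copied verbatim from the dmesg excerpt above.
E820_LINES = """\
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 00000000000eee00 (reserved)
 BIOS-e820: 00000000000eee00 - 00000000000ef000 (ACPI NVS)
 BIOS-e820: 00000000000ef000 - 0000000000100000 (reserved)
"""

E820_RE = re.compile(r"BIOS-e820:\s+([0-9a-f]+)\s+-\s+([0-9a-f]+)\s+\((.+)\)")


def parse_e820(text):
    """Return a list of (start, end, kind) tuples; end is treated as exclusive."""
    regions = []
    for line in text.splitlines():
        match = E820_RE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            regions.append((start, end, match.group(3)))
    return regions


def region_type(regions, addr):
    """Return the e820 type covering addr, or None if no listed region covers it."""
    for start, end, kind in regions:
        if start <= addr < end:
            return kind
    return None


SRAM_BASE = 0x000EE800   # from the OperationRegion declaration in the DSDT
ACST_OFFSET = 0xFF       # from Offset (0xFF) in the Field declaration
acst_addr = SRAM_BASE + ACST_OFFSET

print(hex(acst_addr), region_type(parse_e820(E820_LINES), acst_addr))
```

Under these assumptions the ACST byte falls in the 0xe0000 - 0xeee00 range, 
which the BIOS reports as reserved rather than usable RAM.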
Comment 24 Andrey Borzenkov 2007-02-24 01:48:06 UTC
While I could not find where to check (and display) physical page addresses, 
here is indirect proof. Suspend to disk, remove AC plug, boot into kernel 
(before initiating resume), display memory state:

sh-3.1# /display_ac_state
0000000: 02                                       .

*now* initiate resume and *immediately* display memory state again:

sh-3.1# /display_ac_state
0000000: 03                                       .

where script does 

{pts/0}% cat /display_ac_state
#!/bin/sh

# Dump the byte holding ACST: SRAM base 0xee800 + field offset 0xff
/bin/dd if=/dev/mem skip=$((0xee800 + 0xff)) bs=1 count=1 2> /dev/null | /usr/bin/xxd

well, this seems to prove that this part of memory is indeed (incorrectly) 
restored during resume from STD.
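
To make the two dumps easier to read, here is a small sketch that decodes 
them. It assumes that ACST is bit 0 of the byte at offset 0xFF (it is the 
first 1-bit field declared after Offset (0xFF) in the DSDT excerpt from 
comment #23) and that 1 means on-line, following the _PSR convention. The 
function names are hypothetical, and read_byte is only a Python counterpart of 
the dd call in the script; it needs root and a kernel that permits reading 
this range through /dev/mem.

```python
SRAM_BASE = 0x000EE800   # OperationRegion base from the DSDT
ACST_OFFSET = 0xFF       # byte offset of the ACST field


def read_byte(path, addr):
    """Read one byte at offset addr from path (for example /dev/mem, as root)."""
    with open(path, "rb") as mem:
        mem.seek(addr)
        return mem.read(1)[0]


def acst_bit(byte_value):
    """Extract ACST, assumed to be the least significant bit of the byte."""
    return byte_value & 0x01


def describe(byte_value):
    """Map the ACST bit to the wording used by /proc/acpi/ac_adapter."""
    return "on-line" if acst_bit(byte_value) else "off-line"


# Values taken from the two xxd dumps above.
before_resume = 0x02   # fresh boot with AC unplugged
after_resume = 0x03    # immediately after the image was restored

print("before resume:", describe(before_resume))
print("after resume: ", describe(after_resume))
```

With these assumptions, the freshly booted kernel sees the adapter as off-line 
(matching the unplugged cord), while the restored value says on-line, which is 
the state saved in the image.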
Comment 25 Rafael J. Wysocki 2007-02-24 02:58:41 UTC
Well, this might be the reason, but I have exactly the same problem on x86_64, 
where this region is not restored during the resume.
Comment 26 Vladimir Lebedev 2007-03-07 09:37:02 UTC
There is a set of emails with patches/etc ... for your problem.
So is the problem solved or not?
Comment 27 Andrey Borzenkov 2007-03-07 23:55:29 UTC
The reason for the misbehavior is identified. I think at the moment no patch 
has been included upstream (but all of them work for me). In any case it does 
not look like an ACPI issue and should probably be reassigned to a more 
appropriate component. To me the bug is fixed when a patch is in some tree 
that has a chance to appear upstream.
Comment 28 Vladimir Lebedev 2007-03-08 05:53:03 UTC
Thanks, so I suggest closing the bug, OK?
Comment 29 Len Brown 2007-03-08 14:40:08 UTC
There doesn't seem to be a bugzilla category for
suspend-to-disk failures that are unrelated to ACPI,
so I'll drop this in Power-management/Other
and assign it to Pavel, who maintains suspend-to-disk.
Comment 30 Rafael J. Wysocki 2007-05-30 10:32:25 UTC
I'll try to prepare a patch to fix this against the current mainline.
Comment 31 Rafael J. Wysocki 2007-05-31 10:27:19 UTC
Created attachment 11624 [details]
Proposed fix patch, against 2.6.22-rc3

Andrey, can you please confirm that this patch fixes the problem?
Comment 32 Rafael J. Wysocki 2007-08-08 09:20:29 UTC
Patch from Comment #31 is present in 2.6.23-rc2.  Closing.
