Bug 12328

Summary: Low memory corruption on laptop lid close
Product: ACPI
Reporter: Gautam Iyer (gi1242)
Component: BIOS
Assignee: acpi_bios
Status: CLOSED DUPLICATE
Severity: normal
CC: bjorn.helgaas, fengguang.wu
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.28
Subsystem:
Regression: No
Bisected commit-id:

Description Gautam Iyer 2008-12-29 23:37:56 UTC
Latest working kernel version: NEVER WORKING FULLY
Earliest failing kernel version: 2.6.25
Distribution: Gentoo
Hardware Environment: HP 2710p tablet PC.
    00:00.0 Host bridge: Intel Corporation Mobile Memory Controller Hub (rev 0c)
    00:02.0 VGA compatible controller: Intel Corporation Mobile Integrated Graphics Controller (rev 0c)
    00:02.1 Display controller: Intel Corporation Mobile Integrated Graphics Controller (rev 0c)
    00:19.0 Ethernet controller: Intel Corporation 82566MM Gigabit Network Connection (rev 03)
    00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Contoller #4 (rev 03)
    00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
    00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
    00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
    00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
    00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
    00:1c.2 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 3 (rev 03)
    00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
    00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
    00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 03)
    00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
    00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
    00:1f.0 ISA bridge: Intel Corporation Mobile LPC Interface Controller (rev 03)
    00:1f.1 IDE interface: Intel Corporation Mobile IDE Controller (rev 03)
    02:09.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 05)
    02:09.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
    02:09.2 System peripheral: Ricoh Co Ltd Device 0843 (rev 12)
    10:00.0 Network controller: Broadcom Corporation BCM4312 802.11a/b/g (rev 02)

Software Environment:
    If some fields are empty or look unusual you may have an old version.
    Compare to the current minimal requirements in Documentation/Changes.
     
    Linux mordor 2.6.28 #4 SMP Mon Dec 29 18:44:33 IST 2008 i686 Intel(R) Core(TM)2 Duo CPU U7600 @ 1.20GHz GenuineIntel GNU/Linux
     
    Gnu C                  4.1.2
    Gnu make               3.81
    binutils               2.18
    util-linux             /usr/src/linux/scripts/ver_linux: line 23: fdformat: command not found
    mount                  support
    module-init-tools      found
    Linux C Library        2.6.1
    Dynamic linker (ldd)   2.6.1
    Procps                 3.2.7
    Kbd                    1.13
    Sh-utils               6.10
    udev                   124
    Modules Loaded         rfkill_input b43 ssb rfkill mac80211 crc32 input_polldev i915 drm libafs fuse ipv6 autofs4 xt_recent ipt_addrtype xt_multiport xt_mac ipt_MASQUERADE xt_state xt_tcpudp ipt_REJECT ipt_LOG xt_limit iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_iprange x_tables nf_conntrack_ftp nf_conntrack snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ext2 mbcache arc4 ecb rng_core bitrev loop acpi_cpufreq intelfb snd_hda_intel ehci_hcd uhci_hcd sdhci_pci fb i2c_algo_bit snd_pcm usbcore sdhci mmc_core cfbcopyarea snd_timer wmi video backlight output snd_page_alloc battery intel_agp 8250_pnp 8250 serial_core e1000e leds_hp_disk led_class i2c_core cfbimgblt cfbfillrect snd_hwdep serio_raw button ac snd soundcore agpgart evdev

Problem Description:

    This occurs 'randomly' when my laptop lid is closed. Sometimes everything is just fine. Sometimes I get a complete and immediate system freeze (with no log messages). After upgrading to 2.6.28, I sometimes get the following messages instead of a freeze:

	Corrupted low memory at c000fe2c (fe2c phys) = 00000010
	Corrupted low memory at c000fe38 (fe38 phys) = 00000ff0
	Corrupted low memory at c000fe40 (fe40 phys) = 000006d0
	Corrupted low memory at c000fe4c (fe4c phys) = 00000800
	Corrupted low memory at c000fe50 (fe50 phys) = cff3007b
	Corrupted low memory at c000fe54 (fe54 phys) = 000fffff
	Corrupted low memory at c000fe5c (fe5c phys) = 8f9300d8
	Corrupted low memory at c000fe60 (fe60 phys) = 000fffff
	Corrupted low memory at c000fe64 (fe64 phys) = 01366000
	Corrupted low memory at c000fe68 (fe68 phys) = cf000000
	Corrupted low memory at c000fe6c (fe6c phys) = 000fffff
	Corrupted low memory at c000fe74 (fe74 phys) = 00820000
	Corrupted low memory at c000fe78 (fe78 phys) = 000007ff
	Corrupted low memory at c000fe7c (fe7c phys) = c0424000
	Corrupted low memory at c000fe80 (fe80 phys) = 008b0080
	Corrupted low memory at c000fe84 (fe84 phys) = 0000206b
	Corrupted low memory at c000fe88 (fe88 phys) = c17f7800
	Corrupted low memory at c000fe8c (fe8c phys) = c17f4000
	Corrupted low memory at c000fe94 (fe94 phys) = c0424000
	Corrupted low memory at c000fea4 (fea4 phys) = 00008000
	Corrupted low memory at c000fea8 (fea8 phys) = 00820000
	Corrupted low memory at c000feac (feac phys) = 000000ff
	Corrupted low memory at c000feb0 (feb0 phys) = c17f4000
	Corrupted low memory at c000feb4 (feb4 phys) = cf000000
	Corrupted low memory at c000feb8 (feb8 phys) = 000fffff
	Corrupted low memory at c000fec0 (fec0 phys) = cff3007b
	Corrupted low memory at c000fec4 (fec4 phys) = 000fffff
	Corrupted low memory at c000fecc (fecc phys) = cf9b0060
	Corrupted low memory at c000fed0 (fed0 phys) = 000fffff
	Corrupted low memory at c000fed8 (fed8 phys) = cf930068
	Corrupted low memory at c000fedc (fedc phys) = 000fffff
	Corrupted low memory at c000fee4 (fee4 phys) = fedb0000
	Corrupted low memory at c000fee8 (fee8 phys) = 00000001
	Corrupted low memory at c000feec (feec phys) = 00050000
	Corrupted low memory at c000fef8 (fef8 phys) = fedb0000
	Corrupted low memory at c000fefc (fefc phys) = 00030100
	Corrupted low memory at c000ff14 (ff14 phys) = 000000ff
	Corrupted low memory at c000ff18 (ff18 phys) = 0000778e
	Corrupted low memory at c000ff5c (ff5c phys) = 000000f3
	Corrupted low memory at c000ff64 (ff64 phys) = 00000008
	Corrupted low memory at c000ff6c (ff6c phys) = 000000b2
	Corrupted low memory at c000ff74 (ff74 phys) = 000000b2
	Corrupted low memory at c000ff7c (ff7c phys) = f70edd2c
	Corrupted low memory at c000ff84 (ff84 phys) = f700f960
	Corrupted low memory at c000ff8c (ff8c phys) = 000000b2
	Corrupted low memory at c000ff94 (ff94 phys) = f70edddc
	Corrupted low memory at c000ffa4 (ffa4 phys) = 00b20003
	Corrupted low memory at c000ffa8 (ffa8 phys) = 0000007b
	Corrupted low memory at c000ffac (ffac phys) = 00000060
	Corrupted low memory at c000ffb0 (ffb0 phys) = 00000068
	Corrupted low memory at c000ffb4 (ffb4 phys) = 0000007b
	Corrupted low memory at c000ffb8 (ffb8 phys) = 000000d8
	Corrupted low memory at c000ffc4 (ffc4 phys) = 00000080
	Corrupted low memory at c000ffc8 (ffc8 phys) = 00000400
	Corrupted low memory at c000ffd0 (ffd0 phys) = ffff0ff0
	Corrupted low memory at c000ffd8 (ffd8 phys) = c022afc0
	Corrupted low memory at c000ffe8 (ffe8 phys) = 00000246
	Corrupted low memory at c000fff0 (fff0 phys) = 00494000
	Corrupted low memory at c000fff8 (fff8 phys) = 8005003b
	------------[ cut here ]------------
	WARNING: at arch/x86/kernel/setup.c:718 check_for_bios_corruption+0xb6/0xc0()
	Memory corruption detected in low memory
	Modules linked in: b43 ssb rfkill mac80211 crc32 input_polldev i915 drm libafs(P) fuse ipv6 autofs4 xt_recent ipt_addrtype xt_multiport xt_mac ipt_MASQUERADE xt_state xt_tcpudp ipt_REJECT ipt_LOG xt_limit iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_iprange x_tables nf_conntrack_ftp nf_conntrack snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ext2 mbcache arc4 ecb rng_core bitrev loop acpi_cpufreq ehci_hcd uhci_hcd sdhci_pci usbcore sdhci mmc_core intelfb fb i2c_algo_bit cfbcopyarea snd_hda_intel snd_pcm video backlight output snd_timer snd_page_alloc wmi battery 8250_pnp 8250 serial_core button e1000e leds_hp_disk led_class i2c_core cfbimgblt cfbfillrect snd_hwdep intel_agp ac serio_raw snd soundcore agpgart evdev [last unloaded: input_polldev]
	Pid: 0, comm: swapper Tainted: P       A   2.6.28 #4
	Call Trace:
	[<c012a656>] warn_slowpath+0x76/0x90
	[<c011eb85>] place_entity+0xb5/0xd0
	[<c0141611>] up+0x11/0x40
	[<c012ae1d>] release_console_sem+0x19d/0x1d0
	[<c011dcf6>] activate_task+0x16/0x20
	[<c0123700>] try_to_wake_up+0x20/0x180
	[<c013d2ab>] autoremove_wake_function+0x1b/0x50
	[<c011e11b>] __wake_up_common+0x4b/0x80
	[<c012b3ab>] printk+0x1b/0x20
	[<c0107986>] check_for_bios_corruption+0xb6/0xc0
	[<c0107995>] periodic_check_for_corruption+0x5/0x30
	[<c0133274>] run_timer_softirq+0x144/0x1b0
	[<c0107990>] periodic_check_for_corruption+0x0/0x30
	[<c0107990>] periodic_check_for_corruption+0x0/0x30
	[<c012efe4>] __do_softirq+0x94/0x160
	[<c012f0e7>] do_softirq+0x37/0x40
	[<c012f479>] irq_exit+0x59/0x80
	[<c0106462>] do_IRQ+0x52/0x90
	[<c0118aba>] read_hpet+0xa/0x10
	[<c014332d>] getnstimeofday+0x3d/0xe0
	[<c01047a3>] common_interrupt+0x23/0x28
	[<c014007b>] __run_hrtimer+0x9b/0xa0
	[<c024920c>] acpi_idle_enter_bm+0x28d/0x2f8
	[<c02b69a9>] cpuidle_idle_call+0x69/0xc0
	[<c010273b>] cpu_idle+0x4b/0xa0
	---[ end trace 2ff63db346bda773 ]---

This could be related to bug #11259. (I didn't know whether I should post there or start a new one, so I did both.)

The memory areas reported in the above message are always the same. However, booting with memmap=0x01d4$0xc000fe2c does not change anything.
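
As a rough cross-check of the reserved range, here is a minimal sketch that derives a memmap= value from the log above. It assumes that memmap=nn$ss expects a physical start address (the "phys" column) rather than the c000xxxx virtual address, and that each log line reports one 4-byte word on this i686 kernel. The function names are hypothetical and not part of any kernel tool.

```python
import re

# Matches lines such as:
#   Corrupted low memory at c000fe2c (fe2c phys) = 00000010
LINE_RE = re.compile(
    r"Corrupted low memory at ([0-9a-f]+) \(([0-9a-f]+) phys\) = ([0-9a-f]{8})"
)

WORD_SIZE = 4  # assumption: one 32-bit word per reported line (i686 kernel)


def corrupted_span(log_text):
    """Return (start, length) in bytes covering every reported physical address."""
    phys = [int(m.group(2), 16) for m in LINE_RE.finditer(log_text)]
    if not phys:
        raise ValueError("no 'Corrupted low memory' lines found")
    start = min(phys)
    end = max(phys) + WORD_SIZE
    return start, end - start


def memmap_option(start, length):
    """Format a memmap=nn$ss string from a physical start and a byte length."""
    return "memmap=0x{:x}$0x{:x}".format(length, start)


# First and last lines of the log above.
sample = """\
Corrupted low memory at c000fe2c (fe2c phys) = 00000010
Corrupted low memory at c000fff8 (fff8 phys) = 8005003b
"""

start, length = corrupted_span(sample)
print(memmap_option(start, length))
```

Whether reserving that physical range would actually change the behaviour on this machine is untested here.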

Steps to reproduce:

    Close the laptop lid a few times and cross your fingers.
Comment 1 Zhang Rui 2009-01-07 23:24:41 UTC
Can you reproduce this problem with the boot option "maxcpus=1"?
Can you reproduce this problem after setting /proc/acpi/video/*/DOS to 1?
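
For the second test, a minimal sketch of one way to write the value to every video device at once. It assumes the /proc/acpi/video/*/DOS files exist on the running kernel and that the script runs as root. The function name and the root parameter are hypothetical and only there to make the sketch testable.

```python
import glob
import os


def set_video_dos(value=1, root="/proc/acpi/video"):
    """Write `value` to every <root>/*/DOS file and return the paths written."""
    written = []
    for path in sorted(glob.glob(os.path.join(root, "*", "DOS"))):
        with open(path, "w") as f:
            f.write(str(value))
        written.append(path)
    return written
```

Calling set_video_dos(1) and then closing the lid a few times would exercise the case asked about above.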
Comment 2 Gautam Iyer 2009-01-19 11:38:55 UTC
(In reply to comment #1)

Sorry for the delayed response. This is my primary computer, and I had to wait till the system was rebootable before I tried this.

> Can you reproduce this problem with the boot option "maxcpus=1"?

No.

> Can you reproduce this problem after setting /proc/acpi/video/*/DOS to 1?

No.

I would add: after some boots, the system *never* crashes on a lid close. I've recently had it running for 2 weeks (once even up to 45 days) with repeated lid closes and hibernates and no problems at all. However, after a subsequent reboot, it would almost certainly crash on a lid close.

Don't know if that helps,

GI
Comment 3 ykzhao 2009-02-23 01:03:52 UTC
Hi, Rui
    How about this bug? It seems to be a duplicate of bug 12106.
    Thanks.
Comment 4 Zhang Rui 2009-02-23 01:13:43 UTC
The problem can be solved by an IGD OpRegion fix.
A patch will be attached later.

*** This bug has been marked as a duplicate of bug 11259 ***
Comment 5 Gautam Iyer 2009-02-23 12:09:39 UTC
(In reply to comment #4)
> The problem can be solved by an IGD OpRegion fix.
> A patch will be attached later.
> 
> *** This bug has been marked as a duplicate of bug 11259 ***
> 

Ooh!! Patch? Will it be in GIT upstream? How can I get it? Eagerly waiting, and many many many thanks!!

GI
Comment 6 Zhang Rui 2009-02-23 17:37:57 UTC
> 
> Ooh!! Patch? Will it be in GIT upstream?
No, we have only root-caused this problem.
Using a customized DSDT or optimizing the IGD OpRegion code could help with this issue.
But we do not have a solution yet, because of some driver dependency problem.

> How can I get it? Eagerly waiting, and
> many many many thanks!!
> 
Hah, I'll update as soon as the patch is available.
Comment 7 Bjorn Helgaas 2009-07-29 22:38:42 UTC
I think this might be related to bug 13751, and I just added a patch to that report.  If this is still an issue, Gautam, please test that patch and report the results.
Comment 8 Gautam Iyer 2009-07-30 01:38:12 UTC
(In reply to comment #6)

>> Ooh!! Patch? Will it be in GIT upstream?
>> 
> No, we have only root-caused this problem. Using a customized DSDT or
> optimizing the IGD OpRegion code could help with this issue. But we do
> not have a solution yet, because of some driver dependency problem.

It didn't quite work for me (I tried the patch in 2.6.30), so perhaps
this is not related to bug 11259.

(In reply to comment #7)
> I think this might be related to bug 13751, and I just added a patch
> to that report.  If this is still an issue, Gautam, please test that
> patch and report the results.

OK (It'll take a couple of weeks as I'm moving right now).

GI