Bug 3918 - S3 resume: main memory bandwidth decreased
Summary: S3 resume: main memory bandwidth decreased
Status: REJECTED WILL_NOT_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: BIOS
Hardware: i386 Linux
Importance: P2 normal
Assignee: Shaohua
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-18 20:57 UTC by Ron Rechenmacher
Modified: 2005-01-10 14:36 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.10-rc3-bk12 and also 2.6.10-rc3 and 2.6.9 w/ acpi-20041203-2
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
lspci -xxx before suspend (15.23 KB, text/plain), 2004-12-27 22:01 UTC, Ron Rechenmacher
lspci -xxx after suspend (15.23 KB, text/plain), 2004-12-27 22:03 UTC, Ron Rechenmacher

Description Ron Rechenmacher 2004-12-18 20:57:21 UTC
Distribution:
     Fermi [Redhat] Linux LTS Release 3.0.1 (Feynman)

Hardware Environment:
     Dell Precision M60 laptop, 1.8 GHz Pentium M (stepping 6), 1 GB RAM

Software Environment:
     Simple C program measuring mem copy performance.
          ++ uname -r
     + cd /usr/src/linux-2.6.10-rc3-bk12
     + sh scripts/ver_linux
     If some fields are empty or look unusual you may have an old version.
     Compare to the current minimal requirements in Documentation/Changes.
      
     Linux ron.lap 2.6.10-rc3-bk12 #2 Sat Dec 18 20:10:41 CST 2004 i686 i686
i386 GNU/Linux
      
     Gnu C                  3.2.3
     Gnu make               3.79.1
     binutils               2.14.90.0.4
     util-linux             2.11y
     mount                  2.11y
     module-init-tools      3.1-pre5
     e2fsprogs              1.32
     jfsutils               1.1.2
     reiserfsprogs          line
     reiser4progs           line
     pcmcia-cs              3.1.31
     quota-tools            3.09.
     PPP                    2.4.1
     isdn4k-utils           3.1pre4
     nfs-utils              1.0.5
     Linux C Library        2.3.2
     Dynamic linker (ldd)   2.3.2
     Procps                 2.0.13
     Net-tools              1.60
     Kbd                    1.08
     Sh-utils               4.5.3
     Modules Loaded         pcspkr psmouse nls_iso8859_1 nls_cp437 mousedev
ehci_hcd uhci_hcd usbcore
     + cat /proc/cpuinfo
     processor  : 0
     vendor_id  : GenuineIntel
     cpu family : 6
     model              : 13
     model name : Intel(R) Pentium(R) M processor 1.80GHz
     stepping   : 6
     cpu MHz            : 1794.542
     cache size : 2048 KB
     fdiv_bug   : no
     hlt_bug            : no
     f00f_bug   : no
     coma_bug   : no
     fpu                : yes
     fpu_exception      : yes
     cpuid level        : 2
     wp         : yes
     flags              : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov
pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe est tm2
     bogomips   : 3555.32
     
     + cat /proc/modules
     pcspkr 3336 0 - Live 0xf8871000
     psmouse 22216 0 - Live 0xf8918000
     nls_iso8859_1 3968 1 - Live 0xf8858000
     nls_cp437 5632 1 - Live 0xf8855000
     mousedev 11160 1 - Live 0xf886d000
     ehci_hcd 30848 0 - Live 0xf8864000
     uhci_hcd 33608 0 - Live 0xf885a000
     usbcore 120008 3 ehci_hcd,uhci_hcd, Live 0xf8873000
     + cat /proc/ioports
     0000-001f : dma1
     0020-0021 : pic1
     0040-0043 : timer0
     0050-0053 : timer1
     0060-006f : keyboard
     0070-0077 : rtc
     0080-008f : dma page reg
     00a0-00a1 : pic2
     00c0-00df : dma2
     00f0-00ff : fpu
     0170-0177 : ide1
     01f0-01f7 : ide0
     0376-0376 : ide1
     03c0-03df : vga+
     03f6-03f6 : ide0
     04d0-04d1 : pnp 00:01
     0800-087f : 0000:00:1f.0
       0800-0803 : PM1a_EVT_BLK
       0804-0805 : PM1a_CNT_BLK
       0806-0807 : pnp 00:02
       0808-080b : PM_TMR
       0820-0820 : PM2_CNT_BLK
       0828-082f : GPE0_BLK
       0860-087f : pnp 00:02
     0880-08bf : 0000:00:1f.0
       0880-08bf : pnp 00:02
     08c0-08df : pnp 00:02
     08e0-08e5 : ACPI CPU throttle
     0900-097f : pnp 00:07
     0cf8-0cff : PCI conf1
     b800-b8ff : 0000:00:1f.5
     bc40-bc7f : 0000:00:1f.5
     bf20-bf3f : 0000:00:1d.2
       bf20-bf3f : uhci_hcd
     bf40-bf5f : 0000:00:1d.1
       bf40-bf5f : uhci_hcd
     bf80-bf9f : 0000:00:1d.0
       bf80-bf9f : uhci_hcd
     bfa0-bfaf : 0000:00:1f.1
       bfa0-bfa7 : ide0
       bfa8-bfaf : ide1
     c000-cfff : PCI Bus #01
     ecf8-ecff : 0000:02:01.3
     f400-f4fe : motherboard
       f400-f4fe : pnp 00:02
     + cat /proc/iomem
     00000000-0009efff : System RAM
     0009f000-0009ffff : reserved
     000a0000-000bffff : Video RAM area
     000c0000-000cf7ff : Video ROM
     000f0000-000fffff : System ROM
     00100000-3ffadfff : System RAM
       00100000-003e4bdd : Kernel code
       003e4bde-0057733f : Kernel data
     3ffae000-3fffffff : reserved
     40000000-400003ff : 0000:00:1f.1
     40001000-40001fff : 0000:02:01.0
     40002000-40002fff : 0000:02:01.1
     d0000000-dfffffff : PCI Bus #01
       d0000000-dfffffff : 0000:01:00.0
     e0000000-e7ffffff : 0000:00:00.0
     f4fff400-f4fff4ff : 0000:00:1f.5
     f4fff800-f4fff9ff : 0000:00:1f.5
     f4fffc00-f4ffffff : 0000:00:1d.7
       f4fffc00-f4ffffff : ehci_hcd
     fafe8000-fafebfff : 0000:02:01.2
     fafee000-fafeefff : 0000:02:03.0
     fafef800-fafeffff : 0000:02:01.2
       fafef800-fafeffff : ohci1394
     faff0000-faffffff : 0000:02:00.0
       faff0000-faffffff : tg3
     fc000000-fdffffff : PCI Bus #01
       fc000000-fcffffff : 0000:01:00.0
     feda0000-fedfffff : reserved
     ffb00000-ffffffff : reserved
     + lspci -vvv
     00:00.0 Host bridge: Intel Corp. 82855PM Processor to I/O Controller (rev 03)
        Subsystem: Dell: Unknown device 013f
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
        Latency: 0
        Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
        Capabilities: [e4] #09 [4104]
        Capabilities: [a0] AGP version 2.0
                Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2,x4
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
     
     00:01.0 PCI bridge: Intel Corp. 82855PM Processor to AGP Controller (rev
03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
        Latency: 32
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
        I/O behind bridge: 0000c000-0000cfff
        Memory behind bridge: fc000000-fdffffff
        Prefetchable memory behind bridge: d0000000-dfffffff
        BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
     
     00:1d.0 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #1 (rev 01) (prog-if 00 [UHCI])
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin A routed to IRQ 11
        Region 4: I/O ports at bf80 [size=32]
     
     00:1d.1 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #2 (rev 01) (prog-if 00 [UHCI])
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin B routed to IRQ 11
        Region 4: I/O ports at bf40 [size=32]
     
     00:1d.2 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #3 (rev 01) (prog-if 00 [UHCI])
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin C routed to IRQ 11
        Region 4: I/O ports at bf20 [size=32]
     
     00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB 2.0 EHCI
Controller (rev 01) (prog-if 20 [EHCI])
        Subsystem: Dell: Unknown device 013f
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin D routed to IRQ 11
        Region 0: Memory at f4fffc00 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] #0a [2080]
     
     00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev 81) (prog-if 00
[Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR+
        Latency: 0
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
        I/O behind bridge: 0000d000-0000efff
        Memory behind bridge: f6000000-fbffffff
        Prefetchable memory behind bridge: fff00000-000fffff
        BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
     
     00:1f.0 ISA bridge: Intel Corp. 82801DBM LPC Interface Controller (rev 01)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
     
     00:1f.1 IDE interface: Intel Corp. 82801DBM (ICH4) Ultra ATA Storage
Controller (rev 01) (prog-if 8a [Master SecP PriP])
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at <ignored>
        Region 1: I/O ports at <ignored>
        Region 2: I/O ports at <ignored>
        Region 3: I/O ports at <ignored>
        Region 4: I/O ports at bfa0 [size=16]
        Region 5: Memory at 40000000 (32-bit, non-prefetchable) [size=1K]
     
     00:1f.5 Multimedia audio controller: Intel Corp. 82801DB/DBL/DBM
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 0
        Interrupt: pin B routed to IRQ 11
        Region 0: I/O ports at b800 [size=256]
        Region 1: I/O ports at bc40 [size=64]
        Region 2: Memory at f4fff800 (32-bit, non-prefetchable) [size=512]
        Region 3: Memory at f4fff400 (32-bit, non-prefetchable) [size=256]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
     
     01:00.0 VGA compatible controller: nVidia Corporation NVIDIA Quadro FX 700
Go (rev a1) (prog-if 00 [VGA])
        Subsystem: Dell: Unknown device 019b
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+ ParErr-
Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (1250ns min, 250ns max)
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Expansion ROM at 80000000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [44] AGP version 3.0
                Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2,x4
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
     
     02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705M
Gigabit Ethernet (rev 01)
        Subsystem: Dell: Unknown device 865d
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (16000ns min), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at faff0000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [48] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable-
                Address: fffffbfdfbff4bb8  Data: 7bf7
     
     02:01.0 CardBus bridge: Texas Instruments: Unknown device ac47 (rev 01)
        Subsystem: Dell: Unknown device 013f
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin A routed to IRQ 255
        Region 0: Memory at 40001000 (32-bit, non-prefetchable) [disabled] [size=4K]
        Bus: primary=02, secondary=03, subordinate=06, sec-latency=176
        Memory window 0: 00000000-00000000 [disabled] (prefetchable)
        Memory window 1: 00000000-00000000 [disabled] (prefetchable)
        I/O window 0: 00000000-00000003 [disabled]
        I/O window 1: 00000000-00000003 [disabled]
        BridgeCtl: Parity- SERR- ISA- VGA- MAbort- >Reset+ 16bInt- PostWrite+
        16-bit legacy interface ports at 0001
     
     02:01.1 CardBus bridge: Texas Instruments: Unknown device ac4a (rev 01)
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at 40002000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=02, secondary=07, subordinate=0a, sec-latency=176
        Memory window 0: 00000000-00000000 (prefetchable)
        Memory window 1: 00000000-00000000 (prefetchable)
        I/O window 0: 00000000-00000003
        I/O window 1: 00000000-00000003
        BridgeCtl: Parity- SERR- ISA- VGA- MAbort- >Reset+ 16bInt- PostWrite-
        16-bit legacy interface ports at 0001
     
     02:01.2 FireWire (IEEE 1394): Texas Instruments: Unknown device 802b
(prog-if 10 [OHCI])
        Subsystem: Dell: Unknown device 013f
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (500ns min, 1000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at fafef800 (32-bit, non-prefetchable) [size=2K]
        Region 1: Memory at fafe8000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME+
     
     02:01.3 System peripheral: Texas Instruments: Unknown device 8204
        Subsystem: Dell: Unknown device 013f
        Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Region 0: I/O ports at ecf8 [size=8]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
     
     02:03.0 Network controller: Intel Corp. PRO/Wireless 2200BG (rev 05)
        Subsystem: Intel Corp.: Unknown device 2721
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (750ns min, 6000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at fafee000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
     
     + cat /proc/scsi/scsi
     Attached devices:
     + cat /proc/mtrr
     reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
     reg01: base=0xfeda0000 (4077MB), size= 128KB: write-through, count=1

Problem Description:
     Main memory bandwidth, as measured by a simple program that copies 32-bit
words from one memory buffer to another, decreases after waking from the first
suspend to RAM following a boot/reboot:

     Performance before 1st suspend (i.e. measured right after reboot):
     # loops     bytes     seconds     bytes/second
     4194304       256  2.100492 511185866.789113
     2097152       512  2.073641 517805054.977469
     1048576      1024  2.060252 521170092.380577
      524288      2048  2.052698 523088030.088424
      262144      4096  2.039247 526538376.692490
      131072      8192  2.044709 525131799.203941
       65536     16384  2.057255 521929371.181502
       32768     32768  2.053823 522801539.867567
       16384     65536  2.053944 522770772.074838
        8192    131072  2.238565 479656315.227236
        4096    262144  2.054797 522553740.550410
        2048    524288  2.054639 522593942.682512
        1024   1048576  2.088225 514188787.926278
         512   2097152  2.257877 475553754.428407
         256   4194304  2.819464 380831901.411452
         128   8388608  3.121978 343929972.822267
          64  16777216  3.165082 339246133.754594
          32  33554432  3.157960 340011224.038280
          16  67108864  2.926294 366928913.861524
     
     Performance after 1st suspend:
     # loops     bytes     seconds     bytes/second
     4194304       256  2.100117 511277152.787996
     2097152       512  2.073161 517924926.797557
     1048576      1024  2.060612 521079098.412580
      524288      2048  2.052631 523105103.105875
      262144      4096  2.039196 526551550.912906
      131072      8192  2.044657 525145209.305469
       65536     16384  2.057808 521789078.527142
       32768     32768  2.070937 518481123.648766
       16384     65536  2.053861 522791890.406682
        8192    131072  2.053784 522811493.143815
        4096    262144  2.053972 522763611.648508
        2048    524288  2.055298 522426383.691156
        1024   1048576  2.181485 492206847.482478
         512   2097152  2.566232 418411834.244989
         256   4194304  3.184070 337223046.854247
         128   8388608  3.103361 345993218.834314
          64  16777216  3.290089 326356478.275219
          32  33554432  3.364081 319178336.648871
          16  67108864  3.293490 326019464.409720

Steps to reproduce:
     1. boot or reboot
     2. measure main memory bandwidth
     3. suspend to RAM (echo mem >/sys/power/state)
     4. wake (via power button)
     5. remeasure main memory bandwidth and notice the decrease in performance
Comment 1 Ron Rechenmacher 2004-12-18 21:21:38 UTC
The output for before and after the 1st sleep should have been:

Performance before 1st suspend (i.e. measured right after reboot):
# loops     bytes     seconds     bytes/second
4194304       256  2.128207 504528854.566207
2097152       512  2.064953 519983640.231064
1048576      1024  2.038665 526688688.129248
 524288      2048  2.030522 528800915.687797
 262144      4096  2.027024 529713416.193157
 131072      8192  2.027298 529641837.541972
  65536     16384  2.064885 520000751.362871
  32768     32768  2.050999 523521397.453337
  16384     65536  2.050308 523697819.887343
   8192    131072  2.050273 523706772.020141
   4096    262144  2.050425 523667920.764058
   2048    524288  2.051523 523387668.379623
   1024   1048576  2.088559 514106494.729768
    512   2097152  2.205559 486834320.262604
    256   4194304  2.440570 439955370.196205
    128   8388608  2.571250 417595232.034043
     64  16777216  2.581927 415868379.934201
     32  33554432  2.576516 416741778.530588
     16  67108864  2.592133 414230984.782878

Performance after 1st suspend:
# loops     bytes     seconds     bytes/second
4194304       256  2.100117 511277152.787996
2097152       512  2.073161 517924926.797557
1048576      1024  2.060612 521079098.412580
 524288      2048  2.052631 523105103.105875
 262144      4096  2.039196 526551550.912906
 131072      8192  2.044657 525145209.305469
  65536     16384  2.057808 521789078.527142
  32768     32768  2.070937 518481123.648766
  16384     65536  2.053861 522791890.406682
   8192    131072  2.053784 522811493.143815
   4096    262144  2.053972 522763611.648508
   2048    524288  2.055298 522426383.691156
   1024   1048576  2.181485 492206847.482478
    512   2097152  2.566232 418411834.244989
    256   4194304  3.184070 337223046.854247
    128   8388608  3.103361 345993218.834314
     64  16777216  3.290089 326356478.275219
     32  33554432  3.364081 319178336.648871
     16  67108864  3.293490 326019464.409720

A big "I'm sorry" for the silly mistake of entering the wrong output
(showing performance about the same before and after) :(
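
To put a number on the corrected tables above, here is a minimal sketch (not part of the original test program) that computes the relative drop in copy rate. The name `pct_drop` is made up for illustration; the inputs are the bytes/second values printed by the program.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which `after` is lower than `before`.
 * Both arguments are rates in bytes/second as printed by the test program.
 * A negative result means `after` was actually faster. */
double pct_drop(double before, double after)
{
    assert(before > 0.0);   /* a zero baseline has no meaningful ratio */
    return 100.0 * (before - after) / before;
}
```

For the 67108864-byte rows, `pct_drop(414230984.782878, 326019464.409720)` comes out at roughly 21%, while the 256-byte rows, which fit in cache, are within about 1.5% of each other.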

Here's the simple program used to produce the output:

#include <stdio.h>              /* stderr, stdout, fprintf, printf */
#include <stdint.h>             /* uint32_t */
#include <stdlib.h>             /* malloc */
#include <sys/time.h>           /* gettimeofday */

int     g_bufsiz=0x4000000;
int     g_loop_multiplier=16;

int
main(  int      argc
     , char     *argv[] )
{
    struct timeval      t0_s, t1_s;
    uint32_t            *buf1, *buf2;
    int                 siz, xx;
    int                 loop, loops=1;
    double              mark, curr, delta;

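    buf1 = malloc( g_bufsiz );
    buf2 = malloc( g_bufsiz );
    if (buf1 == NULL || buf2 == NULL)
    {   fprintf( stderr, "malloc of %d bytes failed\n", g_bufsiz );
        return (1);
    }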
    for (xx=0; (xx<<2)<g_bufsiz; xx++) buf2[xx] = xx;

    for (siz = 0x100; siz<g_bufsiz; siz<<=1) loops<<=1;
    loops *= g_loop_multiplier;

    printf( "# loops     bytes     seconds     bytes/second\n" );
    for (siz = 0x100; siz<=g_bufsiz; )
    {   gettimeofday( &t0_s, NULL );
        for (loop=0; loop<loops; loop++)
        {
            for (xx=0; (xx<<2)<siz; xx++)
            {   buf1[xx] = buf2[xx];
            }
        }
        gettimeofday( &t1_s, NULL );
        curr = (double)t1_s.tv_usec/1000000;
        mark = (double)t0_s.tv_usec/1000000;
        curr += (double)t1_s.tv_sec;
        mark += (double)t0_s.tv_sec;
        delta = curr - mark;
        printf(  "%7d  %8d  %f %f\n"
               , loops, siz, delta, ((double)loops*siz)/delta );
        siz<<=1;
        loops>>=1;
    }
    return (0);
}   /* main */
Comment 2 Len Brown 2004-12-22 20:07:19 UTC
This program accesses 128 MB of RAM.
What does vmstat say about the free memory
when this program is running before and after suspend?
If you run the program a 2nd time after resume, does it get the same answer?
 
Comment 3 Ron Rechenmacher 2004-12-22 22:53:39 UTC
Yes, the program I used (as an example) is simplistic and one should be careful
to have enough free memory. I reran the exact cases I ran before and vmstat did
show that both before and after the sleep, I had a lot of free memory:
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0      0 979224   8312  23696    0    0    37    11 1011    40  7  0 90  2

Yes, when I run the program a 2nd time, I do get the same answer. Additionally
(or actually originally), I see the same decrease in main memory bandwidth using
the STREAM benchmark (Ref. http://www.cs.virginia.edu/stream/):
before sleep:
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 5 microseconds.
Each test below will take on the order of 31033 microseconds.
   (= 6206 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         918.3507       0.0349       0.0348       0.0354
Scale:        899.5326       0.0356       0.0356       0.0357
Add:         1156.2094       0.0415       0.0415       0.0416
Triad:       1152.0744       0.0417       0.0417       0.0417

after resume:
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 5 microseconds.
Each test below will take on the order of 38527 microseconds.
   (= 7705 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         742.3733       0.0432       0.0431       0.0438
Scale:        727.7368       0.0440       0.0440       0.0440
Add:          934.2698       0.0514       0.0514       0.0514
Triad:        931.5138       0.0515       0.0515       0.0517

The STREAM benchmark (stream_d), I believe, locks its pages.  The results
are also repeatable.
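
For reference, STREAM derives each rate from the number of bytes moved per pass divided by the best (minimum) time. The sketch below illustrates that arithmetic and is not code taken from the benchmark; `stream_rate_mbs` is a made-up name, and it assumes STREAM's convention that Copy and Scale touch two arrays per element, Add and Triad touch three, and 1 MB means 10^6 bytes.

```c
#include <assert.h>

/* Hypothetical helper reproducing STREAM's rate arithmetic.
 *   arrays   : arrays touched per element (2 for Copy/Scale, 3 for Add/Triad)
 *   n        : array size in elements (2000000 in the runs above)
 *   min_time : best time for the kernel, in seconds
 * Returns MB/s with 1 MB = 1e6 bytes; elements are 8-byte doubles. */
double stream_rate_mbs(int arrays, long n, double min_time)
{
    assert(min_time > 0.0);
    return 1.0e-6 * arrays * 8.0 * (double)n / min_time;
}
```

With the rounded Min times printed above, `stream_rate_mbs(2, 2000000, 0.0348)` gives about 919.5 MB/s before suspend and `stream_rate_mbs(2, 2000000, 0.0431)` about 742.5 MB/s after resume, which agrees with the reported Copy rates to within the rounding of the printed times.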
Comment 4 Shaohua 2004-12-22 23:10:09 UTC
This is possibly caused by a decreased CPU frequency. Please try to load the
cpufreq driver and scale CPU frequency to maximum, then retest.
Comment 5 Ron Rechenmacher 2004-12-23 00:04:21 UTC
I do not know exactly what you mean by "load the cpufreq driver and scale CPU
frequency to maximum". Perhaps you mean:
   cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq \
      >/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
?

I doubt that it is CPU freq because the "cache" performance does not change,
just the main memory ("bigger than cache") performance changes. I do see that
before and after the sleep both
        /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
   and  /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
have 1800000, the max frequency for my processor. I went ahead and did the "cat"
mentioned above anyway and there was no change.  Again, I doubt it's anything to
do with CPU freq (more like some memory controller setting??), but if you still
want me to try the cpufreq driver thing (which I will be glad to do), then
please tell me exactly what steps to take. Thanks.
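
The before/after frequency check described above can be automated with a small helper. This is only a sketch: `read_khz` is a hypothetical name, the sysfs paths are the ones quoted in this comment, and the function simply returns -1 when a file cannot be read (for example when no cpufreq driver is loaded).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read a frequency in kHz from a cpufreq sysfs file such
 * as /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq.
 * Returns the value, or -1 if the file is missing or does not hold a number. */
long read_khz(const char *path)
{
    FILE *fp;
    long  khz = -1;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &khz) != 1)
        khz = -1;
    fclose(fp);
    return khz;
}
```

Calling it on `scaling_cur_freq` once before suspend and once after resume, and comparing the two values, would confirm whether the CPU clock changed across the suspend.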
Comment 6 Venkatesh Pallipadi 2004-12-23 15:09:13 UTC
You can do
   echo "performance" >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
to run at maximum frequency all the time.
Comment 7 Ron Rechenmacher 2004-12-24 06:37:20 UTC
Thanks for mentioning /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor,
although I hope we all agree that this has nothing to do with the problem.
I did try it so that I could be 100% sure. Again, thanks for mentioning it,
because it was something I missed/forgot and it looks like it will be useful:
preliminary tests suggest I will be able to achieve better power savings.
It's another topic, but I will investigate how thermal throttling affects
power consumption.
Comment 8 Ron Rechenmacher 2004-12-27 15:40:45 UTC
Hi David, Len, and Venkatesh,

I think the problem might be a chipset bug. At least with the version I have:
   Intel Corp. 82855PM Processor to I/O Controller (rev 03)
there seems to be an issue with 333 MHz memory support (it just can't do it?).
I'm hoping you guys from Intel can confirm all this. Based on that alone, I'm
thinking about asking Dell if they can replace my main board with one that has
an 82855PM revision 21h (stepping B1). The decreased performance after suspend
also appears to happen under Windows; this is what leads me to believe it may
just be a chipset bug. Maybe you guys can find a work-around (assuming I'm
correct), but even so, I would like to get 333 MHz performance.
Can you verify the existence of any systems that do maintain memory bandwidth
after suspend?  The results I get under Windows were obtained with
SiSoftware Sandra Lite 2005.1.10.37, and the numbers before and after suspend
are as follows:
                                          before         after
    RAM Bandwidth Int   Buff'd iSSE2:    2280 MB/s      1895 MB/s
    RAM Bandwidth Float Buff'd iSSE2:    2284 MB/s      1893 MB/s
Comment 9 Len Brown 2004-12-27 19:53:07 UTC
The specification update for the 855PM agrees with you
http://support.intel.com/design/chipsets/specupdt/25348802.pdf
that PC2700 DDR 333 support was added at B1.

But that doesn't explain why performance started fast,
and then decreased after resume.

Probably we need to dump out the configuration registers
of the memory controller before and after the suspend/resume
to find out exactly what is going on.

Comment 10 Len Brown 2004-12-27 20:08:13 UTC
please attach (do not paste) the output of "lspci -xxx"
from before and after the suspend.
Comment 11 Ron Rechenmacher 2004-12-27 22:01:27 UTC
Created attachment 4305
lspci -xxx before suspend
Comment 12 Ron Rechenmacher 2004-12-27 22:03:45 UTC
Created attachment 4306
lspci -xxx after suspend

I've attached the output from "lspci -xxx" (before and after suspend).
Prior to my previous post (Additional Comment #8) I did an analysis of the
82855PM registers (which contributed to my "bug" theory):

    register 7c DRC   DRAM Controller Mode Register  Ref. p.71
	   bit 28 changes from 1 to 0	 DRAM Power-down disabled???
	   bits 6:4 change from 2 to 7
		before: All CPU cycles to DRAM result in an all banks
			precharge  command on the DRAM interface. 
		after:	Normal operation.

	    Would running in "All Banks Pre-charge Enable" be better????

     register b8 ATTBASE   Aperture Translation Table Base Register
	    changes from 36460000 to 36ba0000

Is this analysis correct?
And what about posting to some other list to verify 100% that other
82855PM (rev 03) chipsets can maintain performance after suspend?
Comment 13 Shaohua 2004-12-27 22:46:15 UTC
Your analysis is quite right.
I don't know the details of the config register bits, but your problem
is very likely related to them. The OS generally doesn't touch the PCI host
controller's registers; I think that is the BIOS's responsibility. As you said, the
issue also exists under Windows, so I would think it's a BIOS bug. Testing laptops
from other vendors (which should have different BIOSes) would be helpful.
Comment 14 Luming Yu 2004-12-28 01:47:59 UTC
To comment 12:
9c9
< 70: 03 03 00 00 00 00 00 00 00 00 02 2d 71 32 40 30
---
> 70: 03 03 00 00 00 00 00 00 00 00 02 2d 71 37 40 20
bit 0-7:   71 --> 71
bit 8-15:   32 --> 37  
[bit 8-11: 2 -->7. (010: Refresh interval 7.8 us --> 111 Reserved)]
bit 24-31: 30 --> 20

Why is it "bits 6:4 change from 2 to 7"?
Am I wrong?
Comment 15 Ron Rechenmacher 2004-12-28 07:09:05 UTC
Hi Luming,
use:
  setpci -d 8086:3340 7c.l
to see the byteswap (i.e. after suspend):
  # setpci -d 8086:3340 7c.l
  20403771
But maybe there's not really supposed to be a byte swap? (I've done device
work before, but this is my very first experience with these chipset registers)
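
The apparent swap looks like plain little-endian ordering: "lspci -xxx" prints config space one byte at a time in increasing offset, while a ".l" read with setpci shows the four bytes at 7c..7f as one 32-bit value with the byte at 7c least significant. A minimal sketch of that reassembly (le_dword is a made-up helper name, not part of pciutils):

```sh
# le_dword B0 B1 B2 B3: hypothetical helper, not part of pciutils.
# Takes four hex bytes in the order lspci -xxx prints them (lowest
# offset first) and prints the 32-bit value a ".l" read would show.
le_dword() {
    printf '%s%s%s%s\n' "$4" "$3" "$2" "$1"
}

# Bytes at offsets 7c..7f from the two dump lines in comment 14:
le_dword 71 32 40 30    # before suspend
le_dword 71 37 40 20    # after resume
```

The second call reproduces the 20403771 that setpci reports after suspend.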

Hi David,
As an aside, I have tested all the BIOS revisions that exist for my Dell
Precision M60. Yes, I would like to be able to test other 82855PM systems. I
think the best way to do this would be via the internet community. But I am quite a
bit weak at communicating with the internet community and was hoping you guys
would be able to know the most efficient way of doing this; i.e. which news
groups to post to.
Comment 16 Ron Rechenmacher 2004-12-28 08:04:57 UTC
I did a google search and came up with
  http://dev.gentoo.org/~brix/papers/X31/X31.html
and emailed Henrik <brix@gentoo.org> and asked him to do a mem test and
check out this page.
Comment 17 Ron Rechenmacher 2004-12-28 11:57:18 UTC
Here at Fermi, there are a lot of Dell laptops. I just had a colleague test,
under Windows, his Dell Inspiron 600m, which has the 82855PM rev 03, and he
also sees the same 17% performance decrease after resume from suspend (standby).
Comment 18 Shaohua 2004-12-28 17:09:19 UTC
Tests on Dell laptops can't tell us anything; they possibly have the same BIOS.
We need to test laptops from different vendors.
I tested several laptops (including an HP nx5000 and a Toshiba M2, but no 82855PM
based system), and the host bridge's config registers are not changed (or only
the 'status' register changes, which is normal). That's why I suspect it's a BIOS
error. But anyway, let's see more test results.
As for the mailing list, I think acpi-devel@lists.sourceforge.net is OK, and it
would be better with a highlight title.
Comment 19 Ron Rechenmacher 2004-12-28 18:34:54 UTC
OK. Anyway, because the same thing happens under Windows, I don't think it's
an ACPI4Linux problem so I'm kind of between lists as far as begging for help.
And I appreciate your responses in helping me collect the information I have.
David, are you saying that the HP nx5000 did have an 82855PM and just the
Toshiba M2 did not? And are you saying that the HP nx5000 had the same main
memory bandwidth measurement before the 1st suspend as it did after the suspend?
If so, was it an 82855PM rev 03 or rev 21?
... acpi-devel@lists.sourceforge.net; what do you mean by "highlight title"?
I just figured out how to view that list archive via the web.
David,
Could you please post a "highlight title" message there and I will watch for it?
You can mention the "steps to reproduce the (potential) problem":
   Steps to reproduce:
     1a. verify that the chipset is 82855PM and note the rev (03 or 21)
     1b. boot or reboot
     2.  measure main memory bandwidth (prior to the 1st suspend after reboot)
     3.  software suspend (or standby under Windows)
     4.  wake (via power button)
     5.  remeasure main memory bandwidth and (potentially) notice decrease
         in performance. Note that L1/L2 cache performance does not decrease.
You could also reference the (not) bug page:
   http://bugzilla.kernel.org/show_bug.cgi?id=3918
Thanks.
Comment 20 Shaohua 2004-12-28 18:41:49 UTC
No, I don't have a laptop with an 855PM. By "highlight" I just meant something to
get people's attention :). I would like to add one line to your list:
compare 'lspci -xxx' output from before/after suspend/resume.
Comment 21 Ron Rechenmacher 2004-12-28 19:10:15 UTC
OK, since highlight is just getting people's attention, then I went ahead and
attempted an email to acpi-devel@lists.sourceforge.net and included the request
for lspci -xxx info.
Comment 22 Luming Yu 2004-12-28 19:27:30 UTC
Please use setpci to restore the original value after S3 resume, and re-test
memory bandwidth.
Comment 23 Shaohua 2004-12-28 19:32:39 UTC
>Please use setpci to restore the original value after S3 resume,
Changing the host controller's config registers (or even reading some of them)
at runtime can possibly have a severe impact. Please don't do it unless you really
know what you are doing.
Comment 24 Luming Yu 2004-12-28 19:49:01 UTC
I just want to see the result of restoring the original refresh interval, which is
 {bit 8-15:   32 --> 37
 [bit 8-11: 2 -->7. (010: Refresh interval 7.8 us --> 111 Reserved)]}
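
To double-check which bits actually differ, the two full register values can be compared directly. A minimal sketch using only shell arithmetic, where field is an illustrative helper name; the values are the 7c.l reading after resume from comment 15 and the before-suspend bytes 71 32 40 30 from comment 14 read as one 32-bit value:

```sh
# field VALUE HI LO: hypothetical helper that prints bits HI..LO of a
# 32-bit hex VALUE (given without 0x) as a decimal number.
field() {
    echo $(( (0x$1 >> $3) & ((1 << ($2 - $3 + 1)) - 1) ))
}

before=30403271   # register 7c before the first suspend
after=20403771    # register 7c after S3 resume

# XOR marks every bit that differs between the two values.
printf 'changed bits: %08x\n' $(( 0x$before ^ 0x$after ))

for v in "$before" "$after"; do
    echo "$v: bits 6:4=$(field "$v" 6 4) bits 10:8=$(field "$v" 10 8) bit 28=$(field "$v" 28 28)"
done
```

The XOR comes out as 10000500, so only bit 28 and bits 10 and 8 change; bits 6:4 read 7 in both values, and bit 11 is 0 in both, so "bit 8-11" and "bits 10:8" give the same 2 --> 7.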



Comment 25 Ron Rechenmacher 2004-12-28 21:04:24 UTC
Putting register 7c back to the pre suspend value solves the problem.
I went ahead and wrote that whole register:
    # Read the 32-bit register at offset 7c of the 855PM host bridge.
    xx=$(setpci -d 8086:3340 7c.l)
    if [ "$xx" = 20403771 ]; then
        echo "20403771 is the value of register 7c that makes main mem slow"
        echo "main mem is faster with the reboot value of 30403271"
        #setpci -d 8086:3340 7c.l=20403771 # slow
        # Write back the value seen after a fresh boot.
        setpci -d 8086:3340 7c.l=30403271 # fast
    fi

(confession: I was still thinking it was bits 4-6, then went over the bits again
and finally saw that it is bits 8-11 as it appears in "7c.l=30403271 # fast")
Thanks for pushing me to try this.
So this (non)bug is closed!
Thanks to all.
Comment 26 Shaohua 2004-12-28 21:14:17 UTC
This is the expected result. What we wanted to know is whether it's a BIOS bug, and
now we know it is. Yes, the OS can work around this issue, but I don't think we can
make a generic solution: saving/restoring the host bridge's config space is
dangerous. I would suggest you report it to Dell as a BIOS bug.
Comment 27 Ron Rechenmacher 2004-12-28 21:19:15 UTC
I will do so.
Thanks
Comment 28 Matt Domsch 2005-01-04 11:43:06 UTC
Ron, can you post your BIOS version number to this issue please?
Thanks,
Matt
Comment 29 Ron Rechenmacher 2005-01-04 11:52:41 UTC
Hi Matt,

A07. I also reloaded/tried A03 through A06 and possibly A01 and A02 also.
When I was trying the older revs, it was all the same day (several days ago) and
then I went back to A07. I checked today and A07 is the latest.

Thanks,
Ron
Comment 30 Matt Domsch 2005-01-10 14:36:37 UTC
Per Dell BIOS team, this has been fixed internally already.   New BIOSes for
each affected platform will release on support.dell.com "soon" which include the
fix.  The Inspiron 600m BIOS A15 will be the first release with this fixed in BIOS.
Thanks,
Matt
