Bug 7760 - page allocation failure on ixp4xx (nslu2) with 128MB RAM
Status: CLOSED OBSOLETE
Product: Platform Specific/Hardware
Classification: Unclassified
Component: ARM
Hardware: i386 Linux
Importance: P2 normal
Assigned To: Russell King
Reported: 2007-01-02 06:46 UTC by Stephan
Modified: 2012-05-14 16:58 UTC (History)
Kernel Version: 2.6.18-3
Tree: Mainline
Regression: No


Description Stephan 2007-01-02 06:46:37 UTC
This bug did not occur on the same hardware with the kernel command line
mem=32MB@0x0 or mem=64MB@0x0. It seems to occur on previous kernel versions
too, but without these error messages.

Distribution: 
Debian Etch installed with "Debian/NSLU2 Etch RC1"

Hardware Environment: 
Linksys NSLU2 with extended memory to 128MB

NSLU2:~# cat /proc/cpuinfo
Processor       : XScale-IXP42x Family rev 1 (v5l)
BogoMIPS        : 266.24
Features        : swp half fastmult edsp
CPU implementer : 0x69
CPU architecture: 5TE
CPU variant     : 0x0
CPU part        : 0x41f
CPU revision    : 1
Cache type      : undefined 5
Cache clean     : undefined 5
Cache lockdown  : undefined 5
Cache format    : Harvard
I size          : 32768
I assoc         : 32
I line length   : 32
I sets          : 32
D size          : 32768
D assoc         : 32
D line length   : 32
D sets          : 32

Hardware        : Linksys NSLU2
Revision        : 0000
Serial          : 0000000000000000

NSLU2:~# cat /proc/meminfo
MemTotal:       127456 kB
MemFree:        104980 kB
Buffers:          3420 kB
Cached:          11180 kB
SwapCached:          0 kB
Active:          12260 kB
Inactive:         4792 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       127456 kB
LowFree:        104980 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:              28 kB
Writeback:           0 kB
AnonPages:        2472 kB
Mapped:           2816 kB
Slab:             2824 kB
PageTables:        216 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:     63728 kB
Committed_AS:     6324 kB
VmallocTotal:   892928 kB
VmallocUsed:     16840 kB
VmallocChunk:   868348 kB

Software Environment:
NSLU2:~# uname -a
Linux NSLU2 2.6.18-3-ixp4xx #1 Tue Dec 5 16:52:07 UTC 2006 armv5tel GNU/Linux

Problem Description:

When using 128MB of memory on two hardware-modified Linksys NSLU2 units,
something seems wrong with memory management or DMA. While transferring large
amounts of data from/to a USB HDD, the system crashes with:

********************************************************************************************
usb-storage: page allocation failure. order:2, mode:0x21
Mem-info:
DMA per-cpu:
cpu 0 hot: high 18, batch 3 used:2
cpu 0 cold: high 6, batch 1 used:0
DMA32 per-cpu:
cpu 0 hot: high 18, batch 3 used:0
cpu 0 cold: high 6, batch 1 used:0
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:        1672kB (0kB HighMem)
Active:1730 inactive:28024 dirty:2901 writeback:0 unstable:0 free:418 slab:1022
mapped:873 pagetables:63
DMA free:948kB min:724kB low:904kB high:1084kB active:12kB inactive:61136kB
present:65536kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 64 64 64
DMA32 free:724kB min:724kB low:904kB high:1084kB active:6908kB inactive:50960kB
present:65536kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 108*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 0*4096kB = 948kB
DMA32: 1*4kB 0*8kB 5*16kB 0*32kB 2*64kB 0*128kB 0*256kB 1*512kB 0*1024kB
0*2048kB 0*4096kB = 724kB
Normal: empty
HighMem: empty
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap  = 0kB
Total swap = 0kB
Free swap:            0kB
32768 pages of RAM
515 free pages
971 reserved pages
1022 slab pages
17164 pages shared
0 pages swap cached
ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=16384)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer c75e4000!
usb-storage: page allocation failure. order:2, mode:0x21
Mem-info:
DMA per-cpu:
cpu 0 hot: high 18, batch 3 used:2
cpu 0 cold: high 6, batch 1 used:0
DMA32 per-cpu:
cpu 0 hot: high 18, batch 3 used:0
cpu 0 cold: high 6, batch 1 used:0
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:        1672kB (0kB HighMem)
Active:1730 inactive:28024 dirty:2901 writeback:0 unstable:0 free:418 slab:1022
mapped:873 pagetables:63
DMA free:948kB min:724kB low:904kB high:1084kB active:12kB inactive:61136kB
present:65536kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 64 64 64
DMA32 free:724kB min:724kB low:904kB high:1084kB active:6908kB inactive:50960kB
present:65536kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 108*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 0*4096kB = 948kB
DMA32: 1*4kB 0*8kB 5*16kB 0*32kB 2*64kB 0*128kB 0*256kB 1*512kB 0*1024kB
0*2048kB 0*4096kB = 724kB
Normal: empty
HighMem: empty
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap  = 0kB
Total swap = 0kB
Free swap:            0kB
32768 pages of RAM
515 free pages
971 reserved pages
1022 slab pages
17164 pages shared
0 pages swap cached
ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=16384)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer c7618000!
********************************************************************************************
Comment 1 Russell King 2007-01-02 11:11:24 UTC
The VM decided you were out of 16K pages in the DMA zone and refused to allocate
you the page.

This is one of the hazards of using the DMA bounce code - if you have large
allocations, the VM can choose to refuse to allocate you memory, especially
when you ask for it in atomic contexts.

You're probably seeing this because 128MB could be sufficiently large that
you're starting to use the bounce buffers; since I don't know NSLU2 hardware
(or even IXP hardware) I couldn't say for certain though.
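
For anyone reading the Mem-info dump against this explanation, here is a rough Python sketch (not kernel code) that sums the DMA free-list line from the first failure report and counts the blocks large enough for an order:2 request. The helper name parse_buddy_line is made up for illustration, and a 4 kB page size is assumed, so order:2 means one contiguous 16 kB block.

```python
import re


def parse_buddy_line(line):
    """Return {block_size_kb: count} from a 'COUNT*SIZEkB ...' free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


# DMA zone free list copied from the first failure report in this bug.
dma_line = ("DMA: 1*4kB 108*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB "
            "0*512kB 0*1024kB 0*2048kB 0*4096kB = 948kB")

blocks = parse_buddy_line(dma_line)

# Total free memory in the zone, which should match the '= 948kB' suffix.
total_kb = sum(size * count for size, count in blocks.items())

# Blocks that could hold an order:2 request (16 kB with 4 kB pages, assumed).
usable_blocks = sum(count for size, count in blocks.items() if size >= 16)

# Free memory that is too fragmented to help such a request.
too_small_kb = sum(size * count for size, count in blocks.items() if size < 16)

print(total_kb, usable_blocks, too_small_kb)
```

With the logged line this prints 948 2 868: of the 948 kB free in the DMA zone, 868 kB sits in blocks smaller than 16 kB, and only two blocks could satisfy the request at all. The allocator also holds back reserves, which may be why even those were refused for an atomic request.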
Comment 2 Rob Lockhart 2007-04-16 20:47:32 UTC
Please see http://bugzilla.kernel.org/show_bug.cgi?id=7760 regarding
>64MB of RAM, "page allocation failure" and DMA_BOUNCE failures.  Since
only USB is on the PCI bus, any PCI DMA problem will cause corruption on
USB devices (and thus of the rootfs if it is located on the USB bus).

I have an IXP420 with 256MiB SDRAM and USB EHCI on the PCI bus.  Using
kernel 2.6.18 and 2.6.20,  I have serial console and ssh login.  The
root fs is mounted to a USB device /dev/sda1.

After the error message below occurs, the USB driver (USB EHCI on the
PCI bus) gets a page allocation error if it was the USB driver (i.e.,
usb_storage) that requested the memory.

oom-killer: gfp_mask=0x200d2, order=0
memtest: page allocation failure. order:0, mode:0x200d2
ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=8192)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer cff04000!
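
The mode/gfp_mask values in these messages can be decoded bit by bit. The sketch below is only an illustration: the flag values are assumed from 2.6.18-era include/linux/gfp.h and should be checked against the tree actually in use, since other kernel versions may number them differently. decode_gfp is a made-up helper name.

```python
# Flag bits assumed from 2.6.18-era include/linux/gfp.h (verify in your tree).
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x20000: "__GFP_HARDWALL",
}


def decode_gfp(mask):
    """Return the names of the known flag bits set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mask & bit]
    known = 0
    for bit in GFP_BITS:
        known |= bit
    leftover = mask & ~known
    if leftover:
        # Bits this table does not cover are reported as raw hex.
        names.append(hex(leftover))
    return names


for mask in (0x21, 0x200d2, 0x201d2):
    print(hex(mask), " | ".join(decode_gfp(mask)))
```

Under those assumed values, mode:0x21 from the original report has __GFP_DMA and __GFP_HIGH set but not __GFP_WAIT, so the caller cannot sleep and wants DMA-zone memory, while 0x200d2 looks like an ordinary user-page allocation that is allowed to wait.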

When doing a "memtest 256m 1", I get a dump of the messages above to
the (serial) console, plus similar messages with a different buffer
address.  I have also been able to get similar messages by running any
program that attempts to use nearly all the available memory (e.g.,
rsync with a sufficiently large database).  Note that this is not a
solder problem or hardware problem, as I have been able to successfully
test >95% of the memory.

I have pasted the complete process here, including all debugging I
could imagine:
http://pastebin.ca/440753 -> debian 2.6.18 w/apex 1.4.18 (from debian armle)
http://pastebin.ca/442296 -> slugosle 2.6.20 w/apex 1.4.18 (generic
kernel from sources)

Please let me know what I can do to help debug this problem.  I am
willing to try with the 2.6.21 kernel.  Note that I have been able to
reliably get it into this state by using the following process:

1) Run "free"; hopefully not many buffers are in use.  If they are, you
may have to do this more than once.

2) Convert that number to MB by dividing by 1024.  Round the number
down and call it X.  Then:
  memtester X  (if slugosle generic kernel)
  memtest X 1  (if debian)

3) Note that if you do the same test with X-1 instead of X, the
problem doesn't occur, ever (thus verifying the hardware stability).

4) Note also that if the test passes, perhaps the cache was flushed,
so go back to 1) and confirm free space, and then re-run 2) as
appropriate.
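
The arithmetic in step 2 can be sketched as follows. This is only an illustration: it assumes "that number" means the free column of "free" in kB, the sample figure of 162088 kB is just example input, and memtest_size_mb is a made-up helper name.

```python
def memtest_size_mb(free_kb):
    """Step 2: convert a kB figure to whole MB, rounding down."""
    return free_kb // 1024


free_kb = 162088  # example input only; read the real value from "free"
x = memtest_size_mb(free_kb)

# Step 2 uses X; step 3 repeats the run with X-1 as the control case.
print(f"memtester {x}")
print(f"memtester {x - 1}")
```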


Here is /proc/iomem (for addresses of devices):
NSLU2:/proc# cat iomem
00000000-0fffffff : System RAM
 0001f000-0020636f : Kernel text
 00208000-002842a7 : Kernel data
48000000-4bffffff : PCI Memory Space
 48000000-48000fff : 0000:00:01.0
   48000000-48000fff : ohci_hcd
 48001000-48001fff : 0000:00:01.1
   48001000-48001fff : ohci_hcd
 48002000-480020ff : 0000:00:01.2
   48002000-480020ff : ehci_hcd
50000000-50ffffff : IXP4XX-Flash.0
 50000000-50ffffff : IXP4XXFlash
60000000-60003fff : ixp4xx_qmgr.0
 60000000-60003fff : ixp_qmgr
c8000000-c8000fff : serial8250.0
 c8000000-c800001f : serial
c8001000-c8001fff : serial8250.0
 c8001000-c800101f : serial
c8007000-c8007fff : ixp4xx_npe.1
 c8007000-c8007fff : NPE-B
c8008000-c8008fff : ixp4xx_npe.2
 c8008000-c8008fff : NPE-C
c8009000-c80091ff : ixp4xx_mac.0
 c8009000-c80091ff : ixp4xx_mac

Here is the complete dump of the error message (from serial console):
NSLU2:/proc# oom-killer: gfp_mask=0x201d2, order=0
Mem-info:
DMA per-cpu:
cpu 0 hot: high 18, batch 3 used:0
cpu 0 cold: high 6, batch 1 used:5
DMA32 per-cpu:
cpu 0 hot: high 90, batch 15 used:0
cpu 0 cold: high 30, batch 7 used:6
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages:        1008kB (0kB HighMem)
Active:33222 inactive:28814 dirty:0 writeback:0 unstable:0 free:252
slab:857 mapped:7 pagetables:216
DMA free:1008kB min:512kB low:640kB high:768kB active:29984kB
inactive:31572kB present:65536kB pages_scanned:137512
all_unreclaimable? yes
lowmem_reserve[]: 0 192 192 192
DMA32 free:0kB min:1536kB low:1920kB high:2304kB active:102904kB
inactive:83684kB present:196608kB pages_scanned:271001
all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB
inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB
0*2048kB 0*4096kB = 1008kB
DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 0kB
Normal: empty
HighMem: empty
Swap cache: add 187052, delete 125513, find 300/490, race 0+0
Free swap  = 5336kB
Total swap = 257024kB
Free swap:         5336kB
65536 pages of RAM
406 free pages
1236 reserved pages
857 slab pages
60 pages shared
61539 pages swap cached

other related links:
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2003-June/015758.html
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2004-March/020737.html
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2005-January/026346.html
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2006-June/034900.html
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2007-January/037844.html
http://thread.gmane.org/gmane.comp.misc.nslu2.devel/1486/focus=1488
Comment 4 Russell King 2007-04-26 13:52:10 UTC
Given that Rob's attempts to get some reaction on the ARM mailing lists
were unfruitful, I'm not sure where we go from here.

The DMA bounce code is a _hack_ around the problem - the API was never designed
in the first place to have bounces happening at this point.  It was designed to
map the passed buffer to an address visible on the bus, be that via an IOMMU or
via a straight translation.

The handling of "can this buffer be visible to the device for DMA" is supposed
to be handled by whoever allocates the buffer, and that means having the right
DMA mask set for the device.  Not sure how to deal with that when it's a
property of the upstream busses rather than the device itself; Linux's DMA
masks are based around capabilities of the device.

And, as I've said in comment #2, I'm not certain of the issues with PCI on
IXP4xx, so having a bug assigned to me in this bugzilla is utterly useless
(and I don't think there's anyone else in the ARM kernel community on this
bugzilla I could assign it to.)

What we need are people who remain responsible for platforms, rather than
dumping code into the patch system and then running away from responsibility
for that code.  Unfortunately, people with that attribute are few and far
between in the ARM kernel community.

Sorry.
Comment 5 Rob Lockhart 2007-04-28 05:59:01 UTC
The original poster (Stephan) suggested ".. something seems wrong with
memory-management or DMA".  It seems, at least to me, that memory is being
requested when none is available, as the oom-killer is being called.  Could the
DMA error (also the PCI error) be an artifact of the VM subsystem allocation
failure?

Hideo Aoki has a patch for OVERCOMMIT_GUESS:
"Patch: mm: An enhancement of OVERCOMMIT_GUESS"
http://lwn.net/Articles/178850/
http://marc.info/?l=linux-kernel&m=112993489022427&w=2

The difference I see is in this line:

linux-2.6.21-rc6-arm/mm/page_alloc.c:
 zone->pages_high  = zone->pages_min + (tmp >> 1);

whereas the patches above seem to use a different parameter here.
I don't claim to understand the nuances of these implementations but I suggest
it as it might be the culprit.
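
For context, the quoted pages_high line appears to be one of three per-zone watermark assignments (min, low, high). The Python sketch below models them under stated assumptions: the formulas are recalled from setup_per_zone_pages_min() in 2.6-era mm/page_alloc.c and should be verified against the source, pages are assumed to be 4 kB, and min_free_kbytes = 2048 is an assumed value picked because it reproduces the watermarks logged in the 256MiB dump earlier in this bug.

```python
PAGE_KB = 4  # assumed page size in kB


def zone_watermarks(min_free_kbytes, zone_present_kb, lowmem_present_kb):
    """Return (min, low, high) watermarks in kB for one zone.

    Model only: each zone gets a share of min_free_kbytes proportional
    to its size, then low and high add a quarter and a half of that share.
    """
    pages_min = min_free_kbytes // PAGE_KB
    zone_pages = zone_present_kb // PAGE_KB
    lowmem_pages = lowmem_present_kb // PAGE_KB
    tmp = pages_min * zone_pages // lowmem_pages
    low = tmp + (tmp >> 2)
    high = tmp + (tmp >> 1)  # the line quoted above
    return tmp * PAGE_KB, low * PAGE_KB, high * PAGE_KB


# Zone sizes from the 256MiB dump: DMA present:65536kB, DMA32 present:196608kB.
lowmem_kb = 65536 + 196608
print(zone_watermarks(2048, 65536, lowmem_kb))
print(zone_watermarks(2048, 196608, lowmem_kb))
```

Under these assumptions the model yields min/low/high of 512/640/768 kB and 1536/1920/2304 kB, the same figures shown for the DMA and DMA32 zones in that dump.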

Can someone comment as to whether this seems plausible?  I am by far an expert
in the linux vm subsystem.
Comment 6 Rob Lockhart 2007-04-28 08:45:11 UTC
s/I am by far an expert in the linux vm subsystem/I am by far *NOT* an expert
in the linux vm subsystem/
Comment 7 Rod Whitby 2007-07-27 08:26:59 UTC
Here is my theory - please tell me if I'm way off-base or not:

In alloc_safe_buffer in dmabounce.c, the code looks at the requested size.  If it is <= the small size (2048 in the ixp4xx case), it uses the pre-allocated small pool.  If it is <= the large size (4096 for ixp4xx), it uses the pre-allocated large pool.  If it is bigger than 4096 (which is the case in the failing situations reported), it just does a dma_alloc_coherent (which does not use pre-allocated memory).

So in the case where we are using memtester to trigger this bug, memtester would have already requested all available memory from the kernel, and then something wants to read or write to USB.

That read or write triggers an allocation of >4K, and the request cannot be satisfied, which invokes the OOM killer and kills the USB and therefore the rootfs, which is mounted from the USB disk.
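
The size check described in this theory can be modelled in a few lines. This is a Python sketch of the decision as described here, not the code from dmabounce.c; pick_bounce_source is a made-up name, and the defaults 2048 and 4096 are the ixp4xx pool sizes mentioned above.

```python
def pick_bounce_source(size, small=2048, large=4096):
    """Model of where a bounce buffer of `size` bytes would come from."""
    if size <= small:
        return "small pool"
    if size <= large:
        return "large pool"
    # Anything bigger falls through to a fresh allocation at request time.
    return "dma_alloc_coherent"


# 8192 and 16384 are the sizes in the alloc_safe_buffer messages in this bug.
for size in (512, 4096, 8192, 16384):
    print(size, pick_bounce_source(size))
```

In this model both logged sizes land on the dma_alloc_coherent path, which is the path the theory says fails once memory is exhausted.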

Is that plausible?
Comment 8 jean jacques goessens 2008-01-26 15:20:46 UTC
In my opinion, it is something like this:

When trying to allocate memory, the system finds it is running too low, so it decides to swap to disk. The problem is that swapping to disk triggers USB transfers, and USB requests memory for DMA, so it enters a kind of loop.

Is this the problem?

If so, possible solutions are:
- increasing the minimum free memory?
- switching USB to non-DMA transfers if memory is too low?

Can someone more expert than me give an opinion?
Comment 9 Rob Lockhart 2008-01-28 23:10:22 UTC
I think that a workaround might have been discovered by Stephen Miller.  It is
in the testing stages right now, but so far I haven't been able to reproduce
the aforementioned problem.  For the record, I'm using an NSLU2 with 256MB in
two banks of 2 @ x16 512Mb memory.  And it is running Debian BE:
Linux NSLU2 2.6.18-5-ixp4xx #1 Sun Dec 23 05:17:39 UTC 2007 armv5tel

I am using Apex v1.4.7 as 2nd stage boot loader, and specify this:
setenv startup sdram-init; memscan -u 0+256m; copy -s fis://kernel 0x00008000;
copy -s fis://ramdisk 0x01000000; wait 20 Type ^C key to cancel autoboot.; boot

I then put this in the beginning of /etc/rcS.d/S01glibc.sh (just below first
line):

echo 100 >/proc/sys/vm/swappiness
echo 8192 >/proc/sys/vm/min_free_kbytes

This is not the greatest idea but it is just a test.  There seems to be some
influence of "/etc/sysctl.conf" in these values but I think it is too late in
the bootup process.  I.e., if booting uses too much memory w/o the values being
changed, then the kernel will hang as the values haven't been changed when it's
running out of memory.

file:  /etc/sysctl.conf
vm.overcommit_memory=2
vm.overcommit_ratio=80
vm.min_free_kbytes=10240

Note that this seems to be called from within S30procps.sh and this occurs
AFTER S30checkfs.sh so it's too late.
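
As background for the overcommit settings, the CommitLimit line in /proc/meminfo is usually described as swap plus overcommit_ratio percent of RAM, and it is only enforced when vm.overcommit_memory=2. The sketch below models that in Python under the assumption of 4 kB pages; commit_limit_kb is a made-up helper name. The first call uses the MemTotal and SwapTotal from the original report, where the default ratio of 50 was in effect.

```python
PAGE_KB = 4  # assumed page size in kB


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio):
    """Model of CommitLimit: RAM pages * ratio / 100 + swap pages, in kB."""
    ram_pages = mem_total_kb // PAGE_KB
    swap_pages = swap_total_kb // PAGE_KB
    return (ram_pages * overcommit_ratio // 100 + swap_pages) * PAGE_KB


# MemTotal 127456 kB, SwapTotal 0 kB, ratio 50: the original report's values.
print(commit_limit_kb(127456, 0, 50))

# Same machine with vm.overcommit_ratio=80, as in the sysctl.conf above.
print(commit_limit_kb(127456, 0, 80))
```

The first result, 63728 kB, matches the CommitLimit shown in the original /proc/meminfo output; the second, 101964 kB, is what the model gives for a ratio of 80 on that machine.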

I have posted an updated log here:  http://pastebin.ca/876984
Note that before I moved the commands, the FatSlug would hang when doing the
180-day file system check.  Now it does not, even when the available memory has
been exceeded (forced to use swap).

PLEASE NOTE that I have effectively commented out the first two of the three parameters above in /etc/sysctl.conf, and the values for my NSLU2 are currently set to:

/proc/sys/vm/overcommit_memory -> 0
/proc/sys/vm/overcommit_ratio -> 50
/proc/sys/vm/min_free_kbytes -> 10240

Comments?
Comment 10 Rob Lockhart 2008-02-02 18:16:42 UTC
I hereby retract my optimistic statement regarding a possible work-around.  Indeed, the lockup still occurs when transferring large amounts of data, during which the DMA (PCI->USB->IDE) process seems to hang.  Going back to 64MiB seems to alleviate the condition.

Comment 11 jean jacques goessens 2008-02-04 09:45:29 UTC
I agree.

I also made similar modifications. Here is the script:

#! /bin/sh
echo '#! /bin/sh' > /etc/init.d/lowmem
echo '# jean jacques.goessens - created automatically' >> /etc/init.d/lowmem
echo 'echo 4096 > /proc/sys/vm/min_free_kbytes' >> /etc/init.d/lowmem
chmod +x /etc/init.d/lowmem
ln -s /etc/init.d/lowmem /etc/rc1.d/S10lowmem
ln -s /etc/init.d/lowmem /etc/rc2.d/S10lowmem
ln -s /etc/init.d/lowmem /etc/rc3.d/S10lowmem
ln -s /etc/init.d/lowmem /etc/rc4.d/S10lowmem
ln -s /etc/init.d/lowmem /etc/rc5.d/S10lowmem

This script creates this other script:

#! /bin/sh
# jean jacques.goessens - created automatically
echo 4096 > /proc/sys/vm/min_free_kbytes

and links it into the rc directories.

Now my 128M slug works fine, but I have never tried an initial fsck.

I confirm that my slug was hanging every time it performed fsck at startup. Since making the modification, I have not checked this yet.

Anyway, maybe linking to S01lowmem could make it load before S30checkfs, but I have not checked.


Would the best solution be to modify dma_bounce and recompile the kernel?

But it is too early for me to do that.

What is the purpose of "swappiness"?

JJ
Comment 12 Levente Nagy 2008-05-19 07:00:35 UTC
I use slugOS-4.8 beta (kernel 2.6.21.7), and changes made to /proc/sys/vm/min_free_kbytes don't help: when formatting a 100 GB HDD with ext3, my slug with 256 MB of RAM hangs.

I did the following slug-specific workaround, which solves the problem:
I've increased the DMA pool block sizes, so there is no need to call dma_alloc_coherent() because preallocated DMA buffers are always available.
I've changed the line
dmabounce_register_dev(dev, 2048, 4096);
to
dmabounce_register_dev(dev, 16384, 131072);
in the file
/linux-2.6.21/arch/arm/mach-ixp4xx/common-pci.c
in the function
static int ixp4xx_pci_platform_notify(struct device *dev) {}

My slug has been running with this patch for a month without any problem.
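
To illustrate the effect of the changed pool sizes, the Python sketch below models the size check (small pool, large pool, otherwise dma_alloc_coherent) for the two registrations. It is a model of the behaviour described in this bug, not the kernel code, and bounce_source is a made-up name. The sizes tested are the ones from the alloc_safe_buffer messages reported earlier.

```python
def bounce_source(size, small_pool, large_pool):
    """Model of which allocator serves a bounce buffer of `size` bytes."""
    if size <= small_pool:
        return "small pool"
    if size <= large_pool:
        return "large pool"
    return "dma_alloc_coherent"


observed_sizes = (8192, 16384)  # from the alloc_safe_buffer error messages

# Original registration first, then the patched one.
for small, large in ((2048, 4096), (16384, 131072)):
    for size in observed_sizes:
        source = bounce_source(size, small, large)
        print(f"pools {small}/{large}: {size} bytes -> {source}")
```

In this model the patched sizes keep both logged request sizes inside the pools, so the dma_alloc_coherent path is only reached for requests above 131072 bytes.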

Levente
Comment 13 Rob Lockhart 2008-06-22 21:41:09 UTC
I created SlugOS-BE 4.10-alpha for an NSLU2 with 256MB, including Apex 1.5.14 (for 256MB memory detection support).  The kernel version here is 2.6.24.7; I couldn't change the field above.  I added the patch mentioned in Comment #12 to my custom build, which was built per these instructions:

http://www.nslu2-linux.org/wiki/HowTo/CrossCompileWithCentOS

Then I ran "ipkg install memtester", then "memtester 128m".  That performed just fine.  However, on running "memtester 256m" I got the dreaded OOM bug (console frozen).  Output is below.

I humbly suggest that "memtester" (if testing in the SlugOS environment) or "memtest" (if testing in the Debian environment) be used to verify a kernel fix.

Note that the aforementioned settings were not set (those in /proc/sys/vm/) but were left at defaults, which were:
/proc/sys/vm/overcommit_memory =    0
/proc/sys/vm/overcommit_ratio  =   50
/proc/sys/vm/min_free_kbytes   = 2039
/proc/sys/vm/swappiness        =   60

The first test below was with defaults above.

root@NSLU2:~# memtester 64m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 64MB (67108864 bytes)
got  64MB (67108864 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : testing   6
root@NSLU2:~# free
              total         used         free       shared      buffers
  Mem:       257416        95328       162088            0         4868
 Swap:       145144            0       145144
Total:       402560        95328       307232
root@NSLU2:~# memtester 180m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 180MB (188743680 bytes)
got  180MB (188743680 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : testing   1
root@NSLU2:~# free
              total         used         free       shared      buffers
  Mem:       257416        68956       188460            0         4664
 Swap:       145144            0       145144
Total:       402560        68956       333604
root@NSLU2:~# memtester 250m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 250MB (262144000 bytes)
got  250MB (262144000 bytes), trying mlock ...

memtester invoked oom-killer: gfp_mask=0x1200d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00000fb3 r6:cfc5f3e0 r5:c025af44 r4:cfc5f3e0
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c006b5c8>]
Function entered at [<c006b470>] from [<c006bb38>]
Function entered at [<c006ba18>] from [<c006d168>]
Function entered at [<c006d0dc>] from [<c006db50>]
 r8:cfddcce0 r7:40129000 r6:cff65f6c r5:cfec9e48 r4:ffff05ff
Function entered at [<c006da44>] from [<c006dcb8>]
Function entered at [<c006dc04>] from [<c006dda4>]
 r6:40129000 r5:fffffff4 r4:0fa01000
Function entered at [<c006dd04>] from [<c001fde0>]
 r6:0fa00000 r5:0fa00000 r4:0fa00008
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   2   Cold: hi:    6, btch:   1 usd:   0
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  76   Cold: hi:   30, btch:   7 usd:  28
Active:31216 inactive:30842 dirty:0 writeback:0 unstable:0
 free:686 slab:485 mapped:0 pagetables:175 bounce:0
DMA free:1268kB min:508kB low:632kB high:760kB active:29756kB inactive:29864kB present:65024kB pages_scanned:99747 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:1476kB min:1524kB low:1904kB high:2284kB active:95108kB inactive:93504kB present:195072kB pages_scanned:310638 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1268kB
Normal: 3*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1476kB
Swap cache: add 36286, delete 182, find 0/0, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
873 free pages
1694 reserved pages
485 slab pages
96 pages shared
36104 pages swap cached
Out of memory: kill process 3980 (memtester) score 4019 or a child
Killed process 3980 (memtester)

root@NSLU2:~# memtester 256m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 256MB (268435456 bytes)
got  256MB (268435456 bytes), trying mlock ...memtester invoked oom-killer: gfp_mask=0x1200d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00001013 r6:cfc1e060 r5:c025af44 r4:cfc1e060
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c006b5c8>]
Function entered at [<c006b470>] from [<c006bb38>]
Function entered at [<c006ba18>] from [<c006d168>]
Function entered at [<c006d0dc>] from [<c006db50>]
 r8:cfe60a40 r7:40129000 r6:cfdbdf6c r5:cfed6da0 r4:fffeffff
Function entered at [<c006da44>] from [<c006dcb8>]
Function entered at [<c006dc04>] from [<c006dda4>]
 r6:40129000 r5:fffffff4 r4:10001000
Function entered at [<c006dd04>] from [<c001fde0>]
 r6:10000000 r5:10000000 r4:10000008
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   2   Cold: hi:    6, btch:   1 usd:   0
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  23   Cold: hi:   30, btch:   7 usd:   9
Active:31348 inactive:30830 dirty:0 writeback:0 unstable:0
 free:693 slab:469 mapped:0 pagetables:175 bounce:0
DMA free:1268kB min:508kB low:632kB high:760kB active:29888kB inactive:29732kB present:65024kB pages_scanned:99010 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:1504kB min:1524kB low:1904kB high:2284kB active:95504kB inactive:93588kB present:195072kB pages_scanned:320579 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1268kB
Normal: 8*4kB 2*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1504kB
Swap cache: add 72435, delete 36771, find 15/23, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
808 free pages
1694 reserved pages
469 slab pages
11 pages shared
35664 pages swap cached
Out of memory: kill process 3981 (memtester) score 4115 or a child
Killed process 3981 (memtester)
Killed
root@NSLU2:~# memtester 250m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 250MB (262144000 bytes)
got  250MB (262144000 bytes), trying mlock ...ntpd invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00000fb3 r6:cff2d480 r5:c025af44 r4:cff2d480
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c0065234>]
Function entered at [<c0065168>] from [<c00657d0>]
Function entered at [<c0065764>] from [<c0060074>]
 r7:cff59540 r6:00000000 r5:00000fff r4:00000000
Function entered at [<c005ff0c>] from [<c006a78c>]
Function entered at [<c006a720>] from [<c006b708>]
Function entered at [<c006b470>] from [<c0025cd0>]
Function entered at [<c0025bdc>] from [<c0025eb4>]
Function entered at [<c0025e94>] from [<c001f1ac>]
 r5:00000000 r4:ffffffff
Function entered at [<c001f194>] from [<c001fd60>]
Exception stack(0xcfd9bfb0 to 0xcfd9bff8)
bfa0:                                     ffffffff 00000004 00000010 00000000
bfc0: ffffffff 00000000 be99acf8 be99ad78 00000000 00000000 401cb000 00000000
bfe0: 000520fc be99acf4 00010ce4 400ddfe0 60000010 ffffffff
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   4   Cold: hi:    6, btch:   1 usd:   0
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  33   Cold: hi:   30, btch:   7 usd:   6
Active:30780 inactive:31366 dirty:0 writeback:0 unstable:0
 free:695 slab:465 mapped:0 pagetables:175 bounce:0
DMA free:1268kB min:508kB low:632kB high:760kB active:29264kB inactive:30344kB present:65024kB pages_scanned:99352 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:1512kB min:1524kB low:1904kB high:2284kB active:93856kB inactive:95120kB present:195072kB pages_scanned:303405 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1268kB
Normal: 8*4kB 3*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1512kB
Swap cache: add 108389, delete 72731, find 92/141, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
819 free pages
1694 reserved pages
465 slab pages
12 pages shared
35658 pages swap cached
Out of memory: kill process 3982 (memtester) score 4019 or a child
Killed process 3982 (memtester)
Killed
root@NSLU2:~# ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=32768)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer cf448000!

Next, I changed the defaults under /proc/sys/vm as follows:

root@NSLU2:~# echo 100 >/proc/sys/vm/swappiness
root@NSLU2:~# echo 10240 >/proc/sys/vm/min_free_kbytes
root@NSLU2:~# free
              total         used         free       shared      buffers
  Mem:       257416        94112       163304            0         4648
 Swap:       145144            0       145144
Total:       402560        94112       308448
root@NSLU2:~# memtester 250m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 250MB (262144000 bytes)
got  250MB (262144000 bytes), trying mlock ...ntpd invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00000fb3 r6:cfc5f960 r5:c025af44 r4:cfc5f960
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c0065234>]
Function entered at [<c0065168>] from [<c00657d0>]
Function entered at [<c0065764>] from [<c0060074>]
 r7:cfd22f60 r6:00000000 r5:00000fff r4:00000000
Function entered at [<c005ff0c>] from [<c006a78c>]
Function entered at [<c006a720>] from [<c006b708>]
Function entered at [<c006b470>] from [<c0025cd0>]
Function entered at [<c0025bdc>] from [<c0025eb4>]
Function entered at [<c0025e94>] from [<c001f1ac>]
 r5:40112598 r4:ffffffff
Function entered at [<c001f194>] from [<c001fd60>]
Exception stack(0xcfe11fb0 to 0xcfe11ff8)
1fa0:                                     40125720 00000000 00000000 40125720
1fc0: 00000000 40112598 80000000 00000001 00000000 00000000 40125000 bef6eb14
1fe0: 0001d188 bef6eaf8 0000adc4 4007023c 60000010 ffffffff
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   3   Cold: hi:    6, btch:   1 usd:   0
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  84   Cold: hi:   30, btch:   7 usd:  20
Active:30086 inactive:29913 dirty:0 writeback:0 unstable:0
 free:2744 slab:476 mapped:0 pagetables:167 bounce:0
DMA free:3320kB min:2560kB low:3200kB high:3840kB active:28912kB inactive:28656kB present:65024kB pages_scanned:86516 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:7656kB min:7680kB low:9600kB high:11520kB active:91432kB inactive:90996kB present:195072kB pages_scanned:294421 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3320kB
Normal: 8*4kB 1*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7656kB
Swap cache: add 36366, delete 650, find 10/22, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
2928 free pages
1694 reserved pages
476 slab pages
26 pages shared
35716 pages swap cached
Out of memory: kill process 3844 (memtester) score 4019 or a child
Killed process 3844 (memtester)
Killed
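
As a cross-check on the min_free_kbytes change, the per-zone watermarks in the Mem-info dump above look like a proportional split of the 10240 kB setting by each zone's "present" size. The sketch below reproduces that arithmetic. The 5/4 and 3/2 factors for "low" and "high" are inferred from the logged numbers rather than taken from the kernel source, and the function name is only illustrative.

```python
# Sketch: check the zone watermarks printed in the Mem-info dump against
# a proportional split of vm.min_free_kbytes. The 5/4 and 3/2 factors are
# an assumption inferred from the logged values, not read from kernel code.

MIN_FREE_KBYTES = 10240  # value written to /proc/sys/vm/min_free_kbytes

# "present" size of each zone in kB, as printed in the Mem-info dump
ZONES_PRESENT_KB = {"DMA": 65024, "Normal": 195072}


def zone_watermarks(min_free_kb, zones_present_kb):
    """Return {zone: (min, low, high)} in kB for a proportional split."""
    total_kb = sum(zones_present_kb.values())
    marks = {}
    for zone, present_kb in zones_present_kb.items():
        zone_min = min_free_kb * present_kb // total_kb
        marks[zone] = (zone_min, zone_min * 5 // 4, zone_min * 3 // 2)
    return marks


watermarks = zone_watermarks(MIN_FREE_KBYTES, ZONES_PRESENT_KB)
for zone, (zmin, zlow, zhigh) in watermarks.items():
    print(f"{zone}: min:{zmin}kB low:{zlow}kB high:{zhigh}kB")
```

The computed values match the min/low/high figures in the dump. In that dump the Normal zone reports free:7656kB against min:7680kB, i.e. it sits just below its minimum watermark when the OOM killer runs.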


root@NSLU2:~# memtester 256m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 256MB (268435456 bytes)
got  256MB (268435456 bytes), trying mlock ...memtester invoked oom-killer: gfp_mask=0x1200d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00001013 r6:cfe96680 r5:c025af44 r4:cfe96680
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c006b5c8>]
Function entered at [<c006b470>] from [<c006bb38>]
Function entered at [<c006ba18>] from [<c006d168>]
Function entered at [<c006d0dc>] from [<c006db50>]
 r8:cfdf8a20 r7:40129000 r6:cff91f6c r5:cfe212cc r4:fffeffff
Function entered at [<c006da44>] from [<c006dcb8>]
Function entered at [<c006dc04>] from [<c006dda4>]
 r6:40129000 r5:fffffff4 r4:10001000
Function entered at [<c006dd04>] from [<c001fde0>]
 r6:10000000 r5:10000000 r4:10000008
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   2   Cold: hi:    6, btch:   1 usd:   5
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  21   Cold: hi:   30, btch:   7 usd:  12
Active:30143 inactive:29971 dirty:0 writeback:0 unstable:0
 free:2745 slab:456 mapped:0 pagetables:168 bounce:0
DMA free:3316kB min:2560kB low:3200kB high:3840kB active:28988kB inactive:28568kB present:65024kB pages_scanned:94915 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:7664kB min:7680kB low:9600kB high:11520kB active:91584kB inactive:91316kB present:195072kB pages_scanned:283698 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3316kB
Normal: 2*4kB 3*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7664kB
Swap cache: add 87014, delete 51312, find 73/121, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
2862 free pages
1694 reserved pages
456 slab pages
44 pages shared
35702 pages swap cached
Out of memory: kill process 3845 (memtester) score 4115 or a child
Killed process 3845 (memtester)
Killed


root@NSLU2:~# memtester 256m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 256MB (268435456 bytes)
got  256MB (268435456 bytes), trying mlock ...init invoked oom-killer: gfp_mask=0x1200d2, order=0, oomkilladj=0
Function entered at [<c0023f08>] from [<c0060c34>]
Function entered at [<c0060bdc>] from [<c0061098>]
 r7:00001013 r6:cfc5e8e0 r5:c025af44 r4:cfc5e8e0
Function entered at [<c0060f0c>] from [<c0063258>]
Function entered at [<c0062fe8>] from [<c0073fb4>]
Function entered at [<c0073f74>] from [<c006b0d8>]
 r7:cfd339b0 r6:0000098d r5:00000008 r4:00000006
Function entered at [<c006b098>] from [<c006b744>]
 r8:be94a000 r7:cfd339b0 r6:00002fa0 r5:00000000 r4:00131a00
Function entered at [<c006b470>] from [<c0025cd0>]
Function entered at [<c0025bdc>] from [<c001f1ec>]
Function entered at [<c001f1b0>] from [<c001fa00>]
Exception stack(0xcfc21de8 to 0xcfc21e30)
1de0:                   be94abe0 cfc21e64 ffffffe4 00000000 cfc21e54 00000004
1e00: be94abe0 00000000 00000000 00000000 be94ad10 cfc21fa4 0000001c cfc21e30
1e20: 00000000 c0105a34 00000013 ffffffff
 r8:00000000 r7:00000000 r6:be94abe0 r5:cfc21e1c r4:ffffffff
Function entered at [<c008b284>] from [<c001fde0>]
Mem-info:
DMA per-cpu:
CPU    0: Hot: hi:   18, btch:   3 usd:   2   Cold: hi:    6, btch:   1 usd:   5
Normal per-cpu:
CPU    0: Hot: hi:   90, btch:  15 usd:  38   Cold: hi:   30, btch:   7 usd:  28
Active:30013 inactive:30097 dirty:0 writeback:0 unstable:0
 free:2746 slab:453 mapped:0 pagetables:168 bounce:0
DMA free:3320kB min:2560kB low:3200kB high:3840kB active:28540kB inactive:29012kB present:65024kB pages_scanned:98257 all_unreclaimable? yes
lowmem_reserve[]: 0 190 190
Normal free:7664kB min:7680kB low:9600kB high:11520kB active:91512kB inactive:91376kB present:195072kB pages_scanned:288632 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0
DMA: 2*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3320kB
Normal: 0*4kB 2*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7664kB
Swap cache: add 132337, delete 96640, find 127/217, race 0+0
Free swap  = 0kB
Total swap = 145144kB
Free swap:            0kB
65536 pages of RAM
2896 free pages
1694 reserved pages
453 slab pages
44 pages shared
35697 pages swap cached
Out of memory: kill process 3846 (memtester) score 4115 or a child
Killed process 3846 (memtester)
Killed
ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=28672)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer cb010000!
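
For anyone reading the per-order free lists in these dumps, here is a small sketch that parses one of the "DMA:" lines and checks that the block counts add up to the reported total. It also works out which block size a 28672-byte request would need if it were served as a single power-of-two block of pages. That last part is an assumption about how the request is rounded, and the sketch does not explain why the allocation failed. The helper names are hypothetical.

```python
import re

PAGE_SIZE = 4096  # "pagesize is 4096" in the memtester output


def parse_free_list(line):
    """Return (zone, {block_kb: count}, reported_total_kb) for one free-list line."""
    zone, rest = line.split(":", 1)
    blocks = {int(kb): int(n) for n, kb in re.findall(r"(\d+)\*(\d+)kB", rest)}
    total = int(re.search(r"=\s*(\d+)kB", rest).group(1))
    return zone.strip(), blocks, total


def order_for(size_bytes, page_size=PAGE_SIZE):
    """Smallest order whose block (page_size << order) can hold size_bytes."""
    order = 0
    while (page_size << order) < size_bytes:
        order += 1
    return order


# Free-list line copied from the last Mem-info dump in this comment
line = ("DMA: 2*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB "
        "1*1024kB 1*2048kB 0*4096kB = 3320kB")
zone, blocks, reported_kb = parse_free_list(line)
computed_kb = sum(kb * n for kb, n in blocks.items())

order = order_for(28672)  # size from the alloc_safe_buffer message
print(zone, computed_kb, reported_kb, order, (PAGE_SIZE << order) // 1024)
```

The counts sum to the reported 3320kB, and a 28672-byte request rounds up to a 32kB (order 3) block under the assumption above.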

Comment 14 Lennert Buytenhek 2008-06-23 01:31:57 UTC
(In reply to comment #13)

> Then I "ipkg install memtester", then "memtester 128m".  That performed just
> fine.  However, doing "memtester 256m", and I got the dreaded OOM bug (console
> frozen).  Output is below.
>
> [...]
> got  250MB (262144000 bytes), trying mlock ...

You cannot expect to be able to mlock 250 MB of memory on a machine with
256 MB of RAM, no matter how much swap you have.  This has nothing to do
with the bug described in this bugzilla.
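
A rough illustration of the arithmetic behind this point, using the figures from the "free" output and the memtester banner quoted in this report. Locked pages cannot be swapped out, so the request has to fit in RAM alongside everything else. The function name is hypothetical, and the "used" figure is only a rough guide since part of it may be reclaimable cache.

```python
def mlock_headroom_kb(request_bytes, mem_total_kb, mem_used_kb=0):
    """kB left over if request_bytes were locked on top of current usage.

    A negative result means the locked region plus what is already in
    use cannot be resident in RAM at the same time.
    """
    request_kb = request_bytes // 1024
    return mem_total_kb - mem_used_kb - request_kb


# memtester 250m asks for 262144000 bytes; free reports 257416 kB total.
best_case = mlock_headroom_kb(262144000, mem_total_kb=257416)
with_usage = mlock_headroom_kb(262144000, mem_total_kb=257416, mem_used_kb=94112)
print(best_case, with_usage)
```

Even ignoring current usage entirely, locking 250 MB would leave under 1.5 MB for the kernel and every other process. With the reported usage counted, the headroom is negative.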
Comment 15 Rob Lockhart 2008-06-23 06:07:23 UTC
Following comment #14, which I read as saying that adding swap is not related to the bug, I turned swap off and re-tested from a clean state (i.e., re-flashed completely and restarted the test).

This is for the NSLU2 with 256MB SDRAM, IXP420 @ 266MHz, SlugOSBE-4.10-alpha, Apex 1.5.14, kernel as mentioned above.

root@NSLU2:~# echo 100 >/proc/sys/vm/swappiness
root@NSLU2:~# echo 10240 >/proc/sys/vm/min_free_kbytes
root@NSLU2:~# free
              total         used         free       shared      buffers
  Mem:       257416        94104       163312            0         4624
 Swap:            0            0            0
Total:       257416        94104       163312
root@NSLU2:~# memtester 250m
memtester version 4.0.6 (32-bit)
Copyright (C) 2006 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 250MB (262144000 bytes)
got  228MB (239128576 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : ok
  Bit Spread          : testing  17
root@NSLU2:~#
root@NSLU2:~#
root@NSLU2:~# man memtester
No manual entry for memtester
root@NSLU2:~# ipkg list | grep memtest
ehci_hcd 0000:00:01.2: alloc_safe_buffer: could not alloc dma memory (size=28672)
ehci_hcd 0000:00:01.2: map_single: unable to map unsafe buffer cddd0000!

See above - the NSLU2 hung at that point.  I was trying to find the man page for memtester (so I could make it run more quickly).

My presumption was that the remarks in comment #14 suggest not having swap turned on. It appears that the OOM / DMA bug still exists with no swap enabled.

However, note that the amount of memory memtester initially got (228MB) is still significantly less than the 250MB requested (which is what I would expect).
