Bug 13518 - slab grows with NFS write activity.
Summary: slab grows with NFS write activity.
Status: CLOSED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: NFS
Hardware: All
OS: Linux
Importance: P1 normal
Assignee: bfields
URL:
Keywords:
Depends on:
Blocks: 13070
 
Reported: 2009-06-12 09:51 UTC by Andrew Randrianasulu
Modified: 2010-10-11 19:23 UTC (History)
3 users

See Also:
Kernel Version: 2.6.30
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
My .config (89.15 KB, text/plain)
2009-06-12 09:51 UTC, Andrew Randrianasulu
Details
first slabtop "screenshot" (1.46 KB, text/plain)
2009-06-12 09:54 UTC, Andrew Randrianasulu
Details
Second ... (1.46 KB, text/plain)
2009-06-12 09:54 UTC, Andrew Randrianasulu
Details
3-rd (1.46 KB, text/plain)
2009-06-12 09:56 UTC, Andrew Randrianasulu
Details
last one (1.46 KB, text/plain)
2009-06-12 09:56 UTC, Andrew Randrianasulu
Details
/etc/exports (403 bytes, text/plain)
2009-06-12 09:58 UTC, Andrew Randrianasulu
Details
/etc/fstab (semi-autogenerated by script) (909 bytes, text/plain)
2009-06-12 10:00 UTC, Andrew Randrianasulu
Details
lspci -v (6.23 KB, text/plain)
2009-06-12 10:06 UTC, Andrew Randrianasulu
Details
last 3100 lines from /var/log/syslog (226.26 KB, text/plain)
2009-06-12 15:29 UTC, Andrew Randrianasulu
Details
kmemtrace log (733.49 KB, application/octet-stream)
2009-06-13 20:11 UTC, Andrew Randrianasulu
Details
fix possible reply cache leak (678 bytes, patch)
2009-07-12 19:34 UTC, bfields
Details | Diff
/proc/slab_allocators before any NFS activity (21.03 KB, text/plain)
2009-07-22 10:10 UTC, Andrew Randrianasulu
Details
/proc/slab_allocators after client boot (21.62 KB, text/plain)
2009-07-22 10:44 UTC, Andrew Randrianasulu
Details
My current .config (92.26 KB, text/plain)
2009-07-22 10:48 UTC, Andrew Randrianasulu
Details

Description Andrew Randrianasulu 2009-06-12 09:51:26 UTC
Created attachment 21869 [details]
My .config

I first observed this bug with 2.6.30-rc8 while trying to reproduce bug #13375. After a whole night of NFS activity the system becomes very slow. I blamed Mozilla first, but now I think the problem is deeper: slabtop clearly showed a constantly growing kmalloc-96, even without X running at all.

Current /proc/meminfo (with NFS server active)

guest@slax:~$ cat /proc/meminfo 
MemTotal:         242072 kB
MemFree:            6392 kB
Buffers:            8864 kB
Cached:            65464 kB
SwapCached:         2428 kB
Active:            59032 kB
Inactive:          89468 kB
Active(anon):      37384 kB
Inactive(anon):    38304 kB
Active(file):      21648 kB
Inactive(file):    51164 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:             0 kB
HighFree:              0 kB
LowTotal:         242072 kB
LowFree:            6392 kB
SwapTotal:        399992 kB
SwapFree:         396644 kB
Dirty:                48 kB
Writeback:             8 kB
AnonPages:         71908 kB
Mapped:            30376 kB
Slab:              78524 kB
SReclaimable:       8160 kB
SUnreclaim:        70364 kB
PageTables:         1364 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      521028 kB
Committed_AS:     156896 kB
VmallocTotal:     770104 kB
VmallocUsed:        6208 kB
VmallocChunk:     731224 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       4096 kB
DirectMap4k:      262080 kB
DirectMap4M:           0 kB

----------

uname -a
Linux slax 2.6.30-i486 #4 SMP Thu Jun 11 17:44:06 GMT 2009 i686 AMD Duron(tm) Processor AuthenticAMD GNU/Linux

I'll attach my kernel .config and a few slabtop "screenshots" from the console.
Comment 1 Andrew Randrianasulu 2009-06-12 09:54:05 UTC
Created attachment 21870 [details]
first slabtop "screenshot"

slabtop -o > file.txt
Comment 2 Andrew Randrianasulu 2009-06-12 09:54:59 UTC
Created attachment 21871 [details]
Second ...
Comment 3 Andrew Randrianasulu 2009-06-12 09:56:04 UTC
Created attachment 21872 [details]
3-rd
Comment 4 Andrew Randrianasulu 2009-06-12 09:56:56 UTC
Created attachment 21873 [details]
last one

Captured 10 minutes after the first one.
Comment 5 Andrew Randrianasulu 2009-06-12 09:58:19 UTC
Created attachment 21874 [details]
/etc/exports
Comment 6 Andrew Randrianasulu 2009-06-12 10:00:26 UTC
Created attachment 21875 [details]
/etc/fstab (semi-autogenerated by script)
Comment 7 Andrew Randrianasulu 2009-06-12 10:06:58 UTC
Created attachment 21876 [details]
lspci -v

As you can see, my network card is:
00:06.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
Comment 8 Trond Myklebust 2009-06-12 12:07:13 UTC
Looks like a server issue rather than a client problem. Reassigning to Bruce.
Comment 9 Andrew Randrianasulu 2009-06-12 15:29:13 UTC
Created attachment 21880 [details]
last 3100 lines from /var/log/syslog

After I enabled debugging for slab/kmalloc-96:

echo 1 > /sys/kernel/slab/kmalloc-96/trace
echo 1 > /sys/kernel/slab/kmalloc-96/sanity_checks

I got this log. (There was also hal-something in it, but I disabled hald completely and kmalloc-96 still grows.)
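The kmalloc-96 growth described above can also be sampled without slabtop. A minimal sketch (the helper name is made up, and a canned line stands in for /proc/slabinfo here):

```shell
# Illustrative helper: print the active-object count for one cache from
# /proc/slabinfo-style output (field 1: cache name, field 2: active objects).
slab_active() {
    awk -v cache="$1" '$1 == cache { print $2 }'
}

# Canned sample line standing in for /proc/slabinfo:
printf 'kmalloc-96 1775296 1775300 96 42 1\n' | slab_active kmalloc-96

# Real use (sample repeatedly to see the growth):
#   slab_active kmalloc-96 < /proc/slabinfo
```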
Comment 10 bfields 2009-06-12 15:55:14 UTC
I don't understand how to read the slab debugging traces.... Is that a stack dump of every alloc/free on kmalloc-96?

What's your last known good server kernel?
Comment 11 Andrew Randrianasulu 2009-06-12 18:47:06 UTC
I just turned on slab debugging in the hope that someone can understand my problem a bit better.

The last good (for NFS) kernel was 2.6.27.19 (I skipped .28, and .29 already has additional bugs).
Comment 12 Neil Brown 2009-06-12 23:01:03 UTC
If you can compile your kernel with CONFIG_DEBUG_SLAB_LEAK
then report /proc/slab_allocators at an appropriate time, that should
prove quite useful.
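The workflow Neil suggests can be sketched as two snapshots plus a diff (file paths and the helper name are illustrative; the real input is /proc/slab_allocators, which requires CONFIG_DEBUG_SLAB_LEAK, i.e. the SLAB allocator):

```shell
# Sketch: diff two /proc/slab_allocators snapshots; lines present only in
# the "after" snapshot point at the allocation call sites that grew.
snap_diff() {
    sort "$1" > /tmp/slab_before.sorted
    sort "$2" > /tmp/slab_after.sorted
    diff /tmp/slab_before.sorted /tmp/slab_after.sorted | grep '^>' || true
}

# Real use:
#   cp /proc/slab_allocators before.txt
#   ... run the NFS workload ...
#   cp /proc/slab_allocators after.txt
#   snap_diff before.txt after.txt
```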
Comment 13 Andrew Randrianasulu 2009-06-13 03:07:47 UTC
With CONFIG_SLUB=y I can't see CONFIG_DEBUG_SLAB_LEAK, only CONFIG_SLUB_DEBUG=y and CONFIG_SLUB_STATS=y. I will try a different allocator today.
Comment 14 Andrew Randrianasulu 2009-06-13 20:11:50 UTC
Created attachment 21900 [details]
kmemtrace log

2.6.30 has an ftrace-based kernel memory tracer. Not sure whether it is useful or not, but here is its output, captured with the userspace tool from

http://repo.or.cz/w/kmemtrace-user.git?a=shortlog;h=refs/heads/ftrace-temp
Comment 15 Andrew Randrianasulu 2009-06-14 09:55:24 UTC
Currently testing 2.6.30 with the SLAB allocator. Maybe this memory leak was already fixed in mainline:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7747a0b0af5976ba3828796b4f7a7adc3bb76dbd

author	Felix Blyakher <felixb@sgi.com>
	Thu, 11 Jun 2009 22:07:28 +0000 (17:07 -0500)
committer	Felix Blyakher <felixb@sgi.com>
	Fri, 12 Jun 2009 15:26:52 +0000 (10:26 -0500)
commit	7747a0b0af5976ba3828796b4f7a7adc3bb76dbd
tree	cf56450f057c3045341fe50c4e865466ee8a4522
parent	35fd035968de4f674b9d62ee7b1d80ab7a50c384
xfs: fix freeing memory in xfs_getbmap()

Regression from commit 28e211700a81b0a934b6c7a4b8e7dda843634d2f.
Need to free temporary buffer allocated in xfs_getbmap().

commit 28e211700a81b0a934b6c7a4b8e7dda843634d2f (xfs: fix getbmap vs mmap deadlock) was from Thu, 30 Apr 2009 05:29:02 +0000 (00:29 -0500). 

I will try this fix separately, and maybe the whole linux-git tree.
Comment 16 Andrew Randrianasulu 2009-06-29 07:28:10 UTC
Bug still here, at least in Linus git up to commit 987fed3bf6982f2627d4fa242caa9026ef61132a
Merge: ed4fc72... 8b169b5...
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Jun 25 17:04:37 2009 -0700

    Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

Slab grows from ~30 MB to well over 100 MB in just a few hours of NFS activity.
Comment 17 Andrew Randrianasulu 2009-07-10 09:27:45 UTC
Bug is still here for me with Linus's tree up to commit

2e3167308048ca6c810733384d8289082f7e4ec1 

(Date:   Wed Jul 8 17:05:32 2009 -0700
    fealnx: Fix build breakage -- PR_CONT should be KERN_CONT)

1775296 1775289  99%    0.06K  27739       64    110956K kmalloc-64


This is the top line from slabtop.
Comment 18 bfields 2009-07-10 22:22:59 UTC
Questions:

1. Does that number eventually go back down after a period of inactivity, or does it stay high indefinitely?

2. Is the behavior you're seeing new with recent kernels, or does the same workload on an older kernel give different results?
Comment 19 Andrew Randrianasulu 2009-07-10 22:31:47 UTC
1. This time my machine just hung hard. But usually, stopping nfsd and killing nearly all killable processes doesn't give me the memory back. Just leaving the machine inactive has the same effect, i.e. slab/unreclaimable memory doesn't go down.

2. I saw no such behaviour with 2.6.29.5, but it has other bugs in NFS serving from an XFS partition.
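Point 1 can be checked objectively by sampling SUnreclaim from /proc/meminfo over time after stopping nfsd (a minimal sketch; the helper name is made up, and a canned file stands in for /proc/meminfo here):

```shell
# Illustrative helper: extract the SUnreclaim value (in kB) from a
# /proc/meminfo-format file.
sunreclaim_kb() {
    awk '$1 == "SUnreclaim:" { print $2 }' "$1"
}

# Canned sample standing in for /proc/meminfo:
printf 'Slab:       78524 kB\nSUnreclaim: 70364 kB\n' > /tmp/meminfo.sample
sunreclaim_kb /tmp/meminfo.sample

# Real use, e.g. once a minute after stopping nfsd:
#   while sleep 60; do sunreclaim_kb /proc/meminfo; done
```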
Comment 20 bfields 2009-07-12 19:34:23 UTC
Created attachment 22326 [details]
fix possible reply cache leak

Were you ever able to try CONFIG_DEBUG_SLAB_LEAK?

Looking at the code....  I can't see how the reply cache code could ever have been right.  (Maybe I'm missing something.)

As a stab in the dark, could you try the attached patch? (Compile-tested only.)
Comment 21 Andrew Randrianasulu 2009-07-13 12:13:37 UTC
Sorry, I think I wrote an incorrect description: this bug occurs with the SLUB memory allocator (I haven't tried SLOB), and it was also present with SLAB (only in the latter case can I select CONFIG_DEBUG_SLAB_LEAK).

I have

CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set

in my current .config. But slabinfo works with both SLUB and SLAB, which is why I added the word "slab" to my bug description. Before I learned about this small utility, I had only found by pure chance that the SUnreclaim line actually grows with NFS server activity; before that it was nearly a mystery to me.



I will test the patch today, but because I have the sync option in my /etc/exports, testing will take some hours (I also want to be sure that slab not only grows, but also never shrinks).
Comment 22 Andrew Randrianasulu 2009-07-21 05:54:27 UTC
No, sorry, the patch from comment #20 doesn't fix this bug. I'm only in the middle of the "syncing local tree" phase, and after 6 hours of usage with the patched kernel I have this slab size:

 Active / Total Size (% used)       : 101871.09K / 104065.19K (97.9%)
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
1253056 1253051  99%    0.06K  19579       64     78316K kmalloc-64
Comment 23 bfields 2009-07-21 14:09:36 UTC
What is ""syncing local tree" phase"?

Have you been able to try Neil's suggestion from #12 yet?
Comment 24 Andrew Randrianasulu 2009-07-21 17:16:14 UTC
I have Gentoo Linux on my client machine. Gentoo has an app named "emerge-webrsync"; it downloads a huge tree of ebuilds over HTTP as an lzma-packed tarball, unpacks it, and syncs the local working tree with the newly unpacked one. The tree is over 130K small files. As the name suggests, the tool uses rsync behind the scenes, I think.

Not sure whether testing with a DIFFERENT memory allocator actually makes sense, but I tried it some time ago: same bug. Right now I have 2.6.31-rc3-git5 with kmemleak enabled; let me boot it first, and then I will recompile with the SLAB allocator and CONFIG_DEBUG_SLAB_LEAK and report here. (Ah, I just realized: if the leak is not in the allocator code itself, different allocators should behave the same, and SLAB just offers better debugging, yes?)
Comment 25 Andrew Randrianasulu 2009-07-22 10:10:38 UTC
Created attachment 22440 [details]
/proc/slab_allocators  before  any NFS activity
Comment 26 Andrew Randrianasulu 2009-07-22 10:44:14 UTC
Created attachment 22441 [details]
/proc/slab_allocators after client boot
Comment 27 Andrew Randrianasulu 2009-07-22 10:48:15 UTC
Created attachment 22442 [details]
My current .config

Something is wrong with it, because I have a slab size of nearly 100 MB right after booting my "server", before X starts.

I can't really test this bug with a kernel configured like this; too few MBs in my machine. (I have only 256 MB of RAM.)
Comment 28 Andrew Randrianasulu 2009-07-25 03:39:05 UTC
I don't see this bug anymore with 2.6.31-rc4

Probably fixed by commit bc146d23d1358af43f03793c3ad8c9f16bbcffcb ("ide: fix memory leak when flush command is issued").
Comment 29 bfields 2009-07-27 21:44:49 UTC
OK, thanks for following up. It might be worth testing whether applying (or reverting) that single patch is enough to fix (or, respectively, reproduce) the problem.

And then maybe check whether it has gone into stable, if appropriate.
