Summary: slab grows with NFS write activity.
Product: File System
Reporter: Andrew Randrianasulu (randrik)
Severity: normal
CC: neilb, rjw, trondmy
Bug Depends on:
first slabtop "screenshot"
/etc/fstab (semi-autogenerated by script)
last 3100 lines from /var/log/syslog
fix possible reply cache leak
/proc/slab_allocators before any NFS activity
/proc/slab_allocators after client boot
My current .config
Description Andrew Randrianasulu 2009-06-12 09:51:26 UTC
Created attachment 21869 [details]
My .config

I first observed this bug with 2.6.30-rc8, while trying to reproduce #13375. After a whole night of NFS activity the system becomes very slow. I blamed Mozilla first, but now I think the problem is deeper: slabtop clearly showed a constantly growing kmalloc-96, even without X running at all.

Current /proc/meminfo (with NFS server active):

guest@slax:~$ cat /proc/meminfo
MemTotal: 242072 kB
MemFree: 6392 kB
Buffers: 8864 kB
Cached: 65464 kB
SwapCached: 2428 kB
Active: 59032 kB
Inactive: 89468 kB
Active(anon): 37384 kB
Inactive(anon): 38304 kB
Active(file): 21648 kB
Inactive(file): 51164 kB
Unevictable: 0 kB
Mlocked: 0 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 242072 kB
LowFree: 6392 kB
SwapTotal: 399992 kB
SwapFree: 396644 kB
Dirty: 48 kB
Writeback: 8 kB
AnonPages: 71908 kB
Mapped: 30376 kB
Slab: 78524 kB
SReclaimable: 8160 kB
SUnreclaim: 70364 kB
PageTables: 1364 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 521028 kB
Committed_AS: 156896 kB
VmallocTotal: 770104 kB
VmallocUsed: 6208 kB
VmallocChunk: 731224 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 4096 kB
DirectMap4k: 262080 kB
DirectMap4M: 0 kB

----------

uname -a
Linux slax 2.6.30-i486 #4 SMP Thu Jun 11 17:44:06 GMT 2009 i686 AMD Duron(tm) Processor AuthenticAMD GNU/Linux

I'll attach the kernel .config and a few slabtop "screenshots" from the console.
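For readers who want to track the same numbers over time, here is a minimal Python sketch that pulls the slab fields out of /proc/meminfo-style text. The function names (parse_meminfo, slab_summary) and the trimmed SAMPLE string are illustrative and not part of the original report; the values in SAMPLE are copied from the dump above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Values are returned as printed (kB for most fields, plain counts for
    the HugePages_* fields). Lines without a numeric value are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Return slab totals plus unreclaimable slab as a share of MemTotal."""
    return {
        "slab_kb": fields["Slab"],
        "reclaimable_kb": fields["SReclaimable"],
        "unreclaim_kb": fields["SUnreclaim"],
        "unreclaim_pct": 100.0 * fields["SUnreclaim"] / fields["MemTotal"],
    }


# Trimmed copy of the values reported above (hypothetical helper constant).
SAMPLE = """MemTotal: 242072 kB
MemFree: 6392 kB
Slab: 78524 kB
SReclaimable: 8160 kB
SUnreclaim: 70364 kB
"""

if __name__ == "__main__":
    summary = slab_summary(parse_meminfo(SAMPLE))
    print(summary)
```

On a live system the same functions could be fed the contents of /proc/meminfo read at intervals; in the dump above, Slab equals SReclaimable plus SUnreclaim, and the unreclaimable part is a bit under a third of total RAM.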
Comment 1 Andrew Randrianasulu 2009-06-12 09:54:05 UTC
Created attachment 21870 [details] first slabtop "screenshot" slabtop -o > file.txt
Comment 2 Andrew Randrianasulu 2009-06-12 09:54:59 UTC
Created attachment 21871 [details] Second ...
Comment 4 Andrew Randrianasulu 2009-06-12 09:56:56 UTC
Created attachment 21873 [details] last one captured 10 minutes after first.
Comment 5 Andrew Randrianasulu 2009-06-12 09:58:19 UTC
Created attachment 21874 [details] /etc/exports
Comment 6 Andrew Randrianasulu 2009-06-12 10:00:26 UTC
Created attachment 21875 [details] /etc/fstab (semi-autogenerated by script)
Comment 7 Andrew Randrianasulu 2009-06-12 10:06:58 UTC
Created attachment 21876 [details] lspci -v as you can see, my network card is 00:06.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
Comment 8 Trond Myklebust 2009-06-12 12:07:13 UTC
Looks like a server issue rather than a client problem. Reassigning to Bruce.
Comment 9 Andrew Randrianasulu 2009-06-12 15:29:13 UTC
Created attachment 21880 [details]
last 3100 lines from /var/log/syslog

After I enabled debugging for slab/kmalloc-96 with

echo 1 > /sys/kernel/slab/kmalloc-96/trace
echo 1 > /sys/kernel/slab/kmalloc-96/sanity_checks

I got this log. (There was also hal-something, but I disabled hald completely and kmalloc-96 still grows.)
Comment 10 bfields 2009-06-12 15:55:14 UTC
I don't understand how to read the slab debugging traces.... Is that a stack dump of every alloc/free on kmalloc-96? What's your last known good server kernel?
Comment 11 Andrew Randrianasulu 2009-06-12 18:47:06 UTC
I just turned on slab debug in the hope that someone can understand my problem a bit better. The last good (for NFS) kernel was 220.127.116.11 (I skipped .28, and .29 already has additional bugs).
Comment 12 Neil Brown 2009-06-12 23:01:03 UTC
If you can compile your kernel with CONFIG_DEBUG_SLAB_LEAK then report /proc/slab_allocators at an appropriate time, that should prove quite useful.
Comment 13 Andrew Randrianasulu 2009-06-13 03:07:47 UTC
With CONFIG_SLUB=y I can't see CONFIG_DEBUG_SLAB_LEAK, only CONFIG_SLUB_DEBUG=y and CONFIG_SLUB_STATS=y. I will play with a different allocator today.
Comment 14 Andrew Randrianasulu 2009-06-13 20:11:50 UTC
Created attachment 21900 [details]
kmemtrace log

2.6.30 has an ftrace-based kernel memory tracer. Not sure if it is useful or not, but here is the output, captured with the userspace tool from http://repo.or.cz/w/kmemtrace-user.git?a=shortlog;h=refs/heads/ftrace-temp
Comment 15 Andrew Randrianasulu 2009-06-14 09:55:24 UTC
Testing 2.6.30 with the SLAB allocator currently. Maybe this memleak was already fixed in mainline:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7747a0b0af5976ba3828796b4f7a7adc3bb76dbd

author Felix Blyakher <email@example.com> Thu, 11 Jun 2009 22:07:28 +0000 (17:07 -0500)
committer Felix Blyakher <firstname.lastname@example.org> Fri, 12 Jun 2009 15:26:52 +0000 (10:26 -0500)
commit 7747a0b0af5976ba3828796b4f7a7adc3bb76dbd
tree cf56450f057c3045341fe50c4e865466ee8a4522
parent 35fd035968de4f674b9d62ee7b1d80ab7a50c384

xfs: fix freeing memory in xfs_getbmap()

Regression from commit 28e211700a81b0a934b6c7a4b8e7dda843634d2f. Need to free temporary buffer allocated in xfs_getbmap().

Commit 28e211700a81b0a934b6c7a4b8e7dda843634d2f (xfs: fix getbmap vs mmap deadlock) was from Thu, 30 Apr 2009 05:29:02 +0000 (00:29 -0500). I will try this separate fix, and maybe the whole linux-git tree.
Comment 16 Andrew Randrianasulu 2009-06-29 07:28:10 UTC
Bug still here, at least in Linus's git up to

commit 987fed3bf6982f2627d4fa242caa9026ef61132a
Merge: ed4fc72... 8b169b5...
Author: Linus Torvalds <email@example.com>
Date: Thu Jun 25 17:04:37 2009 -0700

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

Slab grows from ~30 MB to well over 100 MB in just a few hours of NFS activity.
Comment 17 Andrew Randrianasulu 2009-07-10 09:27:45 UTC
Bug still here for me with Linus's tree up to commit 2e3167308048ca6c810733384d8289082f7e4ec1 (Date: Wed Jul 8 17:05:32 2009 -0700 fealnx: Fix build breakage -- PR_CONT should be KERN_CONT). This is the top line from slabtop:

1775296 1775289 99% 0.06K 27739 64 110956K kmalloc-64
Comment 18 bfields 2009-07-10 22:22:59 UTC
Questions:
1. Does that number eventually go back down after a period of inactivity, or does it stay high indefinitely?
2. Is the behavior you're seeing new with recent kernels, or does the same workload on an older kernel give different results?
Comment 19 Andrew Randrianasulu 2009-07-10 22:31:47 UTC
1. This time my machine just hung hard. But usually stopping nfsd and killing nearly all killable processes doesn't give me the memory back. Just leaving it inactive has the same effect, i.e. slab/unreclaimable memory doesn't go down.
2. I saw no such behaviour with 18.104.22.168, but it has other bugs in the NFS-from-XFS-partition area.
Comment 20 bfields 2009-07-12 19:34:23 UTC
Created attachment 22326 [details]
fix possible reply cache leak

Were you ever able to try CONFIG_DEBUG_SLAB_LEAK? Looking at the code.... I can't see how the reply cache code could ever have been right. (Maybe I'm missing something.) As a stab in the dark, could you try the attached patch? (Compile-tested only.)
Comment 21 Andrew Randrianasulu 2009-07-13 12:13:37 UTC
Sorry, I think I wrote an incorrect description: this bug occurs with the SLUB memory allocator (I have not tried SLOB), and it was also present with SLAB (only in the latter case can I select CONFIG_DEBUG_SLAB_LEAK). I have

CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set

in my current .config. But slabinfo works with both SLUB and SLAB. So I added the word "slab" to my bug description, because before I learned about this small utility I found only by pure chance that the SUnreclaim line actually grows with NFS server activity. Before that it was nearly a mystery to me. I will test the patch today, but because I have the sync argument in my /etc/exports, testing will take some hours (I also want to be sure that slab not only grows, but also never shrinks).
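The "grows but never shrinks" check mentioned here can be sketched in Python as below. This is only an illustration: the function names and the tolerance parameter are hypothetical, and the sample readings in the usage part are made-up round numbers shaped like the growth described in this thread (roughly 30 MB to over 100 MB in a few hours), not measurements.

```python
def never_shrinks(samples_kb, tolerance_kb=0):
    """True if each reading is at least the previous one minus a tolerance.

    samples_kb: SUnreclaim (or slab size) readings in kB, oldest first.
    """
    return all(
        later >= earlier - tolerance_kb
        for earlier, later in zip(samples_kb, samples_kb[1:])
    )


def growth_kb_per_hour(timed_samples):
    """Average growth between the first and last (hours, kB) readings."""
    (t0, kb0), (t1, kb1) = timed_samples[0], timed_samples[-1]
    if t1 == t0:
        raise ValueError("need readings taken at different times")
    return (kb1 - kb0) / (t1 - t0)


if __name__ == "__main__":
    # Hypothetical readings: (hours since boot, SUnreclaim in kB).
    readings = [(0, 30000), (2, 52000), (4, 78000), (6, 104000)]
    sizes = [kb for _, kb in readings]
    print("never shrinks:", never_shrinks(sizes))
    print("kB per hour:", growth_kb_per_hour(readings))
```

A leak would show up as never_shrinks staying true even across idle periods or after nfsd is stopped, which is the distinction question 1 in comment 18 is asking about.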
Comment 22 Andrew Randrianasulu 2009-07-21 05:54:27 UTC
No, sorry, but the patch from #20 doesn't fix this bug. I'm only in the middle of the "syncing local tree" phase, and after 6 hrs of usage with the patched kernel I have this slab size:

Active / Total Size (% used) : 101871.09K / 104065.19K (97.9%)

   OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1253056 1253051  99%    0.06K  19579       64     78316K kmalloc-64
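As a cross-check on how the slabtop columns relate, the following Python sketch parses one such row and verifies the arithmetic. The SlabRow type and parse_slabtop_row are illustrative names, and the 4 kB slab size (one page per slab) is an assumption that happens to match the figures reported in this thread; it is not stated anywhere in the report.

```python
from collections import namedtuple

SlabRow = namedtuple(
    "SlabRow",
    "objs active use_pct obj_size_kb slabs objs_per_slab cache_kb name",
)

# Assumption: each kmalloc-64 slab occupies one 4 kB page on this machine.
SLAB_KB = 4


def parse_slabtop_row(line):
    """Parse one data row of `slabtop -o` output into a SlabRow."""
    objs, active, use, obj_size, slabs, per_slab, cache, name = line.split(None, 7)
    return SlabRow(
        objs=int(objs),
        active=int(active),
        use_pct=int(use.rstrip("%")),
        obj_size_kb=float(obj_size.rstrip("K")),
        slabs=int(slabs),
        objs_per_slab=int(per_slab),
        cache_kb=int(cache.rstrip("K")),
        name=name.strip(),
    )


if __name__ == "__main__":
    # Rows copied from comments 17 and 22 of this report.
    for text in (
        "1775296 1775289 99% 0.06K 27739 64 110956K kmalloc-64",
        "1253056 1253051 99% 0.06K 19579 64 78316K kmalloc-64",
    ):
        row = parse_slabtop_row(text)
        print(row.name,
              row.slabs * row.objs_per_slab == row.objs,
              row.slabs * SLAB_KB == row.cache_kb)
```

For both rows quoted in this thread, SLABS times OBJ/SLAB gives OBJS and SLABS times 4 kB gives CACHE SIZE, so nearly all of the slab growth is accounted for by live 64-byte objects rather than by partially empty slabs.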
Comment 23 bfields 2009-07-21 14:09:36 UTC
What is ""syncing local tree" phase"? Have you been able to try Neil's suggestion from #12 yet?
Comment 24 Andrew Randrianasulu 2009-07-21 17:16:14 UTC
I have Gentoo Linux on my client machine. Gentoo has an app named "emerge-webrsync": it downloads a huge tree of ebuilds over HTTP as an lzma-packed tarball, unpacks it, and syncs the local working tree with the newly unpacked one. The tree is over 130K small files. As the name suggests, the tool uses rsync behind the scenes, I think. Not sure if testing with a DIFFERENT memory allocator actually makes sense, but I tried it some time ago and saw the same bug. Right now I have 2.6.31-rc3-git5 with kmemleak enabled; let me boot it first, and then I will recompile with the SLAB allocator and CONFIG_DEBUG_SLAB_LEAK and report here. (Ah, I just realized that if the leak is not in the allocator code itself, different allocators should behave the same, and SLAB just offers better debugging, yes?)
Comment 25 Andrew Randrianasulu 2009-07-22 10:10:38 UTC
Created attachment 22440 [details] /proc/slab_allocators before any NFS activity
Comment 26 Andrew Randrianasulu 2009-07-22 10:44:14 UTC
Created attachment 22441 [details] /proc/slab_allocators after client boot
Comment 27 Andrew Randrianasulu 2009-07-22 10:48:15 UTC
Created attachment 22442 [details]
My current .config

Something is wrong with it, because I have slab size nearly at 100 MB right after booting my "server", before X starts. I can't really test this bug with a kernel configured like this; there is too little memory in my machine (I have only 256 MB RAM).
Comment 28 Andrew Randrianasulu 2009-07-25 03:39:05 UTC
I don't see this bug anymore with 2.6.31-rc4. Probably fixed with commit bc146d23d1358af43f03793c3ad8c9f16bbcffcb ("ide: fix memory leak when flush command is issued").
Comment 29 bfields 2009-07-27 21:44:49 UTC
OK, thanks for following up. Might be worth testing whether applying (or reverting) that single patch is enough to fix (or, respectively, reproduce) the problem. And then maybe see if it's gone into stable if appropriate.