Distribution: slackware
Hardware Environment: i386
Software Environment: linux

Problem Description:
After upgrading from 2.6.x to 2.6.10, and once the machine has been running for some days with the NAT table filled with thousands of entries, running

    cat /proc/net/ip_conntrack | wc -l

prints a line count (currently between 22k and 23k) preceded by the message "No space left on device". Note that the ip_conntrack_max parameter in the proc fs is set to 65535. The problem appears randomly. An strace of the command above suggests a possible buffer overflow. If you decrease the ip_conntrack_max value to 8192, the problem disappears.

Steps to reproduce:
- set the ip_conntrack_max parameter to 65535
- wait for some hours, opening a lot of connections that get recorded in the ip_conntrack file (proc fs)
- do a "wc -l" of the conntrack file
- you randomly get the error
news:
- ip_conntrack_max set to 65535
- opened lots of connections
- issued a "wc -l /proc/net/ip_conntrack"
- result was ok (22442)
- ... zzzZZZzzz ... waited some time (5 minutes)
- issued the same command
- result was nok (no space left on device, but only 11787 lines this time !!!)

Here's the example:

    root@linux-gw:/etc/rc.d# wc -l /proc/net/ip_conntrack
    22442 /proc/net/ip_conntrack
    root@linux-gw:/etc/rc.d# wc -l /proc/net/ip_conntrack
    wc: /proc/net/ip_conntrack: No space left on device
    11787 /proc/net/ip_conntrack
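
For anyone scripting this check, here is a minimal Python sketch of a line counter that treats ENOSPC as "output truncated" instead of silently reporting a short count. The function name `count_lines` and the chunk size are illustrative assumptions, not anything taken from wc or the kernel.

```python
import errno
import os


def count_lines(path, chunk_size=65536):
    """Count newlines in `path`, stopping at EOF or at an ENOSPC read error.

    Returns (lines, truncated). When `truncated` is True, `lines` only covers
    the data that was read before the error, so it is a lower bound.
    """
    lines = 0
    truncated = False
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                chunk = os.read(fd, chunk_size)
            except OSError as exc:
                if exc.errno == errno.ENOSPC:
                    # Same condition wc reports as "No space left on device".
                    truncated = True
                    break
                raise
            if not chunk:  # EOF
                break
            lines += chunk.count(b"\n")
    finally:
        os.close(fd)
    return lines, truncated
```

Calling `count_lines("/proc/net/ip_conntrack")` on an affected box would then let a script tell a complete count from a partial one.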
I can confirm this bug on 2.6.11 (vanilla). We have about 60000 entries now (the kernel runs without swap but with plenty of free memory).

A possibly related issue is the rapidly increasing number of entries. From the FAQ:

"The answer is easy: UNREPLIED entries are temporary entries, i.e. as soon as we run out of connection tracking entries (we reach /proc/sys/net/ipv4/ip_conntrack_max), we delete old UNREPLIED entries. In other words: instead of having empty conntrack entries, we'd rather keep some maybe useful information in them until we really need them."

In my opinion this is not true. An example:
1) create Established Unreplied entries (I tried 230; their timeout is 5 days)
2) then set ip_conntrack_max to 232 and try to open more than 2 connections (not possible)
3) leave the machine running and try to create more than 2 connections the next day => the same result (not a single unreplied entry was deleted)

With higher numbers I got the same results. I don't think this is the expected behavior. Or would the FAQ be more accurate as: "... , we delete old UNREPLIED entries which have timed out (currently after 5 days). ..." ?
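
To check whether unreplied entries really go away between two runs of the test above, one could tally the flags in a saved dump. This is a rough Python sketch that assumes the usual ip_conntrack text format, where unreplied and assured entries carry the literal markers `[UNREPLIED]` and `[ASSURED]`. The function name and the sample lines are made up for illustration.

```python
def summarize_flags(lines):
    """Tally conntrack entries by the flag markers present in each line."""
    counts = {"total": 0, "unreplied": 0, "assured": 0}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        counts["total"] += 1
        if "[UNREPLIED]" in line:
            counts["unreplied"] += 1
        if "[ASSURED]" in line:
            counts["assured"] += 1
    return counts


# Invented sample entries, for illustration only.
SAMPLE = [
    "tcp 6 431990 ESTABLISHED src=10.0.0.2 dst=10.0.0.9 sport=1025 dport=80 "
    "[UNREPLIED] src=10.0.0.9 dst=10.0.0.2 sport=80 dport=1025 use=1",
    "tcp 6 431995 ESTABLISHED src=10.0.0.3 dst=10.0.0.9 sport=1026 dport=80 "
    "src=10.0.0.9 dst=10.0.0.3 sport=80 dport=1026 [ASSURED] use=1",
    "udp 17 25 src=10.0.0.4 dst=10.0.0.9 sport=5353 dport=53 "
    "[UNREPLIED] src=10.0.0.9 dst=10.0.0.4 sport=53 dport=5353 use=1",
]

print(summarize_flags(SAMPLE))
```

Comparing the "unreplied" figure from a dump taken on day one with one taken the next day would show directly whether any of those entries were dropped.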
Well, the 'problem' here is that some (at least one) of the hash buckets in your conntrack table gets too large to be dumped into the single buffer provided by userspace / the seq_file system. In such a case, we simply truncate the output and return ENOSPC to indicate that something is missing.

We cannot do a 'perfect' iteration of the hash table, since we
1) cannot lock the table while the kernel is scheduling to the userspace process
2) don't have any 'handle' that would allow us a smaller iteration granularity than one hash bucket of the table.

Instead of 'cat /proc/net/ip_conntrack | wc -l' you should be doing 'cat /proc/net/ip_conntrack_count' anyway, since the latter doesn't affect your system performance (and the former is quite expensive).
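
The truncation described above can be sketched with a toy model in Python. This is not the kernel's seq_file code. It only assumes, as stated in the comment, that the dump advances one whole hash bucket at a time and gives up with ENOSPC when a bucket's text does not fit in the buffer. `dump_table`, `buffer_size` and the sample buckets are hypothetical.

```python
import errno


def dump_table(buckets, buffer_size):
    """Toy model of a bucket-granular dump into a fixed-size buffer.

    `buckets` is a list of lists of entry strings. Each bucket must fit in
    the buffer as a whole, because the iteration cannot stop mid-bucket.
    Returns (lines, err), where err is 0 on success or errno.ENOSPC when an
    oversized bucket cut the dump short.
    """
    lines = []
    for bucket in buckets:
        text = "".join(entry + "\n" for entry in bucket)
        if len(text) > buffer_size:
            # The bucket cannot be emitted atomically: truncate here.
            return lines, errno.ENOSPC
        lines.extend(bucket)
    return lines, 0


# Three buckets of 10-character entries; the middle one is overloaded.
table = [["a" * 10] * 2, ["b" * 10] * 5, ["c" * 10]]

lines, err = dump_table(table, buffer_size=30)
print(len(lines), err == errno.ENOSPC)

lines, err = dump_table(table, buffer_size=100)
print(len(lines), err)
```

In this model the number of lines emitted before the error depends on where the overloaded bucket sits in the table, which would be consistent with the differing counts (22442 vs. 11787) reported earlier in this thread.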