Bug 8151
Summary: | Established connections not displayed in /proc/tcp/netstat | | |
---|---|---|---|
Product: | Networking | Reporter: | David Tung (getitupsidelines) |
Component: | Other | Assignee: | Stephen Hemminger (stephen) |
Status: | REJECTED INSUFFICIENT_DATA | | |
Severity: | normal | CC: | bunk, getitupsidelines, okir |
Priority: | P2 | | |
Hardware: | i386 | | |
OS: | Linux | | |
Kernel Version: | 2.6.17-rt7-default #3 PREEMPT | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | fdsock test program | | |
Description
David Tung
2007-03-08 07:53:53 UTC
Reply-To: akpm@linux-foundation.org

Begin forwarded message:

Date: Thu, 8 Mar 2007 07:53:57 -0800
From: bugme-daemon@bugzilla.kernel.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 8151] New: Established connections not displayed in /proc/tcp/netstat

http://bugzilla.kernel.org/show_bug.cgi?id=8151

Summary: Established connections not displayed in /proc/tcp/netstat
Kernel Version: 2.6.17-rt7-default #3 PREEMPT
Status: NEW
Severity: normal
Owner: acme@conectiva.com.br
Submitter: getitupsidelines@yahoo.com
CC: getitupsidelines@yahoo.com

Most recent kernel where this bug did *NOT* occur: Unknown
Distribution: SUSE 10 Pro
Hardware Environment: Intel(R) Pentium(R) 4 CPU 2.80GHz, 32 bit, 512 Megs of RAM
Software Environment: SUSE 10, 2.6.17-rt7-default with Ingo Molnar RT patch

Problem Description:
Established TCP network connections are not shown in /proc/net/tcp (and so are not displayed by netstat). I first noticed this when using netstat to display all established connections. A few connections were not shown as ESTABLISHED by netstat, yet my application did not report a broken connection to the remote device. I logged into the remote device and observed that the connection was ESTABLISHED on the remote side. In addition, I verified that data was still being transferred to the remote device. I searched through the proc file system for more information, and it appears that my application still holds the socket file descriptor open.
#### LOCAL MACHINE WITH BUG
# 10.1.2.71 is IP of remote device note connection not established
local:/ # netstat -a | grep 10.1.2.71

## NOTE IP of local device
local:/ # ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:03:2D:09:0E:6F
inet addr:10.1.2.10 Bcast:10.1.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:558434 errors:0 dropped:0 overruns:0 frame:0
TX packets:264478 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:37082257 (35.3 Mb) TX bytes:19489061 (18.5 Mb)
Interrupt:18

#
## telnet into remote machine
local:/ # telnet 10.1.2.71
Trying 10.1.2.71...
Connected to 10.1.2.71.
Escape character is '^]'.

MontaVista(R) Linux(R) Professional Edition 3.1
Linux/ppc 2.4.20_mvl31-405gp_eval

(none) login: root
Password:
Linux (none) 2.4.20_mvl31-405gp_eval #474 Tue Dec 27 16:37:25 PST 2005 ppc unknown

MontaVista(R) Linux(R) Professional Edition 3.1

BusyBox v0.60.3 (2004.01.09-22:55+0000) Built-in shell (ash)
Enter 'help' for a list of built-in commands.

######## Remote machine. Note connection established on port 1400 with local machine
# netstat -a | grep 10.1.2.10
tcp 0 0 10.1.2.71:telnet 10.1.2.10:51555 ESTABLISHED
tcp 0 0 10.1.2.71:1400 10.1.2.10:60380 ESTABLISHED

###### Back on LOCAL machine with bug
### Note established connection to IP 10.1.2.71 on port 1400 not shown. However, other established connections shown to other devices running the same remote server software.
local:/proc/2774/fd # grep 0A02010A /proc/net/tcp | more
64: 0A02010A:EBDC 4702010A:0578 01 00000000:00000000 00:00000000 00000000 0 0 7477 1 d8ba1980 201 40 10 2 100
70: 0A02010A:A547 4802010A:0578 01 00000000:00000000 00:00000000 00000000 0 0 7478 1 d8ba1400 201 40 10 2 100

#### Display open inodes
### I grep for the inode of the established connections from above. My application will open the sockets to the remote devices sequentially.
ml_alc10:/proc/2774/fd # ls -l | grep socket | grep 747[0-9]
lrwx------ 1 root root 64 Mar 7 10:37 154 -> socket:[7471]
lrwx------ 1 root root 64 Mar 7 10:37 155 -> socket:[7472]
lrwx------ 1 root root 64 Mar 7 10:37 158 -> socket:[7473]
lrwx------ 1 root root 64 Mar 7 10:37 159 -> socket:[7474]
lrwx------ 1 root root 64 Mar 7 10:37 160 -> socket:[7475]
lrwx------ 1 root root 64 Mar 7 10:37 161 -> socket:[7476]
lrwx------ 1 root root 64 Mar 7 10:37 162 -> socket:[7477]
lrwx------ 1 root root 64 Mar 7 10:37 163 -> socket:[7478]
lrwx------ 1 root root 64 Mar 7 10:37 164 -> socket:[7479]

## Check if socket inode 7479 exists in /proc/net/tcp
local:/proc/2774/fd # grep 7479 /proc/net/tcp | more
local:/proc/2774/fd # ls -l | grep 7479
lrwx------ 1 root root 64 Mar 7 10:37 164 -> socket:[7479]
local:/proc/2774/fd # grep 7479 /proc/net/tcp | more
local:/proc/2774/fd #

#### NOTE Very strange behavior. The connection will intermittently be shown in /proc/net/tcp with the same local port number and inode. This leads me to believe that the connection persists throughout and that the kernel is failing to accurately display all established socket connections.

Please advise on how to display all ESTABLISHED connections correctly.

Thanks.

The data in /proc/net/tcp is not atomic. If there are lots of connections, the list can easily change during the traversal, and some entries will be missed.

Is this problem occurring on an active machine with connections coming and going, or is it a standalone test/workstation?

Hello and thanks for the response. The machine is active. Of course, connections can occur at any time while the application is running, but for the specific sockets that are "missing" from the /proc/net/tcp file, the connection is proven to persist for hours or days at a time.
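For readers following the /proc/net/tcp rows quoted in the report, the sketch below shows one way to decode the hex endpoint fields and to look a socket up by inode, which is what the grep commands above do by hand. It is only an illustration: the function names and the `SAMPLE` constant are made up for this example, the address decoding assumes a little-endian host such as the i386 machine in the report, and the column positions (state in the fourth field, inode in the tenth) are taken from the rows as quoted.

```python
import socket
import struct

TCP_ESTABLISHED = 0x01  # "st" column value for an established connection

# The two rows quoted in the report (illustrative constant, not read from /proc).
SAMPLE = """\
64: 0A02010A:EBDC 4702010A:0578 01 00000000:00000000 00:00000000 00000000 0 0 7477 1 d8ba1980 201 40 10 2 100
70: 0A02010A:A547 4802010A:0578 01 00000000:00000000 00:00000000 00000000 0 0 7478 1 d8ba1400 201 40 10 2 100
"""


def decode_endpoint(field):
    """Turn 'AAAAAAAA:PPPP' into a (dotted_ip, port) tuple."""
    addr_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so the printed 32-bit value is the
    # network-order address with its bytes swapped.
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)


def parse_row(line):
    """Extract endpoints, state and inode from one /proc/net/tcp row."""
    fields = line.split()
    return {
        "local": decode_endpoint(fields[1]),
        "remote": decode_endpoint(fields[2]),
        "state": int(fields[3], 16),
        "inode": int(fields[9]),
    }


def find_inode(table_text, inode):
    """Return every parsed row whose inode matches; the header line is skipped."""
    matches = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as '64:'; the header does not.
        if len(fields) > 9 and fields[0].endswith(":"):
            row = parse_row(line)
            if row["inode"] == inode:
                matches.append(row)
    return matches
```

On the quoted rows, `find_inode(SAMPLE, 7477)` yields a single row with local endpoint 10.1.2.10:60380 and remote endpoint 10.1.2.71:1400 in state 01, while `find_inode(SAMPLE, 7479)` returns an empty list, which matches the empty grep result for that inode. On a live system the same functions could be fed the contents of /proc/net/tcp instead of `SAMPLE`.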
Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Date: Thu, 8 Mar 2007 07:53:57 -0800
> From: bugme-daemon@bugzilla.kernel.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 8151] New: Established connections not displayed in /proc/tcp/netstat

Please try running 'ss -nt' to see if it displays the missing connections. If it does, then we can confirm a problem with /proc; otherwise you might have a TCP problem instead. Note that with a recent kernel you may have to build/load the tcp_diag module before 'ss -nt' will work.

Cheers,

This is confusing. Your example shows

# grep 0A02010A /proc/net/tcp | more
64: 0A02010A:EBDC 4702010A:0578 01 00000000:00000000 00:00000000 00000000 0 0 7477 1 d8ba1980 201 40 10 2 100

and you say "established connection to IP 10.1.2.71 on port 1400 not shown". But 4702010A:0578 really is 10.1.2.71, port 1400 (0x0578 in hex), so it _is_ shown.

As for the remainder of the report: you list some open sockets of your application, and then go looking for the one with inode 7479 in /proc/net/tcp. Have you confirmed prior to this that the socket is indeed a TCP socket? What does lsof -p <pid-of-your-application> report?

ss -nt does not show the established socket. In addition, lsof -p shows "can't identify protocol" for my missing sockets. So, if I understand correctly: as far as the system is concerned, the missing sockets are not TCP sockets (according to lsof, even though they are according to everything else I see), and therefore they are not listed in /proc/net/tcp or netstat. So unless I'm missing something here, I assume this is NOT a kernel bug and should therefore be reported on another board. Please advise. Thanks

The lsof error message could mean that lsof is broken, or that the kernel indeed isn't printing all sockets. Can you please try the following:

- compile the program attached below, put the binary in /tmp/fdsock
- pick an application where you believe netstat et al don't show connected sockets properly. Run the app from the command line, and note the PID
- Attach to the running process using gdb:
  gdb /path/to/my/application <pid>
- Within gdb, run
  p system("/tmp/fdsock")

This should display a listing of all file handles the application has open, with an indication of whether each is a socket, tty, etc. Note that the output goes to the tty of the application you have attached to, not the terminal you're running gdb in.

For sockets, fdsock will also print out the local address and, if connected, the peer's address, like this:

socket inet/stream 192.168.5.32:58131 connected to 192.168.5.1:22 flags 802 (RDWR,NONBLOCK) on fd 3

Compare that to the output of /proc/net/tcp, netstat, etc.

Created attachment 10773
fdsock test program
Well, it helps to actually attach the test program, not just talk about
doing it :-)
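The source of the attached fdsock program is not reproduced on this page. Purely as an illustration of the kind of check being requested, the following sketch classifies a process's own open descriptors and reports socket family, type and endpoints. It is not the attachment: the function names are hypothetical, it only inspects the process it runs in, and it assumes Python 3.7 or later on Linux, where a socket object created from an existing descriptor detects the family and type itself.

```python
import os
import socket
import stat


def describe_fd(fd):
    """Describe one open descriptor: socket details, or 'not a socket'."""
    if not stat.S_ISSOCK(os.fstat(fd).st_mode):
        return f"fd {fd}: not a socket"
    # Wrap a duplicate so that closing the wrapper leaves the caller's fd open.
    with socket.socket(fileno=os.dup(fd)) as sock:
        # Known families and types are enums with a name; unknown ones are ints.
        family = getattr(sock.family, "name", sock.family)
        kind = getattr(sock.type, "name", sock.type)
        text = f"fd {fd}: socket {family}/{kind} local={sock.getsockname()!r}"
        try:
            text += f" connected to {sock.getpeername()!r}"
        except OSError:
            # getpeername fails on sockets that have no peer.
            text += " (not connected)"
    return text


def describe_open_fds():
    """Describe every descriptor currently listed in /proc/self/fd."""
    lines = []
    for name in os.listdir("/proc/self/fd"):
        try:
            lines.append(describe_fd(int(name)))
        except OSError:
            # For example, the descriptor used to read the directory itself,
            # which is already closed by the time it is inspected.
            continue
    return lines
```

A descriptor reported with family AF_UNIX rather than AF_INET would not be expected to appear in /proc/net/tcp, so the family shown here is the first thing to compare against the netstat output.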
Maybe the missing sockets are Unix domain, not TCP at all.

Please give us some feedback: is this still a bug, or is it a non-problem?

Please reopen this bug if:
- it is still present with kernel 2.6.22 and
- you can provide the requested information.