Hi there! I've been experiencing a big problem with the latest kernel version and all 2.6 versions prior to it. I'm using Fedora Core 3 and the machine in question is a router and dial-in server. I use pppoe-server in kernel mode (rp-pppoe-3.7 and pppd 2.4.3). When a user connects to the server with PPPoE, then to the HTTP daemon, and then disconnects, the kernel starts printing messages like this:

Message from syslogd@nextc at Thu Mar 9 10:51:14 2006 ...
nextc kernel: unregister_netdevice: waiting for ppp9 to become free. Usage count = 233

NOW this blocks any connection attempt to the pppoe-server for at least 10 minutes! I cannot even do a simple "ip addr list" because my shell will hang! After some googling on the net I made this patch:

--- include/net/tcp.h.original	2006-03-06 05:22:38.000000000 +0200
+++ include/net/tcp.h	2006-03-06 05:24:58.000000000 +0200
@@ -100,7 +100,7 @@
  */
-#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT
+#define TCP_TIMEWAIT_LEN (2*HZ)  /* how long to wait to destroy TIME-WAIT
 				  * state, about 60 seconds	*/
 #define TCP_FIN_TIMEOUT	TCP_TIMEWAIT_LEN /* BSD style FIN_WAIT2 deadlock breaker.

Now the TCP connections disappear very fast, but the problem still remains...

Best regards,
Zabavschi Vlad
NOW I realise that this bug occurs even if the user doesn't disconnect from the server!
Created attachment 7539 [details] /var/log/messages output
Begin forwarded message:

Date: Thu, 9 Mar 2006 01:24:06 -0800
From: bugme-daemon@bugzilla.kernel.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 6197] New: unregister_netdevice: waiting for ppp9 to become free. Usage count = 658

http://bugzilla.kernel.org/show_bug.cgi?id=6197

Summary: unregister_netdevice: waiting for ppp9 to become free. Usage count = 658
Kernel Version: 2.6.15 and all 2.6 series
Status: NEW
Severity: blocking
Owner: acme@conectiva.com.br
Submitter: vlad031@yahoo.com

[...]
Created attachment 7551 [details] Proposed patch to fix

We need to handle the NETDEV_UNREGISTER message and remove all references to the device. We currently fail to do so.
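For readers following along, the general pattern for a subsystem that holds struct net_device references is to register a netdevice notifier and drop those references when NETDEV_UNREGISTER arrives. Below is a minimal sketch of that pattern only - it is not the attached patch, and the table and names (cached_dev, example_netdev_event) are hypothetical:

#include <linux/netdevice.h>
#include <linux/notifier.h>

#define MAX_CACHED 16

/* Hypothetical per-subsystem table of cached, dev_hold()'d device pointers. */
static struct net_device *cached_dev[MAX_CACHED];

static int example_netdev_event(struct notifier_block *nb,
                                unsigned long event, void *ptr)
{
	struct net_device *dev = ptr;	/* 2.6.x notifiers pass the device itself */
	int i;

	if (event != NETDEV_UNREGISTER)
		return NOTIFY_DONE;

	/* Drop every reference we hold on the disappearing device; otherwise
	 * unregister_netdevice() waits forever for the refcount to hit zero
	 * ("waiting for ... to become free"). */
	for (i = 0; i < MAX_CACHED; i++) {
		if (cached_dev[i] == dev) {
			dev_put(cached_dev[i]);
			cached_dev[i] = NULL;
		}
	}
	return NOTIFY_DONE;
}

static struct notifier_block example_netdev_notifier = {
	.notifier_call = example_netdev_event,
};

/* register_netdevice_notifier(&example_netdev_notifier) at module init,
 * unregister_netdevice_notifier(&example_netdev_notifier) at module exit. */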
The patch doesn't work... more than that, it breaks iproute!

# ip route add default dev ppp0 table 1
RTNETLINK answers: No such device
#
Same here on 2.6.14.7 with the rp-pppoe server (tested pppoe in kernel and in userspace mode - no difference). I have a few routers with the same kernel and the same setup, but the problem is happening only on one of them :/
Created attachment 7864 [details] kernel configuration
loaded modules:

Module                  Size  Used by
ipt_mac                 1796  1
iptable_nat             6520  0
ip_nat                 16597  1 iptable_nat
ipv6                  225836  32
ipt_connlimit           2787  36
ip_conntrack           45456  3 iptable_nat,ip_nat,ipt_connlimit
nfnetlink               5163  2 ip_nat,ip_conntrack
ipt_ipp2p               7843  54
ipt_p2p                 4060  54
ipt_mark                1612  36
ipt_limit               2072  18
ipt_IMQ                 1949  36
cls_u32                 7297  2
sch_htb                15470  2
iptable_mangle          2308  1
iptable_filter          2453  1
ip6_tables             16532  0
ip_tables              18741  10 ipt_mac,iptable_nat,ipt_connlimit,ipt_ipp2p,ipt_p2p,ipt_mark,ipt_limit,ipt_IMQ,iptable_mangle,iptable_filter
imq                     4023  0
sch_sfq                 4972  38
realtime                9245  0
pppoe                  10325  10
pppox                   2727  1 pppoe
ppp_generic            24204  22 pppoe,pppox
slhc                    6068  1 ppp_generic
8139too                22976  0
mii                     4316  1 8139too
ne2k_pci                8674  0
8390                    8061  1 ne2k_pci
ext3                  120069  2
mbcache                 7417  1 ext3
jbd                    46895  1 ext3
ide_disk               13945  4
piix                    9060  0 [permanent]
ide_core              110102  2 ide_disk,piix
The problem still remains in the 2.6.16 series... could someone PLEASE fix this bug? I've been waiting for weeks now and the problem isn't solved!
Created attachment 8272 [details] Hard test patch

So... I've patched dev.c NOT to check for refs anymore... Now the problem remains but it doesn't appear that often... and instead of showing the message "waiting for ... to bla bla bla" it gives me this:

------------[ cut here ]------------
kernel BUG at net/core/dev.c:2949!
invalid opcode: 0000 [#1]
PREEMPT
Modules linked in: xt_MARK xt_mark xt_limit xt_state bonding sch_teql eql ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_TCPMSS ipt_SAME ipt_REJECT ipt_REDIRECT ipt_recent ipt_policy ipt_owner ipt_NETMAP ipt_multiport ipt_MASQUERADE ipt_LOG ipt_iprange ipt_IMQ ipt_hashlimit ipt_esp ipt_ECN ipt_ecn ipt_CLUSTERIP ipt_ah ipt_addrtype iptable_raw iptable_nat iptable_mangle iptable_filter ip_tables ip_nat_tftp ip_nat_snmp_basic ip_nat_pptp ip_nat_irc ip_nat_ftp ip_nat_amanda ip_conntrack_tftp ip_conntrack_pptp ip_conntrack_netlink ip_nat ip_conntrack_netbios_ns ip_conntrack_irc ip_conntrack_ftp ip_conntrack_amanda ip_conntrack arpt_mangle arptable_filter arp_tables intel_agp agpgart
CPU:    0
EIP:    0060:[<c0258ffb>]    Not tainted VLI
EFLAGS: 00010206   (2.6.16.19.31 #1)
EIP is at netdev_run_todo+0xf2/0x1a4
eax: 00000003   ebx: dd286400   ecx: d7923df0   edx: 00000000
esi: dd286400   edi: d7e79c00   ebp: dfccdac4   esp: dccc5f68
ds: 007b   es: 007b   ss: 0068
Process pppd (pid: 22669, threadinfo=dccc5000 task=de559030)
Stack: <0>dccc5f68 dccc5f68 dccc5000 dd286400 c02249ff d7e79c00 d8f92900 dfca611c
       c0224c54 dfccdb1c c013f299 dffa4f40 d8f92900 00000000 dfc03580 dccc5000
       c013cf85 00000008 80040e98 00000000 c010245f 00000008 ffffffe0 8003e488
Call Trace:
 [<c02249ff>] ppp_shutdown_interface+0x57/0xa0
 [<c0224c54>] ppp_release+0x20/0x4c
 [<c013f299>] __fput+0x85/0x146
 [<c013cf85>] filp_close+0x4e/0x54
 [<c010245f>] sysenter_past_esp+0x54/0x75
Code: bd 00 00 00 50 53 68 bc ea 2e c0 e9 a9 00 00 00 89 d8 e8 21 8d 00 00 c7 83 6c 01 00 00 04 00 00 00 8b 83 58 01 00 00 85 c0 74 08 <0f> 0b 85 0b 91 e8 2e c0 83 bb a8 00 00 00 00 74 1c 68 86 0b 00
I HAVE FOUND THE BUG! The bug is in the conntrack modules from netfilter... if I do NAT, they keep a reference and the ppp connection hangs!
Now... this bug starts to piss me off... I see that conntrack is keeping connection tracking entries for all addresses... not just the NATed ones... isn't that a bug?! (and also a performance impediment?!) Best regards, Vlad Z.
Your kernel configuration file looks like you are using a distribution kernel with some big patches, so please bug your vendor. As an aside, I wonder how I ended up on the CC list for this bug. I may have done some networking stuff, but certainly nothing related to PPP.
First, I'm not using a distro kernel... I'm using the latest stable vanilla kernel from kernel.org. Second, THIS IS A KERNEL BUG, not a pppd one... Sorry if my mail disturbed you, but it seems that no one is willing to help me... So, I will no longer use pppoe-server; as for those who can help me and don't want to... I'll just say -> YOU SUCK! Sorry for everyone else... Best regards, Vlad Z.
Please retry without the IMQ patch and tell us if the problem persists.
Yes, the problem still persists without the IMQ patch... however... I'm starting to think that this is a ppp bug... I really don't know how to approach this problem... Best regards, Vlad Z.
Possible, but it could just as well be anywhere else. Just to clarify: unloading conntrack/NAT really makes the problem go away? If so, please describe your full ruleset and (in any case) your network setup.
Well... I can accelerate the appearance of the bug... but still... I can't make it go away... this happens when I let users connect to the server and then just add a -j DROP rule for their IP in the FORWARD chain of the filter table... Even if I put 0 in /proc/sys/net/ipv4/ip_conntrack_max the bug is still there... I don't know what to do... Best regards, Vlad Z.
Please answer my questions if you want me to help you.
And I told you that no... I cannot make the bug go away by unloading conntrack/NAT... That was just an illusion... And I was also trying to say that the bug appears very fast with the -j DROP...
That was only one part of it. Please describe your network setup and post your iptables ruleset.
Guys! I have exactly the same problem, but with the bridging code. I use Xen and bridged networking. When any of the hosted domUs goes down, the system tries to release the bridged network interface but fails with the message:

unregister_netdevice: waiting for vif3.0 to become free. Usage count = 15

This bug existed in the 2.6.12 kernel and in all kernels I tried with Xen up to the current 2.6.16.13. Someone may say that this is a Xen-related problem, but it definitely is not, because the Xen patches have nothing to do with the vanilla bridging support. Also, I did some searching through this bugzilla and found another guy with the problem, but IPv6-related: http://bugzilla.kernel.org/show_bug.cgi?id=6698

I can give as much information about my system and settings as needed, just ask.
That message is just the symptom of a bug, which can be caused by a large number of reasons, so without any evidence I wouldn't necessarily expect them to be related. Feel free to describe your setup, we might notice some similarities, but if not I prefer to focus on this case first.
I am not sure what info is required; here is something that may be useful. If anything else is needed, please let me know.

andrew:~# lsmod
Module                  Size  Used by
xt_physdev              2128  5
iptable_nat             6596  0
ip_nat                 15244  1 iptable_nat
ip_conntrack           45164  2 iptable_nat,ip_nat
nfnetlink               5176  2 ip_nat,ip_conntrack
iptable_filter          2304  1
ip_tables              10936  2 iptable_nat,iptable_filter
x_tables                9732  3 xt_physdev,iptable_nat,ip_tables
ip_queue                8800  1
intel_agp              20732  1
agpgart                30096  1 intel_agp

andrew:~# lspci
0000:00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
0000:00:01.0 PCI bridge: Intel Corp. 82865G/PE/P PCI to AGP Controller (rev 02)
0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 (rev 02)
0000:00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2 (rev 02)
0000:00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
0000:00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4 (rev 02)
0000:00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02)
0000:00:1f.2 IDE interface: Intel Corp. 82801EB (ICH5) Serial ATA 150 Storage Controller (rev 02)
0000:00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
0000:00:1f.5 Multimedia audio controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
0000:01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
0000:02:01.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)

andrew:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0A:5E:49:B7:8F
          inet addr:xxx.xxx.191.100  Bcast:xxx.xxx.191.111  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:7392 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4358 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:5677283 (5.4 MiB)  TX bytes:1011620 (987.9 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:447 errors:0 dropped:0 overruns:0 frame:0
          TX packets:447 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:153615 (150.0 KiB)  TX bytes:153615 (150.0 KiB)

peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:824228 errors:0 dropped:0 overruns:30 frame:0
          TX packets:1029137 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:178338055 (170.0 MiB)  TX bytes:1302189026 (1.2 GiB)
          Interrupt:16 Base address:0x2000

vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4358 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7392 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1011620 (987.9 KiB)  TX bytes:5677283 (5.4 MiB)

vif5.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1669 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3235 errors:0 dropped:12 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:238965 (233.3 KiB)  TX bytes:896602 (875.5 KiB)

vif6.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:101808 errors:0 dropped:0 overruns:0 frame:0
          TX packets:88639 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:92074727 (87.8 MiB)  TX bytes:11000932 (10.4 MiB)

vif7.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:96355 errors:0 dropped:0 overruns:0 frame:0
          TX packets:87861 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:80993591 (77.2 MiB)  TX bytes:14691890 (14.0 MiB)

vif8.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:820304 errors:0 dropped:0 overruns:0 frame:0
          TX packets:636143 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1125509012 (1.0 GiB)  TX bytes:142082026 (135.4 MiB)

vif9.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4589 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4718 errors:0 dropped:4 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2115778 (2.0 MiB)  TX bytes:715754 (698.9 KiB)

xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:740 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:29306 (28.6 KiB)  TX bytes:0 (0.0 b)

The only common thing I can see is that we both use netfilter conntrack.
Some more info:

andrew:~# iptables -nvL
Chain INPUT (policy ACCEPT 7775 packets, 5872K bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 12182 packets, 6714K bytes)
 pkts bytes target     prot opt in     out     source               destination
 381K  420M QUEUE      all  --  *      *       0.0.0.0/0            217.24.191.105
 196K   37M QUEUE      all  --  *      *       0.0.0.0/0            217.24.191.104
 256K   27M QUEUE      all  --  *      *       0.0.0.0/0            217.24.191.103
 728K  152M QUEUE      all  --  *      *       0.0.0.0/0            217.24.191.102
 6707 1022K QUEUE      all  --  *      *       0.0.0.0/0            217.24.191.101
 258K   26M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif5.0
 289K  251M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif6.0
 220K  198M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif7.0
 948K 1270M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif8.0
 7495 3030K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif9.0

Chain OUTPUT (policy ACCEPT 5522 packets, 1316K bytes)
 pkts bytes target     prot opt in     out     source               destination
@Anton: does flushing ip_queue's queue help? Packets passed to iptables from the bridging layer have a reference to the bridge port which ip_queue doesn't check for when a NETDEV_UNREGISTER notification comes, so these packets are not automatically released.
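To illustrate the idea (a minimal sketch only, assuming a plain sk_buff queue; the real ip_queue keeps richer per-entry state, and the names pending_queue/example_flush_dev are made up here), the queue would have to be walked on NETDEV_UNREGISTER and every packet that pins the device dropped, including packets that only reference it as a bridge port via skb->nf_bridge:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical queue of packets handed to userspace (locking omitted). */
static struct sk_buff_head pending_queue;

/* Drop every queued packet that pins 'dev', either as its ordinary
 * input/output device or as a bridge port via skb->nf_bridge. */
static void example_flush_dev(struct net_device *dev)
{
	struct sk_buff_head keep;
	struct sk_buff *skb;

	skb_queue_head_init(&keep);

	while ((skb = skb_dequeue(&pending_queue)) != NULL) {
		if (skb->dev == dev
#ifdef CONFIG_BRIDGE_NETFILTER
		    || (skb->nf_bridge &&
			(skb->nf_bridge->physindev == dev ||
			 skb->nf_bridge->physoutdev == dev))
#endif
		   )
			kfree_skb(skb);		/* stop this packet from pinning the device */
		else
			skb_queue_tail(&keep, skb);	/* keep unrelated packets queued */
	}
	while ((skb = skb_dequeue(&keep)) != NULL)
		skb_queue_tail(&pending_queue, skb);
}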
Created attachment 8343 [details] Free queue entries when bridge port disappears @Anton: alternatively just try this patch.
Patrick, I am sorry for my silence - I've been damn busy :( Unfortunately I can't apply your patch right now, but I can try in 2-3 hours. What I can check right now is: I can remove all lines from iptables with -j QUEUE, then unload the ip_queue module, and then try to shut down a test domU (and its interface). Would that help shed some light before I apply the patch?
Yes, that would also show if the problem really is within ip_queue.
Well... I tried and got the same again:

andrew:~# iptables -n -v -L
Chain INPUT (policy ACCEPT 16628 packets, 7073K bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 29629 packets, 8852K bytes)
 pkts bytes target     prot opt in     out     source               destination
 833K  690M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif6.0
 549K  502M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif7.0
2141K 2797M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif8.0
20186 5781K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            PHYSDEV match --physdev-in vif9.0

Chain OUTPUT (policy ACCEPT 14553 packets, 2948K bytes)
 pkts bytes target     prot opt in     out     source               destination

syslog:

Jun 20 22:25:58 andrew kernel: xenbr0: port 3(vif5.0) entering disabled state
Jun 20 22:25:58 andrew kernel: device vif5.0 left promiscuous mode
Jun 20 22:25:58 andrew kernel: xenbr0: port 3(vif5.0) entering disabled state
Jun 20 22:25:58 andrew logger: /etc/xen/scripts/vif-bridge: offline XENBUS_PATH=backend/vif/5/0
Jun 20 22:25:59 andrew logger: /etc/xen/scripts/vif-bridge: brctl delif xenbr0 vif5.0 failed
Jun 20 22:25:59 andrew logger: /etc/xen/scripts/vif-bridge: ifconfig vif5.0 down failed
Jun 20 22:25:59 andrew logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge offline for vif5.0, bridge xenbr0.
Jun 20 22:26:08 andrew kernel: unregister_netdevice: waiting for vif5.0 to become free. Usage count = 249
Jun 20 22:26:39 andrew last message repeated 3 times
P.S. Sorry, I ran "iptables -n -v -L" after the shutdown of the domU, but I don't think it really matters.
Patrick, is it possible to check what is locking the device? I found that the kernel gets into an infinite loop in the netdev_wait_allrefs() function in net/core/dev.c. I see where it gets the usage count from, but I can't see how to find out what is actually using the device in order to print a useful debug message.
No, that's not possible. Something took the references and didn't release them, which could be for a number of reasons:

- the references leaked
- something that contains them leaked
- something that is still holding them doesn't notice that they should be released

ip_queue falls into the last category. I'll see if I can spot some more; I could imagine more similar mistakes, since this part of bridging is badly integrated. BTW, does the problem go away after some time (a couple of minutes)?
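For anyone wondering why the kernel can only print a count and not a culprit: the loop that emits the message, netdev_wait_allrefs() in net/core/dev.c, only has an atomic refcount to look at - nothing records who took each reference. Roughly (simplified and abridged from the 2.6 source, not an exact copy), it just keeps rebroadcasting NETDEV_UNREGISTER and waiting:

static void netdev_wait_allrefs(struct net_device *dev)
{
	unsigned long rebroadcast_time = jiffies, warning_time = jiffies;

	while (atomic_read(&dev->refcnt) != 0) {
		if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
			/* Ask all subsystems once more to drop their references. */
			rtnl_lock();
			notifier_call_chain(&netdev_chain, NETDEV_UNREGISTER, dev);
			rtnl_unlock();
			rebroadcast_time = jiffies;
		}

		msleep(250);

		if (time_after(jiffies, warning_time + 10 * HZ)) {
			/* This is the message seen in the logs above. */
			printk(KERN_EMERG "unregister_netdevice: waiting for %s "
			       "to become free. Usage count = %d\n",
			       dev->name, atomic_read(&dev->refcnt));
			warning_time = jiffies;
		}
	}
}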
One more thing: are you using NAT on your bridge?
Patrick:

1. No, the problem is still here about an hour after the interface shutdown. The refcount is still the same = 249.

2. No, I don't use NAT. All addresses, including the dom0 address, are from the same network xxx.xxx.191.96/255.255.255.240.

Again, the current list of loaded modules:

andrew:~# lsmod
Module                  Size  Used by
xt_physdev              2128  4
iptable_nat             6596  0
ip_nat                 15244  1 iptable_nat
ip_conntrack           45164  2 iptable_nat,ip_nat
nfnetlink               5176  2 ip_nat,ip_conntrack
iptable_filter          2304  1
ip_tables              10936  2 iptable_nat,iptable_filter
x_tables                9732  3 xt_physdev,iptable_nat,ip_tables
intel_agp              20732  1
agpgart                30096  1 intel_agp

and the current iptables rules:

andrew:~# iptables-save
# Generated by iptables-save v1.2.11 on Tue Jun 20 23:22:41 2006
*nat
:PREROUTING ACCEPT [127949:6881642]
:POSTROUTING ACCEPT [128275:6901908]
:OUTPUT ACCEPT [329:20482]
COMMIT
# Completed on Tue Jun 20 23:22:41 2006
# Generated by iptables-save v1.2.11 on Tue Jun 20 23:22:41 2006
*filter
:INPUT ACCEPT [18890:7257551]
:FORWARD ACCEPT [47562:17379588]
:OUTPUT ACCEPT [16316:3317802]
-A FORWARD -m physdev --physdev-in vif6.0 -j ACCEPT
-A FORWARD -m physdev --physdev-in vif7.0 -j ACCEPT
-A FORWARD -m physdev --physdev-in vif8.0 -j ACCEPT
-A FORWARD -m physdev --physdev-in vif9.0 -j ACCEPT
COMMIT
# Completed on Tue Jun 20 23:22:41 2006
Created attachment 8353 [details] My kernel config Here is my dom0 kernel config. Hope this helps
Hello guys. Today I set up a new Xen server with exactly the same config, except that I used Debian Etch (testing) instead of Debian Sarge (stable) for dom0, with the same domUs. I see NO SUCH PROBLEM with this distro. It looks like it's a Debian Sarge-specific problem. I am going to move all users from the affected server to the new one and then reinstall the new one with Etch to be sure it is not hardware related.

P.S. Debian Etch uses a different gcc (version 4.0.4 20060507) vs gcc version 3.3.5 in Sarge.
Umm, sorry... It seems Debian is not guilty after all. I am getting this message again on the new server. Looks like it's ip_queue's fault. I will check Patrick's patch tomorrow to see if it helps.
To Patrick: I've done a lot of tests and I can say for sure: the problem is inside the IPQ kernel module. I tried to apply your patch too, but it doesn't help. More information: when I boot the server without `modprobe ip_queue`, it works perfectly. I can reboot or shut down any domU (which makes changes to the bridge topology) without problems. After I load the IPQ module and redirect some traffic to it (I use ipcad to collect stats), I get the problem. Then, even if I do `rmmod ip_queue` before I get the "unregister_netdevice" kernel message, I get it anyway, so it looks like the IPQ module leaves some trace in the kernel (dunno how). So I am going to try ULOG next.
Created attachment 8494 [details] Handle NF_STOP in nf_reinject

This means we're leaking somewhere on the queue path. I found a possible reason for this in the queueing core, but I can't see how this could be triggered. Please try this patch anyway.
Nope, SSDD :( I applied your patch in addition to the 2 previous ones. Btw, after I applied it and then ran `make modules`, it didn't rebuild any modules, so I ran `touch net/netfilter/nfnetlink_queue.c` and then `make modules` again so that it rebuilt net/netfilter/nfnetlink_queue.ko. Was that correct?
No, this affects a part that is always statically built in, so you need to rebuild and install the entire kernel.
Patrick, I have rebuilt the whole kernel (make clean and then make ...). Now it behaves strangely, but a bit more stably. When I try to ping the domU domains to generate some traffic, it works stably. When I use a flood ping to do the same, I sometimes get the unregister_netdevice message after shutdown, but not always. I tried to reproduce the bug twice more without success. The statistics are as follows: I rebooted the dom0 server 4 times and did some pings (normal and flood) after every reboot, then tried to reboot/shut down the domU domains. Once I got the error, 3 times everything worked fine. The first time it was fine, the second time I got the error and used a sysrq reboot, then twice it worked fine and I couldn't reproduce the bug again.
There was one case my previous patch didn't handle: if userspace sends incorrect verdicts, the packet may also leak. Does this patch behave more reliably? BTW, you don't need to make clean, just "make" should be fine.
Created attachment 8496 [details] Handle NF_STOP and invalid verdicts
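To make the intent of these two patches concrete - this is a hedged sketch under assumptions, not the attached code, and the real nf_reinject() operates on the kernel's own queue-entry structures - the reinject path has to release the queue entry for every possible verdict, including NF_STOP and bogus values from userspace, because the entry is what keeps holding the device references:

#include <linux/netfilter.h>
#include <linux/skbuff.h>
#include <linux/slab.h>

/* Hypothetical per-packet queue state standing in for the kernel's own. */
struct example_queue_entry {
	struct sk_buff *skb;
	int (*okfn)(struct sk_buff *);
};

static void example_release_entry(struct example_queue_entry *e)
{
	/* In the real code this is where dev_put() is called on the devices
	 * that were dev_hold()'d when the packet was queued. */
	kfree(e);
}

/* Simplified reinject: whatever the verdict, the entry must be released. */
static void example_reinject(struct example_queue_entry *e, unsigned int verdict)
{
	switch (verdict) {
	case NF_ACCEPT:
	case NF_STOP:			/* previously unhandled: the entry leaked */
		e->okfn(e->skb);	/* let the packet continue on its way */
		break;
	case NF_DROP:
	default:			/* invalid verdict from userspace */
		kfree_skb(e->skb);
		break;
	}
	example_release_entry(e);
}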
One more thing: please also try to rmmod the ip_queue module with this patch in case the problem appears.
Patrick, please let me know if your first patch (Free queue entries when bridge port disappears) is required. I have it applied to my kernel.
Not sure if it's required for your case (it doesn't seem to help), but it is a correct fix and it is also part of 2.6.18-rc1.
Ok, some new info: I still CAN reproduce the bug, but only under high traffic (2 flood pings to different domUs in parallel). The first domU rebooted fine under flood ping, the second domU failed and I got:

unregister_netdevice: waiting for vif10.0 to become free. Usage count = 49786

Removing the -j QUEUE lines from iptables and then `rmmod ip_queue` didn't help either. When traffic is not high, everything works fine.
OK thanks. The patch is right in any case, I'm going to continue looking for other possible reasons, but probably won't get to it today. One question I'm not entirely clear about: the problem appears in the host system, not in the guest system, right?
Yes, the problem appears in the host system (dom0) when rebooting/shutting down a guest system (domU). Patrick, before you go, please answer: is there any chance you will do something over the weekend? I need to upgrade the system and put the server back in the datacenter today, but I can wait until Monday morning (local time) if you need to check anything else before then.
I'll look into it this weekend, but not sure if I will find anything.
OK, I'll leave the server here and will be watching this topic.
Any updates on this problem? Vlad, Anton - have you tested with later kernels since? There were multiple netlink fixes submitted by Patrick lately. Please test with latest kernel if you haven't already. Thanks.
*** Bug 8638 has been marked as a duplicate of this bug. ***
I've consistently seen this problem, and it appears to be a deadlock where a net device cannot be released because it's being held by IPsec/racoon. I've tried this on stock kernels 2.6.21.5 and 2.6.23.1, both with the same issue. The scenario that makes this happen consistently for me is to have two pppoe processes in use at the same time, with an IPsec tunnel using racoon established over one of the links. If I then terminate the link (ppp0) that does NOT have the tunnel established over it, I get:

unregister_netdevice: waiting for ppp0 to become free. Usage count = 1

and both the pppd and racoon processes are held infinitely in a non-interruptible 'D' state, and the only option is to reboot the server. Killing the other pppoe session (ppp2), where the tunnel is established over that link, does not seem to cause this problem. Interestingly, even if I tell racoon to ONLY bind to the ppp device that has the tunnel, the problem still happens.
On 2.6.23.1, there are two setups which expose this bug for me:

1. eth0 (skge), eth1 (tulip), 6to4 (sit); IPv4 and IPv6 forwarding is in use. After about 24 hours of heavy traffic, attempting to bring down 6to4 hangs.

2. eth0 (skge), eth1 (tulip), br0 (bridge of eth0 and eth1). After about 24 hours of heavy traffic, attempting to bring down br0 hangs.

In both cases, dmesg shows a steady stream of something like:

unregister_netdevice: waiting for br0 to become free. Usage count = 647
Comment #56: I get exactly the same behavior, but I'm not using racoon (just a manual 2.6 ipsec configuration). I will be upgrading the relevant box to F8 very soon and will report back. That will move me from 2.6.20 to 2.6.23. I'm disappointed to hear you have the bug show up in .23 - that means this hasn't been resolved yet?!
Created attachment 13759 [details] fix xfrm state leak

This should fix the IPsec-related problems. As for the others: since this report is a complete mess of I-don't-know-how-many different problems, I'm going to remove myself from the CC list. If you want someone to actually look into this, I'd suggest opening new bugs for the different cases and adding full information about your network configuration.
Why don't we do just that. I will close the bug, and if anyone still has problems, please open a new entry. It is OK if we end up with duplicates; that's better than several problems in one bug... Patrick, are you planning to submit the patch, or was the code fixed in another way? Thanks.
The IPsec leak is already fixed upstream (5dba4797), using a slightly different patch.
Great, thanks! Closing the bug then.