Most recent kernel where this bug did not occur: none known (yet)
Distribution: reproduced on Debian/stable, SuSE/10.0, SuSE/10.1
Hardware Environment: reproduced on UML, i386, x86/64
Software Environment: reproduced with openvpn and UML tap devices
Problem Description: after adding IPv6 to my previously working openvpn
tunneling setup, a (really old) IPv6-related bug started to occur:
http://lkml.org/lkml/2003/8/21/1
I also reproduced this bug with kernel 2.6.15.1(vanilla,uml) and
2.6.16.13(SuSE-version,x86/64) and linux-2.6.13 (SuSE-version,i386)

Steps to reproduce:
    echo 0 > /proc/sys/net/ipv6/conf/all/forwarding # this is important initialization

Have (any version of) openvpn open a tunnel using a tap (virtual ethernet)
device. In the "up" script do:
    echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
This can be easily tested with these lines:
    apt-get install openvpn
    modprobe tun
    mknod /dev/net/tun c 10 200
    echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
    echo "echo 1 > /proc/sys/net/ipv6/conf/all/forwarding" > /tmp/up ; chmod a+x /tmp/up
    openvpn --dev-type tap --remote tunnel.lsmod.de 5003 --ifconfig 10.9.0.2 255.255.255.0 --dev-node /dev/net/tun --up /tmp/up
    # at this point you can verify your tunnel setup by ping 10.9.0.1
    # on the server I have this: openvpn --dev-type tap --ifconfig 10.9.0.1 255.255.255.0 --port 5003 --dev-node /dev/net/tun --float
    # you need UDP port 5003 to pass through your firewall for this

Alternatively get a user-mode-linux (UML) binary and do something along the
lines of:
    apt-get install uml-utilities
    TAP=`tunctl -b`
    ifconfig $TAP 192.168.121.1 netmask 255.255.255.252
    echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
    /path/to/linux eth0=tuntap,$TAP ... # booting up to the point where the tap dev is really bound (at "ifconfig eth0 192.168.121.2" within the UML)
    tunctl -d $TAP

After 20 seconds kill the openvpn or linux process.
This hangs indefinitely, leaving the openvpn process in "D" state.
syslog states every 10 secs:
    unregister_netdevice: waiting for tap0 to become free. Usage count = 1

The kernel will then hang "ifconfig" and "ip" commands, probably because the
waiting-for-tap0 still holds a mutex.

After a dozen reboots of trial and error I found a workaround: replacing the
critical line with
    (sleep 2 ; echo 1 > /proc/sys/net/ipv6/conf/all/forwarding )&

A sleep 1 does not suffice.
Doing the echo before calling openvpn also works fine, so there seems to be a
timing problem or race condition during initialization of IPv6 on the newly
created tap0 device.
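To spot the hang from a script rather than by reading syslog, the message quoted above can be matched mechanically. The following is only a sketch: the function name stuck_netdevs is made up for illustration, and it assumes the log line has exactly the wording shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): read kernel log lines on stdin
# and print "<device> <usage count>" for every line of the form
#   unregister_netdevice: waiting for tap0 to become free. Usage count = 1
# The message wording is assumed to match the one quoted in this report.
stuck_netdevs() {
    sed -n 's/.*unregister_netdevice: waiting for \([^ ]*\) to become free\. Usage count = \([0-9][0-9]*\).*/\1 \2/p'
}
```

It could be used as "dmesg | stuck_netdevs | sort -u"; any output means a device is stuck waiting for its last reference to be dropped.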
Now I found an even simpler way to trigger this bug, not needing any separate
software.

    # preparation
    ifconfig eth0 down
    echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
    modprobe ipv6
    # triggering - this line should be executed as one statement
    ifconfig eth0 up ; echo 1 > /proc/sys/net/ipv6/conf/eth0/forwarding
    ifconfig eth0 down
    rmmod $ethernetdriver # this hangs because of the refcount-leak

I also reproduced the bug on a 2.4.21 kernel, so this bug is REALLY old.
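To confirm that a hung rmmod (or openvpn) really sits in uninterruptible sleep, its state letter can be read from /proc/<pid>/stat. This is a minimal sketch; both function names are invented for illustration, and it relies on the state being the first field after the parenthesised command name, as described in proc(5).

```shell
#!/bin/sh
# Hypothetical helpers, names chosen for illustration only.

# Extract the state letter from one line in /proc/<pid>/stat format.
# The command name is wrapped in parentheses and may itself contain spaces
# or ")", so everything up to the last ") " is stripped first.
state_from_stat() {
    rest=${1##*") "}
    printf '%s\n' "${rest%% *}"
}

# Succeed if the given PID is currently in uninterruptible sleep ("D").
# A PID that no longer exists yields an empty state and therefore fails.
is_uninterruptible() {
    [ "$(state_from_stat "$(cat "/proc/$1/stat" 2>/dev/null)")" = "D" ]
}
```

For example: for pid in $(pgrep -x rmmod); do is_uninterruptible "$pid" && echo "$pid is stuck in D state"; done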
bugme-daemon@bugzilla.kernel.org wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6698
>
>            Summary: unregister_netdevice hangs indefinitely from
>                     /proc/sys/net/ipv6/conf/all/forwarding
>     Kernel Version: 2.6.17-rc6
>             Status: NEW
>           Severity: normal
>              Owner: yoshfuji@linux-ipv6.org
>          Submitter: kernelbmw@lsmod.de
>

Thought to be an ipv6 refcount leak.
I completely failed to reproduce this with either 2.6.16 or 2.6.17. Could you please do a cat /proc/net/igmp6 just before you bring eth0 down and unload the module? Thanks,
I reproduced this bug with 2.6.18-rc2 today (with even worse impact) and wrote
a script to reproduce it easily. Sometimes it fails to set the IPv6 address and
does not trigger the bug; in that case just run it a second time.
It is uploaded to http://www3.zq1.de/~bernhard/linux/6698/
I also uploaded my kernel config (the x86_64 one) and the igmp6 outputs taken
before the rmmod. (igmp6-bad is the one from when the bug is happening.)

diff igmp6-good igmp6-bad
2d1
< 6 eth0 ff0200000000000000000001ff000000 1 00000004 0
3a3,5
> 6 eth0 ff0200000000000000000001ff000000 2 00000004 0
> 6 eth0 ff020000000000000000000000000002 1 00000004 0
> 6 eth0 ff0200000000000000000001ff000001 1 00000004 0

### script to cause bug 6698 ###
#!/bin/sh
drv=sk98lin
echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
modprobe ipv6
ifconfig eth0 down
ifconfig eth1 down
rmmod forcedeth
rmmod $drv
modprobe $drv
# the bug happens in the next line
ifconfig eth0 192.168.0.2 up ; ifconfig eth0 add 2003:1234:1234::1/64 ; echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
sleep 9
cat /proc/net/igmp6 > /root/igmp6.$$
rmmod $drv

This should help you get started... err, get your kernel crashed :-)
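A plain diff of two igmp6 snapshots is hard to read because entries also move between lines. Below is a rough sketch of a per-group comparison. The function name igmp6_diff is made up, and it assumes the columns of /proc/net/igmp6 are interface index, device, group address, user count, flags and timer, so the fourth column is taken to be the per-group user count. That layout is an assumption and not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved copies of /proc/net/igmp6 and print
# every (device, group) pair whose user count differs, as "dev group old -> new".
# Assumed column layout: ifindex, device, group address, users, flags, timer.
igmp6_diff() {
    # $1: snapshot from a good run, $2: snapshot from a bad run
    awk '
        FILENAME == ARGV[1] {            # first file: remember the counts
            good[$2 " " $3] = $4 + 0
            next
        }
        {                                # second file: report changed counts
            key = $2 " " $3
            old = (key in good) ? good[key] : 0
            if (($4 + 0) != old) print key, old, "->", $4
            seen[key] = 1
        }
        END {                            # groups present only in the first file
            for (key in good) if (!(key in seen)) print key, good[key], "->", 0
        }
    ' "$1" "$2"
}
```

Run as "igmp6_diff igmp6-good igmp6-bad". Going by the diff above, it should report the solicited-node group ff0200000000000000000001ff000000 going from 1 to 2 users, plus the two groups that appear only in the bad case.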
Any updates on this bug? There have been multiple patches to ipv6 lately. Would you please test with the latest git? Thanks.
Please reopen this bug if it's still present with kernel 2.6.22.