There was a similar bug, #8747, but its fix is not complete.

Problem Description:

This concerns receiving MSG_ERRQUEUE messages on UDP and raw sockets, both connected and unconnected.

There is a bug in the net/ipv6/icmp.c code that prevents such messages from being delivered to the error queue of the corresponding raw socket when the socket is CONNECTED. The saddr/daddr pair used to find the raw socket is obtained from the wrong place.

Consider the __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is looked up the usual way, the call is something like:

    sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);

where "daddr" is the destination address of the incoming packet (in other words, our local address) and "saddr" is the source address of the incoming packet (the remote end).

But when the raw socket is looked up for an ICMP error report, in net/ipv6/icmp.c:icmpv6_notify(), daddr/saddr must be obtained from the echoed fragment of the "bad" packet, not from the IPv6 header of the ICMP packet itself.

Consider:

    ipv6_header -- icmp_header -- echoed_ipv6_header -- at_least_8_bytes ...

Currently the saddr/daddr used for __raw_v6_lookup() come from the first "ipv6_header", but they must come from the "echoed_ipv6_header".

In the previous bug #8747, I assumed that the issue was just a typo that switched saddr/daddr in the argument list. Unfortunately, it turns out that the pair is also obtained from the wrong place.

Steps to reproduce:

Create a raw socket, connect it to an address, and provoke an error: e.g. set ttl=1 where the remote address is more than 1 hop away. Set IPV6_RECVERR. Then send something and wait for the error (e.g. poll() with POLLERR|POLLIN). You should receive a "time exceeded" ICMP message (because of "ttl=1"), but the socket does not receive it.

If you do not connect your raw socket, you will receive the MSG_ERRQUEUE message successfully. (The reason is that for an unconnected socket there are no actual checks of the local/remote addresses.)
*** Bug 8747 has been marked as a duplicate of this bug. ***
Created attachment 15716: Patch to fix the issue completely.

The patch is simple enough. I have now tested that it actually fixes the issue.
I know that it is preferable to send patches to someone from the MAINTAINERS list, but it seems that my e-mails are being dropped by some dumb anti-spam filter :( Hence, sorry for the (possibly) many duplicate messages, if any.

The reason for setting the HIGH severity:

I'm the author, upstream and Fedora maintainer of the new traceroute(8) implementation for Linux, http://traceroute.sourceforge.net

"Connected raw sockets" are a useful feature for this implementation. Connecting the socket filters all alien packets out of the socket's input. Without it, the program receives all incoming raw packets (for the "icmp" and "tcp" tracerouting methods, which use raw sockets).

Because of the wrong assumption that bug #8747 fixed the issue, I implemented a check for whether the kernel version is at least 2.6.22.2 (where the bug should have been fixed), and if so, allowed raw IPv6 sockets to connect. But as of 2.6.22.2 the issue is still not fixed, and now any combination of "traceroute >= 2.0.8" and "kernel >= 2.6.22.2" makes "icmp" and "tcp" tracerouting impossible for IPv6. :(

Since Fedora 8 and the other distros that I know of (Gentoo, Kubuntu) already use traceroute-2.0.9, the issue affects them.

Now I have to ship an update. The easiest way for me is simply not to connect raw sockets for IPv6. But if the issue can be fixed in the near future, I'll change "2.6.22.2" to the new version to check...
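A version gate of the kind described above could look roughly like the following. This is only a sketch under stated assumptions: it is not traceroute's actual code, and the function names release_at_least() and running_kernel_at_least() are hypothetical. It compares the numeric prefix of the uname(2) release string against a four-part threshold, treating missing components as 0.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 1 if the release string (e.g. "2.6.24" or
 * "2.6.22.2-foo") is at least a.b.c.d, otherwise 0. Components missing
 * from the string count as 0; an unparsable string yields 0. */
int release_at_least(const char *release, int a, int b, int c, int d)
{
    int have[4] = { 0, 0, 0, 0 };
    int want[4] = { a, b, c, d };
    int i, n;

    assert(release != NULL);
    n = sscanf(release, "%d.%d.%d.%d",
               &have[0], &have[1], &have[2], &have[3]);
    if (n < 1)
        return 0;

    /* The first differing component decides the comparison. */
    for (i = 0; i < 4; i++) {
        if (have[i] != want[i])
            return have[i] > want[i];
    }
    return 1;   /* exactly equal */
}

/* Hypothetical wrapper: apply the check to the running kernel. */
int running_kernel_at_least(int a, int b, int c, int d)
{
    struct utsname u;

    if (uname(&u) < 0)
        return 0;
    return release_at_least(u.release, a, b, c, d);
}
```

A caller would connect the raw IPv6 socket only if running_kernel_at_least(2, 6, 22, 2) returns 1. As this report shows, such a gate is only as good as the assumption about which release carries the fix, so the threshold has to be updated once the real fix lands.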
Reply-To: akpm@linux-foundation.org

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Thu, 10 Apr 2008 05:53:39 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10437
>
>            Summary: MSG_ERRQUEUE messages do not pass to connected raw sockets
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.24
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: IPV6
>         AssignedTo: yoshfuji@linux-ipv6.org
>         ReportedBy: dmitry@butskoy.name
>
>
> There was a similar bug #8747 , but the fix is not complete.
>
> Problem Description:
>
> It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
> and raw sockets, both connected and unconnected.
>
> There is a bug in net/ipv6/icmp.c code, which prevents such messages to
> be delivered to the errqueue of the corresponding raw socket, when the socket
> is CONNECTED. The bug is related to wrong obtaining of the saddr/daddr pair,
> used to find the raw socket.
>
> Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
> looked up usual way, it is something like:
>
> sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);
>
> where "daddr" is a destination address of the incoming packet (IOW our local
> address), "saddr" is a source address of the incoming packet (the remote
> end).
>
> But when the raw socket is looked up for some icmp error report, in
> net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr must be obtained from the
> echoed fragment of the "bad" packet, not from the ipv6 header of the icmp
> packet itself.
>
> Consider:
>
> ipv6_header -- icmp_header -- echoed_ipv6_header -- at_least_8_bytes ...
>
> Now saddr/daddr, used for __raw_v6_lookup, are from the first "ipv6_header",
> but must be from the "echoed_ipv6_header" .
>
>
> In the previous bug #8747, I assumed that the issue is just a typo, by
> switching saddr/daddr in argument list. Unfortunately, it appears that the
> pair is even obtained from the wrong place...
>
>
> Steps to reproduce:
>
> Create some raw socket, connect it to an address, and cause some error
> situation: f.e. set ttl=1 where the remote address is more than 1 hop to
> reach.
> Set IPV6_RECVERR .
> Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
> You should receive "time exceeded" icmp message (because of "ttl=1"), but the
> socket does not receive it.
>
> If you do not connect your raw socket, you will receive MSG_ERRQUEUE
> successfully. (The reason is that for unconnected socket there are no actual
> checks for local/remote addresses).
>

(There's more info, and a patch at the above link).

Dmitry, I'd suggest that you send the patch via email to netdev@vger.kernel.org and to YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>.
From: Andrew Morton <akpm@linux-foundation.org>
Date: Thu, 10 Apr 2008 10:34:44 -0700

> Dmitry, I'd suggest that you send the patch via email to
> netdev@vger.kernel.org and to YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>.

This person is having trouble emailing things to the lists and elsewhere, for whatever reason, and he states this in the bugzilla entry so asking him to email a patch somewhere isn't going to help.

The code in this area has changed a bit, here is the patch I just checked into net-2.6 and which I'll push to Linus for 2.6.25 and will also submit for the 2.6.24-stable branch.

commit b45e9189c058bfa495073951ff461ee0eea968be
Author: David S. Miller <davem@davemloft.net>
Date:   Sun Apr 13 23:14:15 2008 -0700

    [IPV6]: Fix ipv6 address fetching in raw6_icmp_error().

    Fixes kernel bugzilla 10437

    Based almost entirely upon a patch by Dmitry Butskoy.

    When deciding what raw sockets to deliver the ICMPv6
    to, we should use the addresses in the ICMPv6 quoted
    IPV6 header, not the top-level one.

    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8897ccf..0a6fbc1 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -372,8 +372,10 @@ void raw6_icmp_error(struct sk_buff *skb, int nexthdr,
 	read_lock(&raw_v6_hashinfo.lock);
 	sk = sk_head(&raw_v6_hashinfo.ht[hash]);
 	if (sk != NULL) {
-		saddr = &ipv6_hdr(skb)->saddr;
-		daddr = &ipv6_hdr(skb)->daddr;
+		struct ipv6hdr *hdr = (struct ipv6hdr *) skb->data;
+
+		saddr = &hdr->saddr;
+		daddr = &hdr->daddr;
 		net = skb->dev->nd_net;
 
 		while ((sk = __raw_v6_lookup(net, sk, nexthdr, saddr, daddr,
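The idea behind the fix (take the lookup addresses from the quoted IPv6 header rather than the top-level one) can be illustrated in user space. The following is a minimal sketch, not kernel code: the helper echoed_addrs() is hypothetical, and it assumes the buffer holds a complete ICMPv6 error packet whose outer IPv6 header has no extension headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <netinet/icmp6.h>

/* Hypothetical helper. Packet layout assumed here:
 *
 *   outer ip6_hdr (40) | icmp6_hdr (8) | echoed ip6_hdr (40) | payload...
 *
 * Copies the source and destination of the echoed (original) packet.
 * Returns 0 on success, -1 if the buffer is too short. */
int echoed_addrs(const unsigned char *pkt, size_t len,
                 struct in6_addr *orig_src, struct in6_addr *orig_dst)
{
    const size_t off = sizeof(struct ip6_hdr) + sizeof(struct icmp6_hdr);
    struct ip6_hdr inner;

    assert(pkt != NULL && orig_src != NULL && orig_dst != NULL);

    if (len < off + sizeof(inner))
        return -1;

    /* memcpy avoids unaligned access into the raw packet buffer. */
    memcpy(&inner, pkt + off, sizeof(inner));
    *orig_src = inner.ip6_src;   /* local address: we sent the original */
    *orig_dst = inner.ip6_dst;   /* remote address the socket connected to */
    return 0;
}
```

Because the echoed packet is one that the local host sent, its source is the local address and its destination is the peer the socket was connected to. The outer header instead carries the address of whichever router generated the error, which is why matching a connected socket against it fails.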