Bug 10437 - MSG_ERRQUEUE messages do not pass to connected raw sockets
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV6
Hardware: All Linux
Importance: P1 normal
Assignee: Hideaki YOSHIFUJI
URL:
Keywords:
Duplicates: 8747
Depends on:
Blocks:
 
Reported: 2008-04-10 05:53 UTC by Dmitry Butskoy
Modified: 2008-04-11 07:55 UTC

See Also:
Kernel Version: (...,) 2.6.23, 2.6.24
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Patch to fix the issue completely. (515 bytes, patch)
2008-04-10 05:56 UTC, Dmitry Butskoy

Description Dmitry Butskoy 2008-04-10 05:53:36 UTC
There was a similar bug, #8747, but its fix is not complete.

Problem Description:

This concerns receiving MSG_ERRQUEUE messages on UDP and raw sockets,
both connected and unconnected.

There is a bug in the net/ipv6/icmp.c code which prevents such messages from
being delivered to the error queue of the corresponding raw socket when the
socket is CONNECTED. The bug is that the saddr/daddr pair used to find the raw
socket is obtained from the wrong place.

Consider the __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket
is looked up in the usual way, the call is something like:

sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);

where "daddr" is the destination address of the incoming packet (in other
words, our local address) and "saddr" is the source address of the incoming
packet (the remote end).
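
To illustrate why the address pair matters, here is a simplified user-space model of the address checks such a lookup performs. The names raw6_sock_model and raw6_model_match are hypothetical, and the protocol number and interface checks are left out; this is a sketch of the idea, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the addresses a raw IPv6 socket remembers. */
struct raw6_sock_model {
    uint8_t local[16];   /* bound local address, all zero = not bound   */
    uint8_t remote[16];  /* connected peer,      all zero = unconnected */
};

static bool addr_is_any(const uint8_t a[16])
{
    static const uint8_t zero[16];
    return memcmp(a, zero, 16) == 0;
}

/*
 * Return true if a packet with the given local/remote addresses would be
 * accepted by the socket. An unset address on the socket matches anything.
 */
bool raw6_model_match(const struct raw6_sock_model *sk,
                      const uint8_t loc_addr[16],
                      const uint8_t rmt_addr[16])
{
    if (!addr_is_any(sk->remote) && memcmp(sk->remote, rmt_addr, 16) != 0)
        return false;
    if (!addr_is_any(sk->local) && memcmp(sk->local, loc_addr, 16) != 0)
        return false;
    return true;
}
```

In this model an unconnected, unbound socket accepts every address pair, while a connected socket only accepts packets whose remote address equals its peer.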

But when the raw socket is looked up for an ICMP error report, in
net/ipv6/icmp.c:icmpv6_notify(), daddr/saddr must be obtained from the echoed
fragment of the "bad" packet, not from the IPv6 header of the ICMP packet itself.

Consider:

ipv6_header -- icmp_header -- echoed_ipv6_header -- at_least_8_bytes ...

Currently the saddr/daddr used for __raw_v6_lookup() are taken from the first "ipv6_header", but they must be taken from the "echoed_ipv6_header".
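
The difference between the two address pairs can be shown on a raw byte buffer. The following is a minimal user-space sketch, assuming the ICMPv6 header directly follows the outer IPv6 header (no extension headers); struct addr_pair and icmp6_error_pairs() are hypothetical names, not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV6_HDR_LEN   40  /* fixed IPv6 header size                      */
#define ICMPV6_HDR_LEN  8  /* type, code, checksum, 4 type-specific bytes */
#define IPV6_SRC_OFF    8  /* source address offset in an IPv6 header     */
#define IPV6_DST_OFF   24  /* destination address offset                  */

struct addr_pair {
    uint8_t saddr[16];
    uint8_t daddr[16];
};

/* Copy the source/destination addresses of the IPv6 header at hdr. */
static void read_pair(const uint8_t *hdr, struct addr_pair *out)
{
    memcpy(out->saddr, hdr + IPV6_SRC_OFF, 16);
    memcpy(out->daddr, hdr + IPV6_DST_OFF, 16);
}

/*
 * Extract both address pairs from an ICMPv6 error packet laid out as
 *   ipv6_header -- icmp_header -- echoed_ipv6_header -- ...
 * Returns 0 on success, -1 if the packet is too short.
 */
int icmp6_error_pairs(const uint8_t *pkt, size_t len,
                      struct addr_pair *outer, struct addr_pair *echoed)
{
    if (len < IPV6_HDR_LEN + ICMPV6_HDR_LEN + IPV6_HDR_LEN)
        return -1;
    read_pair(pkt, outer);
    read_pair(pkt + IPV6_HDR_LEN + ICMPV6_HDR_LEN, echoed);
    return 0;
}
```

For a "time exceeded" error sent by an intermediate router, outer.saddr is the router's address, whereas echoed.daddr is the remote end the socket was connected to. Only the echoed pair can match a connected socket.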


In the previous bug #8747, I assumed that the issue was just a typo, with saddr/daddr switched in the argument list. Unfortunately, it appears that the pair is also obtained from the wrong place...


Steps to reproduce:

Create a raw socket, connect it to an address, and provoke an error: for
example, set ttl=1 where the remote address is more than one hop away.
Set IPV6_RECVERR.
Then send something and wait for the error (for example, poll() with
POLLERR|POLLIN). You should receive a "time exceeded" ICMP message (because of
"ttl=1"), but the socket does not receive it.

If you do not connect your raw socket, you will receive the MSG_ERRQUEUE
message successfully. (The reason is that for an unconnected socket there are
no actual checks of the local/remote addresses.)
Comment 1 Dmitry Butskoy 2008-04-10 05:54:11 UTC
*** Bug 8747 has been marked as a duplicate of this bug. ***
Comment 2 Dmitry Butskoy 2008-04-10 05:56:22 UTC
Created attachment 15716
Patch to fix the issue completely.

The patch is simple enough. I have tested that it actually fixes the issue.
Comment 3 Dmitry Butskoy 2008-04-10 06:08:54 UTC
I know that it is preferable to send patches to someone from the MAINTAINERS list, but it seems that my e-mails are being dropped by some dumb anti-spam filter :(  Hence, sorry for the (possibly many) duplicate messages, if any.


The reason for setting HIGH severity:

I am the author, upstream maintainer, and Fedora maintainer of the new traceroute(8) implementation for Linux, http://traceroute.sourceforge.net

Connected raw sockets are a useful feature for this implementation: connecting filters all alien packets out of the socket's input. Without it, the program receives all incoming raw packets (for the "icmp" and "tcp" tracerouting methods, which use raw sockets).

Wrongly assuming that the fix for bug #8747 resolved the issue, I implemented a check for whether the kernel version is 2.6.22.2 or later (where the bug should have been fixed) and, if so, allowed raw IPv6 sockets to connect. But as of 2.6.22.2 the issue is still not fixed, so any combination of "traceroute >= 2.0.8" and "kernel >= 2.6.22.2" makes "icmp" and "tcp" tracerouting over IPv6 impossible.  :(
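
A version check of this kind might look roughly like the sketch below. It is not the code traceroute actually uses; release_at_least() and running_kernel_at_least() are hypothetical helpers, and the sketch assumes the release string from uname() starts with up to four dot-separated numbers.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Return 1 if a release string such as "2.6.24" or "2.6.22.2-1.fc8"
 * denotes version a.b.c.d or later, 0 otherwise.
 */
int release_at_least(const char *release, int a, int b, int c, int d)
{
    int want[4] = { a, b, c, d };
    int have[4] = { 0, 0, 0, 0 };  /* missing components count as 0 */
    int i;

    if (sscanf(release, "%d.%d.%d.%d",
               &have[0], &have[1], &have[2], &have[3]) < 1)
        return 0;

    /* the first differing component decides the comparison */
    for (i = 0; i < 4; i++) {
        if (have[i] != want[i])
            return have[i] > want[i];
    }
    return 1;
}

/* Apply the same comparison to the running kernel. */
int running_kernel_at_least(int a, int b, int c, int d)
{
    struct utsname uts;

    if (uname(&uts) < 0)
        return 0;
    return release_at_least(uts.release, a, b, c, d);
}
```

With such a helper, the decision to connect raw IPv6 sockets reduces to one call like running_kernel_at_least(2, 6, 22, 2), and only the version numbers need to change once a kernel with the real fix is released.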

Since Fedora 8 and the other distros I know about (Gentoo, Kubuntu) already ship traceroute-2.0.9, the issue affects them. Now I have to ship an update.

The easiest way for me is simply not to connect raw sockets for IPv6. But if the issue can be fixed in the near future, I'll change "2.6.22.2" in the check to the new version...
Comment 4 Anonymous Emailer 2008-04-10 10:35:17 UTC
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 10 Apr 2008 05:53:39 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10437
> 
>            Summary: MSG_ERRQUEUE messages do not pass to connected raw
>                     sockets
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.24
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: IPV6
>         AssignedTo: yoshfuji@linux-ipv6.org
>         ReportedBy: dmitry@butskoy.name
> 

(There's more info, and a patch at the above link).

Dmitry, I'd suggest that you send the patch via email to
netdev@vger.kernel.org and to YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>.
Comment 5 David S. Miller 2008-04-13 23:16:16 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Thu, 10 Apr 2008 10:34:44 -0700

> Dmitry, I'd suggest that you send the patch via email to
> netdev@vger.kernel.org and to YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>.

This person is having trouble emailing things to the lists and elsewhere,
for whatever reason, and he states this in the bugzilla entry so asking
him to email a patch somewhere isn't going to help.

The code in this area has changed a bit, here is the patch I just checked
into net-2.6 and which I'll push to Linus for 2.6.25 and will also submit
for the 2.6.24-stable branch.

commit b45e9189c058bfa495073951ff461ee0eea968be
Author: David S. Miller <davem@davemloft.net>
Date:   Sun Apr 13 23:14:15 2008 -0700

    [IPV6]: Fix ipv6 address fetching in raw6_icmp_error().
    
    Fixes kernel bugzilla 10437
    
    Based almost entirely upon a patch by Dmitry Butskoy.
    
    When deciding what raw sockets to deliver the ICMPv6
    to, we should use the addresses in the ICMPv6 quoted
    IPV6 header, not the top-level one.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8897ccf..0a6fbc1 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -372,8 +372,10 @@ void raw6_icmp_error(struct sk_buff *skb, int nexthdr,
 	read_lock(&raw_v6_hashinfo.lock);
 	sk = sk_head(&raw_v6_hashinfo.ht[hash]);
 	if (sk != NULL) {
-		saddr = &ipv6_hdr(skb)->saddr;
-		daddr = &ipv6_hdr(skb)->daddr;
+		struct ipv6hdr *hdr = (struct ipv6hdr *) skb->data;
+
+		saddr = &hdr->saddr;
+		daddr = &hdr->daddr;
 		net = skb->dev->nd_net;
 
 		while ((sk = __raw_v6_lookup(net, sk, nexthdr, saddr, daddr,
