Bug 202355

Summary: UDP does not report all ICMP errors on connected sockets in violation of RFC1122 4.1.1.3
Product: Networking
Component: IPV4
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: From 0.98 until v5.0-rc1
Subsystem:
Regression: No
Bisected commit-id:
Reporter: Perry Lorier (linux)
Assignee: Stephen Hemminger (stephen)
CC: fweimer

Description Perry Lorier 2019-01-21 04:11:06 UTC
RFC1122 section 4.1.1.3 says:

            UDP MUST pass to the application layer all ICMP error
            messages that it receives from the IP layer. 

 -- https://tools.ietf.org/html/rfc1122#page-78

However, Linux appears to (mis)apply RFC1122 section 3.2.2.1 instead:

            A Destination Unreachable message that is received with code
            0 (Net), 1 (Host), or 5 (Bad Source Route) may result from a
            routing transient and MUST therefore be interpreted as only
            a hint, not proof, that the specified destination is
            unreachable [IP:11].  For example, it MUST NOT be used as
            proof of a dead gateway (see Section 3.3.1).

 -- https://tools.ietf.org/html/rfc1122#page-40


As a result, it does not report Destination Host/Net Unreachable, Source Route Failed, or Fragmentation Required back to userspace on connect()ed sockets, because it considers them transient.
See: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/icmp.c#n118
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/udp.c#n704
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/udp.c#n722

I believe it does this because it incorrectly treats these as "non-fatal" errors that should therefore not be reported on the socket back to userspace.  Currently this can be overridden as a side effect of setting IP_RECVERR.  Digging around, this behaviour was added in Linux 0.98 with the original import of the networking stack into the Linux kernel(!).
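
To make the behaviour described above concrete, here is a minimal userspace sketch of the reporting decision. It is a hypothetical model, not the kernel code: the names `icmp_err_model`, `model_table` and `reported_errno` are invented for illustration, the table covers only a subset of the Destination Unreachable codes, and the errno values are my reading of the table at the icmp.c link above, so treat them as assumptions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the reporting decision; not kernel code. */
struct icmp_err_model {
    int  code;   /* ICMP Destination Unreachable code */
    int  err;    /* errno the error is assumed to map to */
    bool fatal;  /* true if the stack treats it as a hard error */
};

/* Subset of the Destination Unreachable codes discussed in this report. */
static const struct icmp_err_model model_table[] = {
    { 0, ENETUNREACH,  false }, /* Net Unreachable */
    { 1, EHOSTUNREACH, false }, /* Host Unreachable */
    { 3, ECONNREFUSED, true  }, /* Port Unreachable */
    { 5, EOPNOTSUPP,   false }, /* Source Route Failed */
};

/*
 * Returns the errno that would be reported on the socket, or 0 if the
 * ICMP error would be silently dropped (or the code is not modelled).
 */
int reported_errno(int code, bool connected, bool recverr)
{
    for (size_t i = 0; i < sizeof(model_table) / sizeof(model_table[0]); i++) {
        const struct icmp_err_model *m = &model_table[i];
        if (m->code != code)
            continue;
        /* The error-queue option overrides the fatal/non-fatal split. */
        if (recverr || (m->fatal && connected))
            return m->err;
        return 0;
    }
    return 0;
}
```

In this model, `reported_errno(1, true, false)` returns 0 (Host Unreachable is dropped on a connected socket), while `reported_errno(1, true, true)` returns `EHOSTUNREACH`.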

This differs from RFC1122, from the *BSD implementations, and from people's expectations.  (E.g. see the thread https://lists.dns-oarc.net/pipermail/dns-operations/2019-January/018271.html)

Linux should be RFC1122 compliant and report all ICMP error messages back to userspace, *without* requiring the IP_RECVERR sockopt to be set.  Or, at the very least, this discrepancy should be very clearly documented.
Comment 1 Stephen Hemminger 2019-01-21 21:39:40 UTC
Unless there is any objection, I intend to close this bug
as "that is the way Linux works, we can't break userspace"


Comment 2 Perry Lorier 2019-01-21 22:51:12 UTC
Seems reasonable.

I'll investigate getting this clearly documented in the manpages.