Bug 34322

Summary: No ECN marking in IPv6
Product: Networking Reporter: Steinar H. Gunderson (steinar+kernel)
Component: IPV6Assignee: Hideaki YOSHIFUJI (yoshfuji)
Status: CLOSED CODE_FIX    
Severity: normal CC: florian, kenyon
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38.4 Subsystem:
Regression: No Bisected commit-id:

Description Steinar H. Gunderson 2011-05-03 20:04:57 UTC
Hi,

I'm completely unable to get ECN to work for IPv6. /proc/sys/net/ipv4/tcp_ecn is set to 1 on both sides, and ECN works just fine for IPv4 TCP connections, but when I connect over IPv6 tclass just stays at 0x0, and as far as I'd understood, there should be two bits here set to 10 (like in the diffserv field of IPv4), right?

I do get ECN-echo bits in the TCP header, though (for the initial SYN and SYN/ACK  packets).
Comment 1 Steinar H. Gunderson 2011-05-03 22:25:23 UTC
I looked at the code, and although I don't understand all of it, it looks like e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c (which also brought ECN for IPv6 SCTP) might be the culprit. It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch, what's checked is inet6_sk(sk)->tclass, which is a completely different field.

Is the analysis correct? Should the tclass be set as well?
Comment 2 Steinar H. Gunderson 2011-05-03 23:04:18 UTC
OK, as a quick hack, I did this:

--- a/linux-2.6.38.5/include/net/inet_ecn.h	2011-04-14 22:03:56.000000000 +0200
+++ b/linux-2.6.38.5/include/net/inet_ecn.h	2011-05-04 00:36:52.803377902 +0200
@@ -38,7 +38,7 @@
 	return outer;
 }
 
-#define	INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; } while (0)
+#define	INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; if (inet6_sk(sk) != NULL) inet6_sk(sk)->tclass |= INET_ECN_ECT_0; } while (0)
 #define	INET_ECN_dontxmit(sk) \
 	do { inet_sk(sk)->tos &= ~INET_ECN_MASK; } while (0)

and now my packets are properly marked with tclass 0x02 (ie., signalling ECN-capable transport, no congestion experienced yet).

I guess this isn't the right way of doing it, but at least it confirms that the lack of setting tclass is part of the problem.
Comment 3 Andrew Morton 2011-05-05 22:43:38 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 3 May 2011 20:05:00 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=34322
> 
>            Summary: No ECN marking in IPv6
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.38.4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV6
>         AssignedTo: yoshfuji@linux-ipv6.org
>         ReportedBy: sgunderson@bigfoot.com
>         Regression: No
> 
> 
> Hi,
> 
> I'm completely unable to get ECN to work for IPv6. /proc/sys/net/ipv4/tcp_ecn
> is set to 1 on both sides, and ECN works just fine for IPv4 TCP connections,
> but when I connect over IPv6 tclass just stays at 0x0, and as far as I'd
> understood, there should be two bits here set to 10 (like in the diffserv
> field
> of IPv4), right?
> 
> I do get ECN-echo bits in the TCP header, though (for the initial SYN and
> SYN/ACK  packets).
> 

and

> I looked at the code, and although I don't understand all of it, it
> looks like e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c (which also brought
> ECN for IPv6 SCTP) might be the culprit.  It seems like TCP_ECN_send()
> calls INET_ECN_xmit(), which only sets the ECN bit in the IPv4 ToS
> field (inet_sk(sk)->tos), but after the patch, what's checked is
> inet6_sk(sk)->tclass, which is a completely different field.
> 
> Is the analysis correct? Should the tclass be set as well?

and

> OK, as a quick hack, I did this:
> 
> --- a/linux-2.6.38.5/include/net/inet_ecn.h    2011-04-14
> 22:03:56.000000000+0200
> +++ b/linux-2.6.38.5/include/net/inet_ecn.h    2011-05-04
> 00:36:52.803377902+0200
> @@ -38,7 +38,7 @@
>      return outer;
>  }
> 
> -#define    INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; }
> while (0)
> +#define    INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; if
> (inet6_sk(sk) != NULL) inet6_sk(sk)->tclass |= INET_ECN_ECT_0; } while (0)
>  #define    INET_ECN_dontxmit(sk) \
>      do { inet_sk(sk)->tos &= ~INET_ECN_MASK; } while (0)
> 
> and now my packets are properly marked with tclass 0x02 (ie.,
> signalling ECN-capable transport, no congestion experienced yet).
> 
> I guess this isn't the right way of doing it, but at least it confirms
> that the lack of setting tclass is part of the problem.
Comment 4 Eric Dumazet 2011-05-06 15:05:22 UTC
Le jeudi 05 mai 2011 à 14:41 -0700, Andrew Morton a écrit :
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Tue, 3 May 2011 20:05:00 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=34322
> > 
> >            Summary: No ECN marking in IPv6
> >            Product: Networking
> >            Version: 2.5
> >     Kernel Version: 2.6.38.4
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: IPV6
> >         AssignedTo: yoshfuji@linux-ipv6.org
> >         ReportedBy: sgunderson@bigfoot.com
> >         Regression: No
> > 
> > 
> > Hi,
> > 
> > I'm completely unable to get ECN to work for IPv6.
> /proc/sys/net/ipv4/tcp_ecn
> > is set to 1 on both sides, and ECN works just fine for IPv4 TCP
> connections,
> > but when I connect over IPv6 tclass just stays at 0x0, and as far as I'd
> > understood, there should be two bits here set to 10 (like in the diffserv
> field
> > of IPv4), right?
> > 
> > I do get ECN-echo bits in the TCP header, though (for the initial SYN and
> > SYN/ACK  packets).
> > 
> 
> and
> 
> > I looked at the code, and although I don't understand all of it, it
> > looks like e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c (which also brought
> > ECN for IPv6 SCTP) might be the culprit.  It seems like TCP_ECN_send()
> > calls INET_ECN_xmit(), which only sets the ECN bit in the IPv4 ToS
> > field (inet_sk(sk)->tos), but after the patch, what's checked is
> > inet6_sk(sk)->tclass, which is a completely different field.
> > 
> > Is the analysis correct? Should the tclass be set as well?
> 
> and
> 
> > OK, as a quick hack, I did this:
> > 
> > --- a/linux-2.6.38.5/include/net/inet_ecn.h    2011-04-14
> 22:03:56.000000000+0200
> > +++ b/linux-2.6.38.5/include/net/inet_ecn.h    2011-05-04
> 00:36:52.803377902+0200
> > @@ -38,7 +38,7 @@
> >      return outer;
> >  }
> > 
> > -#define    INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; }
> while (0)
> > +#define    INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; if
> (inet6_sk(sk) != NULL) inet6_sk(sk)->tclass |= INET_ECN_ECT_0; } while (0)
> >  #define    INET_ECN_dontxmit(sk) \
> >      do { inet_sk(sk)->tos &= ~INET_ECN_MASK; } while (0)
> > 
> > and now my packets are properly marked with tclass 0x02 (ie.,
> > signalling ECN-capable transport, no congestion experienced yet).
> > 
> > I guess this isn't the right way of doing it, but at least it confirms
> > that the lack of setting tclass is part of the problem.
> 

Cc YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Hello Steinar

Analysis seems fine, but you also need to change INET_ECN_dontxmit() for
retransmitted packets.

Any chance you can refine your patch ?

Thanks !
Comment 5 Steinar H. Gunderson 2011-05-06 17:43:48 UTC
On Fri, May 06, 2011 at 05:04:52PM +0200, Eric Dumazet wrote:
> Analysis seems fine, but you also need to change INET_ECN_dontxmit() for
> retransmitted packets.
> 
> Any chance you can refine your patch ?

Sure, but is really checking against NULL the right way of checking for IPv6
sockets? I'd imagined I should have checked address family or something
instead...

/* Steinar */
Comment 6 Eric Dumazet 2011-05-07 09:45:18 UTC
From: Steinar H. Gunderson <sgunderson@bigfoot.com>

Le vendredi 06 mai 2011 à 19:12 +0200, Steinar H. Gunderson a écrit :

> Sure, but is really checking against NULL the right way of checking for IPv6
> sockets? I'd imagined I should have checked address family or something
> instead...
> 

It should be fine.

I cooked for you the official patch and made sure it worked with a RED
ECN setup, and one ipv6 tcp xmit.

# tc -s -d qdisc show dev eth1
...
qdisc red 11: parent 1:11 limit 120Kb min 8Kb max 80Kb ecn ewma 2 Plog 21 Scell_log 11
 Sent 114694826 bytes 76446 pkt (dropped 15, overlimits 485 requeues 0) 
 rate 12126Kbit 1011pps backlog 0b 0p requeues 0 
  marked 470 early 15 pdrop 0 other 0


Thanks again !

[PATCH] ipv6: restore correct ECN handling on TCP xmit

Since commit e9df2e8fd8fbc9 (Use appropriate sock tclass setting for
routing lookup) we lost ability to properly add ECN codemarks to ipv6
TCP frames.

It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the
ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch,
what's checked is inet6_sk(sk)->tclass, which is a completely different
field.

Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322

[Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros
by inline functions for clarity.

Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/net/inet_ecn.h |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
index 88bdd01..2fa8d13 100644
--- a/include/net/inet_ecn.h
+++ b/include/net/inet_ecn.h
@@ -38,9 +38,19 @@ static inline __u8 INET_ECN_encapsulate(__u8 outer, __u8 inner)
 	return outer;
 }
 
-#define	INET_ECN_xmit(sk) do { inet_sk(sk)->tos |= INET_ECN_ECT_0; } while (0)
-#define	INET_ECN_dontxmit(sk) \
-	do { inet_sk(sk)->tos &= ~INET_ECN_MASK; } while (0)
+static inline void INET_ECN_xmit(struct sock *sk)
+{
+	inet_sk(sk)->tos |= INET_ECN_ECT_0;
+	if (inet6_sk(sk) != NULL)
+		inet6_sk(sk)->tclass |= INET_ECN_ECT_0;
+}
+
+static inline void INET_ECN_dontxmit(struct sock *sk)
+{
+	inet_sk(sk)->tos &= ~INET_ECN_MASK;
+	if (inet6_sk(sk) != NULL)
+		inet6_sk(sk)->tclass &= ~INET_ECN_MASK;
+}
 
 #define IP6_ECN_flow_init(label) do {		\
       (label) &= ~htonl(INET_ECN_MASK << 20);	\
Comment 7 Steinar H. Gunderson 2011-05-07 09:59:45 UTC
On Sat, May 07, 2011 at 11:44:46AM +0200, Eric Dumazet wrote:
> I cooked for you the official patch and made sure it worked with a RED
> ECN setup, and one ipv6 tcp xmit.

Great, thanks :-) This looks good to me.

/* Steinar */
Comment 8 David S. Miller 2011-05-12 22:53:16 UTC
From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
Date: Sat, 7 May 2011 11:59:10 +0200

> On Sat, May 07, 2011 at 11:44:46AM +0200, Eric Dumazet wrote:
>> I cooked for you the official patch and made sure it worked with a RED
>> ECN setup, and one ipv6 tcp xmit.
> 
> Great, thanks :-) This looks good to me.

Applied, thanks everyone.
Comment 9 Steinar H. Gunderson 2011-05-12 22:55:34 UTC
On Thu, May 12, 2011 at 03:52:21PM -0700, David Miller wrote:
>>> I cooked for you the official patch and made sure it worked with a RED
>>> ECN setup, and one ipv6 tcp xmit.
>> Great, thanks :-) This looks good to me.
> Applied, thanks everyone.

Do you think it would be possible to get it backported to stable kernels?
(I've been told getting it into the 2.6.32.x series is the easiest way to get
it into Debian stable, and I guess it should be a candidate.)

/* Steinar */
Comment 10 David S. Miller 2011-05-13 00:05:47 UTC
From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
Date: Fri, 13 May 2011 00:54:58 +0200

> On Thu, May 12, 2011 at 03:52:21PM -0700, David Miller wrote:
>>>> I cooked for you the official patch and made sure it worked with a RED
>>>> ECN setup, and one ipv6 tcp xmit.
>>> Great, thanks :-) This looks good to me.
>> Applied, thanks everyone.
> 
> Do you think it would be possible to get it backported to stable kernels?
> (I've been told getting it into the 2.6.32.x series is the easiest way to get
> it into Debian stable, and I guess it should be a candidate.)

I only submit patches for what stable@kernel.org is actively maintaining
which is 2.6.38 right now.
Comment 11 Ben Hutchings 2011-05-13 04:58:22 UTC
On Fri, 2011-05-13 at 00:15 +0100, Steinar H. Gunderson wrote:
> On Thu, May 12, 2011 at 03:52:21PM -0700, David Miller wrote:
> >>> I cooked for you the official patch and made sure it worked with a RED
> >>> ECN setup, and one ipv6 tcp xmit.
> >> Great, thanks :-) This looks good to me.
> > Applied, thanks everyone.
> 
> Do you think it would be possible to get it backported to stable kernels?
> (I've been told getting it into the 2.6.32.x series is the easiest way to get
> it into Debian stable, and I guess it should be a candidate.)

If this doesn't apply directly to 2.6.32 then you'll need to prepare the
backport and send it to stable@kernel.org with a reference to the commit
in mainline.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
Comment 12 Steinar H. Gunderson 2011-05-13 09:45:38 UTC
On Fri, May 13, 2011 at 04:58:25AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> If this doesn't apply directly to 2.6.32 then you'll need to prepare the
> backport and send it to stable@kernel.org with a reference to the commit
> in mainline.

It applies directly (no rejects, no fuzz).

/* Steinar */
Comment 13 Florian Mickler 2011-05-19 07:23:28 UTC
A patch referencing this bug report has been merged in v2.6.39:

commit ca06707022d6ba4744198a8ebbe4994786b0c613
Author: Steinar H. Gunderson <sgunderson@bigfoot.com>
Date:   Fri May 6 23:44:46 2011 +0000

    ipv6: restore correct ECN handling on TCP xmit
Comment 14 Steinar H. Gunderson 2011-05-19 07:40:04 UTC
On Fri, May 13, 2011 at 11:45:04AM +0200, Steinar H. Gunderson wrote:
>> If this doesn't apply directly to 2.6.32 then you'll need to prepare the
>> backport and send it to stable@kernel.org with a reference to the commit
>> in mainline.
> It applies directly (no rejects, no fuzz).

Just to clear up: If it applies directly, will it automatically get picked up
in 2.6.32? Or should I sent it to stable@kernel.org myself?

/* Steinar */