Bug 15682 - XFRM is not updating RTAX_ADVMSS metric
Summary: XFRM is not updating RTAX_ADVMSS metric
Status: RESOLVED OBSOLETE
Alias: None
Product: Networking
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Arnaldo Carvalho de Melo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-02 17:33 UTC by Panisset
Modified: 2012-07-09 08:16 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.28-2
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Panisset 2010-04-02 17:33:35 UTC
I have been testing DSMIPv6 code which uses all kind of advanced
features of XFRM framework and I believe I have found a bug related to
update RTAX_ADVMSS route metric.
The XFRM code on net/xfrm/xfrm_policy.c by its functions
xfrm_init_pmtu and xfrm_bundle_ok updates RTAX_MTU route caching
metric however I believe it must update RTAX_ADVMSS as this later is
used by tcp connect function for adverting the MSS value on SYN
messages.

As MSS is not being updated by XFRM the TCP SYN messages (e.g.
originated from a internet browser)  is erroneously informing its MSS
(without taking into account the overhead added to IP packet size by
XFRM transformations).  One result of that is the browser gets
"frozen" after starts a TCP connection because TCP messages sent by
TCP server will never get to it (TCP server is sending too large
segments to browser).

Below I describe the changes I have done (on xfrm_init_pmtu and
xfrm_bundle_ok) and that seem to fix this problem:

xfrm_init_pmtu:
                 .
                 .
                 .

        dst->metrics[RTAX_MTU-1] = pmtu; // original code, below my changes

        if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
                 switch (dst->xfrm->props.family)
                 {
                 case AF_INET:
                 dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
pmtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
                 break;

                 case AF_INET6:
                 dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
pmtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
                            dev_net(dst->dev)->ipv6.
sysctl.ip6_rt_min_advmss);
                 break;
                 }

xfrm_bundle_ok:

               .
               .
               .

        dst->metrics[RTAX_MTU-1] = mtu; // original code, below my changes

        if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
                switch (dst->xfrm->props.family)
                {
                case AF_INET:
                        dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
int, mtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
                break;

                case AF_INET6:
                        dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
int, mtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),

dev_net(dst->dev)->ipv6.sysctl.ip6_rt_min_advmss);
                break;
                }

Regards,
Eduardo Panisset.
Comment 1 Andrew Morton 2010-04-05 19:52:40 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 2 Apr 2010 17:34:35 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=15682
> 
>            Summary: XFRM is not updating RTAX_ADVMSS metric
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.28-2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: eduardo.panisset@gmail.com
>         Regression: No
> 
> 
> I have been testing DSMIPv6 code which uses all kind of advanced
> features of XFRM framework and I believe I have found a bug related to
> update RTAX_ADVMSS route metric.
> The XFRM code on net/xfrm/xfrm_policy.c by its functions
> xfrm_init_pmtu and xfrm_bundle_ok updates RTAX_MTU route caching
> metric however I believe it must update RTAX_ADVMSS as this later is
> used by tcp connect function for adverting the MSS value on SYN
> messages.
> 
> As MSS is not being updated by XFRM the TCP SYN messages (e.g.
> originated from a internet browser)  is erroneously informing its MSS
> (without taking into account the overhead added to IP packet size by
> XFRM transformations).  One result of that is the browser gets
> "frozen" after starts a TCP connection because TCP messages sent by
> TCP server will never get to it (TCP server is sending too large
> segments to browser).
> 
> Below I describe the changes I have done (on xfrm_init_pmtu and
> xfrm_bundle_ok) and that seem to fix this problem:
> 
> xfrm_init_pmtu:
>                  .
>                  .
>                  .
> 
>         dst->metrics[RTAX_MTU-1] = pmtu; // original code, below my changes
> 
>         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
>                  switch (dst->xfrm->props.family)
>                  {
>                  case AF_INET:
>                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
> pmtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
>                  break;
> 
>                  case AF_INET6:
>                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
> pmtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
>                             dev_net(dst->dev)->ipv6.
> sysctl.ip6_rt_min_advmss);
>                  break;
>                  }
> 
> xfrm_bundle_ok:
> 
>                .
>                .
>                .
> 
>         dst->metrics[RTAX_MTU-1] = mtu; // original code, below my changes
> 
>         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
>                 switch (dst->xfrm->props.family)
>                 {
>                 case AF_INET:
>                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
> int, mtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
>                 break;
> 
>                 case AF_INET6:
>                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
> int, mtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
> 
> dev_net(dst->dev)->ipv6.sysctl.ip6_rt_min_advmss);
>                 break;
>                 }
>
Comment 2 Anonymous Emailer 2010-04-06 13:41:32 UTC
Reply-To: hadi@cyberus.ca

Herbert would give better answers. I dont think what Eduardo is 
doing is correct. You cant just start factoring in tcp headers
at the xfrm level - and besides, the mtu calculation
already takes care tunnel headers - so tcp should be able to 
compute correct MSS. 

cheers,
jamal

On Mon, 2010-04-05 at 12:50 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Fri, 2 Apr 2010 17:34:35 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=15682
> > 
> >            Summary: XFRM is not updating RTAX_ADVMSS metric
> >            Product: Networking
> >            Version: 2.5
> >     Kernel Version: 2.6.28-2
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >         AssignedTo: acme@ghostprotocols.net
> >         ReportedBy: eduardo.panisset@gmail.com
> >         Regression: No
> > 
> > 
> > I have been testing DSMIPv6 code which uses all kind of advanced
> > features of XFRM framework and I believe I have found a bug related to
> > update RTAX_ADVMSS route metric.
> > The XFRM code on net/xfrm/xfrm_policy.c by its functions
> > xfrm_init_pmtu and xfrm_bundle_ok updates RTAX_MTU route caching
> > metric however I believe it must update RTAX_ADVMSS as this later is
> > used by tcp connect function for adverting the MSS value on SYN
> > messages.
> > 
> > As MSS is not being updated by XFRM the TCP SYN messages (e.g.
> > originated from a internet browser)  is erroneously informing its MSS
> > (without taking into account the overhead added to IP packet size by
> > XFRM transformations).  One result of that is the browser gets
> > "frozen" after starts a TCP connection because TCP messages sent by
> > TCP server will never get to it (TCP server is sending too large
> > segments to browser).
> > 
> > Below I describe the changes I have done (on xfrm_init_pmtu and
> > xfrm_bundle_ok) and that seem to fix this problem:
> > 
> > xfrm_init_pmtu:
> >                  .
> >                  .
> >                  .
> > 
> >         dst->metrics[RTAX_MTU-1] = pmtu; // original code, below my changes
> > 
> >         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
> >                  switch (dst->xfrm->props.family)
> >                  {
> >                  case AF_INET:
> >                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
> > pmtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
> >                  break;
> > 
> >                  case AF_INET6:
> >                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
> > pmtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
> >                             dev_net(dst->dev)->ipv6.
> > sysctl.ip6_rt_min_advmss);
> >                  break;
> >                  }
> > 
> > xfrm_bundle_ok:
> > 
> >                .
> >                .
> >                .
> > 
> >         dst->metrics[RTAX_MTU-1] = mtu; // original code, below my changes
> > 
> >         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
> >                 switch (dst->xfrm->props.family)
> >                 {
> >                 case AF_INET:
> >                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
> > int, mtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
> >                 break;
> > 
> >                 case AF_INET6:
> >                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
> > int, mtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
> > 
> > dev_net(dst->dev)->ipv6.sysctl.ip6_rt_min_advmss);
> >                 break;
> >                 }
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Comment 3 Panisset 2010-04-06 19:02:53 UTC
Hi,

My intention is only to report a problem that I have faced. The
solution proposed isn't (I know that) probably the best one to adopt
as I'm not a kernel specialist, it is more illustrative to allow you
guys understanding what I'm meaning and solve that on the better way
(hence I haven't submited a patch).

Regards,
Eduardo Panisset.

On Tue, Apr 6, 2010 at 10:40 AM, jamal <hadi@cyberus.ca> wrote:
>
> Herbert would give better answers. I dont think what Eduardo is
> doing is correct. You cant just start factoring in tcp headers
> at the xfrm level - and besides, the mtu calculation
> already takes care tunnel headers - so tcp should be able to
> compute correct MSS.
>
> cheers,
> jamal
>
> On Mon, 2010-04-05 at 12:50 -0700, Andrew Morton wrote:
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Fri, 2 Apr 2010 17:34:35 GMT
>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>
>> > https://bugzilla.kernel.org/show_bug.cgi?id=15682
>> >
>> >            Summary: XFRM is not updating RTAX_ADVMSS metric
>> >            Product: Networking
>> >            Version: 2.5
>> >     Kernel Version: 2.6.28-2
>> >           Platform: All
>> >         OS/Version: Linux
>> >               Tree: Mainline
>> >             Status: NEW
>> >           Severity: normal
>> >           Priority: P1
>> >          Component: Other
>> >         AssignedTo: acme@ghostprotocols.net
>> >         ReportedBy: eduardo.panisset@gmail.com
>> >         Regression: No
>> >
>> >
>> > I have been testing DSMIPv6 code which uses all kind of advanced
>> > features of XFRM framework and I believe I have found a bug related to
>> > update RTAX_ADVMSS route metric.
>> > The XFRM code on net/xfrm/xfrm_policy.c by its functions
>> > xfrm_init_pmtu and xfrm_bundle_ok updates RTAX_MTU route caching
>> > metric however I believe it must update RTAX_ADVMSS as this later is
>> > used by tcp connect function for adverting the MSS value on SYN
>> > messages.
>> >
>> > As MSS is not being updated by XFRM the TCP SYN messages (e.g.
>> > originated from a internet browser)  is erroneously informing its MSS
>> > (without taking into account the overhead added to IP packet size by
>> > XFRM transformations).  One result of that is the browser gets
>> > "frozen" after starts a TCP connection because TCP messages sent by
>> > TCP server will never get to it (TCP server is sending too large
>> > segments to browser).
>> >
>> > Below I describe the changes I have done (on xfrm_init_pmtu and
>> > xfrm_bundle_ok) and that seem to fix this problem:
>> >
>> > xfrm_init_pmtu:
>> >                  .
>> >                  .
>> >                  .
>> >
>> >         dst->metrics[RTAX_MTU-1] = pmtu; // original code, below my
>> changes
>> >
>> >         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
>> >                  switch (dst->xfrm->props.family)
>> >                  {
>> >                  case AF_INET:
>> >                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
>> > pmtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
>> >                  break;
>> >
>> >                  case AF_INET6:
>> >                  dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned int,
>> > pmtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
>> >                             dev_net(dst->dev)->ipv6.
>> > sysctl.ip6_rt_min_advmss);
>> >                  break;
>> >                  }
>> >
>> > xfrm_bundle_ok:
>> >
>> >                .
>> >                .
>> >                .
>> >
>> >         dst->metrics[RTAX_MTU-1] = mtu; // original code, below my changes
>> >
>> >         if (dst->xfrm->props.mode == XFRM_MODE_TUNNEL)
>> >                 switch (dst->xfrm->props.family)
>> >                 {
>> >                 case AF_INET:
>> >                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
>> > int, mtu - sizeof(struct iphdr) - sizeof(struct tcphdr), 256);
>> >                 break;
>> >
>> >                 case AF_INET6:
>> >                         dst->metrics[RTAX_ADVMSS-1] = max_t(unsigned
>> > int, mtu - sizeof(struct ipv6hdr) - sizeof(struct tcphdr),
>> >
>> > dev_net(dst->dev)->ipv6.sysctl.ip6_rt_min_advmss);
>> >                 break;
>> >                 }
>> >
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
Comment 4 Anonymous Emailer 2010-04-07 12:46:47 UTC
Reply-To: hadi@cyberus.ca

Hi Eduardo,

As a first step, I dont know what the real cause is but 
what you showed as a solution didnt look right. I am also not familiar
with DSMIPv6. I know a lot of other xfrm subsystems work fine with
the current code. So thats why i pointed to Herbert who is more
knowledgeable. I am adding Yoshfuji to the list. 
If you tell me what kernel options to turn on and a very
basic test setup to reproduce it (I dont have much hardware, so make it
very minimal maybe requiring one or two PCs max) then I could make time
and try to reproduce it.

cheers,
jamal

On Tue, 2010-04-06 at 16:02 -0300, Eduardo Panisset wrote:
> Hi,
> 
> My intention is only to report a problem that I have faced. The
> solution proposed isn't (I know that) probably the best one to adopt
> as I'm not a kernel specialist, it is more illustrative to allow you
> guys understanding what I'm meaning and solve that on the better way
> (hence I haven't submited a patch).
> 
> Regards,
> Eduardo Panisset.
Comment 5 Herbert Xu 2010-04-07 14:27:03 UTC
On Tue, Apr 06, 2010 at 09:40:40AM -0400, jamal wrote:
> 
> Herbert would give better answers. I dont think what Eduardo is 
> doing is correct. You cant just start factoring in tcp headers
> at the xfrm level - and besides, the mtu calculation
> already takes care tunnel headers - so tcp should be able to 
> compute correct MSS. 

I think Eduardo is basically right.

Unfortunately, we've kind of stuffed up in how we handle ADVMSS.
In particular, we don't update it at all when we get a normal
non-IPsec PTMU message.

As a result, if we now start updating it it may actually break
existing users.

We do provide a locked bit for the user to use, however, because
the fact that we have never updated ADVMSS on the fly before,
people out there may not be aware of the locked bit.

Dave, what do you think about us starting to update ADVMSS on
the fly, just like the MTU?

The only risk is that existing users who are forcing ADVMSS to
a value higher than what it would otherwise be would now get a
lower value, if they're not using the "lock" keyword.

This shouldn't be fatal as it would only result in smaller packets
which should still work, and they can always start using the
"lock" keyword to get back the old behaviour.

Cheers,
Comment 6 David S. Miller 2010-04-07 23:48:02 UTC
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 7 Apr 2010 21:54:46 +0800

> Dave, what do you think about us starting to update ADVMSS on
> the fly, just like the MTU?

This sounds fine.

> The only risk is that existing users who are forcing ADVMSS to
> a value higher than what it would otherwise be would now get a
> lower value, if they're not using the "lock" keyword.
> 
> This shouldn't be fatal as it would only result in smaller packets
> which should still work, and they can always start using the
> "lock" keyword to get back the old behaviour.

And if we get them to fix their kit and use the lock setting,
it'll work in older kernels too.

Note You need to log in before you can comment on or make changes to this bug.