Bug 6530

Summary: traffic freezes when sending large amount of data over pptp
Product: Drivers Reporter: xeb
Component: NetworkAssignee: Jeff Garzik (jgarzik)
Status: CLOSED CODE_FIX    
Severity: normal CC: dsd
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.16 Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 6402    

Description xeb 2006-05-10 02:19:31 UTC
Most recent kernel where this bug did not occur:
Distribution:  gentoo 2005.1
Hardware Environment: i386 (Intel(R) Celeron(R) M processor         1.40GHz), RAM 256M
Software Environment:
Problem Description:
Hello.
I tried to setup pptp (pptpd) service on my linux box, but it don't work correctly.
Problem: traffic freezes when sending large amount of data.

I find lack in ppp_async.c. 

	/* try to push more stuff out */
	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
		ppp_output_wakeup(&ap->chan);
---->> I added this lines and pptpd starts work.
	else
	{
	    if (test_bit(XMIT_FULL, &ap->xmit_flags))
		ppp_asynctty_wakeup(ap->tty);
	}
Comment 1 Andrew Morton 2006-05-10 02:36:10 UTC
bugme-daemon@bugzilla.kernel.org wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6530
> 
>            Summary: MAINLINE
>     Kernel Version: 2.6.16
>             Status: NEW
>           Severity: normal
>              Owner: jgarzik@pobox.com
>          Submitter: xeb@mail.ru
> 
> 
> Most recent kernel where this bug did not occur:
> Distribution:  gentoo 2005.1
> Hardware Environment: i386 (Intel(R) Celeron(R) M processor         1.40GHz), RAM 256M
> Software Environment:
> Problem Description:
> Hello.
> I tried to setup pptp (pptpd) service on my linux box, but it don't work correctly.
> Problem: traffic freezes when sending large amount of data.
> 
> I find lack in ppp_async.c. 
> 
> 	/* try to push more stuff out */
> 	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
> 		ppp_output_wakeup(&ap->chan);
> ---->> I added this lines and pptpd starts work.
> 	else
> 	{
> 	    if (test_bit(XMIT_FULL, &ap->xmit_flags))
> 		ppp_asynctty_wakeup(ap->tty);
> 	}
> 

hm, a PPP fix.  We seem to need some of those lately.

Paul, does this look sane?


From: <xeb@mail.ru>

Adapted from  http://bugzilla.kernel.org/show_bug.cgi?id=6530

Reschedule the async Tx tasklet if the transmit queue was full.

Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/net/ppp_async.c |    2 ++
 1 files changed, 2 insertions(+)

diff -puN drivers/net/ppp_async.c~ppp_async-hang-fix drivers/net/ppp_async.c
--- devel/drivers/net/ppp_async.c~ppp_async-hang-fix	2006-05-10 02:28:15.000000000 -0700
+++ devel-akpm/drivers/net/ppp_async.c	2006-05-10 02:28:53.000000000 -0700
@@ -516,6 +516,8 @@ static void ppp_async_process(unsigned l
 	/* try to push more stuff out */
 	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
 		ppp_output_wakeup(&ap->chan);
+	else if (test_bit(XMIT_FULL, &ap->xmit_flags))
+		ppp_asynctty_wakeup(ap->tty);
 }
 
 /*
_

Comment 2 Paul Mackerras 2006-05-10 03:27:09 UTC
Andrew Morton writes:

> hm, a PPP fix.  We seem to need some of those lately.
> 
> Paul, does this look sane?

/me pages in 7 year old code...

> @@ -516,6 +516,8 @@ static void ppp_async_process(unsigned l
>  	/* try to push more stuff out */
>  	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
>  		ppp_output_wakeup(&ap->chan);
> +	else if (test_bit(XMIT_FULL, &ap->xmit_flags))
> +		ppp_asynctty_wakeup(ap->tty);

ppp_asynctty_wakeup is supposed to be called by the serial driver when
it can take more output.  It's slightly bogus having ppp_async call it
itself whether or not the serial driver can take more output at the
moment, but I suppose it won't hurt.  I would really like to know the
precise circumstances where we need this fake wakeup though.  Is the
serial driver failing to give us a wakeup call where it should, or is
ppp_async ignoring a wakeup for some reason?

I think the same effect could be achieved without an extra trip
through tasklet_schedule et al. by making those lines look like this
(untested):

	if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
             test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
		ppp_output_wakeup(&ap->chan);

so that ppp_async_push gets called if either XMIT_WAKEUP or XMIT_FULL
is set.

This is all relying on getting some input to kick off more output when
the wakeup gets missed, though.  That's a reasonable workaround in most
situations, I guess, but I'd really like to know why the wakeup is
getting missed.

Paul.

Comment 3 xeb 2006-05-10 04:41:03 UTC
Boys,  in this construction:
          if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
             test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
                  ppp_output_wakeup(&ap->chan);

if ppp_async_push()  doesn't send any data i.e. XMIT_FULL is set then all (transfer) hangs up 
while somebody push again, for instance lcp-echo.
 
Comment 4 Andrew Morton 2006-05-10 20:31:52 UTC
Paul Mackerras <paulus@samba.org> wrote:
>
> Andrew Morton writes:
> 
> > hm, a PPP fix.  We seem to need some of those lately.
> > 
> > Paul, does this look sane?
> 
> /me pages in 7 year old code...
> 
> > @@ -516,6 +516,8 @@ static void ppp_async_process(unsigned l
> >  	/* try to push more stuff out */
> >  	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
> >  		ppp_output_wakeup(&ap->chan);
> > +	else if (test_bit(XMIT_FULL, &ap->xmit_flags))
> > +		ppp_asynctty_wakeup(ap->tty);
> 
> ppp_asynctty_wakeup is supposed to be called by the serial driver when
> it can take more output.  It's slightly bogus having ppp_async call it
> itself whether or not the serial driver can take more output at the
> moment, but I suppose it won't hurt.  I would really like to know the
> precise circumstances where we need this fake wakeup though.  Is the
> serial driver failing to give us a wakeup call where it should, or is
> ppp_async ignoring a wakeup for some reason?
> 
> I think the same effect could be achieved without an extra trip
> through tasklet_schedule et al. by making those lines look like this
> (untested):
> 
> 	if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
>              test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
> 		ppp_output_wakeup(&ap->chan);
> 
> so that ppp_async_push gets called if either XMIT_WAKEUP or XMIT_FULL
> is set.
> 
> This is all relying on getting some input to kick off more output when
> the wakeup gets missed, though.  That's a reasonable workaround in most
> situations, I guess, but I'd really like to know why the wakeup is
> getting missed.
> 

(xeb, on this bug please respond via email using reply-to-all rather than
via the bugzilla web form).

xeb has said:

in this construction:

          if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
             test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
                  ppp_output_wakeup(&ap->chan);

if ppp_async_push() doesn't send any data i.e.  XMIT_FULL is set then all
(transfer) hangs up while somebody push again, for instance lcp-echo.  

Comment 5 Paul Mackerras 2006-05-10 21:01:29 UTC
Andrew Morton writes:

> xeb has said:
> 
> in this construction:
> 
>           if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
>              test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
>                   ppp_output_wakeup(&ap->chan);
> 
> if ppp_async_push() doesn't send any data i.e.  XMIT_FULL is set then all
> (transfer) hangs up while somebody push again, for instance lcp-echo.  

If XMIT_FULL and ppp_async_push doesn't send any data, that means the
serial driver's output buffer was full.  If that's the case, *and* we
don't see a call to ppp_output_wakeup, then the finger points squarely
at the serial driver as the source of the bug.

Paul.

Comment 6 Andrew Morton 2006-05-10 21:08:34 UTC
Paul Mackerras <paulus@samba.org> wrote:
>
> Andrew Morton writes:
> 
> > xeb has said:
> > 
> > in this construction:
> > 
> >           if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
> >              test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
> >                   ppp_output_wakeup(&ap->chan);
> > 
> > if ppp_async_push() doesn't send any data i.e.  XMIT_FULL is set then all
> > (transfer) hangs up while somebody push again, for instance lcp-echo.  
> 
> If XMIT_FULL and ppp_async_push doesn't send any data, that means the
> serial driver's output buffer was full.  If that's the case, *and* we
> don't see a call to ppp_output_wakeup, then the finger points squarely
> at the serial driver as the source of the bug.
> 

OK, thanks.  So the next question is: which driver is being used?

Comment 7 Anonymous Emailer 2006-05-10 22:50:56 UTC
Reply-To: andy@andynet.net

On Wed, 2006-05-10 at 20:27 +1000, Paul Mackerras wrote:
> Andrew Morton writes:
> 
> > hm, a PPP fix.  We seem to need some of those lately.
> > 
> > Paul, does this look sane?
> 
> /me pages in 7 year old code...
> 
> > @@ -516,6 +516,8 @@ static void ppp_async_process(unsigned l
> >  	/* try to push more stuff out */
> >  	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
> >  		ppp_output_wakeup(&ap->chan);
> > +	else if (test_bit(XMIT_FULL, &ap->xmit_flags))
> > +		ppp_asynctty_wakeup(ap->tty);
> 
> ppp_asynctty_wakeup is supposed to be called by the serial driver when
> it can take more output.

How does the serial driver know it has to call ppp_asynctty_wakeup()?
I'd have thought it wouldn't know or care if it's being used for ppp or
any other async traffic.

There were a bunch of changes to the serial drivers between 2.6.15 and
2.6.16, maybe that's where this problem was introduced. Do we know which
serial driver is involved in the original report?

(I'm interested in this because I need to convert an out of tree serial
driver for 2.6.16+, I'm wondering if I'll need to do anything special to
support ppp).

>   It's slightly bogus having ppp_async call it
> itself whether or not the serial driver can take more output at the
> moment, but I suppose it won't hurt.  I would really like to know the
> precise circumstances where we need this fake wakeup though.  Is the
> serial driver failing to give us a wakeup call where it should, or is
> ppp_async ignoring a wakeup for some reason?
> 
> I think the same effect could be achieved without an extra trip
> through tasklet_schedule et al. by making those lines look like this
> (untested):
> 
> 	if ((test_bit(XMIT_WAKEUP, &ap->xmit_flags) ||
>              test_bit(XMIT_FULL, &ap->xmit_flags)) && ppp_async_push(ap))
> 		ppp_output_wakeup(&ap->chan);
> 
> so that ppp_async_push gets called if either XMIT_WAKEUP or XMIT_FULL
> is set.
> 
> This is all relying on getting some input to kick off more output when
> the wakeup gets missed, though.  That's a reasonable workaround in most
> situations, I guess, but I'd really like to know why the wakeup is
> getting missed.
> 
> Paul.
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comment 8 Andrew Morton 2006-05-10 22:58:41 UTC
Andy Gay <andy@andynet.net> wrote:
>
> On Wed, 2006-05-10 at 20:27 +1000, Paul Mackerras wrote:
> > Andrew Morton writes:
> > 
> > > hm, a PPP fix.  We seem to need some of those lately.
> > > 
> > > Paul, does this look sane?
> > 
> > /me pages in 7 year old code...
> > 
> > > @@ -516,6 +516,8 @@ static void ppp_async_process(unsigned l
> > >  	/* try to push more stuff out */
> > >  	if (test_bit(XMIT_WAKEUP, &ap->xmit_flags) && ppp_async_push(ap))
> > >  		ppp_output_wakeup(&ap->chan);
> > > +	else if (test_bit(XMIT_FULL, &ap->xmit_flags))
> > > +		ppp_asynctty_wakeup(ap->tty);
> > 
> > ppp_asynctty_wakeup is supposed to be called by the serial driver when
> > it can take more output.
> 
> How does the serial driver know it has to call ppp_asynctty_wakeup()?
> I'd have thought it wouldn't know or care if it's being used for ppp or
> any other async traffic.

xeb (who forgot to do reply-to-all) tells me that pptpd uses ptys.

Comment 9 Paul Mackerras 2006-05-11 18:58:39 UTC
Andy Gay writes:

> How does the serial driver know it has to call ppp_asynctty_wakeup()?

The serial driver is supposed to call the line discipline's wakeup
function when it has room in the output buffer and the
TTY_DO_WRITE_WAKEUP bit is set in tty->flags.  When the serial port is
set to the ppp line discipline, then it uses the functions defined in
the ppp_ldisc structure in drivers/net/ppp_async.c, and the
write_wakeup field in that structure points to ppp_asynctty_wakeup.

> There were a bunch of changes to the serial drivers between 2.6.15 and
> 2.6.16, maybe that's where this problem was introduced. Do we know which
> serial driver is involved in the original report?

Apparently it's the pty driver.

Paul.

Comment 10 Anonymous Emailer 2006-05-11 19:12:27 UTC
Reply-To: andy@andynet.net

On Fri, 2006-05-12 at 11:59 +1000, Paul Mackerras wrote:
> Andy Gay writes:
> 
> > How does the serial driver know it has to call ppp_asynctty_wakeup()?
> 
> The serial driver is supposed to call the line discipline's wakeup
> function when it has room in the output buffer and the
> TTY_DO_WRITE_WAKEUP bit is set in tty->flags.  When the serial port is
> set to the ppp line discipline, then it uses the functions defined in
> the ppp_ldisc structure in drivers/net/ppp_async.c, and the
> write_wakeup field in that structure points to ppp_asynctty_wakeup.
> 
OK, thanks for the explanation. I'll pay special attention to that stuff
in my driver!

> > There were a bunch of changes to the serial drivers between 2.6.15 and
> > 2.6.16, maybe that's where this problem was introduced. Do we know which
> > serial driver is involved in the original report?
> 
> Apparently it's the pty driver.
> 
So I heard. Hopefully the maintainer of that driver will see this....

> Paul.

Comment 11 Paul Mackerras 2006-05-12 22:26:46 UTC
Andrew Morton writes:

> xeb (who forgot to do reply-to-all) tells me that pptpd uses ptys.

I tried to replicate this using pppd running on a pty, with a
"charshunt" process on the master side of the pty transferring
characters between it and a socket.  I didn't see any freezeups in
either direction.  So - precise details on how to replicate this would
be appreciated.

Paul.

Comment 12 xeb 2006-05-14 21:58:22 UTC
You need two computers on 100Mbit ethernet.
Setup pptpd (server) on one .
Setup pptp (client)  on other.
Then setup ppp-connection and try to send random file (min 100M) from one comp. to other.
Software: http://poptop.sourceforge.net, http://pptpclient.sourceforge.net/

Comment 13 Paul Mackerras 2006-05-19 03:29:44 UTC
Turns out xeb's mail server is rejecting my emails because they come from
ozlabs.org.

*shrug* *plonk*
Comment 14 xeb 2006-05-19 04:54:27 UTC
 
> Turns out xeb's mail server is rejecting my emails because they come from
> ozlabs.org.
> 
> *shrug* *plonk*
> 
> ------- You are receiving this mail because: -------
> You reported the bug, or are watching the reporter.
>

Try this xxebxebx@gmail.com
 

Comment 15 Paul Mackerras 2006-05-19 05:03:28 UTC
OK.  I may get time to set up pptp and pptpd on the weekend...
Comment 16 Robert Fitzsimons 2006-05-23 11:34:55 UTC
I've been able to reproduce this on my setup at home.

Using git bisect I found the commit that first triggers this problem is:

[PATCH] TTY layer buffering revamp
33f0f88f1c51ae5c2d593d26960c760ea154c2e2

I used the following setup.


Server

Debian testing/unstable i386
kernel 2.6.17-rc4 (from git)
ppp 2.4.4b1-1
pptpd 1.2.3-1.1
pptp-linux 1.7.0-2

:/etc/pptpd.conf
option /etc/ppp/pptpd-options
logwtmp
localip 192.168.250.1
remoteip 192.168.251.64-128

:/etc/ppp/pptpd-options
name pptpvpn
proxyarp
nodefaultroute
lock
nobsdcomp

:/etc/ppp/options
asyncmap 0
auth
crtscts
lock
hide-password
modem
proxyamp
lcp-echo-interval 30
lcp-echo-failure 4
noipx

:/etc/ppp/chap-secrets
pptptest pptpvpn pptptest *



Client i386

Debian testing/unstable
kernel various (from git)
ppp 2.4.4b1-1
pptp-linux 1.7.0-2

:/etc/ppp/peers/pptpvpn
pty "pptp <address of server> --nolauchppd"
name pptptest
remotename pptpvpn
ipparam pptpvpn
lock
noauth
nobsdcomp
nodefalte

:/etc/ppp/chap-secrets
pptptest pptpvpn pptptest *


The from the client:
# pon pptpvpn
# scp random.data 192.168.250.1:

If the file is large the upload will stall a few times and the upload seems to
be slower.


Hope that helps.
Comment 17 Daniel Drake 2007-06-09 09:27:08 UTC
This looks like bug #6402. xeb, can you confirm this is fixed in recent kernels?
Comment 18 xeb 2007-06-09 10:20:35 UTC
Sorry, I can't do it now, but this bug was fixed and pptp worked fine on 
2.6.17, 2.6.18