Bug 8042

Summary: Cisco VPN Client cannot connect using TCP with Intel 82573L NIC
Product: Networking
Reporter: John Marrett (johnf)
Component: IPV4
Assignee: Jesse Brandeburg (jbrandeb)
Status: REJECTED INVALID
Severity: normal
CC: jdelvare
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.18.6
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
 - lspci output from affected machine
 - 2.6.18.6 Config
 - dmesg

Description John Marrett 2007-02-19 15:55:15 UTC
Most recent kernel where this bug did *NOT* occur: -
Distribution: Ubuntu, Debian
Hardware Environment: Lenovo Thinkpad T60p
Software Environment: -
Problem Description:

I have an issue with the cisco vpn client
(vpnclient-linux-x86_64-4.8.00.0490-k9.tar.gz) that appears to be related to
packet fragmentation and the e1000 driver (hardware is 82573L, I don't believe
that this issue affects earlier chips).

When I try to connect to a VPN using Cisco's TCP tunneling feature I experience
an issue where I am unable to connect to the vpn concentrator.

If I recompile the e1000 module, setting the option:

CONFIG_E1000_DISABLE_PACKET_SPLIT=y

then I am able to connect without issue.

I have experienced this problem with the following kernels:

ubuntu edgy 2.6.16-11-generic
debian sid  2.6.18-4-686 (Based on 2.6.18.6 w/hand picked later patches)
kernel.org  2.6.18.6

There was a perhaps related bug resolved for udp recently, see this changelog entry:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=753eab76a3337863a0d86ce045fa4eb6c3cbeef9

You can also see some discussion surrounding the issue (I had initially believed
it was related to another issue with the 82573L), starting from this comment:

http://bugzilla.kernel.org/show_bug.cgi?id=6929#c9

Please let me know if there is anything else I can do to better explain the problem.

Steps to reproduce:

It's not possible to reproduce this issue without:

 - An 82573L chip-based network card
 - A Cisco VPN Concentrator you can access using TCP tunneling
 - The cisco vpn client ()

I have all of these, and would be more than pleased to reproduce the problem,
provide packet captures, etc. If you want to reproduce the problem yourself
and have the above equipment, try to open a TCP-encapsulated connection to the
VPN Concentrator; it should fail unless you have compiled e1000 with
CONFIG_E1000_DISABLE_PACKET_SPLIT=y.
Comment 1 John Marrett 2007-02-19 17:05:29 UTC
Created attachment 10467 [details]
lspci output from affected machine
Comment 2 John Marrett 2007-02-19 17:06:16 UTC
Created attachment 10468 [details]
2.6.18.6 Config
Comment 3 John Marrett 2007-02-19 17:07:23 UTC
One final note: this issue is not related to the Cisco VPN kernel module itself
(though I believe the module does generate the affected traffic).
Comment 4 John Marrett 2007-02-19 17:14:36 UTC
Created attachment 10469 [details]
dmesg
Comment 5 Anonymous Emailer 2007-02-19 22:27:41 UTC
Reply-To: akpm@linux-foundation.org

On Mon, 19 Feb 2007 15:55:19 -0800 bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8042
> 
>            Summary: Cisco VPN Client cannot connect using TCP with Intel
>                     82573L NIC
>     Kernel Version: 2.6.18.6
>             Status: NEW
>           Severity: normal
>              Owner: shemminger@osdl.org
>          Submitter: johnf@dsl.ca
> 
> 

Comment 6 Jesse Brandeburg 2007-02-20 15:24:11 UTC
This bug is unlikely to be a problem with the e1000 driver, and is much more
likely to be some protocol or kernel portion of (cisco) code that is unable to
handle skbs with nr_frags and skb_shinfo(skb)->frags[] making up the data (in
addition to the usual skb->data.)

What function copies the data out from the skb to the Cisco kernel module? Oh,
it's closed source.  From the source that is visible, it appears that the VPN
driver module incorrectly assumes that skb->data holds *all* of the packet data.
If they would like it to work, the simple fix, at the cost of some CPU, is to
check skb_is_nonlinear() and call skb_linearize() when needed; the more
complicated fix is to have their code handle a fragmented skb.
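
To make the two options concrete, here is a minimal userspace sketch in C. It is
not kernel code and not Cisco's code: model_skb, model_frag and the model_skb_*
functions are hypothetical stand-ins that only imitate the roles of struct
sk_buff, skb_shinfo(skb)->frags[], skb_is_nonlinear(), skb_linearize() and
skb_copy_bits(). It assumes a packet made of a linear head plus a few fragments,
which is roughly what a receiver sees when packet split is enabled.

```c
/* Userspace model only. All names here are hypothetical stand-ins for the
 * kernel structures and helpers mentioned in this comment. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_FRAGS 4

struct model_frag {
    const unsigned char *data;   /* payload stored outside the linear area */
    size_t size;
};

struct model_skb {
    unsigned char *head;         /* linear area, plays the role of skb->data */
    size_t head_len;             /* bytes valid in the linear area */
    size_t nr_frags;             /* plays the role of skb_shinfo(skb)->nr_frags */
    struct model_frag frags[MODEL_MAX_FRAGS];
    unsigned char *owned;        /* buffer allocated by linearize, if any */
};

/* Total packet length: linear area plus every fragment. */
static size_t model_skb_len(const struct model_skb *skb)
{
    size_t len = skb->head_len;

    assert(skb->nr_frags <= MODEL_MAX_FRAGS);
    for (size_t i = 0; i < skb->nr_frags; i++)
        len += skb->frags[i].size;
    return len;
}

/* Non-zero when part of the packet lives outside the linear area. */
static int model_skb_is_nonlinear(const struct model_skb *skb)
{
    return skb->nr_frags != 0;
}

/* The "complicated" fix: copy len bytes starting at offset, walking the
 * linear area first and then each fragment. Returns 0, or -1 if the
 * requested range runs past the end of the packet. */
static int model_skb_copy_bits(const struct model_skb *skb, size_t offset,
                               unsigned char *to, size_t len)
{
    size_t total = model_skb_len(skb);

    if (offset > total || len > total - offset)
        return -1;
    if (len == 0)
        return 0;

    if (offset < skb->head_len) {
        size_t n = skb->head_len - offset;
        if (n > len)
            n = len;
        memcpy(to, skb->head + offset, n);
        to += n;
        len -= n;
        offset = 0;              /* continue at the start of the fragments */
    } else {
        offset -= skb->head_len;
    }

    for (size_t i = 0; i < skb->nr_frags && len > 0; i++) {
        const struct model_frag *f = &skb->frags[i];
        if (offset >= f->size) { /* requested range starts in a later fragment */
            offset -= f->size;
            continue;
        }
        size_t n = f->size - offset;
        if (n > len)
            n = len;
        memcpy(to, f->data + offset, n);
        to += n;
        len -= n;
        offset = 0;
    }
    return 0;
}

/* The "simple" fix: pull everything into one contiguous buffer, paying
 * for an allocation and a copy. Returns 0 on success, -1 on failure. */
static int model_skb_linearize(struct model_skb *skb)
{
    if (!model_skb_is_nonlinear(skb))
        return 0;

    size_t total = model_skb_len(skb);
    unsigned char *buf = malloc(total ? total : 1);
    if (buf == NULL)
        return -1;
    if (model_skb_copy_bits(skb, 0, buf, total) != 0) {
        free(buf);
        return -1;
    }
    free(skb->owned);            /* free(NULL) is a no-op */
    skb->head = buf;
    skb->owned = buf;
    skb->head_len = total;
    skb->nr_frags = 0;
    return 0;
}
```

In this model, code that reads only head and head_len sees just the first part
of a fragmented packet, which is the kind of mistake described above. A caller
would either run model_skb_linearize() before touching head, or read through
model_skb_copy_bits() and never assume the data is contiguous.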

I don't think we would be the only adapter to break using this VPN software. I
suggest you take the problem up with Cisco, as it is their bug; we are using a
documented (albeit not very often used) method of receiving data.

I'll be glad to try and help more if I can.
Comment 7 John Marrett 2007-02-20 16:07:46 UTC
Jesse,

Thank you for your assistance. With the information you have given me, I will
open a case with Cisco, referring them to the information you provided.

This falls well, as I have another outstanding issue that I need to address with
Cisco regarding the vpn client (try deleting a tun device while connected with
the cisco vpn client, it results in a kernel panic).