Bug 109471 - Linux generating invalid Ethernet frames which get dropped (as over-size) by gateway
Summary: Linux generating invalid Ethernet frames which get dropped (as over-size) by ...
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4 (show other bugs)
Hardware: ARM Linux
: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-16 15:47 UTC by Brad Hein
Modified: 2016-07-27 08:34 UTC (History)
4 users (show)

See Also:
Kernel Version: 3.18.11-v7+ #781 SMP PREEMPT armv7l GNU/Linux
Subsystem:
Regression: No
Bisected commit-id:


Attachments
attachment-7946-0.html (4.18 KB, text/html)
2016-07-26 16:38 UTC, Brad Hein
Details

Description Brad Hein 2015-12-16 15:47:16 UTC
Also described in detail here: http://networkengineering.stackexchange.com/questions/25254/1522-byte-frames-from-access-point-being-dropped-by-gateway 

I configured the Linux device in question (Raspberry Pi 2b) as a hostap WIFI access point. Wifi clients connect and can sometimes transfer some data but as soon as the packet flow reaches a frame size of 1522 the flow stops - the gateway starts showing a spike in dropped frames (reflected in /sys/class/net/eth0  /statistics/rx_length_errors) as the client retransmits the dropped packets. A closer look shows that the gateway is dropping the packets due to size. 

The TCP/IP part of the packet is 1500 bytes and I expect framing to add a final 14 or 18 bytes before transmitting it onto the wire. However 22 bytes are being added, and it seems to be too many.


I have packet captures available to send via email (I'd like to not attach them for privacy reasons). 


The following is a sample oversized packet as exported from Wireshark. Note that I tweaked the MAC and IP addresses for privacy.


No.     Time            Source                Destination           Protocol Length Info
     35 21:55:21.644314 192.168.1.149        54.239.25.200         TCP      1522   [TCP Retransmission] 44053→443 [ACK] Seq=350 Ack=155 Win=88832 Len=1460 [ETHERNET FRAME CHECK SEQUENCE INCORRECT]

Frame 35: 1522 bytes on wire (12176 bits), 1522 bytes captured (12176 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: Dec 15, 2015 21:55:21.644314000 EST
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1450234521.644314000 seconds
    [Time delta from previous captured frame: 0.150820000 seconds]
    [Time delta from previous displayed frame: 3.683211000 seconds]
    [Time since reference or first frame: 9.568260000 seconds]
    Frame Number: 35
    Frame Length: 1522 bytes (12176 bits)
    Capture Length: 1522 bytes (12176 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:tcp]
    [Coloring Rule Name: Bad TCP]
    [Coloring Rule String: tcp.analysis.flags && !tcp.analysis.window_update]
Ethernet II, Src: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx), Dst: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
    Destination: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
        Address: Raspberr_xx:xx:xx (xx:xx:xx:xx:xx:xx)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx)
        Address: xx:xx:xx:xx:xx:xx (xx:xx:xx:xx:xx:xx)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IP (0x0800)
    Trailer: 2f8e4de5
    Frame check sequence: 0x29f37f68 [incorrect, should be 0x3b863a28]
        [FCS Good: False]
        [FCS Bad: True]
            [Expert Info (Error/Checksum): Bad checksum]
                [Bad checksum]
                [Severity level: Error]
                [Group: Checksum]
Internet Protocol Version 4, Src: 192.168.1.149 (192.168.1.149), Dst: 54.239.25.200 (54.239.25.200)
    Version: 4
    Header Length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
    Total Length: 1500
    Identification: 0xaaf3 (43763)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x3234 [validation disabled]
        [Good: False]
        [Bad: False]
    Source: 192.168.1.149 (192.168.1.149)
    Destination: 54.239.25.200 (54.239.25.200)
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 44053 (44053), Dst Port: 443 (443), Seq: 350, Ack: 155, Len: 1460
    Source Port: 44053 (44053)
    Destination Port: 443 (443)
    [Stream index: 1]
    [TCP Segment Len: 1460]
    Sequence number: 350    (relative sequence number)
    [Next sequence number: 1810    (relative sequence number)]
    Acknowledgment number: 155    (relative ack number)
    Header Length: 20 bytes
    .... 0000 0001 0000 = Flags: 0x010 (ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 0... = Push: Not set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 347
    [Calculated window size: 88832]
    [Window size scaling factor: 256]
    Checksum: 0x1c58 [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
    Urgent pointer: 0
    [SEQ/ACK analysis]
        [iRTT: 0.028704000 seconds]
        [Bytes in flight: 2077]
        [TCP Analysis Flags]
            [Expert Info (Note/Sequence): This frame is a (suspected) retransmission]
                [This frame is a (suspected) retransmission]
                [Severity level: Note]
                [Group: Sequence]
            [The RTO for this segment was: 7.247571000 seconds]
            [RTO based on delta from frame: 12]
    Retransmitted TCP segment data (1460 bytes)
Comment 1 Brad Hein 2015-12-17 02:37:57 UTC
I also have an intrusion detection server (IDS) on the network that happens to see the over-sized packets, and it also throwing kernel alerts about the oversized packet length. 


Dec 16 23:57:11 mri kernel: [1587969.108160] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:07 mri kernel: [1587964.778865] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:06 mri kernel: [1587963.821912] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:05 mri kernel: [1587962.620922] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:04 mri kernel: [1587962.119229] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:04 mri kernel: [1587961.535235] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
	
Dec 16 23:57:03 mri kernel: [1587961.000442] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522
Comment 2 Stephen Hemminger 2015-12-18 01:36:14 UTC
Its a driver bug.
The Linux networking stack does not care what received packet size (only that the frame is valid).  The driver is responsible for handling long frames.
A correctly written driver should:
 * drop any frame that is truncated on receive
 * keep local statistics about dropped packets
Comment 3 Brad Hein 2015-12-18 02:29:31 UTC

The gateway doesn't show any errors in the syslog associated with the drops... but the statistic rx_length_errors does correlate to the drops. 

The fixup errors I posted are from another server listening to the traffic to/from the gateway. I hope I didn't confuse the question by adding the syslog errors from the IDS server. I meant only to illustrate that multiple servers on the network are complaining about the packet length coming from the access point.
Comment 4 Brad Hein 2015-12-22 14:38:38 UTC
Just to be extra clear, the issue being reported is the over-sized network frames. 

I'd like to try and rule some thing out... here's a sketch of the packet flow: 

  wifi client -> hostapd -> wlan0 -> Linux Bridge -> eth0 

I'll work on ruling out the hardware by trying the same OS image on another device. Maybe bad hardware? 

I could also use some pointers/suggestions for ruling out the Linux Bridge drivers/logic.
Comment 5 Brad Hein 2015-12-22 17:10:55 UTC
I swapped hardware (took the OS SD card and moved it to another Pi) and the problem followed the software. So that at least rules out hardware.
Comment 6 Brad Hein 2015-12-22 19:23:13 UTC
I took a new disk and loaded a slightly older OS but with the same hostapd/bridge/network configuration and same hardware. The problem doesn't exist with same configuration except older kernel/software. 

Not Broken: 
  Linux 3.18.7-v7+ #755 SMP PREEMPT Thu Feb 12 17:20:48 GMT 2015 armv7l GNU/Linux (Source: 2015-02-16-raspbian-wheezy.img)


Broken: 
  Linux 3.18.11-v7+ #781 SMP PREEMPT Tue Apr 21 18:07:59 BST 2015 armv7l GNU/Linux (Source: 2015-09-24-raspbian-jessie.img)
Comment 7 Brad Hein 2015-12-23 03:10:53 UTC
Scratch that - the problem happens in both OS images, it took a while to recreate the conditions necessary to see the big packets but eventually the older image also exhibited the problem.

I did however narrow down the problem further by trying a different USB wifi adapter. I found that switching to a Tenda RT5370 USB adapter made the problem go away. 

**The problem followed the USB adapter**. The following adapter exhibits the too-long-frame problem reported in this ticket: 

  Bus 001 Device 005: ID 0cf3:9271 Atheros Communications, Inc. AR9271 802.11n (TP-Link TL-WN-722N USB with external RPSMA antenna)


For comparison the following adapter does not exhibit the problem: 
  GOOD: Bus 001 Device 004: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter (Tenda USB 802.11N)
Comment 8 Brad Hein 2015-12-23 03:32:10 UTC
Wrote up a bug report on the Atheros firmware site, in case it's a problem with the in-RAM firmware on this adapter. https://github.com/qca/open-ath9k-htc-firmware/issues/80
Comment 9 Oleksij Rempel 2015-12-23 08:11:55 UTC
From what i see here right now:
- old kernel is used
- old firmware is used

Lets start with updating things...
Comment 10 Brad Hein 2015-12-31 02:29:30 UTC
It's a little tricky getting the latest kernel onto a Raspberry Pi. the public repo (https://github.com/raspberrypi/linux) is currently at 4.1.y and it looks like 4.3 is current. Today I loaded this 4.1.y kernel and I'll re-check the wifi card to see if the behavior is still present. 

Not sure how feasible it is to take a kernel from kernel.org and apply the raspi config to build it. I'm guessing it's nontrivial since there is a dedicated repo for the raspberry pi ARM branch. Nonetheless I may also tray that route to help narrow down the search area.
Comment 11 Oleksij Rempel 2015-12-31 05:57:47 UTC
Suddenly i get this kind of answer from RPi users all the time. Not sure if i wish to accept any bug report from RPi side.

Now i need to let out my frustration:
RPi is not an open source product!
Bana Pi is much more open then RPi, it is just working with vanilla kernel from kernel.org.
Why do people care about RPi at all?!
Comment 12 Dean Jenkins 2016-07-26 13:29:55 UTC
Seems the above analysis in comment 1 above is incomplete. The warning is

Dec 16 23:57:11 mri kernel: [1587969.108160] asix 1-1.5:1.0 eth1: asix_rx_fixup() Bad RX Length 1522

which shows the received Ethernet frame length is 1522 octets long. This is a legal Ethernet frame length with a MTU size of 1500 and with 802.1Q tag (VLAN) field present, see https://en.wikipedia.org/wiki/Ethernet_frame#Structure

Looking at the ASIX driver source code in drivers/net/usb/asix_common.c (linux 4.7) shows:

int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb,
			   struct asix_rx_fixup_info *rx)
<snip>
			if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) {
				netdev_dbg(dev->net, "asix_rx_fixup() Bad RX Length %d\n",
					   size);
				return 0;
			}
<snip>

Note linux 4.7 changed the netdev_err() to netdev_dbg() as the "fix" for the reported issue by no longer reporting the size issue, sigh.

Analysis of the size calculation shows it has a bug in it:
if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN)

because the maximum Ethernet frame length for MTU of 1500 is:
Ethernet header length (14) + 802.1Q tag (VLAN) length (4) + payload length (MTU) (1500) + FCS length (4) = 1522

Therefore the Frame Check Sequence Frame (FCS) field length is missing from the size calculation so the maximum length accepted by the source code was 1518.

Note the VLAN length and FCS length are both 4 so the code works OK when there is no VLAN in use which is the usual case. This masks the defect.

I will raise a patch for this issue to add in the missing ETH_FCS_LEN into the size calculation.

In the failure case, in the above comment description the wireshark report shows a bad FCS value "ETHERNET FRAME CHECK SEQUENCE INCORRECT". That is worrying. Also I think wireshark is not showing a 802.1Q tag (VLAN) field. So it is unclear why the Ethernet frame length is 1522 octets from the wireshark log.

Note that when the size variable value exceeds 1518 octets (current source code), the Ethernet frame is immediately not read out of the URB buffer and the URB is not processed further (potentially losing other Ethernet frames to be read). This can also impact the next URB as synchronisation with the next header data word may have been lost.

In addition no network statistics are incremented to indicate a dropped frame because the frame never not read out of the URB buffer. This was why netdev_err() was used to report the failure.
Comment 13 Brad Hein 2016-07-26 16:38:48 UTC
Created attachment 226511 [details]
attachment-7946-0.html

Just to confirm, no tagging is in use on the network in question, so there
were no 802.1q headers present at any point.

[Sent from mobile device]
On Jul 26, 2016 9:29 AM, <bugzilla-daemon@bugzilla.kernel.org> wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=109471
>
> Dean Jenkins <Dean_Jenkins@mentor.com> changed:
>
>            What    |Removed                     |Added
>
> ----------------------------------------------------------------------------
>                  CC|                            |Dean_Jenkins@mentor.com
>
> --- Comment #12 from Dean Jenkins <Dean_Jenkins@mentor.com> ---
> Seems the above analysis in comment 1 above is incomplete. The warning is
>
> Dec 16 23:57:11 mri kernel: [1587969.108160] asix 1-1.5:1.0 eth1:
> asix_rx_fixup() Bad RX Length 1522
>
> which shows the received Ethernet frame length is 1522 octets long. This
> is a
> legal Ethernet frame length with a MTU size of 1500 and with 802.1Q tag
> (VLAN)
> field present, see https://en.wikipedia.org/wiki/Ethernet_frame#Structure
>
> Looking at the ASIX driver source code in drivers/net/usb/asix_common.c
> (linux
> 4.7) shows:
>
> int asix_rx_fixup_internal(struct usbnet *dev, struct sk_buff *skb,
>                struct asix_rx_fixup_info *rx)
> <snip>
>             if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN) {
>                 netdev_dbg(dev->net, "asix_rx_fixup() Bad RX Length %d\n",
>                        size);
>                 return 0;
>             }
> <snip>
>
> Note linux 4.7 changed the netdev_err() to netdev_dbg() as the "fix" for
> the
> reported issue by no longer reporting the size issue, sigh.
>
> Analysis of the size calculation shows it has a bug in it:
> if (size > dev->net->mtu + ETH_HLEN + VLAN_HLEN)
>
> because the maximum Ethernet frame length for MTU of 1500 is:
> Ethernet header length (14) + 802.1Q tag (VLAN) length (4) + payload length
> (MTU) (1500) + FCS length (4) = 1522
>
> Therefore the Frame Check Sequence Frame (FCS) field length is missing
> from the
> size calculation so the maximum length accepted by the source code was
> 1518.
>
> Note the VLAN length and FCS length are both 4 so the code works OK when
> there
> is no VLAN in use which is the usual case. This masks the defect.
>
> I will raise a patch for this issue to add in the missing ETH_FCS_LEN into
> the
> size calculation.
>
> In the failure case, in the above comment description the wireshark report
> shows a bad FCS value "ETHERNET FRAME CHECK SEQUENCE INCORRECT". That is
> worrying. Also I think wireshark is not showing a 802.1Q tag (VLAN) field.
> So
> it is unclear why the Ethernet frame length is 1522 octets from the
> wireshark
> log.
>
> Note that when the size variable value exceeds 1518 octets (current source
> code), the Ethernet frame is immediately not read out of the URB buffer
> and the
> URB is not processed further (potentially losing other Ethernet frames to
> be
> read). This can also impact the next URB as synchronisation with the next
> header data word may have been lost.
>
> In addition no network statistics are incremented to indicate a dropped
> frame
> because the frame never not read out of the URB buffer. This was why
> netdev_err() was used to report the failure.
>
> --
> You are receiving this mail because:
> You reported the bug.
>
Comment 14 Dean Jenkins 2016-07-26 17:16:44 UTC
(In reply to Brad Hein from comment #13)
> 
> Just to confirm, no tagging is in use on the network in question, so there
> were no 802.1q headers present at any point.
> 
I was aware that "no tagging" was reported. It is a separate issue for why the Ethernet frame contains 1522 octets which seems to be non-conforming to the Ethernet specifications were transmitted.

However, the scenario unmasks a bug in the ASIX driver's sanity length calculation for receiving Ethernet frames.

The ASIX driver at the point of reporting a bad RX frame length only cares that the number of octets to read out of the URB buffer does not exceed the maximum permissible Ethernet frame length. The ASIX driver does not decode the Ethernet frame, the size value comes from the USB to Ethernet ASIX chip itself by inserting an in-band 32-bit header value saying that the next block of data has a certain length until the next in-band 32-bit header value appears in the data stream. A block of data is the Ethernet frame.

Therefore, the ASIX driver should accept a block of data up to the maximum length of 1522 octets for a MTU value of 1500. Then pass this block of data to the network stack to decode and reject the Ethernet frame as necessary.

As the ASIX sanity length check calculation is wrong, the network stack never gets the opportunity to check the 1522 octet block of data against the Ethernet protocol specifications.

In other words, the ASIX driver is ignoring legal Ethernet frame lengths of 1522 octets for a MTU of 1500. Had VLAN tagging been used so 1522 octet frame lengths were used, the Ethernet frame would still not get through to the network stack so clearly is a bug.
Comment 15 Dean Jenkins 2016-07-27 08:34:54 UTC
I understand why the FCS check fails when receiving the "gateway big Ethernet frame".

See https://en.wikipedia.org/wiki/Ethernet_frame#Frame_check_sequence

When an Ethernet frame is received, the FCS algorithm (CRC) is applied across ALL octets of the received data block. This means the FCS field contents is treated as data to the algorithm. The FCS is declared good when the FCS algorithm outputs the expected magic number result.

In the case of the "gateway big Ethernet frame", 4 extra octets are erroneously appended to the end of the Ethernet frame. Therefore, when the FCS algorithm is applied, these extra octets are consumed causing the CRC calculation to not create the expected magic number result.

This suggests that the gateway has a bug in it which allocates a block of data too big to contain the Ethernet frame although the length of the data block is a legal length. Probably, part of the gateway software thought VLAN tagging was deployed but the actual Ethernet frame did not have a VLAN tag field so there is a mismatch in the lengths of the block and the Ethernet frame.

Note You need to log in before you can comment on or make changes to this bug.