I am running kernel-2.6.31.5-127.fc12.x86_64 on two computers with one e1000e in each computer. They are connected via a crossover cable. I set the MTU to 9216, aka jumbo frames. It works for a while, and then the link stops working. If I change the MTU to 1500 on both ends, it starts working again. If I start with an MTU of 1500 on both ends, the link always works.

Before upgrading both computers to F12 I was using kernel-2.6.30.9-96.fc11.x86_64 with no problems.
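For anyone trying to reproduce this, one way to tell whether jumbo frames still pass is to ping the peer with Don't Fragment set and a payload that exactly fills the MTU. The following is only a minimal sketch, assuming IPv4 without options and the Linux iputils ping; the peer address 192.0.2.2 and the function names are placeholders, not anything from the report.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    if mtu < IPV4_HEADER + ICMP_HEADER:
        raise ValueError("MTU too small for an IPv4 ICMP echo")
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(peer, mtu, count=3):
    """Build a ping command line that probes the full MTU.

    -M do sets Don't Fragment (iputils ping); -s is the payload size.
    """
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), peer]


if __name__ == "__main__":
    # 192.0.2.2 is a documentation address used as a placeholder peer.
    for mtu in (1500, 9216):
        print(mtu, " ".join(ping_command("192.0.2.2", mtu)))
```

If the 1500-byte probe succeeds while the 9216-byte probe gets no replies, that would match the symptom described above.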
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Tue, 24 Nov 2009 23:08:00 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14684
>
>            Summary: e1000e jumbo frames failure
>            Product: Drivers
>            Version: 2.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: kernel-bugzilla@cygnusx-1.org
>         Regression: No
>
>
> I am running kernel-2.6.31.5-127.fc12.x86_64 on two computers with one e1000e
> in each computer. They are connected via a crossover cable. I set the MTU to
> 9216, aka jumbo frames. It works for a while, and then the link stops
> working.
> If I change the MTU to 1500 on both ends, it starts working again. If I start
> with a MTU of 1500 on both ends, the link always works.
>
> Before upgrading both computers to F12 I was using
> kernel-2.6.30.9-96.fc11.x86_64 with no problems.
>

Thanks, I'll mark this as a regression.
>-----Original Message-----
>From: Andrew Morton [mailto:akpm@linux-foundation.org]
>Sent: Tuesday, November 24, 2009 3:34 PM
>To: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Waskiewicz Jr, Peter P; Ronciak, John
>Cc: bugzilla-daemon@bugzilla.kernel.org; bugme-daemon@bugzilla.kernel.org; e1000-devel@lists.sourceforge.net; kernel-bugzilla@cygnusx-1.org
>Subject: Re: [Bugme-new] [Bug 14684] New: e1000e jumbo frames failure
>
>Thanks, I'll mark this as a regression.

You didn't say which device supported by e1000e you have. Please provide the output of lspci and any pertinent messages that may be in your system log. The output of 'ethtool -S ethX' both before increasing your mtu and after increasing the mtu and it stops working might also help (where ethX is your interface name).

Thanks,
Bruce.
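Comparing the two 'ethtool -S' captures by eye is tedious, so a small script can list only the counters that moved. This is only a rough sketch, assuming each capture was saved as text in the usual "name: value" layout; the function names and the sample numbers are made up for illustration and do not come from this report.

```python
def parse_ethtool_stats(text):
    """Turn `ethtool -S` text into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # no colon, not a counter line
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            continue  # e.g. the "NIC statistics:" header has no number
    return stats


def changed_counters(before, after):
    """Return {name: delta} for every counter whose value differs."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


# Hypothetical snapshots, invented for illustration only.
BEFORE = """NIC statistics:
     rx_packets: 1000
     rx_errors: 0
     rx_missed_errors: 0
"""
AFTER = """NIC statistics:
     rx_packets: 1500
     rx_errors: 3
     rx_missed_errors: 0
"""

if __name__ == "__main__":
    delta = changed_counters(parse_ethtool_stats(BEFORE),
                             parse_ethtool_stats(AFTER))
    for name, diff in sorted(delta.items()):
        print(f"{name}: {diff:+d}")
```

In practice the two strings would be replaced by the contents of files captured before raising the MTU and after the link stops working.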
I haven't rebooted since I got errors with jumbo frames. So there may be clues in the ethtool data from the problem.

Both computers are desktops, and both cards are PCI-E 1x. I use the link almost exclusively for iSCSI. The exceptions are testing like icmp when it fails.

Computer 1:

4:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
    Subsystem: Intel Corporation PRO/1000 PT Desktop Adapter
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 30
    Region 0: Memory at fe9e0000 (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at fe9c0000 (32-bit, non-prefetchable) [size=128K]
    Region 2: I/O ports at bc00 [size=32]
    Expansion ROM at fe9a0000 [disabled] [size=128K]
    Capabilities: [c8] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0200c Data: 41d9
    Capabilities: [e0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 <4us, L1 <64us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
    Capabilities: [140] Device Serial Number 00-1b-21-ff-ff-14-ef-d7
    Kernel driver in use: e1000e
    Kernel modules: e1000e

"ethtool -S eth2" output at mtu 1500:

NIC statistics:
     rx_packets: 10620133
     tx_packets: 15304833
     rx_bytes: 10406062051
     tx_bytes: 15657625639
     rx_broadcast: 5
     tx_broadcast: 1164
     rx_multicast: 280
     tx_multicast: 588
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 280
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 2088958
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 1188
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 2643618
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 3176949638
     rx_flow_control_xoff: 79245597
     tx_flow_control_xon: 357771
     tx_flow_control_xoff: 11857745
     rx_long_byte_count: 10406062051
     rx_csum_offload_good: 10572127
     rx_csum_offload_errors: 0
     rx_header_split: 2508581
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0

Computer 2:

03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
    Subsystem: Intel Corporation Gigabit CT Desktop Adapter
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 17
    Region 0: Memory at fd2c0000 (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at fd200000 (32-bit, non-prefetchable) [size=512K]
    Region 2: I/O ports at bf00 [size=32]
    Region 3: Memory at fd2fc000 (32-bit, non-prefetchable) [size=16K]
    [virtual] Expansion ROM at fd100000 [disabled] [size=256K]
    Capabilities: [c8] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [e0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
        Vector table: BAR=3 offset=00000000
        PBA: BAR=3 offset=00002000
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    Capabilities: [140] Device Serial Number 00-1b-21-ff-ff-2c-d4-bc
    Kernel driver in use: e1000e
    Kernel modules: e1000e

"ethtool -S eth1" output at mtu 1500:

NIC statistics:
     rx_packets: 15287659
     tx_packets: 10601666
     rx_bytes: 15620091082
     tx_bytes: 10378919127
     rx_broadcast: 27
     tx_broadcast: 5
     rx_multicast: 510
     tx_multicast: 287
     rx_errors: 5
     tx_errors: 0
     tx_dropped: 0
     multicast: 510
     collisions: 0
     rx_length_errors: 5
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 6641
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 2
     tx_restart_queue: 0
     rx_long_length_errors: 5
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 766110
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 357714
     rx_flow_control_xoff: 11855647
     tx_flow_control_xon: 3175077458
     tx_flow_control_xoff: 79245597
     rx_long_byte_count: 15620091082
     rx_csum_offload_good: 15239778
     rx_csum_offload_errors: 0
     rx_header_split: 2562045
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0
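With this many counters it is easy to miss the few that are non-zero. The sketch below filters a handful of the Computer 2 (eth1) values quoted above down to the error-like ones. The substring list in ERROR_HINTS is an assumption chosen for this example, not a rule taken from the driver, and the function name is hypothetical.

```python
# Substrings treated as "error-like" here; an assumption for this sketch.
ERROR_HINTS = ("error", "missed", "timeout", "dropped", "failed")


def suspicious_counters(stats):
    """Return the non-zero counters whose names look error-related."""
    return {
        name: value
        for name, value in stats.items()
        if value != 0 and any(hint in name for hint in ERROR_HINTS)
    }


# Subset of the eth1 (Computer 2) statistics posted in this report.
ETH1 = {
    "rx_packets": 15287659,
    "rx_errors": 5,
    "tx_errors": 0,
    "rx_length_errors": 5,
    "rx_missed_errors": 6641,
    "tx_timeout_count": 2,
    "rx_long_length_errors": 5,
    "rx_dma_failed": 0,
}

if __name__ == "__main__":
    for name, value in sorted(suspicious_counters(ETH1).items()):
        print(f"{name}: {value}")
```

Run against the Computer 1 (eth2) values, the same filter would return nothing, since every error counter in that capture is zero.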
Is this fixed in the meantime?

There was http://patchwork.ozlabs.org/patch/34339/ which made it into mainline as:

commit a825e00c98a2ee37eb2a0ad93b352e79d2bc1593
Author: Alexander Duyck <alexander.h.duyck@intel.com>
Date:   Fri Oct 2 12:30:42 2009 +0000

    e1000e: swap max hw supported frame size between 82574 and 82583

which could affect one of your machines. (check bug #14261)
I'm closing this as unreproducible. If that is incorrect, please shout.