Created attachment 146761: kernel config

Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973225] ------------[ cut here ]------------
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973236] WARNING: CPU: 2 PID: 0 at net/core/dev.c:2246 skb_warn_bad_offload+0xc8/0xd5()
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973238] : caps=(0x000000000419fba9, 0x00000000001b583b) len=2962 data_len=2896 gso_size=1448 gso_type=1 ip_summed=3
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973239] Modules linked in: ntfs msdos xfs libcrc32c ipmi_devintf intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core 8021q garp ioatdma stp ipmi_si mrp llc bonding hid_generic ixgbe usbhid hid ahci dca libahci mdio
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973257] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W 3.16.0 #2
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973259] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973260] 0000000000000009 ffff88046fd036b0 ffffffff815c4096 ffff88046fd036f8
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973262] ffff88046fd036e8 ffffffff8103f633 ffff880018b7c4e0 ffff8804687df000
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973264] 0000000000000001 0000000000000003 ffffffffa0193320 ffff88046fd03748
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973266] Call Trace:
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973268] <IRQ> [<ffffffff815c4096>] dump_stack+0x45/0x56
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973280] [<ffffffff8103f633>] warn_slowpath_common+0x73/0x90
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973286] [<ffffffff8103f697>] warn_slowpath_fmt+0x47/0x50
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973288] [<ffffffff812edadc>] ? ___ratelimit+0x7c/0xf0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973291] [<ffffffff815c5e19>] skb_warn_bad_offload+0xc8/0xd5
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973294] [<ffffffff814e57fe>] skb_checksum_help+0x16e/0x180
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973297] [<ffffffff814e9ecc>] dev_hard_start_xmit+0x42c/0x4b0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973299] [<ffffffff814ea154>] ? __dev_queue_xmit+0x204/0x440
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973301] [<ffffffff814ea232>] __dev_queue_xmit+0x2e2/0x440
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973302] [<ffffffff814ea39b>] ? dev_queue_xmit+0xb/0x10
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973304] [<ffffffff814ea39b>] dev_queue_xmit+0xb/0x10
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973308] [<ffffffffa012b758>] vlan_dev_hard_start_xmit+0x88/0x100 [8021q]
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973317] [<ffffffff814e9d9a>] dev_hard_start_xmit+0x2fa/0x4b0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973321] [<ffffffff814ea232>] __dev_queue_xmit+0x2e2/0x440
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973323] [<ffffffff814ea39b>] dev_queue_xmit+0xb/0x10
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973325] [<ffffffff814f1192>] neigh_connected_output+0xb2/0xf0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973327] [<ffffffff815192dc>] ip_finish_output+0x4ec/0x890
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973329] [<ffffffff8151ac03>] ip_output+0x53/0x90
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973331] [<ffffffff8151a39b>] ip_local_out_sk+0x2b/0x30
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973333] [<ffffffff8151a6fa>] ip_queue_xmit+0x13a/0x3c0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973335] [<ffffffff815309fa>] tcp_transmit_skb+0x42a/0x8f0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973337] [<ffffffff81530ffa>] tcp_write_xmit+0x13a/0xc00
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973347] [<ffffffff8152f033>] ? tcp_established_options+0x33/0xd0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973350] [<ffffffff81531d09>] __tcp_push_pending_frames+0x29/0xc0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973353] [<ffffffff8152da77>] tcp_rcv_established+0x1f7/0x5e0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973356] [<ffffffff81535fc5>] tcp_v4_do_rcv+0x215/0x4a0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973369] [<ffffffff810651f8>] ? ttwu_do_activate.constprop.64+0x58/0x60
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973374] [<ffffffff81295d31>] ? security_sock_rcv_skb+0x11/0x20
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973377] [<ffffffff815392ad>] tcp_v4_rcv+0x73d/0x7c0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973380] [<ffffffff8106fafc>] ? update_group_capacity+0x16c/0x270
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973386] [<ffffffff812e8300>] ? cpumask_next_and+0x30/0x50
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973388] [<ffffffff81514b50>] ip_local_deliver_finish+0x80/0x1c0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973390] [<ffffffff81515154>] ip_local_deliver+0x34/0x90
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973392] [<ffffffff81514d99>] ip_rcv_finish+0x109/0x350
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973399] [<ffffffff815153d2>] ip_rcv+0x222/0x370
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973403] [<ffffffff814e5eb6>] __netif_receive_skb_core+0x416/0x570
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973407] [<ffffffff814e73f3>] __netif_receive_skb+0x13/0x60
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973410] [<ffffffff814e745e>] netif_receive_skb_internal+0x1e/0x90
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973415] [<ffffffff814e7b40>] napi_gro_receive+0x70/0xa0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973422] [<ffffffffa0134f4c>] ixgbe_clean_rx_irq+0x75c/0xb20 [ixgbe]
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973427] [<ffffffffa0136172>] ixgbe_poll+0x522/0x850 [ixgbe]
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973430] [<ffffffff8105e986>] ? hrtimer_get_next_event+0xb6/0xc0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973437] [<ffffffff814e8dc1>] net_rx_action+0x101/0x1a0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973443] [<ffffffff810430ab>] __do_softirq+0xdb/0x240
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973447] [<ffffffff8104349e>] irq_exit+0xee/0x110
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973450] [<ffffffff81004913>] do_IRQ+0x53/0xf0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973453] [<ffffffff815caaaa>] common_interrupt+0x6a/0x6a
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973454] <EOI> [<ffffffff814af007>] ? cpuidle_enter_state+0x47/0xc0
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973463] [<ffffffff814af132>] cpuidle_enter+0x12/0x20
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973468] [<ffffffff81075fbf>] cpu_startup_entry+0x24f/0x280
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973477] [<ffffffff810905e3>] ? clockevents_config_and_register+0x23/0x30
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973482] [<ffffffff810282be>] start_secondary+0x1be/0x270
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.973486] ---[ end trace de552357488766e8 ]---
Aug 6 06:46:50 prod-ent-ceph03.dc2.ec.loc kernel: [29530.974181] ------------[ cut here ]------------
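For anyone trying to read the "caps=... gso_type=1 ip_summed=3" line in the warning above, here is a minimal sketch that pulls the fields apart. The helper name `decode_bad_offload` is made up for illustration. The symbolic names are the ones I recall from include/linux/skbuff.h around 3.16, and the meaning of the two caps words (device features, then socket route caps) is my reading of skb_warn_bad_offload(); both are assumptions that should be checked against the tree actually in use.

```python
import re

# ip_summed values (assumed from include/linux/skbuff.h, 3.16 era).
IP_SUMMED = {
    0: "CHECKSUM_NONE",
    1: "CHECKSUM_UNNECESSARY",
    2: "CHECKSUM_COMPLETE",
    3: "CHECKSUM_PARTIAL",
}

# Lowest gso_type bits (assumed, same header); higher bits are reported by number.
GSO_TYPE_BITS = {
    0: "SKB_GSO_TCPV4",
    1: "SKB_GSO_UDP",
    2: "SKB_GSO_DODGY",
    3: "SKB_GSO_TCP_ECN",
    4: "SKB_GSO_TCPV6",
}

FIELDS = re.compile(
    r"caps=\((0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+)\) "
    r"len=(\d+) data_len=(\d+) gso_size=(\d+) gso_type=(\d+) ip_summed=(\d+)"
)


def decode_bad_offload(line):
    """Return the warning fields as a dict, or None if the line does not match."""
    m = FIELDS.search(line)
    if m is None:
        return None
    length, data_len, gso_size, gso_type, ip_summed = (int(v) for v in m.groups()[2:])
    gso_names = [
        GSO_TYPE_BITS.get(bit, "bit %d" % bit)
        for bit in range(gso_type.bit_length())
        if gso_type & (1 << bit)
    ]
    return {
        "dev_features": int(m.group(1), 16),   # assumed: dev->features
        "sk_route_caps": int(m.group(2), 16),  # assumed: sk->sk_route_caps
        "len": length,
        "data_len": data_len,
        "linear_len": length - data_len,       # bytes in the linear (header) area
        "gso_size": gso_size,
        "gso_type": gso_names,
        "ip_summed": IP_SUMMED.get(ip_summed, "unknown (%d)" % ip_summed),
    }


SAMPLE = ("[29530.973238] : caps=(0x000000000419fba9, 0x00000000001b583b) "
          "len=2962 data_len=2896 gso_size=1448 gso_type=1 ip_summed=3")

if __name__ == "__main__":
    for key, value in decode_bad_offload(SAMPLE).items():
        print(key, value)
```

Under those assumptions the packet in the first trace is a TCPv4 GSO skb that still needs its checksum filled in (CHECKSUM_PARTIAL), with almost all of its payload in paged fragments.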
A summary of the hardware, the network setup, and what the machine was doing at the time would be helpful. Is it repeatable?
Yes, it is repeatable; it fills up the kernel log.

auto lo
iface lo inet loopback

auto eth1
iface eth1 inet manual
    bond-master bond0
    bond-primary eth1

auto eth6
iface eth6 inet manual
    bond-master bond0

auto bond0
iface bond0 inet manual
    # bond-mode 802.3ad
    bond-mode active-backup
    bond-miimon 100
    bond_downdelay 200
    bond_updelay 200
    bond-slaves none

## VLAN MGMT
auto bond0.250
iface bond0.250 inet static
    address 10.248.5.153
    netmask 255.255.255.0
    vlan-raw-device bond0
    post-up ip route add default via 10.248.5.1 dev bond0.250 table vlan_250
    post-up ip route add 10.248.5.0/24 dev bond0.250 src 10.248.5.153 table vlan_250
    post-up ip rule add to 10.248.5.0/24 table vlan_250
    post-up ip rule add from 10.248.5.0/24 table vlan_250
    post-down ip rule del from 10.248.5.0/24 table vlan_250
    post-down ip rule del to 10.248.5.0/24 table vlan_250

## VLAN PUB_CEPH
auto bond0.290
iface bond0.290 inet static
    address 10.248.9.153
    netmask 255.255.255.0
    gateway 10.248.9.1
    dns-nameservers 10.248.1.50 10.248.1.51
    dns-search dc2.ec.loc dc1.ec.loc e1c.net ec.loc
    vlan-raw-device bond0
    post-up ip route add default via 10.248.9.1 dev bond0.290 table vlan_290
    post-up ip route add 10.248.9.0/24 dev bond0.290 src 10.248.9.153 table vlan_290
    post-up ip rule add to 10.248.9.0/24 table vlan_290
    post-up ip rule add from 10.248.9.0/24 table vlan_290
    post-down ip rule del from 10.248.9.0/24 table vlan_290
    post-down ip rule del to 10.248.9.0/24 table vlan_290

## VLAN PRIV_CEPH
auto bond0.300
iface bond0.300 inet static
    address 10.248.10.153
    netmask 255.255.255.0
    vlan-raw-device bond0

vladi@prod-ent-ceph03:~$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07)
00:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 1a (rev 07)
00:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2a (rev 07)
00:02.2 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2c (rev 07)
00:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3a in PCI Express Mode (rev 07)
00:03.2 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3c (rev 07)
00:04.0 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 0 (rev 07)
00:04.1 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 1 (rev 07)
00:04.2 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 2 (rev 07)
00:04.3 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 3 (rev 07)
00:04.4 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 4 (rev 07)
00:04.5 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 5 (rev 07)
00:04.6 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 6 (rev 07)
00:04.7 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 7 (rev 07)
00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management (rev 07)
00:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control Status and Global Errors (rev 07)
00:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
00:11.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Virtual Root Port (rev 06)
00:16.0 Communication controller: Intel Corporation C600/X79 series chipset MEI Controller #1 (rev 05)
00:16.1 Communication controller: Intel Corporation C600/X79 series chipset MEI Controller #2 (rev 05)
00:1a.0 USB controller: Intel Corporation C600/X79 series chipset USB2 Enhanced Host Controller #2 (rev 06)
00:1d.0 USB controller: Intel Corporation C600/X79 series chipset USB2 Enhanced Host Controller #1 (rev 06)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a6)
00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller (rev 06)
00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset 6-Port SATA AHCI Controller (rev 06)
00:1f.3 SMBus: Intel Corporation C600/X79 series chipset SMBus Host Controller (rev 06)
00:1f.6 Signal processing controller: Intel Corporation C600/X79 series chipset Thermal Management Controller (rev 06)
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
09:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
7f:08.0 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link 0 (rev 07)
7f:08.3 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 (rev 07)
7f:08.4 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 (rev 07)
7f:09.0 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link 1 (rev 07)
7f:09.3 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 (rev 07)
7f:09.4 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 (rev 07)
7f:0a.0 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 0 (rev 07)
7f:0a.1 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 1 (rev 07)
7f:0a.2 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 2 (rev 07)
7f:0a.3 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 3 (rev 07)
7f:0b.0 System peripheral: Intel Corporation Xeon E5/Core i7 Interrupt Control Registers (rev 07)
7f:0b.3 System peripheral: Intel Corporation Xeon E5/Core i7 Semaphore and Scratchpad Configuration Registers (rev 07)
7f:0c.0 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
7f:0c.1 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
7f:0c.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 0 (rev 07)
7f:0c.7 System peripheral: Intel Corporation Xeon E5/Core i7 System Address Decoder (rev 07)
7f:0d.0 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
7f:0d.1 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
7f:0d.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 1 (rev 07)
7f:0e.0 System peripheral: Intel Corporation Xeon E5/Core i7 Processor Home Agent (rev 07)
7f:0e.1 Performance counters: Intel Corporation Xeon E5/Core i7 Processor Home Agent Performance Monitoring (rev 07)
7f:0f.0 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Registers (rev 07)
7f:0f.1 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller RAS Registers (rev 07)
7f:0f.2 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 0 (rev 07)
7f:0f.3 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 1 (rev 07)
7f:0f.4 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 2 (rev 07)
7f:0f.5 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 3 (rev 07)
7f:0f.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 4 (rev 07)
7f:10.0 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 0 (rev 07)
7f:10.1 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 1 (rev 07)
7f:10.2 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 0 (rev 07)
7f:10.3 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 1 (rev 07)
7f:10.4 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 2 (rev 07)
7f:10.5 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 3 (rev 07)
7f:10.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 2 (rev 07)
7f:10.7 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 3 (rev 07)
7f:11.0 System peripheral: Intel Corporation Xeon E5/Core i7 DDRIO (rev 07)
7f:13.0 System peripheral: Intel Corporation Xeon E5/Core i7 R2PCIe (rev 07)
7f:13.1 Performance counters: Intel Corporation Xeon E5/Core i7 Ring to PCI Express Performance Monitor (rev 07)
7f:13.4 Performance counters: Intel Corporation Xeon E5/Core i7 QuickPath Interconnect Agent Ring Registers (rev 07)
7f:13.5 Performance counters: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 0 Performance Monitor (rev 07)
7f:13.6 System peripheral: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 1 Performance Monitor (rev 07)
80:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 1a (rev 07)
80:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2a (rev 07)
80:02.2 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2c (rev 07)
80:04.0 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 0 (rev 07)
80:04.1 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 1 (rev 07)
80:04.2 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 2 (rev 07)
80:04.3 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 3 (rev 07)
80:04.4 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 4 (rev 07)
80:04.5 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 5 (rev 07)
80:04.6 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 6 (rev 07)
80:04.7 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 7 (rev 07)
80:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management (rev 07)
80:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control Status and Global Errors (rev 07)
80:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
83:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
83:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
ff:08.0 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link 0 (rev 07)
ff:08.3 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 (rev 07)
ff:08.4 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 (rev 07)
ff:09.0 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link 1 (rev 07)
ff:09.3 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 (rev 07)
ff:09.4 System peripheral: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 (rev 07)
ff:0a.0 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 0 (rev 07)
ff:0a.1 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 1 (rev 07)
ff:0a.2 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 2 (rev 07)
ff:0a.3 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 3 (rev 07)
ff:0b.0 System peripheral: Intel Corporation Xeon E5/Core i7 Interrupt Control Registers (rev 07)
ff:0b.3 System peripheral: Intel Corporation Xeon E5/Core i7 Semaphore and Scratchpad Configuration Registers (rev 07)
ff:0c.0 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
ff:0c.1 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
ff:0c.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 0 (rev 07)
ff:0c.7 System peripheral: Intel Corporation Xeon E5/Core i7 System Address Decoder (rev 07)
ff:0d.0 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
ff:0d.1 System peripheral: Intel Corporation Xeon E5/Core i7 Unicast Register 0 (rev 07)
ff:0d.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 1 (rev 07)
ff:0e.0 System peripheral: Intel Corporation Xeon E5/Core i7 Processor Home Agent (rev 07)
ff:0e.1 Performance counters: Intel Corporation Xeon E5/Core i7 Processor Home Agent Performance Monitoring (rev 07)
ff:0f.0 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Registers (rev 07)
ff:0f.1 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller RAS Registers (rev 07)
ff:0f.2 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 0 (rev 07)
ff:0f.3 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 1 (rev 07)
ff:0f.4 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 2 (rev 07)
ff:0f.5 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 3 (rev 07)
ff:0f.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 4 (rev 07)
ff:10.0 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 0 (rev 07)
ff:10.1 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 1 (rev 07)
ff:10.2 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 0 (rev 07)
ff:10.3 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 1 (rev 07)
ff:10.4 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 2 (rev 07)
ff:10.5 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 3 (rev 07)
ff:10.6 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 2 (rev 07)
ff:10.7 System peripheral: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 3 (rev 07)
ff:11.0 System peripheral: Intel Corporation Xeon E5/Core i7 DDRIO (rev 07)
ff:13.0 System peripheral: Intel Corporation Xeon E5/Core i7 R2PCIe (rev 07)
ff:13.1 Performance counters: Intel Corporation Xeon E5/Core i7 Ring to PCI Express Performance Monitor (rev 07)
ff:13.4 Performance counters: Intel Corporation Xeon E5/Core i7 QuickPath Interconnect Agent Ring Registers (rev 07)
ff:13.5 Performance counters: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 0 Performance Monitor (rev 07)
ff:13.6 System peripheral: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 1 Performance Monitor (rev 07)
I just compiled 3.16.2 and am still seeing this issue.
Same issue with kernel 3.16.3
vladi@prod-ent-ceph03:~$ ethtool -k eth1
Features for eth1:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: on
    tx-checksum-ip-generic: off [fixed]
    tx-checksum-ipv6: on
    tx-checksum-fcoe-crc: on [fixed]
    tx-checksum-sctp: on
scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: on
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off
busy-poll: on [fixed]
vladi@prod-ent-ceph03:~$ ethtool -i eth1
driver: ixgbe
version: 3.19.1-k
firmware-version: 0x61c10001
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
Same bug here on Debian's bpo kernel:

Linux router01 3.16-0.bpo.2-amd64 #1 SMP Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux

Sep 30 09:14:24 router01 kernel: [ 34.565746] ------------[ cut here ]------------
Sep 30 09:14:24 router01 kernel: [ 34.565812] WARNING: CPU: 5 PID: 3825 at /build/linux-nBoDV9/linux-3.16.3/net/core/dev.c:2246 skb_warn_bad_offload+0xc4/0xcd()
Sep 30 09:14:24 router01 kernel: [ 34.565875] : caps=(0x0000000004197ba9, 0x00000000001b583b) len=4410 data_len=2920 gso_size=1448 gso_type=1 ip_summed=3
Sep 30 09:14:24 router01 kernel: [ 34.565938] Modules linked in: ip_vs libcrc32c ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state xt_LOG xt_connlimit nf_conntrack xt_multiport xt_set iptable_filter ip_tables x_tables ip_set_hash_ip ip_set_hash_net ip_set nfnetlink 8021q garp stp mrp llc bonding dm_crypt ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler loop iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm evdev psmouse serio_raw pcspkr lpc_ich i2c_i801 mfd_core acpi_cpufreq ioatdma i7core_edac shpchp processor button edac_core thermal_sys ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crct10dif_generic crc_t10dif crct10dif_common hid_generic usbhid hid uhci_hcd ehci_pci mptsas ehci_hcd scsi_transport_sas crc32c_intel mptscsih igb mptbase i2c_algo_bit usbcore i2c_core dca usb_common ptp scsi_mod pps_core
Sep 30 09:14:24 router01 kernel: [ 34.569448] CPU: 5 PID: 3825 Comm: bird Tainted: G I 3.16-0.bpo.2-amd64 #1 Debian 3.16.3-2~bpo70+1
Sep 30 09:14:24 router01 kernel: [ 34.569510] Hardware name: HITACHI BladeSymphony F51 /7TPBVa , BIOS F13 08/19/2010
Sep 30 09:14:24 router01 kernel: [ 34.569569] 0000000000000000 ffffffff81773e70 ffffffff8153ff96 ffff88063146b958
Sep 30 09:14:24 router01 kernel: [ 34.569801] ffffffff8106be4c ffff880631934ae8 ffff880632c0b000 0000000000000003
Sep 30 09:14:24 router01 kernel: [ 34.570025] 00000000000010c8 0000000000000000 ffffffff8106bf3a ffffffff81773f00
Sep 30 09:14:24 router01 kernel: [ 34.570258] Call Trace:
Sep 30 09:14:24 router01 kernel: [ 34.570319] [<ffffffff8153ff96>] ? dump_stack+0x41/0x51
Sep 30 09:14:24 router01 kernel: [ 34.570388] [<ffffffff8106be4c>] ? warn_slowpath_common+0x8c/0xc0
Sep 30 09:14:24 router01 kernel: [ 34.570455] [<ffffffff8106bf3a>] ? warn_slowpath_fmt+0x4a/0x50
Sep 30 09:14:24 router01 kernel: [ 34.570525] [<ffffffff812cc679>] ? ___ratelimit+0xa9/0x120
Sep 30 09:14:24 router01 kernel: [ 34.570591] [<ffffffff81541709>] ? skb_warn_bad_offload+0xc4/0xcd
Sep 30 09:14:24 router01 kernel: [ 34.570662] [<ffffffff81444c65>] ? skb_checksum_help+0x1a5/0x1c0
Sep 30 09:14:24 router01 kernel: [ 34.570729] [<ffffffff8144a8fc>] ? dev_hard_start_xmit+0x4bc/0x5f0
Sep 30 09:14:24 router01 kernel: [ 34.570798] [<ffffffff8144ad65>] ? __dev_queue_xmit+0x335/0x4d0
Sep 30 09:14:24 router01 kernel: [ 34.570864] [<ffffffffa034b895>] ? vlan_dev_hard_start_xmit+0x95/0x120 [8021q]
Sep 30 09:14:24 router01 kernel: [ 34.570950] [<ffffffff8144a77e>] ? dev_hard_start_xmit+0x33e/0x5f0
Sep 30 09:14:24 router01 kernel: [ 34.571022] [<ffffffff81486110>] ? ip_forward_options+0x210/0x210
Sep 30 09:14:24 router01 kernel: [ 34.571089] [<ffffffff8144ad65>] ? __dev_queue_xmit+0x335/0x4d0
Sep 30 09:14:24 router01 kernel: [ 34.571157] [<ffffffff81453076>] ? neigh_resolve_output+0x106/0x230
Sep 30 09:14:24 router01 kernel: [ 34.571225] [<ffffffff814880d6>] ? ip_finish_output+0x566/0x8d0
Sep 30 09:14:24 router01 kernel: [ 34.571291] [<ffffffff8148864a>] ? ip_queue_xmit+0x12a/0x3b0
Sep 30 09:14:24 router01 kernel: [ 34.571358] [<ffffffff8149f30e>] ? tcp_transmit_skb+0x41e/0x900
Sep 30 09:14:24 router01 kernel: [ 34.571424] [<ffffffff814a0330>] ? tcp_write_xmit+0x140/0xc40
Sep 30 09:14:24 router01 kernel: [ 34.571491] [<ffffffff814a0e9a>] ? __tcp_push_pending_frames+0x2a/0xc0
Sep 30 09:14:24 router01 kernel: [ 34.571559] [<ffffffff81492051>] ? tcp_sendmsg+0xc1/0xcc0
Sep 30 09:14:24 router01 kernel: [ 34.571626] [<ffffffff8142f1fe>] ? sock_aio_write+0xfe/0x120
Sep 30 09:14:24 router01 kernel: [ 34.571695] [<ffffffff811b9e5f>] ? do_sync_write+0x5f/0x90
Sep 30 09:14:24 router01 kernel: [ 34.571761] [<ffffffff811bac35>] ? vfs_write+0x1b5/0x1f0
Sep 30 09:14:24 router01 kernel: [ 34.571827] [<ffffffff811bb050>] ? SyS_write+0x50/0xb0
Sep 30 09:14:24 router01 kernel: [ 34.571893] [<ffffffff8154646d>] ? system_call_fast_compare_end+0x10/0x15
Sep 30 09:14:24 router01 kernel: [ 34.571966] ---[ end trace afe30ae0025efec6 ]---
Disabling all kinds of offloading seems to mitigate the issue:

pre-up ethtool -K eth0 tso off gso off gro off tx off rx off sg off rxvlan off txvlan off rxhash off || :
pre-up ethtool -K eth1 tso off gso off gro off tx off rx off sg off rxvlan off txvlan off rxhash off || :
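If several interfaces need the same treatment, the pre-up lines can be generated instead of typed by hand. The following is only a sketch: the helper names `ethtool_off_command` and `interfaces_pre_up` are invented for illustration, and the feature keywords are copied from the commands in this comment rather than taken from any authoritative list.

```python
import shlex

# Feature keywords exactly as used in the workaround above.
ALL_OFFLOADS = ["tso", "gso", "gro", "tx", "rx", "sg", "rxvlan", "txvlan", "rxhash"]


def ethtool_off_command(iface, features=ALL_OFFLOADS):
    """Build the argv for 'ethtool -K <iface> <feature> off ...' (not executed here)."""
    argv = ["ethtool", "-K", iface]
    for feature in features:
        argv += [feature, "off"]
    return argv


def interfaces_pre_up(iface, features=ALL_OFFLOADS):
    """Render the command as a pre-up line; '|| :' ignores a failing ethtool call."""
    quoted = " ".join(shlex.quote(arg) for arg in ethtool_off_command(iface, features))
    return "pre-up " + quoted + " || :"


if __name__ == "__main__":
    for iface in ("eth0", "eth1"):
        print(interfaces_pre_up(iface))
```

Passing a shorter feature list, for example `interfaces_pre_up("eth0", ["sg"])`, makes it easy to bisect which single offload is responsible.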
Further experiments point to scatter-gather. For me, the issue can be mitigated with:

pre-up ethtool -K eth0 sg off

############
Features for eth0:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: on
    tx-checksum-ip-generic: off [fixed]
    tx-checksum-ipv6: on
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: on
scatter-gather: off
    tx-scatter-gather: off
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
    tx-tcp-segmentation: off [requested on]
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp6-segmentation: off [requested on]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
#################
driver: igb
version: 5.0.5-k
firmware-version: 1.2.4
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
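Note that the feature dump above shows several entries as "off [requested on]", i.e. turning off sg also takes TSO and GSO down with it. A rough sketch for spotting such entries in saved `ethtool -k` output follows; the function names `parse_features` and `forced_off` are hypothetical, and the parser assumes the "name: on|off [note]" layout seen in this thread.

```python
def parse_features(text):
    """Map feature name -> (enabled, note) from 'ethtool -k' style text."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skips the 'Features for ethX:' header and blank lines
        state, _, note = value.partition(" ")
        features[name] = (state == "on", note.strip("[]"))
    return features


def forced_off(features):
    """Features the user asked for but the kernel reports as off."""
    return sorted(
        name for name, (enabled, note) in features.items()
        if not enabled and note == "requested on"
    )


# Excerpt of the eth0 dump from the comment above.
SAMPLE = """Features for eth0:
scatter-gather: off
    tx-scatter-gather: off
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
    tx-tcp-segmentation: off [requested on]
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp6-segmentation: off [requested on]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
highdma: on [fixed]
"""

if __name__ == "__main__":
    parsed = parse_features(SAMPLE)
    print("scatter-gather enabled:", parsed["scatter-gather"][0])
    print("forced off:", forced_off(parsed))
```

So the sg workaround is not free: with it in place the sender no longer offloads TCP segmentation on that interface.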
I also just started seeing this, on a Supermicro H8DGU. It's essentially the same setup - two e1000 NICs in 802.3ad bond mode with some VLANs running on top of them.

02:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
02:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)

I tried updating the system BIOS, as was suggested in one of the other bug reports about this, but as expected, it made no difference.

System Information
    Manufacturer: Supermicro
    Product Name: H8DGU
BIOS Information
    Vendor: American Megatrends Inc.
    Version: 3.5
    Release Date: 11/25/2013

# ethtool -i eth0
driver: igb
version: 5.0.5-k
firmware-version: 1.4.3

# uname -a
Linux fs01 3.16-0.bpo.2-amd64 #1 SMP Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux
I see this on Fedora 20 with both Intel and Broadcom NIC's, also with VLAN's on top of 802.3ad. Disabling scatter-gather works for me too. hostA # ethtool -i em1 | egrep '^(driver|version)' driver: tg3 version: 3.137 # ethtool -i em2 | egrep '^(driver|version)' driver: tg3 version: 3.137 # lspci | grep -i ethernet 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5717 Gigabit Ethernet PCIe (rev 10) 03:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5717 Gigabit Ethernet PCIe (rev 10) # uname -r 3.16.3-200.fc20.x86_64 hostB # ethtool -i em1 | egrep '^(driver|version)' driver: tg3 version: 3.137 # ethtool -i em2 | egrep '^(driver|version)' driver: tg3 version: 3.137 # ethtool -i em3 | egrep '^(driver|version)' driver: tg3 version: 3.137 # ethtool -i em4 | egrep '^(driver|version)' driver: tg3 version: 3.137 # lspci | grep -i ethernet 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 03:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 03:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 03:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) # uname -r 3.16.3-200.fc20.x86_64 hostC # ethtool -i em1 | egrep '^(driver|version)' driver: igb version: 5.0.5-k # ethtool -i em2 | egrep '^(driver|version)' driver: igb version: 5.0.5-k # ethtool -i em3 | egrep '^(driver|version)' driver: igb version: 5.0.5-k # ethtool -i em4 | egrep '^(driver|version)' driver: igb version: 5.0.5-k # ethtool -i p4p1 | egrep '^(driver|version)' driver: ixgbe version: 3.19.1-k # ethtool -i p4p2 | egrep '^(driver|version)' driver: ixgbe version: 3.19.1-k # lspci | grep -i ethernet 02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01) 02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01) 02:00.2 Ethernet controller: Intel 
Corporation I350 Gigabit Network Connection (rev 01) 02:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01) 03:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) 03:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01) # uname -r 3.16.3-200.fc20.x86_64 hostD # ethtool -i em1 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em2 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em3 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em4 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # lspci | grep -i ethernet 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 06:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 08:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 0a:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) # uname -r 3.16.2-201.fc20.x86_64 hostE # ethtool -i em1 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em2 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em3 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # ethtool -i em4 | egrep '^(driver|version)' driver: bnx2 version: 2.2.5 # lspci | grep -i ethernet 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 08:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) 0a:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12) # uname -r 3.16.3-200.fc20.x86_64
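The scatter-gather workaround mentioned at the top of this comment can be sketched as below. This is only an illustration: the helper name `sg_off_cmds` is made up, and the interface names are the two tg3 ports listed for hostA. It prints the commands instead of running them, so it can be reviewed first. On most drivers, turning scatter-gather off also turns TSO off, so expect higher CPU load.

```shell
#!/bin/sh
# Hypothetical helper: print the ethtool command that would disable
# scatter-gather on each interface given as an argument (dry run only).
sg_off_cmds() {
    for dev in "$@"; do
        printf 'ethtool -K %s sg off\n' "$dev"
    done
}

# Example for the two tg3 ports of hostA. Pipe the output to "sh" as root
# to actually apply the change.
sg_off_cmds em1 em2
```

The setting is not persistent, so it has to be reapplied after a reboot or a driver reload.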
Still seeing this on 3.16.5 kernel from debian sid, less than two minutes after booting up. [ 96.107164] ------------[ cut here ]------------ [ 96.107187] WARNING: CPU: 0 PID: 2989 at /build/linux-i5neKT/linux-3.16.5/net/core/dev.c:2246 skb_warn_bad_offload+0xc6/0xd1() [ 96.107196] : caps=(0x000000000419fba9, 0x00000000001b583b) len=1666 data_len=176 gso_size=1448 gso_type=1 ip_summed=3 [ 96.107201] Modules linked in: 8021q garp stp mrp llc bonding loop hid_generic usbhid hid sp5100_tco sr_mod cdrom acpi_cpufreq amd64_edac_mod ttm drm_kms_helper drm k10temp edac_mce_amd edac_core i2c_piix4 psmouse pcspkr evdev serio_raw kvm_amd kvm button processor tpm_tis tpm thermal_sys ext4 crc16 mbcache jbd2 dm_mod usb_storage ata_generic sg sd_mod ses crc_t10dif enclosure crct10dif_common pata_atiixp ahci libahci ohci_pci ehci_pci ohci_hcd ehci_hcd megaraid_sas libata igb i2c_algo_bit i2c_core scsi_mod dca ptp pps_core usbcore usb_common [ 96.107297] CPU: 0 PID: 2989 Comm: sshd Not tainted 3.16-3-amd64 #1 Debian 3.16.5-1 [ 96.107302] Hardware name: Supermicro H8DGU/H8DGU, BIOS 3.5 11/25/2013 [ 96.107306] 0000000000000009 ffffffff815066c3 ffff8804168b39d0 ffffffff81065717 [ 96.107314] ffff8800db980ae8 ffff8804168b3a20 0000000000000001 0000000000000003 [ 96.107321] ffff8800db980ae8 ffffffff8106577c ffffffff817743e8 0000000000000030 [ 96.107328] Call Trace: [ 96.107338] [<ffffffff815066c3>] ? dump_stack+0x41/0x51 [ 96.107348] [<ffffffff81065717>] ? warn_slowpath_common+0x77/0x90 [ 96.107355] [<ffffffff8106577c>] ? warn_slowpath_fmt+0x4c/0x50 [ 96.107363] [<ffffffff81507d5c>] ? skb_warn_bad_offload+0xc6/0xd1 [ 96.107372] [<ffffffff8141748c>] ? skb_checksum_help+0x17c/0x190 [ 96.107381] [<ffffffff8141a5bb>] ? dev_hard_start_xmit+0x47b/0x560 [ 96.107389] [<ffffffff8141a9e4>] ? __dev_queue_xmit+0x344/0x4c0 [ 96.107397] [<ffffffff8141a8c3>] ? __dev_queue_xmit+0x223/0x4c0 [ 96.107407] [<ffffffffa03e2d3c>] ? vlan_dev_hard_start_xmit+0x8c/0x110 [8021q] [ 96.107415] [<ffffffff8141a41f>] ? 
dev_hard_start_xmit+0x2df/0x560 [ 96.107423] [<ffffffff8141a9e4>] ? __dev_queue_xmit+0x344/0x4c0 [ 96.107433] [<ffffffff81453912>] ? ip_finish_output+0x3e2/0x840 [ 96.107440] [<ffffffff81454be2>] ? ip_queue_xmit+0x132/0x3a0 [ 96.107448] [<ffffffff8146afb6>] ? tcp_transmit_skb+0x456/0x8e0 [ 96.107456] [<ffffffff81408d38>] ? __alloc_skb+0x48/0x2a0 [ 96.107464] [<ffffffff8146b589>] ? tcp_write_xmit+0x149/0xd30 [ 96.107471] [<ffffffff8146c3ba>] ? __tcp_push_pending_frames+0x2a/0xc0 [ 96.107478] [<ffffffff8145dd31>] ? tcp_sendmsg+0xc1/0xd20 [ 96.107488] [<ffffffff813feede>] ? sock_aio_write+0xfe/0x130 [ 96.107497] [<ffffffff811a4d5c>] ? do_sync_write+0x5c/0x90 [ 96.107505] [<ffffffff811a5745>] ? vfs_write+0x195/0x1f0 [ 96.107512] [<ffffffff811a54d3>] ? vfs_read+0x93/0x170 [ 96.107519] [<ffffffff811a61a2>] ? SyS_write+0x42/0xa0 [ 96.107526] [<ffffffff81076ab5>] ? SyS_rt_sigprocmask+0x65/0xb0 [ 96.107533] [<ffffffff8150c7ad>] ? system_call_fast_compare_end+0x10/0x15 [ 96.107538] ---[ end trace 730ad1fc1573ae55 ]---
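For anyone comparing traces, the numeric fields on the warning line can be decoded with a small script. The sketch below is not part of any tool; the function names `field` and `decode_ip_summed` are made up for illustration. The `ip_summed` names are the CHECKSUM_* constants from include/linux/skbuff.h. In kernels of this era, gso_type=1 should correspond to SKB_GSO_TCPV4, but the gso_type bit layout has changed between kernel versions, so the script does not try to decode it.

```shell
#!/bin/sh
# Extract one numeric "key=value" field from a skb_warn_bad_offload line.
# The leading space in the pattern keeps "len" from matching "data_len".
field() {
    printf '%s\n' "$2" | sed -n "s/.* $1=\([0-9]*\).*/\1/p"
}

# Map ip_summed to the CHECKSUM_* names from include/linux/skbuff.h.
decode_ip_summed() {
    case $1 in
        0) echo CHECKSUM_NONE ;;
        1) echo CHECKSUM_UNNECESSARY ;;
        2) echo CHECKSUM_COMPLETE ;;
        3) echo CHECKSUM_PARTIAL ;;
        *) echo unknown ;;
    esac
}

# Warning line taken from the trace above.
line='caps=(0x000000000419fba9, 0x00000000001b583b) len=1666 data_len=176 gso_size=1448 gso_type=1 ip_summed=3'

decode_ip_summed "$(field ip_summed "$line")"
# len minus data_len is the size of the linear (non-paged) part of the skb.
echo "linear bytes: $(( $(field len "$line") - $(field data_len "$line") ))"
```

For the line above this reports CHECKSUM_PARTIAL and a linear part of 1490 bytes. The bridged trace later in this thread shows ip_summed=1 (CHECKSUM_UNNECESSARY) instead, which would fit a packet that was aggregated on receive and then forwarded.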
This looks like it may have been fixed in 3.16.7. I have installed linux-image-3.16.0-4-amd64 on one of my affected systems, and over 6 hours later there is still no sign of the call trace. Looks good so far.
If this bug is confirmed to be fixed in 3.16.7, does anybody know which commit fixed it?
(In reply to Jean Delvare from comment #15)
> If this bug is confirmed to be fixed in 3.16.7

I doubt it: https://apibugzilla.novell.com/show_bug.cgi?id=924121#c0
Link above seems to require authentication. In any case, I have not seen this bug reoccur since installing linux-image-3.16.0-4-amd64 (or any of the subsequent updates), over six months ago.
(In reply to Daniel Swarbrick from comment #17)
> Link above seems to require authentication.

Bad link then :/ Here is the right one: https://bugzilla.novell.com/show_bug.cgi?id=924121#c0
The api prefix is for internal use only.

> In any case, I have not seen this bug reoccur since installing
> linux-image-3.16.0-4-amd64 (or any of the subsequent updates), over six
> months ago.

Either Ubuntu backported some fix to the kernel, or it is a completely different bug.
I can confirm this bug with the CentOS 7.1 distro kernel and the following NIC with the ixgbe driver:
===
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
===
Bonding is in use:
===
BONDING_OPTS="mode=802.3ad xmit_hash_policy=1 miimon=100 lacp_rate=1"
===
VLANs over the bond, bridges, and virtio interfaces in the bridges for KVM are also in use. Either the host or the guest can produce the warning, depending on which NIC offloads are enabled or disabled.
I narrowed my instance of the bug down to LRO: disabling LRO on both slave interfaces makes the warning go away. Linux kernel 4.0.4 is affected by this bug as well.
A probably related note from the ixgbe README:
===
WARNING: The ixgbe driver compiles by default with the LRO (Large Receive Offload) feature enabled. This option offers the lowest CPU utilization for receives, but is completely incompatible with *routing/ip forwarding* and *bridging*. If enabling ip forwarding or bridging is a requirement, it is necessary to disable LRO using compile time options as noted in the LRO section later in this document. The result of not disabling LRO when combined with ip forwarding or bridging can be low throughput or even a kernel panic.
===
Same here with kernel 3.10.87, using KVM VMs on top of a physical -> bond (802.3ad, layer3+4) -> VLAN -> bridge stack. There are four interfaces within the bond, some of them are Intel 82576 (igb) and some are Intel 10-Gigabit X540-AT2 (ixgbe). According to ethtool, the 82576 have LRO off (fixed) and the X540-AT2 have LRO on by default, which leads to the following dmesg warning: [ 1672.360854] ------------[ cut here ]------------ [ 1672.360870] WARNING: at net/core/dev.c:2194 skb_warn_bad_offload+0xc8/0xd3() [ 1672.360875] : caps=(0x0000000040004849, 0x0000000000000000) len=549 data_len=495 gso_size=495 gso_type=1 ip_summed=1 [ 1672.360878] Modules linked in: bridge stp llc bonding xt_physdev xt_addrtype xt_pkttype xt_conntrack xt_NFLOG nfnetlink_log nfnetlink xt_hashlimit vhost_net macvtap macvlan nf_conntrack_ftp iTCO_wdt iTCO_vendor_support coretemp mperf freq_table crc32_pclmul crc32c_intel ghash_clmulni_intel microcode kvm_intel ses enclosure sb_edac igb edac_core i2c_i801 ixgbe i2c_algo_bit lpc_ich i2c_core mfd_core mdio shpchp pci_hotplug button xts aesni_intel lrw gf128mul glue_helper ablk_helper cryptd aes_x86_64 sha256_generic iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi tg3 ptp pps_core hwmon libphy e1000 fuse nfs lockd sunrpc fscache reiserfs btrfs xor lzo_compress zlib_deflate raid6_pq ext4 jbd2 ext3 jbd dm_crypt dm_mirror dm_region_hash dm_log firewire_core sl811_hcd xhci_hcd ohci_hcd uhci_hcd usb_storage [ 1672.360942] ehci_pci ehci_hcd mpt2sas raid_class aic94xx libsas qla2xxx megaraid_sas megaraid_mbox megaraid_mm megaraid aacraid sx8 DAC960 hpsa cciss 3w_9xxx 3w_xxxx mptsas scsi_transport_sas mptfc scsi_transport_fc scsi_tgt mptspi mptscsih mptbase atp870u dc395x qla1280 dmx3191d sym53c8xx gdth advansys initio BusLogic arcmsr aic7xxx aic79xx scsi_transport_spi sg pdc_adma sata_inic162x sata_mv ata_piix ahci libahci sata_qstor sata_vsc sata_uli sata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise pata_sl82c105 
pata_cs5530 pata_cs5520 pata_via pata_jmicron pata_marvell pata_sis pata_netcell pata_sc1200 pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_ns87415 pata_ns87410 pata_serverworks pata_cypress pata_artop pata_it821x pata_optidma [ 1672.360999] pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix libata [ 1672.361012] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.10.87-gentoo #1 [ 1672.361016] Hardware name: Supermicro X9DRH-7TF/7F/iTF/iF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0a 12/27/2013 [ 1672.361019] ffffffff8157a93d 0000000000000000 ffffffff81074eef ffff881fcf50c800 [ 1672.361024] ffff883fcfd4c000 0000000000000001 0000000000000001 ffff883fd046c600 [ 1672.361028] ffffffff81074fda ffffffff817657c0 0000000000000030 ffff881fffc036e8 [ 1672.361033] Call Trace: [ 1672.361036] <IRQ> [<ffffffff8157a93d>] ? dump_stack+0xd/0x17 [ 1672.361046] [<ffffffff81074eef>] ? warn_slowpath_common+0x6f/0xa0 [ 1672.361051] [<ffffffff81074fda>] ? warn_slowpath_fmt+0x4a/0x50 [ 1672.361056] [<ffffffff812e316c>] ? ___ratelimit+0xac/0x120 [ 1672.361061] [<ffffffff8157bffe>] ? skb_warn_bad_offload+0xc8/0xd3 [ 1672.361066] [<ffffffff810a32b8>] ? __wake_up+0x48/0x70 [ 1672.361071] [<ffffffff81481211>] ? __skb_gso_segment+0x71/0xc0 [ 1672.361076] [<ffffffff814813f4>] ? dev_hard_start_xmit+0x194/0x500 [ 1672.361080] [<ffffffff8149c44d>] ? sch_direct_xmit+0xfd/0x1d0 [ 1672.361085] [<ffffffff81481959>] ? dev_queue_xmit+0x1f9/0x460 [ 1672.361092] [<ffffffffa0926378>] ? br_dev_queue_push_xmit+0x88/0xc0 [bridge] [ 1672.361098] [<ffffffffa092d2b3>] ? br_sysfs_delbr+0x5a3/0x1c10 [bridge] [ 1672.361104] [<ffffffffa09262f0>] ? br_fdb_delete+0x500/0x500 [bridge] [ 1672.361109] [<ffffffff814aaae5>] ? nf_iterate+0x95/0xc0 [ 1672.361114] [<ffffffffa092d3c0>] ? br_sysfs_delbr+0x6b0/0x1c10 [bridge] [ 1672.361120] [<ffffffffa09262f0>] ? 
br_fdb_delete+0x500/0x500 [bridge] [ 1672.361124] [<ffffffff814aab86>] ? nf_hook_slow+0x76/0x140 [ 1672.361129] [<ffffffffa09262f0>] ? br_fdb_delete+0x500/0x500 [bridge] [ 1672.361135] [<ffffffffa09263b0>] ? br_dev_queue_push_xmit+0xc0/0xc0 [bridge] [ 1672.361141] [<ffffffffa09263fa>] ? br_forward_finish+0x4a/0x1d0 [bridge] [ 1672.361146] [<ffffffffa092d461>] ? br_sysfs_delbr+0x751/0x1c10 [bridge] [ 1672.361151] [<ffffffffa092e016>] ? br_sysfs_delbr+0x1306/0x1c10 [bridge] [ 1672.361157] [<ffffffffa09263b0>] ? br_dev_queue_push_xmit+0xc0/0xc0 [bridge] [ 1672.361167] [<ffffffff814aaae5>] ? nf_iterate+0x95/0xc0 [ 1672.361173] [<ffffffffa09263b0>] ? br_dev_queue_push_xmit+0xc0/0xc0 [bridge] [ 1672.361177] [<ffffffff814aab86>] ? nf_hook_slow+0x76/0x140 [ 1672.361182] [<ffffffffa09263b0>] ? br_dev_queue_push_xmit+0xc0/0xc0 [bridge] [ 1672.361188] [<ffffffffa09264a8>] ? br_forward_finish+0xf8/0x1d0 [bridge] [ 1672.361193] [<ffffffff81471c56>] ? skb_clone+0x46/0xc0 [ 1672.361197] [<ffffffff814701d9>] ? __skb_clone+0x29/0x110 [ 1672.361202] [<ffffffffa0926410>] ? br_forward_finish+0x60/0x1d0 [bridge] [ 1672.361207] [<ffffffffa092600b>] ? br_fdb_delete+0x21b/0x500 [bridge] [ 1672.361213] [<ffffffffa0927489>] ? br_handle_frame_finish+0x289/0x340 [bridge] [ 1672.361219] [<ffffffffa092db78>] ? br_sysfs_delbr+0xe68/0x1c10 [bridge] [ 1672.361225] [<ffffffffa092e521>] ? br_sysfs_delbr+0x1811/0x1c10 [bridge] [ 1672.361230] [<ffffffff810b2491>] ? find_busiest_group+0x111/0xaa0 [ 1672.361236] [<ffffffffa0927200>] ? br_net_exit+0xe0/0xe0 [bridge] [ 1672.361240] [<ffffffff814aaae5>] ? nf_iterate+0x95/0xc0 [ 1672.361251] [<ffffffffa0927200>] ? br_net_exit+0xe0/0xe0 [bridge] [ 1672.361255] [<ffffffff814aab86>] ? nf_hook_slow+0x76/0x140 [ 1672.361260] [<ffffffffa0927200>] ? br_net_exit+0xe0/0xe0 [bridge] [ 1672.361266] [<ffffffffa09276f0>] ? br_handle_frame+0x1b0/0x950 [bridge] [ 1672.361271] [<ffffffff8147e8c4>] ? __netif_receive_skb_core+0x214/0x750 [ 1672.361276] [<ffffffff8103f875>] ? 
read_tsc+0x5/0x20 [ 1672.361280] [<ffffffff8147f01f>] ? netif_receive_skb+0x1f/0x90 [ 1672.361285] [<ffffffff8147fec0>] ? napi_gro_receive+0xa0/0xe0 [ 1672.361293] [<ffffffffa089a970>] ? ixgbe_poll+0xab0/0x1160 [ixgbe] [ 1672.361298] [<ffffffff8147f2ef>] ? net_rx_action+0xaf/0x1b0 [ 1672.361303] [<ffffffff8107cf4e>] ? __do_softirq+0xde/0x210 [ 1672.361308] [<ffffffff815816dc>] ? call_softirq+0x1c/0x30 [ 1672.361314] [<ffffffff81039a15>] ? do_softirq+0x65/0xa0 [ 1672.361319] [<ffffffff8107d1ce>] ? irq_exit+0x8e/0xb0 [ 1672.361324] [<ffffffff81581de0>] ? do_IRQ+0x60/0xe0 [ 1672.361329] [<ffffffff8157f82d>] ? common_interrupt+0x6d/0x6d [ 1672.361332] <EOI> [<ffffffff8105c33f>] ? lapic_next_deadline+0x2f/0x40 [ 1672.361340] [<ffffffff8143cf4e>] ? cpuidle_enter_state+0x5e/0xf0 [ 1672.361345] [<ffffffff8143cf47>] ? cpuidle_enter_state+0x57/0xf0 [ 1672.361350] [<ffffffff8143d096>] ? cpuidle_idle_call+0xb6/0x210 [ 1672.361355] [<ffffffff810412ee>] ? arch_cpu_idle+0xe/0x30 [ 1672.361361] [<ffffffff810bacd1>] ? cpu_startup_entry+0xe1/0x270 [ 1672.361366] [<ffffffff818ede84>] ? start_kernel+0x406/0x411 [ 1672.361371] [<ffffffff818ed895>] ? repair_env_string+0x5b/0x5b [ 1672.361376] [<ffffffff818ed6a8>] ? x86_64_start_kernel+0xf6/0x105 [ 1672.361379] ---[ end trace d1f41b5049d41d9d ]--- After disabling LRO on the X540-AT2 (ixgbe) interfaces with the help of ethtool (ethtool -K ethX lro off), the problem disappears and the network within the VM is OK again.
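With several slaves in a bond, the `ethtool -K ethX lro off` step above has to be repeated for each slave. A minimal sketch, assuming the bond is named bond0 and that the bonding driver exposes its slave list in /sys/class/net/<bond>/bonding/slaves: the helper name `lro_off_for_bond` and its optional second argument are made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print "ethtool -K <slave> lro off" for every slave
# of a bonding device. The slave list is read from the bonding sysfs file.
# The optional second argument overrides that path (handy for dry runs).
lro_off_for_bond() {
    bond=$1
    slaves_file=${2:-/sys/class/net/$bond/bonding/slaves}
    if [ ! -r "$slaves_file" ]; then
        echo "cannot read $slaves_file" >&2
        return 1
    fi
    # The file holds the slave names separated by whitespace.
    for dev in $(cat "$slaves_file"); do
        printf 'ethtool -K %s lro off\n' "$dev"
    done
}
```

Running `lro_off_for_bond bond0 | sh` as root would apply the change. Slaves that report LRO as "off [fixed]", like the 82576 ports here, do not need it, and ethtool may report that it could not change anything on them.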
I'm experiencing a similar bug and this is happening in a *guest* running on KVM (qemu-kvm-1.5.3-105), the host has no any log of errors like that; it has a similar network configuration to "comment 22". This guest machine is configured as a gateway for physical and virtual machines, and this error only happens when a physical client tries to connect through it. Any suggestion to fix it? Thanks. Guest info: CentOS 7.3.1611 (Core) Kernel 3.10.0-514.10.2.el7.x86_64 Mar 22 20:37:10 HOSTNAME kernel: ------------[ cut here ]------------ Mar 22 20:37:10 HOSTNAME kernel: WARNING: at net/core/dev.c:2402 skb_warn_bad_offload+0xcd/0xda() Mar 22 20:37:10 HOSTNAME kernel: virtio_net: caps=(0x00000001001b4a29, 0x0000000000000000) len=1434 data_len=1306 gso_size=1368 gso_type=5 ip_summed=1 Mar 22 20:37:10 HOSTNAME kernel: Modules linked in: nfsv3 nfs fscache nf_conntrack_netbios_ns nf_conntrack_broadcast xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_mark ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ext4 mbcache jbd2 iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw ppdev gf128mul glue_helper ablk_helper cryptd parport_pc pcspkr sg virtio_balloon parport i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables Mar 22 20:37:10 HOSTNAME kernel: xfs libcrc32c sr_mod cdrom ata_generic pata_acpi cirrus drm_kms_helper virtio_blk virtio_console virtio_net crct10dif_pclmul crct10dif_common syscopyarea ata_piix sysfillrect sysimgblt crc32c_intel fb_sys_fops ttm serio_raw virtio_pci drm virtio_ring virtio libata i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod Mar 22 20:37:10 HOSTNAME 
kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W ------------ 3.10.0-514.10.2.el7.x86_64 #1 Mar 22 20:37:10 HOSTNAME kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Mar 22 20:37:10 HOSTNAME kernel: ffff8801460839a8 84faf728ec725b11 ffff880146083960 ffffffff816864ef Mar 22 20:37:10 HOSTNAME kernel: ffff880146083998 ffffffff81085940 ffff8800b3b45900 ffff88013bb8b000 Mar 22 20:37:10 HOSTNAME kernel: 0000000000000005 0000000000000001 0000000000000000 ffff880146083a00 Mar 22 20:37:10 HOSTNAME kernel: Call Trace: Mar 22 20:37:10 HOSTNAME kernel: <IRQ> [<ffffffff816864ef>] dump_stack+0x19/0x1b Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81085940>] warn_slowpath_common+0x70/0xb0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff810859dc>] warn_slowpath_fmt+0x5c/0x80 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8131e833>] ? ___ratelimit+0x93/0x100 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81688f2a>] skb_warn_bad_offload+0xcd/0xda Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81571609>] __skb_gso_segment+0x79/0xb0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815719d5>] validate_xmit_skb.part.94+0x135/0x2f0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8157217d>] __dev_queue_xmit+0x4cd/0x570 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81572230>] dev_queue_xmit+0x10/0x20 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b5c66>] ip_finish_output+0x466/0x750 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b6c63>] ip_output+0x73/0xe0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b5800>] ? ip_fragment.constprop.54+0x90/0x90 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b2716>] ip_forward_finish+0x66/0x80 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b2a97>] ip_forward+0x367/0x470 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b26b0>] ? ip_frag_mem+0x40/0x40 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b06fa>] ip_rcv_finish+0x8a/0x350 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b1086>] ip_rcv+0x2b6/0x410 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815b0670>] ? 
inet_del_offload+0x40/0x40 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8156fab2>] __netif_receive_skb_core+0x582/0x800 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815dc074>] ? tcp4_gro_receive+0x134/0x1b0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8156fd48>] __netif_receive_skb+0x18/0x60 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8156fdd0>] netif_receive_skb_internal+0x40/0xc0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81570f58>] napi_gro_receive+0xd8/0x130 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffffa016a9d5>] virtnet_poll+0x265/0x750 [virtio_net] Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff815705e0>] net_rx_action+0x170/0x380 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8108f2cf>] __do_softirq+0xef/0x280 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8169859c>] call_softirq+0x1c/0x30 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8102d365>] do_softirq+0x65/0xa0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8108f665>] irq_exit+0x115/0x120 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81699138>] do_IRQ+0x58/0xf0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8168e2ad>] common_interrupt+0x6d/0x6d Mar 22 20:37:10 HOSTNAME kernel: <EOI> [<ffffffff81060fe6>] ? 
native_safe_halt+0x6/0x10 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff810347bf>] default_idle+0x1f/0xc0 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff81035106>] arch_cpu_idle+0x26/0x30 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff810e7e65>] cpu_startup_entry+0x245/0x290 Mar 22 20:37:10 HOSTNAME kernel: [<ffffffff8104f07a>] start_secondary+0x1ba/0x230 Mar 22 20:37:10 HOSTNAME kernel: ---[ end trace 4da754ea16f54ff7 ]--- # ethtool -k eth0 Features for eth0: rx-checksumming: on [fixed] tx-checksumming: on tx-checksum-ipv4: off [fixed] tx-checksum-ip-generic: on tx-checksum-ipv6: off [fixed] tx-checksum-fcoe-crc: off [fixed] tx-checksum-sctp: off [fixed] scatter-gather: on tx-scatter-gather: on tx-scatter-gather-fraglist: off [fixed] tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: on tx-tcp6-segmentation: on udp-fragmentation-offload: on generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: off [fixed] rx-vlan-offload: off [fixed] tx-vlan-offload: off [fixed] ntuple-filters: off [fixed] receive-hashing: off [fixed] highdma: on [fixed] rx-vlan-filter: on [fixed] vlan-challenged: off [fixed] tx-lockless: off [fixed] netns-local: off [fixed] tx-gso-robust: off [fixed] tx-fcoe-segmentation: off [fixed] tx-gre-segmentation: off [fixed] tx-ipip-segmentation: off [fixed] tx-sit-segmentation: off [fixed] tx-udp_tnl-segmentation: off [fixed] tx-mpls-segmentation: off [fixed] fcoe-mtu: off [fixed] tx-nocache-copy: off loopback: off [fixed] rx-fcs: off [fixed] rx-all: off [fixed] tx-vlan-stag-hw-insert: off [fixed] rx-vlan-stag-hw-parse: off [fixed] rx-vlan-stag-filter: off [fixed] busy-poll: off [fixed] tx-sctp-segmentation: off [fixed] l2-fwd-offload: off [fixed] hw-tc-offload: off [fixed] # ethtool -i eth0 driver: virtio_net version: 1.0.0 firmware-version: expansion-rom-version: bus-info: 0000:00:07.0 supports-statistics: no supports-test: no supports-eeprom-access: no supports-register-dump: no supports-priv-flags: no
(In reply to c4rl from comment #23)
> I'm experiencing a similar bug and this is happening in a *guest* running on
> KVM (qemu-kvm-1.5.3-105), the host has no any log of errors like that; it
> has a similar network configuration to "comment 22". This guest machine is
> configured as a gateway for physical and virtual machines, and this error
> only happens when a physical client tries to connect through it.
>
> Any suggestion to fix it? Thanks.

Once LRO (large receive offload) is disabled, the network works properly again. Note that the setting must be changed on the *host* interfaces.

# ethtool -K eth0 lro off
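Before changing anything on the host, it may help to check whether LRO is enabled there at all. The guest's virtio_net interface reports "large-receive-offload: off [fixed]", so the guest-side listing says nothing about the physical NICs. A minimal sketch follows; the helper name `lro_state` is made up, and it only filters the output of `ethtool -k`.

```shell
#!/bin/sh
# Hypothetical helper: read "ethtool -k <dev>" output on stdin and print
# only the state of large-receive-offload, e.g. "on" or "off [fixed]".
lro_state() {
    sed -n 's/^[[:space:]]*large-receive-offload:[[:space:]]*//p'
}
```

Typical use on the host would be `ethtool -k eth0 | lro_state`, repeated for each physical interface behind the bridge or bond.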
I have the same issue with an Intel N3150 CPU on the 4.9.32-0-hardened kernel (Alpine Linux). I'm using KVM virtualization for a single pfSense VM, with 2 bridges connected to it. The network config is as follows:

auto lo
iface lo inet loopback

auto wan
iface wan inet manual
    bridge-ports eth0

auto lan
iface lan inet static
    bridge-ports eth1 eth2
    address 192.168.1.99
    netmask 255.255.255.0
    gateway 192.168.1.1
    dns-nameservers 192.168.1.1

eth0 is the onboard Realtek NIC, eth1 is a PCIe Intel NIC and eth2 is a USB 100M NIC. Virtio is used as the VM NIC model. If more details are needed, just let me know.