Bug 201423 - eth0: hw csum failure
Summary: eth0: hw csum failure
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: Intel Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-14 10:42 UTC by Fabio Rossi
Modified: 2018-10-31 00:26 UTC

See Also:
Kernel Version: 4.19.0-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Description Fabio Rossi 2018-10-14 10:42:48 UTC
I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the Ethernet ports. I get the following error message:

[  433.727397] eth0: hw csum failure
[  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
[  433.727406] Hardware name: System manufacturer System Product Name/P6T DELUXE V2, BIOS 1202    12/22/2010
[  433.727407] Call Trace:
[  433.727409]  <IRQ>
[  433.727415]  dump_stack+0x46/0x5b
[  433.727419]  __skb_checksum_complete+0xb0/0xc0
[  433.727423]  tcp_v4_rcv+0x528/0xb60
[  433.727426]  ? ipt_do_table+0x2d0/0x400
[  433.727429]  ip_local_deliver_finish+0x5a/0x110
[  433.727430]  ip_local_deliver+0xe1/0xf0
[  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
[  433.727432]  ip_rcv+0xca/0xe0
[  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
[  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
[  433.727438]  netif_receive_skb_internal+0x4e/0x130
[  433.727439]  napi_gro_receive+0x6a/0x80
[  433.727442]  sky2_poll+0x707/0xd20
[  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
[  433.727447]  net_rx_action+0x237/0x380
[  433.727449]  __do_softirq+0xdc/0x1e0
[  433.727452]  irq_exit+0xa9/0xb0
[  433.727453]  do_IRQ+0x45/0xc0
[  433.727455]  common_interrupt+0xf/0xf
[  433.727456]  </IRQ>
[  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
[  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
[  433.727462] RSP: 0000:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffde
[  433.727463] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX: 000000000000001f
[  433.727464] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI: 0000000000000000
[  433.727465] RBP: ffff880237b263a0 R08: 0000000000000714 R09: 000000650512105d
[  433.727465] R10: 00000000ffffffff R11: 0000000000000342 R12: 00000064fc2a8b1c
[  433.727466] R13: 00000064fc25b35f R14: 0000000000000004 R15: ffffffff8204af20
[  433.727468]  ? cpuidle_enter_state+0x119/0x200
[  433.727471]  do_idle+0x1bf/0x200
[  433.727473]  cpu_startup_entry+0x6a/0x70
[  433.727475]  start_secondary+0x17f/0x1c0
[  433.727476]  secondary_startup_64+0xa4/0xb0
[  441.662954] eth0: hw csum failure
[  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
[  441.662960] Hardware name: System manufacturer System Product Name/P6T DELUXE V2, BIOS 1202    12/22/2010
[  441.662960] Call Trace:
[  441.662963]  <IRQ>
[  441.662968]  dump_stack+0x46/0x5b
[  441.662972]  __skb_checksum_complete+0xb0/0xc0
[  441.662975]  tcp_v4_rcv+0x528/0xb60
[  441.662979]  ? ipt_do_table+0x2d0/0x400
[  441.662981]  ip_local_deliver_finish+0x5a/0x110
[  441.662983]  ip_local_deliver+0xe1/0xf0
[  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
[  441.662986]  ip_rcv+0xca/0xe0
[  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
[  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
[  441.662993]  netif_receive_skb_internal+0x4e/0x130
[  441.662994]  napi_gro_receive+0x6a/0x80
[  441.662998]  sky2_poll+0x707/0xd20
[  441.663000]  net_rx_action+0x237/0x380
[  441.663002]  __do_softirq+0xdc/0x1e0
[  441.663005]  irq_exit+0xa9/0xb0
[  441.663007]  do_IRQ+0x45/0xc0
[  441.663009]  common_interrupt+0xf/0xf
[  441.663010]  </IRQ>
[  441.663012] RIP: 0010:merge+0x22/0xb0
[  441.663014] Code: c3 31 c0 c3 90 90 90 90 41 56 41 55 41 54 55 48 89 d5 53 48 89 cb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 <48> 85 c9 74 70 48 85 d2 74 6b 49 89 fd 49 89 f6 49 89 e4 eb 14 48
[  441.663015] RSP: 0018:ffffc9000090b988 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
[  441.663017] RAX: 0000000000000000 RBX: ffff88021ab2d408 RCX: ffff88021ab2d408
[  441.663018] RDX: ffff88021ab2d388 RSI: ffffffffa021c440 RDI: 0000000000000000
[  441.663019] RBP: ffff88021ab2d388 R08: 0000000000005ecf R09: 0000000000008500
[  441.663020] R10: ffffea000877ec00 R11: ffff880236803500 R12: ffffffffa021c440
[  441.663021] R13: ffff88021ab2d448 R14: 0000000000000004 R15: ffffc9000090b9e0
[  441.663048]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663063]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663065]  ? merge+0x57/0xb0
[  441.663080]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663082]  list_sort+0x8b/0x230
[  441.663094]  radeon_cs_parser_fini+0xdf/0x110 [radeon]
[  441.663110]  radeon_cs_ioctl+0x2a4/0x710 [radeon]
[  441.663113]  ? __switch_to_asm+0x34/0x70
[  441.663114]  ? __switch_to_asm+0x40/0x70
[  441.663130]  ? radeon_cs_parser_init+0x20/0x20 [radeon]
[  441.663141]  drm_ioctl_kernel+0xa3/0xe0 [drm]
[  441.663149]  drm_ioctl+0x2e2/0x380 [drm]
[  441.663164]  ? radeon_cs_parser_init+0x20/0x20 [radeon]
[  441.663168]  ? page_add_new_anon_rmap+0x42/0x70
[  441.663171]  do_vfs_ioctl+0x9a/0x600
[  441.663173]  ksys_ioctl+0x35/0x60
[  441.663175]  __x64_sys_ioctl+0x11/0x20
[  441.663177]  do_syscall_64+0x3d/0xf0
[  441.663179]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  441.663180] RIP: 0033:0x7f9377377f37
[  441.663182] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 ad db 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 4f 2c 00 f7 d8 64 89 01 48
[  441.663183] RSP: 002b:00007f92c3130d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  441.663185] RAX: ffffffffffffffda RBX: 0000564498327ec0 RCX: 00007f9377377f37
[  441.663186] RDX: 0000564498337ec8 RSI: 00000000c0206466 RDI: 0000000000000010
[  441.663186] RBP: 0000564498337ec8 R08: 0000000000000000 R09: 0000000000000000
[  441.663187] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c0206466
[  441.663188] R13: 0000000000000010 R14: 0000000000000000 R15: 0000564497a38120
[  462.833418] eth0: hw csum failure
[  462.833428] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
[  462.833429] Hardware name: System manufacturer System Product Name/P6T DELUXE V2, BIOS 1202    12/22/2010
[  462.833429] Call Trace:
[  462.833432]  <IRQ>
[  462.833438]  dump_stack+0x46/0x5b
[  462.833442]  __skb_checksum_complete+0xb0/0xc0
[  462.833446]  tcp_v4_rcv+0x528/0xb60
[  462.833449]  ? ipt_do_table+0x2d0/0x400
[  462.833452]  ip_local_deliver_finish+0x5a/0x110
[  462.833454]  ip_local_deliver+0xe1/0xf0
[  462.833455]  ? ip_sublist_rcv_finish+0x60/0x60
[  462.833457]  ip_rcv+0xca/0xe0
[  462.833459]  ? ip_rcv_finish_core.isra.0+0x300/0x300
[  462.833461]  __netif_receive_skb_one_core+0x4b/0x70
[  462.833464]  netif_receive_skb_internal+0x4e/0x130
[  462.833466]  napi_gro_receive+0x6a/0x80
[  462.833469]  sky2_poll+0x707/0xd20
[  462.833471]  net_rx_action+0x237/0x380
[  462.833474]  __do_softirq+0xdc/0x1e0
[  462.833477]  irq_exit+0xa9/0xb0
[  462.833479]  do_IRQ+0x45/0xc0
[  462.833481]  common_interrupt+0xf/0xf
[  462.833482]  </IRQ>
[  462.833486] RIP: 0010:cpuidle_enter_state+0x124/0x200
[  462.833488] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
[  462.833489] RSP: 0018:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffde
[  462.833491] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX: 000000000000001f
[  462.833492] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI: 0000000000000000
[  462.833493] RBP: ffff880237b263a0 R08: 0000000000000000 R09: 0000000000000000
[  462.833494] R10: 00000000ffffffff R11: 0000000000000273 R12: 0000006bc3052131
[  462.833495] R13: 0000006bc2f99f57 R14: 0000000000000004 R15: ffffffff8204af20
[  462.833498]  ? cpuidle_enter_state+0x119/0x200
[  462.833503]  do_idle+0x1bf/0x200
[  462.833506]  cpu_startup_entry+0x6a/0x70
[  462.833510]  start_secondary+0x17f/0x1c0
[  462.833513]  secondary_startup_64+0xa4/0xb0

Something changed between 4.17.12 and 4.18. After bisecting the problem, I got the following first bad commit:

commit 88078d98d1bb085d72af8437707279e203524fa5
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Apr 18 11:43:15 2018 -0700

    net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
    
    After working on IP defragmentation lately, I found that some large
    packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
    zero paddings on the last (small) fragment.
    
    While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
    to CHECKSUM_NONE, forcing a full csum validation, even if all prior
    fragments had CHECKSUM_COMPLETE set.
    
    We can instead compute the checksum of the part we are trimming,
    usually smaller than the part we keep.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
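
For readers unfamiliar with the idea in the commit message above, here is a minimal user-space sketch of "compute the checksum of the part we are trimming": the ones' complement sum of the kept bytes is derived from the sum over the whole buffer by subtracting the sum of the trimmed tail. This is not the kernel's code. The helper names (csum_bytes, csum_sub, csum_after_trim) are hypothetical, and the sketch assumes a linear buffer and an even trim offset, whereas the real code also has to handle odd offsets and non-linear skbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator down to a 16-bit ones' complement sum. */
static uint16_t fold(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones' complement sum of a byte range, taken as big-endian 16-bit
 * words (RFC 1071 style). The result is NOT inverted. */
static uint16_t csum_bytes(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)buf[i] << 8;
    return fold(sum);
}

/* a - b in ones' complement arithmetic: add the bitwise complement of b.
 * Note that 0x0000 and 0xffff both represent zero in this arithmetic. */
static uint16_t csum_sub(uint16_t a, uint16_t b)
{
    return fold((uint64_t)a + (uint16_t)~b);
}

/* Sum of buf[0..keep), derived from the sum over buf[0..len) by
 * summing only the trimmed tail buf[keep..len).
 * Assumption of this sketch: keep is even, so word alignment of the
 * tail matches the alignment used for the full sum. */
static uint16_t csum_after_trim(uint16_t full_sum, const uint8_t *buf,
                                size_t len, size_t keep)
{
    assert(keep <= len && keep % 2 == 0);
    return csum_sub(full_sum, csum_bytes(buf + keep, len - keep));
}
```

The point of the approach is cost: when the trimmed tail is a few bytes of padding, summing only that tail is much cheaper than re-summing everything that is kept. It also means the result is only as good as the full sum the NIC reported.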
Comment 1 Fabio Rossi 2018-10-15 22:31:00 UTC
Other useful info:

# dmesg | grep -i sky2
[    0.545661] sky2: driver version 1.30
[    0.545781] sky2 0000:06:00.0: Yukon-2 EC Ultra chip revision 3
[    0.546067] sky2 0000:06:00.0 eth0: addr 00:24:8c:xx:xx:xx
[    0.546188] sky2 0000:04:00.0: Yukon-2 EC Ultra chip revision 3
[    0.546484] sky2 0000:04:00.0 eth1: addr 00:24:8c:xx:xx:xx
[   38.043074] sky2 0000:06:00.0 eth0: enabling interface
[   39.842161] sky2 0000:06:00.0 eth0: Link is up at 100 Mbps, full duplex, flow control rx
Comment 2 Fabio Rossi 2018-10-31 00:26:47 UTC
I applied commit db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 on top of kernel 4.19.0, which seems to solve the problem.