Bug 206611 - NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
Summary: NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: Intel Linux
Importance: P1 normal
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-20 14:39 UTC by gima+kernelbugzilla
Modified: 2020-07-06 07:09 UTC
CC List: 3 users

See Also:
Kernel Version: 5.5.4.arch1-1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg error (5.81 KB, text/plain)
2020-07-06 07:08 UTC, Roberto Viola
full dmesg (79.21 KB, text/plain)
2020-07-06 07:09 UTC, Roberto Viola

Description gima+kernelbugzilla 2020-02-20 14:39:47 UTC
# Bug description:
At random intervals, the e1000e adapter disconnects and then, after a while, reconnects.

# Kernel version:
* linux-lts 4.19.101-1 = no network problem
* linux-lts 5.4.19-1 = problem
* linux 5.5.4.arch1-1 = problem

# System:
* CPU: i5-8400
* Motherboard: ROG STRIX Z370-G GAMING, BIOS Version: 2401
* Network adapter:
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8]
        Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V [1043:8672]
        Kernel driver in use: e1000e
        Kernel modules: e1000e



# Kernel version 5.5.4-arch1-1 dmesg:

15:07:39 blep kernel: ------------[ cut here ]------------
15:07:39 blep kernel: NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
15:07:39 blep kernel: WARNING: CPU: 3 PID: 35 at net/sched/sch_generic.c:442 dev_watchdog+0x26a/0x280
15:07:39 blep kernel: Modules linked in: vhost_net vhost macvtap macvlan tap fuse ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter tun bridge 8021q garp mrp stp llc intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel eeepc_wmi iTCO_wdt iTCO_vendor_support mei_hdcp asus_wmi snd_hda_codec_hdmi kvm battery sparse_keymap intel_cstate snd_hda_codec_realtek intel_uncore snd_hda_codec_generic ledtrig_audio rfkill intel_rapl_perf e1000e wmi_bmof i2c_i801 snd_hda_intel mei_me snd_intel_dspcfg nouveau mei snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer mousedev snd joydev mxm_wmi input_leds ttm soundcore ie31200_edac evdev mac_hid sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat hid_generic usbhid hid dm_crypt dm_mod sd_mod uas usb_storage
15:07:39 blep kernel:  crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci libahci libata aesni_intel crypto_simd xhci_pci cryptd glue_helper xhci_hcd scsi_mod wmi i915 intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
15:07:39 blep kernel: CPU: 3 PID: 35 Comm: ksoftirqd/3 Not tainted 5.5.4-arch1-1 #1
15:07:39 blep kernel: Hardware name: System manufacturer System Product Name/ROG STRIX Z370-G GAMING, BIOS 2401 07/15/2019
15:07:39 blep kernel: RIP: 0010:dev_watchdog+0x26a/0x280
15:07:39 blep kernel: Code: 7a 12 80 ff eb 88 4c 89 f7 c6 05 fa 79 d1 00 01 e8 8b ae fa ff 44 89 e9 4c 89 f6 48 c7 c7 00 96 7a b0 48 89 c2 e8 98 ee 88 ff <0f> 0b e9 66 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
15:07:39 blep kernel: RSP: 0018:ffff9f9e401a3d70 EFLAGS: 00010282
15:07:39 blep kernel: RAX: 0000000000000000 RBX: ffff97726b39f000 RCX: 0000000000000000
15:07:39 blep kernel: RDX: 0000000000000102 RSI: 0000000000000092 RDI: 00000000ffffffff
15:07:39 blep kernel: RBP: ffff97727320445c R08: 000000000000047a R09: 0000000000000004
15:07:39 blep kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff977273204480
15:07:39 blep kernel: R13: 0000000000000000 R14: ffff977273204000 R15: ffff97726b39f080
15:07:39 blep kernel: FS:  0000000000000000(0000) GS:ffff977276ec0000(0000) knlGS:0000000000000000
15:07:39 blep kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
15:07:39 blep kernel: CR2: ffffa18ec59c1000 CR3: 00000003cd6ec006 CR4: 00000000003626e0
15:07:39 blep kernel: Call Trace:
15:07:39 blep kernel:  ? qdisc_put_unlocked+0x30/0x30
15:07:39 blep kernel:  call_timer_fn+0x2d/0x160
15:07:39 blep kernel:  run_timer_softirq+0x1ad/0x510
15:07:39 blep kernel:  ? qdisc_put_unlocked+0x30/0x30
15:07:39 blep kernel:  __do_softirq+0x111/0x34d
15:07:39 blep kernel:  run_ksoftirqd+0x32/0x40
15:07:39 blep kernel:  smpboot_thread_fn+0x19a/0x230
15:07:39 blep kernel:  kthread+0xfb/0x130
15:07:39 blep kernel:  ? sort_range+0x20/0x20
15:07:39 blep kernel:  ? kthread_park+0x90/0x90
15:07:39 blep kernel:  ret_from_fork+0x35/0x40
15:07:39 blep kernel: ---[ end trace dc42a63743c2d972 ]---



# Kernel version 5.4.19-1-lts dmesg:

15:21:14 blep kernel: ------------[ cut here ]------------
15:21:14 blep kernel: NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
15:21:14 blep kernel: WARNING: CPU: 3 PID: 47218 at net/sched/sch_generic.c:447 dev_watchdog+0x248/0x250
15:21:14 blep kernel: Modules linked in: vhost_net vhost macvtap macvlan tap fuse ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter tun bridge 8021q garp mrp stp llc intel_rapl_msr iTCO_wdt iTCO_vendor_support mei_hdcp intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_cstate intel_uncore eeepc_wmi asus_wmi intel_rapl_perf battery sparse_keymap e1000e rfkill wmi_bmof i2c_i801 snd_hda_codec_hdmi snd_hda_codec_realtek mei_me snd_hda_codec_generic nouveau ledtrig_audio mei snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep joydev snd_pcm snd_timer mousedev snd mxm_wmi input_leds ttm soundcore evdev ie31200_edac mac_hid sg ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nls_iso8859_1 nls_cp437 vfat fat hid_generic usbhid dm_crypt hid dm_mod sd_mod uas usb_storage crct10dif_pclmul
15:21:14 blep kernel:  crc32_pclmul crc32c_intel ghash_clmulni_intel ahci libahci aesni_intel libata crypto_simd xhci_pci cryptd glue_helper scsi_mod xhci_hcd wmi i915 intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
15:21:14 blep kernel: CPU: 3 PID: 47218 Comm: CPU 3/KVM Not tainted 5.4.19-1-lts #1
15:21:14 blep kernel: Hardware name: System manufacturer System Product Name/ROG STRIX Z370-G GAMING, BIOS 2401 07/15/2019
15:21:14 blep kernel: RIP: 0010:dev_watchdog+0x248/0x250
15:21:14 blep kernel: Code: 85 c0 75 e5 eb 9f 4c 89 ef c6 05 9c da b4 00 01 e8 3d d1 fa ff 44 89 e1 4c 89 ee 48 c7 c7 18 27 37 a6 48 89 c2 e8 66 40 8c ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 41 57 41 56 49 89 d6 41 55
15:21:14 blep kernel: RSP: 0018:ffffb185c0190e70 EFLAGS: 00010282
15:21:14 blep kernel: RAX: 0000000000000000 RBX: ffff9401b11d7c00 RCX: 0000000000000006
15:21:14 blep kernel: RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff9401b6cd7700
15:21:14 blep kernel: RBP: ffff9401af7ec45c R08: 0000000000000579 R09: 0000000000000004
15:21:14 blep kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
15:21:14 blep kernel: R13: ffff9401af7ec000 R14: ffff9401af7ec480 R15: 0000000000000001
15:21:14 blep kernel: FS:  00007f1cfe7ff700(0000) GS:ffff9401b6cc0000(0000) knlGS:ffffa38008852000
15:21:14 blep kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
15:21:14 blep kernel: CR2: 0000018ea10041a8 CR3: 000000038709e001 CR4: 00000000003626e0
15:21:14 blep kernel: Call Trace:
15:21:14 blep kernel:  <IRQ>
15:21:14 blep kernel:  ? pfifo_fast_enqueue+0x150/0x150
15:21:14 blep kernel:  call_timer_fn+0x2d/0x130
15:21:14 blep kernel:  __run_timers+0x18d/0x280
15:21:14 blep kernel:  run_timer_softirq+0x19/0x30
15:21:14 blep kernel:  __do_softirq+0xee/0x2ff
15:21:14 blep kernel:  irq_exit+0xb4/0xc0
15:21:14 blep kernel:  smp_apic_timer_interrupt+0x76/0x130
15:21:14 blep kernel:  apic_timer_interrupt+0xf/0x20
15:21:14 blep kernel:  </IRQ>
15:21:14 blep kernel: RIP: 0010:handle_external_interrupt_irqoff+0x7a/0x100 [kvm_intel]
15:21:14 blep kernel: Code: b7 00 48 c1 e2 10 48 c1 e1 20 48 09 ca 48 09 d0 65 48 89 3d 98 f4 c8 3e 48 89 e2 48 83 e4 f0 6a 18 52 9c 6a 10 e8 a6 30 a6 e4 <65> 48 c7 05 7a f4 c8 3e 00 00 00 00 48 83 c4 08 c3 81 3d 3b 90 02
15:21:14 blep kernel: RSP: 0018:ffffb185c1667cd0 EFLAGS: 00000082 ORIG_RAX: ffffffffffffff13
15:21:14 blep kernel: RAX: ffffffffa5c01bf0 RBX: 0001d2ef434df089 RCX: ffffffff00000000
15:21:14 blep kernel: RDX: ffffb185c1667cd0 RSI: fffe326c93b133ee RDI: ffff940105ae3f40
15:21:14 blep kernel: RBP: ffffb185c1667db0 R08: 0000000000000000 R09: 0000000000000000
15:21:14 blep kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff940105ae3f40
15:21:14 blep kernel: R13: 0000000000000000 R14: 8000000000000000 R15: ffffb185c07aa798
15:21:14 blep kernel:  ? __irqentry_text_start+0x8/0x8
15:21:14 blep kernel:  ? kvm_arch_vcpu_ioctl_run+0x98f/0x1de0 [kvm]
15:21:14 blep kernel:  ? try_to_wake_up+0x23c/0x6a0
15:21:14 blep kernel:  ? kvm_vcpu_ioctl+0x263/0x610 [kvm]
15:21:14 blep kernel:  ? do_vfs_ioctl+0x43f/0x6c0
15:21:14 blep kernel:  ? syscall_trace_enter+0x19c/0x2d0
15:21:14 blep kernel:  ? ksys_ioctl+0x5e/0x90
15:21:14 blep kernel:  ? __x64_sys_ioctl+0x16/0x20
15:21:14 blep kernel:  ? do_syscall_64+0x4e/0x140
15:21:14 blep kernel:  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
15:21:14 blep kernel: ---[ end trace 6b43c433f60b6c2d ]---
15:21:17 blep kernel: e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
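
For anyone comparing several logs like the two above, here is a minimal Python sketch that pulls the interface, driver, and queue number out of the watchdog lines. The function name and the regular expression are illustrative assumptions based only on the message format shown in this report; they are not part of any kernel tooling.

```python
import re

# Matches the watchdog warning as it appears in the dmesg output above, e.g.
# "NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out".
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(log_text):
    """Return a list of (iface, driver, queue) tuples, one per watchdog line."""
    return [
        (m.group("iface"), m.group("driver"), int(m.group("queue")))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


sample = (
    "15:07:39 blep kernel: NETDEV WATCHDOG: enp0s31f6 (e1000e): "
    "transmit queue 0 timed out\n"
)
print(find_watchdog_timeouts(sample))
```

Feeding it the saved output of dmesg or journalctl gives one tuple per timeout, which makes it easier to see whether the same queue is always the one that stalls.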
Comment 1 Alec Ari 2020-02-21 01:41:28 UTC
Hi, I recently opened a report which looks very similar to this:

https://bugzilla.kernel.org/show_bug.cgi?id=206615

The stack trace is a bit different, however.

Thanks for reporting this!

Alec
Comment 2 nisalon_caje 2020-05-27 01:38:18 UTC
I have seen something very similar since I upgraded to Ubuntu 20.04 LTS (kernel 5.4.0-31).
This machine used to run Ubuntu 16.04 (kernel 4.4.0-150), and it worked perfectly.

Any idea what the cause might be?

Thanks



May 26 10:04:15 service01K kernel: [161735.901135] ------------[ cut here ]------------
May 26 10:04:15 service01K kernel: [161735.901136] NETDEV WATCHDOG: eth1 (ixgbe): transmit queue 2 timed out
May 26 10:04:15 service01K kernel: [161735.901145] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x258/0x260
May 26 10:04:15 service01K kernel: [161735.901146] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport isofs ip6table_filter ip6_tables xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif kvm_intel kvm joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler video acpi_pad acpi_tad sch_fq_codel ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic uas usbhid hid usb_storage raid1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt crypto_simd fb_sys_fops ixgbe cryptd glue_helper nvme drm xfrm_algo ahci dca mdio libahci nvme_core
May 26 10:04:15 service01K kernel: [161735.901163] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.4.0-31-generic #35-Ubuntu
May 26 10:04:15 service01K kernel: [161735.901163] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C246D4U2-2T, BIOS L2.02K 12/18/2019
May 26 10:04:15 service01K kernel: [161735.901164] RIP: 0010:dev_watchdog+0x258/0x260
May 26 10:04:15 service01K kernel: [161735.901165] Code: 85 c0 75 e5 eb 9f 4c 89 ff c6 05 ef f6 e7 00 01 e8 6d bb fa ff 44 89 e9 4c 89 fe 48 c7 c7 40 73 43 ba 48 89 c2 e8 03 30 71 ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 d7
May 26 10:04:15 service01K kernel: [161735.901165] RSP: 0018:ffffb8774003ce30 EFLAGS: 00010286
May 26 10:04:15 service01K kernel: [161735.901166] RAX: 0000000000000000 RBX: ffff891bdf924ec0 RCX: 0000000000000006
May 26 10:04:15 service01K kernel: [161735.901166] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff891bee8578c0
May 26 10:04:15 service01K kernel: [161735.901167] RBP: ffffb8774003ce60 R08: 000000000000046b R09: 0000000000000004
May 26 10:04:15 service01K kernel: [161735.901167] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000040
May 26 10:04:15 service01K kernel: [161735.901167] R13: 0000000000000002 R14: ffff891bdf980480 R15: ffff891bdf980000
May 26 10:04:15 service01K kernel: [161735.901168] FS:  0000000000000000(0000) GS:ffff891bee840000(0000) knlGS:0000000000000000
May 26 10:04:15 service01K kernel: [161735.901168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 26 10:04:15 service01K kernel: [161735.901169] CR2: 00007fb0077d6148 CR3: 0000000894f1c006 CR4: 00000000003606e0
May 26 10:04:15 service01K kernel: [161735.901169] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 26 10:04:15 service01K kernel: [161735.901169] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 26 10:04:15 service01K kernel: [161735.901170] Call Trace:
May 26 10:04:15 service01K kernel: [161735.901171]  <IRQ>
May 26 10:04:15 service01K kernel: [161735.901173]  ? pfifo_fast_enqueue+0x150/0x150
May 26 10:04:15 service01K kernel: [161735.901175]  call_timer_fn+0x32/0x130
May 26 10:04:15 service01K kernel: [161735.901176]  __run_timers.part.0+0x180/0x280
May 26 10:04:15 service01K kernel: [161735.901177]  ? enqueue_hrtimer+0x3d/0x90
May 26 10:04:15 service01K kernel: [161735.901178]  ? recalibrate_cpu_khz+0x10/0x10
May 26 10:04:15 service01K kernel: [161735.901179]  ? ktime_get+0x3e/0xa0
May 26 10:04:15 service01K kernel: [161735.901180]  run_timer_softirq+0x2a/0x50
May 26 10:04:15 service01K kernel: [161735.901181]  __do_softirq+0xe1/0x2d6
May 26 10:04:15 service01K kernel: [161735.901182]  ? hrtimer_interrupt+0x13b/0x220
May 26 10:04:15 service01K kernel: [161735.901183]  irq_exit+0xae/0xb0
May 26 10:04:15 service01K kernel: [161735.901184]  smp_apic_timer_interrupt+0x7b/0x140
May 26 10:04:15 service01K kernel: [161735.901185]  apic_timer_interrupt+0xf/0x20
May 26 10:04:15 service01K kernel: [161735.901186]  </IRQ>
May 26 10:04:15 service01K kernel: [161735.901187] RIP: 0010:cpuidle_enter_state+0xc5/0x450
May 26 10:04:15 service01K kernel: [161735.901187] Code: ff e8 9f 04 81 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 65 03 00 00 31 ff e8 f2 74 87 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8f 02 00 00 49 63 cd 4c 8b 7d d0 4c 2b 7d c8 48 8d
May 26 10:04:15 service01K kernel: [161735.901188] RSP: 0018:ffffb877400e3e38 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
May 26 10:04:15 service01K kernel: [161735.901188] RAX: ffff891bee86ad00 RBX: ffffffffba759dc0 RCX: 000000000000001f
May 26 10:04:15 service01K kernel: [161735.901189] RDX: 0000000000000000 RSI: 00000000258eeee5 RDI: 0000000000000000
May 26 10:04:15 service01K kernel: [161735.901189] RBP: ffffb877400e3e78 R08: 0000931912eef131 R09: 00000000000000b0
May 26 10:04:15 service01K kernel: [161735.901189] R10: ffff891bee869a00 R11: ffff891bee8699e0 R12: ffff891bee875a20
May 26 10:04:15 service01K kernel: [161735.901190] R13: 0000000000000001 R14: 0000000000000001 R15: ffff891bee875a20
May 26 10:04:15 service01K kernel: [161735.901191]  ? cpuidle_enter_state+0xa1/0x450
May 26 10:04:15 service01K kernel: [161735.901191]  cpuidle_enter+0x2e/0x40
May 26 10:04:15 service01K kernel: [161735.901192]  call_cpuidle+0x23/0x40
May 26 10:04:15 service01K kernel: [161735.901193]  do_idle+0x1dd/0x270
May 26 10:04:15 service01K kernel: [161735.901194]  cpu_startup_entry+0x20/0x30
May 26 10:04:15 service01K kernel: [161735.901195]  start_secondary+0x167/0x1c0
May 26 10:04:15 service01K kernel: [161735.901196]  secondary_startup_64+0xa4/0xb0
May 26 10:04:15 service01K kernel: [161735.901198] ---[ end trace 8367c52cc2c9c7ea ]---
May 26 10:04:15 service01K kernel: [161735.901200] ixgbe 0000:04:00.1 eth1: initiating reset due to tx timeout
May 26 10:04:15 service01K kernel: [161735.901238] ixgbe 0000:04:00.1 eth1: Reset adapter
May 26 10:04:18 service01K systemd-networkd[2915505]: eth1: Lost carrier
May 26 10:04:19 service01K ntpd[1007]: Deleting interface #4 eth1, 192.168.52.129#123, interface stats: received=0, sent=0, dropped=0, active_time=161720 secs
May 26 10:04:19 service01K ntpd[1007]: Deleting interface #8 eth1, fe80::d250:99ff:fed6:91ef%3#123, interface stats: received=0, sent=0, dropped=0, active_time=161720 secs
May 26 10:04:29 service01K systemd-networkd[2915505]: eth1: Gained carrier
May 26 10:04:29 service01K kernel: [161749.910201] ixgbe 0000:04:00.1 eth1: NIC Link is Up 10 Gbps, Flow Control: None
May 26 10:04:30 service01K ntpd[1007]: Listen normally on 9 eth1 192.168.52.129:123
May 26 10:04:30 service01K ntpd[1007]: Listen normally on 10 eth1 [fe80::d250:99ff:fed6:91ef%3]:123
May 26 10:04:30 service01K ntpd[1007]: new interface(s) found: waking up resolver
Comment 3 Alec Ari 2020-05-27 19:58:00 UTC
If you want to avoid this problem yourself, you can build a custom kernel the Ubuntu way and disable the network scheduler in menuconfig (the option is CONFIG_NET_SCHED). This is obviously not a proper fix.
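
Before rebuilding, it may help to check how CONFIG_NET_SCHED is set in the kernel you are running. The following is a minimal Python sketch, not an official tool: the helper name kconfig_value is hypothetical, and it only parses config text that you supply. Where that text comes from is an assumption about your distribution (commonly /boot/config-<release>, or /proc/config.gz when the kernel exposes it). Whether disabling the option actually avoids the timeout has not been confirmed in this report.

```python
def kconfig_value(config_text, option):
    """Return the value of a kernel config option from .config-style text.

    Returns the string after '=' (for example 'y' or 'm'), 'n' if the option
    appears as '# OPTION is not set', or None if it is not mentioned at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# " + option + " is not set":
            return "n"
    return None


# Hypothetical config fragment, for illustration only.
sample_config = "CONFIG_NET=y\n# CONFIG_NET_SCHED is not set\n"
print(kconfig_value(sample_config, "CONFIG_NET_SCHED"))
```

To use it on a real system, read the config file into a string first and pass that string in place of sample_config.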
Comment 4 Roberto Viola 2020-07-06 07:08:59 UTC
Created attachment 290129 [details]
dmesg error
Comment 5 Roberto Viola 2020-07-06 07:09:17 UTC
Created attachment 290131 [details]
full dmesg
