Bug 217040 - TXDCTL.ENABLE for one or more queues not cleared within the polling period
Summary: TXDCTL.ENABLE for one or more queues not cleared within the polling period
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: Intel Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-15 03:46 UTC by Satish Patel
Modified: 2023-02-15 03:46 UTC
CC List: 0 users

See Also:
Kernel Version: 5.4.0-100-generic
Subsystem:
Regression: No
Bisected commit-id:



Description Satish Patel 2023-02-15 03:46:21 UTC
I am running Ubuntu 20.04.4 with kernel version 5.4.0-100-generic. Recently, one server in my rack had a memory failure. After that, all the servers in that rack logged the following kernel trace and were unable to connect.

Server vendor: HP ProLiant BL460c Gen9 blades

# lspci | grep -i eth
06:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Backplane Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Backplane Connection (rev 01)

# ethtool -i eno50
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x800008f0, 1.2836.0
expansion-rom-version:
bus-info: 0000:06:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes


Kernel version: 5.4.0-100-generic #113-Ubuntu


# brctl show
bridge name	bridge id		STP enabled	interfaces
br-mgmt		8000.38eaa733b589	no		eno50.51
br-vlan		8000.38eaa733b589	no		eno50
br-vxlan	8000.38eaa733b589	no		eno50.29


Kernel trace logs are below. What could be the issue here? Is it related to an STP loop or to some other kind of failure?

[Thu Feb  9 07:17:07 2023] ------------[ cut here ]------------
[Thu Feb  9 07:17:07 2023] NETDEV WATCHDOG: eno50 (ixgbe): transmit queue 19 timed out
[Thu Feb  9 07:17:07 2023] WARNING: CPU: 16 PID: 0 at net/sched/sch_generic.c:472 dev_watchdog+0x258/0x260
[Thu Feb  9 07:17:07 2023] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid xt_CHECKSUM xt_conntrack xt_tcpudp ip6table_mangle ip6table_nat binfmt_misc nf_tables nfnetlink iptable_raw bpfilter vhost_net vhost tap nbd iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_vs iptable_nat iptable_mangle iptable_filter ipt_REJECT nf_reject_ipv4 xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables ebtables dm_snapshot dm_bufio br_netfilter 8021q garp mrp bridge stp llc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif ixgbevf intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp joydev input_leds kvm_intel kvm rapl intel_cstate serio_raw hpilo ioatdma ipmi_si ipmi_devintf ipmi_msghandler acpi_tad mac_hid acpi_power_meter sch_fq_codel msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[Thu Feb  9 07:17:07 2023]  linear mgag200 drm_vram_helper i2c_algo_bit ttm drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul sysfillrect ghash_clmulni_intel sysimgblt hid_generic aesni_intel fb_sys_fops crypto_simd ixgbe usbhid cryptd psmouse hpsa glue_helper xfrm_algo drm hid i2c_i801 lpc_ich scsi_transport_sas dca mdio wmi
[Thu Feb  9 07:17:07 2023] CPU: 16 PID: 0 Comm: swapper/16 Not tainted 5.4.0-100-generic #113-Ubuntu
[Thu Feb  9 07:17:07 2023] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 03/25/2019
[Thu Feb  9 07:17:07 2023] RIP: 0010:dev_watchdog+0x258/0x260
[Thu Feb  9 07:17:07 2023] Code: 85 c0 75 e5 eb 9f 4c 89 ff c6 05 84 a2 2c 01 01 e8 ad bf fa ff 44 89 e9 4c 89 fe 48 c7 c7 f8 dc 63 8c 48 89 c2 e8 34 e2 13 00 <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 d7
[Thu Feb  9 07:17:07 2023] RSP: 0018:ffffa133467c0e30 EFLAGS: 00010286
[Thu Feb  9 07:17:07 2023] RAX: 0000000000000000 RBX: ffff8f9deef6cec0 RCX: 000000000000083f
[Thu Feb  9 07:17:07 2023] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
[Thu Feb  9 07:17:07 2023] RBP: ffffa133467c0e60 R08: ffff8fae1f71c8c8 R09: 0000000000000004
[Thu Feb  9 07:17:07 2023] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000040
[Thu Feb  9 07:17:07 2023] R13: 0000000000000013 R14: ffff8f9df16a0480 R15: ffff8f9df16a0000
[Thu Feb  9 07:17:07 2023] FS:  0000000000000000(0000) GS:ffff8fae1f700000(0000) knlGS:0000000000000000
[Thu Feb  9 07:17:07 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Feb  9 07:17:07 2023] CR2: 00007f0840382000 CR3: 0000000acea0a006 CR4: 00000000001626e0
[Thu Feb  9 07:17:07 2023] Call Trace:
[Thu Feb  9 07:17:07 2023]  <IRQ>
[Thu Feb  9 07:17:07 2023]  ? pfifo_fast_enqueue+0x150/0x150
[Thu Feb  9 07:17:07 2023]  call_timer_fn+0x32/0x130
[Thu Feb  9 07:17:07 2023]  __run_timers.part.0+0x180/0x280
[Thu Feb  9 07:17:07 2023]  ? tick_sched_handle+0x33/0x60
[Thu Feb  9 07:17:07 2023]  ? tick_sched_timer+0x3d/0x80
[Thu Feb  9 07:17:07 2023]  ? ktime_get+0x3e/0xa0
[Thu Feb  9 07:17:07 2023]  run_timer_softirq+0x2a/0x50
[Thu Feb  9 07:17:07 2023]  __do_softirq+0xe1/0x2d6
[Thu Feb  9 07:17:07 2023]  ? hrtimer_interrupt+0x136/0x220
[Thu Feb  9 07:17:07 2023]  irq_exit+0xae/0xb0
[Thu Feb  9 07:17:07 2023]  smp_apic_timer_interrupt+0x7b/0x140
[Thu Feb  9 07:17:07 2023]  apic_timer_interrupt+0xf/0x20
[Thu Feb  9 07:17:07 2023]  </IRQ>
[Thu Feb  9 07:17:07 2023] RIP: 0010:cpuidle_enter_state+0xc5/0x450
[Thu Feb  9 07:17:07 2023] Code: ff e8 ef b8 84 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 65 03 00 00 31 ff e8 62 c0 8a ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8f 02 00 00 49 63 cd 4c 8b 7d d0 4c 2b 7d c8 48 8d
[Thu Feb  9 07:17:07 2023] RSP: 0018:ffffa13346377e38 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[Thu Feb  9 07:17:07 2023] RAX: ffff8fae1f72fe00 RBX: ffffffff8cd59fe0 RCX: 000000000000001f
[Thu Feb  9 07:17:07 2023] RDX: 0000000000000000 RSI: 000000003342629e RDI: 0000000000000000
[Thu Feb  9 07:17:07 2023] RBP: ffffa13346377e78 R08: 00285439725dde3a R09: 00285454394d7639
[Thu Feb  9 07:17:07 2023] R10: ffff8fae1f72eb00 R11: ffff8fae1f72eae0 R12: ffffc1333f9021c0
[Thu Feb  9 07:17:07 2023] R13: 0000000000000004 R14: 0000000000000004 R15: ffffc1333f9021c0
[Thu Feb  9 07:17:07 2023]  ? cpuidle_enter_state+0xa1/0x450
[Thu Feb  9 07:17:07 2023]  cpuidle_enter+0x2e/0x40
[Thu Feb  9 07:17:07 2023]  call_cpuidle+0x23/0x40
[Thu Feb  9 07:17:07 2023]  do_idle+0x1dd/0x270
[Thu Feb  9 07:17:07 2023]  cpu_startup_entry+0x20/0x30
[Thu Feb  9 07:17:07 2023]  start_secondary+0x167/0x1c0
[Thu Feb  9 07:17:07 2023]  secondary_startup_64+0xa4/0xb0
[Thu Feb  9 07:17:07 2023] ---[ end trace ca7e91fa58c34325 ]---
[Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout
[Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: Reset adapter
[Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: TXDCTL.ENABLE for one or more queues not cleared within the polling period
[Thu Feb  9 07:17:08 2023] br-vlan: port 1(eno50) entered disabled state
[Thu Feb  9 07:17:08 2023] br-vxlan: port 1(eno50.29) entered disabled state
[Thu Feb  9 07:17:08 2023] br-mgmt: port 1(eno50.51) entered disabled state
[Thu Feb  9 07:17:08 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Thu Feb  9 07:17:08 2023] br-vlan: port 1(eno50) entered blocking state
[Thu Feb  9 07:17:08 2023] br-vlan: port 1(eno50) entered forwarding state
[Thu Feb  9 07:17:08 2023] br-vxlan: port 1(eno50.29) entered blocking state
[Thu Feb  9 07:17:08 2023] br-vxlan: port 1(eno50.29) entered forwarding state
[Thu Feb  9 07:17:08 2023] br-mgmt: port 1(eno50.51) entered blocking state
[Thu Feb  9 07:17:08 2023] br-mgmt: port 1(eno50.51) entered forwarding state
[Thu Feb  9 07:17:09 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Down
[Thu Feb  9 07:17:10 2023] br-vlan: port 1(eno50) entered disabled state
[Thu Feb  9 07:17:10 2023] br-vxlan: port 1(eno50.29) entered disabled state
[Thu Feb  9 07:17:10 2023] br-mgmt: port 1(eno50.51) entered disabled state
[Thu Feb  9 07:17:10 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Thu Feb  9 07:17:10 2023] br-vlan: port 1(eno50) entered blocking state
[Thu Feb  9 07:17:10 2023] br-vlan: port 1(eno50) entered forwarding state
[Thu Feb  9 07:17:10 2023] br-vxlan: port 1(eno50.29) entered blocking state
[Thu Feb  9 07:17:10 2023] br-vxlan: port 1(eno50.29) entered forwarding state
[Thu Feb  9 07:17:10 2023] br-mgmt: port 1(eno50.51) entered blocking state
[Thu Feb  9 07:17:10 2023] br-mgmt: port 1(eno50.51) entered forwarding state
[Thu Feb  9 07:27:34 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout
[Thu Feb  9 07:27:34 2023] ixgbe 0000:06:00.1 eno50: Reset adapter
[Thu Feb  9 07:27:35 2023] br-vlan: port 1(eno50) entered disabled state
[Thu Feb  9 07:27:35 2023] br-vxlan: port 1(eno50.29) entered disabled state
[Thu Feb  9 07:27:35 2023] br-mgmt: port 1(eno50.51) entered disabled state
[Thu Feb  9 07:27:35 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Thu Feb  9 07:27:35 2023] br-vlan: port 1(eno50) entered blocking state
[Thu Feb  9 07:27:35 2023] br-vlan: port 1(eno50) entered forwarding state
[Thu Feb  9 07:27:35 2023] br-vxlan: port 1(eno50.29) entered blocking state
[Thu Feb  9 07:27:35 2023] br-vxlan: port 1(eno50.29) entered forwarding state
[Thu Feb  9 07:27:35 2023] br-mgmt: port 1(eno50.51) entered blocking state
[Thu Feb  9 07:27:35 2023] br-mgmt: port 1(eno50.51) entered forwarding state
[Thu Feb  9 07:27:36 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Down
[Thu Feb  9 07:27:37 2023] br-vlan: port 1(eno50) entered disabled state
[Thu Feb  9 07:27:37 2023] br-vxlan: port 1(eno50.29) entered disabled state
[Thu Feb  9 07:27:37 2023] br-mgmt: port 1(eno50.51) entered disabled state
[Thu Feb  9 07:27:37 2023] ixgbe 0000:06:00.1 eno50: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[Thu Feb  9 07:27:37 2023] br-vlan: port 1(eno50) entered blocking state
[Thu Feb  9 07:27:37 2023] br-vlan: port 1(eno50) entered forwarding state
[Thu Feb  9 07:27:37 2023] br-vxlan: port 1(eno50.29) entered blocking state
[Thu Feb  9 07:27:37 2023] br-vxlan: port 1(eno50.29) entered forwarding state
[Thu Feb  9 07:27:37 2023] br-mgmt: port 1(eno50.51) entered blocking state
[Thu Feb  9 07:27:37 2023] br-mgmt: port 1(eno50.51) entered forwarding state
[Thu Feb  9 07:38:14 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout
[Thu Feb  9 07:38:14 2023] ixgbe 0000:06:00.1 eno50: Reset adapter
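
The log above shows the adapter being reset three times. To make the spacing between those resets easier to see, here is a minimal sketch that pulls the "initiating reset due to tx timeout" lines out of dmesg -T style output and prints the gaps between them. The helper names (reset_times, reset_intervals) and the SAMPLE list are hypothetical and only for illustration. It assumes the human-readable timestamp format shown in this report and an English (C) locale.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout
RESET_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+ixgbe \S+ (?P<dev>\S+): "
    r"initiating reset due to tx timeout"
)


def reset_times(lines):
    """Return (device, datetime) for every tx-timeout reset line found."""
    found = []
    for line in lines:
        m = RESET_RE.match(line)
        if not m:
            continue
        # Collapse the padded day ("Feb  9") to single spaces before parsing.
        stamp = " ".join(m.group("ts").split())
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        found.append((m.group("dev"), when))
    return found


def reset_intervals(lines):
    """Return the gaps, in whole seconds, between consecutive resets."""
    times = [when for _, when in reset_times(lines)]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# Hypothetical sample: lines copied from the log in this report.
SAMPLE = [
    "[Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout",
    "[Thu Feb  9 07:17:07 2023] ixgbe 0000:06:00.1 eno50: Reset adapter",
    "[Thu Feb  9 07:27:34 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout",
    "[Thu Feb  9 07:38:14 2023] ixgbe 0000:06:00.1 eno50: initiating reset due to tx timeout",
]

if __name__ == "__main__":
    for dev, when in reset_times(SAMPLE):
        print(dev, when.isoformat(sep=" "))
    print("seconds between resets:", reset_intervals(SAMPLE))
    # prints: seconds between resets: [627, 640]
```

On the lines from this report, the resets come roughly every ten and a half minutes. The sketch only summarizes the log and does not by itself point to a cause.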
