Bug 218740 - e1000e regression: BUG: scheduling while atomic: kworker/5:0/84234/0x00000002
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P3 high
Assignee: drivers_network@kernel-bugs.osdl.org
URL: https://bbs.archlinux.org/viewtopic.p...
Keywords:
Duplicates: 218766
Depends on:
Blocks:
 
Reported: 2024-04-17 17:29 UTC by *cJ*
Modified: 2024-04-30 13:55 UTC
CC List: 7 users

See Also:
Kernel Version: 6.8.5
Subsystem:
Regression: Yes
Bisected commit-id:



Description *cJ* 2024-04-17 17:29:25 UTC
On a fresh 5.8.6 kernel, I connected an Ethernet cable to the laptop (a few minutes after connecting 15 ftdi_sio USB devices) and noticed that the system wasn't very responsive.

dmesg says:

2024-04-17T13:12:09.576153-0400 pouet kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Half Duplex, Flow Control: Rx/Tx
2024-04-17T13:12:09.593873-0400 pouet kernel: BUG: scheduling while atomic: kworker/5:0/84234/0x00000002
2024-04-17T13:12:09.593907-0400 pouet kernel: Modules linked in: ftdi_sio usbserial thunderbolt snd_seq_dummy snd_hrtimer snd_seq snd_seq_device dm_crypt bonding tls uhid bnep ccm algif_aead des_generic libdes algif_skcipher btusb cmac btrtl btintel md4 btbcm algif_hash btmtk af_alg wacom uvcvideo bluetooth hid_generic uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev usbhid hid videobuf2_common ecdh_generic ecc mc iwlmvm coretemp mac80211 intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp joydev mousedev kvm_intel snd_ctl_led elan_i2c snd_hda_codec_realtek libarc4 kvm snd_hda_codec_generic rtsx_pci_sdmmc mmc_core snd_hda_codec_hdmi iwlwifi mei_hdcp irqbypass intel_rapl_msr snd_hda_intel snd_intel_dspcfg polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 snd_hda_codec sha256_ssse3 sha1_ssse3 aesni_intel intel_wmi_thunderbolt xhci_pci snd_hwdep crypto_simd cfg80211 processor_thermal_device_pci_legacy xhci_hcd snd_hda_core cryptd psmouse think_lmi
2024-04-17T13:12:09.593974-0400 pouet kernel:  processor_thermal_device thinkpad_acpi rtsx_pci processor_thermal_wt_hint firmware_attributes_class pcspkr snd_pcm processor_thermal_rfim ucsi_acpi ledtrig_audio processor_thermal_rapl wmi_bmof platform_profile usbcore snd_timer efi_pstore typec_ucsi mfd_core iTCO_wdt ee1004 mei_me intel_rapl_common iTCO_vendor_support typec snd processor_thermal_wt_req mei processor_thermal_power_floor rfkill processor_thermal_mbox int3403_thermal roles intel_soc_dts_iosf soundcore usb_common intel_pch_thermal hwmon int340x_thermal_zone pinctrl_cannonlake int3400_thermal pinctrl_intel acpi_thermal_rel evdev sch_fq_codel dm_mod vhba(OE) fuse pkcs8_key_parser configfs dax efivarfs autofs4
2024-04-17T13:12:09.594010-0400 pouet kernel: Preemption disabled at:
2024-04-17T13:12:09.594029-0400 pouet kernel: [<0000000000000000>] 0x0
2024-04-17T13:12:09.594048-0400 pouet kernel: CPU: 5 PID: 84234 Comm: kworker/5:0 Tainted: G           OE      6.8.6-pouet #83
2024-04-17T13:12:09.594066-0400 pouet kernel: Hardware name: LENOVO 20M9CTO1WW/20M9CTO1WW, BIOS N2CET67W (1.50 ) 12/15/2022
2024-04-17T13:12:09.594088-0400 pouet kernel: Workqueue: events linkwatch_event
2024-04-17T13:12:09.594125-0400 pouet kernel: Call Trace:
2024-04-17T13:12:09.594146-0400 pouet kernel:  <TASK>
2024-04-17T13:12:09.594164-0400 pouet kernel:  dump_stack_lvl+0x4b/0x67
2024-04-17T13:12:09.594183-0400 pouet kernel:  __schedule_bug+0x82/0x94
2024-04-17T13:12:09.594201-0400 pouet kernel:  __schedule+0x58/0xb2d
2024-04-17T13:12:09.594219-0400 pouet kernel:  ? timekeeping_get_ns+0x19/0x33
2024-04-17T13:12:09.594242-0400 pouet kernel:  ? ktime_get+0x3b/0x4f
2024-04-17T13:12:09.594261-0400 pouet kernel:  ? lapic_next_deadline+0x2f/0x36
2024-04-17T13:12:09.594279-0400 pouet kernel:  ? clockevents_program_event+0xbd/0xec
2024-04-17T13:12:09.594298-0400 pouet kernel:  ? hrtimer_start_range_ns+0x1f2/0x236
2024-04-17T13:12:09.594316-0400 pouet kernel:  schedule+0x3d/0x59
2024-04-17T13:12:09.594334-0400 pouet kernel:  schedule_hrtimeout_range_clock+0x9d/0xe6
2024-04-17T13:12:09.594353-0400 pouet kernel:  ? __pfx_hrtimer_wakeup+0x10/0x10
2024-04-17T13:12:09.594371-0400 pouet kernel:  usleep_range_state+0x64/0x8a
2024-04-17T13:12:09.594389-0400 pouet kernel:  e1000e_read_phy_reg_mdic+0xa3/0x1d7
2024-04-17T13:12:09.594408-0400 pouet kernel:  e1000e_update_stats+0x155/0x725
2024-04-17T13:12:09.594427-0400 pouet kernel:  e1000e_get_stats64+0x2e/0x119
2024-04-17T13:12:09.594445-0400 pouet kernel:  dev_get_stats+0x37/0xca
2024-04-17T13:12:09.594468-0400 pouet kernel:  rtnl_fill_stats+0x41/0x123
2024-04-17T13:12:09.594487-0400 pouet kernel:  rtnl_fill_ifinfo+0x637/0xdae
2024-04-17T13:12:09.594505-0400 pouet kernel:  ? __kmalloc_node_track_caller+0x20c/0x237
2024-04-17T13:12:09.594520-0400 pouet kernel:  ? __wake_up_common_lock+0x4c/0x60
2024-04-17T13:12:09.594539-0400 pouet kernel:  ? kmalloc_reserve+0xa9/0xe4
2024-04-17T13:12:09.594554-0400 pouet kernel:  rtmsg_ifinfo_build_skb+0xa4/0xec
2024-04-17T13:12:09.594573-0400 pouet kernel:  rtmsg_ifinfo_event+0x35/0x64
2024-04-17T13:12:09.594595-0400 pouet kernel:  rtmsg_ifinfo+0x1c/0x25
2024-04-17T13:12:09.594614-0400 pouet kernel:  netdev_state_change+0x69/0x88
2024-04-17T13:12:09.594632-0400 pouet kernel:  linkwatch_do_dev+0x42/0x57
2024-04-17T13:12:09.594651-0400 pouet kernel:  __linkwatch_run_queue+0x154/0x1e2
2024-04-17T13:12:09.594669-0400 pouet kernel:  linkwatch_event+0x25/0x2a
2024-04-17T13:12:09.594691-0400 pouet kernel:  process_scheduled_works+0x195/0x296
2024-04-17T13:12:09.594710-0400 pouet kernel:  worker_thread+0x1ca/0x224
2024-04-17T13:12:09.594733-0400 pouet kernel:  ? __pfx_worker_thread+0x10/0x10
2024-04-17T13:12:09.594751-0400 pouet kernel:  kthread+0xf8/0x103
2024-04-17T13:12:09.594770-0400 pouet kernel:  ? __pfx_kthread+0x10/0x10
2024-04-17T13:12:09.594784-0400 pouet kernel:  ret_from_fork+0x25/0x3a
2024-04-17T13:12:09.594802-0400 pouet kernel:  ? __pfx_kthread+0x10/0x10
2024-04-17T13:12:09.594817-0400 pouet kernel:  ret_from_fork_asm+0x1b/0x30
2024-04-17T13:12:09.594839-0400 pouet kernel:  </TASK>


I'm not quite sure what's to blame. The laptop uses ECC memory, so a cosmic-ray bit flip is unlikely.
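
For readers skimming the trace above: frames prefixed with `?` are unreliable stack entries, and the remaining ones show the actual call path that ended in a sleep. A minimal sketch for filtering them out of a journal-style log follows. The helper name `reliable_frames` and the regular expression are illustrative assumptions, not part of any kernel tooling, and the sample is an excerpt of the log above.

```python
import re

# Matches the tail of a journal line such as
#   "... kernel:  usleep_range_state+0x64/0x8a"
# Group 1 is the optional "? " marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def reliable_frames(log_text):
    """Return symbol names of frames not marked '?', innermost first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


# Excerpt copied from the trace in this report.
SAMPLE = """\
2024-04-17T13:12:09.594316-0400 pouet kernel:  schedule+0x3d/0x59
2024-04-17T13:12:09.594334-0400 pouet kernel:  schedule_hrtimeout_range_clock+0x9d/0xe6
2024-04-17T13:12:09.594353-0400 pouet kernel:  ? __pfx_hrtimer_wakeup+0x10/0x10
2024-04-17T13:12:09.594371-0400 pouet kernel:  usleep_range_state+0x64/0x8a
2024-04-17T13:12:09.594389-0400 pouet kernel:  e1000e_read_phy_reg_mdic+0xa3/0x1d7
2024-04-17T13:12:09.594408-0400 pouet kernel:  e1000e_update_stats+0x155/0x725
2024-04-17T13:12:09.594427-0400 pouet kernel:  e1000e_get_stats64+0x2e/0x119
"""

if __name__ == "__main__":
    # Print the call chain, innermost frame first.
    print(" <- ".join(reliable_frames(SAMPLE)))
```

Read this way, the excerpt shows `e1000e_get_stats64` reaching `usleep_range_state` through `e1000e_update_stats` and `e1000e_read_phy_reg_mdic`, which is where the scheduler is entered.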
Comment 1 *cJ* 2024-04-17 18:03:25 UTC
Typo in the description: I was running 6.8.6, which was the latest stable release until last night.
I'm compiling master to check whether I can reproduce the problem there.
Comment 2 *cJ* 2024-04-17 18:32:08 UTC
I reproduced the problem on 6.9.0-rc4-pouet-00034-g4b6b51322118 aka the latest master.

2024-04-17T14:28:26.203226-0400 pouet kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Half Duplex, Flow Control: Rx/Tx
2024-04-17T14:28:26.203619-0400 pouet kernel: BUG: scheduling while atomic: kworker/2:0/32/0x00000002
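
Both BUG lines end in the same `0x00000002`. As far as I understand the message, the three fields are the task name, the PID, and the preempt count. Here is a rough sketch for splitting that value into its parts. The bit layout used (8 bits preempt depth, 8 bits softirq, 4 bits hardirq, 4 bits NMI) is my assumption about recent kernels, and `decode_bug_line` is a hypothetical helper name.

```python
import re

# "BUG: scheduling while atomic: <comm>/<pid>/0x<preempt_count>"
# The comm itself may contain '/', e.g. "kworker/5:0".
BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/0x(?P<count>[0-9a-fA-F]{8})"
)


def decode_bug_line(line):
    """Split the BUG line into comm, pid and preempt-count fields.

    Assumed layout of the count (not taken from this report):
    bits 0-7 preempt depth, 8-15 softirq, 16-19 hardirq, 20-23 NMI.
    """
    match = BUG_RE.search(line)
    if match is None:
        return None
    count = int(match.group("count"), 16)
    return {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "preempt_depth": count & 0xFF,
        "softirq": (count >> 8) & 0xFF,
        "hardirq": (count >> 16) & 0xF,
        "nmi": (count >> 20) & 0xF,
    }


if __name__ == "__main__":
    for sample in (
        "kernel: BUG: scheduling while atomic: kworker/5:0/84234/0x00000002",
        "kernel: BUG: scheduling while atomic: kworker/2:0/32/0x00000002",
    ):
        print(decode_bug_line(sample))
```

If that layout holds, the softirq and hardirq fields are zero in both reports, which suggests the atomic section comes from disabled preemption rather than from interrupt context.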
Comment 3 Ronny Lindner 2024-04-19 08:24:56 UTC
I have been seeing the issue since 6.8.5. After going back to 6.8.4, it works again.
Comment 4 *cJ* 2024-04-19 13:28:04 UTC
I was asked yesterday to try this patch, but I haven't had time to do it yet:

https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20240417190320.3159360-1-vitaly.lifshits@intel.com/
Comment 5 *cJ* 2024-04-19 13:31:22 UTC
Reference: https://lore.kernel.org/lkml/dff8729b-3ab6-4b54-a3b0-60fabf031d62@intel.com/

I'll test in a few minutes.
Comment 6 *cJ* 2024-04-19 16:32:43 UTC
The patch works for me; I guess it will work for everyone.
Comment 7 Artem S. Tashkinov 2024-04-26 16:14:05 UTC
*** Bug 218766 has been marked as a duplicate of this bug. ***
Comment 8 James.Dutton 2024-04-30 13:55:15 UTC
I have the same problem with the e1000e on a recent 6.8.7 kernel.
The patch mentioned above fixes the problem:
https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20240417190320.3159360-1-vitaly.lifshits@intel.com/
