Bug 216652 - IGC driver crashes and prevents from using network
Summary: IGC driver crashes and prevents from using network
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: AMD Linux
Importance: P1 blocking
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-02 07:37 UTC by adam.lamarz
Modified: 2024-01-02 17:31 UTC
CC List: 8 users

See Also:
Kernel Version: 6.0.3 - 6.0.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
complete dmesg (130.39 KB, text/plain)
2022-11-02 07:37 UTC, adam.lamarz

Description adam.lamarz 2022-11-02 07:37:24 UTC
Created attachment 303120 [details]
complete dmesg

Hello, 
The IGC driver crashes randomly with the following dmesg output:

[61172.003677] igc 0000:0c:00.0 eno1: PCIe link lost, device now detached
[61172.003684] ------------[ cut here ]------------
[61172.003684] igc: Failed to read reg 0xc030!
[61172.003713] WARNING: CPU: 14 PID: 25885 at drivers/net/ethernet/intel/igc/igc_main.c:6197 igc_rd32+0xaf/0xc0 [igc]
[61172.003718] Modules linked in: ntfs3 ccm rfcomm vboxnetadp(OE) vboxnetflt(OE) snd_seq_dummy snd_hrtimer vboxdrv(OE) cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nls_iso8859_1 amdgpu intel_rapl_msr intel_rapl_common mt7921e mt7921_common snd_hda_codec_hdmi mt76_connac_lib btusb edac_mce_amd snd_hda_intel btrtl mt76 snd_intel_dspcfg btbcm iommu_v2 btintel snd_intel_sdw_acpi kvm_amd uvcvideo gpu_sched btmtk mac80211 snd_hda_codec drm_buddy kvm drm_ttm_helper videobuf2_vmalloc ttm videobuf2_memops bluetooth videobuf2_v4l2 snd_usb_audio drm_display_helper snd_hda_core videobuf2_common snd_usbmidi_lib crct10dif_pclmul ghash_clmulni_intel snd_hwdep videodev snd_seq_midi aesni_intel cec snd_seq_midi_event rc_core snd_rawmidi crypto_simd ecdh_generic cryptd cfg80211 ecc drm_kms_helper snd_pcm mc input_leds eeepc_wmi asus_nb_wmi snd_seq asus_wmi platform_profile snd_seq_device ledtrig_audio i2c_algo_bit snd_timer sparse_keymap wmi_bmof fb_sys_fops ccp syscopyarea rapl sysfillrect snd
[61172.003741]  k10temp libarc4 sysimgblt soundcore mac_hid drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid igc nvme nvme_core ahci crc32_pclmul i2c_piix4 libahci xhci_pci xhci_pci_renesas wmi video gpio_amdpt
[61172.003751] CPU: 14 PID: 25885 Comm: kworker/14:0 Tainted: G           OE      6.0.5 #1
[61172.003752] Hardware name: ASUS System Product Name/ROG STRIX X670E-F GAMING WIFI, BIOS 0705 10/05/2022
[61172.003753] Workqueue: events igc_watchdog_task [igc]
[61172.003756] RIP: 0010:igc_rd32+0xaf/0xc0 [igc]
[61172.003758] Code: c7 c6 60 f4 72 c0 e8 23 d9 95 f3 49 8b bc 24 28 ff ff ff e8 f3 50 23 f3 84 c0 74 a3 89 de 48 c7 c7 88 f4 72 c0 e8 5b 92 8b f3 <0f> 0b eb 91 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00
[61172.003759] RSP: 0018:ffff9a931c85bdb8 EFLAGS: 00010246
[61172.003760] RAX: 0000000000000000 RBX: 000000000000c030 RCX: 0000000000000000
[61172.003760] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[61172.003761] RBP: ffff9a931c85bdd0 R08: 0000000000000000 R09: 0000000000000000
[61172.003761] R10: 0000000000000000 R11: 0000000000000000 R12: ffff89d6f5f64c58
[61172.003762] R13: ffff89d6f5f64000 R14: 0000000000000000 R15: ffff89d681347d80
[61172.003762] FS:  0000000000000000(0000) GS:ffff89e558580000(0000) knlGS:0000000000000000
[61172.003763] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[61172.003763] CR2: ffff8908bd341808 CR3: 000000011a844000 CR4: 0000000000750ee0
[61172.003764] PKRU: 55555554
[61172.003764] Call Trace:
[61172.003765]  <TASK>
[61172.003767]  igc_update_stats+0xa5/0x740 [igc]
[61172.003769]  igc_watchdog_task+0xaa/0x330 [igc]
[61172.003771]  process_one_work+0x225/0x400
[61172.003774]  worker_thread+0x50/0x3e0
[61172.003774]  ? process_one_work+0x400/0x400
[61172.003775]  kthread+0xe9/0x110
[61172.003777]  ? kthread_complete_and_exit+0x20/0x20
[61172.003778]  ret_from_fork+0x22/0x30
[61172.003780]  </TASK>
[61172.003780] ---[ end trace 0000000000000000 ]---


BR,
Adam
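
The two signature lines in the trace above (the "PCIe link lost" notice and the "Failed to read reg" warning) can be picked out of a long dmesg capture automatically. What follows is only a minimal sketch, assuming the log lines keep the exact wording quoted in this report; the function name find_igc_failures and the field names are hypothetical and are not part of the igc driver or any existing tool.

```python
import re

# Patterns modeled on the two log lines quoted in this report.
LINK_LOST = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+igc\s+(\S+)\s+(\S+):\s+PCIe link lost, device now detached"
)
READ_FAIL = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+igc: Failed to read reg (0x[0-9a-fA-F]+)!"
)


def find_igc_failures(dmesg_text):
    """Return a list of dicts describing igc link-loss and register-read failures."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINK_LOST.search(line)
        if match:
            events.append({
                "time": float(match.group(1)),
                "kind": "link_lost",
                "pci_address": match.group(2),
                "interface": match.group(3),
            })
            continue
        match = READ_FAIL.search(line)
        if match:
            events.append({
                "time": float(match.group(1)),
                "kind": "read_failed",
                "register": match.group(2),
            })
    return events


# Sample input copied from the dmesg output in this report.
SAMPLE = """\
[61172.003677] igc 0000:0c:00.0 eno1: PCIe link lost, device now detached
[61172.003684] ------------[ cut here ]------------
[61172.003684] igc: Failed to read reg 0xc030!
"""

if __name__ == "__main__":
    for event in find_igc_failures(SAMPLE):
        print(event)
```

The same function can be fed the text of a saved dmesg file, which makes it easier to compare how long the system ran before each failure.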
Comment 1 Artem S. Tashkinov 2022-11-07 14:04:24 UTC
CC'ing Sasha Neftin.
Comment 2 Whitley 2022-11-08 09:32:05 UTC
I'm also seeing this issue on my new build, and I have already reached out to ASUS about this error. The notorious I225-V bug has come back to haunt us again.

https://www.intel.com/content/www/us/en/support/articles/000057261/ethernet-products/gigabit-ethernet-controllers-up-to-2-5gbe.html

https://www.reddit.com/r/intel/comments/lqb4km/for_people_having_i225v_connection_issues/

[ 7575.621414] ------------[ cut here ]------------
[ 7575.621415] igc: Failed to read reg 0xc030!
[ 7575.621445] WARNING: CPU: 25 PID: 3397 at drivers/net/ethernet/intel/igc/igc_main.c:6174 igc_rd32+0x98/0xa0 [igc]
[ 7575.621451] Modules linked in: fuse xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables bridge 8021q garp mrp des_generic stp libdes llc md4 binfmt_misc intel_rapl_msr vfat fat intel_rapl_common nvidia_drm(PO) nvidia_modeset(PO) edac_mce_amd iwlmvm kvm_amd nvidia(PO) snd_hda_codec_hdmi mac80211 snd_usb_audio snd_hda_intel kvm libarc4 snd_usbmidi_lib snd_intel_dspcfg pkcs8_key_parser btusb snd_intel_sdw_acpi snd_rawmidi irqbypass btrtl snd_hda_codec snd_seq_device razerkbd(O) btbcm mc joydev iwlwifi rapl btintel snd_hda_core eeepc_wmi asus_wmi snd_hwdep bluetooth sparse_keymap pcspkr platform_profile wmi_bmof i2c_piix4 ecdh_generic drm_kms_helper snd_pcm cfg80211 cec snd_timer rfkill snd drm soundcore acpi_cpufreq crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sp5100_tco ccp nvme nvme_core igc wmi video
[ 7575.621483] CPU: 25 PID: 3397 Comm: polybar Tainted: P           O      5.15.75-gentoo-dist #1
[ 7575.621486] Hardware name: ASUS System Product Name/ROG STRIX X670E-E GAMING WIFI, BIOS 0705 10/05/2022
[ 7575.621486] RIP: 0010:igc_rd32+0x98/0xa0 [igc]
[ 7575.621489] Code: 48 c7 c6 18 b4 31 c0 e8 48 b9 97 fa 48 8b bb 30 ff ff ff e8 ba f7 3f fa 84 c0 74 b6 89 ee 48 c7 c7 40 b4 31 c0 e8 78 38 92 fa <0f> 0b eb 88 0f 1f 40 00 0f 1f 44 00 00 41 56 41 55 41 54 55 48 89
[ 7575.621491] RSP: 0018:ffffac2d4267b788 EFLAGS: 00010292
[ 7575.621492] RAX: 000000000000001f RBX: ffff9f4f90ceec10 RCX: 0000000000000027
[ 7575.621493] RDX: ffff9f5678860748 RSI: 0000000000000001 RDI: ffff9f5678860740
[ 7575.621494] RBP: 000000000000c030 R08: 0000000000000000 R09: ffffac2d4267b5c8
[ 7575.621495] R10: ffffac2d4267b5c0 R11: ffffffffbb73bfc8 R12: 00000000ffffffff
[ 7575.621495] R13: ffff9f4f90cee000 R14: ffff9f4f90ccad40 R15: 000000000000c030
[ 7575.621496] FS:  00007fe58f7fe640(0000) GS:ffff9f5678840000(0000) knlGS:0000000000000000
[ 7575.621497] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7575.621498] CR2: 000026ec01641000 CR3: 00000001132f6000 CR4: 0000000000750ee0
[ 7575.621499] PKRU: 55555554
[ 7575.621500] Call Trace:
[ 7575.621502]  <TASK>
[ 7575.621503]  igc_update_stats+0x72/0x850 [igc]
[ 7575.621505]  igc_update_stats+0x723/0x850 [igc]
[ 7575.621507]  dev_get_stats+0x59/0xc0
[ 7575.621511]  rtnl_fill_stats+0x3b/0x130
[ 7575.621514]  rtnl_fill_ifinfo+0x6bf/0x12f0
[ 7575.621515]  ? nla_put+0x28/0x40
[ 7575.621518]  ? rtnl_fill_ifinfo+0x1210/0x12f0
[ 7575.621519]  rtnl_dump_ifinfo+0x528/0x650
[ 7575.621522]  ? ksize+0x14/0x30
[ 7575.621525]  ? __build_skb_around+0xa0/0xb0
[ 7575.621527]  ? __alloc_skb+0xf0/0x1e0
[ 7575.621529]  netlink_dump+0x185/0x310
[ 7575.621532]  ? rtnl_fill_ifinfo+0x12f0/0x12f0
[ 7575.621533]  __netlink_dump_start+0x1ec/0x2d0
[ 7575.621535]  rtnetlink_rcv_msg+0x26a/0x350
[ 7575.621536]  ? futex_wait_queue_me+0xb3/0x100
[ 7575.621538]  ? rtnl_fill_ifinfo+0x12f0/0x12f0
[ 7575.621540]  ? rtnl_calcit.isra.0+0x100/0x100
[ 7575.621541]  netlink_rcv_skb+0x4e/0x100
[ 7575.621543]  netlink_unicast+0x1d9/0x2b0
[ 7575.621545]  netlink_sendmsg+0x23e/0x490
[ 7575.621547]  sock_sendmsg+0x5f/0x70
[ 7575.621549]  __sys_sendto+0xf0/0x160
[ 7575.621552]  __x64_sys_sendto+0x20/0x30
[ 7575.621553]  do_syscall_64+0x38/0xc0
[ 7575.621557]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 7575.621560] RIP: 0033:0x7fe5b094c30e
[ 7575.621561] Code: 85 f7 ff 41 89 c4 44 8b 4c 24 2c 4c 8b 44 24 20 44 8b 54 24 28 48 8b 54 24 18 48 8b 74 24 10 8b 7c 24 08 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 44 89 e7 48 89 44 24 08 e8 3d 86 f7 ff 48
[ 7575.621562] RSP: 002b:00007fe58f7fc280 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
[ 7575.621563] RAX: ffffffffffffffda RBX: 00007fe58f7fd420 RCX: 00007fe5b094c30e
[ 7575.621564] RDX: 0000000000000014 RSI: 00007fe58f7fd360 RDI: 0000000000000015
[ 7575.621564] RBP: 00007fe58f7fd3b0 R08: 00007fe58f7fd320 R09: 000000000000000c
[ 7575.621565] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 7575.621566] R13: 00007fe58f7fd360 R14: 0000555f9ff71e70 R15: 00007fe58f7fdbc0
[ 7575.621567]  </TASK>
[ 7575.621567] ---[ end trace c61ea224e51732ae ]---
Comment 3 Whitley 2022-11-08 09:43:00 UTC
(In reply to adam.lamarz from comment #0)

Adam, I just found this. Give this a shot? https://www.reddit.com/r/buildapc/comments/xypn1m/network_card_intel_ethernet_controller_i225v_igc/
Comment 4 adam.lamarz 2022-11-09 08:06:17 UTC
(In reply to Whitley from comment #3)
> Adam, I just found this. Give this a shot?
> https://www.reddit.com/r/buildapc/comments/xypn1m/network_card_intel_ethernet_controller_i225v_igc/

Thanks for the reply. I added this parameter to my boot config, and I'm testing it now.
BR,
Adam
Comment 5 Whitley 2022-11-09 22:11:52 UTC
No problem, Adam. Like many people all over the internet, I've been pulling my hair out over the I225-V bugs.

ASUS got back to me and said it's a "hardware failure", but in reality it's a firmware failure. Maybe try swapping out your motherboard for the Hero version, because people on Amazon have not been having issues with that board. Or just get a PCI Ethernet adapter that's not based on the I225-V chipset. Best of luck.
Comment 6 adam.lamarz 2022-11-10 06:42:41 UTC
Unfortunately, the IGC driver is still crashing :(
[62579.351616] igc 0000:0c:00.0 eno1: PCIe link lost, device now detached
[62579.351622] ------------[ cut here ]------------
[62579.351623] igc: Failed to read reg 0xc030!
[62579.351653] WARNING: CPU: 21 PID: 22349 at drivers/net/ethernet/intel/igc/igc_main.c:6197 igc_rd32+0x95/0xa0 [igc]
[62579.351663] Modules linked in: ntfs3 tls btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi snd_seq_device mc 8021q garp mrp mousedev stp uas llc intel_rapl_msr usb_storage ip6t_REJECT vfat intel_rapl_common usbhid fat nf_reject_ipv6 xt_hl ip6t_rt asus_nb_wmi mt7921e asus_wmi mt7921_common ipt_REJECT ledtrig_audio mt76_connac_lib edac_mce_amd sparse_keymap nf_reject_ipv4 i8042 snd_hda_codec_hdmi platform_profile mt76 xt_LOG kvm_amd amdgpu serio wmi_bmof nf_log_syslog kvm snd_hda_intel mac80211 snd_intel_dspcfg snd_intel_sdw_acpi irqbypass crct10dif_pclmul xt_limit crc32_pclmul snd_hda_codec polyval_clmulni xt_addrtype snd_hda_core polyval_generic libarc4 gf128mul xt_tcpudp ghash_clmulni_intel snd_hwdep gpu_sched cfg80211 aesni_intel drm_buddy snd_pcm drm_ttm_helper xt_conntrack crypto_simd ttm sp5100_tco cryptd snd_timer ccp nf_conntrack rapl pcspkr k10temp
[62579.351697]  i2c_piix4 drm_display_helper snd igc rfkill nf_defrag_ipv6 tpm_crb soundcore cec nf_defrag_ipv4 tpm_tis libcrc32c wmi tpm_tis_core video tpm ip6table_filter rng_core ip6_tables gpio_amdpt mac_hid acpi_cpufreq gpio_generic iptable_filter vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common
[62579.351717] CPU: 21 PID: 22349 Comm: kworker/21:1 Tainted: G           OE      6.0.7-arch1-1 #1 54734d35253fb4c526adcfdfa2e7225be9ec4a9a
[62579.351720] Hardware name: ASUS System Product Name/ROG STRIX X670E-F GAMING WIFI, BIOS 0705 10/05/2022
[62579.351721] Workqueue: events igc_watchdog_task [igc]
[62579.351726] RIP: 0010:igc_rd32+0x95/0xa0 [igc]
[62579.351731] Code: 48 c7 c6 b8 96 b3 c0 e8 94 c1 eb eb 48 8b bd 28 ff ff ff e8 cd da 8f eb 84 c0 74 b4 89 de 48 c7 c7 e0 96 b3 c0 e8 ab c3 e6 eb <0f> 0b eb a2 0f 1f 80 00 00 00 00 66 0f 1f 00 0f 1f 44 00 00 41 56
[62579.351732] RSP: 0018:ffffa1cc03f47df0 EFLAGS: 00010282
[62579.351734] RAX: 0000000000000000 RBX: 000000000000c030 RCX: 0000000000000027
[62579.351735] RDX: ffff8ddd58761668 RSI: 0000000000000001 RDI: ffff8ddd58761660
[62579.351736] RBP: ffff8dce94252c58 R08: 0000000000000000 R09: ffffa1cc03f47c78
[62579.351737] R10: 0000000000000003 R11: ffff8ddd97fa7c28 R12: ffff8dce94252000
[62579.351738] R13: 0000000000000000 R14: ffff8dce812f6d80 R15: 000000000000c030
[62579.351739] FS:  0000000000000000(0000) GS:ffff8ddd58740000(0000) knlGS:0000000000000000
[62579.351740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[62579.351741] CR2: 0000000000000000 CR3: 00000004588fe000 CR4: 0000000000750ee0
[62579.351742] PKRU: 55555554
[62579.351743] Call Trace:
[62579.351744]  <TASK>
[62579.351746]  igc_update_stats+0x8a/0x6c0 [igc dbd7ab4a69bc1f61b193f26242c532d95a5b95ae]
[62579.351751]  igc_watchdog_task+0xa7/0x2c0 [igc dbd7ab4a69bc1f61b193f26242c532d95a5b95ae]
[62579.351755]  process_one_work+0x1c7/0x380
[62579.351759]  worker_thread+0x51/0x390
[62579.351761]  ? rescuer_thread+0x3b0/0x3b0
[62579.351763]  kthread+0xde/0x110
[62579.351765]  ? kthread_complete_and_exit+0x20/0x20
[62579.351767]  ret_from_fork+0x22/0x30
[62579.351771]  </TASK>
[62579.351771] ---[ end trace 0000000000000000 ]---
Comment 7 adam.lamarz 2022-11-10 06:43:16 UTC
[    0.000000] Linux version 6.0.7-arch1-1 (linux@archlinux) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39.0) #1 SMP PREEMPT_DYNAMIC Thu, 03 Nov 2022 18:01:58 +0000
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=cb1d4d5b-3600-49be-a58d-10d52195e45e rw loglevel=3 quiet pcie_port_pm=off
Comment 8 Whitley 2022-11-10 10:04:59 UTC
(In reply to adam.lamarz from comment #7)
> [    0.000000] Linux version 6.0.7-arch1-1 (linux@archlinux) (gcc (GCC) 12.2.0,
> GNU ld (GNU Binutils) 2.39.0) #1 SMP PREEMPT_DYNAMIC Thu, 03 Nov 2022
> 18:01:58 +0000
> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux
> root=UUID=cb1d4d5b-3600-49be-a58d-10d52195e45e rw loglevel=3 quiet
> pcie_port_pm=off

Hmm, it's been OK so far for me, but I am swapping out the motherboard from Amazon because I don't trust this chipset. I haven't had a crash on Ethernet. Here is my kernel version and GRUB config:

Linux rei 6.0.7-gentoo-dist #1 SMP PREEMPT_DYNAMIC Tue Nov 8 03:41:12 EST 2022 x86_64 AMD Ryzen 9 7950X 16-Core Processor AuthenticAMD GNU/Linux

My /etc/default/grub has this:

GRUB_CMDLINE_LINUX_DEFAULT="pcie_port_pm=off"

Then I ran:

grub-mkconfig -o /boot/grub/grub.cfg

and that applies the new bootloader parameters. I would really like to hear Sasha Neftin's input -- she is way more intelligent than I'll ever be and could absolutely help point us in the right direction on how to fix this bug. ;)
Comment 9 adam.lamarz 2022-11-10 10:27:10 UTC
My /etc/default/grub has this line:
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet pcie_port_pm=off"

and grub.cfg has this line:
linux   /vmlinuz-linux root=UUID=cb1d4d5b-3600-49be-a58d-10d52195e45e rw  loglevel=3 quiet pcie_port_pm=off

After applying this new setting, I left my PC running overnight and it was fine, but when I started work this morning the IGC driver crashed :(
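
Since the setting lives in two GRUB files, it can help to check what the running kernel actually received, which Linux exposes in /proc/cmdline. The following is a rough sketch of such a check, not an official tool; the function names parse_cmdline, missing_params and read_running_cmdline are hypothetical, and the simple whitespace split assumes no quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline, required):
    """Return the required 'name=value' settings that are absent or differ."""
    present = parse_cmdline(cmdline)
    return [
        f"{name}={value}"
        for name, value in required.items()
        if present.get(name) != value
    ]


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


# Command line copied from the grub.cfg line quoted in this thread.
EXAMPLE = (
    "BOOT_IMAGE=/vmlinuz-linux root=UUID=cb1d4d5b-3600-49be-a58d-10d52195e45e "
    "rw loglevel=3 quiet pcie_port_pm=off"
)

if __name__ == "__main__":
    print(missing_params(EXAMPLE, {"pcie_port_pm": "off"}))
```

On a live system, passing read_running_cmdline() instead of EXAMPLE checks the booted kernel rather than the configuration files.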
Comment 10 Whitley 2022-11-11 05:34:22 UTC
Hmm, you seem to have the same parameters.

Adam, try this?

https://www.reddit.com/r/linuxhardware/comments/y2f3mf/igc_kernel_crashes_on_i225v_with_latest/

Intel engineers please help give an explanation as to what's going on!
Comment 11 Whitley 2022-11-11 10:14:30 UTC
IGC just crashed on me, so never mind! It's still broken even with the pcie_port_pm=off parameter in GRUB.
Comment 12 Whitley 2022-11-14 11:23:00 UTC
Swapped in a new motherboard (upgraded from the Strix to the Crosshair) and so far, no issues. Will update if I get a crash.
Comment 13 amir.avivi 2022-11-14 11:23:33 UTC
Created attachment 303174 [details]
attachment-24220-0.html

Thanks for your email, I am OOO on maternity leave.
You can contact me by Teams, but response might take some time.
For urgent issue, please talk to my manager shmuel.ben-nisan@intel.com.

Thanks,
Amir
Comment 14 Whitley 2022-11-14 13:15:35 UTC
Amir, thank you for your response. I believe the Reddit thread is in contact with Intel engineers, but I will forward this. Great to see Intel treating their employees well with ML. Congratulations on your newborn!
Comment 15 adam.lamarz 2022-11-23 08:18:08 UTC
Hello,
Just to keep you informed :)
With the latest BIOS update from ASUS (ROG STRIX X670E-F GAMING WIFI BIOS 0805) and
these kernel parameters:
linux   /vmlinuz-linux root=UUID=cb1d4d5b-3600-49be-a58d-10d52195e45e rw  loglevel=3 quiet pcie_port_pm=off pcie_aspm.policy=performance

my IGC has stopped crashing. My PC has now been running for about 32 hours straight.

BR,
Adam
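
To confirm that pcie_aspm.policy=performance took effect after boot, the kernel lists the available ASPM policies in /sys/module/pcie_aspm/parameters/policy and marks the active one with square brackets. Below is a minimal sketch of reading that marker, assuming a kernel built with ASPM support so that the file exists; the function names are hypothetical.

```python
import re


def active_aspm_policy(policy_text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


def read_aspm_policy(path="/sys/module/pcie_aspm/parameters/policy"):
    """Read the policy list from sysfs and return the active policy (Linux only)."""
    with open(path) as handle:
        return active_aspm_policy(handle.read())


if __name__ == "__main__":
    # Example content in the format the kernel uses for this file.
    print(active_aspm_policy("default [performance] powersave powersupersave"))
```

If read_aspm_policy() reports something other than performance, the parameter was probably not passed to the kernel that is currently running.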
