Is this patch (just guessing :-D) needed for 4.19.93? I get the following splat when I run "rmmod igb" on Ryzen x86_64 / Fedora 30.

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=48a322b6f9965b2f1e4ce81af972f0e287b07ed0

I had the wireguard module loaded but not in use when the GPF happened. I wanted to remove igb because network device renaming stopped working with NetworkManager+systemd (I tried both udev 70-persistent-net.rules and systemd network/10-persistent-net.link).

general protection fault: 0000 [#1] PREEMPT SMP NOPTI
CPU: 5 PID: 6888 Comm: rmmod Tainted: G O T 4.19.93+ #41
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Taichi, BIOS P5.10 12/17/2018
RIP: 0010:remove_files.isra.0+0x1f/0x70
Code: fe ff ff ff eb 9b 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 fd 53 48 85 f6 74 24 48 8b 06 48 89 f3 48 85 c0 74 19 <48> 8b 30 31 d2 48 89 ef 48 83 c3 08 e8 00 d6 ff ff 48 8b 03 48 85
RSP: 0018:ffffa2d943a53c90 EFLAGS: 00010286
RAX: a0caaab371d2d028 RBX: ffffa15d294063c0 RCX: 0000000000000000
RDX: ffffa15d3aced578 RSI: ffffa15d294063c0 RDI: ffffa15d235dfbb0
RBP: ffffa15d235dfbb0 R08: 0000000000000000 R09: ffffa15d23766458
R10: 0000000000000000 R11: ffffa15d2a17a218 R12: ffffa15d3aced578
R13: 0000000000000000 R14: ffffa15d3acec1d0 R15: 0000000000000000
FS: 00007f1ef0501740(0000) GS:ffffa15d3e940000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a3c85fea8 CR3: 00000007e1826000 CR4: 00000000003406e0
Call Trace:
 sysfs_remove_group+0x3d/0x80
 sysfs_remove_groups+0x29/0x40
 device_remove_attrs+0x42/0x80
 device_del+0x162/0x380
 cdev_device_del+0x15/0x30
 posix_clock_unregister+0x21/0x50
 ptp_clock_unregister+0x6e/0x80
 igb_ptp_stop+0x1f/0x50 [igb]
 igb_remove+0x47/0x160 [igb]
 pci_device_remove+0x3b/0xa0
 device_release_driver_internal+0x183/0x250
 driver_detach+0x53/0x84
 bus_remove_driver+0x55/0xc6
 pci_unregister_driver+0x29/0xb0
 __x64_sys_delete_module+0x176/0x2d0
 ? exit_to_usermode_loop+0x74/0xd0
 do_syscall_64+0x6f/0x329
 ? trace_hardirqs_off_thunk+0x1a/0x1c
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1ef0629acb
Code: 73 01 c3 48 8b 0d bd 33 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8d 33 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd0a515eb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000561a3c855e20 RCX: 00007f1ef0629acb
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000561a3c855e88
RBP: 00007ffd0a515f18 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f1ef069dac0 R11: 0000000000000206 R12: 00007ffd0a5160e0
R13: 00007ffd0a517f8a R14: 0000561a3c8552a0 R15: 0000561a3c855e20
Modules linked in: nfnetlink_acct ip6table_mangle nf_log_ipv6 xt_hl ip6t_REJECT nf_reject_ipv6 xt_state ip6t_rt ip6table_filter ip6_tables iptable_nat nf_nat_ipv4 nf_nat iptable_raw iptable_mangle nf_log_ipv4 nf_log_common xt_LOG xt_hashlimit ipt_REJECT nf_reject_ipv4 xt_owner xt_length xt_limit xt_multiport xt_set xt_conntrack iptable_filter arptable_filter arp_tables dm_integrity nf_conntrack_netlink ip_set_bitmap_port ip_set_hash_mac ip_set_hash_net ip_set nfnetlink algif_hash algif_skcipher af_alg bnep hwmon_vid snd_usb_audio btusb btrtl snd_usbmidi_lib btbcm btintel snd_hwdep snd_rawmidi bluetooth ecdh_generic iwlmvm pktcdvd mac80211 iwlwifi kvm_amd kvm irqbypass cfg80211 wmi_bmof snd_hda_codec_realtek sp5100_tco k10temp snd_hda_codec_generic snd_hda_codec_hdmi i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core rtc_cmos acpi_cpufreq snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device snd_pcm wireguard(O) binfmt_misc ip6_udp_tunnel udp_tunnel sch_cake tcp_cubic tcp_westwood br_netfilter bridge stp llc ip_tables uas usb_storage usbhid rfkill mxm_wmi igb(-) ccp xhci_pci xhci_hcd usbcore usb_common wmi button 8021q mrp sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi snd_timer snd soundcore tun xt_tcpudp x_tables tcp_bbr nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sch_fq_codel sch_htb sch_pie fuse analog gameport joydev i2c_dev ecryptfs autofs4 amdkfd amd_iommu_v2 [last unloaded: pcspkr]
---[ end trace 4b6d51a0a27b5c23 ]---
RIP: 0010:remove_files.isra.0+0x1f/0x70
Code: fe ff ff ff eb 9b 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 55 48 89 fd 53 48 85 f6 74 24 48 8b 06 48 89 f3 48 85 c0 74 19 <48> 8b 30 31 d2 48 89 ef 48 83 c3 08 e8 00 d6 ff ff 48 8b 03 48 85
RSP: 0018:ffffa2d943a53c90 EFLAGS: 00010286
RAX: a0caaab371d2d028 RBX: ffffa15d294063c0 RCX: 0000000000000000
RDX: ffffa15d3aced578 RSI: ffffa15d294063c0 RDI: ffffa15d235dfbb0
RBP: ffffa15d235dfbb0 R08: 0000000000000000 R09: ffffa15d23766458
R10: 0000000000000000 R11: ffffa15d2a17a218 R12: ffffa15d3aced578
R13: 0000000000000000 R14: ffffa15d3acec1d0 R15: 0000000000000000
FS: 00007f1ef0501740(0000) GS:ffffa15d3e940000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a3c85fea8 CR3: 00000007e1826000 CR4: 00000000003406e0
4.19.93 has a33121e5487b424339636b25c35d3a180eaa5f5e, but I didn't get this splat with 4.19.90.
I encountered this bug on the 5.4 kernel branch as well, with versions >= 5.4.8. This commit (https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=a33121e5487b424339636b25c35d3a180eaa5f5e) seems to be the culprit: it is in versions >= 5.4.8 but not in versions < 5.4.8. Reverting the whole commit on 5.4.8 fixed the issue for me. However, I am not sure what exactly is wrong in the commit and have not figured out a proper fix yet.

Note that this bug's impact is bigger than it seems. Although removing a network driver module is rare in normal use, it can happen in everyday VM usage. A typical case is passing a PCI device through from the host to a VM: if the device is a network adapter driven by igb, the hypervisor needs to remove the associated driver module first, before setting up the IOMMU. I can confirm that in this case the bug causes the hypervisor (libvirt/KVM in my case) to freeze and the VM to fail to start. This is actually how I encountered the bug.
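For anyone trying to reproduce the passthrough case without a full "rmmod igb", a single device can be detached from its driver through sysfs. The sketch below is only an illustration: unbind_pci_device is a hypothetical helper name, the PCI address in the usage note is a placeholder, and it is an assumption (not verified on the affected kernels) that a per-device unbind reaches igb_remove the same way the trace above does.

```python
from pathlib import Path


def unbind_pci_device(bdf: str, sysfs_root: str = "/sys") -> bool:
    """Detach one PCI device from whatever driver is currently bound to it.

    bdf is the full PCI address (domain:bus:device.function).
    Returns False when no driver is bound, True once the unbind
    request has been written.
    """
    # /sys/bus/pci/devices/<bdf>/driver is a symlink to the bound driver;
    # it is absent when the device has no driver.
    driver = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / "driver"
    if not driver.exists():
        return False
    # Writing the address to the driver's "unbind" file asks the kernel
    # to release this one device from the driver.
    (driver / "unbind").write_text(bdf)
    return True
```

On a real host this needs root, for example unbind_pci_device("0000:05:00.0"). On an affected kernel it may trigger the same fault, so it is best tried on a test machine only.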
Probably fixed by 75718584cb3c64e6269109d4d54f888ac5a5fd15.