While running Syzkaller tests against kernel 4.19, we hit the following crash:

[ 273.278134] Internal error: Oops: 9600004f [#1] SMP
[ 273.278883] Process modprobe (pid: 9664, stack limit = 0x000000004ac45a30)
[ 273.279822] CPU: 1 PID: 9664 Comm: modprobe Kdump: loaded Not tainted 4.19.90-aarch64 #1
[ 273.281370] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 273.282377] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 273.283087] pc : genl_register_family+0x41c/0xc28
[ 273.283782] lr : genl_register_family+0x41c/0xc28
[ 273.284488] sp : ffff800161c0f820
[ 273.284981] x29: ffff800161c0f820 x28: ffff200083af0160
[ 273.285768] x27: ffff200083aef6e0 x26: 0000000000000007
[ 273.286554] x25: ffff20000354e048 x24: 0000000000000013
[ 273.287339] x23: ffff8001acbdf700 x22: ffff20000354e008
[ 273.288122] x21: 00000000000003ff x20: ffff200083aef000
[ 273.288927] x19: ffff20000354e000 x18: 0000000000000000
[ 273.289711] x17: 0000000000000000 x16: ffff200081153130
[ 273.290474] x15: 0000000000000000 x14: ffff2000802f0664
[ 273.291187] x13: ffff2000802eff34 x12: ffff2000802f29bc
[ 273.291899] x11: ffff20008008649c x10: ffff200003558030
[ 273.292621] x9 : ffff200081153540 x8 : ffff200080085150
[ 273.293335] x7 : ffff2000800aa958 x6 : ffff200083eec320
[ 273.294048] x5 : ffff80018ed2c000 x4 : ffff20000354e048
[ 273.294760] x3 : 0000000000001c80 x2 : dfff200000000000
[ 273.295472] x1 : 0000000000000007 x0 : 0000000000000000
[ 273.296187] Call trace:
[ 273.296538] genl_register_family+0x41c/0xc28
[ 273.297126] l2tp_nl_init+0x30/0x1000 [l2tp_netlink]
[ 273.297793] do_one_initcall+0xb4/0x508
[ 273.298311] do_init_module+0xe0/0x2ec
[ 273.298813] load_module+0x24ec/0x2760
[ 273.299317] __se_sys_finit_module+0x184/0x198
[ 273.299896] __arm64_sys_finit_module+0x4c/0x60
[ 273.300505] el0_svc_common+0xc8/0x2b8
[ 273.300999] el0_svc_handler+0xf8/0x160
[ 273.301503] el0_svc+0x10/0x218
[ 273.301917] Code: 97d20345 aa0003f7 aa1903e0 97d21ffa (f9002677)
[ 273.302719] kernel fault(0x1) notification starting on CPU 1
[ 273.303456] kernel fault(0x1) notification finished on CPU 1
[ 273.304177] Modules linked in: l2tp_netlink(+) l2tp_core pptp pppox ppp_generic slhc vhost_net nf_conntrack_netlink nfnetlink_cttimeout nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 loop nf_tables nfnetlink sctp vhost_vsock vmw_vsock_virtio_transport_common vhost vsock libcrc32c scsi_transport_iscsi af_key drop_monitor ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel geneve ip6_udp_tunnel udp_tunnel macsec macvtap tap ipvlan macvlan 8021q veth nlmon dummy bonding bridge stp llc ip6_gre ip6_tunnel tunnel6 gre tun binfmt_misc rfkill sunrpc vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce sha256_arm64 sha1_ce sch_fq_codel ext4 mbcache jbd2 virtio_net virtio_gpu net_failover virtio_blk failover virtio_pci virtio_mmio virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
[ 273.313123] ---[ end trace ceac72010c07a5cf ]---
[ 273.320130] Kernel panic - not syncing: Fatal exception
[ 273.327168] kernel fault(0x5) notification starting on CPU 1
[ 273.334218] kernel fault(0x5) notification finished on CPU 1
[ 273.341137] SMP: stopping secondary CPUs
[ 273.347680] Kernel Offset: disabled
[ 273.353791] CPU features: 0x52,a2200238
[ 273.359831] Memory Limit: none
[ 273.366455] Starting crashdump kernel...
[ 273.372440] Bye!

This problem occurs not only in the L2TP module but also in the devlink module. Their call stacks are identical:

[ 152.976384] Unable to handle kernel write to read-only memory at virtual address ffff200001e77048
[ 152.976398] Mem abort info:
[ 152.976407] ESR = 0x9600004f
[ 152.976418] Exception class = DABT (current EL), IL = 32 bits
[ 152.976427] SET = 0, FnV = 0
[ 152.976435] EA = 0, S1PTW = 0
[ 152.976444] Data abort info:
[ 152.976453] ISV = 0, ISS = 0x0000004f
[ 152.976461] CM = 0, WnR = 1
[ 152.976475] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000009f1fac8f
[ 152.976484] [ffff200001e77048] pgd=0000000239bfe003, pud=00000001efd29003, pmd=00000001de566003, pte=00600001bd5a1793
[ 152.976514] Internal error: Oops: 9600004f [#1] SMP
[ 152.977165] Process modprobe (pid: 9552, stack limit = 0x0000000028c88ea4)
[ 152.977942] CPU: 0 PID: 9552 Comm: modprobe Kdump: loaded Not tainted 4.19.90-aarch64 #1
[ 152.979414] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 152.980325] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 152.980971] pc : genl_register_family+0x41c/0xc28
[ 152.981596] lr : genl_register_family+0x41c/0xc28
[ 152.982238] sp : ffff80018911f820
[ 152.982690] x29: ffff80018911f820 x28: ffff200083af0160
[ 152.983420] x27: ffff200083aef6e0 x26: 000000000000002d
[ 152.984139] x25: ffff200001e77048 x24: 0000000000000013
[ 152.984869] x23: ffff80019cdaf500 x22: ffff200001e77008
[ 152.985605] x21: 00000000000003ff x20: ffff200083aef000
[ 152.986319] x19: ffff200001e77000 x18: 0000000000000000
[ 152.987049] x17: 0000000000000000 x16: ffff200081153130
[ 152.987768] x15: 0000000000000000 x14: ffff2000802f0664
[ 152.988472] x13: ffff2000802eff34 x12: ffff2000802f29bc
[ 152.989183] x11: ffff20008008649c x10: ffff200001da8154
[ 152.989892] x9 : ffff200081153540 x8 : ffff200080085150
[ 152.990624] x7 : ffff2000800aa958 x6 : ffff200083eec320
[ 152.991354] x5 : ffff800181dac000 x4 : ffff200001e77048
[ 152.992077] x3 : 0000000000002080 x2 : dfff200000000000
[ 152.992798] x1 : 0000000000000007 x0 : 0000000000000000
[ 152.993530] Call trace:
[ 152.993893] genl_register_family+0x41c/0xc28
[ 152.994504] devlink_module_init+0x20/0xecc [devlink]
[ 152.995199] do_one_initcall+0xb4/0x508
[ 152.995743] do_init_module+0xe0/0x2ec
[ 152.996276] load_module+0x24ec/0x2760
[ 152.996811] __se_sys_finit_module+0x184/0x198
[ 152.997438] __arm64_sys_finit_module+0x4c/0x60
[ 152.998071] el0_svc_common+0xc8/0x2b8
[ 152.998604] el0_svc_handler+0xf8/0x160
[ 152.999144] el0_svc+0x10/0x218
[ 152.999605] Code: 97d20345 aa0003f7 aa1903e0 97d21ffa (f9002677)
[ 153.000433] kernel fault(0x1) notification starting on CPU 0
[ 153.001205] kernel fault(0x1) notification finished on CPU 0
[ 153.001985] Modules linked in: devlink(+) overlay camellia_generic ceph libceph dns_resolver nfs lockd grace fscache ccm n_gsm ppp_synctty nfnetlink_cthelper l2tp_ip6 nft_compat n_hdlc loop l2tp_ppp l2tp_netlink vfio_iommu_type1 vfio cuse serpent_generic xcbc pptp nfnetlink_log nf_conntrack_netlink nf_tables nfnetlink_cttimeout xt_osf nf_conntrack nfnetlink_osf nf_defrag_ipv6 nf_defrag_ipv4 vhost_vsock af_key vmw_vsock_virtio_transport_common vsock uhid nfnetlink_queue ppp_async fuse pppoe pppox sctp ip_set tcp_diag l2tp_ip libcrc32c l2tp_core inet_diag nfnetlink_acct ppp_generic slhc nfnetlink vhost_net vhost ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel geneve ip6_udp_tunnel udp_tunnel macsec macvtap tap ipvlan macvlan 8021q veth nlmon dummy bonding bridge stp llc ip6_gre ip6_tunnel tunnel6
[ 153.011231] gre binfmt_misc tun rfkill sunrpc vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce sha256_arm64 sha1_ce sch_fq_codel ext4 mbcache jbd2 virtio_net virtio_gpu net_failover failover virtio_blk virtio_pci virtio_mmio virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: devlink]
[ 153.015147] ---[ end trace 2f72347ddc77231c ]---
[ 153.015779] Kernel panic - not syncing: Fatal exception
[ 153.016477] kernel fault(0x5) notification starting on CPU 0
[ 153.017242] kernel fault(0x5) notification finished on CPU 0
[ 153.018004] SMP: stopping secondary CPUs
[ 153.018566] Kernel Offset: disabled
[ 153.019070] CPU features: 0x52,a2200238
[ 153.019601] Memory Limit: none
[ 153.021284] Starting crashdump kernel...
[ 153.021843] Bye!

After reading the source code of the two modules, we found that both declare their genl_family with the __ro_after_init attribute:

static struct genl_family l2tp_nl_family __ro_after_init = ...
static struct genl_family devlink_nl_family __ro_after_init = ...

When the module init function calls genl_register_family(), that function attempts to initialize the family's id and attrbuf members (attrbuf has since been removed in newer kernels). dmesg reports that this memory is read-only, which implies it had already been write-protected, as if initialization had already completed, by the time genl_register_family() ran.

When __ro_after_init is removed from the declarations, Syzkaller no longer triggers the problem.

We would like to know whether this is a problem with Syzkaller or whether it affects most modules that use __ro_after_init.
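
As a cross-check of the "write to read-only memory" message, the ESR value printed in the log can be decoded by hand. The sketch below is a minimal user-space helper, not kernel code; the struct and function names are made up for illustration. The field positions follow the Arm architecture's ESR_ELx layout: EC in bits [31:26], IL in bit [25], ISS in bits [24:0], and, for data aborts, WnR in ISS bit [6] and the fault status code in ISS bits [5:0].

```c
#include <stdint.h>

/* Hypothetical container for the ESR_ELx fields relevant to a data abort. */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	unsigned int il;   /* instruction length bit, bit [25] */
	unsigned int iss;  /* instruction specific syndrome, bits [24:0] */
	unsigned int wnr;  /* write-not-read, ISS bit [6] */
	unsigned int dfsc; /* data fault status code, ISS bits [5:0] */
};

/* Split a 32-bit ESR value into the fields listed above. */
static struct esr_fields decode_esr(uint32_t esr)
{
	struct esr_fields f;

	f.ec = (esr >> 26) & 0x3f;
	f.il = (esr >> 25) & 0x1;
	f.iss = esr & 0x1ffffff;
	f.wnr = (f.iss >> 6) & 0x1;
	f.dfsc = f.iss & 0x3f;
	return f;
}
```

For ESR = 0x9600004f this yields EC = 0x25 (data abort taken at the current exception level), ISS = 0x4f, WnR = 1 (a write), and a fault status code of 0x0f, which the architecture defines as a level 3 permission fault. That agrees with the "DABT (current EL)", "ISS = 0x0000004f" and "WnR = 1" lines in the log.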
At a glance, this looks like a problem with __ro_after_init on your platform. The call trace shows genl_register_family() is called from do_init_module(); the __ro_after_init data should not have been marked read-only yet when genl is called.
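
The expected ordering (write during init, protect afterwards) can be modelled in user space. The sketch below is only an analogy, assuming a Linux/POSIX system: an anonymous page stands in for the module's ro_after_init data, mprotect() stands in for module_enable_ro(mod, true), and a plain store stands in for genl_register_family() assigning the family id. All names (fake_family, try_write, and so on) are hypothetical and do not exist in the kernel.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for struct genl_family; only the id field is modelled. */
struct fake_family {
	int id;
};

static sigjmp_buf fault_jmp;

/* Signal handler: jump back to try_write() instead of crashing. */
static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(fault_jmp, 1);
}

/* Store value to *p; return 1 if the store succeeded, 0 if it faulted. */
static int try_write(volatile int *p, int value)
{
	struct sigaction sa, old_segv, old_bus;
	int ok;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	if (sigsetjmp(fault_jmp, 1) == 0) {
		*p = value;
		ok = 1;
	} else {
		ok = 0; /* reached via siglongjmp() from on_fault() */
	}

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	return ok;
}

/* Allocate one writable page to play the role of the ro_after_init data. */
static struct fake_family *alloc_family_page(void)
{
	void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
		       PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	assert(p != MAP_FAILED);
	return p;
}

/* Analogue of module_enable_ro(mod, true): make the page read-only. */
static void protect_family_page(struct fake_family *f)
{
	int rc = mprotect(f, (size_t)sysconf(_SC_PAGESIZE), PROT_READ);

	assert(rc == 0);
	(void)rc;
}

/* Analogue of genl_register_family() assigning family->id. */
static int register_family(struct fake_family *f, int id)
{
	return try_write(&f->id, id);
}
```

Calling register_family() before protect_family_page() succeeds, which corresponds to the normal case described above. Calling it afterwards fails with a fault, which corresponds to the oops in the report: the write itself is legitimate, but the protection was applied too early.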
(In reply to Jakub Kicinski from comment #1)
> At a glance looks like a problem with __ro_after_init on your platform.
> The call trace shows genl_register_family() is called from do_init_module(),
> the __ro_after_init should not have been marked Read-Only yet when genl is
> called.

After thorough testing, we believe that this problem is caused by ftrace. In kernel 4.19, ftrace calls module_enable_ro() before do_init_module(), which marks the memory as read-only. The call chain is:

ftrace_replace_code
->__ftrace_replace_code
->ftrace_make_call
->module_enable_ro

The ftrace_make_call() function contains the following code. module_enable_ro() is called when the if branch is taken:

long offset = (long)pc - (long)addr;

if (offset < -SZ_128M || offset >= SZ_128M) {
	...
	module_disable_ro(mod);
	*dst = trampoline;
	module_enable_ro(mod, true);
	...
}

The problem can be reproduced with the following commands, although we do not know why the if condition is true:

cd /sys/kernel/debug/tracing/
echo function > current_tracer
echo :mod:l2tp_netlink > set_ftrace_filter
modprobe l2tp_netlink
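
The range test in the quoted snippet can be tried in isolation. The sketch below is a user-space illustration only: needs_trampoline() is a hypothetical name, and SZ_128M is redefined locally. The real inputs to the kernel check are the patched call site and the ftrace entry point, neither of which appears in the logs. As rough proxies, the usage note takes two register values from the first oops and assumes (without confirmation in this thread) that x19 points into the module and x8 into the kernel image.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local redefinition for this sketch; the kernel provides its own SZ_128M. */
#define SZ_128M (128LL * 1024 * 1024)

/*
 * Hypothetical helper mirroring the quoted condition: returns true when the
 * signed distance from addr to pc falls outside [-128 MiB, +128 MiB), i.e.
 * when the if branch containing module_enable_ro() would be taken.
 */
static bool needs_trampoline(uint64_t pc, uint64_t addr)
{
	int64_t offset = (int64_t)(pc - addr);

	return offset < -SZ_128M || offset >= SZ_128M;
}
```

With pc = 0xffff20000354e000 (x19) and addr = 0xffff200080085150 (x8), the offset is roughly -2 GiB, so the function returns true. If the module really is loaded that far from the kernel's ftrace entry point, this would be one possible reason the branch is entered, but that remains an assumption here rather than a confirmed diagnosis.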