Hi,

with kernel 5.8.14, access to a cvmfs (https://cvmfs.readthedocs.io/en/stable/) filesystem mounted via fuse stops working. It is hard to trigger in a controlled way; it happens randomly after many hours of running. The dmesg output is below. There were no issues with the 5.7.4 kernel.

Cheers,
Andrej

[87016.331070] BUG: kernel NULL pointer dereference, address: 000000000000040a
[87016.331143] #PF: supervisor read access in kernel mode
[87016.331206] #PF: error_code(0x0000) - not-present page
[87016.331267] PGD 3d42bd067 P4D 3d42bd067 PUD 3fac94067 PMD 0
[87016.331333] Oops: 0000 [#1] SMP NOPTI
[87016.331394] CPU: 5 PID: 229156 Comm: bash Not tainted 5.8.14 #1
[87016.331456] Hardware name: Supermicro H8DMT/H8DMT, BIOS 080014 09/24/2009
[87016.331532] RIP: 0010:fuse_readahead+0xfe/0x4d0 [fuse]
[87016.331595] Code: 18 48 8b 53 10 8b 43 18 48 8d 7c 24 10 48 8d 74 02 ff e8 a5 1b d8 e5 48 89 c7 48 85 c0 0f 84 82 03 00 00 48 89 6c 24 08 31 c0 <48> 8b 4f 08 48 8d 51 ff 83 e1 01 48 0f 44 d7 48 8b 32 83 e6 01 0f
[87016.331716] RSP: 0018:ffffb26e4966fa00 EFLAGS: 00010246
[87016.331779] RAX: 0000000000000000 RBX: ffffb26e4966fb10 RCX: 0000000000000002
[87016.331845] RDX: 0000000000000000 RSI: ffff9bbb4b069240 RDI: 0000000000000402
[87016.331910] RBP: ffff9bbfa6bb1400 R08: 0000000000000402 R09: 0000000000000000
[87016.331975] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[87016.332040] R13: ffff9bbbef33ad80 R14: ffff9bbbe5ac8890 R15: ffff9bbbef33af00
[87016.332106] FS: 0000149ccfedd740(0000) GS:ffff9bbc0fd40000(0000) knlGS:0000000000000000
[87016.332189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[87016.332252] CR2: 000000000000040a CR3: 000000040c794000 CR4: 00000000000006e0
[87016.332317] Call Trace:
[87016.332385] read_pages+0x5d/0x300
[87016.332445] ? 0xffffffffa6000000
[87016.332505] page_cache_readahead_unbounded+0x18f/0x230
[87016.332571] generic_file_buffered_read+0x557/0xba0
[87016.332633] ? generic_file_buffered_read+0x2d9/0xba0
[87016.332696] do_iter_readv_writev+0x167/0x190
[87016.332758] do_iter_read+0xd4/0x190
[87016.332825] ovl_read_iter+0x16c/0x180 [overlay]
[87016.332888] new_sync_read+0x102/0x180
[87016.332949] __kernel_read+0x11a/0x160
[87016.333011] load_elf_phdrs+0x5b/0xa0
[87016.333071] load_elf_binary+0x745/0x16b0
[87016.333134] __do_execve_file+0x5f8/0xbc0
[87016.333195] do_execve+0x27/0x30
[87016.333254] __x64_sys_execve+0x27/0x30
[87016.333315] do_syscall_64+0x4d/0xd0
[87016.333377] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87016.333439] RIP: 0033:0x149ccf59dc37
[87016.333498] Code: ff ff 76 df 89 c6 f7 de 64 41 89 32 eb d5 89 c6 f7 de 64 41 89 32 eb db 66 2e 0f 1f 84 00 00 00 00 00 90 b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 08 12 30 00 f7 d8 64 89 02
[87016.333619] RSP: 002b:00007ffcf11d9e88 EFLAGS: 00000206 ORIG_RAX: 000000000000003b
[87016.333701] RAX: ffffffffffffffda RBX: 0000000001a141c0 RCX: 0000149ccf59dc37
[87016.333766] RDX: 0000000001a1ca70 RSI: 0000000001a153e0 RDI: 0000000001a141c0
[87016.333832] RBP: 0000000001a1a6d0 R08: 0000000000000000 R09: 0000000000000018
[87016.333897] R10: 00007ffcf11d9a50 R11: 0000000000000206 R12: 0000000000000000
[87016.333962] R13: 0000000001a153e0 R14: 0000000001a1ca70 R15: 0000000000000000
[87016.334028] Modules linked in: squashfs loop ceph libceph fscache overlay fuse 8021q garp mrp stp llc nft_limit nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink amd64_edac_mod edac_mce_amd kvm_amd ccp kvm snd_pcm nv_tco joydev irqbypass snd_timer snd i2c_nforce2 forcedeth k10temp soundcore ipmi_si ipmi_devintf ipmi_msghandler pcspkr acpi_cpufreq drm ip_tables xfs ata_generic serio_raw pata_acpi sata_nv pata_amd
[87016.334262] CR2: 000000000000040a
[87016.334396] ---[ end trace 7c21ec6226df76cc ]---
[87016.334462] RIP: 0010:fuse_readahead+0xfe/0x4d0 [fuse]
[87016.334525] Code: 18 48 8b 53 10 8b 43 18 48 8d 7c 24 10 48 8d 74 02 ff e8 a5 1b d8 e5 48 89 c7 48 85 c0 0f 84 82 03 00 00 48 89 6c 24 08 31 c0 <48> 8b 4f 08 48 8d 51 ff 83 e1 01 48 0f 44 d7 48 8b 32 83 e6 01 0f
[87016.334651] RSP: 0018:ffffb26e4966fa00 EFLAGS: 00010246
[87016.334714] RAX: 0000000000000000 RBX: ffffb26e4966fb10 RCX: 0000000000000002
[87016.334781] RDX: 0000000000000000 RSI: ffff9bbb4b069240 RDI: 0000000000000402
[87016.334849] RBP: ffff9bbfa6bb1400 R08: 0000000000000402 R09: 0000000000000000
[87016.334915] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[87016.334982] R13: ffff9bbbef33ad80 R14: ffff9bbbe5ac8890 R15: ffff9bbbef33af00
[87016.335048] FS: 0000149ccfedd740(0000) GS:ffff9bbc0fd40000(0000) knlGS:0000000000000000
[87016.335133] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[87016.335196] CR2: 000000000000040a CR3: 000000040c794000 CR4: 00000000000006e0
[87016.336180] BUG: Bad rss-counter state mm:000000002916590a type:MM_ANONPAGES val:4
Same with kernel 5.9.2; it seems that every 5.8 and 5.9 version is affected.

Nov 1 06:27:24 f9nd164 kernel: [156924.660102] BUG: kernel NULL pointer dereference, address: 000000000000040a
Nov 1 06:27:24 f9nd164 kernel: [156924.662561] #PF: supervisor read access in kernel mode
Nov 1 06:27:24 f9nd164 kernel: [156924.664778] #PF: error_code(0x0000) - not-present page
Nov 1 06:27:24 f9nd164 kernel: [156924.666907] PGD 52015d067 P4D 52015d067 PUD 730869067 PMD 0
Nov 1 06:27:24 f9nd164 kernel: [156924.669142] Oops: 0000 [#1] SMP NOPTI
Nov 1 06:27:24 f9nd164 kernel: [156924.671135] CPU: 13 PID: 54693 Comm: ps Not tainted 5.9.2 #1
Nov 1 06:27:24 f9nd164 kernel: [156924.673179] Hardware name: Supermicro AS -1042G-TF/H8QG6, BIOS 3.5 12/16/2013
Nov 1 06:27:24 f9nd164 kernel: [156924.675389] RIP: 0010:fuse_readahead+0xfe/0x4a0 [fuse]
Nov 1 06:27:24 f9nd164 kernel: [156924.677330] Code: 18 48 8b 53 10 8b 43 18 48 8d 7c 24 10 48 8d 74 02 ff e8 15 37 16 f6 48 89 c7 48 85 c0 0f 84 70 03 00 00 48 89 6c 24 08 31 c0 <48> 8b 4f 08 48 8d 51 ff 83 e1 01 48 0f 44 d7 48 8b 32 83 e6 01 0f
Nov 1 06:27:24 f9nd164 kernel: [156924.681468] RSP: 0000:ffffb6ac4b2d7c38 EFLAGS: 00010246
Nov 1 06:27:24 f9nd164 kernel: [156924.683425] RAX: 0000000000000000 RBX: ffffb6ac4b2d7d48 RCX: 0000000000000002
Nov 1 06:27:24 f9nd164 kernel: [156924.685319] RDX: 0000000000000000 RSI: ffff9027d45b26c8 RDI: 0000000000000402
Nov 1 06:27:24 f9nd164 kernel: [156924.687386] RBP: ffff902e7b58de00 R08: 0000000000000402 R09: 0000000000000000
Nov 1 06:27:24 f9nd164 kernel: [156924.689329] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
Nov 1 06:27:24 f9nd164 kernel: [156924.691103] R13: ffff902e75a70340 R14: ffff9028d608cac0 R15: ffff902e75a704c0
Nov 1 06:27:24 f9nd164 kernel: [156924.692964] FS: 0000000000000000(0000) GS:ffff904686d40000(0000) knlGS:0000000000000000
Nov 1 06:27:24 f9nd164 kernel: [156924.694919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 1 06:27:24 f9nd164 kernel: [156924.696662] CR2: 000000000000040a CR3: 0000000cbb2e4000 CR4: 00000000000406e0
Nov 1 06:27:24 f9nd164 kernel: [156924.698345] Call Trace:
Nov 1 06:27:24 f9nd164 kernel: [156924.700210] read_pages+0x5d/0x2c0
Nov 1 06:27:24 f9nd164 kernel: [156924.702001] page_cache_readahead_unbounded+0x18f/0x230
Nov 1 06:27:24 f9nd164 kernel: [156924.703635] filemap_fault+0x618/0x940
Nov 1 06:27:24 f9nd164 kernel: [156924.705356] __do_fault+0x36/0x100
Nov 1 06:27:24 f9nd164 kernel: [156924.707108] handle_mm_fault+0x1180/0x1980
Nov 1 06:27:24 f9nd164 kernel: [156924.708712] ? security_mmap_file+0x81/0xd0
Nov 1 06:27:24 f9nd164 kernel: [156924.710250] do_user_addr_fault+0x1b8/0x3f0
Nov 1 06:27:24 f9nd164 kernel: [156924.711899] exc_page_fault+0x82/0x1a0
Nov 1 06:27:24 f9nd164 kernel: [156924.713506] ? asm_exc_page_fault+0x8/0x30
Nov 1 06:27:24 f9nd164 kernel: [156924.714904] asm_exc_page_fault+0x1e/0x30
Nov 1 06:27:24 f9nd164 kernel: [156924.716292] RIP: 0033:0x15143453663a
Nov 1 06:27:24 f9nd164 kernel: [156924.717797] Code: 49 8b 54 24 10 48 8b a5 e8 fe ff ff 48 85 d2 0f 84 08 05 00 00 49 8b 3c 24 48 01 fa 48 85 d2 49 89 54 24 10 0f 84 e6 04 00 00 <48> 8b 02 49 8d 74 24 40 48 85 c0 74 7f 41 b8 ff ff ff 6f 41 bb ff
Nov 1 06:27:24 f9nd164 kernel: [156924.720646] RSP: 002b:00007ffcdd073ab0 EFLAGS: 00010202
Nov 1 06:27:24 f9nd164 kernel: [156924.721953] RAX: 00007ffcdd0739e8 RBX: 00007ffcdd0739e8 RCX: 0000151432034170
Nov 1 06:27:24 f9nd164 kernel: [156924.723399] RDX: 0000151432033de0 RSI: 0000000000000000 RDI: 0000151431e30000
Nov 1 06:27:24 f9nd164 kernel: [156924.724822] RBP: 00007ffcdd073c10 R08: 0000151432034160 R09: 0000000000003000
Nov 1 06:27:24 f9nd164 kernel: [156924.726159] R10: 0000000000000812 R11: 0000000000000206 R12: 000015143474eaf8
Nov 1 06:27:24 f9nd164 kernel: [156924.727380] R13: 00007ffcdd073cf0 R14: 0000000000000003 R15: 0000151432034170
Nov 1 06:27:24 f9nd164 kernel: [156924.728601] Modules linked in: overlay fuse 8021q garp mrp stp llc ceph libceph fscache nft_limit nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 snd_pcm amd64_edac_mod nf_tables edac_mce_amd kvm_amd ccp nfnetlink snd_timer kvm igb irqbypass snd joydev soundcore mgag200 ipmi_si i2c_algo_bit crct10dif_pclmul dca pcspkr sp5100_tco crc32_pclmul ghash_clmulni_intel drm_kms_helper ipmi_devintf fam15h_power ipmi_msghandler k10temp cec i2c_piix4 acpi_cpufreq drm ip_tables xfs crc32c_intel serio_raw
Nov 1 06:27:24 f9nd164 kernel: [156924.733787] CR2: 000000000000040a
Nov 1 06:27:24 f9nd164 kernel: [156924.735161] ---[ end trace 7c4a78fad0c139bd ]---
Nov 1 06:27:24 f9nd164 kernel: [156924.736523] RIP: 0010:fuse_readahead+0xfe/0x4a0 [fuse]
Nov 1 06:27:24 f9nd164 kernel: [156924.737995] Code: 18 48 8b 53 10 8b 43 18 48 8d 7c 24 10 48 8d 74 02 ff e8 15 37 16 f6 48 89 c7 48 85 c0 0f 84 70 03 00 00 48 89 6c 24 08 31 c0 <48> 8b 4f 08 48 8d 51 ff 83 e1 01 48 0f 44 d7 48 8b 32 83 e6 01 0f
Nov 1 06:27:24 f9nd164 kernel: [156924.740551] RSP: 0000:ffffb6ac4b2d7c38 EFLAGS: 00010246
Nov 1 06:27:24 f9nd164 kernel: [156924.741995] RAX: 0000000000000000 RBX: ffffb6ac4b2d7d48 RCX: 0000000000000002
Nov 1 06:27:24 f9nd164 kernel: [156924.743427] RDX: 0000000000000000 RSI: ffff9027d45b26c8 RDI: 0000000000000402
Nov 1 06:27:24 f9nd164 kernel: [156924.744783] RBP: ffff902e7b58de00 R08: 0000000000000402 R09: 0000000000000000
Nov 1 06:27:24 f9nd164 kernel: [156924.745974] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
Nov 1 06:27:24 f9nd164 kernel: [156924.747204] R13: ffff902e75a70340 R14: ffff9028d608cac0 R15: ffff902e75a704c0
Nov 1 06:27:24 f9nd164 kernel: [156924.748499] FS: 0000000000000000(0000) GS:ffff904686d40000(0000) knlGS:0000000000000000
Nov 1 06:27:24 f9nd164 kernel: [156924.749827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 1 06:27:24 f9nd164 kernel: [156924.751062] CR2: 000000000000040a CR3: 0000000cbb2e4000 CR4: 00000000000406e0
Nov 1 06:27:49 f9nd164 kernel: [156949.971807] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [ps:54693]
Thank you for your report.

"mm: fix readahead_page_batch for retry entries" has been added to the -mm tree. Its filename is
    mm-fix-readahead_page_batch-for-retry-entries.patch

You can see more details at this link:
https://lore.kernel.org/linux-fsdevel/20201103142852.8543-1-willy@infradead.org/
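As a cross-check of why a patch about "retry entries" fits these oopses, the register values in both reports can be compared against the XArray retry-entry encoding. The sketch below is only an illustration: it assumes that XA_RETRY_ENTRY is defined as xa_mk_internal(256) with xa_mk_internal(v) = (v << 2) | 2 (as in include/linux/xarray.h), and that the faulting instruction bytes "<48> 8b 4f 08" decode to a load from 8 bytes past RDI. The helper name explain_fault is made up for this example.

```python
# Values copied from both oops reports (5.8.14 and 5.9.2).
RDI = 0x402  # register used as the base of the faulting load
CR2 = 0x40A  # faulting address reported by the CPU


def xa_mk_internal(v: int) -> int:
    """Assumed mirror of the kernel helper: (v << 2) | 2."""
    return (v << 2) | 2


# Assumption: XA_RETRY_ENTRY is xa_mk_internal(256) in include/linux/xarray.h.
XA_RETRY_ENTRY = xa_mk_internal(256)

# Assumption: bytes "48 8b 4f 08" decode to mov 0x8(%rdi),%rcx,
# i.e. a read at offset 8 from the pointer held in RDI.
LOAD_OFFSET = 0x8


def explain_fault(rdi: int, cr2: int) -> dict:
    """Hypothetical helper: check whether the fault matches a retry entry
    being dereferenced as if it were a struct page pointer."""
    return {
        "is_retry_entry": rdi == XA_RETRY_ENTRY,
        "offset_matches": rdi + LOAD_OFFSET == cr2,
    }


print(explain_fault(RDI, CR2))
```

If those assumptions hold, both checks come out true: the value in RDI equals the retry-entry encoding, and adding the load offset gives exactly the CR2 address, which is consistent with readahead handing a retry entry to fuse_readahead instead of a page.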
Thanks for the fix. In the meantime I was testing 5.9.3, including this commit:

commit ddd1165e0c694b13ff4bed6a3c7a2abd4c96df5b
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Fri Sep 18 10:36:50 2020 +0200

    fuse: fix page dereference after free

    commit d78092e4937de9ce55edcb4ee4c5e3c707be0190 upstream.

    After unlock_request() pages from the ap->pages[] array may be put (e.g. by
    aborting the connection) and the pages can be freed.

    Prevent use after free by grabbing a reference to the page before calling
    unlock_request().

    The original patch was created by Pradeep P V K.

    Reported-by: Pradeep P V K <ppvk@codeaurora.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

With it, the problem no longer seems to appear for fuse.
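The pattern described in the quoted commit message (take a reference before the point where another party may drop its own) can be illustrated with a toy model. This is not the fuse code: Page, copy_page and the abort_during_unlock flag are hypothetical names, and the "abort" is just a stand-in for whatever puts the page after unlock_request().

```python
class Page:
    """Toy refcounted page; it is 'freed' when the count reaches zero."""

    def __init__(self) -> None:
        self.refcount = 1          # reference held by the request
        self.freed = False
        self.data = b"payload"

    def get(self) -> None:
        self.refcount += 1

    def put(self) -> None:
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True
            self.data = None

    def read(self) -> bytes:
        if self.freed:
            raise RuntimeError("use after free")
        return self.data


def copy_page(page: Page, take_ref: bool, abort_during_unlock: bool) -> bytes:
    """Hypothetical stand-in for copying a page after unlocking the request."""
    if take_ref:
        page.get()                 # the fix: pin the page first
    if abort_during_unlock:
        page.put()                 # an abort drops the request's reference
    try:
        return page.read()
    finally:
        if take_ref:
            page.put()             # release our own reference afterwards


print(copy_page(Page(), take_ref=True, abort_during_unlock=True))
```

Calling copy_page with take_ref=False and abort_during_unlock=True raises RuntimeError in this model, which corresponds to the use after free the commit message describes.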