When creating a large file (in my case, running mkfs.ext4 from a Yocto embedded Linux build task, which means an 8+ GB image file), mkfs.ext4 segfaults and the kernel logs a general protection fault. The system is more or less unstable afterwards. I can reproduce this 100% of the time; the same procedure works fine with the 6.0.6 kernel. Here is the kernel dump:

Jan 10 14:12:38 michelle kernel: BTRFS warning (device nvme0n1p5): bad eb member end: ptr 0x3fea start 2704543268864 member offset 16383 size 8
Jan 10 14:12:38 michelle kernel: general protection fault, probably for non-canonical address 0x85d8740000000: 0000 [#1] PREEMPT SMP
Jan 10 14:12:38 michelle kernel: CPU: 21 PID: 2143606 Comm: mkfs.ext4.real Tainted: P O T 6.1.4-gentoo #2
Jan 10 14:12:38 michelle kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C37/X570-A PRO (MS-7C37), BIOS H.70 01/09/2020
Jan 10 14:12:38 michelle kernel: RIP: 0010:btrfs_get_64+0xe7/0x100
Jan 10 14:12:38 michelle kernel: Code: 40 08 48 2b 15 b2 3a 15 01 48 8d 0c 04 48 c1 fa 06 48 c1 e2 0c 48 03 15 af 3a 15 01 81 eb f8 0f 00 00 74 12 31 c0 89 c6 ff c0 <0f> b6 3c 32 40 88 3c 3>
Jan 10 14:12:38 michelle kernel: RSP: 0018:ffffb2d4ca4c3dd0 EFLAGS: 00010202
Jan 10 14:12:38 michelle kernel: RAX: 0000000000000001 RBX: 0000000000000007 RCX: ffffb2d4ca4c3dd9
Jan 10 14:12:38 michelle kernel: RDX: 00085d8740000000 RSI: 0000000000000000 RDI: 000000000000000a
Jan 10 14:12:38 michelle kernel: RBP: ffff96fbbfd9c600 R08: 0000000000000001 R09: 00000000ffffdfff
Jan 10 14:12:38 michelle kernel: R10: ffffffff94a3a700 R11: ffffffff94aea700 R12: 0000000000000003
Jan 10 14:12:38 michelle kernel: R13: ffff96fbbfd9c600 R14: 0000000000000003 R15: 0000000000003fea
Jan 10 14:12:38 michelle kernel: FS: 00007f6ac90e5780(0000) GS:ffff97071ed40000(0000) knlGS:0000000000000000
Jan 10 14:12:38 michelle kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 10 14:12:38 michelle kernel: CR2: 0000557c6f245760 CR3: 000000019a64a000 CR4: 0000000000350ee0
Jan 10 14:12:38 michelle kernel: Call Trace:
Jan 10 14:12:38 michelle kernel: <TASK>
Jan 10 14:12:38 michelle kernel: btrfs_file_llseek+0x269/0x670
Jan 10 14:12:38 michelle kernel: ksys_lseek+0x61/0xa0
Jan 10 14:12:38 michelle kernel: do_syscall_64+0x56/0x80
Jan 10 14:12:38 michelle kernel: entry_SYSCALL_64_after_hwframe+0x46/0xb0
Jan 10 14:12:38 michelle kernel: RIP: 0033:0x7f6ac91e4d3b
Jan 10 14:12:38 michelle kernel: Code: ff ff c3 0f 1f 40 00 48 8b 15 e1 90 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 f3 0f 1e fa b8 08 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0>
Jan 10 14:12:38 michelle kernel: RSP: 002b:00007fff68e04038 EFLAGS: 00000297 ORIG_RAX: 0000000000000008
Jan 10 14:12:38 michelle kernel: RAX: ffffffffffffffda RBX: 000056367d4ffea0 RCX: 00007f6ac91e4d3b
Jan 10 14:12:38 michelle kernel: RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004
Jan 10 14:12:38 michelle kernel: RBP: 0000000000000004 R08: 00007f6ac92bef90 R09: 000056367d4f6160
Jan 10 14:12:38 michelle kernel: R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000000
Jan 10 14:12:38 michelle kernel: R13: 000056367d4c0eb0 R14: 000056367d50d620 R15: 000056367d4ebab0
Jan 10 14:12:38 michelle kernel: </TASK>
Jan 10 14:12:38 michelle kernel: Modules linked in: xt_CHECKSUM xt_MASQUERADE nvidia_drm(PO) nvidia_modeset(PO) ip6table_nat iptable_nat bpfilter nvidia(PO) uvcvideo videobuf2_vmalloc video>
Jan 10 14:12:38 michelle kernel: ---[ end trace 0000000000000000 ]---
Jan 10 14:12:38 michelle kernel: RIP: 0010:btrfs_get_64+0xe7/0x100
Jan 10 14:12:38 michelle kernel: Code: 40 08 48 2b 15 b2 3a 15 01 48 8d 0c 04 48 c1 fa 06 48 c1 e2 0c 48 03 15 af 3a 15 01 81 eb f8 0f 00 00 74 12 31 c0 89 c6 ff c0 <0f> b6 3c 32 40 88 3c 3>
Jan 10 14:12:38 michelle kernel: RSP: 0018:ffffb2d4ca4c3dd0 EFLAGS: 00010202
Jan 10 14:12:38 michelle kernel: RAX: 0000000000000001 RBX: 0000000000000007 RCX: ffffb2d4ca4c3dd9
Jan 10 14:12:38 michelle kernel: RDX: 00085d8740000000 RSI: 0000000000000000 RDI: 000000000000000a
Jan 10 14:12:38 michelle kernel: RBP: ffff96fbbfd9c600 R08: 0000000000000001 R09: 00000000ffffdfff
Jan 10 14:12:38 michelle kernel: R10: ffffffff94a3a700 R11: ffffffff94aea700 R12: 0000000000000003
Jan 10 14:12:38 michelle kernel: R13: ffff96fbbfd9c600 R14: 0000000000000003 R15: 0000000000003fea
Jan 10 14:12:38 michelle kernel: FS: 00007f6ac90e5780(0000) GS:ffff97071ed40000(0000) knlGS:0000000000000000
Jan 10 14:12:38 michelle kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 10 14:12:38 michelle kernel: CR2: 0000557c6f245760 CR3: 000000019a64a000 CR4: 0000000000350ee0
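A note for anyone reading the "bad eb member end" warning in the dump above: its numbers can be sanity-checked with a few lines of Python. This is only a sketch, not anything taken from the kernel sources. It assumes the default btrfs nodesize of 16384 bytes (the report does not state the real value), and the names `decode_eb_warning` and `NODESIZE` are made up for this illustration.

```python
import re

# Assumption: default btrfs nodesize of 16 KiB. The report does not state it.
NODESIZE = 16384

# Matches the warning text as it appears in the log above.
WARNING_RE = re.compile(
    r"bad eb member end: ptr (0x[0-9a-f]+) start (\d+) "
    r"member offset (\d+) size (\d+)"
)


def decode_eb_warning(line, nodesize=NODESIZE):
    """Parse one warning line and report how far the read overruns the buffer.

    Returns None if the line does not contain the warning.
    """
    m = WARNING_RE.search(line)
    if m is None:
        return None
    ptr = int(m.group(1), 16)        # offset of the item inside the extent buffer
    start = int(m.group(2))          # logical start address of the extent buffer
    member_offset = int(m.group(3))  # offset of the accessed field in the buffer
    size = int(m.group(4))           # width of the access in bytes
    return {
        "ptr": ptr,
        "start": start,
        "field_offset": member_offset - ptr,
        "overrun": max(0, member_offset + size - nodesize),
    }
```

Under that nodesize assumption, the warning in this report decodes to ptr 16362, a field offset of 21, and an overrun of 7 bytes: an 8-byte read starting at byte 16383 would end past the end of a 16384-byte extent buffer. A field offset of 21 would be consistent with the disk_bytenr member of a file extent item, but that is an inference from the numbers, not something the log states.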
Just to ensure nothing went sideways during testing: did you really see this with 6.1.4? A patch for a problem that looks quite similar to yours was merged for that version: https://lore.kernel.org/all/20230104160512.620453792@linuxfoundation.org/
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #1)
> Just to ensure nothing went sideways during testing

[And yes, I see that "6.1.4-gentoo" in the backtrace, but that "gentoo" made me wonder if it's patched or something]
Well, I did not check each and every line of the kernel, but it should be 6.1.4 with very minor patches to kconfig. I do not think the patch addresses the same problem just from the stack traces. But I might be mistaken. I can try with a vanilla kernel directly from git. Can you tell me which version you want me to test? I will try to do a minimal test case as well.
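On the minimal test case: in the trace from the original report, the user-space registers show ORIG_RAX 8 (lseek on x86-64) and RDX 3, which is SEEK_DATA on Linux, so the faulting call appears to be lseek(fd, 0, SEEK_DATA). A starting point could look like the sketch below. It is an untested guess at a reproducer, not a confirmed one: whether it triggers the fault probably depends on the extent layout of the file, and `make_sparse_file` and `first_data_offset` are names invented here.

```python
import errno
import os


def make_sparse_file(path, size, payload=b"x"):
    """Create a file of `size` bytes that is a hole except for its last bytes."""
    with open(path, "wb") as f:
        f.truncate(size)               # extend the file without writing data
        f.seek(size - len(payload))
        f.write(payload)               # put a small piece of data at the end


def first_data_offset(path, start=0):
    """Return the offset of the first data at or after `start`.

    Issues the same lseek(fd, start, SEEK_DATA) call seen in the trace.
    Returns None when the kernel reports ENXIO (no data past `start`).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, start, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return None
        raise
    finally:
        os.close(fd)
```

To try it, call make_sparse_file() with a path on the affected btrfs volume, then first_data_offset() on the same path. On a healthy kernel it should simply return an offset inside the file; filesystems without hole support report offset 0.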
Hello, I have a similar problem when compiling AOSP on Arch with the 6.1.6-zen kernel:

Jan 17 12:35:47 arch-pc kernel: perf: interrupt took too long (2514 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
Jan 17 12:49:42 arch-pc kernel: BTRFS warning (device nvme0n1p1): bad eb member end: ptr 0x3fe9 start 791064428544 member offset 16382 size 8
Jan 17 12:49:42 arch-pc kernel: general protection fault, probably for non-canonical address 0x3292e80000000: 0000 [#1] PREEMPT SMP PTI
Jan 17 12:49:42 arch-pc kernel: CPU: 2 PID: 393424 Comm: e2fsdroid Tainted: G W OE 6.1.6-zen1-1-zen #1 a9f1d40d38a4e5cc84569c1bc8cda8fa4a251102
Jan 17 12:49:42 arch-pc kernel: Hardware name: Dell Inc. OptiPlex 7050/062KRH, BIOS 1.22.1 09/15/2022
Jan 17 12:49:42 arch-pc kernel: RIP: 0010:btrfs_get_64+0xdc/0x120 [btrfs]
Jan 17 12:49:42 arch-pc kernel: Code: 4a 8b 44 e5 78 48 2b 05 f2 4d fe e4 48 c1 f8 06 48 c1 e0 0c 48 03 05 f3 4d fe e4 81 eb f8 0f 00 00 74 13 31 d2 89 d6 83 c2 01 <0f> b6 3c 30 40 88 3c 31 39 da 72 ef 48 8b 44 24 08 48 8b 54 24 10
Jan 17 12:49:42 arch-pc kernel: RSP: 0018:ffffa37297dc7d60 EFLAGS: 00010202
Jan 17 12:49:42 arch-pc kernel: RAX: 0003292e80000000 RBX: 0000000000000006 RCX: ffffa37297dc7d6a
Jan 17 12:49:42 arch-pc kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
Jan 17 12:49:42 arch-pc kernel: RBP: ffff8c0f888cf800 R08: 0000000000000002 R09: 00000000ffffffea
Jan 17 12:49:42 arch-pc kernel: R10: ffffffffa785b840 R11: 00000000fffff000 R12: 0000000000000003
Jan 17 12:49:42 arch-pc kernel: R13: 0000000000003fe9 R14: 0000000000001000 R15: 0000000000000000
Jan 17 12:49:42 arch-pc kernel: FS: 00007f06fbc6c740(0000) GS:ffff8c1a8dc80000(0000) knlGS:0000000000000000
Jan 17 12:49:42 arch-pc kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 17 12:49:42 arch-pc kernel: CR2: 00007f06fbf7ee98 CR3: 000000043a35a002 CR4: 00000000003706e0
Jan 17 12:49:42 arch-pc kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 17 12:49:42 arch-pc kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 17 12:49:42 arch-pc kernel: Call Trace:
Jan 17 12:49:42 arch-pc kernel: <TASK>
Jan 17 12:49:42 arch-pc kernel: btrfs_file_llseek+0x36c/0x830 [btrfs 5f77724550ea3d487f82dd40a49fdd783c0cb897]
Jan 17 12:49:42 arch-pc kernel: ? __x64_sys_newfstat+0x16f/0x1c0
Jan 17 12:49:42 arch-pc kernel: __x64_sys_lseek+0x76/0xc0
Jan 17 12:49:42 arch-pc kernel: do_syscall_64+0x5c/0x90
Jan 17 12:49:42 arch-pc kernel: ? syscall_exit_to_user_mode+0x2c/0x1d0
Jan 17 12:49:42 arch-pc kernel: ? do_syscall_64+0x6b/0x90
Jan 17 12:49:42 arch-pc kernel: ? do_syscall_64+0x6b/0x90
Jan 17 12:49:42 arch-pc kernel: ? exc_page_fault+0x74/0x170
Jan 17 12:49:42 arch-pc kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Jan 17 12:49:42 arch-pc kernel: RIP: 0033:0x7f06fbd6614b
Jan 17 12:49:42 arch-pc kernel: Code: ff ff c3 0f 1f 40 00 48 8b 15 39 0c 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 f3 0f 1e fa b8 08 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 09 0c 0e 00 f7 d8
Jan 17 12:49:42 arch-pc kernel: RSP: 002b:00007ffcf9a756e8 EFLAGS: 00000293 ORIG_RAX: 0000000000000008
Jan 17 12:49:42 arch-pc kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f06fbd6614b
Jan 17 12:49:42 arch-pc kernel: RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000004
Jan 17 12:49:42 arch-pc kernel: RBP: 00007ffcf9a75800 R08: 0000000000001000 R09: 00007f06fbe48220
Jan 17 12:49:42 arch-pc kernel: R10: 00005584072bb390 R11: 0000000000000293 R12: 00005584072c1e50
Jan 17 12:49:42 arch-pc kernel: R13: 0000000000000004 R14: 000000007f2bb746 R15: 00005584072ae510
Jan 17 12:49:42 arch-pc kernel: </TASK>
Jan 17 12:49:42 arch-pc kernel: Modules linked in: tcp_diag inet_diag xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc overlay snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfkill vmnet(OE) intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_ctl_led irqbypass crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic polyval_clmulni polyval_generic snd_hda_codec_hdmi gf128mul ghash_clmulni_intel sha512_ssse3 snd_hda_intel snd_intel_dspcfg aesni_intel snd_intel_sdw_acpi crypto_simd cryptd snd_hda_codec iTCO_wdt mei_wdt snd_hda_core mei_hdcp intel_pmc_bxt mei_pxp dell_wmi vfat snd_hwdep rapl iTCO_vendor_support ee1004 ledtrig_audio dell_smbios dell_wmi_aio e1000e intel_cstate fat snd_pcm mei_me intel_wmi_thunderbolt dcdbas dell_wmi_descriptor wmi_bmof mei pcspkr sparse_keymap snd_timer intel_lpss_pci
Jan 17 12:49:42 arch-pc kernel: intel_uncore i2c_i801 snd i2c_smbus intel_lpss idma64 soundcore mousedev joydev acpi_pad mac_hid vmmon(OE) vmw_vmci v4l2loopback(OE) videodev mc dm_multipath dm_mod i2c_dev sg crypto_user fuse ip_tables x_tables usbhid btrfs i915 blake2b_generic libcrc32c crc32c_generic nvme xhci_pci sr_mod xor nvme_core nvme_common crc32c_intel intel_gtt xhci_pci_renesas cdrom raid6_pq amdgpu gpu_sched drm_buddy video wmi drm_ttm_helper ttm drm_display_helper cec
Jan 17 12:49:42 arch-pc kernel: ---[ end trace 0000000000000000 ]---
Jan 17 12:49:42 arch-pc kernel: RIP: 0010:btrfs_get_64+0xdc/0x120 [btrfs]
Jan 17 12:49:42 arch-pc kernel: Code: 4a 8b 44 e5 78 48 2b 05 f2 4d fe e4 48 c1 f8 06 48 c1 e0 0c 48 03 05 f3 4d fe e4 81 eb f8 0f 00 00 74 13 31 d2 89 d6 83 c2 01 <0f> b6 3c 30 40 88 3c 31 39 da 72 ef 48 8b 44 24 08 48 8b 54 24 10
Jan 17 12:49:42 arch-pc kernel: RSP: 0018:ffffa37297dc7d60 EFLAGS: 00010202
Jan 17 12:49:42 arch-pc kernel: RAX: 0003292e80000000 RBX: 0000000000000006 RCX: ffffa37297dc7d6a
Jan 17 12:49:42 arch-pc kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
Jan 17 12:49:42 arch-pc kernel: RBP: ffff8c0f888cf800 R08: 0000000000000002 R09: 00000000ffffffea
Jan 17 12:49:42 arch-pc kernel: R10: ffffffffa785b840 R11: 00000000fffff000 R12: 0000000000000003
Jan 17 12:49:42 arch-pc kernel: R13: 0000000000003fe9 R14: 0000000000001000 R15: 0000000000000000
Jan 17 12:49:42 arch-pc kernel: FS: 00007f06fbc6c740(0000) GS:ffff8c1a8dc80000(0000) knlGS:0000000000000000
Jan 17 12:49:42 arch-pc kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 17 12:49:42 arch-pc kernel: CR2: 00007f06fbf7ee98 CR3: 000000043a35a002 CR4: 00000000003706e0
Jan 17 12:49:42 arch-pc kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 17 12:49:42 arch-pc kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Judging by "Comm: e2fsdroid", the build was creating the system image when this happened. It definitely worked in the past, but I can't tell which exact kernel update introduced the issue.
Not my area of expertise, hence I can't tell you if it's the same or a different problem. But FWIW, a fix for the initial problem was posted here: https://lore.kernel.org/all/CAL3q7H5XUr2=kLEV192yU6cZakX_diS5+WRLq7LHkGPUOAZZZw@mail.gmail.com/ You might want to try that and, if it doesn't help, submit a separate report.
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #5)
> Not my area of expertise, hence I can't tell you if it's the same or a
> different problem. But FWIW, a fix for the initial problem was posted here:
>
> https://lore.kernel.org/all/CAL3q7H5XUr2=kLEV192yU6cZakX_diS5+WRLq7LHkGPUOAZZZw@mail.gmail.com/
>
> You might want to try that and, if it doesn't help, submit a separate
> report.

Thanks for the info. That patch seems to fix my issue.
Thanks for the report and the testing. The fix is in Linus' tree and will appear in stable soon.