Bug 219484
| Summary: | f2fs discard causes kernel NULL pointer dereferencing | | |
|---|---|---|---|
| Product: | File System | Reporter: | piergiorgio.sartor |
| Component: | f2fs | Assignee: | Default virtual assignee for f2fs (filesystem_f2fs) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | blocking | CC: | chao |
| Priority: | P3 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
piergiorgio.sartor
2024-11-09 12:01:14 UTC
Hi, thanks for your report.

Can you please help to check the max_hw_discard_sectors parameter of the dm device via "cat /sys/block/<device_name>/queue/max_hw_discard_sectors"?

I suspect max_discard_blocks becomes zero in __submit_discard_cmd(), with the result that __blkdev_issue_discard() fails to allocate a bio.

    __submit_discard_cmd()
    {
        unsigned int max_discard_blocks =
                SECTOR_TO_BLOCK(bdev_max_discard_sectors(bdev));
        ...
        while () {
            ...
            if (len > max_discard_blocks) {
                len = max_discard_blocks;
                last = false;
            }
            ...
            } else {
                err = __blkdev_issue_discard(bdev,
                        SECTOR_FROM_BLOCK(start),
                        SECTOR_FROM_BLOCK(len),
                        GFP_NOFS, &bio);
            }
            ...
            f2fs_bug_on(sbi, !bio); // trigger warning here and panic below
    }

Thanks for the prompt reply.

Actually, there is no "max_hw_discard_sectors", but only a "max_discard_segments", which is "1" (for all DM devices). It is also "1" for the underlying SSD (/dev/sda).

The "discard_max_bytes", as well as the "discard_max_hw_bytes", is "2147450880" everywhere.

Hope this helps,
bye,
pg

One more thing, possibly important.

When I create the snapshot, with the working kernel, while "max_discard_segments" is still "1", the other two, "discard_max_bytes" and "discard_max_hw_bytes" are both "0", instead of "2147450880".

Hope this helps,
bye,
pg

Is there any chance you could apply this patch and check whether it fixes the bug?

    From: Chao Yu <chao@kernel.org>
    ---
     fs/f2fs/segment.c | 5 -----
     1 file changed, 5 deletions(-)

    diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
    index 10ec69cbae68..86a22447b89b 100644
    --- a/fs/f2fs/segment.c
    +++ b/fs/f2fs/segment.c
    @@ -1314,11 +1314,6 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
             unsigned long flags;
             bool last = true;
     
    -        if (len > max_discard_blocks) {
    -                len = max_discard_blocks;
    -                last = false;
    -        }
    -
             (*issued)++;
             if (*issued == dpolicy->max_requests)
                     last = true;
    --
    2.40.1

Thanks for the support.

It is difficult for me to check the patch; I'll have to see what I can do with this PC (I am not so free to use it).
Which kernel would this be for: 6.11.5/6/7? Is there any other way to test, for example using the sysfs interface? What about the difference between 6.9.12 (working) and the kernels that are not working?

I cannot promise, but I'll have a look at patching.

Thanks again,
bye,
pg

Sorry for the long delay; I was out of the office.

Now I can reproduce this bug with the test case below:

- pvcreate /dev/vdb
- vgcreate myvg1 /dev/vdb
- lvcreate -L 1024m -n mylv1 myvg1
- mount /dev/myvg1/mylv1 /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
- sync
- rm /mnt/f2fs/file
- sync
- lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
- umount /mnt/f2fs

    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/segment.c:1363!
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 4 UID: 0 PID: 730 Comm: umount Not tainted 6.12.0-rc3+ #1107
    RIP: 0010:__submit_discard_cmd+0xa53/0x1410
    <TASK>
    __issue_discard_cmd+0x3e5/0x1190
    f2fs_issue_discard_timeout+0x244/0x360
    f2fs_put_super+0x1fc/0xed0
    generic_shutdown_super+0x14c/0x4a0
    kill_block_super+0x40/0x90
    kill_f2fs_super+0x264/0x430

Let me figure out a patch for that soon. :)

(In reply to piergiorgio.sartor from comment #3)
> One more thing, possibly important.
>
> When I create the snapshot, with the working kernel, while
> "max_discard_segments" is still "1", the other two, "discard_max_bytes" and
> "discard_max_hw_bytes" are both "0", instead of "2147450880".

Thanks for the hint, I think that may be the key to the root cause.

Thanks,

>
> Hope this helps,
>
> bye,
>
> pg

Thanks for taking the time to reproduce the issue.

I tried to compile the kernel with your patch, but it seems that these days it is no longer as easy as it used to be. No success...

Good that you managed to see the issue!

Thanks,
pg

Hi all,

I tested kernel-6.12.4-100.fc40.x86_64.rpm (Fedora 40, Koji build). This is supposed to include the patch and, for what I tested, it seems to work fine. No NULL pointer de-referencing, no crash, everything good as before.
I think you can close the bug, in case something else will pop up in the future, I can re-open.

Thanks for the support!

Merry Christmas & Happy New Year!

bye,
pg

(In reply to piergiorgio.sartor from comment #9)
> Hi all,
>
> I tested kernel-6.12.4-100.fc40.x86_64.rpm (Fedora 40, Koji build).
> This is supposed to include the patch and, for what I tested, it seems to
> work fine. No NULL pointer de-referencing, no crash, everything good as
> before.

Thank you very much for the test and feedback!

>
> I think you can close the bug, in case something else will pop up in the
> future, I can re-open.

Fine, let us know if you have any other problem.

>
> Thanks for the support!
>
> Merry Christmas & Happy New Year!

Merry Christmas & Happy New Year too!

Thanks,

>
> bye,
>
> pg
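
For anyone who wants to compare the queue limits mentioned in this thread across several devices (for example the SSD and each dm device, before and after creating a snapshot), a minimal shell sketch follows. It reads only the three attributes the reporter found under the queue directory: max_discard_segments, discard_max_bytes and discard_max_hw_bytes. The function name print_discard_limits and the SYSFS_ROOT override are assumptions made for illustration; they are not part of any existing tool.

```shell
#!/bin/sh
# Print the discard-related queue limits of one block device.
# SYSFS_ROOT is a hypothetical override that lets the function run against
# a fake directory tree; it defaults to the real /sys.
print_discard_limits() {
    dev="$1"
    queue="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    for attr in max_discard_segments discard_max_bytes discard_max_hw_bytes; do
        if [ -r "$queue/$attr" ]; then
            printf '%s %s=%s\n' "$dev" "$attr" "$(cat "$queue/$attr")"
        else
            # Report absent attributes instead of failing.
            printf '%s %s=<missing>\n' "$dev" "$attr"
        fi
    done
}
```

Calling print_discard_limits sda and then print_discard_limits dm-0 (device names here are examples) prints one line per attribute, which makes a value of "0" on a snapshot origin easy to spot.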