Bug 218184
| Summary: | block/badblocks.c warning in 6.7-rc2 | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Janpieter Sollie (janpieter.sollie) |
| Component: | MD | Assignee: | Jens Axboe (axboe) |
| Status: | RESOLVED INSUFFICIENT_DATA | | |
| Severity: | normal | CC: | colyli, song |
| Priority: | P3 | | |
| Hardware: | AMD | | |
| OS: | Linux | | |
| Kernel Version: | 6.7-rc2 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | 3ea3354cb9f03e34ee3fab98f127ab8da4131eee |

Attachments:

- dmesg | scripts/decode_stacktrace.sh
- kernel .config

---

Created attachment 305465 [details]
kernel .config

To be clear: the bcachefs volume is located on an md device. I have 3 md devices, each with a bcachefs volume on it; 2 of them seem to be fine, the 3rd one seems to struggle. Added Coly Li, as he is the committer of the patches according to git diff.

---

Yes, let me look into this. The warning is from this line of code in _badblocks_check() (block/badblocks.c line 1284):

```c
	WARN_ON(bb->shift < 0 || sectors == 0);
```

It means the caller sent an invalid range to check. From the oops information, "RDX: 0000000000000000" means the parameter 'sectors' is 0. So the question is: why does the md raid code send a 0-length range for a badblocks check? Is this behavior on purpose, or improper?

Coly Li

---

IMHO, it doesn't make sense for a caller to check a zero-length LBA range. The warning works as expected to detect improper calls to badblocks_check().

---

If I understand you correctly, should we ask the md people why they are calling with a zero-length LBA range? The question of whether a check with 0 sectors should be allowed or not is purely code style ... and I'm not a stylist.

---

I added Song Liu (md maintainer) to CC. Maybe he can give us some hint why a 0-length LBA range is sent into badblocks_check().

---

Reset to be an md bug, adding the commit that caused the regression.

---

Looking into this. The md layer only calls badblocks_check() when it thinks there are bad blocks on the device (md.h: is_badblock()). This is likely the reason we only see the issue on 1 of the 3 arrays. However, I don't think the md layer is creating a zero-size bio. I wonder whether the zero-sized bio comes from an upper layer. Could we try to reproduce the issue with the following patch? Thanks!

```diff
diff --git i/drivers/md/md.c w/drivers/md/md.c
index 4ee4593c874a..967e51ec0169 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -372,6 +372,7 @@ static bool is_suspended(struct mddev *mddev, struct bio *bio)
 
 void md_handle_request(struct mddev *mddev, struct bio *bio)
 {
+	WARN_ON((bio_data_dir(bio) == READ) && bio_sectors(bio) == 0);
 check_suspended:
 	if (is_suspended(mddev, bio)) {
 		DEFINE_WAIT(__wait);
@@ -406,6 +407,7 @@ static void md_submit_bio(struct bio *bio)
 	const int rw = bio_data_dir(bio);
 	struct mddev *mddev = bio->bi_bdev->bd_disk->private_data;
 
+	WARN_ON((rw == READ) && bio_sectors(bio) == 0);
 	if (mddev == NULL || mddev->pers == NULL) {
 		bio_io_error(bio);
 		return;
diff --git i/drivers/md/raid5.c w/drivers/md/raid5.c
index dc031d42f53b..924631f65fdf 100644
--- i/drivers/md/raid5.c
+++ w/drivers/md/raid5.c
@@ -5534,6 +5534,7 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
 	atomic_inc(&rdev->nr_pending);
 	rcu_read_unlock();
 
+	WARN_ON(bio_sectors(raid_bio) == 0);
 	if (is_badblock(rdev, sector, bio_sectors(raid_bio), &first_bad,
 			&bad_sectors)) {
 		rdev_dec_pending(rdev, mddev);
@@ -5589,11 +5590,15 @@ static struct bio *chunk_aligned_read(struct mddev *mddev, struct bio *raid_bio)
 	unsigned chunk_sects = mddev->chunk_sectors;
 	unsigned sectors = chunk_sects - (sector & (chunk_sects-1));
 
+	WARN_ON(bio_sectors(raid_bio) == 0);
+
 	if (sectors < bio_sectors(raid_bio)) {
 		struct r5conf *conf = mddev->private;
 		split = bio_split(raid_bio, sectors, GFP_NOIO, &conf->bio_split);
 		bio_chain(split, raid_bio);
 		submit_bio_noacct(raid_bio);
+		WARN_ON(bio_sectors(raid_bio) == 0);
+		WARN_ON(bio_sectors(split) == 0);
 		raid_bio = split;
 	}
 
```
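
For context on the is_badblock() gate mentioned above: the wrapper in drivers/md/md.h only reaches badblocks_check() when the rdev already has bad blocks recorded, which matches the observation that only one of the three arrays trips the warning. The sketch below is a paraphrase from memory of that helper, not a verbatim copy of the 6.7-rc2 source; field names and the exact signature may differ slightly.

```c
/*
 * Paraphrased sketch of is_badblock() from drivers/md/md.h.  The call into
 * badblocks_check(), and therefore the WARN_ON in block/badblocks.c, is
 * only reachable when the rdev has a non-empty bad-block list.
 */
static inline int is_badblock(struct md_rdev *rdev, sector_t s, int sectors,
			      sector_t *first_bad, int *bad_sectors)
{
	if (unlikely(rdev->badblocks.count)) {
		int rv = badblocks_check(&rdev->badblocks,
					 rdev->data_offset + s,
					 sectors, first_bad, bad_sectors);

		if (rv)
			*first_bad -= rdev->data_offset;
		return rv;
	}
	return 0;
}
```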

---

So it's not a bug; the warning is actually a feature. Sadly, I can't reproduce it: in bug #218200, as a last resort, I tried to recreate the whole md array. That did not help, but it also makes this bug (#218184) impossible to debug further.

If you have a suggestion like "apply this patch and do (set of commands to be pasted here, probably some dd if=/dev/zero of=/dev/sd... to corrupt a sector)", feel free to reopen. Sorry for abandoning this too early.

---

Posting some observations on this for the record. Maybe this will ring a bell for somebody (Kent?). In the stack trace:

```
[ 74.070877] raid5_make_request (drivers/md/md.h:224)
[ 74.070879] ? wait_woken (kernel/sched/wait.c:439)
[ 74.070881] ? sched_clock_cpu (kernel/sched/clock.c:394)
[ 74.070882] ? raw_spin_rq_lock_nested (kernel/sched/core.c:574)
[ 74.070884] ? newidle_balance (kernel/sched/fair.c:12329)
[ 74.070886] md_handle_request (./include/linux/percpu-refcount.h:174)
[ 74.070887] ? wake_bit_function (kernel/sched/wait.c:383)
[ 74.070888] md_submit_bio (drivers/md/md.c:?)
[ 74.070889] __submit_bio (block/blk-core.c:604)
[ 74.070890] submit_bio_noacct_nocheck (./include/linux/bio.h:693)
[ 74.070892] submit_bio_noacct (block/blk-core.c:?)
[ 74.070893] ? journal_write_endio (fs/bcachefs/journal_io.c:1669)
[ 74.070894] submit_bio (block/blk-core.c:835)
[ 74.070895] bch2_journal_write (./include/linux/percpu-refcount.h:174)
```

From raid5_make_request, we know the call path is:

    raid5_make_request => chunk_aligned_read => raid5_read_one_chunk => is_badblock => badblocks_check

so the bio is a READ. However, it was triggered by bch2_journal_write(). I cannot figure out why bch2_journal_write() issues a read here. It appears to me that it should only do writes or flushes.

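
A side note that may help when reading the trace above (an assumption on my part, not something established in this bug): the block layer derives a bio's data direction from the low bit of its opcode, so any data-less operation whose opcode is not a write-class op (a flush-only bio, for example) would show up as a READ with bio_sectors() == 0. Roughly, paraphrasing include/linux/blk_types.h and include/linux/bio.h:

```c
/*
 * Paraphrased from include/linux/blk_types.h and include/linux/bio.h:
 * write-class opcodes carry the low bit set (REQ_OP_WRITE == 1), so any
 * other opcode (e.g. a bare REQ_OP_FLUSH) is reported as READ by
 * bio_data_dir(), even when the bio carries no data and bio_sectors()
 * is therefore 0.
 */
static inline bool op_is_write(blk_opf_t op)
{
	return !!(op & 1);
}

#define bio_data_dir(bio) (op_is_write(bio_op(bio)) ? WRITE : READ)
```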

---

Created attachment 305464 [details]
dmesg | scripts/decode_stacktrace.sh

When booting 6.7-rc2, compiled with clang, I get this warning on one of my 3 bcachefs volumes:

    WARNING: CPU: 3 PID: 712 at block/badblocks.c:1284 badblocks_check (block/badblocks.c:1284)

The reason isn't clear, but the stack trace points to an error in md error handling. This bug didn't happen in 6.6. There are 3 commits in 6.7-rc2 which may have caused it. In the attachments:

- decoded stacktrace of dmesg
- kernel .config