Bug 218687 - blk_validate_limits warning : md device fails to start
Summary: blk_validate_limits warning : md device fails to start
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer
Hardware: AMD Linux
Importance: P3 high
Assignee: Jens Axboe
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-06 12:33 UTC by Janpieter Sollie
Modified: 2024-07-08 09:16 UTC
CC List: 2 users

See Also:
Kernel Version: 6.9-rc2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg of 6.9-rc2 (195.56 KB, text/plain)
2024-04-06 12:33 UTC, Janpieter Sollie
Details
mdadm --detail /dev/md1 (1.61 KB, text/plain)
2024-04-06 12:35 UTC, Janpieter Sollie
Details
kernel .config (130.42 KB, text/plain)
2024-04-06 12:38 UTC, Janpieter Sollie
Details
dmesg on 6.9-rc2 - hybrid RAID (19.95 KB, text/plain)
2024-04-07 13:29 UTC, Mateusz Jończyk
Details

Description Janpieter Sollie 2024-04-06 12:33:57 UTC
Created attachment 306092 [details]
dmesg of 6.9-rc2

This is definitely configuration-specific: I have 3 RAID 6 devices, and 1 of the 3 fails to start.

During boot, the 2 working RAID 6 devices initialize, assemble, and run.
The third one (md1), however, does not:

> WARNING: CPU: 2 PID: 2064 at block/blk-settings.c:192
> blk_validate_limits+0x165/0x180

All member devices (including the external journal) are detected, but the array does not start.
Restarting the mdadm service causes a segmentation fault; the 2 working arrays come up again, but the 3rd one does not:
> mdadm: /dev/md0 has been started with 4 drives and 1 spare and 1 journal.
> mdadm: /dev/md3 has been started with 6 drives and 2 spares and 1 journal.
> mdadm: failed to RUN_ARRAY /dev/md1: Invalid argument    

This did not happen with the 6.8 series.
Reverting to 6.8 makes the array work properly again.

I'll attach the .config and the mdadm --detail output of the failing array; let me know if you'd like more info.
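
For anyone else hitting this: the two queue limits involved can be read straight from sysfs before and after assembling the array. A minimal user-space sketch (the device names are placeholders, adjust them to the members of your array):

/* Quick diagnostic sketch (user space, not kernel code): dump the two queue
 * limits involved here for a few block devices.  The device names below are
 * placeholders - replace them with the members of the failing array. */
#include <stdio.h>

static void dump_limit(const char *dev, const char *attr)
{
	char path[256];
	char buf[64];
	FILE *f;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", dev, attr);
	f = fopen(path, "r");
	if (!f) {
		printf("%s: <not present>\n", path);
		return;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("%s: %s", path, buf);
	fclose(f);
}

int main(void)
{
	const char *devs[] = { "md1", "sda", "sdb", "nvme0n1" }; /* placeholders */
	unsigned int i;

	for (i = 0; i < sizeof(devs) / sizeof(devs[0]); i++) {
		dump_limit(devs[i], "virt_boundary_mask");
		dump_limit(devs[i], "max_segment_size");
	}
	return 0;
}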
Comment 1 Janpieter Sollie 2024-04-06 12:35:44 UTC
Created attachment 306093 [details]
mdadm --detail /dev/md1
Comment 2 Ming Lei 2024-04-06 12:37:29 UTC
Can you test the following patch?

diff --git a/block/blk-settings.c b/block/blk-settings.c
index cdbaef159c4b..864f86e98f01 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -187,12 +187,7 @@ static int blk_validate_limits(struct queue_limits *lim)
         * page (which might not be identical to the Linux PAGE_SIZE).  Because
         * of that they are not limited by our notion of "segment size".
         */
-       if (lim->virt_boundary_mask) {
-               if (WARN_ON_ONCE(lim->max_segment_size &&
-                                lim->max_segment_size != UINT_MAX))
-                       return -EINVAL;
-               lim->max_segment_size = UINT_MAX;
-       } else {
+       if (!lim->virt_boundary_mask) {
                /*
                 * The maximum segment size has an odd historic 64k default that
                 * drivers probably should override.  Just like the I/O size we
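
For readers following the thread: the hunk above removes the 6.9-rc2 check that triggers the warning. A stand-alone user-space model of that condition (a sketch for illustration, not the kernel function itself):

#include <limits.h>
#include <stdio.h>

/* Model of the pre-patch check in blk_validate_limits(): a queue that has a
 * virt_boundary_mask together with a finite max_segment_size other than
 * UINT_MAX is rejected with -EINVAL. */
static int validate_pre_patch(unsigned long virt_boundary_mask,
			      unsigned int max_segment_size)
{
	if (virt_boundary_mask &&
	    max_segment_size && max_segment_size != UINT_MAX)
		return -22; /* -EINVAL, i.e. "failed to RUN_ARRAY ...: Invalid argument" */
	return 0;
}

int main(void)
{
	/* Illustrative values: a 4k virt boundary plus the historic 64k
	 * segment-size default, as a stacked md device can end up with. */
	printf("%d\n", validate_pre_patch(4095, 65536));    /* rejected (-22) */
	printf("%d\n", validate_pre_patch(4095, UINT_MAX)); /* accepted (0)   */
	printf("%d\n", validate_pre_patch(0, 65536));       /* accepted (0)   */
	return 0;
}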
Comment 3 Janpieter Sollie 2024-04-06 12:38:17 UTC
Created attachment 306094 [details]
kernel .config
Comment 4 Janpieter Sollie 2024-04-06 12:56:21 UTC
(In reply to Ming Lei from comment #2)
> Can you test the following patch?
> [patch from comment #2 snipped]
that works ...
Comment 5 Mateusz Jończyk 2024-04-07 13:27:25 UTC
Hello,

It seems I also have the same problem and was about to post to LKML about it.

In my laptop, I have a RAID1 array on top of an NVMe drive and a SATA SSD.
This RAID array fails to assemble on Linux 6.9-rc2, with the following warning in dmesg:

        WARNING: CPU: 0 PID: 170 at block/blk-settings.c:191 blk_validate_limits+0x233/0x270

I have bisected it down to

commit 97894f7d3c29 ("md/raid1: use the atomic queue limit update APIs").

It appears that one of the component devices in the array uses a virt_boundary_mask,
while the other one sets max_segment_size.

On Linux 6.8 I have:

/sys/block/md0/queue/max_segment_size:65536
/sys/block/md1/queue/max_segment_size:65536
/sys/block/nvme0n1/queue/max_segment_size:4294967295
/sys/block/sda/queue/max_segment_size:65536
/sys/block/sdb/queue/max_segment_size:65536

/sys/block/md0/queue/virt_boundary_mask:4095
/sys/block/md1/queue/virt_boundary_mask:4095
/sys/block/nvme0n1/queue/virt_boundary_mask:4095
/sys/block/sda/queue/virt_boundary_mask:0
/sys/block/sdb/queue/virt_boundary_mask:0

So it appears that the md0/md1 devices get a combination of values:
virt_boundary_mask comes from /dev/nvme0n1 and max_segment_size from /dev/sdb.

This combination is rejected on Linux 6.9 with the new atomic queue limit update API.
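
To make the stacking concrete, here is a rough user-space model (a sketch using the sysfs values above, not the actual kernel stacking code): boundary bits accumulate across members while the segment size shrinks to the most restrictive member, so the md device ends up with both limits set.

#include <limits.h>
#include <stdio.h>

struct limits {
	unsigned long virt_boundary_mask;
	unsigned int max_segment_size;
};

/* Rough model of limit stacking for the two members shown above. */
static void stack(struct limits *t, const struct limits *b)
{
	t->virt_boundary_mask |= b->virt_boundary_mask;
	if (b->max_segment_size < t->max_segment_size)
		t->max_segment_size = b->max_segment_size;
}

int main(void)
{
	struct limits md      = { 0, UINT_MAX };    /* fresh stacked queue */
	struct limits nvme0n1 = { 4095, UINT_MAX }; /* values from sysfs   */
	struct limits sdb     = { 0, 65536 };       /* values from sysfs   */

	stack(&md, &nvme0n1);
	stack(&md, &sdb);

	/* Prints virt_boundary_mask=4095 max_segment_size=65536 - exactly the
	 * combination that Linux 6.8 allowed and 6.9-rc2 rejects. */
	printf("md: virt_boundary_mask=%lu max_segment_size=%u\n",
	       md.virt_boundary_mask, md.max_segment_size);
	return 0;
}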

Greetings,
Mateusz
Comment 6 Mateusz Jończyk 2024-04-07 13:29:27 UTC
Created attachment 306103 [details]
dmesg on 6.9-rc2 - hybrid RAID
Comment 7 Mateusz Jończyk 2024-04-07 13:49:36 UTC
I will test your patch.
Comment 8 Mateusz Jończyk 2024-04-07 16:18:26 UTC
Yeah, this patch fixes it.

Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Comment 9 Mateusz Jończyk 2024-07-07 19:27:07 UTC
For me, this bug was fixed in mainline before Linux 6.9 was released, most probably by commit

commit b561ea56a26415bf44ce8ca6a8e625c7c390f1ea
Author: Ming Lei <ming.lei@redhat.com>
Date:   Sun Apr 7 21:19:31 2024 +0800

    block: allow device to have both virt_boundary_mask and max segment size
    
    When one stacking device is over one device with virt_boundary_mask and
    another one with max segment size, the stacking device have both limits
    set. This way is allowed before d690cb8ae14b ("block: add an API to
    atomically update queue limits").
    
    Relax the limit so that we won't break such kind of stacking setting.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218687
    Reported-by: janpieter.sollie@edpnet.be
    Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
    Link: https://lore.kernel.org/linux-block/ZfGl8HzUpiOxCLm3@fedora/
    [...]
    Link: https://lore.kernel.org/r/20240407131931.4055231-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

@Janpieter, can you confirm that it is indeed fixed, so that this bug report can be closed?
Comment 10 Janpieter Sollie 2024-07-08 09:16:51 UTC
The bug report can indeed be closed ... mainline contains the necessary fixes to resolve the problem.
