Bug 218687 - blk_validate_limits warning : md device fails to start
Summary: blk_validate_limits warning : md device fails to start
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer
Hardware: AMD Linux
Importance: P3 high
Assignee: Jens Axboe
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-06 12:33 UTC by Janpieter Sollie
Modified: 2024-04-07 16:18 UTC
CC List: 2 users

See Also:
Kernel Version: 6.9-rc2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg of 6.9-rc2 (195.56 KB, text/plain) - 2024-04-06 12:33 UTC, Janpieter Sollie
mdadm --detail /dev/md1 (1.61 KB, text/plain) - 2024-04-06 12:35 UTC, Janpieter Sollie
kernel .config (130.42 KB, text/plain) - 2024-04-06 12:38 UTC, Janpieter Sollie
dmesg on 6.9-rc2 - hybrid RAID (19.95 KB, text/plain) - 2024-04-07 13:29 UTC, Mateusz Jończyk

Description Janpieter Sollie 2024-04-06 12:33:57 UTC
Created attachment 306092 [details]
dmesg of 6.9-rc2

This is definitely configuration-specific: I have 3 RAID 6 devices, and 1 of the 3 fails to start.

During bootup, the 2 working RAID 6 arrays initialize, assemble, and run.
The third one (md1), however, does not:

> WARNING: CPU: 2 PID: 2064 at block/blk-settings.c:192
> blk_validate_limits+0x165/0x180

All member devices (including the external journal) are detected, but the array does not start.
Restarting the mdadm service causes a segmentation fault; the 2 working arrays come up again, but the 3rd one doesn't:
> mdadm: /dev/md0 has been started with 4 drives and 1 spare and 1 journal.
> mdadm: /dev/md3 has been started with 6 drives and 2 spares and 1 journal.
> mdadm: failed to RUN_ARRAY /dev/md1: Invalid argument    

This did not happen in the 6.8 series.
Reverting to 6.8 makes the array work properly again.

I'll attach the kernel .config and the mdadm --detail output for the failing array; let me know if you'd like more info.
Comment 1 Janpieter Sollie 2024-04-06 12:35:44 UTC
Created attachment 306093 [details]
mdadm --detail /dev/md1
Comment 2 Lei Ming 2024-04-06 12:37:29 UTC
Can you test the following patch?

diff --git a/block/blk-settings.c b/block/blk-settings.c
index cdbaef159c4b..864f86e98f01 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -187,12 +187,7 @@ static int blk_validate_limits(struct queue_limits *lim)
         * page (which might not be identical to the Linux PAGE_SIZE).  Because
         * of that they are not limited by our notion of "segment size".
         */
-       if (lim->virt_boundary_mask) {
-               if (WARN_ON_ONCE(lim->max_segment_size &&
-                                lim->max_segment_size != UINT_MAX))
-                       return -EINVAL;
-               lim->max_segment_size = UINT_MAX;
-       } else {
+       if (!lim->virt_boundary_mask) {
                /*
                 * The maximum segment size has an odd historic 64k default that
                 * drivers probably should override.  Just like the I/O size we
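
For readers following along: the WARN_ON_ONCE removed here is presumably the check that fires at block/blk-settings.c:192 in the report above. As a rough, stand-alone paraphrase (the helper name is made up and the 64k default is taken from the hunk context, so treat this as a sketch rather than the exact upstream source), the pre-patch logic amounts to:

#include <linux/blkdev.h>
#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/limits.h>

/* Sketch of the pre-patch segment-size handling in blk_validate_limits(). */
static int sketch_validate_segment_size(struct queue_limits *lim)
{
	if (lim->virt_boundary_mask) {
		/*
		 * Devices with a virtual boundary split I/O on that boundary,
		 * so a finite max_segment_size is treated as contradictory:
		 * warn once and reject the limits with -EINVAL.
		 */
		if (WARN_ON_ONCE(lim->max_segment_size &&
				 lim->max_segment_size != UINT_MAX))
			return -EINVAL;
		lim->max_segment_size = UINT_MAX;
	} else if (!lim->max_segment_size) {
		/* the "odd historic 64k default" mentioned in the hunk above */
		lim->max_segment_size = BLK_MAX_SEGMENT_SIZE;
	}
	return 0;
}

With the patch applied, the virt_boundary_mask case no longer warns or returns -EINVAL, so limits that carry both a virt_boundary_mask and a finite max_segment_size are accepted.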
Comment 3 Janpieter Sollie 2024-04-06 12:38:17 UTC
Created attachment 306094 [details]
kernel .config
Comment 4 Janpieter Sollie 2024-04-06 12:56:21 UTC
(In reply to Lei Ming from comment #2)
> Can you test the following patch?
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index cdbaef159c4b..864f86e98f01 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -187,12 +187,7 @@ static int blk_validate_limits(struct queue_limits *lim)
>          * page (which might not be identical to the Linux PAGE_SIZE).  Because
>          * of that they are not limited by our notion of "segment size".
>          */
> -       if (lim->virt_boundary_mask) {
> -               if (WARN_ON_ONCE(lim->max_segment_size &&
> -                                lim->max_segment_size != UINT_MAX))
> -                       return -EINVAL;
> -               lim->max_segment_size = UINT_MAX;
> -       } else {
> +       if (!lim->virt_boundary_mask) {
>                 /*
>                  * The maximum segment size has an odd historic 64k default that
>                  * drivers probably should override.  Just like the I/O size we

That works ...
Comment 5 Mateusz Jończyk 2024-04-07 13:27:25 UTC
Hello,

It seems I have the same problem; I was about to post to LKML about it.

In my laptop, I have a RAID1 array on top of an NVMe drive and a SATA SSD.
This RAID array fails to assemble on Linux 6.9-rc2, with the following warning in dmesg:

        WARNING: CPU: 0 PID: 170 at block/blk-settings.c:191 blk_validate_limits+0x233/0x270

I have bisected it down to

commit 97894f7d3c29 ("md/raid1: use the atomic queue limit update APIs").

It appears that one of the component devices in the array uses a virtual boundary (virt_boundary_mask),
while the other one sets max_segment_size.

On Linux 6.8 I have:

/sys/block/md0/queue/max_segment_size:65536
/sys/block/md1/queue/max_segment_size:65536
/sys/block/nvme0n1/queue/max_segment_size:4294967295
/sys/block/sda/queue/max_segment_size:65536
/sys/block/sdb/queue/max_segment_size:65536

/sys/block/md0/queue/virt_boundary_mask:4095
/sys/block/md1/queue/virt_boundary_mask:4095
/sys/block/nvme0n1/queue/virt_boundary_mask:4095
/sys/block/sda/queue/virt_boundary_mask:0
/sys/block/sdb/queue/virt_boundary_mask:0

So it appears that the md0/md1 devices get a combination of values:
the virt_boundary_mask comes from /dev/nvme0n1 and the max_segment_size from /dev/sdb.

This combination appears to be rejected on Linux 6.9 with the new atomic queue limit update API.
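
To make the failure mode concrete, here is a minimal user-space sketch. The stacking rule below ("most restrictive non-zero value wins", loosely modelled on the kernel's min_not_zero() helper) is a simplification of what the block layer does when md combines its members' limits, not the real blk_stack_limits(), and the struct is illustrative only:

#include <limits.h>
#include <stdio.h>

struct limits {
	unsigned long virt_boundary_mask;
	unsigned int max_segment_size;
};

/* Take the most restrictive non-zero value, as a stand-in for limit stacking. */
static unsigned long min_not_zero(unsigned long a, unsigned long b)
{
	if (!a)
		return b;
	if (!b)
		return a;
	return a < b ? a : b;
}

static void stack_limits(struct limits *t, const struct limits *b)
{
	t->virt_boundary_mask = min_not_zero(t->virt_boundary_mask,
					     b->virt_boundary_mask);
	t->max_segment_size = min_not_zero(t->max_segment_size,
					   b->max_segment_size);
}

int main(void)
{
	/* values taken from the sysfs output above */
	struct limits nvme = { .virt_boundary_mask = 4095, .max_segment_size = UINT_MAX };
	struct limits sata = { .virt_boundary_mask = 0, .max_segment_size = 65536 };
	struct limits md = { 0, 0 };

	stack_limits(&md, &nvme);
	stack_limits(&md, &sata);

	printf("md: virt_boundary_mask=%lu max_segment_size=%u\n",
	       md.virt_boundary_mask, md.max_segment_size);

	/* the 6.9-rc2 check: virt_boundary_mask set plus a finite segment size */
	if (md.virt_boundary_mask && md.max_segment_size &&
	    md.max_segment_size != UINT_MAX)
		printf("-> rejected by blk_validate_limits()\n");

	return 0;
}

Run, this prints virt_boundary_mask=4095 and max_segment_size=65536 for the stacked device, matching the sysfs values above, and that is exactly the combination the new validation refuses.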

Greetings,
Mateusz
Comment 6 Mateusz Jończyk 2024-04-07 13:29:27 UTC
Created attachment 306103 [details]
dmesg on 6.9-rc2 - hybrid RAID
Comment 7 Mateusz Jończyk 2024-04-07 13:49:36 UTC
I will test your patch.
Comment 8 Mateusz Jończyk 2024-04-07 16:18:26 UTC
Yeah, this patch fixes it.

Tested-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
