Bug 178211 - can't boot with root fs on md raid 0; mdadm: no devices listed in conf file were found.
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: io_md
Reported: 2016-10-17 20:57 UTC by Joe S
Modified: 2024-03-15 06:46 UTC
CC List: 7 users

Kernel Version: 4.9-rc1
Regression: No

Description Joe S 2016-10-17 20:57:45 UTC
What goes wrong: With 4.9-rc1 and the root filesystem located on an md RAID 0 array, the system can't boot at all. Instead, after "Booting kernel", the following message appears repeatedly: mdadm: no devices listed in conf file were found.

Booting this system works fine with 4.8.1 and earlier 4.* kernels.

To make sure it was not a mdadm/uuid issue, I REMOVED the following from mdadm.conf:
ARRAY /dev/md/0  metadata=1.2 UUID=a313d97e:c668a2c1:4672b467:c1015e92 name=quiz:0

And REPLACED the line with:
ARRAY /dev/md/0 metadata=1.2 devices=/dev/sda1,/dev/sdb1 name=quiz:0

so that md is not depending on matching a uuid.  This had no effect.

I also tried adding the following line to /etc/default/grub:
GRUB_CMDLINE_LINUX=" rootdelay=90 "

I then ran update-grub; this had no effect other than adding 90 seconds before the boot failed with the same repeated mdadm error message.

With both of these changes, 4.8.1 still boots properly.

System: Asrock 970M Pro3 motherboard w/AMD FX-8300 CPU and 32GB RAM running Debian testing/9.

The mdraid is a stripe set built on two 120GB Corsair Force LS SSD drives, /dev/sda(1) and /dev/sdb(1) connected to my motherboard's onboard sata 3 ports.  (This stripe set is for the record very very fast when it is working.)

I compiled the 4.9-rc1 kernel based on my working 4.8.1 kernel config without configuration changes (make menuconfig -> exit -> yes, save config).

Will cheerfully provide additional information if someone knows what questions to ask.

Peace,

Joe Staton
Comment 1 Neil Brown 2016-10-18 20:24:10 UTC
My guess is that this problem is not a problem with md at all.
Rather it is a problem with the controller which accesses the SSDs.
Can you get the initrd to give you a shell? (without knowing what distro you use I cannot suggest how to do that - hopefully you can find out)

Once you have a shell, look in /proc/partitions to see if the devices are listed.
If they are, try to use mdadm to --examine the devices, and then assemble the array.
If not, it is clearly not a problem with MD.
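A minimal sketch of the check described above, for use from the initrd shell. The helper name missing_devs is hypothetical (it is not part of mdadm or busybox), and the device names sda1 and sdb1 are taken from the reporter's mdadm.conf line; adjust them for other systems.

```shell
#!/bin/sh
# missing_devs is a hypothetical helper, not an existing tool.
# It prints each named device that is absent from a
# /proc/partitions-style file.
# Usage: missing_devs FILE NAME...   (NAME without the /dev/ prefix)
missing_devs() {
    file=$1
    shift
    for dev in "$@"; do
        found=no
        # Columns are: major minor #blocks name
        while read -r major minor blocks name; do
            if [ "$name" = "$dev" ]; then
                found=yes
            fi
        done < "$file"
        if [ "$found" = no ]; then
            echo "$dev"
        fi
    done
}

# In the initrd shell (device names assumed from the report):
#   missing_devs /proc/partitions sda1 sdb1
# If nothing is printed, the partitions are visible to the kernel,
# and the next steps would be:
#   mdadm --examine /dev/sda1 /dev/sdb1
#   mdadm --assemble /dev/md/0 /dev/sda1 /dev/sdb1
```

If the helper prints any device name, the kernel has not registered that partition at all, which points at the disk controller or its driver rather than at MD.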
Comment 2 The Linux kernel's regression tracker (Thorsten Leemhuis) 2016-10-30 11:16:56 UTC
FYI: I added this report to the list of regressions for Linux 4.9. I'll
watch this thread for further updates on this issue to document progress
in my weekly reports. Please let me know via regressions@leemhuis.info
in case the discussion moves to a different place (bugzilla or another
mail thread for example). tia!

Current status (afaics): Stuck? Nothing happened for a few days
Ciao, Thorsten
Comment 3 Joe S 2016-10-30 13:02:39 UTC
Sorry - Current status is that after the error I get dropped right into the initrd busybox shell, but none of my 4 keyboards function in that shell on my ASRock motherboard (known problem with this hardware apparently).  Will update when I find a keyboard that works.

Peace, Joe Staton
Comment 4 Joe S 2016-11-03 15:21:00 UTC
With my shiny new-to-me PS/2 keyboard, I have talked to busybox a little about why I don't have a root filesystem assembled so that I can boot.  Essential findings:

None of my sata devices are listed in /proc/partitions.  

mdadm has nothing to examine.

Definitely not an indication of any mdadm problem, as Neil Brown predicted (and he would know).

Leaves me not working, but not knowing what's going on due to my lack of knowledge of how linux works.  sigh.

Peace,

Joe Staton
Comment 5 Joe S 2016-12-08 20:14:58 UTC
Summary of update: WORKING in 4.9-rc8.

Detail:  Although I don't know what this problem was other than not being able to access my sata devices despite the drivers being compiled in, the problem still existed in 4.9-rc6 but is FIXED in 4.9-rc8.  4.9-rc8 boots great with the same .config on this same hardware.  You guys rock.
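
For readers who want to confirm that the storage drivers really are built in rather than modular, one possible check against the kernel .config is sketched below. The helper name config_state is hypothetical, and the option names in the usage comment (CONFIG_ATA, CONFIG_SATA_AHCI, CONFIG_BLK_DEV_SD) are an assumption that the onboard SATA ports run in AHCI mode; the report does not say which driver this board uses.

```shell
#!/bin/sh
# config_state is a hypothetical helper, not an existing tool.
# It reports how each given option is set in a kernel .config-style
# file: the matching "NAME=value" line if present, else "NAME=unset".
config_state() {
    file=$1
    shift
    for opt in "$@"; do
        # "|| true" keeps a non-match from aborting under "set -e".
        line=$(grep -E "^${opt}=" "$file" || true)
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "${opt}=unset"
        fi
    done
}

# From the kernel source tree (option names are assumptions, see above):
#   config_state .config CONFIG_ATA CONFIG_SATA_AHCI CONFIG_BLK_DEV_SD
```

A value of y means the driver is compiled into the kernel image; m means it is a module and would also have to be present in the initrd for the root device to appear.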