Bug 14246 - 2.6.29 and later break evms (driver initialization ordering changes??)
Summary: 2.6.29 and later break evms (driver initialization ordering changes??)
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Neil Brown
URL: http://bugs.gentoo.org/show_bug.cgi?i...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-29 08:33 UTC by Jeremy Huddleston
Modified: 2010-01-28 02:33 UTC (History)
5 users (show)

See Also:
Kernel Version: 2.6.29 through 2.6.32
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
/proc/mdstat (before evms_activate) (210 bytes, text/plain)
2009-11-11 06:40 UTC, Beau V.C. Bellamy
Details
strace evms_activate (50.38 KB, text/plain)
2009-11-11 06:41 UTC, Beau V.C. Bellamy
Details
/proc/mdstat (after evms_activate) (210 bytes, text/plain)
2009-11-11 06:42 UTC, Beau V.C. Bellamy
Details
/proc/mdstat (before evms_activate) **corrected (110 bytes, text/plain)
2009-11-17 10:02 UTC, Beau V.C. Bellamy
Details
patch against 2.6.31.6 that fixes this (for me) (8.38 KB, patch)
2009-12-14 16:49 UTC, Alex Moore
Details | Diff
much reduced patch for 2.6.31.6 (442 bytes, patch)
2009-12-15 17:31 UTC, Alex Moore
Details | Diff

Description Jeremy Huddleston 2009-09-29 08:33:50 UTC
See the following for downstream / list reports:
http://bugs.gentoo.org/show_bug.cgi?id=273902
http://www.nabble.com/%22array-md0-already-has-disks%22-with-kernel-2.6.29-td23131323.html

Something broke between 2.6.28 and 2.6.29 which is causing systems using evms to fail to boot.  A bunch of messages like this get spewed:
"md: array md* already has disks!"
Comment 1 Neil Brown 2009-09-29 09:07:45 UTC
Is it possible get get the output of
  cat /proc/mdstat
before an after running evms_activate, and also to
run evms_active under 'strace' and get the trace from that.

I have no idea how evms works so I would need as much tracing as possible
to be able to figure out what is going wrong.
Comment 2 Beau V.C. Bellamy 2009-11-11 06:40:37 UTC
Created attachment 23743 [details]
/proc/mdstat (before evms_activate)
Comment 3 Beau V.C. Bellamy 2009-11-11 06:41:48 UTC
Created attachment 23744 [details]
strace evms_activate
Comment 4 Beau V.C. Bellamy 2009-11-11 06:42:12 UTC
Created attachment 23745 [details]
/proc/mdstat (after evms_activate)
Comment 5 Beau V.C. Bellamy 2009-11-11 06:45:02 UTC
I have the exact same problem!   So here are the files you requested!

I'm using gentoo-sources-2.6.30-r5 for the kernel.

- Beau
Comment 6 Neil Brown 2009-11-12 06:39:57 UTC
Thanks.

Can you run
  mdadm -Ss

which should remove the entry from /proc/mdstat, then
run evms_activate
and see what happens.  If it doesn't work, an 'strace' of that
would help too.
Comment 7 Beau V.C. Bellamy 2009-11-17 10:02:05 UTC
Created attachment 23812 [details]
/proc/mdstat (before evms_activate) **corrected

Sorry, I uploaded the wrong file.  Here is the one that was generated.  There were no entries in /proc/mdstat.  I'm using hacked initramfs image that dumps me into an ash shell right after all the drivers are loaded.
Comment 8 Alex Moore 2009-12-14 16:49:54 UTC
Created attachment 24182 [details]
patch against 2.6.31.6 that fixes this (for me)

I did a git bisection to identify the commits that broke EVMS w/ MD RAID. The first commit which broke things is: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.29.y.git;a=commit;h=d3374825ce57ba2214d375023979f6197ccc1385. However I believe this one is also involved, as it changes the behaviour of the breakage: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.29.y.git;a=commit;h=efeb53c0e57213e843b7ef3cc6ebcdea7d6186ac.

Reverting those two commits fixes the problem on my system. The attached patch does that for 2.6.31.6.

Note that I do not know whether there are other negative implications to applying this patch... perhaps doing so breaks something else that has been added to MD since 2.6.28, which I have yet to experience, and my understanding of the code is insufficient for me to figure it out.

I would hope that someone with a greater understanding of what's going on can figure out whether this is:

(1) a bug in the recent changes to MD, which happen to affect EVMS, or
(2) a deliberate change in behaviour, for which EVMS needs to be adapted

Either way, I hope my narrowing down of the offending changes will help. I may continue to investigate further myself, but suspect I have gone as far as my ability allows without more expert involvement...

Alex
Comment 9 Neil Brown 2009-12-14 21:11:24 UTC
Thanks for doing the bisect - it should help narrow the problem down.

There definitely was a deliberate change in behaviour with that patch.
However I am surprised that anything would need to adapt to it.
The only change should have been that an md device would disappear (e.g.
out of /sys) when it was no longer in use. EVMS, or any other code, should
not really notice.

Can you say anything about how the second patch "changes the behaviour
of the breakage"??  That might give a useful clue as to the nature of the breakage.

Unfortunately the strace in comment #3 died early - it looks like strace itself had problems decoding a getdents.  If it was possible to get a complete
strace of evms_activate failing, that might also provide useful clues.

Thanks,
NeilBrown
Comment 10 Alex Moore 2009-12-15 17:31:39 UTC
Created attachment 24201 [details]
much reduced patch for 2.6.31.6

The first of the commits linked to introduces the problem that results in the "md: array md0 already has disks!" message many times per second after running "evms_activate". When just that problem exists, my initrd never even moves on to attempting to mount the root filesystem... it just sits there forever with a flashing cursor while the problem goes on in the background.

The second commit changes that behaviour in that it does actually fail with an error very quickly. The error is "evms_commit_changes(): Invalid argument", as seen at the end of the strace in comment #3 from Beau, and then the initrd offers to drop me to a shell. A dmesg from the shell then reveals that the "array already has disks" issue exists in this situation too (but there's only a screenful or two of those, after which they cease when it hits the "invalid argument" error).

I have in fact managed to get much further with this now, in that I've got a vastly reduced patch that fixes it on my system (attached).

It seems to me that the problematic case is when the "mddev_put(mddev);" in the the "abort:" section of "md_alloc" is reached without having gone through one of the "goto" statements beforehand. Without this patch I hit the "Invalid argument" problem and also see "array already has disks" in dmesg, with the patch I see neither problem and the system boots, so it would seem that this alone fixes both things.

To be honest I am slightly surprised, as I imagined these were two separate problems, and thought I would end up with a minimal patch reverting one aspect from each of the commits, but I'm not about to complain. This still hasn't helped me understand why the original behaviour was occurring though... perhaps you can figure that out.

Do you still want an strace? I'm just not entirely sure how to go about getting it for you - given that my root filesystem is on the EVMS-managed MD RAID :) but perhaps I could put an strace binary into the initrd somehow...

Alex

P.S. If only it didn't take so long to recompile the kernel, I'd honestly say I'm enjoying this troubleshooting so far. I suppose it would have been less enjoyable if I hadn't made any progress!
Comment 11 Beau V.C. Bellamy 2009-12-15 18:51:15 UTC
Awesome!  I think you've made significant progress.

As far as getting an strace...  I too have my root on the evms volume.  I hacked a custom initramfs to have strace and to drop me into an ash shell right before evms_activate.  Then I'd run strace evms_activate while piping standard error to a file on a thumbdrive.

Working with initramfs is pretty easy.  I hope this helps.

- Beau
Comment 12 Neil Brown 2009-12-16 23:30:02 UTC
(bugzilla was down went I tried to respond, so I sent this by private email
I received a reply that that patch at the end of this email solves the 
problem.)



 Thanks for the extra detail.  I figured out why the second patch made
 a difference - I really didn't think it should have done.
 There is a bug in the first patch which is fixed by the second - the
 first sets mddev->hold_active rather than new->hold_active.
 So that explains part of the mystery, and confirms the problem is
 related to hold_active.

 I had another look at the evms source code and I saw the important
 difference this time.  Every time evms performs an operation on an md
 device it does:
    open
    ioctl
    close

 md doesn't expect this.  It assumes:
    open
    set of ioctls to perform desired changes
    close

 As a consequence, after certain ioctls the 'close' will destroy the
 md device so the effect of the ioctl will be lost.
 I think that the important ioctl in this case is SET_ARRAY_INFO.  I
 cannot see exactly why losing the effect of that ioctl would produce
 the symptoms you report, but it certainly isn't impossible.

 So could you please try the following patch.  I'm not it is what will
 finally go upstream, but if it makes a difference, then that will
 confirm my understanding.

Thanks,
NeilBrown



diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9dd8720..f251ff3 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5486,6 +5486,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 done_unlock:
 abort_unlock:
 	if (mddev->hold_active == UNTIL_IOCTL &&
+	    cmd != SET_ARRAY_INFO &&
 	    err != -EINVAL)
 		mddev->hold_active = 0;
 	mddev_unlock(mddev);
Comment 13 Jeremy Huddleston 2010-01-10 19:02:10 UTC
The patch in the previous comment works fine for me on 2.6.29
Comment 14 Jeremy Huddleston 2010-01-24 18:22:00 UTC
Any ETA when this patch or something similar will make its way into git?
Comment 15 Beau V.C. Bellamy 2010-01-24 20:07:37 UTC
The patch also works for me on 2.6.31.6.  Great job!
Comment 16 Neil Brown 2010-01-28 02:33:17 UTC
(In reply to comment #14)
> Any ETA when this patch or something similar will make its way into git?

December 30th 2009

The mainline commit is cbd1998377504df005302ac90d49db72a48552a6

NeilBrown
Comment 17 Neil Brown 2010-01-28 02:33:50 UTC
.. so marking as 'resolved'.

Note You need to log in before you can comment on or make changes to this bug.