Bug 11967 - md raid10 fails to resync when disks added
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: All
OS: Linux
Importance: P1 blocking
Assignee: Neil Brown
Reported: 2008-11-06 18:11 UTC by David Bronaugh
Modified: 2008-11-06 18:56 UTC

Kernel Version: 2.6.28-rc3
Regression: Yes


Description David Bronaugh 2008-11-06 18:11:00 UTC
Latest working kernel version: 2.6.26
Earliest failing kernel version: 2.6.27
Distribution: Debian
Hardware Environment: Intel P35 chipset based board, 6x Seagate 1.5T disks
Software Environment: mdadm - v2.6.7.1 - 15th October 2008
Problem Description: When disks are removed from a raid10 set and then re-added, the raid10 driver marks them as spares and does not resync onto them.
See also: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/285156

Steps to reproduce:
mdadm /dev/md<x> --remove <some device>
mdadm /dev/md<x> --add <same device>
cat /proc/mdstat

Relevant dmesg spew:
[257494.006276] md: bind<sdb3>
[257494.036562] RAID10 conf printout:
[257494.036589]  --- wd:4 rd:6
[257494.036613]  disk 0, wo:0, o:1, dev:sdc3
[257494.036638]  disk 1, wo:0, o:1, dev:sdd3
[257494.036663]  disk 3, wo:0, o:1, dev:sde3
[257494.036687]  disk 5, wo:0, o:1, dev:sda3
[257494.037095] md: recovery of RAID array md4
[257494.037126] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[257494.037156] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[257494.037238] md: using 128k window, over a total of 995904 blocks.
[257494.037267] md: resuming recovery of md4 from checkpoint.
[257494.037294] md: md4: recovery done.
[257494.048583] RAID10 conf printout:
[257494.048608]  --- wd:4 rd:6
[257494.048631]  disk 0, wo:0, o:1, dev:sdc3
[257494.048655]  disk 1, wo:0, o:1, dev:sdd3
[257494.048679]  disk 3, wo:0, o:1, dev:sde3
[257494.048705]  disk 5, wo:0, o:1, dev:sda3
[257494.056925] RAID10 conf printout:
[257494.056959]  --- wd:4 rd:6
[257494.056981]  disk 0, wo:0, o:1, dev:sdc3
[257494.057011]  disk 1, wo:0, o:1, dev:sdd3
[257494.057040]  disk 3, wo:0, o:1, dev:sde3
[257494.057065]  disk 5, wo:0, o:1, dev:sda3
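
Reading the printout: a minimal Python sketch, not part of the original report, that parses the conf printout logged after "recovery done" and lists the slots that have no device. It assumes `wd` is the number of working devices and `rd` the configured number of raid devices, which is an interpretation of the kernel output rather than something the report states. The function names are hypothetical.

```python
import re

# Excerpt copied from the dmesg output in this report.
DMESG = """\
[257494.037294] md: md4: recovery done.
[257494.048583] RAID10 conf printout:
[257494.048608]  --- wd:4 rd:6
[257494.048631]  disk 0, wo:0, o:1, dev:sdc3
[257494.048655]  disk 1, wo:0, o:1, dev:sdd3
[257494.048679]  disk 3, wo:0, o:1, dev:sde3
[257494.048705]  disk 5, wo:0, o:1, dev:sda3
"""

# Assumption: wd = working devices, rd = raid devices in the array.
SUMMARY_RE = re.compile(r"--- wd:(\d+) rd:(\d+)")
DISK_RE = re.compile(r"disk (\d+), wo:(\d), o:(\d), dev:(\S+)")


def parse_conf_printout(text):
    """Return (working, total, {slot: device}) for the last printout in text."""
    working = total = None
    slots = {}
    for line in text.splitlines():
        m = SUMMARY_RE.search(line)
        if m:
            # A new summary line starts a new printout; discard the old slots.
            working, total = int(m.group(1)), int(m.group(2))
            slots = {}
            continue
        m = DISK_RE.search(line)
        if m:
            slots[int(m.group(1))] = m.group(4)
    return working, total, slots


def empty_slots(total, slots):
    """Slot numbers in range(total) that have no device listed."""
    return [i for i in range(total) if i not in slots]


working, total, slots = parse_conf_printout(DMESG)
print(f"{working} of {total} slots active; empty slots: {empty_slots(total, slots)}")
```

On the excerpt above this reports 4 of 6 slots active with slots 2 and 4 empty: sdb3 was bound, but it does not appear in any slot even after md logged "recovery done".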
Comment 1 Neil Brown 2008-11-06 18:56:28 UTC
Thanks for the report.

This is fixed by

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a53a6c85756339f82ff19e001e90cfba2d6299a8

which has just been committed to mainline and should go into -stable in due course.
Comment 2 Neil Brown 2008-11-06 18:56:56 UTC
Closing as code fix is available.
