I run nightly raid checking on some raid 1 mirrors (md raid) and when running 3.13 kernels it sometimes locks up. I am not sure of the exact trigger. Here is some sample output:

Personalities : [raid1]
md14 : active raid1 sda4[1] sdb4[2]
      217148624 blocks super 1.2 [2/2] [UU]
      [=>...................]  check =  9.9% (21538176/217148624) finish=109.0min speed=29906K/sec

md15 : active raid1 sdd1[0] sdc1[1]
      35564360 blocks super 1.2 [2/2] [UU]
      [>....................]  check =  0.0% (12352/35564360) finish=42063.5min speed=14K/sec

md13 : active raid1 sda3[1] sdb3[2]
      83884984 blocks super 1.2 [2/2] [UU]
      resync=DELAYED

md12 : active raid1 sdb2[2] sda2[1]
      10484664 blocks super 1.2 [2/2] [UU]
      resync=DELAYED

md11 : active raid1 sda1[1] sdb1[2]
      1048564 blocks super 1.0 [2/2] [UU]

unused devices: <none>

You can see that one of the two checks has locked up, but the other is continuing.
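(For reference, the nightly check here is the kind normally driven by a cron script that writes to each array's sync_action file. The following is a minimal sketch of that idea, not the actual Fedora raid-check script; the loop and device paths are illustrative only.)

#!/bin/bash
# Minimal sketch of a nightly md check trigger, similar in spirit to
# Fedora's raid-check cron job; exact script contents are assumptions.
for md in /sys/block/md*/md; do
    # Only start a new check if the array is currently idle
    if [ "$(cat "$md/sync_action")" = "idle" ]; then
        echo check > "$md/sync_action"
    fi
done
cat /proc/mdstat    # progress shows up as "check = ..." lines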
I checked 3.13.0-0.rc6.git0.2.fc21.i686+PAE and I am still seeing the issue.
My guess is that this is a problem with the drive controller rather than with md/raid. Can you attach complete kernel logs please? What is the most recent kernel on which this has worked?
The most recent kernel that hasn't exhibited the problem is kernel-PAE-3.12.6-300.fc20.i686. I am testing kernel-PAE-3.13.0-0.rc7.git0.1.fc21.i686 right now. If that fails I'll get the dmesg output from this boot.
I want to note that not every device check hits the problem. Often some checks succeed before one fails, and it is not always the same device that fails. When a check does get stuck, it seems to happen fairly near the start of the check for that device. On the machine where I have been seeing the hangs, I haven't noticed the scsi disks having the problem; so far it has always been the pata disks. But I haven't seen the problem often enough to say it can't happen on the scsi disks. That would still be consistent with a possible driver problem.
I just managed to get two pairs of disks to show the problem at the same time with rc7. sdc and sdd are scsi disks, and this is the first time I have seen them have a raid check lock up. In this case they locked up first. A check was running on md14 when they locked up; it completed. Then the md13 check locked up. As far as I can tell, the raid arrays are working correctly except for the locked-up checks.
Created attachment 121071 [details] dmesg output
Created attachment 121081 [details] /proc/mdstat for 3.13.0-0.rc7.git0.1.fc21.i686+PAE
Created attachment 121201 [details] /proc/mdstat from another machine

Now that I have had a chance to go into work, I am also able to reproduce the problem on my work machine.
Created attachment 121211 [details] dmesg output for work machine (3.13.0-0.rc7.git0.2.fc21.x86_64)
I tested this on a vanilla 3.13-rc1 and reproduced the problem. I'm going to start a bisect, but it will probably go slowly.
I just hit a build that didn't boot for other reasons, and that is going to slow things down more, but to narrow things down a bit I'm going to note the bisect log.

git bisect log
git bisect start
# bad: [6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae] Linux 3.13-rc1
git bisect bad 6ce4eac1f600b34f2f7f58f9cd8f0503d79e42ae
# good: [5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52] Linux 3.12
git bisect good 5e01dc7b26d9f24f39abace5da98ccbd6a5ceb52
# good: [5cbb3d216e2041700231bcfc383ee5f8b7fc8b74] Merge branch 'akpm' (patches from Andrew Morton)
git bisect good 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74
# good: [e1f56c89b040134add93f686931cc266541d239a] mm: convert mm->nr_ptes to atomic_long_t
git bisect good e1f56c89b040134add93f686931cc266541d239a
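(As an aside on the workflow, not from the original comment: a saved bisect log like the one above can be replayed later so the good/bad answers don't have to be re-entered by hand. A minimal sketch, assuming the bisect is resumed in the same or a freshly reset tree:)

# Save the bisect state, then replay it later to pick up where it left off
git bisect log > bisect.log
git bisect reset              # or move to another checkout of the tree
git bisect replay bisect.log
# ... build, boot, run a raid check, then record the result:
git bisect good               # or: git bisect bad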
Thanks for persisting with this. It does look like an md problem after all. My guess would be commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761. I've had another look at the code but cannot see the problem yet. Next time you get a hang, could you please collect a kernel stack trace for any process stuck in a 'D' wait, as well as for the mdXX_raid1 and mdXX_resync processes. You can get these with cat /proc/$PID/stack, provided CONFIG_STACKTRACE is set.
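(A sketch of one way to collect all of these in one pass; this script is an illustration added for clarity, not something from the original thread, and the ps/awk filter is an assumption about which threads matter here.)

#!/bin/bash
# Dump kernel stacks for every process in uninterruptible (D) sleep
# and for all md*_raid1 / md*_resync kernel threads.
# Requires CONFIG_STACKTRACE.
for pid in $(ps -eo pid,stat,comm --no-headers | \
             awk '$2 ~ /^D/ || $3 ~ /^md.*_(raid1|resync)$/ {print $1}'); do
    echo "=== $pid $(cat /proc/$pid/comm 2>/dev/null) ==="
    cat /proc/$pid/stack 2>/dev/null
done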
I'll make sure CONFIG_STACKTRACE is set in my config files. I am basing them off a Fedora config file, but don't know if that is set or not. I'll get 79ef3a8aa1cb1523cc231c9a90a278333c21f761 tested tomorrow. I should be able to try that commit and at least one parent before I leave work. My home machine is having some other problems with the test commits. It looks Nouveau-related, and that shouldn't affect my work machine.
I managed to lock up my system while I was trying to get the stack dumps the first time. It looks like the problem with not being able to boot test kernels is related to some Fedora change (probably dracut), not to landing near a bad commit during the tests. That might give me problems testing tomorrow at work, but I should be able to figure out likely candidates for messing up the initramfs generation. And if it is just Nouveau and not any old video driver, I'll be OK as well. I hope to have the stack traces shortly (using a kernel/initramfs from a few days ago).
After I hung the machine and rebooted tonight I am having trouble getting things to hang again to get the stack. I'll try again at work tomorrow as I am too tired to stay up any longer tonight.
This is the stack trace from an md_resync that was stuck:

[root@wolff bruno]# cat /proc/27900/stack
[<f7f0a316>] sync_request+0x446/0xb50 [raid1]
[<c086b4f3>] md_do_sync+0xa13/0xfe0
[<c0867fa8>] md_thread+0xe8/0x100
[<c0472fd1>] kthread+0xa1/0xc0
[<c09caeb7>] ret_from_kernel_thread+0x1b/0x28
[<ffffffff>] 0xffffffff

ps uaxww|grep ' D '
root     27897  2.0  0.0      0     0 ?        D    05:55   0:02 [md13_resync]
root     27900  0.0  0.0      0     0 ?        D    05:55   0:00 [md15_resync]
root     27981  0.0  0.0   4968   844 pts/0    S+   05:57   0:00 grep --color=auto  D

This is the stack trace from an md_resync that was in a D state, but progressing:

[<f7f0a316>] sync_request+0x446/0xb50 [raid1]
[<c086b4f3>] md_do_sync+0xa13/0xfe0
[<c0867fa8>] md_thread+0xe8/0x100
[<c0472fd1>] kthread+0xa1/0xc0
[<c09caeb7>] ret_from_kernel_thread+0x1b/0x28
[<ffffffff>] 0xffffffff
I verified that 79ef3a8aa1cb1523cc231c9a90a278333c21f761 exhibits the problem. I have another stack dump from an x86_64 machine (the above comment is for i686).

root      2216  0.0  0.0      0     0 ?        D    08:22   0:00 [md14_resync]

cat /proc/2216/stack
[<ffffffffa021f101>] sync_request+0x211/0xaa0 [raid1]
[<ffffffff814f2195>] md_do_sync+0x955/0xd60
[<ffffffff814eed38>] md_thread+0x118/0x130
[<ffffffff8108dda0>] kthread+0xc0/0xd0
[<ffffffff8167ae3c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff
Thanks. I was really hoping to see the stack trace of md14_raid1 too. I suspect there is almost nothing to it, but I'd like to know for certain.
I have a theory. This code looks wrong, and could conceivably cause the symptom you are seeing. If you can test it and let me know I'd really appreciate it. I'll probably submit the patch anyway because the code definitely looks wrong. (We move the 'sector = conf->start_next_window' assignment into the previous if block (where *_window_requests is incremented) and make it unconditional there.) Thanks.

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 1e5a540995e9..a49cfcc7a343 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -924,9 +924,8 @@ static sector_t wait_barrier(struct r1conf *conf, struct bio *bio)
 				conf->next_window_requests++;
 			else
 				conf->current_window_requests++;
-		}
-		if (bio->bi_sector >= conf->start_next_window)
 			sector = conf->start_next_window;
+		}
 	}
 	conf->nr_pending++;
Created attachment 121941 [details] Stack output from md*_raid1 and md*_resync together
I'll try out the patch tomorrow.
Thanks for the extra stack trace. It confirms what I suspected, which is often quite helpful!
I have been having problems setting up testing. One machine is failing to boot new kernels (from an unrelated problem), and on the other there is a problem building modules. I have some things to try, but trying things takes me a long time, and I didn't get anything I could test this fix on today.
I have started testing this fix using the 3.13.0-0.rc8.git2.1.fc21 kernel on i686 and x86_64. So far no problems, but I'll feel better waiting until tomorrow to close this ticket.
I am pretty confident that I would have seen the problem appear by now if it wasn't fixed, so I am going to close this bug. Thank you for fixing things.
And thank you for persisting with testing. The fix is already in Linus' tree and will flow to relevant -stable releases.