Bug 42954
Summary: | kernel oops when adding a bitmap to a raid1 md device | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Flavio Stanchina (flavio) |
Component: | MD | Assignee: | Neil Brown (neilb) |
Status: | RESOLVED PATCH_ALREADY_AVAILABLE | ||
Severity: | normal | CC: | neilb |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 3.2.6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | kernel messages, kernel config, kernel trace with a different BUG | ||
Description
Flavio Stanchina
2012-03-18 11:29:48 UTC
Created attachment 72641 [details]
kernel messages
Created attachment 72642 [details]
kernel config
Thanks for the report. I believe this is fixed by: http://neil.brown.name/git?p=md;a=commitdiff;h=37b8fb4a7443ad1d83a977f4b1720b5617447fed which I have queued to send to Linus as soon as 3.3 is out. It will then be added to recent stable kernels. (maybe I should just submit it now .. but I thought 3.3 was imminent).

I applied the patch to a vanilla 3.2.6 and tried again, but got the same crash. I'm sorry I didn't capture the kernel messages as I was in a hurry, but can do that if you think it might be useful.

Yes please - and double check that you are really running the new kernel as I have a high degree of confidence that the patch fixes that problem.

Well, it appears that I did indeed boot the wrong kernel. After I applied your patch and rebuilt the kernel, "make deb-pkg" added a + to the version number (I suppose it's meant to signal that the sources aren't pristine) and to apt-get, a trailing + means "install the package with this name WITHOUT the +" so it installed the unpatched kernel again. :-/

I've now rebuilt and installed the right kernel and I can confirm that your patch fixes the crash I've seen. However, you might want to look at the trace I'm attaching, because at 157.8 seconds there's another kernel BUG in drivers/md/md.c, probably unrelated to the bitmap thing but still firmly in your territory. :)

After creating the array, I ran "dd if=/dev/zero of=/dev/md3 bs=1M count=1" as usual and it survived. I ran dd again without count=1 to stress test the thing a bit more, oblivious to the fact that job control is disabled in a shell started with init=/bin/bash, so after a minute or so when I was satisfied that it was working fine I just hit Ctrl+Alt+Del and I got the kernel BUG you'll find in the trace.

Created attachment 72687 [details]
kernel trace with a different BUG
Thanks. That second one is fixed by: http://neil.brown.name/git?p=md;a=commitdiff;h=c744a65c1e2d59acc54333ce80a5b0702a98010b already sent to Linus, but he doesn't seem to have pulled it yet.
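
For anyone retesting a patched kernel after reading this thread, here is a minimal sketch of a sanity check for the "booted the wrong kernel" mix-up described above. The helper name check_kernel and the release strings "3.2.6+" and "3.2.6" are illustrative assumptions, not values taken from the report; substitute whatever your own build actually produced.

```shell
#!/bin/sh
# check_kernel EXPECTED [RUNNING]
# Hypothetical helper: compares the kernel release you expect against the
# one that is actually running. RUNNING defaults to the output of `uname -r`;
# passing it explicitly is only useful for trying the function out.
check_kernel() {
    expected="$1"
    running="${2:-$(uname -r)}"
    if [ "$running" = "$expected" ]; then
        echo "ok: running $running"
    else
        echo "mismatch: running $running, expected $expected"
        return 1
    fi
}

# Example with assumed release strings: a rebuilt tree whose version gained
# a trailing "+" versus the unpatched kernel that was booted by mistake.
check_kernel "3.2.6+" "3.2.6" || echo "reboot into the patched kernel before retesting"
```

Installing the freshly built .deb by file path with dpkg -i, rather than by package name through apt-get, avoids having the trailing + interpreted as a suffix on the name.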