Bug 10781
Summary: | unresponsive system (unfair io scheduling) when using dm-crypt | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Christian Jaeger (christian-bko) |
Component: | LVM2/DM | Assignee: | Milan Broz (gmazyland) |
Status: | CLOSED OBSOLETE | ||
Severity: | normal | CC: | agk, daniel, lure, marejde |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.22.19 (kernel.org) [, 2.6.24-1-amd64 (Debian)] | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Christian Jaeger
2008-05-23 08:29:10 UTC
Could this be the same problem as the one in http://bugzilla.kernel.org/show_bug.cgi?id=10378 and/or the one being discussed in the thread starting at http://lkml.org/lkml/2008/2/28/150 ? I can't test right now, but will asap if that makes sense.

Christian.

This might be a problem [in combination] with CONFIG_USER_SCHED. I'm running a new kernel with this change now:

@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.27.7
-# Wed Dec 3 19:24:49 2008
+# Wed Dec 3 19:29:40 2008
 #
 CONFIG_64BIT=y
 # CONFIG_X86_32 is not set
@@ -81,10 +81,8 @@ CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=16
 # CONFIG_CGROUPS is not set
 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
-CONFIG_GROUP_SCHED=y
-CONFIG_FAIR_GROUP_SCHED=y
-# CONFIG_RT_GROUP_SCHED is not set
-CONFIG_USER_SCHED=y
+# CONFIG_GROUP_SCHED is not set
+# CONFIG_USER_SCHED is not set
 # CONFIG_CGROUP_SCHED is not set
 CONFIG_SYSFS_DEPRECATED=y
 CONFIG_SYSFS_DEPRECATED_V2=y

and things seem to be much better; I haven't run the above tests again yet, though, since I've used up all disk partitions atm and it's late at night.

Although the kernel I was running then didn't seem to have CONFIG_USER_SCHED enabled: see https://bugs.freedesktop.org/show_bug.cgi?id=15716

Could it be that config-2.6.22.19 was using USER_SCHED without configuring it? Or did something else change? Or are multiple problems affecting the whole thing?

bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=10781
> ------- Comment #3 from christian-bko@jaeger.mine.nu 2008-12-03 19:48 -------
> Although the kernel I was running then didn't seem to have CONFIG_USER_SCHED
> enabled: see https://bugs.freedesktop.org/show_bug.cgi?id=15716
> Could it be that config-2.6.22.19 was using USER_SCHED without configuring
> it?
> Or did something else change? Or are multiple problems affecting the whole
> thing?

Too many things have changed in the kernel since 2.6.22; please use a more recent kernel if possible.
If you want to experiment, start with this one-line patch: http://www2.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.25/dm-crypt-add-cond_resched.patch

You will probably need to modify it for kernels < 2.6.25 so that it applies correctly:

Index: home/data/linux-2.6.24.y/drivers/md/dm-crypt.c
===================================================================
--- home.orig/data/linux-2.6.24.y/drivers/md/dm-crypt.c
+++ home/data/linux-2.6.24.y/drivers/md/dm-crypt.c
@@ -374,6 +374,7 @@ static int crypt_convert(struct crypt_co
 			break;
 
 		ctx->sector++;
+		cond_resched();
 	}
 
 	return r;

> please use more recent kernel if possible.

As you can see from the diff, I'm using "Linux kernel version: 2.6.27.7" now. As shown at the above-mentioned URL https://bugs.freedesktop.org/show_bug.cgi?id=15716, I upgraded to a newer kernel long ago. I've been using various kernels since:

chris@novo:/boot$ l config*
-rw-r--r-- 1 root root 63490 2008-04-27 22:10 config-2.6.22.19
-rw-r--r-- 1 root root 72175 2008-06-11 06:50 config-2.6.25.6
-rw-r--r-- 1 root root 73063 2008-06-22 21:44 config-2.6.25.8
-rw-r--r-- 1 root root 73064 2008-07-03 22:17 config-2.6.25.10
-rw-r--r-- 1 root root 75183 2008-07-19 20:32 config-2.6.26
-rw-r--r-- 1 root root 75352 2008-09-11 23:03 config-2.6.26.3
-rw-r--r-- 1 root root 75352 2008-09-12 00:23 config-2.6.26.5
-rw-r--r-- 1 root root 77281 2008-10-16 16:47 config-2.6.27.1
-rw-r--r-- 1 root root 75352 2008-10-30 13:33 config-2.6.26.7.old
-rw-r--r-- 1 root root 75352 2008-10-30 14:04 config-2.6.26.6
-rw-r--r-- 1 root root 75352 2008-11-08 11:49 config-2.6.26.7
-rw-r--r-- 1 root root 77293 2008-11-08 12:55 config-2.6.27.5.old
-rw-r--r-- 1 root root 77290 2008-11-08 13:12 config.old
-rw-r--r-- 1 root root 77290 2008-11-08 13:12 config-2.6.27.5
-rw-r--r-- 1 root root 77216 2008-12-03 20:39 config-2.6.27.7
-rw-r--r-- 1 root root 77216 2008-12-03 20:39 config

I've never seen much improvement from going to a newer kernel, only a little maybe.
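To check which of the config files listed above actually had the group-scheduling options enabled, something like the following Python sketch could help. It is not part of the original report: the function names (`parse_kernel_config`, `option_state`, `compare_configs`) are hypothetical, and it only assumes the usual kernel config line formats (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`) seen in the diff above.

```python
import re

# The two line shapes found in a kernel config file (as in the diff above):
#   CONFIG_FOO=y               -> option set to a value
#   # CONFIG_FOO is not set    -> option known to this kernel but disabled
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text):
    """Return {option: value}; explicitly disabled options map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def option_state(options, name):
    """Describe one option: its value, 'not set', or 'absent'."""
    if name not in options:
        return "absent"  # the config file does not mention the option at all
    value = options[name]
    return "not set" if value is None else value


def compare_configs(paths, names):
    """Hypothetical helper: print the state of each named option per file."""
    for path in paths:
        with open(path, errors="replace") as handle:
            options = parse_kernel_config(handle.read())
        states = ", ".join(f"{n}={option_state(options, n)}" for n in names)
        print(f"{path}: {states}")
```

For example, `compare_configs(sorted(glob.glob("/boot/config-*")), ["CONFIG_GROUP_SCHED", "CONFIG_USER_SCHED"])` (after `import glob`) would print one line per file. An "absent" result, as opposed to "not set", would suggest that the option simply does not appear in that file, which bears on the question about config-2.6.22.19.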
I did move my swap from the logical volume on dm-crypt to a logical volume in an unencrypted volume group, and have moved most of my root filesystem to an unencrypted logical volume too; both together seem to have mitigated the issue a little bit. But the most noticeable improvement, aside from switching off one core, seems to be switching off CONFIG_USER_SCHED in the newest kernel. Though again, forgive me that I can't run the above test case right now. I'll do so and, if the problem persists, also try your patch. Thanks.

BTW one thing I also tested recently was whether I could get a faster disk or swap by using 3 USB sticks in a raid-0 setup (with 16k chunks). This raid device is generally quite a bit faster than my internal (crappy laptop) disk: linear reading is about twice as fast (~60MB/sec; my laptop has 3 USB ports, but two of them are on the same USB bus as it turns out), linear writing is about the same (~30MB/sec), and random reading (find -type f on reiserfs with cold cache) is about 4-5 times faster. Using dm-crypt on top of that raid device didn't slow those speeds down, and I actually noticed that writing to and reading from that device didn't slow my desktop down at all!

I was already starting to suspect that my internal laptop disk is somehow just broken, so I made a clear-text raid-0 setup again, created reiserfs on it, then created a non-sparse 4G file on that, then losetup'ed that file to a loop device which I cryptsetup luksFormat'ed and luksOpen'ed, and ran mkswap and swapon on it (and swapoff on my old swap). But that swap was very bad; it brought my desktop right back to >20 second reaction times. Of course I don't know which point in the chain is the exact culprit. And this was with 2.6.27.5 with CONFIG_USER_SCHED=y; I may try again now with the 2.6.27.7 kernel without USER_SCHED.
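To put numbers on "reaction times" when comparing kernels or storage setups, one rough option is to measure how late an ordinary userspace process wakes up while the disk test runs. The Python sketch below is only an illustration under that assumption; it is not from the report, the function names are hypothetical, and wakeup delay is merely a proxy for desktop responsiveness, not a direct measurement of dm-crypt.

```python
import time


def measure_wakeup_delays(interval=0.01, samples=500):
    """Sleep `interval` seconds `samples` times.

    Returns a list with the overshoot of each sleep in seconds, i.e. how
    much later than requested the process got to run again.
    """
    delays = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        # Clamp at zero in case of clock granularity effects.
        delays.append(max(0.0, elapsed - interval))
    return delays


def summarize(delays):
    """Return the sample count, median, and worst-case delay."""
    ordered = sorted(delays)
    return {
        "samples": len(ordered),
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }
```

Running `summarize(measure_wakeup_delays())` once on an idle system and once during heavy I/O on the encrypted device gives two sets of figures to compare; a large jump in "max" would be consistent with the stalls described above, though it would not show which layer causes them.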