Bug 10776
| Summary: | dm-crypt: performance problems, uses only single cpu?, 30-40k context switches per sec | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Sami Liedes (sami.liedes) |
| Component: | LVM2/DM | Assignee: | Milan Broz (gmazyland) |
| Status: | CLOSED OBSOLETE | | |
| Severity: | normal | CC: | agk, alan |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.25.4 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Sami Liedes
2008-05-22 11:12:18 UTC
> I have a modern Core 2 Quad (Q6600) processor. The kcryptd process seems to
> be using 99% of a single CPU (i.e. of 400% available), spread evenly on the
> CPUs (so it probably was a single thread at full steam).

It is moving from cpu to another cpu? hm, it is not good, so we need to set processor affinity. But that it uses only one core is expected. I have some ideas how to optimize dm-crypt on multiple cpus, but it is not implemented yet.

> I did some oprofiling and other digging and found out that when reading a
> crypted volume (echo 3 >/proc/sys/vm/drop_caches; dd
> if=/dev/mapper/myvg-root_crypt of=/dev/null bs=1M), I get around 30-40
> thousand context switches per second (as shown by the vmstat command) on an
> otherwise idle system, which seems quite a lot to me. Reading the underlying
> LVM volume (i.e. not decrypting it) causes about 7-8 thousand context
> switches per second, while when reading the md device on which the LVM
> volume resides I get around 3300 cs/sec. When idle I get 200 cs/sec.

hrm. Could you try whether direct dm-crypt (without md: over LVM, or better, directly over a normal partition) is better?

For the dd "test": if you increase readahead for the crypt device, is it better or the same? (I mean blockdev --setra <num> /dev/mapper/...)

There are some patches which are still not in the upstream kernel which should help, but this seems to be a really excessive context switch count...

> I noticed that there was a recent fix in dm-crypt.c that apparently disables
> broken parallel processing for *writes* (commit
> 3f1e9070f63b0eecadfa059959bf7c9dbe835962). Should parallel processing for
> reads still be possible?

I think that even writes, with a proper implementation, are possible. It just needs some configuration (sometimes it is not optimal to use all cores/CPUs for encryption).

On Thu, May 22, 2008 at 11:50:56AM -0700, bugme-daemon@bugzilla.kernel.org wrote:
> It is moving from cpu to another cpu?
> hm, it is not good, so we need set processor affinity.
> But that it use only one core, it is expected. I have some ideas how to
> optimize dm-crypt on multiple cpus, but it is not yet implemented.

Yes, kcryptd jumps from one cpu to another according to top.

> > I did some oprofiling and other digging and found out that when reading a
> > crypted volume (echo 3 >/proc/sys/vm/drop_caches; dd
> > if=/dev/mapper/myvg-root_crypt of=/dev/null bs=1M), I get around 30-40
> > thousand context switches per second (as shown by the vmstat command) on
> > an otherwise idle system, which seems quite a lot to me. Reading the
> > underlying LVM volume (i.e. not decrypting it) causes about 7-8 thousand
> > context switches per second, while when reading the md device on which
> > the LVM volume resides I get around 3300 cs/sec. When idle I get 200 cs/sec.
>
> hrm. could you try if direct dm-crypt (without md - over LVM or better
> directly over normal partition) is better?

More weirdness.

------------------------------------------------------------
# cryptsetup create ctest /dev/sdb1 --readonly --cipher aes-cbc-plain
# echo 3 >/proc/sys/vm/drop_caches
# dd if=/dev/mapper/ctest of=/dev/null bs=1M &
# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free    buff  cache  si so    bi bo   in     cs us sy id wa
[...]
 3  0    104 6153688  956628 109916   0  0 53209  2 2046  99368  0 34 42 24
 4  0    104 5394392 1576504 110400   0  0 62026 14 2206  97842  0 36 45 19
 1  1    104 4624468 2205336 110984   0  0 62933  4 2472 112581  0 36 46 17
------------------------------------------------------------

So I get 100k context switches per second.

But this is quite strange (well, to me at least): `dd if=/dev/sdb1 of=/dev/null bs=1M' (i.e. on the normal partition, without dm-crypt) gives me 20000 context switches per second. If I do `mount /dev/sdb1 /media/spare1' on another terminal, the rate of context switches immediately drops to 1/10th, or about 2000/second.
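The context-switch numbers throughout this thread are read from the `cs` column of `vmstat`, which reports a per-second average over each interval. As a sketch, the average across the rows captured above can be pulled out with awk (the variable simply replays the captured rows; on a live system you would pipe `vmstat 10` in directly):

```shell
# Replay of the vmstat 10 rows captured above; on a live system,
# pipe `vmstat 10` into awk directly instead of using a variable.
vmstat_sample=' r  b   swpd    free    buff  cache  si so    bi bo   in     cs us sy id wa
 3  0    104 6153688  956628 109916   0  0 53209  2 2046  99368  0 34 42 24
 4  0    104 5394392 1576504 110400   0  0 62026 14 2206  97842  0 36 45 19
 1  1    104 4624468 2205336 110984   0  0 62933  4 2472 112581  0 36 46 17'

# Field 12 is cs (context switches/sec); header rows are skipped
# because their first field is not numeric.
printf '%s\n' "$vmstat_sample" \
  | awk 'NF == 16 && $1 ~ /^[0-9]+$/ { sum += $12; n++ }
         END { if (n) printf "avg cs/sec: %d\n", sum / n }'
```

With the three rows above this averages out to roughly 103k switches per second, matching the "100k context switches per second" reading of the table.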
Mounting the device increases its block size from 512 to 4096, but setting that manually with blockdev --setbsz 4096 doesn't cause the drop in context switches, indicating that the block size is probably not the culprit. Unmounting the partition while dd is running doesn't cause an increase in context switches, but killing the dd (after unmounting) and starting it again does.

> for dd "test": if you increase readahead for the crypt device, is it
> better or the same? (I mean blockdev --setra <num> /dev/mapper/... )

I tried changing the readahead for /dev/mapper/myvg-root_crypt from 256 to 1024 and to 16384. It seemed to have a slight effect, so I tried 1048576 too. With readahead set to 1048576, the rate of context switches drops roughly 40%, to about 19000/second. Higher readahead also seems to correlate with somewhat higher throughput for dd, up to something like 93 MB/sec.

Sami

If this is still seen with a modern kernel please update.
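A note on the units in this experiment: blockdev --setra takes a count of 512-byte sectors, so the values tried above correspond to quite different readahead window sizes. A quick sketch to convert them:

```shell
# blockdev --setra counts 512-byte sectors; convert the values tried
# above into KiB to see how large the readahead windows actually were.
for ra in 256 1024 16384 1048576; do
    printf 'setra %-8d = %d KiB readahead\n' "$ra" $(( ra * 512 / 1024 ))
done
```

So the default 256 is a 128 KiB window, while the final setting of 1048576 requests a 512 MiB window.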