Bug 8020 - Write load on DM-Crypt LUKS partition with reiserfs jams system
Status: CLOSED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: LVM2/DM
Hardware: i386 Linux
Importance: P2 normal
Assignee: Alasdair G Kergon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-02-15 18:55 UTC by Ulrich
Modified: 2009-03-19 02:29 UTC
CC List: 5 users

See Also:
Kernel Version: 2.6.20
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmesg output (18.69 KB, text/plain)
2007-04-08 03:38 UTC, Nyyr
lsmod output (4.15 KB, text/plain)
2007-04-08 03:39 UTC, Nyyr
lspci output (2.16 KB, text/plain)
2007-04-08 03:40 UTC, Nyyr
output (42.46 KB, application/octet-stream)
2007-10-10 14:31 UTC, Michael Schachtebeck

Description Ulrich 2007-02-15 18:55:01 UTC
Most recent kernel where this bug did *NOT* occur:
Distribution: Ubuntu 7.04 (Feisty)
Hardware Environment: i386 (AMD64 in 32-bit-mode)
Software Environment: vanilla 2.6.20 kernel

Description:
Writing larger files onto a partition encrypted with DM-Crypt/LUKS (Filesystem: 
Reiserfs v. 3) freezes the system periodically every few seconds.


The recently purchased computer has an AMD "Sempron 3400+" CPU (AM2-socket) 
installed on a mainboard with Nvidia n-force 430 chipset.
The harddrive is a new 400GB SATA disk connected to the onboard SATA-terminal 
(kernel module: sata_nv).

In addition to three smaller and unencrypted partitions (system, swap and home; 
still unencrypted because it's a testing setup), I created a 322-GB partition 
for encrypted data storage.
This partition is LUKS-formatted via cryptsetup/DM-Crypt and has a Reiserfs 
file system.

The problem is the following:

Whenever I copy a slightly bigger file (e.g. 500MB) /to/ the dm-crypt-
partition, the system is almost blocked for the duration of the copying process.
This means that not only the system responsiveness gets rather low, but that, 
approx. once every two seconds, the system is completely "stalled" for a second.
Even the mouse pointer and Amarok sound playback periodically stop completely.
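
The periodic one-second stalls described above can be made measurable rather than judged by eye. The following is a minimal sketch, not something used in this report: it wakes up at a short fixed interval and flags wake-ups that arrive much later than requested. The function names, the 50 ms interval, and the 250 ms tolerance are all assumptions chosen for illustration.

```python
import time


def record_ticks(duration, interval):
    """Sleep in short steps and record when each wake-up really happened."""
    ticks = [time.monotonic()]
    end = ticks[0] + duration
    while ticks[-1] < end:
        time.sleep(interval)
        ticks.append(time.monotonic())
    return ticks


def find_stalls(timestamps, interval, tolerance):
    """Return (start, gap) pairs where two consecutive wake-ups are more
    than interval + tolerance seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, gap))
    return stalls


# Hypothetical usage, run while the large copy is in progress:
#   ticks = record_ticks(duration=30.0, interval=0.05)
#   for start, gap in find_stalls(ticks, interval=0.05, tolerance=0.25):
#       print(f"stall of {gap:.2f}s at t={start - ticks[0]:.2f}s")
```

A process that only sleeps does no disk I/O, so gaps it reports would point at whole-system stalls rather than at a blocked filesystem.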

It doesn't matter if the source is one of the unencrypted partitions on the 
same harddisk, a CD-ROM or even an NFS-mount from a (compared to local copying) 
relatively slow remote server.

This is particularly undesirable, e.g. if a database system for sensitive 
customer data is using the partition, or if the recording of video live-streams 
from surveillance cameras blocks the rest of the system.

And especially since I thought the new standard CFQ-IO-scheduler would prevent 
those problems.

By observing the write performance of my system, I could determine an anomaly 
which also seems to be connected with the above described lock-ups:

Data transfer rates obtained via "time cp 'sourcelocation/testfile' 'destination'"
with a test file of 1 GByte of random data.

Write performance:

copy:
  from one unencrypted filesystem to another unencrypted filesystem 
  on the same harddrive: 38MByte/sec

  from unencrypted filesystem to encrypted filesystem 
  on the same harddrive: 16MByte/sec

  from 100-MBit/sec LAN NFS-mount to unencrypted filesystem
  on local machine: 11.2MByte/sec (maximum for 100MBit-LAN)

  from 100-MBit/sec LAN NFS-mount to encrypted filesystem 
  on local machine: 6.5MByte/sec
  (even though 16MB/sec were possible from the local source before, 
   and the LAN is also capable of 11.2MB/sec [!]
   This is the anomaly I meant before.)
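
For anyone wanting to repeat these measurements, the arithmetic behind the figures can be sketched as follows. This is an illustration under stated assumptions, not the method used above: the helper names are hypothetical, 1 MByte is taken as 10^6 bytes, and, unlike a plain "time cp", the copy helper calls fsync so that data still sitting in the page cache is not counted as written.

```python
import os
import shutil
import time


def mbytes_per_sec(nbytes, seconds):
    """Throughput in MByte/sec, taking 1 MByte as 10**6 bytes."""
    return nbytes / seconds / 1_000_000


def link_limit_mbytes_per_sec(mbit_per_sec):
    """Raw upper bound of a network link in MByte/sec (8 bits per byte),
    before any protocol overhead."""
    return mbit_per_sec / 8


def timed_copy(src, dst):
    """Copy src to dst, flush dst to disk, and return the rate in MByte/sec."""
    start = time.monotonic()
    shutil.copyfile(src, dst)
    with open(dst, "rb+") as f:  # "rb+" reopens without truncating
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    return mbytes_per_sec(os.path.getsize(dst), elapsed)
```

A 100 MBit/sec link has a raw bound of 12.5 MByte/sec, so the 11.2 MByte/sec quoted above is plausibly that bound minus protocol overhead, while 6.5 MByte/sec is well below both the link limit and the 16 MByte/sec local rate.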


Read performance from the encrypted partition ("time cp 1-GB-testfile.dat /dev/null"
or even "time cp 1-GB-testfile.dat /tmp"; /tmp is not a RAM tmpfs) is fine at
28 and 26 MByte/sec respectively, with no system lock-ups.

To reproduce this problem, I downloaded, configured and compiled a vanilla 
2.6.20 kernel and ran the computer with it instead of the distribution's 
standard kernel (which at the time is a 2.6.20-rc snapshot).

I didn't load any third-party or binary-only kernel modules (i.e. I 
disabled the nvidia graphics driver).


Here I have uploaded some information about my computer:

http://datenparkplatz.de/DiesUndDas/lspci.output.2.6.20.vanilla.txt
(output of "lspci")

http://datenparkplatz.de/DiesUndDas/dmesg.output.2.6.20.vanilla.txt
(output of "dmesg")

http://datenparkplatz.de/DiesUndDas/proc.version.2.6.20.vanilla.txt
(output of "cat /proc/version")

http://datenparkplatz.de/DiesUndDas/kernelconfig-2.6.20-vanilla.txt
(my kernel's ".config")


HTH,
Ulrich
Comment 1 Nyyr 2007-04-08 03:36:51 UTC
I have the same problem, only worse:

I have a dm-crypt encrypted /home partition (LUKS). In my /home dir I then created
a large file (4.6 GB) encrypted with dm-crypt (also LUKS, even with the same password
and encryption method).
I attached this file to /dev/loop0, created an ext3 filesystem on it, and mounted
it under /mnt.
Now, when I copy large files from my Windows partition to /mnt, after some
time (200 MB copied?) the disk I/O system stalls: I cannot read from or write to
the disk. The computer is not locked up, but since it cannot do any disk
operations, it is unusable.

Off topic: another interesting thing is that even though I mounted this NTFS
partition READ ONLY, after resetting the computer (because of the dm-crypt + copy
problem), the next mount of that NTFS partition says that it is corrupted
(unknown flag 0x40000 or something like that).

I have been using a dm-crypt encrypted home for a long time and had no problems so far
(except the short lock-ups during encrypted disk I/O described by the bug reporter,
Ulrich).

This is only a guess, but it seems that the dm-crypt module requires exclusive use
of the computer, and since it is possibly called recursively during such a copy
(writes into an encrypted file on an encrypted partition), it deadlocks.

I attached dmesg, lspci and lsmod output (the disk I mentioned is the one
attached to PATA onboard VIA interface).
Comment 2 Nyyr 2007-04-08 03:38:33 UTC
Created attachment 11099
dmesg output
Comment 3 Nyyr 2007-04-08 03:39:37 UTC
Created attachment 11100
lsmod output
Comment 4 Nyyr 2007-04-08 03:40:24 UTC
Created attachment 11102
lspci output
Comment 5 Nyyr 2007-04-08 10:30:48 UTC
So... I created a LUKS partition on a DVD-RAM (4.6 GB). When I copy files
with Midnight Commander, I can see that while files are being read from the encrypted
/home into MC's cache the computer responds fine, but when it starts to write to the
DVD-RAM, any other disk I/O is halted for SEVERAL SECONDS.

So it seems to me that dm-crypt encrypts the data and waits for it to pass
through disk I/O while blocking any other disk usage. From this point of view it
seems to be not a bug but a bad design. Does dm-crypt not support
asynchronous operation?
Comment 6 Vladimir Lushnikov 2007-04-14 13:45:31 UTC
I don't quite know whether this bug is related to the kernel "crashes" I've been
getting with LUKS/DM-Crypt recently. My HD has several partitions, with Linux running
on one of the larger LUKS ones, using LVM2 (inside the LUKS, not LUKS inside the
LVM2).

Kernel 2.6.20-hardened-r1 (gentoo patchset), fuse support built as a module. The
partition in question is one using the ntfs-3g driver, inside a LUKS mapping.
It's got some music on it, and Amarok is the program accessing it. After some
time of listening, the computer hangs. The X "screenshot" just sits there, no
mouse input, no keyboard, no sshd.

The three times this has happened I was listening to music and compiling
programs (using portage) - the latter being a very IO-heavy activity. The last
time the crash happened I was present at the computer, and you could move the
mouse around very slowly for a few seconds before the whole system just hangs.
(More detail: first Kicker went, Firefox was working with the mouse, then I
tried switching windows, and there it hung)

The kernel log doesn't have any error messages, except one I got a few days ago:

Apr  8 21:23:19 gentoo VFS: Can't find ext4 filesystem on dev dm-10.
Apr  8 21:23:19 gentoo SQUASHFS error: Can't find a SQUASHFS superblock on dm-10
Apr  8 21:23:19 gentoo FAT: bogus logical sector size 15978
Apr  8 21:23:19 gentoo VFS: Can't find a valid FAT filesystem on dev dm-10.

Which was the last error I got before the system crashed. The subsequent crashes
did not have this problem.

I'm using the CFQ IO scheduler, Via SATA driver. Reiserfs partitions exist for
/var and /tmp, as in the ubuntu bug
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/82528

I am going to try removing the FUSE/NTFS combination because I have a feeling it
could be causing the lockups. I'll look into newer kernels (and compile more
debug info, hopefully). The only other thing that could be causing the problem
is that when the initramfs image is loaded when starting up, when "exec
switch_root" is executed, the cleanup of "echo > /proc/sys/kernel/hotplug" is
not done (so "/sbin/mdev" is there). But since init runs udev, that shouldn't
matter?

Apologies if I've been too verbose or imprecise. I will be more than happy to
provide more information.
Comment 7 Natalie Protasevich 2007-06-21 14:05:09 UTC
Several fixes for dm-crypt by Olaf Kirch went into 2.6.22-rc5.
Please test with this release and verify whether the problem has been fixed.
Thanks.
Comment 8 Ulrich 2007-08-18 22:52:38 UTC
Hi,

sorry to say this, but for me, the problem still persists with linux 2.6.22.1.
Comment 9 Michael Schachtebeck 2007-09-26 12:15:12 UTC
I have a similar problem, running kernel 2.6.22-gentoo-r6 (2.6.22.6 with gentoo patchset) on an Athlon64 X2 4200+ with 2 GB RAM. I also tried vanilla 2.6.23-rc7, but the problem remained the same.

My root filesystem is ext3 with data=journal. My home directory is encrypted using pam_mount (it's a file on the root filesystem, mounted via a loop device and dm-crypt), formatted as ext2.

When copying a large file to that encrypted loop device, about 100-150 MB are written without problems. Then the load of both processor cores suddenly goes up (100%wa for some seconds); the load of one core then goes back down to a normal level, while the other core stays at 100%wa and nothing is written to disk for some minutes. Every process trying to access a file or directory in that encrypted filesystem blocks. After some minutes, the load suddenly drops, another 100-200 MB are written to disk quite fast, and then the system blocks again for some minutes...

Any hint how to solve that problem?
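
The per-core "100%wa" figure in this comment can also be sampled directly from /proc/stat instead of being read off a monitoring tool. The sketch below is a hedged illustration: the function names are assumptions, and it relies on the documented column order of the "cpuN" lines (user, nice, system, idle, iowait, irq, softirq, steal), summing only those first eight columns so that later guest columns are not double counted.

```python
def parse_cpu_line(line):
    """Split one 'cpu' or 'cpuN' line from /proc/stat into (name, ticks)."""
    fields = line.split()
    return fields[0], [int(v) for v in fields[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples.

    Column order: user nice system idle iowait irq softirq steal ...
    Only the first eight columns are summed.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_cpus(path="/proc/stat"):
    """Read all cpu lines once; returns {name: ticks}."""
    with open(path) as f:
        return dict(parse_cpu_line(l) for l in f if l.startswith("cpu"))
```

Taking two samples a second or so apart and calling iowait_percent on each core's pair would show whether one core stays pinned in iowait while the other recovers, as described above.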
Comment 10 Ulrich 2007-09-26 15:25:12 UTC
Does it make sense to start a thread on the kernel mailing list?

The situation with filesystem encryption under Linux is a bit annoying at the moment, IMO. On a laptop, I think it's an essential feature.
Comment 11 Natalie Protasevich 2007-09-26 22:51:05 UTC
Yes, it always makes sense to raise concerns on the mailing list.
Comment 12 Milan Broz 2007-09-27 00:00:38 UTC
> ------- Comment #11 from protasnb@gmail.com  2007-09-26 22:51 -------
> Yes, it always makes sense to raise concerns on the mailing list.
Maybe :-) but the dm-crypt mailing list is better for discussing dm-crypt
issues...

Well, currently I see two main problems in this bugzilla:

1) The per-CPU crypt queue in dm-crypt was not a good idea; it causes more problems
and gives no real performance increase. In current -mm (for 2.6.24) dm-crypt has already
switched to a single-threaded queue, now with a separate queue for processing
io. So this is already fixed.

2) There is a problem with missing io congestion control when dm-crypt submits
work to the crypt thread in the device map function. This is probably the root cause of your
problems. I have some patches which try to solve it, but they are more of a hack
at the moment, so we need to fix it properly.
I will paste a patch here for testing when it is ready (I hope that will be soon).

Milan
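
To illustrate why missing congestion control matters in general terms, here is a toy discrete-time model of a work queue. It is an analogy only and does not reflect dm-crypt's actual code; the function name, parameters, and numbers are assumptions. A submitter offers work faster than a worker completes it; with no limit the backlog grows without bound, while with a limit the submitter is held back once the backlog reaches it.

```python
def simulate_backlog(submit_per_tick, complete_per_tick, ticks, limit=None):
    """Toy model of a work queue.

    Each tick the submitter offers submit_per_tick items and the worker
    then completes up to complete_per_tick items. If limit is set, items
    that would push the backlog above limit are held back (throttled).
    Returns (max_backlog, throttled_items).
    """
    backlog = 0
    max_backlog = 0
    throttled = 0
    for _ in range(ticks):
        accepted = submit_per_tick
        if limit is not None:
            accepted = min(submit_per_tick, max(0, limit - backlog))
        throttled += submit_per_tick - accepted
        backlog += accepted
        max_backlog = max(max_backlog, backlog)
        backlog -= min(backlog, complete_per_tick)
    return max_backlog, throttled
```

In this model the unbounded case accumulates a backlog proportional to the run time, whereas the bounded case caps queued work and pushes the delay back onto the submitter, which is the general idea behind congestion control.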
Comment 13 Milan Broz 2007-09-27 09:47:31 UTC
Could you please try the attached workaround and see if it helps?

Thanks,
Milan
-----

Add cond_resched() as a workaround to crypt processing queue
to prevent some reported system stucks.

This issue will be solved properly later...

Signed-off-by: Milan Broz <mbroz@redhat.com>
---
 drivers/md/dm-crypt.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6.23/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.23.orig/drivers/md/dm-crypt.c	2007-09-27 18:35:26.000000000 +0200
+++ linux-2.6.23/drivers/md/dm-crypt.c	2007-09-27 18:39:29.000000000 +0200
@@ -665,6 +665,8 @@ static void kcryptd_do_work(struct work_
 		process_read(io);
 	else
 		process_write(io);
+
+	cond_resched();
 }
 
 /*
Comment 14 Michael Schachtebeck 2007-09-27 10:19:36 UTC
Does it make sense to apply this patch to a 2.6.22.6 kernel, or is it 2.6.23-only?
Comment 15 Milan Broz 2007-09-27 10:22:17 UTC
> ------- Comment #14 from michael.schachtebeck@stud.uni-goettingen.de 
> 2007-09-27 10:19 -------
> Does it make sense to apply this patch to a 2.6.22.6 kernel, or is it
> 2.6.23-only?

Yes, you can try it on 2.6.22; the code should be the same there.
Comment 16 Michael Schachtebeck 2007-09-27 11:08:59 UTC
I applied it to 2.6.22-gentoo-r8 (2.6.22.9 with gentoo patchset) - sorry, no effect at all. :-(

Some more observations, maybe trivial:

* even a kill -9 does not kill blocked processes which try to access the encrypted filesystem

* writing to the root filesystem which contains the encrypted filesystem is *not* affected by the blocking, so it's probably no issue with the SATA driver, but related to dm-crypt or the loop device code.
Comment 17 Michael Schachtebeck 2007-09-27 13:23:13 UTC
Does not work with vanilla 2.6.23-rc8 either.
Comment 18 Milan Broz 2007-10-10 06:38:58 UTC
I see a similar problem when using loop devices (even without dm-crypt), as mentioned in comment #9. It is caused by stalling in balance_dirty_pages.
(But note this is only the first part of the problem.)
This was recently fixed in the -mm tree by the BDI dirty limit patchset.

Could you please attach the process state output from syslog? Run
echo t >/proc/sysrq-trigger
while the system is stalled (100% waiting) to confirm this.
(See also http://lkml.org/lkml/2007/9/29/57 - if this patch helps, it is the dirty page balance problem.)

Anyway, the second part of problem - dm-crypt congestion patch - is needed too. I will prepare it when 2.6.24-rc is ready (too many related changes there).
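
Besides the sysrq task dump requested above, the amount of dirty and under-writeback memory can be watched while the copy runs, which may help show whether writers are piling up dirty pages. This is a minimal sketch with assumed function names; it relies only on the "Dirty:" and "Writeback:" fields of /proc/meminfo, which are reported in kB.

```python
def parse_meminfo(text, keys=("Dirty", "Writeback")):
    """Extract the selected fields (values in kB) from /proc/meminfo text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in keys:
            values[name] = int(rest.split()[0])
    return values


def read_dirty_kb(path="/proc/meminfo"):
    """Read the current Dirty and Writeback counters from the running system."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling read_dirty_kb() once a second during the write and logging the result would give a simple timeline to set beside the stall periods.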
Comment 19 Michael Schachtebeck 2007-10-10 14:30:32 UTC
ok, here is the output of

echo t >/proc/sysrq-trigger

with 2.6.23 with gentoo patchset. I will try the patch you mentioned in some days, lot to do at the moment...
Comment 20 Michael Schachtebeck 2007-10-10 14:31:32 UTC
Created attachment 13104
output
Comment 21 Michael Schachtebeck 2007-10-10 23:08:22 UTC
OK, I had some time to test. I applied the patch from http://lkml.org/lkml/2007/9/29/57 and it works well so far: I wrote a 1 GB file without blocking. Does this patch have an impact on performance? I noticed that the write rate (according to dd) was 23.9 MB/s, which does not seem like much on an Athlon64 X2 4200+ (even allowing for the encryption).
Comment 22 Milan Broz 2007-10-11 00:09:15 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #20 from michael.schachtebeck@stud.uni-goettingen.de 
> 2007-10-10 14:31 -------
> Created an attachment (id=13104)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=13104&action=view)
> output

Ok, so it *is* related to dirty page balance stall as expected.

Oct 10 23:19:13 [kernel] dd            D f6486cb8     0  6008   5760
Oct 10 23:19:13 [kernel]        f6486ccc 00200086 00000002 f6486cb8 f6486cb0 00000000 f6423a40 c043cda0
Oct 10 23:19:13 [kernel]        c043fa40 f6486cbc f6423b7c c201aa40 00000001 c2122a80 00200202 c2126000
Oct 10 23:19:13 [kernel]        c2126000 c0133d67 f7bdae00 0002cda1 00000000 00000003 00000000 00000000
Oct 10 23:19:13 [kernel] Call Trace:
Oct 10 23:19:13 [kernel]  [<c0133d67>] lock_timer_base+0x27/0x60
Oct 10 23:19:13 [kernel]  [<c034496a>] schedule_timeout+0x4a/0xc0
Oct 10 23:19:13 [kernel]  [<c01cd2b2>] ext2_discard_prealloc+0x22/0x80
Oct 10 23:19:13 [kernel]  [<c0133a20>] process_timeout+0x0/0x10
Oct 10 23:19:13 [kernel]  [<c034433e>] io_schedule_timeout+0x1e/0x30
Oct 10 23:19:13 [kernel]  [<c015db36>] congestion_wait+0x56/0x80
Oct 10 23:19:13 [kernel]  [<c013e400>] autoremove_wake_function+0x0/0x40

Oct 10 23:19:13 [kernel]  [<c0158961>] balance_dirty_pages_ratelimited_nr+0x141/0x230
Oct 10 23:19:13 [kernel]  [<c0154062>] generic_file_buffered_write+0x372/0x6b0
...
Oct 10 23:19:13 [kernel]  [<c0173765>] do_sync_write+0xd5/0x120
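
Checking a long sysrq task dump for this signature by hand is tedious, so the frame names can be pulled out mechanically. The sketch below is an illustration with assumed function names; the regular expression is written against the frame format visible in the trace above ("[<address>] symbol+0xoffset/0xsize") and may need adjusting for other kernel versions.

```python
import re

# Matches frames such as:
#   [<c0158961>] balance_dirty_pages_ratelimited_nr+0x141/0x230
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_functions(lines):
    """Return the symbol names of all stack frames found in the log lines."""
    names = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


def stalled_in_dirty_balance(lines):
    """True if any frame belongs to the balance_dirty_pages family."""
    return any(
        name.startswith("balance_dirty_pages") for name in trace_functions(lines)
    )
```

Fed the lines of one task's call trace, stalled_in_dirty_balance reports whether that task is waiting in the dirty page balancing path, as the dd process is here.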
Comment 23 Milan Broz 2007-10-11 00:17:54 UTC
> ok, I had some time to test. I applied the patch from
> http://lkml.org/lkml/2007/9/29/57 - and it works well so far. I wrote a 1 GB
> file without blocking. Does this patch has an impact on the performance? I
> noticed that the write rate (according to dd) was 23,9 MB/s which seems not
> so much on an Athlon64 X2 4200+ (despite of the encryption).

good :)

The patch mentioned above is not a real fix - it will break dirty page balancing...
(but it is simple to test, to prove that the problem is here).
You should wait for 2.6.24, or for a backported workaround if there is one
(see the following thread on LKML too).

Crypt performance: switching to two single-threaded queues should increase performance.
When ready, please retest this with 2.6.24-rc (or current -mm); the patches are already in.

Thanks for testing,
Milan
Comment 24 Michael Schachtebeck 2007-10-11 09:04:18 UTC
I did the same tests with 2.6.23-rc8-mm2 - it worked well (without patching), and the performance was significantly higher than with a patched 2.6.23 (about 32 MB/s when writing a 1 GB file instead of 24 MB/s with a patched 2.6.23).

Do you know if the patch from 2.6.23-rc8-mm2 that fixes the bug can be applied to vanilla 2.6.23, or if there will soon be a port to a 2.6.23.x release?
Comment 25 Milan Broz 2007-10-11 09:33:32 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #24 from michael.schachtebeck@stud.uni-goettingen.de 
> 2007-10-11 09:04 -------
> Do you know if the patch from 2.6.23-rc8-mm2 that fixes the bug can be
> applied to vanilla 2.6.23, or if there will soon be a port to a 2.6.23.x
> release?

dirty_page: it is part of a complex patchset.

I have patched my 2.6.23 tree, and it takes 18 patches from -mm!
For the record, I am using these patches on 2.6.23, though some
cosmetic changes may be needed:

mm/nfs-remove-congestion_end.patch
mm/lib-percpu_counter_add.patch
mm/lib-percpu_counter_sub.patch
mm/lib-percpu_counter-variable-batch.patch
mm/lib-make-percpu_counter_add-take-s64.patch
mm/lib-percpu_counter_set.patch
mm/lib-percpu_counter_sum_positive.patch
mm/lib-percpu_count_sum.patch
mm/lib-percpu_counter_init-error-handling.patch
mm/lib-percpu_counter_init_irq.patch
mm/mm-bdi-init-hooks.patch
mm/mm-scalable-bdi-statistics-counters.patch
mm/mm-count-reclaimable-pages-per-bdi.patch
mm/mm-count-writeback-pages-per-bdi.patch
mm/lib-floating-proportions.patch
mm/mm-per-device-dirty-threshold.patch
mm/mm-per-device-dirty-threshold-warning-fix.patch
mm/mm-per-device-dirty-threshold-fix.patch

+ all dm fixes from mm tree

No idea if this will be backported or "workarounded"...

for dm-crypt specific queue patches - no, these will not be backported.

Milan
Comment 26 Michael Schachtebeck 2007-10-13 01:31:25 UTC
ok, so I hope that the fixes will be in 2.6.24 and use the patch from http://lkml.org/lkml/2007/9/29/57 for the time being...

Thank you for your assistance.
Comment 27 Michael Schachtebeck 2008-01-25 00:06:27 UTC
Is this issue fixed in 2.6.24? I found this

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=04fbfdc14e5f48463820d6b9807daa5e9c92c51f

commit, does it contain the fixes you mentioned above?
Comment 28 Milan Broz 2008-01-25 00:43:49 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #27 from michael.schachtebeck@stud.uni-goettingen.de 
> 2008-01-25 00:06 -------
> Is this issue fixed in 2.6.24? I found this
> 
>
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=04fbfdc14e5f48463820d6b9807daa5e9c92c51f
> 
> commit, does it contain the fixes you mentioned above?

yes, 2.6.24 should contain these fixes.
Comment 29 Natalie Protasevich 2008-02-02 01:41:11 UTC
The bug can be closed then I suppose, thanks.
Comment 30 Rafal Wijata 2008-04-21 07:24:26 UTC
[root@mail mail]# uname -a
Linux mail.***** 2.6.24.4-64.fc8 #1 SMP Sat Mar 29 09:15:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

Writing to a dm-crypted/reiserfs partition runs at about 4M/s, while an unencrypted partition gives almost 100M/s.

So this bug is not fixed!
Comment 31 Milan Broz 2008-04-21 07:54:50 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> writing to dm-crypted/reiserfs partition is ~bout 4M/s, while unencrypted
> partition gives almost 100M/s.
> 
> So this bug is not fixed!

If it is about performance, please report it as a separate bug and add info
about your system.

Several problems are probably mixed together in this bug.
This bug report mainly covers problems with crypt over a loop device, and those
should be fixed in recent kernels.
Comment 32 Rafal Wijata 2008-04-22 00:17:27 UTC
As you advised:
http://bugzilla.kernel.org/show_bug.cgi?id=10502
