Bug 202405

Summary: Corrupted data when writing to dm-crypt device
Product: Platform Specific/Hardware
Reporter: lirik.lv
Component: ARM
Assignee: linux-arm-kernel (linux-arm-kernel)
Status: NEW
Severity: normal
CC: ardb, gmazyland, herbert, linux-arm-kernel, mpatocka, snitzer
Priority: P1
Hardware: ARM
OS: Linux
Kernel Version: 5.0-rc3, 4.20.4, 4.17.19 (Maybe others?)
Subsystem:
Regression: No
Bisected commit-id:
Attachments: cpuinfo, kernel config, /proc/crypto, ver_linux

Description lirik.lv 2019-01-24 20:50:56 UTC
Created attachment 280733
cpuinfo

The cpuinfo, kernel config, and other information are in the attachments.
The CPU is an ARMv7 Cortex-A17 (RK3288).

How to reproduce:
* create a dm-crypt device with `cryptsetup --type plain` (LUKS can be used too)
* make a filesystem on it (tried both ext4 and ext3)
* write a big file (>110M)
* issue sync
* drop the page cache (`echo 1 > /proc/sys/vm/drop_caches`)
* compare the file with the original one: the files are different

Zram is used in the script just for convenience. The bug shows up on disks too.

#!/bin/sh -x
modprobe zram

case $1 in
clean)
	umount /mnt/zram0 /mnt/crypt
	cryptsetup close crypt
	zramctl -r /dev/zram*
	;;
*)
	zram0=$(zramctl -f -s 300M)
	zram1=$(zramctl -f -s 300M)
	crypt="/dev/mapper/crypt"

	echo -n 1 | cryptsetup open --type plain $zram1 crypt --

	mkfs.ext4 $zram0
	mkfs.ext4 $crypt

	mkdir -p /mnt/zram0 /mnt/crypt

	mount $zram0 /mnt/zram0
	mount $crypt /mnt/crypt

	dd if=/dev/zero of=/mnt/zram0/file bs=1M count=200
	cp /mnt/zram0/file /mnt/crypt/file

	sync
	echo 1 > /proc/sys/vm/drop_caches

	cmp /mnt/zram0/file /mnt/crypt/file
esac
Comment 1 lirik.lv 2019-01-24 20:51:52 UTC
Created attachment 280735
kernel config
Comment 2 lirik.lv 2019-01-24 20:52:39 UTC
Created attachment 280737
/proc/crypto
Comment 3 lirik.lv 2019-01-24 20:53:40 UTC
Created attachment 280739
ver_linux
Comment 4 lirik.lv 2019-01-25 12:46:09 UTC
Found an easier way to reproduce the bug.
There is no need to create a filesystem.
Here is a script:

#!/bin/sh -x
zramctl -s 100M /dev/zram0
echo -n 1 | cryptsetup open --type plain /dev/zram0 crypt --

# The bug appears only when the block size is not 4096, 8192, 16384, etc.
dd if=/dev/zero of=/dev/mapper/crypt bs=2048

sync
echo 1 > /proc/sys/vm/drop_caches

cmp -n 100M /dev/zero /dev/mapper/crypt
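
The block-size dependence noted in the script could be checked more systematically with a small sweep. The following is only a sketch: `check_bs` is a hypothetical helper name, not something from the report, and it assumes GNU `dd` and `cmp`. It writes zeros with a given block size, drops the page cache when the target is a block device, and compares the read-back data against /dev/zero. The size passed in should be a multiple of the block size.

```shell
#!/bin/sh
# Hypothetical helper: write SIZE bytes of zeros to TARGET using block
# size BS, then read them back and compare against /dev/zero.
# Returns 0 if the data reads back intact, non-zero otherwise.
check_bs() {
	target=$1
	bs=$2
	size=$3

	dd if=/dev/zero of="$target" bs="$bs" count=$((size / bs)) \
		conv=notrunc 2>/dev/null || return 2
	sync

	# Drop the page cache only for real block devices, so that the
	# read-back comes from the device and not from cached pages.
	# Writing to drop_caches requires root.
	if [ -b "$target" ]; then
		echo 1 > /proc/sys/vm/drop_caches
	fi

	cmp -s -n "$size" /dev/zero "$target"
}

# Example sweep (not run automatically); assumes /dev/mapper/crypt was
# set up as in the script above:
#   for bs in 512 1024 2048 4096 8192; do
#       check_bs /dev/mapper/crypt "$bs" $((64 * 1024 * 1024)) \
#           && echo "bs=$bs ok" || echo "bs=$bs CORRUPTED"
#   done
```

If the observation in the script holds, only the sizes below 4096 would be expected to report corruption.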
Comment 5 Mike Snitzer 2019-02-06 14:57:55 UTC
This is very likely some ARM-specific crypto accelerator bug.  Changing component to crypto.  Sorry I cannot help further given the provided info.
Comment 6 Mike Snitzer 2019-02-06 14:59:28 UTC
(In reply to Mike Snitzer from comment #5)
> This is very likely some ARM-specific crypto accelerator bug.  Changing
> component to crypto.  Sorry I cannot help further given the provided info.

I actually have no idea how to get this to the crypto developers... cc'ing Herbert.
Comment 7 Herbert Xu 2019-02-20 10:30:44 UTC
Since you appear to be using rk_crypto, could you please try disabling it (e.g., by not loading it if it's a module, or by using crconf to lower its priority) and see if the problem persists?

Thanks!
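
Which implementation the kernel picks for a cipher can be read from /proc/crypto, where each entry lists `name`, `driver`, `module` and `priority`; the highest-priority driver is used by default. A minimal sketch follows. `list_impls` is a hypothetical helper name, and `cbc(aes)` is an assumption about the cipher in use (plain-mode cryptsetup defaulted to aes-cbc-essiv:sha256 at the time). The driver names on a given system will differ.

```shell
#!/bin/sh
# Hypothetical helper: list every implementation of an algorithm as
# "priority driver module", highest priority first.
# The second argument defaults to /proc/crypto; a saved copy (such as
# the attachment on this bug) may be passed instead.
list_impls() {
	alg=$1
	file=${2:-/proc/crypto}
	awk -v alg="$alg" '
		$1 == "name"     { name = $3 }
		$1 == "driver"   { driver = $3 }
		$1 == "module"   { module = $3 }
		$1 == "priority" && name == alg { print $3, driver, module }
	' "$file" | sort -rn
}

# Example (not run automatically):
#   list_impls 'cbc(aes)'
```

If the first line of the output names a hardware accelerator driver, that driver is the one dm-crypt would end up using for the assumed cipher.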
Comment 8 lirik.lv 2019-02-20 10:56:23 UTC
(In reply to Herbert Xu from comment #7)
> Since you appear to be using rk_crypto could you please try disabling that
> (e.g., by not loading it if it's a module or use crconf to lower its
> priority) and see if the problem persists?
> 
> Thanks!

Yes!
I've disabled `CONFIG_CRYPTO_DEV_ROCKCHIP` and the problem is gone.

Thank you!
Comment 9 Ard Biesheuvel 2019-02-20 11:14:03 UTC
Some fixes for rockchip accelerators are under review here:
https://lore.kernel.org/linux-crypto/20190213082439.22138-1-zhangzj@rock-chips.com/
Comment 10 lirik.lv 2019-02-20 19:23:29 UTC
(In reply to Ard Biesheuvel from comment #9)
> Some fixes for rockchip accelerators are under review here:
> https://lore.kernel.org/linux-crypto/20190213082439.22138-1-zhangzj@rock-chips.com/

Tried this patch; the bug still persists.
Comment 11 Ard Biesheuvel 2019-02-20 19:24:41 UTC
(In reply to lirik.lv from comment #10)
> (In reply to Ard Biesheuvel from comment #9)
> > Some fixes for rockchip accelerators are under review here:
> > https://lore.kernel.org/linux-crypto/20190213082439.22138-1-zhangzj@rock-chips.com/
> 
> Tried this patch, bug still persist.

Did you try both patches? It is a series of 2.
Comment 12 lirik.lv 2019-02-20 19:56:12 UTC
(In reply to Ard Biesheuvel from comment #11)

> Did you try both patches? It is a series of 2.

Yes, I've tried both of them.
Comment 13 lirik.lv 2019-02-20 19:59:58 UTC
(In reply to lirik.lv from comment #12)
> (In reply to Ard Biesheuvel from comment #11)
> 
> > Did you try both patches? It is a series of 2.
> 
> Yes, I've tried both of them.

Also, after applying the patches I got a kernel panic when running my first script, at the point where it copies the file onto the crypt device.