Bug 15424 - kernel BUGs in fs/direct-io.c when cryptsetup tries to setup a new target
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: LVM2/DM
Hardware: All Linux
Importance: P1 normal
Assignee: Alasdair G Kergon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-02 11:41 UTC by Daniel Vetter
Modified: 2011-02-24 15:24 UTC

See Also:
Kernel Version: v2.6.33-2454-g13dda80
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg of 2.6.33-rc8 (44.31 KB, application/octet-stream)
2010-03-02 11:41 UTC, Daniel Vetter

Description Daniel Vetter 2010-03-02 11:41:38 UTC
Created attachment 25317 [details]
dmesg of 2.6.33-rc8

Last-working-kernel: v2.6.33-rc8

When cryptsetup tries to set up a new target, the kernel BUGs. This happens after cryptsetup has hashed the passphrase (that takes about 1 sec, so it's noticeable).

BUG captured via serial console:

[ 3038.837907] ------------[ cut here ]------------
[ 3038.838273] kernel BUG at /home/daniel/linux/src/fs/direct-io.c:630!
[ 3038.838273] invalid opcode: 0000 [#3] PREEMPT SMP                   
[ 3038.838273] last sysfs file: /sys/devices/virtual/block/dm-2/removable
[ 3038.838273] CPU 1                                                     
[ 3038.838273] Modules linked in: sha256_generic aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 hid_logitech ff_memless linear md_mod sg sr_mod usbhid hid cdrom sd_mod crc_t10dif ohci_hcd tg3 ohci1394 pata_amd button sata_sil ehci_hcd libphy libata ieee1394 usbcore scsi_mod thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[ 3038.838273]
[ 3038.838273] Pid: 881, comm: cryptsetup Tainted: G      D    2.6.33-02454-g13dda80 #10 S2885 Thunder K8W Mainboard/To Be Filled By O.E.M.
[ 3038.838273] RIP: 0010:[<ffffffff8110922d>]  [<ffffffff8110922d>] dio_send_cur_page+0x87/0x8f
[ 3038.838273] RSP: 0018:ffff8801606cdbb8  EFLAGS: 00010202
[ 3038.838273] RAX: 0000000000000001 RBX: ffff88007f4c0c00 RCX: ffff88007f9426c0
[ 3038.838273] RDX: 0000000000000000 RSI: ffff8801606cdb28 RDI: ffff88007f4a6a00
[ 3038.838273] RBP: ffff8801606cdbc8 R08: ffff8800235bda00 R09: ffff88007f4a6a08
[ 3038.838273] R10: 0000000000000001 R11: ffff8801606cdb08 R12: ffffea00008d4b30
[ 3038.838273] R13: 0000000000001000 R14: 0000000000000000 R15: 0000000000000004
[ 3038.838273] FS:  00007f139b7b27d0(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
[ 3038.838273] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3038.838273] CR2: 00007fae6078c010 CR3: 0000000160bf5000 CR4: 00000000000006e0
[ 3038.838273] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3038.838273] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3038.838273] Process cryptsetup (pid: 881, threadinfo ffff8801606cc000, task ffff88016078a840)
[ 3038.838273] Stack:
[ 3038.838273]  ffff88007fc00980 ffff88007f4c0c00 ffff8801606cdc08 ffffffff811092ef
[ 3038.838273] <0> 0000000000000004 fffffffffffffff4 ffffea00008d4b30 ffff88007f4c0c00
[ 3038.838273] <0> 0000000000000000 ffff88007f937b18 ffff8801606cdcd8 ffffffff81109c15
[ 3038.838273] Call Trace:
[ 3038.838273]  [<ffffffff811092ef>] submit_page_section+0xba/0x11d
[ 3038.838273]  [<ffffffff81109c15>] __blockdev_direct_IO+0x776/0xa80
[ 3038.838273]  [<ffffffff81108076>] blkdev_direct_IO+0x4e/0x50
[ 3038.838273]  [<ffffffff81107289>] ? blkdev_get_blocks+0x0/0x8d
[ 3038.838273]  [<ffffffff810a9c88>] generic_file_aio_read+0xde/0x546
[ 3038.838273]  [<ffffffff810b11e0>] ? release_pages+0x1da/0x1ec
[ 3038.838273]  [<ffffffff810e09ce>] do_sync_read+0xc4/0x101
[ 3038.838273]  [<ffffffff81153238>] ? selinux_file_permission+0x62/0x115
[ 3038.838273]  [<ffffffff81149e15>] ? security_file_permission+0x16/0x18
[ 3038.838273]  [<ffffffff810e1530>] vfs_read+0xaf/0x174
[ 3038.838273]  [<ffffffff810e16b8>] sys_read+0x4a/0x71
[ 3038.838273]  [<ffffffff81002b9b>] system_call_fastpath+0x16/0x1b
[ 3038.838273] Code: 85 d2 74 2b 48 89 df e8 22 ff ff ff 48 8b b3 00 01 00 00 48 89 df e8 97 fd ff ff 85 c0 75 10 48 89 df e8 0c fd ff ff 85 c0 74 04 <0f> 0b eb fe 5e 5b c9 c3 55 48 89 e5 41 56 41 55 41 54 53 48 83
[ 3038.838273] RIP  [<ffffffff8110922d>] dio_send_cur_page+0x87/0x8f
[ 3038.838273]  RSP <ffff8801606cdbb8>
[ 3039.138325] ---[ end trace f15071324f17a125 ]---

Full dmesg of v2.6.33-rc8 attached. If there's other stuff I should provide, I'll gladly do so.
Comment 1 Dmitry Monakhov 2010-03-02 11:54:32 UTC
Ohh.. Seems that the underlying device does some crap.
The code fails to add the first page to an empty bio.
Please describe your disk layout and LVM targets.
Comment 2 Daniel Vetter 2010-03-02 12:07:21 UTC
> --- Comment #1 from Dmitry Monakhov <dmonakhov@openvz.org>  2010-03-02
> 11:54:32 ---
> Ohh.. Seems that the underlying device does some crap.
> The code fails to add the first page to an empty bio.
> Please describe your disk layout and LVM targets.

The io stack of the one that's failing (root fs):

ext4 -> lvm2 -> luks -> sata partition

Details (from 2.6.33-rc8)
$ mount | grep root
/dev/mapper/viiv-root on / type ext4 (rw,errors=remount-ro)
$ dmsetup table
raid1--viiv-home: 0 419430400 linear 254:4 384
viiv-swap: 0 4194304 linear 254:0 29360512
viiv-root: 0 29360128 linear 254:0 384
viiv-root: 29360128 20971520 linear 254:0 264241536
luks1: 0 390618419 crypt aes-cbc-essiv:sha256 0000000000000000000000000000000000000000000000000000000000000000 0 8:3 2056
viiv-buildscratch: 0 209715200 linear 254:0 33554816
craid1: 0 1855858874 crypt aes-cbc-essiv:sha256 0000000000000000000000000000000000000000000000000000000000000000 0 8:18 2056
$ lspci -nn | grep storage
01:0b.0 Mass storage controller [0180]: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller [1095:3114] (rev 02) 

Hope that helps.

-Daniel
Comment 3 Daniel Vetter 2010-03-02 19:55:34 UTC
I've just tried out the same kernel version on my netbook (a totally different machine compared to my 64-bit workstation). It has the exact same problem with the same BUG. It also uses encryption for the root LVM.

Can someone please reassign this to dm-crypt? bugzilla doesn't allow me to do this.
Comment 4 Milan Broz 2010-03-02 20:32:37 UTC
(In reply to comment #0)
> Last-working-kernel: v2.6.33-rc8

There should be no change in dmcrypt / bio_add_page (and yes, the code must allow adding one page to an empty bio).

Can you try to bisect which patch caused this?

What is the exact reproducer - a simple cryptsetup luksOpen?
(And which cryptsetup version? The old version uses many small direct I/O requests, 1.1 uses more optimised access.)
Comment 5 Daniel Vetter 2010-03-02 20:43:01 UTC
> --- Comment #4 from Milan Broz <mbroz@redhat.com>  2010-03-02 20:32:37 ---
> (In reply to comment #0)
> > Last-working-kernel: v2.6.33-rc8
> 
> There should be no change in dmcrypt / bio_add_page (and yes, the code must
> allow adding one page to an empty bio).
> 
> Can you try to bisect which patch caused this?

Ok. I'll start bisecting this.

> What is the exact reproducer - a simple cryptsetup luksOpen?
> (And which cryptsetup version? The old version uses many small direct I/O
> requests, 1.1 uses more optimised access.)

The workstation has 1.1.0-rc2; this is a Debian unstable amd64 box.

The netbook has 1.1.0-rc3; this is a Fedora 12 box.

It happens about 1 second after I hit enter when asked for the key.
Comment 7 Daniel Vetter 2010-03-02 23:11:38 UTC
> --- Comment #6 from Milan Broz <mbroz@redhat.com>  2010-03-02 21:30:29 ---
> I guess this helps for now:
> 
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9599945bac93b344519ea97f502cf537124b5a6e

Yep, I can confirm that a kernel with this revert included works flawlessly.

thx, Daniel
Comment 8 Dmitry Monakhov 2010-03-03 03:57:25 UTC
Yepp. My patch breaks the dm layer. Sorry. This is because dm's merge function may return more than requested, so the correct check must test for less than what was requested.
I've sent a corrected patch: http://lkml.org/lkml/2010/3/2/607
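
For readers following along, the difference between the two checks can be sketched in plain userspace C. This is only an illustration of the comparison described in the comment above, not the kernel code or the actual patch: `merge_fn_t`, `can_add_strict`, `can_add_relaxed` and the three example callbacks are hypothetical names made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device's merge callback: given the number of
 * bytes the caller wants to have in the bio, return how many bytes the
 * device is willing to accept. A stacked device may report MORE than was
 * asked for. */
typedef size_t (*merge_fn_t)(size_t requested);

/* Too strict: treats any answer other than exactly `requested` as a
 * refusal, including answers that allow even more than requested. */
bool can_add_strict(merge_fn_t merge, size_t requested)
{
    return merge(requested) == requested;
}

/* Relaxed: refuses only when the device allows less than requested. */
bool can_add_relaxed(merge_fn_t merge, size_t requested)
{
    return merge(requested) >= requested;
}

/* Example callbacks, invented for illustration only. */
size_t merge_generous(size_t requested) { return requested + 4096; }
size_t merge_exact(size_t requested)    { return requested; }
size_t merge_short(size_t requested)    { return requested / 2; }
```

With a callback like `merge_generous`, the strict check refuses to add even a single page to an empty bio, which matches the symptom reported here, while the relaxed check accepts it and still rejects `merge_short`.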
