Most recent kernel where this bug did *NOT* occur: ?
Distribution: kubuntu 6.06.1
Hardware Environment: compaq ml350 g3
Software Environment: kernel 2.6.20rc3
Problem Description:

Steps to reproduce:
make a luks crypted partition, mount it and fill it with files.
reproducible -> 100%

steps:
cryptsetup luksFormat /dev/cciss_c0d0p3
cryptsetup luksOpen /dev/cciss_c0d0p3 p3
mkfs.xfs /dev/mapper/p3
mount /dev/mapper/p3 /mnt/p3
rsync -vax / /mnt/p3
> rsync runs for ~30 seconds, after which it starts to copy files
> after ~15 seconds of copying, the kernel panics:

[ 459.938259] ------------[ cut here ]------------ [ 459.942882] kernel BUG at drivers/block/cciss.c:2470! [ 459.947932] invalid opcode: 0000 [#1] [ 459.951597] PREEMPT SMP [ 459.954171] Modules linked in: aes xfrm_user xfrm4_tunnel tunnel4 ipcomp esp4 ah4 af_packet 8021q deflate zlib_deflate zlib_inflate twofish twofish_common serpent blowfish des cbc ecb blkcipher crypto_null af_key md_mod lp serio_raw i2c_piix4 floppy psmouse parport_pc parport cfi_probe gen_probe scb2_flash mtdcore chipreg map_funcs ipv6 evdev ide_generic ohci_hcd usbcore ide_disk serverworks generic wp512 aes_i586 sha256 sha1 dm_crypt dm_mod [ 459.994402] CPU: 0 [ 459.994403] EIP: 0060:[<c02cdc81>] Not tainted VLI [ 459.994404] EFLAGS: 00010016 (2.6.20-rc3 #0) [ 460.006429] EIP is at do_cciss_request+0x371/0x375 [ 460.011217] eax: f7b4fe34 ebx: f7e81970 ecx: 00000000 edx: 000100c5 [ 460.017999] esi: f7f15b84 edi: 00000009 ebp: 00000009 esp: f7f15ad4 [ 460.024784] ds: 007b es: 007b ss: 0068 [ 460.028883] Process kcryptd/0 (pid: 883, ti=f7f14000 task=c19feaa0 task.ti=f7f14000) [ 460.036444] Stack: f7e30b0c f7e50000 08718875 f7b4fe34 00000000 c1499080 00000000 00000000 [ 460.044988] 00007000 c1499520 00000000 00000000 00010000 c1499720 00000000 00000000 [ 460.053558] 00010000 c1499920 00000000 00000000 00010000 c1499b20 00000000 00000000 [ 460.062107] Call Trace: [ 460.064814] [<c02e71f4>] ide_intr+0xc2/0x1f2 [ 460.069202] [<c0114425>]
__wake_up_common+0x39/0x59 [ 460.074193] [<c0114477>] __wake_up+0x32/0x43 [ 460.078577] [<c012a546>] __queue_work+0x56/0x69 [ 460.083217] [<c0243ac6>] as_put_io_context+0x36/0x51 [ 460.088292] [<c0244685>] as_completed_request+0xaf/0x223 [ 460.093721] [<c023e9c5>] blk_start_queue+0x2a/0x79 [ 460.098623] [<c02cbb78>] cciss_softirq_done+0x137/0x223 [ 460.103963] [<c023f3c6>] blk_done_softirq+0x51/0x5e [ 460.108959] [<c01202ad>] __do_softirq+0x71/0xd9 [ 460.113603] [<c012034c>] do_softirq+0x37/0x3d [ 460.118071] [<c01204a8>] irq_exit+0x46/0x48 [ 460.122367] [<c010551d>] do_IRQ+0x45/0x78 [ 460.126491] [<c013dc0c>] handle_IRQ_event+0x25/0x4a [ 460.131485] [<c0103737>] common_interrupt+0x23/0x28 [ 460.136480] [<f884ba2e>] aes_enc_blk+0xa2e/0xb6c [aes_i586] [ 460.142172] [<f884c6eb>] aes_encrypt+0x13/0x17 [aes_i586] [ 460.147686] [<f90cc30a>] crypto_cbc_encrypt+0x91/0x162 [cbc] [ 460.153457] [<f90cc03a>] xor_128+0x0/0x17 [cbc] [ 460.158099] [<c024bac8>] rb_insert_color+0x72/0xbd [ 460.163004] [<f884c6d8>] aes_encrypt+0x0/0x17 [aes_i586] [ 460.168440] [<f8837db7>] crypt_convert_scatterlist+0x8a/0xe2 [dm_crypt] [ 460.175171] [<f8837f40>] crypt_convert+0x131/0x17f [dm_crypt] [ 460.181037] [<f88380c6>] kcryptd_do_work+0x138/0x39f [dm_crypt] [ 460.187079] [<c012a643>] run_workqueue+0x7e/0x14d [ 460.191903] [<f8837f8e>] kcryptd_do_work+0x0/0x39f [dm_crypt] [ 460.197758] [<c012a8ae>] worker_thread+0xff/0x160 [ 460.202577] [<c0116f10>] default_wake_function+0x0/0xc [ 460.207826] [<c012a7af>] worker_thread+0x0/0x160 [ 460.212553] [<c012d7d8>] kthread+0xd1/0xd5 [ 460.216766] [<c012d707>] kthread+0x0/0xd5 [ 460.220891] [<c0103937>] kernel_thread_helper+0x7/0x10 [ 460.226142] ======================= [ 460.229715] Code: fe ff ff 89 9a 80 a0 00 00 89 9b 38 02 00 00 89 9b 3c 02 00 00 e9 22 ff ff ff 8b 04 24 e8 b0 0c f7 ff e9 4e ff ff ff 0f 0b eb fe <0f> 0b eb fe 55 57 56 53 83 ec 04 89 c3 89 d6 89 cd 8b 40 38 8b [ 460.250174] EIP: [<c02cdc81>] do_cciss_request+0x371/0x375 SS:ESP 
0068:f7f15ad4 [ 460.257529] <0>Kernel panic - not syncing: Fatal exception in interrupt [ 460.264254] BUG: at arch/i386/kernel/smp.c:547 smp_call_function() [ 460.270572] [<c010cda3>] smp_call_function+0x127/0x12c [ 460.275878] [<c011c566>] printk+0x1b/0x1f [ 460.280057] [<c010cdc3>] smp_send_stop+0x1b/0x26 [ 460.284835] [<c011b3b2>] panic+0x57/0x10b [ 460.289013] [<c01044a8>] die+0x204/0x213 [ 460.293113] [<c01048e9>] do_invalid_op+0x0/0xab [ 460.297808] [<c010498b>] do_invalid_op+0xa2/0xab [ 460.302602] [<c02cdc81>] do_cciss_request+0x371/0x375 [ 460.307835] [<c02e71f4>] ide_intr+0xc2/0x1f2 [ 460.312276] [<c024bce3>] rb_erase+0x183/0x270 [ 460.316779] [<c02ebaf3>] task_in_intr+0x0/0x104 [ 460.321486] [<c023d606>] elv_dispatch_sort+0x16/0x70 [ 460.326612] [<c0244a53>] as_move_to_dispatch+0xa1/0x124 [ 460.331956] [<c013f29b>] handle_edge_irq+0xf2/0x11e [ 460.336962] [<c037d6a4>] error_code+0x7c/0x84 [ 460.341456] [<c02cdc81>] do_cciss_request+0x371/0x375 [ 460.346733] [<c02e71f4>] ide_intr+0xc2/0x1f2 [ 460.351130] [<c0114425>] __wake_up_common+0x39/0x59 [ 460.356129] [<c0114477>] __wake_up+0x32/0x43 [ 460.360522] [<c012a546>] __queue_work+0x56/0x69 [ 460.365170] [<c0243ac6>] as_put_io_context+0x36/0x51 [ 460.370246] [<c0244685>] as_completed_request+0xaf/0x223 [ 460.375673] [<c023e9c5>] blk_start_queue+0x2a/0x79 [ 460.380587] [<c02cbb78>] cciss_softirq_done+0x137/0x223 [ 460.385935] [<c023f3c6>] blk_done_softirq+0x51/0x5e [ 460.390930] [<c01202ad>] __do_softirq+0x71/0xd9 [ 460.395583] [<c012034c>] do_softirq+0x37/0x3d [ 460.400059] [<c01204a8>] irq_exit+0x46/0x48 [ 460.404355] [<c010551d>] do_IRQ+0x45/0x78 [ 460.408477] [<c013dc0c>] handle_IRQ_event+0x25/0x4a [ 460.413474] [<c0103737>] common_interrupt+0x23/0x28 [ 460.418483] [<f884ba2e>] aes_enc_blk+0xa2e/0xb6c [aes_i586] [ 460.424188] [<f884c6eb>] aes_encrypt+0x13/0x17 [aes_i586] [ 460.429700] [<f90cc30a>] crypto_cbc_encrypt+0x91/0x162 [cbc] [ 460.435471] [<f90cc03a>] xor_128+0x0/0x17 [cbc] [ 460.440120] 
[<c024bac8>] rb_insert_color+0x72/0xbd [ 460.445026] [<f884c6d8>] aes_encrypt+0x0/0x17 [aes_i586] [ 460.450484] [<f8837db7>] crypt_convert_scatterlist+0x8a/0xe2 [dm_crypt] [ 460.457239] [<f8837f40>] crypt_convert+0x131/0x17f [dm_crypt] [ 460.463129] [<f88380c6>] kcryptd_do_work+0x138/0x39f [dm_crypt] [ 460.469200] [<c012a643>] run_workqueue+0x7e/0x14d [ 460.474030] [<f8837f8e>] kcryptd_do_work+0x0/0x39f [dm_crypt] [ 460.479903] [<c012a8ae>] worker_thread+0xff/0x160 [ 460.484731] [<c0116f10>] default_wake_function+0x0/0xc [ 460.489988] [<c012a7af>] worker_thread+0x0/0x160 [ 460.494722] [<c012d7d8>] kthread+0xd1/0xd5 [ 460.498936] [<c012d707>] kthread+0x0/0xd5 [ 460.503062] [<c0103937>] kernel_thread_helper+0x7/0x10 [ 460.508324] =======================
* formatting and mounting the partition directly works as expected.
* sometimes it is enough to just try to remount the dirty xfs volume to get the "bug" (not the panic)

i think this happens if a lot of writes are queued for the dm-crypt device!
Does it work with kernel 2.6.19.1?
the cciss controller is a 624 with 3*36 GB disks on RAID5 > Does it work with kernel 2.6.19.1? no! [ 522.514796] ------------[ cut here ]------------ [ 522.519428] kernel BUG at drivers/block/cciss.c:2507! [ 522.524480] invalid opcode: 0000 [#1] [ 522.528144] PREEMPT SMP [ 522.530737] Modules linked in: xfrm_user xfrm4_tunnel tunnel4 ipcomp esp4 ah4 af_packet 8021q deflate zlib_deflate twofish twofish_common serpent blowfish de s cbc ecb blkcipher crypto_null af_key md_mod lp floppy serio_raw parport_pc par port psmouse scb2_flash mtdcore chipreg map_funcs i2c_piix4 i2c_core ipv6 evdev ide_generic ohci_hcd usbcore ide_disk serverworks generic wp512 aes_i586 sha256 sha1 dm_crypt dm_mod [ 522.568574] CPU: 0 [ 522.568575] EIP: 0060:[<c02f13f2>] Not tainted VLI [ 522.568577] EFLAGS: 00010016 (2.6.19.1 #1) [ 522.580446] EIP is at do_cciss_request+0x3f6/0x403 [ 522.585243] eax: f79d2284 ebx: f7841030 ecx: c026ac49 edx: 00000000 [ 522.592025] esi: 00000008 edi: 00000002 ebp: 00000008 esp: f7d41a34 [ 522.598811] ds: 007b es: 007b ss: 0068 [ 522.602908] Process kcryptd/0 (pid: 870, ti=f7d40000 task=f7d99aa0 task.ti=f7 d40000) [ 522.610471] Stack: f7cedba8 f79d2d84 f7d41a4c f79d2284 08718285 f7f19000 c155 8240 00000000 [ 522.619039] 00000000 00006000 c1567200 00000000 00000000 00008000 c156 5400 00000000 [ 522.627617] 00000000 00008000 c1566400 00000000 00000000 00008000 c156 2e00 00000000 [ 522.636196] Call Trace: [ 522.638888] [<c0115b8f>] activate_task+0x64/0xaf [ 522.643622] [<c0115a5c>] __activate_task+0x23/0x39 [ 522.648527] [<c01163a9>] try_to_wake_up+0x26b/0x31a [ 522.653521] [<c0118302>] __wake_up_common+0x3f/0x60 [ 522.658525] [<c0118302>] __wake_up_common+0x3f/0x60 [ 522.663521] [<c011835d>] __wake_up+0x3a/0x4b [ 522.667913] [<c012eb0b>] __queue_work+0x4b/0x5a [ 522.672562] [<c02699c5>] as_put_io_context+0x3b/0x66 [ 522.677640] [<c0264910>] __freed_request+0x5f/0x8e [ 522.682545] [<c026435a>] blk_start_queue+0x48/0x81 [ 522.687448] [<c02ef754>] 
cciss_check_queues+0x98/0x10d [ 522.692698] [<c02ef8a4>] cciss_softirq_done+0xdb/0x13f [ 522.697953] [<c02661f6>] blk_done_softirq+0x5c/0x6a [ 522.702946] [<c0123217>] __do_softirq+0x70/0xdf [ 522.707591] [<c01232b9>] do_softirq+0x33/0x35 [ 522.712060] [<c01232fb>] irq_exit+0x40/0x42 [ 522.716355] [<c0105521>] do_IRQ+0x61/0xa3 [ 522.720490] [<c010383e>] common_interrupt+0x1a/0x20 [ 522.725488] [<f884871e>] aes_enc_blk+0x71e/0xb88 [aes_i586] [ 522.731177] [<f88d90aa>] crypto_cbc_encrypt_segment+0x63/0x95 [cbc] [ 522.737556] [<f884af44>] aes_encrypt+0x0/0x5 [aes_i586] [ 522.742898] [<f88d944d>] xor_128+0x0/0x1f [cbc] [ 522.747540] [<f88d91a3>] crypto_cbc_encrypt+0x51/0x91 [cbc] [ 522.753226] [<f88d944d>] xor_128+0x0/0x1f [cbc] [ 522.757883] [<f8813390>] crypt_convert_scatterlist+0xe8/0x11f [dm_crypt] [ 522.764700] [<c014e268>] __alloc_pages+0x56/0x30a [ 522.769529] [<f881355e>] crypt_convert+0x144/0x184 [dm_crypt] [ 522.775392] [<c018e071>] bio_alloc_bioset+0x102/0x18b [ 522.780570] [<f8813ab2>] process_write+0xbf/0x153 [dm_crypt] [ 522.786350] [<c012ed47>] run_workqueue+0x76/0xed [ 522.791081] [<f8813ba0>] kcryptd_do_work+0x0/0x28 [dm_crypt] [ 522.796852] [<c012ef1f>] worker_thread+0x161/0x17d [ 522.801754] [<c01182b1>] default_wake_function+0x0/0x12 [ 522.807095] [<c01182b1>] default_wake_function+0x0/0x12 [ 522.812438] [<c012edbe>] worker_thread+0x0/0x17d [ 522.817166] [<c01323b1>] kthread+0xae/0xd3 [ 522.821378] [<c0132303>] kthread+0x0/0xd3 [ 522.825500] [<c01039ef>] kernel_thread_helper+0x7/0x18 [ 522.830755] ======================= [ 522.834335] Code: e9 c0 fd ff ff 8b 8c 24 1c 02 00 00 89 0c 24 e8 b8 2f f7 ff 8b 44 24 14 89 04 24 e8 f1 fa ff ff 81 c4 08 02 00 00 5b 5e 5f 5d c3 <0f> 0b cb 09 9b e9 3f c0 e9 53 fc ff ff 55 57 56 53 83 ec 10 8b [ 522.854819] EIP: [<c02f13f2>] do_cciss_request+0x3f6/0x403 SS:ESP 0068:f7d41a 34 [ 522.862175] <0>Kernel panic - not syncing: Fatal exception in interrupt [ 522.868897] BUG: warning at 
arch/i386/kernel/smp.c:549/smp_call_function() [ 522.875882] [<c010dcb0>] smp_call_function+0x13a/0x13f [ 522.881136] [<c02dcd67>] do_unblank_screen+0x46/0x138 [ 522.886300] [<c010dd0f>] smp_send_stop+0x27/0x32 [ 522.891027] [<c010dcb5>] stop_this_cpu+0x0/0x33 [ 522.895673] [<c011dc1b>] panic+0x64/0x110 [ 522.899799] [<c01041fe>] die+0x204/0x213 [ 522.903842] [<c0104524>] do_invalid_op+0x0/0xab [ 522.908485] [<c01045c6>] do_invalid_op+0xa2/0xab [ 522.913226] [<c0272236>] __rb_erase_color+0xeb/0x18a [ 522.918303] [<c02f13f2>] do_cciss_request+0x3f6/0x403 [ 522.923466] [<c02723c6>] rb_erase+0xf1/0x118 [ 522.927852] [<c0261631>] elv_dispatch_sort+0x20/0x79 [ 522.932933] [<c026a7b2>] as_move_to_dispatch+0xa1/0x144 [ 522.938274] [<c0105521>] do_IRQ+0x61/0xa3 [ 522.942400] [<c026a9c7>] as_dispatch_request+0x172/0x34c [ 522.947823] [<c03b4fc1>] error_code+0x39/0x40 [ 522.952292] [<c026ac49>] as_activate_request+0x0/0x57 [ 522.957459] [<c02f13f2>] do_cciss_request+0x3f6/0x403 [ 522.962660] [<c0115b8f>] activate_task+0x64/0xaf [ 522.967395] [<c0115a5c>] __activate_task+0x23/0x39 [ 522.972299] [<c01163a9>] try_to_wake_up+0x26b/0x31a [ 522.977292] [<c0118302>] __wake_up_common+0x3f/0x60 [ 522.982289] [<c0118302>] __wake_up_common+0x3f/0x60 [ 522.987282] [<c011835d>] __wake_up+0x3a/0x4b [ 522.991671] [<c012eb0b>] __queue_work+0x4b/0x5a [ 522.996317] [<c02699c5>] as_put_io_context+0x3b/0x66 [ 523.001394] [<c0264910>] __freed_request+0x5f/0x8e [ 523.006296] [<c026435a>] blk_start_queue+0x48/0x81 [ 523.011201] [<c02ef754>] cciss_check_queues+0x98/0x10d [ 523.016457] [<c02ef8a4>] cciss_softirq_done+0xdb/0x13f [ 523.021713] [<c02661f6>] blk_done_softirq+0x5c/0x6a [ 523.026701] [<c0123217>] __do_softirq+0x70/0xdf [ 523.031348] [<c01232b9>] do_softirq+0x33/0x35 [ 523.035815] [<c01232fb>] irq_exit+0x40/0x42 [ 523.040112] [<c0105521>] do_IRQ+0x61/0xa3 [ 523.044243] [<c010383e>] common_interrupt+0x1a/0x20 [ 523.049241] [<f884871e>] aes_enc_blk+0x71e/0xb88 [aes_i586] [ 523.054934] 
[<f88d90aa>] crypto_cbc_encrypt_segment+0x63/0x95 [cbc] [ 523.061315] [<f884af44>] aes_encrypt+0x0/0x5 [aes_i586] [ 523.066650] [<f88d944d>] xor_128+0x0/0x1f [cbc] [ 523.071294] [<f88d91a3>] crypto_cbc_encrypt+0x51/0x91 [cbc] [ 523.076980] [<f88d944d>] xor_128+0x0/0x1f [cbc] [ 523.081639] [<f8813390>] crypt_convert_scatterlist+0xe8/0x11f [dm_crypt] [ 523.088456] [<c014e268>] __alloc_pages+0x56/0x30a [ 523.093283] [<f881355e>] crypt_convert+0x144/0x184 [dm_crypt] [ 523.099145] [<c018e071>] bio_alloc_bioset+0x102/0x18b [ 523.104313] [<f8813ab2>] process_write+0xbf/0x153 [dm_crypt] [ 523.110096] [<c012ed47>] run_workqueue+0x76/0xed [ 523.114825] [<f8813ba0>] kcryptd_do_work+0x0/0x28 [dm_crypt] [ 523.120594] [<c012ef1f>] worker_thread+0x161/0x17d [ 523.125499] [<c01182b1>] default_wake_function+0x0/0x12 [ 523.130837] [<c01182b1>] default_wake_function+0x0/0x12 [ 523.136176] [<c012edbe>] worker_thread+0x0/0x17d [ 523.140902] [<c01323b1>] kthread+0xae/0xd3 [ 523.145114] [<c0132303>] kthread+0x0/0xd3 [ 523.149239] [<c01039ef>] kernel_thread_helper+0x7/0x18 [ 523.154490] =======================
Does this work on other controllers?
> Does this work on other controllers? i have no other cciss controller to test with.
I'm sorry, I should have been more specific. I meant does this work on other types of controllers such as scsi or other hw raid controllers?
> I meant does this work on other types of controllers such as scsi or other hw raid controllers? i tried it on another machine today (kernel 2.6.14) there it works on dmcrypt->Symbios Logic 53c896 dmcrypt->3ware 7806 raid (but on a non-raid partition) it also works on IDE disks with all kernels
Reassigning this to Chase. Chase please investigate this issue.
Hermann, I am setting up a system to look into this issue. However, I was wondering if you can try your setup using a non-encrypted filesystem. I want to make sure we can isolate this to using the dm-crypt device on top of the cciss driver. Another thing you may want to try is creating a software raid volume on top of the cciss device using the dm driver. I'll let you know what I find out once I have a system set up and have reproduced the problem. Anything you can do in the meantime to isolate the issue further would be helpful. Chase
hi chase,

of course, i am willing to help! :-)

everything with 2.6.20rc3
--------------------------------------------
Disk /dev/cciss_c0d0: 72.8 GB, 72826629120 bytes
255 heads, 63 sectors/track, 8854 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device             Boot   Start    End     Blocks  Id  System
/dev/cciss_c0d0p1  *          1     13     104391  83  Linux
/dev/cciss_c0d0p2            14   8757   70236180  83  Linux
/dev/cciss_c0d0p3          8758   8854    779152+   5  Extended
/dev/cciss_c0d0p5          8758   8782     200781  fd  Linux raid autodetect
/dev/cciss_c0d0p6          8783   8807     200781  fd  Linux raid autodetect
/dev/cciss_c0d0p7          8808   8832     200781  fd  Linux raid autodetect
/dev/cciss_c0d0p8          8833   8854    176683+  83  Linux
--------------------------------------------
md0 : active raid5 cciss/c0d0p7[2] cciss/c0d0p6[1] cciss/c0d0p5[0]
      401408 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
--------------------------------------------
> uncrypted softraid
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/md0
rsync -vax / /mnt/md0
> does work
--------------------------------------------
> uncrypted normal partition
mkfs.xfs /dev/cciss/c0d0p8
mount /dev/cciss/c0d0p8 /mnt/p8
rsync -vax / /mnt/p8
> does work
--------------------------------------------
> crypted partition on top of softraid
cryptsetup luksFormat /dev/md0
cryptsetup luksOpen /dev/md0 md0
mkfs.xfs /dev/mapper/md0
mount /dev/mapper/md0 /mnt/md0
rsync -vax / /mnt/md0
> does work
--------------------------------------------

should i try anything else? to me it definitely looks like a dm-crypt->cciss bug
Created attachment 10010 [details] serial console output I have added some comments of my own on to the output to explain where it is coming from. The debug output is at the bottom of the log file.
Hermann, I have found that the dm-crypt module is violating our queue limitations for the number of scatter gather elements we can handle per request. I have captured this in the debug output I have attached to this issue. I am going to work with the RedHat dm developers to get this resolved, as this issue does not seem to be present in other dm modules. I have seen some code in these other modules, such as __split_bio, which seems to break up the data based on the low level device limitations as reported by the driver. The comments on the crypt function crypt_alloc_buffer say that it should not violate these limits, but I did not find code to verify that this is true. I found that in the crypt_alloc_buffer function we are seeing bio structures with more than 31 segments, which later get sent to the cciss driver as requests. I will send this bugzilla to the RedHat dm mailing list so that they can keep it up to date. Thanks for your help on this. Chase
Created attachment 10011 [details] serial console output
I see that some people from RedHat have joined this thread. Is RedHat looking into this issue with the dm-crypt module? Do you have any status?
Mike, this looks like a cciss bug.
Andrew, I found with my debug that while the cciss driver set the maximum number of segments to 31 with the block layer, the dm-crypt module is sending us requests with greater than 31 segments. It was my understanding that a stacking driver had to obey the limitations of the underlying drivers. I also noticed that the dm-crypt driver has a note in their code saying not to exceed the limits on the lower level driver, but no actual checks to see if they are doing so or not. That is why I feel this is a dm-crypt bug. Do you agree? Chase
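To illustrate the constraint being discussed, here is a minimal user-space sketch in C. It is not kernel code: it only shows how a stacking layer could cut an oversized I/O into pieces that each stay within a lower driver's segment limit. The names `sketch_requests_needed` and `sketch_split_segments` are hypothetical, and the only number taken from this thread is the limit of 31 segments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant mirroring the 31-segment limit mentioned in this thread. */
#define SKETCH_MAX_SEGMENTS 31

/* Number of requests needed so that none carries more than max_segs segments. */
size_t sketch_requests_needed(size_t nr_segs, size_t max_segs)
{
	assert(max_segs > 0);
	return (nr_segs + max_segs - 1) / max_segs;	/* ceiling division */
}

/*
 * Write the segment count of each piece into out[].
 * Returns the number of pieces, or 0 if out[] is too small
 * (or if there is nothing to split).
 */
size_t sketch_split_segments(size_t nr_segs, size_t max_segs,
			     size_t *out, size_t out_len)
{
	size_t needed = sketch_requests_needed(nr_segs, max_segs);
	size_t i;

	if (needed > out_len)
		return 0;

	for (i = 0; i < needed; i++) {
		/* Each piece takes as many segments as the limit allows. */
		size_t take = nr_segs < max_segs ? nr_segs : max_segs;

		out[i] = take;
		nr_segs -= take;
	}
	return needed;
}
```

With a limit of 31, an I/O of 70 segments would go down as three pieces of 31, 31 and 8 segments, so the lower driver never sees more than it advertised.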
Does this patch help? http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-merge-max_hw_sector.patch
That sounds right. Jens, can you please confirm?
Yep, looks right, it's a dm bug. Why isn't dm using blk_queue_stack_limits()?
We need to revisit this dm code. dm uses blk_queue_stack_limits() where it can, but at this point it doesn't have a struct request_queue to manipulate: it's tracking what the restrictions would be should the proposed device structure become live at a later time. Perhaps we could abstract the few fields required into an embedded struct something like dm's 'struct io_restrictions' and pass that into blk_queue_stack_limits() instead of maintaining duplicate functionality.
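The idea of passing a small limits structure around, instead of a full request queue, could look roughly like the following user-space sketch. Everything in it is an assumption for illustration: `struct sketch_limits` and its helpers are hypothetical names, the field set is only a guess at "the few fields required", and neither the real `blk_queue_stack_limits()` nor dm's `struct io_restrictions` is reproduced here.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the queue fields that matter when stacking. */
struct sketch_limits {
	unsigned int max_sectors;
	unsigned short max_phys_segments;
	unsigned short max_hw_segments;
};

static unsigned int sketch_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Start fully permissive; every stacked device can only tighten this. */
void sketch_limits_init(struct sketch_limits *l)
{
	assert(l != 0);
	l->max_sectors = UINT_MAX;
	l->max_phys_segments = USHRT_MAX;
	l->max_hw_segments = USHRT_MAX;
}

/* Tighten 'top' so it never promises more than 'bottom' can accept. */
void sketch_limits_stack(struct sketch_limits *top,
			 const struct sketch_limits *bottom)
{
	top->max_sectors = sketch_min(top->max_sectors, bottom->max_sectors);
	top->max_phys_segments = (unsigned short)
		sketch_min(top->max_phys_segments, bottom->max_phys_segments);
	top->max_hw_segments = (unsigned short)
		sketch_min(top->max_hw_segments, bottom->max_hw_segments);
}

/* Nonzero if a request of this shape respects the combined limits. */
int sketch_request_fits(const struct sketch_limits *l,
			unsigned int sectors, unsigned int segments)
{
	return sectors <= l->max_sectors &&
	       segments <= l->max_phys_segments &&
	       segments <= l->max_hw_segments;
}
```

Because the structure does not depend on a live queue, a table of proposed devices could be folded into one `sketch_limits` before anything becomes active, and the result applied to the real queue later.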
Hello, I have a similar issue, but mine happens on a fat32 (vfat) partition and I can't reproduce it reliably. Sometimes I hit this bug once a day, sometimes once a week, but it usually happens when my HDD is under heavy load. Should I also try the patch?

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c01cf741
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: sha256 aes cbc blkcipher cryptomgr crypto_algapi ndiswrapper xt_limit ipt_LOG xt_state xt_tcpudp iptable_filter iptable_mangle ipt_MASQUERADE iptable_nat ip_nat ip_conntrack ip_tables x_tables snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss dm_crypt dm_mod radeonfb snd_ca0106 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd soundcore via686a hwmon i2c_isa i2c_viapro snd_ac97_bus i2c_core pcspkr snd_page_alloc
CPU: 0
EIP: 0060:[<c01cf741>] Tainted: P VLI
EFLAGS: 00010297 (2.6.19-gentoo-r4 #1)
EIP is at blk_recount_segments+0x6e/0x227
eax: 0000001f ebx: 00000000 ecx: dbebffa0 edx: 00000000
esi: da900c80 edi: 00000000 ebp: c5d80100 esp: ef2b1e84
ds: 007b es: 007b ss: 0068
Process kcryptd/0 (pid: 3198, ti=ef2b0000 task=eee34070 task.ti=ef2b0000)
Stack: 00000000 0270b62a 00000000 00000000 00000000 00000000 00000001 c013b74d
       00000001 00011200 00000000 eee34070 c03cc440 c01de763 dbebffa0 da900c80
       c5d80100 dc088360 c0175617 c17d9804 dbebffa0 00000600 c17d9804 dbebffa0
Call Trace:
 [<c013b74d>] mempool_alloc+0x29/0xf1
 [<c01de763>] _mmx_memcpy+0x3f/0x13a
 [<c0175617>] __bio_clone+0x8f/0xb5
 [<f087c974>] kcryptd_do_work+0x1fb/0x38a [dm_crypt]
 [<c012334c>] run_workqueue+0x91/0xed
 [<f087c779>] kcryptd_do_work+0x0/0x38a [dm_crypt]
 [<c0123921>] worker_thread+0x111/0x144
 [<c0111ee4>] default_wake_function+0x0/0x15
 [<c0123810>] worker_thread+0x0/0x144
 [<c012620a>] kthread+0xc5/0xf3
 [<c0126145>] kthread+0x0/0xf3
 [<c01035df>] kernel_thread_helper+0x7/0x10
 =======================
Code: 00 00 c7 44 24 14 00 00 00 00 c7 44 24 20 01 00 00 00 c7 44 24 28 00 00 00 00 89 04 24 6b c0 0c 8d 2c 02 e9 62 01 00 00 8b 7d 00 <8b> 07 89 f9 c1 e8 1a c1 e0 02 8d 90 e0 f5 3c c0 8b 80 e0 f5 3c
EIP: [<c01cf741>] blk_recount_segments+0x6e/0x227 SS:ESP 0068:ef2b1e84
The patch doesn't work for me. I caught the bug again while unzipping a file.
Can anyone suggest what I should do? The kernel bug I hit is different (panic vs. null pointer, cciss vs. vfat), but in general the issue seems quite similar (hdd/dm-crypt access error under heavy load). Should I open a new kernel bug entry? This is quite important for me: when I hit this bug, I lose the contents of the directory in which the process that hung (due to this bug) was working.
I think this is a duplicate of Bug #5948. Please try the patches I posted to dm-devel a few weeks ago; they're also available from agk's patch queue: http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/ In particular, apply these patches: dm-call-clone_init-early.patch dm-crypt-fix-avoid-cloned-bio-ref-after-free.patch dm-crypt-fix-call-to-clone_init.patch dm-crypt-fix-remove-first_clone.patch and try to reproduce the problem
Hello folks! We have the same issue: cciss + dm_crypt kills the system during mkfs.xfs, or later during rsync after a short time (<30s). We tried the patch from http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-merge-max_hw_sector.patch The only thing that worked was mkfs.xfs (in a luksformat batch); rsyncing failed after 20 seconds. Before this patch the system always got killed shortly after mkfs.xfs. The CPU has plenty of power (~6000 bogomips), so encryption does not really slow down I/O. It cannot be a dm_crypt issue, because we tried truecrypt, which encrypted the whole device /dev/cciss/c0d0p1 (~300GB) without a problem. The raid itself is handled natively by the Smart Array Controller 5302/128. After dev-mapping and mounting the truecrypt volume, we ran mkfs.xfs /dev/mapper/truecrypt0, which ran through. But shortly after starting rsync it also ran into trouble. It seems that there is a bug in either the cciss driver or the dm driver. If you have questions, just ask :-) Kind regards, Markus
Markus, Peter, Could you test patches that Olaf submitted and mentioned in #24 please? This will greatly help further work on this bug. Thanks, --Natalie
Ok, I will try. Maybe I will do some tests today, but it's a little frightening, as I'm losing the contents of the whole directory in which the working process hung. So please keep your fingers crossed :)
Ok, so here I am again. Can someone tell me whether I have to attach these "kernel NULL pointer" oopses? They make this bug report very long. My setup: 2.6.19-gentoo-r5; on a vfat (fat32) partition I created a file, set it up as a loop device, created a LUKS device on top of that, and created another vfat filesystem on the LUKS device. In my previous report I used the LUKS device directly on the hdd partition, but I can't do that any more. Before applying those 4 patches I hit the bug again:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c01cf741
*pde = 29f4a067
*pte = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: sha256 aes cbc blkcipher cryptomgr loop sha512 crypto_algapi ndiswrapper ipt_MASQUERADE iptable_nat ip_nat iptable_mangle xt_limit ipt_LOG xt_state ip_conntrack xt_tcpudp iptable_filter ip_tables x_tables snd_seq_midi snd_seq_midi_event snd_seq radeon drm dm_crypt dm_mod snd_ca0106 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd soundcore via686a hwmon snd_ac97_bus i2c_isa pcspkr i2c_viapro snd_page_alloc via_agp agpgart
CPU: 0
EIP: 0060:[<c01cf741>] Tainted: P VLI
EFLAGS: 00010297 (2.6.19-gentoo-r5 #14)
EIP is at blk_recount_segments+0x6e/0x227
eax: 0000000d ebx: 00000000 ecx: c5969aa0 edx: 00000000
esi: ef04a040 edi: 00000000 ebp: c87f7340 esp: efc63e84
ds: 007b es: 007b ss: 0068
Process kcryptd/0 (pid: 3207, ti=efc62000 task=c16fe070 task.ti=efc62000)
Stack: 00000000 02752636 00000000 00000000 00000000 00000000 00000000 c013b74d
       00000001 00011200 00000000 c16fe070 c03e7440 c01de763 c5969aa0 ef04a040
       c87f7340 c5969420 c0175617 c92d1884 c5969aa0 00000300 c92d1884 c5969aa0
Call Trace:
 [<c013b74d>] mempool_alloc+0x29/0xf1
 [<c01de763>] _mmx_memcpy+0x3f/0x13a
 [<c0175617>] __bio_clone+0x8f/0xb5
 [<f1887974>] kcryptd_do_work+0x1fb/0x38a [dm_crypt]
 [<c012334c>] run_workqueue+0x91/0xed
 [<f1887779>] kcryptd_do_work+0x0/0x38a [dm_crypt]
 [<c0123921>] worker_thread+0x111/0x144
 [<c0111ee4>] default_wake_function+0x0/0x15
 [<c0123810>] worker_thread+0x0/0x144
 [<c012620a>] kthread+0xc5/0xf3
 [<c0126145>] kthread+0x0/0xf3
 [<c01035df>] kernel_thread_helper+0x7/0x10
 =======================
Code: 00 00 c7 44 24 14 00 00 00 00 c7 44 24 20 01 00 00 00 c7 44 24 28 00 00 00 00 89 04 24 6b c0 0c 8d 2c 02 e9 62 01 00 00 8b 7d 00 <8b> 07 89 f9 c1 e8 1a c1 e0 02 8d 90 e0 a5 3e c0 8b 80 e0 a5 3e
EIP: [<c01cf741>] blk_recount_segments+0x6e/0x227 SS:ESP 0068:efc63e84

Now... I will reboot and patch the kernel...
I couldn't find one of the patches listed below for the 2.6.19 branch, so I switched to 2.6.21. Without patches it is still the same case. Now I'm patching 2.6.21 with these:
1) dm-call-clone_init-early.patch
2) dm-crypt-fix-avoid-cloned-bio-ref-after-free.patch
3) dm-crypt-fix-call-to-clone_init.patch
4) dm-crypt-fix-remove-first_clone.patch
but 1) seems to conflict with 3). Afraid of strange results, I didn't apply it.

# patch -b -p1 < dm-call-clone_init-early.patch
patching file drivers/md/dm-crypt.c
Reversed (or previously applied) patch detected! Assume -R? [n] n
Apply anyway? [n] n
Skipping patch.
7 out of 7 hunks ignored -- saving rejects to file drivers/md/dm-crypt.c.rej

Can someone describe in detail how I should apply them? Are these patches a consistent set that has to be applied together, or can they be applied separately?
Olaf, can you please help to sort those out? Btw are they in -mm yet so that Peter can just use the kernel.org latest tree? Thanks.
I just verified the patches are in 2.6.22-rc5. Markus, Peter, Hermann - can you please test whether your issue(s) have been resolved? Thanks.
I have been testing 2.6.22-rc5 for 2 weeks and haven't hit the bug so far! Good work. PLEASE NOTE, however, that my issue was triggered _randomly under heavy HDD load_. In the worst case I hit the bug once per week, in the best case a few times a day. Because of that, there is of course still a small chance that the bug is there and I have just been unlucky in triggering it. I have been running my test case for about 2 weeks now and everything seems to work. Of course, if I hit the bug again, I will report it here. Moreover, my test case uses a file looped as a device, as I described in #28. Thanks for your efforts!
Till today I didn't catch any null pointer bug. Unfortunately today after mounting like this: mount /dev/mapper/l0 /mnt/test/ -o rw,umask=000 I've found this: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c013a47c *pde = 29867067 *pte = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: sha256 aes cbc blkcipher cryptomgr crypto_algapi loop nls_utf8 ntfs ndiswrapper xt_tcpudp iptable_filter iptable_mangle ip_tables x_tables snd_seq_midi snd_seq_midi_event snd_seq pcspkr i2c_dev radeon drm dm_crypt dm_mod snd_ca0106 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd soundcore ac97_bus snd_page_alloc via_agp agpgart via686a hwmon i2c_isa i2c_viapro CPU: 0 EIP: 0060:[<c013a47c>] Tainted: P VLI EFLAGS: 00010292 (2.6.22-rc5 #1) EIP is at mempool_free+0xa/0x86 eax: e1509c3c ebx: ef617f64 ecx: 00000000 edx: 00000000 esi: 00000000 edi: e1509c3c ebp: f08326c9 esp: ef617f1c ds: 007b es: 007b fs: 0000 gs: 0000 ss: 0068 Process kcryptd/0 (pid: 3332, ti=ef616000 task=eff1f5d0 task.ti=ef616000) Stack: e1509c44 ef617f64 ed1656a0 e1509c44 f0832760 eff81ce0 0000007c 00000000 c0164336 eff81ce0 e1509c3c 00000002 eff1f5d0 94a96a80 00000004 ef3bb140 eff1f5d0 c03d0e10 eff81ce0 eff81ce0 00000000 00000000 00000001 00000001 Call Trace: [<f0832760>] kcryptd_do_work+0x97/0x2d8 [dm_crypt] [<c0164336>] mntput_no_expire+0x11/0x73 [<f08326c9>] kcryptd_do_work+0x0/0x2d8 [dm_crypt] [<c011fd51>] run_workqueue+0x8c/0x128 [<c0120284>] worker_thread+0x0/0xbc [<c0120336>] worker_thread+0xb2/0xbc [<c01228f3>] autoremove_wake_function+0x0/0x35 [<c012282f>] kthread+0x36/0x5b [<c01227f9>] kthread+0x0/0x5b [<c0103007>] kernel_thread_helper+0x7/0x10 ======================= Code: e8 43 7f 1a 00 89 d8 89 e2 e8 05 85 fe ff 89 e8 e9 4c ff ff ff 31 f6 83 c4 14 89 f0 5b 5e 5f 5d c3 57 89 c7 56 89 d6 53 83 ec 04 <8b> 02 39 42 04 7d 68 9c 5b fa 89 e0 25 00 e0 ff ff ff 40 14 8b EIP: [<c013a47c>] mempool_free+0xa/0x86 SS:ESP 0068:ef617f1c
This Oops is probably bug 7388.
Please try the attached patch: http://bugzilla.kernel.org/show_bug.cgi?id=7388#c11

Milan

> ------- Comment #33 from pitrasw@wp.pl 2007-07-31 00:14 -------
> Till today I didn't catch any null pointer bug. Unfortunately today after
> mounting like this:
>
> mount /dev/mapper/l0 /mnt/test/ -o rw,umask=000
>
> I've found this:
>
> BUG: unable to handle kernel NULL pointer dereference at virtual address
Works, and it hasn't O-o-oopsed for a few days now. Thanks for the fast reply. Which kernel version will include the fix for this 7388 bug?
I've patched my kernel with the patch from #34. After that I didn't catch any bug like the one from #33 but today after:

mount /dev/mapper/l0 /mnt/test/ -o rw,umask=000

invoked right after KDE started I've found this (paging request):

BUG: unable to handle kernel paging request at virtual address 0100018f
 printing eip:
c013a47c
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: sha256 aes cbc blkcipher cryptomgr crypto_algapi loop ndiswrapper xt_tcpudp iptable_filter iptable_mangle ip_tables x_tables snd_seq_midi snd_seq_midi_event snd_seq pcspkr i2c_dev radeon drm dm_crypt dm_mod snd_ca0106 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd soundcore ac97_bus snd_page_alloc via686a hwmon i2c_isa i2c_viapro via_agp agpgart
CPU: 0
EIP: 0060:[<c013a47c>] Tainted: P VLI
EFLAGS: 00010292 (2.6.22-rc5 #1)
EIP is at mempool_free+0xa/0x86
eax: d6cbbc3c ebx: ef225f64 ecx: 00000000 edx: 0100018f
esi: 0100018f edi: d6cbbc3c ebp: f08346c9 esp: ef225f1c
ds: 007b es: 007b fs: 0000 gs: 0000 ss: 0068
Process kcryptd/0 (pid: 3319, ti=ef224000 task=eee870d0 task.ti=ef224000)
Stack: d6cbbc44 ef225f64 d9d50980 d6cbbc44 f0834760 eff81360 0000007c 00000000
       c0164336 eff81360 d6cbbc3c 00000002 eee870d0 b8112300 00000004 eeea2a00
       eee870d0 c03d0998 eff81360 eff81360 00000000 00000000 00000001 00000001
Call Trace:
 [<f0834760>] kcryptd_do_work+0x97/0x2d8 [dm_crypt]
 [<c0164336>] mntput_no_expire+0x11/0x73
 [<f08346c9>] kcryptd_do_work+0x0/0x2d8 [dm_crypt]
 [<c011fd51>] run_workqueue+0x8c/0x128
 [<c0120284>] worker_thread+0x0/0xbc
 [<c0120336>] worker_thread+0xb2/0xbc
 [<c01228f3>] autoremove_wake_function+0x0/0x35
 [<c012282f>] kthread+0x36/0x5b
 [<c01227f9>] kthread+0x0/0x5b
 [<c0103007>] kernel_thread_helper+0x7/0x10
 =======================
Code: e8 43 7f 1a 00 89 d8 89 e2 e8 05 85 fe ff 89 e8 e9 4c ff ff ff 31 f6 83 c4 14 89 f0 5b 5e 5f 5d c3 57 89 c7 56 89 d6 53 83 ec 04 <8b> 02 39 42 04 7d 68 9c 5b fa 89 e0 25 00 e0 ff ff ff 40 14 8b
EIP: [<c013a47c>] mempool_free+0xa/0x86 SS:ESP 0068:ef225f1c
Milan, when it is ready, can you please get that patch out onto the mailing list(s) in the normal fashion and cc myself? Thanks.
It seems that there is another problem left, please could you test attached patch (apply over previous one from comment #34) and test it together ?

Milan
-----
From: Milan Broz <mbroz@redhat.com>

Fix releasing dm_crypt_io structure so the endio() for the base bio
is called in the end of processing.

Because there is only one workqueue and read request is processed
in two steps, flush_workqueue is not enough because it can generate
new post-read processing request.

This design will be replaced soon with two private singlethread queues
but for now this patch fixes possible mempool release problems
during crypt device destroy and some read request pending.

Signed-off-by: Milan Broz <mbroz@redhat.com>
---
 drivers/md/dm-crypt.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6.22/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.22.orig/drivers/md/dm-crypt.c	2007-08-06 17:21:54.000000000 +0200
+++ linux-2.6.22/drivers/md/dm-crypt.c	2007-08-15 14:43:01.000000000 +0200
@@ -481,7 +481,9 @@ static void crypt_free_buffer_pages(stru
  */
 static void dec_pending(struct crypt_io *io, int error)
 {
-	struct crypt_config *cc = (struct crypt_config *) io->target->private;
+	struct crypt_config *cc = io->target->private;
+	struct bio *base_bio;
+	int io_error;

 	if (error < 0)
 		io->error = error;
@@ -489,9 +491,12 @@ static void dec_pending(struct crypt_io
 	if (!atomic_dec_and_test(&io->pending))
 		return;

-	bio_endio(io->base_bio, io->base_bio->bi_size, io->error);
+	base_bio = io->base_bio;
+	io_error = io->error;

 	mempool_free(io, cc->io_pool);
+
+	bio_endio(base_bio, base_bio->bi_size, io_error);
 }

 /*
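For readers following the ordering change in dec_pending(), here is a minimal userspace sketch of the same idea: copy what is still needed out of the per-request structure, return the structure to its allocator, and only then signal completion. Everything in it is a hypothetical stand-in (fake_bio, fake_io, fake_endio, fake_dec_pending, fake_io_new are not kernel names), free() merely plays the role of mempool_free(), and a plain int replaces the atomic pending counter, so this is an illustration under those assumptions and not the dm-crypt code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the base bio. */
struct fake_bio {
    unsigned int size;      /* bytes covered by the request */
    int completed;          /* set to 1 by fake_endio() */
    int completed_error;    /* error code passed to fake_endio() */
};

/* Hypothetical stand-in for the per-request crypt_io structure. */
struct fake_io {
    struct fake_bio *base_bio;
    int error;
    int pending;            /* plain int here; the kernel uses an atomic_t */
};

static struct fake_io *fake_io_new(struct fake_bio *bio, int pending)
{
    struct fake_io *io = malloc(sizeof(*io));

    if (!io)
        return NULL;
    io->base_bio = bio;
    io->error = 0;
    io->pending = pending;
    return io;
}

/* Stand-in for bio_endio(): records that the request was completed. */
static void fake_endio(struct fake_bio *bio, unsigned int bytes, int error)
{
    (void)bytes;
    bio->completed = 1;
    bio->completed_error = error;
}

/* Returns 1 if this call completed the request, 0 if work is still pending. */
static int fake_dec_pending(struct fake_io *io, int error)
{
    struct fake_bio *base_bio;
    int io_error;

    assert(io->pending > 0);

    if (error < 0)
        io->error = error;

    if (--io->pending != 0)
        return 0;

    /* Save the fields first: io must not be touched after it is released. */
    base_bio = io->base_bio;
    io_error = io->error;

    free(io);   /* plays the role of mempool_free(io, cc->io_pool) */

    /* Completion runs last, using only the saved copies. */
    fake_endio(base_bio, base_bio->size, io_error);
    return 1;
}
```

With this ordering, whoever is woken by the completion can no longer observe a request structure that has not yet gone back to its pool, which is the situation the patch description calls a mempool release problem during device destroy.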
It seems that I messed something up while patching the kernel with the first patch (the one from #34), so the bug reported in #36 may not be valid. Sorry. I have moved to 2.6.22 and patched it with the first patch (#34). I have been trying to trigger the last kernel bug (#36), but so far without luck. I think I won't apply the next patch (#38) until I catch the bug again.
> ------- Comment #39 from pitrasw@wp.pl 2007-08-15 13:58 -------
> It seems that I've messed something while patching the kernel with the first
> patch (that from #34) and because of that bug from #36 can be inappropriate.

OK, I checked the second patch again and it only solves the same situation in a different way; my comment was misleading. If there is still running I/O, dm should not try to destroy the mapping. (Or it is a bug and should be fixed instead of being hidden by this patch.)

The patch mentioned in comment #34 (flush_workqueue()) should be sufficient to solve this. It is already committed to 2.6.23-rc2 (80b16c192e469541263d6bfd9177662ceb632ecc) and is also in stable 2.6.22.2 (2d68c23353ff6e72ca62a4d355f09332382d6796).

I think that this bug can be closed (together with bug 7388).

Milan
ok, thanks.
this bug is not fixed for me! i tried with 2.6.23-rc3, which should include the above fix. now my system sometimes BUGs as soon as i mount the xfs->luks->cciss partition: mount returns a segfault and the partition is no longer usable after this. and if (after a reboot) i try a massive copy to this partition, i get a panic 100% of the time!

logs of the mount bug and the rsync panic are attached.

steps to reproduce:

# cryptsetup luksOpen /dev/cciss/c0d0p6 p6
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
# mount /dev/mapper/p6 /mnt/p6
# rsync -ax / /mnt/p6
Created attachment 12439 [details] serial log of the kernel bug @ mount
Created attachment 12440 [details] panic @ rsync-copy to this partition
kernel BUG at drivers/block/cciss.c:2537
...
EIP is at do_cciss_request+0x382/0x3e0

So we are back in cciss driver code, Mike ? (dm-crypt bug in comment #36 is fixed)
Reading comment #12: so dm-crypt is violating the underlying block device's queue limits here? Will the patch from comment #17 help, or is there a need to implement other restrictions for stacked (dm-crypt) devices?
Milan, it looks like: http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-merge-max_hw_sector.patch is STILL not merged... I'd say that's worth a shot.
bugme-daemon@bugzilla.kernel.org wrote:
> is STILL not merged... I'd say that's worth a shot.

Yes, I know. BTW, for some experimental targets these lines were added to set up the correct limit:

@@ static void check_for_valid_limits(struc
 {
+	if (!rs->max_hw_sectors)
+		rs->max_hw_sectors = SAFE_MAX_SECTORS;

And IIRC Alasdair still has no direct bug report proving that this patch really helps. So maybe this is the one we are searching for. Please verify whether this patch really fixes the problem above.

Milan
Mike/Chase mentioned it was the segment count being too big, which looks like it's correctly tested and set in the check_for_valid_limits(). So perhaps it's something else. I guess generic_make_request() should incorporate a limit check, for each resolved step to catch these things.
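The per-step limit check suggested here could look roughly like the sketch below. It is a userspace illustration only: the structures, field names, and functions (queue_limits_sketch, request_sketch, request_within_limits, first_violating_layer) are hypothetical and are not the block layer's real types, and generic_make_request() does not contain this code. The idea is simply to compare a request against the limits of every layer it is remapped through and report the first layer whose limits it breaks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue limits; the real block layer tracks more fields. */
struct queue_limits_sketch {
    unsigned int max_sectors;
    unsigned int max_phys_segments;
};

/* Hypothetical summary of one request as it arrives at a layer. */
struct request_sketch {
    unsigned int nr_sectors;
    unsigned int nr_phys_segments;
};

/* Nonzero if the request respects both limits of this layer. */
static int request_within_limits(const struct request_sketch *rq,
                                 const struct queue_limits_sketch *lim)
{
    assert(rq != NULL && lim != NULL);
    return rq->nr_sectors <= lim->max_sectors &&
           rq->nr_phys_segments <= lim->max_phys_segments;
}

/*
 * Walk a device stack from the top (index 0) to the bottom and return the
 * index of the first layer whose limits the request violates, or -1 if the
 * request fits everywhere.
 */
static int first_violating_layer(const struct request_sketch *rq,
                                 const struct queue_limits_sketch *stack,
                                 size_t depth)
{
    size_t i;

    for (i = 0; i < depth; i++) {
        if (!request_within_limits(rq, &stack[i]))
            return (int)i;
    }
    return -1;
}
```

A check of this kind would turn a violation into an early, attributable failure at the layer that caused it, instead of a BUG deep inside the low-level driver.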
I tried to submit that patch but it was held back because there was no evidence it genuinely fixed anything. Every time we suggest that someone try that patch and report back whether it helps, we never seem to get a positive reply. It can't do any *harm* though, so I shall submit it again regardless of the lack of evidence, so we can at least rule it out. There's also the recent bounce_pfn workaround.
Definitely bug in dm-crypt, fix verified on Areca HBA, which panics in the same situation - XFS over dm-crypt because of phys. segments violation request.

Please could you test this patch on cciss ?

Milan
--
From: Milan Broz <mbroz@redhat.com>

Fix possible max_phys_segments violation in cloned dm-crypt bio.

In write operation dm-crypt needs to allocate new bio request
and run crypto operation on this clone. Cloned request has always
the same size, but number of physical segments can be increased
and violate max_phys_segments restriction.

This can lead to data corruption and serious hardware malfunction.
This was observed when using XFS over dm-crypt and at least
two HBA controller drivers (arcmsr, cciss) recently.

Fix it by using bio_add_page() call (which performs all restrictions
validation) instead of constructing own biovec.

(All versions of dm-crypt are affected by this bug.)

Signed-off-by: Milan Broz <mbroz@redhat.com>
---
 drivers/md/dm-crypt.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

Index: linux-2.6.24/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.24.orig/drivers/md/dm-crypt.c	2007-11-17 06:16:36.000000000 +0100
+++ linux-2.6.24/drivers/md/dm-crypt.c	2007-12-01 15:41:51.000000000 +0100
@@ -399,6 +399,8 @@ static struct bio *crypt_alloc_buffer(st
 	unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	gfp_t gfp_mask = GFP_NOIO | __GFP_HIGHMEM;
 	unsigned int i;
+	struct page *page;
+	unsigned long len;

 	clone = bio_alloc_bioset(GFP_NOIO, nr_iovecs, cc->bs);
 	if (!clone)
@@ -407,10 +409,8 @@ static struct bio *crypt_alloc_buffer(st
 	clone_init(io, clone);

 	for (i = 0; i < nr_iovecs; i++) {
-		struct bio_vec *bv = bio_iovec_idx(clone, i);
-
-		bv->bv_page = mempool_alloc(cc->page_pool, gfp_mask);
-		if (!bv->bv_page)
+		page = mempool_alloc(cc->page_pool, gfp_mask);
+		if (!page)
 			break;

 		/*
@@ -421,15 +421,14 @@ static struct bio *crypt_alloc_buffer(st
 		if (i == (MIN_BIO_PAGES - 1))
 			gfp_mask = (gfp_mask | __GFP_NOWARN) & ~__GFP_WAIT;

-		bv->bv_offset = 0;
-		if (size > PAGE_SIZE)
-			bv->bv_len = PAGE_SIZE;
-		else
-			bv->bv_len = size;
+		len = (size > PAGE_SIZE) ? PAGE_SIZE : size;
+
+		if (!bio_add_page(clone, page, len, 0)) {
+			mempool_free(page, cc->page_pool);
+			break;
+		}

-		clone->bi_size += bv->bv_len;
-		clone->bi_vcnt++;
-		size -= bv->bv_len;
+		size -= len;
 	}

 	if (!clone->bi_size) {
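To illustrate why a clone of the same size can carry more physical segments than the original, and why a checked add call avoids the violation, here is a simplified userspace model. It is a sketch under stated assumptions, not the kernel's bio_add_page(): bio_sketch, add_page_checked, fill_clone, and SKETCH_PAGE_SIZE are hypothetical names, pages are identified by a bare page frame number, every page starts at offset 0, and two pages count as one segment only when their frame numbers are consecutive.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Hypothetical, heavily reduced model of a bio under construction. */
struct bio_sketch {
    unsigned int size;        /* bytes added so far */
    unsigned int segments;    /* physical segments used so far */
    unsigned long last_pfn;   /* frame number of the last page added */
    int has_pages;            /* 0 until the first page is added */
};

/*
 * Add one page if that keeps the segment count within max_segments.
 * Returns len on success and 0 if the page was refused, mirroring the
 * "returns the number of bytes added" convention used in the patch.
 */
static unsigned int add_page_checked(struct bio_sketch *bio, unsigned long pfn,
                                     unsigned int len, unsigned int max_segments)
{
    /* A page directly following the previous one extends the same segment. */
    int merges = bio->has_pages && pfn == bio->last_pfn + 1;

    if (!merges && bio->segments >= max_segments)
        return 0;
    if (!merges)
        bio->segments++;
    bio->size += len;
    bio->last_pfn = pfn;
    bio->has_pages = 1;
    return len;
}

/*
 * Fill a clone with up to `size` bytes from freshly allocated pages and stop
 * early once a page is refused. Returns the number of bytes in the clone.
 */
static unsigned int fill_clone(struct bio_sketch *clone, const unsigned long *pfns,
                               size_t nr_pfns, unsigned int size,
                               unsigned int max_segments)
{
    size_t i;

    assert(clone != NULL && pfns != NULL);
    for (i = 0; i < nr_pfns && size > 0; i++) {
        unsigned int len = size > SKETCH_PAGE_SIZE ? SKETCH_PAGE_SIZE : size;

        if (!add_page_checked(clone, pfns[i], len, max_segments))
            break;   /* the caller would free the refused page here */
        size -= len;
    }
    return clone->size;
}
```

In this model, five contiguous pages fit into a single segment, while five scattered pool pages need five. With a limit of three segments the checked loop ends with a shorter clone instead of an oversized one, which corresponds to the early break in the patched loop.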
Created attachment 13811 [details] Proposed patch

It seems the patch I sent was reformatted by mistake; here it is as an attachment instead, sorry.

Milan
the patch from milan solves my problem! :-) i think this bug can be closed now. i tried with today's git kernel (2.6.24-rc5+), and the rsync copy worked as expected! :-))
btw: i hope this patch goes mainline soon! :-)
> btw: i hope this patch goes mainline soon! :-)

already on its way: http://lkml.org/lkml/2007/12/13/220