Bug 25632 - I/O requests hanging on LVs on top of software RAID 5+0
Summary: I/O requests hanging on LVs on top of software RAID 5+0
Status: RESOLVED INSUFFICIENT_DATA
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD (show other bugs)
Hardware: All Linux
: P1 high
Assignee: io_md
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-26 04:45 UTC by Travers Carter
Modified: 2012-08-14 15:09 UTC (History)
4 users (show)

See Also:
Kernel Version: 2.6.36
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Travers Carter 2010-12-26 04:45:50 UTC
Summary
After some time I/O requests appear to get permanently stuck at either the LVM or MD layer, iostat shows pending requests not moving against the associated device, but the problem only seems to occur for the volume group that is built on a software raid 5+0 array.

Once the problem occurs all I/O to the affected device(s) stops and any processes trying to access it hang until the system is rebooted, processing not using the affected device are unaffected.

Steps to reproduce
Unknown, the systems hosts ~14 KVM guests (the guests are mostly CentOS 5), and the problem typically occurs within 2 or 3 days, and often, but not always at around 4:30am when many of the guests are probably running I/O intensive processes like backups & database maintenance.

Other Information
The hardware is an IBM x3650 M2 w/ 60G of RAM and 6xSAS drives attached to:
01:00.0 SCSI storage controller [0100]: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS [1000:0058] (rev 08)

When the problem occurs the system reports very high loads (typically in the range of 80 to 120)

The RAID layout is two 3-disk raid-5 arrays with a raid-0 on top of them.


dmesg
-----
Dec 25 04:35:55 fractus kernel: INFO: task jbd2/dm-4-8:1373 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: jbd2/dm-4-8   D 0000000000000000     0  1373      2 0x00000000
Dec 25 04:35:55 fractus kernel: ffff88075df3dd20 0000000000000046 ffff88075df3dcd0 ffff88075d230880
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88075e9e4b40 0000000000015c80 ffff88075e9e50d0
Dec 25 04:35:55 fractus kernel: ffff88075df3dfd8 ffff88075e9e50d8 ffff88075df3c010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffffa01199c4>] jbd2_journal_commit_transaction+0x1c4/0x1480 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff8104f402>] ? finish_task_switch+0x42/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffff8106f37c>] ? lock_timer_base+0x3c/0x70
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffffa011e9c8>] kjournald2+0xb8/0x220 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffffa011e910>] ? kjournald2+0x0/0x220 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff81080a36>] kthread+0x96/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100bec4>] kernel_thread_helper+0x4/0x10
Dec 25 04:35:55 fractus kernel: [<ffffffff810809a0>] ? kthread+0x0/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100bec0>] ? kernel_thread_helper+0x0/0x10
Dec 25 04:35:55 fractus kernel: INFO: task flush-253:4:1870 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: flush-253:4   D 0000000000000000     0  1870      2 0x00000000
Dec 25 04:35:55 fractus kernel: ffff880753cfd980 0000000000000046 ffff8807595e2050 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88075d060b00 0000000000015c80 ffff88075d061090
Dec 25 04:35:55 fractus kernel: ffff880753cfdfd8 ffff88075d061098 ffff880753cfc010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffffa0117f6c>] start_this_handle+0x1dc/0x520 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffffa01184ab>] jbd2__journal_start+0xbb/0x100 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa0118503>] jbd2_journal_start+0x13/0x20 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa015acb0>] ext4_journal_start_sb+0xf0/0x130 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa0145165>] ? ext4_meta_trans_blocks+0x75/0xf0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa0148c38>] ext4_da_writepages+0x268/0x630 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff810fe480>] ? __writepage+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffff8100bc0e>] ? call_function_single_interrupt+0xe/0x20
Dec 25 04:35:55 fractus kernel: [<ffffffff810ff9d1>] do_writepages+0x21/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffff8116c3ce>] writeback_single_inode+0x9e/0x2a0
Dec 25 04:35:55 fractus kernel: [<ffffffff8116c7ee>] writeback_sb_inodes+0xde/0x180
Dec 25 04:35:55 fractus kernel: [<ffffffff8116c92d>] writeback_inodes_wb+0x9d/0x160
Dec 25 04:35:55 fractus kernel: [<ffffffff8116cc6b>] wb_writeback+0x27b/0x3f0
Dec 25 04:35:55 fractus kernel: [<ffffffff8106f37c>] ? lock_timer_base+0x3c/0x70
Dec 25 04:35:55 fractus kernel: [<ffffffff8116cf7d>] wb_do_writeback+0x19d/0x1e0
Dec 25 04:35:55 fractus kernel: [<ffffffff8116d072>] bdi_writeback_thread+0xb2/0x270
Dec 25 04:35:55 fractus kernel: [<ffffffff8116cfc0>] ? bdi_writeback_thread+0x0/0x270
Dec 25 04:35:55 fractus kernel: [<ffffffff8116cfc0>] ? bdi_writeback_thread+0x0/0x270
Dec 25 04:35:55 fractus kernel: [<ffffffff81080a36>] kthread+0x96/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100bec4>] kernel_thread_helper+0x4/0x10
Dec 25 04:35:55 fractus kernel: [<ffffffff810809a0>] ? kthread+0x0/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100bec0>] ? kernel_thread_helper+0x0/0x10
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22471 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22471      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff880ccc9399c8 0000000000000086 0000000000000400 ffff88075d218b38
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff880efa2680c0 0000000000015c80 ffff880efa268650
Dec 25 04:35:55 fractus kernel: ffff880ccc939fd8 ffff880efa268658 ffff880ccc938010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffffa0117f6c>] start_this_handle+0x1dc/0x520 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffffa01184ab>] jbd2__journal_start+0xbb/0x100 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa0118503>] jbd2_journal_start+0x13/0x20 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa015acb0>] ext4_journal_start_sb+0xf0/0x130 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa014652a>] ext4_dirty_inode+0x2a/0x60 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff8116bfbb>] __mark_inode_dirty+0x3b/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff8115f402>] file_update_time+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff810f78f0>] __generic_file_aio_write+0x210/0x470
Dec 25 04:35:55 fractus kernel: [<ffffffff810f6b08>] ? generic_file_aio_read+0x378/0x720
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7bb5>] generic_file_aio_write+0x65/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013fee0>] ? ext4_file_write+0x0/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148b73>] do_sync_readv_writev+0xd3/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81149b24>] do_readv_writev+0xd4/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff81073a46>] ? group_send_sig_info+0x56/0x70
Dec 25 04:35:55 fractus kernel: [<ffffffff81073a9f>] ? kill_pid_info+0x3f/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149c86>] vfs_writev+0x46/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149d42>] sys_pwritev+0xa2/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22691 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22691      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff880b3c75bc18 0000000000000086 ffff880b3c75bfd8 ffff880b3c69fa98
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff880b3c69f500 0000000000015c80 ffff880b3c69fa90
Dec 25 04:35:55 fractus kernel: ffff880b3c75bfd8 ffff880b3c69fa98 ffff880b3c75a010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffff81085bbd>] ? hrtimer_try_to_cancel+0x3d/0xe0
Dec 25 04:35:55 fractus kernel: [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013fee0>] ? ext4_file_write+0x0/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148b73>] do_sync_readv_writev+0xd3/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81149b24>] do_readv_writev+0xd4/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff81149c86>] vfs_writev+0x46/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149d42>] sys_pwritev+0xa2/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22861 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22861      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff8804b1bf5a88 0000000000000086 0000000000000000 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88075e7140c0 0000000000015c80 ffff88075e714650
Dec 25 04:35:55 fractus kernel: ffff8804b1bf5fd8 ffff88075e714658 ffff8804b1bf4010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffffa0117f6c>] start_this_handle+0x1dc/0x520 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffffa01184ab>] jbd2__journal_start+0xbb/0x100 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa0118503>] jbd2_journal_start+0x13/0x20 [jbd2]
Dec 25 04:35:55 fractus kernel: [<ffffffffa015acb0>] ext4_journal_start_sb+0xf0/0x130 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa014652a>] ext4_dirty_inode+0x2a/0x60 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff8116bfbb>] __mark_inode_dirty+0x3b/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff8115f402>] file_update_time+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff810f78f0>] __generic_file_aio_write+0x210/0x470
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7bb5>] generic_file_aio_write+0x65/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148c8a>] do_sync_write+0xda/0x120
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81148f78>] vfs_write+0xc8/0x190
Dec 25 04:35:55 fractus kernel: [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22862 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22862      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff88035bdc7c18 0000000000000086 0000000000000000 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88035be7b4c0 0000000000015c80 ffff88035be7ba50
Dec 25 04:35:55 fractus kernel: ffff88035bdc7fd8 ffff88035be7ba58 ffff88035bdc6010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013fee0>] ? ext4_file_write+0x0/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148b73>] do_sync_readv_writev+0xd3/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81149b24>] do_readv_writev+0xd4/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff810630b0>] ? delayed_put_task_struct+0x0/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff81149c86>] vfs_writev+0x46/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149d42>] sys_pwritev+0xa2/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22863 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22863      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff880289815cd8 0000000000000086 0000000000000000 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88035bc4d580 0000000000015c80 ffff88035bc4db10
Dec 25 04:35:55 fractus kernel: ffff880289815fd8 ffff88035bc4db18 ffff880289814010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148c8a>] do_sync_write+0xda/0x120
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81148f78>] vfs_write+0xc8/0x190
Dec 25 04:35:55 fractus kernel: [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22864 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22864      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff88035bd39cd8 0000000000000086 ffff88035bd39c80 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff880756534b40 0000000000015c80 ffff8807565350d0
Dec 25 04:35:55 fractus kernel: ffff88035bd39fd8 ffff8807565350d8 ffff88035bd38010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff81148c8a>] do_sync_write+0xda/0x120
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81148f78>] vfs_write+0xc8/0x190
Dec 25 04:35:55 fractus kernel: [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22865 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22865      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff88028998bc18 0000000000000086 ffff88028998bbc0 0000000000000000
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff88075dfd6b00 0000000000015c80 ffff88075dfd7090
Dec 25 04:35:55 fractus kernel: ffff88028998bfd8 ffff88075dfd7098 ffff88028998a010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff8117c5d0>] ? mpage_end_io_read+0x0/0x90
Dec 25 04:35:55 fractus kernel: [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
Dec 25 04:35:55 fractus kernel: [<ffffffffa013fee0>] ? ext4_file_write+0x0/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
Dec 25 04:35:55 fractus kernel: [<ffffffff8117c5d0>] ? mpage_end_io_read+0x0/0x90
Dec 25 04:35:55 fractus kernel: [<ffffffff81148b73>] do_sync_readv_writev+0xd3/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81149b24>] do_readv_writev+0xd4/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffffa0001270>] ? dm_bio_destructor+0x0/0x20 [dm_mod]
Dec 25 04:35:55 fractus kernel: [<ffffffff8117c5d0>] ? mpage_end_io_read+0x0/0x90
Dec 25 04:35:55 fractus kernel: [<ffffffff81177e60>] ? bio_fs_destructor+0x0/0x20
Dec 25 04:35:55 fractus kernel: [<ffffffffa0216418>] ? kvm_on_user_return+0x78/0x90 [kvm]
Dec 25 04:35:55 fractus kernel: [<ffffffff81149c86>] vfs_writev+0x46/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149d42>] sys_pwritev+0xa2/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Dec 25 04:35:55 fractus kernel: INFO: task qemu-kvm:22488 blocked for more than 120 seconds.
Dec 25 04:35:55 fractus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 04:35:55 fractus kernel: qemu-kvm      D 0000000000000000     0 22488      1 0x00000080
Dec 25 04:35:55 fractus kernel: ffff880b4186f8e8 0000000000000086 ffff880b4186f868 ffffffffa000161d
Dec 25 04:35:55 fractus kernel: 0000000000015c80 ffff880cccb60040 0000000000015c80 ffff880cccb605d0
Dec 25 04:35:55 fractus kernel: ffff880b4186ffd8 ffff880cccb605d8 ffff880b4186e010 0000000000015c80
Dec 25 04:35:55 fractus kernel: Call Trace:
Dec 25 04:35:55 fractus kernel: [<ffffffffa000161d>] ? __map_bio+0xad/0x130 [dm_mod]
Dec 25 04:35:55 fractus kernel: [<ffffffff813963e5>] md_make_request+0x85/0x230
Dec 25 04:35:55 fractus kernel: [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
Dec 25 04:35:55 fractus kernel: [<ffffffff812013a3>] generic_make_request+0x1e3/0x560
Dec 25 04:35:55 fractus kernel: [<ffffffff810f7d15>] ? mempool_alloc_slab+0x15/0x20
Dec 25 04:35:55 fractus kernel: [<ffffffff812017a1>] submit_bio+0x81/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff8117b44c>] dio_bio_submit+0xbc/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8117b4e0>] dio_send_cur_page+0x90/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8117b575>] submit_page_section+0x65/0x170
Dec 25 04:35:55 fractus kernel: [<ffffffff8117bf98>] __blockdev_direct_IO+0x688/0xb70
Dec 25 04:35:55 fractus kernel: [<ffffffff81179bd7>] blkdev_direct_IO+0x57/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81178c20>] ? blkdev_get_blocks+0x0/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff810f6e51>] generic_file_aio_read+0x6c1/0x720
Dec 25 04:35:55 fractus kernel: [<ffffffff810f6790>] ? generic_file_aio_read+0x0/0x720
Dec 25 04:35:55 fractus kernel: [<ffffffff81148b73>] do_sync_readv_writev+0xd3/0x110
Dec 25 04:35:55 fractus kernel: [<ffffffff811d2e4f>] ? security_file_permission+0x6f/0x80
Dec 25 04:35:55 fractus kernel: [<ffffffff81149b24>] do_readv_writev+0xd4/0x1f0
Dec 25 04:35:55 fractus kernel: [<ffffffff81073a9f>] ? kill_pid_info+0x3f/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149e53>] vfs_readv+0x43/0x60
Dec 25 04:35:55 fractus kernel: [<ffffffff81149f12>] sys_preadv+0xa2/0xc0
Dec 25 04:35:55 fractus kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b



iostat -x 1 (the avgqu-sz on dm-2, 4, 5, 6 & 8 doesn't change at all)
------------
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md3               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
md127             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     2.00    0.00   0.00 100.00
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00    11.00    0.00   0.00 100.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00    10.00    0.00   0.00 100.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

cat /proc/mdstat
-----------------
Personalities : [raid1] [raid10] [raid6] [raid5] [raid4] [raid0] 
md127 : active raid0 md3[1] md2[0]
      1101885440 blocks super 1.2 512k chunks
      
md3 : active raid5 sdd4[0] sdf4[3] sde4[1]
      550943744 blocks super 1.1 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      
md2 : active raid5 sda4[0] sdc4[3] sdb4[1]
      550943744 blocks super 1.1 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      
md1 : active raid10 sda3[0] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1]
      50327040 blocks super 1.1 512K chunks 2 near-copies [6/6] [UUUUUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md0 : active raid1 sda2[0] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1]
      511988 blocks super 1.0 [6/6] [UUUUUU]
      
unused devices: <none>


ls -l /dev/mapper
-----------------
[root@fractus ~]# ls -l /dev/mapper
total 0
crw-rw----. 1 root root 10, 236 Dec 24 07:44 control
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg0-slash -> ../dm-0
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg0-swap -> ../dm-1
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg0-tmp -> ../dm-9
lrwxrwxrwx. 1 root root       8 Dec 24 07:44 vg0-usr -> ../dm-10
lrwxrwxrwx. 1 root root       8 Dec 24 07:44 vg0-var -> ../dm-11
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-clients -> ../dm-3
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-libvirt -> ../dm-4
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-scratch -> ../dm-2
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-vm_mock6 -> ../dm-8
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-vm_oca--kstest -> ../dm-7
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-vm_oca--test01 -> ../dm-5
lrwxrwxrwx. 1 root root       7 Dec 24 07:44 vg1-vm_oca--test02 -> ../dm-6

mount
-----
/dev/mapper/vg0-slash on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/md0 on /boot type ext4 (rw)
/dev/sda1 on /boot/efi type vfat (rw,umask=0077,shortname=winnt)
/dev/mapper/vg0-tmp on /tmp type ext4 (rw)
/dev/mapper/vg0-usr on /usr type ext4 (rw)
/dev/mapper/vg0-var on /var type ext4 (rw)
/dev/mapper/vg1-libvirt on /var/lib/libvirt/images type ext4 (rw)
/dev/mapper/vg1-scratch on /mnt/scratch type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
blizzard:/home/users on /home/users type nfs4 (rw,addr=192.168.1.3,clientaddr=192.168.1.24)
blizzard:/home/shares on /home/shares type nfs4 (rw,addr=192.168.1.3,clientaddr=192.168.1.24)
Comment 1 Travers Carter 2010-12-28 20:57:01 UTC
I should have also added above that there are a number of VMs using raw files on an ext4 filesystem on the VG as well as some using LVs on the same VG directly.  Also I first encountered the problem in the RHEL 6 kernel before upgrading to 2.6.36 (see https://bugzilla.redhat.com/show_bug.cgi?id=652086)

dmesg output from another instance of the problem:

INFO: task jbd2/dm-4-8:1374 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-4-8   D 0000000000000000     0  1374      2 0x00000000
 ffff88075ed15d20 0000000000000046 ffff88075ed15cd0 ffff88075cb68880
 0000000000015c80 ffff88075d560ac0 0000000000015c80 ffff88075d561050
 ffff88075ed15fd8 ffff88075d561058 ffff88075ed14010 0000000000015c80
Call Trace:
 [<ffffffffa01199c4>] jbd2_journal_commit_transaction+0x1c4/0x1480 [jbd2]
 [<ffffffff8104f402>] ? finish_task_switch+0x42/0xd0
 [<ffffffff8100ba6e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa011e9c8>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa011e910>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff81080a36>] kthread+0x96/0xa0
 [<ffffffff8100bec4>] kernel_thread_helper+0x4/0x10
 [<ffffffff810809a0>] ? kthread+0x0/0xa0
 [<ffffffff8100bec0>] ? kernel_thread_helper+0x0/0x10
INFO: task flush-253:4:1862 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-253:4   D 0000000000000000     0  1862      2 0x00000000
 ffff880753bd3980 0000000000000046 ffff88075c5d84b0 0000000000000000
 0000000000015c80 ffff88075dd27540 0000000000015c80 ffff88075dd27ad0
 ffff880753bd3fd8 ffff88075dd27ad8 ffff880753bd2010 0000000000015c80
Call Trace:
 [<ffffffffa0117f6c>] start_this_handle+0x1dc/0x520 [jbd2]
 [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa01184ab>] jbd2__journal_start+0xbb/0x100 [jbd2]
 [<ffffffffa0118503>] jbd2_journal_start+0x13/0x20 [jbd2]
 [<ffffffffa015acb0>] ext4_journal_start_sb+0xf0/0x130 [ext4]
 [<ffffffff81101456>] ? __pagevec_release+0x26/0x40
 [<ffffffffa0145165>] ? ext4_meta_trans_blocks+0x75/0xf0 [ext4]
 [<ffffffffa0148c38>] ext4_da_writepages+0x268/0x630 [ext4]
 [<ffffffff810fe480>] ? __writepage+0x0/0x40
 [<ffffffff8100ba6e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff8100ba6e>] ? apic_timer_interrupt+0xe/0x20
 [<ffffffff810ff9d1>] do_writepages+0x21/0x40
 [<ffffffff8116c3ce>] writeback_single_inode+0x9e/0x2a0
 [<ffffffff8116c7ee>] writeback_sb_inodes+0xde/0x180
 [<ffffffff8116c92d>] writeback_inodes_wb+0x9d/0x160
 [<ffffffff8116cc6b>] wb_writeback+0x27b/0x3f0
 [<ffffffff8106f37c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff8116cf7d>] wb_do_writeback+0x19d/0x1e0
 [<ffffffff8116d072>] bdi_writeback_thread+0xb2/0x270
 [<ffffffff8116cfc0>] ? bdi_writeback_thread+0x0/0x270
 [<ffffffff8116cfc0>] ? bdi_writeback_thread+0x0/0x270
 [<ffffffff81080a36>] kthread+0x96/0xa0
 [<ffffffff8100bec4>] kernel_thread_helper+0x4/0x10
 [<ffffffff810809a0>] ? kthread+0x0/0xa0
 [<ffffffff8100bec0>] ? kernel_thread_helper+0x0/0x10
INFO: task qemu-kvm:14405 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14405      1 0x00000080
 ffff880594427958 0000000000000082 ffff880f59f03cc0 ffff880f5b2f64f8
 0000000000015c80 ffff88075458a100 0000000000015c80 ffff88075458a690
 ffff880594427fd8 ffff88075458a698 ffff880594426010 0000000000015c80
Call Trace:
 [<ffffffff81173b10>] ? sync_buffer+0x0/0x50
 [<ffffffff81494dd3>] io_schedule+0x73/0xc0
 [<ffffffff81173b50>] sync_buffer+0x40/0x50
 [<ffffffff8149568f>] __wait_on_bit+0x5f/0x90
 [<ffffffff81173b10>] ? sync_buffer+0x0/0x50
 [<ffffffff81495738>] out_of_line_wait_on_bit+0x78/0x90
 [<ffffffff81081100>] ? wake_bit_function+0x0/0x50
 [<ffffffff81173b0e>] __wait_on_buffer+0x2e/0x30
 [<ffffffff81175b0c>] block_prepare_write+0x3bc/0x5b0
 [<ffffffffa0147320>] ? ext4_da_get_block_prep+0x0/0x3e0 [ext4]
 [<ffffffff81175e17>] __block_write_begin+0x17/0x20
 [<ffffffffa014a4c8>] ext4_da_write_begin+0x168/0x280 [ext4]
 [<ffffffff8113fe78>] ? mem_cgroup_get_reclaim_stat_from_page+0x18/0x70
 [<ffffffff810f4e2e>] generic_file_buffered_write+0xfe/0x250
 [<ffffffffa0146554>] ? ext4_dirty_inode+0x54/0x60 [ext4]
 [<ffffffff810f7920>] __generic_file_aio_write+0x240/0x470
 [<ffffffff810f7bb5>] generic_file_aio_write+0x65/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff810736ce>] ? send_signal+0x3e/0x90
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff81073a46>] ? group_send_sig_info+0x56/0x70
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14430 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14430      1 0x00000080
 ffff88091abcdcd8 0000000000000082 00000000ffffffea 0000000000000000
 0000000000015c80 ffff880f5a288b00 0000000000015c80 ffff880f5a289090
 ffff88091abcdfd8 ffff880f5a289098 ffff88091abcc010 0000000000015c80
Call Trace:
 [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
 [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
 [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff8105872c>] ? pick_next_task_fair+0xfc/0x130
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14455 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14455      1 0x00000080
 ffff880cbc0a5cd8 0000000000000082 00000000ffffffea 0000000000000000
 0000000000015c80 ffff8807f235ab00 0000000000015c80 ffff8807f235b090
 ffff880cbc0a5fd8 ffff8807f235b098 ffff880cbc0a4010 0000000000015c80
Call Trace:
 [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
 [<ffffffff810c70a5>] ? call_rcu_sched+0x15/0x20
 [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
 [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff8105872c>] ? pick_next_task_fair+0xfc/0x130
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14456 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14456      1 0x00000080
 ffff8809135f9cd8 0000000000000082 0000000000000005 0000000000000001
 0000000000015c80 ffff880f590d4100 0000000000015c80 ffff880f590d4690
 ffff8809135f9fd8 ffff880f590d4698 ffff8809135f8010 0000000000015c80
Call Trace:
 [<ffffffff81057c61>] ? try_to_wake_up+0xe1/0x450
 [<ffffffff81495b32>] __mutex_lock_slowpath+0xf2/0x170
 [<ffffffff81495a1b>] mutex_lock+0x2b/0x50
 [<ffffffff810f7ba2>] generic_file_aio_write+0x52/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff810861d4>] ? switch_task_namespaces+0x24/0x60
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14418 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14418      1 0x00000080
 ffff8804b2323a18 0000000000000082 ffff8804b2323998 ffffffffa000161d
 0000000000015c80 ffff88075e64b4c0 0000000000015c80 ffff88075e64ba50
 ffff8804b2323fd8 ffff88075e64ba58 ffff8804b2322010 0000000000015c80
Call Trace:
 [<ffffffffa000161d>] ? __map_bio+0xad/0x130 [dm_mod]
 [<ffffffff813963e5>] md_make_request+0x85/0x230
 [<ffffffff8104ba8a>] ? update_curr+0x19a/0x1e0
 [<ffffffff810810c0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff812013a3>] generic_make_request+0x1e3/0x560
 [<ffffffffa000275b>] ? dm_get_live_table+0x4b/0x60 [dm_mod]
 [<ffffffffa00064b2>] ? linear_merge+0x52/0x60 [dm_mod]
 [<ffffffff812017a1>] submit_bio+0x81/0x110
 [<ffffffff8117b44c>] dio_bio_submit+0xbc/0xc0
 [<ffffffff8117c1ca>] __blockdev_direct_IO+0x8ba/0xb70
 [<ffffffff81092b74>] ? futex_wait+0x224/0x370
 [<ffffffff81179bd7>] blkdev_direct_IO+0x57/0x60
 [<ffffffff81178c20>] ? blkdev_get_blocks+0x0/0xc0
 [<ffffffff810f6e51>] generic_file_aio_read+0x6c1/0x720
 [<ffffffff810922cb>] ? futex_wake+0x10b/0x120
 [<ffffffff81148daa>] do_sync_read+0xda/0x120
 [<ffffffff811d2e4f>] ? security_file_permission+0x6f/0x80
 [<ffffffff81149105>] vfs_read+0xc5/0x190
 [<ffffffff81149992>] sys_pread64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14407 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14407      1 0x00000080
 ffff88059cb53b08 0000000000000086 ffff88059cb53ab8 ffffffffa00047cc
 0000000000015c80 ffff88075dd86040 0000000000015c80 ffff88075dd865d0
 ffff88059cb53fd8 ffff88075dd865d8 ffff88059cb52010 0000000000015c80
Call Trace:
 [<ffffffffa00047cc>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
 [<ffffffff810f56f0>] ? sync_page+0x0/0x50
 [<ffffffff81494dd3>] io_schedule+0x73/0xc0
 [<ffffffff810f5730>] sync_page+0x40/0x50
 [<ffffffff8149568f>] __wait_on_bit+0x5f/0x90
 [<ffffffff810f58f3>] wait_on_page_bit+0x73/0x80
 [<ffffffff81081100>] ? wake_bit_function+0x0/0x50
 [<ffffffff811007f5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff810f5cc0>] filemap_fdatawait_range+0x110/0x1a0
 [<ffffffff810ff9d1>] ? do_writepages+0x21/0x40
 [<ffffffff810f5ddb>] ? __filemap_fdatawrite_range+0x5b/0x60
 [<ffffffff810f5e50>] filemap_write_and_wait_range+0x70/0x80
 [<ffffffff81170b1a>] vfs_fsync_range+0x5a/0xa0
 [<ffffffff81170ba7>] generic_write_sync+0x47/0x50
 [<ffffffff810f7bfe>] generic_file_aio_write+0xae/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff810736ce>] ? send_signal+0x3e/0x90
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff81073a46>] ? group_send_sig_info+0x56/0x70
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14433 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14433      1 0x00000080
 ffff8808a39b7b08 0000000000000086 ffff8808a39b7ab8 ffffffffa00047cc
 0000000000015c80 ffff880f5b9c9500 0000000000015c80 ffff880f5b9c9a90
 ffff8808a39b7fd8 ffff880f5b9c9a98 ffff8808a39b6010 0000000000015c80
Call Trace:
 [<ffffffffa00047cc>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
 [<ffffffff810f56f0>] ? sync_page+0x0/0x50
 [<ffffffff81494dd3>] io_schedule+0x73/0xc0
 [<ffffffff810f5730>] sync_page+0x40/0x50
 [<ffffffff8149568f>] __wait_on_bit+0x5f/0x90
 [<ffffffff810f58f3>] wait_on_page_bit+0x73/0x80
 [<ffffffff81081100>] ? wake_bit_function+0x0/0x50
 [<ffffffff811007f5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff810f5cc0>] filemap_fdatawait_range+0x110/0x1a0
 [<ffffffff810ff9d1>] ? do_writepages+0x21/0x40
 [<ffffffff810f5ddb>] ? __filemap_fdatawrite_range+0x5b/0x60
 [<ffffffff810f5e50>] filemap_write_and_wait_range+0x70/0x80
 [<ffffffff81170b1a>] vfs_fsync_range+0x5a/0xa0
 [<ffffffff81170ba7>] generic_write_sync+0x47/0x50
 [<ffffffff810f7bfe>] generic_file_aio_write+0xae/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
INFO: task qemu-kvm:14436 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-kvm      D 0000000000000000     0 14436      1 0x00000080
 ffff8809135cfb08 0000000000000086 ffff8809135cfab8 ffffffffa00047cc
 0000000000015c80 ffff8807f235b540 0000000000015c80 ffff8807f235bad0
 ffff8809135cffd8 ffff8807f235bad8 ffff8809135ce010 0000000000015c80
Call Trace:
 [<ffffffffa00047cc>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
 [<ffffffff810f56f0>] ? sync_page+0x0/0x50
 [<ffffffff81494dd3>] io_schedule+0x73/0xc0
 [<ffffffff810f5730>] sync_page+0x40/0x50
 [<ffffffff8149568f>] __wait_on_bit+0x5f/0x90
 [<ffffffff810f58f3>] wait_on_page_bit+0x73/0x80
 [<ffffffff81081100>] ? wake_bit_function+0x0/0x50
 [<ffffffff811007f5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff810f5cc0>] filemap_fdatawait_range+0x110/0x1a0
 [<ffffffff810ff9d1>] ? do_writepages+0x21/0x40
 [<ffffffff810f5ddb>] ? __filemap_fdatawrite_range+0x5b/0x60
 [<ffffffff810f5e50>] filemap_write_and_wait_range+0x70/0x80
 [<ffffffff81170b1a>] vfs_fsync_range+0x5a/0xa0
 [<ffffffff81170ba7>] generic_write_sync+0x47/0x50
 [<ffffffff810f7bfe>] generic_file_aio_write+0xae/0xd0
 [<ffffffffa013ff1f>] ext4_file_write+0x3f/0xd0 [ext4]
 [<ffffffff8105872c>] ? pick_next_task_fair+0xfc/0x130
 [<ffffffff81148c8a>] do_sync_write+0xda/0x120
 [<ffffffff811d9dab>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff811d2dff>] ? security_file_permission+0x1f/0x80
 [<ffffffff81148f78>] vfs_write+0xc8/0x190
 [<ffffffff81149a32>] sys_pwrite64+0x82/0xa0
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
Comment 2 Anthony Basile 2011-01-06 21:00:24 UTC
While I'm not 100% sure its the same bug, I think Gentoo is hitting this on 2.6.36.2 and 2.6.37.  See

  http://bugs.gentoo.org/show_bug.cgi?id=350455

Under heavy system loads, when doing a lvm pmove from one pv to another, the system deadlocks.  The culprits appear to be jbd and flush on that md.

You can see the backtrace obtained by sysrq+s on the gentoo bug above.
Comment 3 Travers Carter 2011-01-07 04:40:49 UTC
After moving the VM guests that were using filesystem based disk images on ext4 on top of the LVs directly onto raw LVs just under two weeks ago I have yet to see the problem occur again (it typically occurred at least once a week).  So this may actually be an ext4/jbd2 problem or at least an interaction with it rather than a md or LVM issue.
Comment 4 Valentijn Sessink 2011-05-04 18:04:05 UTC
Just a note: I'm currently using an older 2.6.32 kernel from Ubuntu with KVM machines running directly on LVs and it *does* hang when I use pvmove. I know 2.6.32 is sort of ancient for a kernel bug (so in a way: please ignore this message), but I wouldn't be entirely sure this is file system related.
Comment 5 serverchief 2012-01-31 18:24:26 UTC
Please try changing the IO scheduler to "noop" - and re-test.

Note You need to log in before you can comment on or make changes to this bug.