Bug 192731
Summary: | LVM snapshot hangs during copy, under load | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Glenn Enright (glenn.enright) |
Component: | Block Layer | Assignee: | Jens Axboe (axboe) |
Status: | NEW --- | ||
Severity: | high | CC: | hydrapolic, jcbsth |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugzilla.kernel.org/show_bug.cgi?id=119841 | ||
Kernel Version: | 4.4.32 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Glenn Enright
2017-01-16 23:30:22 UTC
duplicate of #119841 ? It may well be a duplicate, just did not want to clobber that report. Anyway promised traces below... I've included details from a 'w' sysrq dump on a recently stuck machine. New processes needing io to lvm end up at the entry_SYSCALL_64_fastpath point. Most useful seems to be the kworker bit at the end? <6>1 2017-01-25T21:47:16.503278+00:00 host kernel - - dmeventd D ffff880059d9ba38 0 21089 1 0x00000000 <4>1 2017-01-25T21:47:16.503283+00:00 host kernel - - ffff880059d9ba38 ffffffff818bd500 ffff880048a04580 0000000000000000 <4>1 2017-01-25T21:47:16.503286+00:00 host kernel - - ffff88014020a210 0000000000000000 ffff880059d9b9f0 ffffffff810061a9 <4>1 2017-01-25T21:47:16.503288+00:00 host kernel - - ffff88014020a210 ffff88014020aa10 ffff88014020a210 ffff88014020aa10 <4>1 2017-01-25T21:47:16.503293+00:00 host kernel - - Call Trace: <4>1 2017-01-25T21:47:16.503296+00:00 host kernel - - [<ffffffff810061a9>] ? xen_load_sp0+0xc9/0x1d0 <4>1 2017-01-25T21:47:16.503302+00:00 host kernel - - [<ffffffff81006cbd>] ? xen_mc_flush+0xad/0x1b0 <4>1 2017-01-25T21:47:16.503305+00:00 host kernel - - [<ffffffff8100d605>] ? xen_clocksource_read+0x15/0x20 <4>1 2017-01-25T21:47:16.503308+00:00 host kernel - - [<ffffffff810a0c24>] ? finish_task_switch+0xa4/0x240 <4>1 2017-01-25T21:47:16.503311+00:00 host kernel - - [<ffffffff81683e70>] schedule+0x40/0x90 <4>1 2017-01-25T21:47:16.503314+00:00 host kernel - - [<ffffffff816865fd>] rwsem_down_write_failed+0x1fd/0x360 <4>1 2017-01-25T21:47:16.503316+00:00 host kernel - - [<ffffffff816835d6>] ? __schedule+0x306/0xa30 <4>1 2017-01-25T21:47:16.503321+00:00 host kernel - - [<ffffffff81340693>] call_rwsem_down_write_failed+0x13/0x20 <4>1 2017-01-25T21:47:16.503325+00:00 host kernel - - [<ffffffff81685dd4>] ? down_write+0x24/0x40 <4>1 2017-01-25T21:47:16.503328+00:00 host kernel - - [<ffffffffa04a525f>] snapshot_status+0x2f/0x190 [dm_snapshot] <4>1 2017-01-25T21:47:16.503334+00:00 host kernel - - [<ffffffffa00091c3>] retrieve_status+0xb3/0x1d0 [dm_mod] <4>1 2017-01-25T21:47:16.503339+00:00 host kernel - - [<ffffffffa00092e0>] ? retrieve_status+0x1d0/0x1d0 [dm_mod] <4>1 2017-01-25T21:47:16.503343+00:00 host kernel - - [<ffffffffa000933e>] table_status+0x5e/0x90 [dm_mod] <4>1 2017-01-25T21:47:16.503347+00:00 host kernel - - [<ffffffffa000aa33>] ctl_ioctl+0x1d3/0x410 [dm_mod] <4>1 2017-01-25T21:47:16.503351+00:00 host kernel - - [<ffffffffa000ac83>] dm_ctl_ioctl+0x13/0x20 [dm_mod] <4>1 2017-01-25T21:47:16.503354+00:00 host kernel - - [<ffffffff811e34a8>] do_vfs_ioctl+0x88/0x4b0 <4>1 2017-01-25T21:47:16.503360+00:00 host kernel - - [<ffffffff811ecddc>] ? __fget_light+0x2c/0x70 <4>1 2017-01-25T21:47:16.503363+00:00 host kernel - - [<ffffffff81088425>] ? __set_current_blocked+0x55/0x60 <4>1 2017-01-25T21:47:16.503368+00:00 host kernel - - [<ffffffff811e3962>] SyS_ioctl+0x92/0xa0 <4>1 2017-01-25T21:47:16.503372+00:00 host kernel - - [<ffffffff81003615>] ? syscall_return_slowpath+0x65/0x70 <4>1 2017-01-25T21:47:16.503374+00:00 host kernel - - [<ffffffff8168772e>] entry_SYSCALL_64_fastpath+0x12/0x71 <6>1 2017-01-25T21:47:16.503379+00:00 host kernel - - dd D ffff8800334936b8 0 21296 20055 0x00000000 <4>1 2017-01-25T21:47:16.503383+00:00 host kernel - - ffff8800334936b8 ffff880135440140 ffff880075ad13c0 ffff880140256380 <4>1 2017-01-25T21:47:16.503386+00:00 host kernel - - 0000000000000000 ffff8800334935f8 ffffffff810ac6c4 0014918f609acad7 <4>1 2017-01-25T21:47:16.503388+00:00 host kernel - - 000000000031661d ffff8800334935d8 ffff880140256380 ffff8800191a33c0 <4>1 2017-01-25T21:47:16.503390+00:00 host kernel - - Call Trace: <4>1 2017-01-25T21:47:16.503393+00:00 host kernel - - [<ffffffff810ac6c4>] ? check_preempt_wakeup+0x194/0x240 <4>1 2017-01-25T21:47:16.503398+00:00 host kernel - - [<ffffffff810a1a17>] ? check_preempt_curr+0x87/0x90 <4>1 2017-01-25T21:47:16.503402+00:00 host kernel - - [<ffffffff810a1a3d>] ? ttwu_do_wakeup+0x1d/0xf0 <4>1 2017-01-25T21:47:16.503407+00:00 host kernel - - [<ffffffff8168741f>] ? _raw_spin_lock_irqsave+0x1f/0x50 <4>1 2017-01-25T21:47:16.503411+00:00 host kernel - - [<ffffffff81687146>] ? _raw_spin_unlock_irqrestore+0x16/0x20 <4>1 2017-01-25T21:47:16.503416+00:00 host kernel - - [<ffffffff81683e70>] schedule+0x40/0x90 <4>1 2017-01-25T21:47:16.503420+00:00 host kernel - - [<ffffffff816865fd>] rwsem_down_write_failed+0x1fd/0x360 <4>1 2017-01-25T21:47:16.503422+00:00 host kernel - - [<ffffffff811b5d32>] ? kmem_cache_alloc+0x72/0x160 <4>1 2017-01-25T21:47:16.503425+00:00 host kernel - - [<ffffffff810bac6d>] ? rwsem_wake+0x6d/0xa0 <4>1 2017-01-25T21:47:16.503428+00:00 host kernel - - [<ffffffff81340693>] call_rwsem_down_write_failed+0x13/0x20 <4>1 2017-01-25T21:47:16.503431+00:00 host kernel - - [<ffffffff81685dd4>] ? down_write+0x24/0x40 <4>1 2017-01-25T21:47:16.503435+00:00 host kernel - - [<ffffffffa04a6c20>] snapshot_map+0x90/0x490 [dm_snapshot] <4>1 2017-01-25T21:47:16.503438+00:00 host kernel - - [<ffffffffa00042aa>] __map_bio+0x4a/0x130 [dm_mod] <4>1 2017-01-25T21:47:16.503442+00:00 host kernel - - [<ffffffffa0004897>] __split_and_process_bio+0x327/0x3f0 [dm_mod] <4>1 2017-01-25T21:47:16.503445+00:00 host kernel - - [<ffffffffa00049d4>] dm_make_request+0x74/0xe0 [dm_mod] <4>1 2017-01-25T21:47:16.503448+00:00 host kernel - - [<ffffffff813099af>] generic_make_request+0xff/0x1d0 <4>1 2017-01-25T21:47:16.503451+00:00 host kernel - - [<ffffffff81309af0>] submit_bio+0x70/0x140 <4>1 2017-01-25T21:47:16.503454+00:00 host kernel - - [<ffffffff8116b84f>] ? __lru_cache_add+0x5f/0x80 <4>1 2017-01-25T21:47:16.503457+00:00 host kernel - - [<ffffffff8120f364>] mpage_bio_submit+0x34/0x50 <4>1 2017-01-25T21:47:16.503460+00:00 host kernel - - [<ffffffff8120fd2d>] mpage_readpages+0x13d/0x160 <4>1 2017-01-25T21:47:16.503463+00:00 host kernel - - [<ffffffff81209b80>] ? I_BDEV+0x20/0x20 <4>1 2017-01-25T21:47:16.503466+00:00 host kernel - - [<ffffffff81209b80>] ? I_BDEV+0x20/0x20 <4>1 2017-01-25T21:47:16.503469+00:00 host kernel - - [<ffffffff81208692>] ? block_write_end+0x42/0x90 <4>1 2017-01-25T21:47:16.503472+00:00 host kernel - - [<ffffffff8115b26b>] ? __page_cache_alloc+0xcb/0x110 <4>1 2017-01-25T21:47:16.503475+00:00 host kernel - - [<ffffffff8154efbc>] ? mddev_congested+0x2c/0x40 <4>1 2017-01-25T21:47:16.503533+00:00 host kernel - - [<ffffffff8120a46d>] blkdev_readpages+0x1d/0x20 <4>1 2017-01-25T21:47:16.503538+00:00 host kernel - - [<ffffffff81169260>] __do_page_cache_readahead+0x1a0/0x240 <4>1 2017-01-25T21:47:16.503541+00:00 host kernel - - [<ffffffff8116944d>] ondemand_readahead+0x14d/0x250 <4>1 2017-01-25T21:47:16.503544+00:00 host kernel - - [<ffffffff811f9464>] ? inode_congested+0xa4/0x100 <4>1 2017-01-25T21:47:16.503547+00:00 host kernel - - [<ffffffff811695c2>] page_cache_async_readahead+0x72/0x80 <4>1 2017-01-25T21:47:16.503550+00:00 host kernel - - [<ffffffff8115cb2e>] generic_file_read_iter+0x40e/0x5e0 <4>1 2017-01-25T21:47:16.503553+00:00 host kernel - - [<ffffffff8115c721>] ? generic_file_read_iter+0x1/0x5e0 <4>1 2017-01-25T21:47:16.503555+00:00 host kernel - - [<ffffffff81209f57>] blkdev_read_iter+0x37/0x40 <4>1 2017-01-25T21:47:16.503558+00:00 host kernel - - [<ffffffff811d13dc>] __vfs_read+0xcc/0xf0 <4>1 2017-01-25T21:47:16.503561+00:00 host kernel - - [<ffffffff811d168e>] vfs_read+0x8e/0xe0 <4>1 2017-01-25T21:47:16.503564+00:00 host kernel - - [<ffffffff811ed262>] ? __fdget_pos+0x12/0x50 <4>1 2017-01-25T21:47:16.503567+00:00 host kernel - - [<ffffffff811d1f46>] SyS_read+0x56/0xc0 <4>1 2017-01-25T21:47:16.503570+00:00 host kernel - - [<ffffffff811d1d33>] ? SyS_lseek+0x93/0xa0 <4>1 2017-01-25T21:47:16.503573+00:00 host kernel - - [<ffffffff8168772e>] entry_SYSCALL_64_fastpath+0x12/0x71 <6>1 2017-01-25T21:47:16.503575+00:00 host kernel - - kworker/u4:1 D ffff8801039c30e8 0 19641 2 0x00000000 <6>1 2017-01-25T21:47:16.503577+00:00 host kernel - - Workqueue: writeback wb_workfn (flush-253:61) <4>1 2017-01-25T21:47:16.503580+00:00 host kernel - - ffff8801039c30e8 ffff880135440140 ffff880134e2e440 ffff88013bc01540 <4>1 2017-01-25T21:47:16.503583+00:00 host kernel - - ffff88013bc00700 0000000000000000 ffffea0004243c00 ffff88013bc01540 <4>1 2017-01-25T21:47:16.503587+00:00 host kernel - - ffff8801090f0000 ffff8801039c3058 ffffffff811b53ab 0000000000000000 <4>1 2017-01-25T21:47:16.503589+00:00 host kernel - - Call Trace: <4>1 2017-01-25T21:47:16.503592+00:00 host kernel - - [<ffffffff811b53ab>] ? cache_grow+0x15b/0x250 <4>1 2017-01-25T21:47:16.503595+00:00 host kernel - - [<ffffffff810061a9>] ? xen_load_sp0+0xc9/0x1d0 <4>1 2017-01-25T21:47:16.503598+00:00 host kernel - - [<ffffffff810ac020>] ? update_curr+0x110/0x1a0 <4>1 2017-01-25T21:47:16.503601+00:00 host kernel - - [<ffffffff810b01bf>] ? enqueue_entity+0x46f/0x940 <4>1 2017-01-25T21:47:16.503605+00:00 host kernel - - [<ffffffff81683e70>] schedule+0x40/0x90 <4>1 2017-01-25T21:47:16.503608+00:00 host kernel - - [<ffffffff816865fd>] rwsem_down_write_failed+0x1fd/0x360 <4>1 2017-01-25T21:47:16.503611+00:00 host kernel - - [<ffffffff813025ac>] ? bio_alloc_bioset+0xbc/0x1f0 <4>1 2017-01-25T21:47:16.503615+00:00 host kernel - - [<ffffffff81340693>] call_rwsem_down_write_failed+0x13/0x20 <4>1 2017-01-25T21:47:16.503618+00:00 host kernel - - [<ffffffff81685dd4>] ? down_write+0x24/0x40 <4>1 2017-01-25T21:47:16.503621+00:00 host kernel - - [<ffffffffa04a489d>] __origin_write+0x6d/0x2d0 [dm_snapshot] <4>1 2017-01-25T21:47:16.503624+00:00 host kernel - - [<ffffffff8115d3fc>] ? mempool_alloc+0x5c/0x160 <4>1 2017-01-25T21:47:16.503628+00:00 host kernel - - [<ffffffffa04a4b66>] do_origin+0x66/0x90 [dm_snapshot] <4>1 2017-01-25T21:47:16.503631+00:00 host kernel - - [<ffffffffa04a4fbf>] origin_map+0x6f/0x90 [dm_snapshot] <4>1 2017-01-25T21:47:16.503634+00:00 host kernel - - [<ffffffffa00042aa>] __map_bio+0x4a/0x130 [dm_mod] <4>1 2017-01-25T21:47:16.503637+00:00 host kernel - - [<ffffffffa0004897>] __split_and_process_bio+0x327/0x3f0 [dm_mod] <4>1 2017-01-25T21:47:16.503640+00:00 host kernel - - [<ffffffffa00049d4>] dm_make_request+0x74/0xe0 [dm_mod] <4>1 2017-01-25T21:47:16.503643+00:00 host kernel - - [<ffffffff813099af>] generic_make_request+0xff/0x1d0 <4>1 2017-01-25T21:47:16.503646+00:00 host kernel - - [<ffffffff81092b24>] ? insert_work+0x74/0xc0 <4>1 2017-01-25T21:47:16.503649+00:00 host kernel - - [<ffffffff81309af0>] submit_bio+0x70/0x140 <4>1 2017-01-25T21:47:16.503653+00:00 host kernel - - [<ffffffff8120755e>] submit_bh_wbc+0x13e/0x180 <4>1 2017-01-25T21:47:16.503656+00:00 host kernel - - [<ffffffff8120813d>] __block_write_full_page.clone.0+0x18d/0x310 <4>1 2017-01-25T21:47:16.503659+00:00 host kernel - - [<ffffffff81209b80>] ? I_BDEV+0x20/0x20 <4>1 2017-01-25T21:47:16.503661+00:00 host kernel - - [<ffffffff81209b80>] ? I_BDEV+0x20/0x20 <4>1 2017-01-25T21:47:16.503664+00:00 host kernel - - [<ffffffff812084ad>] block_write_full_page+0xcd/0x120 <4>1 2017-01-25T21:47:16.503668+00:00 host kernel - - [<ffffffff8120a4a8>] blkdev_writepage+0x18/0x20 <4>1 2017-01-25T21:47:16.503670+00:00 host kernel - - [<ffffffff811654b7>] __writepage+0x17/0x40 <4>1 2017-01-25T21:47:16.503674+00:00 host kernel - - [<ffffffff811670ac>] write_cache_pages+0x24c/0x530 <4>1 2017-01-25T21:47:16.503677+00:00 host kernel - - [<ffffffff811654a0>] ? set_page_dirty+0x70/0x70 <4>1 2017-01-25T21:47:16.503680+00:00 host kernel - - [<ffffffff811673e1>] generic_writepages+0x51/0x80 <4>1 2017-01-25T21:47:16.503683+00:00 host kernel - - [<ffffffff81315543>] ? __blk_mq_run_hw_queue+0x293/0x3b0 <4>1 2017-01-25T21:47:16.503685+00:00 host kernel - - [<ffffffff81167430>] do_writepages+0x20/0x30 <4>1 2017-01-25T21:47:16.503688+00:00 host kernel - - [<ffffffff811fab0f>] __writeback_single_inode+0x4f/0x3b0 <4>1 2017-01-25T21:47:16.503691+00:00 host kernel - - [<ffffffff81165bff>] ? __wb_calc_thresh+0x2f/0x140 <4>1 2017-01-25T21:47:16.503693+00:00 host kernel - - [<ffffffff8131571c>] ? blk_mq_run_hw_queue+0x6c/0x80 <4>1 2017-01-25T21:47:16.503696+00:00 host kernel - - [<ffffffff811feb12>] ? wbc_attach_and_unlock_inode+0x122/0x140 <4>1 2017-01-25T21:47:16.503699+00:00 host kernel - - [<ffffffff811ff07e>] writeback_sb_inodes+0x2be/0x4a0 <4>1 2017-01-25T21:47:16.503702+00:00 host kernel - - [<ffffffff811669fc>] ? wb_over_bg_thresh+0x15c/0x260 <4>1 2017-01-25T21:47:16.503704+00:00 host kernel - - [<ffffffff811ff2ec>] __writeback_inodes_wb+0x8c/0xc0 <4>1 2017-01-25T21:47:16.503707+00:00 host kernel - - [<ffffffff811ff614>] wb_writeback+0x244/0x3b0 <4>1 2017-01-25T21:47:16.503710+00:00 host kernel - - [<ffffffff81346e2b>] ? find_next_bit+0xb/0x10 <4>1 2017-01-25T21:47:16.503713+00:00 host kernel - - [<ffffffff811f9210>] ? get_nr_dirty_pages+0x30/0x60 <4>1 2017-01-25T21:47:16.503716+00:00 host kernel - - [<ffffffff811ff923>] wb_do_writeback+0x1a3/0x280 <4>1 2017-01-25T21:47:16.503719+00:00 host kernel - - [<ffffffff811ffa7d>] wb_workfn+0x7d/0x1e0 <4>1 2017-01-25T21:47:16.503721+00:00 host kernel - - [<ffffffff816835d6>] ? __schedule+0x306/0xa30 <4>1 2017-01-25T21:47:16.503724+00:00 host kernel - - [<ffffffff8109381a>] ? queue_delayed_work_on+0x3a/0x50 <4>1 2017-01-25T21:47:16.503727+00:00 host kernel - - [<ffffffff81093d15>] process_one_work+0x165/0x500 <4>1 2017-01-25T21:47:16.503730+00:00 host kernel - - [<ffffffff81687146>] ? _raw_spin_unlock_irqrestore+0x16/0x20 <4>1 2017-01-25T21:47:16.503732+00:00 host kernel - - [<ffffffff810da515>] ? mod_timer+0x105/0x1d0 <4>1 2017-01-25T21:47:16.503734+00:00 host kernel - - [<ffffffff81094ae3>] worker_thread+0x133/0x610 <4>1 2017-01-25T21:47:16.503738+00:00 host kernel - - [<ffffffff810a2f32>] ? default_wake_function+0x12/0x20 <4>1 2017-01-25T21:47:16.503741+00:00 host kernel - - [<ffffffff810b66a6>] ? __wake_up_common+0x56/0x90 <4>1 2017-01-25T21:47:16.503744+00:00 host kernel - - [<ffffffff810949b0>] ? create_worker+0x1e0/0x1e0 <4>1 2017-01-25T21:47:16.503747+00:00 host kernel - - [<ffffffff810949b0>] ? create_worker+0x1e0/0x1e0 <4>1 2017-01-25T21:47:16.503750+00:00 host kernel - - [<ffffffff810994fc>] kthread+0xcc/0xf0 <4>1 2017-01-25T21:47:16.503753+00:00 host kernel - - [<ffffffff810a1c8e>] ? schedule_tail+0x1e/0xc0 <4>1 2017-01-25T21:47:16.503807+00:00 host kernel - - [<ffffffff81099430>] ? kthread_freezable_should_stop+0x70/0x70 <4>1 2017-01-25T21:47:16.503812+00:00 host kernel - - [<ffffffff81687a8f>] ret_from_fork+0x3f/0x70 <4>1 2017-01-25T21:47:16.503815+00:00 host kernel - - [<ffffffff81099430>] ? kthread_freezable_should_stop+0x70/0x70 Is it fixed for you in 4.4.55? |