Hi. Forgive me if this report is filed against the wrong component; from the stack traces alone I cannot judge where the root cause is. I have a server (Arch Linux based) running as a MythTV backend and also as an NFS server. The server is generally rock-solid stable (24/7, uptime measured in weeks, interrupted only by maintenance or software upgrades). Recently I was copying a 35 GB file from a Mac desktop to the server over NFS. Within 5-10 minutes of starting the copy, all user tasks on the server became blocked (telnet, HTTP server, myth backend, etc.), while the NFS file copy itself continued fine. When the NFS transfer finished, all processes resumed. The kernel log reported the messages below. I'm not sure whether this is a kernel issue, but it was definitely correlated with the NFS transfer (as soon as the NFS copy finished, all server tasks resumed).

Apr 15 21:04:50 [461880.890543] INFO: task kjournald:9163 blocked for more than 120 seconds.
Apr 15 21:04:50 [461880.890582] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:04:50 [461880.890620] kjournald D 00000001083ffddf 0 9163 2 0x00000000
Apr 15 21:04:50 [461880.890624] ffff8800b6577c10 0000000000000046 00000001083ffddf ffffffff810ffd9a
Apr 15 21:04:50 [461880.890628] ffff8801195a9590 ffff8800b6577fd8 ffff8800b6577fd8 ffff8800b6577fd8
Apr 15 21:04:50 [461880.890631] ffff8800105323f0 ffff8801195a9590 ffff88011fffb700 0000000100000000
Apr 15 21:04:50 [461880.890634] Call Trace:
Apr 15 21:04:50 [461880.890640] [<ffffffff810ffd9a>] ? free_pcppages_bulk+0x42a/0x470
Apr 15 21:04:50 [461880.890645] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:04:50 [461880.890648] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:04:50 [461880.890651] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:04:50 [461880.890653] [<ffffffff813f408f>] io_schedule+0x8f/0xd0
Apr 15 21:04:50 [461880.890656] [<ffffffff811860ee>] sleep_on_buffer+0xe/0x20
Apr 15 21:04:50 [461880.890658] [<ffffffff813f483f>] __wait_on_bit+0x5f/0x90
Apr 15 21:04:50 [461880.890660] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:04:50 [461880.890663] [<ffffffff813f48ec>] out_of_line_wait_on_bit+0x7c/0x90
Apr 15 21:04:50 [461880.890666] [<ffffffff8107f840>] ? autoremove_wake_function+0x40/0x40
Apr 15 21:04:50 [461880.890669] [<ffffffff811860de>] __wait_on_buffer+0x2e/0x30
Apr 15 21:04:50 [461880.890687] [<ffffffffa00a5b2c>] journal_commit_transaction+0xcac/0x13c0 [jbd]
Apr 15 21:04:50 [461880.890690] [<ffffffff8106cd66>] ? try_to_del_timer_sync+0x96/0x160
Apr 15 21:04:50 [461880.890696] [<ffffffffa00a8a76>] kjournald+0xe6/0x250 [jbd]
Apr 15 21:04:50 [461880.890699] [<ffffffff8107f800>] ? abort_exclusive_wait+0xb0/0xb0
Apr 15 21:04:50 [461880.890703] [<ffffffffa00a8990>] ? commit_timeout+0x10/0x10 [jbd]
Apr 15 21:04:50 [461880.890706] [<ffffffff8107eeac>] kthread+0x8c/0xa0
Apr 15 21:04:50 [461880.890709] [<ffffffff813f8864>] kernel_thread_helper+0x4/0x10
Apr 15 21:04:50 [461880.890711] [<ffffffff8107ee20>] ? kthread_worker_fn+0x190/0x190
Apr 15 21:04:50 [461880.890714] [<ffffffff813f8860>] ? gs_change+0x13/0x13
Apr 15 21:04:50 [461880.890718] INFO: task mythbackend:15037 blocked for more than 120 seconds.
Apr 15 21:04:50 [461880.890766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:04:50 [461880.890803] mythbackend D 00000001083ff1a3 0 15037 1 0x00000000
Apr 15 21:04:50 [461880.890807] ffff88002045de78 0000000000000086 ffff88002045deb0 ffff88002045de48
Apr 15 21:04:50 [461880.890810] ffff8801090e5d70 ffff88002045dfd8 ffff88002045dfd8 ffff88002045dfd8
Apr 15 21:04:50 [461880.890813] ffffffff8169b020 ffff8801090e5d70 ffff88001b7b3f08 ffff88001b7b3f10
Apr 15 21:04:50 [461880.890815] Call Trace:
Apr 15 21:04:50 [461880.890819] [<ffffffff81103ac5>] ? pagevec_lookup_tag+0x25/0x40
Apr 15 21:04:50 [461880.890822] [<ffffffff810f7ecf>] ? filemap_fdatawait_range+0x9f/0x1a0
Apr 15 21:04:50 [461880.890824] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:04:50 [461880.890827] [<ffffffff813f5629>] __mutex_lock_slowpath+0x139/0x330
Apr 15 21:04:50 [461880.890830] [<ffffffff813f5836>] mutex_lock+0x16/0x30
Apr 15 21:04:50 [461880.890832] [<ffffffff81183fb2>] vfs_fsync_range+0x52/0xa0
Apr 15 21:04:50 [461880.890834] [<ffffffff8118406c>] vfs_fsync+0x1c/0x20
Apr 15 21:04:50 [461880.890837] [<ffffffff811843d6>] sys_fdatasync+0x36/0x50
Apr 15 21:04:50 [461880.890839] [<ffffffff813f7742>] system_call_fastpath+0x16/0x1b
Apr 15 21:04:50 [461880.890843] INFO: task mythbackend:17562 blocked for more than 120 seconds.
Apr 15 21:04:50 [461880.890876] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:04:50 [461880.890913] mythbackend D 00000001083ff1a0 0 17562 1 0x00000000
Apr 15 21:04:50 [461880.890916] ffff880016999e78 0000000000000086 ffff880016999eb0 ffff880016999e48
Apr 15 21:04:50 [461880.890919] ffff88010d1df300 ffff880016999fd8 ffff880016999fd8 ffff880016999fd8
Apr 15 21:04:50 [461880.890921] ffffffff8169b020 ffff88010d1df300 ffff880000000001 ffff88006172e0d0
Apr 15 21:04:50 [461880.890924] Call Trace:
Apr 15 21:04:50 [461880.890927] [<ffffffff81103ac5>] ? pagevec_lookup_tag+0x25/0x40
Apr 15 21:04:50 [461880.890929] [<ffffffff810f7ecf>] ? filemap_fdatawait_range+0x9f/0x1a0
Apr 15 21:04:50 [461880.890932] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:04:50 [461880.890934] [<ffffffff813f5629>] __mutex_lock_slowpath+0x139/0x330
Apr 15 21:04:50 [461880.890937] [<ffffffff813f5836>] mutex_lock+0x16/0x30
Apr 15 21:04:50 [461880.890939] [<ffffffff81183fb2>] vfs_fsync_range+0x52/0xa0
Apr 15 21:04:50 [461880.890941] [<ffffffff8118406c>] vfs_fsync+0x1c/0x20
Apr 15 21:04:50 [461880.890943] [<ffffffff811843d6>] sys_fdatasync+0x36/0x50
Apr 15 21:04:50 [461880.890946] [<ffffffff813f7742>] system_call_fastpath+0x16/0x1b
Apr 15 21:04:50 [461880.890949] INFO: task mythcommflag:17595 blocked for more than 120 seconds.
Apr 15 21:04:50 [461880.890983] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:04:50 [461880.891019] mythcommflag D 0000000000000002 0 17595 17594 0x00000000
Apr 15 21:04:50 [461880.891022] ffff880012883738 0000000000000086 ffff8800264afbe8 000000003ded1f97
Apr 15 21:04:50 [461880.891025] ffff88010d1ddd70 ffff880012883fd8 ffff880012883fd8 ffff880012883fd8
Apr 15 21:04:50 [461880.891028] ffff8800681e5640 ffff88010d1ddd70 ffff880012883fd8 ffff88011fc57c08
Apr 15 21:04:50 [461880.891030] Call Trace:
Apr 15 21:04:50 [461880.891033] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:04:50 [461880.891036] [<ffffffff81089edd>] ? ktime_get_ts+0xad/0xe0
Apr 15 21:04:50 [461880.891038] [<ffffffff810f7bd0>] ? __lock_page+0x70/0x70
Apr 15 21:04:50 [461880.891040] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:04:50 [461880.891043] [<ffffffff813f408f>] io_schedule+0x8f/0xd0
Apr 15 21:04:50 [461880.891045] [<ffffffff810f7bde>] sleep_on_page+0xe/0x20
Apr 15 21:04:50 [461880.891047] [<ffffffff813f483f>] __wait_on_bit+0x5f/0x90
Apr 15 21:04:50 [461880.891050] [<ffffffff810f7dc8>] wait_on_page_bit+0x78/0x80
Apr 15 21:04:50 [461880.891052] [<ffffffff8107f840>] ? autoremove_wake_function+0x40/0x40
Apr 15 21:04:50 [461880.891055] [<ffffffff8110a1a5>] shrink_page_list+0x405/0x910
Apr 15 21:04:50 [461880.891058] [<ffffffff811050d6>] ? __pagevec_release+0x26/0x40
Apr 15 21:04:50 [461880.891061] [<ffffffff813f689f>] ? _raw_spin_unlock_irq+0xf/0x40
Apr 15 21:04:50 [461880.891063] [<ffffffff8110ac71>] shrink_inactive_list+0x2e1/0x470
Apr 15 21:04:50 [461880.891065] [<ffffffff8121dca7>] ? kobject_put+0x27/0x60
Apr 15 21:04:50 [461880.891068] [<ffffffff812e7af7>] ? put_device+0x17/0x20
Apr 15 21:04:50 [461880.891070] [<ffffffff8110b60c>] shrink_zone+0x4bc/0x660
Apr 15 21:04:50 [461880.891073] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:04:50 [461880.891075] [<ffffffff8110bade>] do_try_to_free_pages+0xae/0x400
Apr 15 21:04:50 [461880.891078] [<ffffffff8110c1a4>] try_to_free_pages+0xb4/0x1e0
Apr 15 21:04:50 [461880.891080] [<ffffffff811005fc>] __alloc_pages_nodemask+0x52c/0x850
Apr 15 21:04:50 [461880.891084] [<ffffffff81127f8d>] ? page_add_new_anon_rmap+0x9d/0xb0
Apr 15 21:04:50 [461880.891087] [<ffffffff81139f7a>] alloc_pages_vma+0x9a/0x180
Apr 15 21:04:50 [461880.891090] [<ffffffff8114b2d2>] do_huge_pmd_anonymous_page+0x152/0x380
Apr 15 21:04:50 [461880.891092] [<ffffffff8111fbe6>] handle_mm_fault+0x256/0x350
Apr 15 21:04:50 [461880.891095] [<ffffffff8103993a>] do_page_fault+0x14a/0x4d0
Apr 15 21:04:50 [461880.891098] [<ffffffff8112405f>] ? do_brk+0x21f/0x320
Apr 15 21:04:50 [461880.891100] [<ffffffff813f7185>] page_fault+0x25/0x30
Apr 15 21:06:50 [462000.890513] INFO: task kjournald:9163 blocked for more than 120 seconds.
Apr 15 21:06:50 [462000.890552] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:06:50 [462000.890590] kjournald D 00000001083ffddf 0 9163 2 0x00000000
Apr 15 21:06:50 [462000.890594] ffff8800b6577c10 0000000000000046 00000001083ffddf ffffffff810ffd9a
Apr 15 21:06:50 [462000.890598] ffff8801195a9590 ffff8800b6577fd8 ffff8800b6577fd8 ffff8800b6577fd8
Apr 15 21:06:50 [462000.890601] ffff8800105323f0 ffff8801195a9590 ffff88011fffb700 0000000100000000
Apr 15 21:06:50 [462000.890604] Call Trace:
Apr 15 21:06:50 [462000.890610] [<ffffffff810ffd9a>] ? free_pcppages_bulk+0x42a/0x470
Apr 15 21:06:50 [462000.890614] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:06:50 [462000.890618] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:06:50 [462000.890621] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:06:50 [462000.890623] [<ffffffff813f408f>] io_schedule+0x8f/0xd0
Apr 15 21:06:50 [462000.890625] [<ffffffff811860ee>] sleep_on_buffer+0xe/0x20
Apr 15 21:06:50 [462000.890628] [<ffffffff813f483f>] __wait_on_bit+0x5f/0x90
Apr 15 21:06:50 [462000.890630] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:06:50 [462000.890633] [<ffffffff813f48ec>] out_of_line_wait_on_bit+0x7c/0x90
Apr 15 21:06:50 [462000.890636] [<ffffffff8107f840>] ? autoremove_wake_function+0x40/0x40
Apr 15 21:06:50 [462000.890638] [<ffffffff811860de>] __wait_on_buffer+0x2e/0x30
Apr 15 21:06:50 [462000.890657] [<ffffffffa00a5b2c>] journal_commit_transaction+0xcac/0x13c0 [jbd]
Apr 15 21:06:50 [462000.890660] [<ffffffff8106cd66>] ? try_to_del_timer_sync+0x96/0x160
Apr 15 21:06:50 [462000.890666] [<ffffffffa00a8a76>] kjournald+0xe6/0x250 [jbd]
Apr 15 21:06:50 [462000.890668] [<ffffffff8107f800>] ? abort_exclusive_wait+0xb0/0xb0
Apr 15 21:06:50 [462000.890673] [<ffffffffa00a8990>] ? commit_timeout+0x10/0x10 [jbd]
Apr 15 21:06:50 [462000.890676] [<ffffffff8107eeac>] kthread+0x8c/0xa0
Apr 15 21:06:50 [462000.890678] [<ffffffff813f8864>] kernel_thread_helper+0x4/0x10
Apr 15 21:06:50 [462000.890681] [<ffffffff8107ee20>] ? kthread_worker_fn+0x190/0x190
Apr 15 21:06:50 [462000.890683] [<ffffffff813f8860>] ? gs_change+0x13/0x13
Apr 15 21:06:50 [462000.890688] INFO: task mythbackend:15037 blocked for more than 120 seconds.
Apr 15 21:06:50 [462000.890721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:06:50 [462000.890758] mythbackend D 00000001083ff1a3 0 15037 1 0x00000000
Apr 15 21:06:50 [462000.890761] ffff88002045de78 0000000000000086 ffff88002045deb0 ffff88002045de48
Apr 15 21:06:50 [462000.890765] ffff8801090e5d70 ffff88002045dfd8 ffff88002045dfd8 ffff88002045dfd8
Apr 15 21:06:50 [462000.890767] ffffffff8169b020 ffff8801090e5d70 ffff88001b7b3f08 ffff88001b7b3f10
Apr 15 21:06:50 [462000.890770] Call Trace:
Apr 15 21:06:50 [462000.890773] [<ffffffff81103ac5>] ? pagevec_lookup_tag+0x25/0x40
Apr 15 21:06:50 [462000.890777] [<ffffffff810f7ecf>] ? filemap_fdatawait_range+0x9f/0x1a0
Apr 15 21:06:50 [462000.890779] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:06:50 [462000.890782] [<ffffffff813f5629>] __mutex_lock_slowpath+0x139/0x330
Apr 15 21:06:50 [462000.890784] [<ffffffff813f5836>] mutex_lock+0x16/0x30
Apr 15 21:06:50 [462000.890787] [<ffffffff81183fb2>] vfs_fsync_range+0x52/0xa0
Apr 15 21:06:50 [462000.890789] [<ffffffff8118406c>] vfs_fsync+0x1c/0x20
Apr 15 21:06:50 [462000.890791] [<ffffffff811843d6>] sys_fdatasync+0x36/0x50
Apr 15 21:06:50 [462000.890794] [<ffffffff813f7742>] system_call_fastpath+0x16/0x1b
Apr 15 21:06:50 [462000.890797] INFO: task mythbackend:17562 blocked for more than 120 seconds.
Apr 15 21:06:50 [462000.890830] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:06:50 [462000.890867] mythbackend D 00000001083ff1a0 0 17562 1 0x00000000
Apr 15 21:06:50 [462000.890870] ffff880016999e78 0000000000000086 ffff880016999eb0 ffff880016999e48
Apr 15 21:06:50 [462000.890873] ffff88010d1df300 ffff880016999fd8 ffff880016999fd8 ffff880016999fd8
Apr 15 21:06:50 [462000.890876] ffffffff8169b020 ffff88010d1df300 ffff880000000001 ffff88006172e0d0
Apr 15 21:06:50 [462000.890879] Call Trace:
Apr 15 21:06:50 [462000.890881] [<ffffffff81103ac5>] ? pagevec_lookup_tag+0x25/0x40
Apr 15 21:06:50 [462000.890884] [<ffffffff810f7ecf>] ? filemap_fdatawait_range+0x9f/0x1a0
Apr 15 21:06:50 [462000.890886] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:06:50 [462000.890889] [<ffffffff813f5629>] __mutex_lock_slowpath+0x139/0x330
Apr 15 21:06:50 [462000.890891] [<ffffffff813f5836>] mutex_lock+0x16/0x30
Apr 15 21:06:50 [462000.890893] [<ffffffff81183fb2>] vfs_fsync_range+0x52/0xa0
Apr 15 21:06:50 [462000.890896] [<ffffffff8118406c>] vfs_fsync+0x1c/0x20
Apr 15 21:06:50 [462000.890898] [<ffffffff811843d6>] sys_fdatasync+0x36/0x50
Apr 15 21:06:50 [462000.890901] [<ffffffff813f7742>] system_call_fastpath+0x16/0x1b
Apr 15 21:06:50 [462000.890904] INFO: task mythcommflag:17595 blocked for more than 120 seconds.
Apr 15 21:06:50 [462000.892048] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:06:50 [462000.893248] mythcommflag D 0000000000000002 0 17595 17594 0x00000000
Apr 15 21:06:50 [462000.893254] ffff880012883738 0000000000000086 ffff8800264afbe8 000000003ded1f97
Apr 15 21:06:50 [462000.893257] ffff88010d1ddd70 ffff880012883fd8 ffff880012883fd8 ffff880012883fd8
Apr 15 21:06:50 [462000.893260] ffff8800681e5640 ffff88010d1ddd70 ffff880012883fd8 ffff88011fc57c08
Apr 15 21:06:50 [462000.893263] Call Trace:
Apr 15 21:06:50 [462000.893266] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:06:50 [462000.893269] [<ffffffff81089edd>] ? ktime_get_ts+0xad/0xe0
Apr 15 21:06:50 [462000.893272] [<ffffffff810f7bd0>] ? __lock_page+0x70/0x70
Apr 15 21:06:50 [462000.893274] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:06:50 [462000.893276] [<ffffffff813f408f>] io_schedule+0x8f/0xd0
Apr 15 21:06:50 [462000.893279] [<ffffffff810f7bde>] sleep_on_page+0xe/0x20
Apr 15 21:06:50 [462000.893281] [<ffffffff813f483f>] __wait_on_bit+0x5f/0x90
Apr 15 21:06:50 [462000.893284] [<ffffffff810f7dc8>] wait_on_page_bit+0x78/0x80
Apr 15 21:06:50 [462000.893286] [<ffffffff8107f840>] ? autoremove_wake_function+0x40/0x40
Apr 15 21:06:50 [462000.893289] [<ffffffff8110a1a5>] shrink_page_list+0x405/0x910
Apr 15 21:06:50 [462000.893291] [<ffffffff811050d6>] ? __pagevec_release+0x26/0x40
Apr 15 21:06:50 [462000.893294] [<ffffffff813f689f>] ? _raw_spin_unlock_irq+0xf/0x40
Apr 15 21:06:50 [462000.893297] [<ffffffff8110ac71>] shrink_inactive_list+0x2e1/0x470
Apr 15 21:06:50 [462000.893300] [<ffffffff8121dca7>] ? kobject_put+0x27/0x60
Apr 15 21:06:50 [462000.893303] [<ffffffff812e7af7>] ? put_device+0x17/0x20
Apr 15 21:06:50 [462000.893305] [<ffffffff8110b60c>] shrink_zone+0x4bc/0x660
Apr 15 21:06:50 [462000.893307] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:06:50 [462000.893310] [<ffffffff8110bade>] do_try_to_free_pages+0xae/0x400
Apr 15 21:06:50 [462000.893312] [<ffffffff8110c1a4>] try_to_free_pages+0xb4/0x1e0
Apr 15 21:06:50 [462000.893315] [<ffffffff811005fc>] __alloc_pages_nodemask+0x52c/0x850
Apr 15 21:06:50 [462000.893318] [<ffffffff81127f8d>] ? page_add_new_anon_rmap+0x9d/0xb0
Apr 15 21:06:50 [462000.893321] [<ffffffff81139f7a>] alloc_pages_vma+0x9a/0x180
Apr 15 21:06:50 [462000.893324] [<ffffffff8114b2d2>] do_huge_pmd_anonymous_page+0x152/0x380
Apr 15 21:06:50 [462000.893326] [<ffffffff8111fbe6>] handle_mm_fault+0x256/0x350
Apr 15 21:06:50 [462000.893329] [<ffffffff8103993a>] do_page_fault+0x14a/0x4d0
Apr 15 21:06:50 [462000.893332] [<ffffffff8112405f>] ? do_brk+0x21f/0x320
Apr 15 21:06:50 [462000.893334] [<ffffffff813f7185>] page_fault+0x25/0x30
Apr 15 21:08:50 [462120.890526] INFO: task kjournald:9163 blocked for more than 120 seconds.
Apr 15 21:08:50 [462120.891707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:08:50 [462120.892890] kjournald D 00000001083ffddf 0 9163 2 0x00000000
Apr 15 21:08:50 [462120.892894] ffff8800b6577c10 0000000000000046 00000001083ffddf ffffffff810ffd9a
Apr 15 21:08:50 [462120.892898] ffff8801195a9590 ffff8800b6577fd8 ffff8800b6577fd8 ffff8800b6577fd8
Apr 15 21:08:50 [462120.892901] ffff8800105323f0 ffff8801195a9590 ffff88011fffb700 0000000100000000
Apr 15 21:08:50 [462120.892904] Call Trace:
Apr 15 21:08:50 [462120.892911] [<ffffffff810ffd9a>] ? free_pcppages_bulk+0x42a/0x470
Apr 15 21:08:50 [462120.892915] [<ffffffff81012f69>] ? read_tsc+0x9/0x20
Apr 15 21:08:50 [462120.892919] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:08:50 [462120.892922] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:08:50 [462120.892924] [<ffffffff813f408f>] io_schedule+0x8f/0xd0
Apr 15 21:08:50 [462120.892927] [<ffffffff811860ee>] sleep_on_buffer+0xe/0x20
Apr 15 21:08:50 [462120.892929] [<ffffffff813f483f>] __wait_on_bit+0x5f/0x90
Apr 15 21:08:50 [462120.892932] [<ffffffff811860e0>] ? __wait_on_buffer+0x30/0x30
Apr 15 21:08:50 [462120.892934] [<ffffffff813f48ec>] out_of_line_wait_on_bit+0x7c/0x90
Apr 15 21:08:50 [462120.892938] [<ffffffff8107f840>] ? autoremove_wake_function+0x40/0x40
Apr 15 21:08:50 [462120.892940] [<ffffffff811860de>] __wait_on_buffer+0x2e/0x30
Apr 15 21:08:50 [462120.892961] [<ffffffffa00a5b2c>] journal_commit_transaction+0xcac/0x13c0 [jbd]
Apr 15 21:08:50 [462120.892965] [<ffffffff8106cd66>] ? try_to_del_timer_sync+0x96/0x160
Apr 15 21:08:50 [462120.892970] [<ffffffffa00a8a76>] kjournald+0xe6/0x250 [jbd]
Apr 15 21:08:50 [462120.892973] [<ffffffff8107f800>] ? abort_exclusive_wait+0xb0/0xb0
Apr 15 21:08:50 [462120.892978] [<ffffffffa00a8990>] ? commit_timeout+0x10/0x10 [jbd]
Apr 15 21:08:50 [462120.892980] [<ffffffff8107eeac>] kthread+0x8c/0xa0
Apr 15 21:08:50 [462120.892983] [<ffffffff813f8864>] kernel_thread_helper+0x4/0x10
Apr 15 21:08:50 [462120.892986] [<ffffffff8107ee20>] ? kthread_worker_fn+0x190/0x190
Apr 15 21:08:50 [462120.892988] [<ffffffff813f8860>] ? gs_change+0x13/0x13
Apr 15 21:08:50 [462120.893000] INFO: task mythbackend:15037 blocked for more than 120 seconds.
Apr 15 21:08:50 [462120.894190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:08:50 [462120.895394] mythbackend D 00000001083ff1a3 0 15037 1 0x00000000
Apr 15 21:08:50 [462120.895397] ffff88002045de78 0000000000000086 ffff88002045deb0 ffff88002045de48
Apr 15 21:08:50 [462120.895400] ffff8801090e5d70 ffff88002045dfd8 ffff88002045dfd8 ffff88002045dfd8
Apr 15 21:08:50 [462120.895403] ffffffff8169b020 ffff8801090e5d70 ffff88001b7b3f08 ffff88001b7b3f10
Apr 15 21:08:50 [462120.895406] Call Trace:
Apr 15 21:08:50 [462120.895409] [<ffffffff81103ac5>] ? pagevec_lookup_tag+0x25/0x40
Apr 15 21:08:50 [462120.895413] [<ffffffff810f7ecf>] ? filemap_fdatawait_range+0x9f/0x1a0
Apr 15 21:08:50 [462120.895415] [<ffffffff813f3fdf>] schedule+0x3f/0x60
Apr 15 21:08:50 [462120.895418] [<ffffffff813f5629>] __mutex_lock_slowpath+0x139/0x330
Apr 15 21:08:50 [462120.895420] [<ffffffff813f5836>] mutex_lock+0x16/0x30
Apr 15 21:08:50 [462120.895423] [<ffffffff81183fb2>] vfs_fsync_range+0x52/0xa0
Apr 15 21:08:50 [462120.895425] [<ffffffff8118406c>] vfs_fsync+0x1c/0x20
Apr 15 21:08:50 [462120.895427] [<ffffffff811843d6>] sys_fdatasync+0x36/0x50
Apr 15 21:08:50 [462120.895430] [<ffffffff813f7742>] system_call_fastpath+0x16/0x1b
No idea. It looks like kjournald and a bunch of other processes are stuck waiting for I/O. What kind of disk is this? Has this only happened this one time, or is it reproducible?
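(Editorial note, not a fix: the traces are consistent with dirty NFS pages filling the page cache faster than they can be written back, so any task that hits fsync or memory reclaim stalls behind the transfer. One hedged workaround sketch is to bound dirty memory with the standard vm writeback sysctls; the values below are illustrative, not taken from this report.)

```shell
# /etc/sysctl.d/90-writeback.conf -- illustrative values, tune per workload.
# Start background writeback earlier than the default...
vm.dirty_background_ratio = 5
# ...and cap total dirty memory so one large copy cannot consume
# most of RAM before the writer is throttled.
vm.dirty_ratio = 10
```

Applied at runtime with `sysctl -p /etc/sysctl.d/90-writeback.conf`. This only limits how large the stall window can grow; it does not explain why blocked tasks need more than 120 seconds to make progress.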
I can reproduce the case. I have set up the automounter to mount /data/video via NFSv4 from my video server (otto) (the bug is also hit with NFSv3, I think). I try:

dd if=/dev/zero of=/data/video/testfile bs=1024k count=100 -> 22 MB/s, works great
dd if=/dev/zero of=/data/video/testfile bs=1024k count=1000 -> 6 MB/s, works great
dd if=/dev/zero of=/data/video/testfile bs=1024k count=10000 -> stuck after some time, some processes block.

I logged in via ssh and tried to kill dd. To find the process I did:

ps -ef | grep dd -> this command blocks!

So I did strace -f ps -ef ...

stat("/proc/2528", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/2528/stat", O_RDONLY) = 6
read(6, "2528 (plasma-desktop) D 1 2411 2"..., 1023) = 249
close(6) = 0
open("/proc/2528/status", O_RDONLY) = 6
read(6, "Name:\tplasma-desktop\nState:\tD (d"..., 1023) = 910
close(6) = 0
open("/proc/2528/cmdline", O_RDONLY) = 6
read(6, <unfinished ...>

The last line comes from my Ctrl-C, I think. So something in /proc is blocking? ps -u myusername works, by the way, and killing dd makes the system usable again within a few seconds.

My guess: if the I/O waiting to be written to NFS is larger than memory (+ swap?), then I hit the bug. The system has 4 GB of memory and 4 GB of swap, and the dd generates 10 GB of data.

More information: openSUSE 12.1, all patches installed.

uname -a
Linux mamba 3.1.10-1.9-desktop #1 SMP PREEMPT Thu Apr 5 18:48:38 UTC 2012 (4a97ec8) x86_64 x86_64 x86_64 GNU/Linux

The NFS server is openSUSE 11.4, and:

mount | grep video
otto:/video/ on /data/video type nfs4 (rw,relatime,vers=4,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.135.136,minorversion=0,local_lock=none,addr=192.168.135.104)
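(Editorial note: the memory-pressure guess above can be checked while the dd runs by watching the Dirty and Writeback counters in /proc/meminfo; if Dirty climbs toward the dirty limit and stays pinned there while tasks block, the copy really is saturating the page cache. A small sketch, where the helper name is mine and not from this thread:)

```shell
#!/bin/sh
# Print the Dirty and Writeback counters from a meminfo-format file.
# Pass /proc/meminfo on a live system; any file in that format works,
# which also makes the helper easy to test offline.
dirty_stats() {
    awk '/^(Dirty|Writeback):/ { print $1, $2, $3 }' "$1"
}

# Example: sample once per second while the dd test runs:
#   while sleep 1; do dirty_stats /proc/meminfo; done
```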
Hi, unfortunately I recently moved to 3.3.7. As soon as I catch this issue under 3.3.7, I'll post immediately... -br
Is this still happening?
I'm not able to reproduce this problem with 3.3.8. But I should also mention that I switched from NFS to CIFS due to problems with very slow speeds (1-3 MB/s) for Mac->Linux NFS transfers. Strange, as Linux->Mac works OK (70-90 MB/s) :-( br
OK, marking this unreproducible for now. "problems with v.slow speed (1-3MB/s) for MAC->Linux NFS transfers" That's writes from an OS X client to a Linux server? Possibly worth opening another bug if you want to look into that.
"That's writes from an OSX client to a Linux server? Possibly worth opening another bug if you want to look into that." Oh, thanks for the suggestion. I'll definitely do that!