Bug 199753

Summary: btrfs deadlock running rsync
Product: File System Reporter: Eli V (eliventer)
Component: btrfsAssignee: BTRFS virtual assignee (fs_btrfs)
Status: RESOLVED CODE_FIX    
Severity: high CC: dsterba, lakshmipathi.g
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 4.16.8, 4.17, 4.18.8 Subsystem:
Regression: No Bisected commit-id:

Description Eli V 2018-05-17 12:57:29 UTC
running an rsync -av --del as the only process hiting my btrfs backup
filesystem, using linux 4.16.8. rsync is now stuck and so is all other access to the filesystem. Stack for the rsync:

[<0>] balance_dirty_pages_ratelimited+0x270/0xca0
[<0>] __btrfs_buffered_write+0x3ae/0x730 [btrfs]
[<0>] btrfs_file_write_iter+0x36e/0x510 [btrfs]
[<0>] __vfs_write+0xcf/0x130
[<0>] vfs_write+0xab/0x190
[<0>] SyS_write+0x3d/0x90
[<0>] do_syscall_64+0x55/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff

dmesg:
[245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251
btrfs_tree_lock+0x1be/0x1d0 [btrfs]
[245043.381571] Modules linked in: ipmi_si mpt3sas raid_class
scsi_transport_sas dell_rbu nfsv3 nfsv4 nfs fscache ext2 mgag200
intel_powerclamp i2c_algo_bit coretemp crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel drm_kms_helper aes_x86_64 joydev evdev
crypto_simd iTCO_wdt cryptd syscopyarea iTCO_vendor_support
sysfillrect sysimgblt dcdbas sg fb_sys_fops ttm glue_helper drm pcspkr
serio_raw ipmi_devintf ipmi_msghandler lpc_ich acpi_power_meter
i7core_edac mfd_core button nfsd auth_rpcgss oid_registry nfs_acl
lockd grace sunrpc loop autofs4 ext4 crc32c_generic crc16 mbcache jbd2
btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic
usbhid hid sd_mod psmouse crc32c_intel i2c_core ehci_pci uhci_hcd
ehci_hcd ixgbe mdio megaraid_sas usbcore ptp usb_common pps_core
scsi_mod bnx2
[245043.391437]  [last unloaded: ipmi_si]
[245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
  W I      4.16.8 #1
[245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
[245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
[245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX:
0000000000000001
[245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI:
ffff8807e20a3d20
[245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09:
ffff880000000000
[245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12:
ffff8807cb220630
[245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15:
ffff88041fbdf000
[245043.404780] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000)
knlGS:0000000000000000
[245043.406050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4:
00000000000206e0
[245043.408670] Call Trace:
[245043.409977]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245043.411278]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245043.412572]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245043.413922]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245043.415216]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245043.416487]  find_free_extent+0xcd0/0xf60 [btrfs]
[245043.417813]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245043.419105]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245043.420378]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245043.421652]  btrfs_cow_block+0xee/0x190 [btrfs]
[245043.422979]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245043.424279]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245043.425538]  ? iput+0x72/0x1e0
[245043.426798]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245043.428131]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245043.429419]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245043.430712]  ? start_transaction+0x8e/0x410 [btrfs]
[245043.432006]  transaction_kthread+0x184/0x1a0 [btrfs]
[245043.433341]  kthread+0xf0/0x130
[245043.434628]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245043.435928]  ? kthread_create_worker_on_cpu+0x40/0x40
[245043.437236]  ret_from_fork+0x1f/0x30
[245043.438472] Code: 43 58 85 c0 75 2c f0 ff 43 58 f0 ff 43 44 65 48
8b 04 25 00 4d 01 00 8b 80 00 04 00 00 89 43 40 48 83 c4 28 5b 5d 41
5c 41 5d c3 <0f> 0b e9 60 fe ff ff 0f 0b eb d0 0f 1f 80 00 00 00 00 8b
47 4c
[245043.441054] ---[ end trace 15abaa2aaf36827f ]---
[245166.251274] btrfs-transacti D    0  2608      2 0x80000000
[245166.252508] Call Trace:
[245166.253964]  ? __schedule+0x2b1/0x770
[245166.255100]  schedule+0x2d/0x80
[245166.256263]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245166.257484]  ? wait_woken+0x80/0x80
[245166.258562]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245166.259664]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245166.260863]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245166.261907]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245166.262961]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245166.264034]  find_free_extent+0xcd0/0xf60 [btrfs]
[245166.265260]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245166.266299]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245166.267344]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245166.268555]  btrfs_cow_block+0xee/0x190 [btrfs]
[245166.269595]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245166.270627]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245166.271665]  ? iput+0x72/0x1e0
[245166.272859]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245166.273885]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245166.274915]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245166.275980]  ? start_transaction+0x8e/0x410 [btrfs]
[245166.277192]  transaction_kthread+0x184/0x1a0 [btrfs]
[245166.278205]  kthread+0xf0/0x130
[245166.279223]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245166.280264]  ? kthread_create_worker_on_cpu+0x40/0x40
[245166.281471]  ret_from_fork+0x1f/0x30
[245287.078165] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245287.079269]       Tainted: G        W I      4.16.8 #1
[245287.080314] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245287.081383] btrfs-transacti D    0  2608      2 0x80000000
[245287.082695] Call Trace:
[245287.083756]  ? __schedule+0x2b1/0x770
[245287.084807]  schedule+0x2d/0x80
[245287.085875]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245287.086979]  ? wait_woken+0x80/0x80
[245287.088030]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245287.089083]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245287.090167]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245287.091213]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245287.092268]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245287.093295]  find_free_extent+0xcd0/0xf60 [btrfs]
[245287.094356]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245287.095348]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245287.096317]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245287.097280]  btrfs_cow_block+0xee/0x190 [btrfs]
[245287.098261]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245287.099199]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245287.100125]  ? iput+0x72/0x1e0
[245287.101060]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245287.102017]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245287.103040]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245287.104012]  ? start_transaction+0x8e/0x410 [btrfs]
[245287.104977]  transaction_kthread+0x184/0x1a0 [btrfs]
[245287.105932]  kthread+0xf0/0x130
[245287.106953]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245287.107916]  ? kthread_create_worker_on_cpu+0x40/0x40
[245287.108870]  ret_from_fork+0x1f/0x30
[245287.109831] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245287.110865]       Tainted: G        W I      4.16.8 #1
[245287.111839] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245287.112838] kworker/u34:7   D    0  6616      2 0x80000000
[245287.113846] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245287.114913] Call Trace:
[245287.115912]  ? __schedule+0x2b1/0x770
[245287.116921]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245287.117916]  schedule+0x2d/0x80
[245287.118973]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245287.119968]  ? wait_woken+0x80/0x80
[245287.120960]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245287.121958]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245287.123019]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245287.124016]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245287.125014]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245287.126013]  find_free_extent+0xe23/0xf60 [btrfs]
[245287.127030]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245287.128008]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245287.128977]  run_delalloc_range+0xed/0x340 [btrfs]
[245287.129932]  writepage_delalloc+0xf9/0x160 [btrfs]
[245287.130916]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245287.131868]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245287.132828]  extent_writepages+0x3b/0x50 [btrfs]
[245287.133766]  do_writepages+0x12/0x60
[245287.134731]  __writeback_single_inode+0x38/0x300
[245287.135669]  writeback_sb_inodes+0x1c6/0x430
[245287.136602]  __writeback_inodes_wb+0x82/0xb0
[245287.137536]  wb_writeback+0x23e/0x2d0
[245287.138494]  ? wb_workfn+0x188/0x3d0
[245287.139412]  wb_workfn+0x188/0x3d0
[245287.140323]  process_one_work+0x1c0/0x390
[245287.141230]  worker_thread+0x42/0x3e0
[245287.142192]  kthread+0xf0/0x130
[245287.143088]  ? create_worker+0x170/0x170
[245287.143981]  ? kthread_create_worker_on_cpu+0x40/0x40
[245287.144880]  ret_from_fork+0x1f/0x30
[245407.908859] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245407.909846]       Tainted: G        W I      4.16.8 #1
[245407.910763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245407.911701] btrfs-transacti D    0  2608      2 0x80000000
[245407.912640] Call Trace:
[245407.913633]  ? __schedule+0x2b1/0x770
[245407.914560]  schedule+0x2d/0x80
[245407.915503]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245407.916426]  ? wait_woken+0x80/0x80
[245407.917410]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245407.918342]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245407.919258]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245407.920160]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245407.921071]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245407.921963]  find_free_extent+0xcd0/0xf60 [btrfs]
[245407.922847]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245407.923731]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245407.924615]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245407.925561]  btrfs_cow_block+0xee/0x190 [btrfs]
[245407.926447]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245407.927330]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245407.928204]  ? iput+0x72/0x1e0
[245407.929114]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245407.930011]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245407.930921]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245407.931825]  ? start_transaction+0x8e/0x410 [btrfs]
[245407.932722]  transaction_kthread+0x184/0x1a0 [btrfs]
[245407.933668]  kthread+0xf0/0x130
[245407.934557]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245407.935443]  ? kthread_create_worker_on_cpu+0x40/0x40
[245407.936331]  ret_from_fork+0x1f/0x30
[245407.937283] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245407.938200]       Tainted: G        W I      4.16.8 #1
[245407.939112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245407.940051] kworker/u34:7   D    0  6616      2 0x80000000
[245407.941056] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245407.942006] Call Trace:
[245407.942948]  ? __schedule+0x2b1/0x770
[245407.943905]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245407.944921]  schedule+0x2d/0x80
[245407.945898]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245407.946858]  ? wait_woken+0x80/0x80
[245407.947826]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245407.948871]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245407.949859]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245407.950844]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245407.951844]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245407.952920]  find_free_extent+0xe23/0xf60 [btrfs]
[245407.953928]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245407.954903]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245407.955866]  run_delalloc_range+0xed/0x340 [btrfs]
[245407.956879]  writepage_delalloc+0xf9/0x160 [btrfs]
[245407.957830]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245407.958775]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245407.959733]  extent_writepages+0x3b/0x50 [btrfs]
[245407.960671]  do_writepages+0x12/0x60
[245407.961707]  __writeback_single_inode+0x38/0x300
[245407.962644]  writeback_sb_inodes+0x1c6/0x430
[245407.963575]  __writeback_inodes_wb+0x82/0xb0
[245407.964510]  wb_writeback+0x23e/0x2d0
[245407.965494]  ? wb_workfn+0x188/0x3d0
[245407.966409]  wb_workfn+0x188/0x3d0
[245407.967317]  process_one_work+0x1c0/0x390
[245407.968224]  worker_thread+0x42/0x3e0
[245407.969230]  kthread+0xf0/0x130
[245407.970126]  ? create_worker+0x170/0x170
[245407.971019]  ? kthread_create_worker_on_cpu+0x40/0x40
[245407.971919]  ret_from_fork+0x1f/0x30
[245528.739550] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245528.740521]       Tainted: G        W I      4.16.8 #1
[245528.741437] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245528.742375] btrfs-transacti D    0  2608      2 0x80000000
[245528.743315] Call Trace:
[245528.744261]  ? __schedule+0x2b1/0x770
[245528.745187]  schedule+0x2d/0x80
[245528.746127]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.747048]  ? wait_woken+0x80/0x80
[245528.747980]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.748907]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.749823]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.750724]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.751633]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245528.752517]  find_free_extent+0xcd0/0xf60 [btrfs]
[245528.753400]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.754283]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245528.755167]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245528.756061]  btrfs_cow_block+0xee/0x190 [btrfs]
[245528.756943]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245528.757825]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245528.758699]  ? iput+0x72/0x1e0
[245528.759586]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245528.760483]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245528.761392]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245528.762295]  ? start_transaction+0x8e/0x410 [btrfs]
[245528.763192]  transaction_kthread+0x184/0x1a0 [btrfs]
[245528.764086]  kthread+0xf0/0x130
[245528.764976]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245528.765860]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.766744]  ret_from_fork+0x1f/0x30
[245528.767642] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245528.768556]       Tainted: G        W I      4.16.8 #1
[245528.769468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245528.770408] kworker/u34:7   D    0  6616      2 0x80000000
[245528.771354] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245528.772311] Call Trace:
[245528.773253]  ? __schedule+0x2b1/0x770
[245528.774210]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245528.775165]  schedule+0x2d/0x80
[245528.776137]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.777101]  ? wait_woken+0x80/0x80
[245528.778072]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.779053]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.780049]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.781038]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.782035]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245528.783032]  find_free_extent+0xe23/0xf60 [btrfs]
[245528.784025]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.785001]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245528.785966]  run_delalloc_range+0xed/0x340 [btrfs]
[245528.786919]  writepage_delalloc+0xf9/0x160 [btrfs]
[245528.787879]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245528.788828]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245528.789786]  extent_writepages+0x3b/0x50 [btrfs]
[245528.790722]  do_writepages+0x12/0x60
[245528.791663]  __writeback_single_inode+0x38/0x300
[245528.792601]  writeback_sb_inodes+0x1c6/0x430
[245528.793536]  __writeback_inodes_wb+0x82/0xb0
[245528.794471]  wb_writeback+0x23e/0x2d0
[245528.795398]  ? wb_workfn+0x188/0x3d0
[245528.796325]  wb_workfn+0x188/0x3d0
[245528.797238]  process_one_work+0x1c0/0x390
[245528.798145]  worker_thread+0x42/0x3e0
[245528.799051]  kthread+0xf0/0x130
[245528.799957]  ? create_worker+0x170/0x170
[245528.800852]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.801753]  ret_from_fork+0x1f/0x30
[245649.570175] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245649.571154]       Tainted: G        W I      4.16.8 #1
[245649.572073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245649.573013] btrfs-transacti D    0  2608      2 0x80000000
[245649.573954] Call Trace:
[245649.574960]  ? __schedule+0x2b1/0x770
[245649.575887]  schedule+0x2d/0x80
[245649.576831]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245649.577753]  ? wait_woken+0x80/0x80
[245649.578746]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245649.579679]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245649.580596]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245649.581499]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245649.582418]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245649.583309]  find_free_extent+0xcd0/0xf60 [btrfs]
[245649.584370]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245649.585255]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245649.586205]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245649.587094]  btrfs_cow_block+0xee/0x190 [btrfs]
[245649.587974]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245528.743315] Call Trace:
[245528.744261]  ? __schedule+0x2b1/0x770
[245528.745187]  schedule+0x2d/0x80
[245528.746127]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.747048]  ? wait_woken+0x80/0x80
[245528.747980]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.748907]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.749823]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.750724]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.751633]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245528.752517]  find_free_extent+0xcd0/0xf60 [btrfs]
[245528.753400]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.754283]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245528.755167]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245528.756061]  btrfs_cow_block+0xee/0x190 [btrfs]
[245528.756943]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245528.757825]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245528.758699]  ? iput+0x72/0x1e0
[245528.759586]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245528.760483]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245528.761392]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245528.762295]  ? start_transaction+0x8e/0x410 [btrfs]
[245528.763192]  transaction_kthread+0x184/0x1a0 [btrfs]
[245528.764086]  kthread+0xf0/0x130
[245528.764976]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245528.765860]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.766744]  ret_from_fork+0x1f/0x30
[245528.767642] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245528.768556]       Tainted: G        W I      4.16.8 #1
[245528.769468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245528.770408] kworker/u34:7   D    0  6616      2 0x80000000
[245528.771354] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245528.772311] Call Trace:
[245528.773253]  ? __schedule+0x2b1/0x770
[245528.774210]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245528.775165]  schedule+0x2d/0x80
[245528.776137]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.777101]  ? wait_woken+0x80/0x80
[245528.778072]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.779053]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.780049]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.781038]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.782035]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245528.783032]  find_free_extent+0xe23/0xf60 [btrfs]
[245528.784025]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.785001]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245528.785966]  run_delalloc_range+0xed/0x340 [btrfs]
[245528.786919]  writepage_delalloc+0xf9/0x160 [btrfs]
[245528.787879]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245528.788828]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245528.789786]  extent_writepages+0x3b/0x50 [btrfs]
[245528.790722]  do_writepages+0x12/0x60
[245528.791663]  __writeback_single_inode+0x38/0x300
[245528.792601]  writeback_sb_inodes+0x1c6/0x430
[245528.793536]  __writeback_inodes_wb+0x82/0xb0
[245528.794471]  wb_writeback+0x23e/0x2d0
[245528.795398]  ? wb_workfn+0x188/0x3d0
[245528.796325]  wb_workfn+0x188/0x3d0
[245528.797238]  process_one_work+0x1c0/0x390
[245528.798145]  worker_thread+0x42/0x3e0
[245528.799051]  kthread+0xf0/0x130
[245528.799957]  ? create_worker+0x170/0x170
[245528.800852]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.801753]  ret_from_fork+0x1f/0x30
[245649.570175] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245649.571154]       Tainted: G        W I      4.16.8 #1
[245649.572073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245649.573013] btrfs-transacti D    0  2608      2 0x80000000
[245649.573954] Call Trace:
[245649.574960]  ? __schedule+0x2b1/0x770
[245649.575887]  schedule+0x2d/0x80
[245649.576831]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245649.577753]  ? wait_woken+0x80/0x80
[245649.578746]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245649.579679]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245649.580596]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245649.581499]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245649.582418]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245649.583309]  find_free_extent+0xcd0/0xf60 [btrfs]
[245649.584370]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245649.585255]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245649.586205]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245649.587094]  btrfs_cow_block+0xee/0x190 [btrfs]
[245649.587974]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245528.743315] Call Trace:
[245528.744261]  ? __schedule+0x2b1/0x770
[245528.745187]  schedule+0x2d/0x80
[245528.746127]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.747048]  ? wait_woken+0x80/0x80
[245528.747980]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.748907]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.749823]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.750724]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.751633]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245528.752517]  find_free_extent+0xcd0/0xf60 [btrfs]
[245528.753400]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.754283]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245528.755167]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245528.756061]  btrfs_cow_block+0xee/0x190 [btrfs]
[245528.756943]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245528.757825]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245528.758699]  ? iput+0x72/0x1e0
[245528.759586]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245528.760483]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245528.761392]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245528.762295]  ? start_transaction+0x8e/0x410 [btrfs]
[245528.763192]  transaction_kthread+0x184/0x1a0 [btrfs]
[245528.764086]  kthread+0xf0/0x130
[245528.764976]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245528.765860]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.766744]  ret_from_fork+0x1f/0x30
[245528.767642] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245528.768556]       Tainted: G        W I      4.16.8 #1
[245528.769468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245528.770408] kworker/u34:7   D    0  6616      2 0x80000000
[245528.771354] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245528.772311] Call Trace:
[245528.773253]  ? __schedule+0x2b1/0x770
[245528.774210]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245528.775165]  schedule+0x2d/0x80
[245528.776137]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245528.777101]  ? wait_woken+0x80/0x80
[245528.778072]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245528.779053]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245528.780049]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245528.781038]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245528.782035]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245528.783032]  find_free_extent+0xe23/0xf60 [btrfs]
[245528.784025]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245528.785001]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245528.785966]  run_delalloc_range+0xed/0x340 [btrfs]
[245528.786919]  writepage_delalloc+0xf9/0x160 [btrfs]
[245528.787879]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245528.788828]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245528.789786]  extent_writepages+0x3b/0x50 [btrfs]
[245528.790722]  do_writepages+0x12/0x60
[245528.791663]  __writeback_single_inode+0x38/0x300
[245528.792601]  writeback_sb_inodes+0x1c6/0x430
[245528.793536]  __writeback_inodes_wb+0x82/0xb0
[245528.794471]  wb_writeback+0x23e/0x2d0
[245528.795398]  ? wb_workfn+0x188/0x3d0
[245528.796325]  wb_workfn+0x188/0x3d0
[245528.797238]  process_one_work+0x1c0/0x390
[245528.798145]  worker_thread+0x42/0x3e0
[245528.799051]  kthread+0xf0/0x130
[245528.799957]  ? create_worker+0x170/0x170
[245528.800852]  ? kthread_create_worker_on_cpu+0x40/0x40
[245528.801753]  ret_from_fork+0x1f/0x30
[245649.570175] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245649.571154]       Tainted: G        W I      4.16.8 #1
[245649.572073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245649.573013] btrfs-transacti D    0  2608      2 0x80000000
[245649.573954] Call Trace:
[245649.574960]  ? __schedule+0x2b1/0x770
[245649.575887]  schedule+0x2d/0x80
[245649.576831]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245649.577753]  ? wait_woken+0x80/0x80
[245649.578746]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245649.579679]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245649.580596]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245649.581499]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245649.582418]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245649.583309]  find_free_extent+0xcd0/0xf60 [btrfs]
[245649.584370]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245649.585255]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245649.586205]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245649.587094]  btrfs_cow_block+0xee/0x190 [btrfs]
[245649.587974]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245649.588857]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245649.589731]  ? iput+0x72/0x1e0
[245649.590671]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245649.591568]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245649.592476]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245649.593379]  ? start_transaction+0x8e/0x410 [btrfs]
[245649.594338]  transaction_kthread+0x184/0x1a0 [btrfs]
[245649.595225]  kthread+0xf0/0x130
[245649.596113]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245649.596998]  ? kthread_create_worker_on_cpu+0x40/0x40
[245649.597883]  ret_from_fork+0x1f/0x30
[245649.598840] INFO: task kworker/u34:7:6616 blocked for more than 120 seconds.
[245649.599754]       Tainted: G        W I      4.16.8 #1
[245649.600670] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245649.601614] kworker/u34:7   D    0  6616      2 0x80000000
[245649.602627] Workqueue: writeback wb_workfn (flush-btrfs-1)
[245649.603577] Call Trace:
[245649.604521]  ? __schedule+0x2b1/0x770
[245649.605479]  ? btrfs_buffer_uptodate+0x42/0x60 [btrfs]
[245649.606495]  schedule+0x2d/0x80
[245649.607461]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245649.608423]  ? wait_woken+0x80/0x80
[245649.609395]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245649.610441]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245649.611434]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245649.612420]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245649.613416]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[245649.614469]  find_free_extent+0xe23/0xf60 [btrfs]
[245649.615452]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245649.616428]  cow_file_range.isra.64+0x11e/0x480 [btrfs]
[245649.617392]  run_delalloc_range+0xed/0x340 [btrfs]
[245649.618410]  writepage_delalloc+0xf9/0x160 [btrfs]
[245649.619365]  __extent_writepage+0xd0/0x2f0 [btrfs]
[245649.620314]  extent_write_cache_pages+0x190/0x330 [btrfs]
[245649.621275]  extent_writepages+0x3b/0x50 [btrfs]
[245649.622278]  do_writepages+0x12/0x60
[245649.623209]  __writeback_single_inode+0x38/0x300
[245649.624147]  writeback_sb_inodes+0x1c6/0x430
[245649.625082]  __writeback_inodes_wb+0x82/0xb0
[245649.626015]  wb_writeback+0x23e/0x2d0
[245649.627008]  ? wb_workfn+0x188/0x3d0
[245649.627924]  wb_workfn+0x188/0x3d0
[245649.628836]  process_one_work+0x1c0/0x390
[245649.629743]  worker_thread+0x42/0x3e0
[245649.630712]  kthread+0xf0/0x130
[245649.631609]  ? create_worker+0x170/0x170
[245649.632503]  ? kthread_create_worker_on_cpu+0x40/0x40
[245649.633405]  ret_from_fork+0x1f/0x30
[245770.400500] INFO: task btrfs-transacti:2608 blocked for more than 120 seconds.
[245770.401476]       Tainted: G        W I      4.16.8 #1
[245770.402395] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[245770.403336] btrfs-transacti D    0  2608      2 0x80000000
[245770.404281] Call Trace:
[245770.405276]  ? __schedule+0x2b1/0x770
[245770.406205]  schedule+0x2d/0x80
[245770.407150]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[245770.408073]  ? wait_woken+0x80/0x80
[245770.409059]  btrfs_search_slot+0x761/0xa60 [btrfs]
[245770.409994]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245770.410913]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[245770.411817]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245770.412738]  do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245770.413627]  find_free_extent+0xcd0/0xf60 [btrfs]
[245770.414513]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245770.415398]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245770.416285]  __btrfs_cow_block+0x127/0x550 [btrfs]
[245770.417230]  btrfs_cow_block+0xee/0x190 [btrfs]
[245770.418118]  btrfs_search_slot+0x227/0xa60 [btrfs]
[245770.419004]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245770.419880]  ? iput+0x72/0x1e0
[245770.420791]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245770.421692]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245770.422602]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245770.423507]  ? start_transaction+0x8e/0x410 [btrfs]
[245770.424464]  transaction_kthread+0x184/0x1a0 [btrfs]
[245770.425350]  kthread+0xf0/0x130
[245770.426239]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245770.427124]  ? kthread_create_worker_on_cpu+0x40/0x40
[245770.428008]  ret_from_fork+0x1f/0x30
Comment 1 lakshmipathi 2018-06-02 03:38:24 UTC
Is it failing consistently? Can you share reproducible steps? thanks
Comment 2 Eli V 2018-06-04 12:19:43 UTC
It fails every few days in the middle of an rsync, most recently on 6/2 with kernel 4.16.11. Just rebooted into 4.16.13 this morning. The system is dedicated to running backups, so all it does is run rsync from NFS filesytems to a big local btrfs, and then snapshot when the rsync completes. It runs just a single rsync at a time. Of note the fs is using a 64k leaf size for much better performance on the underlying hardware RAID which uses a 64K stripe size. Mount options from /proc/mounts:

rw,noatime,compress=zstd,ssd_spread,space_cache=v2,enospc_debug,subvolid=5,subvol=/
Comment 3 Eli V 2018-06-15 12:15:06 UTC
btrfs deadlocked again under 4.6.13 on 6/12. I'm guessing it's the locking around the wait_woken callback into the btrfs code. I've rebooted into 4.17 will report back if it hangs in the next week or two.

[673874.641848] INFO: task btrfs-transacti:2699 blocked for more than 120 seconds.
[673874.642834]       Tainted: G        W I      4.16.13 #1
[673874.643763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[673874.644713] btrfs-transacti D    0  2699      2 0x80000000
[673874.645663] Call Trace:
[673874.646623]  ? __schedule+0x2b1/0x770
[673874.647557]  schedule+0x2d/0x80
[673874.648503]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[673874.649432]  ? wait_woken+0x80/0x80
[673874.650376]  btrfs_search_slot+0x761/0xa60 [btrfs]
[673874.651317]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[673874.652241]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[673874.653152]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[673874.654070]  do_chunk_alloc+0x1e5/0x2b0 [btrfs]
[673874.654960]  find_free_extent+0xcd0/0xf60 [btrfs]
[673874.655850]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[673874.656736]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[673874.657624]  __btrfs_cow_block+0x127/0x550 [btrfs]
[673874.658527]  btrfs_cow_block+0xee/0x190 [btrfs]
[673874.659423]  btrfs_search_slot+0x227/0xa60 [btrfs]
[673874.660321]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[673874.661210]  ? iput+0x72/0x1e0
[673874.662109]  write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[673874.663025]  btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[673874.663941]  btrfs_commit_transaction+0x11b/0x880 [btrfs]
[673874.664852]  ? start_transaction+0x8e/0x410 [btrfs]
[673874.665772]  transaction_kthread+0x184/0x1a0 [btrfs]
[673874.666671]  kthread+0xf0/0x130
[673874.667568]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[673874.668467]  ? kthread_create_worker_on_cpu+0x40/0x40
[673874.669369]  ret_from_fork+0x1f/0x30
[673874.670293] INFO: task kworker/u34:10:30138 blocked for more than 120 seconds.
[673874.671224]       Tainted: G        W I      4.16.13 #1
[673874.672154] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[673874.673112] kworker/u34:10  D    0 30138      2 0x80000000
[673874.674086] Workqueue: writeback wb_workfn (flush-btrfs-1)
[673874.675057] Call Trace:
[673874.676015]  ? __schedule+0x2b1/0x770
[673874.676979]  schedule+0x2d/0x80
[673874.677966]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[673874.678941]  ? wait_woken+0x80/0x80
[673874.679918]  btrfs_search_slot+0x761/0xa60 [btrfs]
[673874.680905]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[673874.681908]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[673874.682907]  btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[673874.683915]  __btrfs_end_transaction+0x8f/0x2e0 [btrfs]
[673874.684924]  find_free_extent+0xe23/0xf60 [btrfs]
[673874.685945]  btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[673874.686945]  cow_file_range.isra.65+0x11e/0x480 [btrfs]
[673874.687939]  run_delalloc_range+0xed/0x340 [btrfs]
[673874.688922]  writepage_delalloc+0xf9/0x160 [btrfs]
[673874.689894]  __extent_writepage+0xd0/0x2f0 [btrfs]
[673874.690853]  extent_write_cache_pages+0x190/0x330 [btrfs]
[673874.691822]  extent_writepages+0x3b/0x50 [btrfs]
[673874.692773]  do_writepages+0x12/0x60
[673874.693728]  __writeback_single_inode+0x38/0x300
[673874.694679]  writeback_sb_inodes+0x1c6/0x430
[673874.695625]  __writeback_inodes_wb+0x82/0xb0
[673874.696568]  wb_writeback+0x23e/0x2d0
[673874.697510]  ? wb_workfn+0x188/0x3c0
[673874.698452]  wb_workfn+0x188/0x3c0
[673874.699378]  process_one_work+0x1c0/0x390
[673874.700302]  worker_thread+0x42/0x3e0
[673874.701220]  kthread+0xf0/0x130
[673874.702140]  ? create_worker+0x170/0x170
[673874.703052]  ? kthread_create_worker_on_cpu+0x40/0x40
[673874.703967]  ? SyS_exit_group+0xb/0x10
[673874.704877]  ret_from_fork+0x1f/0x30
Comment 4 Eli V 2018-06-18 13:13:02 UTC
4.17 deadlock still present. rsync stack:
[<0>] balance_dirty_pages_ratelimited+0x270/0xca0
[<0>] __btrfs_buffered_write+0x3c2/0x760 [btrfs]
[<0>] btrfs_file_write_iter+0x36e/0x510 [btrfs]
[<0>] __vfs_write+0xcf/0x130
[<0>] vfs_write+0xab/0x190
[<0>] ksys_write+0x3d/0x90
[<0>] do_syscall_64+0x39/0xe0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

kernel dmesg, looks like the wait_woken is consistently present whenever there is a problem:

[301835.925267] btrfs-transacti D    0  2892      2 0x80000000
[301835.926291] Call Trace:
[301835.927244]  ? __schedule+0x2b1/0x770
[301835.928189]  schedule+0x2d/0x80
[301835.929144]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[301835.930086]  ? wait_woken+0x80/0x80
[301835.931041]  btrfs_search_slot+0x761/0xa60 [btrfs]
[301835.931972]  ? _cond_resched+0x11/0x40
[301835.932892]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[301835.933805]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[301835.934712]  btrfs_create_pending_block_groups+0xff/0x1e0 [btrfs]
[301835.935618]  do_chunk_alloc+0x1d5/0x270 [btrfs]
[301835.936518]  find_free_extent+0x651/0x1050 [btrfs]
[301835.937426]  btrfs_reserve_extent+0x96/0x1d0 [btrfs]
[301835.938342]  btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[301835.939239]  ? pagevec_lru_move_fn+0xad/0xd0
[301835.940141]  __btrfs_cow_block+0x124/0x540 [btrfs]
[301835.941048]  btrfs_cow_block+0xee/0x190 [btrfs]
[301835.941954]  btrfs_search_slot+0x227/0xa60 [btrfs]
[301835.942873]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[301835.943772]  ? iput+0x66/0x1d0
[301835.944674]  write_one_cache_group.isra.52+0x20/0x90 [btrfs]
[301835.945591]  btrfs_start_dirty_block_groups+0x100/0x420 [btrfs]
[301835.946519]  btrfs_commit_transaction+0xb6/0x840 [btrfs]
[301835.947438]  ? start_transaction+0x8c/0x3e0 [btrfs]
[301835.948353]  transaction_kthread+0x142/0x170 [btrfs]
[301835.949254]  kthread+0xf0/0x130
[301835.950158]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[301835.951082]  ? kthread_create_worker_on_cpu+0x40/0x40
[301835.952000]  ret_from_fork+0x1f/0x30
[301835.952916] INFO: task kworker/u33:7:24296 blocked for more than 120 seconds.
[301835.953851]       Tainted: G        W I       4.17.0 #1
[301835.954797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[301835.955763] kworker/u33:7   D    0 24296      2 0x80000000
[301835.956740] Workqueue: writeback wb_workfn (flush-btrfs-1)
[301835.957717] Call Trace:
[301835.958695]  ? __schedule+0x2b1/0x770
[301835.959682]  ? btrfs_buffer_uptodate+0x2f/0x40 [btrfs]
[301835.960670]  schedule+0x2d/0x80
[301835.961669]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[301835.962670]  ? wait_woken+0x80/0x80
[301835.963676]  btrfs_search_slot+0x761/0xa60 [btrfs]
[301835.964678]  ? _cond_resched+0x11/0x40
[301835.965686]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[301835.966710]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[301835.967709]  btrfs_create_pending_block_groups+0xff/0x1e0 [btrfs]
[301835.968711]  __btrfs_end_transaction+0x88/0x2d0 [btrfs]
[301835.969699]  find_free_extent+0xe7a/0x1050 [btrfs]
[301835.970682]  btrfs_reserve_extent+0x96/0x1d0 [btrfs]
[301835.971661]  cow_file_range.isra.67+0x109/0x460 [btrfs]
[301835.972644]  run_delalloc_range+0xed/0x340 [btrfs]
[301835.973634]  writepage_delalloc+0xf9/0x160 [btrfs]
[301835.974629]  __extent_writepage+0xd0/0x2f0 [btrfs]
[301835.975610]  extent_write_cache_pages+0x190/0x330 [btrfs]
[301835.976596]  extent_writepages+0x3b/0x50 [btrfs]
[301835.977563]  do_writepages+0x12/0x60
[301835.978532]  __writeback_single_inode+0x38/0x300
[301835.979499]  writeback_sb_inodes+0x1c6/0x430
[301835.980459]  __writeback_inodes_wb+0x82/0xb0
[301835.981410]  wb_writeback+0x23e/0x2d0
[301835.982366]  ? wb_workfn+0x188/0x3c0
[301835.983308]  wb_workfn+0x188/0x3c0
[301835.984244]  process_one_work+0x1c0/0x390
[301835.985177]  worker_thread+0x42/0x3e0
[301835.986100]  kthread+0xf0/0x130
[301835.987029]  ? create_worker+0x170/0x170
[301835.987948]  ? kthread_create_worker_on_cpu+0x40/0x40
[301835.988869]  ret_from_fork+0x1f/0x30
Comment 5 Eli V 2018-09-24 12:43:02 UTC
Just hit this on 4.18.8. rsync process is sitting doing nothing all weekend because of the deadlock on the btrfs-trsaction process:
[361526.415801] INFO: task btrfs-transacti:2895 blocked for more than 120 seconds.
[361526.416893]       Tainted: G        W I       4.18.8 #1
[361526.417826] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[361526.418778] btrfs-transacti D    0  2895      2 0x80000000
[361526.419747] Call Trace:
[361526.420695]  ? __schedule+0x2cc/0x780
[361526.421635]  schedule+0x2d/0x80
[361526.422585]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[361526.423519]  ? wait_woken+0x80/0x80
[361526.424466]  btrfs_search_slot+0x7c9/0xa80 [btrfs]
[361526.425390]  ? _cond_resched+0x11/0x40
[361526.426301]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[361526.427207]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[361526.428107]  btrfs_create_pending_block_groups+0xff/0x1e0 [btrfs]
[361526.429005]  do_chunk_alloc+0x1d4/0x260 [btrfs]
[361526.429895]  find_free_extent+0x651/0x1040 [btrfs]
[361526.430794]  btrfs_reserve_extent+0x96/0x1d0 [btrfs]
[361526.431703]  btrfs_alloc_tree_block+0xfa/0x520 [btrfs]
[361526.432602]  __btrfs_cow_block+0x124/0x540 [btrfs]
[361526.433499]  btrfs_cow_block+0xee/0x190 [btrfs]
[361526.434395]  btrfs_search_slot+0x238/0xa80 [btrfs]
[361526.435295]  ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[361526.436194]  ? iput+0x66/0x1d0
[361526.437083]  write_one_cache_group.isra.53+0x20/0x90 [btrfs]
[361526.437995]  btrfs_start_dirty_block_groups+0x100/0x420 [btrfs]
[361526.438909]  btrfs_commit_transaction+0xb6/0x840 [btrfs]
[361526.439830]  ? start_transaction+0x8c/0x3e0 [btrfs]
[361526.440738]  transaction_kthread+0x142/0x170 [btrfs]
[361526.441631]  kthread+0xf0/0x130
[361526.442527]  ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[361526.443423]  ? kthread_create_worker_on_cpu+0x40/0x40
[361526.444333]  ret_from_fork+0x1f/0x30
[361526.445244] INFO: task kworker/u33:21:12240 blocked for more than 120 seconds.
[361526.446167]       Tainted: G        W I       4.18.8 #1
[361526.447090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[361526.448050] kworker/u33:21  D    0 12240      2 0x80000000
[361526.449009] Workqueue: writeback wb_workfn (flush-btrfs-1)
[361526.449969] Call Trace:
[361526.450926]  ? __schedule+0x2cc/0x780
[361526.451905]  ? btrfs_buffer_uptodate+0x2f/0x40 [btrfs]
[361526.452875]  schedule+0x2d/0x80
[361526.453857]  btrfs_tree_lock+0x111/0x1d0 [btrfs]
[361526.454832]  ? wait_woken+0x80/0x80
[361526.455829]  btrfs_search_slot+0x7c9/0xa80 [btrfs]
[361526.456815]  ? _cond_resched+0x11/0x40
[361526.457812]  btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[361526.458816]  btrfs_insert_item+0x5b/0xc0 [btrfs]
[361526.459821]  btrfs_create_pending_block_groups+0xff/0x1e0 [btrfs]
[361526.460820]  __btrfs_end_transaction+0x88/0x2d0 [btrfs]
[361526.461806]  find_free_extent+0xe68/0x1040 [btrfs]
[361526.462778]  btrfs_reserve_extent+0x96/0x1d0 [btrfs]
[361526.463747]  cow_file_range.isra.69+0x109/0x460 [btrfs]
[361526.464714]  run_delalloc_range+0xed/0x340 [btrfs]
[361526.465686]  writepage_delalloc+0xf9/0x160 [btrfs]
[361526.466655]  __extent_writepage+0xd0/0x2f0 [btrfs]
[361526.467621]  extent_write_cache_pages+0x190/0x330 [btrfs]
[361526.468602]  extent_writepages+0x3b/0x50 [btrfs]
[361526.469556]  do_writepages+0x12/0x60
[361526.470501]  __writeback_single_inode+0x38/0x300
[361526.471449]  writeback_sb_inodes+0x18c/0x440
[361526.472405]  __writeback_inodes_wb+0x82/0xb0
[361526.473346]  wb_writeback+0x23e/0x2d0
[361526.474276]  ? wb_workfn+0x2c4/0x3c0
[361526.475200]  wb_workfn+0x2c4/0x3c0
[361526.476129]  process_one_work+0x1da/0x3b0
[361526.477049]  worker_thread+0x44/0x3d0
[361526.477960]  kthread+0xf0/0x130
[361526.478867]  ? create_worker+0x190/0x190
[361526.479783]  ? kthread_create_worker_on_cpu+0x40/0x40
[361526.480696]  ret_from_fork+0x1f/0x30
Comment 6 David Sterba 2019-05-21 12:24:09 UTC
Thanks for the report. Fixed by 5ce555578e0919237f "Btrfs: fix deadlock when writing out free space caches", in 4.20.