Bug 216676 - dm thin: An ABBA deadlock problem between shrink_slab and dm_pool_abort_metadata
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: All Linux
Importance: P1 normal
Assignee: io_md
Reported: 2022-11-10 06:19 UTC by Zhihao Cheng
Modified: 2022-11-10 06:20 UTC
Kernel Version: 6.1.0-rc4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
diff (11.59 KB, patch)
2022-11-10 06:20 UTC, Zhihao Cheng
b.c (1.75 KB, text/plain)
2022-11-10 06:20 UTC, Zhihao Cheng
test.sh (164 bytes, application/x-shellscript)
2022-11-10 06:20 UTC, Zhihao Cheng

Description Zhihao Cheng 2022-11-10 06:19:46 UTC
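Test VM options:
-smp 2
-m 4096
3 disks configured

Steps to reproduce:
1. Apply the attached diff and compile the kernel
2. Build the test program: gcc -o bb b.c
3. Run ./test.sh

Kernel log (both tasks are reported as hung after the metadata commit fails):
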
[   79.041162] test.sh (2641): drop_caches: 3
[   79.046844] evict
[   79.049626] test.sh(2641) read bh
[   79.049953] test.sh(2641): try get pmd->root_lock, wait other submit
[   79.253833] sync(2686): try get pmd->root_lock, wait other submit
[   79.425157] ext4lazyinit(2675): try get pmd->root_lock, wait other submit
[   80.030788] commit kworker/u4:7 67 make fail
[   80.031663] device-mapper: thin: 252:0: metadata operation 'dm_pool_commit_metadata' failed: error = -22
[   80.033368] device-mapper: thin: 252:0: aborting current metadata transaction
[  105.153386] kworker/u4:8(85) read bh
[  105.154332] kworker/u4:8(85) read bh done
[  107.713158] INFO: task kworker/u4:7:67 blocked for more than 15 seconds.
[  107.713950]       Not tainted 6.1.0-rc4-00011-g8f17dd350364-dirty #912
[  107.714687] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  107.715558] task:kworker/u4:7    state:D stack:0     pid:67    ppid:2      flags:0x00004000
[  107.715565] Workqueue: dm-thin do_worker
[  107.715574] Call Trace:
[  107.715576]  <TASK>
[  107.715581]  __schedule+0x6ba/0x10f0
[  107.715601]  schedule+0x9d/0x1e0
[  107.715609]  rwsem_down_write_slowpath+0x587/0xdf0
[  107.715620]  down_write+0xec/0x110
[  107.715628]  unregister_shrinker+0x2c/0xf0
[  107.715641]  dm_bufio_client_destroy+0x116/0x3d0
[  107.715654]  dm_block_manager_destroy+0x19/0x40
[  107.715663]  __destroy_persistent_data_objects+0x5e/0x70
[  107.715669]  dm_pool_abort_metadata+0x8e/0x100
[  107.715676]  metadata_operation_failed+0x86/0x110
[  107.715685]  commit+0x6a/0x230
[  107.715694]  do_worker+0xc6e/0xd90
[  107.715699]  ? internal_add_timer+0x54/0x80
[  107.715707]  ? _raw_spin_unlock_irqrestore+0x4b/0x90
[  107.715714]  ? move_linked_works+0x60/0x110
[  107.715727]  process_one_work+0x269/0x630
[  107.715736]  worker_thread+0x266/0x630
[  107.715739]  ? rescuer_thread+0x540/0x540
[  107.715742]  kthread+0x151/0x1b0
[  107.715745]  ? kthread_exit+0x50/0x50
[  107.715748]  ret_from_fork+0x1f/0x30
[  107.715754]  </TASK>
[  107.715764] INFO: task test.sh:2641 blocked for more than 15 seconds.
[  107.716484]       Not tainted 6.1.0-rc4-00011-g8f17dd350364-dirty #912
[  107.717210] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  107.718056] task:test.sh         state:D stack:0     pid:2641  ppid:2454   flags:0x00004000
[  107.718067] Call Trace:
[  107.718069]  <TASK>
[  107.718072]  __schedule+0x6ba/0x10f0
[  107.718077]  ? preempt_count_add+0xba/0x130
[  107.718085]  schedule+0x9d/0x1e0
[  107.718088]  rwsem_down_read_slowpath+0x4f4/0x910
[  107.718094]  ? timer_migration_handler+0x80/0xa0
[  107.718099]  down_read+0x84/0x170
[  107.718102]  dm_thin_find_block+0x4c/0xd0
[  107.718106]  thin_map+0x201/0x3d0
[  107.718110]  __map_bio+0x5b/0x350
[  107.718116]  dm_submit_bio+0x2b6/0x930
[  107.718121]  __submit_bio+0x123/0x2d0
[  107.718128]  submit_bio_noacct_nocheck+0x101/0x3e0
[  107.718135]  ? kvm_clock_read+0x2c/0x70
[  107.718139]  ? ktime_get+0x50/0x100
[  107.718142]  submit_bio_noacct+0x389/0x770
[  107.718154]  submit_bio+0x50/0xc0
[  107.718163]  submit_bh_wbc+0x15e/0x230
[  107.718174]  submit_bh+0x14/0x20
[  107.718179]  ext4_read_bh_nowait+0xc5/0x130
[  107.718186]  ext4_read_block_bitmap_nowait+0x340/0xc60
[  107.718202]  ext4_mb_init_cache+0x1ce/0xdc0
[  107.718209]  ext4_mb_load_buddy_gfp+0x987/0xfa0
[  107.718224]  ext4_discard_preallocations+0x45d/0x830
[  107.718236]  ? vprintk_default+0x21/0x30
[  107.718252]  ext4_clear_inode+0x48/0xf0
[  107.718259]  ext4_evict_inode+0xcf/0xc70
[  107.718264]  evict+0x119/0x2b0
[  107.718271]  dispose_list+0x43/0xa0
[  107.718279]  prune_icache_sb+0x64/0x90
[  107.718287]  super_cache_scan+0x155/0x210
[  107.718294]  do_shrink_slab+0x19e/0x4e0
[  107.718301]  shrink_slab+0x2bd/0x450
[  107.718313]  drop_slab+0xcc/0x1a0
[  107.718324]  drop_caches_sysctl_handler+0xb7/0xe0
[  107.718334]  proc_sys_call_handler+0x1bc/0x300
[  107.718341]  proc_sys_write+0x17/0x20
[  107.718347]  vfs_write+0x3d3/0x570
[  107.718361]  ksys_write+0x73/0x160
[  107.718366]  __x64_sys_write+0x1e/0x30
[  107.718369]  do_syscall_64+0x35/0x80
[  107.718373]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
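
The two traces above can be read as a lock-order inversion. The sketch below is only an illustration of that reading, not kernel code and not part of the attached diff. It assumes that kworker/u4:7 holds pmd->root_lock for write inside dm_pool_abort_metadata while unregister_shrinker waits in down_write, and that test.sh holds the shrinker rwsem for read inside shrink_slab while dm_thin_find_block waits in down_read. The name shrinker_rwsem is my reading of the rwsem taken by unregister_shrinker in this kernel version; it does not appear in the report. The task labels and helper names are hypothetical.

```python
from collections import defaultdict

# (held lock, wanted lock) per task, read off the two call traces.
# ASSUMPTION: each task already holds the first lock when it blocks on the
# second; the traces only show the acquisition that blocks.
HELD_THEN_WANTED = {
    # dm_pool_abort_metadata -> ... -> unregister_shrinker -> down_write
    "kworker/u4:7": ("pmd->root_lock", "shrinker_rwsem"),
    # shrink_slab -> ... -> dm_thin_find_block -> down_read
    "test.sh": ("shrinker_rwsem", "pmd->root_lock"),
}


def build_lock_graph(orders):
    """Build 'held -> wanted' edges, one per task."""
    graph = defaultdict(set)
    for held, wanted in orders.values():
        graph[held].add(wanted)
    return graph


def find_cycle(graph):
    """Depth-first search; return one lock cycle as a list, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in visiting:
            # Back edge: the path from the first visit of node closes a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in sorted(graph.get(node, ())):
            found = visit(nxt)
            if found:
                return found
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        found = visit(start)
        if found:
            return found
    return None


cycle = find_cycle(build_lock_graph(HELD_THEN_WANTED))
print(" -> ".join(cycle) if cycle else "no lock-order cycle")
# prints: pmd->root_lock -> shrinker_rwsem -> pmd->root_lock
```

A cycle in this graph means neither task can make progress: each waits for a lock the other will only release after getting the lock it is itself waiting for. That matches both tasks sitting in state D in the log.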
Comment 1 Zhihao Cheng 2022-11-10 06:20:05 UTC
Created attachment 303152
diff
Comment 2 Zhihao Cheng 2022-11-10 06:20:16 UTC
Created attachment 303153
b.c
Comment 3 Zhihao Cheng 2022-11-10 06:20:28 UTC
Created attachment 303154
test.sh