Bug 59061
Summary: Warning caused by btrfs_tree_lock with lock-debugging enabled
Product: File System
Component: btrfs
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.10.0-0.rc2
Subsystem: (none given)
Regression: No
Bisected commit-id: (none given)
Reporter: Clemens Eisserer (linuxhippy)
Assignee: Josef Bacik (josef)
CC: dsterba
Attachments: possible fix
Description
Clemens Eisserer
2013-05-31 10:12:24 UTC
I've found these lockdep warnings in my logs from the time I worked on it; the kernel is 3.8+, probably with some -next patches (around 1 Mar 15:36:38).

[ 4740.477372] btrfs: relocating block group 23114809344 flags 1
[ 4741.240448] btrfs_clean_one_deleted_snapshot: btrfs: cleaner removing 484
[... snipped like 30 similar messages for brevity]
[ 4786.583116] btrfs_clean_one_deleted_snapshot: btrfs: cleaner removing 752
[ 4789.868173] ------------[ cut here ]------------
[ 4789.872123] WARNING: at kernel/lockdep.c:702 __lock_acquire+0x1b7d/0x1f20()
[ 4789.872123] Hardware name: Santa Rosa platform
[ 4789.872123] Modules linked in: btrfs aoe dm_crypt loop [last unloaded: btrfs]
[ 4789.872123] Pid: 18578, comm: btrfs-cleaner Not tainted 3.8.0-default+ #267
[ 4789.872123] Call Trace:
[ 4789.872123] [<ffffffff8104c77f>] warn_slowpath_common+0x7f/0xc0
[ 4789.872123] [<ffffffff8104c7da>] warn_slowpath_null+0x1a/0x20
[ 4789.872123] [<ffffffff810abd5d>] __lock_acquire+0x1b7d/0x1f20
[ 4789.872123] [<ffffffff81009ed5>] ? native_sched_clock+0x15/0x80
[ 4789.872123] [<ffffffff810a6909>] ? trace_hardirqs_off_caller+0x29/0xc0
[ 4789.872123] [<ffffffff810aa535>] ? __lock_acquire+0x355/0x1f20
[ 4789.872123] [<ffffffff810a6909>] ? trace_hardirqs_off_caller+0x29/0xc0
[ 4789.872123] [<ffffffffa016e4e1>] ? btrfs_tree_lock+0x131/0x290 [btrfs]
[ 4789.872123] [<ffffffff810ac794>] lock_acquire+0x94/0x130
[ 4789.872123] [<ffffffffa016e4e1>] ? btrfs_tree_lock+0x131/0x290 [btrfs]
[ 4789.872123] [<ffffffff81009ed5>] ? native_sched_clock+0x15/0x80
[ 4789.872123] [<ffffffff81959706>] _raw_write_lock+0x46/0x80
[ 4789.872123] [<ffffffffa016e4e1>] ? btrfs_tree_lock+0x131/0x290 [btrfs]
[ 4789.872123] [<ffffffff81087d1f>] ? local_clock+0x6f/0x80
[ 4789.872123] [<ffffffffa016e4e1>] btrfs_tree_lock+0x131/0x290 [btrfs]
[ 4789.872123] [<ffffffffa015845e>] ? find_extent_buffer+0xae/0x110 [btrfs]
[ 4789.872123] [<ffffffffa01583b0>] ? alloc_extent_buffer+0x4d0/0x4d0 [btrfs]
[ 4789.872123] [<ffffffffa0121571>] do_walk_down+0xe1/0x540 [btrfs]
[ 4789.872123] [<ffffffff810a6909>] ? trace_hardirqs_off_caller+0x29/0xc0
[ 4789.872123] [<ffffffff810a69ad>] ? trace_hardirqs_off+0xd/0x10
[ 4789.872123] [<ffffffffa01205b4>] ? btrfs_block_rsv_check+0x74/0x90 [btrfs]
[ 4789.872123] [<ffffffffa0121aa8>] walk_down_tree+0xd8/0x110 [btrfs]
[ 4789.872123] [<ffffffffa0124b20>] btrfs_drop_snapshot+0x380/0x640 [btrfs]
[ 4789.872123] [<ffffffffa0138dc5>] btrfs_clean_one_deleted_snapshot+0x125/0x1a0 [btrfs]
[ 4789.872123] [<ffffffffa012ec22>] cleaner_kthread+0xb2/0x180 [btrfs]
[ 4789.872123] [<ffffffffa012eb70>] ? btree_readpage+0x30/0x30 [btrfs]
[ 4789.872123] [<ffffffff81073e8e>] kthread+0xde/0xf0
[ 4789.872123] [<ffffffff81073db0>] ? flush_kthread_worker+0x1e0/0x1e0
[ 4789.872123] [<ffffffff81962dec>] ret_from_fork+0x7c/0xb0
[ 4789.872123] [<ffffffff81073db0>] ? flush_kthread_worker+0x1e0/0x1e0
[ 4789.872123] ---[ end trace 930320f35566d00f ]---
[ 4792.158795] btrfs_clean_one_deleted_snapshot: btrfs: cleaner removing 372
[... etc]

And another one with 3.8-rc7+ tag, without any further details.

Created attachment 103571: possible fix
This is my educated guess, can you run with this and see if the warning goes away?
When btrfs_find_create_tree_block does the *_create_* part through alloc_extent_buffer, the lockdep class is not set and triggers the warning. In other cases when the tree block is created the class is set. There's another similar instance in walk_down_log_tree where find_create is not followed by lockdep class update.

I think the patch is correct and non-intrusive anyway, so add it to next and we'll see; the bug is not easy to reproduce. My guess is that when cleaning starts and no blocks are yet cached, it may show up.

Fixed with

[PATCH] Btrfs: set lockdep class before locking new extent buffer

walk_down_log_tree doesn't lock the eb after creating it; it reads it first, which gets the lockdep class set properly, so it is good to go. Please re-open if you reproduce with this patch in place.
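For readers who want the ordering problem spelled out, here is a toy user-space model in Python. It is only a sketch of the idea discussed above (a freshly created buffer gets locked before its lock class has been assigned), not kernel code and not the actual patch. Every name in it (`ExtentBuffer`, `ToyLockChecker`, `set_buffer_lock_class`, `walk_down_unfixed`, `walk_down_fixed`) is hypothetical, and the check it performs is a deliberate simplification of what lockdep really does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtentBuffer:
    """Toy stand-in for a tree block (hypothetical, not the kernel struct)."""
    start: int
    lock_class: Optional[str] = None  # None models "class never set"
    locked: bool = False


class ToyLockChecker:
    """Records a warning when a buffer is locked before its class is set."""

    def __init__(self) -> None:
        self.warnings: List[str] = []

    def tree_lock(self, eb: ExtentBuffer) -> None:
        # Simplified check: the real lockdep validation is more involved.
        if eb.lock_class is None:
            self.warnings.append(
                f"block {eb.start} locked with no lock class set"
            )
        eb.locked = True


def set_buffer_lock_class(eb: ExtentBuffer, owner: str, level: int) -> None:
    """Assign a class name derived from the owning tree and the level."""
    eb.lock_class = f"{owner}-{level:02d}"


def walk_down_unfixed(checker: ToyLockChecker, start: int) -> ExtentBuffer:
    """Models the reported path: create the buffer, then lock it right away."""
    eb = ExtentBuffer(start)  # the "create" half of find_create
    checker.tree_lock(eb)     # class still unset, so the checker warns
    return eb


def walk_down_fixed(checker: ToyLockChecker, start: int,
                    owner: str, level: int) -> ExtentBuffer:
    """Models the fix: set the class first, then take the lock."""
    eb = ExtentBuffer(start)
    set_buffer_lock_class(eb, owner, level)
    checker.tree_lock(eb)     # class is set, no warning
    return eb
```

In this model, calling `walk_down_unfixed` leaves one entry in `checker.warnings`, while `walk_down_fixed` leaves none. A path that reads the block before locking it, as described for walk_down_log_tree, would correspond to a reader that calls `set_buffer_lock_class` itself before any `tree_lock`.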