Bug 198483 - btrfs directory listing temporarily includes a recently removed file
Summary: btrfs directory listing temporarily includes a recently removed file
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-15 12:03 UTC by Zac Medico
Modified: 2018-02-13 12:14 UTC
CC List: 7 users

See Also:
Kernel Version: 4.14.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Test for stale (removed) files in btrfs directory listings (1.63 KB, application/x-shellscript)
2018-01-15 12:03 UTC, Zac Medico

Description Zac Medico 2018-01-15 12:03:21 UTC
Created attachment 273619
Test for stale (removed) files in btrfs directory listings

The attached script reliably reproduces the problem on a couple of my machines.
By default, it creates files numbered 1 to 350, and then removes every 10th
file. On the two machines I've tested, the file named 341 consistently shows up
as a stale entry. This problem was first reported at https://bugs.gentoo.org/641262.
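
For readers without access to the attachment, the check described above can be sketched roughly as follows. This is not the attached script, only a minimal Python approximation. The function name `find_stale_entries` is hypothetical, and starting the removals at file 1 (so that 1, 11, ..., 341 are removed) is an assumption inferred from 341 being the reported stale entry. Whether this simplified version triggers the bug on an affected kernel is untested.

```python
import os
import tempfile


def find_stale_entries(directory, count=350, step=10):
    """Create files 1..count, remove every step-th one, and return the
    names that os.listdir() still reports although they were removed."""
    for i in range(1, count + 1):
        # Create an empty file named after its number.
        with open(os.path.join(directory, str(i)), "w"):
            pass

    removed = set()
    # Assumption: removal starts at file 1 (1, 11, ..., 341), inferred
    # from the report that 341 is the stale entry.
    for i in range(1, count + 1, step):
        os.unlink(os.path.join(directory, str(i)))
        removed.add(str(i))

    # Any removed name that is still listed is a stale entry.
    listed = set(os.listdir(directory))
    return sorted(removed & listed, key=int)


if __name__ == "__main__":
    # Assumption: TMPDIR (or the default temp location) is on the btrfs
    # file system under test.
    with tempfile.TemporaryDirectory(dir=os.environ.get("TMPDIR")) as d:
        stale = find_stale_entries(d)
        print("stale entries:", stale if stale else "none")
```

On a file system without the problem, the function should return an empty list.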
Comment 1 Johannes Hirte 2018-01-15 12:26:54 UTC
I can confirm this on several different systems with kernels 4.14 and 4.15-rc. I could not reproduce it with 4.13.16, so this clearly looks like a regression.
Comment 2 Johannes Hirte 2018-01-15 15:09:05 UTC
Bisecting points to this commit:

23b5ec74943f44378b68c0edd8e210a86318ea5e is the first bad commit
commit 23b5ec74943f44378b68c0edd8e210a86318ea5e
Author: Josef Bacik <jbacik@fb.com>
Date:   Mon Jul 24 15:14:25 2017 -0400

    btrfs: fix readdir deadlock with pagefault
    
    Readdir does dir_emit while under the btree lock.  dir_emit can trigger
    the page fault which means we can deadlock.  Fix this by allocating a
    buffer on opening a directory and copying the readdir into this buffer
    and doing dir_emit from outside of the tree lock.
    
    Thread A
    readdir  <holding tree lock>
      dir_emit
        <page fault>
          down_read(mmap_sem)
    
    Thread B
    mmap write
      down_write(mmap_sem)
        page_mkwrite
          wait_ordered_extents
    
    Process C
    finish_ordered_extent
      insert_reserved_file_extent
       try to lock leaf <hang>
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    [ copy the deadlock scenario to changelog ]
    Signed-off-by: David Sterba <dsterba@suse.com>

:040000 040000 aeda0fb1a56434e03cf34f5ad07f592de1b5e7e2 76e1d0fe3f1b2bff273b3b38e344da33853cf691 M      fs
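
The approach named in the commit message (copy entries into a buffer under the lock, then emit them with the lock released) can be illustrated with a small user-space analogy. This is a sketch in Python, not the kernel code: the `Directory` class, its `readdir` and `remove` methods, and the use of `threading.Lock` as a stand-in for the tree lock are all hypothetical.

```python
import threading


class Directory:
    """Toy stand-in for a directory protected by a single tree lock."""

    def __init__(self, names):
        self._lock = threading.Lock()  # hypothetical "tree lock"
        self._names = list(names)

    def readdir(self, emit):
        # Phase 1: copy the entries into a private buffer under the lock.
        with self._lock:
            buffered = list(self._names)
        # Phase 2: call emit with the lock released, so emit may block or
        # take other locks without stalling users of self._lock.
        for name in buffered:
            emit(name)

    def remove(self, name):
        with self._lock:
            self._names.remove(name)
```

Because `threading.Lock` is not reentrant, a callback that calls `remove` would hang if `emit` ran while the lock was held. With the two-phase version it completes. A buffered copy can also still contain an entry that was removed after the copy was taken, which is a general property of this pattern; the sketch makes no claim about the actual root cause of this bug.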
Comment 3 Josef Bacik 2018-01-23 20:17:56 UTC
Sorry about that, I've posted a patch, you can find it here

https://patchwork.kernel.org/patch/10181019/
Comment 4 Johannes Hirte 2018-01-24 10:52:32 UTC
(In reply to Josef Bacik from comment #3)
> Sorry about that, I've posted a patch, you can find it here
> 
> https://patchwork.kernel.org/patch/10181019/

Seems to fix it. I can't reproduce it with the test script anymore.
Comment 5 Josh 2018-02-05 21:09:18 UTC
In that case, can we please close this bug?
Comment 6 David Sterba 2018-02-13 12:14:59 UTC
Right. Fixed in 4.15 and backported to 4.14.x.
