Bug 205569

Summary: potential data race (likely benign) on inode->i_state (reading and writing to different bits)
Product: File System
Reporter: Meng Xu (mengxu.gatech)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED INVALID
Severity: normal
CC: tytso
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.4-rc5
Subsystem:
Regression: No
Bisected commit-id:

Description Meng Xu 2019-11-18 20:41:40 UTC
I am reporting a potential data race (possibly benign) in the ext4 layer on inode->i_state, with a read and a write hitting the same byte but different bits: I_DIRTY_PAGES (bit 2) on the write side and I_NEW | I_FREEING (bits 3 and 5) on the read side. The race is observable during the write-back phase.
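To make the bit positions concrete, here is a minimal userspace sketch of the masks involved. The values for I_DIRTY_PAGES, I_NEW and I_FREEING follow the bit numbers given above; I_DIRTY_SYNC, I_DIRTY_DATASYNC and the composition of I_DIRTY are filled in from my reading of include/linux/fs.h in 5.4 and should be treated as assumptions. The helper names (reader_mask, writer_mask, masks_overlap) are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied for illustration only; this is not kernel code. */
#define I_DIRTY_SYNC      (1UL << 0)  /* assumption: bit 0 */
#define I_DIRTY_DATASYNC  (1UL << 1)  /* assumption: bit 1 */
#define I_DIRTY_PAGES     (1UL << 2)  /* bit 2, as stated in the report */
#define I_NEW             (1UL << 3)  /* bit 3, as stated in the report */
#define I_FREEING         (1UL << 5)  /* bit 5, as stated in the report */

/* Assumption: in 5.4, I_DIRTY is the union of the three dirty bits. */
#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

/* Bits tested by the WARN_ON_ONCE() in ext4_orphan_add(). */
static unsigned long reader_mask(void)
{
    return I_NEW | I_FREEING;
}

/* Bits that the write-back path may clear. */
static unsigned long writer_mask(void)
{
    return I_DIRTY;
}

/* True if the two masks share at least one bit. */
static bool masks_overlap(unsigned long a, unsigned long b)
{
    return (a & b) != 0;
}

/* Sanity check: the reader and the writer look at disjoint bits. */
static void check_masks_disjoint(void)
{
    assert(!masks_overlap(reader_mask(), writer_mask()));
}
```

Calling check_masks_disjoint() passes under these assumed values. Disjoint masks alone do not settle whether the race is harmful, since both sides still access the same word.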

The function call trace is shown below:

[Thread 1: SYS_rmdir]
__do_sys_rmdir
  do_rmdir
    vfs_rmdir
      ext4_rmdir
        ext4_orphan_add
          [READ] WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
		     !inode_is_locked(inode));

[Thread 2: write-back thread]
wb_workfn
  wb_do_writeback
    wb_writeback
      writeback_sb_inodes
        __writeback_single_inode
            [WRITE] dirty = inode->i_state & I_DIRTY;


I confirmed that the WRITE can happen either before or after the READ by controlling the timing of the two threads, i.e., by setting a breakpoint before the WRITE statement.

However, I am not sure about the implications of such a data race (e.g., whether it violates any assumptions). I would appreciate it if you could check this potential bug and advise whether it is a harmful data race or intended behavior. Thank you!
Comment 1 Theodore Tso 2019-11-19 00:58:46 UTC
The writeback thread is only applicable for data files.   While rmdir() is only applicable for directories.   Also, in both of these function traces, what you referenced is i_state bits being *read*:

 [WRITE] dirty = inode->i_state & I_DIRTY;
  ^^^^^ not correct!

That being said, there are places in fs/fs-writeback.c where i_state is modified, and there are code paths where ext4_orphan_add() can be called on regular data files --- just not the ones you've listed in this bug.

Can you recheck the call traces and make sure they are correct?
Comment 2 Meng Xu 2019-11-19 01:02:30 UTC
(In reply to Theodore Tso from comment #1)
> The writeback thread is only applicable for data files.   While rmdir() is
> only applicable for directories.   Also, in both of these function traces,
> what you referenced is i_state bits being *read*:
> 
>  [WRITE] dirty = inode->i_state & I_DIRTY;
>   ^^^^^ not correct!
> 
> That being said, there are places in fs/fs-writeback.c where i_state is
> modified, and there are code paths where ext4_orphan_add() can be called on
> regular data files --- just not the ones you've listed in this bug.
> 
> Can you recheck the call traces and make sure they are correct?

Hi Ted,

My bad, the [WRITE] location is a few lines further down the same path:
inode->i_state &= ~dirty;

Best Regards,
Meng
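For readers following along, the corrected write site can be modelled in userspace as a read-modify-write that clears only the dirty bits. The sketch below is a simplification under stated assumptions: I_DIRTY is taken to be bits 0 to 2, the helper names (clear_dirty, orphan_add_would_warn, writer_never_changes_reader_result) are hypothetical, and the real __writeback_single_inode() does more work around these two statements. It checks exhaustively that clearing the dirty bits never changes the outcome of the condition tested in ext4_orphan_add().

```c
#include <stdbool.h>

/* Flag values copied for illustration only; this is not kernel code. */
#define I_DIRTY    (0x7UL)      /* assumption: bits 0-2 */
#define I_NEW      (1UL << 3)
#define I_FREEING  (1UL << 5)

/*
 * Model of the two statements on the write-back side:
 *     dirty = inode->i_state & I_DIRTY;
 *     inode->i_state &= ~dirty;
 * Returns the dirty bits that were cleared.
 */
static unsigned long clear_dirty(unsigned long *i_state)
{
    unsigned long dirty = *i_state & I_DIRTY;

    *i_state &= ~dirty;
    return dirty;
}

/* Model of the condition inside WARN_ON_ONCE() in ext4_orphan_add(). */
static bool orphan_add_would_warn(unsigned long i_state, bool inode_locked)
{
    return !(i_state & (I_NEW | I_FREEING)) && !inode_locked;
}

/*
 * Try every combination of bits 0..6 and both lock states, and compare
 * the reader's result before and after the writer clears the dirty bits.
 */
static bool writer_never_changes_reader_result(void)
{
    for (unsigned long s = 0; s < (1UL << 7); s++) {
        for (int locked = 0; locked <= 1; locked++) {
            unsigned long after = s;
            bool before = orphan_add_would_warn(s, locked != 0);

            clear_dirty(&after);
            if (orphan_add_would_warn(after, locked != 0) != before)
                return false;
        }
    }
    return true;
}
```

The model treats each access as a single whole-word load or store, so it only shows that the value seen by the reader is the same before and after the write. It says nothing about how the compiler or the kernel memory model treats the unsynchronized read.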
Comment 3 Theodore Tso 2019-11-19 01:26:21 UTC
Yes, it's benign.   An inode which is I_NEW or I_FREEING will never be in the writeback.
Comment 4 Meng Xu 2019-11-19 01:27:58 UTC
(In reply to Theodore Tso from comment #3)
> Yes, it's benign.   An inode which is I_NEW or I_FREEING will never be in
> the writeback.

Many thanks for the confirmation, Ted. In the future, I'll post uncertain cases like this to the mailing list instead of filing a bug report.