Bug 207713

Summary: xfs: data races on ip->i_itemp->ili_fields in xfs_inode_clean()
Product: File System Reporter: Jia-Ju Bai (baijiaju1990)
Component: XFSAssignee: FileSystem/XFS Default Virtual Assignee (filesystem_xfs)
Status: RESOLVED INVALID    
Severity: normal CC: baijiaju1990, sandeen
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.4 Subsystem:
Regression: No Bisected commit-id:

Description Jia-Ju Bai 2020-05-13 03:02:10 UTC
The function xfs_inode_clean() is concurrently executed with the functions xfs_inode_item_format_data_fork(), xfs_trans_log_inode() and xfs_inode_item_format() at runtime in the following call contexts:

Thread 1:
xfsaild()
  xfsaild_push()
    xfsaild_push_item()
      xfs_inode_item_push()
        xfs_iflush()
          xfs_iflush_cluster()
            xfs_inode_clean()

Thread 2 (case 1):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_commit()
          __xfs_trans_commit()
            xfs_log_commit_cil()
              xlog_cil_insert_items()
                xlog_cil_insert_format_items()
                  xfs_inode_item_format()
                    xfs_inode_item_format_data_fork()

Thread 2 (case 2):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_log_inode()

Thread 2 (case 3):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_commit()
          __xfs_trans_commit()
            xfs_log_commit_cil()
              xlog_cil_insert_items()
                xlog_cil_insert_format_items()
                  xfs_inode_item_format()

In xfs_inode_clean():
  return !ip->i_itemp || !(ip->i_itemp->ili_fields & XFS_ILOG_ALL);

In xfs_inode_item_format_data_fork() (case 1):
  iip->ili_fields &=
	~(XFS_ILOG_DDATA | XFS_ILOG_DBROOT | XFS_ILOG_DEV);

In xfs_trans_log_inode() (case 2):
  ip->i_itemp->ili_fields |= flags;

In xfs_inode_item_format() (case 3):
  iip->ili_fields &=
	~(XFS_ILOG_ADATA | XFS_ILOG_ABROOT | XFS_ILOG_AEXT);

The variables ip->i_itemp->ili_fields and iip->ili_fields access the same memory, and thus data races can occur.

These data race were found and actually reproduced by our concurrency fuzzer.

I am not sure whether these data races are harmful and how to fix them properly, so I want to listen to your opinions, thanks :)
Comment 1 Dave Chinner 2020-05-13 04:31:58 UTC
On Wed, May 13, 2020 at 03:02:10AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207713
> 
>             Bug ID: 207713
>            Summary: xfs: data races on ip->i_itemp->ili_fields in
>                     xfs_inode_clean()
>            Product: File System
>            Version: 2.5
>     Kernel Version: 5.4
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: XFS
>           Assignee: filesystem_xfs@kernel-bugs.kernel.org
>           Reporter: baijiaju1990@gmail.com
>         Regression: No
> 
> The function xfs_inode_clean() is concurrently executed with the functions
> xfs_inode_item_format_data_fork(), xfs_trans_log_inode() and
> xfs_inode_item_format() at runtime in the following call contexts:
> 
> Thread 1:
> xfsaild()
>   xfsaild_push()
>     xfsaild_push_item()
>       xfs_inode_item_push()
>         xfs_iflush()
>           xfs_iflush_cluster()
>             xfs_inode_clean()

The code explains:

                /*
                 * Do an un-protected check to see if the inode is dirty and
                 * is a candidate for flushing.  These checks will be repeated
                 * later after the appropriate locks are acquired.
                 */
                if (xfs_inode_clean(cip) && xfs_ipincount(cip) == 0)
                        continue;

Then it is repeated if the racy check indicates the inode is dirty
whilst holding the correct inode locks.  All three of the "thread 2"
cases are modifying the fields with the correct locks held, so there
isn't actually a data race bug we care about here.

IOWs, this code is simply avoiding taking locks if we don't need
them - it's a pattern we use in numerous places in XFS, so you're
going to get lots of false positives from your tool.

Not a bug, please close.

-Dave.
Comment 2 Jia-Ju Bai 2020-05-13 08:00:10 UTC
Okay, thanks for explanation.
Comment 3 Eric Sandeen 2020-05-13 13:36:57 UTC
A comment to indicate that this is intentional may be helpful, though. :)