Bug 207713 - xfs: data races on ip->i_itemp->ili_fields in xfs_inode_clean()
Summary: xfs: data races on ip->i_itemp->ili_fields in xfs_inode_clean()
Status: RESOLVED INVALID
Alias: None
Product: File System
Classification: Unclassified
Component: XFS (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: FileSystem/XFS Default Virtual Assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-13 03:02 UTC by Jia-Ju Bai
Modified: 2020-05-13 13:36 UTC (History)
2 users (show)

See Also:
Kernel Version: 5.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Jia-Ju Bai 2020-05-13 03:02:10 UTC
The function xfs_inode_clean() is concurrently executed with the functions xfs_inode_item_format_data_fork(), xfs_trans_log_inode() and xfs_inode_item_format() at runtime in the following call contexts:

Thread 1:
xfsaild()
  xfsaild_push()
    xfsaild_push_item()
      xfs_inode_item_push()
        xfs_iflush()
          xfs_iflush_cluster()
            xfs_inode_clean()

Thread 2 (case 1):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_commit()
          __xfs_trans_commit()
            xfs_log_commit_cil()
              xlog_cil_insert_items()
                xlog_cil_insert_format_items()
                  xfs_inode_item_format()
                    xfs_inode_item_format_data_fork()

Thread 2 (case 2):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_log_inode()

Thread 2 (case 3):
xfs_file_write_iter()
  xfs_file_buffered_aio_write()
    xfs_file_aio_write_checks()
      xfs_vn_update_time()
        xfs_trans_commit()
          __xfs_trans_commit()
            xfs_log_commit_cil()
              xlog_cil_insert_items()
                xlog_cil_insert_format_items()
                  xfs_inode_item_format()

In xfs_inode_clean():
  return !ip->i_itemp || !(ip->i_itemp->ili_fields & XFS_ILOG_ALL);

In xfs_inode_item_format_data_fork() (case 1):
  iip->ili_fields &=
	~(XFS_ILOG_DDATA | XFS_ILOG_DBROOT | XFS_ILOG_DEV);

In xfs_trans_log_inode() (case 2):
  ip->i_itemp->ili_fields |= flags;

In xfs_inode_item_format() (case 3):
  iip->ili_fields &=
	~(XFS_ILOG_ADATA | XFS_ILOG_ABROOT | XFS_ILOG_AEXT);

The variables ip->i_itemp->ili_fields and iip->ili_fields access the same memory, and thus data races can occur.

These data race were found and actually reproduced by our concurrency fuzzer.

I am not sure whether these data races are harmful and how to fix them properly, so I want to listen to your opinions, thanks :)
Comment 1 Dave Chinner 2020-05-13 04:31:58 UTC
On Wed, May 13, 2020 at 03:02:10AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207713
> 
>             Bug ID: 207713
>            Summary: xfs: data races on ip->i_itemp->ili_fields in
>                     xfs_inode_clean()
>            Product: File System
>            Version: 2.5
>     Kernel Version: 5.4
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: XFS
>           Assignee: filesystem_xfs@kernel-bugs.kernel.org
>           Reporter: baijiaju1990@gmail.com
>         Regression: No
> 
> The function xfs_inode_clean() is concurrently executed with the functions
> xfs_inode_item_format_data_fork(), xfs_trans_log_inode() and
> xfs_inode_item_format() at runtime in the following call contexts:
> 
> Thread 1:
> xfsaild()
>   xfsaild_push()
>     xfsaild_push_item()
>       xfs_inode_item_push()
>         xfs_iflush()
>           xfs_iflush_cluster()
>             xfs_inode_clean()

The code explains:

                /*
                 * Do an un-protected check to see if the inode is dirty and
                 * is a candidate for flushing.  These checks will be repeated
                 * later after the appropriate locks are acquired.
                 */
                if (xfs_inode_clean(cip) && xfs_ipincount(cip) == 0)
                        continue;

Then it is repeated if the racy check indicates the inode is dirty
whilst holding the correct inode locks.  All three of the "thread 2"
cases are modifying the fields with the correct locks held, so there
isn't actually a data race bug we care about here.

IOWs, this code is simply avoiding taking locks if we don't need
them - it's a pattern we use in numerous places in XFS, so you're
going to get lots of false positives from your tool.

Not a bug, please close.

-Dave.
Comment 2 Jia-Ju Bai 2020-05-13 08:00:10 UTC
Okay, thanks for explanation.
Comment 3 Eric Sandeen 2020-05-13 13:36:57 UTC
A comment to indicate that this is intentional may be helpful, though. :)

Note You need to log in before you can comment on or make changes to this bug.