Bug 206397
Summary: | [xfstests generic/475] XFS: Assertion failed: iclog->ic_state == XLOG_STATE_ACTIVE, file: fs/xfs/xfs_log.c, line: 572 | | |
---|---|---|---|
Product: | File System | Reporter: | Zorro Lang (zlang) |
Component: | XFS | Assignee: | FileSystem/XFS Default Virtual Assignee (filesystem_xfs) |
Status: | NEW --- | | |
Severity: | normal | CC: | chandanrlinux |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | linux 5.5+ with xfs-linux xfs-5.6-merge-7 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
Zorro Lang
2020-02-04 03:44:50 UTC
I was unable to recreate this issue on a ppc64le kvm guest. I used Linux v5.5 and xfsprogs' for-next branch.

Can you please share the kernel config file? Also, can you please tell me how easy it is to recreate this bug?

(In reply to Chandan Rajendra from comment #1)
> I was unable to recreate this issue on a ppc64le kvm guest. I used Linux
> v5.5 and xfsprogs' for-next branch.
>
> Can you please share the kernel config file? Also, can you please tell me
> how easy it is to recreate this bug?

It's really hard to reproduce. g/475 is a random test; it has helped us find many different issues. For this bug, this is the first time I've hit it, and I can't reproduce it easily.

On Tue, Feb 04, 2020 at 05:10:05PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=206397
>
> --- Comment #2 from Zorro Lang (zlang@redhat.com) ---
> It's really hard to reproduce. g/475 is a random test; it has helped us
> find many different issues. For this bug, this is the first time I've hit
> it, and I can't reproduce it easily.

Have you still been unable to reproduce (assuming you've been attempting to)? How many iterations were required before you reproduced the first time?

I'm wondering if the XLOG_STATE_IOERROR check in xfs_log_release_iclog() is racy with respect to filesystem shutdown. There's an ASSERT_ALWAYS() earlier in this (xlog_cil_push()) codepath that checks for ACTIVE || WANT_SYNC and it doesn't appear that it has failed from your output snippet. The aforementioned IOERROR check occurs before we acquire ->l_icloglock, however, which I think means xfs_log_force_umount() could jump in if called from another task and reset all of the iclogs while the release path waits on the lock.

Brian

On Wed, Feb 12, 2020 at 10:55:10AM -0500, Brian Foster wrote:
> I'm wondering if the XLOG_STATE_IOERROR check in xfs_log_release_iclog()
> is racy with respect to filesystem shutdown. There's an ASSERT_ALWAYS()
> earlier in this (xlog_cil_push()) codepath that checks for ACTIVE ||
> WANT_SYNC and it doesn't appear that it has failed from your output
> snippet. The aforementioned IOERROR check occurs before we acquire
> ->l_icloglock, however, which I think means xfs_log_force_umount() could
> jump in if called from another task and reset all of the iclogs while
> the release path waits on the lock.

FWIW, I wasn't able to reproduce after a day or so of iterating generic/475, but I was able to confirm that the check referenced above is racy.

The problem looks like a minor oversight in commit df732b29c8 ("xfs: call xlog_state_release_iclog with l_icloglock held"). I've floated a patch here:

https://lore.kernel.org/linux-xfs/20200214181528.24046-1-bfoster@redhat.com/

Brian
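
For readers following along, the check-then-lock window described in this thread can be sketched in userspace C. This is only an illustration of the pattern, not the kernel code and not the floated patch: `struct iclog`, `force_shutdown()`, `release_racy()`, `release_fixed()`, and the `window_hook` callback are all hypothetical names, and a pthread mutex stands in for ->l_icloglock. The hook replays the interleaving deterministically (shutdown running between the unlocked check and the lock) instead of depending on thread timing. Re-checking the state under the lock is one way such a window can be closed; the thread itself does not spell out what the patch changes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the iclog states named in the thread. */
enum ic_state { IC_ACTIVE, IC_WANT_SYNC, IC_IOERROR };

struct iclog {
    enum ic_state state;
    int released_while_errored; /* release work that ran on an errored iclog */
};

/* Plays the role of ->l_icloglock in this sketch. */
static pthread_mutex_t icloglock = PTHREAD_MUTEX_INITIALIZER;

/* Models the shutdown path: flips the iclog to the error state under the lock. */
void force_shutdown(struct iclog *ic)
{
    pthread_mutex_lock(&icloglock);
    ic->state = IC_IOERROR;
    pthread_mutex_unlock(&icloglock);
}

/* Runs "another task" inside the window, so the race is replayed on demand. */
typedef void (*window_hook)(struct iclog *);

/*
 * Racy ordering: error check first, lock second.
 * Returns 0 if the release work ran, -1 if it bailed out on IOERROR.
 */
int release_racy(struct iclog *ic, window_hook other_task)
{
    if (ic->state == IC_IOERROR)        /* unlocked check */
        return -1;
    if (other_task)
        other_task(ic);                 /* shutdown slips in here */
    pthread_mutex_lock(&icloglock);
    if (ic->state == IC_IOERROR)        /* the earlier check is now stale */
        ic->released_while_errored++;   /* where a state assertion would trip */
    pthread_mutex_unlock(&icloglock);
    return 0;
}

/* Same window, but the state is checked only once the lock is held. */
int release_fixed(struct iclog *ic, window_hook other_task)
{
    if (other_task)
        other_task(ic);
    pthread_mutex_lock(&icloglock);
    if (ic->state == IC_IOERROR) {
        pthread_mutex_unlock(&icloglock);
        return -1;
    }
    /* Cannot fire: the state cannot change while the lock is held. */
    assert(ic->state == IC_ACTIVE || ic->state == IC_WANT_SYNC);
    pthread_mutex_unlock(&icloglock);
    return 0;
}
```

Calling `release_racy(&ic, force_shutdown)` on an ACTIVE iclog returns 0 and bumps `released_while_errored`, which mirrors release work proceeding on an iclog that shutdown has already reset. `release_fixed(&ic, force_shutdown)` returns -1 in the same interleaving because the check and the work happen under the same lock hold.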