Bug 204223
Summary: | [fstests generic/388 on xfs]: 4.19.58 xfs_nocrc / xfs_reflink null pointer dereference at xfs_trans_brelse+0x21 | ||
---|---|---|---|
Product: | File System | Reporter: | Luis Chamberlain (mcgrof) |
Component: | XFS | Assignee: | Luis Chamberlain (mcgrof) |
Status: | REOPENED --- | ||
Severity: | normal | CC: | filesystem_xfs, mcgrof, zlang |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 4.19.58 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Bug Depends on: | 204049 | ||
Bug Blocks: |
Description
Luis Chamberlain
2019-07-18 19:03:42 UTC
I'll note that I cannot reproduce the issue on the default configuration, that is, the fstests "xfs" configuration as per oscheck, which does not have crc disabled.

I'm now able to trigger the same crash on an "xfs_reflink" configuration:

# xfs_info /dev/loop5
meta-data=/dev/loop5             isize=512    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=3693, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

With the default configuration it is not crashing.

After about 30 runs I managed to crash a system with the same type of output as on this bug, with reflink enabled but rmapbt disabled, i.e. the "xfs_reflink_normapbt" configuration Zorro used on kz#204049 [0]:

# xfs_info /dev/loop5
meta-data=/dev/loop5             isize=512    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[0] https://bugzilla.kernel.org/show_bug.cgi?id=204049

kz#204049 reveals upstream is also affected by this bug then.

(In reply to Luis Chamberlain from comment #3)
> kz#204049 reveals upstream is also affected by this bug then.

To be clear, but with a different output, so a different issue, and Zorro would have to test other configurations to see if the same issue creeps up on upstream with them as well.
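To make the difference between the two xfs_info dumps above easy to spot, here is a minimal sketch that extracts every key=value token from each dump and reports the tokens unique to each side. The helper names (feature_tokens, config_diff) are made up for illustration and are not part of xfsprogs or fstests. The sketch deliberately ignores which section (data, log, ...) a token came from, so a repeated key such as blocks is only identified by its value.

```python
import re

# key=value tokens as printed by xfs_info; section labels such as
# "naming =version 2" have a space before "=", so they are not matched.
TOKEN = re.compile(r"[\w-]+=[^\s,=]+")

# The two dumps from this report, whitespace-flattened.
XFS_REFLINK = """
meta-data=/dev/loop5 isize=512 agcount=4, agsize=1310720 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1
data = bsize=4096 blocks=5242880, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=3693, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
"""

XFS_REFLINK_NORMAPBT = XFS_REFLINK.replace("rmapbt=1", "rmapbt=0").replace(
    "blocks=3693", "blocks=2560"
)


def feature_tokens(xfs_info_text):
    """Return the set of key=value tokens found in xfs_info output."""
    return set(TOKEN.findall(xfs_info_text))


def config_diff(a, b):
    """Return (tokens only in a, tokens only in b), each sorted."""
    ta, tb = feature_tokens(a), feature_tokens(b)
    return sorted(ta - tb), sorted(tb - ta)
```

Calling config_diff(XFS_REFLINK, XFS_REFLINK_NORMAPBT) returns (['blocks=3693', 'rmapbt=1'], ['blocks=2560', 'rmapbt=0']), i.e. the two filesystems differ only in rmapbt and in the log size (the blocks value on the log line).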
(In reply to Luis Chamberlain from comment #0)
> [129135.499383] BUG: unable to handle kernel NULL pointer dereference at <-- snip -->
> [129135.507540] RIP: 0010:xfs_trans_brelse+0x21/0xd0 [xfs]

Reproduced with the "xfs_reflink_normapbt" configuration and can confirm an immediate panic on vanilla 4.19.20 with the same trace as above.

Fixed by commit 6958d11f77d45d ("xfs: don't trip over uninitialized buffer on extent read of corrupted inode"). Sent as part of the set of stable fixes for the v4.19.y series.

Not fixed yet, it just takes longer to trigger with this commit.

These commits fix this crash:

xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
xfs: Add helper function xfs_attr_try_sf_addname
xfs: Add attibute set and helper functions
xfs: Add attibute remove and helper functions
xfs: always rejoin held resources during defer roll

I've left generic/388 running over time: it ran successfully up to 247 times and then failed, but at least without a crash.

In particular, the last commit also has some fixes merged into it that make bhold callers release held buffers correctly, which IMHO should have been split out into a separate patch. Extracting just that minor fix is difficult due to the amount of churn from the prior patches. We'll have to try to do that work somehow, or just consider merging all of these.
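Since this bug only shows up after many iterations (about 30 runs in one case, 247 clean runs in another), a small loop driver helps when repeating the test. The following is a minimal sketch, not the oscheck implementation: run_until_failure is a hypothetical helper that reruns a command until it exits non-zero or a run limit is reached, logging each iteration to stderr so the last completed run is visible on the console. It can only observe ordinary test failures; if the kernel panics, the host goes down and the loop ends with it.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs):
    """Run cmd up to max_runs times, stopping at the first non-zero exit.

    Returns (number of successful runs, last return code).
    """
    ok = 0
    rc = 0
    for i in range(1, max_runs + 1):
        rc = subprocess.run(cmd).returncode
        # Flush progress right away so it is not lost if the host dies.
        print(f"run {i}: rc={rc}", file=sys.stderr, flush=True)
        if rc != 0:
            break
        ok += 1
    return ok, rc
```

Assuming an fstests checkout whose configuration defines a section named xfs_reflink_normapbt (the section name is taken from this report, your local config may differ), a call from the fstests directory could look like run_until_failure(["./check", "-s", "xfs_reflink_normapbt", "generic/388"], 300).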