Bug 203319 - [xfstests generic/538]: unaligned direct AIO find ext4 corruption
Status: NEW
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
 
Reported: 2019-04-15 08:04 UTC by Zorro Lang
Modified: 2019-05-24 05:20 UTC
Kernel Version: 5.1.0-rc4+
Regression: No

Description Zorro Lang 2019-04-15 08:04:08 UTC
After the following fix:
commit 372a03e01853f860560eade508794dd274e9b390
Author: Lukas Czerner <lczerner@redhat.com>
Date:   Thu Mar 14 23:20:25 2019 -0400

    ext4: fix data corruption caused by unaligned direct AIO

generic/538 can still trigger a failure on ext4. I hit it once on a ppc64le machine:

Running test generic/538
#! /bin/bash
# SPDX-License-Identifier: GPL-2.0
# Copyright (c) 2019 Red Hat, Inc.  All Rights Reserved.
#
# FS QA Test No. 538
#
# Non-block-aligned direct AIO write test with an initial truncate i_size.
#
# Uncover "ext4: Fix data corruption caused by unaligned direct AIO":
# (Ext4 needs to serialize unaligned direct AIO because the zeroing of
FSTYP         -- ext4
PLATFORM      -- Linux/ppc64le ibm-p8-kvm-11-guest-02 5.1.0-rc4+
MKFS_OPTIONS  -- /dev/vda5
MOUNT_OPTIONS -- -o acl,user_xattr -o context=system_u:object_r:nfs_t:s0 /dev/vda5 /mnt/xfstests/mnt2

generic/538	- output mismatch (see /var/lib/xfstests/results//generic/538.out.bad)
    --- tests/generic/538.out	2019-04-14 05:10:30.200092759 -0400
    +++ /var/lib/xfstests/results//generic/538.out.bad	2019-04-14 07:20:09.996795190 -0400
    @@ -1,2 +1,20 @@
     QA output created by 538
    +Data verification fails
    +Find corruption
    +00000000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    +*
    +00000200  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
    +00002000
    ...
    (Run 'diff -u /var/lib/xfstests/tests/generic/538.out /var/lib/xfstests/results//generic/538.out.bad'  to see the entire diff)
Ran: generic/538
Failures: generic/538
Failed 1 of 1 tests

# cat 538.out.bad
QA output created by 538
Data verification fails
Find corruption
00000000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00000200  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
00002000
FAIL: [8704, 8192, 512, 0]
-------------------------------------------------
Data verification fails
Find corruption
00002000  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
*
00002200  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00003000  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
00004000
FAIL: [8704, 8192, 512, 8192]
-------------------------------------------------
Silence is golden
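
For anyone reading the two dumps above, here is a small sketch that expands the collapsed hexdump output into explicit byte ranges. It is not part of xfstests; `hexdump_runs`, `FIRST_DUMP` and `SECOND_DUMP` are hypothetical names introduced only for illustration. It assumes that every printed row holds a single repeated byte value (true for these dumps) and that a `*` line means the previous row repeats up to the next printed offset.

```python
import re

# The two dumps from 538.out.bad, copied verbatim.
FIRST_DUMP = """\
00000000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00000200  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
00002000
"""

SECOND_DUMP = """\
00002000  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
*
00002200  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00003000  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
00004000
"""


def hexdump_runs(dump):
    """Return a list of (start, end, byte_value) runs, end exclusive.

    Assumption: each data row contains one repeated byte value.
    """
    marks = []  # (offset, fill byte), or (offset, None) for the final offset
    for line in dump.splitlines():
        fields = line.split()
        if not fields or not re.fullmatch(r"[0-9a-f]{8}", fields[0]):
            continue  # skips '*' lines and any non-dump text
        offset = int(fields[0], 16)
        data = [f for f in fields[1:17] if re.fullmatch(r"[0-9a-f]{2}", f)]
        if not data:
            marks.append((offset, None))  # bare offset: end of the dump
        elif len(set(data)) == 1:
            marks.append((offset, int(data[0], 16)))
        else:
            raise ValueError("row mixes byte values; this sketch cannot handle it")

    runs = []
    for (start, value), (end, _) in zip(marks, marks[1:]):
        if value is None:
            continue  # gap between two separate dumps
        if runs and runs[-1][1] == start and runs[-1][2] == value:
            runs[-1] = (runs[-1][0], end, value)  # extend the previous run
        else:
            runs.append((start, end, value))
    return runs


for name, dump in (("first", FIRST_DUMP), ("second", SECOND_DUMP)):
    for start, end, value in hexdump_runs(dump):
        print(f"{name}: [{start}, {end}) {end - start} bytes of {value:#04x}")
```

Read this way, the first dump has zeros in bytes 0 to 511 followed by 0x5a up to offset 8192, and the second has 0x5a in bytes 8192 to 8703, zeros from 8704 to 12287, then 0x5a up to 16384. The boundaries 512, 8192 and 8704 also appear in the FAIL lines; the exact meaning of each FAIL field is defined by the test script and is not restated here.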
Comment 1 Zorro Lang 2019-04-15 17:39:44 UTC
ppc64le (64k page size, 512-byte sector size) reproduces this bug nearly 100% of the time. I just ran it 200 times on ppc64le, and every run failed [1]. x86_64, however, does not trigger this issue, even when I use ext4 with a 1k block size.

[1]
QA output created by 538
Data verification fails
Find corruption
00000000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00000200  5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
00002000
FAIL: [8704, 8192, 512, 0]
-------------------------------------------------
Silence is golden
