Bug 193661

Summary: xattr ext4_xattr_block_find, bad block on cleanly formatted ext4 partition
Product: File System
Reporter: Colin Ian King (colin.king)
Component: ext4
Assignee: Theodore Tso (tytso)
Status: RESOLVED CODE_FIX
Severity: high
CC: kernel, tytso
Priority: P1
Hardware: i386
OS: Linux
Kernel Version: all versions tested, from 3.19 through to 4.10-rc6
Subsystem:
Regression: No
Bisected commit-id:
Attachments: ext4: lock the the xattr block before calculating its checksum
ext4: lock the xattr block before checksuming it

Description Colin Ian King 2017-01-30 16:06:21 UTC
Reproducer:

Ran inside an 8 proc VM instance, i386 kernel, 4MB of memory, with the stress-ng xattr stress test on a cleanly formatted ext4 partition (e.g. mkfs.ext4 /dev/vdb1):

Using stress-ng; to build from source on Ubuntu systems:
   git clone git://kernel.ubuntu.com/cking/stress-ng
   sudo apt-get build-dep stress-ng
   cd stress-ng
   make

Test script to invoke the stressor (needs to be run as root):

#!/bin/bash -x
for i in $(seq 30)
do
        ./stress-ng/stress-ng --xattr 8 -t 10
        rc=$?
        if [ $rc -ne 0 ]; then
                exit 1
        fi
done

This will trigger errors such as:

EXT4-fs error (device vdb1): ext4_xattr_block_find:802: inode #131074: comm stress-ng-xattr: bad block 532519

and more often than not one needs to fsck the partition.
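
A minimal sketch for pulling the device, inode and block number out of such a log line, assuming the message format quoted above. The helper name parse_xattr_error is made up for illustration and is not part of stress-ng or e2fsprogs:

```shell
#!/bin/bash
# Hypothetical helper: read kernel log lines on stdin and print
# "<device> <inode> <block>" for each ext4_xattr_block_find
# "bad block" error. Assumes the message format shown in this report;
# other kernel versions may word the message differently.
parse_xattr_error() {
        sed -n -E 's/.*EXT4-fs error \(device ([^)]+)\): ext4_xattr_block_find:[0-9]+: inode #([0-9]+): .*bad block ([0-9]+).*/\1 \2 \3/p'
}

# Example usage (reading the kernel log may require root):
#   dmesg | parse_xattr_error
```

Lines that do not match the pattern are dropped, so an empty result after a test run suggests the error was not logged.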

I've tested this on mainline kernels from 3.19 through to 4.10-rc6 and I can trigger this on these kernels only on i386 systems. I've tested other 32 bit platforms (32 bit arm, raspberry pi 2) but can't trigger it.  I cannot trigger this issue on amd64 builds of the same kernels in a VM.

I put some debug into ext4_xattr_block_set() at the /* update the inode. */ comment and the bug does not trip, so it seems that this reduces the risk of the race condition occurring.
Comment 1 Colin Ian King 2017-01-30 16:08:57 UTC
Forgot to mention, this can be triggered on a SMP ppc64el VM with 4.8, but far less frequently than the i386.
Comment 2 Colin Ian King 2017-01-30 16:10:12 UTC
..and also userspace xattr setting calls get errno EUCLEAN 117 "Structure needs cleaning".
Comment 3 Colin Ian King 2017-01-30 16:24:02 UTC
For reference, the xattr stressor is: http://kernel.ubuntu.com/git/cking/stress-ng.git/tree/stress-xattr.c
Comment 4 Theodore Tso 2017-01-30 16:48:08 UTC
Thanks for the bug report!

Does this require using the latest version of stress-ng (from the git tree), or is it reproducible using the version of stress-ng in Ubuntu or Debian Jessie?
Comment 5 Colin Ian King 2017-01-30 16:50:14 UTC
I'd recommend the latest version from the git repo.
Comment 6 Theodore Tso 2017-01-30 16:53:01 UTC
And since you appear to be the maintainer of the stress-ng package, while I'm here, any objections if I upload a backport of stress-ng v0.07.16 to jessie-backports?  :-)
Comment 7 Colin Ian King 2017-01-30 16:55:42 UTC
An upload to jessie-backports is doable, but I'm going to push another release at the end of this week.
Comment 8 Colin Ian King 2017-02-21 14:52:39 UTC
Hi Ted, is there anything else I need to add to this bug report?
Comment 9 Christian Kujau 2017-02-22 00:24:33 UTC
Wow, this triggers in a Debian/unstable 64-bit VM here after the first run:

$ mkfs.ext4 /dev/sdd
$ mount -t ext4 /dev/sdd /mnt/disk && cd /mnt/disk
$ time ~/193661.sh
++ seq 30
+ for i in $(seq 30)
+ /usr/local/sbin/stress-ng --xattr 8 -t 10
stress-ng: info:  [2639] dispatching hogs: 8 xattr
stress-ng: info:  [2639] cache allocate: default cache size: 6144K
stress-ng: fail:  [2647] stress-ng-xattr: fsetxattr failed, errno=117 (Structure needs cleaning)
stress-ng: error: [2639] process 2647 (stress-ng-xattr) terminated with an error, exit status=1
stress-ng: info:  [2639] unsuccessful run completed in 10.01s
+ rc=2
+ '[' 2 -ne 0 ']'
+ exit 1

real    0m10.019s
user    0m1.460s
sys     0m26.752s

$ dmesg | tail -2
[  494.311212] EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts: (null)
[  550.828337] EXT4-fs error (device sdd): ext4_xattr_block_find:786: inode #524292: comm stress-ng-xattr: bad block 2105384

$ uname -rv
4.9.0-1-amd64 #1 SMP Debian 4.9.6-3 (2017-01-28)
Comment 10 Theodore Tso 2017-02-28 14:01:36 UTC
Created attachment 254979 [details]
ext4: lock the the xattr block before calculating its checksum
Comment 11 Theodore Tso 2017-03-25 21:25:34 UTC
Created attachment 255541 [details]
ext4: lock the xattr block before checksuming it
Comment 12 Colin Ian King 2017-03-26 00:27:53 UTC
I've given this a good soak test on 32 bit and 64 bit x86 builds and it
fixes the issue. Thanks Ted.

Tested-by: Colin Ian King <colin.king@canonical.com>