Bug 202809

Summary: ext4: ext4_xattr_ibody_get:591: comm systemd-journal: corrupted in-inode xattr
Product: File System
Reporter: Feng Tang (feng.tang)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED INVALID
Severity: high
CC: 1158340263, feng.tang, tytso
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 4.19.23-19
Subsystem:
Regression: No
Bisected commit-id:
Attachments: fsck output

Description Feng Tang 2019-03-07 08:39:49 UTC
Created attachment 281569
fsck output

When we run stress tests on our platform, we see error messages like this:

[49074.700194] EXT4-fs error (device mmcblk1p1): ext4_xattr_ibody_get:591: inode #164242: comm systemd-journal: corrupted in-inode xattr
[49074.701633] EXT4-fs error (device mmcblk1p1): ext4_xattr_ibody_get:591: inode #164247: comm systemd-journal: corrupted in-inode xattr
[49074.703260] EXT4-fs error (device mmcblk1p1): ext4_xattr_ibody_get:591: inode #164245: comm systemd-journal: corrupted in-inode xattr
[49079.621065] EXT4-fs error: 39 callbacks suppressed
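
For triage, it can help to pull the affected devices and inode numbers out of messages like these. The following is only a minimal Python sketch, assuming the kernel log format shown in the lines above; the function name `parse_ext4_errors` and the regular expression are illustrative and not part of any kernel or e2fsprogs tooling.

```python
import re

# Pattern assumed from the sample lines above; other kernel versions
# may format EXT4-fs errors differently.
EXT4_ERR = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\): "
    r"(?P<func>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+): comm (?P<comm>[^:]+): (?P<msg>.*)"
)


def parse_ext4_errors(log_text):
    """Return one dict per per-inode EXT4-fs error line found in log_text."""
    records = []
    for line in log_text.splitlines():
        match = EXT4_ERR.search(line)
        if match is None:
            # Lines such as "EXT4-fs error: 39 callbacks suppressed" are skipped.
            continue
        record = match.groupdict()
        record["inode"] = int(record["inode"])
        record["line"] = int(record["line"])
        records.append(record)
    return records
```

Feeding the output of `dmesg` to `parse_ext4_errors` gives a list that can be grouped by inode or by device. Note that rate-limited ("callbacks suppressed") messages never reach the log, so the list may be incomplete.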

kernel version is 4.19.23-19.
CPU: Intel(R) Atom(TM) Processor A3960 @ 1.90GHz
rootfs: emmc storage card
Running on top of a hypervisor (ACRN)

When the error message appears, the GUI stops working and sometimes the serial console stops too, but we can still connect to the system through SSH. If we run fsck on that system, we see some errors (full log attached).

Some of them are:

Multiply-claimed block(s) in inode 164223: 656535
Multiply-claimed block(s) in inode 164224: 673971 757260--758282
Multiply-claimed block(s) in inode 164225: 656531

File /var/log/journal/cf5c3ff473bd4a2dbfc890565e0439f9/system@e2d41008f0ce4d53849cdc7cec0f8a29-000000000003045c-0005831a9bb56af7.journal (inode #164224, mod time Sat Mar  2 11:13:48 2019)
  has 1024 multiply-claimed block(s), shared with 1 file(s):
        /var/log/crashlog/vmevent393_98cd58c91b1da34b635e/crashfile (inode #164240, mod time Sat Mar  2 11:13:48 2019)
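
The block lists in the "Multiply-claimed block(s)" lines can be cross-checked against the per-file summary. Below is a minimal Python sketch, assuming the fsck output format quoted above, where `A--B` denotes an inclusive block range; the helper names `parse_multiply_claimed` and `block_count` are hypothetical.

```python
import re

# Format assumed from the fsck excerpt above.
MULTI = re.compile(r"Multiply-claimed block\(s\) in inode (\d+):\s*(.*)")


def parse_multiply_claimed(fsck_text):
    """Map inode number -> list of inclusive (first_block, last_block) ranges."""
    result = {}
    for line in fsck_text.splitlines():
        match = MULTI.search(line)
        if match is None:
            continue
        ranges = []
        for token in match.group(2).split():
            # "757260--758282" is a range; a bare number is a single block.
            first, _, last = token.partition("--")
            ranges.append((int(first), int(last or first)))
        result.setdefault(int(match.group(1)), []).extend(ranges)
    return result


def block_count(ranges):
    """Total number of blocks covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)
```

For inode 164224 above, the single block 673971 plus the range 757260--758282 (1023 blocks) adds up to 1024 blocks, which agrees with the "has 1024 multiply-claimed block(s)" summary for that journal file.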

Could you please help check this and provide some hints for debugging? Thanks!
Comment 1 Feng Tang 2019-03-08 03:50:35 UTC
According to our tests, when the error happens, the system (including systemd-journal) is busy writing logs into different log files under different directories (mostly under /var/log/).
Comment 2 Theodore Tso 2019-03-08 18:10:52 UTC
Can you try reproducing the problem when using something more robust than an emmc storage card?   The errors:

Multiply-claimed block(s) in inode 164223: 656535

alongside the xattr errors strongly suggest reads or writes being corrupted.

So the first thing I would ask is: (a) can you reproduce this on multiple emmc cards?   (b) can you use some other storage, say, a USB-attached SSD, as opposed to crap flash (or to be politically correct, "cost-optimized flash" :-)
Comment 3 Feng Tang 2019-03-09 07:43:41 UTC
Thanks for your prompt response.

We cannot change the eMMC card as it is fixed on the board. One other interesting thing is that we have 2 labs, each with 10 boards doing the stress test, and only the boards in one lab can reproduce the issue. So we suspect that some environment setup triggers the issue, rather than an eMMC hardware problem.
Comment 4 Feng Tang 2019-04-11 01:40:29 UTC
We have some more findings: if we disable the HW cache feature of the eMMC card fixed on our board, the issue can no longer be reproduced.

So it seems to be a HW issue; let's close it. And thanks for your help.
Comment 5 Hushup 2021-01-18 02:33:21 UTC
Hi, we have hit the same problem too. Can you share how to reproduce it? Thanks.