Bug 199707 - kernel oops with "BTRFS: decompress failed" in 2 NoCOW files
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 blocking
Assignee: BTRFS virtual assignee
Depends on:
Reported: 2018-05-13 11:42 UTC by jamespharvey20
Modified: 2018-05-14 01:08 UTC

See Also:
Kernel Version: 4.16.8
Tree: Mainline
Regression: No

Crash with BTRFS decompress failed (11.79 KB, text/plain)
2018-05-13 11:43 UTC, jamespharvey20
Crash with general protection fault (66.52 KB, text/plain)
2018-05-13 11:43 UTC, jamespharvey20
filefrag full output (5.41 KB, text/plain)
2018-05-13 11:44 UTC, jamespharvey20
Crash with BTRFS decompress failed (15.51 KB, text/plain)
2018-05-14 01:08 UTC, jamespharvey20

Description jamespharvey20 2018-05-13 11:42:40 UTC
I booted my oldest ISO (April 1, 2016, kernel 4.4.5), and the bug exists there too.  I can download older ISOs and test further back if requested, but this likely means it is not a regression and has always been there.

I started a thread on linux-btrfs about this, titled: '"decompress failed" in 1-2 files always causes kernel oops, check/scrub pass'

I have a 3-device btrfs RAID1, always mounted with "compress=lzo".  There are 2 old /var/log/journal files that crash the system when read.  I ran across them by trying "journalctl --list-boots".  If I mount "ro,degraded" with disks 1&2 or 1&3 and read the file, it crashes.  With disks 2&3, it reads fine.

Often the crash has "BTRFS: decompress failed" as the second line, but other times I see "general protection fault" with no obvious indication of BTRFS or decompression.

I originally thought this meant the btrfs checksum was valid (scrub has never found anything and still finds nothing) but the compressed data was nevertheless invalid.  I've cat'ed every file on the disk, and only 2 journald files cause a system crash.

Martin Steigerwald pointed out that systemd marks these files NoCOW, so there are no checksums, snapshots, etc.  He's right.  All files in the directory have the NoCOW attribute.  So, of course, scrub isn't even looking at these.

Chris Murphy said I could use btrfs-map-logical to verify whether the mirrored copies mismatch, with corruption making the data impossible to decompress.  He also pointed out that NoCOW normally means no compression, but he remembered a discussion in the archives saying that compression can be forced on NoCOW files under certain circumstances (if the file is fragmented and either the volume is mounted with compression or the file has inherited chattr +c, or something like that; he didn't remember exactly).

I confirmed disk 1 has corrupted data in these files, which makes sense since degraded with disks 2&3 reads fine.
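
The reasoning from the degraded-mount results can be written out as a small set calculation. This is only an illustrative sketch in Python; the variable names are made up for the example, and it assumes a single bad copy explains all three observations.

```python
# Disk pairs used for each "ro,degraded" mount, grouped by outcome.
crashing = [{1, 2}, {1, 3}]  # reading the file crashes the system
clean = [{2, 3}]             # reading the file works

# A disk holding the bad copy must be in every crashing mount
# and in none of the clean ones.
suspects = set.intersection(*crashing) - set.union(*clean)
print(suspects)  # {1}
```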

Catting a non-essential file should not be able to cause the system to crash.

This is 100% reproducible on these 2 files.  Hardware is fine.  Passes memtest86+ in SMP mode.
Comment 1 jamespharvey20 2018-05-13 11:43:01 UTC
Created attachment 275949
Crash with BTRFS decompress failed
Comment 2 jamespharvey20 2018-05-13 11:43:17 UTC
Created attachment 275951
Crash with general protection fault
Comment 3 jamespharvey20 2018-05-13 11:44:11 UTC
Additional details that seem most pertinent.

There are 2 files that trigger a crash:
-rw-r-----+ 1 root 190 16777216 Oct  1  2016 system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
-rw-r-----+ 1 root 190  8388608 Oct  1  2016 user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal

lsattr shows:
---------------C-- system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
---------------C-- user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal

(Focusing here on 1 of these files.)

filefrag -v user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal
(Full output is attached)
... 59 extents found

For EACH of the 59 extents:
* btrfs-map-logical -l [FILEFRAG'S STARTING PHYSICAL OFFSET NUMBER * 4096 FOR BLOCKSIZE] -b 4096 -o frag[FRAG NUM].1 -c 1 /dev/lvm/newMain1
* btrfs-map-logical -l [FILEFRAG'S STARTING PHYSICAL OFFSET NUMBER * 4096 FOR BLOCKSIZE] -b 4096 -o frag[FRAG NUM].2 -c 2 /dev/lvm/newMain1
* diff --brief frag[FRAG NUM].1 frag[FRAG NUM].2
(Except on the last extent, "-b 847872")
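
The per-extent procedure above can be sketched as a small Python helper. This is an illustration only, not the script actually used: the function names `map_logical_commands` and `copies_match` are hypothetical, and the 4096-byte block size and the -l, -b, -o and -c options are taken from the commands above. The sketch only builds the two btrfs-map-logical command lines for one extent and compares the resulting dumps; it does not run btrfs-map-logical itself.

```python
import filecmp

BLOCK_SIZE = 4096  # filefrag reports offsets in 4096-byte blocks here


def map_logical_commands(start_block, frag_num, device, nbytes=BLOCK_SIZE):
    """Build the btrfs-map-logical argv for copy 1 and copy 2 of one extent.

    start_block is filefrag's starting "physical" offset for the extent,
    which the steps above multiply by the block size and pass to -l.
    """
    address = start_block * BLOCK_SIZE  # block number -> byte address
    return [
        ["btrfs-map-logical", "-l", str(address), "-b", str(nbytes),
         "-o", f"frag{frag_num}.{copy}", "-c", str(copy), device]
        for copy in (1, 2)
    ]


def copies_match(frag_num):
    """Byte-for-byte comparison of the two dumps, like `diff --brief`."""
    return filecmp.cmp(f"frag{frag_num}.1", f"frag{frag_num}.2", shallow=False)
```

For the last extent, `nbytes=847872` would be passed instead of the default. Each returned argv list could be handed to `subprocess.run` as root, after which `copies_match(frag_num)` reports whether the two mirrored copies agree.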

If all of these matched, there could still be compression corruption, but it would have had to have happened before being written to disk.

If some or all of these didn't match, one of the disk copies got corrupted.

It turns out fragments [0-27], [29-39], and [56-58] match.

But fragments 28 and [40-55] are completely different.
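
As a sanity check on these ranges, the following Python sketch counts the extents in each group. It assumes the extents are numbered 0 through 58 (filefrag reported 59 extents), so the last matching range is taken to end at 58; the helper name `span` is made up for the example.

```python
def span(first, last):
    """Number of extents in the inclusive index range [first, last]."""
    return last - first + 1


matching = span(0, 27) + span(29, 39) + span(56, 58)
differing = span(28, 28) + span(40, 55)

print(matching, differing, matching + differing)  # 42 17 59
```

Under that assumption, 17 of the 59 extents differ between the two copies, at least within the bytes that were dumped.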

Notably, btrfs-map-logical isn't crashing, because it's giving data in its compressed form, so isn't tripping up on invalid compressed data.

Regarding reading 4096 bytes for each fragment: journald files start with ASCII "LPKSHHRH".  I dumped the first fragment, and found it had an extra 9 byte header before LPKSHHRH, of "3a0c 0000 6b02 0000 0a".  I'm assuming that's a btrfs-lzo header.  After LPKSHHRH is about 2k of binary data, and zeros.  If I split out the first 128k of the uncompressed valid file and run lzop on it, it winds up at about 2k, so this compression ratio is realistic.

filefrag doesn't seem to be aware of compression, and shows the ending offsets and lengths based on uncompressed size.  Not knowing each fragment's actual on-disk size, I decided to just grab the first 4k of each, expecting them all to be compressed within that space, except for the last one, which was much larger, so I took its whole size.  This means there could be a 128k fragment taking more than 4k of disk space, so there could be more differences between the mirrored copies than I've discovered.
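
One possible reading of those 9 bytes, sketched in Python. This rests on assumptions not verified in this report: that the bytes are quoted in on-disk order, and that btrfs prefixes LZO-compressed data with a 4-byte little-endian total length followed by a 4-byte little-endian length for the first compressed segment. Under that reading the ninth byte would be the first byte of the LZO stream rather than part of the header.

```python
import struct

# The nine bytes quoted above, assumed to be in on-disk order.
header = bytes.fromhex("3a0c00006b0200000a")

# Assumed layout: two little-endian 32-bit lengths, then the LZO stream.
total_len, first_segment_len = struct.unpack_from("<II", header, 0)
first_stream_byte = header[8]

print(total_len)               # 3130
print(first_segment_len)       # 619
print(hex(first_stream_byte))  # 0xa
```

If that interpretation holds, the compressed data for this fragment totals 3130 bytes, which would fit within the 4096 bytes dumped for it.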
Comment 4 jamespharvey20 2018-05-13 11:44:35 UTC
Created attachment 275953
filefrag full output
Comment 5 jamespharvey20 2018-05-13 12:03:46 UTC
Probably other related cases:

User also sometimes had "Fixing recursive fault but reboot is needed!" style crashes (which also appears in my general protection fault crashes), and sometimes "BTRFS: decompress failed".  No resolution.

User also has "BUG: unable to handle kernel paging request" style crashes (which is the first line in my BTRFS decompress failed crashes.)  No resolution.
Comment 6 jamespharvey20 2018-05-14 01:08:20 UTC
Created attachment 275961
Crash with BTRFS decompress failed

Adding this because it's different from the other "BTRFS: decompress failed" crash I attached.  This one has "BTRFS: decompress failed" as the first line, followed by "BUG: unable to handle kernel NULL pointer dereference at 0000000000000001".

(While testing all of my no-checksum files for inconsistencies, mounting degraded to get access to mirrored copies that weren't being read with all disks present, I ran across another corrupt journald file on disk 1.)
