Bug 219167

Summary: Btrfs allocation diverges from the df or statfs output
Product: File System
Reporter: ader1990 (avladu)
Component: btrfs
Assignee: BTRFS virtual assignee (fs_btrfs)
Status: NEW ---    
Severity: low    
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: 6.6, 6.1
Subsystem:
Regression: No
Bisected commit-id:

Description ader1990 2024-08-16 07:57:19 UTC

    
Comment 1 ader1990 2024-08-16 08:24:02 UTC
Hello,

We use a read-only, compressed btrfs filesystem for the /usr partition in the Flatcar Container operating system. In the last few months we have observed a divergence between the `btrfs fi usage` and `df` outputs, but only while the Flatcar instance is running. Right after the filesystem is created, the free space reported by `btrfs fi usage` and by `df` agrees. Once the Flatcar instance is running, the two diverge, and `df` reports roughly the `Device unallocated` value from `btrfs fi usage` as its free space.
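
For anyone who wants to check the `df` side independently of the `df` binary, here is a minimal sketch that reads the same statfs counters through Python's `os.statvfs`. It assumes that `df` derives its "Avail" column from `f_bavail * f_frsize`; the helper name `statfs_free_mib` is made up for this illustration.

```python
import os


def statfs_free_mib(path: str) -> float:
    """Free space available to unprivileged users, in MiB, as seen via statvfs."""
    st = os.statvfs(path)
    # f_bavail counts free blocks available to non-root users;
    # f_frsize is the size of one such block in bytes.
    return st.f_bavail * st.f_frsize / (1024 * 1024)


if __name__ == "__main__":
    # "/usr" is the partition discussed in this report; any mounted path works.
    print(f"{statfs_free_mib('/usr'):.2f} MiB")
```

Comparing this number with the `Free (statfs, df)` line of `btrfs fi usage` on a running instance should show whether the two agree.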

This issue is not a critical one, but I would like to know at least why it happens and in which conditions.

The issue occurs on both AMD64 and ARM64, but it cannot be reproduced every time, and I have no idea why.
The Linux kernel used is vanilla 6.1.y and 6.6.y.
The btrfs-progs versions used were 5.x and 6.x; all of them showed this issue.

I have found a way to fix the allocation: shrinking the filesystem by more than btrfs allows. Curiously, the resize attempt fails, but the allocation gets reset in the process and the issue goes away. This is, of course, a hack. I would like to understand why this happens and how to fix it properly, possibly upstream.

This is the btrfs usage for the /usr partition when the Flatcar image is created offline by our tooling (that is, before the Flatcar instance has ever been started). Note that `Device unallocated: 331.99MiB` looks odd, but `Free (estimated)` and `Free (statfs, df)` roughly agree:

Overall:
    Device size:                1015.99MiB
    Device allocated:            684.00MiB
    Device unallocated:          331.99MiB
    Device missing:                  0.00B
    Device slack:                  8.01MiB
    Used:                        461.86MiB
    Free (estimated):            547.64MiB      (min: 547.64MiB)
    Free (statfs, df):           549.08MiB
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:                2.49MiB      (used: 0.00B)
    Multiple profiles:                  no

Now the fun part starts. Under these conditions, once the Flatcar instance is started, this is what I see from inside the running instance. Note that `Free (statfs, df)` has almost converged to the `Device unallocated` value, while `Free (estimated)` and `Free (statfs, df)` have diverged:

Overall:
    Device size:		1015.99MiB
    Device allocated:		 684.00MiB
    Device unallocated:		 331.99MiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			 462.88MiB
    Free (estimated):		 546.61MiB	(min: 546.61MiB)
    Free (statfs, df):		 330.94MiB
    Data ratio:			      1.00
    Metadata ratio:		      1.00
    Global reserve:		   2.51MiB	(used: 0.00B)
    Multiple profiles:		        no
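
To put numbers on the divergence, here is a minimal sketch that parses the MiB values from the two `Overall:` blocks above and computes the relevant gaps. The parsing helper `parse_mib` and the `gaps` function are written only for this illustration and are not part of btrfs-progs; the input values are copied from the outputs in this report.

```python
import re

OFFLINE = """
    Device size:                1015.99MiB
    Device allocated:            684.00MiB
    Device unallocated:          331.99MiB
    Free (estimated):            547.64MiB      (min: 547.64MiB)
    Free (statfs, df):           549.08MiB
"""

RUNNING = """
    Device size:                1015.99MiB
    Device allocated:            684.00MiB
    Device unallocated:          331.99MiB
    Free (estimated):            546.61MiB      (min: 546.61MiB)
    Free (statfs, df):           330.94MiB
"""


def parse_mib(report: str) -> dict:
    """Map each 'Label:  <number>MiB' line to its value in MiB."""
    values = {}
    for line in report.splitlines():
        m = re.match(r"\s*(.+?):\s+([\d.]+)MiB", line)
        if m:
            values[m.group(1)] = float(m.group(2))
    return values


def gaps(u: dict) -> tuple:
    """Return (estimated - statfs, statfs - unallocated), rounded to 0.01 MiB."""
    est_vs_statfs = u["Free (estimated)"] - u["Free (statfs, df)"]
    statfs_vs_unalloc = u["Free (statfs, df)"] - u["Device unallocated"]
    return round(est_vs_statfs, 2), round(statfs_vs_unalloc, 2)


offline = parse_mib(OFFLINE)
running = parse_mib(RUNNING)

# In both states, allocated + unallocated adds up to the device size.
assert round(offline["Device allocated"] + offline["Device unallocated"], 2) == 1015.99

print("offline:", gaps(offline))  # (-1.44, 217.09)
print("running:", gaps(running))  # (215.67, -1.05)
```

Offline, the two free-space figures are within about 1.5 MiB of each other. On the running instance, `Free (statfs, df)` drops by roughly 218 MiB and lands about 1 MiB below `Device unallocated`, while `Device allocated` and `Device unallocated` themselves do not change.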

The `df` diff becomes important:

14:15:07   File    Size  Used Avail Use% Type
14:15:07  -/usr   1016M  465M  443M  52% btrfs -> good
14:15:07  +/usr   1016M  465M  331M  59% btrfs -> bad
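
The change in `Use%` follows directly from the change in `Avail`. A minimal sketch, assuming `df` computes `Use%` as used / (used + avail) rounded up to the next whole percent (which is how GNU coreutils `df` behaves, to my knowledge); the function name `use_percent` is made up for this illustration:

```python
import math


def use_percent(used_mib: int, avail_mib: int) -> int:
    """Use% under the assumption df rounds used / (used + avail) upwards."""
    return math.ceil(100 * used_mib / (used_mib + avail_mib))


print(use_percent(465, 443))  # good case from the diff above: 52
print(use_percent(465, 331))  # bad case from the diff above: 59
```

So `Used` stays at 465M in both lines, and the jump from 52% to 59% comes entirely from `Avail` shrinking from 443M to 331M.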

The only way I found to fix this was to resize the btrfs filesystem by an amount larger than it allows, after which the allocated and estimated values converged properly. Note that the resize itself fails (exit code 1), but the allocations get reset in the process.

See: https://github.com/flatcar/scripts/pull/2076/files#diff-85bc90c2683efc07d576bbecc3c194fe5b7c5b22bbb3fd050834c8f88ea3afdeR786

I tried to reproduce the issue with this script here: https://github.com/flatcar/Flatcar/issues/1473#issuecomment-2293057389

One final note: there are two issues here, but they seem related. The first is the odd `Device allocated` and `Device unallocated` values during offline Flatcar image creation. The second is that, while Flatcar is running, `Device unallocated` becomes the `df` free space.