Bug 151921 - enospc after btrfs-convert(ed) ext4 volume
Summary: enospc after btrfs-convert(ed) ext4 volume
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-11 00:43 UTC by Chris Murphy
Modified: 2017-04-28 08:42 UTC

See Also:
Kernel Version: 4.8.0-0.rc1.git0.1.fc25.x86_64
Tree: Fedora
Regression: No


Attachments
dmesg (50.35 KB, text/x-log)
2016-08-11 00:43 UTC, Chris Murphy
enospc_debug log (45.11 KB, text/plain)
2016-08-13 14:48 UTC, Chris Murphy
enospc_debug log2 (7.26 KB, text/plain)
2016-08-13 15:04 UTC, Chris Murphy
enospc_debug log3 (3.37 KB, text/plain)
2016-08-13 15:59 UTC, Chris Murphy
enospc_debug log4 (5.19 KB, text/plain)
2016-08-13 16:20 UTC, Chris Murphy

Description Chris Murphy 2016-08-11 00:43:31 UTC
Created attachment 228281 [details]
dmesg

Kinda crazy, BUT...
1. Install Fedora-Workstation-Live-x86_64-25-20160810.n.0.iso default layout (ext4 on LVM) which has btrfs-progs 4.6.1 and kernel-4.8.0-0.rc1.git0.1.fc25.x86_64.
2. Reboot and do some minimal environment checking.
3. Reboot into the live image and run btrfs-convert.
4. Reboot enforcing=0 and restorecon -rv to relabel.
5. Reboot normally, remove the ext2 snapshot, defragment -r -v -t 32M, and then do a full balance.

And I get an enospc near the end. There is no call trace. The following information is taken right after the enospc.


[root@localhost ~]# btrfs fi show
Label: none  uuid: 0237a5df-e1b7-4b46-8f24-967bcf7e5f33
	Total devices 1 FS bytes used 5.54GiB
	devid    1 size 8.00GiB used 6.17GiB path /dev/mapper/fedora-root

[root@localhost ~]# btrfs fi df /
Data, single: total=5.38GiB, used=4.98GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=768.00MiB, used=580.80MiB
GlobalReserve, single: total=16.00MiB, used=0.00B

[root@localhost ~]# btrfs fi us /
Overall:
    Device size:		   8.00GiB
    Device allocated:		   6.17GiB
    Device unallocated:		   1.83GiB
    Device missing:		     0.00B
    Used:			   5.54GiB
    Free (estimated):		   2.24GiB	(min: 2.24GiB)
    Data ratio:			      1.00
    Metadata ratio:		      1.00
    Global reserve:		  16.00MiB	(used: 0.00B)

Data,single: Size:5.38GiB, Used:4.98GiB
   /dev/mapper/fedora-root	   5.38GiB

Metadata,single: Size:768.00MiB, Used:580.80MiB
   /dev/mapper/fedora-root	 768.00MiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/mapper/fedora-root	  32.00MiB

Unallocated:
   /dev/mapper/fedora-root	   1.83GiB
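
As a rough cross-check of the "Free (estimated)" figure, the sketch below adds the free space inside data chunks to the unallocated space. The formula is an assumption about how the estimate is derived for a single-device volume with a data ratio of 1.00, not a statement of what btrfs-progs actually computes, and the inputs are the already-rounded GiB values printed above.

```python
# Values copied from the `btrfs fi us /` output above, in GiB (rounded by the tool).
device_unallocated = 1.83
data_size = 5.38
data_used = 4.98
data_ratio = 1.00


def estimate_free_gib(unallocated, size, used, ratio):
    """Assumed model: free space in data chunks plus unallocated space, scaled by the data ratio."""
    return (size - used) / ratio + unallocated / ratio


estimated_free = estimate_free_gib(device_unallocated, data_size, data_used, data_ratio)
print(f"estimated free: {estimated_free:.2f} GiB (tool reported 2.24 GiB)")
```

The small difference from the reported 2.24GiB is plausibly rounding in the displayed inputs. Either way, roughly 1.8GiB of that estimate is unallocated space, which is what makes the enospc surprising.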


[root@localhost ~]# btrfs sub list -t /
ID	gen	top level	path	
--	---	---------	--
[root@localhost ~]#

[root@localhost ~]# btrfs inspect-internal tree-stats /dev/mapper/fedora-root 
Calculating size of root tree
	Total size: 16.00KiB
		Inline data: 0.00B
	Total seeks: 0
		Forward seeks: 0
		Backward seeks: 0
		Avg seek len: 0.00B
	Total clusters: 1
		Avg cluster size: 0.00B
		Min cluster size: 0.00B
		Max cluster size: 16.00KiB
	Total disk spread: 0.00B
	Total read time: 0 s 0 us
	Levels: 1
Calculating size of extent tree
	Total size: 3.09MiB
		Inline data: 0.00B
	Total seeks: 147
		Forward seeks: 74
		Backward seeks: 73
		Avg seek len: 57.97MiB
	Seek histogram
		    16384 -    507904:        21 ###
		   573440 -  14778368:        21 ###
		 15482880 -  17383424:        21 ###
		 17596416 -  54706176:        21 ###
		 55902208 -  84492288:        21 ###
		 84819968 - 170704896:        21 ###
		173490176 - 358187008:        15 ##
	Total clusters: 34
		Avg cluster size: 39.06KiB
		Min cluster size: 32.00KiB
		Max cluster size: 112.00KiB
	Total disk spread: 408.64MiB
	Total read time: 0 s 53 us
	Levels: 2
Calculating size of csum tree
	Total size: 5.16MiB
		Inline data: 0.00B
	Total seeks: 97
		Forward seeks: 68
		Backward seeks: 29
		Avg seek len: 75.01MiB
	Seek histogram
		    16384 -     16384:        15 ###
		    32768 -     32768:        14 ###
		    49152 -    114688:        14 ###
		   147456 -   1687552:        12 ###
		  2621440 -  17891328:        12 ###
		 28639232 -  63143936:        12 ###
		 77824000 - 657244160:        12 ###
		693698560 - 693698560:         1 |
	Total clusters: 23
		Avg cluster size: 176.00KiB
		Min cluster size: 32.00KiB
		Max cluster size: 944.00KiB
	Total disk spread: 693.27MiB
	Total read time: 0 s 31 us
	Levels: 2
Calculating size of fs tree
	Total size: 572.48MiB
		Inline data: 339.16MiB
	Total seeks: 19974
		Forward seeks: 11513
		Backward seeks: 8461
		Avg seek len: 89.00MiB
	Seek histogram
		    16384 -     16384:      3716 ###
		    32768 -     32768:      6457 ######
		    49152 -    196608:      3024 ###
		   229376 - 159023104:      2994 ###
		159154176 - 586792960:      2995 ###
		586858496 - 759300096:       754 |
	Total clusters: 3982
		Avg cluster size: 81.86KiB
		Min cluster size: 32.00KiB
		Max cluster size: 1.62MiB
	Total disk spread: 744.38MiB
	Total read time: 1 s 800101 us
	Levels: 3


[root@localhost ~]# ./btrfs-debugfs -b /
block group offset 134217728 len 858570752 used 858570752 chunk_objectid 256 flags 1 usage 1.00
block group offset 2281701376 len 858570752 used 858566656 chunk_objectid 256 flags 1 usage 1.00
block group offset 4429185024 len 858570752 used 858570752 chunk_objectid 256 flags 1 usage 1.00
block group offset 10364125184 len 872415232 used 872316928 chunk_objectid 256 flags 1 usage 1.00
block group offset 12041846784 len 872415232 used 872267776 chunk_objectid 256 flags 1 usage 1.00
block group offset 12914262016 len 603389952 used 226299904 chunk_objectid 256 flags 1 usage 0.38
total_free 377339904 min_used 226299904 free_of_min_used 377090048 block_group_of_min_used 12914262016
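
The summary line of the btrfs-debugfs output can be recomputed from the per-block-group lines. The following is a minimal sketch, assuming the line format shown above; the regular expression and the helper name parse_block_groups are illustrative and not part of btrfs-debugfs.

```python
import re

# The six block group lines copied from the btrfs-debugfs output above.
DEBUGFS_OUTPUT = """\
block group offset 134217728 len 858570752 used 858570752 chunk_objectid 256 flags 1 usage 1.00
block group offset 2281701376 len 858570752 used 858566656 chunk_objectid 256 flags 1 usage 1.00
block group offset 4429185024 len 858570752 used 858570752 chunk_objectid 256 flags 1 usage 1.00
block group offset 10364125184 len 872415232 used 872316928 chunk_objectid 256 flags 1 usage 1.00
block group offset 12041846784 len 872415232 used 872267776 chunk_objectid 256 flags 1 usage 1.00
block group offset 12914262016 len 603389952 used 226299904 chunk_objectid 256 flags 1 usage 0.38
"""

# Hypothetical pattern matching only the fields needed here.
LINE_RE = re.compile(r"block group offset (\d+) len (\d+) used (\d+)")


def parse_block_groups(text):
    """Return one dict per block group with its offset, length, used and free bytes."""
    groups = []
    for match in LINE_RE.finditer(text):
        offset, length, used = map(int, match.groups())
        groups.append({"offset": offset, "len": length, "used": used, "free": length - used})
    return groups


block_groups = parse_block_groups(DEBUGFS_OUTPUT)
total_free = sum(group["free"] for group in block_groups)
emptiest = min(block_groups, key=lambda group: group["used"])

print("total_free", total_free)
print("block_group_of_min_used", emptiest["offset"], "free", emptiest["free"])
```

Five of the six listed block groups are essentially full, so nearly all of the free space inside them sits in the last one.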
Comment 1 Chris Murphy 2016-08-11 01:06:21 UTC
Oh yeah, this happens in a Fedora 24 qemu-kvm, using a 10.0GiB qcow2 file for backing. I did not hit the limit of the qcow2 itself:
-rw-r--r--. 1 qemu qemu 7.7G Aug 10 18:44 bios-f25wlive-1.qcow2

Nor even the LVM logical volume at 8.0GiB.
# df -h
/dev/mapper/fedora/root  8.0G  5.7G  2.2G  72%  /mnt

btrfs check comes up clean
[root@localhost ~]# btrfs check /dev/mapper/fedora-root 
Checking filesystem on /dev/mapper/fedora-root
UUID: 0237a5df-e1b7-4b46-8f24-967bcf7e5f33
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 4740210688 bytes used err is 0
total csum bytes: 5259592
total tree bytes: 611909632
total fs tree bytes: 602750976
total extent tree bytes: 3588096
btree space waste bytes: 161480586
file data blocks allocated: 2503608299520
 referenced 3871703040


The one fishy thing is:
[    4.460099] BTRFS info (device dm-0): bdev /dev/mapper/fedora-root errs: wr 1, rd 0, flush 0, corrupt 0, gen

I don't know what that write error is; I don't see anything in the journal that explains it. And then,

[root@localhost ~]# btrfs scrub start /mnt
scrub started on /mnt, fsid 0237a5df-e1b7-4b46-8f24-967bcf7e5f33 (pid=1661)
[root@localhost ~]# btrfs scrub stat /mnt
scrub status for 0237a5df-e1b7-4b46-8f24-967bcf7e5f33
	scrub started at Wed Aug 10 20:54:39 2016 and finished after 00:00:10
	total bytes scrubbed: 6.40GiB with 0 errors

Link to btrfs-debug-tree tar.gz 18MB
https://drive.google.com/open?id=0B_2Asp8DGjJ9ekJDeGxoTE5sbFk

Link to btrfs-image 23MB
https://drive.google.com/open?id=0B_2Asp8DGjJ9RTQzdV9hOTBOV1E
Comment 2 Chris Murphy 2016-08-13 14:48:03 UTC
Created attachment 228541 [details]
enospc_debug log

Most of this is duplicate information, but shows the sequence from mount, this time with enospc_debug.
Comment 3 Chris Murphy 2016-08-13 15:04:27 UTC
Created attachment 228551 [details]
enospc_debug log2

This is unexpected. Every balance with the -dusage= filter relocates a different chunk. Even running -dusage=100 multiple times relocates a chunk with a different address each time (probably the same chunk that was previously relocated, but now with a new address).

However -dprofiles=single always fails with the same six block groups.

-mprofiles=single succeeds.
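
The filtered balance runs described above can be sketched as command lines. This is only a dry-run helper: the mount point /mnt and the function name balance_command are assumptions for illustration, and nothing is executed.

```python
# Assumed mount point; adjust to wherever the volume is mounted.
MOUNT_POINT = "/mnt"

# The three balance filters mentioned in this comment.
FILTERS = ["-dusage=100", "-dprofiles=single", "-mprofiles=single"]


def balance_command(filter_arg, mount_point=MOUNT_POINT):
    """Build (but do not run) a `btrfs balance start` command with one filter."""
    return ["btrfs", "balance", "start", filter_arg, mount_point]


commands = [balance_command(filter_arg) for filter_arg in FILTERS]
for command in commands:
    print(" ".join(command))
```

Running these as root, for example via subprocess.run, and checking dmesg after each one would show which filter triggers the enospc.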
Comment 4 Chris Murphy 2016-08-13 15:59:16 UTC
Created attachment 228561 [details]
enospc_debug log3

In this sequence, I used btrfs send -f to get the data off the volume, deleted everything on it, then restored with btrfs receive -f. The file system was not recreated from scratch.

1. The allocation differs by over 1GiB (data and metadata chunks combined) between the ext4-converted volume and the btrfs send/receive restore, so there's some inefficiency with the conversion.
2. Full balance now works, but is it because there's so much more free space?
3. It appears so. Once files are added to the volume to bring its free space down to about the same as the ext4-converted volume, full balance consistently fails with enospc despite 2GiB of unallocated space remaining.
Comment 5 Chris Murphy 2016-08-13 16:20:19 UTC
Created attachment 228571 [details]
enospc_debug log4

Brand new 8GiB Btrfs file system, used cp -a from previous volume to new volume; full balance succeeds, a 2nd full balance fails with enospc. 1.84GiB unallocated.

So I'd say this problem is not directly related to ext4 conversion; it's just that the inefficiency of the conversion fills up the volume a bit more, which increases the chance of enospc.
