Bug 121071 - Btrfs full balance command fails due to ENOSPC
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-27 16:50 UTC by Francesco Turco
Modified: 2016-06-28 13:50 UTC
CC List: 1 user

See Also:
Kernel Version: 4.6.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg log file (162.34 KB, text/x-log)
2016-06-27 17:09 UTC, Francesco Turco
btrfs-debugfs output log (1.20 KB, text/plain)
2016-06-27 18:27 UTC, Francesco Turco
Possibly different cause, same issue with different machine (88.64 KB, text/plain)
2016-06-28 09:32 UTC, sjon
output log for show_usage command (2.68 KB, text/plain)
2016-06-28 13:43 UTC, Francesco Turco
output log for show_dev_extents.py command (1.15 KB, text/plain)
2016-06-28 13:50 UTC, Francesco Turco

Description Francesco Turco 2016-06-27 16:50:36 UTC
# btrfs filesystem show /
Label: none  uuid: 27150b83-7d90-4031-8e83-581315b9a254
	Total devices 1 FS bytes used 10.79GiB
	devid    1 size 25.00GiB used 13.31GiB path /dev/mapper/Desktop-root
# btrfs filesystem df /
Data, single: total=11.00GiB, used=10.40GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.12GiB, used=392.08MiB
GlobalReserve, single: total=144.00MiB, used=0.00B
# btrfs balance start --full-balance /
ERROR: error during balancing '/': No space left on device
There may be more info in syslog - try dmesg | tail
# dmesg | tail
[29807.441930] BTRFS info (device dm-2): found 13206 extents
[29807.879845] BTRFS info (device dm-2): relocating block group 47542435840 flags 1
[29827.116083] BTRFS info (device dm-2): found 12909 extents
[29830.500110] BTRFS info (device dm-2): found 12909 extents
[29830.976485] BTRFS info (device dm-2): relocating block group 46468694016 flags 1
[29848.924188] BTRFS info (device dm-2): found 5129 extents
[29851.533076] BTRFS info (device dm-2): found 5129 extents
[29851.994787] BTRFS info (device dm-2): relocating block group 46435139584 flags 34
[29852.399460] BTRFS info (device dm-2): found 1 extents
[29852.657983] BTRFS info (device dm-2): 1 enospc errors during balance
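The "flags" value in the "relocating block group" messages above is a bitmask. Below is a minimal sketch for decoding it, assuming the bit values of the kernel's BTRFS_BLOCK_GROUP_* constants; the helper name decode_block_group_flags is made up for illustration. Under that assumption, flags 1 is a Data block group and flags 34 is a System block group with the DUP profile, which would suggest that the ENOSPC error was reported while relocating the System chunk.

```python
# Block group flag bits, assumed to match the kernel's
# BTRFS_BLOCK_GROUP_* definitions.
BLOCK_GROUP_FLAGS = {
    1 << 0: "DATA",
    1 << 1: "SYSTEM",
    1 << 2: "METADATA",
    1 << 3: "RAID0",
    1 << 4: "RAID1",
    1 << 5: "DUP",
    1 << 6: "RAID10",
    1 << 7: "RAID5",
    1 << 8: "RAID6",
}


def decode_block_group_flags(flags):
    """Return the names of all known bits set in a block group flags value.

    Hypothetical helper: a value with no profile bit set (such as 1)
    corresponds to the "single" profile.
    """
    return [name for bit, name in BLOCK_GROUP_FLAGS.items() if flags & bit]


# The two flag values that appear in the dmesg excerpt.
for value in (1, 34):
    print(value, decode_block_group_flags(value))
```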

-------

I'm no expert, but as far as I can tell only about half of my root partition is in use (13.31GiB allocated out of 25.00GiB), so the full balance command should complete successfully instead of failing with an error.
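The space figures in the report can be cross-checked with a little arithmetic. The sketch below takes the "total" values from the btrfs filesystem df output and assumes that DUP chunks occupy twice their logical size on the device and that GlobalReserve is accounted within Metadata rather than allocated separately. The variable names are illustrative only.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# (type, copies on disk, logical "total" in bytes), taken from the
# "btrfs filesystem df /" output in the description.
# Assumption: single = 1 copy, DUP = 2 copies on the same device.
chunks = [
    ("Data", 1, 11.00 * GIB),
    ("System", 2, 32.00 * MIB),
    ("Metadata", 2, 1.12 * GIB),
]
device_size = 25.00 * GIB  # from "btrfs filesystem show /"

# Raw space allocated to chunks, and what is left for new chunks.
allocated = sum(copies * total for _, copies, total in chunks)
unallocated = device_size - allocated

print(f"allocated:   {allocated / GIB:.2f} GiB")
print(f"unallocated: {unallocated / GIB:.2f} GiB")
```

The allocated figure agrees with the reported "used 13.31GiB" to within rounding of the df values, which leaves more than 11 GiB of the device unallocated.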

-------

My distribution is Parabola GNU/Linux-libre.
Btrfs-progs version is 4.5.3.
Kernel version is 4.6.2.
Comment 1 Francesco Turco 2016-06-27 17:09:15 UTC
Created attachment 221301
dmesg log file

I remounted the root partition with the enospc_debug option and then ran the full balance command again.
Comment 2 Francesco Turco 2016-06-27 18:24:41 UTC
I tried running the full balance command from a live USB system, but it failed with the same error message.

Packages on the live USB system are somewhat older:
- kernel: 4.5.4
- btrfs-progs: 4.5.3
Comment 3 Francesco Turco 2016-06-27 18:27:37 UTC
Created attachment 221321
btrfs-debugfs output log

I downloaded btrfs-debugfs from https://github.com/kdave/btrfs-progs/blob/master/btrfs-debugfs and then ran:

# ./btrfs-debugfs -b /
Comment 4 sjon 2016-06-28 09:32:52 UTC
Created attachment 221341
Possibly different cause, same issue with different machine

I'm pretty sure I have the same issue on two machines running 4.6.2. They have even more free space, yet both refuse any write to the filesystem, complaining about "No space left on device". Clearing the free space cache (with a remount) didn't help. A reboot 'fixed' one of them; I left the other one running.

Both machines have only ~50% of their space used; one is RAID10, the other RAID1:

Overall:
    Device size:		   3.64TiB
    Device allocated:		   2.19TiB
    Device unallocated:		   1.45TiB
    Device missing:		     0.00B
    Used:			   1.87TiB
    Free (estimated):		 771.96GiB	(min: 771.96GiB)
    Data ratio:			      2.00
    Metadata ratio:		      2.00
    Global reserve:		 512.00MiB	(used: 0.00B)

Data,RAID1: Size:950.00GiB, Used:920.03GiB
   /dev/sdc	 950.00GiB
   /dev/sdd	 950.00GiB

Metadata,RAID1: Size:171.00GiB, Used:37.54GiB
   /dev/sdc	 171.00GiB
   /dev/sdd	 171.00GiB

System,RAID1: Size:32.00MiB, Used:208.00KiB
   /dev/sdc	  32.00MiB
   /dev/sdd	  32.00MiB

Unallocated:
   /dev/sdc	 741.99GiB
   /dev/sdd	 741.99GiB

There are also quite a number of segfaults; see the attached dmesg.
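The "Free (estimated)" figure in the usage output above is consistent with the arithmetic sketched below: free space inside the existing Data chunks plus the unallocated raw space divided by the data ratio. This is an assumption about how the estimate is derived, inferred from the numbers in this comment rather than taken from the btrfs-progs source, and the variable names are illustrative.

```python
# Values in GiB, copied from the usage output in this comment.
data_size = 950.00
data_used = 920.03
unallocated_per_device = [741.99, 741.99]  # /dev/sdc, /dev/sdd
data_ratio = 2.00  # RAID1 stores every data byte twice

# Space still free inside Data chunks that are already allocated.
free_in_data_chunks = data_size - data_used

# Raw unallocated space, scaled down by the number of copies RAID1 writes.
free_from_unallocated = sum(unallocated_per_device) / data_ratio

free_estimated = free_in_data_chunks + free_from_unallocated
print(f"Free (estimated): {free_estimated:.2f} GiB")
```

Under these assumptions the result matches the reported 771.96GiB, so the space accounting itself looks consistent even though writes fail with ENOSPC.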
Comment 5 Francesco Turco 2016-06-28 13:43:12 UTC
Created attachment 221401
output log for show_usage command

I downloaded the show_usage.py script from knorrie's GitHub repository (https://github.com/knorrie/python-btrfs/blob/master/examples/show_usage.py) along with the btrfs Python module from the same repository.

Then I ran the script as root as follows:
./show_usage.py /
Comment 6 Francesco Turco 2016-06-28 13:50:19 UTC
Created attachment 221411
output log for show_dev_extents.py command

This is the output log for the following command:

# ./show_dev_extents.py /
