Bug 104361 - bogus 'no space left on device' with 'btrfs delete missing' raid5
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
Reported: 2015-09-09 22:45 UTC by Chris Murphy
Modified: 2016-03-20 09:58 UTC

See Also:
Kernel Version: 4.2.0-300.fc23.x86_64
Tree: Mainline
Regression: No


Attachments
dmesg (39.31 KB, text/plain)
2015-09-09 22:45 UTC, Chris Murphy
strace.txt (6.54 KB, text/plain)
2015-09-09 22:53 UTC, Chris Murphy
reproduce steps detail, shell export (3.96 KB, text/plain)
2015-09-10 01:37 UTC, Chris Murphy

Description Chris Murphy 2015-09-09 22:45:31 UTC
Created attachment 187201
dmesg

Kernel 4.2.0-300.fc23.x86_64
btrfs-progs btrfs-progs-4.2-1.fc23.x86_64

Summary: A raid5 volume of 4x 8GB virtual disks holds 3.62GiB of data, with no snapshots, subvolumes, or reflinks. After failing one of the devices and rebooting, 'btrfs dev delete missing' fails with a 'no space left on device' error, even though the 3 remaining disks have enough space for the shrunken filesystem.

Steps to reproduce:

1. mkfs.btrfs -draid5 -mraid5 /dev/sd[bcde]
2. mount -o ssd /dev/sdb /mnt
3. rsync -a /usr /mnt
4. btrfs balance start /mnt
5. umount /mnt
6. power off; remove /dev/sde; boot
7. mount -o ssd,degraded /dev/sdb /mnt
8. btrfs dev delete missing /mnt

Actual:

# btrfs dev delete missing /mnt
ERROR: error removing the device 'missing' - No space left on device

Expected:

3.62GiB should fit fine on a 3x 8GiB raid5 volume.
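
The expectation above can be checked with rough arithmetic. This is only a sketch: it assumes the usual raid5 rule of thumb that one device's worth of space goes to parity, so usable capacity is about (N - 1) x device size, and it ignores metadata overhead and chunk allocation granularity. The function name is made up for illustration.

```shell
# Rough raid5 capacity estimate in GiB: (devices - 1) * per-device size.
# Assumptions: equal-sized devices, one device's worth of parity,
# no allowance for metadata or chunk granularity.
raid5_usable_gib() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_usable_gib 4 8   # original 4-disk volume: prints 24
raid5_usable_gib 3 8   # after dropping one disk: prints 16
```

By this estimate the 3-disk volume would still offer roughly 16GiB for 3.62GiB of data.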

Additional info:

[   36.918587] BTRFS info (device sdc): use ssd allocation scheme
[   36.918592] BTRFS info (device sdc): allowing degraded mounts
[   36.918595] BTRFS info (device sdc): disk space caching is enabled
[   36.918597] BTRFS: has skinny extents
[   36.919528] BTRFS warning (device sdc): devid 4 uuid 2b189c80-ec54-4a58-a94a-f6c9b7a6b3d8 is missing
[   46.444624] BTRFS info (device sdc): relocating block group 16139681792 flags 132
[   56.249652] BTRFS info (device sdc): found 15097 extents
[   56.321922] BTRFS info (device sdc): relocating block group 9697230848 flags 129
[   91.437237] BTRFS info (device sdc): found 32352 extents
[   96.382425] BTRFS info (device sdc): found 32351 extents
[   96.613160] BTRFS info (device sdc): relocating block group 16743661568 flags 130
[   96.798134] BTRFS info (device sdc): found 1 extents
[   96.907638] BTRFS info (device sdc): relocating block group 16441671680 flags 132
[  104.739716] BTRFS info (device sdc): found 13424 extents
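
For readers unfamiliar with the 'flags' values in the relocation messages above, here is a small sketch that decodes them. The bit values are taken from the BTRFS_BLOCK_GROUP_* definitions in the kernel headers as I understand them; the helper name is hypothetical and not part of btrfs-progs.

```shell
# Decode a btrfs block group "flags" value as printed in dmesg.
# Bit values assumed from BTRFS_BLOCK_GROUP_* in the kernel headers.
# The function body runs in a subshell so its variables stay local.
decode_bg_flags() (
    flags=$1
    out=""
    [ $(( flags & 1 )) -ne 0 ]   && out="${out}DATA "
    [ $(( flags & 2 )) -ne 0 ]   && out="${out}SYSTEM "
    [ $(( flags & 4 )) -ne 0 ]   && out="${out}METADATA "
    [ $(( flags & 8 )) -ne 0 ]   && out="${out}RAID0 "
    [ $(( flags & 16 )) -ne 0 ]  && out="${out}RAID1 "
    [ $(( flags & 32 )) -ne 0 ]  && out="${out}DUP "
    [ $(( flags & 64 )) -ne 0 ]  && out="${out}RAID10 "
    [ $(( flags & 128 )) -ne 0 ] && out="${out}RAID5 "
    [ $(( flags & 256 )) -ne 0 ] && out="${out}RAID6 "
    # Trim the trailing space before printing.
    echo "${out% }"
)

# The three flag values seen in the log above.
for f in 129 130 132; do
    echo "$f: $(decode_bg_flags "$f")"
done
```

Under these assumptions, 129 is a raid5 data block group, 130 is raid5 system, and 132 is raid5 metadata, so the log shows block groups of all three types being relocated.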
Comment 1 Chris Murphy 2015-09-09 22:53:27 UTC
Created attachment 187211
strace.txt

Repeating 'btrfs dev delete missing' records no further kernel messages, even with the volume mounted with enospc_debug. Yet the command still prints this confusing message:
ERROR: error removing the device 'missing' - No space left on device

Attaching strace; maybe that'll shed some light on the failure.

strace btrfs device delete missing /mnt
Comment 2 Chris Murphy 2015-09-10 01:37:25 UTC
Created attachment 187231
reproduce steps detail, shell export

The most salient part comes right after the error message: this should obviously work. There's plenty of space on the remaining devices, so the error really makes no sense.


[root@localhost ~]# btrfs dev delete missing /mnt
ERROR: error removing the device 'missing' - No space left on device
[root@localhost ~]# btrfs fi show /mnt
Label: none  uuid: 3b2b5197-2a68-49a4-8f76-83a088303904
	Total devices 4 FS bytes used 3.54GiB
	devid    1 size 8.00GiB used 2.28GiB path /dev/sdb
	devid    2 size 8.00GiB used 2.28GiB path /dev/sdc
	devid    3 size 8.00GiB used 2.28GiB path /dev/sdd
	*** Some devices missing

btrfs-progs v4.2
[root@localhost ~]# btrfs fi df /mnt
Data, RAID5: total=5.00GiB, used=3.32GiB
System, RAID5: total=64.00MiB, used=16.00KiB
Metadata, RAID5: total=512.00MiB, used=228.73MiB
GlobalReserve, single: total=80.00MiB, used=0.00B
[root@localhost ~]#
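
To put a number on "plenty of space", the sketch below subtracts 'used' from 'size' for each devid line of 'btrfs fi show' output. It assumes every size is reported in GiB, as in the output above, and the function name is invented for this example. It only measures unallocated raw space per device; it does not explain why the kernel returns ENOSPC.

```shell
# Print unallocated space per device from `btrfs fi show` output on stdin.
# Assumption: all sizes are reported in GiB, as in this report.
unallocated_per_device() {
    awk '$1 == "devid" {
        size = $4; used = $6
        sub(/GiB$/, "", size); sub(/GiB$/, "", used)
        printf "%s %.2fGiB\n", $8, size - used
    }'
}

# Sample input copied from the output above.
unallocated_per_device <<'EOF'
Label: none  uuid: 3b2b5197-2a68-49a4-8f76-83a088303904
	Total devices 4 FS bytes used 3.54GiB
	devid    1 size 8.00GiB used 2.28GiB path /dev/sdb
	devid    2 size 8.00GiB used 2.28GiB path /dev/sdc
	devid    3 size 8.00GiB used 2.28GiB path /dev/sdd
	*** Some devices missing
EOF
```

On a live system the same function could be fed with 'btrfs fi show /mnt | unallocated_per_device'. For the sample above it reports 5.72GiB unallocated on each of the three remaining devices.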
