Bug 92641 - too many missing devices, writeable mount is not allowed
Summary: too many missing devices, writeable mount is not allowed
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
Reported: 2015-02-04 06:33 UTC by Chris Murphy
Modified: 2022-10-04 08:22 UTC

Kernel Version: 3.19.0-0.rc7.git0.1.fc22.x86_64
Regression: No


Attachments
dmesg1 (70.01 KB, text/plain)
2015-02-04 06:35 UTC, Chris Murphy
dmesg2 (72.44 KB, text/plain)
2015-02-04 06:36 UTC, Chris Murphy
dmesg4.txt (80.12 KB, text/plain)
2015-02-04 20:38 UTC, Chris Murphy

Description Chris Murphy 2015-02-04 06:33:39 UTC
The problem occurs with kernel 3.19.0-0.rc7.git0.1.fc22.x86_64. No regression testing or attempt to reproduce has been done yet, but the file system isn't particularly old.

Steps 1-5 were done with kernels 3.16 through 3.19, with no errors.

1. mkfs.btrfs -draid1 -mraid1 /dev/sd[bc]  ## btrfs-progs ~3.16 or 3.17
2. mount /dev/sdb /mnt/btr  
3. copy some files to /mnt/btr
4. unmount /mnt/btr
5. Disconnect /dev/sdc

Steps 6-10 were done only with kernel 3.19.

6. mount -odegraded /dev/sdb /mnt/btr
7. btrfs balance start -dconvert=single -mconvert=single -f /mnt/btr
8. In another shell, btrfs balance pause /mnt/btr
9. Wait for pause confirmation in 1st shell, then umount /mnt/btr
10. mount -odegraded /dev/sdb /mnt/btr

-mconvert=dup was disallowed, so I chose single:

[ 2029.715092] BTRFS error (device sdc): unable to start balance with target metadata profile 32
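
For reference, the 32 in that message is a block group flags value. A minimal sketch of decoding it, assuming the BTRFS_BLOCK_GROUP_* bit values from the kernel's btrfs headers of that era (RAID0=8, RAID1=16, DUP=32, RAID10=64, RAID5=128, RAID6=256). decode_profile is a hypothetical helper name, not part of btrfs-progs:

```shell
# Hypothetical helper: decode the profile bits of a btrfs block group
# flags value. Bit values are assumed from the kernel's
# BTRFS_BLOCK_GROUP_* constants (3.19 era); later profiles are not covered.
decode_profile() (
    flags=$1
    out=""
    for pair in 8:RAID0 16:RAID1 32:DUP 64:RAID10 128:RAID5 256:RAID6; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( $flags & $bit )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # No profile bit set means the "single" profile.
    if [ -z "$out" ]; then
        out=" single"
    fi
    printf '%s\n' "${out# }"
)

decode_profile 32
```

With these values, 32 decodes to DUP, which suggests the rejected request was a metadata conversion to dup.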


Result when mounting:

[39691.150313] BTRFS info (device sdb): allowing degraded mounts
[39691.152501] BTRFS info (device sdb): disk space caching is enabled
[39693.756987] BTRFS: too many missing devices, writeable mount is not allowed
[39693.778349] BTRFS: open_ctree failed

I have no reason to think this is a regression, but haven't tried older kernels yet.

Additional information:


[ 5719.840900] BTRFS info (device sdc): found 16 extents
[ 6097.761142] usb 1-1.4: USB disconnect, device number 4
[ 6097.774052] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
[ 6097.783575] sd 3:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

Timestamp 5719 is about the time of the balance pause. I don't know what the last two messages mean or whether they contributed to the problem.


[root@f22s ~]# btrfs check /dev/sdb
warning, device 2 is missing
warning devid 2 not found already
Checking filesystem on /dev/sdb
UUID: 0f1c615f-30a0-4166-8a3c-987849551513
checking extents
checking free space cache
Error reading 476011409408, -1
failed to load free space cache for block group 476368076800
checking fs roots
checking csums
checking root refs
found 164679408219 bytes used err is 0
total csum bytes: 354762924
total tree bytes: 608239616
total fs tree bytes: 139395072
total extent tree bytes: 58785792
btree space waste bytes: 84024816
file data blocks allocated: 378008100864
 referenced 385864163328
Btrfs v3.18.2

Mounting with -o recovery,degraded makes no difference; -o ro,degraded does mount.
Comment 1 Chris Murphy 2015-02-04 06:35:38 UTC
Created attachment 165791 [details]
dmesg1

This contains the balance start, pause, and device disconnect. Complete dmesg.
Comment 2 Chris Murphy 2015-02-04 06:36:58 UTC
Created attachment 165801 [details]
dmesg2

This contains the mount attempts, also complete dmesg. But the most relevant part is:

[ 1592.211698] BTRFS: device label btrfs1 devid 1 transid 6090 /dev/sdb
[ 1614.127600] BTRFS info (device sdb): allowing degraded mounts
[ 1614.127678] BTRFS info (device sdb): disk space caching is enabled
[ 1616.722334] BTRFS: too many missing devices, writeable mount is not allowed
[ 1616.740638] BTRFS: open_ctree failed
[ 1641.107248] BTRFS info (device sdb): enabling auto recovery
[ 1641.107299] BTRFS info (device sdb): allowing degraded mounts
[ 1641.107333] BTRFS info (device sdb): disk space caching is enabled
[ 1643.354675] BTRFS: too many missing devices, writeable mount is not allowed
[ 1643.372162] BTRFS: open_ctree failed
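
To pick these refusals out of a full dmesg capture such as the attachment, a small filter is enough. A minimal sketch, assuming the log is plain text on standard input and that the message text matches the lines quoted above. failed_rw_mounts is a hypothetical helper name:

```shell
# Hypothetical helper: print the kernel timestamp of every refused
# writable mount found in dmesg-style text on stdin.
failed_rw_mounts() {
    sed -n 's/^\[ *\([0-9.]*\)] BTRFS: too many missing devices.*/\1/p'
}

# Demo on two lines taken from the excerpt above.
failed_rw_mounts <<'EOF'
[ 1614.127600] BTRFS info (device sdb): allowing degraded mounts
[ 1616.722334] BTRFS: too many missing devices, writeable mount is not allowed
EOF
```

On a live system the same filter could be fed from dmesg output or from a saved log file.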
Comment 3 Chris Murphy 2015-02-04 06:59:40 UTC
btrfs-image hangs, so this might be incomplete, but it's ~358MB.
https://drive.google.com/file/d/0B_2Asp8DGjJ9b2p0aUpGUTVzVU0/view?pli=1
Comment 4 Chris Murphy 2015-02-04 20:37:39 UTC
OK so this is completely reproducible with a new fs using only 
kernel-3.19.0-0.rc7.git0.1.fc22.x86_64
btrfs-progs-3.18.2-1.fc21.x86_64

# mkfs.btrfs -draid1 -mraid1 /dev/sd[bc]2
# mount /dev/sdb2 /mnt/btr
!!copy files
# umount /mnt/btr
!!disconnect all devices

!!connect one device
[ 4312.325111] usb 2-2: new SuperSpeed USB device number 4 using xhci_hcd
[ 4312.339119] usb 2-2: New USB device found, idVendor=174c, idProduct=5136
[ 4312.341348] usb 2-2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[ 4312.343550] usb 2-2: Product: AS2105
[ 4312.345719] usb 2-2: Manufacturer: ASMedia
[ 4312.347848] usb 2-2: SerialNumber: 00000000000000000000
[ 4312.350670] usb-storage 2-2:1.0: USB Mass Storage device detected
[ 4312.353910] scsi host6: usb-storage 2-2:1.0
[ 4313.358213] scsi 6:0:0:0: Direct-Access     ASMT     2105             0    PQ: 0 ANSI: 6
[ 4313.361394] sd 6:0:0:0: Attached scsi generic sg1 type 0
[ 4315.116218] sd 6:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 4315.118980] sd 6:0:0:0: [sdb] Write Protect is off
[ 4315.121478] sd 6:0:0:0: [sdb] Mode Sense: 43 00 00 00
[ 4315.121832] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4315.188271]  sdb: sdb1 sdb2
[ 4315.192440] sd 6:0:0:0: [sdb] Attached SCSI disk

# mount -o degraded /dev/sdb2 /mnt/btr

[ 4351.340980] BTRFS info (device sdb2): allowing degraded mounts
[ 4351.342019] BTRFS info (device sdb2): disk space caching is enabled
[ 4351.342750] BTRFS: has skinny extents
[ 4351.482338] SELinux: initialized (dev sdb2, type btrfs), uses xattr

# btrfs balance start -dconvert=single -mconvert=single -f /mnt/btr
!!wait a minute
# btrfs balance status /mnt/btr
Balance on '/mnt/btr' is running
3 out of about 27 chunks balanced (4 considered),  89% left
# btrfs balance pause /mnt/btr
!! wait 4 minutes
# umount /mnt/btr
!! command completes immediately, no kernel messages
# eject /dev/sdb

[ 4890.700161] ldm_validate_partition_table(): Disk read failed.
[ 4890.704585] Dev sdb: unable to read RDB block 0
[ 4890.706846]  sdb: unable to read partition table
[ 4890.710086] ldm_validate_partition_table(): Disk read failed.
[ 4890.711695] Dev sdb: unable to read RDB block 0
[ 4890.713319]  sdb: unable to read partition table

!! detach device

[ 4941.622366] usb 2-2: USB disconnect, device number 4
[ 4941.630437] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[ 4941.633476] sd 6:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

!! connect device

[ 4967.225104] usb 2-2: new SuperSpeed USB device number 5 using xhci_hcd
[ 4967.239118] usb 2-2: New USB device found, idVendor=174c, idProduct=5136
[ 4967.241392] usb 2-2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[ 4967.243682] usb 2-2: Product: AS2105
[ 4967.246067] usb 2-2: Manufacturer: ASMedia
[ 4967.248417] usb 2-2: SerialNumber: 00000000000000000000
[ 4967.251971] usb-storage 2-2:1.0: USB Mass Storage device detected
[ 4967.254491] scsi host7: usb-storage 2-2:1.0
[ 4968.259184] scsi 7:0:0:0: Direct-Access     ASMT     2105             0    PQ: 0 ANSI: 6
[ 4968.260956] sd 7:0:0:0: Attached scsi generic sg1 type 0
[ 4970.011907] sd 7:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 4970.015081] sd 7:0:0:0: [sdb] Write Protect is off
[ 4970.017977] sd 7:0:0:0: [sdb] Mode Sense: 43 00 00 00
[ 4970.018299] sd 7:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4970.084757]  sdb: sdb1 sdb2
[ 4970.090530] sd 7:0:0:0: [sdb] Attached SCSI disk

# mount -odegraded /dev/sdb2 /mnt/btr
mount: wrong fs type, bad option, bad superblock on /dev/sdb2,

[ 5005.403153] BTRFS info (device sdb2): allowing degraded mounts
[ 5005.404798] BTRFS info (device sdb2): disk space caching is enabled
[ 5005.406384] BTRFS: has skinny extents
[ 5005.570307] BTRFS: too many missing devices, writeable mount is not allowed
[ 5005.587140] BTRFS: open_ctree failed
Comment 5 Chris Murphy 2015-02-04 20:38:44 UTC
Created attachment 165861 [details]
dmesg4.txt

This is complete dmesg for comment 4.
Comment 6 Chris Murphy 2015-02-04 20:41:22 UTC
# btrfs-image -c9 -t2 -s /dev/sdb2 92641-c4-btrfs.image
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
Check tree block failed, want=65536, have=0
read block failed check_tree_block
Error reading metadata block
Error adding block -5
Check tree block failed, want=65536, have=0
read block failed check_tree_block
Error reading metadata block
Error flushing pending -5
create failed (Bad file descriptor)

# btrfs-image -c9 -t2 -s -w /dev/sdb2 92641-c4-btrfs.image
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
Check tree block failed, want=65536, have=0
read block failed check_tree_block
Error reading metadata block
Error adding metadata block
Check tree block failed, want=65536, have=0
read block failed check_tree_block
Error reading metadata block
Error flushing pending -5
create failed (Bad file descriptor)
Comment 7 Chris Murphy 2015-02-04 20:56:08 UTC
# btrfs check /dev/sdb2
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
Checking filesystem on /dev/sdb2
UUID: ae76c914-705f-427e-b062-bdbc6c823bb4
checking extents
checking free space cache
Error reading 11840192512, -1
failed to load free space cache for block group 1103101952
Error reading 10302906368, -1
failed to load free space cache for block group 2176843776
Error reading 10468786176, -1
failed to load free space cache for block group 3250585600
Error reading 10303430656, -1
failed to load free space cache for block group 4324327424
Error reading 10469048320, -1
failed to load free space cache for block group 5398069248
Error reading 4323475456, -1
failed to load free space cache for block group 6471811072
Error reading 13987172352, -1
failed to load free space cache for block group 7545552896
Error reading 10304217088, -1
failed to load free space cache for block group 8619294720
Error reading 14453514240, -1
failed to load free space cache for block group 13988003840
Error reading 16015753216, -1
failed to load free space cache for block group 18282971136
checking fs roots
checking csums
checking root refs
found 2700832111 bytes used err is 0
total csum bytes: 22153044
total tree bytes: 30490624
total fs tree bytes: 6029312
total extent tree bytes: 999424
btree space waste bytes: 2040833
file data blocks allocated: 22688911360
 referenced 22688911360
Btrfs v3.18.2



# btrfs check --repair /dev/sdb2
enabling repair mode
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
Checking filesystem on /dev/sdb2
UUID: ae76c914-705f-427e-b062-bdbc6c823bb4
checking extents
extent-tree.c:2657: btrfs_reserve_extent: Assertion `ret` failed.
btrfs[0x43c10a]
btrfs(btrfs_reserve_extent+0x802)[0x440d22]
btrfs(btrfs_alloc_free_block+0x5f)[0x440fcf]
btrfs(__btrfs_cow_block+0xc1)[0x433221]
btrfs(btrfs_cow_block+0x35)[0x433835]
btrfs[0x4385b6]
btrfs(btrfs_commit_transaction+0x8e)[0x439dbe]
btrfs[0x41225e]
btrfs(cmd_check+0x7c7)[0x425e97]
btrfs(main+0x82)[0x4125c2]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7fd5cb3befe0]
btrfs[0x4126c4]
#
Comment 8 David Sterba 2015-03-17 15:54:06 UTC
> 6. mount -odegraded /dev/sdb /mnt/btr
> 7. btrfs balance start -dconvert=single -mconvert=single -f /mnt/btr

This is a valid combination of operations, i.e. there are fewer writable devices, but the balance target is fine with that. After the filesystem is unmounted, this combination of operations is not taken into account at the next mount, but it should be.
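
A simplified model of the situation, not the kernel's code: if the mount-time check compares the number of missing devices with the lowest failure tolerance among all profiles present on the filesystem, then a filesystem left part raid1 and part single by a paused conversion tolerates zero missing devices. The function names and the tolerance table (raid1, raid10, raid5: 1; raid6: 2; everything else: 0) are assumptions for illustration:

```shell
# Hypothetical model: how many missing devices a profile can survive.
tolerated_failures() {
    case $1 in
        raid1|raid10|raid5) echo 1 ;;
        raid6) echo 2 ;;
        *) echo 0 ;;   # single, dup, raid0
    esac
}

# Usage: rw_mount_allowed MISSING_DEVICES PROFILE...
# Succeeds only if the least tolerant profile present can absorb
# the given number of missing devices.
rw_mount_allowed() (
    missing=$1
    shift
    min=""
    for profile in "$@"; do
        t=$(tolerated_failures "$profile")
        if [ -z "$min" ] || [ "$t" -lt "$min" ]; then
            min=$t
        fi
    done
    [ "$missing" -le "${min:-0}" ]
)

# One device missing, chunks in both raid1 and single after the paused balance.
if rw_mount_allowed 1 raid1 single; then echo allowed; else echo refused; fi
```

Under this model the half-converted filesystem is refused even when every single-profile chunk sits on the device that is still present, which is the case the comment says should be taken into account.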
Comment 9 Dan Jacobson 2015-06-24 04:03:06 UTC
With Linux 4.0.0-2-686-pae #1 SMP Debian 4.0.5-1 one always gets (harmless?)

Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

messages when unplugging Toshiba Canvio Basics disk drives.
Comment 10 Dan Jacobson 2016-06-07 00:10:55 UTC
I found the Toshiba Canvio Basics problems all occur under USB2. With a 100% USB3 connection and computer, I have no more problems.
