Bug 87131 - allows umount before transactions complete
Summary: allows umount before transactions complete
Status: CLOSED UNREPRODUCIBLE
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-10-29 21:50 UTC by Chris Murphy
Modified: 2020-03-15 07:12 UTC (History)
3 users (show)

See Also:
Kernel Version: 3.18.0-0.rc2 x86_64
Tree: Fedora
Regression: No


Attachments
dmesg (148.82 KB, text/plain)
2014-10-29 21:51 UTC, Chris Murphy
Details

Description Chris Murphy 2014-10-29 21:50:16 UTC
SUMMARY:
 After umount returning to prompt, and physical disconnected 2x devices (btrfs raid1 on raw devices), I get a backtrace with some scary messages. Here are some snippets:
[10570.371285] BTRFS: error (device sdc) in btrfs_commit_transaction:1917: errno=-5 IO failure (Error while writing out transaction)
[10570.372426] BTRFS info (device sdc): forced readonly
[10570.372432] BTRFS warning (device sdc): Skipping commit of aborted transaction.
[10570.372456] BTRFS: Transaction aborted (error -5)
[10570.372807] BTRFS: error (device sdc) in cleanup_transaction:1599: errno=-5 IO failure
[10570.373960] BTRFS info (device sdc): delayed_refs has NO entry


HARDWARE:
Fedora 21 Server
kernel-3.18.0-0.rc2.git1.1.fc22.x86_64
btrfs-progs-3.17
2x firewire devices in a Btrfs raid1, (raw devices, no partitioning, md or lvm)

Client is a Mac running OS X 10.10 connected to above server via NFS v4, copying files with rsync to server.

Reproducibility: Unknown, once so far

1.  From client, unmount NFS share
2.  From server:
systemctl stop nfs-server.service
umount /mnt/btrfs1
3. Commands return to prompt
4. Physically disconnect both devices at the same time
5. I see backtrace on the server

re: timestamp 10509 with sd messages, I'm not sure if these are before or after the physical disconnect, I think they're right at the time of the disconnect

This is dmesg continuous from the time of the nfs-server service being stopped, to the end. Complete  dmesg is here:



[10509.080513] nfsd: last server has exited, flushing export cache
[10554.562413] sd 7:0:0:0: [sdc] Synchronizing SCSI cache
[10554.563704] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[10554.652787] sd 7:0:0:0: [sdc]  
[10554.652791] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK
[10554.652792] sd 7:0:0:0: [sdc] Stopping disk
[10554.654796] sd 6:0:0:0: [sdb]  
[10554.654798] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK
[10554.654800] sd 6:0:0:0: [sdb] Stopping disk
[10554.682775] sd 7:0:0:0: [sdc] START_STOP FAILED
[10554.682777] sd 7:0:0:0: [sdc]  
[10554.682778] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK
[10554.684790] sd 6:0:0:0: [sdb] START_STOP FAILED
[10554.684793] sd 6:0:0:0: [sdb]  
[10554.684794] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK
[10554.696334] firewire_sbp2 fw2.0: released target 7:0:0
[10554.696535] firewire_sbp2 fw1.0: released target 6:0:0
[10570.371285] BTRFS: error (device sdc) in btrfs_commit_transaction:1917: errno=-5 IO failure (Error while writing out transaction)
[10570.372426] BTRFS info (device sdc): forced readonly
[10570.372432] BTRFS warning (device sdc): Skipping commit of aborted transaction.
[10570.372434] ------------[ cut here ]------------
[10570.372454] WARNING: CPU: 0 PID: 968 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x54/0x120 [btrfs]()
[10570.372456] BTRFS: Transaction aborted (error -5)
[10570.372457] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace sunrpc nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp firewire_sbp2 ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bridge stp llc bnep arc4 b43 mac80211 cfg80211 ssb iTCO_wdt iTCO_vendor_support nls_utf8 hfsplus x86_pkg_temp_thermal coretemp kvm_intel applesmc kvm input_polldev hid_appleir btusb crct10dif_pclmul crc32_pclmul crc32c_intel bluetooth ghash_clmulni_intel uvcvideo videobuf2_vmalloc videobuf2_core rfkill videobuf2_memops v4l2_common videodev media
[10570.372542]  bcm5974 i2c_i801 sdhci_pci sdhci bcma mmc_core lpc_ich mfd_core snd_hda_codec_cirrus sbs snd_hda_codec_generic sbshc snd_hda_intel snd_hda_controller apple_gmux snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd mei_me soundcore mei apple_bl shpchp btrfs xor i915 raid6_pq i2c_algo_bit drm_kms_helper tg3 firewire_ohci drm firewire_core ptp pps_core crc_itu_t video
[10570.372583] CPU: 0 PID: 968 Comm: btrfs-transacti Not tainted 3.18.0-0.rc2.git1.1.fc22.x86_64 #1
[10570.372586] Hardware name: Apple Inc. MacBookPro8,2/Mac-94245A3940C91C80, BIOS    MBP81.88Z.0047.B27.1201241646 01/24/12
[10570.372588]  0000000000000000 0000000049a53850 ffff880250c47bb8 ffffffff8185d5a5
[10570.372593]  0000000000000000 ffff880250c47c10 ffff880250c47bf8 ffffffff810ab9f1
[10570.372598]  ffffffff8111d987 00000000fffffffb ffff88025c7fbe60 ffff88025ca7a000
[10570.372602] Call Trace:
[10570.372609]  [<ffffffff8185d5a5>] dump_stack+0x4e/0x68
[10570.372616]  [<ffffffff810ab9f1>] warn_slowpath_common+0x81/0xa0
[10570.372622]  [<ffffffff8111d987>] ? vprintk_emit+0x387/0x660
[10570.372633]  [<ffffffff810aba65>] warn_slowpath_fmt+0x55/0x70
[10570.372657]  [<ffffffffa024c6f4>] __btrfs_abort_transaction+0x54/0x120 [btrfs]
[10570.372680]  [<ffffffffa027ef64>] cleanup_transaction+0x84/0x480 [btrfs]
[10570.372690]  [<ffffffff810fd7e0>] ? abort_exclusive_wait+0xb0/0xb0
[10570.372713]  [<ffffffffa0280282>] btrfs_commit_transaction+0x792/0xc20 [btrfs]
[10570.372734]  [<ffffffffa027b605>] transaction_kthread+0x225/0x290 [btrfs]
[10570.372754]  [<ffffffffa027b3e0>] ? btrfs_cleanup_transaction+0x710/0x710 [btrfs]
[10570.372761]  [<ffffffff810d38db>] kthread+0x10b/0x130
[10570.372766]  [<ffffffff81028cc9>] ? sched_clock+0x9/0x10
[10570.372771]  [<ffffffff810d37d0>] ? kthread_create_on_node+0x250/0x250
[10570.372775]  [<ffffffff8186767c>] ret_from_fork+0x7c/0xb0
[10570.372779]  [<ffffffff810d37d0>] ? kthread_create_on_node+0x250/0x250
[10570.372782] ---[ end trace 42911c328407f53e ]---
[10570.372807] BTRFS: error (device sdc) in cleanup_transaction:1599: errno=-5 IO failure
[10570.373960] BTRFS info (device sdc): delayed_refs has NO entry

Reboot, reconnect, but not mounted:
[   74.873140] BTRFS: device label btrfs1 devid 2 transid 4860 /dev/sdb
[   80.729925] BTRFS: device label btrfs1 devid 1 transid 4860 /dev/sdc

So this is good the two devices have the same generation.

# btrfs check /dev/sdb
Checking filesystem on /dev/sdb
UUID: 0f1c615f-30a0-4166-8a3c-987849551513
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 688074962 bytes used err is 0
total csum bytes: 211549532
total tree bytes: 564854784
total fs tree bytes: 264306688
total extent tree bytes: 56295424
btree space waste bytes: 91382334
file data blocks allocated: 216679673856
 referenced 224192188416
Btrfs v3.17

kernel messages at mount time
# mount -o compress /dev/sdb /mnt/btrfs1
[ 1065.662446] BTRFS info (device sdc): disk space caching is enabled
[ 1066.819178] SELinux: initialized (dev sdc, type btrfs), uses xattr

No complaints.
Comment 1 Chris Murphy 2014-10-29 21:51:15 UTC
Created attachment 155831 [details]
dmesg

complete dmesg
Comment 2 Chris Murphy 2014-11-06 20:57:45 UTC
I can't reproduce so far, so I'm considering the possibility of user error. I'm uncertain if the bind mount of /mnt/btrfs1 to /export/backup (the NFS share) was unmounted before physical disconnect.
Comment 3 Marcos Souza 2020-03-15 03:09:07 UTC
Chris, this kernel is too old. Can we close this bug? Thanks.
Comment 4 Chris Murphy 2020-03-15 07:12:46 UTC
Yep.

Note You need to log in before you can comment on or make changes to this bug.