Bug 29312

Summary: Unmounting fails even after the underlying device is long gone
Product: File System
Reporter: Bart Van Assche (bvanassche)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: CLOSED UNREPRODUCIBLE
Severity: normal
CC: florian, maciej.rutecki, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.38-rc5
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 27352

Description Bart Van Assche 2011-02-17 19:20:40 UTC
Unmounting a filesystem after the underlying (network) device is gone used to work with previous kernel versions but apparently no longer works with 2.6.38. Additionally, the load average rises to about 1.0 and stays there:

# umount /mnt
umount: /mnt: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
# echo 1 > /sys/block/sdc/device/delete
-bash: /sys/block/sdc/device/delete: No such file or directory
# cat /proc/loadavg
1.05 1.15 1.94 1/278 8426
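
For readers unfamiliar with the /proc/loadavg format shown above, here is a minimal sketch that splits such a line into its fields. The names LoadAvg and parse_loadavg are hypothetical helpers introduced for illustration; they are not part of the original report.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    """Fields of one /proc/loadavg line (hypothetical helper type)."""
    one_min: float      # load average over the last 1 minute
    five_min: float     # load average over the last 5 minutes
    fifteen_min: float  # load average over the last 15 minutes
    runnable: int       # currently runnable scheduling entities
    total: int          # total scheduling entities on the system
    last_pid: int       # PID of the most recently created process


def parse_loadavg(line: str) -> LoadAvg:
    """Parse a line such as '1.05 1.15 1.94 1/278 8426'."""
    fields = line.split()
    runnable, total = fields[3].split("/")
    return LoadAvg(
        float(fields[0]),
        float(fields[1]),
        float(fields[2]),
        int(runnable),
        int(total),
        int(fields[4]),
    )


if __name__ == "__main__":
    # The line quoted in the report.
    print(parse_loadavg("1.05 1.15 1.94 1/278 8426"))
```

The first field is the 1-minute average, which is the value the report describes as staying at about 1.0.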

The following call trace was logged several times:

sd 8:0:0:0: [sdc] Unhandled error code
sd 8:0:0:0: [sdc]  Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
sd 8:0:0:0: [sdc] CDB: Write(10): 2a 00 00 00 01 80 00 00 08 00
end_request: I/O error, dev sdc, sector 384
------------[ cut here ]------------
WARNING: at fs/ext4/extents.c:3765 ext4_convert_unwritten_extents+0xef/0x120 [ext4]()
Hardware name: P5Q DELUXE
Modules linked in: ib_srp ext4 jbd2 crc16 scsi_transport_srp fuse ip6t_LOG ipt_MASQUERADE xt_pkttype xt_TCPMSS xt_tcpudp ipt_LOG xt_limit iptable_nat nf_nat radeon ttm drm_kms_helper drm i2c_algo_bit snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device af_packet rdma_ucm ib_ipoib ib_uverbs ib_umad cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf mlx4_ib ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables loop dm_mod coretemp snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer sg i2c_i801 snd sr_mod soundcore snd_page_alloc pcspkr hid_belkin cdrom skge mlx4_core button sky2 i2c_core raid456 async_raid6_recov async_pq usbhid hid raid6_pq async_xor xor async_memcpy async_tx raid10 raid0 uhci_hcd rtc_cmos rtc_core ehci_hcd rtc_lib usbcore sd_mod edd raid1 ext3 mbcache jbd fan ata_generic ata_piix pata_marvell ahci libahci libata thermal processor thermal_sys hwmon [last unloaded: ib_srp]
Pid: 20, comm: kworker/1:1 Not tainted 2.6.38-rc5-tcm+ #3
Call Trace:
 [<ffffffff8103fe0a>] ? warn_slowpath_common+0x7a/0xb0
 [<ffffffff8103fe55>] ? warn_slowpath_null+0x15/0x20
 [<ffffffffa05addef>] ? ext4_convert_unwritten_extents+0xef/0x120 [ext4]
 [<ffffffffa0599948>] ? ext4_end_io_nolock+0x48/0x110 [ext4]
 [<ffffffffa0599a4a>] ? ext4_end_io_work+0x3a/0xb0 [ext4]
 [<ffffffffa0599a10>] ? ext4_end_io_work+0x0/0xb0 [ext4]
 [<ffffffff8105490f>] ? process_one_work+0xff/0x370
 [<ffffffff810560d1>] ? worker_thread+0x161/0x330
 [<ffffffff81055f70>] ? worker_thread+0x0/0x330
 [<ffffffff8105a6c6>] ? kthread+0x96/0xa0
 [<ffffffff81003a94>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff8105a630>] ? kthread+0x0/0xa0
 [<ffffffff81003a90>] ? kernel_thread_helper+0x0/0x10
---[ end trace 3e35244bb4bb9c94 ]---
ext4_convert_unwritten_extents: ext4_ext_map_blocks returned error inode#36, block=2203, max_blocks=1
ext4_end_io_nolock: failed to convert unwritten extents to written extents, error is -5 io is still on inode 36 aio dio list
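
The CDB bytes in the log above can be tied to the "sector 384" in the end_request line. The following is a minimal sketch, assuming the standard SCSI WRITE(10) layout (opcode 0x2a, big-endian logical block address in bytes 2-5, big-endian transfer length in bytes 7-8). The function name decode_write10 is hypothetical and not part of the original report.

```python
def decode_write10(cdb_hex: str) -> tuple[int, int]:
    """Return (logical block address, transfer length in blocks)
    for a WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: starting block
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8: number of blocks
    return lba, blocks


if __name__ == "__main__":
    # CDB copied from the "[sdc] CDB: Write(10)" log line.
    lba, blocks = decode_write10("2a 00 00 00 01 80 00 00 08 00")
    print(f"LBA={lba} blocks={blocks}")
```

For the logged CDB this yields a starting block of 384 and a length of 8 blocks, which agrees with the failing sector reported by end_request. If the device uses 512-byte logical blocks (an assumption; the log does not state the block size), that is a single 4 KiB write.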

Comment 1 Rafael J. Wysocki 2011-03-07 20:39:33 UTC
On Monday, March 07, 2011, Bart Van Assche wrote:
> On Sun, Mar 6, 2011 at 1:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > This message has been generated automatically as a part of a summary report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.37.  Please verify if it still should be listed and let the tracking team
> > know (either way).
> >
> >
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=29312
> > Subject         : Unmounting fails even after the underlying device is long gone
> > Submitter       : Bart Van Assche <bart.vanassche@gmail.com>
> > Date            : 2011-02-17 19:20 (18 days old)
> 
> Unmounting works fine again with 2.6.38-rc7+, but I see now a huge
> number of error messages in the system log:
> # grep -c '^Mar  7 18:37:4.* kernel: sd 6:0:0:0: rejecting I/O to offline device$' /var/log/messages
> 4987
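
The grep count above can also be reproduced in a script, for example to compare message rates between kernel versions. This is a minimal sketch using the same regular expression as the quoted command. The function name count_matches is hypothetical, and the sample lines in the usage part (including the hostname "host") are made up for illustration rather than taken from the reporter's log.

```python
import re
from typing import Iterable

# Same pattern as in the quoted grep command.
PATTERN = re.compile(
    r"^Mar  7 18:37:4.* kernel: sd 6:0:0:0: rejecting I/O to offline device$"
)


def count_matches(lines: Iterable[str], pattern: re.Pattern = PATTERN) -> int:
    """Count lines matching the pattern, like `grep -c`."""
    return sum(1 for line in lines if pattern.search(line.rstrip("\n")))


if __name__ == "__main__":
    # Illustrative sample lines, not real log output.
    sample = [
        "Mar  7 18:37:41 host kernel: sd 6:0:0:0: rejecting I/O to offline device\n",
        "Mar  7 18:38:01 host kernel: sd 6:0:0:0: rejecting I/O to offline device\n",
    ]
    print(count_matches(sample))
    # To scan a real log, pass an open file instead:
    # with open("/var/log/messages") as f:
    #     print(count_matches(f))
```

Only lines stamped 18:37:40 through 18:37:49 match, so the 4987 in the report covers a ten-second window.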