Bug 173031 - kernel oops on USB flash drive removal
Summary: kernel oops on USB flash drive removal
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-28 10:35 UTC by Francesco Dolcini
Modified: 2019-02-04 21:10 UTC (History)
4 users (show)

See Also:
Kernel Version: 4.1.33
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
x86 kernel config (63.97 KB, text/plain)
2016-09-28 10:35 UTC, Francesco Dolcini
Details
imx6q kernel config (74.15 KB, text/plain)
2016-09-28 10:35 UTC, Francesco Dolcini
Details
mdev usb mount/umount script (572 bytes, text/plain)
2016-09-28 10:37 UTC, Francesco Dolcini
Details

Description Francesco Dolcini 2016-09-28 10:35:14 UTC
Created attachment 239911 [details]
x86 kernel config

Hi,
I am experiencing various kernel oops since upgrading from kernel 3.14 to kernel 4.1.

I was able to reproduce the bug with this test:

lazy umount with mdev on USB flads drive removal event on /media/usb/ (mdev script attached)l

/media/usb/.not_mounted file exist on rootfs to verify if USB flash drive is mounted or not

Bug was originally reproduced on freescale imx6q, using a non-vanilla kernel and vfat filesystem. However I was able to reproduce it also on x86 (Atom Z510) with vanilla kernel and using ext4 filesystem. Reproducing on imx6 with vanilla kernel is not possible, hardware requires some (small) patches to work.

x86 Oops information (with ext4):
--------------------------------

[/root]# [   79.776087] usb 1-3: new high-speed USB device number 5 using ehci-pci
[   79.925734] usb-storage 1-3:1.0: USB Mass Storage device detected
[   79.953080] scsi host3: usb-storage 1-3:1.0
[   80.962397] scsi 3:0:0:0: Direct-Access     Generic  Flash Disk       8.07 PQ: 0 ANSI: 2
[   80.973806] sd 3:0:0:0: [sda] 7886848 512-byte logical blocks: (4.04 GB/3.76 GiB)
[   81.006984] sd 3:0:0:0: [sda] Write Protect is off
[   81.029882] sd 3:0:0:0: [sda] Mode Sense: 03 00 00 00
[   81.036946] sd 3:0:0:0: [sda] No Caching mode page found
[   81.042747] sd 3:0:0:0: [sda] Assuming drive cache: write through
[   81.059417] sd 3:0:0:0: [sda] Attached SCSI removable disk
[   81.098931] EXT3-fs (sda): error: couldn't mount because of unsupported optional features (240)
[   81.110307] EXT2-fs (sda): error: couldn't mount because of unsupported optional features (244)
[   81.726932] EXT4-fs (sda): recovery complete
[   81.735870] EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)

[/root]# 
[/root]# [/root]# mount[/root]# while true; do [ ! -f /media/usb/.not_mounted ] && dd if=/dev/zero of=/
media/usb/aaa bs=1k count=1 2>/dev/null ; done
[   90.616356] usb 1-3: USB disconnect, device number 5
[   90.628198] sd 3:0:0:0: [sda] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[   90.636536] sd 3:0:0:0: [sda] CDB: Write(10) 2a 00 00 04 23 08 00 00 08 00
[   90.643455] blk_update_request: I/O error, dev sda, sector 271112
[   90.649580] EXT4-fs warning (device sda): ext4_end_bio:332: I/O error -5 writing to inode 12 (offset 0 size 4096 starting block 33890)
[   90.661692] Buffer I/O error on device sda, logical block 33889
[   90.668219] Aborting journal on device sda-8.
[   90.673414] EXT4-fs error (device sda) in ext4_truncate:3840: Readonly filesystem
[   90.685823] JBD2: Error -5 detected when updating journal superblock for sda-8.
[   90.702793] EXT4-fs (sda): previous I/O error to superblock detected
[   90.740733] BUG: unable to handle kernel paging request at 1dcb5000
[   90.744026] IP: [<c121ca3c>] __percpu_counter_add+0x2c/0xf0
[   90.744026] *pde = 00000000 
[   90.744026] Oops: 0000 [#1] PREEMPT SMP 
[   90.744026] Modules linked in: arc4 ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 rfkill
[   90.744026] CPU: 0 PID: 1534 Comm: dd Tainted: G        W       4.1.33_11 #2
[   90.744026] task: dee12880 ti: defca000 task.ti: defca000
[   90.744026] EIP: 0060:[<c121ca3c>] EFLAGS: 00010002 CPU: 0
[   90.744026] EIP is at __percpu_counter_add+0x2c/0xf0
[   90.744026] EAX: 00000000 EBX: ded84bfc ECX: 00000002 EDX: 00000001
[   90.744026] ESI: ffffffff EDI: de80eb68 EBP: defcbc84 ESP: defcbc68
[   90.744026]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   90.744026] CR0: 8005003b CR2: 1dcb5000 CR3: 1efa5000 CR4: 000006d0
[   90.744026] Stack:
[   90.744026]  c15ed780 00000019 00000001 00000000 ded84bdc ffffffff de80eb68 defcbc98
[   90.744026]  c10c58ca 00000008 df6bf800 de80eb58 defcbcb4 c1121447 00000000 00000282
[   90.744026]  df6bf800 de040400 de040400 defcbcc0 c11215a4 dd553000 defcbcf4 c1189c5a
[   90.744026] Call Trace:
[   90.744026]  [<c10c58ca>] account_page_dirtied+0x6a/0xc0
[   90.744026]  [<c1121447>] __set_page_dirty+0x37/0xb0
[   90.744026]  [<c11215a4>] mark_buffer_dirty+0x44/0x80
[   90.744026]  [<c1189c5a>] ext4_commit_super+0x17a/0x230
[   90.744026]  [<c118a38b>] __ext4_abort+0x2b/0x110
[   90.744026]  [<c1124f1d>] ? sync_dirty_buffer+0xd/0x10
[   90.744026]  [<c1189c7a>] ? ext4_commit_super+0x19a/0x230
[   90.744026]  [<c119c3c8>] ext4_journal_check_start+0x88/0x90
[   90.744026]  [<c119c495>] __ext4_journal_start_sb+0x15/0x60
[   90.744026]  [<c117f7c4>] ext4_dirty_inode+0x34/0x70
[   90.744026]  [<c111b577>] __mark_inode_dirty+0x27/0x290
[   90.744026]  [<c111039c>] ? setattr_copy+0x6c/0xd0
[   90.744026]  [<c117f36e>] ext4_setattr+0x23e/0x660
[   90.744026]  [<c1173fb7>] ? ext4_file_open+0x37/0x140
[   90.744026]  [<c11107c5>] notify_change+0x1d5/0x350
[   90.744026]  [<c121cac4>] ? __percpu_counter_add+0xb4/0xf0
[   90.744026]  [<c105c17b>] ? get_parent_ip+0xb/0x40
[   90.744026]  [<c10f4b33>] do_truncate+0x53/0x90
[   90.744026]  [<c10c71bd>] ? file_ra_state_init+0xd/0x30
[   90.744026]  [<c11030a8>] do_last.isra.48+0x778/0xe90
[   90.744026]  [<c121c36f>] ? __this_cpu_preempt_check+0xf/0x20
[   90.744026]  [<c1104788>] path_openat+0x68/0x4d0
[   90.744026]  [<c1105c3d>] do_filp_open+0x2d/0x80
[   90.744026]  [<c10f5e31>] do_sys_open+0x111/0x200
[   90.744026]  [<c10f4376>] ? filp_close+0x46/0x60
[   90.744026]  [<c10f5f3f>] SyS_open+0x1f/0x30
[   90.744026]  [<c1418ba2>] syscall_call+0x7/0x7
[   90.744026] Code: 89 e5 57 56 53 89 c3 b8 01 00 00 00 8d 64 24 f0 89 55 ec 89 4d f0 e8 84 f7 e3 ff b8 1b b1 53 c1 e8 2a f9 ff ff 8b 55 ec 8b 43 0c <64> 8b 30 89 f7 8b 4d f0 c1 ff 1f 01 d6 89 75 ec 8b 75 08 11 cf
[   90.744026] EIP: [<c121ca3c>] __percpu_counter_add+0x2c/0xf0 SS:ESP 0068:defcbc68
[   90.744026] CR2: 000000001dcb5000
[   90.744026] ---[ end trace faf081e4d1712537 ]---
[   90.744026] Kernel panic - not syncing: Fatal exception
[   90.744026] Kernel Offset: disabled

x86 Oops information (with vfat):
---------------------------------

[/root]# [  176.844068] usb 1-3: new high-speed USB device number 4 using ehci-pci
[  176.993405] usb-storage 1-3:1.0: USB Mass Storage device detected
[  177.018061] scsi host2: usb-storage 1-3:1.0
[  178.026664] scsi 2:0:0:0: Direct-Access     Generic  Flash Disk       8.07 PQ: 0 ANSI: 2
[  178.037979] sd 2:0:0:0: [sda] 7886848 512-byte logical blocks: (4.04 GB/3.76 GiB)
[  178.072465] sd 2:0:0:0: [sda] Write Protect is off
[  178.093828] sd 2:0:0:0: [sda] Mode Sense: 03 00 00 00
[  178.100567] sd 2:0:0:0: [sda] No Caching mode page found
[  178.106409] sd 2:0:0:0: [sda] Assuming drive cache: write through
[  178.119711]  sda:
[  178.125233] sd 2:0:0:0: [sda] Attached SCSI removable disk
[  178.180165] FAT-fs (sda): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.

[/root]# while true; do [ ! -f /media/usb/.not_mounted ] && dd if=/dev/zero of=/
media/usb/aaa bs=1k count=1 2>/dev/null ; done
[  192.230834] usb 1-3: USB disconnect, device number 4
[  192.240277] sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[  192.248566] sd 2:0:0:0: [sda] CDB: Write(10) 2a 00 00 00 20 00 00 00 01 00
[  192.255501] blk_update_request: I/O error, dev sda, sector 8192
[  192.261446] Buffer I/O error on dev sda, logical block 8192, lost sync page write
[  192.269976] FAT-fs (sda): FAT read failed (blocknr 6275)
[  192.306364] ------------[ cut here ]------------
[  192.311133] WARNING: CPU: 0 PID: 1860 at /media/ssd/dolcini/FBO/linux-fbo/fs/fs-writeback.c:1327 __mark_inode_dirty+0x27b/0x290()
[  192.322839] bdi-block not registered
[  192.326448] Modules linked in: vfat fat arc4 ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 rfkill
[  192.335896] CPU: 0 PID: 1860 Comm: dd Tainted: G        W       4.1.33_11 #2
[  192.336300]  00000286 00000286 dc3dbba8 c1413d72 dc3dbbec 00000000 dc3dbbdc c103d692
[  192.336300]  c153b00c dc3dbc08 00000744 c153219c 0000052f c111b7cb 0000052f c111b7cb
[  192.336300]  de95e1f8 ded60bdc ded60cdc dc3dbbf4 c103d6ef 00000009 dc3dbbec c153b00c
[  192.336300] Call Trace:
[  192.336300]  [<c1413d72>] dump_stack+0x47/0x5b
[  192.336300]  [<c103d692>] warn_slowpath_common+0x82/0xb0
[  192.336300]  [<c111b7cb>] ? __mark_inode_dirty+0x27b/0x290
[  192.336300]  [<c111b7cb>] ? __mark_inode_dirty+0x27b/0x290
[  192.336300]  [<c103d6ef>] warn_slowpath_fmt+0x2f/0x40
[  192.336300]  [<c111b7cb>] __mark_inode_dirty+0x27b/0x290
[  192.336300]  [<e18e5c3b>] mark_fsinfo_dirty+0x2b/0x30 [fat]
[  192.336300]  [<e18e676c>] fat_alloc_clusters+0x20c/0x460 [fat]
[  192.336300]  [<c121c36f>] ? __this_cpu_preempt_check+0xf/0x20
[  192.336300]  [<e18e7b5f>] fat_get_block+0x18f/0x230 [fat]
[  192.336300]  [<c105c1fb>] ? preempt_count_add+0x4b/0xa0
[  192.336300]  [<c1122073>] ? alloc_page_buffers+0x53/0xa0
[  192.336300]  [<c1123c20>] __block_write_begin+0x100/0x350
[  192.336300]  [<c1123ea8>] block_write_begin+0x38/0x70
[  192.336300]  [<e18e79d0>] ? fat_generic_ioctl+0x4f0/0x4f0 [fat]
[  192.336300]  [<c11240a8>] cont_write_begin+0x1c8/0x2c0
[  192.336300]  [<e18e79d0>] ? fat_generic_ioctl+0x4f0/0x4f0 [fat]
[  192.336300]  [<e18e8b15>] fat_write_begin+0x55/0x90 [fat]
[  192.336300]  [<e18e79d0>] ? fat_generic_ioctl+0x4f0/0x4f0 [fat]
[  192.336300]  [<c10bd37d>] generic_perform_write+0xad/0x190
[  192.336300]  [<c108360c>] ? current_fs_time+0xc/0x50
[  192.336300]  [<c10bf31f>] __generic_file_write_iter+0x16f/0x1b0
[  192.336300]  [<c1104799>] ? path_openat+0x79/0x4d0
[  192.336300]  [<c10bf518>] generic_file_write_iter+0x1b8/0x300
[  192.336300]  [<c105c17b>] ? get_parent_ip+0xb/0x40
[  192.336300]  [<c10f6814>] __vfs_write+0xa4/0xe0
[  192.336300]  [<c10f6d8f>] vfs_write+0x8f/0x1a0
[  192.336300]  [<c10f76a2>] SyS_write+0x42/0x90
[  192.336300]  [<c1418ba2>] syscall_call+0x7/0x7
[  192.521611] ---[ end trace 3f66944902906d95 ]---
[  192.622657] FAT-fs (sda): unable to read inode block for updating (i_pos 131073)
[  192.632672] FAT-fs (sda): bread failed in fat_clusters_flush



imx6q Oops information (with vfat):
---------------------------------

[   31.583781] usb 1-1: new high-speed USB device number 2 using ci_hdrc
[   31.743595] usb-storage 1-1:1.0: USB Mass Storage device detected
[   31.750669] scsi host0: usb-storage 1-1:1.0
[   32.756162] scsi 0:0:0:0: Direct-Access     Generic  Flash Disk       8.07 PQ: 0 ANSI: 2
[   32.766946] sd 0:0:0:0: [sda] 7886848 512-byte logical blocks: (4.04 GB/3.76 GiB)
[   32.775294] sd 0:0:0:0: [sda] Write Protect is off
[   32.780116] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[   32.785743] sd 0:0:0:0: [sda] No Caching mode page found
[   32.791082] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   32.802898]  sda:
[   32.807620] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   32.852732] FAT-fs (sda): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.

root@fbo:~# while true; do [ ! -f /media/usb/.not_mounted ] && dd if=/dev/zero of=/media/usb/aaa bs=1k count=1 2>/dev/null ; done
[   61.142634] wlcore: wl18xx HW: 183x or 180x, PG 2.2 (ROM 0x11)
[   61.155905] wlcore: loaded
[   62.670078] usb 1-1: USB disconnect, device number 2
[   62.683879] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[   62.691577] sd 0:0:0:0: [sda] CDB: opcode=0x2a 2a 00 00 00 20 00 00 00 01 00
[   62.698664] blk_update_request: I/O error, dev sda, sector 8192
[   62.704623] Buffer I/O error on dev sda, logical block 8192, lost sync page write
[   62.719584] FAT-fs (sda): Directory bread(block 8192) failed
[   62.725638] FAT-fs (sda): Directory bread(block 8193) failed
[   62.731406] FAT-fs (sda): Directory bread(block 8194) failed
[   62.737224] FAT-fs (sda): Directory bread(block 8195) failed
[   62.742934] FAT-fs (sda): Directory bread(block 8196) failed
[   62.748799] FAT-fs (sda): Directory bread(block 8197) failed
[   62.754536] FAT-fs (sda): Directory bread(block 8198) failed
[   62.760217] FAT-fs (sda): Directory bread(block 8199) failed
[   62.765949] FAT-fs (sda): Directory bread(block 8200) failed
[   62.771628] FAT-fs (sda): Directory bread(block 8201) failed
[   62.779616] FAT-fs (sda): FAT read failed (blocknr 6270)
[   62.785092] Unable to handle kernel paging request at virtual address 50835000
[   62.792331] pgd = ceb8c000
[   62.795046] [50835000] *pgd=00000000
[   62.798656] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[   62.803978] Modules linked in:
[   62.807060] CPU: 1 PID: 526 Comm: dd Not tainted 4.1.32-01802-g58326dd-dirty #6
[   62.814380] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   62.820924] task: ce76af80 ti: ceb02000 task.ti: ceb02000
[   62.826338] PC is at __percpu_counter_add+0x2c/0x104
[   62.831311] LR is at __percpu_counter_add+0x24/0x104
[   62.836284] pc : [<801e8a14>]    lr : [<801e8a0c>]    psr: 200c0093
[   62.836284] sp : ceb03df8  ip : 00000000  fp : 00000000
[   62.847767] r10: d0fa0660  r9 : 00000001  r8 : 00000000
[   62.852999] r7 : ffffffff  r6 : ffffffff  r5 : 00000018  r4 : ce3c9150
[   62.859531] r3 : 00000000  r2 : 50835000  r1 : 8061d562  r0 : 00000001
[   62.866068] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   62.873294] Control: 10c53c7d  Table: 5eb8c04a  DAC: 00000015
[   62.879045] Process dd (pid: 526, stack limit = 0xceb02210)
[   62.884623] Stack: (0xceb03df8 to 0xceb04000)
[   62.888987] 3de0:                                                       200c0013 cc2f615c
[   62.897172] 3e00: ffffffff cc2f615c 00000000 00000001 d0fa0660 8009b02c 00000018 d0fa0660
[   62.905356] 3e20: 00040835 ceb03ef8 00000000 8009b250 0000000e 00000001 00000000 00000001
[   62.913541] 3e40: 0000030f ceb03e64 00000001 00000000 00000002 ceb03e70 8009ab78 cc2f615c
[   62.921726] 3e60: ffffe000 00000389 00000002 00000000 d0fa0660 d1970e20 ceb02000 800bce58
[   62.929912] 3e80: ce6d9840 cc143440 600c0013 00000000 ffffe000 00000001 ceb03eac 8004469c
[   62.938098] 3ea0: ceb03ef8 ceb03ef4 7eab6f06 cc2f615c ceb03ef8 00000000 00000000 8000ed04
[   62.946283] 3ec0: ceb02000 00000000 7eab6f06 8009b3c0 ceb03ed0 ceb03ed0 ceb03ed8 ceb03ed8
[   62.954468] 3ee0: ceb03ee0 ceb03ee0 cc2f615c cc2f615c 00000000 80093394 7fffffff 00000000
[   62.962653] 3f00: 00000000 00000000 ffffffff 7fffffff 00000000 00000000 ce678400 804c0080
[   62.970838] 3f20: 805e70f3 80093434 ffffffff 7fffffff 00000000 cc2f6090 ce678400 800e82e4
[   62.979023] 3f40: ce678400 ce678464 804c0080 800c48e4 cc2f6000 00000083 807035a0 800c5390
[   62.987208] 3f60: ce678400 806e4e0c 807035a0 800c4778 ce6f0b40 00000000 807035a0 800dc7a4
[   62.995393] 3f80: ce76af80 8003d7e0 8000ed04 ceb02000 ceb03fb0 00000005 8000ed04 8001197c
[   63.003578] 3fa0: 00000241 00e26008 00000000 8000ebcc fffffffb 00020241 000001b6 00000000
[   63.011763] 3fc0: 00000241 00e26008 00000000 00000005 00000000 00000000 000993b5 7eab6f06
[   63.019948] 3fe0: 00000005 7eab6b48 76e611f1 76dee996 200c0030 7eab6f06 00000000 00000000
[   63.028147] [<801e8a14>] (__percpu_counter_add) from [<8009b02c>] (clear_page_dirty_for_io+0xac/0xd8)
[   63.037381] [<8009b02c>] (clear_page_dirty_for_io) from [<8009b250>] (write_cache_pages+0x1f8/0x32c)
[   63.046524] [<8009b250>] (write_cache_pages) from [<8009b3c0>] (generic_writepages+0x3c/0x64)
[   63.055066] [<8009b3c0>] (generic_writepages) from [<80093394>] (__filemap_fdatawrite_range+0x64/0x70)
[   63.064387] [<80093394>] (__filemap_fdatawrite_range) from [<80093434>] (filemap_flush+0x24/0x2c)
[   63.073276] [<80093434>] (filemap_flush) from [<800e82e4>] (sync_filesystem+0x60/0xa8)
[   63.081211] [<800e82e4>] (sync_filesystem) from [<800c48e4>] (generic_shutdown_super+0x28/0xd4)
[   63.089923] [<800c48e4>] (generic_shutdown_super) from [<800c5390>] (kill_block_super+0x18/0x64)
[   63.098721] [<800c5390>] (kill_block_super) from [<800c4778>] (deactivate_locked_super+0x4c/0x7c)
[   63.107610] [<800c4778>] (deactivate_locked_super) from [<800dc7a4>] (cleanup_mnt+0x4c/0x6c)
[   63.116061] [<800dc7a4>] (cleanup_mnt) from [<8003d7e0>] (task_work_run+0xbc/0xe4)
[   63.123648] [<8003d7e0>] (task_work_run) from [<8001197c>] (do_work_pending+0x90/0xa4)
[   63.131578] [<8001197c>] (do_work_pending) from [<8000ebcc>] (work_pending+0xc/0x20)
[   63.139331] Code: e59f00d8 ebfff16c e5943018 ee1d2f90 (e7933002) 
[   63.145432] ---[ end trace 4f8d17d8fa59b935 ]---
[   63.150055] Kernel panic - not syncing: Fatal exception
[   63.155292] CPU2: stopping
[   63.158015] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D         4.1.32-01802-g58326dd-dirty #6
[   63.166978] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   63.173533] [<800152d4>] (unwind_backtrace) from [<80011d24>] (show_stack+0x10/0x14)
[   63.181296] [<80011d24>] (show_stack) from [<804aab04>] (dump_stack+0x70/0x8c)
[   63.188532] [<804aab04>] (dump_stack) from [<80013e4c>] (handle_IPI+0xd0/0x174)
[   63.195853] [<80013e4c>] (handle_IPI) from [<800093d4>] (gic_handle_irq+0x58/0x60)
[   63.203433] [<800093d4>] (gic_handle_irq) from [<800127c0>] (__irq_svc+0x40/0x74)
[   63.210920] Exception stack(0xce097f70 to 0xce097fb8)
[   63.215978] 7f60:                                     ce097fb8 0000000e b420624c 0000000e
[   63.224164] 7f80: 00000000 d0f15ce8 b420624c 0000000e b408cb3c 0000000e 00000000 806da5f4
[   63.232348] 7fa0: 00000009 ce097fb8 8006c71c 802feec0 60070013 ffffffff
[   63.238979] [<800127c0>] (__irq_svc) from [<802feec0>] (cpuidle_enter_state+0xc0/0x1a8)
[   63.247001] [<802feec0>] (cpuidle_enter_state) from [<80054380>] (cpu_startup_entry+0x19c/0x264)
[   63.255796] [<80054380>] (cpu_startup_entry) from [<1000946c>] (0x1000946c)
[   63.262765] CPU3: stopping
[   63.265486] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D         4.1.32-01802-g58326dd-dirty #6
[   63.274449] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   63.280997] [<800152d4>] (unwind_backtrace) from [<80011d24>] (show_stack+0x10/0x14)
[   63.288756] [<80011d24>] (show_stack) from [<804aab04>] (dump_stack+0x70/0x8c)
[   63.295990] [<804aab04>] (dump_stack) from [<80013e4c>] (handle_IPI+0xd0/0x174)
[   63.303310] [<80013e4c>] (handle_IPI) from [<800093d4>] (gic_handle_irq+0x58/0x60)
[   63.310889] [<800093d4>] (gic_handle_irq) from [<800127c0>] (__irq_svc+0x40/0x74)
[   63.318375] Exception stack(0xce099f70 to 0xce099fb8)
[   63.323434] 9f60:                                     ce099fb8 0000000e b420624c 0000000e
[   63.331621] 9f80: 00000000 d0f21ce8 b420624c 0000000e b408d1be 0000000e 00000000 806da5f4
[   63.339806] 9fa0: 00000009 ce099fb8 8006c71c 802feec0 60070013 ffffffff
[   63.346433] [<800127c0>] (__irq_svc) from [<802feec0>] (cpuidle_enter_state+0xc0/0x1a8)
[   63.354450] [<802feec0>] (cpuidle_enter_state) from [<80054380>] (cpu_startup_entry+0x19c/0x264)
[   63.363245] [<80054380>] (cpu_startup_entry) from [<1000946c>] (0x1000946c)
[   63.370212] CPU0: stopping
[   63.372933] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D         4.1.32-01802-g58326dd-dirty #6
[   63.381896] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   63.388443] [<800152d4>] (unwind_backtrace) from [<80011d24>] (show_stack+0x10/0x14)
[   63.396202] [<80011d24>] (show_stack) from [<804aab04>] (dump_stack+0x70/0x8c)
[   63.403436] [<804aab04>] (dump_stack) from [<80013e4c>] (handle_IPI+0xd0/0x174)
[   63.410756] [<80013e4c>] (handle_IPI) from [<800093d4>] (gic_handle_irq+0x58/0x60)
[   63.418335] [<800093d4>] (gic_handle_irq) from [<800127c0>] (__irq_svc+0x40/0x74)
[   63.425822] Exception stack(0x806d9f38 to 0x806d9f80)
[   63.430879] 9f20:                                                       806d9f80 0000000e
[   63.439065] 9f40: b420624c 0000000e 00000000 d0efdce8 b420624c 0000000e b408e011 0000000e
[   63.447251] 9f60: 00000000 806da5f4 00000009 806d9f80 8006c71c 802feec0 60070013 ffffffff
[   63.455443] [<800127c0>] (__irq_svc) from [<802feec0>] (cpuidle_enter_state+0xc0/0x1a8)
[   63.463462] [<802feec0>] (cpuidle_enter_state) from [<80054380>] (cpu_startup_entry+0x19c/0x264)
[   63.472270] [<80054380>] (cpu_startup_entry) from [<806a0b98>] (start_kernel+0x2dc/0x334)
[   63.480457] Rebooting in 60 seconds..
Comment 1 Francesco Dolcini 2016-09-28 10:35:49 UTC
Created attachment 239921 [details]
imx6q kernel config
Comment 2 Francesco Dolcini 2016-09-28 10:37:28 UTC
Created attachment 239931 [details]
mdev usb mount/umount script
Comment 3 Francesco Dolcini 2016-09-28 10:50:27 UTC
I forget to mention explicitly the command used to trigger the bug:

rootfs with /media/usb/.not_mounted

mount USB flash drive (from mdev) in /media/usb/ with dirsync option

while true; do [ ! -f /media/usb/.not_mounted ] && dd if=/dev/zero of=/media/usb/aaa bs=1k count=1 2>/dev/null ; done

remove USB flash, mdev does umount -l, kernel oops
Comment 4 Francesco Dolcini 2016-09-28 11:27:30 UTC
Maybe related to this https://lkml.org/lkml/2015/11/6/422 ?
Comment 5 Francesco Dolcini 2016-10-04 10:25:46 UTC
I was able to bisect the problem to this commit:

6cd18e711dd8075da9d78cfc1239f912ff28968a is the first bad commit
commit 6cd18e711dd8075da9d78cfc1239f912ff28968a
Author: NeilBrown <neilb@suse.de>
Date:   Mon Apr 27 14:12:22 2015 +1000

    block: destroy bdi before blockdev is unregistered.
    
    Because of the peculiar way that md devices are created (automatically
    when the device node is opened), a new device can be created and
    registered immediately after the
    	blk_unregister_region(disk_devt(disk), disk->minors);
    call in del_gendisk().
    
    Therefore it is important that all visible artifacts of the previous
    device are removed before this call.  In particular, the 'bdi'.
    
    Since:
    commit c4db59d31e39ea067c32163ac961e9c80198fd37
    Author: Christoph Hellwig <hch@lst.de>
        fs: don't reassign dirty inodes to default_backing_dev_info
    
    moved the
       device_unregister(bdi->dev);
    call from bdi_unregister() to bdi_destroy() it has been quite easy to
    lose a race and have a new (e.g.) "md127" be created after the
    blk_unregister_region() call and before bdi_destroy() is ultimately
    called by the final 'put_disk', which must come after del_gendisk().
    
    The new device finds that the bdi name is already registered in sysfs
    and complains
    
    > [ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5a/0x70()
    > [ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
    
    We can fix this by moving the bdi_destroy() call out of
    blk_release_queue() (which can happen very late when a refcount
    reaches zero) and into blk_cleanup_queue() - which happens exactly when the md
    device driver calls it.
    
    Then it is only necessary for md to call blk_cleanup_queue() before
    del_gendisk().  As loop.c devices are also created on demand by
    opening the device node, we make the same change there.
    
    Fixes: c4db59d31e39ea067c32163ac961e9c80198fd37
    Reported-by: Azat Khuzhin <a3at.mail@gmail.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: stable@vger.kernel.org (v4.0)
    Signed-off-by: NeilBrown <neilb@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>

:040000 040000 2fb99f8a362b77870234926f19841d04386275b7 d6796c472b877521e0aaa73e05ee4d17ab1c969e M	block
:040000 040000 58aa31fb3fadd58672e66cd478b3719138987d3b 89e7fe93421748d3e566a93b8e0261677510dfcf M	drivers
Comment 6 zhuangzhuang 2017-05-23 03:28:59 UTC
(In reply to Francesco Dolcini from comment #5)
> I was able to bisect the problem to this commit:
> 
> 6cd18e711dd8075da9d78cfc1239f912ff28968a is the first bad commit
> commit 6cd18e711dd8075da9d78cfc1239f912ff28968a
> Author: NeilBrown <neilb@suse.de>
> Date:   Mon Apr 27 14:12:22 2015 +1000
> 
>     block: destroy bdi before blockdev is unregistered.
>     
>     Because of the peculiar way that md devices are created (automatically
>     when the device node is opened), a new device can be created and
>     registered immediately after the
>       blk_unregister_region(disk_devt(disk), disk->minors);
>     call in del_gendisk().
>     
>     Therefore it is important that all visible artifacts of the previous
>     device are removed before this call.  In particular, the 'bdi'.
>     
>     Since:
>     commit c4db59d31e39ea067c32163ac961e9c80198fd37
>     Author: Christoph Hellwig <hch@lst.de>
>         fs: don't reassign dirty inodes to default_backing_dev_info
>     
>     moved the
>        device_unregister(bdi->dev);
>     call from bdi_unregister() to bdi_destroy() it has been quite easy to
>     lose a race and have a new (e.g.) "md127" be created after the
>     blk_unregister_region() call and before bdi_destroy() is ultimately
>     called by the final 'put_disk', which must come after del_gendisk().
>     
>     The new device finds that the bdi name is already registered in sysfs
>     and complains
>     
>     > [ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31
> sysfs_warn_dup+0x5a/0x70()
>     > [ 9627.630032] sysfs: cannot create duplicate filename
> '/devices/virtual/bdi/9:127'
>     
>     We can fix this by moving the bdi_destroy() call out of
>     blk_release_queue() (which can happen very late when a refcount
>     reaches zero) and into blk_cleanup_queue() - which happens exactly when
> the md
>     device driver calls it.
>     
>     Then it is only necessary for md to call blk_cleanup_queue() before
>     del_gendisk().  As loop.c devices are also created on demand by
>     opening the device node, we make the same change there.
>     
>     Fixes: c4db59d31e39ea067c32163ac961e9c80198fd37
>     Reported-by: Azat Khuzhin <a3at.mail@gmail.com>
>     Cc: Christoph Hellwig <hch@lst.de>
>     Cc: stable@vger.kernel.org (v4.0)
>     Signed-off-by: NeilBrown <neilb@suse.de>
>     Reviewed-by: Christoph Hellwig <hch@lst.de>
>     Signed-off-by: Jens Axboe <axboe@fb.com>
> 
> :040000 040000 2fb99f8a362b77870234926f19841d04386275b7
> d6796c472b877521e0aaa73e05ee4d17ab1c969e M    block
> :040000 040000 58aa31fb3fadd58672e66cd478b3719138987d3b
> 89e7fe93421748d3e566a93b8e0261677510dfcf M    drivers

can we solve this problem by revert this patch?
Comment 7 Marcos Souza 2019-02-01 01:13:42 UTC
Was this bug fixes already? Can you still reproduce this bug with a newer kernel?
Comment 8 Marcos Souza 2019-02-04 21:10:25 UTC
Per Francesco private message, this bug does not happen in current mainline. Closing.

Note You need to log in before you can comment on or make changes to this bug.