Bug 199825 - XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0, file: fs/xfs/xfs_super.c, line: 984
Summary: XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks...
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: XFS (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: FileSystem/XFS Default Virtual Assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-24 17:47 UTC by Zorro Lang
Modified: 2018-05-25 04:07 UTC (History)
0 users

See Also:
Kernel Version: 4.17-rc6
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Zorro Lang 2018-05-24 17:47:39 UTC
Description:

Just hit a XFS crash, when tested XFS on a 500T storage. The kernel is 4.17-rc6 (HEAD is bee797529d7c1ea4e2803fda067d20edbc00bc3d), the xfsprogs version is 4.16.1 (HEAD is 27c3aeb25d1ef93c63e46443d73503353ba5e1a5)

 kernel: XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0, file: fs/xfs/xfs_super.c, line: 984
 kernel: WARNING: CPU: 39 PID: 487 at fs/xfs/xfs_message.c:105 asswarn+0x1e/0x30 [xfs]
 kernel: Modules linked in: loop intel_powerclamp coretemp kvm_intel sunrpc kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd cryptd glue_helper iT\
CO_wdt gpio_ich iTCO_vendor_support dm_service_time pcspkr ipmi_ssif ipmi_si ipmi_devintf hpilo hpwdt sg i7core_edac lpc_ich ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq dm_multipath \
ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_piix sd_mod drm mpt3sas libata crc32c_inte\
l serio_raw hpsa netxen_nic i2c_core raid_class scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
 kernel: CPU: 39 PID: 487 Comm: kswapd0 Not tainted 4.17.0-rc6+ #2
 kernel: Hardware name: HP ProLiant DL580 G7, BIOS P65 10/01/2013
 kernel: RIP: 0010:asswarn+0x1e/0x30 [xfs]
 kernel: RSP: 0018:ffff9b9887d9fbc0 EFLAGS: 00010246
 kernel: RAX: 0000000000000000 RBX: ffff882b91e009d8 RCX: 0000000000000000
 kernel: RDX: ffff9b9887d9fae0 RSI: 000000000000000a RDI: ffffffffc05ecff9
 kernel: RBP: ffff882b91e008c8 R08: 0000000000000000 R09: 0000000000000021
 kernel: R10: 0000000000000000 R11: 000000000000000a R12: ffff882b91e00780
 kernel: R13: 000000000000ce58 R14: 000000000000000c R15: 00000000000000e5
 kernel: FS:  0000000000000000(0000) GS:ffff882f6fbc0000(0000) knlGS:0000000000000000
 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 kernel: CR2: 00007f832588eff0 CR3: 0000000a2dc0a002 CR4: 00000000000206e0
 kernel: Call Trace:
 kernel: xfs_fs_destroy_inode+0x1db/0x200 [xfs]
 kernel: dispose_list+0x48/0x60
 kernel: prune_icache_sb+0x52/0x70
 kernel: super_cache_scan+0x139/0x190
 kernel: shrink_slab+0x249/0x430
 kernel: shrink_node+0x44a/0x490
 kernel: balance_pgdat+0x188/0x360
 kernel: kswapd+0x171/0x410
 kernel: ? remove_wait_queue+0x60/0x60
 kernel: kthread+0xf8/0x130
 kernel: ? balance_pgdat+0x360/0x360
 kernel: ? kthread_bind+0x10/0x10
 kernel: ret_from_fork+0x35/0x40
 kernel: Code: 99 f9 ff ff 0f 0b e8 c2 d4 4e c8 66 90 66 66 66 66 90 48 89 f1 41 89 d0 48 c7 c6 08 7f 5f c0 48 89 fa 31 c0 31 ff e8 b2 fc ff ff <0f> 0b c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00\
 00 00 00 66 66 66
 kernel: ---[ end trace 6d72efdc406c28bc ]---

Steps to hit this bug (I'm not sure all steps is necessary):
1. mkfs.xfs -m crc=1,finobt=1,rmapbt=1,reflink=1 -i sparse=1 -d size=100t $dev
2. fallocate a 100t file
3. punch holes on above big file randomly, to free a few space.
4. run fs_mark to take some space.
5. keep running fsstress, even if the fs is full, keep running for a while.
6. repeat step#3 to #5 100 times.
7. growfs to 300T
8. fallocate a 200t file
9. repeat step#3 to #5 100 times
10. growfs to 500T
11. fallocate a 200T file
12. repeat step#3 to #5 100 times
(XFS shutdown at here)
13. umount XFS, and do fsck
(a XFS corruption can be found at here)

Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
Memory available for repair (48300MB) may not be sufficient.
At least 64121MB is needed to repair this filesystem efficiently
If repair fails due to lack of memory, please
turn prefetching off (-P) to reduce the memory footprint.
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
sb_fdblocks 2570207456, counted 2570207543
        - 12:28:55: scanning filesystem freespace - 501 of 501 allocation groups done
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - 12:28:55: scanning agi unlinked lists - 501 of 501 allocation groups done
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 30
        - agno = 15
        - agno = 60
        - agno = 45
        - agno = 75
        - agno = 90
....
....
Comment 1 Zorro Lang 2018-05-25 04:07:54 UTC
This assert failure can be reproduced after disable rmapbt, reflink, finobt and sparse inodes (the xfs didn't be shutdown, all fsstress processes can't be killed, they are basically hammering xfs_inode_ag_walk() and slamming on eofblocks scans because xfs is full):

# xfs_info /mnt/500Test/
meta-data=/dev/mapper/VG500T-LV500T isize=512    agcount=301, agsize=268435455 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=80530636800, imaxpct=1
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


[17641.803028] XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0, file: fs/xfs/xfs_super.c, line: 984
[17641.866008] WARNING: CPU: 58 PID: 490 at fs/xfs/xfs_message.c:105 asswarn+0x1e/0x30 [xfs]                                                                                                   
[17641.909106] Modules linked in: loop intel_powerclamp coretemp kvm_intel sunrpc kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd cryptd glue_help
er iTCO_wdt gpio_ich iTCO_vendor_support dm_service_time pcspkr ipmi_ssif ipmi_si ipmi_devintf hpilo hpwdt sg i7core_edac lpc_ich ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq dm_multip
ath ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_piix sd_mod drm mpt3sas libata crc32c_i
ntel serio_raw hpsa netxen_nic i2c_core raid_class scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod                                                                                   
[17642.206190] CPU: 58 PID: 490 Comm: kswapd3 Tainted: G        W         4.17.0-rc6+ #2                                                                                                       
[17642.243780] Hardware name: HP ProLiant DL580 G7, BIOS P65 10/01/2013                                                                                                                        
[17642.276382] RIP: 0010:asswarn+0x1e/0x30 [xfs]                                                                                                                                               
[17642.296899] RSP: 0018:ffff9b9887db7bc0 EFLAGS: 00010246                                                                                                                                     
[17642.322266] RAX: 0000000000000000 RBX: ffff883a8b98ef18 RCX: 0000000000000000                                                                                                               
[17642.360463] RDX: ffff9b9887db7ae0 RSI: 000000000000000a RDI: ffffffffc05ecff9                                                                                                               
[17642.397232] RBP: ffff883a8b98ee08 R08: 0000000000000000 R09: 0000000000000021
[17642.434908] R10: 0000000000000000 R11: 000000000000000a R12: ffff883a8b98ecc0
[17642.471571] R13: 000000000000b626 R14: 0000000000000030 R15: 000000000000009c
[17642.509493] FS:  0000000000000000(0000) GS:ffff883b6f280000(0000) knlGS:0000000000000000
[17642.547861] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                                                                                                                       [9/9036]
[17642.577162] CR2: 0000000000ba9458 CR3: 0000000a2dc0a003 CR4: 00000000000206e0
[17642.612301] Call Trace:
[17642.624598]  xfs_fs_destroy_inode+0x1db/0x200 [xfs]
[17642.650251]  dispose_list+0x48/0x60
[17642.666599]  prune_icache_sb+0x52/0x70
[17642.684140]  super_cache_scan+0x139/0x190
[17642.703757]  shrink_slab+0x249/0x430
[17642.722686]  shrink_node+0x44a/0x490
[17642.739528]  balance_pgdat+0x188/0x360
[17642.757128]  kswapd+0x171/0x410
[17642.772346]  ? remove_wait_queue+0x60/0x60
[17642.793224]  kthread+0xf8/0x130
[17642.807889]  ? balance_pgdat+0x360/0x360
[17642.826241]  ? kthread_bind+0x10/0x10
[17642.844241]  ret_from_fork+0x35/0x40
[17642.862674] Code: 99 f9 ff ff 0f 0b e8 c2 d4 4e c8 66 90 66 66 66 66 90 48 89 f1 41 89 d0 48 c7 c6 08 7f 5f c0 48 89 fa 31 c0 31 ff e8 b2 fc ff ff <0f> 0b c3 0f 1f 44 00 00 66 2e 0f 1f 84
00 00 00 00 00 66 66 66

# cat /proc/17313/stack 
[<0>] xfs_free_eofblocks+0xa4/0x200 [xfs]
[<0>] xfs_inode_free_eofblocks+0x123/0x140 [xfs]
[<0>] xfs_inode_ag_walk.isra.20+0x19d/0x410 [xfs]
[<0>] xfs_inode_ag_iterator_tag+0x79/0xb0 [xfs]
[<0>] xfs_file_buffered_aio_write+0x163/0x2b0 [xfs]
[<0>] xfs_file_write_iter+0xf7/0x140 [xfs]
[<0>] __vfs_write+0x100/0x180
[<0>] vfs_write+0xad/0x1a0
[<0>] ksys_write+0x52/0xc0
[<0>] do_syscall_64+0x5b/0x180
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
[root@hp-dl580g7-04 ~]# cat /proc/17313/stack 
[<0>] 0xffffffffffffffff

# cat /proc/17313/stack 
[<0>] xfs_inode_ag_walk.isra.20+0x23c/0x410 [xfs]
[<0>] xfs_inode_ag_iterator_tag+0x79/0xb0 [xfs]
[<0>] xfs_file_buffered_aio_write+0x163/0x2b0 [xfs]
[<0>] xfs_file_write_iter+0xf7/0x140 [xfs]
[<0>] __vfs_write+0x100/0x180
[<0>] vfs_write+0xad/0x1a0
[<0>] ksys_write+0x52/0xc0
[<0>] do_syscall_64+0x5b/0x180
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

Note You need to log in before you can comment on or make changes to this bug.