Bug 8276

Summary: ufs (ufstype=ufs2) oops on unmount
Product: File System
Reporter: Jim Paris (jim)
Component: Other
Assignee: Evgeniy A. Dushistov (dushistov)
Status: CLOSED CODE_FIX    
Severity: normal
CC: dushistov, protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.20.1
Tree: Mainline
Regression: ---
Attachments: objdump -S ufs.ko
objdump -S ufs.ko
ufs_unmount_zero_link fix

Description Jim Paris 2007-03-28 10:13:15 UTC
Most recent kernel where this bug did *NOT* occur:
Distribution: Debian
Hardware Environment: x86_64 SMP
Software Environment: 
Problem Description:
Steps to reproduce:

$ file fs.img
fs.img: Unix Fast File system [v2] (big-endian) last mounted on /cell_mw_cfs,
last written at Sun Mar 25 05:49:34 2007, clean flag 1, readonly flag 0, number
of blocks 5242878, number of data blocks 5077077, number of cylinder groups 56,
block size 16384, fragment size 2048, average file size 16384, average number of
files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide
uuid 0, minimum percentage of free blocks 8, TIME optimization
$ mount -o loop,ufstype=ufs2,ro fs.img tmp/
$ find tmp/

This hits a few errors, and dmesg shows the following:

[827280.036853] UFS-fs error (device loop1): ufs_check_page: bad entry in
directory #141312: unaligned directory entry - offset=0, rec_len=61846, name_len=144
[827280.036917] UFS-fs error (device loop1): ufs_readdir: bad page in #141312
[827280.054615] UFS-fs error (device loop1): ufs_read_inode: inode 6 has zero nlink
[827280.054616]
[827280.054662] init_special_inode: bogus i_mode (0)
[827280.054691] UFS-fs error (device loop1): ufs_read_inode: inode 8 has zero nlink
[827280.054692]
[827280.054736] init_special_inode: bogus i_mode (0)
[827280.054755] UFS-fs error (device loop1): ufs_read_inode: inode 10 has zero nlink
[827280.054756]
[827280.054800] init_special_inode: bogus i_mode (0)
[827280.054817] UFS-fs error (device loop1): ufs_read_inode: inode 11 has zero nlink
[827280.054819]
[827280.054863] init_special_inode: bogus i_mode (0)
[827280.054882] UFS-fs error (device loop1): ufs_read_inode: inode 12 has zero nlink
[827280.054883]
[827280.054927] init_special_inode: bogus i_mode (0)
[827280.054946] UFS-fs error (device loop1): ufs_read_inode: inode 15 has zero nlink
[827280.054947]
[827280.054991] init_special_inode: bogus i_mode (0)
[827280.055019] UFS-fs error (device loop1): ufs_read_inode: inode 20 has zero nlink
[827280.055021]
[827280.055065] init_special_inode: bogus i_mode (0)
[827280.055083] UFS-fs error (device loop1): ufs_read_inode: inode 22 has zero nlink
[827280.055084]
[827280.055128] init_special_inode: bogus i_mode (0)
[827280.084792] UFS-fs error (device loop1): ufs_read_inode: inode 23561 has
zero nlink
[827280.084793]
[827280.084842] init_special_inode: bogus i_mode (0)
[827280.084863] UFS-fs error (device loop1): ufs_read_inode: inode 23562 has
zero nlink
[827280.084865]
[827280.084910] init_special_inode: bogus i_mode (0)
[827280.084930] UFS-fs error (device loop1): ufs_read_inode: inode 23563 has
zero nlink
[827280.084931]
[827280.084975] init_special_inode: bogus i_mode (0)
[827280.084993] UFS-fs error (device loop1): ufs_read_inode: inode 23564 has
zero nlink
[827280.084995]
[827280.085039] init_special_inode: bogus i_mode (0)
[827280.085056] UFS-fs error (device loop1): ufs_read_inode: inode 23565 has
zero nlink
[827280.085058]
[827280.085102] init_special_inode: bogus i_mode (0)
[827280.086067] UFS-fs error (device loop1): ufs_check_page: bad entry in
directory #1248485: unaligned directory entry - offset=0, rec_len=56645,
name_len=252
[827280.086124] UFS-fs error (device loop1): ufs_readdir: bad page in #1248485
[827280.086471] UFS-fs error (device loop1): ufs_check_page: bad entry in
directory #1271989: directory entry across blocks - offset=0, rec_len=52012,
name_len=245
[827280.086543] UFS-fs error (device loop1): ufs_readdir: bad page in #1271989
[827280.086725] UFS-fs error (device loop1): ufs_read_inode: inode 23603 has
zero nlink
[827280.086727]
[827280.086771] init_special_inode: bogus i_mode (0)
[827280.117270] UFS-fs error (device loop1): ufs_read_inode: inode 94208 has
zero nlink
[827280.117272]
[827280.117317] init_special_inode: bogus i_mode (0)
[827280.241836] UFS-fs error (device loop1): ufs_check_page: bad entry in
directory #423936: directory entry across blocks - offset=0, rec_len=65280,
name_len=0
[827280.241894] UFS-fs error (device loop1): ufs_readdir: bad page in #423936
[827280.355618] UFS-fs error (device loop1): ufs_read_inode: inode 541696 has
zero nlink
[827280.355620]
[827280.355666] init_special_inode: bogus i_mode (0)

That part is not completely unexpected; the image might be corrupted or have
some non-standard structure.  But unmounting the filesystem triggers an oops:

[827297.979496] Unable to handle kernel NULL pointer dereference at
00000000000000b8 RIP:
[827297.979514]  [<ffffffff88393735>] :ufs:ufs_read_cylinder+0x41/0x2c5
[827297.979585] PGD b40cf067 PUD b34e9067 PMD 0
[827297.979615] Oops: 0000 [1] SMP
[827297.979641] CPU 0
[827297.979663] Modules linked in: ufs nls_iso8859_1 isofs i2c_isa w83793
hwmon_vid nfs nfsd exportfs lockd nfs_acl sunrpc button ac battery ipv6
iptable_raw xt_policy xt_multiport ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos
ipt_TCPMSS ipt_SAME ipt_REJECT ipt_REDIRECT ipt_recent ipt_owner ipt_NETMAP
ipt_MASQUERADE ipt_LOG ipt_iprange ipt_ECN ipt_ecn ipt_ah ipt_addrtype xt_tcpmss
xt_pkttype xt_physdev xt_NFQUEUE xt_MARK xt_mark xt_mac xt_limit xt_length
xt_helper xt_dccp xt_conntrack xt_CLASSIFY xt_tcpudp xt_state iptable_nat nf_nat
nf_conntrack_ipv4 nf_conntrack iptable_mangle nfnetlink iptable_filter ip_tables
x_tables raid456 xor kvm_intel kvm fuse loop evdev parport_pc parport shpchp
pci_hotplug floppy i2c_i801 i2c_core pcspkr ext3 jbd mbcache dm_mirror
dm_snapshot dm_mod raid1 md_mod ide_generic ata_generic ata_piix sd_mod piix
generic ide_core ahci sata_sil24 e1000 ehci_hcd uhci_hcd thermal processor fan
[827297.980157] Pid: 11019, comm: umount Not tainted 2.6.20.1 #1
[827297.980185] RIP: 0010:[<ffffffff88393735>]  [<ffffffff88393735>]
:ufs:ufs_read_cylinder+0x41/0x2c5
[827297.980238] RSP: 0018:ffff810024ad9d08  EFLAGS: 00010246
[827297.980265] RAX: 0000000000000000 RBX: ffff81013939ac00 RCX: 0000000000000000
[827297.980309] RDX: 0000000000000000 RSI: ffff81013b4e5c00 RDI: 00000000000000b8
[827297.980353] RBP: 0000000000000000 R08: 0000000000000024 R09: 0000000000000000
[827297.980397] R10: ffff81013cf5e528 R11: 00000000fffffffa R12: ffff810136a4b800
[827297.980442] R13: ffff810136a4b800 R14: ffff81013939ac00 R15: 0000000000000017
[827297.980487] FS:  00002ace8cbed1d0(0000) GS:ffffffff804fb000(0000)
knlGS:0000000000000000
[827297.980532] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[827297.980560] CR2: 00000000000000b8 CR3: 00000000b34f6000 CR4: 00000000000026e0
[827297.980605] Process umount (pid: 11019, threadinfo ffff810024ad8000, task
ffff81013b42b040)
[827297.980650] Stack:  ffff810100000000 ffff81013939ac00 ffff810025753a18
0000000000000017
[827297.980705]  ffff810136a4b800 0000000000513c20 0000000000005c00 ffffffff88393bc2
[827297.980756]  ffff810025753a18 0000000000084400 ffff810025753a18 ffff810136a4b800
[827297.980790] Call Trace:
[827297.980836]  [<ffffffff88393bc2>] :ufs:ufs_load_cylinder+0x12c/0x13a
[827297.980872]  [<ffffffff8839511b>] :ufs:ufs_free_inode+0x8f/0x308
[827297.980908]  [<ffffffff88395dee>] :ufs:ufs_delete_inode+0x0/0x93
[827297.980942]  [<ffffffff88395e7b>] :ufs:ufs_delete_inode+0x8d/0x93
[827297.980974]  [<ffffffff8022c05e>] generic_delete_inode+0xc0/0x136
[827297.981007]  [<ffffffff802c0999>] shrink_dcache_for_umount_subtree+0x203/0x250
[827297.981055]  [<ffffffff802c0e95>] shrink_dcache_for_umount+0x2f/0x3d
[827297.981087]  [<ffffffff802bac62>] generic_shutdown_super+0x19/0xf1
[827297.981119]  [<ffffffff802bad60>] kill_block_super+0x26/0x3b
[827297.981149]  [<ffffffff802bae2f>] deactivate_super+0x6a/0x83
[827297.981180]  [<ffffffff802c2e0f>] sys_umount+0x247/0x27d
[827297.981212]  [<ffffffff802215d3>] sys_newstat+0x19/0x31
[827297.981250]  [<ffffffff8025611e>] system_call+0x7e/0x83
[827297.981284]
[827297.981303]
[827297.981304] Code: 48 8b 04 38 4c 8b 68 28 75 0c 44 89 f8 0f af 86 14 01 00 00
[827297.981404] RIP  [<ffffffff88393735>] :ufs:ufs_read_cylinder+0x41/0x2c5
[827297.981438]  RSP <ffff810024ad9d08>
[827297.981461] CR2: 00000000000000b8
[827297.981821]  BUG: at kernel/exit.c:860 do_exit()
[827297.981939]
[827297.981940] Call Trace:
[827297.982067]  [<ffffffff802135ca>] do_exit+0x51/0x7ff
[827297.982138]  [<ffffffff8025c84b>] _spin_unlock_irqrestore+0x8/0x9
[827297.982213]  [<ffffffff8020aa0b>] do_page_fault+0x72d/0x7ad
[827297.982291]  [<ffffffff802371dd>] d_instantiate+0x52/0x8a
[827297.982362]  [<ffffffff8023c61c>] d_rehash+0x21/0x34
[827297.982432]  [<ffffffff8022d703>] d_splice_alias+0x114/0x11c
[827297.982511]  [<ffffffff8025c9fd>] error_exit+0x0/0x84
[827297.982592]  [<ffffffff88393735>] :ufs:ufs_read_cylinder+0x41/0x2c5
[827297.982673]  [<ffffffff88393bc2>] :ufs:ufs_load_cylinder+0x12c/0x13a
[827297.982753]  [<ffffffff8839511b>] :ufs:ufs_free_inode+0x8f/0x308
[827297.982832]  [<ffffffff88395dee>] :ufs:ufs_delete_inode+0x0/0x93
[827297.982908]  [<ffffffff88395e7b>] :ufs:ufs_delete_inode+0x8d/0x93
[827297.982981]  [<ffffffff8022c05e>] generic_delete_inode+0xc0/0x136
[827297.983054]  [<ffffffff802c0999>] shrink_dcache_for_umount_subtree+0x203/0x250
[827297.983145]  [<ffffffff802c0e95>] shrink_dcache_for_umount+0x2f/0x3d
[827297.983218]  [<ffffffff802bac62>] generic_shutdown_super+0x19/0xf1
[827297.983292]  [<ffffffff802bad60>] kill_block_super+0x26/0x3b
[827297.983365]  [<ffffffff802bae2f>] deactivate_super+0x6a/0x83
[827297.983437]  [<ffffffff802c2e0f>] sys_umount+0x247/0x27d
[827297.983512]  [<ffffffff802215d3>] sys_newstat+0x19/0x31
[827297.983591]  [<ffffffff8025611e>] system_call+0x7e/0x83

A friend tells me that he got an oops doing the same thing on a 32-bit UP
system, but I'm not sure of the exact kernel version (probably 2.6.18).
Comment 1 Evgeniy A. Dushistov 2007-03-28 13:16:11 UTC
>$ file fs.img

Thanks for reporting this.

Maybe I'm missing something, but
is it possible to get this image somewhere?
Comment 2 Jim Paris 2007-03-28 13:29:24 UTC
Unfortunately I can't share it, and I haven't reproduced it with other images.
I can probably pull out pieces if there is some particular table or entry you'd
like to see, or if you'd like any extra debugging added to ufs I can do that too.
Comment 3 Evgeniy A. Dushistov 2007-03-28 14:20:55 UTC
>Unfortunately I can't share it, and I haven't reproduced it with other images.
>I can probably pull out pieces if there is some particular table or entry you'd
>like to see, or if you'd like any extra debugging added to ufs I can do that 
>too.


There is a UFS option available at kernel build time:

>UFS debugging (UFS_DEBUG)
>
>If you are experiencing any problems with the UFS filesystem, say
>Y here. This will result in _many_ additional debugging messages to be
>written to the system log.

Can you turn it on and post the result here?
The messages should appear somewhere like /var/log/messages.


And before any rebuild of the kernel, can you post here
the result of "objdump -S path/to/ufs/module"?
This should help "parse" the oops message.
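
As a rough illustration of what "parsing" the oops involves: the RIP line names the faulting function and the byte offset into it, and that offset is what gets looked up in the objdump listing. The sketch below extracts those fields from a trace line. The names RIP_RE and parse_rip are made up for this example, and the pattern is only assumed to cover the symbol format seen in this report.

```python
import re

# Matches the symbol part of an oops trace line, for example
#   ":ufs:ufs_read_cylinder+0x41/0x2c5"
# The ":module:" prefix is optional because core-kernel symbols lack it.
# RIP_RE and parse_rip are hypothetical helper names for this sketch.
RIP_RE = re.compile(
    r"(?::(?P<module>\w+):)?"
    r"(?P<func>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_rip(line):
    """Return (module, function, offset, size), or None if nothing matches."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return (
        m.group("module"),
        m.group("func"),
        int(m.group("off"), 16),   # byte offset of the faulting instruction
        int(m.group("size"), 16),  # total size of the function in bytes
    )


if __name__ == "__main__":
    line = "[827297.979514]  [<ffffffff88393735>] :ufs:ufs_read_cylinder+0x41/0x2c5"
    print(parse_rip(line))
```

Adding the parsed offset to the start address of ufs_read_cylinder in the "objdump -S" output should point at the faulting instruction and the source line interleaved with it.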
Comment 4 Jim Paris 2007-03-28 18:03:41 UTC
Created attachment 10984
objdump -S ufs.ko
Comment 5 Jim Paris 2007-03-28 18:06:16 UTC
Created attachment 10985
objdump -S ufs.ko
Comment 6 Jim Paris 2007-03-28 18:07:57 UTC
Disassembly added.  I'll want to move to a different machine before I start
triggering the oops on purpose, so it might take me a few days to get the
debugging output.
Comment 7 Evgeniy A. Dushistov 2007-04-08 04:46:27 UTC
>Disassembly added.  

Thanks.
After looking at the trace:

[827297.980942]  [<ffffffff88395e7b>] :ufs:ufs_delete_inode+0x8d/0x93
[827297.980974]  [<ffffffff8022c05e>] generic_delete_inode+0xc0/0x136

the problem happened because:
a) the filesystem was mounted read-only,
b) the filesystem was damaged, and an inode had a zero link count,
c) ufs_delete_inode was called.

Can you try this patch?
Comment 8 Evgeniy A. Dushistov 2007-04-08 04:48:23 UTC
Created attachment 11104
ufs_unmount_zero_link fix

This patch should fix the oops on unmount when the cache contains an inode with
zero links and the filesystem is mounted read-only.
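
The attachment itself is not reproduced in this report. As a toy illustration of the failure condition (not kernel code, and not the actual patch), the Python model below shows a zero-link inode being evicted at unmount, once without and once with a read-only check. All class and function names are hypothetical. The model also assumes, purely for illustration, that allocation state is only set up on read-write mounts.

```python
class SuperBlock:
    """Hypothetical stand-in for a mounted filesystem."""

    def __init__(self, read_only):
        self.read_only = read_only
        # Assumption of this model: allocation state exists only on
        # read-write mounts. Touching it on a read-only mount plays the
        # role of the NULL pointer dereference seen in the oops.
        self.alloc_state = None if read_only else {"free_inodes": set()}


class Inode:
    """Hypothetical stand-in for a cached inode."""

    def __init__(self, ino, nlink):
        self.ino = ino
        self.nlink = nlink


def free_inode(sb, inode):
    # Raises TypeError when alloc_state is None (read-only mount).
    sb.alloc_state["free_inodes"].add(inode.ino)


def delete_inode_unguarded(sb, inode):
    # Frees unconditionally, as in the failing case.
    free_inode(sb, inode)
    return True


def delete_inode_guarded(sb, inode):
    # Skips the on-disk free when the filesystem is read-only.
    if sb.read_only:
        return False
    free_inode(sb, inode)
    return True


def evict(sb, inode, delete):
    """Mimic cache eviction at unmount: zero-link inodes get deleted."""
    if inode.nlink == 0:
        return delete(sb, inode)
    return False
```

In this model, evicting a zero-link inode on a read-only SuperBlock raises TypeError with delete_inode_unguarded and is a harmless no-op with delete_inode_guarded.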
Comment 9 Jim Paris 2007-04-15 12:47:46 UTC
I tested it: ufs-proper-handling-of-zero-link-case.patch
fixes the oops on unmount.  Thank you!
Comment 10 Natalie Protasevich 2007-07-06 23:22:04 UTC
It looks like the patch is in the tree, so the bug can be closed.
Thanks.