Bug 42763
| Summary: | directory access hangs without error | | |
|---|---|---|---|
| Product: | File System | Reporter: | Eric Buddington (ebuddington) |
| Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | alan, jack, sandeen |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.2.5 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | C program to trigger the bug; Fix missed wakeup on I_NEW | | |
Description
Eric Buddington 2012-02-13 03:41:27 UTC
More info: After a reboot (and a forced but uneventful fsck), the problem is gone until I run my program again. The program consists of multiple threads traversing and stat()ing different directory trees simultaneously. The program will actually work fine for several iterations before getting stuck, but once it gets stuck, that directory is a trap for all processes until reboot.

Created attachment 72366 [details]
C program to trigger the bug

Compile with `--std=gnu99 -lpthread -lrt`. Invocation is `./edu <directory> <number of threads>`. It triggers the bug after about a dozen iterations. The program will stall showing "1a", indicating one still-active thread, and fail to exit.
Did fsck ever find any problems with the filesystem? I'm not clear on whether a simple reboot will clear it up, or if it requires a fsck to get things happy again.

fsck did not find any problems with the filesystem. I rebooted without any fsck and found the same symptoms: a couple dozen successful runs, then a hang (this time, it seems to be on two directories simultaneously). I have been dropping caches between runs. I haven't tried to trigger the bug without dropping caches.

Ok, I'm traveling w/ limited time to try to reproduce, but thanks for all the info. A sysrq-W or sysrq-T when it's hung might show us where the threads are at. -Eric

The stuck threads look like this:

```
edu             D c023a2f4     0  9912      1 0x00000004
 f50b2b80 00000086 00000000 c023a2f4 f7b2b400 d5350000 c09f6d80 00000000
 c09f6d80 c1c5f500 0000000a c33dbee0 c023f172 00000000 d53515cc c33dbee0
 000015cc d5352000 c8c4b4a4 c33dbee0 c1c5f500 f0e05dac c01558a1 00000246
Call Trace:
 [<c023a2f4>] ? ext4_getblk+0x8b/0x13d
 [<c023f172>] ? search_dirblock+0x76/0xaf
 [<c01558a1>] ? arch_local_irq_save+0xf/0x14
 [<c0651740>] ? _raw_spin_lock_irqsave+0x8/0x2c
 [<c01c2cc3>] ? inode_wait+0x5/0x8
 [<c0650c36>] ? __wait_on_bit+0x2f/0x54
 [<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
 [<c0650cba>] ? out_of_line_wait_on_bit+0x5f/0x67
 [<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
 [<c014532b>] ? autoremove_wake_function+0x2f/0x2f
 [<c01c3610>] ? wait_on_bit.constprop.13+0x22/0x25
 [<c01c3c8b>] ? iget_locked+0x42/0xc5
 [<c023aad8>] ? ext4_iget+0x24/0x5be
 [<c01bad90>] ? do_lookup+0x1e4/0x224
 [<c01243a6>] ? should_resched+0x5/0x1e
 [<c06506de>] ? _cond_resched+0x5/0x18
 [<c01a61c9>] ? slab_pre_alloc_hook.isra.62+0x20/0x23
 [<c01a6d76>] ? kmem_cache_alloc+0x1c/0xb1
 [<c0240b61>] ? ext4_lookup.part.29+0x50/0xc6
 [<c01c1cfc>] ? __d_alloc+0xec/0xfb
 [<c01ba3b0>] ? d_alloc_and_lookup+0x2c/0x49
 [<c01bacf9>] ? do_lookup+0x14d/0x224
 [<c01bae26>] ? walk_component+0x56/0x10c
 [<c01baf15>] ? lookup_last+0x39/0x3f
 [<c01bb9c3>] ? path_lookupat+0x74/0x238
 [<c01243a6>] ? should_resched+0x5/0x1e
 [<c06506de>] ? _cond_resched+0x5/0x18
 [<c01bbba1>] ? do_path_lookup+0x1a/0x4f
 [<c01bcd77>] ? user_path_at_empty+0x3d/0x69
 [<c01b6065>] ? cp_new_stat64+0xec/0xfe
 [<c01bcdbb>] ? user_path_at+0x18/0x1d
 [<c01b6410>] ? vfs_fstatat+0x3f/0x67
 [<c01b644e>] ? vfs_lstat+0x16/0x18
 [<c01b6645>] ? sys_lstat64+0xe/0x21
 [<c01b81a4>] ? flush_old_exec+0x29/0x81
 [<c0655f63>] ? sysenter_do_call+0x12/0x2c
```

If I run the multi-threaded 'du' *without* flushing caches between runs, I don't seem to trigger the problem (after ~250 iterations, whereas ~20 trigger it otherwise).

The command I use that triggers the bug (/packages is about 17G of ordinary software installs):

```
n=1; while [ $n -lt 128 ]; do sync; echo 3 > /proc/sys/vm/drop_caches; ./edu /packages $n; n=$[$n+1]; done
```

I just discovered that one RAID drive has large and increasing SMART values for Raw_Read_Error_Rate and Hardware_ECC_Recovered. Apparently, any disk read errors are being recovered, since I'm not seeing any errors in dmesg or /proc/mdstat - so I still think there's an ext4 problem (though perhaps it takes a flooded queue and a very-slow-response disk to trigger...).

Hmm, it seems mirroring of emails and bugzilla is broken. Anyway, thanks for the traces. It's a generic VFS bug. I'll attach a fix here in a moment.

Created attachment 72448 [details]
Fix missed wakeup on I_NEW
Can you please try this patch? Thanks.
Applied to 3.2.7, and it seems to fix it; at least I ran the same test for about 4x as long as before without the problem exhibiting itself. Thanks!