Bug 42763 - directory access hangs without error
Summary: directory access hangs without error
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-02-13 03:41 UTC by Eric Buddington
Modified: 2012-08-30 14:35 UTC
CC List: 3 users

See Also:
Kernel Version: 3.2.5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
C program to trigger the bug (6.17 KB, text/plain)
2012-02-13 16:02 UTC, Eric Buddington
Fix missed wakeup on I_NEW (2.64 KB, patch)
2012-02-20 17:32 UTC, Jan Kara

Description Eric Buddington 2012-02-13 03:41:27 UTC
Kernel 3.2.5
ext4 over RAID-6

I have a specific directory that freezes all processes that try to getdents() or open() a new file. In some cases, the kernel gives me "blocked for more than 120 seconds" messages, often reporting that the process is stuck in ext4_getblk.

I have found no errors, however. The stuck processes stay stuck forever (at least hours), and dmesg shows no complaints about RAID, filesystem, or anything else. Full reads of the RAID devices work without hanging, and drive self-tests pass.

Other processes are able to access other parts of the filesystem normally; this is not a system-wide or fs-wide hang.

Rebooting and fscking made the directory accessible again, but now there is a different directory exhibiting the problem (unknown whether it had the problem before the fsck/reboot).

This particular problem exhibited itself for the first time after tests of my multi-threaded 'du' that beat the filesystem with a few dozen threads simultaneously.

Given that this seems to be very reproducible, I have many opportunities to poke at a hung process, query the filesystem, or recompile the kernel in any way that would be helpful; I just don't know how to approach it from here.
Comment 1 Eric Buddington 2012-02-13 15:55:11 UTC
More info:

After a reboot (and forced but uneventful fsck), the problem is gone until I run my program again. The program consists of multiple threads traversing and stat()ing different directory trees simultaneously.

The program will actually work fine for several iterations before getting stuck,
but once it gets stuck, that directory is a trap for all processes until reboot.
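
For readers without the attachment, the access pattern described above can be sketched roughly as follows. This is not the attached program; it is a minimal illustration that assumes one thread per directory tree, and the names walk(), worker() and stat_trees() are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively lstat() everything under path.
 * Returns the number of entries that were successfully lstat()ed. */
static long walk(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0)
        return 0;

    long count = 1;
    if (!S_ISDIR(st.st_mode))
        return count;

    DIR *d = opendir(path);
    if (d == NULL)
        return count;

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        char child[PATH_MAX];
        int len = snprintf(child, sizeof child, "%s/%s", path, e->d_name);
        if (len < 0 || (size_t)len >= sizeof child)
            continue; /* skip paths that do not fit in the buffer */
        count += walk(child);
    }
    closedir(d);
    return count;
}

/* Per-thread work item (hypothetical structure for this sketch). */
struct job {
    const char *root;
    long count;
};

static void *worker(void *arg)
{
    struct job *j = arg;
    j->count = walk(j->root);
    return NULL;
}

/* Walk each of the n trees in its own thread, all at the same time.
 * Returns the total number of entries seen across all trees. */
long stat_trees(const char *const roots[], int n)
{
    assert(roots != NULL);
    if (n <= 0)
        return 0;

    pthread_t tid[n];
    struct job jobs[n];
    int started = 0;
    long total = 0;

    for (int i = 0; i < n; i++) {
        jobs[i].root = roots[i];
        jobs[i].count = 0;
        if (pthread_create(&tid[i], NULL, worker, &jobs[i]) != 0)
            break; /* run with however many threads could be started */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += jobs[i].count;
    }
    return total;
}
```

Called with an array of directory paths and the array length, stat_trees() issues many concurrent lookups and lstat() calls, which is the kind of load described here. Link with -pthread.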
Comment 2 Eric Buddington 2012-02-13 16:02:51 UTC
Created attachment 72366
C program to trigger the bug

compile with --std=gnu99 -lpthread -lrt

invocation is ./edu <directory> <number of threads>

triggers the bug after about a dozen iterations. Program will stall showing "1a", indicating one still-active thread, and fail to exit.
Comment 3 Eric Sandeen 2012-02-13 16:56:54 UTC
Did fsck ever find any problems with the filesystem?

I'm not clear on whether a simple reboot will clear it up, or if it requires a fsck to get things happy again.
Comment 4 Eric Buddington 2012-02-13 17:57:16 UTC
fsck did not find any problems with the filesystem.

I rebooted without any fsck, and found the same symptoms: a couple dozen successful runs, then a hang (this time, it seems to be on two directories simultaneously).

I have been dropping caches between runs. I haven't tried to trigger the bug without dropping caches.
Comment 5 Eric Sandeen 2012-02-13 18:21:23 UTC
Ok, I'm traveling w/ limited time to try to reproduce but thanks for all the info.

A sysrq-W or sysrq-T when it's hung might show us where the threads are at.

-Eric
Comment 6 Eric Buddington 2012-02-13 18:30:27 UTC
The stuck threads look like this:

edu             D c023a2f4     0  9912      1 0x00000004
f50b2b80 00000086 00000000 c023a2f4 f7b2b400 d5350000 c09f6d80 00000000
c09f6d80 c1c5f500 0000000a c33dbee0 c023f172 00000000 d53515cc c33dbee0
000015cc d5352000 c8c4b4a4 c33dbee0 c1c5f500 f0e05dac c01558a1 00000246
Call Trace:
[<c023a2f4>] ? ext4_getblk+0x8b/0x13d
[<c023f172>] ? search_dirblock+0x76/0xaf
[<c01558a1>] ? arch_local_irq_save+0xf/0x14
[<c0651740>] ? _raw_spin_lock_irqsave+0x8/0x2c
[<c01c2cc3>] ? inode_wait+0x5/0x8
[<c0650c36>] ? __wait_on_bit+0x2f/0x54
[<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
[<c0650cba>] ? out_of_line_wait_on_bit+0x5f/0x67
[<c01c2cbe>] ? inode_owner_or_capable+0x30/0x30
[<c014532b>] ? autoremove_wake_function+0x2f/0x2f
[<c01c3610>] ? wait_on_bit.constprop.13+0x22/0x25
[<c01c3c8b>] ? iget_locked+0x42/0xc5
[<c023aad8>] ? ext4_iget+0x24/0x5be
[<c01bad90>] ? do_lookup+0x1e4/0x224
[<c01243a6>] ? should_resched+0x5/0x1e
[<c06506de>] ? _cond_resched+0x5/0x18
[<c01a61c9>] ? slab_pre_alloc_hook.isra.62+0x20/0x23
[<c01a6d76>] ? kmem_cache_alloc+0x1c/0xb1
[<c0240b61>] ? ext4_lookup.part.29+0x50/0xc6
[<c01c1cfc>] ? __d_alloc+0xec/0xfb
[<c01ba3b0>] ? d_alloc_and_lookup+0x2c/0x49
[<c01bacf9>] ? do_lookup+0x14d/0x224
[<c01bae26>] ? walk_component+0x56/0x10c
[<c01baf15>] ? lookup_last+0x39/0x3f
[<c01bb9c3>] ? path_lookupat+0x74/0x238
[<c01243a6>] ? should_resched+0x5/0x1e
[<c06506de>] ? _cond_resched+0x5/0x18
[<c01bbba1>] ? do_path_lookup+0x1a/0x4f
[<c01bcd77>] ? user_path_at_empty+0x3d/0x69
[<c01b6065>] ? cp_new_stat64+0xec/0xfe
[<c01bcdbb>] ? user_path_at+0x18/0x1d
[<c01b6410>] ? vfs_fstatat+0x3f/0x67
[<c01b644e>] ? vfs_lstat+0x16/0x18
[<c01b6645>] ? sys_lstat64+0xe/0x21
[<c01b81a4>] ? flush_old_exec+0x29/0x81
[<c0655f63>] ? sysenter_do_call+0x12/0x2c
Comment 7 Eric Buddington 2012-02-13 20:29:13 UTC
If I run the multi-threaded 'du' *without* flushing caches between runs, I don't seem to trigger the problem (no hang after ~250 iterations, whereas ~20 trigger it otherwise).

The command I use that triggers the bug (/packages is about 17G of ordinary software installs):

n=1; while [ $n -lt 128 ]; do sync; echo 3 > /proc/sys/vm/drop_caches; ./edu /packages $n; n=$[$n+1]; done
Comment 8 Eric Buddington 2012-02-14 05:07:39 UTC
I just discovered that one RAID drive has large and increasing SMART values for
Raw_Read_Error_Rate and Hardware_ECC_Recovered.

Apparently, any disk read errors are being recovered, since I'm not seeing any errors in dmesg or /proc/mdstat - so I still think there's an ext4 problem (though perhaps it takes a flooded queue and a very-slow-response disk to trigger...)
Comment 9 Jan Kara 2012-02-20 17:31:29 UTC
Hmm, it seems mirroring of emails and bugzilla is broken. Anyway, thanks for the traces. It's a generic VFS bug. I'll attach here a fix in a moment.
Comment 10 Jan Kara 2012-02-20 17:32:36 UTC
Created attachment 72448
Fix missed wakeup on I_NEW

Can you please try this patch? Thanks.
Comment 11 Eric Buddington 2012-02-24 02:22:15 UTC
Applied to 3.2.7, and it seems to fix it; at least I ran the same test for about 4x as long as before without the problem exhibiting itself.

Thanks!
