Bug 44731
Summary: | ext4 deadlock under heavy io? | ||
---|---|---|---|
Product: | File System | Reporter: | Mirek Rusin (mirek) |
Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
Status: | RESOLVED OBSOLETE | ||
Severity: | blocking | CC: | alan, anssi.hannula, jack, kernel, mirek, neilb, pilo, semenko, tomdeering7, unicell |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.2.0-23 x86_64 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | sysrq-w output on 3.4.2 with frozen filesystem dm-4 | ||
 | echo w > /proc/sysrq-trigger output on frozen filesystem. | ||
 | echo w > /proc/sysrq-trigger output on frozen filesystem (#2) | ||
Description
Mirek Rusin
2012-07-13 12:23:13 UTC
Some additional information:
- this hang happens very rarely, I'd say maybe once every 10 TB of data written (at 1 Gbit/s).
- there is a maximum of 32 threads doing io.
- each io operation is at most 0.5MB of data (written or read) per syscall, with the exception of fallocate, which on rare occasions can be large, e.g. 60GB.
- this happens both on empty volumes (where fallocate can find contiguous blocks for any requested size) and, seemingly more often, when the disk is close to full.
- the volume is LSI RAID6 with 500MB cache and 24x 2TB hard drives.
- I'm able to do several hundred ios per second normally; the hang seems to be caused by the kernel itself and not the hardware.
- during the hang I can access and write to the volume from other processes/threads.
Created attachment 75501 [details]
sysrq-w output on 3.4.2 with frozen filesystem dm-4
I'm seeing a very similar issue on 3.4.2. I think this issue didn't exist several major versions back.
When one or more applications do intensive I/O, the filesystem freezes and all processes trying to access it freeze as well.
It doesn't always happen in the same way, though. Sometimes I get the "task blocked for more than 120 seconds" message, but sometimes I don't (in the attached case I didn't get those). Also, sometimes the system becomes usable again after several minutes of being seemingly frozen, but often it doesn't recover (like in the attached case). If wanted, I can provide more logs of these different situations (the logs do look very similar to the ones in this report, though).
In the attachment the frozen fs is dm-4. It is on the same LVM VG and RAID-6 as dm-3, dm-5, dm-6, and those other filesystems continued to work while dm-4 was frozen.
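For anyone trying to reproduce this, the workload from the original description (up to 32 threads, reads and writes of at most 0.5MB per syscall, plus an occasional preallocation) could be approximated roughly as below. This is only a sketch under stated assumptions: the names `worker` and `run_workload`, the one-file-per-thread layout, and the default sizes are made up for illustration, and `os.posix_fallocate` stands in for whatever fallocate call the reporter's application actually makes. It is not known to trigger the hang.

```python
import os
import random
import tempfile
import threading

MAX_IO = 512 * 1024  # the report says at most 0.5MB per read/write syscall


def worker(path, total_bytes, fallocate_bytes, seed, results, index):
    """Preallocate (optionally), then write and read back in chunks of <= MAX_IO."""
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if fallocate_bytes:
            # Assumption: posix_fallocate as a stand-in for the app's fallocate.
            os.posix_fallocate(fd, 0, fallocate_bytes)
        written = 0
        while written < total_bytes:
            size = min(rng.randint(1, MAX_IO), total_bytes - written)
            # pwrite may write fewer bytes than asked; count what it reports.
            written += os.pwrite(fd, os.urandom(size), written)
        offset = 0
        while offset < written:
            chunk = os.pread(fd, MAX_IO, offset)
            if not chunk:
                break
            offset += len(chunk)
        results[index] = written
    finally:
        os.close(fd)


def run_workload(directory, threads=32, bytes_per_thread=8 * MAX_IO,
                 fallocate_bytes=0):
    """Run `threads` workers, one file each; return total bytes written."""
    results = [0] * threads
    pool = [
        threading.Thread(
            target=worker,
            args=(os.path.join(directory, f"file{i}"), bytes_per_thread,
                  fallocate_bytes, i, results, i),
        )
        for i in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sum(results)


if __name__ == "__main__":
    # Hypothetical small run in a temporary directory.
    with tempfile.TemporaryDirectory() as d:
        print(run_workload(d, threads=4, fallocate_bytes=16 * MAX_IO))
```

To exercise a real volume, point `directory` at a path on the affected filesystem and scale `bytes_per_thread` and `fallocate_bytes` up; per the description, the hang took on the order of 10 TB of writes to appear.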
I had a look into the traces from comment 2. At first sight things are waiting for IO. But it needn't be that simple - a few processes are hanging in get_active_stripe() waiting for a free stripe. So in theory it could be some RAID5 issue. Neil, any idea?
Possibly fixed by upstream commit fab363b5ff502d1b39ddcfec04271f5858d9f26e which went into 3.5-rc6 and 3.4.5 (as 8d9369807370331cebf3e237b95ecce068af80f1).
Thanks for the info. I upgraded my affected system to 3.5.3, I'll report back if I still encounter it (sometimes triggering the issue takes a very long time, though).
Neil, in my configuration I'm not using the raid5 code your commit refers to - I'm running hardware, LSI card backed RAID6.
@Mirek: Commit fab363b5ff502d1b39ddcfec04271f5858d9f26e seems to impact RAID6 (and RAID4) as well. See: http://comments.gmane.org/gmane.linux.raid/39236
Nick, yes it does - for software raid, right? I'm using hardware raid. This code is not even loaded into the kernel (I've got it in the kernel config as (m)odule, and it doesn't appear in /proc/modules - this code is not used at all in my case).
The patch I referred to fixed problems with processes hanging in get_active_stripe(). This appears in the comment #2 trace, so the patch may fix that problem. As you say, the original problem does not mention MD raid, so it is probably a different problem.
Yeah, Mirek, to better diagnose your problem, I need sysrq-w output when the system is hung.
It should happen sometime today, will let you know. On Ubuntu it's just: echo w > /proc/sysrq-trigger as root, right?
Exactly. Thanks!
Created attachment 78801 [details]
echo w > /proc/sysrq-trigger output on frozen filesystem.
Added attachment with output from "echo w > /proc/sysrq-trigger" on frozen fs.
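For reference, capturing the same information from a script rather than a shell could look roughly like this. It is a minimal sketch: the helper names `trigger_sysrq` and `capture_blocked_tasks` are made up here, it must run as root, it assumes the `dmesg` utility is installed, and it assumes SysRq is enabled on the machine.

```python
import subprocess


def trigger_sysrq(command="w", trigger_path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the trigger file.

    'w' asks the kernel to dump tasks in uninterruptible (blocked) state,
    which is what `echo w > /proc/sysrq-trigger` does.
    """
    if len(command) != 1:
        raise ValueError("SysRq command must be a single character")
    with open(trigger_path, "w") as f:
        f.write(command)


def capture_blocked_tasks():
    """Trigger sysrq-w, then return the kernel log as text (needs root)."""
    trigger_sysrq("w")
    # The dump goes to the kernel ring buffer; dmesg reads it back.
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


if __name__ == "__main__":
    print(capture_blocked_tasks())
```

The `trigger_path` parameter exists only so the write can be pointed at an ordinary file when trying the helper without root.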
The culprit of your hang now is:
------------[ cut here ]------------
kernel BUG at /build/buildd/linux-3.2.0/fs/jbd2/transaction.c:1093!
invalid opcode: 0000 [#1] SMP
CPU 0
Modules linked in: binfmt_misc vesafb dcdbas ses enclosure mac_hid lp parport bnx2 megaraid_sas
Pid: 20387, comm: pifs4/eio Not tainted 3.2.0-23-generic #36-Ubuntu Dell Inc. PowerEdge R210 II/09T7VV
RIP: 0010:[<ffffffff8125f27c>] [<ffffffff8125f27c>] jbd2_journal_dirty_metadata+0x1ec/0x230
RSP: 0018:ffff880102c29ae8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88022e128f00 RCX: ffff8801444048e0
RDX: ffff88022312ce58 RSI: 0000000000000000 RDI: ffff88022312ce58
RBP: ffff880102c29b38 R08: ffff880192ce9138 R09: 7010000000000000
R10: fe4f31d37b9c4e02 R11: 0000000000000000 R12: ffff880192ce9138
R13: ffff88014575ff50 R14: ffff88022fbae000 R15: ffff88022312ce58
FS: 00007ffc9fe65700(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f702b43a04c CR3: 00000002304b0000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pifs4/eio (pid: 20387, threadinfo ffff880102c28000, task ffff88010065ade0)
Stack:
ffff88022312ce58 ffff88014575ff50 ffff880102c29b48 ffffffff8125f9e8
ffff880102c29b28 ffff8801444048e0 000000000000038a ffffffff8182bb37
ffff88022312ce58 ffff88022312ce58 ffff880102c29b88 ffffffff81241bcb
Call Trace:
[<ffffffff8125f9e8>] ? jbd2_journal_get_create_access+0xd8/0x170
[<ffffffff81241bcb>] __ext4_handle_dirty_metadata+0x8b/0x130
[<ffffffff8123d3f2>] ext4_ext_split+0x2f2/0x710
[<ffffffff8123e104>] ? ext4_ext_find_extent+0x134/0x3a0
[<ffffffff8123e4a4>] ext4_ext_create_new_leaf+0x134/0x180
[<ffffffff8123eb47>] ext4_ext_insert_extent+0xc7/0x440
[<ffffffff8123bffc>] ? ext4_ext_check_overlap.isra.20+0xbc/0xd0
[<ffffffff8124036c>] ext4_ext_map_blocks+0x58c/0xe70
[<ffffffff8125dd8a>] ? start_this_handle.isra.9+0x37a/0x3e0
[<ffffffff81215e45>] ext4_map_blocks+0x1b5/0x280
[<ffffffff81240fb2>] ext4_fallocate+0x192/0x3e0
[<ffffffff81176602>] do_fallocate+0xf2/0x160
[<ffffffff811766bb>] sys_fallocate+0x4b/0x70
[<ffffffff81664a82>] system_call_fastpath+0x16/0x1b
Code: 08 49 8b 54 24 18 49 8d b6 58 03 00 00 89 04 24 49 89 d9 48 c7 c7 c0 0b a2 81 31 c0 e8 c4 4e 3e 00 b8 ea ff ff ff e9 d2 fe ff ff <0f> 0b 4d 85 c9 74 04 41 8b 41 08 45 31 c0 48 85 c9 74 04 44 8b
RIP [<ffffffff8125f27c>] jbd2_journal_dirty_metadata+0x1ec/0x230
RSP <ffff880102c29ae8>
---
Which means that we reserved too few credits for the transaction allocating blocks from fallocate. I was looking into the code and the math in ext4_chunk_trans_blocks() looks sound. But I don't know all the details of extent code. Maybe other ext4 guys will see the problem quicker than me...
Created attachment 78931 [details]
echo w > /proc/sysrq-trigger output on frozen filesystem (#2)
...just happened again, in case it's useful, another dump attached.
And just for the record, the problem is the same...
Mirek, if upgrading the kernel is an option for you, trying the latest stable kernel from 3.5 might be worth a shot. I didn't find anything that should fix your oops but there are some changes in the extent code so maybe your problem will get fixed as a side effect.
I have also experienced this. Bug report for Ubuntu: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1071012
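To illustrate what "reserved too few credits" means in the oops analysis above, here is a toy model of journal credit accounting. It is not the jbd2 code: the `Handle` class, the `CreditsExhausted` exception and the block names are invented for illustration, and it rests on the simplifying assumption that each metadata block costs one credit the first time it is dirtied under a handle. If the caller's estimate is lower than the number of distinct blocks it ends up touching (for example when an extent tree split touches more blocks than expected), the check fails, which is the role the BUG in jbd2_journal_dirty_metadata plays in the trace.

```python
class CreditsExhausted(AssertionError):
    """Raised when a handle dirties more blocks than it reserved credits for."""


class Handle:
    """Toy stand-in for a journal handle with a fixed credit reservation."""

    def __init__(self, credits):
        self.credits = credits
        self.dirtied = set()

    def dirty_metadata(self, block):
        # Re-dirtying a block already counted under this handle is free
        # (a simplification made for this model).
        if block in self.dirtied:
            return
        if self.credits <= 0:
            raise CreditsExhausted(
                f"no credits left when dirtying {block!r}"
            )
        self.credits -= 1
        self.dirtied.add(block)


if __name__ == "__main__":
    # Hypothetical scenario: the estimate covers 2 blocks, the operation needs 3.
    handle = Handle(credits=2)
    try:
        for block in ("leaf", "index", "new_leaf"):
            handle.dirty_metadata(block)
    except CreditsExhausted as exc:
        print("under-reserved:", exc)
```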