Bug 13369
Summary: | kernel BUG at fs/ext4/inode.c:3123 | ||
---|---|---|---|
Product: | File System | Reporter: | Chris Komner (ckom26) |
Component: | ext4 | Assignee: | Eric Sandeen (sandeen) |
Status: | RESOLVED CODE_FIX | ||
Severity: | blocking | CC: | nao.nakashima, sandeen, tytso |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.30-rc6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Kernel logs of the bug, had to pull the plug to get the second one since the box wouldn't reboot normally.
dmesg output of the bug yet another snip kernel .config |
Description
Chris Komner
2009-05-23 12:04:51 UTC
Created attachment 21506 [details]
Kernel logs of the bug, had to pull the plug to get the second one since the box wouldn't reboot normally.
This is: static int ext4_normal_writepage(struct page *page, struct writeback_control *wbc) { ... if (page_has_buffers(page)) { /* if page has buffers it should all be mapped * and allocated. If there are not buffers attached * to the page we know the page is dirty but it lost * buffers. That means that at some moment in time * after write_begin() / write_end() has been called * all buffers have been clean and thus they must have been * written at least once. So they are all mapped and we can * happily proceed with mapping them and writing the page. */ BUG_ON(walk_page_buffers(NULL, page_buffers(page), 0, len, NULL, ext4_bh_unmapped_or_delay)); } Does ktorrent preallocate space? You might try turning that off to see if it goes away, as a hint. I can't look at this much today, it's my birthday and I'm going to have some non-computer fun. ;) If you want, you could modify the test in ext4_normal_writepage() at this spot to something like: if (page_has_buffers(page)) { /* * (big comment deleted) */ if (walk_page_buffers(NULL, page_buffers(page), 0, len, NULL, ext4_bh_unmapped_or_delay)) { struct buffer_head *bh; bh = (struct buffer_head *)page_private(page); printk("bh state 0x%x\n", bh->b_state); BUG(); } } so that we'd see the actual buffer head state that was wrong. (this assumes you have page-sized blocks; most likely true, if you have 4k blocks, which is the mkfs default) Space preallocation is turned off in ktorrent. The modification didn't help much, but then again the block size is 1k. Additional info from dumpe2fs: Filesystem volume name: STORAGE Last mounted on: <not available> Filesystem UUID: 7d9c4f5a-c7fe-4b2d-a02a-dc095ba40e7c Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize Filesystem flags: signed_directory_hash Default mount options: (none) Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 10799616 Block count: 172787076 Reserved block count: 0 Free blocks: 8744435 Free inodes: 10795146 First block: 1 Block size: 1024 Fragment size: 1024 Reserved GDT blocks: 256 Blocks per group: 8192 Fragments per group: 8192 Inodes per group: 512 Inode blocks per group: 128 Flex block group size: 16 Filesystem created: Tue Apr 28 14:51:31 2009 Last mount time: Sat May 23 23:11:38 2009 Last write time: Sat May 23 23:11:38 2009 Mount count: 3 Maximum mount count: 400 Last checked: Sat May 23 10:17:43 2009 Check interval: 34560000 (13 months, 1 week, 3 days) Next check after: Sun Jun 27 10:17:43 2010 Lifetime writes: 5176 MB Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Required extra isize: 28 Desired extra isize: 28 Journal inode: 8 Default directory hash: half_md4 Directory Hash Seed: 71ca129e-0529-45c7-944c-03f3ca7ddd83 Journal backup: inode blocks Journal size: 32M Oh, and happy birthday! Created attachment 21510 [details]
dmesg output of the bug
kernel tree bumped to -rc6-git6
So I'm glad you're using a filesystem with a 1k blocksize since it provides a good test case for us --- and we don't do enough testing for the case where the blocksize != pagesize. But you do realize that the ext4 filesystem will be far more efficient with the larger 4k blocksize, right? I'm not able to reproduce the problem on a filesystem with 1k blocksize running fsx, fsstress, and dbench using a 2.6.30-rc6 kernel with the full ext4 patch queue, using a netbook with an dual-core N270 Atom processor. I'm retrying with a stock 2.6.30-rc6 kernel now... I've tried replicating using fsstress, fsx, and dbench using a stock 2.6.30-rc6 on a dual-core netbook and a 1k block filesystem; no luck. I tried replicating using fsstress, fsx, and dbench using 2.6.30-rc6 + the ext4 patch queue on a x61s Lenovo laptop (Intel Core 2 Duo L7500), again with a 1k block filesystem. No luck. I'm now trying to do a bittorrent download using transmission onto the 1k block filesystem. How easily can you reproduce the problem on your system? The 1k blocksize was decided accidentally by an obsolete and forgotten /etc/mke2fs.conf, and ATM I can't back up the data and reformat 4k. The reproduction of the problem is as easy as it gets: I just launch ktorrent and wait. Of note might be that after the bug occurs, normal reboot procedure hangs, the kernel seems to sync and remount read-only with the magic sysrq combos, but fails to reboot or poweroff and there's always some loss of data in the partition. The latest dmesg snip follows attached. Created attachment 21512 [details]
yet another snip
So the modification w/ the printk didn't print anything? Must not have been the first bh on the page that was wrong, then, I didn't anticipate 1k blocks, I'll come up with another patch later that will print the state for all bh's on the page. -Eric With data=journal and blocksize= 1k I am seeing some problem explained here http://bugzilla.kernel.org/show_bug.cgi?id=12624 -aneesh On Sun, May 24, 2009 at 02:52:48AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13369 > > > Theodore Tso <tytso@mit.edu> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |tytso@mit.edu > > > > > --- Comment #5 from Theodore Tso <tytso@mit.edu> 2009-05-24 02:52:48 --- > So I'm glad you're using a filesystem with a 1k blocksize since it provides a > good test case for us --- and we don't do enough testing for the case where > the > blocksize != pagesize. But you do realize that the ext4 filesystem will be > far more efficient with the larger 4k blocksize, right? > > I'm not able to reproduce the problem on a filesystem with 1k blocksize > running > fsx, fsstress, and dbench using a 2.6.30-rc6 kernel with the full ext4 patch > queue, using a netbook with an dual-core N270 Atom processor. I'm retrying > with a stock 2.6.30-rc6 kernel now... > It is with nodelalloc. I am also finding some unmapped buffer_heads in nodelalloc mode with data=journal. See the latest update for bug http://bugzilla.kernel.org/show_bug.cgi?id=12624 -aneesh Can you try with the patch series found at http://article.gmane.org/gmane.comp.file-systems.ext4/12962 (Threading is not working for gmane currently so the two patches you may need are http://thread.gmane.org/gmane.linux.file-systems/31465 http://thread.gmane.org/gmane.comp.file-systems.ext4/12960 ) I guess mmap and truncate combination can result in we missing pagefault when we write to second block in the same page. That would mean we may have to do block allocation in writepage. -aneesh I applied both patches, recompiled and rebooted. A few seconds after ktorrent started, I got this: kernel BUG at fs/ext4/inode.c:3123! invalid opcode: 0000 [#1] PREEMPT SMP last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq Modules linked in: nvidia(P) oss_usb oss_hdaudio osscore aes_i586 aes_generic ipt_MASQUERADE ipt_REDIRECT xt_multiport xt_tcpudp xt_conntrack xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables uvcvideo videodev v4l1_compat uhci_hcd sr_mod ohci1394 sdhci_pci sdhci ieee1394 cdrom mmc_core ehci_hcd ac button battery sg i2c_i801 usbcore i2c_core psmouse ata_piix evdev arc4 ecb iwl3945 iwlcore mac80211 led_class cfg80211 fuse dell_laptop backlight rfkill b44 dcdbas mii rtc_cmos rtc_core rtc_lib [last unloaded: soundcore] Pid: 1060, comm: kjournald2 Tainted: P (2.6.30-rc6-git6 #2) Vostro 1500 EIP: 0060:[<c01ffd13>] EFLAGS: 00010202 CPU: 1 EIP is at ext4_normal_writepage+0x183/0x190 EAX: 00000001 EBX: c28cb8e0 ECX: 00001000 EDX: 00000000 ESI: 00001000 EDI: 00000000 EBP: f756be00 ESP: f756bdd4 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process kjournald2 (pid: 1060, ti=f756a000 task=f6d33390 task.ti=f756a000) Stack: 00001000 00000000 c01fc0f0 8000082d 0000000c 00000000 f756bf1c 10541259 eb4d7d30 00000002 eb4d7d30 f756bea0 c01780d7 10541259 c28cb8e0 c01789b9 00000000 00000007 10541259 00000000 00000001 eb4d7d30 c01780c0 f74fc0a8 Call Trace: [<c01fc0f0>] ? ext4_bh_unmapped_or_delay+0x0/0x50 [<c01780d7>] ? __writepage+0x17/0x60 [<c01789b9>] ? write_cache_pages+0x1a9/0x3f0 [<c01780c0>] ? __writepage+0x0/0x60 [<c01ffb90>] ? ext4_normal_writepage+0x0/0x190 [<c0178c27>] ? generic_writepages+0x27/0x50 [<c02228d2>] ? jbd2_journal_commit_transaction+0x3b2/0x12b0 [<c013eeaf>] ? try_to_del_timer_sync+0x4f/0x70 [<c0227a87>] ? kjournald2+0xd7/0x230 [<c014c430>] ? autoremove_wake_function+0x0/0x60 [<c02279b0>] ? kjournald2+0x0/0x230 [<c014c002>] ? kthread+0x52/0xa0 [<c014bfb0>] ? kthread+0x0/0xa0 [<c0103d67>] ? kernel_thread_helper+0x7/0x10 Code: 00 00 00 75 21 8b 4c 24 18 89 d8 83 c4 20 ba b0 17 20 c0 5b 5e 5f c9 e9 fc e5 fc ff f3 90 8b 4f 48 e9 b4 fe ff ff e8 6d 30 f3 ff <0f> 0b eb fe 0f 0b eb fe 90 8d 74 26 00 55 89 e5 57 56 89 ce 53 EIP: [<c01ffd13>] ext4_normal_writepage+0x183/0x190 SS:ESP 0068:f756bdd4 ---[ end trace d3629642779515bf ]--- In between, I had booted the stock archlinux kernel, version 2.6.29.4 and I couldn't reproduce the bug; instead I got these errors, which might be odd because I had just e2fsck'ed the device: ext4: find_group_flex failed, fallback succeeded dir 572486 ext4: find_group_flex failed, fallback succeeded dir 572439 ext4: find_group_flex failed, fallback succeeded dir 572469 ext4: find_group_flex failed, fallback succeeded dir 572470 ext4: find_group_flex failed, fallback succeeded dir 572478 ext4: find_group_flex failed, fallback succeeded dir 572444 ext4: find_group_flex failed, fallback succeeded dir 572477 ext4: find_group_flex failed, fallback succeeded dir 572447 ext4: find_group_flex failed, fallback succeeded dir 572482 ext4: find_group_flex failed, fallback succeeded dir 572446 Fsck is always needed for this partition, after the inode.c bug and the mandatory cold reboot, otherwise it aborts: ext4_abort called. EXT4-fs error (device sda9): ext4_put_super: Couldn't clean up the journal My kernel .config follows. Created attachment 21520 [details]
kernel .config
Will you be able to try these patches. They are on top of stable patch from the ext4 patch queue found at http://repo.or.cz/w/ext4-patch-queue.git Apply the patches upto stable-boundary-undo.patch Then apply the below five patches. Apply them in the below order. page_cache_size.patch 1.patch 2.patch unmap_or_unwritten ext4_writepage -aneesh Patches applied 24 hours ago and no bugs whatsoever, so far. Thanks. Assigning to myself just so I can close. It sounds like this is fixed, based on comment #16. If this shows up again, please do re-open. Thanks, -Eric |