Bug 15231
| Summary: | kernel BUG at fs/ext4/inode.c:1852! | | |
|---|---|---|---|
| Product: | File System | Reporter: | Franco Broi (franco) |
| Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
| Status: | CLOSED UNREPRODUCIBLE | | |
| Severity: | normal | CC: | alan, dmonakhov, jack, matt, tytso |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.32.7 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | messages file | | |
| | messages file | | |
| | Patch to try reverting the cherry pick of 39bc680 | | |
Description
Franco Broi
2010-02-05 09:38:14 UTC
Created attachment 24916 [details]
messages file
Changed the summary to be more specific. Just saw this bug on similar hardware running 2.6.32.2. There's an interesting bit of memory corruption in the traceback at the end of the messages file attached.

Feb 8 17:52:27 echo24 kernel: kernel BUG at fs/inode.c:1343!
Feb 8 17:52:27 echo24 kernel: invalid opcode: 0000 [#1] SMP
Feb 8 17:52:27 echo24 kernel: last sysfs file: /sys/devices/platform/i5k_amb.0/temp2_input
Feb 8 17:52:27 echo24 kernel: CPU 5
Feb 8 17:52:27 echo24 kernel: Modules linked in: usb_storage nfsd exportfs mx_driver(P) mx_mcp(P) nfs lockd nfs_acl auth_rpcgss sunrpc ipv6 ext4 jbd2 crc16 dm_multipath uinput e1000e mptsas iTCO_wdt iTCO_vendor_support mptscsih i2c_i801 i2c_core pcspkr mptbase shpchp i5k_amb ioatdma serio_raw hwmon scsi_transport_sas dca ata_generic [last unloaded: myri10ge]
Feb 8 17:52:27 echo24 kernel: Pid: 2677, comm: nfsd Tainted: P W 2.6.32.2 #1 X7DWE
Feb 8 17:52:27 echo24 kernel: RIP: 0010:[<ffffffff810fdaf9>] [<ffffffff810fdaf9>] iput+0x1b/0x66
Feb 8 17:52:27 echo24 kernel: RSP: 0018:ffff88021183fd30 EFLAGS: 00010246
Feb 8 17:52:27 echo24 kernel: RAX: 0000000000000000 RBX: ffff88021b16dbc8 RCX: 00000000002a0024
Feb 8 17:52:27 echo24 kernel: RDX: 0000000000000001 RSI: ffffea00010cdb10 RDI: ffff88021b16dbc8
Feb 8 17:52:27 echo24 kernel: RBP: ffff88021183fd40 R08: ffff880210cd5d40 R09: 0000000000000003
Feb 8 17:52:27 echo24 kernel: R10: 0000000000000001 R11: ffff88021183fc30 R12: ffff88021b16db80
Feb 8 17:52:27 echo24 kernel: R13: 0000000000001000 R14: ffff880211f20000 R15: ffff88022df50e00
Feb 8 17:52:27 echo24 kernel: FS: 0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000
Feb 8 17:52:27 echo24 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Feb 8 17:52:27 echo24 kernel: CR2: 0000000000736f68 CR3: 0000000210dde000 CR4: 00000000000406e0
Feb 8 17:52:27 echo24 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 8 17:52:27 echo24 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 8 17:52:27 echo24 kernel: Process nfsd (pid: 2677, threadinfo ffff88021183e000, task ffff880211f19760)
Feb 8 17:52:27 echo24 kernel: Stack:
Feb 8 17:52:27 echo24 kernel: ffff88021183fd50 ffff88014dfc6400 ffff88021183fd60 ffffffff812bdb2e
Feb 8 17:52:27 echo24 kernel: <0> ffff88014dfc6400 ffff88014dfc6410 ffff88021183fd80 ffffffffa01a51c5
Feb 8 17:52:27 echo24 kernel: <0> ffff88021183fd80 ffff88014dfc6400 ffff88021183fdb0 ffffffffa01af5d9
Feb 8 17:52:27 echo24 kernel: Call Trace:
Feb 8 17:52:27 echo24 kernel: [<ffffffff812bdb2e>] sock_release+0x5c/0x6c
Feb 8 17:52:27 echo24 kernel: [<ffffffffa01a51c5>] svc_sock_free+0x45/0x51 [sunrpc]
Feb 8 17:52:27 echo24 kernel: [<ffffffffa01af5d9>] svc_xprt_free+0x3e/0x4e [sunrpc]
Feb 8 17:52:27 echo24 kernel: [<ffffffffa01af59b>] ? svc_xprt_free+0x0/0x4e [sunrpc]
Feb 8 17:52:27 echo24 kernel: [<ffffffff811ab411>] kref_put+0x43/0x4f
Feb 8 17:52:27 echo24 kernel: [<ffffffffa01aeaed>] svc_xprt_release+0xc6/0xce [sunrpc]
Feb 8 17:52:27 echo24 kernel: [<ffffffffa01af2f4>] svc_recv+0x72d/0x78b [sunrpc]
Feb 8 17:52:27 echo24 kernel: [<ffffffff81049930>] ? default_wake_function+0x0/0xf
Feb 8 17:52:27 echo24 kernel: [<ffffffffa030f7df>] ? nfsd+0x0/0x133 [nfsd]
Feb 8 17:52:27 echo24 kernel: [<ffffffffa030f87f>] nfsd+0xa0/0x133 [nfsd]
Feb 8 17:52:27 echo24 kernel: [<ffffffff81065541>] kthread+0x7d/0x85
Feb 8 17:52:27 echo24 kernel: [<ffffffff81011cda>] child_rip+0xa/0x20
Feb 8 17:52:27 echo24 kernel: [<ffffffff8101163d>] ? restore_args+0x0/0x30
Feb 8 17:52:27 echo24 kernel: [<ffffffff810654c4>] ? kthread+0x0/0x85
Feb 8 17:52:27 echo24 kernel: [<ffffffff81011cd0>] ? child_rip+0x0/0x20
Feb 8 17:52:27 echo24 kernel: Code: c0 74 d5 48 83 c4 20 5b 41 5c 41 5d 41 5e c9 c3 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 50 48 83 bf 10 02 00 00 40 75 04 <0f> 0b eb fe 48 8d 7f 48 48 c7 c6 20 5b 88 81 e8 6f b5 0a 00 85
Feb 8 17:52:27 echo24 kernel: RIP [<ffffffff810fdaf9>] iput+0x1b/0x66
Feb 8 17:52:27 echo24 kernel: RSP <ffff88021183fd30>
Feb 8 17:52:27 echo24 kernel: ---[ end trace 56313310fcbaf839 ]---

Created attachment 24960 [details]
messages file
OK, so the dumps in comments #3 and #4 are a completely different issue. It's not even clear it's an ext4 issue. (An oops in fs/inode.c is quite different from an oops in fs/ext4/inode.c.)

It looks like the failure in comment #1 is due to commit dbe5cc00, which is a cherry pick of commit 39bc680a (ext4: fix sleep inside spinlock issue with quota and dealloc) and which was introduced in 2.6.32.3. I didn't think we were going to merge this into -stable, since it introduced this failure. Fixing it requires a series of commits. Argh.... I'll try to get to figuring out which of the upstream commits need to go to -stable to fix this bug.

Ted, what problem does commit 39bc680a cause? Because I thought all the problems were introduced by commit d21cd8f (also mentioned in the changelogs), and that was not committed to -stable (exactly because of this). But the problems described in commit 0637c6f4135f592f094207c7c21e7c0fc5557834 were always there, so maybe commit 39bc680a just makes them visible... Shrug. Maybe just backing out 39bc680a would be an option for -stable...

Well, the BUG_ON getting tripped as reported in comment #1 in this bug is:

    BUG_ON(mdblocks < EXT4_I(inode)->i_reserved_meta_blocks);

... and commit 39bc680a does mess with how i_reserved_meta_blocks is calculated:

@@ -1836,10 +1834,11 @@ repeat:
 		}
 		return -ENOSPC;
 	}
+	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
 	EXT4_I(inode)->i_reserved_data_blocks += nrblocks;
-	EXT4_I(inode)->i_reserved_meta_blocks = mdblocks;
-
+	EXT4_I(inode)->i_reserved_meta_blocks += md_needed;
 	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
+
 	return 0;	/* success */

I'm going to need to go back and see whether I screwed up and blamed the wrong commit earlier, or whether that commit was implicated in some other regression. However, given the claimed regression between 2.6.32.2 and 2.6.32.7, and given that commit 39bc680a was applied between 2.6.32.2 and 2.6.32.3, at the very least we should try backing out 39bc680a and seeing whether that makes Franco's problem go away.

Created attachment 24975 [details]
Patch to try reverting the cherry pick of 39bc680
Franco, could you try applying this revert to either 2.6.32.7 or 2.6.32.8, and see if the problem still recurs? Many thanks!!
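For readers following the accounting details in the comment above, here is a minimal userspace sketch of the reservation bookkeeping the tripped BUG_ON guards. The field names mirror the kernel's, but calc_metadata_amount() and the 256-mappings-per-block figure are invented for illustration; this is not the real ext4_calc_metadata_amount(), only a model of the invariant it has to preserve.

/*
 * Toy model (not the real ext4 code) of delayed-allocation reservation
 * accounting.  It only illustrates the check that fired at
 * fs/ext4/inode.c:1852: a fresh worst-case metadata estimate for the
 * new total must never be smaller than what is already reserved.
 */
#include <assert.h>
#include <stdio.h>

struct toy_inode {
	unsigned long i_reserved_data_blocks;
	unsigned long i_reserved_meta_blocks;
};

/* Made-up stand-in for ext4_calc_metadata_amount(): worst-case number
 * of metadata blocks needed to map 'total' data blocks, assuming 256
 * mappings per metadata block. */
static unsigned long calc_metadata_amount(unsigned long total)
{
	return (total + 255) / 256;
}

static int da_reserve_space(struct toy_inode *inode, unsigned long nrblocks)
{
	unsigned long total = inode->i_reserved_data_blocks + nrblocks;
	unsigned long mdblocks = calc_metadata_amount(total);

	/* Userspace analogue of the BUG_ON quoted in the report. */
	assert(mdblocks >= inode->i_reserved_meta_blocks);

	inode->i_reserved_data_blocks += nrblocks;
	/* Pre-39bc680a behaviour: overwrite with the fresh estimate.
	 * The cherry-picked change instead adds the delta
	 * (md_needed = mdblocks - i_reserved_meta_blocks); the report
	 * suggests that only stays consistent with the assert above when
	 * the rest of the upstream series is applied as well. */
	inode->i_reserved_meta_blocks = mdblocks;
	return 0;
}

int main(void)
{
	struct toy_inode ino = { 0, 0 };

	da_reserve_space(&ino, 100);
	da_reserve_space(&ino, 300);
	printf("data=%lu meta=%lu\n",
	       ino.i_reserved_data_blocks, ino.i_reserved_meta_blocks);
	return 0;
}

Built with a plain cc and run, it prints the accumulated data/meta reservation counts; the assert() stands in for the BUG_ON that the reported kernels hit.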
I've applied the patch to 2.6.32.7 - strangely the patch program segfaulted, but it applied the patch correctly.

[root@echo2 linux-2.6.32.7]# patch -p1 < patch
patching file fs/ext4/inode.c
Segmentation fault

Feb 10 02:11:57 echo2 kernel: patch[18919]: segfault at 0 ip 0000003a0c07ee70 sp 00007fff85523278 error 4 in libc-2.10.1.so[3a0c000000+164000]

Also, before booting the new kernel I did a df; /data6 had a negative used block count. I guess this could be related to the current problem if the ref counting has gone a bit skew-whiff, although this is with the system running 2.6.32.2.

Filesystem                    1K-blocks        Used    Available Use% Mounted on
/dev/mapper/vgdata--3-data5 13458176552     2072548  13456104004   1% /data5
/dev/mapper/vgdata--3-data6 13458176552    -3576068  13461752620   0% /data6
/dev/mapper/vgdata--4-data7 13458176552    40839788  13417336764   1% /data7
/dev/mapper/vgdata--4-data8 13458176552     1933764  13456242788   1% /data8

I unmounted and ran fsck; it said it was clean, and when I remounted it appeared to fix itself.

Filesystem                    1K-blocks        Used    Available Use% Mounted on
/dev/mapper/vgdata--3-data5 13458176552     2072548  13456104004   1% /data5
/dev/mapper/vgdata--3-data6 13458176552     1920236  13456256316   1% /data6
/dev/mapper/vgdata--4-data7 13458176552    40974956  13417201596   1% /data7
/dev/mapper/vgdata--4-data8 13458176552     1933764  13456242788   1% /data8

The patched kernel is running on 3 servers - watch this space..

All clear after 24 hours of testing, but still seeing the other non-ext4 bug:

echo3: kernel BUG at fs/inode.c:1343!
echo4: kernel BUG at fs/inode.c:1343!

So the "kernel BUG at fs/inode.c:1343!" failure was showing up on both 2.6.32.2 and 2.6.32.7, right? Do you know when it first started showing up? The kernel stack trace indicates that it's coming from releasing one of the NFS kernel threads' network sockets, when the TCP connection is broken or something else goes wrong. I don't think it's ext4 related at all; it looks like an NFS server issue.

I've only seen this bug starting with 2.6.32.2. I haven't tried anything between this version and 2.6.32.7. Sorry, yes, the above errors were with the patched 2.6.32.7 kernel. It looks like the patch has worked, thanks. Got a few more of the fs/inode.c:1343! errors though.

Franco, could you file a separate bug for the fs/inode.c:1343 problem? I'd suggest filing it against the NFS component, since that's what I strongly suspect is going on. The other thing that would be a really big help for me is if you could try running 2.6.33-rc8. It has a series of fixes which I believe should address the quota problem, while still leaving the quota fix in place, and it would be really good to know if it has been fixed before Linus releases 2.6.33. 2.6.33-rc8 is pretty close to 2.6.33-final, so it should be relatively safe to run it. Still, if you're hesitant to expose your production machines to a pre-release kernel, I'll completely understand. Thanks, - Ted

Sorry Ted, but I can't run any further tests on the production hardware; however, we have ordered another PB of RAID disk and servers which should start arriving in the next few weeks. So if you can hang out until then, I'll be able to give it a good trashing. Trashing, hmmm, bit of a Freudian slip! Of course I meant thrashing...
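As a side note on the negative Used figure in the first df listing: df derives Used from the block totals the kernel reports through statfs(), so once the free-block counter drifts above the filesystem's size the subtraction goes negative. A small check with the /data6 numbers, assuming no root-reserved blocks so that free and available coincide (an assumption, not something stated in the report):

/* Reproduces the df arithmetic for /data6 from the first listing above. */
#include <stdio.h>

int main(void)
{
	long long total_1k = 13458176552LL;  /* 1K-blocks column */
	long long avail_1k = 13461752620LL;  /* Available column - larger than the total */
	long long used_1k  = total_1k - avail_1k;

	/* Prints -3576068, matching the Used column in the df output. */
	printf("Used = %lld KiB\n", used_1k);
	return 0;
}

Once the filesystem was unmounted, checked, and remounted, the counters were re-derived and the second listing shows a sane positive value again.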