Bug 9732

Summary: oops in extent code via ext4_fallocate
Product: File System
Reporter: Eric Sandeen (sandeen)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: CLOSED CODE_FIX
Severity: normal
CC: sandeen
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.24-rc7
Subsystem:
Regression: ---
Bisected commit-id:

Description Eric Sandeen 2008-01-11 12:39:58 UTC
Latest working kernel version: Unknown
Earliest failing kernel version: 2.6.24-rc7
Distribution: Fedora Rawhide
Hardware Environment: x86_64
Software Environment:
Problem Description:

A simple fallocate, truncate, fallocate sequence on the same file oopses the kernel.

Kernel under test: 2.6.24-rc7 with the latest ext4 patch stack as of about Jan 9, 2008, minus the delalloc patches.

Steps to reproduce, using the test program at http://oss.sgi.com/archives/xfs/2007-07/msg00092.html:

[root@bear-05 sdb8]# touch testfile
[root@bear-05 sdb8]# ./testfallocate -f testfile 0 65536
Trying to preallocate blocks (offset=0, len=65536)
fallocate system call succedded !  ret=0
# FALLOCATE TEST REPORT #
	New blocks preallocated = 16.
	Number of bytes preallocated = 65536
	Old file size = 0, New file size 65536.
	Old num blocks = 0, New num blocks 64.


### TESTS PASSED ###
[root@bear-05 sdb8]# ls  -lh testfile; du -hc testfile
-rw-r--r-- 1 root root 64K Jan 11 14:12 testfile
64K	testfile
64K	total
[root@bear-05 sdb8]# /root/truncate testfile 32768
Truncating testfile to 32768
[root@bear-05 sdb8]# ls  -lh testfile; du -hc testfile
-rw-r--r-- 1 root root 32K Jan 11 14:12 testfile
32K	testfile
32K	total
[root@bear-05 sdb8]# ./testfallocate -f testfile 0 65536
Trying to preallocate blocks (offset=0, len=65536)
Segmentation fault


The second fallocate yields the following on the console:
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
------------[ cut here ]------------
kernel BUG at fs/ext4/extents.c:1056!
invalid opcode: 0000 [1] SMP 
CPU 2 
Modules linked in: ext4dev jbd2 crc16 autofs4 hidp nfs lockd nfs_acl rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand dm_multipath video output sbs sbshc battery ac power_supply parport_pc lp parport sg pata_acpi cfi_cmdset_0002 ata_generic cfi_util button jedec_probe serio_raw cfi_probe tg3 gen_probe ck804xrom mtd rtc_cmos chipreg pata_amd map_funcs k8temp libata hwmon i2c_nforce2 shpchp i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc scsi_tgt mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Pid: 3554, comm: testfallocate Not tainted 2.6.24-0.147.rc7.git2.fc9 #1
RIP: 0010:[<ffffffff88364497>]  [<ffffffff88364497>] :ext4dev:ext4_ext_search_left+0x97/0xbc
RSP: 0018:ffff81012dc65cf0  EFLAGS: 00010287
RAX: 0000000000008008 RBX: ffff81012dc65d78 RCX: 0000000000000000
RDX: ffff81012dc65d90 RSI: ffff81012e5f37e0 RDI: ffff81012e04c00c
RBP: ffff81012e04c180 R08: 0000000000000008 R09: 0000000000000000
R10: 0000000000000000 R11: ffff81012dc65d98 R12: 0000000000000008
R13: ffff81012e04c180 R14: 0000000000000008 R15: 0000000000000000
FS:  00002aaaaaab96f0(0000) GS:ffff81013fc01a28(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000327d0414a0 CR3: 0000000130ce8000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process testfallocate (pid: 3554, threadinfo ffff81012dc64000, task ffff81012dd5a000)
Stack:  ffffffff88366073 ffff8101349e71d0 ffffffff81053c91 000000022dcc30b8
 ffff81012dc65e88 0000000000000008 ffff81012e549000 ffff81012dcc30b8
 ffff81012dcc3090 ffff81012e04c000 ffff81012dcc30b8 0000000000000008
Call Trace:
 [<ffffffff88366073>] :ext4dev:ext4_ext_get_blocks+0x5ba/0x8c1
 [<ffffffff81053c91>] lock_release_holdtime+0x27/0x49
 [<ffffffff812748f6>] _spin_unlock+0x17/0x20
 [<ffffffff883400a6>] :jbd2:start_this_handle+0x4e0/0x4fe
 [<ffffffff88366564>] :ext4dev:ext4_fallocate+0x175/0x39a
 [<ffffffff81053c91>] lock_release_holdtime+0x27/0x49
 [<ffffffff81056480>] __lock_acquire+0x4e7/0xc4d
 [<ffffffff81053c91>] lock_release_holdtime+0x27/0x49
 [<ffffffff810a8de7>] sys_fallocate+0xe4/0x10d
 [<ffffffff8100c043>] tracesys+0xd5/0xda


Code: 0f 0b eb fe ff c8 89 02 0f b7 47 06 8b 4f 08 0f b7 57 04 48 
RIP  [<ffffffff88364497>] :ext4dev:ext4_ext_search_left+0x97/0xbc
 RSP <ffff81012dc65cf0>
---[ end trace 9a60a6a6c694770a ]---
SysRq : Resetting


The BUG_ON that fires (fs/ext4/extents.c:1056, in ext4_ext_search_left) is:

        BUG_ON(*logical < le32_to_cpu(ex->ee_block) + le16_to_cpu(ex->ee_len));

and these were the values at the time of the oops:

        logical 8 ee_block 0 ee_len 32776

I haven't looked further into it yet.
Comment 1 Eric Sandeen 2008-01-11 15:37:44 UTC
This dies in the same way:

[root@bear-05 sdb8]# ./testfallocate -f testfile 0 32768
[root@bear-05 sdb8]# ./testfallocate -f testfile 16384 65536

(note, they overlap)

as does this:

[root@bear-05 sdb8]# ./testfallocate -f testfile 0 16384
[root@bear-05 sdb8]# ./testfallocate -f testfile 32768 65536

(non-overlapping)
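
For anyone without the test program handy, the three failing sequences so far can be replayed with a few lines of C. This is only a sketch under assumptions: it assumes the test program issues a plain mode-0 fallocate (its output shows the file size changing, which suggests no keep-size flag), it uses the glibc fallocate() wrapper that newer glibc provides rather than a raw syscall, and struct step, replay_steps() and the three arrays are hypothetical names made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one step of a reproduction sequence. */
struct step {
    char  op;      /* 'f' = fallocate, 't' = ftruncate */
    off_t offset;  /* start offset for 'f'; ignored for 't' */
    off_t len;     /* length for 'f'; new file size for 't' */
};

/*
 * Replays the steps against path, creating the file if needed.
 * Returns 0 on success, -1 with errno set on the first failure.
 * Assumption: mode 0 matches what the original test program passes.
 */
int replay_steps(const char *path, const struct step *steps, int nsteps)
{
    assert(path != NULL && steps != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    for (int i = 0; i < nsteps; i++) {
        int rc = (steps[i].op == 'f')
            ? fallocate(fd, 0, steps[i].offset, steps[i].len)
            : ftruncate(fd, steps[i].len);
        if (rc != 0) {
            int saved = errno;   /* close() may clobber errno */
            close(fd);
            errno = saved;
            return -1;
        }
    }
    return close(fd);
}

/* Description: fallocate 64K, truncate to 32K, fallocate 64K again. */
const struct step original_report[] = {
    { 'f', 0, 65536 }, { 't', 0, 32768 }, { 'f', 0, 65536 },
};

/* Comment 1, overlapping ranges. */
const struct step overlapping[] = {
    { 'f', 0, 32768 }, { 'f', 16384, 65536 },
};

/* Comment 1, non-overlapping ranges. */
const struct step non_overlapping[] = {
    { 'f', 0, 16384 }, { 'f', 32768, 65536 },
};
```

Calling replay_steps("testfile", original_report, 3) on an ext4 mount with the affected patch stack should hit the same oops on the last step. On an unaffected setup it is expected to simply return 0, or -1 with errno set if the filesystem does not support fallocate.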
Comment 2 Eric Sandeen 2008-01-15 07:29:55 UTC
Aneesh fixed this in the ext4 patch queue.

[PATCH] ext4: use ext4_ext_get_actual_len instead of directly using ext4_extent.ee_len

Thanks!
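
For reference, the numbers in the description line up with that patch title. ee_len 32776 is 0x8008 (which is also what RAX holds in the oops), i.e. 32768 + 8. As far as the extent format goes, a preallocated (uninitialized) extent stores its length with 1 << 15 added, and ext4_ext_get_actual_len() strips that back off, so the real length here is 8 blocks. Below is a minimal userspace sketch of the arithmetic. Every sketch_* name and SKETCH_INIT_MAX_LEN are stand-ins for illustration, not kernel code, and the little-endian conversions are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as the kernel's EXT_INIT_MAX_LEN (1 << 15). */
#define SKETCH_INIT_MAX_LEN (1U << 15)

/*
 * Hypothetical stand-in for ext4_ext_get_actual_len(), host byte order.
 * Values above SKETCH_INIT_MAX_LEN mark an uninitialized extent whose
 * real length is the stored value minus SKETCH_INIT_MAX_LEN.
 */
static inline unsigned sketch_actual_len(uint16_t ee_len)
{
    return ee_len <= SKETCH_INIT_MAX_LEN ? ee_len
                                         : ee_len - SKETCH_INIT_MAX_LEN;
}

/* Nonzero when the BUG_ON condition from the description would fire. */
int sketch_bug_on_raw(uint32_t logical, uint32_t ee_block, uint16_t ee_len)
{
    return logical < ee_block + ee_len;
}

/* Same check, but using the decoded length instead of the raw field. */
int sketch_bug_on_actual(uint32_t logical, uint32_t ee_block, uint16_t ee_len)
{
    return logical < ee_block + sketch_actual_len(ee_len);
}

/* Replays the values reported in the description. */
void sketch_replay_reported_values(void)
{
    assert(sketch_actual_len(32776) == 8);        /* 0x8008 -> 8 blocks */
    assert(sketch_bug_on_raw(8, 0, 32776));       /* 8 < 0 + 32776 */
    assert(!sketch_bug_on_actual(8, 0, 32776));   /* 8 < 0 + 8 is false */
}
```

With the raw ee_len the check is 8 < 0 + 32776, which is true, so the BUG_ON fires. With the decoded length it becomes 8 < 0 + 8, which is false.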