Bug 9855 - ext3 ACL corruption
Summary: ext3 ACL corruption
Status: REJECTED INVALID
Alias: None
Product: File System
Classification: Unclassified
Component: ext3
Hardware: All Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-30 14:29 UTC by Kevin Shanahan
Modified: 2008-02-05 00:45 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.23
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Kernel .config (2.6.24) (44.09 KB, text/plain)
2008-01-30 14:45 UTC, Kevin Shanahan
dmesg (running 2.6.24) (15.01 KB, text/plain)
2008-01-30 14:47 UTC, Kevin Shanahan
dump of inode 966665 (8.00 KB, application/octet-stream)
2008-01-31 01:11 UTC, Kevin Shanahan
dump of inode 1294339 (24.50 KB, application/octet-stream)
2008-01-31 01:12 UTC, Kevin Shanahan
dump of inode block (4.00 KB, application/octet-stream)
2008-01-31 12:21 UTC, Kevin Shanahan

Description Kevin Shanahan 2008-01-30 14:29:25 UTC
Latest working kernel version: Unknown
Earliest failing kernel version: Definitely 2.6.23 and 2.6.23.8 but earlier is possible
Distribution: Debian Etch
Hardware Environment: Multiple x86 machines

Software Environment:
Filesystem is Ext3 on LVM on RAID-1 (on SATA).
# e2fsck -V
e2fsck 1.40-WIP (14-Nov-2006)
	Using EXT2FS Library version 1.40-WIP, 14-Nov-2006

Problem Description:
On several occasions now I have had e2fsck prune away ACLs on my file systems during a file system check after rebooting a number of (reasonably) long running Samba servers. This morning I decided to manually run fsck before rebooting one of these:

# e2fsck -pfv /dev/mapper/vg_main-lv_samba
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
/dev/mapper/vg_main-lv_samba: Extended attribute in inode 163841 has a value offset (56) which is invalid
CLEARED.
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
/dev/mapper/vg_main-lv_samba: Extended attribute in inode 262146 has a value offset (56) which is invalid
CLEARED.
[ snip lots of (near) identical errors]

    8301 inodes used (0.08%)
    1621 non-contiguous inodes (19.5%)
         # of inodes with ind/dind/tind blocks: 3837/24/0
 1108478 blocks used (5.29%)
       0 bad blocks
       1 large file

    7590 regular files
     662 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
      40 symbolic links (38 fast symbolic links)
       0 sockets
--------
    8292 files
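
To see which inodes e2fsck touched, the log can be scanned for the message shown above. This is a minimal Python sketch, assuming the message wording quoted in this report (other e2fsprogs versions may phrase it differently) and assuming each message sits on one line; `affected_inodes` is a name made up for illustration.

```python
import re

# Message text as it appears in this report; treat the exact wording
# as an assumption for any other e2fsprogs version.
BAD_EA = re.compile(
    r"Extended attribute in inode (\d+) has a value offset \((\d+)\) which is invalid"
)

def affected_inodes(log_text):
    """Return (inode, value_offset) pairs found in e2fsck output."""
    return [(int(ino), int(offs)) for ino, offs in BAD_EA.findall(log_text)]

sample = """\
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
/dev/mapper/vg_main-lv_samba: Extended attribute in inode 163841 has a value offset (56) which is invalid
CLEARED.
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
/dev/mapper/vg_main-lv_samba: Extended attribute in inode 262146 has a value offset (56) which is invalid
CLEARED.
"""

for ino, offs in affected_inodes(sample):
    print(ino, offs)
```

The inode numbers can then be mapped back to path names, for example with the `ncheck` command in debugfs.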

(Note: after remounting)
# tune2fs -l /dev/mapper/vg_main-lv_samba 
tune2fs 1.40-WIP (14-Nov-2006)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          88677414-c1f8-41ba-b737-d9f6170d771b
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed directory hash 
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              10485760
Block count:              20971520
Reserved block count:     1048576
Free blocks:              19863038
Free inodes:              10477459
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1019
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   1024
Filesystem created:       Wed Feb 21 21:38:33 2007
Last mount time:          Thu Jan 31 03:18:54 2008
Last write time:          Thu Jan 31 03:18:54 2008
Mount count:              1
Maximum mount count:      30
Last checked:             Thu Jan 31 03:16:51 2008
Check interval:           15552000 (6 months)
Next check after:         Tue Jul 29 02:16:51 2008
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      be8c201b-3563-4fa5-a2a6-e2864e4b73e2
Journal backup:           inode blocks


Steps to reproduce:
Unfortunately, precise steps are not known. Restoring all the filesystem's ACLs from a recent dump made using "getfacl -RP" fixes the ACLs without causing the corruption to return.
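
The dump-and-restore workaround can be scripted. The sketch below is only an illustration: the function names are hypothetical, and it assumes `getfacl` and `setfacl` from the standard acl package are on the PATH. `getfacl -R -P` matches the "getfacl -RP" invocation mentioned above, and `setfacl --restore=FILE` reads back a dump in that format.

```python
import subprocess

def dump_acls_cmd():
    # -R: recurse into directories, -P: physical walk (do not follow symlinks)
    return ["getfacl", "-R", "-P", "."]

def restore_acls_cmd(dump_file):
    # setfacl reads a dump produced by "getfacl -R"
    return ["setfacl", "--restore=" + dump_file]

def dump_acls(root, dump_file):
    """Write the ACLs of everything under root to dump_file."""
    with open(dump_file, "w") as out:
        subprocess.run(dump_acls_cmd(), cwd=root, stdout=out, check=True)

def restore_acls(root, dump_file):
    """Re-apply a dump; run from the same root the dump was taken in."""
    subprocess.run(restore_acls_cmd(dump_file), cwd=root, check=True)
```

Because both commands run with `cwd=root`, `dump_file` should be an absolute path so the restore step finds the same file.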

These are production Samba servers making fairly extensive use of file and directory ACLs. Thus far, I've only noticed the corruptions when it came time to upgrade to a new kernel and reboot (and the boot scripts then run fsck). Note that I've never noticed any issues at runtime because of this - only when I later realised that ACLs had been removed from random files and/or directories.

I think I will implement some scripts to unmount and run fsck nightly from cron, so I can at least detect the corruption a little earlier. If there is some more helpful debugging output I can provide, please let me know.
Comment 1 Kevin Shanahan 2008-01-30 14:45:06 UTC
Created attachment 14649 [details]
Kernel .config (2.6.24)

This is the .config of the kernel currently running. Nothing significant has been changed since the 2.6.23 config (certainly nothing fs related).
Comment 2 Kevin Shanahan 2008-01-30 14:47:24 UTC
Created attachment 14650 [details]
dmesg (running 2.6.24)

dmesg from the same machine, currently running 2.6.24.
Comment 3 Anonymous Emailer 2008-01-30 14:50:05 UTC
Reply-To: akpm@linux-foundation.org

On Wed, 30 Jan 2008 14:29:27 -0800 (PST)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9855
> 
>            Summary: ext3 ACL corruption
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.23
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext3
>         AssignedTo: akpm@osdl.org
>         ReportedBy: kmshanah@ucwb.org.au
> 
> 
Comment 4 Anonymous Emailer 2008-01-30 23:49:28 UTC
Reply-To: adilger@sun.com

On Jan 30, 2008  14:49 -0800, Andrew Morton wrote:
> > Problem Description:
> > On several occasions now I have had e2fsck prune away ACLs on my file systems
> > during a file system check after rebooting a number of (reasonably) long
> > running Samba servers. This morning I decided to manually run fsck before
> > rebooting one of these:
> > 
> > # e2fsck -pfv /dev/mapper/vg_main-lv_samba
> > (entry->e_value_offs + entry->e_value_size: 116, offs: 120)
> > /dev/mapper/vg_main-lv_samba: Extended attribute in inode 163841 has a value offset (56) which is invalid
> > CLEARED.
> > (entry->e_value_offs + entry->e_value_size: 116, offs: 120)
> > /dev/mapper/vg_main-lv_samba: Extended attribute in inode 262146 has a value offset (56) which is invalid
> > CLEARED.

While these error messages still exist in e2fsck, this code appears to
have been changed somewhat because these same error messages no longer
get printed in e2fsprogs 1.40.5.

> > Inode size:               256     

This is a bit interesting, since it isn't very common to use large inodes.
I suspect this relates to the problem.

> > These are production Samba servers making fairly extensive use of file and
> > directory ACLs. Thus far, I've only noticed the corruptions when it came time
> > to upgrade to a new kernel and reboot (and the boot scripts then run fsck).
> > Note that I've never noticed any issues at runtime because of this - only when
> > I later realised that ACLs had been removed from random files and/or
> > directories.
> > 
> > I think I will implement some scripts to unmount and run fsck nightly from
> > cron, so I can at least detect the corruption a little earlier. If there is
> > some more helpful debugging output I can provide, please let me know.

There is just such a script in the thread "forced fsck (again?)", which should
suit you since you are using LVs for the filesystem.
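
The script from that thread is not reproduced here. As a separate illustration only, and on the assumption that the point of having LVs is that a snapshot can be checked while the origin stays mounted, a nightly check might look like the Python sketch below. The snapshot name and size are placeholders, and the function names are hypothetical; `lvcreate --snapshot`, `e2fsck -f -n` and `lvremove -f` are the standard LVM2 and e2fsprogs invocations.

```python
import subprocess

def snapshot_fsck_cmds(vg, lv, snap="fsck_snap", size="1G"):
    """Build the three commands: create snapshot, read-only fsck, remove."""
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        ["lvcreate", "--snapshot", "--size", size, "--name", snap, origin],
        ["e2fsck", "-f", "-n", snap_dev],   # -n: open read-only, answer "no"
        ["lvremove", "-f", snap_dev],
    ]

def check_snapshot(vg, lv, run=subprocess.run):
    """Return True if e2fsck exits 0 on a snapshot of vg/lv."""
    create, fsck, remove = snapshot_fsck_cmds(vg, lv)
    run(create, check=True)
    try:
        result = run(fsck)          # non-zero exit is reported, not raised
    finally:
        run(remove, check=True)     # always drop the snapshot
    return result.returncode == 0
```

A snapshot taken from a mounted ext3 filesystem may carry an unreplayed journal, so a read-only check of it can differ from a check of the cleanly unmounted filesystem; any complaint is a prompt for a proper offline fsck rather than proof of damage.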

If you are able to reproduce this, could you please dump the inode and EA
block before fixing the problem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Comment 5 Kevin Shanahan 2008-01-31 01:08:01 UTC
(In reply to comment #4)
> If you are able to reproduce this, could you please dump the inode and EA
> block before fixing the problem.

Ok, I've got another occurrence already. Different machine to above, but identical fs arrangement. Hopefully I've done the dump right - this is new territory for me.

# fsck -nfv /dev/mapper/vg_main-lv_samba
e2fsck 1.40-WIP (14-Nov-2006)
Pass 1: Checking inodes, blocks, and sizes
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
Extended attribute in inode 966665 has a value offset (72) which is invalid
Clear? no

(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
Extended attribute in inode 1294339 has a value offset (72) which is invalid
Clear? no

[ snip 1613 further occurrences ]

# debugfs /dev/mapper/vg_main-lv_samba 
debugfs 1.40-WIP (14-Nov-2006)
debugfs:  dump <966665> /root/inode.966665
debugfs:  dump <1294339> /root/inode.1294339
debugfs:  quit

I'll attach the two inode dumps in a moment. Not sure about the EA block - I couldn't see any commands for getting at EAs, though from the wording of the fsck warnings, I guess they might be in-inode EAs.

Regards,
Kevin
Comment 6 Kevin Shanahan 2008-01-31 01:11:38 UTC
Created attachment 14662 [details]
dump of inode 966665
Comment 7 Kevin Shanahan 2008-01-31 01:12:18 UTC
Created attachment 14663 [details]
dump of inode 1294339
Comment 8 Kevin Shanahan 2008-01-31 04:14:32 UTC
On Thu, 2008-01-31 at 03:05 -0700, Andreas Dilger wrote:
> ...  To get the interesting bits you need:
> 
> debugfs: stat <966665>   # prints decoded inode, "File ACL:" is a block number
> debugfs: imap <966665>   # prints inode block number, offset
> 
> dd if=/dev/mapper/vg_main-lv_samba of=/tmp/inode.bin bs=4k count=1 skip={iblock}
> dd if=/dev/mapper/vg_main-lv_samba of=/tmp/inode.bin bs=4k count=1 skip={ACLblk}
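
For reference, the dd step amounts to seeking to block_number * block_size and reading one block; the inode then sits at the offset reported by imap. A small Python sketch of the same thing, assuming the 4096-byte block size and 256-byte inode size reported by tune2fs in the description (the helper names are made up):

```python
def read_block(path, block_no, block_size=4096):
    """Read one filesystem block, like dd bs=4k count=1 skip=block_no."""
    with open(path, "rb") as dev:
        dev.seek(block_no * block_size)
        return dev.read(block_size)

def inode_bytes(block, offset, inode_size=256):
    """Slice a single on-disk inode out of an inode-table block."""
    return block[offset:offset + inode_size]

# Example with the values that imap reports for inode 3342652 in this comment:
#   block = read_block("/dev/mapper/vg_main-lv_samba", 6684693)
#   raw_inode = inode_bytes(block, 0x0b00)
```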

Ah, ok - learning fast. Let's see how I go this time:

e2fsck 1.40-WIP (14-Nov-2006)
Pass 1: Checking inodes, blocks, and sizes
(entry->e_value_offs + entry->e_value_size: 116, offs: 120)
Extended attribute in inode 3342652 has a value offset (72) which is invalid
Clear? no
...

# debugfs /dev/mapper/vg_main-lv_samba
debugfs:  stat <3342652>

Inode: 3342652   Type: regular    Mode:  0770   Flags: 0x0   Generation: 3684645243
User:     0   Group: 10140   Size: 18432
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 40
Fragment:  Address: 0    Number: 0    Size: 0
ctime: 0x475be06e -- Sun Dec  9 23:02:46 2007
atime: 0x475d4073 -- Tue Dec 11 00:04:43 2007
mtime: 0x45d2686a -- Wed Feb 14 12:09:54 2007
Size of extra inode fields: 4
Extended attributes stored in inode body: 
   = "01 00 00 00 01 00 07 00 04 00 05 00 08 00 05 00 d6 27 00 00 08 00 07 00 09 28 00 00 08 00 07 00 0a 28 00 00 10 00 07 00 20 00 00 00 " (44)
  DOSATTRIB = "0x20" (4)
BLOCKS:
(0):6713397, (1):6713399, (2):6713395, (3):6713405, (4):6713396
TOTAL: 5

debugfs:  imap <3342652>
Inode 3342652 is part of block group 204
      located at block 6684693, offset 0x0b00

# dd if=/dev/mapper/vg_main-lv_samba of=iblock.bin bs=4k count=1 skip=6684693
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000429132 seconds, 9.5 MB/s
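
The unnamed 44-byte attribute in the stat output above is the POSIX access ACL. As a cross-check, here is a short Python sketch that decodes those bytes. It assumes the ext3 on-disk ACL layout (a 32-bit version, then entries of 16-bit tag and 16-bit permission, with a 32-bit id only for ACL_USER and ACL_GROUP entries) and the tag values from the kernel's posix_acl.h; `decode_acl` is a hypothetical helper name.

```python
import struct

# Tag values as defined in the Linux kernel's posix_acl.h
TAGS = {0x01: "USER_OBJ", 0x02: "USER", 0x04: "GROUP_OBJ",
        0x08: "GROUP", 0x10: "MASK", 0x20: "OTHER"}
WITH_ID = {0x02, 0x08}   # only named user/group entries carry an id

# Bytes copied from the debugfs "stat" output in this comment
ACL_HEX = ("01 00 00 00 01 00 07 00 04 00 05 00 08 00 05 00 d6 27 00 00 "
           "08 00 07 00 09 28 00 00 08 00 07 00 0a 28 00 00 10 00 07 00 "
           "20 00 00 00")

def decode_acl(raw):
    """Return (version, [(tag_name, perm, id_or_None), ...])."""
    (version,) = struct.unpack_from("<I", raw, 0)
    pos, entries = 4, []
    while pos < len(raw):
        tag, perm = struct.unpack_from("<HH", raw, pos)
        pos += 4
        ident = None
        if tag in WITH_ID:
            (ident,) = struct.unpack_from("<I", raw, pos)
            pos += 4
        entries.append((TAGS.get(tag, hex(tag)), perm, ident))
    return version, entries

version, entries = decode_acl(bytes.fromhex(ACL_HEX))
for entry in entries:
    print(entry)
```

Under these assumptions the bytes decode to seven entries: owner rwx, owning group r-x, three named groups (gid 10198 with r-x, gids 10249 and 10250 with rwx), mask rwx and other with no permissions, which agrees with the inode mode of 0770.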

I'm assuming that "File ACL: 0" means that there's no ACL block.

> Attach all of the above to an email to the list...  It will only be 8 kB.

Ok, done. Hopefully this is what you mean by "the list". :)

Cheers,
Kevin.
Comment 9 Eric Sandeen 2008-01-31 06:58:17 UTC
Andreas Dilger wrote:
> On Jan 30, 2008  14:49 -0800, Andrew Morton wrote:
>>> Problem Description:

>>> Inode size:               256     
> 
> This is a bit interesting, since it isn't very common to use large inodes.
> I suspect this relates to the problem.

I think it is somewhat common on samba servers, though.

And it's the new default in the latest e2fsprogs... maybe something will
shake out in the F9 development cycle.

>>> These are production Samba servers making fairly extensive use of file and
>>> directory ACLs. Thus far, I've only noticed the corruptions when it came
>>> time
>>> to upgrade to a new kernel and reboot (and the boot scripts then run fsck).
>>> Note that I've never noticed any issues at runtime because of this - only
>>> when
>>> I later realised that ACLs had been removed from random files and/or
>>> directories.
>>>
>>> I think I will implement some scripts to unmount and run fsck nightly from
>>> cron, so I can at least detect the corruption a little earlier. If there is
>>> some more helpful debugging output I can provide, please let me know.
> 
> There is just such a script in the thread "forced fsck (again?)".  Since you
> are using LVs for the filesystem.

Which is on the ext3-users list btw...

> If you are able to reproduce this, could you please dump the inode and EA
> block before fixing the problem.

Do you need instructions on doing that?

-Eric
Comment 10 Kevin Shanahan 2008-01-31 12:21:17 UTC
Created attachment 14665 [details]
dump of inode block

See comment #8
Comment 11 Anonymous Emailer 2008-02-04 17:06:36 UTC
Reply-To: adilger@sun.com

On Jan 31, 2008  22:44 +1030, Kevin Shanahan wrote:
> On Thu, 2008-01-31 at 03:05 -0700, Andreas Dilger wrote:
> > ...  To get the interesting bits you need:
> > 
> > debugfs: stat <966665>   # prints decoded inode, "File ACL:" is a block number
> > debugfs: imap <966665>   # prints inode block number, offset
> > 
> > dd if=/dev/mapper/vg_main-lv_samba of=/tmp/inode.bin bs=4k count=1 skip={iblock}
> > dd if=/dev/mapper/vg_main-lv_samba of=/tmp/inode.bin bs=4k count=1 skip={ACLblk}
> 
> Ah, ok - learning fast. Lets see how I go this time:
> 
> # debugfs /dev/mapper/vg_main-lv_samba
> debugfs:  stat <3342652>
> 
> Inode: 3342652   Type: regular    Mode:  0770   Flags: 0x0   Generation: 3684645243
> User:     0   Group: 10140   Size: 18432
> File ACL: 0    Directory ACL: 0
> Links: 1   Blockcount: 40
> Fragment:  Address: 0    Number: 0    Size: 0
> ctime: 0x475be06e -- Sun Dec  9 23:02:46 2007
> atime: 0x475d4073 -- Tue Dec 11 00:04:43 2007
> mtime: 0x45d2686a -- Wed Feb 14 12:09:54 2007
> Size of extra inode fields: 4
> Extended attributes stored in inode body: 
>    = "01 00 00 00 01 00 07 00 04 00 05 00 08 00 05 00 d6 27 00 00 08 00 07 00 09 28 00 00 08 00 07 00 0a 28 00 00 10 00 07 00 20 00 00 00 " (44)
>   DOSATTRIB = "0x20" (4)
> BLOCKS:
> (0):6713397, (1):6713399, (2):6713395, (3):6713405, (4):6713396
> TOTAL: 5
> 
> debugfs:  imap <3342652>
> Inode 3342652 is part of block group 204
>       located at block 6684693, offset 0x0b00

The hexdump of this data (od -Ax -tx4 -a) shows the EA is in good shape:

000b80 00000004        ea020000        00480200        00000000
       eot nul nul nul nul nul stx   j nul stx   H nul nul nul nul nul

inode.i_extra_isize=0x0004
ext3_xattr_ibody_header.h_magic=0xea020000

[EA1 entry]
ext3_xattr_entry.e_name_len=0x00  (unused for POSIX_ACL_ACCESS)
ext3_xattr_entry.e_name_index=0x02 (EXT3_INDEX_POSIX_ACL_ACCESS)
ext3_xattr_entry.e_value_offs=0x0048 = 72
ext3_xattr_entry.e_value_block=0x00000000  (unused)



000b90 0000002c        00000000        00740109        00000000
         , nul nul nul nul nul nul nul  ht soh   t nul nul nul nul nul
[EA1 cont.]
ext3_xattr_entry.e_value_size=0x0000002c = 44
ext3_xattr_entry.e_hash=0x00000000  (currently unused)

[EA2 entry]
ext3_xattr_entry.e_name_len=0x09
ext3_xattr_entry.e_name_index=0x01 (EXT3_INDEX_USER)
ext3_xattr_entry.e_value_offs=0x0074 = 116
ext3_xattr_entry.e_value_block=0x00000000  (unused)

000ba0 00000004        00000000        41534f44        49525454
       eot nul nul nul nul nul nul nul   D   O   S   A   T   T   R   I
[EA2 cont.]
ext3_xattr_entry.e_value_size=0x00000004 = 4
ext3_xattr_entry.e_hash=0x00000000  (currently unused)
ext3_xattr_entry.e_name=DOSATTRIB

000bb0 00000042        00000000        00000000        00000000
         B nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
000bc0 00000000        00000000        00000000        00000000
       nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul

000bd0 00000001        00070001        00050004        00050008
       soh nul nul nul soh nul bel nul eot nul enq nul  bs nul enq nul
[EA1 data]
ext3_acl_header.a_version=0x00000001
ext3_acl_entry_short[0].e_tag=0x0001  (ACL_USER_OBJ)
ext3_acl_entry_short[0].e_perm=0x0007
ext3_acl_entry_short[1].e_tag=0x0004  (ACL_GROUP_OBJ)
ext3_acl_entry_short[1].e_perm=0x0005
ext3_acl_entry[2].e_tag=0x0008  (ACL_GROUP)
ext3_acl_entry[2].e_perm=0x0005

000be0 000027d6        00070008        00002809        00070008
         V   ' nul nul  bs nul bel nul  ht   ( nul nul  bs nul bel nul
[EA1 data cont]
ext3_acl_entry[2].e_id=0x000027d6
ext3_acl_entry[3].e_tag=0x0008  (ACL_GROUP)
ext3_acl_entry[3].e_perm=0x0007
ext3_acl_entry[3].e_id=0x00002809
ext3_acl_entry[4].e_tag=0x0008  (ACL_GROUP)
ext3_acl_entry[4].e_perm=0x0007

000bf0 0000280a        00070010        00000020        30327830
        nl   ( nul nul dle nul bel nul  sp nul nul nul   0   x   2   0
[EA1 data cont]
ext3_acl_entry[4].e_id=0x0000280a
ext3_acl_entry_short[5].e_tag=0x0010  (ACL_MASK)
ext3_acl_entry_short[5].e_perm=0x0007
ext3_acl_entry_short[6].e_tag=0x0020  (ACL_OTHER)
ext3_acl_entry_short[6].e_perm=0x0000

[EA2 data]
"0x20"

> I'm assuming that "File ACL: 0" means that there's no ACL block.

Right.
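
The entry walk in the hexdump above can also be checked mechanically. The Python sketch below rebuilds inode bytes 0x80..0xff from the od words (little-endian, block offsets 0xb80..0xbff) and parses the in-inode EA entries. It assumes the ext3_xattr_entry field order used in the annotations (name_len, name_index, value_offs, value_block, value_size, hash, then the name padded to 4 bytes); `parse_ibody_xattrs` is a name invented for this example.

```python
import struct

# od -tx4 words for block offsets 0xb80..0xbff, copied from the dump above
WORDS = [
    0x00000004, 0xea020000, 0x00480200, 0x00000000,
    0x0000002c, 0x00000000, 0x00740109, 0x00000000,
    0x00000004, 0x00000000, 0x41534f44, 0x49525454,
    0x00000042, 0x00000000, 0x00000000, 0x00000000,
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
    0x00000001, 0x00070001, 0x00050004, 0x00050008,
    0x000027d6, 0x00070008, 0x00002809, 0x00070008,
    0x0000280a, 0x00070010, 0x00000020, 0x30327830,
]
EXTRA = struct.pack("<32I", *WORDS)   # inode bytes 0x80..0xff
XATTR_MAGIC = 0xEA020000

def parse_ibody_xattrs(extra):
    """Return (name_index, name, value_offs, value_size, value) per entry."""
    (extra_isize,) = struct.unpack_from("<H", extra, 0)
    (magic,) = struct.unpack_from("<I", extra, extra_isize)
    if magic != XATTR_MAGIC:
        return []
    first = extra_isize + 4            # first entry follows h_magic
    pos, found = first, []
    # A run of four zero bytes terminates the entry list.
    while pos + 16 <= len(extra) and extra[pos:pos + 4] != b"\0\0\0\0":
        name_len, index, offs, _block, size, _hash = struct.unpack_from(
            "<BBHIII", extra, pos)
        name = extra[pos + 16:pos + 16 + name_len].decode("ascii")
        value = extra[first + offs:first + offs + size]   # offs is relative to first entry
        found.append((index, name, offs, size, value))
        pos += (16 + name_len + 3) & ~3                   # entries are 4-byte aligned
    return found

for entry in parse_ibody_xattrs(EXTRA):
    print(entry[:4], entry[4])
```

With these bytes the sketch finds two entries: the unnamed ACL attribute (index 2, offset 72, size 44) and DOSATTRIB (index 1, offset 116, size 4, value "0x20").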

> e2fsck 1.40-WIP (14-Nov-2006)
> Pass 1: Checking inodes, blocks, and sizes
> (entry->e_value_offs + entry->e_value_size: 116, offs: 120)
> Extended attribute in inode 3342652 has a value offset (72) which is invalid
> Clear? no
> ...

Hmm, I wonder if e2fsck is calculating the wrong file offset or something?
The kernel appears to be taking the EA data offset from the end of
i_extra_isize and the ext3_xattr_ibody_header fields (so 0x88 + e_value_offs
from the start of the inode).
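
To spell out that arithmetic with the numbers from this inode: the fixed part of the inode is 128 bytes (0x80), i_extra_isize adds 4 and the h_magic field another 4, giving 0x88. A small worked example in Python; the helper name is made up, and the 256-byte inode size is taken from the tune2fs output in the description.

```python
GOOD_OLD_INODE_SIZE = 128   # fixed part of an ext3 inode (0x80)
MAGIC_SIZE = 4              # ext3_xattr_ibody_header.h_magic

def ea_value_block_offset(inode_off, extra_isize, e_value_offs):
    """Offset of an EA value within the inode-table block."""
    first_entry = inode_off + GOOD_OLD_INODE_SIZE + extra_isize + MAGIC_SIZE
    return first_entry + e_value_offs

# Inode 3342652 sits at block offset 0x0b00 with i_extra_isize = 4.
print(hex(ea_value_block_offset(0x0b00, 4, 72)))    # ACL value
print(hex(ea_value_block_offset(0x0b00, 4, 116)))   # DOSATTRIB value

# Space left for entries and values in a 256-byte inode
print(256 - GOOD_OLD_INODE_SIZE - 4 - MAGIC_SIZE)
```

The two offsets come out as 0xbd0 and 0xbfc, which is where the hexdump shows the ACL data and the "0x20" string, and the remaining space is 120 bytes, the same figure e2fsck prints as "offs: 120".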

Conversely, debugfs isn't having any problem with this EA at all.

Ah, I think I see the problem, and this was fixed in newer e2fsck.
The EAs are stored "out of order" in the inode and older e2fsprogs
considered that an error.  That was fixed in the final 1.40 release:

	Remove check in e2fsck which requires EA's in inodes to be sorted;
	they don't need to be sorted, and e2fsck was previously wrongly
	clearing unsorted EA's stored in the inode structure.
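
The difference between the two behaviours can be illustrated with the values from this inode. The Python sketch below is not copied from e2fsprogs. The "strict" function is a guess at the old rule, reconstructed only from the printed numbers (116 and 120): each value is expected to end exactly where the previous one began, starting from the end of the 120-byte EA area. The "bounds only" function stands in for a check that no longer cares about ordering. Both function names are hypothetical.

```python
def strict_order_check(entries, storage_size):
    """Assumed old rule: values must be packed from the end in entry order.

    Returns None if all entries pass, else (value_offs, value_end, expected_end)
    for the first entry that fails.
    """
    offs = storage_size
    for value_offs, value_size in entries:
        padded = (value_size + 3) & ~3          # values are 4-byte aligned
        if value_offs + padded != offs:
            return (value_offs, value_offs + value_size, offs)
        offs -= padded
    return None

def bounds_only_check(entries, storage_size):
    """Order-independent rule: each value only has to fit in the EA area."""
    for value_offs, value_size in entries:
        if value_offs + value_size > storage_size:
            return (value_offs, value_offs + value_size, storage_size)
    return None

# (e_value_offs, e_value_size) in entry order for inode 3342652:
# the ACL entry comes first but its value is stored below DOSATTRIB's.
entries = [(72, 44), (116, 4)]

print(strict_order_check(entries, 120))
print(bounds_only_check(entries, 120))
```

Under that assumption the strict check rejects the first entry with value offset 72, end 116 and expected end 120, the same three numbers as in the e2fsck message, while the order-independent check accepts both entries.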


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Comment 12 Kevin Shanahan 2008-02-05 00:42:38 UTC
On Mon, 2008-02-04 at 18:06 -0700, Andreas Dilger wrote:
> Hmm, I wonder if e2fsck is calculating the wrong file offset or something?
> The kernel appears to be taking the EA data offset from the end of
> i_extra_isize and the ext3_xattr_ibody_header fields (so 0x88 + e_value_offs
> from the start of the inode).
> 
> Conversely, debugfs isn't having any problem with this EA at all.
> 
> Ah, I think I see the problem, and this was fixed in newer e2fsck.
> The EAs are stored "out of order" in the inode and older e2fsprogs
> considered that an error.  That was fixed in the final 1.40 release:
> 
>       Remove check in e2fsck which requires EA's in inodes to be sorted;
>       they don't need to be sorted, and e2fsck was previously wrongly
>       clearing unsorted EA's stored in the inode structure.

Ah, I think you got it! I've just now reproduced the problem on one
filesystem with the old e2fsck-1.40-WIP from Debian Etch and then
checked again with the newer e2fsck-1.40.5 from Sid. The new version
doesn't report any problems.

Thanks very much for your time looking into this.

Regards,
Kevin.
Comment 13 Kevin Shanahan 2008-02-05 00:45:35 UTC
I guess marking it as INVALID is the right thing to do - not a kernel bug.
