Bug 8209 - fsck on large RAID arrays corrupts root partition
Summary: fsck on large RAID arrays corrupts root partition
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: Drivers
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 high
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-15 12:20 UTC by Peter Kerwien
Modified: 2009-03-23 10:57 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.20.3
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
My kernel config (39.09 KB, text/plain)
2007-03-19 23:52 UTC, Peter Kerwien
My installed packages (6.22 KB, text/plain)
2007-03-19 23:53 UTC, Peter Kerwien
Patch with the SG32 vs SG64 change, for testing (1.14 KB, patch)
2007-03-21 04:59 UTC, Olaf Kirch

Description Peter Kerwien 2007-03-15 12:20:45 UTC
Most recent kernel where this bug did *NOT* occur: ?
Distribution: Gentoo Linux amd64
Hardware Environment: Gigabyte GA-965P-DS4, Intel Core 2 Duo, ATI Radeon 7000
PCI graphics card, Areca ARC-1220 PCIe RAID controller
Software Environment:
Problem Description:

(I'm sorry I cannot give more detailed information, but it's my server that
crashes and I cannot test things at the moment since I don't want to risk losing
any data. I will try to set up a machine to reproduce the problem on, but I will
probably need help knowing what to look for.)

I have two RAID arrays in my system, both created with the Areca controller:

1. RAID1 (2 x 80GB) => /dev/sda
2. RAID5 (3 x 500GB) => /dev/sdb

On my /dev/sda I have 3 partitions:

1. sda1: /boot, approx. 32MB, ext3
2. sda2: swap, 2GB
3. sda3: /, the rest of the drive, ext3

On my /dev/sdb I have only one ~1TB partition formatted with ext3.

I'm running Gentoo Linux amd64 with a vanilla 2.6.20.3 kernel. After an fsck -f
/dev/sdb1, I rebooted the system and it failed to boot with the error
message: "No init found. Try passing init= option to kernel.".

When I tried to repair the system by booting from the Gentoo install CD, I
could not chroot into the system. I could mount /dev/sda3, and /sbin/init etc. are
still there, but it seems I can no longer run them.

A very similar (the same?) problem occurred for me about one year ago on a totally
different machine running Gentoo Linux i386 with an older kernel. I cannot
remember which version, but it was probably the latest stable 2.6.x at that
time. One system disk was formatted with reiserfs, plus one software RAID5 (4 x 200GB)
formatted with reiserfs; the software RAID5 was created and monitored with mdadm.
I hadn't checked the filesystem for a while, so I unmounted the RAID5 (/dev/md0) and
ran reiserfsck --fix-fixable /dev/md0, and during this the system became corrupt. From
what I can remember, I first received error messages when I tried to execute commands
in another virtual console, and then error messages that something respawned too
quickly and that the system would wait 5 minutes before trying again. After a
reboot, I also received a kernel panic message, probably the same as above.

It seems that an fsck or reiserfsck on large RAID volumes, in my case 1TB and
600GB respectively, can corrupt the / partition on another disk. Programs can no
longer execute, and the system refuses to boot because /sbin/init can no longer
be executed.

Steps to reproduce:

Create a RAID5 array (>600GB) with mdadm or an ARC-1220. Format it with reiserfs or
ext3. Use it for a while. Unmount the RAID5 array. Run fsck -f or reiserfsck
--fix-fixable on the RAID5 volume. Then try to reboot the system.
Comment 1 Peter Kerwien 2007-03-15 12:22:31 UTC
Forgot to mention: I used e2fsprogs 1.39. I cannot remember which reiserfsprogs
version I used a year ago.
Comment 2 Peter Kerwien 2007-03-16 13:53:01 UTC
I'm now trying to repair my Gentoo system. This is what I have found so far:

I cannot chroot into the system via the installcd:

livecd ~ # chroot /mnt/gentoo/ /bin/bash
chroot: cannot run command `/bin/bash': No such file or directory

I looked into the /lib64 directory on the system disk and realized that some very
important files were missing, e.g. ld-linux-x86-64.so.2!

After restoring glibc-2.5 from another Gentoo amd64, I can now chroot into the
system.

So during the execution of fsck -f on my RAID5 array, something deleted some
very important files on my root partition.
Comment 3 Peter Kerwien 2007-03-16 14:57:36 UTC
Other files that were missing:

/lib64/libgpm.so (gpm)
/lib64/libz.so (zlib)
/lib64/libbz2.so (bzip2)
/lib64/libbz2.so.1.0 (bzip2)
/lib64/libpam.so (pam)
/lib64/libpamc.so (pam)
/lib64/security/pam_unix_auth.so (pam)
/lib64/security/pam_unix_acct.so (pam)
/lib64/security/pam_unix_passwd.so (pam)
/lib64/security/pam_unix_session.so (pam)
/lib64/libcrack.so (cracklib)
/lib64/libwrap.so (tcp-wrappers)
/lib64/libe2p.so (e2fsprogs)
/lib64/libext2fs.so (e2fsprogs)
/lib64/libuuid.so (e2fsprogs)
/lib64/libcom_err.so (com_err)
/lib64/libss.so (ss)
/lib64/libreadline.so (readline)
/lib64/libhistory.so (readline)
/lib64/libpwdb.so (pwdb)

Here is my superblock information on the device I performed fsck -f on:

livecd / # tune2fs -l /dev/sdd1
tune2fs 1.38 (30-Jun-2005)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          1aaaf3ce-e85f-4a90-b816-34e165f8dfb6
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              122077184
Block count:              244139797
Reserved block count:     12206989
Free blocks:              76628610
Free inodes:              122024787
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   512
Filesystem created:       Fri Feb 16 20:06:04 2007
Last mount time:          Thu Mar 15 17:03:07 2007
Last write time:          Thu Mar 15 17:03:07 2007
Mount count:              0
Maximum mount count:      27
Last checked:             Thu Mar 15 17:03:07 2007
Check interval:           15552000 (6 months)
Next check after:         Tue Sep 11 17:03:07 2007
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      dec77015-10e6-4707-8d0d-176cc2442d45
Journal backup:           inode blocks
Comment 4 Peter Kerwien 2007-03-16 15:21:08 UTC
And some info from my / partition on /dev/sda3:

server2 ~ # tune2fs -l /dev/sda3
tune2fs 1.39 (29-May-2006)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          5f7a50e4-229c-4922-bfba-1840ceaf8904
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype needs_recovery sparse_super
large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              9519104
Block count:              19010919
Reserved block count:     950545
Free blocks:              17386661
Free inodes:              9132750
First block:              0
Block size:               4096
Fragment size:            4096
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16384
Inode blocks per group:   512
Filesystem created:       Sun Mar 11 16:27:38 2007
Last mount time:          Sat Mar 17 00:19:34 2007
Last write time:          Sat Mar 17 00:19:34 2007
Mount count:              17
Maximum mount count:      -1
Last checked:             Sun Mar 11 16:27:38 2007
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
First orphan inode:       4964353
Default directory hash:   tea
Directory Hash Seed:      5dcd3059-7f6a-4d2d-9c27-0a2be6ceaf31
Journal backup:           inode blocks
Comment 5 Olaf Kirch 2007-03-18 10:17:41 UTC
It seems raid5 is corrupting memory in some cases. See also bug #7458
and bug #8144
Comment 6 Anonymous Emailer 2007-03-19 01:34:53 UTC
Reply-To: erich@areca.com.tw

Hi Sir,

I will try to reproduce this bug in my lab.
If I find more clues to this bug, or need more help from you,
I will contact you again.

Best Regards
Erich Chen


Date: Thu, 15 Mar 2007 12:20:46 -0700
From: bugme-daemon@bugzilla.kernel.org
To: akpm@linux-foundation.org
Subject: [Bug 8209] New: fsck on large RAID arrays corrupts root partition


http://bugzilla.kernel.org/show_bug.cgi?id=8209

           Summary: fsck on large RAID arrays corrupts root partition
    Kernel Version: 2.6.20.3
            Status: NEW
          Severity: high
             Owner: akpm@osdl.org
         Submitter: peter@kerwien.homeip.net


Comment 7 Peter Kerwien 2007-03-19 07:55:08 UTC
More (Areca-related) information about my system. Please let me know what other
information you need.

Areca firmware is 1.42. I'm using the Areca driver included in the kernel.

server2 bin # cli64 disk info
 #   ModelName        Serial#          FirmRev     Capacity  State
===============================================================================
 1   WDC WD800JD-22M  WD-WMAM9HW53057  10.01E01      80.0GB  RaidSet Member(1)
 2   WDC WD800JD-22M  WD-WMAM9HW87298  10.01E01      80.0GB  RaidSet Member(1)
 3   WDC WD5000YS-01  WD-WCANU1414847  09.02E09     500.1GB  RaidSet Member(2)
 4   WDC WD5000YS-01  WD-WCANU1415641  09.02E09     500.1GB  RaidSet Member(2)
 5   WDC WD5000YS-01  WD-WCANU1477134  09.02E09     500.1GB  RaidSet Member(2)
===============================================================================
GuiErrMsg<0x00>: Success.

server2 bin # cli64 rsf info
 #  Name             Disks TotalCap  FreeCap DiskChannels       State
===============================================================================
 1  Raid Set # 00        2  160.0GB    0.0GB 12                 Normal
 2  Raid Set # 01        3 1500.0GB    0.0GB 345                Normal
===============================================================================
GuiErrMsg<0x00>: Success.

server2 bin # cli64 vsf info
 # Name             Raid# Level   Capacity Ch/Id/Lun  State
===============================================================================
 1 ARC-1220-VOL#00    1   Raid0+1   80.0GB 00/00/00   Normal
 2 ARC-1220-VOL#01    2   Raid5   1000.0GB 00/00/01   Normal
===============================================================================
GuiErrMsg<0x00>: Success.

Cache mode is write-through on both volumes.
Comment 8 Peter Kerwien 2007-03-19 23:52:08 UTC
Created attachment 10868 [details]
My kernel config

My kernel config
Comment 9 Peter Kerwien 2007-03-19 23:53:52 UTC
Created attachment 10869 [details]
My installed packages

A list of all installed applications.
Comment 10 Olaf Kirch 2007-03-21 04:54:16 UTC
There's something strange in the way the areca driver builds its
scatter gather lists:

address_lo = cpu_to_le32(dma_addr_lo32(sg_dma_address(sl)));
address_hi = cpu_to_le32(dma_addr_hi32(sg_dma_address(sl)));
if (address_hi == 0) {
        struct SG32ENTRY *pdma_sg = (struct SG32ENTRY *)psge;

        pdma_sg->length = length;
        ...
} else {
        struct SG64ENTRY *pdma_sg = (struct SG64ENTRY *)psge;

        pdma_sg->length = length | IS_SG64_ADDR;
        ...
}

IS_SG64_ADDR is 0x01000000 (i.e. bit 24). To me it appears
that whenever we do SG I/O to/from an address that is smaller than 2**32
but has bit 24 set, we create an SG32ENTRY which to the HBA looks like
an SG64ENTRY.

Shouldn't the condition read like this?

	if (address_hi == 0 && !(address_lo & IS_SG64_ADDR))
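
For illustration, here is a minimal user-space sketch of the two checks being
compared. The dma_addr_lo32/dma_addr_hi32 helpers and the sample address below are
simplified stand-ins, not the driver's own definitions; the sketch only shows that
for an address below 2**32 with bit 24 set, the existing check picks the SG32 path
while the proposed check (later retracted in comment #14) would pick the SG64 path.

#include <stdio.h>
#include <stdint.h>

#define IS_SG64_ADDR 0x01000000u        /* bit 24, as quoted above */

/* simplified stand-ins for the driver's address-split helpers */
static uint32_t dma_addr_lo32(uint64_t addr) { return (uint32_t)addr; }
static uint32_t dma_addr_hi32(uint64_t addr) { return (uint32_t)(addr >> 32); }

int main(void)
{
        /* hypothetical DMA address: below 2**32, but with bit 24 set */
        uint64_t dma = 0x01234000ULL;
        uint32_t address_lo = dma_addr_lo32(dma);
        uint32_t address_hi = dma_addr_hi32(dma);

        /* check as it appears in the driver excerpt above */
        printf("existing check: %s entry\n",
               address_hi == 0 ? "SG32" : "SG64");

        /* check proposed at the end of comment #10 */
        printf("proposed check: %s entry\n",
               (address_hi == 0 && !(address_lo & IS_SG64_ADDR)) ? "SG32" : "SG64");

        return 0;
}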
Comment 11 Olaf Kirch 2007-03-21 04:59:17 UTC
Created attachment 10886 [details]
Patch with the SG32 vs SG64 change, for testing

Totally untested patch - may or may not help.
Comment 12 Erich Chen 2007-03-28 03:07:17 UTC
The suggested check !(address_lo & IS_SG64_ADDR) is incorrect; IS_SG64_ADDR is a
flag defined only in the Areca firmware spec.
The Areca support team has worked on this for a long time and cannot reproduce this
"fsck on large RAID arrays corrupts root partition" problem in our lab.
Areca has run Gentoo, Red Hat and SuSE with the linux-2.6.20.3 kernel, and all of
them work fine.
We have also tested large 1.6TB RAID volumes, and they worked fine, so I do not
think the large volume size is the key point.
You can try to boot from a single RAID volume on the 2.6.20.3 kernel.
If that works, create or attach your new large RAID volume to the RAID adapter and
rescan your SCSI bus; you will find the new device. Then do the same procedure as
before with mkfs.ext3 and fsck, and reboot your system again.
I am not sure what the problem is on your site, but Areca received a Gigabyte
GA-965P-DS4 from Gigabyte last week and will try to reproduce this bug on that
mainboard.

Comment 13 Peter Kerwien 2007-03-28 03:56:35 UTC
I'm not sure this is related to the Areca driver at all. I experienced a similar
thing (but since it was over a year ago I'm not really sure) when I performed a
reiserfsck on a SW RAID5 created with mdadm, i.e. running a totally different
configuration. The only common thing I can see is:

Running fsck/reiserfsck on a RAID5 corrupts the / partition located on another
hard drive.

During the reiserfsck on the RAID5 volume, my / partition on another (then
unraided) system disk got corrupted. I couldn't start the system. Exactly the
same symptom I had recently. The system became unstable, i.e. I couldn't execute
commands after the fsck, and it refused to reboot after the fsck.

I haven't had the time or equipment to try this again at home. Sorry.

Info to Erich: The HW revision on my Gigabyte motherboard is 1.0 using firmware F8.
Comment 14 Olaf Kirch 2007-03-28 04:39:54 UTC
Re comment #12: You're right, please ignore my comment on the IS_SG64 bit.

Re comment #13: I doubt you can blame this on some generic raid5 issue -
dmraid and hwraid are entirely different beasts.
Comment 15 Erich Chen 2007-03-29 20:09:18 UTC
Hi,
Areca has run Gentoo Linux on the Gigabyte GA-965P-DS4 with an Areca RAID adapter,
and it works fine.
I ran this system with a 32-bit CPU; I will look into whether I can run it with x64.
Comment 16 Peter Kerwien 2007-06-25 00:56:17 UTC
This has happened again, while expanding the RAID-5 device. I added one 500GB hard drive to the RAID. After the RAID expansion, I unmounted it and performed an 'fsck -n /dev/sdb1' before I was going to increase the partition and filesystem size. During the execution of fsck the system became unusable. Stupid me forgot the potential problem I might get when running fsck.

I shut it down, booted up from a Gentoo 2007.0 install CD, mounted / and /boot (/dev/sda3 and /dev/sda1), and saw again that important links had been deleted from the /lib64 directory, e.g. /lib64/ld-linux-x86-64.so.2. When this happened I was running kernel 2.6.21.1. Same motherboard and Areca firmware as before.

I'm trying to get help via the Gentoo forum and the e2fsprogs project page, but no response so far.
Comment 17 Erich Chen 2007-07-03 18:44:53 UTC
Hi Peter Kerwien,

I am sorry that you have hit this bug again. Areca has a new firmware version, 1.43,
which you can update to.
Areca has found a SMART data issue with the WD5000 drives; you can try disabling the
controller's SMART data polling.
Incorrect SMART data reports may cause abnormal behavior on older Areca firmware versions.

Best Regards
Erich Chen 
Comment 18 Peter Kerwien 2007-07-03 22:32:28 UTC
I upgraded the FW to 1.43 after I expanded my RAID and repaired the system. I might try another e2fsck on the RAID to see if the problem is still there or not. I have more or less become an expert on repairing it now ;-)

I'm also planning to install a Gentoo x86 system on two spare hard drives. It could be interesting to see whether this is x86_64-related or not.
Comment 19 Alan 2009-03-23 10:57:00 UTC
Closing out old stale bugs
