Bug 113781 - kernel 4.4.1-2-ARCH, 4.4.3-1-ARCH and Fake RAID array setup
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
Reported: 2016-03-05 13:57 UTC by Andre Hasekamp
Modified: 2016-04-21 20:06 UTC

Kernel Version: 4.4.1-2-ARCH, 4.4.3-1-ARCH
Subsystem:
Regression: Yes
Bisected commit-id:


Description Andre Hasekamp 2016-03-05 13:57:05 UTC
Unfortunately, I am not 100% sure that this is a kernel problem, but after reading this post, I hope you will understand why I assigned this bug report to the Linux kernel. A search for "RAID" in the bug database did not turn up anything comparable to the Fake RAID problem described here.

First the symptoms, then a detailed description of the software and hardware used. Everything below concerns PC configurations with a so-called "Fake RAID array", with 2 disks connected in a RAID 0 setup.

PC 1.

Immediately after Linux kernel 4.4.1-2-x86_64 ARCH was installed on 2016-02-03, the Fake RAID disks were no longer recognised. When dmraid is started, kernel 4.4.1-2 gives a message along the lines of "No block devices", while kernel 4.4.3-1 gives the message "No raid disks". Downgrading to the previously working kernel 4.3.3-3-x86_64 ARCH made the PC run again. According to fsck, the filesystems are still OK.

PC 2.

A fairly heavily used PC at the time. Around 2016-02-03, Linux kernel 4.4.1-2-x86_64 ARCH was one of the updates. It took approximately 2 to 3 weeks until the disks were suddenly hopelessly screwed up. It is impossible to do anything with these disks; I will have to reformat them, and for safety's sake I will also repartition them one day. The power supply has been disconnected from these disks, and a third (single) disk has temporarily been installed with a Linux system (kernel 4.4.1-2-x86_64 ARCH, in the meantime 4.4.3-1-x86_64 ARCH). The PC runs fine, admittedly only for a few days now, but my faith in Linux and/or the hardware would be unbearably shaken if this configuration were to fail in the near or distant future.

PC 3.

Around 2016-02-03, Linux kernel 4.4.1-2-x86_64 ARCH was among the updates. At the time, this system was rarely used. Having seen the problems with the other 2 PCs, I downgraded the kernel to 4.3.3-3-x86_64 ARCH. According to fsck, the filesystems are still OK.

Downgrading the kernel is of course not a permanent solution.

Software.

For approximately 3 years now, I have used the Archlinux distribution on my PCs. Archlinux has no versions; it has "rolling updates". So far, I have never had problems with this distribution.

Besides the kernel, here are some of the Archlinux packages from the dependency tree that may be relevant in the context of Fake RAID, together with their installation dates (installation dates are taken from the file date in the package cache).

- dmraid: version 1.0.0.rc16.3-10-x86_64, since October 2013.
- device-mapper: version 2.02.141-1-x86_64, since 2016-01-25.
- thin-provisioning-tools: version 0.6.0-2-x86_64, since 2016-01-24.
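
For anyone who wants to collect the same kind of dates, the lookup can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the default pacman cache directory /var/cache/pacman/pkg, and the function name cached_package_dates is made up for this illustration. A file date in the cache shows when the package file was written there, which is not necessarily the moment it was installed.

```python
import datetime
import os


def cached_package_dates(cache_dir, names):
    """Return (filename, ISO date) pairs for cached package files of the given packages."""
    results = []
    for entry in sorted(os.listdir(cache_dir)):
        # Package files are named <name>-<version>-<release>-<arch>.pkg.tar.*;
        # skip unrelated files and detached signature files.
        if ".pkg.tar" not in entry or entry.endswith(".sig"):
            continue
        if not any(entry.startswith(name + "-") for name in names):
            continue
        mtime = os.path.getmtime(os.path.join(cache_dir, entry))
        results.append((entry, datetime.date.fromtimestamp(mtime).isoformat()))
    return results


if __name__ == "__main__":
    # Assumed default cache location; adjust if CacheDir is set differently.
    cache = "/var/cache/pacman/pkg"
    if os.path.isdir(cache):
        wanted = ["dmraid", "device-mapper", "thin-provisioning-tools"]
        for filename, date in cached_package_dates(cache, wanted):
            print(date, filename)
```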

All filesystems are ultimately ext4; the home directory is encrypted with LUKS.

Hardware.

I am not sure whether it helps, but here is the relevant hardware of the PCs mentioned above.

PC 1 (since January 2009).

Motherboard: Asus M3A78-T.
CPU: AMD Phenom X4 9850.
Storage controller: AMD 750 with Promise (Fake) RAID firmware.
Magnetic disks (2*): Western Digital Caviar Blue WD3200AAKS.

PC 2 (since April 2012).

Motherboard: Asus F1A75-V PRO.
CPU: AMD A8 3850.
Storage controller: AMD A75 FCH with Promise (Fake) RAID firmware.
Magnetic disks (2*): Western Digital WD5000AAKX Blue.

PC 3 (since August 2010).

Motherboard: Asus M4A89GTD Pro/USB3.
CPU: AMD Phenom II X4 965.
Storage controller: AMD 850 with Promise (Fake) RAID firmware.
Magnetic disks (2*): Western Digital WD5000AAKS.

If I can help by supplying more data, please let me know. From the disks on PCs 1 and 3 I can read data in a "normal" way. From the disks on PC 2 I'm not so sure. Most likely, I can still read data such as a superblock (or what is left of it) with dd from a live CD, but in that case you would have to specify precisely on which disk and where I can find the data, as well as how much data you want.
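
As an illustration of what reading a superblock could look like, here is a minimal Python sketch. It relies on two facts about the ext2/3/4 on-disk layout: the primary superblock starts 1024 bytes into the filesystem, and its magic number 0xEF53 is stored little-endian at offset 0x38 within the superblock. Everything else is an assumption: the function names and the fs_start parameter are hypothetical, and the input is assumed to be an image or device that contains one contiguous ext4 filesystem. A single member disk of a RAID 0 array holds only every other stripe, so the offsets would still have to be worked out for that case.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the filesystem
SUPERBLOCK_SIZE = 1024
MAGIC_OFFSET = 0x38       # s_magic field within the superblock (16-bit little-endian)
EXT_MAGIC = 0xEF53        # magic number shared by ext2, ext3 and ext4


def read_superblock(path, fs_start=0):
    """Read the raw primary superblock; fs_start is the byte offset of the filesystem."""
    with open(path, "rb") as f:
        f.seek(fs_start + SUPERBLOCK_OFFSET)
        return f.read(SUPERBLOCK_SIZE)


def has_ext_magic(superblock):
    """Return True if the buffer carries the ext2/3/4 magic number."""
    if len(superblock) < MAGIC_OFFSET + 2:
        return False
    (magic,) = struct.unpack_from("<H", superblock, MAGIC_OFFSET)
    return magic == EXT_MAGIC
```

A missing magic number at the expected offset would only show that no ext superblock is at that position; it would not by itself say what happened to the data.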

Kind regards,
Andre Hasekamp.
Comment 1 Andre Hasekamp 2016-03-13 11:01:46 UTC
2016-03-13

Kernel 4.4.5-1-ARCH
device-mapper 2.02.145-1
thin-provisioning-tools 0.6.1-2

Unfortunately the problem remains the same.

Andre Hasekamp.
Comment 2 Andre Hasekamp 2016-04-21 20:06:57 UTC
2016-04-21

Archlinux is now at:

Kernel 4.5.1-1
dmraid 1.0.0.rc16.3-10
device-mapper 2.02.149-1
thin-provisioning-tools 0.6.1-2

At this stage, I can run a Linux system again from "Fake RAID array" disks (RAID 0) on PC 1 (described above).

Although it is not clear what fixed it, to me it is enough of an improvement that I will use "Fake RAID array" disks again. Nevertheless, after having encountered the problems described, I have in the meantime moved my production work to Archlinux installations on SSD. At the moment I have 2 PCs, each with a functioning Linux test system on "Fake RAID array" disks. These test systems are not frequently used, so it may take a while before I consider them reliable enough for production work.

From my point of view, you may close this bug. Should the "Fake RAID array" disks present problems again in the future, I will open a new bug report.

Since the status of the current bug report is still NEW, I trust that you have not spent any time on it yet.

Kind regards,
Andre Hasekamp.
