Bug 78711

Summary: A partition appears not to be properly set up under 3.16, but is under 3.15
Product: IO/Storage
Reporter: Bruno Wolff III (bruno)
Component: Block Layer
Assignee: io_md
Status: RESOLVED CODE_FIX
Severity: normal
CC: jason, neilb
Priority: P1
Hardware: All
OS: Linux
Kernel Version: kernel-PAE-3.16.0-0.rc1.git4.2.fc21.i686
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: superblock under 3.15
blkid output from 3.16 live instance

Description Bruno Wolff III 2014-06-22 17:34:22 UTC
I have a machine that won't boot under 3.16. The likely cause is that /dev/sda3 is detected as not having a valid superblock under 3.16, although it is detected correctly under 3.15.

I have the following Fedora bug for this:
https://bugzilla.redhat.com/show_bug.cgi?id=1111442

mdadm --examine /dev/sda3

Under 3.15 mdadm says the following about /dev/sda3:

/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9e06cb82:14b726e2:af554f00:b9b73901
           Name : bruno.wolff.to:13  (local to host bruno.wolff.to)
  Creation Time : Thu Jun 30 06:22:05 2011
     Raid Level : raid1
   Raid Devices : 1

 Avail Dev Size : 167770112 (80.00 GiB 85.90 GB)
     Array Size : 83884984 (80.00 GiB 85.90 GB)
  Used Dev Size : 167769968 (80.00 GiB 85.90 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=144 sectors
          State : clean
    Device UUID : 350f0baa:0622d7da:bc485abd:a4b765f9

    Update Time : Sun Jun 22 12:22:42 2014
       Checksum : 3e3671a9 - correct
         Events : 7031153


   Device Role : Active device 0
   Array State : A ('A' == active, '.' == missing, 'R' == replacing)
Comment 1 Neil Brown 2014-06-22 22:29:59 UTC
It is odd that mdadm doesn't recognise sda3 under 3.16, but blkid does:

/dev/sda3: UUID="9e06cb82-14b7-26e2-af55-4f00b9b73901" UUID_SUB="350f0baa-0622-d7da-bc48-5abda4b765f9" LABEL="bruno.wolff.to:13" TYPE="linux_raid_member" PARTUUID="6c3efcfe-03" 
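
The UUID and UUID_SUB values reported by blkid appear to be the same identifiers as the Array UUID and Device UUID in the 3.15 mdadm output, just grouped with '-' instead of ':'. A minimal sketch for checking that, assuming the only difference is the separators (normalize_uuid is a hypothetical helper, not part of mdadm or blkid):

```shell
# Strip ':' and '-' separators and lowercase the hex digits, so that
# mdadm-style and blkid-style UUIDs can be compared as plain strings.
normalize_uuid() {
    printf '%s\n' "$1" | tr -d ':-' | tr 'A-F' 'a-f'
}

mdadm_uuid='9e06cb82:14b726e2:af554f00:b9b73901'   # Array UUID from mdadm under 3.15
blkid_uuid='9e06cb82-14b7-26e2-af55-4f00b9b73901'  # UUID from blkid under 3.16

if [ "$(normalize_uuid "$mdadm_uuid")" = "$(normalize_uuid "$blkid_uuid")" ]; then
    echo "same array UUID"
else
    echo "UUIDs differ"
fi
```

With the values from this report the two forms normalize to the same string, which is consistent with blkid still seeing the md superblock under 3.16.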

Maybe something did something to sda3... changed the device size or partition offset or something.

Can you 'dd' the first 8K of sda3 to somewhere for each kernel, then compare them?
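
One possible way to do the dump and comparison, as a sketch only. The helper names and output paths are illustrative assumptions; the dumps need to be written somewhere that survives the reboot between kernels.

```shell
# Save the first 8 KiB (2 blocks of 4096 bytes) of a device or file.
dump_head() {
    # $1 = source device or file, $2 = output file
    dd if="$1" of="$2" bs=4096 count=2 2>/dev/null
}

# Report whether two dumps are byte-identical.
compare_dumps() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Example use (as root, once under each kernel; paths are hypothetical):
#   dump_head /dev/sda3 /mnt/usb/sda3-3.15.bin
#   dump_head /dev/sda3 /mnt/usb/sda3-3.16.bin
#   compare_dumps /mnt/usb/sda3-3.15.bin /mnt/usb/sda3-3.16.bin
```

With a Super Offset of 8 sectors (4096 bytes), an 8 KiB dump should cover the start of the version 1.2 superblock.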
Comment 2 Bruno Wolff III 2014-06-23 03:53:42 UTC
Created attachment 140671 [details]
superblock under 3.15

I haven't got a 3.16 copy yet. dracut doesn't have dd and I need to set up a live image for 3.16 so that I can boot to where I'll have a copy of dd. (sda3 has my root file system on it.)

I thought that maybe the dump from 3.15 might be useful by itself, if there is some sort of corruption.

Probably it will be tomorrow night before I get a live image tested.
Comment 3 Bruno Wolff III 2014-06-24 15:29:44 UTC
I retested this on 3.16.0-0.rc2.git0.1.fc21 and still see the problem. Fedora kernels before this were broken in other ways on i686 so I need a live image with this kernel to boot 3.16 on the problem machine. There is a new nightly compose that I should be able to use after work today.
Comment 4 Bruno Wolff III 2014-06-25 00:22:09 UTC
Created attachment 140911 [details]
blkid output from 3.16 live instance

Created attachment 911900 [details]
blkid output from live image instance

I got a live image to boot, but it was pretty unstable and locked up a few times (requiring reboots) while I was collecting info.

dd indicated that /dev/sda3 was 0 bytes long. So that suggests that the difference isn't in md, but rather in some other part of I/O.

The device is created and blkid reports some information about it.
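
A zero-length device can also be checked without dd. The sketch below reads the size the kernel exposes in sysfs, which is counted in 512-byte sectors; sysfs_size_bytes is a hypothetical helper name. "blockdev --getsize64 /dev/sda3" should report the same byte count directly.

```shell
# Print the size in bytes recorded in a sysfs "size" file. The file
# holds the device length in 512-byte sectors.
sysfs_size_bytes() {
    # $1 = path to a sysfs size file, e.g. /sys/class/block/sda3/size
    sectors=$(cat "$1") || return 1
    echo $((sectors * 512))
}

# Example use:
#   sysfs_size_bytes /sys/class/block/sda3/size
```

For an 80 GiB partition the expected result is 85899345920; a result of 0 would match what dd reported here.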
Comment 5 Bruno Wolff III 2014-06-25 04:35:51 UTC
fdisk doesn't seem to show anything odd about the partition table:
fdisk -l /dev/sda

Disk /dev/sda: 298,1 GiB, 320072933376 bytes, 625142448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x6c3efcfe

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048   2099199   2097152     1G fd Linux raid autodetect
/dev/sda2         2099200  23070719  20971520    10G fd Linux raid autodetect
/dev/sda3        23070720 190842879 167772160    80G fd Linux raid autodetect
/dev/sda4       190842880 625142447 434299568 207,1G fd Linux raid autodetect
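
The table can be checked for consistency with a little arithmetic. This is only a sketch using the numbers printed above; part_sectors is a hypothetical helper.

```shell
# Sectors spanned by a partition, given its first and last sector (inclusive).
part_sectors() {
    echo $(( $2 - $1 + 1 ))
}

sda3_sectors=$(part_sectors 23070720 190842879)
echo "sda3 sectors: $sda3_sectors"                          # 167772160
echo "sda3 bytes:   $((sda3_sectors * 512))"                # 85899345920
echo "sda3 GiB:     $((sda3_sectors * 512 / 1073741824))"   # 80

# The last sector of sda4 plus one should equal the disk's sector count.
echo "end of sda4:  $((625142447 + 1))"                     # 625142448
```

The sda3 sector count minus the 2048-sector Data Offset is 167770112, which matches the Avail Dev Size that mdadm reported under 3.15, so the partition table itself looks consistent with the superblock.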
Comment 6 Neil Brown 2014-06-25 06:35:55 UTC
As you say, this doesn't sound like an md problem.

The only thing I can suggest is to try running "blockdev --rereadpt /dev/sda" and see if that makes the partitions appear properly.

You probably need to reassign this to "block layer" or something like that.
Or hope the Fedora bugzilla will find an answer for you.
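
To see whether rereading the partition table changes anything, the size the kernel records for sda3 can be compared before and after. A rough sketch, assuming the usual /proc/partitions layout (major, minor, #blocks in 1 KiB units, name); partition_bytes is a hypothetical helper.

```shell
# Print the size in bytes of a named partition from a file in
# /proc/partitions format. Prints nothing if the name is not listed.
partition_bytes() {
    # $1 = partition name (e.g. sda3)
    # $2 = table file (defaults to /proc/partitions)
    awk -v name="$1" '$4 == name { printf "%.0f\n", $3 * 1024 }' "${2:-/proc/partitions}"
}

# Example use (as root):
#   partition_bytes sda3
#   blockdev --rereadpt /dev/sda
#   partition_bytes sda3
```

The reread may be refused with a "device busy" error if any partition on the disk is in use, so this is easiest to try from a live image.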
Comment 7 Bruno Wolff III 2014-06-25 15:25:23 UTC
I received a suggestion to try the patch at https://lkml.org/lkml/2014/6/23/96. I should be able to test it tonight and report back.
Comment 8 Bruno Wolff III 2014-06-25 23:53:30 UTC
I tried out the patch referenced in comment 7 and it did seem to get me past the place I was having this problem. The system still didn't finish booting, but I suspect that is a different problem.
Comment 9 Jason M. 2014-06-26 18:32:54 UTC
Your patch to include/linux/uio.h fixed https://bugzilla.kernel.org/show_bug.cgi?id=78911 for me. After applying it and rebuilding I can mount my swap partition again.
Comment 10 Bruno Wolff III 2014-06-26 20:56:43 UTC
The fix for this is now in Linus' tree:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0b86dbf675e0170a191a9ca18e5e99fd39a678c0