Bug 118511 - Corruption of VM qcow2 image file on EXT4 with crypto enabled
Summary: Corruption of VM qcow2 image file on EXT4 with crypto enabled
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 high
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-19 14:50 UTC by ass3mbler
Modified: 2016-05-30 16:46 UTC
CC List: 1 user

See Also:
Kernel Version: 4.5.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Hypervisor kernel config file (90.26 KB, application/octet-stream)
2016-05-19 14:50 UTC, ass3mbler

Description ass3mbler 2016-05-19 14:50:42 UTC
Created attachment 216801 [details]
Hypervisor kernel config file

Hello,

Twice in 48 hours I have experienced a file system corruption on a QCOW2 image file running a Linux guest.

My configuration is the following:

Hypervisor:  
  - Gentoo Linux with a vanilla kernel 4.5.3, compiled manually.
  - QEMU 2.8.0 + KVM
  - /dev/md4, raid 1 with two identical partitions (/dev/sda4 and /dev/sdb4), ext4
  - /dev/md4 is mounted under /mnt/md4 and it contains a single dir /mnt/md4/kvm, encrypted  
  - after decrypting (unlocking) the /mnt/md4/kvm dir, it's bind-mounted in /kvm (mount --bind /mnt/md4/ /kvm)
  - nothing else is actually running on the hypervisor, only an openssh server

Guest:
  - Gentoo Linux with a vanilla kernel 4.5.4, compiled manually
  - virtio drivers for disk, networking etc.
  - the whole image of the guest is a 250GB QCOW2 file, stored under /kvm/xxx.qcow2 in the hypervisor's filesystem
  - the root partition is /dev/sda2 (about 230GB), EXT3

I'm running this configuration successfully on many other (even very busy) deployments without any problem; the only difference in this installation is the encrypted /mnt/md4/kvm directory on the hypervisor.
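
The encryption on /mnt/md4/kvm is the kernel's native ext4 directory encryption. Roughly, the setup was created like this (reconstructed from memory, so the mkfs/e4crypt invocations are approximate rather than a verbatim command history):

# RAID1 array and ext4 filesystem with the encrypt feature
mdadm --create /dev/md4 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4
mkfs.ext4 -O encrypt /dev/md4
mount /dev/md4 /mnt/md4
mkdir /mnt/md4/kvm

# add the passphrase key and encryption policy for the kvm dir
# (e4crypt from e2fsprogs), then bind-mount as described above
e4crypt add_key /mnt/md4/kvm
mount --bind /mnt/md4/ /kvm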

Twice in the last 48 hours I have found the root filesystem of the guest (/dev/sda2) remounted read-only after a detected write error. Here is the log from dmesg:

[[Guest]]
[208323.124266] blk_update_request: critical target error, dev sda, sector 231060144
[208323.124540] Aborting journal on device sda2-8.
[208323.729847] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
[208323.729855] EXT4-fs (sda2): Remounting filesystem read-only
[208323.740861] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
[208323.772340] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
[208323.772346] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
[208323.773233] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal


At the same time, in the hypervisor's dmesg I have only this line:

[[Hypervisor]]
[596477.535490] ext4_bio_write_page: ret = -12

After that, I had to reboot the Guest. I booted the guest from a Gentoo ISO and performed a fsck on the root (/dev/sda2) partition.
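The check was run roughly like this (from the live environment; the exact flags may have differed), with the output below:

e2fsck /dev/sda2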

e2fsck 1.42.13 (17-May-2015)
/dev/sda2: recovering journal
/dev/sda2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 5709828 has zero dtime. Fix<y>? yes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 5709829 was part of the orphaned inode list. FIXED.
Inode 5709830 was part of the orphaned inode list. FIXED.
Inode 5709831 was part of the orphaned inode list. FIXED.
Inode 5709832 was part of the orphaned inode list. FIXED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (25234981, counted=22539957).
Fix<y>? yes
Inode bitmap differences: -(5709828--5709832)
Fix<y>? yes
Free inodes count wrong for group #697 (8175, counted=8180).
Fix<y>? yes
Free inodes count wrong (11008395, counted=10993791).
Fix<y>? yes
/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 3424129/14417920 files (3.9% non-contiguous), 35131723/57671680 blocks

I attach the .config of the Hypervisor kernel.

Thank you in advance and best regards,

Andrew
Comment 1 Navin 2016-05-24 14:50:49 UTC
Can you check the memory state of the Host/Hypervisor around the range [596476, 596478], i.e. whether it was full at 596477.535490 when ext4_bio_write_page was not able to get memory?

If your hypervisor is using encryption then this patch may help (it is already present in 4.6 mainline):

https://patchwork.ozlabs.org/patch/602204/

If that doesn't work, you need to have your system stats logged and check when ENOMEM is returned. It could genuinely be out of memory, or there could be something wrong with the code.
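
For example, something along these lines running on the hypervisor would capture the memory state around the next occurrence (just a sketch; adjust the interval and log paths to taste):

# record memory statistics every 10 seconds and keep an eye on ext4 write errors
vmstat -t 10 >> /var/log/vmstat.log &
while true; do
    date >> /var/log/meminfo.log
    grep -E 'MemFree|Buffers|Cached|SwapFree' /proc/meminfo >> /var/log/meminfo.log
    dmesg | grep ext4_bio_write_page | tail -n 1 >> /var/log/meminfo.log
    sleep 10
done &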

The Hypervisor/Host cannot write/commit/allocate buffers because it is out of memory.

Hence your guest is in a transient state where the changes are not committed, and most probably the journal is aborted.


[[Hypervisor]]
[596477.535490] ext4_bio_write_page: ret = -12

http://lxr.free-electrons.com/source/include/uapi/asm-generic/errno-base.h#L15

 15 #define ENOMEM          12      /* Out of memory */
Comment 2 ass3mbler 2016-05-25 17:45:44 UTC
Hi Navin,

thank you a lot for your help. I'll upgrade the kernel to v4.6 tonight to see if it gets better!

I have some doubts about a real out-of-memory condition, since the Hypervisor has 8GB of RAM, the Guest is hard-limited to 4GB, and the only other "big" process running on the Hypervisor is a simple (mostly idle) opensshd server instance... so I really hope that the patch will solve the issue.

I'll let you know very shortly. Thank you again for your help, and best regards,

Andrew



(In reply to Navin from comment #1)
> Can you check the memory state of the Host/Hypervisor around the range
> [596476, 596478], i.e. whether it was full at 596477.535490 when
> ext4_bio_write_page was not able to get memory?
> 
> If your hypervisor is using encryption then this patch may help (it is
> already present in 4.6 mainline):
> 
> https://patchwork.ozlabs.org/patch/602204/
> 
> If that doesn't work, you need to have your system stats logged and check
> when ENOMEM is returned. It could genuinely be out of memory, or there
> could be something wrong with the code.
> 
> The Hypervisor/Host cannot write/commit/allocate buffers because it is out
> of memory.
> 
> Hence your guest is in a transient state where the changes are not
> committed, and most probably the journal is aborted.
> 
> 
> [[Hypervisor]]
> [596477.535490] ext4_bio_write_page: ret = -12
> 
> http://lxr.free-electrons.com/source/include/uapi/asm-generic/errno-base.h#L15
> 
>  15 #define ENOMEM          12      /* Out of memory */
Comment 3 ass3mbler 2016-05-30 16:46:11 UTC
Hi Navin,

I can confirm that moving to kernel 4.6 following your suggestion fully solved the issue.

Thank you a lot for pointing me in the right direction; I'm marking this issue as resolved.

Best regards,

Andrew
