Bug 109341 - Docker btrfs snapshot driver operation trashed grub, missing operating system
Summary: Docker btrfs snapshot driver operation trashed grub, missing operating system
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs (show other bugs)
Hardware: i386 Linux
: P1 high
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-14 07:43 UTC by Vincent Petry
Modified: 2016-01-18 16:27 UTC (History)
1 user (show)

See Also:
Kernel Version: 4.3.0-2-default
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg before the final reboot (199.10 KB, application/x-bzip)
2015-12-14 07:43 UTC, Vincent Petry
Details
Hardware info (lshw, lsmod) (21.25 KB, text/plain)
2015-12-14 07:43 UTC, Vincent Petry
Details
Boot sector dumps (191.73 KB, application/x-bzip)
2016-01-06 20:15 UTC, Vincent Petry
Details
systemd journal before last failed boot (226.37 KB, text/x-log)
2016-01-06 20:16 UTC, Vincent Petry
Details

Description Vincent Petry 2015-12-14 07:43:05 UTC
Created attachment 197311 [details]
dmesg before the final reboot

I'm raising this here because I believe it is likely to be a bug in BTRFS.

tldr; of the problem: ran docker as I did usually, it set up its layered FS using the BTRFS driver. Laptop behaved weirdly. Rebooted. "Missing operating system message appears, cannot boot.

The problem appeared on Friday 11th December around 15:57 (for cross-referencing in logs)

My setup:
- Laptop: Dell XPS 13 9333 Developer edition, bought in March 2014, not that old)
- Hard disk is a SSD: LITEONIT LMT-256
- OS: openSUSE Factory x86_64
- Kernel: 4.3.0-2-default from the openSUSE official packages
- Grub: grub2-2.02~beta2-50.1.x86_64 and is installed on the root of the partition
- Docker: docker-1.8.3-87.3.x86_64
- Description of events:

1. Ran docker-based unit tested for ownCloud (check out ownCloud git master then ran: % PRIMARY_STORAGE_CONFIG=swift ./autotest.sh mysql | tee test.log 2>&1
2. Waited for docker init, this one time docker was particularily slow to verify the various FS layers
3. Noticing that docker was slow, I killed the process
4. Tried to delete the docker container with "docker rm $containerid" but got a message that "Failed to destroy btrfs snapshot: directory not empty" (can be seen in attached dmesg dump too)
5. Tried setting up ownCloud from scratch, but mysql was behaving strangely and saying my user did not have enough permissions even though it did
6. Restarted mysql, but the weird permission issue persisted
7. Asked computer to reboot, had to kill Xorg manually as it got stuck
8. After reboot: "Missing operating system"
9. Ran openSUSE live KDE on USB stick, ran "btrfs.fsck" (forgot the result, but very few broken files, not spectacular)
10. Restored grub2 with yast from the live USB
11. Computer works properly again
12. Also note that the mysql permission problem is magically gone as if the computer was in a weird state before that reboot

So it looks like something bad happened when docker did its magic with BTRFS subvolumes when setting up the docker instance. Maybe an unexpected "0" value slipped somewhere in the code and made BTRFS overwrite the boot info on the partition, which usually happens to be at position 0 ?

Can another program that manipulates subvolumes even erase the boot info ? (in case it's a docker bug...)

I did not find anything relevant or suspicious in the log.

Attachements: hwinfo.txt and dmesg dump from before the final reboot

This is NOT REPRODUCIBLE, it only happened once. Running docker as I did above now works properly without weird side effects. I still wanted to report it because of its nasty consequence and in the hope it can be useful.
Comment 1 Vincent Petry 2015-12-14 07:43:48 UTC
Created attachment 197321 [details]
Hardware info (lshw, lsmod)
Comment 2 Vincent Petry 2016-01-06 19:59:44 UTC
It happened again, twice!
And this time docker wasn't started at all...
So I think this has nothing to do with docker.

Since the last boot (yesterday) I put the laptop into suspend to ram as I usually do. Not sure if that matters at all.
Note that I also have a ecryptfs in my home.

Before restoring grub using a USB boot stick, I took a dump of the first sectors of the root partition. I believe that it's the place where the boot loader is supposed to be when booting from the root partition ?

I will reboot again and now grab the same sector to do a binary diff. Maybe it will contain some clues as to what happened to the bootloader.

Stay tuned.
Comment 3 Vincent Petry 2016-01-06 20:15:00 UTC
Created attachment 198861 [details]
Boot sector dumps

Added attachment for boot sector dumps.

Contains the following:
- bootsectorbroken.dat: boot sectors at the time where it was broken ("Missing operating system")
- bootsectorok.dat: boot sectors after reinstalling grub with yast from usb stick
- bootsectorbroken-hexdump.dat: hexdump from broken boot sector
- bootsectorok-hexdump.dat: hexdump from repaired boot sector
- dumpdiff.txt: diff of the hexdumps

If you have a look at the dump diff, you can clearly see that everything is ZEROS between offsets 0x0 to 0x10000. However in the repaired version these offsets are populated.

After offset 0x10000 there are fewer differences, maybe this is already past the bootloader and is not relevant to this issue.

It is still a mystery why a process of any kind would write zeroes on the boot sector during normal functioning of the OS.
Comment 4 Vincent Petry 2016-01-06 20:16:12 UTC
Created attachment 198871 [details]
systemd journal before last failed boot
Comment 5 Vincent Petry 2016-01-06 20:43:26 UTC
Additional info: I'm using EFI boot. (not UEFI)

Another interesting find: it looks like Grub puts stuff here, this is from the fixed dump:

00010dd0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00040000  23 20 47 52 55 42 20 45  6e 76 69 72 6f 6e 6d 65  |# GRUB Environme|
00040010  6e 74 20 42 6c 6f 63 6b  0a 23 23 23 23 23 23 23  |nt Block.#######|
00040020  23 23 23 23 23 23 23 23  23 23 23 23 23 23 23 23  |################|
*
00040200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0007d000


However in the broken dump there are only zeros starting with 0x10dd0

00010dd0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0007d000


I was hoping to find out whether the erasing data is maybe real data from files instead of just zeroes.
Comment 6 Vincent Petry 2016-01-06 20:52:09 UTC
It is said here that it's usually not recommended to install Grub to the root of a partition: https://wiki.archlinux.org/index.php/GRUB#Install_to_partition_or_partitionless_disk

However, the instructions there seem to be setting the immutable attribute on "core.img".

When I checked it on my openSUSE install:
% lsattr /boot/grub2/i386-pc/core.img
---------------- /boot/grub2/i386-pc/core.img

It looks like it's not set!

I'm going to try setting it for now and see whether the problem disappears completely.
Maybe openSUSE/Yast or Grub should set it automatically ?
Comment 7 Vincent Petry 2016-01-06 20:56:34 UTC
Reading further, it seems that it's not related.
The article says that Grub is adding instructions in the boot sector to find core.img, with a specific address. To make sure the address stays the same, make core.img immutable.

However in my case, the code in the boot sector got erased by zeroes... so it's not likely to help much. We'll see.

Is there any tool out there to monitor disk IO which could tell me whenever something is touching the first sectors of the partition ?
Comment 8 Vincent Petry 2016-01-07 06:46:49 UTC
I raised a ticket in the openSUSE tracker https://bugzilla.novell.com/show_bug.cgi?id=960910 and it seems that it might be a FS trim/discard regression in 4.3. The fix will be ported to openSUSE Tumbleweed.
Comment 9 David Sterba 2016-01-18 16:27:19 UTC
Thanks for the analysis and report. Upstream patch: https://patchwork.kernel.org/patch/7984441/ .

Note You need to log in before you can comment on or make changes to this bug.