Bug 207809 - Bcache does not consistently work with swap file hibernation
Summary: Bcache does not consistently work with swap file hibernation
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: io_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-21 02:17 UTC by Alec Feldman
Modified: 2020-06-24 04:39 UTC

See Also:
Kernel Version: 5.6.12
Subsystem:
Regression: No
Bisected commit-id:



Description Alec Feldman 2020-05-21 02:17:31 UTC
I followed the directions to get a swap file working on btrfs, including using resume=swap_device and resume_offset=swap_file_offset. I retrieved the physical offset of the swap file with this program: https://github.com/osandov/osandov-linux/blob/master/scripts/btrfs_map_physical.c and then divided that offset by the page size to get the resume offset. Hibernating into the swap file only works the first time after the file is created. On every subsequent hibernation, the system boots as if it were not resuming. Hibernating can also cause the backing device to detach from the caching device, leaving the system unbootable; I then have to forcibly recreate the cache to fix it. Eventually the same thing happens again, but the filesystem also becomes corrupted and I can no longer reattach the cache. This can be reproduced on any setup with bcache, btrfs on the cache, and a swap file on the btrfs filesystem. I am not sure whether this affects other filesystems.
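For reference, the offset arithmetic described above (physical byte offset divided by the page size) can be sketched as follows. This is a minimal illustration, not the reporter's actual procedure: the function name resume_offset and the example offset of 16 MiB are hypothetical, and the alignment check is an added assumption that the swap file's first extent starts on a page boundary.

```python
import os
from typing import Optional


def resume_offset(physical_offset_bytes: int, page_size: Optional[int] = None) -> int:
    """Convert a physical byte offset into a page-granular resume_offset value.

    physical_offset_bytes: byte offset of the swap file's first extent on the
        underlying device (for example, as reported by btrfs_map_physical).
    page_size: page size in bytes; defaults to the running system's page size.
    """
    if page_size is None:
        # SC_PAGESIZE is the page size of the running system (Unix only).
        page_size = os.sysconf("SC_PAGESIZE")
    # Assumption: the offset should be page-aligned; refuse to round silently.
    if physical_offset_bytes % page_size != 0:
        raise ValueError("physical offset is not a multiple of the page size")
    return physical_offset_bytes // page_size


if __name__ == "__main__":
    # Hypothetical example: a 16 MiB physical offset with 4 KiB pages.
    print(resume_offset(16 * 1024 * 1024, 4096))
```

The result is the value passed as resume_offset= on the kernel command line. Rejecting an unaligned offset is a design choice of this sketch, made so that a wrong input fails loudly instead of producing a truncated value.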
Comment 1 Alec Feldman 2020-06-24 04:39:57 UTC
After looking deeper into this issue and reading the FAQ on the bcache site, this appears to be a catch-22. During resume, you are not allowed to make any changes to the disk. With bcache this is tricky: any read from a bcache device could result in a write that updates the caching device, and bcache currently has no good way of solving this. This means resume must finish before bcache starts. However, if the resume image is on the bcache backing device, the image must be accessed first. Resuming via resume_offset from the swap file on the backing device must therefore be updating the caching device, and subsequently causing data loss. If I have reached the correct conclusion from the data loss that occurred on my system and the limitation listed in the bcache FAQ, then resuming from a swap file on a backing device is not possible, at least with the cache enabled. Disabling the cache during resume may prevent the data loss.
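As a rough aid for anyone trying to confirm whether caching is active before hibernating, the sketch below parses the cache_mode attribute that the kernel's bcache documentation describes under /sys/block/<bcache device>/bcache/. It assumes the attribute lists all modes with the active one in square brackets (for example "writethrough [writeback] writearound none"). The device name bcache0 and the helper names are hypothetical. This only reports the current mode; it is not a tested workaround for the resume problem.

```python
import re
from pathlib import Path


def active_cache_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) mode from a cache_mode attribute string."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError("no active mode marked in: %r" % sysfs_text)
    return match.group(1)


def read_cache_mode(device: str = "bcache0") -> str:
    """Read and parse cache_mode for a bcache device.

    The default device name is a placeholder; adjust it to the local setup.
    """
    path = Path("/sys/block") / device / "bcache" / "cache_mode"
    return active_cache_mode(path.read_text())


if __name__ == "__main__":
    # Parse a sample string so the example runs without a bcache device.
    print(active_cache_mode("writethrough [writeback] writearound none"))
```

A reported mode of "none" would indicate that the cache is bypassed; whether that is enough to make resume safe is exactly the open question in this bug.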
