Bug 92261
| Summary: | Provide a way to really delete files, please | | |
|---|---|---|---|
| Product: | File System | Reporter: | Alexander Holler (holler) |
| Component: | btrfs | Assignee: | Josef Bacik (josef) |
| Status: | NEW | | |
| Severity: | normal | CC: | dsterba, richard, rini17, szg00000 |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | All | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description

Alexander Holler 2015-01-29 11:46:55 UTC

Just to avoid any misunderstanding: I'm speaking about a way to selectively, securely delete one file. Having to shred a whole device or partition isn't appropriate. One shouldn't have to burn down a whole house in order to destroy a small piece of paper.

This explains the problem:

========
laptopahbt ~ # dd if=/dev/zero of=/tmp/test.img bs=1M count=100
100+0 Datensätze ein
100+0 Datensätze aus
104857600 Bytes (105 MB) kopiert, 0,0580709 s, 1,8 GB/s
laptopahbt ~ # grep -a abrakadabra /tmp/test.img
laptopahbt ~ # mkfs.btrfs /tmp/test.img
SMALL VOLUME: forcing mixed metadata/data groups
Btrfs v3.18
See http://btrfs.wiki.kernel.org for more information.
Turning ON incompat feature 'mixed-bg': mixed data and metadata block groups
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Turning ON incompat feature 'skinny-metadata': reduced-size metadata extent refs
Created a data/metadata chunk of size 8388608
ERROR: device scan failed '/tmp/test.img' - Block device required
fs created label (null) on /tmp/test.img
	nodesize 4096 leafsize 4096 sectorsize 4096 size 100.00MiB
laptopahbt ~ # grep -a abrakadabra /tmp/test.img
laptopahbt ~ # mount -o loop /tmp/test.img /mnt/
laptopahbt ~ # echo abrakadabra >/mnt/foo.txt
laptopahbt ~ # umount /mnt/
laptopahbt ~ # grep -a abrakadabra /tmp/test.img
(...) abrakadabra
(...) abrakadabra
laptopahbt ~ # mount -o loop /tmp/test.img /mnt/
laptopahbt ~ # shred -u /mnt/foo.txt
laptopahbt ~ # umount /mnt/
laptopahbt ~ # grep -a abrakadabra /tmp/test.img
(...) abrakadabra
laptopahbt ~ #
========

Secure deletion starts to be hard if snapshots and reflinks are involved. E.g. one can't securely delete a file that resides on a read-only snapshot, and if the snapshot is deleted, the blocks are freed not file-by-file but incrementally, if the snapshot is the only owner of the blocks. Similar holds for a reflinked file: the way it's implemented, a reflink (or deduplicated files, for that matter) shares some extents.
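The failed shred in the session above is a direct consequence of copy-on-write: the overwrite passes land in newly allocated blocks while the old extents stay on disk. A partial workaround on btrfs (an assumption of this note, not something the report proposes) is to disable CoW for the file before any data is written, via the 'C' file attribute:

```shell
# Hedged sketch: shred only reaches the original blocks if the file is
# not copy-on-write. On btrfs, chattr +C must be applied while the file
# is still empty; it does not help for data already written, for
# snapshots, or for compressed files.
f=$(mktemp)                         # stand-in for a file on btrfs
chattr +C "$f" 2>/dev/null || true  # no-op on filesystems without the attribute
echo abrakadabra > "$f"
shred -u "$f"                       # overwrite passes, then unlink
```

Even with the attribute set this is best-effort only: relocation by balance or defragmentation can still leave stale copies behind.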
file A: blocks 0-3
file B: 0-1 from A, 2 modified, 3 from A

Deleting file B would keep blocks 0, 1, 3 on the disk -- is that expected? Yes if the user knows about the reflink, but not necessarily if the files were deduplicated. The block sharing can be detected, but suppose the secure deletion is e.g. implemented via the 's' file attribute and the expectation is that 'rm' securely deletes the file. Now we can either fail 'rm' because of the sharing, or let it pass. In your example, the file is small and gets inlined, another case to cover. And here we speak about clearing a few tens of kilobytes at most, where a TRIM is likely to have no effect, so we're left with overwriting the blocks with zeros or a pattern. IOW we need more research here.

You can add dozens more use cases where you can't securely delete files, and all of them will miss the point. No one wants to overwrite (secure-delete) the blocks of a file if just a hardlink or a duplicate (which was deduplicated) is deleted. Similar with snapshots: if someone secure-deletes a file, he can't assume that it will be deleted from a snapshot too. Similar with reflinks: if a user doesn't know about the reflink and doesn't delete it too, he doesn't care about the shared contents, and it's OK to leave the shared contents alive and still give the user a positive answer that the file was deleted. Nobody would assume or expect anything else. A warning would be nice that some parts weren't deleted (because they are still in use), but a secure delete doesn't mean that stuff in snapshots and other shared parts is deleted too. So even without deleting the still-shared stuff, it's OK to give the user positive feedback, because the instance he wanted to delete is deleted. It would be nice to mention such problems like snapshots in the documentation, but, as already said, nobody reasonable would expect that a secure delete also deletes stuff from existing snapshots.

Sorry to become cynical.
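The block-sharing scenario above can be reproduced with cp's reflink support (the filenames are illustrative; `--reflink=auto` falls back to a plain copy on filesystems without reflink support, in which case nothing is actually shared):

```shell
# fileA: blocks 0-3; on a reflink-capable fs, fileB initially shares
# all of fileA's extents.
tmp=$(mktemp -d) && cd "$tmp"
dd if=/dev/urandom of=fileA bs=4K count=4 2>/dev/null   # blocks 0-3
cp --reflink=auto fileA fileB                           # shared extents on btrfs
# Rewrite block 2 of fileB; that block becomes unshared, 0, 1 and 3 stay shared.
printf 'modified' | dd of=fileB bs=4K seek=2 conv=notrunc 2>/dev/null
rm fileB  # on btrfs, blocks 0, 1 and 3 remain on disk while fileA references them
```

On btrfs, `filefrag -v` on both files would show the shared extents; after `rm fileB` an overwrite-style secure delete has nothing left to scrub, which is exactly the ambiguity discussed above.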
But I interpret that "more research" just as a more polite "no". That seems to be used quite often these days.

Anyway, in the meantime I've already implemented it for FAT. Not perfect, but it only cost me a few hours, and it already works and securely overwrites files when they are unlinked. I've added an unlinkat_s() and will now modify rm to give it an option -s which then calls that new unlinkat_s().

You see, it's quite easy to start if you still see the forest, regardless of all the trees. Again, sorry, but such a "more research" got overused, and nobody expects a perfect solution right from the beginning. Even with more research you won't find a perfect solution if no one even starts with an imperfect one to find the possible problems.

(In reply to Alexander Holler from comment #4)
> Anyway, in the meantime I've already implemented it for FAT. Not perfect,
> but it just had cost me a few hours and already works and securely
> overwrites files when they will be unlinked. I've added an unlinkat_s() and
> will now modify rm to give it an option -s which then will call that new
> unlinkat_s().
>
> You see, it's quiet easy to start if you still see the forest, regardless of
> all the trees.

Can you please share your approach on linux-fsdevel@vger?

Thanks,
//richard

Sorry, but currently no. I just ended up in a discussion of 24 mails about a simple 2-line patch with one pr_info(), where I had to defend myself against around 3 maintainers. Because I don't like doing that (especially in public, by mail and with people I don't know), it costs a lot of time, and I receive no compensation for it, I'd rather not get involved further.

I didn't really expect that someone would fix these problems with most FS, and I've filed these bugs (the other one is bug #92261 for ext4) more as a reminder that something has been badly broken in filesystem designs for around 30 years.
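The proposed unlinkat_s()/"rm -s" behaviour has a rough userspace analogue, sketched below under the assumption of a filesystem that overwrites in place (on btrfs the zeros land in new blocks, which is the whole point of this report). `secure_rm` is a hypothetical name, not an existing tool:

```shell
# Hypothetical helper: overwrite a file in place, flush, then unlink.
secure_rm() {
    f=$1
    size=$(wc -c < "$f")
    # conv=notrunc keeps the existing block mapping so the zeros hit the
    # same extents -- only true on filesystems that overwrite in place.
    # Rounding up to whole 4K blocks also covers the last partial block.
    dd if=/dev/zero of="$f" bs=4096 count=$(( (size + 4095) / 4096 )) conv=notrunc 2>/dev/null
    sync        # push the zeros to the device before dropping the name
    rm -f "$f"
}
```

Usage: `secure_rm /tmp/secret.txt`. This is essentially what `shred -u` already does, minus the multiple passes; the session in the description shows why neither helps on CoW filesystems.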
Or even longer, I don't know; the first FS I can remember which did that was FAT.

Sorry, that other bug is bug #92271.

I know Bugzilla is not for discussion, but I don't know a better place for my remark, sorry. Alexander Holler, why not, instead of trying to patch all Linux filesystems (which has any chance to succeed only if you're Red Hat), solve your problem some other way? For example, I can think of: a modification for EncFS or ecryptfs that uses a unique random key for every file, which is then used together with the filesystem key to encrypt the file. When the file is removed, wipe this key and the file contents are automatically inaccessible. Of course, this key can leak too, but as you wrote, harder recovery is enough for you.

Adding a layer on top to try to fix problems below is a very bad solution. Why not fix the real problem? It isn't magic to do so (as my proof of concept shows). I find it astonishing that people call it a feature when asked that filesystems do what should be one of their primary use cases (deleting files). Currently filesystems do only around two thirds of what they should: you can create files, you can modify files, but you can't really delete files.

And even more astonishing is that people come up with a lot of workarounds instead of just fixing the base (or requesting that). Besides that, the new layer introduced as a workaround might introduce new problems. Using the example of encryption, it isn't only that the key might already have leaked: the key itself might be insecure, or the encryption algorithm or its implementation might have a problem or even a backdoor (that you aren't aware of). It also might become possible to just brute-force the key in a reasonable time frame. So it's a bet to rely on encryption instead of just really deleting files.
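The per-file-key idea suggested above (sometimes called crypto-erase) can be sketched with openssl; the filenames and the key handling here are illustrative assumptions, and a real implementation would live inside EncFS/ecryptfs rather than a script:

```shell
# Sketch of the per-file-key scheme: encrypt each file with its own
# random key; "secure delete" then means destroying the key, not
# overwriting the data blocks.
tmp=$(mktemp -d) && cd "$tmp"
echo abrakadabra > plain.txt
key=$(openssl rand -hex 32)                  # unique random per-file key
openssl enc -aes-256-cbc -pbkdf2 -pass "pass:$key" -in plain.txt -out plain.txt.enc
rm plain.txt   # NOTE: the plaintext's old blocks are NOT wiped here --
               # Holler's objection applies unless data is born encrypted.
# To "delete" the file: forget the key. The ciphertext stays on disk but
# is unreadable without it -- modulo the caveats raised below (key
# leakage, weak algorithms, future brute force).
unset key
```

The design trade-off is exactly the one debated in this thread: key destruction is cheap and works regardless of CoW, snapshots and reflinks, but it converts "the data is gone" into "the data is unreadable as long as the cryptography holds".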
And if you assume that throwing away a key will make the contents unreadable in the future, it's a bet on the future, which is even more worth than just assuming something is (only) right now secured through encryption.

Sorry, but I have to correct a silly error which turned my last comment almost into the opposite: I meant 'worse' instead of 'worth'. Looks like my two brain sides have been out of sync. ;)

I wasn't sure if I should mention it, but it's quite difficult to change the encryption algorithm or key on devices you don't have anymore. Using encryption to presumably make files unreadable (instead of deleting them) is a different thing than using encryption e.g. for communication.