Bug 200735

Summary: creating softlink does not check for source file existence or corruption
Product: File System Reporter: Shehbaz (shehbazjaffer007)
Component: ext4Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED INVALID    
Severity: normal CC: tytso
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.17.0-rc3+ Subsystem:
Regression: No Bisected commit-id:
Attachments: cmd2000 workload containing 2000 nested files and directories
corruptoffset.c program to corrupt specific disk offset
findAllInodes.sh

Description Shehbaz 2018-08-05 19:23:24 UTC
Created attachment 277685 [details]
cmd2000 workload containing 2000 nested files and directories

Hello!

Background
===========
I am performing file system block layer corruption experiments for ext4 for different file system commands.

For all source-file-read based commands like access, truncate, chmod, open, chown, utimes,read, rename , getdirentries, creat - if the source file inode is corrupted, I get an appropriate error in the user space, like this: 

code exited with error Structure needs cleaning

and/or an appropriate error in kernel space, like this:

EXT4-fs error (device dm-0): ext4_iget:4769: inode #8218: comm access: bad extra_isize 16962 (inode size 256)

Issue:
=======
However, for the creation of a symlink, I do not observe any form of check/warning or error if the source file inode, or inode in the source path inode is corrupted.

Steps To Reproduce
==================

1. Please create file system on a 1.5 GB disk preferably with QEMU.
2. Please mount the file system on /mnt
3. run ./cmd2000 workload (please find attached). this file contains a command that creates our source file:

/mnt/cmdsymlink_dir1/cmdsymlink_dir2/cmdsymlink_dir3/cmdsymlink_dir4/datafile

along with some 2000 other nested files and directories.

4. find the inode numbers of all directory numbers using the ls -li command. We can do this using the findAllInodes.sh script attached. For example in my setup, I get the following inodes.

$ ./findAllInodes.sh 
8301 cmdsymlink_dir1
2593 cmdsymlink_dir2
4171 cmdsymlink_dir3
4723 cmdsymlink_dir4
5754 datafile


5. unmount the file system. now corrupt inode using the program:
./corruptOffset X 256

where X = ((inode_number/16 + 121) * 4096) + (((inode_number % 16) - 1) * 256)

where inode_number = one of the 5 inode number values obtained in the output of findAllInodes.sh in the previous step.

6. run the following command that successfully creates the new symlink, even after source path inode corruption.

sudo -- sh -c 'sudo ln -s /mnt/cmdsymlink_dir1/cmdsymlink_dir2/cmdsymlink_dir3/cmdsymlink_dir4/datafile /mnt/C5UAECG4PB3G47D5SRPHDAJ6R3ASPTQMLR/OUQJI0MJOCZ9JTV1JDEFA1AT0UYA2UG7ZU537FZ8WR8SB7KBBBJPNU6QFSYSYXQ0A/newlink'

Ideally, the symlink command should traverse the source path and inform the user if the path is broken, corrupt, inaccessible or needs cleaning like all other commands. Since symlink does not raise any error in this case, it should be a bug.

Please find scripts attached for your reference.

Thanks!
Comment 1 Shehbaz 2018-08-05 19:24:03 UTC
Created attachment 277687 [details]
corruptoffset.c program to corrupt specific disk offset
Comment 2 Shehbaz 2018-08-05 19:25:03 UTC
Created attachment 277689 [details]
findAllInodes.sh

on a mounted file system, checks the inode numbers of our source path directories and files.
Comment 3 Theodore Tso 2018-08-05 22:45:19 UTC
This is working as intended.  All a symbolic link is a text string which is interpreted at the time that the symbolic link is dereferenced (e.g., when you try to open a pathname that contains a symbolic link).   

The destination of the symlink doesn't even need to exist.  In fact, it's not checked at all:

% ln -s "When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature’s God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation." test-link

% ls -l test-link 
4 lrwxrwxrwx 1 tytso tytso 408 Aug  5 18:43 test-link -> 'When in the Course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature’s God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.'

% cat test-link
cat: test-link: File name too long

As you can see, we don't even try to evaluate the symlink to see if it is a valid file name (filenames can not be longer than 255 characters, and this is 408 characters).

This is all working as intended and in fact it is what is required by the POSIX standard.