Bug 193431

Summary: inline_data + journal_data == segfaults/buserrors in userspace mmap
Product: File System
Reporter: Peter Rabbitson (rabbit)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED INSUFFICIENT_DATA
Severity: high
CC: kernel, tytso
Priority: P1
Hardware: All
OS: Linux
Kernel Version: Multiple ( tested on 3.16, 4.1 and 4.9 )
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Backtrace attempt

Description Peter Rabbitson 2017-01-28 12:38:26 UTC
Created attachment 253361: Backtrace attempt

Steps to reproduce:


root@Ahasver:~# dd if=/dev/zero of=zeros.img bs=128M count=1
1+0 records in
1+0 records out
134217728 bytes (134 MB) copied, 1.61473 s, 83.1 MB/s


root@Ahasver:~# losetup -f zeros.img


root@Ahasver:~# mke2fs -t ext4 -I 512 -j -O inline_data /dev/loop0
mke2fs 1.43.3 (04-Sep-2016)
journal checksum features.
Discarding device blocks: done                            
Creating filesystem with 131072 1k blocks and 32768 inodes
Filesystem UUID: 675cf4a8-fb74-4a4a-9f28-f01b462b1933
Superblock backups stored on blocks: 
	8193, 24577, 40961, 57345, 73729

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done 



root@Ahasver:~# mount -o data=journal /dev/loop0 /mnt


root@Ahasver:~# rm -f /usr/lib/locale/locale-archive


root@Ahasver:~# ln -s /mnt/locale-archive /usr/lib/locale/locale-archive


root@Ahasver:~# localedef -i en_US -c -f ISO-8859-1 en_US
Segmentation fault



I am also attaching a gdb backtrace of the error as seen using the Debian Stretch Installer RC1. The backtraced line 773 seems to be this one: https://sources.debian.net/src/glibc/2.24-9/locale/programs/locarchive.c/#L773
Comment 1 Christian Kujau 2017-01-29 22:21:22 UTC
I don't think this has anything to do with the underlying file system; it should be reported against the "libc-bin" package instead. The "ln" command above creates a dead symlink, and the same segfault happens when the symlink points to /tmp/foo (tmpfs) or some other nonexistent location. Running it under strace suggests a programming logic error in localedef.
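The dead-symlink condition described above can be checked before localedef is run. A minimal sketch in Python, using only the standard library; the helper name is_dangling_symlink is hypothetical and not part of any tool mentioned in this report:

```python
import os
import tempfile


def is_dangling_symlink(path):
    """Return True if path is a symlink whose target does not exist."""
    # os.path.islink() looks at the link itself, while os.path.exists()
    # follows the link and returns False when the target is missing.
    return os.path.islink(path) and not os.path.exists(path)


if __name__ == "__main__":
    # Demonstration in a scratch directory (hypothetical file names).
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "locale-archive")
        os.symlink(os.path.join(tmp, "missing-target"), link)
        print(is_dangling_symlink(link))
```

Pointed at /usr/lib/locale/locale-archive after the "ln -s" step, this would show whether the link target exists at the time localedef is invoked.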
Comment 2 Theodore Tso 2017-01-30 02:56:18 UTC
Using a stock ext4 file system, or an xfs file system, results in the same segfault in localedef.

So I think Christian is right.  This has nothing to do with ext4.

P.S.  I just tried with btrfs, and surprise!  localedef also crashes with a segfault.
Comment 3 Peter Rabbitson 2017-01-30 11:11:25 UTC
I assure you the problem is real. In my frustration/excitement I did not properly trim down the testcase, and ignored the obvious change in behavior ( SIGBUS -> SIGSEGV ).

Unfortunately I won't have time in the next couple days to properly trim this down. All I currently have is the tarball of a minimalistic chroot left over from a failed debian install ( 35MiB SHA1:18917303621bd99ef8c319107ed0dcd2eb44abe0 ): https://ipfs.io/ipfs/QmVbjsxH93Wy4Lnx1pThdd1iQ33TpeysoHuwzFb7awef5U/chroot_with_failing_locales_reconfig.tar.xz
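
Anyone fetching the tarball may want to compare it against the SHA1 quoted above before unpacking. A minimal sketch in Python; the function names are hypothetical, and the only value taken from this report is the digest itself:

```python
import hashlib
import sys

# Digest quoted in the comment for chroot_with_failing_locales_reconfig.tar.xz
EXPECTED_SHA1 = "18917303621bd99ef8c319107ed0dcd2eb44abe0"


def sha1_of_chunks(chunks):
    """Return the hex SHA1 digest of an iterable of byte strings."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path, chunk_size=1 << 20):
    """Yield the file at path in pieces so it is never fully held in memory."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


if __name__ == "__main__" and len(sys.argv) > 1:
    actual = sha1_of_chunks(read_chunks(sys.argv[1]))
    print("OK" if actual == EXPECTED_SHA1 else "MISMATCH: " + actual)
```

Run with the path to the downloaded tarball as the only argument.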

The failure is as follows, showcasing a successful run and a failing run right after ( the 512 inode size was a red herring: 256 fails as well ):


root@Ahasver:~# dd if=/dev/zero of=zeros.img bs=512M count=1
1+0 records in
1+0 records out
536870912 bytes (537 MB) copied, 11.8806 s, 45.2 MB/s


root@Ahasver:~# losetup -f zeros.img


root@Ahasver:~# mke2fs -t ext4 -j -O inline_data /dev/loop0
mke2fs 1.43.3 (04-Sep-2016)
Discarding device blocks: done                            
Creating filesystem with 131072 4k blocks and 32768 inodes
Filesystem UUID: 027504fd-e49d-42a3-bcd5-4ea3a05af5ed
Superblock backups stored on blocks: 
	32768, 98304

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done


root@Ahasver:~# mount /dev/loop0 /mnt


root@Ahasver:~# cd /mnt


root@Ahasver:/mnt# tar -Jxf ~/chroot_with_failing_locales_reconfig.tar.xz 


root@Ahasver:/mnt# cd chroot_with_failing_locales_reconfig/


root@Ahasver:/mnt/chroot_with_failing_locales_reconfig# ./chroot.bash 


root@Ahasver:/# dpkg-reconfigure -u locales
Generating locales (this might take a while)...
  en_US.ISO-8859-1... done
  en_US.UTF-8... done
Generation complete.


root@Ahasver:/# exit


root@Ahasver:/mnt/chroot_with_failing_locales_reconfig# cd


root@Ahasver:~# umount /mnt


root@Ahasver:~# mount -o data=journal /dev/loop0 /mnt


root@Ahasver:~# cd /mnt/chroot_with_failing_locales_reconfig/


root@Ahasver:/mnt/chroot_with_failing_locales_reconfig# ./chroot.bash 


root@Ahasver:/# dpkg-reconfigure -u locales
Generating locales (this might take a while)...
  en_US.ISO-8859-1...Bus error
 done
  en_US.UTF-8...Bus error
 done
Generation complete.
Comment 4 Theodore Tso 2017-01-30 15:58:19 UTC
It may be a real problem, but if it isn't ext4 specific, you're reporting it to the wrong place and we won't be able to help you.

Userspace crashes tend to be caused by (surprise!) userspace bugs, and this is the kernel bugzilla.   If you can trim this down to something which makes it clear that it is an ext4-specific issue, please feel free to update this bug with the new information.
Comment 5 Peter Rabbitson 2017-01-30 16:07:04 UTC
But it *is* ext4 specific - in https://bugzilla.kernel.org/show_bug.cgi?id=193431#c3 I repeatably demonstrate different behavior of a chroot depending solely on whether a fs is mounted with data=journal or without. That is with a userspace that is as new as it gets ( debian stretch-to-be ).

In the coming days I will try to distill this down further to a standalone program, but the chance is high I will fail: my C-fu is virtually nonexistent.
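
A standalone test along those lines might start from something like the sketch below, written in Python rather than C. It is only a starting point under stated assumptions: it creates a small file, modifies it through a shared mmap, and reads it back. Whether this access pattern actually triggers the bus error on an inline_data filesystem mounted with data=journal is not established anywhere in this report, and the assumption that a payload of a few dozen bytes ends up stored inline is likewise unverified. The function name and file name are hypothetical.

```python
import mmap
import os
import sys
import tempfile


def mmap_roundtrip(directory, payload=b"inline-data probe\n"):
    """Create a small file in directory, change its first byte through a
    shared mapping, and return the file contents read back afterwards."""
    path = os.path.join(directory, "mmap_probe.bin")  # hypothetical name
    with open(path, "wb") as fh:
        fh.write(payload)
        fh.flush()
        os.fsync(fh.fileno())

    with open(path, "r+b") as fh:
        # Length 0 maps the whole file; on Unix the default is a shared,
        # readable and writable mapping.
        with mmap.mmap(fh.fileno(), 0) as mapping:
            first = mapping[0]              # read access through the mapping
            mapping[0] = (first + 1) % 256  # write access through the mapping
            mapping.flush()                 # push the change back to the file

    with open(path, "rb") as fh:
        result = fh.read()
    os.remove(path)
    return result


if __name__ == "__main__":
    # Pass a directory on the filesystem under test, e.g. /mnt; without an
    # argument the system temporary directory is used.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(mmap_roundtrip(target))
```

A SIGBUS delivered during either mapping access would terminate the interpreter instead of raising a Python exception, so an abnormal exit is the signal to look for when comparing a default mount against a data=journal mount.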