Bug 11412

Summary: Crash in vfs_readlink() on intentionally corrupted ext2 fs
Product: File System
Component: ext2
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.27-rc4 + patches for #10976 & #11266
Subsystem:
Regression: No
Bisected commit-id:
Reporter: Sami Liedes (sami.liedes)
Assignee: Andrew Morton (akpm)
CC: alan, duaneg

Description Sami Liedes 2008-08-23 07:03:14 UTC
Distribution: Minimal Debian sid (unstable)
Hardware Environment: qemu x86
Problem Description:

I've now seen this happen twice, but unfortunately have not found a way to reproduce it (running the same tests on the same corrupted fs doesn't reproduce it).

I've been torture testing ext2 with broken filesystems. Apparently in these crashes there's something wrong with the const char *link passed to vfs_readlink(), as the crash is in strlen().

The two crashes happened after running four parallel tests for 17 hours, so I guess that's about the rate at which I can reproduce this if more tests are needed (but I'll gladly run the same tests with any patches, because I do have spare CPU time).

Here are the two backtraces I've seen:

---------- hdb.6993 ----------
***** zzuffing ***** seed 6993
[42596.659321] EXT2-fs error (device hdb): ext2_valid_block_bitmap: Invalid block bitmap - block_group = 0, block = 34
[42596.669321] BUG: unable to handle kernel paging request at c12be000
[42596.669321] IP: [<c045e979>] strlen+0xd/0x17
[42596.669321] *pde = 0788c163 *pte = 012be160
[42596.669321] Oops: 0000 [#1] DEBUG_PAGEALLOC
[42596.669321]
[42596.669321] Pid: 12667, comm: cp Not tainted (2.6.27-rc4 #1)
[42596.669321] EIP: 0060:[<c045e979>] EFLAGS: 00000283 CPU: 0
[42596.669321] EIP is at strlen+0xd/0x17
[42596.669321] EAX: 00000000 EBX: c12bd000 ECX: ffffefff EDX: 08202f60
[42596.669321] ESI: c12bd000 EDI: c12be000 EBP: c711ceec ESP: c711cee8
[42596.669321]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[42596.669321] Process cp (pid: 12667, ti=c711c000 task=c7ad2680 task.ti=c711c000)
[42596.669321] Stack: 00000401 c711cf04 c026b189 08202f60 c10277a0 c3af28e8 c10277a0 c711cf78
[42596.669321]        c026b288 c12bd000 00000401 08202f60 00000000 00000000 c0b8a5e0 00000246
[42596.669321]        00000001 00000246 c0b8a5e0 c494c000 00000000 c12bd000 c0545199 c494c000
[42596.669321] Call Trace:
[42596.669321]  [<c026b189>] ? vfs_readlink+0x22/0x49
[42596.669321]  [<c026b288>] ? generic_readlink+0x55/0x7e
[42596.669321]  [<c0545199>] ? _spin_unlock+0x1d/0x20
[42596.669321]  [<c027a38f>] ? mnt_drop_write+0x55/0x130
[42596.669321]  [<c0266f66>] ? sys_readlinkat+0x5e/0x78
[42596.669321]  [<c0266fa7>] ? sys_readlink+0x27/0x29
[42596.669321]  [<c0202f3e>] ? syscall_call+0x7/0xb
[42596.669321]  =======================
[42596.669321] Code: 55 89 e5 56 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 89 c7 b9 ff ff ff ff 31 c0 <f2> ae f7 d1 49 89 c8 5f 5d c3 55 89 e5 57 31 ff 85 c9 74 0e 89
[42596.669321] EIP: [<c045e979>] strlen+0xd/0x17 SS:ESP 0068:c711cee8
[42596.669321] ---[ end trace f964ba82de460dfa ]---
[42596.719321] EXT2-fs error (device hdb): ext2_check_page: bad entry in directory #1281: inode out of bounds - offset=0, inode=67110145, rec_len=12, name_len=1
[42596.719321] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[42596.779321] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[42596.919321] EXT2-fs error (device hdb): ext2_readdir: bad page in #1281
[... still got some EXT2 errors after this]
----------

And another:

---------- hdb.10006449 ----------
***** zzuffing ***** seed 10006449
[38407.575040] attempt to access beyond end of device
[38407.575040] hdb: rw=0, want=150706, limit=20480
[38407.575040] Buffer I/O error on device hdb, logical block 75352
[38407.585040] attempt to access beyond end of device
[38407.585040] hdb: rw=0, want=150706, limit=20480
[38407.585040] Buffer I/O error on device hdb, logical block 75352
[38407.595040] BUG: unable to handle kernel paging request at c1388000
[38407.595040] IP: [<c045e979>] strlen+0xd/0x17
[38407.595040] *pde = 0788c163 *pte = 01388160
[38407.595040] Oops: 0000 [#1] DEBUG_PAGEALLOC
[38407.595040]
[38407.595040] Pid: 11247, comm: cp Not tainted (2.6.27-rc4 #1)
[38407.595040] EIP: 0060:[<c045e979>] EFLAGS: 00000297 CPU: 0
[38407.595040] EIP is at strlen+0xd/0x17
[38407.595040] EAX: 00000000 EBX: c1387000 ECX: ffffefff EDX: 09dca7d8
[38407.595040] ESI: c1387000 EDI: c1388000 EBP: c69e6eec ESP: c69e6ee8
[38407.595040]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[38407.595040] Process cp (pid: 11247, ti=c69e6000 task=c791cd00 task.ti=c69e6000)
[38407.595040] Stack: 00000401 c69e6f04 c026b189 09dca7d8 c10290e0 c4f62ab0 c10290e0 c69e6f78
[38407.595040]        c026b288 c1387000 00000401 09dca7d8 00000000 00000000 c0b8a5e0 00000246
[38407.595040]        00000001 00000246 c0b8a5e0 c3d81080 00000000 c1387000 c0545199 c3d81080
[38407.595040] Call Trace:
[38407.595040]  [<c026b189>] ? vfs_readlink+0x22/0x49
[38407.595040]  [<c026b288>] ? generic_readlink+0x55/0x7e
[38407.595040]  [<c0545199>] ? _spin_unlock+0x1d/0x20
[38407.595040]  [<c027a38f>] ? mnt_drop_write+0x55/0x130
[38407.595040]  [<c0266f66>] ? sys_readlinkat+0x5e/0x78
[38407.595040]  [<c0266fa7>] ? sys_readlink+0x27/0x29
[38407.595040]  [<c0202f3e>] ? syscall_call+0x7/0xb
[38407.595040]  =======================
[38407.595040] Code: 55 89 e5 56 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 89 c7 b9 ff ff ff ff 31 c0 <f2> ae f7 d1 49 89 c8 5f 5d c3 55 89 e5 57 31 ff 85 c9 74 0e 89
[38407.595040] EIP: [<c045e979>] strlen+0xd/0x17 SS:ESP 0068:c69e6ee8
[38407.595040] ---[ end trace 77c6ae736e5146bf ]---
[38407.645040] attempt to access beyond end of device
[38407.645040] hdb: rw=0, want=2147503082, limit=20480
[38407.645040] Buffer I/O error on device hdb, logical block 1073751540
[38407.645040] attempt to access beyond end of device
----------
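
Both register dumps above tell the same story, assuming strlen() here is the usual i386 `repne scasb` loop (the `<f2> ae` bytes in the Code: lines suggest so): ECX is loaded with 0xffffffff and decremented once per byte compared, while EDI advances one byte at a time. The sketch below only redoes that arithmetic. The helper names are made up for illustration, and EBX/ESI are taken to hold the start of the string.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only: redo the arithmetic on the
 * register values printed in the two oopses.
 */

/* A repne-scasb strlen starts ECX at 0xffffffff and decrements it once
 * per byte compared, so the difference is the number of bytes scanned. */
static uint32_t bytes_scanned(uint32_t ecx_at_fault)
{
    return UINT32_C(0xffffffff) - ecx_at_fault;
}

/* Distance from the assumed start of the string (EBX/ESI) to the
 * faulting address (EDI, also reported in the BUG: line). */
static uint32_t fault_offset(uint32_t start, uint32_t fault_addr)
{
    return fault_addr - start;
}
```

With ECX = ffffefff, bytes_scanned() gives 0x1000, and c12be000 - c12bd000 (likewise c1388000 - c1387000) is also 0x1000. That is consistent with strlen() walking a full 4 KiB page without finding a terminator and faulting on the first byte of the next page, which DEBUG_PAGEALLOC had presumably left unmapped.
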

Comment 1 Duane Griffin 2008-12-04 10:38:16 UTC
It looks like there is no check that the link name is NUL-terminated on disk.
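
A minimal userspace sketch of the kind of check that is missing, assuming the link target lives in a fixed-size buffer whose size is known. The names `link_is_terminated` and `link_length` are hypothetical and are not kernel functions; they only illustrate bounding the search rather than calling strlen() on untrusted data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if a NUL byte occurs within buf[0..size). */
static bool link_is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical helper: length of the link target, or -1 if no NUL is
 * found within the buffer. Never reads past buf + size. */
static ptrdiff_t link_length(const char *buf, size_t size)
{
    const char *end = memchr(buf, '\0', size);

    return end ? end - buf : -1;
}
```

A caller would reject (or otherwise handle) the link when `link_length()` returns -1 instead of passing the pointer on to code that expects a C string.
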

Since ext2_follow_link sets the name pointer to point into the inode data, we can't just unconditionally NUL-terminate it. Changing that to allocate a buffer and copy the name into it wouldn't be very nice.
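
The two options weighed here can be sketched in userspace as follows. Both helpers are hypothetical illustrations, not the eventual kernel change: `terminate_in_place` writes a NUL into the buffer itself at a clamped index (cheap, but it modifies the data the pointer refers to), while `copy_terminated` leaves the source untouched at the cost of an allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: force termination inside buf, which holds bufsize bytes
 * (bufsize must be > 0). len is the length claimed by on-disk metadata
 * and is not trusted, so the write index is clamped to the last byte. */
static void terminate_in_place(char *buf, size_t len, size_t bufsize)
{
    size_t last = bufsize - 1;

    buf[len < last ? len : last] = '\0';
}

/* Hypothetical: return a malloc'd, NUL-terminated copy of at most size
 * bytes of src. Returns NULL on allocation failure; the caller frees. */
static char *copy_terminated(const char *src, size_t size)
{
    const char *end = memchr(src, '\0', size);
    size_t n = end ? (size_t)(end - src) : size;
    char *dst = malloc(n + 1);

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}
```

Either way, a later strlen() on the result stays inside memory the caller owns, whatever the on-disk bytes contain.
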

Unless I'm missing something (quite possible), the generic page_follow_link_light function has the same issue. If so, then ext2, along with a whole bunch of other filesystems, will still be affected even if the first case is fixed.

It would probably be useful to get an image that showed this problem, even though it doesn't happen every time.