Most recent kernel where this bug did not occur: 2.6.8
Distribution: Debian (sarge, with some packages from etch)
Hardware Environment: i386 (Dual PIII, Soyo D6IBA2 motherboard)
Software Environment:
Problem Description:

The function befs_utf2nls (in fs/befs/linuxvfs.c) writes a 0 byte past the end of a block of memory allocated via kmalloc(), leading to memory corruption. This happens only for filenames which are pure ASCII and a multiple of 4 bytes in length.

Here's a snippet from my syslog (with DEBUG_SLAB turned on; note the values for "redzone 2"):

Jul 31 22:03:40 gondolin kernel: slab error in cache_free_debugcheck(): cache `size-32': double free, or memory outside object was overwritten
Jul 31 22:03:40 gondolin kernel: [cache_free_debugcheck+461/592] cache_free_debugcheck+0x1cd/0x250
Jul 31 22:03:40 gondolin kernel: [kfree+89/160] kfree+0x59/0xa0
Jul 31 22:03:40 gondolin kernel: [pg0+541965946/1069794304] befs_readdir+0x13a/0x230 [befs]
Jul 31 22:03:40 gondolin kernel: [pg0+541965946/1069794304] befs_readdir+0x13a/0x230 [befs]
Jul 31 22:03:40 gondolin kernel: [permission+111/160] permission+0x6f/0xa0
Jul 31 22:03:41 gondolin kernel: [vfs_permission+32/48] vfs_permission+0x20/0x30
Jul 31 22:03:41 gondolin kernel: [may_open+85/528] may_open+0x55/0x210
Jul 31 22:03:41 gondolin kernel: [__copy_to_user_ll+112/128] __copy_to_user_ll+0x70/0x80
Jul 31 22:03:41 gondolin kernel: [copy_to_user+66/96] copy_to_user+0x42/0x60
Jul 31 22:03:41 gondolin kernel: [cp_new_stat64+248/272] cp_new_stat64+0xf8/0x110
Jul 31 22:03:41 gondolin kernel: [vfs_readdir+128/160] vfs_readdir+0x80/0xa0
Jul 31 22:03:41 gondolin kernel: [filldir64+0/256] filldir64+0x0/0x100
Jul 31 22:03:41 gondolin kernel: [sys_getdents64+116/208] sys_getdents64+0x74/0xd0
Jul 31 22:03:41 gondolin kernel: [filldir64+0/256] filldir64+0x0/0x100
Jul 31 22:03:41 gondolin kernel: [sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
Jul 31 22:03:41 gondolin kernel: dab0ae24: redzone 1: 0x170fc2a5, redzone 2: 0x170fc200.
Jul 31 22:03:41 gondolin kernel: slab error in cache_free_debugcheck(): cache `size-32': double free, or memory outside object was overwritten
Jul 31 22:03:41 gondolin kernel: [cache_free_debugcheck+461/592] cache_free_debugcheck+0x1cd/0x250
Jul 31 22:03:41 gondolin kernel: [kfree+89/160] kfree+0x59/0xa0
Jul 31 22:03:41 gondolin kernel: [pg0+541965946/1069794304] befs_readdir+0x13a/0x230 [befs]
Jul 31 22:03:41 gondolin kernel: [pg0+541965946/1069794304] befs_readdir+0x13a/0x230 [befs]
Jul 31 22:03:41 gondolin kernel: [permission+111/160] permission+0x6f/0xa0
Jul 31 22:03:41 gondolin kernel: [vfs_permission+32/48] vfs_permission+0x20/0x30
Jul 31 22:03:41 gondolin kernel: [may_open+85/528] may_open+0x55/0x210
Jul 31 22:03:41 gondolin kernel: [__copy_to_user_ll+112/128] __copy_to_user_ll+0x70/0x80
Jul 31 22:03:42 gondolin kernel: [copy_to_user+66/96] copy_to_user+0x42/0x60
Jul 31 22:03:42 gondolin kernel: [cp_new_stat64+248/272] cp_new_stat64+0xf8/0x110
Jul 31 22:03:42 gondolin kernel: [vfs_readdir+128/160] vfs_readdir+0x80/0xa0
Jul 31 22:03:42 gondolin kernel: [filldir64+0/256] filldir64+0x0/0x100
Jul 31 22:03:42 gondolin kernel: [sys_getdents64+116/208] sys_getdents64+0x74/0xd0
Jul 31 22:03:42 gondolin kernel: [filldir64+0/256] filldir64+0x0/0x100
Jul 31 22:03:42 gondolin kernel: [sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
Jul 31 22:03:42 gondolin kernel: dab0ae24: redzone 1: 0x170fc2a5, redzone 2: 0x170fc200.

Without DEBUG_SLAB, this leads to further corruption and hard lockups; I believe this is the bug which has made kernels later than 2.6.8 unusable for me. (This must be due to changes in memory management, the bug has been in the BeFS driver since the time it was introduced (AFAICT).)

Steps to reproduce:

Create a directory (in BeOS, naturally :-) with files named, e.g., "1", "22", "333", "4444", ... Mount it in Linux and do an "ls" or "find".
Fix:

The obvious fix is to change

    *out = result = kmalloc(maxlen, GFP_NOFS);

to

    *out = result = kmalloc(maxlen + 1, GFP_NOFS);

in befs_utf2nls(). The same fix may be needed for befs_nls2utf().
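To make the off-by-one concrete, here is a minimal user-space sketch in C. It is not the driver code: `nls_buf_size` and `copy_ascii_name` are hypothetical helpers invented for illustration, and the sketch assumes the simplest case from the report, a pure ASCII name where every input byte produces exactly one output byte. In that case the terminating 0 byte lands at index `in_len`, so a buffer of only `in_len` bytes is one byte too small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from fs/befs/linuxvfs.c): buffer size needed
 * for an output of at most maxlen bytes plus the terminating 0 byte.
 * This mirrors the proposed change from maxlen to maxlen + 1.
 */
static size_t nls_buf_size(size_t maxlen)
{
    return maxlen + 1;
}

/*
 * Hypothetical model of the conversion for a pure ASCII name: each input
 * byte yields one output byte, then a 0 byte is appended.
 * Returns 0 on success, or -1 if the result (including the terminator)
 * would not fit in a buffer of cap bytes.
 */
static int copy_ascii_name(const char *in, size_t in_len,
                           char *out, size_t cap)
{
    assert(in != NULL && out != NULL);

    /* The terminator is written at out[in_len], so in_len + 1 bytes are needed. */
    if (in_len + 1 > cap)
        return -1;

    memcpy(out, in, in_len);
    out[in_len] = '\0';
    return 0;
}
```

With a 4-byte name such as "4444", a capacity of 4 is rejected by this model, while the capacity returned by `nls_buf_size(4)` (5 bytes) is sufficient. The kernel code has no such check, which is why the write goes past the allocation there.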
Created attachment 8672
Fix BeFS slab corruption

This patch implements your suggested change. Does it look OK to you?

I'm not sure that befs_nls2utf() needs this change: in that function, maxlen = 3 * in_len, so the output UTF string may be up to three times longer than the incoming one, which looks like a safety measure. Are there special cases where a UTF string could be more than three times longer than its equivalent NLS string?
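As a rough way to reason about the 3 * in_len bound, the following C sketch computes the UTF-8 length of a single 16-bit value. It is an illustration only, with two assumptions that are not confirmed by this report: each input character maps to at most one 16-bit Unicode value, and the output buffer must also hold a terminating 0 byte. The helpers `utf8_len_u16` and `worst_case_utf8_buf` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes UTF-8 needs to encode one 16-bit value:
 * 1 byte below 0x80, 2 bytes below 0x800, otherwise 3 bytes.
 */
static size_t utf8_len_u16(uint16_t cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    return 3;
}

/*
 * Hypothetical worst-case buffer size for in_len input characters,
 * assuming at most 3 output bytes per character plus one byte for
 * the terminating 0.
 */
static size_t worst_case_utf8_buf(size_t in_len)
{
    /* Guard against overflow in 3 * in_len + 1. */
    assert(in_len <= (SIZE_MAX - 1) / 3);
    return 3 * in_len + 1;
}
```

Under these assumptions, 3 * in_len bytes are enough for the converted characters themselves, but not for the terminator in the case where every character expands to 3 bytes; that worst case would need 3 * in_len + 1 bytes.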
Created attachment 8692
Fix befs_nls2utf as well
Patch merged in the main tree (commit 94f563c426a78c97fc2a377315995e6ec8343872), closing bug.