Recently xfstests on s390x always hit below kernel BUG:
usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!

It's reproducible on xfs with default mkfs options. But it's easier and 100% reproducible (for me) on xfs with 64k directory block size (-n size=65536).

The kernel HEAD commit is:
commit 032dcf09e2bf7c822be25b4abef7a6c913870d98
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Jun 3 20:01:25 2022 -0700

    Merge tag 'gpio-fixes-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

[20797.425894] XFS (loop1): Mounting V5 Filesystem
[20797.433354] XFS (loop1): Ending clean mount
[20823.669300] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[20823.669339] ------------[ cut here ]------------
[20823.669340] kernel BUG at mm/usercopy.c:101!
[20823.669385] monitor event: 0040 ilc:2 [#1] SMP
[20823.669415] Modules linked in: ext2 overlay dm_zero dm_log_writes dm_thin_pool dm_persistent_data dm_bio_prison sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey tls loop lcs ctcm fsm zfcp scsi_transport_fc dasd_fba_mod rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 zcrypt_cex4 vfio drm fuse i2c_core fb font drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 des_s390 sha3_512_s390 sha3_256_s390 dasd_eckd_mod dasd_mod qeth_l2 bridge stp llc qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: scsi_debug]
[20823.669520] CPU: 0 PID: 3774731 Comm: rm Kdump: loaded Tainted: G B W 5.18.0+ #1
[20823.669530] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0)
[20823.672501] Krnl PSW : 0704d00180000000 000000009df4a85a (usercopy_abort+0xaa/0xb0)
[20823.672564]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[20823.672575] Krnl GPRS: 0000000000000001 001c000018090e00 000000000000005c 0000000000000004
[20823.672584]            001c000000000000 000000009d332024 000000009e14b1a0 001bff8000000000
[20823.672593]            0000000000000001 0000000000000001 0000000000000000 000000009e14b1e0
[20823.672601]            000000009e70d070 00000000a87bdac0 000000009df4a856 001bff8001f5f720
[20823.672621] Krnl Code: 000000009df4a84c: b9040031 lgr %r3,%r1
[20823.672621]            000000009df4a850: c0e5ffffbbfc brasl %r14,000000009df42048
[20823.672621]           #000000009df4a856: af000000 mc 0,0
[20823.672621]           >000000009df4a85a: 0707 bcr 0,%r7
[20823.672621]            000000009df4a85c: 0707 bcr 0,%r7
[20823.672621]            000000009df4a85e: 0707 bcr 0,%r7
[20823.672621]            000000009df4a860: c0040007b0a4 brcl 0,000000009e0409a8
[20823.672621]            000000009df4a866: eb6ff0480024 stmg %r6,%r15,72(%r15)
[20823.672789] Call Trace:
[20823.672794]  [<000000009df4a85a>] usercopy_abort+0xaa/0xb0
[20823.672817] ([<000000009df4a856>] usercopy_abort+0xa6/0xb0)
[20823.672825]  [<000000009cd30c34>] check_heap_object+0x474/0x480
[20823.672833]  [<000000009cd30cb4>] __check_object_size+0x74/0x150
[20823.672840]  [<000000009cd8de06>] filldir64+0x296/0x530
[20823.672849]  [<001bffff805957dc>] xfs_dir2_leaf_getdents+0x40c/0xca0 [xfs]
[20823.673277]  [<001bffff80596e18>] xfs_readdir+0x3f8/0x740 [xfs]
[20823.673522]  [<000000009cd8c7ac>] iterate_dir+0x41c/0x580
[20823.673529]  [<000000009cd8d6b4>] __do_sys_getdents64+0xc4/0x1c0
[20823.673537]  [<000000009c4bda8c>] do_syscall+0x22c/0x330
[20823.673546]  [<000000009df5e8be>] __do_syscall+0xce/0xf0
[20823.673554]  [<000000009df87402>] system_call+0x82/0xb0
[20823.673563] INFO: lockdep is turned off.
[20823.673568] Last Breaking-Event-Address:
[20823.673572]  [<000000009df420f4>] _printk+0xac/0xb8
[20823.673581] ---[ end trace 0000000000000000 ]---
[20829.875273] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[20829.875316] ------------[ cut here ]------------
[20829.875318] kernel BUG at mm/usercopy.c:101!
[20829.875448] monitor event: 0040 ilc:2 [#2] SMP
[20829.875468] Modules linked in: ext2 overlay dm_zero dm_log_writes dm_thin_pool dm_persistent_data dm_bio_prison sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey tls loop lcs ctcm fsm zfcp scsi_transport_fc dasd_fba_mod rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 zcrypt_cex4 vfio drm fuse i2c_core fb font drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 des_s390 sha3_512_s390 sha3_256_s390 dasd_eckd_mod dasd_mod qeth_l2 bridge stp llc qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: scsi_debug]
[20829.875616] CPU: 0 PID: 3776251 Comm: find Kdump: loaded Tainted: G B D W 5.18.0+ #1
[20829.875629] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0)
[20829.879533] Krnl PSW : 0704d00180000000 000000009df4a85a (usercopy_abort+0xaa/0xb0)
[20829.879554]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[20829.879573] Krnl GPRS: 0000000000000001 001c000018090e00 000000000000005c 0000000000000004
[20829.879578]            001c000000000000 000000009d332024 000000009e14b1a0 001bff8000000000
[20829.879583]            0000000000000001 0000000000000001 0000000000000000 000000009e14b1e0
[20829.879587]            000000009e70d070 00000000a21852c0 000000009df4a856 001bff8004fef728
[20829.879599] Krnl Code: 000000009df4a84c: b9040031 lgr %r3,%r1
[20829.879599]            000000009df4a850: c0e5ffffbbfc brasl %r14,000000009df42048
[20829.879599]           #000000009df4a856: af000000 mc 0,0
[20829.879599]           >000000009df4a85a: 0707 bcr 0,%r7
[20829.879599]            000000009df4a85c: 0707 bcr 0,%r7
[20829.879599]            000000009df4a85e: 0707 bcr 0,%r7
[20829.879599]            000000009df4a860: c0040007b0a4 brcl 0,000000009e0409a8
[20829.879599]            000000009df4a866: eb6ff0480024 stmg %r6,%r15,72(%r15)
[20829.879631] Call Trace:
[20829.879634]  [<000000009df4a85a>] usercopy_abort+0xaa/0xb0
[20829.879639] ([<000000009df4a856>] usercopy_abort+0xa6/0xb0)
[20829.879644]  [<000000009cd30c34>] check_heap_object+0x474/0x480
[20829.879650]  [<000000009cd30cb4>] __check_object_size+0x74/0x150
[20829.879654]  [<000000009cd8de06>] filldir64+0x296/0x530
[20829.879661]  [<001bffff805957dc>] xfs_dir2_leaf_getdents+0x40c/0xca0 [xfs]
[20829.879971]  [<001bffff80596e18>] xfs_readdir+0x3f8/0x740 [xfs]
[20829.880107]  [<000000009cd8c7ac>] iterate_dir+0x41c/0x580
[20829.880112]  [<000000009cd8d6b4>] __do_sys_getdents64+0xc4/0x1c0
[20829.880117]  [<000000009c4bda8c>] do_syscall+0x22c/0x330
[20829.880124]  [<000000009df5e8be>] __do_syscall+0xce/0xf0
[20829.880129]  [<000000009df87402>] system_call+0x82/0xb0
[20829.880135] INFO: lockdep is turned off.
[20829.880138] Last Breaking-Event-Address:
[20829.880141]  [<000000009df420f4>] _printk+0xac/0xb8
[20829.880148] ---[ end trace 0000000000000000 ]---
[20829.975537] XFS (loop0): Unmounting Filesystem
CC filesystem_xfs@kernel-bugs.kernel.org
Default xfs (no specified mkfs options) can reproduce this bug with xfstests xfs/294. The decode_stacktrace.sh output as below[1], HEAD=032dcf09e ("Merge tag 'gpio-fixes-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux")

[1]
# ./scripts/decode_stacktrace.sh vmlinux < console.log
[30523.215443] run fstests xfs/294 at 2022-06-05 00:40:48
[30525.371171] XFS (loop1): Mounting V5 Filesystem
[30525.388258] XFS (loop1): Ending clean mount
[30574.012385] restraintd[1854]: *** Current Time: Sun Jun 05 00:41:38 2022 Localwatchdog at: Mon Jun 06 16:13:37 2022
[30604.239628] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[30604.239677] ------------[ cut here ]------------
[30604.239679] kernel BUG at mm/usercopy.c:101!
[30604.239731] monitor event: 0040 ilc:2 [#1] SMP
[30604.239774] Modules linked in: ext2 overlay dm_zero dm_log_writes dm_thin_pool dm_persistent_data dm_bio_prison sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey tls loop lcs ctcm fsm zfcp scsi_transport_fc dasd_fba_mod rfkill vfio_ccw mdev vfio_iommu_type1 zcrypt_cex4 vfio sunrpc drm i2c_core fb fuse font drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 des_s390 sha3_512_s390 sha3_256_s390 qeth_l2 bridge stp llc dasd_eckd_mod dasd_mod qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: scsi_debug] 5.18.0+ #1
[30604.240048] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0)
[30604.240155] Krnl PSW : 0704d00180000000 00000000255ca85a (usercopy_abort+0xaa/0xb0)
[30604.240177]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[30604.240188] Krnl GPRS: 0000000000000001 001c000018090e00 000000000000005c 0000000000000004
[30604.240196]            001c000000000000 00000000249b2024 00000000257cb1a0 001bff8000000000
[30604.240204]            0000000000000001 0000000000000001 0000000000000000 00000000257cb1e0
[30604.240213]            0000000025d8d070 00000000973502c0 00000000255ca856 001bff80041af730
[30604.240231] Krnl Code: 00000000255ca84c: b9040031 lgr %r3,%r1
Code starting with the faulting instruction
===========================================
[30604.240231]            00000000255ca850: c0e5ffffbbfc brasl %r14,00000000255c2048
[30604.240231]           #00000000255ca856: af000000 mc 0,0
[30604.240231]           >00000000255ca85a: 0707 bcr 0,%r7
[30604.240231]            00000000255ca85c: 0707 bcr 0,%r7
[30604.240231]            00000000255ca85e: 0707 bcr 0,%r7
[30604.240231]            00000000255ca860: c0040007b0a4 brcl 0,00000000256c09a8
[30604.240231]            00000000255ca866: eb6ff0480024 stmg %r6,%r15,72(%r15)
[30604.240369] Call Trace:
[30604.240375] usercopy_abort (??:?)
[30604.240382] usercopy_abort (mm/usercopy.c:101 (discriminator 24))
[30604.240400] check_heap_object (mm/usercopy.c:180)
[30604.240409] __check_object_size (mm/usercopy.c:123 mm/usercopy.c:255 mm/usercopy.c:214)
[30604.240415] filldir64 (./include/linux/uaccess.h:108 fs/readdir.c:339)
[30604.240424] xfs_dir2_leaf_getdents (./include/linux/fs.h:3430 fs/xfs/xfs_dir2_readdir.c:472) xfs
[30604.240830] xfs_readdir (fs/xfs/xfs_dir2_readdir.c:547) xfs
[30604.241036] iterate_dir (fs/readdir.c:65)
[30604.241042] __do_sys_getdents64 (fs/readdir.c:369)
[30604.241047] do_syscall (arch/s390/kernel/syscall.c:144 (discriminator 1))
[30604.241053] __do_syscall (arch/s390/kernel/syscall.c:169)
[30604.241058] system_call (arch/s390/kernel/entry.S:335)
[30604.241064] INFO: lockdep is turned off.
[30604.241067] Last Breaking-Event-Address:
[30604.241070] _printk (kernel/printk/printk.c:2426)
[30604.241077] ---[ end trace 0000000000000000 ]---
[30609.984847] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[30609.984894] ------------[ cut here ]------------
[30609.984896] kernel BUG at mm/usercopy.c:101!
[30609.984945] monitor event: 0040 ilc:2 [#2] SMP
[30609.984984] Modules linked in: ext2 overlay dm_zero dm_log_writes dm_thin_pool dm_persistent_data dm_bio_prison sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 sg dm_snapshot dm_bufio ext4 mbcache jbd2 dm_flakey tls loop lcs ctcm fsm zfcp scsi_transport_fc dasd_fba_mod rfkill vfio_ccw mdev vfio_iommu_type1 zcrypt_cex4 vfio sunrpc drm i2c_core fb fuse font drm_panel_orientation_quirks xfs libcrc32c ghash_s390 prng aes_s390 des_s390 sha3_512_s390 sha3_256_s390 qeth_l2 bridge stp llc dasd_eckd_mod dasd_mod qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: scsi_debug] 5.18.0+ #1
[30609.985151] Hardware name: IBM 8561 LT1 400 (z/VM 7.2.0)
[30609.985211] Krnl PSW : 0704d00180000000 00000000255ca85a (usercopy_abort+0xaa/0xb0)
[30609.985249]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[30609.985258] Krnl GPRS: 0000000000000001 001c000018090e00 000000000000005c 0000000000000004
[30609.985264]            001c000000000000 00000000249b2024 00000000257cb1a0 001bff8000000000
[30609.985271]            0000000000000001 0000000000000001 0000000000000000 00000000257cb1e0
[30609.985276]            0000000025d8d070 00000000a2d652c0 00000000255ca856 001bff800810f668
[30609.985293] Krnl Code: 00000000255ca84c: b9040031 lgr %r3,%r1
Code starting with the faulting instruction
===========================================
[30609.985293]            00000000255ca850: c0e5ffffbbfc brasl %r14,00000000255c2048
[30609.985293]           #00000000255ca856: af000000 mc 0,0
[30609.985293]           >00000000255ca85a: 0707 bcr 0,%r7
[30609.985293]            00000000255ca85c: 0707 bcr 0,%r7
[30609.985293]            00000000255ca85e: 0707 bcr 0,%r7
[30609.985293]            00000000255ca860: c0040007b0a4 brcl 0,00000000256c09a8
[30609.985293]            00000000255ca866: eb6ff0480024 stmg %r6,%r15,72(%r15)
[30609.985340] Call Trace:
[30609.985345] usercopy_abort (??:?)
[30609.985352] usercopy_abort (mm/usercopy.c:101 (discriminator 24))
[30609.985358] check_heap_object (mm/usercopy.c:180)
[30609.985367] __check_object_size (mm/usercopy.c:123 mm/usercopy.c:255 mm/usercopy.c:214)
[30609.985374] filldir64 (./include/linux/uaccess.h:108 fs/readdir.c:339)
[30609.985383] xfs_dir2_leaf_getdents (./include/linux/fs.h:3430 fs/xfs/xfs_dir2_readdir.c:472) xfs
[30609.985780] xfs_readdir (fs/xfs/xfs_dir2_readdir.c:547) xfs
[30609.986002] iterate_dir (fs/readdir.c:65)
[30609.986009] __do_sys_getdents64 (fs/readdir.c:369)
[30609.986017] do_syscall (arch/s390/kernel/syscall.c:144 (discriminator 1))
[30609.986026] __do_syscall (arch/s390/kernel/syscall.c:169)
[30609.986033] system_call (arch/s390/kernel/entry.S:335)
[30609.986041] INFO: lockdep is turned off.
[30609.986046] Last Breaking-Event-Address:
[30609.986050] _printk (kernel/printk/printk.c:2426)
[30609.986059] ---[ end trace 0000000000000000 ]---
[30610.050449] XFS (loop0): Unmounting Filesystem
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Sun, 05 Jun 2022 01:00:15 +0000 bugzilla-daemon@kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=216073
>
>             Bug ID: 216073
>            Summary: [s390x] kernel BUG at mm/usercopy.c:101! usercopy:
>                     Kernel memory exposure attempt detected from vmalloc
>                     'no area' (offset 0, size 1)!
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.19-rc0
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: akpm@linux-foundation.org
>           Reporter: zlang@redhat.com
>         Regression: No
>
> Recently xfstests on s390x always hit below kernel BUG:
> usercopy: Kernel memory exposure attempt detected from vmalloc 'no area'
> (offset 0, size 1)!

Thanks. Do you know if this is specific to s390?

> It's reproducible on xfs with default mkfs options. But it's easier and 100%
> reproducible (for me) on xfs with 64k directory block size (-n size=65536).
>
> [...]
On Mon, Jun 06, 2022 at 03:13:12PM -0700, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).

Hi Zorro,

Unfortunately, I am not able to reproduce the issue. Could you please clarify your test environment details and share your xfstests config?

Thanks!
On Tue, Jun 07, 2022 at 05:05:01PM +0200, Alexander Gordeev wrote:
> On Mon, Jun 06, 2022 at 03:13:12PM -0700, Andrew Morton wrote:
> > (switched to email. Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
>
> Hi Zorro,
>
> Unfortunately, I am not able to reproduce the issue. Could you please
> clarify your test environment details and share your xfstests config?

One of the test environment details as [1]. The xfstests config as [2].
It's easier to reproduce on 64k directory size xfs by running xfstests auto group.

Thanks,
Zorro

[1]
CPU
  Vendor      IBM/S390
  Model Name  8561
  Family      0
  Model       3126312
  Stepping    0
  Speed       0.0
  Processors  2
  Cores       0
  Sockets     0
  Hyper       True
  Flags       edat dfp vxp vx vxe ldisp sie vxe2 highgprs etf3eh te vxd gs sort zarch msa stfle dflt eimm esan3
  Arch(s)     s390x
Memory        4096 MB
NUMA Nodes    1
Disks
  Model    Size                  Logical sector size  Physical sector size
  3390/0c  41.03 GB / 38.21 GiB  4096 bytes           4096 bytes

[2]
# cat local.config
FSTYP=xfs
TEST_DEV=/dev/loop0
TEST_DIR=/mnt/fstests/TEST_DIR
SCRATCH_DEV=/dev/loop1
SCRATCH_MNT=/mnt/fstests/SCRATCH_DIR
LOGWRITES_DEV=/dev/loop2
MKFS_OPTIONS="-n size=65536 -m crc=1,finobt=1,reflink=1,rmapbt=0,bigtime=1,inobtcount=1"
TEST_FS_MOUNT_OPTS=""

>
> Thanks!
>
On Wed, Jun 08, 2022 at 10:19:22AM +0800, Zorro Lang wrote:
> One of the test environment details as [1]. The xfstests config as [2].
> It's easier to reproduce on 64k directory size xfs by running xfstests
> auto group.

Thanks for the details, Zorro!

Do you create the test and scratch devices with xfs_io, as the README suggests?
If yes, what are the sizes of the files?
Also, do you always run xfs/auto, or does xfs/294 hit reliably for you?

Thanks!
On Wed, Jun 08, 2022 at 09:13:12PM +0200, Alexander Gordeev wrote:
> On Wed, Jun 08, 2022 at 10:19:22AM +0800, Zorro Lang wrote:
> > One of the test environment details as [1]. The xfstests config as [2].
> > It's easier to reproduce on 64k directory size xfs by running xfstests
> > auto group.
>
>
> Thanks for the details, Zorro!
>
> Do you create the test and scratch devices with xfs_io, as the README suggests?
> If yes, what are the sizes of the files?

# fallocate -l 5G /home/test_dev.img
# fallocate -l 10G /home/scratch_dev.img

Then create loop devices.

> Also, do you always run xfs/auto, or does xfs/294 hit reliably for you?

100% in my testing: I tried 10 times last weekend and hit it all 10 times. Will test again this week.

>
> Thanks!
>
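The setup described in the message above can be sketched as a small script. This is only a sketch under stated assumptions, not the reporter's actual procedure: the image paths, sizes and the xfs/294 test name come from this thread, while the `run` helper, the `DRY_RUN` switch, the choice of /dev/loop0 and /dev/loop1 (taken from the posted local.config) and the plain `mkfs.xfs -f` of the test device are illustrative. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: back the xfstests TEST_DEV and SCRATCH_DEV with loop devices.
# Image paths and sizes follow the thread; DRY_RUN and run() are
# hypothetical helpers added here so the script is safe to try.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0 (needs root).
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_and_run() {
    run fallocate -l 5G /home/test_dev.img
    run fallocate -l 10G /home/scratch_dev.img
    # Attach each image to an explicitly named loop device so the names
    # match TEST_DEV and SCRATCH_DEV in the posted local.config.
    run losetup /dev/loop0 /home/test_dev.img
    run losetup /dev/loop1 /home/scratch_dev.img
    # Assumption: the test device needs a filesystem before the run;
    # the scratch device is formatted by xfstests using MKFS_OPTIONS.
    run mkfs.xfs -f /dev/loop0
    # Must be started from the xfstests source directory.
    run ./check xfs/294
}

setup_and_run
```

Running it with `DRY_RUN=0` as root from an xfstests checkout would execute the commands instead of printing them; whether that reproduces the BUG depends on the kernel and configuration discussed in this thread.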
It's not a s390x specific bug, I just hit this issue on aarch64 with linux v5.19.0-rc1:

[ 980.200947] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[ 980.200968] ------------[ cut here ]------------
[ 980.200969] kernel BUG at mm/usercopy.c:101!
[ 980.201081] Internal error: Oops - BUG: 0 [#1] SMP
[ 980.224192] Modules linked in: rfkill arm_spe_pmu mlx5_ib ast drm_vram_helper drm_ttm_helper ttm ib_uverbs acpi_ipmi drm_kms_helper ipmi_ssif fb_sys_fops syscopyarea sysfillrect ib_core sysimgblt arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq sunrpc vfat fat drm fuse xfs libcrc32c mlx5_core crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sbsa_gwdt nvme igb mlxfw nvme_core tls i2c_algo_bit psample pci_hyperv_intf i2c_designware_platform i2c_designware_core xgene_hwmon ipmi_devintf ipmi_msghandler
[ 980.268449] CPU: 42 PID: 121940 Comm: rm Kdump: loaded Not tainted 5.19.0-rc1+ #1
[ 980.275921] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F16f (SCP: 1.06.20210615) 07/01/2021
[ 980.285214] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 980.292165] pc : usercopy_abort+0x78/0x7c
[ 980.296167] lr : usercopy_abort+0x78/0x7c
[ 980.300166] sp : ffff80002b007730
[ 980.303469] x29: ffff80002b007740 x28: ffff80002b007cc0 x27: ffffdc5683ecc880
[ 980.310595] x26: 1ffff00005600f9b x25: ffffdc5681c90000 x24: ffff80002b007cdc
[ 980.317722] x23: ffff800041a0004a x22: 0000000000000001 x21: 0000000000000001
[ 980.324848] x20: 0000000000000000 x19: ffff800041a00049 x18: 0000000000000000
[ 980.331974] x17: 2720636f6c6c616d x16: 76206d6f72662064 x15: 6574636574656420
[ 980.339101] x14: 74706d6574746120 x13: 21293120657a6973 x12: ffff6106cbc4c03f
[ 980.346227] x11: 1fffe106cbc4c03e x10: ffff6106cbc4c03e x9 : ffffdc5681f36e30
[ 980.353353] x8 : ffff08365e2601f7 x7 : 0000000000000001 x6 : ffff6106cbc4c03e
[ 980.360480] x5 : ffff08365e2601f0 x4 : 1fffe10044b11801 x3 : 0000000000000000
[ 980.367606] x2 : 0000000000000000 x1 : ffff08022588c000 x0 : 000000000000005c
[ 980.374733] Call trace:
[ 980.377167] usercopy_abort+0x78/0x7c
[ 980.380819] check_heap_object+0x3dc/0x3e0
[ 980.384907] __check_object_size.part.0+0x6c/0x1f0
[ 980.389688] __check_object_size+0x24/0x30
[ 980.393774] filldir64+0x548/0x84c
[ 980.397165] xfs_dir2_block_getdents+0x404/0x960 [xfs]
[ 980.402437] xfs_readdir+0x3c4/0x4b0 [xfs]
[ 980.406652] xfs_file_readdir+0x6c/0xa0 [xfs]
[ 980.411127] iterate_dir+0x3a4/0x500
[ 980.414691] __do_sys_getdents64+0xb0/0x230
[ 980.418863] __arm64_sys_getdents64+0x70/0xa0
[ 980.423209] invoke_syscall.constprop.0+0xd8/0x1d0
[ 980.427991] el0_svc_common.constprop.0+0x224/0x2bc
[ 980.432858] do_el0_svc+0x4c/0x90
[ 980.436163] el0_svc+0x5c/0x140
[ 980.439294] el0t_64_sync_handler+0xb4/0x130
[ 980.443553] el0t_64_sync+0x174/0x178
[ 980.447206] Code: f90003e3 aa0003e3 91098100 97ffe24b (d4210000)
[ 980.453292] SMP: stopping secondary CPUs
[ 980.458162] Starting crashdump kernel...
[ 980.462073] Bye!
Zorro, Linux developers don't use bugzilla. Nobody saw your most recent comment. Please resend it, as an emailed reply-to-all.
On Wed, Jun 08, 2022 at 09:13:12PM +0200, Alexander Gordeev wrote:
> On Wed, Jun 08, 2022 at 10:19:22AM +0800, Zorro Lang wrote:
> > One of the test environment details as [1]. The xfstests config as [2].
> > It's easier to reproduce on 64k directory size xfs by running xfstests
> > auto group.
>
>
> Thanks for the details, Zorro!
>
> Do you create the test and scratch devices with xfs_io, as the README suggests?
> If yes, what are the sizes of the files?
> Also, do you always run xfs/auto, or does xfs/294 hit reliably for you?

Looks like it's not a s390x specific bug, I just hit this issue once (not 100% reproducible) on aarch64 with linux v5.19.0-rc1+ [1]. So back to cc linux-mm to get more review.

Thanks,
Zorro

[1]
[ 980.200947] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
[ 980.200968] ------------[ cut here ]------------
[ 980.200969] kernel BUG at mm/usercopy.c:101!
[ 980.201081] Internal error: Oops - BUG: 0 [#1] SMP
[ 980.224192] Modules linked in: rfkill arm_spe_pmu mlx5_ib ast drm_vram_helper drm_ttm_helper ttm ib_uverbs acpi_ipmi drm_kms_helper ipmi_ssif fb_sys_fops syscopyarea sysfillrect ib_core sysimgblt arm_cmn arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq sunrpc vfat fat drm fuse xfs libcrc32c mlx5_core crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sbsa_gwdt nvme igb mlxfw nvme_core tls i2c_algo_bit psample pci_hyperv_intf i2c_designware_platform i2c_designware_core xgene_hwmon ipmi_devintf ipmi_msghandler
[ 980.268449] CPU: 42 PID: 121940 Comm: rm Kdump: loaded Not tainted 5.19.0-rc1+ #1
[ 980.275921] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F16f (SCP: 1.06.20210615) 07/01/2021
[ 980.285214] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 980.292165] pc : usercopy_abort+0x78/0x7c
[ 980.296167] lr : usercopy_abort+0x78/0x7c
[ 980.300166] sp : ffff80002b007730
[ 980.303469] x29: ffff80002b007740 x28: ffff80002b007cc0 x27: ffffdc5683ecc880
[ 980.310595] x26: 1ffff00005600f9b x25: ffffdc5681c90000 x24: ffff80002b007cdc
[ 980.317722] x23: ffff800041a0004a x22: 0000000000000001 x21: 0000000000000001
[ 980.324848] x20: 0000000000000000 x19: ffff800041a00049 x18: 0000000000000000
[ 980.331974] x17: 2720636f6c6c616d x16: 76206d6f72662064 x15: 6574636574656420
[ 980.339101] x14: 74706d6574746120 x13: 21293120657a6973 x12: ffff6106cbc4c03f
[ 980.346227] x11: 1fffe106cbc4c03e x10: ffff6106cbc4c03e x9 : ffffdc5681f36e30
[ 980.353353] x8 : ffff08365e2601f7 x7 : 0000000000000001 x6 : ffff6106cbc4c03e
[ 980.360480] x5 : ffff08365e2601f0 x4 : 1fffe10044b11801 x3 : 0000000000000000
[ 980.367606] x2 : 0000000000000000 x1 : ffff08022588c000 x0 : 000000000000005c
[ 980.374733] Call trace:
[ 980.377167] usercopy_abort+0x78/0x7c
[ 980.380819] check_heap_object+0x3dc/0x3e0
[ 980.384907] __check_object_size.part.0+0x6c/0x1f0
[ 980.389688] __check_object_size+0x24/0x30
[ 980.393774] filldir64+0x548/0x84c
[ 980.397165] xfs_dir2_block_getdents+0x404/0x960 [xfs]
[ 980.402437] xfs_readdir+0x3c4/0x4b0 [xfs]
[ 980.406652] xfs_file_readdir+0x6c/0xa0 [xfs]
[ 980.411127] iterate_dir+0x3a4/0x500
[ 980.414691] __do_sys_getdents64+0xb0/0x230
[ 980.418863] __arm64_sys_getdents64+0x70/0xa0
[ 980.423209] invoke_syscall.constprop.0+0xd8/0x1d0
[ 980.427991] el0_svc_common.constprop.0+0x224/0x2bc
[ 980.432858] do_el0_svc+0x4c/0x90
[ 980.436163] el0_svc+0x5c/0x140
[ 980.439294] el0t_64_sync_handler+0xb4/0x130
[ 980.443553] el0t_64_sync+0x174/0x178
[ 980.447206] Code: f90003e3 aa0003e3 91098100 97ffe24b (d4210000)
[ 980.453292] SMP: stopping secondary CPUs
[ 980.458162] Starting crashdump kernel...
[ 980.462073] Bye!

>
> Thanks!
>
On Sun, Jun 12, 2022 at 12:42:30PM +0800, Zorro Lang wrote:
> Looks like it's not a s390x specific bug, I just hit this issue once (not
> 100% reproducible) on aarch64 with linux v5.19.0-rc1+ [1]. So back to cc
> linux-mm to get more review.
>
> [1]
> [ 980.200947] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!

	if (is_vmalloc_addr(ptr)) {
		struct vm_struct *area = find_vm_area(ptr);
		if (!area) {
			usercopy_abort("vmalloc", "no area", to_user, 0, n);

Oh. Looks like XFS uses vm_map_ram() and vm_map_ram() doesn't allocate a vm_struct.

Ulad, how does this look to you?

diff --git a/mm/usercopy.c b/mm/usercopy.c
index baeacc735b83..6bc2a1407c59 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,7 +173,7 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}

 	if (is_vmalloc_addr(ptr)) {
-		struct vm_struct *area = find_vm_area(ptr);
+		struct vmap_area *area = find_vmap_area((unsigned long)ptr);
 		unsigned long offset;

 		if (!area) {
@@ -181,8 +181,9 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 			return;
 		}

-		offset = ptr - area->addr;
-		if (offset + n > get_vm_area_size(area))
+		/* XXX: We should also abort for free vmap_areas */
+		offset = (unsigned long)ptr - area->va_start;
+		if (offset + n >= area->va_end)
 			usercopy_abort("vmalloc", NULL, to_user, offset, n);
 		return;
 	}
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 07db42455dd4..effd1ff6a4b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1798,7 +1798,7 @@ static void free_unmap_vmap_area(struct vmap_area *va)
 	free_vmap_area_noflush(va);
 }

-static struct vmap_area *find_vmap_area(unsigned long addr)
+struct vmap_area *find_vmap_area(unsigned long addr)
 {
 	struct vmap_area *va;

> [ 980.200968] ------------[ cut here ]------------
> [ 980.200969] kernel BUG at mm/usercopy.c:101!
> [ 980.201081] Internal error: Oops - BUG: 0 [#1] SMP > [ 980.224192] Modules linked in: rfkill arm_spe_pmu mlx5_ib ast > drm_vram_helper drm_ttm_helper ttm ib_uverbs acpi_ipmi drm_kms_helper > ipmi_ssif fb_sys_fops syscopyarea sysfillrect ib_core sysimgblt arm_cmn > arm_dmc620_pmu arm_dsu_pmu cppc_cpufreq sunrpc vfat fat drm fuse xfs > libcrc32c mlx5_core crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce > sbsa_gwdt nvme igb mlxfw nvme_core tls i2c_algo_bit psample pci_hyperv_intf > i2c_designware_platform i2c_designware_core xgene_hwmon ipmi_devintf > ipmi_msghandler > [ 980.268449] CPU: 42 PID: 121940 Comm: rm Kdump: loaded Not tainted > 5.19.0-rc1+ #1 > [ 980.275921] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F16f > (SCP: 1.06.20210615) 07/01/2021 > [ 980.285214] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS > BTYPE=--) > [ 980.292165] pc : usercopy_abort+0x78/0x7c > [ 980.296167] lr : usercopy_abort+0x78/0x7c > [ 980.300166] sp : ffff80002b007730 > [ 980.303469] x29: ffff80002b007740 x28: ffff80002b007cc0 x27: > ffffdc5683ecc880 > [ 980.310595] x26: 1ffff00005600f9b x25: ffffdc5681c90000 x24: > ffff80002b007cdc > [ 980.317722] x23: ffff800041a0004a x22: 0000000000000001 x21: > 0000000000000001 > [ 980.324848] x20: 0000000000000000 x19: ffff800041a00049 x18: > 0000000000000000 > [ 980.331974] x17: 2720636f6c6c616d x16: 76206d6f72662064 x15: > 6574636574656420 > [ 980.339101] x14: 74706d6574746120 x13: 21293120657a6973 x12: > ffff6106cbc4c03f > [ 980.346227] x11: 1fffe106cbc4c03e x10: ffff6106cbc4c03e x9 : > ffffdc5681f36e30 > [ 980.353353] x8 : ffff08365e2601f7 x7 : 0000000000000001 x6 : > ffff6106cbc4c03e > [ 980.360480] x5 : ffff08365e2601f0 x4 : 1fffe10044b11801 x3 : > 0000000000000000 > [ 980.367606] x2 : 0000000000000000 x1 : ffff08022588c000 x0 : > 000000000000005c > [ 980.374733] Call trace: > [ 980.377167] usercopy_abort+0x78/0x7c > [ 980.380819] check_heap_object+0x3dc/0x3e0 > [ 980.384907] __check_object_size.part.0+0x6c/0x1f0 > [ 
980.389688] __check_object_size+0x24/0x30 > [ 980.393774] filldir64+0x548/0x84c > [ 980.397165] xfs_dir2_block_getdents+0x404/0x960 [xfs] > [ 980.402437] xfs_readdir+0x3c4/0x4b0 [xfs] > [ 980.406652] xfs_file_readdir+0x6c/0xa0 [xfs] > [ 980.411127] iterate_dir+0x3a4/0x500
> On Sun, Jun 12, 2022 at 12:42:30PM +0800, Zorro Lang wrote:
> > Looks likt it's not a s390x specific bug, I just hit this issue once (not 100%
> > reproducible) on aarch64 with linux v5.19.0-rc1+ [1]. So back to cc linux-mm
> > to get more review.
> >
> > [1]
> > [ 980.200947] usercopy: Kernel memory exposure attempt detected from vmalloc 'no area' (offset 0, size 1)!
>
> 	if (is_vmalloc_addr(ptr)) {
> 		struct vm_struct *area = find_vm_area(ptr);
> 		if (!area) {
> 			usercopy_abort("vmalloc", "no area", to_user, 0, n);
>
> Oh. Looks like XFS uses vm_map_ram() and vm_map_ram() doesn't allocate
> a vm_struct.
>
> Ulad, how does this look to you?
>
It looks like a correct way to me :) XFS uses per-cpu-vm_map_ram()-vm_unmap_ram()
API which do not allocate "vm_struct" because it is not needed.

>
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index baeacc735b83..6bc2a1407c59 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -173,7 +173,7 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
>  	}
>
>  	if (is_vmalloc_addr(ptr)) {
> -		struct vm_struct *area = find_vm_area(ptr);
> +		struct vmap_area *area = find_vmap_area((unsigned long)ptr);
>  		unsigned long offset;
>
>  		if (!area) {
> @@ -181,8 +181,9 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
>  			return;
>  		}
>
> -		offset = ptr - area->addr;
> -		if (offset + n > get_vm_area_size(area))
> +		/* XXX: We should also abort for free vmap_areas */
> +		offset = (unsigned long)ptr - area->va_start;
>
I was a bit confused about "offset" and why it is needed here. It is always zero.
So we can get rid of it to make it less confused. From the other hand a zero offset
contributes to nothing.

>
> +		if (offset + n >= area->va_end)
>
I think it is a bit wrong. As i see it, "n" is a size and what we would like to do
here is boundary check:

<snip>
if (n > va_size(area))
	usercopy_abort("vmalloc", NULL, to_user, 0, n);
<snip>

--
Uladzislau Rezki
On Sun, Jun 12, 2022 at 03:03:20PM +0200, Uladzislau Rezki wrote:
> > @@ -181,8 +181,9 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
> >  			return;
> >  		}
> >
> > -		offset = ptr - area->addr;
> > -		if (offset + n > get_vm_area_size(area))
> > +		/* XXX: We should also abort for free vmap_areas */
> > +		offset = (unsigned long)ptr - area->va_start;
> >
> I was a bit confused about "offset" and why it is needed here. It is always zero.
> So we can get rid of it to make it less confused. From the other hand a zero offset
> contributes to nothing.

I don't think offset is necessarily zero. 'ptr' is a pointer somewhere
in the object, not necessarily the start of the object.

> >
> > +		if (offset + n >= area->va_end)
> >
> I think it is a bit wrong. As i see it, "n" is a size and what we would like to do
> here is boundary check:
>
> <snip>
> if (n > va_size(area))
> 	usercopy_abort("vmalloc", NULL, to_user, 0, n);
> <snip>

Hmm ... we should probably be more careful about wrapping.

	if (n > area->va_end - addr)
		usercopy_abort("vmalloc", NULL, to_user, offset, n);

... and that goes for the whole function actually. I'll split that into
a separate change.
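The wrapping concern raised above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct range` and both helper names are hypothetical, and `addr` is assumed to be the pointer value cast to an unsigned integer that already lies inside `[start, end)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vmap_area fields used in the check. */
struct range {
	uintptr_t start;	/* first address of the mapping */
	uintptr_t end;		/* one past the last address */
};

/*
 * Additive form: "addr + n" is computed modulo the word size, so a very
 * large n can wrap around and make an oversized copy look in bounds.
 */
static bool naive_out_of_bounds(const struct range *r, uintptr_t addr,
				uintptr_t n)
{
	return addr + n > r->end;
}

/*
 * Subtractive form: with start <= addr < end, "end - addr" cannot
 * underflow, so the comparison is exact for every value of n.
 */
static bool safe_out_of_bounds(const struct range *r, uintptr_t addr,
			       uintptr_t n)
{
	return n > r->end - addr;
}
```

For a range of 0x1000..0x2000 and addr 0x1800, both forms agree for ordinary sizes, but a size of UINTPTR_MAX - 0x17ff makes `addr + n` wrap to zero, so only the subtractive form reports the copy as out of bounds.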
On Sun, Jun 12, 2022 at 11:27 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Sun, Jun 12, 2022 at 03:03:20PM +0200, Uladzislau Rezki wrote: > > > @@ -181,8 +181,9 @@ static inline void check_heap_object(const void *ptr, > unsigned long n, > > > return; > > > } > > > > > > - offset = ptr - area->addr; > > > - if (offset + n > get_vm_area_size(area)) > > > + /* XXX: We should also abort for free vmap_areas */ > > > + offset = (unsigned long)ptr - area->va_start; > > > > > I was a bit confused about "offset" and why it is needed here. It is always > zero. > > So we can get rid of it to make it less confused. From the other hand a > zero offset > > contributes to nothing. > > I don't think offset is necessarily zero. 'ptr' is a pointer somewhere > in the object, not necessarily the start of the object. > > > > > > > + if (offset + n >= area->va_end) > > > > > I think it is a bit wrong. As i see it, "n" is a size and what we would > like to do > > here is boundary check: > > > > <snip> > > if (n > va_size(area)) > > usercopy_abort("vmalloc", NULL, to_user, 0, n); > > <snip> > > Hmm ... we should probably be more careful about wrapping. > > if (n > area->va_end - addr) > usercopy_abort("vmalloc", NULL, to_user, offset, n); > > ... and that goes for the whole function actually. I'll split that into > a separate change. Please let me know if there is something we want to test -- I can reproduce the problem reliably: ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:101! 
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP CPU: 4 PID: 3259 Comm: iptables Not tainted 5.19.0-rc1-lockdep+ #1 pc : usercopy_abort+0x9c/0xa0 lr : usercopy_abort+0x9c/0xa0 sp : ffffffc010bd78d0 x29: ffffffc010bd78e0 x28: 42ffff80ac08d8ec x27: 42ffff80ac08d8ec x26: 42ffff80ac08d8c0 x25: 000000000000000a x24: ffffffdf4c7e5120 x23: 000000000bec44c2 x22: efffffc000000000 x21: ffffffdf2896b0c0 x20: 0000000000000001 x19: 000000000000000b x18: 0000000000000000 x17: 2820636f6c6c616d x16: 0000000000000042 x15: 6574636574656420 x14: 74706d6574746120 x13: 0000000000000018 x12: 000000000000000d x11: ff80007fffffffff x10: 0000000000000001 x9 : db174b7f89103400 x8 : db174b7f89103400 x7 : 0000000000000000 x6 : 79706f6372657375 x5 : ffffffdf4d9c617e x4 : 0000000000000000 x3 : ffffffdf4b7d017c x2 : ffffff80eb188b18 x1 : 42ffff80ac08d8c8 x0 : 0000000000000066 Call trace: usercopy_abort+0x9c/0xa0 __check_object_size+0x38c/0x400 xt_obj_to_user+0xe4/0x200 xt_compat_target_to_user+0xd8/0x18c compat_copy_entries_to_user+0x278/0x424 do_ipt_get_ctl+0x7bc/0xb2c nf_getsockopt+0x7c/0xb4 ip_getsockopt+0xee8/0xfa4 raw_getsockopt+0xf4/0x23c sock_common_getsockopt+0x48/0x54 __sys_getsockopt+0x11c/0x2f8 __arm64_sys_getsockopt+0x60/0x70 el0_svc_common+0xfc/0x1cc do_el0_svc_compat+0x38/0x5c el0_svc_compat+0x68/0xf4 el0t_32_sync_handler+0xc0/0xf0 el0t_32_sync+0x190/0x194 Code: aa0903e4 a9017bfd 910043fd 9438be18 (d4210000) ---[ end trace 0000000000000000 ]---
On Sun, Jun 12, 2022 at 11:59:58AM -0600, Yu Zhao wrote: > Please let me know if there is something we want to test -- I can > reproduce the problem reliably: > > ------------[ cut here ]------------ > kernel BUG at mm/usercopy.c:101! The line right before cut here would have been nice ;-) https://lore.kernel.org/linux-mm/YqXU+oU7wayOcmCe@casper.infradead.org/ might fix your problem, but I can't be sure without that line. > Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > CPU: 4 PID: 3259 Comm: iptables Not tainted 5.19.0-rc1-lockdep+ #1 > pc : usercopy_abort+0x9c/0xa0 > lr : usercopy_abort+0x9c/0xa0 > sp : ffffffc010bd78d0 > x29: ffffffc010bd78e0 x28: 42ffff80ac08d8ec x27: 42ffff80ac08d8ec > x26: 42ffff80ac08d8c0 x25: 000000000000000a x24: ffffffdf4c7e5120 > x23: 000000000bec44c2 x22: efffffc000000000 x21: ffffffdf2896b0c0 > x20: 0000000000000001 x19: 000000000000000b x18: 0000000000000000 > x17: 2820636f6c6c616d x16: 0000000000000042 x15: 6574636574656420 > x14: 74706d6574746120 x13: 0000000000000018 x12: 000000000000000d > x11: ff80007fffffffff x10: 0000000000000001 x9 : db174b7f89103400 > x8 : db174b7f89103400 x7 : 0000000000000000 x6 : 79706f6372657375 > x5 : ffffffdf4d9c617e x4 : 0000000000000000 x3 : ffffffdf4b7d017c > x2 : ffffff80eb188b18 x1 : 42ffff80ac08d8c8 x0 : 0000000000000066 > Call trace: > usercopy_abort+0x9c/0xa0 > __check_object_size+0x38c/0x400 > xt_obj_to_user+0xe4/0x200 > xt_compat_target_to_user+0xd8/0x18c > compat_copy_entries_to_user+0x278/0x424 > do_ipt_get_ctl+0x7bc/0xb2c > nf_getsockopt+0x7c/0xb4 > ip_getsockopt+0xee8/0xfa4 > raw_getsockopt+0xf4/0x23c > sock_common_getsockopt+0x48/0x54 > __sys_getsockopt+0x11c/0x2f8 > __arm64_sys_getsockopt+0x60/0x70 > el0_svc_common+0xfc/0x1cc > do_el0_svc_compat+0x38/0x5c > el0_svc_compat+0x68/0xf4 > el0t_32_sync_handler+0xc0/0xf0 > el0t_32_sync+0x190/0x194 > Code: aa0903e4 a9017bfd 910043fd 9438be18 (d4210000) > ---[ end trace 0000000000000000 ]---
On Sun, Jun 12, 2022 at 12:05 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, Jun 12, 2022 at 11:59:58AM -0600, Yu Zhao wrote:
> > Please let me know if there is something we want to test -- I can
> > reproduce the problem reliably:
> >
> > ------------[ cut here ]------------
> > kernel BUG at mm/usercopy.c:101!
>
> The line right before cut here would have been nice ;-)

Right.

$ grep usercopy:
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 2882303761517129920, size 11)!
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 8574853690513436864, size 11)!
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 7998392938210013376, size 11)!
...

> https://lore.kernel.org/linux-mm/YqXU+oU7wayOcmCe@casper.infradead.org/
>
> might fix your problem, but I can't be sure without that line.

Thanks, it worked!
> On Sun, Jun 12, 2022 at 03:03:20PM +0200, Uladzislau Rezki wrote:
> > > @@ -181,8 +181,9 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
> > >  			return;
> > >  		}
> > >
> > > -		offset = ptr - area->addr;
> > > -		if (offset + n > get_vm_area_size(area))
> > > +		/* XXX: We should also abort for free vmap_areas */
> > > +		offset = (unsigned long)ptr - area->va_start;
> > >
> > I was a bit confused about "offset" and why it is needed here. It is always zero.
> > So we can get rid of it to make it less confused. From the other hand a zero offset
> > contributes to nothing.
>
> I don't think offset is necessarily zero. 'ptr' is a pointer somewhere
> in the object, not necessarily the start of the object.
>
Right you are. Just checked the __find_vmap_area() it returns VA of the address it
belongs to. Initially i was thinking that addr have to be exactly as va->start only,
so i was wrong.

> > >
> > > +		if (offset + n >= area->va_end)
> > >
> > I think it is a bit wrong. As i see it, "n" is a size and what we would like to do
> > here is boundary check:
> >
> > <snip>
> > if (n > va_size(area))
> > 	usercopy_abort("vmalloc", NULL, to_user, 0, n);
> > <snip>
>
> Hmm ... we should probably be more careful about wrapping.
>
> 	if (n > area->va_end - addr)
> 		usercopy_abort("vmalloc", NULL, to_user, offset, n);
>
> ... and that goes for the whole function actually. I'll split that into
> a separate change.
>
Based on that offset can be > 0, checking "offset + n" with va->va_end is OK.

<snip>
if (offset + n > area->va_end)
	usercopy_abort("vmalloc", NULL, to_user, offset, n);
<snip>

--
Uladzislau Rezki
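The point conceded above is that the lookup returns the area an address falls inside, so the offset of that address within the area need not be zero. A minimal sketch of that idea follows, assuming a simplified `struct area` and a linear scan; both are hypothetical and chosen for brevity, and the kernel's own lookup is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct vmap_area. */
struct area {
	uintptr_t va_start;	/* first address of the mapping */
	uintptr_t va_end;	/* one past the last address */
};

/*
 * Return the area whose [va_start, va_end) range contains addr, or NULL
 * if no area does. A linear scan keeps the sketch short.
 */
static const struct area *find_containing(const struct area *areas,
					  size_t count, uintptr_t addr)
{
	for (size_t i = 0; i < count; i++) {
		if (addr >= areas[i].va_start && addr < areas[i].va_end)
			return &areas[i];
	}
	return NULL;
}

/* Offset of addr inside the area that contains it; zero only at va_start. */
static uintptr_t offset_in_area(const struct area *a, uintptr_t addr)
{
	return addr - a->va_start;
}
```

With one area covering 0x1000..0x3000, a lookup of 0x2040 returns that area and an offset of 0x1040, while a lookup of 0x3000 (one past the end) returns NULL.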
On Sun, Jun 12, 2022 at 12:43:45PM -0600, Yu Zhao wrote:
> On Sun, Jun 12, 2022 at 12:05 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Sun, Jun 12, 2022 at 11:59:58AM -0600, Yu Zhao wrote:
> > > Please let me know if there is something we want to test -- I can
> > > reproduce the problem reliably:
> > >
> > > ------------[ cut here ]------------
> > > kernel BUG at mm/usercopy.c:101!
> >
> > The line right before cut here would have been nice ;-)
>
> Right.
>
> $ grep usercopy:
> usercopy: Kernel memory exposure attempt detected from vmalloc (offset 2882303761517129920, size 11)!
> usercopy: Kernel memory exposure attempt detected from vmalloc (offset 8574853690513436864, size 11)!
> usercopy: Kernel memory exposure attempt detected from vmalloc (offset 7998392938210013376, size 11)!

That's a different problem. And, er, what? How on earth do we have
an offset that big?!

	struct vm_struct *area = find_vm_area(ptr);
	offset = ptr - area->addr;
	if (offset + n > get_vm_area_size(area))
		usercopy_abort("vmalloc", NULL, to_user, offset, n);

That first offset is 0x2800'0000'0000'30C0

You said it was easy to replicate; can you add:

	printk("addr:%px ptr:%px\n", area->addr, ptr);

so that we can start to understand how we end up with such a bogus
offset?
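The decimal offsets in the log are easier to read in hex, as done above for the first one. A small sketch that splits a 64-bit value into its top byte and its low 56 bits; the helper names are hypothetical and the 8/56 split is only an assumption chosen to make the pattern in these particular numbers visible.

```c
#include <stdint.h>

/* Top 8 bits of a 64-bit value. */
static uint8_t top_byte(uint64_t v)
{
	return (uint8_t)(v >> 56);
}

/* The same value with its top 8 bits cleared. */
static uint64_t low_56_bits(uint64_t v)
{
	return v & ((UINT64_C(1) << 56) - 1);
}
```

Applied to the three logged offsets, the low 56 bits are 0x30c0 every time and only the top byte changes (0x28, 0x77, 0x6f), which suggests the plausible part of the offset is small and the high byte is what makes it look enormous.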
On Sun, Jun 12, 2022 at 1:52 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, Jun 12, 2022 at 12:43:45PM -0600, Yu Zhao wrote:
> > On Sun, Jun 12, 2022 at 12:05 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Sun, Jun 12, 2022 at 11:59:58AM -0600, Yu Zhao wrote:
> > > > Please let me know if there is something we want to test -- I can
> > > > reproduce the problem reliably:
> > > >
> > > > ------------[ cut here ]------------
> > > > kernel BUG at mm/usercopy.c:101!
> > >
> > > The line right before cut here would have been nice ;-)
> >
> > Right.
> >
> > $ grep usercopy:
> > usercopy: Kernel memory exposure attempt detected from vmalloc (offset 2882303761517129920, size 11)!
> > usercopy: Kernel memory exposure attempt detected from vmalloc (offset 8574853690513436864, size 11)!
> > usercopy: Kernel memory exposure attempt detected from vmalloc (offset 7998392938210013376, size 11)!
>
> That's a different problem. And, er, what? How on earth do we have
> an offset that big?!
>
> 	struct vm_struct *area = find_vm_area(ptr);
> 	offset = ptr - area->addr;
> 	if (offset + n > get_vm_area_size(area))
> 		usercopy_abort("vmalloc", NULL, to_user, offset, n);
>
> That first offset is 0x2800'0000'0000'30C0
>
> You said it was easy to replicate; can you add:
>
> 	printk("addr:%px ptr:%px\n", area->addr, ptr);
>
> so that we can start to understand how we end up with such a bogus
> offset?

Here you go:

addr:96ffffdfebcd4000 ptr:ffffffdfebcd70c0
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 7566047373982445760, size 11)!

And, not sure if it'd be helpful, with the vmap:

va_start:ffffffd83db0d000 va_end:ffffffd83db13000
addr:44ffffd83db0d000 ptr:ffffffd83db100c0
usercopy: Kernel memory exposure attempt detected from vmalloc (offset 13474770085092536512, size 11)!

which seems to explain why the fix worked.

+		if (offset + n > get_vm_area_size(area)) {
+			struct vmap_area *vmap = find_vmap_area((unsigned long)ptr);
+
+			if (vmap)
+				printk("va_start:%px va_end:%px\n", vmap->va_start, vmap->va_end);
+			printk("addr:%px ptr:%px\n", area->addr, ptr);
 			usercopy_abort("vmalloc", NULL, to_user, offset, n);
+		}
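The debug output above can be checked by hand: in both samples `area->addr` carries a different top byte than `ptr` (0x96 and 0x44 against 0xff), while `va_start` has the same top byte as `ptr`. The sketch below redoes the subtraction in userspace with the logged values; the helper names are hypothetical, and the thread itself does not say where the differing top byte comes from.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned 64-bit difference ptr - base, modulo 2^64, as the check computes it. */
static uint64_t byte_offset(uint64_t ptr, uint64_t base)
{
	return ptr - base;
}

/* Wrap-safe test that n bytes starting at ptr end at or before va_end. */
static bool copy_fits(uint64_t ptr, uint64_t n, uint64_t va_end)
{
	return n <= va_end - ptr;
}
```

Subtracting the logged `addr` values from `ptr` reproduces the two reported offsets exactly, whereas subtracting `va_start` from `ptr` in the second sample gives 0x30c0, and the 11-byte copy fits before `va_end`. That is consistent with the remark that switching the check to the vmap_area bounds made the report go away.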