Bug 217296
Summary: | kmemleaks on ac3b43283923 ("module: replace module_layout with module_memory") | ||
---|---|---|---|
Product: | Linux | Reporter: | Bugspray Bot (bugbot) |
Component: | Kernel | Assignee: | Virtual assignee for kernel bugs (linux-kernel) |
Status: | NEW --- | ||
Severity: | normal | ||
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
Bugspray Bot
2023-04-03 21:20:36 UTC
Luis Chamberlain <mcgrof@kernel.org> writes:

On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> hi Luis, etal
>
> kmemleak is reporting 19 leaks during boot
>
> because the hexdumps appeared to have module-names,
> and Ive been hacking nearby, and see the same names
> every time I boot my test-vm, I needed a clearer picture
> Jason corroborated and bisected.
>
> the 19 leaks split into 2 groups,
> 9 with names of builtin modules in the hexdump,
> all with the same backtrace
> 9 without module-names (with a shared backtrace)
> +1 wo name-ish and a separate backtrace

Song, please take a look.

Thanks for the report Jim, what kernel are you on exactly?

Luis

(via https://msgid.link/ZCaE71aPvvQ/L05L@bombadil.infradead.org)

Zorro Boogs <jim.cromie@gmail.com> replies to comment #1:

On Fri, Mar 31, 2023 at 1:00 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > hi Luis, etal
> >
> > kmemleak is reporting 19 leaks during boot
> >
> > because the hexdumps appeared to have module-names,
> > and Ive been hacking nearby, and see the same names
> > every time I boot my test-vm, I needed a clearer picture
> > Jason corroborated and bisected.
> >
> > the 19 leaks split into 2 groups,
> > 9 with names of builtin modules in the hexdump,
> > all with the same backtrace
> > 9 without module-names (with a shared backtrace)
> > +1 wo name-ish and a separate backtrace
>
> Song, please take a look.
>
> Thanks for the report Jim, what kernel are you on exactly?
>
> Luis

:#> uptime
09:45:32 up 1 day, 23:07, 0 users, load average: 0.07, 0.04, 0.01
:#> uname -a
Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux

the leaks I sent previously might be from/on a different commit,
heres the relevant one

fwiw, the config is unremarkable.
it started with
CONFIG_BUILD_SALT="5.16.8-200.fc35.x86_64"
then `make localmodconfig` to drop anything I dont have hw for
then `virtme-configkernel --update` to pick up the 9p,etc config options
And some extra DEBUG_* options
If you'd like to see runs with others, or see the config itself, please ask.

:#> uname -a
Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
:#> ./grok_kmemleak -n
not: bless( {
  'backtraces' => {
    '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
    [<00000000a2f80203>] memdup_user+0x26/0x90
    [<00000000f7cd3624>] strndup_user+0x3f/0x60
    [<0000000098fd26c5>] load_module+0x188b/0x20e0
    [<0000000074361279>] __do_sys_finit_module+0x93/0xf0
    [<000000004caeb948>] do_syscall_64+0x34/0x80
    [<000000009f5d036c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 16,
    '[<0000000094c136c3>] kmalloc_trace+0x26/0x90
    [<00000000700fd414>] resolve_symbol+0x2a5/0x3a0
    [<000000001dd9228b>] load_module+0x1465/0x20e0
    [<0000000074361279>] __do_sys_finit_module+0x93/0xf0
    [<000000004caeb948>] do_syscall_64+0x34/0x80
    [<000000009f5d036c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 3
  },
  'hexdumps' => {
    '00 b3 af 5a de 87 df 17 ...Z....' => 1,
    '00 b6 af 5a de 87 d8 cf ...Z....' => 1,
    '00 b7 af 5a de 87 d5 77 ...Z...w' => 1,
    '00 c8 b9 5a de 91 a2 2f ...Z.../' => 1,
    '00 ca b9 5a de 91 a6 3f ...Z...?' => 1,
    '00 cf b9 5a de 91 a2 07 ...Z....' => 1,
    '00 e0 4c 56 d2 64 8b 77 ..LV.d.w' => 1,
    '00 e0 f2 59 dd da 85 7f ...Y....' => 1,
    '00 e3 4c 56 d2 64 89 1f ..LV.d..' => 1,
    '00 e4 f2 59 dd da 8f 4f ...Y...O' => 1,
    '00 e5 4c 56 d2 64 87 bf ..LV.d..' => 1,
    '00 e6 f2 59 dd da 89 57 ...Y...W' => 1,
    '00 e8 f2 59 dd da 83 17 ...Y....' => 1,
    '00 e9 4c 56 d2 64 8e df ..LV.d..' => 1,
    '00 eb 4c 56 d2 64 80 67 ..LV.d.g' => 1,
    '00 ec 4c 56 d2 64 8f 7f ..LV.d..' => 1,
    '40 d4 1c 08 80 88 ff ff 88 99 37 c0 ff ff ff ff @.........7.....' => 1,
    '88 99 37 c0 ff ff ff ff 40 09 e8 13 80 88 ff ff ..7.....@.......' => 1,
    'c8 8a 23 c0 ff ff ff ff c8 8a 23 c0 ff ff ff ff ..#.......#.....' => 1
  },
  'users' => {
    'comm "(udev-worker)", pid 219,' => 1,
    'comm "(udev-worker)", pid 221,' => 4,
    'comm "(udev-worker)", pid 224,' => 3,
    'comm "(udev-worker)", pid 229,' => 1,
    'comm "(udev-worker)", pid 230,' => 1,
    'comm "modprobe", pid 728,' => 1,
    'comm "modprobe", pid 814,' => 1,
    'comm "modprobe", pid 825,' => 4,
    'comm "modprobe", pid 832,' => 1,
    'comm "modprobe", pid 835,' => 2
  }
}, 'LeakSet' )
mods: bless( {
  'backtraces' => {
    '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
    [<00000000ab7b01fd>] kstrdup+0x32/0x60
    [<000000005ed25b98>] kobject_set_name_vargs+0x1c/0x90
    [<0000000090fe19ca>] kobject_init_and_add+0x4d/0x90
    [<0000000045666935>] mod_sysfs_setup+0xa9/0x6e0
    [<00000000d6f7187b>] load_module+0x1de3/0x20e0
    [<0000000074361279>] __do_sys_finit_module+0x93/0xf0
    [<000000004caeb948>] do_syscall_64+0x34/0x80
    [<000000009f5d036c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 16
  },
  'hexdumps' => {
    '63 65 63 00 d2 64 80 7f cec..d..' => 1,
    '63 72 63 33 32 5f 70 63 6c 6d 75 6c 00 24 14 48 crc32_pclmul.$.H' => 1,
    '63 72 63 33 32 63 5f 69 6e 74 65 6c 00 a7 e0 f8 crc32c_intel....' => 1,
    '63 72 63 74 31 30 64 69 66 5f 70 63 6c 6d 75 6c crct10dif_pclmul' => 1,
    '67 68 61 73 68 5f 63 6c 6d 75 6c 6e 69 5f 69 6e ghash_clmulni_in' => 1,
    '69 32 63 5f 61 6c 67 6f 5f 62 69 74 00 c4 b6 08 i2c_algo_bit....' => 1,
    '69 32 63 5f 70 69 69 78 34 00 cb 8a 66 a7 e2 48 i2c_piix4...f..H' => 1,
    '69 6e 74 65 6c 5f 72 61 70 6c 5f 63 6f 6d 6d 6f intel_rapl_commo' => 1,
    '69 6e 74 65 6c 5f 72 61 70 6c 5f 6d 73 72 00 98 intel_rapl_msr..' => 1,
    '69 6f 6d 6d 75 5f 76 32 00 70 a8 80 6c c4 bd 08 iommu_v2.p..l...' => 1,
    '6d 78 6d 5f 77 6d 69 00 mxm_wmi.' => 1,
    '70 63 73 70 6b 72 00 8f pcspkr..' => 1,
    '73 65 72 69 6f 5f 72 61 77 00 cb 8a 66 a7 ed b8 serio_raw...f...' => 1,
    '74 65 73 74 5f 64 79 6e 61 6d 69 63 5f 64 65 62 test_dynamic_deb' => 1,
    '76 69 64 65 6f 00 d9 bf video...' => 1,
    '77 6d 69 00 dd da 80 df wmi.....' => 1
  },
  'users' => {
    'comm "(udev-worker)", pid 219,' => 1,
    'comm "(udev-worker)", pid 221,' => 4,
    'comm "(udev-worker)", pid 224,' => 2,
    'comm "(udev-worker)", pid 229,' => 1,
    'comm "(udev-worker)", pid 230,' => 1,
    'comm "modprobe", pid 728,' => 1,
    'comm "modprobe", pid 814,' => 1,
    'comm "modprobe", pid 825,' => 3,
    'comm "modprobe", pid 832,' => 1,
    'comm "modprobe", pid 835,' => 1
  }
}, 'LeakSet' )
:#> lsmod
Module                  Size  Used by
mxm_wmi                12288  0
iommu_v2               20480  0
video                  65536  0
i2c_algo_bit           12288  0
wmi                    32768  2 video,mxm_wmi
cec                    57344  0
test_dynamic_debug     20480  0
intel_rapl_msr         16384  0
crc32_pclmul           12288  0
intel_rapl_common      28672  1 intel_rapl_msr
ghash_clmulni_intel    12288  0
crct10dif_pclmul       12288  1
crc32c_intel           20480  0
serio_raw              16384  0
pcspkr                 12288  0
i2c_piix4              28672  0
:#>

(via https://msgid.link/CAJfuBxwng_fB5XH5LEWAWwN29fitGLBZ8hpdW3+4HjO_MDK1Eg@mail.gmail.com)

Luis Chamberlain <mcgrof@kernel.org> replies to comment #2:

On Fri, Mar 31, 2023 at 11:08:23AM -0600, jim.cromie@gmail.com wrote:
> :#> uptime
> 09:45:32 up 1 day, 23:07, 0 users, load average: 0.07, 0.04, 0.01
> :#> uname -a
> Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC
> Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
>
> the leaks I sent previously might be from/on a different commit,
> heres the relevant one
>
> fwiw, the config is unremarkable. it started with
> CONFIG_BUILD_SALT="5.16.8-200.fc35.x86_64"
> then `make localmodconfig` to drop anything I dont have hw for
> then `virtme-configkernel --update` to pick up the 9p,etc config options
> And some extra DEBUG_* options
> If you'd like to see runs with others, or see the config itself, please ask.

If you wanna see things explode

echo 0 > /proc/sys/vm/oom_dump_tasks
./stress-ng --module 20 --module-name xfs

This assumes xfs is not already loaded, and has all dependencies already
loaded. That would test the load_module() path.
If you wanna see if the test is earlier, you can try a module which
is already loaded on your system.

> :#> uname -a
> Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC
> Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
> :#> ./grok_kmemleak -n
> not: bless( {
> 'backtraces' => {
> '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
> [<00000000a2f80203>] memdup_user+0x26/0x90
> [<00000000f7cd3624>] strndup_user+0x3f/0x60
> [<0000000098fd26c5>] load_module+0x188b/0x20e0

Can you do:

gdb vmlinux
l *(load_module+0x188b)

And provide the output?

> }, 'LeakSet' )
> mods: bless( {
> 'backtraces' => {
> '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
> [<00000000ab7b01fd>] kstrdup+0x32/0x60
> [<000000005ed25b98>] kobject_set_name_vargs+0x1c/0x90
> [<0000000090fe19ca>] kobject_init_and_add+0x4d/0x90
> [<0000000045666935>] mod_sysfs_setup+0xa9/0x6e0

Ok that is a specific enough hint. I'll take a review of this sysfs
path see what changed that could break.

> [<00000000d6f7187b>] load_module+0x1de3/0x20e0
> [<0000000074361279>] __do_sys_finit_module+0x93/0xf0
> [<000000004caeb948>] do_syscall_64+0x34/0x80
> [<000000009f5d036c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 16
> },

Luis

(via https://msgid.link/ZCcwkCBgyxOgROVu@bombadil.infradead.org)

Zorro Boogs <jim.cromie@gmail.com> replies to comment #3:

On Fri, Mar 31, 2023 at 1:12 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Fri, Mar 31, 2023 at 11:08:23AM -0600, jim.cromie@gmail.com wrote:
> > :#> uptime
> > 09:45:32 up 1 day, 23:07, 0 users, load average: 0.07, 0.04, 0.01
> > :#> uname -a
> > Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC
> > Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
> >
> > the leaks I sent previously might be from/on a different commit,
> > heres the relevant one
> >
> > fwiw, the config is unremarkable. it started with
> > CONFIG_BUILD_SALT="5.16.8-200.fc35.x86_64"
> > then `make localmodconfig` to drop anything I dont have hw for
> > then `virtme-configkernel --update` to pick up the 9p,etc config options
> > And some extra DEBUG_* options
> > If you'd like to see runs with others, or see the config itself, please
> ask.
>
> If you wanna see things explode
>
> echo 0 > /proc/sys/vm/oom_dump_tasks
> ./stress-ng --module 20 --module-name xfs
>
> This assumes xfs is not already loaded, and has all dependencies already
> loaded. That would test the load_module() path.
>
> If you wanna see if the test is earlier, you can try a module which
> is already loaded on your system.
>
> > :#> uname -a
> > Linux (none) 6.3.0-rc1-f2-00001-gac3b43283923 #359 SMP PREEMPT_DYNAMIC
> > Wed Mar 29 09:33:11 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
> > :#> ./grok_kmemleak -n
> > not: bless( {
> > 'backtraces' => {
> > '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
> > [<00000000a2f80203>] memdup_user+0x26/0x90
> > [<00000000f7cd3624>] strndup_user+0x3f/0x60
> > [<0000000098fd26c5>] load_module+0x188b/0x20e0
>
> Can you do:
>
> gdb vmlinux
> l *(load_module+0x188b)
>
> And provide the output?

(gdb) l *(load_module+0x188b)
0xffffffff8122a4bb is in load_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2820).
2815			goto free_modinfo;
2816
2817		flush_module_icache(mod);
2818
2819		/* Now copy in args */
2820		mod->args = strndup_user(uargs, ~0UL >> 1);
2821		if (IS_ERR(mod->args)) {
2822			err = PTR_ERR(mod->args);
2823			goto free_arch_cleanup;
2824		}

>
> > }, 'LeakSet' )
> > mods: bless( {
> > 'backtraces' => {
> > '[<0000000058fb276d>] __kmalloc_node_track_caller+0x4a/0x140
> > [<00000000ab7b01fd>] kstrdup+0x32/0x60
> > [<000000005ed25b98>] kobject_set_name_vargs+0x1c/0x90
> > [<0000000090fe19ca>] kobject_init_and_add+0x4d/0x90
> > [<0000000045666935>] mod_sysfs_setup+0xa9/0x6e0
>
> Ok that is a specific enough hint. I'll take a review of this sysfs
> path see what changed that could break.

(gdb) l *(mod_sysfs_setup+0xa9)
0xffffffff8122d2d9 is in mod_sysfs_setup (/home/jimc/projects/lx/wk-next/kernel/module/sysfs.c:361).
356
357		mod->mkobj.mod = mod;
358
359		memset(&mod->mkobj.kobj, 0, sizeof(mod->mkobj.kobj));
360		mod->mkobj.kobj.kset = module_kset;
361		err = kobject_init_and_add(&mod->mkobj.kobj, &module_ktype, NULL,
362					   "%s", mod->name);
363		if (err)
364			mod_kobject_put(mod);
365
(gdb)

>
> > [<00000000d6f7187b>] load_module+0x1de3/0x20e0

(gdb) l *(load_module+0x1de3)
0xffffffff8122aa13 is in load_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2856).
2851			pr_warn("%s: parameters '%s' after `--' ignored\n",
2852			       mod->name, after_dashes);
2853		}
2854
2855		/* Link in to sysfs. */
2856		err = mod_sysfs_setup(mod, info, mod->kp, mod->num_kp);
2857		if (err < 0)
2858			goto coming_cleanup;
2859
2860		if (is_livepatch_module(mod)) {

> > [<0000000074361279>] __do_sys_finit_module+0x93/0xf0
> > [<000000004caeb948>] do_syscall_64+0x34/0x80
> > [<000000009f5d036c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 16
> > },
>> Luis

(via https://msgid.link/CAJfuBxzP0-sk59H6DTkkng+mFa0WWJdr7fVj=iKsaLT_J1YXuQ@mail.gmail.com)

Song Liu <song@kernel.org> replies to comment #1:

On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > hi Luis, etal
> >
> > kmemleak is reporting 19 leaks during boot
> >
> > because the hexdumps appeared to have module-names,
> > and Ive been hacking nearby, and see the same names
> > every time I boot my test-vm, I needed a clearer picture
> > Jason corroborated and bisected.
> >
> > the 19 leaks split into 2 groups,
> > 9 with names of builtin modules in the hexdump,
> > all with the same backtrace
> > 9 without module-names (with a shared backtrace)
> > +1 wo name-ish and a separate backtrace
>
> Song, please take a look.

I will look into this next week.
Thanks,
Song

(via https://msgid.link/CAPhsuW6P5AYVKMk=G1bEUz5PGZKmTJwtgQBmE-P4iAo7dOr5yA@mail.gmail.com)

Luis Chamberlain <mcgrof@kernel.org> replies to comment #5:

On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote:
> On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > > hi Luis, etal
> > >
> > > kmemleak is reporting 19 leaks during boot
> > >
> > > because the hexdumps appeared to have module-names,
> > > and Ive been hacking nearby, and see the same names
> > > every time I boot my test-vm, I needed a clearer picture
> > > Jason corroborated and bisected.
> > >
> > > the 19 leaks split into 2 groups,
> > > 9 with names of builtin modules in the hexdump,
> > > all with the same backtrace
> > > 9 without module-names (with a shared backtrace)
> > > +1 wo name-ish and a separate backtrace
> >
> > Song, please take a look.
>
> I will look into this next week.

I'm thinking this may be it, at least this gets us to what we used to do
as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules support")
and right before Song's patch.

diff --git a/kernel/module/main.c b/kernel/module/main.c
index 6b6da80f363f..3b9c71fa6096 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct load_info *info)
 		 * which is inside the block. Just mark it as not being a
 		 * leak.
 		 */
-		kmemleak_ignore(ptr);
+		if (type == MOD_INIT_TEXT)
+			kmemleak_ignore(ptr);
+		else
+			kmemleak_not_leak(ptr);
 		if (!ptr) {
 			t = type;
 			goto out_enomem;

We used to use the grey area for the TEXT but the original commit
doesn't explain too well why we grey out init but not the others. Ie
why kmemleak_ignore() on init and kmemleak_not_leak() on the others.

Catalinas, any thoughts / suggestions? Should we just stick to
kmemleak_not_leak() for both now?
Luis

(via https://msgid.link/ZCs6jpo1nYe1Wm08@bombadil.infradead.org)

Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:

On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> hi Luis, etal
>
> kmemleak is reporting 19 leaks during boot

Hi, all:

I'm going to use this thread to test out bugbot. You can just ignore it and
let it do its thing behind the scenes -- if it explodes, I'll take care of it.
It should just send a single follow-up telling us that it's tracking the
thread, but otherwise stay entirely out of everyone's hair.

bugbot on

Sorry for interrupting and thank you for your patience -- this is for a good
cause, I promise. :)

-K

(via https://msgid.link/owkyirqlrkdwvlmd4vlivgahd5uycolsdii3kvwbvakj5222mh@nydsfzk7uqtz)

Zorro Boogs <jim.cromie@gmail.com> replies to comment #6:

On Mon, Apr 3, 2023 at 2:44 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote:
> > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org>
> wrote:
> > >
> > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > > > hi Luis, etal
> > > >
> > > > kmemleak is reporting 19 leaks during boot
> > > >
> > > > because the hexdumps appeared to have module-names,
> > > > and Ive been hacking nearby, and see the same names
> > > > every time I boot my test-vm, I needed a clearer picture
> > > > Jason corroborated and bisected.
> > > >
> > > > the 19 leaks split into 2 groups,
> > > > 9 with names of builtin modules in the hexdump,
> > > > all with the same backtrace
> > > > 9 without module-names (with a shared backtrace)
> > > > +1 wo name-ish and a separate backtrace
> > >
> > > Song, please take a look.
> >
> > I will look into this next week.
>
> I'm thinking this may be it, at least this gets us to what we used to do
> as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules
> support") and right before Song's patch.
>
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 6b6da80f363f..3b9c71fa6096 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct
> load_info *info)
>  		 * which is inside the block. Just mark it as not being a
>  		 * leak.
>  		 */
> -		kmemleak_ignore(ptr);
> +		if (type == MOD_INIT_TEXT)
> +			kmemleak_ignore(ptr);
> +		else
> +			kmemleak_not_leak(ptr);
>  		if (!ptr) {
>  			t = type;
>  			goto out_enomem;
>
> We used to use the grey area for the TEXT but the original commit
> doesn't explain too well why we grey out init but not the others. Ie
> why kmemleak_ignore() on init and kmemleak_not_leak() on the others.
>
> Catalinas, any thoughts / suggestions? Should we just stick to
> kmemleak_not_leak() for both now?
>
> Luis

So I have mixed results.

your patch fixed the 19 leaks on my worktree / branch where I found them.

on top of
ac3b43283923 module: replace module_layout with module_memory

it fixed the (same) 19, but gets a few new ones.
whats weird is that once they report, they disappear from
/sys/kernel/debug/kmemleak

heres that kmemleak report, with a little preceding / setup,
performed by this bash scripting

drms_unload() {
    for m in i915 amdgpu nouveau \
        iommu_v2 video i2c_algo_bit mxm_wmi wmi intel_rapl_msr \
        drm_display_helper cec drm_kms_helper drm_ttm_helper ttm gpu_sched drm_buddy drm;
    do
        rmmod $m ;
    done
}

drms_load() {
    uname -a
    for m in i915 amdgpu nouveau; do
        modprobe $m $*
    done
}

cycle_drms() {
    for i in 1 $*; do    # loop 1+argc times
        time drms_load
        drms_unload
    done
}

leak_drive () {
    [[ -f /sys/kernel/debug/kmemleak ]] || {
        echo "need KMEMLEAK"
        return
    }
    let count=0
    #echo ok testing, each dot is 10 secs
    while true; do
        var=`cat /sys/kernel/debug/kmemleak`
        if [[ -z $var ]] ; then
            cycle_drms
            echo scan >/sys/kernel/debug/kmemleak
        else
            break
        fi
        ((count=$count+1))
        echo finished pass $count
    done
    cat /sys/kernel/debug/kmemleak
    dmesg | grep /sys/kernel/debug/kmemleak
    uname -a
}

finished pass 6
Linux (none) 6.3.0-rc1-f2-00002-g30504a44c558 #360 SMP PREEMPT_DYNAMIC Tue Apr 4 15:25:05 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux
[ 51.768797] ACPI: bus type drm_connector registered
[ 52.039443] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
[ 52.795766] [drm] amdgpu kernel modesetting enabled.
[ 52.796288] amdgpu: CRAT table not found
[ 52.796502] amdgpu: Virtual CRAT table created for CPU
[ 52.796964] amdgpu: Topology: Add CPU node

real 0m1.354s
user 0m0.002s
sys 0m0.919s
rmmod: ERROR: Module intel_rapl_msr is not currently loaded
[ 53.401823] ACPI: bus type drm_connector unregistered
[ 53.595705] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
finished pass 7
unreferenced object 0xffff8880059b0240 (size 192):
  comm "modprobe", pid 716, jiffies 4294714739 (age 6.065s)
  hex dump (first 32 bytes):
    00 db 50 c0 ff ff ff ff 00 00 00 00 00 00 00 00  ..P.............
    00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff  ................
  backtrace:
    [<00000000406104d4>] __kmalloc+0x49/0x150
    [<00000000fe00c883>] __register_sysctl_table+0x51/0x7f0
    [<00000000438011af>] 0xffffffffc04faa78
    [<000000009a44098c>] 0xffffffffc037f01b
    [<00000000de0b0c0b>] do_one_initcall+0x43/0x210
    [<0000000016200549>] do_init_module+0x60/0x240
    [<00000000e5f75cca>] __do_sys_finit_module+0x93/0xf0
    [<0000000014ed2961>] do_syscall_64+0x34/0x80
    [<00000000d14e8c97>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
unreferenced object 0xffff88800ce25800 (size 256):
  comm "modprobe", pid 716, jiffies 4294714739 (age 6.065s)
  hex dump (first 32 bytes):
    78 58 e2 0c 80 88 ff ff 00 00 00 00 00 00 00 00  xX..............
    00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff  ................
  backtrace:
    [<00000000406104d4>] __kmalloc+0x49/0x150
    [<00000000ec6658c8>] __register_sysctl_table+0x569/0x7f0
    [<00000000438011af>] 0xffffffffc04faa78
    [<000000009a44098c>] 0xffffffffc037f01b
    [<00000000de0b0c0b>] do_one_initcall+0x43/0x210
    [<0000000016200549>] do_init_module+0x60/0x240
    [<00000000e5f75cca>] __do_sys_finit_module+0x93/0xf0
    [<0000000014ed2961>] do_syscall_64+0x34/0x80
    [<00000000d14e8c97>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 53.595705] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
Linux (none) 6.3.0-rc1-f2-00002-g30504a44c558 #360 SMP PREEMPT_DYNAMIC Tue Apr 4 15:25:05 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux

at this point, kmemleak is empty.
Im guessing thats because the leak was in / under do_init_module,
and __init mem is recycled.
Maybe its also why the leak-trace has 2 entries without symbol info

Heres the levels above & below those mystery levels

(gdb) l *(do_one_initcall+0x43)
0xffffffff81001093 is in do_one_initcall (/home/jimc/projects/lx/wk-next/init/main.c:1306).
1301
1302		if (initcall_blacklisted(fn))
1303			return -EPERM;
1304
1305		do_trace_initcall_start(fn);
1306		ret = fn();
1307		do_trace_initcall_finish(fn, ret);
1308
1309		msgbuf[0] = 0;
1310
(gdb) l *(__register_sysctl_table+0x569)
0xffffffff814e5e99 is in __register_sysctl_table (/home/jimc/projects/lx/wk-next/fs/proc/proc_sysctl.c:974).
969		char *new_name;
970
971		new = kzalloc(sizeof(*new) + sizeof(struct ctl_node) +
972			      sizeof(struct ctl_table)*2 + namelen + 1,
973			      GFP_KERNEL);
974		if (!new)
975			return NULL;
976
977		node = (struct ctl_node *)(new + 1);
978		table = (struct ctl_table *)(node + 1);
(gdb)

(via https://msgid.link/CAJfuBxzGJvrJo9nTXxZ3xZ7QmdSb6YxBw-bojZjQTpACBeK_sQ@mail.gmail.com)

Luis Chamberlain <mcgrof@kernel.org> replies to comment #8:

On Tue, Apr 04, 2023 at 07:38:41PM -0600, jim.cromie@gmail.com wrote:
> On Mon, Apr 3, 2023 at 2:44 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote:
> > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org>
> wrote:
> > > >
> > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > > > > hi Luis, etal
> > > > >
> > > > > kmemleak is reporting 19 leaks during boot
> > > > >
> > > > > because the hexdumps appeared to have module-names,
> > > > > and Ive been hacking nearby, and see the same names
> > > > > every time I boot my test-vm, I needed a clearer picture
> > > > > Jason corroborated and bisected.
> > > > >
> > > > > the 19 leaks split into 2 groups,
> > > > > 9 with names of builtin modules in the hexdump,
> > > > > all with the same backtrace
> > > > > 9 without module-names (with a shared backtrace)
> > > > > +1 wo name-ish and a separate backtrace
> > > >
> > > > Song, please take a look.
> > >
> > > I will look into this next week.
> >
> > I'm thinking this may be it, at least this gets us to what we used to do
> > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules
> > support") and right before Song's patch.
> >
> > diff --git a/kernel/module/main.c b/kernel/module/main.c
> > index 6b6da80f363f..3b9c71fa6096 100644
> > --- a/kernel/module/main.c
> > +++ b/kernel/module/main.c
> > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct
> load_info *info)
> >  		 * which is inside the block. Just mark it as not being a
> >  		 * leak.
> >  		 */
> > -		kmemleak_ignore(ptr);
> > +		if (type == MOD_INIT_TEXT)
> > +			kmemleak_ignore(ptr);
> > +		else
> > +			kmemleak_not_leak(ptr);
> >  		if (!ptr) {
> >  			t = type;
> >  			goto out_enomem;
> >
> > We used to use the grey area for the TEXT but the original commit
> > doesn't explain too well why we grey out init but not the others. Ie
> > why kmemleak_ignore() on init and kmemleak_not_leak() on the others.
> >
> > Catalinas, any thoughts / suggestions? Should we just stick to
> > kmemleak_not_leak() for both now?
> >
> > Luis
>
> So I have mixed results.
>
> your patch fixed the 19 leaks on my worktree / branch where I found them.
>
> on top of
> ac3b43283923 module: replace module_layout with module_memory
>
> it fixed the (same) 19, but gets a few new ones.
> whats weird is that once they report, they disappear from
> /sys/kernel/debug/kmemleak

I think I missed the MOD_INIT_DATA and MOD_INIT_RODATA. Can you try the
patch below instead:

From 6890bd43866c40e1b58a832361812cdc5d965e4c Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <mcgrof@kernel.org>
Date: Tue, 4 Apr 2023 18:52:47 -0700
Subject: [PATCH] module: fix kmemleak annotations for non init ELF sections

Commit ac3b43283923 ("module: replace module_layout with module_memory")
reworked the way to handle memory allocations to make it clearer. But it
lost in translation how we handle kmemleak_ignore() or kmemleak_not_leak()
for these sections.

Fix this and clarify the comments a bit more.

Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
Reported-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/module/main.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/kernel/module/main.c b/kernel/module/main.c
index 5cc21083af04..fe0f3b8fd3a8 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2233,11 +2233,23 @@ static int move_module(struct module *mod, struct load_info *info)
 		ptr = module_memory_alloc(mod->mem[type].size, type);

 		/*
-		 * The pointer to this block is stored in the module structure
-		 * which is inside the block. Just mark it as not being a
-		 * leak.
+		 * The pointer to these blocks of memory are stored on the module
+		 * structure and we keep that around so long as the module is
+		 * around. We only free that memory when we unload the module.
+		 * Just mark them as not being a leak then. The .init* ELF
+		 * sections *do* get freed after boot so we treat them slightly
+		 * differently and only grey them out as they work as typical
+		 * memory allocations which *do* get eventually get freed.
 		 */
-		kmemleak_ignore(ptr);
+		switch (type) {
+		case MOD_INIT_TEXT: /* fallthrough */
+		case MOD_INIT_DATA: /* fallthrough */
+		case MOD_INIT_RODATA: /* fallthrough */
+			kmemleak_ignore(ptr);
+			break;
+		default:
+			kmemleak_not_leak(ptr);
+		}
 		if (!ptr) {
 			t = type;
 			goto out_enomem;

(via https://msgid.link/ZCzWdLOg1i2p1Q67@bombadil.infradead.org)

Zorro Boogs <jim.cromie@gmail.com> replies to comment #9:

On Tue, Apr 4, 2023 at 8:01 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, Apr 04, 2023 at 07:38:41PM -0600, jim.cromie@gmail.com wrote:
> > On Mon, Apr 3, 2023 at 2:44 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote:
> > > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org>
> wrote:
> > > > >
> > > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote:
> > > > > > hi Luis, etal
> > > > > >
> > > > > > kmemleak is reporting 19 leaks during boot
> > > > > >
> > > > > > because the hexdumps appeared to have module-names,
> > > > > > and Ive been hacking nearby, and see the same names
> > > > > > every time I boot my test-vm, I needed a clearer picture
> > > > > > Jason corroborated and bisected.
> > > > > >
> > > > > > the 19 leaks split into 2 groups,
> > > > > > 9 with names of builtin modules in the hexdump,
> > > > > > all with the same backtrace
> > > > > > 9 without module-names (with a shared backtrace)
> > > > > > +1 wo name-ish and a separate backtrace
> > > > >
> > > > > Song, please take a look.
> > > >
> > > > I will look into this next week.
> > >
> > > I'm thinking this may be it, at least this gets us to what we used to do
> > > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules
> > > support") and right before Song's patch.
> > >
> > > diff --git a/kernel/module/main.c b/kernel/module/main.c
> > > index 6b6da80f363f..3b9c71fa6096 100644
> > > --- a/kernel/module/main.c
> > > +++ b/kernel/module/main.c
> > > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct
> load_info *info)
> > >  		 * which is inside the block. Just mark it as not being a
> > >  		 * leak.
> > >  		 */
> > > -		kmemleak_ignore(ptr);
> > > +		if (type == MOD_INIT_TEXT)
> > > +			kmemleak_ignore(ptr);
> > > +		else
> > > +			kmemleak_not_leak(ptr);
> > >  		if (!ptr) {
> > >  			t = type;
> > >  			goto out_enomem;
> > >
> > > We used to use the grey area for the TEXT but the original commit
> > > doesn't explain too well why we grey out init but not the others. Ie
> > > why kmemleak_ignore() on init and kmemleak_not_leak() on the others.
> > >
> > > Catalinas, any thoughts / suggestions? Should we just stick to
> > > kmemleak_not_leak() for both now?
> > >
> > > Luis
> >
> > So I have mixed results.
> >
> > your patch fixed the 19 leaks on my worktree / branch where I found them.
> >
> > on top of
> > ac3b43283923 module: replace module_layout with module_memory
> >
> > it fixed the (same) 19, but gets a few new ones.
> > whats weird is that once they report, they disappear from
> > /sys/kernel/debug/kmemleak

this disappearing act is still going on.
my script issues no echo clear > kmemleak

>
> I think I missed the MOD_INIT_DATA and MOD_INIT_RODATA. Can you try the
> patch below instead:
>
> From 6890bd43866c40e1b58a832361812cdc5d965e4c Mon Sep 17 00:00:00 2001
> From: Luis Chamberlain <mcgrof@kernel.org>
> Date: Tue, 4 Apr 2023 18:52:47 -0700
> Subject: [PATCH] module: fix kmemleak annotations for non init ELF sections
>
> Commit ac3b43283923 ("module: replace module_layout with module_memory")
> reworked the way to handle memory allocations to make it clearer. But it
> lost in translation how we handle kmemleak_ignore() or kmemleak_not_leak()
> for these sections.
>
> Fix this and clarify the comments a bit more.
>
> Fixes: ac3b43283923 ("module: replace module_layout with module_memory")
> Reported-by: Jim Cromie <jim.cromie@gmail.com>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
> kernel/module/main.c | 20 ++++++++++++++++----
> 1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 5cc21083af04..fe0f3b8fd3a8 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -2233,11 +2233,23 @@ static int move_module(struct module *mod, struct
> load_info *info)
>  		ptr = module_memory_alloc(mod->mem[type].size, type);
>
>  		/*
> -		 * The pointer to this block is stored in the module
> structure
> -		 * which is inside the block. Just mark it as not being a
> -		 * leak.
> +		 * The pointer to these blocks of memory are stored on the
> module
> +		 * structure and we keep that around so long as the module is
> +		 * around. We only free that memory when we unload the
> module.
> +		 * Just mark them as not being a leak then. The .init* ELF
> +		 * sections *do* get freed after boot so we treat them
> slightly
> +		 * differently and only grey them out as they work as typical
> +		 * memory allocations which *do* get eventually get freed.
>  		 */
> -		kmemleak_ignore(ptr);
> +		switch (type) {
> +		case MOD_INIT_TEXT: /* fallthrough */
> +		case MOD_INIT_DATA: /* fallthrough */
> +		case MOD_INIT_RODATA: /* fallthrough */
> +			kmemleak_ignore(ptr);
> +			break;
> +		default:
> +			kmemleak_not_leak(ptr);
> +		}
>  		if (!ptr) {
>  			t = type;
>  			goto out_enomem;
> --
> 2.39.2
>

sorry for the delay,
I was seeing heisen-responses, and several BUGs.
a make clean seems to have settled things mostly.
But in case theres any clues in there, Ive kept the paste-in of 2 BUGs

with
f23cd1ffca4b (HEAD) kmemleaks on ac3b43283923 ("module: replace module_layout with module_memory")
ac3b43283923 module: replace module_layout with module_memory

heres the 1st run. cuz it leaked, I reran in another vm, which got
different results.
I left it overnight doing nothing (laptop slept, vm with it), and it BUG'd on a soft lockup (much later, but the leaktrace does have a timerfd in it) R11 looks poisoned. [ 49.994743] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) finished pass 6 unreferenced object 0xffff888006fe8600 (size 256): comm "(udev-worker)", pid 422, jiffies 4294711998 (age 5.223s) hex dump (first 32 bytes): 00 86 fe 06 80 88 ff ff 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 70 92 a0 80 18 00 00 00 ........p....... backtrace: [<00000000da294bc2>] kmalloc_trace+0x26/0x90 [<00000000af593495>] __do_sys_timerfd_create+0x58/0x190 [<0000000044a7da2f>] do_syscall_64+0x34/0x80 [<0000000013b2114c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 49.994743] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 SMP PREEMPT_DYNAMIC Tue Apr 4 20:05:45 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux :#> cat /sys/kernel/debug/kmemleak :#>:#> [16686.317671] watchdog: BUG: soft lockup - CPU#1 stuck for 85s! 
[kworker/1:0:707] [16686.578795] Modules linked in: crc32_pclmul(E) intel_rapl_common(E) ghash_clmulni_intel(E) crct10dif_pclmul(E) crc32c_intel(E) pcspkr(E) serio_raw(E) i2c_piix4(E) [last unloaded: drm(E)] [16686.579866] CPU: 1 PID: 707 Comm: kworker/1:0 Tainted: G E 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 [16686.580479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 [16686.580988] Workqueue: ata_sff ata_sff_pio_task [16686.581217] RIP: 0010:_raw_spin_unlock_irq+0x11/0x30 [16686.581531] Code: 05 fc b3 23 7e 85 c0 74 01 c3 0f 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e8 46 00 00 00 90 fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 55 c1 3b ff 65 8b 05 c6 b3 23 7e 85 c0 74 01 c3 [16686.582594] RSP: 0018:ffffc9000042fe90 EFLAGS: 00000246 [16686.582899] RAX: 0000000000000000 RBX: ffff888005dcc0b8 RCX: 0000000000000000 [16686.583424] RDX: 0000000000010376 RSI: ffff888005dcc17c RDI: ffff888007b8fe40 [16686.583900] RBP: ffff88807dcb2100 R08: ff6565725e607360 R09: 0000000000000001 [16686.584241] R10: 0000000000000058 R11: fefefefefefefeff R12: ffff88807dcbb500 [16686.584570] R13: 0000000000000000 R14: ffff888008bb20c0 R15: ffff888005dcc0c0 [16686.584891] FS: 0000000000000000(0000) GS:ffff88807dc80000(0000) knlGS:0000000000000000 [16686.585252] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16686.585596] CR2: 000055bbcb66c440 CR3: 000000000938d000 CR4: 0000000000750ee0 [16686.585979] PKRU: 55555554 [16686.586156] Call Trace: [16686.586335] <TASK> [16686.586490] process_one_work+0x1c3/0x3c0 [16686.586824] worker_thread+0x4d/0x380 [16686.587073] ? _raw_spin_lock_irqsave+0x23/0x50 [16686.587380] ? rescuer_thread+0x370/0x370 [16686.587661] kthread+0xe6/0x110 [16686.587881] ? kthread_complete_and_exit+0x20/0x20 [16686.588221] ret_from_fork+0x1f/0x30 [16686.588460] </TASK> using sh-script posted previously, 2nd run went like this: the kmemleak modprobe traces look unchanged from last report, but new rmmod leaks are seen here. 
And kmemleak file is empty after the report. It also BUG'd later on a soft lockup finished pass 15 Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 SMP PREEMPT_DYNAMIC Tue Apr 4 20:05:45 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux [ 2030.329359] ACPI: bus type drm_connector registered [ 2030.686297] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug. [ 2031.600726] [drm] amdgpu kernel modesetting enabled. [ 2031.601205] amdgpu: CRAT table not found [ 2031.601403] amdgpu: Virtual CRAT table created for CPU [ 2031.601797] amdgpu: Topology: Add CPU node real 0m1.725s user 0m0.000s sys 0m0.956s rmmod: ERROR: Module intel_rapl_msr is not currently loaded [ 2032.328701] ACPI: bus type drm_connector unregistered [ 2032.504633] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) finished pass 16 unreferenced object 0xffff88800536eb40 (size 192): comm "modprobe", pid 927, jiffies 4296693079 (age 6.651s) hex dump (first 32 bytes): 00 5b 50 c0 ff ff ff ff 00 00 00 00 00 00 00 00 .[P............. 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ backtrace: [<0000000097c7da82>] __kmalloc+0x49/0x150 [<000000003bcf1708>] __register_sysctl_table+0x51/0x7f0 [<00000000c0b7f00a>] 0xffffffffc04f2a78 [<000000009ea66960>] 0xffffffffc066001b [<000000001412bcff>] do_one_initcall+0x43/0x210 [<00000000c116532d>] do_init_module+0x60/0x240 [<000000001f641d01>] __do_sys_finit_module+0x93/0xf0 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888009e3b800 (size 256): comm "modprobe", pid 927, jiffies 4296693079 (age 6.651s) hex dump (first 32 bytes): 78 b8 e3 09 80 88 ff ff 00 00 00 00 00 00 00 00 x............... 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ 
backtrace: [<0000000097c7da82>] __kmalloc+0x49/0x150 [<00000000a17c20b9>] __register_sysctl_table+0x569/0x7f0 [<00000000c0b7f00a>] 0xffffffffc04f2a78 [<000000009ea66960>] 0xffffffffc066001b [<000000001412bcff>] do_one_initcall+0x43/0x210 [<00000000c116532d>] do_init_module+0x60/0x240 [<000000001f641d01>] __do_sys_finit_module+0x93/0xf0 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888021c363c0 (size 96): comm "rmmod", pid 947, jiffies 4296694434 (age 5.296s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000085923eeb>] kmalloc_trace+0x26/0x90 [<00000000920b2848>] kernfs_fop_open+0x30c/0x390 [<0000000078b60e3b>] do_dentry_open+0x1de/0x410 [<00000000140ea377>] path_openat+0xaa0/0x10a0 [<000000008bcf35a2>] do_filp_open+0xa1/0x130 [<00000000404bfb4b>] do_sys_openat2+0x74/0x130 [<000000009fd3d965>] __x64_sys_openat+0x5c/0x70 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888021c36540 (size 96): comm "rmmod", pid 947, jiffies 4296694434 (age 5.296s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 
backtrace: [<0000000085923eeb>] kmalloc_trace+0x26/0x90 [<00000000920b2848>] kernfs_fop_open+0x30c/0x390 [<0000000078b60e3b>] do_dentry_open+0x1de/0x410 [<00000000140ea377>] path_openat+0xaa0/0x10a0 [<000000008bcf35a2>] do_filp_open+0xa1/0x130 [<00000000404bfb4b>] do_sys_openat2+0x74/0x130 [<000000009fd3d965>] __x64_sys_openat+0x5c/0x70 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888021c36420 (size 96): comm "rmmod", pid 948, jiffies 4296694456 (age 5.274s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000085923eeb>] kmalloc_trace+0x26/0x90 [<00000000920b2848>] kernfs_fop_open+0x30c/0x390 [<0000000078b60e3b>] do_dentry_open+0x1de/0x410 [<00000000140ea377>] path_openat+0xaa0/0x10a0 [<000000008bcf35a2>] do_filp_open+0xa1/0x130 [<00000000404bfb4b>] do_sys_openat2+0x74/0x130 [<000000009fd3d965>] __x64_sys_openat+0x5c/0x70 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888021c368a0 (size 96): comm "rmmod", pid 950, jiffies 4296694518 (age 5.240s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 
backtrace: [<0000000085923eeb>] kmalloc_trace+0x26/0x90 [<00000000920b2848>] kernfs_fop_open+0x30c/0x390 [<0000000078b60e3b>] do_dentry_open+0x1de/0x410 [<00000000140ea377>] path_openat+0xaa0/0x10a0 [<000000008bcf35a2>] do_filp_open+0xa1/0x130 [<00000000404bfb4b>] do_sys_openat2+0x74/0x130 [<000000009fd3d965>] __x64_sys_openat+0x5c/0x70 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888021c36ae0 (size 96): comm "rmmod", pid 950, jiffies 4296694519 (age 5.239s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000085923eeb>] kmalloc_trace+0x26/0x90 [<00000000920b2848>] kernfs_fop_open+0x30c/0x390 [<0000000078b60e3b>] do_dentry_open+0x1de/0x410 [<00000000140ea377>] path_openat+0xaa0/0x10a0 [<000000008bcf35a2>] do_filp_open+0xa1/0x130 [<00000000404bfb4b>] do_sys_openat2+0x74/0x130 [<000000009fd3d965>] __x64_sys_openat+0x5c/0x70 [<0000000034507d8b>] do_syscall_64+0x34/0x80 [<0000000087a8ea8c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 2032.504633] kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 SMP PREEMPT_DYNAMIC Tue Apr 4 20:05:45 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux /sys/kernel/debug/kmemleak is empty, so my reader script gets nothing :#> ./grok_kmemleak -ag all: bless( { 'backtraces' => {}, 'hexdumps' => {}, 'users' => {} }, 'LeakSet' ) mods: bless( { 'backtraces' => {}, 'hexdumps' => {}, 'users' => {} }, 'LeakSet' ) :#> [17772.014943] hrtimer: interrupt took 4628209 ns [17807.172581] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [18017.793102] watchdog: BUG: soft lockup - CPU#2 stuck for 253s! 
[kworker/2:1:420] [18017.793110] Modules linked in: crc32_pclmul(E) intel_rapl_common(E) ghash_clmulni_intel(E) crct10dif_pclmul(E) crc32c_intel(E) pcspkr(E) serio_raw(E) i2c_piix4(E) [last unloaded: drm(E)] [18017.793173] CPU: 2 PID: 420 Comm: kworker/2:1 Tainted: G E 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 [18017.793176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 [18017.793179] Workqueue: events_freezable_power_ disk_events_workfn [18017.793240] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [18017.793266] Code: 04 31 c0 5b c3 0f 1f 44 00 00 31 c0 eb f5 0f 1f 00 0f 1f 44 00 00 e8 f6 05 00 00 90 f7 c6 00 02 00 00 74 06 fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 fd c6 3b ff 65 8b 05 6e b9 23 7e 85 c0 74 01 c3 [18017.793268] RSP: 0018:ffffc9000058bb90 EFLAGS: 00000206 [18017.793269] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000 [18017.793270] RDX: 0000000000000000 RSI: 0000000000000293 RDI: ffff888007f99900 [18017.793270] RBP: 0000000000000293 R08: 0000000000000001 R09: ffffffff82c6c380 [18017.793271] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888006b88000 [18017.793271] R13: ffff888006933000 R14: ffff888006914000 R15: 0000000000000000 [18017.793274] FS: 0000000000000000(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [18017.793275] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [18017.793276] CR2: 000055fa79d48440 CR3: 0000000009f97000 CR4: 0000000000750ee0 [18017.793282] PKRU: 55555554 [18017.793294] Call Trace: [18017.793312] <TASK> [18017.793313] ata_scsi_queuecmd+0x4f/0x70 [18017.793339] scsi_queue_rq+0x36d/0xc20 [18017.793345] blk_mq_dispatch_rq_list+0x2ab/0x830 [18017.793355] ? 
_raw_spin_lock_irqsave+0x23/0x50 [18017.793358] __blk_mq_sched_dispatch_requests+0x9d/0x120 [18017.793363] blk_mq_sched_dispatch_requests+0x30/0x60 [18017.793365] __blk_mq_run_hw_queue+0x85/0xa0 [18017.793366] blk_execute_rq+0x9e/0x190 [18017.793368] scsi_execute_cmd+0xfd/0x2d0 [18017.793370] sr_check_events+0xc1/0x2b0 [18017.793374] ? finish_task_switch.isra.0+0x9b/0x2f0 [18017.793394] cdrom_check_events+0x14/0x30 [18017.793403] disk_check_events+0x34/0xf0 [18017.793405] process_one_work+0x1c3/0x3c0 [18017.793408] worker_thread+0x4d/0x380 [18017.793409] ? _raw_spin_lock_irqsave+0x23/0x50 [18017.793411] ? rescuer_thread+0x370/0x370 [18017.793412] kthread+0xe6/0x110 [18017.793415] ? kthread_complete_and_exit+0x20/0x20 [18017.793417] ret_from_fork+0x1f/0x30 [18017.793428] </TASK> [18020.504528] rcu: 0-...!: (1 GPs behind) idle=753c/0/0x1 softirq=100444/100444 fqs=470 [18020.504962] rcu: (detected by 0, t=273814 jiffies, g=67733, q=11 ncpus=3) [18020.505425] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G EL 6.3.0-rc1-f2-00002-gf23cd1ffca4b #361 [18020.506002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 [18020.506499] RIP: 0010:pv_native_safe_halt+0xb/0x10 [18020.506848] Code: c3 0f 23 f6 c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 eb 07 0f 00 2d 6f 69 24 00 fb f4 <c3> cc cc cc cc 8b 17 48 89 fe 89 d7 83 e7 fe 0f 01 f9 66 90 48 c1 [18020.508055] RSP: 0018:ffffffff82c03e98 EFLAGS: 00000282 [18020.508376] RAX: 0000000000000000 RBX: ffffffff82c0f2c0 RCX: 0000000000000001 [18020.508824] RDX: 4000000000000000 RSI: ffffffff824c92ca RDI: ffffffff82486ff1 [18020.509199] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [18020.509532] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [18020.509897] R13: 0000000000000000 R14: ffffffff82c0ea10 R15: 0000000000000000 [18020.510375] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 [18020.510901] CS: 0010 
DS: 0000 ES: 0000 CR0: 0000000080050033 [18020.511251] CR2: 00007f1773c1e738 CR3: 0000000002c32000 CR4: 0000000000750ef0 [18020.511705] PKRU: 55555554 [18020.511884] Call Trace: [18020.512076] <TASK> [18020.512183] default_idle+0x5/0x10 [18020.512353] default_idle_call+0x26/0xd0 [18020.512608] do_idle+0x1cb/0x220 [18020.512832] cpu_startup_entry+0x19/0x20 [18020.513098] rest_init+0xcb/0xd0 [18020.513261] arch_call_rest_init+0xa/0x20 [18020.513477] start_kernel+0x734/0xb40 [18020.513653] ? load_ucode_bsp+0x68/0x180 [18020.513847] secondary_startup_64_no_verify+0xe5/0xeb [18020.514089] </TASK> :#> Im not sure when I did the make clean, maybe here. it'd be a 'clean' explanation of the BUG struff. I havent seen any today so with some minor tweaks to my mod-load-unload leak_drive, (to use different modules as workload, in case anything tickles different) cycling on pcspkr, I get doing cycle_ pcspkr [ 65.163494] input: PC Speaker as /devices/platform/pcspkr/input/input24 [ 65.335487] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) finished pass 21 unreferenced object 0xffff8880097a1120 (size 96): comm "bash", pid 412, jiffies 4294727350 (age 5.240s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 
backtrace: [<000000006a558d73>] kmalloc_trace+0x26/0x90 [<0000000010830fb6>] kernfs_fop_open+0x30c/0x390 [<000000002c6acb11>] do_dentry_open+0x1de/0x410 [<00000000debed23e>] path_openat+0xaa0/0x10a0 [<00000000f619d9cf>] do_filp_open+0xa1/0x130 [<00000000b6a4c64d>] do_sys_openat2+0x74/0x130 [<0000000007cd46d3>] __x64_sys_openat+0x5c/0x70 [<0000000062074f8d>] do_syscall_64+0x34/0x80 [<0000000044b0764d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 65.335487] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #362 SMP PREEMPT_DYNAMIC Wed Apr 5 09:26:03 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux :#> that leaktrace repeated verbatim when cycling on test_dynamic_debug the leaktrace cycling drm modules is like I posted previously, but repeated here, since its a new patch finished pass 4 Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #362 SMP PREEMPT_DYNAMIC Wed Apr 5 09:26:03 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux [ 305.519903] ACPI: bus type drm_connector registered [ 305.768467] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug. [ 306.487182] [drm] amdgpu kernel modesetting enabled. [ 306.487838] amdgpu: CRAT table not found [ 306.488124] amdgpu: Virtual CRAT table created for CPU [ 306.488585] amdgpu: Topology: Add CPU node real 0m1.258s user 0m0.002s sys 0m0.876s rmmod: ERROR: Module intel_rapl_msr is not currently loaded [ 307.066093] ACPI: bus type drm_connector unregistered [ 307.215619] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) finished pass 5 unreferenced object 0xffff888007f87180 (size 192): comm "modprobe", pid 504, jiffies 4294969152 (age 5.317s) hex dump (first 32 bytes): 00 fb 50 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..P............. 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ 
backtrace: [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<00000000f918d102>] __register_sysctl_table+0x51/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888008a3c800 (size 256): comm "modprobe", pid 504, jiffies 4294969152 (age 5.317s) hex dump (first 32 bytes): 78 c8 a3 08 80 88 ff ff 00 00 00 00 00 00 00 00 x............... 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ backtrace: [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<000000001a756730>] __register_sysctl_table+0x569/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 307.215619] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) Linux (none) 6.3.0-rc1-f2-00002-gf23cd1ffca4b #362 SMP PREEMPT_DYNAMIC Wed Apr 5 09:26:03 MDT 2023 x86_64 x86_64 x86_64 GNU/Linux expanding those traces (cuz I scripted it) :#> ./grok_kmemleak -agp paste in your selected input: - ie pasted from above unreferenced object 0xffff888007f87180 (size 192): comm "modprobe", pid 504, jiffies 4294969152 (age 5.317s) hex dump (first 32 bytes): 00 fb 50 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..P............. 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ 
backtrace: [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<00000000f918d102>] __register_sysctl_table+0x51/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888008a3c800 (size 256): comm "modprobe", pid 504, jiffies 4294969152 (age 5.317s) hex dump (first 32 bytes): 78 c8 a3 08 80 88 ff ff 00 00 00 00 00 00 00 00 x............... 00 00 00 00 00 00 00 00 ea ff ff ff ff ff ff ff ................ backtrace: [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<000000001a756730>] __register_sysctl_table+0x569/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 all: bless( { 'backtraces' => { '[<00000000f63d45f6>] __kmalloc+0x49/0x150 [<000000001a756730>] __register_sysctl_table+0x569/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 1, '[<00000000f63d45f6>] __kmalloc+0x49/0x150 [<00000000f918d102>] __register_sysctl_table+0x51/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] 
do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0' => 1 }, 'hexdumps' => { '00 fb 50 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..P.............' => 1, '78 c8 a3 08 80 88 ff ff 00 00 00 00 00 00 00 00 x...............' => 1 }, 'users' => { 'comm "modprobe", pid 504,' => 2 } }, 'LeakSet' ) # this leak backtrace has 1 occurrences [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<00000000f918d102>] __register_sysctl_table+0x51/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 doing: gdb -q -ex 'l *(__kmalloc+0x49)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81386319 is in __kmalloc (/home/jimc/projects/lx/wk-next/mm/slab_common.c:966). 961 s = kmalloc_slab(size, flags); 962 963 if (unlikely(ZERO_OR_NULL_PTR(s))) 964 return s; 965 966 ret = __kmem_cache_alloc_node(s, flags, node, size, caller); 967 ret = kasan_kmalloc(s, ret, size, flags); 968 trace_kmalloc(caller, ret, size, s->size, flags, node); 969 return ret; 970 } doing: gdb -q -ex 'l *(__register_sysctl_table+0x51)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff814e5991 is in __register_sysctl_table (/home/jimc/projects/lx/wk-next/include/linux/slab.h:584). 579 index = kmalloc_index(size); 580 return kmalloc_trace( 581 kmalloc_caches[kmalloc_type(flags)][index], 582 flags, size); 583 } 584 return __kmalloc(size, flags); 585 } 586 #else 587 static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags) 588 { doing: gdb -q -ex 'l *(0xffffffffc04fca78)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 
doing: gdb -q -ex 'l *(0xffffffffc038601b)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... doing: gdb -q -ex 'l *(do_one_initcall+0x43)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81001093 is in do_one_initcall (/home/jimc/projects/lx/wk-next/init/main.c:1306). 1301 1302 if (initcall_blacklisted(fn)) 1303 return -EPERM; 1304 1305 do_trace_initcall_start(fn); 1306 ret = fn(); 1307 do_trace_initcall_finish(fn, ret); 1308 1309 msgbuf[0] = 0; 1310 doing: gdb -q -ex 'l *(do_init_module+0x60)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff812287c0 is in do_init_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2481). 2476 freeinit->init_rodata = mod->mem[MOD_INIT_RODATA].base; 2477 2478 do_mod_ctors(mod); 2479 /* Start the module */ 2480 if (mod->init != NULL) 2481 ret = do_one_initcall(mod->init); 2482 if (ret < 0) { 2483 goto fail_free_freeinit; 2484 } 2485 if (ret > 0) { doing: gdb -q -ex 'l *(__do_sys_finit_module+0x93)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff8122af73 is in __do_sys_finit_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2987). 2982 } else { 2983 info.hdr = buf; 2984 info.len = len; 2985 } 2986 2987 return load_module(&info, uargs, flags); 2988 } 2989 2990 static inline int within(unsigned long addr, void *start, unsigned long size) 2991 { doing: gdb -q -ex 'l *(do_syscall_64+0x34)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81ddad64 is in do_syscall_64 (/home/jimc/projects/lx/wk-next/arch/x86/entry/common.c:50). 
45 */ 46 unsigned int unr = nr; 47 48 if (likely(unr < NR_syscalls)) { 49 unr = array_index_nospec(unr, NR_syscalls); 50 regs->ax = sys_call_table[unr](regs); 51 return true; 52 } 53 return false; 54 } doing: gdb -q -ex 'l *(entry_SYSCALL_64_after_hwframe+0x46)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81e0006a is at /home/jimc/projects/lx/wk-next/arch/x86/entry/entry_64.S:120. 115 116 /* clobbers %rax, make sure it is after saving the syscall nr */ 117 IBRS_ENTER 118 UNTRAIN_RET 119 120 call do_syscall_64 /* returns with IRQs disabled */ 121 122 /* 123 * Try to use SYSRET instead of IRET if we're returning to 124 * a completely clean 64-bit userspace context. If we're not, # this leak backtrace has 1 occurrences [<00000000f63d45f6>] __kmalloc+0x49/0x150 [<000000001a756730>] __register_sysctl_table+0x569/0x7f0 [<000000002fbfef58>] 0xffffffffc04fca78 [<000000001dcc6780>] 0xffffffffc038601b [<00000000d82d83ab>] do_one_initcall+0x43/0x210 [<000000002ef9c020>] do_init_module+0x60/0x240 [<00000000859f64f2>] __do_sys_finit_module+0x93/0xf0 [<000000006b72a46f>] do_syscall_64+0x34/0x80 [<000000009dda0f8e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 doing: gdb -q -ex 'l *(__kmalloc+0x49)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81386319 is in __kmalloc (/home/jimc/projects/lx/wk-next/mm/slab_common.c:966). 961 s = kmalloc_slab(size, flags); 962 963 if (unlikely(ZERO_OR_NULL_PTR(s))) 964 return s; 965 966 ret = __kmem_cache_alloc_node(s, flags, node, size, caller); 967 ret = kasan_kmalloc(s, ret, size, flags); 968 trace_kmalloc(caller, ret, size, s->size, flags, node); 969 return ret; 970 } doing: gdb -q -ex 'l *(__register_sysctl_table+0x569)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff814e5ea9 is in __register_sysctl_table (/home/jimc/projects/lx/wk-next/fs/proc/proc_sysctl.c:974). 
969 char *new_name; 970 971 new = kzalloc(sizeof(*new) + sizeof(struct ctl_node) + 972 sizeof(struct ctl_table)*2 + namelen + 1, 973 GFP_KERNEL); 974 if (!new) 975 return NULL; 976 977 node = (struct ctl_node *)(new + 1); 978 table = (struct ctl_table *)(node + 1); doing: gdb -q -ex 'l *(0xffffffffc04fca78)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... doing: gdb -q -ex 'l *(0xffffffffc038601b)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... doing: gdb -q -ex 'l *(do_one_initcall+0x43)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff81001093 is in do_one_initcall (/home/jimc/projects/lx/wk-next/init/main.c:1306). 1301 1302 if (initcall_blacklisted(fn)) 1303 return -EPERM; 1304 1305 do_trace_initcall_start(fn); 1306 ret = fn(); 1307 do_trace_initcall_finish(fn, ret); 1308 1309 msgbuf[0] = 0; 1310 doing: gdb -q -ex 'l *(do_init_module+0x60)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff812287c0 is in do_init_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2481). 2476 freeinit->init_rodata = mod->mem[MOD_INIT_RODATA].base; 2477 2478 do_mod_ctors(mod); 2479 /* Start the module */ 2480 if (mod->init != NULL) 2481 ret = do_one_initcall(mod->init); 2482 if (ret < 0) { 2483 goto fail_free_freeinit; 2484 } 2485 if (ret > 0) { doing: gdb -q -ex 'l *(__do_sys_finit_module+0x93)' -ex quit wk-next/builds/f2/vmlinux Reading symbols from wk-next/builds/f2/vmlinux... 0xffffffff8122af73 is in __do_sys_finit_module (/home/jimc/projects/lx/wk-next/kernel/module/main.c:2987). 
2982 } else {
2983 info.hdr = buf;
2984 info.len = len;
2985 }
2986
2987 return load_module(&info, uargs, flags);
2988 }
2989
2990 static inline int within(unsigned long addr, void *start, unsigned long size)
2991 {
doing: gdb -q -ex 'l *(do_syscall_64+0x34)' -ex quit wk-next/builds/f2/vmlinux
Reading symbols from wk-next/builds/f2/vmlinux...
0xffffffff81ddad64 is in do_syscall_64 (/home/jimc/projects/lx/wk-next/arch/x86/entry/common.c:50).
45 */
46 unsigned int unr = nr;
47
48 if (likely(unr < NR_syscalls)) {
49 unr = array_index_nospec(unr, NR_syscalls);
50 regs->ax = sys_call_table[unr](regs);
51 return true;
52 }
53 return false;
54 }
doing: gdb -q -ex 'l *(entry_SYSCALL_64_after_hwframe+0x46)' -ex quit wk-next/builds/f2/vmlinux
Reading symbols from wk-next/builds/f2/vmlinux...
0xffffffff81e0006a is at /home/jimc/projects/lx/wk-next/arch/x86/entry/entry_64.S:120.
115
116 /* clobbers %rax, make sure it is after saving the syscall nr */
117 IBRS_ENTER
118 UNTRAIN_RET
119
120 call do_syscall_64 /* returns with IRQs disabled */
121
122 /*
123 * Try to use SYSRET instead of IRET if we're returning to
124 * a completely clean 64-bit userspace context. If we're not,
mods: bless( { 'backtraces' => {}, 'hexdumps' => {}, 'users' => {} }, 'LeakSet' )
:#>
(via https://msgid.link/CAJfuBxw7F5yN=F=i_0ZZ0b2EpWU4T=RXaf13qG9XVq6tG-EGJQ@mail.gmail.com)

Luis Chamberlain <mcgrof@kernel.org> replies to comment #10:

On Wed, Apr 05, 2023 at 09:14:12PM -0600, jim.cromie@gmail.com wrote:
> On Tue, Apr 4, 2023 at 8:01 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Tue, Apr 04, 2023 at 07:38:41PM -0600, jim.cromie@gmail.com wrote:
> > > On Mon, Apr 3, 2023 at 2:44 PM Luis Chamberlain <mcgrof@kernel.org> wrote:

<-- my old patch -->

> this disappearing act is still going on.
> my script issues no echo clear > kmemleak

So this email is a bit confusing, because you comment here before the new patch.
<-- my new switch statement kmemleak patch to fix the reported leak -->

> sorry for the delay, I was seeing heisen-responses, and several BUGs.
> a make clean seems to have settled things mostly.

And now here you comment after the new suggested patch and say it seems
to have mostly fixed things.

> But in case theres any clues in there,

In where?

> Ive kept the paste-in of 2 BUGs
>
> with
> f23cd1ffca4b (HEAD) kmemleaks on ac3b43283923 ("module: replace
> module_layout with module_memory")
> ac3b43283923 module: replace module_layout with module_memory

The only commit here that makes sense to me is ac3b43283923 ("module:
replace module_layout with module_memory"). Commit f23cd1ffca4b means
absolutely nothing to me. I can only guess you mean that you've applied my
suggested changes with the new switch statement?

> heres the 1st run. cuz it leaked, I reran in another vm, which got
> different results.
> I left it overnight doing nothing (laptop slept, vm with it),
> and it BUG'd on a soft lockup
> (much later, but the leaktrace does have a timerfd in it)
> R11 looks poisoned.

<-- some unrelated leak I think -->

> using sh-script posted previously,

I don't recall what that sh-script was.

<-- snip some leaks -->

> Im not sure when I did the make clean, maybe here.
> it'd be a 'clean' explanation of the BUG struff.
> I havent seen any today

OK great.

<-- snip some long traces-->

I don't get these long traces if you didn't see any then.
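
For anyone following along, the switch-statement patch mentioned above boils down to a per-type choice between two kmemleak annotations. Below is a minimal userspace sketch of that choice, not kernel code. The three MOD_INIT_* names come from the patch; the non-init enum entries, their ordering, `enum annotation`, `annotation_for()` and `still_scanned()` are hypothetical names invented for illustration. The colour semantics in the comments follow the kmemleak documentation as I understand it: kmemleak_not_leak() marks an object "grey" (never reported, still scanned for pointers), while kmemleak_ignore() marks it "black" (never reported and not scanned).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * MOD_INIT_* names are taken from the patch in this thread. The
 * non-init entries and the ordering are assumptions for this sketch.
 */
enum mod_mem_type {
	MOD_TEXT,
	MOD_DATA,
	MOD_RODATA,
	MOD_INIT_TEXT,
	MOD_INIT_DATA,
	MOD_INIT_RODATA,
	MOD_MEM_NUM_TYPES,
};

/* Hypothetical stand-ins for the two kmemleak calls in the patch. */
enum annotation {
	ANNOT_NOT_LEAK,	/* kmemleak_not_leak(): grey, never reported, still scanned */
	ANNOT_IGNORE,	/* kmemleak_ignore(): black, never reported, not scanned */
};

/* Mirrors the switch in the proposed patch: .init* types get ignore. */
static enum annotation annotation_for(enum mod_mem_type type)
{
	assert((int)type >= 0 && type < MOD_MEM_NUM_TYPES);

	switch (type) {
	case MOD_INIT_TEXT:
	case MOD_INIT_DATA:
	case MOD_INIT_RODATA:
		return ANNOT_IGNORE;
	default:
		return ANNOT_NOT_LEAK;
	}
}

/* Only grey objects are walked when kmemleak looks for references. */
static bool still_scanned(enum annotation a)
{
	return a == ANNOT_NOT_LEAK;
}
```

With this model, annotation_for(MOD_DATA) is ANNOT_NOT_LEAK, so pointers stored in a module's data section keep the objects they reference from being reported, whereas the .init* blocks are neither reported nor scanned.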
Luis (via https://msgid.link/ZC8jQmbie6RWVyXo@bombadil.infradead.org) Catalin Marinas <catalin.marinas@arm.com> replies to comment #6: On Mon, Apr 03, 2023 at 01:43:58PM -0700, Luis Chamberlain wrote: > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote: > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> > wrote: > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote: > > > > kmemleak is reporting 19 leaks during boot > > > > > > > > because the hexdumps appeared to have module-names, > > > > and Ive been hacking nearby, and see the same names > > > > every time I boot my test-vm, I needed a clearer picture > > > > Jason corroborated and bisected. > > > > > > > > the 19 leaks split into 2 groups, > > > > 9 with names of builtin modules in the hexdump, > > > > all with the same backtrace > > > > 9 without module-names (with a shared backtrace) > > > > +1 wo name-ish and a separate backtrace > > > > > > Song, please take a look. > > > > I will look into this next week. > > I'm thinking this may be it, at least this gets us to what we used to do > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules > support") and right before Song's patch. > > diff --git a/kernel/module/main.c b/kernel/module/main.c > index 6b6da80f363f..3b9c71fa6096 100644 > --- a/kernel/module/main.c > +++ b/kernel/module/main.c > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct > load_info *info) > * which is inside the block. Just mark it as not being a > * leak. > */ > - kmemleak_ignore(ptr); > + if (type == MOD_INIT_TEXT) > + kmemleak_ignore(ptr); > + else > + kmemleak_not_leak(ptr); > if (!ptr) { > t = type; > goto out_enomem; > > We used to use the grey area for the TEXT but the original commit > doesn't explain too well why we grey out init but not the others. Ie > why kmemleak_ignore() on init and kmemleak_not_leak() on the others. It's safe to use the 'grey' colour in all cases. 
For text sections that don't need scanning, there's a slight chance of increasing the false negatives, so marking it 'black' ignores the scanning. For the init section, if it gets discarded anyway, just going with kmemleak_not_leak() is fine. It simplifies the logic above. (via https://msgid.link/ZDV4YGjRpuqcI7F3@arm.com) Luis Chamberlain <mcgrof@kernel.org> replies to comment #12: On Tue, Apr 11, 2023 at 04:10:24PM +0100, Catalin Marinas wrote: > On Mon, Apr 03, 2023 at 01:43:58PM -0700, Luis Chamberlain wrote: > > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote: > > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> > wrote: > > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote: > > > > > kmemleak is reporting 19 leaks during boot > > > > > > > > > > because the hexdumps appeared to have module-names, > > > > > and Ive been hacking nearby, and see the same names > > > > > every time I boot my test-vm, I needed a clearer picture > > > > > Jason corroborated and bisected. > > > > > > > > > > the 19 leaks split into 2 groups, > > > > > 9 with names of builtin modules in the hexdump, > > > > > all with the same backtrace > > > > > 9 without module-names (with a shared backtrace) > > > > > +1 wo name-ish and a separate backtrace > > > > > > > > Song, please take a look. > > > > > > I will look into this next week. > > > > I'm thinking this may be it, at least this gets us to what we used to do > > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules > > support") and right before Song's patch. > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > index 6b6da80f363f..3b9c71fa6096 100644 > > --- a/kernel/module/main.c > > +++ b/kernel/module/main.c > > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct > load_info *info) > > * which is inside the block. Just mark it as not being a > > * leak. 
> > */ > > - kmemleak_ignore(ptr); > > + if (type == MOD_INIT_TEXT) > > + kmemleak_ignore(ptr); > > + else > > + kmemleak_not_leak(ptr); > > if (!ptr) { > > t = type; > > goto out_enomem; > > > > We used to use the grey area for the TEXT but the original commit > > doesn't explain too well why we grey out init but not the others. Ie > > why kmemleak_ignore() on init and kmemleak_not_leak() on the others. > > It's safe to use the 'grey' colour in all cases. For text sections that > don't need scanning, there's a slight chance of increasing the false > negatives, It turns out that there are *tons* of false positives today, unless these are real leaks. > so marking it 'black' ignores the scanning. For the init > section, if it gets discarded anyway, just going with > kmemleak_not_leak() is fine. It simplifies the logic above. Luis (via https://msgid.link/ZDWT6UoWshTUBU+u@bombadil.infradead.org) Luis Chamberlain <mcgrof@kernel.org> replies to comment #13: On Tue, Apr 11, 2023 at 10:07:53AM -0700, Luis Chamberlain wrote: > On Tue, Apr 11, 2023 at 04:10:24PM +0100, Catalin Marinas wrote: > > On Mon, Apr 03, 2023 at 01:43:58PM -0700, Luis Chamberlain wrote: > > > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote: > > > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> > wrote: > > > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com wrote: > > > > > > kmemleak is reporting 19 leaks during boot > > > > > > > > > > > > because the hexdumps appeared to have module-names, > > > > > > and Ive been hacking nearby, and see the same names > > > > > > every time I boot my test-vm, I needed a clearer picture > > > > > > Jason corroborated and bisected. 
> > > > > > > > > > > > the 19 leaks split into 2 groups, > > > > > > 9 with names of builtin modules in the hexdump, > > > > > > all with the same backtrace > > > > > > 9 without module-names (with a shared backtrace) > > > > > > +1 wo name-ish and a separate backtrace > > > > > > > > > > Song, please take a look. > > > > > > > > I will look into this next week. > > > > > > I'm thinking this may be it, at least this gets us to what we used to do > > > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules > > > support") and right before Song's patch. > > > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > > index 6b6da80f363f..3b9c71fa6096 100644 > > > --- a/kernel/module/main.c > > > +++ b/kernel/module/main.c > > > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, struct > load_info *info) > > > * which is inside the block. Just mark it as not being a > > > * leak. > > > */ > > > - kmemleak_ignore(ptr); > > > + if (type == MOD_INIT_TEXT) > > > + kmemleak_ignore(ptr); > > > + else > > > + kmemleak_not_leak(ptr); > > > if (!ptr) { > > > t = type; > > > goto out_enomem; > > > > > > We used to use the grey area for the TEXT but the original commit > > > doesn't explain too well why we grey out init but not the others. Ie > > > why kmemleak_ignore() on init and kmemleak_not_leak() on the others. > > > > It's safe to use the 'grey' colour in all cases. For text sections that > > don't need scanning, there's a slight chance of increasing the false > > negatives, > > It turns out that there are *tons* of false positives today, unless > these are real leaks. I should clarify: *if* we leave things as-is, we seem to get tons of false positives. 
Luis (via https://msgid.link/ZDXmq1B2W0h2rrYW@bombadil.infradead.org) Catalin Marinas <catalin.marinas@arm.com> replies to comment #14: On Tue, Apr 11, 2023 at 04:00:59PM -0700, Luis Chamberlain wrote: > On Tue, Apr 11, 2023 at 10:07:53AM -0700, Luis Chamberlain wrote: > > On Tue, Apr 11, 2023 at 04:10:24PM +0100, Catalin Marinas wrote: > > > On Mon, Apr 03, 2023 at 01:43:58PM -0700, Luis Chamberlain wrote: > > > > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote: > > > > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain <mcgrof@kernel.org> > wrote: > > > > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com > wrote: > > > > > > > kmemleak is reporting 19 leaks during boot > > > > > > > > > > > > > > because the hexdumps appeared to have module-names, > > > > > > > and Ive been hacking nearby, and see the same names > > > > > > > every time I boot my test-vm, I needed a clearer picture > > > > > > > Jason corroborated and bisected. > > > > > > > > > > > > > > the 19 leaks split into 2 groups, > > > > > > > 9 with names of builtin modules in the hexdump, > > > > > > > all with the same backtrace > > > > > > > 9 without module-names (with a shared backtrace) > > > > > > > +1 wo name-ish and a separate backtrace > > > > > > > > > > > > Song, please take a look. > > > > > > > > > > I will look into this next week. > > > > > > > > I'm thinking this may be it, at least this gets us to what we used to > do > > > > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules > > > > support") and right before Song's patch. > > > > > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > > > index 6b6da80f363f..3b9c71fa6096 100644 > > > > --- a/kernel/module/main.c > > > > +++ b/kernel/module/main.c > > > > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, > struct load_info *info) > > > > * which is inside the block. Just mark it as not being > a > > > > * leak. 
> > > > */ > > > > - kmemleak_ignore(ptr); > > > > + if (type == MOD_INIT_TEXT) > > > > + kmemleak_ignore(ptr); > > > > + else > > > > + kmemleak_not_leak(ptr); > > > > if (!ptr) { > > > > t = type; > > > > goto out_enomem; > > > > > > > > We used to use the grey area for the TEXT but the original commit > > > > doesn't explain too well why we grey out init but not the others. Ie > > > > why kmemleak_ignore() on init and kmemleak_not_leak() on the others. > > > > > > It's safe to use the 'grey' colour in all cases. For text sections that > > > don't need scanning, there's a slight chance of increasing the false > > > negatives, > > > > It turns out that there are *tons* of false positives today, unless > > these are real leaks. > > I should clarify: *if* we leave things as-is, we seem to get tons of > false positives. Which makes sense if kmemleak_ignore() is used, such objects would not be scanned. I'd just replace it with kmemleak_not_leak() irrespective of the type. (via https://msgid.link/ZDZ/oLGXa9DnIWbL@arm.com) Luis Chamberlain <mcgrof@kernel.org> replies to comment #15: On Wed, Apr 12, 2023 at 10:53:36AM +0100, Catalin Marinas wrote: > On Tue, Apr 11, 2023 at 04:00:59PM -0700, Luis Chamberlain wrote: > > On Tue, Apr 11, 2023 at 10:07:53AM -0700, Luis Chamberlain wrote: > > > On Tue, Apr 11, 2023 at 04:10:24PM +0100, Catalin Marinas wrote: > > > > On Mon, Apr 03, 2023 at 01:43:58PM -0700, Luis Chamberlain wrote: > > > > > On Fri, Mar 31, 2023 at 05:27:04PM -0700, Song Liu wrote: > > > > > > On Fri, Mar 31, 2023 at 12:00 AM Luis Chamberlain > <mcgrof@kernel.org> wrote: > > > > > > > On Thu, Mar 30, 2023 at 04:45:43PM -0600, jim.cromie@gmail.com > wrote: > > > > > > > > kmemleak is reporting 19 leaks during boot > > > > > > > > > > > > > > > > because the hexdumps appeared to have module-names, > > > > > > > > and Ive been hacking nearby, and see the same names > > > > > > > > every time I boot my test-vm, I needed a clearer picture > > > > > > > > Jason 
corroborated and bisected. > > > > > > > > > > > > > > > > the 19 leaks split into 2 groups, > > > > > > > > 9 with names of builtin modules in the hexdump, > > > > > > > > all with the same backtrace > > > > > > > > 9 without module-names (with a shared backtrace) > > > > > > > > +1 wo name-ish and a separate backtrace > > > > > > > > > > > > > > Song, please take a look. > > > > > > > > > > > > I will look into this next week. > > > > > > > > > > I'm thinking this may be it, at least this gets us to what we used to > do > > > > > as per original Catalinas' 4f2294b6dc88d ("kmemleak: Add modules > > > > > support") and right before Song's patch. > > > > > > > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > > > > index 6b6da80f363f..3b9c71fa6096 100644 > > > > > --- a/kernel/module/main.c > > > > > +++ b/kernel/module/main.c > > > > > @@ -2240,7 +2240,10 @@ static int move_module(struct module *mod, > struct load_info *info) > > > > > * which is inside the block. Just mark it as not being > a > > > > > * leak. > > > > > */ > > > > > - kmemleak_ignore(ptr); > > > > > + if (type == MOD_INIT_TEXT) > > > > > + kmemleak_ignore(ptr); > > > > > + else > > > > > + kmemleak_not_leak(ptr); > > > > > if (!ptr) { > > > > > t = type; > > > > > goto out_enomem; > > > > > > > > > > We used to use the grey area for the TEXT but the original commit > > > > > doesn't explain too well why we grey out init but not the others. Ie > > > > > why kmemleak_ignore() on init and kmemleak_not_leak() on the others. > > > > > > > > It's safe to use the 'grey' colour in all cases. For text sections that > > > > don't need scanning, there's a slight chance of increasing the false > > > > negatives, > > > > > > It turns out that there are *tons* of false positives today, unless > > > these are real leaks. > > > > I should clarify: *if* we leave things as-is, we seem to get tons of > > false positives. 
> > Which makes sense if kmemleak_ignore() is used, such objects would not > be scanned. I'd just replace it with kmemleak_not_leak() irrespective of > the type. OK I'll do that and add a Suggested-by you :) Luis (via https://msgid.link/ZDbovXtyHPTMgUO6%40bombadil.infradead.org)
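For readers following the grey vs. black discussion above, here is a small userspace sketch of why a block marked with kmemleak_ignore() can produce false positives while kmemleak_not_leak() does not. It is a toy model only, not kernel code: `struct object`, `model_ignore()`, `model_not_leak()`, `scan()` and `count_leaks()` are hypothetical names invented for this illustration. The only behaviour assumed from kmemleak is what the thread states: a 'black' object is neither reported nor scanned, and a 'grey' object is not reported but is still scanned for pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of kmemleak object colours. Hypothetical, not the kernel code. */
enum colour { WHITE, GREY, BLACK };

#define MAX_REFS 4

struct object {
	enum colour colour;            /* WHITE: candidate leak, GREY/BLACK: never reported */
	int reached;                   /* set when a scanned object points at this one */
	struct object *refs[MAX_REFS]; /* pointers stored inside this object */
};

/* Models kmemleak_not_leak(): never reported, but still scanned. */
static void model_not_leak(struct object *o) { o->colour = GREY; }

/* Models kmemleak_ignore(): never reported and never scanned. */
static void model_ignore(struct object *o) { o->colour = BLACK; }

/* Follow the pointers stored in a scanned object; black targets are skipped. */
static void scan(struct object *o)
{
	for (size_t i = 0; i < MAX_REFS; i++) {
		struct object *t = o->refs[i];

		if (!t || t->reached || t->colour == BLACK)
			continue;
		t->reached = 1;
		scan(t);
	}
}

/* One scan pass: start from grey objects, then count unreached white ones. */
static size_t count_leaks(struct object **objs, size_t n)
{
	size_t leaks = 0;

	assert(objs != NULL);
	for (size_t i = 0; i < n; i++)
		objs[i]->reached = 0;
	for (size_t i = 0; i < n; i++)
		if (objs[i]->colour == GREY)
			scan(objs[i]);
	for (size_t i = 0; i < n; i++)
		if (objs[i]->colour == WHITE && !objs[i]->reached)
			leaks++;
	return leaks;
}
```

In this model, if a module memory block holds the only pointer to some allocation, calling `model_ignore()` on the block leaves the allocation unreached, so `count_leaks()` reports it. Calling `model_not_leak()` on the same block lets the scan find the pointer and the report goes away, which matches the suggestion above to use kmemleak_not_leak() irrespective of the type.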