Bug 217943

Summary: kmalloc memory leak over time.
Product: Linux
Reporter: freeze0985
Component: Kernel
Assignee: Virtual assignee for kernel bugs (linux-kernel)
Status: NEW
Severity: high
CC: bagasdotme, freeze0985, regressions
Priority: P3
Hardware: All
OS: Linux
Kernel Version: 6.5
Subsystem:
Regression: No
Bisected commit-id:

Description freeze0985 2023-09-24 07:24:37 UTC
Since the first week of September I have been observing a memory leak on my system. After a little bit of digging I found that the leak is caused by kmalloc. On Linux 6.5.3 the leaked memory would grow to nearly 50% of my RAM over a period of 6-9 hours. On the newer Linux 6.5.4 I have yet to observe that much leakage (I haven't used my laptop for that long yet), but after only 3 hours of usage I see 2.2GB reserved that is not used by any program at all. Slabtop link: https://i.imgur.com/8OFw2Fa.png
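
For readers who cannot open the screenshot, the same ranking that slabtop shows can be approximated from text. What follows is a minimal Python sketch, assuming the /proc/slabinfo version 2.1 column layout (name, active_objs, num_objs, objsize, ...). The function names and the sample rows are hypothetical and are not data from this report.

```python
# Hypothetical sample in /proc/slabinfo (version 2.1) layout; NOT from this bug.
SAMPLE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kmalloc-512         1200   1280    512   32    4 : tunables    0    0    0 : slabdata     40     40      0
kmalloc-64         20000  20480     64   64    1 : tunables    0    0    0 : slabdata    320    320      0
dentry              5000   5040    192   21    1 : tunables    0    0    0 : slabdata    240    240      0
"""


def slab_usage(slabinfo_text):
    """Return {cache_name: approx_bytes}, computed as num_objs * objsize."""
    usage = {}
    for line in slabinfo_text.splitlines():
        # Skip the version banner, the column header, and blank lines.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize
    return usage


def top_caches(usage, n=10):
    """Return the n largest caches as (name, bytes) pairs, biggest first."""
    return sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    for name, size in top_caches(slab_usage(SAMPLE), 3):
        print(f"{name:<16} {size / 1024:.0f} KiB")
```

On a real system, pass the contents of /proc/slabinfo (reading it usually requires root) instead of SAMPLE, and compare two runs taken a few hours apart to see which caches keep growing.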
Comment 1 Bagas Sanjaya 2023-09-24 10:54:38 UTC
(In reply to freeze0985 from comment #0)

Do you have this issue on v6.1?
Comment 2 freeze0985 2023-09-24 11:49:06 UTC
Not that I remember.
Comment 3 freeze0985 2023-09-24 11:49:50 UTC
(In reply to Bagas Sanjaya from comment #1)
> Do you have this issue on v6.1?

I remember having a boot problem at that time.
Comment 4 freeze0985 2023-09-24 11:50:44 UTC
My system info:
OS: ArcoLinux x86_64
Laptop name: Lenovo Legion 5 Pro 16IAH7H
CPU: i7-12700H
RAM: 16GB
GPU: NVIDIA GeForce RTX 3060 Mobile / Max-Q
GPU: Intel Alder Lake-P
Kernel: linux-6.5.4
Comment 5 Bagas Sanjaya 2023-09-26 00:28:06 UTC
(In reply to freeze0985 from comment #3)
> I remember having a boot problem at that time.

What about v6.4?
Comment 6 freeze0985 2023-09-26 03:21:34 UTC
There was no problem, IIRC.
Comment 7 freeze0985 2023-09-26 17:45:00 UTC
(In reply to Bagas Sanjaya from comment #5)
> What about v6.4?

6.4 works fine; I downgraded to it before Linux 6.5.4 was released. No memory leak.

Today I ran my laptop for 6 hours and 40 minutes, and the memory leak is still happening at the same rate.

              total        used        free      shared  buff/cache   available
Mem:            15Gi       7.2Gi       2.8Gi       797Mi       6.4Gi       8.1Gi
Swap:             0B          0B          0B
Total:          15Gi       7.2Gi       2.8Gi


This is a big problem.
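
The free output above does not show slab memory as a separate column, so it cannot confirm the rate by itself. /proc/meminfo does expose Slab, SReclaimable and SUnreclaim (in kB). The following is a minimal Python sketch, under that assumption, that turns two snapshots into a growth rate. The function names and the snapshot values are hypothetical and are not measurements from this report.

```python
# Hypothetical /proc/meminfo excerpts taken 2 hours apart; NOT from this bug.
BEFORE = """\
MemTotal:       16000000 kB
Slab:             400000 kB
SReclaimable:     150000 kB
SUnreclaim:       250000 kB
"""
AFTER = """\
MemTotal:       16000000 kB
Slab:            2400000 kB
SReclaimable:     160000 kB
SUnreclaim:      2240000 kB
"""

SLAB_FIELDS = ("Slab", "SReclaimable", "SUnreclaim")


def parse_meminfo(text):
    """Map each field name to its integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def slab_growth_kb_per_hour(before_text, after_text, hours):
    """Return the per-hour change of each slab field between two snapshots."""
    before, after = parse_meminfo(before_text), parse_meminfo(after_text)
    return {key: (after[key] - before[key]) / hours for key in SLAB_FIELDS}


if __name__ == "__main__":
    for key, rate in slab_growth_kb_per_hour(BEFORE, AFTER, 2).items():
        print(f"{key:<13} {rate:+.0f} kB/hour")
```

A steadily growing SUnreclaim with a roughly flat SReclaimable would point at memory the kernel cannot simply drop under pressure, which is what a leak in a driver would look like.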
Comment 8 freeze0985 2023-09-27 14:33:10 UTC
I upgraded to 6.5.5 today and it's still not fixed.
Comment 9 Bagas Sanjaya 2023-09-28 08:22:08 UTC
(In reply to freeze0985 from comment #0)
> link: https://i.imgur.com/8OFw2Fa.png

Can you attach that slabtop image here? The link above is broken.
Comment 10 Bagas Sanjaya 2023-09-28 08:22:56 UTC
(In reply to freeze0985 from comment #6)
> There was no problem, IIRC.

Then perform bisection. See Documentation/admin-guide/bug-bisect.rst for
how to do that.
Comment 11 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-28 09:56:14 UTC
(In reply to freeze0985 from comment #4)
> GPU: NVIDIA GeForce RTX 3060 Mobile / Max-Q
> GPU: Intel Alder Lake-P

Just to be sure: you are not using Nvidia's driver, are you?

FWIW, a bisection would be really great, but given the time it takes to spot this I guess that won't be easy. Have you searched the net for debugging guides that explain how to see which kernel subsystem allocates all that memory?
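
To put a rough number on "won't be easy": a bisection needs about log2(N) build-and-test rounds for N candidate commits, and here each round means running the kernel for hours before the leak shows. The following is a minimal Python sketch of that arithmetic. The commit count and the hours per round are placeholder assumptions, not figures from this thread; the real count for a range can be obtained with `git rev-list --count v6.4..v6.5`.

```python
import math


def bisect_steps(num_commits):
    """Approximate number of kernels to build and test during a bisection."""
    if num_commits < 1:
        raise ValueError("need at least one candidate commit")
    return math.ceil(math.log2(num_commits))


def bisect_hours(num_commits, hours_per_test):
    """Total observation time, ignoring build time and reboots."""
    return bisect_steps(num_commits) * hours_per_test


if __name__ == "__main__":
    commits = 16000       # placeholder assumption, not the real v6.4..v6.5 count
    hours_per_test = 3    # assumption: leak becomes obvious after about 3 hours
    print(bisect_steps(commits), "rounds,",
          bisect_hours(commits, hours_per_test), "hours of testing")
```

Narrowing the range first (for example, testing whether an earlier 6.5.y release already leaks) reduces N and therefore the number of rounds.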
Comment 12 freeze0985 2023-09-28 12:39:42 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #11)
> (In reply to freeze0985 from comment #4)
> > GPU: NVIDIA GeForce RTX 3060 Mobile / Max-Q
> > GPU: Intel Alder Lake-P
> 
> Just to be sure: you are not using Nvidia's driver, are you?
> 
> FWIW, a bisection would be really great, but given the time it takes to spot
> this I guess that won't be easy. Have you searched the net for debugging
> guides that explain how to see which kernel subsystem allocates all that
> memory?

I have nvidia-dkms https://archlinux.org/packages/extra/x86_64/nvidia-dkms/.

But I am not using any application which would need nvidia.
Comment 13 freeze0985 2023-09-28 12:41:39 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #11)
> (In reply to freeze0985 from comment #4)
> > GPU: NVIDIA GeForce RTX 3060 Mobile / Max-Q
> > GPU: Intel Alder Lake-P
> 
> Just to be sure: you are not using Nvidia's driver, are you?
> 
> FWIW, a bisection would be really great, but given the time it takes to spot
> this I guess that won't be easy. Have you searched the net for debugging
> guides that explain how to see which kernel subsystem allocates all that
> memory?

I have not read anything related to Linux kernel subsystem allocation.
Comment 14 freeze0985 2023-09-28 12:43:20 UTC
(In reply to Bagas Sanjaya from comment #9)
> Can you attach that slabtop image here? The link above is broken.

The Imgur link works. But anyway, here is a new link: https://ibb.co/Kxj56LP.
Comment 15 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-28 12:57:40 UTC
(In reply to freeze0985 from comment #12)
> I have nvidia-dkms https://archlinux.org/packages/extra/x86_64/nvidia-dkms/.

Then the upstream Linux kernel developers most likely won't help you. Please reproduce with a vanilla kernel. For details see:
https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/

(In reply to freeze0985 from comment #13)
> I have not read anything related to Linux kernel subsystem allocation.

I meant that you could search the net for guides that might be helpful to determine the source of those allocations.
Comment 16 freeze0985 2023-09-28 13:05:35 UTC
It's literally a vanilla kernel.
Comment 17 freeze0985 2023-09-28 13:07:37 UTC
I will uninstall it and watch for the kmalloc leak. BTW, this issue started on 6.5.3, IIRC. I first reported it to a fork of Linux, and they said the problem is not with their fork but with upstream, aka you guys.
Comment 18 freeze0985 2023-09-28 13:12:00 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #15)
> (In reply to freeze0985 from comment #12)
> > I have nvidia-dkms https://archlinux.org/packages/extra/x86_64/nvidia-dkms/.
> 
> Then the upstream Linux kernel developers most likely won't help you. Please
> reproduce with a vanilla kernel. For details see:
> https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
> 
> (In reply to freeze0985 from comment #13)
> > I have not read anything related to Linux kernel subsystem allocation.
> 
> I meant that you could search the net for guides that might be helpful to
> determine the source of those allocations.

OK, I will try this out: https://docs.kernel.org/dev-tools/kmemleak.html. Goddammit, it sucks to be the guy who finds the bug first.
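
kmemleak needs a kernel built with CONFIG_DEBUG_KMEMLEAK and reports suspected leaks through /sys/kernel/debug/kmemleak, one "unreferenced object ... (size N):" entry per allocation. The following is a minimal Python sketch for summarizing such a report per originating task. It assumes only the header and comm lines of the documented report format; the function name and the sample text are hypothetical and are not output from this system.

```python
import re
from collections import Counter

# Hypothetical kmemleak report excerpt; NOT output from this bug.
SAMPLE = """\
unreferenced object 0xffff888100000000 (size 64):
  comm "modprobe", pid 321, jiffies 4294900000
  backtrace:
    ...
unreferenced object 0xffff888100001000 (size 128):
  comm "modprobe", pid 321, jiffies 4294900010
unreferenced object 0xffff888100002000 (size 32):
  comm "kworker/0:1", pid 12, jiffies 4294900020
"""

HEADER = re.compile(r"unreferenced object 0x[0-9a-f]+ \(size (\d+)\)")
COMM = re.compile(r'comm "([^"]*)"')


def leaked_bytes_by_comm(report):
    """Sum the reported object sizes per task name (the comm field)."""
    totals = Counter()
    pending_size = None
    for line in report.splitlines():
        header = HEADER.search(line)
        if header:
            pending_size = int(header.group(1))
            continue
        comm = COMM.search(line)
        if comm and pending_size is not None:
            totals[comm.group(1)] += pending_size
            pending_size = None
    return totals


if __name__ == "__main__":
    for task, size in leaked_bytes_by_comm(SAMPLE).most_common():
        print(f"{task:<16} {size} bytes")
```

The backtrace lines in a real report are the more useful part for identifying the responsible driver; this summary only helps to see whether the suspected leaks are large enough to explain the growth. kmemleak can also report false positives.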
Comment 19 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-28 13:25:54 UTC
(In reply to freeze0985 from comment #16)
> It's literally a vanilla kernel.

If you loaded out-of-tree drivers (which you apparently have) it's not a vanilla kernel anymore (even if they are open source) -- and then upstream kernel developers often don't care.
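
Whether a running kernel still counts as vanilla in this sense can be checked from its taint value in /proc/sys/kernel/tainted. The following is a minimal Python sketch that decodes the three module-related bits described in the kernel's tainted-kernels documentation (bit 0 "P", bit 12 "O", bit 13 "E"). The function name and the example values are illustrative only.

```python
# Module-related taint bits, per Documentation/admin-guide/tainted-kernels.rst.
MODULE_TAINT_BITS = {
    0: ("P", "proprietary module was loaded"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_module_taint(value):
    """Return the module-related taint letters set in the given taint value."""
    return [
        letter
        for bit, (letter, _reason) in sorted(MODULE_TAINT_BITS.items())
        if (value >> bit) & 1
    ]


if __name__ == "__main__":
    # Illustrative values; on a real system read /proc/sys/kernel/tainted.
    for value in (0, 4097, 12289):
        print(value, decode_module_taint(value) or "no module taint")
```

A value of 0 means the kernel is not tainted at all; any of the letters above means a module outside the upstream tree has been loaded since boot.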

> Ok. I will try this out: https://docs.kernel.org/dev-tools/kmemleak.html 

Thx!
Comment 20 freeze0985 2023-10-05 14:06:03 UTC
You guys were right, the damned nvidia-dkms is the one to blame. It is causing the kmalloc leak on my system. In fact, I just reinstalled it and saw an unusual memory spike (I haven't restarted yet).
Comment 21 freeze0985 2023-10-05 14:13:12 UTC
Before you guys close this, can you explain how it is possible that nvidia-dkms causes this issue when it's not being used? I have two graphics cards: Intel and an NVIDIA 3060 Mobile.

I run my system on Intel graphics and no application uses NVIDIA.
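
This question is not answered in the thread. As a general point, a kernel module that is loaded runs inside the kernel whether or not any application uses the device, so "no application uses NVIDIA" does not mean the driver is idle. Whether the module is loaded can be checked in /proc/modules. The following is a minimal Python sketch, assuming the usual /proc/modules layout (name, size, refcount, dependencies, state, address, optional taint flags); the function name and the sample lines are hypothetical.

```python
# Hypothetical /proc/modules excerpt; sizes and addresses are made up.
SAMPLE = """\
nvidia_drm 100000 2 - Live 0x0000000000000000 (POE)
nvidia 60000000 50 nvidia_drm, Live 0x0000000000000000 (POE)
i915 4000000 30 - Live 0x0000000000000000
"""


def tainting_modules(proc_modules_text):
    """Return {module_name: taint_flags} for modules that carry taint flags."""
    result = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The taint flags, when present, are the 7th field, e.g. "(POE)".
        if len(fields) >= 7 and fields[6].startswith("("):
            result[fields[0]] = fields[6].strip("()")
    return result


if __name__ == "__main__":
    for name, flags in tainting_modules(SAMPLE).items():
        print(f"{name:<12} taint flags: {flags}")
```

If the nvidia modules show up here, the driver is active even on a system that renders everything on the Intel GPU; blacklisting or uninstalling the module is the way to take it out of the picture for testing.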