Bug 219427 - Memory leak of pinned pages in "low memory conditions"
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All Linux
Importance: P3 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-10-25 20:08 UTC by vlovich
Modified: 2024-10-26 02:55 UTC
CC List: 2 users

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:



Description vlovich 2024-10-25 20:08:14 UTC
By default, llama.cpp uses CUDA's allocator, which creates pinned pages. On the latest official 6.11 kernel, every invocation leaks memory permanently: `free -m` reports more and more memory used even though no active process is using it. Similarly, `nr_foll_pin_acquired` and `nr_foll_pin_released` in `/proc/vmstat` are badly imbalanced. The llama.cpp discussion is at https://github.com/ggerganov/llama.cpp/issues/9988 and the issue was also reported to NVIDIA at https://forums.developer.nvidia.com/t/memory-leak-on-kernel-6-11-0-when-using-cudamallochost/308691
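
For anyone who wants to check the same counters, here is a minimal sketch (not part of the original report) that reads `/proc/vmstat` and prints the difference between `nr_foll_pin_acquired` and `nr_foll_pin_released`. The helper names `parse_vmstat`, `pin_imbalance` and `read_pin_imbalance` are made up for illustration. The counters are only present on kernels that export them, so the sketch treats a missing counter as 0.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def pin_imbalance(counters):
    """Return acquired minus released pin counts (0 means balanced)."""
    acquired = counters.get("nr_foll_pin_acquired", 0)  # assumed 0 if absent
    released = counters.get("nr_foll_pin_released", 0)  # assumed 0 if absent
    return acquired - released


def read_pin_imbalance(path="/proc/vmstat"):
    """Read the live counters from the given vmstat file."""
    with open(path) as f:
        return pin_imbalance(parse_vmstat(f.read()))


if __name__ == "__main__":
    print("outstanding pins:", read_pin_imbalance())
```

A nonzero value is expected while a process still holds pins. The symptom described here would show up as a difference that keeps growing across runs after the process has exited.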

I see a patch proposed in https://lore.kernel.org/lkml/87y12ibbew.fsf@nvdebian.thelocal/T/#ma3aebfc4d8aa152d2c0439bedf0a4862d2510185 but it doesn't seem to have been applied to the 6.12 release candidates or to mainline, so I'm filing this bug to make sure it is tracked.
Comment 1 vlovich 2024-10-25 21:31:25 UTC
One thing in the explanation doesn't make sense to me. I have 64 GiB of RAM, and on a freshly booted machine memory usage is only 4 GiB. The maximum allocated by llama.cpp is ~16 GiB. So it's a bit strange to be hitting this if the reasoning is that it only happens in the "low memory conditions" case.
Comment 2 vlovich 2024-10-25 23:10:34 UTC
I've confirmed that the patch fixes the issue on 6.11. For some reason I can't get 6.12 to boot, so I haven't been able to double-check it there.
