Bug 201085 - Kernel allows mlock() on pages in CMA without migrating pages out of CMA first
Summary: Kernel allows mlock() on pages in CMA without migrating pages out of CMA first
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All
OS: Linux
Importance: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-11 03:59 UTC by Timothy Pearson
Modified: 2019-08-12 20:01 UTC

See Also:
Kernel Version: 4.18
Subsystem:
Regression: No
Bisected commit-id:


Description Timothy Pearson 2018-09-11 03:59:11 UTC
Pages allocated in CMA are not migrated out of CMA when non-CMA memory is available and locking is attempted via mlock().  This can result in rapid exhaustion of the CMA pool if memory locking is used by an application with large memory requirements, such as QEMU.

To reproduce, on a dual-CPU (NUMA) POWER9 host, try to launch a VM with mlock=on and 1/2 or more of physical memory allocated to the guest.  Observe that the CMA pool is fully depleted despite plenty of normal free RAM being available.
Comment 1 Andrew Morton 2018-09-12 19:47:31 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 11 Sep 2018 03:59:11 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=201085
> 
>             Bug ID: 201085
>            Summary: Kernel allows mlock() on pages in CMA without
>                     migrating pages out of CMA first
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.18
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: tpearson@raptorengineering.com
>         Regression: No
> 
> Pages allocated in CMA are not migrated out of CMA when non-CMA memory is
> available and locking is attempted via mlock().  This can result in rapid
> exhaustion of the CMA pool if memory locking is used by an application with
> large memory requirements such as QEMU.
> 
> To reproduce, on a dual-CPU (NUMA) POWER9 host try to launch a VM with
> mlock=on
> and 1/2 or more of physical memory allocated to the guest.  Observe full CMA
> pool depletion occurs despite plenty of normal free RAM available.
> 
> -- 
> You are receiving this mail because:
> You are the assignee for the bug.
Comment 2 mike.kravetz 2018-09-13 00:00:11 UTC
On 09/12/2018 12:47 PM, Andrew Morton wrote:

IIRC, Aneesh is working on some powerpc IOMMU patches for a similar issue
(long term pinning of cma pages).  Added him on Cc:
https://lkml.kernel.org/r/20180906054342.25094-2-aneesh.kumar@linux.ibm.com

This report seems to be suggesting a more general solution/change.  Wondering
if there is any overlap with this and Aneesh's work.
Comment 3 Aneesh Kumar KV 2018-09-14 03:23:31 UTC
On 9/14/18 8:39 AM, Aneesh Kumar K.V wrote:
> 
> This is a related issue. I am looking at doing something similar to what
> I did with the IOMMU patches, that is, migrating pages out of the CMA
> region before mlock.
> 
> The problem mentioned is similar to the VFIO case: with VFIO we pin the
> guest pages, and QEMU's -realtime mlock=on option behaves similarly.
> 
> We can end up backing guest RAM with pages from the CMA area, and these
> are different QEMU options that pin those guest pages for the lifetime
> of the guest.
> 

Another option is to look at adding something prctl-like that would let a
process avoid allocation from the CMA region in the first place.

-aneesh
Comment 4 Aneesh Kumar KV 2018-09-14 03:53:14 UTC
On 9/13/18 3:49 AM, Mike Kravetz wrote:

This is a related issue. I am looking at doing something similar to what
I did with the IOMMU patches, that is, migrating pages out of the CMA
region before mlock.

The problem mentioned is similar to the VFIO case: with VFIO we pin the
guest pages, and QEMU's -realtime mlock=on option behaves similarly.

We can end up backing guest RAM with pages from the CMA area, and these
are different QEMU options that pin those guest pages for the lifetime
of the guest.

-aneesh
Comment 5 Timothy Pearson 2019-08-12 20:01:43 UTC
It's been almost a year.  Are we any closer to a solution?
