Bug 42967

Summary: kmalloc_node crashes when it tries to allocate memory on a node that doesn't have memory
Product: Memory Management
Reporter: mario.nicolas
Component: NUMA/discontigmem
Assignee: Vikram Dhillon (dhillonv10)
Status: NEW
Severity: normal
CC: atomlin, dhillonv10
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.1.0-7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: crash log

Description mario.nicolas 2012-03-20 11:26:10 UTC
Hi,

kmalloc_node crashes when it tries to allocate memory on a node that doesn't have memory.
Below is a simple kernel module that illustrates the problem:
It crashes when the core parameter is out of range (for example, if we pass 1 when only node 0 has memory). The crash log is attached.
A better behaviour would be to return a NULL pointer that could be checked and handled by the module.


#include <linux/module.h>
#include <linux/slab.h>
#include <linux/init.h>

static char* ptr = NULL;
static int core = 0;
module_param(core, int, S_IRUSR);


static int __init mymodule_init(void)
{
    printk(KERN_INFO "Using core affinity %d\n", core);
    /* kmalloc_node() takes its arguments as (size, flags, node). */
    ptr = kmalloc_node(128, GFP_KERNEL, core);
    printk(KERN_INFO "My module worked! ptr = %p\n", ptr);
    return 0;
}

static void __exit mymodule_exit(void)
{
    kfree(ptr);
    printk ("Unloading my module.\n");
    return;
}

module_init(mymodule_init);
module_exit(mymodule_exit);

MODULE_LICENSE("GPL");
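
The behaviour requested above (return NULL instead of crashing) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: model_node_has_memory, model_node_usable and model_alloc_on_node are hypothetical names, the two-node layout mirrors the example in the description (memory only on node 0), and malloc() stands in for the real per-node allocation. Note that the real kmalloc_node() accepts NUMA_NO_NODE (-1) to mean "no preference", which this model does not attempt to reproduce.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: two possible nodes, only node 0 has memory. */
#define MODEL_MAX_NODES 2
static const int model_node_has_memory[MODEL_MAX_NODES] = { 1, 0 };

/* Returns nonzero only if the node id is in range and the node has memory. */
static int model_node_usable(int node)
{
    return node >= 0 && node < MODEL_MAX_NODES && model_node_has_memory[node];
}

/*
 * Model of a node-aware allocation that fails softly: an unusable node
 * yields NULL, which the caller can check and handle.
 * malloc() is a stand-in for the real per-node allocation.
 */
static void *model_alloc_on_node(size_t size, int node)
{
    if (!model_node_usable(node))
        return NULL;
    return malloc(size);
}
```

With this model, model_alloc_on_node(128, 0) returns a usable pointer, while model_alloc_on_node(128, 1) returns NULL, so a caller such as mymodule_init() could report the failure and return an error code instead of crashing.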
Comment 1 mario.nicolas 2012-03-20 11:27:15 UTC
Created attachment 72657 [details]
crash log
Comment 2 Aaron Tomlin 2013-04-22 19:29:22 UTC
(In reply to comment #1)
> Created an attachment (id=72657) [details]
> crash log

Submitted a patch [1] that asserts when nodeid > num_online_nodes().
The check is only active on -debug kernels, since ____cache_alloc_node()
is a "hot code" path.

---
[1]: http://marc.info/?l=linux-mm&m=136181270727758&w=1
Comment 3 Aaron Tomlin 2014-01-23 12:23:42 UTC
(In reply to Aaron Tomlin from comment #2)

A patch has been committed and is available since v3.11-rc1 [1]. Can this
bug be marked as closed?

---
[1]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=14e50c6a9bc2b283bb4021026226268312ceefdd