The most significant bit of the nodemask is ignored when calling `mbind` or `set_mempolicy`.

On a machine with N nodes, calling `mbind` or `set_mempolicy` with `maxnode=N` and `*nodemask=(1<<maxnode)-1` results in allocation on nodes `0` to `N-2` only. No memory is allocated on node `N-1`.

The cause may be the function `get_nodes` in mm/mempolicy.c:1238, which decrements `maxnode` right away.

The documentation clearly states (e.g. for `set_mempolicy()`): "nodemask points to a bit mask of node IDs that contains up to maxnode bits".

Steps to Reproduce:
(1) Attempt to set a memory policy that interleaves memory on N nodes
(2) Check the per-node memory usage
(3) Allocate and initialize a large array
(4) Check the per-node memory usage and compare it with the previous one

The code below sets an interleave-on-all-nodes memory policy and displays the per-node usage before and after a large (1GB) allocation. Compiled with `gcc-8 file.c -lnuma`.

-------------------
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <numa.h>
#include <numaif.h>
#include <unistd.h>
#include <sys/mman.h>

#define NUM_ELEMS ((1UL << 30) / sizeof(int)) // 1GB array

// Print the MemUsed line of every configured NUMA node.
void print_node_memusage(void)
{
    size_t num_nodes = (size_t)numa_num_configured_nodes();
    for (size_t i = 0; i < num_nodes; i++) {
        FILE *fp;
        char buf[1024];
        snprintf(buf, sizeof(buf),
                 "cat /sys/devices/system/node/node%zu/meminfo | grep MemUsed", i);
        if ((fp = popen(buf, "r")) == NULL) {
            perror("popen");
            exit(-1);
        }
        while (fgets(buf, sizeof(buf), fp) != NULL) {
            printf("%s", buf);
        }
        if (pclose(fp)) {
            perror("pclose");
            exit(-1);
        }
    }
}

int main(void)
{
    unsigned long num_nodes = numa_num_configured_nodes();
    // one bit per configured node; 1UL avoids shifting a plain int
    unsigned long all_nodes_mask = (1UL << num_nodes) - 1;
    // maxnode == number of nodes: this is the call that shows the problem
    if (set_mempolicy(MPOL_INTERLEAVE, &all_nodes_mask, num_nodes) != 0) {
        perror("set_mempolicy");
        exit(-1);
    }

    // print per-node memory usage before
    print_node_memusage();

    // allocate large array and write to it so every page is touched
    int *a = malloc(NUM_ELEMS * sizeof(int));
    if (a == NULL) {
        perror("malloc");
        exit(-1);
    }
    a[0] = 123;
    for (size_t i = 1; i < NUM_ELEMS; i++) {
        // widen before squaring to avoid signed int overflow
        a[i] = (int)(((long long)a[i-1] * a[i-1]) % 1000000);
    }

    // print per-node memory usage after
    print_node_memusage();

    free(a);
    return 0;
}
-------------------

Expected Results:
It should allocate similar amounts of memory on all N nodes.

Actual Results:
It allocated memory on the first N-1 nodes. No memory was allocated on the last node.

Example run on my machine:

Before:
Node 0 MemUsed: 3669964 kB
Node 1 MemUsed:  935864 kB
Node 2 MemUsed: 2921224 kB
Node 3 MemUsed: 2439580 kB

After:
Node 0 MemUsed: 4020876 kB (+343MB)
Node 1 MemUsed: 1287212 kB (+343MB)
Node 2 MemUsed: 3271468 kB (+342MB)
Node 3 MemUsed: 2439264 kB (no difference)

Build Date & Hardware:
Kernel: 4.18.10
OS: Ubuntu 16.04.3 LTS
CPU: 2 x Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
RAM: 4x8GB DIMMs (8GB per node)