Bug 217253 - mbind, set_mempolicy, migrate_pages: maxnode description is off-by-one
Summary: mbind, set_mempolicy, migrate_pages: maxnode description is off-by-one
Status: NEEDINFO
Alias: None
Product: Documentation
Classification: Unclassified
Component: man-pages
Hardware: All Linux
Importance: P1 normal
Assignee: documentation_man-pages@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-27 14:05 UTC by Anthony J. Battersby
Modified: 2023-05-19 11:51 UTC

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Description Anthony J. Battersby 2023-03-27 14:05:42 UTC
linux/mm/mempolicy.c::get_nodes() does "--maxnode" at the beginning, so:
maxnode == 0 is invalid (-EINVAL).
maxnode == 1 specifies the empty set of nodes (the man pages currently say to use maxnode == 0).
maxnode == 2 indicates one valid bit in nodemask.
maxnode == 3 indicates two valid bits in nodemask.
etc.
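
The cases listed above can be written down as a small model. This is only a sketch of the behavior as described in this report, not the kernel's get_nodes() code: the names valid_nodemask_bits() and check_maxnode_examples() are hypothetical, and the model ignores the NULL-nodemask case and any upper limit on maxnode.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model (not kernel code) of how maxnode is interpreted,
 * following the list in this report.
 * Returns the number of nodemask bits treated as valid, or -EINVAL
 * when maxnode is 0.
 */
long valid_nodemask_bits(unsigned long maxnode)
{
	if (maxnode == 0)
		return -EINVAL;		/* maxnode == 0 is rejected */
	return (long)(maxnode - 1);	/* mirrors the "--maxnode" step */
}

/* The four cases from the report, expressed as assertions. */
void check_maxnode_examples(void)
{
	assert(valid_nodemask_bits(0) == -EINVAL);	/* invalid */
	assert(valid_nodemask_bits(1) == 0);		/* empty set */
	assert(valid_nodemask_bits(2) == 1);		/* one valid bit */
	assert(valid_nodemask_bits(3) == 2);		/* two valid bits */
}
```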

Incorrect section from mbind manpage:

"nodemask points to a bit mask of nodes containing up to maxnode bits.  The bit mask size is rounded to the next multiple of sizeof(unsigned long), but the kernel will use bits only up to maxnode.  A NULL value of nodemask or a maxnode value of zero specifies the empty set of nodes.  If the value of maxnode is zero, the nodemask argument is ignored."

I am not sure if this was an intentional design choice or a bug that got enshrined in the userspace API, but userspace programs "in the know" seem to rely on this now:

https://gitlab.com/qemu-project/qemu/-/blob/60ca584b8af0de525656f959991a440f8c191f12/backends/hostmem.c#L369

Also, the commit message for linux commit c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall") shows using "new_nodes->size + 1", so this API bug/choice seems to be known within the kernel community.
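
For illustration, the "size + 1" convention might look like this on the caller side. This is a minimal sketch assuming the behavior described in this report; the helper names nodemask_words(), maxnode_for_bits(), and nodemask_set() are hypothetical and are not part of libnuma or the kernel headers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of unsigned longs needed to hold nbits bits. */
size_t nodemask_words(unsigned long nbits)
{
	return (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/*
 * Hypothetical helper: the maxnode value to pass so that nbits bits of
 * the mask are considered, following the "size + 1" convention.
 */
unsigned long maxnode_for_bits(unsigned long nbits)
{
	return nbits + 1;
}

/* Hypothetical helper: set bit `node` in a mask that holds nbits bits. */
void nodemask_set(unsigned long *mask, unsigned long nbits, unsigned long node)
{
	assert(node < nbits);	/* the bit must lie inside the mask */
	mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}
```

Under this convention, a caller with a 64-bit mask would pass maxnode_for_bits(64), that is 65, as the maxnode argument, and an empty set would be expressed as maxnode_for_bits(0), that is 1.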

Here is a related bugzilla entry that treats the problem as a kernel bug rather than a documentation issue:
https://bugzilla.kernel.org/show_bug.cgi?id=201433

However, since "fixing" the bug (assuming that it was unintentional) might break existing userspace programs that work around it, I suggest fixing the documentation instead.  That is just my opinion as a user who ran into the bug and did some investigating; it would be best to check with the kernel maintainers for their opinion.

Related:
The commit message of linux commit 050c17f239fd ("numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES") discusses how to calculate maxnode for get_mempolicy().
Comment 1 Alejandro Colomar 2023-05-19 11:46:24 UTC
Thanks for the investigation.  I CCed the maintainer.  If you have any
specific suggestions for fixing the documentation, would you mind
preparing a patch according to the ./CONTRIBUTING file in the
man-pages repository?
