Bug 213785 - hugetlbfs: interprets mode as decimal
Summary: hugetlbfs: interprets mode as decimal
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-19 15:34 UTC by Dennis Camera
Modified: 2021-07-21 17:48 UTC (History)
0 users

See Also:
Kernel Version: >=5.1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments

Description Dennis Camera 2021-07-19 15:34:00 UTC
I noticed the following strange behaviour with hugetlbfs mounts for kernel versions something north of 4.19, but certainly since 5.4:

when the mode= option of the mount has the sticky bit set, the mode of the mount point gets scrambled.

To test the following command line can be used:
mkdir -p /tmp/tlbtest && mount -t hugetlbfs -o mode=.... none /tmp/tlbtest && stat -c 'mode: %04a' /tmp/tlbtest

For kernel versions <= 4.19 or if the first byte of mode is 0, stat outputs what was set using "-o mode".

But on newer kernel versions if the first byte of mode is 1 the output of stat is as follows (input -> output):
1700 -> 1244
1750 -> 1326
1770 -> 1352
1775 -> 1357
1777 -> 1361

The behaviour is reproducible across different kernel versions and architectures (5.4.47-amd64, 5.9.8-amd64, 5.10.40-ppc64el, 5.12.15-amd64).
Comment 1 Andrew Morton 2021-07-20 22:07:07 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 19 Jul 2021 15:34:00 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=213785
> 
>             Bug ID: 213785
>            Summary: hugetlbfs scrambles mode when sticky bit is set
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: >4.19 (certainly >= 5.4)
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: akpm@linux-foundation.org
>           Reporter: bugs+kernel.org@dtnr.ch
>         Regression: Yes
> 
> I noticed the following strange behaviour with hugetlbfs mounts for kernel
> versions something north of 4.19, but certainly since 5.4:
> 
> when the mode= option of the mount has the sticky bit set, the mode of the
> mount point gets scrambled.
> 
> To test the following command line can be used:
> mkdir -p /tmp/tlbtest && mount -t hugetlbfs -o mode=.... none /tmp/tlbtest &&
> stat -c 'mode: %04a' /tmp/tlbtest
> 
> For kernel versions <= 4.19 or if the first byte of mode is 0, stat outputs
> what was set using "-o mode".
> 
> But on newer kernel versions if the first byte of mode is 1 the output of
> stat
> is as follows (input -> output):
> 1700 -> 1244
> 1750 -> 1326
> 1770 -> 1352
> 1775 -> 1357
> 1777 -> 1361
> 
> The behaviour is reproducible across different kernel versions and
> architectures (5.4.47-amd64, 5.9.8-amd64, 5.10.40-ppc64el, 5.12.15-amd64).
> 
> -- 
> You may reply to this email to add a comment.
> 
> You are receiving this mail because:
> You are the assignee for the bug.
Comment 2 mike.kravetz 2021-07-21 04:38:56 UTC
Add David, Al

On 7/20/21 3:07 PM, Andrew Morton wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=213785
>>
>>            Summary: hugetlbfs scrambles mode when sticky bit is set
>>            Product: Memory Management
>>            Version: 2.5
>>     Kernel Version: >4.19 (certainly >= 5.4)
>>         Regression: Yes
>>
>> I noticed the following strange behaviour with hugetlbfs mounts for kernel
>> versions something north of 4.19, but certainly since 5.4:
>>
>> when the mode= option of the mount has the sticky bit set, the mode of the
>> mount point gets scrambled.
>>
>> To test the following command line can be used:
>> mkdir -p /tmp/tlbtest && mount -t hugetlbfs -o mode=.... none /tmp/tlbtest
>> &&
>> stat -c 'mode: %04a' /tmp/tlbtest
>>
>> For kernel versions <= 4.19 or if the first byte of mode is 0, stat outputs
>> what was set using "-o mode".
>>
>> But on newer kernel versions if the first byte of mode is 1 the output of
>> stat
>> is as follows (input -> output):
>> 1700 -> 1244
>> 1750 -> 1326
>> 1770 -> 1352
>> 1775 -> 1357
>> 1777 -> 1361
>>
>> The behaviour is reproducible across different kernel versions and
>> architectures (5.4.47-amd64, 5.9.8-amd64, 5.10.40-ppc64el, 5.12.15-amd64).

I took a quick look and believe a change in behavior was caused by
commit 32021982a324 "Convert the hugetlbfs to use the fs_context
during mount".

Prior to the commit, code processing the mode option used
match_octal() to convert the command line string to a numeric value.
Since match_octal expects a string representing an octal value, it does
not require a leading '0'.  As a result, prior to this commit the
argument 'mode=1700' would result in a mode value of 01700.  After the
commit one must precede octal values with 0.  So, mode=1700 would result
in a mode value of 03244 (& 01777U) = 1244.

If my analysis is correct, I am not sure how to proceed.  IMO, the
current behavior is 'more correct'.  However, until v5.1 a preceeding 0
was not required when specifying mode for hugetlbfs.  So, this was
certainly a change in behavior.  Suggestions?
Comment 3 Dennis Camera 2021-07-21 08:03:58 UTC
Dear Mike,

On Tue, 20 Jul 2021 21:38 -0700 Mike Kravetz wrote:
> I took a quick look and believe a change in behavior was caused by
> commit 32021982a324 "Convert the hugetlbfs to use the fs_context
> during mount".

Thanks for taking a look at this issue and pointing to the commit that
introduced this change.
It looks credible.

> Prior to the commit, code processing the mode option used
> match_octal() to convert the command line string to a numeric value.
> Since match_octal expects a string representing an octal value, it
> does
> not require a leading '0'.  As a result, prior to this commit the
> argument 'mode=1700' would result in a mode value of 01700.  After
> the
> commit one must precede octal values with 0.  So, mode=1700 would
> result
> in a mode value of 03244 (& 01777U) = 1244.

I've been racking my brain trying to figure out what was going on, but
the solution is so simple…

> If my analysis is correct, I am not sure how to proceed.  IMO, the
> current behavior is 'more correct'.  However, until v5.1 a preceeding
> 0
> was not required when specifying mode for hugetlbfs.  So, this was
> certainly a change in behavior.  Suggestions?

Generally, I would agree that interpreting integers as decimals is the
correct way, but in the case of file modes I cannot remember having
used or seen modes in anything but octal over all the years I've been
working with Linux.

As I understand Documentation/admin-guide/mm/hugetlbpage.rst the mode
option should be interpreted as octal:
> The mode option sets [the mode] to value & 01777. This value is given
in octal. By default the value 0755 is picked.

Furthermore, I had a look at the POSIX documentation for chmod(1)
(https://pubs.opengroup.org/onlinepubs/9699919799/utilities/chmod.html)
which also interprets all integers as octals.

So my suggestion would be to convert the mode option to a
fsparam_u32oct and if possible backport this change to the supported
>=5.1 kernel versions.

Regards,
Dennis
Comment 4 willy 2021-07-21 11:57:59 UTC
On Tue, Jul 20, 2021 at 09:38:48PM -0700, Mike Kravetz wrote:
> Add David, Al
> 
> On 7/20/21 3:07 PM, Andrew Morton wrote:
> >> https://bugzilla.kernel.org/show_bug.cgi?id=213785
> >>
> >>            Summary: hugetlbfs scrambles mode when sticky bit is set
> >>            Product: Memory Management
> >>            Version: 2.5
> >>     Kernel Version: >4.19 (certainly >= 5.4)
> >>         Regression: Yes
> >>
> >> I noticed the following strange behaviour with hugetlbfs mounts for kernel
> >> versions something north of 4.19, but certainly since 5.4:
> >>
> >> when the mode= option of the mount has the sticky bit set, the mode of the
> >> mount point gets scrambled.
> >>
> >> To test the following command line can be used:
> >> mkdir -p /tmp/tlbtest && mount -t hugetlbfs -o mode=.... none /tmp/tlbtest
> &&
> >> stat -c 'mode: %04a' /tmp/tlbtest
> >>
> >> For kernel versions <= 4.19 or if the first byte of mode is 0, stat
> outputs
> >> what was set using "-o mode".
> >>
> >> But on newer kernel versions if the first byte of mode is 1 the output of
> stat
> >> is as follows (input -> output):
> >> 1700 -> 1244
> >> 1750 -> 1326
> >> 1770 -> 1352
> >> 1775 -> 1357
> >> 1777 -> 1361
> >>
> >> The behaviour is reproducible across different kernel versions and
> >> architectures (5.4.47-amd64, 5.9.8-amd64, 5.10.40-ppc64el, 5.12.15-amd64).
> 
> I took a quick look and believe a change in behavior was caused by
> commit 32021982a324 "Convert the hugetlbfs to use the fs_context
> during mount".
> 
> Prior to the commit, code processing the mode option used
> match_octal() to convert the command line string to a numeric value.
> Since match_octal expects a string representing an octal value, it does
> not require a leading '0'.  As a result, prior to this commit the
> argument 'mode=1700' would result in a mode value of 01700.  After the
> commit one must precede octal values with 0.  So, mode=1700 would result
> in a mode value of 03244 (& 01777U) = 1244.
> 
> If my analysis is correct, I am not sure how to proceed.  IMO, the
> current behavior is 'more correct'.  However, until v5.1 a preceeding 0
> was not required when specifying mode for hugetlbfs.  So, this was
> certainly a change in behavior.  Suggestions?  

Probably should follow what shmem does:

mm/shmem.c:     fsparam_u32oct("mode",          Opt_mode),

(also several other filesystems)
Comment 5 mike.kravetz 2021-07-21 17:48:30 UTC
On 7/21/21 4:57 AM, Matthew Wilcox wrote:
> On Tue, Jul 20, 2021 at 09:38:48PM -0700, Mike Kravetz wrote:
>> Add David, Al
>>
>> On 7/20/21 3:07 PM, Andrew Morton wrote:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=213785
>>
>> I took a quick look and believe a change in behavior was caused by
>> commit 32021982a324 "Convert the hugetlbfs to use the fs_context
>> during mount".
>>
>> Prior to the commit, code processing the mode option used
>> match_octal() to convert the command line string to a numeric value.
>> Since match_octal expects a string representing an octal value, it does
>> not require a leading '0'.  As a result, prior to this commit the
>> argument 'mode=1700' would result in a mode value of 01700.  After the
>> commit one must precede octal values with 0.  So, mode=1700 would result
>> in a mode value of 03244 (& 01777U) = 1244.
>>
>> If my analysis is correct, I am not sure how to proceed.  IMO, the
>> current behavior is 'more correct'.  However, until v5.1 a preceeding 0
>> was not required when specifying mode for hugetlbfs.  So, this was
>> certainly a change in behavior.  Suggestions?  
> 
> Probably should follow what shmem does:
> 
> mm/shmem.c:     fsparam_u32oct("mode",          Opt_mode),
> 
> (also several other filesystems)
> 

Thanks Matthew!

That looks to be exactly what is needed.

Note You need to log in before you can comment on or make changes to this bug.