Bug 203107 - Bad page map in process during boot
Summary: Bad page map in process during boot
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-29 20:46 UTC by echto1
Modified: 2019-08-05 19:17 UTC

See Also:
Kernel Version: 5.0.5
Tree: Mainline
Regression: No


Attachments
attachment-13675-0.html (2.04 KB, text/html)
2019-04-02 13:00 UTC, echto1

Description echto1 2019-03-29 20:46:22 UTC
Error occurs randomly at boot after upgrading kernel from 5.0.0 to 5.0.4.

https://justpaste.it/387uf
Comment 1 Jan Kara 2019-04-02 10:33:49 UTC
Switching to email...

On Fri 29-03-19 20:46:22, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=203107
> 
>             Bug ID: 203107
>            Summary: Bad page map in process during boot
>            Product: File System
>            Version: 2.5
>     Kernel Version: 5.0.5
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext4
>           Assignee: fs_ext4@kernel-bugs.osdl.org
>           Reporter: echto1@gmail.com
>         Regression: No
> 
> Error occurs randomly at boot after upgrading kernel from 5.0.0 to 5.0.4.
> 
> https://justpaste.it/387uf

I don't think this is an ext4 error. Sure, this is an error in the file mapping
of libblkid.so.1.1.0 (which is handled by ext4), but the filesystem has very
little say in how or which PTEs are installed. The problem is that an
invalid PTE (dead000000000100) is present in the page tables. So this looks
more like a problem in MM itself. Adding MM guys to CC.

								Honza
Comment 2 echto1 2019-04-02 13:00:04 UTC
Created attachment 282099
attachment-13675-0.html

Thank you.

Comment 3 Kirill A. Shutemov 2019-04-04 13:08:46 UTC
On Tue, Apr 02, 2019 at 12:16:13PM +0200, Jan Kara wrote:
> Switching to email...
> 
> On Fri 29-03-19 20:46:22, bugzilla-daemon@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=203107
> > 
> >             Bug ID: 203107
> >            Summary: Bad page map in process during boot
> >            Product: File System
> >            Version: 2.5
> >     Kernel Version: 5.0.5
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: ext4
> >           Assignee: fs_ext4@kernel-bugs.osdl.org
> >           Reporter: echto1@gmail.com
> >         Regression: No
> > 
> > Error occurs randomly at boot after upgrading kernel from 5.0.0 to 5.0.4.
> > 
> > https://justpaste.it/387uf
> 
> I don't think this is an ext4 error. Sure this is an error in file mapping
> of libblkid.so.1.1.0 (which is handled by ext4) but the filesystem has very
> little to say wrt how or which PTEs are installed. And the problem is that
> invalid PTE (dead000000000100) is present in page tables. So this looks
> more like a problem in MM itself. Adding MM guys to CC.

0xdead000000000100 and 0xdead000000000200 are LIST_POISON1 and
LIST_POISON2, respectively. I have no idea how they would end up in a page table.
Comment 4 Vlastimil Babka 2019-04-04 14:23:50 UTC
On 4/4/19 3:08 PM, Kirill A. Shutemov wrote:
> On Tue, Apr 02, 2019 at 12:16:13PM +0200, Jan Kara wrote:
>> Switching to email...
>>
>> On Fri 29-03-19 20:46:22, bugzilla-daemon@bugzilla.kernel.org wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=203107
>>>
>>>             Bug ID: 203107
>>>            Summary: Bad page map in process during boot
>>>            Product: File System
>>>            Version: 2.5
>>>     Kernel Version: 5.0.5
>>>           Hardware: All
>>>                 OS: Linux
>>>               Tree: Mainline
>>>             Status: NEW
>>>           Severity: normal
>>>           Priority: P1
>>>          Component: ext4
>>>           Assignee: fs_ext4@kernel-bugs.osdl.org
>>>           Reporter: echto1@gmail.com
>>>         Regression: No
>>>
>>> Error occurs randomly at boot after upgrading kernel from 5.0.0 to 5.0.4.
>>>
>>> https://justpaste.it/387uf
>>
>> I don't think this is an ext4 error. Sure this is an error in file mapping
>> of libblkid.so.1.1.0 (which is handled by ext4) but the filesystem has very
>> little to say wrt how or which PTEs are installed. And the problem is that
>> invalid PTE (dead000000000100) is present in page tables. So this looks
>> more like a problem in MM itself. Adding MM guys to CC.
> 
> 0xdead000000000100 and 0xdead000000000200 are LIST_POISON1 and
> LIST_POISON2, respectively. I have no idea how they would end up in a page table.

CONFIG_DEBUG_LIST might catch the issue. It should also be relatively
easy to bisect between 5.0.0 and 5.0.4 using the stable git tree [1].
However, because the error occurs randomly, you need to perform enough boot
attempts on each candidate before you can confidently mark a commit as "good".

[1] git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
