Bug 97321 - WARNING at untrack_pfn+0x99/0xa0()
Summary: WARNING at untrack_pfn+0x99/0xa0()
Status: RESOLVED CODE_FIX
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: All
OS: Linux
Priority: P1
Severity: normal
Assignee: Andrew Morton
 
Reported: 2015-04-26 21:09 UTC by Stas Sergeev
Modified: 2016-09-29 20:55 UTC

Kernel Version: 4.0.0-rc6+ git
Regression: No

Attachments
possible fix (714 bytes, text/plain)
2015-04-26 21:09 UTC, Stas Sergeev

Description Stas Sergeev 2015-04-26 21:09:07 UTC
Created attachment 175121
possible fix

Hello.

I have a program that AFAIK does mremap() on previously
mmap()ed /dev/mem. This results in the following stack trace:

[   67.887346] WARNING: CPU: 3 PID: 5144 at arch/x86/mm/pat.c:904 untrack_pfn+0x99/0xa0()

[   67.892540] Call Trace:
[   67.892623]  [<ffffffff81541bcd>] dump_stack+0x4f/0x7b
[   67.892706]  [<ffffffff810533fb>] warn_slowpath_common+0x8b/0xd0
[   67.892788]  [<ffffffff810534e5>] warn_slowpath_null+0x15/0x20
[   67.892870]  [<ffffffff8104b309>] untrack_pfn+0x99/0xa0
[   67.892952]  [<ffffffff81138f3c>] unmap_single_vma+0x73c/0x750
[   67.893035]  [<ffffffff8115879d>] ? alloc_pages_current+0x10d/0x1c0
[   67.893118]  [<ffffffff81096846>] ? lockdep_init_map+0x66/0x7f0
[   67.893200]  [<ffffffff81139b5c>] unmap_vmas+0x4c/0xb0
[   67.893282]  [<ffffffff8113f1a3>] unmap_region+0xa3/0x110
[   67.893364]  [<ffffffff8113f5d9>] ? vma_rb_erase+0x129/0x250
[   67.893446]  [<ffffffff811413b0>] do_munmap+0x1f0/0x460
[   67.893560]  [<ffffffff811444bd>] move_vma+0x14d/0x280
[   67.893641]  [<ffffffff81144992>] SyS_mremap+0x3a2/0x510
[   67.893724]  [<ffffffff8154b689>] system_call_fastpath+0x12/0x17


The problem happens because __follow_pte() returns
-EINVAL after the !pte_present(*ptep) check, and so
follow_phys() fails.
I think that if the page is not present, there is simply no
need to call free_pfn_range(). So I made a naive patch
(attached) that seems to fix the problem.
Comment 1 Andrew Morton 2015-04-28 21:36:10 UTC
I'm switching this to email - we don't handle patches via bugzilla.

Suresh, could you please take a look?


patch:

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 7ac6869..2df97f6 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -900,14 +900,12 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 	/* free the chunk starting from pfn or the whole chunk */
 	paddr = (resource_size_t)pfn << PAGE_SHIFT;
 	if (!paddr && !size) {
-		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
-			WARN_ON_ONCE(1);
-			return;
-		}
-
-		size = vma->vm_end - vma->vm_start;
+		int err = follow_phys(vma, vma->vm_start, 0, &prot, &paddr);
+		if (!err)
+			size = vma->vm_end - vma->vm_start;
 	}
-	free_pfn_range(paddr, size);
+	if (size)
+		free_pfn_range(paddr, size);
 	vma->vm_flags &= ~VM_PAT;
 }
Comment 2 Stas Sergeev 2015-04-29 10:39:29 UTC
This is a regression, although a rather old one.
Comment 3 Stas Sergeev 2015-10-28 21:37:56 UTC
Toshi Kani <toshi.kani@hp.com> explains:

I looked at the dosemu code and was able to reproduce the issue with a test
program.  This problem happens when mremap() to /dev/mem (or PFNMAP) is
called with MREMAP_FIXED.

In this case, mremap calls move_vma(), which first calls move_page_tables()
to remap the translation and then calls do_munmap() to remove the original
mapping.  Hence, when untrack_pfn() is called from do_munmap(), the
original map is already removed, and follow_phys() fails with the
 !pte_present() check.

I think there are a couple of issues:
 - If untrack_pfn() ignores an error from follow_phys() and skips
free_pfn_range(), PAT continues to track the original map that is removed.
 - untrack_pfn() calls free_pfn_range() to untrack a given free range.
 However, rbt_memtype_erase() requires the freed range to match the
tracked range exactly.  This does not support mremap, which needs to free up
part of the tracked range.
 - PAT does not track a new translation specified by mremap() with
MREMAP_FIXED.
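
The exact-match limitation in the second point can be illustrated with a toy model. This is only a sketch under stated assumptions: the tracker type and the functions tracker_reserve() and tracker_free_exact() are hypothetical names, and the code is not the kernel's rbt_memtype_erase() or its data structure. It shows only why a tracker that erases on an exact (start, end) match cannot release part of a tracked range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 16

/* One tracked physical range [start, end). Toy model only. */
struct tracked_range {
    uint64_t start;
    uint64_t end;
    bool used;
};

struct tracker {
    struct tracked_range slots[MAX_TRACKED];
};

/* Record [start, end). Returns 0 on success, -1 if invalid or full. */
static int tracker_reserve(struct tracker *t, uint64_t start, uint64_t end)
{
    if (start >= end)
        return -1;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (!t->slots[i].used) {
            t->slots[i] = (struct tracked_range){ start, end, true };
            return 0;
        }
    }
    return -1;
}

/*
 * Erase a range only if it matches a tracked entry exactly.
 * A request covering just part of an entry fails and leaves it tracked.
 */
static int tracker_free_exact(struct tracker *t, uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        struct tracked_range *r = &t->slots[i];
        if (r->used && r->start == start && r->end == end) {
            r->used = false;
            return 0;
        }
    }
    return -1;
}
```

In this model, reserving [0x1000, 0x5000) and then asking to free [0x1000, 0x3000) fails, and the whole range stays tracked. That corresponds to the case where mremap needs to release only part of a tracked range.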
