Bug 59491

Summary: Regression/Broken MTRR with commit cd7b304dfaf1f3999ac5d2a1feeba95dec4284a9 "x86, range: fix missing merge during add range"
Product: Memory Management
Reporter: Joshua Covington (joshuacov)
Component: MTTR
Assignee: Andrew Morton (akpm)
Status: CLOSED CODE_FIX
Severity: normal
CC: alexandre.nunes, christian.koenig, daniel, higkoohk, js314592, yinghai
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.9.5
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
  dmesg - broken mttr with 3.9.5
  dmesg with reverted commit
  dmesg-patch_reverted-mtrr_clenup_debug
  dmesg-with_offending_patch-mtrr_cleanup_debug
  dmesg from 3.9.5 kernel with applied patches
  dmesg from 3.9.5 kernel with applied patches v2

Description Joshua Covington 2013-06-09 09:23:27 UTC
Created attachment 104011 [details]
dmesg - broken mttr with 3.9.5

Commit cd7b304dfaf1f3999ac5d2a1feeba95dec4284a9 "x86, range: fix missing merge during add range" introduced a regression for me. After upgrading to kernel-3.9.5 my dmesg is full of '*BAD*gran_size:' entries (see attached files). 3.9.4 works fine.

/proc/mtrr now looks like this (with 3.9.5):

# cat /proc/mtrr
reg00: base=0x000000000 (    0MB), size=16384MB, count=1: write-back
reg01: base=0x400000000 (16384MB), size= 1024MB, count=1: write-back
reg02: base=0x0c0000000 ( 3072MB), size= 1024MB, count=1: uncachable
reg03: base=0x43f800000 (17400MB), size=    8MB, count=1: uncachable

Reverting the offending commit restores the previous layout:
# cat /proc/mtrr
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg02: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back
reg03: base=0x200000000 ( 8192MB), size= 8192MB, count=1: write-back
reg04: base=0x400000000 (16384MB), size= 1024MB, count=1: write-back
reg05: base=0x43f800000 (17400MB), size=    8MB, count=1: uncachable
reg06: base=0x0c0000000 ( 3072MB), size=  256MB, count=1: write-combining
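
To compare the two listings field by field, a line in this format can be parsed with a single sscanf call. This is only a sketch: the struct and function names are hypothetical, and the format string assumes the exact layout shown in the listings above, not any documented guarantee about /proc/mtrr.

```c
#include <stdio.h>

/* Hypothetical holder for one /proc/mtrr line (names are illustrative). */
struct mtrr_entry {
    int reg;                     /* register index, e.g. 2 for "reg02" */
    unsigned long long base;     /* base address in bytes */
    unsigned long long size_mb;  /* size in MB as printed */
    char type[32];               /* e.g. "write-back", "uncachable" */
};

/*
 * Parse one line such as
 *   reg02: base=0x0c0000000 ( 3072MB), size= 1024MB, count=1: uncachable
 * Returns 1 on success, 0 if the line does not match that layout.
 */
int parse_mtrr_line(const char *line, struct mtrr_entry *e)
{
    unsigned long long base_mb;  /* redundant with base; parsed and dropped */
    int count;                   /* usage count; not kept in this sketch */
    int got = sscanf(line,
                     "reg%d: base=0x%llx (%lluMB), size=%lluMB, count=%d: %31[^\n]",
                     &e->reg, &e->base, &base_mb, &e->size_mb, &count, e->type);

    return got == 6;
}
```

Run over both listings, this makes the difference at base 0x0c0000000 easy to spot: with 3.9.5 it is a 1024MB uncachable entry, while with the commit reverted it is a 256MB write-combining entry.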

The upstream commit is: fbe06b7bae7c9cf6ab05168fce5ee93b2f4bae7c
Comment 1 Joshua Covington 2013-06-09 09:26:29 UTC
Created attachment 104021 [details]
dmesg with reverted commit
Comment 2 Yinghai Lu 2013-06-09 17:18:02 UTC
please boot with "mtrr_cleanup_debug" with and without reverting
and post the boot log.

Thanks
Comment 3 Andrew Morton 2013-06-09 17:22:47 UTC
Didn't Shan Wei <shanwei88@gmail.com> just send a fix for this?  Subject "[PATCH buf-fix] kernel, range: fix broken mtrr_cleanup"
Comment 5 Yinghai Lu 2013-06-09 20:06:01 UTC
(In reply to comment #3)
> Didn't Shan Wei <shanwei88@gmail.com> just send a fix for this?  Subject
> "[PATCH buf-fix] kernel, range: fix broken mtrr_cleanup"

No, that is not the right fix.
It would break add_range_with_merge when handling a case like this:

existing:
1G-2G
3G-4G
5G-6G
then add, with merge, the range [1G-6G)

Thanks
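
The case in comment 5, where one new range spans several existing entries, can be illustrated with a small user-space sketch. This is not the kernel's add_range_with_merge: the type and function names are hypothetical, ranges are assumed to be half-open [start, end), and the expected outcome (a single [1G-6G) entry) is an assumption about what the comment intends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open interval [start, end); hypothetical user-space type. */
struct span {
    uint64_t start;
    uint64_t end;
};

/*
 * Add [start, end) to the first n entries of spans[] (capacity cap),
 * absorbing every existing entry that overlaps or touches the new one.
 * Returns the new entry count. If the interval is empty, or the table is
 * full and nothing could be merged, the table is left unchanged.
 */
size_t span_add_merge(struct span *spans, size_t cap, size_t n,
                      uint64_t start, uint64_t end)
{
    size_t i = 0;

    assert(n <= cap);
    if (start >= end)
        return n;

    while (i < n) {
        if (spans[i].start > end || spans[i].end < start) {
            i++;    /* disjoint and not adjacent: keep this entry */
            continue;
        }
        /* Grow the new interval to cover this entry, then drop the entry. */
        if (spans[i].start < start)
            start = spans[i].start;
        if (spans[i].end > end)
            end = spans[i].end;
        spans[i] = spans[n - 1];    /* fill the hole; order is not preserved */
        n--;
        /* i is not advanced: the entry moved into slot i is checked next */
    }

    if (n == cap)
        return n;   /* only reachable when nothing was merged */

    spans[n].start = start;
    spans[n].end = end;
    return n + 1;
}
```

The point of the sketch is that the loop keeps scanning after the first merge. Starting from 1G-2G, 3G-4G and 5G-6G and adding [1G-6G), all three entries are absorbed and one entry remains; a version that stops after merging with the first match would leave the other two behind.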
Comment 6 Joshua Covington 2013-06-09 22:41:33 UTC
Created attachment 104061 [details]
dmesg-patch_reverted-mtrr_clenup_debug

(In reply to comment #2)
> please boot with "mtrr_cleanup_debug" with and without reverting
> and post the boot log.
> 
> Thanks

Dmesg from the kernel with the patch reverted and mtrr_cleanup_debug enabled.
Comment 7 Joshua Covington 2013-06-09 22:48:08 UTC
Created attachment 104071 [details]
dmesg-with_offending_patch-mtrr_cleanup_debug

(In reply to comment #2)
> please boot with "mtrr_cleanup_debug" with and without reverting
> and post the boot log.
> 
> Thanks

Dmesg from the 3.9.5 kernel (with the patch) and mtrr_cleanup_debug
Comment 8 Joshua Covington 2013-06-09 23:23:14 UTC
Created attachment 104081 [details]
dmesg from 3.9.5 kernel with applied patches

(In reply to comment #4)
> please check patch at
> 
> https://patchwork.kernel.org/patch/2694981/
> https://patchwork.kernel.org/patch/2694971/

Here are 2 dmesg (*.tar.xz) from a 3.9.5 kernel with the above patches applied (with and without mtrr_cleanup_debug). It looks like these patches fixed the problem.
Comment 9 Joshua Covington 2013-06-11 10:00:53 UTC
Created attachment 104371 [details]
dmesg from 3.9.5 kernel with applied patches v2

I tried v2 of the patches available here:

https://patchwork.kernel.org/patch/2695891/
https://patchwork.kernel.org/patch/2695881/

and here are the dmesg from a 3.9.5 kernel with the above patches applied (with and without mtrr_cleanup_debug). These also seem to have fixed the issue.
Comment 10 JS 2013-06-20 20:54:25 UTC
(In reply to comment #9)
> Created an attachment (id=104371) [details]
> dmesg from 3.9.5 kernel with applied patches v2
> 
> I tried v2 of the patches available here:
> 
> https://patchwork.kernel.org/patch/2695891/
> https://patchwork.kernel.org/patch/2695881/
> 
> and here are the dmesg from a 3.9.5 kernel with the above patches applied
> (with and without mtrr_cleanup_debug). These also seem to have fixed the issue.

It seems these patches are still not committed in 3.9.7
Comment 11 Yinghai Lu 2013-06-20 21:40:02 UTC
(In reply to comment #10)
> It seems these patches are still not committed in 3.9.7

Now they are in tip/x86/urgent, so hpa will push them to Linus for v3.10-rc7.
After that, Greg will pick them up for 3.9.X.
Comment 12 Yinghai Lu 2013-06-26 02:26:12 UTC
*** Bug 60001 has been marked as a duplicate of this bug. ***
Comment 13 Yinghai Lu 2013-06-26 02:27:56 UTC
The patches are now upstream. Greg has already picked them up for v3.9.8.