Bug 10328

Summary: [regression] performance drop for glx
Product: Memory Management Reporter: Oleksij Rempel (fishor) (bug-track)
Component: OtherAssignee: Andrew Morton (akpm)
Status: CLOSED CODE_FIX    
Severity: normal CC: bunk, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.25-rc6 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9832    
Attachments: xorg.log

Description Oleksij Rempel (fishor) 2008-03-25 15:11:14 UTC
after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik related perfomance issues.

I used glxgears for test.
before this patch: 1281.005 FPS
and after: 765.000 FPS

latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
still hase bad performance.

I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
Comment 1 Anonymous Emailer 2008-03-25 15:28:58 UTC
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> 
>            Summary: [regression] performance drop for glx
>            Product: Memory Management
>            Version: 2.5
>      KernelVersion: 2.6.25-rc6
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: akpm@osdl.org
>         ReportedBy: bug-track@fisher-privat.net
> 
> 
> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some
> grafik
> related perfomance issues.
> 
> I used glxgears for test.
> before this patch: 1281.005 FPS
> and after: 765.000 FPS

It nearly halved.

> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> still hase bad performance.
> 
> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
> 

That's 

: commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
: Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
: Date:   Wed Jan 30 13:33:43 2008 +0100
: 
:     x86: set strong uncacheable where UC is really desired
:     
:     Also use _PAGE_PWT for all the mappings which need uncache mapping.
:     Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
:     we now use PAT3 which is strong uncacheable.
:     
:     This makes it consistent with pgprot_noncached()
:     
:     Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
:     Signed-off-by: Ingo Molnar <mingo@elte.hu>
:     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
: 
Comment 2 Arjan van de Ven 2008-03-25 16:23:38 UTC
which graphics driver version is this?
It seems the code is ASKING for uncached, but WANTING something else...
that's a big bug/problem in principle, need to know details of who/what/where
Comment 3 Arjan van de Ven 2008-03-25 16:24:47 UTC
(alternatively, the "good" performance was a correctness bug that really shouldn't have been done.... I'll need to get the graphics guys involved on that just need more details first)
Comment 4 Suresh B Siddha 2008-03-25 17:43:17 UTC
On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=10328
> > 
> >            Summary: [regression] performance drop for glx
> > 
> > after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some
> grafik
> > related perfomance issues.
> > 
> > I used glxgears for test.
> > before this patch: 1281.005 FPS
> > and after: 765.000 FPS
> 
> It nearly halved.
> 
> > latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> > still hase bad performance.
> > 
> > I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
> > 
> 
> That's 
> 
> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
> : Date:   Wed Jan 30 13:33:43 2008 +0100
> : 
> :     x86: set strong uncacheable where UC is really desired
> :     
> :     Also use _PAGE_PWT for all the mappings which need uncache mapping.
> :     Instead of existing PAT2 which is UC- (and can be overwritten by
> MTRRs),
> :     we now use PAT3 which is strong uncacheable.
> :     
> :     This makes it consistent with pgprot_noncached()

Alexey, Can you please try the appended patch?

Andrew, can you please push the appended patch for 2.6.25? Thanks.
---

fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.

The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
fb drivers to use this new ioremap_wc() API.

We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
---

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 4afaba0..794895c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
 	switch (mode) {
 	case IOR_MODE_UNCACHED:
 	default:
-		prot = PAGE_KERNEL_NOCACHE;
+		/*
+		 * FIXME: we will use UC MINUS for now, as video fb drivers
+		 * depend on it. Upcoming ioremap_wc() will fix this behavior.
+		 */
+		prot = PAGE_KERNEL_UC_MINUS;
 		break;
 	case IOR_MODE_CACHED:
 		prot = PAGE_KERNEL;
diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 174b877..9cf472a 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
 #define __PAGE_KERNEL_RX		(__PAGE_KERNEL_EXEC & ~_PAGE_RW)
 #define __PAGE_KERNEL_EXEC_NOCACHE	(__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_PCD)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
 #define __PAGE_KERNEL_VSYSCALL_NOCACHE	(__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
@@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
 #define PAGE_KERNEL_EXEC		MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RX			MAKE_GLOBAL(__PAGE_KERNEL_RX)
 #define PAGE_KERNEL_NOCACHE		MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_UC_MINUS		MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
 #define PAGE_KERNEL_EXEC_NOCACHE	MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
 #define PAGE_KERNEL_LARGE		MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
 #define PAGE_KERNEL_LARGE_EXEC		MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
Comment 5 Arjan van de Ven 2008-03-25 21:42:49 UTC
Suresh Siddha wrote:
> On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
>> bugme-daemon@bugzilla.kernel.org wrote:
>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
>>>
>>>            Summary: [regression] performance drop for glx
>>>
>>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some
>>> grafik
>>> related perfomance issues.
>>>
>>> I used glxgears for test.
>>> before this patch: 1281.005 FPS
>>> and after: 765.000 FPS
>> It nearly halved.
>>
>>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
>>> still hase bad performance.
>>>
>>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
>>>
>> That's 
>>
>> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
>> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
>> : Date:   Wed Jan 30 13:33:43 2008 +0100
>> : 
>> :     x86: set strong uncacheable where UC is really desired
>> :     
>> :     Also use _PAGE_PWT for all the mappings which need uncache mapping.
>> :     Instead of existing PAT2 which is UC- (and can be overwritten by
>> MTRRs),
>> :     we now use PAT3 which is strong uncacheable.
>> :     
>> :     This makes it consistent with pgprot_noncached()
> 
> Alexey, Can you please try the appended patch?
> 
> Andrew, can you please push the appended patch for 2.6.25? Thanks.
> ---
> 
> fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
> WC attribute. Recent changes in page attribute code made both
> ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
> breaks the graphics performance, as the effective memory type is UC instead
> of expected WC.
> 
> The correct way to fix this is to add ioremap_wc() (which uses UC- in the
> absence of PAT kernel support and WC with PAT) and change all the
> fb drivers to use this new ioremap_wc() API.
> 
> We can take this correct and longer route for post 2.6.25. For now,
> revert back to the UC- behavior for ioremap/ioremap_nocache.

I would still like to add an ioremap_wc() even in 2.6.25; even if it's for now
identical to ioremap_nocache(). Better get the right API in place as soon as possible
Comment 6 Ingo Molnar 2008-03-25 22:29:58 UTC
* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add 
> with WC attribute. Recent changes in page attribute code made both 
> ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). 
> This breaks the graphics performance, as the effective memory type is 
> UC instead of expected WC.
> 
> The correct way to fix this is to add ioremap_wc() (which uses UC- in 
> the absence of PAT kernel support and WC with PAT) and change all the 
> fb drivers to use this new ioremap_wc() API.
> 
> We can take this correct and longer route for post 2.6.25. For now, 
> revert back to the UC- behavior for ioremap/ioremap_nocache.

thanks Suresh, applied.

	Ingo
Comment 7 Oleksij Rempel (fishor) 2008-03-26 00:09:13 UTC
This patch do not make any difference to me.
I use xserver-xorg-video-intel 2:2.2.1-1ubuntu5
Comment 8 Oleksij Rempel (fishor) 2008-03-26 00:09:40 UTC
Created attachment 15442 [details]
xorg.log
Comment 9 Suresh B Siddha 2008-03-26 10:55:48 UTC
On Wed, Mar 26, 2008 at 06:29:11AM +0100, Ingo Molnar wrote:
> 
> * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> 
> > fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add 
> > with WC attribute. Recent changes in page attribute code made both 
> > ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). 
> > This breaks the graphics performance, as the effective memory type is 
> > UC instead of expected WC.
> > 
> > The correct way to fix this is to add ioremap_wc() (which uses UC- in 
> > the absence of PAT kernel support and WC with PAT) and change all the 
> > fb drivers to use this new ioremap_wc() API.
> > 
> > We can take this correct and longer route for post 2.6.25. For now, 
> > revert back to the UC- behavior for ioremap/ioremap_nocache.
> 
> thanks Suresh, applied.

Well, we need to take care of set_memory_uc() aswell, as the previous version
didn't fix Alexey's issue.

Alexey, can you please test and ack this, before Andrew/Ingo can push relevant
bits to their trees. Thanks.
---

fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.

The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
drivers to use this new ioremap_wc() API.

We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alexey Fisher <bug-track@fisher-privat.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
---

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 4afaba0..794895c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
 	switch (mode) {
 	case IOR_MODE_UNCACHED:
 	default:
-		prot = PAGE_KERNEL_NOCACHE;
+		/*
+		 * FIXME: we will use UC MINUS for now, as video fb drivers
+		 * depend on it. Upcoming ioremap_wc() will fix this behavior.
+		 */
+		prot = PAGE_KERNEL_UC_MINUS;
 		break;
 	case IOR_MODE_CACHED:
 		prot = PAGE_KERNEL;
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 14e48b5..7b79f6b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -771,7 +771,7 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,
 int set_memory_uc(unsigned long addr, int numpages)
 {
 	return change_page_attr_set(addr, numpages,
-				    __pgprot(_PAGE_PCD | _PAGE_PWT));
+				    __pgprot(_PAGE_PCD));
 }
 EXPORT_SYMBOL(set_memory_uc);
 
diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 174b877..9cf472a 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
 #define __PAGE_KERNEL_RX		(__PAGE_KERNEL_EXEC & ~_PAGE_RW)
 #define __PAGE_KERNEL_EXEC_NOCACHE	(__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_PCD)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
 #define __PAGE_KERNEL_VSYSCALL_NOCACHE	(__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
@@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
 #define PAGE_KERNEL_EXEC		MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RX			MAKE_GLOBAL(__PAGE_KERNEL_RX)
 #define PAGE_KERNEL_NOCACHE		MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_UC_MINUS		MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
 #define PAGE_KERNEL_EXEC_NOCACHE	MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
 #define PAGE_KERNEL_LARGE		MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
 #define PAGE_KERNEL_LARGE_EXEC		MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
Comment 10 Oleksij Rempel (fishor) 2008-03-26 15:22:42 UTC
Last patch working for me:
glxgears 
6135 frames in 5.0 seconds = 1226.967 FPS
6202 frames in 5.0 seconds = 1240.361 FPS
Comment 11 Adrian Bunk 2008-03-27 00:08:10 UTC
fixed by commit d546b67a940eb42a99f56b86c5cd8d47c8348c2a
Comment 12 Oleksij Rempel (fishor) 2008-04-19 01:48:10 UTC
Same issue with commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6

kernel 2.6.25-git1
Comment 13 Oleksij Rempel (fishor) 2008-04-19 01:50:01 UTC
commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
Date:   Tue Mar 18 17:00:14 2008 -0700

    x86: PAT infrastructure patch
    
    Sets up pat_init() infrastructure.
    
    PAT MSR has following setting.
    	PAT
    	|PCD
    	||PWT
    	|||
    	000 WB		_PAGE_CACHE_WB
    	001 WC		_PAGE_CACHE_WC
    	010 UC-		_PAGE_CACHE_UC_MINUS
    	011 UC		_PAGE_CACHE_UC
    
    We are effectively changing WT from boot time setting to WC.
    UC_MINUS is used to provide backward compatibility to existing /dev/mem
    users(X).
    
    reserve_memtype and free_memtype are new interfaces for maintaining alias-free
    mapping. It is currently implemented in a simple way with a linked list and
    not optimized. reserve and free tracks the effective memory type, as a result
    of PAT and MTRR setting rather than what is actually requested in PAT.
    
    pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr
    are same.
    
    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
Comment 14 Rafael J. Wysocki 2008-04-19 05:55:20 UTC
Are you sure this is the same bug?

If not, I'd prefer to use another bug entry for tracking it as a regression from 2.6.25.
Comment 15 Oleksij Rempel (fishor) 2008-04-19 13:09:09 UTC
This issue have same sumptomps: Grafik perfomance drop ( glxgears ) from 1200 fps to 700 fps. And couset by patch wich enable PAT. I would say this is the same bug, but i can't sure. I'm not programmer, just theology student ;)
Comment 16 Oleksij Rempel (fishor) 2008-04-24 22:35:37 UTC
Am Dienstag, den 25.03.2008, 21:42 -0700 schrieb Arjan van de Ven:
> Suresh Siddha wrote:
> > On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
> >> (switched to email.  Please respond via emailed reply-to-all, not via the
> >> bugzilla web interface).
> >>
> >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> >> bugme-daemon@bugzilla.kernel.org wrote:
> >>
> >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> >>>
> >>>            Summary: [regression] performance drop for glx
> >>>
> >>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some
> grafik
> >>> related perfomance issues.
> >>>
> >>> I used glxgears for test.
> >>> before this patch: 1281.005 FPS
> >>> and after: 765.000 FPS
> >> It nearly halved.
> >>
> >>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> >>> still hase bad performance.
> >>>
> >>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
> >>>
> >> That's 
> >>
> >> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
> >> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
> >> : Date:   Wed Jan 30 13:33:43 2008 +0100
> >> : 
> >> :     x86: set strong uncacheable where UC is really desired
> >> :     
> >> :     Also use _PAGE_PWT for all the mappings which need uncache mapping.
> >> :     Instead of existing PAT2 which is UC- (and can be overwritten by
> MTRRs),
> >> :     we now use PAT3 which is strong uncacheable.
> >> :     
> >> :     This makes it consistent with pgprot_noncached()
> > 
> > Alexey, Can you please try the appended patch?
> > 
> > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > ---
> > 


This issue come again with commit:

commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
Date:   Tue Mar 18 17:00:14 2008 -0700

    x86: PAT infrastructure patch

    Sets up pat_init() infrastructure.

    PAT MSR has following setting.
        PAT
        |PCD
        ||PWT
        |||
        000 WB          _PAGE_CACHE_WB
        001 WC          _PAGE_CACHE_WC
        010 UC-         _PAGE_CACHE_UC_MINUS
        011 UC          _PAGE_CACHE_UC

    We are effectively changing WT from boot time setting to WC.
    UC_MINUS is used to provide backward compatibility to existing /dev/mem
    users(X).

    reserve_memtype and free_memtype are new interfaces for maintaining
alias-free
    mapping. It is currently implemented in a simple way with a linked list and
    not optimized. reserve and free tracks the effective memory type, as a
result
    of PAT and MTRR setting rather than what is actually requested in PAT.

    pat_init piggy backs on mtrr_init as the rules for setting both pat and
mtrr
    are same.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
Comment 17 Suresh B Siddha 2008-04-25 17:07:32 UTC
On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
> > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> > >> bugme-daemon@bugzilla.kernel.org wrote:
> > >>
> > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> > >>>
> > >>>            Summary: [regression] performance drop for glx
> > > 
> > > Alexey, Can you please try the appended patch?
> > > 
> > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > > ---
> > > 
> 
> 
> This issue come again with commit:
> 
> commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
> Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
> Date:   Tue Mar 18 17:00:14 2008 -0700
> 
>     x86: PAT infrastructure patch
> 
>     Sets up pat_init() infrastructure.
> 

Alexey, Sorry. I missed posting a fix in the post PAT series for this issue. Can you
please try the appended patch and Ack?

copying X folks get their attention and their Ack on the patch and next steps outlined.

Thanks.
---

[patch] use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()

Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().

To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default non write-combining mapping request.

Next steps:
	a) change all the video drivers using ioremap() or ioremap_nocache()
	   and adding WC MTTR using mttr_add() to ioremap_wc()

	b) for strict usage, we can go back to strong uc semantics
	   for ioremap() and ioremap_nocache() after some grace period for
	   completing step-a.

	c) user level X server needs to use the appropriate method for setting
	   up WC mapping (like using resourceX_wc sysfs file instead of
	   adding MTRR for WC and using /dev/mem or resourceX under /sys)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 3a4baf9..c053977 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
 		/*
 		 * Do not fallback to certain memory types with certain
 		 * requested type:
-		 * - request is uncached, return cannot be write-back
-		 * - request is uncached, return cannot be write-combine
+		 * - request is uc-, return cannot be write-back
+		 * - request is uc-, return cannot be write-combine
 		 * - request is write-combine, return cannot be write-back
 		 */
-		if ((prot_val == _PAGE_CACHE_UC &&
+		if ((prot_val == _PAGE_CACHE_UC_MINUS &&
 		     (new_prot_val == _PAGE_CACHE_WB ||
 		      new_prot_val == _PAGE_CACHE_WC)) ||
 		    (prot_val == _PAGE_CACHE_WC &&
@@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
 	default:
 		prot = PAGE_KERNEL_NOCACHE;
 		break;
+	case _PAGE_CACHE_UC_MINUS:
+		prot = PAGE_KERNEL_UC_MINUS;
+		break;
 	case _PAGE_CACHE_WC:
 		prot = PAGE_KERNEL_WC;
 		break;
@@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
  */
 void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
 {
-	return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
+	/*
+	 * Ideally, this should be:
+	 * 	pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
+	 *
+	 * Till we fix all X drivers to use ioremap_wc(), we will use
+	 * UC MINUS.
+	 */
+	unsigned long val = _PAGE_CACHE_UC_MINUS;
+
+	return __ioremap(phys_addr, size, val);
 }
 EXPORT_SYMBOL(ioremap_nocache);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c29ebd0..ad3202c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -779,14 +779,20 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,
 
 int _set_memory_uc(unsigned long addr, int numpages)
 {
+	/*
+	 * for now UC MINUS. see comments in ioremap_nocache()
+	 */
 	return change_page_attr_set(addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC));
+				    __pgprot(_PAGE_CACHE_UC_MINUS));
 }
 
 int set_memory_uc(unsigned long addr, int numpages)
 {
+	/*
+	 * for now UC MINUS. see comments in ioremap_nocache()
+	 */
 	if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
-	                    _PAGE_CACHE_UC, NULL))
+	                    _PAGE_CACHE_UC_MINUS, NULL))
 		return -EINVAL;
 
 	return _set_memory_uc(addr, numpages);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 94f6c73..8af0f0b 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 	prot = pgprot_val(vma->vm_page_prot);
 	if (pat_wc_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
+	else if (pat_wc_enabled)
+		/*
+		 * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
+		 * To avoid attribute conflicts, request UC MINUS here
+		 * aswell.
+		 */
+		prot |= _PAGE_CACHE_UC_MINUS;
 	else if (boot_cpu_data.x86 > 3)
 		prot |= _PAGE_CACHE_UC;
 
@@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		 * - request is uncached, return cannot be write-combine
 		 * - request is write-combine, return cannot be write-back
 		 */
-		if ((flags == _PAGE_CACHE_UC &&
-		     (new_flags == _PAGE_CACHE_WB ||
-		      new_flags == _PAGE_CACHE_WC)) ||
+		if ((flags == _PAGE_CACHE_UC_MINUS &&
+		     (new_flags == _PAGE_CACHE_WB)) ||
 		    (flags == _PAGE_CACHE_WC &&
 		     new_flags == _PAGE_CACHE_WB)) {
 			free_memtype(addr, addr+len);
Comment 18 Oleksij Rempel (fishor) 2008-04-26 04:44:10 UTC
Am Freitag, den 25.04.2008, 17:07 -0700 schrieb Suresh Siddha:
> On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
> > > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> > > >> bugme-daemon@bugzilla.kernel.org wrote:
> > > >>
> > > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> > > >>>
> > > >>>            Summary: [regression] performance drop for glx
> > > > 
> > > > Alexey, Can you please try the appended patch?
> > > > 
> > > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > > > ---
> > > > 
> > 
> > 
> > This issue come again with commit:
> > 
> > commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
> > Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
> > Date:   Tue Mar 18 17:00:14 2008 -0700
> > 
> >     x86: PAT infrastructure patch
> > 
> >     Sets up pat_init() infrastructure.
> > 
> 
> Alexey, Sorry. I missed posting a fix in the post PAT series for this issue.
> Can you
> please try the appended patch and Ack?
> 
> copying X folks get their attention and their Ack on the patch and next steps
> outlined.
> 
> Thanks.
> ---
> 
> [patch] use UC minus for ioremap(), ioremap_nocache() and
> pci_mmap_page_range()
> 
> Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
> Once all the X drivers move to ioremap_wc(), we can go back to strong
> UC semantics for ioremap() and ioremap_nocache().
> 

Thank you. This patch working.

Ack ?
Comment 19 Ingo Molnar 2008-04-28 09:35:55 UTC
* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> Alexey, Sorry. I missed posting a fix in the post PAT series for this 
> issue. Can you please try the appended patch and Ack?
> 
> copying X folks get their attention and their Ack on the patch and 
> next steps outlined.

i queued up the patch below - hand-merged to x86.git.

	Ingo

----------->
Subject: x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
From: Suresh Siddha <suresh.b.siddha@intel.com>
Date: Fri, 25 Apr 2008 17:07:22 -0700

Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().

To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default non write-combining mapping request.

Next steps:
	a) change all the video drivers using ioremap() or ioremap_nocache()
	   and adding WC MTTR using mttr_add() to ioremap_wc()

	b) for strict usage, we can go back to strong uc semantics
	   for ioremap() and ioremap_nocache() after some grace period for
	   completing step-a.

	c) user level X server needs to use the appropriate method for setting
	   up WC mapping (like using resourceX_wc sysfs file instead of
	   adding MTRR for WC and using /dev/mem or resourceX under /sys)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/mm/ioremap.c  |   20 ++++++++++++++++----
 arch/x86/mm/pageattr.c |   10 ++++++++--
 arch/x86/pci/i386.c    |   12 +++++++++---
 3 files changed, 33 insertions(+), 9 deletions(-)

Index: linux-x86.q/arch/x86/mm/ioremap.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/ioremap.c
+++ linux-x86.q/arch/x86/mm/ioremap.c
@@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_
 		/*
 		 * Do not fallback to certain memory types with certain
 		 * requested type:
-		 * - request is uncached, return cannot be write-back
-		 * - request is uncached, return cannot be write-combine
+		 * - request is uc-, return cannot be write-back
+		 * - request is uc-, return cannot be write-combine
 		 * - request is write-combine, return cannot be write-back
 		 */
-		if ((prot_val == _PAGE_CACHE_UC &&
+		if ((prot_val == _PAGE_CACHE_UC_MINUS &&
 		     (new_prot_val == _PAGE_CACHE_WB ||
 		      new_prot_val == _PAGE_CACHE_WC)) ||
 		    (prot_val == _PAGE_CACHE_WC &&
@@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_
 	default:
 		prot = PAGE_KERNEL_NOCACHE;
 		break;
+	case _PAGE_CACHE_UC_MINUS:
+		prot = PAGE_KERNEL_UC_MINUS;
+		break;
 	case _PAGE_CACHE_WC:
 		prot = PAGE_KERNEL_WC;
 		break;
@@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_
  */
 void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
 {
-	return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
+	/*
+	 * Ideally, this should be:
+	 * 	pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
+	 *
+	 * Till we fix all X drivers to use ioremap_wc(), we will use
+	 * UC MINUS.
+	 */
+	unsigned long val = _PAGE_CACHE_UC_MINUS;
+
+	return __ioremap(phys_addr, size, val);
 }
 EXPORT_SYMBOL(ioremap_nocache);
 
Index: linux-x86.q/arch/x86/mm/pageattr.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pageattr.c
+++ linux-x86.q/arch/x86/mm/pageattr.c
@@ -829,14 +829,20 @@ static inline int change_page_attr_clear
 
 int _set_memory_uc(unsigned long addr, int numpages)
 {
+	/*
+	 * for now UC MINUS. see comments in ioremap_nocache()
+	 */
 	return change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC), 0);
+				    __pgprot(_PAGE_CACHE_UC_MINUS), 0);
 }
 
 int set_memory_uc(unsigned long addr, int numpages)
 {
+	/*
+	 * for now UC MINUS. see comments in ioremap_nocache()
+	 */
 	if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
-	                    _PAGE_CACHE_UC, NULL))
+			    _PAGE_CACHE_UC_MINUS, NULL))
 		return -EINVAL;
 
 	return _set_memory_uc(addr, numpages);
Index: linux-x86.q/arch/x86/pci/i386.c
===================================================================
--- linux-x86.q.orig/arch/x86/pci/i386.c
+++ linux-x86.q/arch/x86/pci/i386.c
@@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *
 	prot = pgprot_val(vma->vm_page_prot);
 	if (pat_wc_enabled && write_combine)
 		prot |= _PAGE_CACHE_WC;
+	else if (pat_wc_enabled)
+		/*
+		 * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
+		 * To avoid attribute conflicts, request UC MINUS here
+		 * aswell.
+		 */
+		prot |= _PAGE_CACHE_UC_MINUS;
 	else if (boot_cpu_data.x86 > 3)
 		prot |= _PAGE_CACHE_UC;
 
@@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *
 		 * - request is uncached, return cannot be write-combine
 		 * - request is write-combine, return cannot be write-back
 		 */
-		if ((flags == _PAGE_CACHE_UC &&
-		     (new_flags == _PAGE_CACHE_WB ||
-		      new_flags == _PAGE_CACHE_WC)) ||
+		if ((flags == _PAGE_CACHE_UC_MINUS &&
+		     (new_flags == _PAGE_CACHE_WB)) ||
 		    (flags == _PAGE_CACHE_WC &&
 		     new_flags == _PAGE_CACHE_WB)) {
 			free_memtype(addr, addr+len);