Bug 15618

Summary: 2.6.18->2.6.32->2.6.33 huge regression in performance
Product: Process Management Reporter: Anton Starikov (ant.starikov)
Component: Other    Assignee: process_other
Status: RESOLVED CODE_FIX    
Severity: high CC: rjw
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 2.6.32 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: testcase

Description Anton Starikov 2010-03-23 16:13:16 UTC
We have benchmarked some multithreaded code here on a 16-core/4-way Opteron 8356 host on a number of kernels (see below) and found strange results.
Up to 8 threads we didn't see any noticeable differences in performance, but starting from 9 threads performance diverges substantially. I provide results for 14 threads here:

2.6.18-164.11.1.el5 (centos)

user time: ~60 sec
sys time: ~12 sec

2.6.32.9-70.fc12.x86_64 (fedora-12)

user time: ~60 sec
sys time: ~75 sec

2.6.33-0.46.rc8.git1.fc13.x86_64 (fedora-12 + rawhide kernel)

user time: ~60 sec
sys time: ~300 sec

In all three cases, real (wall-clock) time regresses in line with the numbers given.

The binary used in all three cases is exactly the same (compiled on CentOS).
The setups for all three cases are as identical as possible (the last two are the same Fedora 12 setup booted with different kernels).

What can be the reason for this performance regression? Is it possible to tune something to recover the performance of the 2.6.18 kernel?

I perf'ed on the 2.6.32.9-70.fc12.x86_64 kernel.

report (top part only):

43.64% dve22lts-mc [kernel] [k] _spin_lock_irqsave 
32.93% dve22lts-mc ./dve22lts-mc [.] DBSLLlookup_ret 
5.37% dve22lts-mc ./dve22lts-mc [.] SuperFastHash 
3.76% dve22lts-mc /lib64/libc-2.11.1.so [.] __GI_memcpy 
2.60% dve22lts-mc [kernel] [k] clear_page_c 
1.60% dve22lts-mc ./dve22lts-mc [.] index_next_dfs

stat: 
129875.554435 task-clock-msecs # 10.210 CPUs 
1883 context-switches # 0.000 M/sec 
17 CPU-migrations # 0.000 M/sec 
2695310 page-faults # 0.021 M/sec 
298370338040 cycles # 2297.356 M/sec 
130581778178 instructions # 0.438 IPC 
42517143751 cache-references # 327.368 M/sec 
101906904 cache-misses # 0.785 M/sec 

callgraph(top part only):

53.09%      dve22lts-mc  [kernel]                                         [k]
_spin_lock_irqsave
               |          
               |--49.90%-- __down_read_trylock
               |          down_read_trylock
               |          do_page_fault
               |          page_fault
               |          |          
               |          |--99.99%-- __GI_memcpy
               |          |          |          
               |          |          |--84.28%-- (nil)
               |          |          |          
               |          |          |--9.78%-- 0x100000000
               |          |          |          
               |          |           --5.94%-- 0x1
               |           --0.01%-- 
[...]

               |          
               |--49.39%-- __up_read
               |          up_read
               |          |          
               |          |--100.00%-- do_page_fault
               |          |          page_fault
               |          |          |          
               |          |          |--99.99%-- __GI_memcpy
               |          |          |          |          
               |          |          |          |--84.18%-- (nil)
               |          |          |          |          
               |          |          |          |--10.13%-- 0x100000000
               |          |          |          |          
               |          |          |           --5.69%-- 0x1
               |          |           --0.01%-- 
[...]

               |           --0.00%-- 
[...]

                --0.72%-- 
[...]



On 2.6.33 I see a similar picture with the spinlock, plus a lot of additional time spent in cgroup-related kernel calls.

If necessary, I can attach the binary for testing.
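
For reference, profiles of this shape are typically collected with commands along the following lines; the exact perf invocations used for the output above are not recorded in this report, so treat the program arguments as placeholders:

	# counter summary (the "stat" output above)
	perf stat -- ./dve22lts-mc <args>

	# flat profile and call graph (the "report" and "callgraph" output above)
	perf record -g -- ./dve22lts-mc <args>
	perf report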
Comment 1 Anton Starikov 2010-03-23 16:36:01 UTC
Created attachment 25659 [details]
testcase

I attach the testcase here.
Unpack it and cd into regression-testcase.
Then run it as ./RUNME NTHREADS.
The test isn't long; for 2 threads it takes about 30 seconds on a 2.4 GHz Opteron.
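
A sweep over thread counts can be scripted along these lines (a sketch; the time(1) wrapper and the particular thread counts are additions for illustration, not part of the testcase):

	cd regression-testcase
	# reproduce the divergence around 9+ threads by timing several runs
	for n in 2 4 8 9 12 14 16; do
	    echo "== $n threads =="
	    /usr/bin/time ./RUNME $n
	done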
Comment 2 Andrew Morton 2010-03-23 17:24:10 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 23 Mar 2010 16:13:25 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=15618
> 
>            Summary: 2.6.18->2.6.32->2.6.33 huge regression in performance
>            Product: Process Management
>            Version: 2.5
>     Kernel Version: 2.6.32
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>         AssignedTo: process_other@kernel-bugs.osdl.org
>         ReportedBy: ant.starikov@gmail.com
>         Regression: No
> 
> 
> We have benchmarked some multithreaded code here on 16-core/4-way opteron
> 8356
> host on number of kernels (see below) and found strange results.
> Up to 8 threads we didn't see any noticeable differences in performance, but
> starting from 9 threads performance diverges substantially. I provide here
> results for 14 threads

lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a
guess I'd say it's mmap_sem.  Perhaps with some assist from the CPU
scheduler.

If you change the config to set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
CONFIG_RWSEM_XCHGADD_ALGORITHM=y does it help?

Anyway, there's a testcase in bugzilla and it looks like we got us some
work to do.


> 2.6.18-164.11.1.el5 (centos)
> 
> user time: ~60 sec
> sys time: ~12 sec
> 
> 2.6.32.9-70.fc12.x86_64 (fedora-12)
> 
> user time: ~60 sec
> sys time: ~75 sec
> 
> 2.6.33-0.46.rc8.git1.fc13.x86_64 (fedora-12 + rawhide kernel)
> 
> user time: ~60 sec
> sys time: ~300 sec
> 
> In all three cases real time regress corresponding to giving numbers.
> 
> Binary used for all three cases is exactly the same (compiled on centos).
> Setups for all three cases so identical as possible (last two - the same
> fedora-12 setup booted with different kernels).
> 
> What can be reason of this regress in performance? Is it possible to tune
> something to recover performance on 2.6.18 kernel? 
> 
> I perf'ed on 2.6.32.9-70.fc12.x86_64 kernel
> 
> report (top part only):
> 
> 43.64% dve22lts-mc [kernel] [k] _spin_lock_irqsave 
> 32.93% dve22lts-mc ./dve22lts-mc [.] DBSLLlookup_ret 
> 5.37% dve22lts-mc ./dve22lts-mc [.] SuperFastHash 
> 3.76% dve22lts-mc /lib64/libc-2.11.1.so [.] __GI_memcpy 
> 2.60% dve22lts-mc [kernel] [k] clear_page_c 
> 1.60% dve22lts-mc ./dve22lts-mc [.] index_next_dfs
> 
> stat: 
> 129875.554435 task-clock-msecs # 10.210 CPUs 
> 1883 context-switches # 0.000 M/sec 
> 17 CPU-migrations # 0.000 M/sec 
> 2695310 page-faults # 0.021 M/sec 
> 298370338040 cycles # 2297.356 M/sec 
> 130581778178 instructions # 0.438 IPC 
> 42517143751 cache-references # 327.368 M/sec 
> 101906904 cache-misses # 0.785 M/sec 
> 
> callgraph(top part only):
> 
> 53.09%      dve22lts-mc  [kernel]                                         [k]
> _spin_lock_irqsave
>                |          
>                |--49.90%-- __down_read_trylock
>                |          down_read_trylock
>                |          do_page_fault
>                |          page_fault
>                |          |          
>                |          |--99.99%-- __GI_memcpy
>                |          |          |          
>                |          |          |--84.28%-- (nil)
>                |          |          |          
>                |          |          |--9.78%-- 0x100000000
>                |          |          |          
>                |          |           --5.94%-- 0x1
>                |           --0.01%-- 
> [...]
> 
>                |          
>                |--49.39%-- __up_read
>                |          up_read
>                |          |          
>                |          |--100.00%-- do_page_fault
>                |          |          page_fault
>                |          |          |          
>                |          |          |--99.99%-- __GI_memcpy
>                |          |          |          |          
>                |          |          |          |--84.18%-- (nil)
>                |          |          |          |          
>                |          |          |          |--10.13%-- 0x100000000
>                |          |          |          |          
>                |          |          |           --5.69%-- 0x1
>                |          |           --0.01%-- 
> [...]
> 
>                |           --0.00%-- 
> [...]
> 
>                 --0.72%-- 
> [...]
> 
> 
> 
> On 2.6.33 I see similar picture with spin-lock plus addition of a lot of time
> spent in cgroup related kernel calls.
> 
> If it is necessary, I can attach binary for tests.
> 
> -- 
> Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
Comment 3 Linus Torvalds 2010-03-23 17:49:53 UTC
On Tue, 23 Mar 2010, Ingo Molnar wrote:
> 
> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> overhead.

Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
the shit-for-brains generic version" thing, and it's fixed by

	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
	5d0b723 x86: clean up rwsem type system
	59c33fa x86-32: clean up rwsem inline asm statements

NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
compile his own kernel to test his load.

We could mark them as stable material if the load in question is a real 
load rather than just a test-case. On one of the random page-fault 
benchmarks the rwsem fix was something like a 400% performance 
improvement, and it was apparently visible in real life on some crazy SGI 
"initialize huge heap concurrently on lots of threads" load.

Side note: the reason the spinlock sucks is because of the fair ticket 
locks, it really does all the wrong things for the rwsem code. That's why 
old kernels don't show it - the old unfair locks didn't show the same kind 
of behavior.

			Linus
Comment 4 Anton Starikov 2010-03-23 17:58:18 UTC
On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>> 
>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>> overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
>       1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>       5d0b723 x86: clean up rwsem type system
>       59c33fa x86-32: clean up rwsem inline asm statements
> 
> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.

Thanks for info, I will try it now.

> We could mark them as stable material if the load in question is a real 
> load rather than just a test-case. On one of the random page-fault 
> benchmarks the rwsem fix was something like a 400% performance 
> improvement, and it was apparently visible in real life on some crazy SGI 
> "initialize huge heap concurrently on lots of threads" load.

It is not just a test-case, it is real-life code. With real-life problems on 2.6.32 and later :)


Anton.
Comment 5 Anton Starikov 2010-03-23 18:04:26 UTC
On Mar 23, 2010, at 7:00 PM, Ingo Molnar wrote:
>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>> compile his own kernel to test his load.
> 
> another option is to run the rawhide kernel via something like:
> 
>       yum update --enablerepo=development kernel
> 
> this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
> changes included.

I will apply these commits to 2.6.32; I'm afraid the current OFED (which I also need) will not work on 2.6.33+.

Anton.
Comment 6 Andrew Morton 2010-03-23 18:14:36 UTC
On Tue, 23 Mar 2010 18:34:09 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> 
> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> overhead.
> 

Yes.  Note that we fall off a cliff at nine threads on a 16-way.  As
soon as a core gets two threads scheduled onto it?  Probably triggered
by an MM change, possibly triggered by a sched change which tickled a
preexisting MM shortcoming.  Who knows.

Anton, we have an executable binary in the bugzilla report but it would
be nice to also have at least a description of what that code is
actually doing.  A quick strace shows quite a lot of mprotect activity.
A pseudo-code walkthrough, perhaps?

Thanks.
Comment 7 Anton Starikov 2010-03-23 18:20:06 UTC
On Mar 23, 2010, at 7:13 PM, Andrew Morton wrote:
> Anton, we have an executable binary in the bugzilla report but it would
> be nice to also have at least a description of what that code is
> actually doing.  A quick strace shows quite a lot of mprotect activity.
> A pseudo-code walkthrough, perhaps?


Right now I can't say too much about the code (we just gave a neighboring group a chance to run their code on our cluster, so I'm completely unfamiliar with it). I will forward your question to them.

But you can probably get more information (including sources) here: http://fmt.cs.utwente.nl/tools/ltsmin/

Anton
Comment 8 Andrew Morton 2010-03-23 18:22:46 UTC
On Tue, 23 Mar 2010 19:03:36 +0100
Anton Starikov <ant.starikov@gmail.com> wrote:

> 
> On Mar 23, 2010, at 7:00 PM, Ingo Molnar wrote:
> >> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> >> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> >> compile his own kernel to test his load.
> > 
> > another option is to run the rawhide kernel via something like:
> > 
> >     yum update --enablerepo=development kernel
> > 
> > this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
> > changes included.
> 
> I will apply this commits to 2.6.32, I afraid current OFED (which I need
> also) will not work on 2.6.33+.
> 

You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
Comment 9 Anton Starikov 2010-03-23 18:26:25 UTC
On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>> I will apply this commits to 2.6.32, I afraid current OFED (which I need
>> also) will not work on 2.6.33+.
>> 
> 
> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?

Hm. I tried, but when I do "make oldconfig" it gets rewritten, so I assume it conflicts with some other setting in the default Fedora kernel config. I'm trying to figure out which one exactly.

Anton.
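
(For what it's worth, the likely reason "make oldconfig" keeps rewriting these options is that on x86 they are not user-selectable: in arch/x86/Kconfig they are def_bool settings derived from X86_XADD, which at that time was only enabled on 32-bit. That is also why commit bafaecd, which adds the x86-64 Kconfig support for the xadd algorithm, is needed rather than editing .config by hand.)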
Comment 10 Ingo Molnar 2010-03-23 18:28:55 UTC
* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 23 Mar 2010 18:34:09 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > It shows a very brutal amount of page fault invoked mmap_sem spinning 
> > overhead.
> > 
> 
> Yes.  Note that we fall off a cliff at nine threads on a 16-way.  As soon as 
> a core gets two threads scheduled onto it?

it's AMD Opterons so no SMT.

My (wild) guess would be that 8 cpus can still do cacheline ping-pong 
reasonably efficiently, but it starts breaking down very seriously with 9 or 
more cores bouncing the same single cache-line.

Breakdowns in scalability are usually very non-linear, for hardware and 
software reasons. '8 threads' sounds like a hw limit to me. From the scheduler 
POV there's no big difference between 8 or 9 CPUs used [this is non-HT] - with 
8 or 7 cores still idle.

	Ingo
Comment 11 Ingo Molnar 2010-03-23 18:47:53 UTC
* Andrew Morton <akpm@linux-foundation.org> wrote:

> lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a guess 
> I'd say it's mmap_sem. [...]

Looks like we don't need to guess, just look at the call graph profile (a.k.a.
the smoking gun):

> > I perf'ed on 2.6.32.9-70.fc12.x86_64 kernel
> >
> > [...]
> >
> > callgraph(top part only):
> > 
> > 53.09%      dve22lts-mc  [kernel]                                        
> [k]
> > _spin_lock_irqsave
> >                |          
> >                |--49.90%-- __down_read_trylock
> >                |          down_read_trylock
> >                |          do_page_fault
> >                |          page_fault
> >                |          |          
> >                |          |--99.99%-- __GI_memcpy
> >                |          |          |          
> >                |          |          |--84.28%-- (nil)
> >                |          |          |          
> >                |          |          |--9.78%-- 0x100000000
> >                |          |          |          
> >                |          |           --5.94%-- 0x1
> >                |           --0.01%-- 
> > [...]
> > 
> >                |          
> >                |--49.39%-- __up_read
> >                |          up_read
> >                |          |          
> >                |          |--100.00%-- do_page_fault
> >                |          |          page_fault
> >                |          |          |          
> >                |          |          |--99.99%-- __GI_memcpy
> >                |          |          |          |          
> >                |          |          |          |--84.18%-- (nil)
> >                |          |          |          |          
> >                |          |          |          |--10.13%-- 0x100000000
> >                |          |          |          |          
> >                |          |          |           --5.69%-- 0x1
> >                |          |           --0.01%-- 
> > [...]

It shows a very brutal amount of page fault invoked mmap_sem spinning 
overhead.

> Perhaps with some assist from the CPU scheduler.

Doesn't look like it; the perf stat numbers show that the scheduler is only
very lightly involved:

  > > 129875.554435 task-clock-msecs # 10.210 CPUs 
  > >          1883 context-switches # 0.000 M/sec 
 
a context switch only every ~68 milliseconds.

	Ingo
Comment 12 Ingo Molnar 2010-03-23 18:48:09 UTC
* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, 23 Mar 2010, Ingo Molnar wrote:
> > 
> > It shows a very brutal amount of page fault invoked mmap_sem spinning 
> > overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
>       1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>       5d0b723 x86: clean up rwsem type system
>       59c33fa x86-32: clean up rwsem inline asm statements

Ah, indeed!

> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.

another option is to run the rawhide kernel via something like:

	yum update --enablerepo=development kernel

this will give kernel-2.6.34-0.13.rc1.git1.fc14.x86_64, which has those 
changes included.

OTOH that kernel has debugging [lockdep] enabled so it might not be 
comparable.

> We could mark them as stable material if the load in question is a real load 
> rather than just a test-case. On one of the random page-fault benchmarks the 
> rwsem fix was something like a 400% performance improvement, and it was 
> apparently visible in real life on some crazy SGI "initialize huge heap 
> concurrently on lots of threads" load.
> 
> Side note: the reason the spinlock sucks is because of the fair ticket 
> locks, it really does all the wrong things for the rwsem code. That's why 
> old kernels don't show it - the old unfair locks didn't show the same kind 
> of behavior.

Yeah.

	Ingo
Comment 13 Anton Starikov 2010-03-23 19:15:29 UTC
On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>> 
>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>> overhead.
> 
> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> the shit-for-brains generic version" thing, and it's fixed by
> 
>       1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>       5d0b723 x86: clean up rwsem type system
>       59c33fa x86-32: clean up rwsem inline asm statements
> 
> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> compile his own kernel to test his load.


I applied the patches mentioned. Things didn't improve much.

before:
prog: Total exploration time 9.880 real 60.620 user 76.970 sys

after:
prog: Total exploration time 9.020 real 59.430 user 66.190 sys

perf report:

    38.58%             prog  [kernel]                                           [k] _spin_lock_irqsave
    37.42%             prog  ./prog                                             [.] DBSLLlookup_ret
     6.22%             prog  ./prog                                             [.] SuperFastHash
     3.65%             prog  /lib64/libc-2.11.1.so                              [.] __GI_memcpy
     2.09%             prog  ./anderson.6.dve2C                                 [.] get_successors
     1.75%             prog  [kernel]                                           [k] clear_page_c
     1.73%             prog  ./prog                                             [.] index_next_dfs
     0.71%             prog  [kernel]                                           [k] handle_mm_fault
     0.38%             prog  ./prog                                             [.] cb_hook
     0.33%             prog  ./prog                                             [.] get_local
     0.32%             prog  [kernel]                                           [k] page_fault

Anton.
Comment 14 Anonymous Emailer 2010-03-23 19:18:39 UTC
Reply-To: peterz@infradead.org

On Tue, 2010-03-23 at 20:14 +0100, Anton Starikov wrote:
> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
> 
> > 
> > 
> > On Tue, 23 Mar 2010, Ingo Molnar wrote:
> >> 
> >> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> >> overhead.
> > 
> > Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> > the shit-for-brains generic version" thing, and it's fixed by
> > 
> >     1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> >     5d0b723 x86: clean up rwsem type system
> >     59c33fa x86-32: clean up rwsem inline asm statements
> > 
> > NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> > are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> > compile his own kernel to test his load.
> 
> 
> Applied mentioned patches. Things didn't improve too much.
> 
> before:
> prog: Total exploration time 9.880 real 60.620 user 76.970 sys
> 
> after:
> prog: Total exploration time 9.020 real 59.430 user 66.190 sys
> 
> perf report:
> 
>     38.58%             prog  [kernel]                                        
>       [k] _spin_lock_irqsave
>     37.42%             prog  ./prog                                          
>       [.] DBSLLlookup_ret
>      6.22%             prog  ./prog                                          
>        [.] SuperFastHash
>      3.65%             prog  /lib64/libc-2.11.1.so                           
>        [.] __GI_memcpy
>      2.09%             prog  ./anderson.6.dve2C                              
>        [.] get_successors
>      1.75%             prog  [kernel]                                        
>        [k] clear_page_c
>      1.73%             prog  ./prog                                          
>        [.] index_next_dfs
>      0.71%             prog  [kernel]                                        
>        [k] handle_mm_fault
>      0.38%             prog  ./prog                                          
>        [.] cb_hook
>      0.33%             prog  ./prog                                          
>        [.] get_local
>      0.32%             prog  [kernel]                                        
>        [k] page_fault

Could you verify with a callgraph profile what that spin_lock_irqsave()
is? If those rwsem patches were successful, mmap_sem should no longer
have a spinlock to contend on, in which case it might be another lock.

If not, something went wrong with backporting those patches.
Comment 15 Anton Starikov 2010-03-23 19:31:02 UTC
On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:

> On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>>>> I will apply this commits to 2.6.32, I afraid current OFED (which I need
>>>> also) will not work on 2.6.33+.
>>>> 
>>> 
>>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
>>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
>> 
>> Hm. I tried, but when I do "make oldconfig", then it gets rewritten, so I
>> assume that it conflicts with some other setting from default fedora kernel
>> config. trying to figure out which one exactly.
> 
> Have you tracked this down yet?  I just got the patches applied against
> an older kernel and am running into the same issue.

I decided not to track down this issue and just applied the patches. I understood that with these patches there is no need to change those config options. Am I wrong?

Anton
Comment 16 Anton Starikov 2010-03-23 19:42:58 UTC
I attach the callgraph here.

I also checked the kernel source; the code that was actually compiled is exactly what it should be after the patches.

Am I missing something?
Comment 17 Anton Starikov 2010-03-23 19:50:59 UTC
On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:

> On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
>> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
>>>> I will apply this commits to 2.6.32, I afraid current OFED (which I need
>>>> also) will not work on 2.6.33+.
>>>> 
>>> 
>>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
>>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
>> 
>> Hm. I tried, but when I do "make oldconfig", then it gets rewritten, so I
>> assume that it conflicts with some other setting from default fedora kernel
>> config. trying to figure out which one exactly.
> 
> Have you tracked this down yet?  I just got the patches applied against
> an older kernel and am running into the same issue.


I think you can prevent these options from being overwritten if you set them in arch/x86/configs/x86_64_defconfig.

Anton
Comment 18 Linus Torvalds 2010-03-23 19:57:57 UTC
On Tue, 23 Mar 2010, Andrew Morton wrote:
> 
> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?

No. Doesn't work. The XADD code simply never worked on x86-64, which is 
why those three commits I pointed at are required.

Oh, and you need one more commit (at least) in addition to the three I 
already mentioned - the one that actually adds the x86-64 wrappers and 
Kconfig option:

	bafaecd x86-64: support native xadd rwsem implementation

so the minimal list of commits (on top of 2.6.33) is at least

	59c33fa x86-32: clean up rwsem inline asm statements
	5d0b723 x86: clean up rwsem type system
	bafaecd x86-64: support native xadd rwsem implementation
	1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation

and I just verified that they at least cherry-pick cleanly (in that 
order). I _think_ it would be good to also do

	0d1622d x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write

but that one is a small detail, not anything fundamentally important.

			Linus
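
For anyone wanting to reproduce the backport, applying that series on top of a 2.6.33 tree would look roughly like this (the abbreviated commit IDs are the ones listed above; this is only a sketch and has not been tested here):

	git checkout -b rwsem-xadd-backport v2.6.33
	git cherry-pick 59c33fa   # x86-32: clean up rwsem inline asm statements
	git cherry-pick 5d0b723   # x86: clean up rwsem type system
	git cherry-pick bafaecd   # x86-64: support native xadd rwsem implementation
	git cherry-pick 1838ef1   # x86-64, rwsem: 64-bit xadd rwsem implementation
	git cherry-pick 0d1622d   # x86-64, rwsem: avoid store forwarding hazard in __downgrade_write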
Comment 19 Robin Holt 2010-03-23 19:58:12 UTC
On Tue, Mar 23, 2010 at 02:49:59PM -0500, Robin Holt wrote:
> On Tue, Mar 23, 2010 at 08:30:19PM +0100, Anton Starikov wrote:
> > 
> > On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:
> > 
> > > On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> > >> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> > >>>> I will apply this commits to 2.6.32, I afraid current OFED (which I
> need also) will not work on 2.6.33+.
> > >>>> 
> > >>> 
> > >>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> > >>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> > >> 
> > >> Hm. I tried, but when I do "make oldconfig", then it gets rewritten, so
> I assume that it conflicts with some other setting from default fedora kernel
> config. trying to figure out which one exactly.
> > > 
> > > Have you tracked this down yet?  I just got the patches applied against
> > > an older kernel and am running into the same issue.
> > 
> > I decided to not track down this issue and just applied patches. I
> understood that with this patches there is no need to change this config
> options. Am I wrong?
> 
> We might need to also apply:
> bafaecd11df15ad5b1e598adc7736afcd38ee13d

For the record, these are the patches I have applied to a 2.6.32 kernel from a vendor:

59c33fa7791e9948ba467c2b83e307a0d087ab49
5d0b7235d83eefdafda300656e97d368afcafc9a
1838ef1d782f7527e6defe87e180598622d2d071
0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
bafaecd11df15ad5b1e598adc7736afcd38ee13d

A quick look at the disassembly makes it look like we are using the
rwsem_64, et al.

Robin
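
A quick way to confirm which rwsem implementation a built kernel uses is to look for the xadd slow-path helpers; the symbol names below come from arch/x86/lib/rwsem_64.S added by the commits above, and this is only a sketch of the idea:

	# xadd implementation present: expect symbols like call_rwsem_down_read_failed
	grep call_rwsem System.map
	# the generic spinlock implementation instead exports lib/rwsem-spinlock.c
	# symbols such as __down_read_trylock / __up_read (the ones in the profiles above)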
Comment 20 Linus Torvalds 2010-03-23 19:59:28 UTC
On Tue, 23 Mar 2010, Anton Starikov wrote:

> 
> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
> 
> > 
> > 
> > On Tue, 23 Mar 2010, Ingo Molnar wrote:
> >> 
> >> It shows a very brutal amount of page fault invoked mmap_sem spinning 
> >> overhead.
> > 
> > Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
> > the shit-for-brains generic version" thing, and it's fixed by
> > 
> >     1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
> >     5d0b723 x86: clean up rwsem type system
> >     59c33fa x86-32: clean up rwsem inline asm statements
> > 
> > NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
> > are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
> > compile his own kernel to test his load.
> 
> 
> Applied mentioned patches. Things didn't improve too much.

Yeah, I missed at least one commit, namely

	bafaecd x86-64: support native xadd rwsem implementation

which is the one that actually makes x86-64 able to use the xadd version.

		Linus
Comment 21 Robin Holt 2010-03-23 20:16:41 UTC
On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> >> I will apply this commits to 2.6.32, I afraid current OFED (which I need
> also) will not work on 2.6.33+.
> >> 
> > 
> > You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> > CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> 
> Hm. I tried, but when I do "make oldconfig", then it gets rewritten, so I
> assume that it conflicts with some other setting from default fedora kernel
> config. trying to figure out which one exactly.

Have you tracked this down yet?  I just got the patches applied against
an older kernel and am running into the same issue.

Thanks,
Robin
Comment 23 Robin Holt 2010-03-23 20:26:40 UTC
On Tue, Mar 23, 2010 at 08:30:19PM +0100, Anton Starikov wrote:
> 
> On Mar 23, 2010, at 8:22 PM, Robin Holt wrote:
> 
> > On Tue, Mar 23, 2010 at 07:25:43PM +0100, Anton Starikov wrote:
> >> On Mar 23, 2010, at 7:21 PM, Andrew Morton wrote:
> >>>> I will apply this commits to 2.6.32, I afraid current OFED (which I need
> also) will not work on 2.6.33+.
> >>>> 
> >>> 
> >>> You should be able to simply set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> >>> CONFIG_RWSEM_XCHGADD_ALGORITHM=y by hand, as I mentioned earlier?
> >> 
> >> Hm. I tried, but when I do "make oldconfig", then it gets rewritten, so I
> assume that it conflicts with some other setting from default fedora kernel
> config. trying to figure out which one exactly.
> > 
> > Have you tracked this down yet?  I just got the patches applied against
> > an older kernel and am running into the same issue.
> 
> I decided to not track down this issue and just applied patches. I understood
> that with this patches there is no need to change this config options. Am I
> wrong?

We might need to also apply:
bafaecd11df15ad5b1e598adc7736afcd38ee13d

Robin
Comment 24 Anton Starikov 2010-03-23 20:44:36 UTC
I think we got a winner!

Problem seems to be fixed.

Just for the record, I used the following patches:

59c33fa7791e9948ba467c2b83e307a0d087ab49
5d0b7235d83eefdafda300656e97d368afcafc9a
1838ef1d782f7527e6defe87e180598622d2d071
4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
bafaecd11df15ad5b1e598adc7736afcd38ee13d
0d1622d7f526311d87d7da2ee7dd14b73e45d3fc


Thanks,
Anton.

On Mar 23, 2010, at 8:54 PM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
> 
>> 
>> On Mar 23, 2010, at 6:45 PM, Linus Torvalds wrote:
>> 
>>> 
>>> 
>>> On Tue, 23 Mar 2010, Ingo Molnar wrote:
>>>> 
>>>> It shows a very brutal amount of page fault invoked mmap_sem spinning 
>>>> overhead.
>>> 
>>> Isn't this already fixed? It's the same old "x86-64 rwsemaphores are using 
>>> the shit-for-brains generic version" thing, and it's fixed by
>>> 
>>>     1838ef1 x86-64, rwsem: 64-bit xadd rwsem implementation
>>>     5d0b723 x86: clean up rwsem type system
>>>     59c33fa x86-32: clean up rwsem inline asm statements
>>> 
>>> NOTE! None of those are in 2.6.33 - they were merged afterwards. But they 
>>> are in 2.6.34-rc1 (and obviously current -git). So Anton would have to 
>>> compile his own kernel to test his load.
>> 
>> 
>> Applied mentioned patches. Things didn't improve too much.
> 
> Yeah, I missed at least one commit, namely
> 
>       bafaecd x86-64: support native xadd rwsem implementation
> 
> which is the one that actually makes x86-64 able to use the xadd version.
> 
>               Linus
Comment 25 Anton Starikov 2010-03-23 21:20:03 UTC
Although the case is solved, I will post a description of the testcase program,
just in case someone wonders about it or would like to keep it for later tests.

------------------------------------------------------------------------
It is a parallel model checker. The command line you used does reachability
on the state space of the model anderson.6, meaning that it searches through
all possible states (int vectors). Each thread gets a vector from the queue,
calculates its successor states and puts them in a lock-less static hash
table (pseudo-BFS exploration, because the threads each have their own
queue).

How did Ingo run the binary? The static table size should be chosen to fit
into memory. "-s 27" allocates 2^27 * (|vector| + 1) * sizeof(int) bytes.
|vector| is equal to 19 for anderson.6, ergo the table size is 10 GB. This
could explain the huge number of page faults Ingo gets.

But anyway, you can imagine that the code is quite jumpy and has a big
memory footprint, so the page faults may also be normal.
------------------------------------------------------------------------
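
For reference, that arithmetic checks out: 2^27 * (19 + 1) * sizeof(int) = 134217728 * 20 * 4 bytes = 10,737,418,240 bytes, i.e. 10 GiB, consistent with the 10 GB figure above.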

On Mar 23, 2010, at 7:13 PM, Andrew Morton wrote:

> Anton, we have an executable binary in the bugzilla report but it would
> be nice to also have at least a description of what that code is
> actually doing.  A quick strace shows quite a lot of mprotect activity.
> A pseudo-code walkthrough, perhaps?
> 
> Thanks.
Comment 26 Rafael J. Wysocki 2010-03-23 22:07:39 UTC
Closing, since the problem has been solved in Linus' current tree.
Comment 27 Linus Torvalds 2010-03-23 23:09:51 UTC
On Tue, 23 Mar 2010, Anton Starikov wrote:
>
> I think we got a winner!
> 
> Problem seems to be fixed.
> 
> Just for record, I used next patches:
> 
> 59c33fa7791e9948ba467c2b83e307a0d087ab49
> 5d0b7235d83eefdafda300656e97d368afcafc9a
> 1838ef1d782f7527e6defe87e180598622d2d071
> 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
> bafaecd11df15ad5b1e598adc7736afcd38ee13d
> 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc

Ok. If you have performance numbers for before/after these patches for 
your actual workload, I'd suggest posting them to stable@kernel.org, and 
maybe those rwsem fixes will get back-ported.

The patches are pretty small, and should be fairly safe. So they are 
certainly stable material.

		Linus
Comment 28 Anton Starikov 2010-03-23 23:19:59 UTC
Tomorrow I will try to patch and check 2.6.33 to see whether these patches are enough to restore performance, because on the 2.6.33 kernel the performance issue also somehow involved cgroup business (and performance was terrible even compared to the broken 2.6.32). If it does not fix 2.6.33 I will ask to reopen the bug; otherwise I will post to stable@.

Thanks again for help,
Anton.

On Mar 24, 2010, at 12:04 AM, Linus Torvalds wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
>> 
>> I think we got a winner!
>> 
>> Problem seems to be fixed.
>> 
>> Just for record, I used next patches:
>> 
>> 59c33fa7791e9948ba467c2b83e307a0d087ab49
>> 5d0b7235d83eefdafda300656e97d368afcafc9a
>> 1838ef1d782f7527e6defe87e180598622d2d071
>> 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
>> bafaecd11df15ad5b1e598adc7736afcd38ee13d
>> 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
> 
> Ok. If you have performance numbers for before/after these patches for 
> your actual workload, I'd suggest posting them to stable@kernel.org, and 
> maybe those rwsem fixes will get back-ported.
> 
> The patches are pretty small, and should be fairly safe. So they are 
> certainly stable material.
> 
>               Linus
Comment 29 Ingo Molnar 2010-03-23 23:37:29 UTC
* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Tue, 23 Mar 2010, Anton Starikov wrote:
> >
> > I think we got a winner!
> > 
> > Problem seems to be fixed.
> > 
> > Just for record, I used next patches:
> > 
> > 59c33fa7791e9948ba467c2b83e307a0d087ab49
> > 5d0b7235d83eefdafda300656e97d368afcafc9a
> > 1838ef1d782f7527e6defe87e180598622d2d071
> > 4126faf0ab7417fbc6eb99fb0fd407e01e9e9dfe
> > bafaecd11df15ad5b1e598adc7736afcd38ee13d
> > 0d1622d7f526311d87d7da2ee7dd14b73e45d3fc
> 
> Ok. If you have performance numbers for before/after these patches for 
> your actual workload, I'd suggest posting them to stable@kernel.org, and 
> maybe those rwsem fixes will get back-ported.
> 
> The patches are pretty small, and should be fairly safe. So they are 
> certainly stable material.

We haven't had any stability problems with them, except one trivial build bug,
so -stable would be nice.

	Ingo
Comment 30 Linus Torvalds 2010-03-24 00:00:56 UTC
On Wed, 24 Mar 2010, Ingo Molnar wrote:
> 
> We havent had any stability problems with them, except one trivial build bug, 
> so -stable would be nice.

Oh, you're right. There was that UML build bug. But I think that was 
included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
breakage of UML from the changes in the rwsem system").

		Linus
Comment 31 Anton Starikov 2010-03-24 00:03:50 UTC
Yes, it is included in my list.
When I submit to stable, I will include it as well.

Anton

On Mar 24, 2010, at 12:55 AM, Linus Torvalds wrote:

> 
> 
> On Wed, 24 Mar 2010, Ingo Molnar wrote:
>> 
>> We havent had any stability problems with them, except one trivial build
>> bug, 
>> so -stable would be nice.
> 
> Oh, you're right. There was that UML build bug. But I think that was 
> included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
> breakage of UML from the changes in the rwsem system").
> 
>               Linus
Comment 32 Linus Torvalds 2010-03-24 03:05:43 UTC
On Wed, 24 Mar 2010, Andi Kleen wrote:
> 
> It would be also nice to get that change into 2.6.32 stable. That is
> widely used on larger systems.

Looking at the changes to the files in question, it looks like it should 
all apply cleanly to 2.6.32, so I don't see any reason not to backport 
further back.

Somebody should double-check, though.

		Linus
Comment 33 Anonymous Emailer 2010-03-24 03:07:19 UTC
Reply-To: andi@firstfloor.org

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 24 Mar 2010, Ingo Molnar wrote:
>> 
>> We havent had any stability problems with them, except one trivial build
>> bug, 
>> so -stable would be nice.
>
> Oh, you're right. There was that UML build bug. But I think that was 
> included in the list of commits Anton had - commit 4126faf0ab ("x86: Fix 
> breakage of UML from the changes in the rwsem system").

It would also be nice to get that change into 2.6.32-stable. That is
widely used on larger systems.

-Andi
Comment 34 Anonymous Emailer 2010-03-24 17:46:36 UTC
Reply-To: rdreier@cisco.com

 > I will apply this commits to 2.6.32, I afraid current OFED (which I
 > need also) will not work on 2.6.33+.

What do you need from OFED that is not in 2.6.34-rc1?
Comment 35 Anton Starikov 2010-03-26 03:25:38 UTC
On Mar 24, 2010, at 5:40 PM, Roland Dreier wrote:

>> I will apply this commits to 2.6.32, I afraid current OFED (which I
>> need also) will not work on 2.6.33+.
> 
> What do you need from OFED that is not in 2.6.34-rc1?

I didn't go to 2.6.34-rc1.
I tried 2.6.33; the mlx4 driver that comes with the kernel produces a panic on my hardware. And OFED-1.5 doesn't support this kernel (it can probably still be compiled; I didn't check).

Anton.
Comment 36 Lee Schermerhorn 2010-04-02 18:58:15 UTC
On Tue, 2010-03-23 at 10:22 -0400, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Tue, 23 Mar 2010 16:13:25 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=15618
> > 
> >            Summary: 2.6.18->2.6.32->2.6.33 huge regression in performance
> >            Product: Process Management
> >            Version: 2.5
> >     Kernel Version: 2.6.32
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Other
> >         AssignedTo: process_other@kernel-bugs.osdl.org
> >         ReportedBy: ant.starikov@gmail.com
> >         Regression: No
> > 
> > 
> > We have benchmarked some multithreaded code here on 16-core/4-way opteron
> 8356
> > host on number of kernels (see below) and found strange results.
> > Up to 8 threads we didn't see any noticeable differences in performance,
> but
> > starting from 9 threads performance diverges substantially. I provide here
> > results for 14 threads
> 
> lolz.  Catastrophic meltdown.  Thanks for doing all that work - at a
> guess I'd say it's mmap_sem.  Perhaps with some assist from the CPU
> scheduler.
> 
> If you change the config to set CONFIG_RWSEM_GENERIC_SPINLOCK=n,
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y does it help?
> 
> Anyway, there's a testcase in bugzilla and it looks like we got us some
> work to do.
> 
<snip>

I had an "opportunity" to investigate page fault behavior on 2.6.18+
[RHEL5.4] on an 8-socket Istanbul system earlier this year.  When I saw
this mail, I collected up the data I had from that adventure and ran
additional tests on 2.6.33 and 2.6.34-rc1.  I have attached plots for
what "per node" and "system wide" page fault scalability.

The per node plot [#1] shows the page fault rate of 1 to 6
[nr_cores_per_socket] tasks [processes] and threads faulting in a fixed
GB/task at the same time on a single socket.  The system wide plot [#3]
shows 1 to 48 [nr_sockets * nr_cores_per_socket] tasks and threads again
faulting in a fixed GB/task...   For the latter test, I load one core
per socket at a time, then add the 2nd core per socket, ...  In all
cases, the individual tasks/threads are fork()ed/pthread_create()d by a
parent bound to the cpu where they'll run to obtain node-local kernel
data structures.  The tests run with SCHED_FIFO.

I plot both "faults per wall clock second"--the aggregate rate--and
"faults per cpu second" or normalized rate.  The per node scalability
doesn't look all that different across the 3 releases, especially the
faults per cpu second curves.  However, in the system wide
multi-threaded tests, 2.6.33 is an anomaly compared to both 2.6.18+ and
2.6.34-rc1.  The 2.6.18+ and 2.6.34-rc1 multi-threaded tests show a lot
of noise and, of course, a much lower fault rate relative to the
multi-task tests.  I aborted the 2.6.33 system wide multi-threaded test
at 32 threads because it was just taking too long.

Unfortunately, with this many curves, the legends obscure much of the
plot.  So, rather than bloat this message any more, I've packaged up the
raw data along with plots with and without legends and placed the
tarball here:

	http://free.linux.hp.com/~lts/Pft/

That directory also contains the source for the version of the pft test
used, along with the scripts used to run the tests and plot the results.
Note that some manual editing of the "plot annotations" in the raw data
was required to generate several different plots from the same data.

The pft test is a highly, uh, "evolved" version of pft.c that Christoph
Lameter pointed me at a few years ago.  This version requires a patched
libnuma with the v2 api.  The required patch to the numactl-2.0.3
package is included in the test tarball.  [I've contacted Cliff about
getting the patch into 2.0.4.]

Lee
Comment 37 Greg Kroah-Hartman 2010-04-19 18:33:23 UTC
On Tue, Mar 23, 2010 at 08:00:54PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 24 Mar 2010, Andi Kleen wrote:
> > 
> > It would be also nice to get that change into 2.6.32 stable. That is
> > widely used on larger systems.
> 
> Looking at the changes to the files in question, it looks like it should 
> all apply cleanly to 2.6.32, so I don't see any reason not to backport 
> further back.
> 
> Somebody should double-check, though.

I have queued them all up for .33 and .32-stable kernel releases now.

thanks,

greg k-h