Bug 101021 - Multicore scheduling completely broken - Only one core being used at all times
Status: RESOLVED CODE_FIX
Alias: None
Product: Process Management
Classification: Unclassified
Component: Scheduler
Hardware: All Linux
Importance: P1 blocking
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-07-06 01:09 UTC by Christian
Modified: 2015-12-05 10:45 UTC
CC List: 8 users

See Also:
Kernel Version: v4.2.0-rc1+
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
scenario 1 and scenario 2. (463.37 KB, application/gzip)
2015-07-06 01:09 UTC, Christian
Working configuration after software fix for 4.2.0 (167.23 KB, application/octet-stream)
2015-09-10 11:16 UTC, Christian

Description Christian 2015-07-06 01:09:45 UTC
Created attachment 181981
scenario 1 and scenario 2.

I ran a comparison between v4.0.1 and v4.2.0-rc1 to demonstrate the issue.
Only one core is being used; the other cores are online but stay idle all the time.

A complete set of logs is available in the attachment:

scenario 1: everything works: log-vmlinuz-4.0.1.log
log: debug/log-vmlinuz-4.0.1.log

scenario 2: only 1 out of 8 cores being utilized: 
log: debug/log-vmlinuz-4.2.0-rc1+.log
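
For anyone who wants to check the symptom without the attached logs, here is a minimal Python sketch (not part of the original report) that compares two snapshots of /proc/stat and reports how busy each CPU was in between. The function names are made up for illustration, and the code assumes the usual Linux /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal). On an affected kernel one would expect cpu0 to be close to 1.0 under load and the remaining CPUs close to 0.0.

```python
def parse_proc_stat(text):
    """Map each per-CPU line of /proc/stat to (busy_ticks, total_ticks)."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpu0", "cpu1", ...; skip the aggregate "cpu" line and non-CPU lines.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        # First eight counters: user nice system idle iowait irq softirq steal.
        ticks = [int(v) for v in fields[1:9]]
        idle = sum(ticks[3:5])  # idle + iowait
        total = sum(ticks)
        result[fields[0]] = (total - idle, total)
    return result


def busy_fraction(before_text, after_text):
    """Return {cpu: fraction of time busy} between two /proc/stat snapshots."""
    before = parse_proc_stat(before_text)
    after = parse_proc_stat(after_text)
    fractions = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        # A CPU with no elapsed ticks is reported as idle rather than dividing by zero.
        fractions[cpu] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return fractions
```

To use it, read /proc/stat twice a few seconds apart while a parallel build is running and pass both strings to busy_fraction().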

Notes:
Both kernel versions have the same configuration and ran on the same machine.

configuration files are inside the compressed logs:
 -config-4.0.1
 -config-4.2.0-rc1+

CPU Info:
cat cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 16
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
apicid          : 17
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
apicid          : 18
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 19
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 4
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 4
cpu cores       : 4
apicid          : 20
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 5
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 5
cpu cores       : 4
apicid          : 21
initial apicid  : 5
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 6
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 6
cpu cores       : 4
apicid          : 22
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 7
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX-8320E Eight-Core Processor              
stepping        : 0
microcode       : 0x6000822
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 7
cpu cores       : 4
apicid          : 23
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1
bugs            : fxsave_leak
bogomips        : 6935.51
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

runtracer proc #
Comment 1 Martin Wohlert 2015-09-01 15:50:12 UTC
This bug is also in the final release of 4.2.0!

It's affecting all my devices (AMD, Intel and ARM). 4.1.x was using all cores; with 4.2.0 all cores are visible, but only the first is used for all threads.

So what happened here? Is it configurable?
Comment 2 Christian 2015-09-01 16:04:54 UTC
Hello Martin,
This is not a configuration problem; the exact same configuration works in 4.0.1.
I haven't had the time to post all mandatory kernel bug logs for this issue.
The latest working kernel version I have is 4.0.1, which can be used for comparison purposes.
Someone needs to track the changes in CPU scheduling from 4.0.1 to latest, but the fix should be trivial.
Comment 3 Christian 2015-09-01 16:06:46 UTC
Correction: since you saw this work in 4.1.x, it is best to track the changes between 4.1.x and 4.2.0 instead.
Comment 4 Ognian Tenchev 2015-09-02 15:09:13 UTC
I have the same issue. Last working kernel for me was 4.1.6.

With 4.2.0 only one core out of four works. I use the same configuration (cat /proc/config.gz >.config).

I also tried with a new config (make mrproper ; make menuconfig) - no success.
Comment 5 Christian 2015-09-02 15:56:37 UTC
Hello Ognian, Martin,
Are you both using systemd as your system management?
Comment 6 Ognian Tenchev 2015-09-02 15:59:50 UTC
No. I'm using openrc.
Comment 7 Christian 2015-09-02 16:00:59 UTC
Nice, so we can rule that out then.
Comment 8 Martin Wohlert 2015-09-02 17:14:18 UTC
I'm using systemd on almost all systems, but one with openrc is also affected.
Comment 9 Christian 2015-09-02 17:22:08 UTC
Hello Martin,
Thanks for your response.
I initially thought this could be a broken interaction with systemd of some sort, but I'm glad to have your input. Now we know:
(1) Not hardware specific.
(2) Affects versions above 4.1.6.
(3) Not caused by or related to systemd.
I changed the bug accordingly and CCed the newly affected groups.
Thanks,
Christian
Comment 10 Christian 2015-09-02 17:52:30 UTC
The following commit from Linus Torvalds fixed the issue on my machine:

commit 8bdc69b764013a9b5ebeef7df8f314f1066c5d79
Merge: 76ec51e 20f1f4b
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Sep 2 08:04:23 2015 -0700

    Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
    
    Pull cgroup updates from Tejun Heo:
    
     - a new PIDs controller is added.  It turns out that PIDs are actually
       an independent resource from kmem due to the limited PID space.
    
     - more core preparations for the v2 interface.  Once cpu side interface
       is settled, it should be ready for lifting the devel mask.
       for-4.3-unified-base was temporarily branched so that other trees
       (block) can pull cgroup core changes that blkcg changes depend on.
    
     - a non-critical idr_preload usage bug fix.
    
    * 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
      cgroup: pids: fix invalid get/put usage
      cgroup: introduce cgroup_subsys->legacy_name
      cgroup: don't print subsystems for the default hierarchy
      cgroup: make cftype->private a unsigned long
      cgroup: export cgrp_dfl_root
      cgroup: define controller file conventions
      cgroup: fix idr_preload usage
      cgroup: add documentation for the PIDs controller
      cgroup: implement the PIDs subsystem
      cgroup: allow a cgroup subsystem to reject a fork
Comment 11 Christian 2015-09-02 17:57:20 UTC
The commit above is available in version: 4.2.0 (v4.2-4366-g88f95e5)
Comment 12 Juergen Rose 2015-09-09 07:45:39 UTC
I see the same issue with the Gentoo version of linux-4.2.0 on all my AMD machines (2); all Intel machines I checked (5) are working fine.

The AMDs are:

Linux impala 4.2.0-gentoo-r1 #1 SMP Thu Sep 3 11:44:06 CEST 2015 x86_64 AMD Phenom(tm) II X4 965 Processor AuthenticAMD GNU/Linux

root@caiman:/usr/src/linux(78)# uname -a
Linux caiman 4.2.0-gentoo #1 SMP Mon Aug 31 08:33:41 CEST 2015 x86_64 AMD Phenom(tm) II X6 1090T Processor AuthenticAMD GNU/Linux
Comment 13 Christian 2015-09-09 08:24:03 UTC
Hello Juergen,
Check whether the commit above fixes your issue; it was included later, after 4.2.0.
Comment 14 Juergen Rose 2015-09-10 10:59:03 UTC
Hello Christian, 

I installed the kernel sources from git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup. I compiled the kernel with my standard options and booted with this kernel. Now I have:

 root@caiman:/root(2)# uname -a
Linux caiman 4.2.0+ #1 SMP Thu Sep 10 11:08:11 CEST 2015 x86_64 AMD Phenom(tm) II X6 1090T Processor AuthenticAMD GNU/Linux


Now I am compiling gcc with MAKEOPTS="-j7":


root@caiman:/root(8)# ps -eFl | head -n1
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
root@caiman:/root(9)# ps -eFl | grep cc
0 S rose      3058  3039  0  80   0 -  8953 SYSC_e  3356   0 12:42 ?        00:00:00 /usr/bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.
4 S root      4170  4156  0  80   0 -  8953 SYSC_e  3280   0 11:37 ?        00:00:00 /usr/bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.
4 S portage  14956  9180  0  80   0 -  1052 wait    1424   0 12:35 pts/1    00:00:00 [sys-devel/gcc-4.9.3] sandbox /usr/lib/portage/python2.7/ebuil
0 S portage  23518  6359  0  80   0 -  4688 wait    3208   0 12:56 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  23534 23518  4  80   0 - 32609 -      98260   0 12:56 pts/1    00:00:01 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S portage  23936  6359  0  80   0 -  4688 wait    3200   0 12:56 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  23937 23936  4  80   0 - 31341 -      94280   0 12:56 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S portage  24021  6359  0  80   0 -  4688 wait    3224   0 12:56 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  24022 24021  4  80   0 - 28045 -      80496   0 12:56 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S portage  24026  6359  0  80   0 -  4688 wait    3204   0 12:57 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  24028 24026  4  80   0 - 26993 -      75716   0 12:57 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S portage  24074  6359  0  80   0 -  4688 wait    3104   0 12:57 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  24075 24074  5  80   0 - 20221 -      45576   0 12:57 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S portage  24114  6359  0  80   0 -  4688 wait    3016   0 12:57 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 S portage  24115  6359  0  80   0 -  4688 wait    3048   0 12:57 pts/1    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linux-gnu-g++
0 R portage  24116 24114  5  80   0 - 15107 -      26632   0 12:57 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 R portage  24117 24115  5  80   0 - 14048 -      23108   0 12:57 pts/1    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -quiet -I .
0 S root     24123 10963  0  80   0 -  3738 pipe_w  2224   0 12:57 pts/3    00:00:00 grep --colour=auto cc

root@caiman:/root(11)# ps -eFl | grep make
0 S portage   6359  6335  0  80   0 - 10718 pipe_w 15996   0 12:44 pts/1    00:00:01 make DESTDIR= RPATH_ENVVAR=LD_LIBRARY_PATH TARGET_SUBDIR=x86_6
0 S portage  14992 14975  0  80   0 -  5193 wait    3828   0 12:35 pts/1    00:00:00 /bin/bash /usr/lib/portage/python2.7/ebuild-helpers/emake LDFL
0 S portage  14994 14992  0  80   0 -  7992 wait    4908   0 12:35 pts/1    00:00:00 make -j7 LDFLAGS=-Wl,-O1 -Wl,--as-needed STAGE1_CFLAGS= LIBPAT
0 S portage  15017 15005  0  80   0 -  7994 wait    5064   0 12:35 pts/1    00:00:00 make DESTDIR= RPATH_ENVVAR=LD_LIBRARY_PATH TARGET_SUBDIR=x86_6

All gcc processes are running on core 0, i.e., no progress so far.
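
The PSR column in the ps output above is the processor each task is currently assigned to. As a rough sketch (not from the original thread), the following Python function tallies tasks per PSR value from pasted `ps -eFl` output, header line included. count_by_psr is a hypothetical helper name, and the code assumes that no column before PSR contains spaces.

```python
from collections import Counter


def count_by_psr(ps_output, state=None):
    """Count tasks per PSR value in `ps -eFl` output (header line required).

    If `state` is given (e.g. "R"), only tasks in that state are counted.
    """
    lines = [line for line in ps_output.splitlines() if line.strip()]
    header = lines[0].split()
    psr_idx = header.index("PSR")  # position of the processor column
    state_idx = header.index("S")  # position of the process state column
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= psr_idx:
            continue  # truncated or malformed row
        if state is not None and fields[state_idx] != state:
            continue
        counts[int(fields[psr_idx])] += 1
    return counts
```

With the listing above, every row would be counted under processor 0; on a kernel that balances correctly the running cc1plus tasks should be spread over several processors.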
Comment 15 Martin Wohlert 2015-09-10 11:02:05 UTC
https://bugs.gentoo.org/show_bug.cgi?id=559382

josef.95 gave a hint that worked for me:

Try
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
Comment 16 Christian 2015-09-10 11:14:04 UTC
Hello Juergen,
The URL of your repository (ending with '/tj/cgroup') suggests you're using the cutting-edge, unstable version of the kernel for the 'cgroup' area, which is exactly where the problem is, so I'd try the official Linux branch instead:

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

I'll attach my configuration as well.
Please let me know if that works for you.
Comment 17 Christian 2015-09-10 11:16:46 UTC
Created attachment 187261
Working configuration after software fix for 4.2.0

This is the latest configuration that works well with the current version of the kernel: 4.2.0+
Comment 18 Juergen Rose 2015-09-11 06:23:47 UTC
(In reply to Christian from comment #16)
> Hello Juergen,
> The URL of your repository (ending with '/tj/cgroup') suggests you're using
> the cutting edge unstable version of the kernel for this area 'cgroup' which
> is exactly where the problem is, so I'd try the official linux branch
> instead:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> I'll attach my configuration as well.
> Please let me know if that works for you.

Now I cloned git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git,
compiled the kernel and booted the new kernel (still with my old .config).
When trying to compile gcc, the system seems to hang almost completely. Some messages
are still shown on the console:

[740.60..]  [<ffffff....>] entry_SYSCALL_64_fastpath ....
...
[920.60..]  [<ffffff....>] entry_SYSCALL_64_fastpath ....
[920.60..] rcu_sched kthread starved for 600092 jiffies!...
[1100.53..] INFO: rcu_sched detected stalls on CPUs/tasks:
Comment 19 Juergen Rose 2015-09-11 06:53:41 UTC
(In reply to Martin Wohlert from comment #15)
> https://bugs.gentoo.org/show_bug.cgi?id=559382
> 
> josef.95 gave a hint that worked for me:
> 
> Try
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set

I had the following NO_HZ settings:

root@caiman:/usr/src(3)# grep CONFIG_NO_HZ linux/.config
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y

I will now try 
CONFIG_NO_HZ_IDLE=y


BTW, 4.1.6-gentoo works like a charm with these settings.
Comment 20 Martin Wohlert 2015-09-11 07:07:17 UTC
Juergen Rose: make sure to unset CONFIG_NO_HZ_FULL when setting CONFIG_NO_HZ_IDLE=y

NO_HZ_FULL: "Full dynticks system (tickless)"
NO_HZ_IDLE: "Idle dynticks system (tickless idle)"
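
As a quick way to see which of the two tick modes a given configuration selects, here is a small Python sketch (an illustration, not something posted in this thread). It parses the text of a kernel .config and reports "full" when CONFIG_NO_HZ_FULL=y, "idle" when CONFIG_NO_HZ_IDLE=y, and "other" otherwise. The helper names parse_kconfig and nohz_mode are hypothetical.

```python
import re


def parse_kconfig(text):
    """Return {option: value} for a kernel .config; unset options map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def nohz_mode(text):
    """Classify the dynticks mode of a .config as "full", "idle" or "other"."""
    opts = parse_kconfig(text)
    if opts.get("CONFIG_NO_HZ_FULL") == "y":
        return "full"  # the setting reported as problematic in this thread
    if opts.get("CONFIG_NO_HZ_IDLE") == "y":
        return "idle"  # the workaround suggested above
    return "other"
```

For a running kernel the text can be read with gzip.open("/proc/config.gz", "rt").read(), provided the kernel was built with CONFIG_IKCONFIG_PROC; otherwise pass the contents of the .config used for the build.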
Comment 21 Juergen Rose 2015-09-11 07:35:42 UTC
(In reply to Martin Wohlert from comment #20)
> Juergen Rose: make sure to unset CONFIG_NO_HZ_FULL when setting
> CONFIG_NO_HZ_IDLE=y
> 
> NO_HZ_FULL: "Full dynticks system (tickless)"
> NO_HZ_IDLE: "Idle dynticks system (tickless idle)"

I am running now the kernel from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git with:

root@caiman:/root(2)# zgrep NO_HZ /proc/config.gz 
CONFIG_NO_HZ_COMMON=y
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
root@caiman:/root(3)# 


This looks much better. While compiling gcc, the cc processes are distributed across several cores:


root@caiman:/root(4)# ps -eFl | head -n1
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
root@caiman:/root(5)# ps -eFl | grep cc
0 S portage   3817  1337  0  80   0 -  4688 wait    3112   4 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 R portage   3818  3817 57  80   0 - 31640 -      92576   1 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 S portage   3846  1337  0  80   0 -  4688 wait    3172   0 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 R portage   3848  3846  0  80   0 - 18681 -      41800   0 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 S portage   3857  1337  0  80   0 -  4688 wait    3088   4 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 S portage   3858  1337  0  80   0 -  4688 wait    3208   5 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 R portage   3859  3858  0  80   0 - 19777 -      46012   5 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 R portage   3860  3857  0  80   0 - 16406 -      35336   4 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 S portage   3861  1337  0  80   0 -  4688 wait    3116   1 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 S portage   3862  1337  0  80   0 -  4688 wait    3064   3 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 S portage   3863  1337  0  80   0 -  4688 wait    3060   0 09:34 pts/0    00:00:00 /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.3/x86_64-pc-linu
0 R portage   3864  3861  0  80   0 - 14407 -      25728   0 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 R portage   3865  3862  0  80   0 - 15447 -      28520   2 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
0 R portage   3866  3863  0  80   0 - 15436 -      27628   3 09:34 pts/0    00:00:00 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.3/cc1plus -q
4 S root      5746  5732  0  80   0 -  8953 SYSC_e  2836   1 09:26 ?        00:00:00 /usr/bin/dbus-daemon --config-file=/etc/at-spi2/acces
4 S portage  10388  3314  0  80   0 -  1052 wait    1368   0 09:31 pts/0    00:00:00 [sys-devel/gcc-4.9.3] sandbox /usr/lib/portage/python
Comment 22 Martin Wohlert 2015-09-11 07:39:33 UTC
Juergen: CONFIG_NO_HZ_IDLE=y works with gentoo-sources-4.2.0 too
Comment 23 Juergen Rose 2015-09-11 08:05:04 UTC
(In reply to Martin Wohlert from comment #22)
> Juergen: CONFIG_NO_HZ_IDLE=y works with gentoo-sources-4.2.0 too

I am already compiling gentoo-sources-4.2.0-r1 with CONFIG_NO_HZ_IDLE=y, because I hope that this will work too. I will check the results this evening.
Comment 24 Alexey Shvetsov 2015-10-21 04:49:42 UTC
It's not fixed. I still see this issue on x86_64 and armv7l platforms.

Affected kernels: 4.2.x (all!) and 4.3-rcX (also all!)
Comment 25 Alexey Shvetsov 2015-10-25 13:22:02 UTC
Can someone reopen this bug, since it's not fixed?
Comment 26 Christian 2015-10-25 15:57:32 UTC
Hello Alexey, 
Please help by posting your logs for the issue.
Rgds,
Christian
Comment 27 Alexey Shvetsov 2015-10-26 04:33:27 UTC
What kind of logs do you need?

I tried 4.2.x and 4.3.x kernels on x86_64 and armv7l platforms. If the kernel is compiled with NO_HZ_FULL (Full dynticks), then all processes (even kernel threads!) will be placed on core 0.
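
One thing that may be worth ruling out when everything lands on core 0 is an explicit CPU affinity restriction. The following is a minimal Python sketch under that assumption, not something taken from this thread: it reads the Cpus_allowed_list field that Linux exposes in /proc/<pid>/status and expands it into CPU numbers. The helper names are illustrative. If the list still covers all CPUs while every task shows up on core 0, the tasks are not pinned by their affinity mask.

```python
def parse_cpu_list(spec):
    """Expand a kernel CPU list such as "0-3,5" into a sorted list of ints."""
    cpus = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus(status_text):
    """Extract Cpus_allowed_list from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    return []  # field not present in the given text
```

For the current process, the input would be the contents of /proc/self/status; for another task, replace "self" with its PID.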
Comment 28 Christian 2015-11-22 15:02:46 UTC
Resolved in 4.4.0-rc1.
Comment 29 Leho Kraav 2015-12-05 10:45:03 UTC
I'm pretty sure I'm seeing this in 4.2.6. All -j4 builds etc just live on core 0.

leho@gusto ~ $ [-] zgrep NO_HZ /proc/config.gz
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y
CONFIG_NO_HZ_FULL_SYSIDLE=y
CONFIG_NO_HZ_FULL_SYSIDLE_SMALL=8
# CONFIG_NO_HZ is not set

Going to test the alternative configurations presented here.
