Bug 203037 - `make -j4` spawns less than four GCC instances and usually runs just one GCC instance
Status: RESOLVED MOVED
Alias: None
Product: Other
Classification: Unclassified
Component: Configuration
Hardware: x86-64 Linux
Importance: P1 high
Assignee: other_configuration@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-25 16:31 UTC by Artem S. Tashkinov
Modified: 2019-03-29 21:51 UTC

See Also:
Kernel Version: 5.0.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
4.20 config (104.81 KB, text/plain)
2019-03-25 16:31 UTC, Artem S. Tashkinov

Description Artem S. Tashkinov 2019-03-25 16:31:44 UTC
Created attachment 282011
4.20 config

Recently I've noticed that kernel compilation with the "-j4" flag is very inefficient on my four-core Intel Core i5 2500 CPU: it runs at most three GCC processes instead of four. On average I see just one(!) GCC process running during compilation (rarely two or three).

`make -j8` works much better and uses on average six cores.

My distro is fully updated Fedora 29.

Here's my complete compilation command:

# time ( make -j4 && make modules_install && /bin/cp -av arch/x86/boot/bzImage /boot/vmlinuz-5.0-ic64.x86_64 )

Top output while running this command:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ P COMMAND                                                                                                                
 4790 root      20   0   67200  44788  11924 R  10.0   0.3   0:00.15 0 cc1                                                                                                                    
 2188 root      20   0  247724  88940  66404 S   6.0   0.5   2:17.60 2 Xorg                                                                                                                   
 2525 birdie    20   0  341100  29364  23156 S   0.7   0.2   0:00.42 1 panel-28-weathe                                                                                                        
 2563 birdie    20   0  186036  23808  19184 S   0.7   0.1   0:05.07 2 panel-24-netloa                                                                                                        
 2763 birdie    20   0  348156  35944  25956 S   0.7   0.2   0:28.30 1 xfce4-terminal    

I haven't enabled any fancy power-saving options. Pretty much everything runs with default settings.

This doesn't seem right.
Comment 1 Linus Torvalds 2019-03-25 17:32:29 UTC
I think this is purely about "make" itself. 

"make" uses the "-j" setting to limit what it starts, but with recursive sub-make etc, it's not a trivial thing to do. I think GNU make still uses a special "jobserver" model for the recursive case that uses a pipe to effectively give out "tokens" to each job, and the -j option then says how many tokens can be in flight at any particular time.

But that token model has some latency, and 'make' also tries to take into account all the other things that are happening, which is not just the compilation, but also all of our (very complex) Makefile rules themselves.
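The token idea can be illustrated with a toy shell sketch. This is not GNU make's actual jobserver code: the FIFO, the `job` function, the `sleep 1` stand-in for a compile step, and the token count of 2 are all assumptions made for illustration, and it relies on Linux allowing a FIFO to be opened read-write.

```shell
#!/bin/sh
# Toy token pool: at most $tokens jobs run at once. All names are illustrative.
tokens=2
log=$(mktemp)
fifo=$(mktemp -u)
mkfifo "$fifo"
exec 3<>"$fifo"   # read-write open, so the open itself does not block (Linux)
rm -f "$fifo"     # the open descriptor keeps the pipe alive

# Fill the pipe with one byte per token.
i=0
while [ "$i" -lt "$tokens" ]; do
    printf '+' >&3
    i=$((i + 1))
done

job() {
    echo start >>"$log"
    sleep 1           # stand-in for a compile step
    echo end >>"$log"
    printf '+' >&3    # hand the token back
}

for n in 1 2 3 4 5; do
    dd bs=1 count=1 <&3 >/dev/null 2>&1   # block until a token is free
    job &
done
wait

# Peak number of overlapping jobs, reconstructed from the log.
peak=$(awk '/start/ {r++; if (r > m) m = r} /end/ {r--} END {print m + 0}' "$log")
finished=$(grep -c end "$log")
rm -f "$log"
echo "finished: $finished, peak concurrency: $peak"
```

Each job has to wait for a byte to come back through the pipe before it can start, which is where the latency mentioned above comes from.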

My personal model has always been to just give the "-j" limit as twice the number of CPUs you actually have, partly for IO, partly for slop, and partly because make tends to be conservative when it comes to load.

You might also try to use the "-l" option in addition to "-j", which limits parallelism by load.

IOW, with 4 CPUs, a reasonable 'make' command line might be

      make -l4 -j8

which also takes into account what else is happening on that machine (the load average obviously depends on whether there are _other_ loads going on).
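A minimal sketch of deriving those two limits from the CPU count, assuming `nproc` from GNU coreutils is available. The variable names are illustrative, and the command is printed rather than run so it can be inspected first.

```shell
#!/bin/sh
# Number of CPUs available to this process (nproc is part of GNU coreutils).
cpus=$(nproc)
# Job limit: twice the CPU count, following the rule of thumb above.
jobs=$((cpus * 2))
# Load limit: the CPU count itself, as in the "-l4 -j8" example.
echo "make -l${cpus} -j${jobs}"
```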

NOTE! It's entirely possible that some Kbuild change has made any 'make' parallelism heuristics work worse than they used to. If you can pinpoint when this started (perhaps with bisection?) that would be interesting. But it might also be about your version of 'make' itself etc etc.
Comment 2 Artem S. Tashkinov 2019-03-25 17:41:51 UTC
(In reply to Linus Torvalds from comment #1)
> I think this is purely about "make" itself. 

I should have noted that when I build kernel 4.17 under CentOS 6/7 there is no such problem: four GCC instances keep running. That makes me believe that either there is a bug in the make package in Fedora 29 or something changed in the kernel build system. To be absolutely sure, I will try to reproduce this while building kernel 5.0 under CentOS and let you know. I'm sorry for not giving you this info from the beginning.
Comment 3 Linus Torvalds 2019-03-25 17:52:11 UTC
As I noted, it's entirely possible that some Kbuild change has messed with the heuristics. If the effect of this is obvious, and you can script it somehow, you might even be able to automate finding the culprit with "git bisect run".

But yes, it might be just the version of 'make' itself or something like that.
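One way the scripting suggestion above might look is sketched below. Everything here is an assumption rather than something from this report: the 30-second warm-up, the 60 one-second samples, the threshold of 3, and the function names are hypothetical, and `pgrep` is assumed to come from procps. The script only builds when called with the argument `run`.

```shell
#!/bin/sh
# Hypothetical helper for "git bisect run": exit 0 (good) if the build keeps
# enough compiler processes busy, 1 (bad) otherwise, 125 (skip) if unusable.

# Count running cc1 processes (the compiler seen in the top output).
count_cc1() {
    pgrep -c -x cc1   # prints 0 when nothing matches
}

# Sample count_cc1 once per second for $1 seconds; print the summed counts.
sample_total() {
    total=0
    i=0
    while [ "$i" -lt "$1" ]; do
        total=$((total + $(count_cc1)))
        i=$((i + 1))
        sleep 1
    done
    echo "$total"
}

# Print 0 if the integer average total/samples reaches the minimum, else 1.
# $1 = total, $2 = number of samples, $3 = minimum acceptable average
verdict() {
    if [ $(( $1 / $2 )) -ge "$3" ]; then echo 0; else echo 1; fi
}

if [ "${1:-}" = "run" ]; then
    make clean >/dev/null 2>&1
    make olddefconfig >/dev/null 2>&1 || exit 125   # cannot configure: skip
    make -j4 >/dev/null 2>&1 &
    build=$!
    sleep 30                     # skip the mostly serial start of the build
    total=$(sample_total 60)
    kill "$build" 2>/dev/null    # ask the top-level make to stop
    wait "$build" 2>/dev/null
    exit "$(verdict "$total" 60 3)"
fi
```

Saved as, say, `check-parallelism.sh` (a hypothetical name), it could be driven with `git bisect run sh check-parallelism.sh run` after marking one good and one bad revision.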
Comment 4 Artem S. Tashkinov 2019-03-27 11:11:36 UTC
OK, I've verified that under 64-bit CentOS 7.x, make (GNU Make 3.82, built for x86_64-redhat-linux-gnu) keeps on average four GCC processes running while compiling Linux 5.0.4.

Fedora 29, where the issue is present, ships GNU Make 4.2.1 (built for x86_64-redhat-linux-gnu).

That's a bug ;-)

Linus, if you don't have the time and/or desire to figure it out, please CC the responsible developers who could look into the bug.

Thank you.
Comment 5 Linus Torvalds 2019-03-27 19:04:56 UTC
So the same kernel sources build fine with make 3.82, but the build doesn't parallelize well with make 4.2.1?

I have no idea about who maintains 'make' these days. But there seems to be a bug tracker at

  http://savannah.gnu.org/bugs/?group=make

that you could try making a report to.
Comment 6 Artem S. Tashkinov 2019-03-28 09:56:29 UTC
(In reply to Linus Torvalds from comment #5)
> So the same kernel sources build fine with make 3.82, but the build doesn't
> parallelize well with make 4.2.1?

That's correct.

> 
> I have no idea about who maintains 'make' these days. But there seems to be
> a bug tracker at
> 
>   http://savannah.gnu.org/bugs/?group=make
> 
> that you could try making a report to.

https://savannah.gnu.org/bugs/index.php?56019
Comment 7 Linus Torvalds 2019-03-29 21:51:52 UTC
Btw, you could check whether the recent change to avoid one level of recursive 'make' helps or makes any difference.

(See commit 688931a5ad4e "kbuild: skip sub-make for in-tree build with GNU Make 4.x").
