Bug 84131

Summary: [bisected] Hard lockups under high CPU load - RCU Related?
Product: Process Management
Reporter: Mike Lothian (mike)
Component: Preemption
Assignee: Robert Love (rlove)
Severity: high
CC: mike
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.17
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: Journalctl output
Dot config
Kabini dot config
Periodic timer ticks dot config

Description Mike Lothian 2014-09-08 19:01:47 UTC
Created attachment 149481 [details]
Journalctl output

I've been getting hard lockups on several machines since the start of the 3.17 rc cycle.

The lockups usually happen during periods of high load. The easiest way I've found to trigger the issue is compiling Chromium.

I bisected it down to 19d402c1e75077e2bcfe17f7fe5bcfc8deb74991, which is a merge rather than a regular commit.

git bisect start
git bisect bad fb762340e55638332407560396aea380b7af9cbf
git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6
git bisect bad ae045e2455429c418a418a3376301a9e5753a0a8
git bisect bad 53ee983378ff23e8f3ff95ecf99dea7c6c221900
git bisect good 2042088cd67d0064d18c52c13c69af2499907bb1
git bisect good 98959948a7ba33cf8c708626e0d2a1456397e1c6
git bisect good 6f929b4e5a022c3ca806c1675ccb833c42086853
git bisect bad 2521129a6d2fd8a81f99cf95055eddea3df914ff
git bisect good 7b9d1f0b7a18b86db0ac1de628fa91c0994fefbe
git bisect bad ce4747963252a30613ebf1c1df3d83b9526a342e
git bisect good fb86b2440de0ec10fe0272eb19d262ae7a01adb8
git bisect good fb86b2440de0ec10fe0272eb19d262ae7a01adb8
git bisect bad e9c9eecabaa898ff3fedd98813ee4ac1a00d006a
git bisect good 26bfa5f89486a8926cd4d4ca81a04d3f0f174934
git bisect good b08ee5f7e4135d64b8edd769367f8964a725122e
git bisect bad 19d402c1e75077e2bcfe17f7fe5bcfc8deb74991

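First bad commit: 19d402c1e75077e2bcfe17f7fe5bcfc8deb74991

I'm worried this result might not be accurate, because the hard lockup doesn't always trigger, so some of the commits I marked good may actually be bad.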
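
Since the lockup is intermittent, a single clean run is weak evidence that a commit is good. One possible mitigation is to repeat the test several times before marking a commit good. The following is only a minimal sketch: `repeat_ok` is a helper name invented here, and `./reproduce.sh` is a hypothetical reproducer that exits non-zero when the problem shows up. A real hard lockup freezes the machine, so this would only help if the test is driven from a second box or the failure can be detected some other way.

```shell
#!/bin/sh
# repeat_ok N CMD [ARGS...]
# Run CMD up to N times. Return 0 only if every run succeeds;
# return 1 as soon as one run fails.
repeat_ok() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
    return 0
}

# Hypothetical usage during a manual bisect (reproduce.sh is an
# assumed script, not something attached to this report):
#
#   if repeat_ok 5 ./reproduce.sh; then
#       git bisect good
#   else
#       git bisect bad
#   fi
```

More repetitions lower the chance of marking a bad commit good, at the cost of a longer bisection.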

I noticed there were some RCU updates yesterday, but they didn't fix things for me.

I'll attach the logs and .config for my main machine, where I see this issue the most.
Comment 1 Mike Lothian 2014-09-08 19:03:03 UTC
Created attachment 149491 [details]
Dot config
Comment 2 Mike Lothian 2014-09-08 19:42:33 UTC
I'm also getting hard freezes on my AMD Kabini system with a very similar config. However, on that machine I'm not getting any messages in the logs related to rcu_preempt.
Comment 3 Mike Lothian 2014-09-08 19:44:20 UTC
Created attachment 149511 [details]
Kabini dot config
Comment 4 Mike Lothian 2014-09-13 15:42:12 UTC
I can work around the issue by changing the timer tick handling from full dynticks to periodic timer ticks.
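
To see which tick mode a given kernel config selects, something like the sketch below may help. It assumes that "full dynticks" and "periodic timer ticks" refer to the Kconfig choices `CONFIG_NO_HZ_FULL` and `CONFIG_HZ_PERIODIC` (with `CONFIG_NO_HZ_IDLE` as the third option in the same choice). `tick_mode` is a helper name made up for this example.

```shell
#!/bin/sh
# tick_mode FILE
# Print the timer tick option that is enabled (=y) in a kernel .config.
# Lines such as "# CONFIG_HZ_PERIODIC is not set" are ignored.
tick_mode() {
    grep -E '^CONFIG_(HZ_PERIODIC|NO_HZ_IDLE|NO_HZ_FULL)=y$' "$1"
}

# Example (path is an assumption; use wherever the .config lives):
#   tick_mode /usr/src/linux/.config
```

With the workaround applied, the expectation is that this reports `CONFIG_HZ_PERIODIC=y` rather than `CONFIG_NO_HZ_FULL=y`.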
Comment 5 Mike Lothian 2014-09-13 15:42:53 UTC
Created attachment 150041 [details]
Periodic timer ticks dot config
Comment 6 Mike Lothian 2014-10-11 13:43:39 UTC
After reading a bit more closely, it looks like nohz full is an experimental feature.

Closing as invalid.