Bug 156261

Summary: kernel starts to hang partially resuming on user input
Product: Process Management
Reporter: Elmar Stellnberger (estellnb)
Component: Scheduler
Assignee: Ingo Molnar (mingo)
Status: NEW
Severity: high
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 4.8-rc4
Subsystem:
Regression: No
Bisected commit-id:
Attachments: screenshot of dmesg before crash

Description Elmar Stellnberger 2016-09-07 14:38:01 UTC
The latest issue discovered with 4.8.0-rc4 as well as 4.8.0-rc5+ is that the kernel sometimes starts to hang, so far only while I am playing music. Playback then resumes for a few seconds whenever I press keys or move the mouse, until the system finally hangs for good:

4.8.0-rc5+: when I hit some keys, music playback continued in short fragments for a while; in the end I had to do a hard reset.
4.8.0-rc4: apparently the same issue; when I pressed the SysRq sequence S-U-B, it still rebooted.

4.8.0-rc5+: commit d060e0f603a4156087813d221d818bb39ec91429
4.8.0-rc4: as tagged.

There was nothing in the logs, at least not in /var/log/messages. As far as I remember, -rc2/-rc1+ worked well.
Comment 1 Elmar Stellnberger 2016-09-07 14:40:11 UTC
Perhaps these two dmesg messages have something to do with a later hang:
[  203.704098] CE: hpet increased min_delta_ns to 20115 nsec
[  616.983496] perf: interrupt took too long (2508 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Comment 2 Elmar Stellnberger 2016-09-09 08:34:16 UTC
It actually has nothing to do with music playback; today it started to hang without it. The hang occurs very irregularly: yesterday I waited more than half a day to catch the error with 'nice -20 dmesg -w', but nothing happened.
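Because the hang can end in a hard reset, output that is only followed on a terminal is lost. A minimal sketch, assuming the kernel log is piped in line by line (for example from `dmesg -w`), that appends every line to a file and forces it to disk immediately. The function name `persist_lines` and the log path are hypothetical, not part of any existing tool:

```python
import os


def persist_lines(lines, path):
    """Append each line to `path` and fsync after every write.

    `lines` is any iterable of strings, e.g. sys.stdin when the script
    is fed by `dmesg -w`. Returns the number of lines written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            # Normalize the line ending so every message is on its own line.
            out.write(line if line.endswith("\n") else line + "\n")
            # Push Python's buffer to the OS, then ask the OS to write it out,
            # so the last message before a freeze has a chance to survive.
            out.flush()
            os.fsync(out.fileno())
            count += 1
    return count
```

Calling `persist_lines(sys.stdin, "/var/tmp/dmesg-follow.log")` from a small script fed by `dmesg -w` would be one way to use it (the path is only an example). Syncing after every line is slow, but kernel messages are infrequent enough that this should not matter here.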
Comment 3 Elmar Stellnberger 2016-09-09 11:52:53 UTC
Created attachment 232851
screenshot of dmesg before crash

More or less obviously related to nouveau; see also: https://bugs.freedesktop.org/show_bug.cgi?id=97614.
Comment 4 Elmar Stellnberger 2016-09-15 11:40:58 UTC
Still a problem with 4.8.0-rc6+ (5924bbecd0267d87c24110cbe2041b5075173a25), although I am not absolutely sure whether it is related to the screen distortions under nouveau. Last time I could escape via the SysRq keys, even though the system freeze occurred instantaneously.
Comment 5 Elmar Stellnberger 2016-09-21 10:07:13 UTC
The page_not_present errors seem to have disappeared now with kernel commit 7d1e042314619115153a0f6f06e4552c09a50e13 (4.8-rc7+). Some nouveau-related trapped read errors are still present in the dmesg output (see the freedesktop bug). I will report back soon on whether this resolves the recurring kernel hangs.