Bug 207245
| Summary: | huge CPU temperature increase from 5.2 to 5.5 ... and when using intel_pstate | | |
|---|---|---|---|
| Product: | Platform Specific/Hardware | Reporter: | Christoph Anton Mitterer (calestyo) |
| Component: | x86-64 | Assignee: | platform_x86_64 (platform_x86_64) |
| Status: | NEW --- | | |
| Severity: | blocking | | |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| URL: | https://lore.kernel.org/lkml/ce8097694ddfab616616f8f81521495d99c74416.camel@scientia.net/T/#re454505a9f78ec90a0e7d7810c42d103c32fd4e6 | | |
| Kernel Version: | anything beyond 5.2.x | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
Description
Christoph Anton Mitterer
2020-04-14 22:31:08 UTC
I've run some further, very extensive tests in the meantime, but these were mostly for clearly GPU-related issues, i.e. the problem that temperatures go through the roof when playing back any video. These were reported here: https://gitlab.freedesktop.org/drm/intel/-/issues/953#note_463451

I haven't made any plots or drawn conclusions from that new set of tests yet (I will keep this ticket updated once I have).

As for the general extreme temperature increase that I see since >5.2 (I mean even when doing non-graphics-intensive work like the unhide-brute or sha512 sum verify tests that I've described above), what I would try next is whether mitigations=off changes anything (it didn't for video playback).

I also found out about the nice features of perf record and perf report. I've played with them a bit already, and the first "results" showed that when I do anything (like just typing at the keyboard, quickly moving up/down in e.g. Evolution's mail list, or just Alt-Tabbing between windows), the number of events recorded increases by orders of magnitude(!!).

I'd be thankful for any guidance on what to actually test to better nail down the problem I see. Thanks!

I've upgraded to 5.5.17 (again the stock Debian sid package), and all future tests with 5.5.x will be with this. Problems unchanged.

I've also checked 5.5.17 with intel_pstate enabled while at the same time using: iommu=off mitigations=off pci=nomsi

I didn't repeat all tests as extensively as they are in the git repo, but I played back a video with mpv and did some casual work (Alt-Tab switching between windows, scrolling up/down in some windows, etc.). None of these parameters seems to help with my CPU temperature going through the roof.

I've added a number of plots for my test series specifically about video playback (mentioned in Comment 1) here: https://gitlab.freedesktop.org/drm/intel/issues/956#note_477885 and in the following comments.
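For the boot-parameter test above (iommu=off mitigations=off pci=nomsi), it can help to confirm that the options really reached the running kernel before comparing temperatures. The following is only a minimal sketch: the helper names `parse_cmdline`, `missing_params` and `read_running_cmdline` are hypothetical, and the parser assumes no quoted values containing spaces.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}.

    Bare flags such as "quiet" map to None. Quoted values that contain
    spaces are NOT handled (assumption: none are used here).
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline, expected):
    """Return names from `expected` that are absent or set to another value."""
    active = parse_cmdline(cmdline)
    return [name for name, value in expected.items()
            if name not in active or active[name] != value]


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel (Linux only)."""
    return Path(path).read_text()
```

On the test machine, `missing_params(read_running_cmdline(), {"iommu": "off", "mitigations": "off", "pci": "nomsi"})` would be expected to return an empty list if all three options are active.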
I'm not sure whether these plots give any good general clues, since they're mostly related to video acceleration (which is likely a bug of its own). Yet in these tests there is one notable exception that works more or less well, and that in turn might, for an expert, be a hint at what changed in kernels >5.2 to cause part or all of the overheating issues I've seen since then (which occur, as stated several times, even for non-GPU-related processes).

And just to point that out again: even in 5.2 and before, video playback never worked really well and caused quite high temperatures (though never anywhere close to 100°C; this in turn I tried only with cinnamon, so it could be something cinnamon-related). But now with >5.2, even non-fullscreen videos that take up just a few centimetres of the screen cause the temperatures to go through the roof, while with fullscreen, as you can see, it nearly always goes to 100°C.

btw: I've also added moving average plots to the git repo for my first test series, that is the one from the beginning of this ticket, where I compare 5.2 with 5.5, each with intel_pstate enabled and disabled, under a number of different workloads: GPU-intensive, GPU-related but not so intensive, and GPU-unrelated. These plots show more clearly how dramatic the temperature rise from 5.2 to 5.5 is, and also how big the difference is within each version between intel_pstate being enabled and disabled. https://github.com/calestyo/cpu-tests/commit/b256e5457b32dabc19035031073e2f09fe882cc4

Still completely broken in both 5.6 and 5.7 ... scrolling up a website in e.g. chromium gives me 96°C ... guess that's Intel hard&software :-D
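For anyone wanting to reproduce the moving average smoothing on their own temperature logs, here is a minimal sketch. It is not the script from the git repo: the function name `moving_average`, the trailing-window definition and the window size are all assumptions made for illustration.

```python
def moving_average(samples, window):
    """Trailing moving average over a list of temperature samples.

    Returns len(samples) - window + 1 values, or an empty list if there
    are fewer samples than the window size.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    if len(samples) < window:
        return []
    total = sum(samples[:window])
    averages = [total / window]
    for i in range(window, len(samples)):
        # Slide the window: add the newest sample, drop the oldest one.
        total += samples[i] - samples[i - window]
        averages.append(total / window)
    return averages
```

A larger window smooths out short spikes (e.g. from a single Alt-Tab) and makes the overall difference between two kernel versions easier to see, at the cost of hiding brief peaks.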