Bug 214329
| Summary: | Kernel NULL pointer dereference after 5cbba60596b1f32f637190ca9ed5b1acdadb852c | | |
|---|---|---|---|
| Product: | Power Management | Reporter: | Pablo Mendez Hernandez (pablomh) |
| Component: | intel_pstate | Assignee: | Srinivas Pandruvada (srinivas.pandruvada) |
| Status: | CLOSED CODE_FIX | | |
| Severity: | high | CC: | rui.zhang, srinivas.pandruvada |
| Priority: | P1 | | |
| Hardware: | Intel | | |
| OS: | Linux | | |
| Kernel Version: | 5.15.0-0.rc0 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | First part of screenshot, Second part of screenshot, Third part of screenshot, Fourth part of screenshot, Patch for 5.16 kernel | | |
Description
Pablo Mendez Hernandez
2021-09-06 11:28:12 UTC
Hardware is Lenovo P1 Gen 3 running Fedora 34 with Rawhide kernels:
- kernel-5.15.0-0.rc0.20210831gitb91db6a0b52e.1.fc36.x86_64: OK
- kernel-5.15.0-0.rc0.20210901git9e9fb7655ed5.2.fc36.x86_64: FAILS
- kernel-5.15.0-0.rc0.20210902git4ac6d90867a4.4.fc36.x86_64: FAILS
Created attachment 298679 [details]
First part of screenshot
Created attachment 298681 [details]
Second part of screenshot
Created attachment 298683 [details]
Third part of screenshot
Created attachment 298685 [details]
Fourth part of screenshot
It fixed the issue for me. Thanks! If you want to amend the commit message, you can mention that the issue also existed in P1 Gen3. You have my "Tested-by" if you consider it appropriate :)
Thanks for the test. We decided to revert the commit for this release. Please keep the bug open, I will attach the fix soon for the next release.
Sure, no problem.
Created attachment 298935 [details]
Patch for 5.16 kernel
Hi Pablo,
Please test the attached patch. I would like to add your tested-by after your tests.
Hi,
As discussed over email, I tested the patch and it's submitted already for inclusion in rafael's tree:
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=57577c996d731ce1e5a4a488e64e6e201b360847

commit 57577c996d731ce1e5a4a488e64e6e201b360847
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
AuthorDate: Tue Sep 28 09:42:17 2021 -0700
Commit: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CommitDate: Tue Oct 5 15:30:44 2021 +0200

cpufreq: intel_pstate: Process HWP Guaranteed change notification

It is possible that HWP guaranteed ratio is changed in response to change in power and thermal limits. For example when Intel Speed Select performance profile is changed or there is change in TDP, hardware can send notifications. It is possible that the guaranteed ratio is increased. This creates an issue when turbo is disabled, as the old limits set in MSR_HWP_REQUEST are still lower and hardware will clip to older limits.

This change enables HWP interrupt and process HWP interrupts. When guaranteed is changed, calls cpufreq_update_policy() so that driver callbacks are called to update to new HWP limits. This callback is called from a delayed workqueue of 10ms to avoid frequent updates.

Although the scope of IA32_HWP_INTERRUPT is per logical cpu, on some plaforms interrupt is generated on all CPUs. This is particularly a problem during initialization, when the driver didn't allocated data for other CPUs. So this change uses a cpumask of enabled CPUs and process interrupts on those CPUs only.

When the cpufreq offline() or suspend() callback is called, HWP interrupt is disabled on those CPUs and also cancels any pending work item.

Spin lock is used to protect data and processing shared with interrupt handler. Here READ_ONCE(), WRITE_ONCE() macros are used to designate shared data, even though spin lock act as an optimization barrier here.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: pablomh@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
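
For readers who want to see the gating idea from the commit message in isolation, here is a minimal userspace sketch in C. It is not the kernel patch: every name in it (`cpu_state`, `hwp_enabled_mask`, the `sim_*` functions) is hypothetical, a plain 64-bit integer stands in for the cpumask, a flag stands in for the 10ms delayed work item, and the spin lock is omitted because the simulation is single-threaded. It only illustrates three points the message makes: notifications are accepted solely on CPUs that have been enabled, repeated notifications collapse into one deferred update, and taking a CPU offline cancels any pending update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical stand-in for the driver's per-CPU data. */
struct cpu_state {
    bool allocated;     /* per-CPU data has been set up */
    bool work_pending;  /* a deferred policy update is queued */
    int  updates_run;   /* number of deferred updates executed so far */
};

static struct cpu_state cpus[MAX_CPUS];

/* Bit n set means CPU n is allowed to process HWP notifications. */
static uint64_t hwp_enabled_mask;

/* Mark a CPU as initialized and allow it to take notifications. */
void sim_enable_cpu(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    cpus[cpu].allocated = true;
    hwp_enabled_mask |= UINT64_C(1) << cpu;
}

/* Offline/suspend path: stop accepting notifications, drop queued work. */
void sim_disable_cpu(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    hwp_enabled_mask &= ~(UINT64_C(1) << cpu);
    cpus[cpu].work_pending = false;
}

/*
 * Simulated interrupt handler. Returns true if the notification was
 * accepted. A CPU that is not in the mask is ignored, so its (possibly
 * unallocated) per-CPU data is never touched.
 */
bool sim_notify(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    if (!(hwp_enabled_mask & (UINT64_C(1) << cpu)))
        return false;
    /* Setting a flag coalesces bursts into a single deferred update. */
    cpus[cpu].work_pending = true;
    return true;
}

/* Simulated expiry of the delay: run each queued update exactly once. */
int sim_run_pending(void)
{
    int ran = 0;

    for (unsigned cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (cpus[cpu].work_pending) {
            cpus[cpu].work_pending = false;
            cpus[cpu].updates_run++;
            ran++;
        }
    }
    return ran;
}
```

In this sketch, a notification arriving for a CPU that was never passed to `sim_enable_cpu()` is simply dropped, which mirrors the commit message's description of interrupts showing up on CPUs whose data the driver has not allocated yet.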