I recently deployed a custom kernel build with X86_AMD_PSTATE_DEFAULT_MODE set to 1 (Disabled) to a newer AMD system: AMD Ryzen 3 4100 that supports cppc and got a NULL pointer reference crash so I had to add amd_pstate=active on that system to get it to boot. The same kernel boots fine on older AMD systems that don't support cppc. The kernel is 6.6.22 This is what I think is happening when no amd_pstate parameter has been specified although I haven't tested a patch yet: in function: amd_pstate_init the cppc_state starts as AMD_PSTATE_UNDEFINED. On older systems that don't support CPPC, !boot_cpu_has(X86_FEATURE_CPPC) tests true so "driver load is disabled, boot with specific mode to enable this" is printed and -ENODEV is returned. On the newer systems that support CPPC , it goes on to "ret = amd_pstate_set_driver(CONFIG_X86_AMD_PSTATE_DEFAULT_MODE);" . Then amd_pstate_set_driver sees AMD_PSTATE_DISABLE , prints "driver is explicitly disabled" , doesn't assign a driver to current_pstate_driver, but still returns 0. So then the switch statement in amd_pstate_init breaks and then I suspect it crashes at "current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;" just below the switch statement. If this is the case, maybe adjusting amd_pstate_set_driver to return -ENODEV if cppc_state == AMD_PSTATE_DISABLE might fix the problem. Here is some of the crash output (manually typed from a photo): amd_pstate: driver is explicitly disabled BUG: kernel NULL pointer deference, address: 0000000000000050 ... ? __die+0x1a/0x60 ? page_fault_oops+0x17c/0x4a0 ? _prb_read_valid+0x263/0x2e0 ? exc_page_fault+0x33f/0x610 ? prb_read_valid+0x12/0x20 ? asm_exc_page_fault+0x22/0x30 ? amd_pstate_init+0x90/0x260 ? amd_pstate_param+0xb0/0xb0 do_one_initcall+0x82/0x2c0 kernel_init_freeable+0x1af/0x260 ? rest_init+0xb0/0xb0 kernel_init+0x11/0x1b0 ret_from_fork+0x2b/0x40 ? rest_init+0xb0/0xb0 ret_from_fork_asm+0x11/0x20
Actually that fix might interfer with the use of amd_pstate_set_driver in function: amd_pstate_param so perhaps it would be better to retest cppc_state for disabled after the call to amd_pstate_set_driver(CONFIG_X86_AMD_PSTATE_DEFAULT_MODE) or maybe a test if current_pstate_driver=NULL after the switch statement might be better.
Perry please take a look.
(In reply to Artem S. Tashkinov from comment #2) > Perry please take a look. Sure, let me check the issue on my local system.
(In reply to Matthew Stapleton from comment #0) > I recently deployed a custom kernel build with X86_AMD_PSTATE_DEFAULT_MODE > set to 1 (Disabled) to a newer AMD system: AMD Ryzen 3 4100 that supports > cppc and got a NULL pointer reference crash so I had to add > amd_pstate=active on that system to get it to boot. The same kernel boots > fine on older AMD systems that don't support cppc. > > The kernel is 6.6.22 > > This is what I think is happening when no amd_pstate parameter has been > specified although I haven't tested a patch yet: > in function: amd_pstate_init the cppc_state starts as AMD_PSTATE_UNDEFINED. > > On older systems that don't support CPPC, !boot_cpu_has(X86_FEATURE_CPPC) > tests true so "driver load is disabled, boot with specific mode to enable > this" is printed and -ENODEV is returned. > > On the newer systems that support CPPC , it goes on to "ret = > amd_pstate_set_driver(CONFIG_X86_AMD_PSTATE_DEFAULT_MODE);" . Then > amd_pstate_set_driver sees AMD_PSTATE_DISABLE , prints "driver is explicitly > disabled" , doesn't assign a driver to current_pstate_driver, but still > returns 0. So then the switch statement in amd_pstate_init breaks and then > I suspect it crashes at "current_pstate_driver->adjust_perf = > amd_pstate_adjust_perf;" just below the switch statement. > > If this is the case, maybe adjusting amd_pstate_set_driver to return -ENODEV > if cppc_state == AMD_PSTATE_DISABLE might fix the problem. > > Here is some of the crash output (manually typed from a photo): > amd_pstate: driver is explicitly disabled > BUG: kernel NULL pointer deference, address: 0000000000000050 > ... > ? __die+0x1a/0x60 > ? page_fault_oops+0x17c/0x4a0 > ? _prb_read_valid+0x263/0x2e0 > ? exc_page_fault+0x33f/0x610 > ? prb_read_valid+0x12/0x20 > ? asm_exc_page_fault+0x22/0x30 > ? amd_pstate_init+0x90/0x260 > ? amd_pstate_param+0xb0/0xb0 > do_one_initcall+0x82/0x2c0 > kernel_init_freeable+0x1af/0x260 > ? rest_init+0xb0/0xb0 > kernel_init+0x11/0x1b0 > ret_from_fork+0x2b/0x40 > ? rest_init+0xb0/0xb0 > ret_from_fork_asm+0x11/0x20 Could you try the patch lately posted while I am trying to find a system to reproduce? https://lore.kernel.org/lkml/9b31fbcdfd4e4f00c3302f45e655aa43589b224c.1715356532.git.perry.yuan@amd.com/ I have changed the mode selection code a bit, it will be great if you can help to try this on your system, if the issue still can be reproduced, I will need to chang the patch accordingly. thanks. Perry.
Thanks for the patch. After manually applying just "[PATCH v2 10/10] cpufreq: amd-pstate: automatically load pstate driver by default" to 6.6.31, the AMD Ryzen 3 4100 system is no longer crashing at bootup with X86_AMD_PSTATE_DEFAULT_MODE set to 1 (Disabled) and when no amd_pstate option is specified.
(In reply to Matthew Stapleton from comment #5) > Thanks for the patch. > > After manually applying just "[PATCH v2 10/10] cpufreq: amd-pstate: > automatically load pstate driver by default" to 6.6.31, the AMD Ryzen 3 4100 > system is no longer crashing at bootup with X86_AMD_PSTATE_DEFAULT_MODE set > to 1 (Disabled) and when no amd_pstate option is specified. Thanks for the testing! glad to know that patch work for you. Then we can close the ticket looks like. Perry