Bug 212507

Summary: Switching IPA and step_wise governors screw up tz->passive
Product: Power Management Reporter: Daniel Lezcano (daniel.lezcano)
Component: ThermalAssignee: Zhang Rui (rui.zhang)
Status: NEW ---    
Severity: normal CC: lukasz.luba
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: v5.12 Subsystem:
Regression: No Bisected commit-id:

Description Daniel Lezcano 2021-03-31 15:42:42 UTC
The power allocator sets the cooling device state to zero.

When switching to the step_wise, the routine does:

step_wise_throttle()
 -> thermal_zone_trip_update()
 ...
 old_target = instance->target;
 instance->target = get_target_state(instance, trend, throttle);
 ...

 old_target is what have set the power allocator, zero.
 instance->target is -1

 ...

 That fullfills the condition:

 else if (old_target != THERMAL_NO_TARGET &&
          instance->target == THERMAL_NO_TARGET)
                update_passive_instance(tz, trip_type, -1);

But tz->passive is equal to zero and becomes -1. The test in the core code does:

        if (!stop && tz->passive)
                thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);

which becomes true.

The step_wise monitors the thermal zone with the passive mitigation delay instead of the polling delay.
Comment 1 Lukasz Luba 2021-04-09 11:45:05 UTC
I've reproduced locally the issue and can see what you mean.
The problem is not with IPA, but inside step_wise. The logic in step_wise which enables/disables passive is odd. It abuses the fact that 'tz->passive' is int and not bool, as it should be and also comment states that for thermal_zone_device structure. Step_wise is looping over the cooling devices and increments or decrements this tz->passive value. IMO the tz->passive filed should be changed to 'bool' and this arithmetic operations in governors should be forbidden.
Comment 2 Zhang Rui 2022-06-21 07:45:12 UTC
Just found this while going through the PM bug reports.
will any of you fix this? :)