Bug 215875 - The UI hangs when system changes CPU frequency on Raspberry Pi 4B
Summary: The UI hangs when system changes CPU frequency on Raspberry Pi 4B
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Other (show other bugs)
Hardware: ARM Linux
: P1 high
Assignee: drivers_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-22 06:01 UTC by Jian-Hong Pan
Modified: 2023-11-10 08:09 UTC (History)
0 users

See Also:
Kernel Version: 5.15+
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
The dmesg log (57.34 KB, text/plain)
2022-04-22 06:01 UTC, Jian-Hong Pan
Details
The config to build kernel (197.33 KB, text/plain)
2022-04-22 06:02 UTC, Jian-Hong Pan
Details

Description Jian-Hong Pan 2022-04-22 06:01:24 UTC
Created attachment 300790 [details]
The dmesg log

The UI hangs totally when system changes CPU frequency on Raspberry Pi 4B.  The test kernel is mainline kernel 5.18-rc3.

According to the oops messages and the stack:

[ 1330.166158] INFO: task kworker/0:0:6 blocked for more than 120 seconds.
[ 1330.172963]       Tainted: G        W   E     5.18.0-rc3 #61
[ 1330.178767] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.186742] task:kworker/0:0     state:D stack:    0 pid:    6 ppid:     2 flags:0x00000008
[ 1330.186762] Workqueue: events dbs_work_handler
[ 1330.186779] Call trace:
[ 1330.186782]  __switch_to+0xf8/0x150
[ 1330.186793]  __schedule+0x37c/0x141c
[ 1330.186800]  schedule+0x54/0xc4
[ 1330.186807]  schedule_timeout+0xa4/0x190
[ 1330.186813]  __wait_for_common+0xc4/0x224
[ 1330.186818]  wait_for_completion_timeout+0x2c/0x40
[ 1330.186822]  mbox_send_message+0xf0/0x1f4
[ 1330.186828]  rpi_firmware_property_list+0xf8/0x280
[ 1330.186834]  rpi_firmware_property+0x7c/0xc0
[ 1330.186839]  raspberrypi_fw_set_rate+0x64/0xe4 [clk_raspberrypi]
[ 1330.186851]  clk_change_rate+0xe4/0x470
[ 1330.186859]  clk_core_set_rate_nolock+0x1d0/0x2ec
[ 1330.186864]  clk_set_rate+0x44/0x190
[ 1330.186870]  _set_opp+0x3b4/0x70c
[ 1330.186876]  dev_pm_opp_set_rate+0x11c/0x2d0
[ 1330.186881]  set_target+0x3c/0x4c
[ 1330.186887]  __cpufreq_driver_target+0x400/0x7ac
[ 1330.186893]  od_dbs_update+0xc8/0x1c0
[ 1330.186898]  dbs_work_handler+0x4c/0x8c
[ 1330.186903]  process_one_work+0x210/0x480
[ 1330.186910]  worker_thread+0x180/0x520
[ 1330.186915]  kthread+0xe4/0xf0
[ 1330.186920]  ret_from_fork+0x10/0x20
[ 1330.186928] INFO: task kworker/u8:0:8 blocked for more than 120 seconds.
[ 1330.193744]       Tainted: G        W   E     5.18.0-rc3 #61
[ 1330.199509] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.207467] task:kworker/u8:0    state:D stack:    0 pid:    8 ppid:     2 flags:0x00000008
[ 1330.207485] Workqueue: events_unbound commit_work [drm_kms_helper]
[ 1330.207555] Call trace:
[ 1330.207558]  __switch_to+0xf8/0x150
[ 1330.207572]  __schedule+0x37c/0x141c
[ 1330.207579]  schedule+0x54/0xc4
[ 1330.207586]  schedule_preempt_disabled+0x1c/0x2c
[ 1330.207593]  __mutex_lock.constprop.0+0x160/0x59c
[ 1330.207598]  __mutex_lock_slowpath+0x40/0x8c
[ 1330.207602]  mutex_lock+0x68/0x80
[ 1330.207607]  clk_prepare_lock+0x4c/0xac
[ 1330.207616]  clk_set_rate_range.part.0+0x3c/0x250
[ 1330.207622]  clk_set_min_rate+0x3c/0xdc
[ 1330.207627]  vc4_atomic_commit_tail+0x68c/0x77c [vc4]
[ 1330.207664]  commit_tail+0xd8/0x210 [drm_kms_helper]
[ 1330.207707]  commit_work+0x20/0x30 [drm_kms_helper]
[ 1330.207748]  process_one_work+0x210/0x480
[ 1330.207755]  worker_thread+0x180/0x520
[ 1330.207760]  kthread+0xe4/0xf0
[ 1330.207765]  ret_from_fork+0x10/0x20

Looks like clk_raspberrypi tries to set/change rate by invoking raspberrypi_fw_set_rate(), but the transaction with firmware is timeout. There might be a lock shared with others, but hold by it. For example, the second oops message from drm_kms_helper -> vc4 shows clk_prepare_lock() is waiting to be scheduled.

The UI hangs until raspberrypi-firmware gets the failed returned error from mbox_send_message().

[ 4887.601985] raspberrypi-firmware soc:firmware: mbox_send_message returned -62
[ 4887.609269] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-arm frequency: -62

The serial console is alive when the UI hangs.
Comment 1 Jian-Hong Pan 2022-04-22 06:02:39 UTC
Created attachment 300791 [details]
The config to build kernel
Comment 2 Jian-Hong Pan 2022-04-22 06:09:32 UTC
Also, notice clk_raspberrypi's error "load of value 16 is not a valid value for type '_Bool'" and corresponding backtrace when system boots:

[    4.893965] UBSAN: invalid-load in drivers/clk/bcm/clk-raspberrypi.c:368:14
[    4.901734] load of value 16 is not a valid value for type '_Bool'
[    4.908656] CPU: 1 PID: 213 Comm: systemd-udevd Tainted: G            E     5.18.0-rc3 #61
[    4.917709] Hardware name: Raspberry Pi 4 Model B (DT)
[    4.923615] Call trace:
[    4.923646] mmc1: SDHCI controller on fe300000.mmc [fe300000.mmc] using PIO
[    4.926094]  dump_backtrace+0xbc/0x130
[    4.930598]  show_stack+0x24/0x70
[    4.934664]  dump_stack_lvl+0x68/0x84
[    4.938393]  dump_stack+0x18/0x34
[    4.942456]  ubsan_epilogue+0x10/0x50
[    4.946178]  __ubsan_handle_load_invalid_value+0x80/0x90
[    4.951578]  raspberrypi_clk_probe+0x484/0x4c4 [clk_raspberrypi]
[    4.957727]  platform_probe+0x74/0x114
[    4.957748]  really_probe+0x18c/0x3f0
[    4.957762]  __driver_probe_device+0x124/0x1c0
[    4.957775]  driver_probe_device+0x44/0x140
[    4.974034]  __driver_attach+0xe0/0x234
[    4.978627]  bus_for_each_dev+0x7c/0xe0
[    4.983211]  driver_attach+0x30/0x40
[    4.987533]  bus_add_driver+0x158/0x250
[    4.992122]  driver_register+0x84/0x140
[    4.996707]  __platform_driver_register+0x34/0x40
[    5.002174]  raspberrypi_clk_driver_init+0x30/0x1000 [clk_raspberrypi]
[    5.009493]  do_one_initcall+0x50/0x240
[    5.013460]  do_init_module+0x50/0x260
[    5.017268]  load_module+0x2160/0x26f0
[    5.021077]  __do_sys_finit_module+0xac/0x12c
[    5.025517]  __arm64_sys_finit_module+0x2c/0x40
[    5.025532]  invoke_syscall+0x78/0x100
[    5.025547]  el0_svc_common.constprop.0+0x180/0x184
[    5.025560]  do_el0_svc+0x34/0x9c
[    5.025574]  el0_svc+0x54/0x1e0
[    5.025585]  el0t_64_sync_handler+0xa4/0x130
[    5.025596]  el0t_64_sync+0x1a0/0x1a4
Comment 3 Jian-Hong Pan 2022-04-22 06:21:09 UTC
Have updated the EEPROM with rpi-eeprom-update in Raspberry OS

$ sudo rpi-eeprom-update -a
*** INSTALLING EEPROM UPDATES ***

BOOTLOADER: update available
   CURRENT: Thu 03 Sep 2020 12:11:43 PM UTC (1599135103)
    LATEST: Tue 25 Jan 2022 02:30:41 PM UTC (1643121041)
   RELEASE: default (/lib/firmware/raspberrypi/bootloader/default)
            Use raspi-config to change the release.

  VL805_FW: Dedicated VL805 EEPROM
     VL805: up to date
   CURRENT: 000138a1
    LATEST: 000138a1
   CURRENT: Thu 03 Sep 2020 12:11:43 PM UTC (1599135103)
    UPDATE: Tue 25 Jan 2022 02:30:41 PM UTC (1643121041)
    BOOTFS: /boot

EEPROM updates pending. Please reboot to apply the update.
To cancel a pending update run "sudo rpi-eeprom-update -r".
Comment 4 Jian-Hong Pan 2022-05-11 09:37:46 UTC
This issue still can be reproduced with 5.18-rc6 kernel.
Comment 5 Jian-Hong Pan 2023-11-10 08:09:15 UTC
This cannot be reproduced since kernel 6.1.

Note You need to log in before you can comment on or make changes to this bug.