We have seen the following WARNING on one of our CI machines. The reproduction rate is currently unknown has this has been the first time we hit it: [ 238.566193] ============================= [ 238.566198] WARNING: suspicious RCU usage [ 238.566203] 4.17.0-rc7-g0e963d962be7-drmtip_57+ #1 Tainted: G U [ 238.566206] ----------------------------- [ 238.566211] ./include/trace/events/timer.h:290 suspicious rcu_dereference_check() usage! [ 238.566215] other info that might help us debug this: [ 238.566222] rcu_scheduler_active = 2, debug_locks = 0 [ 238.566228] 1 lock held by swapper/1/0: [ 238.566231] #0: 0000000044acc168 (hrtimer_bases.lock){-.-.}, at: lock_hrtimer_base.isra.2+0x1f/0x40 [ 238.566247] stack backtrace: [ 238.566254] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 4.17.0-rc7-g0e963d962be7-drmtip_57+ #1 [ 238.566258] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018 [ 238.566262] Call Trace: [ 238.566270] dump_stack+0x67/0x9b [ 238.566277] hrtimer_try_to_cancel+0x1b1/0x1f0 [ 238.566285] hrtimer_cancel+0x10/0x20 [ 238.566293] __tick_nohz_idle_restart_tick+0x4a/0xd0 [ 238.566299] tick_nohz_idle_exit+0x9c/0xb0 [ 238.566307] do_idle+0x14f/0x250 [ 238.566316] cpu_startup_entry+0x6a/0x70 [ 238.566324] start_secondary+0x19d/0x1f0 [ 238.566332] secondary_startup_64+0xa5/0xb0 [ 238.566355] ====================================================== [ 238.566356] WARNING: possible circular locking dependency detected [ 238.566357] 4.17.0-rc7-g0e963d962be7-drmtip_57+ #1 Tainted: G U [ 238.566358] ------------------------------------------------------ [ 238.566359] swapper/1/0 is trying to acquire lock: [ 238.566360] 000000005f4cb210 ((console_sem).lock){-.-.}, at: down_trylock+0xa/0x30 [ 238.566365] but task is already holding lock: [ 238.566365] 0000000044acc168 (hrtimer_bases.lock){-.-.}, at: lock_hrtimer_base.isra.2+0x1f/0x40 [ 238.566370] which lock already depends on the new lock. [ 238.566372] the existing dependency chain (in reverse order) is: [ 238.566374] -> #4 (hrtimer_bases.lock){-.-.}: [ 238.566377] lock_hrtimer_base.isra.2+0x1f/0x40 [ 238.566378] hrtimer_start_range_ns+0x5a/0x3a0 [ 238.566379] __enqueue_rt_entity+0x209/0x360 [ 238.566380] enqueue_rt_entity+0x28/0x50 [ 238.566381] enqueue_task_rt+0x2a/0xb0 [ 238.566382] __sched_setscheduler+0x2fd/0x920 [ 238.566383] _sched_setscheduler+0x72/0x90 [ 238.566384] watchdog_enable+0x93/0xb0 [ 238.566385] smpboot_thread_fn+0x107/0x280 [ 238.566385] kthread+0x119/0x130 [ 238.566386] ret_from_fork+0x3a/0x50 [ 238.566387] -> #3 (&rt_b->rt_runtime_lock){-.-.}: [ 238.566391] __enqueue_rt_entity+0x1ab/0x360 [ 238.566392] enqueue_rt_entity+0x28/0x50 [ 238.566393] enqueue_task_rt+0x2a/0xb0 [ 238.566393] __sched_setscheduler+0x2fd/0x920 [ 238.566394] _sched_setscheduler+0x72/0x90 [ 238.566395] watchdog_enable+0x93/0xb0 [ 238.566396] smpboot_thread_fn+0x107/0x280 [ 238.566397] kthread+0x119/0x130 [ 238.566398] ret_from_fork+0x3a/0x50 [ 238.566399] -> #2 (&rq->lock){-.-.}: [ 238.566402] task_fork_fair+0x36/0x160 [ 238.566403] sched_fork+0x123/0x280 [ 238.566404] copy_process.part.7+0x5f4/0x1d30 [ 238.566405] _do_fork+0xe2/0x700 [ 238.566406] kernel_thread+0x20/0x30 [ 238.566407] rest_init+0x1d/0x220 [ 238.566408] start_kernel+0x4a2/0x4c2 [ 238.566409] secondary_startup_64+0xa5/0xb0 [ 238.566410] -> #1 (&p->pi_lock){-.-.}: [ 238.566414] try_to_wake_up+0x37/0x650 [ 238.566414] up+0x3b/0x50 [ 238.566415] __up_console_sem+0x2e/0x50 [ 238.566416] console_unlock+0x32f/0x640 [ 238.566417] register_framebuffer+0x24c/0x350 [ 238.566418] __drm_fb_helper_initial_config_and_unlock+0x22a/0x4a0 [ 238.566419] intel_fbdev_initial_config+0xf/0x20 [i915] [ 238.566420] async_run_entry_fn+0x34/0x160 [ 238.566421] process_one_work+0x229/0x6a0 [ 238.566422] worker_thread+0x35/0x380 [ 238.566423] kthread+0x119/0x130 [ 238.566424] ret_from_fork+0x3a/0x50 [ 238.566425] -> #0 ((console_sem).lock){-.-.}: [ 238.566428] _raw_spin_lock_irqsave+0x33/0x50 [ 238.566429] down_trylock+0xa/0x30 [ 238.566430] __down_trylock_console_sem+0x20/0x80 [ 238.566431] console_trylock+0xe/0x60 [ 238.566432] vprintk_emit+0x22e/0x4d0 [ 238.566433] printk+0x4d/0x69 [ 238.566434] lockdep_rcu_suspicious+0x1b/0xf0 [ 238.566434] hrtimer_try_to_cancel+0x1b1/0x1f0 [ 238.566435] hrtimer_cancel+0x10/0x20 [ 238.566436] __tick_nohz_idle_restart_tick+0x4a/0xd0 [ 238.566437] tick_nohz_idle_exit+0x9c/0xb0 [ 238.566438] do_idle+0x14f/0x250 [ 238.566439] cpu_startup_entry+0x6a/0x70 [ 238.566440] start_secondary+0x19d/0x1f0 [ 238.566440] secondary_startup_64+0xa5/0xb0 [ 238.566442] other info that might help us debug this: [ 238.566444] Chain exists of: [ 238.566444] (console_sem).lock --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock [ 238.566450] Possible unsafe locking scenario: [ 238.566452] CPU0 CPU1 [ 238.566452] ---- ---- [ 238.566453] lock(hrtimer_bases.lock); [ 238.566455] lock(&rt_b->rt_runtime_lock); [ 238.566458] lock(hrtimer_bases.lock); [ 238.566460] lock((console_sem).lock); [ 238.566462] *** DEADLOCK *** [ 238.566464] 1 lock held by swapper/1/0: [ 238.566465] #0: 0000000044acc168 (hrtimer_bases.lock){-.-.}, at: lock_hrtimer_base.isra.2+0x1f/0x40 [ 238.566469] stack backtrace: [ 238.566471] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 4.17.0-rc7-g0e963d962be7-drmtip_57+ #1 [ 238.566472] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018 [ 238.566473] Call Trace: [ 238.566474] dump_stack+0x67/0x9b [ 238.566475] print_circular_bug.isra.18+0x1c8/0x2b0 [ 238.566476] __lock_acquire+0x1897/0x1b50 [ 238.566477] ? lock_acquire+0xa6/0x210 [ 238.566478] lock_acquire+0xa6/0x210 [ 238.566479] ? down_trylock+0xa/0x30 [ 238.566480] ? vprintk_emit+0x22e/0x4d0 [ 238.566481] _raw_spin_lock_irqsave+0x33/0x50 [ 238.566482] ? down_trylock+0xa/0x30 [ 238.566482] down_trylock+0xa/0x30 [ 238.566483] __down_trylock_console_sem+0x20/0x80 [ 238.566484] console_trylock+0xe/0x60 [ 238.566485] vprintk_emit+0x22e/0x4d0 [ 238.566486] printk+0x4d/0x69 [ 238.566487] lockdep_rcu_suspicious+0x1b/0xf0 [ 238.566488] hrtimer_try_to_cancel+0x1b1/0x1f0 [ 238.566488] hrtimer_cancel+0x10/0x20 [ 238.566489] __tick_nohz_idle_restart_tick+0x4a/0xd0 [ 238.566490] tick_nohz_idle_exit+0x9c/0xb0 [ 238.566491] do_idle+0x14f/0x250 [ 238.566492] cpu_startup_entry+0x6a/0x70 [ 238.566493] start_secondary+0x19d/0x1f0 [ 238.566494] secondary_startup_64+0xa5/0xb0 Kernel config: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_57/kconfig.txt Boot log: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_57/fi-cnl-y3/boot1.log Logs during the execution of the tests: https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_57/fi-cnl-y3/dmesg1.log
Adding Steven Rostedt to the Cc: list, as per Chris Wilson's recommendation.
This issue occurred only once and that too 3 months ago with drmtip_57.
This issue can be closed as the boot log and test results are not found any more.