Bug 205787 - rcu_sched self-detected stall on CPU with odroid-c1 (meson)
Summary: rcu_sched self-detected stall on CPU with odroid-c1 (meson)
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: ARM
Hardware: ARM Linux
Importance: P1 normal
Assignee: linux-arm-kernel@lists.arm.linux.org.uk
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-12-07 08:46 UTC by dirkneukirchen
Modified: 2020-04-30 13:05 UTC
CC List: 0 users

See Also:
Kernel Version: 5.3.11-odroidc1 5.6.7 5.7.0-rc3
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
full dmesg of 5.3.11 armbian on odroid-c1 (92.27 KB, text/plain)
2019-12-07 08:46 UTC, dirkneukirchen
serial log with this issue on 5.7.0-rc3 mainline (11.07 KB, text/plain)
2020-04-28 16:36 UTC, dirkneukirchen

Description dirkneukirchen 2019-12-07 08:46:15 UTC
Created attachment 286211
full dmesg of 5.3.11 armbian on odroid-c1

The Odroid-C1 boots fine; however, after some network (?) activity the system slows down and dmesg shows several exceptions.

Hardware:
Odroid-C1 with
USB network adapter: ID 0b95:772a ASIX Electronics Corp. AX88772A Fast Ethernet

Software stack: Armbian Nightly / Debian Buster with kernel 5.3.11
The previous kernel, 5.3.9, showed a similar error IIRC.

[58153.405330] hrtimer: interrupt took 965010 ns
[58852.154986] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[58852.155544] rcu: 	1-....: (243 ticks this GP) idle=d42/1/0x40000004 softirq=459500/459501 fqs=172 
[58852.155789] 	(detected by 2, t=2105 jiffies, g=1378857, q=5917)
[58852.156017] Sending NMI from CPU 2 to CPUs 1:
[58852.171345] NMI backtrace for cpu 1
[58852.171525] CPU: 1 PID: 30069 Comm: kworker/1:3 Not tainted 5.3.11-odroidc1 #5.99.191113
[58852.171635] Hardware name: Amlogic Meson platform
[58852.171736] Workqueue: events dbs_work_handler
[58852.171903] PC is at memcpy+0x248/0x330
[58852.171988] LR is at 0x7b3bc7a8
[58852.172110] pc : [<c0e89948>]    lr : [<7b3bc7a8>]    psr: 20010113
[58852.172217] sp : eb1f3b70  ip : c7a8d173  fp : c1403080
[58852.172325] r10: 00000100  r9 : e36ba2bc  r8 : aaeb211f
[58852.172449] r7 : 4816ae97  r6 : 2595d57c  r5 : 6b92dec8  r4 : 88d038a8
[58852.172572] r3 : 409b77d6  r2 : 00000270  r1 : ec453224  r0 : eb7ba5e4
[58852.172701] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[58852.172819] Control: 10c5387d  Table: 2dd8c04a  DAC: 00000051
[58852.172960] CPU: 1 PID: 30069 Comm: kworker/1:3 Not tainted 5.3.11-odroidc1 #5.99.191113
[58852.173066] Hardware name: Amlogic Meson platform
[58852.173157] Workqueue: events dbs_work_handler
...

See the attached dmesg for more.
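
For anyone comparing the attached logs across kernel versions, here is a minimal sketch that pulls the stalled CPU, the detecting CPU, and the stall duration in jiffies out of a dmesg dump. The function name parse_rcu_stalls and both regular expressions are my own assumptions, written only against the lines quoted above; other kernel versions may format these messages differently.

```python
import re

# Per-CPU line of a stall report, e.g. "rcu: \t1-....: (243 ticks this GP) ..."
# (pattern is an assumption based on the excerpt in this report)
CPU_RE = re.compile(r"rcu:\s+(\d+)-\S*: \((\d+) ticks this GP\)")
# Summary line, e.g. "(detected by 2, t=2105 jiffies, g=1378857, q=5917)"
DETECT_RE = re.compile(r"\(detected by (\d+), t=(\d+) jiffies")


def parse_rcu_stalls(log_text):
    """Return one dict per RCU stall report found in log_text.

    Hypothetical helper: it collects the per-CPU lines seen so far and
    closes the report when the "(detected by ...)" line is reached.
    """
    stalls = []
    stalled_cpus = []
    for line in log_text.splitlines():
        cpu_match = CPU_RE.search(line)
        if cpu_match:
            stalled_cpus.append(int(cpu_match.group(1)))
            continue
        detect_match = DETECT_RE.search(line)
        if detect_match:
            stalls.append({
                "stalled_cpus": stalled_cpus,
                "detected_by": int(detect_match.group(1)),
                "jiffies": int(detect_match.group(2)),
            })
            stalled_cpus = []
    return stalls
```

Feeding it the text of a saved dmesg file, e.g. parse_rcu_stalls(open("dmesg.txt").read()) with a file name of your choosing, gives a list that is easy to count or diff between kernels.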
Comment 1 dirkneukirchen 2019-12-09 08:39:43 UTC
Possibly related/duplicate:
https://bugzilla.kernel.org/show_bug.cgi?id=205211
Comment 2 dirkneukirchen 2020-04-28 16:36:01 UTC
Created attachment 288793
serial log with this issue on 5.7.0-rc3 mainline


Kernel 4.19.118, compiled with oldconfig and running the same userland (Armbian), does not show
this issue (or at least not easily).

Kernels 5.6.7 and 5.7.0-rc3 seem to exhibit the same issue.
Attached is the error that showed up on the serial console while running a simple

mtr <ip address>
Comment 3 dirkneukirchen 2020-04-28 16:36:35 UTC
Since 4.19.x doesn't show the error, this should be a regression.
Comment 4 dirkneukirchen 2020-04-30 13:05:10 UTC
The bug might be related to a different default governor on Armbian,
as it was configured with GOVERNOR=interactive.
However, Armbian images are currently broken, so I am still investigating with a locally modified Armbian.
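
To rule the governor in or out, it may help to record which one is actually active on each core when the stall happens. The sketch below assumes the governor meant here is the cpufreq scaling governor and reads it from the standard sysfs location; the function name read_governors and its sysfs_root parameter are hypothetical and only exist for this example.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. "cpu0") to its active governor.

    Assumption: cpufreq is enabled and exposes
    cpuN/cpufreq/scaling_governor under sysfs_root. CPUs without that
    file are simply left out of the result.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        cpu_name = path.parent.parent.name  # .../cpuN/cpufreq/scaling_governor
        governors[cpu_name] = path.read_text().strip()
    return governors
```

Calling read_governors() on the board before starting the network load and attaching the result alongside the serial log would show whether a kernel that stalls and one that does not are running the same governor.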
