Bug 196557 - ftrace=mmiotrace cause soft lockup
Summary: ftrace=mmiotrace cause soft lockup
Status: RESOLVED CODE_FIX
Alias: None
Product: Tracing/Profiling
Classification: Unclassified
Component: Ftrace (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Steven Rostedt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-08-01 12:46 UTC by Ziqian SUN (Zamir)
Modified: 2017-09-21 01:50 UTC (History)
3 users (show)

See Also:
Kernel Version: 4.11.11-300.fc26.x86_64
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Ziqian SUN (Zamir) 2017-08-01 12:46:33 UTC
Based on https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt , I tried to boot the system with ftrace=mmiotrace. However the system did not boot up. From the console I see lots of soft lockup.

Version:
4.11.11-300.fc26.x86_64  with Fedora 26

Steps:

# grubby --args "ftrace=mmiotrace" --update-kernel=/boot/vmlinuz-4.11.11-300.fc26.x86_64 ; reboot

Console output
------8<--------
[   52.123483] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-udevd:617] 
[   52.165176] Modules linked in: iTCO_wdt(+) iTCO_vendor_support intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf ipmi_ssif lpc_ich hpilo hpwdt ipmi_si ipmi_devintf ioatdma tg3 ipmi_msghandler dca ptp pps_core acpi_power_meter shpchp pcc_cpufreq tpm_tis tpm_tis_core tpm xfs libcrc32c mgag200 i2c_algo_bit drm_kms_helper ttm crc32c_intel drm serio_raw hpsa ata_generic pata_acpi scsi_transport_sas 
[   52.407390] CPU: 0 PID: 617 Comm: systemd-udevd Not tainted 4.11.11-300.fc26.x86_64 #1 
[   52.447820] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 07/15/2012 
[   52.482583] task: ffff9f3cdffb8000 task.stack: ffffb2bb44f28000 
[   52.511982] RIP: 0010:iTCO_wdt_unset_NO_REBOOT_bit+0x79/0x130 [iTCO_wdt] 
[   52.546489] RSP: 0018:ffffb2bb44f2bad0 EFLAGS: 00010246 ORIG_RAX: ffffffffffffff10 
[   52.584586] RAX: ffffb2bb432ad410 RBX: ffff9f3cdebaac28 RCX: 00000000000fed1f 
[   52.621781] RDX: 00000000ffffffdf RSI: 0000000000000282 RDI: ffff9f3cdebaac28 
[   52.659171] RBP: ffffb2bb44f2baf0 R08: ffff9f3cd944f690 R09: 0000000622a61000 
[   52.695681] R10: 00003ffffffff000 R11: 0000000000000067 R12: 0000000000000020 
[   52.732839] R13: ffff9f3cd944f410 R14: ffff9f3cd9497780 R15: 0000000000000021 
[   52.770132] FS:  00007fb29864d8c0(0000) GS:ffff9f3ce6600000(0000) knlGS:0000000000000000 
[   52.811825] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[   52.841377] CR2: ffffb2bb432ad410 CR3: 000000061ffcb000 CR4: 00000000000406f0 
[   52.878118] Call Trace: 
[   52.890117]  iTCO_wdt_probe+0xed/0x370 [iTCO_wdt] 
[   52.915412]  platform_drv_probe+0x3b/0xa0 
[   52.935336]  driver_probe_device+0x106/0x450 
[   52.956473]  __driver_attach+0xa4/0xe0 
[   52.976562]  ? driver_probe_device+0x450/0x450 
[   52.999422]  bus_for_each_dev+0x6e/0xb0 
[   53.018329]  driver_attach+0x1e/0x20 
[   53.037549]  bus_add_driver+0x1d0/0x270 
[   53.056573]  ? 0xffffffffc031b000 
[   53.073528]  driver_register+0x60/0xe0 
[   53.092733]  ? 0xffffffffc031b000 
[   53.109256]  __platform_driver_register+0x36/0x40 
[   53.132530]  iTCO_wdt_init_module+0x2a/0x1000 [iTCO_wdt] 
[   53.159759]  do_one_initcall+0x50/0x1a0 
[   53.179727]  ? kfree+0x154/0x170 
[   53.195637]  ? kfree+0x154/0x170 
[   53.213065]  ? kmem_cache_alloc_trace+0x15f/0x1c0 
[   53.236690]  ? do_init_module+0x27/0x1e6 
[   53.256477]  do_init_module+0x5f/0x1e6 
[   53.276572]  load_module+0x23d6/0x2970 
[   53.297406]  SYSC_init_module+0x16f/0x1a0 
[   53.318519]  ? SYSC_init_module+0x16f/0x1a0 
[   53.340775]  SyS_init_module+0xe/0x10 
[   53.360691]  do_syscall_64+0x67/0x170 
[   53.378770]  entry_SYSCALL64_slow_path+0x25/0x25 
[   53.402249] RIP: 0033:0x7fb2972b115a 
[   53.421481] RSP: 002b:00007fffc1683828 EFLAGS: 00000246 ORIG_RAX: 00000000000000af 
[   53.459642] RAX: ffffffffffffffda RBX: 0000559619f5e360 RCX: 00007fb2972b115a 
[   53.496535] RDX: 00007fb297de79c5 RSI: 0000000000003893 RDI: 000055961a7762c0 
[   53.533271] RBP: 00007fb297de79c5 R08: 0000559619f5e0f0 R09: 00000000000002b0 
[   53.570701] R10: 00007fb29756bb00 R11: 0000000000000246 R12: 000055961a7762c0 
[   53.607573] R13: 0000559619f5e410 R14: 0000000000020000 R15: 000055961954bf4a 
[   53.643938] Code: 77 31 c0 48 8b 4d e8 65 48 33 0c 25 28 00 00 00 0f 85 c5 00 00 00 48 83 c4 10 5b 41 5c 5d c3 44 89 e2 f7 d2 48 8b 83 c8 00 00 00 <8b> 00 21 d0 48 8b 93 c8 00 00 00 89 45 e4 89 02 48 8b 83 c8 00  
------8<--------


If this is a kernel bug, I think this either should be documented or be fixed in kernel. If this is not a kernel bug, please close it and I can file it to systemd.
Comment 1 Steven Rostedt 2017-08-01 13:40:24 UTC
mmiotrace does some crazy mappings to device drivers. It also requires that the system is running with only one CPU. I never thought about having it enabled via the command line.

It is not something that can be easily enabled. Thus the fix for this bug is to document that it can not be enabled at start up. Even better, if it is in the command line, produce a warning that it's not a tracer that can be enabled via command line, and disable it.

-- Steve
Comment 2 AustinKim 2017-08-14 23:32:45 UTC
Sometimes, I suffered softlock-up when configuring ftrace with function_graph tracer.
As far as I checked, the sudden IRQ storms in the devices, ftrace ringbuffer cannot handle the message.
But after I used the following command, I have not observed any softlockup or crashes later.

<< Command to configure ftracer >>
"echo 0 > d/tracing/tracing_on"
"sleep 1"

"echo function_graph > d/tracing/current_tracer"
"sleep 1"

"echo  > d/tracing/set_ftrace_filter"
"sleep 1"

adb shell "echo __fd_install lpm_cpuidle_enter  > d/tracing/set_graph_function"
"sleep 1"

"echo 1 > d/tracing/tracing_on"
"sleep 1"

<< Command to collect ftrace log >>
"echo 0 > d/tracing/tracing_on"
"cat d/tracing/trace_pipe > /data/trace.txt"

BR,
Guillermo Austin Kim
Comment 3 Ziqian SUN (Zamir) 2017-08-30 02:38:13 UTC
(In reply to Steven Rostedt from comment #1)
> Even better, if it is in
> the command line, produce a warning that it's not a tracer that can be
> enabled via command line, and disable it.

I proposed a patch against 4.12.9. With that patch it now reports the following message and skip that tracer.
[    0.511289] Tracer 'mmiotrace' cannot be enabled during boot

I will rebase it to master branch and send it out.
Comment 4 Ziqian SUN (Zamir) 2017-09-21 01:50:27 UTC
As this is merged so I am closing this myself.

Thanks for your help Steve!

c7b3ae0bd2ca658c7a71c49901d08c590294fac9
tracing: Ignore mmiotrace from kernel commandline

Note You need to log in before you can comment on or make changes to this bug.