hi, I encounter a crash issue, trace_read_pipe peek a PADDING event from ringbuffer, and trying to print it, resulting in kernel crash. 1. in ring_buffer_event_data, found this event's type_len is bigger than RINGBUF_TYPE_DATA_TYPE_LEN_MAX, print WARNING in dmesg. [249033.677658] -(1)[9975:mobile_log_d.rd]WARNING: CPU: 1 PID: 9975 at ring_buffer_event_data+0x38/0x48 [249033.687108] -(1)[9975:mobile_log_d.rd]CPU: 1 PID: 9975 Comm: mobile_log_d.rd Tainted: P S W O 4.19.191-ge8b7b3424080-dirty #1 [249033.688703] -(1)[9975:mobile_log_d.rd]Hardware name: MT6769V/CZ (DT) [249033.689566] -(1)[9975:mobile_log_d.rd]pstate: 60400005 (nZCv daif +PAN -UAO) [249033.690519] -(1)[9975:mobile_log_d.rd]pc : ring_buffer_event_data+0x38/0x48 [249033.691460] -(1)[9975:mobile_log_d.rd]lr : __find_next_entry+0x120/0x230 [249033.692360] -(1)[9975:mobile_log_d.rd]sp : ffffff800b05bc30 [249033.693117] -(1)[9975:mobile_log_d.rd]x29: ffffff800b05bc70 x28: 0000000000000000 [249033.694132] -(1)[9975:mobile_log_d.rd]x27: ffffffd10b553fe4 x26: 0000000000000006 [249033.695146] -(1)[9975:mobile_log_d.rd]x25: ffffff927285dcc8 x24: ffffffd192e1c800 [249033.696160] -(1)[9975:mobile_log_d.rd]x23: 0000000000000000 x22: ffffff927285d000 [249033.697174] -(1)[9975:mobile_log_d.rd]x21: ffffffd1941540e8 x20: ffffffd194152000 [249033.698188] -(1)[9975:mobile_log_d.rd]x19: ffffffd1941540e0 x18: 0000000000000000 [249033.699201] -(1)[9975:mobile_log_d.rd]x17: 0000000000000000 x16: ffffff92722071dc [249033.700216] -(1)[9975:mobile_log_d.rd]x15: ffffff92723ad6ef x14: 000000000000003f [249033.701230] -(1)[9975:mobile_log_d.rd]x13: ffff0000ffffff00 x12: 000000000000000b [249033.702244] -(1)[9975:mobile_log_d.rd]x11: 0000000000000014 x10: 000000000000001e [249033.703258] -(1)[9975:mobile_log_d.rd]x9 : 000000000000001d x8 : ffffffd10b553fe4 [249033.704272] -(1)[9975:mobile_log_d.rd]x7 : 0000000000000001 x6 : ffffffd194153176 [249033.705285] -(1)[9975:mobile_log_d.rd]x5 : ffffffd10b553fc8 x4 : ffff0a01ffffff00 [249033.706300] -(1)[9975:mobile_log_d.rd]x3 : ffffff800b05bc60 x2 : 0000000000000001 [249033.707314] -(1)[9975:mobile_log_d.rd]x1 : 0000000000000000 x0 : ffffffd10b553fe4 [249033.708331] -(1)[9975:mobile_log_d.rd]Call trace: [249033.708989] -(1)[9975:mobile_log_d.rd] ring_buffer_event_data+0x38/0x48 [249033.709883] -(1)[9975:mobile_log_d.rd] tracing_read_pipe+0x22c/0x3ec [249033.710745] -(1)[9975:mobile_log_d.rd] __vfs_read+0x44/0x13c [249033.711516] -(1)[9975:mobile_log_d.rd] vfs_read+0xb8/0x150 [249033.712266] -(1)[9975:mobile_log_d.rd] ksys_read+0x6c/0xcc [249033.713015] -(1)[9975:mobile_log_d.rd] __arm64_sys_read+0x18/0x20 [249033.713847] -(1)[9975:mobile_log_d.rd] el0_svc_common+0x98/0x160 [249033.714663] -(1)[9975:mobile_log_d.rd] el0_svc_handler+0x60/0x78 [249033.715482] -(1)[9975:mobile_log_d.rd] el0_svc+0x8/0x300 2. trace_read_pipe print this event resulting in a crash: [249033.718312] .(1)[9975:mobile_log_d.rd]Unable to handle kernel paging request at virtual address 0031154100000006 [249033.719952] .(1)[9975:mobile_log_d.rd]Mem abort info: [249033.720706] .(1)[9975:mobile_log_d.rd] ESR = 0x96000004 [249033.721436] .(1)[9975:mobile_log_d.rd] Exception class = DABT (current EL), IL = 32 bits [249033.722586] .(1)[9975:mobile_log_d.rd] SET = 0, FnV = 0 [249033.723619] .(1)[9975:mobile_log_d.rd] EA = 0, S1PTW = 0 [249033.724413] .(1)[9975:mobile_log_d.rd]Data abort info: [249033.725119] .(1)[9975:mobile_log_d.rd] ISV = 0, ISS = 0x00000004 [249033.725944] .(1)[9975:mobile_log_d.rd] CM = 0, WnR = 0 [249033.726685] .(1)[9975:mobile_log_d.rd][0031154100000006] address between user and kernel address ranges [249033.728207] -(1)[9975:mobile_log_d.rd]Internal error: Oops: 96000004 [#1] PREEMPT SMP [249033.762538] -(1)[9975:mobile_log_d.rd]Kernel Offset: 0x1269200000 from 0xffffff8008000000 [249033.763633] -(1)[9975:mobile_log_d.rd]PHYS_OFFSET: 0xffffffef40000000 [249033.777647] -(1)[9975:mobile_log_d.rd]CPU: 1 PID: 9975 Comm: mobile_log_d.rd Tainted: P S W O 4.19.191-ge8b7b3424080-dirty #1 [249033.779236] -(1)[9975:mobile_log_d.rd]Hardware name: MT6769V/CZ (DT) [249033.780090] -(1)[9975:mobile_log_d.rd]pstate: 20400005 (nzCv daif +PAN -UAO) [249033.781039] -(1)[9975:mobile_log_d.rd]pc : string+0x38/0x74 [249033.781794] -(1)[9975:mobile_log_d.rd]lr : vsnprintf+0x430/0x65c [249033.782601] -(1)[9975:mobile_log_d.rd]sp : ffffff800b05bad0 [249033.783351] -(1)[9975:mobile_log_d.rd]x29: ffffff800b05bad0 x28: ffffff800b05bb80 [249033.784357] -(1)[9975:mobile_log_d.rd]x27: ffffff92723ad4c8 x26: ffffff92724ec0f6 [249033.785361] -(1)[9975:mobile_log_d.rd]x25: ffffff92724ec0f8 x24: ffffffd1941531b3 [249033.786367] -(1)[9975:mobile_log_d.rd]x23: 00000000ffffffff x22: 0000000000000002 [249033.787372] -(1)[9975:mobile_log_d.rd]x21: ffffffd1941540a0 x20: ffffff800b05bb68 [249033.788377] -(1)[9975:mobile_log_d.rd]x19: ffffffd1941531b2 x18: 0000000000000000 [249033.789382] -(1)[9975:mobile_log_d.rd]x17: 0000000000000031 x16: ffffff92722072a0 [249033.790387] -(1)[9975:mobile_log_d.rd]x15: ffffff92723ad6ef x14: 000000000000004e [249033.791392] -(1)[9975:mobile_log_d.rd]x13: ffff0000ffffff00 x12: 0000000000000000 [249033.792397] -(1)[9975:mobile_log_d.rd]x11: ffffff92723ad6b8 x10: 1531154100000006 [249033.793401] -(1)[9975:mobile_log_d.rd]x9 : 0000000000000000 x8 : ffffffd1941540a0 [249033.794406] -(1)[9975:mobile_log_d.rd]x7 : 2020202020202020 x6 : ffffffd1941531b3 [249033.795411] -(1)[9975:mobile_log_d.rd]x5 : ffffffd10b553fc8 x4 : ffff0a00ffffff04 [249033.796415] -(1)[9975:mobile_log_d.rd]x3 : ffff0a00ffffff04 x2 : 1531154100000006 [249033.797420] -(1)[9975:mobile_log_d.rd]x1 : ffffffffffffffff x0 : ffffffd1941531b3 [249033.798430] -(1)[9975:mobile_log_d.rd]CPU: 1 PID: 9975 Comm: mobile_log_d.rd Tainted: P S W O 4.19.191-ge8b7b3424080-dirty #1 [249033.800018] -(1)[9975:mobile_log_d.rd]Hardware name: MT6769V/CZ (DT) [249033.800868] -(1)[9975:mobile_log_d.rd]Call trace: [249033.801521] -(1)[9975:mobile_log_d.rd] dump_backtrace+0x0/0x158 [249033.802320] -(1)[9975:mobile_log_d.rd] show_stack+0x14/0x1c [249033.803076] -(1)[9975:mobile_log_d.rd] dump_stack+0xb8/0xf0 [249033.803836] -(1)[9975:mobile_log_d.rd] mrdump_common_die+0x170/0x214 [249033.804689] -(1)[9975:mobile_log_d.rd] ipanic_die+0x30/0x40 [249033.805445] -(1)[9975:mobile_log_d.rd] notify_die+0x64/0xb4 [249033.806200] -(1)[9975:mobile_log_d.rd] die+0xac/0x37c [249033.806891] -(1)[9975:mobile_log_d.rd] die_kernel_fault+0x60/0x70 [249033.807710] -(1)[9975:mobile_log_d.rd] die_kernel_fault+0x0/0x70 [249033.808518] -(1)[9975:mobile_log_d.rd] do_bad_area+0x30/0x130 [249033.809296] -(1)[9975:mobile_log_d.rd] do_translation_fault+0x34/0x40 [249033.810158] -(1)[9975:mobile_log_d.rd] do_mem_abort+0x4c/0xf8 [249033.810935] -(1)[9975:mobile_log_d.rd] el1_da+0x1c/0xac [249033.811647] -(1)[9975:mobile_log_d.rd] string+0x38/0x74 [249033.812357] -(1)[9975:mobile_log_d.rd] vsnprintf+0x430/0x65c [249033.813122] -(1)[9975:mobile_log_d.rd] seq_buf_vprintf+0x64/0xcc [249033.813934] -(1)[9975:mobile_log_d.rd] trace_seq_printf+0x70/0xd8 [249033.814755] -(1)[9975:mobile_log_d.rd] trace_raw_output_ipi_handler+0x40/0x54 [249033.815705] -(1)[9975:mobile_log_d.rd] print_trace_line+0x3ec/0x424 [249033.816547] -(1)[9975:mobile_log_d.rd] tracing_read_pipe+0x28c/0x3ec [249033.817402] -(1)[9975:mobile_log_d.rd] __vfs_read+0x44/0x13c [249033.818166] -(1)[9975:mobile_log_d.rd] vfs_read+0xb8/0x150 [249033.818908] -(1)[9975:mobile_log_d.rd] ksys_read+0x6c/0xcc [249033.819651] -(1)[9975:mobile_log_d.rd] __arm64_sys_read+0x18/0x20 [249033.820470] -(1)[9975:mobile_log_d.rd] el0_svc_common+0x98/0x160 [249033.821278] -(1)[9975:mobile_log_d.rd] el0_svc_handler+0x60/0x78 [249033.822086] -(1)[9975:mobile_log_d.rd] el0_svc+0x8/0x300 I do not enable ipi event, array[0] of this padding event is 0x18, in my device, ipi entry trace event's type value is 0x18 too, so trace_read_pipe call trace_raw_output_ipi_handler to print this event,eventually, the system crashed because of illegal pointers. PADDING event should not return to caller in ring_buffer_peek.
hi, This issue has happened many times. Every time peek_next_entry return a padding event at end of page. Our reader task poll trace_pipe and other files. Maybe this issue will triggered in the following scenarios: 1. reader task wakeup by other file. 2. after read data from other file, call ring_buffer_poll_wait and find ring_buffer no empty because read != commit, but because of cpu out-of-order execution, next event has not been set to padding type. 3. in ring_buffer_peek, expired data in this event show it is not a padding event, and return event to peek_next_entry. 4. event is set to padding type and detected in rb_event_data, then print a warning. 5. print_trace_line use error type to print this event, eventually crash. Maybe we can add smp_wb to make ture padding type will be set first: --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -2681,6 +2681,7 @@ rb_move_tail(struct ring_buffer_per_cpu *cpu_buffer, rb_reset_tail(cpu_buffer, tail, info); + smp_wb(); /* Commit what we have for now. */ rb_end_commit(cpu_buffer); /* rb_end_commit() decs committing */