Bug 219358
Summary: | test_cpu_read_buf_percent() fails with (expect < 0) | ||
---|---|---|---|
Product: | Tools | Reporter: | Adrien Nader (adrien) |
Component: | Trace-cmd/Kernelshark | Assignee: | Default virtual assignee for Trace-cmd and kernelshark (tools_tracecmd_kernelshark) |
Status: | NEEDINFO --- | ||
Severity: | normal | CC: | rostedt |
Priority: | P3 | ||
Hardware: | PPC-64 | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: | |
Attachments: |
Handle large sub-buffer size in percentage test
Do not test a read of more than buffer size data Do not test more events than what the ring buffer can hold |
Description
Adrien Nader
2024-10-07 14:39:34 UTC
The issue is because PowerPC has 64K PAGE_SIZE. The ring buffer sub-buffer size is always at least PAGE_SIZE. That means the "events_per_buf" is going to be quite big as you pointed out. If we did the same logic with the normal 4K PAGE_SIZE we would have: buffersize = 1506304 bufsize = 4096 data_size = 4080 events_per_buf = 204 nr_subbufs = 369 The algorithm only works when we have more than 100 sub-buffers, but in the case of PowerPC with its large PAGE_SIZE, that's not going to be the case. Yes, the test is broken here, and you are correct, the check that failed was to make sure the algorithm used was correct (which it is not for PowerPC). The fix is to account for large sub buffers in the test. I'll fix that sometime in the near future. Thanks for the report! -- Steve Created attachment 306981 [details]
Handle large sub-buffer size in percentage test
Create a "min_percent" in the buffer percentage test that will cover at least one sub-buffer size. As PowerPC has 64K PAGE_SIZE which makes the 1 percent less than a sub-buffer page, and the subtraction of one page is going to cause issues.
I wrote up a quick fix. Can you test that? I'll try to throw that at the CI this evening and I'll pick up a ppc64el machine tomorrow if needed. Thanks a lot! The patch works and lets the test continue further but only by a few lines as the following fails: CU_TEST(data->done == false); Since there was a call to msleep(), I tried to increase it without thinking and msleep(60000) makes the tests succeed while msleep(10000) doesn't. However, I don't understand why it improves the situation: reading the code, I would expect the converse to be true. I also wanted to add explicit synchronization between the threads but wasn't able to find out where and when the two are supposed to synchronize or run in parallel. By the way, there is also a failure in test_trace_cpu_read() which seems to be caused by the page size. When calling the following: test_cpu_read(&data, data.events_per_buf * 50); expect is 163800 while cnt is reliably 75348. For comparison, on arm64, both are 10200 ( = 163800 / 16, as expected). Should something less than 50 be used on ppc64el? On Mon, 14 Oct 2024 08:12:07 +0000 bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=219358 > > --- Comment #5 from Adrien Nader (adrien@notk.org) --- > The patch works and lets the test continue further but only by a few lines as > the following fails: > > CU_TEST(data->done == false); > > Since there was a call to msleep(), I tried to increase it without thinking > and > msleep(60000) makes the tests succeed while msleep(10000) doesn't. However, I > don't understand why it improves the situation: reading the code, I would > expect the converse to be true. I also wanted to add explicit synchronization I would think it would be reverse too. > between the threads but wasn't able to find out where and when the two are > supposed to synchronize or run in parallel. I would add printf() statements in the trace_cpu_read_thread() code. Perhaps you can figure something out? > > > > By the way, there is also a failure in test_trace_cpu_read() which seems to > be > caused by the page size. When calling the following: > > test_cpu_read(&data, data.events_per_buf * 50); > > expect is 163800 while cnt is reliably 75348. For comparison, on arm64, both > are 10200 ( = 163800 / 16, as expected). Should something less than 50 be > used > on ppc64el? I should probably add a check to make sure events_per_buf * 50 is less than the size of a individual per-cpu buffer. Created attachment 307016 [details]
Do not test a read of more than buffer size data
Make sure the events_per_buf * 50 is not greater than the per CPU buffer size.
I added a patch to hopefully fix the * 50 run. Created attachment 307017 [details]
Do not test more events than what the ring buffer can hold
Found a bug in the last version of this patch. I forgot that tracefs_instance_get_buffer_size() returns the size in kilobytes. I need to multiply 1024 to it.
I also need to update the documentation to state that's what is returned. As he man page for tracefs_instance_get_buffer_size() doesn't state that!
Unfortunately this didn't fix the issue: the condition is not verified and if I force it to true, the computed big_num is negative. The actual limit seems to be data.nr_subbufs. I've checked that nr_subbufs succeeds on ppc64el and nr_subbufs + 1 fails; I've also checked that nr_subbufs succeeds on arm64 while nr_subbufs + 1 works but nr_subbufs + 10 fails (on arm64, nr_subbufs = 353 so it's not unexpected to have to increase by more than one to fail). In that case, the diff is: diff --git a/utest/tracefs-utest.c b/utest/tracefs-utest.c index 5bfd665..a7d8423 100644 --- a/utest/tracefs-utest.c +++ b/utest/tracefs-utest.c @@ -1278,7 +1278,7 @@ static void test_instance_trace_cpu_read(struct tracefs_instance *instance, bool test_cpu_read(&data, data.events_per_buf / 2); test_cpu_read(&data, data.events_per_buf); test_cpu_read(&data, data.events_per_buf + 1); - test_cpu_read(&data, data.events_per_buf * 50); + test_cpu_read(&data, data.events_per_buf * (50 <= data.nr_subbufs ? 50 : data.nr_subbufs)); shutdown_trace_cpu(&data); } On Mon, 21 Oct 2024 15:25:42 +0000 bugzilla-daemon@kernel.org wrote: > > diff --git a/utest/tracefs-utest.c b/utest/tracefs-utest.c > index 5bfd665..a7d8423 100644 > --- a/utest/tracefs-utest.c > +++ b/utest/tracefs-utest.c > @@ -1278,7 +1278,7 @@ static void test_instance_trace_cpu_read(struct > tracefs_instance *instance, bool > test_cpu_read(&data, data.events_per_buf / 2); > test_cpu_read(&data, data.events_per_buf); > test_cpu_read(&data, data.events_per_buf + 1); > - test_cpu_read(&data, data.events_per_buf * 50); > + test_cpu_read(&data, data.events_per_buf * (50 <= data.nr_subbufs ? > 50 > : data.nr_subbufs)); Actually, I rather have it be at least one less than nr_subbufs. If you have it as: test_cpu_read(&data, data.events_per_buf * (50 < data.nr_subbufs ? 50 : data.nr_subbufs - 1)); Would that work? > > shutdown_trace_cpu(&data); > } > I just tried this and: - arm64 passes - ppc64el passes - amd64 wasn't built due to unrelated CI issues so I couldn't test in exactly the same setup but it was passing on yesterday with the higher value - s390x doesn't pass but there are numerous errors on s390x and this was already present I want to tackle s390x too but after ppc64el is done. Hi Steven, I don't know if you are waiting on something related to s390x. I realize I wasn't very clear: I don't intend to touch s390x for now and I think changes for the current issues shouldn't wait on that. No intent to rush you either, I simply thought my message was maybe ambiguous. :) Just so I understand. Both the attached patches should be applied then? |