Bug 219358

Summary: test_cpu_read_buf_percent() fails with (expect < 0)
Product: Tools Reporter: Adrien Nader (adrien)
Component: Trace-cmd/KernelsharkAssignee: Default virtual assignee for Trace-cmd and kernelshark (tools_tracecmd_kernelshark)
Status: NEEDINFO ---    
Severity: normal CC: rostedt
Priority: P3    
Hardware: PPC-64   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: Handle large sub-buffer size in percentage test
Do not test a read of more than buffer size data
Do not test more events than what the ring buffer can hold

Description Adrien Nader 2024-10-07 14:39:34 UTC
Hi,

When working on the libtracefs 1.8.0/1.8.1 package in Ubuntu, we
observed testsuite failures on ppc64el. I've looked at the first one but
I'm not sure I fully understand the purpose of that test.

The issue is that in test_cpu_read_buf_percent(), expect is negative when
percent = 1.

The test in the following code fails:

    expect = data->nr_subbufs * data->events_per_buf * percent / 100;

    /* Add just under the percent */
    expect -= data->events_per_buf;
    CU_TEST(expect > 0);

I obtained the values used to compute expect:

    buffersize     = 1506304
    bufsize        = 65536
    data_size      = 65520
    events_per_buf = 3276
    nr_subbufs     = 22

    nr_subbufs * events_per_buf * percent / 100
      = 22 * 3276 * 1 / 100 = 720

    expect - events_per_buf
      = 720 - 3276
      = -2556

For reference, the computation with real numbers would give a positive
value:

    nr_subbufs = 1506304 / 65520 = 22.99 // are we unlucky?
    events_per_buf = 65520 / 20 = 3276
    nr_subbufs * events_per_buf * percent / 100 - events_per_buf
      = 22.99 * 3276 - 2556
      = 1802

I expect the check isn't there to test something specific but rather to
ensure the logic of the subsequent code is not derailed by an
unreasonable value.

Is there a specific reason that much is subtracted from expect? Should
this be changed? Should the values be compute slightly differently?

Thanks,

-- 
Adrien
Comment 1 Steven Rostedt 2024-10-07 15:16:08 UTC
The issue is because PowerPC has 64K PAGE_SIZE. The ring buffer sub-buffer size is always at least PAGE_SIZE. That means the "events_per_buf" is going to be quite big as you pointed out. If we did the same logic with the normal 4K PAGE_SIZE we would have:


    buffersize     = 1506304
    bufsize        = 4096
    data_size      = 4080
    events_per_buf = 204
    nr_subbufs     = 369

The algorithm only works when we have more than 100 sub-buffers, but in the case of PowerPC with its large PAGE_SIZE, that's not going to be the case.

Yes, the test is broken here, and you are correct, the check that failed was to make sure the algorithm used was correct (which it is not for PowerPC).

The fix is to account for large sub buffers in the test. I'll fix that sometime in the near future.

Thanks for the report!

-- Steve
Comment 2 Steven Rostedt 2024-10-07 15:32:02 UTC
Created attachment 306981 [details]
Handle large sub-buffer size in percentage test

Create a "min_percent" in the buffer percentage test that will cover at least one sub-buffer size. As PowerPC has 64K PAGE_SIZE which makes the 1 percent less than a sub-buffer page, and the subtraction of one page is going to cause issues.
Comment 3 Steven Rostedt 2024-10-07 15:32:25 UTC
I wrote up a quick fix. Can you test that?
Comment 4 Adrien Nader 2024-10-07 15:47:26 UTC
I'll try to throw that at the CI this evening and I'll pick up a ppc64el machine tomorrow if needed. Thanks a lot!
Comment 5 Adrien Nader 2024-10-14 08:12:07 UTC
The patch works and lets the test continue further but only by a few lines as the following fails:

    CU_TEST(data->done == false);

Since there was a call to msleep(), I tried to increase it without thinking and msleep(60000) makes the tests succeed while msleep(10000) doesn't. However, I don't understand why it improves the situation: reading the code, I would expect the converse to be true. I also wanted to add explicit synchronization between the threads but wasn't able to find out where and when the two are supposed to synchronize or run in parallel.



By the way, there is also a failure in test_trace_cpu_read() which seems to be caused by the page size. When calling the following:

    test_cpu_read(&data, data.events_per_buf * 50);

expect is 163800 while cnt is reliably 75348. For comparison, on arm64, both are 10200 ( = 163800 / 16, as expected). Should something less than 50 be used on ppc64el?
Comment 6 Steven Rostedt 2024-10-16 17:24:25 UTC
On Mon, 14 Oct 2024 08:12:07 +0000
bugzilla-daemon@kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=219358
> 
> --- Comment #5 from Adrien Nader (adrien@notk.org) ---
> The patch works and lets the test continue further but only by a few lines as
> the following fails:
> 
>     CU_TEST(data->done == false);
> 
> Since there was a call to msleep(), I tried to increase it without thinking
> and
> msleep(60000) makes the tests succeed while msleep(10000) doesn't. However, I
> don't understand why it improves the situation: reading the code, I would
> expect the converse to be true. I also wanted to add explicit synchronization

I would think it would be reverse too.

> between the threads but wasn't able to find out where and when the two are
> supposed to synchronize or run in parallel.

I would add printf() statements in the trace_cpu_read_thread() code.
Perhaps you can figure something out?

> 
> 
> 
> By the way, there is also a failure in test_trace_cpu_read() which seems to
> be
> caused by the page size. When calling the following:
> 
>     test_cpu_read(&data, data.events_per_buf * 50);
> 
> expect is 163800 while cnt is reliably 75348. For comparison, on arm64, both
> are 10200 ( = 163800 / 16, as expected). Should something less than 50 be
> used
> on ppc64el?

I should probably add a check to make sure events_per_buf * 50 is less than
the size of a individual per-cpu buffer.
Comment 7 Steven Rostedt 2024-10-16 17:31:44 UTC
Created attachment 307016 [details]
Do not test a read of more than buffer size data

Make sure the events_per_buf * 50 is not greater than the per CPU buffer size.
Comment 8 Steven Rostedt 2024-10-16 17:32:30 UTC
I added a patch to hopefully fix the * 50 run.
Comment 9 Steven Rostedt 2024-10-16 17:48:12 UTC
Created attachment 307017 [details]
Do not test more events than what the ring  buffer can hold

Found a bug in the last version of this patch. I forgot that tracefs_instance_get_buffer_size() returns the size in kilobytes. I need to multiply 1024 to it.

I also need to update the documentation to state that's what is returned. As he man page for tracefs_instance_get_buffer_size() doesn't state that!
Comment 10 Adrien Nader 2024-10-21 15:25:42 UTC
Unfortunately this didn't fix the issue: the condition is not verified and if I force it to true, the computed big_num is negative.

The actual limit seems to be data.nr_subbufs. I've checked that nr_subbufs succeeds on ppc64el and nr_subbufs + 1 fails; I've also checked that nr_subbufs succeeds on arm64 while nr_subbufs + 1 works but nr_subbufs + 10 fails (on arm64, nr_subbufs = 353 so it's not unexpected to have to increase by more than one to fail).

In that case, the diff is:

diff --git a/utest/tracefs-utest.c b/utest/tracefs-utest.c
index 5bfd665..a7d8423 100644
--- a/utest/tracefs-utest.c
+++ b/utest/tracefs-utest.c
@@ -1278,7 +1278,7 @@ static void test_instance_trace_cpu_read(struct tracefs_instance *instance, bool
        test_cpu_read(&data, data.events_per_buf / 2);
        test_cpu_read(&data, data.events_per_buf);
        test_cpu_read(&data, data.events_per_buf + 1);
-       test_cpu_read(&data, data.events_per_buf * 50);
+       test_cpu_read(&data, data.events_per_buf * (50 <= data.nr_subbufs ? 50 : data.nr_subbufs));

        shutdown_trace_cpu(&data);
 }
Comment 11 Steven Rostedt 2024-10-22 07:48:30 UTC
On Mon, 21 Oct 2024 15:25:42 +0000
bugzilla-daemon@kernel.org wrote:
> 
> diff --git a/utest/tracefs-utest.c b/utest/tracefs-utest.c
> index 5bfd665..a7d8423 100644
> --- a/utest/tracefs-utest.c
> +++ b/utest/tracefs-utest.c
> @@ -1278,7 +1278,7 @@ static void test_instance_trace_cpu_read(struct
> tracefs_instance *instance, bool
>         test_cpu_read(&data, data.events_per_buf / 2);
>         test_cpu_read(&data, data.events_per_buf);
>         test_cpu_read(&data, data.events_per_buf + 1);
> -       test_cpu_read(&data, data.events_per_buf * 50);
> +       test_cpu_read(&data, data.events_per_buf * (50 <= data.nr_subbufs ?
> 50
> : data.nr_subbufs));

Actually, I rather have it be at least one less than nr_subbufs. If you have it as:

       test_cpu_read(&data, data.events_per_buf * (50 < data.nr_subbufs ? 50 : data.nr_subbufs - 1));

Would that work?


> 
>         shutdown_trace_cpu(&data);
>  }
>
Comment 12 Adrien Nader 2024-10-22 10:34:45 UTC
I just tried this and:

- arm64 passes
- ppc64el passes
- amd64 wasn't built due to unrelated CI issues so I couldn't test in exactly the same setup but it was passing on yesterday with the higher value
- s390x doesn't pass but there are numerous errors on s390x and this was already present

I want to tackle s390x too but after ppc64el is done.
Comment 13 Adrien Nader 2024-10-30 10:10:53 UTC
Hi Steven,

I don't know if you are waiting on something related to s390x. I realize I wasn't very clear: I don't intend to touch s390x for now and I think changes for the current issues shouldn't wait on that. No intent to rush you either, I simply thought my message was maybe ambiguous. :)
Comment 14 Steven Rostedt 2024-11-04 21:11:26 UTC
Just so I understand. Both the attached patches should be applied then?