Bug 118001 - an integer overflow in ring_buffer of 64-bit system
Summary: an integer overflow in ring_buffer of 64-bit system
Status: RESOLVED CODE_FIX
Alias: None
Product: Tracing/Profiling
Classification: Unclassified
Component: Ftrace (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Steven Rostedt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-11 10:39 UTC by Hao Qin
Modified: 2017-01-19 03:46 UTC (History)
0 users

See Also:
Kernel Version: found in source of 3.18.32 and 4.5.3, may be influence on more versions
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Hao Qin 2016-05-11 10:39:55 UTC
Hello, 
I found a integer overflow at ring_buffer_resize of kernel/trace/ring_buffer.c file of 64-bit system.

In ring_buffer_resize function of 64-bit system, the type of parameter "size" is unsiged long, and the type of variable "nr_pages" is unsigned. When execute the line 
"nr_pages = DIV_ROUND_UP(size, BUF_PAGE_SIZE);"
unchecked parameter "size" may cause "nr_pages" overflow.

I thought the "size" should be convert to unsigned int before be assigned to nr_pages.


I hope this report is helpful.

thanks
Comment 1 Steven Rostedt 2016-05-11 13:13:42 UTC
Although I agree this should be fixed (probably by making nr_pages into long, as if you have the memory, heck use it), this is not a real bug.

If we do have an overflow, the worse that will happen is that you get the wrong amount of pages. If you asked for more than 32bit size of pages, you will get less than that (the mod of it). There's checks to make sure that we can successfully allocate memory before it proceeds.

Again, the worse that will happen is that you will get less than you asked for.

But that said. I'll write a patch to fix this. I wont even bother marking it for stable, as I haven't had anyone complain yet (seldom do people ask for more that 2 billion pages per cpu).

-- Steve
Comment 2 Hao Qin 2016-05-12 03:11:43 UTC
I am sorry for that my description is not clear.
Because nr_pages_to_update of ring_buffer_per_cpu is int,  so nr_pages overflow may cause nr_pages_to_update to be negative.But it looks like no more damaged.

I have no much experience in kernel development. but I think only making nr_pages into long is no effect for this problem. Trying to making all related variable into long ? I have no idea.

Hao Qin

... and sorry for my Chinese English. :)
Comment 3 Steven Rostedt 2016-05-12 14:23:29 UTC
I meant to make all related variables into long, not just the one variable. But you are correct, I was able to crash the ring buffer (not the kernel because one of the RB_WARN_ON()'s caught the issues, and just shuts down the ring buffer). I did this by first echoing in 10 into buffer_size_kb and then 8556396480. The first 10 causes 3 pages to be allocated. Then the second number is (2^31/2^10+4)*4080. The trace.c code will shift this number by 10 (2^10), and then the DIV_ROUND_UP() uses BUF_PAGE_SIZE (4080). The 4 is 1 over the 3 pages. This makes cpu_buffer->nr_pages_to_update a negative number that is greater than the number of pages that exist (negative means to remove pages).

Then rb_remove_pages() will try to remove more pages than exists. This triggers the RB_WARN_ON(cpu_buffer, nr_removed), because nr_remove is not zero. That shuts down the ring buffer.

I'm convinced that the nr_pages variables should be long (but not all unsigned, as negative is used to create pages).

Thanks, I'll update this and send it to stable.
Comment 4 Steven Rostedt 2017-01-19 03:46:44 UTC
Commit 9b94a8fba501f38368aef6ac1b30e7335252a220 fixes the issue.

Note You need to log in before you can comment on or make changes to this bug.