Bug 9779

Summary: Setting cpu_share to 1 freezes system
Product: Process Management Reporter: Daniel Hahler (linux-bugs)
Component: Scheduler Assignee: Ingo Molnar (mingo)
Status: CLOSED CODE_FIX    
Severity: high CC: bunk, dhaval
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24 Subsystem:
Regression: --- Bisected commit-id:
Attachments: fix for root-triggered /sys/kernel/uids hang

Description Daniel Hahler 2008-01-19 16:12:37 UTC
Distribution: Ubuntu
Hardware Environment: amd64, x86 kernel
Problem Description:

The following command locks up my system:
echo 1 | sudo tee /sys/kernel/uids/`id -u`/cpu_share

While I have not reproduced it for this command, it locked up my system twice when trying it with the uid of my system user "boinc".

Others on the IRC channel ##kernel could not confirm it (using virtual machines on 64bit and 32bit), but I think it's grave enough to report it anyway.

I'm using the 2.6.24 kernel from Ubuntu (but someone on ##kernel could not confirm it using the ubuntu-server kernel).

My CPU is AMD64 3000+.

Please ask for any details you need to understand the problem.


Steps to reproduce:
echo 1 | sudo tee /sys/kernel/uids/`id -u`/cpu_share
Comment 1 Daniel Hahler 2008-01-20 15:20:46 UTC
It appears that this bug only happens when setting cpu_share=1 for _another_ user.

Here's a way to reproduce it also in a virtual machine (at least for me):
1. Create another user (I'm using "foo"), or use an existing one (e.g. "boinc")
2. sudo -u foo python -c 'while 1: i = 1' &
3. echo 1 | sudo tee /sys/kernel/uids/`id -u foo`/cpu_share

The gdb backtrace of the VirtualBox process then looks like the following (I don't know if it's really useful):
(gdb) bt
#0  0xb7f86410 in __kernel_vsyscall ()
#1  0xb58f7311 in select () from /lib/tls/i686/cmov/libc.so.6
#2  0xb79b2d23 in QEventLoop::processEvents (this=0x8312120, flags=4)
    at kernel/qeventloop_x11.cpp:291
#3  0xb7a286a8 in QEventLoop::enterLoop (this=0x8312120) at kernel/qeventloop.cpp:198
#4  0xb7a283a6 in QEventLoop::exec (this=0x8312120) at kernel/qeventloop.cpp:145
#5  0xb7a0eef7 in QApplication::exec (this=0xbf84bf90) at kernel/qapplication.cpp:2758
#6  0x08127e3f in ?? ()
#7  0xbf84bf90 in ?? ()
#8  0xbf84c074 in ?? ()
#9  0x00000000 in ?? ()
Comment 2 Dhaval Giani 2008-01-21 04:53:00 UTC
Can you send your kernel stack trace? It should be printed on the console when the system crashes.

I'm downloading and setting up VirtualBox now to try to reproduce it here. I do not see it happening on my test system.

Can you please send your .config as well?
Comment 3 Ingo Molnar 2008-01-22 02:13:11 UTC
* bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> wrote:

> The following command locks up my system:
> echo 1 | sudo tee /sys/kernel/uids/`id -u`/cpu_share
> 
> While I have not reproduced it for this command, it locked up my 
> system twice when trying it with the uid of my system user "boinc".

does the patch below fix it?

	Ingo

Subject: x86: patches/sched-user-share-fix.patch
From: Ingo Molnar <mingo@elte.hu>


Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -268,7 +268,7 @@ struct task_group init_task_group = {
 # define INIT_TASK_GROUP_LOAD	NICE_0_LOAD
 #endif
 
-#define MIN_GROUP_SHARES	1
+#define MIN_GROUP_SHARES	2
 
 static int init_task_group_load = INIT_TASK_GROUP_LOAD;
 
Comment 4 Ingo Molnar 2008-01-22 02:19:55 UTC
please try the fix below instead.

-------------->
Subject: sched: group scheduler, set uid share fix
From: Ingo Molnar <mingo@elte.hu>

setting cpu share to 1 causes hangs, as reported in:

    http://bugzilla.kernel.org/show_bug.cgi?id=9779

as the default share is 1024, the values of 0 and 1 can indeed
cause problems. Limit it to 2 or higher values.

These values can only be set by the root user - but still it
makes sense to protect against nonsensical values.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -7136,6 +7136,14 @@ static void set_se_shares(struct sched_e
 
 	spin_lock_irq(&rq->lock);
 
+	/*
+	 * A weight of 0 or 1 can cause arithmetics problems.
+	 * (The default weight is 1024 - so there's no practical
+	 *  limitation from this.)
+	 */
+	if (shares < 2)
+		shares = 2;
+
 	on_rq = se->on_rq;
 	if (on_rq)
 		dequeue_entity(cfs_rq, se, 0);
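
For readers outside the kernel tree, the clamp added by the patch above can be sketched as a standalone userspace C function. This is only an illustration under stated assumptions: clamp_shares and MIN_SHARES are hypothetical names that do not exist in kernel/sched.c (the patch hard-codes the value 2 inside set_se_shares()), and the unsigned long type is assumed here.

```c
/*
 * Userspace sketch of the lower bound applied by the patch.
 * MIN_SHARES and clamp_shares() are hypothetical names used only
 * for illustration; they are not kernel identifiers.
 */
#define MIN_SHARES 2UL

/* Return the share value that would actually be applied:
 * requests below MIN_SHARES (i.e. 0 or 1) are raised to MIN_SHARES,
 * everything else is passed through unchanged. */
unsigned long clamp_shares(unsigned long shares)
{
	if (shares < MIN_SHARES)
		shares = MIN_SHARES;
	return shares;
}
```

With this sketch, a request of 0 or 1 is applied as 2, while 2 and the default of 1024 pass through unchanged, which matches the commit message's point that the limit has no practical effect on sensible values.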
Comment 5 Ingo Molnar 2008-01-22 02:24:32 UTC
Created attachment 14531 [details]
fix for root-triggered /sys/kernel/uids hang

please try this attached patch instead.
Comment 6 Dhaval Giani 2008-01-22 02:33:21 UTC
ok, i will agree with that fix :-).

Do you want to try the other fix for sched-devel? (Using the macro?)
Comment 7 Dhaval Giani 2008-01-22 02:35:06 UTC
Ingo, this fix will work as it has been reported that setting the shares as 2 does not cause a hang.
Comment 8 Ingo Molnar 2008-01-22 07:27:10 UTC
> Ingo, this fix will work as it has been reported that setting the 
> shares as 2 does not cause a hang.

nitpick: testing the patch is still required, to make sure it's really 
fixed. I too see that that patch sets it to 2, but it's only fixed if 
testers confirm that the patch works :)
Comment 9 Adrian Bunk 2008-01-22 20:39:35 UTC
Fix got included in Linus' tree as commit c61935fd0e7f087a643827b4bf5ef646963c10fa.