Bug 106831 - Back-to-back OOM Killer Events Lockup the Kernel
Status: NEW
Alias: None
Product: Process Management
Classification: Unclassified
Component: Scheduler
Hardware: All Linux
Importance: P1 normal
Assignee: Ingo Molnar
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-28 23:35 UTC by Chris Carday
Modified: 2016-03-23 18:25 UTC
CC List: 1 user

See Also:
Kernel Version: 3.10.53
Subsystem:
Regression: No
Bisected commit-id:


Description Chris Carday 2015-10-28 23:35:56 UTC
The current kernel code:

When the OOM killer runs, the kernel picks a task (thread) to kill and sets a flag on it (TIF_MEMDIE) meaning "I am dying, so skip me if another OOM event comes along". It then sends SIGKILL to each task in that process, using a loop, and the task structs start getting cleaned up.

If a new OOM event arrives, the kernel again looks for a task to kill, choosing one that does not have the dying flag set. It may pick a task in the same process as before if that process has not yet been fully killed and cleaned up. It then begins the same kill procedure, but the thread lists are in an intermediate state because the previous OOM kill has already started tearing them down. The traversal can therefore loop forever on transient list members.

Proposed fix:

Instead of setting the flag only on the one chosen task, set it on every task of the process chosen to be killed. This prevents the same process from being chosen a second time.

--- linux-3.10.53/mm/oom_kill.c	2014-08-13 21:24:29.000000000 -0400
+++ linux-3.10.53-working/mm/oom_kill.c	2015-10-28 16:39:28.274157000 -0400
@@ -501,7 +501,15 @@
 		}
 	rcu_read_unlock();
 
-	set_tsk_thread_flag(victim, TIF_MEMDIE);
+	write_lock(&tasklist_lock);
+	t = victim;
+
+	do {
+		set_tsk_thread_flag(t, TIF_MEMDIE);
+	} while_each_thread(victim, t);
+
+	write_unlock(&tasklist_lock);
+
 	do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
 	put_task_struct(victim);
 }
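
The difference between flagging one task and flagging the whole thread group can be illustrated with a small user-space model. This is only a sketch of the behavior as described in this report, not a transcription of mm/oom_kill.c: struct model_task and the model_* functions are hypothetical names, and victim selection is reduced to "first task without the flag" (the real kernel scores tasks by badness).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a task; not kernel code. */
struct model_task {
	int tgid;     /* id of the process (thread group) the task belongs to */
	bool memdie;  /* models the TIF_MEMDIE flag */
};

/* Return the first task that does not carry the dying flag, or NULL. */
static struct model_task *model_select_victim(struct model_task *tasks,
					      size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!tasks[i].memdie)
			return &tasks[i];
	return NULL;
}

/* Behavior described as current: flag only the chosen task. */
static void model_mark_one(struct model_task *victim)
{
	victim->memdie = true;
}

/* Behavior proposed by the patch: flag every task in the victim's group. */
static void model_mark_group(struct model_task *tasks, size_t n,
			     const struct model_task *victim)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].tgid == victim->tgid)
			tasks[i].memdie = true;
}
```

With two threads in process 100 and one in process 200, calling model_mark_one() on the first victim leaves the second thread of process 100 selectable on the next pass. Calling model_mark_group() instead moves the next selection on to process 200.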
Comment 1 Chris Carday 2015-10-28 23:46:27 UTC
Specifically, the bug is that task_struct->thread_group is being traversed and the walk hits a member that has "t->next = t". The code then loops there forever while holding the tasklist_lock.
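
The non-terminating walk can be reproduced in user space with a simplified circular list. The sketch below assumes a do { } while_each_thread(g, t) style traversal that stops only when it gets back to the starting thread g. struct model_thread and model_walk() are hypothetical names, and the step limit exists only so the demonstration terminates; the kernel loop has no such limit.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for one thread_group list entry. */
struct model_thread {
	struct model_thread *next;
};

/*
 * Walk the ring starting at g, stopping when the walk returns to g.
 * Returns the number of threads visited, or -1 if the walk has not
 * returned to g after max_steps steps.
 */
static long model_walk(struct model_thread *g, long max_steps)
{
	struct model_thread *t = g;
	long visited = 0;

	do {
		if (visited == max_steps)
			return -1;	/* the unbounded loop would spin here */
		visited++;
		t = t->next;
	} while (t != g);

	return visited;
}
```

On an intact ring of three threads the walk visits three entries and stops. If a member other than g is left pointing at itself, the walk never reaches g again and model_walk() returns -1, which corresponds to the lockup described above.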
