Bug 26842 - [2.6.37 regression] threads with CPU affinity cannot be killed
Status: CLOSED CODE_FIX
Product: Process Management
Classification: Unclassified
Component: Scheduler
Hardware: All Linux
Importance: P1 normal
Assigned To: Ingo Molnar
Depends on:
Blocks: 21782
Reported: 2011-01-16 13:39 UTC by tim blechmann
Modified: 2011-02-12 23:18 UTC

See Also:
Kernel Version: 2.6.37
Tree: Mainline
Regression: Yes


Description tim blechmann 2011-01-16 13:39:42 UTC
my application uses multiple threads that run with SCHED_FIFO scheduling and are pinned to separate physical CPUs.

starting with 2.6.37, the application stopped working and i can no longer kill its processes with SIGKILL. i am also under the impression that the threads pinned to separate CPUs are never actually dispatched.

2.6.36 used to work fine ...
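
For anyone trying to reproduce this, the workload described above can be approximated with a small pthreads program. The following is only a sketch under stated assumptions, not the reporter's application: the helper names (rt_pinned_attr_init, spin_worker, run_pinned_spinners), the priority and the spin duration are all hypothetical. It uses glibc's pthread_attr_setaffinity_np() to pin each thread and SCHED_FIFO attributes to make it real-time.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

#define MAX_SPINNERS 64 /* arbitrary cap for this sketch */

/* Hypothetical helper: fill in attributes for a SCHED_FIFO thread pinned
 * to a single CPU. Returns 0 on success or a pthread error number. */
int rt_pinned_attr_init(pthread_attr_t *attr, int cpu, int prio)
{
    struct sched_param param = { .sched_priority = prio };
    cpu_set_t set;
    int err;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    if ((err = pthread_attr_init(attr)) != 0)
        return err;
    /* Without PTHREAD_EXPLICIT_SCHED the policy set below is ignored
     * and the new thread inherits the creator's scheduling. */
    if ((err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) == 0 &&
        (err = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) == 0 &&
        (err = pthread_attr_setschedparam(attr, &param)) == 0 &&
        (err = pthread_attr_setaffinity_np(attr, sizeof(set), &set)) == 0)
        return 0;
    pthread_attr_destroy(attr);
    return err;
}

/* Busy-loop for roughly *arg milliseconds of wall-clock time. */
static void *spin_worker(void *arg)
{
    long ms = *(long *)arg;
    struct timespec start, now;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while ((now.tv_sec - start.tv_sec) * 1000L +
             (now.tv_nsec - start.tv_nsec) / 1000000L < ms);
    return NULL;
}

/* Start one pinned SCHED_FIFO spinner on each of CPUs 0..ncpus-1, then
 * join them all. Returns 0 on success or the first error number seen;
 * pthread_create() typically fails with EPERM without real-time privileges. */
int run_pinned_spinners(int ncpus, int prio, long ms)
{
    pthread_t tid[MAX_SPINNERS];
    int started = 0, err = 0;

    if (ncpus < 1 || ncpus > MAX_SPINNERS)
        return EINVAL;
    for (int cpu = 0; cpu < ncpus; cpu++) {
        pthread_attr_t attr;

        if ((err = rt_pinned_attr_init(&attr, cpu, prio)) != 0)
            break;
        err = pthread_create(&tid[started], &attr, spin_worker, &ms);
        pthread_attr_destroy(&attr);
        if (err != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return err;
}
```

Calling run_pinned_spinners(4, 10, 2000) from a main() run as root would roughly match the "4 real-time threads, each on its own core" setup. The spin time is bounded on purpose, since an unbounded SCHED_FIFO busy loop on every CPU can make a machine unresponsive.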
Comment 1 tim blechmann 2011-01-22 22:12:42 UTC
2.6.38-rc2 still has the issue. it seems i cannot kill any thread that is running with real-time scheduling.
Comment 2 tim blechmann 2011-02-05 17:44:42 UTC
i bisected it, the first bad commit is:

commit 34f971f6f7988be4d014eec3e3526bee6d007ffa
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Wed Sep 22 13:53:15 2010 +0200

    sched: Create special class for stop/migrate work
    
    In order to separate the stop/migrate work thread from the SCHED_FIFO
    implementation, create a special class for it that is of higher priority than
    SCHED_FIFO itself.
    
    This currently solves a problem where cpu-hotplug consumes so much cpu-time
    that the SCHED_FIFO class gets throttled, but has the bandwidth replenishment
    timer pending on the now dead cpu.
    
    It is also required for when we add the planned deadline scheduling class above
    SCHED_FIFO, as the stop/migrate thread still needs to transcent those tasks.
    
    Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <1285165776.2275.1022.camel@laptop>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
Comment 3 Rafael J. Wysocki 2011-02-05 18:04:38 UTC
First-Bad-Commit : 34f971f6f7988be4d014eec3e3526bee6d007ffa
Comment 4 tim blechmann 2011-02-06 14:26:01 UTC
this two-liner seems to resolve the symptoms:

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 2df820b..2035b4f 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -293,6 +293,7 @@ extern void sched_set_stop_task(int cpu, struct task_struct *stop);
 static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
                       unsigned long action, void *hcpu)
 {
+   struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
    unsigned int cpu = (unsigned long)hcpu;
    struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
    struct task_struct *p;
@@ -305,6 +306,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
                   cpu);
        if (IS_ERR(p))
            return notifier_from_errno(PTR_ERR(p));
+       sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
        get_task_struct(p);
        kthread_bind(p, cpu);
        sched_set_stop_task(cpu, p);
Comment 5 tim blechmann 2011-02-06 14:37:21 UTC
hm actually not quite:

while i can finally kill my application, it doesn't behave correctly: the application has 4 real-time threads, each pinned to a separate CPU core. some of the threads do not seem to get scheduled (they don't consume any CPU time)
Comment 6 Peter Zijlstra 2011-02-07 13:22:17 UTC
On Sun, 2011-02-06 at 14:26 +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=26842
> 
> --- Comment #4 from tim blechmann <tim@klingt.org>  2011-02-06 14:26:01 ---
> this two-liner seems to resolve the symptoms:
> 
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 2df820b..2035b4f 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -293,6 +293,7 @@ extern void sched_set_stop_task(int cpu, struct task_struct *stop);
>  static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
>                        unsigned long action, void *hcpu)
>  {
> +   struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
>     unsigned int cpu = (unsigned long)hcpu;
>     struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
>     struct task_struct *p;
> @@ -305,6 +306,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
>                    cpu);
>         if (IS_ERR(p))
>             return notifier_from_errno(PTR_ERR(p));
> +       sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
>         get_task_struct(p);
>         kthread_bind(p, cpu);
>         sched_set_stop_task(cpu, p);
> 

That should be an absolute NOP, the task shouldn't be running and
sched_set_stop_task() already fiddles with sched_setscheduler.

Does the below patch cure things? If not, I'll try and write a proglet
that does what you describe to see if I can reproduce.

---
commit 06c3bc655697b19521901f9254eb0bbb2c67e7e8
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Wed Feb 2 13:19:48 2011 +0100

    sched: Fix update_curr_rt()
    
    cpu_stopper_thread()
      migration_cpu_stop()
        __migrate_task()
          deactivate_task()
            dequeue_task()
              dequeue_task_rq()
                update_curr_rt()
    
    Will call update_curr_rt() on rq->curr, which at that time is
    rq->stop. The problem is that rq->stop.prio matches an RT prio and
    thus falsely assumes its a rt_sched_class task.
    
    Reported-Debuged-Tested-Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <new-submission>
    Cc: stable@kernel.org # .37
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index c914ec7..ad62677 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -625,7 +625,7 @@ static void update_curr_rt(struct rq *rq)
 	struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
 	u64 delta_exec;
 
-	if (!task_has_rt_policy(curr))
+	if (curr->sched_class != &rt_sched_class)
 		return;
 
 	delta_exec = rq->clock_task - curr->se.exec_start;
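
The one-line change above can be illustrated with a user-space toy model. This is a sketch only, not kernel code: every model_* name is hypothetical, and it assumes that the old check (task_has_rt_policy()) looks only at the task's policy, and that the stop task keeps a SCHED_FIFO policy and RT priority while belonging to the separate stop class introduced by commit 34f971f6.

```c
#include <stdbool.h>

/* Policy numbers mirror Linux's SCHED_NORMAL/FIFO/RR values; they are
 * redefined with a MODEL_ prefix to keep the sketch standalone. */
enum { MODEL_SCHED_NORMAL = 0, MODEL_SCHED_FIFO = 1, MODEL_SCHED_RR = 2 };

/* Stand-in for the kernel's per-class operations table. */
struct model_sched_class {
    const char *name;
};

const struct model_sched_class model_stop_class = { "stop" };
const struct model_sched_class model_rt_class = { "rt" };

/* Only the two fields the checks below look at. */
struct model_task {
    int policy;
    const struct model_sched_class *sched_class;
};

/* Old-style check (assumed): decides from the policy alone. */
bool old_check_is_rt(const struct model_task *t)
{
    return t->policy == MODEL_SCHED_FIFO || t->policy == MODEL_SCHED_RR;
}

/* New-style check: decides from the class the task actually belongs to. */
bool new_check_is_rt(const struct model_task *t)
{
    return t->sched_class == &model_rt_class;
}

/* Assumed shape of the stop task: RT-looking policy, but its own class. */
struct model_task model_make_stop_task(void)
{
    struct model_task t = { MODEL_SCHED_FIFO, &model_stop_class };
    return t;
}
```

In this model, old_check_is_rt() accepts the stop task and would go on to do RT runtime accounting for it, while new_check_is_rt() rejects it. An ordinary SCHED_FIFO task in model_rt_class passes both checks.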
Comment 7 tim blechmann 2011-02-07 15:27:35 UTC
this patch seems to fix it ...
Comment 8 Florian Mickler 2011-02-09 05:45:20 UTC
Patch: https://bugzilla.kernel.org/show_bug.cgi?id=26842#c6
Comment 9 Rafael J. Wysocki 2011-02-12 23:18:09 UTC
Fixed by commit 06c3bc655697b19521901f9254eb0bbb2c67e7e8.
