Bug 64831

Summary: do_IRQ displays IRQ warnings during CPU hotplug
Product: Platform Specific/Hardware
Reporter: Prarit Bhargava (prarit)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.12
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Patch that resolves this issue

Description Prarit Bhargava 2013-11-11 23:03:13 UTC
Created attachment 114311
Patch that resolves this issue

When doing CPU hotplug testing, I occasionally see

[  612.014573] do_IRQ: 56.134 No irq handler for vector (irq -1)

on the serial console.

This warning indicates that an unregistered IRQ occurred and no handler was found.

In the cpu_down path on x86, fixup_irqs() is called and the APIC IRR is examined.  For each vector set in the IRR, the corresponding IRQ handler is called through a retrigger event.  The vector_irq[] entry for that vector is then set to -1.  After this, some time can elapse before the CPU actually takes the interrupt, which calls do_IRQ() for a vector that no longer has a vector_irq[] entry.  This causes do_IRQ() to output the bogus message for an event that has already been handled.

To help debug this I did (sorry for the cut-and-paste):

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 22d0687..98335a0 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -355,8 +355,11 @@ void fixup_irqs(void)
                        data = irq_desc_get_irq_data(desc);
                        chip = irq_data_get_irq_chip(data);
                        raw_spin_lock(&desc->lock);
-                       if (chip->irq_retrigger)
+                       if (chip->irq_retrigger) {
+                               printk("%s: %d.%u retriggered\n", __func__,
+                                      smp_processor_id(), irq);
                                chip->irq_retrigger(data);
+                       }
                        raw_spin_unlock(&desc->lock);
                }
                __this_cpu_write(vector_irq[vector], -1);

to confirm that the retrigger and the do_IRQ message were for the same CPU and IRQ.

Comment 1 Prarit Bhargava 2013-11-11 23:14:43 UTC
Submitted upstream here:

http://marc.info/?l=linux-kernel&m=138421132030503&w=2

P.