Bug 187351 - unhandled irq stack trace during shutdown
Summary: unhandled irq stack trace during shutdown
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64 (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-11-09 19:31 UTC by Prarit Bhargava
Modified: 2016-11-30 15:06 UTC (History)
3 users (show)

See Also:
Kernel Version: 4.9.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Boot log for unhandled irq (153.57 KB, text/x-log)
2016-11-09 19:31 UTC, Prarit Bhargava
Details

Description Prarit Bhargava 2016-11-09 19:31:07 UTC
Created attachment 244081 [details]
Boot log for unhandled irq

The following stack trace for an unhandled irq occurs on system shutdown:

irq 16: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-0.rc4.git0.1.el7_UNSUPPORTED.x86_64 #1
Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016
 ffff88041f803eb0 ffffffff81337765 ffff8804186dbc00 ffff8804186dbc9c
 ffff88041f803ed8 ffffffff810db935 ffff8804186dbc00 0000000000000000
 0000000000000010 ffff88041f803f10 ffffffff810dbc8f ffff8804186dbc00
Call Trace:
 <IRQ>  [<ffffffff81337765>] dump_stack+0x63/0x8e
 [<ffffffff810db935>] __report_bad_irq+0x35/0xd0
 [<ffffffff810dbc8f>] note_interrupt+0x20f/0x260
 [<ffffffff810d8ff5>] handle_irq_event_percpu+0x45/0x60
 [<ffffffff810d903c>] handle_irq_event+0x2c/0x50
 [<ffffffff810dc9ba>] handle_fasteoi_irq+0x8a/0x150
 [<ffffffff8102ee0b>] handle_irq+0xab/0x130
 [<ffffffff8109de6a>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8103631d>] ? __exit_idle+0x2d/0x30
 [<ffffffff817158ad>] do_IRQ+0x4d/0xd0
 [<ffffffff81713902>] common_interrupt+0x82/0x82
 <EOI>  [<ffffffff815e34d1>] ? cpuidle_enter_state+0xc1/0x280
 [<ffffffff815e34c4>] ? cpuidle_enter_state+0xb4/0x280
 [<ffffffff815e36c7>] cpuidle_enter+0x17/0x20
 [<ffffffff810c1813>] call_cpuidle+0x23/0x40
 [<ffffffff810c1a79>] cpu_startup_entry+0x149/0x240
 [<ffffffff81705cd7>] rest_init+0x77/0x80
 [<ffffffff81d8a147>] start_kernel+0x495/0x4a2
 [<ffffffff81d89aa0>] ? set_init_arg+0x55/0x55
 [<ffffffff81d89120>] ? early_idt_handler_array+0x120/0x120
 [<ffffffff81d895d6>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d89715>] x86_64_start_kernel+0x13d/0x14c
handlers:
[<ffffffff8155ace0>] usb_hcd_irq
Disabling IRQ #16

P.
Comment 2 Prarit Bhargava 2016-11-09 19:39:30 UTC
Bjorn, I have attached a full boot file from an affected system as part of the bugzilla Description.  If this is not sufficient to debug, please let me know.

P.
Comment 3 Prarit Bhargava 2016-11-09 19:46:05 UTC
Some debug information:

The issue is that pci_device_shutdown() is called on each PCI device, and does

        if (drv && drv->shutdown)
                drv->shutdown(pci_dev);
        pci_msi_shutdown(pci_dev);
        pci_msix_shutdown(pci_dev);

The pci_msi_shutdown() and pci_msix_shutdown() functions both call
pci_intx_for_msi() which enables the INTx interrupt asynchronously of the
driver.

The problem is that the driver may not have a shutdown function and the
device remains active.  The driver continues to operate the PCI device and the
device interrupts to generate INTx.  The driver, however, has not registered a
handler for INTx and the interrupt line remains set which leads to an unhandled
IRQ warning.

This can be debugged by adding a 3 second delay in pci_device_shutdown()

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1ccce1cd6aca..74776ca35e0c 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/suspend.h>
 #include <linux/kexec.h>
+#include <linux/delay.h>
 #include "pci.h"
 
 struct pci_dynid {
@@ -461,10 +462,19 @@ static void pci_device_shutdown(struct device *dev)
 
 	pm_runtime_resume(dev);
 
+	/*
+	 * Mika, on the 4S the device that had the problem was the
+	 * GT218 [GeForce 210] Video card running the nouveau driver.
+	 * The nouveau driver does not use INTX at all IIRC.
+	 */
+	printk(KERN_EMERG "Calling shutdown for device %s\n", dev_name(dev));
 	if (drv && drv->shutdown)
 		drv->shutdown(pci_dev);
+
 	pci_msi_shutdown(pci_dev);
 	pci_msix_shutdown(pci_dev);
+	mdelay(3000); /* wait for unhandled irq */
+	printk(KERN_EMERG "Completed shutdown for device %s\n", dev_name(dev));
 
 	/*
 	 * If this is a kexec reboot, turn off Bus Master bit on the


and is highly reproducible on systems that have NVIDIA cards in them.

P.
Comment 4 Rui Wang 2016-11-30 13:49:40 UTC
Enabling INTx may be needed for every PCI device because system shutdown is passing control to unknown OS or firmware. Here's a patch trying to fix it from a different perspective.

From 8f6d102e0a1ebd906f113532ec805fa395d12a3e Mon Sep 17 00:00:00 2001
From: Rui Wang <rui.y.wang@intel.com>
Date: Fri, 25 Nov 2016 03:26:07 -0500
Subject: [PATCH] pci/msi: Setup INTx before pci_intx_for_msi()

Migrate MSI actions to INTx if the driver doesn't have shutdown()
to quiesce the hardware, in order to prevent IRQ storm during system
shutdown.

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
---
 drivers/pci/msi.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 98f1222..1be5645 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -886,6 +886,39 @@ int pci_msi_vec_count(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pci_msi_vec_count);

+void pci_msi_to_intx(struct pci_dev *dev)
+{
+       struct irq_desc *desc;
+       struct msi_desc *entry;
+       struct irqaction *act;
+       unsigned int intx_irq;
+       struct pci_driver *drv = dev->driver;
+
+       /*
+        * If it has shutdown() it's supposed to quiesce the hw.
+        */
+       if (drv && drv->shutdown)
+               return;
+
+       /* Only the first entry since the device can only assert 1 INTx */
+       entry = first_pci_msi_entry(dev);
+       if (!entry)
+               return;
+       intx_irq = entry->msi_attrib.default_irq;
+       if (entry->irq) {
+               desc = irq_to_desc(entry->irq);
+               act = desc->action;
+               while (act) {
+                       irq_set_affinity_hint(entry->irq, NULL);
+                       /* Remove & reuse the action without freeing it */
+                       remove_irq(entry->irq, act);
+                       act->flags |= IRQF_SHARED;
+                       setup_irq(intx_irq, act);
+                       act = act->next;
+               }
+       }
+}
+
 void pci_msi_shutdown(struct pci_dev *dev)
 {
        struct msi_desc *desc;
@@ -898,6 +931,7 @@ void pci_msi_shutdown(struct pci_dev *dev)
        desc = first_pci_msi_entry(dev);

        pci_msi_set_enable(dev, 0);
+       pci_msi_to_intx(dev);
        pci_intx_for_msi(dev, 1);
        dev->msi_enabled = 0;

@@ -1006,6 +1040,7 @@ void pci_msix_shutdown(struct pci_dev *dev)
        }

        pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+       pci_msi_to_intx(dev);
        pci_intx_for_msi(dev, 1);
        dev->msix_enabled = 0;
        pcibios_alloc_irq(dev);
--
1.7.5.4
Comment 5 Prarit Bhargava 2016-11-30 13:59:48 UTC
(In reply to Rui Wang from comment #4)
> Enabling INTx may be needed for every PCI device because system shutdown is
> passing control to unknown OS or firmware. Here's a patch trying to fix it
> from a different perspective.
> 

Thanks Rui.  A few questions.

1.  Do you know of any systems that have problems like this?  My argument is that prior to d52877c7b1af ("pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2") there were no reports of faulty OS or firmware switching that I can find.

2.  Shouldn't the reset (at least on systems within the past 10 years) cover this situation?

P.
Comment 6 Rui Wang 2016-11-30 15:06:57 UTC
(In reply to Prarit Bhargava from comment #5)
> 1.  Do you know of any systems that have problems like this?  My argument is
> that prior to d52877c7b1af ("pci/irq: let pci_device_shutdown to call
> pci_msi_shutdown v2") there were no reports of faulty OS or firmware
> switching that I can find.
> 

No I don't. I was just theorizing.

> 2.  Shouldn't the reset (at least on systems within the past 10 years) cover
> this situation?
> 

kexec doesn't do reset. As for firmware, I think the firmware can choose not to do a reset immediately, but to continue executing for a while. e.g. sense the temperature to decide if the fan needs to continue turning. And it is not limited to PC but can be anything (automobile, drone...)

Note You need to log in before you can comment on or make changes to this bug.