Bug 25392
Summary: | scsi_eh_11 CPU loop | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Martin Steigerwald (Martin) |
Component: | Serial ATA | Assignee: | Jeff Garzik (jgarzik) |
Status: | CLOSED UNREPRODUCIBLE | ||
Severity: | normal | CC: | florian, maciej.rutecki, rjw, tj |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.37-rc6-tp42-00009-gb3444d1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 21782 | ||
Attachments: | script to poll for stacktraces |
Description
Martin Steigerwald
2010-12-21 10:24:06 UTC
Answer by Tejun: Can you please apply the following patch, trigger the problem and attach the kernel log? diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 5e59050..c748b5a 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -888,12 +888,19 @@ void ata_eh_fastdrain_timerfn(unsigned long arg) */ static void ata_eh_set_pending(struct ata_port *ap, int fastdrain) { + static int xxx; int cnt; /* already scheduled? */ if (ap->pflags & ATA_PFLAG_EH_PENDING) return; + if (!(ap->pflags & ATA_PFLAG_LOADING) && xxx < 16) { + ata_port_printk(ap, KERN_WARNING, "XXX ata_eh_set_pending()\n"); + dump_stack(); + xxx++; + } + ap->pflags |= ATA_PFLAG_EH_PENDING; if (!fastdrain) I will do so with my next kernel compile - likely rc7 if it comes out before Christmas. As I do not know what exactly triggers it, it can take a while till I experience it again. I hope to be able to trigger it by inserting and removing the PCMCIA eSATA adapter lots of times. As the bug happened, I forgot to vgchange -an the LVM on the external drive. That might be or might not be related to triggering the issue. I will try this as well. Might make sense to switch this bugzilla report to mail. Or use the bugzilla. Or just use the bugzilla for tracking and continue using mail. I'll do whatever suits you best (but prefer mail). *** Bug 25602 has been marked as a duplicate of this bug. *** I reproduced this issue via: - Insert eSATA disk - vgscan - vgchange -ay - Remove eSATA disk - vgscan Without the vgscan after removing the disk it didn't show up. But I do not get any hints in dmesg: shambhala:~> dmesg | tail -52 pcmcia_socket pcmcia_socket0: pccard: CardBus card inserted into slot 0 pci 0000:03:00.0: [1095:3512] type 0 class 0x000180 pci 0000:03:00.0: reg 10: [io 0x0000-0x0007] pci 0000:03:00.0: reg 14: [io 0x0000-0x0003] pci 0000:03:00.0: reg 18: [io 0x0000-0x0007] pci 0000:03:00.0: reg 1c: [io 0x0000-0x0003] pci 0000:03:00.0: reg 20: [io 0x0000-0x000f] pci 0000:03:00.0: reg 24: [mem 0x00000000-0x000001ff] pci 0000:03:00.0: reg 30: [mem 0x00000000-0x0007ffff pref] pci 0000:03:00.0: supports D1 D2 pci 0000:03:00.0: BAR 6: assigned [mem 0xe8000000-0xe807ffff pref] pci 0000:03:00.0: BAR 5: assigned [mem 0xc4000000-0xc40001ff] pci 0000:03:00.0: BAR 5: set to [mem 0xc4000000-0xc40001ff] (PCI address [0xc4000000-0xc40001ff]) pci 0000:03:00.0: BAR 4: assigned [io 0x4000-0x400f] pci 0000:03:00.0: BAR 4: set to [io 0x4000-0x400f] (PCI address [0x4000-0x400f]) pci 0000:03:00.0: BAR 0: assigned [io 0x4010-0x4017] pci 0000:03:00.0: BAR 0: set to [io 0x4010-0x4017] (PCI address [0x4010-0x4017]) pci 0000:03:00.0: BAR 2: assigned [io 0x4018-0x401f] pci 0000:03:00.0: BAR 2: set to [io 0x4018-0x401f] (PCI address [0x4018-0x401f]) pci 0000:03:00.0: BAR 1: assigned [io 0x4020-0x4023] pci 0000:03:00.0: BAR 1: set to [io 0x4020-0x4023] (PCI address [0x4020-0x4023]) pci 0000:03:00.0: BAR 3: assigned [io 0x4024-0x4027] pci 0000:03:00.0: BAR 3: set to [io 0x4024-0x4027] (PCI address [0x4024-0x4027]) sata_sil 0000:03:00.0: enabling device (0000 -> 0003) sata_sil 0000:03:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11 sata_sil 0000:03:00.0: Applying R_ERR on DMA activate FIS errata fix sata_sil 0000:03:00.0: setting latency timer to 64 scsi4 : sata_sil scsi5 : sata_sil ata5: SATA max UDMA/100 mmio m512@0xc4000000 tf 0xc4000080 irq 11 ata6: SATA max UDMA/100 mmio m512@0xc4000000 tf 0xc40000c0 irq 11 ata5: SATA link down (SStatus 0 SControl 310) ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310) ata6.00: ATA-8: Hitachi HTS545050B9A300, PB4OC60G, max UDMA/133 ata6.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32) ata6.00: configured for UDMA/100 scsi 5:0:0:0: Direct-Access ATA Hitachi HTS54505 PB4O PQ: 0 ANSI: 5 sd 5:0:0:0: Attached scsi generic sg2 type 0 sd 5:0:0:0: [sdc] 976773168 512-byte logical blocks: (500 GB/465 GiB) sd 5:0:0:0: [sdc] Write Protect is off sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdc: sdc1 sdc3 sd 5:0:0:0: [sdc] Attached SCSI disk pcmcia_socket pcmcia_socket0: pccard: card ejected from slot 0 ata6.00: disabled sd 5:0:0:0: [sdc] Synchronizing SCSI cache sd 5:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 sd 5:0:0:0: [sdc] Stopping disk sd 5:0:0:0: [sdc] START_STOP FAILED sd 5:0:0:0: [sdc] Result: hostbyte=0x04 driverbyte=0x00 sata_sil 0000:03:00.0: PCI INT A disabled Even though I have the debug patch applied: diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 5e59050..c748b5a 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -888,12 +888,19 @@ void ata_eh_fastdrain_timerfn(unsigned long arg) */ static void ata_eh_set_pending(struct ata_port *ap, int fastdrain) { + static int xxx; int cnt; /* already scheduled? */ if (ap->pflags & ATA_PFLAG_EH_PENDING) return; + if (!(ap->pflags & ATA_PFLAG_LOADING) && xxx < 16) { + ata_port_printk(ap, KERN_WARNING, "XXX ata_eh_set_pending()\n"); + dump_stack(); + xxx++; + } + ap->pflags |= ATA_PFLAG_EH_PENDING; if (!fastdrain) This is with: martin@shambhala:~> cat /proc/version Linux version 2.6.37-rc7-tp42-ata-eh-dbg-dirty (martin@shambhala) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 PREEMPT Wed Dec 22 11:41:20 CET 2010 Thats just 2.6.37-rc7 + the debug patch. Hmmm... weird. While the scsi_eh is looping, can you please do "cat /proc/PID_OF_THE_TASK/stack" several times and see where it's spending its cpu cycles? Thanks. Created attachment 43452 [details]
script to poll for stacktraces
Script to poll for stacktraces.
Was not able to reproduce this with 2.6.37-rc8 - probably issue has been fixed in the meanwhile. Thus closing as fixed for now. Thanks. Anyway I posted a script to poll for stacktraces, should anyone should have this issue again. I changed the resolution to UNREPRODUCIBLE, which is (nowadays) the default resolution if no specific commit is known that fixes it. |