Bug 11735
Summary: | delkin_cb fails after resume | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | bumble.bee |
Component: | IDE | Assignee: | Bartlomiej Zolnierkiewicz (bzolnier) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | alan |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.27-rc9 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
add PM support to delkin_cb
add PM support to delkin_cb (with minor fix) |
Description
bumble.bee
2008-10-10 13:45:55 UTC
Well delkin_cb doesn't really support suspend/resume so no suprise whatsoever. pata_ninja32 does so it might be more useful to investigate why the delkin driver is slower for you. < 1MB/sec sounds like the Delkin is selecting PIO0 not PIO4 Hi Alan, I'd love to do just that. Maybe if this thing can perform in PIO mode it won't matter that there's no DMA. I've built a kernel and initrd for both delkin_cb and pata_ninja32. # hdparm -t /dev/hde /dev/hde: Timing buffered disk reads: 10 MB in 3.35 seconds = 2.98 MB/sec # hdparm -t /dev/sdb /dev/sdb: Timing buffered disk reads: 6 MB in 3.27 seconds = 1.84 MB/sec The dmesg seems to indicate PIO4: pccard: CardBus card inserted into slot 0 PCI: 0000:02:00.0 reg 10 io port: [0, 1f] PCI: 0000:02:00.0 reg 14 32bit mmio: [0, fff] pata_ninja32 0000:02:00.0: enabling device (0000 -> 0003) pata_ninja32 0000:02:00.0: PCI INT A -> Link[LNKD] -> GSI 11 (level, low) -> IRQ 11 pata_ninja32 0000:02:00.0: setting latency timer to 64 scsi2 : pata_ninja32 ata3: PATA max PIO4 irq 11 ata3.00: ATA-0: TRANSCEND, 20070831, max UDMA/66 ata3.00: 31326208 sectors, multi 0: LBA ata3.00: configured for PIO4 ata3.00: configured for PIO4 ata3: EH complete scsi 2:0:0:0: Direct-Access ATA TRANSCEND 2007 PQ: 0 ANSI: 5 I'm afraid pata_ninja32 doesn't do much better with suspend. After resuming, top shows me kacpi_notify using about 50% cpu, kacpid using ~42%, scsi_eh_1 about 0.3% and the rest is in "wa" state. Reads from the disk still don't return. What is logged in "dmesg" when this is occuring ? The kacpi usage also seems bizarrely high but I assume you don't see that with then card removed when resuming ? To clarify, in addition to the CF card this laptop has a "regular" HDD. I had switched to an all-libata setup for both the HDD and the ninja32, which is when the kacpid cpu hog shows up. I've now also built a kernel with the old ata driver for the HDD and only the ninja32 in libata. That shows the same behavior as the delkin_cb driver (100% wait state on resume). With that kernel, I've managed to capture this in the syslog after resume: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen ala1.00: cmd 30/00:30:90:12:ea/00:00:00:00:00/e0 tag 0 pio 24576 out res 40/00:00:00:00:00/00:00:00:00:00/e0 Emask 0x4 (timeout) ata1.00: status: { DRDY } ata1: soft resetting link ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) ata1.00: revalidation failed (errno=-5) ata1: soft resetting link ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) ata1.00: revalidation failed (errno=-5) ata1: soft resetting link ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80) ata1.00: revalidation failed (errno=-5) ata1.00: disabled ata1.00: EH complete sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 end_request: I/O error, dev sda, sector 15340176 sd 0:0:0:0: [sda] Result: hostbyte=0x04 driverbyte=0x00 end_request: I/O error, dev sda, sector 18719176 Buffer I/O error on device dm-1, logical block 2323832 lost page write due to I/O error on dm-1 It's dm-1 because the filesystem on the card is on an encrypted block device. After this error the system goes back to 100% idle, and if I try to access the filesystem I get I/O errors from ls. To answer your question, I've not had problems with resuming on this machine before (with only the HDD, using the old IDE driver). I can't really test resume without the card right now, since it holds the root filesystem, but I could dd it back to the HDD if that's necessary. Ok that is hopeful. It is getting stuck trying to transfer data and then fails to recover. I've got a general test patch about to appear for cleaning up those cases. Created attachment 18330 [details]
add PM support to delkin_cb
Please try this patch against current Linus' tree (should also apply 2.6.28-rc1 once it is released).
Created attachment 18331 [details]
add PM support to delkin_cb (with minor fix)
Holy smokes, I had no idea you guys served a la carte! I remeber seeing a mail from Mark Lord saying that he had no further interest in delkin_cb, thanks for maintaining it Bart. I applied your patch to Linus' git, it seems to resume just fine. It doesn't hang in a wait state, and I/O that I start before resuming completes after it. Seeing this in dmesg: Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: PCI INT A disabled ACPI handle has no context! Then, later: Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: enabling device (0000 -> 0003) Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: PCI INT A -> Link[LNKD] -> GSI 11 (level, low) -> IRQ 11 Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: restoring config space at offset 0xf (was 0x4040100, writing 0x404010b) Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: restoring config space at offset 0x5 (was 0x0, writing 0x90000000) Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: restoring config space at offset 0x4 (was 0x1, writing 0xe001) Delkin-ASKA-Workbit Cardbus IDE 0000:02:00.0: setting latency timer to 64 I'm awaiting Alan's general patch to see if it will help with libata. It'd be great if it could be made to perform at >PIO0 speeds (but surprisingly even as-is it's not that bad as a system drive). Thanks! Nice.. and I think that explains why the standard PM recovery in libata isn't enough so I'll just go fix up pata_ninja32 to match I'm closing this bug since fixes are Linus' tree now: commit 8c061a40c293660793dab83b75223e6eaaa04f8b (delkin_cb) commit e7c0d217cdaa837d30bc265eddac4d176969fd68 (pata_ninja32) |