Bug 12621

Summary: Resume broken on iBook
Product: Power Management Reporter: Andreas Schwab (schwab)
Component: Hibernation/SuspendAssignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX    
Severity: normal CC: benh, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.29-rc3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 12398    
Attachments: Put devices into D0 before restoring their state
Workaround for devices without the PM capability
lspci -vv
Read state from the device after trying to change it on early resume
Change current_state to unknown in PMAC resume
Sanitize radeonfb suspend/resume

Description Andreas Schwab 2009-02-01 16:54:50 UTC
Latest working kernel version: 2.6.29-rc2
Earliest failing kernel version: 2.6.29-rc3
Distribution:
Hardware Environment: iBook G4
Software Environment:
Problem Description:
Crash during resume from suspend to ram.

Caused by:
commit aa8c6c93747f7b55fa11e1624fec8ca33763a805
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Fri Jan 16 21:54:43 2009 +0100

    PCI PM: Restore standard config registers of all devices early
    
    There is a problem in our handling of suspend-resume of PCI devices that
    many of them have their standard config registers restored with
    interrupts enabled and they are put into the full power state with
    interrupts enabled as well.  This may lead to the following scenario:
      * an interrupt vector is shared between two or more devices
      * one device is resumed earlier and generates an interrupt
      * the interrupt handler of another device tries to handle it and
        attempts to access the device the config space of which hasn't been
        restored yet and/or which still is in a low power state
      * the system crashes as a result
    
    To prevent this from happening we should restore the standard
    configuration registers of all devices with interrupts disabled and we
    should put them into the D0 power state right after that.
    Unfortunately, this cannot be done using the existing
    pci_set_power_state(), because it can sleep.  Also, to do it we have to
    make sure that the config spaces of all devices were actually saved
    during suspend.
Comment 1 Rafael J. Wysocki 2009-02-02 03:59:44 UTC
Created attachment 20076 [details]
Put devices into D0 before restoring their state

Can you try this patch,  please?
Comment 2 Andreas Schwab 2009-02-02 11:37:11 UTC
That does not help.
Comment 3 Rafael J. Wysocki 2009-02-02 13:25:22 UTC
Created attachment 20078 [details]
Workaround for devices without the PM capability

Thanks for testing.

Please also try this one (on top of the previous one).
Comment 4 Rafael J. Wysocki 2009-02-02 13:26:29 UTC
First-Bad-Commit: aa8c6c93747f7b55fa11e1624fec8ca33763a805
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Comment 5 Andreas Schwab 2009-02-02 13:59:40 UTC
That didn't help either.
Comment 6 Rafael J. Wysocki 2009-02-02 14:15:40 UTC
Please attach the output of 'lspci -vv'.
Comment 7 Andreas Schwab 2009-02-02 14:35:15 UTC
When suspending with no_console_suspend init=/bin/sh I see Badness at drivers/pci/pci-driver.c:368 (and in this state resume actually works even in vanilla rc3, but the badness only happens with the patches applied).
Comment 8 Andreas Schwab 2009-02-02 14:36:28 UTC
Created attachment 20079 [details]
lspci -vv
Comment 9 Rafael J. Wysocki 2009-02-02 15:06:15 UTC
(In reply to comment #7)
> When suspending with no_console_suspend init=/bin/sh I see Badness at
> drivers/pci/pci-driver.c:368

This only is a warning, no need to worry about it.  You can just comment out lines 368 and 369 in drivers/pci/pci-driver.c and it should go away.

> (and in this state resume actually works even in
> vanilla rc3, but the badness only happens with the patches applied).

Ah, good.  One of device drivers apparently breaks things, I wonder which one.
Comment 10 Rafael J. Wysocki 2009-02-02 15:13:10 UTC
Which drivers are loaded when you boot with init=/bin/sh ?
Comment 11 Rafael J. Wysocki 2009-02-02 15:24:50 UTC
IOW, please attach a dmesg log containing a successful suspend-resume cycle with init=/bin/sh, if possible.
Comment 12 Rafael J. Wysocki 2009-02-02 15:39:08 UTC
Created attachment 20080 [details]
Read state from the device after trying to change it on early resume

Linus thinks something like this may help, so please also try it.
Comment 13 Rafael J. Wysocki 2009-02-02 15:47:20 UTC
The last patch is on top of the one from comment #1.
Comment 14 Rafael J. Wysocki 2009-02-02 15:53:31 UTC
Created attachment 20081 [details]
Change current_state to unknown in PMAC resume

If the patch from comment #12 doesn't help, please also try this one.
Comment 15 Rafael J. Wysocki 2009-02-02 15:54:21 UTC
(On top of the patch from comment #1).
Comment 16 Benjamin Herrenschmidt 2009-02-02 15:55:52 UTC
The root problem is radeonfb.

It has its own config space restore code (yuck, but heh, that predates all the
PCI core stuff) -and- it makes the decision as to whether it must re-POST the
card on not ... based on the BAR values having been lost...

With your patch, the BAR values are restored, so radeonfb things the card
doesn't need re-POSTing and doesn't try.

This is fishy, I'll try to come up with something a bit more sane. I'm not sure
though what is the best way to "detect" that the card is non posted, I'll
figure something.

(Note, email temporarily down, so I can't reply on the lkml thread right away)
Comment 17 Benjamin Herrenschmidt 2009-02-02 15:58:00 UTC
Actually, Andreas, did you disable the early resume code in there ? With the early resume, radeonfb should have kicked in before the PCI restore code and thus should have somewhat worked anyway... unless pci_enable_device() -also- causes the BARs to be restored.

Now, the early resume hack has some other issues that I mentioned in the email thread on the list, that will need to be sorted out but it seems to be working enough on the albook here.
Comment 18 Rafael J. Wysocki 2009-02-02 16:06:20 UTC
(In reply to comment #17)
> Actually, Andreas, did you disable the early resume code in there ? With the
> early resume, radeonfb should have kicked in before the PCI restore code and
> thus should have somewhat worked anyway...

I don't really think it would.  The PCI restore code goes before the driver's resume_early(), actually.  The rationale is that the PCI restore should not change whatever the driver might have done in resume_early().

> unless pci_enable_device() -also- causes the BARs to be restored.

No, it doesn't.

> Now, the early resume hack has some other issues that I mentioned in the
> email
> thread on the list, that will need to be sorted out but it seems to be
> working
> enough on the albook here.

What about reposting the adapter unconditionally on resume?  Would that break?
Comment 19 Benjamin Herrenschmidt 2009-02-02 17:07:36 UTC
It might... we don't know how to repost all of them, but that's not necessary the problem, it's also that the repost code was reverse engineered and may not actually work if the chip is still "hot". In fact I know at least of a couple of things that will probably burst, such as the initialization of the chip's memory controller.
Comment 20 Benjamin Herrenschmidt 2009-02-02 17:08:21 UTC
Created attachment 20085 [details]
Sanitize radeonfb suspend/resume

Andreas, can you test that ? I need to test it on older machines too see if it works with the ones using D2 state, and some of the thinkpads as well...
Comment 21 Rafael J. Wysocki 2009-02-03 00:15:50 UTC
(In reply to comment #19)
> It might... we don't know how to repost all of them, but that's not necessary
> the problem, it's also that the repost code was reverse engineered and may
> not
> actually work if the chip is still "hot". In fact I know at least of a couple
> of things that will probably burst, such as the initialization of the chip's
> memory controller.

Hmm.  So perhaps we need a "do not restore" flag?  Not nice, but would do the trick.
Comment 22 Benjamin Herrenschmidt 2009-02-03 01:00:51 UTC
Let's see first. I've used a different method to detect if the chip needs to be POSTed in the patch, comparing the values of some important PLL registers with the saved values. I need to test on some older machines that don't power off the chip to see if that works fine, and if it does, we can just merge that.
Comment 23 Andreas Schwab 2009-02-03 11:33:16 UTC
Ben's patch works fine here.
Comment 24 Rafael J. Wysocki 2009-02-03 13:24:06 UTC
Great!  Thanks for testing.

Handled-By : Benjamin Herrenschmidt <benh@kernel.crashing.org>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=20085&action=view