Bug 14730

Summary: sky2 won't work after suspend/resume cycle
Product: Drivers Reporter: Maciej J. Woloszyk (mat)
Component: PCI Assignee: drivers_pci (drivers_pci)
Status: CLOSED CODE_FIX    
Severity: high CC: rjw, vyacheslavovich
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31-rc8 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 13615    
Attachments: dmesg in 2.6.31-rc3 after full suspend/resume cycle
dmesg in 2.6.31 after full suspend/resume cycle
lspci in 2.6.31-rc3 before suspend/resume
lspci in 2.6.31-rc3 after suspend/resume
lspci in 2.6.31 before suspend/resume
lspci in 2.6.31 after suspend/resume
lspci -vv before suspend
lspci -vv after resume
sky2: Force restoration of PCI configuration registers
dmesg after full suspend/resume cycle with comment #26 patch
sky2: Move PCI resume bits to the early phase
.config file for 2.6.32
dmesg after full suspend/resume cycle with comment #32 patch
sky2: Move PCI resume bits to the early phase, add sleep
dmesg after full suspend/resume cycle with comment #35 patch
PCI / PM: Use per-device D3 delay
dmesg after full suspend/resume cycle with comment #39 patch
PCI / PM: Use per-device D3 delays
sky2: Allow PCI PM core to handle PCI-specific parts of suspend/resume

Description Maciej J. Woloszyk 2009-12-04 09:02:08 UTC
I use a sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It worked fine up to 2.6.31-rc3 (or at least that's the last version I tried that works for me), and in every later version I tried it stops working after a suspend/resume cycle (I use the powersave daemon, but I also tried without it; the effect is the same).

After resume what I see in dmesg is this:

sky2 driver version 1.23
sky2 0000:04:00.0: enabling device (0000 -> 0003)
sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
sky2 0000:04:00.0: setting latency timer to 64
sky2 0000:04:00.0: unsupported chip type 0xff
sky2 0000:04:00.0: PCI INT A disabled
sky2: probe of 0000:04:00.0 failed with error -95

So it looks like something is probably not restored correctly after resume or some initializations are not executed.

I use Gentoo sources, but I also tried it on vanilla kernels - 2.6.31 and 2.6.32-rc8 and it doesn't work in any version past 2.6.31-rc3.
Comment 1 Maciej J. Woloszyk 2009-12-04 09:06:32 UTC
Created attachment 24009 [details]
dmesg in 2.6.31-rc3 after full suspend/resume cycle
Comment 2 Maciej J. Woloszyk 2009-12-04 09:09:19 UTC
Created attachment 24010 [details]
dmesg in 2.6.31 after full suspend/resume cycle
Comment 3 Maciej J. Woloszyk 2009-12-04 09:10:08 UTC
Created attachment 24011 [details]
lspci in 2.6.31-rc3 before suspend/resume
Comment 4 Maciej J. Woloszyk 2009-12-04 09:10:51 UTC
Created attachment 24012 [details]
lspci in 2.6.31-rc3 after suspend/resume
Comment 5 Maciej J. Woloszyk 2009-12-04 09:11:28 UTC
Created attachment 24013 [details]
lspci in 2.6.31 before suspend/resume
Comment 6 Maciej J. Woloszyk 2009-12-04 09:12:12 UTC
Created attachment 24014 [details]
lspci in 2.6.31 after suspend/resume
Comment 7 Andrew Morton 2009-12-07 19:51:32 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 4 Dec 2009 09:02:21 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
>            Summary: sky2 won't work after suspend/resume cycle
>            Product: Power Management
>            Version: 2.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Hibernation/Suspend
>         AssignedTo: power-management_other@kernel-bugs.osdl.org
>         ReportedBy: mat@esi.com.pl
>         Regression: Yes
> 
> 
> I use sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It
> worked fine up until 2.6.31-rc3 (or at least that's the last one I tried it
> works for me) and in all versions after that I tried it stops working after
> suspend/resume cycle (I use powersave daemon, but I also tried without it -
> effect is the same). 
> 
> After resume what I see in dmesg is this:
> 
> sky2 driver version 1.23
> sky2 0000:04:00.0: enabling device (0000 -> 0003)
> sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> sky2 0000:04:00.0: setting latency timer to 64
> sky2 0000:04:00.0: unsupported chip type 0xff

It looks like reads from the device are returning all-ones.  

> sky2 0000:04:00.0: PCI INT A disabled
> sky2: probe of 0000:04:00.0 failed with error -95
> 
> So it looks like something is probably not restored correctly after resume or
> some initializations are not executed.
> 
> I use Gentoo sources, but I also tried it on vanilla kernels - 2.6.31 and
> 2.6.32-rc8 and it doesn't work in any version past 2.6.31-rc3.
> 

Help.  Do we think this regression is likely to be a sky2 thing, an
ACPI thing, an x86 arch thing or...?

Thanks.
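
For context, the "all-ones" observation can be illustrated with a small check. This is only a sketch: the helper name `is_all_ones` is hypothetical and is not part of sky2 or the kernel. It shows why a reported chip type of 0xff is consistent with a register read in which every bit is set, which is what reads typically return when the device does not respond.

```python
def is_all_ones(value: int, width_bits: int) -> bool:
    """Return True if every bit of a width_bits-wide register read is set."""
    mask = (1 << width_bits) - 1
    return (value & mask) == mask


# dmesg reports "unsupported chip type 0xff": an 8-bit value with all bits set.
reported_chip_type = 0xFF
print(is_all_ones(reported_chip_type, 8))
```

A value such as 0x7f would not match, so the check separates a plausible "no response" pattern from an ordinary unknown chip ID. It does not by itself say why the device failed to respond.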
Comment 8 Stephen Hemminger 2009-12-07 20:00:29 UTC
On Mon, 7 Dec 2009 11:50:57 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Fri, 4 Dec 2009 09:02:21 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > 
> >            Summary: sky2 won't work after suspend/resume cycle
> >            Product: Power Management
> >            Version: 2.5
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Hibernation/Suspend
> >         AssignedTo: power-management_other@kernel-bugs.osdl.org
> >         ReportedBy: mat@esi.com.pl
> >         Regression: Yes
> > 
> > 
> > I use sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It
> > worked fine up until 2.6.31-rc3 (or at least that's the last one I tried it
> > works for me) and in all versions after that I tried it stops working after
> > suspend/resume cycle (I use powersave daemon, but I also tried without it -
> > effect is the same). 
> > 
> > After resume what I see in dmesg is this:
> > 
> > sky2 driver version 1.23
> > sky2 0000:04:00.0: enabling device (0000 -> 0003)
> > sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> > sky2 0000:04:00.0: setting latency timer to 64
> > sky2 0000:04:00.0: unsupported chip type 0xff
> 
> It looks like reads from the device are returning all-ones.  
>

That means something in PM didn't turn on the bus, so driver is out of
luck.  Most of these problems have been traced back to generic PCI
power management, nothing in driver runs before this.
Comment 9 Rafael J. Wysocki 2009-12-07 20:23:20 UTC
On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
> 
> 
> 
> 
> --- Comment #7 from Andrew Morton <akpm@linux-foundation.org>  2009-12-07
> 19:51:32 ---
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Fri, 4 Dec 2009 09:02:21 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > 
> >            Summary: sky2 won't work after suspend/resume cycle
> >            Product: Power Management
> >            Version: 2.5
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Hibernation/Suspend
> >         AssignedTo: power-management_other@kernel-bugs.osdl.org
> >         ReportedBy: mat@esi.com.pl
> >         Regression: Yes
> > 
> > 
> > I use sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It
> > worked fine up until 2.6.31-rc3 (or at least that's the last one I tried it
> > works for me) and in all versions after that I tried it stops working after
> > suspend/resume cycle (I use powersave daemon, but I also tried without it -
> > effect is the same). 
> > 
> > After resume what I see in dmesg is this:
> > 
> > sky2 driver version 1.23
> > sky2 0000:04:00.0: enabling device (0000 -> 0003)
> > sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> > sky2 0000:04:00.0: setting latency timer to 64
> > sky2 0000:04:00.0: unsupported chip type 0xff
> 
> It looks like reads from the device are returning all-ones.  
> 
> > sky2 0000:04:00.0: PCI INT A disabled
> > sky2: probe of 0000:04:00.0 failed with error -95
> > 
> > > So it looks like something is probably not restored correctly after resume or
> > > some initializations are not executed.
> > 
> > I use Gentoo sources, but I also tried it on vanilla kernels - 2.6.31 and
> > 2.6.32-rc8 and it doesn't work in any version past 2.6.31-rc3.
> > 
> 
> Help.  Do we think this regression is likely to be a sky2 thing, an
> ACPI thing, an x86 arch thing or...?

Quite frankly, I have no idea.

In principle that may be a result of a PCI PM core change, but nothing
comes to mind immediately.

It may be worth trying to revert 

commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Wed Sep 9 23:49:59 2009 +0200

    PCI: Clear saved_state after the state has been restored

and retesting.
Comment 10 Andrew Morton 2009-12-07 20:23:38 UTC
(cc linux-pci and Rafael)

On Mon, 7 Dec 2009 11:59:57 -0800
Stephen Hemminger <shemminger@linux-foundation.org> wrote:

> On Mon, 7 Dec 2009 11:50:57 -0800
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > 
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > On Fri, 4 Dec 2009 09:02:21 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > > 
> > >            Summary: sky2 won't work after suspend/resume cycle
> > >            Product: Power Management
> > >            Version: 2.5
> > >           Platform: All
> > >         OS/Version: Linux
> > >               Tree: Mainline
> > >             Status: NEW
> > >           Severity: high
> > >           Priority: P1
> > >          Component: Hibernation/Suspend
> > >         AssignedTo: power-management_other@kernel-bugs.osdl.org
> > >         ReportedBy: mat@esi.com.pl
> > >         Regression: Yes
> > > 
> > > 
> > > I use sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It
> > > > worked fine up until 2.6.31-rc3 (or at least that's the last one I tried it
> > > > works for me) and in all versions after that I tried it stops working after
> > > > suspend/resume cycle (I use powersave daemon, but I also tried without it -
> > > > effect is the same).
> > > 
> > > After resume what I see in dmesg is this:
> > > 
> > > sky2 driver version 1.23
> > > sky2 0000:04:00.0: enabling device (0000 -> 0003)
> > > sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> > > sky2 0000:04:00.0: setting latency timer to 64
> > > sky2 0000:04:00.0: unsupported chip type 0xff
> > 
> > It looks like reads from the device are returning all-ones.  
> >
> 
> That means something in PM didn't turn on the bus, so driver is out of
> luck.  Most of these problems have been traced back to generic PCI
> power management, nothing in driver runs before this.

OK, thanks - I cast the Cc net a bit wider.  Hopefully a suitable fish
will swim into it.
Comment 11 Ilya Hegai 2009-12-10 10:36:36 UTC
(In reply to comment #9)
> On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > 
> > 
> > 
> > 
> > 
> > --- Comment #7 from Andrew Morton <akpm@linux-foundation.org>  2009-12-07 19:51:32 ---
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > On Fri, 4 Dec 2009 09:02:21 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > > 
> > >            Summary: sky2 won't work after suspend/resume cycle
> > >            Product: Power Management
> > >            Version: 2.5
> > >           Platform: All
> > >         OS/Version: Linux
> > >               Tree: Mainline
> > >             Status: NEW
> > >           Severity: high
> > >           Priority: P1
> > >          Component: Hibernation/Suspend
> > >         AssignedTo: power-management_other@kernel-bugs.osdl.org
> > >         ReportedBy: mat@esi.com.pl
> > >         Regression: Yes
> > > 
> > > 
> > > I use sky2 network adapter in my Fujitsu Siemens Esprimo Mobile U9200. It
> > > > worked fine up until 2.6.31-rc3 (or at least that's the last one I tried it
> > > > works for me) and in all versions after that I tried it stops working after
> > > > suspend/resume cycle (I use powersave daemon, but I also tried without it -
> > > > effect is the same).
> > > 
> > > After resume what I see in dmesg is this:
> > > 
> > > sky2 driver version 1.23
> > > sky2 0000:04:00.0: enabling device (0000 -> 0003)
> > > sky2 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> > > sky2 0000:04:00.0: setting latency timer to 64
> > > sky2 0000:04:00.0: unsupported chip type 0xff
> > 
> > It looks like reads from the device are returning all-ones.  
> > 
> > > sky2 0000:04:00.0: PCI INT A disabled
> > > sky2: probe of 0000:04:00.0 failed with error -95
> > > 
> > > > So it looks like something is probably not restored correctly after resume or
> > > > some initializations are not executed.
> > > 
> > > I use Gentoo sources, but I also tried it on vanilla kernels - 2.6.31 and
> > > 2.6.32-rc8 and it doesn't work in any version past 2.6.31-rc3.
> > > 
> > 
> > Help.  Do we think this regression is likely to be a sky2 thing, an
> > ACPI thing, an x86 arch thing or...?
> 
> Quite frankly, I have no idea.
> 
> In principle that may be a result of a PCI PM core change, but nothing
> comes to mind immediately.
> 
> It may be worth trying to revert 
> 
> commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> Author: Rafael J. Wysocki <rjw@sisk.pl>
> Date:   Wed Sep 9 23:49:59 2009 +0200
> 
>     PCI: Clear saved_state after the state has been restored
> 
> and retesting.

I'm not familiar with git, but I haven't found any trace of that commit in 2.6.31.7, which I'm using (gentoo-sources), and the problem exists there.
After I applied it (got it from git), the sky2 errors after hibernation/resume disappeared.
As far as I can see, this patch has been applied to 2.6.32, so I guess the original reporter should consider upgrading to the latest kernel.
Comment 12 Rafael J. Wysocki 2009-12-10 19:13:12 UTC
On Thursday 10 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
> --- Comment #11 from Ilya Hegai <vyacheslavovich@gmail.com>  2009-12-10
> 10:36:36 ---
> (In reply to comment #9)
> > On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
...
> > It may be worth trying to revert 
> > 
> > commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > Date:   Wed Sep 9 23:49:59 2009 +0200
> > 
> >     PCI: Clear saved_state after the state has been restored
> > 
> > and retesting.
> 
> I'm not familiar with git, but I haven't found traces of that commit in
> 2.6.31.7 that I'm using (gentoo-sources) (and the problem exists there)
> but after I applied it (got it from git) - sky2 errors after
> hibernation/resume
> dissapeared
> And as I can see this patch has applied against 2.6.32, so I guess TS person
> is
> considered to upgrade to the latest kernel

Hmm.  The original report is against 2.6.32-rc8.

So you're saying that the commit above, when applied against 2.6.31.7,
actually fixes a sky2 resume problem for you?

Maciej, can you please check if 2.6.31.7 with the above commit applied works
for you?
Comment 13 Maciej J. Woloszyk 2009-12-10 19:48:52 UTC
On Thursday 10 December 2009, Rafael J. Wysocki wrote:
> On Thursday 10 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> >
> > --- Comment #11 from Ilya Hegai <vyacheslavovich@gmail.com>  2009-12-10
> > 10:36:36 --- (In reply to comment #9)
> >
> > > On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
> ...
> 
> > > It may be worth trying to revert
> > >
> > > commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> > > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > > Date:   Wed Sep 9 23:49:59 2009 +0200
> > >
> > >     PCI: Clear saved_state after the state has been restored
> > >
> > > and retesting.
> >
> > I'm not familiar with git, but I haven't found traces of that commit in
> > 2.6.31.7 that I'm using (gentoo-sources) (and the problem exists there)
> > but after I applied it (got it from git) - sky2 errors after
> > hibernation/resume dissapeared
> > And as I can see this patch has applied against 2.6.32, so I guess TS
> > person is considered to upgrade to the latest kernel
> 
> Hmm.  The original report is against 2.6.32-rc8.
> 
> So you're saying that the commit above, when applied against 2.6.31.7,
> actually fixes a sky2 resume problem for you?
> 
> Maciej, can you please check if 2.6.31.7 with the above commit applied
>  works for you?
> 

Well... I've downloaded the patch from git. The working kernel, 2.6.31-rc3,
does not have it applied, but later versions do, so I tried to revert it on the
latest gentoo sources (2.6.32). It reverted fine, but it didn't really help:
after suspend/resume sky2 does not work. I'll try gentoo-sources-2.6.31-r7
right now and I'll let you know the results.

m.

-- 
Maciej J. Woloszyk, <mat@esi.com.pl>         "Rejoice. For Very Bad
tel. +48 501 033 410                    Things are about to happen."
JID:mat@esi.com.pl                 Richard/Looking For Group
Comment 14 Maciej J. Woloszyk 2009-12-10 20:15:20 UTC
On Thursday 10 December 2009, Rafael J. Wysocki wrote:
> On Thursday 10 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> >
> > --- Comment #11 from Ilya Hegai <vyacheslavovich@gmail.com>  2009-12-10
> > 10:36:36 --- (In reply to comment #9)
> >
> > > On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
> ...
> 
> > > It may be worth trying to revert
> > >
> > > commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> > > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > > Date:   Wed Sep 9 23:49:59 2009 +0200
> > >
> > >     PCI: Clear saved_state after the state has been restored
> > >
> > > and retesting.
> >
> > I'm not familiar with git, but I haven't found traces of that commit in
> > 2.6.31.7 that I'm using (gentoo-sources) (and the problem exists there)
> > but after I applied it (got it from git) - sky2 errors after
> > hibernation/resume dissapeared
> > And as I can see this patch has applied against 2.6.32, so I guess TS
> > person is considered to upgrade to the latest kernel
> 
> Hmm.  The original report is against 2.6.32-rc8.
> 
> So you're saying that the commit above, when applied against 2.6.31.7,
> actually fixes a sky2 resume problem for you?
> 
> Maciej, can you please check if 2.6.31.7 with the above commit applied
>  works for you?
> 

OK. I've just tried 2.6.31.7 with the patch applied and it didn't work. sky2
still doesn't work after resume.

m.


-- 
Maciej J. Woloszyk, <mat@esi.com.pl>         "Rejoice. For Very Bad
tel. +48 501 033 410                    Things are about to happen."
JID:mat@esi.com.pl                 Richard/Looking For Group
Comment 15 Rafael J. Wysocki 2009-12-10 20:40:03 UTC
On Thursday 10 December 2009, Maciej J. Woloszyk wrote:
> On Thursday 10 December 2009, Rafael J. Wysocki wrote:
> > On Thursday 10 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > >
> > > --- Comment #11 from Ilya Hegai <vyacheslavovich@gmail.com>  2009-12-10
> > > 10:36:36 --- (In reply to comment #9)
> > >
> > > > On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > 
> > ...
> > 
> > > > It may be worth trying to revert
> > > >
> > > > commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> > > > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > > > Date:   Wed Sep 9 23:49:59 2009 +0200
> > > >
> > > >     PCI: Clear saved_state after the state has been restored
> > > >
> > > > and retesting.
> > >
> > > I'm not familiar with git, but I haven't found traces of that commit in
> > > 2.6.31.7 that I'm using (gentoo-sources) (and the problem exists there)
> > > but after I applied it (got it from git) - sky2 errors after
> > > hibernation/resume dissapeared
> > > And as I can see this patch has applied against 2.6.32, so I guess TS
> > > person is considered to upgrade to the latest kernel
> > 
> > Hmm.  The original report is against 2.6.32-rc8.
> > 
> > So you're saying that the commit above, when applied against 2.6.31.7,
> > actually fixes a sky2 resume problem for you?
> > 
> > Maciej, can you please check if 2.6.31.7 with the above commit applied
> >  works for you?
> > 
> 
> Ok. I've just tried 2.6.31.7 with the patch applied - it didn't worked. sky2 
> still does'nt work after resume.

So the problem is clearly different for you.  It also means it's not universal
for all sky2s, so it may depend on whether the adapter is PCIe or something.

Can you please use 2.6.32 for further testing and do the following:

# echo code > /sys/power/pm_test
# echo mem > /sys/power/state

wait until it gets back to the command prompt and see if your sky2 works
after that?
Comment 16 Maciej J. Woloszyk 2009-12-10 21:00:31 UTC
On Thursday 10 December 2009, Rafael J. Wysocki wrote:
> On Thursday 10 December 2009, Maciej J. Woloszyk wrote:
> > On Thursday 10 December 2009, Rafael J. Wysocki wrote:
> > > On Thursday 10 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > > >
> > > > --- Comment #11 from Ilya Hegai <vyacheslavovich@gmail.com> 
> > > > 2009-12-10 10:36:36 --- (In reply to comment #9)
> > > >
> > > > > > On Monday 07 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > > http://bugzilla.kernel.org/show_bug.cgi?id=14730
> > >
> > > ...
> > >
> > > > > It may be worth trying to revert
> > > > >
> > > > > commit 4b77b0a2ba27d64f58f16d8d4d48d8319dda36ff
> > > > > Author: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > Date:   Wed Sep 9 23:49:59 2009 +0200
> > > > >
> > > > >     PCI: Clear saved_state after the state has been restored
> > > > >
> > > > > and retesting.
> > > >
> > > > I'm not familiar with git, but I haven't found traces of that commit
> > > > in 2.6.31.7 that I'm using (gentoo-sources) (and the problem exists
> > > > there) but after I applied it (got it from git) - sky2 errors after
> > > > hibernation/resume dissapeared
> > > > And as I can see this patch has applied against 2.6.32, so I guess TS
> > > > person is considered to upgrade to the latest kernel
> > >
> > > Hmm.  The original report is against 2.6.32-rc8.
> > >
> > > So you're saying that the commit above, when applied against 2.6.31.7,
> > > actually fixes a sky2 resume problem for you?
> > >
> > > Maciej, can you please check if 2.6.31.7 with the above commit applied
> > >  works for you?
> >
> > Ok. I've just tried 2.6.31.7 with the patch applied - it didn't worked.
> > sky2 still does'nt work after resume.
> 
> So the problem is clearly different for you.  It also means it's not
>  universal for all sky2s, so it may depend on whether the adapter is PCIe
>  or something.
> 
> Can you please use 2.6.32 for further testing and do the following:
> 

Ok.

> # echo code > /sys/power/pm_test

I hope you meant "echo core", because if it was actually supposed to be
"code", it didn't work ;)

> # echo mem > /sys/power/state
> 
> wait until it gets back to the command prompt and see if your sky2 works
> after that?
> 

Yes, after the second echo it did a kind of suspend/resume cycle, and after that
sky2 still worked.
Comment 17 Rafael J. Wysocki 2009-12-10 21:23:05 UTC
On Thursday 10 December 2009, Maciej J. Woloszyk wrote:
...
> > Can you please use 2.6.32 for further testing and do the following:
> > 
> 
> Ok.
> 
> > # echo code > /sys/power/pm_test
> 
> I hope you ment it to  be "echo core" because if it actualy supposed to be 
> "code" - it didn't worked ;)

Yes, that should have been 'core', sorry for the mistake.

> > # echo mem > /sys/power/state
> > 
> > wait until it gets back to the command prompt and see if your sky2 works
> > after that?
> > 
> 
> Yes - after the second echo it did kind of suspend/resume cycle and after
> that 
> sky2 still worked.

OK, thanks.

Please attach the full output of 'lspci -vv' (including all devices) before and
after a suspend-resume cycle failing to restore the sky2 functionality.
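
The pm_test procedure above can be wrapped in a small helper that rejects a mistyped level such as "code" before anything is written. This is a minimal sketch, assuming a kernel built with PM debugging support and root privileges; the function name `run_pm_test` and the `sysfs_power` parameter are hypothetical, and the list of levels is the one given in the kernel's PM debugging documentation.

```python
import os

# Test levels accepted by /sys/power/pm_test according to the kernel's
# PM debugging documentation. "code" is not among them.
PM_TEST_LEVELS = ("none", "core", "processors", "platform", "devices", "freezer")


def run_pm_test(level: str, state: str = "mem",
                sysfs_power: str = "/sys/power") -> None:
    """Select a pm_test level, then trigger the (test-mode) sleep transition.

    sysfs_power is a parameter only so the helper can be pointed at a
    scratch directory; on a real system it is /sys/power.
    """
    if level not in PM_TEST_LEVELS:
        raise ValueError(
            f"unknown pm_test level {level!r}; expected one of {PM_TEST_LEVELS}"
        )
    # Equivalent of: echo <level> > /sys/power/pm_test
    with open(os.path.join(sysfs_power, "pm_test"), "w") as f:
        f.write(level)
    # Equivalent of: echo <state> > /sys/power/state
    with open(os.path.join(sysfs_power, "state"), "w") as f:
        f.write(state)


# Example (requires root; starts a test suspend cycle on a real system):
#   run_pm_test("core")
```

Validating first means a typo fails with a readable error in the script instead of a rejected write to sysfs.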
Comment 18 Maciej J. Woloszyk 2009-12-10 21:38:19 UTC
Created attachment 24143 [details]
lspci -vv before suspend
Comment 19 Maciej J. Woloszyk 2009-12-10 21:39:03 UTC
Created attachment 24144 [details]
lspci -vv after resume
Comment 20 Rafael J. Wysocki 2009-12-10 21:56:15 UTC
Looking at the file in comment #19 I can see your Ethernet adapter showing Uncorrectable Error and Unsupported Request status, so it seems something has failed at the hardware level.

I'm not sure what to do next, really.

If I were you and I had a system where the problem was 100% reproducible, I'd try to find the patch that caused it using bisection.  I have no other ideas at the moment.
Comment 21 Rafael J. Wysocki 2009-12-10 22:07:23 UTC
There's one thing to check, though.  Please try to boot with pci=nomsi and see if the problem is reproducible.
Comment 22 Maciej J. Woloszyk 2009-12-10 22:44:43 UTC
OK. Tried that. No change: sky2 still doesn't work.

So... probably I would have to go patch hunting ;)
Comment 23 Ilya Hegai 2009-12-11 08:34:41 UTC
Me too :) because I'm still experiencing the problem. Yesterday several hibernate/resume cycles went smoothly, but today I got that error again, so it's not always reproducible, which makes "patch hunting" harder.
Unfortunately I can only play with 2.6.31.X, because resume in 2.6.32 is completely broken on my laptop.
Comment 24 Rafael J. Wysocki 2009-12-11 21:00:32 UTC
On Friday 11 December 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14730
> 
> --- Comment #23 from Ilya Hegai <vyacheslavovich@gmail.com>  2009-12-11
> 08:34:41 ---
> Me too) because I'm still experiencing that problem, yesterday several
> hibernate/resume cycles went smoothly, but today I've got that error again,
> so
> it's not reproducible always which makes "patch hunting" work harder

So the commit we have discussed doesn't help to fix resume on your box after
all?

> And unfortunately I can play with 2.6.31.X only, because resume in 2.6.32 is
> completely broken for my laptop

I think you should file a separate bug report about that.

Is there any kernel where suspend/resume works reliably on your box?
Comment 25 Maciej J. Woloszyk 2009-12-28 10:26:46 UTC
OK, so I did a hunting session. First I compiled all the 2.6.31 RCs from rc4 to rc9; it worked up to and including rc7. Next I opened the changelog for 2.6.31-rc8 and searched for what could be responsible for my problems. I found the suspect in commit c82f63e411f1b58427c103bd95af2863b1c96dd1 (PCI: check saved state before restore). I downloaded the patch from git and applied it on top of rc7; as expected, it stopped working. Next I took 2.6.32 and reverted the patch, and, again as expected, it works just fine for now. I'm not sure whether this breaks something else for someone else, but for now it works fine for me.
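
The release-by-release hunt described here can also be done as a binary search, which is the idea behind the bisection suggested in comment #20. The sketch below is illustrative only: `first_bad` is a hypothetical helper, and the good/bad results are the ones reported in this thread (rc3 through rc7 work, rc8 and rc9 do not). It assumes the first entry is known good, the last is known bad, and that once a release is bad all later ones are too.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered list for the first failing version.

    Assumes versions[0] is good, versions[-1] is bad, and that failures
    are monotonic (no later version works again).
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Results as reported in this thread.
RELEASES = [f"2.6.31-rc{n}" for n in range(3, 10)]  # rc3 .. rc9
KNOWN_BAD = {"2.6.31-rc8", "2.6.31-rc9"}

print(first_bad(RELEASES, lambda v: v in KNOWN_BAD))
```

With seven candidates this needs three builds instead of six. On a real tree, `git bisect` applies the same search at commit granularity.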
Comment 26 Rafael J. Wysocki 2009-12-29 14:50:41 UTC
Created attachment 24350 [details]
sky2: Force restoration of PCI configuration registers

In that case the attached patch should help instead of the revert.

Can you verify, please?
Comment 27 Maciej J. Woloszyk 2009-12-29 17:12:27 UTC
Patched clean 2.6.28, compiled, rebooted and tested. It works fine with the last patch.

Thank you very much :)
Comment 28 Maciej J. Woloszyk 2009-12-29 17:13:26 UTC
It was supposed to be 2.6.32 of course ;)
Comment 29 Rafael J. Wysocki 2009-12-29 21:00:04 UTC
Alas, this means that for some unknown reason the restoration of the sky2's config registers works at this point, although the previous attempt (done by the PCI core in the early phase of resume) apparently fails.

So, something's going on that we don't quite understand at the moment.

Can you attach a dmesg output containing full suspend-resume cycle with the patch from comment #26 applied, please?
Comment 30 Maciej J. Woloszyk 2009-12-30 07:07:27 UTC
Created attachment 24375 [details]
dmesg after full suspend/resume cycle with comment #26 patch
Comment 31 Rafael J. Wysocki 2009-12-30 20:28:23 UTC
So, this line:

sky2 0000:04:00.0: Refused to change power state, currently in D3

tells us that we couldn't change the power state of the adapter in the early phase of resume, although we could do that later.  Interestingly enough, the configuration space of the PCIe root port 0000:00:1c.1, which is the upstream "bridge" of the adapter, was restored successfully, so this shouldn't have been a communication problem.

Hmm.  The hardware appears to be quirky.

Please attach your .config from 2.6.32.
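
As background to the "currently in D3" message: the low two bits of the PCI power management control/status register (PMCSR) encode the device power state. The sketch below decodes them. The mask value matches the PCI PM register layout, but the function name is hypothetical. Note, as an assumption not established in this thread, that a read returning all-ones would also decode as D3, so the message alone does not distinguish a device that stayed in D3 from one that did not answer the read.

```python
# Low two bits of the PMCSR encode the power state: 0 = D0, 1 = D1, 2 = D2, 3 = D3hot.
PCI_PM_CTRL_STATE_MASK = 0x0003


def power_state_from_pmcsr(pmcsr: int) -> int:
    """Extract the D-state number from a 16-bit PMCSR value."""
    return pmcsr & PCI_PM_CTRL_STATE_MASK


for value in (0x0000, 0x0003, 0xFFFF):
    print(f"PMCSR={value:#06x} -> D{power_state_from_pmcsr(value)}")
```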
Comment 32 Rafael J. Wysocki 2009-12-30 20:45:53 UTC
Created attachment 24380 [details]
sky2: Move PCI resume bits to the early phase

Let's try to move the PCI config space restoration in the driver to the early phase of resume and see what happens.

Please try to suspend-resume with this patch (instead of the previous one) applied and attach the output of dmesg (regardless of the result).
Comment 33 Maciej J. Woloszyk 2009-12-30 21:23:27 UTC
Created attachment 24381 [details]
.config file for 2.6.32
Comment 34 Maciej J. Woloszyk 2009-12-30 21:42:40 UTC
Created attachment 24382 [details]
dmesg after full suspend/resume cycle with comment #32 patch

Here is the dmesg result, after reverting the previous patch and applying the new one. With the new one it does not work after resume.
Comment 35 Rafael J. Wysocki 2009-12-30 21:59:47 UTC
Created attachment 24383 [details]
sky2: Move PCI resume bits to the early phase, add sleep

Thanks.

Let's check if the adapter needs more time to switch to D0.

Please apply this patch instead of the previous one and retest (please attach the output of dmesg regardless of the result).
Comment 36 Rafael J. Wysocki 2009-12-30 22:01:47 UTC
If the patch from comment #35 doesn't work, can you unset CONFIG_HOTPLUG_PCI_PCIE in your .config and retest, please?
Comment 37 Maciej J. Woloszyk 2009-12-30 22:31:33 UTC
Created attachment 24384 [details]
dmesg after full suspend/resume cycle with comment #35 patch

Ok. Comment #35 patch seems to have worked.
Comment 38 Rafael J. Wysocki 2009-12-30 22:50:39 UTC
Ah, good.  So it's just quirky hardware.  I was afraid it might be something more fundamental.

So, please give me some time to prepare a more general patch.
Comment 39 Rafael J. Wysocki 2009-12-30 23:33:55 UTC
Created attachment 24386 [details]
PCI / PM: Use per-device D3 delay

Please check if this patch works for you (instead of the previous one).

Please attach the output of dmesg after a suspend-resume cycle regardless of the result.
Comment 40 Maciej J. Woloszyk 2009-12-30 23:52:41 UTC
Created attachment 24387 [details]
dmesg after full suspend/resume cycle with comment #39 patch

It works.
Comment 41 Rafael J. Wysocki 2009-12-31 00:45:30 UTC
Great, thanks for testing.

I'll send the patch for review tomorrow.

In the meantime, can you please check if the value of pdev->d3_delay in drivers/net/sky2.c:sky2_probe() can be decreased?  I used a very pessimistic value (200 ms is 20 times more than the requirement of the PCI standard), so perhaps we don't need to wait that long during the resume of your network adapter ...
Comment 42 Maciej J. Woloszyk 2009-12-31 08:34:09 UTC
I've just finished the tests. First I went down to a very low value of 20 (it didn't work, obviously ;)) and then increased it gradually. At 120 it started working, but it died after the third suspend/resume cycle. For now I can say that a safe value for my adapter is 150 ms: at that value it initialized correctly for 10 consecutive suspend/resume cycles.
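
The tuning procedure above amounts to picking the smallest delay that survives a fixed number of consecutive cycles. Here is a minimal sketch of that selection rule. The helper names are hypothetical, and the outcome table is only a model of the three results reported in this comment (20 ms fails immediately, 120 ms fails on the third cycle, 150 ms passes 10 cycles); intermediate values that were tried are not listed in the thread and are left out.

```python
from typing import Callable, Iterable, Optional


def smallest_reliable_delay(
    candidates_ms: Iterable[int],
    survives: Callable[[int, int], bool],
    cycles: int = 10,
) -> Optional[int]:
    """Return the smallest delay for which every one of `cycles` cycles succeeds."""
    for delay in sorted(candidates_ms):
        if all(survives(delay, n) for n in range(1, cycles + 1)):
            return delay
    return None


# Model of the reported results: delay (ms) -> first suspend/resume cycle that failed.
FIRST_FAILING_CYCLE = {20: 1, 120: 3}


def reported_outcome(delay_ms: int, cycle: int) -> bool:
    """True if the adapter came back on this cycle, per the reported results."""
    failing = FIRST_FAILING_CYCLE.get(delay_ms)
    return failing is None or cycle < failing


print(smallest_reliable_delay([20, 120, 150], reported_outcome))
```

Requiring several consecutive passes matters here: 120 ms looked fine for two cycles, so a single successful resume is not enough evidence that a delay is safe.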
Comment 43 Rafael J. Wysocki 2009-12-31 09:56:57 UTC
OK, thanks.  I'll use 150 in the final patch, then.
Comment 44 Rafael J. Wysocki 2009-12-31 11:19:34 UTC
Created attachment 24390 [details]
PCI / PM: Use per-device D3 delays

Updated patch sent for review.

It's slightly different from the previous one, so I guess it's a good idea to double check if it works.
Comment 45 Rafael J. Wysocki 2009-12-31 11:20:33 UTC
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Patch : http://patchwork.kernel.org/patch/70357/
Comment 46 Maciej J. Woloszyk 2009-12-31 11:42:20 UTC
Checked the new version. It works fine for me. Thanks.
Comment 47 Rafael J. Wysocki 2009-12-31 15:53:48 UTC
Created attachment 24392 [details]
sky2: Allow PCI PM core to handle PCI-specific parts of suspend/resume

While we're at it, can you please check that the attached patch doesn't break things for you?

It simplifies the sky2's suspend and resume routines and it should improve the handling of Wake-on-LAN in some cases.