Bug 11865

Summary: WOL for E100 Doesn't Work Anymore
Product: Drivers Reporter: Roger (rogerx.oss)
Component: Network Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX    
Severity: normal CC: andi, jbrandeb, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 11167    
Attachments: e100: Fix WOL regression

Description Roger 2008-10-26 21:56:35 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.27
Distribution: Gentoo
Hardware Environment: 2x750P3 1G RAM w/ 

00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)

Problem Description:
As of kernel version 2.6.27, WOL has stopped working.  The box no longer powers on after the magic packet has been sent.  Rebooting using kernel-2.6.26*, WOL *does* work!

Steps to reproduce:
(Not needed.  Self Explanatory.)


I noticed that 3 lines of code in e100.c were added or modified since 2.6.26*.  I reverted 2 of them to their 2.6.26 state; the third only added an additional pointer, so I left it alone.  After recompiling and rebooting, WOL still didn't work on 2.6.27, so I'm guessing the source of this bug lies elsewhere.
Comment 1 Roger 2008-10-26 21:57:41 UTC
I made some additional notes of this bug on Bug #5149.

However, I believe at this point they're probably separate issues, as WOL had been working well until 2.6.27.
Comment 2 Anonymous Emailer 2008-10-27 22:02:39 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 26 Oct 2008 21:56:35 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11865
> 
>            Summary: WOL for E100 Doesn't Work Anymore
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.27
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: jgarzik@pobox.com
>         ReportedBy: rogerx@sdf.lonestar.org
> 
> 
> Latest working kernel version: 2.6.27

This should read 2.6.26.  It is a regression.

> Earliest failing kernel version: 2.6.27
> Distribution: Gentoo
> Hardware Environment: 2x750P3 1G RAM w/ 
> 
> 00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100]
> (rev 08)
> 
> Problem Description:
> As of kernel version 2.6.27, WOL has stopped working.  The box no longer
> powers
> on after the magic packet has been sent.  Rebooting using kernel-2.6.26*, WOL
> *does* work!
> 
> Steps to reproduce:
> (Not needed.  Self Explanatory.)
> 
> 
> I noticed 3 lines of code in e100.c added/modified since 2.6.26*, however, I
> reversed 2 lines of code to 2.6.26 and the other line was just added an
> additional pointer, for which I left alone.  Recompiled, rebooted and WOL
> still
> didn't work using 2.6.27.  I'm guessing, the source of this bug lies
> elsewhere.
Comment 3 Jesse Brandeburg 2008-11-03 11:05:06 UTC
Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via
> the bugzilla web interface).
> 
> On Sun, 26 Oct 2008 21:56:35 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote: 
> 
>> http://bugzilla.kernel.org/show_bug.cgi?id=11865
>> 
>>            Summary: WOL for E100 Doesn't Work Anymore           
>>            Product: Drivers Version: 2.5
>>      KernelVersion: 2.6.27
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Network
>>         AssignedTo: jgarzik@pobox.com
>>         ReportedBy: rogerx@sdf.lonestar.org
>> 
>> 
>> Latest working kernel version: 2.6.27
> 
> This should read 2.6.26.  It is a regression.
> 
>> Earliest failing kernel version: 2.6.27
>> Distribution: Gentoo
>> Hardware Environment: 2x750P3 1G RAM w/
>> 
>> 00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet
>> Pro 100] (rev 08) 
>> 
>> Problem Description:
>> As of kernel version 2.6.27, WOL has stopped working.  The box no
>> longer powers on after the magic packet has been sent.  Rebooting
>> using kernel-2.6.26*, WOL *does* work! 

what does cat /proc/acpi/wakeup say?  (if you have legacy acpi procfs)

>> 
>> Steps to reproduce:
>> (Not needed.  Self Explanatory.)
>> 
>> 
>> I noticed 3 lines of code in e100.c added/modified since 2.6.26*,
>> however, I reversed 2 lines of code to 2.6.26 and the other line was
>> just added an additional pointer, for which I left alone. 
>> Recompiled, rebooted and WOL still didn't work using 2.6.27.  I'm
>> guessing, the source of this bug lies elsewhere. 

I think there is an outstanding patch set to "make device use the new
power management api"
http://marc.info/?l=linux-netdev&m=121874992800468&w=2

It doesn't appear this patch was in 2.6.27; I'm not sure why not, but I
am rather afraid that lots of devices' ability to wake up got busted
in 2.6.27.
Comment 4 Rafael J. Wysocki 2008-11-03 15:27:45 UTC
On Monday, 3 of November 2008, Brandeburg, Jesse wrote:
> Andrew Morton wrote:
> > (switched to email.  Please respond via emailed reply-to-all, not via
> > the bugzilla web interface).
> > 
> > On Sun, 26 Oct 2008 21:56:35 -0700 (PDT)
> > bugme-daemon@bugzilla.kernel.org wrote: 
> > 
> >> http://bugzilla.kernel.org/show_bug.cgi?id=11865
> >> 
> >>            Summary: WOL for E100 Doesn't Work Anymore           
> >>            Product: Drivers Version: 2.5
> >>      KernelVersion: 2.6.27
> >>           Platform: All
> >>         OS/Version: Linux
> >>               Tree: Mainline
> >>             Status: NEW
> >>           Severity: normal
> >>           Priority: P1
> >>          Component: Network
> >>         AssignedTo: jgarzik@pobox.com
> >>         ReportedBy: rogerx@sdf.lonestar.org
> >> 
> >> 
> >> Latest working kernel version: 2.6.27
> > 
> > This should read 2.6.26.  It is a regression.
> > 
> >> Earliest failing kernel version: 2.6.27
> >> Distribution: Gentoo
> >> Hardware Environment: 2x750P3 1G RAM w/
> >> 
> >> 00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet
> >> Pro 100] (rev 08) 
> >> 
> >> Problem Description:
> >> As of kernel version 2.6.27, WOL has stopped working.  The box no
> >> longer powers on after the magic packet has been sent.  Rebooting
> >> using kernel-2.6.26*, WOL *does* work! 
> 
> what does cat /proc/acpi/wakeup say?  (if you have legacy acpi procfs)
> 
> >> 
> >> Steps to reproduce:
> >> (Not needed.  Self Explanatory.)
> >> 
> >> 
> >> I noticed 3 lines of code in e100.c added/modified since 2.6.26*,
> >> however, I reversed 2 lines of code to 2.6.26 and the other line was
> >> just added an additional pointer, for which I left alone. 
> >> Recompiled, rebooted and WOL still didn't work using 2.6.27.  I'm
> >> guessing, the source of this bug lies elsewhere. 
> 
> I think there is an outstanding patch set to "make device use the new
> power management api"
> http://marc.info/?l=linux-netdev&m=121874992800468&w=2

There were three patches like this, one against sky2 and the others for e100
and skge.  The sky2 one has been merged, the other two are in Jeff's tree
AFAICS.

> It doesn't appear this patch was in 2.6.27, I'm not sure why not, but I
> am relatively afraid that lots of devices ability to wake up got busted
> in 2.6.27.

Well, unfortunately the maintainers of the networking code were not very
interested in these patches, although I had sent them well before 2.6.27
for the first time.

Generally, the sky2 commit at
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9d731d77c9794bb0a264f58d35949a1ab6dcc41c
illustrates what needs to be done for other NICs supporting WOL (the key part
is to call device_set_wakeup_enable() with appropriate arguments whenever
WOL is enabled/disabled, and also during initialization).

I'm going to do this over time for all of the drivers, but if anyone can help,
please do so.

Thanks,
Rafael
Comment 5 Jesse Brandeburg 2008-11-03 16:48:40 UTC
Rafael J. Wysocki wrote:
>>>> 00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet
>>>> Pro 100] (rev 08) 
>>>> 
>>>> Problem Description:
>>>> As of kernel version 2.6.27, WOL has stopped working.  The box no
>>>> longer powers on after the magic packet has been sent.  Rebooting
>>>> using kernel-2.6.26*, WOL *does* work!

>> I think there is an outstanding patch set to "make device use the
>> new power management api"
>> http://marc.info/?l=linux-netdev&m=121874992800468&w=2 
> 
> There were three patches like this, one against sky2 and the others
> for e100 and skge.  The sky2 one has been merged, the other two are
> in the Jeff's tree AFAICS.

why did you pick those three drivers?  I'm glad you picked e100 as one
of the ones to make the change to, but there are a lot of other drivers
that support wake on lan, and are probably now broken.
 
>> It doesn't appear this patch was in 2.6.27, I'm not sure why not,
>> but I am relatively afraid that lots of devices ability to wake up
>> got busted in 2.6.27.
> 
> Well, unfortunately the maintainers of the networking code were not
> very interested in these patches, although I had sent them well
> before 2.6.27 
> for the first time.

Unless I am mistaken, the reason we have drivers in the kernel is so that
when API changes are made to the kernel, the drivers are pulled along with
them.  As I understand it, your changes to the core API should have been
rejected until the drivers that used power management APIs were updated
too, otherwise we just get generic user breakage.

Had you asked for our help directly (CC:), we probably could have
helped with our drivers.
 
> Generally, the sky2 commit at
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9d731d77c9794bb0a264f58d35949a1ab6dcc41c
> illustrates what needs to be done for other NICs supporting WOL (the
> key part is to call device_set_wakeup_enable() with appropriate
> arguments whenever 
> WOL is enabled/disabled and using for the initialization).
> 
> I'm going to do this over time for all of the drivers, but if anyone
> can help, please do so.

While that is a worthy goal, I don't think it is okay to have
regressions until you have time to fix them.  This is a pretty big
change in a subtly difficult area to both get right and to test.

Your work is appreciated, I just didn't realize it would break all
devices that didn't implement it.

I'm open to corrections to my point of view above.
Jesse
Comment 6 Rafael J. Wysocki 2008-11-03 17:38:48 UTC
On Tuesday, 4 of November 2008, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11865
> 
> ------- Comment #5 from jesse.brandeburg@intel.com  2008-11-03 16:48 -------
> Rafael J. Wysocki wrote:
> >>>> 00:14.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet
> >>>> Pro 100] (rev 08) 
> >>>> 
> >>>> Problem Description:
> >>>> As of kernel version 2.6.27, WOL has stopped working.  The box no
> >>>> longer powers on after the magic packet has been sent.  Rebooting
> >>>> using kernel-2.6.26*, WOL *does* work!
> 
> >> I think there is an outstanding patch set to "make device use the
> >> new power management api"
> >> http://marc.info/?l=linux-netdev&m=121874992800468&w=2 
> > 
> > There were three patches like this, one against sky2 and the others
> > for e100 and skge.  The sky2 one has been merged, the other two are
> > in the Jeff's tree AFAICS.
> 
> why did you pick those three drivers?

I could test them myself at that time.

> I'm glad you picked e100 as one of the ones to make the change to, but there
> are a lot of other drivers that support wake on lan, and are probably now
> broken. 

Well, many of them wouldn't work due to the missing ACPI support and the
changes in question were necessary to add that support.  Also, there were
_no_ problem reports related to this until now, except for the sky2 case that
was immediately handled.

> >> It doesn't appear this patch was in 2.6.27, I'm not sure why not,
> >> but I am relatively afraid that lots of devices ability to wake up
> >> got busted in 2.6.27.
> > 
> > Well, unfortunately the maintainers of the networking code were not
> > very interested in these patches, although I had sent them well
> > before 2.6.27 
> > for the first time.
> 
> unless I am mistaken the reason we have drivers in the kernel is that so
> when API changes are made to the kernel the driver is pulled along with
> it.  As I understand it, your changes to the core API should have been
> rejected until the drivers that used power management APIs were updated
> too, otherwise we just get generic user breakage.

Unfortunately, I had to fix some damage that had been caused by someone else
before and I wanted to send fixes for all of the drivers once the first batch
of my patches (sky2, e100, skge, tg3 - I forgot about this one, it's already
in) had been accepted, but that hasn't really happened yet.

> Would you have asked for our help directly (CC:) we probably can/could
> help with our drivers.

Sorry about that, my fault.

> > Generally, the sky2 commit at
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9d731d77c9794bb0a264f58d35949a1ab6dcc41c
> > illustrates what needs to be done for other NICs supporting WOL (the
> > key part is to call device_set_wakeup_enable() with appropriate
> > arguments whenever 
> > WOL is enabled/disabled and using for the initialization).
> > 
> > I'm going to do this over time for all of the drivers, but if anyone
> > can help, please do so.
> 
> While that is a worthy goal, I don't think it is okay to have
> regressions until you have time to fix them.  This is a pretty big
> change in a subtly difficult area to both get right and to test.

This isn't really "until I have time to fix them", but "until my fixes are
accepted" thing.  I _really_ care about regressions, but in this case I failed
to make people realize the importance of the patches I was sending.

Moreover, I wasn't quite sure how many systems would _really_ be broken,
because many of them didn't work anyway (despite the support in the drivers),
as I said above.  For this reason, I thought I would fix problems as soon as
they were reported, but there was only one report before this one.

> Your work is appreciated, I just didn't realize it would break all
> devices that didn't implement it.

In fact it doesn't really break them, you can write 'enabled' to
/sys/devices/.../power/wakeup of the device in question and it will work.

> I'm open to corrections to my point of view above.

Well, before those changes WOL didn't work on any of my machines and now
it works on all of them (except for one that hasn't been tested) - with the
already merged patches and the two waiting for merging.

Sorry about the problems caused, I'm willing to fix them ASAP.

Thanks,
Rafael
Comment 7 Rafael J. Wysocki 2008-11-03 23:16:07 UTC
Created attachment 18646 [details]
e100: Fix WOL regression

FWIW, attached is the minimal fix for e100 (please test).

[The patch in Jeff's tree does more than this one.]
Comment 8 Roger 2008-11-04 01:57:20 UTC
Confirmed on my "Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)", simply doing the following re-enables WOL:

 # cat /sys/devices/pci0000\:00/0000\:00\:14.0/power/wakeup
disabled

 # echo "enabled" > /sys/devices/pci0000\:00/0000\:00\:14.0/power/wakeup

I'll test any submitted patches shortly.
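
For anyone repeating the manual workaround above, it could be wrapped in a small helper. This is only a sketch under a few assumptions: the function name `enable_wakeup` and the `WAKEUP_FILE` variable are made up for illustration, the default path is the 00:14.0 device from this report and will differ on other machines, and writing to the real sysfs file needs root.

```shell
#!/bin/sh
# Sketch only: enable wakeup for one device via its sysfs power/wakeup file.
# The default path is the device from this report (00:14.0); adjust as needed.
WAKEUP_FILE="${WAKEUP_FILE:-/sys/devices/pci0000:00/0000:00:14.0/power/wakeup}"

# enable_wakeup FILE  (hypothetical helper name)
# Writes "enabled" to FILE unless it already says so, then prints the
# resulting state. Returns non-zero if FILE is missing or not writable.
enable_wakeup() {
    file="$1"
    if [ ! -e "$file" ]; then
        echo "no such wakeup file: $file" >&2
        return 1
    fi
    state=$(cat "$file")
    if [ "$state" != "enabled" ]; then
        echo enabled > "$file" || return 1
    fi
    cat "$file"
}
```

Running `enable_wakeup "$WAKEUP_FILE"` as root after sourcing the snippet should print the state the file ends up in. The setting is not persistent, so it would have to be repeated after each boot (for example from a local init script).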
Comment 9 Roger 2008-11-04 02:43:42 UTC
Patch submitted by Comment #7 works here.

To verify, I made sure the sysfs "../power/wakeup" file showed "disabled" (as it should with the current kernel release).

I then issued "halt -p", restarted & rechecked the sysfs and it did show the correct "enabled" flag without any interference from a cold start.

I then hibernated and WOL worked with a cheerful power on.

Just to check, I have another PCI ethernet card with WOL capability.  Does the motherboard need the two pin power cable attached to the WOL pins on a motherboard in order to work?  It's an Intel 440BX board with no cable attachment point on the motherboard, however, the bios of the motherboard does show it's WOL capable.
Comment 10 Rafael J. Wysocki 2008-11-04 03:36:26 UTC
What network adapter is there on the board?
Comment 11 Rafael J. Wysocki 2008-11-04 08:41:57 UTC
On Tuesday, 4 of November 2008, Rafael J. Wysocki wrote:
[--snip--]
> 
> > Your work is appreciated, I just didn't realize it would break all
> > devices that didn't implement it.
> 
> In fact it doesn't really break them, you can write 'enabled' to
> /sys/devices/.../power/wakeup of the device in question and it will work.
> 
> > I'm open to corrections to my point of view above.
> 
> Well, before those changes WOL didn't work on any of my machines and now
> it works on all of them (except for one that hasn't been tested) - with the
> already merged patches and the two waiting for merging.
> 
> Sorry about the problems caused, I'm willing to fix them ASAP.

BTW, WOL fixes for e1000e, e1000 and igb follow.

Thanks,
Rafael
Comment 12 Roger 2008-11-04 16:15:35 UTC
This is for Comment #3

On Mon, 2008-11-03 at 11:04 -0800, Brandeburg, Jesse wrote:
> /proc/acpi/wakeup

$ cat /proc/acpi/wakeup
Device  S-state   Status   Sysfs node
PCI0      S4     disabled  no-bus:pci0000:00
SLPB      S5    *enabled 


Seems pretty normal.  This is after using Rafael's patch on the kernel,
fixing the bug.  This output is pretty much Identical to what I saw
prior to the patch.  So, I think using this older /proc/acpi is not very
reliable at this point.

Checking /sys/bus/pci...*/power/wakeup *DID* show it was disabled & then
showed enabled after applying Rafael's patch.  Also, as Rafael further
noted, simply marking the /sys/*/*/*/power/wakeup flag as enabled did
re-enable WOL as well.
Comment 13 Rafael J. Wysocki 2008-11-05 02:32:52 UTC
In fact, /proc/acpi/wakeup has never worked for many users and is now considered deprecated, although you can still use it to see which devices are set up to wake up the system (and even then, only devices that ACPI knows about).
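
Since the per-device sysfs files are what reflected the real state in this report, a quick way to survey them might look like the sketch below. The function name `list_wakeup` is hypothetical, and the root directory is a parameter so the function can be pointed at a test tree instead of the real /sys/devices.

```shell
#!/bin/sh
# Sketch only: print "<state> <sysfs path>" for every device that exposes
# a non-empty power/wakeup attribute below the given root.
# list_wakeup [ROOT]  (hypothetical helper name; ROOT defaults to /sys/devices)
list_wakeup() {
    root="${1:-/sys/devices}"
    find "$root" -type f -path '*/power/wakeup' 2>/dev/null |
    while IFS= read -r f; do
        state=$(cat "$f" 2>/dev/null)
        # Skip attributes that read back empty.
        if [ -n "$state" ]; then
            printf '%s %s\n' "$state" "$f"
        fi
    done
}
```

For example, `list_wakeup | grep 0000:00:14.0` would narrow the output to the NIC discussed in this bug, assuming it sits at the same PCI address.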
Comment 14 Rafael J. Wysocki 2008-11-09 13:03:51 UTC
Patch : http://bugzilla.kernel.org/attachment.cgi?id=18646&action=view
Comment 15 Rafael J. Wysocki 2008-11-09 13:04:41 UTC
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
Comment 16 Andreas Mohr 2008-11-09 15:26:30 UTC
The "needs cable" vs. no cable needed is said to be a PCI2.1(?) vs. PCI2.2(?) thing and is said to be a thoroughly different way of operation.
This being said, my via-rhine card (onboard; newish PCI --> non-cable WOL) did NOT work on 25/26ish either despite massive ACPI signal wakeup and magic packet sender tweaking, thus any work in this area to put us out of this misery is greatly appreciated from my side!
Comment 17 Roger 2008-11-16 22:53:43 UTC
What is the following email message for?

---
This message has been generated automatically as a part of a report
of regressions introduced between 2.6.26 and 2.6.27.

The following bug entry is on the current list of known regressions
introduced between 2.6.26 and 2.6.27.  Please verify if it still should
be listed and let me know (either way)
---

Reminder to let me know if this bug is fixed or not??
Comment 18 Rafael J. Wysocki 2008-11-17 01:53:29 UTC
On Monday, 17 of November 2008, you wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11865
> 
> ------- Comment #17 from rogerx@sdf.lonestar.org  2008-11-16 22:53 -------
> What is the following email message for?
> 
> ---
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.26 and 2.6.27.
> 
> The following bug entry is on the current list of known regressions
> introduced between 2.6.26 and 2.6.27.  Please verify if it still should
> be listed and let me know (either way)
> ---
> 
> Reminder to let me know if this bug is fixed or not??

This is generated automatically by the script creating the weekly report
of recent regressions.  It is a request for you to check the latest kernel
and see if the problem is still existing, but if you know that it hasn't been
fixed yet, you can just send a "should be listed" message to me.

Of course, if I'm involved in handling the bug, like in this case, you don't
need to do anything. :-)

Thanks,
Rafael
Comment 19 Roger 2008-12-14 17:35:26 UTC
Interesting, in the kernel 2.6.27, just using the sysfs fix to enable WOL isn't working here?

Can anybody else confirm this? (..as I might be seeing an issue with using a custom patched kernel here.)
Comment 20 Roger 2008-12-16 12:38:58 UTC
The sysfs hack doesn't work anymore for me, but the patch still works fine.

Tested:
=sys-kernel/gentoo-sources-2.6.27-r5
=sys-kernel/vanilla-sources-2.6.27.5
Comment 22 Roger 2009-05-21 15:18:05 UTC
As of kernel-2.6.29, verified wake on lan works again without patching!