Bug 217069

Summary: Wake on Lan is broken since 6.2
Product: Drivers Reporter: Ivan Ivanich (iivanich)
Component: Network Assignee: drivers_network (drivers_network)
Status: RESOLVED CODE_FIX    
Severity: normal CC: benjamin.asbach, bjorn, glaubersm, harv, hkallweit1, iivanich, jmprieto, kernel, lenb, lvjianmin, Mrmaxmeier, natrio, omgwtfsalty, regressions, rjw, Robert.Moore, rui.zhang, tiwai, tommy.giesler
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 6.2 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: add condition to detect and handle pcie event
FACP
add more strict conditions to improve compatibility

Description Ivan Ivanich 2023-02-22 00:51:52 UTC
After upgrading to 6.2 I am having issues with Wake on LAN on 2 systems:
- the first is an old Lenovo laptop from 2012 (Ivy Bridge) with a Realtek network adapter
- the second is a PC (Haswell refresh) with a PCIe Realtek network adapter

Both use the r8169 driver for networking.

On the laptop it's not possible to wake on LAN after poweroff.
On the PC it's not possible to wake on LAN after hibernate, but it works after poweroff.

In both cases downgrading to a 6.1.x kernel fixes the issue.
Comment 1 Artem S. Tashkinov 2023-02-22 07:27:54 UTC
Could you please bisect?
Comment 2 Ivan Ivanich 2023-02-22 09:26:49 UTC
Actually I can't test it because I have no physical access to these computers, so I wouldn't be able to power them on in case of failure.
Comment 3 Ivan Ivanich 2023-02-22 11:27:25 UTC
There were several changes to r8169 that could potentially have caused the regression:

- r8169: fix dmar pte write access is not set error
- r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
- r8169: enable GRO software interrupt coalescing per default
- r8169: use tp_to_dev instead of open code
Comment 4 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-02-23 09:51:05 UTC
(In reply to Artem S. Tashkinov from comment #1)
> Could you please bisect?

Apparently you can't, but FWIW, it IMHO would have been better to wait for some feedback from the developers first anyway, as they might already know the cause (or even have a fix), in which case bisecting would just waste time and energy.
Comment 5 Heiner Kallweit 2023-02-24 20:24:41 UTC
At first it would be good to know which chip version is used on both systems.
dmesg | grep XID
Does WoL from suspend2ram work?
Comment 6 Ivan Ivanich 2023-02-24 21:27:52 UTC
This is the Haswell PC, where WoL works after shutdown but fails from hibernate (eth1 is in use):
dmesg | grep XID
[    0.740406] r8169 0000:03:00.0 eth0: RTL8168g/8111g, xxxxxxx:51:31, XID 4c0, IRQ 33
[    0.742128] r8169 0000:06:00.0 eth1: RTL8168e/8111e, xxxxxxx:d4:6f, XID 2c2, IRQ 34

Ivy Bridge laptop, WoL broken for both shutdown and hibernate:
dmesg | grep XID
[    5.681721] r8169 0000:04:00.0 eth0: RTL8168evl/8111evl, xxxxxxx:ef:07, XID 2c9, IRQ 29

Can't test suspend2ram right now because I've downgraded the kernels on both machines to 6.1.13 and, as I said before, there is nobody there who can power on the PCs if they don't wake up.
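
When the same data is collected from several machines, the chip name and XID can be pulled out of r8169 probe lines like the ones above with a short script. A minimal Python sketch, assuming the line layout seen in this comment; the function name parse_r8169_xid and the field names are hypothetical:

```python
import re

# Assumed layout, taken from the dmesg lines quoted in this comment:
#   r8169 <pci address> <iface>: <chip>, <mac>, XID <hex>, IRQ <n>
_R8169_LINE = re.compile(
    r"r8169 (?P<pci>[0-9a-f:.]+) (?P<iface>\w+): (?P<chip>[^,]+), "
    r"(?P<mac>[^,]+), XID (?P<xid>[0-9a-f]+), IRQ (?P<irq>\d+)"
)


def parse_r8169_xid(line):
    """Return a dict of fields from one r8169 probe line, or None if it does not match."""
    match = _R8169_LINE.search(line)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    sample = ("[    0.742128] r8169 0000:06:00.0 eth1: RTL8168e/8111e, "
              "xxxxxxx:d4:6f, XID 2c2, IRQ 34")
    print(parse_r8169_xid(sample))
```

All values are returned as strings, so the XID can be compared directly with the hexadecimal text printed by the driver.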
Comment 7 Heiner Kallweit 2023-02-24 21:45:44 UTC
I see no change in r8169 between 6.1 and 6.2 that could cause a change in WoL behavior. So the root cause may be somewhere else.
Comment 8 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-02-25 06:52:23 UTC
(In reply to Heiner Kallweit from comment #7)
> I see no change in r8169 between 6.1 and 6.2 that could cause a change in
> WoL behavior. So the root cause may be somewhere else.

IOW: Ivan or someone else has to bisect this to find the root of the problem?
Comment 9 Heiner Kallweit 2023-03-03 07:24:28 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #8)
> (In reply to Heiner Kallweit from comment #7)
> > I see no change in r8169 between 6.1 and 6.2 that could cause a change in
> > WoL behavior. So the root cause may be somewhere else.
> 
> IOW: Ivan or someone else has to bisect this to find the root of the problem?

Right.
Comment 10 Patrick Silva 2023-03-08 15:04:54 UTC
Wake on lan is not working after poweroff with kernel linux 6.2.2.arch1-1 on my Arch Linux too.
Cannot test wol after suspend and hibernation because currently these features do not work on my hardware.

My NIC:
$ sudo dmesg | grep XID
[   21.104518] r8169 0000:04:00.0 eth0: RTL8168evl/8111evl, xxxxxxx:4b:1b, XID 2c9, IRQ 30

Downgrade to linux-lts 6.1.15-1 fixes wol on my machine.

At least one Intel user is facing the same regression, see
https://bbs.archlinux.org/viewtopic.php?id=283917
Comment 11 Heiner Kallweit 2023-03-08 15:13:43 UTC
(In reply to Patrick Silva from comment #10)
> Wake on lan is not working after poweroff with kernel linux 6.2.2.arch1-1 on
> my Arch Linux too.
> Cannot test wol after suspend and hibernation because currently these
> features do not work on my hardware.
> 
> My NIC:
> $ sudo dmesg | grep XID
> [   21.104518] r8169 0000:04:00.0 eth0: RTL8168evl/8111evl, xxxxxxx:4b:1b,
> XID 2c9, IRQ 30
> 
> Downgrade to linux-lts 6.1.15-1 fixes wol on my machine.
> 
> At least an intel user is facing the same regression, see
> https://bbs.archlinux.org/viewtopic.php?id=283917

Then please bisect between these two kernel versions.
Comment 12 Patrick Silva 2023-03-08 15:16:20 UTC
(In reply to Heiner Kallweit from comment #11)
> (In reply to Patrick Silva from comment #10)
> > Wake on lan is not working after poweroff with kernel linux 6.2.2.arch1-1
> on
> > my Arch Linux too.
> > Cannot test wol after suspend and hibernation because currently these
> > features do not work on my hardware.
> > 
> > My NIC:
> > $ sudo dmesg | grep XID
> > [   21.104518] r8169 0000:04:00.0 eth0: RTL8168evl/8111evl, xxxxxxx:4b:1b,
> > XID 2c9, IRQ 30
> > 
> > Downgrade to linux-lts 6.1.15-1 fixes wol on my machine.
> > 
> > At least an intel user is facing the same regression, see
> > https://bbs.archlinux.org/viewtopic.php?id=283917
> 
> Then please bisect between these two kernel versions.

I do not have the technical knowledge to do it. Sorry.
Comment 13 anarchy 2023-03-08 16:25:13 UTC
I have this problem with an RTL8125B as well (https://bugzilla.kernel.org/show_bug.cgi?id=217163). I did some troubleshooting. It seems the problem happens during shutdown: the LAN port doesn't stay powered on, so the Wake on LAN signal cannot reach it.
Comment 14 anarchy 2023-03-08 16:48:16 UTC
I found a solution for my bug, https://wiki.debian.org/WakeOnLan

It works for me.


Add an interface config file /etc/network/interfaces.d/eth0 (or modify the global interface config file /etc/network/interfaces):

auto eth0
iface eth0 inet dhcp
        ethernet-wol g

Activate it:

sudo reboot
Comment 15 Heiner Kallweit 2023-03-08 16:59:26 UTC
(In reply to anarchy from comment #14)
> I found a solution for my bug, https://wiki.debian.org/WakeOnLan
> 
> It works for me.
> 
> 
> Add an interface config file /etc/network/interfaces.d/eth0 (or modify the
> global interface config file /etc/network/interfaces):
> 
> auto eth0
> iface eth0 inet dhcp
>         ethernet-wol g
> 
> Activate it:
> 
> sudo reboot

Then it's not a bug. It's clear that you have to configure WoL before using it.
Comment 16 Benjamin Asbach 2023-03-08 17:41:04 UTC
Created attachment 303902 [details]
attachment-1376-0.html

Can you confirm that you're talking about kernel 6.2?

Comment 17 Harvey 2023-03-16 11:15:12 UTC
Well, this doesn't seem to be related to r8169, because it is also happening to me with the e1000e kernel module.

I operate 3 Supermicro servers (DMI: Supermicro X8SIL/X8SIL, BIOS 1.2a) which have 2 integrated Ethernet controllers.
lspci says:
Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

All of them are configured to wake on LAN (via ether-wake from a router) at a certain time of day so that they are only powered on during working hours.
All of this has been working for years without any problem. But starting with kernel 6.2 the machines do not power on anymore. The odd part: the integrated IPMI management interface still works, and the machines can be started through its web server.

I am completely puzzled about what is happening here.

Greetings
Harvey
Comment 18 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-16 11:51:21 UTC
(In reply to Harvey from comment #17)

> I am completely puzzled what happens here.

Might be some ACPI/PCI change or something like that which causes this. Would be great if someone could bisect this. Any volunteers?
Comment 19 Natrio 2023-03-16 17:05:53 UTC
WOL works for me on r8169 (RTL8111/8168/8411 on MSI B75MA-E33), but NOT on atl1c (AR8151 on Gigabyte H61M-S2PV).
Judging by the network switch LED, the Ethernet device stays on after suspend, but the "magic packet" does not wake it up.
On linux-6.1.* kernels everything works.
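
For reference, the "magic packet" mentioned here has a fixed payload: six 0xFF bytes followed by the target MAC address repeated 16 times. A minimal Python sketch that builds this payload and, as one possible transport, sends it as a UDP broadcast. The function names, the broadcast address, and port 9 are assumptions for illustration; tools such as ether-wake send a raw Ethernet frame instead:

```python
import socket


def build_magic_packet(mac):
    """Build the 102-byte Wake-on-LAN payload for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain exactly 6 bytes: %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Synchronization stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Send the payload as a UDP broadcast (address and port are assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


if __name__ == "__main__":
    print(build_magic_packet("00:11:22:33:44:55").hex())
```

Building the payload separately from sending it makes it easy to inspect the bytes when a sleeping machine does not react.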
Comment 21 Heiner Kallweit 2023-03-17 22:16:27 UTC
(In reply to Natrio from comment #19)
> I have working WOL on r8169 (RTL8111/8168/8411 on MSI B75MA-E33), but NOT
> working on atl1c (AR8151 on Gigabyte H61M-S2PV).
> Ethernet device seems to be ON by network switch LED after suspend, but
> "Magic packet" not waking it up.
> On linux-6.1.* kernels all OK.

This is more or less what has been said already.
What's needed is a bisect from somebody being affected.
Comment 22 Natrio 2023-03-18 11:50:38 UTC
After testing with pre-built kernels:
0e2c9884cbbae00f956d881848669790d73be43d (12 Dec 2022) - working.
8715c6d3100fc7c6edddf29af4a399a1c12d028c (13 Dec 2022) – NOT working.
I'll try to bisect between them if I can.
Comment 23 Mrmaxmeier 2023-03-18 13:04:32 UTC
Hi, I tried bisecting this a few days ago and my run just finished on

> 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98
> ACPICA: Events: Support fixed PCIe wake event

My setup is a bit wonky though and slightly complicated by the fact that rtcwake broke for a bit somewhere between v6.1 and v6.2, but the commit seems relevant and is in Natrio's range :)
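
The narrowing described in comments 22 and 23 is a binary search over the commit history. A simplified Python sketch of that idea, assuming a linear list of commits ordered oldest to newest, a known-good first entry, and a known-bad last entry. The is_bad callback is hypothetical and stands in for "build, boot, and test WoL"; the real git bisect works on the full commit graph rather than a flat list:

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good and commits[-1] is known bad, and that
    every commit after the first bad one is bad as well.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]


if __name__ == "__main__":
    history = ["c%d" % i for i in range(10)]
    # Pretend the regression was introduced in c6.
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 6))
```

Each round halves the remaining range, so the number of kernels to build and test grows only logarithmically with the number of commits between the good and bad versions.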
Comment 24 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-18 14:41:57 UTC
Could somebody else maybe try if reverting 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 helps?
Comment 25 Natrio 2023-03-18 19:00:11 UTC
Yes, I can.
60f2096b59bcd6827aa53d771505f939317b254c - working
5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 - NOT working

Ethernet: Atheros AR8151 (atl1c module)
MB: Gigabyte H61M-S2PV (Intel H61 Express)
CPU: i5-3570 (Ivy Bridge)
Comment 26 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-19 07:23:15 UTC
(In reply to Natrio from comment #25)

> 60f2096b59bcd6827aa53d771505f939317b254c - working
> 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 - NOT working

Great, thx. I now asked the developer of the culprit to look into this.
Comment 27 Harvey 2023-03-19 16:44:23 UTC
I can also confirm that reverting 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98 solves the problem for me too.
Comment 28 Patrick Silva 2023-03-20 02:17:44 UTC
Reverting 5c62d5aab8752e5ee7bfbe75ed6060db1c787f98
fixes wake on lan on my Arch Linux too.
Comment 29 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-24 10:19:16 UTC
Could somebody try the following patch:
https://github.com/acpica/acpica/pull/784/commits/0e66e6aae972dac3833bdcbd223aa6a8b1733176

This was asked for in https://lore.kernel.org/all/b39064e3-4f8b-f607-b270-1e0c8539d391@loongson.cn/ – I'm just a man-in-the-middle here that forwards that request.
Comment 30 Patrick Silva 2023-03-24 15:41:16 UTC
I have installed the packages available in the comment 15 of the following link and rebooted

https://bbs.archlinux.org/viewtopic.php?pid=2091705

Wake on lan is still not working.
Comment 31 Harvey 2023-03-25 15:21:57 UTC
No joy. Wake-on-LAN still not working :(
Comment 32 Takashi Iwai 2023-03-26 07:16:59 UTC
The same problem was reported for openSUSE TW 6.2.x kernel.
The suggested ACPICA change didn't fix it, while the revert makes it work again:
  https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6
Comment 33 Jianmin Lv 2023-03-27 07:48:12 UTC
I adapted the suggested ACPICA change into the following kernel patch; please try it again, thanks!

diff --git a/drivers/acpi/acpica/evevent.c b/drivers/acpi/acpica/evevent.c
index 82d1728b9bc6..9254f1d4ad06 100644
--- a/drivers/acpi/acpica/evevent.c
+++ b/drivers/acpi/acpica/evevent.c
@@ -139,15 +139,24 @@ static acpi_status acpi_ev_fixed_event_initialize(void)
                /* Disable the fixed event */

                if (acpi_gbl_fixed_event_info[i].enable_register_id != 0xFF) {
-                       status =
-                           acpi_write_bit_register(acpi_gbl_fixed_event_info
-                                                   [i].enable_register_id,
-                                                   (i ==
-                                                    ACPI_EVENT_PCIE_WAKE) ?
-                                                   ACPI_ENABLE_EVENT :
-                                                   ACPI_DISABLE_EVENT);
-                       if (ACPI_FAILURE(status)) {
-                               return (status);
+                       if (i == ACPI_EVENT_PCIE_WAKE) {
+                               if ((acpi_gbl_FADT.flags & ACPI_FADT_PCI_EXPRESS_WAKE)) {
+                                       status = acpi_write_bit_register(
+                                               acpi_gbl_fixed_event_info[i].enable_register_id,
+                                               ACPI_ENABLE_EVENT);
+
+                                       if (ACPI_FAILURE(status)) {
+                                               return (status);
+                                       }
+                               }
+                       } else {
+                               status = acpi_write_bit_register(
+                                               acpi_gbl_fixed_event_info[i].enable_register_id,
+                                               ACPI_DISABLE_EVENT);
+
+                               if (ACPI_FAILURE(status)) {
+                                       return (status);
+                               }
                        }
                }
        }
diff --git a/drivers/acpi/acpica/hwsleep.c b/drivers/acpi/acpica/hwsleep.c
index 37b3f641feaa..65672fa8dc5b 100644
--- a/drivers/acpi/acpica/hwsleep.c
+++ b/drivers/acpi/acpica/hwsleep.c
@@ -68,6 +68,15 @@ acpi_status acpi_hw_legacy_sleep(u8 sleep_state)
                return_ACPI_STATUS(status);
        }

+       /* Enable pcie wake event if support */
+       if ((acpi_gbl_FADT.flags & ACPI_FADT_PCI_EXPRESS_WAKE)) {
+               (void)
+                   acpi_write_bit_register(acpi_gbl_fixed_event_info
+                                           [ACPI_EVENT_PCIE_WAKE].
+                                           enable_register_id,
+                                           ACPI_DISABLE_EVENT);
+       }
+
        /* Get current value of PM1A control */

        status = acpi_hw_register_read(ACPI_REGISTER_PM1_CONTROL,
@@ -311,13 +320,13 @@ acpi_status acpi_hw_legacy_wake(u8 sleep_state)
                                    [ACPI_EVENT_SLEEP_BUTTON].
                                    status_register_id, ACPI_CLEAR_STATUS);

-       /* Enable pcie wake event if support */
+       /* Clear and disable pcie wake event if support */
        if ((acpi_gbl_FADT.flags & ACPI_FADT_PCI_EXPRESS_WAKE)) {
                (void)
                    acpi_write_bit_register(acpi_gbl_fixed_event_info
                                            [ACPI_EVENT_PCIE_WAKE].
                                            enable_register_id,
-                                           ACPI_DISABLE_EVENT);
+                                           ACPI_ENABLE_EVENT);
                (void)
                    acpi_write_bit_register(acpi_gbl_fixed_event_info
                                            [ACPI_EVENT_PCIE_WAKE].
Comment 34 Patrick Silva 2023-03-27 14:15:54 UTC
Tested on Arch Linux. If I turn the computer off from the SDDM login manager,
wol works after sending the magic packet twice. If I turn the computer off after login, wol works after sending magic packet once.
Comment 35 Harvey 2023-03-27 17:31:19 UTC
This version of the patch solves the wake-on-lan problem for me. No graphical UI on the server though..

Good work!

Greetings
Harvey
Comment 36 Bjorn Helgaas 2023-03-27 19:50:49 UTC
Harvey, can you clarify what "no graphical UI on the server" means?  Is that a change that happens with this patch?  Or something else unexpected?

Or is it just an observation along the lines of "there's no graphics UI on this server (and I don't expect one), and I can't turn off the computer from the SDDM login manager, so I don't know whether it fixes the problem Patrick mentioned in comment #34"?
Comment 37 Takashi Iwai 2023-03-28 06:15:59 UTC
The openSUSE TW reporter's test of the change in comment 33 was negative; it still didn't work:
  https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c9

(BTW, next time, could you use an attachment for the patch instead of pasting it into the form, unless it's a trivial one-liner or such?  Otherwise it is difficult to apply the patch cleanly.)
Comment 38 Harvey 2023-03-28 09:03:45 UTC
@Bjorn Helgaas:

It's only to say "I can't turn off the computer from the SDDM login manager, so I don't know whether it fixes the problem Patrick mentioned in comment #34"

Sorry for being unclear here.
Comment 39 Jianmin Lv 2023-03-28 10:52:14 UTC
Created attachment 304045 [details]
add condition to detect and handle pcie event
Comment 40 Jianmin Lv 2023-03-28 10:55:07 UTC
(In reply to Harvey from comment #38)
> @Bjorn Helgaas:
> 
> It's only to say "I can't turn off the computer from the SDDM login manager,
> so I don't know whether it fixes the problem Patrick mentioned in comment
> #34"
> 
> Sorry for being unclear here.

OK, thanks for the clarification. I changed the patch again (added a condition to detect and handle the PCIe event); it is attached to comment 39, please try it again.

Thanks!
Comment 41 Juha Virtanen 2023-03-28 16:40:30 UTC
(In reply to Jianmin Lv from comment #39)
> Created attachment 304045 [details]
> add condition to detect and handle pcie event

I tested this. WOL is not working, neither with a single magic packet nor with two or more magic packets.

See also https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c14
Comment 42 Jianmin Lv 2023-03-29 01:09:35 UTC
(In reply to Juha Virtanen from comment #41)
> (In reply to Jianmin Lv from comment #39)
> > Created attachment 304045 [details]
> > add condition to detect and handle pcie event
> 
> I tested this. WOL is not working. Neither with a single magic packet nor 
> two or more magic packets.
> 
> See also https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c14

Could you please attach your FACP table (/sys/firmware/acpi/tables/FACP)? I need to check whether the PCIE WAKE flag is set in it.

Thanks!
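
The flag asked about here can also be read directly from the binary table. A minimal Python sketch, assuming the standard FADT layout in which the 32-bit Flags field sits at byte offset 112 and PCI_EXP_WAK is bit 14 (the value ACPICA names ACPI_FADT_PCI_EXPRESS_WAKE). The helper names are hypothetical, and reading the sysfs file normally requires root:

```python
import struct

FADT_FLAGS_OFFSET = 112          # byte offset of the 32-bit Flags field
FADT_PCI_EXPRESS_WAKE = 1 << 14  # bit 14, ACPI_FADT_PCI_EXPRESS_WAKE in ACPICA


def fadt_has_pcie_wake(table):
    """Return True if the PCIe wake flag is set in a raw FADT ("FACP") table."""
    if table[:4] != b"FACP":
        raise ValueError("not a FADT: signature is %r" % table[:4])
    if len(table) < FADT_FLAGS_OFFSET + 4:
        raise ValueError("table too short to contain the Flags field")
    (flags,) = struct.unpack_from("<I", table, FADT_FLAGS_OFFSET)
    return bool(flags & FADT_PCI_EXPRESS_WAKE)


def read_fadt(path="/sys/firmware/acpi/tables/FACP"):
    """Read the raw table from sysfs (path as given in this comment)."""
    with open(path, "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    print(fadt_has_pcie_wake(read_fadt()))
```

The check works on raw bytes, so it can equally be run on an attached FACP dump copied from another machine.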
Comment 43 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-29 10:51:31 UTC
Jianmin Lv, it seems it takes some time to fix this properly. Should we simply revert the culprit and reapply it later once the problems are solved? Or would that create more trouble than it would solve?
Comment 44 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-29 10:52:32 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #43)
> Jianmin Lv, it seems it takes some time to fix this properly. 

Reminder, as per https://docs.kernel.org/process/handling-regressions.html this ideally should be fixed by now.
Comment 45 Juha Virtanen 2023-03-29 14:43:11 UTC
Created attachment 304052 [details]
FACP

Linux videostore 6.2.8-1-default #1 SMP PREEMPT_DYNAMIC Wed Mar 22 18:56:06 UTC 2023 (221c28f) x86_64 x86_64 x86_64 GNU/Linux

after waking machine up:
 
34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d  FACP-6.2.8-1-default



Linux videostore 6.2.8-1.g768abb4-default #1 SMP PREEMPT_DYNAMIC Tue Mar 28 15:40:59 UTC 2023 (768abb4) x86_64 x86_64 x86_64 GNU/Linux

With patch https://bugzilla.kernel.org/show_bug.cgi?id=217069#c39, right after reboot:

34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d  FACP-6.2.8-1.g768abb4-default-1

Same kernel as above, but after hibernate-wake cycle (WOL not working):

34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d  FACP-6.2.8-1.g768abb4-default-2

There are no changes in the FACP file.
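
The comparison above boils down to hashing each FACP dump and checking that the digests match. A small Python sketch of the same check, assuming the dumps have already been read into memory as bytes; the function names are hypothetical:

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of a bytes object, as sha256sum would print it."""
    return hashlib.sha256(data).hexdigest()


def all_identical(dumps):
    """Return True if every dump in a {label: bytes} mapping has the same digest."""
    digests = {sha256_of(data) for data in dumps.values()}
    return len(digests) <= 1


if __name__ == "__main__":
    example = {"before": b"FACP-example", "after": b"FACP-example"}
    print(all_identical(example))
```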
Comment 46 Jianmin Lv 2023-03-30 13:24:54 UTC
Created attachment 304058 [details]
add more strict conditions to improve compatibility
Comment 47 Jianmin Lv 2023-03-30 13:30:46 UTC
(In reply to Juha Virtanen from comment #45)
> Created attachment 304052 [details]
> FACP
> 
> Linux videostore 6.2.8-1-default #1 SMP PREEMPT_DYNAMIC Wed Mar 22 18:56:06
> UTC 2023 (221c28f) x86_64 x86_64 x86_64 GNU/Linux
> 
> after waking machine up:
>  
> 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> FACP-6.2.8-1-default
> 
> 
> 
> Linux videostore 6.2.8-1.g768abb4-default #1 SMP PREEMPT_DYNAMIC Tue Mar 28
> 15:40:59 UTC 2023 (768abb4) x86_64 x86_64 x86_64 GNU/Linux
> 
> With patch https://bugzilla.kernel.org/show_bug.cgi?id=217069#c39, right
> after reboot:
> 
> 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> FACP-6.2.8-1.g768abb4-default-1
> 
> Same kernel than above, but after hibernate-wake cycle (WOL not working):
> 
> 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> FACP-6.2.8-1.g768abb4-default-2
> 
> There is no changes in FACP file.

Ok, I found that the PCIE WAKE flag is not set in your FACP, as on all the machines I know of from feedback. I have adjusted the fix to use the flag as a condition to improve compatibility; please try it again.

Thanks!
Comment 48 Jianmin Lv 2023-03-30 13:31:28 UTC
(In reply to Jianmin Lv from comment #47)
> (In reply to Juha Virtanen from comment #45)
> > Created attachment 304052 [details]
> > FACP
> > 
> > Linux videostore 6.2.8-1-default #1 SMP PREEMPT_DYNAMIC Wed Mar 22 18:56:06
> > UTC 2023 (221c28f) x86_64 x86_64 x86_64 GNU/Linux
> > 
> > after waking machine up:
> >  
> > 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> > FACP-6.2.8-1-default
> > 
> > 
> > 
> > Linux videostore 6.2.8-1.g768abb4-default #1 SMP PREEMPT_DYNAMIC Tue Mar 28
> > 15:40:59 UTC 2023 (768abb4) x86_64 x86_64 x86_64 GNU/Linux
> > 
> > With patch https://bugzilla.kernel.org/show_bug.cgi?id=217069#c39, right
> > after reboot:
> > 
> > 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> > FACP-6.2.8-1.g768abb4-default-1
> > 
> > Same kernel than above, but after hibernate-wake cycle (WOL not working):
> > 
> > 34c0d4dd29b6cada3a75496e60f67ce216785587c2c7caffd4b17dd434f01c7d 
> > FACP-6.2.8-1.g768abb4-default-2
> > 
> > There is no changes in FACP file.
> 
> Ok, I found that the PCIE WAKE flag is not set in your FACP like all
> machines that I know from feedback. I have adjusted the  fixed patch by
> using the flag as condition to improve compatibility, please try it again.
> 
> Thanks!

See attachment 304058 [details]
Comment 49 Jianmin Lv 2023-03-30 13:32:18 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #44)
> (In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from
> comment #43)
> > Jianmin Lv, it seems it takes some time to fix this properly. 
> 
> Reminder, as per https://docs.kernel.org/process/handling-regressions.html
> this ideally should be fixed by now.

Ok, I'll submit a revert if the issue cannot be fixed by the end of the week. Some people have submitted fix patches to the ACPICA project (the Linux kernel gets transformed patches from it) that take a similar approach to my patch, but with fewer conditions. So far, those similar fixes work well on some affected machines, but not on all of them. I have checked all branches that are affected by the original patch and added more conditions to prevent machines with old firmware from being affected.

Thanks!
Comment 50 Bjorn Helgaas 2023-03-30 14:11:37 UTC
Please include spec references in your commit log so this is less of a guessing game about how different platforms behave and more of a "here's what the spec says and how we translate that into code."

There may be a few platforms that are obviously out of spec and need to be handled as exceptions, but from what I can glean from the log, the comment #48 patch fixes deficiencies in the original PCIEXP_WAKE_EN implementation.

It *sounds* like maybe it actually fixes two issues: (1) support PCIEXP_WAKE_EN/PCIEXP_WAKE_STS only if ACPI_FADT_PCI_EXPRESS_WAKE is set and (2) enable PCIe wakeup event only when entering sleep and disable it for runtime.  If so, this should be split into two patches.

Also suggest more specific subject lines because "fix pcie_wake event handling" includes no details about what is actually wrong.

If I understand correctly, the regression is from 5c62d5aab875 ("ACPICA: Events: Support fixed PCIe wake event"), which appeared in v6.2-rc1 (and v6.2, of course).  If we think we can fix it before v6.3 (we're at v6.3-rc4 now, v6.3 probably Apr 23 or 30), then I don't think it's worth reverting just to add it back.  If we *can't* fix it before v6.3, of course we should revert it.  But I don't know the path through ACPICA to Linux, maybe that is lengthy.
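
Read literally, the two rules suggested in this comment amount to a small decision table. A Python sketch of that reading only (it is not a model of the attached patch), with hypothetical phase names "init", "sleep", and "wake":

```python
def pcie_wake_enable_action(fadt_flag_set, phase):
    """Return what to do with the PCIe wake enable bit in a given phase.

    Rule 1: leave the bit alone unless the FADT advertises PCIe wake support.
    Rule 2: enable the event only when entering sleep; keep it disabled at runtime.
    Returns "enable", "disable", or None (do not touch the register).
    """
    if phase not in ("init", "sleep", "wake"):
        raise ValueError("unknown phase: %r" % phase)
    if not fadt_flag_set:
        return None
    return "enable" if phase == "sleep" else "disable"


if __name__ == "__main__":
    for flag in (False, True):
        for phase in ("init", "sleep", "wake"):
            print(flag, phase, pcie_wake_enable_action(flag, phase))
```

On the machines reported in this bug the flag is not set, so under this reading the register would never be written.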
Comment 51 Jianmin Lv 2023-03-30 14:47:37 UTC
(In reply to Bjorn Helgaas from comment #50)
> Please include spec references in your commit log so this is less of a
> guessing game about how different platforms behave and more of a "here's
> what the spec says and how we translate that into code."
> 
Ok, if the fix works well, I'll add more spec references to the commit log of the patch to be submitted.

> There may be a few platforms that are obviously out of spec and need to be
> handled as exceptions, but from what I can glean from the log, the comment
> #48 patch fixes deficiencies in the original PCIEXP_WAKE_EN implementation.
> 
Right.

> It *sounds* like maybe it actually fixes two issues: (1) support
> PCIEXP_WAKE_EN/PCIEXP_WAKE_STS only if ACPI_FADT_PCI_EXPRESS_WAKE is set and
> (2) enable PCIe wakeup event only when entering sleep and disable it for
> runtime.  If so, this should be split into two patches.
> 

Ok, if the patch works well, I'll split it into separate ones.

> Also suggest more specific subject lines because "fix pcie_wake event
> handling" includes no details about what is actually wrong.
> 
Ok.

> If I understand correctly, the regression is from 5c62d5aab875 ("ACPICA:
> Events: Support fixed PCIe wake event"), which appeared in v6.2-rc1 (and
> v6.2, of course).  If we think we can fix it before v6.3 (we're at v6.3-rc4
> now, v6.3 probably Apr 23 or 30), then I don't think it's worth reverting
> just to add it back.  If we *can't* fix it before v6.3, of course we should
> revert it.  But I don't know the path through ACPICA to Linux, maybe that is
> lengthy.

Agreed, I'm trying my best to fix it and really appreciate any testing and comments. AFAIK, patches to the ACPI subsystem should be submitted to ACPICA, and its maintainer will port merged patches to the Linux kernel.
Comment 52 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-03-30 14:53:37 UTC
(In reply to Bjorn Helgaas from comment #50)
>
> If I understand correctly, the regression is from 5c62d5aab875 ("ACPICA:
> Events: Support fixed PCIe wake event"), which appeared in v6.2-rc1 (and
> v6.2, of course).  If we think we can fix it before v6.3 (we're at v6.3-rc4
> now, v6.3 probably Apr 23 or 30), then I don't think it's worth reverting
> just to add it back. 

From my point of view I strongly disagree, as there is a simple and strong reason for a revert in mainline if a proper fix doesn't materialize within the next few days[1]: it's needed to get this fixed in 6.2.y sooner rather than later, as the stable-kernel rules forbid a 6.2-only revert. Hence, to get it fixed there, the culprit needs to be reverted in mainline first and the revert then backported. That's why Documentation/process/handling-regressions.rst / https://docs.kernel.org/process/handling-regressions.html strongly suggests a revert in this case.

[1] with a bit of luck we'll have one thx to Jianmin Lv; thx for your work!
Comment 53 Juha Virtanen 2023-03-30 16:33:10 UTC
(In reply to Jianmin Lv from comment #46)
> Created attachment 304058 [details]
> add more strict conditions to improve compatibility

I tested this patch with help from Takashi Iwai building test kernel.

WOL is still not working. There was no change in the FACP file.
Comment 54 Juha Virtanen 2023-03-30 17:48:18 UTC
(In reply to Juha Virtanen from comment #53)
> (In reply to Jianmin Lv from comment #46)
> > Created attachment 304058 [details]
> > add more strict conditions to improve compatibility
> 
> I tested this patch with help from Takashi Iwai building test kernel.
> 
> WOL is still not working. There was no change in the FACP file.

Some more information. I decided to retest the kernel with the culprit reverted (see https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6) to see if WOL still works with it. WOL failed to work with it until I cold-rebooted that machine. I usually use kexec to reboot, to avoid attaching a display and keyboard to enter the LUKS password. This setup had worked for years until kernel 6.2.0.

After a cold reboot the kernel with the revert worked again. WOL works.

Then I retested the patched kernel from comment #53 by rebooting to it with kexec. WOL works with it now.

I have also grabbed the FACP file several times during all this testing. There have not been any changes in it at all.
Comment 55 Jianmin Lv 2023-03-31 01:28:22 UTC
(In reply to Juha Virtanen from comment #54)
> (In reply to Juha Virtanen from comment #53)
> > (In reply to Jianmin Lv from comment #46)
> > > Created attachment 304058 [details]
> > > add more strict conditions to improve compatibility
> > 
> > I tested this patch with help from Takashi Iwai building test kernel.
> > 
> > WOL is still not working. There was no change in the FACP file.
> 
> Some more information. I decided to retest reverted patch (see
> https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6) to see, if WOL
> still works with it. WOL failed to work with it unless I cold-rebooted that
> machine. I usually use kexec to reboot to avoid attaching display and
> keyboard to enter LUKS password. This setup has worked years until kernel
> 6.2.0.
> 
> After cold reboot reverted patch worked again. WOL works.
> 
> Then I retested Comment #53 patched kernel by rebooting to it with kexec.
> WOL works with it now.
> 
> I have also grabbed FACP file several times along all this testing. There
> has not been any changes in it at all.

So, I understand that the patch in attachment 304058 [details] works well in your case. Maybe the one in attachment 304045 [details] works well too, and so does the one in comment 33. Could you please retest both 304045 and the patch from comment 33 on your machine? Thanks very much!
Comment 56 Jianmin Lv 2023-03-31 01:32:53 UTC
(In reply to Juha Virtanen from comment #54)
> (In reply to Juha Virtanen from comment #53)
> > (In reply to Jianmin Lv from comment #46)
> > > Created attachment 304058 [details]
> > > add more strict conditions to improve compatibility
> > 
> > I tested this patch with help from Takashi Iwai building test kernel.
> > 
> > WOL is still not working. There was no change in the FACP file.
> 
> Some more information. I decided to retest reverted patch (see
> https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6) to see, if WOL
> still works with it. WOL failed to work with it unless I cold-rebooted that
> machine. I usually use kexec to reboot to avoid attaching display and
> keyboard to enter LUKS password. This setup has worked years until kernel
> 6.2.0.
> 
> After cold reboot reverted patch worked again. WOL works.
> 
> Then I retested Comment #53 patched kernel by rebooting to it with kexec.
> WOL works with it now.
> 
> I have also grabbed FACP file several times along all this testing. There
> has not been any changes in it at all.

BTW, there is no need to check FACP any more. It is a constant table from the BIOS, and I just wanted to be sure that the PCIE WAKE flag is not set in it.
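
For anyone who wants to check that flag on their own dump, here is a minimal sketch. It assumes the flag meant above is PCI_EXP_WAK, which the ACPI specification defines as bit 14 of the 32-bit FADT "Flags" field at byte offset 112. The helper name pcie_wake_flag_set is made up for illustration, and the sysfs path in the usage note is where Linux normally exposes the table (reading it usually requires root).

```python
import struct

# Assumption: "PCIE WAKE flag" refers to PCI_EXP_WAK in the FADT fixed
# feature flags. Per the ACPI spec, "Flags" is a little-endian 32-bit
# field at byte offset 112, and PCI_EXP_WAK is bit 14.
FADT_FLAGS_OFFSET = 112
FADT_PCI_EXP_WAK = 1 << 14


def pcie_wake_flag_set(table: bytes) -> bool:
    """Return True if PCI_EXP_WAK is set in a raw FADT ("FACP") table dump."""
    if table[:4] != b"FACP":
        raise ValueError("not a FADT: table signature is not 'FACP'")
    if len(table) < FADT_FLAGS_OFFSET + 4:
        raise ValueError("table is too short to contain the Flags field")
    (flags,) = struct.unpack_from("<I", table, FADT_FLAGS_OFFSET)
    return bool(flags & FADT_PCI_EXP_WAK)
```

Possible usage, on a machine where the table is readable: open("/sys/firmware/acpi/tables/FACP", "rb"), read the bytes, and pass them to pcie_wake_flag_set(). Because the table comes from the BIOS and does not change between boots, one check per machine is enough.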
Comment 57 Juha Virtanen 2023-03-31 02:03:00 UTC
(In reply to Jianmin Lv from comment #56)
> (In reply to Juha Virtanen from comment #54)
> > Some more information. I decided to retest reverted patch (see
> > https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6) to see, if WOL
> > still works with it. WOL failed to work with it unless I cold-rebooted that
> > machine. I usually use kexec to reboot to avoid attaching display and
> > keyboard to enter LUKS password. This setup has worked years until kernel
> > 6.2.0.
> > 
> > After cold reboot reverted patch worked again. WOL works.
> > 
> > Then I retested Comment #53 patched kernel by rebooting to it with kexec.
> > WOL works with it now.
> > 
> > I have also grabbed FACP file several times along all this testing. There
> > has not been any changes in it at all.
> 
> BTW, no need to check FACP any more, which is a constant table from BIOS,
> and I just want to be sure that PCIE WAKE flag is not set in it.

I can retest those next week.
Comment 58 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-04 13:40:34 UTC
Was there any progress in getting this sorted out? This has now been broken in 6.2.y for six weeks and, due to the stable-kernel rules, can only be fixed there once it has been fixed in mainline. That's why I think a quick revert is the best solution, unless it creates trouble for other users. Even for mainline it should be fixed now, as we are post rc5; hence, per https://docs.kernel.org/process/handling-regressions.html, the revert should go in "to ensure all the improvements and fixes are ideally tested together for at least one week before Linus releases a new mainline version".
Comment 59 Tommy Giesler 2023-04-05 09:34:28 UTC
My colleagues and I ran tests in our environment with roughly 40 different mainboard models and different network chips (incl. Intel i219, i210, several Realtek 1G + 2.5G chips). When using 6.2.9 with Jianmin's latest patch applied, WOL works again for all of them.

When using a Kernel 6.2.1 without this patch, none of them were working.

I hope this information helps. If you need detailed information which chips we tested, let me know.
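
For readers who want to repeat this kind of test on their own hardware, here is a minimal sketch of a Wake-on-LAN sender. It is not the tooling used for the tests above. The magic packet format (6 bytes of 0xFF followed by the target MAC address repeated 16 times) is the standard one; the function names, the default broadcast address, and UDP port 9 are choices made for this example.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a WOL magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain exactly 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address/port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Calling send_magic_packet() with the target's MAC address from another host on the same LAN, after the target was powered off or hibernated, should wake it if WOL is working. On a network with several subnets the broadcast address may need to be the subnet's directed broadcast instead.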
Comment 60 Juha Virtanen 2023-04-05 15:05:16 UTC
(In reply to Juha Virtanen from comment #57)
> (In reply to Jianmin Lv from comment #56)
> > (In reply to Juha Virtanen from comment #54)
> > > Some more information. I decided to retest reverted patch (see
> > > https://bugzilla.opensuse.org/show_bug.cgi?id=1209526#c6) to see, if WOL
> > > still works with it. WOL failed to work with it unless I cold-rebooted that
> > > machine. I usually use kexec to reboot to avoid attaching display and
> > > keyboard to enter LUKS password. This setup has worked years until kernel
> > > 6.2.0.
> > > 
> > > After cold reboot reverted patch worked again. WOL works.
> > > 
> > > Then I retested Comment #53 patched kernel by rebooting to it with kexec.
> > > WOL works with it now.
> > > 
> > > I have also grabbed FACP file several times along all this testing. There
> > > has not been any changes in it at all.
> > 
> > BTW, no need to check FACP any more, which is a constant table from BIOS,
> > and I just want to be sure that PCIE WAKE flag is not set in it.
> 
> I can retest those next week.

It appears that I can no longer test those other patches, as the distribution kernel in openSUSE Tumbleweed is now 6.2.9 and the older patched kernels are based on 6.2.8. (Or maybe I could, but I would need to study how to do it.)

With 
Linux <host> 6.2.9-1.gda795ff-default #1 SMP PREEMPT_DYNAMIC Thu Mar 30 15:01:35 UTC 2023 (da795ff) x86_64 x86_64 x86_64 GNU/Linux

WOL works. With openSUSE Tumbleweed's 6.2.9 kernel WOL fails, which was expected.

Linux <host> 6.2.9-1-default #1 SMP PREEMPT_DYNAMIC Thu Mar 30 11:30:50 UTC 2023 (7a187a3) x86_64 x86_64 x86_64 GNU/Linux

After switching back to

Linux <host> 6.2.9-1.gda795ff-default #1 SMP PREEMPT_DYNAMIC Thu Mar 30 15:01:35 UTC 2023 (da795ff) x86_64 x86_64 x86_64 GNU/Linux

by kexec reboot, WOL fails. After doing a cold reboot, WOL works again.
Comment 61 Juha Virtanen 2023-04-07 11:03:12 UTC
(In reply to Jianmin Lv from comment #46)
> Created attachment 304058 [details]
> add more strict conditions to improve compatibility

How will this WOL regression bug be fixed? Will Lv's patch be the patch to fix this issue?
Comment 62 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-16 12:57:24 UTC
Jianmin Lv posted an update; the fix is now under review:
https://lore.kernel.org/all/754225a2-95a9-2c36-1886-7da1a78308c2@loongson.cn/

To those of you who have problems with WOL when doing kexec: is this a kernel regression as well? If yes, please open a separate ticket and mention it here. Ideally, also perform a bisection to find the change that is causing it, as without that the issue might only get fixed if we are lucky.
Comment 63 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-22 04:11:31 UTC
Linus had enough: the culprit is now reverted in mainline, and the revert is marked for backporting to 6.2:

https://git.kernel.org/torvalds/c/8e41e0a575664d26bb87e012c39435c4c3914ed9

It should fix this issue, but it would nevertheless be good if someone could confirm this.
Comment 64 Juha Virtanen 2023-04-26 16:42:07 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #63)
> Linus had enough: the culprit is now reverted in mainline, and the revert is
> marked for backporting to 6.2:
> 
> https://git.kernel.org/torvalds/c/8e41e0a575664d26bb87e012c39435c4c3914ed9
> 
> It should fix this issue, but nevertheless would be good if someone could
> confirm this.

Should this already be reverted in 6.2.12?

openSUSE Tumbleweed distribution kernel:

Linux videostore 6.2.12-1-default #1 SMP PREEMPT_DYNAMIC Thu Apr 20 11:01:10 UTC 2023 (eb3255d) x86_64 x86_64 x86_64 GNU/Linux

Anyway, WOL is broken for me with 6.2.12 after cold reboot.
Comment 65 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-26 17:37:13 UTC
(In reply to Juha Virtanen from comment #64)

> WOL is broken for me with 6.2.12

That's expected, as it was only reverted for 6.2.13 (after it had been reverted for 6.3).
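
As a rough illustration of the version ranges discussed in this report, the sketch below classifies a `uname -r` string. It assumes only what the thread states: the regression appeared in 6.2 and the revert landed in 6.2.13 and 6.3. The function name is made up, release candidates are not handled, and distribution kernels that carry extra patches (such as the test kernels mentioned earlier) can behave differently from what the version number suggests.

```python
import re


def wol_regression_expected(release: str) -> bool:
    """Guess from a `uname -r` string whether the kernel has the WOL regression.

    Assumption taken from this report: affected are 6.2.0 through 6.2.12;
    6.2.13, 6.3 and later contain the revert; 6.1.x and older never had
    the culprit commit. Vendor patches are not taken into account.
    """
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release string: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    return (major, minor) == (6, 2) and patch < 13
```

For example, the release string reported in comment 64, "6.2.12-1-default", is classified as affected, while "6.2.13" and "6.3.1" are not.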
Comment 66 Patrick Silva 2023-04-28 18:23:16 UTC
I have just updated to 6.2.13 on Arch Linux and WOL is working again.
Comment 67 Natrio 2023-04-29 10:06:10 UTC
On my hardware (Comment #19) WOL is working again with the 6.2.13 and 6.3 kernels.
Comment 68 Ivan Ivanich 2023-05-01 20:28:49 UTC
I can also confirm that with the 6.3 kernel the issue is finally resolved. It's a shame that it took 12 minor revisions of the 6.2 kernel just to revert the faulty commit. Closing.
Comment 69 Juha Virtanen 2023-05-04 15:59:07 UTC
I can also confirm that TW kernel 6.3.1 fixed this issue for me.
Comment 70 Tommy Giesler 2023-05-05 05:53:00 UTC
Same from my/our side. All systems are working as expected running 6.3.1.