Bug 218237 - performance regression in ASIX AX88179 USB ethernet adapter adds 400ms to resume
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P3 normal
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks: 178231
Reported: 2023-12-06 21:01 UTC by Todd Brandt
Modified: 2024-07-03 22:04 UTC

See Also:
Kernel Version: 6.7.0-rc3
Subsystem:
Regression: Yes
Bisected commit-id: 0739af07d1d947af27c877f797cb82ceee702515


Attachments
thinkpad-x1_s2idle_6.7-rc2.html (667.90 KB, text/html)
2023-12-06 21:01 UTC, Todd Brandt
thinkpad-x1_s2idle_6.7-rc3.html (678.24 KB, text/html)
2023-12-06 21:02 UTC, Todd Brandt
ASIX-vs-Realtek-resume-performance.png (53.45 KB, image/png)
2024-07-03 21:40 UTC, Todd Brandt
otcpl-dell-9320-rpl_freeze-ASIX-AX88179-USB-dongle.html (921.26 KB, text/html)
2024-07-03 21:42 UTC, Todd Brandt
otcpl-dell-9320-rpl_freeze-Realtek-RTL8153-USB-dongle.html (796.93 KB, text/html)
2024-07-03 21:44 UTC, Todd Brandt

Description Todd Brandt 2023-12-06 21:01:35 UTC
Created attachment 305552 [details]
thinkpad-x1_s2idle_6.7-rc2.html

Apparently a fix to the AX88179 USB ethernet dongle in 6.7-rc3 adds a full 400ms to S2idle and S3 resume. I understand that full functionality should come before performance, but is it really necessary to add almost half a second to resume on all systems that use this device? Is there a better way of fixing this, or is this device really that broken?

The commit and diff are below, and I've attached the before and after effects in two sleepgraph timelines from a thinkpad x1 that uses the dongle.

commit 0739af07d1d947af27c877f797cb82ceee702515
Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Date:   Mon Nov 20 13:06:29 2023 +0100

    net: usb: ax88179_178a: fix failed operations during ax88179_reset


diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index aff39bf3161d..4ea0e155bb0d 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1583,11 +1583,11 @@ static int ax88179_reset(struct usbnet *dev)
 
        *tmp16 = AX_PHYPWR_RSTCTL_IPRL;
        ax88179_write_cmd(dev, AX_ACCESS_MAC, AX_PHYPWR_RSTCTL, 2, 2, tmp16);
-       msleep(200);
+       msleep(500);
 
        *tmp = AX_CLK_SELECT_ACS | AX_CLK_SELECT_BCS;
        ax88179_write_cmd(dev, AX_ACCESS_MAC, AX_CLK_SELECT, 1, 1, tmp);
-       msleep(100);
+       msleep(200);
 
        /* Ethernet PHY Auto Detach*/
        ax88179_auto_detach(dev);
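
For reference, the 400ms figure follows directly from the two changed sleeps in the diff above. A minimal sketch of the arithmetic, written as plain userspace C rather than driver code; the array and function names are made up for illustration, and only the four delay values come from the diff:

```c
#include <assert.h>
#include <stddef.h>

/* msleep() arguments from the diff, in milliseconds (illustrative names). */
static const unsigned int old_delays_ms[] = { 200, 100 };
static const unsigned int new_delays_ms[] = { 500, 200 };

/* Sum a list of sleep durations given in milliseconds. */
static unsigned int total_ms(const unsigned int *delays, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	assert(delays != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += delays[i];
	return sum;
}
```

Calling total_ms() on each array gives 300ms for the old sleeps and 700ms for the new ones, a difference of 400ms per reset.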
Comment 1 Todd Brandt 2023-12-06 21:02:21 UTC
Created attachment 305553 [details]
thinkpad-x1_s2idle_6.7-rc3.html

The "after" timeline, showing the additional 400ms of resume time.
Comment 2 José Ignacio Tornos Martínez 2023-12-07 08:30:03 UTC
Yes, it is necessary because of the issue described in the commit message.
At the moment there is no further information about how to do this in a better way. Indeed, the manufacturer's driver also uses sleeps.
As you say, full functionality should come before performance; otherwise the device may not be initialized properly.
Comment 3 Len Brown 2023-12-09 03:05:36 UTC
How do reliability and speed compare if the driver is unloaded on suspend
and loaded on resume?
Comment 4 Len Brown 2023-12-09 03:07:03 UTC
If this situation persists, then we need to replace all of the AX**
devices from our lab -- because they'll mask useful testing of other devices...
Comment 5 José Ignacio Tornos Martínez 2023-12-09 17:35:35 UTC
Realize that reset operation is needed as well when the device is resumed, and in this way we avoid possible problems. 
I guess it is due to your scenario or the way you test, but I do not understand how it can mask useful testing of other devices if all the drivers/devices in the machine are resumed.
Comment 6 José Ignacio Tornos Martínez 2023-12-10 09:38:09 UTC
Anyway, let me try to confirm whether the reset operation is strictly necessary when this device is resumed.
Comment 7 Todd Brandt 2024-01-19 19:05:50 UTC
(In reply to José Ignacio Tornos Martínez from comment #5)
> Realize that reset operation is needed as well when the device is resumed,
> and in this way we avoid possible problems. 
> I guess it is due to your scenario or the way you test, but I do not
> understand how it can mask useful testing of other devices if all the
> drivers/devices in the machine are resumed.

Len just means it affects our ability to see at a glance other devices' effects on total resume time. Any machine with this dongle will have a constant extra 400ms in total resume that masks performance issues in any other devices that run in parallel.
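
The masking effect can be sketched with a deliberately simplified model: if devices resume in parallel, the phase lasts roughly as long as the slowest one. This is an assumption for illustration, not how sleepgraph computes its totals, and the function name and all durations used with it are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model (an assumption): devices resume in parallel, so the
 * phase takes as long as the slowest device. Durations are in milliseconds.
 */
static unsigned int parallel_resume_ms(const unsigned int *dev_ms, size_t n)
{
	unsigned int longest = 0;
	size_t i;

	assert(dev_ms != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (dev_ms[i] > longest)
			longest = dev_ms[i];
	}
	return longest;
}
```

With a hypothetical 700ms dongle in the set, another device slowing down from 200ms to 500ms leaves the result at 700ms in this model, so the regression goes unnoticed. Without the dongle, the same change moves the result from 200ms to 500ms.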
Comment 8 José Ignacio Tornos Martínez 2024-03-25 09:17:57 UTC
The reset is strictly necessary, and without further information I think it is better to keep the current delays to avoid the problems mentioned.
Comment 9 Todd Brandt 2024-07-03 21:40:43 UTC
Created attachment 306527 [details]
ASIX-vs-Realtek-resume-performance.png

See the massive difference in resume time in 6.10.0-rc5 and 6.10.0-rc6 as a result of switching from ASIX to Realtek. The original ASIX regression appeared upstream in 6.8.0.
Comment 10 Todd Brandt 2024-07-03 21:42:13 UTC
Created attachment 306528 [details]
otcpl-dell-9320-rpl_freeze-ASIX-AX88179-USB-dongle.html

Dell 9320 S2idle resume with the ASIX AX88179 dongle, lots of msleeps.
Comment 11 Todd Brandt 2024-07-03 21:44:33 UTC
Created attachment 306529 [details]
otcpl-dell-9320-rpl_freeze-Realtek-RTL8153-USB-dongle.html

Dell 9320 S2idle resume with the Realtek RTL8153 dongle. Notice an improvement in resume time of 540ms!
Comment 12 Todd Brandt 2024-07-03 21:56:29 UTC
I'm happy to say that we've just thrown out all 31 of our ASIX AX88??? USB ethernet dongles and have replaced them all with Realtek RTL8153 USB ethernet dongles. The Realtek dongles have caused a tremendous resume time drop of up to 600ms in 7 of our machines and removed a major barrier to our performance analysis.

I've included an image showing the median resume time for the Dell 9320 over the past year, and you can see the massive 540ms improvement in performance from the switchover to the RTL8153 in 6.10.0-rc5. The machine has never run faster. The kernel regression that caused this in the ASIX dongle appeared upstream in 6.8.0, and you can see the huge increase in resume time.

I've included two timelines demonstrating just how much better the Realtek RTL8153 is over the ASIX AX88179. I STRONGLY recommend to anyone needing a USB-ethernet dongle: if you care about performance, choose the Realtek version or at least avoid the ASIX version if at all possible.

When you buy on Amazon, be sure to check the stats; some dongle manufacturers will stick ASIX chips in there and not tell you. Our personal pick is the USB-A Uni RJ45. Don't buy the USB-C version of the Uni RJ45, as they use the ASIX chipset in that one for some reason (probably because it's cheaper).
