Bug 214933 - [Regression] Commit 69a74aef8a18eef20fb0044b5e164af41b84db21 "e100: use generic power management" - breaks suspend/resume
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: Default virtual assignee for network-wireless-intel
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-03 13:33 UTC by Alexey Kuznetsov
Modified: 2021-11-30 07:25 UTC
CC List: 5 users

See Also:
Kernel Version: 5.10.70
Tree: Mainline
Regression: No


Attachments
proposed fix, only compile tested (2.49 KB, patch)
2021-11-10 02:07 UTC, Jesse Brandeburg
proposed fix, only compile tested (2.53 KB, patch)
2021-11-10 02:31 UTC, Jesse Brandeburg

Description Alexey Kuznetsov 2021-11-03 13:33:37 UTC
Hello!

My notebook is unable to suspend/resume due to a bug in the kernel. Through a series of 'git bisect' rebuilds I tracked it down to the offending commit.

To make a recent kernel (I'm using 5.10) work, I had to revert:

69a74aef8a18eef20fb0044b5e164af41b84db21

Here is my original bug report against the Debian kernel (no one has responded):

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982956
Comment 1 Alexey Kuznetsov 2021-11-04 18:21:01 UTC
Another person is reporting the same hardware issues related to the same commit:

  * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995927
Comment 2 Artem S. Tashkinov 2021-11-05 11:21:57 UTC
CC'ing the relevant people.

Commit 69a74aef8a18eef20fb0044b5e164af41b84db21 "e100: use generic power management" breaks suspend/resume for some configurations.
Comment 3 Jakub Kicinski 2021-11-05 13:49:12 UTC
Not sure how this got assigned to Intel WiFi. Adding Tony from Intel Ethernet as well.
If the issue does not get addressed please just send a revert to the ML.
Comment 4 Alexey Kuznetsov 2021-11-05 13:52:39 UTC
My fault. My first assumption was that the suspend/resume freeze was related to my ipw2100, so I assigned it to wifi by mistake. It is now clear that this is 100% an e100 issue.
Comment 5 Jesse Brandeburg 2021-11-10 01:19:42 UTC
Hi Alexey! Thanks for following up on your report for this. I looked into this code, and it seems patently wrong.

From the description of the problem I suspect the device is being left in D3 after the resume, and timing out all operations.

As for a fix, I'll attach a proposed patch here, but I don't think I have an e100 lying around any more to test.
Comment 6 Jesse Brandeburg 2021-11-10 02:07:11 UTC
Created attachment 299499
proposed fix, only compile tested

Hi, if you can, please give this patch a try, and forgive me if it doesn't quite work right, as I can't easily test it.

Feedback welcome!
Comment 7 Jesse Brandeburg 2021-11-10 02:31:05 UTC
Created attachment 299501
proposed fix, only compile tested

This change is more likely to work, since we're thinking the device likely isn't enabled correctly (probably the pci_enable_master() is the whole bug)
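
The failure mode discussed in this thread (a device left in D3, or without bus mastering, after resume) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not the attached patch and not e100 driver code, and every name in it (toy_nic, toy_suspend, toy_resume_broken, toy_resume_fixed, toy_exec_cmd) is hypothetical. In a real driver the corresponding steps would go through kernel PCI helpers such as pci_enable_device() and pci_set_master(), which this model does not call.

```c
#include <stdbool.h>

/* Toy userspace model of a PCI NIC. All names are hypothetical
 * illustrations, not kernel APIs and not the e100 driver. */
enum toy_power_state { TOY_D0, TOY_D3 };

struct toy_nic {
    enum toy_power_state power; /* TOY_D0 = fully on, TOY_D3 = powered down */
    bool bus_master;            /* true if the device may initiate DMA */
};

/* Suspend: bus mastering is turned off and the device enters D3. */
void toy_suspend(struct toy_nic *nic)
{
    nic->bus_master = false;
    nic->power = TOY_D3;
}

/* Broken resume: restores nothing, so the device stays in D3
 * with bus mastering disabled. */
void toy_resume_broken(struct toy_nic *nic)
{
    (void)nic;
}

/* Fixed resume: bring the device back to D0 and re-enable bus
 * mastering before the driver touches it again. */
void toy_resume_fixed(struct toy_nic *nic)
{
    nic->power = TOY_D0;
    nic->bus_master = true;
}

/* A command completes only if the device is powered and can do DMA.
 * Returns 0 on success, -1 to model a timeout. */
int toy_exec_cmd(const struct toy_nic *nic)
{
    if (nic->power != TOY_D0 || !nic->bus_master)
        return -1;
    return 0;
}
```

In this model, toy_suspend() followed by toy_resume_broken() leaves toy_exec_cmd() returning -1 (the modeled timeout), while toy_suspend() followed by toy_resume_fixed() makes it return 0.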
Comment 8 Alexey Kuznetsov 2021-11-10 07:39:53 UTC
Thanks, Jesse! It works! I haven't tested the Ethernet connection itself, since I'm using wifi on this machine. But the interface comes up, and suspend/resume works fine without the "task blocked for..." kernel spam!

I've tested v5.14.16 + debian patches + [299501 patch]
Comment 10 Jesse Brandeburg 2021-11-30 01:05:11 UTC
Patch accepted upstream:
https://lore.kernel.org/netdev/163723620924.17258.12932119103111984410.git-patchwork-notify@kernel.org/

I have no way to set this bug to resolved in Bugzilla. Can someone help?
