Bug 214933

Summary: [Regression] Commit 69a74aef8a18eef20fb0044b5e164af41b84db21 "e100: use generic power management" - breaks suspend/resume
Product: Drivers
Reporter: Alexey Kuznetsov (axet)
Component: Network
Assignee: Default virtual assignee for network-wireless-intel (drivers_network-wireless-intel)
Status: RESOLVED CODE_FIX
Severity: normal
CC: anthony.l.nguyen, jacob.e.keller, jbrandeb, kubakici, linuxwifi
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.10.70
Subsystem:
Regression: No
Bisected commit-id:
Attachments: proposed fix, only compile tested
proposed fix, only compile tested

Description Alexey Kuznetsov 2021-11-03 13:33:37 UTC
Hello!

My notebook is unable to suspend / resume due to a bug in the kernel. Through a series of 'git bisect' rebuilds I tracked it down to the offending commit.

To make a recent kernel (I'm using 5.10) work, I had to revert:

69a74aef8a18eef20fb0044b5e164af41b84db21

This follows my original bug report against the Debian kernel (no one has responded):

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982956
Comment 1 Alexey Kuznetsov 2021-11-04 18:21:01 UTC
Another person is reporting the same hardware issue, related to the same commit:

  * https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995927
Comment 2 Artem S. Tashkinov 2021-11-05 11:21:57 UTC
CC'ing the relevant people.

Commit 69a74aef8a18eef20fb0044b5e164af41b84db21 "e100: use generic power management" breaks suspend/resume for some configurations.
Comment 3 Jakub Kicinski 2021-11-05 13:49:12 UTC
Not sure how this got assigned to Intel WiFi. Adding Tony from Intel Ethernet as well.
If the issue does not get addressed, please just send a revert to the ML.
Comment 4 Alexey Kuznetsov 2021-11-05 13:52:39 UTC
My fault. My first assumption was that the suspend / resume freeze was related to my ipw2100, so I assigned it to WiFi by mistake. It is now clear that this is 100% an e100 issue.
Comment 5 Jesse Brandeburg 2021-11-10 01:19:42 UTC
Hi Alexey! Thanks for following up on your report for this. I looked into this code, and it seems patently wrong.

From the description of the problem, I suspect the device is being left in D3 after the resume, so all operations time out.

As for a fix, I'll attach a proposed patch here, but I don't think I have an e100 lying around any more to test.
Comment 6 Jesse Brandeburg 2021-11-10 02:07:11 UTC
Created attachment 299499
proposed fix, only compile tested

Hi, if you can, please give this patch a try, and forgive me if it doesn't quite work right, as I can't test it easily.

Feedback welcome!
Comment 7 Jesse Brandeburg 2021-11-10 02:31:05 UTC
Created attachment 299501
proposed fix, only compile tested

This change is more likely to work, since we're thinking the device likely isn't enabled correctly (probably the missing pci_set_master() call is the whole bug).
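
The failure mode suspected in this thread (a device that comes back from suspend without being powered up and re-enabled, so every operation times out) can be illustrated with a small userspace model. This is only a sketch: every name in it (model_nic, model_suspend, model_resume_incomplete, model_resume_fixed, model_can_do_dma) is hypothetical, it does not reproduce the attached patch, and it merely assumes that the resume path is responsible for the steps the kernel performs with pci_enable_device() and pci_set_master().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_power_state { MODEL_D0, MODEL_D3HOT };

struct model_nic {
    enum model_power_state power; /* modelled PCI power state */
    bool enabled;                 /* stands in for the effect of pci_enable_device() */
    bool bus_master;              /* stands in for the effect of pci_set_master() */
};

/* Suspend: stop DMA, disable the device, drop to a low-power state. */
void model_suspend(struct model_nic *nic)
{
    assert(nic != NULL);
    nic->bus_master = false;
    nic->enabled = false;
    nic->power = MODEL_D3HOT;
}

/* Assumed broken resume: nothing is restored, so the device stays
 * exactly as suspend left it. */
void model_resume_incomplete(struct model_nic *nic)
{
    assert(nic != NULL);
    /* intentionally empty */
}

/* Assumed fixed resume: power up, re-enable, then restore bus mastering. */
void model_resume_fixed(struct model_nic *nic)
{
    assert(nic != NULL);
    nic->power = MODEL_D0;
    nic->enabled = true;
    nic->bus_master = true;
}

/* The device can only do DMA (and so complete commands) when all
 * three conditions hold. */
bool model_can_do_dma(const struct model_nic *nic)
{
    assert(nic != NULL);
    return nic->power == MODEL_D0 && nic->enabled && nic->bus_master;
}
```

In this model, calling model_suspend() followed by model_resume_incomplete() leaves model_can_do_dma() false, which corresponds to the timeouts described above; model_suspend() followed by model_resume_fixed() makes it true again.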
Comment 8 Alexey Kuznetsov 2021-11-10 07:39:53 UTC
Thanks, Jesse! It works! I haven't tested the Ethernet connection itself, since I'm using WiFi on this machine. But the interface comes up, and suspend / resume works fine without the "task blocked for..." kernel spam!

I've tested v5.14.16 + Debian patches + [299501 patch]
Comment 10 Jesse Brandeburg 2021-11-30 01:05:11 UTC
Patch accepted upstream:
https://lore.kernel.org/netdev/163723620924.17258.12932119103111984410.git-patchwork-notify@kernel.org/

I have no way to set this bug to resolved in bugzilla, can someone help?