Bug 10751

Summary: pciehp probing slow, duplicate kobject_add
Product: Drivers Reporter: Jan C. Nordholz (jckn)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: CLOSED CODE_FIX    
Severity: low CC: akpm, bunk, kaneshige.kenji
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.26-rc1, 2.6.26-rc2, 2.6.26-rc3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 10492    
Attachments: Dmesg output (2.6.25)
Dmesg output (2.6.26-rc3)
For "pciehp probing slow" problem (1/2)
For "pciehp probing slow" problem (2/2)
For "duplecate kobject_add" problem
Dmesg output (2.6.26+rc3 plus previous three patches)
For "pciehp probing slow" problem (additional)
dmesg output (2.6.26-rc3 with all patches and pciehp_debug)
Patch (1/3) for "pciehp probing slot" problem
Patch (2/3) for "pciehp probing slot" problem
Patch (3/3) for "pciehp probing slot" problem
patch for test
dmesg output (2.6.26-rc3 with additional patch)

Description Jan C. Nordholz 2008-05-19 14:26:41 UTC
Latest working kernel version: 2.6.25
Earliest failing kernel version: 2.6.26-rc1
Distribution: Debian
Hardware Environment: Lenovo Thinkpad R61 (8943-DLG)
Software Environment: 
Problem Description: Since 26rc1, there is a significant delay during pciehp initialization (enumerating the pciehp devices takes about one second each), and it tries to create a sysfs entry twice.

(What's more, my iwl3945 has stopped working with 2.6.26, but as it's not a PCIE device I believe the two issues are orthogonal... correct me if I'm wrong.)

Attaching two dmesgs (2.6.25 & 2.6.26-rc3).
Comment 1 Jan C. Nordholz 2008-05-19 14:29:21 UTC
Created attachment 16203 [details]
Dmesg output (2.6.25)
Comment 2 Jan C. Nordholz 2008-05-19 14:29:56 UTC
Created attachment 16204 [details]
Dmesg output (2.6.26-rc3)
Comment 3 Andrew Morton 2008-05-19 14:47:21 UTC
Thanks.  Could you please raise a separate report for the
iwl3945 regression?
Comment 4 Jan C. Nordholz 2008-05-19 15:57:17 UTC
Sure - reported as #10755.
Comment 5 Kenji Kaneshige 2008-05-19 23:38:11 UTC
I made three patches. Two of them are "pciehp probing slow" problem and one of
them is for "duplicate kobject_add" problem. These patches are against 
2.6.26-rc3. I'll attach these patches. Could you please try those and report
the result?

- For "pciehp probing slow" problem
  (1/2) pciehp-poll-cmd-complete-when-hp-interrupt-is-disabled.patch
  (2/2) pciehp-move-msleep-after-power-off.patch

- For "duplicate kobject_add" problem
  pci_hotplug_core-add-check-of-slot-name.patch
Comment 6 Kenji Kaneshige 2008-05-19 23:40:17 UTC
Created attachment 16208 [details]
For "pciehp probing slow" problem (1/2)
Comment 7 Kenji Kaneshige 2008-05-19 23:44:30 UTC
Created attachment 16209 [details]
For "pciehp probing slow" problem (2/2)

Patch for "pciehp probing slow" problem (2/2)
Comment 8 Kenji Kaneshige 2008-05-19 23:45:44 UTC
Created attachment 16210 [details]
For "duplecate kobject_add" problem

Patch for "duplecate kobject_add" problem
Comment 9 Jan C. Nordholz 2008-05-20 06:01:34 UTC
Hi Kenji,

thanks for your efforts! The pciehp delay problem persists, however. I'm attaching an updated dmesg. Is there anything I can do to debug the problem?
Comment 10 Jan C. Nordholz 2008-05-20 06:02:15 UTC
Created attachment 16215 [details]
Dmesg output (2.6.26+rc3 plus previous three patches)
Comment 11 Kenji Kaneshige 2008-05-20 18:03:06 UTC
Thank you for the report.
I'll post the patch for "duplecate kobject_add" problem to the linux-pci community.
For debugging pciehp delay problem, could you send me the dmesg message output
with "pciehp.pciehp_debug" boot option that enables debug mode of pciehp driver?
You can see the debug message like below when pciehp driver is loaded under the 
debugging mode.

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: Hotplug Controller:
pciehp:   Seg/Bus/Dev/Func/IRQ : 0002:13:08.0 IRQ 50
pciehp:   Vendor ID            : 0x10b5
pciehp:   Device ID            : 0x8532
pciehp:   Subsystem ID         : 0x0000
pciehp:   Subsystem Vendor ID  : 0x0000
pciehp:   PCIe Cap offset      : 0x68
pciehp:   PCI resource [8]     : 0xe00000@0xab800000
pciehp:   PCI resource [9]     : 0x40000000@0xea00000000
pciehp: Slot Capabilities      : 0x01200cde
pciehp:   Physical Slot Number : 36
pciehp:   Attention Button     :  no
pciehp:   Power Controller     : yes
pciehp:   MRL Sensor           : yes
pciehp:   Attention Indicator  : yes
pciehp:   Power Indicator      : yes
pciehp:   Hot-Plug Surprise    :  no
pciehp:   EMI Present          :  no
pciehp: Slot Status            : 0x0050
pciehp: Slot Control           : 0x0050
Comment 12 Kenji Kaneshige 2008-05-20 21:36:49 UTC
Could you try one more thing? I made an additional patch on top of previous three
patches. Could you try it and give me the result?
Comment 13 Kenji Kaneshige 2008-05-20 21:39:25 UTC
Created attachment 16221 [details]
For "pciehp probing slow" problem (additional)

Additional patch for "pciehp probing slow" problem
Comment 14 Jan C. Nordholz 2008-05-21 02:22:43 UTC
Hi,

no problem... pciehp is still slow with all three patches attached. Here's the verbose dmesg.
Comment 15 Jan C. Nordholz 2008-05-21 02:23:52 UTC
Created attachment 16222 [details]
dmesg output (2.6.26-rc3 with all patches and pciehp_debug)
Comment 16 Kenji Kaneshige 2008-05-21 23:43:46 UTC
Hi Jan-san,

Thank you for testing. From the dmesg output, I found that hot-plug
controller on your machine doesn't notify of the command completed
event even though it supports command completed. Your machine doesn't
support any of power controller, attention led, power led,
electromechanical interlock. I think this is why hot-plug controller
doesn't notify the command completed event.

I updated the set of patches to handle above mentioned controller
behavior. Could you try it? Please note that this set of patches
are against 2.6.26-rc3, not against previous ones.

I'm sorry for troubling you. I hope this is the last one.
Comment 17 Kenji Kaneshige 2008-05-21 23:48:13 UTC
Created attachment 16233 [details]
Patch (1/3) for "pciehp probing slot" problem

Patch (1/3) for "pciehp probing slot" problem
Comment 18 Kenji Kaneshige 2008-05-21 23:48:51 UTC
Created attachment 16234 [details]
Patch (2/3) for "pciehp probing slot" problem

Patch (2/3) for "pciehp probing slot" problem
Comment 19 Kenji Kaneshige 2008-05-21 23:49:42 UTC
Created attachment 16235 [details]
Patch (3/3) for "pciehp probing slot" problem

Patch (3/3) for "pciehp probing slot" problem
Comment 20 Jan C. Nordholz 2008-05-22 01:33:47 UTC
Hi Kenji,

excellent, the timeout delays are gone. Thank you very much for fixing this so fast!
Comment 21 Kenji Kaneshige 2008-05-22 17:27:03 UTC
Hi Jan-san,

Thank you for testing. I'm happy to hear the good news. I want to push
the patches to the community. But before that, I would like to ask you
to do one more test just in case. I'm sorry for troubling you again.

I'll attach a new patch named "test.patch" that is on top of the three
patches I sent yesterday. Could you try it with "pciehp.pciehp_debug"
boot option and send me the dmesg output? Please note that I expect
that the problem will come back with test.patch.

And could you tell me the model name of your machine?
Comment 22 Kenji Kaneshige 2008-05-22 17:29:23 UTC
Created attachment 16251 [details]
patch for test
Comment 23 Kenji Kaneshige 2008-05-23 14:14:49 UTC
Jan-san,

> And could you tell me the model name of your machine?

Please ignore this line in comment #21. Sorry.

And I'm still waiting for the test result of "patch for test" attached
in #22 (I called it test.patch in comment #21). Please try it and send
me the dmsg output.
Comment 24 Jan C. Nordholz 2008-05-24 03:49:42 UTC
I'm sorry for taking so long and keeping you waiting. Here's the dmesg - you were right, the delay indeed returns with the additional patch.
Comment 25 Jan C. Nordholz 2008-05-24 03:50:22 UTC
Created attachment 16265 [details]
dmesg output (2.6.26-rc3 with additional patch)
Comment 26 Kenji Kaneshige 2008-05-24 06:23:32 UTC
Jan-san,

Thank you very much for testing, and I'm sorry for troubling you many times.
I'll push the patches to the community early next week.

Thank you very much, again.
Comment 27 Adrian Bunk 2008-05-28 00:43:31 UTC
fixed by commit 5808639bfa98d69f77a481d759570d85f164fea0
Comment 28 Adrian Bunk 2008-05-28 00:44:38 UTC
(and commit a86161b3134465f072d965ca7508ec9c1e2e52c7)