Bug 9874
Summary: | Undocking Lenovo ThinkPad T61 causes oops | ||
---|---|---|---|
Product: | ACPI | Reporter: | Lukas Hejtmanek (xhejtman) |
Component: | Config-Hotplug | Assignee: | Rafael J. Wysocki (rjw) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, bunk, stern |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.24-git4 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 9832 | ||
Attachments: |
acpidump output
Patch removing the locking of devices from the suspend core
Patch removing the locking of devices from the suspend core (updated)
Patch removing the locking of devices from the suspend core (updated 2x) |
Description
Lukas Hejtmanek
2008-02-02 02:07:41 UTC
Created attachment 14682 [details]
acpidump output
Does 2.6.24 work?

I did not test 2.6.24. 2.6.24-rc8 was OK, and since there are no big changes between 2.6.24-rc8 and 2.6.24, I assume that 2.6.24 works as well. If you want me to check it to be sure, I can do that, but it takes some time.

2.6.24 is OK, at least it does not happen every time.

This entry is being used for tracking a regression from 2.6.24. Please don't close it until the problem is fixed in the mainline.

Regressions list annotation:
Handled-By : Len Brown <lenb@kernel.org>

I did not see this oops in 2.6.25-rc1, but I did not do much testing. I reboot often due to various other problems, so I do not undock before suspend as often as before.

Well, in 2.6.25-rc2-git2, it happened again. It seems that it does not happen always, just sometimes.

It looks like this trace precedes the undocking oops. I.e., I can see this trace and after it, undocking produces the oops mentioned above.

[86836.756886] ACPI: \_SB_.GDCK - docking
[83791.360019] ata4.00: configured for UDMA/33
[86837.177682] acpi IBM0079:01: Suspicious device_add during suspend
[86837.177689] Pid: 71, comm: kacpi_notify Not tainted 2.6.25-rc2-git2 #6
[86837.177693]
[86837.177694] Call Trace:
[86837.177705] [<ffffffff80397070>] pm_sleep_lock+0x10/0x20
[86837.177717] [<ffffffff803905ac>] device_add+0x5c/0x590
[86837.177723] [<ffffffff80359b83>] acpi_ut_release_mutex+0x5f/0x63
[86837.177730] [<ffffffff8035b9c4>] acpi_bus_data_handler+0x0/0x1
[86837.177737] [<ffffffff8035c8d8>] acpi_add_single_object+0xafa/0xcc1
[86837.177745] [<ffffffff8028ef64>] kmem_cache_free+0x14/0xb0
[86837.177754] [<ffffffff80342cb2>] acpi_os_release_object+0x9/0xd
[86837.177761] [<ffffffff80358fea>] acpi_ut_update_ref_count+0x50/0x9d
[86837.177768] [<ffffffff80359112>] acpi_ut_update_object_reference+0xdb/0x136
[86837.177776] [<ffffffff8035ccb1>] acpi_bus_add+0x1c/0x32
[86837.177783] [<ffffffff8036025c>] hotplug_dock_devices+0xec/0x117
[86837.177788] [<ffffffff8035b9c4>] acpi_bus_data_handler+0x0/0x1
[86837.177794] [<ffffffff80342eb6>] acpi_os_execute_deferred+0x0/0x2c
[86837.177801] [<ffffffff803609fc>] dock_notify+0x7e/0xcb
[86837.177807] [<ffffffff80348819>] acpi_ev_notify_dispatch+0x57/0x60
[86837.177813] [<ffffffff80342ed9>] acpi_os_execute_deferred+0x23/0x2c
[86837.177820] [<ffffffff8024630f>] run_workqueue+0xbf/0x160
[86837.177826] [<ffffffff80246d50>] worker_thread+0x0/0x100
[86837.177831] [<ffffffff80246d50>] worker_thread+0x0/0x100
[86837.177837] [<ffffffff80246def>] worker_thread+0x9f/0x100
[86837.177844] [<ffffffff80249f50>] autoremove_wake_function+0x0/0x30
[86837.177850] [<ffffffff80246d50>] worker_thread+0x0/0x100
[86837.177856] [<ffffffff80246d50>] worker_thread+0x0/0x100
[86837.177860] [<ffffffff80249b5b>] kthread+0x4b/0x80
[86837.177868] [<ffffffff8020d128>] child_rip+0xa/0x12
[86837.177874] [<ffffffff80249b10>] kthread+0x0/0x80
[86837.177879] [<ffffffff8020d11e>] child_rip+0x0/0x12
[86837.177883]
[86837.177885] ACPI: Error adding device IBM0079:01<7>PM: Writing back config space on device 0000:03:00.0 at offset 1 (was 100106, writing 100102)

Does it happen during a suspend?

The stack dump would be easier to interpret if the kernel was built with CONFIG_FRAME_POINTER. Apparently acpi_add_single_object() is the culprit, and it was called from within a workqueue. It may have been a coincidence that the workqueue was running during a suspend.

Created attachment 14961 [details]
Patch removing the locking of devices from the suspend core
Lukas, can you please test with the attached patch applied?
To any ACPI experts reading this bug report: Is it truly necessary to register a new device (acpi_add_single_object) while the system is suspending or resuming? Would it be okay to block the device_add() call until after the resume is finished, instead of failing it?

Created attachment 14964 [details]
Patch removing the locking of devices from the suspend core (updated)
The previously attached patch was incomplete. Please test this one instead.
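
The question raised above, whether device_add() should fail or block while a suspend/resume transition is in progress, can be illustrated with a small user-space model. This is only a sketch and not kernel code: every name in it (model_suspend_begin, model_resume_end, model_device_add_failfast, model_device_add_blocking, registered_devices) is hypothetical, and a pthread mutex and condition variable stand in for whatever locking the suspend core actually uses. The fail-fast variant mirrors what the trace shows (the registration is rejected and ACPI reports "Error adding device"), while the blocking variant waits until the transition is over.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* User-space model only: all names here are hypothetical, not kernel symbols. */
static pthread_mutex_t pm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pm_resumed = PTHREAD_COND_INITIALIZER;
static bool pm_in_transition = false;   /* true between suspend start and resume end */
static int registered_devices = 0;      /* stands in for the device list */

/* Mark the start of a suspend/resume transition. */
void model_suspend_begin(void)
{
    pthread_mutex_lock(&pm_lock);
    pm_in_transition = true;
    pthread_mutex_unlock(&pm_lock);
}

/* Mark the end of the transition and wake any blocked registrations. */
void model_resume_end(void)
{
    pthread_mutex_lock(&pm_lock);
    pm_in_transition = false;
    pthread_cond_broadcast(&pm_resumed);
    pthread_mutex_unlock(&pm_lock);
}

/* Variant 1: refuse to register while a transition is in progress. */
int model_device_add_failfast(void)
{
    int ret = 0;

    pthread_mutex_lock(&pm_lock);
    if (pm_in_transition)
        ret = -EBUSY;            /* caller sees an error, device is not added */
    else
        registered_devices++;    /* registration happens under the lock */
    pthread_mutex_unlock(&pm_lock);
    return ret;
}

/* Variant 2: wait until the transition is over, then register. */
int model_device_add_blocking(void)
{
    pthread_mutex_lock(&pm_lock);
    while (pm_in_transition)
        pthread_cond_wait(&pm_resumed, &pm_lock);
    registered_devices++;
    pthread_mutex_unlock(&pm_lock);
    return 0;
}
```

In this model the blocking variant stalls the calling thread until model_resume_end() runs, so it is only safe if nothing in the suspend or resume path waits on that same thread. Whether that holds for the kacpi_notify workqueue seen in the trace is exactly the open question in this report.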
(In reply to comment #9)
> Does it happen during a suspend?
>

comment #8 happens during resume. (I suspend at home and resume at work in a dock station)

(why the hell I did not receive any mails as this bug was updated?)

Created attachment 14965 [details]
Patch removing the locking of devices from the suspend core (updated 2x)
Both previously attached patches were incomplete.
(In reply to comment #10)
> It may have been a coincidence that the
> workqueue was running during a suspend.

I think it is because it does not happen every time.

(In reply to comment #14)
> (In reply to comment #9)
> > Does it happen during a suspend?
> >
> comment #8 happens during resume. (I suspend at home and resume at work in a
> dock station)

Ah, thanks, that's important.

> (why the hell I did not receive any mails as this bug was updated?)

Bugzilla problem, I guess.

Regressions list annotation:
Patch : http://marc.info/?l=linux-acpi&m=120389632114090&w=2