Bug 35642
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | On resume, I sometimes get a kernel oops with led_trigger_unregister_simple | | |
| Product | ACPI | Reporter | Sven-Hendrik Haase (svenstaro) |
| Component | Power-Battery | Assignee | Lan Tianyu (tianyu.lan) |
| Status | CLOSED CODE_FIX | | |
| Severity | normal | CC | acpi-bugzilla, diego.viola, florian, leho, lenb, maciej.rutecki, rjw, svenstaro |
| Priority | P1 | | |
| Hardware | All | | |
| OS | Linux | | |
| Kernel Version | 2.6.39 | Subsystem | |
| Regression | Yes | Bisected commit-id | |
| Bug Depends on | | | |
| Bug Blocks | 7216, 32012 | | |
| Attachments | relevant kernel.log section; lspci -vvv; lspci -vvv; /proc/acpi/battery/ while no battery is inserted; debug patch; debug patch; bko-35642-dmesg-led_trigger_unregister_simple.log; bko-35642-dmesg-led_trigger_unregister_simple-post-patch.log | | |
Created attachment 59052 [details]
lspci -vvv
So if conky is not running, this oops doesn't happen? It looks like it accesses a proc battery interface. However, I'm surprised that such an interface exists if the battery is not present. Please show the contents of that interface. Please also verify that this was not a problem in 2.6.38.

Created attachment 59252 [details]
lspci -vvv
The same problem seems to happen on my system, too. It's strange that it occurs kind of randomly after a few suspend/sleep cycles.
conky, kde4
Linux 2.6.38-ARCH #1 SMP PREEMPT
x86_64 Intel(R) Core(TM)2 Duo CPU T5800 @ 2.00GHz GenuineIntel GNU/Linux
Arch Linux x86_64
I was unable to recreate the oops after multiple attempts without conky. The problem indeed seems to be caused by an access of conky to the acpi battery interface during suspend/resume. This also makes it hard to recreate even with conky running. For reference, this is what conky does: http://git.omp.am/?p=conky.git;a=blob;f=src/linux.cc;h=880405fe46c10943c94e6e0985418a8409935f52;hb=HEAD#l2080

I attached a file that shows the contents of my /proc/acpi/battery/ when the battery is not inserted. It also shows a BAT1 (BAT0 being my primary battery) that I could make use of if I used an additional battery bay instead of an optical drive but that was never inserted. Apparently the problem also exists in 2.6.38 as Matz has pointed out.

Created attachment 59322 [details]
/proc/acpi/battery/ while no battery is inserted
Actually, I lied, this is what conky really does in the tagged version that I use: http://git.omp.am/?p=conky.git;a=blob;f=src/linux.c;h=ce5f73335bbf7e306a31264c984e4ed93b8d4283;hb=0b3fbed04520af4b228aa42723e02b5831f1d0c2#l1961

Created attachment 63012 [details]
debug patch
Please try this patch. I guess this problem is introduced by commit 25be5821521640eb00b7eb219ffe59664510d073.
battery_notify() refreshes sysfs without checking whether the battery exists. It invokes sysfs_add_battery() every time, which is not reasonable. This may cause sysfs_remove_battery() to invoke power_supply_unregister() when the battery doesn't exist.
sysfs_remove_battery() is invoked from both acpi_battery_update() and battery_notify(). acpi_battery_update() is invoked whenever "/proc/acpi/battery/BAT0/state" is accessed. So when the system resumes from suspend with conky running, sysfs_remove_battery() may be invoked from both paths simultaneously, and power_supply_unregister() has a chance of being invoked twice. This may cause the problem and produce the oops message.
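
The double-unregister scenario described above can be illustrated with a small user-space sketch. This is not the kernel code and not the merged patch: `struct battery`, `battery_add()`, `battery_remove()` and `race_two_removals()` are hypothetical stand-ins, and a pthread mutex is assumed in place of whatever locking the real driver uses. It only shows the two ideas from the analysis: skip registration when no battery is present, and make removal idempotent under a lock so that two concurrent callers tear down at most once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the kernel's struct. */
struct battery {
    pthread_mutex_t lock;   /* serializes add and remove */
    bool present;           /* is a battery physically inserted? */
    bool registered;        /* stands in for "power supply is registered" */
    int unregister_calls;   /* how many times the teardown actually ran */
};

void battery_init(struct battery *b, bool present)
{
    assert(b != NULL);
    pthread_mutex_init(&b->lock, NULL);
    b->present = present;
    b->registered = false;
    b->unregister_calls = 0;
}

/* Register only when a battery is present (the "check before refresh" idea). */
void battery_add(struct battery *b)
{
    pthread_mutex_lock(&b->lock);
    if (b->present)
        b->registered = true;
    pthread_mutex_unlock(&b->lock);
}

/* Idempotent removal: the teardown runs at most once per registration. */
void battery_remove(struct battery *b)
{
    pthread_mutex_lock(&b->lock);
    if (b->registered) {
        b->unregister_calls++;   /* stands in for the unregister call */
        b->registered = false;
    }
    pthread_mutex_unlock(&b->lock);
}

static void *remove_worker(void *arg)
{
    battery_remove(arg);
    return NULL;
}

/* Run two removals concurrently, as the notify and proc-read paths might. */
int race_two_removals(struct battery *b)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, remove_worker, b);
    pthread_create(&t2, NULL, remove_worker, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return b->unregister_calls;
}
```

Without the `registered` check under the lock, both workers could run the teardown, which corresponds to the suspected double power_supply_unregister() call.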
Created attachment 63022 [details]
debug patch
Please try this one.
With a patched kernel I'm completely unable to recreate the issue so far. As it didn't happen every time before, this might just be bad luck, though. I will announce any problems here but so far it looks fine.

It's a few days in and the problem does indeed appear to be fixed. Tianyu Lan, are you submitting this patch for upstream inclusion?

No, I will submit it later with other patches.

Ignore-Patch: https://bugzilla.kernel.org/attachment.cgi?id=63022
Patch: https://patchwork.kernel.org/patch/927582/
Patch: https://patchwork.kernel.org/patch/927612/

These two patches are staged in acpi-test for v3.1.

A patch referencing this bug report has been merged in Linux v3.1-rc1:
commit 9c921c22a7f33397a6774d7fa076db9b6a0fd669
Author: Lan Tianyu <tianyu.lan@intel.com>
Date: Thu Jun 30 11:34:12 2011 +0800
ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()

A patch referencing this bug report has been merged in Linux v3.1-rc1:
commit 69d94ec6d83d84044252d9ba03f6a8970816e350
Author: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date: Sat Aug 6 01:34:08 2011 +0300
Battery: sysfs_remove_battery(): possible circular locking

A patch referencing this bug report has been merged in Linux v3.1-rc1:
commit 6e17fb6aa1a67afa1827ae317c3594040f055730
Author: Lan Tianyu <tianyu.lan@intel.com>
Date: Thu Jun 30 11:33:58 2011 +0800
ACPI / Battery: Add the check before refresh sysfs in the battery_notify()

Created attachment 71302 [details]
bko-35642-dmesg-led_trigger_unregister_simple.log
it looks like i might've bumped into this today with acer travelmate 8172. i suspended with AC power disconnected, then reconnected AC before resuming and received this BUG.
$ uname -a Linux travelmate 3.0.2-pf #4 SMP PREEMPT Fri Aug 26 12:39:19 EEST 2011 i686 Intel(R) Core(TM) i3 CPU U 330 @ 1.20GHz GenuineIntel GNU/Linux
patches from 3.1-rc1 seem to apply cleanly when applied in the correct order. so far i'm at compile-test stage, will put into production shortly.
if there's anything known that prevent these patches from working correctly on 3.0 series, i'd appreciate the info.
initial test cycle seems to confirm that patch is successful. will keep monitoring the situation.

Created attachment 71322 [details]
bko-35642-dmesg-led_trigger_unregister_simple-post-patch.log
i can now report that my oopses continue. the patches seem to have no apparent effect after all. dmesg attached.
I think I'm having a similar issue here: Bug 112351. Could someone please help?
Created attachment 59042 [details]
relevant kernel.log section

On kernel 2.6.39, I sometimes get the attached oops during resume. The log I attached contains the complete sleep-resume cycle and as you can see, I'm suspending into memory. I'm certainly no kernel hacker but from what the trace lets me guess, conky (which also tries to get battery level for me) explodes the kernel on resume since it can't find a battery? My battery is usually not inserted. It appears to be a regression as I cannot remember seeing this in 2.6.28 or earlier.

Relevant system info and software:
software: conky, kde4, kde4 battery monitor
uname -a: Linux smith 2.6.39-ARCH #1 SMP PREEMPT Fri May 20 11:33:59 CEST 2011 x86_64 Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz GenuineIntel GNU/Linux
distro: Arch Linux x86_64