Bug 13533
Summary: | 2.6.30 fails when removing second battery - Thinkpad R400 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Vojtech Gondzala (vojtech.gondzala) |
Component: | Power-Battery | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, bjorn.helgaas, lenb, pm, rui.zhang, vojtech.gondzala, yakui.zhao |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | Yes | Bisected commit-id: | |
Attachments: |
syslog messages
TP-R400 - acpidump
patch: fix a deadlock in hotplug case
refreshed patch |
Will you please attach the output of acpidump? From the description it seems that 2.6.29 works well. Will you please use git-bisect to identify the first bad commit that causes the regression? Thanks.

Created attachment 21927 [details]
TP-R400 - acpidump
Hi,
here is the output from acpidump.
I'm too busy right now to look for the commit that caused this regression; I'll try at the weekend, maybe.

Hi once more,
I think I found the cause of the problem. When the battery in the ultrabay is undocked, kernel 2.6.29 says:
ACPI: \_SB_.PCI0.LPC_.EC__.BAT1 - undocking
but 2.6.30 says:
ACPI: \_SB_.PCI0.LPC_.EC__.BAT1 - docking
If I run the command `echo 1 > /sys/devices/platform/dock.1/undock` before removing the battery (dock.1 is the battery bay), then everything is fine. Something is wrong with the dock code in the kernel, but I don't know what.

ACPI: \_SB_.PCI0.LPC_.EC__.BAT1 - docking
ACPI: Battery Slot [BAT1] (battery present)
------------[ cut here ]------------
WARNING: at kernel/workqueue.c:371 flush_cpu_workqueue+0xa1/0xb0()

What happens after this warning? Does the system continue to function?

Partially.

(In reply to comment #4)
> ACPI: \_SB_.PCI0.LPC_.EC__.BAT1 - docking
> ACPI: Battery Slot [BAT1] (battery present)
> ------------[ cut here ]------------
> WARNING: at kernel/workqueue.c:371 flush_cpu_workqueue+0xa1/0xb0()
>
> What happens after this warning? Does the system continue to function?

It is unusable: inserting a battery is ignored and something is broken. I cannot switch virtual consoles (it freezes), cpufreqd segfaults, and the only thing that helps is the SysRq key.

Hah, it's true. There is a deadlock in the ACPI hotplug mechanism. Thanks for finding this; a patch will be attached later. :)

Created attachment 21977 [details]
patch: fix a deadlock in hotplug case
please apply this patch and see if it helps.

*** Bug 13466 has been marked as a duplicate of this bug. ***

The patch doesn't help; there is a problem with a NULL pointer, see the log:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff80267539>] queue_work_on+0x29/0x80
PGD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:01/PNP0C09:00/PNP0C0A:01/power_supply/BAT1/energy_full
CPU 1
Modules linked in: i915 drm i2c_algo_bit ipv6 sco bridge stp llc bnep l2cap bluetooth snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_hda_codec_conexant snd_pcm_oss snd_mixer_oss pcmcia snd_hda_intel snd_hda_codec snd_hwdep arc4 joydev snd_pcm ecb snd_timer sdhci_pci sdhci ohci1394 thinkpad_acpi nvram wmi iwlagn snd video output mmc_core ieee1394 iwlcore rfkill led_class mac80211 soundcore snd_page_alloc yenta_socket rsrc_nonstatic pcmcia_core serio_raw ricoh_mmc sg cfg80211 psmouse uhci_hcd i2c_i801 i2c_core ehci_hcd iTCO_wdt iTCO_vendor_support usbcore pcspkr heci(C) intel_agp e1000e evdev thermal fan button battery ac aes_x86_64 aes_generic dm_crypt dm_mod fuse vboxdrv cpufreq_powersave cpufreq_conservative cpufreq_ondemand acpi_cpufreq freq_table processor coretemp input_polldev rtc_cmos rtc_core rtc_lib ext3 jbd mbcache sd_mod ahci libata scsi_mod
Pid: 16, comm: kacpi_notify Tainted: G C 2.6.30-ARCH #1 7443C1G
RIP: 0010:[<ffffffff80267539>] [<ffffffff80267539>] queue_work_on+0x29/0x80
RSP: 0018:ffff88007b103c80 EFLAGS: 00010246
RAX: ffff88006c0062d8 RBX: ffff88006c0062c0 RCX: 0000000000000000
RDX: ffff88006c0062d0 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffffffff80401faf R08: ffff880001037c80 R09: 0000000000000000
R10: 0000000000000001 R11: 00000000ffffffff R12: ffff8800789de6c0
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000040
FS: 0000000000000000(0000) GS:ffff88000102a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kacpi_notify (pid: 16, threadinfo ffff88007b102000, task ffff88007b08e500)
Stack:
 49555f080f0cd041 00000000f4f6e2b5 495056921ca00041 ffffffff8026773a
 ffffffff80401faf ffff8800789de6c0 0000000000000001 00000000f4f6e2b5
 0000000000000020 ffffffff803fb22c 504c304943505f42 0000000000000246
Call Trace:
 [<ffffffff8026773a>] ? queue_work+0x3a/0x90
 [<ffffffff80401faf>] ? acpi_dock_deferred_cb+0x0/0x1a6
 [<ffffffff803fb22c>] ? __acpi_os_execute+0x124/0x174
 [<ffffffff80401825>] ? acpi_dock_notifier_call+0xf9/0x13d
 [<ffffffff8027228f>] ? notifier_call_chain+0x4f/0xa0
 [<ffffffff8027279e>] ? __blocking_notifier_call_chain+0x6e/0xc0
 [<ffffffff803faed0>] ? acpi_os_execute_deferred+0x0/0x6b
 [<ffffffff803fd737>] ? acpi_bus_notify+0x35/0x91
 [<ffffffff8040de26>] ? acpi_ev_notify_dispatch+0x53/0x8d
 [<ffffffff803faf1b>] ? acpi_os_execute_deferred+0x4b/0x6b
 [<ffffffff80266431>] ? worker_thread+0x161/0x300
 [<ffffffff8026c6d0>] ? autoremove_wake_function+0x0/0x60
 [<ffffffff802662d0>] ? worker_thread+0x0/0x300
 [<ffffffff8026c0b4>] ? kthread+0x64/0xc0
 [<ffffffff8024ae30>] ? schedule_tail+0x30/0x80
 [<ffffffff8020d4fa>] ? child_rip+0xa/0x20
 [<ffffffff8026c050>] ? kthread+0x0/0xc0
 [<ffffffff8020d4f0>] ? child_rip+0x0/0x20
Code: 00 00 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 f0 0f ba 2a 00 19 c9 85 c9 75 31 48 8d 42 08 48 39 42 08 75 49 <44> 8b 46 20 45 85 c0 75 38 48 63 ff 48 8b 06 48 89 d6 48 03 04
RIP [<ffffffff80267539>] queue_work_on+0x29/0x80
 RSP <ffff88007b103c80>
CR2: 0000000000000020
---[ end trace 31e2bbece5b888dc ]---
note: kacpi_notify[16] exited with preempt_count 1

*** Bug 13466 has been marked as a duplicate of this bug. ***

Created attachment 22002 [details]
refreshed patch
please try this refreshed patch instead

(In reply to comment #11)
> Created an attachment (id=22002) [details]
> refreshed patch
>
> please try this refreshed patch instead

It seems to work for me. No failures when inserting or removing a battery, HDD, or CD-ROM drive in the ultrabay.

Good news. Marking this bug as Resolved.

Here's the actual deadlock for future reference:

<user removes battery, platform generates system-level notify>
  acpi_bus_notify
  blocking_notifier_call_chain
  acpi_dock_notifier_call
  acpi_os_hotplug_execute(acpi_dock_deferred_cb, ...)
  __acpi_os_execute(..., acpi_dock_deferred_cb, ...)
  schedule_work(<acpi_dock_deferred_cb>)
  queue_work(keventd_wq, <acpi_dock_deferred_cb>)

<acpi_dock_deferred_cb queued for execution in keventd_wq>
  worker_thread(keventd_wq)
  acpi_os_execute_hp_deferred
  acpi_dock_deferred_cb
  dock_notify
  hotplug_dock_devices
  dock_remove_acpi_device
  acpi_bus_trim
  acpi_bus_remove
  device_release_driver
  acpi_device_remove
  acpi_battery_remove
  sysfs_remove_battery
  power_supply_unregister
  flush_scheduled_work
  flush_workqueue(keventd_wq)

Now we're waiting for keventd_wq to be flushed, but it won't be considered flushed until acpi_os_execute_hp_deferred() completes, and that function is the one doing the waiting.

patch applied to acpi-test tree

shipped in linux-2.6.31-rc1

closed |
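
The self-flush pattern in the trace above can be reproduced outside the kernel with a toy model. The sketch below is not kernel code: `ToyWorkqueue`, `queue_work`, `flush`, and `battery_remove` are hypothetical names chosen to mirror the trace, and a timeout stands in for what would be an indefinite hang in the real workqueue.

```python
import queue
import threading


class ToyWorkqueue:
    """Toy single-worker queue (hypothetical, loosely modelled on keventd_wq)."""

    def __init__(self):
        self._items = queue.Queue()
        self._lock = threading.Lock()
        self._pending = 0
        self._idle = threading.Event()
        self._idle.set()  # nothing queued yet, so the queue counts as flushed
        threading.Thread(target=self._run, daemon=True).start()

    def queue_work(self, fn):
        with self._lock:
            self._pending += 1
            self._idle.clear()
        self._items.put(fn)

    def _run(self):
        while True:
            fn = self._items.get()
            try:
                fn()
            finally:
                # A work item only counts as done after it returns.
                with self._lock:
                    self._pending -= 1
                    if self._pending == 0:
                        self._idle.set()

    def flush(self, timeout):
        """Wait for all queued work to finish; return False on timeout."""
        return self._idle.wait(timeout)


def demo():
    wq = ToyWorkqueue()
    result = {}
    finished = threading.Event()

    def battery_remove():
        # Runs on the worker thread and waits for the queue it is running on.
        # The item cannot complete until flush() returns, and flush() cannot
        # return until the item completes, so only the timeout ends the wait.
        result["self_flush"] = wq.flush(timeout=0.2)
        finished.set()

    wq.queue_work(battery_remove)
    finished.wait(2)
    # Flushing from a different thread is fine once the item has returned.
    result["outside_flush"] = wq.flush(timeout=2)
    return result


if __name__ == "__main__":
    print(demo())
```

In this model the flush issued from inside the work item times out, while the same flush issued from another thread succeeds. That matches the trace only in shape: the removal path waits for a queue whose sole in-flight item is the removal path itself.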
Created attachment 21908 [details]
syslog messages

The system fails when removing a battery from the laptop.
Hardware: Thinkpad R400 (7443-C1G) with a battery in the ultrabay and the main battery.
Steps to reproduce: replace the DVD-RW drive with an ultrabay battery, then try to remove the battery.
2.6.29 works fine.