Bug 9731 - 2.6.24-rc7: Deadlock when any ACPI eject sys node written
Summary: 2.6.24-rc7: Deadlock when any ACPI eject sys node written
Status: CLOSED DUPLICATE of bug 2884
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Hotplug
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-11 09:38 UTC by Daniel Arai
Modified: 2008-10-16 23:19 UTC

See Also:
Kernel Version: 2.6.24-rc7
Subsystem:
Regression: ---
Bisected commit-id:


Description Daniel Arai 2008-01-11 09:38:24 UTC
Latest working kernel version: Unknown
Earliest failing kernel version: All 2.6.24-rc versions I've tried
Distribution: sles10
Hardware Environment: x86_64
Software Environment:
Problem Description:
I have "hardware" that supports ejectable CPUs.  Any attempt to eject a CPU by echoing 1 into its eject node under /sys causes the shell doing the echo to deadlock.

Here's what dmesg says bash is doing:

bash          D 0000000000000000     0  3552   3372
 ffff810007023ca8 0000000000000082 0000000000000000 ffff8100014327f0
 0000000000000000 ffffffff00000000 ffff81000ecde0c0 ffff8100014437c0
 304a455f0dd521a0 00000000ffffdb37 00000000000000ff ffff81000fe37900
Call Trace:
 [<ffffffff80447282>] wait_for_completion+0xa2/0xf0
 [<ffffffff80231d50>] default_wake_function+0x0/0x10
 [<ffffffff802e2f6d>] sysfs_addrm_finish+0x1dd/0x250
 [<ffffffff802e17d6>] sysfs_hash_and_remove+0xa6/0xc0
 [<ffffffff8038d37d>] device_remove_file+0x2d/0x60
 [<ffffffff803525c3>] acpi_device_unregister+0xc8/0x124
 [<ffffffff80352778>] acpi_bus_remove+0x5e/0x64
 [<ffffffff803527f8>] acpi_bus_trim+0x7a/0xee
 [<ffffffff803528e8>] acpi_eject_store+0x7c/0x119
 [<ffffffff802e1ef4>] sysfs_write_file+0xd4/0x150
 [<ffffffff80293f7d>] vfs_write+0xdd/0x150
 [<ffffffff80294643>] sys_write+0x53/0x90
 [<ffffffff8020bf1e>] system_call+0x7e/0x83

The problem seems to be that acpi_device_unregister tries to delete the sysfs node for eject, but the node cannot be deleted until the write to it completes.

sysfs_write_file calls flush_write_buffer, which does this:

static int
flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
{
	struct sysfs_dirent *attr_sd = dentry->d_fsdata;
	struct kobject *kobj = attr_sd->s_parent->s_elem.dir.kobj;
	struct sysfs_ops * ops = buffer->ops;
	int rc;

	/* need attr_sd for attr and ops, its parent for kobj */
	if (!sysfs_get_active_two(attr_sd))
		return -ENODEV;

	rc = ops->store(kobj, attr_sd->s_elem.attr.attr, buffer->page, count);

	sysfs_put_active_two(attr_sd);

	return rc;
}

sysfs_addrm_finish calls sysfs_deactivate, which is stuck waiting forever on the wait_for_completion call:

/**
 *	sysfs_deactivate - deactivate sysfs_dirent
 *	@sd: sysfs_dirent to deactivate
 *
 *	Deny new active references and drain existing ones.
 */
static void sysfs_deactivate(struct sysfs_dirent *sd)
{
	DECLARE_COMPLETION_ONSTACK(wait);
	int v;

	BUG_ON(sd->s_sibling || !(sd->s_flags & SYSFS_FLAG_REMOVED));
	sd->s_sibling = (void *)&wait;

	/* atomic_add_return() is a mb(), put_active() will always see
	 * the updated sd->s_sibling.
	 */
	v = atomic_add_return(SD_DEACTIVATED_BIAS, &sd->s_active);

	if (v != SD_DEACTIVATED_BIAS)
		wait_for_completion(&wait);

	sd->s_sibling = NULL;
}

But it looks to me like wait_for_completion() won't return until sysfs_put_active_two() in flush_write_buffer() is called, and that call can't happen until ops->store() returns.  Since ops->store() is the function doing the waiting, this looks like a deadlock to me.
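
The cycle can be shown with a small userspace model in C. This is only a sketch, not kernel code: struct node, node_get_active(), node_put_active(), node_deactivate_would_block() and write_would_deadlock() are hypothetical names that imitate the counting done by sysfs_get_active_two(), sysfs_put_active_two() and sysfs_deactivate(). The model reports whether the remover would have to sleep instead of actually sleeping.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for SD_DEACTIVATED_BIAS. */
#define DEACTIVATED_BIAS INT_MIN

struct node {
	int active;	/* >= 0: active references; < 0: removal has started */
};

/* Take an active reference; refused once removal has started. */
static bool node_get_active(struct node *n)
{
	if (n->active < 0)
		return false;
	n->active++;
	return true;
}

/* Drop an active reference that the caller really holds. */
static void node_put_active(struct node *n)
{
	assert(n->active != 0 && n->active != DEACTIVATED_BIAS);
	n->active--;
}

/*
 * Start removal by adding the bias.  Returns true if references are
 * still outstanding, i.e. the real code would now sleep in
 * wait_for_completion() until the last holder drops its reference.
 */
static bool node_deactivate_would_block(struct node *n)
{
	n->active += DEACTIVATED_BIAS;
	return n->active != DEACTIVATED_BIAS;
}

/*
 * Models flush_write_buffer() calling a store() method that removes
 * the very attribute it was invoked through.
 */
static bool write_would_deadlock(struct node *n)
{
	bool stuck;

	if (!node_get_active(n))
		return false;

	/* store(): removal starts while our own reference is held */
	stuck = node_deactivate_would_block(n);

	/* In the real hang this line is never reached. */
	node_put_active(n);
	return stuck;
}
```

Starting from a node with active == 0, write_would_deadlock() returns true: the only outstanding reference belongs to the caller that is waiting for it to be dropped. Removing the same node from a context that holds no reference does not block.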

I can provide more information if it's helpful, and can help with testing any patches.

I'm not sure exactly when this problem was first introduced.  2.6.22 hung in a similar way, but it looks like the code that deals with deleting sysfs nodes was significantly reworked between 2.6.22 and 2.6.24.

Steps to reproduce:
echo 1 into any /sys/devices/LNXSYSTM:00/ACPI*/eject node.  Watch the parent process hang.
Comment 1 Anonymous Emailer 2008-01-11 12:38:17 UTC
Reply-To: akpm@linux-foundation.org

On Fri, 11 Jan 2008 09:38:25 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9731
> 
>            Summary: 2.6.24-rc7: Deadlock when any ACPI eject sys node
>                     written
>            Product: ACPI
>            Version: 2.5
>      KernelVersion: 2.6.24-rc7
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acpi_other@kernel-bugs.osdl.org
>         ReportedBy: arai@vmware.com
> 
> 

Thanks.  So it would seem that sysfs core changes caused the acpi code to fail.
Comment 2 Zhang Rui 2008-01-16 00:22:56 UTC
Huh, so is this a regression?
I don't think so.
The root cause of this bug is that when you echo 1 > eject, ACPI tries to unregister the device, and acpi_device_unregister tries to remove the "eject" file while the write to it is still in progress, which causes the deadlock.
Please apply the patch series in bug #2884 and see if it makes any difference.
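
As a general illustration, and not necessarily what the patch series in bug #2884 does, one common way to avoid this kind of self-removal deadlock is to have the store method only record the request and let a different context do the removal after the write has returned. The userspace C sketch below models that idea; struct node, eject_pending, write_attr() and worker_run() are hypothetical names, not kernel or ACPI APIs.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for SD_DEACTIVATED_BIAS. */
#define DEACTIVATED_BIAS INT_MIN

struct node {
	int active;		/* >= 0: active references; < 0: being removed */
	bool eject_pending;	/* set by store(), consumed by the worker */
};

/* Take an active reference; refused once removal has started. */
static bool node_get_active(struct node *n)
{
	if (n->active < 0)
		return false;
	n->active++;
	return true;
}

/* Drop an active reference that the caller really holds. */
static void node_put_active(struct node *n)
{
	assert(n->active != 0 && n->active != DEACTIVATED_BIAS);
	n->active--;
}

/* Returns true if the remover would have to wait for references. */
static bool node_deactivate_would_block(struct node *n)
{
	n->active += DEACTIVATED_BIAS;
	return n->active != DEACTIVATED_BIAS;
}

/* The write path: store() only notes that an eject was requested. */
static void write_attr(struct node *n)
{
	if (!node_get_active(n))
		return;
	n->eject_pending = true;	/* store(): no removal here */
	node_put_active(n);		/* reached, because store() returned */
}

/*
 * Runs later in another context that holds no active reference.
 * Returns true if the node was removed without having to wait.
 */
static bool worker_run(struct node *n)
{
	if (!n->eject_pending)
		return false;
	n->eject_pending = false;
	return !node_deactivate_would_block(n);
}
```

In this model write_attr() returns with active back at 0, so a later worker_run() removes the node without waiting on the writer.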

Thanks
Comment 3 Daniel Arai 2008-01-16 14:38:42 UTC
Thank you!  The patch series in bug #2884 seems to fix the problem for me.  I applied the patches to 2.6.24-rc8.  When do you expect that these fixes will make it into the kernel?
Comment 4 Len Brown 2008-10-16 23:19:22 UTC

*** This bug has been marked as a duplicate of bug 2884 ***
