Bug 219244
Summary: | "driver core: Fix uevent_show() vs driver detach race" introduces weird "frozen usb mouse pointer at boot" bug on my system | ||
---|---|---|---|
Product: | Drivers | Reporter: | brmails+k |
Component: | Input Devices | Assignee: | drivers_input-devices |
Status: | NEW --- | ||
Severity: | normal | CC: | dan.j.williams, regressions |
Priority: | P3 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | v6.6.46 - 6.6.52, v6.10.10, v6.11.0 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | 4d035c743c3e391728a6f81cbf0f7f9ca700cf62 |
Description
brmails+k
2024-09-06 19:37:59 UTC
The important question is: does the same problem happen in mainline? It's important, as different steps will be taken depending on the answer -- and it also influences which developers will care/are responsible. I just tested v6.6.50 in the off-chance the issue might have been solved coincidentally. Unfortunately it wasn't I'd have to set up a separate system for mainline testing. That might take a while for me to find the necessary time. I'll report back when I have done that and tested the mainline kernel for the bug. brmails+k: can I CC you when forwarding this by mail? This would expose your email address to the world. Thanks for asking. Even though this address is separate from my personal one, I'd still like to keep it spam-free, if possible. If I can read your forwarded messages on https://lore.kernel.org/ that should be ok and no CC would be necessary, I guess. I noticed that the commit that apparently leads to some issues on my system solves a CVE [1], so simply reverting won't be the solution - which I didn't expect anyway. [1]: CVE-2024-44952: driver core: Fix uevent_show() vs driver detach race: https://lore.kernel.org/all/2024090411-CVE-2024-44952-6290@gregkh/ (In reply to brmails+k from comment #4) > If I can read your forwarded messages on https://lore.kernel.org/ Then keep an eye on https://lore.kernel.org/regressions/3724e8e8-ab71-4f64-8ba1-c5c9a617632f@leemhuis.info/T/#u :-D > https://lore.kernel.org/regressions/3724e8e8-ab71-4f64-8ba1-
> c5c9a617632f@leemhuis.info/T/#u :-D
Thanks for "forwarding" the issue.
I'm still planning on setting up a test drive to test the mainline kernel. But given that this is a rather minor issue, the commit a rather important one (to keep) and my system (in the current setup) being the only one affected / being able to replicate the issue, there's no pressure (time-wise or other) - definitely not from my part.
O'll report back when I had the time to setup the test drive and do the mainline testing - and post the results, of course.
Little update: I set up a test install but haven't been able to re-create the issue, even with the v6.6 linux. I transferred then some settings to the test install and installed all packages that the install with the problem has installed (as per what the package manager tells me). But the test install still does not show the issue. The only difference between those two setups (apart from probably some config files I have to compare that might be different, of course) is that the main install is on a nvme and the test install is on a hdd. The two different drives are connected to the same connector to the mainboard, but need different connector cables (m.2-nvme vs sata). So, for now, I don't have an answer what might cause the issue, but I'll keep testing. It might be the fresh install vs. the old - but I tried to minimize that by having all packages on the test install, too. Maybe the hdd vs the nvme (speed), that leads to different order of executing commands, or the different connector cables, which interfere somehow with usb initialization. All of course depending on the changes of the commit. But that's just theoretical. I might have to sacrifice my backup install which is also on a (separate) nvme and put a fresh isntall on there to see if it's the fresh vs. old install or nvme vs. hdd setup. I'll update as I get new data / information. Update after a few days of more testing. I haven't been able to replicate the issue on a test install on a hdd. Not even once! But I have been able to replicate the issue on a test install on another nvme of the same make and model (Samsung 980 Pro) as the main install. With kernel v6.6.51, v6.10.10 and 6.11.0. I'm not sure what to make of it or what to test next / further at this point. Maybe I get some idea and the do some more testing. Any command output needed I'll try to provide. I saw Dan Williams' post [1] with the potential patch. I'll test it as soon as I get some time to do so and report back. Thanks for the forward and thanks to Dan Williams for taking the time to come up with a potential fix! [1] https://lore.kernel.org/regressions/3724e8e8-ab71-4f64-8ba1-c5c9a617632f@leemhuis.info/T/#m00607dbb85fe611b056230a52d3032e13990b800 Dan Williams wrote: [...] Ok, the following boots and passes the CXL unit tests, would appreciate if the reporter can give this a try: -- >8 -- Subject: driver core: Fix userspace expectations of uevent_show() as a probe barrier From: Dan Williams <dan.j.williams@intel.com> Commit "driver core: Fix uevent_show() vs driver detach race" [1] attempted to fix a lockdep report in uevent_show() by making the lookup of driver information for a given device lockless. It turns out that userspace likely depends on the side-effect of uevent_show() flushing device probing. Introduce a new "locked" type for sysfs attributes that arranges for the attribute to be called under the device-lock, but without setting up a reverse locking order dependency with the kernfs_get_active() lock. This new facility holds a reference on a device while any locked-sysfs attribute of that device is open. It then takes the lock around sysfs read/write operations in the following order: of->mutex of->op_mutex <= device_lock() kernfs_get_active() <operation> Compare that to problematic locking order of: of->mutex kernfs_get_active() <operation> device_lock() ...which causes potential deadlocks with kernfs_drain() that may be called while the device_lock() is held. Fixes: 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race") [1] Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=219244 Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/base/core.c | 2 +- fs/kernfs/file.c | 24 +++++++++++++++++++----- fs/sysfs/file.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ fs/sysfs/group.c | 4 ++-- include/linux/device.h | 3 +++ include/linux/kernfs.h | 1 + include/linux/sysfs.h | 10 ++++++++++ 7 files changed, 83 insertions(+), 8 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 8c0733d3aad8..1fd5a18cbb62 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -2772,7 +2772,7 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr, return count; } -static DEVICE_ATTR_RW(uevent); +static DEVICE_ATTR_LOCKED_RW(uevent); static ssize_t online_show(struct device *dev, struct device_attribute *attr, char *buf) diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 8502ef68459b..eb5c2167beb9 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -142,6 +142,20 @@ static void kernfs_seq_stop_active(struct seq_file *sf, void *v) kernfs_put_active(of->kn); } +static void kernfs_open_file_lock(struct kernfs_open_file *of) +{ + mutex_lock(&of->mutex); + if (of->op_mutex) + mutex_lock(of->op_mutex); +} + +static void kernfs_open_file_unlock(struct kernfs_open_file *of) +{ + if (of->op_mutex) + mutex_unlock(of->op_mutex); + mutex_unlock(&of->mutex); +} + static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) { struct kernfs_open_file *of = sf->private; @@ -151,7 +165,7 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) * @of->mutex nests outside active ref and is primarily to ensure that * the ops aren't called concurrently for the same open file. */ - mutex_lock(&of->mutex); + kernfs_open_file_lock(of); if (!kernfs_get_active(of->kn)) return ERR_PTR(-ENODEV); @@ -193,7 +207,7 @@ static void kernfs_seq_stop(struct seq_file *sf, void *v) if (v != ERR_PTR(-ENODEV)) kernfs_seq_stop_active(sf, v); - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); } static int kernfs_seq_show(struct seq_file *sf, void *v) @@ -322,9 +336,9 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) * @of->mutex nests outside active ref and is used both to ensure that * the ops aren't called concurrently for the same open file. */ - mutex_lock(&of->mutex); + kernfs_open_file_lock(of); if (!kernfs_get_active(of->kn)) { - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); len = -ENODEV; goto out_free; } @@ -336,7 +350,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) len = -EINVAL; kernfs_put_active(of->kn); - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); if (len > 0) iocb->ki_pos += len; diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index d1995e2d6c94..1bb878efcf00 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -15,6 +15,7 @@ #include <linux/list.h> #include <linux/mutex.h> #include <linux/seq_file.h> +#include <linux/device.h> #include <linux/mm.h> #include "sysfs.h" @@ -189,6 +190,26 @@ static int sysfs_kf_bin_open(struct kernfs_open_file *of) return 0; } +/* locked attributes are always device attributes */ +static int sysfs_kf_setup_lock(struct kernfs_open_file *of) +{ + struct kobject *kobj = of->kn->parent->priv; + struct device *dev = kobj_to_dev(kobj); + + get_device(dev); + of->op_mutex = &dev->mutex; + + return 0; +} + +static void sysfs_kf_free_lock(struct kernfs_open_file *of) +{ + struct kobject *kobj = of->kn->parent->priv; + struct device *dev = kobj_to_dev(kobj); + + put_device(dev); +} + void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr) { struct kernfs_node *kn = kobj->sd, *tmp; @@ -227,6 +248,25 @@ static const struct kernfs_ops sysfs_file_kfops_rw = { .write = sysfs_kf_write, }; +static const struct kernfs_ops sysfs_locked_kfops_ro = { + .seq_show = sysfs_kf_seq_show, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + +static const struct kernfs_ops sysfs_locked_kfops_wo = { + .write = sysfs_kf_write, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + +static const struct kernfs_ops sysfs_locked_kfops_rw = { + .seq_show = sysfs_kf_seq_show, + .write = sysfs_kf_write, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + static const struct kernfs_ops sysfs_prealloc_kfops_ro = { .read = sysfs_kf_read, .prealloc = true, @@ -287,6 +327,13 @@ int sysfs_add_file_mode_ns(struct kernfs_node *parent, ops = &sysfs_prealloc_kfops_ro; else if (sysfs_ops->store) ops = &sysfs_prealloc_kfops_wo; + } else if (mode & SYSFS_LOCKED) { + if (sysfs_ops->show && sysfs_ops->store) + ops = &sysfs_locked_kfops_rw; + else if (sysfs_ops->show) + ops = &sysfs_locked_kfops_ro; + else if (sysfs_ops->store) + ops = &sysfs_locked_kfops_wo; } else { if (sysfs_ops->show && sysfs_ops->store) ops = &sysfs_file_kfops_rw; diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c index d22ad67a0f32..0158367866be 100644 --- a/fs/sysfs/group.c +++ b/fs/sysfs/group.c @@ -68,11 +68,11 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj, continue; } - WARN(mode & ~(SYSFS_PREALLOC | 0664), + WARN(mode & ~(SYSFS_PREALLOC | SYSFS_LOCKED | 0664), "Attribute %s: Invalid permissions 0%o\n", (*attr)->name, mode); - mode &= SYSFS_PREALLOC | 0664; + mode &= SYSFS_PREALLOC | SYSFS_LOCKED | 0664; error = sysfs_add_file_mode_ns(parent, *attr, mode, uid, gid, NULL); if (unlikely(error)) diff --git a/include/linux/device.h b/include/linux/device.h index 34eb20f5966f..c38c33bed333 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -180,6 +180,9 @@ ssize_t device_show_string(struct device *dev, struct device_attribute *attr, #define DEVICE_ATTR_RW(_name) \ struct device_attribute dev_attr_##_name = __ATTR_RW(_name) +#define DEVICE_ATTR_LOCKED_RW(_name) \ + struct device_attribute dev_attr_##_name = __ATTR_LOCKED_RW(_name) + /** * DEVICE_ATTR_ADMIN_RW - Define an admin-only read-write device attribute. * @_name: Attribute name. diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 87c79d076d6d..df6828a7cd3e 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -257,6 +257,7 @@ struct kernfs_open_file { /* private fields, do not use outside kernfs proper */ struct mutex mutex; + struct mutex *op_mutex; struct mutex prealloc_mutex; int event; struct list_head list; diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h index c4e64dc11206..981588c9c673 100644 --- a/include/linux/sysfs.h +++ b/include/linux/sysfs.h @@ -103,6 +103,7 @@ struct attribute_group { #define SYSFS_PREALLOC 010000 #define SYSFS_GROUP_INVISIBLE 020000 +#define SYSFS_LOCKED 040000 /* * DEFINE_SYSFS_GROUP_VISIBLE(name): @@ -230,6 +231,13 @@ struct attribute_group { .store = _store, \ } +#define __ATTR_LOCKED(_name, _mode, _show, _store) { \ + .attr = {.name = __stringify(_name), \ + .mode = SYSFS_LOCKED | VERIFY_OCTAL_PERMISSIONS(_mode) },\ + .show = _show, \ + .store = _store, \ +} + #define __ATTR_RO(_name) { \ .attr = { .name = __stringify(_name), .mode = 0444 }, \ .show = _name##_show, \ @@ -255,6 +263,8 @@ struct attribute_group { #define __ATTR_RW(_name) __ATTR(_name, 0644, _name##_show, _name##_store) +#define __ATTR_LOCKED_RW(_name) __ATTR_LOCKED(_name, 0644, _name##_show, _name##_store) + #define __ATTR_NULL { .attr = { .name = NULL } } #ifdef CONFIG_DEBUG_LOCK_ALLOC Good news! I think the proposed patch by Dan Williams fixes the issue. I have tested it with v6.6.52 and v6.10.11. I haven't been able to recreate the issue with those modified kernels even once. The patch can be applied to v6.11.0 and v6.10.11 out of the box. For v6.6.52 I had to slightly modify it as the line #define SYSFS_GROUP_INVISIBLE 020000 doesn't exist in /include/linux/sysfs.h in v6.6.52 hence the patch looking for that line fails on that file. But after adjusting the patch accordingly, the patch works fine on v6.6.52 and the issue is gone with the patched version of v6.6.52, just like 6.10.11. So, I assume that the fix / patch proposed by Dan Williams works as intended resolving the issue I had. Thanks again for forwarding the bug report and for the quick fix! (In reply to brmails+k from comment #11) > Good news! > > I think the proposed patch by Dan Williams fixes the issue. Thanks for the report and the test! Let me know if you want a different credit in the kernel besides the following: Reported-by: brmails+k@disroot.org Tested-by: brmails+k@disroot.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=219244 Thanks for the fix! I'd prefer just "brmails+k" to avoid spam, but if an email address is necessary, what you proposed is fine, too. (In reply to brmails+k from comment #13) > I'd prefer just "brmails+k" to avoid spam, Maybe it's time to allow a format like Reported-by: brmails+k Tested-by: brmails+k Dan, what do you think, should I submit this as a documentation change (which tells people that that format is only allowed in combination with a Link:/Closes: tag) and see what people think about it? That format could work for subsystems that use gitlab instances or github, too. > but if an email address is > necessary, what you proposed is fine, too. The email address in a public comment now anyway. :-/ I see there is already precedent for not including a full email in Reported-by and Tested-by lines: "git log | grep Reported-by: | grep -v @". It does break flows like mine, "stg mail --auto", and checkpatch complains, but looks like "git send-email" is ok to skip over those lines without complaint. I am not sure we need to formalize that this is ok because the preference would be to include testers on future updates about the patch, but if individuals do not want their email in the kernel history for Reported-by and Tested-by: then that is their prerogative. An update that might be worthwhile in the documentation is the suggestion to extend reporters and testers the courtesy of confirming how and whether they are credited in git history rather than just assuming the contact details used to make the report will find their way into the kernel history verbatim. I didn't want do cause any trouble / extra work... But thank you for all your efforts! Luckily, I seem to be the only one who ran into this issue, so maybe no one outside of the kernel team will ever read the comment / commit... :-) I just tested the new version of the patch [1]. Works fine, as expected (I tested 6.6.53 and 6.10.12). Thanks! [1] https://lore.kernel.org/all/172790598832.1168608.4519484276671503678.stgit@dwillia2-xfh.jf.intel.com/ Commit "driver core: Fix uevent_show() vs driver detach race" [1] attempted to fix a lockdep report in uevent_show() by making the lookup of driver information for a given device lockless. It turns out that userspace likely depends on the side-effect of uevent_show() flushing device probing, as evidenced by the removal of the lock causing a regression initializing USB devices [2]. Introduce a new "locked" type for sysfs attributes that arranges for the attribute to be called under the device-lock, but without setting up a reverse locking order dependency with the kernfs_get_active() lock. This new facility holds a reference on a device while any locked-sysfs attribute of that device is open. It then takes the lock around sysfs read/write operations in the following order: of->mutex of->op_mutex <= device_lock() kernfs_get_active() <operation> Compare that to problematic locking order of: of->mutex kernfs_get_active() <operation> device_lock() ...which causes potential deadlocks with kernfs_drain() that may be called while the device_lock() is held. Note that the synchronize_rcu() in module_remove_driver(), introduced in the precedubg fix, likely needs to remain since dev_uevent() is not always called with the device_lock held. Userspace can potentially trigger use after free by racing module removal against kernel invocations of kobject_uevent(). Note2, this change effectively allows userspace to test for CONFIG_DEBUG_KOBJECT_RELEASE-style driver bugs as the lifetime of open file descriptors on the @uevent attribute keeps the device reference count elevated indefinitely. Reported-by: brmails+k Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219244 [2] Tested-by: brmails+k Fixes: 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race") [1] Cc: stable@vger.kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- Changes since v1 [1]: - Move the new "locked" infrastructure to private header files to make it clear it is not approved for general usage (Greg) [1]: http://lore.kernel.org/172772671177.1003633.7355063154793624707.stgit@dwillia2-xfh.jf.intel.com drivers/base/base.h | 9 +++++++++ drivers/base/core.c | 2 +- fs/kernfs/file.c | 24 +++++++++++++++++++----- fs/sysfs/file.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ fs/sysfs/group.c | 4 ++-- fs/sysfs/sysfs.h | 18 ++++++++++++++++++ include/linux/kernfs.h | 1 + include/linux/sysfs.h | 1 + 8 files changed, 98 insertions(+), 8 deletions(-) diff --git a/drivers/base/base.h b/drivers/base/base.h index 0b53593372d7..0405e7667184 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -246,3 +246,12 @@ static inline int devtmpfs_delete_node(struct device *dev) { return 0; } void software_node_notify(struct device *dev); void software_node_notify_remove(struct device *dev); + +/* + * "locked" device attributes are an internal exception for the @uevent + * attribute. + */ +#include "../../fs/sysfs/sysfs.h" + +#define DEVICE_ATTR_LOCKED_RW(_name) \ + struct device_attribute dev_attr_##_name = __ATTR_LOCKED_RW(_name) diff --git a/drivers/base/core.c b/drivers/base/core.c index 8c0733d3aad8..1fd5a18cbb62 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -2772,7 +2772,7 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr, return count; } -static DEVICE_ATTR_RW(uevent); +static DEVICE_ATTR_LOCKED_RW(uevent); static ssize_t online_show(struct device *dev, struct device_attribute *attr, char *buf) diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 8502ef68459b..eb5c2167beb9 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -142,6 +142,20 @@ static void kernfs_seq_stop_active(struct seq_file *sf, void *v) kernfs_put_active(of->kn); } +static void kernfs_open_file_lock(struct kernfs_open_file *of) +{ + mutex_lock(&of->mutex); + if (of->op_mutex) + mutex_lock(of->op_mutex); +} + +static void kernfs_open_file_unlock(struct kernfs_open_file *of) +{ + if (of->op_mutex) + mutex_unlock(of->op_mutex); + mutex_unlock(&of->mutex); +} + static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) { struct kernfs_open_file *of = sf->private; @@ -151,7 +165,7 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) * @of->mutex nests outside active ref and is primarily to ensure that * the ops aren't called concurrently for the same open file. */ - mutex_lock(&of->mutex); + kernfs_open_file_lock(of); if (!kernfs_get_active(of->kn)) return ERR_PTR(-ENODEV); @@ -193,7 +207,7 @@ static void kernfs_seq_stop(struct seq_file *sf, void *v) if (v != ERR_PTR(-ENODEV)) kernfs_seq_stop_active(sf, v); - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); } static int kernfs_seq_show(struct seq_file *sf, void *v) @@ -322,9 +336,9 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) * @of->mutex nests outside active ref and is used both to ensure that * the ops aren't called concurrently for the same open file. */ - mutex_lock(&of->mutex); + kernfs_open_file_lock(of); if (!kernfs_get_active(of->kn)) { - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); len = -ENODEV; goto out_free; } @@ -336,7 +350,7 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter) len = -EINVAL; kernfs_put_active(of->kn); - mutex_unlock(&of->mutex); + kernfs_open_file_unlock(of); if (len > 0) iocb->ki_pos += len; diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index d1995e2d6c94..1bb878efcf00 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -15,6 +15,7 @@ #include <linux/list.h> #include <linux/mutex.h> #include <linux/seq_file.h> +#include <linux/device.h> #include <linux/mm.h> #include "sysfs.h" @@ -189,6 +190,26 @@ static int sysfs_kf_bin_open(struct kernfs_open_file *of) return 0; } +/* locked attributes are always device attributes */ +static int sysfs_kf_setup_lock(struct kernfs_open_file *of) +{ + struct kobject *kobj = of->kn->parent->priv; + struct device *dev = kobj_to_dev(kobj); + + get_device(dev); + of->op_mutex = &dev->mutex; + + return 0; +} + +static void sysfs_kf_free_lock(struct kernfs_open_file *of) +{ + struct kobject *kobj = of->kn->parent->priv; + struct device *dev = kobj_to_dev(kobj); + + put_device(dev); +} + void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr) { struct kernfs_node *kn = kobj->sd, *tmp; @@ -227,6 +248,25 @@ static const struct kernfs_ops sysfs_file_kfops_rw = { .write = sysfs_kf_write, }; +static const struct kernfs_ops sysfs_locked_kfops_ro = { + .seq_show = sysfs_kf_seq_show, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + +static const struct kernfs_ops sysfs_locked_kfops_wo = { + .write = sysfs_kf_write, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + +static const struct kernfs_ops sysfs_locked_kfops_rw = { + .seq_show = sysfs_kf_seq_show, + .write = sysfs_kf_write, + .open = sysfs_kf_setup_lock, + .release = sysfs_kf_free_lock, +}; + static const struct kernfs_ops sysfs_prealloc_kfops_ro = { .read = sysfs_kf_read, .prealloc = true, @@ -287,6 +327,13 @@ int sysfs_add_file_mode_ns(struct kernfs_node *parent, ops = &sysfs_prealloc_kfops_ro; else if (sysfs_ops->store) ops = &sysfs_prealloc_kfops_wo; + } else if (mode & SYSFS_LOCKED) { + if (sysfs_ops->show && sysfs_ops->store) + ops = &sysfs_locked_kfops_rw; + else if (sysfs_ops->show) + ops = &sysfs_locked_kfops_ro; + else if (sysfs_ops->store) + ops = &sysfs_locked_kfops_wo; } else { if (sysfs_ops->show && sysfs_ops->store) ops = &sysfs_file_kfops_rw; diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c index d22ad67a0f32..0158367866be 100644 --- a/fs/sysfs/group.c +++ b/fs/sysfs/group.c @@ -68,11 +68,11 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj, continue; } - WARN(mode & ~(SYSFS_PREALLOC | 0664), + WARN(mode & ~(SYSFS_PREALLOC | SYSFS_LOCKED | 0664), "Attribute %s: Invalid permissions 0%o\n", (*attr)->name, mode); - mode &= SYSFS_PREALLOC | 0664; + mode &= SYSFS_PREALLOC | SYSFS_LOCKED | 0664; error = sysfs_add_file_mode_ns(parent, *attr, mode, uid, gid, NULL); if (unlikely(error)) diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h index 3f28c9af5756..51d5b6af243f 100644 --- a/fs/sysfs/sysfs.h +++ b/fs/sysfs/sysfs.h @@ -40,4 +40,22 @@ int sysfs_add_bin_file_mode_ns(struct kernfs_node *parent, int sysfs_create_link_sd(struct kernfs_node *kn, struct kobject *target, const char *name); +#define SYSFS_LOCKED 040000 + +/* + * uevent_show() needs this due to userspace expectations of reading + * that attribute flushing device probing, it is not intended to be a + * general semantic for other device sysfs attributes. It is broken to + * use this with non-device / pure-kobject sysfs attributes, see + * sysfs_kf_setup_lock(). + */ +#define __ATTR_LOCKED(_name, _mode, _show, _store) { \ + .attr = {.name = __stringify(_name), \ + .mode = SYSFS_LOCKED | VERIFY_OCTAL_PERMISSIONS(_mode) },\ + .show = _show, \ + .store = _store, \ +} + +#define __ATTR_LOCKED_RW(_name) __ATTR_LOCKED(_name, 0644, _name##_show, _name##_store) + #endif /* __SYSFS_INTERNAL_H */ diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 87c79d076d6d..df6828a7cd3e 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -257,6 +257,7 @@ struct kernfs_open_file { /* private fields, do not use outside kernfs proper */ struct mutex mutex; + struct mutex *op_mutex; struct mutex prealloc_mutex; int event; struct list_head list; diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h index c4e64dc11206..dd28c6238b64 100644 --- a/include/linux/sysfs.h +++ b/include/linux/sysfs.h @@ -103,6 +103,7 @@ struct attribute_group { #define SYSFS_PREALLOC 010000 #define SYSFS_GROUP_INVISIBLE 020000 +/* see fs/sysfs/sysfs.h for private mode-flags */ /* * DEFINE_SYSFS_GROUP_VISIBLE(name): |