When a procfile is created, one thing that can be done is to set the "owner"
field in the proc_dir_entry. When this is done, a module_get is done against
that module when the file is opened and put when it's closed. The problem is
that there is a race window where the procfile exists on the system but the
owner is not yet set.
If this happens then no module reference will be taken on open
(try_module_get(NULL) is a no-op that returns success). If the owner is then
set while the file is open, a module reference will be put when it's closed.
This can make the module refcount go negative.
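The ordering described above can be replayed in a small userspace model. This is only a sketch: the `model_*` names and structs are hypothetical stand-ins, not kernel code. The only behaviour taken from the report is that a get on a NULL owner succeeds without taking a reference, and that the put on close uses whatever owner is set at that time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct module and proc_dir_entry. */
struct model_module { int refcount; };
struct model_pde { struct model_module *owner; };

/* Mirrors try_module_get(): a NULL module is a no-op that reports success. */
static int model_try_get(struct model_module *m)
{
    if (m)
        m->refcount++;
    return 1;
}

/* Mirrors module_put(): a NULL module is ignored. */
static void model_put(struct model_module *m)
{
    if (m)
        m->refcount--;
}

/* Open takes a reference on whatever owner is set at that moment. */
static int model_open(struct model_pde *pde)
{
    return model_try_get(pde->owner);
}

/* Close drops a reference on whatever owner is set at that moment. */
static void model_release(struct model_pde *pde)
{
    model_put(pde->owner);
}

/* Replays the racy ordering and returns the final refcount. */
static int model_racy_sequence(void)
{
    struct model_module mod = { 0 };
    struct model_pde pde = { NULL };  /* entry is visible, owner not yet set */

    int ok = model_open(&pde);        /* succeeds, but no reference is taken */
    assert(ok);
    pde.owner = &mod;                 /* owner assigned while the file is open */
    model_release(&pde);              /* reference dropped that was never taken */
    return mod.refcount;
}
```

Calling `model_racy_sequence()` returns -1, matching the negative refcount described above.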
I believe fixing this requires that we make certain that if the owner is to be
set, that it be set when the proc_dir_entry is created but before proc_register
(similar to how proc_create sets the fops).
This is not a difficult problem to fix, but it will probably be labour
intensive. A new interface will need to be created that can create the procfile
with the owner already set and the places where we create procfiles will need
to be fixed to use it (and to pass in an "owner" arg).
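As a rough illustration of that ordering, here is a userspace sketch in which the owner is filled in before the entry becomes visible. `fix_create_owned` is a hypothetical name for the proposed interface, and the structs are simplified stand-ins rather than the real kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct fix_module { int refcount; };
struct fix_pde { struct fix_module *owner; int registered; };

/* Hypothetical creation helper: the owner is set first, and only then
 * is the entry made visible (the step proc_register would perform). */
static void fix_create_owned(struct fix_pde *pde, struct fix_module *owner)
{
    pde->owner = owner;
    pde->registered = 1;
}

/* Open fails on an unregistered entry, otherwise pins the owner. */
static int fix_open(struct fix_pde *pde)
{
    if (!pde->registered)
        return 0;
    if (pde->owner)
        pde->owner->refcount++;
    return 1;
}

/* Close unpins the owner. */
static void fix_release(struct fix_pde *pde)
{
    if (pde->owner)
        pde->owner->refcount--;
}

/* Runs create/open/close; stores the refcount seen while the file is
 * open in *while_open and returns the refcount after close. */
static int fix_owned_sequence(int *while_open)
{
    struct fix_module mod = { 0 };
    struct fix_pde pde = { NULL, 0 };

    fix_create_owned(&pde, &mod);
    int ok = fix_open(&pde);
    assert(ok);
    *while_open = mod.refcount;
    fix_release(&pde);
    return mod.refcount;
}
```

Because no open can happen before the owner is in place, the get and the put always act on the same module, so the count is 1 while the file is open and 0 after close.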
Some possibilities for shortcuts:
1) if the owner field in the file_operations struct is set then set pde->owner
to that. We'll have to audit the existing uses to make sure that this doesn't
2) turn proc_create and proc_create_data into wrappers that call the new
function with an owner of THIS_MODULE. That would probably work, but may mean
that some places that don't set the owner would now do so.
I count >300 places that call either proc_create, proc_create_data, or
create_proc_entry. Some sort of shortcut may be a necessity.
Alexey Dobriyan may have an opinion on this. A while back he did a patch that added the proc_create interface, which fixes a similar race with setting the fops for the file.
Created attachment 19816
test case: kernel module
Part of a reproducer. Kernel module that creates a dummy procfile and then sleeps for 10s before setting pde->owner.
To reproduce, build this and plug it in. After plugging it in, run a program to open the proctest procfile and just hold it open. After the owner is set, close the file. The module refcount will go to -1.
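The userspace half of the reproducer is not attached; a minimal sketch of it could look like the following. The helper name `hold_open` is made up here, and the path `/proc/proctest` used in the note below is an assumption based on the procfile name mentioned above.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Opens path read-only, keeps it open for hold_seconds, then closes it.
 * Returns 0 on success, -1 if the open or the close fails. */
static int hold_open(const char *path, unsigned int hold_seconds)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Keep the file open past the point where the module sets the owner. */
    sleep(hold_seconds);

    return close(fd) == 0 ? 0 : -1;
}
```

Calling something like `hold_open("/proc/proctest", 15)` from `main()` right after loading the module should keep the file open across the 10s delay, assuming that is where the procfile appears.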
The race window is a lot smaller with most real-world procfiles, but almost all of them are racy.
Time to remove struct proc_dir_entry::owner, I guess.
All surgery with proc_reg_file_ops and making remove_proc_entry()
reliable made ->owner unnecessary.
Final proofreading for races and into bitbucket it goes.
That certainly sounds easier, but won't that take away the guarantee that you hold a reference to a module while holding one of its procfiles open?
With the reproducer above, that'll just make sure that you never get a module reference at all. For a real-world example, I know that some NFS daemons hold open files under /proc and periodically poll them. What would prevent the module from being unplugged between polls?
If we want to get rid of the ->owner field, I think we'll still need to have some way to make sure that holding a procfile open gives you a module reference.
Looks like ->read_proc, ->write_proc users still rely on ->owner being set.
Well, we may shove under the carpet this fact.
You care about one thing: module shouldn't be removed while one of callbacks is executing.
For that, remove_proc_entry() waits while all users inside module stop
executing module's code. After that generic proc code marks PDE as dead and
starts returning -E for newcomers.
This also fixes "cd into dir and module is pinned" annoyance.
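The removal scheme described above can be modelled in userspace roughly as follows. This is a single-threaded sketch with hypothetical `rm_*` names: where the real remove_proc_entry() would wait, the model just reports that it would have to wait, and the -1 returned to newcomers is a placeholder rather than the actual error code.

```c
#include <assert.h>

/* Hypothetical model of a proc entry: callers currently inside the
 * module's callbacks, and whether the entry has been marked dead. */
struct rm_pde { int in_flight; int dead; };

/* A caller about to run a module callback. Newcomers to a dead entry
 * are refused (placeholder error value -1). */
static int rm_enter(struct rm_pde *pde)
{
    if (pde->dead)
        return -1;
    pde->in_flight++;
    return 0;
}

/* A caller has finished running the module callback. */
static void rm_leave(struct rm_pde *pde)
{
    assert(pde->in_flight > 0);
    pde->in_flight--;
}

/* Model of removal: returns -1 while callers are still inside the
 * module (the real code would block here), otherwise marks the entry
 * dead and returns 0. */
static int rm_try_remove(struct rm_pde *pde)
{
    if (pde->in_flight > 0)
        return -1;
    pde->dead = 1;
    return 0;
}
```

In this model an open file by itself does not hold anything; only a caller that is between `rm_enter` and `rm_leave` delays removal, which is the property the scheme relies on.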
So, I'd like to leave this scheme (one may argue about supplying THIS_MODULE
during registration, just like we do with ->proc_fops and ->data). But still
"bogo-revoke" case #2 in commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
can't be fixed by this, proxy fops are required, and if we do proxy fops
we might as well supply .owner in fops.
Also removing ->owner fixes minor bug with subsystems flipping ->owner
on live PDE (obvious refcount leak and refcount underleak), simply because
it's so simple to assign new ->owner. With proc_fops being const, they might
think about doing this better.
I'll look at what to do with ->read_proc/->write_proc, after that patch will be
commit 99b76233803beab302123d243eea9e41149804f3 in mainline