When a procfile is created, the creator can set the "owner" field in its proc_dir_entry. When it is set, a module reference is taken on that module (try_module_get) when the file is opened and put (module_put) when it is closed.

The problem is that there is a race window where the procfile exists on the system but the owner is not yet set. If the file is opened in that window, no module reference is taken on open (try_module_get(NULL) is a no-op that returns success). If the owner is then set while the file is open, a module reference is put when it is closed. This can make the module refcount go negative.

I believe fixing this requires that, if the owner is to be set, it is set when the proc_dir_entry is created but before proc_register (similar to how proc_create sets the fops).

This is not a difficult problem to fix, but it will probably be labour intensive. A new interface will need to be created that can create the procfile with the owner already set, and the places where we create procfiles will need to be fixed to use it (and to pass in an "owner" arg).

Some possibilities for shortcuts:

1) If the owner field in the file_operations struct is set, then set pde->owner to that. We'll have to audit the existing uses to make sure that this doesn't break anything...

2) Turn proc_create and proc_create_data into wrappers that call the new function with an owner of THIS_MODULE. That would probably work, but may mean that some places that don't set the owner today would now do so.

I count >300 places that call either proc_create, proc_create_data, or create_proc_entry. Some sort of shortcut may be a necessity. Alexey Dobriyan may have an opinion on this; he did a patch a while back that added the proc_create interface, which fixes a similar race with setting the fops for the file.
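The ordering described above can be replayed in a small userspace model. This is only a sketch: mock_module, mock_pde and the mock_* helpers are hypothetical stand-ins, not kernel code, and the only behaviour they copy is that a get on a NULL owner succeeds without taking a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel objects; all names are hypothetical. */
struct mock_module {
    int refcount;
};

struct mock_pde {
    struct mock_module *owner;   /* NULL until the creator assigns it */
};

/* As described above: a NULL module succeeds without taking a reference. */
static bool mock_try_module_get(struct mock_module *mod)
{
    if (mod != NULL)
        mod->refcount++;
    return true;
}

static void mock_module_put(struct mock_module *mod)
{
    if (mod != NULL)
        mod->refcount--;
}

/* open and release each read pde->owner at the moment they run. */
static bool mock_proc_open(struct mock_pde *pde)
{
    return mock_try_module_get(pde->owner);
}

static void mock_proc_release(struct mock_pde *pde)
{
    mock_module_put(pde->owner);
}

/* Replays the racy ordering and returns the final module refcount. */
int simulate_late_owner(void)
{
    struct mock_module mod = { .refcount = 0 };
    struct mock_pde pde = { .owner = NULL };   /* entry visible, owner unset */

    bool opened = mock_proc_open(&pde);        /* no reference is taken */
    assert(opened);
    pde.owner = &mod;                          /* creator sets the owner late */
    mock_proc_release(&pde);                   /* a reference is dropped anyway */
    return mod.refcount;
}

/* Same sequence, but the owner is set before the entry becomes visible. */
int simulate_early_owner(void)
{
    struct mock_module mod = { .refcount = 0 };
    struct mock_pde pde = { .owner = &mod };

    bool opened = mock_proc_open(&pde);        /* reference taken */
    assert(opened);
    mock_proc_release(&pde);                   /* matching put */
    return mod.refcount;
}
```

In this model simulate_late_owner() ends with a refcount of -1, while simulate_early_owner() ends balanced at 0, which is the argument for setting the owner before the entry is registered.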
Created attachment 19816
test case: kernel module

Part of a reproducer. Kernel module that creates a dummy procfile and then sleeps for 10s before setting pde->owner. To reproduce, build this and plug it in. After plugging it in, run a program that opens the proctest procfile and just holds it open. After the owner is set, close the file. The module refcount will go to -1.

The race window is a lot smaller with most real-world procfiles, but almost all of them are racy.
Time to remove struct proc_dir_entry::owner, I guess. All the surgery on proc_reg_file_ops and the work to make remove_proc_entry() reliable have made ->owner unnecessary. One final proofreading for races, and into the bitbucket it goes.
That certainly sounds easier, but won't that take away the guarantee that you hold a reference to a module while holding one of its procfiles open? With the reproducer above, that'll just make sure that you never get a module reference at all.

For a real-world example, I know that some NFS daemons hold open files under /proc and periodically poll them. What would prevent the module from being unplugged between polls? If we want to get rid of the ->owner field, we'll still need some way to make sure that holding a procfile open gives you a module reference.
Looks like ->read_proc and ->write_proc users still rely on ->owner being set. Well, we may sweep this fact under the carpet.
You care about one thing: the module shouldn't be removed while one of its callbacks is executing. For that, remove_proc_entry() waits until all users inside the module stop executing the module's code. After that, generic proc code marks the PDE as dead and starts returning -E for newcomers. This also fixes the "cd into dir and module is pinned" annoyance.

So, I'd like to keep this scheme (one may argue about supplying THIS_MODULE during registration, just like we do with ->proc_fops and ->data). But still, "bogo-revoke" case #2 in commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba can't be fixed by this; proxy fops are required, and if we do proxy fops we might as well supply .owner in fops.

Also, removing ->owner fixes a minor bug with subsystems flipping ->owner on a live PDE (an obvious refcount leak and refcount underflow), which happens simply because it's so easy to assign a new ->owner. With proc_fops being const, they might think about doing this better.

I'll look at what to do with ->read_proc/->write_proc; after that the patch will be straightforward.
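The scheme above, tracking in-flight callers instead of pinning the module, can be modelled roughly as follows. This is a single-threaded userspace sketch with hypothetical names (mock_entry, mock_enter, mock_leave, mock_remove_begin, mock_remove_done), not the real proc code. It assumes newcomers are turned away as soon as removal begins, and it reports the number of callers to wait for rather than actually blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a proc entry's lifetime state. */
struct mock_entry {
    int  users;   /* callers currently inside the module's callbacks */
    bool dead;    /* set once removal has started */
};

/* Called before invoking a module callback; refuses newcomers once dead.
 * The real code would return an error code to the caller at this point. */
bool mock_enter(struct mock_entry *e)
{
    if (e->dead)
        return false;
    e->users++;
    return true;
}

/* Called after the module callback returns. */
void mock_leave(struct mock_entry *e)
{
    assert(e->users > 0);
    e->users--;
}

/* Starts removal: marks the entry dead and reports how many in-flight
 * callers the remover would have to wait for. */
int mock_remove_begin(struct mock_entry *e)
{
    e->dead = true;
    return e->users;
}

/* Removal may finish, and the module may go away, only once the entry
 * is dead and nobody is left inside a callback. */
bool mock_remove_done(const struct mock_entry *e)
{
    return e->dead && e->users == 0;
}
```

The point of the design is that an open file no longer pins the module. Only a caller actually inside a callback delays removal, and any later access through the stale file is rejected.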
commit 99b76233803beab302123d243eea9e41149804f3 in mainline