Bug 195453
Summary: | race on fs/exec with fs_struct in_exec flag, introduced in commit 498052bba55ecaff58db6a1436b0e25bfd75a7ff | |
---|---|---|---
Product: | File System | Reporter: | Colin Ian King (colin.king)
Component: | Other | Assignee: | fs_other
Status: | NEW --- | |
Severity: | high | CC: | benh, dev.jongy, domnulnopcea, shaoyi
Priority: | P1 | |
Hardware: | All | |
OS: | Linux | |
Kernel Version: | 2.6.29+ onwards | Subsystem: |
Regression: | No | Bisected commit-id: |
Attachments: | this fix solves the issue without any overhead of extra per thread locking and a simple lightweight retry | |
Description
Colin Ian King
2017-04-18 13:22:33 UTC
I don't think the second condition you gave is relevant, because once the task_struct->fs pointer is nullified, this thread is no longer accounted for in fs->users. Since this thread counts toward neither fs->count nor n_fs, that case is okay. The first condition is indeed a problem. I'm not sure what the right fix is (probably some kind of locking in copy_process that while_each_thread could use too), but as it stands this race is bad.
Thanks for looking at this. The locking in copy_process concerns me, as I didn't want to introduce a locking fix that causes a serious performance regression on the copy and exec paths.
Any updates on this?
Created attachment 256351
this fix solves the issue without any overhead of extra per thread locking and a simple lightweight retry
This has been extensively tested on a multiprocessor Xeon server with the reproducers, and it fixes the issue. I've also checked it under heavy pthread contention: the retry loop is taken very rarely, so the overhead of the retry is very small.
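To make the accounting in the discussion above concrete, here is a minimal user-space sketch of the comparison between fs->users and n_fs. It is a toy model, not kernel code and not the attached patch: the types `toy_fs` and `toy_thread`, the `on_thread_list` flag, and both helper functions are hypothetical names invented for illustration. It assumes, as the comments above suggest, that a thread being set up in copy_process can already hold a reference on the shared fs before the thread walk can see it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; these types are hypothetical, not kernel structs. */
struct toy_fs {
    int users;              /* reference count, standing in for fs->users */
};

struct toy_thread {
    struct toy_fs *fs;      /* NULL once the thread has dropped its fs */
    bool on_thread_list;    /* false while a clone is still half-built */
};

/* Count the listed threads that share fs (the n_fs of the discussion). */
static int count_sharers(const struct toy_thread *t, size_t n,
                         const struct toy_fs *fs)
{
    int n_fs = 0;

    for (size_t i = 0; i < n; i++)
        if (t[i].on_thread_list && t[i].fs == fs)
            n_fs++;
    return n_fs;
}

/* True when users exceeds n_fs, i.e. fs appears shared outside the group. */
static bool looks_shared(const struct toy_thread *t, size_t n,
                         const struct toy_fs *fs)
{
    return fs->users > count_sharers(t, n, fs);
}
```

In this model, a half-built clone (`users` already incremented, `on_thread_list` still false) makes `looks_shared` return true even though every holder of the fs belongs to the same thread group. A thread that clears its `fs` pointer and drops its reference in the same step does not cause a mismatch. A retry-based fix would presumably recount once the clone has been linked in, at which point the two numbers agree again.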
Did you ever submit the fix to the mailing list?
Hi Colin, may I ask whether there is any recent update on your fix? This race condition can still be reproduced on current Linux mainline, v5.19.0-rc6. I found that Ubuntu picked this patch up as https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html but a soft lockup issue in kernel 4.4 was also reported as https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1876856 so I'm wondering whether you have an updated version of the patch and plan to submit it upstream. I would really appreciate it if you're still tracking this issue!