Summary: PROBLEM: fanotify 'FAN_OPEN_PERM' flag block unsafe thread program and TAIL command
Product: File System
Reporter: Ashish (ashi100sh)
Description Ashish 2015-03-20 06:15:48 UTC
Created attachment 171361: Fanotify Test Program

I use the fanotify test sample program to monitor open permission events (FAN_OPEN_PERM) on the whole file system (/). Then I run a test program with multiple unsafe threads (no locking). Each thread runs continuously in a loop, writing to the same file in the following sequence:
a) open file
b) write file
c) close file

When I press Ctrl+C on the test program (the unsafe thread program, without locks), it never exits while the fanotify test program is running. If we run tail on any file during the above scenario, Ctrl+C on tail also hangs.

I think it is a bug in the kernel's fanotify. When multiple threads access the same directory or files, some threads hang. Currently, fanotify merges access events from different threads.

The same issue is reported at this link: http://stackoverflow.com/questions/7566755/multi-thread-opening-file-hangs-when-fanotify-is-on

A patch for this bug was never merged into mainline: http://marc.info/?l=linux-kernel&m=131822913806350&w=2

After applying this patch, my unsafe thread program exits on Ctrl+C while the fanotify test program is running. If any more information is required, let me know.
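For readers without the attachment, the writer workload described above can be approximated as follows. This is only a minimal Python sketch, not the attached test program: the names `writer_loop` and `run_writers`, the thread count, and the bounded iteration count are assumptions made for illustration (the original program loops until interrupted), and the fanotify listener itself is not shown.

```python
import os
import tempfile
import threading


def writer_loop(path, iterations, payload=b"x"):
    """Repeat open -> write -> close on one shared file, with no locking."""
    for _ in range(iterations):
        # a) open: with a FAN_OPEN_PERM mark active, this is the call that
        #    waits until the fanotify listener allows or denies the open.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, payload)  # b) write
        finally:
            os.close(fd)  # c) close


def run_writers(path, num_threads=4, iterations=100):
    """Start several unsynchronized writers and return the final file size."""
    threads = [
        threading.Thread(target=writer_loop, args=(path, iterations))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.path.getsize(path)


if __name__ == "__main__":
    # Hypothetical target file in a scratch directory.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "shared.log")
        print(run_writers(target), "bytes written")
```

To approximate the reported scenario, one would run a loop like this while a separate fanotify listener holds a FAN_OPEN_PERM mark on the mount containing the target file, then press Ctrl+C on the writer.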
Comment 2 Jan Kara 2015-06-16 14:01:03 UTC
Thanks for the report. What kernel version do you use for the test? I ask because we have not merged permission events since 3.14...
Comment 3 Jan Kara 2015-06-16 14:20:32 UTC
Actually, I was wrong: we have not merged permission events since 3.8 already.
Comment 4 Ashish 2015-07-06 11:15:09 UTC
Hi Jan,

You are correct: the logic to not merge permission events has been included from the 3.8 kernel onwards. http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/fs/notify/fanotify/fanotify.c?h=linux-3.0.y&id=07d351b5f618e5be5bd97443d25db41eb1bb8244

But that patch/change in the kernel does not solve my issue. I face the issue when any "unsafe threads" program writes to the same file while the fanotify test sample program monitors "/". During that time, my "unsafe threads" program or "tailf of any file" never exits on Ctrl+C. I have already attached the sample programs.

After applying the patch http://marc.info/?l=linux-kernel&m=131822913806350&w=2 , however, my problem is solved. In Linux mainline an alternative patch was merged which solved that issue, but not mine.

I have tested on:
1) Ubuntu 12.04 with the 3.5 kernel, where I applied the same patch that was merged in the 3.8 kernel.
2) SLES 11 SP3, which already has the patch you mentioned backported.

Please try to reproduce this issue on your system. If you require any help, let me know.
Comment 5 Ashish 2015-07-08 10:03:36 UTC
On SUSE, I have reproduced this issue on kernel version 3.0.101-0.46-default.
Comment 6 Jan Kara 2015-07-09 16:45:42 UTC
OK, I can reproduce the issue with SLE11 SP3 kernels but not with upstream as you say. So let's close this bug as no upstream maintained kernel has the problem and continue tracking this issue in SUSE bugzilla. Thanks for your testing!
Comment 7 Ashish 2015-07-09 18:33:49 UTC
This issue still persists upstream. As per comment #4, I have seen this issue with the 3.8 changes merged on Ubuntu. I will try this on vanilla kernel versions 3.9 and 4.0 and share the results.
Comment 8 Jan Kara 2015-08-04 16:02:12 UTC
Ashish, I'm pretty sure my commit 13116dfd13c8c9d60ea04ece13419af2de8e2e37 present in 3.14 fixes the issue. Have you tried the vanilla kernels you mentioned?
Comment 9 Ashish 2015-08-07 09:30:12 UTC
I took the following vanilla kernels from kernel.org and tested them:
1) kernel 3.12.45
2) kernel 3.14.49
3) kernel 3.18.19
4) kernel 4.1.4

Testing results:
1) The bug is still present in kernel 3.12.45.
2) The bug is fixed in kernel 3.14.49, but with kernel 3.14 the system hung when stopping and flushing the marked filesystem using "FAN_MARK_FLUSH".
3) The above-mentioned (flush-related) bug is also fixed in the 3.18.19 and 4.1.4 kernels.

I tested with an AV product (which uses fanotify) and with the fanotify test program. With the AV product the result is positive, but the test program freezes the system (possibly a different issue).

Conclusion:
1) Ctrl+C on the unsafe thread program and on tailf works from 3.14 onwards. I think 3.18.19 is the more stable code and can be used as the reference for backporting.

Note: The latest available SLES 11 SP4 kernel, version 3.0.101-63-default, still has this issue. The SLES 12 kernel, version 3.12.28-4-default, already has all of these issues fixed.

Thanks Jan, I am closing this bug as it is fixed in the latest available stream.