Bug 87721

Summary: fanotify: ignore marks not respected
Product: File System Reporter: Heinrich Schuchardt (xypron.glpk)
Component: OtherAssignee: Jan Kara (jack)
Status: CLOSED CODE_FIX    
Severity: normal CC: eparis, jack, xypron.glpk
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.17.1 Subsystem:
Regression: No Bisected commit-id:
Attachments: Example showing the incorrect behavior
example output of test file
[PATCH] fanotify: Fix notification of groups with inode and mount marks
[PATCH 1/1] fanotify: ignore marks not respected

Description Heinrich Schuchardt 2014-11-04 18:48:47 UTC
Created attachment 156521 [details]
Example showing the incorrect behavior

The fanotify API allows to mark mounts, directories, and files for notification. Furthermore it allows to create marks with an ignore mask to indicate which events shall be ignored.

Function fsnotify_add_inode_mark() stores marks for inodes as list in attribute i_list of the inode. The list is sorted by priority and group.

Function fsnotify_add_vfsmount_mark() stores marks for mounts as list in attribute m_list of the vfsmount. The list is sorted by priority and group.

If an event occurs function fsnotify() is called. This function loops over all mount and inode marks and sends the event to the group using function send_to_group().

For an event and a group both an inode mark and a vfsmount mark may coexist. E.g. a user program may want to monitor one mount for FAN_MODIFY events but wants to ignore FAN_MODIFY events for some file.

If for an event and a group both an inode mark and a vfsmount mark exist it is important to call send_to_group for both marks as once. Otherwise ignore masks bits set on the one mark cannot be considered for the other mark.

Function fsnotify() assumes that m_list and i_list are sorted by group and ignores the sorting by priority. Hence some ignore marks may not be correctly considered.

The attachement contains an example demonstrating this behavior. In the example 9 fanotify groups are created. Each watches the same mount. For six of the groups an ignore inode mark for a file is created. Hence only 3 groups should receive events for changes to the ignored file.

When the example code is run repeatedly the number of groups for which events are thrown varies randomly. The result depends on the memory addresses assigned to newly assigned groups.
Comment 1 Heinrich Schuchardt 2014-11-04 18:56:24 UTC
Created attachment 156531 [details]
example output of test file
Comment 2 Jan Kara 2014-11-05 11:55:00 UTC
Thanks for report and the testcase. I agree with your analysis. I'll attach a patch which should fix this in a moment.
Comment 3 Jan Kara 2014-11-05 12:04:00 UTC
Created attachment 156621 [details]
[PATCH] fanotify: Fix notification of groups with inode and mount marks

This patch fixes the problem to me. Can you test it as well? Thanks.
Comment 4 Jan Kara 2014-11-05 12:04:26 UTC
Sorry, marked attachment as private initially.
Comment 5 Heinrich Schuchardt 2014-11-05 19:36:32 UTC
Created attachment 156781 [details]
[PATCH 1/1] fanotify: ignore marks not respected

https://lkml.org/lkml/2014/11/5/714
Comment 6 Heinrich Schuchardt 2014-11-05 20:53:57 UTC
(In reply to Jan Kara from comment #3)
> Created attachment 156621 [details]
> [PATCH] fanotify: Fix notification of groups with inode and mount marks
> 
> This patch fixes the problem to me. Can you test it as well? Thanks.

I receive
"Sorry, you are not authorized to access attachment #156621 [details]."
Comment 7 Jan Kara 2014-11-05 21:08:37 UTC
Strange. The attachment was still private. I disabled that again so you should be hopefully able to see it now.
Comment 8 Heinrich Schuchardt 2014-11-06 07:03:43 UTC
(In reply to Jan Kara from comment #3)
> Created attachment 156621 [details]
> [PATCH] fanotify: Fix notification of groups with inode and mount marks
> 
> This patch fixes the problem to me. Can you test it as well? Thanks.

I compiled the patch against Kernel 3.18-rc3. My test case runs fine. So you can add
Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>

fsnotify_compare_groups is called in a loop for every event of an inode.
The loop runs up to FANOTIFY_DEFAULT_MAX_LISTENERS (= 128) times per user.
So I suggest to declare the function as
static inline int fsnotify_compare_groups(...)

The comments in front of the function should explain why the marks are sorted by priority. You could add a comment like:
/*
 * Fanotify_init provides different notification classes. Events shall be
 * passed to listeners in sequence of FAN_CLASS_PRE_CONTENT, FAN_CLASS_CONTENT,
 * FAN_CLASS_NOTIF. As the notification class is mapped to field priority the
 * marks have to be sorted by decreasing values of this field.
 *
 * Correct handling of the ignore mask requires processing inode and vfsmount
 * marks of each group together. Using the group address as further sort
 * criterion provides a unique sorting order.
 *
 * A return value of 1 signifies that b has priority over a.
 * A return value of 0 signifies that the two marks have to be handled
 * together.
 * A return value of -1 signifies that a has priority over b.
 */

I personally prefer to read a == NULL instead of !a because the former notation clearly shows we are dealing with pointers. But that is a question of taste only.

Andrew Morton already queued my patch for the MM tree. I am not sure if you have to base your patch on mine to make Andrew's life easier.

I clearly favor your solution with the additional compare function.