Bug 95421 - Outdated documentation on posix_fadvise?
Summary: Outdated documentation on posix_fadvise?
Status: RESOLVED CODE_FIX
Alias: None
Product: Documentation
Classification: Unclassified
Component: man-pages
Hardware: All Linux
Importance: P1 normal
Assignee: documentation_man-pages@kernel-bugs.osdl.org
Reported: 2015-03-24 13:02 UTC by Maik Zumstrull
Modified: 2017-01-26 01:47 UTC

Regression: No

Description Maik Zumstrull 2015-03-24 13:02:55 UTC
http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that POSIX_FADV_DONTNEED has no effect on dirty pages. However, if I'm reading https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/mm/fadvise.c?id=v3.19#n115 correctly, it will in fact schedule dirty pages for writeout if the underlying device is not already congested.

Can you please verify actual behavior and update the manpage if indicated?
Comment 1 Michael Kerrisk 2015-05-05 07:39:17 UTC
(In reply to Maik Zumstrull from comment #0)
> http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that
> POSIX_FADV_DONTNEED has no effect on dirty pages.

I'm missing something. Where does the page suggest that?
Comment 2 Maik Zumstrull 2015-05-05 08:57:52 UTC
Last paragraph under NOTES. "Pages that have not yet been written out will be unaffected, so if the application wishes to guarantee that pages will be released, it should call fsync(2) or fdatasync(2) first."

Unless I'm misreading the code, this is not true.
Comment 3 Michael Kerrisk 2017-01-25 23:59:51 UTC
(In reply to Maik Zumstrull from comment #0)
> http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that
> POSIX_FADV_DONTNEED has no effect on dirty pages. However, if I'm reading
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/mm/
> fadvise.c?id=v3.19#n115 correctly, it will in fact schedule dirty pages for
> writeout if the underlying device is not already congested.

So, looking at the code in mm/fadvise.c, we have

    case POSIX_FADV_DONTNEED:
            if (!inode_write_congested(mapping->host))
                    __filemap_fdatawrite_range(mapping, offset, endbyte,
                                               WB_SYNC_NONE);

This suggests that *if* the backing device is not congested, then __filemap_fdatawrite_range() is called. The comments for that function say:

    __filemap_fdatawrite_range - start writeback on mapping dirty pages in range

So, my reading of this is that *maybe* some dirty pages will be written to the backing device by the time that POSIX_FADV_DONTNEED gets to calling invalidate_mapping_pages() whose description says:

/**
 * invalidate_mapping_pages - Invalidate all the unlocked pages of one inode
 * @mapping: the address_space which holds the pages to invalidate
 * @start: the offset 'from' which to invalidate
 * @end: the offset 'to' which to invalidate (inclusive)
 *
 * This function only removes the unlocked pages, if you want to
 * remove all the pages of one inode, you must call truncate_inode_pages.
 *
 * invalidate_mapping_pages() will not block on IO activity. It will not
 * invalidate pages which are dirty, locked, under writeback or mapped into
 * pagetables.
 */

So, my reading of this is that the handling of dirty pages is an optimization. If some pages can be written in time, they will be freed by POSIX_FADV_DONTNEED. But there are no guarantees, so I think the text in the man page:

       Pages that have not yet been written out will be unaffected, so if
       the application wishes to guarantee that pages will  be  released,
       it should call fsync(2) or fdatasync(2) first.

is pretty much okay, and no change is required. 

So, I'm closing this. If you can find further info that contradicts what I've found, please do reopen the bug.

thanks,

Michael
Comment 4 Michael Kerrisk 2017-01-26 01:47:25 UTC
Upon reflection, I decided to tweak the text a bit, to hint about what is going on. The page now contains this paragraph:

              The implementation may attempt to write back dirty pages in
              the specified region, but  this  is  not  guaranteed.   Any
              unwritten  dirty  pages will not be freed.  If the applica‐
              tion wishes to ensure  that  pages  will  be  released,  it
              should call fsync(2) or fdatasync(2) first.
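
The sequence that paragraph recommends could be sketched roughly as follows. This is only an illustration: the helper name drop_file_cache() is hypothetical and appears neither in the man page nor in the kernel. Even after a successful sync, pages that are locked or mapped into page tables may still stay cached.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes fdatasync() and posix_fadvise() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): write back the file's
 * dirty data first, then advise the kernel that its cached pages are no
 * longer needed.
 *
 * Returns 0 on success, or a positive error number on failure.
 */
int drop_file_cache(int fd)
{
    /* Without this step, pages that are still dirty will not be freed. */
    if (fdatasync(fd) == -1)
        return errno;

    /*
     * offset 0 with len 0 covers the whole file. posix_fadvise() returns
     * the error number directly and does not set errno.
     */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

A caller would typically invoke this after finishing a large sequential write, for example drop_file_cache(fd) just before close(fd). The call is advisory only, so a return value of 0 does not prove that every page was released.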
