http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that POSIX_FADV_DONTNEED has no effect on dirty pages. However, if I'm reading https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/mm/fadvise.c?id=v3.19#n115 correctly, it will in fact schedule dirty pages for writeout if the underlying device is not already congested. Can you please verify actual behavior and update the manpage if indicated?
(In reply to Maik Zumstrull from comment #0)
> http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that
> POSIX_FADV_DONTNEED has no effect on dirty pages.

I'm missing something. Where does the page suggest that?
Last paragraph under NOTES. "Pages that have not yet been written out will be unaffected, so if the application wishes to guarantee that pages will be released, it should call fsync(2) or fdatasync(2) first." Unless I'm misreading the code, this is not true.
(In reply to Maik Zumstrull from comment #0)
> http://man7.org/linux/man-pages/man2/posix_fadvise.2.html suggests that
> POSIX_FADV_DONTNEED has no effect on dirty pages. However, if I'm reading
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/mm/fadvise.c?id=v3.19#n115
> correctly, it will in fact schedule dirty pages for
> writeout if the underlying device is not already congested.

So, looking at the code in mm/fadvise.c, we have

    case POSIX_FADV_DONTNEED:
        if (!inode_write_congested(mapping->host))
            __filemap_fdatawrite_range(mapping, offset, endbyte,
                                       WB_SYNC_NONE);

This suggests that *if* the backing device is not congested, then __filemap_fdatawrite_range() is called. The comments for that function say:

    __filemap_fdatawrite_range - start writeback on mapping dirty pages in range

So, my reading of this is that *maybe* some dirty pages will have been written to the backing device by the time that POSIX_FADV_DONTNEED gets to calling invalidate_mapping_pages(), whose description says:

    /**
     * invalidate_mapping_pages - Invalidate all the unlocked pages of one inode
     * @mapping: the address_space which holds the pages to invalidate
     * @start: the offset 'from' which to invalidate
     * @end: the offset 'to' which to invalidate (inclusive)
     *
     * This function only removes the unlocked pages, if you want to
     * remove all the pages of one inode, you must call truncate_inode_pages.
     *
     * invalidate_mapping_pages() will not block on IO activity. It will not
     * invalidate pages which are dirty, locked, under writeback or mapped into
     * pagetables.
     */

So, my reading of this is that the handling of dirty pages is an optimization. If some pages can be written back in time, they will be freed by POSIX_FADV_DONTNEED. But there are no guarantees, so I think the text in the man page:

    Pages that have not yet been written out will be unaffected, so if
    the application wishes to guarantee that pages will be released, it
    should call fsync(2) or fdatasync(2) first.

is pretty much okay, and no change is required. So, I'm closing this. If you can find further info that contradicts what I've found, please do reopen the bug.

Thanks,

Michael
Upon reflection, I decided to tweak the text a bit, to hint at what is going on. The page now contains this paragraph:

    The implementation may attempt to write back dirty pages in the
    specified region, but this is not guaranteed. Any unwritten dirty
    pages will not be freed. If the application wishes to ensure that
    pages will be released, it should call fsync(2) or fdatasync(2)
    first.
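
For anyone landing on this bug later, the pattern the revised text recommends can be sketched roughly as follows. This is only an illustration: the helper name flush_and_drop_cache() is made up for this comment and appears in neither the man page nor the kernel. It assumes the caller wants to drop the cache for the whole file, hence offset 0 and len 0.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): write back dirty
 * data for fd, then advise the kernel that the cached pages of the
 * whole file are no longer needed.
 *
 * Returns 0 on success, or an error number on failure.
 */
int flush_and_drop_cache(int fd)
{
    /* Write dirty pages out first, so that none of them is still
     * dirty when POSIX_FADV_DONTNEED is processed. fdatasync()
     * reports errors via errno. */
    if (fdatasync(fd) != 0)
        return errno;

    /* offset 0 with len 0 covers the file from start to end.
     * posix_fadvise() returns the error number directly and does
     * not set errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Even with the fdatasync(2) call, going by the invalidate_mapping_pages() comment quoted above, pages that are locked or mapped into page tables would still be skipped, so this should be read as "removes the dirty-page obstacle" rather than as a hard guarantee.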