Bug 201461
Summary: | ext4 journal stalls write system call | ||
---|---|---|---|
Product: | File System | Reporter: | zhenbianshu |
Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | tytso, zhenbianshu |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.10.0-229.el7.x86_64 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
zhenbianshu
2018-10-17 12:56:38 UTC
It shouldn't be the case that the journal's data=ordered writeback would be taking the i_data semaphore for writing. I'm guessing what is happening is that journal is causing a lot of I/O operations because of data=ordered mode (e.g., not data=writeback), and this is delaying a read which a writeback thread was trying to do while it was holding a write lock on the inode's semaphore. (In reply to Theodore Tso from comment #1) > It shouldn't be the case that the journal's data=ordered writeback would be > taking the i_data semaphore for writing. > > I'm guessing what is happening is that journal is causing a lot of I/O > operations because of data=ordered mode (e.g., not data=writeback), and this > is delaying a read which a writeback thread was trying to do while it was > holding a write lock on the inode's semaphore. can you please show me the related source code of writeback holding a write lock? The writeback code will end up calling ext4_writepages(), which will do block allocations via ext4_map_blocks() with the flag EXT4_GET_BLOCKS_CREATE(), and this will take a write lock on i_data_sem. I'll let you trace through the call chain that starts in the writeback thread, through the call to ext4_writepages(), proceeds to mpage_map_one_extent(), and from there to ext4_map_blocks() which actually holds the write block. Note that this is distinct from the data=ordered forced writeblock which starts in fs/jbd2's journal_submit_inode_data_buffer() calling generic_writepages(). The difference is that generic_writepages() will only call writepage() function. For ext4 that's ext4_writepage(), which will never do block allocation. But the writeback threads call the address space operation's, writepages() function, and ext4_writepages() *will* do block allocation for delayed allocation writes. This is a subtle point; ext4_writepage() and ext4_writepages() do very different things. The first is used by the commit kernel thread; the second is used by the writeback kernel thread. (In reply to Theodore Tso from comment #3) > The writeback code will end up calling ext4_writepages(), which will do > block allocations via ext4_map_blocks() with the flag > EXT4_GET_BLOCKS_CREATE(), and this will take a write lock on i_data_sem. > > I'll let you trace through the call chain that starts in the writeback > thread, through the call to ext4_writepages(), proceeds to > mpage_map_one_extent(), and from there to ext4_map_blocks() which actually > holds the write block. > > Note that this is distinct from the data=ordered forced writeblock which > starts in fs/jbd2's journal_submit_inode_data_buffer() calling > generic_writepages(). The difference is that generic_writepages() will > only call writepage() function. For ext4 that's ext4_writepage(), which > will never do block allocation. > > But the writeback threads call the address space operation's, writepages() > function, and ext4_writepages() *will* do block allocation for delayed > allocation writes. > > This is a subtle point; ext4_writepage() and ext4_writepages() do very > different things. The first is used by the commit kernel thread; the > second is used by the writeback kernel thread. I did not find ext4_writepages() or mpage_map_one_extent() in linux 3.10.0 source code, what I found is ext4_da_writepages(). And the call chain: journal_submit_data_buffers() journal_submit_inode_data_buffers() generic_writepages() ext4_da_writepages() write_cache_page_da() mpage_da_map_and_submit() ext4_map_blocks() will take the write lock of i_data_sem. Will generic_writepages() call ext4_da_writepages() in jbd2 ? Keep in mind that 3.10 was released over ***five years*** ago. Quite frankly, using a 3.10 kernel borders on malpractice, since it has so many unpatched security holes it isn't funny. If you are talking about a RHEL kernel, it has diverged significantly from mainline and whether or not it is a security disaster is between you and your Red Hat support contract. :-) My comments apply mainly to the the upstream development kernel, but to answer your questions: No, in 3.10, generic_writepages() does not call ext4_da_writepages(). And ext4_da_writepages roughly corresponds to what ext4_writepages() in non-obsolete kernels. A huge number of things have changed between 3.10 and more modern kernels, including many performance enhancements. So some of the details have changed. (In reply to Theodore Tso from comment #5) > Keep in mind that 3.10 was released over ***five years*** ago. Quite > frankly, using a 3.10 kernel borders on malpractice, since it has so many > unpatched security holes it isn't funny. If you are talking about a RHEL > kernel, it has diverged significantly from mainline and whether or not it is > a security disaster is between you and your Red Hat support contract. :-) > > My comments apply mainly to the the upstream development kernel, but to > answer your questions: > > No, in 3.10, generic_writepages() does not call ext4_da_writepages(). And > ext4_da_writepages roughly corresponds to what ext4_writepages() in > non-obsolete kernels. > > A huge number of things have changed between 3.10 and more modern kernels, > including many performance enhancements. So some of the details have > changed. I got it, thank you very much! |