Kernel Bug Tracker – Bug 15827
ext4_get_blocks may be called while ext4_truncate() is in progress
Last modified: 2010-07-06 07:15:00 UTC
Created attachment 26081 [details]
During truncate we may need to restart the transaction; to avoid a deadlock, i_data_sem was dropped across the restart.
Author: Jan Kara <email@example.com>
Date: Mon Aug 17 22:17:20 2009 -0400
Jan gave a good explanation of why this approach would work; I have a better explanation of why it can't work.
Yes, we have blocked all writers beyond i_size, but writers (flush, page_mkwrite)
before i_size may still change node blocks, so the 'path' that was looked up by truncate is no longer valid. So we are in big trouble.
I've created an inode history tracer patch, which spotted the issue.
Created attachment 26082 [details]
debug patch against ext4.git/next + patches from bug #15792
The debug patch is rather ugly, but still useful.
In the given example, both the alloc and truncate tasks try to modify the same path,
so ext4_ext_rm_leaf goes crazy because the leaf node was modified.
1) Truncate task: drop the collected path if ext4_ext_truncate_extend_restart results in a journal_restart, and retry the lookup.
2) Save the current truncate path somewhere in the inode, so get_block can check
for conflicts and let the truncate task know that the path is no longer valid.
If so, the truncate task must go back to (1).
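To make the idea concrete, here is a rough userspace model of (1) and (2). It is only a sketch, not ext4 code: every toy_* name is hypothetical, the leaf is reduced to an entry count, and a generation counter stands in for "save the path somewhere in the inode" (an assumption of mine, not something from the patches). It also pretends that the transaction restarts before every removal step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in ext4. */
struct toy_inode {
    unsigned long tree_gen; /* bumped on every extent tree modification */
    int leaf_entries;       /* stand-in for the contents of a leaf node */
};

struct toy_path {
    unsigned long gen;      /* tree_gen observed when the path was looked up */
    int leaf_entries;       /* snapshot of the leaf at lookup time */
};

/* Look up the path and remember which tree generation it belongs to. */
static void toy_lookup(const struct toy_inode *inode, struct toy_path *path)
{
    path->gen = inode->tree_gen;
    path->leaf_entries = inode->leaf_entries;
}

/* Models a writer below i_size (flush, page_mkwrite) allocating a block. */
static void toy_get_blocks(struct toy_inode *inode)
{
    inode->leaf_entries++;
    inode->tree_gen++;      /* (2): lets truncate see that its path is stale */
}

static bool toy_path_valid(const struct toy_inode *inode,
                           const struct toy_path *path)
{
    return path->gen == inode->tree_gen;
}

/*
 * Models truncate removing one leaf entry per step, with a transaction
 * restart before each step. If writer_during_restart is set, a writer
 * slips in during the first restart window. Returns how many times the
 * lookup had to be redone.
 */
static int toy_truncate(struct toy_inode *inode, bool writer_during_restart)
{
    struct toy_path path;
    int relookups = 0;

    toy_lookup(inode, &path);
    while (inode->leaf_entries > 0) {
        /* Restart window: the lock is dropped, so a writer may run. */
        if (writer_during_restart) {
            toy_get_blocks(inode);
            writer_during_restart = false;
        }
        /* (1): after the restart, drop a stale path and retry the lookup. */
        if (!toy_path_valid(inode, &path)) {
            toy_lookup(inode, &path);
            relookups++;
        }
        /* The cached path must match the tree before we modify it. */
        assert(path.leaf_entries == inode->leaf_entries);

        /* Remove one entry; our own change keeps the path current. */
        inode->leaf_entries--;
        inode->tree_gen++;
        path.leaf_entries--;
        path.gen = inode->tree_gen;
    }
    return relookups;
}
```

Calling toy_truncate() on an inode with three entries and no concurrent writer needs no second lookup. With a writer in the restart window, the stale path is detected and looked up once more before any entry is removed, instead of operating on a modified leaf.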
BTW, IMHO it would also be reasonable to replace the simple
printk("strange request..") with something more scary,
like EXT4_ERROR_INODE(inode, "strange request..")