Created attachment 26081 [details]
dmesg output

During truncate we may need to restart the transaction. To avoid a deadlock on i_data_sem, the semaphore is dropped during the restart; this was done in

commit 487caeef9fc08c0565e082c40a8aaf58dad92bbb
Author: Jan Kara <jack@suse.cz>
Date: Mon Aug 17 22:17:20 2009 -0400

Jan gave a good explanation of why this approach would work; I have a better explanation of why it can't work. Yes, we have blocked all writers beyond i_size, but writers before i_size (flush, page_mkwrite) may still change node blocks, so the 'path' that truncate looked up is no longer valid. So we are in big trouble.

I've created an inode history tracer patch, which spotted the issue. See attachments.
Created attachment 26082 [details]
debug patch against ext4.git/next + patches from bug #15792

The debug patch is rather ugly, but still useful.
In the given example both the alloc and truncate tasks try to modify the same path:

path: 218:[1]33:108506

So ext4_ext_rm_leaf goes crazy because the leaf node was modified.

Possible solutions:
1) Truncate task: drop the collected path if ext4_ext_truncate_extend_restart results in a journal restart, and retry the lookup.
2) Save the current truncate path somewhere in the inode, so that get_block can check for conflicts and let the truncate task know that its path is no longer valid. In that case the truncate task must go back to (1).
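
To illustrate solution (1), here is a minimal userspace sketch. It is not ext4 code: every name in it (toy_leaf, toy_path, toy_lookup, toy_restart, toy_truncate) is hypothetical, and the generation counter is only an assumption used to make staleness visible. It shows the truncate loop dropping its cached path and redoing the lookup whenever a restart happened, since a writer may have changed the leaf while the lock was dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an extent tree leaf. NOT the real ext4 structures. */
struct toy_leaf {
    unsigned long generation;   /* bumped on every modification */
    int nr_extents;             /* number of extents in the leaf */
};

/* Toy stand-in for the cached lookup result (the "path"). */
struct toy_path {
    struct toy_leaf *leaf;
    unsigned long generation;   /* leaf generation seen at lookup time */
    int last_idx;               /* index of the last extent at lookup time */
};

static void toy_lookup(struct toy_leaf *leaf, struct toy_path *path)
{
    path->leaf = leaf;
    path->generation = leaf->generation;
    path->last_idx = leaf->nr_extents - 1;
}

/* Simulated writer below i_size (think flush or page_mkwrite). */
static void toy_writer_insert(struct toy_leaf *leaf)
{
    leaf->nr_extents++;
    leaf->generation++;
}

/*
 * Simulated transaction restart. The lock is assumed to be dropped here,
 * so a writer may modify the leaf. Returns true if a restart happened.
 */
static bool toy_restart(struct toy_leaf *leaf, bool need_restart,
                        bool writer_races)
{
    if (!need_restart)
        return false;
    if (writer_races)
        toy_writer_insert(leaf);
    return true;
}

/*
 * Remove all extents from the leaf, last to first. A single restart is
 * simulated when the cached last index equals restart_at (pass -1 for
 * no restart). Returns the number of lookups performed.
 */
static int toy_truncate(struct toy_leaf *leaf, int restart_at,
                        bool writer_races)
{
    struct toy_path path;
    bool restarted_once = false;
    int lookups = 0;

    toy_lookup(leaf, &path);
    lookups++;

    while (path.last_idx >= 0) {
        bool need = !restarted_once && path.last_idx == restart_at;

        if (toy_restart(leaf, need, writer_races)) {
            restarted_once = true;
            /* Solution (1): the old path may be stale, so look it up again. */
            toy_lookup(leaf, &path);
            lookups++;
            continue;
        }

        /* The path must still describe the leaf we are about to modify. */
        assert(path.generation == leaf->generation);

        leaf->nr_extents--;
        leaf->generation++;
        path.generation = leaf->generation;
        path.last_idx--;
    }
    return lookups;
}
```

Without the re-lookup after toy_restart(), the assertion on the generation would fire as soon as the simulated writer races with the truncate, which is the toy equivalent of ext4_ext_rm_leaf working on a modified leaf.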
Proposed patch: http://patchwork.ozlabs.org/patch/50687/

BTW, IMHO it would also be reasonable to replace the simple printk("strange request..") with something more scary, like EXT4_ERROR_INODE(inode, "strange request..")