Bug 15827 - ext4_get_blocks may be called while ext4_truncate() is in progress
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 high
Assignee: fs_ext4@kernel-bugs.osdl.org
 
Reported: 2010-04-21 15:36 UTC by Dmitry Monakhov
Modified: 2010-07-06 07:15 UTC

Kernel Version: v2.6.29-rc5
Regression: No


Attachments
dmesg output (35.55 KB, text/plain)
2010-04-21 15:36 UTC, Dmitry Monakhov
debug patch against ext4.git/next + patches from bug #15792 (7.85 KB, patch)
2010-04-21 15:41 UTC, Dmitry Monakhov

Description Dmitry Monakhov 2010-04-21 15:36:07 UTC
Created attachment 26081
dmesg output

During truncate we may need to restart the transaction. To avoid a deadlock on i_data_sem, the semaphore is dropped around the restart; this was introduced in
commit 487caeef9fc08c0565e082c40a8aaf58dad92bbb
Author: Jan Kara <jack@suse.cz>
Date:   Mon Aug 17 22:17:20 2009 -0400

Jan gave a good explanation of why this approach would work; I have a better explanation of why it can't work.
Yes, we have blocked all writers beyond i_size, but writers before i_size (flush, page_mkwrite)
may still change node blocks, so the 'path' that truncate looked up is no longer valid. So we are in big, big trouble.
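
To illustrate why a cached path goes stale, here is a minimal userspace sketch. It is not ext4 code: `toy_leaf`, `toy_path` and all `toy_*` functions are hypothetical stand-ins, and the leaf is just a sorted array of extent start blocks. It only shows that an insert below the truncate point can shift the entry a previously looked-up path refers to.

```c
#include <assert.h>
#include <stdbool.h>

#define LEAF_CAP 8

/* Toy stand-in for an extent leaf block; not the ext4 on-disk layout. */
struct toy_leaf {
    unsigned start[LEAF_CAP];   /* first logical block of each extent, sorted */
    int count;
};

/* Toy stand-in for a cached lookup result (the "path"). */
struct toy_path {
    int idx;                    /* index of the extent found at lookup time */
    unsigned start;             /* start block of that extent at lookup time */
};

/* Find the last extent whose start block is <= lblk. */
static struct toy_path toy_lookup(const struct toy_leaf *leaf, unsigned lblk)
{
    struct toy_path p = { .idx = -1, .start = 0 };

    for (int i = 0; i < leaf->count && leaf->start[i] <= lblk; i++) {
        p.idx = i;
        p.start = leaf->start[i];
    }
    return p;
}

/* A writer inserts a new extent, shifting all later entries to the right. */
static bool toy_insert(struct toy_leaf *leaf, unsigned start)
{
    int i = leaf->count;

    if (leaf->count == LEAF_CAP)
        return false;
    while (i > 0 && leaf->start[i - 1] > start) {
        leaf->start[i] = leaf->start[i - 1];
        i--;
    }
    leaf->start[i] = start;
    leaf->count++;
    return true;
}

/* True if the cached path no longer points at the extent it was built for. */
static bool toy_path_stale(const struct toy_leaf *leaf, const struct toy_path *p)
{
    return p->idx < 0 || p->idx >= leaf->count ||
           leaf->start[p->idx] != p->start;
}

/* Returns 1 if a write below the truncate point invalidated the path. */
static int toy_demo(void)
{
    struct toy_leaf leaf = { .start = { 0, 100, 200 }, .count = 3 };
    struct toy_path p = toy_lookup(&leaf, 150);   /* "truncate" looks up */

    assert(!toy_path_stale(&leaf, &p));           /* valid right after lookup */
    toy_insert(&leaf, 50);                        /* writer runs during restart */
    return toy_path_stale(&leaf, &p) ? 1 : 0;
}
```

In this toy, the path found for block 150 points at index 1; after the writer inserts an extent at block 50, index 1 holds a different extent, so anything that keeps using the old index operates on the wrong entry.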

I've created an inode history tracer patch, which spotted the issue.
See attachments.
Comment 1 Dmitry Monakhov 2010-04-21 15:41:13 UTC
Created attachment 26082
debug patch against ext4.git/next + patches from bug #15792

The debug patch is rather ugly, but still useful.
Comment 2 Dmitry Monakhov 2010-04-21 15:55:15 UTC
In the given example, both the alloc and truncate tasks try to modify the same path:
path:  218:[1]33:108506
So ext4_ext_rm_leaf goes crazy because the leaf node was modified.

Possible solutions:
1) Truncate task: drop the collected path if ext4_ext_truncate_extend_restart results in journal_restart, and retry the lookup.
2) Save the current truncate path somewhere in the inode, so that get_block can check
   for conflicts and let the truncate task know that the path is no longer valid.
   If so, the truncate task must go back to (1).
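
The two options can be sketched together in userspace C. This is only an illustration under stated assumptions, not the proposed kernel patch: `toy_tree`, `toy_path`, the `gen` counter and the restart hook are all hypothetical. The generation counter plays the role of "let the truncate task know the path is no longer valid" (option 2), and the `continue` is the "drop the path and retry the lookup" step (option 1).

```c
#include <assert.h>
#include <stdbool.h>

#define LEAF_CAP 8

/* Toy extent tree: one sorted leaf plus a modification counter. */
struct toy_tree {
    unsigned start[LEAF_CAP];   /* sorted extent start blocks */
    int count;
    unsigned long gen;          /* bumped on every modification */
};

struct toy_path {
    int idx;                    /* index of the extent to remove, -1 if none */
    unsigned long gen;          /* tree generation at lookup time */
};

/* Hypothetical hook: runs while the lock is dropped for a restart. */
typedef void (*toy_restart_hook)(struct toy_tree *t);

static bool toy_insert(struct toy_tree *t, unsigned start)
{
    int i = t->count;

    if (t->count == LEAF_CAP)
        return false;
    while (i > 0 && t->start[i - 1] > start) {
        t->start[i] = t->start[i - 1];
        i--;
    }
    t->start[i] = start;
    t->count++;
    t->gen++;                   /* tell any path holder the tree changed */
    return true;
}

/* Look up the last extent, if it starts at or after new_size. */
static struct toy_path toy_lookup_last(const struct toy_tree *t, unsigned new_size)
{
    struct toy_path p = { .idx = -1, .gen = t->gen };

    if (t->count > 0 && t->start[t->count - 1] >= new_size)
        p.idx = t->count - 1;
    return p;
}

/*
 * Remove every extent starting at or after new_size.
 * Returns how many times a restart forced a fresh lookup.
 */
static int toy_truncate(struct toy_tree *t, unsigned new_size, toy_restart_hook hook)
{
    int retries = 0;

    for (;;) {
        struct toy_path p = toy_lookup_last(t, new_size);

        if (p.idx < 0)
            return retries;
        if (hook)
            hook(t);            /* simulated restart: other tasks may run */
        if (p.gen != t->gen) {
            retries++;          /* path is stale: drop it and look up again */
            continue;
        }
        t->count = p.idx;       /* path still valid: remove the last extent */
        t->gen++;
    }
}

/* Example writer: inserts one extent below the truncate point, once. */
static bool toy_writer_done;

static void toy_writer_once(struct toy_tree *t)
{
    if (!toy_writer_done) {
        toy_writer_done = true;
        toy_insert(t, 50);
    }
}
```

Calling `toy_truncate(&t, 150, toy_writer_once)` on a tree with extents starting at 0, 100, 200 and 300 forces one re-lookup and leaves the extents at 0, 50 and 100. A real fix would also have to bound or serialize retries; this sketch relies on the writer running only once.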
Comment 3 Dmitry Monakhov 2010-04-22 04:36:10 UTC
Proposed patch
http://patchwork.ozlabs.org/patch/50687/

BTW, IMHO it would also be reasonable to replace the simple
printk("strange request..") with something more scary,
like EXT4_ERROR_INODE(inode, "strange request..")
