Bug 65701

Summary: oops: fs/ext4/ext4_jbd2.c
Product: File System
Reporter: loco
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: NEW
Severity: normal
CC: alan, dmonakhov, jack
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 3.11.9-gentoo
Subsystem:
Regression: No
Bisected commit-id:
Attachments: this patch should fix an issue

Description loco 2013-11-24 23:44:01 UTC
The following oops happened shortly after kernel upgrade:
 
[18592.166266] ------------[ cut here ]------------
[18592.166279] WARNING: CPU: 3 PID: 4273 at fs/ext4/ext4_jbd2.c:48 ext4_journal_check_start+0x24/0x67()
[18592.166282] Modules linked in: xt_multiport vmwgfx cfbfillrect cfbimgblt cfbcopyarea fb fbdev ttm drm i2c_core agpgart shpchp evdev libcrc32c aes_x86_64 sha256_generic iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi tg3 ptp pps_core e1000 fuse nfs lockd sunrpc reiserfs linear raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 dm_snapshot dm_crypt megaraid_sas megaraid_mbox megaraid_mm megaraid hpsa cciss
[18592.166378] CPU: 3 PID: 4273 Comm: master Not tainted 3.11.9-gentoo-mrbyte #1
[18592.166382] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 06/22/2012
[18592.166385]  0000000000000000 ffffffff813718c8 0000000000000000 ffffffff81031607
[18592.166390]  ffffffff8114b7ac ffff880428c4d000 0000000000000002 0000000000000000
[18592.166394]  0000000000000001 ffffffff8114b7ac ffff880428c4d000 ffffffff8114b82f
[18592.166399] Call Trace:
[18592.166407]  [<ffffffff813718c8>] ? dump_stack+0x41/0x51
[18592.166415]  [<ffffffff81031607>] ? warn_slowpath_common+0x6f/0x84
[18592.166420]  [<ffffffff8114b7ac>] ? ext4_journal_check_start+0x24/0x67
[18592.166426]  [<ffffffff8114b7ac>] ? ext4_journal_check_start+0x24/0x67
[18592.166431]  [<ffffffff8114b82f>] ? __ext4_journal_start_sb+0x1b/0x65
[18592.166439]  [<ffffffff81136bbf>] ? ext4_dirty_inode+0x20/0x4f
[18592.166446]  [<ffffffff810e36ec>] ? __mark_inode_dirty+0x27/0x189
[18592.166452]  [<ffffffff810da24a>] ? update_time+0xa1/0xa8
[18592.166458]  [<ffffffff810340ca>] ? current_fs_time+0x20/0x23
[18592.166463]  [<ffffffff810daa2a>] ? file_update_time+0x92/0xb1
[18592.166469]  [<ffffffff810ce9a8>] ? pipe_write+0x40e/0x450
[18592.166475]  [<ffffffff810c7ea4>] ? do_sync_write+0x52/0x79
[18592.166481]  [<ffffffff810c8462>] ? vfs_write+0xa7/0x10b
[18592.166486]  [<ffffffff810c8aa0>] ? SyS_write+0x41/0x75
[18592.166492]  [<ffffffff8137723d>] ? tracesys+0xd4/0xd9
[18592.166495] ---[ end trace f01209520ef673d6 ]---
Comment 1 Dmitry Monakhov 2013-11-25 09:38:07 UTC
In fact this is not an ext4 issue; it is a generic fs issue.
pipe_write() does not acquire sb_start_write(). As far as I understand, this is because a pipe should not touch the fs's data. But that is not completely correct, because the pipe wants to update the file's time. Let's use the same trick as we use in touch_atime():
pipe_write() will call sb_start_write_trylock() and skip the time update on a frozen fs.
See proposed patch.
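
The idea described above can be sketched in user space as follows. This is only an illustrative model, not the attached patch and not kernel code: every name prefixed with toy_ is hypothetical, and the frozen flag is a stand-in for the real freeze state tracked in the superblock. It shows the pattern of trying to take write access without blocking and skipping the timestamp update when the filesystem is frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a superblock (not the kernel struct). */
struct toy_sb {
    bool frozen;   /* true while the filesystem is frozen */
    int  writers;  /* number of active write-side references */
};

/* Hypothetical stand-in for the FIFO's inode. */
struct toy_inode {
    struct toy_sb *sb;
    long mtime;    /* last modification time, arbitrary units */
};

/* Models a try-lock: fail immediately on a frozen fs instead of blocking. */
static bool toy_start_write_trylock(struct toy_sb *sb)
{
    assert(sb != NULL);
    if (sb->frozen)
        return false;
    sb->writers++;
    return true;
}

/* Drops the write-side reference taken by toy_start_write_trylock(). */
static void toy_end_write(struct toy_sb *sb)
{
    assert(sb != NULL && sb->writers > 0);
    sb->writers--;
}

/*
 * Models the tail of a pipe write. The pipe data itself never touches
 * the filesystem, so only the timestamp update needs write access.
 * Returns true if the time was updated, false if it was skipped.
 */
static bool toy_pipe_update_time(struct toy_inode *inode, long now)
{
    if (!toy_start_write_trylock(inode->sb))
        return false;              /* frozen fs: skip the update */
    inode->mtime = now;
    toy_end_write(inode->sb);
    return true;
}
```

In this model a write to the pipe still succeeds while the filesystem is frozen; only the timestamp is left stale, so the journal is never started on a frozen superblock.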
Comment 2 Dmitry Monakhov 2013-11-25 09:39:00 UTC
Created attachment 115861 [details]
this patch should fix an issue
Comment 3 Jan Kara 2013-11-25 22:15:26 UTC
The patch looks good to me. Feel free to add
Reviewed-by: Jan Kara <jack@suse.cz>
Can you send it to Al Viro for inclusion? Thanks!
Comment 4 Dmitry Monakhov 2013-11-26 08:07:55 UTC
BTW initial discussion started here:
http://marc.info/?t=134876094300003&r=1&w=2

# Original test case
mkfifo /mnt/test/fifo
echo foo > /mnt/test/fifo &
fsfreeze -f /mnt/test
cat /mnt/test/fifo