Bug 11840 - ext4: __jbd2_log_wait_for_space: no transactions
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: Theodore Tso
Reported: 2008-10-25 08:40 UTC by François Valenduc
Modified: 2009-01-17 15:24 UTC (History)
Kernel Version: 2.6.28-rc1
Tree: Mainline
Regression: ---


Description François Valenduc 2008-10-25 08:40:41 UTC
Latest working kernel version: unknown
Earliest failing kernel version: unknown
Distribution: Gentoo
Hardware Environment: ACER Travelmate 4000
Software Environment: Gentoo
Problem Description:

I have tried the new ext4 filesystem and ran into a very annoying problem. I tried to convert a reiserfs partition to ext4. This partition is in fact an LVM2 volume. I made a compressed tar file of the partition, then formatted it and created an ext4 filesystem on it. When I untar the archive to restore the files, many lines like these start flooding dmesg:

Oct 23 22:36:25 ordi-francois [<e1491533>] ext4_da_writepages+0x283/0x2b0 [ext4]
Oct 23 22:36:25 ordi-francois ext4_da_writepages: jbd2_start: 1024 pages, ino 40898; err -30

Then, the filesystem is remounted read-only. It also occurs with the ext4-stable tree for kernel 2.6.27.

Does anybody know a solution to this annoying problem?


Steps to reproduce:
Comment 1 François Valenduc 2008-10-26 05:40:47 UTC
It seems the error with ext4_da_writepages goes away if I enable huge file support in mke2fs.conf, although I don't plan to use files of more than 1 TB on this device. Unfortunately, uncompressing the tar archive still stops at some point. When it stops, only 320 MB is used on the logical volume, which has a size of 1 GB. When I try to create a file, it reports that there is no more free space on the filesystem!
Can somebody explain why only 1/3 of the total volume can be used? I have tried inode sizes of 128 and 256 bytes, but this doesn't change anything.
Comment 2 François Valenduc 2008-10-28 12:30:25 UTC
In fact, the problem only occurs on a 32-bit computer. I have tested it on a machine with an Intel Core 2 Duo running a 64-bit kernel, and the problem doesn't occur there.
I found this patch on osdir.com: http://osdir.com/ml/file-systems.ext4/2007-11/msg00200.htm

This is dated November 2007. I would have guessed that the problem had been resolved since then, but that doesn't seem to be the case.
Thanks for your help.
Comment 3 Theodore Tso 2008-11-02 13:52:41 UTC
Hi Francois,

I tried sending e-mail to you, but your e-mail apparently has an over-active spam filter (or your e-mail is incorrect).  I'm sending this as a bugzilla note in the hopes you will receive it.

Hi,

        Sorry no one responded, but while the Bugzilla system is great
for tracking bugs, it's not a good place to call attention to a bug.
In fact, I'm not sure that fs_ext4@kernel-bugs.osdl.org forwards
anywhere sane (I'm checking up on that, though).  In general it's a
better idea to send e-mail to either linux-kernel@vger.kernel.org or
(preferred) linux-ext4@vger.kernel.org.

        From your description, the ext4 filesystem thought there was
some kind of on-disk corruption or other problem with your filesystem,
and remounted it read-only.  That's consistent with your report of the
messages in the syslog of:

ext4_da_writepages: jbd2_start: 1024 pages, ino 40898; err -30

There should have been something in your logs *before* all of the err
-30 complaints, however.  Was this your root filesystem, or do you
have this saved in a logfile in /var/log?  If so, can you search back
for something that occurs before all of the ext4_da_writepages error
messages?

                                                        - Ted
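
For readers unfamiliar with the "err -30" in the message above: kernel functions return negative errno values, and on Linux errno 30 is EROFS (read-only file system), which matches the read-only remount described. A minimal sketch of the lookup follows; the helper name decode_kernel_err is made up for illustration and is not part of any kernel or ext4 tool.

```python
import errno
import os


def decode_kernel_err(code: int) -> str:
    """Return the symbolic errno name for a kernel-style (negative) error code.

    Hypothetical helper for illustration; it relies on the errno table of the
    platform Python runs on, so it assumes a Linux host.
    """
    return errno.errorcode.get(abs(code), "UNKNOWN")


if __name__ == "__main__":
    # The log line reports "err -30"; look up what that number means.
    print(decode_kernel_err(-30), "-", os.strerror(30))
```

The point is only that the err -30 lines are a symptom of the filesystem already being read-only, so the cause has to be logged earlier.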
Comment 4 François Valenduc 2008-11-04 12:31:42 UTC
Following the discussion we had on the mailing list, it appears that the problem has nothing to do with 32- or 64-bit systems. So I put the original title back.
The errors I saw were the following:

Oct 23 22:35:58 ordi-francois ext4_abort called.
Oct 23 22:35:58 ordi-francois EXT4-fs error (device dm-1): ext4_journal_start_sb: Detected aborted journal
Oct 23 22:36:25 ordi-francois ext4_da_writepages: jbd2_start: 1024 pages, ino 39936; err -30
Oct 23 22:36:25 ordi-francois [<e1491533>] ext4_da_writepages+0x283/0x2b0 [ext4]
Oct 23 22:36:25 ordi-francois ext4_da_writepages: jbd2_start: 1024 pages, ino 39938; err -30
Oct 23 22:36:25 ordi-francois [<e1491533>] ext4_da_writepages+0x283/0x2b0 [ext4]

But it doesn't seem to happen any more.
Comment 5 Linux ext4 mailing list 2008-11-04 12:40:39 UTC
Are you sure there wasn't something before "ext4_abort called"?  Say, such as:

Nov  2 22:20:53 taz kernel: __jbd2_log_wait_for_space: no transactions
Nov  2 22:20:53 taz kernel: Aborting journal on device dm-0:8.
Nov  2 22:20:57 taz kernel: ext4_abort called.

If so, this is a known problem, and we have a bug fix.  The ext3 version of this 2.6.27 regression is Bug #11937, and the ext4 version of this bug was reported in the Fedora beta kernel:

https://bugzilla.redhat.com/show_bug.cgi?id=469582

A fix is at the top of the ext4 patch queue, and given two confirmations that it solves the problem, it will be pushed to Linus shortly.
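
The check asked for in this comment, looking at what was logged just before the first abort or err -30 message, can be sketched roughly as below. This is only an illustration: the function name, the marker pattern, and the context size are assumptions, and in practice grep with a before-context option on the syslog file does the same job.

```python
import re
from collections import deque
from typing import Iterable, List

# Markers taken from the log excerpts in this report; adjust as needed.
ERROR_MARKERS = re.compile(r"ext4_abort called|err -30")


def lines_before_first_error(lines: Iterable[str], context: int = 5) -> List[str]:
    """Return up to `context` lines that precede the first matching error line.

    Returns an empty list if no line matches ERROR_MARKERS.
    """
    history = deque(maxlen=context)  # keeps only the most recent lines
    for line in lines:
        if ERROR_MARKERS.search(line):
            return list(history)
        history.append(line.rstrip("\n"))
    return []


if __name__ == "__main__":
    sample = [
        "Nov  2 22:20:53 taz kernel: __jbd2_log_wait_for_space: no transactions",
        "Nov  2 22:20:53 taz kernel: Aborting journal on device dm-0:8.",
        "Nov  2 22:20:57 taz kernel: ext4_abort called.",
    ]
    for line in lines_before_first_error(sample):
        print(line)
```

To scan a real log, the same function can be passed an open file object, since it only iterates over lines.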
Comment 6 François Valenduc 2008-11-04 13:09:23 UTC
Indeed, there is something like that before:

Oct 23 22:35:58 ordi-francois __jbd2_log_wait_for_space: no transactions
Oct 23 22:35:58 ordi-francois Aborting journal on device dm-1:8.
Oct 23 22:35:58 ordi-francois ext4_abort called.
Oct 23 22:35:58 ordi-francois EXT4-fs error (device dm-1): ext4_journal_start_sb: Detected aborted journal
Oct 23 22:35:58 ordi-francois Remounting filesystem read-only

A few seconds later, this error occurs and repeats itself several times:

Oct 23 22:36:25 ordi-francois ext4_da_writepages: jbd2_start: 1024 pages, ino 39938; err -30
Oct 23 22:36:25 ordi-francois Pid: 12907, comm: pdflush Not tainted 2.6.27.3 #15
Oct 23 22:36:25 ordi-francois [<e1491533>] ext4_da_writepages+0x283/0x2b0 [ext4]
Oct 23 22:36:25 ordi-francois [<c0416cef>] hrtick_start_fair+0x8f/0xf0
Oct 23 22:36:25 ordi-francois [<c0416df9>] pick_next_task_fair+0xa9/0xf0
Oct 23 22:36:25 ordi-francois [<c045b61b>] do_writepages+0x2b/0x50
Oct 23 22:36:25 ordi-francois [<c0492f00>] __writeback_single_inode+0x80/0x2e0
Oct 23 22:36:25 ordi-francois [<c05c8392>] schedule+0x172/0x2b0
Oct 23 22:36:25 ordi-francois [<e0828c0b>] dm_table_any_congested+0xb/0x60 [dm_mod]
Oct 23 22:36:25 ordi-francois [<c04934d3>] generic_sync_sb_inodes+0x1a3/0x270
Oct 23 22:36:25 ordi-francois [<c0493799>] writeback_inodes+0x79/0x90
Oct 23 22:36:25 ordi-francois [<c045bec5>] wb_kupdate+0x85/0xf0
Oct 23 22:36:25 ordi-francois [<c045c2d0>] pdflush+0x0/0x190
Oct 23 22:36:25 ordi-francois [<c045c398>] pdflush+0xc8/0x190
Oct 23 22:36:25 ordi-francois [<c045be40>] wb_kupdate+0x0/0xf0
Oct 23 22:36:25 ordi-francois [<c042b692>] kthread+0x42/0x70
Oct 23 22:36:25 ordi-francois [<c042b650>] kthread+0x0/0x70
Oct 23 22:36:25 ordi-francois [<c0403ce7>] kernel_thread_helper+0x7/0x10
Comment 7 Theodore Tso 2009-01-17 15:24:08 UTC
This was fixed by commit 8c3f25d8, which was released in 2.6.28.
