I was testing a new file system feature that triggered I/O errors on inode read. I did not actually change the abort path here, so I believe this is unrelated to my changes. In one case during testing, the journal abort didn't work and the jbd2 journal abort errored out:

    <putting some stress on the file system>
    EXT4-fs error (device sda1): ext4_iget: triggering IO error
    EXT4-fs error (device sda1): ext4_put_super: Couldn't clean up the journal

This happens in ext4_put_super -> jbd2_journal_destroy -> jbd2_log_do_checkpoint, and then finally:

	if (journal->j_sb_buffer) {
		if (!is_journal_aborted(journal)) {
			/* We can now mark the journal as empty. */
			journal->j_tail = 0;
			journal->j_tail_sequence =
				++journal->j_transaction_sequence;
			jbd2_journal_update_superblock(journal, 1);
		} else {
			err = -EIO;	<------------ this is triggered
		}

So it looks like jbd2_log_do_checkpoint sometimes does not succeed? It does a couple of retries; maybe that is not enough. Unfortunately the problem is not easy to trigger; I only saw it very rarely. I assume it's some race in the checkpoint creation. I tried to reproduce it with jbd2 debugging enabled, but no luck so far.
If the journal is aborted, then any attempt to destroy the journal, checkpoint it, create a new transaction, etc., will result in -EIO being returned. In the code you quoted, the -EIO is returned because is_journal_aborted(journal) returns true.

The usual reason the journal gets aborted is that ext4_error() has called ext4_handle_error(), which then calls jbd2_journal_abort() when the errors behavior is remount-read-only. The basic idea is that once some kind of file system error or corruption has been detected, we want to stop the file system from making any further modifications, which might cause more damage and/or user data loss.

We don't have a good errno value to use, so we just use EIO, and the error handling after an aborted journal is admittedly not great. It triggers a lot of scary error messages that are misleading to anyone who isn't an ext3/ext4 veteran, and unfortunately the key initial failure can scroll off of a VT console.

So it's not a race condition; the system is functioning as designed, although at some point we may put in better error handling and some earlier tests for an aborted journal in the upper layers of various ext4 functions.
Hmm, it still seems strange, because sometimes the error happens and more often it doesn't, even when triggering the same underlying inode I/O error. Something is at least inconsistent.
Well, an I/O error by itself won't cause an ext4_error() --- unless the garbage returned is corrupted enough that the ext4 file system code decides to throw an ext4_error. So the fact that you only sometimes get an aborted journal (caused by an ext4_error) isn't entirely surprising. Whether ext4_error fires depends on whether or not the ext4 fs code thinks the file system is corrupted, which is data dependent, and the data returned after an I/O error can vary. The real problem may be that we need to be doing a better job of error checking....