After upgrading from 2.6.32-rc7 to rc8 I have experienced a number of unexpected crashes that appear to be related to my xfs filesystems. When these occur, dmesg says:

[33091.758929] xfs_force_shutdown(sda6,0x2) called from line 1043 of file fs/xfs/xfs_log.c. Return address = 0xffffffff8114c5d3
[33091.758951] Filesystem "sda6": Log I/O Error Detected. Shutting down filesystem: sda6
[33091.758956] Please umount the filesystem, and rectify the problem(s)
...
[33171.850058] Filesystem "sda6": xfs_log_force: error 5 returned.
[33191.540124] No probe response from AP 00:16:b6:26:41:29 after 500ms, disconnecting.

Any further attempts to access files on the affected filesystem hang; however, anything else that has already been loaded into memory continues to work fine. An ls -l shows the mountpoint as red with question marks in the metadata fields. Attempts to unmount the filesystem fail, saying that it is in use, while an fuser on the mount point hangs like everything else.

This appears to occur approximately once daily with the typical usage patterns for my laptop. There does not appear to be a particular action on my part that triggers it, so unfortunately it will be difficult to reproduce consistently. It can occur quite suddenly under only mild disk usage, such as saving and compiling a LaTeX document and then rereading the PostScript file. I recall that it also occurred once after a suspend to RAM.

Let me know if there is any additional information I can provide.

Kevin
What was in the system logs 10 lines or so before 'Filesystem "sda6": Log I/O Error Detected'? This looks like XFS is getting an I/O error from the storage beneath it, and doing the right thing in the face of that ...
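For anyone hitting the same shutdown message, the lookup asked for above can be sketched as follows. This is only a minimal illustration: the function name context_before and its parameters are hypothetical, not part of any existing tool, and it assumes the kernel log has already been read into a list of lines.

```python
import re
from collections import deque


def context_before(lines, pattern, n=10):
    """Return up to n lines preceding the first line matching pattern,
    followed by the matching line itself. Returns [] if nothing matches.

    Hypothetical helper for illustration only.
    """
    recent = deque(maxlen=n)  # rolling window of the last n lines seen
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            return list(recent) + [line]
        recent.append(line)
    return []
```

Feeding it the lines of a saved dmesg dump with the pattern "Log I/O Error Detected" would return whatever the block layer and libata logged just before XFS shut the filesystem down.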
Sorry, you're right. Not sure where my head was at:

[33090.932246] wlan0: direct probe to AP 00:03:52:e7:07:f0 (try 1)
[33090.992157] ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
[33090.992165] ata1.00: irq_stat 0x00000040, connection status changed
[33090.992176] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[33090.992178] res 50/00:08:4f:10:e5/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[33090.992187] ata1: hard resetting link
[33091.130129] wlan0: direct probe to AP 00:03:52:e7:07:f0 (try 2)
[33091.330107] wlan0: direct probe to AP 00:03:52:e7:07:f0 (try 3)
[33091.530125] wlan0: direct probe to AP 00:03:52:e7:07:f0 timed out
[33091.740128] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[33091.741504] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (unknown) succeeded
[33091.741512] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (unknown) filtered out
[33091.741518] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (unknown) filtered out
[33091.744278] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (unknown) succeeded
[33091.744286] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (unknown) filtered out
[33091.744292] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (unknown) filtered out
[33091.745531] ata1.00: configured for UDMA/133
[33091.748270] ata1.00: configured for UDMA/133
[33091.748295] ata1: EH complete
[33091.758884] end_request: I/O error, dev sda, sector 223677343
[33091.758921] I/O error in filesystem ("sda6") meta-data dev sda6 block 0x74158f0 ("xlog_iodone") error 5 buf count 5632
[33091.758929] xfs_force_shutdown(sda6,0x2) called from line 1043 of file fs/xfs/xfs_log.c. Return address = 0xffffffff8114c5d3
[33091.758951] Filesystem "sda6": Log I/O Error Detected. Shutting down filesystem: sda6
[33091.758956] Please umount the filesystem, and rectify the problem(s)

Reassign as necessary. I've switched back to rc7 and have yet to have a problem, so I'm thinking (hoping) this is not a hardware issue. I'll report back if I find otherwise.
That's ok, we should probably make the error message say:

printk("^^^ look up there before filing a bug ^^^\n");

;)

-Eric
This looks similar to the following bug:

http://bugzilla.kernel.org/show_bug.cgi?id=14543

but I can't think of anything that could make recent kernels newly prone to this type of problem. Are you sure this is a regression? Can you please check again? The workaround for the above bug will be merged during this merge window, but the change is a bit pervasive, so I'll wait a bit before backporting it to -stable.

Thanks.