Distribution: Debian Sid
System: Athlon 64 X2 4200+
Architecture: x86_64
Latest kernel without the problem: 2.6.30.1
I never got it on 2.6.30. It seems that it was the latest kernel without this problem.
Latest tested kernel where the problem occurred: 2.6.31-rc2-git9 (not yet tested on rc3).
Hi,
I have an external SATA disk connected over USB. If, for some reason, the disk has not been unmounted cleanly, the system freezes when something tries to access it. Usually that is a cron job doing a backup. The last messages in the syslog before the crash are always:
Jul 14 10:36:10 tangerine kernel: Starting XFS recovery on filesystem: sdd4 (logdev: internal)
Jul 14 10:36:10 tangerine kernel: Ending XFS recovery on filesystem: sdd4 (logdev: internal)
At this point, the device is being mounted for a backup (with backup2l).
I've filed this under File system -> XFS but, as the disk is on a USB attachment, it could be something else...
Regards
Jean-Luc
Does the system actually crash (BUG or oops) or just hang? If the former, can you try to get the info from the console, and if it's a hang, can you try sysrq W to get the stuck thread info? -Eric
No BUG, no oops, just completely frozen. I will try sysrq W next time (I will try to trigger the problem, but I have data to preserve first :) ). J-L
I got the problem several times again but I didn't manage to get SysRq W to show anything: I was in X and I think the output can only be displayed on a console. This time the filesystem was clean but, as usual, the last messages in the log are related to this device (there is a recovery message but the filesystem is marked clean at mount). I also noticed that syslog was restarting just before the crash (log rotation via anacron).
Jul 26 09:27:07 tangerine kernel: imklog 4.2.0, log source = /proc/kmsg started.
Jul 26 09:27:07 tangerine rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="4718" x-info="http://www.rsyslog.com"] (re)start
Jul 26 09:27:07 tangerine rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="4718" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'restart'.
Jul 26 09:27:07 tangerine kernel: Kernel logging (proc) stopped.
Jul 26 09:27:07 tangerine kernel: imklog 4.2.0, log source = /proc/kmsg started.
Jul 26 09:27:07 tangerine rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="4718" x-info="http://www.rsyslog.com"] (re)start
Jul 26 09:27:16 tangerine kernel: XFS mounting filesystem sdh4
Jul 26 09:27:16 tangerine kernel: Ending clean XFS mount for filesystem: sdh4
Jul 26 09:27:20 tangerine kernel: XFS mounting filesystem dm-16
Jul 26 09:27:20 tangerine kernel: Starting XFS recovery on filesystem: dm-16 (logdev: internal)
Jul 26 09:27:20 tangerine kernel: Ending XFS recovery on filesystem: dm-16 (logdev: internal)
Jul 26 09:27:26 tangerine ntpd[5746]: synchronized to 88.191.80.132, stratum 2
Jul 26 09:29:56 tangerine kernel: hdc: lost interrupt
Jul 26 09:52:45 tangerine kernel: imklog 4.2.0, log source = /proc/kmsg started.
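For anyone hitting the same wall under X: the SysRq W report (tasks in blocked state) goes to the kernel log, so it can also be requested by writing "w" to /proc/sysrq-trigger and read back afterwards with dmesg or from syslog. Below is a minimal Python sketch of that. The helper name trigger_sysrq and its trigger parameter are made up for illustration, it has to run as root, and it only helps while the machine can still run userspace, so it will not rescue a hard freeze.

```python
from pathlib import Path

# Standard procfs interface for issuing SysRq commands without the keyboard.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def trigger_sysrq(key: str, trigger: Path = SYSRQ_TRIGGER) -> None:
    """Write a single SysRq command character to the trigger file.

    'trigger' is a hypothetical parameter added so the helper can be
    pointed at an ordinary file when trying it out without root.
    """
    if len(key) != 1:
        raise ValueError("SysRq commands are a single character")
    trigger.write_text(key)


# Example (as root): trigger_sysrq("w"), then inspect the output of dmesg.
```

Running it from a cron job shortly before the backup mount would at least leave a blocked-task dump in the log if something is already stuck at that point; that placement is a suggestion, not something tested in this report.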
Hi,
I thought it was a regression, but after quite intensive testing I hit a problem with 2.6.30 as well and the system eventually froze. The last syslog messages were:
Jul 29 12:45:10 tangerine kernel: I/O error in filesystem ("sdh4") meta-data dev sdh4 block 0xae4598 ("xfs_trans_read_buf") error 5 buf count 4096
Jul 29 12:45:12 tangerine kernel: XFS: Filesystem sdi4 has duplicate UUID - can't mount
Jul 29 12:45:12 tangerine kernel: I/O error in filesystem ("sdh4") meta-data dev sdh4 block 0xae4598 ("xfs_trans_read_buf") error 5 buf count 4096
Jul 29 12:45:12 tangerine kernel: end_request: I/O error, dev sdh, sector 87747168
Jul 29 12:45:12 tangerine kernel: I/O error in filesystem ("sdh4") meta-data dev sdh4 block 0x53aea60 ("xlog_iodone") error 5 buf count 1024
Jul 29 12:45:12 tangerine kernel: xfs_force_shutdown(sdh4,0x2) called from line 1043 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa0225b13
Jul 29 12:45:12 tangerine kernel: Filesystem "sdh4": Log I/O Error Detected. Shutting down filesystem: sdh4
Jul 29 12:45:12 tangerine kernel: Please umount the filesystem, and rectify the problem(s)
Jul 29 12:45:12 tangerine kernel: XFS: Unable to update superblock counters. Freespace may not be correct on next mount.
Jul 29 12:45:13 tangerine kernel: hdc: lost interrupt
----
BTW, /dev/hdc is a CD-ROM drive with no media in it.
Regards
Jean-Luc
Moving to the list of post-2.6.29 regressions.
The "I/O error in filesystem ("sdh4") meta-data dev ..." messages mean XFS got I/O errors from the underlying device. Together with the "hdc: lost interrupt" messages, this looks a lot like an IDE problem to me.
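To make that reading of the messages concrete: the "error 5" in the XFS lines quoted earlier in this thread is errno 5, which is EIO on Linux, i.e. a generic I/O failure reported by the layer below the filesystem. Here is a small Python sketch that pulls the fields out of such a line. The regular expression is only fitted to the message format as it appears in this report, and the function name parse_xfs_io_error is invented for illustration.

```python
import errno
import re

# Pattern fitted to the log lines in this report; other kernel versions
# may word the message differently.
XFS_IO_ERROR = re.compile(
    r'I/O error in filesystem \("(?P<fs>[^"]+)"\) meta-data dev (?P<dev>\S+)\s+'
    r'block (?P<block>0x[0-9a-fA-F]+)\s+\("(?P<func>[^"]+)"\)\s+'
    r'error (?P<err>\d+) buf count (?P<count>\d+)'
)


def parse_xfs_io_error(line):
    """Return the fields of an XFS metadata I/O error line, or None."""
    m = XFS_IO_ERROR.search(line)
    if m is None:
        return None
    code = int(m.group("err"))
    return {
        "filesystem": m.group("fs"),
        "device": m.group("dev"),
        "block": int(m.group("block"), 16),
        "function": m.group("func"),
        # Map the numeric code to its symbolic name where one is known.
        "errno": errno.errorcode.get(code, str(code)),
        "buf_count": int(m.group("count")),
    }
```

Fed the 12:45:12 xfs_trans_read_buf line from the report, this yields device sdh4 and errno EIO, which fits the "end_request: I/O error, dev sdh" message logged in the same second by the block layer.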
Jean, is this still a problem on current mainline kernels?
I'm now running 2.6.36 and I've not had this problem since. J-L
Thx. I'm closing this as unreproducible.