Bug 8889 - Raid Level 1 causes "soft resetting port" on ata devices
Status: VERIFIED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: All Linux
Importance: P1 normal
Assignee: Alan
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-15 05:26 UTC by Bjoern Olausson
Modified: 2007-10-19 08:51 UTC

See Also:
Kernel Version: 2.6.22.2
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Clock HPT374 with 50 MHz DPLL (for real :-) (1.62 KB, patch)
2007-08-17 10:13 UTC, Sergei Shtylyov

Description Bjoern Olausson 2007-08-15 05:26:43 UTC
Most recent kernel where this bug did not occur:
2.6.22.1

Distribution:
Gentoo

Hardware Environment:
00:00.0 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:00.1 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:00.7 Host bridge: VIA Technologies, Inc. PT880 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:0a.0 RAID bus controller: Triones Technologies, Inc. HPT374 (rev 07)
00:0a.1 RAID bus controller: Triones Technologies, Inc. HPT374 (rev 07)
00:0b.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 06)
00:0b.1 Input device controller: Creative Labs SB Live! Game Port (rev 06)
00:0c.0 Ethernet controller: D-Link System Inc DGE-528T Gigabit Ethernet Adapter (rev 10)
00:0d.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
01:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce 6200] (rev a1)


Software Environment:
Portage 2.1.2.11 (default-linux/x86/2007.0/desktop, gcc-4.1.2, glibc-2.5-r4, 2.6.22.1 i686)
=================================================================
System uname: 2.6.22.1 i686 Intel(R) Celeron(R) CPU 2.66GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Tue, 14 Aug 2007 14:20:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.4 [enabled]
dev-lang/python:     2.4.4-r4
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache:     2.4-r7
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.61
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool:   1.5.23b
virtual/os-headers:  2.6.21
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=pentium4 -O2 -pipe -msse2 -mfpmath=sse -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-march=pentium4 -O2 -pipe -msse2 -mfpmath=sse -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ ftp://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo ftp://ftp.easynet.nl/mirror/gentoo/ http://distfiles.gentoo.org http://www.ibiblio.org/pub/Linux/distributions/gentoo"
LANG="de_DE.utf8"
LINGUAS="en us de sv"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="7zip a52 aac aalib acl acpi alsa amr apache2 apm ares bash-completion berkdb bitmap-fonts bittorrent bzip2 cairo cdparanoia cdr clamav clamd cli cpudetection cracklib crypt cups curl dbus dcraw dga dri dts dv dvd dvdnav dvdr dvdread eds emboss encode evo exif fam fbcon ffmpeg filter firefox flac fortran gd gdbm geoip gif gnutls gpm gstreamer hal iconv imagemagick imap ipalias isdnlog jpeg jpeg2k live lm_sensors logrotate lufsusermount lzo mad madwifi matroska metalink midi mikmod mjpeg mmx mmxext mp2 mp3 mpeg mpeg2 mpm-prefork mudflap multiuser musepack mysql nas ncurses netpbm nls nptl nptlonly oav ogg openmp opensslcrypt pam pcre pdf perl png postgres pppd python qt3support quicktime quotas readline real reflection rewrite rrdtool rtc samba sdl sensord serial session sftplogging shaper snmp softquota spell spl sqlite sse sse2 ssl svg tcpd theora tiff tools truetype truetype-fonts type1-fonts unicode unzip usb userlocales valias vhosts vidix vorbis vroot win32codecs winbind x264 x86 xinetd xml xml2 xorg xpm xvid zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en us de sv" USERLAND="GNU" VIDEO_CARDS="nvidia vesa fbdev nv"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LDFLAGS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS



Problem Description:

Alan, your patch (http://bugzilla.kernel.org/show_bug.cgi?id=8791) worked fine on 2.6.22.1 and still does on 2.6.22.2, but there is something that I can't track down to either MD or pata_hpt37x.

Linux boots fine, but as soon as I start to copy data onto the RAID md3 (sda and sdb
in RAID level 1), it fails after a few MB.

Actually, I can't be sure whether it is caused by the driver and your patch or whether
something else went wrong in 2.6.22.2.

The only thing I can match in the changelog is this:

commit 74ff092c258313747791da5d82054027167d1a79
Author: Milan Broz <mbroz@redhat.com>
Date:   Thu Jul 12 17:27:24 2007 +0100

    dm raid1: fix status

    Fix mirror status line broken in dm-log-report-fault-status.patch:
      - space missing between two words
      - placeholder ("0") required for compatibility with a subsequent patch
      - incorrect offset parameter

    Signed-off-by: Milan Broz <mbroz@redhat.com>
    Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>



Aug 14 01:22:59 enterprise ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 14 01:22:59 enterprise ata3.01: cmd 35/00:00:1f:fc:08/00:04:0f:00:00/f0 tag 0 cdb 0x0 data 524288 out
Aug 14 01:22:59 enterprise res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask 0x4 (timeout)
Aug 14 01:22:59 enterprise ata3: soft resetting port
Aug 14 01:22:59 enterprise Find mode for 12 reports C829C62
Aug 14 01:22:59 enterprise Find mode for 12 reports C829C62
Aug 14 01:22:59 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 14 01:22:59 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 14 01:22:59 enterprise ata3.00: configured for UDMA/100
Aug 14 01:23:00 enterprise ata3.01: configured for UDMA/100
Aug 14 01:23:00 enterprise ata3: EH complete
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 14 01:23:00 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 14 01:23:00 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 14 01:23:00 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 14 01:23:00 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 14 01:23:00 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Steps to reproduce:
Just copy data onto the md level 1 RAID device; it will fail after a few MB.
It then recovers and you can restart the copy, but it will fail again after a few MB. You can repeat this for as long as you like.

Please refer to the following bug report; it may be related, or it may be the cause: http://bugzilla.kernel.org/show_bug.cgi?id=8791

Thanks for the help
Bjoern Olausson
Comment 1 Andrew Morton 2007-08-15 09:24:35 UTC
This looks much more like a sata-shat-itself bug than an MD bug.

Just to confirm: did you really mean that 2.6.22.1 is OK, but 2.6.22.2
failed?
Comment 2 Bjoern Olausson 2007-08-15 10:08:43 UTC
I do!

2.6.22.1 <--- WORKS
2.6.22.2 <--- DOES NOT WORK

regards
Bjoern
Comment 3 Alan 2007-08-15 10:45:34 UTC
It would be very useful to know which changeset of 2.6.22.* broke it - probably one touching the pata driver?
Comment 4 Bjoern Olausson 2007-08-15 10:52:49 UTC
Anything you want me to do?
All I could do is check the changelog, but I guess you did that already ;-)

Thanks and regards
Bjoern
Comment 5 Greg Kroah-Hartman 2007-08-15 13:38:59 UTC
Can you use git to do a 'git bisect' to see which exact patch broke your machine?

It shouldn't take that long to do, as you have a simple way to test the result :)
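
For readers unfamiliar with it: 'git bisect' works on a git checkout of the kernel, not on two unpacked source trees. Assuming the stable tree carries the tags v2.6.22.1 and v2.6.22.2, a session would start with "git bisect start", "git bisect bad v2.6.22.2" and "git bisect good v2.6.22.1"; after each build, boot and copy test you answer "git bisect good" or "git bisect bad" until git names the first bad commit. On a linear history this is essentially a binary search, which the following Python sketch illustrates. The commit names and the is_bad callback are hypothetical stand-ins for "build, boot and run the copy test".

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    commits: ordered oldest to newest; commits[0] is known good and
             commits[-1] is known bad.
    is_bad:  callable that tests one commit and returns True if it fails.
    Returns (first_bad_commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad], tests


if __name__ == "__main__":
    # Ten hypothetical commits; pretend the regression came in with "c6".
    history = ["c%d" % i for i in range(10)]
    culprit, tests = first_bad_commit(history, lambda c: history.index(c) >= 6)
    print(culprit, tests)
```

Because each test halves the remaining range, even a few hundred commits between two point releases need only around eight or nine build-and-test rounds.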
Comment 6 Bjoern Olausson 2007-08-15 13:47:45 UTC
Okay, please give me some advice on how to handle this "git bisect". What should I bisect: the compiled sources, or /usr/src/linux-2.6.22.1 against /usr/src/linux-2.6.22.2?

Thanks for your advice

regards
Bjoern
Comment 7 Bjoern Olausson 2007-08-15 16:40:45 UTC
Okay, maybe this is related:

I can't burn DVDs on my desktop using 2.6.22.1/2. After some MB or GB, or during finalisation (it differs very much), I get this error:

Aug 16 01:27:03 freax ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 16 01:27:03 freax ata5.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0xad data 4 in
Aug 16 01:27:03 freax res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Aug 16 01:27:04 freax ata5: soft resetting port
Aug 16 01:27:05 freax ata5.00: configured for UDMA/66
Aug 16 01:27:05 freax ata5: EH complete
Aug 16 01:27:05 freax ata5.00: 16 bytes trailing data

Here is some info about the desktop:
00:00.0 Host bridge: Intel Corporation 975X Express Memory Controller Hub (rev c0)
00:01.0 PCI bridge: Intel Corporation 975X Express PCI Express Root Port (rev c0)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) Serial ATA Storage Controller AHCI (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 Multimedia audio controller: Creative Labs Unknown device 0005
01:01.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
01:01.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
01:02.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)
01:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
04:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7600 GT] (rev a1)

The DVD-RW drive is a Plextor 760A PATA

regards
Bjoern
Comment 8 Michal Piotrowski 2007-08-16 03:57:54 UTC
Bjoern, is this bug also present in 2.6.23-rc3? I guess so.
Comment 9 Bjoern Olausson 2007-08-16 04:48:20 UTC
I would guess... yes, but I'll give it a shot today.

greetings
Bjoern
Comment 10 Bjoern Olausson 2007-08-16 09:09:18 UTC
Okay, now this is getting more and more weird... I switched back to 2.6.22 on my desktop and still I can only burn CDs but no DVDs...

Aug 16 17:17:47 freax ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 16 17:17:47 freax ata5.00: cmd a0/01:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x2a data 32768 out
Aug 16 17:17:47 freax res 40/00:03:00:00:20/00:00:00:00:00/b0 Emask 0x4 (timeout)
Aug 16 17:17:52 freax ata5: port is slow to respond, please be patient (Status 0xd8)
Aug 16 17:17:57 freax ata5: device not ready (errno=-16), forcing hardreset
Aug 16 17:17:57 freax ata5: soft resetting port
Aug 16 17:17:57 freax ata5.00: configured for UDMA/66
Aug 16 17:17:57 freax ata5.01: configured for UDMA/33
Aug 16 17:17:57 freax ata5: EH complete
Aug 16 17:18:05 freax ata5.00: 16 bytes trailing data
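
The "Status 0xd8" in the "slow to respond" message is the raw ATA status register. A small Python sketch for decoding it follows; the bit names are the classic ATA ones, and some of them (notably bits 4, 2 and 1) are named differently or marked obsolete in other revisions of the specification, so treat the labels as an assumption.

```python
# Classic ATA status register bit names, most significant bit first.
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete / service, depending on spec revision
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete in later revisions)
    (0x02, "IDX"),   # index (obsolete in later revisions)
    (0x01, "ERR"),   # error
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


if __name__ == "__main__":
    print(decode_status(0xD8))
    print(decode_status(0x40))
```

For 0xd8 this gives BSY, DRDY, DSC and DRQ: the drive still reports itself busy, which is consistent with the reset code waiting and then escalating. The 0x40 in the "res" lines decodes to DRDY alone.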

I tried switching to 2.6.23_rc3, but some things will not compile against 2.6.23_rc3, so I'll have to wait until they work before I can test 2.6.23.

Hopefully I am not mixing things up here, but I guess the problem on the desktop and the one on the server could be the same (both use libata exclusively).

regards
Bjoern
Comment 11 Sergei Shtylyov 2007-08-17 10:13:57 UTC
Created attachment 12428
Clock HPT374 with 50 MHz DPLL (for real :-)

Try this patch please?
It's absolutely necessary for HPT374 to work: the chip can't tolerate 66 MHz DPLL clock that the driver is setting it to -- it might have been fixed by 2.6.22 if the fix was complete...
Comment 12 Bjoern Olausson 2007-08-20 03:32:55 UTC
The HPT controller works after applying these two patches to vanilla 2.6.22.3:

1) Diff for PLL tuning (http://bugzilla.kernel.org/attachment.cgi?id=12104)
2) Clock HPT374 with 50 MHz DPLL (http://bugzilla.kernel.org/attachment.cgi?id=12428)

I moved 188 files (~3 GB) to the RAID device mentioned above without problems.

Maybe I should open another bug for the
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
error on the Intel ICH7 (parallel ATA port) when writing data to DVD (CD works).
(The drive itself works: I tried it with Windows and could burn a DVD there without problems.)

Thanks for the fix

regards
Bjoern
Comment 13 Bjoern Olausson 2007-08-21 05:01:57 UTC
No, NOT fixed...

After doing some more heavy I/O to another RAID 1 (with two disks) attached to the same controller, I got the following:


ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.00: cmd 35/00:00:97:51:7b/00:04:09:00:00/e0 tag 0 cdb 0x0 data 524288 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask 0x4 (timeout)
ata3: soft resetting port
Find mode for 12 reports C829C62
Find mode for 12 reports C829C62
Find mode for DMA 69 reports 1CAE9C62
Find mode for DMA 69 reports 1CAE9C62
ata3.00: configured for UDMA/100
ata3.01: configured for UDMA/100
ata3: EH complete
sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
sd 2:0:0:0: [sda] Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
sd 2:0:1:0: [sdb] Write Protect is off
sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
sd 2:0:0:0: [sda] Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
sd 2:0:1:0: [sdb] Write Protect is off
sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

The problem got better but is not solved... Or could the same error message occur if there are some bad blocks/clusters on the drive?

regards
Bjoern
Comment 14 Alan 2007-08-21 05:56:00 UTC
A bad block should cause a precise error report from the drive rather than a timeout. It might timeout if the drive is struggling badly but I would expect to see a report of an actual media error back from the drive.

You can also use the SMART tools to check the last failed commands as the drive sees them.
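
One way to follow this suggestion is to ask smartctl for the drive's ATA error log with its "-l error" option. Disks driven by libata appear as SCSI devices, so smartmontools releases of that period may need "-d ata" to talk to them as ATA drives; whether that is required depends on the smartmontools version, so the default below is an assumption. The helper names in this Python sketch are made up for illustration.

```python
import subprocess


def smart_error_log_cmd(device, force_ata=True):
    """Build the smartctl command line that prints a drive's SMART error log.

    force_ata adds '-d ata', which older smartmontools may need for disks
    that libata presents as /dev/sdX (assumption: not needed on newer ones).
    """
    cmd = ["smartctl"]
    if force_ata:
        cmd += ["-d", "ata"]
    cmd += ["-l", "error", device]
    return cmd


def read_smart_error_log(device):
    """Run smartctl (usually needs root) and return its text output."""
    result = subprocess.run(
        smart_error_log_cmd(device),
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for warnings too
    )
    return result.stdout


if __name__ == "__main__":
    print(" ".join(smart_error_log_cmd("/dev/sda")))
```

If the drive logged an actual media error, it should show up there with the failing command; an empty error log alongside kernel-side timeouts points more towards the controller or driver than the disk.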
Comment 15 Bjoern Olausson 2007-08-21 11:23:08 UTC
I noticed the above only because a KDE dialogue told me, while I was modifying several files, that it could not write to a file. I told KDE to skip the file, and the process continued modifying my files. Before I applied the "Clock HPT374 with 50 MHz DPLL (for real :-)" patch, the entire process hung and had to be restarted; then, after a short time, the same thing happened again.

After patching, the process stopped at a count of ~1000 files, but after the "EH complete" I could simply continue modifying another 9000 files without problems and without any "exception Emask". Since that error I have not been able to reproduce it. Maybe after a reboot...

Checking SMART did not show anything useful:

smartctl -a /dev/sda
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: ATA      Maxtor 6Y200P0   Version: YAR4
Serial number: Y60QJ***
Device type: disk
Local Time is: Tue Aug 21 20:11:45 2007 CEST
Device does not support SMART

Error Counter logging not supported

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging
20:11:45 [~]




smartctl -a /dev/sdb
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: ATA      Maxtor 6Y200P0   Version: YAR4
Serial number: Y60QJ***
Device type: disk
Local Time is: Tue Aug 21 20:11:54 2007 CEST
Device does not support SMART

Error Counter logging not supported

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging

Regards
Bjoern
Comment 16 Bjoern Olausson 2007-08-24 08:20:07 UTC
The system has been up for several days now and I got no more errors.
And hopefully they will not recur.

Will the "50 MHz DPLL" patch be committed to the mainline kernel?

regards and thanks
Bjoern
Comment 17 Sergei Shtylyov 2007-08-24 13:33:15 UTC
(In reply to comment #16)
> Will the "50 MHz DPLL" patch be committed to the mainline kernel?

Already there. :-)
Comment 18 Bjoern Olausson 2007-08-30 09:57:09 UTC
Okay, bug is still present!

Under heavy I/O (I have put a VirtualBox image on the RAID connected to the HPT controller) I now get the following, several times:

The drives recover fine, but I guess they should never fail ;-)

Let me know if you need more info.

Aug 30 15:49:07 enterprise ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 30 15:49:07 enterprise ata3.01: cmd ca/00:e0:4f:34:03/00:00:00:00:00/f0 tag 0 cdb 0x0 data 114688 out
Aug 30 15:49:07 enterprise res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask 0x4 (timeout)
Aug 30 15:49:07 enterprise ata3: soft resetting port
Aug 30 15:49:07 enterprise Find mode for 12 reports C829C62
Aug 30 15:49:07 enterprise Find mode for 12 reports C829C62
Aug 30 15:49:07 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 15:49:07 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 15:49:07 enterprise ata3.00: configured for UDMA/100
Aug 30 15:49:07 enterprise ata3.01: configured for UDMA/100
Aug 30 15:49:07 enterprise ata3: EH complete
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 15:49:07 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 15:49:07 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

[...]

Aug 30 15:56:17 enterprise ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 30 15:56:17 enterprise ata3.01: cmd ca/00:68:8f:95:00/00:00:00:00:00/f0 tag 0 cdb 0x0 data 53248 out
Aug 30 15:56:17 enterprise res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask 0x4 (timeout)
Aug 30 15:56:17 enterprise ata3: soft resetting port
Aug 30 15:56:17 enterprise Find mode for 12 reports C829C62
Aug 30 15:56:17 enterprise Find mode for 12 reports C829C62
Aug 30 15:56:17 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 15:56:17 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 15:56:17 enterprise ata3.00: configured for UDMA/100
Aug 30 15:56:17 enterprise ata3.01: configured for UDMA/100
Aug 30 15:56:17 enterprise ata3: EH complete
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 15:56:17 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 15:56:17 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

[...]
[...]

Aug 30 18:41:31 enterprise ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Aug 30 18:41:31 enterprise ata3.01: cmd 35/00:00:6f:be:6b/00:02:17:00:00/f0 tag 0 cdb 0x0 data 262144 out
Aug 30 18:41:31 enterprise res 40/00:00:00:4f:c2/00:00:00:00:00/10 Emask 0x4 (timeout)
Aug 30 18:41:31 enterprise ata3: soft resetting port
Aug 30 18:41:31 enterprise Find mode for 12 reports C829C62
Aug 30 18:41:31 enterprise Find mode for 12 reports C829C62
Aug 30 18:41:31 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 18:41:31 enterprise Find mode for DMA 69 reports 1CAE9C62
Aug 30 18:41:31 enterprise ata3.00: configured for UDMA/100
Aug 30 18:41:31 enterprise ata3.01: configured for UDMA/100
Aug 30 18:41:31 enterprise ata3: EH complete
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Write Protect is off
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 30 18:41:31 enterprise sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] 398297088 512-byte hardware sectors (203928 MB)
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Write Protect is off
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Mode Sense: 00 3a 00 00
Aug 30 18:41:31 enterprise sd 2:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Comment 19 Bjoern Olausson 2007-08-30 10:17:03 UTC
Here's a full bootlog:

http://paste.olausson.de/88c1613c72.html

regards
Bjoern
Comment 20 Bjoern Olausson 2007-09-08 03:50:49 UTC
Okay, this problem now also occurs on other devices (I just didn't see it).

Now I have also noticed it on my RAID 5 array.

Any progress, any ideas?

It mainly occurs when moving around a large number of small files (in my case a lot of photos, all around 2-5 MB).

regards
Bjoern
Comment 21 Sergei Shtylyov 2007-09-08 04:05:36 UTC
(In reply to comment #20)
> Okay, this problem now also occurs on other devices (I just didn't see it).

I wash my hands then... :-)
Comment 22 Bjoern Olausson 2007-09-08 04:32:35 UTC
(In reply to comment #21)
> 
> I wash my hands then... :-)
> 

I guess you are referring to your hard work on fixing this bug, so your hands are getting sweaty from the countless keystrokes... hrrhrr, am I right? ;-)

Here you can see which files (pictures) are causing the trouble:

https://gallery.boonline.dyndns.org
(No parental control required; can be viewed at any age from 1 to 99+ [glasses may be required])

I recommend the following URL:
https://gallery.boonline.dyndns.org/v/Snapshots/Sport/Tanzen/Club/Weihnachtsball/2006_12_02/oulu/?g2_page=2
(Some of the best dancers from Oulu, Finland. They won the team competition!)

I had to move the pictures around, recreate the gallery, and change headers and EXIF dates, so I touched a large number of files on the devices in a very short time.

Have fun.

regards
Bjoern
Comment 23 Bjoern Olausson 2007-10-10 10:34:46 UTC
Okay, since Sep 20 23:01:27 I had no more "exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"

Currently I am switching from 2.6.23-rc8 to 2.6.23.

Hopefully there are no more "exception Emask" errors and we can close this bug... but just give me some days/weeks to test it ;-)
Comment 24 Bjoern Olausson 2007-10-19 08:51:05 UTC
No more "exception Emask" since Sep 20 23:01:27 running on 2.6.23

I'll mark this one as solved.

Thanks for the help

regards
Bjoern
