Bug 12709 (corsair) - 2.6.28.5 is killing USBest 165 firmware (Corsair Flash Voyager & A-DATA flash drives)
Status: RESOLVED INSUFFICIENT_DATA
Alias: corsair
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: All Linux
Importance: P1 high
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-15 05:15 UTC by Tomas Mudrunka
Modified: 2012-05-30 14:03 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.28.5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
USBMon logs for Corsair Flash Voyager - USBest firmware 165 & 163 (410.65 KB, application/octet-stream)
2009-02-16 10:03 UTC, Tomas Mudrunka
USBMon logs for Corsair Flash Voyager - USBest firmware 165 & 163 - drive very confused (743.00 KB, application/octet-stream)
2009-02-16 14:04 UTC, Tomas Mudrunka

Description Tomas Mudrunka 2009-02-15 05:15:25 UTC
Latest working kernel version: unsure?
Earliest failing kernel version: 2.6.28
Distribution: Arch Linux
Hardware Environment: Acer Aspire 3000 + Corsair Flash Voyager
Software Environment:
Problem Description:

Corsair Flash Voyager and A-DATA drives are very popular USB thumb drives, but their firmware is killed after the drive has been inserted into and removed from a USB port on a Linux box a few times (one or more), and it then needs to be reflashed using Windows!

Additional info:
There is nothing unusual in dmesg or kernel.log (it looks like everything works OK), but the device /dev/sdX is inaccessible, fdisk -l shows nothing, cfdisk does not work, and you can't see any partitions (/dev/sdXY).

These thumb drives work well on Windows systems until they are inserted into a Linux box.

Steps to reproduce:
Create a partition and format it with any filesystem, using Linux or Windows.
Connect the drive to a Linux box and disconnect it a few times. (I tried a Corsair Flash Voyager 32 GB, the version with the neckband; I heard it has a different chipset, the USBest 165, than the version without the neckband.) I have hal + GNOME automount enabled, but removing a mounted flash drive still shouldn't brick it, should it? It was also killed when removed after unmounting.

kernel.log (dmesg) output:

*** CONNECTED
Feb 15 14:06:15 harvie-ntb usb 1-6: new high speed USB device using ehci_hcd and address 3
Feb 15 14:06:15 harvie-ntb usb 1-6: configuration #1 chosen from 1 choice
Feb 15 14:06:17 harvie-ntb Initializing USB Mass Storage driver...
Feb 15 14:06:17 harvie-ntb scsi2 : SCSI emulation for USB Mass Storage devices
Feb 15 14:06:17 harvie-ntb usbcore: registered new interface driver usb-storage
Feb 15 14:06:17 harvie-ntb USB Mass Storage support registered.
Feb 15 14:06:17 harvie-ntb usb-storage: device found at 3
Feb 15 14:06:17 harvie-ntb usb-storage: waiting for device to settle before scanning
Feb 15 14:06:22 harvie-ntb scsi 2:0:0:0: Direct-Access     USBest   USB2FlashStorage 0.00 PQ: 0 ANSI: 2
Feb 15 14:06:22 harvie-ntb sd 2:0:0:0: [sdb] Attached SCSI removable disk
Feb 15 14:06:22 harvie-ntb sd 2:0:0:0: Attached scsi generic sg2 type 0
Feb 15 14:06:22 harvie-ntb usb-storage: device scan complete

*** DISCONNECTED
Feb 15 14:06:30 harvie-ntb usb 1-6: USB disconnect, address 3


^^^^ So the log looks the same as it did before the firmware was crippled, but the block device can't be mounted or accessed by fdisk, dd, or similar tools. All data is probably lost, but you can still repair the drive using Windows and the firmware from this site:

http://209.85.171.104/translate_c?hl=en&sl=ru&tl=en&u=http://flashboot.ru/index.php%3Fname%3DFiles%26op%3Dcat%26id%3D11&usg=ALkJrhjRTNKzyhieqp13n6fv_85u3v5yCA


This bug is NOT related to another corsair voyager bug:
http://bugzilla.kernel.org/show_bug.cgi?id=12188
Comment 1 Tomas Mudrunka 2009-02-15 05:35:23 UTC
Also, the killed stick has a different identification in lsusb. It looks like a smaller drive from a different vendor:
Bus 001 Device 004: ID 1307:0163 Transcend Information, Inc. 512MB USB Flash Drive
Comment 2 Tomas Mudrunka 2009-02-15 07:10:23 UTC
INVALID = The problem described is not a bug.
So destroying data and hardware is not a bug?
Comment 3 Andrew Morton 2009-02-15 11:07:35 UTC
Reassigned to USB.
Comment 4 Anonymous Emailer 2009-02-15 11:08:57 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 15 Feb 2009 05:15:25 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12709
> 
>            Summary: 2.6.28 is killing USBest 165 firmware (Corsair Flash
>                     Voyager & A-DATA flash drives)
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.28
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: blocking
>           Priority: P1
>          Component: Flash/Memory Technology Devices

This is now USB

>         AssignedTo: dwmw2@infradead.org

This is now Greg

>         ReportedBy: harvie@email.cz
Comment 5 Alan Stern 2009-02-15 12:20:19 UTC
> > http://bugzilla.kernel.org/show_bug.cgi?id=12709
> > 
> >            Summary: 2.6.28 is killing USBest 165 firmware (Corsair Flash
> >                     Voyager & A-DATA flash drives)

> > Flash Voyagers and A-Data are very popular usb thumbdrives, but their firmware
> > is killed after inserting&removing it in usb port on linux box few times (1 or
> > more) and it needs to be reflashed using windows!
> > 
> > Additional info:
> > theres no special info in dmesg or kernel.log (its looks like everything works
> > OK), but device /dev/sdX is inaccessible, fdisk -l shows anything, cfdisk is
> > not working and you can't see partitions (/dev/sdXY)...
> > 
> > These thumbdrives works well on windows systems until they are not inserted
> > into the Linux box.

Tomas, please use usbmon to get a trace of the activity when you plug 
in a working drive and it stops working.  Instructions are in the 
kernel source file Documentation/usb/usbmon.txt.  Attach the usbmon 
trace to the bug report.

Alan Stern
Comment 6 Tomas Mudrunka 2009-02-16 10:03:58 UTC
Created attachment 20262
USBMon logs for Corsair Flash Voyager - USBest firmware 165 & 163

Oh, I can't reproduce the error. I downgraded the firmware from the original 165 to 163, and it seems to be working properly. I can recommend this to everybody else who has this problem and wants to use this drive with Linux. So it is probably just a firmware bug.

I will test my drive for a few weeks and reopen this bug if there are any problems.

I am attaching some usbmon logs created during my tests. The only problem was a destroyed FAT, but maybe it was created improperly by the low-level format during the firmware upgrade. After formatting with mkfs.vfat or Windows XP it works. I hope it stays that way; I'm not sure I can trust this drive.
Comment 7 Tomas Mudrunka 2009-02-16 10:21:47 UTC
F*ck! First try after closing this bug and the firmware is gone again... reopening...
Comment 8 Tomas Mudrunka 2009-02-16 14:04:10 UTC
Created attachment 20268
USBMon logs for Corsair Flash Voyager - USBest firmware 165 & 163 - drive very confused

After more than 50 tries I have some results.
I still cannot kill the firmware to the point where it says "No media inserted", but the drive does not respond when I try to modify the partition table or anything else. There is something in the cache, but whenever I reconnect the drive it is full of zeroes.


I am attaching usbmon logs.
Archive description: each file corresponds to one try (inserting, doing some operations, removing).

kill44-badblocks_test.log <-- read-only badblocks test (removed from the archive: too big and nothing interesting)
kill45-zeroing.log <-- dd if=/dev/zero of=/dev/sdb
kill46-created_partition_and_fs.log <-- cfdisk /dev/sdb && mkfs.vfat
kill47.log <-- can't find the partition table
kill48-partition_table-disapeared.log <-- it has definitely disappeared
kill49-created_fs.log <-- trying to create another partition
kill50-partition_table_disappeared_again.log <-- looks OK, but there's nothing...
kill51-unknown_partition_type_after_creating_partition.log <-- still can't create a partition
kill52.log <-- drive is full of zeroes after inserting
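
The "full of zeroes" observation can be checked without eyeballing a hex dump. A minimal sketch in Python, assuming 512-byte sectors; `is_all_zero` is a hypothetical helper name, and `path` would be the raw device (for example /dev/sdb, read as root) or any ordinary file for a dry run:

```python
def is_all_zero(path, sectors=2048, sector_size=512):
    """Return True if the first `sectors` sectors of `path` contain only zero bytes."""
    with open(path, "rb") as f:
        for _ in range(sectors):
            block = f.read(sector_size)
            if not block:
                break  # reached end of file/device early
            if any(block):
                return False  # found a nonzero byte
    return True
```

This only reads, so it should be safe to run on a suspect drive right after plugging it in.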
Comment 9 Alan Stern 2009-02-16 14:35:46 UTC
These logs are no help.  They don't show the contents of the partition table, because usbmon displays only the first 32 bytes of each data transfer.  And they are full of useless crap because of all the extraneous activity from hal and udev.

Here's what you should do for a better test.  Before plugging in the device, turn off the haldaemon service, and rename /lib/udev/vol_id to something else so that it can't run automatically.  Then plug in the drive, and don't start usbmon until a few seconds later.  (Remember to rename /lib/udev/vol_id back to its original name after the testing is finished!)

Finally, don't try anything fancy like creating a partition table or a file system.  Just do something simple like:  "dd if=test-file of=/dev/sdb count=1" where test-file contains some nontrivial data.  Maybe plain text; that would be fine.

Then after unplugging and replugging, do "dd if=/dev/sdb count=1" to see if the data is the same as before.
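
The write-then-read-back check above can also be scripted. This is only a sketch, assuming Python 3; `write_sector` and `read_sector` are hypothetical helper names, and `path` is either the raw device (for example /dev/sdb, which needs root and overwrites sector 0) or an ordinary file for a dry run. Unlike plain dd, the sketch pads the test data to a full 512-byte sector.

```python
import os

SECTOR = 512  # dd's default block size, so count=1 means one 512-byte block


def write_sector(path, payload):
    """Write known data at offset 0, roughly like dd if=test-file of=path count=1."""
    block = payload[:SECTOR].ljust(SECTOR, b"\0")  # pad to a full sector
    # O_CREAT only matters for the dry run on a regular file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, block)
        os.fsync(fd)  # flush to the drive before it is unplugged
    finally:
        os.close(fd)
    return block


def read_sector(path):
    """Read the first sector back, like dd if=path count=1."""
    with open(path, "rb") as f:
        return f.read(SECTOR)
```

Call `write_sector` once, unplug and replug the drive, then compare `read_sector(path)` with the block that was written. The replug matters: without it the read may be served from the page cache rather than from the drive.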
Comment 10 Alan Stern 2009-03-12 08:52:41 UTC
Has there been any progress on this?
Comment 11 Tomas Mudrunka 2009-05-10 00:21:20 UTC
Sorry for not responding for so long (school). I have now exchanged my flash drive for a new one of the same model, and so far there is no problem, but I haven't tried to format or repartition it. I'm pretty sure that my problem appeared when I created a new partition table on it last time.

I can't risk bricking another stick right now, but I believe this stick needs to be formatted in some different way. I am pasting some debug info about the default factory format.

I believe the problem appeared when I executed dd if=/dev/zero of=/dev/sdb.
Maybe it overwrote some data that is important to the firmware. Can somebody look at this?



0 ;) harvie@harvie-ntb ~ $ fdisk -l /dev/sdb
Disk /dev/sdb: 32.3 GB, 32346472448 bytes
255 heads, 63 sectors/track, 3932 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x04dd5721

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        3933    31588320+   c  W95 FAT32 (LBA)



0 ;) harvie@harvie-ntb ~ $ cfdisk /dev/sdb
FATAL ERROR: Bad primary partition 0: Partition ends in the final partial cylinder
Press any key to exit cfdisk



0 ;) harvie@harvie-ntb ~ $ xxd /dev/sdb | less
0000000: 33c0 8ed0 bc00 7cfb 5007 501f fcbe 1b7c  3.....|.P.P....|
0000010: bf1b 0650 57b9 e501 f3a4 cbbd be07 b104  ...PW...........
0000020: 386e 007c 0975 1383 c510 e2f4 cd18 8bf5  8n.|.u..........
0000030: 83c6 1049 7419 382c 74f6 a0b5 07b4 078b  ...It.8,t.......
0000040: f0ac 3c00 74fc bb07 00b4 0ecd 10eb f288  ..<.t...........
0000050: 4e10 e846 0073 2afe 4610 807e 040b 740b  N..F.s*.F..~..t.
0000060: 807e 040c 7405 a0b6 0775 d280 4602 0683  .~..t....u..F...
0000070: 4608 0683 560a 00e8 2100 7305 a0b6 07eb  F...V...!.s.....
0000080: bc81 3efe 7d55 aa74 0b80 7e10 0074 c8a0  ..>.}U.t..~..t..
0000090: b707 eba9 8bfc 1e57 8bf5 cbbf 0500 8a56  .......W.......V
00000a0: 00b4 08cd 1372 238a c124 3f98 8ade 8afc  .....r#..$?.....
00000b0: 43f7 e38b d186 d6b1 06d2 ee42 f7e2 3956  C..........B..9V
00000c0: 0a77 2372 0539 4608 731c eb1a 90bb 007c  .w#r.9F.s......|
00000d0: 8b4e 028b 5600 cd13 7351 4f74 4e32 e48a  .N..V...sQOtN2..
00000e0: 5600 cd13 ebe4 8a56 0060 bbaa 55b4 41cd  V......V.`..U.A.
00000f0: 1372 3681 fb55 aa75 30f6 c101 742b 6160  .r6..U.u0...t+a`
0000100: 6a00 6a00 ff76 0aff 7608 6a00 6800 7c6a  j.j..v..v.j.h.|j
0000110: 016a 10b4 428b f4cd 1361 6173 0e4f 740b  .j..B....aas.Ot.
0000120: 32e4 8a56 00cd 13eb d661 f9c3 496e 7661  2..V.....a..Inva
0000130: 6c69 6420 7061 7274 6974 696f 6e20 7461  lid partition ta
0000140: 626c 6500 4572 726f 7220 6c6f 6164 696e  ble.Error loadin
0000150: 6720 6f70 6572 6174 696e 6720 7379 7374  g operating syst
0000160: 656d 004d 6973 7369 6e67 206f 7065 7261  em.Missing opera
0000170: 7469 6e67 2073 7973 7465 6d00 0000 0000  ting system.....
0000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001b0: 0000 0000 0000 0000 2157 dd04 0000 8001  ........!W......
00001c0: 0100 0cfe ff5b 3f00 0000 c1ff c303 0000  .....[?.........
00001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.
0000200: 0000 0000 0000 0000 0000 0000 0000 0000  ................
...and a lot of zeroes before any other data...
0007df0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0007e00: eb58 904d 5344 4f53 352e 3000 0220 2600  .X.MSDOS5.0.. &.
0007e10: 0200 0000 00f8 0000 3f00 ff00 3f00 0000  ........?...?...
0007e20: c1ff c303 393c 0000 0000 0000 0200 0000  ....9<..........
0007e30: 0100 0600 0000 0000 0000 0000 0000 0000  ................
0007e40: 0000 2961 c11e f44e 4f20 4e41 4d45 2020  ..)a...NO NAME  
0007e50: 2020 4641 5433 3220 2020 33c9 8ed1 bcf4    FAT32   3.....
0007e60: 7b8e c18e d9bd 007c 884e 028a 5640 b408  {......|.N..V@..
0007e70: cd13 7305 b9ff ff8a f166 0fb6 c640 660f  ..s......f...@f.
0007e80: b6d1 80e2 3ff7 e286 cdc0 ed06 4166 0fb7  ....?.......Af..
0007e90: c966 f7e1 6689 46f8 837e 1600 7538 837e  .f..f.F..~..u8.~
0007ea0: 2a00 7732 668b 461c 6683 c00c bb00 80b9  *.w2f.F.f.......
0007eb0: 0100 e82b 00e9 4803 a0fa 7db4 7d8b f0ac  ...+..H...}.}...
0007ec0: 84c0 7417 3cff 7409 b40e bb07 00cd 10eb  ..t.<.t.........
0007ed0: eea0 fb7d ebe5 a0f9 7deb e098 cd16 cd19  ...}....}.......
0007ee0: 6660 663b 46f8 0f82 4a00 666a 0066 5006  f`f;F...J.fj.fP.
0007ef0: 5366 6810 0001 0080 7e02 000f 8520 00b4  Sfh.....~.... ..
0007f00: 41bb aa55 8a56 40cd 130f 821c 0081 fb55  A..U.V@........U
0007f10: aa0f 8514 00f6 c101 0f84 0d00 fe46 02b4  .............F..
0007f20: 428a 5640 8bf4 cd13 b0f9 6658 6658 6658  B.V@......fXfXfX
0007f30: 6658 eb2a 6633 d266 0fb7 4e18 66f7 f1fe  fX.*f3.f..N.f...
0007f40: c28a ca66 8bd0 66c1 ea10 f776 1a86 d68a  ...f..f....v....
0007f50: 5640 8ae8 c0e4 060a ccb8 0102 cd13 6661  V@............fa
0007f60: 0f82 54ff 81c3 0002 6640 490f 8571 ffc3  ..T.....f@I..q..
0007f70: 4e54 4c44 5220 2020 2020 2000 0000 0000  NTLDR      .....
0007f80: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0007f90: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0007fa0: 0000 0000 0000 0000 0000 0000 0d0a 5265  ..............Re
0007fb0: 6d6f 7665 2064 6973 6b73 206f 7220 6f74  move disks or ot
0007fc0: 6865 7220 6d65 6469 612e ff0d 0a44 6973  her media....Dis
0007fd0: 6b20 6572 726f 72ff 0d0a 5072 6573 7320  k error...Press 
0007fe0: 616e 7920 6b65 7920 746f 2072 6573 7461  any key to resta
0007ff0: 7274 0d0a 0000 0000 00ac cbd8 0000 55aa  rt............U.
0008000: 5252 6141 0000 0000 0000 0000 0000 0000  RRaA............
0008010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
...and a lot of zeroes before any other data...
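
For reference, the partition-table bytes in the dump above (offsets 0x1b8 to 0x1cd) can be decoded by hand. A minimal sketch in Python, assuming the standard MBR layout (4-byte disk signature at 0x1b8, first 16-byte partition entry at 0x1be, little-endian fields); `decode_entry` is a hypothetical helper name, and the byte strings are copied from the xxd output:

```python
import struct

# Bytes copied from the xxd dump above.
DISK_SIG = bytes.fromhex("2157dd04")                       # offsets 0x1b8-0x1bb
ENTRY = bytes.fromhex("800101000cfeff5b3f000000c1ffc303")  # offsets 0x1be-0x1cd


def decode_entry(entry):
    """Decode one 16-byte MBR partition entry (standard layout assumed)."""
    boot, _chs_start, ptype, _chs_end, lba_start, sectors = struct.unpack(
        "<B3sB3sII", entry
    )
    return {
        "bootable": boot == 0x80,  # 0x80 marks the active partition
        "type": ptype,             # partition type byte
        "lba_start": lba_start,    # first sector of the partition
        "sectors": sectors,        # partition length in sectors
    }


disk_id = struct.unpack("<I", DISK_SIG)[0]
entry = decode_entry(ENTRY)
```

Under these assumptions the entry decodes to a bootable partition of type 0x0c starting at sector 63 and 63176641 sectors long, with disk identifier 0x04dd5721, which agrees with the fdisk output. Sector 63 is byte offset 63 * 512 = 0x7e00, which is where the FAT32 boot sector (MSDOS5.0) shows up in the dump.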
Comment 12 Tomas Mudrunka 2009-05-10 00:24:50 UTC
More info about the fresh stick:

1 ;( harvie@harvie-ntb ~ $ lsusb -v

Bus 001 Device 019: ID 1b1c:0ab1  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1b1c 
  idProduct          0x0ab1 
  bcdDevice            1.00
  iManufacturer           1 Corsair
  iProduct                2 Flash Voyager
  iSerial                 3 b3475f7170e1f6
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               8
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
can't get debug descriptor: Connection timed out
cannot read device status, Connection timed out (110)
Comment 13 Eric Enright 2009-09-03 18:26:46 UTC
I have been experiencing a similar issue with 128MB JetFlash / Transcend USB flash drives.  These are based on USBest controller UT161-T6G.  Similar devices with a UT163-L6 appear to be unaffected (same manufacturer model, different controllers).

Platform: ARM
Kernel: 2.4.26
Relevant drivers: usb-ohci-ep93xx, usb-ohci

The devices initially appear to be fine; they are properly detected and registered, and can be formatted and used as one would expect.  After some time, they begin to intermittently exhibit varying issues:

•	Device is seen on the bus, however it fails to respond to a SET ADDRESS (observed with a USB protocol analyzer – no messages come from the device at all).
•	Device is seen on the bus, assigned an address, recognized as a mass storage device, however SCSI operations fail (reading the partition table, etc)
•	Device is seen on the bus, assigned an address, recognized as a mass storage device, partitions are properly read and the device can be read from and written to, however soon after begins to kick up IO errors.
•	Device operates perfectly fine.

I have witnessed all failure modes in a single device while power cycling its connected host over half an hour.  I initially thought that the device was bad, however I have found these same issues in more than six (6) units with the same controller.

The devices were carefully opened and tested electrically, which they pass.

The underlying Samsung NAND chip was removed from the PCB and read with a known-good testing assembly, which also had issues.

When the device is in the state that it fails to accept a bus address from the host, no amount of host-side hackery I have tried seems to bring it back.  On the EP9302 I am able to force the USB ports to a DISCONNECTED state via an extension register on the OHCI controller (holds D+ high and D- low).  Doing so, pausing a few seconds, and removing the override does cause the kernel to see a “new” device, which always fails to assign the address.  I have tried cycling this state over 100 times, with no luck.  Interestingly, if the device boots into a working state, I am able to break it again by performing this cycling.

It would appear that once the device is in this state, nothing short of power disconnection can bring it back (assuming it’s not so bad that it won’t work at all).

When the device is working, sometimes there is loss of data (writing known data out, read-back verification fails).

Additionally, when the device does work, it may report incorrect capacity size (observed in kernel logs and via USB protocol analyzer).  4GB, 0MB, etc.  This is in line with the OP.

All of these problems were able to be rectified by using the JetFlash recovery tool and by reflashing the NAND chip with the contents of a known good chip, which also sounds to be in line with the OP.

My suspicions are that the controller is using a small section of flash for configuration data, and this is becoming corrupted.  Given that the OP states this occurs only after using it in his Linux box, I begin to wonder if the Linux USB implementation is in some circumstances triggering a bug in the firmware.

It would be nice if I were able to find an explanation for this behaviour.  I have kernel logs and USB protocol dumps from a Beagle 480 if anyone is interested.  I could perform other testing as well, if requested and setup for it is reasonable.
Comment 14 Tomas Mudrunka 2009-09-03 21:26:18 UTC
Eric Enright: I suspect that my drive was corrupted by "zeroing" it with "dd if=/dev/zero of=/dev/sdb". I needed to do this because I was unable to read the partition table with the usual tools such as (c)fdisk. But it didn't fail immediately after this zeroing and repartitioning. It failed to mount a few times, and it destroyed all the data after every few reconnects. After that it needed to be reflashed and reformatted again.

I have now exchanged the drive (which was absolutely unusable in the final stage of its "disease"), and I am afraid to reformat (and "re-zero") it again, because it is working well for now.

I think a good solution at the kernel (mass storage driver) level could be some kind of write protection for the first few bytes of the storage area on all of these USBest chipsets. It is probably a hardware problem that can't be definitively solved by reflashing, so I think the best approach is to prevent Linux users from triggering the bug.
Comment 15 Tomas Mudrunka 2009-09-03 21:30:24 UTC
Maybe the Windows drivers for this drive simply access the drive starting from higher addresses (which would explain the corrupted partition table). We could check whether the partition table starts a few bytes "higher" than on other drives. Then the Linux driver could just ignore the first few bits/bytes on those chips, and everything might be fine.
Comment 16 Tomas Mudrunka 2009-09-03 21:51:45 UTC
This is what fdisk and cfdisk say about the factory-formatted drive:

0 ;) harvie@harvie-ntb ~ $ LANG=C fdisk -l /dev/sdb

Disk /dev/sdb: 32.3 GB, 32346472448 bytes
255 heads, 63 sectors/track, 3932 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x04dd5721

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        3933    31588320+   c  W95 FAT32 (LBA)

0 ;) harvie@harvie-ntb ~ $ LANG=C cfdisk -P r /dev/sdb
FATAL ERROR: Bad primary partition 0: Partition ends in the final partial cylinder
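
The cfdisk complaint can be checked by hand from the fdisk numbers above. A rough sketch in Python, assuming 512-byte sectors and a partition that starts at sector 63 and is 63176641 sectors long. Those two values are assumptions taken from the partition entry in the xxd dump in comment 11, since fdisk -l prints only cylinders here (the "+" after 31588320 blocks is consistent with an odd sector count).

```python
SECTOR = 512
DISK_BYTES = 32346472448        # from "Disk /dev/sdb: 32.3 GB, 32346472448 bytes"
SECTORS_PER_CYL = 255 * 63      # heads * sectors/track = 16065

# Assumed from the partition entry in the xxd dump in comment 11.
PART_START = 63
PART_SECTORS = 63176641

disk_sectors = DISK_BYTES // SECTOR
full_cylinders, leftover = divmod(disk_sectors, SECTORS_PER_CYL)
part_end = PART_START + PART_SECTORS  # first sector after the partition

# True if the partition extends past the last complete cylinder.
ends_in_partial_cylinder = part_end > full_cylinders * SECTORS_PER_CYL

print(disk_sectors, full_cylinders, leftover, part_end, ends_in_partial_cylinder)
# prints: 63176704 3932 9124 63176704 True
```

With these numbers the disk has 3932 complete cylinders plus 9124 leftover sectors, and the partition runs to the very last sector of the disk. That would explain both the End value of 3933 in fdisk and the cfdisk error, and suggests the error is a geometry complaint about the factory format rather than a sign of corruption.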
Comment 17 Tomas Mudrunka 2009-09-03 22:20:37 UTC
I've just found another strange fact on the web. Many users are experiencing this problem with Corsair Voyagers, but they say the drive only looks like a "USBest" chip after the error. So it is not certain that it's USBest. Look at lsusb:

0 ;) harvie@harvie-ntb ~ $ lsusb 
Bus 001 Device 004: ID 1b1c:0ab1  
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

0 ;) harvie@harvie-ntb ~ $ lsusb -v -d 1b1c:0ab1

Bus 001 Device 007: ID 1b1c:0ab1  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1b1c 
  idProduct          0x0ab1 
  bcdDevice            1.00
  iManufacturer           1 Corsair
  iProduct                2 Flash Voyager
  iSerial                 3 b3475f7170e1f6
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               8
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  bNumConfigurations      1
can't get debug descriptor: Connection timed out
cannot read device status, Connection timed out (110)


===> It returned different results over a few tries... it's strange: <===

0 ;) harvie@harvie-ntb ~ $ lsusb -v -d 1b1c:0ab1

Bus 001 Device 006: ID 1b1c:0ab1  
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1b1c 
  idProduct          0x0ab1 
  bcdDevice            1.00
  iManufacturer           1 
  iProduct                2 
  iSerial                 3 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               8
can't get device qualifier: Connection timed out
can't get debug descriptor: Connection timed out
cannot read device status, Connection timed out (110)
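
The difference between the two lsusb -v runs above is easy to miss: in the second run the string descriptors (iManufacturer, iProduct, iSerial) came back empty and the device qualifier could not be read. A small sketch in Python that flags the empty strings in captured output; `missing_strings` is a hypothetical helper name, and it assumes the field layout shown above (field name, descriptor index, then the string if it could be read):

```python
import re

FIELDS = ("iManufacturer", "iProduct", "iSerial")


def missing_strings(lsusb_text):
    """Return string-descriptor fields whose index is nonzero but whose text is empty."""
    missing = []
    for line in lsusb_text.splitlines():
        m = re.match(r"\s*(\w+)\s+(\d+)\s*(.*)$", line)
        if m and m.group(1) in FIELDS:
            index = int(m.group(2))
            text = m.group(3).strip()
            if index != 0 and not text:
                missing.append(m.group(1))
    return missing
```

Feeding it the output of each `lsusb -v -d 1b1c:0ab1` run would give an empty list for the first run above and all three field names for the second.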
Comment 18 Alan Stern 2009-11-06 18:43:35 UTC
There hasn't been any progress on this recently.  It seems clear that something is buggy in the hardware or firmware, but we don't have any clue what the actual problem is or how to avoid it.

If there's no further progress to be made, I think this bug report can be closed and rejected.
