Bug 8851 - hard lock with drivers for hpt374 sata controller (Highpoint Rocket 1540)
Summary: hard lock with drivers for hpt374 sata controller (Highpoint Rocket 1540)
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: IDE
Hardware: All Linux
Importance: P1 normal
Assignee: Sergei Shtylyov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-07 01:45 UTC by Bob Ham
Modified: 2019-05-04 14:56 UTC
CC List: 7 users

See Also:
Kernel Version: 2.6.23-rc2 (linus git tree as of 2007-08-06)
Subsystem:
Regression: No
Bisected commit-id:


Description Bob Ham 2007-08-07 01:45:37 UTC
Neither the hpt366.c nor the pata_hpt37x.c driver works with the Highpoint Rocket 1540.  With these[0][1][2][3] patches from Sergei Shtylyov, both drivers produce some kernel output, listed below, followed by a hard lock.

[0] http://marc.info/?l=linux-ide&m=118634086127536&w=2
[1] http://marc.info/?l=linux-ide&m=118634077716246&w=2
[2] http://marc.info/?l=linux-ide&m=118634429502883&w=2
[3] http://marc.info/?l=linux-ide&m=118634442810858&w=2


Output of hpt366 driver:

HPT374: IDE controller at PCI slot 0000:00:0d.0
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 16
HPT374: chipset revision 7
HPT374: DPLL base: 48 MHz, f_CNT: 142, assuming 33 MHz PCI
HPT374: using 50 MHz DPLL clock
HPT374: 100% native mode on irq 16
    ide2: BM-DMA at 0xec00-0xec07, BIOS settings: hde:DMA, hdf:pio
    ide3: BM-DMA at 0xec08-0xec0f, BIOS settings: hdg:DMA, hdh:pio
ACPI: PCI Interrupt 0000:00:0d.1[A] -> GSI 16 (level, low) -> IRQ 16
HPT374: DPLL base: 48 MHz, f_CNT: 142, assuming 33 MHz PCI
HPT374: using 50 MHz DPLL clock
    ide4: BM-DMA at 0xed00-0xed07, BIOS settings: hdi:DMA, hdj:pio
    ide5: BM-DMA at 0xed08-0xed0f, BIOS settings: hdk:DMA, hdl:pio
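
For readers wondering where "assuming 33 MHz PCI" in the output above might come from, here is a minimal sketch of the arithmetic. It assumes that the driver estimates the PCI clock as f_CNT * DPLL base / 192 and then snaps the result to a 33/40/50/66 MHz band; both the formula and the thresholds are assumptions that should be checked against hpt366.c, and `estimate_pci_clock` is a hypothetical name used only for illustration.

```python
def estimate_pci_clock(f_cnt: int, dpll_base_mhz: int = 48) -> tuple[int, int]:
    """Return (raw estimate in MHz, clamped band in MHz).

    Assumption: raw = f_cnt * dpll_base / 192 (integer division, as C
    would do), then clamped to 33/40/50/66 MHz bands.
    """
    raw = f_cnt * dpll_base_mhz // 192
    if raw < 40:
        band = 33
    elif raw < 45:
        band = 40
    elif raw < 55:
        band = 50
    else:
        band = 66
    return raw, band


# Values taken from the log line "DPLL base: 48 MHz, f_CNT: 142".
raw, band = estimate_pci_clock(142)
print(f"raw estimate {raw} MHz -> assuming {band} MHz PCI")
```

Under these assumptions the logged values give a raw estimate of 35 MHz, which falls in the 33 MHz band, consistent with what the driver printed.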


Output of pata_hpt37x driver:

pata_hpt37x: bus clock 33MHz, using 50MHz DPLL.
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
scsi2: pata_hpt37x
scsi3: pata_hpt37x
ata3: PATA max UDMA/100 cmd 0x0001efa0 ctl 0x0001ef9e bmdma 0x0001ec00 irq 17
ata4: PATA max UDMA/100 cmd 0x0001ef90 ctl 0x0001ef9a bmdma 0x0001ec00 irq 17
Comment 1 Andrew Morton 2007-08-07 01:56:01 UTC
Is this a regression?  If so, can you identify which earlier kernel
version(s) worked OK?
Comment 2 Bob Ham 2007-08-07 02:27:55 UTC
No, this is not a regression.  I've tested the hpt366.c driver from around 2.4.2x and it has always caused a hard lock.  The pata_hpt37x.c driver has not worked since I first looked at it in 2.6.22.1.
Comment 3 Sergei Shtylyov 2007-08-10 07:39:25 UTC
I'm "maintaining" this driver. In quotes because I have only been fixing more or less obvious mistakes so far... :-)
Comment 4 Bartlomiej Zolnierkiewicz 2008-02-16 10:29:25 UTC
Is this still a problem with 2.6.25-rc2?
Comment 5 Roland Kletzing 2008-04-30 13:12:12 UTC
bob, there have been quite a number of changes to this driver since your report - any chance you could try a 2.6.25 kernel?
Comment 6 Roland Kletzing 2008-04-30 14:33:35 UTC
via direct conversation:

>On 4/30/08, bugme-daemon@bugzilla.kernel.org
><bugme-daemon@bugzilla.kernel.org> wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=2555
>
>I have not had any problems with this bug and I think it was
>resolved in 2.6.13. See the last comment on
>http://bugzilla.kernel.org/show_bug.cgi?id=4300
>
>Btw., I boot up linux and then modprobe with 4 HDD connected,
>which works great (I have not tested with the kernel yet).
>
>Best regards,
>Flemming Richter

so, since #4300 is closed, i think we can close this one, too - ok ?
Comment 7 Sergei Shtylyov 2008-05-01 10:30:20 UTC
(In reply to comment #6)
> so, since #4300 is closed, i think we can close this one, too - ok ?

I don't think so -- it's been reported against 2.6.23-rc2 which had the driver already rewritten by me...
Comment 8 Roland Kletzing 2008-05-01 11:02:52 UTC
yes, you're right. that one was much older.
so input from bob is still needed
Comment 9 Bob Ham 2008-05-02 13:33:56 UTC
Sorry for the delay in responding; no excuse really.  Version 2.6.25, unfortunately, still locks hard with both the pata_hpt37x and hpt366 drivers.  Here is the kernel output of the pata_hpt37x driver:

pata_hpt37x: bus clock 33MHz 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 16
scsi2 : pata_hpt37x
scsi3 : pata_hpt37x
ata3: PATA max UDMA/100 cmd 0xefa0 ctl 0xef9c bmdma 0xec00 irq 16
ata4: PATA max UDMA/100 cmd 0xef90 ctl 0xef98 bmdma 0xec08 irq 16
Find mode for 8 reports AX1F48A
Find mode for 8 reports AX1F48A

Here is the output of the hpt366 driver:

HPT374: IDE controller (0x1103:0x0008) rev 0x07) at PCI slot 0000:00:0d.0
ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 16
HPT374: DPLL base: 48 MHz, f_CNT: 142, assuming 33 MHz PCI
HPT374: using 50 MHz DPLL clock
HPT374: 100% native mode on irq 16
ACPI: PCI interrupt 0000:00:0d.1[A] -> GSI 16 (level, low) -> IRQ 16
HPT374: DPLL base: 48 MHz, f_CNT: 142, assuming 33 MHz PCI
HPT374: using 50 MHz DPLL clock
    ide2: BM-DMA at 0xec00-0xec07, BIOS settings: hde:DMA, hdf:PIO
    ide3: BM-DMA at 0xec08-0xec0f, BIOS settings: hdg:DMA, hdh:PIO
    ide4: BM-DMA at 0xed00-0xed07, BIOS settings: hdi:DMA, hdj:PIO
    ide5: BM-DMA at 0xed08-0xed0f, BIOS settings: hdk:DMA, hdl:PIO
Comment 10 Roland Kletzing 2008-05-02 14:17:53 UTC
did you ever make sure that your specific setup works at all?
what about windows?
what about the bios version of the controller?
no problem, as you can see the linux community isn't any better at this - there are lots of open and unresolved bug reports because kernel devs are busy people ;)

it's very much appreciated that you replied after such a long time.

ok - let's dig further:

maybe some controller<->disk incompatibility?
did you try any other disk on this controller, and does that make a difference?

it's weird that highpoint doesn't list this controller at http://www.highpoint-tech.com/USA/bios.htm , isn't it?

furthermore, there is a support page, but most (all?) links to the binaries are broken: http://www.highpoint-tech.com/USA/bios_rr1540.htm

if you don't know whether your specific hardware setup works at all, please verify with any of the vendor-supported operating systems, i.e. try windows, or try one of the linux distros listed at that site with the vendor drivers. please request those drivers/bios upgrades from highpoint support.

we first need to make sure that it isn't a hardware issue.
Comment 11 Roland Kletzing 2008-05-02 14:21:34 UTC
whoops - i screwed up the reply. lines 4-9 should have come first.
Comment 12 Roland Kletzing 2008-05-02 14:35:05 UTC
a similar issue is reported for freebsd: http://lists.freebsd.org/pipermail/freebsd-bugs/2004-October/009964.html

can you test whether the problem is reproducible with freebsd?
you may try a recent freebsd live-cd
e.g. http://www.truebsd.org/ or http://fbsd.wordpress.com/2007/07/19/freebsd-70-livecd-released/  (or find one via google)
Comment 13 Bob Ham 2008-05-02 14:45:31 UTC
The problem is definitely not the hardware.  I normally use a 2.4.34 kernel with the binary blob drivers from highpoint themselves.  Here is the output of this driver, just in case you're curious:

Rocket 1540 SATA Controller driver
Version 1.0, Compiled Jun  4 2007 21:30:19
PCI: Enabling device 00:0d.0 (0105 -> 0107)
scsi2 : hpt374
  Vendor: WDC WD20  Model: 00JD-98HBB0       Rev: 08.0
  Type:   Direct-Access                      ANSI SCSI revision: 00
  Vendor: WDC WD20  Model: 00JD-98HBB0       Rev: 08.0
  Type:   Direct-Access                      ANSI SCSI revision: 00
  Vendor: WDC WD20  Model: 00JD-00HBB0       Rev: 08.0
  Type:   Direct-Access                      ANSI SCSI revision: 00
  Vendor: WDC WD20  Model: 00JD-98HBB0       Rev: 08.0
  Type:   Direct-Access                      ANSI SCSI revision: 00
Attached scsi disk sdc at scsi2, channel 0, id 0, lun 0
Attached scsi disk sdd at scsi2, channel 0, id 1, lun 0
Attached scsi disk sde at scsi2, channel 0, id 2, lun 0
Attached scsi disk sdf at scsi2, channel 0, id 3, lun 0
SCSI device sdc: 390721967 512-byte hdwr sectors (200050 MB)
 /dev/scsi/host2/bus0/target0/lun0: p1
SCSI device sdd: 390721967 512-byte hdwr sectors (200050 MB)
 /dev/scsi/host2/bus0/target1/lun0: p1
SCSI device sde: 390721967 512-byte hdwr sectors (200050 MB)
 /dev/scsi/host2/bus0/target2/lun0: p1
SCSI device sdf: 390721967 512-byte hdwr sectors (200050 MB)
 /dev/scsi/host2/bus0/target3/lun0: p1

It works fine with all the disks in a software (ie, md) raid array but, as noted, it uses a binary blob (with some wrapper source that needs to be compiled.)  I want freedom, hence the bug report :-)

The binary-blob driver will only work with kernel 2.4 or 2.6 less than about 2.6.13 on account of changes to the SCSI system that prevent compilation of the blob's wrapper.  There are reports of these problems in the linux-ide archives, I believe.

I contacted highpoint about the issue over a year ago.  They said the card is no longer supported so they would not be releasing any updates.  I then asked if they would be prepared to open the source to their binary blob.  They didn't respond.

Bob
Comment 14 Roland Kletzing 2008-05-02 15:08:28 UTC
ah - ok. then indeed this should be fixed. 

i sent a request to support and sales and asked nicely again. i don't have much hope this will change anything, but let's see....
Comment 15 Roland Kletzing 2008-05-04 15:26:19 UTC
if i knew the linux equivalent of "atacontrol reinit", i would say: check whether the behaviour from the freebsd bug report applies to you, too:

>If I boot up my computer with hard disks attached to the Highpoint Rocket 1540
>controller (non-RAID), it hard-locks when those disks are probed. Nothing
>responds - not even the caps-lock and num-lock keys. The access-light on the
>floppy drive is also permanently lit.

>This problem does not arise if no disks are attached when booting. If I attach
>the disks after booting and issue an 'atacontrol reinit' command, everything
>works fine. However, it does make it impossible to boot off a drive attached
>to this controller.


unfortunately, i don't - but maybe somebody knows....
Comment 16 Roland Kletzing 2008-05-04 15:44:37 UTC
http://osdir.com/ml/raid/2004-04/msg00156.html
interesting.....


i would add printk's to the driver code to find out where it hangs.
if you don't know where to put them, i would add one at the beginning and one at the end (before return) of every function.
that way you could get a clue where it hangs....
Comment 17 Sergei Shtylyov 2008-05-05 03:52:35 UTC
(In reply to comment #16)
> http://osdir.com/ml/raid/2004-04/msg00156.html
> interesting.....
 
> i would add printk's in the driver code to find out where it hangs.
> if you don't know at which places, i would add to the beginning and the end
> (before return) of every function.
> so you could get a clue where it hangs....

That would have been too simple. IDE drivers are not autonomous -- most of the work is done by the IDE core itself, so most probably you'll have to trace it.
Comment 18 Bob Ham 2008-08-27 13:10:01 UTC
Hi there,

Through the use of printk's and numerous reboots, I've managed to discern the offending call in the pata_hpt37x driver.  It's the call to ioread16_rep() in ata_sff_data_xfer() in libata-sff.c.  I stuck a call to dump_stack() at the beginning of the function and this was the output:

[    4.980479] Pid: 136, comm: ata/0 Not tainted 2.6.26.3 #12
[    4.980576]  [<c02bfb50>] ata_sff_data_xfer+0x2f/0x13e
[    4.980729]  [<c02beec4>] ata_pio_sector+0x125/0x1a1
[    4.980880]  [<c02bf032>] ata_pio_sectors+0xf2/0x106
[    4.981031]  [<c02bf7d5>] ata_sff_hsm_move+0x6cf/0x8af
[    4.981182]  [<c0121e2d>] ? destroy_timer_on_stack+0xd/0xf
[    4.981398]  [<c03ca468>] ? schedule_timeout+0x7b/0x90
[    4.981617]  [<c0122001>] ? process_timeout+0x0/0xa
[    4.981827]  [<c0232c2b>] ? delay_tsc+0x13/0x21
[    4.982037]  [<c02c0886>] ata_pio_task+0xb8/0xc9
[    4.982186]  [<c012791e>] run_workqueue+0xba/0x182
[    4.982340]  [<c01278e4>] ? run_workqueue+0x80/0x182
[    4.982546]  [<c02c07ce>] ? ata_pio_task+0x0/0xc9
[    4.982750]  [<c012804b>] ? worker_thread+0x0/0xbb
[    4.982957]  [<c01280fb>] worker_thread+0xb0/0xbb
[    4.983109]  [<c012a4c5>] ? autoremove_wake_function+0x0/0x33
[    4.983330]  [<c012a406>] kthread+0x39/0x5f
[    4.983478]  [<c012a3cd>] ? kthread+0x0/0x5f
[    4.983692]  [<c010373b>] kernel_thread_helper+0x7/0x10


Here is the sanitised kernel output from the point at which the pata_hpt37x driver is first mentioned:

[    4.789578] bus: 'pci': add driver pata_hpt37x
[    4.789669] kobject: 'pata_hpt37x' (dfa92770): kobject_add_internal: parent: 'drivers', set: 'drivers'
[    4.789842] bus: 'pci': driver_probe_device: matched device 0000:00:0d.0 with driver pata_hpt37x
[    4.789945] bus: 'pci': really_probe: probing driver pata_hpt37x with device 0000:00:0d.0
[    4.790402] ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 16
[    4.801525] pata_hpt37x: bus clock 33MHz, using 50MHz DPLL.
[    4.801699] ata_pci_sff_init_one: ENTER
[    4.801780] ata_host_alloc: ENTER
[    4.801873] ata_port_alloc: ENTER
[    4.802136] ata_port_alloc: ENTER
[    4.802474] __ata_port_freeze: ata4294967295 port frozen
[    4.802568] __ata_port_freeze: ata4294967295 port frozen
[    4.802833] scsi2 : pata_hpt37x
[    4.802921] device: 'host2': device_add
[    4.803002] kobject: 'host2' (dfa823e8): kobject_add_internal: parent: '0000:00:0d.0', set: 'devices'
[    4.803140] PM: Adding info for No Bus:host2
[    4.803234] kobject: 'host2' (dfa823e8): kobject_uevent_env
[    4.803331] kobject: 'host2' (dfa823e8): kobject_uevent_env: filter function caused the event to drop!
[    4.803437] device: 'host2': device_add
[    4.803530] kobject: 'host2' (dfa825a4): kobject_add_internal: parent: 'scsi_host', set: 'devices'
[    4.803739] PM: Adding info for No Bus:host2
[    4.803832] kobject: 'host2' (dfa825a4): kobject_uevent_env
[    4.803966] kobject: 'host2' (dfa825a4): fill_kobj_path: path = '/class/scsi_host/host2'
[    4.804069] kobject: '0000:00:0d.0' (df8e61b4): fill_kobj_path: path = '/devices/pci0000:00/0000:00:0d.0'
[    4.804555] scsi3 : pata_hpt37x
[    4.804639] device: 'host3': device_add
[    4.804736] kobject: 'host3' (df09f438): kobject_add_internal: parent: '0000:00:0d.0', set: 'devices'
[    4.804855] PM: Adding info for No Bus:host3
[    4.804964] kobject: 'host3' (df09f438): kobject_uevent_env
[    4.805048] kobject: 'host3' (df09f438): kobject_uevent_env: filter function caused the event to drop!
[    4.805165] device: 'host3': device_add
[    4.805246] kobject: 'host3' (df09f5f4): kobject_add_internal: parent: 'scsi_host', set: 'devices'
[    4.805455] PM: Adding info for No Bus:host3
[    4.805548] kobject: 'host3' (df09f5f4): kobject_uevent_env
[    4.805679] kobject: 'host3' (df09f5f4): fill_kobj_path: path = '/class/scsi_host/host3'
[    4.805781] kobject: '0000:00:0d.0' (df8e61b4): fill_kobj_path: path = '/devices/pci0000:00/0000:00:0d.0'
[    4.806137] ata3: PATA max UDMA/100 cmd 0xefa0 ctl 0xef9c bmdma 0xec00 irq 16
[    4.806228] ata4: PATA max UDMA/100 cmd 0xef90 ctl 0xef98 bmdma 0xec08 irq 16
[    4.806331] ata_host_register: probe begin
[    4.806415] ata_port_schedule_eh: port EH scheduled
[    4.807003] ata_scsi_error: ENTER
[    4.807078] ata_port_flush_task: ENTER
[    4.807172] ata3: ata_port_flush_task: EXIT
[    4.807261] ata_eh_link_autopsy: ENTER
[    4.807351] ata_eh_recover: ENTER
[    4.808417] Find mode for 8 reports 12848242
[    4.809190] Find mode for 8 reports 12848242
[    4.815837] __ata_port_freeze: ata3 port frozen
[    4.816173] ata_sff_softreset: ENTER
[    4.816536] ata_sff_softreset: about to softreset, devmask=3
[    4.816696] ata_bus_softreset: ata3: bus reset via SRST
[    4.971810] ata_dev_classify: found ATA device by sig
[    4.971994] ata_dev_classify: found ATA device by sig
[    4.972076] ata_sff_softreset: EXIT, classes[0]=1 [1]=9
[    4.972768] ata_eh_thaw_port: ata3 port thawed
[    4.972937] ata_std_postreset: ENTER
[    4.973013] ata_std_postreset: EXIT
[    4.973820] ata_eh_revalidate_and_attach: ENTER
[    4.973913] ata3.00: ata_dev_read_id: ENTER
[    4.973998] ata3: ata_dev_select: ENTER, device 0, wait 1
[    4.974125] ata_sff_tf_load: feat 0x0 nsect 0x0 lba 0x0 0x0 0x0
[    4.974209] ata_sff_tf_load: device 0xA0
[    4.974297] ata_sff_exec_command: ata3: cmd 0xEC
[    4.979622] ata_sff_hsm_move: ata3: protocol 2 task_state 2 (dev_stat 0x58)
[    4.980135] ata_pio_sector: data read
[    4.980397] libata-sff: ata_sff_data_xfer: begin; stack:
[    4.980479] Pid: 136, comm: ata/0 Not tainted 2.6.26.3 #12
[    4.980576]  [<c02bfb50>] ata_sff_data_xfer+0x2f/0x13e
[    4.980729]  [<c02beec4>] ata_pio_sector+0x125/0x1a1
[    4.980880]  [<c02bf032>] ata_pio_sectors+0xf2/0x106
[    4.981031]  [<c02bf7d5>] ata_sff_hsm_move+0x6cf/0x8af
[    4.981182]  [<c0121e2d>] ? destroy_timer_on_stack+0xd/0xf
[    4.981398]  [<c03ca468>] ? schedule_timeout+0x7b/0x90
[    4.981617]  [<c0122001>] ? process_timeout+0x0/0xa
[    4.981827]  [<c0232c2b>] ? delay_tsc+0x13/0x21
[    4.982037]  [<c02c0886>] ata_pio_task+0xb8/0xc9
[    4.982186]  [<c012791e>] run_workqueue+0xba/0x182
[    4.982340]  [<c01278e4>] ? run_workqueue+0x80/0x182
[    4.982546]  [<c02c07ce>] ? ata_pio_task+0x0/0xc9
[    4.982750]  [<c012804b>] ? worker_thread+0x0/0xbb
[    4.982957]  [<c01280fb>] worker_thread+0xb0/0xbb
[    4.983109]  [<c012a4c5>] ? autoremove_wake_function+0x0/0x33
[    4.983330]  [<c012a406>] kthread+0x39/0x5f
[    4.983478]  [<c012a3cd>] ? kthread+0x0/0x5f
[    4.983692]  [<c010373b>] kernel_thread_helper+0x7/0x10
[    4.983845]  =======================
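
When reading a trace like the one above, it can help to separate the frames the kernel reports without a "?" from the ones marked "?", which are addresses found on the stack that may not be part of the actual call chain. The following is a rough sketch of such a filter; the regular expression and the helper name `reliable_frames` are assumptions made for illustration and only cover the frame format shown in this report.

```python
import re

# Matches frames such as " [<c02bfb50>] ata_sff_data_xfer+0x2f/0x13e"
# and " [<c0121e2d>] ? destroy_timer_on_stack+0xd/0xf".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def reliable_frames(log: str) -> list[str]:
    """Return function names of frames not marked with '?', innermost first."""
    frames = []
    for line in log.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            frames.append(m.group("func"))
    return frames


# A few lines copied from the trace in this comment.
sample = """\
[    4.980576]  [<c02bfb50>] ata_sff_data_xfer+0x2f/0x13e
[    4.980729]  [<c02beec4>] ata_pio_sector+0x125/0x1a1
[    4.981182]  [<c0121e2d>] ? destroy_timer_on_stack+0xd/0xf
[    4.982037]  [<c02c0886>] ata_pio_task+0xb8/0xc9
"""
print(" <- ".join(reliable_frames(sample)))
```

For the sample lines this leaves ata_sff_data_xfer, ata_pio_sector and ata_pio_task, i.e. the PIO data-transfer path named in the comment.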
Comment 19 Alan 2008-08-27 15:29:11 UTC
Not sure we can do much with this - no documentation on the raid devices.

ata_sff_data_xfer is the transfer of bytes to the controller; it's a totally standard bit of interface, or at least it is on non-RAID hardware. I guess you could try either 32-bit transfers or the VLB sync sequence, but really it's guesswork here.
Comment 20 Sergei Shtylyov 2008-08-27 15:46:36 UTC
(In reply to comment #19)

> Not sure we can do much with this - no documentation on the raid devices.

Is Rocket 1540 a RAID adapter?

> ata_sff_data_xfer is the transfer of bytes to the controller, its a totally
> standard bit of interface, or at least it is on non RAID hardware. I guess
> you could try either 32bit transfers or the VLB sync sequence but really its
> guesswork here.

I would think that the issue is with the SATA bridge...
Comment 21 Bob Ham 2008-08-28 03:58:37 UTC
No, the Rocket 1540 isn't a RAID adapter, not even software RAID; just a straight 4-channel SATA card.  It's distinct, at least in terms of marketing, from the RocketRaid products.

I'm hoping here, but are 32-bit transfers or VLB sync simply options that can be enabled, or would a change to the driver be needed?
Comment 22 Bob Ham 2008-12-30 02:02:09 UTC
Having emailed the linux team at Highpoint (linux@highpoint-tech.com) about this issue, they've told me two general things which may be of use:

1. The Rocket 1540 doesn't conform to the standard HPT374 hardware interface and (I presume because of this) requires specific initialisation.

2. There is no way, in software, to differentiate between the Rocket 1540 and other HPT374-based adapters.
Comment 23 Sergei Shtylyov 2008-12-30 02:26:44 UTC
(In reply to comment #22)

First of all, I'm sorry for not having found enough time to try and deal with this bug so far (I could have). Unfortunately, I had to move away from the IDE activity in the past several months...

> Having emailed the linux team at Highpoint (linux@highpoint-tech.com) about
> this issue, they've told me two general things which may be of use:

> 1. The Rocket 1540 doesn't conform to the standard HPT374 hardware interface
> and (I presume because of this) requires specific initialisation.

It's strange that pre-2.4.19 drivers managed to work anyway.

> 2. There is no way, in software, to differentiate between the Rocket 1540 and
> other HPT374-based adapters.

What can I say? Morons... their whole policy of assigning device IDs has always been bogus.
Comment 24 Bob Ham 2008-12-30 02:51:14 UTC
(In reply to comment #23)
> (In reply to comment #22)
> 
> > 1. The Rocket 1540 doesn't conform to the standard HPT374 hardware interface
> > and (I presume because of this) requires specific initialisation.
> 
> It's strange that pre-2.4.19 drivers managed to work anyway.

Are you aware of specific instances of the Rocket 1540 working with pre-2.4.19 drivers, or was the pre-2.4.19 version number derived from what I noted about my testing?

Above I said "I've tested the hpt366.c driver from around 2.4.2x and it has always caused a hard lock."  To clarify, by that I meant that I haven't tested any drivers earlier than 2.4.2x but the ones I have tested, from 2.4.2x upwards, have always hard locked.

If you're aware of other instances of pre-2.4.19 drivers working, I'll give those versions a test.
Comment 25 Sergei Shtylyov 2008-12-30 02:58:04 UTC
(In reply to comment #24)
> (In reply to comment #23)
> > (In reply to comment #22)
> > 
> > > 1. The Rocket 1540 doesn't conform to the standard HPT374 hardware interface
> > > and (I presume because of this) requires specific initialisation.
> > 
> > It's strange that pre-2.4.19 drivers managed to work anyway.

> Are you aware of specific instances of the Rocket 1540 working with
> pre-2.4.19 drivers or was that version number of pre-2.4.19 derived from
> what I noted about my testing?

Oops, I've mixed up this bug with the bug #7703. :-)
Comment 26 Sergei Shtylyov 2008-12-30 03:35:26 UTC
(In reply to comment #13)
> The problem is definitely not the hardware.  I normally use a 2.4.34 kernel
> with the binary blob drivers from highpoint themselves.  Here is the output
> of this driver, just in case you're curious:

> Rocket 1540 SATA Controller driver
> Version 1.0, Compiled Jun  4 2007 21:30:19

[...]

> It works fine with all the disks in a software (ie, md) raid array but, as
> noted, it uses a binary blob (with some wrapper source that needs to be
> compiled.)  I want freedom, hence the bug report :-)

> The binary-blob driver will only work with kernel 2.4 or 2.6 less than about
> 2.6.13 on account of changes to the SCSI system that prevent compilation of
> the blob's wrapper.  There are reports of these problems in the linux-ide
> archives, I believe.

BTW, what was the tarball containing this driver called? The (seemingly orphaned) RR1540 page on the HPT site (http://www.highpoint-tech.com/USA/bios_rr1540.htm) points to http://www.highpoint-tech.com/BIOS%20+%20Driver/rr1540/Linux/hpt374-opensource-v2.14-1101.tgz which is no longer there. There are newer HPT374 drivers, however -- look here:

http://www.highpoint-tech.com/BIOS_Driver/hpt374/Linux/

Having looked into the (messy) source code, I've found that it should be able to drive the RR1540.
Comment 27 Bob Ham 2008-12-30 04:39:24 UTC
(In reply to comment #26)

> BTW, what was the tarball containing this driver called? The (seemingly
> orphaned) RR1540 page on the HPT site
> (http://www.highpoint-tech.com/USA/bios_rr1540.htm)
> points to
> http://www.highpoint-tech.com/BIOS%20+%20Driver/rr1540/Linux/hpt374-opensource-v2.14-1101.tgz
> which is no longer there. There are newer HPT374 drivers, however -- look
> here:
> 
> http://www.highpoint-tech.com/BIOS_Driver/hpt374/Linux/
> 
> Having looked into the (messy) source code, I've found that it should be
> able to drive the RR1540.

First of all, I'd point out that the specific card that's causing the problem is the "Rocket 1540" and not the "RocketRaid 1540".  I'm not sure whether they contain different hardware but the fact that they have different drivers certainly gives that impression.


Originally, the support page for the Rocket 1540 was named "bios_r1540.htm" but it has been likewise orphaned.  Luckily, the Internet Archive has a copy:

http://web.archive.org/web/20060718163359/http://www.highpoint-tech.com/USA/bios_r1540.htm

The original binary-blob driver was named `r1540-openbuild-v1.0.tgz' and this is also available from the Internet Archive, through the link at the bottom of the page.

However, in the aforementioned email from the linux team at highpoint, I also received an updated binary-blob driver named `r1540-opensource-v1.1.tgz'.  I haven't tested this yet but it appears to at least compile with recent 2.6 kernels.  (I'd note it's odd that they're still updating drivers for supposedly unsupported products while refusing to release the source, especially when they said, in the very same email, that the reason they're not releasing the source is *because* the product is discontinued!)

The readme in the new tarball contains a liberal license for the included source but says nothing about the binary blobs.  So, I've removed the blobs (hptprot.o and hptprot-x86_64.o) and uploaded the tarball here:

  http://pkl.net/~node/software/r1540-opensource-v1.1-no-blob.tar.gz

Unfortunately, the blobs do differ between the 1.0 and 1.1 versions.  I've emailed highpoint again to ask if it's OK to share the driver with others.
Comment 28 Bob Ham 2008-12-31 03:35:24 UTC
I've received a response from Highpoint and they've stated that the driver can be freely distributed, so I've uploaded it, complete with binary blobs, here:

  http://pkl.net/~node/software/r1540-opensource-v1.1.tgz


They've also stated that, in fact, the reason they won't release the source is that it is a company policy not to release full source code.  Unfortunately, they haven't explained why it's a company policy.
Comment 29 Roland Kletzing 2008-12-31 04:11:01 UTC
R1540 and RR1540 are indeed different controllers - they have the same chipset, but with a different revision number:
http://www.highpoint-tech.com/image/products/SATA/r1540pix_big.gif
http://www.highpoint-tech.com/image/products/SATA/rr1540pix_big.jpg

>Having emailed the linux team at Highpoint (linux@highpoint-tech.com) about
>this issue, they've told me two general things which may be of use:

>1. The Rocket 1540 doesn't conform to the standard HPT374 hardware interface
>and (I presume because of this) requires specific initialisation.

so, what is a standard HPT374 interface? i think they mean the later revision....

>2. There is no way, in software, to differentiate between the Rocket 1540 and
>other HPT374-based adapters.
so, if the R1540 needs to be handled differently (maybe because the driver needs some quirks), what about a driver param to tell it that it's an R1540 and not another HPT374-based card? Anyway - the first thing to find out is what quirks are needed.....
Comment 30 Bob Ham 2008-12-31 06:37:37 UTC
(In reply to comment #29)

> Anyway - the first thing to find out is what
> quirks are needed.....

It might be worth contacting linux@highpoint-tech.com to ask about the quirks and how to initialise the controller.  They seem to be open to at least answering questions.
Comment 31 Alan 2009-03-26 17:24:55 UTC
Closing as WONTFIX. If HPT won't provide the needed info/references we can't do much about it
Comment 32 Bob Ham 2009-03-26 18:41:50 UTC
There seem to be two different camps within HPT.  The first are the people who deal with support tickets.  Their unwavering response is to parrot the company policy of the product no longer being "supported".  The second are the people who answer the linux@highpoint-tech.com email address.  They have recently, in admirable contradiction to the company policy of not supporting the product, turned around a fix for their binary-blob driver in a couple of hours.

I'm sure if one of the driver developers were to email the linux@... address to ask questions, they would provide the needed info/references.  I haven't done this because I've no real experience developing drivers (I've no knowledge of PCI, ATA, SATA, etc.)  I wouldn't know what to ask.  All I could do is be a go-between and it would be much more efficient if the person asking the questions communicated directly with the person giving the answers.


Point of order: I only set the bug to REOPENED because bugzilla won't let me submit this comment with the (only available) Resolution choice of "ARRAY(0x2dc4758)" given under the status of RESOLVED.  I leave it to others to decide whether the bug should still be open.
Comment 33 Bob Ham 2009-06-24 07:19:08 UTC
(In reply to comment #32)
> There seems to be two different camps within HPT.  The first are the people
> who deal with support tickets.  Their unwavering response is to parrot the
> company policy of the product no longer being "supported".  The second is
> the people who answer the linux@highpoint-tech.com email address.

Well, it was too much to ask.  There seemed to be no movement on this, so I thought I'd take on the role of a go-between and ask the linux@highpoint-tech.com people for the initialisation information.  They seem to have disappeared, to be replaced by the same "it's not supported" people as the support address.  To the degree that they won't even respond to emails.

<rah> what can we learn from this?
<rah> don't buy highpoint
<iDunno> no...
<iDunno> we learn from this...
<iDunno> 99% of hardware manufacturers are c**ts.

Can't say I disagree.
Comment 34 carlojpisani 2019-05-04 09:24:05 UTC
hi guys
the problem is back. What's happening is similar to Bug 2271, which appeared in 2004.

With a couple of friends we tried HighPoint HPT374 on a C3600 workstation running Kernel v4.16 in 64bit mode. It didn't panic but, during a file-copy operation, the DMA caused corruption to the file. The filesystem was not corrupted. 

# lssize data1.bin
400 Mbyte
# cp data1.bin data2.bin
# md5sum data1.bin data2.bin
6004eb9dd9189770655f8b49a1d688a8  data1.bin
6004eb9dd9189770655f8b49a1d688a8  data2.bin

# lssize bigone1
5 Gbyte
# cp bigone1 bigone2
# md5sum bigone1 bigone2
f60a9f7ff4bcec465ea47e0f009354fd  bigone1
5e1fdedc560cfe82a5d59b740a980091  bigone2 <---- corrupted!

Digging deeper, it only happens with big files.
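
A checksum mismatch like the one above says that the copy differs but not where. As a possible next step, the sketch below compares two files block by block and reports the offset of the first differing byte, which might help show whether the damage starts at a consistent position. The helper name `first_mismatch` and the 1 MiB block size are arbitrary choices for illustration, and the demo uses small generated files rather than the files from this report.

```python
import os
import tempfile


def first_mismatch(path_a: str, path_b: str, block: int = 1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(block), fb.read(block)
            if a != b:
                # Locate the exact byte inside the differing block.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common prefix is equal, so one file is shorter.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a difference
            offset += len(a)


# Demo with generated data: flip one byte in a 3 MiB copy.
with tempfile.TemporaryDirectory() as d:
    good = os.path.join(d, "bigone1")
    bad = os.path.join(d, "bigone2")
    data = bytes(range(256)) * 4096 * 3
    corrupted = bytearray(data)
    corrupted[2_500_000] ^= 0xFF
    with open(good, "wb") as f:
        f.write(data)
    with open(bad, "wb") as f:
        f.write(corrupted)
    demo_offset = first_mismatch(good, bad)

print(demo_offset)
```

Running it against the real source and copy (for example `first_mismatch("bigone1", "bigone2")`) over several attempts would show whether the first bad offset moves around between runs.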
Comment 35 Alan 2019-05-04 14:38:47 UTC
If it only happens with big files then it's wildly unlikely to be the disk driver, because the driver has no idea about files or where blocks are in a file.

Given this is a non-x86 platform I'd start with the folks who actually still touch that platform - much about the HPPA world is weird and wonderful, including things like cache coherency.
Comment 36 carlojpisani 2019-05-04 14:54:29 UTC
I think it's related to the DMA timing or something. It happens when copying a 500MB file, or even a whole partition (dd if=/dev/sdc ...). I first get 50 MB/s until my memory is full and then it drops to 15 MB/s, and at this precise point I see the kernel issuing messages about DMA problems and errors.

Anyway it also happens on my x86 machine, so it's not the architecture.

Besides, I remember the following email, where we had a similar problem with SIL24. And the behavior looks the same.

--

To: linux-ide@xxxxxxxxxxxxxxx
Subject: PROBLEM: sil24: transfer errors causing data corruption or very low performance
From: mtths@xxxxxxxxxxxxxx
Date: Sat, 22 Apr 2017 17:33:56 +0200
User-agent: KMail/5.2.3 (Linux/4.9.18-sil24dma32; KDE/5.28.0; x86_64; ; )

A SATA PCI-controller card using the kernel module sata_sil24 has problems transferring big files.

1st test setting:
Reading from the disks to /dev/null with
dd if=/dev/sde of=/dev/null bs=4k count=2304k
one after the other. (Disk ST3000DM001 was temporarily directly connected to the card's external SATA port.)

Result: After some MB, but before 470 MiB errors occurred:
failed command: READ FPDMA QUEUED
[cf. attachment dmesg]
failed command: READ DMA
[cf. attachment dmesg_tmpHD]

The errors were reproducible - however, they started after different amounts of data.

2nd test setting:
Doing the same tests as above.
Result: Repeatedly no errors occurred.

N.B.: 1) There is a Windows Vista (32 bit) installation that uses two of the 
ports of the controller card as a fake RAID 1: There are no problems with the 
internal directly connected drives nor with an external drive.
2) Prior to this controller card there was a PCI Express card using SiI 3132, 
that - if I remember rightly - also had such problems, but they started after 
a greater amount of data. (At that time a Windows XP x64 installation had no 
problems, too.)

--- drivers/ata/sata_sil24.c.orig   2017-01-09 08:32:38.000000000 +0100
+++ drivers/ata/sata_sil24.c   2017-01-13 23:44:58.378675018 +0100
@@ -1313,15 +1313,4 @@

   /* configure and activate the device */
-   if (!dma_set_mask(&pdev->dev, DMA_BIT_MASK(64))) {
-      rc = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
-      if (rc) {
-         rc = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
-         if (rc) {
-            dev_err(&pdev->dev,
-               "64-bit DMA enable failed\n");
-            return rc;
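
The excerpt above removes the driver's attempt to set a 64-bit DMA mask; the rest of the patch is cut off, so what replaces it is not shown here. As background only, the sketch below mirrors the kernel's DMA_BIT_MASK(n) macro in Python to show what address range each mask allows. `reachable` is a hypothetical helper written for this illustration, not a kernel function.

```python
def dma_bit_mask(n: int) -> int:
    """Python mirror of the kernel's DMA_BIT_MASK(n) macro."""
    return (1 << 64) - 1 if n == 64 else (1 << n) - 1


def reachable(addr: int, mask_bits: int) -> bool:
    """True if a bus address fits under a DMA mask of the given width."""
    return addr <= dma_bit_mask(mask_bits)


four_gib = 1 << 32
print(hex(dma_bit_mask(32)))        # highest address a 32-bit mask allows
print(reachable(four_gib - 1, 32))  # last byte below 4 GiB
print(reachable(four_gib, 32))      # first byte at 4 GiB
print(reachable(four_gib, 64))      # same address under a 64-bit mask
```

With a 32-bit mask, buffers at or above 4 GiB are out of the device's direct reach, which is why a kernel restricted to 32-bit DMA can behave differently from one that lets the card address all of memory.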
Comment 37 carlojpisani 2019-05-04 14:56:36 UTC
here we are willing to publish tests about sata controllers for non-x86 computers
http://www.downthebunker.com/reloaded/space/viewtopic.php?f=50&t=337&p=1534

if someone wants to collaborate, he/she is welcome!
