Bug 6344 - ESP DMA and sbus (mainly in sparc32 but in sparc64 too)
Status: CLOSED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: SPARC32
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Martin Habets
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-04-07 02:19 UTC by JKB
Modified: 2006-07-05 14:43 UTC

See Also:
Kernel Version: 2.6.16.1
Subsystem:
Regression: ---
Bisected commit-id:



Description JKB 2006-04-07 02:19:32 UTC
Most recent kernel where this bug did not occur: 2.4.32.
Distribution: debian.
Hardware Environment: SPARCstation 20 (sun4m), 448 MB of RAM, esp (FAS100A),
MBus SuperSPARC SM71.
Software Environment: very basic kernel with built-in raid1, UP.
Problem Description:

When traffic on the SCSI bus is high, esp hangs and freezes the system with the
following messages:

Apr  6 20:36:01 lebegue kernel: esp0: DMA error a440030e
Apr  6 20:36:01 lebegue kernel: esp0: Resetting scsi bus
Apr  6 20:36:01 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:01 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:01 lebegue kernel: esp0: Warning, live target 3 not responding to selection.
Apr  6 20:36:01 lebegue kernel: esp0: Warning, live target 1 not responding to selection.
Apr  6 20:36:02 lebegue kernel: esp0: Resetting scsi bus
Apr  6 20:36:02 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:02 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:02 lebegue kernel: esp0: Warning, live target 3 not responding to selection.
Apr  6 20:36:02 lebegue kernel: esp0: Warning, live target 1 not responding to selection.
Apr  6 20:36:03 lebegue kernel: esp0: Resetting scsi bus
Apr  6 20:36:03 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:03 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:03 lebegue kernel: esp0: Warning, live target 3 not responding to selection.
Apr  6 20:36:03 lebegue kernel: esp0: Resetting scsi bus
Apr  6 20:36:03 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:03 lebegue kernel: esp0: SCSI bus reset interrupt
Apr  6 20:36:13 lebegue kernel: sd 0:0:3:0: scsi: Device offlined - not ready after error recovery
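
To spot the same failure pattern in a longer kernel log, a small script can pull
out the DMA error value and count the bus resets that follow it. This is only a
minimal sketch in Python, not part of the original report: the function name
`summarize_esp_log` and the two regular expressions are illustrative assumptions
modelled on the message formats quoted above, and other kernel versions may word
these messages differently.

```python
import re

# Patterns modelled on the messages quoted in this report (assumed formats).
DMA_ERROR = re.compile(r"(esp\d+): DMA error ([0-9a-fA-F]+)")
BUS_RESET = re.compile(r"(esp\d+): Resetting scsi bus")


def summarize_esp_log(lines):
    """Return (DMA error values as ints, number of bus resets requested)."""
    errors = []
    resets = 0
    for line in lines:
        match = DMA_ERROR.search(line)
        if match:
            # The driver prints the value in hex without a 0x prefix.
            errors.append(int(match.group(2), 16))
            continue
        if BUS_RESET.search(line):
            resets += 1
    return errors, resets


sample = [
    "Apr  6 20:36:01 lebegue kernel: esp0: DMA error a440030e",
    "Apr  6 20:36:01 lebegue kernel: esp0: Resetting scsi bus",
    "Apr  6 20:36:01 lebegue kernel: esp0: SCSI bus reset interrupt",
]
errors, resets = summarize_esp_log(sample)
print([hex(e) for e in errors], resets)
```

Running it over a full log (for example the lines of /var/log/kern.log) gives a
quick count of how many resets each DMA error triggered.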

Steps to reproduce:
1/ Use a SPARCstation (sun4c or sun4m; maybe sun4d, but I don't have one to
test), or an UltraSPARC with SBus and an ESP FAS100A (for example a U1; the U1E
works fine with its HME ESP adapter).
2/ Connect one or two disks to the internal SCSI bus.
3/ Stress the SCSI bus (for example with raid1).
4/ Wait for the crash. It takes several hours with a SuperSPARC-II on an SS20,
and a few minutes with a HyperSPARC/200 ;-)

Observations: the bug occurs more often with a HyperSPARC CPU than with a
SuperSPARC. I think the bug comes from the DMA/VDMA/SBus support, but I can't
find any mistake in the sources. I have seen this bug on several machines (SS5
with MicroSPARC-II, SS20 with SuperSPARC and HyperSPARC UP, U1 with UltraSPARC-I).

Regards,

JKB
Comment 1 JKB 2006-04-09 07:14:34 UTC
I have tested several different configurations, and this bug seems to come from
highmem support in the sparc32 tree. But I don't know why my U1 shows the same
problem.

JKB
Comment 2 Jurij Smakov 2006-04-10 21:15:31 UTC
I can reproduce this bug on a SPARCstation 20 with Debian's kernel 2.6.16-5 from
unstable (essentially 2.6.16.2). I'm getting a slightly different error code:

 esp0: DMA error a440030f
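
The two reported values differ only in their lowest bit, which a quick XOR makes
visible. The following is a minimal Python sketch added for illustration; the
variable names are hypothetical, and it deliberately does not interpret what
that bit means, since that depends on the DMA controller's register layout,
which this report does not spell out.

```python
# Values taken from the DMA error messages in the description and in comment 2.
reported_in_description = 0xA440030E
reported_in_comment_2 = 0xA440030F

# XOR leaves a 1 in every bit position where the two values disagree.
difference = reported_in_description ^ reported_in_comment_2
differing_bits = [bit for bit in range(32) if difference & (1 << bit)]

print(f"difference mask: {difference:#010x}")
print(f"differing bit positions: {differing_bits}")
```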
Comment 3 Martin Habets 2006-06-23 03:34:38 UTC
Bob Breuer submitted a patch for this, which has been pushed upstream
by David Miller.

http://marc.theaimsgroup.com/?l=linux-sparc&m=115077649707675&w=2
