Bug 7130

Summary: loading the fcal driver eats all CPU and after some time may block all system disk I/O (cannot even reboot)
Product: SCSI Drivers Reporter: Wizard Vandal (wizard580)
Component: Other Assignee: scsi_drivers-other
Status: REJECTED INSUFFICIENT_DATA    
Severity: blocking CC: protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.17.12 Subsystem:
Regression: --- Bisected commit-id:

Description Wizard Vandal 2006-09-09 03:23:21 UTC
Most recent kernel where this bug did not occur: 2.6.17.12
Distribution: Debian SID/Unstable
Hardware Environment: SUN UltraSparc2 E3500 FC-AL SCSI hard drives
Software Environment: Debian with latest updates on 09.09.2006
Problem Description: When I load the fcal driver, my CPU goes 100% busy and 
is never freed. modprobe fcal never finishes. If I do not restart 
immediately, then after 2-4 minutes I cannot even reboot. There seems to be a 
block on all disk I/O. top and similar programs that are already running keep 
working until they exit. Starting them again fails.
So, maybe because of that, I cannot see my FC-AL SCSI hard drives.
My hardware (if I am not mistaken):
1) X2652A FC-AL INTERFACE BOARD 3500.
2) X6731A FCAL GBIC MODULE 100MB/SEC
P.S.: I will do anything to see my hard drives in Debian. If you need any info 
from me, or want me to run tests, just let me know. :)
If anybody can help quickly, I will be blessed. :D My life depends on this.
P.P.S:
I can see the hard drives in the pre-boot "BIOS" with the probe-fcal-all command.

Steps to reproduce:
modprobe fcal on a Sparc arch.
Comment 1 Wizard Vandal 2006-09-09 03:31:10 UTC
Latest 2.6.17.12 kernel: the bug exists.
Comment 2 Andrew Morton 2006-09-09 08:28:08 UTC
On Sat, 9 Sep 2006 03:32:07 -0700
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7130
> 
>            Summary: loading the fcal driver eats all CPU and after some time may
>                     block all system disk I/O (cannot even reboot)
>     Kernel Version: 2.6.17.12
>             Status: NEW
>           Severity: blocking
>              Owner: scsi_drivers-other@kernel-bugs.osdl.org
>          Submitter: wizard580@gmail.com
> 
> 
> Most recent kernel where this bug did not occur: 2.6.17.12
> Distribution: Debian SID/Unstable
> Hardware Environment: SUN UltraSparc2 E3500 FC-AL SCSI hard drives
> Software Environment: Debian with latest updates on 09.09.2006
> Problem Description: When I load the fcal driver, my CPU goes 100% busy and 
> is never freed. modprobe fcal never finishes. If I do not restart 
> immediately, then after 2-4 minutes I cannot even reboot. There seems to be a 
> block on all disk I/O. top and similar programs that are already running keep 
> working until they exit. Starting them again fails.
> So, maybe because of that, I cannot see my FC-AL SCSI hard drives.
> My hardware (if I am not mistaken):
> 1) X2652A FC-AL INTERFACE BOARD 3500.
> 2) X6731A FCAL GBIC MODULE 100MB/SEC
> P.S.: I will do anything to see my hard drives in Debian. If you need any info 
> from me, or want me to run tests, just let me know. :)
> If anybody can help quickly, I will be blessed. :D My life depends on this.
> P.P.S:
> I can see the hard drives in the pre-boot "BIOS" with the probe-fcal-all command.
> 
> Steps to reproduce:
> modprobe fcal on a Sparc arch.
> 

Does anyone else use the fcal driver?

I'd suggest the next step would be to run a kernel profile, or just sysrq-P
to find out where the CPU is stuck.  Could a sparc person please talk the
reporter through that process?

Thanks.

Comment 3 Anonymous Emailer 2006-09-09 08:55:51 UTC
Reply-To: James.Bottomley@SteelEye.com

On Sat, 2006-09-09 at 08:36 -0700, Andrew Morton wrote:
> Does anyone else use the fcal driver?
> 
> I'd suggest the next step would be to run a kernel profile, or just sysrq-P
> to find out where the CPU is stuck.  Could a sparc person please talk the
> reporter through that process?

I'm not sure this is really worth it.  Apart from trying to keep it
compiling, fcal has had no maintainer since the 2.2 kernel days.  Since
no-one has the hardware or the inclination, it's not plugged into the
SCSI FC infrastructure and thus it's bitrotting.  Now might be a good
time to declare it officially dead and remove it from the tree.  Unless
someone actually wants to maintain it and bring it into the 21st
century?

James


Comment 4 Wizard Vandal 2006-09-10 01:16:30 UTC
I agree that the hardware is very old... OK. If it is that unsupported, maybe
someone can work with me to fix it? On a 2.4 or a 2.6 kernel.
I am very sorry about this, but I saw the driver and assumed it was working (I
had never seen a driver that was in the kernel and completely
non-working). I cannot return the hardware to the seller, and I have to get it
working... :(

Please, help me.
Comment 5 Wizard Vandal 2006-09-10 01:17:31 UTC
I am too inexperienced a C programmer to bring it back by myself.
Comment 6 Wizard Vandal 2006-09-10 02:23:40 UTC
Maybe I am simply doing something wrong...
Look:

devalias

disk                     /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0
disksocal                /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0
disk                     /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
diskbrd                  /sbus@3,0/SUNW,fas@3,8800000/sd@a,0
diskisp                  /sbus@3,0/QLGC,isp@0,10000/sd@0,0
net                      /sbus@3,0/SUNW,hme@3,8c00000
cdrom                    /sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f
tape                     /sbus@3,0/SUNW,fas@3,8800000/st@4,0
scsi                     /sbus@3,0/SUNW,fas@3,8800000
disk0                    /sbus@3,0/SUNW,fas@3,8800000/sd@0,0
disk1                    /sbus@3,0/SUNW,fas@3,8800000/sd@1,0
disk2                    /sbus@3,0/SUNW,fas@3,8800000/sd@2,0
disk3                    /sbus@3,0/SUNW,fas@3,8800000/sd@3,0
disk4                    /sbus@3,0/SUNW,fas@3,8800000/sd@4,0
disk5                    /sbus@3,0/SUNW,fas@3,8800000/sd@5,0
tape0                    /sbus@3,0/SUNW,fas@3,8800000/st@4,0
tape1                    /sbus@3,0/SUNW,fas@3,8800000/st@5,0
ttya                     /central/fhc/zs@0,902000:a
ttyb                     /central/fhc/zs@0,902000:b
keyboard                 /central/fhc/zs@0,904000
keyboard!                /central/fhc/zs@0,904000:forcemode
Comment 7 Anonymous Emailer 2006-09-10 06:02:37 UTC
Reply-To: davem@davemloft.net

From: James Bottomley <James.Bottomley@SteelEye.com>
Date: Sat, 09 Sep 2006 11:04:15 -0500

> On Sat, 2006-09-09 at 08:36 -0700, Andrew Morton wrote:
> > Does anyone else use the fcal driver?
> > 
> > I'd suggest the next step would be to run a kernel profile, or just sysrq-P
> > to find out where the CPU is stuck.  Could a sparc person please talk the
> > reporter through that process?
> 
> I'm not sure this is really worth it.  Apart from trying to keep it
> compiling, fcal has had no maintainer since the 2.2 kernel days.  Since
> no-one has the hardware or the inclination, it's not plugged into the
> SCSI FC infrastructure and thus it's bitrotting.

I think it's neither feasible nor worthwhile to plug the fcal driver into
the SCSI FC infrastructure right now, simply because these drivers need a
full software FC stack, and the SCSI FC stuff just provides very high
level interfaces to all of this and expects on-board firmware to do all
the protocol packet building and other FC stuff just like the
Qlogic-FC and other cards do.

So it's not just a matter of "porting fcal to SCSI FC", someone would
need to implement the full FC software stack.

Comment 8 Wizard Vandal 2006-09-10 15:51:08 UTC
If it will help...
1) ------
diskisp                  /sbus@3,0/QLGC,isp@0,10000/sd@0,0
2) ------
config SCSI_QLOGICPTI
        tristate "PTI Qlogic, ISP Driver"
        depends on SBUS && SCSI
        help
          This driver supports SBUS SCSI controllers from PTI or QLogic. These
          controllers are known under Solaris as qpti and in the openprom as
          PTI,ptisp or QLGC,isp. Note that PCI QLogic SCSI controllers are
          driven by a different driver.

          To compile this driver as a module, choose M here: the
          module will be called qlogicpti.
-----
I think they match... but the device is not found...
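
The comparison made in this comment can be sketched in a few lines of Python. This is only an illustration, not part of any kernel or OpenBoot tooling: the function names are hypothetical, and the set of names is copied from the Kconfig help text quoted above. It splits a device path into node names and checks whether any of them is one of the names that help text lists for qlogicpti. A match only shows that the names agree. It does not explain why the device is not detected.

```python
# Names under which, per the Kconfig help text quoted above, the
# controllers handled by qlogicpti appear in the openprom.
QLOGICPTI_PROM_NAMES = {"PTI,ptisp", "QLGC,isp"}


def node_names(device_path):
    """Split a device path into node names.

    Each path component looks like name@unit-address[:arguments];
    only the part before '@' (or ':') is kept.
    """
    names = []
    for component in device_path.strip("/").split("/"):
        name = component.split("@", 1)[0]
        name = name.split(":", 1)[0]
        names.append(name)
    return names


def matches_qlogicpti(device_path):
    """Return True if any node in the path carries a qlogicpti name."""
    return any(name in QLOGICPTI_PROM_NAMES for name in node_names(device_path))


# The diskisp alias from the devalias listing contains a QLGC,isp node;
# the socal path of the FC-AL disks does not.
print(matches_qlogicpti("/sbus@3,0/QLGC,isp@0,10000/sd@0,0"))
print(matches_qlogicpti("/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0"))
```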
Comment 9 Wizard Vandal 2006-09-13 20:33:23 UTC
Hey guys...
What would you say if I gave you SSH access to my server?
I've installed a plain SCSI hard drive and installed Debian on it.

Then could you try to find a solution?
I am ready to give you, remotely, anything you need from me.
Comment 10 Natalie Protasevich 2008-03-29 23:07:30 UTC
I think the bug should be closed.
wizard, I hope you found a better solution for your server needs.