Bug 5776

Summary: initio: Kernel crash while launching K3B app. with SCSI driver
Product: IO/Storage Reporter: Srdjan Todorovic (todorovic.s)
Component: SCSI Assignee: Mike Anderson (andmike)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: polynomial-c, protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.14 Subsystem:
Regression: --- Bisected commit-id:
Attachments: dmesg output under 2.6.15-git9
Serial console messages under 2.6.17
Serial console message under 2.6.18-git21

Description Srdjan Todorovic 2005-12-22 20:22:59 UTC
Most recent kernel where this bug did not occur: 2.6.13.5
Distribution: Slackware 10.1
Hardware Environment:
  AMD Athlon(tm) XP 2600+
  nVidia nForce2
01:09.0 SCSI storage controller: DTC Technology Corp. Domex DMX3194UP SCSI
Adapter (rev 01)
        Subsystem: Unknown device 9292:0202
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at d000 [size=256]
        Region 1: Memory at e7000000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at <unassigned> [disabled] [size=32K]

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: HP       Model: psc 2175         Rev: 1.00
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 04 Lun: 00
  Vendor: GENERIC  Model: CRD-BP4          Rev: 4.28
  Type:   CD-ROM                           ANSI SCSI revision: 02

Software Environment: Slackware 10.1, KDE 3.4.0, K3B v0.12.5 and v0.12.10

Modules Loaded         iptable_filter ip_tables md5 ipv6 usblp rtc i2c_nforce2
w83627hf hwmon_vid i2c_isa i2c_core snd_intel8x0 snd_ac97_codec snd_ac97_bus
snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd ide_scsi loop
forcedeth 8139too

Gnu C                  3.3.4

Problem Description:

Launching the K3B burning application from X11 causes a kernel crash if the Initio
SCSI driver is loaded. The crash happens almost immediately upon launching
K3B. If the Initio driver is unloaded prior to launching K3B, the kernel does
not crash and K3B just detects the IDE DVD burner.

Syslogd does not dump anything relevant in the logs, even if told to sync the
logfiles. There is no trace of an Oops message.

The Magic SysRq key combo does not respond once the crash happens, and the only
way to reboot is to press the reset button.

Tested with kernels (Initio driver selected as module):
  2.6.12.1           Does not crash.
  2.6.13.2           Does not crash.
  2.6.13.5           Does not crash.
  2.6.14             Crash.
  2.6.14.{1,2,4}     Crash.
  2.6.15-rc5-mm3     Crash.
  2.6.15-rc6         Crash.
  2.6.15-rc6-git3    Crash.

Building the kernel with the Initio driver built-in does not change the outcome.
Kernels 2.6.14.4 and 2.6.15-rc6 were tested with the driver built-in and still
crashed.

diff -up linux-2.6.13.5/drivers/scsi/initio.c linux-2.6.14/drivers/scsi/initio.c
and
diff -up linux-2.6.13.5/drivers/scsi/initio.h linux-2.6.14/drivers/scsi/initio.h
return nothing, so the Initio driver source is identical in both kernels.

The error is therefore possibly in the SCSI mid-layer and/or upper-layer drivers.

K3B manages to output only:
[srdjant@tigerclaw ~]$ k3b: (K3bPluginManager) lib libk3bffmpegdecoder not found
k3b: (K3bPluginManager) lib libk3bflacdecoder not found
k3b: (K3bPluginManager) lib libk3bmpcdecoder not found

before crashing. When running K3B on non-crashing kernels, the next message
printed, after the above, is:

k3b: (K3bExternalBinManager) Cdrecord 2.1 features: gracetime, overburn,
       cdtext, clone, tao, cuefile, xamix, plain-atapi, hacked-atapi

Running cdrecord -scanbus does not crash the kernel.
I have successfully burned a CDR with cdrecord on the affected kernels in X11.

Tested with the latest K3B version (0.12.10); it still crashes.

Steps to reproduce:
  Make sure initio driver is loaded, run K3B.
Comment 1 Srdjan Todorovic 2005-12-26 11:25:01 UTC
Ran sysctl -w dev.scsi.logging_level=0xffffffff  on linux 2.6.15-rc6
with SCSI logging facility selected.

That produced a load of scsi debugging text. A lot of it seems to be
periodic command: Test Unit Ready: 00 00 00 00 00 00 with other commands like
command: Read TOC/PMA/ATIP intermixed.

The last 20 or so lines are as follows:

Dec 26 18:42:45 tigerclaw kernel: Trying ioctl with scsi command 30
Dec 26 18:42:45 tigerclaw kernel: scsi_add_timer: scmd: c15cb060, time: 2500,
(c02:0:4:0: done 0xc15cb060 SUCCESS        2 sr 0:0:4:0: 
Dec 26 18:42:45 tigerclaw kernel:         command: Read TOC/PMA/ATIP: 43 00 00
00 00 00 00 00 0c 40
Dec 26 18:42:45 tigerclaw kernel: : Current: sense key: Not Ready
Dec 26 18:42:45 tigerclaw kernel:     Additional sense: Medium not present -
tray closed
Dec 26 18:42:45 tigerclaw kernel: scsi host busy 1 failed 0
Dec 26 18:42:45 tigerclaw kernel: sr 0:0:4:0: Notifying upper driver of
completion (result 8000002)
Dec 26 18:42:45 tigerclaw kernel: 0 sectors total, 0 bytes done.
Dec 26 18:42:45 tigerclaw kernel: use_sg is 1
Dec 26 18:42:45 tigerclaw kernel: scsi_block_when_processing_errors: rtn: 1
Dec 26 18:42:45 tigerclaw kernel: scsi_add_timer: scmd: c15cb060, time: 7500,
(c025ef34)
Dec 26 18:42:45 tigerclaw kernel: sr 0:0:4:0: send 0xc15cb060                 
sr 0:0:4:0: 
Dec 26 18:42:45 tigerclaw kernel:         command: Read TOC/PMA/ATIP: 43 00 00
00 00 00 00 00 0c 00
Dec 26 18:42:45 tigerclaw kernel: buffer = 0xc1541e40, bufflen = 12, done =
0xc02617e0, queuecommand 0xc02683f2
Dec 26 18:42:45 tigerclaw kernel: leaving scsi_dispatch_cmnd()
Dec 26 18:42:45 tigerclaw kernel: scsi_delete_timer: scmd: c15cb060, rtn: 1
Dec 26 18:42:45 tigerclaw kernel: sr 0:0:4:0: done 0xc15cb060 SUCCESS        2
sr 0:0:4:0: 
Dec 26 18:42:45 tigerclaw kernel:         command: Read TOC/PMA/ATIP: 43 00 00
00 00 00 00 00 0c 00
Dec 26 18:42:45 tigerclaw kernel: : Current: sense key: Not Ready
Dec 26 18:42:45 tigerclaw kernel:     Additional sense: Medium not present -
tray closed
Dec 26 18:42:45 tigerclaw kernel: scsi host busy 1 failed 0
Dec 26 18:42:45 tigerclaw kernel: sr 0:0:4:0: Notifying upper driver of
completion (result 8000002)
Dec 26 18:44:38 tigerclaw kernel: klogd 1.4.1, log source = /proc/kmsg started.
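
As a reading aid for the last log lines, here is a small sketch that splits the result word 8000002 into its bytes and pulls the allocation length out of the final Read TOC/PMA/ATIP command. It assumes the conventional Linux 2.6-era layout of the result word (driver, host, message and status byte, from high to low) and the standard 10-byte READ TOC CDB layout. The function names are made up for illustration and are not kernel APIs.

```python
# Illustrative decoder for two values seen in the log above.
# Assumption: result word = driver byte | host byte | message byte | status byte.

def decode_result(result: int) -> dict:
    """Split a 32-bit SCSI result word into its four bytes."""
    return {
        "driver_byte": (result >> 24) & 0xFF,
        "host_byte": (result >> 16) & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "status_byte": result & 0xFF,
    }


def decode_read_toc(cdb: bytes) -> dict:
    """Extract a few fields from a 10-byte READ TOC/PMA/ATIP CDB (opcode 0x43)."""
    if len(cdb) != 10 or cdb[0] != 0x43:
        raise ValueError("expected a 10-byte CDB starting with opcode 0x43")
    return {
        "opcode": cdb[0],
        # Bytes 7 and 8 hold the big-endian allocation length.
        "allocation_length": (cdb[7] << 8) | cdb[8],
        "control": cdb[9],
    }


if __name__ == "__main__":
    print(decode_result(0x8000002))
    print(decode_read_toc(bytes.fromhex("43 00 00 00 00 00 00 00 0c 00")))
```

Under those assumptions, the status byte 0x02 is CHECK CONDITION, which agrees with the "Not Ready" sense data, and the allocation length of 12 matches bufflen = 12 in the log.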
Comment 2 Srdjan Todorovic 2006-01-13 15:07:00 UTC
Created attachment 7018
dmesg output under 2.6.15-git9
Comment 3 Srdjan Todorovic 2006-01-13 15:09:21 UTC
Newly tested:

  2.6.15
  2.6.15-git9
  2.6.15-mm3

All three kernels crash as detailed above.
Comment 4 Andrew Morton 2006-01-20 00:12:38 UTC
bugme-daemon@bugzilla.kernel.org wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=5776
> 
>             Summary: initio: Kernel crash while launching K3B app. with SCSI
>                      driver
>      Kernel Version: 2.6.14

The initio driver keeled over during some sort of scanning/querying operation.

Comment 5 Lars W. 2006-02-15 17:19:06 UTC
Hi,

Isn't this report a duplicate of http://bugzilla.kernel.org/show_bug.cgi?id=5659?

Cheers
Lars
Comment 6 Srdjan Todorovic 2006-10-04 08:56:29 UTC
Created attachment 9152
Serial console messages under 2.6.17

I finally managed to get my hands on a null modem cable and set up a serial
console.

Set nmi_watchdog=1 on the command line. SysRq was responsive after setting
this option.

Kernel is vanilla 2.6.17, with many many debug options turned on.
Initio driver set as module.
The lockup is triggered by running k3b (if the SCSI device was added to the
application's list) or by calling k3bsetup to add the SCSI device to the
application's list of writers.

See attachment for log.

Will test latest kernel soon.
Comment 7 Srdjan Todorovic 2006-10-05 11:48:38 UTC
Created attachment 9163
Serial console message under 2.6.18-git21

Log output for 2.6.18-git21
Comment 8 Srdjan Todorovic 2006-11-30 09:31:14 UTC
I ran git-bisect on Linus' tree to find the commit that caused the hangs.

Here's the commit:

---
186d330e682210100c671355580a8592e4a21692 is first bad commit   
commit 186d330e682210100c671355580a8592e4a21692   
Author: Timothy Thelin <Timothy.Thelin@wdc.com>   
Date:   Tue Sep 13 19:56:28 2005 -0700   
   
    [SCSI] scsi: sd, sr, st, and scsi_lib all fail to copy cmd_len to
new cmd   
   
    This fixes an issue in scsi command initialization from a request   
    where sd, sr, st, and scsi_lib all fail to copy the request's   
    cmd_len to the scsi command's cmd_len field.   
   
    Signed-off-by: Timothy Thelin <timothy.thelin@wdc.com>   
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
---
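
For readers unfamiliar with bisection, the search that produced the result above can be sketched roughly as follows. This is only a model of the idea, not git's implementation: it assumes a linear list of commits where the first is known good, the last is known bad, and every commit after the first bad one is also bad. The commit labels and the is_bad callback are hypothetical stand-ins for "build this kernel, run k3b, see if it hangs".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that badness
    persists once introduced.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history c0..c9 in which the hang first appears at c6.
    history = [f"c{i}" for i in range(10)]
    print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

Each round halves the remaining range, so the number of test boots grows only with the logarithm of the number of commits between the good and bad kernels.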

Running k3b-1.0pre2 and newer versions does not produce a hang.
Running k3b-0.12.17 causes hangs.

I'm guessing that the newer k3b version properly passes the command length
parameters to the SCSI system.
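
To illustrate what a wrong command length would look like (this is only a guess at the failure mode, not a confirmed diagnosis), the sketch below compares a caller-supplied cmd_len against the length implied by the opcode's group code, i.e. the top three bits of the first CDB byte. The table covers only the fixed-length groups of the SCSI standard; the helper names are invented for this example and do not correspond to kernel functions.

```python
from typing import Optional

# CDB length implied by the opcode's group code (top three bits).
# Groups 3, 6 and 7 are reserved or vendor-specific, so no length is
# assumed for them here.
GROUP_CDB_LEN = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def implied_cdb_len(opcode: int) -> Optional[int]:
    """Return the CDB length implied by the opcode, or None if unknown."""
    return GROUP_CDB_LEN.get((opcode >> 5) & 0x07)


def length_mismatch(cdb: bytes, cmd_len: int) -> bool:
    """True when the supplied cmd_len disagrees with the opcode's group."""
    implied = implied_cdb_len(cdb[0])
    return implied is not None and implied != cmd_len


if __name__ == "__main__":
    read_toc = bytes.fromhex("43 00 00 00 00 00 00 00 0c 00")
    print(implied_cdb_len(read_toc[0]))   # length implied by opcode 0x43
    print(length_mismatch(read_toc, 10))  # consistent length
    print(length_mismatch(read_toc, 12))  # hypothetical bad length
```

If the older k3b did pass a length that disagreed with the opcode, a kernel that trusts the request's cmd_len would hand that length to the low-level driver, whereas one that derives it from the opcode would not.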
Comment 9 Natalie Protasevich 2007-07-22 17:37:21 UTC
Has this problem been resolved?
According to #8, the bug can be closed now, correct?
Thanks.
Comment 10 Srdjan Todorovic 2007-07-29 13:24:24 UTC
Just tested with 2.6.22.1 and k3b-1.0.3.
No crash or lockup. The bug appears to have been fixed for some time now.
Closing bug report.