Bug 9590

Summary: CMD646 & cdrom: drive appears confused & drive not ready for command
Product: IO/Storage Reporter: Rafael J. Wysocki (rjwysocki)
Component: IDE Assignee: Bartlomiej Zolnierkiewicz (bzolnier)
Status: CLOSED CODE_FIX    
Severity: normal CC: bunk, mroos
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24-rc5-gda8cadb3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9243    

Description Rafael J. Wysocki 2007-12-17 13:44:38 UTC
Subject         : CMD646 & cdrom: drive appears confused & drive not ready for command
Submitter       : Meelis Roos <mroos@linux.ee>
References      : http://lkml.org/lkml/2007/12/16/99
Handled-By      : Andrew Morton <akpm@linux-foundation.org>
Comment 1 Bartlomiej Zolnierkiewicz 2007-12-17 14:00:39 UTC
Rafael, please turn off the "Regression" flag because there is no confirmation yet that this is a regression.

[ I tried to do it myself but... "You tried to change the Regression field from
  1, but only the assignee or reporter of the bug, or a sufficiently empowered
  user may change that field." - kernel bugzilla is FPOS holds...  sigh... ]

PS1 The problem seems to be caused by hald stupidity... :(

PS2 I'm in the process of rewriting ide-cd right now (57 patches and counting, not yet posted) and I see a possible fix for this particular problem - attached, may not apply/work etc. We _really_ need to put ide-cd into a debuggable & maintainable state before investing more time into handling bug reports (but it is of course worth keeping them logged)...
Comment 2 Bartlomiej Zolnierkiewicz 2007-12-17 14:05:20 UTC
Update: Sorry Meelis, I was too optimistic - the possible fix already depends on 3 patches from the ide-cd re-write alone (not counting patches from IDE tree)...

Please be patient, I hope to finish this patch series soon (== before Christmas).

Cheers.
Comment 3 Bartlomiej Zolnierkiewicz 2007-12-17 14:06:46 UTC
Hmmm....

http://bugzilla.kernel.org/show_bug.cgi?id=9590

Anyway - back to patches...
Comment 4 Bartlomiej Zolnierkiewicz 2007-12-17 14:07:34 UTC
Should have been:

http://bugzilla.kernel.org/show_bug.cgi?id=8613
Comment 5 Meelis Roos 2007-12-20 02:15:25 UTC
Tried 2.6.23, so far it behaves fine, so it really seems to be a regression.

How can it be hal stupidity - hal just polls the device, why would that make the drive confused? Yes, hal is dumb to poll it every 2 sec but this should not break it.
Comment 6 Bartlomiej Zolnierkiewicz 2007-12-20 13:35:31 UTC
> Tried 2.6.23, so far it behaves fine, so it really seems to be a regression.

Thanks for the update.  I went through the post-2.6.23 ide-cd and cmd646 commits but unfortunately there are no likely "guilty" candidates.  Would it be possible to narrow the problem down?  (For starters it would be useful to know if 2.6.24-rc1 is OK; if it is not, you would need to proceed with git-bisect.)

> How can it be hal stupidity - hal just polls the device, why would that
> make the drive confused? Yes, hal is dumb to poll it every 2 sec but this
> should not break it.

Yeah, that is what I was referring to.  You are of course right that it shouldn't confuse the drive and that it is a bug that needs fixing.
Comment 7 Meelis Roos 2007-12-29 08:38:21 UTC
I booted up the latest 2.6.24-rc6+git kernel (with the CMD646 updates 
applied) and the problem has not reappeared in 2 days 53 min uptime. 
Will see again on January 2 but so far it seems it has been fixed.
Comment 8 Meelis Roos 2008-01-02 01:44:53 UTC
Yep, no errors in 5 days uptime.
Comment 9 Bartlomiej Zolnierkiewicz 2008-01-02 14:21:51 UTC
I think that it is thanks to a recent cmd646 regression bugfix:

Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date:   Mon Dec 24 15:23:44 2007 +0100

    cmd64x: fix hwif->chipset setup

    commit 528a572daea90aa41db92683e5a8756acef514c4 ("ide: add ->chipset field
    to ide_pci_device_t") broke hwif->chipset setup (it is now set to ide_cmd646
    for CMD648 instead of CMD646).  It seems that the breakage happened while
    I was moving patches around (cmd64x_chipsets[] entries for CMD646 and CMD648
    are identical except for 'name' field).  Fix it and bump driver version.

    Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
    Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

[ the bug was found by accident while I was fixing some other stuff ]

Thanks for the confirmation and sorry for breaking it in the first place.
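
For readers following along, the kind of mistake described in the commit message above can be illustrated with a minimal sketch in plain C. This is not the kernel code: the names chip_info, chipset_type, broken_chipsets, fixed_chipsets, NCHIPS and lookup_chipset are hypothetical stand-ins, and the tables only model the two entries mentioned in the commit message. The sketch assumes that an omitted field is simply zero-initialized, which is how C designated initializers behave.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's types; not the real kernel definitions. */
enum chipset_type { CHIPSET_UNKNOWN = 0, CHIPSET_CMD646 };

struct chip_info {
    const char *name;
    enum chipset_type chipset; /* left as CHIPSET_UNKNOWN (0) when omitted */
};

#define NCHIPS 2

/* Broken variant: the entries differ only by name, so while moving code
 * around the .chipset line ended up on the CMD648 entry. */
static const struct chip_info broken_chipsets[NCHIPS] = {
    { .name = "CMD646" },
    { .name = "CMD648", .chipset = CHIPSET_CMD646 },
};

/* Fixed variant: .chipset is back on the CMD646 entry. */
static const struct chip_info fixed_chipsets[NCHIPS] = {
    { .name = "CMD646", .chipset = CHIPSET_CMD646 },
    { .name = "CMD648" },
};

/* Return the chipset value recorded for the named chip, or
 * CHIPSET_UNKNOWN if the name is not in the table. */
static enum chipset_type lookup_chipset(const struct chip_info *table,
                                        size_t n, const char *name)
{
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].chipset;
    }
    return CHIPSET_UNKNOWN;
}
```

With the broken table, looking up "CMD646" yields CHIPSET_UNKNOWN while "CMD648" yields CHIPSET_CMD646; with the fixed table the two results are swapped back. Because the entries are otherwise identical, nothing in the table itself flags the misplaced line, which is consistent with the bug being found only by accident.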