Most recent kernel where this bug did *NOT* occur: 2.6.19-rc6
Distribution: Fedora Core Devel
Hardware Environment: CK804 mainboard with DVD + 2 PATA on CK804 and core system
on secondary sata chipset (sil)
Lots of traces, dvd drive dead
Steps to reproduce:
Created attachment 9535 [details]
Successful 2.6.19-rc5-git6 dmesg
Created attachment 9536 [details]
Problem 2.6.19-rc5-mm2 dmesg
Created attachment 9537 [details]
Created attachment 9538 [details]
looks like sata_nv shat itself quite seriously here.
That would be a problem in the sata_nv ADMA patch. I'll look into this, it seems
like a simple bug. This code path probably isn't used other than in the case of
an ATAPI device connected to the SATA controller which you may well have been
the first person to have tested this patch with.
In the meantime you can try disabling ADMA by passing the parameter adma=0 to
the sata_nv module on loading, if you can, or change this line:
static int adma_enabled = 1;
static int adma_enabled = 0;
> In the meantime you can try disabling ADMA by passing the parameter adma=0 to
> the sata_nv module on loading, if you can,
is that sata_nv.adma=0 on the kernel command line or something else?
Created attachment 9540 [details]
Patch to fix silly mistake in nv_adma_bmdma functions
That would be if your sata_nv.c was not built modular, I believe.
But first, can you test this patch? It should at least get rid of those stack
dumps, hopefully the drive will actually work as well :-) I don't have any
means of testing it myself (not even for non-ATAPI at the moment,
With attachment #9540 [details] the traces are gone but the dvd drive is still dead
Can you post the full dmesg again with the patch?
Created attachment 9547 [details]
dmesg with attachment #9540 [details]
sorry about forgetting this
Created attachment 9552 [details]
Another patch to fix BMDMA setup
OK, I think I may see the error of my ways now. These controllers allow using
all of the registers in MMIO mode except for the legacy BMDMA registers (or at
least I know of no way to use them in MMIO mode). I was just overriding the
MMIO flag around the call to the bmdma functions, but those seem to call other
functions which may depend on the MMIO flag as well and access other registers
which should in fact be used as MMIO. I've added the appropriate code in now to
do the bmdma operations directly without messing with the MMIO flag.
This is against plain 2.6.19-rc5-mm2 again (so you'll have to revert my last
patch and put this on). It's once again compile tested only. Let me know how it
works. If it doesn't work, post the dmesg again.
By the way I noticed you have some drives attached to the Silicon Image
controller on your board. Those will perform faster on the NVIDIA controller
since that now supports NCQ and the Silicon Image controller does not.
when I built the box sata_sil was stable and sata_nv not
Once this bit is fixed I'll probably move one disk to sata_nv, it's better for
redundancy anyway (the two sata_sil drives are mirrored)
rebuilding with your patch now
Created attachment 9553 [details]
Another patch to fix BMDMA setup - with notifier clear fix
I just noticed another potential issue in the driver. It might not affect you
but it could hit if there was an ATAPI device and a non-ATAPI device on two
ports on the same controller. Yet another patch version with that fix..
Well with the patch in attachment #9553 [details], the boot got real slow, but I didn't
get the dvd drive back :(
Created attachment 9554 [details]
9547: dmesg with attachment #9553 [details]
BTW in case it wasn't clear : the dvd drive on sata_nv is a sata drive (at least
I think it might be a bit closer to functioning now, actually, but something
funky is obviously still going on. I'll try and look into it some more tomorrow.
Created attachment 9563 [details]
Patch with more debugging
Here's another patch for you to try. I doubt it'll fix the problem but it
should spew out some more debugging information in order to help track it down.
I'm suspecting that we need to disable ADMA mode entirely on the port with an
ATAPI device in order for the normal BMDMA engine to work properly..
Created attachment 9565 [details]
dmesg with attachment #9563 [details]
OK, from the output it appears that the normal BMDMA engine is essentially
stalling whenever we try and use it. My guess is that this is because we still
have ADMA mode enabled on the port (the
NV_MCP_SATA_CFG_20_PORT1_EN/NV_MCP_SATA_CFG_20_PORT1_PWB_EN flags in the
configuration register.) I'll try and get something together which disables
those bits when we are in port-register mode and see if that helps, however I'll
likely need to test that at home to make sure it doesn't somehow screw up ADMA
mode as well. (This is where actually having specs from NVIDIA on the controller
would be useful..)
Aside from that, I noticed another issue: The way it's intended to work with DMA
on ATAPI on this controller is that when an ATAPI device is configured we change
the DMA parameters to be the ones for standard PCI IDE BMDMA in the slave_config
function. However, it seems like from your original stack dumps that the libata
code is issuing DMA PACKET commands before the slave_config function has been
called. In this case the DMA parameters will still be set to the values for ADMA
mode and things may go haywire. I think I need to override the check_atapi_dma
function and disallow ATAPI DMA until the slave_config function has been called
to set up the parameters properly. (Jeff/Tejun, if you're listening, does that
Still in 2.6.19-rc6-mm1
2.6.19-rc6-mm1 dmesg in attachment #9623 [details]
Created attachment 9625 [details]
Patch for ATAPI support in ADMA mode
OK, here's another patch. I haven't gotten an SATA-to-PATA adapter to try ATAPI
mode myself, but I've kind of simulated it by testing a hacked version which
uses "ATAPI mode" on everything, and that seemed to work OK. Can you test this
out and see if it works? This is against 2.6.19-rc6-mm1.
This version turns off the PCI configuration space bits which control whether
ADMA is enabled, so we disable it entirely when an ATAPI device is connected.
Otherwise it seems like the legacy DMA engine doesn't work properly.
Created attachment 9627 [details]
dmesg with attachment #9625 [details]
This patch works at last
Excellent. I just submitted the patch to Andrew and Jeff.
Marking as fix available.
Thank you for looking at this!