The comment of this function: " LOCKING: caller".
But in some call chain lock is not acquired:
ata_dev_read_id <- ata_qc_issue_prot <- ata_bus_probe <- ata_dev_revalidate <-
ata_eh_revalidate_and_attach <- ata_device_add
A lock is not acquired.
But in other call chains, there is lock acquired.
Hmmm... Am I missing something?
How does ata_dev_revalidate() end up calling ata_bus_probe()? The call path is
ata_dev_revalidate() -> ata_dev_read_id() -> ata_exec_internal[_sg](), this one
acquires lock -> .qc_issue().
Thanks for your prompt reply! The comments before "ata_dev_select" says:
The call chain:
ata_dev_revalidate() -> ata_dev_read_id() -> ata_dev_select(), and then it calls
ata_exec_internal[_sg], so shall I assume the lock is acquired AFTER calling
ata_dev_select(), which violates the comment?
Sorry in previous I messed up earlier version of kernel (2.6.6)
No, what code are you looking at? ata_dev_read_id() calls ata_exec_internal()
which calls ata_exec_internal_sg() which grabs the host lock and then calls
->qc_issue which in turn calls ata_dev_select(). Nothing wrong there. I'm
closing the bug as INVALID.
I think I am looking at version 126.96.36.199, and following is the link of it on
If I am right, I find on line 1462 ata_dev_read_id calls ata_dev_select(), and
on line 1483, it further calls ata_exec_internal(). So when calling
ata_dev_select() seems to me no lock is acquired.
Am I making mistake?
No you aren't. I was mistaken. My apologies. Anyways, yeah, there is a race
there. The direct call to ata_dev_select() in ata_dev_read_id() is there for
mostly historical reason as ->qc_issue() is responsible for that anyway. It
also can be dangerous on a PATA channel. Thanks for reporting this. Not sure
whether that should be wrapped in host lock or just be killed. I'll post a
message to linux-ide mailing list.
Thanks for your all your prompt replies!
I am also suspecting the following functions may be potential race case:
From the comment we see they all require the caller to hold the lock, but in the
ata_bus_reset -> ata_pio_devchk -> ata_mmio_devchk|ata_pio_devchk, seems there
is no lock.
Am I making mistakes?
I might be too prompt today (writing faster than thinking). Anyways, yeah, it
seems like a class of issues. We should be holding host lock whether we're
calling from EH or not. Entering EH does take care of exclusion in the port but
nothing is guaranteed host-wide and register accesses should be guaranteed
In practice, as EH actions don't happen frequently && modern controllers
implement each port very independently, this doesn't cause real problem, but
there still is a slim chance that EH operation going on one channel interferes
operation going on another channel.
For the record, I don't think fixing this problem is of very high priority as it
has been working well until now. I'll post a patch for this in a few days.
Thanks a lot.
Status ? Has this been fixed upstream ?
Hello ping ??
The problem is still there. It definitely needs fixing but priority is still rather low.
Pong. Planning on working on it when moving EH to block layer before the end of this year, hopefully. :-)
Its now march so ping again
Ping again - Tejun did you look at this yet - its almost a year now ?
This being so tedious, I was trying to think of some clever way to do it without changing every exception processing functions. Ehhh... I think I should just bite and do it. Will work on it. Thanks for reminding this.
FYI: Tejun posted a patch and Jeff applied it: