Bug 13757 - Lockdep complains about possible irq lock inversion dependency
Summary: Lockdep complains about possible irq lock inversion dependency
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Infiniband/RDMA
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_infiniband-rdma
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-10 16:34 UTC by Bart Van Assche
Modified: 2012-06-12 10:32 UTC

See Also:
Kernel Version: 2.6.30.1
Tree: Mainline
Regression: No


Attachments
Kernel config (29.08 KB, text/plain)
2009-07-10 16:34 UTC, Bart Van Assche
Extract from /var/log/messages (116.87 KB, text/plain)
2009-07-10 16:34 UTC, Bart Van Assche
Patch for triggering this issue sooner (provided by Roland) (1.99 KB, patch)
2009-07-10 19:23 UTC, Bart Van Assche
Lockdep report after having applied the provided patch. (35.89 KB, text/plain)
2009-07-10 19:26 UTC, Bart Van Assche
Lockdep report after having applied the provided patch. (35.70 KB, text/plain)
2009-07-10 19:28 UTC, Bart Van Assche
Proposed short-term solution (provided by Roland) (1.38 KB, patch)
2009-07-11 09:08 UTC, Bart Van Assche
Lockdep locking inversion report for 2.6.30.3 kernel with workaround patch applied (59.06 KB, text/plain)
2009-07-30 09:42 UTC, Bart Van Assche
Lockdep complaint about a hardirq unsafe lock order. (94.28 KB, text/plain)
2009-07-30 11:07 UTC, Bart Van Assche
Proposed fix (provided by Roland). (2.28 KB, patch)
2009-08-06 09:43 UTC, Bart Van Assche
Locking inversion report for 2.6.30.4 + patch in attachment 22624. (70.68 KB, text/plain)
2009-08-06 09:54 UTC, Bart Van Assche
Locking inversion report for 2.6.30.4 + patches in attachments 22303 and 22624. (69.16 KB, text/plain)
2009-08-07 09:40 UTC, Bart Van Assche
Fix for a (hard to trigger) locking cycle detected by lockdep. (5.93 KB, patch)
2009-08-15 06:26 UTC, Bart Van Assche
(Deleted) (21.69 KB, text/plain)
2009-09-13 07:43 UTC, Bart Van Assche

Description Bart Van Assche 2009-07-10 16:34:00 UTC
Created attachment 22299
Kernel config

Kernel: 2.6.30.1 with SCST zero-copy transfer completion notification and scsi_execute_fifo patches applied. These two patches do not modify any InfiniBand code.

Setup:
- Two servers connected back-to-back via InfiniBand.
- OpenSM is running on one of the two servers.

After having shut down one of the two servers, lockdep complained about possible irq lock inversion.
Comment 1 Bart Van Assche 2009-07-10 16:34:36 UTC
Created attachment 22300
Extract from /var/log/messages
Comment 2 Bart Van Assche 2009-07-10 16:45:47 UTC
Update: just booting both systems and leaving them running for about twenty minutes is sufficient to reproduce this phenomenon.
Comment 3 Bart Van Assche 2009-07-10 16:47:02 UTC
Another update: I have not yet seen this report with 2.6.29.4 on the same setup.
Comment 4 Bart Van Assche 2009-07-10 19:23:58 UTC
Created attachment 22303
Patch for triggering this issue sooner (provided by Roland)
Comment 5 Bart Van Assche 2009-07-10 19:26:56 UTC
Created attachment 22304
Lockdep report after having applied the provided patch.
Comment 6 Bart Van Assche 2009-07-10 19:28:12 UTC
Created attachment 22305
Lockdep report after having applied the provided patch.
Comment 7 Bart Van Assche 2009-07-11 09:08:42 UTC
Created attachment 22308
Proposed short-term solution (provided by Roland)
Comment 8 Bart Van Assche 2009-07-30 07:27:10 UTC
See also the discussion at http://lists.openfabrics.org/pipermail/general/2009-July/060644.html.
Comment 9 Bart Van Assche 2009-07-30 09:42:32 UTC
Created attachment 22534
Lockdep locking inversion report for 2.6.30.3 kernel with workaround patch applied

Yesterday I found out that the proposed workaround unfortunately does not solve all locking inversion issues. The attached locking inversion report was obtained while testing module removal for ib_srpt.
Comment 10 Bart Van Assche 2009-07-30 10:36:33 UTC
(In reply to comment #9)
> Created an attachment (id=22534)
> Lockdep locking inversion report for 2.6.30.3 kernel with workaround patch
> applied
> 
> Yesterday I found out that the proposed workaround doesn't solve all locking
> inversion issues unfortunately. The attached locking inversion report was
> obtained while testing module removal for ib_srpt.

Update: the locking inversion report referred to above was obtained with a kernel to which only the second patch (workaround.patch) had been applied, not the first (ib-lockdep-trigger.patch). I will retest this issue with a kernel to which both patches have been applied.
Comment 11 Bart Van Assche 2009-07-30 11:07:21 UTC
Created attachment 22535
Lockdep complaint about a hardirq unsafe lock order.

This report was generated on a system equipped with an IB HCA and connected back-to-back to another system equipped with an IB HCA. It appeared about four seconds after OpenSM generated the "SUBNET UP" event.
Comment 12 Bart Van Assche 2009-08-06 09:43:52 UTC
Created attachment 22624
Proposed fix (provided by Roland).
Comment 13 Bart Van Assche 2009-08-06 09:54:41 UTC
Created attachment 22625
Locking inversion report for 2.6.30.4 + patch in attachment 22624.

Unfortunately, the newly proposed patch does not seem to fix all locking inversion issues. The attached locking inversion report was triggered by repeatedly running "/etc/init.d/openibd restart" on the system connected back-to-back to the system on which the lockdep report was generated.
Comment 14 Bart Van Assche 2009-08-07 09:40:50 UTC
Created attachment 22631
Locking inversion report for 2.6.30.4 + patches in attachments 22303 and 22624.

As requested, I ran a new test with both patches (attachments 22303 and 22624) applied.
Comment 15 Bart Van Assche 2009-08-15 06:26:41 UTC
Created attachment 22721
Fix for a (hard to trigger) locking cycle detected by lockdep.
Comment 16 Bart Van Assche 2009-08-16 15:48:46 UTC
The issue no longer occurs on a 2.6.30.4 kernel with the three attached patches applied.
Comment 17 Bart Van Assche 2009-09-13 07:43:08 UTC
Created attachment 23083
(Deleted)

Apparently there are still locking inversion complaints with the latest infiniband.git/for-next tree. This report was generated during shutdown.
Comment 18 Bart Van Assche 2009-09-13 07:45:06 UTC
Comment on attachment 23083
(Deleted)

(Deleted)
Comment 19 Bart Van Assche 2009-09-13 07:46:54 UTC
(In reply to comment #17)
> Created an attachment (id=23083)
> Locking inversion report for infiniband.git/for-next of 2009-09-05 16:38:12
> (2.6.31-rc9)
> 
> Apparently there are still locking inversion complaints with the latest
> infiniband.git/for-next tree. This report was generated during shutdown.

Please ignore the above -- I have not observed any lockdep complaints with recent infiniband.git/for-next trees. The above lockdep complaint was generated by a 2.6.31 kernel.
