Bug 2829 - posix_locks_deadlock() loops infinitely
Summary: posix_locks_deadlock() loops infinitely
Status: CLOSED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: VFS (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: fs_vfs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-06-03 22:45 UTC by i-sakamoto
Modified: 2008-03-03 20:37 UTC (History)
3 users (show)

See Also:
Kernel Version: 2.6.6
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
fix infinite loop in posix_locks_deadlock (1.95 KB, patch)
2007-10-30 09:03 UTC, bfields
Details | Diff

Description i-sakamoto 2004-06-03 22:45:53 UTC
Distribution: debian
Hardware Environment:
Software Environment:
Problem Description: Using POSIX threads makes posix_locks_deadlock() loop
infinitely.

Steps to reproduce:

There are 3 processes; "A", "B" and "C".
Each process opens "file1", "file2" and "file3".

"A" creates 3 POSIX threads; "A1", "A2" and "A3". (These threads' PIDs are the
same.)
"B" creates 2 POSIX threads; "B1" and "B2". (These threads' PIDs are the same.)
"C" creates 1 POSIX thread; "C1".


1. "A1" locks "file1" exclusively.
2. "B1" locks "file2" exclusively.
3. "C1" locks "file3" exclusively.
4. "A2" trys to lock "file2" and sleeps (blocked by "B1").
5. "A3" trys to lock "file3" and sleeps (blocked by "C1").
6. "B2" trys to lock "file1".
   Here, posix_locks_deadlock() must detect deadlock, but
   it does not. Then "B2" sleeps (blocked "A1").

  "A"                                    "B"
   |                                      |
   +--A1(LOCK)---->(file1)<-B2(*deadlock)-+
   |                                      |
   |                                      |
   +--A2(BLOCK)--->(file2)<---B1(LOCK)----+
   |
   |
   +--A3(BLOCK)--->(file3)<---C1(LOCK)----"C"

    block_list -> A3(BLOCK) -> A2(BLOCK)


7. "C1" unlocks "file3". ("A3" wakes up and locks "file3".)
8. "A3" unlocks "file3".
9. Another process locks "file1".
   This makes posix_locks_deadlock() loop infinitely.
Comment 1 Diego Calleja 2006-08-04 05:15:29 UTC
Can you tell is this problem is still present in recent kernels?
Comment 2 i-sakamoto 2006-08-07 03:48:52 UTC
Recent kernel works fine. Thank you.
Comment 3 i-sakamoto 2006-08-07 04:46:16 UTC
Sorry, my previous report was not correct.
It seems that this problem is still present in debien's 2.6.8-3-686 kernel.
I do not test newer vanilla kernel such as 2.6.17.
Comment 4 Diego Calleja 2006-08-07 10:02:24 UTC
It'd be interesting to get results from 2.6.17, as 2.6.8 was released on August 2004
Comment 5 i-sakamoto 2006-08-07 23:08:28 UTC
I tested 2.6.17 on uni processer PC.
This kernel also has this problem.
Comment 6 Diego Calleja 2006-08-08 17:18:53 UTC
:/

CC'ing Ingo Molnar
Comment 7 Tanmay 2006-08-17 00:41:13 UTC
We simulated the above given situation using lockf and fcntl functions to aquire
locks on files by threads on 2.6.17.6 kernel on i386 arch. We found that lockf
generates error EDEADLK which tells that :  The command was T_LOCK and this lock
 operation  would  cause  a deadlock. 
Hence the deadlock situation is already handled in this case.
Also while debugging we found that the 'posix_locks_deadlock()' function is not
looping infinitely.

Regards,
Tanmay and Pranav (Calsoftinc, India)
Comment 8 Natalie Protasevich 2007-09-09 00:53:54 UTC
i-sakamoto, are you still having locking problem with recent kernels? #7 from Tinmay suggested that the problem was supposedly fixed.
Thanks.
Comment 9 bfields 2007-10-30 09:03:53 UTC
Created attachment 13337 [details]
fix infinite loop in posix_locks_deadlock

The attached patch should address the infinite loop which can be hit in the case described above; submitted for 2.6.24 and 2.6.23.x.  The problem is that (as noted in the comment above posix_locks_deadlock()), the loop there only terminates if the existing lock dependency graph only terminates if the existing lock dependency graph is acyclic.  But the posix_locks_deadlock() function is only guaranteed to detect (and prevent) such cycles if each node of that graph has outdegree 1.  In the example above, the node corresponding to lock owner A has outdegree 2 (because it's blocking on two locks at once--something not possible in the presence only of traditional posix processes)--so posix_locks_deadlock() misses the cycle that's created at step 6, which allows it to go into an infinite loop when it's called again at step 9.
Comment 10 Natalie Protasevich 2008-03-03 20:37:38 UTC
It looks like the code was commited (and later reworked further).
Closing the bug, thanks.

Note You need to log in before you can comment on or make changes to this bug.