Deadlock occurs when each of a group of threads is holding at least one lock while at the same time waiting on a lock held by a member of the same group.
Without some sort of external intervention, deadlock is forever. No thread can acquire the lock it is waiting on until that lock is released by the thread holding it, but the thread holding it cannot release it until the holding thread acquires the lock that it is waiting on.
We can create a directed-graph representation of a deadlock scenario
with nodes for threads and locks, as shown in
Figure .
An arrow from a lock to a thread indicates that the thread holds
the lock, for example, Thread B holds Locks 2 and 4.
An arrow from a thread to a lock indicates that the thread is waiting
on the lock, for example, Thread B is waiting on Lock 3.
A deadlock scenario will always contain at least one deadlock cycle.
In Figure , this cycle is
Thread B, Lock 3, Thread C, Lock 4, and back to Thread B.
Quick Quiz 8.1: But the definition of deadlock only said that each thread was holding at least one lock and waiting on another lock that was held by some thread. How do you know that there is a cycle? End Quick Quiz
Although there are some software environments such as database systems that can repair an existing deadlock, this approach requires either that one of the threads be killed or that a lock be forcibly stolen from one of the threads. This killing and forcible stealing can be appropriate for transactions, but is problematic for kernel and application-level use of locking.
Kernels and applications therefore work to avoid deadlocks. There are three major approaches, locking hierarchies, conditional locking, and single-lock-at-a-time designs.
Locking hierarchies order the locks and prohibit acquiring locks out
of order.
In Figure ,
we might order the locks numerically, so that a thread was
forbidden from acquiring a given lock if it already held a lock
with the same or a higher number.
Thread B has violated this hierarchy because it is attempting to
acquire Lock 3 while holding Lock 4, which permitted the deadlock
to occur.
Again, to apply a locking hierarchy, order the locks and prohibit out-of-order lock acquisition. In large program, it is wise to use tools to enforce your locking hierarchy [Cor06].
But suppose that there is no reasonable locking hierarchy.
This can happen in real life, for example, in layered network protocol stacks
where packets flow in both directions.
In the networking case, it might be necessary to hold the locks from
both layers when passing a packet from one layer to another.
Given that packets travel both up and down the protocol stack, this
is an excellent recipe for deadlock, as illustrated in
Figure .
Here, a packet moving down the stack towards the wire must acquire
the next layer's lock out of order.
Given that packets moving up the stack away from the wire are acquiring
the locks in order, the lock acquisition in line 4 of the figure
can result in deadlock.
One way to avoid deadlocks in this case is to impose a locking hierarchy,
but when it is necessary to acquire a lock out of order, acquire it
conditionally, as shown in
Figure .
Instead of unconditionally acquiring the layer-1 lock, line 5
conditionally acquires the lock using the spin_trylock() primitive.
This primitive acquires the lock immediately if the lock is available
(returning non-zero), and otherwise returns zero without acquiring the lock.
If spin_trylock() was successful, line 15 does the needed layer-1 processing. Otherwise, line 6 releases the lock, and lines 7 and 8 acquire them in the correct order. Unfortunately, there might be multiple networking devices on the system (e.g., Ethernet and WiFi), so that the layer_1() function must make a routing decision. This decision might change at any time, especially if the system is mobile.8.1Therefore, line 9 must recheck the decision, and if it has changed, must release the locks and start over.
Quick Quiz 8.2:
Can the transformation from
Figure to
Figure
be applied universally?
End Quick Quiz
Quick Quiz 8.3:
But the complexity in
Figure
is well worthwhile given that it avoids deadlock, right?
End Quick Quiz
In some cases, it is possible to avoid nesting locks, thus avoiding
deadlock.
However, there must be some mechanism to ensure that the needed data
structures remain in existence during the time that neither lock is
held.
One such mechanism is discussed in
Section
and several others are presented in
Chapter
.
Paul E. McKenney 2011-12-16