Distributed Deadlocks
Deadlock is an even thornier problem in distributed systems than in centralized ones: the resources are scattered across machines, and so are the deadlocked processes. That makes deadlock both harder to detect and harder to deal with.
The same four strategies used in centralized OSs apply:
- Prevent the problem by design
- Avoid the problem by taking steps at resource allocation time
- Detect when deadlock happens and clean up
- Ignore the problem (the venerable ostrich algorithm)
As in centralized systems, avoidance is almost never used in distributed systems. The problem is the a priori information required by algorithms such as the Banker's algorithm.
Ignoring the problem may be justified by the rarity of deadlocks, or by their low cost when they do occur.
Prevention is more difficult, but some strategies are possible in the context of atomic transactions.
Detection and recovery is the most common approach.
Detection
What makes detection and recovery more feasible than in centralized systems is that atomic transactions are designed to cope with failure quite nicely. When a transaction is part of a deadlock, it can be aborted. Since aborted transactions leave the system in a known, safe state, they can simply be restarted later, when they will hopefully avoid the deadlock. Contrast this with killing processes involved in a deadlock (unhappy users, lost work).
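A minimal sketch of that retry pattern, assuming a hypothetical transaction callable and a made-up DeadlockAborted exception raised when the detector picks the transaction as its victim:

    import random
    import time

    class DeadlockAborted(Exception):
        """Raised when the deadlock detector aborts this transaction."""

    def run_with_retry(transaction, max_attempts=5):
        # An aborted transaction left no partial effects behind, so it is
        # always safe to run it again from the beginning.
        for attempt in range(max_attempts):
            try:
                return transaction()
            except DeadlockAborted:
                # Back off briefly so the competing transactions can finish.
                time.sleep(random.uniform(0, 0.1 * (attempt + 1)))
        raise RuntimeError("transaction kept deadlocking; giving up")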
Centralized detection
Periodically gather all resource allocation graph (RAG) information in a single place, either pushed by clients or polled by a coordinator. The coordinator then runs the usual cycle-detection algorithms on the combined graph to find deadlocks, much as a centralized system would. The problem is getting the data into one place and having it be a reliable snapshot of the state of the system: data that arrives late, or in a bad order, can make the coordinator report a false deadlock, a cycle that never actually existed at any single instant.
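As a sketch of the coordinator's side, assuming sites periodically report their local wait-for edges (the names here are illustrative, not from any particular system):

    from collections import defaultdict

    class Coordinator:
        def __init__(self):
            # Global wait-for graph: waiter -> set of processes it waits on.
            self.waits_for = defaultdict(set)

        def report_edges(self, site_edges):
            # Each site reports its local (waiter, holder) pairs. Reports
            # that arrive late or out of order can make this merged picture
            # show a cycle that no longer exists: false deadlock.
            for waiter, holder in site_edges:
                self.waits_for[waiter].add(holder)

        def find_cycle(self):
            # Standard depth-first search for a cycle in the merged graph.
            WHITE, GRAY, BLACK = 0, 1, 2
            color = defaultdict(int)            # defaults to WHITE

            def dfs(node, path):
                color[node] = GRAY
                path.append(node)
                for nxt in self.waits_for[node]:
                    if color[nxt] == GRAY:      # back edge: found a cycle
                        return path[path.index(nxt):]
                    if color[nxt] == WHITE:
                        cycle = dfs(nxt, path)
                        if cycle:
                            return cycle
                path.pop()
                color[node] = BLACK
                return None

            for node in list(self.waits_for):
                if color[node] == WHITE:
                    cycle = dfs(node, [])
                    if cycle:
                        return cycle
            return None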
Distributed detection
The Chandy-Misra-Haas algorithm (1983).
It allows multiple simultaneous requests: a process can block on more than one process at a time.
A probe message is sent when a process blocks on a request. The probe goes to the process holding the desired resource and contains three fields:
- the process that just blocked (the initiator)
- the process sending the probe
- the process it is being sent to
When a process receives a probe, it replaces the second field with its own id and the third field with the id of a process it is waiting on, then forwards the new probe to that process. It sends one such probe for each process it is blocked on. If a probe makes it all the way around and arrives back at the initiator, a cycle exists and the system is deadlocked.
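A toy single-machine simulation of the probe mechanism; in a real system each forward is a network message, and each entry of the (hypothetical) waits_on table lives on a different host:

    waits_on = {                  # process -> set of processes it is blocked on
        "P1": {"P2"},
        "P2": {"P3"},
        "P3": {"P1", "P4"},       # P3 is blocked on two processes at once
        "P4": set(),              # P4 is running, so probes die here
    }

    def probe_for_deadlock(initiator):
        # A probe is the triple (initiator, sender, receiver).
        pending = [(initiator, initiator, t) for t in waits_on[initiator]]
        seen = set()              # don't forward the same probe twice
        while pending:
            probe = pending.pop()
            init, sender, receiver = probe
            if receiver == init:
                return True       # probe returned to the initiator: cycle
            if probe in seen:
                continue
            seen.add(probe)
            for target in waits_on[receiver]:
                # The receiver rewrites fields two and three and sends one
                # probe per process it is blocked on.
                pending.append((init, receiver, target))
        return False

    print(probe_for_deadlock("P1"))   # True: P1 -> P2 -> P3 -> P1
    print(probe_for_deadlock("P4"))   # False: P4 is not blocked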
Some process must then be killed. The initiator could commit suicide, but several processes may initiate probes for the same cycle, so having every initiator kill itself would be overkill. If the probe accumulates the ids of the entire cycle, a single victim can be selected instead.
Problems
How do you send probes when you are blocked?
What will the performance be when every probe message takes at least a millisecond to deliver?
90 percent of all deadlocks involve only two processes, according to a study done in 1990.
Prevention
The same techniques can be applied as in centralized systems:
- disallow hold and wait (hold only one resource at a time)
- request all resources before starting, and require all held resources to be released before a new one may be requested
- order all resources globally and request them only in that order (see the sketch after this list)
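A sketch of the ordering rule, with threads and locks standing in for distributed processes and resources; all that matters is that every participant agrees on the resource ranking (alphabetical by name in this toy example):

    import threading

    # Hypothetical shared resources; the agreed global order is their names.
    locks = {name: threading.Lock() for name in ("disk", "printer", "tape")}

    def acquire_in_order(*names):
        # Always lock in the global order, whatever order the caller asked
        # for. Every chain of waiters then runs "uphill" in rank, so the
        # wait-for graph cannot contain a cycle.
        ordered = sorted(names)
        for name in ordered:
            locks[name].acquire()
        return [locks[name] for name in ordered]

    held = acquire_in_order("printer", "disk")  # takes disk first, then printer
    for lock in reversed(held):
        lock.release()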
Global timestamps and atomic transactions give rise to other possibilities. Each transaction is given a globally valid timestamp when it starts (Lamport's algorithm for a global clock is handy). No two transactions may ever have the same timestamp, so a wise choice of unique identifier is the Lamport timestamp with the host id appended.
When a process is about to block, it checks whether the process holding the resource it wants has a larger (younger) timestamp. The block is only allowed if the waiting process is older than the process being waited for. Following the arcs in the RAG then always moves toward higher timestamps, so no cycle can exist. It also means that older processes have higher priority. A process that is not allowed to wait kills its transaction instead. This favors older processes, which is good: they have done more work, and they cannot starve this way. The younger processes die and restart, and since each keeps its original timestamp, it ages until it is eventually allowed to wait for the resource.
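This scheme is known in the literature as wait-die. A sketch, with timestamps made unique as above by appending a host id (smaller tuple = older transaction):

    class Abort(Exception):
        """The requesting transaction must die and restart later."""

    def on_conflict(waiter_ts, holder_ts):
        # wait-die: an older transaction may wait for a younger one, but a
        # younger transaction is never allowed to wait for an older one.
        if waiter_ts < holder_ts:
            return "wait"     # arcs point old -> young only, so no cycles
        # The younger transaction dies; it restarts later with its ORIGINAL
        # timestamp, so it keeps aging and is eventually allowed to wait.
        raise Abort("younger transaction aborts and retries")

    old = (10, "hostA")       # (Lamport time, host id): ties are impossible
    young = (42, "hostB")
    print(on_conflict(old, young))    # "wait": older waits for younger
    # on_conflict(young, old)         # raises Abort: the younger one dies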
Another possibility arises from the nature of transactions: preemption of resources. A resource can be taken away from a process by a process running an older transaction. The younger transaction is aborted, the system is not hurt (no half-modified files, for example), and the transaction is restarted later.
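This preemptive variant is usually called wound-wait in the literature: the older transaction never waits for a younger one, it wounds it, aborting it and taking the resource. A sketch under the same timestamp scheme as above:

    def on_conflict(waiter_ts, holder_ts):
        # wound-wait: the older requester preempts (wounds) the younger
        # holder, whose transaction aborts cleanly and restarts; a younger
        # requester simply waits for the older holder to finish.
        if waiter_ts < holder_ts:
            return "wound holder"     # abort the holder, take the resource
        return "wait"                 # younger waits for older

In each scheme the wait-for arcs run in only one direction with respect to transaction age, so cycles remain impossible.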