yvsou.com

Brief Introduction

The success of a real time system in meeting its service requirements depends upon the correct ness and timeliness of its responses and its resilience to faults Regardless of its design or target applications a real time system must have some notion of time The time base can be absolute corresponding to a physical clock or relative based on specific events A synchronization primitive is implemented to establish and maintain a common time base among distributed functional units and between computational tasks. This review only presents recent developments in synchronization.

The notions of correctness and especially timeliness of distributed system services require a consistent global time base. In applications such as distributed databases synchronization is implicit in the task precedence constraints and in the phase lock and phase commit protocols which help serialize operations. For example a memory write followed by a read from the same location can yield differing results if a read is initiated prior to completion of the write in shared memory Fair access to shared resources data correctness and deadlock avoidance require a consistent ordering of events among cooperating processes.

Many techniques have been developed for circuit level synchronization supporting VLSI arrays or concurrent system models which employ semaphores or distributed clock lines to achieve the desired synchrony Synchronization can also be at the level of logical clocks messages or computational tasks Other facets of synchronization include coordinating functional units maintaining consistent distributed information and ensuring consistent scheduling diagnosis reconfiguration and application specific decisions. Even though synchronization methods goals and limits can differ considerably at various levels of the system hierarchy.

Recent Developments in Synchronization

Alternatives to the conventional hardware, software and logical synchronization methods have been developed to enhance the fault resiliency and efficiency of synchronization. This section highlights some recent developments

2.1       Hybrid Synchronization

Hybrid synchronization primitives are derived by combining software and hardware techniques and exploiting the benefits of each approach. A software model is typically superimposed over the system functions with some functions offloaded to dedicated hardware. For large system topologies, the cost of a dedicated clocking network often precludes hardware synchronization If software synchronization alone is used, the latency and computational message overhead limits the achievable tightness of synchronization Message routing delays also limit the tightness of synchronization that can be maintained. System faults only exacerbate the message routing traffic volume and delivery time deviations. Thus, as a scalable general solution for distributed systems, synchronization based solely on either hardware or software techniques is often inefficient or impossible.

Communication between nodes is still through messages, but the custom hardware at each node handles message traffic to reduce the system overhead Messages continue to propagate throughout the re synchronization interval, reducing both message congestion and minimizing delays due to message transit The interval during which a node records incoming messages prior to computing its timing correction is also shortened The achievable synchronization skew exceeds that of pure hardware techniques, but is still less than the minimum software synchronization skew Message transit delays are effectively masked by this approach, and the method is resilient to faults.

2.2       Non-fully Connected Models

Synchronization in large, fully connected systems is prohibitively expensive and usually physically impractical, especially when using hardware solutions. Viable synchronization methods have been developed for large systems without full connectivity.

The first method uses the conventional software synchronization approaches, requiring each node to collect information from other nodes. Hybrid techniques are then used to minimize the effect of message transit and routing delays. The consensus problem requires a similar aggregation of data from the system nodes, with information about all member nodes received redundantly over 2f + 1 disjoint routes.

An alternative approach requires only selective information assimilation. In this cluster based model, nodes within a cluster are fully connected, but not across clusters Each node obtains information from all of its cluster nodes, and from at least one node in every other cluster in the system Synchronization is thus provided with different granularity at the intra cluster and inter cluster levels While this hardware synchronization method achieves approximate agreement it cannot achieve consensus.

2.3       Probabilistic Synchronization

Thus far, the word have dealt with deterministic synchronization algorithms in which the probability of achieving synchrony is unity. However the inherent lower bounds on synchronization skew caused by message propagation and transit delay especially in large non-fully connected system may render deterministic synchronization inadequate for a given system model. If unbounded random message delays are permitted the achievable synchronization skew may be too large to be useful. In such systems the required skew is significantly less than is achievable using the best software methods and the cost of a hardware based protocol is prohibitive Probabilistic synchronization techniques were developed to deal with the limitations of deterministic methods.

Probabilistic techniques are used to achieve synchrony with a predefined accuracy. The draw back of these methods is that the probability of achieving synchrony while finite is not guaranteed to be unity A probabilistic approach was initiated based on the assumption that message propagation times are randomly distributed. The algorithm utilizes a query based approach. A node A sends out a request to determine the clock time of a specific node. Based on the assumed distribution of message transit times, node A can calculate the accuracy achievable based on the arrival time of reply to its query If the reply to the query is received later than the maximum receive time associated with a particular time reading accuracy the message is ignored. The query and clock reading procedure is repeated until the other node’s clock time has been read with the desired accuracy. There is no guarantee that over multiple reading attempts the desired reading accuracy condition will result. The algorithm is not fault tolerant and given the message overhead caused by multiple read attempts the maximum number of read attempts must be predetermined in any implementation of this method. This may decrease the probability of achieving synchrony because that probability is proportional to the number of read attempts.

2.4       Randomized Agreement

In principle a secret sequence S is shared among N nodes. Each of the N nodes possesses a subsequence of S selected randomly and independently by participating units. The sequences are chosen such that a set of k out of N processors can reconstruct S by exchanging their secret subsequences and S cannot be reconstructed by less than k nodes. These procedures do not broadcast aggregated data sets in consecutive rounds Selective information dispersal techniques limit the amount of message traffic in the system. Another non deterministic solution uses R data exchange rounds to reach agreement with a probability of 2^-R that agreement will not be achieved

2.5       Mechanical Verification of Clock Synchronization

Coverage of more than a few faults in a system is often made impractical by the overhead associated with redundant resource management In a real time critical control system, it is nearly impossible to maintain correct operation in the presence of generic or design faults because these faults are duplicated throughout the system Since fault avoidance is more cost effective than attempting to tolerate such faults formal methods for component and algorithm design are being explored. The goals of these techniques are to minimize intolerable designed in faults and to increase the portions of the system that can be formally validated.

The success of the clock synchronization methods we have discussed depends on the correctness of the algorithms and their implementations Recent research efforts have focused on the use of formal methods such as formal specification and verification of algorithms to avoid such faults in synchronization functions Many of the clock synchronization algorithms described herein have been formally verified using mechanical theorem proves or specification and verification systems

Synchronization in distributed System: Recent developments Version 0
👤 Author: by damajibodegmailcom 2020-03-30 12:16:30

Reversion History