Self-stabilization

Self-stabilization

Self-stabilization is a concept of fault-tolerance in distributed computing. Distributed computing systems are challenging to debug and analyze. As a result, strong properties (properties that hold under a variety of circumstances) of such systems are especially important to simplify systems analysis and to prove system correctness. Self-stabilization is considered a highly desirable property. A distributed system that is self-stabilizing will end up in a correct state no matter what state it is initialized with, and no matter what execution steps it will take. This property guarantees that the system will end in a correct state after a finite number of execution steps. This is in contrast to typical fault-tolerance algorithms that guarantee that under all state transitions, the system will never deviate from a correct state. E.W. Dijkstra in 1974 presented the first self-stabilizing algorithm, prompting further research in this area. [ [http://www.cs.utexas.edu/users/EWD/ewd04xx/EWD426.PDF E.W. Dijkstra: Self-stabilizing systems in spite of distributed control. Commun. ACM 17 (1974), 11: 643-644.] ]

The ability to recover without external intervention is very desirable in modern computer and telecommunications networks, since it would enable them to repair errors and return to normal operations on their own. Computers and networks can thus be made fault-tolerant. Hence, many years after the seminal paper of Dijkstra, this concept is gaining in importance as it presents an important foundation for self-managing computer systems and fault-tolerant systems. As a result, Dijkstra's paper received the 2002 ACM PODC Influential-Paper Award - one of the highest achievements in the distributed computing community. [ [http://www.podc.org/dijkstra/ Edsger W. Dijkstra Prize in Distributed Computing] ]

Overview

A distributed algorithm is self-stabilizing if, starting from an arbitrary state, it is guaranteed to converge to a legitimate state and remain in a legitimate set of states thereafter. A state is legitimate if starting from this state the algorithm satisfies its specification. The property of self-stabilization enables a distributed algorithm to recover from a transient fault regardless of its nature. Moreover, a self-stabilizing algorithm does not have to be initialized as it eventually starts to behave correctly.

Dijkstra's paper, which introduces the concept of self-stabilization, presents an example in the context of a "token ring" — a network of computers ordered in a circle, such that exactly one of them is supposed to "hold a token" at any given time. Not holding a token is a correct state for each computer in this network, since the token can be held by another computer. However, if every computer is in the state of "not holding a token" then the network as a whole is not in a correct state. Similarly, if more than one computer "has a token" then this is not a correct state for the network, although it cannot be observed to be incorrect by viewing any computer individually. Since every computer can "see" only the states of two other computers, it is hard for the computers to decide whether the network as a whole is in a correct state.

The time complexity of a self-stabilizing algorithm is measured in (asynchronous) "rounds" or "cycles". A round is a shortest execution trace in which each processor executes at least one step. Similarly, a cycle is a shortest execution trace in which each processor executes at least one complete iteration of its repeatedly executed list of commands. It is also interesting to measure the output stabilization time. For that, a subset of the state variables is defined to be externally visible (the "output"). Certain states of outputs are defined to be correct (legitimate). The set of the outputs of all the components of the system is said to have stabilized at the time that it starts to be correct, provided it stays correct indefinitely, unless additional faults occur. The output stabilization time is the time (the number of (asynchronous) "rounds") until the output stabilized.Baruch Awerbuch, Boaz Patt-Shamir, George Varghese. Self-Stabilization By Local Checking and Correction (Extended Abstract) FOCS 1991: 268-277.]

The first self-stabilizing algorithms did not detect errors explicitly in order to subsequently repair them. Instead, they constantly pushed the system towards a legitimate state, even without explicitly detecting error states. Since traditional methods for detecting an error (e.g. [Shmuel Katz, Kenneth J. Perry. Self-Stabilizing Extensions for Message-Passing Systems. Distributed Computing 7(1): 17-26 (1993).] ) were often very difficult and time-consuming, such a behaviour was considered desirable.

New methods for light-weight error detection for self-stabilizing systems were suggested in, [http://iew3.technion.ac.il/~kutten/aky.ps Yehuda Afek, Shay Kutten, Moti Yung. The Local Detection Paradigm and Its Application to Self-Stabilization. Theor. Comput. Sci. 186(1-2): 199-229 (1997).] ] Baruch Awerbuch, Boaz Patt-Shamir, George Varghese. Self-Stabilization By Local Checking and Correction (Extended Abstract) FOCS 1991: 268-277.] under the names of local detection and local checking. The term "local" refers to a part of a computer network. When local detection is used, a computer in a network is not required to communicate with the entire network in order to detect an error — the error can be detected by having each computer communicate only with its nearest neighbors. These local detection methods simplified the task of designing self-stabilizing algorithms considerably. This is because the error detection mechanism and the recovery mechanism can be designed separately. Newer algorithms based on these detection methods turned out to be also much more efficient.

Additional efficiency was introduced with the notion of time-adaptive protocols. [ [http://iew3.technion.ac.il/~kutten/boaz.ps Shay Kutten, Boaz Patt-Shamir: Stabilizing Time-Adaptive Protocols. Theor. Comput. Sci. 220(1): 93-111 (1999).] ] The idea behind these is that when only a small number of errors occurs, the recovery time should (and can) be made short. The original algorithms of Dijkstra do not have this property.

A useful property of self-stabilizing algorithms is that they can be composed by layering if they do not exhibit any circular dependencies. The stabilization time of the composition is then bounded by the sum of the individual stabilization times of each layer.

Definition

A system is self-stabilizing if and only if:
# Starting from any state, it is guaranteed that the system will eventually reach a correct state ("convergence").
# Given that the system is in a correct state, it is guaranteed to stay in a correct state, provided that no fault happens ("closure").

A system is said to be "randomized self-stabilizing" if and only if it is self-stabilizing and the expected number of rounds needed to reach a correct state is bounded by some constant k. [Self-Stabilization. Shlomi Dolev, MIT Press, 2000.]

A self-stabilizing algorithm is "silent" if and only if it converges to a global state where the values of communication registers used by the algorithm remain fixed. [Shlomi Dolev, Mohamed G. Gouda, and Marco Schneider. [http://doi.acm.org/10.1145/248052.248055 Memory requirements for silent stabilization] . In PODC '96: Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing, pages 27--34, New York, NY, USA, 1996. ACM Press. [http://citeseer.ist.psu.edu/dolev96memory.html Online extended abstract] .]

Related work

An extension of the concept of self-stabilization is that of superstabilization. [Shlomi Dolev and Ted Herman. [http://cjtcs.cs.uchicago.edu/articles/1997/4/contents.html Superstabilizing protocols for dynamic distributed systems] . Chicago Journal of Theoretical Computer Science, 4, December 1997. Special Issue on Self-Stabilization.] The intent here is to cope with dynamic distributed systems that undergo topological changes. In classical self-stabilization theory, arbitrary changes are viewed as errors where no guarantees are given until the system has stabilized again. With superstabilizing systems, there is a "passage" predicate that is always satisfied, while the system's topology is reconfigured.

References


Wikimedia Foundation. 2010.

Игры ⚽ Нужно решить контрольную?

Look at other dictionaries:

  • self-stabilization — stabilizavimasis statusas T sritis radioelektronika atitikmenys: angl. self stabilization vok. Selbststabilisierung, f rus. самостабилизация, f pranc. autostabilisation, f …   Radioelektronikos terminų žodynas

  • self-stabilization of phase — fazavimasis statusas T sritis radioelektronika atitikmenys: angl. self phasing; self stabilization of phase vok. selbständige Phasenstabilisierung, f rus. автофазировка, f pranc. stabilisation automatique de phase, f …   Radioelektronikos terminų žodynas

  • self-phasing — fazavimasis statusas T sritis radioelektronika atitikmenys: angl. self phasing; self stabilization of phase vok. selbständige Phasenstabilisierung, f rus. автофазировка, f pranc. stabilisation automatique de phase, f …   Radioelektronikos terminų žodynas

  • Emergency Economic Stabilization Act of 2008 — This article is about one division of an enacted statute. For the entire statute, see Public Law 110 343. For the enacted rescue program, see Troubled Asset Relief Program. The Emergency Economic Stabilization Act of 2008 (Division A of Pub.L.… …   Wikipedia

  • Japan Self-Defense Forces — Military of Japan redirects here. For earlier military forces of the country, see Military history of Japan. JSDF redirects here. For other uses, see Japan Social Development Fund. Japan Self Defense Forces 日本国自衛隊 …   Wikipedia

  • Center for International Stabilization and Recovery — The Center for International Stabilization and Recovery (CISR), formerly the Mine Action Information Center (MAIC), is a public policy center at James Madison University that manages information, conducts training, holds conferences and workshops …   Wikipedia

  • Autostabilisation — L autostabilisation, ou auto stabilisation, est la propriété d un système réparti, composé de plusieurs machines capables de communiquer entre elles, qui consiste, lorsque le système est mal initialisé ou perturbé, à retourner automatiquement à… …   Wikipédia en Français

  • Superstabilization — in computer science is a specialization of the concept of self stabilization. The difference is that a passage predicate is satisfied while the system undergoes a topological change. Thus, a superstabilizing protocol is said to be more stable… …   Wikipedia

  • Bicycle and motorcycle dynamics — A computer generated, simplified model of bike and rider demonstrating an uncontrolled right turn. An …   Wikipedia

  • Edsger W. Dijkstra — Edsger Wybe Dijkstra Born May 11, 1930(1930 05 11) Rotterdam, Netherl …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”