Lockstep (computing)

Lockstep systems are redundant computing systems that run the same set of operations at the same time in parallel. The output from lockstep operations can be compared to determine if there has been a fault.

To run in lockstep, each system is set up to progress from one well-defined state to the next well-defined state. When a new set of inputs reaches the system, it processes them, generates new outputs and updates its state. This set of changes (new inputs, new outputs, new state) is considered to define that step, and must be treated as an atomic transaction; in other words, either all of it happens, or none of it happens, but not something in between.
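The atomic step described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name `LockstepUnit`, the dictionary-based state and the summing "operation" are all assumptions made for the example. The new state is built on a private copy and only becomes visible at a single commit point, so a step that fails part-way leaves the previous well-defined state intact.

```python
import copy


class LockstepUnit:
    """Hypothetical unit that moves from one well-defined state to the next."""

    def __init__(self):
        self.state = {"total": 0, "steps": 0}

    def step(self, inputs):
        # Work on a private copy so a failure leaves self.state untouched.
        draft = copy.deepcopy(self.state)
        draft["total"] += sum(inputs)  # raises TypeError on non-numeric input
        draft["steps"] += 1
        outputs = {"total": draft["total"]}
        # Commit point: the new state becomes visible in one assignment.
        self.state = draft
        return outputs


unit = LockstepUnit()
first = unit.step([1, 2, 3])       # completes: state advances by one step
try:
    unit.step([1, "x"])            # fails part-way: nothing is committed
except TypeError:
    pass
state_after_failure = unit.state   # still the state left by the first step
```

Either the whole step (inputs consumed, outputs produced, state updated) takes effect, or none of it does.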

The term "lockstep" originates in prison usage, where it refers to synchronized walking in which the marchers walk as closely together as physically practical.

Dual Modular Redundancy

Where the computing systems are duplicated (dual modular redundancy, DMR) and both actively process each step, it is difficult to arbitrate between them if their outputs differ at the end of a step: a mismatch shows that a fault has occurred, but not which unit is at fault. For this reason, it is common practice to run DMR systems as "master/slave" configurations with the slave as a "hot standby" to the master, rather than in lockstep. Since there is no advantage in having the slave unit actively process each step, a common method of working is for the master to copy its state to the slave at the end of each step's processing. Should the master fail at some point, the slave is ready to continue from the last known good step.
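The master/slave arrangement can be sketched as below. All names here (`Unit`, `HotStandbyPair`, `HardwareFault`, the `broken` flag) are hypothetical, and the hardware failure is simulated with a flag. The sketch assumes, as the text does, that some separate mechanism detects the master's failure. It is represented by an exception.

```python
import copy


class HardwareFault(Exception):
    """Stands in for whatever mechanism detects a failed master."""


class Unit:
    def __init__(self, name):
        self.name = name
        self.state = {"total": 0}
        self.broken = False  # set to True to simulate a hardware failure

    def process(self, value):
        if self.broken:
            raise HardwareFault(self.name)
        self.state["total"] += value
        return self.state["total"]


class HotStandbyPair:
    def __init__(self):
        self.master = Unit("A")
        self.slave = Unit("B")

    def step(self, value):
        try:
            output = self.master.process(value)
        except HardwareFault:
            # Promote the standby, which holds the last known good state,
            # and repeat the failed step on it. No standby remains.
            self.master, self.slave = self.slave, None
            output = self.master.process(value)
        if self.slave is not None:
            # End of step: copy the master's state to the standby.
            self.slave.state = copy.deepcopy(self.master.state)
        return output


pair = HotStandbyPair()
pair.step(5)
pair.step(2)
pair.master.broken = True   # simulated hardware failure in unit A
result = pair.step(3)       # unit B takes over from the copied state
```

After the failover the pair runs without a standby, so a second failure would not be survivable in this sketch.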

While either the lockstep or the DMR approach (when combined with some means of detecting errors in the master) can provide redundancy against hardware failure in the master, neither protects against software failure. If the master fails because of a software error, it is highly likely that the slave, in attempting to repeat the execution of the step that failed, will simply repeat the same error and fail in the same way. This is an example of a common-mode failure.

Triple Modular Redundancy

Where the computing systems are triplicated, it becomes possible to treat them as "voting" systems. If one unit's output disagrees with the other two, it is detected as having failed. The matched output from the other two is treated as correct.
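A two-out-of-three voter of the kind described can be sketched as follows. The function name `vote` and its return shape are assumptions for illustration, and the sketch requires the unit outputs to be hashable values that can be compared for equality.

```python
from collections import Counter


def vote(outputs):
    """Return (agreed_output, indices_of_failed_units) for three outputs."""
    if len(outputs) != 3:
        raise ValueError("expected exactly three outputs")
    winner, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        # All three disagree: no majority, so no output can be trusted.
        raise RuntimeError("no majority among the three units")
    failed = [i for i, out in enumerate(outputs) if out != winner]
    return winner, failed


agreed, failed_units = vote([42, 42, 41])  # unit 2 is outvoted
```

The case where all three outputs differ is not covered by the text. This sketch treats it as an unrecoverable error.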


Wikimedia Foundation. 2010.
