Tornado code


Tornado code

In computer science, tornado codes are a class of erasure codes that support error correction and have fast encoding and decoding algorithms. Software-based implementations of tornado codes are about 100 times faster on small lengths and about 10,000 times faster on larger lengths than software-based Reed-Solomon erasure codes while having only slightly worse overhead.

Tornado codes are fixed rate, near optimal erasure correcting codes that use sparse bipartite graphs to trade encoding and decoding speed for reception overhead. Since the introduction of Tornado codes, many other similar codes have emerged, most notably Online codes, LT codes and Raptor codes.

Rough Overview

Tornado codes work via xor (exclusive or). Xor operates on binary values, 1s and 0s. A xor B is 1 if A and B have different values and 0 if A and B have the same values. If you are given (A xor B) and A, you can determine the value for B. (Note that A xor B xor A is equal to B.) Similarly, if you are given (A xor B) and B, you can determine the value for A. This extends to multiple values, so given (A xor B xor C xor D) and any 3 of the values, the missing value can be recovered.

The xor of a number of binary variables is called their "parity" and this is often used in error detection and correction. Tornado codes use it for error correction. They use another checksum (like CRC-32 or MD5) for error detection.

The Tornado code algorithm starts with the sender breaking an input file or message into equal sized blocks of bytes. Let's call these blocks A [1] through A [N] . The sender records the index of each block and computes a checksums for the block and its index. (These will be used to determine if a block has been damaged during transmission and therefore needs to be recovered.) The sender also calculates some parity blocks, B [1] through B [K] . Each of these parity blocks holds the parity for a subset of the input blocks A [1] through A [N] . The sizes of and composition of these subsets is key to the speed and success of this algorithm. For each parity block, the sender records the indices of the input blocks and a checksum for the parity block and its input indices.

The sender now sends the input and parity blocks (with their indices and checksums) to the receiver. During this transmission, some of the blocks may be corrupted.

The receiver uses the checksums to identify the bad blocks and discards them. The receiver is now left with a subset of the input blocks and some parity blocks. As long as the receiver has received N + C blocks (where C is some constant), it is highly probable that the receiver can recover the file. Now, each parity block is associated with a subset of input blocks and for most parity blocks there may be multiple input blocks missing from its subset. However, given the sizes of the random subsets, it is highly likely that there exists one parity block that is missing only one of its input blocks. Using the xor operation described above, that missing input block can be recovered. Once it is recovered, a parity block that was previously missing two input blocks may now be missing just one and that one can now be recovered. This process continues - input blocks being recovered and more parity blocks being available to recover missing blocks - until the entire input file or message is recovered.

The beauty of the algorithm comes from determining the sizes and composition of the subsets. On average, the sizes are low - making it very fast to create them and fast to recover the file. However, occasionally they are large - covering most of the input blocks - so that any missing block can be recovered.

Exact Algorithm

"Need details on the exact formula for computing the size of subsets, given a value for C and expected loss rate. Also need big O() notation for algorithm and comparison for best of Reed-Solomon implementation. Source code is always great."

The subsets for each parity block are generated in a semi-random fashion. The algorithm assigned for each input block a number of parity blocks it will be present in. The algorithm also assigns for each parity block the number of input blocks that will be in it. Then randomly parity blocks with remaining slots are assigned input blocks with remaining slots. This is called a random degree-constrained bipartite graph. This method generates parity blocks with random subsets of input blocks with a specific distribution that allows the input file or message to be recovered with very high probability.

Patent Issues

Tornado codes and many similar codes (LT codes, Online codes) are patented inside the United States of America and cannot be used with out the patent holder's permission.

Citations

Michael Luby created Tornado codes and has good descriptions on his website. [http://www.icsi.berkeley.edu/~luby/]

A readable description from CMU (PostScript): [http://www.cs.cmu.edu/afs/cs.cmu.edu/project/pscico-guyb/realworld/www/tornado.ps]

Probable patent citation from Michael Luby's CV: "Message Encoding and Transmission System and Method for Multilevel Data Redundancy", Andres Albanese, Michael Luby, Johannes Blomer and Je Edmonds, U.S. Patent No. 5,617,541, Issued April 1, 1997. Serial Number 08/361,802; 12-21-94, Assignment recorded February 27, 1995, Reel 7364, Frames 685-689.


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Tornado — Tornado, WV U.S. Census Designated Place in West Virginia Population (2000): 1111 Housing Units (2000): 437 Land area (2000): 3.595035 sq. miles (9.311097 sq. km) Water area (2000): 0.084938 sq. miles (0.219989 sq. km) Total area (2000): 3.679973 …   StarDict's U.S. Gazetteer Places

  • Tornado, WV — U.S. Census Designated Place in West Virginia Population (2000): 1111 Housing Units (2000): 437 Land area (2000): 3.595035 sq. miles (9.311097 sq. km) Water area (2000): 0.084938 sq. miles (0.219989 sq. km) Total area (2000): 3.679973 sq. miles… …   StarDict's U.S. Gazetteer Places

  • Tornado, West Virginia — Infobox Settlement official name = Tornado, West Virginia settlement type = CDP nickname = motto = imagesize = image caption = image mapsize = 250x200px map caption = Location of Tornado, West Virginia mapsize1 = map caption1 = subdivision type …   Wikipedia

  • Tornado Alley — For the book by William S. Burroughs, see Tornado Alley (book). Tornado Alley is a colloquial term most often used in reference to the area of the United States in which tornadoes are most frequent. Although an official location is not defined,… …   Wikipedia

  • Tornado watch — See Severe weather terminology for a comprehensive article on related weather terms. A tornado watch (SAME code: TOA; sometimes referred to as a red box by meteorologists and storm chasers) is issued when weather conditions are favorable for the… …   Wikipedia

  • Tornado Gundam — The Tornado Gundams are fictional weapons from the Gundam metaseries and were introduced in SD Gundam GX.AppearanceThe Tornado Gundams are the initial units given to the player upon first starting the game. As the name suggests, they are Gundam… …   Wikipedia

  • Building code — Code Violation: This concrete block wall is penetrated by cable trays and cables. The hole should be firestopped to restore the fire resistance rating of the wall. Instead, it is filled with flammable polyurethane foam. A building code, or… …   Wikipedia

  • LT code — In computer science, LT codes (Luby Transform codes) are the first class of practical fountain codes that are near optimal erasure correcting codes invented by Michael Luby in 1998 and published in 2002. [http://ieeexplore.ieee.org/xpl/freeabs… …   Wikipedia

  • Erasure code — In information theory, an erasure code is a forward error correction (FEC) code for the binary erasure channel, which transforms a message of k symbols into a longer message (code word) with n symbols such that the original message can be… …   Wikipedia

  • List of computer technology code names — Following is a list of code names that have been used to identify computer hardware and software products while in development. In some cases, the code name became the completed product s name, but most of these code names are no longer used once …   Wikipedia