Statistical Lempel Ziv

Statistical Lempel Ziv

Statistical Lempel-Ziv is a concept of lossless data compression technique published by Dr. Sam Kwong and Yu Fan Ho in 2001.[1] It may be viewed as a variant of the Lempel-Ziv (LZ) based method. The contribution of this concept is to include the statistical properties of the source information while most of the LZ-based compression methods, such as LZ78 and LZW do not take this property into consideration.



The concept of statistical Lempel-Ziv was firstly proposed by Yu Fan Ho in 2000 as the research topic of the Master degree in the Department of Computer Science of the City University of Hong Kong. Dr. Sam Kwong was Ho's supervisor in this research topic.

In Feb, 2001, the paper on the title "A Statistical Lempel-Ziv compression algorithm for personal digital assistant (PDA)" was published in the IEEE Transactions on Consumer Electronics.

In 2004, Ho successfully applied statistical Lempel-Ziv to a compression algorithm specific for polyphonic melody data. It was useful for the popular mobile phone or handhelds as polyphonic ring tones. Ho proved that the compression ratio, decompression speed and memory consumption outperformed the commonly used lossless compressors such as LZ77, zip, etc., although the compression speed is lower. Fortunately, the compression speed is not essential because compression of the ring tones for handheld devices were preprocessed in factory and not in the devices.

In March 2009, the application of statistical Lempel Ziv on melody data was granted a patent by the USPTO with the United States Patent number 7,507,897.[2]


Traditional LZ-based technologies make use of the repetitive characteristic of the data. The decompression process can be done simply by copying the repeated data from the search window according to an index in the compressed data. The data not found in the window is left uncompressed in the compressed data. The uncompressed data is then shifted into the search window for the next repetition and so on. The data is shifted into the window unconditionally without considering the statistical information. Because of limited size of the search window, the first-in data is shifted out unconditionally when the window is full. There are high possibilities that the window is occupied by the useless (non-repetitive) data while the useful (to be repeated) data is banished. To improve the compression ratio, larger search window should be used and hence more memory required in the decompressor.

Statistical Lempel Ziv is a LZ-like lossless compression algorithm but statistical information is also taken into consideration to identify the useful data that should be put into the dictionary (search window). It improves the compression ratio compared with LZ77 because more useful data can be kept in the dictionary. The dictionary can be smaller in size for keeping the useful data and hence less memory required in the decompressor. Since not all the data has to be shifted into the window, less processing power is required on the decompressor.


  1. ^ Kwong, S. & Ho, Y.F., "A statistical Lempel-Ziv compression algorithm for personal digital assistant (PDA)", IEEE Transactions on Consumer Electronics, Vol. 47, Issue 1, pp. 154-162, Feb 2001
  2. ^ Dictionary-based compression of melody data and compressor/decompressor for the same, U.S. patent: 7,507,897

Family of Compression Methods

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • ZIV, JACOB — (1931– ), electrical engineer. Born in Tiberias, Israel, Ziv received his B.Sc., Dip. Eng. (1954), and M.Sc. (1957), both in electrical engineering, from the Technion, Israel Institute of Technology, Haifa. From 1955 to 1959, he was a senior… …   Encyclopedia of Judaism

  • Jacob Ziv — ( he. יעקב זיו; 1931 ) is an Israeli computer scientist who, along with Abraham Lempel, developed the lossless LZ77 compression algorithm.Ziv was born in Tiberias, British ruled Palestine, on November 27, 1931. He received the B.Sc., Dip. Eng.,… …   Wikipedia

  • Lossless data compression — is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be… …   Wikipedia

  • Data compression — Source coding redirects here. For the term in computer programming, see Source code. In computer science and information theory, data compression, source coding or bit rate reduction is the process of encoding information using fewer bits than… …   Wikipedia

  • Huffman coding — Huffman tree generated from the exact frequencies of the text this is an example of a huffman tree . The frequencies and codes of each character are below. Encoding the sentence with this code requires 135 bits, as opposed of 288 bits if 36… …   Wikipedia

  • Information theory — Not to be confused with Information science. Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental… …   Wikipedia

  • Kolmogorov complexity — In algorithmic information theory (a subfield of computer science), the Kolmogorov complexity of an object, such as a piece of text, is a measure of the computational resources needed to specify the object. It is named after Soviet Russian… …   Wikipedia

  • MPEG-1 — Moving Picture Experts Group Phase 1 (MPEG 1) Filename extension .mpg, .mpeg, .mp1, .mp2, .mp3, .m1v, .m1a, .m2a, .mpa, .mpv Internet media type audio/mpeg, video/mpeg Developed by ISO, IEC Type of format audio, vid …   Wikipedia

  • Nyquist–Shannon sampling theorem — Fig.1: Hypothetical spectrum of a bandlimited signal as a function of frequency The Nyquist–Shannon sampling theorem, after Harry Nyquist and Claude Shannon, is a fundamental result in the field of information theory, in particular… …   Wikipedia

  • Motion compensation — is an algorithmic technique employed in the encoding of video data for video compression, for example in the generation of MPEG 2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current… …   Wikipedia