Statistical randomness

A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal die roll or the digits of π exhibit statistical randomness.

Statistical randomness does not necessarily imply "true" randomness, i.e., objective unpredictability. Pseudorandomness is sufficient for many uses.

A distinction is sometimes made between global randomness and local randomness. Most philosophical conceptions of randomness are global: they are based on the idea that "in the long run" a sequence would look truly random, even if certain subsequences would not look random. In a truly random sequence of sufficient length, for example, there would probably be long runs of nothing but zeros, though on the whole the sequence might be random. Local randomness refers to the idea that there can be minimum sequence lengths over which random distributions are approximated. Long stretches of the same digit, even when generated by a truly random process, diminish the local randomness of a sample: a sample might be locally random only for sequences of 10,000 digits, while subsequences of fewer than 1,000 digits might not appear random at all.

A sequence exhibiting a pattern is not thereby proven to lack statistical randomness. According to principles of Ramsey theory, sufficiently large objects must necessarily contain a given structure ("complete disorder is impossible").

Legislation concerning gambling imposes certain standards of statistical randomness on slot machines.

Contrast with algorithmic randomness.


Tests

The first tests for random numbers were published by M. G. Kendall and Bernard Babington Smith in the Journal of the Royal Statistical Society in 1938. They were built on statistical tools such as Pearson's chi-square test, which had been developed to determine whether experimental phenomena matched their theoretical probabilities (Pearson originally developed his test by showing that a number of dice experiments by W. F. R. Weldon did not display random behavior).

Kendall and Smith's original four tests were hypothesis tests, which took as their null hypothesis the idea that each number in a given random sequence had an equal chance of occurring, and that various other patterns in the data should also be distributed equiprobably.

* The frequency test was very basic: it checked that there were roughly the same number of 0s, 1s, 2s, 3s, and so on.

* The serial test did the same thing but for sequences of two digits at a time (00, 01, 02, etc.), comparing their observed frequencies with the predictions under the hypothesis that they are equally distributed.
* The poker test tested for certain sequences of five numbers at a time (aaaaa, aaaab, aaabb, etc.), based on hands in the game of poker.
* The gap test looked at the distances between zeros (00 would be a distance of 0, 030 a distance of 1, 02250 a distance of 3, etc.).
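The frequency test can be sketched with Pearson's chi-square statistic. The following is a minimal illustration, not Kendall and Smith's exact procedure; the function name and the 5% critical value (16.919 for 9 degrees of freedom) are the only details assumed beyond the description above.

```python
import random
from collections import Counter

def frequency_test_statistic(digits):
    """Pearson chi-square statistic for the frequency (equidistribution) test.

    Under the null hypothesis each digit 0-9 occurs with probability 1/10,
    so the expected count of each digit is len(digits) / 10.
    """
    n = len(digits)
    expected = n / 10
    counts = Counter(digits)
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

# 5% critical value of the chi-square distribution with 9 degrees of freedom:
# the sequence passes the frequency test when the statistic stays below this.
CHI2_9DF_5PCT = 16.919

sample = [random.randrange(10) for _ in range(10_000)]
stat = frequency_test_statistic(sample)
print(stat, "passes" if stat < CHI2_9DF_5PCT else "fails")
```

A sequence consisting of a single repeated digit yields an enormous statistic and fails immediately, while a perfectly balanced sequence yields a statistic of zero.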

If a given sequence was able to pass all of these tests at a given level of significance (generally 5%), then it was judged to be, in their words, "locally random". Kendall and Smith differentiated local randomness from true randomness in that many sequences generated by truly random methods might not display local randomness to a given degree: very large sequences might contain long runs of a single digit. This might be random on the scale of the entire sequence, but in a smaller block it would not be random (it would not pass their tests), and would be useless for a number of statistical applications.

As random number sets became more and more common, tests of increasing sophistication were used. Some modern tests plot random digits as points in three dimensions, which can then be rotated to look for hidden patterns. In 1995, the statistician George Marsaglia created a set of tests known as the Diehard tests, which he distributed with a CD-ROM of 5 billion pseudorandom numbers.

Pseudorandom number generators require such tests as the only verification of their randomness, since they are decidedly not produced by truly random processes but rather by deterministic algorithms. Over the history of random number generation, many sources of numbers that appeared random under testing have later been discovered to be very non-random when subjected to other types of tests. The notion of quasi-random numbers was developed to circumvent some of these problems, though pseudorandom number generators are still extensively used in many applications (even ones known to be extremely non-random), as they are good enough for most purposes.

Other tests:
* The monobit test treats each output bit of the random number generator as a coin flip and determines whether the observed numbers of heads and tails are close to the expected 50% frequency. The number of heads in a series of coin flips follows a binomial distribution.
* The Wald-Wolfowitz runs test counts the transitions between 0 bits and 1 bits, comparing the observed number of runs with the number expected of a random bit sequence.
* Information entropy
* Autocorrelation test
* Kolmogorov-Smirnov test
* Maurer's Universal Statistical Test.
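The monobit and runs tests can be sketched as follows. This is a simplified illustration based on the normal approximations used in the NIST SP 800-22 suite; the function names, the default 5% significance level, and the omission of NIST's preliminary proportion check in the runs test are all assumptions of this sketch.

```python
import math

def monobit_test(bits, alpha=0.05):
    """Monobit test: is the count of 1 bits close to n/2?

    Maps bits to +/-1; under the null hypothesis the sum S has mean 0
    and variance n, so the two-sided p-value is erfc(|S| / sqrt(2n)).
    """
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    p_value = math.erfc(abs(s) / math.sqrt(2 * n))
    return p_value >= alpha

def runs_test(bits, alpha=0.05):
    """Wald-Wolfowitz runs test: compares the observed number of runs
    (maximal blocks of identical bits) with the roughly 2*n*pi*(1-pi)
    runs expected of a random sequence, where pi is the fraction of 1s.
    (NIST's preliminary check that pi is near 1/2 is omitted here.)
    """
    n = len(bits)
    pi = sum(bits) / n
    runs = 1 + sum(bits[i] != bits[i - 1] for i in range(1, n))
    p_value = math.erfc(abs(runs - 2 * n * pi * (1 - pi))
                        / (2 * math.sqrt(2 * n) * pi * (1 - pi)))
    return p_value >= alpha

# A perfectly alternating sequence is balanced (passes the monobit test)
# but has far too many runs to be random (fails the runs test).
alternating = [0, 1] * 500
print(monobit_test(alternating), runs_test(alternating))  # prints: True False
```

The example shows why a battery of tests is needed: each test detects a different kind of structure, and a sequence can pass one while failing another.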

See also

* Checking if a coin is fair
* Normal number
* Randomness
* Random number
* Statistical hypothesis testing
* One-time pad
* Randomness tests


References

* M. G. Kendall and B. Babington Smith, "Randomness and Random Sampling Numbers", Journal of the Royal Statistical Society 101:1 (1938), 147-166.

External links

* DieHarder: a free (GPL) C random number test suite.
* Generating normally distributed random numbers

Wikimedia Foundation. 2010.
