# Power law

﻿

A power law is any polynomial relationship that exhibits the property of scale invariance. The most common power laws relate two variables and have the form

:$f\left(x\right) = ax^k + o\left(x^k\right),$

where $a$ and $k$ are constants, and $o\left(x^k\right)$ is an asymptotically small function of $x$. Here, $k$ is typically called the "scaling exponent", the word "scaling" denoting the fact that a power-law function satisfies $f\left(c x\right) \propto f\left(x\right)$ where $c$ is a constant. That is, a rescaling of the function's argument changes the constant of proportionality but preserves the shape of the function itself. This point becomes clearer if we neglect the $o\left(x^k\right)$ term and take the logarithm of both sides:

:$\log\left(f\left(x\right)\right) = k \log x + \log a.$

Notice that this expression has the form of a linear relationship with slope $k$. Rescaling the argument produces a linear shift of the function up or down but leaves both the basic form and the slope $k$ unchanged.
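
As a minimal sketch of this log-log linearity, the following fits a straight line to $(\log x, \log f(x))$ for an exact power law, then repeats the fit after rescaling the argument. The values $a = 2$, $k = 1.5$, $c = 3$ and the helper names `power_law` and `loglog_fit` are illustrative assumptions, not part of the text above.

```python
import math

def power_law(x, a=2.0, k=1.5):
    """Exact power law f(x) = a * x**k (a and k are illustrative values)."""
    return a * x ** k

def loglog_fit(xs, ys):
    """Least-squares line through (log x, log y); returns (slope, intercept)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [1.0, 2.0, 4.0, 8.0, 16.0]

# Original function: slope is k, intercept is log(a).
slope, intercept = loglog_fit(xs, [power_law(x) for x in xs])

# Argument rescaled by c: same slope, intercept shifted by k * log(c).
c = 3.0
slope_c, intercept_c = loglog_fit(xs, [power_law(c * x) for x in xs])
```

Under these assumptions both fits should recover a slope of 1.5 up to rounding, and the two intercepts should differ by $k \log c$.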

Power-law relations characterize a staggering number of naturally occurring phenomena, and this is one of the principal reasons why they have attracted interest. For instance, inverse-square laws, such as gravitation and the Coulomb force, are power laws, as are many common mathematical formulae such as the quadratic law of area of the circle. However, it is mainly in the study of probability distributions that power laws have attracted recent interest. A wide variety of observed probability distributions appear, at least approximately, to have tails asymptotically following power-law forms, an observation closely connected with the theory of large deviations (also called extreme value theory), which considers the frequency of extremely rare events like stock market crashes and large natural disasters. It is primarily in the study of statistical distributions that the name "power law" is used; in other areas the power-law functional form is more often referred to simply as a polynomial form or polynomial function.

Scientific interest in power-law relations also derives from the ease with which certain general classes of mechanisms can generate them, so that the observation of a power-law relation in data often points to specific kinds of mechanisms that might underlie the natural phenomenon in question, and can indicate a deep connection with other, seemingly unrelated systems (see the reference by Simon and the subsection on universality below). The ubiquity of power-law relations in physics is partly due to dimensional constraints, while in complex systems, power laws are often thought to be signatures of hierarchy or of specific stochastic processes. A few notable examples of power laws are the Gutenberg-Richter law for earthquake sizes, Pareto's law of income distribution, structural self-similarity of fractals, and scaling laws in biological systems. Research on the origins of power-law relations, and efforts to observe and validate them in the real world, are active in many fields of science, including physics, computer science, linguistics, geophysics, sociology, economics and more.

Properties of power laws

Scale invariance

The main property of power laws that makes them interesting is their scale invariance. Given a relation $f\left(x\right) = ax^k$, or, indeed any homogeneous polynomial, scaling the argument $x$ by a constant factor causes only a proportionate scaling of the function itself. That is,

:$f\left(c x\right) = a\left(c x\right)^k = c^{k}f\left(x\right) \propto f\left(x\right).$

That is, scaling by a constant simply multiplies the original power-law relation by the constant $c^k$. Thus, it follows that all power laws with a particular scaling exponent are equivalent up to constant factors, since each is simply a scaled version of the others. This behavior is what produces the linear relationship when logarithms are taken of both $f\left(x\right)$ and $x$, and the straight line on the log-log plot is often called the "signature" of a power law. Notably, however, with real data, such straightness is a necessary but not a sufficient condition for the data following a power-law relation. In fact, there are many ways to generate finite amounts of data that mimic this signature behavior but that, in their asymptotic limit, are not true power laws. Thus, accurately fitting and validating power-law models is an active area of research in statistics.
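
The scaling property can be checked numerically. The sketch below compares the ratio $f(cx)/f(x)$ for a pure power law against the same ratio for a polynomial that is not homogeneous; both functions and the constant $c = 2$ are illustrative choices rather than examples taken from the text.

```python
def ratio(f, c, x):
    """Factor by which f changes when its argument is rescaled by c."""
    return f(c * x) / f(x)

def power(x):
    return 5.0 * x ** 3   # pure power law: a = 5, k = 3

def poly(x):
    return x ** 3 + x     # polynomial, but not scale invariant

c = 2.0
points = [1.0, 10.0, 100.0]

# For the power law every ratio equals c**k = 8, whatever x is.
power_ratios = [ratio(power, c, x) for x in points]

# For the other polynomial the ratio depends on x.
poly_ratios = [ratio(poly, c, x) for x in points]
```

The constant ratio is exactly the factor $c^k$ from the equation above, while the varying ratio shows why a general polynomial does not count as a power law.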

Universality

The equivalence of power laws with a particular scaling exponent can have a deeper origin in the dynamical processes that generate the power-law relation. In physics, for example, phase transitions in thermodynamic systems are associated with the emergence of power-law distributions of certain quantities, whose exponents are referred to as the critical exponents of the system. Diverse systems with the same critical exponents (that is, systems which display identical scaling behaviour as they approach criticality) can be shown, via renormalization group theory, to share the same fundamental dynamics. For instance, the behavior of water and CO2 at their boiling points falls in the same universality class because they have identical critical exponents. In fact, almost all material phase transitions are described by a small set of universality classes. Similar observations have been made, though not as comprehensively, for various self-organized critical systems, where the critical point of the system is an attractor. Formally, this sharing of dynamics is referred to as universality, and systems with precisely the same critical exponents are said to belong to the same universality class.

Power-law functions

The general power-law function follows the polynomial form given above, and is a ubiquitous form throughout mathematics and science. Notably, however, not all polynomial functions are power laws because not all polynomials exhibit the property of scale invariance. Typically, power-law functions are polynomials in a single variable, and are explicitly used to model the scaling behavior of natural processes. For instance, allometric scaling laws for the relation of biological variables are some of the best known power-law functions in nature. In this context, the $o\left(x^k\right)$ term is most typically replaced by a deviation term $\varepsilon$, which can represent uncertainty in the observed values (perhaps measurement or sampling errors) or provide a simple way for observations to deviate from the power-law function (perhaps for stochastic reasons):

:$y = ax^k + \varepsilon.$

Estimating the exponent from empirical data

There are many methods for fitting power-law functions to data, and the best option typically depends strongly on the kind of question being asked. For instance, prediction-type questions should rely on nonlinear regression, while descriptive-type summary questions, such as those found in allometry, should use a method that allows for uncertainty in both the $x$ and $y$ measurements. If the residuals are log-normally distributed, e.g. if the spread in $y$ is multiplicative (increasing proportionally with $x$), a simple least-squares linear regression on log-transformed data can be performed, since the residuals are normally distributed after the transformation. Otherwise, the log-transformed residuals are not normally distributed, whereas the least-squares method requires normally distributed errors. In this latter context, the method of standardized major axis (SMA) regression (sometimes called "reduced major axis", but this term should be avoided) is preferred.

The major axis is the line that minimizes the sum of squares of the shortest (perpendicular) distances between the data points and the line. This axis is equivalent to the first principal component axis of the covariance matrix. The standardized major axis is the major axis of the standardized data, transformed back to the original scale, which gives the slope estimator

:$\hat{k} = \frac{\sigma_y}{\sigma_x} = \sqrt{\frac{\sum_{i=1}^{N} \left(y_i - \mu_y\right)^2}{\sum_{i=1}^{N} \left(x_i - \mu_x\right)^2}}$

where $\mu_x$ and $\mu_y$ are the sample means of the $x$ and $y$ data, respectively.
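
The slope estimator above can be sketched in a few lines. Two things here are assumptions rather than statements from the text: the estimator is applied to log-transformed data (so that the slope is the exponent $k$ of $y = ax^k$), and the sign of the slope is taken from the sample covariance so that decreasing relations get a negative exponent. The function name `sma_fit` is hypothetical.

```python
import math

def sma_fit(xs, ys):
    """Standardized major axis fit of y = a * x**k on log-transformed data.

    Returns (k_hat, a_hat). Assumes all xs and ys are positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    syy = sum((v - my) ** 2 for v in ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    # Magnitude: ratio of standard deviations, as in the estimator above.
    # Sign: taken from the covariance (an assumption added for this sketch).
    k_hat = math.copysign(math.sqrt(syy / sxx), sxy)
    # The SMA line passes through the point of means.
    a_hat = math.exp(my - k_hat * mx)
    return k_hat, a_hat

xs = [1.0, 2.0, 5.0, 10.0, 50.0]
k_up, a_up = sma_fit(xs, [3.0 * x ** 0.75 for x in xs])
k_down, a_down = sma_fit(xs, [x ** -2.0 for x in xs])
```

On noise-free data the fit recovers the generating exponent. On noisy data the SMA slope is never smaller in magnitude than the ordinary least-squares slope, because the latter equals the correlation coefficient times $\sigma_y / \sigma_x$.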

More about this method, and the conditions under which it can be used, can be found in the Warton reference below. Further, Warton's comprehensive review article also provides [http://web.maths.unsw.edu.au/~dwarton/programs.html usable code] (C++, R, and Matlab) for estimation and testing routines for power-law functions.

Examples of power law functions

*The Stefan-Boltzmann law
*The Gompertz Law of Mortality
*The Ramberg-Osgood stress-strain relationship
*The Inverse-square law of Newtonian gravity
*The Initial mass function
*Gamma correction relating light intensity with voltage
*Kleiber's law relating animal metabolism to size, and allometric laws in general
*Behaviour near second-order phase transitions involving critical exponents
*Proposed form of experience curve effects
*The differential energy spectrum of cosmic-ray nuclei
*Inverse-square law
*Square-cube law
*Constructal law
*Fractals

Power-law distributions

A power-law distribution is any distribution that, in the most general sense, has the form

:$p\left(x\right) \propto L\left(x\right) x^{-\alpha}$

where $\alpha > 1$, and $L\left(x\right)$ is a slowly varying function, which is any function that satisfies $\lim_{x \rightarrow \infty} L\left(t\,x\right) / L\left(x\right) = 1$ with $t$ constant. This property of $L\left(x\right)$ follows directly from the requirement that $p\left(x\right)$ be asymptotically scale invariant; thus, the form of $L\left(x\right)$ only controls the shape and finite extent of the lower tail. For instance, if $L\left(x\right)$ is the constant function, then we have a power law that holds for all values of $x$. In many cases, it is convenient to assume a lower bound $x_{\min}$ from which the law holds. Combining these two cases, and where $x$ is a continuous variable, the power law has the form

:$p\left(x\right) = \frac{\alpha-1}{x_{\min}} \left(\frac{x}{x_{\min}}\right)^{-\alpha},$

where the constant is necessary to guarantee that the distribution is properly normalized. Briefly, we can consider several properties of this distribution.

In general, the moments of this distribution are given by

:$\langle x^{m} \rangle = \int_{x_{\min}}^{\infty} x^{m} p\left(x\right) \,\mathrm{d}x = \frac{\alpha-1}{\alpha-1-m}x_{\min}^m$

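which is only well defined for $m < \alpha - 1$. That is, all moments $m \geq \alpha - 1$ diverge: when $\alpha < 2$, the average and all higher-order moments are infinite; when $2 < \alpha < 3$, the mean exists, but the variance and higher-order moments are infinite, and so on.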
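
A short sketch of the moment formula follows. The closed form returns infinity for divergent moments, and a simple midpoint-rule integration is included as a sanity check. The check relies on the substitution $u = (x/x_{\min})^{-(\alpha-1)}$, which is an assumption of this sketch: it maps $[x_{\min}, \infty)$ onto $(0, 1]$ and turns $p(x)\,\mathrm{d}x$ into $\mathrm{d}u$. The function names are hypothetical.

```python
import math

def moment(m, alpha, x_min=1.0):
    """Closed-form m-th moment of the continuous power law (inf if divergent)."""
    if m >= alpha - 1:
        return math.inf
    return (alpha - 1) / (alpha - 1 - m) * x_min ** m

def moment_numeric(m, alpha, x_min=1.0, steps=200_000):
    """Midpoint-rule estimate of the same moment, for convergent cases only.

    With u = (x / x_min) ** -(alpha - 1), x**m becomes
    x_min**m * u ** (-m / (alpha - 1)) and the integral runs over (0, 1].
    """
    exponent = -m / (alpha - 1)
    h = 1.0 / steps
    total = sum(((i + 0.5) * h) ** exponent for i in range(steps))
    return x_min ** m * h * total

mean_exact = moment(1, alpha=3.5, x_min=2.0)        # (2.5 / 1.5) * 2
mean_check = moment_numeric(1, alpha=3.5, x_min=2.0)
variance_case = moment(2, alpha=2.5)                # diverges, since 2 >= 1.5
```

The numerical check converges slowly when $m$ is close to $\alpha - 1$, because the integrand is then nearly non-integrable at $u = 0$; it is meant only as a rough confirmation of the closed form.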

Another kind of power-law distribution, which does not satisfy the general form above, is the power law with an exponential cutoff

:$p\left(x\right) \propto L\left(x\right) x^{-\alpha} \mathrm{e}^{-\lambda x}$

where we introduce an exponential decay term $\mathrm{e}^{-\lambda x}$ that overwhelms the power-law behavior at large values of $x$. This distribution does not scale and is thus not asymptotically a power law; however, it does approximately scale over a finite region before the cutoff. (Note that the pure form above is a subset of this family, with $\lambda=0$.) This distribution is a common alternative to the asymptotic power-law distribution because it naturally captures finite-size effects. For instance, although the Gutenberg-Richter Law is commonly cited as an example of a power-law distribution, the distribution of earthquake magnitudes cannot scale as a power law in the limit $x \rightarrow \infty$ because there is a finite amount of energy in the Earth's crust. Thus, there must be some maximum size earthquake, and the scaling behavior must taper off as it approaches this size.
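
One way to see the loss of scaling is to look at the local slope on doubly logarithmic axes. Taking $L(x) = 1$ (an assumption of this sketch), $\log p = -\alpha \log x - \lambda x + \text{const}$, so the slope $\mathrm{d}\log p / \mathrm{d}\log x$ equals $-\alpha - \lambda x$: close to $-\alpha$ while $\lambda x \ll 1$, and increasingly steep beyond the cutoff. The parameter values and function names below are illustrative.

```python
import math

def log_density(x, alpha, lam):
    """Unnormalized log p(x) for p(x) ~ x**-alpha * exp(-lam * x), with L(x) = 1."""
    return -alpha * math.log(x) - lam * x

def local_slope(x, alpha, lam, eps=1e-4):
    """Numerical d(log p)/d(log x), using a central difference in log x."""
    up = x * math.exp(eps)
    down = x * math.exp(-eps)
    return (log_density(up, alpha, lam) - log_density(down, alpha, lam)) / (2 * eps)

alpha, lam = 2.5, 0.01
xs = (1.0, 10.0, 100.0, 1000.0)

# With a cutoff the slope drifts from about -alpha towards -alpha - lam * x.
cutoff_slopes = {x: local_slope(x, alpha, lam) for x in xs}

# With lam = 0 (the pure power law) the slope is -alpha everywhere.
pure_slopes = {x: local_slope(x, alpha, 0.0) for x in xs}
```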

Plotting power-law distributions

In general, power-law distributions are plotted on doubly logarithmic axes, which emphasizes the upper tail region. The most convenient way to do this is via the (complementary) cumulative distribution (cdf), $P\left(x\right) = \mathrm{Pr}\left(X > x\right)$,

:$P\left(x\right) = \Pr\left(X > x\right) = \int_x^\infty p\left(X\right)\,\mathrm{d}X = \frac{\alpha-1}{x_{\min}^{-\alpha+1}} \int_x^\infty X^{-\alpha}\,\mathrm{d}X = \left(\frac{x}{x_{\min}}\right)^{-\alpha+1}.$

Note that the cdf is also a power-law function, but with a smaller scaling exponent. For data, an equivalent form of the cdf is the rank-frequency approach, in which we first sort the $n$ observed values in ascending order, and plot them against the vector $\left[1,\frac{n-1}{n},\frac{n-2}{n},\dots,\frac{1}{n}\right]$.
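
The rank-frequency recipe is short enough to sketch directly. The function name `empirical_ccdf` is hypothetical, and the pairs it returns would normally be passed to a plotting routine with both axes logarithmic.

```python
def empirical_ccdf(values):
    """Pair each sorted observation with its rank-based tail fraction.

    The i-th smallest of n values (counting from 0) is paired with (n - i) / n,
    giving the vector [1, (n-1)/n, ..., 1/n] described above.
    """
    xs = sorted(values)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]

points = empirical_ccdf([5.0, 1.0, 3.0, 2.0])
# [(1.0, 1.0), (2.0, 0.75), (3.0, 0.5), (5.0, 0.25)]
```

No binning is involved, so every observation appears in the plot exactly once.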

Although it can be convenient to log-bin the data, or otherwise smooth the probability density (mass) function directly, these methods introduce an implicit bias in the representation of the data, and thus should be avoided. The cdf, on the other hand, introduces no bias in the data and preserves the linear signature on doubly logarithmic axes.

Estimating the exponent from empirical data

There are many ways of estimating the value of the scaling exponent for a power-law tail; however, not all of them yield unbiased and consistent answers. The most reliable techniques are often based on the method of maximum likelihood. Alternative methods are often based on making a linear regression on either the log-log probability, the log-log cumulative distribution function, or on log-binned data, but these approaches should be avoided as they can all lead to highly biased estimates of the scaling exponent (see the Clauset et al. reference below).

For real-valued data, we fit a power-law distribution of the form

:$p\left(x\right) = \frac{\alpha-1}{x_{\min}} \left(\frac{x}{x_{\min}}\right)^{-\alpha}$

to the data $x \geq x_{\min}$. Given a choice for $x_{\min}$, a simple derivation by this method yields the estimator equation

:$\hat{\alpha} = 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right]^{-1}$

where $\{x_i\}$ are the $n$ data points $x_i \geq x_{\min}$. (For a more detailed derivation, see Hall or Newman below.) This estimator exhibits a small finite sample-size bias of order $O\left(n^{-1}\right)$, which is small when $n > 100$. Further, the uncertainty in the estimation can be derived from the maximum likelihood argument, and has the form $\sigma = \frac{\alpha-1}{\sqrt{n}}$. This estimator is equivalent to the popular Hill estimator from quantitative finance and extreme value theory.
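
The continuous estimator and its uncertainty can be sketched as follows. The sampler is an assumption added for the demonstration: it inverts the complementary cdf $P(x) = (x/x_{\min})^{-\alpha+1}$ to draw synthetic data. The standard error is evaluated at $\hat{\alpha}$, which is also an assumption since the true $\alpha$ is unknown in practice. The function names, seed and sample size are illustrative.

```python
import math
import random

def sample_power_law(n, alpha, x_min, rng):
    """Draw n values by inverse-transform sampling (helper for the demo only)."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def fit_alpha(data, x_min):
    """Maximum-likelihood exponent and standard error for continuous data.

    Only observations with x >= x_min are used; at least one must exceed x_min.
    """
    tail = [x for x in data if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha_hat = 1.0 + n / log_sum
    sigma = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, sigma

rng = random.Random(42)
data = sample_power_law(10_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat, sigma = fit_alpha(data, x_min=1.0)
```

With ten thousand samples the standard error is about 0.015, so the estimate should land close to the generating value of 2.5.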

For a set of $n$ integer-valued data points $\{x_i\}$, again where each $x_i \geq x_{\min}$, the maximum likelihood exponent is the solution to the transcendental equation

:$\frac{\zeta'\left(\hat{\alpha},x_{\min}\right)}{\zeta\left(\hat{\alpha},x_{\min}\right)} = -\frac{1}{n} \sum_{i=1}^n \ln x_i$

where $\zeta\left(\alpha,x_{\min}\right)$ is the incomplete zeta function (also known as the generalized, or Hurwitz, zeta function). The uncertainty in this estimate follows the same formula as for the continuous equation. However, the two equations for $\hat{\alpha}$ are not equivalent, and the continuous version should not be applied to discrete data, nor vice versa.

Further, both of these estimators require the choice of $x_{\min}$. For functions with a non-trivial $L\left(x\right)$ function, choosing $x_{\min}$ too small produces a significant bias in $\hat{\alpha}$, while choosing it too large increases the uncertainty in $\hat{\alpha}$, and reduces the statistical power of our model. In general, the optimum choice of $x_{\min}$ depends strongly on the particular form of the lower tail, represented by $L\left(x\right)$ above.

More about these methods, and the conditions under which they can be used, can be found in the Clauset et al. reference below. Further, this comprehensive review article provides [http://www.santafe.edu/~aaronc/powerlaws/ usable code] (Matlab and R) for estimation and testing routines for power-law distributions.

Examples of power-law distributions

*Pareto distribution (continuous)
*Zeta distribution (discrete)
*Yule–Simon distribution (discrete)
*Student's t-distribution (continuous), of which the Cauchy distribution is a special case
*Zipf's law and its generalization, the Zipf-Mandelbrot law (discrete)
**Lotka's law
*The scale-free network model
*Bibliograms
*Gutenberg-Richter law of earthquake magnitudes
*Horton's laws describing river systems
*Richardson's Law for the severity of violent conflicts (wars and terrorism)
*Population of cities
*Net worth of individuals
*Frequency of words in a text

A great many power-law distributions have been conjectured in recent years. For instance, power laws are thought to characterize the behavior of the upper tails for the popularity of websites, number of species per genus, the popularity of given names, the size of financial returns, and many others. However, much debate remains as to which of these tails are actually power-law distributed and which are not. For instance, it is commonly accepted now that the famous Gutenberg-Richter Law decays more rapidly than a pure power-law tail because of a finite exponential cutoff in the upper tail.

Validating power laws

Although power-law relations are attractive for many theoretical reasons, demonstrating that data do indeed follow a power-law relation requires more than simply fitting such a model to the data. In general, many alternative functional forms can appear to follow a power-law form for some extent. Thus, the preferred method for validation of power-law relations is by testing many orthogonal predictions of a particular generative mechanism against data, and not simply fitting a power-law relation to a particular kind of data. As such, the validation of power-law claims remains a very active field of research in many areas of modern science.

See also

*Allometric law
*Extreme value theory
*Lognormal distribution
*Fat tail
*Heavy-tailed distributions
*80-20 rule
*The Long Tail
*Wealth condensation
*Kesten process
*Lévy skew alpha-stable distribution
*Lévy flight
*Kleiber's law
*Power law fluid
*Simon Model
*Stevens' power law
*Zipf's law

Bibliography

* Simon, H. A. (1955). "On a Class of Skew Distribution Functions". Biometrika 42: 425–440. doi:10.2307/2333389
* Hall, P. (1982). "On Some Simple Estimates of an Exponent of Regular Variation". Journal of the Royal Statistical Society, Series B (Methodological) 44 (1): 37–42.
* Mitzenmacher, M. (2003). "A brief history of generative models for power law and lognormal distributions". Internet Mathematics 1: 226–251. http://www.internetmathematics.org/volumes/1/2/pp226_251.pdf
* Newman, M. E. J. (2005). "Power laws, Pareto distributions and Zipf's law". Contemporary Physics 46: 323–351. doi:10.1080/00107510500052444. http://www.journalsonline.tandf.co.uk/openurl.asp?genre=article&doi=10.1080/00107510500052444
* Warton, D. I., Wright, I. J., Falster, D. S., and Westoby, M. (2006). "Bivariate line-fitting methods for allometry". Biological Reviews 81: 259–291. doi:10.1017/S1464793106007007. http://www.maths.unsw.edu.au/statistics/files/preprint-2005-02.pdf
* Clauset, A., Shalizi, C. R. and Newman, M. E. J. (2007). "Power-law distributions in empirical data". http://arxiv.org/abs/0706.1062

* [http://www.nslij-genetics.org/wli/zipf/ Zipf's law]
* [http://aps.arxiv.org/abs/cond-mat/0412004/ Power laws, Pareto distributions and Zipf's law]
* [http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html Zipf, Power-laws, and Pareto - a ranking tutorial]
* [http://www.physicalgeography.net/fundamentals/10ab.html Stream Morphometry and Horton's Laws]
* [http://www.fooledbyrandomness.com/fortune.pdf "How the Finance Gurus Get Risk All Wrong"] by Benoit Mandelbrot & Nassim Nicholas Taleb. "Fortune", July 11, 2005.
* [http://www.newyorker.com/fact/content/articles/060213fa_fact "Million-dollar Murray":] power-law distributions in homelessness and other social problems; by Malcolm Gladwell. "The New Yorker", February 13, 2006.
*Benoit Mandelbrot & Richard Hudson: The Misbehaviour of Markets (2004)
*Philip Ball: [http://www.agrfoto.com/philipball/criticalmass.php Critical Mass: How one thing leads to another] (2005)
* [http://econophysics.blogspot.com/2006/07/tyranny-of-power-law-and-why-we-should.html "Tyranny of the Power Law"] from [http://econophysics.blogspot.com The Econophysics Blog]

Wikimedia Foundation. 2010.
