# Independence (probability theory)

In probability theory, to say that two events are independent intuitively means that the occurrence of one event makes it neither more nor less probable that the other occurs. For example:

• The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent.
• By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trials is 8 are not independent.
• If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent.
• By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are again not independent.
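
The dice examples above can be checked by exact enumeration. The following is a minimal sketch, assuming two fair six-sided dice and using the product rule Pr(A and B) = Pr(A)Pr(B) that is stated formally in the next section; the helper names `prob` and `independent` are illustrative, not part of any library.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair die rolls.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (first, second)."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

def independent(event_a, event_b):
    """Check the product rule Pr(A and B) == Pr(A) * Pr(B)."""
    both = prob(lambda o: event_a(o) and event_b(o))
    return both == prob(event_a) * prob(event_b)

first_is_six = lambda o: o[0] == 6
second_is_six = lambda o: o[1] == 6
sum_is_eight = lambda o: o[0] + o[1] == 8

print(independent(first_is_six, second_is_six))  # True
print(independent(first_is_six, sum_is_eight))   # False
```

Exact fractions are used instead of floating point so that the equality test is not affected by rounding.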

Similarly, two random variables are independent if the conditional probability distribution of either given the observed value of the other is the same as if the other's value had not been observed. The concept of independence extends to dealing with collections of more than two events or random variables.

In some instances, the term "independent" is replaced by "statistically independent", "marginally independent", or "absolutely independent".[1]

## Independent events

The standard definition says:

Two events A and B are independent if and only if Pr(AB) = Pr(A)Pr(B).

Here AB is the intersection of A and B, that is, it is the event that both events A and B occur.

More generally, any collection of events, possibly more than just two of them, is mutually independent if and only if for every finite subset $A_1, \ldots, A_n$ of the collection we have

$\Pr\left(\bigcap_{i=1}^n A_i\right)=\prod_{i=1}^n \Pr(A_i). \!\,$

This is called the multiplication rule for independent events. Notice that independence requires this rule to hold for every finite subset of the collection, not only for the collection as a whole; see [2] for a three-event example in which $\Pr\left(\bigcap_{i=1}^3 A_i\right)=\prod_{i=1}^3 \Pr(A_i)\!$ and yet no two of the three events are independent.
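
The need to check every subset can be illustrated on a small finite sample space. The sketch below assumes eight equally likely outcomes and three hand-picked events `A`, `B`, `C`; these are an illustrative construction, not the example from the cited reference. The triple product rule holds, yet every pair fails it.

```python
from fractions import Fraction
from itertools import combinations

# Assumed sample space: eight equally likely outcomes.
OMEGA = frozenset(range(1, 9))

def prob(event):
    """Exact probability of an event given as a subset of OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def product_rule_holds(events):
    """True if Pr(intersection of events) equals the product of Pr(event)."""
    intersection = OMEGA
    expected = Fraction(1)
    for event in events:
        intersection = intersection & event
        expected *= prob(event)
    return prob(intersection) == expected

def mutually_independent(events):
    """Check the product rule on every subset with at least two events."""
    return all(
        product_rule_holds(subset)
        for size in range(2, len(events) + 1)
        for subset in combinations(events, size)
    )

# Illustrative events, each of probability 1/2.
A = frozenset({1, 2, 3, 4})
B = frozenset({1, 2, 3, 5})
C = frozenset({1, 6, 7, 8})

print(product_rule_holds([A, B, C]))    # True
print([product_rule_holds(pair) for pair in combinations([A, B, C], 2)])
# [False, False, False]
print(mutually_independent([A, B, C]))  # False
```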

If two events A and B are independent, then the conditional probability of A given B is the same as the unconditional (or marginal) probability of A, that is,

$\Pr(A\mid B)=\Pr(A). \!\,$

There are at least two reasons why this statement is not taken to be the definition of independence: (1) the two events A and B do not play symmetrical roles in this statement, and (2) problems arise with this statement when events of probability 0 are involved.

The conditional probability of event A given B is given by

$\Pr(A\mid B)={\Pr(A \cap B) \over \Pr(B)} \!\,$ (so long as Pr(B) ≠ 0).

Substituting this into the statement above shows that, when $\Pr(B)\neq 0$, it is equivalent to

$\Pr(A \cap B)=\Pr(A)\Pr(B), \!\,$

which is the standard definition given above.
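
The link between conditional probability and the product rule can be checked on the card example from the introduction. This is a rough sketch, assuming a 52-card deck modelled as 26 red and 26 black cards and two draws without replacement; `prob` and `cond_prob` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations

# Assumed deck model: card indices 0..25 are red, 26..51 are black.
DECK = ["R"] * 26 + ["B"] * 26

# All equally likely ordered pairs of distinct cards (no replacement).
DRAWS = list(permutations(range(52), 2))

def prob(event):
    """Exact probability of an event given as a predicate on a draw pair."""
    return Fraction(sum(1 for d in DRAWS if event(d)), len(DRAWS))

def cond_prob(event_a, event_b):
    """Pr(A | B) = Pr(A and B) / Pr(B), defined only when Pr(B) != 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda d: event_a(d) and event_b(d)) / p_b

first_red = lambda d: DECK[d[0]] == "R"
second_red = lambda d: DECK[d[1]] == "R"
both_red = lambda d: first_red(d) and second_red(d)

p_second = prob(second_red)
p_second_given_first = cond_prob(second_red, first_red)

print(p_second)              # 1/2
print(p_second_given_first)  # 25/51
# The product rule fails as well, so the two events are not independent.
print(prob(both_red) == prob(first_red) * prob(second_red))  # False
```

The explicit check for a zero-probability conditioning event mirrors the restriction Pr(B) ≠ 0 in the formula above.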

Note that an event is independent of itself if and only if

$\Pr(A) = \Pr(A \cap A) = \Pr(A)\Pr(A).$

That is, $\Pr(A) = \Pr(A)^2$, which holds exactly when the probability of A is one or zero. Thus if an event or its complement almost surely occurs, it is independent of itself. For example, if event A is choosing any number but 0.5 from a uniform distribution on the unit interval, A is independent of itself, even though, tautologically, A fully determines A.

## Independent random variables

What is defined above is independence of events. In this section we treat independence of random variables. If X is a real-valued random variable and a is a number then the event X ≤ a is the set of outcomes whose corresponding value of X is less than or equal to a. Since these are sets of outcomes that have probabilities, it makes sense to refer to events of this sort being independent of other events of this sort.

Two random variables X and Y are independent if and only if for every a and b, the events {X ≤ a} and {Y ≤ b} are independent events as defined above. Mathematically, this can be described as follows:

The random variables X and Y with cumulative distribution functions $F_X(x)$ and $F_Y(y)$ are independent if and only if the combined random variable (X, Y) has a joint cumulative distribution function

$F_{X,Y}(x,y) = F_X(x) F_Y(y), \,$

or equivalently, when the probability densities $f_X(x)$ and $f_Y(y)$ exist, a joint density

$f_{X,Y}(x,y) = f_X(x) f_Y(y). \,$

Similar expressions characterise independence more generally for more than two random variables.
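
For discrete random variables with finitely many values, the factorisation of the joint cumulative distribution function can be tested directly. The sketch below assumes the joint distribution is given as a dictionary mapping `(x, y)` pairs to exact probabilities. Because the distribution functions are then step functions that only change at support points, checking the grid of support points is enough in this finite setting. The function names are illustrative.

```python
from fractions import Fraction

def marginals(joint):
    """Marginal probability mass functions of X and Y from a joint pmf."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def cdf_factorises(joint):
    """Check F_{X,Y}(a, b) == F_X(a) * F_Y(b) at every pair of support points."""
    px, py = marginals(joint)
    for a in px:
        for b in py:
            f_joint = sum(p for (x, y), p in joint.items() if x <= a and y <= b)
            f_x = sum(p for x, p in px.items() if x <= a)
            f_y = sum(p for y, p in py.items() if y <= b)
            if f_joint != f_x * f_y:
                return False
    return True

# Two independent fair coins (values 0 and 1).
independent_coins = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}

# A fair coin X and an exact copy Y = X.
copied_coin = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(cdf_factorises(independent_coins))  # True
print(cdf_factorises(copied_coin))        # False
```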

An arbitrary collection of random variables, possibly more than just two of them, is independent precisely if for any finite collection $X_1, \ldots, X_n$ and any finite set of numbers $a_1, \ldots, a_n$, the events $\{X_1 \le a_1\}, \ldots, \{X_n \le a_n\}$ are independent events as defined above.

The measure-theoretically inclined may prefer to substitute events {X ∈ A} for events {X ≤ a} in the above definition, where A is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed with appropriate σ-algebras).

If any two of a collection of random variables are independent, they may nonetheless fail to be mutually independent; this is called pairwise independence.

If X and Y are independent, then the expectation operator E has the property

$E[X Y] = E[X] E[Y], \,$

and, since the covariance satisfies

$\text{cov}[X, Y] = E[X Y] - E[X] E[Y], \,$

the covariance cov[X, Y] is zero. (The converse, i.e. the proposition that two random variables with a covariance of 0 must be independent, is not true. See uncorrelated.)
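
A standard counterexample to the converse takes X uniform on {−1, 0, 1} and Y = X². The following sketch, with illustrative variable names, computes the covariance exactly and then shows that the product rule fails for the events {X = 0} and {Y = 0}.

```python
from fractions import Fraction

# Joint pmf of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expectation(g):
    """Exact expectation of g(X, Y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)
e_xy = expectation(lambda x, y: x * y)
covariance = e_xy - e_x * e_y

p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_both = joint.get((0, 0), Fraction(0))

print(covariance)             # 0
print(p_both, p_x0 * p_y0)    # 1/3 1/9
print(p_both == p_x0 * p_y0)  # False, so X and Y are not independent
```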

Two independent random variables X and Y have the property that the characteristic function of their sum is the product of their marginal characteristic functions:

$\varphi_{X+Y}(t) = \varphi_X(t)\cdot\varphi_Y(t), \,$

but the reverse implication is not true (see subindependence).

## Independent σ-algebras

The definitions above are both generalized by the following definition of independence for σ-algebras. Let (Ω, Σ, Pr) be a probability space and let $\mathcal{A}$ and $\mathcal{B}$ be two sub-σ-algebras of Σ. $\mathcal{A}$ and $\mathcal{B}$ are said to be independent if, whenever $A \in \mathcal{A}$ and $B \in \mathcal{B}$,

$\Pr(A \cap B) = \Pr(A) \Pr(B).$

The new definition relates to the previous ones very directly:

• Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event E ∈ Σ is, by definition,
$\sigma(E) = \{ \emptyset, E, \Omega \setminus E, \Omega \}.$
• Two random variables X and Y defined over Ω are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable X taking values in some measurable space S consists, by definition, of all subsets of Ω of the form $X^{-1}(U)$, where U is any measurable subset of S.
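
The first correspondence can be checked on a finite example. The sketch below assumes a single roll of a fair die as the probability space, builds the four-element σ-algebra generated by an event, and tests the product rule on every pair of sets. The function names are hypothetical.

```python
from fractions import Fraction

# Assumed probability space: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Exact probability of a subset of OMEGA under the uniform measure."""
    return Fraction(len(event), len(OMEGA))

def generated_sigma_algebra(event):
    """sigma(E) = {empty set, E, complement of E, Omega}."""
    return [frozenset(), event, OMEGA - event, OMEGA]

def sigma_algebras_independent(family_a, family_b):
    """Product rule for every A in family_a and every B in family_b."""
    return all(
        prob(a & b) == prob(a) * prob(b)
        for a in family_a
        for b in family_b
    )

even = frozenset({2, 4, 6})
at_most_two = frozenset({1, 2})
at_most_three = frozenset({1, 2, 3})

print(sigma_algebras_independent(
    generated_sigma_algebra(even), generated_sigma_algebra(at_most_two)))    # True
print(sigma_algebras_independent(
    generated_sigma_algebra(even), generated_sigma_algebra(at_most_three)))  # False
```

In the first case the events themselves satisfy the product rule (1/6 = 1/2 · 1/3), and the check confirms that their complements do too.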

Using this definition, it is easy to show that if X and Y are random variables and Y is constant, then X and Y are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra {∅, Ω}. Events of probability zero cannot affect independence, so independence also holds if Y is only Pr-almost surely constant.

## Conditionally independent random variables

Intuitively, two random variables X and Y are conditionally independent given Z if, once Z is known, the value of Y does not add any additional information about X. For instance, two measurements X and Y of the same underlying quantity Z are not independent, but they are conditionally independent given Z (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If X, Y, and Z are discrete random variables, then we define X and Y to be conditionally independent given Z if

$\mathrm{P}(X \le x, Y \le y\;|\;Z = z) = \mathrm{P}(X \le x\;|\;Z = z) \cdot \mathrm{P}(Y \le y\;|\;Z = z)\,\!$

for all x, y and z such that P(Z = z) > 0. On the other hand, if the random variables are continuous and have a joint probability density function p, then X and Y are conditionally independent given Z if

$p_{XY|Z}(x, y | z) = p_{X|Z}(x | z) \cdot p_{Y|Z}(y | z)\,\!$

for all real numbers x, y and z such that $p_Z(z) > 0$.

If X and Y are conditionally independent given Z, then

$\mathrm{P}(X = x | Y = y, Z = z) = \mathrm{P}(X = x | Z = z)\,\!$

for any x, y and z with P(Z = z) > 0. That is, the conditional distribution for X given Y and Z is the same as that given Z alone. A similar equation holds for the conditional probability density functions in the continuous case.
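
The measurement example can be made concrete with a small discrete model. The sketch below assumes Z is a fair coin and that X and Y each report Z correctly with probability 3/4, with errors independent given Z; these numbers are purely illustrative. For discrete variables the factorisation is checked on point probabilities, which is equivalent to the form with ≤ used above.

```python
from fractions import Fraction
from itertools import product

def noise(value, z):
    """Assumed measurement model: correct with probability 3/4."""
    return Fraction(3, 4) if value == z else Fraction(1, 4)

# Joint pmf of (X, Y, Z): Z is a fair coin, X and Y are noisy readings of Z.
joint = {
    (x, y, z): Fraction(1, 2) * noise(x, z) * noise(y, z)
    for x, y, z in product((0, 1), repeat=3)
}

def prob(event):
    """Exact probability of an event given as a predicate on (x, y, z)."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def conditionally_independent_given_z():
    """Check P(X=x, Y=y | Z=z) == P(X=x | Z=z) * P(Y=y | Z=z) for all values."""
    for z0 in (0, 1):
        p_z = prob(lambda x, y, z: z == z0)
        for x0, y0 in product((0, 1), repeat=2):
            p_xy = prob(lambda x, y, z: x == x0 and y == y0 and z == z0) / p_z
            p_x = prob(lambda x, y, z: x == x0 and z == z0) / p_z
            p_y = prob(lambda x, y, z: y == y0 and z == z0) / p_z
            if p_xy != p_x * p_y:
                return False
    return True

def marginally_independent():
    """Check P(X=x, Y=y) == P(X=x) * P(Y=y) for all values."""
    return all(
        prob(lambda x, y, z: x == x0 and y == y0)
        == prob(lambda x, y, z: x == x0) * prob(lambda x, y, z: y == y0)
        for x0, y0 in product((0, 1), repeat=2)
    )

print(conditionally_independent_given_z())  # True
print(marginally_independent())             # False
```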

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.

## References

1. Russell, Stuart; Peter Norvig (2002). Artificial Intelligence: A Modern Approach. Prentice Hall. p. 478. ISBN 0137903952.
2. George, Glyn, "Testing for the independence of three events," Mathematical Gazette 88, November 2004, 568.

Wikimedia Foundation. 2010.
