Independence (probability theory)

In probability theory, two events are independent, statistically independent, or stochastically independent^[1] if the occurrence of one does not affect the probability of occurrence of the other. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.

The concept of independence extends to collections of more than two events or random variables. In that case, the events are pairwise independent if every pair of events is independent, and mutually independent if each event is independent of every intersection of the other events.

Definition

For events

Two events

Two events $A$ and $B$ are independent (often written as $A\perp B$ or $A\perp \!\!\!\perp B$ ) if and only if their joint probability equals the product of their probabilities:^[2]^{:p. 29}^[3]^{:p. 10}

$\mathrm {P} (A\cap B)=\mathrm {P} (A)\mathrm {P} (B)$

(Eq.1)

Why this defines independence is made clear by rewriting with conditional probabilities. When $\mathrm {P} (B)>0$ ,

$\mathrm {P} (A\cap B)=\mathrm {P} (A)\mathrm {P} (B)\iff \mathrm {P} (A)={\frac {\mathrm {P} (A\cap B)}{\mathrm {P} (B)}}=\mathrm {P} (A\mid B)$

and similarly, when $\mathrm {P} (A)>0$ ,

$\mathrm {P} (A\cap B)=\mathrm {P} (A)\mathrm {P} (B)\iff \mathrm {P} (B)=\mathrm {P} (B\mid A)$ .

Thus, the occurrence of $B$ does not affect the probability of $A$ , and vice versa. Although the derived expressions may seem more intuitive, they are not the preferred definition, because the conditional probabilities are undefined if $\mathrm {P} (A)$ or $\mathrm {P} (B)$ is 0. Furthermore, the preferred definition makes clear by symmetry that when $A$ is independent of $B$ , $B$ is also independent of $A$ .
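As a small illustration of Eq.1, the following sketch enumerates one roll of a fair six-sided die and checks the product rule exactly with rational arithmetic. The sample space, the events `A` (even outcome) and `B` (outcome at most 4), and the helper `prob` are assumptions chosen for this example only.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (assumed example).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}      # the roll is even
B = {1, 2, 3, 4}   # the roll is at most 4

p_joint = prob(A & B)                 # P(A ∩ B)
p_product = prob(A) * prob(B)         # P(A) P(B)
p_a_given_b = prob(A & B) / prob(B)   # P(A | B), defined because P(B) > 0

print(p_joint == p_product)      # product rule of Eq.1
print(p_a_given_b == prob(A))    # conditioning on B leaves P(A) unchanged
```

Both comparisons hold for this choice of events, so `A` and `B` are independent even though they are defined on the same roll.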

More than two events

A finite set of events $\{A_{i}\}_{i=1}^{n}$ is pairwise independent if every pair of events is independent^[4]—that is, if and only if for all distinct pairs of indices $m,k$ ,

$\mathrm {P} (A_{m}\cap A_{k})=\mathrm {P} (A_{m})\mathrm {P} (A_{k})$

(Eq.2)

A finite set of events is mutually independent if every event is independent of any intersection of the other events^[4]^[3]^{:p. 11}—that is, if and only if for every $k\leq n$ and for every $k$ -element subset of events $\{B_{i}\}_{i=1}^{k}$ of $\{A_{i}\}_{i=1}^{n}$ ,

$\mathrm {P} \left(\bigcap _{i=1}^{k}B_{i}\right)=\prod _{i=1}^{k}\mathrm {P} (B_{i})$

(Eq.3)

This is called the multiplication rule for independent events. Note that it is not a single condition involving only the product of all the probabilities of all single events (see below for a counterexample); it must hold true for all subsets of events.

For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true (see below for a counterexample).^[2]^{:p. 30}
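The gap between Eq.2 and Eq.3 can be checked by brute force on a small sample space. The sketch below uses a common textbook construction that is not taken from this article: two fair coin tosses, with `A` (first toss heads), `B` (second toss heads) and `C` (both tosses agree). All names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin tosses, all four outcomes equally likely.
omega = set(product("HT", repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}    # first toss is heads
B = {w for w in omega if w[1] == "H"}    # second toss is heads
C = {w for w in omega if w[0] == w[1]}   # both tosses agree

events = [A, B, C]

def product_rule_holds(subset):
    """Check P(intersection of subset) == product of the single probabilities."""
    inter = set(omega)
    prod = Fraction(1)
    for event in subset:
        inter &= event
        prod *= prob(event)
    return prob(inter) == prod

# Eq.2: the product rule for every pair.
pairwise = all(product_rule_holds(s) for s in combinations(events, 2))

# Eq.3: the product rule for every subset with at least two events.
mutual = all(product_rule_holds(s)
             for k in range(2, len(events) + 1)
             for s in combinations(events, k))

print(pairwise, mutual)
```

Here every pair passes the product rule, but the triple fails it, because the intersection of all three events is the single outcome with two heads (probability 1/4) while the product of the three probabilities is 1/8.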

For real valued random variables

Two random variables

Two random variables $X$ and $Y$ are independent if and only if (iff) the elements of the π-system generated by them are independent; that is to say, for every $x$ and $y$ , the events $\{X\leq x\}$ and $\{Y\leq y\}$ are independent events (as defined above in Eq.1). That is, $X$ and $Y$ with cumulative distribution functions $F_{X}(x)$ and $F_{Y}(y)$ , are independent iff the combined random variable $(X,Y)$ has a joint cumulative distribution function^[3]^{:p. 15}

$F_{X,Y}(x,y)=F_{X}(x)F_{Y}(y)\quad {\text{for all }}x,y$

(Eq.4)

or equivalently, if the probability densities $f_{X}(x)$ and $f_{Y}(y)$ and the joint probability density $f_{X,Y}(x,y)$ exist,

$f_{X,Y}(x,y)=f_{X}(x)f_{Y}(y)\quad {\text{for all }}x,y$ .
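For discrete random variables with finitely many values, the factorization of the joint cumulative distribution function in Eq.4 can be tested directly. The following is a minimal sketch under that assumption; the function name `cdf_factorizes` and the two example distributions are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def cdf_factorizes(joint):
    """Check F_{X,Y}(x, y) == F_X(x) * F_Y(y) at every pair of support points.

    `joint` maps (x, y) pairs to probabilities. A discrete CDF is a step
    function, so agreement on the support points gives agreement everywhere.
    """
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})

    def F(x, y):
        return sum(p for (a, b), p in joint.items() if a <= x and b <= y)

    x_max, y_max = xs[-1], ys[-1]
    # F(x, y_max) is the marginal CDF of X; F(x_max, y) is that of Y.
    return all(F(x, y) == F(x, y_max) * F(x_max, y) for x in xs for y in ys)

# Independent by construction: the joint pmf is a product of marginals.
p_x = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p_y = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
independent_joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

# Dependent: Y always equals X.
dependent_joint = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(cdf_factorizes(independent_joint), cdf_factorizes(dependent_joint))
```

The first distribution satisfies Eq.4 and the second does not, since for the second one $F_{X,Y}(0,0)=1/2$ while $F_{X}(0)F_{Y}(0)=1/4$.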

More than two random variables

A finite set of $n$ random variables $\{X_{1},\ldots ,X_{n}\}$ is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of $n$ random variables $\{X_{1},\ldots ,X_{n}\}$ is mutually independent if and only if for any sequence of numbers $\{x_{1},\ldots ,x_{n}\}$ , the events $\{X_{1}\leq x_{1}\},\ldots ,\{X_{n}\leq x_{n}\}$ are mutually independent events (as defined above in Eq.3). This is equivalent to the following condition on the joint cumulative distribution function $F_{X_{1},\ldots ,X_{n}}(x_{1},\ldots ,x_{n})$ . A finite set of $n$ random variables $\{X_{1},\ldots ,X_{n}\}$ is mutually independent if and only if^[3]^{:p. 16}

$F_{X_{1},\ldots ,X_{n}}(x_{1},\ldots ,x_{n})=F_{X_{1}}(x_{1})\cdot \ldots \cdot F_{X_{n}}(x_{n})\quad {\text{for all }}x_{1},\ldots ,x_{n}$

(Eq.5)

Notice that it is not necessary here to require that the probability distribution factorizes for all possible $k$-element subsets, as in the case of $n$ events. This is not required because, for example, $F_{X_{1},X_{2},X_{3}}(x_{1},x_{2},x_{3})=F_{X_{1}}(x_{1})\cdot F_{X_{2}}(x_{2})\cdot F_{X_{3}}(x_{3})$ implies $F_{X_{1},X_{3}}(x_{1},x_{3})=F_{X_{1}}(x_{1})\cdot F_{X_{3}}(x_{3})$ : letting $x_{2}\to \infty$ sends $F_{X_{2}}(x_{2})$ to 1 and turns the joint distribution function of $(X_{1},X_{2},X_{3})$ into that of $(X_{1},X_{3})$ .

The measure-theoretically inclined may prefer to substitute events $\{X\in A\}$ for events $\{X\leq x\}$ in the above definition, where $A$ is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed with appropriate σ-algebras).

For real valued random vectors

Two random vectors $\mathbf {X} =(X_{1},...,X_{m})^{T}$ and $\mathbf {Y} =(Y_{1},...,Y_{n})^{T}$ are called independent if^[5]^{:p. 187}

$F_{\mathbf {X,Y} }(\mathbf {x,y} )=F_{\mathbf {X} }(\mathbf {x} )\cdot F_{\mathbf {Y} }(\mathbf {y} )\quad {\text{for all }}\mathbf {x} ,\mathbf {y}$

(Eq.6)

where $F_{\mathbf {X} }(\mathbf {x} )$ and $F_{\mathbf {Y} }(\mathbf {y} )$ denote the cumulative distribution functions of $\mathbf {X}$ and $\mathbf {Y}$ and $F_{\mathbf {X,Y} }(\mathbf {x,y} )$ denotes their joint cumulative distribution function. Independence of $\mathbf {X}$ and $\mathbf {Y}$ is often denoted by $\mathbf {X} \perp \!\!\!\perp \mathbf {Y}$ . Written component-wise, $\mathbf {X}$ and $\mathbf {Y}$ are called independent if

$F_{X_{1},\ldots ,X_{m},Y_{1},\ldots ,Y_{n}}(x_{1},\ldots ,x_{m},y_{1},\ldots ,y_{n})=F_{X_{1},\ldots ,X_{m}}(x_{1},\ldots ,x_{m})\cdot F_{Y_{1},\ldots ,Y_{n}}(y_{1},\ldots ,y_{n})\quad {\text{for all }}x_{1},\ldots ,x_{m},y_{1},\ldots ,y_{n}$ .

For stochastic processes

For one stochastic process

The definition of independence may be extended from random vectors to a stochastic process. An independent stochastic process is one for which the random variables obtained by sampling the process at any $n$ times $t_{1},\ldots ,t_{n}$ are independent random variables, for any $n$ .^[6]^{:p. 163}

Formally, a stochastic process $\left\{X_{t}\right\}_{t\in {\mathcal {T}}}$ is called independent, if and only if for all $n\in \mathbb {N}$ and for all $t_{1},\ldots ,t_{n}\in {\mathcal {T}}$

$F_{X_{t_{1}},\ldots ,X_{t_{n}}}(x_{1},\ldots ,x_{n})=F_{X_{t_{1}}}(x_{1})\cdot \ldots \cdot F_{X_{t_{n}}}(x_{n})\quad {\text{for all }}x_{1},\ldots ,x_{n}$

(Eq.7)

where $F_{X_{t_{1}},\ldots ,X_{t_{n}}}(x_{1},\ldots ,x_{n})=\mathrm {P} (X(t_{1})\leq x_{1},\ldots ,X(t_{n})\leq x_{n})$ . Note that independence of a stochastic process is a property within a stochastic process, not between two stochastic processes.

For two stochastic processes

Independence of two stochastic processes is a property between two stochastic processes $\left\{X_{t}\right\}_{t\in {\mathcal {T}}}$ and $\left\{Y_{t}\right\}_{t\in {\mathcal {T}}}$ that are defined on the same probability space $(\Omega ,{\mathcal {F}},P)$ . Formally, two stochastic processes $\left\{X_{t}\right\}_{t\in {\mathcal {T}}}$ and $\left\{Y_{t}\right\}_{t\in {\mathcal {T}}}$ are said to be independent if for all $n\in \mathbb {N}$ and for all $t_{1},\ldots ,t_{n}\in {\mathcal {T}}$ , the random vectors $(X(t_{1}),\ldots ,X(t_{n}))$ and $(Y(t_{1}),\ldots ,Y(t_{n}))$ are independent,^[7]^{:p. 515} i.e., if

$F_{X_{t_{1}},\ldots ,X_{t_{n}},Y_{t_{1}},\ldots ,Y_{t_{n}}}(x_{1},\ldots ,x_{n},y_{1},\ldots ,y_{n})=F_{X_{t_{1}},\ldots ,X_{t_{n}}}(x_{1},\ldots ,x_{n})\cdot F_{Y_{t_{1}},\ldots ,Y_{t_{n}}}(y_{1},\ldots ,y_{n})\quad {\text{for all }}x_{1},\ldots ,x_{n},y_{1},\ldots ,y_{n}$

(Eq.8)

Independent σ-algebras

The definitions above (Eq.1 and Eq.2) are both generalized by the following definition of independence for σ-algebras. Let $(\Omega ,\Sigma ,\mathrm {P} )$ be a probability space and let ${\mathcal {A}}$ and ${\mathcal {B}}$ be two sub-σ-algebras of $\Sigma$ . ${\mathcal {A}}$ and ${\mathcal {B}}$ are said to be independent if, whenever $A\in {\mathcal {A}}$ and $B\in {\mathcal {B}}$ ,

$\mathrm {P} (A\cap B)=\mathrm {P} (A)\mathrm {P} (B).$

Likewise, a finite family of σ-algebras $(\tau _{i})_{i\in I}$ , where $I$ is an index set, is said to be independent if and only if

$\forall \left(A_{i}\right)_{i\in I}\in \prod \nolimits _{i\in I}\tau _{i}\ :\ \mathrm {P} \left(\bigcap \nolimits _{i\in I}A_{i}\right)=\prod \nolimits _{i\in I}\mathrm {P} \left(A_{i}\right)$

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.

The new definition relates to the previous ones very directly:

Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event $E\in \Sigma$ is, by definition,

$\sigma (\{E\})=\{\emptyset ,E,\Omega \setminus E,\Omega \}.$

Two random variables $X$ and $Y$ defined over $\Omega$ are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable $X$ taking values in some measurable space $S$ consists, by definition, of all subsets of $\Omega$ of the form $X^{-1}(U)$ , where $U$ is any measurable subset of $S$ .

Using this definition, it is easy to show that if $X$ and $Y$ are random variables and $Y$ is constant, then $X$ and $Y$ are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra $\{\varnothing ,\Omega \}$ . Probability zero events cannot affect independence so independence also holds if $Y$ is only Pr-almost surely constant.
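The correspondence between independent events and independent σ-algebras can be checked by enumeration on a finite space. The sketch below assumes one roll of a fair die; the helper names `sigma_algebra` and `independent`, and the events `A`, `B` and `C`, are hypothetical choices for this illustration.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # one roll of a fair die (assumed example)

def prob(event):
    return Fraction(len(event), len(omega))

def sigma_algebra(event):
    """σ-algebra generated by one event E: the empty set, E, its complement, Ω."""
    e = frozenset(event)
    return {frozenset(), e, omega - e, omega}

def independent(fam_a, fam_b):
    """Check P(A ∩ B) == P(A) P(B) for every A in fam_a and every B in fam_b."""
    return all(prob(a & b) == prob(a) * prob(b) for a in fam_a for b in fam_b)

A = {2, 4, 6}      # even roll
B = {1, 2, 3, 4}   # roll at most 4; independent of A
C = {1, 2, 3}      # roll at most 3; not independent of A

print(independent(sigma_algebra(A), sigma_algebra(B)))
print(independent(sigma_algebra(A), sigma_algebra(C)))
```

The first check succeeds for all sixteen pairs of sets, including complements, while the second fails already at the pair `A`, `C`.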

Properties

Self-independence

Note that an event is independent of itself if and only if

$\mathrm {P} (A)=\mathrm {P} (A\cap A)=\mathrm {P} (A)\cdot \mathrm {P} (A)\Leftrightarrow \mathrm {P} (A)=0{\text{ or }}\mathrm {P} (A)=1$ .

Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.^[8]

Expectation and covariance

If $X$ and $Y$ are independent random variables, then the expectation operator $\operatorname {E}$ has the property

$\operatorname {E} [XY]=\operatorname {E} [X]\operatorname {E} [Y],$

and the covariance $\operatorname {cov} [X,Y]$ is zero, since we have

$\operatorname {cov} [X,Y]=\operatorname {E} [XY]-\operatorname {E} [X]\operatorname {E} [Y]$ .

(The converse of these, i.e. the proposition that if two random variables have a covariance of 0 they must be independent, is not true. See uncorrelated.)
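A standard illustration of zero covariance without independence, not taken from this article, is $X$ uniform on $\{-1,0,1\}$ with $Y=X^{2}$. The sketch below computes the covariance exactly and compares one joint probability with the product of the marginals; all variable names are illustrative.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1} and Y = X**2.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in support)            # E[X]
e_y = sum(p * x**2 for x in support)         # E[Y] = E[X^2]
e_xy = sum(p * x * x**2 for x in support)    # E[XY] = E[X^3]
cov = e_xy - e_x * e_y

# Dependence check at the point (X, Y) = (0, 0).
p_joint = p       # X = 0 forces Y = 0, so P(X = 0, Y = 0) = 1/3
p_prod = p * p    # P(X = 0) * P(Y = 0) = 1/3 * 1/3

print(cov == 0, p_joint == p_prod)
```

The covariance vanishes, yet the joint probability differs from the product of the marginals, so the two variables are uncorrelated but not independent.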

Similarly for two stochastic processes $\left\{X_{t}\right\}_{t\in {\mathcal {T}}}$ and $\left\{Y_{t}\right\}_{t\in {\mathcal {T}}}$ : If they are independent, then they are uncorrelated.^[9]^{:p. 151}

Characteristic function

Two random variables $X$ and $Y$ are independent if and only if the characteristic function of the random vector $(X,Y)$ satisfies

$\varphi _{(X,Y)}(t,s)=\varphi _{X}(t)\cdot \varphi _{Y}(s)$ .

In particular, the characteristic function of their sum is the product of their marginal characteristic functions:

$\varphi _{X+Y}(t)=\varphi _{X}(t)\cdot \varphi _{Y}(t),$

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.
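The two factorizations can be verified numerically for discrete random variables. The following sketch assumes two independent variables with the hypothetical probability mass functions `p_x` and `p_y`, and evaluates the characteristic functions on a small grid of arguments using floating-point arithmetic.

```python
import cmath
from itertools import product

# Assumed marginal pmfs of two independent discrete random variables.
p_x = {0: 0.3, 1: 0.7}
p_y = {-1: 0.5, 2: 0.5}

def cf(pmf, t):
    """Characteristic function E[exp(i t V)] of a discrete pmf."""
    return sum(p * cmath.exp(1j * t * v) for v, p in pmf.items())

def cf_joint(t, s):
    """Characteristic function of (X, Y) under the product (independent) law."""
    return sum(p_x[x] * p_y[y] * cmath.exp(1j * (t * x + s * y))
               for x, y in product(p_x, p_y))

grid = [k / 4 for k in range(-8, 9)]

# Largest deviation from the joint factorization over the grid.
max_err_joint = max(abs(cf_joint(t, s) - cf(p_x, t) * cf(p_y, s))
                    for t in grid for s in grid)

# The characteristic function of X + Y is the joint one on the diagonal s = t.
max_err_sum = max(abs(cf_joint(t, t) - cf(p_x, t) * cf(p_y, t)) for t in grid)

print(max_err_joint < 1e-12, max_err_sum < 1e-12)
```

Agreement up to rounding error on a grid is only a sanity check, not a proof; the exact statement is the identity given above.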

Examples

Rolling dice

The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are not independent.
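Both claims can be confirmed by enumerating the 36 equally likely outcomes of two rolls. In the sketch below, the event names and the helper `prob` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first roll, second roll) of a fair die.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

first_six = {w for w in omega if w[0] == 6}
second_six = {w for w in omega if w[1] == 6}
sum_eight = {w for w in omega if sum(w) == 8}

# Product rule (Eq.1) for the two pairs of events discussed above.
indep_sixes = prob(first_six & second_six) == prob(first_six) * prob(second_six)
indep_sum = prob(first_six & sum_eight) == prob(first_six) * prob(sum_eight)

print(indep_sixes, indep_sum)
print(prob(first_six & sum_eight), prob(first_six) * prob(sum_eight))
```

The second pair fails because a first roll of 6 and a sum of 8 occur together only for the outcome (6, 2), with probability 1/36, whereas the product of the two probabilities is 1/6 times 5/36.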

Drawing cards

If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.
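The difference between the two sampling schemes can be computed exactly. The sketch below assumes a standard 52-card deck with 26 red and 26 black cards and models each card by its colour only; the function name `red_red_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

# Assumed deck: 26 red and 26 black cards, identified by position.
deck = ["R"] * 26 + ["B"] * 26

def red_red_probabilities(draws):
    """Return (P(first red), P(second red), P(both red)) for equally likely draws."""
    draws = list(draws)
    n = len(draws)
    p_first = Fraction(sum(deck[i] == "R" for i, _ in draws), n)
    p_second = Fraction(sum(deck[j] == "R" for _, j in draws), n)
    p_both = Fraction(sum(deck[i] == "R" and deck[j] == "R" for i, j in draws), n)
    return p_first, p_second, p_both

# With replacement: any ordered pair of positions, repeats allowed.
with_repl = red_red_probabilities(product(range(52), repeat=2))
# Without replacement: ordered pairs of distinct positions.
without_repl = red_red_probabilities(permutations(range(52), 2))

for p_first, p_second, p_both in (with_repl, without_repl):
    print(p_first, p_second, p_both, p_both == p_first * p_second)
```

In both schemes each draw is red with probability 1/2, but only with replacement does the probability of two red cards equal 1/4; without replacement it is 25/102.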

Pairwise and mutual independence

Figure: Pairwise independent, but not mutually independent, events.

Figure: Mutually independent events.

Consider the two probability spaces shown. In both cases, $\mathrm {P} (A)=\mathrm {P} (B)=1/2$ and $\mathrm {P} (C)=1/4$ . The events in the first space are pairwise independent because $\mathrm {P} (A|B)=\mathrm {P} (A|C)=1/2=\mathrm {P} (A)$ , $\mathrm {P} (B|A)=\mathrm {P} (B|C)=1/2=\mathrm {P} (B)$ , and $\mathrm {P} (C|A)=\mathrm {P} (C|B)=1/4=\mathrm {P} (C)$ ; but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

$\mathrm {P} (A|BC)={\frac {\frac {4}{40}}{{\frac {4}{40}}+{\frac {1}{40}}}}={\tfrac {4}{5}}\neq \mathrm {P} (A)$

$\mathrm {P} (B|AC)={\frac {\frac {4}{40}}{{\frac {4}{40}}+{\frac {1}{40}}}}={\tfrac {4}{5}}\neq \mathrm {P} (B)$

$\mathrm {P} (C|AB)={\frac {\frac {4}{40}}{{\frac {4}{40}}+{\frac {6}{40}}}}={\tfrac {2}{5}}\neq \mathrm {P} (C)$

In the mutually independent case, however,

$\mathrm {P} (A|BC)={\frac {\frac {1}{16}}{{\frac {1}{16}}+{\frac {1}{16}}}}={\tfrac {1}{2}}=\mathrm {P} (A)$

$\mathrm {P} (B|AC)={\frac {\frac {1}{16}}{{\frac {1}{16}}+{\frac {1}{16}}}}={\tfrac {1}{2}}=\mathrm {P} (B)$

$\mathrm {P} (C|AB)={\frac {\frac {1}{16}}{{\frac {1}{16}}+{\frac {3}{16}}}}={\tfrac {1}{4}}=\mathrm {P} (C)$

Mutual independence

It is possible to create a three-event example in which

$\mathrm {P} (A\cap B\cap C)=\mathrm {P} (A)\mathrm {P} (B)\mathrm {P} (C),$

and yet no two of the three events are independent (and hence the set of events is not mutually independent).^[10] This example shows that mutual independence imposes requirements on the products of probabilities of all combinations of events, not just on the product over all the single events. For another example, take $A$ to be empty and $B$ and $C$ to be identical events with probability strictly between 0 and 1. Then, since $B$ and $C$ are the same event, they are not independent, but the probability of the intersection of the three events is zero, which equals the product of the probabilities.
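The second example can be checked on a two-point sample space. The sketch below assumes a single fair coin toss, with `A` empty and `B` and `C` both equal to the event "heads" (probability 1/2); the names are illustrative.

```python
from fractions import Fraction

omega = {"H", "T"}  # one fair coin toss (assumed example)

def prob(event):
    return Fraction(len(event), len(omega))

A = set()    # the impossible event
B = {"H"}
C = {"H"}    # the same event as B

# The single triple-product condition holds, because both sides are zero.
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

# B and C fail the pairwise product rule: P(B ∩ C) = 1/2 but P(B) P(C) = 1/4.
bc_independent = prob(B & C) == prob(B) * prob(C)

print(triple_ok, bc_independent)
```

The triple product condition alone therefore does not rule out dependence between two of the events.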

Conditional independence

For events

The events $A$ and $B$ are conditionally independent given an event $C$ when

$\mathrm {P} (A\cap B\mid C)=\mathrm {P} (A\mid C)\cdot \mathrm {P} (B\mid C)$ .

For random variables

Intuitively, two random variables $X$ and $Y$ are conditionally independent given $Z$ if, once $Z$ is known, the value of $Y$ does not add any additional information about $X$ . For instance, two measurements $X$ and $Y$ of the same underlying quantity $Z$ are not independent, but they are conditionally independent given $Z$ (unless the errors in the two measurements are somehow connected).

The formal definition of conditional independence is based on the idea of conditional distributions. If $X$ , $Y$ , and $Z$ are discrete random variables, then we define $X$ and $Y$ to be conditionally independent given $Z$ if

$\mathrm {P} (X\leq x,Y\leq y\;|\;Z=z)=\mathrm {P} (X\leq x\;|\;Z=z)\cdot \mathrm {P} (Y\leq y\;|\;Z=z)$

for all $x$ , $y$ and $z$ such that $\mathrm {P} (Z=z)>0$ . On the other hand, if the random variables are continuous and have a joint probability density function $f_{XYZ}(x,y,z)$ , then $X$ and $Y$ are conditionally independent given $Z$ if

$f_{XY|Z}(x,y|z)=f_{X|Z}(x|z)\cdot f_{Y|Z}(y|z)$

for all real numbers $x$ , $y$ and $z$ such that $f_{Z}(z)>0$ .

If discrete $X$ and $Y$ are conditionally independent given $Z$ , then

$\mathrm {P} (X=x|Y=y,Z=z)=\mathrm {P} (X=x|Z=z)$

for any $x$ , $y$ and $z$ with $\mathrm {P} (Z=z)>0$ . That is, the conditional distribution for $X$ given $Y$ and $Z$ is the same as that given $Z$ alone. A similar equation holds for the conditional probability density functions in the continuous case.
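The discrete definition can be illustrated with a small model in which $X$ and $Y$ are conditionally independent given $Z$ but not independent. The sketch below assumes that $Z$ is a fair coin selecting a success probability (1/4 or 3/4) and that, given $Z$, the variables $X$ and $Y$ are two separate Bernoulli trials with that probability. The model, the table `q`, and the helpers `bern` and `P` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed model: Z picks the success probability used for both X and Y.
q = {0: Fraction(1, 4), 1: Fraction(3, 4)}

def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v in {0, 1}."""
    return p if v == 1 else 1 - p

# Joint pmf of (X, Y, Z), with P(Z = 0) = P(Z = 1) = 1/2.
joint = {(x, y, z): Fraction(1, 2) * bern(q[z], x) * bern(q[z], y)
         for x, y, z in product((0, 1), repeat=3)}

def P(pred):
    """Probability of the event described by the predicate pred(x, y, z)."""
    return sum(p for w, p in joint.items() if pred(*w))

# Conditional independence: the CDF form of the definition, for every z.
cond_indep = all(
    P(lambda a, b, c: a <= x and b <= y and c == z) / P(lambda a, b, c: c == z)
    == (P(lambda a, b, c: a <= x and c == z) / P(lambda a, b, c: c == z))
    * (P(lambda a, b, c: b <= y and c == z) / P(lambda a, b, c: c == z))
    for x, y, z in product((0, 1), repeat=3)
)

# Unconditional independence of X and Y (Eq.4 on the support points).
marg_indep = all(
    P(lambda a, b, c: a <= x and b <= y)
    == P(lambda a, b, c: a <= x) * P(lambda a, b, c: b <= y)
    for x, y in product((0, 1), repeat=2)
)

print(cond_indep, marg_indep)
```

Conditioning on $Z$ makes the two trials independent, while without conditioning an observed value of $Y$ carries information about $Z$ and hence about $X$.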

Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.

References

[1] Russell, Stuart; Norvig, Peter (2002). Artificial Intelligence: A Modern Approach. Prentice Hall. p. 478. ISBN 0-13-790395-2.
[2] Florescu, Ionut (2014). Probability and Stochastic Processes. Wiley. ISBN 978-0-470-62455-5.
[3] Gallager, Robert G. (2013). Stochastic Processes Theory for Applications. Cambridge University Press. ISBN 978-1-107-03975-9.
[4] Feller, W (1971). "Stochastic Independence". An Introduction to Probability Theory and Its Applications. Wiley.
[5] Papoulis, Athanasios (1991). Probability, Random Variables and Stochastic Processes. McGraw-Hill. ISBN 0-07-048477-5.
[6] Hwei, Piao (1997). Theory and Problems of Probability, Random Variables, and Random Processes. McGraw-Hill. ISBN 0-07-030644-3.
[7] Lapidoth, Amos (2017). A Foundation in Digital Communication. Cambridge University Press. ISBN 978-1-107-17732-1.
[8] Durrett, Richard (1996). Probability: Theory and Examples (Second ed.). p. 62.
[9] Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
[10] George, Glyn (November 2004). "Testing for the independence of three events". Mathematical Gazette 88: 568.

External links

Media related to Statistical dependence at Wikimedia Commons
