PROBABILITY THEORY AND RANDOM VARIABLES

The sample space is denoted by Ω and its elements ω are called samples or experimental outcomes. Certain subsets of Ω (collections of outcomes) are called events and are denoted by Λ.

Set-theoretic notation is used. If A and B are two sets, then A ∪ B (also written A + B) is their union, A ∩ B (also written AB) is their intersection, and A - B is the complement of B with respect to A; the empty set is denoted by 0. If A ∩ B = 0, that is, if the sets are disjoint, then the corresponding events are mutually exclusive.

For a class of events we assign probabilities to the events Λ via a probability function Pr(·). That is, to each event we assign a number Pr(Λ), called the probability of Λ. The probability function satisfies the following axioms:

(i) Pr(Λ) ≥ 0
(ii) Pr(Ω) = 1
(iii) if Λi ∩ Λj = 0, i ≠ j, i, j = 1, …, n, then Pr(Λ1 ∪ Λ2 ∪ … ∪ Λn) = Pr(Λ1) + Pr(Λ2) + … + Pr(Λn)
(iv) if Λi ∩ Λj = 0, i ≠ j, i, j = 1, 2, …, then Pr(Λ1 ∪ Λ2 ∪ …) = Pr(Λ1) + Pr(Λ2) + …
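
As a concrete illustration of the additivity axiom (iii), the following sketch (an addition, not part of the original notes) builds a finite sample space with equally likely outcomes, chosen arbitrarily as a fair die, and checks additivity for two disjoint events.

    from fractions import Fraction

    omega = {1, 2, 3, 4, 5, 6}                    # sample space (a fair die, chosen for illustration)
    pr_atom = {w: Fraction(1, 6) for w in omega}  # probability of each elementary outcome

    def pr(event):
        """Probability of an event = sum of the probabilities of its outcomes."""
        return sum(pr_atom[w] for w in event)

    odd, even = {1, 3, 5}, {2, 4, 6}              # two events
    assert odd & even == set()                    # the events are disjoint
    assert pr(odd | even) == pr(odd) + pr(even)   # axiom (iii): Pr of the union = sum of the Pr
    assert pr(omega) == 1                         # axiom (ii)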

The class of events has to be defined. In defining the class of events we want the set operations (unions, intersections, complements) applied to events to yield sets that are also events. A class of sets having these closure properties is called a Borel field. A class F of ω-sets is called a Borel field if

(i) Ω ∈ F
(ii) if Λ ∈ F then Ω - Λ ∈ F
(iii) if Λ1, Λ2, …, Λn ∈ F then Λ1 ∪ Λ2 ∪ … ∪ Λn ∈ F and Λ1 ∩ Λ2 ∩ … ∩ Λn ∈ F
(iv) if Λ1, Λ2, …, Λn, … ∈ F then Λ1 ∪ Λ2 ∪ … ∈ F and Λ1 ∩ Λ2 ∩ … ∈ F

The triplet (Ω, F,Pr) is called an experiment.

Example: Ω = {real numbers}, with the Borel field F generated by the sets {ω : ω ≤ x1} for all real x1.

Example, rolling a die once: Ω = {1, 2, 3, 4, 5, 6}, Borel field F = {0, {1, 3, 5}, {2, 4, 6}, Ω}. But A = {0, {1, 3, 5}, {2, 4, 6}, {1}, Ω} is not a Borel field, because {1} ∪ {2, 4, 6} = {1, 2, 4, 6} ∉ A.
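
The closure conditions can be checked mechanically. The helper below (an added sketch; is_borel_field is a hypothetical name) tests closure under complements and pairwise unions and intersections for the two collections above, confirming that F is a Borel field while A is not.

    from itertools import combinations

    OMEGA = frozenset({1, 2, 3, 4, 5, 6})

    def is_borel_field(collection):
        """Check closure under complement and pairwise union/intersection
        for a finite collection of subsets of OMEGA."""
        sets = {frozenset(s) for s in collection}
        if OMEGA not in sets:
            return False
        for a in sets:
            if OMEGA - a not in sets:                      # closed under complements
                return False
        for a, b in combinations(sets, 2):
            if a | b not in sets or a & b not in sets:     # closed under unions and intersections
                return False
        return True

    F = [set(), {1, 3, 5}, {2, 4, 6}, OMEGA]
    A = [set(), {1, 3, 5}, {2, 4, 6}, {1}, OMEGA]
    print(is_borel_field(F))   # True
    print(is_borel_field(A))   # False: e.g. {1} ∪ {2, 4, 6} = {1, 2, 4, 6} is not in A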

RANDOM VARIABLES

A real finite-valued function x(·) defined on Ω is called a (real) random variable if, for every real number x, the inequality x(ω) ≤ x defines an ω-set whose probability is defined. A random variable is a Borel measurable function.

For a random variable the function

F_x(x) = \Pr\{ x(\omega) \le x \}     (3.1)

is defined for all real x and is called the distribution function of the random variable x. A random variable x is called discrete if there exists a mass function m_x(·) such that

F_x(x) = \sum_{\xi \le x} m_x(\xi), \qquad m_x(\xi) \ge 0     (3.2)

A random variable is called continuous if there exists a density function p_x(·) such that

F_x(x) = \int_{-\infty}^{x} p_x(\xi) \, d\xi, \qquad -\infty < x < \infty     (3.3)

If the number of points at which F_x(·) is not differentiable is countable then

p_x(x) = \frac{d}{dx} F_x(x)     (3.4)

at all x at which the derivative exists.
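
The relation between (3.3) and (3.4) can be verified numerically. The sketch below (an addition; the exponential density λe^(-λx), x ≥ 0, and the value of λ are arbitrary choices for illustration) integrates a density to obtain the distribution function and differentiates it back to recover the density.

    import numpy as np
    from scipy.integrate import quad

    lam = 2.0                                                  # rate of the illustrative density
    p = lambda x: lam * np.exp(-lam * x) if x >= 0 else 0.0    # density p_x(x)

    def F(x):
        """Distribution function F_x(x), cf. (3.3)."""
        return quad(p, 0.0, x)[0] if x > 0 else 0.0

    x0, h = 1.0, 1e-5
    dF = (F(x0 + h) - F(x0 - h)) / (2 * h)    # numerical derivative of F_x, cf. (3.4)
    print(dF, p(x0))                          # both ≈ 2*exp(-2) ≈ 0.2707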

The expectation, average, mean or first moment of a continuous random variable is defined by

E\{x\} = \int_{-\infty}^{\infty} x \, p_x(x) \, dx     (3.5)

The nth moment of x is defined by

E\{x^n\} = \int_{-\infty}^{\infty} x^n p_x(x) \, dx     (3.6)

The second moment E{x²} is called the mean square value.

The nth central moment of x is defined by

E\{(x - E\{x\})^n\} = \int_{-\infty}^{\infty} (x - E\{x\})^n p_x(x) \, dx     (3.7)

The second central moment is called the variance of x.
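
Continuing with the illustrative exponential density from the previous sketch (again an assumption, not part of the notes), the moments (3.5)-(3.7) can be evaluated by numerical quadrature and compared with the known values 1/λ, 2/λ² and 1/λ².

    import numpy as np
    from scipy.integrate import quad

    lam = 2.0
    p = lambda x: lam * np.exp(-lam * x)                             # illustrative density on [0, inf)

    mean     = quad(lambda x: x * p(x), 0, np.inf)[0]                # (3.5): E{x} = 1/lam
    mean_sq  = quad(lambda x: x**2 * p(x), 0, np.inf)[0]             # (3.6): E{x^2} = 2/lam^2
    variance = quad(lambda x: (x - mean)**2 * p(x), 0, np.inf)[0]    # (3.7) with n = 2: 1/lam^2
    print(mean, mean_sq, variance)                                   # ≈ 0.5, 0.5, 0.25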

Example: rolling a die. We can define a sample space Ω = {1, 2, 3, 4, 5, 6}, a random variable x(ω), and probabilities Pr(ω) as given, for example, in the following table

ω        1      2      3      4      5      6
x(ω)   -30    -20    -10     10     10     30
Pr(ω)   1/6    1/6    1/6    1/6    1/6    1/6

Let us introduce the subsets (events corresponding to the odd and even numbers) Λ1 = {1, 3, 5} and Λ2 = {2, 4, 6}. The corresponding Borel field is F = {0, Λ1, Λ2, Ω} and the corresponding probabilities are the elements of the row matrix Pr = [0, 0.5, 0.5, 1]. This is also the situation when tossing a coin with the two events heads and tails.
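
The mean and variance of the tabulated random variable follow directly from the discrete form of the expectation; the short fragment below (an addition) evaluates them exactly.

    from fractions import Fraction

    x_vals = [-30, -20, -10, 10, 10, 30]        # x(omega) from the table
    probs  = [Fraction(1, 6)] * 6               # Pr(omega) from the table

    mean = sum(p * x for p, x in zip(probs, x_vals))                 # E{x}
    var  = sum(p * (x - mean)**2 for p, x in zip(probs, x_vals))     # second central moment
    print(mean, var)                            # -5/3 and 3725/9 (≈ 413.9)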

Example: choice of a random phase ψ from the continuum of values in the interval -π < ψ ≤ π, thus Ω = {-π < ψ ≤ π}. Let us define the random variable x(ω) = ψ; the probability function is Pr{ψ1 ≤ ψ ≤ ψ2} = (ψ2 - ψ1)/(2π), where -π ≤ ψ1 ≤ ψ2 ≤ π.

The corresponding distribution function is:

if ψ > π then F_x = 1,

if -π ≤ ψ ≤ π then F_x = (π + ψ)/(2π),

if ψ < -π then F_x = 0.

The uniform density function p_x(ψ) = 1/(2π), -π < ψ ≤ π, results by differentiating the distribution function with respect to the random variable x = ψ.
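
A quick Monte Carlo check of the probability function above (an added sketch; the sample size and interval endpoints are arbitrary) compares the empirical probability of an interval of phases with (ψ2 - ψ1)/(2π).

    import numpy as np

    rng = np.random.default_rng(0)
    psi = rng.uniform(-np.pi, np.pi, size=1_000_000)    # samples of the random phase

    psi1, psi2 = -1.0, 2.0                              # arbitrary interval inside [-pi, pi]
    empirical   = np.mean((psi >= psi1) & (psi <= psi2))
    theoretical = (psi2 - psi1) / (2 * np.pi)
    print(empirical, theoretical)                       # both ≈ 0.477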

A random variable x is Gaussian or normally distributed if its density function is given by

p_x(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2} \left( \frac{x - m}{\sigma} \right)^2 \right]     (3.8)

where m = E{x} is the mean and σ² = E{(x - m)²} is the variance.

The normal distribution function is

F_x(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} \exp\left[ -\frac{1}{2} \left( \frac{\xi - m}{\sigma} \right)^2 \right] d\xi = \frac{1}{2} + \mathrm{erf}\left( \frac{x - m}{\sigma} \right)     (3.9)

where the error function is defined here as \mathrm{erf}(z) = \frac{1}{\sqrt{2\pi}} \int_{0}^{z} e^{-\xi^2/2} \, d\xi.
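
Relation (3.9) can be checked numerically; the sketch below (an addition, with arbitrary m and σ) evaluates the integral by quadrature and compares it with the erf expression. In terms of the standard-library error function, erf in the convention above equals (1/2)·math.erf(z/√2).

    import math
    from scipy.integrate import quad

    m, sigma = 1.0, 2.0                      # arbitrary mean and standard deviation
    p = lambda x: math.exp(-0.5 * ((x - m) / sigma)**2) / math.sqrt(2 * math.pi * sigma**2)

    x = 2.5
    F_integral = quad(p, -math.inf, x)[0]                       # left-hand side of (3.9)
    erf_conv   = lambda z: 0.5 * math.erf(z / math.sqrt(2))     # erf in the convention used here
    F_erf      = 0.5 + erf_conv((x - m) / sigma)                # right-hand side of (3.9)
    print(F_integral, F_erf)                                    # both ≈ 0.7734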

It is convenient to specify a random variable by its characteristic function defined as the Fourier transform of the density function

\varphi_x(u) = \int_{-\infty}^{\infty} e^{iux} p_x(x) \, dx     (3.10)

The nth moment of the random variable is

E\{x^n\} = \int_{-\infty}^{\infty} x^n p_x(x) \, dx     (3.11)

It is easy to verify the following relation

\frac{1}{i^n} \frac{d^n}{du^n} \varphi_x(u) \bigg|_{u=0} = \int_{-\infty}^{\infty} x^n e^{iux} p_x(x) \, dx \bigg|_{u=0} = E\{x^n\}     (3.12)

Thus when the characteristic function is calculated it is easy to calculate the values of the set of nth moments by differentiation.
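
As an illustration of (3.12), consider the exponential density λe^(-λx), whose characteristic function is λ/(λ - iu), a standard result assumed here; the added sketch below recovers the moments n!/λ^n by symbolic differentiation.

    import sympy as sp

    u = sp.Symbol('u', real=True)
    lam = sp.Symbol('lambda', positive=True)
    phi = lam / (lam - sp.I * u)       # characteristic function of the exponential density

    for n in range(1, 5):
        moment = sp.simplify(sp.diff(phi, u, n).subs(u, 0) / sp.I**n)   # left side of (3.12)
        print(n, moment, sp.factorial(n) / lam**n)                      # both equal n!/lambda^n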

The characteristic function for a Gaussian density with expectation m=0 is

\varphi_x(u) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} \exp\left[ iux - \frac{1}{2} \left( \frac{x}{\sigma} \right)^2 \right] dx     (3.13)

(The assumption that m = 0 is not a serious loss of generality, because we can subtract m from the random variable to obtain a random variable with zero mean.)

Introducing the change of variables y = x/σ (x = σy, dx = σ dy) in the integral, the integration finally yields the following expression

\varphi_x(u) = \exp\left( -\frac{1}{2} u^2 \sigma^2 \right)     (3.14)

Differentiation and substitution into the expression for the nth moments lead to the conclusion that for odd values of n the central moments are zero, and it is easy to establish a recursion between the expressions for consecutive even values of n. The final result is that for a normal distribution the nth central moment is equal to

E\{(x - m)^n\} = \begin{cases} 0, & \text{all odd } n \ge 1 \\ 1 \cdot 3 \cdot 5 \cdots (n-1) \, \sigma^n, & \text{all even } n \ge 2 \end{cases}     (3.15)

The even central moments grow without bound when n tends to infinity.
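
Formula (3.15) can be checked by direct numerical integration of the Gaussian density (an added sketch with σ chosen arbitrarily); the product 1·3·5···(n-1) is the double factorial (n-1)!!.

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import factorial2

    sigma = 1.5                                    # arbitrary standard deviation, m = 0
    p = lambda x: np.exp(-0.5 * (x / sigma)**2) / np.sqrt(2 * np.pi * sigma**2)

    for n in range(1, 7):
        moment    = quad(lambda x: x**n * p(x), -10 * sigma, 10 * sigma)[0]
        predicted = 0.0 if n % 2 else factorial2(n - 1) * sigma**n      # (3.15)
        print(n, round(moment, 6), predicted)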

JOINTLY DISTRIBUTED RANDOM VARIABLES

Let us consider two random variables x and y. The two sets {x(ω) ≤ x} and {y(ω) ≤ y} are events with probabilities

\Pr\{ x(\omega) \le x \} = F_x(x), \qquad \Pr\{ y(\omega) \le y \} = F_y(y)     (3.16)

where F_x(x) and F_y(y) are the distribution functions of the random variables x and y. The intersection of these two sets

\{ x(\omega) \le x \} \cap \{ y(\omega) \le y \} = \{ x(\omega) \le x, \; y(\omega) \le y \}     (3.17)

is an event. The probability of this event is the joint distribution function of the jointly distributed random variables x and y

F_{x,y}(x, y) = \Pr\{ x(\omega) \le x, \; y(\omega) \le y \}.     (3.18)

Example: the temperature x(ω) measured at 6 o'clock and y(ω) measured at 12 o'clock on the same day. It is possible to consider the random variables x and y separately or to look at the pair x(ω), y(ω) jointly. To estimate the joint probability one has to consider a two-dimensional problem in the x, y plane.

In general, the continuous random variables x1, x2, …, xn defined on the same probability space are said to be jointly distributed. They may be characterized by their joint distribution function

F_{x_1, x_2, \ldots, x_n}(x_1, x_2, \ldots, x_n) = \Pr\{ x_1(\omega) \le x_1, \ldots, x_n(\omega) \le x_n \},     (3.19)

where

\{ x_1(\omega) \le x_1, \ldots, x_n(\omega) \le x_n \} = \{ x_1(\omega) \le x_1 \} \cap \cdots \cap \{ x_n(\omega) \le x_n \},     (3.20)

or by their joint density function

F_{x_1, \ldots, x_n}(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} p_{x_1, \ldots, x_n}(\xi_1, \ldots, \xi_n) \, d\xi_1 \cdots d\xi_n.     (3.21)

For the differentiable case

p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) = \frac{\partial^n}{\partial x_1 \cdots \partial x_n} F_{x_1, \ldots, x_n}(x_1, \ldots, x_n).     (3.22)

The marginal distribution function is defined by

F_{x_1, \ldots, x_m}(x_1, \ldots, x_m) = F_{x_1, \ldots, x_n}(x_1, \ldots, x_m, \infty, \ldots, \infty).     (3.23)

The marginal density function is

p_{x_1, \ldots, x_m}(x_1, \ldots, x_m) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) \, dx_{m+1} \cdots dx_n.     (3.24)
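
To illustrate the marginal density (3.24), the sketch below (an addition; the joint density p(x1, x2) = x1 + x2 on the unit square is an arbitrary choice) integrates out x2 symbolically and checks that the result is a properly normalized density.

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2', nonnegative=True)
    p_joint = x1 + x2                                # illustrative joint density on [0, 1]^2

    p_marginal = sp.integrate(p_joint, (x2, 0, 1))   # integrate out x2, cf. (3.24)
    print(p_marginal)                                # x1 + 1/2
    print(sp.integrate(p_marginal, (x1, 0, 1)))      # 1, so the marginal is a proper density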

The expectation of xk, 1 ≤ k ≤ n, is given by

m_k = E\{x_k\} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_k \, p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) \, dx_1 \cdots dx_n.     (3.25)

The second moment of xk is

E\{x_k^2\} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_k^2 \, p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) \, dx_1 \cdots dx_n.     (3.26)

Of great importance in applications is the covariance of xk and xl which is defined by

\mathrm{cov}\{x_k, x_l\} = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (x_k - m_k)(x_l - m_l) \, p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) \, dx_1 \cdots dx_n.     (3.27)

The generalization of the higher moments and central moments from the case of one random variable to the joint variables is straightforward.
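
A Monte Carlo sketch (an addition; the correlated-Gaussian construction and all parameter values are assumptions for illustration) estimates the covariance (3.27) from samples and compares it with the value built into the generator.

    import numpy as np

    rng = np.random.default_rng(1)
    cov_true = np.array([[2.0, 0.8],
                         [0.8, 1.0]])               # covariance used to generate the samples
    samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=500_000)

    xk, xl = samples[:, 0], samples[:, 1]
    mk, ml = xk.mean(), xl.mean()                   # sample means m_k, m_l, cf. (3.25)
    cov_kl = np.mean((xk - mk) * (xl - ml))         # sample analogue of (3.27)
    print(cov_kl, cov_true[0, 1])                   # ≈ 0.8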

Two jointly distributed random variables x1 and x2 are independent if any of the following equivalent conditions is satisfied

F_{x_1, x_2}(x_1, x_2) = F_{x_1}(x_1) \, F_{x_2}(x_2),     (3.28)
p_{x_1, x_2}(x_1, x_2) = p_{x_1}(x_1) \, p_{x_2}(x_2).

We say that x1, …, xn are mutually independent if

p_{x_1, \ldots, x_n}(x_1, \ldots, x_n) = p_{x_1}(x_1) \cdots p_{x_n}(x_n).     (3.29)
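
Finally, a sketch (an addition, with arbitrary thresholds) checking the factorization (3.28) empirically for two independent uniform random variables: the joint probability Pr{x1 ≤ a, x2 ≤ b} should match the product of the marginal probabilities.

    import numpy as np

    rng = np.random.default_rng(2)
    x1 = rng.uniform(0.0, 1.0, size=1_000_000)      # independent by construction
    x2 = rng.uniform(0.0, 1.0, size=1_000_000)

    a, b = 0.3, 0.7                                 # arbitrary thresholds
    joint   = np.mean((x1 <= a) & (x2 <= b))        # estimate of F_{x1,x2}(a, b)
    product = np.mean(x1 <= a) * np.mean(x2 <= b)   # F_{x1}(a) * F_{x2}(b)
    print(joint, product)                           # both ≈ 0.21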