KALMAN FILTERS - BASIC RELATIONS

The introduction to the Kalman filter theory will be limited to the linear theory, with the stress placed on the applications. The relations will therefore be justified by plausible arguments based on the physics of the problem rather than by rigorous mathematical proofs. The dynamical system is described by a set of linear Itô differential equations

$dX_t = f_t X_t\, dt + g_t\, dB_t$   (10.1)

where $X_t$ is the column matrix of $n$ random functions, $f_t$ and $g_t$ are, respectively, $n \times n$ and $n \times m$ deterministic, continuous matrix functions, and $B_t$, $t \ge t_0$, is a column matrix of $m$ independent Brownian motion processes with unit variance parameters. The initial conditions correspond to Gaussian processes. The dimensions of the matrices for the example of a twice differentiable process with a dominant frequency are given below.
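
As an illustration, a minimal Octave/Matlab sketch of a sample path of (10.1), obtained with the Euler-Maruyama scheme, is given below. The matrices f and g, the time step and the number of steps are values assumed only for this illustration; they do not come from the script files listed at the end of this section.

% Euler-Maruyama simulation of dX = f*X*dt + g*dB  (illustrative values only)
f  = [0 1; -4 -0.4];          % assumed 2x2 drift matrix (lightly damped oscillator)
g  = [0; 1];                  % assumed 2x1 diffusion matrix, m = 1
dt = 0.01;                    % integration step
N  = 5000;                    % number of steps
X  = zeros(2, N+1);           % zero (deterministic) initial condition
for k = 1:N
  dB = sqrt(dt)*randn;                        % increment of a unit Brownian motion
  X(:,k+1) = X(:,k) + f*X(:,k)*dt + g*dB;     % Euler-Maruyama step for (10.1)
end
plot((0:N)*dt, X(1,:))        % one realization of the first component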

Discrete, linear observations are taken at times $t_r$

$Y_{t_r} = h_{t_r} X_{t_r} + V_{t_r}, \qquad r = 1, 2, \ldots, \qquad 0 < t_1 < \cdots < t_r < t_{r+1} < \cdots$   (10.2)

where $Y_{t_r}$ is the column matrix of $s$ random observations, $h_{t_r}$ is an $s \times n$ deterministic matrix and $V_{t_r}$ is the column matrix of $s$ Gaussian white noise sequences with zero mean values and prescribed variances. It is assumed that the initial conditions, the Brownian motion processes and the white noise observation sequences are mutually independent.

The basic theorems are true for all linear systems. For constant coefficients the solutions are obtained by standard methods that are very easy to apply. In practical applications the computations are often restricted to a discrete set of times with constant spacing $t_r = r\Delta t$. The most important case in applications is a dynamical system described by a set of difference equations with equal time intervals

$X_{r+1} = \varphi_{\Delta t} X_r + q_{\Delta t} U_{r+1}$   (10.3)

where the column matrix $X_r$ with $n$ rows contains the values of the mathematical model, the $n \times n$ matrix $\varphi_{\Delta t}$ corresponds to the transformation over one time step, $q_{\Delta t}$ is an $n \times m$ matrix and $U_{r+1}$ is an $m \times 1$ Gaussian white noise sequence with mean values equal to zero and variances equal to one (the variances may be included in the values of the elements of $q_{\Delta t}$). The observation model is the same as in the case of the continuous-discrete model

$Y_r = h X_r + V_r, \qquad r = 1, 2, \ldots$   (10.4)
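
A minimal Octave/Matlab sketch of the discrete model (10.3) together with the observations (10.4) may look as follows. The matrices phi, q, h and the noise variance are values assumed only for this illustration; they are not the matrices produced by the script files listed at the end of this section.

% simulation of the discrete model (10.3) and the observations (10.4)
phi  = [1 0.1; -0.4 0.96];    % assumed 2x2 one-step transition matrix
q    = [0; 0.05];             % assumed 2x1 noise input matrix, m = 1
h    = [1 0];                 % assumed observation matrix: the first component is measured
sigV = 0.02;                  % assumed standard deviation of the observation noise
N = 200;
X = zeros(2, N);              % zero initial condition
for r = 1:N-1
  X(:,r+1) = phi*X(:,r) + q*randn;   % state equation (10.3), unit-variance U
end
Y = h*X + sigV*randn(1, N);          % observation equation (10.4)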

In the example of a twice differentiable process with a dominant frequency the matrices $X_r$ and $h$ are

$X_r^T = \left[\, X_{0r} \;\; Y_{0r} \;\; X_{1r} \;\; Y_{1r} \;\; X_{2r} \;\; Y_{2r} \,\right]$   (10.5)
$h = \left[\, 0 \;\; 0 \;\; 0 \;\; 0 \;\; 1 \;\; 0 \,\right]$

For two events $A$ and $B$ the conditional probability $\Pr\{A|B\}$ of the event $A$ given the event $B$ is

$\Pr\{A|B\} = \dfrac{\Pr\{AB\}}{\Pr\{B\}}, \qquad \Pr\{B\} > 0$   (10.6)
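
A simple numerical illustration (not connected with the script files below): for a fair die let $A$ be the event that the result is even and $B$ the event that the result is at least 4. Then

$\Pr\{AB\} = \Pr\{4, 6\} = 2/6, \qquad \Pr\{B\} = 3/6, \qquad \Pr\{A|B\} = \dfrac{2/6}{3/6} = \dfrac{2}{3}$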

Let us now outline the procedure of the Kalman filter formulation. The discussion will be based on the example of a twice differentiable discrete random process and the corresponding observation model.

Let us assume that at the time $t_r = r\Delta t$ we have a good estimate of the investigated realization of the random vector $X_r$, denoted by $\hat{X}_{r|r}$ and based on all observations up to the step $r$. Now we want to compute the expected value at the time $t_{r+1} = (r+1)\Delta t$ based on the data from the observation model for the time steps $0, 1, \ldots, r$. This value may be calculated from the mathematical model as the conditional expectation, with the conditional probability defined in relation (10.6). Thus

$\hat{X}_{r+1|r} = E\left[\, X_{r+1} \mid Y_r \,\right] = \varphi_{\Delta t} \hat{X}_{r|r}$   (10.7)

Let us now compute the variance (covariance matrix) of the prediction error

$P_{r+1|r} = E\left[\, (X_{r+1} - \hat{X}_{r+1|r})(X_{r+1} - \hat{X}_{r+1|r})^T \mid Y_r \,\right] = E\left[\, \left(\varphi_{\Delta t}(X_r - \hat{X}_{r|r}) + q_{\Delta t} U_{r+1}\right)\left(\varphi_{\Delta t}(X_r - \hat{X}_{r|r}) + q_{\Delta t} U_{r+1}\right)^T \mid Y_r \,\right]$

Since $U_{r+1}$ is independent of the estimation error $X_r - \hat{X}_{r|r}$, the cross terms vanish and the final result is

$P_{r+1|r} = \varphi_{\Delta t} P_{r|r} \varphi_{\Delta t}^T + q_{\Delta t} q_{\Delta t}^T$   (10.8)
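
In Octave/Matlab notation the prediction step, relations (10.7) and (10.8), may be sketched in two lines. Here phi and q stand for $\varphi_{\Delta t}$ and $q_{\Delta t}$, while Xest and P denote the estimate and the error covariance from the previous step; the variable names are chosen only for this illustration.

% prediction step
Xpred = phi*Xest;              % predicted state estimate, relation (10.7)
Ppred = phi*P*phi' + q*q';     % predicted error covariance, relation (10.8)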

Up to now we have discussed the predictions for the time $t_{r+1} = (r+1)\Delta t$ based on the mathematical model and the observations up to $t_r = r\Delta t$.

To compute the estimate $\hat{X}_{r+1|r+1}$ the observation at the time $t_{r+1} = (r+1)\Delta t$ has to be taken into account. It was proved by Kalman that the influence of this observation leads to the following relation

$\hat{X}_{r+1|r+1} = \hat{X}_{r+1|r} + K_{\Delta t} \left( Y_{r+1} - h \hat{X}_{r+1|r} \right)$   (10.9)

where $K_{\Delta t}$ is the Kalman gain given by the following relation

$K_{\Delta t} = P_{r+1|r} h^T \left( h P_{r+1|r} h^T + R \right)^{-1}$   (10.10)

where $R$ is the diagonal matrix of the variances of the white noise observation sequence $V_r$ in relation (10.4).

From the physics of the problem it is clear that the prediction has to be supplemented by a measure of the expected observation error. In our example there is only one observation and it corresponds to the measurement of $X_{2r}$. In general the matrix $R$ is a set of numbers that has to be related to the variances of the measured random processes; for example, the standard deviation of the observation noise may be taken as 5% of the standard deviation of $X_{2r}$.
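
For the single observation of $X_{2r}$ the matrix $R$ reduces to a scalar; with the 5% rule quoted above it may be set, for example, as follows (sigX2 stands for the standard deviation of $X_{2r}$, known from the model or estimated from the data; the name is only illustrative):

% observation noise variance: standard deviation assumed equal to 5% of that of X2r
R = (0.05*sigX2)^2;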

The procedure must be completed by the estimation of the variance of the prediction error when the observation at the time $t_{r+1}$ is taken into account

$P_{r+1|r+1} = P_{r+1|r} - K_{\Delta t} h P_{r+1|r}$   (10.11)
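
The corresponding Octave/Matlab sketch of the update step, relations (10.9)-(10.11), may read as follows; Ynew denotes the observation taken at the time $t_{r+1}$, and the variable names are again only illustrative.

% update step
K    = Ppred*h'/(h*Ppred*h' + R);    % Kalman gain, relation (10.10)
Xest = Xpred + K*(Ynew - h*Xpred);   % updated state estimate, relation (10.9)
P    = Ppred - K*h*Ppred;            % updated error covariance, relation (10.11)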

This procedure yields all the necessary initial values for the next step: $P_{r+1|r+1}$ and $\hat{X}_{r+1|r+1}$. It is now possible to follow the outlined procedure and calculate the values of the next step: $P_{r+2|r+1}$, $\hat{X}_{r+2|r+1}$, $P_{r+2|r+2}$, $\hat{X}_{r+2|r+2}$. Taking consecutive steps, the computation of the whole realization may be completed. It is only necessary to have the initial values at the initial time $t=0$. If no information is available it is reasonable to assume $\hat{X}_{0|0} = 0$ and to take $P_{0|0}$ equal to the asymptotic covariance matrix of the process. This means that the initial estimate is equal to the expected value and the variance of the initial error is equal to the asymptotic variance of the process.
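
Putting the two steps together, a minimal sketch of the complete filter loop may look as follows. The matrices phi, q, h, R and the observation sequence Y are assumed to be already defined (for instance as in the sketches above), and P0 stands for the asymptotic covariance matrix of the process; the script files listed below contain the full, worked-out versions.

% complete Kalman filter recursion for the model (10.3)-(10.4)
n = size(phi, 1);
N = size(Y, 2);                       % number of observation steps
Xfilt = zeros(n, N);                  % storage for the filtered estimates
Xest  = zeros(n, 1);                  % initial estimate equal to the expected value
P     = P0;                           % initial error covariance: asymptotic covariance of X
for r = 1:N
  Xpred = phi*Xest;                   % prediction, relations (10.7)-(10.8)
  Ppred = phi*P*phi' + q*q';
  K     = Ppred*h'/(h*Ppred*h' + R);  % update with Y(:,r), relations (10.9)-(10.11)
  Xest  = Xpred + K*(Y(:,r) - h*Xpred);
  P     = Ppred - K*h*Ppred;
  Xfilt(:,r) = Xest;                  % store the filtered estimate
end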

Numerical examples

Example 1
The script file pwsemh04 generates example realizations of a twice differentiable process with a dominant frequency, computes the matrices needed for the Kalman filter and applies the filter to the realizations.

Download
Scilab: pwsemh04.sci
Octave/Matlab: pwsemh04.m

Example 2
The script file pwsemi04 is a program that illustrates a random function and the corresponding estimated random sequence.

Download
Scilab: pwsemi04.sci
Octave/Matlab: pwsemi04.m

Example 3
The script file pwsemj04 is a program that illustrates a random function together with the corresponding estimated random sequence and its derivatives.

Download
Scilab: pwsemj04.sci
Octave/Matlab: pwsemj04.m