Saturday, December 29, 2012

Conditioning a multivariate normal vector on partial, noisy gaussian evidence


'cause I'm tired of re-deriving this stuff!

Lemma 1

Suppose we partition the mean and covariance of a gaussian vector \(y\) as $$ y = \left[ \begin{array}{c} y_1 \\ y_2 \end{array} \right] \sim N \left( \left[ \begin{array}{c} u_1 \\ u_2 \end{array} \right] , \left[ \begin{array}{cc} C_{11} & C_{12} \\ C_{21} & C_{22} \end{array} \right] \right) $$ then the distribution of the latter part of the vector \(y_2\) conditioned on the former taking a known value \(y_1=a_1\) is multivariate gaussian with mean and variance indicated below: $$ y_2 \mid y_1=a_1 \sim N \left( u_2 + C_{21} C_{11}^{-1}(a_1-u_1), C_{22}-C_{21}C_{11}^{-1}C_{12} \right) $$ We note this here for convenience. See Wikipedia notes on conditional multivariate gaussian, for example.
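For reference, a minimal numpy sketch of Lemma 1; the function name and argument layout are my own, nothing standard.

```python
import numpy as np

def condition_on_first_block(u1, u2, C11, C12, C21, C22, a1):
    """Mean and covariance of y2 given y1 = a1, per Lemma 1."""
    K = C21 @ np.linalg.inv(C11)      # C21 C11^{-1}
    mean = u2 + K @ (a1 - u1)
    cov = C22 - K @ C12
    return mean, cov
```

In practice one would solve against \(C_{11}\) (or use a Cholesky factor) rather than form the explicit inverse; the inverse is kept here only to mirror the formula.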

Lemma 2a

Suppose \(x\) is partitioned as $$ x = \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right] \sim N \left( \left[ \begin{array}{c} \mu_1 \\ \mu_2 \end{array} \right] , \left[ \begin{array}{cc} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{array} \right] \right) $$ If we hit \(x\) with a linear transformation and also subtract an independent gaussian random variable, viz: $$ y = \left[ \begin{array}{cc} I & 0 \\ I & 0 \\ 0 & I \end{array} \right] x - \left[ \begin{array}{c} \eta \\ 0 \\ 0 \end{array} \right] $$ where \(\eta\) is a multivariate normal random vector, independent of \(x\), $$ \eta \sim N \left(\tilde{\mu}_1 - \tilde{a}_1, \tilde{\Sigma}_{11} \right) $$ with offset mean parameter \( \tilde{\mu}_1\) and variance \(\tilde{\Sigma}_{11}\) (the use of the offset \(\tilde{a}_1\) will become apparent), then by properties of affine transformations (see Wikipedia again) and sums of independent multivariate normal random variables we have $$ y \sim N\left( \left[ \begin{array}{c} \mu_1 - \tilde{\mu}_1 + \tilde{a}_1 \\ \mu_1 \\ \mu_2 \end{array} \right], \left[ \begin{array}{ccc} \Sigma_{11} + \tilde{\Sigma}_{11} & \Sigma_{11} & \Sigma_{12} \\ \Sigma_{11} & \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{21} & \Sigma_{22} \end{array} \right] \right) $$
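As a quick sanity check of the covariance block structure (my own scaffolding, not part of the derivation), one can build the affine map explicitly and compare:

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2 = 2, 3
M = rng.standard_normal((d1 + d2, d1 + d2))
Sigma = M @ M.T                                   # an arbitrary prior covariance for x
Sigma11_tilde = np.eye(d1)                        # covariance of eta

A = np.block([[np.eye(d1), np.zeros((d1, d2))],
              [np.eye(d1), np.zeros((d1, d2))],
              [np.zeros((d2, d1)), np.eye(d2)]])  # the linear map applied to x

# Cov(y) = A Sigma A' + Cov([eta; 0; 0]), since eta is independent of x
cov_eta = np.zeros((2 * d1 + d2, 2 * d1 + d2))
cov_eta[:d1, :d1] = Sigma11_tilde
cov_y = A @ Sigma @ A.T + cov_eta

S11, S12 = Sigma[:d1, :d1], Sigma[:d1, d1:]
S21, S22 = Sigma[d1:, :d1], Sigma[d1:, d1:]
claimed = np.block([[S11 + Sigma11_tilde, S11, S12],
                    [S11, S11, S12],
                    [S21, S21, S22]])
assert np.allclose(cov_y, claimed)
```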

Lemma 2b

With \(x,y\) as in Lemma 2a, let us reuse the notation \(x_1\) and \(x_2\) for the second and third blocks of \(y\), since the transformation evidently leaves them untouched, and write \(\tilde{x}_1\) for the part that is changed: $$ y = \left[ \begin{array}{c} \tilde{x}_1 \\ x_1 \\ x_2 \end{array} \right] $$ But now if we condition on \(\tilde{x}_1=\tilde{a}_1\) then the conditional distribution of the second and third parts of \(y\) will of course change. By application of Lemma 1 it is gaussian with mean and covariance given by $$ \left[ \begin{array}{c} x_1^{+} \\ x_2^{+} \end{array} \right] \sim N\left( \left[ \begin{array}{c} \mu_1 \\ \mu_2 \end{array} \right] + \left[ \begin{array}{c} \Sigma_{11} \\ \Sigma_{21} \end{array} \right] \left( \Sigma_{11} + \tilde{\Sigma}_{11} \right)^{-1} \left( \tilde{\mu}_1 - \mu_1 \right), \ \Sigma - \left[\begin{array}{c} \Sigma_{11} \\ \Sigma_{21} \end{array} \right] \left( \Sigma_{11} + \tilde{\Sigma}_{11} \right)^{-1} \left[ \begin{array}{cc} \Sigma_{11} & \Sigma_{12} \end{array} \right] \right) $$ where \(\Sigma\) denotes the full prior covariance of \(x\).
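Here is a minimal numpy sketch of the Lemma 2b update in the notation above; the function name, and the convention that the observed block comes first with dimension d1, are assumptions of mine.

```python
import numpy as np

def noisy_evidence_update(mu, Sigma, d1, mu1_tilde, Sigma11_tilde):
    """Posterior mean and covariance of x after conditioning x1 on noisy
    evidence with mean mu1_tilde and noise covariance Sigma11_tilde."""
    cols = Sigma[:, :d1]                          # [Sigma11; Sigma21]
    S = Sigma[:d1, :d1] + Sigma11_tilde           # Sigma11 + Sigma11_tilde
    gain = cols @ np.linalg.inv(S)                # Kalman-style gain
    mu_post = mu + gain @ (mu1_tilde - mu[:d1])
    Sigma_post = Sigma - gain @ cols.T            # cols.T = [Sigma11  Sigma12]
    return mu_post, Sigma_post
```

This coincides with the Kalman filter measurement update for observation matrix \(H = [I \; 0]\), observation \(\tilde{\mu}_1\) and noise covariance \(\tilde{\Sigma}_{11}\).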

Interpretation of Lemmas 2a and 2b

The formulas have the following interpretation. Suppose we assume \(x\) has some prior distribution $$ x = \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right] \sim N \left( \left[ \begin{array}{c} \mu_1 \\ \mu_2 \end{array} \right] , \left[ \begin{array}{cc} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{array} \right] \right) $$ and we now condition on \(x_1 -\nu = 0\) where \( \nu \sim N \left(\tilde{\mu}_1, \tilde{\Sigma}_{11} \right) \). Then the posterior distribution of \(x = [x_1\ x_2]'\) is precisely the multivariate gaussian vector \(x^{+} = [x^{+}_1 \ x^{+}_2]'\) given above. This is the posterior that arises when one part of the vector is conditioned not on a precise vector of values but on a noisy one, as in Kalman filtering. In the calculation above we conditioned on \(x_1 - \eta = \tilde{a}_1\) where \( \eta \sim N \left(\tilde{\mu}_1 - \tilde{a}_1, \tilde{\Sigma}_{11} \right) \), but that is evidently the same thing as conditioning on \(x_1 = \nu \) where \( \nu \sim N \left(\tilde{\mu}_1 , \tilde{\Sigma}_{11} \right) \).
We remark that if \(\tilde{\Sigma}_{11}\) is small relative to \(\Sigma_{11}\) then this more or less forces one part of \(x\) to adopt the new mean and variance \(\tilde{\mu}_1,\tilde{\Sigma}_{11}\). For example, in the scalar case with \( \Sigma_{11} = \sigma_{11} \) and \( \tilde{\Sigma}_{11} = \tilde{\sigma}_{11} \) the posterior mean of \(x_1\) is \(\tilde{\mu}_1- \frac{\tilde{\sigma}_{11}}{\sigma_{11}+\tilde{\sigma}_{11}}(\tilde{\mu}_1-\mu_1) \), and this tends to the new, prescribed mean \(\tilde{\mu}_1\) as \( \tilde{\sigma}_{11}/\sigma_{11}\) tends to zero.
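A quick scalar check of that limit, with arbitrarily chosen numbers:

```python
# As sigma11_tilde / sigma11 -> 0, the posterior mean of x1 approaches mu1_tilde.
mu1, sigma11, mu1_tilde = 0.0, 1.0, 3.0
for sigma11_tilde in (1.0, 0.1, 0.001):
    post_mean = mu1_tilde - sigma11_tilde / (sigma11 + sigma11_tilde) * (mu1_tilde - mu1)
    print(sigma11_tilde, post_mean)   # 1.5, then 2.7272..., then 2.9970...
```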
