
Saturday, December 29, 2012

Conditioning a multivariate normal vector on partial, noisy gaussian evidence


'cause I'm tired of re-deriving this stuff!

Lemma 1

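Suppose we partition the mean and covariance of a gaussian vector $y$ as
$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \sim N\left( \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \right)$$
Then the distribution of the latter part of the vector, $y_2$, conditioned on the former taking a known value $y_1 = a_1$, is multivariate gaussian with mean and variance indicated below:
$$y_2 \mid y_1 = a_1 \sim N\left( u_2 + C_{21} C_{11}^{-1} (a_1 - u_1),\; C_{22} - C_{21} C_{11}^{-1} C_{12} \right)$$
We note this here for convenience. See Wikipedia notes on conditional multivariate gaussian, for example.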
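Since the point of writing this down is to stop re-deriving it, here is a minimal numpy sketch of Lemma 1. The function name `condition_gaussian` and the argument `k` (the length of the first block) are my own choices for illustration, not anything from a library.

```python
import numpy as np

def condition_gaussian(u, C, k, a1):
    """Condition N(u, C) on the first k coordinates equalling a1 (Lemma 1).

    Returns the mean and covariance of the remaining coordinates.
    """
    u = np.asarray(u, dtype=float)
    C = np.asarray(C, dtype=float)
    a1 = np.asarray(a1, dtype=float)
    u1, u2 = u[:k], u[k:]
    C11, C12 = C[:k, :k], C[:k, k:]
    C21, C22 = C[k:, :k], C[k:, k:]
    # gain = C21 @ inv(C11), computed with a linear solve instead of an explicit inverse.
    gain = np.linalg.solve(C11.T, C21.T).T
    mean = u2 + gain @ (a1 - u1)
    cov = C22 - gain @ C12
    return mean, cov

# Two correlated scalars: observe the first, update the second.
mean, cov = condition_gaussian(u=[0.0, 1.0], C=[[2.0, 1.0], [1.0, 2.0]], k=1, a1=[1.0])
print(mean, cov)
```

In this example the gain is 1/2, so the conditional mean of the second coordinate moves from 1 to 1.5 and its variance drops from 2 to 1.5.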

Lemma 2a

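Suppose $x$ is partitioned as
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)$$
If we hit $x$ with a linear transformation and also add another gaussian random variable, viz:
$$y = \begin{bmatrix} I & 0 \\ I & 0 \\ 0 & I \end{bmatrix} x - \begin{bmatrix} \eta \\ 0 \\ 0 \end{bmatrix}$$
where $\eta$ is the multivariate normal random vector
$$\eta \sim N(\tilde\mu_1 - \tilde a_1, \tilde\Sigma_{11})$$
independent of $x$, with offset mean parameter $\tilde\mu_1$ and variance $\tilde\Sigma_{11}$ (and the use of the offset $\tilde a_1$ will become apparent), then by properties of affine transformation (see Wikipedia again) and addition of multivariate random variables we have
$$y \sim N\left( \begin{bmatrix} \mu_1 - \tilde\mu_1 + \tilde a_1 \\ \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} + \tilde\Sigma_{11} & \Sigma_{11} & \Sigma_{12} \\ \Sigma_{11} & \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)$$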
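A quick numerical check of the block structure, as a sketch only. The block sizes, the random seed and all variable names are arbitrary choices of mine, and $\eta$ is assumed independent of $x$ and subtracted from the first block.

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 2, 3                      # sizes of x1 and x2 (arbitrary)
n = k + m

# Random symmetric positive definite prior covariance and noise covariance.
G = rng.standard_normal((n, n))
Sigma = G @ G.T + n * np.eye(n)
H = rng.standard_normal((k, k))
Sigma_tilde = H @ H.T + k * np.eye(k)

mu = rng.standard_normal(n)
mu_tilde = rng.standard_normal(k)
a_tilde = rng.standard_normal(k)

# Stacking matrix [[I, 0], [I, 0], [0, I]].
A = np.block([
    [np.eye(k), np.zeros((k, m))],
    [np.eye(k), np.zeros((k, m))],
    [np.zeros((m, k)), np.eye(m)],
])

# The noise eta only touches the first block of y.
noise_mean = np.concatenate([mu_tilde - a_tilde, np.zeros(k + m)])
noise_cov = np.zeros((2 * k + m, 2 * k + m))
noise_cov[:k, :k] = Sigma_tilde

# Affine transformation of x minus independent noise.
y_mean = A @ mu - noise_mean
y_cov = A @ Sigma @ A.T + noise_cov

# Closed-form blocks to compare against.
S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
S21, S22 = Sigma[k:, :k], Sigma[k:, k:]
expected_mean = np.concatenate([mu[:k] - mu_tilde + a_tilde, mu[:k], mu[k:]])
expected_cov = np.block([
    [S11 + Sigma_tilde, S11, S12],
    [S11, S11, S12],
    [S21, S21, S22],
])
print(np.allclose(y_mean, expected_mean), np.allclose(y_cov, expected_cov))
```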

Lemma 2b

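With $x, y$ as in Lemma 2a, let us reuse notation for the second and third blocks of $y$, i.e. $x_1$ and $x_2$, since the transformation evidently leaves them untouched. We write $\tilde x_1$ for the part that is changed:
$$y = \begin{bmatrix} \tilde x_1 \\ x_1 \\ x_2 \end{bmatrix}$$
But now if we condition on $\tilde x_1 = \tilde a_1$ then the conditional distribution of the second and third parts of $y$ will of course change. It is gaussian by application of Lemma 1 and has mean and covariance given by
$$\begin{bmatrix} x_1^+ \\ x_2^+ \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \Sigma_{11} \\ \Sigma_{21} \end{bmatrix} (\Sigma_{11} + \tilde\Sigma_{11})^{-1} (\tilde\mu_1 - \mu_1),\; \Sigma - \begin{bmatrix} \Sigma_{11} \\ \Sigma_{21} \end{bmatrix} (\Sigma_{11} + \tilde\Sigma_{11})^{-1} \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \end{bmatrix} \right)$$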
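The update in Lemma 2b is short enough to keep lying around as code. This is a sketch under the lemma's assumptions (the noise is independent of $x$); the function name `condition_on_noisy_block` and its arguments are hypothetical names of mine.

```python
import numpy as np

def condition_on_noisy_block(mu, Sigma, k, mu_tilde, Sigma_tilde):
    """Posterior of x ~ N(mu, Sigma) given noisy evidence on its first k coordinates.

    The evidence has mean mu_tilde and covariance Sigma_tilde (Lemma 2b).
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    mu_tilde = np.asarray(mu_tilde, dtype=float)
    Sigma_tilde = np.asarray(Sigma_tilde, dtype=float)
    B = Sigma[:, :k]                          # stacked [Sigma11; Sigma21]
    S = Sigma[:k, :k] + Sigma_tilde           # Sigma11 + Sigma_tilde11
    gain = np.linalg.solve(S.T, B.T).T        # B @ inv(S) without an explicit inverse
    post_mean = mu + gain @ (mu_tilde - mu[:k])
    post_cov = Sigma - gain @ Sigma[:k, :]    # subtract gain @ [Sigma11 Sigma12]
    return post_mean, post_cov

prior_mu = [0.0, 0.0]
prior_Sigma = [[1.0, 0.5], [0.5, 1.0]]

# Evidence about x1 that is exactly as uncertain as the prior on x1.
post_mean, post_cov = condition_on_noisy_block(prior_mu, prior_Sigma, 1, [1.0], [[1.0]])

# Nearly exact evidence: the update approaches ordinary conditioning (Lemma 1).
sharp_mean, sharp_cov = condition_on_noisy_block(prior_mu, prior_Sigma, 1, [1.0], [[1e-12]])
print(post_mean, sharp_mean)
```

With equally uncertain evidence the posterior mean of $x_1$ lands halfway between prior and evidence, and $x_2$ moves by half as much again because of the 0.5 covariance.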

Interpretation of Lemmas 2a and 2b

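The formulas have the following interpretation. Suppose we assume $x$ has some prior distribution
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)$$
and we now condition on $x_1 - \nu = 0$ where $\nu \sim N(\tilde\mu_1, \tilde\Sigma_{11})$. Then the posterior distribution of $x = [x_1\; x_2]$ is precisely the multivariate gaussian vector $x^+ = [x_1^+\; x_2^+]$ given above. It has the interpretation of a posterior distribution when one part of the vector is conditioned not on a precise vector of values, but a noisy one as with Kalman filtering. In the calculation above we conditioned on $x_1 - \eta = \tilde a_1$ where $\eta \sim N(\tilde\mu_1 - \tilde a_1, \tilde\Sigma_{11})$, but that is evidently the same thing as conditioning on $x_1 = \nu$ where $\nu \sim N(\tilde\mu_1, \tilde\Sigma_{11})$.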
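We remark that if $\tilde\Sigma_{11}$ is relatively small with respect to $\Sigma_{11}$ then this more or less forces one part of $x$ to adopt a new mean and variance $\tilde\mu_1, \tilde\Sigma_{11}$. For example in the scalar case with $\Sigma_{11} = \sigma_{11}$ and $\tilde\Sigma_{11} = \tilde\sigma_{11}$ the posterior mean of $x_1$ is
$$\mu_1 + \frac{1}{1 + \tilde\sigma_{11}/\sigma_{11}} (\tilde\mu_1 - \mu_1)$$
and this tends to the new, prescribed mean $\tilde\mu_1$ as $\tilde\sigma_{11}/\sigma_{11}$ tends to zero.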
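To see the scalar limit numerically, here is a tiny sketch using the scalar case of the Lemma 2b mean, $\mu_1 + (\tilde\mu_1 - \mu_1)/(1 + \tilde\sigma_{11}/\sigma_{11})$. The function name and the example numbers are mine.

```python
def posterior_mean_scalar(mu1, sigma11, mu1_tilde, sigma11_tilde):
    """Scalar posterior mean of x1 after conditioning on noisy evidence."""
    ratio = sigma11_tilde / sigma11
    return mu1 + (mu1_tilde - mu1) / (1.0 + ratio)

# Prior mean 0 and variance 1, evidence mean 2, with shrinking evidence variance.
for sigma11_tilde in (1.0, 0.1, 0.001):
    print(sigma11_tilde, posterior_mean_scalar(0.0, 1.0, 2.0, sigma11_tilde))
```

As the evidence variance shrinks relative to the prior variance, the printed posterior mean climbs from the midpoint towards the prescribed mean of 2.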

I'm missing something about contragredient transformations and portfolio return

This entry probably just indicates that I am missing something obvious about portfolio return.
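The instantaneous growth rate $\gamma^\pi_t$ of a portfolio with weights $\pi$, that is the drift of $\log Z_t$, decomposes as
$$\gamma^\pi_t = \pi_t^T \gamma_t + \underbrace{\frac{1}{2}\Big( \pi_t^T \mathrm{diag}(\xi\xi^T) - \underbrace{\pi_t^T \xi\xi^T \pi_t}_{\text{portfolio variance}} \Big)}_{\text{excess return process}}$$
where $\gamma^i_t$ is the drift of $\log(X^i_t)$, the log of the i'th asset, and the i'th row of $\xi$ represents the factor decomposition of the diffusion into $n$ independent Brownian motions comprising a vector $dW_t$:
$$d(\log(X^i_t)) = \gamma^i_t\,dt + \xi_i\,dW_t$$
with $\xi_i$ denoting the i'th row of $\xi$. Consider the map taking vectors $x \mapsto y = \exp(C\log(x))$ and thereby defining a new asset vector $Y_t$. That is, $Y_t$ is related to $X_t$ by a simple matrix multiplication in log coordinates.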
We consider portfolios of the new assets $Y^i$ with weights $\varpi_i$ say. And again, the portfolio return is related to the asset return via
$$\frac{dZ^\varpi}{Z^\varpi} = \varpi^T \frac{dY_t}{Y_t}$$
where the fractions indicate coordinate-wise (i.e. pointwise) division. Let us suppose further that any instantaneous portfolio return using assets $\{X^i\}$ can be replicated by a portfolio using assets $\{Y^i\}$ only, and vice versa:
$$\varpi^T \frac{dY_t}{Y_t} = \frac{dZ^\varpi}{Z^\varpi} = \frac{dZ^\pi}{Z^\pi} = \pi^T \frac{dX_t}{X_t}$$
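By a slightly heavy handed application of the multivariate Ito's Lemma (hey it's good to have it lying around) with $g(x) = Cx$ we have
$$\begin{aligned} d(\log Y_t) &= \frac{\partial g}{\partial t}\,dt + (\nabla g)\, d(\log(X_t)) + \frac{1}{2} (d(\log X_t))^T (\nabla^2 g) (d(\log X_t)) \\ &= 0 + C\, d(\log(X_t)) + 0 \\ &= C\gamma\,dt + C\xi\,dW_t \end{aligned}$$
so writing the decomposition of the portfolio growth rate with $C\gamma$ in place of $\gamma$ and $C\xi$ in place of $\xi$ we observe
$$\begin{aligned} \gamma^\varpi_t &= \varpi_t^T C\gamma_t + \frac{1}{2}\left( \varpi_t^T \mathrm{diag}(C\xi\xi^T C^T) - \varpi_t^T C\xi\xi^T C^T \varpi_t \right) \\ &= \pi_t^T \gamma_t + \frac{1}{2}\left( \pi_t^T C^{-1} \mathrm{diag}(C\xi\xi^T C^T) - \pi_t^T \xi\xi^T \pi_t \right) \end{aligned}$$
if we use contragredient weights $\varpi_t = (C^T)^{-1}\pi_t$. But does the contragredient choice actually result in the same drift? The linear and portfolio variance terms are the same, but on the other hand, in general,
$$\pi_t^T C^{-1} \mathrm{diag}(C\xi\xi^T C^T) \neq \pi_t^T \mathrm{diag}(\xi\xi^T)$$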
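The mismatch is easy to reproduce numerically. The sketch below uses a two-asset example whose numbers ($C$, $\xi$, $\gamma$, $\pi$) are arbitrary choices of mine, picked only so the three terms can be checked by hand.

```python
import numpy as np

gamma = np.array([0.1, 0.2])             # log-drifts of the original assets X
xi = np.eye(2)                           # diffusion matrix of log X
C = np.array([[1.0, 0.0], [1.0, 2.0]])   # log-coordinate transformation X -> Y
pi = np.array([0.5, 0.5])                # weights on the X assets

# Contragredient weights on the Y assets: varpi = (C^T)^{-1} pi.
varpi = np.linalg.solve(C.T, pi)

cov_x = xi @ xi.T                        # log-covariance of X
cov_y = C @ cov_x @ C.T                  # log-covariance of Y

# The three ingredients of the growth-rate decomposition, in each coordinate system.
linear_x, linear_y = pi @ gamma, varpi @ (C @ gamma)
var_x, var_y = pi @ cov_x @ pi, varpi @ cov_y @ varpi
diag_x, diag_y = pi @ np.diag(cov_x), varpi @ np.diag(cov_y)

print("linear term:       ", linear_x, linear_y)
print("portfolio variance:", var_x, var_y)
print("weighted variances:", diag_x, diag_y)
```

The linear terms and the portfolio variances agree across the two coordinate systems, while the weighted sum of asset variances does not, which is exactly the term that causes the puzzle.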



When A Bar-Bell Bond Portfolio Optimizes Modified Excess Return

Here's a little curiosity I might get around to publishing some day: a bar-bell portfolio maximizes modified excess return. At first I thought it maximized excess return (in the sense of Stochastic Portfolio Theory) but felt that couldn't be right. Sure enough I made a mistake and thus was born "modified excess return", a shamelessly reverse engineered criterion to make what follows work.

A simple model for zero coupon bond dynamics

Assume a lattice of zero coupon bonds with prices $B_i(t) = B(t; t+\tau_i)$ and integer times to maturity $\tau_i = i$ as $i$ ranges from 1 to $n$ years. We assume that all bonds are priced off the same piecewise constant forward curve with knot points also at integer years. We write
$$B_i(t) = \exp\left( -\int_t^{t+i} f(t,s)\,ds \right)$$
and assume further that the changes in forward rates $f(t,s)$ at time $t$ for different years are independent. We presume the forward rates are driven by standard Brownian motion with the same standard deviation $\eta$. They may also have non-trivial drift but here it suffices to observe that the vector of bonds has dynamics given by
$$d\begin{bmatrix} \log B_1(t) \\ \log B_2(t) \\ \vdots \\ \log B_n(t) \end{bmatrix} = \begin{bmatrix} \gamma_1(t) \\ \gamma_2(t) \\ \vdots \\ \gamma_n(t) \end{bmatrix} dt + \eta \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} dW_1(t) \\ dW_2(t) \\ \vdots \\ dW_n(t) \end{bmatrix}$$
or more succinctly
$$d(\log B) = \gamma\,dt + \eta J\,dW$$
for scalar constant $\eta$, an $n$ by $n$ matrix $J$ (implicitly defined by the above) and some drift coefficients $\gamma$ that we don't care too much about in this particular exercise.
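As a small sketch of this model, the snippet below builds $J$ and the instantaneous covariance $\eta^2 JJ^T$ of the log bond prices in plain Python. It assumes $J$ is the lower-triangular matrix of ones, and the function names are mine.

```python
def lower_triangular_ones(n):
    """The n-by-n matrix J with ones on and below the diagonal."""
    return [[1.0 if j <= i else 0.0 for j in range(n)] for i in range(n)]

def log_bond_covariance(n, eta):
    """Instantaneous covariance eta^2 * J J^T of d(log B) in the lattice model."""
    J = lower_triangular_ones(n)
    return [
        [eta ** 2 * sum(J[i][k] * J[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

cov = log_bond_covariance(4, eta=0.01)
for row in cov:
    print(row)
```

Each entry counts how many yearly forward rates two bonds share, so the covariance between bonds $i$ and $j$ comes out as $\eta^2$ times the shorter of the two maturities.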

Defining modified excess return for a bond portfolio

We consider a portfolio of these bonds with weights $\pi$ summing to unity. By analogy with Stochastic Portfolio Theory we consider a modified excess return given by
$$\text{modified excess return} = \sum_{i=1}^n \pi_i \sigma_{ii} - 2\sum_{i,j=1}^n \pi_i \pi_j \sigma_{ij} + \sum_{i=1}^n \pi_i^2 \sigma_{ii}$$
where, following Stochastic Portfolio Theory notation, $\sigma_{ij}$ is the log-asset covariance, here equal to $\eta^2$ multiplied by the $i,j$'th element of $JJ^T$. We make no statement as to what modified excess return represents, except to compare it to
$$\text{excess return} = \sum_{i=1}^n \pi_i \sigma_{ii} - \sum_{i,j=1}^n \pi_i \pi_j \sigma_{ij}$$
which is most certainly meaningful. Indeed, the log-optimal investor may seek to maximize excess return. In contrast, the modified excess return makes the covariance term more important so one might reason that, all else being equal, choosing this modification over the bona fide excess return represents a sacrifice of long term growth in exchange for reduced portfolio variance - though the difference picks up the between-asset terms only, not the variances.
$$\text{modified excess return} - \text{excess return} = -\sum_{i,j=1}^n \pi_i \pi_j \sigma_{ij} + \sum_{i=1}^n \pi_i^2 \sigma_{ii} = -\sum_{i \neq j} \pi_i \pi_j \sigma_{ij}$$
A similar tradeoff is made, also inadvertently, by those constructing minimum variance portfolios in the tradition of Markowitz and de Finetti. In the minimum variance prescription only the portfolio variance term $\sum_{i,j=1}^n \pi_i \pi_j \sigma_{ij}$ is contemplated, not $\sum_{i=1}^n \pi_i \sigma_{ii}$.
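The two definitions and their difference can be checked with a few lines of plain Python. This is a sketch only: the function names are mine, the covariance matrix is an arbitrary symmetric example rather than the bond model, and the double sums run over all ordered pairs $(i,j)$.

```python
def weighted_variance(pi, sigma):
    """sum_i pi_i * sigma_ii"""
    return sum(p * sigma[i][i] for i, p in enumerate(pi))

def portfolio_variance(pi, sigma):
    """sum over all ordered pairs (i, j) of pi_i * pi_j * sigma_ij"""
    n = len(pi)
    return sum(pi[i] * pi[j] * sigma[i][j] for i in range(n) for j in range(n))

def excess_return(pi, sigma):
    return weighted_variance(pi, sigma) - portfolio_variance(pi, sigma)

def modified_excess_return(pi, sigma):
    squared = sum(p ** 2 * sigma[i][i] for i, p in enumerate(pi))
    return weighted_variance(pi, sigma) - 2.0 * portfolio_variance(pi, sigma) + squared

# Arbitrary symmetric example covariance and weights summing to one.
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
pi = [0.2, 0.3, 0.5]

gap = modified_excess_return(pi, sigma) - excess_return(pi, sigma)
off_diagonal = sum(pi[i] * pi[j] * sigma[i][j]
                   for i in range(3) for j in range(3) if i != j)
print(gap, -off_diagonal)
```

The two printed numbers agree, confirming that the modification only changes the weight given to the between-asset covariance terms.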

Maximizing the modified excess return

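Whatever the modification contemplated may imply, we proceed towards its surprising implication by mentally multiplying $J$ above and noting that $(JJ^T)_{i,j} = \min(i,j)$ because
$$\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ 1 & 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 0 & 1 & \cdots & 1 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 2 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 2 & \cdots & n \end{bmatrix}$$
Thus in this slightly contrived forward rate model we have modified excess return proportional to
$$\psi(\pi) = \sum_{i=1}^n i\pi_i - 2\sum_{i,j=1}^n \min(i,j)\pi_i\pi_j + \sum_{i=1}^n i\pi_i^2$$
This leaves us with a cute little optimization. We claim that $\psi(\pi)$ and hence the modified excess return is maximized, subject to $\sum_{i=1}^n \pi_i = 1$, by setting the portfolio equal to a barbell. Half the portfolio is invested in the first (shortest maturity) bond and the other half in the last (longest maturity) bond. That is,
$$\pi^* = \begin{bmatrix} 1/2 & 0 & \cdots & 0 & 1/2 \end{bmatrix}$$
corresponding to a modified excess return of $\psi(\pi^*) = \frac{n-1}{4}$. To prove this observe that $\psi(\pi)$ can be re-written as follows (count the number of times each $\pi_i$ and $\pi_i\pi_j$ occurs)
$$\begin{aligned} \psi(\pi) &= -2\sum_{i,j=1}^n \min(i,j)\pi_i\pi_j + \sum_{i=1}^n i\pi_i^2 + \sum_{i=1}^n i\pi_i \\ &= -(\pi_1 + \pi_2 + \dots + \pi_n)^2 + (\pi_1 + \pi_2 + \dots + \pi_n) \\ &\quad - (\pi_2 + \dots + \pi_n)^2 + (\pi_2 + \dots + \pi_n) \\ &\quad\; \vdots \\ &\quad - (\pi_{n-1} + \pi_n)^2 + (\pi_{n-1} + \pi_n) \\ &\quad - (\pi_n)^2 + \pi_n \\ &= \sum_{i=0}^{n-1} (-u_i^2 + u_i) = \sum_{i=1}^{n-1} (-u_i^2 + u_i) \\ &= \sum_{i=1}^{n-1} \left( -(u_i - 1/2)^2 + 1/4 \right) = \frac{n-1}{4} - \sum_{i=1}^{n-1} (u_i - 1/2)^2 \end{aligned}$$
where we have introduced $u_i = \sum_{j=i+1}^n \pi_j$ as the sum of portfolio weights leaving out the first $i$, and applied the constraint $u_0 = 1$. The expression is clearly maximized by setting $u_1, \dots, u_{n-1}$ equal to $1/2$. By back substitution beginning with $\pi_n$ this implies $\pi = \pi^*$ as claimed.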
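The counting behind the tail sums $u_k = \pi_{k+1} + \dots + \pi_n$ can be sanity-checked numerically. The sketch below (function names are mine) checks two facts for random weights summing to one: $\sum_i i\pi_i = \sum_{k=0}^{n-1} u_k$ and $\sum_{i,j} \min(i,j)\pi_i\pi_j = \sum_{k=0}^{n-1} u_k^2$. It then evaluates $\sum_k (u_k - u_k^2)$ at the barbell. Note the convention: the double sum here runs over all ordered pairs, so each cross term $\pi_i\pi_j$ with $i \neq j$ appears twice.

```python
import random

def tail_sums(pi):
    """u_k = pi_{k+1} + ... + pi_n for k = 0, ..., n-1 (u_0 is the total weight)."""
    return [sum(pi[k:]) for k in range(len(pi))]

def weighted_maturity(pi):
    """sum_i i * pi_i with maturities i = 1, ..., n."""
    return sum((i + 1) * p for i, p in enumerate(pi))

def min_quadratic_form(pi):
    """sum over all ordered pairs (i, j) of min(i, j) * pi_i * pi_j."""
    n = len(pi)
    return sum(min(i, j) * pi[i - 1] * pi[j - 1]
               for i in range(1, n + 1) for j in range(1, n + 1))

random.seed(0)
n = 5
raw = [random.random() for _ in range(n)]
pi = [r / sum(raw) for r in raw]          # random weights summing to one
u = tail_sums(pi)

linear_gap = weighted_maturity(pi) - sum(u)
quadratic_gap = min_quadratic_form(pi) - sum(x * x for x in u)

barbell = [0.5] + [0.0] * (n - 2) + [0.5]
barbell_value = sum(x - x * x for x in tail_sums(barbell))
print(linear_gap, quadratic_gap, barbell_value)
```

Both gaps vanish up to rounding, and for the barbell every tail sum other than $u_0$ equals $1/2$, so each contributes $1/4$ and the total is $(n-1)/4$.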

Excess return on a portfolio of lognormal assets

Here's my attempt at "Stochastic Portfolio Theory in a Nutshell", for those who, like me, hadn't noticed a bug in Markowitz Theory.

Stochastic Portfolio Theory considers the decomposition of the instantaneous return on a continuously rebalanced portfolio into the instantaneous returns of constituent assets
$$\frac{dZ_t}{Z_t} = \pi_t^T \frac{dX_t}{X_t}$$
Here $Z_t$ is the value of the portfolio, $\pi_t$ is a vector of portfolio weights, the fractions indicate pointwise (i.e. coordinate-wise) division and $dX_t = (dX^1_t, \dots, dX^n_t)^T$ is the increment of an $n \times 1$ vector of stocks with lognormal dynamics:
$$d(\log(X_t))_{n \times 1} = (\gamma_t)_{n \times 1}\,dt + (\xi)_{n \times n} (dW_t)_{n \times 1}$$
where $dW_t$ is a vector and we've emphasized the dimensions throughout. Note that by Ito's Lemma we have
$$\frac{dX_t}{X_t} = \left( \gamma_t + \frac{1}{2}\mathrm{diag}(\xi\xi^T) \right) dt + (\xi)_{n \times n}\,dW_t$$
where diag extracts a vector of diagonal entries. So if we mentally hit this on the left with the transpose of the portfolio weights $\pi$ we can see that the right hand side of our instantaneous return equation will have a Brownian term $\pi_t^T \xi\,dW_t$ and a drift that we'll get back to momentarily. And if we are on the ball we'll remember that $\frac{dZ}{Z}$ and $d(\log Z)_t$ have the same Brownian terms. Thus $d(\log Z)_t$ will also be driven by $\pi_t^T \xi\,dW_t$ and we can write, for some as yet unspecified drift $\gamma^\pi$,
$$d(\log Z_t) = \gamma^\pi_t\,dt + \pi_t^T \xi\,dW_t$$
To clean this up we apply Ito's Lemma again to retrieve $Z_t = \exp(\log(Z_t))$ and thereby:
$$\frac{dZ_t}{Z_t} = \left\{ \gamma^\pi_t + \frac{1}{2}\pi_t^T \xi\xi^T \pi_t \right\} dt + \pi_t^T \xi\,dW_t$$
which reveals the drift term on the left hand side of the instantaneous return equations expressed in terms of a highly relevant quantity: the drift of the logarithm of portfolio wealth. Indeed we call $\gamma^\pi_t$ the portfolio growth process. Equating this drift with $\pi_t^T\left(\gamma_t + \frac{1}{2}\mathrm{diag}(\xi\xi^T)\right)$ from the right hand side, we observe the important equality appearing in too few investment textbooks, if any beside Bob's:
$$\gamma^\pi = \pi_t^T \gamma_t + \underbrace{\frac{1}{2}\Big( \pi_t^T \mathrm{diag}(\xi\xi^T) - \underbrace{\pi_t^T \xi\xi^T \pi_t}_{\text{portfolio variance}} \Big)}_{\text{excess return process}}$$
Thus in log space we might say that the portfolio growth is the linear combination of the growth in individual stocks plus the braced term. We refer to that additional kick as the excess growth rate.
And we further observe that it decomposes into the difference between the weighted combination of stock variances and the portfolio variance process, denoted
$$\sigma^{\pi\pi}_t = \pi_t^T \xi\xi^T \pi_t$$
That's it for now. We note that if one is interested in the return on the logarithm of one's portfolio then the full decomposition into linear and excess return is obviously more pertinent than the linear term alone. And we see why minimizing portfolio variance subject to a known linear return is not, despite its significant popularity, the most relevant exercise.
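For a concrete feel, here is a minimal numpy sketch of the growth rate decomposition $\gamma^\pi = \pi^T\gamma + \frac{1}{2}(\pi^T\mathrm{diag}(\xi\xi^T) - \pi^T\xi\xi^T\pi)$. The function name `portfolio_growth_rate` and the two-stock example (zero log-drift, 20% volatility, uncorrelated) are my own illustrative assumptions.

```python
import numpy as np

def portfolio_growth_rate(pi, gamma, xi):
    """Return (growth rate, excess growth rate, portfolio variance) for weights pi.

    gamma holds the drifts of the log prices and xi is the n-by-n diffusion matrix.
    """
    pi, gamma, xi = (np.asarray(a, dtype=float) for a in (pi, gamma, xi))
    cov = xi @ xi.T                          # log-asset covariance
    weighted_var = pi @ np.diag(cov)         # weighted combination of stock variances
    portfolio_var = pi @ cov @ pi            # portfolio variance process
    excess_growth = 0.5 * (weighted_var - portfolio_var)
    return pi @ gamma + excess_growth, excess_growth, portfolio_var

# Two uncorrelated stocks with zero log-drift and 20% volatility, equally weighted.
growth, excess, port_var = portfolio_growth_rate(
    pi=[0.5, 0.5], gamma=[0.0, 0.0], xi=0.2 * np.eye(2)
)
print(growth, excess, port_var)
```

Neither stock grows in log terms here, yet the continuously rebalanced equal-weight portfolio has a growth rate of 1% per unit time, all of it coming from the excess growth term.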

The decomposition is useful independent of the investor's utility function.

For more see Dr Fernholz's book on Amazon, or this summary paper.