## Monday, January 13, 2014

### A game played on an Ornstein-Uhlenbeck bridge

Let $$X_t$$ denote the Ornstein-Uhlenbeck process $$dX_t = -\kappa X_t dt + \sigma dW_t$$ where $$X_0=b$$ is assumed known. Let $$Y_t = \exp(X_t)$$ and without serious loss of generality, choose $$\kappa$$ and $$\sigma$$ so that the unconditional distribution of $$X_t$$ is unit variance - that is to say $$\frac{\sigma^2}{2\kappa} = 1$$.
We play a two period game in which we choose a time $$t_1$$ and then, after $$Y_{t_1}$$ (or equivalently $$X_{t_1}$$) has been revealed to us, we choose a second time $$t_2$$. We seek to maximize the expected value of a utility function $$U = c Y_{t_1} + Y_{t_2}$$. We can evidently assume $$t_1>0$$ without loss of generality. The coefficient $$c$$ is the relative importance of the first period payoff versus the second, and for concreteness one might set $$c=1$$.
A time coordinate is used merely to make the mathematical representation more familiar. It does not necessarily represent a temporal choice, but rather an abstraction of a more general search space (such as a location to drill for oil, a choice of marketing strategy, or a technology product). The process $$Y_t$$, unknown to us, represents the yet to be discovered variation in revenue as we move across the search space.
We seek to analyze how much exploration one should perform, versus sticking to what we know. Intuitively, if the initial point $$X(0)=b$$ (or $$Y(0)=e^b$$) is very high we might not want to risk any departure. Whereas if $$b$$ is less than zero we're better off forgetting it altogether (that is, choosing $$t_1 \rightarrow \infty$$). In what follows, we sharpen this intuition.

Lemma A: The single period game.

Conditioned on $$X_0 = b$$ the mean and variance of $$X_t$$ are given by \begin{eqnarray} \mu(t) & = & b e^{-\kappa t} \\ \nu(t) & = & \frac{ \sigma^2 }{2 \kappa } \left( 1 - e^{-2 \kappa t} \right) \end{eqnarray} So using our simplification $$\frac{\sigma^2}{2\kappa} = 1$$ we ought to choose \begin{eqnarray} t_1^* & = & \arg \max E[ e^{X(t)} ] \\ & = & \arg \max \log E[ e^{X(t)} ] \\ & = & \arg \max \left( \mu(t) + \frac{1}{2} \nu(t) \right) \\ & = & \arg \max \left( b \lambda_t + \frac{1}{2} ( 1 - \lambda_t ^2 ) \right) \\ & = & \arg \max -\frac{1}{2} \left( \lambda_t - b \right)^2 + \frac{1}{2} \left( b^2 + 1 \right) \end{eqnarray} for $$\lambda_t = e^{-\kappa t }$$ taking values in $$[0,1]$$. This has qualitatively different behaviour depending on which of three regions $$b$$ falls into. The three possibilities are tabulated below.

| Region | $$\lambda^*_t$$ | Optimal $$t$$ | $$\log E[ e^{X_t} ]$$ | Strategy |
|---|---|---|---|---|
| $$b < 0$$ | $$0$$ | $$\infty$$ | $$\frac{1}{2}$$ | "Reset" |
| $$0 < b < 1$$ | $$b$$ | $$-\frac{1}{\kappa} \log( b )$$ | $$\frac{1}{2}\left( b^2+1 \right)$$ | "Explore" |
| $$b > 1$$ | $$1$$ | $$0$$ | $$b$$ | "Stay" |

Thus the one period problem has utility $$\zeta(b)$$ where $$\zeta( x ) = \left\{ \begin{array}{cc} e^{\frac{1}{2}} & x < 0 \\ e^{\frac{1}{2} \left( x^2+1 \right)} & 0 \le x \le 1 \\ e^x & x > 1 \end{array} \right.$$ We denote the corresponding optimal time $$t_1$$ by $$\pi^{(1)}( b )$$. Namely: $$\pi^{(1)}( b ) = \left\{ \begin{array}{cc} \infty & b < 0 \\ -\frac{1}{\kappa} \log( b ) & 0 \le b \le 1 \\ 0 & b > 1 \end{array} \right.$$
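
The one period solution is easy to sanity check numerically. The following is a minimal sketch, assuming the normalization $$\frac{\sigma^2}{2\kappa} = 1$$ used above; the function names (`zeta`, `pi1`, `log_payoff`, `brute_force_lambda`) and the default `kappa=1.0` are illustrative choices, not part of the original derivation. It compares the closed form against a brute-force search over $$\lambda_t \in [0,1]$$.

```python
import math

def zeta(b):
    """One period value E[exp(X_t)] at the optimal t, assuming sigma^2/(2 kappa) = 1."""
    if b < 0:
        return math.exp(0.5)                    # "Reset": lambda* = 0
    if b <= 1:
        return math.exp(0.5 * (b * b + 1.0))    # "Explore": lambda* = b
    return math.exp(b)                          # "Stay": lambda* = 1

def pi1(b, kappa=1.0):
    """Optimal one period time; kappa=1.0 is an arbitrary illustrative default."""
    if b <= 0:
        return math.inf                         # -log(0)/kappa is read as infinity
    if b <= 1:
        return -math.log(b) / kappa
    return 0.0

def log_payoff(lam, b):
    """log E[exp(X_t)] written in terms of lam = exp(-kappa t)."""
    return b * lam + 0.5 * (1.0 - lam * lam)

def brute_force_lambda(b, n=10001):
    """Grid search for the maximizing lambda in [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda lam: log_payoff(lam, b))

if __name__ == "__main__":
    for b in (-1.0, 0.5, 2.0):
        print(b, brute_force_lambda(b), math.log(zeta(b)), pi1(b))
```

The grid search should land on $$\lambda^* = 0$$, $$b$$ and $$1$$ respectively in the three regions, matching the table.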

Lemma B: Optimizing over "outside" strategies.

For brevity we denote two period strategies by a decision function $$\pi( x; t_1, b )$$ that returns the second time choice $$t_2 = \pi( X(t_1); t_1, b )$$ after $$X(t_1)$$ is revealed. A simple one parameter family of strategies, indexed by the choice $$t_1$$, is given by $$t_2 = \pi^{(2)}( x ; t_1, b ) = \left\{ \begin{array}{cc} - \pi^{(1)}(b) & x < b \\ t_1 + \pi^{(1)}(x) & x > b \end{array} \right.$$ We call these the "outside" strategies because we assume the best choice for $$t_2$$ lies outside the open interval $$(0, t_1)$$. If $$X(t_1)$$ is greater than our starting point $$X(0)=b$$ we will never move left, but instead use the one period solution, started from the revealed value $$x$$, to move right of $$t_1$$ (or stay). On the other hand if the first point is revealed to be lower than our starting point we explore, if anywhere, to the left of $$0$$ instead, again using the one period solution, this time started from $$b$$.
The evaluation of this strategy amounts to integration of the one period utility against the (gaussian) distribution of $$X(t_1)$$. For instance, if $$0 < b < 1$$, \begin{eqnarray} L( \pi^{(2)} ) & = & c E[ e^{ X_{t_1} } ] + E[ e^{ X_{t_2} } ] \\ & = & c E[ e^{ X_{t_1} } ] + P( X_{t_1} < b ) \zeta( b ) + P( X_{t_1} > b ) E[ \zeta( X_{t_1} ) | X_{t_1} > b ] \end{eqnarray} We've already noted that the first term is just $$c e^{ \mu(t_1) + \frac{1}{2} \nu(t_1) }$$. But since $$\log \zeta(x)$$ is piecewise quadratic, the others are also integrals of the type \begin{eqnarray} I(\mu,\sigma;x_1,x_2;a_0,a_1,a_2) := \frac{1}{\sqrt{2 \pi} \sigma } \int_{x_1}^{x_2} e^{-\frac{1}{2}\left( \frac{x-\mu}{\sigma} \right)^2} e^{a_0 + a_1 x + a_2 x^2} dx \end{eqnarray} admitting analytical solution. It is sufficient to note the following equalities, each achieved by substitution. \begin{eqnarray} \frac{1}{\sqrt{2 \pi} \sigma } \int_{x_1}^{x_2} e^{-\frac{1}{2}\left( \frac{x-\mu}{\sigma} \right)^2} e^{a_1 x + a_2 x^2} dx & = & \frac{1}{\sqrt{2 \pi} } \int_{x_1/\sigma}^{x_2/\sigma} e^{-\frac{1}{2}\left( u-\mu/\sigma \right)^2} e^{a_1 \sigma u + a_2 \sigma^2 u^2} du \\ \frac{1}{\sqrt{2 \pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2}\left( x-\mu \right)^2} e^{a_1 x + a_2 x^2} dx & = & \frac{1}{\sqrt{2 \pi}} \int_{x_1-\mu}^{x_2-\mu} e^{-\frac{1}{2}u^2} e^{ (a_1 \mu + a_2 \mu^2) + (a_1 + 2 a_2 \mu) u + a_2 u^2} du \\ \frac{1}{\sqrt{2 \pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2}x^2 } e^{a_1 x + a_2 x^2} dx & = & \frac{1}{\sqrt{p}} \frac{1}{ \sqrt{2 \pi} } \int_{ x_1 \sqrt{p} }^{ x_2 \sqrt{p} } e^{ -\frac{1}{2}u^2 } e^{ \frac{a_1}{\sqrt{p}} u } du \\ \frac{1}{\sqrt{2 \pi}} \int_{x_1}^{x_2} e^{-\frac{1}{2}x^2 } e^{a_1 x} dx & = & e^{\frac{1}{2}a_1^2} \frac{1}{ \sqrt{2 \pi} } \int_{ x_1-a_1 }^{ x_2 - a_1 } e^{ -\frac{1}{2}u^2 } du \end{eqnarray} In the second to last equality $$p = 1 - 2 a_2$$ and validity requires $$p > 0$$. The substitutions are $$u = x/\sigma$$, $$u = x - \mu$$, $$u = x\sqrt{p}$$ and $$u = x - a_1$$ respectively, and the final integral is a difference of standard normal distribution functions.
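
The chain of four substitutions can be composed into a closed form for $$I$$ and checked against brute-force quadrature. This is a sketch under the definition of $$I$$ given above (a gaussian density with a squared exponent, multiplied by the exponential of a quadratic); the helper names `Phi`, `gauss_quad_integral` and `numeric_integral` are hypothetical, and the midpoint rule is only a crude check rather than a recommended integrator.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gauss_quad_integral(mu, sigma, x1, x2, a0, a1, a2):
    """Closed form for I(mu, sigma; x1, x2; a0, a1, a2) via the four substitutions."""
    # Step 1: u = x / sigma rescales to unit variance.
    m, lo, hi = mu / sigma, x1 / sigma, x2 / sigma
    b1, b2 = a1 * sigma, a2 * sigma * sigma
    # Step 2: u -> u - m centres the density and shifts the quadratic.
    const = a0 + b1 * m + b2 * m * m
    c1 = b1 + 2.0 * b2 * m
    lo, hi = lo - m, hi - m
    # Step 3: absorb the quadratic term; requires p = 1 - 2 * b2 > 0.
    p = 1.0 - 2.0 * b2
    if p <= 0.0:
        raise ValueError("integral requires 1 - 2 * a2 * sigma^2 > 0")
    sp = math.sqrt(p)
    d1, lo, hi = c1 / sp, lo * sp, hi * sp
    # Step 4: complete the square in the remaining linear term.
    return math.exp(const + 0.5 * d1 * d1) / sp * (Phi(hi - d1) - Phi(lo - d1))

def numeric_integral(mu, sigma, x1, x2, a0, a1, a2, n=20000):
    """Midpoint-rule approximation of the same integral (finite limits only)."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        total += math.exp(-0.5 * ((x - mu) / sigma) ** 2 + a0 + a1 * x + a2 * x * x)
    return total * h / (math.sqrt(2.0 * math.pi) * sigma)

if __name__ == "__main__":
    args = (0.3, 0.8, -1.0, 2.0, 0.5, 0.7, 0.2)
    print(gauss_quad_integral(*args), numeric_integral(*args))
```

With infinite limits and $$a_0 = a_2 = 0$$, $$a_1 = 1$$ the closed form reduces to $$e^{\mu + \frac{1}{2}\sigma^2}$$, the lognormal mean already used in the one period problem.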

Lemma C: If inside strategies are never optimal in the symmetric case, they are never optimal.

For simplicity we'd like to assume what would seem to be the critical case, $$X(t_1)=b$$. And we shall indeed show that irrespective of $$b$$ we never want to choose $$t_2 \in (0,t_1)$$. Intuitively it should also be true that we never wish to choose an inside strategy in the asymmetric case either. To tidy up this dangling thread and establish some formulas we'll need, we set $$X(t_1)=d$$. We may assume $$d < b$$ without loss of generality (otherwise exchange the roles, remembering that we shall establish our coming result for all values of $$b$$ ). Now suppose there is a point $$t_2 \in (0,t_1)$$. Using the unconditional means and variances used in the one period problem we apply Bayes Rule to find the conditional density on the Bridge: \begin{eqnarray} \rho( x; t_2 ) & \propto & e^{ -\frac{1}{2} \left( \frac{ x - b e^{-\kappa t_2} } { \sqrt{ 1-e^{-2\kappa t_2} } } \right)^2 } e^{-\frac{1}{2} \left( \frac{ x e^{-\kappa(t_1-t_2) } - d } { \sqrt{ 1-e^{-2\kappa (t_1-t_2) } } } \right)^2 } \\ & = & e^{-\frac{1}{2} \frac{ \left( x - b \lambda_2 \right)^2 } { 1-\lambda_2^2 } } e^{-\frac{1}{2} \frac{ \left( x \lambda - d \right)^2 } { 1-\lambda^2 } } \\ & = & e^{ -\frac{1}{2}\left( g_2 x^2 - 2 g_1 x + \dots \right) } \end{eqnarray} where $$\lambda_2 = e^{-\kappa t_2}$$, $$\lambda = e^{-\kappa (t_1-t_2)}$$ and matching coefficients of $$x^2$$ and $$x$$ in the exponent respectively we find \begin{eqnarray} g_2 & = & \frac{1}{1-\lambda_2^2} + \frac{\lambda^2}{1-\lambda^2} \\ g_1 & = & \frac{b \lambda_2}{1-\lambda_2^2} + \frac{d \lambda}{1-\lambda^2} \end{eqnarray} Now the gaussian conditional density $$\rho(x) \propto e^{ -\frac{1}{2}\left( g_2 x^2 - 2 g_1 x + \dots \right) } = e^{ -\frac{1}{2} g_2 \left( x - g_1/g_2 \right)^2 + \dots }$$ evidently has precision $$g_2$$ and mean $$g_1/g_2$$. The precision (i.e. $$1/variance^2$$) is independent of $$d$$ whereas the conditional mean is increasing in $$d$$. 
It follows that if the low side of the bridge were raised, the inside option would become more attractive.

Lemma D: The "outside" strategy is no worse than backtracking to the middle of the bridge

From the one period problem and the additive nature of the payoff we know that all other strategies with $$t_2 \ge t_1$$ or $$t_2 < 0$$ are worse. So specializing the formula for the conditional mean and precision given above to the case $$d=b$$ we write the conditional mean and variance as \begin{eqnarray} \mu^c(t) & = & \frac{ \frac{b \lambda_2}{1-\lambda_2^2} + \frac{b \lambda}{1-\lambda^2} } { \frac{1}{1-\lambda_2^2} + \frac{\lambda^2}{1-\lambda^2} } = b \frac{ \lambda_2(1-\lambda^2) + \lambda(1-\lambda_2^2) } { 1-\lambda_2^2 \lambda^2 } = b \frac{ \lambda_2 + \lambda}{ 1 +\lambda \lambda_2}\\ \nu^c(t) & = & \left( \frac{1}{1-\lambda_2^2} + \frac{\lambda^2}{1-\lambda^2} \right)^{-2} = \frac{ (1-\lambda_2^2)^2 (1-\lambda^2)^2 } { ( 1- \lambda_2^2 \lambda^2 )^2 } \end{eqnarray} In the case $$\lambda_2 = \lambda$$, which is the middle of the bridge, this simplifies further \begin{eqnarray} \mu^c(\lambda_2=\lambda) & = & b \frac{2\lambda}{1+\lambda^2} < b \\ \nu^c(\lambda_2=\lambda) & = & \left( \frac{ 1-\lambda^2 }{ 1+ \lambda^2} \right)^2 \end{eqnarray} But there is an outside point with $$\tilde{\lambda} = \frac{2\lambda}{1+\lambda^2}$$ we might choose instead. This, by construction, will have the same mean $$b \tilde{\lambda}$$. We notice it also has the same variance: \begin{eqnarray} \nu(\tilde{\lambda}) & = & 1- \tilde{\lambda}^2 \\ & = & \frac{ (1 + \lambda^2)^2 - 4 \lambda^2 }{ (1 + \lambda^2)^2 } \\ & = & \frac{ (1 - \lambda^2)^2 }{ (1 + \lambda^2)^2 } \\ & = & \nu^c(t). \end{eqnarray} Thus we have established an outside point with equivalent utility.

Exhibit E: Pictures looking for proofs.

Returning to the symmetric case $$X(t_1)=b$$ we ask whether there is any value for $$t_2$$ inside the interval $$(0,t_1)$$ that is a better choice than going outside the bridge (and using the optimal one period solution). To put it another way, is there any value for $$\lambda$$ for which there is any choice of $$\lambda_2$$ such that $$D(\lambda_2,\lambda,b) = \log \zeta(b) - \psi(b) < 0$$ where $$\psi(b) = \mu^c(t) + \frac{1}{2} \nu^c(t)$$ ? Or, using the formulas above, is there a combination $$\lambda_2, \lambda$$ for which $$\log \zeta(b) - \left( b \frac{ \lambda_2 + \lambda}{ 1 +\lambda \lambda_2} + \frac{1}{2} \frac{ (1-\lambda_2^2)^2 (1-\lambda^2)^2 } { ( 1- \lambda_2^2 \lambda^2 )^2 } \right) < 0 ?$$ Here is a picture of the difference between the outside option and inside options in the case $$b=1.5$$ that would suggest the difference is always positive:

Here is another, for the case $$b=1.5$$.

Lemma F: For $$b \in (0,1)$$ the function $$S(u) = D(\lambda_2=u,\lambda=u;b)$$ satisfies $$S(u)\ge0$$ for $$u \in (0,1)$$ with a single zero at $$u=\frac{1-\sqrt{1-b^2}}{b}$$

We consider the middle of the symmetric bridge once more, this time varying the width of the bridge. We claim that for $$b \in (0,1)$$ there is a particular bridge width for which we are indifferent between choosing the middle of the bridge (i.e. $$t_2=\frac{1}{2} t_1$$) and the optimal outside solution. Specializing to the middle implies $$\lambda_2=\lambda$$ so we write $$S(u) = D(\lambda_2=u,\lambda=u;b)$$. From the formula for $$D$$ given above we have \begin{eqnarray} S(u) &= & \frac{1}{2}(1+b^2) - \left( b \frac{ 2u}{ 1 + u^2} + \frac{1}{2} \frac{ (1-u^2)^4 } { ( 1- u^4 )^2 } \right) \\ &= & \frac{1}{2}(1+b^2) - \rho(u;b) \end{eqnarray} where $$\rho(u;b) = b \frac{ 2u}{ 1 + u^2} + \frac{1}{2} \left( \frac{1-u^2}{1+u^2} \right)^2$$ A little algebra shows that $$\frac{ \partial }{\partial u} \rho(u;b) = \frac{ 2(1-u^2)(bu^2-2u+b)}{ (1+u^2)^3}$$ which, for $$u \in (0,1)$$, is zero only at a solution of $$b u^2-2u+b=0$$. If $$b<1$$ the root lying in $$(0,1)$$ is $$u = \frac{1-\sqrt{1-b^2}}{b}$$ The derivative is positive below this root and negative above it, so the root maximizes $$\rho$$ on $$(0,1)$$. Substituting $$u^2 = \frac{2u-b}{b}$$ (equivalently $$\frac{2u}{1+u^2} = b$$) back into $$\rho$$ yields, after a little simplification: $$\rho\left(u=\frac{1-\sqrt{1-b^2}}{b};b\right) = b^2 + \frac{1}{2}(1-b^2) = \frac{1}{2}(1+b^2)$$ which we recognize as precisely the value of the one period problem. This shows that \begin{eqnarray} S(u) &= & \frac{1}{2}(1+b^2) - \rho(u;b) \\ & \ge & \frac{1}{2}(1+b^2) - \max \rho(u;b) \\ & = & \frac{1}{2}(1+b^2) - \frac{1}{2}(1+b^2) \\ & = & 0 \end{eqnarray} as claimed.
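
The algebra in Lemma F can be spot checked numerically. The sketch below takes $$\rho(u;b)$$ exactly as written in the text and evaluates it at the root of $$b u^2 - 2u + b = 0$$ lying in $$(0,1)$$, namely $$u = \frac{1-\sqrt{1-b^2}}{b}$$; the function names and the grid size are arbitrary illustrative choices.

```python
import math

def rho(u, b):
    """Log-payoff at the middle of the symmetric bridge, as defined in the text."""
    r = (1.0 - u * u) / (1.0 + u * u)
    return b * 2.0 * u / (1.0 + u * u) + 0.5 * r * r

def u_star(b):
    """Root of b u^2 - 2 u + b = 0 lying in (0, 1); requires 0 < b < 1."""
    return (1.0 - math.sqrt(1.0 - b * b)) / b

def S(u, b):
    """Outside value minus middle-of-bridge value for 0 < b < 1."""
    return 0.5 * (1.0 + b * b) - rho(u, b)

def min_S_on_grid(b, n=2001):
    """Smallest S(u) over an interior grid of (0, 1)."""
    return min(S(i / (n - 1), b) for i in range(1, n - 1))

if __name__ == "__main__":
    for b in (0.1, 0.5, 0.9):
        u = u_star(b)
        print(b, u, S(u, b), min_S_on_grid(b))
```

For each $$b$$ the value $$S(u^*)$$ should vanish up to rounding, and the grid minimum should be non-negative.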

Lemma G: For $$b >1$$ the function $$S(u) = D(\lambda_2=u,\lambda=u;b)$$ satisfies $$S(u) > 0$$ for $$u \in (0,1)$$. It approaches value zero as $$u \rightarrow 1$$, corresponding to the case of a very short bridge.

The limiting case is obvious algebraically and geometrically, since for $$b>1$$ we have $$\log \zeta(b) = b$$ and so \begin{eqnarray} S(u) &= & b - \left( b \frac{ 2u}{ 1 + u^2} + \frac{1}{2} \left( \frac{1-u^2}{1+u^2} \right)^2 \right) \\ & \rightarrow & 0 \end{eqnarray} as $$u\rightarrow 1$$. Moreover we can factor \begin{eqnarray} S(u) &= & \frac{(1-u)^2}{1+u^2} \left( b - \frac{1}{2} \frac{(1+u)^2}{1+u^2} \right) \end{eqnarray} which is evidently greater than zero for $$u \in (0,1)$$, because $$(1+u)^2 \le 2(1+u^2)$$ makes the subtracted term at most $$1 < b$$.
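
As a quick check of Lemma G, the sketch below evaluates $$S(u)$$ for $$b>1$$ directly from its definition and compares it with the factorization $$\frac{(1-u)^2}{1+u^2}\left( b - \frac{1}{2}\frac{(1+u)^2}{1+u^2} \right)$$. The function names and the test values of $$b$$ are illustrative assumptions only.

```python
def S_direct(u, b):
    """S(u) for b > 1, straight from the definition b - rho(u; b)."""
    r = (1.0 - u * u) / (1.0 + u * u)
    return b - (b * 2.0 * u / (1.0 + u * u) + 0.5 * r * r)

def S_factored(u, b):
    """The same quantity with the common factor (1 - u)^2 / (1 + u^2) pulled out."""
    return (1.0 - u) ** 2 / (1.0 + u * u) * (b - 0.5 * (1.0 + u) ** 2 / (1.0 + u * u))

def max_gap_and_min(b, n=1001):
    """Largest disagreement between the two forms, and the smallest S, on (0, 1)."""
    grid = [i / (n - 1) for i in range(1, n - 1)]
    gap = max(abs(S_direct(u, b) - S_factored(u, b)) for u in grid)
    return gap, min(S_direct(u, b) for u in grid)

if __name__ == "__main__":
    for b in (1.01, 1.5, 3.0):
        print(b, max_gap_and_min(b), S_direct(1.0, b))
```

The two forms should agree to rounding error, the minimum over the interior grid should be strictly positive, and $$S(1) = 0$$ recovers the short-bridge limit.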