MTH 4500: Introductory Financial Mathematics
Portfolio Theory
Minimum variance portfolio
Assume that we are given \(n\) risky securities whose expected returns are \(\mu_1\), \(\mu_2\), \(\dots\), \(\mu_n\), respectively. We can write these expected returns in the form of a row vector \( m=\left[\begin{array}{cccc}\mu_1& \mu_2&\dots &\mu_n\end{array}\right]\). Let \(C\) denote the covariance matrix of the returns, and assume that \(C\) is invertible. Let us denote by \(u\) the row vector whose \(n\) components are all equal to \(1\), i.e. \(u=\left[\begin{array}{cccc}1&1&\dots&1\end{array}\right]\).
Theorem
Among all portfolios that consist of the given \(n\) securities, the portfolio with the minimal variance has the weights \[w=\frac{u C^{-1}}{uC^{-1}u^T}.\]
The variance of the portfolio with weights \(w\) is given by \(F(w)=wCw^T\). We need to find the minimum of the function \(F(w)\) under the condition that \(wu^T=1\). The problem can be re-formulated as finding the minimum of \(F(w)\) under the condition \(G(w)=0\), where \(G(w)=wu^T-1\).
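We will use the method of Lagrange multipliers to find the critical points of the function \(F(w)\) that satisfy \(G(w)=0\). There are 5 categories of critical points:
- \(1^{\circ}\) Endpoints. There are no critical points in this category since weights can be any real numbers whose sum is equal to 1.
- \(2^{\circ}\) Points for which \(\nabla F\) or \(\nabla G\) is not defined. The gradients \(\nabla F=2wC\) and \(\nabla G=u\) are defined for all weights \(w\).
- \(3^{\circ}\) Points for which \(\nabla F=0\). The condition \(\nabla F=0\) implies \(2wC=0\). Multiplying by \(C^{-1}\) from the right gives \(w=0\), which does not satisfy \(wu^T=1\). Hence there are no critical points in this category.
- \(4^{\circ}\) Points for which \(\nabla G=0\). Since \(\nabla G=u\neq 0\), there are no critical points in this category.
- \(5^{\circ}\) Points for which \(\nabla F\|\nabla G\). We need to find the vectors \(w\) for which \(wu^T=1\) and \(\nabla F(w)=\lambda\nabla G(w)\) for some scalar \(\lambda\). The last equation reads \(2wC=\lambda u\). Multiplying both sides by \(\frac12C^{-1}\) from the right gives \(w=\frac{\lambda}2uC^{-1}\). Multiplying both sides by \(u^T\) and using \(wu^T=1\) gives \(1=\frac{\lambda}2uC^{-1}u^T\), hence \(\frac{\lambda}2=\frac1{uC^{-1}u^T}\).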
Therefore \(w=\frac{uC^{-1}}{uC^{-1}u^T}\) is the only critical point. Since the variance of each portfolio is non-negative, and the variance of the portfolio \((N,-N+1,0,\dots, 0)\) tends to \(+\infty\) as \(N\to +\infty\), we conclude that the obtained critical point corresponds to the minimal variance.
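The formula of the theorem can be evaluated numerically. The following is a minimal sketch in Python with NumPy; the covariance matrix `C` is a made-up illustration and the function name `min_variance_weights` is hypothetical. Since \(C\) is symmetric, the row vector \(uC^{-1}\) is obtained by solving the linear system \(Cx=u^T\) instead of inverting \(C\).

```python
import numpy as np

def min_variance_weights(C):
    """Weights w = u C^{-1} / (u C^{-1} u^T) of the minimum variance portfolio."""
    C = np.asarray(C, dtype=float)
    u = np.ones(C.shape[0])
    # C is symmetric, so the row vector u C^{-1} equals the solution x of C x = u.
    u_cinv = np.linalg.solve(C, u)
    return u_cinv / (u_cinv @ u)

# Hypothetical covariance matrix of three securities (illustrative numbers only).
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

w = min_variance_weights(C)
var_min = w @ C @ w          # F(w) = w C w^T

print("weights:", w)
print("sum of weights:", w.sum())
print("minimal variance:", var_min)
```

In this sketch the resulting variance should not exceed the variance of any other portfolio whose weights sum to 1, for example the equally weighted one.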
Theorem
Among all portfolios that consist of the given \(n\) securities and that have the expected return \(\mu\), the portfolio with the minimal variance has the weights \[w=\left[\begin{array}{cc}\mu&1\end{array}\right]M^{-1}\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right],\quad\mbox{where } M=\left[\begin{array}{cc} mC^{-1}m^T& mC^{-1}u^T\newline uC^{-1}m^T& uC^{-1}u^T\end{array}\right].\]
We need to find the minimum of the function \(F(w)=wCw^T\) under the conditions \(G(w)=0\) and \(H(w)=0\), where \(G(w)= wu^T-1\) and \(H(w)=wm^T-\mu\).
We will use the method of Lagrange multipliers to find the critical points of the function \(F(w)\) that satisfy \(G(w)=0\) and \(H(w)=0\). There are 5 categories of critical points:
- \(1^{\circ}\) Endpoints. There are no critical points in this category since weights can be any real numbers whose sum is equal to 1.
- \(2^{\circ}\) Points for which \(\nabla F\), \(\nabla G\), or \(\nabla H\) is not defined. The gradients \(\nabla F=2wC\), \(\nabla G=u\), and \(\nabla H=m\) are defined for all weights \(w\).
- \(3^{\circ}\) Points for which \(\nabla F=0\). The condition \(\nabla F=0\) implies \(2wC=0\). Multiplying by \(C^{-1}\) from the right gives \(w=0\), which does not satisfy \(wu^T=1\). Hence there are no critical points in this category.
- \(4^{\circ}\) Points for which \(\nabla G=0\) or \(\nabla H=0\). Since \(\nabla G=u\neq 0\) and \(\nabla H=m\neq 0\), there are no critical points in this category.
- \(5^{\circ}\) Points for which \(\nabla F\) belongs to the span of \(\{\nabla G,\nabla H\}\). We need to find the vectors \(w\) for which \(wu^T=1\), \(wm^T=\mu\), and \(\nabla F(w)=\lambda_1 \nabla H(w)+\lambda_2\nabla G(w)\) for some scalars \(\lambda_1\) and \(\lambda_2\).
The last equation reads \(2wC=\lambda_1 m+\lambda_2 u\). Dividing both sides by \(2\) implies that \(wC=\frac{\lambda_1}2m+\frac{\lambda_2}2u\) and after multiplying both sides of the obtained equation by \(C^{-1}\) from the right, we obtain \[w=\frac{\lambda_1}2mC^{-1}+\frac{\lambda_2}2uC^{-1}. \quad\quad\quad\quad\quad (1)\]
We now multiply both sides of the last equation by \(m^T\) and use the fact that \(wm^T=\mu\) to obtain \(\mu=\frac{\lambda_1}2mC^{-1}m^T+\frac{\lambda_2}2uC^{-1}m^T\). If we multiply both sides of equation (1) by \(u^T\) and use \(wu^T=1\), it becomes
\(1=\frac{\lambda_1}2mC^{-1}u^T+\frac{\lambda_2}2uC^{-1}u^T\). Equation (1) will give us the desired \(w\) once we get \(\frac{\lambda_1}2\) and \(\frac{\lambda_2}2\) from the system
\begin{eqnarray*}
\mu&=&\frac{\lambda_1}2mC^{-1}m^T+\frac{\lambda_2}2uC^{-1}m^T\newline
1&=&\frac{\lambda_1}2mC^{-1}u^T+\frac{\lambda_2}2uC^{-1}u^T.
\end{eqnarray*}
The last system can be written in the matrix form as \[\left[\begin{array}{c}\mu\newline 1\end{array}\right]=M\left[\begin{array}{c}\frac{\lambda_1}2\newline\frac{\lambda_2}2\end{array}\right],\quad\mbox{where }M=\left[\begin{array}{cc} mC^{-1}m^T& mC^{-1}u^T\newline uC^{-1}m^T& uC^{-1}u^T\end{array}\right].\]
Let us prove that the matrix \(M\) is symmetric. Since \(C\) is symmetric, the matrix \(C^{-1}\) is symmetric as well. Hence
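\[\left(mC^{-1}u^T\right)^T=\left(u^T\right)^T\cdot \left(C^{-1}\right)^T\cdot m^T=uC^{-1}m^T.\] However, \(mC^{-1}u^T\) is a number, which means that \(mC^{-1}u^T=\left(mC^{-1}u^T\right)^T\). Therefore \(mC^{-1}u^T=uC^{-1}m^T\), so \(M\) is symmetric, and hence so is \(M^{-1}\). Here we assume that \(M\) is invertible, which is the case whenever \(m\) is not a scalar multiple of \(u\).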
Therefore we get \begin{eqnarray*}\left[\begin{array}{c}\frac{\lambda_1}2\newline\frac{\lambda_2}2\end{array}\right]&=&M^{-1}\left[\begin{array}{c}\mu\newline 1\end{array}\right], \quad\mbox{which is equivalent to }\newline \left[\begin{array}{cc}\frac{\lambda_1}2 & \frac{\lambda_2}2\end{array}\right]&=&
\left[\begin{array}{c}\frac{\lambda_1}2\newline\frac{\lambda_2}2\end{array}\right]^T=\left(M^{-1}\left[\begin{array}{c}\mu\newline 1\end{array}\right]\right)^T=\left[\begin{array}{c}\mu\newline 1\end{array}\right]^T\left(M^{-1}\right)^T=
\left[\begin{array}{cc}\mu & 1\end{array}\right]M^{-1}.\end{eqnarray*}
Finally, the equation (1) implies that
\begin{eqnarray*}
w&=&\frac{\lambda_1}2mC^{-1}+\frac{\lambda_2}2uC^{-1} = \left[\begin{array}{cc}\frac{\lambda_1}2&\frac{\lambda_2}2\end{array}\right]\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right]=
\left[\begin{array}{cc}\mu & 1\end{array}\right]M^{-1}
\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right].
\end{eqnarray*}
We have obtained only one critical point. Since the variance is always non-negative and can be made arbitrarily large, the obtained portfolio corresponds to the minimal variance.
Notice that \(M\) and \(M^{-1}\) are \(2\times 2\) matrices and that \(mC^{-1}\) is the product of the \(1\times n\) matrix \(m\) and the \(n\times n\) matrix \(C^{-1}\). Thus \(mC^{-1}\) is a \(1\times n\) matrix. Similarly, \(uC^{-1}\) is a \(1\times n\) matrix. Therefore \(M^{-1}\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right]\) is a \(2\times n\) matrix. Let us denote its entries by \(a_1\), \(a_2\), \(\dots\), \(a_n\), \(b_1\), \(b_2\), \(\dots\), \(b_n\), i.e.
\[\left[\begin{array}{cccc}a_1&a_2&\dots&a_n\newline b_1&b_2&\dots&b_n\end{array}\right]=M^{-1}\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right].\] Let us denote \(a=\left[\begin{array}{cccc}a_1&a_2&\dots&a_n\end{array}\right]\) and
\(b=\left[\begin{array}{cccc}b_1&b_2&\dots&b_n\end{array}\right]\). Then the minimal variance portfolio satisfies \[w=\left[\begin{array}{cc} \mu & 1\end{array}\right]\left[\begin{array}{c}a\newline b\end{array}\right]=\mu a+b.\]
We have established a linear relationship between \(\mu\) and the portfolio of minimal variance whose expected return is \(\mu\). This linear relationship is called the minimal variance line.
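The vectors \(a\) and \(b\) can be computed directly from \(C\) and \(m\). The sketch below uses Python with NumPy; the numbers in `C` and `m` are invented for illustration, the function name `minimal_variance_line` is hypothetical, and \(M\) is assumed to be invertible.

```python
import numpy as np

def minimal_variance_line(C, m):
    """Return (a, b) such that the minimal variance weights for target mu are mu*a + b."""
    C = np.asarray(C, dtype=float)
    m = np.asarray(m, dtype=float)
    u = np.ones(len(m))
    # C is symmetric, so m C^{-1} and u C^{-1} solve C x = m and C x = u.
    m_cinv = np.linalg.solve(C, m)
    u_cinv = np.linalg.solve(C, u)
    M = np.array([[m_cinv @ m, m_cinv @ u],
                  [u_cinv @ m, u_cinv @ u]])
    # The two rows of M^{-1} [m C^{-1}; u C^{-1}] are the vectors a and b.
    a, b = np.linalg.solve(M, np.vstack([m_cinv, u_cinv]))
    return a, b

# Hypothetical inputs (illustrative numbers only).
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
m = np.array([0.08, 0.10, 0.14])

a, b = minimal_variance_line(C, m)
mu = 0.11
w = mu * a + b               # weights of the minimal variance portfolio with return mu

print("weights:", w)
print("sum of weights:", w.sum())
print("expected return:", w @ m)
```

Solving with `np.linalg.solve` instead of forming \(C^{-1}\) and \(M^{-1}\) explicitly is a design choice of this sketch, not something the derivation requires.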
Since every line is uniquely determined by any two of its points, we immediately derive the following theorem:
Theorem (Two Fund Theorem)
If \(w_1\) and \(w_2\) are any two distinct portfolios on the minimal variance line, then any other portfolio \(w\) on the minimal variance line can be expressed as
\(w=\alpha w_1+(1-\alpha)w_2\) for some real number \(\alpha\).
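The Two Fund Theorem can be checked numerically. In the sketch below (Python with NumPy, invented data, hypothetical function name `frontier_weights`), two portfolios on the minimal variance line with expected returns \(\mu_1\) and \(\mu_2\) are combined to reproduce a third one with expected return \(\mu\). The choice \(\alpha=\frac{\mu-\mu_2}{\mu_1-\mu_2}\) is an assumption of this sketch; it follows from requiring \(\alpha\mu_1+(1-\alpha)\mu_2=\mu\).

```python
import numpy as np

def frontier_weights(C, m, mu):
    """Minimal variance weights [mu 1] M^{-1} [m C^{-1}; u C^{-1}] for target return mu."""
    u = np.ones(len(m))
    # Rows of B are m C^{-1} and u C^{-1} (C is symmetric).
    B = np.vstack([np.linalg.solve(C, m), np.linalg.solve(C, u)])
    # M is the 2x2 matrix with entries m C^{-1} m^T, m C^{-1} u^T, u C^{-1} m^T, u C^{-1} u^T.
    M = B @ np.column_stack([m, u])
    return np.array([mu, 1.0]) @ np.linalg.solve(M, B)

# Hypothetical inputs (illustrative numbers only).
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
m = np.array([0.08, 0.10, 0.14])

mu1, mu2, mu = 0.08, 0.12, 0.15
w1 = frontier_weights(C, m, mu1)
w2 = frontier_weights(C, m, mu2)

alpha = (mu - mu2) / (mu1 - mu2)       # may lie outside [0, 1]
combo = alpha * w1 + (1 - alpha) * w2  # combination of the two funds
direct = frontier_weights(C, m, mu)    # computed directly from the formula

print("alpha:", alpha)
print("combination:", combo)
print("direct:     ", direct)
```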
Market portfolio
We now assume that in addition to the \(n\) risky securities, there is a risk-less security whose return is \(R > 0\).
Consider any risky portfolio \(P\) with expected return \(\mu\) and risk \(\sigma\) and construct a new portfolio that consists of this risky portfolio \(P\) and the risk-less security. If the weights of this new portfolio are \(\alpha\geq 0\) and \(1-\alpha\), then the expected return is \(\alpha \mu+(1-\alpha)R\) and the risk is \(\alpha \sigma\).
Therefore, if we keep a portfolio with expected return \(\mu\) and risk \(\sigma\) fixed, we can vary \(\alpha\) and build a portfolio with expected return \(\mu^{\prime}=\alpha \mu+(1-\alpha)R\) and risk \(\sigma^{\prime}=\alpha \sigma\).
The expression \((\sigma^{\prime},\mu^{\prime})=\left(\alpha\sigma, \alpha\mu+(1-\alpha)R\right)\) is a parametrization (with parameter \(\alpha\)) of a line in the \((\sigma,\mu)\) plane that passes through the points \((0,R)\) (for \(\alpha=0\)) and \((\sigma,\mu)\) (for \(\alpha=1\)).
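As a small illustration of this parametrization, the Python sketch below evaluates \((\sigma^{\prime},\mu^{\prime})\) for a few values of \(\alpha\geq 0\). The values of `mu`, `sigma`, and `R` are hypothetical, as is the function name `mix_with_riskless`. Every point with \(\alpha > 0\) should give the same slope \(\frac{\mu-R}{\sigma}\) when connected with \((0,R)\).

```python
def mix_with_riskless(mu, sigma, R, alpha):
    """Return (risk, expected return) when weight alpha >= 0 is placed in the
    risky portfolio and weight 1 - alpha in the risk-less security."""
    return alpha * sigma, alpha * mu + (1 - alpha) * R

# Hypothetical risky portfolio and risk-less return (illustrative numbers only).
mu, sigma, R = 0.10, 0.20, 0.03

points = [mix_with_riskless(mu, sigma, R, alpha) for alpha in (0.0, 0.5, 1.0, 1.5)]
for risk, ret in points:
    print(f"risk={risk:.3f}  expected return={ret:.4f}")

# Slope of the line through (0, R) and each point with positive risk.
slopes = [(ret - R) / risk for risk, ret in points if risk > 0]
print("slopes:", slopes)
```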
Among these lines we want the one with the highest slope, i.e. we want to determine the weights \(w\) of the risky portfolio for which the slope of the line that connects \((0,R)\) with \((\sigma(w),\mu(w))\) is as large as possible. In other words, we want to maximize the function \(S(w)=\frac{\mu(w)-R}{\sigma(w)}=\frac{wm^T-R}{\sqrt{wCw^T}}\) under the condition \(wu^T=1\).
Theorem
The function \(S(w)=\frac{wm^T-R}{\sqrt{wCw^T}}\) attains its maximum on \(wu^T=1\) for the following choice of \(w\):
\[w=\frac{mC^{-1}-RuC^{-1}}{mC^{-1}u^T-RuC^{-1}u^T}.\]
We will use the method of Lagrange multipliers to find the maximum of the function \(S(w)\) under the condition \(G(w)=0\), where \(G(w)=wu^T-1\). We will first find the critical points. We have that \(\nabla G=u\) and
\begin{eqnarray*}\nabla S&=&\nabla\left(\frac{wm^T-R}{\sqrt{wCw^T}}\right)=\frac{\sqrt{wCw^T}\nabla\left(wm^T-R\right) -\left(wm^T-R\right)\nabla \sqrt{wCw^T} }
{wCw^T} \newline
&=&
\frac{\sqrt{wCw^T}m -\left(wm^T-R\right) \frac{wC}{\sqrt{wCw^T} }}
{wCw^T} =\frac{wCw^T \cdot m-\left(wm^T-R\right) wC}{\left(wCw^T\right)^{\frac32}}
.
\end{eqnarray*}
There are 5 categories of critical points.
- \(1^{\circ}\) Endpoints. There are no critical points in this category since weights can be any real numbers whose sum is equal to 1.
- \(2^{\circ}\) Points for which \(\nabla S\) or \(\nabla G\) is not defined. The gradient of \(G\) is clearly defined for all \(w\). The gradient of \(S\) contains a fraction, however its denominator is positive for each \(w\) because each portfolio is risky.
- \(3^{\circ}\) Points for which \(\nabla S=0\). The condition \(\nabla S=0\) implies
\[wCw^T\cdot m=\left(wm^T-R\right)\cdot wC.\] Multiplying both sides by \(w^T\) yields
\(wCw^T\cdot mw^T=\left(wm^T-R\right)\cdot wCw^T\). Since \(wCw^T > 0\), we can divide both sides by it and obtain
\(mw^T=wm^T-R\). Since \(mw^T=wm^T\) we conclude that \(R=0\), which contradicts our assumption \(R > 0\). Hence there are no critical points in this category.
- \(4^{\circ}\) Points for which \(\nabla G=0\). Since \(\nabla G=u\neq 0\), there are no critical points in this category.
- \(5^{\circ}\) Points for which \(\nabla S\|\nabla G\). We need to find those \(w\) for which there exists \(\lambda\) such that \(\nabla S(w)=\lambda \nabla G(w)\) and \(G(w)=0\). The first equation implies that
\(\frac{wCw^T \cdot m-\left(wm^T-R\right) wC}{\left(wCw^T\right)^{\frac32}}=\lambda u\). Multiplying both sides by \(\left(wCw^T\right)^{\frac32}\) and by \(w^T\) from the right, and using that \(uw^T=1\), yields
\(wCw^T\cdot mw^T-\left(wm^T-R\right) wCw^T=\lambda\left(wCw^T\right)^{\frac32}\) which together with \(wCw^T\neq 0\) yields
\(\lambda=\frac{R}{\sqrt{wCw^T}}\). The equation \(\nabla S(w)=\lambda \nabla G(w)\) now becomes:
\[wCw^T m-\left(wm^T-R\right)wC=RwCw^Tu.
\]
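Multiplying both sides of the previous equation by \(C^{-1}\) from the right gives \(wCw^T\cdot mC^{-1}-\left(wm^T-R\right)w=R\,wCw^T\cdot uC^{-1}\). Note that \(wm^T\neq R\), since otherwise the previous equation would give \(m=Ru\), a case that we exclude. Solving for \(w\) yields: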
\[w=\frac{wCw^T}{wm^T-R}\left(mC^{-1}-RuC^{-1}\right).\]
Multiplying both sides by \(u^T\) gives us
\(1=\frac{wCw^T}{wm^T-R}\left(mC^{-1}u^T-RuC^{-1}u^T\right)\), which is equivalent to
\(\frac{wCw^T}{wm^T-R}=\frac1{mC^{-1}u^T-RuC^{-1}u^T}\). Therefore
\[w=\frac{mC^{-1}-RuC^{-1}}{mC^{-1}u^T-RuC^{-1}u^T}.\]
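The only critical point is \(w=\frac{mC^{-1}-RuC^{-1}}{mC^{-1}u^T-RuC^{-1}u^T}\). Continuity of \(S\) alone does not guarantee that a maximum exists, because the set \(wu^T=1\) is unbounded. If \(S\) attains its maximum on this set, which is the case when \(mC^{-1}u^T-RuC^{-1}u^T > 0\) (that is, when \(R\) is smaller than the expected return of the minimum variance portfolio), then this maximum has to be attained at the critical point \(w\).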
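The weights of the theorem can be computed as in the following sketch (Python with NumPy). The data and the function names `sharpe` and `market_portfolio` are hypothetical. The sketch also assumes that the denominator \(mC^{-1}u^T-RuC^{-1}u^T\) is positive and raises an error otherwise; that check is a choice made here, not part of the derivation.

```python
import numpy as np

def sharpe(w, C, m, R):
    """The slope S(w) = (w m^T - R) / sqrt(w C w^T)."""
    return (w @ m - R) / np.sqrt(w @ C @ w)

def market_portfolio(C, m, R):
    """Weights (m C^{-1} - R u C^{-1}) / (m C^{-1} u^T - R u C^{-1} u^T)."""
    u = np.ones(len(m))
    # C is symmetric, so (m - R u) C^{-1} solves C x = m - R u.
    x = np.linalg.solve(C, m - R * u)
    denom = x @ u
    if denom <= 0:
        # Assumption of this sketch: only the case of a positive denominator is handled.
        raise ValueError("denominator m C^{-1} u^T - R u C^{-1} u^T is not positive")
    return x / denom

# Hypothetical inputs (illustrative numbers only).
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
m = np.array([0.08, 0.10, 0.14])
R = 0.03

w = market_portfolio(C, m, R)
print("weights:", w)
print("sum of weights:", w.sum())
print("slope S(w):", sharpe(w, C, m, R))

# Compare with the slope of the equally weighted portfolio.
equal = np.ones(3) / 3
print("slope of equal weights:", sharpe(equal, C, m, R))
```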
Example
Assume that \(\overrightarrow a\) and \(\overrightarrow b\) are the vectors such that for each \(\mu\) the minimal variance portfolio with expected return \(\mu\) is given by \(\overrightarrow{w}_{\mu}=\mu\overrightarrow a+\overrightarrow b\). Prove that the sum of all components of \(\overrightarrow a\) is equal to \(0\) and the sum of all components of the vector \(\overrightarrow b\) is equal to \(1\).
Since \(\overrightarrow{w}_{\mu}\) are the weights of a portfolio for every \(\mu\), we must have \(\overrightarrow{w}_{\mu}\cdot \overrightarrow{u}^T=1\), where \(\overrightarrow u\) is the vector all of whose coordinates are \(1\).
The last equation implies \(1=\mu \overrightarrow a\cdot \overrightarrow u^T+\overrightarrow b\cdot \overrightarrow u^T\) for every \(\mu\). The numbers \(\overrightarrow a\cdot\overrightarrow u^T\) and \(\overrightarrow b\cdot\overrightarrow u^T\) do not depend on \(\mu\), so this identity can hold for every \(\mu\) only if
\(\overrightarrow a\cdot\overrightarrow u^T=0\) and \(\overrightarrow b\cdot\overrightarrow u^T=1\).
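The same conclusion can also be checked directly from the formula that defines \(\overrightarrow a\) and \(\overrightarrow b\) in the derivation of the minimal variance line. The following is a sketch that uses the notation \(M\), \(m\), \(u\), \(C\) from that derivation and assumes that \(M\) is invertible. Multiplying the identity \(\left[\begin{array}{c}\overrightarrow a\newline \overrightarrow b\end{array}\right]=M^{-1}\left[\begin{array}{c}mC^{-1}\newline uC^{-1}\end{array}\right]\) by \(u^T\) from the right gives
\[\left[\begin{array}{c}\overrightarrow a\cdot u^T\newline \overrightarrow b\cdot u^T\end{array}\right]=M^{-1}\left[\begin{array}{c}mC^{-1}u^T\newline uC^{-1}u^T\end{array}\right]=M^{-1}M\left[\begin{array}{c}0\newline 1\end{array}\right]=\left[\begin{array}{c}0\newline 1\end{array}\right],\]
because the column \(\left[\begin{array}{c}mC^{-1}u^T\newline uC^{-1}u^T\end{array}\right]\) is exactly the second column of \(M\).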