Suppose you have a large data set of assets and would like to construct a portfolio that is as mean reverting as possible while remaining sparse. How do you extract such a portfolio from the data set? That is exactly what this code is meant to do.
Being sparse means that the portfolio contains only a few assets. There are a few reasons to avoid holding many assets:
- High transaction costs
- Overfitting
- Instability
- Little variance
Generally, the more assets a portfolio has, the more mean reverting it is, but also the less volatile. We would like to find a middle ground: a portfolio that is sufficiently mean reverting and sufficiently volatile while still being stable.
The set-up of this code was inspired by this article by Alex D'Aspremont. Nevertheless, the explanation below should be self-contained enough to convey the theory needed to follow along with the provided code. Now, let's have some free lunch!
We assume that the assets in our portfolio satisfy the vector autoregressive process
Our portfolio
For simplicity, let us first assume that
In general
First, as the covariance matrix
This procedure can be used to test whether multiple I(1) time series are cointegrated. I(1) refers to the order of integration: a series is I(1) if taking its first difference, that is, subtracting the previous value from the current value, makes it stationary. Similarly, a series is I(2) if it must be differenced twice to become stationary, and so on.
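To make the differencing described above concrete, here is a minimal sketch in plain Python. The helper name `difference` and the simulated random walk are illustrative assumptions, not part of the provided code. A random walk is the textbook case of a series that becomes stationary after one round of differencing, because differencing recovers the underlying shocks.

```python
import random


def difference(series, order=1):
    """Difference a series `order` times (hypothetical helper for illustration).

    Each round replaces the series by current value minus previous value,
    so the result is `order` elements shorter than the input.
    """
    values = list(series)
    for _ in range(order):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Simulate a random walk: the running sum of independent Gaussian shocks.
random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 0.0
for shock in shocks:
    level += shock
    walk.append(level)

# One round of differencing gives back the shocks (all but the first one),
# which are stationary by construction.
recovered = difference(walk)
```

Passing `order=2` applies the same operation twice, which is what a series needing two rounds of differencing would require.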
So far we have been looking at constructing a mean-reverting portfolio but not a sparse one. We want to find the most mean-reverting portfolio in our entire asset universe that only contains at most
The optimization problem is NP-hard. Therefore, we aim for a suboptimal solution that is still good enough and fast to compute. One of the algorithms that we can employ for this objective is called Greedy Search. In short, this algorithm does the following:
- Use a brute force technique to get the most mean-reverting pair of assets
- Add the one asset that yields the most mean-reverting triplet
- Continue adding assets in this way until you reach $k$ assets
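The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation shipped with this code: `greedy_search` and its `score` argument are hypothetical names, and `score` stands in for whatever measure of mean reversion is used (lower is taken to mean more mean reverting).

```python
from itertools import combinations


def greedy_search(n_assets, k, score):
    """Greedily select k asset indices out of n_assets.

    `score` is a hypothetical callable that maps a tuple of asset indices
    to a float, where a lower value means a more mean-reverting portfolio.
    """
    if not 2 <= k <= n_assets:
        raise ValueError("k must be between 2 and n_assets")

    # Step 1: brute force over all pairs to find the best starting pair.
    selected = list(min(combinations(range(n_assets), 2), key=score))

    # Step 2 onwards: add the single asset that gives the best enlarged
    # subset, and repeat until k assets have been selected.
    while len(selected) < k:
        remaining = [i for i in range(n_assets) if i not in selected]
        best_new = min(remaining, key=lambda i: score(tuple(selected) + (i,)))
        selected.append(best_new)

    return selected
```

Each step only evaluates the assets not yet selected, so the number of `score` calls grows roughly quadratically in the number of assets for the pair search and linearly for each later step, instead of enumerating every subset of size $k$.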
To start Greedy Search, we use brute force to find the most mean-reverting pair. For practical reasons, in the interest of diversification, we also add the constraint that the weight assigned to a single asset may not exceed 80%. Minimizing predictability often yields weights that put 99% or more of the total capital in a single asset, which is clearly undesirable.
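As a rough illustration of the 80% cap, the check below measures each asset's share of the total absolute weight. The function name `satisfies_weight_cap` and the normalization by the sum of absolute weights (so that long and short positions both count towards capital) are assumptions made for this sketch.

```python
def satisfies_weight_cap(weights, cap=0.8):
    """Return True if no single asset exceeds `cap` of the total capital.

    Assumption: capital share is |w_i| divided by the sum of |w_j|, so
    short positions count towards the total just like long positions.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        return False  # An all-zero portfolio is not a valid candidate.
    return max(abs(w) for w in weights) / total <= cap
```

A candidate pair with weights such as `[0.99, -0.01]` would be rejected by this check, while `[0.6, -0.4]` would pass.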
Also, even with a sparse portfolio, it does not make sense to have assets in our portfolio that have a tiny weight assigned
to them. Therefore, we propose an additional constraint that the smallest weight in