Get Rich with ‘stocks’

Dane Van Domelen
vandomed@gmail.com 2020-04-26

Installation

You can install and load stocks from GitHub via the following code:

devtools::install_github("vandomed/stocks")
library("stocks")

Package overview

The stocks package has a variety of functions for analyzing investments and investment strategies. I use it for a lot of my articles on Seeking Alpha. The package relies heavily on Yahoo! Finance for historical prices and on the quantmod package for downloading that data into R.

There are functions for calculating performance metrics, visualizing the performance of funds and multi-fund portfolios, and backtesting trading strategies. The main functions are:

Function               Purpose
load_prices            Download Historical Prices
load_gains             Download Historical Gains
plot_growth            Plot Investment Growth
calc_metrics           Calculate Performance Metrics
calc_metrics_overtime  Calculate Performance Metrics over Time
calc_metrics_2funds    Calculate Performance Metrics for Two-Fund Portfolios
calc_metrics_3funds    Calculate Performance Metrics for Three-Fund Portfolios
plot_metrics           Plot One Performance Metric (Sorted Bar Plot) or One vs. Another (Scatterplot)
plot_metrics_overtime  Plot One Performance Metric vs. Time or One vs. Another over Time
plot_metrics_2funds    Plot One Performance Metric vs. Another for Two-Fund Portfolios
plot_metrics_3funds    Plot One Performance Metric vs. Another for Three-Fund Portfolios

Motivating example: A two-fund stocks and bonds portfolio

Rationale

Stocks and bonds are the primary building blocks for a retirement portfolio, and I think the ETFs SPY and TLT pair together nicely for an effective two-fund strategy. Let’s look at the performance of these funds separately and together.

Assess each fund’s performance over their mutual lifetimes

We can use load_gains to download historical daily gains for SPY and TLT over their mutual lifetimes:

library("stocks")
gains <- load_gains(c("SPY", "TLT"), to = "2018-12-31")
head(gains)
#>            Date      SPY      TLT
#> 2395 2002-07-31  0.00242  0.01239
#> 2396 2002-08-01 -0.02611  0.00569
#> 2397 2002-08-02 -0.02241  0.01024
#> 2398 2002-08-05 -0.03480  0.00441
#> 2399 2002-08-06  0.03366 -0.00855
#> 2400 2002-08-07  0.01744  0.00240
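
As a rough sketch of what a daily "gain" means here (assuming the usual convention of proportional change from one closing price to the next, which I have not checked against what load_gains does internally), the calculation looks like this on made-up prices:

```r
# Made-up closing prices for three consecutive trading days
toy_prices <- c(100, 102, 99.96)

# Proportional gain from each day to the next: (p[t] - p[t-1]) / p[t-1]
toy_gains <- diff(toy_prices) / head(toy_prices, -1)
toy_gains
```

The variable names toy_prices and toy_gains are illustrative only and are not part of the package.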

We can call (or pipe into) calc_metrics to calculate some performance metrics. calc_metrics returns a normal data frame, but I’ll call knitr::kable to print it as a neat-looking table:

metrics <- calc_metrics(gains)
knitr::kable(metrics)

Fund  CAGR (%)  Max drawdown (%)  Mean (%)  SD (%)  Sharpe ratio  Annualized alpha (%)  Beta    Correlation
SPY   8.49      55.2              0.039     1.168   0.034         0.0                   1.000   1.000
TLT   6.31      26.6              0.028     0.844   0.033         10.4                  -0.292  -0.404

We see here that SPY has achieved stronger growth (8.5% vs. 6.3%), but with a much worse max drawdown (55.2% vs. 26.6%). TLT’s Sharpe ratio (a measure of risk-adjusted returns) is essentially the same as SPY’s (0.033 vs. 0.034).
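
To make these metrics concrete, here is a minimal base R sketch of how max drawdown, CAGR, and a daily Sharpe ratio could be computed from a vector of daily gains. The conventions are assumptions on my part (252 trading days per year, zero risk-free rate) and may not match calc_metrics exactly; the gains are made up.

```r
# Made-up daily gains (as proportions), for illustration only
toy_gains <- c(0.01, -0.02, 0.015, -0.005, 0.02)

# Balance over time for $1 invested, including the starting balance
balance <- c(1, cumprod(1 + toy_gains))

# Max drawdown: largest proportional drop from a running peak
mdd <- max(1 - balance / cummax(balance))

# CAGR, assuming 252 trading days per year
years <- length(toy_gains) / 252
cagr <- (balance[length(balance)] / balance[1])^(1 / years) - 1

# Sharpe ratio on daily gains, assuming a zero risk-free rate
sharpe <- mean(toy_gains) / sd(toy_gains)

c(mdd = mdd, cagr = cagr, sharpe = sharpe)
```

With only five days of data the CAGR is meaningless, of course; the point is just the mechanics.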

Without getting too far ahead of myself, TLT’s positive alpha (10.4% annualized, or roughly 0.039% per day) and negative beta (-0.292) are precisely why it pairs so well with SPY. This isn’t unique to TLT; all bond funds should generate alpha (otherwise, don’t invest!), and they’re often negatively correlated with equities.
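
The alpha, beta, and correlation figures are tied together by simple arithmetic. Assuming alpha and beta are the intercept and slope from an ordinary least-squares regression of TLT’s daily gains on SPY’s, and that annualizing compounds over 252 trading days (both are my assumptions rather than something confirmed from the package source), the values in the metrics table can be approximately reproduced:

```r
# Values from the metrics table above (daily figures, in percent)
mean_spy <- 0.039
mean_tlt <- 0.028
sd_spy   <- 1.168
sd_tlt   <- 0.844
r        <- -0.404   # correlation of TLT with SPY

# Beta is the regression slope: correlation times the ratio of SDs
beta <- r * sd_tlt / sd_spy

# Alpha is the regression intercept: TLT's mean gain minus beta times SPY's
alpha_daily <- mean_tlt - beta * mean_spy

# Compound the daily alpha over an assumed 252 trading days
alpha_annual <- ((1 + alpha_daily / 100)^252 - 1) * 100

round(c(beta = beta, alpha_daily = alpha_daily, alpha_annual = alpha_annual), 3)
```

Because the inputs are rounded table values, the results only match the table to about the precision shown there.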

For a visual comparison of the returns and volatility of these two ETFs, we can plot mean vs. SD using plot_metrics.

plot_metrics(metrics, mean ~ sd)

No surprise, the S&P 500 ETF had more growth, but also higher volatility.

(Side note: You could achieve the same plot by specifying gains rather than metrics, or by simply specifying the tickers input.)

How reliable is TLT’s negative correlation?

Negative correlation works wonders for a two-fund portfolio, so let’s look at how consistently TLT achieves negative correlation with SPY, using calc_metrics_overtime and plot_metrics_overtime. For illustrative purposes, I’ll include the full 3-step process: load historical gains, calculate the correlation over time, and generate the plot.

c("SPY", "TLT") %>%
  load_gains(to = "2018-12-31") %>%
  calc_metrics_overtime("r") %>%
  plot_metrics_overtime(r ~ .)

While the tendency is certainly for negative correlation, there’s a lot of variability, and in some years the correlation was actually slightly positive.

As you can see, the default behavior is to calculate the requested metric on a per-year basis. You can also request per-month calculations or rolling windows of a certain width (see ?calc_metrics_overtime). And the Pearson correlation is just one of many metrics you can plot (see ?calc_metrics for the full list).
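
For intuition about what a per-year calculation involves, here is a base R sketch that splits a data frame of daily gains by calendar year and computes the Pearson correlation within each year. It uses simulated data (not real SPY or TLT gains), and it is only an illustration of the idea, not the code calc_metrics_overtime runs.

```r
set.seed(1)

# Simulated weekday dates and daily gains; NOT real SPY/TLT data
dates <- seq(as.Date("2016-01-01"), as.Date("2018-12-31"), by = "day")
dates <- dates[!as.POSIXlt(dates)$wday %in% c(0, 6)]  # drop Sundays and Saturdays
spy <- rnorm(length(dates), mean = 0.0004, sd = 0.0117)
tlt <- -0.3 * spy + rnorm(length(dates), mean = 0.0004, sd = 0.008)
sim <- data.frame(Date = dates, SPY = spy, TLT = tlt)

# Split rows by calendar year and compute the Pearson correlation in each
yearly_r <- sapply(split(sim, format(sim$Date, "%Y")),
                   function(d) cor(d$SPY, d$TLT))
yearly_r
```

Swapping the year label for a year-month label (format "%Y-%m") would give a per-month version of the same idea.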

Everyone loves piping these days, but for typical use cases I would actually recommend skipping directly to plot_metrics_overtime. If you specify tickers, it will download the data it needs on the fly. This code is much shorter and produces the same figure as above:

plot_metrics_overtime(formula = r ~ ., tickers = "TLT")

A 50-50 blend

A 50% SPY, 50% TLT portfolio should generate much better risk-adjusted returns than SPY (and perhaps TLT) itself, but a 50% bonds allocation is pretty high so raw returns will probably be lower.

To look at this, we can add a column to gains and then call calc_metrics, requesting a few particular metrics:

gains$`50-50` <- gains$SPY * 0.5 + gains$TLT * 0.5
calc_metrics(gains, c("cagr", "mdd", "sharpe", "sortino")) %>%
  knitr::kable()

Fund   CAGR (%)  Max drawdown (%)  Sharpe ratio  Sortino ratio
SPY    8.49      55.2              0.034         0.042
TLT    6.31      26.6              0.033         0.050
50-50  8.37      23.0              0.059         0.082

Indeed, while the 50-50 portfolio achieved slightly lower raw returns than SPY alone, its max drawdown was far better, and its Sharpe and Sortino ratios indicated much better risk-adjusted growth compared to the individual ETFs.
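
One detail worth spelling out: averaging the two funds’ daily gains, as in the 50-50 column, corresponds to a portfolio that is rebalanced back to 50/50 every day. A buy-and-hold 50/50 portfolio drifts away from its starting weights and ends up with a different balance. The toy numbers below are made up purely to show the difference:

```r
# Made-up daily gains for two funds over two days
fund_a <- c(0.10, -0.05)
fund_b <- c(-0.02, 0.03)

# Daily rebalancing: the portfolio gain each day is the weighted average
rebalanced_growth <- prod(1 + 0.5 * fund_a + 0.5 * fund_b)

# Buy and hold: split $1 at the start and let each half compound on its own
buy_hold_growth <- 0.5 * prod(1 + fund_a) + 0.5 * prod(1 + fund_b)

c(rebalanced = rebalanced_growth, buy_and_hold = buy_hold_growth)
```

Neither approach is better in general; which one ends up ahead depends on how the two funds happen to move.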

What’s the optimal allocation?

That will likely depend on what metric you want to maximize. In terms of raw growth, roughly 75% SPY is optimal, but the curve is pretty flat: the CAGR is roughly the same from 60% to 100% SPY.

plot_metrics_2funds(gains = gains, 
                    formula = cagr ~ allocation, 
                    tickers = c("SPY", "TLT"), 
                    from = "2010-01-01")

In terms of risk-adjusted growth, the Sharpe ratio curve is somewhat more interesting. The maximum Sharpe ratio occurs around 40% SPY, and the Sharpe ratio gets much worse as you approach 60% SPY and higher.

plot_metrics_2funds(gains = gains, 
                    formula = sharpe ~ allocation, 
                    tickers = c("SPY", "TLT"), 
                    from = "2010-01-01")

We can gain additional insight by plotting two metrics against each other, across all possible allocations. A common strategy is to plot the mean vs. standard deviation as a function of the allocation:

plot_metrics_2funds(gains = gains, 
                    formula = mean ~ sd, 
                    tickers = c("SPY", "TLT"), 
                    from = "2010-01-01")

This plot yields an interesting finding: starting at 100% TLT, increasing the allocation to SPY simultaneously reduces volatility and increases returns. In other words, you’d be crazy not to ride the curve up and to the left, adding at least a 30% SPY allocation.

A big caveat is that this is all based on historical data. There’s no guarantee that 30% SPY, 70% TLT will have lower volatility or greater returns than TLT going forward.
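
If you want to see what sits behind curves like these, the sketch below sweeps the SPY allocation from 0% to 100% in 10% steps and computes the mean, SD, and Sharpe ratio of the daily-rebalanced blend at each step. It uses simulated, negatively correlated gains (not real SPY or TLT data) and assumes a zero risk-free rate; it is an illustration of the idea, not how plot_metrics_2funds is implemented.

```r
set.seed(1)

# Simulated daily gains; NOT real SPY/TLT data
n <- 2000
spy <- rnorm(n, mean = 0.0004, sd = 0.0117)
tlt <- -0.3 * spy + rnorm(n, mean = 0.0004, sd = 0.008)

# Metrics for each allocation from 0% to 100% SPY, rebalanced daily
alloc <- seq(0, 1, by = 0.1)
sweep_df <- do.call(rbind, lapply(alloc, function(w) {
  blend <- w * spy + (1 - w) * tlt
  data.frame(spy_alloc = w, mean = mean(blend), sd = sd(blend),
             sharpe = mean(blend) / sd(blend))
}))

# Allocation with the lowest volatility on this simulated data
sweep_df$spy_alloc[which.min(sweep_df$sd)]
```

With negatively correlated funds, the lowest-volatility blend lands somewhere between the two extremes, which is the same "up and to the left" effect described above.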

Three-fund portfolios

I think three-fund portfolios are the sweet spot in terms of balancing complexity and performance. With two funds, you’re relying on a single source of alpha generation; with more than three funds, it’s hard to visualize, and thus hard to understand whether the constituent funds actually complement each other.

I won’t go into full detail about it here, but three asset classes that I think work really well together are large-cap stocks, long-term bonds, and junk bonds. To visualize such a strategy, implemented via Vanguard mutual funds:

plot_metrics_3funds(formula = mean ~ sd, 
                    tickers = c("VFINX", "VBLTX", "VWEHX"), 
                    from = "2010-01-01")

100% VFINX maximizes expected returns, but also volatility. If you wanted to take on no more than one-half of the S&P’s volatility while maximizing returns, you could add an allocation to VBLTX (move from 100% VFINX to the left along the upper black curve). If you’re very conservative and want to target something like 0.4% volatility, a VWEHX allocation eventually becomes helpful (get off the black curve before it veers downward and to the right).
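
The same idea extends to three funds: enumerate every allocation whose weights sum to 100%, then compute the metrics for each one. Here is a minimal sketch with a coarse 25% grid and simulated gains for three hypothetical funds A, B, and C (not real fund data, and not the package’s own implementation):

```r
set.seed(1)

# Simulated daily gains for three hypothetical funds; NOT real fund data
n <- 1000
sim <- cbind(A = rnorm(n, 0.0005, 0.010),
             B = rnorm(n, 0.0003, 0.007),
             C = rnorm(n, 0.0003, 0.004))

# All allocations in 25% steps whose three weights sum to 100%
grid <- expand.grid(A = seq(0, 100, 25), B = seq(0, 100, 25))
grid$C <- 100 - grid$A - grid$B
grid <- grid[grid$C >= 0, ]

# Daily-rebalanced portfolio gains: one column per allocation
blends <- sim %*% t(as.matrix(grid) / 100)
grid$mean <- colMeans(blends)
grid$sd <- apply(blends, 2, sd)
head(grid)
```

Working in whole percentages keeps the "weights sum to 100" check exact; a finer grid only requires changing the step size.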

Mean vs. SD is the standard way of visualizing portfolios, but Sharpe ratio vs. SD is more useful for understanding how risk-adjusted performance varies with allocation. If we plot Sharpe ratio vs. SD, the benefit of adding exposure to bonds becomes clearer:

plot_metrics_3funds(formula = sharpe ~ sd, 
                    tickers = c("VFINX", "VBLTX", "VWEHX"), 
                    from = "2010-01-01")

Groovy! By the way, if you want to see individual data points on the plot (i.e. what allocation each data point corresponds to) you can just set plotly = TRUE when you call plot_metrics_3funds or any of the other plotting functions.

Feedback and bugs

You can find me on Twitter at @DaneVanDomelen, and of course feel free to make feature requests and collaborate directly on GitHub.

Version history

Version  Updates
1.0      Original
1.2-1.4  Added functions, bug fixes, etc.
2.0      Switched to ggplot, added piping support, simplified functions for calculating metrics

References

Ryan, Jeffrey A., and Joshua M. Ulrich. 2017. Quantmod: Quantitative Financial Modelling Framework. https://CRAN.R-project.org/package=quantmod.