Optimize Portfolios using the Markowitz Model

Tidy Finance Webinar Series

Christoph Scheuch

Historical context & significance

Harry Markowitz pioneered modern portfolio theory

  • 1952: influential paper on theory of portfolio selection (cited over 63,000 times)
  • 1990: Sveriges Riksbank Prize in Economic Sciences (with M. Miller & W. Sharpe)

What is modern portfolio theory?

How to optimally allocate wealth across assets with different characteristics, e.g. returns, risks, correlations?

  • Individual asset risks & their correlations matter
  • Trade-off between expected returns & risk
  • Mean-variance analysis as a key tool
  • Foundation for portfolio & risk management

Maximize expected returns

Expected return

  • The profit you anticipate from an investment
  • \(\mu_i\) represents the expected return of asset \(i\)

Example: expect a 10% return from Apple over next 12 months

… while minimizing risks

Risk

  • Returns are volatile
  • More volatility \(\rightarrow\) more risk
  • \(\sigma_i\) represents the volatility of asset \(i\)

Example: Apple’s stock might move \(\pm15\)% over next year


Markowitz: correlations across assets also matter!

The power of diversification in reducing risk

Fruit basket analogy

  • If all you have are apples & they spoil, you lose everything
  • With a variety, some fruits may spoil, but others will stay fresh


Diversification in investment

  • Spread investments across assets to reduce overall risk
  • Diversify across stocks, bonds, real estate, commodities, etc.
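
A minimal sketch of the diversification effect, under simplifying assumptions that are not part of the webinar data: all assets share the same hypothetical volatility, all pairwise correlations are zero, and weights are equal. In that special case portfolio volatility shrinks with the square root of the number of assets.

```r
# Assumptions (illustrative only): identical volatilities, zero correlations,
# equal weights. The numbers are hypothetical.
sigma_asset <- 0.15          # volatility of each single asset
n_assets <- c(1, 4, 16, 64)  # portfolio sizes to compare

portfolio_volatility <- sapply(n_assets, function(n) {
  omega <- rep(1 / n, n)                  # equal weights that sum to one
  sigma_matrix <- diag(sigma_asset^2, n)  # diagonal variance-covariance matrix
  sqrt(as.numeric(t(omega) %*% sigma_matrix %*% omega))
})

data.frame(n_assets, portfolio_volatility)
```

With positive correlations the reduction is smaller, which is why the following steps estimate the full variance-covariance matrix.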

Outline of this webinar

  1. Estimate expected returns
  2. Estimate the variance-covariance matrix
  3. Calculate portfolio returns & volatility
  4. Calculate the minimum variance portfolio
  5. Calculate the efficient frontier

Expected returns based on sample average returns

Sum past returns & divide by number of periods:

  • \(\hat{\mu}_i = \frac{1}{T} \sum_{t=1}^{T} r_{it}\)
  • \(r_{it}\) is return in period \(t\) and \(T\) is number of periods

Example:

  • Historical returns for \(i\) over 5 years: 8%, 10%, 6%, 12%, 9%
  • \(\hat{\mu}_i = (8\% + 10\% + 6\% + 12\% + 9\%) / 5 = 9\%\)

Assumption: past performance is indicative of future performance
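
The five-year example can be reproduced in a few lines of base R. This is only a sketch of the estimator; the variable names are chosen for illustration.

```r
# Historical returns of the hypothetical asset i from the example
returns <- c(0.08, 0.10, 0.06, 0.12, 0.09)

# Sample average: sum of past returns divided by the number of periods
mu_hat <- sum(returns) / length(returns)

# mean() computes the same quantity
all.equal(mu_hat, mean(returns))
mu_hat
```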

Download daily stock prices

library(tidyverse)
library(tidyfinance)

symbols <- download_data(
  type = "constituents",
  index = "Dow Jones Industrial Average"
)

prices_daily <- download_data(
    type = "stock_prices", symbol = symbols$symbol,
    start_date = "2019-08-01", end_date = "2024-07-31"
) |> 
  select(symbol, date,  price = adjusted_close)
# A tibble: 37,710 × 3
   symbol date       price
   <chr>  <date>     <dbl>
 1 UNH    2019-08-01  231.
 2 UNH    2019-08-02  232.
 3 UNH    2019-08-05  227.
 4 UNH    2019-08-06  230.
 5 UNH    2019-08-07  229.
 6 UNH    2019-08-08  230.
 7 UNH    2019-08-09  231.
 8 UNH    2019-08-12  226.
 9 UNH    2019-08-13  231.
10 UNH    2019-08-14  226.
# ℹ 37,700 more rows

Calculate daily returns

returns_daily <- prices_daily |>
  group_by(symbol) |> 
  mutate(ret = price / lag(price) - 1) |>
  ungroup() |> 
  select(symbol, date, ret) |> 
  drop_na(ret) |> 
  arrange(symbol, date)
# A tibble: 37,680 × 3
   symbol date            ret
   <chr>  <date>        <dbl>
 1 AAPL   2019-08-02 -0.0212 
 2 AAPL   2019-08-05 -0.0523 
 3 AAPL   2019-08-06  0.0189 
 4 AAPL   2019-08-07  0.0104 
 5 AAPL   2019-08-08  0.0221 
 6 AAPL   2019-08-09 -0.00824
 7 AAPL   2019-08-12 -0.00254
 8 AAPL   2019-08-13  0.0423 
 9 AAPL   2019-08-14 -0.0298 
10 AAPL   2019-08-15 -0.00498
# ℹ 37,670 more rows
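
To see what price / lag(price) - 1 does, here is a small base R sketch on hypothetical prices. The first observation has no predecessor and therefore no return, which is consistent with the 30 rows dropped above (37,710 vs 37,680), one per constituent.

```r
# Hypothetical adjusted closing prices on four consecutive trading days
price <- c(100, 102, 99.96, 104.958)

# Simple return: today's price over yesterday's price, minus one.
# Base R equivalent of price / lag(price) - 1 with the leading NA dropped.
ret <- price[-1] / price[-length(price)] - 1
ret
```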

Calculate average returns

assets <- returns_daily |> 
  group_by(symbol) |> 
  summarize(mu = mean(ret))

fig_mu <- assets |> 
  ggplot(aes(x = mu, y = fct_reorder(symbol, mu), 
             fill = mu > 0)) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) + 
  labs(x = NULL, y = NULL, fill = NULL,
       title = "Average daily returns of DOW index constituents")

Volatility measures individual asset risk

\[\hat{\sigma}_i = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)^2}\]


Interpretation: higher volatility indicates higher risk
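
As a quick check of the formula, the sketch below applies it to the five hypothetical annual returns from the expected-return example and compares the result with sd(), which uses the same \(T-1\) denominator.

```r
# Hypothetical returns from the expected-return example
returns <- c(0.08, 0.10, 0.06, 0.12, 0.09)
n_periods <- length(returns)
mu_hat <- mean(returns)

# Sample volatility: square root of the average squared deviation,
# with T - 1 in the denominator
sigma_hat <- sqrt(sum((returns - mu_hat)^2) / (n_periods - 1))

# sd() implements the same estimator
all.equal(sigma_hat, sd(returns))
sigma_hat
```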

Estimating volatilities

volatilities <- returns_daily |> 
  group_by(symbol) |> 
  summarize(sigma = sd(ret))

assets <- assets |> 
  left_join(volatilities, join_by(symbol))

fig_sigma <- assets |> 
  ggplot(aes(x = sigma, y = fct_reorder(symbol, sigma))) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) + 
  labs(x = NULL, y = NULL,
       title = "Daily volatilities of DOW index constituents")

Covariance measures interaction between assets

\[\hat{\sigma}_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)(r_{jt} - \hat{\mu}_j)\]


Interpretation:

  • Positive: assets move in the same direction, potentially increasing portfolio risk
  • Negative: assets move in opposite directions, which can reduce risk through diversification
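
The covariance formula can be checked by hand on two short, hypothetical return series. The sketch also rescales the covariance into a correlation, which is easier to read because it lies between -1 and 1.

```r
# Hypothetical returns of two assets over five periods
r_i <- c(0.08, 0.10, 0.06, 0.12, 0.09)
r_j <- c(0.02, 0.05, 0.01, 0.07, 0.05)
n_periods <- length(r_i)

# Sample covariance: average product of deviations, with T - 1 denominator
sigma_ij <- sum((r_i - mean(r_i)) * (r_j - mean(r_j))) / (n_periods - 1)
all.equal(sigma_ij, cov(r_i, r_j))

# Correlation: covariance divided by the product of the two volatilities
rho_ij <- sigma_ij / (sd(r_i) * sd(r_j))
all.equal(rho_ij, cor(r_i, r_j))
```

Here the covariance is positive, so the two hypothetical assets tend to move in the same direction.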

Estimating the variance-covariance matrix

returns_wide <- returns_daily |> 
  pivot_wider(names_from = symbol, values_from = ret) 

sigma <- returns_wide |> 
  select(-date) |> 
  cov()

# Use a separate name so the volatility figure fig_sigma is not overwritten
fig_sigma_matrix <- sigma |> 
  as_tibble(rownames = "symbol_a") |> 
  pivot_longer(-symbol_a, names_to = "symbol_b") |> 
  ggplot(aes(x = symbol_a, y = fct_rev(symbol_b),
             fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") + 
  labs(x = NULL, y = NULL, fill = "(Co-)Variance",
       title = "Variance-covariance matrix of Dow Jones Industrial Average constituents") 

Calculate expected portfolio returns

\(\text{Expected Portfolio Return} = \sum_{i=1}^n \omega_i \hat{\mu}_i\)

  • \(\omega_i\): weight of asset \(i\) in the portfolio
  • \(\hat{\mu}_i\): estimated expected return of asset \(i\)

Example:

  • Asset A: 60% weight, expected return 8%
  • Asset B: 40% weight, expected return 12%
  • \((0.6 \times 8\%) + (0.4 \times 12\%) = 9.6\%\)

Assumption: portfolio weights are constant over time
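
The two-asset example translates directly into R. This is a sketch with the hypothetical weights and expected returns from the example.

```r
# Weights and expected returns of the two hypothetical assets
omega <- c(A = 0.6, B = 0.4)
mu_hat <- c(A = 0.08, B = 0.12)

# Weights must sum to one for a fully invested portfolio
stopifnot(isTRUE(all.equal(sum(omega), 1)))

# Expected portfolio return: weighted sum of expected asset returns
mu_portfolio <- sum(omega * mu_hat)
mu_portfolio
```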

Calculate the portfolio variance

Portfolio variance is calculated as

\[\sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]

  • \(\omega_i\), \(\omega_j\): the weights of assets \(i\), \(j\) in the portfolio
  • \(\hat{\sigma}_{ij}\): covariance between returns of assets \(i\) and \(j\)
  • \(n\): number of assets in portfolio
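
A small sketch with two hypothetical assets shows that the double sum equals the matrix product \(\omega'\hat{\Sigma}\omega\). The volatilities, correlation and weights below are assumptions for illustration, not estimates from the webinar data.

```r
# Hypothetical inputs: weights, volatilities and correlation of two assets
omega <- c(0.6, 0.4)
sigma_assets <- c(0.20, 0.30)
rho <- 0.25

# Variance-covariance matrix built from volatilities and correlation
covariance <- rho * sigma_assets[1] * sigma_assets[2]
sigma_matrix <- matrix(
  c(sigma_assets[1]^2, covariance,
    covariance, sigma_assets[2]^2),
  nrow = 2
)

# Portfolio variance as the explicit double sum over i and j
variance_sum <- 0
for (i in seq_along(omega)) {
  for (j in seq_along(omega)) {
    variance_sum <- variance_sum + omega[i] * omega[j] * sigma_matrix[i, j]
  }
}

# Same quantity in matrix notation
variance_matrix <- as.numeric(t(omega) %*% sigma_matrix %*% omega)

portfolio_volatility <- sqrt(variance_matrix)
weighted_avg_volatility <- sum(omega * sigma_assets)
c(portfolio_volatility, weighted_avg_volatility)
```

Because the correlation is below one, portfolio volatility is lower than the weighted average of the individual volatilities.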

The minimum-variance framework

Minimize portfolio variance

\[\min_{\omega_1, \ldots, \omega_n} \sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]

while staying fully invested

\[\sum_{i=1}^{n} \omega_i = 1\]

Minimum variance in matrix notation

Minimize portfolio variance

\[\min_{\omega} \omega' \hat{\Sigma} \omega\]

while staying fully invested

\[ \omega'\iota = 1\]

Solution for minimum-variance portfolio

\[\omega_\text{mvp} = \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}\]

  • \(\iota\): vector of 1’s
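  • \(\Sigma^{-1}\): inverse of variance-covariance matrix \(\Sigma\)

# Vector of ones with one entry per asset
iota <- rep(1, dim(sigma)[1])
# Invert the estimated variance-covariance matrix
sigma_inv <- solve(sigma)
# Closed-form minimum-variance weights; they sum to one by construction
omega_mvp <- as.vector(sigma_inv %*% iota) / 
  as.numeric(t(iota) %*% sigma_inv %*% iota)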
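
For readers who want to see where the closed form comes from, here is a brief derivation sketch. It assumes \(\Sigma\) is symmetric and invertible; \(\lambda\) is the Lagrange multiplier on the full-investment constraint, a symbol introduced here for the derivation only.

```latex
\begin{aligned}
\mathcal{L}(\omega, \lambda) &= \omega'\Sigma\omega - \lambda\,(\omega'\iota - 1) \\
0 &= 2\Sigma\omega - \lambda\iota
  \quad\Rightarrow\quad \omega = \tfrac{\lambda}{2}\,\Sigma^{-1}\iota \\
1 &= \iota'\omega = \tfrac{\lambda}{2}\,\iota'\Sigma^{-1}\iota
  \quad\Rightarrow\quad \tfrac{\lambda}{2} = \frac{1}{\iota'\Sigma^{-1}\iota} \\
\omega_\text{mvp} &= \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}
\end{aligned}
```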

Interpreting results

assets <- bind_cols(assets, omega_mvp = omega_mvp)

fig_omega_mvp <- assets |>
    ggplot(aes(x = omega_mvp, y = fct_reorder(symbol, omega_mvp), 
               fill = omega_mvp > 0)) +
    geom_col() +
    scale_x_continuous(labels = scales::percent) + 
    labs(x = NULL, y = NULL, 
         title = "Minimum-variance portfolio weights") 

Minimum-variance portfolio return

mu <- assets$mu

summary_mvp <- tibble(
  mu = sum(omega_mvp * mu),
  sigma = as.numeric(sqrt(t(omega_mvp) %*% sigma %*% omega_mvp)),
  type = "Minimum-Variance Portfolio"
)

summary_mvp
# A tibble: 1 × 3
        mu   sigma type                      
     <dbl>   <dbl> <chr>                     
1 0.000326 0.00932 Minimum-Variance Portfolio

Efficient portfolios

Minimize portfolio variance

\[\min_{\omega} \omega' \hat{\Sigma} \omega\]

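while staying fully invested & earning minimum expected return \(\bar{\mu}\)

  • \(\omega'\iota = 1\)
  • \(\omega'\hat{\mu} = \bar{\mu}\) (the minimum-return constraint binds when \(\bar{\mu}\) exceeds the minimum-variance portfolio's expected return)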

Dow Jones vs Nasdaq 100

Choose a minimum expected return

Achieve at least average Nasdaq 100 return:

mu_bar <- download_data(
  "stock_prices", symbol = "^NDX", 
  start_date = "2019-08-01", end_date = "2024-07-31"
) |> 
  mutate(
    ret = adjusted_close / lag(adjusted_close) - 1
  ) |> 
  summarize(mean(ret, na.rm = TRUE)) |> 
  pull() 


Note: \(\bar\mu\) needs to be higher than \(\hat\mu_{mvp}\)

Solution for efficient portfolio

\[\omega_{efp} = \omega_{mvp} + \frac{\lambda^*}{2}\left(\Sigma^{-1}\mu -\frac{D}{C}\Sigma^{-1}\iota \right)\]

where \(\lambda^* = 2\frac{\bar\mu - D/C}{E-D^2/C}\), \(C = \iota'\Sigma^{-1}\iota\), \(D=\iota'\Sigma^{-1}\mu\) & \(E=\mu'\Sigma^{-1}\mu\)


See details on tidy-finance.org

Calculate efficient portfolio

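# Scalars that appear in the closed-form solution
C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
# lambda_tilde is the multiplier written as \lambda^* in the formula
lambda_tilde <- as.numeric(2 * (mu_bar - D / C) / (E - D^2 / C))
# D * omega_mvp equals (D / C) * sigma_inv %*% iota
omega_efp <- as.vector(omega_mvp + lambda_tilde / 2 * (sigma_inv %*% mu - D * omega_mvp))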

summary_efp <- tibble(
  mu = sum(omega_efp * mu),
  sigma = as.numeric(sqrt(t(omega_efp) %*% sigma %*% omega_efp)),
  type = "Efficient Portfolio"
)
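
A sanity check of the closed form, as a sketch on hypothetical inputs for three assets (the expected returns, covariance matrix and target below are made up, not estimated from the Dow Jones data). The resulting weights should sum to one and hit the target return exactly.

```r
# Hypothetical inputs for three assets
mu <- c(0.05, 0.08, 0.12)
sigma <- matrix(
  c(0.040, 0.006, 0.004,
    0.006, 0.090, 0.010,
    0.004, 0.010, 0.160),
  nrow = 3
)
mu_bar <- 0.10  # target expected return

# Minimum-variance portfolio
iota <- rep(1, length(mu))
sigma_inv <- solve(sigma)
omega_mvp <- as.vector(sigma_inv %*% iota) /
  as.numeric(t(iota) %*% sigma_inv %*% iota)

# Efficient portfolio via the closed-form solution
C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
lambda_tilde <- 2 * (mu_bar - D / C) / (E - D^2 / C)
omega_efp <- as.vector(
  omega_mvp + lambda_tilde / 2 * (sigma_inv %*% mu - D * omega_mvp)
)

# Both constraints hold: full investment and target return
sum(omega_efp)
sum(omega_efp * mu)

# The higher target return costs additional variance
variance_mvp <- as.numeric(t(omega_mvp) %*% sigma %*% omega_mvp)
variance_efp <- as.numeric(t(omega_efp) %*% sigma %*% omega_efp)
c(variance_mvp, variance_efp)
```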

Minimum variance vs efficient portfolio

summaries <- bind_rows(
  assets, summary_mvp, summary_efp
) 

fig_summaries <- summaries |> 
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |>  filter(is.na(type))) +
  geom_point(data = summaries |>  filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) + 
  labs(x = "Volatility", y = "Average return",
       title = "Efficient & minimum-variance portfolios for DOW index constituents",
       subtitle = "Points correspond to individual assets") 

The efficient frontier

Mutual fund separation theorem: any linear combination of two efficient portfolios whose weights sum to one lies on the same mean-variance frontier

\[\omega_{eff} = a \cdot \omega_{efp} + (1-a) \cdot\omega_{mvp}\]

Highest achievable expected return at each level of risk

Calculate the efficient frontier

efficient_frontier <- tibble(
  a = seq(from = -1, to = 4, by = 0.01),
) |> 
  mutate(
    omega = map(a, ~ .x * omega_efp + (1 - .x) * omega_mvp),
    mu = map_dbl(omega, ~ t(.x) %*% mu),
    sigma = map_dbl(omega, ~ sqrt(t(.x) %*% sigma %*% .x)),
  ) 
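
The combination approach can be cross-checked against the closed form. The sketch below uses hypothetical three-asset inputs and a helper, frontier_weights(), that is defined here for illustration only. Combining the two portfolios with weight a should give the same weights as solving directly for the target return \(a \bar{\mu} + (1-a) \mu_{mvp}\).

```r
# Hypothetical inputs for three assets (not estimated from the webinar data)
mu <- c(0.05, 0.08, 0.12)
sigma <- matrix(
  c(0.040, 0.006, 0.004,
    0.006, 0.090, 0.010,
    0.004, 0.010, 0.160),
  nrow = 3
)

iota <- rep(1, length(mu))
sigma_inv <- solve(sigma)
C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
omega_mvp <- as.vector(sigma_inv %*% iota) / C
mu_mvp <- sum(omega_mvp * mu)

# Illustrative helper: closed-form frontier weights for a target return
frontier_weights <- function(mu_bar) {
  lambda_tilde <- 2 * (mu_bar - D / C) / (E - D^2 / C)
  as.vector(omega_mvp + lambda_tilde / 2 * (sigma_inv %*% mu - D * omega_mvp))
}

mu_bar <- 0.10
omega_efp <- frontier_weights(mu_bar)

# Combine the two portfolios and compare with the direct solution
a <- 2
omega_combined <- a * omega_efp + (1 - a) * omega_mvp
omega_direct <- frontier_weights(a * mu_bar + (1 - a) * mu_mvp)
all.equal(omega_combined, omega_direct)
```

Values of a below zero imply an expected return below that of the minimum-variance portfolio, so those combinations trace the lower branch of the frontier.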

Visualizing the efficient frontier

summaries <- bind_rows(
    summaries, efficient_frontier
  )

fig_efficient_frontier <- summaries |> 
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |> filter(is.na(type))) +
  geom_point(data = summaries |> filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) + 
  labs(x = "Volatility", y = "Average return",
       title = "Efficient frontier for DOW index constituents",
       subtitle = "Points correspond to individual assets") 

Replicate minimum-variance via PortfolioAnalytics

library(PortfolioAnalytics)
library(CVXR)

returns_matrix <- column_to_rownames(
  returns_wide, var = "date"
)

problem_mvp <- portfolio.spec(colnames(returns_matrix)) |>
  add.objective(type = "risk", name = "var") |> 
  add.constraint("full_investment")

solution_mvp <- optimize.portfolio(
  returns_matrix, problem_mvp, optimize_method = "CVXR"
)

all.equal(omega_mvp, as.vector(solution_mvp$weights))
[1] TRUE

Replicate efficient portfolio via PortfolioAnalytics

problem_efp <- problem_mvp |> 
  add.constraint("return", return_target = mu_bar)

solution_efp <- optimize.portfolio(
  returns_matrix, problem_efp, optimize_method = "CVXR"
)

all.equal(omega_efp, as.vector(solution_efp$weights)) 
[1] TRUE

Easy to extend Markowitz model

Short sale constraints: add.constraint("long_only")

Position limit: add.constraint("position_limit", max_pos = 10)

Expected shortfall: add.objective(type = "risk", name = "ES")


… and many more, see official PortfolioAnalytics vignette

Recap & key takeaways

  • Mean-variance framework is a cornerstone of finance
  • Download financial data using tidyfinance package
  • Easy to compute analytic solutions ‘manually’
  • Implement extensions using PortfolioAnalytics
  • More advanced: constrained optimization & backtesting

Follow on LinkedIn for news: christophscheuch

Slides on: talks.tidy-finance.org