Optimize Portfolios using the Markowitz Model

Tidy Finance Webinar Series

Christoph Scheuch

Historical context & significance

Harry Markowitz pioneered modern portfolio theory

1952: influential paper on theory of portfolio selection (cited over 63,000 times)
1990: Sveriges Riksbank Prize in Economic Sciences (with M. Miller & W. Sharpe)

What is modern portfolio theory?

How to optimally allocate wealth across assets with different characteristics, e.g. returns, risks, correlations?

Individual asset risks & their correlations matter
Trade-off between expected returns & risk
Mean-variance analysis as a key tool
Foundation for portfolio & risk management

Maximize expected returns …

Expected return

The profit you anticipate from an investment
\(\mu_i\) represents the expected return of asset \(i\)

Example: expect a 10% return from Apple over next 12 months

… while minimizing risks

Risk

Returns are volatile
More volatility \(\rightarrow\) more risk
\(\sigma_i\) represents the volatility of asset \(i\)

Example: Apple’s stock might move \(\pm15\)% over next year

Markowitz: correlations across assets also matter!

The power of diversification in reducing risk

Fruit basket analogy

If all you have are apples & they spoil, you lose everything
With a variety, some fruits may spoil, but others will stay fresh

Diversification in investment

Spread investments across assets to reduce overall risk
Diversify across stocks, bonds, real estate, commodities, etc.

Outline of this webinar

Estimate expected returns
Estimate the variance-covariance matrix
Calculate portfolio returns & volatility
Calculate the minimum-variance portfolio
Calculate the efficient frontier

Expected returns based on sample average returns

Sum past returns & divide by number of periods:

\(\hat{\mu}_i = \frac{1}{T} \sum_{t=1}^{T} r_{it}\)
\(r_{it}\) is return in period \(t\) and \(T\) is number of periods

Example:

Historical returns for \(i\) over 5 years: 8%, 10%, 6%, 12%, 9%
\(\hat{\mu}_i = (8\% + 10\% + 6\% + 12\% + 9\%) / 5 = 9\%\)

Assumption: past performance is indicative of future performance
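
The sample average from the example can be reproduced in a few lines of base R. This is only a sketch using the five illustrative returns from the slide, not real data; `returns` and `mu_hat` are names chosen for this example.

```r
# Illustrative annual returns from the example (not real data)
returns <- c(0.08, 0.10, 0.06, 0.12, 0.09)

# Sample average: sum of past returns divided by the number of periods
mu_hat <- sum(returns) / length(returns)

# Base R's mean() yields the same estimate
all.equal(mu_hat, mean(returns))
mu_hat
```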

library(tidyverse)
library(tidyfinance)

symbols <- download_data(
  type = "constituents",
  index = "Dow Jones Industrial Average"
)

prices_daily <- download_data(
    type = "stock_prices", symbol = symbols$symbol,
    start_date = "2019-08-01", end_date = "2024-07-31"
) |> 
  select(symbol, date,  price = adjusted_close)

# A tibble: 37,710 × 3
   symbol date       price
   <chr>  <date>     <dbl>
 1 UNH    2019-08-01  231.
 2 UNH    2019-08-02  232.
 3 UNH    2019-08-05  227.
 4 UNH    2019-08-06  230.
 5 UNH    2019-08-07  229.
 6 UNH    2019-08-08  230.
 7 UNH    2019-08-09  231.
 8 UNH    2019-08-12  226.
 9 UNH    2019-08-13  231.
10 UNH    2019-08-14  226.
# ℹ 37,700 more rows

Calculate daily returns

returns_daily <- prices_daily |>
  group_by(symbol) |> 
  mutate(ret = price / lag(price) - 1) |>
  ungroup() |> 
  select(symbol, date, ret) |> 
  drop_na(ret) |> 
  arrange(symbol, date)

# A tibble: 37,680 × 3
   symbol date            ret
   <chr>  <date>        <dbl>
 1 AAPL   2019-08-02 -0.0212 
 2 AAPL   2019-08-05 -0.0523 
 3 AAPL   2019-08-06  0.0189 
 4 AAPL   2019-08-07  0.0104 
 5 AAPL   2019-08-08  0.0221 
 6 AAPL   2019-08-09 -0.00824
 7 AAPL   2019-08-12 -0.00254
 8 AAPL   2019-08-13  0.0423 
 9 AAPL   2019-08-14 -0.0298 
10 AAPL   2019-08-15 -0.00498
# ℹ 37,670 more rows
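
The return definition `price / lag(price) - 1` can be checked by hand on a tiny series. The following base-R sketch uses made-up prices; `prices` and `ret` are hypothetical names for this illustration only.

```r
# Hypothetical price path (illustrative values only)
prices <- c(100, 102, 99.96)

# Simple return: today's price divided by the previous price, minus one.
# The first observation has no predecessor, so it yields no return.
ret <- prices[-1] / prices[-length(prices)] - 1
ret
```

Two returns result from three prices, which is why the returns table has one row fewer per symbol than the price table.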

Calculate average returns

assets <- returns_daily |> 
  group_by(symbol) |> 
  summarize(mu = mean(ret))

fig_mu <- assets |> 
  ggplot(aes(x = mu, y = fct_reorder(symbol, mu), 
             fill = mu > 0)) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) + 
  labs(x = NULL, y = NULL, fill = NULL,
       title = "Average daily returns of DOW index constituents")

Volatility measures individual asset risk

\[\hat{\sigma}_i = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)^2}\]

Interpretation: higher volatility indicates higher risk
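
To make the formula concrete, here is a minimal sketch that computes the sample volatility step by step and compares it with `sd()`. The five returns are the illustrative values from the expected-return example; `T_periods` and `sigma_hat` are names assumed for this sketch.

```r
# Illustrative returns (same made-up values as in the expected-return example)
returns <- c(0.08, 0.10, 0.06, 0.12, 0.09)
T_periods <- length(returns)
mu_hat <- mean(returns)

# Sample volatility: square root of squared deviations divided by T - 1
sigma_hat <- sqrt(sum((returns - mu_hat)^2) / (T_periods - 1))

# sd() uses the same T - 1 denominator, so both values should agree
all.equal(sigma_hat, sd(returns))
```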

Estimating volatilities

volatilities <- returns_daily |> 
  group_by(symbol) |> 
  summarize(sigma = sd(ret))

assets <- assets |> 
  left_join(volatilities, join_by(symbol))

fig_sigma <- assets |> 
  ggplot(aes(x = sigma, y = fct_reorder(symbol, sigma))) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) + 
  labs(x = NULL, y = NULL,
       title = "Daily volatilities of DOW index constituents")

Covariance measures interaction between assets

\[\hat{\sigma}_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)(r_{jt} - \hat{\mu}_j)\]

Interpretation:

Positive: assets move in the same direction, potentially increasing portfolio risk
Negative: assets move in opposite directions, which can reduce risk through diversification
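
A small sketch may help connect the formula to R's `cov()`. The two return series below are invented for illustration; `ret_a`, `ret_b`, `cov_ab`, and `cor_ab` are hypothetical names.

```r
# Two hypothetical return series (illustrative values only)
ret_a <- c(0.02, -0.01, 0.03, 0.00)
ret_b <- c(0.01, -0.02, 0.02, 0.01)
n_obs <- length(ret_a)

# Sample covariance: average product of deviations, with a T - 1 denominator
cov_ab <- sum((ret_a - mean(ret_a)) * (ret_b - mean(ret_b))) / (n_obs - 1)
all.equal(cov_ab, cov(ret_a, ret_b))

# Correlation rescales the covariance by both volatilities
cor_ab <- cov_ab / (sd(ret_a) * sd(ret_b))
all.equal(cor_ab, cor(ret_a, ret_b))
```

In this toy case the covariance is positive, so the two assets tend to move in the same direction.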

Estimating the variance-covariance matrix

returns_wide <- returns_daily |> 
  pivot_wider(names_from = symbol, values_from = ret) 

sigma <- returns_wide |> 
  select(-date) |> 
  cov()

# Use a separate name so the volatility figure `fig_sigma` is not overwritten
fig_cov <- sigma |> 
  as_tibble(rownames = "symbol_a") |> 
  pivot_longer(-symbol_a, names_to = "symbol_b") |> 
  ggplot(aes(x = symbol_a, y = fct_rev(symbol_b),
             fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") + 
  labs(x = NULL, y = NULL, fill = "(Co-)Variance",
       title = "Variance-covariance matrix of Dow Jones Industrial Average constituents")

Calculate expected portfolio returns

\(\text{Expected Portfolio Return} = \sum_{i=1}^n \omega_i \hat{\mu}_i\)

\(\omega_i\): weight of asset \(i\) in the portfolio
\(\hat{\mu}_i\): estimated expected return of asset \(i\)

Example:

Asset A: 60% weight, expected return 8%
Asset B: 40% weight, expected return 12%
\((0.6 \times 8\%) + (0.4 \times 12\%) = 9.6\%\)

Assumption: portfolio weights are constant over time
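
The two-asset example can be written as a weighted sum in R. This is a sketch with the illustrative numbers from the slide; `omega`, `mu_hat`, and `mu_portfolio` are names assumed here.

```r
# Weights and expected returns from the two-asset example (illustrative)
omega  <- c(A = 0.6, B = 0.4)
mu_hat <- c(A = 0.08, B = 0.12)

# Weights of a fully invested portfolio sum to one
stopifnot(isTRUE(all.equal(sum(omega), 1)))

# Expected portfolio return: weighted sum of expected asset returns
mu_portfolio <- sum(omega * mu_hat)
mu_portfolio
```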

Calculate the portfolio variance

Portfolio variance is calculated as

\[\sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]

\(\omega_i\), \(\omega_j\): the weights of assets \(i\), \(j\) in the portfolio
\(\hat{\sigma}_{ij}\): covariance between returns of assets \(i\) and \(j\)
\(n\): number of assets in portfolio
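
The double sum is equivalent to the quadratic form \(\omega'\hat{\Sigma}\omega\). The sketch below checks this on a hypothetical two-asset covariance matrix (volatilities of 20% and 30%, covariance 0.006); all numbers and names are assumptions for illustration.

```r
# Hypothetical weights and variance-covariance matrix (illustrative values)
omega <- c(0.6, 0.4)
sigma_hat <- matrix(c(0.04, 0.006,
                      0.006, 0.09), nrow = 2, byrow = TRUE)
n <- length(omega)

# Portfolio variance as an explicit double sum over all asset pairs
var_loop <- 0
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    var_loop <- var_loop + omega[i] * omega[j] * sigma_hat[i, j]
  }
}

# The same quantity in matrix notation
var_matrix <- as.numeric(t(omega) %*% sigma_hat %*% omega)
all.equal(var_loop, var_matrix)

# Portfolio volatility versus the weighted average of asset volatilities
vol_portfolio <- sqrt(var_matrix)
vol_average <- sum(omega * sqrt(diag(sigma_hat)))
vol_portfolio < vol_average
```

Because the two hypothetical assets are far from perfectly correlated, the portfolio volatility falls below the weighted average of the individual volatilities, which is the diversification effect.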

The minimum-variance framework

Minimize portfolio variance

\[\min_{\omega_1, \ldots, \omega_n} \sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]

while staying fully invested

\[\sum_{i=1}^{n} \omega_i = 1\]

Minimum variance in matrix notation

Minimize portfolio variance

\[\min_{\omega} \omega' \hat{\Sigma} \omega\]

while staying fully invested

\[ \omega'\iota = 1\]

Solution for minimum-variance portfolio

\[\omega_\text{mvp} = \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}\]

\(\iota\): vector of 1’s
\(\Sigma^{-1}\): inverse of variance-covariance matrix \(\Sigma\)

iota <- rep(1, dim(sigma)[1])
sigma_inv <- solve(sigma)
omega_mvp <- as.vector(sigma_inv %*% iota) / 
  as.numeric(t(iota) %*% sigma_inv %*% iota)
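
As a sanity check, the closed-form solution can be tried on a small hypothetical covariance matrix. The sketch below uses made-up numbers and `_toy` names so that it does not overwrite `sigma`, `iota`, or `omega_mvp`; `port_var()` is a helper defined only for this example.

```r
# Hypothetical 3-asset variance-covariance matrix (illustrative values)
sigma_toy <- matrix(c(0.04,  0.006, 0.002,
                      0.006, 0.09,  0.010,
                      0.002, 0.010, 0.0225), nrow = 3, byrow = TRUE)
iota_toy <- rep(1, nrow(sigma_toy))
sigma_toy_inv <- solve(sigma_toy)

# Closed-form minimum-variance weights
omega_toy <- as.vector(sigma_toy_inv %*% iota_toy) /
  as.numeric(t(iota_toy) %*% sigma_toy_inv %*% iota_toy)

# Helper: portfolio variance for a weight vector
port_var <- function(w) as.numeric(t(w) %*% sigma_toy %*% w)

# The weights are fully invested
sum(omega_toy)

# Shifting weight between assets (still summing to one) should raise variance
shift <- c(0.05, -0.05, 0)
port_var(omega_toy) < port_var(omega_toy + shift)

# The equal-weighted portfolio should not beat the minimum-variance one either
port_var(omega_toy) <= port_var(rep(1 / 3, 3))
```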

Interpreting results

assets <- bind_cols(assets, omega_mvp = omega_mvp)

fig_omega_mvp <- assets |>
    ggplot(aes(x = omega_mvp, y = fct_reorder(symbol, omega_mvp), 
               fill = omega_mvp > 0)) +
    geom_col() +
    scale_x_continuous(labels = scales::percent) + 
    labs(x = NULL, y = NULL, 
         title = "Minimum-variance portfolio weights")

Minimum-variance portfolio return

mu <- assets$mu

summary_mvp <- tibble(
  mu = sum(omega_mvp * mu),
  sigma = as.numeric(sqrt(t(omega_mvp) %*% sigma %*% omega_mvp)),
  type = "Minimum-Variance Portfolio"
)

summary_mvp

# A tibble: 1 × 3
        mu   sigma type                      
     <dbl>   <dbl> <chr>                     
1 0.000326 0.00932 Minimum-Variance Portfolio

Efficient portfolios

Minimize portfolio variance

\[\min_{\omega} \omega' \hat{\Sigma} \omega\]

while staying fully invested

\[ \omega'\iota = 1\]

& earning minimum expected return \(\bar{\mu}\)

\[\omega'\hat{\mu} = \bar{\mu}\]

Dow Jones vs Nasdaq 100

Choose a minimum expected return

Achieve at least the average daily Nasdaq 100 return:

mu_bar <- download_data(
  "stock_prices", symbol = "^NDX", 
  start_date = "2019-08-01", end_date = "2024-07-31"
) |> 
  mutate(
    ret = adjusted_close / lag(adjusted_close) - 1
  ) |> 
  summarize(mean(ret, na.rm = TRUE)) |> 
  pull()

Note: \(\bar\mu\) needs to be higher than \(\hat\mu_{mvp}\)

Solution for efficient portfolio

\[\omega_{efp} = \omega_\text{mvp} + \frac{\lambda^*}{2}\left(\Sigma^{-1}\mu -\frac{D}{C}\Sigma^{-1}\iota \right)\]

where \(\lambda^* = 2\frac{\bar\mu - D/C}{E-D^2/C}\), \(C = \iota'\Sigma^{-1}\iota\), \(D=\iota'\Sigma^{-1}\mu\) & \(E=\mu'\Sigma^{-1}\mu\)

See details on tidy-finance.org

Calculate efficient portfolio

C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
lambda_tilde <- as.numeric(2 * (mu_bar - D / C) / (E - D^2 / C))
omega_efp <- as.vector(omega_mvp + lambda_tilde / 2 * (sigma_inv %*% mu - D * omega_mvp))

summary_efp <- tibble(
  mu = sum(omega_efp * mu),
  sigma = as.numeric(sqrt(t(omega_efp) %*% sigma %*% omega_efp)),
  type = "Efficient Portfolio"
)
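
The computation above can be verified on toy inputs: the resulting weights should sum to one and deliver exactly the target return. The following sketch uses a hypothetical covariance matrix, hypothetical expected returns, and an assumed target of 9%; all `_toy` names exist only in this example.

```r
# Hypothetical inputs (illustrative values only)
sigma_toy <- matrix(c(0.04,  0.006, 0.002,
                      0.006, 0.09,  0.010,
                      0.002, 0.010, 0.0225), nrow = 3, byrow = TRUE)
mu_toy <- c(0.05, 0.10, 0.07)
mu_bar_toy <- 0.09

iota_toy <- rep(1, length(mu_toy))
sigma_toy_inv <- solve(sigma_toy)

# Scalars from the closed-form solution
C_toy <- as.numeric(t(iota_toy) %*% sigma_toy_inv %*% iota_toy)
D_toy <- as.numeric(t(iota_toy) %*% sigma_toy_inv %*% mu_toy)
E_toy <- as.numeric(t(mu_toy) %*% sigma_toy_inv %*% mu_toy)

# Minimum-variance weights and the tilt towards higher expected returns
omega_mvp_toy <- as.vector(sigma_toy_inv %*% iota_toy) / C_toy
lambda_toy <- 2 * (mu_bar_toy - D_toy / C_toy) / (E_toy - D_toy^2 / C_toy)
omega_efp_toy <- as.vector(
  omega_mvp_toy +
    lambda_toy / 2 * (sigma_toy_inv %*% mu_toy - D_toy * omega_mvp_toy)
)

# Both constraints should hold: full investment and the target return
all.equal(sum(omega_efp_toy), 1)
all.equal(sum(omega_efp_toy * mu_toy), mu_bar_toy)
```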

Minimum variance vs efficient portfolio

summaries <- bind_rows(
  assets, summary_mvp, summary_efp
) 

fig_summaries <- summaries |> 
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |>  filter(is.na(type))) +
  geom_point(data = summaries |>  filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) + 
  labs(x = "Volatility", y = "Average return",
       title = "Efficient & minimum-variance portfolios for DOW index constituents",
       subtitle = "Points correspond to individual assets")

The efficient frontier

Mutual fund separation theorem: any linear combination of two efficient portfolios (with weights summing to one) lies on the frontier as well

\[\omega_{eff} = a \cdot \omega_{efp} + (1-a) \cdot\omega_{mvp}\]

Highest achievable expected return at each level of risk
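
The separation result can be illustrated numerically: combining the minimum-variance portfolio with one efficient portfolio should reproduce the closed-form solution for the implied target return. The sketch below assumes toy inputs, and `frontier_weights()` is a hypothetical helper written for this example, not a function from any package.

```r
# Hypothetical helper: closed-form frontier weights for a target return
frontier_weights <- function(mu_bar, mu, sigma) {
  iota <- rep(1, length(mu))
  sigma_inv <- solve(sigma)
  C_val <- as.numeric(t(iota) %*% sigma_inv %*% iota)
  D_val <- as.numeric(t(iota) %*% sigma_inv %*% mu)
  E_val <- as.numeric(t(mu) %*% sigma_inv %*% mu)
  omega_mvp <- as.vector(sigma_inv %*% iota) / C_val
  lambda <- 2 * (mu_bar - D_val / C_val) / (E_val - D_val^2 / C_val)
  as.vector(omega_mvp + lambda / 2 * (sigma_inv %*% mu - D_val * omega_mvp))
}

# Hypothetical inputs (illustrative values only)
sigma_toy <- matrix(c(0.04,  0.006, 0.002,
                      0.006, 0.09,  0.010,
                      0.002, 0.010, 0.0225), nrow = 3, byrow = TRUE)
mu_toy <- c(0.05, 0.10, 0.07)

# Minimum-variance portfolio and its expected return
omega_mvp_toy <- as.vector(solve(sigma_toy) %*% rep(1, 3)) /
  sum(solve(sigma_toy))
mu_mvp_toy <- sum(omega_mvp_toy * mu_toy)

# One efficient portfolio with an assumed target return of 9%
omega_efp_toy <- frontier_weights(0.09, mu_toy, sigma_toy)

# Combine both portfolios with a = 2 (weights 2 and -1 sum to one)
a <- 2
omega_combo <- a * omega_efp_toy + (1 - a) * omega_mvp_toy

# The combination should match the closed form for its own expected return
mu_combo <- sum(omega_combo * mu_toy)
all.equal(omega_combo, frontier_weights(mu_combo, mu_toy, sigma_toy))
```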

Calculate the efficient frontier

efficient_frontier <- tibble(
  a = seq(from = -1, to = 4, by = 0.01),
) |> 
  mutate(
    omega = map(a, ~ .x * omega_efp + (1 - .x) * omega_mvp),
    mu = map_dbl(omega, ~ t(.x) %*% mu),
    sigma = map_dbl(omega, ~ sqrt(t(.x) %*% sigma %*% .x)),
  )

Visualizing the efficient frontier

summaries <- bind_rows(
    summaries, efficient_frontier
  )

fig_efficient_frontier <- summaries |> 
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |> filter(is.na(type))) +
  geom_point(data = summaries |> filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) + 
  labs(x = "Volatility", y = "Average return",
       title = "Efficient frontier for DOW index constituents",
       subtitle = "Black points correspond to individual assets & frontier portfolios")

Replicate minimum-variance via PortfolioAnalytics

library(PortfolioAnalytics)
library(CVXR)

returns_matrix <- column_to_rownames(
  returns_wide, var = "date"
)

problem_mvp <- portfolio.spec(colnames(returns_matrix)) |>
  add.objective(type = "risk", name = "var") |> 
  add.constraint("full_investment")

solution_mvp <- optimize.portfolio(
  returns_matrix, problem_mvp, optimize_method = "CVXR"
)

all.equal(omega_mvp, as.vector(solution_mvp$weights))

[1] TRUE

Replicate efficient portfolio via PortfolioAnalytics

problem_efp <- problem_mvp |> 
  add.constraint("return", return_target = mu_bar)

solution_efp <- optimize.portfolio(
  returns_matrix, problem_efp, optimize_method = "CVXR"
)

all.equal(omega_efp, as.vector(solution_efp$weights))

[1] TRUE

Easy to extend Markowitz model

Short sale constraints: add.constraint("long_only")

Position limit: add.constraint("position_limit", max_pos = 10)

Expected shortfall: add.objective(type = "risk", name = "ES")

… and many more, see the official PortfolioAnalytics vignette

Recap & key takeaways

Mean-variance framework is a cornerstone of finance
Download financial data using tidyfinance package
Easy to compute analytic solutions ‘manually’
Implement extensions using PortfolioAnalytics
More advanced: constrained optimization & backtesting

Follow on LinkedIn for news: christophscheuch

Slides on: talks.tidy-finance.org