Optimize Portfolios using the Markowitz Model
Tidy Finance Webinar Series
Historical context & significance
Harry Markowitz pioneered modern portfolio theory
1952: influential paper on theory of portfolio selection (cited over 63,000 times)
1990: Sveriges Riksbank Prize in Economic Sciences (with M. Miller & W. Sharpe)
What is modern portfolio theory?
How to optimally allocate wealth across assets with different characteristics, e.g. returns, risks, correlations?
Individual asset risks & their correlations matter
Trade-off between expected returns & risk
Mean-variance analysis as a key tool
Foundation for portfolio & risk management
Maximize expected returns …
Expected return
The profit you anticipate from an investment
\(\mu_i\) represents the expected return of asset \(i\)
Example: expect a 10% return from Apple over next 12 months
… while minimizing risks
Risk
Returns are volatile
More volatility \(\rightarrow\) more risk
\(\sigma_i\) represents the volatility of asset \(i\)
Example: Apple’s stock might move \(\pm 15\%\) over the next year
Markowitz: correlations across assets also matter!
The power of diversification in reducing risk
Fruit basket analogy
If all you have are apples & they spoil, you lose everything
With a variety, some fruits may spoil, but others will stay fresh
Diversification in investment
Spread investments across assets to reduce overall risk
Diversify across stocks, bonds, real estate, commodities, etc.
Outline of this webinar
Estimate expected returns
Estimate the variance-covariance matrix
Calculate portfolio returns & volatility
Calculate the minimum variance portfolio
Calculate the efficient frontier
Expected returns based on sample average returns
Sum past returns & divide by number of periods:
\(\hat{\mu}_i = \frac{1}{T} \sum_{t=1}^{T} r_{it}\)
\(r_{it}\) is return in period \(t\) and \(T\) is number of periods
Example:
Historical returns for \(i\) over 5 years: 8%, 10%, 6%, 12%, 9%
\(\hat{\mu}_i = (8\% + 10\% + 6\% + 12\% + 9\%) / 5 = 9\%\)
Assumption: past performance is indicative of future performance
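A minimal base R sketch of the sample-average estimator, using the five hypothetical annual returns from the example (the variable names `r` and `mu_hat` are illustrative and not part of the webinar code):

```r
# Hypothetical annual returns from the example: 8%, 10%, 6%, 12%, 9%
r <- c(0.08, 0.10, 0.06, 0.12, 0.09)

# Sample average: sum of past returns divided by the number of periods
mu_hat <- sum(r) / length(r)

# mean() computes the same estimate
all.equal(mu_hat, mean(r))
mu_hat
```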
Download daily stock prices
library(tidyverse)
library(tidyfinance)
# Download the constituents of the Dow Jones Industrial Average
symbols <- download_data(
  type = "constituents",
  index = "Dow Jones Industrial Average"
)
# Download five years of daily prices and keep the adjusted close
prices_daily <- download_data(
  type = "stock_prices", symbol = symbols$symbol,
  start_date = "2019-08-01", end_date = "2024-07-31"
) |>
  select(symbol, date, price = adjusted_close)
# A tibble: 37,710 × 3
symbol date price
<chr> <date> <dbl>
1 UNH 2019-08-01 231.
2 UNH 2019-08-02 232.
3 UNH 2019-08-05 227.
4 UNH 2019-08-06 230.
5 UNH 2019-08-07 229.
6 UNH 2019-08-08 230.
7 UNH 2019-08-09 231.
8 UNH 2019-08-12 226.
9 UNH 2019-08-13 231.
10 UNH 2019-08-14 226.
# ℹ 37,700 more rows
Calculate daily returns
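# Simple daily return per stock: price relative to the previous trading day
returns_daily <- prices_daily |>
  group_by(symbol) |>
  mutate(ret = price / lag(price) - 1) |>
  ungroup() |>
  select(symbol, date, ret) |>
  drop_na(ret) |> # the first day of each stock has no previous price
  arrange(symbol, date)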
# A tibble: 37,680 × 3
symbol date ret
<chr> <date> <dbl>
1 AAPL 2019-08-02 -0.0212
2 AAPL 2019-08-05 -0.0523
3 AAPL 2019-08-06 0.0189
4 AAPL 2019-08-07 0.0104
5 AAPL 2019-08-08 0.0221
6 AAPL 2019-08-09 -0.00824
7 AAPL 2019-08-12 -0.00254
8 AAPL 2019-08-13 0.0423
9 AAPL 2019-08-14 -0.0298
10 AAPL 2019-08-15 -0.00498
# ℹ 37,670 more rows
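To make the return calculation concrete, here is a small base R sketch on three made-up prices. It mirrors `price / lag(price) - 1` without dplyr; the price values and variable names are assumptions chosen for illustration.

```r
# Hypothetical adjusted closing prices on three consecutive trading days
prices <- c(100, 102, 99.96)

# Divide each price by the previous one and subtract 1;
# the first day has no previous price, so two returns remain
ret <- prices[-1] / prices[-length(prices)] - 1
ret
```

With three prices there are only two returns, which matches one row fewer per stock in the returns table than in the price table.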
Calculate average returns
# Sample average daily return per stock
assets <- returns_daily |>
  group_by(symbol) |>
  summarize(mu = mean(ret))
fig_mu <- assets |>
  ggplot(aes(x = mu, y = fct_reorder(symbol, mu),
             fill = mu > 0)) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) +
  labs(x = NULL, y = NULL, fill = NULL,
       title = "Average daily returns of DOW index constituents")
Volatility measures individual asset risk
\[\hat{\sigma}_i = \sqrt{\frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)^2}\]
Interpretation: higher volatility indicates higher risk
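As a quick check of the volatility formula, the sketch below applies it to five hypothetical returns and compares the result with R's built-in `sd()`, which also divides by \(T-1\). The data and variable names are illustrative assumptions.

```r
# Hypothetical returns: 8%, 10%, 6%, 12%, 9%
r <- c(0.08, 0.10, 0.06, 0.12, 0.09)
mu_hat <- mean(r)

# Sample volatility: root of the squared deviations summed and divided by T - 1
sigma_hat <- sqrt(sum((r - mu_hat)^2) / (length(r) - 1))

# sd() implements the same estimator
all.equal(sigma_hat, sd(r))
sigma_hat
```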
Estimating volatilities
# Sample standard deviation of daily returns per stock
volatilities <- returns_daily |>
  group_by(symbol) |>
  summarize(sigma = sd(ret))
assets <- assets |>
  left_join(volatilities, join_by(symbol))
fig_sigma <- assets |>
  ggplot(aes(x = sigma, y = fct_reorder(symbol, sigma))) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) +
  labs(x = NULL, y = NULL,
       title = "Daily volatilities of DOW index constituents")
Covariance measures interaction between assets
\[\hat{\sigma}_{ij} = \frac{1}{T-1} \sum_{t=1}^{T} (r_{it} - \hat{\mu}_i)(r_{jt} - \hat{\mu}_j)\]
Interpretation:
Positive: assets move in the same direction, potentially increasing portfolio risk
Negative: assets move in opposite directions, which can reduce risk through diversification
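The sign interpretation can be illustrated with a small base R sketch. The three return series are made up: asset B tends to move with asset A, while asset C is constructed as the exact mirror image of A.

```r
# Hypothetical return series for three assets
r_a <- c(0.02, -0.01, 0.03, 0.00, 0.01)
r_b <- c(0.01, -0.02, 0.02, 0.01, 0.00)
r_c <- -r_a # moves exactly opposite to asset A

# Sample covariance written out as in the formula
cov_ab <- sum((r_a - mean(r_a)) * (r_b - mean(r_b))) / (length(r_a) - 1)

# cov() implements the same estimator
all.equal(cov_ab, cov(r_a, r_b))

cov_ab          # positive: A and B tend to move together
cov(r_a, r_c)   # negative: A and C move in opposite directions
```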
Estimating the variance-covariance matrix
# One column of returns per stock, one row per date
returns_wide <- returns_daily |>
  pivot_wider(names_from = symbol, values_from = ret)
# Sample variance-covariance matrix of daily returns
sigma <- returns_wide |>
  select(-date) |>
  cov()
# Heatmap of the matrix, stored separately from the volatility figure fig_sigma
fig_cov <- sigma |>
  as_tibble(rownames = "symbol_a") |>
  pivot_longer(-symbol_a, names_to = "symbol_b") |>
  ggplot(aes(x = symbol_a, y = fct_rev(symbol_b),
             fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(x = NULL, y = NULL, fill = "(Co-)Variance",
       title = "Variance-covariance matrix of Dow Jones Industrial Average constituents")
Calculate expected portfolio returns
\(\text{Expected Portfolio Return} = \sum_{i=1}^n \omega_i \hat{\mu}_i\)
\(\omega_i\): weight of asset \(i\) in the portfolio
\(\hat{\mu}_i\): estimated expected return of asset \(i\)
Example:
Asset A: 60% weight, expected return 8%
Asset B: 40% weight, expected return 12%
\((0.6 \times 8\%) + (0.4 \times 12\%) = 9.6\%\)
Assumption: portfolio weights are constant over time
Calculate the portfolio variance
Portfolio variance is calculated as
\[\sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]
\(\omega_i\), \(\omega_j\): the weights of assets \(i\), \(j\) in the portfolio
\(\hat{\sigma}_{ij}\): covariance between returns of assets \(i\) and \(j\)
\(n\): number of assets in portfolio
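A two-asset sketch of the portfolio variance, assuming hypothetical weights of 60% and 40%, volatilities of 15% and 25%, and a correlation of 0.2 (none of these numbers come from the webinar data). It evaluates the double sum directly and compares it with the equivalent matrix product.

```r
# Hypothetical inputs for two assets
omega <- c(0.6, 0.4)   # portfolio weights
vol <- c(0.15, 0.25)   # individual volatilities
rho <- 0.2             # correlation between the two assets

# Covariance matrix built from volatilities and the correlation
sigma_toy <- diag(vol) %*% matrix(c(1, rho, rho, 1), nrow = 2) %*% diag(vol)

# Double sum over all pairs of assets
var_sum <- 0
for (i in 1:2) {
  for (j in 1:2) {
    var_sum <- var_sum + omega[i] * omega[j] * sigma_toy[i, j]
  }
}

# The same quantity as a matrix product
var_mat <- as.numeric(t(omega) %*% sigma_toy %*% omega)

sqrt(var_mat)     # portfolio volatility
sum(omega * vol)  # weighted average of individual volatilities
```

In this toy setting the portfolio volatility is lower than the weighted average of the individual volatilities, which is the diversification effect at work.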
The minimum-variance framework
Minimize portfolio variance
\[\min_{\omega_1, \ldots, \omega_n} \sum_{i=1}^{n} \sum_{j=1}^{n} \omega_i \omega_j \hat{\sigma}_{ij}\]
while staying fully invested
\[\sum_{i=1}^{n} \omega_i = 1\]
Minimum variance in matrix notation
Minimize portfolio variance
\[\min_{\omega} \omega' \hat{\Sigma} \omega\]
while staying fully invested
\[ \omega'\iota = 1\]
Solution for minimum-variance portfolio
\[\omega_\text{mvp} = \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}\]
\(\iota\): vector of 1’s
\(\Sigma^{-1}\): inverse of variance-covariance matrix \(\Sigma\)
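A brief sketch of where the closed form comes from, assuming \(\Sigma\) is invertible. The Lagrange multiplier \(\lambda\) is introduced here only for the derivation.
\[\mathcal{L}(\omega, \lambda) = \omega'\Sigma\omega - \lambda(\omega'\iota - 1)\]
\[\frac{\partial \mathcal{L}}{\partial \omega} = 2\Sigma\omega - \lambda\iota = 0 \;\Rightarrow\; \omega = \frac{\lambda}{2}\Sigma^{-1}\iota\]
\[\iota'\omega = \frac{\lambda}{2}\,\iota'\Sigma^{-1}\iota = 1 \;\Rightarrow\; \frac{\lambda}{2} = \frac{1}{\iota'\Sigma^{-1}\iota}\]
Substituting \(\lambda/2\) back into the first-order condition gives \(\omega_\text{mvp}\).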
iota <- rep(1, dim(sigma)[1]) # vector of ones, one entry per asset
sigma_inv <- solve(sigma) # inverse of the variance-covariance matrix
omega_mvp <- as.vector(sigma_inv %*% iota) /
  as.numeric(t(iota) %*% sigma_inv %*% iota)
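The same calculation can be sanity-checked on a toy problem. The sketch below uses a hypothetical 2x2 covariance matrix and compares the matrix formula with the well-known two-asset expression \(\omega_1 = (\sigma_2^2 - \sigma_{12}) / (\sigma_1^2 + \sigma_2^2 - 2\sigma_{12})\). All `_toy` names and numbers are assumptions.

```r
# Hypothetical covariance matrix for two assets
sigma_toy <- matrix(c(0.04, 0.006,
                      0.006, 0.09), nrow = 2)
iota_toy <- rep(1, nrow(sigma_toy))
sigma_toy_inv <- solve(sigma_toy)

# Closed-form minimum-variance weights
omega_toy <- as.vector(sigma_toy_inv %*% iota_toy) /
  as.numeric(t(iota_toy) %*% sigma_toy_inv %*% iota_toy)

# Two-asset formula for the weight of the first asset
w1 <- (sigma_toy[2, 2] - sigma_toy[1, 2]) /
  (sigma_toy[1, 1] + sigma_toy[2, 2] - 2 * sigma_toy[1, 2])

sum(omega_toy)               # fully invested
all.equal(omega_toy[1], w1)  # both approaches agree
```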
Interpreting results
# Rows of assets and columns of sigma are both ordered alphabetically by symbol
assets <- bind_cols(assets, omega_mvp = omega_mvp)
fig_omega_mvp <- assets |>
  ggplot(aes(x = omega_mvp, y = fct_reorder(symbol, omega_mvp),
             fill = omega_mvp > 0)) +
  geom_col() +
  scale_x_continuous(labels = scales::percent) +
  labs(x = NULL, y = NULL,
       title = "Minimum-variance portfolio weights")
Minimum-variance portfolio return
mu <- assets$mu
# Expected daily return & volatility of the minimum-variance portfolio
summary_mvp <- tibble(
  mu = sum(omega_mvp * mu),
  sigma = as.numeric(sqrt(t(omega_mvp) %*% sigma %*% omega_mvp)),
  type = "Minimum-Variance Portfolio"
)
summary_mvp
# A tibble: 1 × 3
mu sigma type
<dbl> <dbl> <chr>
1 0.000326 0.00932 Minimum-Variance Portfolio
Efficient portfolios
Minimize portfolio variance
\[\min_{\omega} \omega' \hat{\Sigma} \omega\]
while staying fully invested
\[\omega'\iota = 1\]
and earning minimum expected return \(\bar{\mu}\)
\[\omega'\hat{\mu} = \bar{\mu}\]
Dow Jones vs Nasdaq 100
Choose a minimum expected return
Achieve at least average Nasdaq 100 return:
# Average daily return of the Nasdaq 100 index over the same sample period
mu_bar <- download_data(
  "stock_prices", symbol = "^NDX",
  start_date = "2019-08-01", end_date = "2024-07-31"
) |>
  mutate(
    ret = adjusted_close / lag(adjusted_close) - 1
  ) |>
  summarize(mean(ret, na.rm = TRUE)) |>
  pull()
Note: \(\bar\mu\) needs to be higher than \(\hat\mu_{mvp}\)
Solution for efficient portfolio
\[\omega_{efp} = \omega_\text{mvp} + \frac{\lambda^*}{2}\left(\Sigma^{-1}\mu -\frac{D}{C}\Sigma^{-1}\iota \right)\]
where \(\lambda^* = 2\frac{\bar\mu - D/C}{E-D^2/C}\), \(C = \iota'\Sigma^{-1}\iota\), \(D=\iota'\Sigma^{-1}\mu\) & \(E=\mu'\Sigma^{-1}\mu\)
See details on tidy-finance.org
Calculate efficient portfolio
# Scalars C, D & E from the closed-form solution
C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
# lambda_tilde corresponds to lambda^* in the formula
lambda_tilde <- as.numeric(2 * (mu_bar - D / C) / (E - D^2 / C))
# Efficient weights: minimum-variance weights plus a return-seeking adjustment
omega_efp <- as.vector(
  omega_mvp + lambda_tilde / 2 * (sigma_inv %*% mu - D * omega_mvp)
)
summary_efp <- tibble(
  mu = sum(omega_efp * mu),
  sigma = as.numeric(sqrt(t(omega_efp) %*% sigma %*% omega_efp)),
  type = "Efficient Portfolio"
)
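One way to check the calculation above is to rerun it on a small, fully specified problem and confirm that both constraints hold. The sketch below assumes three hypothetical assets and a target return of 10%; every `_toy` name and number is illustrative.

```r
# Hypothetical expected returns and covariance matrix for three assets
mu_toy <- c(0.05, 0.08, 0.12)
sigma_toy <- matrix(c(0.040, 0.006, 0.004,
                      0.006, 0.090, 0.010,
                      0.004, 0.010, 0.160), nrow = 3)
mu_bar_toy <- 0.10 # assumed target return

iota_toy <- rep(1, length(mu_toy))
sigma_toy_inv <- solve(sigma_toy)

C_toy <- as.numeric(t(iota_toy) %*% sigma_toy_inv %*% iota_toy)
D_toy <- as.numeric(t(iota_toy) %*% sigma_toy_inv %*% mu_toy)
E_toy <- as.numeric(t(mu_toy) %*% sigma_toy_inv %*% mu_toy)

# Minimum-variance weights, then the adjustment towards the target return
omega_mvp_toy <- as.vector(sigma_toy_inv %*% iota_toy) / C_toy
lambda_toy <- 2 * (mu_bar_toy - D_toy / C_toy) / (E_toy - D_toy^2 / C_toy)
omega_efp_toy <- omega_mvp_toy +
  lambda_toy / 2 * as.vector(sigma_toy_inv %*% mu_toy - D_toy * omega_mvp_toy)

sum(omega_efp_toy)           # fully invested
sum(omega_efp_toy * mu_toy)  # hits the target return
```

The adjustment term sums to zero across assets, so the weights remain fully invested while the expected return moves from the minimum-variance level \(D/C\) to the target.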
Minimum variance vs efficient portfolio
# Individual assets have no type; the two portfolios do
summaries <- bind_rows(
  assets, summary_mvp, summary_efp
)
fig_summaries <- summaries |>
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |> filter(is.na(type))) +
  geom_point(data = summaries |> filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Volatility", y = "Average return",
       title = "Efficient & minimum-variance portfolios for DOW index constituents",
       subtitle = "Points correspond to individual assets")
The efficient frontier
Mutual fund separation theorem: any linear combination of efficient portfolios is also efficient
\[\omega_{eff} = a \cdot \omega_{efp} + (1-a) \cdot\omega_{mvp}\]
Highest achievable expected return at each level of risk
Calculate the efficient frontier
# Combine the two portfolios for a grid of values of a
efficient_frontier <- tibble(
  a = seq(from = -1, to = 4, by = 0.01)
) |>
  mutate(
    omega = map(a, ~ .x * omega_efp + (1 - .x) * omega_mvp),
    mu = map_dbl(omega, ~ t(.x) %*% mu),
    sigma = map_dbl(omega, ~ sqrt(t(.x) %*% sigma %*% .x))
  )
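The separation result can be checked numerically on a toy problem. The sketch below assumes three hypothetical assets and a helper `frontier_weights()` that is not part of the webinar code. It combines the minimum-variance portfolio with an efficient portfolio, then compares the mix with the closed-form solution solved directly for the return the mix earns.

```r
# Hypothetical helper: closed-form variance-minimizing weights for a target return
frontier_weights <- function(mu_bar, mu, sigma) {
  iota <- rep(1, length(mu))
  sigma_inv <- solve(sigma)
  C <- as.numeric(t(iota) %*% sigma_inv %*% iota)
  D <- as.numeric(t(iota) %*% sigma_inv %*% mu)
  E <- as.numeric(t(mu) %*% sigma_inv %*% mu)
  omega_mvp <- as.vector(sigma_inv %*% iota) / C
  lambda <- 2 * (mu_bar - D / C) / (E - D^2 / C)
  omega_mvp + lambda / 2 * as.vector(sigma_inv %*% mu - D * omega_mvp)
}

# Hypothetical expected returns and covariance matrix for three assets
mu_toy <- c(0.05, 0.08, 0.12)
sigma_toy <- matrix(c(0.040, 0.006, 0.004,
                      0.006, 0.090, 0.010,
                      0.004, 0.010, 0.160), nrow = 3)

# Minimum-variance weights and an efficient portfolio with a 10% target
omega_mvp_toy <- solve(sigma_toy, rep(1, 3))
omega_mvp_toy <- omega_mvp_toy / sum(omega_mvp_toy)
omega_efp_toy <- frontier_weights(0.10, mu_toy, sigma_toy)

# Mix the two portfolios with a = 2 (short the minimum-variance portfolio)
a <- 2
omega_mix <- a * omega_efp_toy + (1 - a) * omega_mvp_toy
mu_mix <- sum(omega_mix * mu_toy)

# Solving directly for the return of the mix yields the same weights
omega_direct <- frontier_weights(mu_mix, mu_toy, sigma_toy)
all.equal(omega_mix, omega_direct)
```

The check works because the closed-form weights are linear in the target return, so moving along the line between two solutions traces out further solutions.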
Visualizing the efficient frontier
# Frontier portfolios also have no type, so they are drawn like the individual assets
summaries <- bind_rows(
  summaries, efficient_frontier
)
fig_efficient_frontier <- summaries |>
  ggplot(aes(x = sigma, y = mu)) +
  geom_point(data = summaries |> filter(is.na(type))) +
  geom_point(data = summaries |> filter(!is.na(type)), color = "red", size = 3) +
  ggrepel::geom_label_repel(aes(label = type)) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Volatility", y = "Average return",
       title = "Efficient frontier for DOW index constituents",
       subtitle = "Points correspond to individual assets")
Replicate minimum-variance via PortfolioAnalytics
library(PortfolioAnalytics)
library(CVXR)
# Returns with dates as row names, one column per stock
returns_matrix <- column_to_rownames(
  returns_wide, var = "date"
)
# Minimize portfolio variance subject to full investment
problem_mvp <- portfolio.spec(colnames(returns_matrix)) |>
  add.objective(type = "risk", name = "var") |>
  add.constraint("full_investment")
solution_mvp <- optimize.portfolio(
  returns_matrix, problem_mvp, optimize_method = "CVXR"
)
# Compare the numerical solution with the closed-form weights
all.equal(omega_mvp, as.vector(solution_mvp$weights))
Replicate efficient portfolio via PortfolioAnalytics
# Add the return target to the minimum-variance problem
problem_efp <- problem_mvp |>
  add.constraint("return", return_target = mu_bar)
solution_efp <- optimize.portfolio(
  returns_matrix, problem_efp, optimize_method = "CVXR"
)
# Compare the numerical solution with the closed-form weights
all.equal(omega_efp, as.vector(solution_efp$weights))
Easy to extend Markowitz model
Short sale constraints: add.constraint("long_only")
Position limit: add.constraint("position_limit", max_pos = 10)
Expected shortfall: add.objective(type = "risk", name = "ES")
… and many more; see the official PortfolioAnalytics vignette