HW 3¶
In this HW, we are going to back-test strategies of forecasting FX returns on a rolling basis.
We’ll back-test two strategies, one based on Model Selection Method A, and the other based on Method B.
Deliverables:
A. the program coding.
B. performance summaries for the two strategies.
By: Chengyi (Jeff) Chen
%load_ext autotime
%load_ext nb_black
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.dpi"] = 300
plt.rcParams["figure.figsize"] = (16, 12)
import pandas as pd
import numpy as np
import cvxpy as cp
import scipy as sp
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima_model import ARIMA
from datetime import datetime, timedelta
import warnings
warnings.simplefilter("ignore")
/opt/anaconda3/envs/ml/lib/python3.7/site-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
import pandas.util.testing as tm
raw_data = pd.read_excel("./data/HW3_20201112.xlsx", sheet_name=1)
available_currencies = raw_data.columns[::3]
data = raw_data.iloc[1:, list(range(1, len(raw_data.columns), 3))].astype(float)
data.columns = raw_data.iloc[0, list(range(1, len(raw_data.columns), 3))].values
data.index = pd.to_datetime(raw_data.iloc[1:, 0])
data.index.name = "date"
data.index.freq = "-1BM"
data = data.sort_index()
data.head()
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
1994-04-29 | 1.1840 | 1.5185 | 0.009833 | 0.7155 | 0.7231 | 0.1395 | 0.7125 |
1994-05-31 | 1.1909 | 1.5113 | 0.009544 | 0.7371 | 0.7225 | 0.1400 | 0.7130 |
1994-06-30 | 1.2238 | 1.5443 | 0.010158 | 0.7285 | 0.7224 | 0.1444 | 0.7499 |
1994-07-29 | 1.2261 | 1.5440 | 0.009985 | 0.7395 | 0.7211 | 0.1448 | 0.7463 |
1994-08-31 | 1.2341 | 1.5340 | 0.009995 | 0.7429 | 0.7310 | 0.1442 | 0.7508 |
time: 154 ms
Step 1:¶
Transform currency prices to monthly returns
monthly_returns = data.pct_change(1).iloc[1:]
monthly_returns.head()
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
1994-05-31 | 0.005828 | -0.004742 | -0.029391 | 0.030189 | -0.000830 | 0.003584 | 0.000702 |
1994-06-30 | 0.027626 | 0.021836 | 0.064334 | -0.011667 | -0.000138 | 0.031429 | 0.051753 |
1994-07-29 | 0.001879 | -0.000194 | -0.017031 | 0.015100 | -0.001800 | 0.002770 | -0.004801 |
1994-08-31 | 0.006525 | -0.006477 | 0.001002 | 0.004598 | 0.013729 | -0.004144 | 0.006030 |
1994-09-30 | 0.017827 | 0.028357 | 0.009105 | -0.003634 | 0.018605 | 0.022191 | 0.034497 |
time: 24.9 ms
Step 2:¶
Starting from 12/31/1999, at each month-end date, we’ll forecast next-month returns for the seven currencies separately using one of two different model selection methods:
Two models:
AR(1)
MA(1)
Model Selection Method A (In-Sample):¶
At the time point t, calibrate/estimate the two models based on the past 24 monthly return observations (current observation at t included).
Calculate the AICs for the two estimated models for those 24 obsevations, and choose the one with lower AIC.
Please note that AIC and mean squared error (MSE) are equivalent for comparing the two pre-assigned models. (So if you are not familiar with AIC, try use MSE instead).
Forecast the next period return at time (t+1) using the selected model.
start_date = "1999-12-31"
time: 473 µs
def model_selection_A(monthly_returns):
"""Returns next period forecast using best model"""
ar_1_model = ARIMA(monthly_returns, order=(1, 0, 0)).fit() # AR(1)
ma_1_model = ARIMA(monthly_returns, order=(0, 0, 1)).fit() # MA(1)
# Select Best model based on AIC
best_model = ar_1_model if ar_1_model.aic < ma_1_model.aic else ma_1_model
# Get forecast
forecast, stderr, conf_int = best_model.forecast()
return forecast[0]
model_selection_A_predictions = (
pd.concat(
[
monthly_returns[currency]
.rolling(24, min_periods=24)
.agg(model_selection_A)
.loc[start_date:]
for currency in monthly_returns.columns
],
axis=1,
)
.shift(1)
.iloc[1:]
)
model_selection_A_predictions
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
2000-01-31 | -0.003131 | -0.001858 | 0.014124 | -0.000595 | -0.002344 | 0.011939 | -0.003192 |
2000-02-29 | -0.006137 | -0.000819 | 0.018341 | -0.000382 | 0.000340 | 0.012523 | -0.007582 |
2000-03-31 | -0.004628 | 0.009065 | 0.012377 | -0.002184 | -0.000579 | 0.002050 | -0.005109 |
2000-04-28 | -0.004501 | -0.001630 | -0.000457 | -0.002368 | -0.000798 | -0.000776 | -0.002774 |
2000-05-31 | -0.011437 | 0.007824 | 0.027002 | -0.002033 | 0.000620 | 0.007167 | -0.008398 |
... | ... | ... | ... | ... | ... | ... | ... |
2018-11-30 | -0.003151 | 0.010656 | 0.001052 | -0.005594 | -0.001367 | 0.002645 | 0.000012 |
2018-12-31 | 0.002632 | 0.002070 | 0.001336 | -0.000985 | -0.000446 | 0.001262 | 0.000492 |
2019-01-31 | 0.005300 | 0.001749 | 0.001517 | -0.000071 | -0.004193 | 0.000369 | 0.001042 |
2019-02-28 | 0.001467 | -0.006201 | 0.001589 | -0.007189 | 0.005521 | 0.000814 | -0.000118 |
2019-03-29 | 0.000857 | 0.000701 | 0.001366 | 0.000846 | -0.000173 | -0.000415 | 0.000416 |
231 rows × 7 columns
time: 1min 55s
Model Selection Method B (Out-Sample):¶
At the time point t, for each model, out-of-samplely forecast the returns for the past 12 months.
for each time point from t-11 to t, say for t-k where k=0,…,11, we’ll estimate each model using the past 24 observations, i.e., observations at (t-k-24,…,t-k-1), then based on the estimated model to forecast returns at t-k.
Now, for AR(1) model, we have forecasts for t-11, t-10,…,t, the MSE can be calculated as the mean of (forecast - true return)^2. for MA(1) model, we have another vector of 12 forecasts, with corresponded 12 true returns, and we can calculate MSE.
The model with lower MSE would be selected.
And then we need to calibrate that model using the past 24 observations.
Forecast the next period return at time (t+1).
def model_selection_B(monthly_returns):
"""Returns next period forecast using best model according to lower MSE on past 12 return forecasts"""
y_true = monthly_returns.iloc[-12:].values
ar_1_model_preds = []
ma_1_model_preds = []
for k in range(12):
past_24_observations = monthly_returns.iloc[-12 + k - 24 : -12 + k]
ar_1_model = ARIMA(past_24_observations, order=(1, 0, 0)).fit() # AR(1)
ma_1_model = ARIMA(past_24_observations, order=(0, 0, 1)).fit() # MA(1)
# Get forecast
ar_1_model_pred, stderr, conf_int = ar_1_model.forecast()
ma_1_model_pred, stderr, conf_int = ma_1_model.forecast()
ar_1_model_preds.append(ar_1_model_pred)
ma_1_model_preds.append(ma_1_model_pred)
# Caliberated models based on past 24 month returns
ar_1_model = ARIMA(monthly_returns.iloc[-24:], order=(1, 0, 0)).fit() # AR(1)
ma_1_model = ARIMA(monthly_returns.iloc[-24:], order=(0, 0, 1)).fit() # MA(1)
# Select Best model based on AIC
best_model = (
ar_1_model
if mean_squared_error(y_true=y_true, y_pred=ar_1_model_preds)
< mean_squared_error(y_true=y_true, y_pred=ma_1_model_preds)
else ma_1_model
)
# Get forecast
forecast, stderr, conf_int = best_model.forecast()
return forecast[0]
model_selection_B_predictions = (
pd.concat(
[
monthly_returns[currency]
.rolling(36, min_periods=36)
.agg(model_selection_B)
.loc[start_date:]
for currency in monthly_returns.columns
],
axis=1,
)
.shift(1)
.iloc[1:]
)
model_selection_B_predictions
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
2000-01-31 | -0.003022 | -0.005389 | 0.012898 | -0.000065 | -0.002057 | -0.006123 | -0.003192 |
2000-02-29 | -0.005529 | -0.000007 | 0.014642 | -0.000583 | 0.000287 | 0.009027 | -0.007582 |
2000-03-31 | -0.004428 | 0.006833 | 0.010626 | -0.002631 | -0.000590 | -0.002736 | -0.005109 |
2000-04-28 | -0.004393 | -0.005841 | 0.003849 | -0.002691 | -0.000777 | -0.002713 | -0.002774 |
2000-05-31 | -0.010242 | 0.007824 | 0.020417 | -0.002566 | 0.000268 | 0.007167 | -0.008398 |
... | ... | ... | ... | ... | ... | ... | ... |
2018-11-30 | -0.003151 | 0.010656 | -0.001926 | -0.004398 | -0.001367 | 0.001452 | -0.000205 |
2018-12-31 | 0.002632 | 0.002620 | 0.001336 | -0.000718 | -0.000446 | 0.000841 | 0.000503 |
2019-01-31 | 0.005300 | 0.001979 | 0.001517 | -0.000180 | -0.004193 | 0.000369 | 0.001274 |
2019-02-28 | 0.001467 | -0.003161 | 0.001589 | -0.006649 | 0.005521 | 0.000577 | -0.000090 |
2019-03-29 | 0.000857 | 0.000945 | 0.001222 | 0.001408 | 0.000340 | -0.000426 | 0.000435 |
231 rows × 7 columns
time: 24min 12s
Step 3:¶
Now at time t, we have made our forecasts for the seven currencies of their monthly returns at t+1.
Suppose the forecasted returns are (r1,r2,…,r7).
We’ll transform the forecasts to return sign only: (sign(r1),sign(r2),…,sign(r7)).
We’ll invest in currency h with a long position if sign(rh)=1, otherwise we’ll invest in it with a short position. But right now, we need to figure out the weight vector based on Risk Parity.
Prepare the covariance matrix corresponding to (sign(r1),sign(r2),…,sign(r7)), i.e., for currency h, if sign(rh)=1, we use the currency’s original returns; otherwise, use the negative of original returns. Then we’ll calculate the covariance matrix based on past 60 monthly returns (time t included) (ASSUMPTION 1. I assume that the past 60 monthly returns refer to the original past 60 monthly returns, not the ones we forecast using model A / B).
Based on the covariance matrix, run Risk Parity program (see class slide) to get a weight vector (w1, …, w7) whose summation equals 1. However, we’ll target an annualized volatility of 5%, i.e., we’ll invest with the updated weight vector (w1*,…,w7*)=the levarage factor x(w1,…,w7) where the leverage factor = 5%/(the expected volatility of the portfolio given (w1,…,w7)) (ASSUMPTION 2. I assume that the expected volatility here refers to the annualized expected volatility of the portfolio).
Note: w1*,…, w7* are all of positive values
A_returns = monthly_returns.loc[model_selection_A_predictions.index[0] :] * np.sign(
model_selection_A_predictions
)
A_returns
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
2000-01-31 | 0.035281 | 0.001236 | -0.045310 | 0.029998 | 0.001157 | -0.036889 | 0.040401 |
2000-02-29 | 0.006696 | 0.022893 | -0.026415 | 0.030141 | -0.001593 | -0.007494 | 0.004641 |
2000-03-31 | 0.009023 | 0.007535 | 0.072902 | 0.017320 | -0.000870 | -0.009228 | -0.000000 |
2000-04-28 | 0.045631 | 0.025014 | 0.049856 | 0.038214 | 0.021008 | 0.053345 | 0.029642 |
2000-05-31 | -0.028622 | -0.032167 | 0.004977 | 0.019524 | -0.011100 | 0.006261 | -0.021452 |
... | ... | ... | ... | ... | ... | ... | ... |
2018-11-30 | -0.000442 | -0.001332 | -0.004631 | -0.032942 | 0.010132 | -0.018565 | 0.009882 |
2018-12-31 | 0.013254 | 0.000392 | 0.035293 | 0.035177 | 0.025389 | -0.005159 | 0.017174 |
2019-01-31 | -0.001657 | 0.027834 | 0.006906 | -0.031778 | -0.039143 | 0.025065 | -0.012761 |
2019-02-28 | -0.006726 | -0.011748 | -0.022643 | 0.024612 | -0.003938 | -0.015177 | 0.003679 |
2019-03-29 | -0.010289 | -0.006107 | 0.011027 | -0.000564 | 0.017525 | 0.007705 | 0.002994 |
231 rows × 7 columns
time: 20.8 ms
B_returns = monthly_returns.loc[model_selection_B_predictions.index[0] :] * np.sign(
model_selection_B_predictions
)
B_returns
EURUSD | GBPUSD | JPYUSD | AUDUSD | CADUSD | NOKUSD | CHFUSD | |
---|---|---|---|---|---|---|---|
date | |||||||
2000-01-31 | 0.035281 | 0.001236 | -0.045310 | 0.029998 | 0.001157 | 0.036889 | 0.040401 |
2000-02-29 | 0.006696 | 0.022893 | -0.026415 | 0.030141 | -0.001593 | -0.007494 | 0.004641 |
2000-03-31 | 0.009023 | 0.007535 | 0.072902 | 0.017320 | -0.000870 | 0.009228 | -0.000000 |
2000-04-28 | 0.045631 | 0.025014 | -0.049856 | 0.038214 | 0.021008 | 0.053345 | 0.029642 |
2000-05-31 | -0.028622 | -0.032167 | 0.004977 | 0.019524 | -0.011100 | 0.006261 | -0.021452 |
... | ... | ... | ... | ... | ... | ... | ... |
2018-11-30 | -0.000442 | -0.001332 | 0.004631 | -0.032942 | 0.010132 | -0.018565 | -0.009882 |
2018-12-31 | 0.013254 | 0.000392 | 0.035293 | 0.035177 | 0.025389 | -0.005159 | 0.017174 |
2019-01-31 | -0.001657 | 0.027834 | 0.006906 | -0.031778 | -0.039143 | 0.025065 | -0.012761 |
2019-02-28 | -0.006726 | -0.011748 | -0.022643 | 0.024612 | -0.003938 | -0.015177 | 0.003679 |
2019-03-29 | -0.010289 | -0.006107 | 0.011027 | -0.000564 | -0.017525 | 0.007705 | 0.002994 |
231 rows × 7 columns
time: 15.8 ms
Risk-Parity Program:
def risk_budgeting_weights(Σ, b, target_ann_vol=0.05):
"""Solves risk-budgeting program and returns optimal
weights for each asset such that the volatility of
each asset in portfolio conforms to the risk budget
allocated using differential evolution.
Args:
Σ (np.array): Covariance matrix of the monthly asset returns
b (np.array): Risk budgets
target_ann_vol (float): Target annualized volatility, default - 0.05
Returns:
(np.array): Weight vector
"""
n = Σ.shape[0] # Number of assets in portfolio
σ = lambda w, Σ: np.sqrt(w.T @ Σ @ w) # Portfolio volatility
MCR = lambda w, Σ, σ: (Σ @ w) / σ # Marginal Contribution to Risk
CR = lambda w, Σ, σ: np.diag(w) @ MCR(w, Σ, σ) # Contribution to Risk
obj = lambda w: (CR(w, Σ, σ(w, Σ)) - b.T * σ(w, Σ)).T @ (
CR(w, Σ, σ(w, Σ)) - b.T * σ(w, Σ)
) # Form objective
result = sp.optimize.differential_evolution(
func=obj,
bounds=[(0, 1) for _ in range(n)],
maxiter=1000,
popsize=15,
constraints=(
sp.optimize.LinearConstraint(A=np.ones(shape=(n,)), lb=1, ub=1), # 𝑤⊤𝟙 = 1
sp.optimize.LinearConstraint(
A=np.eye(n), lb=np.zeros(shape=(n,)), ub=np.ones(shape=(n,)),
), # 0 ≤ 𝑤 ≤ 1
),
)
# Optimal weight vector
w_star = result.x
# Update weight vector to target annualized volatility
# By normalizing the weights according to their expected
# annualized volatility and then x 5% target annualized volatility
leverage_factor = target_ann_vol / (σ(w_star, Σ) * np.sqrt(12))
w_star *= leverage_factor
# Form and solve problem.
return w_star
time: 3.7 ms
Step 4:¶
the portfolio (w1*,…,w7*)’s return at time t+1 will be calculated as (sign(r1) x w1* x r1_new+…+ sign(r7) x w1* x r7_new) where sign(rh) is your directional forecast for currency h at time t+1, and rh_new is its actual return at time t+1.
Record the portfolio’s returns at time t+1, and go to Step 2 to make forecast for time t+2 and build the portfolio, and stop until 3/29/2019.
annualized_volatility = (
lambda monthly_returns: np.std(monthly_returns) * np.sqrt(12)
if len(monthly_returns) > 0
else None
)
annualized_return = (
lambda monthly_returns: np.prod(monthly_returns + 1) ** (12 / len(monthly_returns))
- 1
if len(monthly_returns) > 0
else None
)
def sharpe_ratio(monthly_returns, R_f=0):
R_p = annualized_return(monthly_returns)
σ_p = annualized_volatility(monthly_returns)
return (R_p - R_f) / σ_p if len(monthly_returns) > 0 else None
def get_risk_parity_returns(past_returns, forecasted_returns):
n = len(past_returns.columns)
Σ = past_returns.cov().values
b = np.array([1] * n) / n
risk_parity_weights = risk_budgeting_weights(Σ, b)
return risk_parity_weights.T @ forecasted_returns
time: 2.35 ms
A_risk_parity_returns = (
pd.Series(
[
get_risk_parity_returns(
past_returns=monthly_returns.iloc[
monthly_returns.index.get_loc(start_date)
- 60
+ start_idx : monthly_returns.index.get_loc(start_date)
+ start_idx,
:,
],
forecasted_returns=A_returns.loc[
monthly_returns.iloc[
monthly_returns.index.get_loc(start_date) + start_idx
].name
],
)
for start_idx in range(
1, monthly_returns.shape[0] - monthly_returns.index.get_loc(start_date)
)
],
index=A_returns.loc[start_date:].index,
)
.to_frame()
.rename({0: "returns"}, axis=1)
)
A_risk_parity_returns
returns | |
---|---|
date | |
2000-01-31 | 0.006108 |
2000-02-29 | 0.006802 |
2000-03-31 | 0.009838 |
2000-04-28 | 0.038707 |
2000-05-31 | -0.011690 |
... | ... |
2018-11-30 | -0.003069 |
2018-12-31 | 0.014000 |
2019-01-31 | -0.002045 |
2019-02-28 | -0.004150 |
2019-03-29 | 0.002271 |
231 rows × 1 columns
time: 39min 23s
print(f"Risk-Parity with Model Selection A: ")
print(f"Annualized Return:", annualized_return(A_risk_parity_returns.values))
print(f"Annualized Volatility:", annualized_volatility(A_risk_parity_returns.values))
print(f"Annualized Sharpe Ratio:", sharpe_ratio(A_risk_parity_returns.values))
Risk-Parity with Model Selection A:
Annualized Return: 0.006451740379742388
Annualized Volatility: 0.04123175329661224
Annualized Sharpe Ratio: 0.15647504323500325
time: 1.48 ms
B_risk_parity_returns = (
pd.Series(
[
get_risk_parity_returns(
past_returns=monthly_returns.iloc[
monthly_returns.index.get_loc(start_date)
- 60
+ start_idx : monthly_returns.index.get_loc(start_date)
+ start_idx,
:,
],
forecasted_returns=B_returns.loc[
monthly_returns.iloc[
monthly_returns.index.get_loc(start_date) + start_idx
].name
],
)
for start_idx in range(
1, monthly_returns.shape[0] - monthly_returns.index.get_loc(start_date)
)
],
index=B_returns.loc[start_date:].index,
)
.to_frame()
.rename({0: "returns"}, axis=1)
)
B_risk_parity_returns
returns | |
---|---|
date | |
2000-01-31 | 0.014290 |
2000-02-29 | 0.006797 |
2000-03-31 | 0.011856 |
2000-04-28 | 0.029459 |
2000-05-31 | -0.011655 |
... | ... |
2018-11-30 | -0.004471 |
2018-12-31 | 0.014027 |
2019-01-31 | -0.002052 |
2019-02-28 | -0.004151 |
2019-03-29 | -0.001552 |
231 rows × 1 columns
time: 35min 53s
print(f"Risk-Parity with Model Selection B: ")
print(f"Annualized Return:", annualized_return(B_risk_parity_returns.values))
print(f"Annualized Volatility:", annualized_volatility(B_risk_parity_returns.values))
print(f"Annualized Sharpe Ratio:", sharpe_ratio(B_risk_parity_returns.values))
Risk-Parity with Model Selection B:
Annualized Return: 0.0061803207332966625
Annualized Volatility: 0.04209334277303941
Annualized Sharpe Ratio: 0.14682418468449907
time: 1.27 ms