HW 3

In this HW, we are going to back-test strategies of forecasting FX returns on a rolling basis.

We’ll back-test two strategies, one based on Model Selection Method A, and the other based on Method B.

Deliverables:

  • A. the program coding.

  • B. performance summaries for the two strategies.

By: Chengyi (Jeff) Chen

%load_ext autotime
%load_ext nb_black
%matplotlib inline

import matplotlib.pyplot as plt

plt.rcParams["figure.dpi"] = 300
plt.rcParams["figure.figsize"] = (16, 12)

import pandas as pd
import numpy as np
import cvxpy as cp
import scipy as sp
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima_model import ARIMA
from datetime import datetime, timedelta
import warnings

warnings.simplefilter("ignore")
/opt/anaconda3/envs/ml/lib/python3.7/site-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
raw_data = pd.read_excel("./data/HW3_20201112.xlsx", sheet_name=1)
available_currencies = raw_data.columns[::3]
data = raw_data.iloc[1:, list(range(1, len(raw_data.columns), 3))].astype(float)
data.columns = raw_data.iloc[0, list(range(1, len(raw_data.columns), 3))].values
data.index = pd.to_datetime(raw_data.iloc[1:, 0])
data.index.name = "date"
data.index.freq = "-1BM"
data = data.sort_index()
data.head()
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
1994-04-29 1.1840 1.5185 0.009833 0.7155 0.7231 0.1395 0.7125
1994-05-31 1.1909 1.5113 0.009544 0.7371 0.7225 0.1400 0.7130
1994-06-30 1.2238 1.5443 0.010158 0.7285 0.7224 0.1444 0.7499
1994-07-29 1.2261 1.5440 0.009985 0.7395 0.7211 0.1448 0.7463
1994-08-31 1.2341 1.5340 0.009995 0.7429 0.7310 0.1442 0.7508
time: 154 ms

Step 1:

Transform currency prices to monthly returns

monthly_returns = data.pct_change(1).iloc[1:]
monthly_returns.head()
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
1994-05-31 0.005828 -0.004742 -0.029391 0.030189 -0.000830 0.003584 0.000702
1994-06-30 0.027626 0.021836 0.064334 -0.011667 -0.000138 0.031429 0.051753
1994-07-29 0.001879 -0.000194 -0.017031 0.015100 -0.001800 0.002770 -0.004801
1994-08-31 0.006525 -0.006477 0.001002 0.004598 0.013729 -0.004144 0.006030
1994-09-30 0.017827 0.028357 0.009105 -0.003634 0.018605 0.022191 0.034497
time: 24.9 ms

Step 2:

Starting from 12/31/1999, at each month-end date, we’ll forecast next-month returns for the seven currencies separately using one of two different model selection methods:

Two models:

  1. AR(1)

  2. MA(1)

Model Selection Method A (In-Sample):

At the time point t, calibrate/estimate the two models based on the past 24 monthly return observations (current observation at t included).

Calculate the AICs for the two estimated models for those 24 obsevations, and choose the one with lower AIC.

Please note that AIC and mean squared error (MSE) are equivalent for comparing the two pre-assigned models. (So if you are not familiar with AIC, try use MSE instead).

Forecast the next period return at time (t+1) using the selected model.

start_date = "1999-12-31"
time: 473 µs
def model_selection_A(monthly_returns):
    """Returns next period forecast using best model"""
    ar_1_model = ARIMA(monthly_returns, order=(1, 0, 0)).fit()  # AR(1)
    ma_1_model = ARIMA(monthly_returns, order=(0, 0, 1)).fit()  # MA(1)

    # Select Best model based on AIC
    best_model = ar_1_model if ar_1_model.aic < ma_1_model.aic else ma_1_model

    # Get forecast
    forecast, stderr, conf_int = best_model.forecast()

    return forecast[0]


model_selection_A_predictions = (
    pd.concat(
        [
            monthly_returns[currency]
            .rolling(24, min_periods=24)
            .agg(model_selection_A)
            .loc[start_date:]
            for currency in monthly_returns.columns
        ],
        axis=1,
    )
    .shift(1)
    .iloc[1:]
)

model_selection_A_predictions
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
2000-01-31 -0.003131 -0.001858 0.014124 -0.000595 -0.002344 0.011939 -0.003192
2000-02-29 -0.006137 -0.000819 0.018341 -0.000382 0.000340 0.012523 -0.007582
2000-03-31 -0.004628 0.009065 0.012377 -0.002184 -0.000579 0.002050 -0.005109
2000-04-28 -0.004501 -0.001630 -0.000457 -0.002368 -0.000798 -0.000776 -0.002774
2000-05-31 -0.011437 0.007824 0.027002 -0.002033 0.000620 0.007167 -0.008398
... ... ... ... ... ... ... ...
2018-11-30 -0.003151 0.010656 0.001052 -0.005594 -0.001367 0.002645 0.000012
2018-12-31 0.002632 0.002070 0.001336 -0.000985 -0.000446 0.001262 0.000492
2019-01-31 0.005300 0.001749 0.001517 -0.000071 -0.004193 0.000369 0.001042
2019-02-28 0.001467 -0.006201 0.001589 -0.007189 0.005521 0.000814 -0.000118
2019-03-29 0.000857 0.000701 0.001366 0.000846 -0.000173 -0.000415 0.000416

231 rows × 7 columns

time: 1min 55s

Model Selection Method B (Out-Sample):

At the time point t, for each model, out-of-samplely forecast the returns for the past 12 months.

  • for each time point from t-11 to t, say for t-k where k=0,…,11, we’ll estimate each model using the past 24 observations, i.e., observations at (t-k-24,…,t-k-1), then based on the estimated model to forecast returns at t-k.

  • Now, for AR(1) model, we have forecasts for t-11, t-10,…,t, the MSE can be calculated as the mean of (forecast - true return)^2. for MA(1) model, we have another vector of 12 forecasts, with corresponded 12 true returns, and we can calculate MSE.

The model with lower MSE would be selected.

And then we need to calibrate that model using the past 24 observations.

Forecast the next period return at time (t+1).

def model_selection_B(monthly_returns):
    """Returns next period forecast using best model according to lower MSE on past 12 return forecasts"""
    y_true = monthly_returns.iloc[-12:].values
    ar_1_model_preds = []
    ma_1_model_preds = []
    for k in range(12):
        past_24_observations = monthly_returns.iloc[-12 + k - 24 : -12 + k]

        ar_1_model = ARIMA(past_24_observations, order=(1, 0, 0)).fit()  # AR(1)
        ma_1_model = ARIMA(past_24_observations, order=(0, 0, 1)).fit()  # MA(1)

        # Get forecast
        ar_1_model_pred, stderr, conf_int = ar_1_model.forecast()
        ma_1_model_pred, stderr, conf_int = ma_1_model.forecast()

        ar_1_model_preds.append(ar_1_model_pred)
        ma_1_model_preds.append(ma_1_model_pred)

    # Caliberated models based on past 24 month returns
    ar_1_model = ARIMA(monthly_returns.iloc[-24:], order=(1, 0, 0)).fit()  # AR(1)
    ma_1_model = ARIMA(monthly_returns.iloc[-24:], order=(0, 0, 1)).fit()  # MA(1)

    # Select Best model based on AIC
    best_model = (
        ar_1_model
        if mean_squared_error(y_true=y_true, y_pred=ar_1_model_preds)
        < mean_squared_error(y_true=y_true, y_pred=ma_1_model_preds)
        else ma_1_model
    )

    # Get forecast
    forecast, stderr, conf_int = best_model.forecast()

    return forecast[0]


model_selection_B_predictions = (
    pd.concat(
        [
            monthly_returns[currency]
            .rolling(36, min_periods=36)
            .agg(model_selection_B)
            .loc[start_date:]
            for currency in monthly_returns.columns
        ],
        axis=1,
    )
    .shift(1)
    .iloc[1:]
)

model_selection_B_predictions
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
2000-01-31 -0.003022 -0.005389 0.012898 -0.000065 -0.002057 -0.006123 -0.003192
2000-02-29 -0.005529 -0.000007 0.014642 -0.000583 0.000287 0.009027 -0.007582
2000-03-31 -0.004428 0.006833 0.010626 -0.002631 -0.000590 -0.002736 -0.005109
2000-04-28 -0.004393 -0.005841 0.003849 -0.002691 -0.000777 -0.002713 -0.002774
2000-05-31 -0.010242 0.007824 0.020417 -0.002566 0.000268 0.007167 -0.008398
... ... ... ... ... ... ... ...
2018-11-30 -0.003151 0.010656 -0.001926 -0.004398 -0.001367 0.001452 -0.000205
2018-12-31 0.002632 0.002620 0.001336 -0.000718 -0.000446 0.000841 0.000503
2019-01-31 0.005300 0.001979 0.001517 -0.000180 -0.004193 0.000369 0.001274
2019-02-28 0.001467 -0.003161 0.001589 -0.006649 0.005521 0.000577 -0.000090
2019-03-29 0.000857 0.000945 0.001222 0.001408 0.000340 -0.000426 0.000435

231 rows × 7 columns

time: 24min 12s

Step 3:

Now at time t, we have made our forecasts for the seven currencies of their monthly returns at t+1.

Suppose the forecasted returns are (r1,r2,…,r7).

We’ll transform the forecasts to return sign only: (sign(r1),sign(r2),…,sign(r7)).

We’ll invest in currency h with a long position if sign(rh)=1, otherwise we’ll invest in it with a short position. But right now, we need to figure out the weight vector based on Risk Parity.

Prepare the covariance matrix corresponding to (sign(r1),sign(r2),…,sign(r7)), i.e., for currency h, if sign(rh)=1, we use the currency’s original returns; otherwise, use the negative of original returns. Then we’ll calculate the covariance matrix based on past 60 monthly returns (time t included) (ASSUMPTION 1. I assume that the past 60 monthly returns refer to the original past 60 monthly returns, not the ones we forecast using model A / B).

Based on the covariance matrix, run Risk Parity program (see class slide) to get a weight vector (w1, …, w7) whose summation equals 1. However, we’ll target an annualized volatility of 5%, i.e., we’ll invest with the updated weight vector (w1*,…,w7*)=the levarage factor x(w1,…,w7) where the leverage factor = 5%/(the expected volatility of the portfolio given (w1,…,w7)) (ASSUMPTION 2. I assume that the expected volatility here refers to the annualized expected volatility of the portfolio).

Note: w1*,…, w7* are all of positive values

A_returns = monthly_returns.loc[model_selection_A_predictions.index[0] :] * np.sign(
    model_selection_A_predictions
)

A_returns
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
2000-01-31 0.035281 0.001236 -0.045310 0.029998 0.001157 -0.036889 0.040401
2000-02-29 0.006696 0.022893 -0.026415 0.030141 -0.001593 -0.007494 0.004641
2000-03-31 0.009023 0.007535 0.072902 0.017320 -0.000870 -0.009228 -0.000000
2000-04-28 0.045631 0.025014 0.049856 0.038214 0.021008 0.053345 0.029642
2000-05-31 -0.028622 -0.032167 0.004977 0.019524 -0.011100 0.006261 -0.021452
... ... ... ... ... ... ... ...
2018-11-30 -0.000442 -0.001332 -0.004631 -0.032942 0.010132 -0.018565 0.009882
2018-12-31 0.013254 0.000392 0.035293 0.035177 0.025389 -0.005159 0.017174
2019-01-31 -0.001657 0.027834 0.006906 -0.031778 -0.039143 0.025065 -0.012761
2019-02-28 -0.006726 -0.011748 -0.022643 0.024612 -0.003938 -0.015177 0.003679
2019-03-29 -0.010289 -0.006107 0.011027 -0.000564 0.017525 0.007705 0.002994

231 rows × 7 columns

time: 20.8 ms
B_returns = monthly_returns.loc[model_selection_B_predictions.index[0] :] * np.sign(
    model_selection_B_predictions
)

B_returns
EURUSD GBPUSD JPYUSD AUDUSD CADUSD NOKUSD CHFUSD
date
2000-01-31 0.035281 0.001236 -0.045310 0.029998 0.001157 0.036889 0.040401
2000-02-29 0.006696 0.022893 -0.026415 0.030141 -0.001593 -0.007494 0.004641
2000-03-31 0.009023 0.007535 0.072902 0.017320 -0.000870 0.009228 -0.000000
2000-04-28 0.045631 0.025014 -0.049856 0.038214 0.021008 0.053345 0.029642
2000-05-31 -0.028622 -0.032167 0.004977 0.019524 -0.011100 0.006261 -0.021452
... ... ... ... ... ... ... ...
2018-11-30 -0.000442 -0.001332 0.004631 -0.032942 0.010132 -0.018565 -0.009882
2018-12-31 0.013254 0.000392 0.035293 0.035177 0.025389 -0.005159 0.017174
2019-01-31 -0.001657 0.027834 0.006906 -0.031778 -0.039143 0.025065 -0.012761
2019-02-28 -0.006726 -0.011748 -0.022643 0.024612 -0.003938 -0.015177 0.003679
2019-03-29 -0.010289 -0.006107 0.011027 -0.000564 -0.017525 0.007705 0.002994

231 rows × 7 columns

time: 15.8 ms

Risk-Parity Program:

(65)\[\begin{align} \underset{w}{\text{minimize }} &{(\mathbb{CR} - b\sigma(w))}^\top {(\mathbb{CR} - b\sigma(w))} \\ \text{subject to } &w^\top \mathbb{1} = 1 \text{ and } 0 \leq w \leq 1 \\ \text{where } &b_i = \frac{1}{n}\qquad\forall i \in 1, \cdots, n, \\ \text{and } &\mathbb{CR} = \frac{\text{diag}(w)\Sigma w}{\sigma(w)} \\ &\mathbb{MCR} = \frac{\Sigma w}{\sigma(w)} \\ &\sigma(w) = \sqrt{w^\top\Sigma w} \end{align}\]
def risk_budgeting_weights(Σ, b, target_ann_vol=0.05):
    """Solves risk-budgeting program and returns optimal
    weights for each asset such that the volatility of 
    each asset in portfolio conforms to the risk budget
    allocated using differential evolution.
    
    Args:
        Σ (np.array): Covariance matrix of the monthly asset returns
        b (np.array): Risk budgets
        target_ann_vol (float): Target annualized volatility, default - 0.05
        
    Returns:
        (np.array): Weight vector
    """
    n = Σ.shape[0]  # Number of assets in portfolio
    σ = lambda w, Σ: np.sqrt(w.T @ Σ @ w)  # Portfolio volatility
    MCR = lambda w, Σ, σ: (Σ @ w) / σ  # Marginal Contribution to Risk
    CR = lambda w, Σ, σ: np.diag(w) @ MCR(w, Σ, σ)  # Contribution to Risk
    obj = lambda w: (CR(w, Σ, σ(w, Σ)) - b.T * σ(w, Σ)).T @ (
        CR(w, Σ, σ(w, Σ)) - b.T * σ(w, Σ)
    )  # Form objective
    result = sp.optimize.differential_evolution(
        func=obj,
        bounds=[(0, 1) for _ in range(n)],
        maxiter=1000,
        popsize=15,
        constraints=(
            sp.optimize.LinearConstraint(A=np.ones(shape=(n,)), lb=1, ub=1),  # 𝑤⊤𝟙 = 1
            sp.optimize.LinearConstraint(
                A=np.eye(n), lb=np.zeros(shape=(n,)), ub=np.ones(shape=(n,)),
            ),  # 0 ≤ 𝑤 ≤ 1
        ),
    )

    # Optimal weight vector
    w_star = result.x

    # Update weight vector to target annualized volatility
    # By normalizing the weights according to their expected
    # annualized volatility and then x 5% target annualized volatility
    leverage_factor = target_ann_vol / (σ(w_star, Σ) * np.sqrt(12))
    w_star *= leverage_factor

    # Form and solve problem.
    return w_star
time: 3.7 ms

Step 4:

the portfolio (w1*,…,w7*)’s return at time t+1 will be calculated as (sign(r1) x w1* x r1_new+…+ sign(r7) x w1* x r7_new) where sign(rh) is your directional forecast for currency h at time t+1, and rh_new is its actual return at time t+1.

Record the portfolio’s returns at time t+1, and go to Step 2 to make forecast for time t+2 and build the portfolio, and stop until 3/29/2019.

annualized_volatility = (
    lambda monthly_returns: np.std(monthly_returns) * np.sqrt(12)
    if len(monthly_returns) > 0
    else None
)
annualized_return = (
    lambda monthly_returns: np.prod(monthly_returns + 1) ** (12 / len(monthly_returns))
    - 1
    if len(monthly_returns) > 0
    else None
)


def sharpe_ratio(monthly_returns, R_f=0):
    R_p = annualized_return(monthly_returns)
    σ_p = annualized_volatility(monthly_returns)
    return (R_p - R_f) / σ_p if len(monthly_returns) > 0 else None


def get_risk_parity_returns(past_returns, forecasted_returns):
    n = len(past_returns.columns)
    Σ = past_returns.cov().values
    b = np.array([1] * n) / n
    risk_parity_weights = risk_budgeting_weights(Σ, b)
    return risk_parity_weights.T @ forecasted_returns
time: 2.35 ms
A_risk_parity_returns = (
    pd.Series(
        [
            get_risk_parity_returns(
                past_returns=monthly_returns.iloc[
                    monthly_returns.index.get_loc(start_date)
                    - 60
                    + start_idx : monthly_returns.index.get_loc(start_date)
                    + start_idx,
                    :,
                ],
                forecasted_returns=A_returns.loc[
                    monthly_returns.iloc[
                        monthly_returns.index.get_loc(start_date) + start_idx
                    ].name
                ],
            )
            for start_idx in range(
                1, monthly_returns.shape[0] - monthly_returns.index.get_loc(start_date)
            )
        ],
        index=A_returns.loc[start_date:].index,
    )
    .to_frame()
    .rename({0: "returns"}, axis=1)
)
A_risk_parity_returns
returns
date
2000-01-31 0.006108
2000-02-29 0.006802
2000-03-31 0.009838
2000-04-28 0.038707
2000-05-31 -0.011690
... ...
2018-11-30 -0.003069
2018-12-31 0.014000
2019-01-31 -0.002045
2019-02-28 -0.004150
2019-03-29 0.002271

231 rows × 1 columns

time: 39min 23s
print(f"Risk-Parity with Model Selection A: ")
print(f"Annualized Return:", annualized_return(A_risk_parity_returns.values))
print(f"Annualized Volatility:", annualized_volatility(A_risk_parity_returns.values))
print(f"Annualized Sharpe Ratio:", sharpe_ratio(A_risk_parity_returns.values))
Risk-Parity with Model Selection A: 
Annualized Return: 0.006451740379742388
Annualized Volatility: 0.04123175329661224
Annualized Sharpe Ratio: 0.15647504323500325
time: 1.48 ms
B_risk_parity_returns = (
    pd.Series(
        [
            get_risk_parity_returns(
                past_returns=monthly_returns.iloc[
                    monthly_returns.index.get_loc(start_date)
                    - 60
                    + start_idx : monthly_returns.index.get_loc(start_date)
                    + start_idx,
                    :,
                ],
                forecasted_returns=B_returns.loc[
                    monthly_returns.iloc[
                        monthly_returns.index.get_loc(start_date) + start_idx
                    ].name
                ],
            )
            for start_idx in range(
                1, monthly_returns.shape[0] - monthly_returns.index.get_loc(start_date)
            )
        ],
        index=B_returns.loc[start_date:].index,
    )
    .to_frame()
    .rename({0: "returns"}, axis=1)
)
B_risk_parity_returns
returns
date
2000-01-31 0.014290
2000-02-29 0.006797
2000-03-31 0.011856
2000-04-28 0.029459
2000-05-31 -0.011655
... ...
2018-11-30 -0.004471
2018-12-31 0.014027
2019-01-31 -0.002052
2019-02-28 -0.004151
2019-03-29 -0.001552

231 rows × 1 columns

time: 35min 53s
print(f"Risk-Parity with Model Selection B: ")
print(f"Annualized Return:", annualized_return(B_risk_parity_returns.values))
print(f"Annualized Volatility:", annualized_volatility(B_risk_parity_returns.values))
print(f"Annualized Sharpe Ratio:", sharpe_ratio(B_risk_parity_returns.values))
Risk-Parity with Model Selection B: 
Annualized Return: 0.0061803207332966625
Annualized Volatility: 0.04209334277303941
Annualized Sharpe Ratio: 0.14682418468449907
time: 1.27 ms