Deep-Time Neural Networks

Hierarchical Risk Parity: A Modern Portfolio Construction Method

We present a cutting-edge method for constructing financial portfolios, called Hierarchical Risk Parity (HRP). This method overcomes challenges posed by traditional techniques, by using modern mathematics and machine learning. HRP works in three steps: Tree Clustering, Quasi-Diagonalization, and Recursive Bisection. The approach is based on the information in the covariance matrix without requiring its inversion, and it works even with singular matrices.

Tree Clustering

Given a matrix of observations XX, we first generate a correlation matrix. A distance measure is defined, and an N×NN \times N distance matrix DD is created.

We use tree structures to hierarchically cluster financial assets. The goal is to group together the closest financial assets in terms of their distance measure, providing a hierarchical structure that represents the similarities between the assets.

In mathematical terms:

di,j=12(1ρi,j)d_{i, j} = \sqrt{\frac{ 1 }{ 2 }(1 - \rho_{ i, j })}

The algorithm performs clustering based on a new distance measure d~\tilde{d}, which computes the distance between column vectors of DD. Eventually, we create clusters by minimizing d~i,j\tilde{d}_{i, j}.

hrp.py
def distance_corr(
        corr_matrix: np.ndarray
) -> np.ndarray:
    distance_matrix = ((1 - corr_matrix) / 2.0) ** 0.5
    return distance_matrix

View More: Julia | Python


Quasi-Diagonalization

The next stage rearranges the covariance matrix to group similar investments together along the diagonal. The result is a quasi-diagonal matrix that simplifies the allocation of investments.

hrp.py
def quasi_diagonal(link: np.ndarray) -> list:
    sorted_items = [link[-1, 0], link[-1, 1]]
    
    num_items = link[-1, -2]
    
    while sorted_items[-1] >= num_items:
        index = int(sorted_items[-1] - num_items)
        sorted_items.extend([link[index, 0], link[index, 1]])
    
    return sorted_items

View More: Julia | Python


Recursive Bisection

The last stage is the allocation of assets, leveraging the quasi-diagonal matrix. The algorithm recursively bisects the sorted list of original items into subsets and allocates investments in inverse proportion to their aggregated variances.

The algorithm follows:

  1. Initialize list of items LL and assign a unit weight to all items wn=1,nw_n=1, \forall n.
  2. For each subset LiL_i with more than one item:
  • Bisect LiL_i into two subsets Li(1)L_i^{(1)} and Li(2)L_i^{(2)}.
  • Define the variance of each subset V~i(j)\tilde{V}_{i}^{(j)} based on the covariance matrix of its constituents.

And so, the portfolio is constructed.

hrp.py
def recursive_bisection(cov: pd.DataFrame, sorted_items: list) -> pd.Series:
    if len(sorted_items) > 1:
        cluster = [sorted_items]
        while len(cluster) > 0:
            cluster_ = cluster.pop()
            if len(cluster_) > 1:
                var_covar = cov.loc[cluster_, cluster_]
                e_val, e_vec = np.linalg.eigh(var_covar)
                index = e_val.argsort()[::-1]
                e_val, e_vec = e_val[index], e_vec[:, index]
                w = np.zeros((len(cluster_)))
                w[-1] = 1
                cluster0, cluster1 = [], []
                for i in range(len(cluster_) - 1):
                    d = np.sqrt((w * e_vec[:, i]).T @ e_val[i] @ (w * e_vec[:, i]))
                    u = ((w * e_vec[:, i]) / d).reshape(-1, 1)
                    cluster0.append(cluster_[np.dot(var_covar, u).flatten() <= 0])
                    cluster1.append(cluster_[np.dot(var_covar, u).flatten() > 0])
                cluster.extend([sorted(cluster0), sorted(cluster1)])
            else:
                break
    else:
        cluster = [sorted_items]
    
    weights = pd.Series(1, index=[i for i in cluster[0]])
    
    return weights

View More: Julia | Python


Comparing Portfolio Risk Strategies: HRP vs. CLA vs. IVP

This blog presents a Monte Carlo simulation study comparing three portfolio allocation strategies: Hierarchical Risk Parity (HRP), Critical Line Algorithm (CLA), and Inverse Variance Portfolio (IVP). We find that while CLA aims for the lowest in-sample risk, it performs the worst in out-of-sample tests. Specifically, HRP yields the lowest out-of-sample variance, making it preferable for risk parity investors who often employ leverage.

Key Findings:

  • HRP portfolios have 72.47% lower variance than CLA portfolios and 38.24% lower variance than IVP portfolios.
  • HRP improves the out-of-sample Sharpe ratio of a CLA strategy by about 31.3%.

Methodology

Three key steps are followed:

  1. Generate 10 series of random Gaussian returns for 520 observations (equivalent to 2 years of daily history), with random shocks and correlation structure.
  2. Compute HRP, CLA, and IVP portfolios looking back at 260 observations (a year of daily history).
  3. Rebalance portfolios every 22 observations (equivalent to a monthly frequency).

This procedure is repeated 10,000 times.

hrp.py
def random_data(
        number_observations: int,
        length_sample: int,
        size_uncorrelated: int,
        size_correlated: int,
        mu_uncorrelated: float,
        sigma_uncorrelated: float,
        sigma_correlated: float
) -> (np.ndarray, list):
    # Generate random uncorrelated data
    data1 = np.random.normal(mu_uncorrelated, sigma_uncorrelated, size=(number_observations, size_uncorrelated))
    
    # Create correlation between the variables
    columns = [random.randint(0, size_uncorrelated - 1) for i in range(size_correlated)]  # randomly select columns
    data2 = data1[:, columns] + np.random.normal(0, sigma_uncorrelated * sigma_correlated, size=(number_observations, len(columns)))  # correlated data
    
    # Merge data
    data = np.append(data1, data2, axis=1)
    
    # Add common random shock
    points = np.random.randint(length_sample, number_observations - 1, size=2)  # select random observations
    data[np.ix_(points, [columns[0], size_uncorrelated])] = np.array([[-0.5, -0.5], [2, 2]])
    
    # Add specific random shock
    points = np.random.randint(length_sample, number_observations - 1, size=2)  # select random observations
    data[points, columns[-1]] = np.array([-0.5, 2])
    
    return data, columns

View More: Julia | Python


Insights and Interpretations

  1. CLA's optimization program aims for the lowest variance but ends up with the highest out-of-sample variance.
  2. IVP disregards the correlation structure, which affects its performance.
  3. HRP combines diversification across all investments and clusters of investments, making it more robust to both common and idiosyncratic shocks.

Flexibility and Scalability

The HRP method is flexible and allows for variations. It can incorporate forecasted returns and other econometric methods.

hrp.py
def hrp_mc(
        number_iterations: int = 5000,
        number_observations: int = 520,
        size_uncorrelated: int = 5,
        size_correlated: int = 5,
        mu_uncorrelated: float = 0,
        sigma_uncorrelated: float = 0.01,
        sigma_correlated: float = 0.25,
        length_sample: int = 260,
        test_size: int = 22
) -> None:
    methods = [hrp]  # methods
    
    results, iteration_counter = {i.__name__: pd.Series() for i in methods}, 0  # initialize results and iteration counter
    
    pointers = range(length_sample, number_observations, test_size)  # pointers for in-sample and out-sample
    
    while iteration_counter < number_iterations:
        data, columns = random_data(number_observations, length_sample, size_uncorrelated, size_correlated, mu_uncorrelated, sigma_uncorrelated, sigma_correlated)
        
        returns = {i.__name__: pd.Series() for i in methods}  # initialize returns
        
        for pointer in pointers:
            in_sample = data[pointer - length_sample: pointer]  # in sample
            cov_, corr_ = np.cov(in_sample, rowvar=0), np.corrcoef(in_sample, rowvar=0)  # cov and corr
            
            out_sample = data[pointer: pointer + test_size]  # out of sample
            
            for func in methods:
                weight = func(cov=cov_, corr=corr_)  # call methods
                ret = pd.Series(out_sample @ (weight.transpose()))  # return
                returns[func.__name__] = returns[func.__name__].append(ret)  # update returns
        
        for func in methods:
            ret = returns[func.__name__].reset_index(drop=True)  # return column of each method
            cumprod_return = (1 + ret).cumprod()  # cumprod of returns
            results[func.__name__].loc[iteration_counter] = cumprod_return.iloc[-1] - 1  # update results
        
        iteration_counter += 1  # next iteration
    
    results_df = pd.DataFrame.from_dict(results, orient="columns")  # dataframe of results
    results_df.to_csv("results.csv")  # csv file
    
    std_results, var_results = results_df.std(), results_df.var()  # std and var for each method
    print(pd.concat([std_results, var_results, var_results / var_results["hrp"] - 1], axis=1))

View More: Julia | Python


Mathematical Insight

In simpler terms, let's consider two vectors XX and YY. The correlation variable ρ[x,y]\rho[x, y] is used to derive a distance measure d[x,y]d[x, y] defined as:

d[x,y]=12(1ρ[x,y])d[x, y] = \sqrt{\frac{ 1 }{ 2 }(1 - \rho[x, y])}

This distance measure is proven to be a true metric, providing a robust way to measure how similar or different two portfolios are.

References

  1. De Prado, M. L. (2018). Advances in financial machine learning. John Wiley & Sons.
  2. De Prado, M. M. L. (2020). Machine learning for asset managers. Cambridge University Press.