Detecting Structural Breaks in Financial Markets
A structural break in a financial market is a shift from one regime of market behavior to another, for example from a mean-reverting pattern to a momentum pattern. Such changes often catch market participants off guard, leading them to make costly mistakes. Detecting these breaks early supports more informed trading decisions. We'll focus on two types of tests for detecting structural breaks: CUSUM tests and explosiveness tests.
Types of Structural Break Tests
- CUSUM tests: These tests measure if the cumulative forecasting errors significantly deviate from random behavior.
- Explosiveness tests: These tests identify whether the process shows exponential growth or decline, which would be inconsistent with a random walk or stationary process.
Brown-Durbin-Evans CUSUM Test on Recursive Residuals
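The Brown-Durbin-Evans CUSUM test evaluates structural breaks by using recursive least squares (RLS) estimates. With \( X_t \) the matrix of the first \( t \) feature rows and \( Y_t \) the vector of the first \( t \) targets, the model and its RLS estimate are:
\[ y_t = \beta_t' x_t + \varepsilon_t, \qquad \hat{\beta}_t = (X_t' X_t)^{-1} X_t' Y_t \]
We compute standardized 1-step ahead recursive residuals using:
\[ \hat{\omega}_t = \frac{y_t - \hat{\beta}_{t-1}' x_t}{\sqrt{1 + x_t' (X_{t-1}' X_{t-1})^{-1} x_t}} \]
Finally, the CUSUM statistic is calculated as:
\[ S_t = \frac{1}{\hat{\sigma}} \sum_{j=k+1}^{t} \hat{\omega}_j \]
where \( \hat{\sigma}^2 \) is the residual variance of the full-sample regression.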
using LinearAlgebra
using Statistics
using DataFrames
using Dates
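# Ordinary least squares sketch; see the source code for the detailed implementation.
function computeBeta(
    X::Matrix,
    y::Vector
)::Tuple{Vector, Matrix}
    XXInverse = inv(X' * X)
    β = XXInverse * (X' * y)
    ε = y - X * β
    βVariance = (ε' * ε) / (size(X, 1) - size(X, 2)) * XXInverse
    return β, βVariance
end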
function brownDurbinEvansTest(
    X::Matrix,
    y::Vector,
    lags::Int,
    k::Int,
    index::Vector
)::DataFrame
    # Full-sample fit, used only to estimate the residual variance.
    β, _ = computeBeta(X, y)
    residuals = y - X * β
    residualVariance = residuals' * residuals / (length(y) - 1 + lags - k)

    startIndex = k - lags + 1
    cumulativeSum = 0.0  # Float64 accumulator; avoids shadowing Base.cumsum
    bdeCumsumStats = zeros(length(y) - startIndex)
    for i in startIndex:length(y) - 1
        # Re-estimate β on the first i observations, then forecast observation i + 1.
        X_, y_ = X[1:i, :], y[1:i]
        β, _ = computeBeta(X_, y_)
        xNext = X[i + 1, :]
        ω = (y[i + 1] - xNext' * β) / sqrt(1 + xNext' * inv(X_' * X_) * xNext)
        cumulativeSum += ω
        bdeCumsumStats[i - startIndex + 1] = cumulativeSum / sqrt(residualVariance)
    end
    bdeStatsDf = DataFrame(index = index[k:length(y) + lags - 2], bdeStatistics = bdeCumsumStats)
    return bdeStatsDf
end
Simplified CUSUM Test
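A simplified version of the Brown-Durbin-Evans test focuses only on price levels, making it computationally less expensive. It calculates the standardized departure of the log-price \( y_t \) relative to a reference log-price \( y_n \), with \( t > n \):
\[ S_{n,t} = \frac{y_t - y_n}{\hat{\sigma}_t \sqrt{t - n}}, \qquad \hat{\sigma}_t^2 = \frac{1}{t - 1} \sum_{i=2}^{t} (\Delta y_i)^2 \]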
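The standardized departure described above can be sketched in a few lines. This is a minimal illustration, not the RiskLabAI implementation: the name simplifiedCusum is hypothetical, the volatility estimate is assumed to be the root mean squared one-step change in log price, and the absolute value is taken (an assumption) so that departures in either direction register.

```julia
# Hypothetical helper, not part of RiskLabAI.
# For the last observation t, returns the largest standardized departure
# |y_t - y_n| / (σ * sqrt(t - n)) over all reference points n < t,
# together with the reference point n that attains it.
function simplifiedCusum(logPrices::Vector{Float64})
    t = length(logPrices)
    t >= 3 || throw(ArgumentError("need at least 3 observations"))
    # Assumed volatility estimate: σ² = (t - 1)^-1 * Σ (Δy_i)².
    σ = sqrt(sum(abs2, diff(logPrices)) / (t - 1))
    σ > 0 || throw(ArgumentError("log prices must not be constant"))
    best, bestN = -Inf, 0
    for n in 1:(t - 1)
        s = abs(logPrices[t] - logPrices[n]) / (σ * sqrt(t - n))
        if s > best
            best, bestN = s, n
        end
    end
    return best, bestN
end
```

Calling simplifiedCusum(log.(prices)) on a vector of positive prices returns the statistic and the reference point that maximizes it. Each call costs one pass over the sample, with no regression to re-estimate.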
When studying financial time series data, using log prices is often more appropriate than using raw prices. A stationarity test run on raw prices implies returns whose volatility is not time-invariant: it has to fall as prices rise and rise as prices fall, which builds heteroscedasticity into the model. Log prices give a more reliable statistical framework for understanding price behavior.
Mathematically, using raw prices yields a model like:
\[ \Delta y_t \propto y_{t-1} \]
Whereas using log prices can be modeled as:
\[ \Delta \log[y_t] \propto \log[y_{t-1}] \]
Using log prices also handles changing economic conditions, bubbles, or other economic phases better, especially for data that spans multiple years.
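A small numerical check can make the difference concrete. The sketch below applies the same sequence of relative moves at two price levels; the numbers are made up for illustration. Raw price changes scale with the level, while log price changes do not.

```julia
# The same +1% / -1% moves applied at two different price levels (made-up numbers).
relativeMoves = [1.01, 0.99, 1.01, 0.99]
lowPath  = 100.0  .* cumprod(relativeMoves)
highPath = 1000.0 .* cumprod(relativeMoves)

rawLow, rawHigh = diff(lowPath), diff(highPath)
logLow, logHigh = diff(log.(lowPath)), diff(log.(highPath))

# Raw changes are about ten times larger at the higher price level.
rawRatio = maximum(abs.(rawHigh)) / maximum(abs.(rawLow))
# Log changes agree up to floating-point rounding.
logGap = maximum(abs.(logHigh .- logLow))

println("raw change ratio: ", rawRatio)
println("largest log change gap: ", logGap)
```

Over a multi-year sample in which the price level drifts substantially, this level dependence is what makes the dispersion of raw changes vary through time.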
Computational Complexity
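The SADF algorithm's computational cost is \( \mathcal{O}(n^2) \) in the number of ADF regressions, where \( n \) is the sample length, because every end point is paired with every admissible start point. This complexity can quickly become a bottleneck for larger datasets. For example, a full SADF time series for a long high-frequency dataset can require operations on the order of PFLOPs. Using a High-Performance Computing (HPC) cluster may be necessary to complete the computation within a reasonable time frame.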
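To see where the cost comes from, one can count the ADF regressions behind a full SADF series. The sketch below assumes that, for each end point, one regression is run for every start point that leaves at least minSampleLength observations in the window; sadfRegressionCount is a hypothetical helper, not a library function.

```julia
# Hypothetical helper: number of ADF regressions needed for a full SADF series
# of length T when every window must hold at least minSampleLength observations.
function sadfRegressionCount(T::Int, minSampleLength::Int)
    1 <= minSampleLength <= T || throw(ArgumentError("need 1 <= minSampleLength <= T"))
    m = T - minSampleLength + 1   # number of admissible end points
    return m * (m + 1) ÷ 2        # 1 + 2 + ... + m windows in total
end

for T in (1_000, 2_000, 4_000)
    println("T = ", T, ": ", sadfRegressionCount(T, 100), " regressions")
end
```

Under these assumptions, doubling the sample length roughly quadruples the number of regressions, and each regression also grows more expensive as its window lengthens.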
Conditions for Exponential Behavior
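There are three potential states for the system based on log prices: steady, unit-root, and explosive. The behavior is largely determined by the \( \beta \) parameter in the equation:
\[ \Delta \log[y_t] = \alpha + \beta \log[y_{t-1}] + \varepsilon_t \]
With \( \beta < 0 \) the log price reverts toward a long-run level (steady), with \( \beta = 0 \) it follows a unit-root process, and with \( \beta > 0 \) it grows exponentially (explosive).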
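The three states can be illustrated with the expected path of the log price. The sketch below assumes a first-order specification in which the change in log price equals α + β times the previous log price plus zero-mean noise; expectedLogPrice is a hypothetical helper and the parameter values are made up.

```julia
# Expected log price after t steps under Δlog y_t = α + β * log y_{t-1} + ε_t,
# with E[ε_t] = 0 (an assumed specification for this illustration).
# For β ≠ 0: E[log y_t] = -α/β + (1 + β)^t * (log y_0 + α/β).
# For β = 0 the recursion reduces to a pure drift: E[log y_t] = log y_0 + α * t.
function expectedLogPrice(logY0::Float64, α::Float64, β::Float64, t::Int)
    β == 0.0 && return logY0 + α * t
    return -α / β + (1 + β)^t * (logY0 + α / β)
end

for β in (-0.1, 0.0, 0.1)
    path = [expectedLogPrice(1.0, 0.05, β, t) for t in (0, 10, 50)]
    println("β = ", β, ": ", round.(path; digits = 3))
end
```

With a negative β the path settles at the long-run level -α/β, with β equal to zero it drifts linearly, and with a positive β the distance from -α/β compounds at rate 1 + β per step.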
Quantile ADF and Conditional ADF
Two robust alternatives to SADF are Quantile ADF (QADF) and Conditional ADF (CADF). These methods provide measures of centrality and dispersion of high ADF values, reducing sensitivity to outliers and data specifics.
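As a rough sketch of the idea, the summaries below are computed from the ADF statistics of all windows ending at one point in time. The definitions are assumptions for illustration: QADF is taken as the q-quantile with an inter-quantile spread of half-width v, and CADF as the mean and standard deviation of the values at or above that quantile. The name robustAdfSummaries and the default v are hypothetical.

```julia
using Statistics

# Hypothetical helper: summarize the ADF statistics from all windows ending at
# one time t. `q` is the quantile that defines the "high" values.
function robustAdfSummaries(adfStats::Vector{Float64}, q::Float64; v::Float64 = 0.025)
    (0 < q - v && q + v < 1) || throw(ArgumentError("need 0 < q - v and q + v < 1"))
    qadf = quantile(adfStats, q)                                        # centrality
    qadfSpread = quantile(adfStats, q + v) - quantile(adfStats, q - v)  # dispersion
    tail = adfStats[adfStats .>= qadf]        # values at or above the q-quantile
    cadf = mean(tail)                                       # centrality of the tail
    cadfSpread = sqrt(max(mean(abs2, tail) - cadf^2, 0.0))  # dispersion of the tail
    return (sadf = maximum(adfStats), qadf = qadf, qadfSpread = qadfSpread,
            cadf = cadf, cadfSpread = cadfSpread)
end
```

Unlike the supremum, which is set by a single window, both summaries depend on several of the highest statistics, so one extreme value moves them less.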
SADF, QADF, and CADF Algorithm Implementation
Below is an example of how these statistics are implemented in the RiskLabAI library.
function adfTestType(
    data::DataFrame,
    minSampleLength::Int,
    constant::String,
    lags,
    type::String;
    quantile::Union{Nothing, Float64} = nothing,
    probability::Vector = ones(size(data, 1))
)::DataFrame
    # Fail early on invalid arguments instead of printing inside the loop.
    type in ("SADF", "QADF", "CADF") || throw(ArgumentError("type must be SADF or QADF or CADF"))
    (type == "SADF" || quantile !== nothing) || throw(ArgumentError("quantile is required for QADF and CADF"))

    # prepareData and adf are helper functions defined elsewhere in the library.
    X, y = prepareData(data.price, constant, lags)
    maxLag = lags isa Integer ? lags : maximum(lags)
    indexRange = (minSampleLength - maxLag):length(y)
    result = zeros(length(indexRange))
    for (i, index) in enumerate(indexRange)
        X_, y_ = X[1:index, :], y[1:index]
        adfStats = adf(X_, y_, minSampleLength)
        if isempty(adfStats)
            continue  # leaves result[i] at 0
        end
        if type == "SADF"
            result[i] = maximum(adfStats)
        else
            # 1-based position of the quantile among the sorted statistics.
            cutoff = max(1, floor(Int, quantile * length(adfStats)))
            perm = sortperm(adfStats)
            if type == "QADF"
                result[i] = adfStats[perm[cutoff]]
            else
                # CADF: probability-weighted mean of the statistics from the quantile upward.
                tail = perm[cutoff:end]
                result[i] = sum(adfStats[tail] .* probability[tail]) / sum(probability[tail])
            end
        end
    end
    adfStatistics = DataFrame(index = data.index[minSampleLength:length(y) + maxLag], statistics = result)
    return adfStatistics
end
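This implementation takes seven inputs: the data frame of close prices (data), the minimum sample length (minSampleLength), the type of constant to use (constant), the lag values (lags), the type of ADF test (type), and two optional keyword arguments, the quantile used by QADF and CADF (quantile) and the weights used in the CADF average (probability).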
References
- López de Prado, M. (2018). Advances in Financial Machine Learning. John Wiley & Sons.
- López de Prado, M. (2020). Machine Learning for Asset Managers. Cambridge University Press.