How a CausalTrader report is built.

Every report is the output of a deterministic five-step pipeline: collect signals, test causality among them, attribute news effects, classify regime, and write the markets note. This page covers the math.

PCMCI+ — finding what causes what

PCMCI+ is a constraint-based causal discovery algorithm built for time series; it extends the PCMCI method of Runge et al. (2019) to contemporaneous as well as lagged links. We feed it a daily panel of signals (price-based technical indicators, news flow features, sentiment scores, and macro proxies) and ask it which variables causally influence which other variables at lags up to five trading days. For every candidate edge, PCMCI+ first runs a PC-style selection step to find the relevant lagged conditions, then applies a momentary-conditional-independence (MCI) test given those conditions to control false-positive rates.

The output is a directed graph of statistically significant causal relationships at a chosen significance level (default α = 0.05). Edges that survive the filter are the candidate causal relationships we report on. The absence of an edge is informative too: it means that, after conditioning on the other variables, the test found no statistically significant dependence at that level. It is not proof that no relationship exists.
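To make the idea of a conditional-independence test concrete, the sketch below runs a partial-correlation test with Fisher's z-transform on a toy panel. It is an illustration only and not the PCMCI+ implementation used for reports: the function names, the single conditioning variable, and the synthetic data (without lags) are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(x, y, z):
    """Correlation of x and y after linearly removing z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def ci_pvalue(x, y, z):
    """Two-sided p-value for H0: x is independent of y given z (Fisher z)."""
    r = partial_corr(x, y, z)
    stat = math.sqrt(len(x) - 1 - 3) * math.atanh(r)  # n - |Z| - 3 with |Z| = 1
    return 2 * (1 - NormalDist().cdf(abs(stat)))

# Toy panel (hypothetical): z drives both x and y; x and y have no direct link.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(500)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw_corr = pearson(x, y)            # inflated by the shared driver z
cond_corr = partial_corr(x, y, z)   # shrinks toward zero once z is conditioned on
p_value = ci_pvalue(x, y, z)        # compared against alpha to keep or drop the edge
```

In this toy setup the raw correlation between x and y is sizeable, while the partial correlation given z is near zero, so an edge between x and y would normally be dropped.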

Dual-window architecture — short news, long causality

PCMCI+ needs a few hundred rows of aligned data to test conditional independence reliably; a 30-day customer window only yields ~21 trading days, which is too thin for the causal layer to resolve stable structure. We decouple the two halves of the report. News, sentiment, technicals, and the front-page statistic strip stay scoped to your selected analysis window because those are short-horizon phenomena and the reader is asking about them. The causal-discovery layer runs on a fixed 180-day context window that ends on the same trading day, giving PCMCI+ roughly 125 trading days of panel data to work with. Newly listed tickers without 180 days of history fall back to the available bars; the graph caption discloses the truncation when this happens. The two windows are reported side-by-side on the graph page and in the About box so the reader sees exactly which panel each part of the analysis came from.
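The row counts quoted above follow from simple calendar arithmetic. The sketch below counts Monday-to-Friday dates in a window ending on a given day. It is an approximation under stated assumptions: the helper name and the end date are hypothetical, and exchange holidays are ignored, so real trading-day counts come out a few days lower.

```python
from datetime import date, timedelta

def weekdays_in_window(end, calendar_days):
    """Count Mon-Fri dates in the window of `calendar_days` ending on `end`."""
    days = (end - timedelta(days=i) for i in range(calendar_days))
    return sum(1 for d in days if d.weekday() < 5)

end = date(2024, 6, 28)                        # an arbitrary Friday, for illustration
short_count = weekdays_in_window(end, 30)      # customer analysis window
long_count = weekdays_in_window(end, 180)      # causal context window
```

For this end date the helper gives 22 weekdays for the 30-day window and 130 for the 180-day window before holidays are removed, which is in line with the approximate figures of 21 and 125 trading days.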

VARLiNGAM — attributing news to returns

VARLiNGAM (Hyvärinen et al., 2010) extends the LiNGAM family to vector autoregressive systems. It assumes linear relationships among variables, non-Gaussian residuals, and an acyclic contemporaneous causal structure, and it identifies signed causal effects in addition to a full topological ordering of the variables. We fit it to a daily panel of news-derived features (sentiment, magnitude, source, headline counts) and market-model abnormal returns.
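For reference, the model described by Hyvärinen et al. can be written as a structural vector autoregression. The symbols below (x_t for the vector of daily variables, k for the lag order, B_τ for the coefficient matrices, e_t for the residuals) are notation chosen for this page rather than names taken from the report.

```latex
x_t = \sum_{\tau=0}^{k} B_\tau \, x_{t-\tau} + e_t
```

Here B_0 holds the contemporaneous effects and is assumed acyclic, meaning it can be permuted to a strictly lower-triangular matrix, which is what yields the topological ordering. The components of e_t are assumed mutually independent and non-Gaussian, and B_1 through B_k hold the lagged effects.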

From the fitted VAR coefficient matrix we attribute to each headline a portion of its day's estimated effect in basis points. These are attributions under the estimated daily DAG, not independent per-headline causal effects.
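The text does not say how a day's effect is divided among that day's headlines. Purely as an illustration, the sketch below splits it in proportion to a per-headline weight such as absolute sentiment magnitude. The weighting rule, the function name, and the numbers are all assumptions, not the production method.

```python
def attribute_day_effect(day_effect_bps, weights):
    """Split one day's estimated effect across headlines, pro rata by |weight|.

    Hypothetical rule for illustration: the shares always sum to the day's
    effect, which is why they are attributions rather than independent
    per-headline causal estimates.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        # No information to rank headlines: split evenly.
        return [day_effect_bps / len(weights)] * len(weights)
    return [day_effect_bps * abs(w) / total for w in weights]

# Three headlines on a day whose estimated effect is +12 bps.
shares = attribute_day_effect(12.0, [1, 2, 3])
```

Whatever rule is used, the shares add back up to the day-level estimate, so a headline's share depends on what else was published that day.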

Abnormal returns — stripping out the index

Before VARLiNGAM, we adjust raw returns to remove the part explained by the broad market. We fit a standard market model against SPY on a rolling window: r_t = α + β · r_SPY,t + ε_t, where the abnormal return is the residual ε_t. When the window is too short for a stable market-model fit, we fall back to raw returns and flag that in the report.
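The market-model adjustment can be sketched with an ordinary least squares fit. The example below fits a single window rather than a rolling one, and the function names, the min_obs cutoff of 30 observations, and the synthetic return series are assumptions made for illustration.

```python
def fit_market_model(stock, market):
    """Ordinary least squares fit of r_t = alpha + beta * r_SPY,t + eps_t."""
    n = len(stock)
    mean_s, mean_m = sum(stock) / n, sum(market) / n
    cov = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta

def abnormal_returns(stock, market, min_obs=30):
    """Return (series, used_market_model).

    Falls back to raw returns when fewer than `min_obs` observations
    are available (the cutoff is a hypothetical value).
    """
    if len(stock) < min_obs:
        return list(stock), False
    alpha, beta = fit_market_model(stock, market)
    return [s - (alpha + beta * m) for s, m in zip(stock, market)], True

# Synthetic check: a stock that is exactly 0.001 + 1.5 * market.
market = [0.01 * ((i % 7) - 3) for i in range(60)]
stock = [0.001 + 1.5 * m for m in market]
alpha, beta = fit_market_model(stock, market)
residuals, used_model = abnormal_returns(stock, market)
```

Because the synthetic stock is an exact linear function of the market, the fit recovers its intercept and slope and the residuals are zero up to rounding. With real data the residuals are the abnormal returns passed on to VARLiNGAM.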

Cluster-bootstrap confidence intervals

VARLiNGAM coefficients and per-headline effects are reported with 95% confidence intervals derived from a cluster bootstrap. We resample by week (not by day) to preserve short-horizon autocorrelation and clustered news arrival. The default is 500 bootstrap resamples. CIs that exclude zero are visually flagged in the report.
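A week-level cluster bootstrap can be sketched as follows. For brevity the statistic here is a simple mean of daily effects rather than a refitted VARLiNGAM coefficient, and the function name, the percentile interval, the seed, and the synthetic data are assumptions made for the example.

```python
import random
from collections import defaultdict

def cluster_bootstrap_ci(values, clusters, stat, n_boot=500, level=0.95, seed=0):
    """Percentile CI for `stat`, resampling whole clusters with replacement."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for v, c in zip(values, clusters):
        groups[c].append(v)
    keys = list(groups)
    draws = []
    for _ in range(n_boot):
        sample = []
        # Draw as many weeks as the data has, keeping each week's days together.
        for k in rng.choices(keys, k=len(keys)):
            sample.extend(groups[k])
        draws.append(stat(sample))
    draws.sort()
    tail = (1 - level) / 2
    lo = draws[int(tail * (n_boot - 1))]
    hi = draws[int((1 - tail) * (n_boot - 1))]
    return lo, hi

def mean(s):
    return sum(s) / len(s)

# Synthetic daily effects in basis points: 12 weeks of 5 trading days each.
rng = random.Random(1)
weeks = [w for w in range(12) for _ in range(5)]
effects = [rng.gauss(2.0, 5.0) for _ in weeks]
low, high = cluster_bootstrap_ci(effects, weeks, stat=mean)
excludes_zero = low > 0 or high < 0   # the condition that triggers the visual flag
```

Resampling whole weeks keeps each week's days together, so within-week dependence carries over into every resample. Resampling individual days would break that dependence and tend to give intervals that are too narrow.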

Benjamini-Hochberg FDR control

Running many significance tests at once inflates the number of false positives. We apply the Benjamini-Hochberg procedure to the VARLiNGAM coefficient p-values to control the false discovery rate at a configurable level (default q = 0.10). Effects that clear the threshold are marked as having passed FDR.
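The Benjamini-Hochberg step-up rule is short enough to show in full. The sketch below is a plain implementation of the published procedure; the function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return a list of booleans: True where the test passes FDR at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k_max = rank
    # Everything ranked at or below k_max passes, even if its own
    # threshold was missed (this is the "step-up" part).
    passed = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            passed[i] = True
    return passed

flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.20, 0.60], q=0.10)
step_up = benjamini_hochberg([0.03, 0.035, 0.04], q=0.05)
```

In the first call, the three smallest p-values pass and the other two do not. In the second, all three pass because the largest p-value clears its own threshold, which carries the smaller ones with it.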

Regime classification

We segment the analysis window into bull, bear, and sideways regimes using a 21-day rolling assessment of trend, volatility, and drawdown. The technicals page shades the price panel by regime, and the front-page statistic strip reports the percentage of the period spent in each state.
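The exact classification rules are not given here, so the sketch below only shows the general shape of a rolling regime classifier and the share-of-period summary. It uses trend and drawdown and leaves volatility out. The thresholds (5% trailing return, 10% drawdown) and the function names are hypothetical.

```python
from collections import Counter

def classify_regimes(prices, window=21, trend_thr=0.05, dd_thr=0.10):
    """Label each day from index `window` onward as bull, bear, or sideways."""
    labels = []
    for t in range(window, len(prices)):
        trailing = prices[t - window:t + 1]
        trend = prices[t] / prices[t - window] - 1     # trailing 21-day return
        drawdown = 1 - prices[t] / max(trailing)       # drop from the window peak
        if trend <= -trend_thr or drawdown >= dd_thr:
            labels.append("bear")
        elif trend >= trend_thr:
            labels.append("bull")
        else:
            labels.append("sideways")
    return labels

def regime_shares(labels):
    """Percentage of labelled days spent in each regime."""
    counts = Counter(labels)
    return {name: 100 * counts[name] / len(labels)
            for name in ("bull", "bear", "sideways")}

# Synthetic price paths: steady rise, steady fall, and flat.
rising = [100 * 1.01 ** i for i in range(60)]
falling = [100 * 0.99 ** i for i in range(60)]
flat = [100.0] * 60
rising_shares = regime_shares(classify_regimes(rising))
```

The shares returned by regime_shares correspond to the percentages shown in the front-page statistic strip, computed here over the days that have a full 21-day lookback.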

What the report does not do

CausalTrader does not issue buy / hold / sell recommendations. It does not size positions. Causal relationships discovered in one window can break in the next — regime changes, earnings surprises, Fed announcements, or geopolitical events can scramble any historical pattern overnight. The full disclaimer spells out the model and data caveats.

References

Runge, J., et al. (2019). Detecting and quantifying causal associations in large nonlinear time series datasets. Science Advances 5(11).
Hyvärinen, A., Zhang, K., Shimizu, S., & Hoyer, P. O. (2010). Estimation of a structural vector autoregression model using non-Gaussianity. Journal of Machine Learning Research 11.
Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSS-B 57(1).

Questions

For methodology questions or audit requests: dennis@mercurial-ai.com.