# Backtesting and benchmarks
motac now treats forecasting as a probabilistic task rather than only a
deterministic mean-field rollout.
## Rolling-origin protocol
Backtests are run with rolling train/test folds:

1. fit on `y[:, :train_end]`
2. forecast `horizon` steps ahead with Monte Carlo paths
3. score on the held-out `y[:, train_end:train_end+horizon]`
4. repeat with a configurable rolling step
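The steps above can be sketched as a fold generator. This is a generic illustration of the rolling-origin protocol, not motac's actual fold logic, which may handle edges differently:

```python
def rolling_origin_folds(n_steps, train_end, horizon, step):
    """Yield (train_slice, test_slice) index pairs for a rolling-origin backtest.

    The training window always starts at 0 and grows; the test window of
    length `horizon` rolls forward by `step` until it runs off the series.
    """
    origin = train_end
    while origin + horizon <= n_steps:
        yield slice(0, origin), slice(origin, origin + horizon)
        origin += step

# Example: 20 time steps; first fold trains on 10, forecasts 4 ahead, rolls by 4.
folds = list(rolling_origin_folds(20, train_end=10, horizon=4, step=4))
# Two folds: (0:10 -> 10:14) and (0:14 -> 14:18); a third would overrun the series.
```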
This is implemented via:

- `motac.eval.backtest_fit_forecast_nll`
- `motac.eval.run_backtest_report`
## Probabilistic forecasts
The model's forecasting path now supports sampling future count trajectories:

- `forecast_count_paths_horizon` returns sampled count paths and latent intensity paths.
- `forecast_probabilistic_horizon` returns paths plus mean/quantile summaries.
Key outputs:

- forecast mean counts
- quantile envelopes (default `q=(0.05, 0.5, 0.95)`)
- fold-level and aggregate metrics
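The mean and quantile summaries can be computed directly from sampled paths. A minimal sketch with numpy; the array shape `(n_paths, n_series, horizon)` is an assumption for illustration, not motac's documented layout:

```python
import numpy as np

# Toy stand-in for sampled count trajectories from a fitted model.
rng = np.random.default_rng(0)
paths = rng.poisson(lam=3.0, size=(500, 2, 6))  # (n_paths, n_series, horizon)

# Forecast mean counts: average over the Monte Carlo path axis.
mean_counts = paths.mean(axis=0)  # (n_series, horizon)

# Quantile envelope at the default levels q=(0.05, 0.5, 0.95).
q_lo, q_med, q_hi = np.quantile(paths, (0.05, 0.5, 0.95), axis=0)
```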
## Metrics
Backtests report:

- negative log-likelihood (NLL)
- RMSE
- MAE
- empirical interval coverage
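The sample-based metrics can be sketched as follows. This is a hedged illustration of common definitions, not motac's exact implementation; NLL is omitted because it requires the model's predictive density rather than sampled paths alone:

```python
import numpy as np

def backtest_metrics(y_true, paths):
    """Score held-out counts against Monte Carlo forecast paths.

    y_true: (horizon,) held-out counts for one fold.
    paths:  (n_paths, horizon) sampled count trajectories.
    """
    point = paths.mean(axis=0)  # point forecast = path mean
    rmse = float(np.sqrt(np.mean((point - y_true) ** 2)))
    mae = float(np.mean(np.abs(point - y_true)))
    # Empirical coverage of the central 90% interval (q=0.05 to q=0.95).
    lo, hi = np.quantile(paths, (0.05, 0.95), axis=0)
    coverage = float(np.mean((y_true >= lo) & (y_true <= hi)))
    return {"rmse": rmse, "mae": mae, "coverage": coverage}

rng = np.random.default_rng(1)
y_true = np.array([2, 3, 4, 3])
paths = rng.poisson(3.0, size=(1000, 4))
m = backtest_metrics(y_true, paths)
```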
## Baselines
Reports include simple benchmark baselines:
- `last_value`
- `seasonal_naive`
- `moving_average`
These provide context for model gains and avoid single-model performance claims.
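The three baselines follow standard definitions; a minimal sketch, assuming a 1-D history array and forecasts of length `horizon` (motac's signatures may differ):

```python
import numpy as np

def last_value(y, horizon):
    # Repeat the final observed value for every forecast step.
    return np.full(horizon, y[-1], dtype=float)

def seasonal_naive(y, horizon, period):
    # Repeat the value observed one seasonal period earlier.
    return np.array([y[-period + (h % period)] for h in range(horizon)], dtype=float)

def moving_average(y, horizon, window):
    # Forecast the trailing-window mean at every step.
    return np.full(horizon, y[-window:].mean())

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```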
## Reproducible artifacts
`run_backtest_report` writes:

- `report.json`
- a fold-metric figure
- a baseline comparison figure
The report structure is designed for machine-readable comparison across Chicago and ACLED benchmark runs.
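A machine-readable report makes cross-run comparison a small amount of code. The payload shape and key names below are purely hypothetical toy values for illustration; the actual `report.json` schema may differ:

```python
# Sketch: rank runs by a shared aggregate metric pulled from parsed report.json
# payloads. "name", "aggregate", and "rmse" are assumed key names, not the
# documented schema.
def best_by_metric(reports, metric="rmse"):
    """Return the name of the run with the lowest value of `metric`."""
    return min(reports, key=lambda r: r["aggregate"][metric])["name"]

runs = [
    {"name": "run_a", "aggregate": {"rmse": 2.1}},
    {"name": "run_b", "aggregate": {"rmse": 2.6}},
]
```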