Parameter recovery (road Hawkes)

motac includes a small parameter recovery harness for the parametric road-constrained Hawkes Poisson fitter.

The goal is not perfect recovery on a tiny synthetic problem (which is inherently noisy), but a regression test checking that the end-to-end loop

simulate → fit → compare parameters

stays in a reasonable ballpark across multiple random seeds.

What is tested

On a tiny 3-cell road substrate, we simulate counts from:

  • baseline rates mu (per cell)

  • excitation scale alpha

  • travel-time decay beta

  • fixed discrete lag kernel g(ℓ)

Then we fit \((\mu, \alpha, \beta)\) by maximum likelihood under the Poisson family and check:

  • the optimiser improves the log-likelihood vs initialisation

  • median absolute errors across seeds stay below generous tolerances

  • most seeds succeed (to reduce CI flakiness)

Model recap (recovery target)

The simulated intensity follows the same form as the main parametric model:

\[ \lambda_{j,t} = \mu_j + \alpha \sum_{k \in \mathcal{N}(j)} W(d_{jk})\, h_{k,t}, \quad h_{k,t} = \sum_{\ell=1}^{L} g(\ell)\, y_{k,t-\ell} \]

with \(W(d) = \exp(-\beta d)\) in the recovery harness. Counts are generated as \(y_{j,t} \sim \text{Poisson}(\lambda_{j,t})\).
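As a concrete illustration, the generative process above can be sketched in a few lines of numpy. The cell count, distances, lag kernel, and parameter values below are illustrative choices for this sketch, not motac's actual defaults:

```python
import numpy as np

rng = np.random.default_rng(0)

T, L = 200, 3                      # time steps and lag-kernel length
mu = np.array([0.3, 0.5, 0.4])     # baseline rate mu_j per cell
alpha, beta = 0.8, 1.5             # excitation scale, travel-time decay
d = np.array([[0.0, 1.0, 2.0],     # pairwise road distances d_{jk}
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
g = np.array([0.5, 0.3, 0.2])      # fixed discrete lag kernel g(l)
W = np.exp(-beta * d)              # W(d) = exp(-beta d)

J = mu.shape[0]
y = np.zeros((T, J), dtype=int)
for t in range(T):
    # h_{k,t} = sum_l g(l) y_{k,t-l}: lag-weighted recent history per cell
    h = np.zeros(J)
    for ell in range(1, L + 1):
        if t - ell >= 0:
            h += g[ell - 1] * y[t - ell]
    # lambda_{j,t} = mu_j + alpha * sum_k W(d_jk) h_{k,t}
    lam = mu + alpha * (W @ h)
    y[t] = rng.poisson(lam)
```

The resulting array `y` plays the role of the "simulated counts" that the fitter then sees.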

The recovery task asks: given only the simulated counts, can we re-estimate the parameters that generated them? We do not expect perfect recovery on tiny datasets, but we expect the optimiser to:

  1. Improve the log-likelihood over a naive initialisation, and

  2. Recover parameters to within broad tolerances.

Running locally

The harness is exercised by the unit test:

uv run pytest -q tests/test_parameter_recovery.py

The underlying helper is exposed as:

  • motac.models.run_parameter_recovery_road_hawkes_poisson

(see motac.models.validation).

Notes on tolerances

Recovery on small simulated datasets is stochastic. The test intentionally uses:

  • multi-seed evaluation

  • median error checks

  • a minimum success count (e.g. 4/5 seeds)

so that minor numerical drift does not cause flaky CI failures while still catching real regressions.
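The gating logic above can be sketched as follows; the error values, thresholds, and required success count are illustrative numbers for this sketch, not the test's actual settings:

```python
from statistics import median

# Hypothetical absolute errors on alpha from five seeds
alpha_abs_errors = [0.12, 0.45, 0.08, 0.20, 0.15]
per_seed_success = [e < 0.40 for e in alpha_abs_errors]

# The median check is robust to a single bad seed...
assert median(alpha_abs_errors) < 0.25
# ...and most seeds (here 4/5) must individually pass
assert sum(per_seed_success) >= 4
```

A single outlier seed (here the 0.45 error) fails its per-seed check without tripping the median or the minimum success count, which is exactly the flakiness-versus-sensitivity trade-off described above.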