# Model overview

`motac` implements a **road-constrained, discrete-time Hawkes process** for event counts on a grid laid over a road network. The model decomposes into:

1. A **substrate** defining spatial cells and travel-time neighbours.
2. A **Hawkes intensity** driven by lagged counts and travel-time decay.
3. A **count likelihood** (Poisson or Negative Binomial).
4. Optional **observation noise** for generating observed counts from latent intensities in simulation workflows.

Throughout, let $y_{j,t}$ be the count of events in cell $j$ during time bin $t$.

## Substrate (road constraints)

The road network induces a travel-time distance $d_{jk}$ between grid cells $j$ and $k$ (e.g. shortest-path travel time). We only consider neighbours within a cutoff, defining a sparse neighbourhood set $\mathcal{N}(j)$ and a sparse travel-time matrix.

This structure is used to build a **nonnegative travel-time kernel** $W(d_{jk})$, producing a sparse influence matrix that respects the road network connectivity.

## Hawkes intensity (discrete time)

The parametric model defines the intensity (conditional mean) for each cell:

$$
\lambda_{j,t} = \mu_j + \alpha \sum_{k \in \mathcal{N}(j)} W(d_{jk})\, h_{k,t}
$$

with the lagged history term

$$
h_{k,t} = \sum_{\ell=1}^{L} g(\ell)\, y_{k,t-\ell}
$$

where:

- $\mu_j \ge 0$ is the baseline intensity per cell,
- $\alpha \ge 0$ scales self- and cross-excitation,
- $g(\ell)$ is a nonnegative lag kernel over discrete lags,
- $W(d_{jk})$ downweights excitation by road travel-time distance.

In the parametric baseline, $W(d) = \exp(-\beta d)$, with $\beta > 0$ controlling decay. The code also supports swapping in custom kernel functions $W(d)$ (e.g. neural or alternative deterministic kernels) while preserving the same likelihood.

## Count likelihood

Given the intensity, counts are modelled as either:

- **Poisson:** $y_{j,t} \sim \text{Poisson}(\lambda_{j,t})$.
- **Negative Binomial (NB2):** $y_{j,t} \sim \text{NegBin}(\text{mean}=\lambda_{j,t}, \text{dispersion}=\kappa)$.

The NB2 parameterisation used throughout has variance $\text{Var}[Y] = \lambda + \lambda^2 / \kappa$, where larger $\kappa$ approaches the Poisson case.

## Simulation and observation noise

Simulation utilities generate latent counts from the Hawkes recursion and can optionally apply observation noise (detection probability + false positives) to produce observed counts. This is primarily used for synthetic evaluation and parameter recovery.

## Forecasting and evaluation

Forecasting rolls the intensity forward one step at a time using the fitted parameters and the latest observed history. Evaluation utilities support backtests (train window + held-out horizon) and log-likelihood / RMSE / MAE scoring for quick sanity checks.

The package also provides a probabilistic forecast interface via Monte Carlo path sampling (`forecast_probabilistic_horizon`) and rolling-origin backtests with baseline comparisons.
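
The one-step intensity roll-forward described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not `motac`'s actual API: the function name `one_step_intensity`, the argument layout, and the use of a dense travel-time matrix with `np.inf` for cells outside the cutoff are all hypothetical choices made for readability (the package itself works with a sparse travel-time matrix). It evaluates $\lambda_{j,t} = \mu_j + \alpha \sum_k W(d_{jk})\, h_{k,t}$ with the exponential kernel $W(d) = \exp(-\beta d)$.

```python
import numpy as np


def one_step_intensity(y_hist, mu, alpha, beta, d, g):
    """Conditional mean lambda_{j,t} for the next time bin (illustrative only).

    y_hist : (n_cells, T) counts, most recent bin in the last column, T >= len(g)
    mu     : (n_cells,) baseline intensities
    alpha  : excitation scale
    beta   : travel-time decay rate, beta > 0
    d      : (n_cells, n_cells) travel times; np.inf marks non-neighbours
    g      : (L,) lag kernel weights, g[0] is the weight for lag 1
    """
    L = len(g)
    # Take the last L bins and reverse them so column 0 is lag 1.
    recent = y_hist[:, -L:][:, ::-1]
    # h_k = sum over lags of g(l) * y_{k, t-l}
    h = recent @ g
    # Exponential travel-time kernel; exp(-inf) = 0 removes non-neighbours.
    W = np.exp(-beta * d)
    return mu + alpha * (W @ h)


# Hypothetical toy data: three cells on a line, cells 0 and 2 not neighbours.
d = np.array([[0.0, 1.0, np.inf],
              [1.0, 0.0, 2.0],
              [np.inf, 2.0, 0.0]])
y_hist = np.array([[1.0, 0.0],
                   [0.0, 2.0],
                   [3.0, 1.0]])
mu = np.full(3, 0.1)
g = np.array([0.5, 0.25])

lam = one_step_intensity(y_hist, mu, alpha=0.5, beta=np.log(2.0), d=d, g=g)
print(lam)
```

To roll forward more than one step in this sketch, one would append a sampled or expected count column to `y_hist` and call the function again; how `motac` handles multi-step horizons internally is not specified here.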