Substrate

The substrate is the road-constrained spatial scaffold used by models and simulators:

  • a road graph (OSMnx GraphML)

  • a regular grid (cell centroids)

  • sparse travel-time neighbourhoods between grid cells

  • optional POI feature matrix aligned to the grid

Building

Use SubstrateBuilder(SubstrateConfig(...)).build() (or the CLI wrapper) to build and optionally cache artefacts.

SubstrateConfig supports three ways to specify the region:

  • a bounding box (north, south, east, west), set through the individual fields north, south, east and west

  • place="..." (OSM place query)

  • graphml_path="..." (offline / tests)

Cache artefacts

If cache_dir is set, the builder writes a self-contained cache directory:

  • graph.graphml

  • grid.npz

  • neighbours.npz

  • meta.json

  • poi.npz (optional)

Cache contents (v2)

graph.graphml

Road network saved via osmnx.save_graphml.

grid.npz

A compressed NumPy archive with:

  • lat: float64, shape (n_cells,) — grid cell centroid latitudes (EPSG:4326)

  • lon: float64, shape (n_cells,) — grid cell centroid longitudes (EPSG:4326)

  • cell_size_m: float64, shape (1,) — grid spacing in metres
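To illustrate the layout above, here is a minimal sketch that writes and reads a file with the same three keys using plain NumPy. The centroid values are made up for illustration, and the temporary path stands in for a real cache_dir/grid.npz. This is not the builder's own code.

```python
import tempfile
from pathlib import Path

import numpy as np

# Illustrative centroids only; real values come from the builder.
lat = np.array([51.54, 51.54, 51.55, 51.55], dtype=np.float64)
lon = np.array([-0.15, -0.14, -0.15, -0.14], dtype=np.float64)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "grid.npz"
    # Keys mirror the documented layout: lat, lon, cell_size_m (shape (1,)).
    np.savez_compressed(path, lat=lat, lon=lon, cell_size_m=np.array([250.0]))

    with np.load(path) as data:
        grid_lat = data["lat"]
        grid_lon = data["lon"]
        # cell_size_m is stored as a length-1 array, so unwrap it.
        cell_size_m = float(data["cell_size_m"][0])

n_cells = grid_lat.shape[0]
```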

neighbours.npz

A SciPy sparse matrix saved via scipy.sparse.save_npz.

  • matrix type: CSR (.tocsr() on load)

  • shape: (n_cells, n_cells)

  • entries: travel_time_s[i, j] = shortest-path travel time (seconds) from cell i to j, for all j reachable within max_travel_time_s (plus the diagonal)
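As a rough sketch of how such a file can be inspected, the snippet below round-trips a toy 3-cell matrix through scipy.sparse.save_npz and reads the neighbours of one cell from the CSR structure. The travel times are invented, and storing the diagonal as explicit zeros is an assumption of this sketch, not a statement about the builder's on-disk representation.

```python
import tempfile
from pathlib import Path

import numpy as np
from scipy import sparse

# Toy matrix: cells 0 and 1 reach each other, cell 2 only reaches itself.
# Assumption: the diagonal is stored as explicit zero entries.
data = np.array([0.0, 120.0, 95.0, 0.0, 0.0])
rows = np.array([0, 0, 1, 1, 2])
cols = np.array([0, 1, 0, 1, 2])
tt = sparse.csr_matrix((data, (rows, cols)), shape=(3, 3))

with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "neighbours.npz")
    sparse.save_npz(path, tt)
    loaded = sparse.load_npz(path).tocsr()

# The neighbours of cell i are the stored column indices of row i.
i = 0
start, end = loaded.indptr[i], loaded.indptr[i + 1]
neighbour_ids = loaded.indices[start:end]
neighbour_times = loaded.data[start:end]
neighbours_of_0 = dict(zip(neighbour_ids.tolist(), neighbour_times.tolist()))
```

Reading rows through indptr, indices and data avoids densifying the matrix, which matters once n_cells is large.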

poi.npz (optional)

Present when POIs are enabled.

  • x: float64, shape (n_cells, n_features) — POI feature matrix aligned to the grid

  • feature_names: object array of strings, length n_features
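Because feature_names is an object array, NumPy will refuse to load it unless allow_pickle=True is passed. The sketch below shows this with made-up values and a temporary file standing in for cache_dir/poi.npz. Only enable allow_pickle for files you trust.

```python
import tempfile
from pathlib import Path

import numpy as np

# Illustrative feature matrix: 2 cells, 2 features.
x = np.array([[3.0, 1.0], [0.0, 0.0]], dtype=np.float64)
feature_names = np.array(["poi_count", "amenity=cafe"], dtype=object)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "poi.npz"
    np.savez_compressed(path, x=x, feature_names=feature_names)

    # Object arrays are pickled, so loading needs allow_pickle=True.
    with np.load(path, allow_pickle=True) as data:
        x_loaded = data["x"]
        names = [str(name) for name in data["feature_names"]]

# Columns of x line up with the entries of names.
column_of = {name: j for j, name in enumerate(names)}
```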

meta.json

Human-readable cache metadata. Keys:

  • cache_format_version (int)

  • built_at_utc (UTC timestamp, YYYY-MM-DDTHH:MM:SSZ)

  • motac_version (string)

  • config (dict) — normalized subset of SubstrateConfig fields

  • graphml_path (string) — path of the GraphML used by the cache (when cached, this points at cache_dir/graph.graphml)

  • has_poi (bool)
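For orientation, a meta.json with these keys might be produced along the following lines. All values here are placeholders (in particular the version string and the contents of config), and the exact set of normalized config fields is not specified on this page.

```python
import json
from datetime import datetime, timezone

meta = {
    "cache_format_version": 2,
    # UTC timestamp in the documented YYYY-MM-DDTHH:MM:SSZ form.
    "built_at_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "motac_version": "0.0.0",  # placeholder, not a real release number
    "config": {  # illustrative subset of SubstrateConfig fields
        "place": "Camden, London, UK",
        "cell_size_m": 250.0,
        "max_travel_time_s": 900.0,
    },
    "graphml_path": "./cache/camden/graph.graphml",
    "has_poi": True,
}

text = json.dumps(meta, indent=2, sort_keys=True)
roundtrip = json.loads(text)
```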

Cache format versioning

Each cache records its format version in meta.json["cache_format_version"]. The loader accepts the supported versions (currently 1 and 2, listed in SubstrateBuilder.SUPPORTED_CACHE_FORMAT_VERSIONS) and raises a ValueError for any other version.

Loading a cached directory (with version validation)

Minimal example: load cache + validate version

SubstrateBuilder.build() validates the cache format version automatically, but if you want an explicit, human-readable guard before doing any work you can read meta.json yourself:

from __future__ import annotations

import json
from pathlib import Path

from motac.spatial import SubstrateBuilder, SubstrateConfig

cache_dir = Path("./cache/camden")
meta = json.loads((cache_dir / "meta.json").read_text())

# Fast fail if the on-disk cache is from an unsupported format.
if meta.get("cache_format_version") not in SubstrateBuilder.SUPPORTED_CACHE_FORMAT_VERSIONS:
    raise ValueError(
        "Unsupported substrate cache format version: "
        f"{meta.get('cache_format_version')} (supported {SubstrateBuilder.SUPPORTED_CACHE_FORMAT_VERSIONS})"
    )

# Loads graph.graphml, grid.npz, neighbours.npz (and optionally poi.npz).
substrate = SubstrateBuilder(SubstrateConfig(cache_dir=str(cache_dir))).build()
print(substrate.grid.lat.shape, substrate.neighbours.travel_time_s.shape)

If cache_dir already contains a cache, SubstrateBuilder.build() loads it and validates cache_format_version automatically, so the manual check can be skipped:

from motac.spatial import SubstrateBuilder, SubstrateConfig

# Point at a previously built cache directory containing:
# graph.graphml, grid.npz, neighbours.npz, meta.json (and optionally poi.npz)
cache_dir = "./cache/camden"

try:
    substrate = SubstrateBuilder(SubstrateConfig(cache_dir=cache_dir)).build()
except ValueError:
    # Raised e.g. when meta.json["cache_format_version"] is unsupported.
    # Re-raised unchanged here; log or rebuild the cache at this point if needed.
    raise

# Use the loaded substrate.
print(substrate.grid.lat.shape, substrate.neighbours.travel_time_s.shape)

POI features

If POIs are enabled, Substrate.poi is a POIFeatures object with:

  • x: shape (n_cells, n_features)

  • feature_names: list of feature names, aligned to columns of x

The following feature is always included:

  • poi_count: total number of POIs assigned to each grid cell

Tag/value breakout counts

If SubstrateConfig.poi_tags is provided, we also add optional breakout count features based on POI properties:

  • If {"amenity": True} then we add a feature named amenity counting POIs with a non-null amenity property.

  • If {"amenity": ["cafe", "restaurant"]} then we add features named amenity=cafe and amenity=restaurant.

This works both for OSM downloads (where those properties are columns in the GeoDataFrame) and for local GeoJSON inputs (as long as properties contain the relevant keys).
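The naming and counting rules above can be sketched in plain Python. The function breakout_counts and its arguments are hypothetical and exist only for this illustration. The sketch assumes each POI has already been assigned a cell id and that a missing property shows up as None. The builder's real implementation may differ.

```python
import numpy as np


def breakout_counts(cell_ids, properties, poi_tags, n_cells):
    """Count POIs per cell, with optional tag/value breakouts (illustrative)."""
    names = ["poi_count"]
    for tag, spec in poi_tags.items():
        if spec is True:
            names.append(tag)  # e.g. "amenity"
        else:
            names.extend(f"{tag}={value}" for value in spec)  # e.g. "amenity=cafe"
    col = {name: j for j, name in enumerate(names)}

    x = np.zeros((n_cells, len(names)))
    for cell, props in zip(cell_ids, properties):
        if cell < 0:
            continue  # POI falls outside the grid
        x[cell, col["poi_count"]] += 1
        for tag, spec in poi_tags.items():
            value = props.get(tag)
            if value is None:
                continue
            if spec is True:
                x[cell, col[tag]] += 1
            elif value in spec:
                x[cell, col[f"{tag}={value}"]] += 1
    return x, names


x, names = breakout_counts(
    cell_ids=[0, 0, 1],
    properties=[{"amenity": "cafe"}, {"amenity": "bank"}, {"shop": "bakery"}],
    poi_tags={"amenity": ["cafe", "restaurant"]},
    n_cells=2,
)
```

Here the bank and the bakery still contribute to poi_count, but only the cafe lands in a breakout column.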

Example config

{
  "place": "Camden, London, UK",
  "cell_size_m": 250.0,
  "max_travel_time_s": 900.0,
  "poi_tags": {"amenity": ["cafe", "restaurant"]},
  "cache_dir": "./cache/camden"
}

Travel-time-to-nearest-POI (min travel time)

If SubstrateConfig.poi_travel_time_features is True (true in a JSON config), the builder appends travel-time-based features computed from the sparse neighbourhood matrix neighbours.travel_time_s:

  • poi_min_travel_time_s: minimum travel time (seconds) from each cell to any cell that contains at least one POI.

If poi_tags defines breakout categories, we also add category-specific features using the same naming convention as the count breakouts:

  • poi_<tag>_min_travel_time_s for <tag>: True (e.g. poi_amenity_min_travel_time_s)

  • poi_<tag>=<value>_min_travel_time_s for <tag>: [values] (e.g. poi_amenity=school_min_travel_time_s)

If no POI target is reachable for a cell within the cached neighbourhood cutoff, we set the feature to max_travel_time_s.
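One way to picture the computation is a row-wise minimum over the stored entries of the CSR matrix, restricted to columns that contain a POI, with max_travel_time_s as the fallback. The helper below is a hypothetical sketch of that idea, not the library's implementation. It assumes the diagonal is stored as explicit zeros, so a cell that itself contains a POI gets 0.

```python
import numpy as np
from scipy import sparse


def min_time_to_targets(travel_time_s, has_poi, default):
    """Per-row minimum over stored entries whose column is a target cell."""
    tt = travel_time_s.tocsr()
    out = np.full(tt.shape[0], default, dtype=float)
    for i in range(tt.shape[0]):
        start, end = tt.indptr[i], tt.indptr[i + 1]
        cols = tt.indices[start:end]
        times = tt.data[start:end]
        hit = has_poi[cols]  # which stored neighbours contain a POI
        if hit.any():
            out[i] = times[hit].min()
    return out


# Toy neighbourhood: cells 0 and 1 reach each other, cell 2 is isolated.
data = np.array([0.0, 120.0, 95.0, 0.0, 0.0])
rows = np.array([0, 0, 1, 1, 2])
cols = np.array([0, 1, 0, 1, 2])
tt = sparse.csr_matrix((data, (rows, cols)), shape=(3, 3))

has_poi = np.array([False, True, False])  # only cell 1 contains a POI
out = min_time_to_targets(tt, has_poi, default=900.0)
```

In this toy case cell 0 reaches the POI in 120 s, cell 1 contains it (0 s), and cell 2 falls back to the 900 s cutoff.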


API reference

Spatial utilities (CRS transforms, grid building, lookup, travel-time helpers).

class motac.spatial.LonLatBounds(lon_min, lon_max, lat_min, lat_max)

Bases: object

lon_min: float
lon_max: float
lat_min: float
lat_max: float
__init__(lon_min, lon_max, lat_min, lat_max)

motac.spatial.build_regular_grid(bounds, cell_size_m)
Return type:

Grid

class motac.spatial.GridCellLookup(tf, x0_edge, y0_edge, nx, ny, cell_size_m)

Bases: object

Fast lon/lat -> cell_id lookup for regular grids.

Notes

  • Assumes the grid was generated by motac.spatial.grid_builder.build_regular_grid().

  • Cell ids follow the ravel order used by np.meshgrid(xs, ys) and ravel(): x varies fastest (row-major), so cell_id = iy * nx + ix.

  • Returns -1 for points outside the grid extent.
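The ravel order and the -1 convention can be checked with a small NumPy sketch. It works directly in projected x/y metres and skips the lon/lat transform, so xy_to_cell_id is a hypothetical stand-in for the lookup, not the class's actual method.

```python
import numpy as np

nx, ny, cell_size_m = 3, 2, 250.0
xs = np.arange(nx) * cell_size_m  # centroid x coordinates
ys = np.arange(ny) * cell_size_m  # centroid y coordinates

# Default meshgrid indexing gives arrays of shape (ny, nx); ravel() then
# walks x fastest, which is the cell_id = iy * nx + ix ordering.
gx, gy = np.meshgrid(xs, ys)
flat_x, flat_y = gx.ravel(), gy.ravel()


def cell_id(ix, iy):
    return iy * nx + ix


def xy_to_cell_id(x, y, x0_edge, y0_edge):
    """Map a projected point to its cell id, or -1 outside the grid."""
    ix = int(np.floor((x - x0_edge) / cell_size_m))
    iy = int(np.floor((y - y0_edge) / cell_size_m))
    if not (0 <= ix < nx and 0 <= iy < ny):
        return -1
    return cell_id(ix, iy)


# Cell edges sit half a cell before the first centroid.
x0_edge = y0_edge = -cell_size_m / 2
inside = xy_to_cell_id(510.0, 240.0, x0_edge, y0_edge)
outside = xy_to_cell_id(-200.0, 0.0, x0_edge, y0_edge)
```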

tf: LonLatToXY
x0_edge: float
y0_edge: float
nx: int
ny: int
cell_size_m: float
classmethod from_grid(grid)
Return type:

GridCellLookup

lonlat_to_cell_id(lon, lat)

Map lon/lat to cell id(s).

Parameters:
  • lon (float | ndarray) – Scalars or numpy arrays of equal shape.

  • lat (float | ndarray) – Scalars or numpy arrays of equal shape.

Returns:

-1 (or array with -1) for points outside the grid.

Return type:

int or np.ndarray

__init__(tf, x0_edge, y0_edge, nx, ny, cell_size_m)

motac.spatial.lonlat_to_cell_id(grid, *, lon, lat)

Convenience wrapper: build lookup from grid and map lon/lat -> cell_id.

Return type:

int | ndarray

motac.spatial.build_knn_travel_time_matrix(*, lat, lon, k=6, speed_kph=30.0)

Build a symmetric kNN travel-time matrix from cell centroids.

Distances are haversine-approximated in meters and converted to seconds.

Return type:

csr_matrix
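As a rough sketch of the idea, the snippet below computes haversine distances between centroids, converts them to seconds at a constant speed, keeps the k nearest cells per row and symmetrizes the result. The Earth radius constant, the brute-force neighbour search and the symmetrization by element-wise maximum are all assumptions of this sketch. The function's actual internals are not documented here.

```python
import numpy as np
from scipy import sparse

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, assumed for this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between points given in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def knn_travel_time(lat, lon, k=2, speed_kph=30.0):
    lat, lon = np.asarray(lat, float), np.asarray(lon, float)
    n = lat.shape[0]
    d = haversine_m(lat[:, None], lon[:, None], lat[None, :], lon[None, :])
    seconds = d / (speed_kph * 1000.0 / 3600.0)  # metres / (metres per second)
    np.fill_diagonal(d, np.inf)  # a cell is not its own nearest neighbour
    k = min(k, n - 1)
    nearest = np.argsort(d, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nearest.ravel()
    a = sparse.csr_matrix((seconds[rows, cols], (rows, cols)), shape=(n, n))
    # Keep an edge if either endpoint selected it.
    return a.maximum(a.T).tocsr()


# Three centroids on one meridian; with k=1 the ends only link to the middle.
m = knn_travel_time([51.50, 51.51, 51.53], [-0.15, -0.15, -0.15], k=1)
```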

motac.spatial.min_travel_time_to_mask(*, travel_time_s, mask, default)

Compute min travel time from each row to any column where mask is True.

Parameters:
  • travel_time_s (csr_matrix) – CSR matrix with shape (n_cells, n_cells), where entries are travel times in seconds.

  • mask (ndarray) – Boolean array of shape (n_cells,) indicating target cells.

  • default (float) – Value used when a row has no reachable target cell.

Returns:

Array of shape (n_cells,) with per-cell minimum travel time.

Return type:

ndarray

motac.spatial.min_travel_time_feature_matrix(*, travel_time_s, masks, default, suffix='min_travel_time_s')

Compute a feature matrix of min travel times for multiple target masks.

Parameters:
  • travel_time_s (csr_matrix) – CSR travel-time matrix.

  • masks (dict[str, ndarray]) – Dict mapping feature prefix -> boolean mask of target cells.

  • default (float) – Default value if no target is reachable.

  • suffix (str) – Suffix appended to each feature name.

Return type:

tuple[ndarray, list[str]]

Returns:

  • x – Array of shape (n_cells, n_features).

  • names – Feature names in order.

class motac.spatial.Grid(lat, lon, cell_size_m)

Bases: object

Regular grid substrate (WGS84 centroids).

lat: ndarray
lon: ndarray
cell_size_m: float
__init__(lat, lon, cell_size_m)

class motac.spatial.POIFeatures(x, feature_names)

Bases: object

POI features per grid cell.

x: ndarray
feature_names: list[str]
__init__(x, feature_names)

class motac.spatial.NeighbourSets(travel_time_s)

Bases: object

Sparse travel-time neighbourhoods between grid cells.

travel_time_s: csr_matrix
__init__(travel_time_s)

class motac.spatial.Substrate(grid, neighbours, poi, graphml_path=None)

Bases: object

Spatial substrate used by model fitting and forecasting.

grid: Grid
neighbours: NeighbourSets
poi: POIFeatures | None
graphml_path: str | None
__init__(grid, neighbours, poi, graphml_path=None)

class motac.spatial.SubstrateBuilder(config)

Bases: object

Build a road-constrained substrate and persist cache artefacts.

Cache format (v2)

When SubstrateConfig.cache_dir is set, build() writes a self-contained artefact bundle:

  • graph.graphml: road network (OSMnx GraphML)

  • grid.npz: grid centroid lat/lon + cell_size_m

  • neighbours.npz: CSR travel-time neighbourhood matrix

  • meta.json: format version + config + basic provenance

  • optionally poi.npz: POI feature matrix

Compatibility

v2 caches are written by this builder. v1 caches are still loadable.

CACHE_FORMAT_VERSION = 2
SUPPORTED_CACHE_FORMAT_VERSIONS = (1, 2)
__init__(config)
build()

class motac.spatial.SubstrateConfig(north=None, south=None, east=None, west=None, place=None, graphml_path=None, cell_size_m=250.0, max_travel_time_s=900.0, disable_pois=False, poi_tags=None, poi_geojson_path=None, poi_travel_time_features=False, cache_dir=None)

Bases: object

Configuration for substrate building.

Provide one of:
  • bbox (north, south, east, west)

  • place (OSM place query string)

  • graphml_path (local graph for offline/testing)

POIs are optional. Supported options include:
  • disable_pois=True

  • poi_geojson_path (local) OR poi_tags + (bbox/place/graph)

north: float | None
south: float | None
east: float | None
west: float | None
place: str | None
graphml_path: str | None
cell_size_m: float
max_travel_time_s: float
disable_pois: bool
poi_tags: dict[str, Any] | None
poi_geojson_path: str | None
poi_travel_time_features: bool
cache_dir: str | None
static from_json(path)
Return type:

SubstrateConfig

bbox()
Return type:

tuple[float, float, float, float] | None

__init__(north=None, south=None, east=None, west=None, place=None, graphml_path=None, cell_size_m=250.0, max_travel_time_s=900.0, disable_pois=False, poi_tags=None, poi_geojson_path=None, poi_travel_time_features=False, cache_dir=None)

motac.spatial.build_grid_from_lonlat_bounds(*, lon_min, lon_max, lat_min, lat_max, cell_size_m)

Build a Grid from lon/lat bounds using the spatial grid builder.

Returns motac.spatial.types.Grid.

Submodule paths

The following objects are also exposed under submodule paths, with signatures, attributes and docstrings identical to the motac.spatial entries above:

  • motac.spatial.substrate: SubstrateConfig, SubstrateBuilder, build_grid_from_lonlat_bounds

  • motac.spatial.types: Grid, POIFeatures, NeighbourSets, Substrate