7.4 Atlas-Cached Import Optimization
This page describes the refresh mode used when the ATLAS and GTFS preprocessing inputs are unchanged.
Purpose
When the HTTP validators for the ATLAS and GTFS sources are unchanged, the pipeline can skip the expensive preprocessing download step and reuse the cached ATLAS/GTFS artifacts already written to disk.
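A minimal sketch of the "validators unchanged" test, assuming the validators are the standard HTTP `ETag` and `Last-Modified` response headers and that the previous values are stored alongside the cached artifacts. The function name, the dictionary keys, and the storage shape are hypothetical and not taken from the pipeline.

```python
from typing import Mapping, Optional

# Hypothetical keys: the page only says "HTTP validators".
VALIDATOR_KEYS = ("etag", "last_modified")


def validators_unchanged(
    cached: Optional[Mapping[str, str]],
    current: Mapping[str, str],
) -> bool:
    """Return True only if at least one validator is comparable and all comparable ones match."""
    if not cached:
        # No previous record: treat the source as changed.
        return False
    compared = False
    for key in VALIDATOR_KEYS:
        old, new = cached.get(key), current.get(key)
        if old is None or new is None:
            # Validator missing on one side: it cannot confirm anything.
            continue
        if old != new:
            return False
        compared = True
    return compared
```

Under this sketch, a source with no comparable validator counts as changed, which errs on the side of a full preprocessing run.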
If the database already contains the required static ATLAS/GTFS tables, the importer rewrites only the OSM-dependent and match-dependent tables. If those static tables are missing or empty, the importer performs a bootstrap rewrite of both the static and dynamic import tables without re-downloading ATLAS/GTFS.
This reduces database churn without changing the matching result for the current OSM snapshot.
Run Types
- complete: full preprocessing and full database refresh.
- atlas_cached: reuse ATLAS/GTFS preprocessing artifacts and keep the static raw ATLAS/GTFS tables in place.
- atlas_cached_bootstrap: reuse cached ATLAS/GTFS preprocessing artifacts, but rewrite both static and dynamic import tables because the required static tables are missing or empty.
The scheduler selects atlas_cached automatically when both source snapshots are unchanged and the required static tables are already present with data.
The scheduler selects atlas_cached_bootstrap automatically when both source snapshots are unchanged but the required static tables are missing or empty.
Set PIPELINE_FORCE_FULL_REFRESH=1 to keep the cached preprocessing outputs but still force a full database rewrite.
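The automatic selection rules can be sketched as a small decision function. This is an illustration only: the function and parameter names are hypothetical, the fallback to complete when a source has changed is inferred from the run type descriptions, and the PIPELINE_FORCE_FULL_REFRESH override is not modeled because this page does not state which run type it reports.

```python
def select_run_type(sources_unchanged: bool, static_tables_ready: bool) -> str:
    """Pick a run type from the two conditions described on this page.

    sources_unchanged: both the ATLAS and GTFS snapshots are unchanged.
    static_tables_ready: every required static table exists and has rows.
    """
    if not sources_unchanged:
        # Assumption: any source change leads to a full run.
        return "complete"
    if static_tables_ready:
        return "atlas_cached"
    return "atlas_cached_bootstrap"


print(select_run_type(True, True))    # atlas_cached
print(select_run_type(True, False))   # atlas_cached_bootstrap
print(select_run_type(False, True))   # complete
```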
Tables Reused In atlas_cached
- atlas_operators
- atlas_stops
- gtfs_stops_raw
- gtfs_stop_identity_resolution
- atlas_line_families
- atlas_itineraries
- atlas_itinerary_stop_calls
These are the persisted static import tables produced from cached ATLAS/GTFS artifacts.
Tables Rewritten In atlas_cached
- itinerary_matches
- line_family_matches
- stop_calls
- itineraries
- line_families
- osm_route_relation_stops
- osm_route_relation_members
- osm_route_relation_tags
- osm_route_relations
- osm_route_master_members
- osm_route_master_tags
- osm_route_masters
- problems
- stops_matched
- osm_stop_members
- osm_nodes
- osm_stops
The normalized comparison layer is rewritten as a whole because it combines ATLAS and OSM inputs and carries the current matching output.
Tables Rewritten In atlas_cached_bootstrap
atlas_cached_bootstrap uses the same table rewrite scope as complete for the import step. In practice that means it rewrites:
- all atlas_cached dynamic tables listed above
- atlas_operators
- atlas_stops
- gtfs_stops_raw
- gtfs_stop_identity_resolution
- atlas_line_families
- atlas_itineraries
- atlas_itinerary_stop_calls
Unlike complete, it still skips the ATLAS/GTFS download and preprocessing subprocess when cached artifacts are already valid.
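The rewrite scope per run type can be summarized as a lookup. The sketch below uses the table names listed on this page; the function name and the shape of the returned dictionary are hypothetical and not the importer's actual interface.

```python
STATIC_TABLES = (
    "atlas_operators", "atlas_stops", "gtfs_stops_raw",
    "gtfs_stop_identity_resolution", "atlas_line_families",
    "atlas_itineraries", "atlas_itinerary_stop_calls",
)

DYNAMIC_TABLES = (
    "itinerary_matches", "line_family_matches", "stop_calls", "itineraries",
    "line_families", "osm_route_relation_stops", "osm_route_relation_members",
    "osm_route_relation_tags", "osm_route_relations",
    "osm_route_master_members", "osm_route_master_tags", "osm_route_masters",
    "problems", "stops_matched", "osm_stop_members", "osm_nodes", "osm_stops",
)


def refresh_scope(run_type: str) -> dict:
    """Return which tables a run type rewrites and which it reuses."""
    if run_type == "atlas_cached":
        # Static tables stay in place; only dynamic tables are rewritten.
        return {"rewritten": list(DYNAMIC_TABLES), "reused": list(STATIC_TABLES)}
    if run_type in ("complete", "atlas_cached_bootstrap"):
        # Same import scope for both: everything is rewritten.
        return {"rewritten": list(DYNAMIC_TABLES) + list(STATIC_TABLES), "reused": []}
    raise ValueError(f"unknown run type: {run_type!r}")
```

The difference between complete and atlas_cached_bootstrap therefore lies outside this lookup, in whether the download and preprocessing step runs.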
Safety Checks
The importer validates that these required static tables exist and contain data before allowing an atlas_cached refresh:
- atlas_stops
- gtfs_stops_raw
- gtfs_stop_identity_resolution
- atlas_line_families
- atlas_itineraries
- atlas_itinerary_stop_calls
If any of them are missing or empty, the scheduler switches to atlas_cached_bootstrap instead of failing the run.
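A minimal sketch of this check, assuming the caller has already collected a row count per table, with None standing for a table that does not exist. The function names and the input shape are hypothetical; how the real importer queries the database is not described here.

```python
from typing import Mapping, Optional

REQUIRED_STATIC_TABLES = (
    "atlas_stops", "gtfs_stops_raw", "gtfs_stop_identity_resolution",
    "atlas_line_families", "atlas_itineraries", "atlas_itinerary_stop_calls",
)


def missing_or_empty(row_counts: Mapping[str, Optional[int]]) -> list:
    """List required tables that are absent (None or not reported) or have zero rows."""
    return [t for t in REQUIRED_STATIC_TABLES if not row_counts.get(t)]


def static_tables_ready(row_counts: Mapping[str, Optional[int]]) -> bool:
    """True when every required static table exists and contains data."""
    return not missing_or_empty(row_counts)


counts = {t: 10 for t in REQUIRED_STATIC_TABLES}
counts["atlas_itineraries"] = 0
print(missing_or_empty(counts))  # ['atlas_itineraries']
```

Returning the list of offending tables, rather than only a boolean, makes it easier to log why a run fell back to atlas_cached_bootstrap.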
Observability
Pipeline status and data/data_meta.json record:
- run_type
- refresh_scope_tables_rewritten
- refresh_scope_tables_reused
This makes it visible whether the last successful run was a full rebuild, an atlas_cached refresh, or an atlas_cached_bootstrap refresh.
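As an illustration, the recorded fields could be read back as follows. The sketch assumes the two refresh_scope_* fields hold lists of table names, which this page does not confirm; the function name and the sample payload are made up for the example.

```python
import json


def describe_last_run(meta: dict) -> str:
    """Summarize the run type and refresh scope from a parsed metadata document."""
    run_type = meta.get("run_type", "unknown")
    # Assumption: both fields are lists of table names.
    rewritten = meta.get("refresh_scope_tables_rewritten") or []
    reused = meta.get("refresh_scope_tables_reused") or []
    return f"{run_type}: {len(rewritten)} tables rewritten, {len(reused)} reused"


# Made-up sample payload, not real pipeline output.
sample = json.loads(
    '{"run_type": "atlas_cached",'
    ' "refresh_scope_tables_rewritten": ["osm_stops", "stops_matched"],'
    ' "refresh_scope_tables_reused": ["atlas_stops"]}'
)
print(describe_last_run(sample))  # atlas_cached: 2 tables rewritten, 1 reused
```

In practice the dictionary would come from loading data/data_meta.json with json.load instead of the inline sample.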