Routes

At a high level, the route pipeline has three stages:

  1. Source-specific route artifacts are produced in data/processed/.
  2. route_loader.py normalizes those artifacts into shared line_families, itineraries, and stop_calls rows during import preparation.
  3. The same preparation step builds line_family_matches and itinerary_matches, which the importer then persists.

This chapter explains the route model itself. Source extraction details remain in the preprocessing chapter, and stop-to-stop route-assisted matching remains in the matching chapter.

Route Layers

The implemented route pipeline has three conceptual layers:

Layer | Purpose | Main Outputs
Source artifacts | Preserve GTFS-derived ATLAS route structure and OSM PTv2 route structure | atlas_line_families.csv, atlas_itineraries.csv, atlas_itinerary_stop_calls.csv, osm_route_*.csv
Normalized comparison | Convert both sides into a shared route model | line_families, itineraries, stop_calls
Matching | Link equivalent families and itineraries | line_family_matches, itinerary_matches
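
To make the normalized comparison layer more concrete, the shared route model can be pictured as three small record types. This is only a sketch: the three table names come from this chapter, but every field name below (family_id, source, line_name, sequence, stop_key, and so on) and the stop_sequence helper are hypothetical and may not match the actual schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


# Hypothetical field names; only the table names line_families,
# itineraries, and stop_calls are taken from the documentation.
@dataclass(frozen=True)
class LineFamily:
    family_id: str
    source: str      # assumed values: "atlas" or "osm"
    line_name: str


@dataclass(frozen=True)
class Itinerary:
    itinerary_id: str
    family_id: str   # parent line family
    source: str


@dataclass(frozen=True)
class StopCall:
    itinerary_id: str
    sequence: int    # position of the stop along the itinerary
    stop_key: str    # resolved shared stop identity


def stop_sequence(itinerary_id: str, stop_calls: Iterable[StopCall]) -> List[str]:
    """Return the ordered stop keys of one itinerary."""
    calls = [c for c in stop_calls if c.itinerary_id == itinerary_id]
    return [c.stop_key for c in sorted(calls, key=lambda c: c.sequence)]
```

Because both the ATLAS and OSM sides are reduced to the same record shapes, later matching code can compare itineraries without knowing which source they came from.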

Main Responsibilities

The route subsystem inside matching_and_import_db/ is responsible for:

  • reconstructing ATLAS-side line families and itineraries from GTFS
  • extracting OSM route-master, route-relation, tag, member, and stop artifacts into processed CSVs
  • resolving route stops onto shared stop identities where possible, preferring matched ATLAS sloid values and falling back to UIC or raw canonical keys when necessary
  • matching ATLAS and OSM line families
  • scoring and pairing ATLAS and OSM itineraries only within already matched families
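
Two of the responsibilities above lend themselves to a short sketch: the stop identity fallback (sloid, then UIC, then raw canonical key) and the rule that itineraries are paired only inside already matched families. The function names, the key prefixes, the Jaccard overlap score, and the 0.5 threshold are all assumptions made for illustration. The chapter does not specify how itineraries are actually scored.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def resolve_stop_key(sloid: Optional[str], uic: Optional[str], raw_key: str) -> str:
    """Prefer a matched ATLAS sloid, then a UIC reference, then the raw key.

    The "sloid:" / "uic:" / "raw:" prefixes are a hypothetical convention.
    """
    if sloid:
        return f"sloid:{sloid}"
    if uic:
        return f"uic:{uic}"
    return f"raw:{raw_key}"


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; a stand-in for the real scoring logic."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# itinerary_id -> (family_id, ordered stop keys)
ItineraryIndex = Dict[str, Tuple[str, List[str]]]


def pair_itineraries(
    atlas: ItineraryIndex,
    osm: ItineraryIndex,
    family_matches: Iterable[Tuple[str, str]],
    threshold: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Pick the best-scoring OSM itinerary for each ATLAS itinerary.

    Candidates are considered only when their families are already matched.
    """
    matched = set(family_matches)
    pairs: List[Tuple[str, str, float]] = []
    for a_id, (a_fam, a_stops) in atlas.items():
        best: Optional[Tuple[str, float]] = None
        for o_id, (o_fam, o_stops) in osm.items():
            if (a_fam, o_fam) not in matched:
                continue  # families not matched: never compare itineraries
            score = jaccard(set(a_stops), set(o_stops))
            if score >= threshold and (best is None or score > best[1]):
                best = (o_id, score)
        if best is not None:
            pairs.append((a_id, best[0], best[1]))
    return pairs
```

This sketch picks the best candidate per ATLAS itinerary and does not enforce one-to-one pairing or stop order. A real implementation would likely need both.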

Persistence Boundary

Not every route artifact generated in data/processed/ is persisted as a raw database table.

  • route_loader.py still consumes the full source artifact set, including ATLAS itinerary CSVs and OSM tag/member/stop CSVs, to build normalized route rows.
  • The active import schema only keeps atlas_line_families and osm_route_relations as route-source anchor tables.
  • The web app reads normalized route data from line_families, itineraries, stop_calls, line_family_matches, and itinerary_matches.

The route subsystem also supports two different consumers:

  • stop matching runtime: AtlasState, OsmState, RouteState, and RouteMatchPredicate
  • route import/runtime: database/route_loader.py and the /routes page queries
