Routes
At a high level, the route pipeline has three stages:
- Source-specific route artifacts are produced in data/processed/.
- route_loader.py normalizes those artifacts into shared line_families, itineraries, and stop_calls rows during import preparation.
- The same preparation step builds line_family_matches and itinerary_matches, which the importer then persists.
This chapter explains the route model itself. Source extraction details remain in the preprocessing chapter, and stop-to-stop route-assisted matching remains in the matching chapter.
Route Layers
The implemented route pipeline has three conceptual layers:
| Layer | Purpose | Main Outputs |
|---|---|---|
| Source artifacts | Preserve GTFS-derived ATLAS route structure and OSM PTv2 route structure | atlas_line_families.csv, atlas_itineraries.csv, atlas_itinerary_stop_calls.csv, osm_route_*.csv |
| Normalized comparison | Convert both sides into a shared route model | line_families, itineraries, stop_calls |
| Matching | Link equivalent families and itineraries | line_family_matches, itinerary_matches |
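As a rough illustration of the normalized comparison layer, the shared route model can be pictured as three record types. This is a minimal sketch only: the table names line_families, itineraries, and stop_calls come from this chapter, but every field name below (source, family_id, itinerary_id, sequence, stop_key) and the helper stop_sequence are assumptions made for illustration, not the actual schema.

```python
from dataclasses import dataclass

# Hypothetical record shapes for the shared route model. Only the table
# names are taken from the chapter; all fields are illustrative guesses.


@dataclass(frozen=True)
class LineFamily:
    source: str        # "atlas" or "osm"
    family_id: str     # assumed per-source identifier
    name: str


@dataclass(frozen=True)
class Itinerary:
    source: str
    itinerary_id: str
    family_id: str     # parent line family


@dataclass(frozen=True)
class StopCall:
    source: str
    itinerary_id: str
    sequence: int      # position of the stop within the itinerary
    stop_key: str      # shared stop identity used for comparison


def stop_sequence(calls, source, itinerary_id):
    """Return the ordered stop keys for one itinerary (hypothetical helper)."""
    own = [c for c in calls if c.source == source and c.itinerary_id == itinerary_id]
    return [c.stop_key for c in sorted(own, key=lambda c: c.sequence)]


# Rows from both sides share one shape, so they can be compared directly.
calls = [
    StopCall("atlas", "it-1", 2, "stop-b"),
    StopCall("atlas", "it-1", 1, "stop-a"),
    StopCall("osm", "rel-9", 1, "stop-a"),
]
atlas_stops = stop_sequence(calls, "atlas", "it-1")
osm_stops = stop_sequence(calls, "osm", "rel-9")
```

Keeping a source column on every row, rather than separate ATLAS and OSM tables, is one way to let the matching layer treat both sides uniformly; whether the real schema does this is not stated here.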
Main Inputs
- GTFS route-family and itinerary artifacts built by matching_and_import_db/downloader/get_atlas_gtfs.py
- OSM route-master and route-relation artifacts built by matching_and_import_db/downloader/get_osm_data.py
- base stop-matching output used by matching_and_import_db/database/route_loader.py to resolve physical stop identity across both sides
Main Responsibilities
The route subsystem inside matching_and_import_db/ is responsible for:
- reconstructing ATLAS-side line families and itineraries from GTFS
- extracting OSM route-master, route-relation, tag, member, and stop artifacts into processed CSVs
- resolving route stops onto shared stop identities where possible, preferring matched ATLAS sloid values and falling back to UIC or raw canonical keys when necessary
- matching ATLAS and OSM line families
- scoring and pairing ATLAS and OSM itineraries only within already matched families
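The stop-resolution fallback and the family-scoped itinerary pairing listed above can be sketched as follows. This is an illustration under stated assumptions, not the logic in route_loader.py: the function names, the dictionary keys, the key prefixes, and the use of Jaccard overlap as the itinerary score are all hypothetical stand-ins.

```python
def resolve_stop_key(stop):
    """Pick a shared stop identity: matched sloid, then UIC, then raw key.

    `stop` is a hypothetical dict; the keys "matched_sloid", "uic_ref",
    and "raw_key" are assumptions for this sketch.
    """
    if stop.get("matched_sloid"):
        return "sloid:" + stop["matched_sloid"]
    if stop.get("uic_ref"):
        return "uic:" + stop["uic_ref"]
    return "raw:" + stop["raw_key"]


def jaccard(a, b):
    """Set overlap of two stop-key sequences (a stand-in scoring rule)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pair_itineraries(family_matches, atlas_itins, osm_itins):
    """Score itinerary pairs only inside already matched families.

    family_matches: list of (atlas_family_id, osm_family_id)
    atlas_itins / osm_itins: {family_id: {itinerary_id: [stop_key, ...]}}
    Returns the best OSM candidate per ATLAS itinerary as
    (atlas_itinerary_id, osm_itinerary_id, score) tuples.
    """
    pairs = []
    for atlas_family, osm_family in family_matches:
        candidates = osm_itins.get(osm_family, {})
        for atlas_id, atlas_stops in atlas_itins.get(atlas_family, {}).items():
            best = max(
                ((osm_id, jaccard(atlas_stops, osm_stops))
                 for osm_id, osm_stops in candidates.items()),
                key=lambda item: item[1],
                default=None,
            )
            if best is not None and best[1] > 0:
                pairs.append((atlas_id, best[0], best[1]))
    return pairs


atlas_itins = {"A1": {"a-out": ["sloid:1", "sloid:2", "sloid:3"]}}
osm_itins = {
    "O1": {"o-out": ["sloid:1", "sloid:2", "uic:9"]},
    # Identical stops, but family O2 is not matched, so it is never scored.
    "O2": {"o-other": ["sloid:1", "sloid:2", "sloid:3"]},
}
pairs = pair_itineraries([("A1", "O1")], atlas_itins, osm_itins)
```

In this toy data the itinerary in the unmatched family O2 has identical stops to the ATLAS itinerary, yet it is never considered, which is the point of restricting itinerary scoring to already matched families.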
Persistence Boundary
Not every route artifact generated in data/processed/ is persisted as a raw database table.
- route_loader.py still consumes the full source artifact set, including ATLAS itinerary CSVs and OSM tag/member/stop CSVs, to build normalized route rows.
- The active import schema only keeps atlas_line_families and osm_route_relations as route-source anchor tables.
- The web app reads normalized route data from line_families, itineraries, stop_calls, line_family_matches, and itinerary_matches.
The route subsystem also supports two different consumers:
- stop matching runtime: AtlasState, OsmState, RouteState, and RouteMatchPredicate
- route import/runtime: database/route_loader.py and the /routes page queries