Stop Sync: ATLAS ↔ OSM

This project provides a systematic pipeline to identify and analyze discrepancies between public transport stop data from ATLAS (Swiss official data) and OpenStreetMap (OSM).

Purpose

Switzerland has two valuable sources of public transport data:

  • ATLAS: The authoritative dataset from the Swiss Federal Office of Transport, containing official traffic points, platform designations, and timetable associations.
  • OpenStreetMap (OSM): A community-maintained, freely available geographic database with rich attribute data.

By systematically comparing these datasets, we can improve data quality and detect anomalies in both of them by identifying missing stops, incorrect locations, and outdated attributes.

To accurately link Swiss official data with OpenStreetMap, the system performs four distinct types of matching. Linking ATLAS sloids to OSM stops is the main objective; the other three types are supporting layers that it depends on:

  1. ATLAS sloid ↔ OSM Stop: The main matching objective, linking official platforms to geographic stop units in OSM.
  2. ATLAS GTFS ↔ OSM GTFS (Routes): Since stops are best understood within the context of the lines that serve them, we perform route-to-route matching. This process depends on having correctly mapped stop IDs and clustered OSM nodes.
  3. GTFS stop_id ↔ ATLAS sloid: The GTFS data from opentransportdata.swiss identifies stops by stop_id rather than sloid, so a mapping layer is needed to connect timetable data to sloids.
  4. OSM Node ↔ OSM Node: In OSM, a single physical stop is often represented by multiple nodes (e.g., platforms, stop positions). We match these nodes to cluster them into logical "OSM stops."

graph TD
    Main["<b>Main Goal</b><br/>ATLAS sloid ↔ OSM stop"]
    Route["<b>Context Matching</b><br/>ATLAS GTFS ↔ OSM GTFS (Routes)"]
    Sloid["<b>Mapping</b><br/>GTFS stop_id ↔ ATLAS sloid"]
    Nodes["<b>Clustering</b><br/>OSM node ↔ OSM node"]
    Sloid --> Route
    Nodes --> Route
    Nodes --> Main
    Route --> Main
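
The dependencies in the diagram above imply an order in which the matching layers can run. The following is a minimal sketch of that ordering using Python's standard `graphlib` module. The layer names (`gtfs_stop_id_to_sloid`, `osm_node_clustering`, and so on) and the `matching_order` helper are hypothetical labels chosen for illustration; they are not taken from the project's code.

```python
from graphlib import TopologicalSorter

# Each key depends on the layers in its value set.
# The edges mirror the arrows in the diagram; the names are illustrative only.
DEPENDS_ON = {
    "gtfs_stop_id_to_sloid": set(),
    "osm_node_clustering": set(),
    "route_matching": {"gtfs_stop_id_to_sloid", "osm_node_clustering"},
    "sloid_to_osm_stop": {"osm_node_clustering", "route_matching"},
}


def matching_order(depends_on):
    """Return the layers in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


order = matching_order(DEPENDS_ON)
print(order)
```

Whatever order the two independent layers come out in, route matching runs after both of them, and the main sloid-to-stop matching runs last.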

Key Statistics

Note: Statistics are automatically synchronized with the matching pipeline and updated on each data import.

| Metric | Value |
| --- | --- |
| ATLAS Platforms | 54,998 |
| OSM Nodes | 65,049 |
| OSM Stop Units | 49,183 |
| ATLAS Match Rate | 89.5% |
| Matched Pairs (ATLAS ↔ OSM) | 62,968 |
| Unmatched ATLAS Platforms | 5,774 |
| Unmatched OSM Stop Units | 7,784 |
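
As a quick consistency check, the match rate can be reproduced from the platform counts above. This sketch assumes the rate is the share of ATLAS platforms with at least one match; the source does not state the formula, and the variable names are illustrative.

```python
# Figures copied from the Key Statistics table.
atlas_platforms = 54_998
unmatched_atlas = 5_774

# Assumption: match rate = matched ATLAS platforms / all ATLAS platforms.
matched_atlas = atlas_platforms - unmatched_atlas
match_rate = 100 * matched_atlas / atlas_platforms

print(f"{matched_atlas:,} matched platforms, {match_rate:.1f}% match rate")
# prints "49,224 matched platforms, 89.5% match rate"
```

The number of matched pairs (62,968) is higher than the number of matched platforms derived here, which suggests that one platform can take part in more than one pair. That reading is an inference from the figures, not something the table states.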

System Workflow

The system is designed to re-import base data periodically (e.g., every 2 hours) to stay synchronized with ATLAS and OSM updates. The database is wiped and fully rebuilt on each run.

flowchart LR
    subgraph Sources["Data Sources"]
        A[("ATLAS & GTFS data<br/>opentransportdata.swiss")]
        O[("OSM Transport Data<br/>overpass-api.de")]
    end
    subgraph Pipeline["Processing Pipeline"]
        direction TB
        D["1. Download & Process"]
        R["2. Route Normalization & Linking"]
        M["3. Multi-Stage Matching"]
        P["4. Problem Detection"]
        I["5. Database Import"]
        D --> R --> M --> P --> I
    end
    subgraph Output["Output"]
        DB[("Database<br/>PostgreSQL + PostGIS")]
        W["Web Application"]
        DB --> W
    end
    A --> D
    O --> D
    I --> DB
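
The five pipeline stages and the full-rebuild behavior can be sketched as a simple sequential runner. This is only an illustration of the control flow: the stage names are derived from the flowchart labels, while `make_stage`, `run_full_rebuild`, and the `completed` key are hypothetical and do not reflect the project's actual modules. Each placeholder stage merely records that it ran.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, List[str]]
Stage = Tuple[str, Callable[[State], State]]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def run(state: State) -> State:
        done = list(state.get("completed", []))
        done.append(name)
        return {**state, "completed": done}
    return name, run


# Stage order follows the flowchart (steps 1 to 5).
STAGES: List[Stage] = [make_stage(n) for n in (
    "download_and_process",
    "route_normalization_and_linking",
    "multi_stage_matching",
    "problem_detection",
    "database_import",
)]


def run_full_rebuild(stages: List[Stage]) -> State:
    """Run every stage in order, starting from an empty state each time."""
    state: State = {}  # nothing is carried over from a previous run
    for _name, stage in stages:
        state = stage(state)
    return state


result = run_full_rebuild(STAGES)
print(result["completed"])
```

Because every run starts from an empty state, two consecutive runs over the same inputs produce the same result, which is the property a wipe-and-rebuild import relies on.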

Documentation Sections

| Section | Content |
| --- | --- |
| 1. Download and Process Data | Data sources, filters, data processing |
| 2. Matching Process | Multi-stage stop-matching algorithm |
| 3. Routes | Route models, route data flow, and route-to-route linking |
| 4. Problems | Problem detection and prioritization |
| 5. Database | Schema, refresh modes, and import flow |
| 6. Web App | Front end, map data loading, reports, and route pages |
| 7. System Architecture | Docker services, scheduler execution, shared state, and security |
| 8. Tests | GitHub Actions, test setup, and local checks |

Source Code

The complete source code is available at: github.com/openTdataCH/stop_sync_osm_atlas
