7. System Architecture and Operations

This chapter describes the container layout, migration ownership, scheduler ownership, and shared runtime state used by the application.

Compose Services

docker-compose.yml defines these primary services:

flowchart LR
    subgraph DS [Data Stores]
        DB[(db: PostgreSQL + PostGIS)]
        State[(redis or file runtime dir)]
    end
    Migrator[migrator: One-shot DB Setup]
    App[app: Flask Web API]
    Scheduler[scheduler: Pipeline Worker]
    DB -->|Dependency for| Migrator
    DB -->|Dependency for| App
    DB -->|Dependency for| Scheduler
    State -. shared by .-> App
    State -. shared by .-> Scheduler
    Migrator -->|Dependency for| App
    Migrator -->|Dependency for| Scheduler

  • app: Flask web API/UI. Depends on successful migrator completion. Exposes port 5001.
  • scheduler: APScheduler worker running preprocessing, matching, payload precompute, and import orchestration in the background. Depends on successful migrator completion.
  • db: PostgreSQL + PostGIS container.
  • redis: Default shared backend for runtime state.
  • migrator: Fast, one-shot migration service (flask db upgrade). Starts, applies migrations, and exits.
  • test: Merged test image for full-suite pytest runs.

Runtime Ownership

The services have deliberately separate responsibilities.

Service                    Owns
-------------------------  ------------------------------------------------------------
migrator                   schema setup and Alembic upgrade
scheduler                  source freshness probing, preprocessing, matching, import precompute, and blocking DB refresh
app                        read-only API/UI access to imported data plus pipeline-status polling
db                         durable import state
redis or file runtime dir  shared pipeline-status, lock, and async-export state backend

The app container does not run the data pipeline. It only reads imported data and shared pipeline status.

Migration Ownership

Migrations are owned solely by the dedicated one-shot migrator container.

By isolating migrations, the stack avoids race conditions where multiple containers try to change the schema at the same time.

app and scheduler start only after the migrator container has exited successfully. This ordering is enforced in docker-compose.yml through depends_on with condition: service_completed_successfully.

entrypoint.sh performs a final application-level connection check to ensure PostgreSQL is accepting connections before the web service boots.
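The contents of entrypoint.sh are not reproduced in this chapter. As a rough illustration of what a pre-boot readiness check can look like, the sketch below retries a plain TCP connection until the database port accepts it. The function name wait_for_port, the retry counts, and the delay are hypothetical. A TCP probe is also weaker than a true application-level check, which would authenticate and run a query.

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Returns False if every attempt fails. Hypothetical helper, not the
    project's actual entrypoint logic.
    """
    for attempt in range(attempts):
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            # Sleep between attempts, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A caller could run wait_for_port("db", 5432) before starting the web process, assuming the db service listens on PostgreSQL's default port, and exit with a non-zero status if it returns False.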

Scheduler and Shared State

Pipeline execution state is independent from rate limiting.

  • rate limiting uses RATELIMIT_STORAGE_URI
  • shared runtime state uses STATE_BACKEND with either Redis or file-backed state
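The separation between the two settings can be sketched as follows. This is an illustration only: the accepted values "redis" and "file", the RuntimeSettings class, and the load_runtime_settings function are assumptions, and the project's real configuration code may differ. The default of "redis" follows the service list above, which names Redis as the default backend.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeSettings:
    state_backend: str                    # assumed values: "redis" or "file"
    ratelimit_storage_uri: Optional[str]  # passed through untouched


def load_runtime_settings(env: Mapping[str, str] = os.environ) -> RuntimeSettings:
    """Read the two settings independently; neither influences the other."""
    backend = env.get("STATE_BACKEND", "redis").strip().lower()
    if backend not in {"redis", "file"}:
        raise ValueError(f"unsupported STATE_BACKEND: {backend!r}")
    return RuntimeSettings(backend, env.get("RATELIMIT_STORAGE_URI"))
```

Keeping the two values in separate fields makes it explicit that switching the state backend to files does not change where rate-limit counters are stored.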

The shared runtime-state backend stores:

  • run status
  • current phase and human-readable message
  • next scheduled run timestamp
  • the distributed run lock
  • async export progress and completed-file metadata

This is what lets the scheduler, app, and browser agree on whether a pipeline run is active and whether the maintenance overlay should be shown.
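To make the shared-state idea concrete, here is a minimal sketch of a file-backed variant holding a run lock and a status record. Everything in it is hypothetical: the class name FileRuntimeState, the file names status.json and run.lock, the JSON field names, and the "idle" default phase are assumptions for illustration, not the project's actual layout. It also omits stale-lock recovery and async-export metadata.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


class FileRuntimeState:
    """Hypothetical file-backed store for pipeline status and a run lock."""

    def __init__(self, directory: str) -> None:
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.status_path = self.dir / "status.json"
        self.lock_path = self.dir / "run.lock"

    def acquire_run_lock(self, owner: str) -> bool:
        """Create the lock file atomically; return False if it already exists."""
        try:
            fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as handle:
            handle.write(owner)
        return True

    def release_run_lock(self) -> None:
        self.lock_path.unlink(missing_ok=True)

    def write_status(self, running: bool, phase: str, message: str,
                     next_run: Optional[str] = None) -> None:
        """Write to a temp file, then rename, so readers never see a partial file."""
        payload = {"running": running, "phase": phase,
                   "message": message, "next_run": next_run}
        fd, tmp_name = tempfile.mkstemp(dir=self.dir, suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_name, self.status_path)

    def read_status(self) -> dict:
        """Return the last written status, or an idle default if none exists."""
        try:
            return json.loads(self.status_path.read_text())
        except FileNotFoundError:
            return {"running": False, "phase": "idle", "message": "", "next_run": None}
```

In this sketch the scheduler would call acquire_run_lock and write_status as a run progresses, while the app would only call read_status when the browser polls, which mirrors the read-only role described for the app container.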

Operational Notes

