Security and Rate Limits

This document outlines the application's current API rate limits, the security middleware active in the Flask app, and the main request-path risks that still need hardening.

1. Current Rate Limits

The backend API is shielded by Flask-Limiter to prevent abuse and ensure fair usage across our infrastructure. Limits are enforced based on the client's IP address (get_remote_address).

Global Defaults

  • All unannotated endpoints: 500 per minute (defined in backend/extensions.py). "Unannotated endpoints" are API routes that carry no route-specific rate-limit decorator; lacking an explicit rule, they fall back to this global default.

Storage Backend

Flask-Limiter is configured with RATELIMIT_STORAGE_URI, defaulting to memory:// when no shared backend is provided. In local development this is fine. In a multi-process deployment, a shared backend such as Redis is preferable so all workers see the same counters.
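A minimal sketch of how this setup might look in backend/extensions.py. The variable names and the exact environment lookup here are assumptions for illustration, not verbatim from the codebase:

```python
import os

from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

# Keyed by client IP; counters live in process-local memory:// storage
# unless RATELIMIT_STORAGE_URI points at a shared store such as Redis.
limiter = Limiter(
    key_func=get_remote_address,
    default_limits=["500 per minute"],  # the global default for unannotated routes
    storage_uri=os.environ.get("RATELIMIT_STORAGE_URI", "memory://"),
)
```

The limiter is then attached to the app with `limiter.init_app(app)` during application setup.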

Specific Endpoints

Certain expensive or easily abusable endpoints have stricter throttles:

Endpoint            Limit         Location
/api/search         60 / minute   backend/blueprints/search.py
/api/top_matches    60 / minute   backend/blueprints/search.py
/api/stop_by_id     60 / minute   backend/blueprints/search.py
/api/random_stop    30 / minute   backend/blueprints/search.py
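A per-route throttle is applied with a decorator on the view function. This is a hedged sketch of the pattern; the blueprint name, import path, and handler body are assumptions:

```python
from flask import Blueprint, jsonify

from backend.extensions import limiter  # assumed import path for the shared limiter

search_bp = Blueprint("search", __name__)

@search_bp.route("/api/random_stop")
@limiter.limit("30 per minute")  # overrides the 500/min global default
def random_stop():
    # ... select and serialize a random stop ...
    return jsonify({})
```

Because the decorator carries the rule, swapping the storage backend later requires no changes at the endpoint level.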

2. Shared Rate-Limit Storage

The codebase is structured so a shared storage backend can be plugged in without changing endpoint decorators. The default memory:// backend remains process-local, so deployments that use multiple workers should point RATELIMIT_STORAGE_URI at Redis or another supported shared store.

Why In-Memory Storage Is Only a Development Default

With in-memory storage, each worker process maintains a separate bucket of request counts.

flowchart TD
    Client((Client IP\nLimit: 60/min))
    subgraph Gunicorn
        W1[Worker 1\nMemory Count: 25]
        W2[Worker 2\nMemory Count: 10]
        W3[Worker 3\nMemory Count: 45]
    end
    Client -->|Requests| W1
    Client -->|Requests| W2
    Client -->|Requests| W3

In the scenario above, the client has actually made 80 requests (25 + 10 + 45), violating the 60/min limit. However, because no single worker has hit 60 internally, none of the workers will block the client.
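The failure mode can be sketched as a toy model: three independent per-worker counters, none of which reaches the 60/min threshold even though the client's total traffic does. The numbers mirror the diagram above; this is an illustration of the blind spot, not the library's internals:

```python
LIMIT = 60  # requests per minute allowed for one client IP

# With memory:// storage, each Gunicorn worker keeps its own counter.
worker_counts = {"worker_1": 25, "worker_2": 10, "worker_3": 45}

def worker_blocks(count: int, limit: int = LIMIT) -> bool:
    """A worker only rejects once ITS OWN counter exceeds the limit."""
    return count > limit

total = sum(worker_counts.values())
print(total)                                                  # 80: over the limit
print(any(worker_blocks(c) for c in worker_counts.values()))  # False: nobody blocks
```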

Why Redis Is the Usual Production Choice

To enforce an accurate, global limit across workers, a centralized backend such as Redis is the typical choice.

  1. When a request arrives, the Gunicorn worker handling it connects to Redis.
  2. The worker increments a counter specific to the [Endpoint] + [Client IP] key.
  3. If the returned counter exceeds the allowed limit, the worker immediately rejects the request with an HTTP 429 Too Many Requests.
sequenceDiagram
    participant Client
    participant Worker as Gunicorn Worker
    participant Redis
    Client->>Worker: HTTP Request (e.g., /api/search)
    Note over Worker,Redis: Worker checks central store
    Worker->>Redis: INCR count for "search_api + IP"
    Redis-->>Worker: Returns new count (e.g., 61)
    alt Count Exceeds Limit (e.g., > 60)
        Worker-->>Client: HTTP 429 Too Many Requests
    else Count Within Limit
        Worker->>Worker: Process normal request logic
        Worker-->>Client: HTTP 200 OK (Response Data)
    end
flowchart TD
    Client((Client IP))
    subgraph Gunicorn
        W1[Worker 1]
        W2[Worker 2]
        W3[Worker 3]
    end
    Redis[(Redis\nGlobal Counts)]
    Client --> W1
    Client --> W2
    Client --> W3
    W1 <-->|Read / Increment| Redis
    W2 <-->|Read / Increment| Redis
    W3 <-->|Read / Increment| Redis

Because Redis handles atomic increments and expirations efficiently, it provides a unified source of truth without requiring changes to the endpoint code.
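The steps above amount to a fixed-window counter. Below is a minimal, self-contained sketch of that algorithm; a plain dict stands in for Redis, so unlike the real thing it is neither shared across processes nor atomic, and all names are illustrative:

```python
import time
from typing import Optional

LIMIT = 60    # requests allowed per window
WINDOW = 60   # window length in seconds

_counters: dict = {}  # key -> (count, window_start); a stand-in for Redis

def allow_request(endpoint: str, client_ip: str, now: Optional[float] = None) -> bool:
    """Fixed-window check keyed by endpoint + client IP.

    Mirrors INCR + EXPIRE against Redis: reset the counter when the
    window rolls over, otherwise increment and compare to the limit.
    """
    now = time.time() if now is None else now
    key = f"{endpoint}:{client_ip}"
    count, started = _counters.get(key, (0, now))
    if now - started >= WINDOW:   # window expired -> start a fresh counter
        count, started = 0, now
    count += 1                    # the INCR step
    _counters[key] = (count, started)
    return count <= LIMIT         # False means the caller should send 429

# 60 requests in one window pass; the 61st is rejected.
results = [allow_request("/api/search", "203.0.113.7", now=0.0) for _ in range(61)]
```

With real Redis, `INCR` plus a TTL on the key gives the same behavior atomically, which is why no per-endpoint code needs to change when the backend is swapped in.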

3. Other Active Middleware

The Flask app also initializes flask_talisman.Talisman in backend/app.py. The current configuration disables CSP enforcement (content_security_policy=None) and only forces HTTPS when the relevant environment flag is enabled, so this is useful hardening but not yet a full browser-security policy.
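The initialization described above would look roughly like the following configuration sketch. The FORCE_HTTPS flag name is an assumption; the real flag lives in backend/app.py:

```python
import os

from flask import Flask
from flask_talisman import Talisman

app = Flask(__name__)

# CSP enforcement is disabled and HTTPS is only forced when the deployment
# flag (name assumed here) is set, so Talisman adds security headers such
# as HSTS but does not yet enforce a content policy in the browser.
Talisman(
    app,
    content_security_policy=None,
    force_https=os.environ.get("FORCE_HTTPS", "").lower() == "true",
)
```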


4. Security Risks and Attack Vectors

The search endpoint in backend/blueprints/search.py applies explicit input and query guardrails before hitting expensive text scans:

# snippet from backend/blueprints/search.py
query_str = _normalize_search_query(request.args.get('q', ''))
if len(query_str) < 3:
    return jsonify({"osm": [], "atlas": []})  # ignore short probes
if len(query_str) > 50:
    return jsonify({"error": "..."}), 400     # reject oversized payloads

# escape % and _ so user input cannot act as wildcard operators
escaped_query = _escape_like_literal(query_str)
search_pattern = f"%{escaped_query}%"

# PostgreSQL-only guardrail: abort search statements after 1.5 s
db.session.execute(text("SET LOCAL statement_timeout = 1500"))

# each branch is capped; .limit() returns a new query, so reassign
matched_query = matched_query.limit(200)
unmatched_query = unmatched_query.limit(200)

What is now enforced:

  1. Query length bounds: short probes (< 3 characters) are ignored and long payloads (> 50 characters) are rejected with HTTP 400.
  2. Wildcard escaping: user input is escaped before ILIKE, so % and _ are treated as literals instead of attacker-controlled wildcard operators.
  3. Query timeout circuit breaker: PostgreSQL search statements use a local statement_timeout to prevent long-running scans from monopolizing workers.
  4. Result-size caps: both matched and unmatched search branches are hard-limited (200 each), reducing memory and serialization pressure.