Filter and Search Logic
This document defines the filtering and search model used on the index map page.
Performance deep-dive for possible global stats precomputation:
Core Rule
The filter system uses one boolean grammar only:
- within a filter group, selected values combine with OR
- across filter groups, active groups combine with AND
Formally:
FinalResult = ScopePredicate AND SearchPredicate AND AtlasPredicate AND OsmPredicate AND DuplicatePredicate
If a predicate group has no active selection, it contributes no restriction.
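The rule above can be sketched in a few lines. This is a minimal in-memory illustration, not the backend implementation: the helper names (combine_and, any_of) and the row fields osm_node_type and atlas_operator are hypothetical, and only two of the groups are modelled.

```python
from typing import Callable, Optional

Row = dict
Predicate = Callable[[Row], bool]


def combine_and(*groups: Optional[Predicate]) -> Predicate:
    """AND-combine predicate groups; inactive groups (None) add no restriction."""
    active = [g for g in groups if g is not None]
    return lambda row: all(g(row) for g in active)


def any_of(field: str, selected: set) -> Optional[Predicate]:
    """OR within a group: the row's value must be one of the selected values."""
    if not selected:
        return None  # no active selection in this group
    return lambda row: row.get(field) in selected


# Hypothetical field names, chosen only for this sketch.
final_result = combine_and(
    any_of("stop_type", {"matched", "osm_unmatched"}),  # scope group
    any_of("osm_node_type", {"platform"}),              # transport group
    any_of("atlas_operator", set()),                    # inactive group
)
```

With these selections, a matched row whose OSM side is a platform passes, while an atlas_unmatched row fails on the scope group alone.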
1. Scope Predicate
The scope predicate chooses which row categories are eligible.
ScopePredicate = MatchedBranch OR AtlasUnmatchedBranch OR OsmUnmatchedBranch
1.1 Matched Branch
Matched entries can be enabled in two ways:
- All Matched Stops
- one or more matched method sub-filters
Semantics:
- All Matched Stops means rows with effective matched scope are included
- selecting matched sub-filters means matched rows with one of those methods are included
- if all matched method children are selected, the UI rolls up to the parent state all
Formally:
- MatchedBranch = stop_type = matched OR stop_type = effectively_matched when the parent is all
- MatchedBranch = stop_type = matched AND match_type matches selected matched methods when the branch is in subset mode
Matched method values include:
- exact
- name
- distance_matching_trio
- distance_matching_1
- long_distance_group_proximity
- distance_matching_2
- distance_matching_3a
- distance_matching_3b
- route_gtfs_tokens
- route_gtfs_direction
Prefix matching is used for distance and route method families. For example, distance_matching_1 matches concrete import values such as distance_matching_1_uic_ref, and distance_matching_3a also includes distance_matching_3a_second_pass.
Special case: distance_matching_trio also includes effectively_matched trio-middle rows, because those rows have no direct match_type but are treated as matched for map and stats semantics.
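A minimal sketch of the subset-mode matching described above, assuming rows are plain dicts. The function names are hypothetical, and requiring an underscore after the selected method (so that one method never swallows a sibling with a longer name) is an assumption of this sketch, not a documented backend rule.

```python
# Method families that use prefix matching (per the text above).
PREFIX_FAMILIES = ("distance_matching_", "route_")


def method_matches(match_type, selected):
    """Return True if a concrete match_type is covered by any selected method."""
    if match_type is None:
        return False
    for method in selected:
        if match_type == method:
            return True
        # Assumption: prefix matches must continue with "_" after the method name.
        if method.startswith(PREFIX_FAMILIES) and match_type.startswith(method + "_"):
            return True
    return False


def in_matched_subset(row, selected):
    """Subset-mode MatchedBranch, including the trio-middle special case."""
    if row["stop_type"] == "effectively_matched":
        # Trio-middle rows carry no match_type of their own.
        return "distance_matching_trio" in selected
    return row["stop_type"] == "matched" and method_matches(row.get("match_type"), selected)
```

For example, selecting distance_matching_1 covers a row imported as distance_matching_1_uic_ref, while selecting exact covers only the literal value exact.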
1.2 ATLAS Unmatched Branch
ATLAS unmatched entries also support a parent state plus sub-filters.
Semantics:
- ATLAS unmatched means all atlas-unmatched rows are included
- selecting No OSM < 50m and/or OSM < 50m means only those unmatched reasons are included
- if both unmatched reasons are selected, the UI rolls up to the parent state all
Formally:
- AtlasUnmatchedBranch = stop_type = atlas_unmatched when the parent is all
- AtlasUnmatchedBranch = stop_type = atlas_unmatched AND unmatched_reason IN selected reasons when the branch is in subset mode
The unmatched reason mapping is:
- No OSM < 50m -> match_type = no_nearby_counterpart
- OSM < 50m -> match_type != no_nearby_counterpart OR match_type IS NULL
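The reason mapping and the two branch modes can be illustrated as follows. This is a sketch over plain dict rows; the function names and the mode argument are hypothetical and only mirror the formulas above.

```python
def unmatched_reason(match_type):
    """Map a row's match_type to its UI unmatched reason."""
    if match_type == "no_nearby_counterpart":
        return "No OSM < 50m"
    # Any other value, including a missing match_type, counts as "OSM < 50m".
    return "OSM < 50m"


def atlas_unmatched_branch(row, mode, selected_reasons=frozenset()):
    """AtlasUnmatchedBranch for mode 'all' or 'subset'."""
    if row["stop_type"] != "atlas_unmatched":
        return False
    if mode == "all":
        return True
    return unmatched_reason(row.get("match_type")) in selected_reasons
```

Note that a row with match_type NULL falls under OSM < 50m, which matches the second mapping rule.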
1.3 OSM Unmatched Branch
This branch is explicit only.
Semantics:
- if OSM unmatched is checked, all osm_unmatched rows are included
- if it is not checked, OSM-unmatched rows are not added implicitly by any other selection
Formally:
OsmUnmatchedBranch = stop_type = osm_unmatched
2. Search Predicate
Search tokens are OR-combined within the search group.
SearchPredicate = token_1 OR token_2 OR token_3 ...
Meaning:
- if multiple search tokens are active, a row matches if it matches any of them
2.1 Accepted Formats
The search input (#smartSearchInput) accepts the following formats, parsed by parseSmartSearchInput() in filters.js:
| Format | Example | Token kind | Backend identifier_type |
|---|---|---|---|
| UIC station code (starts with 85) | 8503000 | station | station |
| ATLAS SLOID | ch:1:sloid:3000:3 | atlas | sloid |
| OSM node ID (digits only) | 123456789 | osm | osm_node_id |
| Route ID (dash-separated or route: prefixed) | 11-T-j25-1 | route | route |
| Route + direction | 11-T-j25-1 dir:0 | route | route |
Unrecognized input shows the accepted formats hint tooltip (#smartSearchHint).
OSM node IDs may also be entered as node/123456789. Route input accepts dir:0 or dir:1; the token stores the direction for subsequent filtered requests, while the initial centering lookup validates the route ID itself.
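The classification rules above can be sketched in Python for illustration. This is not parseSmartSearchInput() itself, which lives in filters.js: the function name, the returned dict shape, the order of the checks, and the 7-digit length check for UIC codes are all assumptions of this sketch.

```python
import re
from typing import Optional


def parse_search_input(raw: str) -> Optional[dict]:
    """Classify a search string into a token, or return None if unrecognized."""
    text = raw.strip()
    direction = None
    # Optional trailing direction, only dir:0 or dir:1.
    m = re.search(r"\s+dir:([01])$", text)
    if m:
        direction = m.group(1)
        text = text[: m.start()]

    if re.fullmatch(r"85\d{5}", text):  # assumption: UIC codes have 7 digits
        kind, id_type, ident = "station", "station", text
    elif text.startswith("ch:1:sloid:"):
        kind, id_type, ident = "atlas", "sloid", text
    elif re.fullmatch(r"(?:node/)?\d+", text):
        kind, id_type, ident = "osm", "osm_node_id", text.removeprefix("node/")
    elif text.startswith("route:") or "-" in text:
        kind, id_type, ident = "route", "route", text.removeprefix("route:")
    else:
        return None

    if direction is not None and kind != "route":
        return None  # a direction only makes sense on a route token
    token = {"kind": kind, "identifier_type": id_type, "identifier": ident}
    if direction is not None:
        token["direction"] = direction
    return token
```

Checking the UIC pattern before the generic digits-only pattern is what lets 8503000 resolve to a station token rather than an OSM node ID in this sketch.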
2.2 Search Flow
- User submits input via Enter key
- parseSmartSearchInput() classifies the value into a token kind or returns a validation error
- addSearchToken() adds the token to activeFilters.station and calls fetchAndCenterSpecificStop()
- fetchAndCenterSpecificStop() calls /api/stop_by_id with the identifier and type
- On success: the map centers on the result and filters update
- On failure: the token is reverted and an error is shown
2.3 Not-Found Feedback
When a correctly formatted input does not match any database entry, /api/stop_by_id returns a 404. The frontend displays an error in the #smartSearchError element, styled via .smart-search-feedback in index.css.
The error message follows the pattern: No {type} found matching: {identifier}, where {type} is the human-readable token kind (e.g. "OSM node", "UIC station", "ATLAS SLOID").
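The message pattern can be illustrated with a small formatter. The three labels quoted above come from this document; the label for route tokens and the function name are assumptions of this sketch.

```python
# Human-readable labels per token kind; the "route" label is an assumption.
KIND_LABELS = {
    "osm": "OSM node",
    "station": "UIC station",
    "atlas": "ATLAS SLOID",
    "route": "route",
}


def not_found_message(kind: str, identifier: str) -> str:
    """Build the not-found text shown after a 404 from /api/stop_by_id."""
    return f"No {KIND_LABELS[kind]} found matching: {identifier}"
```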
3. ATLAS Predicate
ATLAS-side attributes are OR-combined within the ATLAS predicate group.
AtlasPredicate = atlas_attribute_1 OR atlas_attribute_2 OR ...
Current ATLAS attribute values:
- ATLAS operator
Semantics:
- ATLAS predicates are evaluated on every row that has an ATLAS side
- matched rows can satisfy ATLAS predicates
- ATLAS-unmatched rows can satisfy ATLAS predicates
- OSM-unmatched rows naturally do not satisfy ATLAS predicates because they have no ATLAS side
4. OSM Predicate
The OSM predicate is composed of four subgroups that are AND-combined:
OsmPredicate = TransportPredicate AND EntityPredicate AND OsmOperatorPredicate AND OsmGroupPredicate
4.1 Transport Predicate
Transport types are OR-combined.
TransportPredicate = transport_type_1 OR transport_type_2 OR ...
Examples:
- ferry_terminal
- tram_stop
- station
- platform
- stop_position
- aerialway_station
4.2 Entity Predicate
OSM entity types (nodes vs ways) are OR-combined.
EntityPredicate = entity_type_1 OR entity_type_2 OR ...
Examples:
- way (OSM entries derived from ways, identified by a way_ prefix in their ID)
4.3 OSM Operator Predicate
OSM stop operators are OR-combined within their own subgroup.
OsmOperatorPredicate = osm_operator_1 OR osm_operator_2 OR ...
The predicate is implemented through OsmNode.osm_operator and is AND-combined with the other OSM-side subgroups.
4.4 OSM Group Predicate (Pairs/Trios)
OSM group types are OR-combined. In current terminology, OSM group means OSM pair or OSM trio.
OsmGroupPredicate = group_type_1 OR group_type_2 OR ...
Examples:
- osm_pair_uic
- osm_pair_uic_equal_15m
- osm_pair_name
- osm_pair_name_equal_15m
- osm_pair_tram
- osm_pair_tram_equal_15m
- osm_trio
If the OSM groups master is selected with no subtype refinement, the system treats it as:
OsmGroupPredicate = group_member(any type)
If only osm_trio is selected, pair rows are excluded.
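A sketch of the group predicate, including the master-only case. The row field osm_group_type and the function name are hypothetical; the literal value all is the one the frontend sends when the master is selected without subtypes.

```python
def osm_group_predicate(row, selected_types):
    """OsmGroupPredicate over a dict row; an empty selection means inactive."""
    if not selected_types:
        return True  # subgroup inactive: no restriction
    group_type = row.get("osm_group_type")  # hypothetical field name
    if group_type is None:
        return False  # row is not a member of any OSM pair or trio
    if "all" in selected_types:
        return True  # master selected: any group member qualifies
    return group_type in selected_types
```

With selected_types = {"osm_trio"}, a row whose group type is osm_pair_uic is rejected, which is the pair-exclusion behavior described above.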
4.5 OSM-side Semantics
OSM predicates always apply to rows that have an OSM side.
This is an intentional product rule.
Consequences:
- matched rows can satisfy OSM predicates
- OSM-unmatched rows can satisfy OSM predicates
- ATLAS-unmatched rows naturally do not satisfy OSM predicates because they have no OSM side
There is no separate applicability toggle for OSM predicates.
5. Duplicate Predicate
The currently exposed duplicate control is Duplicate ATLAS.
Semantics:
- Duplicate ATLAS is an AND-filter on ATLAS duplicate-group membership:
  - row has representative_sloid set (non-representative member), OR
  - row is a representative referenced by at least one sibling (EXISTS atlas_stops WHERE representative_sloid = this.sloid)
Formally:
DuplicatePredicate = atlas_duplicate_member = true
Implementation note:
- duplicate filtering is a data predicate
- this predicate is applied server-side across /api/data, /api/top_matches, /api/random_stop, and /api/global_stats
- whether both sides of a matched row are drawn is still a rendering decision, not a predicate
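The membership rule can be illustrated in memory. The real predicate runs server-side in SQL; this sketch only mirrors its two conditions over a list of dicts, and the function name is hypothetical.

```python
def atlas_duplicate_members(atlas_stops):
    """Return the sloids of all rows that belong to an ATLAS duplicate group."""
    # Representatives are the sloids that at least one sibling points to.
    referenced = {
        s["representative_sloid"]
        for s in atlas_stops
        if s.get("representative_sloid")
    }
    return {
        s["sloid"]
        for s in atlas_stops
        if s.get("representative_sloid") or s["sloid"] in referenced
    }


stops = [
    {"sloid": "A", "representative_sloid": None},  # representative of B
    {"sloid": "B", "representative_sloid": "A"},   # non-representative member
    {"sloid": "C", "representative_sloid": None},  # not in any duplicate group
]
members = atlas_duplicate_members(stops)
```

Here members contains A and B but not C, since C neither points to a representative nor is referenced by a sibling.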
6. Top N Distances
Top N is not part of the canonical predicate formula for /api/data.
It is a special matched-only mode used by:
- /api/top_matches
- /api/random_stop
- /api/global_stats
The UI shows the Top N control whenever matched scope exists, meaning either:
- All Matched Stops is checked
- or at least one matched sub-filter is selected
If matched scope disappears, Top N is automatically disabled.
Current endpoint behavior:
- /api/top_matches is always matched-only and sorts by descending distance_m
- /api/random_stop and /api/global_stats apply top_n by narrowing to matched rows with distance_m
- the main map skips normal /api/data viewport loading while Top N is active
- the Top N overlay request is issued by loadTopNMatches() when the matched parent scope is active; method-only matched scope is still honored by /api/random_stop and /api/global_stats
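The Top N narrowing can be sketched over in-memory rows. The function name is hypothetical, and treating only stop_type = matched rows (not effectively_matched) as candidates is an assumption of this sketch.

```python
def top_n_matches(rows, n):
    """Keep matched rows that have a distance, largest distance first."""
    candidates = [
        r for r in rows
        if r["stop_type"] == "matched" and r.get("distance_m") is not None
    ]
    return sorted(candidates, key=lambda r: r["distance_m"], reverse=True)[:n]


rows = [
    {"stop_type": "matched", "distance_m": 35.5},
    {"stop_type": "matched", "distance_m": 120.0},
    {"stop_type": "matched", "distance_m": None},
    {"stop_type": "atlas_unmatched", "distance_m": 500.0},
]
top_two = [r["distance_m"] for r in top_n_matches(rows, 2)]
```

The unmatched row is dropped despite its large distance, so top_two is [120.0, 35.5].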
7. Low-Zoom Overview Mode
When the map is below the marker threshold and there are no active user filters, the UI switches to an overview mode:
- stop_filter = atlas_unmatched
- only the ATLAS side is rendered
This is a display optimization for low zoom, not part of the canonical predicate algebra.
As soon as any user filter is active, normal predicate semantics are used again.
8. Request Serialization Rules
The frontend sends only the filters that are semantically active.
Important examples:
- All Matched Stops checked -> send stop_filter=matched
- Exact checked without All Matched Stops -> send match_method=exact only
- ATLAS unmatched checked -> send stop_filter=atlas_unmatched, omit unmatched reason refinements
- No OSM < 50m checked without the parent -> send match_method=no_nearby_counterpart
- OSM group subtypes selected -> send osm_group_types=subtype_1,subtype_2
- OSM stop operator selected -> send osm_operator=operator_1,operator_2
- OSM groups master selected with no subtype -> send osm_group_types=all
- Duplicate ATLAS checked -> send show_duplicates_only=true
This keeps requests compact. The backend treats any matched-method selection as implying matched scope, so match_method=exact works even if stop_filter=matched is omitted.
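The serialization rules can be sketched as a pure function from UI state to query parameters. The actual serialization happens in the frontend; the state keys used here are hypothetical, and joining several stop_filter or match_method values with commas is an assumption modelled on the osm_group_types examples.

```python
def serialize_filters(state: dict) -> dict:
    """Emit only the semantically active filters as query parameters."""
    stop_filter, match_method = [], []

    if state.get("all_matched"):
        stop_filter.append("matched")  # parent state: omit method refinements
    else:
        match_method += state.get("matched_methods", [])

    if state.get("atlas_unmatched"):
        stop_filter.append("atlas_unmatched")  # parent state: omit reasons
    elif state.get("no_osm_within_50m"):
        match_method.append("no_nearby_counterpart")

    if state.get("osm_unmatched"):
        stop_filter.append("osm_unmatched")

    params = {}
    if stop_filter:
        params["stop_filter"] = ",".join(stop_filter)
    if match_method:
        params["match_method"] = ",".join(match_method)

    subtypes = state.get("osm_group_subtypes", [])
    if subtypes:
        params["osm_group_types"] = ",".join(subtypes)
    elif state.get("osm_groups_master"):
        params["osm_group_types"] = "all"

    if state.get("osm_operators"):
        params["osm_operator"] = ",".join(state["osm_operators"])
    if state.get("duplicate_atlas"):
        params["show_duplicates_only"] = "true"
    return params
```

For instance, a state with only matched_methods = ["exact"] serializes to match_method=exact alone, relying on the backend to infer matched scope.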
9. Consistency Guarantees
The endpoints below share the same request parameter model for common filters:
- /api/data
- /api/global_stats
- /api/random_stop
- /api/top_matches
/api/data, /api/global_stats, and /api/random_stop use the shared scope helpers (resolve_stop_type_match_filters, build_stop_scope_condition) for stop-type and match-method semantics. /api/top_matches is a matched-only endpoint; it applies common attribute filters and optional matched-method filtering, then sorts by largest distance.
10. Worked Examples
Example 1:
(distance stage 1 OR atlas-unmatched OR osm-unmatched) AND operator=SBB AND duplicate_atlas
This means:
- keep rows in any of those three scope branches
- then require an ATLAS side with operator SBB
- then require ATLAS duplicate-group membership
Example 2:
matched OR osm-unmatched plus platform
This means:
- keep matched rows and OSM-unmatched rows in scope
- then keep only those whose OSM side is platform
Example 3:
operator=SBB and tram_stop
This means:
- require an ATLAS side satisfying SBB
- require an OSM side satisfying tram_stop
- in practice this mostly yields matched rows because both sides must exist
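Example 3 can be checked against a tiny in-memory dataset. The field names atlas_operator and osm_node_type are hypothetical; a missing side is modelled as None.

```python
rows = [
    {"stop_type": "matched", "atlas_operator": "SBB", "osm_node_type": "tram_stop"},
    {"stop_type": "atlas_unmatched", "atlas_operator": "SBB", "osm_node_type": None},
    {"stop_type": "osm_unmatched", "atlas_operator": None, "osm_node_type": "tram_stop"},
]


def example_3(row):
    """operator=SBB AND tram_stop: both sides must exist and satisfy their filter."""
    return row["atlas_operator"] == "SBB" and row["osm_node_type"] == "tram_stop"


surviving = [r["stop_type"] for r in rows if example_3(r)]
```

Only the matched row survives: the ATLAS-unmatched row has no OSM side to satisfy tram_stop, and the OSM-unmatched row has no ATLAS side to satisfy SBB.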
11. Global Stats Endpoint Semantics
/api/global_stats delegates scoped query building and aggregation logic to backend/services/global_stats.py.
The endpoint still follows the same predicate algebra defined in this document.
11.1 Shared Scope Semantics
/api/global_stats uses the same helper path as /api/data for scope selection:
- resolve_stop_type_match_filters()
- build_stop_scope_condition()
- build_trio_middle_with_matched_side_condition()
This preserves the same trio-middle effective-match behavior across map rendering and global summary stats.
11.2 Effective Matched Semantics in Stats
For global stats aggregation, an internal effective_stop_type is computed:
- rows with stop_type = matched are treated as matched
- rows with stop_type = effectively_matched are also treated as matched
This is identical to the semantics already used for matched-scope filtering and avoids drift between counts and map behavior.
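The effect on counts can be illustrated with a small aggregation. The real aggregation is done in SQL inside backend/services/global_stats.py; this sketch only mirrors the effective_stop_type mapping, and the function name is hypothetical.

```python
from collections import Counter


def effective_stop_type(stop_type: str) -> str:
    """Fold effectively_matched into matched; leave other types unchanged."""
    return "matched" if stop_type in {"matched", "effectively_matched"} else stop_type


stop_types = ["matched", "effectively_matched", "atlas_unmatched", "matched", "osm_unmatched"]
counts = Counter(effective_stop_type(t) for t in stop_types)
```

In this sample the matched count is 3, because the single effectively_matched row is counted together with the two matched rows.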
11.3 Global vs Viewport Scope
/api/global_stats is filter-scoped, not viewport-scoped.
- it does not use min_lat, max_lat, min_lon, or max_lon
- it summarizes the full filtered dataset
- /api/data remains the viewport-scoped endpoint
11.4 Runtime Notes
The current implementation computes global stats directly from SQL for each request. There is no in-process LRU cache or cache-key canonicalization layer in backend/services/global_stats.py.
The frontend avoids stale UI updates by aborting an in-flight /api/global_stats request before starting the next one and by ignoring responses whose sequence number is no longer current.
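The sequence-number half of this guard can be sketched as follows. The class and method names are hypothetical, and the sketch is written in Python for illustration only: the actual guard lives in the frontend, and request abortion itself is not modelled here.

```python
class LatestOnly:
    """Track request sequence numbers so that stale responses can be ignored."""

    def __init__(self) -> None:
        self._current = 0

    def begin(self) -> int:
        """Register a new request and return its sequence number."""
        self._current += 1
        return self._current

    def is_current(self, seq: int) -> bool:
        """True only for the most recently started request."""
        return seq == self._current


guard = LatestOnly()
first = guard.begin()
second = guard.begin()
```

A response tagged with first would now be discarded, because guard.is_current(first) is False once second has been issued.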