Filter and Search Logic
This document defines the canonical filtering and search model used on the index map page.
Performance deep-dive for global stats query cost:
Core Rule
The filter system uses one boolean grammar only:
- within a filter group, selected values combine with OR
- across filter groups, active groups combine with AND
Formally:
FinalResult = ScopePredicate AND SearchPredicate AND AtlasPredicate AND OsmPredicate AND DuplicatePredicate
If a predicate group has no active selection, it contributes no restriction.
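The grammar above can be sketched in a few lines of Python. This is only an illustration of the rule, not the project's code: the helper names `any_of` and `all_of`, the row dictionaries, and the `node_type` field are assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

Row = dict
Predicate = Callable[[Row], bool]


def any_of(predicates: Iterable[Predicate]) -> Optional[Predicate]:
    """OR-combine the selected values of one filter group.

    Returns None when the group has no active selection, which the
    caller treats as "no restriction".
    """
    selected = list(predicates)
    if not selected:
        return None
    return lambda row: any(p(row) for p in selected)


def all_of(groups: Iterable[Optional[Predicate]]) -> Predicate:
    """AND-combine the active groups, skipping inactive (None) ones."""
    active = [g for g in groups if g is not None]
    return lambda row: all(g(row) for g in active)


# Hypothetical selection: two transport types chosen, no operator chosen.
transport = any_of([
    lambda r: r.get("node_type") == "platform",
    lambda r: r.get("node_type") == "tram_stop",
])
operator = any_of([])  # inactive group, contributes no restriction
final = all_of([transport, operator])
```

With this shape, an empty group never has to be special-cased by the caller: it simply drops out of the AND.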
1. Scope Predicate
The scope predicate chooses which row categories are eligible.
ScopePredicate = MatchedBranch OR AtlasUnmatchedBranch OR OsmUnmatchedBranch
1.1 Matched Branch
Matched entries can be enabled in two ways:
- All Matched Stops
- one or more matched method sub-filters
Semantics:
- All Matched Stops means all matched rows are included
- selecting matched sub-filters means only matched rows with one of those methods are included
- if all matched method children are selected, the UI rolls up to the parent state all
Formally:
- MatchedBranch = stop_type = matched, when the parent is all
- MatchedBranch = stop_type = matched AND match_type IN (selected matched methods), when the branch is in subset mode
Matched method values include:
- exact
- name
- distance_matching_trio
- distance_matching_1
- distance_matching_2
- distance_matching_3a
- distance_matching_3a_second_pass
- distance_matching_3b
- route_gtfs_gtfs
1.2 ATLAS Unmatched Branch
ATLAS unmatched entries also support a parent state plus sub-filters.
Semantics:
- ATLAS unmatched means all atlas-unmatched rows are included
- selecting No OSM < 50m and/or OSM < 50m means only those unmatched reasons are included
- if both unmatched reasons are selected, the UI rolls up to the parent state all
Formally:
- AtlasUnmatchedBranch = stop_type = atlas_unmatched, when the parent is all
- AtlasUnmatchedBranch = stop_type = atlas_unmatched AND unmatched_reason IN (selected reasons), when the branch is in subset mode
The unmatched reason mapping is:
- No OSM < 50m -> match_type = no_nearby_counterpart
- OSM < 50m -> match_type != no_nearby_counterpart OR match_type IS NULL
1.3 OSM Unmatched Branch
This branch is explicit only.
Semantics:
- if OSM unmatched is checked, all osm_unmatched rows are included
- if it is not checked, OSM-unmatched rows are not added implicitly by any other selection
Formally:
OsmUnmatchedBranch = stop_type = osm_unmatched
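As an illustration, the three branches can be evaluated per row roughly as follows. This is a sketch under assumptions: the `ScopeSelection` container, the reason labels `no_osm_lt_50m` and `osm_lt_50m`, and the row dictionaries are hypothetical stand-ins, and the real backend builds SQL conditions rather than filtering in Python.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScopeSelection:
    """Hypothetical container for the scope checkboxes."""
    matched_all: bool = False
    matched_methods: set = field(default_factory=set)
    atlas_unmatched_all: bool = False
    atlas_unmatched_reasons: set = field(default_factory=set)
    osm_unmatched: bool = False

    def is_active(self) -> bool:
        return bool(self.matched_all or self.matched_methods
                    or self.atlas_unmatched_all or self.atlas_unmatched_reasons
                    or self.osm_unmatched)


def unmatched_reason(match_type: Optional[str]) -> str:
    # Mapping from section 1.2: only no_nearby_counterpart means "No OSM < 50m";
    # every other value, including NULL/None, means "OSM < 50m".
    return "no_osm_lt_50m" if match_type == "no_nearby_counterpart" else "osm_lt_50m"


def in_scope(row: dict, sel: ScopeSelection) -> bool:
    if not sel.is_active():
        return True  # Core Rule: an inactive group contributes no restriction
    stop_type = row.get("stop_type")
    match_type = row.get("match_type")
    if stop_type == "matched":
        return sel.matched_all or match_type in sel.matched_methods
    if stop_type == "atlas_unmatched":
        return (sel.atlas_unmatched_all
                or unmatched_reason(match_type) in sel.atlas_unmatched_reasons)
    if stop_type == "osm_unmatched":
        return sel.osm_unmatched  # explicit only, never implied by other selections
    return False
```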
2. Search Predicate
Search tokens are OR-combined within the search group.
SearchPredicate = token_1 OR token_2 OR token_3 ...
Meaning:
- if multiple search tokens are active, a row matches if it matches any of them
2.1 Accepted Formats
The search input (#smartSearchInput) accepts the following formats, parsed by parseSmartSearchInput() in filters.js:
| Format | Example | Token kind | Backend identifier_type |
|---|---|---|---|
| UIC station code (starts with 85) | 8503000 | station | station |
| ATLAS SLOID | ch:1:sloid:3000:3 | atlas | sloid |
| OSM node ID (digits only) | 123456789 | osm | osm_node_id |
| Route ID (dash-separated) | 11-T-j25-1 | route | route |
| Route + direction | 11-T-j25-1 dir:0 | route | route |
Unrecognized input shows the accepted formats hint tooltip (#smartSearchHint).
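To make the classification concrete, here is a rough Python rendering of the format table. It is not a port of parseSmartSearchInput(): the function name, the regular expressions, and the rule that the UIC check runs before the generic digits-only check are all assumptions inferred from the table.

```python
import re
from typing import Optional

# Patterns are assumptions derived from the format table, not copied from filters.js.
DIGITS_RE = re.compile(r"^[0-9]+$")
SLOID_RE = re.compile(r"^ch:1:sloid:\d+(?::\d+)*$")
ROUTE_RE = re.compile(r"^(?P<route>[^\s-]+(?:-[^\s-]+)+)(?:\s+dir:(?P<dir>\d+))?$")


def parse_smart_search_input(raw: str) -> Optional[dict]:
    """Classify one search input; return None for unrecognized input."""
    value = raw.strip()
    if DIGITS_RE.match(value):
        # Assumption: digits starting with 85 are treated as a UIC station code
        # before falling back to an OSM node ID.
        if value.startswith("85"):
            return {"kind": "station", "identifier_type": "station", "value": value}
        return {"kind": "osm", "identifier_type": "osm_node_id", "value": value}
    if SLOID_RE.match(value):
        return {"kind": "atlas", "identifier_type": "sloid", "value": value}
    route = ROUTE_RE.match(value)
    if route:
        return {
            "kind": "route",
            "identifier_type": "route",
            "value": route.group("route"),
            "direction": route.group("dir"),  # None when no dir: suffix is given
        }
    return None
```

Note that a digits-only OSM node ID that happens to start with 85 would be classified as a station under this ordering; how the real parser resolves that overlap is not stated here.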
2.2 Search Flow
- User submits input via Enter key
- parseSmartSearchInput() classifies the value into a token kind or returns a validation error
- addSearchToken() adds the token to activeFilters.station and calls fetchAndCenterSpecificStop()
- fetchAndCenterSpecificStop() calls /api/stop_by_id with the identifier and type
- On success: the map centers on the result and filters update
- On failure: the token is reverted and an error is shown
2.3 Not-Found Feedback
When a correctly formatted input does not match any database entry, /api/stop_by_id returns a 404. The frontend displays an error in the #smartSearchError element, styled via .smart-search-feedback in index.css.
The error message follows the pattern: No {type} found matching: {identifier}, where {type} is the human-readable token kind (e.g. "OSM node", "UIC station", "ATLAS SLOID").
3. ATLAS Predicate
ATLAS-side attributes are OR-combined within the ATLAS predicate group.
AtlasPredicate = atlas_attribute_1 OR atlas_attribute_2 OR ...
Current ATLAS attribute values:
- ATLAS operator
Semantics:
- ATLAS predicates are evaluated on every row that has an ATLAS side
- matched rows can satisfy ATLAS predicates
- ATLAS-unmatched rows can satisfy ATLAS predicates
- OSM-unmatched rows naturally do not satisfy ATLAS predicates because they have no ATLAS side
4. OSM Predicate
The OSM predicate is composed of three subgroups that are AND-combined:
OsmPredicate = TransportPredicate AND EntityPredicate AND OsmGroupPredicate
4.1 Transport Predicate
Transport types are OR-combined.
TransportPredicate = transport_type_1 OR transport_type_2 OR ...
Examples:
- ferry_terminal
- tram_stop
- station
- platform
- stop_position
- aerialway_station
4.2 Entity Predicate
OSM entity types (nodes vs ways) are OR-combined.
EntityPredicate = entity_type_1 OR entity_type_2 OR ...
Examples:
- way (OSM entries derived from ways, identified by a way_ prefix in their ID)
4.3 OSM Group Predicate (Pairs/Trios)
OSM group types are OR-combined. In current terminology, OSM group means OSM pair or OSM trio.
OsmGroupPredicate = group_type_1 OR group_type_2 OR ...
Examples:
- osm_pair_uic
- osm_pair_uic_equal_15m
- osm_pair_name
- osm_pair_name_equal_15m
- osm_pair_tram
- osm_pair_tram_equal_15m
- osm_trio
If the OSM groups master is selected with no subtype refinement, the system treats it as:
OsmGroupPredicate = group_member(any type)
If only osm_trio is selected, pair rows are excluded.
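The master and subtype behaviour can be summarised in a small function. This is a sketch only: it assumes each row carries at most one group type in a hypothetical `row_group_type` value, and it reuses the `all` keyword from the request parameter to represent the master selection.

```python
from typing import Iterable, Optional


def osm_group_predicate(row_group_type: Optional[str], selected: Iterable[str]) -> bool:
    """Evaluate OsmGroupPredicate for one row.

    row_group_type is a hypothetical field holding the row's group type
    (e.g. "osm_pair_uic", "osm_trio"), or None if the row is in no group.
    """
    chosen = set(selected)
    if not chosen:
        return True   # inactive group: no restriction
    if row_group_type is None:
        return False  # row is not a member of any OSM group
    if "all" in chosen:
        return True   # master selected: any group member qualifies
    return row_group_type in chosen
```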
4.4 OSM-side Semantics
OSM predicates always apply to rows that have an OSM side.
This is an intentional product rule.
Consequences:
- matched rows can satisfy OSM predicates
- OSM-unmatched rows can satisfy OSM predicates
- ATLAS-unmatched rows naturally do not satisfy OSM predicates because they have no OSM side
There is no separate applicability toggle for OSM predicates.
5. Duplicate Predicate
The currently exposed duplicate control is Duplicate ATLAS.
Semantics:
- Duplicate ATLAS is an AND-filter on ATLAS duplicate-group membership:
  - row has representative_sloid set (non-representative member), OR
  - row is a representative referenced by at least one sibling (EXISTS atlas_stops WHERE representative_sloid = this.sloid)
Formally:
DuplicatePredicate = atlas_duplicate_member = true
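The two membership conditions can be checked in memory as shown below. This is an illustrative sketch, assuming each ATLAS stop is a dictionary with `sloid` and an optional `representative_sloid`; the server-side implementation is described as a query predicate, not a Python loop.

```python
def atlas_duplicate_members(atlas_stops):
    """Return the set of SLOIDs that belong to an ATLAS duplicate group."""
    # Representatives are the SLOIDs that at least one sibling points to.
    referenced = {
        s["representative_sloid"]
        for s in atlas_stops
        if s.get("representative_sloid")
    }
    members = set()
    for stop in atlas_stops:
        is_non_representative = bool(stop.get("representative_sloid"))
        is_representative = stop["sloid"] in referenced
        if is_non_representative or is_representative:
            members.add(stop["sloid"])
    return members
```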
Implementation note:
- duplicate filtering is a data predicate
- this predicate is applied server-side across /api/data, /api/top_matches, /api/random_stop, and /api/global_stats
- whether both sides of a matched row are drawn is still a rendering decision, not a predicate
6. Top N Distances
Top N is not part of the canonical predicate formula for /api/data.
It is a special matched-only mode used by:
- /api/top_matches
- /api/random_stop
- /api/global_stats
Top N is available whenever matched scope exists, meaning either:
- All Matched Stops is checked
- or at least one matched sub-filter is selected
If matched scope disappears, Top N is automatically disabled.
7. Low-Zoom Overview Mode
When the map is below the marker threshold and there are no active user filters, the UI switches to an overview mode:
- stop_filter = atlas_unmatched
- only the ATLAS side is rendered
This is a display optimization for low zoom, not part of the canonical predicate algebra.
As soon as any user filter is active, normal predicate semantics are used again.
8. Request Serialization Rules
The frontend sends only the filters that are semantically active.
Important examples:
- All Matched Stops checked -> send stop_filter=matched, optionally alongside match_method refinements
- Exact checked without All Matched Stops -> send match_method=exact only
- ATLAS unmatched checked -> send stop_filter=atlas_unmatched, omit unmatched reason refinements
- No OSM < 50m checked without the parent -> send match_method=no_nearby_counterpart
- OSM group subtypes selected -> send osm_group_types=subtype_1,subtype_2
- OSM groups master selected with no subtype -> send osm_group_types=all
- Duplicate ATLAS checked -> send show_duplicates_only=true
This keeps requests compact, but there is one current implementation wrinkle: the backend treats any matched-method selection as implying matched scope. In practice, match_method=exact works even if stop_filter=matched is omitted.
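The listed rules can be sketched as a serializer. The UI state keys (`all_matched`, `matched_methods`, `atlas_unmatched`, `no_osm_lt_50m`, `osm_group_subtypes`, `osm_groups_master`, `duplicate_atlas`) are hypothetical names chosen for this example; only the emitted parameter names and values come from the rules above. Cases the rules do not cover, such as OSM < 50m checked without its parent, are deliberately left out.

```python
def serialize_filters(state: dict) -> dict:
    """Turn a hypothetical UI state dict into query parameters."""
    stop_filter = []
    match_method = list(state.get("matched_methods", []))
    if state.get("all_matched"):
        stop_filter.append("matched")
    if state.get("atlas_unmatched"):
        # Parent checked: send the scope only, omit reason refinements.
        stop_filter.append("atlas_unmatched")
    elif state.get("no_osm_lt_50m"):
        match_method.append("no_nearby_counterpart")

    params = {}
    if stop_filter:
        params["stop_filter"] = ",".join(stop_filter)
    if match_method:
        params["match_method"] = ",".join(match_method)
    subtypes = state.get("osm_group_subtypes", [])
    if subtypes:
        params["osm_group_types"] = ",".join(subtypes)
    elif state.get("osm_groups_master"):
        params["osm_group_types"] = "all"
    if state.get("duplicate_atlas"):
        params["show_duplicates_only"] = "true"
    return params
```

Inactive filters produce no parameter at all, which is what keeps the request compact.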
9. Consistency Guarantees
The endpoints below use the same request parameter model and the same scope helper functions (resolve_stop_type_match_filters, build_stop_scope_condition) so stop-type and match-method semantics remain aligned:
- /api/data
- /api/global_stats
- /api/random_stop
- /api/top_matches
/api/data uses its own query builder function for viewport + attribute predicates, while /api/global_stats, /api/random_stop, and /api/top_matches use QueryBuilder.apply_common_filters. The resulting filter behavior is intended to be equivalent for shared parameters.
10. Worked Examples
Example 1:
(distance stage 1 OR atlas-unmatched OR osm-unmatched) AND operator=SBB AND duplicate_atlas
This means:
- keep rows in any of those three scope branches
- then require an ATLAS side with operator SBB
- then require ATLAS duplicate-group membership
Example 2:
matched OR osm-unmatched plus platform
This means:
- keep matched rows and OSM-unmatched rows in scope
- then keep only those whose OSM side is platform
Example 3:
operator=SBB and tram_stop
This means:
- require an ATLAS side satisfying SBB
- require an OSM side satisfying tram_stop
- in practice this mostly yields matched rows because both sides must exist
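Examples 2 and 3 can be checked against a handful of made-up rows. The row fields `atlas_operator` and `node_type` and the sample data are assumptions for illustration; a missing side is represented as None.

```python
# Hypothetical rows: a missing ATLAS or OSM side is modelled as None.
rows = [
    {"id": 1, "stop_type": "matched", "atlas_operator": "SBB", "node_type": "tram_stop"},
    {"id": 2, "stop_type": "matched", "atlas_operator": "SBB", "node_type": "platform"},
    {"id": 3, "stop_type": "atlas_unmatched", "atlas_operator": "SBB", "node_type": None},
    {"id": 4, "stop_type": "osm_unmatched", "atlas_operator": None, "node_type": "platform"},
]


def example_2(row: dict) -> bool:
    # (matched OR osm-unmatched) AND transport type platform
    in_scope = row["stop_type"] in {"matched", "osm_unmatched"}
    return in_scope and row["node_type"] == "platform"


def example_3(row: dict) -> bool:
    # operator=SBB AND tram_stop, with no scope restriction
    return row["atlas_operator"] == "SBB" and row["node_type"] == "tram_stop"


ids_example_2 = [r["id"] for r in rows if example_2(r)]
ids_example_3 = [r["id"] for r in rows if example_3(r)]
```

In this sample, Example 3 keeps only a matched row: the ATLAS-unmatched row fails the OSM predicate and the OSM-unmatched row fails the ATLAS predicate.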
11. Global Stats Endpoint Semantics and Cache
/api/global_stats now delegates all cache-key construction, scoped query building, and aggregation logic to backend/services/global_stats.py.
The endpoint still follows the same predicate algebra defined in this document.
11.1 Shared Scope Semantics
/api/global_stats uses the same helper path as /api/data for scope selection:
- resolve_stop_type_match_filters()
- build_stop_scope_condition()
- build_trio_middle_with_matched_side_condition()
This preserves the same trio-middle effective-match behavior across map rendering and global summary stats.
11.2 Effective Matched Semantics in Stats
For global stats aggregation, an internal effective_stop_type is computed:
- rows with stop_type = matched are treated as matched
- rows with stop_type = effectively_matched are also treated as matched
This is identical to the semantics already used for matched-scope filtering and avoids drift between counts and map behavior.
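A minimal sketch of that folding step, assuming rows are dictionaries with a `stop_type` key. The function names are hypothetical, and the real aggregation presumably happens in SQL inside backend/services/global_stats.py.

```python
from collections import Counter


def effective_stop_type(stop_type: str) -> str:
    """Fold effectively_matched into matched; leave other types unchanged."""
    if stop_type in {"matched", "effectively_matched"}:
        return "matched"
    return stop_type


def count_by_effective_type(rows) -> Counter:
    """Count rows per effective stop type."""
    return Counter(effective_stop_type(r["stop_type"]) for r in rows)
```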
11.3 Global vs Viewport Scope
/api/global_stats is filter-scoped, not viewport-scoped.
- it does not use min_lat, max_lat, min_lon, or max_lon
- it summarizes the full filtered dataset
- /api/data remains the viewport-scoped endpoint
11.4 Cache-Key Canonicalization
Global stats cache keys are canonicalized so equivalent requests share one cache entry.
Canonicalization rules:
- comma lists are trimmed, sorted, and rejoined for:
  - stop_filter
  - match_method
  - transport_types
  - osm_entity_types
  - node_type
  - atlas_operator
  - osm_group_types
- station filters are canonicalized as sorted triples: (station_filter value, filter_type, route_direction)
- show_duplicates_only is normalized to true or false
- top_n is included directly in the key
As a result, different parameter orderings that express the same filter state map to the same cache key.
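One way to realise these rules is sketched below. The function names, the tuple layout of the key, and the choice to represent missing triple members as empty strings are assumptions; the actual key format in backend/services/global_stats.py may differ.

```python
LIST_PARAMS = (
    "stop_filter", "match_method", "transport_types", "osm_entity_types",
    "node_type", "atlas_operator", "osm_group_types",
)


def canonical_list(value) -> str:
    """Trim, sort, and rejoin a comma-separated list; empty items are dropped."""
    parts = [p.strip() for p in (value or "").split(",")]
    return ",".join(sorted(p for p in parts if p))


def global_stats_cache_key(params: dict, station_filters=()) -> tuple:
    """Build a hashable key that is stable under parameter reordering."""
    key = [(name, canonical_list(params.get(name))) for name in LIST_PARAMS]
    # Station filters: sorted (value, filter_type, route_direction) triples.
    triples = sorted(
        (str(v or ""), str(t or ""), str(d or "")) for v, t, d in station_filters
    )
    key.append(("stations", tuple(triples)))
    duplicates = str(params.get("show_duplicates_only", "")).strip().lower() == "true"
    key.append(("show_duplicates_only", "true" if duplicates else "false"))
    key.append(("top_n", params.get("top_n")))  # included as-is
    return tuple(key)
```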
11.5 Cache Shape and Operational Notes
- cache is an in-process LRU (size 5)
- cache is thread-safe within one process
- cache is not shared across multiple app processes or containers
The service also exposes clear_global_stats_cache() for explicit invalidation wiring during future write-path integration (problem resolution/import completion hooks).
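For orientation, an in-process, thread-safe LRU of this size can be built from the standard library alone. The class below is a hypothetical sketch, not the service's implementation; only the size of 5 and the existence of an explicit clear operation come from the notes above.

```python
import threading
from collections import OrderedDict


class GlobalStatsCache:
    """Small thread-safe LRU cache (per process, not shared across workers)."""

    def __init__(self, max_size: int = 5):
        self._max_size = max_size
        self._entries = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value
            self._entries.move_to_end(key)
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)  # evict least recently used

    def clear(self):
        with self._lock:
            self._entries.clear()
```

Because the cache lives in process memory, each app process or container warms its own copy, and a write-path hook has to clear every process separately.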