The Problem: 250 Items in a Grid
The original homepage showed all simulations in a responsive grid — fine at 50, manageable at 100, unnavigable at 250. Analytics confirmed the problem: users arriving via search landed on specific simulation pages, but users landing on the homepage scrolled for a few seconds then left.
We needed search and filtering. The constraints: no server, no build step, no external search service. Everything had to run in the browser from a static JSON file.
Step 1 — The Search Index JSON
A Python script walks every simulation directory, reads metadata from each index.html (title, description, category tags, keywords), and emits a compact search-index.json.
The full index for 250 simulations is 142 KB uncompressed and 18 KB gzipped, small enough that repeat visits load it almost instantly from the browser's HTTP cache.
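The exact schema is not reproduced here, so the following is only an illustration of what an entry and its loading code might look like. The field names (id, title, desc, tags, kw), the sample values, and the parseIndex helper are all hypothetical; only the four kinds of metadata come from the description above.

```javascript
// Hypothetical shape of search-index.json. Field names and values are
// illustrative assumptions, not the real schema.
const sampleIndex = [
  { id: 0, title: "SPH Fluid", desc: "Particle-based fluid simulation",
    tags: ["physics"], kw: ["sph", "fluid"] },
  { id: 1, title: "Titration Curve", desc: "Acid-base titration",
    tags: ["chemistry"], kw: ["ph", "acid"] },
];

// Parse the JSON text and check that each entry has the fields the
// search code relies on (hypothetical helper).
function parseIndex(jsonText) {
  const entries = JSON.parse(jsonText);
  for (const e of entries) {
    if (typeof e.title !== "string" || !Array.isArray(e.tags)) {
      throw new Error(`Malformed index entry: ${JSON.stringify(e)}`);
    }
  }
  return entries;
}

// In the browser the text would come from fetch("search-index.json").
const entries = parseIndex(JSON.stringify(sampleIndex));
```

Validating at load time keeps a malformed entry from surfacing later as a confusing search bug.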
Step 2 — Inverted Index for Full-Text Search
A plain array scan of 250 items on every keystroke would be fast enough (250 objects is trivial for a CPU), but we wanted prefix matching and ranked results. An inverted index maps every word token to the list of IDs of the documents that contain it.
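As a minimal sketch of that idea, the code below builds a token-to-postings map and ranks results by summed weights. The field names, the tokenizer, and the title/description weights of 3 and 1 are assumptions made for illustration, not the site's actual scoring.

```javascript
// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Build Map<token, Map<docId, weight>>. Title hits weigh more than
// description hits; the weights 3 and 1 are arbitrary assumptions.
function buildInvertedIndex(docs) {
  const index = new Map();
  const add = (token, id, weight) => {
    if (!index.has(token)) index.set(token, new Map());
    const postings = index.get(token);
    postings.set(id, (postings.get(id) ?? 0) + weight);
  };
  for (const doc of docs) {
    for (const t of tokenize(doc.title)) add(t, doc.id, 3);
    for (const t of tokenize(doc.desc)) add(t, doc.id, 1);
  }
  return index;
}

// Sum weights over the query tokens and return doc IDs, best first.
function search(index, query) {
  const scores = new Map();
  for (const token of tokenize(query)) {
    for (const [id, w] of index.get(token) ?? []) {
      scores.set(id, (scores.get(id) ?? 0) + w);
    }
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

const docs = [
  { id: 0, title: "SPH Fluid", desc: "Particle-based fluid simulation" },
  { id: 1, title: "Pendulum", desc: "A pendulum swinging in a viscous fluid" },
];
const index = buildInvertedIndex(docs);
```

With this sample data, search(index, "fluid") ranks document 0 ahead of document 1 because the token appears in its title as well as its description.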
Step 3 — Trie for Prefix Matching
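Users type "fluid" and expect "fluid dynamics", "SPH fluid", and "microfluid" to appear. An inverted index only matches exact tokens. A trie (prefix tree) handles the prefix case: each node represents one character, and every path from the root to a node marked as end-of-word spells a complete token. Locating the node for the prefix "flu" takes O(prefix_length) steps, independent of library size; collecting the tokens beneath it then costs time proportional to the number of matches. On its own, a prefix trie covers "fluid" and "fluids" but not "microfluid", where the query sits in the middle of the token; one way to cover that case is to also insert each token's suffixes.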
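A minimal sketch of such a trie follows. The class and method names are illustrative, and the word list is made up; the tokens returned by withPrefix would then be looked up in the inverted index.

```javascript
class Trie {
  constructor() {
    this.root = { children: new Map(), isWord: false };
  }

  // Add one token, creating a node per character as needed.
  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }

  // Walk down to the prefix node, then collect every token below it.
  withPrefix(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    const out = [];
    const stack = [[node, prefix]];
    while (stack.length > 0) {
      const [cur, word] = stack.pop();
      if (cur.isWord) out.push(word);
      for (const [ch, child] of cur.children) stack.push([child, word + ch]);
    }
    return out.sort();
  }
}

const trie = new Trie();
for (const w of ["fluid", "fluids", "flux", "fourier", "microfluid"]) {
  trie.insert(w);
}
```

Here trie.withPrefix("flu") yields "fluid", "fluids", and "flux", while "microfluid" is not returned because the query is not a prefix of it.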
Step 4 — Multi-Tag Category Filtering
The filter panel uses bit-flag intersection. Each category is assigned a bit position, and every simulation is represented as a bitmask of its categories. Testing a simulation against the selected tags is then a single bitwise AND.
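The sketch below illustrates the technique with a hypothetical category list and made-up simulations. Whether the real panel requires every selected tag or any of them is not stated, so both modes are shown. JavaScript bitwise operators work on 32-bit integers, so this approach fits at most 32 categories.

```javascript
// Hypothetical category list; each name gets one bit position.
const CATEGORIES = ["physics", "chemistry", "biology", "math"];
const BIT = new Map(CATEGORIES.map((name, i) => [name, 1 << i]));

// OR together the bits for a list of category names.
function toMask(tags) {
  return tags.reduce((mask, tag) => mask | (BIT.get(tag) ?? 0), 0);
}

// "all" mode: the item must carry every selected category.
// "any" mode: the item must carry at least one of them.
function matches(itemMask, filterMask, mode = "all") {
  if (filterMask === 0) return true; // no filter selected
  const hit = itemMask & filterMask;
  return mode === "all" ? hit === filterMask : hit !== 0;
}

const sims = [
  { title: "SPH Fluid", mask: toMask(["physics"]) },
  { title: "Reaction Diffusion", mask: toMask(["physics", "chemistry"]) },
  { title: "Cell Division", mask: toMask(["biology"]) },
];
const filter = toMask(["physics", "chemistry"]);
const all = sims.filter((s) => matches(s.mask, filter, "all")).map((s) => s.title);
const any = sims.filter((s) => matches(s.mask, filter, "any")).map((s) => s.title);
```

With this data, "all" mode keeps only Reaction Diffusion, while "any" mode keeps SPH Fluid and Reaction Diffusion.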
Step 5 — URL State Serialisation
The search query and active filters are serialised into the URL on every change, so users can bookmark and share filtered views: /?q=fluid&cat=physics,chemistry&diff=beginner. The state is read back on page load and the UI is restored without any page transition.
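One way to sketch this round trip is with the standard URLSearchParams API. The parameter names q, cat, and diff come from the example URL above; the function names and the state object shape are assumptions.

```javascript
// Turn UI state into a query string; empty values are omitted.
function serializeState({ q = "", cat = [], diff = "" }) {
  const params = new URLSearchParams();
  if (q) params.set("q", q);
  if (cat.length > 0) params.set("cat", cat.join(","));
  if (diff) params.set("diff", diff);
  const qs = params.toString();
  return qs ? `/?${qs}` : "/";
}

// Read state back from a query string such as location.search.
function parseState(search) {
  const params = new URLSearchParams(search);
  const cat = params.get("cat");
  return {
    q: params.get("q") ?? "",
    cat: cat ? cat.split(",") : [],
    diff: params.get("diff") ?? "",
  };
}

// Browser usage (not run here):
//   history.replaceState(null, "", serializeState(state));
//   const state = parseState(location.search);
const url = serializeState({
  q: "fluid",
  cat: ["physics", "chemistry"],
  diff: "beginner",
});
const restored = parseState(url.slice(1));
```

URLSearchParams percent-encodes the comma as %2C when serialising and decodes it again when parsing, so the round trip is lossless even though the string differs slightly from the hand-written form. Using history.replaceState rather than pushState is a design choice assumed here: it keeps per-keystroke changes from piling up in the back-button history.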
Performance Results
Lessons Learned
- Don't reach for a library first. Lunr.js and Fuse.js are great but 20–40 KB gzipped. Our custom solution is 2 KB including the trie and was easier to tune.
- Debounce the search input. The input event fires on every keystroke; with a 120 ms debounce, we skip intermediate states entirely during fast typing.
- Pre-compute bitmasks at load, not at search time. The index-build runs once on page load (~8 ms); every subsequent search is pure lookup.
- Virtual scrolling would help at 500+ items. For now, CSS content-visibility: auto gives a free 60% render cost reduction for off-screen cards.
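The debounce advice in the list above can be sketched in a few lines. This is a generic debounce, not necessarily the one used on the site; the cancel method, the runSearch callback, and the element id in the usage comment are hypothetical.

```javascript
// Return a wrapper that runs fn only after waitMs without a new call.
function debounce(fn, waitMs) {
  let timer = null;
  const wrapped = (...args) => {
    clearTimeout(timer); // discard the previously scheduled call, if any
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
  // Drop any pending call, e.g. when the search box is cleared.
  wrapped.cancel = () => {
    clearTimeout(timer);
    timer = null;
  };
  return wrapped;
}

// Browser usage (runSearch and the element id are hypothetical):
//   const input = document.getElementById("search");
//   input.addEventListener(
//     "input",
//     debounce((e) => runSearch(e.target.value), 120)
//   );
```

Only the last call in a burst runs, and it receives the arguments of that last call, so the search always sees the most recent text.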
Open architecture: The search index JSON is generated by a Python script that reads simulation metadata. Adding a new simulation automatically includes it in search results — no manual maintenance needed.