# Latency
Skryx is engineered for sub-100ms median search latency at production scale. The actual number you observe depends on:
- Index size and document complexity.
- Query shape (number of filters, facets, sort criteria).
- Whether AI Query Understanding fires and whether it hits cache.
- Network distance between your client and our EU data centres.
Typical workloads:
| Workload | Typical experience |
|---|---|
| Filtered search on a mid-size catalogue | Fast — well within the budget for an instant-search UI. |
| Autocomplete (`/suggest`) | Even faster; designed for keystroke-rate use. |
| Search with AI Query Understanding (cache hit) | Indistinguishable from a plain search. |
| Search with AI Query Understanding (cache miss) | One-time cost on the first occurrence of a brand-new query; the result is cached for repeat visitors. |

The `search_time_ms` field in every response reports the engine time, so you can measure exactly what you're getting in your own environment.
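One way to use `search_time_ms` is to compare it with the wall-clock time your client observes, which separates engine time from network and client overhead. The sketch below assumes the response has already been parsed into a dict; `timed_search`, `summarize`, and `fake_search` are hypothetical names for illustration, and `fake_search` stands in for a real HTTP call.

```python
import statistics
import time
from typing import Callable, Dict, List


def timed_search(search: Callable[[], dict]) -> Dict[str, float]:
    """Run one search and split wall-clock time into engine time and overhead."""
    start = time.perf_counter()
    response = search()
    total_ms = (time.perf_counter() - start) * 1000.0
    engine_ms = float(response["search_time_ms"])
    # Overhead covers network, TLS, and client-side parsing; clamp at zero
    # because the two clocks are not synchronised.
    overhead_ms = max(total_ms - engine_ms, 0.0)
    return {"total_ms": total_ms, "engine_ms": engine_ms, "overhead_ms": overhead_ms}


def summarize(samples: List[Dict[str, float]]) -> Dict[str, float]:
    """Report medians, since the latency target is stated as a median."""
    return {
        "median_engine_ms": statistics.median(s["engine_ms"] for s in samples),
        "median_overhead_ms": statistics.median(s["overhead_ms"] for s in samples),
    }


def fake_search() -> dict:
    # Hypothetical stand-in for a real request; only search_time_ms matters here.
    return {"search_time_ms": 7, "hits": []}


samples = [timed_search(fake_search) for _ in range(5)]
summary = summarize(samples)
print(summary)
```

In a real measurement, replace `fake_search` with a function that issues the request from the same network location your users are in, since network distance is one of the factors listed above.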
# Throughput
A single index sustains thousands of queries per second on the default plan tier. We scale reads horizontally — you don't configure anything; only the price changes.
# What costs time
In rough descending order:
- Wide `query_by` (5+ fields) without weights.
- Many facets with high cardinality.
- Sort + filter combinations that can't use the engine's bitmaps.
- AI Query Understanding cache misses (one Skryx AI call per fresh query).
If you're chasing the last few ms, see the Performance guide.
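As a rough illustration of the first two items, the sketch below checks a set of search parameters before sending them. Only `query_by` comes from this page; the `query_by_weights` and `facet_by` parameter names and the `FACET_LIMIT` threshold are assumptions made for the example, so check them against the API reference before relying on them.

```python
from typing import Dict, List

# Hypothetical threshold; the page does not say how many facets is "many".
FACET_LIMIT = 4


def _split(value: str) -> List[str]:
    """Split a comma-separated parameter into trimmed, non-empty names."""
    return [part.strip() for part in value.split(",") if part.strip()]


def cost_warnings(params: Dict[str, str]) -> List[str]:
    """Return human-readable warnings for parameter shapes that tend to cost time."""
    warnings = []
    fields = _split(params.get("query_by", ""))
    # Assumed parameter name for per-field weights.
    if len(fields) >= 5 and not params.get("query_by_weights"):
        warnings.append("wide query_by (%d fields) without weights" % len(fields))
    # Assumed parameter name for facets; cardinality cannot be seen from here,
    # so this only counts how many facets were requested.
    facets = _split(params.get("facet_by", ""))
    if len(facets) > FACET_LIMIT:
        warnings.append("%d facets requested" % len(facets))
    return warnings


wide = {"q": "shoes", "query_by": "title,brand,description,tags,category"}
print(cost_warnings(wide))
```

A check like this fits well in a test suite or a development-only wrapper, where it can flag expensive query shapes before they reach production.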
# Caching
Three cache layers save repeated work:
- AI Query cache: rewrites are persisted for an extended window; identical queries are free.
- Synonym set cache: pushed to the engine once per change, then memory-resident.
- HTTP-level caching: the API sets `Cache-Control: private, max-age=2` on identical *browse* queries; turn it off with `?nocache=1`.
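If you need to bypass the HTTP-level cache, for example while benchmarking, the `nocache=1` parameter can be appended without disturbing the rest of the query string. This is a minimal sketch using only the Python standard library; `with_nocache` is a hypothetical helper name and the URL is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_nocache(url: str) -> str:
    """Return the URL with nocache=1 set exactly once in its query string."""
    parts = urlsplit(url)
    # Drop any existing nocache value so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "nocache"
    ]
    query.append(("nocache", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_nocache("https://api.example.com/search?q=shoes"))
```

Keep the parameter out of production traffic unless you have a reason to disable caching, since cached responses avoid repeated work.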
# Indexing throughput
Bulk imports are fast — most catalogues finish initial indexing in minutes, not hours. Use the batch endpoint for best throughput; per-document loops are slower by orders of magnitude.
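The difference between batching and per-document loops can be sketched as follows. The code assumes a `send_batch` callable that posts one list of documents to the batch endpoint; that callable, the helper names, and the default batch size of 1000 are all hypothetical, since this page does not specify the endpoint's path or limits.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batches(documents: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(documents)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def import_all(
    documents: Iterable[dict],
    send_batch: Callable[[List[dict]], object],
    size: int = 1000,
) -> int:
    """Send documents in batches and return the number of requests made."""
    requests_made = 0
    for chunk in batches(documents, size):
        send_batch(chunk)  # one request per batch instead of one per document
        requests_made += 1
    return requests_made


docs = [{"id": i} for i in range(2500)]
sent: List[List[dict]] = []
request_count = import_all(docs, sent.append, size=1000)
print(request_count, [len(chunk) for chunk in sent])
```

With 2,500 documents and a batch size of 1,000, this makes three requests where a per-document loop would make 2,500.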