# Benchmarks

haystack-py includes a Docker-based benchmark suite that measures HTTP and WebSocket throughput against all three storage backends (InMemory, Redis, TimescaleDB). The suite also times the decoding of entity data in each supported wire format (JSON, Trio, Zinc).

## Test Setup

All benchmarks were run on Docker Desktop, with a single client container driving a single server container; for each backend, the HTTP tests ran first, then the WebSocket tests.

| Parameter | Value |
| --- | --- |
| Server container | 3 GB RAM limit |
| Client container | 1.5 GB RAM limit |
| HTTP concurrency | 50 connections |
| WebSocket concurrency | 20 connections |
| Duration | 15 s per test (+ 3 s warmup) |
| Entity dataset | 3,109 entities (Alpha 2,032 + Bravo 1,077) |
| Entity mix | 2,764 points, 333 equips, 2 sites, 8 VAVs, AHUs, meters |
| Server | Uvicorn (single worker), FastAPI, SCRAM-SHA-256 auth |
| Wire formats decoded | JSON (orjson), Trio, Zinc |

The server loads entity data from JSON files (Trio and Zinc files are decoded for timing purposes but not ingested a second time). Each client authenticates via SCRAM-SHA-256 and then sends a mix of Haystack operations (`about`, `ops`, `read` with filters, `nav`) in a tight loop for the benchmark duration.
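
For orientation, one client iteration looks roughly like the sketch below. This is not the actual benchmark client (that lives in `benchmarks/`); it assumes `httpx`, a server at `localhost:8080`, and an auth token already negotiated via the SCRAM-SHA-256 handshake, which is elided here. The filters are illustrative.

```python
import time

import httpx

BASE_URL = "http://localhost:8080"  # assumed address, not taken from the suite
AUTH_TOKEN = "..."                  # result of the elided SCRAM-SHA-256 exchange

# Request mix mirroring the operations listed above.
REQUESTS = [
    ("/about", None),
    ("/ops", None),
    ("/read", {"filter": "point"}),
    ("/read", {"filter": "equip"}),
    ("/nav", None),
]

with httpx.Client(
    base_url=BASE_URL,
    headers={"Authorization": f"BEARER authToken={AUTH_TOKEN}"},
) as client:
    latencies = []
    for path, params in REQUESTS:
        start = time.perf_counter()
        client.get(path, params=params).raise_for_status()
        latencies.append(time.perf_counter() - start)

    print(f"median request latency: "
          f"{sorted(latencies)[len(latencies) // 2] * 1000:.1f} ms")
```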

## HTTP API Results

| Backend | RPS | p50 | p95 | p99 |
| --- | --- | --- | --- | --- |
| InMemory | 3,380 | 15.3 ms | 20.8 ms | 37.9 ms |
| Redis | 3,578 | 13.8 ms | 21.7 ms | 40.2 ms |
| TimescaleDB | 3,713 | 12.7 ms | 23.3 ms | 34.3 ms |

All three backends achieve comparable throughput (3,300–3,700 RPS) because response caching and decode-path optimizations eliminate storage I/O as a bottleneck after warmup; the limiting factor is HTTP/ASGI overhead rather than backend speed.

Per-endpoint breakdown (InMemory, 50 connections):

| Endpoint | RPS | p50 | p99 |
| --- | --- | --- | --- |
| `/about` | 677 | 9.9 ms | 29.1 ms |
| `/ops` | 677 | 9.9 ms | 28.9 ms |
| `/read` (filter) | 1,352 | 16.8 ms | 40.0 ms |
| `/nav` | 675 | 16.3 ms | 39.2 ms |

## WebSocket API Results

| Backend | msg/s | p50 | p95 | p99 |
| --- | --- | --- | --- | --- |
| InMemory | 1,552 | 12.7 ms | 23.8 ms | 29.4 ms |
| Redis | 1,616 | 10.2 ms | 30.9 ms | 40.7 ms |
| TimescaleDB | 1,797 | 4.0 ms | 38.4 ms | 46.4 ms |

WebSocket throughput ranges from 1,550 to 1,800 msg/s across backends with 20 concurrent connections. The persistent connection eliminates per-request auth and HTTP overhead, and response payloads are larger than in the HTTP tests (a full entity grid per message), so transport I/O becomes the primary bottleneck. All three backends perform similarly because read caches serve repeated queries from memory.
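
For reference, per-message latency over one persistent connection can be measured with a loop like the following sketch. The `/ws` path and the JSON request envelope are assumptions for illustration, not haystack-py's documented WebSocket message format, and authentication is again elided.

```python
import asyncio
import json
import time

import websockets


async def run(uri: str = "ws://localhost:8080/ws") -> None:
    # Illustrative request envelope; the real message schema may differ.
    ops = [
        {"op": "about"},
        {"op": "ops"},
        {"op": "read", "filter": "point"},
        {"op": "read", "filter": "equip"},
        {"op": "nav"},
    ]
    latencies = []
    async with websockets.connect(uri) as ws:   # one persistent connection
        for _ in range(200):
            for msg in ops:
                start = time.perf_counter()
                await ws.send(json.dumps(msg))
                await ws.recv()                 # full entity grid per message
                latencies.append(time.perf_counter() - start)
    latencies.sort()
    print(f"p50 {latencies[len(latencies) // 2] * 1000:.1f} ms  "
          f"p99 {latencies[int(len(latencies) * 0.99)] * 1000:.1f} ms")


asyncio.run(run())
```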

Per-operation breakdown (TimescaleDB, 20 connections):

| Operation | msg/s | p50 | p99 |
| --- | --- | --- | --- |
| `about` | 360 | 0.8 ms | 12.1 ms |
| `ops` | 359 | 0.6 ms | 17.4 ms |
| `read` | 718 | 11.5 ms | 40.7 ms |
| `nav` | 359 | 25.2 ms | 49.3 ms |

## Wire Format Decode Performance

Decoding 3,109 entities (Alpha + Bravo datasets):

| Format | Alpha (2,032) | Bravo (1,077) |
| --- | --- | --- |
| JSON (orjson) | 34 ms | 56 ms |
| Trio | 67 ms | 93 ms |
| Zinc | 75 ms | N/A ¹ |

> **Note:** ¹ The Bravo Zinc file contains a scientific-notation edge case (`10.6E`) that the current Zinc scanner does not handle. This is a known limitation tracked for a future release.

JSON (via orjson) is the fastest decoder — roughly 2× faster than Trio and Zinc for the same dataset. All formats decode the full 2,032-entity Alpha dataset in under 100 ms.
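
The decode timing amounts to a `perf_counter` wrapper around each decoder, along the lines of the sketch below. `orjson.loads` is a real call; the dataset file names are placeholders, and `parse_trio` / `parse_zinc` stand in for whichever Trio and Zinc decode functions haystack-py exposes.

```python
import time

import orjson


def time_decode(label: str, decode, payload) -> None:
    """Time one full-dataset decode and report it in milliseconds."""
    start = time.perf_counter()
    decode(payload)
    print(f"{label}: {(time.perf_counter() - start) * 1000:.0f} ms")


with open("alpha.json", "rb") as f:             # placeholder dataset path
    time_decode("JSON (orjson)", orjson.loads, f.read())

# Trio and Zinc follow the same pattern; uncomment once the decoder entry
# points (placeholder names here) are imported from haystack-py.
# time_decode("Trio", parse_trio, open("alpha.trio", encoding="utf-8").read())
# time_decode("Zinc", parse_zinc, open("alpha.zinc", encoding="utf-8").read())
```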

## Running the Benchmarks

The benchmark suite lives in `benchmarks/` and can be run with a single script:

```bash
cd benchmarks
./run_benchmarks.sh
```

This will:

1. Build the server and client Docker images
2. Start each backend (InMemory → Redis → TimescaleDB)
3. Run the HTTP client, then the WebSocket client, for each backend
4. Collect results into `benchmarks/results/`
5. Print an aggregated summary

To run a single backend manually:

```bash
# Start server with InMemory backend
docker compose -f benchmarks/docker-compose.yml up -d --wait server-inmemory

# Run one HTTP client
docker compose -f benchmarks/docker-compose.yml run --rm http-inmemory

# Run one WebSocket client
docker compose -f benchmarks/docker-compose.yml run --rm ws-inmemory

# Clean up
docker compose -f benchmarks/docker-compose.yml down -v --remove-orphans
```

Configuration is via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `HS_PY_CONCURRENCY` | 50 / 20 | Concurrent connections (HTTP / WebSocket) |
| `HS_PY_DURATION` | 15 | Benchmark duration in seconds |
| `HS_PY_WARMUP` | 3 | Warmup duration in seconds |
| `BACKEND` | `inmemory` | Server storage backend (`inmemory`, `redis`, `timescale`) |