.. _benchmarks:

Benchmarks
==========

bac-py includes both **local** and **Docker-based** stress tests that measure
sustained throughput and latency under realistic BACnet workloads. Four
benchmark scenarios exercise the core protocol paths: BACnet/IP (UDP), BACnet
Secure Connect (WebSocket), cross-network routing, and BBMD foreign-device
forwarding.

- **Local benchmarks** (``scripts/bench_*.py``) run server and clients in a
  single process on ``127.0.0.1`` using auto-assigned ports. No Docker
  required. Default: 5s warmup + 30s sustained.
- **Docker benchmarks** (``docker/scenarios/test_*_stress.py``) run server and
  clients in separate containers on Docker bridge networks. Default: 15s
  warmup + 60s sustained.

Both enforce a **< 0.5% error rate** threshold (< 1% for router due to routing
overhead) over the sustained measurement window.

.. _stress-server-inventory:

Stress Server Object Inventory
------------------------------

All BIP, Router, and BBMD benchmarks (local and Docker) use the same 40-object
stress server across 11 object types:

.. list-table::
   :header-rows: 1
   :widths: 30 15 55

   * - Object Type
     - Count
     - Notes
   * - AnalogInput
     - 10
     - Read-only, varied present_value and engineering units
   * - AnalogOutput
     - 5
     - Commandable (always per Clause 12.7)
   * - AnalogValue
     - 5
     - Commandable, used as write targets
   * - BinaryInput
     - 5
     - Read-only
   * - BinaryOutput
     - 3
     - Commandable (always)
   * - BinaryValue
     - 3
     - Commandable
   * - MultiStateInput
     - 3
     - 4 states each
   * - MultiStateValue
     - 2
     - Commandable, 3 states
   * - Schedule
     - 1
     - Weekly schedule
   * - Calendar
     - 1
     - Date list
   * - NotificationClass
     - 1
     - Priority and ack_required configured

All workers yield to the event loop between requests (``asyncio.sleep(0)``)
and apply a 50ms backoff on errors to prevent cascade failures from UDP socket
contention.

.. _local-benchmarks:

Local Benchmarks
----------------

Local benchmarks run entirely in a single Python process on localhost.
They are fast to iterate on and require no Docker installation. Traffic stays
on the loopback interface, which avoids Docker networking overhead, but all
components share one process, so single-process concurrency limits can cap
throughput. In the reference results below, local latency is also equal to or
higher than the Docker figures.

.. _local-bip-benchmark:

BACnet/IP (Local)
^^^^^^^^^^^^^^^^^

A ``BACnetApplication`` stress server with 40 objects and ``Client`` instances
all bound to ``127.0.0.1`` on auto-assigned UDP ports.

**Worker mix (7 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list,
1 COV subscriber.

**Reference results (macOS, Apple M-series, single process):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~13,300 req/s
   * - Error rate
     - ~0.4%
   * - Overall latency (p50 / p95 / p99)
     - 0.2ms / 0.3ms / 0.3ms
   * - Duration
     - 30s sustained + 5s warmup

.. _local-sc-benchmark:

BACnet/SC (Local)
^^^^^^^^^^^^^^^^^

An in-process SC hub, two echo nodes, and a test client all connected via
``wss://127.0.0.1`` with mutual TLS 1.3 (mock CA, EC P-256 certificates
generated at startup). Echo nodes receive NPDUs and echo them back with an
``ECHO:`` prefix for round-trip latency measurement. Use ``--no-tls`` to fall
back to plaintext ``ws://`` for comparison.

**Worker mix (10 total):** 8 unicast, 2 broadcast.

**Reference results (macOS, Apple M-series, single process):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput (TLS)
     - ~9,200 msg/s
   * - Sustained throughput (plaintext)
     - ~11,700 msg/s
   * - Error rate
     - 0%
   * - Unicast latency (p50 / p95 / p99)
     - 0.8ms / 0.9ms / 1.1ms
   * - Duration
     - 30s sustained + 5s warmup

.. note::

   TLS adds ~22% overhead compared to plaintext (~9,200 vs ~11,700 msg/s).
   Use ``--no-tls`` to compare.

   Docker SC throughput (~16,350 msg/s) exceeds local because server and
   clients run as separate OS processes, enabling true CPU parallelism across
   the hub, echo nodes, and test client.

.. _local-router-benchmark:

Router (Local)
^^^^^^^^^^^^^^

A ``BACnetApplication`` router bridges network 1 and network 2, both on
``127.0.0.1`` with auto-assigned ports. A separate ``BACnetApplication``
stress server listens on network 2. Clients on network 1 discover the server
via the router using routed addresses (``NETWORK:HEXMAC`` format).

**Worker mix (6 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list.

**Reference results (macOS, Apple M-series, single process):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~7,300 req/s
   * - Error rate
     - ~0.6%
   * - Overall latency (p50 / p95 / p99)
     - 0.2ms / 0.3ms / 0.4ms
   * - Duration
     - 30s sustained + 5s warmup

.. note::

   Router throughput is lower than direct BIP because every request traverses
   two UDP hops (client -> router port 1 -> router port 2 -> server) and the
   NPDU must be decoded, re-addressed, and re-encoded at each hop.

.. _local-bbmd-benchmark:

BBMD (Local)
^^^^^^^^^^^^

A ``BACnetApplication`` with BBMD attached hosts 40 stress objects. Clients
register as foreign devices with the BBMD and perform standard workloads plus
FDT and BDT reads.

**Worker mix (8 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list,
1 FDT reader, 1 BDT reader.

**Reference results (macOS, Apple M-series, single process):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~13,240 req/s
   * - Error rate
     - ~0.4%
   * - Overall latency (p50 / p95 / p99)
     - 0.2ms / 0.2ms / 0.3ms
   * - Duration
     - 30s sustained + 5s warmup

.. _docker-benchmarks:

Docker Benchmarks
-----------------

Docker benchmarks run server and clients in separate containers on Docker
bridge networks. They exercise the full network stack including
inter-container UDP/TCP, Docker NAT, and separate Python processes. Results
may differ significantly from local benchmarks due to Docker networking
overhead.

.. _bip-stress-benchmark:

BACnet/IP (Docker)
^^^^^^^^^^^^^^^^^^

The BIP stress test exercises the full BACnet/IP stack over real UDP sockets
between Docker containers.

**Worker mix (7 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list,
1 COV subscriber. Same as the local BIP benchmark.

**Reference results (Docker, Alpine Linux, single host):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~18,500 req/s
   * - Error rate
     - ~0.3%
   * - Overall latency (p50 / p95 / p99)
     - 0.1ms / 0.2ms / 0.2ms
   * - Duration
     - 60s sustained + 15s warmup

.. note::

   Docker BIP throughput can exceed local single-process throughput because
   server and clients run as separate OS processes, allowing true parallel
   execution across CPU cores.

.. _sc-stress-benchmark:

BACnet/SC (Docker)
^^^^^^^^^^^^^^^^^^

The SC stress test exercises the BACnet Secure Connect WebSocket transport.
A test client connects to an SC hub alongside two echo nodes, then sends
varied-size NPDUs via unicast and broadcast at sustained concurrency.

**Architecture:**

- **SC Hub** -- WebSocket server routing messages between connected nodes
- **Echo Node 1 & 2** -- receive NPDUs and echo them back with an ``ECHO:``
  prefix
- **Test Client** -- connects to the hub, sends unicast/broadcast NPDUs,
  measures round-trip latency for unicast via Future-based echo correlation

**Worker mix (10 total):** 8 unicast, 2 broadcast.

**Payload size distribution (matches real BACnet traffic):**

.. list-table::
   :header-rows: 1
   :widths: 15 20 65

   * - Proportion
     - Size
     - Representative traffic
   * - 30%
     - 25 bytes
     - Simple ReadProperty responses, Who-Is
   * - 30%
     - 200 bytes
     - RPM responses, COV notifications
   * - 25%
     - 800 bytes
     - Object-list responses, segmented data
   * - 15%
     - 1,400 bytes
     - Large RPM responses, trend data

Each message is tagged with a 6-byte identifier (``worker_id`` + ``sequence``)
for echo correlation. The test verifies that echoed payloads match.
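
The payload mix and correlation tag described above can be sketched as
follows. This is a minimal illustration rather than the benchmark's actual
code: the helper names (``pick_size``, ``make_payload``, ``parse_tag``) are
hypothetical, and the split of the 6-byte tag into a 2-byte worker ID and a
4-byte sequence number is an assumption, since only the total length is given.

.. code-block:: python

   import random
   import struct

   # Size distribution from the table above: (payload size in bytes, weight).
   SIZES = [(25, 0.30), (200, 0.30), (800, 0.25), (1400, 0.15)]

   # Assumed tag layout: 2-byte worker ID + 4-byte sequence, big-endian.
   TAG = struct.Struct("!HI")


   def pick_size(rng):
       """Draw a payload size according to the weighted distribution."""
       sizes, weights = zip(*SIZES)
       return rng.choices(sizes, weights=weights, k=1)[0]


   def make_payload(worker_id, sequence, size):
       """Build a payload of exactly ``size`` bytes that starts with the tag."""
       tag = TAG.pack(worker_id, sequence)
       return tag + bytes(size - len(tag))  # zero padding after the tag


   def parse_tag(payload):
       """Recover (worker_id, sequence) from the first 6 bytes of a payload."""
       return TAG.unpack_from(payload)


   rng = random.Random(0)
   payload = make_payload(worker_id=3, sequence=42, size=pick_size(rng))
   echoed = b"ECHO:" + payload  # what an echo node would send back
   assert parse_tag(echoed[len(b"ECHO:"):]) == (3, 42)

Keeping the tag at a fixed offset lets the client match an echo to its pending
request without parsing the rest of the payload.
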

**Reference results (Docker, Alpine Linux, single host):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~16,350 msg/s
   * - Error rate
     - 0%
   * - Unicast latency (p50 / p95 / p99)
     - 0.2ms / 0.3ms / 0.4ms
   * - Duration
     - 60s sustained + 15s warmup

.. _router-stress-benchmark:

Router (Docker)
^^^^^^^^^^^^^^^

The router stress test exercises cross-network routing performance by sending
standard BACnet service traffic through a BACnet router. The test client is on
BACnet network 1 and the stress server (with 40 objects) is on BACnet
network 2, with all requests routed through the router.

**Architecture:**

- **Router** -- Bridges network 1 (172.30.1.0/24) and network 2
  (172.30.2.0/24)
- **Stress Server** -- Standard stress server (40 objects) on network 2
- **Test Client** -- On network 1, discovers server via router, runs mixed
  workloads

**Worker mix (7 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list,
plus a route health-check worker that periodically verifies the router is
advertising the remote network via Who-Is-Router-To-Network.

**Reference results (Docker, Alpine Linux, single host):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~7,900 req/s
   * - Error rate
     - ~0.7%
   * - Overall latency (p50 / p95 / p99)
     - 0.2ms / 0.3ms / 0.3ms
   * - Duration
     - 60s sustained + 15s warmup

.. note::

   The Docker router stress test uses pre-populated router caches
   (``add_route()``) and a direct server address (``SERVER_ADDRESS``) to
   bypass broadcast-based discovery, which is unreliable for ephemeral-port
   clients in Docker bridge networks. Confirmed services (Read, Write, RPM,
   WPM) work reliably via unicast through the router. The error threshold is
   1% (vs. 0.5% for direct BIP) due to the additional routing hop.

.. _bbmd-stress-benchmark:

BBMD (Docker)
^^^^^^^^^^^^^

The BBMD stress test exercises foreign-device management alongside standard
BACnet service traffic.
Test clients register as foreign devices with a BBMD and perform concurrent
reads, writes, RPM/WPM, plus BBMD-specific operations.

**Architecture:**

- **BBMD** -- Manages foreign device registrations and broadcast distribution
- **Stress Server** -- Standard stress server (40 objects) on the same network
- **Test Client** -- Registered as foreign device, runs mixed workloads +
  FDT/BDT reads

**Worker mix (8 total):** 2 readers, 1 writer, 1 RPM, 1 WPM, 1 object-list,
1 FDT reader, 1 BDT reader.

**Reference results (Docker, Alpine Linux, single host):**

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Metric
     - Value
   * - Sustained throughput
     - ~18,750 req/s
   * - Error rate
     - ~0.3%
   * - Overall latency (p50 / p95 / p99)
     - 0.1ms / 0.2ms / 0.2ms
   * - Duration
     - 60s sustained + 15s warmup

.. _mixed-transport-tests:

Mixed-Transport Integration Tests
----------------------------------

In addition to the per-transport stress benchmarks above, bac-py includes
Docker integration tests that verify **cross-transport routing** through the
``NetworkRouter``. These are functional correctness tests (not throughput
benchmarks) that exercise the full NPDU forwarding path across different
transport types.

.. _mixed-bip-ipv6:

BIP ↔ IPv6 (Docker)
^^^^^^^^^^^^^^^^^^^^

A BACnet/IP client on network 1 communicates with a BACnet/IPv6 server on
network 2 through a dual-stack ``NetworkRouter``.

**Architecture:**

- **Router** -- Bridges BIP (172.30.1.180:47808, network 1) and IPv6
  (fd00:bac:1::40, network 2)
- **IPv6 Server** -- Standard stress server (40 objects) on the IPv6 network
- **BIP Test Client** -- On network 1, uses ``add_route()`` to reach the
  server through the router via routed addresses (``NETWORK:HEXMAC`` format)

**Tests (6 total):** read present-value, read object-name,
read-property-multiple, write + readback, write-property-multiple + readback,
get object-list. All requests traverse the BIP→router→IPv6 path and responses
return via IPv6→router→BIP.

.. code-block:: bash

   make docker-test-mixed-bip-ipv6

.. _mixed-bip-sc:

BIP ↔ SC (Docker)
^^^^^^^^^^^^^^^^^

A BACnet/IP client on network 1 sends NPDUs through a BIP↔SC
``NetworkRouter`` to SC echo nodes connected via an SC hub on network 2.

**Architecture:**

- **SC Hub** -- WebSocket server with mutual TLS 1.3 on network 2
- **SC Echo Nodes (x2)** -- Parse incoming NPDUs, swap SNET/SADR→DNET/DADR
  headers, and echo the payload back with proper routing headers
- **Router** -- ``BIPTransport`` (port 1, network 1) + ``SCTransport``
  (port 2, network 2), connects to the SC hub as a node
- **BIP Test Client** -- Constructs NPDUs with ``DNET=2, DADR=node_vmac``,
  sends as unicast to the router's BIP address

TLS certificates are generated locally before the Docker build and
bind-mounted into containers. Certificates are cleaned up after the test.

**Tests (4 total):** unicast to node 1, unicast to node 2, 5 sequential
round-trips, large (1000-byte) NPDU crossing the transport boundary.

.. code-block:: bash

   make docker-test-mixed-bip-sc

.. _results-comparison:

Results Comparison
------------------

.. list-table::
   :header-rows: 1
   :widths: 20 20 15 20 15

   * - Transport
     - Local (req/s)
     - Errors
     - Docker (req/s)
     - Errors
   * - BACnet/IP
     - ~13,300
     - ~0.4%
     - ~18,500
     - ~0.3%
   * - BACnet/SC (TLS)
     - ~9,200 msg/s
     - 0%
     - ~16,350 msg/s
     - 0%
   * - Router
     - ~7,300
     - ~0.6%
     - ~7,900
     - ~0.7%
   * - BBMD
     - ~13,240
     - ~0.4%
     - ~18,750
     - ~0.3%

**Key observations:**

- **BIP/BBMD Docker > Local:** Docker runs server and clients as separate OS
  processes, enabling true CPU parallelism. The single-process local benchmark
  is limited by Python's GIL and event-loop scheduling.
- **SC Docker > Local:** Like BIP/BBMD, Docker SC benefits from multi-process
  parallelism across the hub, echo nodes, and test client. The hub's routing
  work (decode header, lookup destination, forward raw bytes) is CPU-bound and
  benefits from running in a dedicated process.
- **Router Local ≈ Docker:** Unlike BIP/BBMD, router throughput is similar in
  both environments because the routing overhead (two UDP hops, NPDU
  decode/re-encode at each hop) dominates over the multi-process benefit.
- **Router overhead:** Routing adds ~40% latency vs. direct BIP. Each request
  traverses two UDP hops and requires NPDU decode/re-encode at each hop.
- **TLS overhead:** SC with TLS 1.3 adds ~22% overhead vs. plaintext locally
  (~9,200 vs ~11,700 msg/s).

.. note::

   All reference results were collected on macOS with Apple M-series hardware
   (local) and Alpine Linux containers on the same host (Docker). Throughput
   and latency depend on host hardware, OS, Docker version, and container
   resource limits.

.. _running-benchmarks:

Running Benchmarks
------------------

**Local benchmarks** (no Docker required):

.. code-block:: bash

   # BACnet/IP
   make bench-bip            # human-readable to stderr
   make bench-bip-json       # JSON report to stdout

   # BACnet/SC
   make bench-sc             # human-readable to stderr
   make bench-sc-json        # JSON report to stdout

   # Router
   make bench-router         # human-readable to stderr
   make bench-router-json    # JSON report to stdout

   # BBMD
   make bench-bbmd           # human-readable to stderr
   make bench-bbmd-json      # JSON report to stdout

**Docker benchmarks** (requires Docker):

.. code-block:: bash

   # BACnet/IP stress test (pytest, pass/fail)
   make docker-test-stress

   # BACnet/IP stress runner (standalone, JSON report to stdout)
   make docker-stress

   # BACnet/SC stress test (pytest, pass/fail)
   make docker-test-sc-stress

   # BACnet/SC stress runner (standalone, JSON report to stdout)
   make docker-sc-stress

   # Router stress test (pytest, pass/fail)
   make docker-test-router-stress

   # Router stress runner (standalone, JSON report to stdout)
   make docker-router-stress

   # BBMD stress test (pytest, pass/fail)
   make docker-test-bbmd-stress

   # BBMD stress runner (standalone, JSON report to stdout)
   make docker-bbmd-stress

   # Mixed-transport integration tests (functional, not benchmarks)
   make docker-test-mixed-bip-ipv6   # BIP client ↔ IPv6 server via router
   make docker-test-mixed-bip-sc     # BIP client ↔ SC echo nodes via router

   # Run all Docker integration tests including stress
   make docker-test

The pytest variants assert ``error_rate < 0.5%`` (< 1% for router) and exit
non-zero on failure. The standalone runners output a structured JSON report
suitable for CI pipelines or historical tracking.

.. _benchmark-tuning:

Tuning Parameters
-----------------

**Local benchmark CLI options:**

All local benchmarks accept ``--warmup`` (default 5), ``--sustain``
(default 30), and ``--json`` flags. Transport-specific options:

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Option
     - Default
     - Description
   * - ``--pools``
     - 1
     - Client pool count (BIP, Router, BBMD)
   * - ``--readers``
     - 2
     - ReadProperty workers per pool
   * - ``--writers``
     - 1
     - WriteProperty workers per pool
   * - ``--rpm``
     - 1
     - ReadPropertyMultiple workers per pool
   * - ``--wpm``
     - 1
     - WritePropertyMultiple workers per pool
   * - ``--objlist``
     - 1
     - Object-list reader workers
   * - ``--cov``
     - 1
     - COV subscribers (BIP only)
   * - ``--fdt-workers``
     - 1
     - FDT read workers (BBMD only)
   * - ``--bdt-workers``
     - 1
     - BDT read workers (BBMD only)
   * - ``--no-tls``
     - off
     - Disable TLS, use plaintext WebSocket (SC only)
   * - ``--unicast``
     - 8
     - Unicast NPDU workers (SC only)
   * - ``--broadcast``
     - 2
     - Broadcast NPDU workers (SC only)
   * - ``--port``
     - 0
     - Server/hub port (0 = auto-assign)

**Docker benchmark environment variables:**

Docker benchmarks are configured via environment variables in
``docker-compose.yml``. Override them to adjust concurrency, duration, or
thresholds.

.. list-table::
   :header-rows: 1
   :widths: 30 15 55

   * - Variable
     - Default
     - Description
   * - ``NUM_POOLS``
     - 1
     - Number of client pools (each shares one UDP socket)
   * - ``READERS_PER_POOL``
     - 2
     - ReadProperty workers per pool
   * - ``WRITERS_PER_POOL``
     - 1
     - WriteProperty workers per pool
   * - ``RPM_PER_POOL``
     - 1
     - ReadPropertyMultiple workers per pool
   * - ``WPM_PER_POOL``
     - 1
     - WritePropertyMultiple workers per pool
   * - ``OBJLIST_WORKERS``
     - 1
     - Object-list reader workers (global)
   * - ``COV_SUBSCRIBERS``
     - 1
     - COV subscription workers (global)
   * - ``UNICAST_WORKERS``
     - 8
     - Unicast NPDU workers (SC only)
   * - ``BROADCAST_WORKERS``
     - 2
     - Broadcast NPDU workers (SC only)
   * - ``ERROR_BACKOFF``
     - 0.05
     - Seconds to pause after an error (prevents cascade)
   * - ``WARMUP_SECONDS``
     - 15
     - Warmup phase duration
   * - ``SUSTAIN_SECONDS``
     - 60
     - Sustained measurement duration
   * - ``CONNECT_TIMEOUT``
     - 30
     - Hub connection timeout in seconds (SC only)

.. tip::

   When increasing concurrency, watch for UDP socket contention (BIP) or
   WebSocket frame queuing (SC). The error backoff parameter is critical for
   BIP stability -- without it, failed requests retry instantly and flood the
   socket, causing cascade failures.

.. _profiling:

Profiling
---------

All local benchmark scripts support ``pyinstrument`` profiling via two flags:

- ``--profile`` -- print a text call-tree summary to stderr after the
  benchmark
- ``--profile-html PATH`` -- save an interactive HTML profile to a file

Both flags can be combined. Profiling output goes to stderr so it does not
interfere with ``--json`` output on stdout.

**Quick profiling with Make targets:**

.. code-block:: bash

   make bench-bip-profile       # BIP with --profile --sustain 10
   make bench-router-profile    # Router
   make bench-bbmd-profile      # BBMD
   make bench-sc-profile        # SC

**HTML report for deeper analysis:**

.. code-block:: bash

   uv run python scripts/bench_bip.py --profile-html /tmp/bip.html --sustain 10

Open ``/tmp/bip.html`` in a browser for an interactive flame graph showing
where time is spent inside bac-py during the benchmark workload.

Mixed-Environment SC Profiling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The SC benchmark supports split Docker/local execution to isolate hub-side or
client-side TLS overhead with pyinstrument. The ``--mode`` flag selects which
components run locally (and thus are profiled):

- ``--mode all`` (default) -- everything in-process
- ``--mode hub`` -- start only the hub locally; pair with Docker clients
- ``--mode client`` -- connect local echo nodes and stress workers to a
  Docker-hosted hub

First, generate shared TLS certificates with broad SANs (localhost,
``host.docker.internal``, Docker bridge IPs):

.. code-block:: bash

   uv run python scripts/bench_sc.py --generate-certs .sc-bench-certs

Then use the Makefile targets:

.. code-block:: bash

   # Profile client side (hub in Docker, local echo nodes + stress workers)
   make bench-sc-profile-client

   # Profile hub side (hub local, nodes + runner in Docker)
   make bench-sc-profile-hub

.. note::

   ``pyinstrument`` is a dev dependency -- it is not required at runtime or in
   Docker images. The ``--profile`` / ``--profile-html`` flags use a lazy
   import so the scripts still work without it installed.
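
The lazy-import pattern mentioned in the note can be sketched roughly as
below. This is an illustration under stated assumptions, not the code in the
benchmark scripts: ``run_benchmark`` is a hypothetical stand-in workload, and
falling back to an unprofiled run when ``pyinstrument`` is missing is an
assumed behaviour. Only the ``--profile`` and ``--profile-html`` flag names
come from the text above.

.. code-block:: python

   import argparse
   import sys


   def run_benchmark():
       """Stand-in workload (hypothetical); real scripts drive BACnet traffic."""
       return sum(i * i for i in range(100_000))


   def main(argv=None):
       parser = argparse.ArgumentParser()
       parser.add_argument("--profile", action="store_true")
       parser.add_argument("--profile-html", metavar="PATH")
       args = parser.parse_args(argv)

       profiler = None
       if args.profile or args.profile_html:
           try:
               # Imported only when a profiling flag is given, so the script
               # has no hard dependency on pyinstrument.
               from pyinstrument import Profiler
           except ImportError:
               # Assumed fallback: warn and continue without profiling.
               print("pyinstrument not installed; running unprofiled",
                     file=sys.stderr)
           else:
               profiler = Profiler()
               profiler.start()

       result = run_benchmark()

       if profiler is not None:
           profiler.stop()
           if args.profile:
               # Text call tree goes to stderr, leaving stdout free for JSON.
               print(profiler.output_text(), file=sys.stderr)
           if args.profile_html:
               with open(args.profile_html, "w", encoding="utf-8") as fh:
                   fh.write(profiler.output_html())
       return result

Calling ``main([])`` never touches ``pyinstrument``, which is what keeps the
scripts usable in environments where it is not installed.
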