Encoding
Haystack wire format implementations: encoding and decoding for JSON, Zinc, and Trio, plus encode-only CSV export.
- JSON: from hs_py.encoding.json import ... (v3 and v4)
- Zinc: from hs_py.encoding.zinc import ... (grid text format)
- Trio: from hs_py.encoding.trio import ... (record text format)
- CSV: from hs_py.encoding.csv import ... (lossy grid export)
For convenience, the most common JSON functions are re-exported directly from this package. Zinc, Trio, and CSV functions should be imported from their respective modules to avoid name collisions.
JSON
Haystack JSON v3 and v4 encoding and decoding with an optional pythonic mode.
Supports both Haystack 4 (v4) and Haystack 3 (v3) JSON formats, with an optional pythonic decode mode that converts Haystack types to native Python equivalents where possible.
See: https://project-haystack.org/doc/docHaystack/Json
- class hs_py.encoding.json.JsonVersion(*values)
Bases: Enum
Haystack JSON encoding version.
- V3 = 'v3'
Haystack 3 JSON: type-prefixed strings (e.g. "n:42 °F").
- V4 = 'v4'
Haystack 4 JSON: _kind object wrappers.
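To make the difference between the two versions concrete, here is a minimal sketch of how one number with a unit might look in each style. It uses only the standard library; `number_to_json` is a hypothetical helper written for this illustration and is not part of hs_py, so the library's own encoder may differ in detail.

```python
import json


def number_to_json(val, unit, version):
    """Hypothetical helper: render a number with a unit in the v3 or v4 style."""
    if version == "v3":
        # Haystack 3 style: a single string with an "n:" type prefix.
        return f"n:{val} {unit}"
    # Haystack 4 style: an object tagged with "_kind".
    return {"_kind": "number", "val": val, "unit": unit}


v3_value = number_to_json(42, "°F", "v3")
v4_value = number_to_json(42, "°F", "v4")

print(json.dumps(v3_value, ensure_ascii=False))
print(json.dumps(v4_value, ensure_ascii=False))
```

The v3 form packs the type, value, and unit into one string that must be parsed again on decode, while the v4 form keeps them as separate JSON fields.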
- hs_py.encoding.json.decode_grid(data, *, version=JsonVersion.V4, pythonic=False)
Decode Haystack JSON bytes to a Grid.
- Parameters:
data (bytes) – JSON bytes.
version (JsonVersion (default: <JsonVersion.V4: 'v4'>)) – JSON encoding version to decode.
pythonic (bool (default: False)) – If True, convert values to native Python types.
- Return type: Grid
- Returns: Decoded Grid.
- hs_py.encoding.json.decode_grid_dict(obj, *, version=JsonVersion.V4, pythonic=False)
Decode a pre-parsed JSON dict to a Grid.
Use this when the JSON has already been deserialized (e.g. from a WebSocket message) to avoid an unnecessary orjson.dumps/orjson.loads round-trip.
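The reason for having both a bytes-level and a dict-level entry point can be sketched with the standard library. Everything here is an assumption made for illustration: the envelope shape, the `rows_from_grid_dict` and `rows_from_bytes` helpers, and the use of `json` in place of orjson. The real functions return a Grid rather than a list of rows.

```python
import json


def rows_from_grid_dict(obj):
    """Hypothetical dict-level decoder: works on already-parsed JSON."""
    return [dict(row) for row in obj.get("rows", [])]


def rows_from_bytes(data):
    """Hypothetical bytes-level decoder: parse once, then delegate."""
    return rows_from_grid_dict(json.loads(data))


# A WebSocket-style envelope (assumed shape) that has already been parsed.
message = json.loads(
    '{"id": 7, "grid": {"_kind": "grid", "meta": {"ver": "3.0"},'
    ' "cols": [{"name": "dis"}], "rows": [{"dis": "Site A"}]}}'
)

# The nested grid is already a dict, so it can go straight to the dict-level
# decoder; serializing it back to bytes first would only add work.
direct = rows_from_grid_dict(message["grid"])
round_trip = rows_from_bytes(json.dumps(message["grid"]).encode())
```

Both paths produce the same rows; the direct path skips one serialize and one parse step.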
- hs_py.encoding.json.decode_val(obj, *, version=JsonVersion.V4, pythonic=False)
Decode a JSON value to a Haystack kind.
- Returns: Decoded Haystack value.
- hs_py.encoding.json.encode_grid(grid, *, version=JsonVersion.V4)
Encode a Grid to Haystack JSON bytes.
- Parameters:
grid (Grid) – Grid to encode.
version (JsonVersion (default: <JsonVersion.V4: 'v4'>)) – JSON encoding version to use.
- Return type: bytes
- Returns: JSON-encoded bytes via orjson.
- hs_py.encoding.json.encode_grid_dict(grid, *, version=JsonVersion.V4)
Encode a Grid to a JSON-compatible dict (no serialization).
Use this when embedding a grid dict inside a larger JSON structure to avoid the overhead of serializing to bytes and back.
- Parameters:
grid (Grid) – Grid to encode.
version (JsonVersion (default: <JsonVersion.V4: 'v4'>)) – JSON encoding version to use.
- Return type: dict
- Returns: JSON-serializable dict.
- hs_py.encoding.json.encode_val(val, *, version=JsonVersion.V4)
Encode a single Haystack value to its JSON-compatible representation.
- Parameters:
val (Any) – Haystack value to encode.
version (JsonVersion (default: <JsonVersion.V4: 'v4'>)) – JSON encoding version to use.
- Returns: JSON-serializable Python object.
Zinc
Haystack Zinc text grid format encoding and decoding.
Zinc is the primary text format for Haystack data. It encodes grids as line-oriented text with typed scalar values.
Trio
Trio record text format parser and encoder.
Trio is a line-oriented format for hand-authoring Haystack data records.
Each record is a set of tag name-value pairs; records are separated by lines of dashes.
Values are encoded in Zinc scalar format with Trio-specific extensions
(unquoted strings, true/false booleans).
See: https://project-haystack.org/doc/docHaystack/Trio
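The record structure described above can be sketched as a small splitter. This is a simplified illustration, not the library's parser: `split_trio_records` and `MARKER` are hypothetical names, values are kept as raw text rather than decoded from Zinc, and indented continuation lines are not handled.

```python
MARKER = object()  # stand-in for a Haystack marker value (illustrative only)


def split_trio_records(text):
    """Split Trio text into dicts of raw, undecoded tag values."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("---"):
            # A line of dashes closes the current record.
            if current:
                records.append(current)
            current = {}
            continue
        if not line.strip():
            continue
        name, sep, value = line.partition(":")
        # A bare tag name with no colon is treated as a marker tag.
        current[name.strip()] = value.strip() if sep else MARKER
    if current:
        records.append(current)
    return records


sample = """dis: "Site A"
site
area: 5000ft²
---
dis: "Site B"
site
"""

records = split_trio_records(sample)
```

Each resulting dict corresponds to one record; a full parser would additionally decode each raw value (for example `5000ft²`) into a typed Haystack value.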
- hs_py.encoding.trio.encode_trio(records)
Encode a list of tag dicts as Trio text.
Multi-line strings, nested Grid values (via Zinc), and nested record lists (via Trio) are encoded using indented continuation lines.
- hs_py.encoding.trio.parse_trio(text, *, _depth=0)
Parse Trio text into a list of tag dicts.
Each dict represents one record (records are separated by lines of ---). Supports multi-line string, Zinc, and Trio values via indented continuation lines.
- hs_py.encoding.trio.parse_zinc_val(text)
Parse a Zinc-encoded scalar value string.
This parses strict Zinc syntax only. For Trio-specific extensions (unquoted strings, true/false), use parse_trio().
CSV
Lossy CSV grid export (encode-only).
CSV is a lossy text format for grids: grid metadata, column metadata, and type information are discarded. It is useful for exporting grid data to spreadsheets and other tools that consume RFC 4180 CSV.
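What "lossy" means in practice can be shown with the standard library's `csv` module. The `rows_to_csv` helper below is a hypothetical stand-in for a grid exporter, not the hs_py function, and it takes plain dicts instead of a Grid.

```python
import csv
import io


def rows_to_csv(col_names, rows):
    """Hypothetical lossy export: header row plus stringified cell values."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # default dialect: commas, CRLF line endings
    writer.writerow(col_names)
    for row in rows:
        # Missing cells become empty strings; every value is flattened to text.
        writer.writerow(
            ["" if row.get(name) is None else str(row[name]) for name in col_names]
        )
    return buf.getvalue()


rows = [
    {"dis": "Site A", "area": 5000},
    {"dis": 'Site "B", annex'},
]
text = rows_to_csv(["dis", "area"], rows)

# Reading the text back yields strings only: the number 5000 is now "5000".
parsed = list(csv.reader(io.StringIO(text)))
```

Quoting and escaping are preserved by the CSV rules, but the reader cannot tell whether `5000` was originally a number or a string, which is why the format is suited to export rather than round-tripping.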
Scanner
Shared position-based Zinc value scanning helpers.
Position-based scanner functions for Zinc-encoded scalar values. Used by both the Trio parser and the filter lexer to avoid duplicating regex constants and parsing logic.
All scan functions use the (text, pos) -> (value, end_pos) signature.
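The (text, pos) -> (value, end_pos) convention makes scanners easy to chain, because each call returns the position where the next one should start. The sketch below illustrates the pattern with two hypothetical functions, `scan_ident` and `scan_pair`; neither is part of hs_py.

```python
import string

# ASCII letters, digits, and underscore, mirroring the IDENT_CHARS idea.
IDENT = frozenset(string.ascii_letters + string.digits + "_")


def scan_ident(text, pos):
    """Hypothetical scanner: read an identifier starting at pos."""
    end = pos
    while end < len(text) and text[end] in IDENT:
        end += 1
    if end == pos:
        raise ValueError(f"expected identifier at position {pos}")
    return text[pos:end], end


def scan_pair(text, pos):
    """Chain two scans over 'name:name', feeding each end_pos into the next."""
    left, pos = scan_ident(text, pos)
    if text[pos:pos + 1] != ":":
        raise ValueError(f"expected ':' at position {pos}")
    right, pos = scan_ident(text, pos + 1)
    return (left, right), pos


value, end_pos = scan_pair("siteRef:abc123 rest", 0)
```

Because no scanner slices or copies the remaining input, a caller can resume at `end_pos` without tracking any other state.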
- hs_py.encoding.scanner.DATETIME_RE = re.compile('\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d+)?(?:Z|[+-]\\d{2}:\\d{2})(?:\\s+[A-Z][a-zA-Z0-9_/]+)?')
Regex for Zinc datetime values.
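As a quick check of what this pattern accepts, the snippet below recompiles it from the listing above and tries a few inputs. The sample strings are made up for illustration.

```python
import re

# Pattern copied from the DATETIME_RE entry above.
DATETIME_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?"
    r"(?:Z|[+-]\d{2}:\d{2})(?:\s+[A-Z][a-zA-Z0-9_/]+)?"
)

samples = [
    "2024-01-15T10:30:00-05:00 New_York",  # offset plus timezone name
    "2024-01-15T15:30:00.250Z",            # fractional seconds, UTC, no name
    "2024-01-15",                          # date only
    "2024-01-15T10:30:00",                 # neither Z nor an offset
]
results = {s: DATETIME_RE.fullmatch(s) is not None for s in samples}
```

The pattern requires either `Z` or a numeric offset, so the last two samples do not match; the trailing timezone name is optional.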
- hs_py.encoding.scanner.DATE_RE = re.compile('\\d{4}-\\d{2}-\\d{2}')
Regex for Zinc date values.
- hs_py.encoding.scanner.DIGIT_CHARS = frozenset({'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '_'})
Digit characters and underscore (for numeric scanning).
- hs_py.encoding.scanner.IDENT_CHARS = frozenset({'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'})
Characters valid in tag names and identifiers (alphanumeric + underscore).
- hs_py.encoding.scanner.REF_CHARS = frozenset({'-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '~'})
Characters valid in a Ref id.
- hs_py.encoding.scanner.STR_ESCAPES: dict[str, str] = {'"': '"', '$': '$', '\\': '\\', 'b': '\x08', 'f': '\x0c', 'n': '\n', 'r': '\r', 't': '\t'}
String escape sequences per Zinc spec.
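A table like this is typically consulted while walking a string body character by character. The sketch below shows one way to do that; `unescape` is a hypothetical helper, not the library's `scan_str`, and it handles only the single-character escapes listed in the table.

```python
# Table copied from the STR_ESCAPES entry above.
STR_ESCAPES = {
    '"': '"', "$": "$", "\\": "\\", "b": "\x08",
    "f": "\x0c", "n": "\n", "r": "\r", "t": "\t",
}


def unescape(body):
    """Hypothetical helper: expand backslash escapes in a string body."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash at end of string")
        code = body[i + 1]
        if code not in STR_ESCAPES:
            raise ValueError(f"unsupported escape: \\{code}")
        out.append(STR_ESCAPES[code])
        i += 2  # skip the backslash and the escape code
    return "".join(out)


decoded = unescape(r'line1\nline2 \"quoted\" \$5')
```

Rejecting unknown escape codes, rather than passing them through, keeps malformed input from being silently accepted; that choice is part of the sketch, not a statement about the library's behavior.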
- hs_py.encoding.scanner.SYMBOL_CHARS = frozenset({'-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'})
Characters valid in symbol names (alphanumeric + hyphen, underscore, colon, dot).
- hs_py.encoding.scanner.TIME_RE = re.compile('\\d{2}:\\d{2}:\\d{2}(?:\\.\\d+)?')
Regex for Zinc time values.
- hs_py.encoding.scanner.UNIT_STOP_BASE = frozenset({'\t', '\n', '\r', ' '})
Base characters that terminate a number unit (whitespace only). Consumers extend this with context-specific delimiters.
- hs_py.encoding.scanner.city_to_tz(name)
Resolve a Haystack timezone name to a ZoneInfo.
Accepts both city-only names ("New_York") and full IANA names ("America/New_York"). Results are cached to avoid repeated filesystem lookups from ZoneInfo.
- hs_py.encoding.scanner.format_num(val)
Format a float, dropping unnecessary trailing zeros.
- hs_py.encoding.scanner.format_number(n)
Format a Number as a string with optional unit.
Handles NaN, INF, -INF, and appends the unit if present.
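The two formatting steps can be sketched as follows. Both functions are hypothetical approximations, not the library's code: they take a plain float and an optional unit string instead of a Number, and they assume the unit follows the value directly, as in Zinc text such as `72.5°F`.

```python
import math


def format_num_sketch(val):
    """Drop a redundant trailing '.0' so 72.0 renders as '72'."""
    val = float(val)
    if val.is_integer():
        return str(int(val))
    return repr(val)


def format_number_sketch(val, unit=None):
    """Render special values by name, then append the unit if present."""
    if math.isnan(val):
        text = "NaN"
    elif math.isinf(val):
        text = "INF" if val > 0 else "-INF"
    else:
        text = format_num_sketch(val)
    # Assumption: the unit is appended with no separator, as in Zinc.
    return text + unit if unit else text


examples = [
    format_number_sketch(72.0),
    format_number_sketch(72.5, "°F"),
    format_number_sketch(float("-inf")),
]
```

The special values are checked first because `int()` cannot convert infinities or NaN.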
- hs_py.encoding.scanner.format_ref(ref, *, zinc=False)
Format a Ref as @id dis or @id "dis".
- hs_py.encoding.scanner.parse_datetime(s)
Parse a Zinc datetime string into a Python datetime.
- hs_py.encoding.scanner.scan_dict(text, pos, *, _depth=0)
Scan a Zinc dict literal starting at {.
- hs_py.encoding.scanner.scan_keyword(text, pos)
Scan a keyword (T/F/M/NA/…), Coord, XStr, or bare identifier.
- hs_py.encoding.scanner.scan_list(text, pos, *, _depth=0)
Scan a Zinc list literal starting at [.
- hs_py.encoding.scanner.scan_number(text, pos, *, unit_stop=None)
Scan a numeric literal with optional unit.
Supports underscore digit separators per the Zinc spec (e.g. 10_000).
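The interaction between digit separators and the unit stop set can be illustrated with a deliberately reduced scanner. `scan_simple_number` is a hypothetical function that handles unsigned integers only (no sign, decimal point, or exponent), and the comma in its stop set is just an example of a context-specific delimiter.

```python
DIGITS = frozenset("0123456789")
DIGIT_CHARS = DIGITS | {"_"}        # digits plus the separator
UNIT_STOP = frozenset(" \t\r\n,")   # whitespace plus an example delimiter


def scan_simple_number(text, pos, unit_stop=UNIT_STOP):
    """Hypothetical scanner for unsigned integers with an optional unit."""
    if pos >= len(text) or text[pos] not in DIGITS:
        raise ValueError(f"expected digit at position {pos}")
    end = pos
    while end < len(text) and text[end] in DIGIT_CHARS:
        end += 1
    # Separators are purely visual, so strip them before converting.
    number = int(text[pos:end].replace("_", ""))
    unit_start = end
    while end < len(text) and text[end] not in unit_stop:
        end += 1
    unit = text[unit_start:end] or None
    return (number, unit), end


value, end_pos = scan_simple_number("10_000kW,next", 0)
```

Passing a different `unit_stop` lets the same scanner be reused in contexts where other characters end a unit.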
- hs_py.encoding.scanner.scan_number_or_temporal(text, pos, *, unit_stop=None)
Disambiguate and scan a number, date, time, or datetime.
- hs_py.encoding.scanner.scan_str(text, pos)
Scan a Zinc quoted string starting at the opening ".
- hs_py.encoding.scanner.scan_tag_name(text, pos)
Scan a tag name (alphanumeric + underscore) starting at pos.
- hs_py.encoding.scanner.tz_name(dt)
Extract the Haystack city timezone name from a datetime.
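The relationship between city-only names and full IANA names, which city_to_tz() and tz_name() translate between, can be sketched without touching the timezone database. The names `KNOWN_ZONES`, `resolve_zone_name`, and `city_of` are hypothetical, the zone list is hard-coded for the example, and the functions work on name strings rather than ZoneInfo or datetime objects.

```python
from functools import lru_cache

# A few IANA names hard-coded for the sketch; a real resolver would consult
# the system timezone database instead.
KNOWN_ZONES = ("America/New_York", "Europe/London", "Asia/Tokyo", "UTC")


def city_of(zone_name):
    """Return the last path segment of an IANA name."""
    return zone_name.rsplit("/", 1)[-1]


@lru_cache(maxsize=None)
def resolve_zone_name(name):
    """Map a city-only or full name to a full IANA name, caching the result."""
    if name in KNOWN_ZONES:
        return name
    for zone in KNOWN_ZONES:
        if city_of(zone) == name:
            return zone
    raise ValueError(f"unknown timezone: {name}")


first = resolve_zone_name("New_York")
second = resolve_zone_name("New_York")          # served from the cache
full = resolve_zone_name("America/New_York")
hits = resolve_zone_name.cache_info().hits
```

`lru_cache` keys on the argument, so the city-only and full spellings are cached separately; the repeated lookup is the one that avoids recomputation.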