zmem-org/ZMEM: Extremely fast binary message format with minimal memory overhead and zero copy access

A high-performance binary serialization format with minimal overhead for fixed structs and zero-copy access.

ZMEM (Zero-copy Memory Format) is designed for scenarios where serialization performance is critical: high-frequency trading, game networking, real-time systems, and inter-process communication. Unlike formats that prioritize schema evolution, ZMEM prioritizes raw speed by requiring that all communicating parties use identical schemas at compile time.

Minimal overhead for fixed structs – Direct memcpy serialization with no headers or pointers (only padding to 8-byte boundary)
Zero-copy deserialization – Access data in-place without parsing or allocation
Native mutable state – Fixed structs can serve as your application’s data model directly
8-byte size alignment (minimum) – All wire sizes are padded to 8 bytes; higher alignment is honored when required (e.g., i128)
Deterministic output – Identical data always produces identical bytes (content-addressable storage friendly)
Memory-mapped file support – O(1) random access to any field in large files
Large data support – 64-bit size headers support documents up to 2^64−1 bytes and arrays up to 2^64−1 elements

Reading/Writing Native C++ Types

This benchmark measures serialization from and deserialization to native C++ types (structs, std::string, std::vector). This is ZMEM’s primary use case.

Throughput in MB/s (higher is better). Benchmarked using Glaze on Apple M1 Max.

A second benchmark compares zero-copy field access, that is, reading data directly from the serialized buffer without allocating native types like std::string or std::vector:

ZMEM and FlatBuffers achieve similar zero-copy performance (~14.5 GB/s). Cap’n Proto’s accessor pattern has more overhead (2.8 GB/s).

ZMEM makes an explicit trade-off: no schema evolution in exchange for maximum performance. This is the right choice when:

All parties are deployed together (IPC, game client/server, embedded systems)
Performance matters more than independent versioning
Data is transient (real-time telemetry, frame data, market data)

If you need schema evolution, consider Protocol Buffers, FlatBuffers, BEVE, or Cap’n Proto instead.

Why ZMEM is Faster than FlatBuffers

With FlatBuffers, you define types in a .fbs schema file and run a code generator. For structs containing strings or vectors, you must use the builder pattern:

// FlatBuffers: Schema file + code generation + builder pattern
flatbuffers::FlatBufferBuilder builder;
auto name = builder.CreateString("Alice");
auto scores = builder.CreateVector(std::vector<float>{95.5f, 87.0f, 91.5f});
auto player = CreatePlayer(builder, 42, name, scores);
builder.Finish(player);

With ZMEM (using Glaze), you use your existing C++ structs directly—no schema file, no code generation, no builder:

// ZMEM: Use native C++ types directly
struct Player {
    uint64_t id;
    std::string name;
    std::vector<float> scores;
};

Player player{42, "Alice", {95.5f, 87.0f, 91.5f}};
std::string buffer;
glz::write_zmem(player, buffer);  // That's it

	FlatBuffers	ZMEM (C++ with Glaze)
Schema definition	`.fbs` file required	Use C++ structs directly (optional `.zmem` schema for cross-language)
Code generation	Required (`flatc`)	None for C++ (optional codegen for other languages)
Serialization API	Builder pattern	Single function call
`std::string` / `std::vector`	Supported (via builder)	Supported (native)
Schema evolution	Yes	No

ZMEM’s simpler serialization path—no builder objects, no intermediate allocations, no vtable construction—is why it achieves ~3x higher write throughput than FlatBuffers.

version 1.0.0

namespace game

struct Vec3 {
  x::f32
  y::f32
  z::f32
}

struct Player {
  id::u64
  name::str[64]
  position::Vec3
  health::f32 = 100.0
  inventory::[u32]
}

Structure	ZMEM	Cap’n Proto	FlatBuffers
`Point { x, y: f32 }`	8 bytes	24 bytes	20 bytes
`Vec3 { x, y, z: f32 }`	16 bytes (12 + 4 padding)	24 bytes	20 bytes
Empty struct	8 bytes (padding)	16 bytes	4 bytes

For fixed structs, ZMEM has minimal overhead—just padding to 8-byte boundaries for safe zero-copy access. No headers, vtables, or pointers.

Note: These sizes are illustrative for the specific schemas shown and typical encodings. Actual sizes vary with schema shape and framing.
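The ZMEM column above can be reproduced with a few lines of C++, assuming the wire size of a fixed struct is simply its `sizeof` rounded up to a multiple of 8 (or to the type's own alignment when that is larger). `padded_size` and `wire_size` are hypothetical helper names used for illustration; they are not part of ZMEM or Glaze.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: round `size` up to a multiple of `alignment`
// (alignment must be a power of two).
constexpr std::size_t padded_size(std::size_t size, std::size_t alignment = 8) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Assumed rule: wire size = sizeof(T) padded to max(8, alignof(T)).
template <class T>
constexpr std::size_t wire_size() {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    return padded_size(sizeof(T), alignof(T) > 8 ? alignof(T) : 8);
}

struct Point { float x, y; };      // 8 bytes of data
struct Vec3  { float x, y, z; };   // 12 bytes of data
struct Empty {};                   // sizeof is 1 in C++

static_assert(wire_size<Point>() == 8);
static_assert(wire_size<Vec3>() == 16);   // 12 + 4 padding
static_assert(wire_size<Empty>() == 8);   // consistent with the table row
```

The checks run at compile time, so a mismatch between a struct and its expected wire size would surface as a build error rather than at runtime.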

Type	Size	Description
`bool`	1 byte	Boolean value
`i8`, `i16`, `i32`, `i64`, `i128`	1-16 bytes	Signed integers
`u8`, `u16`, `u32`, `u64`, `u128`	1-16 bytes	Unsigned integers
`f16`, `f32`, `f64`	2-8 bytes	IEEE 754 floats
`bf16`	2 bytes	Brain float (ML applications)

Syntax	Description	Example
`str[N]`	Fixed-size string (N bytes, null-terminated)	`str[64]`
`string`	Variable-length string	`string`
`T[N]`	Fixed array	`f32[4]`, `Vec3[3]`
`[T]`	Vector (variable length)	`[f32]`, `[Player]`
`opt<T>`	Optional value	`opt<T>`
`map<K, V>`	Sorted key-value pairs	`map<K, V>`
`enum Name : T`	Enumeration	`enum Status : u8 { ... }`
`union Name : T`	Tagged union	`union Result : u32 { ... }`
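As an illustration of the `str[N]` row, here is a minimal sketch of what a fixed-size, null-terminated string could look like on the C++ side. `FixedString` is a hypothetical type, not the representation Glaze uses; truncating over-long input and zero-filling the unused tail are assumptions made here so that the bytes stay deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <type_traits>

// Hypothetical mirror of a `str[N]` field: N bytes of inline storage,
// always null-terminated, so at most N - 1 characters are kept.
template <std::size_t N>
struct FixedString {
    static_assert(N > 0, "need room for the terminator");
    char data[N];

    // Copy up to N - 1 characters and zero-fill the rest (assumed policy).
    void assign(std::string_view s) {
        const std::size_t n = std::min(s.size(), N - 1);
        std::copy_n(s.data(), n, data);
        std::fill(data + n, data + N, '\0');
    }

    // View of the stored characters up to the first null byte.
    std::string_view view() const {
        const char* end = std::find(data, data + N, '\0');
        return std::string_view(data, static_cast<std::size_t>(end - data));
    }
};

// No pointers and no heap: the type stays trivially copyable.
static_assert(std::is_trivially_copyable_v<FixedString<64>>);
static_assert(sizeof(FixedString<64>) == 64);
```

Because the storage is inline, a struct containing such a field remains a fixed type and can still be copied as a single block.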

Type Aliases and Constants

const MAX_NAME::u32 = 64
const ORIGIN::f32[3] = [0.0, 0.0, 0.0]

type PlayerId = u64
type Name = str[MAX_NAME]
type Color = u8[4]

ZMEM divides types into two categories:

Fixed Types (Minimal Overhead)

Fixed types are trivially copyable and serialize with a direct memcpy. Wire sizes are padded to multiples of 8 bytes (and higher alignment is respected when required):

struct Particle {
    uint64_t id;        // 8 bytes
    float position[3];  // 12 bytes
    float velocity[3];  // 12 bytes
    float mass;         // 4 bytes
    // 4 bytes tail padding (struct alignment = 8)
};
// sizeof(Particle) == 40 bytes (36 data + 4 padding)
// Wire size: 40 bytes (already multiple of 8)
// Serialization: direct memcpy
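The memcpy path for the `Particle` struct above can be sketched as follows. `to_bytes` and `from_bytes` are hypothetical helper names, not the Glaze API, and the `sizeof` check assumes a common 64-bit ABI where `uint64_t` is 8-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct Particle {
    std::uint64_t id;
    float position[3];
    float velocity[3];
    float mass;
};
static_assert(std::is_trivially_copyable_v<Particle>);
static_assert(sizeof(Particle) == 40);  // 36 data + 4 tail padding

// Hypothetical helper: copy a fixed type into a buffer padded to 8 bytes.
template <class T>
std::vector<std::byte> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    std::vector<std::byte> out((sizeof(T) + 7) / 8 * 8);  // zero-initialised
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Hypothetical helper: copy the bytes back into a native object.
template <class T>
T from_bytes(const std::byte* data) {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    T value;
    std::memcpy(&value, data, sizeof(T));
    return value;
}
```

A round trip is then `auto bytes = to_bytes(p); Particle q = from_bytes<Particle>(bytes.data());`. Copying out with `memcpy` is the conservative choice; reading in place through a cast is also possible when the buffer is suitably aligned.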

Variable Types (Minimal Overhead)

Structs containing vectors, variable-length strings, or maps use an 8-byte size header plus references for each variable field:

┌──────────────────────────────────────────────────────────────┐
│ [Size: 8 bytes] [Inline Section] [Variable Data Section]     │
└──────────────────────────────────────────────────────────────┘

All wire sizes (fixed and variable) are padded to multiples of 8 bytes, and variable section data is 8-byte aligned, enabling safe zero-copy access via reinterpret_cast. Types with alignment > 8 insert additional padding after headers so the inline section starts at the required alignment.
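To make the three-part layout concrete, here is a rough sketch of encoding a struct with one fixed field and one vector. Several details are assumptions made for illustration and are not taken from the ZMEM specification: the size header is taken to count the bytes that follow it, and a vector reference is taken to be an `{offset, count}` pair of `u64` values with the offset measured from the start of the inline section. `Scores`, `VecRef`, `encode`, and `score_at` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Scores {                      // hypothetical example type
    std::uint64_t id;
    std::vector<float> values;
};

// ASSUMED reference layout; the real format may differ.
struct VecRef { std::uint64_t offset; std::uint64_t count; };

constexpr std::size_t pad8(std::size_t n) { return (n + 7) & ~std::size_t{7}; }

// Layout produced: [size: 8][inline: id 8 + VecRef 16][variable: floats, padded to 8]
std::vector<std::byte> encode(const Scores& s) {
    const std::size_t inline_size = sizeof(std::uint64_t) + sizeof(VecRef);  // 24
    const std::size_t data_bytes = s.values.size() * sizeof(float);
    const std::uint64_t payload = inline_size + pad8(data_bytes);
    std::vector<std::byte> out(8 + payload);   // zero-filled, so padding is stable
    const VecRef ref{inline_size, s.values.size()};
    std::byte* p = out.data();
    std::memcpy(p, &payload, 8);               // size header
    std::memcpy(p + 8, &s.id, 8);              // inline fixed field
    std::memcpy(p + 16, &ref, sizeof ref);     // inline reference to the vector
    if (data_bytes != 0)
        std::memcpy(p + 8 + inline_size, s.values.data(), data_bytes);
    return out;
}

// Read element i straight from the buffer without building a std::vector.
float score_at(const std::byte* buf, std::size_t i) {
    VecRef ref;
    std::memcpy(&ref, buf + 16, sizeof ref);
    float v;
    std::memcpy(&v, buf + 8 + ref.offset + i * sizeof(float), sizeof v);
    return v;
}
```

With three floats the buffer is 8 + 24 + 16 = 48 bytes, and the variable data starts at byte 32, so both the total size and the data section stay 8-byte aligned.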

Use Case	ZMEM	Alternatives
IPC / Shared memory	Ideal	–
Game networking	Ideal	–
High-frequency trading	Ideal	SBE
Real-time telemetry	Ideal	–
Memory-mapped files	Ideal	–
Microservices (independent deployment)	Not suitable	Protobuf, BEVE
Long-term storage with evolution	Not suitable	BEVE, Avro
Browser/server communication	Not suitable	JSON, BEVE, Protobuf

Performance Characteristics

Operation	ZMEM (Fixed)	ZMEM (Variable)	Cap’n Proto	FlatBuffers
Serialize small struct	`memcpy`	Single pass	Arena + setup	Builder + copy
Deserialize	Cast or `memcpy`	Wrap buffer	Wrap buffer	Wrap buffer
Field access	Compile-time offset	Compile-time offset	Pointer chase	vtable lookup
Random access (mmap)	O(1) direct	O(1) with offset	O(1) with pointer	O(1) with vtable
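The "compile-time offset" and "O(1) direct" entries for fixed types amount to plain address arithmetic. The sketch below assumes a buffer (for example a memory-mapped file) that holds back-to-back fixed records; `Tick` and `price_at` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Tick {                        // hypothetical fixed record
    std::uint64_t timestamp;
    double price;
    std::uint32_t volume;
    std::uint32_t flags;
};
static_assert(sizeof(Tick) == 24);   // already a multiple of 8

// Read the `price` field of record i. The field offset is a compile-time
// constant, so the lookup is one multiply, one add, and one load.
inline double price_at(const std::byte* base, std::size_t i) {
    double price;
    std::memcpy(&price, base + i * sizeof(Tick) + offsetof(Tick, price), sizeof price);
    return price;
}
```

No other record is touched, which is why the cost does not depend on the size of the file.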

The benchmarks use Glaze as the ZMEM implementation.

cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
./build/zmem_bench

MIT License
