zmem-org/ZMEM: Extremely fast binary message format with minimal memory overhead and zero copy access


A high-performance binary serialization format with minimal overhead for fixed structs and zero-copy access.

ZMEM (Zero-copy Memory Format) is designed for scenarios where serialization performance is critical: high-frequency trading, game networking, real-time systems, and inter-process communication. Unlike formats that prioritize schema evolution, ZMEM prioritizes raw speed by requiring all communicating parties to use identical schemas at compile time.

  • Minimal overhead for fixed structs – Direct memcpy serialization with no headers or pointers (only padding to 8-byte boundary)
  • Zero-copy deserialization – Access data in-place without parsing or allocation
  • Native mutable state – Fixed structs can serve as your application’s data model directly
  • 8-byte size alignment (minimum) – All wire sizes are padded to 8 bytes; higher alignment is honored when required (e.g., i128)
  • Deterministic output – Identical data always produces identical bytes (content-addressable storage friendly)
  • Memory-mapped file support – O(1) random access to any field in large files
  • Large data support – 64-bit size headers support documents up to 2^64−1 bytes and arrays up to 2^64−1 elements

Reading/Writing Native C++ Types

This benchmark measures serialization from and deserialization to native C++ types (structs, std::string, std::vector). This is ZMEM’s primary use case.

[Figure: ZMEM Benchmark Results]

Throughput in MB/s (higher is better). Benchmarked using Glaze on Apple M1 Max.

This benchmark compares zero-copy field access – reading data directly from the serialized buffer without allocating native types like std::string or std::vector:

[Figure: Zero-Copy Read Performance]

ZMEM and FlatBuffers achieve similar zero-copy performance (~14.5 GB/s). Cap’n Proto’s accessor pattern has more overhead (2.8 GB/s).

ZMEM makes an explicit trade-off: no schema evolution in exchange for maximum performance. This is the right choice when:

  • All parties are deployed together (IPC, game client/server, embedded systems)
  • Performance matters more than independent versioning
  • Data is transient (real-time telemetry, frame data, market data)

If you need schema evolution, consider Protocol Buffers, FlatBuffers, BEVE, or Cap’n Proto instead.

Why ZMEM is Faster than FlatBuffers

With FlatBuffers, you define types in a .fbs schema file and run a code generator. For structs containing strings or vectors, you must use the builder pattern:

// FlatBuffers: Schema file + code generation + builder pattern
flatbuffers::FlatBufferBuilder builder;
auto name = builder.CreateString("Alice");
auto scores = builder.CreateVector(std::vector<float>{95.5f, 87.0f, 91.5f});
auto player = CreatePlayer(builder, 42, name, scores);
builder.Finish(player);

With ZMEM (using Glaze), you use your existing C++ structs directly—no schema file, no code generation, no builder:

// ZMEM: Use native C++ types directly
struct Player {
    uint64_t id;
    std::string name;
    std::vector<float> scores;
};

Player player{42, "Alice", {95.5f, 87.0f, 91.5f}};
std::string buffer;
glz::write_zmem(player, buffer);  // That's it

| | FlatBuffers | ZMEM (C++ with Glaze) |
|---|---|---|
| Schema definition | .fbs file required | Use C++ structs directly (optional .zmem schema for cross-language) |
| Code generation | Required (flatc) | None for C++ (optional codegen for other languages) |
| Serialization API | Builder pattern | Single function call |
| std::string / std::vector | Supported (via builder) | Supported (native) |
| Schema evolution | Yes | No |

ZMEM’s simpler serialization path—no builder objects, no intermediate allocations, no vtable construction—is why it achieves ~3x higher write throughput than FlatBuffers.

version 1.0.0

namespace game

struct Vec3 {
  x::f32
  y::f32
  z::f32
}

struct Player {
  id::u64
  name::str[64]
  position::Vec3
  health::f32 = 100.0
  inventory::[u32]
}

| Structure | ZMEM | Cap’n Proto | FlatBuffers |
|---|---|---|---|
| Point { x, y: f32 } | 8 bytes | 24 bytes | 20 bytes |
| Vec3 { x, y, z: f32 } | 16 bytes (12 + 4 padding) | 24 bytes | 20 bytes |
| Empty struct | 8 bytes (padding) | 16 bytes | 4 bytes |

For fixed structs, ZMEM has minimal overhead—just padding to 8-byte boundaries for safe zero-copy access. No headers, vtables, or pointers.

Note: These sizes are illustrative for the specific schemas shown and typical encodings. Actual sizes vary with schema shape and framing.
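The ZMEM column in the table follows from rounding the native struct size up to the next multiple of 8. The sketch below is only an illustration of that arithmetic: `padded_size` and `wire_size` are hypothetical helper names, not part of ZMEM or Glaze, and alignments above 8 (such as i128) are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Round a byte count up to the next multiple of 8 (the minimum wire alignment).
constexpr std::size_t padded_size(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Hypothetical helper: wire size of a fixed (trivially copyable) struct.
template <class T>
constexpr std::size_t wire_size() {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    return padded_size(sizeof(T));
}

struct Point { float x, y; };      // 8 bytes of data
struct Vec3  { float x, y, z; };   // 12 bytes of data
struct Empty {};                   // sizeof is 1 in C++, which pads to 8

static_assert(wire_size<Point>() == 8);
static_assert(wire_size<Vec3>() == 16);   // 12 + 4 padding
static_assert(wire_size<Empty>() == 8);
```

The checks run at compile time, so a struct whose padded size changes unexpectedly fails the build rather than producing a mismatched wire layout.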

| Type | Size | Description |
|---|---|---|
| bool | 1 byte | Boolean value |
| i8, i16, i32, i64, i128 | 1-16 bytes | Signed integers |
| u8, u16, u32, u64, u128 | 1-16 bytes | Unsigned integers |
| f16, f32, f64 | 2-8 bytes | IEEE 754 floats |
| bf16 | 2 bytes | Brain float (ML applications) |

| Syntax | Description | Example |
|---|---|---|
| str[N] | Fixed-size string (N bytes, null-terminated) | str[64] |
| string | Variable-length string | string |
| T[N] | Fixed array | f32[4], Vec3[3] |
| [T] | Vector (variable length) | [f32], [Player] |
| opt<T> | Optional value | opt<T> |
| map<K, V> | Sorted key-value pairs | map<K, V> |
| enum Name : T | Enumeration | enum Status : u8 { ... } |
| union Name : T | Tagged union | union Result : u32 { ... } |

Type Aliases and Constants

const MAX_NAME::u32 = 64
const ORIGIN::f32[3] = [0.0, 0.0, 0.0]

type PlayerId = u64
type Name = str[MAX_NAME]
type Color = u8[4]
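For readers working from the C++ side, the constants and aliases above could be mirrored roughly as follows. This is a sketch under assumptions: `std::array<char, N>` stands in for `str[N]`, and the helpers `make_name` and `name_view` are hypothetical. Truncating over-long input to N-1 characters is a choice made here for illustration, not documented ZMEM behavior.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

constexpr std::uint32_t MAX_NAME = 64;
constexpr std::array<float, 3> ORIGIN = {0.0f, 0.0f, 0.0f};

using PlayerId = std::uint64_t;
using Name = std::array<char, MAX_NAME>;    // assumed stand-in for str[MAX_NAME]
using Color = std::array<std::uint8_t, 4>;  // u8[4]

// Copy text into a fixed-size field, truncating so a null terminator always fits.
inline Name make_name(std::string_view text) {
    Name out{};  // zero-filled, so unused bytes are all '\0'
    const std::size_t n = std::min<std::size_t>(text.size(), MAX_NAME - 1);
    std::copy_n(text.begin(), n, out.begin());
    return out;
}

// View the stored text up to (not including) the first null byte.
inline std::string_view name_view(const Name& name) {
    const auto end = std::find(name.begin(), name.end(), '\0');
    return std::string_view(name.data(), static_cast<std::size_t>(end - name.begin()));
}
```

Because `Name` always occupies exactly `MAX_NAME` bytes, a struct containing it stays a fixed type and keeps the memcpy serialization path.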

ZMEM divides types into two categories:

Fixed Types (Minimal Overhead)

Fixed types are trivially copyable and serialize with a direct memcpy. Wire sizes are padded to multiples of 8 bytes (and higher alignment is respected when required):

struct Particle {
    uint64_t id;        // 8 bytes
    float position[3];  // 12 bytes
    float velocity[3];  // 12 bytes
    float mass;         // 4 bytes
    // 4 bytes tail padding (struct alignment = 8)
};
// sizeof(Particle) == 40 bytes (36 data + 4 padding)
// Wire size: 40 bytes (already multiple of 8)
// Serialization: direct memcpy
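The memcpy path described in the comments above can be sketched by hand. `write_fixed` and `read_fixed` are hypothetical helpers written for this illustration, not the Glaze API. The size check assumes a typical 64-bit ABI where `uint64_t` is 8-byte aligned.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct Particle {
    std::uint64_t id;
    float position[3];
    float velocity[3];
    float mass;
};
static_assert(std::is_trivially_copyable_v<Particle>);

// Hypothetical helper: copy the object's bytes into a buffer padded to 8 bytes.
template <class T>
std::vector<unsigned char> write_fixed(const T& value) {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    std::vector<unsigned char> out((sizeof(T) + 7) / 8 * 8, 0);  // zero-filled
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Hypothetical helper: copy the bytes back into a native object.
template <class T>
T read_fixed(const std::vector<unsigned char>& in) {
    static_assert(std::is_trivially_copyable_v<T>, "fixed types only");
    assert(in.size() >= sizeof(T));
    T value;
    std::memcpy(&value, in.data(), sizeof(T));
    return value;
}
```

Reading through `memcpy` instead of casting the buffer pointer sidesteps alignment and aliasing questions at the cost of one copy. A cast-based view is the zero-copy alternative when the buffer is known to be suitably aligned.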

Variable Types (Minimal Overhead)

Structs containing vectors, variable-length strings, or maps use an 8-byte size header plus references for each variable field:

┌──────────────────────────────────────────────────────────────┐
│ [Size: 8 bytes] [Inline Section] [Variable Data Section]     │
└──────────────────────────────────────────────────────────────┘

All wire sizes (fixed and variable) are padded to multiples of 8 bytes, and variable section data is 8-byte aligned, enabling safe zero-copy access via reinterpret_cast. Types with alignment > 8 insert additional padding after headers so the inline section starts at the required alignment.
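To make the three-part layout concrete, here is a simplified hand-rolled encoder for a struct with one string field. It is a sketch, not the actual ZMEM wire format: the shape of the reference (an offset and a length, both 64-bit), the meaning of the size header (bytes following the header), and every name in the block are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

struct Message { std::uint64_t id; std::string name; };

// Inline section of this sketch: the fixed field plus one (offset, length) reference.
struct MessageInline {
    std::uint64_t id;
    std::uint64_t name_offset;  // bytes from the start of the inline section
    std::uint64_t name_length;  // bytes of string data, excluding padding
};

constexpr std::size_t pad8(std::size_t n) { return (n + 7) & ~std::size_t{7}; }

// Layout: [size: 8 bytes][inline section][variable data, padded to 8 bytes]
std::vector<unsigned char> encode(const Message& m) {
    const std::size_t body = sizeof(MessageInline) + pad8(m.name.size());
    std::vector<unsigned char> out(8 + body, 0);  // zero-filled padding
    const std::uint64_t size = body;              // assumed: counts bytes after the header
    const MessageInline in{m.id, sizeof(MessageInline), m.name.size()};
    std::memcpy(out.data(), &size, 8);
    std::memcpy(out.data() + 8, &in, sizeof in);
    std::memcpy(out.data() + 8 + in.name_offset, m.name.data(), m.name.size());
    return out;
}

// A non-owning view: the string data stays in the buffer and is not copied.
struct MessageView { std::uint64_t id; std::string_view name; };

MessageView view(const std::vector<unsigned char>& buf) {
    assert(buf.size() >= 8 + sizeof(MessageInline));
    MessageInline in;
    std::memcpy(&in, buf.data() + 8, sizeof in);
    assert(in.name_offset + in.name_length <= buf.size() - 8);
    const char* p = reinterpret_cast<const char*>(buf.data() + 8 + in.name_offset);
    return {in.id, std::string_view(p, in.name_length)};
}
```

The view is only valid while the buffer is alive, which is the usual price of zero-copy access.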

| Use Case | ZMEM | Alternatives |
|---|---|---|
| IPC / Shared memory | Ideal | |
| Game networking | Ideal | |
| High-frequency trading | Ideal | SBE |
| Real-time telemetry | Ideal | |
| Memory-mapped files | Ideal | |
| Microservices (independent deployment) | Not suitable | Protobuf, BEVE |
| Long-term storage with evolution | Not suitable | BEVE, Avro |
| Browser/server communication | Not suitable | JSON, BEVE, Protobuf |

Performance Characteristics

| Operation | ZMEM (Fixed) | ZMEM (Variable) | Cap’n Proto | FlatBuffers |
|---|---|---|---|---|
| Serialize small struct | memcpy | Single pass | Arena + setup | Builder + copy |
| Deserialize | Cast or memcpy | Wrap buffer | Wrap buffer | Wrap buffer |
| Field access | Compile-time offset | Compile-time offset | Pointer chase | vtable lookup |
| Random access (mmap) | O(1) direct | O(1) with offset | O(1) with pointer | O(1) with vtable |
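The "O(1) direct" entry for fixed types comes down to a single multiplication: record `i` lives at byte offset `i * sizeof(T)`. The sketch below assumes a region of back-to-back fixed records with no array header. `Tick` and `read_at` are hypothetical names. The base pointer could come from `mmap`, but any byte buffer works for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Tick {
    std::uint64_t timestamp;
    double price;
};
static_assert(sizeof(Tick) == 16, "fixed 16-byte record");

// Read record `index` from a byte region holding back-to-back Tick records.
// No scanning or parsing: the offset is computed directly from the index.
inline Tick read_at(const unsigned char* base, std::size_t region_bytes, std::size_t index) {
    const std::size_t offset = index * sizeof(Tick);
    assert(offset + sizeof(Tick) <= region_bytes);  // bounds check
    Tick t;
    std::memcpy(&t, base + offset, sizeof t);
    return t;
}
```

With a memory-mapped file, only the pages that hold the requested record need to be touched, so the cost of a lookup does not depend on the file size.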

The benchmarks use Glaze as the ZMEM implementation.

cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
./build/zmem_bench

MIT License


