The Sigma Book

KYA: Know Your Assumptions

This is a PRE-ALPHA version of The Sigma Book.

Before using this material, understand these critical assumptions:

  1. Not Authoritative: This book is NOT an official specification. It is a research and educational resource derived from studying the source code.

  2. May Contain Errors: Content has not been formally verified. Implementations based solely on this book may be incorrect.

  3. Subject to Change: As a pre-alpha work, chapters may be incomplete, reorganized, or substantially rewritten.

  4. Source of Truth: For authoritative information, always consult the official repositories: sigmastate-interpreter (ScorexFoundation/sigmastate-interpreter) and sigma-rust (ergoplatform/sigma-rust).

  5. Verification Required: Cross-reference all claims against the actual source code before relying on them.

Use this book to learn and explore, but verify everything against the source.


Complete Technical Reference for SigmaState Interpreter

Welcome to The Sigma Book, a comprehensive technical reference covering the SigmaState interpreter, ErgoTrees, and the Sigma language. This book is written for engineers who need deep understanding of the implementation details, algorithms, and data structures behind the Ergo blockchain's smart contract system.

Code examples use idiomatic Zig 0.13+ with data-oriented design patterns, making algorithms explicit and accessible to implementers in any language.

What This Book Covers

This book provides complete documentation of:

  1. Specifications: Formal and informal specifications of the Sigma language, type system, and ErgoTree format
  2. Implementation Details: Internal algorithms and data structures from both the reference Scala implementation (sigmastate-interpreter) and Rust implementation (sigma-rust)
  3. Node Integration: How the Ergo node uses the interpreter for transaction validation
  4. Practical APIs: SDK and high-level interfaces for building applications

How to Read This Book

Prerequisites Approach

Every chapter includes an explicit Prerequisites section that lists:

  • Required knowledge assumptions
  • Related concepts you should understand
  • Links to earlier chapters covering dependencies

This allows you to:

  • Jump directly to topics of interest if you have the background
  • Trace backward to fill gaps in your understanding
  • Use the book as a reference rather than reading linearly

Code Examples

Code examples use Zig 0.13+ to illustrate algorithms with explicit memory management and data-oriented patterns. While not directly runnable against the Scala or Rust implementations, they demonstrate the core logic clearly.

Exercises

Each chapter concludes with exercises at three levels:

  1. Conceptual: Test your understanding of the material
  2. Implementation: Write code applying the concepts
  3. Analysis: Read and analyze real source code

Source Material

This book is derived from:

  • sigmastate-interpreter: Reference Scala implementation (ScorexFoundation/sigmastate-interpreter)
  • sigma-rust: Rust implementation (ergoplatform/sigma-rust)
  • Ergo node: Full node implementation showing integration
  • Formal specifications: LaTeX documents in docs/spec/
  • Test suites: Language specification tests defining expected behavior

Citations use footnotes referencing both Scala and Rust source locations.

Book Structure

Part                   | Focus                         | Depth
-----------------------|-------------------------------|------------
I. Foundations         | Core concepts and type system | Overview
II. AST                | Expression node catalog       | Reference
III. Serialization     | Binary format                 | Detailed
IV. Cryptography       | Zero-knowledge proofs         | Deep
V. Interpreter         | Evaluation engine             | Deep
VI. Compiler           | ErgoScript compilation        | Deep
VII. Data Structures   | Collections, AVL trees, boxes | Detailed
VIII. Node Integration | Transaction validation        | Practical
IX. SDK                | Developer APIs                | Practical
X. Advanced            | Soft-forks, cross-platform    | Specialized

Conventions Used

// Code blocks use Zig to illustrate algorithms
const ErgoTree = struct {
    header: Header,
    constants: []const Constant,
    root: *const Expr,
};

Note: Highlighted notes provide important context or warnings.

Footnotes: [^1]: Scala: path/to/file.scala:123 and [^2]: Rust: path/to/file.rs:456 reference source locations in both implementations.

Version Information

This book documents:

  • sigmastate-interpreter: Version 6.x (with notes on v5 differences)
  • sigma-rust: ergotree-ir and ergotree-interpreter crates
  • Protocol versions: v0 (initial), v1 (v4.0), v2 (v5.0 JIT), v3 (v6.0)

Contributing

This book is maintained as part of the ErgoTree research project. Corrections and improvements are welcome.


Let's begin with Chapter 1: Introduction to Sigma and ErgoTree.

Chapter 1: Introduction to Sigma and ErgoTree

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official sigmastate-interpreter and sigma-rust repositories.

Prerequisites

  • Basic blockchain concepts (transactions, blocks, consensus)
  • Understanding of the UTXO model (unspent transaction outputs)
  • Familiarity with any systems programming language (C, Rust, Go, or similar)
  • Public key cryptography fundamentals (key pairs, digital signatures, hash functions)

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain why Sigma protocols offer advantages over traditional blockchain scripting
  • Describe the relationship between ErgoScript, ErgoTree, and SigmaBoolean
  • Understand the UTXO model and how scripts guard spending conditions
  • Differentiate the roles of prover and verifier in transaction validation
  • Identify the core components of the Sigma interpreter architecture

What is Sigma?

Traditional blockchain scripting languages like Bitcoin Script offer limited expressiveness: they support hash preimages, signature checks, and timelocks, but little else. Ethereum's EVM provides Turing completeness but at the cost of complexity, high gas fees, and limited privacy guarantees.

Sigma (Σ) protocols occupy a middle ground. They are cryptographic proof systems that enable zero-knowledge proofs of knowledge—proving you know a secret without revealing it[^1]. The name comes from the Greek letter Σ and reflects their characteristic three-move structure:

  1. Commitment: The prover sends a randomized commitment value
  2. Challenge: The verifier sends a random challenge
  3. Response: The prover sends a response that proves knowledge without revealing secrets

What makes Sigma protocols powerful for blockchains is their composability: you can combine them with AND, OR, and threshold operations to build complex spending conditions while preserving zero-knowledge properties.
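To make the three moves concrete, here is a toy Schnorr-style identification protocol. Everything in this sketch is illustrative: it runs in the small multiplicative group Z_101* with fixed "random" values so the arithmetic is easy to follow, whereas real Sigma protocols operate over secp256k1 with proper randomness.

```zig
const std = @import("std");

// Toy parameters (NOT secure): subgroup of Z_p* with p = 101, generator g = 2.
// The order of g in this group is q = 100.
const p: u64 = 101;
const q: u64 = 100;
const g: u64 = 2;

/// Modular exponentiation by repeated squaring.
/// Suitable only for small moduli (the u64 multiply would overflow otherwise).
fn powMod(base: u64, exp: u64, modulus: u64) u64 {
    var result: u64 = 1;
    var b = base % modulus;
    var e = exp;
    while (e > 0) : (e >>= 1) {
        if (e & 1 == 1) result = (result * b) % modulus;
        b = (b * b) % modulus;
    }
    return result;
}

test "three-move Sigma protocol (toy Schnorr)" {
    const w: u64 = 7; // prover's secret
    const h = powMod(g, w, p); // public key h = g^w

    // 1. Commitment: prover picks random r, sends a = g^r
    const r: u64 = 13;
    const a = powMod(g, r, p);

    // 2. Challenge: verifier sends random e
    const e: u64 = 5;

    // 3. Response: prover sends z = r + e*w (mod q)
    const z = (r + e * w) % q;

    // Verifier accepts iff g^z == a * h^e (mod p)
    const lhs = powMod(g, z, p);
    const rhs = (a * powMod(h, e, p)) % p;
    try std.testing.expectEqual(lhs, rhs);
}
```

The verification equation works because g^z = g^(r + e·w) = g^r · (g^w)^e = a · h^e, yet z alone reveals nothing about w when r is random.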

The Three Layers

┌─────────────────────────────────────┐
│           ErgoScript                │  High-level language
│     (Human-readable source)         │
└─────────────────┬───────────────────┘
                  │ Compilation
                  ▼
┌─────────────────────────────────────┐
│            ErgoTree                 │  Intermediate representation
│      (Typed AST / Bytecode)         │  (Serialized in UTXOs)
└─────────────────┬───────────────────┘
                  │ Evaluation
                  ▼
┌─────────────────────────────────────┐
│          SigmaBoolean               │  Cryptographic proposition
│      (Sigma protocol tree)          │  (What needs to be proven)
└─────────────────────────────────────┘

ErgoScript

High-level, statically-typed language with Scala-like syntax[^2]:

  • First-class lambdas and higher-order functions
  • Call-by-value evaluation
  • Local type inference
  • Blocks as expressions

// Zig representation of an ErgoScript contract
const Contract = struct {
    freeze_deadline: i32,
    pk_owner: SigmaProp,

    pub fn evaluate(self: Contract, height: i32) SigmaProp {
        const deadline_passed = height > self.freeze_deadline;
        return SigmaProp.and(
            SigmaProp.fromBool(deadline_passed),
            self.pk_owner,
        );
    }
};

ErgoTree

Compiled bytecode representation stored on-chain[^3][^4]:

  • Typed abstract syntax tree (AST)
  • Serialized as bytes in UTXOs
  • Deterministically interpretable
  • Version-controlled for soft-fork upgrades

const ErgoTree = struct {
    header: HeaderType,
    constants: []const Constant,
    root: union(enum) {
        parsed: SigmaPropValue,
        unparsed: UnparsedTree,
    },

    /// Header byte layout:
    /// Bit 7: Multi-byte header flag
    /// Bit 6: Reserved (GZIP)
    /// Bit 5: Reserved (context-dependent costing)
    /// Bit 4: Constant segregation flag
    /// Bit 3: Size flag
    /// Bits 2-0: Version (0-7)
    pub const HeaderType = packed struct(u8) {
        version: u3,
        has_size: bool,
        constant_segregation: bool,
        reserved1: bool = false,
        reserved_gzip: bool = false,
        multi_byte: bool = false,
    };
};

SigmaBoolean

After evaluation, ErgoTree reduces to a SigmaBoolean—a tree of cryptographic propositions[^5][^6]:

const SigmaBoolean = union(enum) {
    prove_dlog: ProveDlog,           // Knowledge of discrete log
    prove_dh_tuple: ProveDhTuple,    // Diffie-Hellman tuple
    cand: Cand,                      // Logical AND
    cor: Cor,                        // Logical OR
    cthreshold: Cthreshold,          // k-of-n threshold
    trivial: TrivialProp,            // True/False

    /// Count nodes in proposition tree
    pub fn size(self: SigmaBoolean) usize {
        return switch (self) {
            .prove_dlog, .prove_dh_tuple, .trivial => 1,
            .cand => |c| 1 + totalChildrenSize(c.children),
            .cor => |c| 1 + totalChildrenSize(c.children),
            .cthreshold => |c| 1 + totalChildrenSize(c.children),
        };
    }
    // NOTE: In production, use an explicit work stack instead of recursion
    // to guarantee bounded stack depth. See ZIGMA_STYLE.md.
};

const ProveDlog = struct {
    /// Public key (compressed EC point, 33 bytes)
    h: EcPoint,
};

const ProveDhTuple = struct {
    g: EcPoint, // Generator
    h: EcPoint, // Point h
    u: EcPoint, // g^w
    v: EcPoint, // h^w
};

The UTXO Model

Ergo extends the UTXO (Unspent Transaction Output) model pioneered by Bitcoin. Instead of simple locking scripts, Ergo uses boxes—rich data structures that contain value, tokens, and arbitrary typed data:

┌─────────────────────────────────────────┐
│                  Box                    │
├─────────────────────────────────────────┤
│  R0: value (i64 nanoERGs)               │  ← Computed registers
│  R1: ergoTree (spending condition)      │
│  R2: tokens (asset list)                │
│  R3: creationInfo (height, txId, idx)   │
├─────────────────────────────────────────┤
│  R4-R9: additional registers ───────────┼──► User-defined data
│         (optional, typed constants)     │     (up to 6 registers)
└─────────────────────────────────────────┘
                    │
     ┌──────────────┴──────────────┐
     ▼                             ▼
┌─────────────────────────┐   ┌─────────────────────────┐
│         Token           │   │   Register (R4-R9)      │
├─────────────────────────┤   ├─────────────────────────┤
│  id: [32]u8 (token ID)  │   │  value: Constant        │
│  amount: i64            │   │  (any SType value)      │
└─────────────────────────┘   └─────────────────────────┘

Registers R0–R3 are computed from box fields and always present. Registers R4–R9 are optional and can store any typed value—integers, byte arrays, group elements, or even nested collections.

const Box = struct {
    value: i64,                          // nanoERGs (R0)
    ergo_tree: ErgoTree,                 // Spending condition (R1)
    tokens: []const Token,               // Additional assets (R2)
    creation_height: u32,                // Part of creation info (R3)
    tx_id: [32]u8,                       // Part of creation info (R3)
    output_index: u16,                   // Part of creation info (R3)
    additional_registers: [6]?Constant,  // R4-R9 (user-defined, optional)

    pub fn id(self: *const Box) [32]u8 {
        // Blake2b256(tx_id || output_index || serialized_content)
        var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
        hasher.update(&self.tx_id);
        // NOTE: asBytes is native-endian; real serialization fixes the byte order
        hasher.update(std.mem.asBytes(&self.output_index));
        // ... serialize and hash content
        return hasher.finalResult();
    }
};
// NOTE: R0-R3 are computed from box fields; only R4-R9 are stored explicitly.

The Prover/Verifier Model

Transaction validation splits the work between two roles. Both sides perform the same deterministic reduction of the ErgoTree to a SigmaBoolean; they differ only in what happens next. The prover uses its secret keys to produce a proof of the reduced proposition (made non-interactive via the Fiat-Shamir transformation), while the verifier checks that proof against the same proposition without ever learning the secrets.

                PROVER                              VERIFIER
          ┌──────────────┐                    ┌──────────────┐
          │   Secrets    │                    │              │
          │  (private    │                    │   Context    │
          │   keys)      │                    │              │
          └──────┬───────┘                    └──────┬───────┘
                 │                                   │
  ErgoTree ─────►│                    ErgoTree ─────►│
                 │                                   │
          ┌──────▼───────┐                    ┌──────▼───────┐
          │  Reduction   │                    │  Reduction   │
          │  (same as    │                    │  (same as    │
          │  verifier)   │                    │   prover)    │
          └──────┬───────┘                    └──────┬───────┘
                 │                                   │
          SigmaBoolean                        SigmaBoolean
                 │                                   │
          ┌──────▼───────┐                    ┌──────▼───────┐
          │   Signing    │───────Proof───────►│   Verify     │
          │ (Fiat-Shamir)│                    │  Signature   │
          └──────────────┘                    └──────┬───────┘
                                                     │
                                              true / false

Prover

The prover holds the secret keys (e.g., the discrete logarithms of public keys appearing in the proposition). Given an ErgoTree and an evaluation context, it first reduces the tree to a SigmaBoolean, then generates a proof for that proposition:

const Prover = struct {
    secrets: []const SecretKey,

    pub fn prove(
        self: *const Prover,
        ergo_tree: *const ErgoTree,
        context: *const Context,
    ) !Proof {
        // 1. Reduce to SigmaBoolean
        const sigma_bool = try Evaluator.reduce(ergo_tree, context);

        // 2. Generate proof using Fiat-Shamir
        return try self.generateProof(sigma_bool, context.message);
    }
};

Verifier

The verifier holds no secrets. It repeats the same reduction—tracking evaluation cost against a limit to bound execution—and then checks the supplied proof against the resulting proposition:

const Verifier = struct {
    cost_limit: u64,

    pub fn verify(
        self: *const Verifier,
        ergo_tree: *const ErgoTree,
        context: *const Context,
        proof: *const Proof,
    ) !bool {
        // 1. Reduce with cost tracking
        var cost: u64 = 0;
        const sigma_bool = try Evaluator.reduceWithCost(
            ergo_tree, context, &cost, self.cost_limit,
        );

        // 2. Verify signature
        return try verifySignature(sigma_bool, proof, context.message);
    }
};

Why Sigma Protocols?

Consider what Bitcoin Script can express: "This output can be spent if you provide a valid signature for public key X." This covers most payment scenarios but falls short for more sophisticated applications.

Sigma protocols enable a fundamentally richer set of spending conditions:

Feature                | What It Enables                                       | Example Use Case
-----------------------|-------------------------------------------------------|-----------------------------------------------------
Composable ZK Proofs   | AND, OR, threshold combinations of conditions         | Multi-party escrow with complex release logic
Ring Signatures        | Prove you're one of N signers without revealing which | Anonymous voting, whistleblower systems
Threshold Signatures   | Require k-of-n parties to sign                        | DAO governance, cold storage recovery
Zero-Knowledge Privacy | Prove statements without revealing underlying data    | Private auctions, confidential identity verification

The key insight is that Sigma protocols can be composed while preserving their zero-knowledge properties. An OR composition of two Sigma proofs reveals that the prover knows one of two secrets—but not which one.

// OR composition hides actual signer
const ring_signature = SigmaBoolean{
    .cor = .{
        .children = &[_]SigmaBoolean{
            .{ .prove_dlog = pk_alice },
            .{ .prove_dlog = pk_bob },
            .{ .prove_dlog = pk_carol },
        },
    },
};
// Proof reveals ONE signed, but not which

Repository Structure

Module      | Purpose
------------|--------------------------------------
core        | Cryptographic primitives, base types
data        | ErgoTree, AST nodes, serialization
interpreter | Evaluation engine, Sigma protocols
parsers     | ErgoScript parser
sc          | Compiler with IR optimization
sdk         | High-level transaction APIs

Key Design Principles

The Sigma interpreter is built around four core principles that make it suitable for blockchain consensus:

Determinism

Every operation must produce identical results for identical inputs, regardless of platform or implementation. This means no floating-point arithmetic, no uninitialized memory, and careful handling of hash map iteration order. Without determinism, nodes would disagree on transaction validity.
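As a concrete illustration of the hash-map pitfall, the sketch below computes an order-sensitive digest over a map's entries by collecting and sorting the keys first; iterating the map directly would tie the result to the hash function's bucket layout. The function names here are ours for illustration, not from the Sigma codebase.

```zig
const std = @import("std");

fn keyLessThan(_: void, a: []const u8, b: []const u8) bool {
    return std.mem.order(u8, a, b) == .lt;
}

/// Hash a map's entries in sorted key order so the digest is independent
/// of hash-map iteration order (and hence reproducible across platforms).
fn canonicalDigest(allocator: std.mem.Allocator, map: *const std.StringHashMap(i64)) ![32]u8 {
    var keys = std.ArrayList([]const u8).init(allocator);
    defer keys.deinit();
    var it = map.keyIterator();
    while (it.next()) |k| try keys.append(k.*);
    std.mem.sort([]const u8, keys.items, {}, keyLessThan);

    var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
    for (keys.items) |k| {
        hasher.update(k);
        const v = map.get(k).?;
        hasher.update(std.mem.asBytes(&v)); // real code would also fix endianness
    }
    return hasher.finalResult();
}
```

The same pattern—canonicalize, then compute—applies anywhere iteration order could leak into a consensus-relevant result.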

Bounded Execution

Every script must complete within a predictable cost limit. The interpreter tracks three resource categories:

  • Computational operations: arithmetic, comparisons, function calls
  • Memory allocations: collections, tuples, intermediate values
  • Cryptographic operations: EC point multiplication, signature verification

Scripts exceeding the cost limit fail validation, preventing denial-of-service attacks.
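The accrual pattern can be sketched as follows. The `CostTracker` name and the cost values are invented for illustration; the real interpreter uses detailed per-operation cost tables.

```zig
const std = @import("std");

const CostError = error{CostLimitExceeded};

/// Accumulate per-operation costs and fail fast once the limit is crossed.
const CostTracker = struct {
    accumulated: u64 = 0,
    limit: u64,

    pub fn charge(self: *CostTracker, cost: u64) CostError!void {
        // Saturating add avoids overflow wrapping on adversarial inputs.
        self.accumulated +|= cost;
        if (self.accumulated > self.limit) return error.CostLimitExceeded;
    }
};

test "scripts exceeding the limit fail" {
    var tracker = CostTracker{ .limit = 100 };
    try tracker.charge(60); // e.g. an expensive crypto op (made-up cost)
    try std.testing.expectError(error.CostLimitExceeded, tracker.charge(50));
}
```

Because the check runs on every charge, evaluation aborts at the first operation that would exceed the budget rather than after the fact.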

Soft-Fork Compatibility

ErgoTree includes version information in its header. When nodes encounter unknown opcodes (from future protocol versions), they can handle them gracefully rather than rejecting the entire block. This enables protocol upgrades without hard forks.

Cross-Platform Consistency

The specification must be implementable identically across different platforms. Reference implementations exist for:

  • JVM (Scala): The original sigmastate-interpreter
  • JavaScript (Scala.js): Browser and Node.js environments
  • Native (Rust): sigma-rust for performance-critical applications[^7]

Summary

This chapter introduced the fundamental concepts of the Sigma protocol ecosystem:

  • Sigma protocols are three-move cryptographic proofs that enable zero-knowledge proofs of knowledge, with the crucial property of composability
  • ErgoScript is a high-level, statically-typed language that compiles to ErgoTree bytecode
  • ErgoTree is a serialized AST stored in UTXO boxes that evaluates to SigmaBoolean propositions
  • SigmaBoolean represents cryptographic conditions (discrete log proofs, Diffie-Hellman tuples) combined with AND, OR, and threshold logic
  • The prover generates zero-knowledge proofs; the verifier checks them without learning secrets
  • The system is designed for blockchain consensus: deterministic, bounded, soft-fork compatible, and cross-platform

In the following chapters, we'll dive deep into each layer—starting with the type system that makes ErgoTree's static guarantees possible.


Next: Chapter 2: Type System

[^1]: Sigma protocols are interactive proof systems with the special "honest-verifier zero-knowledge" property.

[^7]: Rust implementation: sigma-rust crate at ergotree-ir/, ergotree-interpreter/

Chapter 2: Type System

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official sigmastate-interpreter and sigma-rust repositories.

Prerequisites

  • Basic type system concepts (static vs dynamic typing, generic types)
  • Understanding of binary serialization concepts
  • Prior chapters: Chapter 1

Learning Objectives

By the end of this chapter, you will be able to:

  • Identify all ErgoTree primitive types and their numeric ranges
  • Understand why type codes exist and how they enable compact serialization
  • Explain the "embeddable" type concept and its efficiency benefits
  • Construct collection, option, tuple, and function types
  • Recognize version-specific type additions (v6 and beyond)

Type System Overview

Every value in ErgoTree has a statically-known type. Unlike dynamically-typed languages where types are checked at runtime, ErgoTree's type system catches errors at compile time—before the script ever reaches the blockchain.

The type system provides[^1][^2]:

  • Static typing: All types known at compile time, enabling early error detection
  • Type inference: The compiler automatically deduces types in most cases
  • Generic types: Collections and options parameterized over element types
  • Type codes: Each type has a unique numeric code enabling compact binary serialization

Understanding type codes is essential because they directly affect how data is serialized on-chain. The type system is carefully designed so that common types serialize to single bytes, minimizing transaction size.

/// Base type descriptor
const SType = union(enum) {
    // Primitives (embeddable, codes 1-9)
    boolean,
    byte,
    short,
    int,
    long,
    big_int,
    group_element,
    sigma_prop,
    unsigned_big_int, // v6+

    // Compound types
    coll: *const SType,
    option: *const SType,
    tuple: []const SType,
    func: SFunc,

    // Object types (codes 99-106)
    box,
    avl_tree,
    context,
    header,
    pre_header,
    global,

    // Special
    unit,
    any,
    type_var: []const u8,

    pub fn typeCode(self: SType) u8 {
        return switch (self) {
            .boolean => 1,
            .byte => 2,
            .short => 3,
            .int => 4,
            .long => 5,
            .big_int => 6,
            .group_element => 7,
            .sigma_prop => 8,
            .unsigned_big_int => 9,
            .coll => 12,
            .option => 36,
            .tuple => 96,
            .box => 99,
            .avl_tree => 100,
            .context => 101,
            .header => 104,
            .pre_header => 105,
            .global => 106,
            else => 0,
        };
    }

    pub fn isEmbeddable(self: SType) bool {
        return self.typeCode() >= 1 and self.typeCode() <= 9;
    }

    pub fn isNumeric(self: SType) bool {
        return switch (self) {
            .byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
            else => false,
        };
    }
};

Type Hierarchy

                              SType
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
    SEmbeddable            SCollection            SOption
         │                 (elemType)            (elemType)
     ┌────┴────┬────────────┬─────────────┐
     │         │            │             │
SNumericType SBoolean SGroupElement  SSigmaProp
     │
    │
    ├── SByte (code 2)
    ├── SShort (code 3)
    ├── SInt (code 4)
    ├── SLong (code 5)
    ├── SBigInt (code 6)
    └── SUnsignedBigInt (code 9, v6+)

Object Types (non-embeddable):
  SBox(99), SAvlTree(100), SContext(101),
  SHeader(104), SPreHeader(105), SGlobal(106)

Primitive Types

Numeric Types

All numeric types support conversion via upcast (widening) and downcast (narrowing, which throws on overflow)[^3][^4]:

Type            | Code | Size    | Range
----------------|------|---------|-------------------
SByte           | 2    | 8-bit   | -128 to 127
SShort          | 3    | 16-bit  | -32,768 to 32,767
SInt            | 4    | 32-bit  | ±2.1 billion
SLong           | 5    | 64-bit  | ±9.2 quintillion
SBigInt         | 6    | 256-bit | Signed
SUnsignedBigInt | 9    | 256-bit | Unsigned (v6+)

const SNumericType = struct {
    type_code: u8,
    numeric_index: u8, // 0=Byte, 1=Short, 2=Int, 3=Long, 4=BigInt, 5=UBigInt

    /// Ordering: Byte < Short < Int < Long < BigInt < UnsignedBigInt
    pub fn canUpcastTo(self: SNumericType, target: SNumericType) bool {
        return self.numeric_index <= target.numeric_index;
    }

    /// Downcast with overflow check
    pub fn downcast(comptime T: type, value: anytype) !T {
        const min = std.math.minInt(T);
        const max = std.math.maxInt(T);
        if (value < min or value > max) {
            return error.ArithmeticOverflow;
        }
        return @intCast(value);
    }
};

// Type instances
const SByte = SNumericType{ .type_code = 2, .numeric_index = 0 };
const SShort = SNumericType{ .type_code = 3, .numeric_index = 1 };
const SInt = SNumericType{ .type_code = 4, .numeric_index = 2 };
const SLong = SNumericType{ .type_code = 5, .numeric_index = 3 };
const SBigInt = SNumericType{ .type_code = 6, .numeric_index = 4 };
const SUnsignedBigInt = SNumericType{ .type_code = 9, .numeric_index = 5 };

Boolean Type

const SBoolean = struct {
    pub const type_code: u8 = 1;
    pub const is_embeddable = true;
};

Cryptographic Types

GroupElement — Point on secp256k1 curve (33 bytes compressed)[^5]:

const SGroupElement = struct {
    pub const type_code: u8 = 7;

    /// 33 bytes: 1-byte prefix (0x02/0x03) + 32-byte X coordinate
    pub const SERIALIZED_SIZE = 33;
};

SigmaProp — Cryptographic proposition (required return type)[^6]:

const SSigmaProp = struct {
    pub const type_code: u8 = 8;

    /// Maximum serialized size
    pub const MAX_SIZE_BYTES: usize = 1024;
};

Type Codes

Type code space partitioning[^7]:

Range | Description
------|---------------------------
1-9   | Primitive embeddable types
10-11 | Reserved
12-23 | Coll[T] (T primitive)
24-35 | Coll[Coll[T]]
36-47 | Option[T]
48-59 | Option[Coll[T]]
60+   | Other types

const TypeCodes = struct {
    // Primitives
    pub const BOOLEAN: u8 = 1;
    pub const BYTE: u8 = 2;
    pub const SHORT: u8 = 3;
    pub const INT: u8 = 4;
    pub const LONG: u8 = 5;
    pub const BIGINT: u8 = 6;
    pub const GROUP_ELEMENT: u8 = 7;
    pub const SIGMA_PROP: u8 = 8;
    pub const UNSIGNED_BIGINT: u8 = 9;

    // Type constructor bases
    pub const PRIM_RANGE: u8 = 12; // MaxPrimTypeCode + 1
    pub const COLL_BASE: u8 = 12;
    pub const NESTED_COLL_BASE: u8 = 24;
    pub const OPTION_BASE: u8 = 36;
    pub const OPTION_COLL_BASE: u8 = 48;

    // Object types
    pub const TUPLE: u8 = 96;
    pub const ANY: u8 = 97;
    pub const UNIT: u8 = 98;
    pub const BOX: u8 = 99;
    pub const AVL_TREE: u8 = 100;
    pub const CONTEXT: u8 = 101;
    pub const HEADER: u8 = 104;
    pub const PREHEADER: u8 = 105;
    pub const GLOBAL: u8 = 106;
};

Embeddable Types

The type system's most elegant optimization is the concept of embeddable types. These nine primitive types (codes 1–9) can be "embedded" directly into type constructor codes, allowing common composite types to serialize as a single byte.

Consider Coll[Int] (a collection of integers). Without embedding, this would require two bytes: one for "Collection" and one for "Int". With embedding, it serializes as a single byte: 12 + 4 = 16. This matters because type information appears frequently in serialized ErgoTrees—every constant, every expression result has a type.

The embedding formula is simple[^8]:

/// Embed primitive type code into constructor
pub fn embedType(type_constr_base: u8, prim_type_code: u8) u8 {
    return type_constr_base + prim_type_code;
}

// Examples:
// Coll[Byte]  = 12 + 2 = 14
// Coll[Int]   = 12 + 4 = 16
// Option[Long] = 36 + 5 = 41
// Option[Coll[Byte]] = 48 + 2 = 50

Type           | Code | Coll[T] | Option[T]
---------------|------|---------|----------
Boolean        | 1    | 13      | 37
Byte           | 2    | 14      | 38
Short          | 3    | 15      | 39
Int            | 4    | 16      | 40
Long           | 5    | 17      | 41
BigInt         | 6    | 18      | 42
GroupElement   | 7    | 19      | 43
SigmaProp      | 8    | 20      | 44
UnsignedBigInt | 9    | 21      | 45
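Reading the table in the other direction, a deserializer can recover the constructor and element type from a single byte. The sketch below is our own helper, not the actual deserializer, and handles only the single-level embeddings shown above; codes 12 and 36 themselves signal a non-embedded element type that follows in subsequent bytes.

```zig
const std = @import("std");

const DecodedType = union(enum) {
    primitive: u8, // embeddable primitive, code 1-9
    coll_of: u8,   // Coll[T] with embedded primitive element code
    option_of: u8, // Option[T] with embedded primitive element code
    other: u8,     // anything else: more bytes needed to resolve
};

fn decodeTypeCode(code: u8) DecodedType {
    return switch (code) {
        1...9 => .{ .primitive = code },
        13...21 => .{ .coll_of = code - 12 },   // COLL_BASE = 12
        37...45 => .{ .option_of = code - 36 }, // OPTION_BASE = 36
        else => .{ .other = code },
    };
}

test "decode embedded type codes" {
    // Coll[Int] = 16 decodes back to element code 4 (Int)
    try std.testing.expectEqual(DecodedType{ .coll_of = 4 }, decodeTypeCode(16));
    // Option[Long] = 41 decodes back to element code 5 (Long)
    try std.testing.expectEqual(DecodedType{ .option_of = 5 }, decodeTypeCode(41));
}
```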

Collection Types

Collections are homogeneous sequences[^9][^10]:

const SCollection = struct {
    elem_type: *const SType,

    pub fn typeCode(self: SCollection) u8 {
        if (self.elem_type.isEmbeddable()) {
            return TypeCodes.COLL_BASE + self.elem_type.typeCode();
        }
        return TypeCodes.COLL_BASE; // Followed by element type
    }
};

// Pre-defined collection types (avoid allocation)
const SByteArray = SCollection{ .elem_type = &SType.byte };
const SIntArray = SCollection{ .elem_type = &SType.int };
const SBooleanArray = SCollection{ .elem_type = &SType.boolean };
const SBoxArray = SCollection{ .elem_type = &SType.box };

Option Types

Optional values[^11]:

const SOption = struct {
    elem_type: *const SType,

    pub fn typeCode(self: SOption) u8 {
        if (self.elem_type.isEmbeddable()) {
            return TypeCodes.OPTION_BASE + self.elem_type.typeCode();
        }
        return TypeCodes.OPTION_BASE;
    }
};

// Pre-defined option types
const SByteOption = SOption{ .elem_type = &SType.byte };
const SIntOption = SOption{ .elem_type = &SType.int };
const SLongOption = SOption{ .elem_type = &SType.long };
const SBoxOption = SOption{ .elem_type = &SType.box };

Tuple Types

Heterogeneous fixed-size sequences:

const STuple = struct {
    items: []const SType,

    pub const type_code: u8 = 96;

    pub fn pair(left: SType, right: SType) STuple {
        return STuple{ .items = &[_]SType{ left, right } };
    }
};

Function Types

Function signatures for lambdas and methods:

const SFunc = struct {
    t_dom: []const SType,     // Domain (argument types)
    t_range: *const SType,    // Range (return type)
    tpe_params: []const STypeVar, // Generic type parameters

    pub const type_code: u8 = 246;
};

// Example: (Int) => Boolean
const intToBool = SFunc{
    .t_dom = &[_]SType{SType.int},
    .t_range = &SType.boolean,
    .tpe_params = &[_]STypeVar{},
};

Object Types

Type       | Code | Description
-----------|------|-------------------------------------------
SBox       | 99   | UTXO with value, script, tokens, registers
SAvlTree   | 100  | Authenticated dictionary (Merkle proofs)
SContext   | 101  | Transaction context
SHeader    | 104  | Block header
SPreHeader | 105  | Pre-solved block header
SGlobal    | 106  | Global operations

Type Variables

Used internally by the compiler for generic methods (never serialized)[^12]:

const STypeVar = struct {
    name: []const u8,

    // Standard type variables
    pub const T = STypeVar{ .name = "T" };
    pub const R = STypeVar{ .name = "R" };
    pub const K = STypeVar{ .name = "K" };
    pub const V = STypeVar{ .name = "V" };
    pub const IV = STypeVar{ .name = "IV" }; // Input Value
    pub const OV = STypeVar{ .name = "OV" }; // Output Value
};

Version Differences

v6 additions[^13]:

  • SUnsignedBigInt (type code 9)
  • Bitwise operations on numeric types
  • Additional numeric methods (toBytes, toBits, shifts)

pub fn allPredefTypes(version: ErgoTreeVersion) []const SType {
    const v5_types = &[_]SType{
        .boolean, .byte, .short, .int, .long, .big_int,
        .context, .global, .header, .pre_header, .avl_tree,
        .group_element, .sigma_prop, .box, .unit, .any,
    };

    if (version.value >= 3) { // v6+
        return v5_types ++ &[_]SType{.unsigned_big_int};
    }
    return v5_types;
}

Complete Type Code Reference

Type           | Code | Embeddable
---------------|------|------------
Boolean        | 1    | Yes
Byte           | 2    | Yes
Short          | 3    | Yes
Int            | 4    | Yes
Long           | 5    | Yes
BigInt         | 6    | Yes
GroupElement   | 7    | Yes
SigmaProp      | 8    | Yes
UnsignedBigInt | 9    | Yes
Coll[T]        | 12   | Constructor
Option[T]      | 36   | Constructor
Tuple          | 96   | No
Any            | 97   | No
Unit           | 98   | No
Box            | 99   | No
AvlTree        | 100  | No
Context        | 101  | No
Header         | 104  | No
PreHeader      | 105  | No
Global         | 106  | No

Summary

This chapter covered ErgoTree's type system, which provides the foundation for type-safe script execution:

  • Type codes (unique numeric identifiers) enable compact binary serialization—critical for on-chain storage efficiency
  • Embeddable types (codes 1–9) combine with type constructors using a clever arithmetic encoding, reducing common types to single bytes
  • Numeric types form an ordered hierarchy (Byte < Short < Int < Long < BigInt) with safe upcasting and checked downcasting
  • SigmaProp is the required return type for all ErgoScript contracts—it represents the cryptographic proposition that must be proven
  • Object types (Box, Context, Header) provide access to blockchain state during script execution
  • Version 6 introduces SUnsignedBigInt and additional numeric operations for greater expressiveness

The type system ensures that scripts are well-formed before execution, preventing runtime type errors that could cause consensus failures. In the next chapter, we'll see how these types are organized into the ErgoTree structure—the actual format stored on-chain.


Next: Chapter 3: ErgoTree Structure

[^3]: Scala: SType.scala:395-575 (numeric type definitions)

[^4]: Rust: snumeric.rs:12-37 (method IDs)

[^5]: Scala: SType.scala (SGroupElement definition)

[^6]: Scala: SType.scala (SSigmaProp definition)

[^7]: Scala: SType.scala:320-332 (type code ranges)

[^8]: Scala: SType.scala:305-313 (SEmbeddable trait)

[^9]: Scala: SType.scala:743-799 (SCollection)

[^10]: Rust: scoll.rs

[^11]: Scala: SType.scala:691-741 (SOption)

[^12]: Scala: SType.scala:67-95 (type variables)

[^13]: Scala: SType.scala:105-128 (version differences)

Chapter 3: ErgoTree Structure

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Binary representation concepts (bits, bytes, bitwise operations)
  • Variable-Length Quantity (VLQ) encoding—a method for encoding integers using a variable number of bytes
  • Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations of code structure
  • Prior chapters: Chapter 1 for the three-layer architecture, Chapter 2 for type codes used in serialization

Learning Objectives

By the end of this chapter, you will be able to:

  • Parse and interpret ErgoTree header bytes, extracting version and feature flags
  • Explain how constant segregation enables template sharing and caching optimizations
  • Describe the version mechanism and how it enables soft-fork protocol upgrades
  • Read and write the complete ErgoTree binary format

ErgoTree Overview

When you write an ErgoScript contract, the compiler transforms it into ErgoTree—a compact binary format designed for blockchain storage and deterministic execution. Every UTXO box contains an ErgoTree that defines its spending conditions.

ErgoTree is specifically designed to be [1][2]:

  • Self-sufficient: Contains everything needed for evaluation (no external dependencies)
  • Compact: Optimized binary encoding minimizes on-chain storage
  • Forward-compatible: Version mechanism enables protocol upgrades without hard forks
  • Deterministic: Same bytes always produce the same evaluation result

The structure consists of:

  • Header byte — Format version and feature flags
  • Size field (optional) — Total size for fast skipping
  • Constants array (optional) — Extracted constants for template sharing
  • Root expression — The actual script logic, returning SigmaProp
const ErgoTree = struct {
    header: HeaderType,
    constants: []const Constant,
    root: union(enum) {
        parsed: SigmaPropValue,
        unparsed: UnparsedTree,
    },
    proposition_bytes: ?[]const u8,

    pub fn bytes(self: *ErgoTree, allocator: Allocator) ![]const u8 {
        if (self.proposition_bytes) |b| return b;
        return try serialize(self, allocator);
    }

    pub fn bytesHex(self: *ErgoTree, allocator: Allocator) ![]u8 {
        const b = try self.bytes(allocator);
        // "{x}" on a byte slice is not portable across Zig versions;
        // fmtSliceHexLower reliably prints lowercase hex.
        return std.fmt.allocPrint(allocator, "{}", .{std.fmt.fmtSliceHexLower(b)});
    }
};

Header Format

The first byte uses a bit-field format [3][4]:

   7  6  5  4  3  2  1  0
 ┌──┬──┬──┬──┬──┬──┬──┬──┐
 │  │  │  │  │  │  │  │  │
 └──┴──┴──┴──┴──┴──┴──┴──┘
  │  │  │  │  │  └──┴──┴── Version (bits 0-2)
  │  │  │  │  └─────────── Size flag (bit 3)
  │  │  │  └────────────── Constant segregation (bit 4)
  │  │  └───────────────── Reserved (bit 5, must be 0)
  │  └──────────────────── Reserved for GZIP (bit 6, must be 0)
  └─────────────────────── Extended header (bit 7)
const HeaderType = packed struct(u8) {
    version: u3,              // bits 0-2
    has_size: bool,           // bit 3
    constant_segregation: bool, // bit 4
    reserved1: bool = false,  // bit 5
    reserved_gzip: bool = false, // bit 6
    multi_byte: bool = false, // bit 7

    pub const VERSION_MASK: u8 = 0x07;
    pub const SIZE_FLAG: u8 = 0x08;
    pub const CONST_SEG_FLAG: u8 = 0x10;

    pub fn fromByte(byte: u8) HeaderType {
        return @bitCast(byte);
    }

    pub fn toByte(self: HeaderType) u8 {
        return @bitCast(self);
    }

    pub fn v0(constant_segregation: bool) HeaderType {
        return .{
            .version = 0,
            .has_size = false,
            .constant_segregation = constant_segregation,
        };
    }

    pub fn v1(constant_segregation: bool) HeaderType {
        return .{
            .version = 1,
            .has_size = true, // Required for v1+
            .constant_segregation = constant_segregation,
        };
    }
};

Common Header Values

Byte   Binary     Meaning
0x00   00000000   v0, no segregation, no size
0x08   00001000   v0, no segregation, with size
0x10   00010000   v0, constant segregation, no size
0x18   00011000   v0, constant segregation, with size
0x09   00001001   v1, with size (required)
0x19   00011001   v1, constant segregation, with size
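These rows follow mechanically from the bit layout. A quick decoding sketch (the helper is hypothetical; the masks match VERSION_MASK, SIZE_FLAG, and CONST_SEG_FLAG above):

```python
# Decode an ErgoTree header byte using the bit layout above.
# Hypothetical helper for illustration.
def decode_header(byte: int) -> dict:
    return {
        "version": byte & 0x07,                  # bits 0-2
        "has_size": bool(byte & 0x08),           # bit 3
        "const_segregation": bool(byte & 0x10),  # bit 4
    }

print(decode_header(0x19))
# {'version': 1, 'has_size': True, 'const_segregation': True}
```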

Binary Format

┌──────────────────────────────────────────────────────────────────┐
│                            ErgoTree                              │
├─────────┬─────────────┬──────────────────┬───────────────────────┤
│ Header  │ [Size]      │ [Constants]      │ Root Expression       │
│ 1 byte  │ VLQ (opt)   │ Array (opt)      │ Serialized tree       │
└─────────┴─────────────┴──────────────────┴───────────────────────┘

If header bit 3 is set (hasSize):
  Size = VLQ-encoded size of (Constants + Root Expression)

If header bit 4 is set (isConstantSegregation):
  Constants = VLQ count + Array of serialized constants

Root Expression = Serialized expression tree (SigmaPropValue)
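The size and count fields above use VLQ encoding: each byte carries 7 payload bits, and the high bit marks continuation. A minimal unsigned-only codec sketch (illustrative; not taken from either reference implementation):

```python
def write_vlq(n: int) -> bytes:
    """Encode an unsigned integer as VLQ: 7 bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)  # more bytes follow
        else:
            out.append(b)
            return bytes(out)

def read_vlq(data: bytes, pos: int = 0):
    """Decode an unsigned VLQ starting at pos; returns (value, next_pos)."""
    result = shift = 0
    while True:
        b = data[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7

print(write_vlq(300).hex())  # ac02
```

Values below 128 fit in one byte, which keeps typical size fields compact.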
const ErgoTreeSerializer = struct {
    pub fn deserialize(allocator: Allocator, reader: anytype) !ErgoTree {
        // 1. Read header byte
        const header = HeaderType.fromByte(try reader.readByte());

        // 2. Read extended header if bit 7 set
        if (header.multi_byte) {
            // VLQ continuation - read additional bytes
            _ = try readVlqExtension(reader);
        }

        // 3. Read size if flag set
        var tree_size: ?u32 = null;
        if (header.has_size) {
            tree_size = try readVlq(reader);
        }

        // 4. Read constants if segregation enabled
        var constants: []Constant = &.{};
        if (header.constant_segregation) {
            const count = try readVlq(reader);
            // Bounds check: prevent DoS via excessive allocation
            const MAX_CONSTANTS: u32 = 4096;
            if (count > MAX_CONSTANTS) {
                return error.TooManyConstants;
            }
            constants = try allocator.alloc(Constant, count);
            for (constants) |*c| {
                c.* = try Constant.deserialize(reader);
            }
        }
        // NOTE: In production, use a pre-allocated pool instead of dynamic
        // allocation during deserialization. See ZIGMA_STYLE.md.

        // 5. Read root expression
        const root = try Expr.deserialize(reader);

        return ErgoTree{
            .header = header,
            .constants = constants,
            .root = .{ .parsed = root },
            .proposition_bytes = null,
        };
    }
};

Constant Segregation

Constant segregation is an optimization technique that extracts literal values from the expression tree and stores them in a separate array [5]. The expression tree then references these constants via placeholder indices. This seemingly simple change enables several powerful optimizations:

Without segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x00                                    │
│ root: AND(GT(HEIGHT, IntConstant(100)), pk)     │
└─────────────────────────────────────────────────┘

With segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x10                                    │
│ constants: [IntConstant(100)]                   │
│ root: AND(GT(HEIGHT, Placeholder(0)), pk)       │
└─────────────────────────────────────────────────┘

Benefits:

  1. Template sharing: Same template, different constants
  2. Caching: Templates cached for repeated evaluation
  3. Substitution: Constants replaced without re-parsing
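Benefit 1 can be illustrated with a toy model (not the real ErgoTree serializer): after segregation, two scripts that differ only in a constant value share an identical template.

```python
# Toy model of constant segregation (NOT the real ErgoTree serializer):
# integer literals are pulled into a constants list and replaced by
# index placeholders, so the remaining tree is a shared "template".
def segregate(tree, constants):
    if isinstance(tree, int):        # literal -> placeholder index
        constants.append(tree)
        return ("placeholder", len(constants) - 1)
    if isinstance(tree, str):        # named node (HEIGHT, pk) unchanged
        return tree
    op, *args = tree
    return (op, *[segregate(a, constants) for a in args])

consts_a, consts_b = [], []
script_a = segregate(("and", ("gt", "HEIGHT", 100), "pk"), consts_a)
script_b = segregate(("and", ("gt", "HEIGHT", 500), "pk"), consts_b)

print(script_a == script_b)  # True: same template
print(consts_a, consts_b)    # [100] [500]
```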
/// Substitute ConstantPlaceholder nodes with actual constants
pub fn substConstants(
    root: *const Expr,
    constants: []const Constant,
) Expr {
    return switch (root.*) {
        .constant_placeholder => |ph| .{
            .constant = constants[ph.index],
        },
        // `and` is a reserved word in Zig, so the union field is @"and"
        .@"and" => |a| .{
            .@"and" = .{
                .left = substConstants(a.left, constants),
                .right = substConstants(a.right, constants),
            },
        },
        // ... other node types
        else => root.*,
    };
}
// NOTE: In production, use an iterative approach with an explicit work stack
// to guarantee bounded stack depth and prevent stack overflow on deep trees.

Version Mechanism

The ErgoTree version field (bits 0-2) enables soft-fork protocol upgrades without breaking consensus [6]. Each ErgoTree version corresponds to a minimum required block version in the Ergo protocol—nodes running older protocol versions will skip validation of scripts with newer ErgoTree versions rather than rejecting them as invalid.

ErgoTree Version   Min Block Version   Key Features
v0                 1                   Original format with Ahead-of-Time (AOT) costing calculated during compilation
v1                 2                   Just-in-Time (JIT) costing calculated during execution; size field required
v2                 3                   Extended operations and new opcodes
v3                 4                   UnsignedBigInt type and enhanced collection methods

The size field became mandatory in v1 to support forward compatibility—nodes can skip over scripts they cannot fully parse by reading the size and advancing past the unknown content.
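This skip logic can be sketched as follows, assuming the layout described above (header byte, VLQ size, then `size` body bytes); the helper names are illustrative:

```python
# Sketch: how an older node can skip a script with a newer ErgoTree version.
MAX_SUPPORTED_VERSION = 1

def read_vlq(data, pos):
    """Minimal unsigned VLQ decoder: 7 bits per byte, MSB = continuation."""
    result = shift = 0
    while True:
        b = data[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7

def skip_if_unknown(data: bytes, pos: int):
    header = data[pos]
    pos += 1
    version = header & 0x07
    if version <= MAX_SUPPORTED_VERSION:
        return ("parse", pos)            # proceed with normal parsing
    if not (header & 0x08):              # v1+ requires the size flag
        raise ValueError("newer tree without size field")
    size, pos = read_vlq(data, pos)
    return ("skipped", pos + size)       # advance past unknown content

# header 0x0B = version 3 with size flag set; size = 2; two opaque body bytes
print(skip_if_unknown(bytes([0x0B, 0x02, 0xAA, 0xBB]), 0))  # ('skipped', 4)
```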

pub fn setVersionBits(header: HeaderType, version: u3) HeaderType {
    var h = header;
    h.version = version;
    // Size flag required for version > 0
    if (version > 0) {
        h.has_size = true;
    }
    return h;
}

Unparsed Trees

When a node encounters an ErgoTree with an unknown opcode—typically from a newer protocol version—deserialization fails. Rather than rejecting the transaction entirely, the raw bytes are preserved as an "unparsed tree" [7]. This design is critical for soft-fork compatibility: older nodes can process blocks containing newer script versions without understanding their contents.

const UnparsedTree = struct {
    bytes: []const u8,
    err: DeserializationError,
};

/// Convert to proposition, handling unparsed case
pub fn toProposition(self: *const ErgoTree, replace_constants: bool) !SigmaPropValue {
    return switch (self.root) {
        .parsed => |tree| blk: {
            if (replace_constants and self.constants.len > 0) {
                break :blk substConstants(tree, self.constants);
            }
            break :blk tree;
        },
        .unparsed => |u| return u.err,
    };
}

Creating ErgoTrees

pub fn fromProposition(allocator: Allocator, prop: SigmaPropValue) !ErgoTree {
    return fromPropositionWithHeader(allocator, HeaderType.v0(false), prop);
}

pub fn fromPropositionWithHeader(allocator: Allocator, header: HeaderType, prop: SigmaPropValue) !ErgoTree {
    // Simple constants don't need segregation
    if (prop == .sigma_prop_constant) {
        return withoutSegregation(header, prop);
    }
    // Complex expressions benefit from segregation
    return withSegregation(allocator, header, prop);
}

fn withSegregation(allocator: Allocator, header: HeaderType, prop: SigmaPropValue) !ErgoTree {
    var constants = std.ArrayList(Constant).init(allocator);
    const segregated = extractConstants(prop, &constants);
    return ErgoTree{
        .header = .{
            .version = header.version,
            .has_size = header.has_size,
            .constant_segregation = true,
        },
        .constants = try constants.toOwnedSlice(),
        .root = .{ .parsed = segregated },
        .proposition_bytes = null,
    };
}

Template Extraction

The template is the serialized root expression with constants left as placeholder references:

pub fn template(self: *const ErgoTree, allocator: Allocator) ![]u8 {
    // Serialize root with placeholders (no constant substitution)
    var buf = std.ArrayList(u8).init(allocator);
    try self.root.parsed.serialize(buf.writer());
    return try buf.toOwnedSlice();
}

Templates are useful for:

  • Identifying script patterns regardless of constants
  • Contract template matching
  • Caching deserialized templates

Properties

const ErgoTree = struct {
    // ... fields ...

    /// Returns true if tree contains deserialization operations
    pub fn hasDeserialize(self: *const ErgoTree) bool {
        return switch (self.root) {
            .parsed => |p| containsDeserializeOp(p),
            .unparsed => false,
        };
    }

    /// Returns true if tree uses blockchain context
    pub fn isUsingBlockchainContext(self: *const ErgoTree) bool {
        return switch (self.root) {
            .parsed => |p| containsContextOp(p),
            .unparsed => false,
        };
    }

    /// Convert to SigmaBoolean if simple proposition
    pub fn toSigmaBooleanOpt(self: *const ErgoTree) ?SigmaBoolean {
        const prop = self.toProposition(self.header.constant_segregation) catch return null;
        return switch (prop) {
            .sigma_prop_constant => |c| c.value,
            else => null,
        };
    }
};

Summary

This chapter covered the complete ErgoTree binary format—the serialized representation of smart contracts stored in every UTXO box:

  • ErgoTree is a self-sufficient serialized contract format containing everything needed for evaluation without external dependencies
  • The header byte uses a bit-field layout: version (bits 0-2), size flag (bit 3), constant segregation flag (bit 4), with reserved bits for future extensions
  • Constant segregation (bit 4) extracts literal values into a separate array, enabling template sharing, caching, and runtime substitution without re-parsing
  • The version mechanism enables soft-fork protocol upgrades—newer ErgoTree versions are skipped by older nodes rather than causing consensus failures
  • ErgoTree versions 1+ require the size flag, allowing nodes to skip past unknown content
  • UnparsedTree preserves raw bytes when deserialization fails, maintaining block validity even with unknown opcodes
  • Simple cryptographic propositions can be extracted as SigmaBoolean values for direct signature verification

Next: Chapter 4: Value Nodes

Chapter 4: Value Nodes

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations where each node represents a language construct
  • Tree traversal techniques (depth-first evaluation)
  • Prior chapters: Chapter 2 for the type system that governs value types, Chapter 3 for how values are serialized in ErgoTree

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the Value base type and its role as the foundation for all ErgoTree expression nodes
  • Distinguish between different constant value types (primitives, cryptographic, collections)
  • Describe how the eval method implements the evaluation semantics for each node type
  • Work with compound values including collections and tuples

The Value Base Type

ErgoTree is fundamentally an expression tree where every node produces a typed value. The Value base type defines the common interface that all expression nodes share—a type annotation, an opcode for serialization, and an evaluation method that computes the result [1][2].

/// Base type for all ErgoTree expression nodes
const Value = struct {
    tpe: SType,
    op_code: OpCode,

    /// Evaluate this node in the given environment
    pub fn eval(self: *const Value, env: *const DataEnv, evaluator: *Evaluator) !Any {
        // Default: must be overridden
        return error.NotImplemented;
    }

    /// Add fixed cost to accumulator
    pub fn addCost(self: *const Value, evaluator: *Evaluator, cost: FixedCost) void {
        evaluator.addCost(cost, self.op_code);
    }

    /// Add per-item cost for known iteration count
    pub fn addSeqCost(
        self: *const Value,
        evaluator: *Evaluator,
        cost: PerItemCost,
        n_items: usize,
    ) void {
        evaluator.addSeqCost(cost, n_items, self.op_code);
    }
};

Value Hierarchy

Value
├── Constant
│   ├── BooleanConstant (TrueLeaf, FalseLeaf)
│   ├── ByteConstant, ShortConstant, IntConstant, LongConstant
│   ├── BigIntConstant, UnsignedBigIntConstant (v6+)
│   ├── GroupElementConstant
│   ├── SigmaPropConstant
│   ├── CollectionConstant
│   └── UnitConstant
├── ConstantPlaceholder
├── Tuple
├── ConcreteCollection
├── SigmaPropValue
│   ├── BoolToSigmaProp
│   ├── CreateProveDlog
│   ├── CreateProveDHTuple
│   ├── SigmaAnd
│   └── SigmaOr
└── Transformer (collection operations)
    ├── AND, OR, XorOf
    ├── Map, Filter, Fold
    └── Exists, ForAll

The hierarchy divides into several major categories:

  • Constants hold literal values known at compile time
  • ConstantPlaceholder references segregated constants by index (see Chapter 3)
  • Compound values (Tuple, ConcreteCollection) combine multiple values
  • SigmaPropValue nodes produce cryptographic propositions for signing
  • Transformers perform operations on collections

Constant Values

Constants are pre-evaluated values embedded in the tree [3][4]:

const Constant = struct {
    tpe: SType,
    value: Literal,

    pub const COST = FixedCost{ .value = 5 }; // JitCost units

    pub fn eval(self: *const Constant, env: *const DataEnv, E: *Evaluator) Any {
        E.addCost(COST, OpCode.Constant);
        return self.value.toAny();
    }
};

/// Literal values for constants
const Literal = union(enum) {
    boolean: bool,
    byte: i8,
    short: i16,
    int: i32,
    long: i64,
    big_int: BigInt256,
    unsigned_big_int: UnsignedBigInt256,
    group_element: EcPoint,
    sigma_prop: SigmaProp,
    coll: Collection,
    tuple: []const Literal,
    unit: void,

    pub fn toAny(self: Literal) Any {
        return switch (self) {
            .boolean => |b| .{ .boolean = b },
            .int => |i| .{ .int = i },
            // ... other cases
        };
    }
};

Primitive Constant Factories

pub fn intConstant(value: i32) Constant {
    return .{
        .tpe = SType.int,
        .value = .{ .int = value },
    };
}

pub fn longConstant(value: i64) Constant {
    return .{
        .tpe = SType.long,
        .value = .{ .long = value },
    };
}

pub fn byteArrayConstant(bytes: []const u8) Constant {
    return .{
        .tpe = .{ .coll = &SType.byte },
        .value = .{ .coll = .{ .bytes = bytes } },
    };
}

Boolean Singletons

Boolean has special singleton instances for efficiency [5]:

pub const TrueLeaf = Constant{
    .tpe = SType.boolean,
    .value = .{ .boolean = true },
};

pub const FalseLeaf = Constant{
    .tpe = SType.boolean,
    .value = .{ .boolean = false },
};

pub fn booleanConstant(v: bool) *const Constant {
    return if (v) &TrueLeaf else &FalseLeaf;
}

Cryptographic Constants

pub fn groupElementConstant(point: EcPoint) Constant {
    return .{
        .tpe = SType.group_element,
        .value = .{ .group_element = point },
    };
}

pub fn sigmaPropConstant(prop: SigmaProp) Constant {
    return .{
        .tpe = SType.sigma_prop,
        .value = .{ .sigma_prop = prop },
    };
}

/// Group generator - base point G of secp256k1
pub const GroupGenerator = struct {
    pub const COST = FixedCost{ .value = 10 };

    pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) GroupElement {
        E.addCost(COST, OpCode.GroupGenerator);
        return crypto.SECP256K1_GENERATOR;
    }
};

Constant Placeholders

When constant segregation is enabled (Chapter 3), placeholders replace inline constants with index references into the constants array [6][7]. This separation enables template caching—the same expression tree structure can be reused with different constant values. Placeholder evaluation costs less than inline constants because the constant data has already been parsed and validated during ErgoTree deserialization.

const ConstantPlaceholder = struct {
    index: u32,
    tpe: SType,

    pub const COST = FixedCost{ .value = 1 }; // Cheaper than Constant

    pub fn eval(self: *const ConstantPlaceholder, _: *const DataEnv, E: *Evaluator) !Any {
        // Bounds check first (prevents out-of-bounds access)
        if (self.index >= E.constants.len) {
            return error.ConstantIndexOutOfBounds;
        }
        const c = E.constants[self.index];
        E.addCost(COST, OpCode.ConstantPlaceholder);

        // Type check (SType may carry payloads, so compare structurally)
        if (!std.meta.eql(c.tpe, self.tpe)) {
            return error.TypeMismatch;
        }
        return c.value.toAny();
    }
};

Collection Values

ErgoTree supports two kinds of collection nodes, optimized for different use cases:

CollectionConstant

For collections where all elements are known at compile time, CollectionConstant stores the values directly. This enables efficient serialization and avoids evaluation overhead for static data like byte arrays and fixed integer sequences.

const CollectionConstant = struct {
    elem_type: SType,
    items: union(enum) {
        bytes: []const u8,
        ints: []const i32,
        longs: []const i64,
        bools: []const bool,
        any: []const Literal,
    },

    pub fn tpe(self: *const CollectionConstant) SType {
        return .{ .coll = &self.elem_type };
    }
};

ConcreteCollection

When collection elements are computed expressions rather than literals, ConcreteCollection holds references to sub-expression nodes [8]. Each element is evaluated at runtime, making this suitable for dynamically constructed collections.

const ConcreteCollection = struct {
    items: []const *Value,
    elem_type: SType,

    pub const COST = FixedCost{ .value = 20 };

    pub fn eval(self: *const ConcreteCollection, env: *const DataEnv, E: *Evaluator) ![]Any {
        E.addCost(COST, OpCode.ConcreteCollection);

        var result = try E.allocator.alloc(Any, self.items.len);
        for (self.items, 0..) |item, i| {
            result[i] = try item.eval(env, E);
        }
        return result;
    }
};
// NOTE: In production, use a pre-allocated value pool to avoid dynamic
// allocation during evaluation. See ZIGMA_STYLE.md memory management section.

Tuple Values

Heterogeneous fixed-size sequences [9]:

const Tuple = struct {
    items: []const *Value,

    pub const COST = FixedCost{ .value = 15 };

    pub fn tpe(self: *const Tuple, allocator: Allocator) !STuple {
        var types = try allocator.alloc(SType, self.items.len);
        for (self.items, 0..) |item, i| {
            types[i] = item.tpe;
        }
        return STuple{ .items = types };
    }

    pub fn eval(self: *const Tuple, env: *const DataEnv, E: *Evaluator) !TupleValue {
        // Note: v5.0 only supports pairs (2 elements)
        if (self.items.len != 2) {
            return error.InvalidTupleSize;
        }

        const x = try self.items[0].eval(env, E);
        const y = try self.items[1].eval(env, E);
        E.addCost(COST, OpCode.Tuple);

        return .{ x, y };
    }
};

Sigma Proposition Values

BoolToSigmaProp

Converts boolean to cryptographic proposition [10]:

const BoolToSigmaProp = struct {
    input: *Value, // Must be boolean

    pub const COST = FixedCost{ .value = 15 };

    pub fn eval(self: *const BoolToSigmaProp, env: *const DataEnv, E: *Evaluator) !SigmaProp {
        const v = try self.input.eval(env, E);
        E.addCost(COST, OpCode.BoolToSigmaProp);

        return SigmaProp.fromBool(v.boolean);
    }
};

CreateProveDlog

Creates discrete log proposition (standard public key) [11]:

const CreateProveDlog = struct {
    input: *Value, // GroupElement

    pub const COST = FixedCost{ .value = 10 };

    pub fn eval(self: *const CreateProveDlog, env: *const DataEnv, E: *Evaluator) !SigmaProp {
        const point = try self.input.eval(env, E);
        E.addCost(COST, OpCode.ProveDlog);

        return SigmaProp{
            .prove_dlog = ProveDlog{ .h = point.group_element },
        };
    }
};

CreateProveDHTuple

Creates Diffie-Hellman tuple proposition:

const CreateProveDHTuple = struct {
    g: *Value,
    h: *Value,
    u: *Value,
    v: *Value,

    pub const COST = FixedCost{ .value = 20 };

    pub fn eval(self: *const CreateProveDHTuple, env: *const DataEnv, E: *Evaluator) !SigmaProp {
        const g_val = try self.g.eval(env, E);
        const h_val = try self.h.eval(env, E);
        const u_val = try self.u.eval(env, E);
        const v_val = try self.v.eval(env, E);
        E.addCost(COST, OpCode.ProveDHTuple);

        return SigmaProp{
            .prove_dh_tuple = ProveDhTuple{
                .g = g_val.group_element,
                .h = h_val.group_element,
                .u = u_val.group_element,
                .v = v_val.group_element,
            },
        };
    }
};

SigmaAnd / SigmaOr

Combine sigma propositions [12]:

const SigmaAnd = struct {
    items: []const *Value, // SigmaPropValues

    pub const COST = PerItemCost{
        .base = 10,
        .per_chunk = 2,
        .chunk_size = 1,
    };

    pub fn eval(self: *const SigmaAnd, env: *const DataEnv, E: *Evaluator) !SigmaProp {
        var props = try E.allocator.alloc(SigmaProp, self.items.len);
        for (self.items, 0..) |item, i| {
            props[i] = (try item.eval(env, E)).sigma_prop;
        }
        E.addSeqCost(COST, self.items.len, OpCode.SigmaAnd);

        return SigmaProp{ .cand = Cand{ .children = props } };
    }
};

const SigmaOr = struct {
    items: []const *Value,

    pub const COST = PerItemCost{
        .base = 10,
        .per_chunk = 2,
        .chunk_size = 1,
    };

    pub fn eval(self: *const SigmaOr, env: *const DataEnv, E: *Evaluator) !SigmaProp {
        var props = try E.allocator.alloc(SigmaProp, self.items.len);
        for (self.items, 0..) |item, i| {
            props[i] = (try item.eval(env, E)).sigma_prop;
        }
        E.addSeqCost(COST, self.items.len, OpCode.SigmaOr);

        return SigmaProp{ .cor = Cor{ .children = props } };
    }
};

Logical Operations

AND / OR with Short-Circuit

Boolean operations support short-circuit evaluation [13]:

const AND = struct {
    input: *Value, // Collection[Boolean]

    pub const COST = PerItemCost{
        .base = 10,
        .per_chunk = 5,
        .chunk_size = 32,
    };

    pub fn eval(self: *const AND, env: *const DataEnv, E: *Evaluator) !bool {
        const coll = try self.input.eval(env, E);
        const items = coll.coll.bools;

        var result = true;
        var i: usize = 0;

        // Short-circuit: stop on first false
        while (i < items.len and result) {
            result = result and items[i];
            i += 1;
        }

        // Cost based on actual items processed
        E.addSeqCost(COST, i, OpCode.And);
        return result;
    }
};

const OR = struct {
    input: *Value, // Collection[Boolean]

    pub const COST = PerItemCost{
        .base = 10,
        .per_chunk = 5,
        .chunk_size = 32,
    };

    pub fn eval(self: *const OR, env: *const DataEnv, E: *Evaluator) !bool {
        const coll = try self.input.eval(env, E);
        const items = coll.coll.bools;

        var result = false;
        var i: usize = 0;

        // Short-circuit: stop on first true
        while (i < items.len and !result) {
            result = result or items[i];
            i += 1;
        }

        E.addSeqCost(COST, i, OpCode.Or);
        return result;
    }
};
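The key point above is that cost tracks the number of items actually examined, not the collection's length. The sketch below uses an assumed chunked cost model (base cost plus per-chunk cost times the number of 32-item chunks processed); the exact JitCost accounting may differ:

```python
# Short-circuit AND with illustrative per-item costing.
# Cost model assumed: base + per_chunk * ceil(processed / chunk_size).
def eval_and(items, base=10, per_chunk=5, chunk_size=32):
    processed = 0
    result = True
    for b in items:
        processed += 1
        if not b:
            result = False   # short-circuit: stop on first false
            break
    chunks = -(-processed // chunk_size) if processed else 0  # ceil division
    cost = base + per_chunk * chunks
    return result, cost

print(eval_and([True, False] + [True] * 100))  # stops after 2 items
# (False, 15)
```

A false value at index 1 makes the remaining 100 elements free, which is why the cost is charged on `processed` rather than `len(items)`.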

XorOf

XOR over boolean collection:

const XorOf = struct {
    input: *Value,

    pub const COST = PerItemCost{
        .base = 20,
        .per_chunk = 5,
        .chunk_size = 32,
    };

    pub fn eval(self: *const XorOf, env: *const DataEnv, E: *Evaluator) !bool {
        const coll = try self.input.eval(env, E);
        const items = coll.coll.bools;

        var result = false;
        for (items) |b| {
            result = result != b; // XOR
        }

        E.addSeqCost(COST, items.len, OpCode.XorOf);
        return result;
    }
};
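Since XOR is associative and self-inverse, the fold above is true exactly when an odd number of elements are true (note that, unlike AND/OR, it cannot short-circuit). A quick check of that property:

```python
# XorOf over a boolean collection equals "odd number of true elements".
def xor_of(items):
    result = False
    for b in items:
        result = result != b  # XOR
    return result

print(xor_of([True, True, False]))  # False (two trues)
print(xor_of([True, True, True]))   # True  (three trues)
print(xor_of([]))                   # False (empty collection)
```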

Cost Summary

Operation             Cost Type   Value
Constant              Fixed       5
ConstantPlaceholder   Fixed       1
Tuple                 Fixed       15
BoolToSigmaProp       Fixed       15
CreateProveDlog       Fixed       10
CreateProveDHTuple    Fixed       20
GroupGenerator        Fixed       10
AND/OR                PerItem     base=10, chunk=5/32
SigmaAnd/SigmaOr      PerItem     base=10, chunk=2/1
XorOf                 PerItem     base=20, chunk=5/32

Summary

This chapter introduced the value node hierarchy that forms the foundation of ErgoTree's expression tree:

  • Value is the base type for all ErgoTree expression nodes, defining the common interface of type, opcode, and evaluation method
  • Every value carries type information (tpe) used for static type checking and cost information used for bounded execution
  • Constants are pre-evaluated literals embedded in the tree; ConstantPlaceholder provides indirection to segregated constants for template sharing
  • Collection values come in two forms: CollectionConstant for static data and ConcreteCollection for computed elements
  • Sigma proposition values (CreateProveDlog, CreateProveDHTuple, SigmaAnd, SigmaOr) produce cryptographic propositions that require zero-knowledge proofs
  • Boolean operations (AND, OR) support short-circuit evaluation, charging costs only for elements actually processed
  • The eval method on each value type implements its evaluation semantics, transforming the AST node into a runtime value

Next: Chapter 5: Operations and Opcodes

[2] Rust: expr.rs:1-80
[5] Scala: values.scala (TrueLeaf, FalseLeaf)
[8] Scala: values.scala (ConcreteCollection)
[11] Scala: trees.scala (CreateProveDlog)
[12] Scala: trees.scala (SigmaAnd, SigmaOr)

Chapter 5: Operations and Opcodes

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Understanding of bytecode as numeric instruction encodings
  • Single-byte vs multi-byte encoding trade-offs
  • Prior chapters: Chapter 4 for value node types, Chapter 2 for type codes that occupy the lower opcode range

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the opcode encoding scheme and why constants share space with operations
  • Navigate the complete opcode space (0x00-0xFF) and identify operation categories
  • Describe the three cost descriptor types (FixedCost, PerItemCost, TypeBasedCost)
  • Understand how short-circuit evaluation affects cost calculation

Opcode Encoding Scheme

Every ErgoTree operation is identified by a single-byte opcode [1][2]:

Opcode Space Layout:
┌────────────┬───────────────────────────────────────────┐
│ 0x00       │ Reserved (Undefined)                      │
├────────────┼───────────────────────────────────────────┤
│ 0x01-0x6F  │ Constant type codes (optimized encoding)  │
├────────────┼───────────────────────────────────────────┤
│ 0x70       │ Function type marker (LastDataType + 1)   │
├────────────┼───────────────────────────────────────────┤
│ 0x71-0xFF  │ Operation codes (newOpCode 1-143)         │
└────────────┴───────────────────────────────────────────┘

This layout is an optimization: constant values in the range 0x01-0x6F encode their type code directly as the opcode, saving one byte per constant in the serialized tree. The type code simultaneously identifies both what the value is and how to deserialize it. Operations occupy the upper range (0x71-0xFF), providing 143 distinct operation codes.

const OpCode = struct {
    value: u8,

    pub const FIRST_DATA_TYPE: u8 = 1;
    pub const LAST_DATA_TYPE: u8 = 111;
    pub const LAST_CONSTANT_CODE: u8 = 112; // LAST_DATA_TYPE + 1

    pub fn new(shift: u8) OpCode {
        return .{ .value = LAST_CONSTANT_CODE + shift };
    }

    pub fn isConstant(byte: u8) bool {
        return byte >= FIRST_DATA_TYPE and byte <= LAST_CONSTANT_CODE;
    }
};
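The constructor arithmetic can be checked numerically: newOpCode(1) lands at 113 (0x71), matching the TaggedVariable comment below, and the last usable shift is 143, landing at 0xFF:

```python
# Opcode arithmetic from the layout above: operations start right after
# the last constant code (112 = 0x70).
LAST_CONSTANT_CODE = 112

def new_op_code(shift: int) -> int:
    return LAST_CONSTANT_CODE + shift

print(hex(new_op_code(1)))    # 0x71 (TaggedVariable)
print(hex(new_op_code(143)))  # 0xff (last available opcode)
print(255 - 113 + 1)          # 143 distinct operation codes
```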

Opcode Definitions

const OpCodes = struct {
    // Variables (0x71-0x74)
    pub const TaggedVariable = OpCode.new(1);     // 113
    pub const ValUse = OpCode.new(2);             // 114
    pub const ConstantPlaceholder = OpCode.new(3); // 115
    pub const SubstConstants = OpCode.new(4);     // 116

    // Conversions (0x7A-0x7E)
    pub const LongToByteArray = OpCode.new(10);   // 122
    pub const ByteArrayToBigInt = OpCode.new(11); // 123
    pub const ByteArrayToLong = OpCode.new(12);   // 124
    pub const Downcast = OpCode.new(13);          // 125
    pub const Upcast = OpCode.new(14);            // 126

    // Literals (0x7F-0x86)
    pub const True = OpCode.new(15);              // 127
    pub const False = OpCode.new(16);             // 128
    pub const UnitConstant = OpCode.new(17);      // 129
    pub const GroupGenerator = OpCode.new(18);    // 130
    pub const Coll = OpCode.new(19);              // 131
    pub const CollOfBoolConst = OpCode.new(21);   // 133
    pub const Tuple = OpCode.new(22);             // 134

    // Tuple access (0x87-0x8C)
    pub const Select1 = OpCode.new(23);           // 135
    pub const Select2 = OpCode.new(24);           // 136
    pub const Select3 = OpCode.new(25);           // 137
    pub const Select4 = OpCode.new(26);           // 138
    pub const Select5 = OpCode.new(27);           // 139
    pub const SelectField = OpCode.new(28);       // 140

    // Relations (0x8F-0x98)
    pub const Lt = OpCode.new(31);                // 143
    pub const Le = OpCode.new(32);                // 144
    pub const Gt = OpCode.new(33);                // 145
    pub const Ge = OpCode.new(34);                // 146
    pub const Eq = OpCode.new(35);                // 147
    pub const Neq = OpCode.new(36);               // 148
    pub const If = OpCode.new(37);                // 149
    pub const And = OpCode.new(38);               // 150
    pub const Or = OpCode.new(39);                // 151
    pub const AtLeast = OpCode.new(40);           // 152

    // Arithmetic (0x99-0xA2)
    pub const Minus = OpCode.new(41);             // 153
    pub const Plus = OpCode.new(42);              // 154
    pub const Xor = OpCode.new(43);               // 155
    pub const Multiply = OpCode.new(44);          // 156
    pub const Division = OpCode.new(45);          // 157
    pub const Modulo = OpCode.new(46);            // 158
    pub const Exponentiate = OpCode.new(47);      // 159
    pub const MultiplyGroup = OpCode.new(48);     // 160
    pub const Min = OpCode.new(49);               // 161
    pub const Max = OpCode.new(50);               // 162

    // Context (0xA3-0xAC)
    pub const Height = OpCode.new(51);            // 163
    pub const Inputs = OpCode.new(52);            // 164
    pub const Outputs = OpCode.new(53);           // 165
    pub const LastBlockUtxoRootHash = OpCode.new(54); // 166
    pub const Self = OpCode.new(55);              // 167
    pub const MinerPubkey = OpCode.new(60);       // 172

    // Collections (0xAD-0xB8)
    pub const Map = OpCode.new(61);               // 173
    pub const Exists = OpCode.new(62);            // 174
    pub const ForAll = OpCode.new(63);            // 175
    pub const Fold = OpCode.new(64);              // 176
    pub const SizeOf = OpCode.new(65);            // 177
    pub const ByIndex = OpCode.new(66);           // 178
    pub const Append = OpCode.new(67);            // 179
    pub const Slice = OpCode.new(68);             // 180
    pub const Filter = OpCode.new(69);            // 181
    pub const AvlTree = OpCode.new(70);           // 182
    pub const FlatMap = OpCode.new(72);           // 184

    // Box access (0xC1-0xC7)
    pub const ExtractAmount = OpCode.new(81);     // 193
    pub const ExtractScriptBytes = OpCode.new(82); // 194
    pub const ExtractBytes = OpCode.new(83);      // 195
    pub const ExtractBytesWithNoRef = OpCode.new(84); // 196
    pub const ExtractId = OpCode.new(85);         // 197
    pub const ExtractRegisterAs = OpCode.new(86); // 198
    pub const ExtractCreationInfo = OpCode.new(87); // 199

    // Crypto (0xCB-0xD3)
    pub const CalcBlake2b256 = OpCode.new(91);    // 203
    pub const CalcSha256 = OpCode.new(92);        // 204
    pub const ProveDlog = OpCode.new(93);         // 205
    pub const ProveDHTuple = OpCode.new(94);      // 206
    pub const SigmaPropBytes = OpCode.new(96);    // 208
    pub const BoolToSigmaProp = OpCode.new(97);   // 209
    pub const TrivialFalse = OpCode.new(98);      // 210
    pub const TrivialTrue = OpCode.new(99);       // 211

    // Blocks (0xD4-0xDD)
    pub const DeserializeContext = OpCode.new(100); // 212
    pub const DeserializeRegister = OpCode.new(101); // 213
    pub const ValDef = OpCode.new(102);           // 214
    pub const FunDef = OpCode.new(103);           // 215
    pub const BlockValue = OpCode.new(104);       // 216
    pub const FuncValue = OpCode.new(105);        // 217
    pub const FuncApply = OpCode.new(106);        // 218
    pub const PropertyCall = OpCode.new(107);     // 219
    pub const MethodCall = OpCode.new(108);       // 220
    pub const Global = OpCode.new(109);           // 221

    // Options (0xDE-0xE6)
    pub const SomeValue = OpCode.new(110);        // 222
    pub const NoneValue = OpCode.new(111);        // 223
    pub const GetVar = OpCode.new(115);           // 227
    pub const OptionGet = OpCode.new(116);        // 228
    pub const OptionGetOrElse = OpCode.new(117);  // 229
    pub const OptionIsDefined = OpCode.new(118);  // 230

    // Sigma props (0xEA-0xED)
    pub const SigmaAnd = OpCode.new(122);         // 234
    pub const SigmaOr = OpCode.new(123);          // 235
    pub const BinOr = OpCode.new(124);            // 236
    pub const BinAnd = OpCode.new(125);           // 237

    // Bitwise (0xEE-0xF8)
    pub const DecodePoint = OpCode.new(126);      // 238
    pub const LogicalNot = OpCode.new(127);       // 239
    pub const Negation = OpCode.new(128);         // 240
    pub const BitInversion = OpCode.new(129);     // 241
    pub const BitOr = OpCode.new(130);            // 242
    pub const BitAnd = OpCode.new(131);           // 243
    pub const BinXor = OpCode.new(132);           // 244
    pub const BitXor = OpCode.new(133);           // 245
    pub const BitShiftRight = OpCode.new(134);    // 246
    pub const BitShiftLeft = OpCode.new(135);     // 247
    pub const BitShiftRightZeroed = OpCode.new(136); // 248

    // Special (0xFE-0xFF)
    pub const Context = OpCode.new(142);          // 254
    pub const XorOf = OpCode.new(143);            // 255
};

Opcode Categories Summary

Category      Range     Count  Description
Variables     113-116   4      Variable references, placeholders
Conversions   122-126   5      Type conversions
Literals      127-134   8      Boolean, unit, collections
Tuple access  135-140   6      Field selection
Relations     143-152   10     Comparisons, conditionals
Arithmetic    153-162   10     Math operations
Context       163-172   6      Transaction context
Collections   173-184   10     Collection operations
Box access    193-199   7      Box property access
Crypto        203-211   9      Hashing, sigma props
Blocks        212-221   10     Definitions, lambdas
Options       222-230   7      Option operations
Sigma props   234-237   4      Sigma composition
Bitwise       238-248   11     Bit operations

Arithmetic Operations

Arithmetic operations use type-based costing[3][4]:

const ArithOp = struct {
    op_code: OpCode,
    left: *const Value,
    right: *const Value,

    pub fn eval(self: *const ArithOp, env: *const DataEnv, E: *Evaluator) !Any {
        const x = try self.left.eval(env, E);
        const y = try self.right.eval(env, E);

        // Simplified: the exact BigInt cost varies by operation (20 or 30;
        // see the Arithmetic Cost Table below), while primitives are a flat 15.
        const cost = switch (self.left.tpe) {
            .big_int, .unsigned_big_int => 30,
            else => 15,
        };
        E.addCost(FixedCost{ .value = cost }, self.op_code);

        return switch (self.op_code.value) {
            OpCodes.Plus.value => arithPlus(x, y, self.left.tpe),
            OpCodes.Minus.value => arithMinus(x, y, self.left.tpe),
            OpCodes.Multiply.value => arithMultiply(x, y, self.left.tpe),
            OpCodes.Division.value => arithDivision(x, y, self.left.tpe),
            OpCodes.Modulo.value => arithModulo(x, y, self.left.tpe),
            OpCodes.Min.value => arithMin(x, y, self.left.tpe),
            OpCodes.Max.value => arithMax(x, y, self.left.tpe),
            else => error.UnknownOpcode,
        };
    }
};

fn arithPlus(x: Any, y: Any, tpe: SType) !Any {
    // NOTE: ErgoTree arithmetic uses modular (wrapping) semantics for primitives.
    // The +% operator in Zig performs wrapping addition, matching this behavior.
    // In production, use @addWithOverflow for explicit overflow detection when
    // the application requires overflow errors. See ZIGMA_STYLE.md.
    return switch (tpe) {
        .byte => .{ .byte = x.byte +% y.byte },
        .short => .{ .short = x.short +% y.short },
        .int => .{ .int = x.int +% y.int },
        .long => .{ .long = x.long +% y.long },
        .big_int => .{ .big_int = x.big_int.add(y.big_int) },
        else => unreachable,
    };
}
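To make the wrapping semantics concrete, here is a standalone `zig test` sketch (plain Zig, no evaluator types needed):

```zig
const std = @import("std");

test "wrapping addition matches modular ErgoTree semantics" {
    const max: i32 = std.math.maxInt(i32);
    // +% wraps around instead of triggering safety-checked overflow.
    try std.testing.expectEqual(std.math.minInt(i32), max +% 1);
    // @addWithOverflow reports the wrap explicitly when errors are wanted.
    const r = @addWithOverflow(max, @as(i32, 1));
    try std.testing.expectEqual(@as(u1, 1), r[1]); // overflow bit set
}
```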

Arithmetic Cost Table

Operation      Primitive Cost  BigInt Cost
Plus (+)       15              20
Minus (-)      15              20
Multiply (*)   15              30
Division (/)   15              30
Modulo (%)     15              30
Min/Max        15              20

Relation Operations

Comparison operations[5]:

const Relation = struct {
    op_code: OpCode,
    left: *const Value,
    right: *const Value,

    pub fn eval(self: *const Relation, env: *const DataEnv, E: *Evaluator) !bool {
        const lv = try self.left.eval(env, E);
        const rv = try self.right.eval(env, E);

        const cost: u32 = switch (self.op_code.value) {
            OpCodes.Eq.value, OpCodes.Neq.value => 3, // Equality cheap
            else => 15, // Ordering comparisons
        };
        E.addCost(FixedCost{ .value = cost }, self.op_code);

        return switch (self.op_code.value) {
            OpCodes.Lt.value => compare(lv, rv, self.left.tpe) < 0,
            OpCodes.Le.value => compare(lv, rv, self.left.tpe) <= 0,
            OpCodes.Gt.value => compare(lv, rv, self.left.tpe) > 0,
            OpCodes.Ge.value => compare(lv, rv, self.left.tpe) >= 0,
            OpCodes.Eq.value => equalValues(lv, rv),
            OpCodes.Neq.value => !equalValues(lv, rv),
            else => error.UnknownOpcode,
        };
    }
};

Logical Operations

Short-circuit evaluation with per-item cost[6]:

const LogicalAnd = struct {
    input: *const Value, // Collection[Boolean]

    pub const COST = PerItemCost{
        .base = 10,
        .per_chunk = 5,
        .chunk_size = 32,
    };

    pub fn eval(self: *const LogicalAnd, env: *const DataEnv, E: *Evaluator) !bool {
        const coll = try self.input.eval(env, E);
        const items = coll.coll.bools;

        var result = true;
        var i: usize = 0;

        // Short-circuit: stop on first false
        while (i < items.len and result) : (i += 1) {
            result = result and items[i];
        }

        // Cost based on actual items processed
        E.addSeqCost(COST, i, OpCodes.And);
        return result;
    }
};

const BinaryAnd = struct {
    left: *const Value,
    right: *const Value,

    pub const COST = FixedCost{ .value = 20 };

    pub fn eval(self: *const BinaryAnd, env: *const DataEnv, E: *Evaluator) !bool {
        const l = try self.left.eval(env, E);
        E.addCost(COST, OpCodes.BinAnd);

        // Short-circuit: don't evaluate right if left is false
        if (!l.boolean) return false;
        return (try self.right.eval(env, E)).boolean;
    }
};
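Short-circuiting also affects the charged cost: if the first false appears at index 2, only three items are processed, so the collection-AND above charges for a single chunk regardless of the collection's total length. A minimal sketch of the counting logic, independent of the evaluator types (the `itemsProcessed` helper is hypothetical):

```zig
const std = @import("std");

/// Count how many booleans a short-circuiting AND actually inspects.
fn itemsProcessed(items: []const bool) usize {
    var result = true;
    var i: usize = 0;
    while (i < items.len and result) : (i += 1) {
        result = items[i];
    }
    return i;
}

test "short-circuit stops at the first false" {
    const items = [_]bool{ true, true, false, true, true };
    try std.testing.expectEqual(@as(usize, 3), itemsProcessed(&items));
}
```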

Cost Descriptors

Every operation has an associated cost that the interpreter accumulates during evaluation. If the total cost exceeds the block limit, execution fails; this prevents denial-of-service attacks via expensive computations. Three cost descriptor types model different operation characteristics[7]:

/// Fixed cost regardless of input
const FixedCost = struct {
    value: u32, // JitCost units
};

/// Cost scales with input size
const PerItemCost = struct {
    base: u32,       // Fixed overhead
    per_chunk: u32,  // Cost per chunk
    chunk_size: u32, // Items per chunk

    pub fn calculate(self: PerItemCost, n_items: usize) u32 {
        const chunks: u32 = @intCast((n_items + self.chunk_size - 1) / self.chunk_size);
        return self.base + chunks * self.per_chunk;
    }
};

/// Cost depends on operand type
const TypeBasedCost = struct {
    primitive_cost: u32,
    big_int_cost: u32,

    pub fn forType(self: TypeBasedCost, tpe: SType) u32 {
        return switch (tpe) {
            .big_int, .unsigned_big_int => self.big_int_cost,
            else => self.primitive_cost,
        };
    }
};
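A worked example of the chunked formula, assuming the `PerItemCost` struct above is in scope. With the `LogicalAnd` descriptor (base 10, per_chunk 5, chunk_size 32), 100 items fall into ceil(100/32) = 4 chunks:

```zig
const std = @import("std");

test "per-item cost is charged per chunk, not per element" {
    const cost = PerItemCost{ .base = 10, .per_chunk = 5, .chunk_size = 32 };
    try std.testing.expectEqual(@as(u32, 15), cost.calculate(1));   // 1 chunk:  10 + 5
    try std.testing.expectEqual(@as(u32, 15), cost.calculate(32));  // still 1 chunk
    try std.testing.expectEqual(@as(u32, 30), cost.calculate(100)); // 4 chunks: 10 + 20
}
```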

Context Operations

Access transaction context[8]:

const ContextOps = struct {
    pub const Height = struct {
        pub const COST = FixedCost{ .value = 26 };

        pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) i32 {
            E.addCost(COST, OpCodes.Height);
            return E.context.pre_header.height;
        }
    };

    pub const Inputs = struct {
        pub const COST = FixedCost{ .value = 10 };

        pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
            E.addCost(COST, OpCodes.Inputs);
            return E.context.inputs;
        }
    };

    pub const Outputs = struct {
        pub const COST = FixedCost{ .value = 10 };

        pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
            E.addCost(COST, OpCodes.Outputs);
            return E.context.outputs;
        }
    };

    pub const SelfBox = struct {
        pub const COST = FixedCost{ .value = 10 };

        pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) *const Box {
            E.addCost(COST, OpCodes.Self);
            return E.context.self_box;
        }
    };
};

Box Property Access

Extract box properties[9]:

const ExtractAmount = struct {
    box: *const Value,

    pub const COST = FixedCost{ .value = 12 };

    pub fn eval(self: *const ExtractAmount, env: *const DataEnv, E: *Evaluator) !i64 {
        const b = try self.box.eval(env, E);
        E.addCost(COST, OpCodes.ExtractAmount);
        return b.box.value;
    }
};

const ExtractId = struct {
    box: *const Value,

    pub const COST = FixedCost{ .value = 12 };

    pub fn eval(self: *const ExtractId, env: *const DataEnv, E: *Evaluator) ![32]u8 {
        const b = try self.box.eval(env, E);
        E.addCost(COST, OpCodes.ExtractId);
        return b.box.id();
    }
};

const ExtractRegisterAs = struct {
    box: *const Value,
    register_id: u4, // 0-9

    pub const COST = FixedCost{ .value = 12 };

    pub fn eval(self: *const ExtractRegisterAs, env: *const DataEnv, E: *Evaluator) !?Constant {
        const b = try self.box.eval(env, E);
        E.addCost(COST, OpCodes.ExtractRegisterAs);
        return b.box.registers[self.register_id];
    }
};

Cryptographic Operations

Hash and sigma prop operations[10]:

const CalcBlake2b256 = struct {
    input: *const Value, // Coll[Byte]

    pub const COST = PerItemCost{
        .base = 117,
        .per_chunk = 1,
        .chunk_size = 128,
    };

    pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
        const bytes = try self.input.eval(env, E);
        E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcBlake2b256);

        var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
        hasher.update(bytes.coll.bytes);
        return hasher.finalResult();
    }
};

const CalcSha256 = struct {
    input: *const Value,

    pub const COST = PerItemCost{
        .base = 79,
        .per_chunk = 1,
        .chunk_size = 64,
    };

    pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
        const bytes = try self.input.eval(env, E);
        E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcSha256);

        var hasher = std.crypto.hash.sha2.Sha256.init(.{});
        hasher.update(bytes.coll.bytes);
        return hasher.finalResult();
    }
};
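For example, hashing a 32-byte message with blake2b256 costs 117 + ceil(32/128)·1 = 118 JitCost units, while sha256 on the same input costs 79 + ceil(32/64)·1 = 80. A sketch using the cost descriptors above (assumes both structs are in scope):

```zig
const std = @import("std");

test "hash costs scale with input chunks" {
    try std.testing.expectEqual(@as(u32, 118), CalcBlake2b256.COST.calculate(32));
    try std.testing.expectEqual(@as(u32, 80), CalcSha256.COST.calculate(32));
    // A 1 KiB input: blake2b 117 + 8 = 125, sha256 79 + 16 = 95.
    try std.testing.expectEqual(@as(u32, 125), CalcBlake2b256.COST.calculate(1024));
    try std.testing.expectEqual(@as(u32, 95), CalcSha256.COST.calculate(1024));
}
```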

Summary

This chapter detailed the opcode encoding scheme that gives each ErgoTree operation a unique byte identifier:

  • Opcode space is split between constant type codes (0x01-0x6F) and operation codes (0x71-0xFF), with constants using their type code directly to save one byte per value
  • Operation categories group related functionality: variables, conversions, relations, arithmetic, context access, collections, box properties, cryptography, blocks, options, sigma propositions, and bitwise operations
  • Cost descriptors come in three types: FixedCost for constant-time operations, PerItemCost for operations that scale with input size, and TypeBasedCost for operations where BigInt is more expensive than primitive types
  • Short-circuit evaluation in logical operations (AND, OR, BinaryAnd, BinaryOr) stops early when the result is determined, with costs calculated based on actual items processed
  • Context operations provide access to transaction data: HEIGHT, INPUTS, OUTPUTS, SELF box, and miner public key

Next: Chapter 6: Methods on Types

[1] Scala: OpCodes.scala
[4] Rust: bin_op.rs
[6] Scala: trees.scala (AND, OR)
[7] Scala: CostKind.scala
[8] Scala: trees.scala (context operations)
[9] Scala: trees.scala (box accessors)
[10] Scala: trees.scala (crypto operations)

Chapter 6: Methods on Types

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Understanding of method dispatch—how method calls are resolved to specific implementations based on the receiver type
  • Familiarity with type hierarchies and how types can share common method interfaces
  • Prior chapters: Chapter 2 for type codes used in method resolution, Chapter 5 for operations vs methods distinction

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain how methods are organized via MethodsContainer and resolved by type code and method ID
  • Use methods on numeric, collection, box, and cryptographic types
  • Describe the method resolution process from MethodCall to method implementation
  • Access transaction context and blockchain state through context methods

Method Architecture

While Chapter 5 covered standalone operations (arithmetic, comparisons, etc.), ErgoTree also supports methods—operations that belong to specific types. The distinction matters for serialization: operations use opcodes directly, while method calls serialize a type code, method ID, and arguments. This design allows types to have rich APIs without consuming the limited opcode space.

Methods are organized through a MethodsContainer system that groups related methods by their receiver type[1][2]:

Method Organization
══════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────┐
│                    STypeCompanion                              │
│    (type_code: u8, methods: []const SMethodDesc)               │
└───────────────────────┬────────────────────────────────────────┘
                        │
       ┌────────────────┼────────────────┬───────────────────┐
       ▼                ▼                ▼                   ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SNumeric     │ │ SBox         │ │ SColl        │ │ SContext     │
│ TYPE_CODE=2-6│ │ TYPE_CODE=99 │ │ TYPE_CODE=12 │ │ TYPE_CODE=101│
├──────────────┤ ├──────────────┤ ├──────────────┤ ├──────────────┤
│ toByte   (1) │ │ value    (1) │ │ size     (1) │ │ dataInputs(1)│
│ toShort  (2) │ │ propBytes(2) │ │ getOrElse(2) │ │ headers  (2) │
│ toInt    (3) │ │ bytes    (3) │ │ map      (3) │ │ preHeader(3) │
│ toLong   (4) │ │ id       (5) │ │ exists   (4) │ │ INPUTS   (4) │
│ toBigInt (5) │ │ getReg   (7) │ │ forall   (5) │ │ OUTPUTS  (5) │
│ toBytes  (6) │ │ tokens   (8) │ │ fold     (6) │ │ HEIGHT   (6) │
│ ...          │ │ ...          │ │ ...          │ │ ...          │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘

Method Resolution:
  MethodCall(receiver, type_code=99, method_id=1)
       │
       ▼
  resolveMethod(99, 1) → SBoxMethods.VALUE
       │
       ▼
  method.eval(receiver, args, evaluator)

const MethodsContainer = struct {
    type_code: u8,
    methods: []const SMethod,

    pub fn getMethodById(self: *const MethodsContainer, method_id: u8) ?*const SMethod {
        for (self.methods) |*m| {
            if (m.method_id == method_id) return m;
        }
        return null;
    }

    pub fn getMethodByName(self: *const MethodsContainer, name: []const u8) ?*const SMethod {
        for (self.methods) |*m| {
            if (std.mem.eql(u8, m.name, name)) return m;
        }
        return null;
    }
};

const SMethod = struct {
    obj_type: STypeCompanion,
    name: []const u8,
    method_id: u8,
    tpe: SFunc,
    cost_kind: CostKind,
    min_version: ?ErgoTreeVersion = null, // v6+ methods
    deprecated: bool = false,

    pub fn eval(
        self: *const SMethod,
        receiver: Any,
        args: []const Any,
        E: *Evaluator,
    ) !Any {
        // Method dispatch by type_code and method_id
        return try evalMethod(
            self.obj_type.type_code,
            self.method_id,
            receiver,
            args,
            E,
        );
    }
};

Available Method Containers

Type                    Container                Method Count
Byte, Short, Int, Long  SNumericMethods          13
BigInt                  SBigIntMethods           13
UnsignedBigInt (v6+)    SUnsignedBigIntMethods   13
Boolean                 SBooleanMethods          0
GroupElement            SGroupElementMethods     4
SigmaProp               SSigmaPropMethods        2
Box                     SBoxMethods              10
Coll[T]                 SCollectionMethods       20
Option[T]               SOptionMethods           4
Context                 SContextMethods          12
Header                  SHeaderMethods           16
PreHeader               SPreHeaderMethods        8
AvlTree                 SAvlTreeMethods          9
Global                  SGlobalMethods           4

Numeric Type Methods

All numeric types share common methods[3][4]:

const SNumericMethods = struct {
    pub const TYPE_CODE = 0; // Varies by actual type

    // Conversion methods (v5+)
    pub const TO_BYTE = SMethod{
        .method_id = 1,
        .name = "toByte",
        .tpe = SFunc.unary(.{ .type_var = "T" }, .byte),
        .cost_kind = .{ .type_based = .{ .primitive = 5, .big_int = 10 } },
    };

    pub const TO_SHORT = SMethod{ .method_id = 2, .name = "toShort", ... };
    pub const TO_INT = SMethod{ .method_id = 3, .name = "toInt", ... };
    pub const TO_LONG = SMethod{ .method_id = 4, .name = "toLong", ... };
    pub const TO_BIGINT = SMethod{ .method_id = 5, .name = "toBigInt", ... };

    // Binary representation (v6+)
    pub const TO_BYTES = SMethod{
        .method_id = 6,
        .name = "toBytes",
        .tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 5 },
        .min_version = .v3, // v6
    };

    pub const TO_BITS = SMethod{
        .method_id = 7,
        .name = "toBits",
        .tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.boolean }),
        .cost_kind = .{ .fixed = 5 },
        .min_version = .v3,
    };

    // Bitwise operations (v6+)
    pub const BITWISE_INVERSE = SMethod{ .method_id = 8, .name = "bitwiseInverse", ... };
    pub const BITWISE_OR = SMethod{ .method_id = 9, .name = "bitwiseOr", ... };
    pub const BITWISE_AND = SMethod{ .method_id = 10, .name = "bitwiseAnd", ... };
    pub const BITWISE_XOR = SMethod{ .method_id = 11, .name = "bitwiseXor", ... };
    pub const SHIFT_LEFT = SMethod{ .method_id = 12, .name = "shiftLeft", ... };
    pub const SHIFT_RIGHT = SMethod{ .method_id = 13, .name = "shiftRight", ... };
};

Numeric Method Summary

ID  Method          Signature        v5  v6  Description
1   toByte          T => Byte        ✓   ✓   Convert (may overflow)
2   toShort         T => Short       ✓   ✓   Convert (may overflow)
3   toInt           T => Int         ✓   ✓   Convert (may overflow)
4   toLong          T => Long        ✓   ✓   Convert (may overflow)
5   toBigInt        T => BigInt      ✓   ✓   Convert (always safe)
6   toBytes         T => Coll[Byte]  -   ✓   Big-endian bytes
7   toBits          T => Coll[Bool]  -   ✓   Bit representation
8   bitwiseInverse  T => T           -   ✓   Bitwise NOT
9   bitwiseOr       (T,T) => T       -   ✓   Bitwise OR
10  bitwiseAnd      (T,T) => T       -   ✓   Bitwise AND
11  bitwiseXor      (T,T) => T       -   ✓   Bitwise XOR
12  shiftLeft       (T,Int) => T     -   ✓   Left shift
13  shiftRight      (T,Int) => T     -   ✓   Arithmetic right shift

Collection Methods

Collections have the richest method set[5][6]:

const SCollectionMethods = struct {
    pub const TYPE_CODE = 12;

    // Basic access
    pub const SIZE = SMethod{
        .method_id = 1,
        .name = "size",
        .tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .int),
        .cost_kind = .{ .fixed = 14 },
    };

    pub const GET_OR_ELSE = SMethod{
        .method_id = 2,
        .name = "getOrElse",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "T" } },
            .int,
            .{ .type_var = "T" },
        }, .{ .type_var = "T" }),
        .cost_kind = .dynamic,
    };

    // Transformation
    pub const MAP = SMethod{
        .method_id = 3,
        .name = "map",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            SFunc.unary(.{ .type_var = "IV" }, .{ .type_var = "OV" }),
        }, .{ .coll = .{ .type_var = "OV" } }),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };

    pub const FILTER = SMethod{
        .method_id = 8,
        .name = "filter",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            SFunc.unary(.{ .type_var = "IV" }, .boolean),
        }, .{ .coll = .{ .type_var = "IV" } }),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };

    pub const FOLD = SMethod{
        .method_id = 6,
        .name = "fold",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            .{ .type_var = "OV" },
            SFunc.binary(.{ .type_var = "OV" }, .{ .type_var = "IV" }, .{ .type_var = "OV" }),
        }, .{ .type_var = "OV" }),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };

    // Predicates
    pub const EXISTS = SMethod{
        .method_id = 4,
        .name = "exists",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            SFunc.unary(.{ .type_var = "IV" }, .boolean),
        }, .boolean),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };

    pub const FORALL = SMethod{
        .method_id = 5,
        .name = "forall",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            SFunc.unary(.{ .type_var = "IV" }, .boolean),
        }, .boolean),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };

    // Combination
    pub const APPEND = SMethod{
        .method_id = 9,
        .name = "append",
        .tpe = SFunc.binary(
            .{ .coll = .{ .type_var = "IV" } },
            .{ .coll = .{ .type_var = "IV" } },
            .{ .coll = .{ .type_var = "IV" } },
        ),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 100 } },
    };

    pub const SLICE = SMethod{
        .method_id = 10,
        .name = "slice",
        .tpe = SFunc.new(&[_]SType{
            .{ .coll = .{ .type_var = "IV" } },
            .int,
            .int,
        }, .{ .coll = .{ .type_var = "IV" } }),
        .cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 2, .chunk_size = 100 } },
    };

    pub const ZIP = SMethod{
        .method_id = 29,
        .name = "zip",
        .cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 1, .chunk_size = 10 } },
    };

    // Index operations
    pub const INDICES = SMethod{
        .method_id = 14,
        .name = "indices",
        .tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .{ .coll = &SType.int }),
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 128 } },
    };

    pub const INDEX_OF = SMethod{
        .method_id = 26,
        .name = "indexOf",
        .cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
    };
};

Collection Method Summary

ID  Method      v5  v6  Description
1   size        ✓   ✓   Number of elements
2   getOrElse   ✓   ✓   Element with default
3   map         ✓   ✓   Transform elements
4   exists      ✓   ✓   Any match predicate
5   forall      ✓   ✓   All match predicate
6   fold        ✓   ✓   Reduce to single value
7   apply       ✓   ✓   Element at index (panics if OOB)
8   filter      ✓   ✓   Keep matching elements
9   append      ✓   ✓   Concatenate
10  slice       ✓   ✓   Extract range
14  indices     ✓   ✓   Range 0..size-1
15  flatMap     ✓   ✓   Map and flatten
19  patch       ✓   ✓   Replace range
20  updated     ✓   ✓   Replace at index
21  updateMany  ✓   ✓   Batch update
26  indexOf     ✓   ✓   Find element index
29  zip         ✓   ✓   Pair with other collection
30  reverse     -   ✓   Reverse order
31  startsWith  -   ✓   Prefix match
32  endsWith    -   ✓   Suffix match
33  get         -   ✓   Safe element access (returns Option)

Box Methods

Access box properties[7][8]:

const SBoxMethods = struct {
    pub const TYPE_CODE = 99;

    pub const VALUE = SMethod{
        .method_id = 1,
        .name = "value",
        .tpe = SFunc.unary(.box, .long),
        .cost_kind = .{ .fixed = 1 },
    };

    pub const PROPOSITION_BYTES = SMethod{
        .method_id = 2,
        .name = "propositionBytes",
        .tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const BYTES = SMethod{
        .method_id = 3,
        .name = "bytes",
        .tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const ID = SMethod{
        .method_id = 5,
        .name = "id",
        .tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const CREATION_INFO = SMethod{
        .method_id = 6,
        .name = "creationInfo",
        .tpe = SFunc.unary(.box, .{ .tuple = &[_]SType{ .int, .{ .coll = &SType.byte } } }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const TOKENS = SMethod{
        .method_id = 8,
        .name = "tokens",
        .tpe = SFunc.unary(.box, .{
            .coll = &.{ .tuple = &[_]SType{ .{ .coll = &SType.byte }, .long } },
        }),
        .cost_kind = .{ .fixed = 15 },
    };

    // Register access: R0-R9
    pub const R0 = SMethod{ .method_id = 10, .name = "R0", ... };
    pub const R1 = SMethod{ .method_id = 11, .name = "R1", ... };
    pub const R2 = SMethod{ .method_id = 12, .name = "R2", ... };
    pub const R3 = SMethod{ .method_id = 13, .name = "R3", ... };
    pub const R4 = SMethod{ .method_id = 14, .name = "R4", ... };
    pub const R5 = SMethod{ .method_id = 15, .name = "R5", ... };
    pub const R6 = SMethod{ .method_id = 16, .name = "R6", ... };
    pub const R7 = SMethod{ .method_id = 17, .name = "R7", ... };
    pub const R8 = SMethod{ .method_id = 18, .name = "R8", ... };
    pub const R9 = SMethod{ .method_id = 19, .name = "R9", ... };
};

Context Methods

Access transaction context[9][10]:

const SContextMethods = struct {
    pub const TYPE_CODE = 101;

    pub const DATA_INPUTS = SMethod{
        .method_id = 1,
        .name = "dataInputs",
        .tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
        .cost_kind = .{ .fixed = 15 },
    };

    pub const HEADERS = SMethod{
        .method_id = 2,
        .name = "headers",
        .tpe = SFunc.unary(.context, .{ .coll = &SType.header }),
        .cost_kind = .{ .fixed = 15 },
    };

    pub const PRE_HEADER = SMethod{
        .method_id = 3,
        .name = "preHeader",
        .tpe = SFunc.unary(.context, .pre_header),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const INPUTS = SMethod{
        .method_id = 4,
        .name = "INPUTS",
        .tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const OUTPUTS = SMethod{
        .method_id = 5,
        .name = "OUTPUTS",
        .tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const HEIGHT = SMethod{
        .method_id = 6,
        .name = "HEIGHT",
        .tpe = SFunc.unary(.context, .int),
        .cost_kind = .{ .fixed = 26 },
    };

    pub const SELF = SMethod{
        .method_id = 7,
        .name = "SELF",
        .tpe = SFunc.unary(.context, .box),
        .cost_kind = .{ .fixed = 10 },
    };

    pub const GET_VAR = SMethod{
        .method_id = 8,
        .name = "getVar",
        .tpe = SFunc.new(&[_]SType{ .context, .byte }, .{ .option = .{ .type_var = "T" } }),
        .cost_kind = .dynamic,
    };
};

GroupElement Methods

Elliptic curve operations[11][12]:

const SGroupElementMethods = struct {
    pub const TYPE_CODE = 7;

    pub const GET_ENCODED = SMethod{
        .method_id = 2,
        .name = "getEncoded",
        .tpe = SFunc.unary(.group_element, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 250 },
    };

    pub const EXP = SMethod{
        .method_id = 3,
        .name = "exp",
        .tpe = SFunc.binary(.group_element, .big_int, .group_element),
        .cost_kind = .{ .fixed = 900 },
    };

    pub const MULTIPLY = SMethod{
        .method_id = 4,
        .name = "multiply",
        .tpe = SFunc.binary(.group_element, .group_element, .group_element),
        .cost_kind = .{ .fixed = 40 },
    };

    pub const NEGATE = SMethod{
        .method_id = 5,
        .name = "negate",
        .tpe = SFunc.unary(.group_element, .group_element),
        .cost_kind = .{ .fixed = 45 },
    };
};

SigmaProp Methods

Cryptographic proposition operations [13]:

const SSigmaPropMethods = struct {
    pub const TYPE_CODE = 8;

    pub const PROP_BYTES = SMethod{
        .method_id = 1,
        .name = "propBytes",
        .tpe = SFunc.unary(.sigma_prop, .{ .coll = &SType.byte }),
        .cost_kind = .{ .fixed = 35 },
    };

    pub const IS_PROVEN = SMethod{
        .method_id = 2,
        .name = "isProven",
        .tpe = SFunc.unary(.sigma_prop, .boolean),
        .cost_kind = .{ .fixed = 10 },
        .deprecated = true, // Use in scripts only
    };
};

Method Resolution

Method lookup by type code and method ID:

pub fn resolveMethod(type_code: u8, method_id: u8) ?*const SMethod {
    const container = switch (type_code) {
        2, 3, 4, 5 => &SNumericMethods,  // Byte, Short, Int, Long
        6 => &SBigIntMethods,
        7 => &SGroupElementMethods,
        8 => &SSigmaPropMethods,
        9 => &SUnsignedBigIntMethods,     // v6+
        12...23 => &SCollectionMethods,   // Coll[T]
        36...47 => &SOptionMethods,       // Option[T]
        99 => &SBoxMethods,
        100 => &SAvlTreeMethods,
        101 => &SContextMethods,
        104 => &SHeaderMethods,
        105 => &SPreHeaderMethods,
        106 => &SGlobalMethods,
        else => return null,
    };
    return container.getMethodById(method_id);
}
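Because resolution keys on the pair (type code, method ID) rather than a global opcode, lookup is just a table walk. The following non-authoritative Python sketch models the same dispatch; the container names follow this chapter, but the dict literal itself is illustrative, not a complete registry:

```python
# Hypothetical model of the two-level (type_code, method_id) dispatch.
# Ranges mirror resolveMethod above: all Coll[T] codes (12-23) share one
# container, so method IDs stay scoped per type instead of per opcode.
CONTAINERS = {
    **{tc: "SNumericMethods" for tc in (2, 3, 4, 5)},
    7: "SGroupElementMethods",
    8: "SSigmaPropMethods",
    **{tc: "SCollectionMethods" for tc in range(12, 24)},
    **{tc: "SOptionMethods" for tc in range(36, 48)},
    99: "SBoxMethods",
    101: "SContextMethods",
}

def resolve_method(type_code: int, method_id: int):
    """Return (container, method_id), or None when the type has no methods."""
    container = CONTAINERS.get(type_code)
    if container is None:
        return None
    return (container, method_id)

print(resolve_method(16, 1))   # Coll[Int] = 12 + 4 = 16 -> ('SCollectionMethods', 1)
print(resolve_method(200, 1))  # unknown type code -> None
```

The real implementations then ask the container for the method descriptor by ID; here the second lookup step is elided.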

Method Call Evaluation

const MethodCall = struct {
    receiver_type: SType,
    method: *const SMethod,
    receiver: *const Value,
    args: []const *Value,

    pub fn eval(self: *const MethodCall, env: *const DataEnv, E: *Evaluator) !Any {
        // Evaluate receiver
        const recv = try self.receiver.eval(env, E);

        // Evaluate arguments
        var arg_values = try E.allocator.alloc(Any, self.args.len);
        for (self.args, 0..) |arg, i| {
            arg_values[i] = try arg.eval(env, E);
        }

        // Add cost
        E.addCost(self.method.cost_kind, self.method.op_code);

        // Dispatch to method implementation
        return try self.method.eval(recv, arg_values, E);
    }
};

Summary

This chapter covered the method system that extends ErgoTree types with rich APIs:

  • MethodsContainer organizes methods per type, with each method having a unique ID (1-255) within its container
  • Method resolution uses the receiver's type code and the method ID to locate the implementation, avoiding opcode space consumption
  • Numeric methods provide type conversions (toByte, toInt, toLong, toBigInt) shared across all numeric types, with v6 adding bitwise operations and byte representation
  • Collection methods form the richest API with transformation (map, filter, fold), predicates (exists, forall), and combination operations (append, slice, zip)
  • Box methods access UTXO properties: value (nanoERGs), tokens, propositionBytes, and registers R0-R9
  • Context methods provide access to transaction data: INPUTS, OUTPUTS, HEIGHT, SELF, dataInputs, headers, and context variables via getVar
  • Cryptographic methods on GroupElement support elliptic curve operations (exp, multiply, negate) and SigmaProp provides propBytes for serialization

Next: Chapter 7: Serialization Framework

[1] Scala: methods.scala
[2] Rust: smethod.rs:36-99 (SMethod, SMethodDesc)
[4] Rust: snumeric.rs
[6] Rust: scoll.rs:22-266 (METHOD_DESC, method IDs)
[7] Scala: methods.scala (SBoxMethods)
[8] Rust: sbox.rs:29-92 (VALUE_METHOD, GET_REG_METHOD, TOKENS_METHOD)
[9] Scala: methods.scala (SContextMethods)
[10] Rust: scontext.rs
[11] Scala: methods.scala (SGroupElementMethods)
[12] Rust: sgroup_elem.rs
[13] Scala: methods.scala (SSigmaPropMethods)

Chapter 7: Serialization Framework

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Binary encoding concepts (bits, bytes, big-endian vs little-endian)
  • Familiarity with variable-length encoding techniques and their space-efficiency trade-offs
  • Prior chapters: Chapter 2 for type codes, Chapter 5 for opcodes

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain VLQ (Variable-Length Quantity) encoding and how it achieves compact integer representation
  • Describe ZigZag encoding and why it improves VLQ efficiency for signed integers
  • Implement type serialization using the type code embedding scheme
  • Use SigmaByteReader and SigmaByteWriter for type-aware serialization

Serialization Architecture

Blockchain storage is expensive: every byte of an ErgoTree increases transaction fees and network bandwidth. The serialization framework therefore prioritizes compactness while maintaining determinism (identical inputs must produce identical outputs across all implementations). The system uses a layered design where each layer handles a specific concern [1][2]:

┌─────────────────────────────────────────────────┐
│              Application Layer                  │
│      (ErgoTree, Box, Transaction)               │
├─────────────────────────────────────────────────┤
│            Value Serializers                    │
│      (ConstantSerializer, MethodCall)           │
├─────────────────────────────────────────────────┤
│         SigmaByteReader/Writer                  │
│      (Type-aware, constant store)               │
├─────────────────────────────────────────────────┤
│           VLQ Encoding Layer                    │
│      (Variable-length integers)                 │
├─────────────────────────────────────────────────┤
│            Byte Buffer I/O                      │
│      (Raw read/write operations)                │
└─────────────────────────────────────────────────┘

Base Serializer Interface

const SigmaSerializer = struct {
    pub const MAX_PROPOSITION_SIZE: usize = 4096;
    pub const MAX_TREE_DEPTH: u32 = 110;

    pub fn toBytes(comptime T: type, obj: T, allocator: Allocator) ![]u8 {
        var list = std.ArrayList(u8).init(allocator);
        var writer = SigmaByteWriter.init(&list);
        try T.serialize(obj, &writer);
        return list.toOwnedSlice();
    }

    pub fn fromBytes(comptime T: type, bytes: []const u8) !T {
        var reader = SigmaByteReader.init(bytes);
        return try T.deserialize(&reader);
    }
};

VLQ Encoding

Variable-Length Quantity (VLQ) represents integers compactly [3][4]:

Value Range              Bytes   Format
─────────────────────────────────────────────────
0 - 127                  1       0xxxxxxx
128 - 16,383             2       1xxxxxxx 0xxxxxxx
16,384 - 2,097,151       3       1xxxxxxx 1xxxxxxx 0xxxxxxx
2,097,152 - 268,435,455  4       1xxxxxxx 1xxxxxxx 1xxxxxxx 0xxxxxxx
...                      ...     ...

Each byte uses 7 bits for data; MSB is continuation flag:

  • 0 = final byte
  • 1 = more bytes follow
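The book's reference code is Zig, but the scheme is language-neutral. As a hedged cross-check, here is the same 7-bits-plus-continuation algorithm in Python (the helper names are ours, not from any SDK):

```python
def vlq_encode(value: int) -> bytes:
    """Encode an unsigned integer as VLQ: 7 data bits per byte, MSB = continue."""
    assert value >= 0
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation flag set
        value >>= 7
    out.append(value)                      # final byte, MSB clear
    return bytes(out)

def vlq_decode(data: bytes) -> int:
    """Decode a VLQ-encoded unsigned integer from the front of `data`."""
    result = shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if (b & 0x80) == 0:
            return result
        shift += 7
    raise ValueError("truncated VLQ")

print(vlq_encode(300).hex())    # ac02
print(vlq_decode(b"\xac\x02"))  # 300
```

Any implementation in any language must agree byte-for-byte with this mapping; that is what makes the format deterministic.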

VLQ Implementation

const VlqEncoder = struct {
    /// Write unsigned integer using VLQ encoding
    pub fn putUInt(writer: anytype, value: u64) !void {
        var v = value;
        while (v >= 0x80) {
            try writer.writeByte(@intCast((v & 0x7F) | 0x80));
            v >>= 7;
        }
        try writer.writeByte(@intCast(v & 0x7F));
    }

    /// Read unsigned integer using VLQ decoding.
    /// A u64 needs at most ceil(64/7) = 10 bytes.
    pub fn getUInt(reader: anytype) !u64 {
        const MAX_VLQ_BYTES: usize = 10;
        var result: u64 = 0;
        var shift: u6 = 0;
        var byte_count: usize = 0;
        while (byte_count < MAX_VLQ_BYTES) : (byte_count += 1) {
            const b = try reader.readByte();
            result |= @as(u64, b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            // shift tops out at 63 (the u6 maximum); guard the final increment
            if (byte_count < MAX_VLQ_BYTES - 1) shift += 7;
        }
        return error.VlqTooLong;
    }
};
// NOTE: In production, VLQ decoding should use compile-time assertions to
// verify max byte counts. See ZIGMA_STYLE.md for bounded iteration patterns.

VLQ Size by Value Range

Unsigned Value           Bytes
─────────────────────────────────
0 - 127                  1
128 - 16,383             2
16,384 - 2,097,151       3
2,097,152 - 268M         4
268M - 34B               5
34B - 4T                 6
4T - 562T                7
562T - 72P               8
72P - 9E                 9
> 9E                     10

ZigZag Encoding

VLQ assumes non-negative values: it encodes the magnitude directly. For signed integers like -1, the two's complement representation has all high bits set, resulting in maximum VLQ length. ZigZag encoding solves this by mapping signed values to unsigned in a way that preserves magnitude: small positive and negative numbers both produce small unsigned values [5][6]:

Signed    ZigZag Encoded
────────────────────────
 0        0
-1        1
 1        2
-2        3
 2        4
-3        5
 n        2n       (n >= 0)
-n        2n - 1   (n > 0)

ZigZag Implementation

const ZigZag = struct {
    /// Encode signed 32-bit to unsigned
    pub fn encode32(n: i32) u64 {
        // Arithmetic right shift replicates the sign bit. Compute in 32 bits,
        // then zero-extend: widening to i64 first would sign-extend negative
        // results and corrupt the upper 32 bits of the encoding.
        const z: u32 = @bitCast((n << 1) ^ (n >> 31));
        return z;
    }

    /// Decode unsigned back to signed 32-bit
    pub fn decode32(n: u64) i32 {
        const v: u32 = @intCast(n);
        return @as(i32, @intCast(v >> 1)) ^ -@as(i32, @intCast(v & 1));
    }

    /// Encode signed 64-bit to unsigned
    pub fn encode64(n: i64) u64 {
        return @bitCast((n << 1) ^ (n >> 63));
    }

    /// Decode unsigned back to signed 64-bit
    pub fn decode64(n: u64) i64 {
        return @as(i64, @intCast(n >> 1)) ^ -@as(i64, @intCast(n & 1));
    }
};

ZigZag ensures small-magnitude signed values use few bytes:

Value     ZigZag    VLQ Bytes
─────────────────────────────
 0        0         1
-1        1         1
 1        2         1
-64       127       1
 64       128       2
-65       129       2
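These byte counts are easy to verify mechanically. A small Python sketch (the helper names are ours; 32-bit wraparound is simulated with a mask):

```python
def zigzag32(n: int) -> int:
    """Map a signed 32-bit value to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF

def vlq_len(value: int) -> int:
    """Number of VLQ bytes needed for an unsigned value (7 data bits per byte)."""
    count = 1
    while value >= 0x80:
        value >>= 7
        count += 1
    return count

# -64 encodes to 127 (one byte); 64 encodes to 128 (two bytes),
# matching the table above.
for v in (0, -1, 1, -64, 64, -65):
    print(v, zigzag32(v), vlq_len(zigzag32(v)))
```

Note the asymmetry at the boundary: because negatives map to odd codes, -64 still fits in one byte while +64 does not.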

SigmaByteWriter

The writer handles type-aware serialization with cost tracking [7][8]:

const SigmaByteWriter = struct {
    buffer: *std.ArrayList(u8),
    constant_store: ?*ConstantStore,
    tree_version: ErgoTreeVersion,

    pub fn init(buffer: *std.ArrayList(u8)) SigmaByteWriter {
        return .{
            .buffer = buffer,
            .constant_store = null,
            .tree_version = .v0,
        };
    }

    /// Write single byte
    pub fn putByte(self: *SigmaByteWriter, b: u8) !void {
        try self.buffer.append(b);
    }

    /// Write byte slice
    pub fn putBytes(self: *SigmaByteWriter, bytes: []const u8) !void {
        try self.buffer.appendSlice(bytes);
    }

    /// Write unsigned integer (VLQ encoded)
    pub fn putUInt(self: *SigmaByteWriter, value: u64) !void {
        try VlqEncoder.putUInt(self.buffer.writer(), value);
    }

    /// Write signed short (ZigZag + VLQ)
    pub fn putShort(self: *SigmaByteWriter, value: i16) !void {
        try self.putUInt(ZigZag.encode32(value));
    }

    /// Write signed int (ZigZag + VLQ)
    pub fn putInt(self: *SigmaByteWriter, value: i32) !void {
        try self.putUInt(ZigZag.encode32(value));
    }

    /// Write signed long (ZigZag + VLQ)
    pub fn putLong(self: *SigmaByteWriter, value: i64) !void {
        try self.putUInt(ZigZag.encode64(value));
    }

    /// Write type descriptor
    pub fn putType(self: *SigmaByteWriter, tpe: SType) !void {
        try TypeSerializer.serialize(tpe, self);
    }

    /// Write value with optional constant extraction
    pub fn putValue(self: *SigmaByteWriter, value: *const Value) !void {
        if (self.constant_store) |store| {
            if (value.isConstant()) {
                const idx = try store.put(value.asConstant());
                try self.putByte(OpCode.ConstantPlaceholder.value);
                try self.putUInt(idx);
                return;
            }
        }
        try ValueSerializer.serialize(value, self);
    }
};

SigmaByteReader

The reader provides type-aware deserialization [9][10]:

const SigmaByteReader = struct {
    data: []const u8,
    pos: usize,
    constant_store: ConstantStore,
    substitute_placeholders: bool,
    val_def_type_store: ValDefTypeStore,
    tree_version: ErgoTreeVersion,

    pub fn init(data: []const u8) SigmaByteReader {
        return .{
            .data = data,
            .pos = 0,
            .constant_store = ConstantStore.empty(),
            .substitute_placeholders = false,
            .val_def_type_store = ValDefTypeStore.init(),
            .tree_version = .v0,
        };
    }

    pub fn initWithStore(data: []const u8, store: ConstantStore) SigmaByteReader {
        var reader = init(data);
        reader.constant_store = store;
        reader.substitute_placeholders = true;
        return reader;
    }

    /// Read single byte
    pub fn getByte(self: *SigmaByteReader) !u8 {
        if (self.pos >= self.data.len) return error.EndOfStream;
        const b = self.data[self.pos];
        self.pos += 1;
        return b;
    }

    /// Read byte slice
    pub fn getBytes(self: *SigmaByteReader, n: usize) ![]const u8 {
        if (self.pos + n > self.data.len) return error.EndOfStream;
        const slice = self.data[self.pos..][0..n];
        self.pos += n;
        return slice;
    }

    /// Read unsigned integer (VLQ)
    pub fn getUInt(self: *SigmaByteReader) !u64 {
        return VlqEncoder.getUInt(self);
    }

    /// Read signed short (VLQ + ZigZag)
    pub fn getShort(self: *SigmaByteReader) !i16 {
        const v = try self.getUInt();
        return @intCast(ZigZag.decode32(v));
    }

    /// Read signed int (VLQ + ZigZag)
    pub fn getInt(self: *SigmaByteReader) !i32 {
        return ZigZag.decode32(try self.getUInt());
    }

    /// Read signed long (VLQ + ZigZag)
    pub fn getLong(self: *SigmaByteReader) !i64 {
        return ZigZag.decode64(try self.getUInt());
    }

    /// Read type descriptor
    pub fn getType(self: *SigmaByteReader) !SType {
        return TypeSerializer.deserialize(self);
    }

    /// Read value expression
    pub fn getValue(self: *SigmaByteReader) !*Value {
        return ValueSerializer.deserialize(self);
    }

    /// Remaining bytes available
    pub fn remaining(self: *const SigmaByteReader) usize {
        return self.data.len - self.pos;
    }

    // Reader interface for VLQ
    pub fn readByte(self: *SigmaByteReader) !u8 {
        return self.getByte();
    }
};

Constant Store

Manages constants during ErgoTree serialization [11]:

const ConstantStore = struct {
    constants: []const Constant,
    extracted: std.ArrayList(Constant),

    pub fn empty() ConstantStore {
        return .{
            .constants = &.{},
            .extracted = undefined,
        };
    }

    pub fn init(constants: []const Constant, allocator: Allocator) ConstantStore {
        return .{
            .constants = constants,
            .extracted = std.ArrayList(Constant).init(allocator),
        };
    }

    /// Get constant by index
    pub fn get(self: *const ConstantStore, index: usize) !Constant {
        if (index >= self.constants.len) return error.IndexOutOfBounds;
        return self.constants[index];
    }

    /// Store constant during extraction, return index
    pub fn put(self: *ConstantStore, c: Constant) !u32 {
        const idx = self.extracted.items.len;
        try self.extracted.append(c);
        return @intCast(idx);
    }
};
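Conceptually, constant segregation replaces each constant encountered during serialization with a placeholder carrying its store index, in encounter order. A toy Python sketch of that flow (the placeholder value and the use of plain ints as "constants" are illustrative only; the real opcode is ConstantPlaceholder):

```python
PLACEHOLDER = 0xFF  # stand-in marker for this sketch, not the real opcode value

def segregate(exprs):
    """Replace constants with (PLACEHOLDER, index) and collect them in order.

    Mirrors ConstantStore.put: constants are appended unconditionally,
    so duplicates each get their own index.
    """
    store, template = [], []
    for e in exprs:
        if isinstance(e, int):            # pretend ints are the constants
            store.append(e)
            template.append((PLACEHOLDER, len(store) - 1))
        else:
            template.append(e)
    return template, store

tmpl, consts = segregate([100, "HEIGHT", ">", 100])
print(tmpl)    # [(255, 0), 'HEIGHT', '>', (255, 1)]
print(consts)  # [100, 100]
```

The payoff is that the template bytes become independent of the constant values, which enables cheap constant substitution without re-serializing the whole tree.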

Type Serialization

Types use a compact encoding scheme based on type codes [12][13]:

Type Code Space
───────────────────────────────────────────────────────────
 1-11    Primitive embeddable types
12-23    Coll[primitive]           (12 + primCode)
24-35    Coll[Coll[primitive]]     (24 + primCode)
36-47    Option[primitive]         (36 + primCode)
48-59    Option[Coll[primitive]]   (48 + primCode)
60-71    (primitive, T2) pairs     (60 + primCode)
72-83    (T1, primitive) pairs     (72 + primCode)
84-95    (primitive, primitive)    (84 + primCode) symmetric
96       Tuple (generic)
97-106   Object types (Any, Unit, Box, ...)
112      SFunc (v6+)
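The partitioning above is plain integer arithmetic: each container region contributes a base code and the embedded primitive contributes an offset. A short Python check (the constants mirror the table; the function names are ours):

```python
PRIM_RANGE = 12  # codes reserved per container region (MAX_PRIM + 1)

def embed(container_base: int, prim_code: int) -> int:
    """Pack container + primitive into one byte, e.g. Coll[Int] = 12 + 4."""
    return container_base + prim_code

def unpack(code: int):
    """Recover (container_base, prim_code); 0 means generic / read next byte."""
    return (code // PRIM_RANGE) * PRIM_RANGE, code % PRIM_RANGE

COLL, OPTION, INT, LONG = 12, 36, 4, 5
print(embed(COLL, INT))     # 16 -> Coll[Int] in a single byte
print(embed(OPTION, LONG))  # 41 -> Option[Long]
print(unpack(16))           # (12, 4)
```

Division and modulo by the region width undo the addition, which is why a single byte can name both the container and its element type.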

Type Code Constants

const TypeCode = struct {
    value: u8,

    // Primitive types (embeddable)
    pub const BOOLEAN: u8 = 1;
    pub const BYTE: u8 = 2;
    pub const SHORT: u8 = 3;
    pub const INT: u8 = 4;
    pub const LONG: u8 = 5;
    pub const BIGINT: u8 = 6;
    pub const GROUP_ELEMENT: u8 = 7;
    pub const SIGMA_PROP: u8 = 8;
    pub const UNSIGNED_BIGINT: u8 = 9;

    // Type constructor bases
    pub const MAX_PRIM: u8 = 11;
    pub const PRIM_RANGE: u8 = 12;  // MAX_PRIM + 1
    pub const COLL: u8 = 12;
    pub const NESTED_COLL: u8 = 24;
    pub const OPTION: u8 = 36;
    pub const OPTION_COLL: u8 = 48;
    pub const TUPLE_PAIR1: u8 = 60;
    pub const TUPLE_PAIR2: u8 = 72;
    pub const TUPLE_SYMMETRIC: u8 = 84;
    pub const TUPLE: u8 = 96;

    // Object types
    pub const ANY: u8 = 97;
    pub const UNIT: u8 = 98;
    pub const BOX: u8 = 99;
    pub const AVL_TREE: u8 = 100;
    pub const CONTEXT: u8 = 101;
    pub const STRING: u8 = 102;
    pub const TYPE_VAR: u8 = 103;
    pub const HEADER: u8 = 104;
    pub const PRE_HEADER: u8 = 105;
    pub const GLOBAL: u8 = 106;
    pub const FUNC: u8 = 112;

    /// Embed primitive type into container code
    pub fn embed(container_base: u8, prim_code: u8) u8 {
        return container_base + prim_code;
    }

    /// Extract container and primitive from combined code
    pub fn unpack(code: u8) struct { container: ?u8, primitive: ?u8 } {
        if (code >= TUPLE) return .{ .container = null, .primitive = null };
        const container_id = (code / PRIM_RANGE) * PRIM_RANGE;
        const type_id = code % PRIM_RANGE;
        return .{
            .container = if (container_id == 0) null else container_id,
            .primitive = if (type_id == 0) null else type_id,
        };
    }
};

Type Serializer

const TypeSerializer = struct {
    pub fn serialize(tpe: SType, w: *SigmaByteWriter) !void {
        switch (tpe) {
            // Primitives - single byte
            .boolean => try w.putByte(TypeCode.BOOLEAN),
            .byte => try w.putByte(TypeCode.BYTE),
            .short => try w.putByte(TypeCode.SHORT),
            .int => try w.putByte(TypeCode.INT),
            .long => try w.putByte(TypeCode.LONG),
            .big_int => try w.putByte(TypeCode.BIGINT),
            .group_element => try w.putByte(TypeCode.GROUP_ELEMENT),
            .sigma_prop => try w.putByte(TypeCode.SIGMA_PROP),
            .unsigned_big_int => try w.putByte(TypeCode.UNSIGNED_BIGINT),

            // Object types
            .box => try w.putByte(TypeCode.BOX),
            .avl_tree => try w.putByte(TypeCode.AVL_TREE),
            .context => try w.putByte(TypeCode.CONTEXT),
            .header => try w.putByte(TypeCode.HEADER),
            .pre_header => try w.putByte(TypeCode.PRE_HEADER),
            .global => try w.putByte(TypeCode.GLOBAL),
            .unit => try w.putByte(TypeCode.UNIT),
            .any => try w.putByte(TypeCode.ANY),

            // Collections
            .coll => |elem| {
                if (elem.isEmbeddable()) {
                    // Single byte: Coll[primitive]
                    try w.putByte(TypeCode.embed(TypeCode.COLL, elem.typeCode()));
                } else if (elem.* == .coll) {
                    const inner = elem.coll;
                    if (inner.isEmbeddable()) {
                        // Single byte: Coll[Coll[primitive]]
                        try w.putByte(TypeCode.embed(TypeCode.NESTED_COLL, inner.typeCode()));
                    } else {
                        try w.putByte(TypeCode.COLL);
                        try serialize(elem.*, w);
                    }
                } else {
                    try w.putByte(TypeCode.COLL);
                    try serialize(elem.*, w);
                }
            },

            // Options
            .option => |elem| {
                if (elem.isEmbeddable()) {
                    try w.putByte(TypeCode.embed(TypeCode.OPTION, elem.typeCode()));
                } else if (elem.* == .coll) {
                    const inner = elem.coll;
                    if (inner.isEmbeddable()) {
                        try w.putByte(TypeCode.embed(TypeCode.OPTION_COLL, inner.typeCode()));
                    } else {
                        try w.putByte(TypeCode.OPTION);
                        try serialize(elem.*, w);
                    }
                } else {
                    try w.putByte(TypeCode.OPTION);
                    try serialize(elem.*, w);
                }
            },

            // Tuples (pairs)
            .tuple => |items| {
                if (items.len == 2) {
                    try serializePair(items[0], items[1], w);
                } else {
                    try w.putByte(TypeCode.TUPLE);
                    try w.putByte(@intCast(items.len));
                    for (items) |item| {
                        try serialize(item, w);
                    }
                }
            },

            // Functions (v6+)
            .func => |f| {
                try w.putByte(TypeCode.FUNC);
                try w.putByte(@intCast(f.t_dom.len));
                for (f.t_dom) |arg| try serialize(arg, w);
                try serialize(f.t_range.*, w);
                try w.putByte(@intCast(f.tpe_params.len));
                for (f.tpe_params) |p| {
                    try w.putByte(TypeCode.TYPE_VAR);
                    try w.putBytes(p.name);
                }
            },

            else => return error.UnsupportedType,
        }
    }

    fn serializePair(t1: SType, t2: SType, w: *SigmaByteWriter) !void {
        const e1 = t1.isEmbeddable();
        const e2 = t2.isEmbeddable();

        if (e1 and e2 and std.meta.eql(t1, t2)) {
            // Symmetric pair: (Int, Int)
            try w.putByte(TypeCode.embed(TypeCode.TUPLE_SYMMETRIC, t1.typeCode()));
        } else if (e1) {
            // First is primitive: (Int, T)
            try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR1, t1.typeCode()));
            try serialize(t2, w);
        } else if (e2) {
            // Second is primitive: (T, Int)
            try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR2, t2.typeCode()));
            try serialize(t1, w);
        } else {
            // Both non-primitive
            try w.putByte(TypeCode.TUPLE_PAIR1);
            try serialize(t1, w);
            try serialize(t2, w);
        }
    }

    pub fn deserialize(r: *SigmaByteReader) !SType {
        const c = try r.getByte();
        return parseWithTag(r, c);
    }

    fn parseWithTag(r: *SigmaByteReader, c: u8) !SType {
        if (c < TypeCode.TUPLE) {
            const unpacked = TypeCode.unpack(c);
            const elem_type = if (unpacked.primitive) |p|
                try getEmbeddableType(p, r.tree_version)
            else
                try deserialize(r);

            if (unpacked.container) |container| {
                return switch (container) {
                    TypeCode.COLL => .{ .coll = &elem_type },
                    TypeCode.NESTED_COLL => .{ .coll = &SType{ .coll = &elem_type } },
                    TypeCode.OPTION => .{ .option = &elem_type },
                    TypeCode.OPTION_COLL => .{ .option = &SType{ .coll = &elem_type } },
                    TypeCode.TUPLE_PAIR1 => blk: {
                        const t2 = try deserialize(r);
                        break :blk .{ .tuple = &[_]SType{ elem_type, t2 } };
                    },
                    TypeCode.TUPLE_PAIR2 => blk: {
                        const t1 = try deserialize(r);
                        break :blk .{ .tuple = &[_]SType{ t1, elem_type } };
                    },
                    TypeCode.TUPLE_SYMMETRIC => .{ .tuple = &[_]SType{ elem_type, elem_type } },
                    else => return error.InvalidTypeCode,
                };
            }
            return elem_type;
        }

        return switch (c) {
            TypeCode.TUPLE => blk: {
                const len = try r.getByte();
                var items: [8]SType = undefined;
                // Reject lengths the fixed buffer cannot hold
                if (len > items.len) return error.InvalidTypeCode;
                for (0..len) |i| items[i] = try deserialize(r);
                break :blk .{ .tuple = items[0..len] };
            },
            TypeCode.ANY => .any,
            TypeCode.UNIT => .unit,
            TypeCode.BOX => .box,
            TypeCode.AVL_TREE => .avl_tree,
            TypeCode.CONTEXT => .context,
            TypeCode.HEADER => .header,
            TypeCode.PRE_HEADER => .pre_header,
            TypeCode.GLOBAL => .global,
            TypeCode.FUNC => blk: {
                if (r.tree_version.value < 3) return error.UnsupportedVersion;
                const dom_len = try r.getByte();
                var t_dom: [255]SType = undefined;
                for (0..dom_len) |i| t_dom[i] = try deserialize(r);
                const t_range = try deserialize(r);
                // ... parse tpe_params
                break :blk .{ .func = undefined }; // Simplified
            },
            else => error.InvalidTypeCode,
        };
    }

    fn getEmbeddableType(code: u8, version: ErgoTreeVersion) !SType {
        return switch (code) {
            TypeCode.BOOLEAN => .boolean,
            TypeCode.BYTE => .byte,
            TypeCode.SHORT => .short,
            TypeCode.INT => .int,
            TypeCode.LONG => .long,
            TypeCode.BIGINT => .big_int,
            TypeCode.GROUP_ELEMENT => .group_element,
            TypeCode.SIGMA_PROP => .sigma_prop,
            TypeCode.UNSIGNED_BIGINT => blk: {
                if (version.value < 3) return error.UnsupportedVersion;
                break :blk .unsigned_big_int;
            },
            else => error.InvalidTypeCode,
        };
    }
};

Encoding Examples

Example: Encode 300 as VLQ

300 = 0x12C = 0b100101100

Step 1: Take low 7 bits, set continuation: 0x2C | 0x80 = 0xAC
Step 2: Shift right 7: 300 >> 7 = 2
Step 3: Take low 7 bits, no continuation: 0x02

Result: [0xAC, 0x02]

Example: Encode -5 as ZigZag + VLQ

ZigZag(-5) = (-5 << 1) ^ (-5 >> 31)
          = -10 ^ -1
          = 9

VLQ(9) = [0x09]  (fits in 7 bits)

Example: Serialize Coll[Int]

Coll[Int] → single byte
         → TypeCode.COLL + TypeCode.INT
         → 12 + 4 = 16 = 0x10

Example: Serialize (Int, Long)

(Int, Long) → TUPLE_PAIR1 + INT, then Long
           → 60 + 4 = 64, then 5
           → [0x40, 0x05]

Summary

This chapter covered the serialization framework that enables compact, deterministic encoding of ErgoTree structures:

  • VLQ (Variable-Length Quantity) encoding represents integers using 7 data bits per byte with a continuation flag, achieving compact representation where small values use fewer bytes
  • ZigZag encoding transforms signed integers to unsigned before VLQ encoding, ensuring small-magnitude values (positive or negative) remain compact
  • Type code embedding packs common type patterns (like Coll[Int] or Option[Long]) into single bytes by combining container and primitive codes
  • SigmaByteWriter provides type-aware serialization with optional constant extraction for segregated constant trees
  • SigmaByteReader manages deserialization state including constant stores for placeholder resolution and version tracking
  • The type code space (0-112) is partitioned to enable single-byte encoding for primitives, nested collections, options, and pairs

Next: Chapter 8: Value Serializers

[5] Scala: (via scorex-util ZigZag implementation)
[13] Rust: types.rs:18-160

Chapter 8: Value Serializers

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 7 for VLQ encoding, type serialization, and SigmaByteReader/SigmaByteWriter
  • Chapter 4 for the Value hierarchy and expression node types
  • Chapter 5 for the opcode space and operation categories

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain opcode-based serialization dispatch and how it enables extensibility
  • Implement value serializers following common patterns (binary, unary, nullary, collection)
  • Describe constant extraction and placeholder substitution for segregated constant trees
  • Handle type inference during deserialization using ValDefTypeStore

Serialization Architecture

Chapter 7 covered the low-level encoding primitives (VLQ, ZigZag, type codes). This chapter builds on that foundation to show how entire expression trees are serialized. The key insight is that each expression's opcode determines its serialization format, enabling a registry-based dispatch pattern that scales to hundreds of operation types [1][2].

Expression Serialization Flow
─────────────────────────────────────────────────────────

        ┌─────────────────┐
        │   Expression    │
        └────────┬────────┘
                 │
    ┌────────────┴────────────┐
    │  Is Constant?           │
    └────────────┬────────────┘
           ┌─────┴─────┐
           │ Yes       │ No
           ▼           ▼
   ┌───────────────┐  ┌───────────────┐
   │ Extract to    │  │ Get OpCode    │
   │ Store or      │  │ Write OpCode  │
   │ Write Inline  │  │ Serialize Body│
   └───────────────┘  └───────────────┘

Serializer Registry

All serializers are registered in a sparse array indexed by opcode [3][4]:

const ValueSerializer = struct {
    /// Sparse array of serializers indexed by opcode
    serializers: [256]?*const Serializer,

    pub fn init() ValueSerializer {
        var self = ValueSerializer{ .serializers = [_]?*const Serializer{null} ** 256 };

        // Constants
        self.register(OpCode.Constant, &ConstantSerializer);
        self.register(OpCode.ConstantPlaceholder, &ConstantPlaceholderSerializer);

        // Tuples
        self.register(OpCode.Tuple, &TupleSerializer);
        self.register(OpCode.SelectField, &SelectFieldSerializer);

        // Relations
        self.register(OpCode.GT, &BinOpSerializer);
        self.register(OpCode.GE, &BinOpSerializer);
        self.register(OpCode.LT, &BinOpSerializer);
        self.register(OpCode.LE, &BinOpSerializer);
        self.register(OpCode.EQ, &BinOpSerializer);
        self.register(OpCode.NEQ, &BinOpSerializer);

        // Logical
        self.register(OpCode.BinAnd, &BinOpSerializer);
        self.register(OpCode.BinOr, &BinOpSerializer);
        self.register(OpCode.BinXor, &BinOpSerializer);

        // Arithmetic
        self.register(OpCode.Plus, &BinOpSerializer);
        self.register(OpCode.Minus, &BinOpSerializer);
        self.register(OpCode.Multiply, &BinOpSerializer);
        self.register(OpCode.Division, &BinOpSerializer);
        self.register(OpCode.Modulo, &BinOpSerializer);

        // Context
        self.register(OpCode.Height, &NullarySerializer);
        self.register(OpCode.Self, &NullarySerializer);
        self.register(OpCode.Inputs, &NullarySerializer);
        self.register(OpCode.Outputs, &NullarySerializer);
        self.register(OpCode.Context, &NullarySerializer);
        self.register(OpCode.Global, &NullarySerializer);

        // Collections
        self.register(OpCode.Coll, &CollectionSerializer);
        self.register(OpCode.CollBoolConst, &BoolCollectionSerializer);
        self.register(OpCode.Map, &MapSerializer);
        self.register(OpCode.Filter, &FilterSerializer);
        self.register(OpCode.Fold, &FoldSerializer);

        // Method calls
        self.register(OpCode.PropertyCall, &PropertyCallSerializer);
        self.register(OpCode.MethodCall, &MethodCallSerializer);

        return self;
    }

    fn register(self: *ValueSerializer, opcode: OpCode, serializer: *const Serializer) void {
        self.serializers[opcode.value] = serializer;
    }

    pub fn getSerializer(self: *const ValueSerializer, opcode: OpCode) !*const Serializer {
        return self.serializers[opcode.value] orelse error.UnknownOpCode;
    }
};

Serialization Dispatch

Serialize Expression

pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
    switch (expr.*) {
        .constant => |c| {
            if (w.constant_store) |store| {
                // Extract constant to store, write placeholder
                const idx = try store.put(c);
                try w.putByte(OpCode.ConstantPlaceholder.value);
                try w.putUInt(idx);
            } else {
                // Write constant inline (type + value)
                try ConstantSerializer.serialize(c, w);
            }
        },
        else => {
            const opcode = expr.opCode();
            try w.putByte(opcode.value);  // Write opcode first
            const ser = registry.getSerializer(opcode) catch return error.UnknownOpCode;
            try ser.serialize(expr, w);   // Then serialize body
        },
    }
}

Deserialize Expression

pub fn deserialize(r: *SigmaByteReader) !Expr {
    const tag = try r.getByte();

    // Look-ahead: constants have type codes 1-112
    if (tag <= OpCode.LAST_CONSTANT_CODE) {
        return .{ .constant = try ConstantSerializer.deserializeWithTag(r, tag) };
    }

    const opcode = OpCode{ .value = tag };
    const ser = registry.getSerializer(opcode) catch {
        return error.UnknownOpCode;
    };
    return ser.deserialize(r);
}
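The tag-based dispatch above can be exercised in a short executable sketch (Python here for portability; the `0x93` opcode value and handler names are illustrative, only the 1-112 constant range mirrors the Zig code):

```python
LAST_CONSTANT_CODE = 112  # tags 1..112 are type codes of inline constants

def parse_constant(tag, data):
    # Illustrative stand-in for ConstantSerializer.deserializeWithTag
    return ("constant", tag, data)

# Sparse registry: 256 slots, one per opcode byte (0x93 is a made-up value)
registry = [None] * 256
registry[0x93] = lambda data: ("height",)

def deserialize(tag, data):
    if tag <= LAST_CONSTANT_CODE:   # look-ahead: constant, no registry entry
        return parse_constant(tag, data)
    handler = registry[tag]
    if handler is None:
        raise ValueError("unknown opcode")
    return handler(data)

assert deserialize(5, b"\x2a")[0] == "constant"
assert deserialize(0x93, b"") == ("height",)
```

The key property is that one byte fully determines the parse path: either the constant parser (tags 1-112) or a registry slot, with unknown slots rejected.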

Constant Serialization

Constants are serialized as type followed by value:

const ConstantSerializer = struct {
    pub fn serialize(c: Constant, w: *SigmaByteWriter) !void {
        try TypeSerializer.serialize(c.tpe, w);   // 1. Type
        try DataSerializer.serialize(c.value, c.tpe, w);  // 2. Value
    }

    pub fn deserialize(r: *SigmaByteReader) !Constant {
        const tag = try r.getByte();
        return deserializeWithTag(r, tag);
    }

    pub fn deserializeWithTag(r: *SigmaByteReader, tag: u8) !Constant {
        const tpe = try TypeSerializer.parseWithTag(r, tag);
        const value = try DataSerializer.deserialize(tpe, r);
        return Constant{ .tpe = tpe, .value = value };
    }
};

Constant Placeholder

When constant segregation is enabled, constants become placeholders:

const ConstantPlaceholderSerializer = struct {
    pub fn serialize(ph: ConstantPlaceholder, w: *SigmaByteWriter) !void {
        try w.putUInt(ph.index);  // Just the index
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const id = try r.getUInt();

        if (r.substitute_placeholders) {
            // Return actual constant from store
            const c = try r.constant_store.get(@intCast(id));
            return .{ .constant = c };
        } else {
            // Return placeholder (for template extraction)
            const tpe = (try r.constant_store.get(@intCast(id))).tpe;
            return .{ .constant_placeholder = .{ .index = @intCast(id), .tpe = tpe } };
        }
    }
};
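The segregation idea reduces to two small functions, sketched here in Python (the tuple representation of placeholders is illustrative, not the wire format): extraction replaces each constant with its store index, and substitution reverses it, so scripts that differ only in constants share one template.

```python
def segregate(constants):
    """Pull constants into a store; the template keeps only indices."""
    store, template = [], []
    for c in constants:
        store.append(c)
        template.append(("placeholder", len(store) - 1))
    return store, template

def substitute(store, template):
    """Resolve placeholder indices back to concrete constants."""
    return [store[idx] for (_, idx) in template]

store, template = segregate([42, b"pk_bytes"])
assert substitute(store, template) == [42, b"pk_bytes"]
# Same template, different constant store -> a different concrete script
assert substitute([7, b"other"], template) == [7, b"other"]
```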

Common Serializer Patterns

BinOp Serializer (Two Arguments)

For binary operations like arithmetic and comparisons:

const BinOpSerializer = struct {
    pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
        const binop = expr.asBinOp();
        try ValueSerializer.serialize(binop.left, w);   // Left operand
        try ValueSerializer.serialize(binop.right, w);  // Right operand
    }

    pub fn deserialize(r: *SigmaByteReader, kind: BinOp.Kind) !Expr {
        // Heap-allocate children: pointers to stack locals would dangle
        const left = try r.allocator.create(Expr);
        left.* = try ValueSerializer.deserialize(r);
        const right = try r.allocator.create(Expr);
        right.* = try ValueSerializer.deserialize(r);
        return .{ .bin_op = .{
            .kind = kind,
            .left = left,
            .right = right,
        } };
    }
};

Unary Serializer (One Argument)

For single-input transformations:

const UnarySerializer = struct {
    pub fn serialize(input: *const Expr, w: *SigmaByteWriter) !void {
        try ValueSerializer.serialize(input, w);
    }

    pub fn deserialize(r: *SigmaByteReader) !*const Expr {
        const child = try r.allocator.create(Expr);
        child.* = try ValueSerializer.deserialize(r);
        return child;
    }
};

Nullary Serializer (No Body)

For singletons where opcode is sufficient:

const NullarySerializer = struct {
    pub fn serialize(_: *const Expr, _: *SigmaByteWriter) !void {
        // Nothing to write - opcode is enough
    }

    pub fn deserialize(r: *SigmaByteReader, opcode: OpCode) !Expr {
        _ = r;
        return switch (opcode) {
            .Height => .{ .global_var = .height },
            .Self => .{ .global_var = .self_box },
            .Inputs => .{ .global_var = .inputs },
            .Outputs => .{ .global_var = .outputs },
            .Context => .context,
            .Global => .global,
            else => error.InvalidOpCode,
        };
    }
};

Collection Serializers

ConcreteCollection

For collections of expressions:

const CollectionSerializer = struct {
    const MAX_COLLECTION_ITEMS: u16 = 4096;  // DoS protection

    pub fn serialize(coll: *const Collection, w: *SigmaByteWriter) !void {
        try w.putUShort(@intCast(coll.items.len));  // Count
        try TypeSerializer.serialize(coll.elem_type, w);  // Element type
        for (coll.items) |item| {
            try ValueSerializer.serialize(item, w);  // Each item
        }
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const count = try r.getUShort();
        if (count > MAX_COLLECTION_ITEMS) return error.CollectionTooLarge;

        const elem_type = try TypeSerializer.deserialize(r);

        const items = try r.allocator.alloc(*Expr, count);
        for (0..count) |i| {
            const item = try r.allocator.create(Expr);
            item.* = try ValueSerializer.deserialize(r);
            items[i] = item;
        }

        return .{ .collection = .{
            .elem_type = elem_type,
            .items = items,
        } };
    }
};
// NOTE: In production, use a pre-allocated expression pool instead of
// dynamic allocation during deserialization. See ZIGMA_STYLE.md.

Boolean Collection Constant

Compact serialization for Coll[Boolean] constants:

const BoolCollectionSerializer = struct {
    pub fn serialize(bools: []const bool, w: *SigmaByteWriter) !void {
        try w.putUShort(@intCast(bools.len));
        // Pack into bits
        const byte_count = (bools.len + 7) / 8;
        var i: usize = 0;
        for (0..byte_count) |_| {
            var byte: u8 = 0;
            for (0..8) |bit| {
                if (i < bools.len and bools[i]) {
                    byte |= @as(u8, 1) << @intCast(bit);
                }
                i += 1;
            }
            try w.putByte(byte);
        }
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const count = try r.getUShort();
        const byte_count = (count + 7) / 8;

        const bools = try r.allocator.alloc(bool, count);
        var i: usize = 0;
        for (0..byte_count) |_| {
            const byte = try r.getByte();
            for (0..8) |bit| {
                if (i >= count) break;
                bools[i] = (byte >> @intCast(bit)) & 1 == 1;
                i += 1;
            }
        }

        return .{ .coll_bool_const = bools };
    }
};
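The bit-packing scheme (LSB-first within each byte, as in the Zig sketch above) is easy to verify with an executable round-trip in Python:

```python
def pack_bools(bools):
    """Pack booleans LSB-first into (len + 7) // 8 bytes."""
    out = bytearray((len(bools) + 7) // 8)
    for i, b in enumerate(bools):
        if b:
            out[i // 8] |= 1 << (i % 8)   # bit i goes into byte i // 8
    return bytes(out)

def unpack_bools(data, count):
    """Recover exactly `count` booleans; trailing pad bits are ignored."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]

bits = [True, False, True, True, False, False, False, False, True]
packed = pack_bools(bits)
assert len(packed) == 2                      # 9 bools -> 2 bytes
assert unpack_bools(packed, len(bits)) == bits
```

Note that the element count must be serialized separately, since the packed bytes alone cannot distinguish real trailing `false` values from pad bits.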

Map/Filter/Fold

Higher-order collection operations:

const MapSerializer = struct {
    pub fn serialize(m: *const Map, w: *SigmaByteWriter) !void {
        try ValueSerializer.serialize(m.input, w);   // Collection
        try ValueSerializer.serialize(m.mapper, w);  // Function
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        // Heap-allocate children so the pointers outlive this frame
        const input = try r.allocator.create(Expr);
        input.* = try ValueSerializer.deserialize(r);
        const mapper = try r.allocator.create(Expr);
        mapper.* = try ValueSerializer.deserialize(r);
        return .{ .map = .{ .input = input, .mapper = mapper } };
    }
};

const FoldSerializer = struct {
    pub fn serialize(f: *const Fold, w: *SigmaByteWriter) !void {
        try ValueSerializer.serialize(f.input, w);   // Collection
        try ValueSerializer.serialize(f.zero, w);    // Initial value
        try ValueSerializer.serialize(f.folder, w);  // Fold function
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const input = try r.allocator.create(Expr);
        input.* = try ValueSerializer.deserialize(r);
        const zero = try r.allocator.create(Expr);
        zero.* = try ValueSerializer.deserialize(r);
        const folder = try r.allocator.create(Expr);
        folder.* = try ValueSerializer.deserialize(r);
        return .{ .fold = .{
            .input = input,
            .zero = zero,
            .folder = folder,
        } };
    }
};

Block and Function Serializers

BlockValue

For blocks with local definitions:

const BlockValueSerializer = struct {
    pub fn serialize(block: *const BlockValue, w: *SigmaByteWriter) !void {
        try w.putUInt(@intCast(block.items.len));  // Definition count
        for (block.items) |item| {
            try ValueSerializer.serialize(item, w);  // Each definition
        }
        try ValueSerializer.serialize(block.result, w);  // Result expression
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const count = try r.getUInt();
        const items = try r.allocator.alloc(*Expr, @intCast(count));
        for (0..count) |i| {
            const item = try r.allocator.create(Expr);
            item.* = try ValueSerializer.deserialize(r);
            items[i] = item;
        }
        const result = try r.allocator.create(Expr);
        result.* = try ValueSerializer.deserialize(r);
        return .{ .block_value = .{ .items = items, .result = result } };
    }
};

FuncValue

For lambda functions:

const FuncValueSerializer = struct {
    pub fn serialize(func: *const FuncValue, w: *SigmaByteWriter) !void {
        try w.putUInt(@intCast(func.args.len));  // Argument count
        for (func.args) |arg| {
            try w.putUInt(arg.id);      // Argument id
            try TypeSerializer.serialize(arg.tpe, w);  // Argument type
        }
        try ValueSerializer.serialize(func.body, w);  // Body
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const arg_count = try r.getUInt();
        const args = try r.allocator.alloc(FuncArg, @intCast(arg_count));

        for (0..arg_count) |i| {
            const id = try r.getUInt();
            const tpe = try TypeSerializer.deserialize(r);
            // Store type for ValUse resolution
            try r.val_def_type_store.put(@intCast(id), tpe);
            args[i] = .{ .id = @intCast(id), .tpe = tpe };
        }

        const body = try r.allocator.create(Expr);
        body.* = try ValueSerializer.deserialize(r);
        return .{ .func_value = .{ .args = args, .body = body } };
    }
};

ValDef / ValUse

Variable definitions and references:

const ValDefSerializer = struct {
    pub fn serialize(vd: *const ValDef, w: *SigmaByteWriter) !void {
        try w.putUInt(vd.id);
        try TypeSerializer.serialize(vd.tpe, w);
        try ValueSerializer.serialize(vd.rhs, w);
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const id = try r.getUInt();
        const tpe = try TypeSerializer.deserialize(r);
        // Store for ValUse resolution
        try r.val_def_type_store.put(@intCast(id), tpe);
        const rhs = try r.allocator.create(Expr);
        rhs.* = try ValueSerializer.deserialize(r);
        return .{ .val_def = .{ .id = @intCast(id), .tpe = tpe, .rhs = rhs } };
    }
};

const ValUseSerializer = struct {
    pub fn serialize(vu: *const ValUse, w: *SigmaByteWriter) !void {
        try w.putUInt(vu.id);
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const id = try r.getUInt();
        // Lookup type from earlier ValDef
        const tpe = r.val_def_type_store.get(@intCast(id)) orelse
            return error.UndefinedVariable;
        return .{ .val_use = .{ .id = @intCast(id), .tpe = tpe } };
    }
};
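The single-pass type-resolution protocol used by `ValDef`/`ValUse` can be sketched with a plain dictionary (the string type names are illustrative): a definition records its type, and any later use looks it up instead of carrying the type in the serialized bytes.

```python
# id -> type, filled as ValDef nodes are deserialized in order
val_def_types = {}

def on_val_def(var_id, tpe):
    """Record the type when a ValDef is parsed."""
    val_def_types[var_id] = tpe

def on_val_use(var_id):
    """Recover the type for a ValUse, or fail like error.UndefinedVariable."""
    if var_id not in val_def_types:
        raise KeyError("undefined variable")
    return val_def_types[var_id]

on_val_def(1, "SInt")
on_val_def(2, "SColl[SByte]")
assert on_val_use(1) == "SInt"
assert on_val_use(2) == "SColl[SByte]"
```

This only works because ErgoTree serialization guarantees each `ValDef` appears before every `ValUse` that references it.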

MethodCall Serializer

Method calls require type and method ID lookup:

const MethodCallSerializer = struct {
    pub fn serialize(mc: *const MethodCall, w: *SigmaByteWriter) !void {
        try w.putByte(mc.method.obj_type.typeId());  // Type ID
        try w.putByte(mc.method.method_id);          // Method ID
        try ValueSerializer.serialize(mc.obj, w);    // Receiver
        try w.putUInt(@intCast(mc.args.len));    // Arg count
        for (mc.args) |arg| {
            try ValueSerializer.serialize(arg, w);   // Each argument
        }
        // Explicit type arguments (for generic methods)
        for (mc.method.explicit_type_args) |tvar| {
            const tpe = mc.type_subst.get(tvar) orelse continue;
            try TypeSerializer.serialize(tpe, w);
        }
    }

    pub fn deserialize(r: *SigmaByteReader) !Expr {
        const type_id = try r.getByte();
        const method_id = try r.getByte();
        const obj = try r.allocator.create(Expr);
        obj.* = try ValueSerializer.deserialize(r);

        const arg_count = try r.getUInt();
        const args = try r.allocator.alloc(*Expr, @intCast(arg_count));
        for (0..arg_count) |i| {
            const arg = try r.allocator.create(Expr);
            arg.* = try ValueSerializer.deserialize(r);
            args[i] = arg;
        }

        // Lookup method by type and method ID
        const method = try SMethod.fromIds(type_id, method_id);

        // Check version compatibility
        if (r.tree_version.value < method.min_version.value) {
            return error.MethodNotAvailable;
        }

        // Read type arguments
        var type_args = std.AutoHashMap(STypeVar, SType).init(r.allocator);
        for (method.explicit_type_args) |tvar| {
            const tpe = try TypeSerializer.deserialize(r);
            try type_args.put(tvar, tpe);
        }

        return .{ .method_call = .{
            .obj = obj,
            .method = method,
            .args = args,
            .type_subst = type_args,
        } };
    }
};

Serializer Summary Table

OpCode Range    Category            Serializer Pattern
────────────────────────────────────────────────────────────
1-112           Constants           Type + Value inline
113             ConstPlaceholder    Index only
114-120         Global vars         Nullary (opcode only)
121-130         Unary ops           Single child
131-150         Binary ops          Left + Right
151-160         Collection ops      Input + Function
161-170         Block/Func          Items + Body
171-180         Method calls        TypeId + MethodId + Args

Summary

This chapter covered the value serialization system that transforms ErgoTree expression trees to and from bytes:

  • Opcode dispatch enables extensible serialization—the first byte of each expression determines which serializer handles the remaining bytes, allowing O(1) lookup via a sparse registry array
  • Constant extraction supports two modes: inline serialization (type + value) when constant segregation is disabled, or placeholder indices when segregation is enabled for template sharing
  • Common serializer patterns reduce code duplication: BinOpSerializer handles all two-argument operations, UnarySerializer handles single-input transformations, and NullarySerializer handles singletons where the opcode alone is sufficient
  • Collection serializers include bounds checking to prevent DoS attacks from maliciously large collections during deserialization
  • Type inference via ValDefTypeStore tracks variable types as ValDef nodes are deserialized, allowing ValUse nodes to recover their types without storing them redundantly
  • Method call serialization includes type ID, method ID, and version checking to ensure compatibility with the ErgoTree version being deserialized

Next: Chapter 9: Elliptic Curve Cryptography

Source reference: Rust bin_op.rs.

Chapter 9: Elliptic Curve Cryptography

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Basic finite field arithmetic: operations modulo a prime p, multiplicative inverses
  • Public key cryptography concepts: key pairs, discrete logarithm problem
  • Understanding of elliptic curves as sets of points satisfying y² = x³ + ax + b over a finite field
  • Prior chapters: Chapter 2 for the GroupElement type

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain why secp256k1 was chosen for Sigma protocols and describe its key parameters
  • Implement the discrete logarithm group interface: exponentiate, multiply, inverse
  • Encode and decode group elements using compressed SEC1 format (33 bytes)
  • Translate between multiplicative group notation (used in Sigma protocols) and additive notation (used in libraries)

The Secp256k1 Curve

Sigma protocols use secp256k1—the same elliptic curve as Bitcoin and Ethereum. This choice provides several benefits: widespread library support, extensive security analysis, and compatibility with existing blockchain infrastructure. The curve offers 128-bit security (meaning the best known attack requires approximately 2^128 operations) while using 256-bit keys.

Curve Definition

The curve is defined by:

y² = x³ + 7  (mod p)

where:
  p = 2²⁵⁶ - 2³² - 977  (field characteristic)
  n = group order        (number of points)
  G = generator point    (base point)
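The curve parameters can be verified directly: the generator coordinates (the same values used for `EcPoint.GENERATOR` below) must satisfy the curve equation modulo p.

```python
# secp256k1 field prime: p = 2^256 - 2^32 - 977
p = 2**256 - 2**32 - 977

# Generator point G (standard secp256k1 parameters)
gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# G lies on the curve: y^2 = x^3 + 7 (mod p)
assert (gy * gy - (gx**3 + 7)) % p == 0
```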

Cryptographic Constants

const CryptoConstants = struct {
    /// Encoded group element size in bytes (compressed)
    pub const ENCODED_GROUP_ELEMENT_LENGTH: usize = 33;

    /// Group size in bits
    pub const GROUP_SIZE_BITS: u32 = 256;

    /// Challenge size for Sigma protocols
    /// Must be < GROUP_SIZE_BITS for security
    pub const SOUNDNESS_BITS: u32 = 192;

    /// Group order (number of curve points)
    /// n = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
    pub const GROUP_ORDER: [32]u8 = .{
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE,
        0xBA, 0xAE, 0xDC, 0xE6, 0xAF, 0x48, 0xA0, 0x3B,
        0xBF, 0xD2, 0x5E, 0x8C, 0xD0, 0x36, 0x41, 0x41,
    };

    /// Field characteristic
    /// p = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
    pub const FIELD_PRIME: [32]u8 = .{
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
        0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFC, 0x2F,
    };

    comptime {
        // Security constraint: 2^soundnessBits < groupOrder
        std.debug.assert(SOUNDNESS_BITS < GROUP_SIZE_BITS);
    }
};

Group Element Representation

EcPoint Structure

const EcPoint = struct {
    /// Compressed encoding size
    pub const GROUP_SIZE: usize = 33;

    /// Internal representation (projective coordinates)
    x: FieldElement,
    y: FieldElement,
    z: FieldElement,

    /// Identity element (point at infinity)
    pub const IDENTITY = EcPoint{
        .x = FieldElement.zero(),
        .y = FieldElement.one(),
        .z = FieldElement.zero(),
    };

    /// Generator point G
    pub const GENERATOR = init: {
        // secp256k1 generator coordinates
        const gx = FieldElement.fromHex(
            "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
        );
        const gy = FieldElement.fromHex(
            "483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8"
        );
        break :init EcPoint{ .x = gx, .y = gy, .z = FieldElement.one() };
    };

    /// Check if this is the identity (infinity) point
    pub fn isIdentity(self: *const EcPoint) bool {
        return self.z.isZero();
    }

    /// Convert to affine coordinates
    pub fn toAffine(self: *const EcPoint) struct { x: FieldElement, y: FieldElement } {
        if (self.isIdentity()) return .{ .x = FieldElement.zero(), .y = FieldElement.zero() };
        const z_inv = self.z.inverse();
        return .{
            .x = self.x.mul(z_inv),
            .y = self.y.mul(z_inv),
        };
    }
};

Group Operations

The discrete logarithm group interface provides standard operations:

Operation           Notation      Description
─────────────────────────────────────────────────────
Exponentiate        g^x           Scalar multiplication
Multiply            g * h         Point addition
Inverse             g^(-1)        Point negation
Identity            1             Point at infinity
Generator           g             Base point G

Group Interface

SECURITY: The exponentiate (scalar multiplication) operation must be implemented in constant-time when the scalar is secret (e.g., private keys, nonces). Variable-time implementations leak secret bits through timing side-channels. Use audited libraries like libsecp256k1 or Zig's std.crypto.ecc.

const DlogGroup = struct {
    /// The generator point
    pub fn generator() EcPoint {
        return EcPoint.GENERATOR;
    }

    /// The identity element (point at infinity)
    pub fn identity() EcPoint {
        return EcPoint.IDENTITY;
    }

    /// Check if point is identity
    pub fn isIdentity(point: *const EcPoint) bool {
        return point.isIdentity();
    }

    /// Exponentiate: base^exponent (scalar multiplication)
    pub fn exponentiate(base: *const EcPoint, exponent: *const Scalar) EcPoint {
        if (base.isIdentity()) return base.*;

        // Handle negative exponents
        var exp = exponent.*;
        if (exp.isNegative()) {
            exp = exp.mod(CryptoConstants.GROUP_ORDER);
        }

        return scalarMul(base, &exp);
    }

    /// Multiply two group elements: g1 * g2 (point addition)
    pub fn multiply(g1: *const EcPoint, g2: *const EcPoint) EcPoint {
        return pointAdd(g1, g2);
    }

    /// Compute inverse: g^(-1) (point negation)
    pub fn inverse(point: *const EcPoint) EcPoint {
        return EcPoint{
            .x = point.x,
            .y = point.y.negate(),
            .z = point.z,
        };
    }

    /// Create random group element
    pub fn randomElement(rng: std.Random) EcPoint {
        const scalar = Scalar.random(rng);
        return exponentiate(&EcPoint.GENERATOR, &scalar);
    }
};

Notation Translation

Sigma protocols use multiplicative notation while underlying libraries often use additive:

Sigma (multiplicative)    Library (additive)     Operation
──────────────────────────────────────────────────────────
g * h                     g + h                  Point addition
g^n                       n * g                  Scalar multiplication
g^(-1)                    -g                     Point negation
1 (identity)              O (origin)             Point at infinity

/// Wrapper translating multiplicative to additive notation
const MultiplicativeGroup = struct {
    /// Multiply in multiplicative notation = Add in additive
    pub fn mul(a: *const EcPoint, b: *const EcPoint) EcPoint {
        return pointAdd(a, b);
    }

    /// Exponentiate in multiplicative = Scalar multiply in additive
    pub fn exp(base: *const EcPoint, scalar: *const Scalar) EcPoint {
        return scalarMul(base, scalar);
    }

    /// Inverse in multiplicative = Negate in additive
    pub fn inv(p: *const EcPoint) EcPoint {
        return pointNegate(p);
    }
};
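The same three operations can be exercised in any group where discrete logs are meaningful. As a tiny executable analogy (the multiplicative group of integers mod 23, not the EC group; the values 23 and 5 are illustrative), the multiplicative notation maps directly onto modular arithmetic:

```python
# Multiplicative group Z_23^*: same interface as DlogGroup, different group
p, g = 23, 5

def exponentiate(base, x):      # g^x  (scalar multiplication on the curve)
    return pow(base, x, p)

def multiply(a, b):             # g * h  (point addition on the curve)
    return (a * b) % p

def inverse(a):                 # g^(-1)  (point negation on the curve)
    return pow(a, -1, p)

h = exponentiate(g, 7)
assert multiply(h, inverse(h)) == 1                       # h * h^-1 = identity
assert exponentiate(g, 3) == multiply(multiply(g, g), g)  # g^3 = g * g * g
```

Swapping this toy group for secp256k1 changes only the implementations, not the interface, which is exactly why Sigma protocol descriptions can stay notation-agnostic.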

Point Encoding

Group elements use compressed SEC1 encoding (33 bytes):

Compressed Point Format (33 bytes)
────────────────────────────────────────────────────

┌──────────┬────────────────────────────────────────┐
│ Byte 0   │           Bytes 1-32                   │
├──────────┼────────────────────────────────────────┤
│ 0x02     │    X coordinate (32 bytes, big-end)    │  Y is even
│ 0x03     │    X coordinate (32 bytes, big-end)    │  Y is odd
│ 0x00     │    32 zero bytes                       │  Identity
└──────────┴────────────────────────────────────────┘

Serialization Implementation

const GroupElementSerializer = struct {
    const ENCODING_SIZE: usize = 33;

    /// Identity encoding (33 zero bytes)
    const IDENTITY_ENCODING = [_]u8{0} ** ENCODING_SIZE;

    pub fn serialize(point: *const EcPoint, writer: anytype) !void {
        if (point.isIdentity()) {
            try writer.writeAll(&IDENTITY_ENCODING);
            return;
        }

        const affine = point.toAffine();

        // Determine sign byte from Y coordinate parity
        const y_bytes = affine.y.toBytes();
        const sign_byte: u8 = if (y_bytes[31] & 1 == 0) 0x02 else 0x03;

        // Write sign byte + X coordinate
        try writer.writeByte(sign_byte);
        try writer.writeAll(&affine.x.toBytes());
    }

    pub fn deserialize(reader: anytype) !EcPoint {
        var buf: [ENCODING_SIZE]u8 = undefined;
        try reader.readNoEof(&buf);

        if (buf[0] == 0) {
            // Check all zeros for identity
            for (buf[1..]) |b| {
                if (b != 0) return error.InvalidEncoding;
            }
            return EcPoint.IDENTITY;
        }

        if (buf[0] != 0x02 and buf[0] != 0x03) {
            return error.InvalidPrefix;
        }

        // Recover Y from X using curve equation: y² = x³ + 7
        const x = FieldElement.fromBytes(buf[1..33]);
        const y_squared = x.cube().add(FieldElement.fromInt(7));
        var y = y_squared.sqrt() orelse return error.NotOnCurve;

        // Choose correct Y based on sign byte
        const y_is_odd = y.toBytes()[31] & 1 == 1;
        if ((buf[0] == 0x02) == y_is_odd) {
            y = y.negate();
        }

        const point = EcPoint{ .x = x, .y = y, .z = FieldElement.one() };

        // CRITICAL: Validate point is on curve and in correct subgroup
        // This prevents invalid curve attacks. See ZIGMA_STYLE.md.
        // if (!point.isOnCurve()) return error.NotOnCurve;
        // if (!point.isInSubgroup()) return error.InvalidSubgroup;

        return point;
    }
};

Why Compressed Encoding?

Format          Size     Content
────────────────────────────────────────────────────
Compressed      33 B     Sign (1) + X (32)
Uncompressed    65 B     0x04 (1) + X (32) + Y (32)
Savings         49%      Y recovered from curve equation
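The Y-recovery step is cheap because the secp256k1 prime satisfies p ≡ 3 (mod 4), so a square root of w is w^((p+1)/4) mod p. A Python round-trip for the generator point demonstrates the full encode/decode path (curve parameters are the standard secp256k1 values):

```python
p = 2**256 - 2**32 - 977
gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def encode(x, y):
    """Compressed SEC1: parity prefix + big-endian X coordinate."""
    return bytes([0x02 if y % 2 == 0 else 0x03]) + x.to_bytes(32, "big")

def decode(buf):
    """Recover (x, y) from 33 bytes; no on-curve validation in this sketch."""
    x = int.from_bytes(buf[1:], "big")
    y = pow((x**3 + 7) % p, (p + 1) // 4, p)   # sqrt of x^3 + 7 (p = 3 mod 4)
    if (y % 2 == 0) != (buf[0] == 0x02):       # wrong parity: take p - y
        y = p - y
    return x, y

buf = encode(gx, gy)
assert len(buf) == 33 and buf[0] == 0x02       # gy is even
assert decode(buf) == (gx, gy)
```

As with the Zig version, a real decoder must also verify the recovered point lies on the curve; the square-root formula silently produces garbage for X values with no corresponding Y.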

Coordinate Systems

Affine vs Projective

Libraries use projective coordinates internally for efficiency:

Coordinate System    Representation    Division Required
──────────────────────────────────────────────────────────
Affine               (x, y)            Per operation
Projective           (X, Y, Z)         Only at end
                     x = X/Z
                     y = Y/Z

Normalization

/// Normalize point to affine coordinates
/// (Jacobian form: x = X/Z², y = Y/Z³ — unlike the simple X/Z form above)
/// Required before: encoding, comparison, coordinate access
pub fn normalize(point: *const EcPoint) EcPoint {
    if (point.isIdentity()) return point.*;

    const z_inv = point.z.inverse();
    const z_inv_sq = z_inv.square();
    const z_inv_cu = z_inv_sq.mul(z_inv);

    return EcPoint{
        .x = point.x.mul(z_inv_sq),
        .y = point.y.mul(z_inv_cu),
        .z = FieldElement.one(),
    };
}

Random Scalar Generation

Secure random scalars for key generation:

const Scalar = struct {
    bytes: [32]u8,

    /// Generate random scalar in [1, n-1] where n is group order
    pub fn random(rng: std.Random) Scalar {
        while (true) {
            var bytes: [32]u8 = undefined;
            rng.bytes(&bytes);

            // Ensure scalar < group order
            if (lessThan(&bytes, &CryptoConstants.GROUP_ORDER)) {
                // Ensure non-zero
                var is_zero = true;
                for (bytes) |b| {
                    if (b != 0) { is_zero = false; break; }
                }
                if (!is_zero) {
                    return Scalar{ .bytes = bytes };
                }
            }
        }
    }

    /// Big-endian lexicographic comparison (a < b)
    fn lessThan(a: *const [32]u8, b: *const [32]u8) bool {
        // NOTE: This simplified version is NOT constant-time.
        // A constant-time variant scans every byte unconditionally, e.g.:
        //   var lt: u1 = 0;
        //   var eq: u1 = 1;
        //   for (a.*, b.*) |ai, bi| {
        //       lt |= eq & @intFromBool(ai < bi);
        //       eq &= @intFromBool(ai == bi);
        //   }
        //   return lt == 1;
        // See ZIGMA_STYLE.md for constant-time crypto requirements.
        for (a.*, b.*) |ai, bi| {
            if (ai < bi) return true;
            if (ai > bi) return false;
        }
        return false;
    }
};
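Rejection sampling keeps the distribution uniform: drawing 256 random bits and reducing mod n would bias small scalars, so out-of-range draws are discarded instead. A Python sketch using the standard `secrets` module:

```python
import secrets

# secp256k1 group order n (matches CryptoConstants.GROUP_ORDER)
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def random_scalar():
    """Uniform scalar in [1, n-1] via rejection sampling."""
    while True:
        s = secrets.randbits(256)     # 32 random bytes as an integer
        if 0 < s < n:                 # reject 0 and anything >= group order
            return s

s = random_scalar()
assert 0 < s < n
```

For secp256k1 the rejection probability is tiny (n is extremely close to 2^256), so the loop almost always terminates on the first draw.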

Security Properties

Discrete Logarithm Assumption

The security relies on the hardness of the DLP:

Given:  g (generator), h = g^x (public key)
Find:   x (secret key)

Best known attack: ~2^128 operations for secp256k1

Soundness Parameter

The SOUNDNESS_BITS = 192 determines:

  • Challenge size in Sigma protocols
  • Security level against malicious provers
  • Constraint: 2^192 < n (group order)

comptime {
    // Verify soundness constraint
    // 2^soundnessBits must be less than group order
    // Group order ≈ 2^256, so 192 < 256 satisfies this
    std.debug.assert(CryptoConstants.SOUNDNESS_BITS < 256);
}

Summary

This chapter covered the elliptic curve cryptography foundation that underlies all Sigma protocol operations:

  • secp256k1 (y² = x³ + 7) provides the mathematical foundation for Sigma protocols, chosen for its security properties and widespread support in Bitcoin and Ethereum tooling
  • Group elements are encoded as 33 bytes using compressed SEC1 format—a sign byte (0x02 or 0x03 based on Y coordinate parity) followed by the 32-byte X coordinate
  • Multiplicative notation used in Sigma protocol literature (g^x, g·h) maps to additive operations in typical EC libraries (scalar multiplication, point addition)
  • SOUNDNESS_BITS = 192 determines the challenge size in Sigma protocols and must be less than the group order's bit length for security
  • The DlogGroup interface provides exponentiate (scalar multiplication), multiply (point addition), inverse (point negation), and identity (point at infinity)
  • Projective coordinates (X, Y, Z) avoid expensive field inversions during computation; conversion to affine coordinates is required only for encoding and comparison

Next: Chapter 10: Hash Functions

Source reference: Scala DlogGroup.scala.

Chapter 10: Hash Functions

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Cryptographic hash function properties: collision resistance, preimage resistance, deterministic output
  • Understanding of message authentication codes (MACs) and their role in key derivation
  • Prior chapters: Chapter 9 for the cryptographic context, Chapter 5 for opcode-based operations

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain why BLAKE2b256 is the primary hash function in Ergo and when SHA-256 is used
  • Implement hash operations with per-item costing based on block size
  • Describe Fiat-Shamir challenge generation and why challenges are truncated to 192 bits
  • Use HMAC-SHA512 for BIP32/BIP39 key derivation

Hash Functions in Sigma

Hash functions are fundamental to blockchain security—they provide integrity guarantees, enable content addressing, and transform interactive proofs into non-interactive ones via the Fiat-Shamir heuristic. The Sigma protocol uses two primary hash functions, each optimized for different use cases:

Hash Function Uses
─────────────────────────────────────────────────────
Purpose              Function        Output
─────────────────────────────────────────────────────
Script hashing       blake2b256()    32 bytes
External compat      sha256()        32 bytes
Challenge gen        Fiat-Shamir     24 bytes (truncated)
Box identification   blake2b256()    32 bytes
Key derivation       HMAC-SHA512     64 bytes

BLAKE2b256

The primary hash function for Ergo: generally faster than SHA-256 in software, with the same 256-bit output size and a comparable security margin.

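Python's `hashlib` ships BLAKE2b with a configurable digest size, which makes the streaming contract of `update()`/`final()` easy to check: incremental updates must produce the same digest as hashing the whole message at once.

```python
import hashlib

# BLAKE2b-256: BLAKE2b with digest_size=32
one_shot = hashlib.blake2b(b"hello world", digest_size=32)

streamed = hashlib.blake2b(digest_size=32)
streamed.update(b"hello ")   # arbitrary chunk boundaries
streamed.update(b"world")

assert len(one_shot.digest()) == 32
assert streamed.digest() == one_shot.digest()
```

This equivalence is exactly what the buffered `update` implementation below must preserve regardless of how callers split their input.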
Implementation

const Blake2b256 = struct {
    /// Output size in bytes
    pub const DIGEST_SIZE: usize = 32;
    /// Block size for cost calculation
    pub const BLOCK_SIZE: usize = 128;

    state: [8]u64,
    buf: [BLOCK_SIZE]u8,
    buf_len: usize,
    total_len: u128,

    const IV: [8]u64 = .{
        0x6a09e667f3bcc908, 0xbb67ae8584caa73b,
        0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
        0x510e527fade682d1, 0x9b05688c2b3e6c1f,
        0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
    };

    pub fn init() Blake2b256 {
        var self = Blake2b256{
            .state = IV,
            .buf = undefined,
            .buf_len = 0,
            .total_len = 0,
        };
        // Parameter block XOR (digest length, fanout, depth)
        self.state[0] ^= 0x01010000 ^ DIGEST_SIZE;
        return self;
    }

    pub fn update(self: *Blake2b256, data: []const u8) void {
        var offset: usize = 0;
        self.total_len += data.len;

        // Fill the buffer and compress only if more input follows:
        // the final block must be left for final() to compress.
        if (self.buf_len > 0 and self.buf_len + data.len > BLOCK_SIZE) {
            const fill = BLOCK_SIZE - self.buf_len;
            @memcpy(self.buf[self.buf_len..][0..fill], data[0..fill]);
            self.compress(false);
            self.buf_len = 0;
            offset = fill;
        }

        // Process full blocks, keeping at least one byte buffered so the
        // last block is always compressed with the final flag in final().
        while (data.len - offset > BLOCK_SIZE) {
            @memcpy(&self.buf, data[offset..][0..BLOCK_SIZE]);
            self.compress(false);
            offset += BLOCK_SIZE;
        }

        // Buffer the remainder
        const remaining = data.len - offset;
        if (remaining > 0) {
            @memcpy(self.buf[self.buf_len..][0..remaining], data[offset..][0..remaining]);
            self.buf_len += remaining;
        }
    }

    pub fn final(self: *Blake2b256) [DIGEST_SIZE]u8 {
        // Pad with zeros
        @memset(self.buf[self.buf_len..], 0);
        self.compress(true); // Final block

        var result: [DIGEST_SIZE]u8 = undefined;
        for (self.state[0..4], 0..) |s, i| {
            @memcpy(result[i * 8 ..][0..8], &std.mem.toBytes(std.mem.nativeToLittle(u64, s)));
        }
        return result;
    }

    fn compress(self: *Blake2b256, is_final: bool) void {
        // BLAKE2b compression function
        // ... (standard BLAKE2b round function)
        _ = is_final;
    }

    /// One-shot hash
    pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
        var hasher = init();
        hasher.update(data);
        return hasher.final();
    }
};

AST Node

const CalcBlake2b256 = struct {
    input: *const Expr, // Coll[Byte]

    pub const OP_CODE = OpCode.new(87);

    pub const COST = PerItemCost{
        .base = JitCost{ .value = 20 },
        .per_chunk = JitCost{ .value = 7 },
        .chunk_size = 128,
    };

    pub fn tpe(_: *const CalcBlake2b256) SType {
        return .{ .coll = &SType.byte };
    }

    pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
        const input_bytes = try self.input.eval(env, E);
        const coll = input_bytes.coll.bytes;

        // Add cost based on input length
        try E.addSeqCost(COST, coll.len, OP_CODE);

        const result = Blake2b256.hash(coll);
        return try E.allocator.dupe(u8, &result);
    }
};

SHA-256

Available for external system compatibility (Bitcoin, etc.) [5][6].

Implementation

const Sha256 = struct {
    pub const DIGEST_SIZE: usize = 32;
    pub const BLOCK_SIZE: usize = 64;

    state: [8]u32,
    buf: [BLOCK_SIZE]u8,
    buf_len: usize,
    total_len: u64,

    const K: [64]u32 = .{
        0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
        0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
        // ... remaining round constants
    };

    const H0: [8]u32 = .{
        0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
        0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
    };

    pub fn init() Sha256 {
        return .{
            .state = H0,
            .buf = undefined,
            .buf_len = 0,
            .total_len = 0,
        };
    }

    pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
        var hasher = init();
        hasher.update(data);
        return hasher.final();
    }

    // ... update, final, compress methods
};

AST Node

const CalcSha256 = struct {
    input: *const Expr,

    pub const OP_CODE = OpCode.new(88);

    /// SHA-256 is more expensive than BLAKE2b
    pub const COST = PerItemCost{
        .base = JitCost{ .value = 80 },
        .per_chunk = JitCost{ .value = 8 },
        .chunk_size = 64,
    };

    pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
        const input_bytes = try self.input.eval(env, E);
        const coll = input_bytes.coll.bytes;

        try E.addSeqCost(COST, coll.len, OP_CODE);

        const result = Sha256.hash(coll);
        return try E.allocator.dupe(u8, &result);
    }
};

Cost Comparison

Hash Function Costs
─────────────────────────────────────────────────────
             Base    Per Chunk   Chunk Size
─────────────────────────────────────────────────────
BLAKE2b256   20      7           128 bytes
SHA-256      80      8           64 bytes
─────────────────────────────────────────────────────

Cost Formula:  total = base + ceil(len / chunk_size) * per_chunk

Example: 200-byte Input

BLAKE2b256:
  chunks = ceil(200 / 128) = 2
  cost = 20 + 2 * 7 = 34

SHA-256:
  chunks = ceil(200 / 64) = 4
  cost = 80 + 4 * 8 = 112

Ratio: SHA-256 is ~3.3x more expensive
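The formula and worked example above can be reproduced directly. A small Python sketch (the function name per_item_cost is illustrative, not from the codebase):

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, length: int) -> int:
    """total = base + ceil(length / chunk_size) * per_chunk"""
    chunks = math.ceil(length / chunk_size)
    return base + chunks * per_chunk

# 200-byte input, using the costs from the table above
blake_cost = per_item_cost(base=20, per_chunk=7, chunk_size=128, length=200)
sha_cost = per_item_cost(base=80, per_chunk=8, chunk_size=64, length=200)
assert blake_cost == 34    # 20 + 2*7
assert sha_cost == 112     # 80 + 4*8
```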

Fiat-Shamir Hash

Internal hash for Sigma protocol challenge generation [7][8]:

const FiatShamir = struct {
    /// Soundness bits (192 = 24 bytes)
    pub const SOUNDNESS_BITS: u32 = 192;
    pub const SOUNDNESS_BYTES: usize = SOUNDNESS_BITS / 8; // 24

    /// Fiat-Shamir hash function
    /// Returns first 24 bytes of BLAKE2b256 hash
    pub fn hashFn(input: []const u8) [SOUNDNESS_BYTES]u8 {
        const full_hash = Blake2b256.hash(input);

        var result: [SOUNDNESS_BYTES]u8 = undefined;
        @memcpy(&result, full_hash[0..SOUNDNESS_BYTES]);
        return result;
    }
};
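As a cross-check of the truncation logic: Python's hashlib.blake2b with digest_size=32 computes the same BLAKE2b-256 digest that the struct above truncates (fiat_shamir_hash is an illustrative name):

```python
import hashlib

SOUNDNESS_BITS = 192
SOUNDNESS_BYTES = SOUNDNESS_BITS // 8  # 24

def fiat_shamir_hash(data: bytes) -> bytes:
    """First 24 bytes of the 32-byte BLAKE2b-256 digest."""
    full = hashlib.blake2b(data, digest_size=32).digest()
    return full[:SOUNDNESS_BYTES]

challenge = fiat_shamir_hash(b"serialized proof tree")
assert len(challenge) == 24
```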

Why 192 Bits?

The truncation to 192 bits is not arbitrary [9]:

Security Constraints
─────────────────────────────────────────────────────
1. Challenge must be unpredictable to cheating prover
2. Threshold signatures use GF(2^192) polynomials
3. Must satisfy: 2^soundnessBits < group_order
4. Group order ≈ 2^256, so 192 < 256 works

comptime {
    // This constraint is critical for security
    std.debug.assert(FiatShamir.SOUNDNESS_BITS < CryptoConstants.GROUP_SIZE_BITS);
}

Fiat-Shamir Tree Serialization

The challenge is computed from a serialized proof tree [10]:

const FiatShamirTreeSerializer = struct {
    const INTERNAL_NODE_PREFIX: u8 = 0;
    const LEAF_PREFIX: u8 = 1;

    pub fn serialize(tree: *const ProofTree, writer: anytype) !void {
        switch (tree.*) {
            .leaf => |leaf| {
                try writer.writeByte(LEAF_PREFIX);

                // Serialize proposition as ErgoTree
                const prop_bytes = try leaf.proposition.toErgoTreeBytes();
                try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
                try writer.writeAll(prop_bytes);

                // Serialize commitment
                const commitment = leaf.commitment orelse
                    return error.EmptyCommitment;
                try writer.writeInt(i16, @intCast(commitment.len), .big);
                try writer.writeAll(commitment);
            },
            .conjecture => |conj| {
                try writer.writeByte(INTERNAL_NODE_PREFIX);
                try writer.writeByte(@intFromEnum(conj.conj_type));

                // Threshold k for CTHRESHOLD
                if (conj.conj_type == .cthreshold) {
                    try writer.writeByte(conj.k);
                }

                try writer.writeInt(i16, @intCast(conj.children.len), .big);
                for (conj.children) |child| {
                    try serialize(child, writer);
                }
            },
        }
    }
};

HMAC-SHA512

For BIP32/BIP39 key derivation [11]:

const HmacSha512 = struct {
    pub const DIGEST_SIZE: usize = 64;
    pub const BLOCK_SIZE: usize = 128;

    inner: Sha512,
    outer: Sha512,

    pub fn init(key: []const u8) HmacSha512 {
        var padded_key: [BLOCK_SIZE]u8 = [_]u8{0} ** BLOCK_SIZE;

        if (key.len > BLOCK_SIZE) {
            const hashed = Sha512.hash(key);
            @memcpy(padded_key[0..64], &hashed);
        } else {
            @memcpy(padded_key[0..key.len], key);
        }

        // Inner padding (0x36)
        var inner_pad: [BLOCK_SIZE]u8 = undefined;
        for (padded_key, 0..) |b, i| {
            inner_pad[i] = b ^ 0x36;
        }

        // Outer padding (0x5c)
        var outer_pad: [BLOCK_SIZE]u8 = undefined;
        for (padded_key, 0..) |b, i| {
            outer_pad[i] = b ^ 0x5c;
        }

        var self = HmacSha512{
            .inner = Sha512.init(),
            .outer = Sha512.init(),
        };
        self.inner.update(&inner_pad);
        self.outer.update(&outer_pad);
        return self;
    }

    pub fn update(self: *HmacSha512, data: []const u8) void {
        self.inner.update(data);
    }

    pub fn final(self: *HmacSha512) [DIGEST_SIZE]u8 {
        const inner_hash = self.inner.final();
        self.outer.update(&inner_hash);
        return self.outer.final();
    }

    pub fn hash(key: []const u8, data: []const u8) [DIGEST_SIZE]u8 {
        var hmac = init(key);
        hmac.update(data);
        return hmac.final();
    }
};
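The ipad/opad construction in init() and final() above is the standard HMAC scheme. This Python sketch builds HMAC-SHA512 the same way and checks it against the stdlib hmac module (hmac_sha512_manual is an illustrative name):

```python
import hashlib
import hmac

BLOCK_SIZE = 128  # SHA-512 block size in bytes

def hmac_sha512_manual(key: bytes, data: bytes) -> bytes:
    # Keys longer than a block are hashed first, then zero-padded
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha512(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner = bytes(b ^ 0x36 for b in key)  # inner padding
    outer = bytes(b ^ 0x5C for b in key)  # outer padding
    inner_hash = hashlib.sha512(inner + data).digest()
    return hashlib.sha512(outer + inner_hash).digest()

out = hmac_sha512_manual(b"Bitcoin seed", b"entropy bytes")
assert out == hmac.new(b"Bitcoin seed", b"entropy bytes", hashlib.sha512).digest()
assert len(out) == 64
```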

Key Derivation Constants

const KeyDerivation = struct {
    /// BIP32 master key HMAC key
    pub const BITCOIN_SEED = "Bitcoin seed";

    /// PBKDF2 iterations for BIP39 seed derivation
    pub const PBKDF2_ITERATIONS: u32 = 2048;

    /// Derived key length in bits (64 bytes)
    pub const PBKDF2_KEY_LENGTH: u32 = 512;
};
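For context, the constants above plug into the standard BIP39 seed derivation: PBKDF2-HMAC-SHA512 with 2048 iterations, the salt "mnemonic" plus an optional passphrase, and a 64-byte output. A Python sketch (bip39_seed is an illustrative name):

```python
import hashlib
import unicodedata

def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP39 seed: PBKDF2-HMAC-SHA512, 2048 iterations, 64-byte output."""
    m = unicodedata.normalize("NFKD", mnemonic).encode()
    salt = ("mnemonic" + passphrase).encode()
    return hashlib.pbkdf2_hmac("sha512", m, salt, 2048, dklen=64)

seed = bip39_seed("legal winner thank year wave sausage worth useful legal winner thank yellow")
assert len(seed) == 64
```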

Box ID Computation

Box IDs are BLAKE2b256 hashes of box content:

pub fn computeBoxId(box_bytes: []const u8) [32]u8 {
    return Blake2b256.hash(box_bytes);
}

Summary

This chapter covered the hash functions that provide cryptographic integrity throughout the Sigma protocol:

  • BLAKE2b256 is the primary hash function—approximately 3x cheaper than SHA-256 due to its larger block size (128 bytes vs 64 bytes) and optimized design
  • SHA-256 is available for external system compatibility (Bitcoin scripts, cross-chain verification)
  • Fiat-Shamir challenge generation uses BLAKE2b256 truncated to 192 bits, matching the threshold signature polynomial field size while satisfying the constraint 2^192 < group_order
  • Per-item costing calculates hash cost as base + ceil(input_length / block_size) * per_chunk, accurately reflecting the computational work
  • HMAC-SHA512 provides key derivation for BIP32/BIP39 wallet compatibility, using the standard "Bitcoin seed" key
  • Box IDs are computed as BLAKE2b256 hashes of serialized box content, providing content-addressable identification

Next: Chapter 11: Sigma Protocols

[2] Rust: hash.rs:5-26
[3] Scala: trees.scala (CalcBlake2b256)
[5] Scala: trees.scala (CalcSha256)
[11] Scala: HmacSHA512.scala

Chapter 11: Sigma Protocols


Prerequisites

  • Chapter 9 for elliptic curve operations and the discrete logarithm problem
  • Chapter 10 for Fiat-Shamir hash generation
  • Understanding of zero-knowledge proofs: proving knowledge without revealing secrets

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the three-move Sigma protocol structure (commitment, challenge, response)
  • Implement the Schnorr (DLog) protocol for proving knowledge of a discrete logarithm
  • Describe the Diffie-Hellman Tuple protocol for proving equality of discrete logs
  • Compose protocols using AND, OR, and THRESHOLD operations
  • Apply the Fiat-Shamir transformation to convert interactive proofs to non-interactive

Sigma Protocol Structure

Sigma (Σ) protocols are the cryptographic foundation that makes Ergo's smart contracts possible. Named for their characteristic three-move "sigma-shaped" structure, they enable a prover to convince a verifier that they know a secret without revealing anything about that secret—the defining property of zero-knowledge proofs.

A Sigma protocol is a three-move interactive proof [1][2]:

Sigma Protocol Flow
─────────────────────────────────────────────────────

    Prover (P)                           Verifier (V)
        │                                      │
        │   ────────  a (commitment) ───────>  │
        │                                      │
        │   <───────  e (challenge) ─────────  │
        │                                      │
        │   ────────  z (response) ─────────>  │
        │                                      │
        │                          Verify(a, e, z)?

Message Types

/// First message: prover's commitment
const FirstProverMessage = union(enum) {
    dlog: FirstDlogProverMessage,
    dht: FirstDhtProverMessage,

    /// Serialized commitment bytes (allocated; slices cannot be
    /// concatenated at runtime in Zig)
    pub fn bytes(self: FirstProverMessage, allocator: Allocator) ![]u8 {
        switch (self) {
            .dlog => |m| {
                const arr = m.bytes(); // 33 bytes
                return try allocator.dupe(u8, &arr);
            },
            .dht => |m| {
                const arr = m.bytes(); // a || b, 66 bytes
                return try allocator.dupe(u8, &arr);
            },
        }
    }
};

/// Second message: prover's response
const SecondProverMessage = union(enum) {
    dlog: SecondDlogProverMessage,
    dht: SecondDhtProverMessage,
};

/// Challenge from verifier (192 bits = 24 bytes)
const Challenge = [FiatShamir.SOUNDNESS_BYTES]u8;

Schnorr Protocol (Discrete Log)

Proves knowledge of secret w such that h = g^w [3][4]:

Schnorr Protocol Steps
─────────────────────────────────────────────────────
Given: g (generator), h = g^w (public key), w (secret)

Step     Message    Computation
─────────────────────────────────────────────────────
1. Commit    a      r ← random, a = g^r
2. Challenge e      Verifier sends random e
3. Response  z      z = r + e·w (mod q)
4. Verify    ✓      g^z = a · h^e
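The algebra of the four steps can be checked in a toy group, here a subgroup of the integers mod 1019 with prime order 509 (insecure parameters, purely illustrative; real implementations use secp256k1):

```python
import random

# Toy group: 4 generates the order-509 subgroup of Z_1019* (1019 and 509 prime)
p, q, g = 1019, 509, 4

w = random.randrange(1, q)   # secret
h = pow(g, w, p)             # public key h = g^w

# 1. Commit: a = g^r
r = random.randrange(1, q)
a = pow(g, r, p)
# 2. Challenge: verifier sends random e
e = random.randrange(1, q)
# 3. Response: z = r + e*w (mod q)
z = (r + e * w) % q
# 4. Verify: g^z == a * h^e
assert pow(g, z, p) == (a * pow(h, e, p)) % p
```

The verification works because g^z = g^(r + e·w) = g^r · (g^w)^e = a · h^e, with exponent arithmetic valid mod q since g has order q.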

Implementation

const DlogProverInput = struct {
    /// Secret scalar w in [0, q-1]
    w: Scalar,

    /// Compute public image h = g^w
    pub fn publicImage(self: *const DlogProverInput) ProveDlog {
        const g = DlogGroup.generator();
        const h = DlogGroup.exponentiate(&g, &self.w);
        return ProveDlog{ .h = h };
    }

    /// Generate random secret
    pub fn random(rng: std.rand.Random) DlogProverInput {
        return .{ .w = Scalar.random(rng) };
    }
};

/// First message: commitment a = g^r
const FirstDlogProverMessage = struct {
    a: EcPoint,

    pub fn bytes(self: *const FirstDlogProverMessage) [33]u8 {
        return GroupElementSerializer.serialize(&self.a);
    }
};

/// Second message: response z
const SecondDlogProverMessage = struct {
    z: Scalar,
};

Prover Steps

const DlogProver = struct {
    /// Step 1: Generate commitment (real proof)
    pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, msg: FirstDlogProverMessage } {
        const r = Scalar.random(rng);
        const g = DlogGroup.generator();
        const a = DlogGroup.exponentiate(&g, &r);
        return .{ .r = r, .msg = .{ .a = a } };
    }

    /// Step 3: Compute response z = r + e·w (mod q)
    pub fn secondMessage(
        private_input: *const DlogProverInput,
        r: Scalar,
        challenge: *const Challenge,
    ) SecondDlogProverMessage {
        const e = Scalar.fromBytes(challenge);
        const ew = e.mul(&private_input.w);  // e * w mod q
        const z = r.add(&ew);                 // r + ew mod q
        return .{ .z = z };
    }
};

Simulation

For OR composition, we need to simulate proofs without knowing the secret [5]:

/// Simulate transcript without knowing secret
/// Given challenge e, produce valid-looking (a, z)
pub fn simulate(
    public_input: *const ProveDlog,
    challenge: *const Challenge,
    rng: std.rand.Random,
) struct { first: FirstDlogProverMessage, second: SecondDlogProverMessage } {
    // SAMPLE random z
    const z = Scalar.random(rng);

    // COMPUTE a = g^z · h^(-e)
    // This satisfies verification equation: g^z = a · h^e
    const e = Scalar.fromBytes(challenge);
    const minus_e = e.negate();
    const g = DlogGroup.generator();
    const h = public_input.h;

    const g_to_z = DlogGroup.exponentiate(&g, &z);
    const h_to_minus_e = DlogGroup.exponentiate(&h, &minus_e);
    const a = DlogGroup.multiply(&g_to_z, &h_to_minus_e);

    return .{
        .first = .{ .a = a },
        .second = .{ .z = z },
    };
}
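The same toy group shows why simulation works: choosing z first and solving for a yields a transcript that verifies without ever using the secret w (insecure parameters, purely illustrative):

```python
import random

p, q, g = 1019, 509, 4          # toy order-509 subgroup of Z_1019*
w = random.randrange(1, q)
h = pow(g, w, p)                 # public key; the simulator never uses w

# Simulate: sample z first, then solve for a = g^z * h^(-e)
e = random.randrange(1, q)
z = random.randrange(1, q)
h_e_inv = pow(pow(h, e, p), p - 2, p)   # modular inverse (Fermat)
a = (pow(g, z, p) * h_e_inv) % p

# The transcript (a, e, z) passes verification: a * h^e = g^z
assert pow(g, z, p) == (a * pow(h, e, p)) % p
```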

Verification (Commitment Reconstruction)

/// Verify: reconstruct a from z and e, check equality
/// g^z = a · h^e  =>  a = g^z / h^e
pub fn computeCommitment(
    proposition: *const ProveDlog,
    challenge: *const Challenge,
    second_msg: *const SecondDlogProverMessage,
) EcPoint {
    const g = DlogGroup.generator();
    const h = proposition.h;
    const e = Scalar.fromBytes(challenge);

    const g_to_z = DlogGroup.exponentiate(&g, &second_msg.z);
    const h_to_e = DlogGroup.exponentiate(&h, &e);
    const h_to_e_inv = DlogGroup.inverse(&h_to_e);

    return DlogGroup.multiply(&g_to_z, &h_to_e_inv);
}

Diffie-Hellman Tuple Protocol

Proves knowledge of w such that u = g^w AND v = h^w [6][7]:

DHT Protocol: Prove (u, v) share the same discrete log

Given: g, h (generators), u = g^w, v = h^w (public tuple)

Step     Message      Computation
─────────────────────────────────────────────────────
1. Commit    (a, b)    r ← random, a = g^r, b = h^r
2. Challenge e         Verifier sends random e
3. Response  z         z = r + e·w (mod q)
4. Verify    ✓         g^z = a·u^e  AND  h^z = b·v^e
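A toy-group check of the two DHT verification equations, using the same insecure illustrative parameters as the Schnorr example:

```python
import random

p, q = 1019, 509
g, h = 4, 9                      # two generators of the order-509 subgroup

w = random.randrange(1, q)       # shared secret exponent
u, v = pow(g, w, p), pow(h, w, p)

r = random.randrange(1, q)
a, b = pow(g, r, p), pow(h, r, p)   # commitment pair (a, b)
e = random.randrange(1, q)          # challenge
z = (r + e * w) % q                 # one response covers both relations

assert pow(g, z, p) == (a * pow(u, e, p)) % p
assert pow(h, z, p) == (b * pow(v, e, p)) % p
```

A single response z proves both equations simultaneously, which is what binds u and v to the same discrete log.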

Implementation

const ProveDhTuple = struct {
    g: EcPoint,
    h: EcPoint,
    u: EcPoint,  // u = g^w
    v: EcPoint,  // v = h^w

    pub const OP_CODE = OpCode.ProveDiffieHellmanTuple;
};

const FirstDhtProverMessage = struct {
    a: EcPoint,  // a = g^r
    b: EcPoint,  // b = h^r

    pub fn bytes(self: *const FirstDhtProverMessage) [66]u8 {
        var result: [66]u8 = undefined;
        @memcpy(result[0..33], &GroupElementSerializer.serialize(&self.a));
        @memcpy(result[33..66], &GroupElementSerializer.serialize(&self.b));
        return result;
    }
};

const DhtProver = struct {
    pub fn firstMessage(
        public_input: *const ProveDhTuple,
        rng: std.rand.Random,
    ) struct { r: Scalar, msg: FirstDhtProverMessage } {
        const r = Scalar.random(rng);
        const a = DlogGroup.exponentiate(&public_input.g, &r);
        const b = DlogGroup.exponentiate(&public_input.h, &r);
        return .{ .r = r, .msg = .{ .a = a, .b = b } };
    }
};

SigmaBoolean Proposition Types

Propositions form a tree structure [8][9]:

const SigmaBoolean = union(enum) {
    /// Leaf: prove knowledge of discrete log
    prove_dlog: ProveDlog,
    /// Leaf: prove DHT equality
    prove_dh_tuple: ProveDhTuple,
    /// Conjunction: all children must be proven
    cand: Cand,
    /// Disjunction: at least one child proven
    cor: Cor,
    /// Threshold: k-of-n children proven
    cthreshold: Cthreshold,
    /// Trivially true
    trivial_true: void,
    /// Trivially false
    trivial_false: void,

    pub fn opCode(self: SigmaBoolean) OpCode {
        return switch (self) {
            .prove_dlog => OpCode.ProveDlog,
            .prove_dh_tuple => OpCode.ProveDiffieHellmanTuple,
            .cand => OpCode.SigmaAnd,
            .cor => OpCode.SigmaOr,
            .cthreshold => OpCode.Atleast,
            else => OpCode.Constant,
        };
    }

    /// Count nodes in proposition tree
    pub fn size(self: SigmaBoolean) usize {
        return switch (self) {
            .cand => |c| 1 + sumChildSizes(c.children),
            .cor => |c| 1 + sumChildSizes(c.children),
            .cthreshold => |c| 1 + sumChildSizes(c.children),
            else => 1,
        };
    }
};

const ProveDlog = struct {
    h: EcPoint,  // Public key h = g^w

    pub const OP_CODE = OpCode.ProveDlog;
};

const Cand = struct {
    children: []const SigmaBoolean,
};

const Cor = struct {
    children: []const SigmaBoolean,
};

const Cthreshold = struct {
    k: u8,  // Threshold
    children: []const SigmaBoolean,
};

Protocol Composition

AND Composition

All children share the same challenge [10]:

       Challenge e
          │
    ┌─────┴─────┐
    │           │
  σ₁(e)       σ₂(e)
   real        real

/// AND: prove all children with the same challenge
fn proveAnd(
    allocator: Allocator,
    children: []const *SigmaBoolean,
    secrets: []const PrivateInput,
    challenge: *const Challenge,
) ![]ProofNode {
    var proofs = try allocator.alloc(ProofNode, children.len);
    for (children, secrets, 0..) |child, secret, i| {
        proofs[i] = proveReal(child, secret, challenge);
    }
    return proofs;
}

OR Composition

At least one child is real; others are simulated [11]:

       Challenge e
          │
    ┌─────┴─────┐
    │           │
  σ₁(e₁)      σ₂(e₂)
   REAL       SIMULATED

Constraint: e₁ ⊕ e₂ = e (XOR)

/// OR: one real proof, rest simulated
/// Challenges must XOR to root challenge
fn proveOr(
    allocator: Allocator,
    children: []const *SigmaBoolean,
    real_index: usize,
    secret: PrivateInput,
    challenge: *const Challenge,
    rng: std.rand.Random,
) ![]ProofNode {
    var proofs = try allocator.alloc(ProofNode, children.len);
    var challenge_sum: Challenge = [_]u8{0} ** FiatShamir.SOUNDNESS_BYTES;

    // First: generate simulated proofs with random challenges
    for (children, 0..) |child, i| {
        if (i != real_index) {
            var sim_challenge: Challenge = undefined;
            rng.bytes(&sim_challenge);
            proofs[i] = simulate(child, &sim_challenge, rng);
            for (&challenge_sum, sim_challenge) |*acc, b| acc.* ^= b;
        }
    }

    // Derive real challenge: e_real = e ⊕ (XOR of simulated challenges)
    var real_challenge: Challenge = undefined;
    for (0..FiatShamir.SOUNDNESS_BYTES) |i| {
        real_challenge[i] = challenge[i] ^ challenge_sum[i];
    }

    proofs[real_index] = proveReal(children[real_index], secret, &real_challenge);
    return proofs;
}
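The XOR challenge constraint can be sanity-checked in a few lines of Python (the helper xor and the variable names are illustrative):

```python
import os
from functools import reduce

SOUNDNESS_BYTES = 24

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

root = os.urandom(SOUNDNESS_BYTES)                       # Fiat-Shamir root challenge
sims = [os.urandom(SOUNDNESS_BYTES) for _ in range(2)]   # simulated children
real = reduce(xor, sims, root)                           # e_real = e ⊕ e_sim1 ⊕ e_sim2

# All child challenges XOR back to the root challenge
assert reduce(xor, sims + [real]) == root
```

Because the real challenge is forced by the simulated ones, the prover cannot choose all child challenges freely, which is exactly what prevents proving an OR with no known secret.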

THRESHOLD (k-of-n)

Uses polynomial interpolation over GF(2^192) [12]:

Threshold k-of-n Challenge Distribution
─────────────────────────────────────────────────────
- Construct polynomial p(x) of degree n-k
- p(0) = e (root challenge)
- Each child i gets challenge p(i)
- k real children, (n-k) simulated

const GF2_192 = struct {
    /// 192 bits = 3 × 64-bit words
    words: [3]u64,

    pub fn fromChallenge(c: *const Challenge) GF2_192 {
        var self = GF2_192{ .words = .{ 0, 0, 0 } };
        // Load 24 bytes into 3 words (only 192 bits used)
        @memcpy(std.mem.asBytes(&self.words[0])[0..8], c[0..8]);
        @memcpy(std.mem.asBytes(&self.words[1])[0..8], c[8..16]);
        @memcpy(std.mem.asBytes(&self.words[2])[0..8], c[16..24]);
        return self;
    }

    pub fn add(a: *const GF2_192, b: *const GF2_192) GF2_192 {
        // Addition in GF(2^192) is XOR
        return .{ .words = .{
            a.words[0] ^ b.words[0],
            a.words[1] ^ b.words[1],
            a.words[2] ^ b.words[2],
        } };
    }

    // Multiplication uses polynomial representation with reduction
    // NOTE: This is a stub. Full implementation requires:
    // 1. Carry-less multiplication of 192-bit polynomials
    // 2. Reduction modulo irreducible polynomial x^192 + x^7 + x^2 + x + 1
    // See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192.rs
    pub fn mul(a: *const GF2_192, b: *const GF2_192) GF2_192 {
        _ = a;
        _ = b;
        @compileError("GF2_192.mul() not implemented - see reference implementations");
    }
};

const GF2_192_Poly = struct {
    coefficients: []GF2_192,
    degree: usize,

    /// Evaluate polynomial at point x using Horner's method.
    /// mulByByte (assumed helper, not shown) multiplies a field element
    /// by the field element encoded by the byte x.
    pub fn evaluate(self: *const GF2_192_Poly, x: u8) GF2_192 {
        var result = self.coefficients[self.degree];
        var i = self.degree;
        while (i > 0) {
            i -= 1;
            result = GF2_192.add(&GF2_192.mulByByte(&result, x), &self.coefficients[i]);
        }
        }
        return result;
    }

    /// Lagrange interpolation through given points
    /// NOTE: Stub - full implementation requires GF2_192 arithmetic
    /// See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192_poly.rs
    pub fn interpolate(
        points: []const u8,
        values: []const GF2_192,
        value_at_0: GF2_192,
    ) GF2_192_Poly {
        // Construct unique polynomial of degree (n-1)
        // passing through n points with p(0) = value_at_0
        _ = points;
        _ = values;
        _ = value_at_0;
        @compileError("GF2_192_Poly.interpolate() not implemented");
    }
};

// NOTE: In production, all scalar operations (add, mul, negate) must be
// constant-time to prevent timing side-channel attacks. See ZIGMA_STYLE.md.

Mathematical correctness of GF(2^192) polynomial interpolation:

The threshold k-of-n scheme uses polynomial interpolation over the finite field GF(2^192):

  1. Field properties: GF(2^192) is a finite field where addition is XOR and multiplication is polynomial multiplication modulo an irreducible polynomial (x^192 + x^7 + x^2 + x + 1). All arithmetic is well-defined and invertible.

  2. Lagrange interpolation: Given any k distinct points (x₁, y₁), ..., (xₖ, yₖ), there exists a unique polynomial p(x) of degree at most k-1 passing through all points. This is constructed using Lagrange basis polynomials.

  3. Challenge distribution: The prover constructs a polynomial of degree n-k with p(0) = root_challenge. Simulated children's challenges are assigned randomly, and the polynomial is interpolated through these points. Real children receive challenges p(i) computed from this polynomial.

  4. Security: The 192-bit field size matches SOUNDNESS_BITS, ensuring that a cheating prover (who knows fewer than k secrets) succeeds with probability at most 2^-192—cryptographically negligible.

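The interpolation argument in point 2 can be demonstrated in a small prime field, GF(97), where modular inverses are easy to compute in Python; the algebra carries over to GF(2^192) (interpolate_at_zero is an illustrative name):

```python
# Lagrange interpolation over GF(97), a prime-field analogy for GF(2^192)
P = 97

def interpolate_at_zero(points: list) -> int:
    """Evaluate at x=0 the unique polynomial through the given points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % P        # numerator: prod of (0 - xj)
                den = den * (xi - xj) % P    # denominator: prod of (xi - xj)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

# p(x) = 7 + 3x + 5x^2, so p(0) = 7; three points determine it uniquely
poly = lambda x: (7 + 3 * x + 5 * x * x) % P
pts = [(1, poly(1)), (2, poly(2)), (5, poly(5))]
assert interpolate_at_zero(pts) == 7
```

In the threshold scheme the same uniqueness property means the prover is forced to a fixed challenge for each real child once p(0) and the simulated challenges are pinned down.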
Proof Trees

Track proof state during proving [13]:

const UnprovenTree = union(enum) {
    leaf: UnprovenLeaf,
    conjecture: UnprovenConjecture,
};

const UnprovenLeaf = struct {
    proposition: SigmaBoolean,
    position: NodePosition,
    simulated: bool,
    commitment_opt: ?FirstProverMessage,
    randomness_opt: ?Scalar,
    challenge_opt: ?Challenge,
};

const UnprovenConjecture = struct {
    conj_type: enum { and_, or_, threshold }, // `and` is a Zig keyword
    children: []UnprovenTree,
    position: NodePosition,
    simulated: bool,
    challenge_opt: ?Challenge,
    k: ?u8,  // For threshold
    polynomial_opt: ?GF2_192_Poly,
};

/// Position in tree: "0-2-1" means root → child 2 → child 1
const NodePosition = struct {
    positions: []const usize,

    pub fn child(self: NodePosition, allocator: Allocator, idx: usize) !NodePosition {
        // Slices cannot be concatenated at runtime; allocate an extended copy
        const extended = try allocator.alloc(usize, self.positions.len + 1);
        @memcpy(extended[0..self.positions.len], self.positions);
        extended[self.positions.len] = idx;
        return .{ .positions = extended };
    }

    pub const CRYPTO_PREFIX = NodePosition{ .positions = &.{0} };
};

Fiat-Shamir Transformation

Convert interactive to non-interactive by deriving the challenge from a hash [14]:

/// Derive challenge from tree serialization
pub fn fiatShamirChallenge(allocator: Allocator, tree: *const ProofTree) !Challenge {
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    try fiatShamirSerialize(tree, buf.writer());
    return FiatShamir.hashFn(buf.items);
}

fn fiatShamirSerialize(tree: *const ProofTree, writer: anytype) !void {
    const INTERNAL_PREFIX: u8 = 0;
    const LEAF_PREFIX: u8 = 1;

    switch (tree.*) {
        .leaf => |leaf| {
            try writer.writeByte(LEAF_PREFIX);
            // Proposition bytes
            const prop_bytes = try leaf.proposition.toErgoTreeBytes();
            try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
            try writer.writeAll(prop_bytes);
            // Commitment bytes
            const commitment = leaf.commitment_opt orelse return error.NoCommitment;
            const comm_bytes = commitment.bytes();
            try writer.writeInt(i16, @intCast(comm_bytes.len), .big);
            try writer.writeAll(comm_bytes);
        },
        .conjecture => |conj| {
            try writer.writeByte(INTERNAL_PREFIX);
            try writer.writeByte(@intFromEnum(conj.conj_type));
            if (conj.k) |k| try writer.writeByte(k);
            try writer.writeInt(i16, @intCast(conj.children.len), .big);
            for (conj.children) |child| {
                try fiatShamirSerialize(child, writer);
            }
        },
    }
}

Security Properties

Security Properties
─────────────────────────────────────────────────────
Property         Meaning
─────────────────────────────────────────────────────
Completeness     Honest prover always convinces
Soundness        Cheater succeeds with prob ≤ 2^-192
Zero-Knowledge   Proof reveals nothing about secret
Special Sound.   Two transcripts extract secret

Summary

This chapter covered Sigma protocols—the zero-knowledge proof system that forms the cryptographic core of Ergo's smart contracts:

  • Sigma protocols use a three-move structure: the prover sends a commitment, receives a challenge, and responds with a value that proves knowledge without revealing the secret
  • Schnorr (DLog) protocol proves knowledge of a discrete logarithm: given h = g^w, prove knowledge of w without revealing it
  • Diffie-Hellman Tuple protocol proves equality of discrete logs across different bases: given u = g^w and v = h^w, prove that u and v share the same discrete log
  • AND composition applies the same challenge to all children—all must be proven
  • OR composition distributes challenges via XOR constraint—only one child needs a real proof, others are simulated
  • THRESHOLD (k-of-n) uses GF(2^192) polynomial interpolation to distribute challenges, requiring k real proofs
  • Simulation generates valid-looking transcripts without knowing secrets, enabling OR and threshold compositions
  • Fiat-Shamir transformation makes interactive protocols non-interactive by deriving the challenge from a hash of the commitments

Next: Chapter 12: Evaluation Model

[10] Rust: cand.rs
[11] Rust: cor.rs
[14] Rust: fiat_shamir.rs

Chapter 12: Evaluation Model


Prerequisites

  • Chapter 4 for the AST structure and Value hierarchy
  • Chapter 5 for opcodes and cost descriptors
  • Chapter 3 for ErgoTree format and constant segregation

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain direct-style big-step interpretation and why it suits ErgoTree evaluation
  • Implement eval dispatch for AST node types (constants, variables, functions, operations)
  • Work with the Env environment structure for variable binding and closure capture
  • Track accumulated costs during evaluation to enforce resource limits

Evaluation Architecture

The Sigma interpreter transforms an ErgoTree expression into a SigmaBoolean proposition that can be proven or verified. This "reduction" process uses direct-style big-step evaluation—each expression immediately returns its result value rather than producing intermediate steps. This approach is simpler than continuation-passing style while still supporting the necessary features: lexical closures, short-circuit evaluation, and cost tracking [1][2].

Evaluation Flow
─────────────────────────────────────────────────────

┌──────────────────────────────────────────────────┐
│              ErgoTreeEvaluator                   │
├──────────────────────────────────────────────────┤
│  context: Context     (SELF, INPUTS, OUTPUTS)    │
│  constants: []Const   (segregated constants)     │
│  cost_accum: CostAcc  (tracks execution cost)    │
│  env: Env             (variable bindings)        │
└───────────────────────┬──────────────────────────┘
                        │
                        │ eval(expr)
                        ▼
┌──────────────────────────────────────────────────┐
│              AST Traversal                       │
│                                                  │
│    Expr.eval(env, ctx)                           │
│         │                                        │
│         ├── Evaluate children                    │
│         ├── Add operation cost                   │
│         ├── Perform operation                    │
│         └── Return result Value                  │
└──────────────────────────────────────────────────┘
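The flow in this diagram is just recursion that returns values and charges costs as it goes. As a quick, language-neutral illustration (Python here for brevity; the node shapes and cost constants are illustrative, not the actual implementation):

```python
# Minimal direct-style big-step evaluator sketch. Each node evaluates its
# children, adds its own cost, and immediately returns a result value.
class Cost:
    def __init__(self, limit):
        self.total, self.limit = 0, limit

    def add(self, c):
        self.total += c
        if self.total > self.limit:
            raise RuntimeError("CostLimitExceeded")

def eval_expr(expr, env, cost):
    kind = expr["kind"]
    if kind == "constant":            # leaf: fixed cost, return stored value
        cost.add(5)
        return expr["value"]
    if kind == "val_use":             # variable lookup in the environment
        cost.add(5)
        return env[expr["id"]]
    if kind == "plus":                # evaluate children, then operate
        left = eval_expr(expr["left"], env, cost)
        right = eval_expr(expr["right"], env, cost)
        cost.add(2)
        return left + right
    raise ValueError(f"unknown node: {kind}")

# (1 + x) with x bound to 41
expr = {"kind": "plus",
        "left": {"kind": "constant", "value": 1},
        "right": {"kind": "val_use", "id": 0}}
cost = Cost(limit=1000)
assert eval_expr(expr, {0: 41}, cost) == 42
assert cost.total == 12    # 5 (constant) + 5 (val_use) + 2 (plus)
```

The Zig structures in the rest of this chapter follow the same shape, with typed values and per-operation cost descriptors instead of bare integers.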

Evaluator Structure

const Evaluator = struct {
    context: *const Context,
    constants: []const Constant,
    cost_accum: CostAccumulator,
    allocator: Allocator,

    pub fn init(
        context: *const Context,
        constants: []const Constant,
        cost_limit: JitCost,
        allocator: Allocator,
    ) Evaluator {
        return .{
            .context = context,
            .constants = constants,
            .cost_accum = CostAccumulator.init(cost_limit),
            .allocator = allocator,
        };
    }

    /// Evaluate expression in given environment
    pub fn eval(self: *Evaluator, env: *const Env, expr: *const Expr) !Value {
        return expr.eval(env, self);
    }

    /// Evaluate to specific type
    pub fn evalTo(
        self: *Evaluator,
        comptime T: type,
        env: *const Env,
        expr: *const Expr,
    ) !T {
        const result = try self.eval(env, expr);
        return result.as(T) orelse error.TypeMismatch;
    }

    /// Add fixed cost
    pub fn addCost(self: *Evaluator, cost: FixedCost, op: OpCode) !void {
        try self.cost_accum.add(cost.value, op);
    }

    /// Add per-item cost: chunks = ceil(n_items / chunk_size), minimum one chunk
    pub fn addSeqCost(self: *Evaluator, cost: PerItemCost, n_items: usize, op: OpCode) !void {
        const chunk_size: usize = cost.chunk_size;
        const n_chunks: usize = if (n_items == 0) 1 else (n_items - 1) / chunk_size + 1;
        const total = cost.base.value + @as(i32, @intCast(n_chunks)) * cost.per_chunk.value;
        try self.cost_accum.add(total, op);
    }
};

Environment (Variable Binding)

The Env maps variable IDs to computed values [3][4]:

const Env = struct {
    /// HashMap from variable ID to value
    bindings: std.AutoHashMap(u32, Value),
    allocator: Allocator,

    pub fn init(allocator: Allocator) Env {
        return .{
            .bindings = std.AutoHashMap(u32, Value).init(allocator),
            .allocator = allocator,
        };
    }

    /// Look up variable by ID
    pub fn get(self: *const Env, val_id: u32) ?Value {
        return self.bindings.get(val_id);
    }

    /// Create new environment with additional binding
    /// NOTE: This implementation clones the HashMap on every extend() call.
    /// In production, use a pre-allocated binding stack with O(1) extend/pop:
    ///   bindings: [MAX_BINDINGS]Binding (pre-allocated)
    ///   stack_ptr: usize (grows/shrinks without allocation)
    /// See ZIGMA_STYLE.md for zero-allocation evaluation patterns.
    pub fn extend(self: *const Env, val_id: u32, value: Value) !Env {
        var new_env = Env{
            .bindings = try self.bindings.clone(),
            .allocator = self.allocator,
        };
        try new_env.bindings.put(val_id, value);
        return new_env;
    }

    /// Create new environment with multiple bindings
    pub fn extendMany(self: *const Env, bindings: []const struct { id: u32, val: Value }) !Env {
        var new_env = Env{
            .bindings = try self.bindings.clone(),
            .allocator = self.allocator,
        };
        for (bindings) |b| {
            try new_env.bindings.put(b.id, b.val);
        }
        return new_env;
    }
};
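The functional-update semantics of extend() are easy to check in isolation. A minimal sketch (Python, with a hypothetical dict-based environment) showing that extending never mutates the original and that re-binding an ID shadows the old value:

```python
# Functional-update environment: extend() returns a new mapping; the
# original is untouched, and re-binding an ID shadows it in the new env only.
def extend(env, val_id, value):
    new_env = dict(env)        # analogous to cloning the HashMap
    new_env[val_id] = value
    return new_env

base = {}
e1 = extend(base, 1, 10)
e2 = extend(e1, 1, 20)         # shadows id 1 in e2 only
assert base == {}
assert e1[1] == 10
assert e2[1] == 20
```

This is the behavior closures and nested blocks rely on: an outer scope's bindings are never disturbed by inner extensions.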

Expression Dispatch

Each expression type implements eval [5][6]:

const Expr = union(enum) {
    constant: Constant,
    const_placeholder: ConstantPlaceholder,
    val_use: ValUse,
    block_value: BlockValue,
    func_value: FuncValue,
    apply: Apply,
    if_op: If,
    bin_op: BinOp,
    // ... other expression types

    /// Evaluate expression recursively
    /// NOTE: This recursive approach is clear for learning but uses the call
    /// stack. In production, use an explicit work stack to:
    /// 1. Guarantee bounded stack depth (no stack overflow)
    /// 2. Enable O(1) reset between transactions
    /// See ZIGMA_STYLE.md for iterative evaluation patterns.
    pub fn eval(self: *const Expr, env: *const Env, E: *Evaluator) !Value {
        return switch (self.*) {
            .constant => |c| c.eval(env, E),
            .const_placeholder => |cp| cp.eval(env, E),
            .val_use => |vu| vu.eval(env, E),
            .block_value => |bv| bv.eval(env, E),
            .func_value => |fv| fv.eval(env, E),
            .apply => |a| a.eval(env, E),
            .if_op => |i| i.eval(env, E),
            .bin_op => |b| b.eval(env, E),
            // ... dispatch to other eval implementations
        };
    }
};

Constant Evaluation

Constants return their value with fixed cost [7]:

const Constant = struct {
    tpe: SType,
    value: Literal,

    pub const COST = FixedCost{ .value = 5 };

    pub fn eval(self: *const Constant, _: *const Env, E: *Evaluator) !Value {
        try E.addCost(COST, OpCode.Constant);
        return Value.fromLiteral(self.value);
    }
};

const ConstantPlaceholder = struct {
    index: u32,
    tpe: SType,

    pub const COST = FixedCost{ .value = 1 };

    pub fn eval(self: *const ConstantPlaceholder, _: *const Env, E: *Evaluator) !Value {
        try E.addCost(COST, OpCode.ConstantPlaceholder);
        if (self.index >= E.constants.len) {
            return error.IndexOutOfBounds;
        }
        const c = E.constants[self.index];
        return Value.fromLiteral(c.value);
    }
};

Variable Access

ValUse looks up variables in the environment [8]:

const ValUse = struct {
    val_id: u32,
    tpe: SType,

    pub const COST = FixedCost{ .value = 5 };

    pub fn eval(self: *const ValUse, env: *const Env, E: *Evaluator) !Value {
        try E.addCost(COST, OpCode.ValUse);
        return env.get(self.val_id) orelse error.UndefinedVariable;
    }
};

Block Evaluation

Blocks introduce variable bindings [9]:

const BlockValue = struct {
    items: []const ValDef,
    result: *const Expr,

    pub const COST = PerItemCost{
        .base = JitCost{ .value = 1 },
        .per_chunk = JitCost{ .value = 1 },
        .chunk_size = 1,
    };

    pub fn eval(self: *const BlockValue, env: *const Env, E: *Evaluator) !Value {
        try E.addSeqCost(COST, self.items.len, OpCode.BlockValue);

        var cur_env = env.*;
        for (self.items) |item| {
            // Evaluate right-hand side
            const rhs_val = try item.rhs.eval(&cur_env, E);

            // Extend environment with new binding
            try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.FuncValue);
            cur_env = try cur_env.extend(item.id, rhs_val);
        }

        // Evaluate result in extended environment
        return self.result.eval(&cur_env, E);
    }
};

const ValDef = struct {
    id: u32,
    tpe: SType,
    rhs: *const Expr,
};

Lambda Functions

FuncValue creates closures [10]:

const FuncValue = struct {
    args: []const FuncArg,
    body: *const Expr,

    pub const COST = FixedCost{ .value = 10 };
    pub const ADD_TO_ENV_COST = FixedCost{ .value = 5 };

    pub fn eval(self: *const FuncValue, env: *const Env, E: *Evaluator) !Value {
        try E.addCost(COST, OpCode.FuncValue);

        // Create closure capturing current environment
        return Value{
            .closure = .{
                .captured_env = env.*,
                .args = self.args,
                .body = self.body,
            },
        };
    }
};

const FuncArg = struct {
    id: u32,
    tpe: SType,
};

const Apply = struct {
    func: *const Expr,
    /// Argument expression (single-argument form shown for simplicity)
    args: *const Expr,

    pub fn eval(self: *const Apply, env: *const Env, E: *Evaluator) !Value {
        // Evaluate function
        const func_val = try self.func.eval(env, E);
        const closure = func_val.closure;

        // Evaluate argument
        const arg_val = try self.args.eval(env, E);

        // Extend closure's captured env with argument binding
        try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Apply);
        var new_env = try closure.captured_env.extend(closure.args[0].id, arg_val);

        // Evaluate body in new environment
        return closure.body.eval(&new_env, E);
    }
};

Conditional Evaluation

If uses short-circuit semantics [11]:

const If = struct {
    condition: *const Expr,
    true_branch: *const Expr,
    false_branch: *const Expr,

    pub const COST = FixedCost{ .value = 10 };

    pub fn eval(self: *const If, env: *const Env, E: *Evaluator) !Value {
        // Evaluate condition
        const cond = try E.evalTo(bool, env, self.condition);

        try E.addCost(COST, OpCode.If);

        // Only evaluate taken branch (short-circuit)
        if (cond) {
            return self.true_branch.eval(env, E);
        } else {
            return self.false_branch.eval(env, E);
        }
    }
};

Collection Operations

Map, Filter, and Fold evaluate with per-item costs [12]:

const Map = struct {
    input: *const Expr,
    mapper: *const Expr,

    pub const COST = PerItemCost{
        .base = JitCost{ .value = 10 },
        .per_chunk = JitCost{ .value = 5 },
        .chunk_size = 10,
    };

    pub fn eval(self: *const Map, env: *const Env, E: *Evaluator) !Value {
        const input_coll = try E.evalTo(Collection, env, self.input);
        const mapper_fn = try E.evalTo(Closure, env, self.mapper);

        try E.addSeqCost(COST, input_coll.len, OpCode.Map);

        var result = try E.allocator.alloc(Value, input_coll.len);
        for (input_coll.items, 0..) |item, i| {
            // Apply mapper to each element
            try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Map);
            var fn_env = try mapper_fn.captured_env.extend(mapper_fn.args[0].id, item);
            result[i] = try mapper_fn.body.eval(&fn_env, E);
        }

        return Value{ .coll = .{ .items = result } };
    }
};

const Fold = struct {
    input: *const Expr,
    zero: *const Expr,
    folder: *const Expr,

    pub const COST = PerItemCost{
        .base = JitCost{ .value = 10 },
        .per_chunk = JitCost{ .value = 5 },
        .chunk_size = 10,
    };

    pub fn eval(self: *const Fold, env: *const Env, E: *Evaluator) !Value {
        const input_coll = try E.evalTo(Collection, env, self.input);
        const zero_val = try self.zero.eval(env, E);
        const folder_fn = try E.evalTo(Closure, env, self.folder);

        try E.addSeqCost(COST, input_coll.len, OpCode.Fold);

        var accum = zero_val;
        for (input_coll.items) |item| {
            // folder takes (accum, item)
            const tuple = Value{ .tuple = .{ accum, item } };
            try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Fold);
            var fn_env = try folder_fn.captured_env.extend(folder_fn.args[0].id, tuple);
            accum = try folder_fn.body.eval(&fn_env, E);
        }

        return accum;
    }
};

Binary Operations

const BinOp = struct {
    kind: Kind,
    left: *const Expr,
    right: *const Expr,

    const Kind = enum {
        plus, minus, multiply, divide, modulo,
        gt, ge, lt, le, eq, neq,
        bin_and, bin_or, bin_xor,
    };

    pub fn eval(self: *const BinOp, env: *const Env, E: *Evaluator) !Value {
        const left_val = try self.left.eval(env, E);
        const right_val = try self.right.eval(env, E);

        return switch (self.kind) {
            .plus => try evalPlus(left_val, right_val, E),
            .minus => try evalMinus(left_val, right_val, E),
            .gt => try evalGt(left_val, right_val, E),
            // ... other operations
        };
    }

    fn evalPlus(left: Value, right: Value, E: *Evaluator) !Value {
        try E.addCost(ArithOp.PLUS_COST, OpCode.Plus);
        return switch (left) {
            .int => |l| Value{ .int = try std.math.add(i32, l, right.int) },
            .long => |l| Value{ .long = try std.math.add(i64, l, right.long) },
            else => error.TypeMismatch,
        };
    }
};

Top-Level Evaluation

Reduce ErgoTree to SigmaBoolean:

pub fn reduceToSigmaBoolean(
    ergo_tree: *const ErgoTree,
    context: *const Context,
    cost_limit: JitCost,
    allocator: Allocator,
) !struct { prop: SigmaBoolean, cost: JitCost } {
    var evaluator = Evaluator.init(
        context,
        ergo_tree.constants,
        cost_limit,
        allocator,
    );

    const empty_env = Env.init(allocator);
    const result = try evaluator.eval(&empty_env, ergo_tree.root);

    const sigma_prop = result.asSigmaProp() orelse
        return error.NotSigmaProp;

    return .{
        .prop = sigma_prop.sigma_boolean,
        .cost = evaluator.cost_accum.totalCost(),
    };
}

Summary

This chapter covered the evaluation model that transforms ErgoTree expressions into SigmaBoolean propositions:

  • Direct-style big-step interpretation evaluates expressions recursively, with each node immediately returning its result value
  • Env maps variable IDs to values using immutable functional updates—each extend() creates a new environment with additional bindings
  • Each AST node implements an eval() method that returns a Value and accumulates execution cost
  • BlockValue extends the environment with ValDef bindings, enabling local variable definitions
  • FuncValue creates closures that capture the current environment, enabling lexical scoping
  • If implements short-circuit evaluation—only the taken branch is evaluated, reducing unnecessary computation and cost
  • Collection operations (Map, Filter, Fold) have per-item costs reflecting their iteration over elements
  • Top-level reduction produces a SigmaBoolean proposition that the prover/verifier can then handle cryptographically

Next: Chapter 13: Cost Model

[2] Rust: eval.rs:1-100
[3] Scala: ErgoTreeEvaluator.scala (DataEnv)
[4] Rust: env.rs
[5] Scala: values.scala (eval methods)
[7] Scala: values.scala (ConstantNode.eval)
[8] Rust: val_use.rs
[9] Rust: block.rs
[10] Rust: func_value.rs
[11] Rust: if_op.rs

Chapter 13: Cost Model

Prerequisites

  • Chapter 12 for the evaluation architecture and how costs are accumulated during eval
  • Chapter 5 for operation categories and cost descriptor types
  • Basic computational complexity: understanding of constant-time vs linear-time operations

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain JitCost scaling (10x) and conversion to/from block costs
  • Apply the three cost descriptor types: FixedCost, PerItemCost, and TypeBasedCost
  • Implement cost accumulation with limit enforcement to prevent denial-of-service attacks
  • Use cost tracing to analyze script execution costs

Cost Model Purpose

Unlike Turing-complete smart contract platforms that can enter infinite loops, ErgoTree scripts must terminate within bounded resources. The cost model assigns a computational cost to every operation, accumulating these costs during evaluation. If the accumulated cost exceeds the block limit, execution fails—this guarantees that all scripts terminate and prevents attackers from crafting expensive scripts that slow down block validation.

ErgoTree scripts execute in a resource-constrained environment [1][2]:

Cost Model Guarantees
─────────────────────────────────────────────────────
1. DoS Protection     Expensive scripts blocked
2. Predictable Time   Miners estimate validation
3. Fair Pricing       Users pay for resources
4. Bounded Verify     All scripts terminate

JitCost: The Cost Unit

JitCost provides 10x finer granularity than block costs [3]:

const JitCost = struct {
    value: i32,

    pub const SCALE_FACTOR: i32 = 10;

    /// Add with overflow protection
    pub fn add(self: JitCost, other: JitCost) !JitCost {
        const result = @addWithOverflow(self.value, other.value);
        if (result[1] != 0) return error.CostOverflow;
        return .{ .value = result[0] };
    }

    /// Multiply with overflow protection
    pub fn mul(self: JitCost, n: i32) !JitCost {
        const result = @mulWithOverflow(self.value, n);
        if (result[1] != 0) return error.CostOverflow;
        return .{ .value = result[0] };
    }

    /// Divide by integer
    pub fn div(self: JitCost, n: i32) JitCost {
        return .{ .value = @divTrunc(self.value, n) };
    }

    /// Convert to block cost (inverse of fromBlockCost)
    pub fn toBlockCost(self: JitCost) i32 {
        return @divTrunc(self.value, SCALE_FACTOR);
    }

    /// Create from block cost
    pub fn fromBlockCost(block_cost: i32) !JitCost {
        const result = @mulWithOverflow(block_cost, SCALE_FACTOR);
        if (result[1] != 0) return error.CostOverflow;
        return .{ .value = result[0] };
    }

    /// Comparison
    pub fn gt(self: JitCost, other: JitCost) bool {
        return self.value > other.value;
    }
};

Cost Scaling

Cost Scales
─────────────────────────────────────────────────────

JitCost (internal)    ─────────────────>    Block Cost
                          ÷ 10

Example:
  JitCost(50)   ──────────────────────>   5 block units
  JitCost(123)  ──────────────────────>   12 block units

Block Cost (external) <─────────────────    JitCost
                          × 10

The 10x scaling provides:

  • Finer granularity for internal calculations
  • Integer arithmetic (no floating point)
  • Overflow protection via checked operations

Cost Kind Descriptors

Cost descriptors define how operations are costed [4][5]:

const CostKind = union(enum) {
    fixed: FixedCost,
    per_item: PerItemCost,
    type_based: TypeBasedCost,
    dynamic: void,
};

/// Constant time operations
const FixedCost = struct {
    cost: JitCost,
};

/// Linear operations with chunking
const PerItemCost = struct {
    base_cost: JitCost,
    per_chunk_cost: JitCost,
    chunk_size: u32,

    /// Compute number of chunks for n items
    pub fn chunks(self: PerItemCost, n_items: usize) usize {
        if (n_items == 0) return 1;
        return (n_items - 1) / self.chunk_size + 1;
    }

    /// Compute total cost for n items
    pub fn cost(self: PerItemCost, n_items: usize) !JitCost {
        const n_chunks = self.chunks(n_items);
        const chunk_cost = try self.per_chunk_cost.mul(@intCast(n_chunks));
        return self.base_cost.add(chunk_cost);
    }
};

/// Type-dependent operations
const TypeBasedCost = struct {
    cost_fn: *const fn (SType) JitCost,
};

FixedCost Operations

Operation            Cost  Description
─────────────────────────────────────────────────────
Constant                5  Return constant value
ConstantPlaceholder     1  Lookup segregated constant
ValUse                  5  Variable lookup
If                     10  Conditional branch
SelectField            10  Tuple field access
SizeOf                 14  Get collection size

PerItemCost Operations

Operation    Base  Per Chunk  Chunk Size
─────────────────────────────────────────────────────
blake2b256     20          7  128 bytes
sha256         80          8  64 bytes
Append         20          2  10 items
Filter         20          2  10 items
Map            20          2  10 items
Fold           20          2  10 items

Cost Formula

PerItemCost Formula
─────────────────────────────────────────────────────

total = baseCost + ceil(nItems / chunkSize) × perChunkCost

Example: Map over 50 elements
  chunks = ceil(50 / 10) = 5
  cost = 20 + 5 × 2 = 30 JitCost units
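The same formula can be checked mechanically (a Python sketch of the arithmetic, using constants from the tables above):

```python
import math

# PerItemCost formula: total = base + ceil(n_items / chunk_size) * per_chunk,
# with a minimum of one chunk even for empty inputs.
def per_item_cost(base, per_chunk, chunk_size, n_items):
    chunks = max(1, math.ceil(n_items / chunk_size))
    return base + chunks * per_chunk

assert per_item_cost(20, 2, 10, 50) == 30     # Map over 50 elements: 5 chunks
assert per_item_cost(20, 2, 10, 10) == 22     # Map over 10 elements: 1 chunk
assert per_item_cost(20, 7, 128, 256) == 34   # blake2b256 over 256 bytes: 2 chunks
```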

Type-Based Costs

Operations with type-dependent complexity [6]:

/// Numeric cast cost depends on target type
const NumericCastCost = struct {
    pub fn costFunc(target_type: SType) JitCost {
        return switch (target_type) {
            .s_big_int, .s_unsigned_big_int => .{ .value = 30 },
            else => .{ .value = 10 }, // Byte, Short, Int, Long
        };
    }
};

/// Equality cost depends on operand types
const EqualityCost = struct {
    pub fn costFunc(tpe: SType) JitCost {
        return switch (tpe) {
            .s_byte, .s_short, .s_int, .s_long => .{ .value = 3 },
            .s_big_int => .{ .value = 6 },
            .s_group_element => .{ .value = 172 },
            .s_coll => |elem| blk: {
                // Recursive: base + per-element
                const elem_cost = costFunc(elem.*);
                break :blk .{ .value = 10 + elem_cost.value };
            },
            else => .{ .value = 10 },
        };
    }
};
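The recursion bottoms out at primitive types. A quick check of the same cost function (a Python sketch; the type tags are illustrative stand-ins for SType):

```python
# Type-based equality cost sketch, mirroring the recursion above.
# Numeric constants follow the EqualityCost function in this section.
def equality_cost(tpe):
    if tpe in ("Byte", "Short", "Int", "Long"):
        return 3
    if tpe == "BigInt":
        return 6
    if tpe == "GroupElement":
        return 172
    if isinstance(tpe, tuple) and tpe[0] == "Coll":   # ("Coll", elem_type)
        return 10 + equality_cost(tpe[1])             # base + element type cost
    return 10

assert equality_cost("Int") == 3
assert equality_cost(("Coll", "Int")) == 13                # 10 + 3
assert equality_cost(("Coll", ("Coll", "BigInt"))) == 26   # 10 + (10 + 6)
```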

Cost Items: Tracing

Cost items record individual contributions for debugging [7][8]:

const CostItem = union(enum) {
    fixed: FixedCostItem,
    seq: SeqCostItem,
    type_based: TypeBasedCostItem,

    pub fn opName(self: CostItem) []const u8 {
        return switch (self) {
            .fixed => |f| f.op_desc.name,
            .seq => |s| s.op_desc.name,
            .type_based => |t| t.op_desc.name,
        };
    }

    pub fn cost(self: CostItem) JitCost {
        return switch (self) {
            .fixed => |f| f.cost_kind.cost,
            .seq => |s| s.cost_kind.cost(s.n_items) catch .{ .value = 0 },
            .type_based => |t| t.cost_kind.cost_fn(t.tpe),
        };
    }
};

const FixedCostItem = struct {
    op_desc: OperationDesc,
    cost_kind: FixedCost,
};

const SeqCostItem = struct {
    op_desc: OperationDesc,
    cost_kind: PerItemCost,
    n_items: usize,

    pub fn chunks(self: SeqCostItem) usize {
        return self.cost_kind.chunks(self.n_items);
    }
};

const TypeBasedCostItem = struct {
    op_desc: OperationDesc,
    cost_kind: TypeBasedCost,
    tpe: SType,
};

Cost Accumulator

Tracks costs during evaluation with limit enforcement [9][10]:

const CostCounter = struct {
    initial_cost: JitCost,
    current_cost: JitCost,

    pub fn init(initial: JitCost) CostCounter {
        return .{
            .initial_cost = initial,
            .current_cost = initial,
        };
    }

    pub fn add(self: *CostCounter, cost: JitCost) !void {
        self.current_cost = try self.current_cost.add(cost);
    }

    pub fn reset(self: *CostCounter) void {
        self.current_cost = self.initial_cost;
    }
};

const CostAccumulator = struct {
    scope_stack: std.ArrayList(Scope),
    cost_limit: ?JitCost,
    allocator: Allocator,

    const Scope = struct {
        counter: CostCounter,
        child_result: i32 = 0,

        pub fn add(self: *Scope, cost: JitCost) !void {
            try self.counter.add(cost);
        }

        pub fn currentCost(self: *const Scope) JitCost {
            return self.counter.current_cost;
        }
    };

    pub fn init(
        allocator: Allocator,
        initial_cost: JitCost,
        cost_limit: ?JitCost,
    ) CostAccumulator {
        var stack = std.ArrayList(Scope).init(allocator);
        stack.append(.{ .counter = CostCounter.init(initial_cost) }) catch unreachable;
        return .{
            .scope_stack = stack,
            .cost_limit = cost_limit,
            .allocator = allocator,
        };
    }

    pub fn currentScope(self: *CostAccumulator) *Scope {
        return &self.scope_stack.items[self.scope_stack.items.len - 1];
    }

    /// Add cost, checking limit
    pub fn add(self: *CostAccumulator, cost: JitCost) !void {
        try self.currentScope().add(cost);

        if (self.cost_limit) |limit| {
            const accumulated = self.currentScope().currentCost();
            if (accumulated.gt(limit)) {
                return error.CostLimitExceeded;
            }
        }
    }

    /// Total accumulated cost
    pub fn totalCost(self: *const CostAccumulator) JitCost {
        return self.scope_stack.items[self.scope_stack.items.len - 1].counter.current_cost;
    }

    pub fn reset(self: *CostAccumulator) void {
        self.scope_stack.clearRetainingCapacity();
        self.scope_stack.append(.{
            .counter = CostCounter.init(.{ .value = 0 }),
        }) catch unreachable;
    }
};

Cost Limit Enforcement

Cost Accumulation Flow
─────────────────────────────────────────────────────

Each operation:
  1. Compute operation cost
  2. Call accumulator.add(opCost)
  3. Check: accumulatedCost > limit?
     Yes → return CostLimitExceeded
     No  → continue execution

At the end:
  totalCost = accumulator.totalCost()
  blockCost = totalCost.toBlockCost()

Evaluator Cost Methods

The evaluator provides methods to add costs [11][12]:

const Evaluator = struct {
    cost_accum: CostAccumulator,
    cost_trace: ?std.ArrayList(CostItem),
    profiler: ?*Profiler,

    // ... other fields

    /// Add fixed cost
    pub fn addCost(self: *Evaluator, cost_kind: FixedCost, op_desc: OperationDesc) !void {
        try self.cost_accum.add(cost_kind.cost);

        if (self.cost_trace) |*trace| {
            try trace.append(.{
                .fixed = .{ .op_desc = op_desc, .cost_kind = cost_kind },
            });
        }
    }

    /// Add fixed cost and execute a block, timing it when a profiler is attached
    pub fn addFixedCost(
        self: *Evaluator,
        cost_kind: FixedCost,
        op_desc: OperationDesc,
        comptime T: type,
        comptime block: fn (*Evaluator) anyerror!T,
    ) !T {
        if (self.profiler) |prof| {
            const start = std.time.nanoTimestamp();
            try self.cost_accum.add(cost_kind.cost);
            const result = try block(self);
            const end = std.time.nanoTimestamp();
            prof.addTiming(op_desc, end - start);
            return result;
        } else {
            try self.cost_accum.add(cost_kind.cost);
            return block(self);
        }
    }

    /// Add per-item cost for known count
    pub fn addSeqCost(
        self: *Evaluator,
        cost_kind: PerItemCost,
        n_items: usize,
        op_desc: OperationDesc,
    ) !void {
        const cost = try cost_kind.cost(n_items);
        try self.cost_accum.add(cost);

        if (self.cost_trace) |*trace| {
            try trace.append(.{
                .seq = .{
                    .op_desc = op_desc,
                    .cost_kind = cost_kind,
                    .n_items = n_items,
                },
            });
        }
    }

    /// Add type-based cost
    pub fn addTypeBasedCost(
        self: *Evaluator,
        cost_kind: TypeBasedCost,
        tpe: SType,
        op_desc: OperationDesc,
    ) !void {
        const cost = cost_kind.cost_fn(tpe);
        try self.cost_accum.add(cost);

        if (self.cost_trace) |*trace| {
            try trace.append(.{
                .type_based = .{
                    .op_desc = op_desc,
                    .cost_kind = cost_kind,
                    .tpe = tpe,
                },
            });
        }
    }
};

PowHit (Autolykos2) Cost

Special cost computation for Autolykos2 mining [13]:

const PowHitCost = struct {
    /// Cost of custom Autolykos2 hash function
    pub fn cost(
        k: u32,               // k-sum problem inputs
        msg: []const u8,      // message to hash
        nonce: []const u8,    // padding for PoW output
        h: []const u8,        // block height padding
    ) JitCost {
        const chunk_size = CalcBlake2b256.COST.chunk_size;
        const per_chunk = CalcBlake2b256.COST.per_chunk_cost.value;
        const base_cost: i32 = 500;

        // The heaviest part: k + 1 Blake2b256 invocations
        const input_len = msg.len + nonce.len + h.len;
        const chunks_per_hash = input_len / chunk_size + 1;
        const total_cost = base_cost + @as(i32, @intCast(k + 1)) *
            @as(i32, @intCast(chunks_per_hash)) * per_chunk;

        return .{ .value = total_cost };
    }
};
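A worked example of this formula (a Python sketch of the arithmetic; the input sizes chosen here are illustrative, not normative):

```python
# PowHit cost: base + (k + 1) * chunks_per_hash * per_chunk, where the
# per-chunk parameters come from the blake2b256 descriptor (7 per 128 bytes).
def pow_hit_cost(k, msg_len, nonce_len, h_len,
                 base_cost=500, per_chunk=7, chunk_size=128):
    input_len = msg_len + nonce_len + h_len
    chunks_per_hash = input_len // chunk_size + 1
    return base_cost + (k + 1) * chunks_per_hash * per_chunk

# k = 32 with a 44-byte input fits in one chunk per hash invocation:
assert pow_hit_cost(32, msg_len=32, nonce_len=8, h_len=4) == 500 + 33 * 1 * 7
```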

Operation Cost Constants

Defined in operation companion structs:

const Constant = struct {
    pub const COST = FixedCost{ .cost = .{ .value = 5 } };
    // ...
};

const ValUse = struct {
    pub const COST = FixedCost{ .cost = .{ .value = 5 } };
    // ...
};

const If = struct {
    pub const COST = FixedCost{ .cost = .{ .value = 10 } };
    // ...
};

const MapCollection = struct {
    pub const COST = PerItemCost{
        .base_cost = .{ .value = 20 },
        .per_chunk_cost = .{ .value = 2 },
        .chunk_size = 10,
    };
    // ...
};

const CalcBlake2b256 = struct {
    pub const COST = PerItemCost{
        .base_cost = .{ .value = 20 },
        .per_chunk_cost = .{ .value = 7 },
        .chunk_size = 128,
    };
    // ...
};

const CalcSha256 = struct {
    pub const COST = PerItemCost{
        .base_cost = .{ .value = 80 },
        .per_chunk_cost = .{ .value = 8 },
        .chunk_size = 64,
    };
    // ...
};

Cost Tracing Output

Example trace from evaluating a script:

Cost Trace
─────────────────────────────────────────────────────
Constant           :    5
ValUse             :    5
ByIndex            :   30
Constant           :    5
MapCollection[10]  :   22  (base=20, chunks=1)
Filter[5]          :   22  (base=20, chunks=1)
blake2b256[256]    :   34  (base=20, chunks=2)
─────────────────────────────────────────────────────
Total JitCost      :  123
Block Cost         :   12

Complete Evaluation with Costing

pub fn evaluateWithCost(
    ergo_tree: *const ErgoTree,
    context: *const Context,
    cost_limit: JitCost,
    allocator: Allocator,
) !struct { result: SigmaBoolean, cost: JitCost } {
    var cost_accum = CostAccumulator.init(
        allocator,
        .{ .value = 0 },
        cost_limit,
    );

    var evaluator = Evaluator{
        .context = context,
        .constants = ergo_tree.constants,
        .cost_accum = cost_accum,
        .cost_trace = null,
        .profiler = null,
        .allocator = allocator,
    };

    const empty_env = Env.init(allocator);
    const result = try evaluator.eval(&empty_env, ergo_tree.root);

    const sigma_prop = result.asSigmaProp() orelse
        return error.NotSigmaProp;

    return .{
        .result = sigma_prop.sigma_boolean,
        .cost = evaluator.cost_accum.totalCost(),
    };
}

Summary

This chapter covered the cost model that ensures all ErgoTree scripts terminate within bounded resources:

  • JitCost uses 10x scaling from block costs, providing finer granularity for internal calculations while maintaining integer arithmetic without floating point
  • FixedCost applies to constant-time operations like variable access (cost = 5) and conditionals (cost = 10)
  • PerItemCost models operations that scale with input size using the formula: baseCost + ceil(n/chunkSize) × perChunkCost—this applies to collection operations and hash functions
  • TypeBasedCost handles operations whose cost depends on operand type—BigInt operations are more expensive than primitive integer operations
  • CostAccumulator tracks accumulated costs during evaluation and checks against the limit after each operation; exceeding the limit immediately fails evaluation
  • CostItem types (FixedCostItem, SeqCostItem, TypeBasedCostItem) enable detailed cost tracing for debugging and optimization
  • The PowHit cost function handles the special case of Autolykos2 mining operations

Next: Chapter 14: Verifier Implementation

[5] Rust: costs.rs:1-24
[12] Rust: eval.rs:130-160

Chapter 14: Verifier Implementation

Prerequisites

  • Chapter 11 for Sigma protocol verification and Fiat-Shamir transformation
  • Chapter 12 for ErgoTree reduction to SigmaBoolean
  • Chapter 13 for cost accumulation during verification

Learning Objectives

By the end of this chapter, you will be able to:

  • Trace the complete verification flow from ErgoTree to boolean result
  • Implement verify() and fullReduction() methods
  • Handle soft-fork conditions gracefully to maintain network compatibility
  • Verify cryptographic signatures using Fiat-Shamir commitment reconstruction
  • Estimate verification cost before performing expensive cryptographic operations

Verification Overview

Verification is the counterpart to proving: given an ErgoTree, a transaction context, and a cryptographic proof, the verifier determines whether the proof is valid. This process happens for every input box in every transaction—efficient verification is critical for blockchain throughput.

The verification proceeds in two phases: first reduce the ErgoTree to a SigmaBoolean proposition (using the evaluator from Chapter 12), then verify that the cryptographic proof satisfies that proposition [1][2].

Verification Pipeline
─────────────────────────────────────────────────────

Input: ErgoTree + Context + Proof + Message

┌──────────────────────────────────────────────────┐
│              1. REDUCTION PHASE                  │
│                                                  │
│  ErgoTree ────> propositionFromErgoTree()        │
│                        │                         │
│                        ▼                         │
│              SigmaPropValue                      │
│                        │                         │
│                        ▼                         │
│              fullReduction()                     │
│                        │                         │
│                        ▼                         │
│              SigmaBoolean + Cost                 │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│            2. VERIFICATION PHASE                 │
│                                                  │
│  TrueProp  ────> return (true, cost)             │
│  FalseProp ────> return (false, cost)            │
│                                                  │
│  Otherwise:                                      │
│    estimateCryptoVerifyCost()                    │
│              │                                   │
│              ▼                                   │
│    verifySignature() ────> boolean result        │
└──────────────────────────────────────────────────┘

Output: (verified: bool, total_cost: u64)

Verification Result

const VerificationResult = struct {
    /// Result of SigmaProp verification
    result: bool,
    /// Estimated cost of contract execution
    cost: u64,
    /// Diagnostic information
    diag: ReductionDiagnosticInfo,
};

const ReductionResult = struct {
    /// SigmaBoolean proposition
    sigma_prop: SigmaBoolean,
    /// Accumulated cost (block scale)
    cost: u64,
    /// Diagnostic info
    diag: ReductionDiagnosticInfo,
};

const ReductionDiagnosticInfo = struct {
    /// Environment after evaluation
    env: Env,
    /// Pretty-printed expression
    pretty_printed_expr: ?[]const u8,
};

Verifier Trait

The base verifier interface:

const Verifier = struct {
    const Self = @This();

    /// Cost per byte for deserialization
    pub const COST_PER_BYTE_DESERIALIZED: i32 = 2;
    /// Cost per tree byte for substitution
    pub const COST_PER_TREE_BYTE: i32 = 2;

    /// Verify an ErgoTree in context with proof
    pub fn verify(
        self: *const Self,
        ergo_tree: *const ErgoTree,
        context: *const Context,
        proof: ProofBytes,
        message: []const u8,
    ) VerifierError!VerificationResult {
        // Reduce to SigmaBoolean
        const reduction = try reduceToCrypto(ergo_tree, context);

        const result: bool = switch (reduction.sigma_prop) {
            .trivial_prop => |b| b,
            else => |sb| blk: {
                if (proof.isEmpty()) {
                    break :blk false;
                }
                // Verifier Steps 1-3: Parse proof
                const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
                // Verifier Steps 4-6: Check commitments
                break :blk try checkCommitments(unchecked, message);
            },
        };

        return .{
            .result = result,
            .cost = reduction.cost,
            .diag = reduction.diag,
        };
    }
};
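The dispatch at the top of verify() follows a fixed precedence: a trivial proposition decides the result immediately, and an empty proof for a non-trivial proposition fails before any cryptography runs. A hedged, language-agnostic Python sketch of that precedence (the string-encoded propositions are purely illustrative):

```python
def fast_verify_outcome(sigma_prop, proof):
    """Fast paths checked before any cryptography.

    Returns True/False when the outcome is decided without crypto,
    or None when full signature verification is still required.
    """
    if sigma_prop == "TrueProp":
        return True      # trivially satisfied
    if sigma_prop == "FalseProp":
        return False     # unsatisfiable
    if len(proof) == 0:
        return False     # non-trivial proposition requires a proof
    return None          # proceed to parse proof and check commitments

assert fast_verify_outcome("TrueProp", b"") is True
assert fast_verify_outcome("proveDlog(pk)", b"") is False
assert fast_verify_outcome("proveDlog(pk)", b"\x01\x02") is None
```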

The verify() Method

Complete verification entry point:

pub fn verify(
    env: ScriptEnv,
    ergo_tree: *const ErgoTree,
    context: *const Context,
    proof: []const u8,
    message: []const u8,
) VerifierError!VerificationResult {
    // Check soft-fork condition first
    if (try checkSoftForkCondition(ergo_tree, context)) |soft_fork_result| {
        return soft_fork_result;
    }

    // REDUCTION PHASE
    const reduced = try fullReduction(ergo_tree, context, env);

    // VERIFICATION PHASE
    return switch (reduced.sigma_prop) {
        .true_prop => .{ .result = true, .cost = reduced.cost, .diag = reduced.diag },
        .false_prop => .{ .result = false, .cost = reduced.cost, .diag = reduced.diag },
        else => |sb| blk: {
            // Non-trivial proposition: verify cryptographic proof
            const full_cost = try addCryptoCost(sb, reduced.cost, context.cost_limit);

            const ok = verifySignature(sb, message, proof) catch false;
            break :blk .{
                .result = ok,
                .cost = full_cost,
                .diag = reduced.diag,
            };
        },
    };
}

Full Reduction

Reduces the ErgoTree to a SigmaBoolean while tracking cost:

pub fn fullReduction(
    ergo_tree: *const ErgoTree,
    context: *const Context,
    env: ScriptEnv,
) ReducerError!ReductionResult {
    // Extract proposition from ErgoTree
    const prop = try propositionFromErgoTree(ergo_tree, context);

    // Fast path: SigmaProp constant
    if (prop == .sigma_prop_constant) {
        const sb = prop.sigma_prop_constant.toSigmaBoolean();
        const eval_cost = SigmaPropConstant.COST.cost.toBlockCost();
        const res_cost = try addCostChecked(context.init_cost, eval_cost, context.cost_limit);
        return .{
            .sigma_prop = sb,
            .cost = res_cost,
            .diag = .{ .env = context.env, .pretty_printed_expr = null },
        };
    }

    // No DeserializeContext: direct evaluation
    if (!ergo_tree.hasDeserialize()) {
        return evalToCrypto(context, ergo_tree);
    }

    // Has DeserializeContext: special handling
    return reductionWithDeserialize(ergo_tree, prop, context, env);
}

fn propositionFromErgoTree(
    ergo_tree: *const ErgoTree,
    context: *const Context,
) PropositionError!SigmaPropValue {
    return switch (ergo_tree.root) {
        .parsed => |tree| ergo_tree.toProposition(ergo_tree.header.constant_segregation),
        .unparsed => |u| blk: {
            if (context.validation_settings.isSoftFork(u.err)) {
                // Soft-fork: return true (accept)
                break :blk SigmaPropValue.true_sigma_prop;
            }
            // Hard error
            return error.UnparsedErgoTree;
        },
    };
}
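fullReduction() above relies on an addCostChecked helper that is not shown in this book's Zig listings. A minimal Python sketch of its assumed semantics, accumulating a cost delta and failing once the running total would exceed the limit:

```python
class CostLimitExceeded(Exception):
    pass

def add_cost_checked(base, delta, limit):
    # Accumulate, then compare against the limit; a real u64 implementation
    # must also use overflow-checked addition before the comparison.
    total = base + delta
    if total > limit:
        raise CostLimitExceeded(f"cost {total} exceeds limit {limit}")
    return total

assert add_cost_checked(1000, 250, 2000) == 1250
```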

Signature Verification

Implements Verifier Steps 4-6 of the Sigma protocol:

/// Verify a signature on message for given proposition
pub fn verifySignature(
    sigma_tree: SigmaBoolean,
    message: []const u8,
    signature: []const u8,
) VerifierError!bool {
    return switch (sigma_tree) {
        .trivial_prop => |b| b,
        else => |sb| blk: {
            if (signature.len == 0) {
                break :blk false;
            }
            // Verifier Steps 1-3: Parse proof
            const unchecked = try parseAndComputeChallenges(&sb, signature);
            // Verifier Steps 4-6: Check commitments
            break :blk try checkCommitments(unchecked, message);
        },
    };
}

/// Verifier Steps 4-6: Check commitments match Fiat-Shamir challenge
fn checkCommitments(
    sp: UncheckedTree,
    message: []const u8,
) VerifierError!bool {
    // Verifier Step 4: Compute commitments from challenges and responses
    const new_root = computeCommitments(sp);

    // Steps 5-6: Serialize tree for Fiat-Shamir
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    try fiatShamirTreeToBytes(&new_root, buf.writer());
    try buf.appendSlice(message);

    // Compute expected challenge
    const expected_challenge = fiatShamirHashFn(buf.items);

    // Compare with actual challenge
    // NOTE: In production, use constant-time comparison for challenge bytes
    // to prevent timing side-channels: std.crypto.utils.timingSafeEql
    return std.mem.eql(u8, &new_root.challenge(), &expected_challenge);
}
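For a single Schnorr leaf, Steps 4-6 can be shown end-to-end: reconstruct the commitment a = g^z / h^e from the proof's challenge and response, then recompute the Fiat-Shamir challenge over the commitment and message and compare. This is a hedged Python sketch over a toy multiplicative group; the real protocol uses secp256k1 points and Blake2b-based hashing, and the constants here are purely illustrative:

```python
import hashlib

P = 2**61 - 1    # toy prime modulus (illustrative; real code uses secp256k1)
G = 3            # toy group element
ORDER = P - 1

def challenge(commitment, message):
    # Fiat-Shamir: hash the serialized commitment together with the message
    data = commitment.to_bytes(16, "big") + message
    return int.from_bytes(hashlib.blake2b(data).digest(), "big") % ORDER

def prove(x, message, r):
    a = pow(G, r, P)             # first message (commitment)
    e = challenge(a, message)    # Fiat-Shamir challenge
    z = (r + e * x) % ORDER      # second message (response)
    return (e, z)                # the proof stores only (e, z), not a

def verify(h, message, proof):
    e, z = proof
    # Verifier Step 4: reconstruct the commitment a = g^z / h^e
    h_e_inv = pow(pow(h, e, P), P - 2, P)   # modular inverse via Fermat
    a = (pow(G, z, P) * h_e_inv) % P
    # Steps 5-6: recompute the challenge and compare with the one in the proof
    return challenge(a, message) == e

x = 123456789            # secret key
h = pow(G, x, P)         # public key h = g^x
proof = prove(x, b"spend box 42", r=987654321)
assert verify(h, b"spend box 42", proof)
assert not verify(h, b"tampered", proof)
```

Note how verification never needs the prover's randomness r: the reconstructed commitment equals g^r exactly when the response was computed honestly.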

Computing Commitments

Verifier Step 4: Reconstruct commitments from challenges and responses:

/// For every leaf, compute commitment from challenge and response
pub fn computeCommitments(sp: UncheckedTree) UncheckedTree {
    return switch (sp) {
        .unchecked_leaf => |leaf| switch (leaf) {
            .unchecked_schnorr => |sn| blk: {
                // Reconstruct: a = g^z / h^e
                const a = DlogProver.computeCommitment(
                    &sn.proposition,
                    &sn.challenge,
                    &sn.second_message,
                );
                break :blk UncheckedTree{
                    .unchecked_leaf = .{
                        .unchecked_schnorr = .{
                            .proposition = sn.proposition,
                            .challenge = sn.challenge,
                            .second_message = sn.second_message,
                            .commitment_opt = FirstDlogProverMessage{ .a = a },
                        },
                    },
                };
            },
            .unchecked_dh_tuple => |dh| blk: {
                // Reconstruct both commitments
                const commitment = DhTupleProver.computeCommitment(
                    &dh.proposition,
                    &dh.challenge,
                    &dh.second_message,
                );
                break :blk UncheckedTree{
                    .unchecked_leaf = .{
                        .unchecked_dh_tuple = .{
                            .proposition = dh.proposition,
                            .challenge = dh.challenge,
                            .second_message = dh.second_message,
                            .commitment_opt = commitment,
                        },
                    },
                };
            },
        },
        .unchecked_conjecture => |conj| blk: {
            // Recursively process children
            const new_children = allocator.alloc(UncheckedTree, conj.children.len) catch unreachable;
            for (conj.children, 0..) |child, i| {
                new_children[i] = computeCommitments(child);
            }
            break :blk conj.withChildren(new_children);
        },
    };
}

Crypto Verification Cost

Estimate the verification cost before performing expensive elliptic-curve operations:

const VerificationCosts = struct {
    /// Cost for Schnorr commitment computation
    pub const COMPUTE_COMMITMENTS_SCHNORR = FixedCost{ .cost = .{ .value = 3400 } };
    /// Cost for DHT commitment computation
    pub const COMPUTE_COMMITMENTS_DHT = FixedCost{ .cost = .{ .value = 6450 } };

    /// Total Schnorr verification cost
    pub const PROVE_DLOG_VERIFICATION: JitCost = blk: {
        const parse = ParseChallenge_ProveDlog.COST.cost;
        const compute = COMPUTE_COMMITMENTS_SCHNORR.cost;
        const serialize = ToBytes_Schnorr.COST.cost;
        break :blk parse.add(compute).add(serialize);
    };

    /// Total DHT verification cost
    pub const PROVE_DHT_VERIFICATION: JitCost = blk: {
        const parse = ParseChallenge_ProveDHT.COST.cost;
        const compute = COMPUTE_COMMITMENTS_DHT.cost;
        const serialize = ToBytes_DHT.COST.cost;
        break :blk parse.add(compute).add(serialize);
    };
};

/// Estimate verification cost without performing crypto
pub fn estimateCryptoVerifyCost(sb: SigmaBoolean) JitCost {
    return switch (sb) {
        .prove_dlog => VerificationCosts.PROVE_DLOG_VERIFICATION,
        .prove_dh_tuple => VerificationCosts.PROVE_DHT_VERIFICATION,
        .c_and => |and_node| blk: {
            const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
            var children_cost = JitCost{ .value = 0 };
            for (and_node.children) |child| {
                children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
            }
            break :blk node_cost.add(children_cost) catch unreachable;
        },
        .c_or => |or_node| blk: {
            const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
            var children_cost = JitCost{ .value = 0 };
            for (or_node.children) |child| {
                children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
            }
            break :blk node_cost.add(children_cost) catch unreachable;
        },
        .c_threshold => |th| blk: {
            const n_children = th.children.len;
            const n_coefs = n_children - th.k;
            const parse_cost = ParsePolynomial.COST.cost(@intCast(n_coefs));
            const eval_cost = EvaluatePolynomial.COST.cost(@intCast(n_coefs)).mul(@intCast(n_children)) catch unreachable;
            const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
            var children_cost = JitCost{ .value = 0 };
            for (th.children) |child| {
                children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
            }
            break :blk parse_cost.add(eval_cost).add(node_cost).add(children_cost) catch unreachable;
        },
        else => JitCost{ .value = 0 }, // Trivial proposition
    };
}

/// Add crypto cost to accumulated cost
fn addCryptoCost(
    sigma_prop: SigmaBoolean,
    base_cost: u64,
    cost_limit: u64,
) CostError!u64 {
    const crypto_cost = estimateCryptoVerifyCost(sigma_prop).toBlockCost();
    return addCostChecked(base_cost, crypto_cost, cost_limit);
}
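The recursion above composes per-node costs over the proposition tree. A hedged Python sketch using illustrative constants (not the real cost table; the threshold case additionally adds polynomial parse/eval costs, omitted here):

```python
# Hypothetical JIT-scale cost constants, for illustration only
COST = {"dlog": 3430, "dht": 6480, "conj_node": 20}

def estimate_crypto_verify_cost(sb):
    kind = sb["kind"]
    if kind == "dlog":
        return COST["dlog"]
    if kind == "dht":
        return COST["dht"]
    if kind in ("and", "or", "threshold"):
        # every conjecture node pays a serialization cost plus its children
        children = sum(estimate_crypto_verify_cost(c) for c in sb["children"])
        return COST["conj_node"] + children
    return 0    # trivial proposition

prop = {"kind": "and", "children": [
    {"kind": "dlog"},
    {"kind": "or", "children": [{"kind": "dlog"}, {"kind": "dht"}]},
]}
assert estimate_crypto_verify_cost(prop) == 20 + 3430 + (20 + 3430 + 6480)
```

Because the estimate depends only on the shape of the SigmaBoolean, it can be charged against the cost limit before any elliptic-curve work is done.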

Soft-Fork Handling

Handle unrecognized script versions gracefully:

/// Check for soft-fork condition. Returns a result to short-circuit
/// verification, null to proceed normally, or an error on version mismatch.
fn checkSoftForkCondition(
    ergo_tree: *const ErgoTree,
    context: *const Context,
) VerifierError!?VerificationResult {
    if (context.activated_script_version > MAX_SUPPORTED_SCRIPT_VERSION) {
        // Protocol version exceeds interpreter capabilities
        if (ergo_tree.header.version > MAX_SUPPORTED_SCRIPT_VERSION) {
            // Cannot verify: accept and rely on 90% upgraded nodes
            return .{
                .result = true,
                .cost = context.init_cost,
                .diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
            };
        }
        // Can verify despite protocol upgrade
    } else {
        // Activated version within supported range
        if (ergo_tree.header.version > context.activated_script_version) {
            // ErgoTree version too high
            return error.ErgoTreeVersionTooHigh;
        }
    }
    return null; // Proceed normally
}

/// Soft-fork reduction result: accept as true
fn whenSoftForkReductionResult(cost: u64) ReductionResult {
    return .{
        .sigma_prop = .{ .trivial_prop = true },
        .cost = cost,
        .diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
    };
}
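The soft-fork logic reduces to a small decision table over the activated protocol version, the tree's version, and the interpreter's maximum supported version. A hedged Python sketch (MAX_SUPPORTED is a hypothetical constant):

```python
MAX_SUPPORTED = 3   # hypothetical max script version this interpreter knows

def soft_fork_decision(activated, tree_version):
    """Returns 'accept' (soft-fork), 'reject', or None (verify normally)."""
    if activated > MAX_SUPPORTED:
        if tree_version > MAX_SUPPORTED:
            return "accept"   # cannot verify; trust the upgraded majority
        return None           # older tree under a newer protocol: verifiable
    if tree_version > activated:
        return "reject"       # tree claims a version not yet activated
    return None

assert soft_fork_decision(activated=4, tree_version=4) == "accept"
assert soft_fork_decision(activated=2, tree_version=3) == "reject"
assert soft_fork_decision(activated=2, tree_version=1) is None
```

The asymmetry is deliberate: an un-upgraded node accepts what it cannot parse, while a tree that merely claims a not-yet-activated version is rejected outright.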

DeserializeContext Handling

Scripts may contain DeserializeContext operations, which must be substituted with their deserialized scripts before reduction:

fn reductionWithDeserialize(
    ergo_tree: *const ErgoTree,
    prop: SigmaPropValue,
    context: *const Context,
    env: ScriptEnv,
) ReducerError!ReductionResult {
    // Add cost for deserialization substitution
    const tree_bytes = ergo_tree.bytes();
    const deserialize_cost = @as(i64, @intCast(tree_bytes.len)) * COST_PER_TREE_BYTE;
    const curr_cost = try addCostChecked(context.init_cost, deserialize_cost, context.cost_limit);

    var context1 = context.*;
    context1.init_cost = curr_cost;

    // Substitute DeserializeContext nodes
    const prop_tree = try applyDeserializeContext(&context1, prop);

    // Reduce the substituted tree
    return reduceToCrypto(&context1, prop_tree);
}

Complete Verification Flow

verify(ergoTree, context, proof, message)
─────────────────────────────────────────────────────

Step 1: checkSoftForkCondition()
        │
        ├─ activated > MaxSupported AND script > MaxSupported
        │  └─> return (true, initCost)  [soft-fork accept]
        │
        ├─ script.version > activated
        │  └─> throw ErgoTreeVersionTooHigh
        │
        └─ Otherwise: proceed
                │
                ▼
Step 2: fullReduction()
        │
        ├─ propositionFromErgoTree()
        │  └─ Handle unparsed trees
        │
        ├─ SigmaPropConstant
        │  └─> Extract directly
        │
        ├─ No DeserializeContext
        │  └─> evalToCrypto()
        │
        └─ Has DeserializeContext
           └─> reductionWithDeserialize()
                │
                ▼
        ReductionResult(sigmaBoolean, cost)
                │
                ▼
Step 3: Check result
        │
        ├─ TrueProp  ────> return (true, cost)
        ├─ FalseProp ────> return (false, cost)
        └─ Non-trivial ────> continue
                │
                ▼
Step 4: addCryptoCost()
        │
        └─ Estimate without crypto ops
                │
                ▼
Step 5: verifySignature()
        │
        ├─ parseAndComputeChallenges()
        │  └─ Parse proof bytes
        │
        ├─ computeCommitments()
        │  └─ Reconstruct commitments
        │
        ├─ fiatShamirTreeToBytes()
        │  └─ Serialize tree
        │
        └─ fiatShamirHashFn()
           └─ Compute expected challenge
                │
                ▼
Step 6: Return (result, totalCost)

Verifier Errors

const VerifierError = error{
    /// Failed to parse ErgoTree
    ErgoTreeError,
    /// Failed to evaluate ErgoTree
    EvalError,
    /// Signature parsing error
    SigParsingError,
    /// Fiat-Shamir serialization error
    FiatShamirTreeSerializationError,
    /// Cost limit exceeded
    CostLimitExceeded,
    /// ErgoTree version too high
    ErgoTreeVersionTooHigh,
    /// Cannot parse unparsed tree
    UnparsedErgoTree,
};

Test Verifier

A simple verifier implementation for testing:

const TestVerifier = struct {
    const Self = @This();

    pub fn verify(
        self: *const Self,
        tree: *const ErgoTree,
        ctx: *const Context,
        proof: ProofBytes,
        message: []const u8,
    ) VerifierError!VerificationResult {
        _ = self;
        const reduction = try reduceToCrypto(tree, ctx);

        const result: bool = switch (reduction.sigma_prop) {
            .trivial_prop => |b| b,
            else => |sb| blk: {
                if (proof.isEmpty()) {
                    break :blk false;
                }
                const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
                break :blk try checkCommitments(unchecked, message);
            },
        };

        return .{
            .result = result,
            .cost = 0, // Test verifier doesn't track cost
            .diag = reduction.diag,
        };
    }
};

Summary

This chapter covered the verifier implementation that validates Sigma proofs:

  • Verification proceeds in two phases: reduction (ErgoTree → SigmaBoolean) and cryptographic verification (proof checking)
  • fullReduction() evaluates the ErgoTree to a SigmaBoolean proposition while tracking costs
  • verifySignature() implements Verifier Steps 4-6: parse proof bytes, compute expected commitments from challenges and responses, then verify via Fiat-Shamir hash
  • Soft-fork handling accepts scripts with unrecognized versions or opcodes, enabling protocol upgrades without network splits
  • Cost estimation predicts cryptographic verification cost before performing expensive EC operations, failing early if the limit would be exceeded
  • Commitment reconstruction (computeCommitments) derives the prover's commitments from the challenges and responses, which must match the Fiat-Shamir challenge
  • DeserializeContext nodes are substituted with their deserialized values before reduction begins

Next: Chapter 15: Prover Implementation

Chapter 15: Prover Implementation

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

  • Chapter 11 for Sigma protocol structure, simulation, and Fiat-Shamir
  • Chapter 12 for ErgoTree reduction to SigmaBoolean
  • Chapter 14 for understanding what the verifier expects

Learning Objectives

By the end of this chapter, you will be able to:

  • Trace the 10-step proving algorithm from SigmaBoolean to serialized proof
  • Work with the UnprovenTree data structure and its transformations
  • Explain challenge flow through AND, OR, and THRESHOLD compositions
  • Use the hint system for distributed multi-party signing
  • Serialize proofs in the compact format expected by verifiers

Prover Overview

The prover is the counterpart to the verifier: given an ErgoTree, a transaction context, and the necessary secret keys, it generates a cryptographic proof that the verifier will accept. The proving algorithm is significantly more complex than verification because it must handle composite propositions (AND/OR/THRESHOLD) by generating simulated transcripts for children the prover cannot prove, while maintaining the zero-knowledge property that simulated and real transcripts are indistinguishable.

The prover generates cryptographic proofs for sigma propositions through a multi-phase algorithm:

Proving Pipeline
─────────────────────────────────────────────────────

Step 0:  SigmaBoolean ─────> convertToUnproven()
                             │
                             ▼
Step 1:  Mark real nodes (bottom-up)
                             │
                             ▼
Step 2:  Check root is real (abort if simulated)
                             │
                             ▼
Step 3:  Polish simulated (top-down)
                             │
                             ▼
Steps 4-6: Simulate/Commit
         - Assign challenges to simulated children
         - Simulate simulated leaves
         - Compute commitments for real leaves
                             │
                             ▼
Step 7:  Serialize for Fiat-Shamir
                             │
                             ▼
Step 8:  Compute root challenge = H(tree || message)
                             │
                             ▼
Step 9:  Compute real challenges and responses
                             │
                             ▼
Step 10: Serialize proof bytes

Tree Data Structures

Node Position

The position encodes the path from the root:

const NodePosition = struct {
    /// Position bytes (e.g., [0, 2, 1] for "0-2-1")
    positions: []const u8,

    pub const CRYPTO_TREE_PREFIX: NodePosition = .{ .positions = &[_]u8{0} };

    pub fn child(self: NodePosition, idx: usize, allocator: Allocator) !NodePosition {
        var new_pos = try allocator.alloc(u8, self.positions.len + 1);
        @memcpy(new_pos[0..self.positions.len], self.positions);
        new_pos[self.positions.len] = @intCast(idx);
        return .{ .positions = new_pos };
    }
};

Position Encoding
─────────────────────────────────────────────────────

            0           (root)
          / | \
         /  |  \
       0-0 0-1 0-2      (children)
               /|
              / |
            0-2-0 0-2-1 (grandchildren)

Prefix "0" = crypto-tree (vs "1" = ErgoTree)
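The path encoding is straightforward to implement; a hedged Python sketch matching the diagram, with positions as tuples of child indices:

```python
CRYPTO_TREE_PREFIX = (0,)   # prefix 0 = crypto-tree, 1 = ErgoTree

def child(position, idx):
    # A child's position is the parent's path with the child index appended
    return position + (idx,)

def render(position):
    return "-".join(str(p) for p in position)

root = CRYPTO_TREE_PREFIX
assert render(child(root, 2)) == "0-2"
assert render(child(child(root, 2), 1)) == "0-2-1"   # grandchild in the diagram
```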

Unproven Tree

During proving, the tree undergoes a series of transformations:

const UnprovenTree = union(enum) {
    unproven_leaf: UnprovenLeaf,
    unproven_conjecture: UnprovenConjecture,

    pub fn isReal(self: UnprovenTree) bool {
        return !self.simulated();
    }

    pub fn simulated(self: UnprovenTree) bool {
        return switch (self) {
            .unproven_leaf => |l| l.simulated,
            .unproven_conjecture => |c| c.simulated(),
        };
    }

    pub fn withChallenge(self: UnprovenTree, challenge: Challenge) UnprovenTree {
        return switch (self) {
            .unproven_leaf => |l| .{ .unproven_leaf = l.withChallenge(challenge) },
            .unproven_conjecture => |c| .{ .unproven_conjecture = c.withChallenge(challenge) },
        };
    }

    pub fn withSimulated(self: UnprovenTree, sim: bool) UnprovenTree {
        return switch (self) {
            .unproven_leaf => |l| .{ .unproven_leaf = l.withSimulated(sim) },
            .unproven_conjecture => |c| .{ .unproven_conjecture = c.withSimulated(sim) },
        };
    }
};

Unproven Leaf Nodes

const UnprovenLeaf = union(enum) {
    unproven_schnorr: UnprovenSchnorr,
    unproven_dh_tuple: UnprovenDhTuple,

    // ... accessor methods
};

const UnprovenSchnorr = struct {
    proposition: ProveDlog,
    commitment_opt: ?FirstDlogProverMessage,
    randomness_opt: ?Scalar,  // Secret r for commitment
    challenge_opt: ?Challenge,
    simulated: bool,
    position: NodePosition,

    pub fn withChallenge(self: UnprovenSchnorr, c: Challenge) UnprovenSchnorr {
        return .{
            .proposition = self.proposition,
            .commitment_opt = self.commitment_opt,
            .randomness_opt = self.randomness_opt,
            .challenge_opt = c,
            .simulated = self.simulated,
            .position = self.position,
        };
    }

    pub fn withSimulated(self: UnprovenSchnorr, sim: bool) UnprovenSchnorr {
        return .{
            .proposition = self.proposition,
            .commitment_opt = self.commitment_opt,
            .randomness_opt = self.randomness_opt,
            .challenge_opt = self.challenge_opt,
            .simulated = sim,
            .position = self.position,
        };
    }
};

const UnprovenDhTuple = struct {
    proposition: ProveDhTuple,
    commitment_opt: ?FirstDhTupleProverMessage,
    randomness_opt: ?Scalar,
    challenge_opt: ?Challenge,
    simulated: bool,
    position: NodePosition,
};

Unproven Conjecture Nodes

const UnprovenConjecture = union(enum) {
    cand_unproven: CandUnproven,
    cor_unproven: CorUnproven,
    cthreshold_unproven: CthresholdUnproven,

    pub fn simulated(self: UnprovenConjecture) bool {
        return switch (self) {
            .cand_unproven => |c| c.simulated,
            .cor_unproven => |c| c.simulated,
            .cthreshold_unproven => |c| c.simulated,
        };
    }

    pub fn children(self: UnprovenConjecture) []ProofTree {
        return switch (self) {
            .cand_unproven => |c| c.children,
            .cor_unproven => |c| c.children,
            .cthreshold_unproven => |c| c.children,
        };
    }
};

const CandUnproven = struct {
    proposition: Cand,
    challenge_opt: ?Challenge,
    simulated: bool,
    children: []ProofTree,
    position: NodePosition,
};

const CorUnproven = struct {
    proposition: Cor,
    challenge_opt: ?Challenge,
    simulated: bool,
    children: []ProofTree,
    position: NodePosition,
};

const CthresholdUnproven = struct {
    proposition: Cthreshold,
    challenge_opt: ?Challenge,
    simulated: bool,
    k: u8,                        // Threshold
    children: []ProofTree,
    polynomial_opt: ?Gf2_192Poly, // For challenge distribution
    position: NodePosition,
};

The Proving Algorithm

Prover Trait

const Prover = struct {
    secrets: []const PrivateInput,

    pub fn prove(
        self: *const Prover,
        tree: *const ErgoTree,
        ctx: *const Context,
        message: []const u8,
        hints_bag: *const HintsBag,
    ) ProverError!ProverResult {
        const reduction = try reduceToCrypto(tree, ctx);
        const proof = try self.generateProof(
            reduction.sigma_prop,
            message,
            hints_bag,
        );
        return .{
            .proof = proof,
            .extension = ctx.extension,
        };
    }

    pub fn generateProof(
        self: *const Prover,
        sigma_bool: SigmaBoolean,
        message: []const u8,
        hints_bag: *const HintsBag,
    ) ProverError!ProofBytes {
        return switch (sigma_bool) {
            .trivial_prop => |b| blk: {
                if (b) break :blk ProofBytes.empty();
                return error.ReducedToFalse;
            },
            else => |sb| blk: {
                const unproven = try convertToUnproven(sb);
                const unchecked = try proveToUnchecked(self, unproven, message, hints_bag);
                break :blk serializeSig(unchecked);
            },
        };
    }
};

Step 0: Convert to Unproven

Transform the SigmaBoolean into an UnprovenTree:

fn convertToUnproven(sigma_tree: SigmaBoolean) ProverError!UnprovenTree {
    return switch (sigma_tree) {
        .c_and => |and_node| blk: {
            var children = try allocator.alloc(ProofTree, and_node.children.len);
            for (and_node.children, 0..) |child, i| {
                children[i] = .{ .unproven_tree = try convertToUnproven(child) };
            }
            break :blk .{
                .unproven_conjecture = .{
                    .cand_unproven = .{
                        .proposition = and_node,
                        .challenge_opt = null,
                        .simulated = false,
                        .children = children,
                        .position = NodePosition.CRYPTO_TREE_PREFIX,
                    },
                },
            };
        },
        .c_or => |or_node| blk: {
            // Similar conversion for OR
            // ...
        },
        .c_threshold => |th| blk: {
            // Similar conversion for THRESHOLD
            // ...
        },
        .prove_dlog => |pk| .{
            .unproven_leaf = .{
                .unproven_schnorr = .{
                    .proposition = pk,
                    .commitment_opt = null,
                    .randomness_opt = null,
                    .challenge_opt = null,
                    .simulated = false,
                    .position = NodePosition.CRYPTO_TREE_PREFIX,
                },
            },
        },
        .prove_dh_tuple => |dht| .{
            .unproven_leaf = .{
                .unproven_dh_tuple = .{
                    .proposition = dht,
                    .commitment_opt = null,
                    .randomness_opt = null,
                    .challenge_opt = null,
                    .simulated = false,
                    .position = NodePosition.CRYPTO_TREE_PREFIX,
                },
            },
        },
        else => error.Unexpected,
    };
}

Step 1: Mark Real Nodes

Bottom-up traversal to mark which nodes the prover can prove:

fn markReal(
    prover: *const Prover,
    tree: UnprovenTree,
    hints_bag: *const HintsBag,
) ProverError!UnprovenTree {
    return rewriteBottomUp(tree, struct {
        fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ?ProofTree {
            return switch (node) {
                .unproven_tree => |ut| switch (ut) {
                    .unproven_leaf => |leaf| blk: {
                        // Leaf is real if prover has secret OR hint shows knowledge
                        const secret_known = hints.realImages().contains(leaf.proposition()) or
                            p.hasSecretFor(leaf.proposition());
                        break :blk leaf.withSimulated(!secret_known);
                    },
                    .unproven_conjecture => |conj| switch (conj) {
                        .cand_unproven => |cand| blk: {
                            // AND is real only if ALL children are real
                            const simulated = anyChildSimulated(cand.children);
                            break :blk cand.withSimulated(simulated);
                        },
                        .cor_unproven => |cor| blk: {
                            // OR is real if AT LEAST ONE child is real
                            const simulated = allChildrenSimulated(cor.children);
                            break :blk cor.withSimulated(simulated);
                        },
                        .cthreshold_unproven => |ct| blk: {
                            // THRESHOLD(k) is real if AT LEAST k children are real
                            const real_count = countRealChildren(ct.children);
                            break :blk ct.withSimulated(real_count < ct.k);
                        },
                    },
                },
                else => null,
            };
        }
    }.transform, prover, hints_bag);
}
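The marking rules are easiest to see on plain data: a leaf is real iff a secret (or hint) is known for it, AND is real iff all children are real, OR iff at least one child is real, and THRESHOLD(k) iff at least k children are real. A hedged Python sketch over dict-based trees (illustrative only):

```python
def mark(node, secrets):
    """Bottom-up marking: sets node['simulated'] and returns the node."""
    kind = node["kind"]
    if kind == "leaf":
        node["simulated"] = node["key"] not in secrets
    else:
        kids = [mark(c, secrets) for c in node["children"]]
        real = [not c["simulated"] for c in kids]
        if kind == "and":
            node["simulated"] = not all(real)     # AND needs every child
        elif kind == "or":
            node["simulated"] = not any(real)     # OR needs one child
        elif kind == "threshold":
            node["simulated"] = sum(real) < node["k"]   # needs k children
    return node

tree = {"kind": "threshold", "k": 2, "children": [
    {"kind": "leaf", "key": "alice"},
    {"kind": "leaf", "key": "bob"},
    {"kind": "leaf", "key": "carol"},
]}
marked = mark(tree, secrets={"alice", "bob"})
assert not marked["simulated"]              # 2 of 3 provable: root is real
assert marked["children"][2]["simulated"]   # carol's leaf gets simulated
```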

Step 2: Check Root

fn proveToUnchecked(
    prover: *const Prover,
    unproven: UnprovenTree,
    message: []const u8,
    hints_bag: *const HintsBag,
) ProverError!UncheckedTree {
    // Step 1
    const step1 = try markReal(prover, unproven, hints_bag);

    // Step 2: If root is simulated, prover cannot prove
    if (!step1.isReal()) {
        return error.TreeRootIsNotReal;
    }

    // Steps 3-9...
}

Step 3: Polish Simulated

Top-down traversal to ensure correct structure⁹:

fn polishSimulated(tree: UnprovenTree) ProverError!UnprovenTree {
    return rewriteTopDown(tree, struct {
        fn transform(node: ProofTree) ?ProofTree {
            return switch (node) {
                .unproven_tree => |ut| switch (ut) {
                    .unproven_conjecture => |conj| switch (conj) {
                        .cand_unproven => |cand| blk: {
                            // Simulated AND: all children simulated
                            if (cand.simulated) {
                                break :blk cand.withChildren(
                                    markAllChildrenSimulated(cand.children),
                                );
                            }
                            break :blk cand;
                        },
                        .cor_unproven => |cor| blk: {
                            if (cor.simulated) {
                                // Simulated OR: all children simulated
                                break :blk cor.withChildren(
                                    markAllChildrenSimulated(cor.children),
                                );
                            } else {
                                // Real OR: keep ONE child real, mark rest simulated
                                break :blk makeCorChildrenSimulated(cor);
                            }
                        },
                        .cthreshold_unproven => |ct| blk: {
                            if (ct.simulated) {
                                break :blk ct.withChildren(
                                    markAllChildrenSimulated(ct.children),
                                );
                            } else {
                                // Real THRESHOLD(k): keep only k children real
                                break :blk makeThresholdChildrenSimulated(ct);
                            }
                        },
                    },
                    else => null,
                },
                else => null,
            };
        }
    }.transform);
}

fn makeCorChildrenSimulated(cor: CorUnproven) CorUnproven {
    // Keep the first real child; mark every other real child simulated
    var found_real = false;
    const new_children = allocator.alloc(ProofTree, cor.children.len) catch @panic("out of memory"); // sketch: abort on OOM
    for (cor.children, 0..) |child, i| {
        const ut = child.unproven_tree;
        if (ut.isReal() and !found_real) {
            new_children[i] = child;
            found_real = true;
        } else if (ut.isReal()) {
            new_children[i] = .{ .unproven_tree = ut.withSimulated(true) };
        } else {
            new_children[i] = child;
        }
    }
    return cor.withChildren(new_children);
}

Steps 4-6: Simulate and Commit

Combined traversal for challenges, simulation, and commitments¹⁰ ¹¹:

fn simulateAndCommit(
    tree: UnprovenTree,
    hints_bag: *const HintsBag,
    rng: std.rand.Random,
) ProverError!ProofTree {
    return rewriteTopDown(tree, struct {
        fn transform(node: ProofTree, hints: *const HintsBag, random: std.rand.Random) ?ProofTree {
            return switch (node) {
                .unproven_tree => |ut| switch (ut) {
                    // Step 4: Real conjecture assigns random challenges to simulated children
                    .unproven_conjecture => |conj| blk: {
                        if (conj.isReal()) {
                            break :blk assignChallengesFromRealParent(conj, random);
                        } else {
                            break :blk propagateChallengeToSimulatedChildren(conj, random);
                        }
                    },
                    // Steps 5-6: Simulate or commit at leaves
                    .unproven_leaf => |leaf| blk: {
                        if (leaf.simulated()) {
                            // Step 5: Simulate
                            break :blk simulateLeaf(leaf);
                        } else {
                            // Step 6: Compute commitment
                            break :blk commitLeaf(leaf, hints, random);
                        }
                    },
                },
                else => null,
            };
        }
    }.transform, hints_bag, rng);
}

/// Simulate a leaf: pick random z, compute commitment backwards
fn simulateLeaf(leaf: UnprovenLeaf) ProverError!UncheckedTree {
    return switch (leaf) {
        .unproven_schnorr => |us| blk: {
            const challenge = us.challenge_opt orelse return error.SimulatedLeafWithoutChallenge;
            const sim = DlogProver.simulate(us.proposition, challenge);
            break :blk .{
                .unchecked_leaf = .{
                    .unchecked_schnorr = .{
                        .proposition = us.proposition,
                        .commitment_opt = sim.first_message,
                        .challenge = challenge,
                        .second_message = sim.second_message,
                    },
                },
            };
        },
        .unproven_dh_tuple => |ud| blk: {
            // Similar for DHT
        },
    };
}

/// Commit at a real leaf: pick random r, compute a = g^r
///
/// SECURITY: The randomness `r` MUST come from a cryptographically secure source:
/// - Use a CSPRNG (e.g., OS-provided /dev/urandom, std.crypto.random)
/// - For platforms without secure random, use deterministic nonce generation
///   (RFC 6979 style: r = HMAC(secret_key, message))
/// - NEVER reuse nonces: reusing r with different messages reveals the secret key
fn commitLeaf(
    leaf: UnprovenLeaf,
    hints: *const HintsBag,
    rng: std.rand.Random,
) UnprovenTree {
    return switch (leaf) {
        .unproven_schnorr => |us| blk: {
            // Check hints first
            if (hints.findCommitment(us.position)) |hint| {
                break :blk us.withCommitment(hint.commitment);
            }
            // Generate fresh commitment
            const first = DlogProver.firstMessage(rng);
            break :blk .{
                .unproven_leaf = .{
                    .unproven_schnorr = .{
                        .proposition = us.proposition,
                        .commitment_opt = first.message,
                        .randomness_opt = first.r,
                        .challenge_opt = null,
                        .simulated = false,
                        .position = us.position,
                    },
                },
            };
        },
        // Similar for DHT
    };
}

Steps 7-8: Fiat-Shamir

Serialize tree and compute root challenge¹²:

fn computeRootChallenge(tree: ProofTree, message: []const u8) !Challenge {
    // Step 7: Serialize tree structure + propositions + commitments
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    try fiatShamirTreeToBytes(&tree, buf.writer());

    // Step 8: Append the message and hash
    try buf.appendSlice(message);
    return fiatShamirHashFn(buf.items);
}
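Steps 7–8 amount to a single hash call over the serialized tree and the message. A minimal Python model, assuming Blake2b-256 as the Fiat-Shamir hash and the 24-byte (192-bit) challenge size implied by the GF(2^192) challenge arithmetic; `fiat_shamir_challenge` is an illustrative name:

```python
import hashlib

SOUNDNESS_BYTES = 24  # 192-bit challenges, matching the GF(2^192) field

def fiat_shamir_challenge(tree_bytes: bytes, message: bytes) -> bytes:
    # Step 7: tree_bytes stands in for the serialized tree (propositions + commitments);
    # Step 8: append the message, hash, and truncate to the challenge size.
    digest = hashlib.blake2b(tree_bytes + message, digest_size=32).digest()
    return digest[:SOUNDNESS_BYTES]

c = fiat_shamir_challenge(b"serialized-tree", b"tx-bytes-to-sign")
print(len(c))  # 24
```

Because the commitments are hashed, any party changing a commitment after seeing the challenge invalidates the proof.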

Step 9: Compute Real Challenges and Responses

Top-down traversal for real nodes¹³ ¹⁴:

fn proving(
    prover: *const Prover,
    tree: ProofTree,
    hints_bag: *const HintsBag,
) ProverError!ProofTree {
    return rewriteTopDown(tree, struct {
        fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ProverError!?ProofTree {
            return switch (node) {
                .unproven_tree => |ut| switch (ut) {
                    .unproven_conjecture => |conj| blk: {
                        if (!conj.isReal()) break :blk null;

                        // Inner blocks need a distinct label: Zig rejects shadowing `blk`
                        break :blk switch (conj) {
                            .cand_unproven => |cand| c: {
                                // Real AND: all children get the same challenge
                                const challenge = cand.challenge_opt.?;
                                break :c cand.withChildren(
                                    propagateChallenge(cand.children, challenge),
                                );
                            },
                            .cor_unproven => |cor| c: {
                                // Real OR: the real child's challenge is the XOR of the
                                // root challenge and the simulated children's challenges
                                const root_challenge = cor.challenge_opt.?;
                                const xored = xorChallenges(root_challenge, cor.children);
                                break :c cor.withRealChildChallenge(xored);
                            },
                            // Real THRESHOLD: polynomial interpolation
                            .cthreshold_unproven => |ct| computeThresholdChallenges(ct),
                        };
                    },
                    .unproven_leaf => |leaf| blk: {
                        if (!leaf.isReal()) break :blk null;

                        const challenge = leaf.challenge_opt orelse
                            return error.RealUnprovenTreeWithoutChallenge;

                        switch (leaf) {
                            .unproven_schnorr => |us| {
                                // Use our own secret if we have it; otherwise fall back
                                // to a real-proof hint (another signer's partial proof)
                                if (p.findSecret(us.proposition)) |secret| {
                                    // Compute response z = r + e*w mod q
                                    const z = DlogProver.secondMessage(
                                        secret,
                                        us.randomness_opt.?,
                                        challenge,
                                    );
                                    break :blk .{
                                        .unchecked_leaf = .{
                                            .unchecked_schnorr = .{
                                                .proposition = us.proposition,
                                                .commitment_opt = null,
                                                .challenge = challenge,
                                                .second_message = z,
                                            },
                                        },
                                    };
                                } else if (hints.findRealProof(us.position)) |proof| {
                                    break :blk .{ .unchecked_leaf = proof.unchecked_tree.unchecked_leaf };
                                } else {
                                    return error.SecretNotFound;
                                }
                            },
                            // Similar for DHT
                        }
                    },
                },
                else => null,
            };
        }
    }.transform, prover, hints_bag);
}
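The OR challenge constraint in the `cor_unproven` branch above is plain byte-level XOR and can be checked directly. A Python sketch with illustrative 24-byte challenges:

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# For a real OR: simulated children get random challenges (Step 4), and the
# real child's challenge is chosen so all challenges XOR to the root's (Step 9).
root = os.urandom(24)
sim1 = os.urandom(24)          # simulated child: random challenge
sim2 = os.urandom(24)
real = xor_bytes(xor_bytes(root, sim1), sim2)   # real child's challenge

# Verifier-side invariant: XOR of all children equals the parent challenge
assert xor_bytes(xor_bytes(real, sim1), sim2) == root
print("OR challenge constraint holds")
```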

Step 10: Serialize Proof

fn serializeSig(tree: UncheckedTree) ProofBytes {
    var buf = std.ArrayList(u8).init(allocator);
    var w = SigmaByteWriter.init(buf.writer());

    sigWriteBytes(&tree, &w, true);

    return .{ .bytes = buf.items };
}

fn sigWriteBytes(node: *const UncheckedTree, w: *SigmaByteWriter, write_challenge: bool) void {
    if (write_challenge) {
        w.writeBytes(&node.challenge());
    }

    switch (node.*) {
        .unchecked_leaf => |leaf| switch (leaf) {
            .unchecked_schnorr => |us| {
                w.writeBytes(&us.second_message.z.toBytes());
            },
            .unchecked_dh_tuple => |dh| {
                w.writeBytes(&dh.second_message.z.toBytes());
            },
        },
        .unchecked_conjecture => |conj| switch (conj) {
            .cand_unchecked => |cand| {
                // Children's challenges equal parent's - don't write
                for (cand.children) |child| {
                    sigWriteBytes(&child, w, false);
                }
            },
            .cor_unchecked => |cor| {
                // Write all except last (computed via XOR)
                for (cor.children[0 .. cor.children.len - 1]) |child| {
                    sigWriteBytes(&child, w, true);
                }
                sigWriteBytes(&cor.children[cor.children.len - 1], w, false);
            },
            .cthreshold_unchecked => |ct| {
                // Write polynomial coefficients
                w.writeBytes(ct.polynomial.toBytes(false));
                for (ct.children) |child| {
                    sigWriteBytes(&child, w, false);
                }
            },
        },
    };
}
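Which challenges actually land in the proof follows directly from the recursion in sigWriteBytes. This Python model (node shapes are illustrative) replays the same rule: AND and THRESHOLD children never carry their own challenge, and an OR omits only the last child's:

```python
def challenge_writes(node, write=True, out=None):
    """Return [(label, challenge_written)] in serialization order."""
    if out is None:
        out = []
    kind = node[0]
    out.append((node[1] if kind == "leaf" else kind, write))
    if kind == "and":
        for child in node[1]:          # children share the parent's challenge
            challenge_writes(child, False, out)
    elif kind == "or":
        *init, last = node[1]          # last child's challenge is recoverable via XOR
        for child in init:
            challenge_writes(child, True, out)
        challenge_writes(last, False, out)
    elif kind == "threshold":
        for child in node[1]:          # challenges come from the stored polynomial
            challenge_writes(child, False, out)
    return out

tree = ("or", [("leaf", "L1"), ("and", [("leaf", "L2"), ("leaf", "L3")])])
print(challenge_writes(tree))
# [('or', True), ('L1', True), ('and', False), ('L2', False), ('L3', False)]
```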

Response Computation

Schnorr Response

const DlogProver = struct {
    /// First message: a = g^r
    pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, message: FirstDlogProverMessage } {
        const r = Scalar.random(rng);
        const a = DlogGroup.exponentiate(&DlogGroup.generator(), &r);
        return .{ .r = r, .message = .{ .a = a } };
    }

    /// Second message: z = r + e*w mod q
    pub fn secondMessage(
        private_key: DlogProverInput,
        r: Scalar,
        challenge: Challenge,
    ) SecondDlogProverMessage {
        const e = Scalar.fromBytes(&challenge.bytes);
        const z = r.add(e.mul(private_key.w));
        return .{ .z = z };
    }

    /// Simulation: pick random z, compute a = g^z * h^(-e)
    pub fn simulate(
        proposition: ProveDlog,
        challenge: Challenge,
    ) struct { first_message: FirstDlogProverMessage, second_message: SecondDlogProverMessage } {
        const z = Scalar.random(std.crypto.random); // no rng parameter here; use the OS CSPRNG
        const e = Scalar.fromBytes(&challenge.bytes);
        const minus_e = e.negate();

        const gz = DlogGroup.exponentiate(&DlogGroup.generator(), &z);
        const h_neg_e = DlogGroup.exponentiate(&proposition.h, &minus_e);
        const a = gz.multiply(&h_neg_e);

        return .{
            .first_message = .{ .a = a },
            .second_message = .{ .z = z },
        };
    }
};
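The three DlogProver operations can be exercised end to end in a toy group. This Python sketch uses a deliberately tiny, insecure subgroup of Z_23* (order-11 generator g = 4) purely for intuition, and checks that real and simulated transcripts both satisfy the verifier equation g^z = a · h^e:

```python
import secrets

# Toy Schnorr over a small prime-order subgroup of Z_p* — NOT secure parameters.
p, q, g = 23, 11, 4          # 4 has order 11 mod 23

w = 7                        # secret
h = pow(g, w, p)             # public image (the ProveDlog proposition)

def first_message():
    r = secrets.randbelow(q)
    return r, pow(g, r, p)               # a = g^r

def second_message(w, r, e):
    return (r + e * w) % q               # z = r + e*w mod q

def simulate(h, e):
    z = secrets.randbelow(q)             # pick z first, compute a backwards
    a = pow(g, z, p) * pow(h, -e % q, p) % p   # a = g^z * h^(-e)
    return a, z

def verify(h, a, e, z):
    return pow(g, z, p) == a * pow(h, e, p) % p   # g^z == a * h^e

e = 3                         # challenge (would come from Fiat-Shamir)
r, a = first_message()
z = second_message(w, r, e)
assert verify(h, a, e, z)     # real transcript verifies

a_s, z_s = simulate(h, e)
assert verify(h, a_s, e, z_s) # simulated transcript verifies too
print("both transcripts verify")
```

Both transcripts pass the same check, which is exactly the indistinguishability the simulation step relies on.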

Hint System

Hint Types

For distributed signing¹⁵:

const Hint = union(enum) {
    real_secret_proof: RealSecretProof,
    simulated_secret_proof: SimulatedSecretProof,
    own_commitment: OwnCommitment,
    real_commitment: RealCommitment,
    simulated_commitment: SimulatedCommitment,
};

const RealSecretProof = struct {
    image: SigmaBoolean,
    challenge: Challenge,
    unchecked_tree: UncheckedTree,
    position: NodePosition,
};

const OwnCommitment = struct {
    image: SigmaBoolean,
    secret_randomness: Scalar,  // PRIVATE - NEVER share!
    commitment: FirstProverMessage,
    position: NodePosition,
};
// SECURITY: OwnCommitment contains secret randomness (r). NEVER send
// OwnCommitment to other parties - only send RealCommitment (public part).
// Leaking r allows computing secret key w = (z - r) / e.

const RealCommitment = struct {
    image: SigmaBoolean,
    commitment: FirstProverMessage,
    position: NodePosition,
};

const HintsBag = struct {
    hints: []const Hint,

    pub fn realImages(self: *const HintsBag) []const SigmaBoolean {
        // Collect public images from real proofs and commitments
    }

    pub fn findCommitment(self: *const HintsBag, pos: NodePosition) ?CommitmentHint {
        for (self.hints) |hint| {
            switch (hint) {
                // Payload types differ, so each variant needs its own prong;
                // CommitmentHint is assumed to be a union over these variants
                .own_commitment => |c| if (c.position.eql(pos)) return .{ .own_commitment = c },
                .real_commitment => |c| if (c.position.eql(pos)) return .{ .real_commitment = c },
                else => {},
            }
        }
        return null;
    }

    pub fn findRealProof(self: *const HintsBag, pos: NodePosition) ?RealSecretProof {
        for (self.hints) |hint| {
            if (hint == .real_secret_proof and hint.real_secret_proof.position.eql(pos)) {
                return hint.real_secret_proof;
            }
        }
        return null;
    }
};

Distributed Signing Protocol

Distributed Signing (2-of-2 AND)
─────────────────────────────────────────────────────

Round 1: Generate commitments
  Party 1 (sk1) ─────> OwnCommitment(pk1, r1, g^r1)
  Party 2 (sk2) ─────> OwnCommitment(pk2, r2, g^r2)

Exchange: Share RealCommitment (NOT OwnCommitment!)
  Party 1 ─────> RealCommitment(pk1, g^r1) ─────> Party 2
  Party 2 ─────> RealCommitment(pk2, g^r2) ─────> Party 1

Round 2: Sign sequentially
  Party 1:
    combined = hints1 ++ RealCommitment(pk2)
    partialProof = prove(tree, msg, combined)

  Extract hints from partial:
    hintsFromProof = bagForMultisig(partialProof, ...)

  Party 2:
    combined = hints2 ++ hintsFromProof
    finalProof = prove(tree, msg, combined)
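The flow above can be walked through numerically. This Python sketch uses toy, insecure Schnorr parameters (a small subgroup of Z_23*); because a real AND hands the root challenge to both children, each party computes its response independently once commitments are exchanged:

```python
import secrets, hashlib

# Toy 2-of-2 AND signing — insecure demo parameters, for intuition only.
p, q, g = 23, 11, 4

w1, w2 = 5, 7                              # each party's secret
h1, h2 = pow(g, w1, p), pow(g, w2, p)      # public keys pk1, pk2

# Round 1: each party keeps r_i secret, shares only a_i = g^r_i (RealCommitment)
r1, r2 = secrets.randbelow(q), secrets.randbelow(q)
a1, a2 = pow(g, r1, p), pow(g, r2, p)

# Root challenge via Fiat-Shamir over both commitments + message;
# a real AND propagates this same challenge to both children
msg = b"spend tx"
e = int.from_bytes(hashlib.blake2b(bytes([a1, a2]) + msg, digest_size=2).digest(), "big") % q

# Round 2: each party responds with its own secret
z1 = (r1 + e * w1) % q
z2 = (r2 + e * w2) % q

# Verifier checks both children against the shared challenge
assert pow(g, z1, p) == a1 * pow(h1, e, p) % p
assert pow(g, z2, p) == a2 * pow(h2, e, p) % p
print("2-of-2 AND proof verifies")
```

Note that only a_i values cross the wire; the r_i stay private, matching the OwnCommitment/RealCommitment split above.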

Prover Errors

const ProverError = error{
    ErgoTreeError,
    EvalError,
    Gf2_192Error,
    ReducedToFalse,
    TreeRootIsNotReal,
    SimulatedLeafWithoutChallenge,
    RealUnprovenTreeWithoutChallenge,
    SecretNotFound,
    Unexpected,
    FiatShamirTreeSerializationError,
};

Summary

This chapter covered the prover implementation that generates Sigma proofs:

The prover transforms a sigma-tree through a 10-step algorithm (after first converting the SigmaBoolean into an UnprovenTree):

  1. Mark real (bottom-up): Identify which nodes the prover has secrets for
  2. Check root: Fail if the root is simulated (the prover cannot prove)
  3. Polish simulated (top-down): Ensure OR keeps only one real child and THRESHOLD(k) keeps exactly k real
  4. Assign challenges: Real conjectures give random challenges to simulated children; simulated conjectures propagate their challenge down
  5. Simulate: Produce simulated transcripts at simulated leaves
  6. Commit: Generate commitments (a = g^r) at real leaves
  7. Fiat-Shamir serialization: Serialize tree structure, propositions, and commitments
  8. Compute root challenge: Hash the serialized tree together with the message
  9. Prove (top-down): Distribute challenges and compute responses (z = r + e·w) for real nodes
  10. Serialize proof: Output the compact proof format

Key design principles:

  • Zero-knowledge: Simulated transcripts are computationally indistinguishable from real ones
  • Challenge flow depends on composition: AND propagates same challenge to all; OR uses XOR constraint; THRESHOLD uses polynomial interpolation over GF(2^192)
  • Hint system enables distributed signing: parties exchange commitments (never secret randomness), then sign sequentially

Next: Chapter 16: ErgoScript Parser

3. Rust: unproven_tree.rs (NodePosition)
6. Rust: prover.rs (convert_to_unproven)
7. Scala: ProverInterpreter.scala (markReal)
10. Scala: ProverInterpreter.scala (simulateAndCommit)
11. Rust: prover.rs (simulate_and_commit)
12. Rust: fiat_shamir.rs
13. Scala: ProverInterpreter.scala (proving)
14. Rust: prover.rs (proving)
15. Rust: hint.rs

Chapter 16: ErgoScript Parser

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 4 for AST node types that the parser produces
  • Chapter 2 for type syntax parsing
  • Familiarity with parsing concepts: tokenization, recursive descent, operator precedence

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain parser combinator and Pratt parsing techniques used in ErgoScript
  • Navigate the parser module structure (lexer, grammar, expressions, types)
  • Implement operator precedence using binding power
  • Trace expression parsing from ErgoScript source to untyped AST
  • Handle source position tracking for meaningful error messages

Parser Architecture

ErgoScript source code is transformed into an untyped AST through lexing and parsing¹ ²:

Parsing Pipeline
─────────────────────────────────────────────────────

Source Code
    │
    ▼
┌──────────────────────────────────────────────────┐
│                    LEXER                         │
│                                                  │
│  Characters ─────> Tokens                        │
│  "val x = 1 + 2"                                 │
│  ─────>  [ValKw, Ident("x"), Eq, Int(1),         │
│           Plus, Int(2)]                          │
└──────────────────────────────────────────────────┘
    │
    ▼
┌──────────────────────────────────────────────────┐
│                   PARSER                         │
│                                                  │
│  Tokens ─────> AST                               │
│  Grammar rules, precedence, associativity        │
│  ─────>  ValDef("x", BinOp(Int(1), +, Int(2)))   │
└──────────────────────────────────────────────────┘
    │
    ▼
Untyped AST (SValue)

Lexer (Tokenizer)

Converts character stream to tokens³:

const TokenKind = enum {
    // Literals
    int_number,
    long_number,
    string_literal,

    // Keywords
    val_kw,
    def_kw,
    if_kw,
    else_kw,
    true_kw,
    false_kw,

    // Operators
    plus,
    minus,
    star,
    slash,
    percent,
    eq,
    neq,
    lt,
    gt,
    le,
    ge,
    and_and,
    or_or,
    bang,

    // Punctuation
    l_paren,
    r_paren,
    l_brace,
    r_brace,
    l_bracket,
    r_bracket,
    dot,
    comma,
    colon,
    semicolon,
    arrow,

    // Identifiers
    ident,

    // Special
    whitespace,
    comment,
    eof,
    err,
};

const Token = struct {
    kind: TokenKind,
    text: []const u8,
    range: Range,
};

const Range = struct {
    start: usize,
    end: usize,
};

Lexer Implementation

const Lexer = struct {
    source: []const u8,
    pos: usize,

    pub fn init(source: []const u8) Lexer {
        return .{ .source = source, .pos = 0 };
    }

    pub fn nextToken(self: *Lexer) Token {
        self.skipWhitespaceAndComments();

        if (self.pos >= self.source.len) {
            return .{ .kind = .eof, .text = "", .range = .{ .start = self.pos, .end = self.pos } };
        }

        const start = self.pos;
        const c = self.source[self.pos];

        // Single-character tokens
        const single_char_token: ?TokenKind = switch (c) {
            '(' => .l_paren,
            ')' => .r_paren,
            '{' => .l_brace,
            '}' => .r_brace,
            '[' => .l_bracket,
            ']' => .r_bracket,
            '.' => .dot,
            ',' => .comma,
            ':' => .colon,
            ';' => .semicolon,
            '+' => .plus,
            '-' => .minus,
            '*' => .star,
            '/' => .slash,
            '%' => .percent,
            else => null,
        };

        if (single_char_token) |kind| {
            self.pos += 1;
            return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
        }

        // Multi-character tokens (longest match first)
        if (c == '=' and self.peek(1) == '=') {
            self.pos += 2;
            return .{ .kind = .eq, .text = "==", .range = .{ .start = start, .end = self.pos } };
        }

        if (c == '=' and self.peek(1) == '>') {
            self.pos += 2;
            return .{ .kind = .arrow, .text = "=>", .range = .{ .start = start, .end = self.pos } };
        }

        if (c == '&' and self.peek(1) == '&') {
            self.pos += 2;
            return .{ .kind = .and_and, .text = "&&", .range = .{ .start = start, .end = self.pos } };
        }

        if (c == '|' and self.peek(1) == '|') {
            self.pos += 2;
            return .{ .kind = .or_or, .text = "||", .range = .{ .start = start, .end = self.pos } };
        }

        if (c == '!' and self.peek(1) == '=') {
            self.pos += 2;
            return .{ .kind = .neq, .text = "!=", .range = .{ .start = start, .end = self.pos } };
        }
        // <, <=, >, >=, and prefix ! follow the same pattern (elided)

        // Numbers
        if (std.ascii.isDigit(c)) {
            return self.scanNumber(start);
        }

        // Identifiers and keywords
        if (std.ascii.isAlphabetic(c) or c == '_') {
            return self.scanIdentifier(start);
        }

        // Unknown character
        self.pos += 1;
        return .{ .kind = .err, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
    }

    fn scanIdentifier(self: *Lexer, start: usize) Token {
        while (self.pos < self.source.len) {
            const c = self.source[self.pos];
            if (std.ascii.isAlphanumeric(c) or c == '_') {
                self.pos += 1;
            } else {
                break;
            }
        }

        const text = self.source[start..self.pos];
        const kind: TokenKind = if (keywords.get(text)) |kw| kw else .ident;
        return .{ .kind = kind, .text = text, .range = .{ .start = start, .end = self.pos } };
    }

    fn scanNumber(self: *Lexer, start: usize) Token {
        // Check for hex
        if (self.source[self.pos] == '0' and self.pos + 1 < self.source.len and
            (self.source[self.pos + 1] == 'x' or self.source[self.pos + 1] == 'X'))
        {
            self.pos += 2;
            while (self.pos < self.source.len and std.ascii.isHex(self.source[self.pos])) {
                self.pos += 1;
            }
        } else {
            while (self.pos < self.source.len and std.ascii.isDigit(self.source[self.pos])) {
                self.pos += 1;
            }
        }

        // Check for L suffix (long)
        var kind: TokenKind = .int_number;
        if (self.pos < self.source.len and (self.source[self.pos] == 'L' or self.source[self.pos] == 'l')) {
            kind = .long_number;
            self.pos += 1;
        }

        return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
    }

    const keywords = std.ComptimeStringMap(TokenKind, .{
        .{ "val", .val_kw },
        .{ "def", .def_kw },
        .{ "if", .if_kw },
        .{ "else", .else_kw },
        .{ "true", .true_kw },
        .{ "false", .false_kw },
    });
};
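For comparison, a throwaway regex tokenizer in Python reproduces the same token kinds for the chapter's running example. The `assign` kind for a bare `=` is introduced here for clarity; everything else mirrors the TokenKind names above:

```python
import re

# Minimal regex tokenizer — illustrative only, not the book's lexer
TOKEN_RE = re.compile(r"""
    (?P<long_number>\d+[lL]) | (?P<int_number>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<eq>==) | (?P<arrow>=>) | (?P<and_and>&&)
  | (?P<assign>=) | (?P<plus>\+) | (?P<minus>-)
  | (?P<ws>\s+)
""", re.VERBOSE)

KEYWORDS = {"val": "val_kw", "def": "def_kw", "if": "if_kw",
            "else": "else_kw", "true": "true_kw", "false": "false_kw"}

def tokenize(src):
    out = []
    for m in TOKEN_RE.finditer(src):
        kind = m.lastgroup
        if kind == "ws":                   # skip whitespace, as the Zig lexer does
            continue
        if kind == "ident" and m.group() in KEYWORDS:
            kind = KEYWORDS[m.group()]     # promote identifiers to keywords
        out.append((kind, m.group()))
    return out

print(tokenize("val x = 1 + 2"))
# [('val_kw', 'val'), ('ident', 'x'), ('assign', '='), ('int_number', '1'),
#  ('plus', '+'), ('int_number', '2')]
```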

Parser Structure

Event-based parser using markers⁴ ⁵:

const Event = union(enum) {
    start_node: SyntaxKind,
    start_node_at: usize, // forward parent: start a node at an earlier event's position
    add_token,
    finish_node,
    err: ParseError,
    placeholder,
};

const Parser = struct {
    source: Source,
    events: std.ArrayList(Event),
    expected_kinds: std.ArrayList(TokenKind),
    allocator: Allocator,

    pub fn init(allocator: Allocator, tokens: []const Token) Parser {
        return .{
            .source = Source.init(tokens),
            .events = std.ArrayList(Event).init(allocator),
            .expected_kinds = std.ArrayList(TokenKind).init(allocator),
            .allocator = allocator,
        };
    }

    pub fn parse(self: *Parser) ![]Event {
        _ = grammar.root(self);
        return self.events.toOwnedSlice();
    }

    fn start(self: *Parser) Marker {
        const pos = self.events.items.len;
        try self.events.append(.placeholder);
        return Marker.init(pos);
    }

    fn at(self: *Parser, kind: TokenKind) bool {
        try self.expected_kinds.append(kind);
        return self.peek() == kind;
    }

    fn bump(self: *Parser) void {
        self.expected_kinds.clearRetainingCapacity();
        _ = self.source.nextToken();
        try self.events.append(.add_token);
    }

    fn expect(self: *Parser, kind: TokenKind) void {
        if (self.at(kind)) {
            self.bump();
        } else {
            self.err();
        }
    }
};

const Marker = struct {
    pos: usize,

    pub fn init(pos: usize) Marker {
        return .{ .pos = pos };
    }

    pub fn complete(self: Marker, p: *Parser, kind: SyntaxKind) CompletedMarker {
        p.events.items[self.pos] = .{ .start_node = kind };
        try p.events.append(.finish_node);
        return .{ .pos = self.pos };
    }

    pub fn precede(self: Marker, p: *Parser) Marker {
        const new_marker = p.start();
        p.events.items[self.pos] = .{ .start_node_at = new_marker.pos };
        return new_marker;
    }
};
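The payoff of the event-based design is that tree construction becomes a separate replay pass over the flat event list. A minimal Python model (event shapes are illustrative):

```python
# Replay a flat event stream into a nested tree
def build_tree(events, tokens):
    stack, toks = [], iter(tokens)
    root = None
    for ev in events:
        kind = ev[0]
        if kind == "start_node":
            node = (ev[1], [])
            if stack:
                stack[-1][1].append(node)   # attach to the open parent
            stack.append(node)
        elif kind == "add_token":
            stack[-1][1].append(next(toks)) # consume the next token in order
        elif kind == "finish_node":
            root = stack.pop()
    return root

# Events a parser might emit for "1 + 2" (illustrative)
events = [("start_node", "infix_expr"),
          ("start_node", "int_literal"), ("add_token",), ("finish_node",),
          ("add_token",),                                  # the '+' token
          ("start_node", "int_literal"), ("add_token",), ("finish_node",),
          ("finish_node",)]
tree = build_tree(events, ["1", "+", "2"])
print(tree)
# ('infix_expr', [('int_literal', ['1']), '+', ('int_literal', ['2'])])
```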

Pratt Parsing (Binding Power)

Expression parsing uses Pratt parsing for operator precedence⁶ ⁷. This technique, introduced by Vaughan Pratt in 1973 ("Top Down Operator Precedence"), elegantly handles operator precedence and associativity through numeric "binding power" values:

Binding Power Concept
─────────────────────────────────────────────────────

Expression:   A       +       B       *       C
Power:           3       3       5       5

The * has higher binding power, holds B and C tighter.
Result: A + (B * C)

Associativity via asymmetric power:
Expression:   A       +       B       +       C
Power:     0     3      3.1     3      3.1     0

Right power slightly higher → left associativity
Result: (A + B) + C
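The loop is short enough to demonstrate directly. This Python sketch applies the chapter's (left, right) binding powers to pre-tokenized input and shows both precedence and left associativity:

```python
# Minimal Pratt parser — illustrative subset of the chapter's operators
BP = {"+": (9, 10), "-": (9, 10), "*": (11, 12), "/": (11, 12)}

def parse(tokens):
    pos = [0]
    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None
    def expr(min_bp):
        lhs = tokens[pos[0]]; pos[0] += 1       # atom
        while True:
            op = peek()
            if op not in BP:
                break
            left, right = BP[op]
            if left < min_bp:                    # operator binds too weakly: stop
                break
            pos[0] += 1                          # consume the operator
            rhs = expr(right)                    # recurse with the RIGHT power
            lhs = (op, lhs, rhs)
        return lhs
    return expr(0)

print(parse(["A", "+", "B", "*", "C"]))   # ('+', 'A', ('*', 'B', 'C'))
print(parse(["A", "+", "B", "+", "C"]))   # ('+', ('+', 'A', 'B'), 'C')
```

Because the right power of `+` (10) exceeds its left power (9), a second `+` at the same level fails the `left < min_bp` test inside the recursion, producing left-associative grouping.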

Expression Grammar

const grammar = struct {
    pub fn root(p: *Parser) CompletedMarker {
        const m = p.start();
        while (!p.atEnd()) {
            stmt(p);
        }
        return m.complete(p, .root);
    }

    pub fn expr(p: *Parser) ?CompletedMarker {
        return exprBindingPower(p, 0);
    }

    /// Pratt parser core
    fn exprBindingPower(p: *Parser, min_bp: u8) ?CompletedMarker {
        // The working marker must not be named `lhs`: that would shadow the
        // `lhs` function below, which Zig rejects
        var lhs_marker = lhs(p) orelse return null;

        while (true) {
            const op: ?BinaryOp = blk: {
                if (p.at(.plus)) break :blk .add;
                if (p.at(.minus)) break :blk .sub;
                if (p.at(.star)) break :blk .mul;
                if (p.at(.slash)) break :blk .div;
                if (p.at(.percent)) break :blk .mod;
                if (p.at(.lt)) break :blk .lt;
                if (p.at(.gt)) break :blk .gt;
                if (p.at(.le)) break :blk .le;
                if (p.at(.ge)) break :blk .ge;
                if (p.at(.eq)) break :blk .eq;
                if (p.at(.neq)) break :blk .neq;
                if (p.at(.and_and)) break :blk .and_;
                if (p.at(.or_or)) break :blk .or_;
                break :blk null;
            };

            if (op == null) break;

            const bp = op.?.bindingPower();
            if (bp.left < min_bp) break;

            // Consume operator
            p.bump();

            // Parse the right operand with the operator's right binding power
            const m = lhs_marker.precede(p);
            const parsed_rhs = exprBindingPower(p, bp.right) != null;
            lhs_marker = m.complete(p, .infix_expr);

            if (!parsed_rhs) break;
        }

        return lhs_marker;
    }

    /// Left-hand side (atoms and prefix expressions)
    fn lhs(p: *Parser) ?CompletedMarker {
        if (p.at(.int_number)) return intNumber(p);
        if (p.at(.long_number)) return longNumber(p);
        if (p.at(.ident)) return ident(p);
        if (p.at(.true_kw) or p.at(.false_kw)) return boolLiteral(p);
        if (p.at(.minus) or p.at(.bang)) return prefixExpr(p);
        if (p.at(.l_paren)) return parenExpr(p);
        if (p.at(.l_brace)) return blockExpr(p);
        if (p.at(.if_kw)) return ifExpr(p);

        p.err();
        return null;
    }

    fn intNumber(p: *Parser) CompletedMarker {
        const m = p.start();
        p.bump();
        return m.complete(p, .int_literal);
    }

    fn ident(p: *Parser) CompletedMarker {
        const m = p.start();
        p.bump();
        return m.complete(p, .ident);
    }

    fn prefixExpr(p: *Parser) ?CompletedMarker {
        const m = p.start();
        const op_bp = UnaryOp.fromToken(p.peek()).?.bindingPower();

        p.bump(); // operator
        _ = exprBindingPower(p, op_bp.right);

        return m.complete(p, .prefix_expr);
    }

    fn parenExpr(p: *Parser) CompletedMarker {
        const m = p.start();
        p.expect(.l_paren);
        _ = expr(p);
        p.expect(.r_paren);
        return m.complete(p, .paren_expr);
    }

    fn ifExpr(p: *Parser) CompletedMarker {
        const m = p.start();
        p.expect(.if_kw);
        p.expect(.l_paren);
        _ = expr(p);
        p.expect(.r_paren);
        _ = expr(p);
        if (p.at(.else_kw)) {
            p.bump();
            _ = expr(p);
        }
        return m.complete(p, .if_expr);
    }
};

Binary Operators

const BinaryOp = enum {
    add,
    sub,
    mul,
    div,
    mod,
    lt,
    gt,
    le,
    ge,
    eq,
    neq,
    and_,
    or_,

    const BindingPower = struct { left: u8, right: u8 };

    pub fn bindingPower(self: BinaryOp) BindingPower {
        return switch (self) {
            .or_ => .{ .left = 1, .right = 2 },      // ||
            .and_ => .{ .left = 3, .right = 4 },     // &&
            .eq, .neq => .{ .left = 5, .right = 6 }, // ==, !=
            .lt, .gt, .le, .ge => .{ .left = 7, .right = 8 },
            .add, .sub => .{ .left = 9, .right = 10 },
            .mul, .div, .mod => .{ .left = 11, .right = 12 },
        };
    }
};

const UnaryOp = enum {
    neg,
    not,

    pub fn fromToken(kind: TokenKind) ?UnaryOp {
        return switch (kind) {
            .minus => .neg,
            .bang => .not,
            else => null,
        };
    }

    pub fn bindingPower(self: UnaryOp) struct { right: u8 } {
        return switch (self) {
            .neg, .not => .{ .right = 13 }, // Higher than all binary operators
        };
    }
};

Operator Precedence Table

Operator Precedence (lowest to highest)
─────────────────────────────────────────────────────
 1-2    ||                 Logical OR
 3-4    &&                 Logical AND
 5-6    == !=              Equality
 7-8    < > <= >=          Comparison
 9-10   + -                Addition, Subtraction
11-12   * / %              Multiplication, Division, Modulo
  13    - ! ~              Prefix (unary)
  14    . ()               Postfix (method call, index)
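To see how the binding powers drive both precedence and associativity, the table can be exercised with a self-contained Pratt evaluator. This is an illustrative Rust sketch, not the book's parser or either reference implementation: tokens are reduced to single ASCII characters, and only the arithmetic rows of the table are wired up.

```rust
// Minimal Pratt evaluator over the arithmetic binding powers above (sketch only).
fn binding_power(op: u8) -> Option<(u8, u8)> {
    match op {
        b'+' | b'-' => Some((9, 10)),
        b'*' | b'/' | b'%' => Some((11, 12)),
        _ => None,
    }
}

fn expr_bp(input: &[u8], pos: &mut usize, min_bp: u8) -> i64 {
    let mut lhs = match input[*pos] {
        c @ b'0'..=b'9' => { *pos += 1; (c - b'0') as i64 }
        b'(' => { // parenthesized sub-expression restarts at binding power 0
            *pos += 1;
            let v = expr_bp(input, pos, 0);
            *pos += 1; // consume ')'
            v
        }
        b'-' => { // unary minus binds tighter than all binary ops (power 13)
            *pos += 1;
            -expr_bp(input, pos, 13)
        }
        c => panic!("unexpected token {}", c as char),
    };
    while *pos < input.len() {
        let op = input[*pos];
        let Some((l_bp, r_bp)) = binding_power(op) else { break };
        if l_bp < min_bp { break; } // operator belongs to an outer level
        *pos += 1;
        let rhs = expr_bp(input, pos, r_bp);
        lhs = match op {
            b'+' => lhs + rhs,
            b'-' => lhs - rhs,
            b'*' => lhs * rhs,
            b'/' => lhs / rhs,
            b'%' => lhs % rhs,
            _ => unreachable!(),
        };
    }
    lhs
}

pub fn eval(src: &str) -> i64 {
    let bytes: Vec<u8> = src.bytes().filter(|b| !b.is_ascii_whitespace()).collect();
    let mut pos = 0;
    expr_bp(&bytes, &mut pos, 0)
}

fn main() {
    assert_eq!(eval("1 + 2 * 3"), 7);  // * binds tighter than +
    assert_eq!(eval("8 - 3 - 2"), 3);  // left-associative: (8 - 3) - 2
    assert_eq!(eval("-(1 + 2) * 3"), -9);
    println!("ok");
}
```

Note how left associativity falls out of the asymmetric pair: `-` has powers (9, 10), so the recursive call made with minimum power 10 refuses to consume a following `-` (left power 9), handing it back to the outer loop.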

Type Parsing

const TypeParser = struct {
    const predef_types = std.ComptimeStringMap(SType, .{
        .{ "Boolean", .s_boolean },
        .{ "Byte", .s_byte },
        .{ "Short", .s_short },
        .{ "Int", .s_int },
        .{ "Long", .s_long },
        .{ "BigInt", .s_big_int },
        .{ "GroupElement", .s_group_element },
        .{ "SigmaProp", .s_sigma_prop },
        .{ "Box", .s_box },
        .{ "AvlTree", .s_avl_tree },
        .{ "Context", .s_context },
        .{ "Header", .s_header },
        .{ "PreHeader", .s_pre_header },
        .{ "Unit", .s_unit },
    });

    pub fn parseType(p: *Parser) ?SType {
        // Parse a primary type first, then check for `=>` to avoid
        // left-recursing on the function-type rule.
        const domain = parsePrimary(p) orelse return null;

        // Function type: T1 => T2
        if (p.at(.arrow)) {
            p.bump();
            const range = parseType(p) orelse return null;
            return .{ .s_func = .{ .args = &[_]SType{domain}, .ret = range } };
        }

        return domain;
    }

    fn parsePrimary(p: *Parser) ?SType {
        if (p.at(.ident)) {
            const name = p.currentText();

            // Check predefined types
            if (predef_types.get(name)) |t| {
                p.bump();
                return t;
            }

            // Generic types: Coll[T], Option[T]
            p.bump();
            if (p.at(.l_bracket)) {
                p.bump();
                const inner = parseType(p) orelse return null;
                p.expect(.r_bracket);

                if (std.mem.eql(u8, name, "Coll")) {
                    return .{ .s_coll = inner };
                } else if (std.mem.eql(u8, name, "Option")) {
                    return .{ .s_option = inner };
                }
                return null; // unknown generic head
            }

            // Type variable
            return .{ .s_type_var = name };
        }

        // Tuple type: (T1, T2, ...)
        if (p.at(.l_paren)) {
            p.bump();
            var items = std.ArrayList(SType).init(p.allocator);
            while (!p.at(.r_paren)) {
                const t = parseType(p) orelse return null;
                items.append(t) catch return null;
                if (!p.at(.r_paren)) p.expect(.comma);
            }
            p.expect(.r_paren);
            return .{ .s_tuple = items.toOwnedSlice() catch return null };
        }

        return null;
    }
};
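The same grammar can be demonstrated on raw strings. The following Rust sketch (illustrative only; the enum and helper names are invented for this example, not taken from sigma-rust) implements the reduced grammar `Type ::= Primary ("=>" Type)?` with `Primary` covering named types, `Coll[T]` / `Option[T]`, and tuples:

```rust
// Recursive-descent type parser over a reduced ErgoScript type grammar (sketch).
#[derive(Debug, PartialEq)]
enum SType {
    Named(String),
    Coll(Box<SType>),
    Option(Box<SType>),
    Tuple(Vec<SType>),
    Func(Box<SType>, Box<SType>), // single-argument functions only, for brevity
}

struct P<'a> { s: &'a [u8], i: usize }

impl<'a> P<'a> {
    fn skip_ws(&mut self) {
        while self.i < self.s.len() && self.s[self.i].is_ascii_whitespace() { self.i += 1; }
    }
    fn eat(&mut self, c: u8) -> bool {
        self.skip_ws();
        if self.i < self.s.len() && self.s[self.i] == c { self.i += 1; true } else { false }
    }
    fn eat_str(&mut self, t: &str) -> bool {
        self.skip_ws();
        if self.s[self.i..].starts_with(t.as_bytes()) { self.i += t.len(); true } else { false }
    }
    fn ident(&mut self) -> Option<String> {
        self.skip_ws();
        let start = self.i;
        while self.i < self.s.len() && self.s[self.i].is_ascii_alphanumeric() { self.i += 1; }
        if self.i == start { None } else {
            Some(String::from_utf8_lossy(&self.s[start..self.i]).into_owned())
        }
    }
    fn parse_type(&mut self) -> Option<SType> {
        // Primary first, then check for `=>` (mirrors the Zig fix above).
        let domain = self.parse_primary()?;
        if self.eat_str("=>") {
            let range = self.parse_type()?;
            return Some(SType::Func(Box::new(domain), Box::new(range)));
        }
        Some(domain)
    }
    fn parse_primary(&mut self) -> Option<SType> {
        if self.eat(b'(') {
            let mut items = vec![self.parse_type()?];
            while self.eat(b',') { items.push(self.parse_type()?); }
            if !self.eat(b')') { return None; }
            return Some(if items.len() == 1 { items.pop().unwrap() } else { SType::Tuple(items) });
        }
        let name = self.ident()?;
        if self.eat(b'[') {
            let inner = self.parse_type()?;
            if !self.eat(b']') { return None; }
            return Some(match name.as_str() {
                "Coll" => SType::Coll(Box::new(inner)),
                "Option" => SType::Option(Box::new(inner)),
                _ => return None, // unknown generic head
            });
        }
        Some(SType::Named(name))
    }
}

pub fn parse(src: &str) -> Option<SType> {
    P { s: src.as_bytes(), i: 0 }.parse_type()
}

fn main() {
    assert_eq!(parse("Coll[Byte]"),
        Some(SType::Coll(Box::new(SType::Named("Byte".to_string())))));
    assert!(matches!(parse("(Int, Long)"), Some(SType::Tuple(_))));
    assert!(matches!(parse("Int => Boolean"), Some(SType::Func(_, _))));
    println!("ok");
}
```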

Statement Parsing

fn stmt(p: *Parser) ?CompletedMarker {
    if (p.at(.val_kw)) {
        return valDef(p);
    }
    if (p.at(.def_kw)) {
        return defDef(p);
    }
    return expr(p);
}

fn valDef(p: *Parser) CompletedMarker {
    const m = p.start();
    p.expect(.val_kw);
    p.expect(.ident);

    // Optional type annotation
    if (p.at(.colon)) {
        p.bump();
        _ = TypeParser.parseType(p);
    }

    p.expect(.eq);
    _ = expr(p);

    return m.complete(p, .val_def);
}

fn defDef(p: *Parser) CompletedMarker {
    const m = p.start();
    p.expect(.def_kw);
    p.expect(.ident);

    // Parameters
    if (p.at(.l_paren)) {
        p.bump();
        while (!p.at(.r_paren)) {
            p.expect(.ident);
            p.expect(.colon);
            _ = TypeParser.parseType(p);
            if (!p.at(.r_paren)) p.expect(.comma);
        }
        p.expect(.r_paren);
    }

    // Return type
    if (p.at(.colon)) {
        p.bump();
        _ = TypeParser.parseType(p);
    }

    p.expect(.eq);
    _ = expr(p);

    return m.complete(p, .def_def);
}

Source Position Tracking

Every AST node carries source position for error messages⁸:

const SourceContext = struct {
    index: usize,
    line: u32,
    column: u32,
    source_line: []const u8,

    pub fn fromIndex(index: usize, source: []const u8) SourceContext {
        var line: u32 = 1;
        var col: u32 = 1;
        var line_start: usize = 0;

        for (source[0..index], 0..) |c, i| {
            if (c == '\n') {
                line += 1;
                col = 1;
                line_start = i + 1;
            } else {
                col += 1;
            }
        }

        // Find end of current line
        var line_end = index;
        while (line_end < source.len and source[line_end] != '\n') {
            line_end += 1;
        }

        return .{
            .index = index,
            .line = line,
            .column = col,
            .source_line = source[line_start..line_end],
        };
    }
};
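The line/column scan in fromIndex is easy to check independently. A minimal Rust sketch of the same computation (hypothetical helper name, 1-based line and column as above):

```rust
// Compute 1-based (line, column) for a byte index by scanning for newlines,
// exactly as SourceContext.fromIndex does above.
pub fn line_col(source: &str, index: usize) -> (u32, u32) {
    let mut line = 1;
    let mut col = 1;
    for b in &source.as_bytes()[..index] {
        if *b == b'\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

fn main() {
    let src = "val x = 1\nval y = 2";
    assert_eq!(line_col(src, 0), (1, 1));
    assert_eq!(line_col(src, 10), (2, 1)); // first byte after the newline
    assert_eq!(line_col(src, 14), (2, 5)); // the 'y' on the second line
    println!("ok");
}
```

A linear scan per error is fine because positions are only materialized when an error is actually reported; hot paths keep raw byte offsets.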

const ParseError = struct {
    expected: []const TokenKind,
    found: ?TokenKind,
    span: Range,

    pub fn format(self: ParseError, ctx: SourceContext) []const u8 {
        // Format error message with source context
    }
};

Syntax Tree Construction

Events are converted into a concrete syntax tree⁹:

const SyntaxKind = enum {
    // Nodes
    root,
    val_def,
    def_def,
    if_expr,
    block_expr,
    infix_expr,
    prefix_expr,
    paren_expr,
    lambda_expr,
    apply_expr,
    select_expr,

    // Literals
    int_literal,
    long_literal,
    bool_literal,
    string_literal,
    ident,

    // Error
    err,
};

const SyntaxNode = struct {
    kind: SyntaxKind,
    range: Range,
    children: []SyntaxNode,
    text: ?[]const u8,
};

fn buildTree(events: []const Event, tokens: []const Token) SyntaxNode {
    var builder = TreeBuilder.init();

    for (events) |event| {
        switch (event) {
            .start_node => |kind| builder.startNode(kind),
            .add_token => builder.addToken(tokens[builder.token_idx]),
            .finish_node => builder.finishNode(),
            .err => |e| builder.addError(e),
            .placeholder => {},
        }
    }

    return builder.finish();
}

Parsing Example

Input: "val x = 1 + 2 * 3"

Tokens:
  [val_kw, ident("x"), eq, int(1), plus, int(2), star, int(3)]

Events:
  start_node(val_def)
    add_token(val_kw)
    add_token(ident)
    add_token(eq)
    start_node(infix_expr)       // 1 + (2 * 3)
      add_token(int)             // 1
      add_token(plus)
      start_node(infix_expr)     // 2 * 3
        add_token(int)           // 2
        add_token(star)
        add_token(int)           // 3
      finish_node
    finish_node
  finish_node

AST:
  ValDef
    name: "x"
    rhs: InfixExpr(+)
           lhs: IntLiteral(1)
           rhs: InfixExpr(*)
                  lhs: IntLiteral(2)
                  rhs: IntLiteral(3)
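The event-to-tree step from this example can be replayed with a stack: `start_node` pushes, `finish_node` pops and attaches the completed node to its parent. A self-contained Rust sketch (simplified event and node shapes, invented for this illustration):

```rust
// Replay a parser event stream into a nested tree (sketch of buildTree).
#[derive(Debug, Clone, Copy)]
enum Event { Start(&'static str), Token(&'static str), Finish }

#[derive(Debug)]
struct Node { kind: &'static str, children: Vec<Node>, tokens: Vec<&'static str> }

fn build(events: &[Event]) -> Node {
    let mut stack = vec![Node { kind: "root", children: vec![], tokens: vec![] }];
    for ev in events {
        match *ev {
            Event::Start(kind) => stack.push(Node { kind, children: vec![], tokens: vec![] }),
            Event::Token(t) => stack.last_mut().unwrap().tokens.push(t),
            Event::Finish => {
                let done = stack.pop().unwrap();
                stack.last_mut().unwrap().children.push(done); // attach to parent
            }
        }
    }
    stack.pop().unwrap() // the synthetic root
}

fn main() {
    use Event::*;
    // Event stream for "val x = 1 + 2 * 3", as traced above.
    let events = [
        Start("val_def"),
        Token("val"), Token("x"), Token("="),
        Start("infix_expr"),
        Token("1"), Token("+"),
        Start("infix_expr"), Token("2"), Token("*"), Token("3"), Finish,
        Finish,
        Finish,
    ];
    let root = build(&events);
    let val_def = &root.children[0];
    assert_eq!(val_def.kind, "val_def");
    assert_eq!(val_def.tokens, ["val", "x", "="]);
    let outer = &val_def.children[0];
    assert_eq!(outer.kind, "infix_expr");
    assert_eq!(outer.children[0].tokens, ["2", "*", "3"]); // nested 2 * 3
    println!("ok");
}
```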

Error Recovery

const RECOVERY_SET = [_]TokenKind{ .val_kw, .def_kw, .r_brace };

fn err(p: *Parser) !void {
    const current = p.source.peekToken();
    const range = if (current) |t| t.range else p.source.lastTokenRange();

    try p.events.append(.{
        .err = .{
            .expected = p.expected_kinds.toOwnedSlice(),
            .found = if (current) |t| t.kind else null,
            .span = range,
        },
    });

    // Skip tokens until recovery point
    if (!p.atSet(&RECOVERY_SET) and !p.atEnd()) {
        const m = p.start();
        p.bump();
        _ = m.complete(p, .err);
    }
}

Summary

  • Lexer converts characters to tokens with position tracking
  • Parser uses event-based architecture with markers
  • Pratt parsing handles operator precedence via binding power
  • Left associativity: right power slightly higher than left
  • Source positions enable accurate error messages
  • Error recovery skips to synchronization points
  • Output is untyped AST; semantic analysis comes next

Next: Chapter 17: Semantic Analysis

Footnotes

2. Rust: parser.rs
3. Rust: lexer.rs
4. Scala: Basic.scala
5. Rust: marker.rs
6. Scala: Exprs.scala
7. Rust: expr.rs:1-60
9. Rust: sink.rs

Chapter 17: Semantic Analysis

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 16 for the untyped AST structure
  • Chapter 2 for type codes and type compatibility rules
  • Familiarity with type inference concepts: type variables, unification, constraint solving

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the two-phase semantic analysis: name binding followed by type inference
  • Implement name resolution for globals, environment variables, and local definitions
  • Apply the type unification algorithm to infer types and detect mismatches
  • Describe method resolution and how method calls are lowered to direct operations
  • Trace type inference for complex expressions involving generics and collections

Semantic Analysis Overview

After parsing, the ErgoScript AST passes through two phases¹²:

Semantic Analysis Pipeline
─────────────────────────────────────────────────────

Source Code
    │
    ▼
┌──────────────────────────────────────────────────┐
│                    PARSE                         │
│                                                  │
│  Untyped AST                                     │
│  - Identifiers have NoType                       │
│  - References are unresolved strings             │
│  - Operators are symbolic                        │
└──────────────────────────────────────────────────┘
    │
    ▼
┌──────────────────────────────────────────────────┐
│                    BIND                          │
│                                                  │
│  Resolve names:                                  │
│  - Global constants (HEIGHT, SELF, INPUTS)       │
│  - Environment variables                         │
│  - Predefined functions                          │
└──────────────────────────────────────────────────┘
    │
    ▼
┌──────────────────────────────────────────────────┐
│                    TYPE                          │
│                                                  │
│  Assign types:                                   │
│  - Infer expression types                        │
│  - Resolve method calls                          │
│  - Unify generic types                           │
│  - Check type consistency                        │
└──────────────────────────────────────────────────┘
    │
    ▼
Typed AST (ready for IR)

Phase 1: Name Binding

The binder resolves identifiers to their definitions³⁴:

const BinderError = struct {
    msg: []const u8,
    span: Range,

    pub fn prettyDesc(self: BinderError, source: []const u8) []const u8 {
        // Format error with source context
    }
};

const GlobalVars = enum {
    height,
    self_,
    inputs,
    outputs,
    context,
    global,
    miner_pubkey,
    last_block_utxo_root_hash,

    pub fn tpe(self: GlobalVars) SType {
        return switch (self) {
            .height => .s_int,
            .self_ => .s_box,
            .inputs => .{ .s_coll = .s_box },
            .outputs => .{ .s_coll = .s_box },
            .context => .s_context,
            .global => .s_global,
            .miner_pubkey => .{ .s_coll = .s_byte },
            .last_block_utxo_root_hash => .s_avl_tree,
        };
    }
};

const Binder = struct {
    env: ScriptEnv,
    allocator: Allocator,

    pub fn init(allocator: Allocator, env: ScriptEnv) Binder {
        return .{ .env = env, .allocator = allocator };
    }

    pub fn bind(self: *const Binder, expr: Expr) BinderError!Expr {
        return self.rewrite(expr);
    }

    fn rewrite(self: *const Binder, expr: Expr) BinderError!Expr {
        return switch (expr.kind) {
            .ident => |name| blk: {
                // Check environment first
                if (self.env.get(name)) |value| {
                    break :blk liftToConstant(value, expr.span);
                }

                // Check global variables
                if (resolveGlobal(name)) |global| {
                    break :blk .{
                        .kind = .{ .global_vars = global },
                        .span = expr.span,
                        .tpe = global.tpe(),
                    };
                }

                // Leave unresolved for typer
                break :blk expr;
            },

            .binary => |bin| blk: {
                // Rewrite children, then store them behind fresh allocations.
                const lhs = try self.allocator.create(Expr);
                lhs.* = try self.rewrite(bin.lhs.*);
                const rhs = try self.allocator.create(Expr);
                rhs.* = try self.rewrite(bin.rhs.*);
                break :blk .{
                    .kind = .{ .binary = .{
                        .op = bin.op,
                        .lhs = lhs,
                        .rhs = rhs,
                    } },
                    .span = expr.span,
                    .tpe = expr.tpe,
                };
            },

            .block => |block| blk: {
                var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);
                for (block.bindings, 0..) |binding, i| {
                    const rhs = try self.rewrite(binding.rhs.*);
                    new_bindings[i] = .{
                        .name = binding.name,
                        .tpe = rhs.tpe orelse binding.tpe,
                        .rhs = rhs,
                    };
                }
                const body = try self.rewrite(block.body.*);
                break :blk .{
                    .kind = .{ .block = .{
                        .bindings = new_bindings,
                        .body = body,
                    } },
                    .span = expr.span,
                    .tpe = body.tpe,
                };
            },

            .lambda => |lam| blk: {
                const body = try self.rewrite(lam.body.*);
                break :blk .{
                    .kind = .{ .lambda = .{
                        .args = lam.args,
                        .body = body,
                    } },
                    .span = expr.span,
                    .tpe = expr.tpe,
                };
            },

            else => expr,
        };
    }

    fn resolveGlobal(name: []const u8) ?GlobalVars {
        const globals = std.ComptimeStringMap(GlobalVars, .{
            .{ "HEIGHT", .height },
            .{ "SELF", .self_ },
            .{ "INPUTS", .inputs },
            .{ "OUTPUTS", .outputs },
            .{ "CONTEXT", .context },
            .{ "Global", .global },
            .{ "MinerPubkey", .miner_pubkey },
            .{ "LastBlockUtxoRootHash", .last_block_utxo_root_hash },
        });
        return globals.get(name);
    }

    fn liftToConstant(value: anytype, span: Range) Expr {
        const T = @TypeOf(value);
        return .{
            .kind = .{ .literal = switch (T) {
                i32 => .{ .int = value },
                i64 => .{ .long = value },
                bool => .{ .bool_ = value },
                else => @compileError("unsupported type"),
            } },
            .span = span,
            .tpe = SType.fromNative(T),
        };
    }
};

Global Constants

Built-in Global Constants
─────────────────────────────────────────────────────
Name                    Type            Description
─────────────────────────────────────────────────────
HEIGHT                  Int             Current block height
SELF                    Box             Current box being spent
INPUTS                  Coll[Box]       Transaction inputs
OUTPUTS                 Coll[Box]       Transaction outputs
CONTEXT                 Context         Execution context
Global                  Global          Built-in utility methods
MinerPubkey             Coll[Byte]      Miner's public key
LastBlockUtxoRootHash   AvlTree         UTXO digest

Phase 2: Type Inference

The typer assigns types to all expressions⁵⁶:

const TyperError = struct {
    msg: []const u8,
    span: Range,
};

const TypeEnv = std.StringHashMap(SType);

const Typer = struct {
    predef_env: TypeEnv,
    lower_method_calls: bool,
    allocator: Allocator,

    pub fn init(allocator: Allocator, type_env: TypeEnv, lower: bool) Typer {
        var env = TypeEnv.init(allocator);
        // Add predefined function types
        env.put("min", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
        env.put("max", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
        // Merge with provided env
        var it = type_env.iterator();
        while (it.next()) |entry| {
            env.put(entry.key_ptr.*, entry.value_ptr.*) catch {};
        }
        return .{
            .predef_env = env,
            .lower_method_calls = lower,
            .allocator = allocator,
        };
    }

    pub fn typecheck(self: *Typer, bound: Expr) TyperError!Expr {
        const typed = try self.assignType(&self.predef_env, bound);
        if (typed.tpe == null) {
            return TyperError{
                .msg = "No type assigned after typing",
                .span = typed.span,
            };
        }
        return typed;
    }

    fn assignType(self: *Typer, env: *const TypeEnv, expr: Expr) TyperError!Expr {
        return switch (expr.kind) {
            // Identifier: lookup in environment
            .ident => |name| blk: {
                if (env.get(name)) |t| {
                    break :blk .{
                        .kind = expr.kind,
                        .span = expr.span,
                        .tpe = t,
                    };
                }
                return TyperError{
                    .msg = "Cannot assign type for variable",
                    .span = expr.span,
                };
            },

            // Global variables already typed
            .global_vars => |g| .{
                .kind = expr.kind,
                .span = expr.span,
                .tpe = g.tpe(),
            },

            // Block: extend environment with each binding
            // NOTE: In production, use a binding stack instead of cloning HashMap
            // for each scope. See ZIGMA_STYLE.md for zero-allocation patterns.
            .block => |block| blk: {
                var cur_env = try env.clone();
                var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);

                for (block.bindings, 0..) |binding, i| {
                    const rhs = try self.assignType(&cur_env, binding.rhs.*);
                    try cur_env.put(binding.name, rhs.tpe.?);
                    new_bindings[i] = .{
                        .name = binding.name,
                        .tpe = rhs.tpe.?,
                        .rhs = rhs,
                    };
                }

                const body = try self.assignType(&cur_env, block.body.*);

                break :blk .{
                    .kind = .{ .block = .{
                        .bindings = new_bindings,
                        .body = body,
                    } },
                    .span = expr.span,
                    .tpe = body.tpe,
                };
            },

            // Binary: type operands, check compatibility
            .binary => |bin| blk: {
                const left = try self.assignType(env, bin.lhs.*);
                const right = try self.assignType(env, bin.rhs.*);

                const result_type = try inferBinaryType(
                    bin.op,
                    left.tpe.?,
                    right.tpe.?,
                );

                break :blk .{
                    .kind = .{ .binary = .{
                        .op = bin.op,
                        .lhs = left,
                        .rhs = right,
                    } },
                    .span = expr.span,
                    .tpe = result_type,
                };
            },

            // If: check condition is Boolean, branches have same type
            .if_ => |if_expr| blk: {
                const cond = try self.assignType(env, if_expr.cond.*);
                const then_ = try self.assignType(env, if_expr.then_.*);
                const else_ = try self.assignType(env, if_expr.else_.*);

                if (cond.tpe.? != .s_boolean) {
                    return TyperError{
                        .msg = "Condition must be Boolean",
                        .span = cond.span,
                    };
                }

                if (!typesEqual(then_.tpe.?, else_.tpe.?)) {
                    return TyperError{
                        .msg = "Branches must have same type",
                        .span = expr.span,
                    };
                }

                break :blk .{
                    .kind = .{ .if_ = .{
                        .cond = cond,
                        .then_ = then_,
                        .else_ = else_,
                    } },
                    .span = expr.span,
                    .tpe = then_.tpe,
                };
            },

            // Lambda: check argument types, type body
            .lambda => |lam| blk: {
                var lambda_env = try env.clone();
                for (lam.args) |arg| {
                    if (arg.tpe == .no_type) {
                        return TyperError{
                            .msg = "Lambda argument must have explicit type",
                            .span = expr.span,
                        };
                    }
                    try lambda_env.put(arg.name, arg.tpe);
                }

                const body = try self.assignType(&lambda_env, lam.body.*);
                const arg_types = try self.allocator.alloc(SType, lam.args.len);
                for (lam.args, 0..) |arg, i| arg_types[i] = arg.tpe;
                const func_type = SType{
                    .s_func = .{
                        .args = arg_types,
                        .ret = body.tpe.?,
                    },
                };

                break :blk .{
                    .kind = .{ .lambda = .{
                        .args = lam.args,
                        .body = body,
                    } },
                    .span = expr.span,
                    .tpe = func_type,
                };
            },

            // Method call: type receiver, resolve method, unify types
            .select => |sel| try self.typeSelect(env, sel, expr.span),

            .apply => |app| try self.typeApply(env, app, expr.span),

            // Literals already typed
            .literal => |lit| .{
                .kind = expr.kind,
                .span = expr.span,
                .tpe = switch (lit) {
                    .int => .s_int,
                    .long => .s_long,
                    .bool_ => .s_boolean,
                    .string => .{ .s_coll = .s_byte },
                },
            },

            else => expr,
        };
    }
};

Binary Operation Type Inference

fn inferBinaryType(op: BinaryOp, left: SType, right: SType) TyperError!SType {
    return switch (op) {
        // Arithmetic: operands must be same numeric type
        .plus, .minus, .multiply, .divide, .modulo => blk: {
            if (!left.isNumeric() or !right.isNumeric()) {
                return error.TypeMismatch;
            }
            if (!typesEqual(left, right)) {
                return error.TypeMismatch;
            }
            break :blk left;
        },

        // Comparison: operands must be same type, result is Boolean
        .lt, .gt, .le, .ge => blk: {
            if (!typesEqual(left, right)) {
                return error.TypeMismatch;
            }
            break :blk .s_boolean;
        },

        // Equality: operands must be same type
        .eq, .neq => blk: {
            if (!typesEqual(left, right)) {
                return error.TypeMismatch;
            }
            break :blk .s_boolean;
        },

        // Logical: Boolean operands
        .and_, .or_ => blk: {
            if (left == .s_boolean and right == .s_boolean) {
                break :blk .s_boolean;
            }
            // SigmaProp operations
            if (left == .s_sigma_prop and right == .s_sigma_prop) {
                break :blk .s_sigma_prop;
            }
            // Mixed: the Boolean side is lifted, so the result is SigmaProp
            if ((left == .s_sigma_prop and right == .s_boolean) or
                (left == .s_boolean and right == .s_sigma_prop))
            {
                break :blk .s_sigma_prop;
            }
            return error.TypeMismatch;
        },

        // Bitwise: numeric operands
        .bit_and, .bit_or, .bit_xor => blk: {
            if (!left.isNumeric() or !right.isNumeric()) {
                return error.TypeMismatch;
            }
            if (!typesEqual(left, right)) {
                return error.TypeMismatch;
            }
            break :blk left;
        },
    };
}
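The rules above can be condensed into an executable table. This Rust sketch is illustrative only (a reduced type and operator set invented for this example, not the sigma-rust API); it encodes the key facts: arithmetic preserves the operand type, comparisons yield Boolean, and any SigmaProp operand lifts a logical connective to SigmaProp:

```rust
// Reduced binary-operation type inference (sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Ty { Int, Long, Boolean, SigmaProp }

#[derive(Debug)]
enum Op { Plus, Lt, And }

fn infer(op: Op, l: Ty, r: Ty) -> Result<Ty, &'static str> {
    use Ty::*;
    let numeric = |t: Ty| matches!(t, Int | Long);
    match op {
        Op::Plus => {
            // Arithmetic: same numeric type in, same type out.
            if !numeric(l) || !numeric(r) || l != r { return Err("type mismatch"); }
            Ok(l)
        }
        Op::Lt => {
            // Comparison: same type in, Boolean out.
            if l != r { return Err("type mismatch"); }
            Ok(Boolean)
        }
        Op::And => match (l, r) {
            (Boolean, Boolean) => Ok(Boolean),
            // Any SigmaProp operand lifts the conjunction to SigmaProp.
            (SigmaProp, SigmaProp) | (SigmaProp, Boolean) | (Boolean, SigmaProp) => Ok(SigmaProp),
            _ => Err("type mismatch"),
        },
    }
}

fn main() {
    assert_eq!(infer(Op::Plus, Ty::Int, Ty::Int), Ok(Ty::Int));
    assert_eq!(infer(Op::Lt, Ty::Long, Ty::Long), Ok(Ty::Boolean));
    assert_eq!(infer(Op::And, Ty::SigmaProp, Ty::Boolean), Ok(Ty::SigmaProp));
    assert!(infer(Op::Plus, Ty::Int, Ty::Long).is_err()); // no implicit widening
    println!("ok");
}
```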

Type Unification

Unification finds a substitution that makes two types equal⁷:

const TypeSubst = std.StringHashMap(SType);

fn unifyTypes(allocator: Allocator, t1: SType, t2: SType) ?TypeSubst {
    var subst = TypeSubst.init(allocator);

    return switch (t1) {
        // Type variable matches anything
        .s_type_var => |name| blk: {
            subst.put(name, t2) catch return null;
            break :blk subst;
        },

        // Collection types: unify element types
        .s_coll => |elem1| switch (t2) {
            .s_coll => |elem2| unifyTypes(allocator, elem1, elem2),
            else => null,
        },

        // Option types: unify element types
        .s_option => |elem1| switch (t2) {
            .s_option => |elem2| unifyTypes(allocator, elem1, elem2),
            else => null,
        },

        // Tuple types: unify element-wise
        .s_tuple => |items1| switch (t2) {
            .s_tuple => |items2| blk: {
                if (items1.len != items2.len) break :blk null;
                for (items1, items2) |i1, i2| {
                    const sub = unifyTypes(allocator, i1, i2) orelse break :blk null;
                    subst = mergeSubst(subst, sub) orelse break :blk null;
                }
                break :blk subst;
            },
            else => null,
        },

        // Function types: unify domain and range
        .s_func => |f1| switch (t2) {
            .s_func => |f2| blk: {
                if (f1.args.len != f2.args.len) break :blk null;
                for (f1.args, f2.args) |a1, a2| {
                    const sub = unifyTypes(allocator, a1, a2) orelse break :blk null;
                    subst = mergeSubst(subst, sub) orelse break :blk null;
                }
                const ret_sub = unifyTypes(allocator, f1.ret, f2.ret) orelse break :blk null;
                break :blk mergeSubst(subst, ret_sub);
            },
            else => null,
        },

        // Boolean can unify with SigmaProp (implicit conversion)
        .s_boolean => switch (t2) {
            .s_sigma_prop, .s_boolean => subst,
            else => null,
        },

        // SAny matches anything
        .s_any => subst,

        // Primitive types must match exactly
        else => if (typesEqual(t1, t2)) subst else null,
    };
}

fn applySubst(allocator: Allocator, tpe: SType, subst: TypeSubst) SType {
    return switch (tpe) {
        .s_type_var => |name| subst.get(name) orelse tpe,
        .s_coll => |elem| .{ .s_coll = applySubst(allocator, elem, subst) },
        .s_option => |elem| .{ .s_option = applySubst(allocator, elem, subst) },
        .s_tuple => |items| blk: {
            const mapped = allocator.alloc(SType, items.len) catch unreachable;
            for (items, 0..) |t, i| mapped[i] = applySubst(allocator, t, subst);
            break :blk .{ .s_tuple = mapped };
        },
        .s_func => |f| blk: {
            const args = allocator.alloc(SType, f.args.len) catch unreachable;
            for (f.args, 0..) |a, i| args[i] = applySubst(allocator, a, subst);
            break :blk .{
                .s_func = .{ .args = args, .ret = applySubst(allocator, f.ret, subst) },
            };
        },
        else => tpe,
    };
}

fn mergeSubst(s1: TypeSubst, s2: TypeSubst) ?TypeSubst {
    var result = s1.clone() catch return null;
    var it = s2.iterator();
    while (it.next()) |entry| {
        if (result.get(entry.key_ptr.*)) |existing| {
            if (!typesEqual(existing, entry.value_ptr.*)) {
                return null; // Conflict
            }
        } else {
            result.put(entry.key_ptr.*, entry.value_ptr.*) catch return null;
        }
    }
    return result;
}

Unification Example

Generic Method Specialization
─────────────────────────────────────────────────────

coll.map(f) where:
  - coll: Coll[Byte]
  - map type: (Coll[T], T => R) => Coll[R]
  - f: Byte => Int

Step 1: Unify Coll[T] with Coll[Byte]
        Result: {T → Byte}

Step 2: Unify (T => R) with (Byte => Int)
        T already bound to Byte ✓
        Result: {T → Byte, R → Int}

Step 3: Apply substitution to result type
        Coll[R] → Coll[Int]

Final: map specialized to (Coll[Byte], Byte => Int) => Coll[Int]
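The three-step walk-through above can be executed directly. This Rust sketch uses a reduced type language invented for the example (single-argument functions, no occurs check) and threads one mutable substitution through both unification calls, so step 2 automatically checks that T is still bound to Byte:

```rust
// Unification and substitution over a reduced type language (sketch).
use std::collections::HashMap;

#[derive(Debug, PartialEq, Clone)]
enum Ty {
    Byte,
    Int,
    Var(&'static str),
    Coll(Box<Ty>),
    Func(Box<Ty>, Box<Ty>), // single-argument functions only
}

type Subst = HashMap<&'static str, Ty>;

// Extend `subst` so that t1 == t2 under it; None on conflict.
fn unify(t1: &Ty, t2: &Ty, subst: &mut Subst) -> Option<()> {
    match (t1, t2) {
        (Ty::Var(name), _) => match subst.get(name) {
            // Already bound: the existing binding must agree with t2.
            Some(bound) => {
                let bound = bound.clone();
                unify(&bound, t2, subst)
            }
            None => {
                subst.insert(*name, t2.clone());
                Some(())
            }
        },
        (Ty::Coll(a), Ty::Coll(b)) => unify(a, b, subst),
        (Ty::Func(a1, r1), Ty::Func(a2, r2)) => {
            unify(a1, a2, subst)?;
            unify(r1, r2, subst)
        }
        _ => (t1 == t2).then_some(()),
    }
}

fn apply(t: &Ty, subst: &Subst) -> Ty {
    match t {
        Ty::Var(name) => subst.get(name).cloned().unwrap_or_else(|| t.clone()),
        Ty::Coll(e) => Ty::Coll(Box::new(apply(e, subst))),
        Ty::Func(a, r) => Ty::Func(Box::new(apply(a, subst)), Box::new(apply(r, subst))),
        _ => t.clone(),
    }
}

fn main() {
    let coll = |t: Ty| Ty::Coll(Box::new(t));
    let func = |a: Ty, r: Ty| Ty::Func(Box::new(a), Box::new(r));

    let mut subst = Subst::new();
    // Step 1: Coll[T] vs Coll[Byte]  =>  {T -> Byte}
    unify(&coll(Ty::Var("T")), &coll(Ty::Byte), &mut subst).unwrap();
    // Step 2: (T => R) vs (Byte => Int)  =>  {T -> Byte, R -> Int}
    unify(&func(Ty::Var("T"), Ty::Var("R")), &func(Ty::Byte, Ty::Int), &mut subst).unwrap();
    // Step 3: apply the substitution to the result type Coll[R]
    assert_eq!(apply(&coll(Ty::Var("R")), &subst), coll(Ty::Int));
    assert_eq!(subst.get("T"), Some(&Ty::Byte));
    println!("ok");
}
```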

Method Resolution

Methods are looked up in the type's methods container⁸:

const MethodsContainer = struct {
    const methods_by_type = std.ComptimeStringMap([]const MethodInfo, .{
        .{ "SBox", &box_methods },
        .{ "SColl", &coll_methods },
        .{ "SContext", &context_methods },
        // ...
    });

    pub fn getMethod(tpe: SType, name: []const u8) ?MethodInfo {
        const type_name = tpe.typeName();
        if (methods_by_type.get(type_name)) |methods| {
            for (methods) |m| {
                if (std.mem.eql(u8, m.name, name)) {
                    return m;
                }
            }
        }
        return null;
    }
};

const MethodInfo = struct {
    name: []const u8,
    stype: SType,
    ir_builder: ?*const fn (Expr, []const Expr) Expr,
};

const box_methods = [_]MethodInfo{
    .{ .name = "value", .stype = .s_long, .ir_builder = null },
    .{ .name = "propositionBytes", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
    .{ .name = "id", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
    .{ .name = "tokens", .stype = .{ .s_coll = .{ .s_tuple = &[_]SType{
        .{ .s_coll = .s_byte }, .s_long,
    } } }, .ir_builder = null },
    // ...
};

const coll_methods = [_]MethodInfo{
    .{ .name = "size", .stype = .s_int, .ir_builder = &buildSizeOf },
    .{ .name = "map", .stype = .{ .s_func = .{
        .args = &[_]SType{ .{ .s_type_var = "T" }, .{ .s_func = .{
            .args = &[_]SType{.{ .s_type_var = "T" }},
            .ret = .{ .s_type_var = "R" },
        } } },
        .ret = .{ .s_coll = .{ .s_type_var = "R" } },
    } }, .ir_builder = &buildMapCollection },
    // ...
};

Method Lowering

When lower_method_calls = true, method calls become IR nodes⁹:

fn typeSelect(
    self: *Typer,
    env: *const TypeEnv,
    sel: SelectExpr,
    span: Range,
) TyperError!Expr {
    const receiver = try self.assignType(env, sel.obj.*);
    const receiver_type = receiver.tpe.?;

    const method = MethodsContainer.getMethod(receiver_type, sel.field) orelse {
        return TyperError{
            .msg = "Method not found",
            .span = span,
        };
    };

    // Specialize generic method type
    const specialized = specializeMethod(method.stype, receiver_type);

    // Lower to IR node if builder available
    if (method.ir_builder) |builder| {
        if (self.lower_method_calls) {
            return builder(receiver, &[_]Expr{});
        }
    }

    // Keep as method call
    return .{
        .kind = .{ .select = .{
            .obj = receiver,
            .field = sel.field,
        } },
        .span = span,
        .tpe = specialized,
    };
}

fn buildSizeOf(receiver: Expr, _: []const Expr) Expr {
    return .{
        .kind = .{ .size_of = receiver },
        .span = receiver.span,
        .tpe = .s_int,
    };
}

fn buildMapCollection(receiver: Expr, args: []const Expr) Expr {
    return .{
        .kind = .{ .map = .{
            .input = receiver,
            .mapper = args[0],
        } },
        .span = receiver.span,
        // map over Coll[T] with mapper T => R yields Coll[R]
        .tpe = .{ .s_coll = args[0].tpe.?.s_func.ret },
    };
}
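The effect of lowering is easiest to see on a toy expression tree. This Rust sketch (invented node names, not the book's HIR/MIR types) rewrites `size` and `map` method calls into direct IR nodes and leaves everything else as a generic method call:

```rust
// Lowering method calls to direct IR nodes (sketch).
#[derive(Debug, PartialEq)]
enum Expr {
    Coll(Vec<i64>),
    Lambda, // stand-in for a typed function value
    MethodCall { obj: Box<Expr>, name: &'static str, args: Vec<Expr> },
    SizeOf(Box<Expr>),                                   // direct IR node
    MapCollection { input: Box<Expr>, mapper: Box<Expr> }, // direct IR node
}

fn lower(e: Expr) -> Expr {
    match e {
        Expr::MethodCall { obj, name: "size", .. } => Expr::SizeOf(obj),
        Expr::MethodCall { obj, name: "map", mut args } => Expr::MapCollection {
            input: obj,
            mapper: Box::new(args.remove(0)),
        },
        other => other, // no dedicated IR node: keep as a MethodCall
    }
}

fn main() {
    let size_call = Expr::MethodCall {
        obj: Box::new(Expr::Coll(vec![1, 2, 3])),
        name: "size",
        args: vec![],
    };
    assert_eq!(lower(size_call), Expr::SizeOf(Box::new(Expr::Coll(vec![1, 2, 3]))));

    let map_call = Expr::MethodCall {
        obj: Box::new(Expr::Coll(vec![1])),
        name: "map",
        args: vec![Expr::Lambda],
    };
    assert_eq!(
        lower(map_call),
        Expr::MapCollection {
            input: Box::new(Expr::Coll(vec![1])),
            mapper: Box::new(Expr::Lambda),
        }
    );
    println!("ok");
}
```

Lowering during typing (rather than at evaluation time) means the evaluator only ever sees direct nodes like SizeOf, keeping its dispatch table small.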

MIR Lowering

After typing, HIR lowers to MIR (typed IR)¹⁰:

/// Zig error sets carry no payload, so the messages and spans listed in
/// the Error Messages section are reported through a separate diagnostics
/// sink in practice; here we return plain error values.
const MirLoweringError = error{
    UnresolvedIdentifier,
    MissingType,
    TypeMismatch,
};

pub fn lower(hir_expr: hir.Expr) MirLoweringError!mir.Expr {
    const mir_expr: mir.Expr = switch (hir_expr.kind) {
        .global_vars => |g| switch (g) {
            .height => mir.GlobalVars.height.toExpr(),
            .self_ => mir.GlobalVars.self_.toExpr(),
            // ...
        },

        // Binding should have resolved every identifier already
        .ident => return error.UnresolvedIdentifier,

        .binary => |bin| blk: {
            const left = try lower(bin.lhs.*);
            const right = try lower(bin.rhs.*);
            break :blk mir.BinOp{
                .kind = bin.op.toMirOp(),
                .left = left,
                .right = right,
            }.toExpr();
        },

        .literal => |lit| switch (lit) {
            .int => |v| mir.Constant{ .int = v }.toExpr(),
            .long => |v| mir.Constant{ .long = v }.toExpr(),
            .bool_ => |v| (if (v) mir.TrueLeaf else mir.FalseLeaf).toExpr(),
        },

        // ...
    };

    // Verify the lowered type matches the HIR annotation
    const hir_tpe = hir_expr.tpe orelse return error.MissingType;
    if (!typesEqual(mir_expr.tpe(), hir_tpe)) return error.TypeMismatch;

    return mir_expr;
}

Complete Compilation Flow

pub fn compile(allocator: Allocator, source: []const u8, env: ScriptEnv) CompileError!mir.Expr {
    // 1. Parse
    const tokens = Lexer.init(source).tokenize();
    const events = Parser.init(tokens).parse();
    const ast = buildTree(events, tokens);

    // 2. Lower to HIR (local renamed so it does not shadow the `hir` module)
    const hir_expr = try hir.lower(ast);

    // 3. Bind
    const binder = Binder.init(allocator, env);
    const bound = try binder.bind(hir_expr);

    // 4. Type
    const typer = Typer.init(allocator, TypeEnv.init(allocator), true);
    const typed = try typer.typecheck(bound);

    // 5. Lower to MIR
    return mir.lower(typed);
}

Error Messages

Error Types
─────────────────────────────────────────────────────

BinderError:
  - "Variable x already defined"
  - "Cannot lift value to constant"

TyperError:
  - "Cannot assign type for variable 'foo'"
  - "Condition must be Boolean, got Int"
  - "Branches must have same type: Int vs Long"
  - "Method 'bar' not found in type Box"

MirLoweringError:
  - "Unresolved identifier"
  - "Type mismatch after lowering"

Summary

Semantic analysis consists of two phases:

Binding (Binder):

  • Resolves global names (HEIGHT, SELF, etc.)
  • Lifts environment values to constants
  • Uses bottom-up tree rewriting

Typing (Typer):

  • Assigns types to all expressions
  • Resolves method calls via MethodsContainer
  • Unifies generic types with concrete types
  • Optionally lowers method calls to IR nodes
  • Checks type consistency

Key algorithms:

  • Type unification: Find substitution making types equal
  • Substitution application: Specialize generic types
  • Method resolution: Look up methods in type's container
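
The unification algorithm summarized above can be sketched in a few lines. This is a toy Rust model (the `SType` variants, `Subst` alias, and `unify` signature are ours, not sigma-rust's): it checks one-directionally whether a concrete type is an instance of a generic one, accumulating type-variable bindings in a substitution.

```rust
use std::collections::HashMap;

// Toy type language: just enough to show variable binding and recursion.
#[derive(Clone, Debug, PartialEq)]
enum SType {
    Int,
    Long,
    TypeVar(&'static str),
    Coll(Box<SType>),
}

type Subst = HashMap<&'static str, SType>;

/// Returns true if `concrete` is an instance of `general`,
/// extending `subst` with type-variable bindings along the way.
fn unify(general: &SType, concrete: &SType, subst: &mut Subst) -> bool {
    match (general, concrete) {
        (SType::TypeVar(name), t) => match subst.get(name) {
            // A variable bound earlier must agree with every later use
            Some(bound) => bound == t,
            None => {
                subst.insert(*name, t.clone());
                true
            }
        },
        // Same constructor: unify component-wise
        (SType::Coll(a), SType::Coll(b)) => unify(a, b, subst),
        // Ground types must match exactly
        (a, b) => a == b,
    }
}
```

Unifying `Coll[T]` against `Coll[Int]` succeeds and records `T → Int`; a later attempt to use `T` as `Long` then fails, which is exactly the consistency check method specialization relies on.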

Next: Chapter 18: Intermediate Representation

[2]: Rust: binder.rs

[6]: Rust: type_infer.rs

[7]: Scala: package.scala (unifyTypes)

[8]: Scala: SRMethod.scala

[10]: Rust: lower.rs:29-76

Chapter 18: Intermediate Representation (IR)

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 17 for the typed AST that feeds into IR construction
  • Chapter 5 for operation codes that IR nodes map to
  • Understanding of compiler optimization concepts: CSE, dead code elimination

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the graph-based IR design using the Def/Ref pattern
  • Implement common subexpression elimination (CSE) via hash-consing
  • Apply graph rewriting for algebraic simplifications
  • Trace the AST → Graph IR → Optimized Tree transformations

IR Architecture Overview

The Scala compiler uses a sophisticated graph-based IR for optimization[1][2]. The Rust compiler uses a simpler direct HIR→MIR pipeline[3].

Compilation Pipelines
─────────────────────────────────────────────────────

Scala (Graph IR):
┌─────────┐   GraphBuilding   ┌──────────┐   TreeBuilding    ┌──────────┐
│ Typed   │ ─────────────────>│ Graph IR │ ─────────────────>│ Optimized│
│ AST     │   (+ CSE)         │ (Def/Ref)│   (ValDef min)    │ ErgoTree │
└─────────┘                   └──────────┘                   └──────────┘
                                   │
                                   │ DefRewriting
                                   │ (algebraic simplifications)
                                   ▼

Rust (Direct):
┌─────────┐   Lower    ┌──────────┐   Lower    ┌──────────┐   Check   ┌──────────┐
│ HIR     │ ─────────> │ Bound    │ ─────────> │ Typed    │ ─────────>│ MIR/     │
│ (parse) │            │ HIR      │            │ HIR      │           │ ErgoTree │
└─────────┘            └──────────┘            └──────────┘           └──────────┘

The Def/Ref Pattern

The core IR abstraction uses definitions (nodes) and references (edges)[4][5]:

/// Reference to a definition (graph edge)
/// Like a pointer but with type information
const Sym = u32;  // Symbol ID

/// Type descriptor for IR values
const Elem = struct {
    stype: SType,
    source_type: ?*const std.builtin.Type,
};

/// Base type for all graph nodes
const Node = struct {
    /// Unique ID assigned on creation
    node_id: u32,
    /// Cached dependencies (other nodes this one uses)
    deps: ?[]const Sym,
    /// Cached hash for structural equality
    hash_code: u32,

    pub fn getDeps(self: *const Node) []const Sym {
        if (self.deps) |d| return d;
        // Computed lazily from node contents
        return computeDeps(self);
    }
};

/// Definition of a computation (graph node)
const Def = struct {
    node: Node,
    /// Type of the result value
    result_type: Elem,
    /// Reference to this definition (created lazily)
    self_ref: ?Sym,

    pub fn self(d: *Def, ctx: *IRContext) Sym {
        if (d.self_ref) |s| return s;
        const sym = ctx.freshSym(d);
        d.self_ref = sym;
        return sym;
    }
};

IR Context

The IR context manages the graph and provides CSE[6][7]:

const IRContext = struct {
    allocator: Allocator,
    /// Counter for unique node IDs
    id_counter: u32,
    /// Global definitions: Def hash → Sym
    /// This enables CSE through hash-consing
    global_defs: std.HashMap(*const Def, Sym, DefHashContext, 80),
    /// Sym → Def mapping
    sym_to_def: std.AutoHashMap(Sym, *const Def),

    pub fn init(allocator: Allocator) IRContext {
        return .{
            .allocator = allocator,
            .id_counter = 0,
            .global_defs = std.HashMap(*const Def, Sym, DefHashContext, 80).init(allocator),
            .sym_to_def = std.AutoHashMap(Sym, *const Def).init(allocator),
        };
    }

    /// Generate fresh symbol ID
    pub fn freshSym(self: *IRContext, def: *const Def) Sym {
        const id = self.id_counter;
        self.id_counter += 1;
        self.sym_to_def.put(id, def) catch unreachable;
        return id;
    }

    /// Create or reuse existing definition (CSE)
    pub fn reifyObject(self: *IRContext, d: *Def) Sym {
        return self.findOrCreateDefinition(d);
    }

    /// Hash-consing: lookup by structural equality
    fn findOrCreateDefinition(self: *IRContext, d: *Def) Sym {
        if (self.global_defs.get(d)) |existing_sym| {
            // Reuse existing definition
            return existing_sym;
        }
        // Register new definition
        const sym = d.self(self);
        self.global_defs.put(d, sym) catch unreachable;
        return sym;
    }
};

/// Hash context for structural equality of definitions
const DefHashContext = struct {
    pub fn hash(_: DefHashContext, def: *const Def) u64 {
        // Hash based on node type and contents (structural)
        return def.node.hash_code;
    }

    pub fn eql(_: DefHashContext, a: *const Def, b: *const Def) bool {
        // Structural equality of definitions
        return structuralEqual(a, b);
    }
};

Common Subexpression Elimination

CSE is achieved automatically through hash-consing[8]:

CSE Through Hash-Consing
─────────────────────────────────────────────────────

Source:
  val a = SELF.value
  val b = SELF.value  // Same computation!
  a + b

Step 1: Build graph for SELF.value
  s1 = Self
  s2 = MethodCall(s1, "value")  → stored in global_defs

Step 2: Build graph for second SELF.value
  s1 = Self                      → already exists, reuse
  s2 = MethodCall(s1, "value")   → lookup in global_defs
                                 → found! return existing s2

Step 3: Build addition
  s3 = Plus(s2, s2)              → both operands point to s2

Result: Single computation of SELF.value
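
The interning step that makes this work fits in a few lines. The following is a self-contained Rust toy (the `Def` variants, `IrContext`, and `reify` are illustrative names, not sigma-rust's API): structurally equal definitions are interned in a map, so the second `SELF.value` resolves to the symbol minted for the first.

```rust
use std::collections::HashMap;

// Toy definition language keyed by structural equality (derive Hash + Eq).
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum Def {
    SelfBox,
    Value(u32),     // `.value` of the node named by a symbol
    Plus(u32, u32),
}

#[derive(Default)]
struct IrContext {
    defs: Vec<Def>,
    interned: HashMap<Def, u32>,
}

impl IrContext {
    /// Return the existing symbol for a structurally equal Def,
    /// or mint a fresh one. Reuse here *is* CSE.
    fn reify(&mut self, d: Def) -> u32 {
        if let Some(&sym) = self.interned.get(&d) {
            return sym;
        }
        let sym = self.defs.len() as u32;
        self.defs.push(d.clone());
        self.interned.insert(d, sym);
        sym
    }
}

/// Build `val a = SELF.value; val b = SELF.value; a + b`.
fn demo() -> (u32, u32, usize) {
    let mut ctx = IrContext::default();
    let s_self = ctx.reify(Def::SelfBox);
    let a = ctx.reify(Def::Value(s_self));
    let s_self2 = ctx.reify(Def::SelfBox); // reused: same symbol as s_self
    let b = ctx.reify(Def::Value(s_self2)); // reused: same symbol as a
    let _sum = ctx.reify(Def::Plus(a, b));
    (a, b, ctx.defs.len())
}
```

Running `demo` yields identical symbols for `a` and `b`, and only three definitions (`SelfBox`, `Value`, `Plus`) for the whole script.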

/// Build graph from typed AST
const GraphBuilder = struct {
    ctx: *IRContext,
    env: std.StringHashMap(Sym),

    pub fn buildGraph(self: *GraphBuilder, expr: *const TypedExpr) !Sym {
        return switch (expr.kind) {
            .constant => |c| self.buildConstant(c),
            .val_use => |name| self.env.get(name) orelse error.UndefinedVariable,
            .block => |b| self.buildBlock(b),
            .bin_op => |op| self.buildBinOp(op),
            .method_call => |mc| self.buildMethodCall(mc),
            .if_expr => |i| self.buildIf(i),
            .func_value => |f| self.buildLambda(f),
            .apply => |a| self.buildApply(a),
        };
    }

    fn buildConstant(self: *GraphBuilder, c: Constant) Sym {
        const def = self.ctx.allocator.create(ConstDef) catch unreachable;
        def.* = .{ .value = c };
        // CSE: if same constant exists, reuse it
        return self.ctx.reifyObject(&def.base);
    }

    fn buildBinOp(self: *GraphBuilder, op: *const BinOp) !Sym {
        const left_sym = try self.buildGraph(op.left);
        const right_sym = try self.buildGraph(op.right);

        const def = self.ctx.allocator.create(BinOpDef) catch unreachable;
        def.* = .{
            .op = op.kind,
            .left = left_sym,
            .right = right_sym,
        };
        // CSE: reuse if same operation on same operands exists
        return self.ctx.reifyObject(&def.base);
    }

    fn buildMethodCall(self: *GraphBuilder, mc: *const MethodCall) !Sym {
        const receiver_sym = try self.buildGraph(mc.receiver);
        var arg_syms = try self.ctx.allocator.alloc(Sym, mc.args.len);
        for (mc.args, 0..) |arg, i| {
            arg_syms[i] = try self.buildGraph(arg);
        }

        const def = self.ctx.allocator.create(MethodCallDef) catch unreachable;
        def.* = .{
            .receiver = receiver_sym,
            .method = mc.method,
            .args = arg_syms,
        };
        // CSE: reuse if identical method call exists
        return self.ctx.reifyObject(&def.base);
    }
};

Graph Rewriting

Algebraic simplifications are applied as rewrite rules[9][10]:

/// Rewriting rules for optimization
const DefRewriter = struct {
    ctx: *IRContext,

    /// Called on each new definition
    /// Returns replacement Sym or null for no rewrite
    pub fn rewriteDef(self: *DefRewriter, d: *const Def) ?Sym {
        return switch (d.kind()) {
            .coll_length => self.rewriteLength(d.as(CollLengthDef)),
            .coll_map => self.rewriteMap(d.as(CollMapDef)),
            .coll_zip => self.rewriteZip(d.as(CollZipDef)),
            .option_get_or_else => self.rewriteGetOrElse(d.as(OptionGetOrElseDef)),
            else => null,
        };
    }

    /// xs.map(f).length => xs.length
    fn rewriteLength(self: *DefRewriter, len_def: *const CollLengthDef) ?Sym {
        const input = self.ctx.getDef(len_def.input);
        return switch (input.kind()) {
            .coll_map => |map_def| {
                // Rule: xs.map(f).length => xs.length
                return self.makeLength(map_def.input);
            },
            .coll_replicate => |rep_def| {
                // Rule: replicate(len, v).length => len
                return rep_def.length;
            },
            .const_coll => |coll_def| {
                // Rule: Const(coll).length => coll.length
                return self.makeConstant(.{ .int = @intCast(coll_def.items.len) });
            },
            .coll_from_items => |items_def| {
                // Rule: Coll(items).length => items.length
                return self.makeConstant(.{ .int = @intCast(items_def.items.len) });
            },
            else => null,
        };
    }

    /// xs.map(identity) => xs
    /// xs.map(f).map(g) => xs.map(x => g(f(x)))
    fn rewriteMap(self: *DefRewriter, map_def: *const CollMapDef) ?Sym {
        const mapper = self.ctx.getDef(map_def.mapper);

        // Rule: xs.map(identity) => xs
        if (isIdentityLambda(mapper)) {
            return map_def.input;
        }

        const input = self.ctx.getDef(map_def.input);
        return switch (input.kind()) {
            .coll_replicate => |rep_def| {
                // Rule: replicate(l, v).map(f) => replicate(l, f(v))
                const applied = self.makeApply(map_def.mapper, rep_def.value);
                return self.makeReplicate(rep_def.length, applied);
            },
            .coll_map => |inner_map| {
                // Rule: xs.map(f).map(g) => xs.map(x => g(f(x)))
                const composed = self.composeLambdas(inner_map.mapper, map_def.mapper);
                return self.makeMap(inner_map.input, composed);
            },
            else => null,
        };
    }

    /// replicate(l, x).zip(replicate(l, y)) => replicate(l, (x, y))
    fn rewriteZip(self: *DefRewriter, zip_def: *const CollZipDef) ?Sym {
        const left = self.ctx.getDef(zip_def.left);
        const right = self.ctx.getDef(zip_def.right);

        if (left.kind() == .coll_replicate and right.kind() == .coll_replicate) {
            const rep_l = left.as(CollReplicateDef);
            const rep_r = right.as(CollReplicateDef);

            // Check same length and builder
            if (rep_l.length == rep_r.length and rep_l.builder == rep_r.builder) {
                const pair = self.makePair(rep_l.value, rep_r.value);
                return self.makeReplicate(rep_l.length, pair);
            }
        }
        return null;
    }

    /// Some(x).getOrElse(d) => x
    fn rewriteGetOrElse(self: *DefRewriter, def: *const OptionGetOrElseDef) ?Sym {
        const opt = self.ctx.getDef(def.option);
        if (opt.kind() == .option_const) {
            const opt_const = opt.as(OptionConstDef);
            if (opt_const.value) |v| {
                return self.liftValue(v);
            }
        }
        return null;
    }
};
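
A single rule like `xs.map(f).length => xs.length` is easy to exercise on a toy tree IR. This Rust sketch (the `Ir` enum and `rewrite_length` are illustrative; the real rules work on graph symbols, not owned trees) shows the essential pattern match:

```rust
// Minimal tree IR: a named collection, a map over it, and length.
#[derive(Clone, Debug, PartialEq)]
enum Ir {
    Coll(&'static str),
    Map(Box<Ir>, &'static str), // (input, mapper name)
    Length(Box<Ir>),
}

/// Apply the rule xs.map(f).length => xs.length; otherwise return unchanged.
fn rewrite_length(e: &Ir) -> Ir {
    if let Ir::Length(inner) = e {
        if let Ir::Map(input, _mapper) = inner.as_ref() {
            // Mapping never changes the number of elements,
            // so the map can be dropped under length.
            return Ir::Length(input.clone());
        }
    }
    e.clone()
}
```

Expressions that don't match the rule pass through untouched, which is why rules compose safely: each returns `null` (here, a clone) when it doesn't apply.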

Sigma-Specific Rewrites

Special optimizations for Sigma propositions[11]:

/// Sigma-specific rewriting rules
const SigmaRewriter = struct {
    ctx: *IRContext,

    pub fn rewriteSigma(self: *SigmaRewriter, d: *const Def) ?Sym {
        return switch (d.kind()) {
            .sigma_prop_is_valid => self.rewriteIsValid(d),
            .sigma_prop_from_bool => self.rewriteSigmaProp(d),
            .all_of => self.rewriteAllOf(d),
            .any_of => self.rewriteAnyOf(d),
            else => null,
        };
    }

    /// sigmaProp(b).isValid => b
    fn rewriteIsValid(self: *SigmaRewriter, d: *const Def) ?Sym {
        const is_valid = d.as(SigmaIsValidDef);
        const inner = self.ctx.getDef(is_valid.prop);

        if (inner.kind() == .sigma_prop_from_bool) {
            // Wrapping a boolean and immediately unwrapping it is the identity
            return inner.as(SigmaPropFromBoolDef).bool_expr;
        }
        return null;
    }

    /// sigmaProp(sp.isValid) => sp
    fn rewriteSigmaProp(self: *SigmaRewriter, d: *const Def) ?Sym {
        const from_bool = d.as(SigmaPropFromBoolDef);
        const inner = self.ctx.getDef(from_bool.bool_expr);

        if (inner.kind() == .sigma_prop_is_valid) {
            // The inner value already is a SigmaProp; drop both conversions
            return inner.as(SigmaIsValidDef).prop;
        }
        return null;
    }

    /// allOf(Coll(b1, ..., sp1.isValid, ...)) =>
    ///   (allOf(Coll(b1, ...)) && allZK(sp1, ...)).isValid
    fn rewriteAllOf(self: *SigmaRewriter, d: *const Def) ?Sym {
        const all_of = d.as(AllOfDef);
        const items = self.extractItems(all_of.input) orelse return null;

        var bools = std.ArrayList(Sym).init(self.ctx.allocator);
        var sigmas = std.ArrayList(Sym).init(self.ctx.allocator);

        for (items) |item| {
            const item_def = self.ctx.getDef(item);
            if (item_def.kind() == .sigma_prop_is_valid) {
                const is_valid = item_def.as(SigmaIsValidDef);
                sigmas.append(is_valid.prop) catch unreachable;
            } else {
                bools.append(item) catch unreachable;
            }
        }

        if (sigmas.items.len == 0) return null;

        // Build: (allOf(bools) && allZK(sigmas)).isValid
        const zk_all = self.makeAllZK(sigmas.items);
        if (bools.items.len == 0) {
            return self.makeIsValid(zk_all);
        }
        const bool_all = self.makeSigmaProp(self.makeAllOf(bools.items));
        const combined = self.makeSigmaAnd(bool_all, zk_all);
        return self.makeIsValid(combined);
    }
};
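
The first step of the `allOf` rewrite is a plain partition of the collection's items into boolean conditions and `sp.isValid` sigma items. A toy Rust sketch (the `Item` enum and string names stand in for graph symbols):

```rust
// Stand-ins for allOf items: a plain boolean or sp.isValid for a sigma prop sp.
#[derive(Clone, Debug, PartialEq)]
enum Item {
    Bool(&'static str),
    IsValid(&'static str),
}

/// Split allOf(Coll(items)) into (plain booleans, sigma propositions).
/// If the sigma list comes back empty, the rewrite does not apply.
fn split_all_of(items: &[Item]) -> (Vec<&'static str>, Vec<&'static str>) {
    let mut bools = Vec::new();
    let mut sigmas = Vec::new();
    for item in items {
        match item {
            Item::Bool(b) => bools.push(*b),
            Item::IsValid(sp) => sigmas.push(*sp),
        }
    }
    (bools, sigmas)
}
```

With the split in hand, the rewriter builds `allOf(bools)` as one sigma conjunct and `allZK(sigmas)` as the other, as in `rewriteAllOf` above.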

Tree Building

Transform optimized graph back to ErgoTree[12][13]:

/// Transform graph IR to ErgoTree
const TreeBuilder = struct {
    ctx: *IRContext,
    /// Maps symbols to ValDef IDs
    env: std.AutoHashMap(Sym, struct { id: u32, tpe: SType }),
    /// Current ValDef ID counter
    def_id: u32,
    allocator: Allocator,

    pub fn buildTree(self: *TreeBuilder, root: Sym) !*Expr {
        // Compute usage counts to minimize ValDefs
        const usage = self.computeUsageCounts(root);

        // Build topological schedule
        const schedule = self.buildSchedule(root);

        // Process nodes, introducing ValDefs only for multi-use
        var val_defs = std.ArrayList(ValDef).init(self.allocator);
        for (schedule) |sym| {
            if (usage.get(sym).? > 1) {
                // Multi-use node: create ValDef
                const rhs = try self.buildValue(sym);
                const tpe = self.ctx.getDef(sym).result_type.stype;
                try val_defs.append(.{
                    .id = self.def_id,
                    .tpe = tpe,
                    .rhs = rhs,
                });
                try self.env.put(sym, .{ .id = self.def_id, .tpe = tpe });
                self.def_id += 1;
            }
        }

        // Build result expression
        const result = try self.buildValue(root);

        // Wrap in block if we have ValDefs
        if (val_defs.items.len == 0) {
            return result;
        }
        return self.makeBlock(val_defs.items, result);
    }

    fn buildValue(self: *TreeBuilder, sym: Sym) !*Expr {
        // Check if already bound in environment
        if (self.env.get(sym)) |binding| {
            return self.makeValUse(binding.id, binding.tpe);
        }

        const def = self.ctx.getDef(sym);
        return switch (def.kind()) {
            .constant => |c| self.makeConstant(c),
            .context_prop => |prop| self.buildContextProp(prop),
            .method_call => |mc| self.buildMethodCall(mc),
            .bin_op => |op| self.buildBinOp(op),
            .lambda => |lam| self.buildLambda(lam),
            .apply => |app| self.buildApply(app),
            .if_then_else => |ite| self.buildIf(ite),
            else => error.UnhandledDefKind,
        };
    }

    fn computeUsageCounts(self: *TreeBuilder, root: Sym) std.AutoHashMap(Sym, u32) {
        var counts = std.AutoHashMap(Sym, u32).init(self.allocator);
        self.countUsagesRecursive(root, &counts);
        return counts;
    }

    fn countUsagesRecursive(self: *TreeBuilder, sym: Sym, counts: *std.AutoHashMap(Sym, u32)) void {
        const current = counts.get(sym) orelse 0;
        counts.put(sym, current + 1) catch unreachable;

        // Only traverse dependencies on first visit
        if (current == 0) {
            const def = self.ctx.getDef(sym);
            for (def.node.getDeps()) |dep| {
                self.countUsagesRecursive(dep, counts);
            }
        }
    }

    fn buildSchedule(self: *TreeBuilder, root: Sym) []const Sym {
        // Topological sort via DFS
        var visited = std.AutoHashMap(Sym, void).init(self.allocator);
        var schedule = std.ArrayList(Sym).init(self.allocator);
        self.dfs(root, &visited, &schedule);
        return schedule.items;
    }

    fn dfs(self: *TreeBuilder, sym: Sym, visited: *std.AutoHashMap(Sym, void), schedule: *std.ArrayList(Sym)) void {
        if (visited.contains(sym)) return;
        visited.put(sym, {}) catch unreachable;

        const def = self.ctx.getDef(sym);
        for (def.node.getDeps()) |dep| {
            self.dfs(dep, visited, schedule);
        }
        schedule.append(sym) catch unreachable;
    }
};
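
The usage-counting pass is worth seeing in isolation, since it decides which nodes become ValDefs. This Rust sketch (graph represented as a `sym → deps` map; names are ours) mirrors the Zig version's trick of traversing dependencies only on the first visit:

```rust
use std::collections::HashMap;

/// Count how often each symbol is referenced, starting from `root`.
/// Multi-use nodes (count > 1) get a ValDef; single-use nodes are inlined.
fn usage_counts(root: u32, deps: &HashMap<u32, Vec<u32>>) -> HashMap<u32, u32> {
    fn walk(sym: u32, deps: &HashMap<u32, Vec<u32>>, counts: &mut HashMap<u32, u32>) {
        let count = {
            let c = counts.entry(sym).or_insert(0);
            *c += 1;
            *c
        };
        // Traverse dependencies only on the first visit,
        // so shared subgraphs are not recounted transitively
        if count == 1 {
            if let Some(children) = deps.get(&sym) {
                for &child in children {
                    walk(child, deps, counts);
                }
            }
        }
    }
    let mut counts = HashMap::new();
    walk(root, deps, &mut counts);
    counts
}
```

On the walkthrough graph from the CSE example (`s7 = GT(s6, s5)`, `s6 = Plus(s2, s2)`, …), `s2` ends up with count 2 and gets a ValDef, while `s5` stays at 1 and is inlined.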

Operation Translation

Map IR operations to ErgoTree nodes[14]:

/// Recognize arithmetic operations
fn translateArithOp(op: BinOpKind) ?OpCode {
    return switch (op) {
        .plus => OpCode.Plus,
        .minus => OpCode.Minus,
        .multiply => OpCode.Multiply,
        .divide => OpCode.Division,
        .modulo => OpCode.Modulo,
        .min => OpCode.Min,
        .max => OpCode.Max,
        else => null,
    };
}

/// Recognize comparison operations
fn translateRelationOp(op: BinOpKind) ?*const fn (*Expr, *Expr) *Expr {
    return switch (op) {
        .eq => makeEQ,
        .neq => makeNEQ,
        .gt => makeGT,
        .lt => makeLT,
        .ge => makeGE,
        .le => makeLE,
        else => null,
    };
}

/// Recognize context properties
fn translateContextProp(prop: ContextProperty) *Expr {
    return switch (prop) {
        .height => &expr_height,
        .inputs => &expr_inputs,
        .outputs => &expr_outputs,
        .self => &expr_self,
    };
}

/// Internal definitions should not become ValDefs
fn isInternalDef(def: *const Def) bool {
    return switch (def.kind()) {
        .sigma_dsl_builder, .coll_builder => true,
        else => false,
    };
}

Rust HIR (Alternative Approach)

The Rust compiler uses a simpler tree-based HIR without graph IR[15][16]:

/// Rust-style HIR expression
const HirExpr = struct {
    kind: ExprKind,
    span: TextRange,
    tpe: ?SType,

    const ExprKind = union(enum) {
        ident: []const u8,
        binary: Binary,
        global_vars: GlobalVars,
        literal: Literal,
    };

    const Binary = struct {
        op: Spanned(BinaryOp),
        lhs: *HirExpr,
        rhs: *HirExpr,
    };

    const GlobalVars = enum {
        height,
    };

    const Literal = union(enum) {
        int: i32,
        long: i64,
    };
};

/// Rewrite HIR expressions (simpler than graph rewriting)
fn rewrite(
    allocator: Allocator,
    e: HirExpr,
    f: *const fn (*const HirExpr) ?HirExpr,
) !HirExpr {
    // Apply the rewrite function to this node
    const rewritten = f(&e) orelse e;

    // Recursively rewrite children, heap-allocating the new subtrees
    // (pointers to stack locals would dangle once this frame returns)
    return switch (rewritten.kind) {
        .binary => |bin| blk: {
            const new_lhs = try allocator.create(HirExpr);
            new_lhs.* = try rewrite(allocator, bin.lhs.*, f);
            const new_rhs = try allocator.create(HirExpr);
            new_rhs.* = try rewrite(allocator, bin.rhs.*, f);
            break :blk HirExpr{
                .kind = .{ .binary = .{
                    .op = bin.op,
                    .lhs = new_lhs,
                    .rhs = new_rhs,
                } },
                .span = rewritten.span,
                .tpe = rewritten.tpe,
            };
        },
        else => rewritten,
    };
}

CSE Example Walkthrough

Source:
─────────────────────────────────────────────────────
{
  val x = SELF.value
  val y = SELF.value    // Duplicate!
  val z = OUTPUTS(0).value
  x + y > z
}

After GraphBuilding (with CSE):
─────────────────────────────────────────────────────
s1 = Context.SELF
s2 = s1.value           // Single node for both x and y
s3 = Context.OUTPUTS
s4 = s3.apply(0)
s5 = s4.value
s6 = Plus(s2, s2)       // x + y = s2 + s2
s7 = GT(s6, s5)

After TreeBuilding (ValDef minimization):
─────────────────────────────────────────────────────
{
  val v1 = SELF.value   // s2 used twice → ValDef
  GT(Plus(v1, v1), OUTPUTS(0).value)
}

Nodes s1, s3, s4, s5 have single use → inlined
Node s2 has multiple uses → ValDef introduced

Summary

  • Def/Ref pattern separates computation definitions from references
  • Hash-consing enables automatic CSE—structurally equal nodes share identity
  • Graph rewriting applies algebraic simplifications (map fusion, etc.)
  • TreeBuilding transforms graph back to ErgoTree with minimal ValDefs
  • Usage counting determines which nodes need ValDef bindings
  • Scala uses full graph IR; Rust uses simpler tree-based HIR
  • IR optimizations reduce serialized ErgoTree size
  • Not part of consensus—compiler-only optimization

Next: Chapter 19: Compiler Pipeline

[1]: Scala: IRContext.scala

[2]: Scala: Base.scala:17-200 (Node, Def, Ref)

[3]: Rust: compiler.rs:59-76 (compile pipeline)

[4]: Scala: Base.scala:100-160 (Def trait)

[5]: Rust: hir.rs:32-37 (Expr struct)

[6]: Scala: IRContext.scala:28-50 (cake pattern)

[7]: Rust: compiler.rs:78-87 (compile_hir)

[8]: Scala: GraphBuilding.scala:28-35 (CSE documentation)

[9]: Scala: IRContext.scala:105-150 (rewriteDef)

[10]: Rust: rewrite.rs:10-29 (rewrite function)

[11]: Scala: GraphBuilding.scala:75-120 (HasSigmas, AllOf)

[12]: Scala: TreeBuilding.scala:21-50 (TreeBuilding trait)

[13]: Scala: TreeBuilding.scala:60-100 (IsArithOp, IsRelationOp)

[14]: Scala: TreeBuilding.scala:100-140 (IsContextProperty)

[15]: Rust: hir.rs:146-167 (ExprKind enum)

[16]: Rust: hir.rs:61-94 (Expr::lower)

Chapter 19: Compiler Pipeline

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Trace the complete compilation pipeline from ErgoScript source to ErgoTree bytecode
  • Use the SigmaCompiler API to compile scripts programmatically
  • Explain method call lowering strategies and when direct operations are used
  • Configure compiler settings for different networks (mainnet vs testnet)

Pipeline Architecture

The ErgoScript compiler transforms source code through multiple phases[1][2]:

Compilation Pipeline
─────────────────────────────────────────────────────

Source: "sigmaProp(SELF.value > 1000L)"
                       │
                       │ (1) Parse
                       ▼
┌─────────────────────────────────────────────────────┐
│ Untyped AST                                         │
│ Apply(Ident("sigmaProp"), [GT(Select(...), ...)])   │
└─────────────────────────────────────────────────────┘
                       │
                       │ (2) Bind
                       ▼
┌─────────────────────────────────────────────────────┐
│ Bound AST (names resolved)                          │
│ Apply(SigmaPropFunc, [GT(Self.value, 1000L)])       │
└─────────────────────────────────────────────────────┘
                       │
                       │ (3) Typecheck
                       ▼
┌─────────────────────────────────────────────────────┐
│ Typed AST                                           │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L))     │
│ :: SSigmaProp                                       │
└─────────────────────────────────────────────────────┘
                       │
                       │ (4) BuildGraph (Scala only)
                       ▼
┌─────────────────────────────────────────────────────┐
│ Graph IR (CSE applied)                              │
│ s1=Self, s2=s1.value, s3=1000L, s4=GT(s2,s3)        │
│ s5=sigmaProp(s4)                                    │
└─────────────────────────────────────────────────────┘
                       │
                       │ (5) BuildTree / Lower to MIR
                       ▼
┌─────────────────────────────────────────────────────┐
│ ErgoTree                                            │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L))     │
└─────────────────────────────────────────────────────┘
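
The overall shape of the pipeline is a composition of fallible phases. This toy Rust sketch (phase names follow the diagram; the string payloads are pure stand-ins for real ASTs and trees) shows how each phase's output feeds the next and how an error anywhere aborts the whole compilation:

```rust
// Each phase is fallible; `compile` threads results through with `?`.
fn parse(src: &str) -> Result<String, String> {
    if src.trim().is_empty() {
        Err("empty source".to_string())
    } else {
        Ok(format!("ast({})", src.trim()))
    }
}

fn bind(ast: String) -> Result<String, String> {
    Ok(format!("bound({})", ast))
}

fn typecheck(bound: String) -> Result<String, String> {
    Ok(format!("typed({})", bound))
}

fn build_tree(typed: String) -> Result<String, String> {
    Ok(format!("tree({})", typed))
}

/// Parse -> bind -> typecheck -> build tree, stopping at the first failure.
fn compile(src: &str) -> Result<String, String> {
    let ast = parse(src)?;
    let bound = bind(ast)?;
    let typed = typecheck(bound)?;
    build_tree(typed)
}
```

The nesting of the output string mirrors the diagram: each box wraps the previous phase's result.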

Compiler Settings

Configuration controls optimization and network behavior[3]:

const CompilerSettings = struct {
    /// Network prefix for address decoding (mainnet=0, testnet=16)
    network_prefix: u8,
    /// Whether to lower MethodCall to direct nodes
    lower_method_calls: bool,
    /// Builder for creating ErgoTree nodes
    builder: *const SigmaBuilder,

    pub fn mainnet() CompilerSettings {
        return .{
            .network_prefix = 0x00,
            .lower_method_calls = true,
            .builder = &TransformingSigmaBuilder,
        };
    }

    pub fn testnet() CompilerSettings {
        return .{
            .network_prefix = 0x10,
            .lower_method_calls = true,
            .builder = &TransformingSigmaBuilder,
        };
    }
};

SigmaCompiler Implementation

The compiler orchestrates all phases[4][5]:

const SigmaCompiler = struct {
    settings: CompilerSettings,
    allocator: Allocator,

    pub fn init(settings: CompilerSettings, allocator: Allocator) SigmaCompiler {
        return .{
            .settings = settings,
            .allocator = allocator,
        };
    }

    /// Phase 1: Parse source to AST
    pub fn parse(self: *const SigmaCompiler, source: []const u8) !*Expr {
        var parser = Parser.init(source, self.allocator);
        return parser.parseExpr() catch |err| {
            return error.ParserError;
        };
    }

    /// Phases 2-3: Bind and typecheck
    pub fn typecheck(
        self: *const SigmaCompiler,
        env: *const ScriptEnv,
        parsed: *const Expr,
    ) !*TypedExpr {
        // Phase 2: Bind names
        const predef_registry = PredefinedFuncRegistry.init(self.settings.builder);
        var binder = Binder.init(env, self.settings.builder, self.settings.network_prefix, &predef_registry);
        const bound = try binder.bind(parsed);

        // Phase 3: Type inference and checking
        const type_env = env.collectTypes();
        var typer = Typer.init(
            self.settings.builder,
            &predef_registry,
            type_env,
            self.settings.lower_method_calls,
        );
        return typer.typecheck(bound);
    }

    /// Full compilation: all phases
    pub fn compile(
        self: *const SigmaCompiler,
        env: *const ScriptEnv,
        source: []const u8,
        ir_ctx: *IRContext,
    ) !CompilerResult {
        const parsed = try self.parse(source);
        const typed = try self.typecheck(env, parsed);
        return self.compileTyped(env, typed, ir_ctx, source);
    }

    /// Phases 4-5: Graph building and tree building
    fn compileTyped(
        self: *const SigmaCompiler,
        env: *const ScriptEnv,
        typed: *const TypedExpr,
        ir_ctx: *IRContext,
        source: []const u8,
    ) !CompilerResult {
        // Create placeholder constants for type parameters
        var placeholders_env = env.clone();
        var idx: u32 = 0;
        var iter = env.typeParams();
        while (iter.next()) |entry| {
            const placeholder = ConstantPlaceholder{
                .index = idx,
                .tpe = entry.value,
            };
            try placeholders_env.put(entry.key, .{ .placeholder = placeholder });
            idx += 1;
        }

        // Phase 4: Build graph (CSE)
        var graph_builder = GraphBuilder.init(ir_ctx, &placeholders_env);
        const compiled_graph = try graph_builder.buildGraph(typed);

        // Phase 5: Build tree (ValDef minimization)
        var tree_builder = TreeBuilder.init(ir_ctx, self.allocator);
        const compiled_tree = try tree_builder.buildTree(compiled_graph);

        return CompilerResult{
            .env = env,
            .source = source,
            .compiled_graph = compiled_graph,
            .ergo_tree = compiled_tree,
        };
    }
};

/// Result of compilation
const CompilerResult = struct {
    env: *const ScriptEnv,
    source: []const u8,
    compiled_graph: Sym,
    ergo_tree: *Expr,
};

Rust Compiler Pipeline

The Rust implementation uses a direct pipeline without graph IR[6][7]:

/// Rust-style direct compilation pipeline
const RustCompiler = struct {
    allocator: Allocator,

    /// Compile source to ErgoTree expression
    pub fn compileExpr(
        self: *const RustCompiler,
        source: []const u8,
        env: ScriptEnv,
    ) !*MirExpr {
        // Parse to CST, then lower to HIR
        const hir = try self.compileHir(source);

        // Bind names in HIR
        var binder = Binder.init(env);
        const bound = try binder.bind(hir);

        // Assign types
        const typed = try assignType(bound);

        // Lower to MIR (ErgoTree IR)
        const mir = try lowerToMir(typed);

        // Type check MIR
        return try typeCheck(mir);
    }

    /// Compile to full ErgoTree
    pub fn compile(
        self: *const RustCompiler,
        source: []const u8,
        env: ScriptEnv,
    ) !ErgoTree {
        const expr = try self.compileExpr(source, env);
        return ErgoTree.fromExpr(expr);
    }

    fn compileHir(self: *const RustCompiler, source: []const u8) !*HirExpr {
        var parser = Parser.init(source);
        const parse_result = parser.parse();

        if (parse_result.errors.len > 0) {
            return error.ParseError;
        }

        const syntax = parse_result.syntax();
        const root = AstRoot.cast(syntax) orelse return error.InvalidRoot;
        return hirLower(root);
    }
};

Method Call Lowering

Lowering transforms a generic MethodCall into a compact direct node[8][9]:

Method Call Lowering
─────────────────────────────────────────────────────

Before lowering (MethodCall - 3+ bytes):
  MethodCall(xs, CollMethods.MapMethod, [f], {})

After lowering (MapCollection - 1 byte):
  MapCollection(xs, f)

Size savings: 2+ bytes per operation

/// Method call lowering during typing
const MethodCallLowerer = struct {
    builder: *const SigmaBuilder,
    lower_enabled: bool,

    /// Try to lower MethodCall to direct node
    pub fn tryLower(
        self: *const MethodCallLowerer,
        obj: *const Expr,
        method: *const SMethod,
        args: []const *const Expr,
        subst: TypeSubst,
    ) ?*Expr {
        if (!self.lower_enabled) return null;

        // Check if method has IR builder
        const ir_builder = method.ir_info.ir_builder orelse return null;

        // Try to apply the builder
        return ir_builder.build(self.builder, obj, method, args, subst);
    }

    /// Unlower: convert direct nodes back to MethodCall (for display)
    pub fn unlower(self: *const MethodCallLowerer, expr: *const Expr) *Expr {
        return switch (expr.kind) {
            .multiply_group => |mg| self.builder.makeMethodCall(
                mg.left,
                &SGroupElementMethods.multiply_method,
                &[_]*const Expr{mg.right},
            ),
            .exponentiate => |exp| self.builder.makeMethodCall(
                exp.base,
                &SGroupElementMethods.exponentiate_method,
                &[_]*const Expr{exp.exponent},
            ),
            .map_collection => |mc| self.builder.makeMethodCall(
                mc.input,
                &SCollectionMethods.map_method.withConcreteTypes(.{
                    .tIV = mc.input.tpe.elemType(),
                    .tOV = mc.mapper.tpe.resultType(),
                }),
                &[_]*const Expr{mc.mapper},
            ),
            .fold => |f| self.builder.makeMethodCall(
                f.input,
                &SCollectionMethods.fold_method.withConcreteTypes(.{
                    .tIV = f.input.tpe.elemType(),
                    .tOV = f.zero.tpe,
                }),
                &[_]*const Expr{ f.zero, f.folder },
            ),
            .for_all => |fa| self.builder.makeMethodCall(
                fa.input,
                &SCollectionMethods.forall_method.withConcreteTypes(.{
                    .tIV = fa.input.tpe.elemType(),
                }),
                &[_]*const Expr{fa.predicate},
            ),
            .exists => |ex| self.builder.makeMethodCall(
                ex.input,
                &SCollectionMethods.exists_method.withConcreteTypes(.{
                    .tIV = ex.input.tpe.elemType(),
                }),
                &[_]*const Expr{ex.predicate},
            ),
            else => expr,
        };
    }
};

Type Inference

Type assignment propagates and unifies types[10][11]:

const Typer = struct {
    builder: *const SigmaBuilder,
    predef_registry: *const PredefinedFuncRegistry,
    type_env: std.StringHashMap(SType),
    lower_method_calls: bool,

    /// Assign types to bound expression
    pub fn typecheck(self: *Typer, bound: *const Expr) !*TypedExpr {
        return self.assignType(self.type_env, bound);
    }

    fn assignType(self: *Typer, env: std.StringHashMap(SType), expr: *const Expr) !*TypedExpr {
        return switch (expr.kind) {
            .block => |b| self.typecheckBlock(env, b),
            .tuple => |t| self.typecheckTuple(env, t),
            .ident => |id| self.typecheckIdent(env, id),
            .select => |s| self.typecheckSelect(env, s),
            .apply => |a| self.typecheckApply(env, a),
            .lambda => |l| self.typecheckLambda(env, l),
            .if_expr => |i| self.typecheckIf(env, i),
            .constant => |c| self.makeTyped(c, c.tpe),
            else => error.UnsupportedExpr,
        };
    }

    fn typecheckBlock(self: *Typer, env: std.StringHashMap(SType), block: *const Block) !*TypedExpr {
        var cur_env = try env.clone();

        for (block.items) |val_def| {
            if (cur_env.contains(val_def.name)) {
                return error.DuplicateVariable;
            }
            const rhs_typed = try self.assignType(cur_env, val_def.rhs);
            try cur_env.put(val_def.name, rhs_typed.tpe);
        }

        const result_typed = try self.assignType(cur_env, block.result);
        return self.builder.makeBlock(block.items, result_typed);
    }

    fn typecheckSelect(self: *Typer, env: std.StringHashMap(SType), sel: *const Select) !*TypedExpr {
        const obj_typed = try self.assignType(env, sel.obj);

        const method = MethodsContainer.getMethod(obj_typed.tpe, sel.field) orelse
            return error.MethodNotFound;

        // Unify method receiver type with object type
        const subst = unifyTypes(method.stype.domain[0], obj_typed.tpe) orelse
            return error.TypeMismatch;

        const result_type = applySubst(method.stype.range, subst);

        // Try to lower if it's a property access (no args)
        if (self.lower_method_calls) {
            if (method.ir_info.ir_builder) |ir_builder| {
                if (ir_builder.buildProperty(self.builder, obj_typed, method)) |lowered| {
                    return lowered;
                }
            }
        }

        return self.builder.makeSelect(obj_typed, sel.field, result_type);
    }
};

Error Handling

Each phase produces specific errors[12]:

const CompileError = union(enum) {
    parse_error: ParseError,
    hir_lowering_error: HirLoweringError,
    binder_error: BinderError,
    type_error: TypeInferenceError,
    mir_lowering_error: MirLoweringError,
    type_check_error: TypeCheckError,
    ergo_tree_error: ErgoTreeError,

    pub fn prettyDesc(self: CompileError, source: []const u8) []const u8 {
        return switch (self) {
            .parse_error => |e| e.prettyDesc(source),
            .hir_lowering_error => |e| e.prettyDesc(source),
            .binder_error => |e| e.prettyDesc(source),
            .type_error => |e| e.prettyDesc(source),
            .mir_lowering_error => |e| e.prettyDesc(source),
            .type_check_error => |e| e.prettyDesc(),
            .ergo_tree_error => |e| std.fmt.allocPrint(
                allocator,
                "ErgoTree error: {any}",
                .{e},
            ) catch "format error",
        };
    }
};

/// Parse error with source location
const ParseError = struct {
    message: []const u8,
    span: TextRange,
    expected: []const TokenKind,
    found: ?TokenKind,

    pub fn prettyDesc(self: ParseError, source: []const u8) []const u8 {
        const line_info = getLineInfo(source, self.span.start);
        return std.fmt.allocPrint(allocator,
            "error: {s}\nline: {d}\n{s}\n{s}",
            .{
                self.message,
                line_info.line_num,
                line_info.line_text,
                makeUnderline(line_info, self.span),
            },
        ) catch "format error";
    }
};

Predefined Functions Registry

Built-in functions are registered for name resolution[13]:

const PredefinedFuncRegistry = struct {
    funcs: std.StringHashMap(PredefinedFunc),
    builder: *const SigmaBuilder,

    pub fn init(builder: *const SigmaBuilder) PredefinedFuncRegistry {
        var self = PredefinedFuncRegistry{
            .funcs = std.StringHashMap(PredefinedFunc).init(allocator),
            .builder = builder,
        };
        self.registerAll();
        return self;
    }

    fn registerAll(self: *PredefinedFuncRegistry) void {
        // Boolean operations
        self.register("allOf", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
            .ir_builder = AllOfIrBuilder,
        });
        self.register("anyOf", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
            .ir_builder = AnyOfIrBuilder,
        });

        // Sigma operations
        self.register("sigmaProp", .{
            .tpe = SFunc.init(&[_]SType{.boolean}, .sigma_prop),
            .ir_builder = SigmaPropIrBuilder,
        });
        self.register("atLeast", .{
            .tpe = SFunc.init(&[_]SType{ .int, SType.collOf(.sigma_prop) }, .sigma_prop),
            .ir_builder = AtLeastIrBuilder,
        });
        self.register("allZK", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
            .ir_builder = AllZKIrBuilder,
        });
        self.register("anyZK", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
            .ir_builder = AnyZKIrBuilder,
        });

        // Cryptographic
        self.register("proveDlog", .{
            .tpe = SFunc.init(&[_]SType{.group_element}, .sigma_prop),
            .ir_builder = ProveDlogIrBuilder,
        });
        self.register("proveDHTuple", .{
            .tpe = SFunc.init(&[_]SType{
                .group_element,
                .group_element,
                .group_element,
                .group_element,
            }, .sigma_prop),
            .ir_builder = ProveDHTupleIrBuilder,
        });

        // Hash functions
        self.register("blake2b256", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
            .ir_builder = Blake2b256IrBuilder,
        });
        self.register("sha256", .{
            .tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
            .ir_builder = Sha256IrBuilder,
        });

        // Global
        self.register("groupGenerator", .{
            .tpe = SFunc.init(&[_]SType{}, .group_element),
            .ir_builder = GroupGeneratorIrBuilder,
        });
    }

    fn register(self: *PredefinedFuncRegistry, name: []const u8, func: PredefinedFunc) void {
        self.funcs.put(name, func) catch unreachable;
    }
};

Compilation Example

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    // Setup compiler
    const settings = CompilerSettings.testnet();
    const compiler = SigmaCompiler.init(settings, allocator);
    var ir_ctx = IRContext.init(allocator);

    // Source code
    const source =
        \\{
        \\  val deadline = 100000
        \\  val pk = PK("9fRusAarL1KkrWQVsxSRVYnvWxaAT2A96cKtNn9tvPh5XUCTgGi")
        \\  sigmaProp(HEIGHT > deadline) && pk
        \\}
    ;

    // Compile
    const env = ScriptEnv.empty();
    const result = try compiler.compile(&env, source, &ir_ctx);

    // Access results
    std.debug.print("Source: {s}\n", .{result.source});
    std.debug.print("ErgoTree: {any}\n", .{result.ergo_tree});
    std.debug.print("Type: {any}\n", .{result.ergo_tree.tpe});

    // Serialize
    const ergo_tree = try ErgoTree.fromSigmaProp(result.ergo_tree);
    const bytes = try ergo_tree.toBytes(allocator);
    std.debug.print("Bytes: {x}\n", .{std.fmt.fmtSliceHexLower(bytes)});
}

Compilation Flow Detail

Detailed Phase Transitions
─────────────────────────────────────────────────────

Source: "OUTPUTS.exists({ (b: Box) => b.value > 100L })"

Phase 1 - Parse:
  Apply(
    Select(Ident("OUTPUTS"), "exists"),
    [Lambda(["b": Box], GT(Select(Ident("b"), "value"), 100L))]
  )

Phase 2 - Bind:
  Apply(
    Select(Context.OUTPUTS, ExistsMethod),
    [Lambda([b: SBox], GT(Select(ValUse(b), "value"), 100L))]
  )

Phase 3 - Typecheck:
  Exists(
    input: Outputs :: SColl[SBox],
    predicate: Lambda(
      args: [(0, SBox)],
      body: GT(
        ExtractAmount(ValUse(0, SBox)) :: SLong,
        LongConstant(100) :: SLong
      ) :: SBoolean
    ) :: SFunc[SBox, SBoolean]
  ) :: SBoolean

Phase 4 - BuildGraph (if using Scala IR):
  s1 = Context.OUTPUTS
  s2 = Lambda(args=[(0,SBox)], body=s3)
  s3 = GT(s4, s5)
  s4 = ValUse(0).value  // ExtractAmount
  s5 = 100L
  s6 = Exists(s1, s2)

Phase 5 - BuildTree:
  Exists(
    Outputs,
    FuncValue(
      [(1, SBox)],
      GT(ExtractAmount(ValUse(1, SBox)), LongConstant(100))
    )
  )

Summary

  • 5-phase pipeline: Parse → Bind → Typecheck → BuildGraph → BuildTree
  • Method lowering transforms MethodCall (3+ bytes) to direct nodes (1 byte)
  • Scala uses graph IR for CSE optimization; Rust uses direct HIR→MIR
  • Type inference propagates and unifies types through the AST
  • Predefined registry resolves built-in function names
  • Error handling provides detailed source-location diagnostics
  • Compiler is development-time only—interpreter uses serialized ErgoTree

Next: Chapter 20: Collections

[1] Scala: SigmaCompiler.scala:51-100 (SigmaCompiler class)
[2] Rust: lib.rs:16-27 (module structure)
[3] Scala: SigmaCompiler.scala:15-25 (CompilerSettings)
[4] Scala: SigmaCompiler.scala:55-95 (compile methods)
[5] Rust: compiler.rs:59-76 (compile_expr)
[6] Rust: compiler.rs:73-76 (compile)
[7] Rust: compiler.rs:78-87 (compile_hir)
[8] Scala: SigmaCompiler.scala:105-150 (unlowerMethodCalls)
[9] Scala: SigmaTyper.scala:30-45 (processGlobalMethod)
[10] Scala: SigmaTyper.scala:50-100 (assignType)
[11] Rust: type_infer.rs:25-49 (assign_type)
[12] Rust: compiler.rs:23-55 (CompileError)
[13] Scala: SigmaPredef.scala (PredefinedFuncRegistry)

Chapter 20: Collections

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

  • Chapter 2 for Coll[T] type and type parameters
  • Chapter 5 for collection operation opcodes
  • Chapter 12 for how collection operations are evaluated

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the Coll[T] interface and its core operations (map, filter, fold, etc.)
  • Implement array-backed collections with bounds checking
  • Describe the Structure-of-Arrays optimization for pair collections
  • Use CollBuilder for creating and manipulating collections
  • Understand cost implications of collection operations

Collection Architecture

Collections in ErgoScript are immutable, indexed sequences[1][2]:

Collection Architecture
─────────────────────────────────────────────────────

                    Coll[T]
                       │
         ┌─────────────┴─────────────┐
         │                           │
   CollOverArray[T]            PairColl[L,R]
   (standard array)         (structure-of-arrays)
         │                           │
         │                    ┌──────┴──────┐
    Array[T]                Coll[L]      Coll[R]
                           (left)       (right)

Coll[T] Interface

Core collection interface with specialized operations[3][4]:

/// Immutable indexed collection
const Coll = struct {
    data: CollData,
    elem_type: SType,
    builder: *CollBuilder,

    const CollData = union(enum) {
        /// Standard array-backed collection
        array: ArrayColl,
        /// Optimized pair collection
        pair: PairCollData,
    };

    /// Number of elements
    pub fn length(self: *const Coll) usize {
        return switch (self.data) {
            .array => |a| a.items.len,
            .pair => |p| @min(p.ls.length(), p.rs.length()),
        };
    }

    pub fn size(self: *const Coll) usize {
        return self.length();
    }

    pub fn isEmpty(self: *const Coll) bool {
        return self.length() == 0;
    }

    /// Element at index (0-based)
    pub fn get(self: *const Coll, i: usize) ?Value {
        if (i >= self.length()) return null;
        return switch (self.data) {
            .array => |a| a.items[i],
            .pair => |p| Value.tuple(.{ p.ls.get(i).?, p.rs.get(i).? }),
        };
    }

    /// Element at index with default
    pub fn getOrElse(self: *const Coll, i: usize, default: Value) Value {
        return self.get(i) orelse default;
    }

    /// Element access (throws on out of bounds)
    pub fn apply(self: *const Coll, i: usize) !Value {
        return self.get(i) orelse error.IndexOutOfBounds;
    }
};

Transformation Operations

Map, filter, and fold with cost tracking[5][6]:

/// Collection transformation operations
const CollTransforms = struct {

    /// Apply function to each element
    pub fn map(
        coll: *const Coll,
        mapper: *const Closure,
        E: *Evaluator,
    ) !*Coll {
        const n = coll.length();
        try E.addSeqCost(MapCost, n, OpCode.Map);

        var result = try E.allocator.alloc(Value, n);
        for (0..n) |i| {
            const elem = coll.get(i).?;
            try E.addCost(AddToEnvCost, OpCode.Map);
            var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
            result[i] = try mapper.body.eval(&fn_env, E);
        }

        return coll.builder.fromArray(result, mapper.result_type);
    }

    /// Select elements satisfying predicate
    pub fn filter(
        coll: *const Coll,
        predicate: *const Closure,
        E: *Evaluator,
    ) !*Coll {
        const n = coll.length();
        try E.addSeqCost(FilterCost, n, OpCode.Filter);

        var result = std.ArrayList(Value).init(E.allocator);
        for (0..n) |i| {
            const elem = coll.get(i).?;
            try E.addCost(AddToEnvCost, OpCode.Filter);
            var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
            const keep = try E.evalTo(bool, &fn_env, predicate.body);
            if (keep) {
                try result.append(elem);
            }
        }

        return coll.builder.fromArray(result.items, coll.elem_type);
    }

    /// Left-associative fold
    pub fn foldLeft(
        coll: *const Coll,
        zero: Value,
        folder: *const Closure,
        E: *Evaluator,
    ) !Value {
        const n = coll.length();
        try E.addSeqCost(FoldCost, n, OpCode.Fold);

        var accum = zero;
        for (0..n) |i| {
            const elem = coll.get(i).?;
            const tuple = Value.tuple(.{ accum, elem });
            try E.addCost(AddToEnvCost, OpCode.Fold);
            var fn_env = try folder.captured_env.extend(folder.args[0].id, tuple);
            accum = try folder.body.eval(&fn_env, E);
        }

        return accum;
    }

    /// Flatten nested collections
    pub fn flatMap(
        coll: *const Coll,
        mapper: *const Closure,
        E: *Evaluator,
    ) !*Coll {
        const n = coll.length();
        var result = std.ArrayList(Value).init(E.allocator);

        for (0..n) |i| {
            const elem = coll.get(i).?;
            try E.addCost(AddToEnvCost, OpCode.FlatMap);
            var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
            const inner_coll = try E.evalTo(*Coll, &fn_env, mapper.body);

            for (0..inner_coll.length()) |j| {
                try result.append(inner_coll.get(j).?);
            }
        }

        return coll.builder.fromArray(result.items, mapper.result_type);
    }
};

const MapCost = PerItemCost{
    .base = JitCost{ .value = 10 },
    .per_chunk = JitCost{ .value = 5 },
    .chunk_size = 10,
};

const FilterCost = PerItemCost{
    .base = JitCost{ .value = 20 },
    .per_chunk = JitCost{ .value = 5 },
    .chunk_size = 10,
};

const FoldCost = PerItemCost{
    .base = JitCost{ .value = 10 },
    .per_chunk = JitCost{ .value = 5 },
    .chunk_size = 10,
};

Predicate Operations

Exists and forall use short-circuit evaluation[7]. Note: short-circuit behavior means execution time varies with collection contents. This is acceptable in blockchain contexts where data is public, but it would be a timing side channel if collections contained secrets.

/// Predicate operations (short-circuit)
const CollPredicates = struct {

    /// True if any element satisfies predicate
    pub fn exists(
        coll: *const Coll,
        predicate: *const Closure,
        E: *Evaluator,
    ) !bool {
        const n = coll.length();

        for (0..n) |i| {
            const elem = coll.get(i).?;
            try E.addCost(AddToEnvCost, OpCode.Exists);
            var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
            const result = try E.evalTo(bool, &fn_env, predicate.body);

            if (result) {
                // Short-circuit: found matching element
                try E.addSeqCost(ExistsCost, i + 1, OpCode.Exists);
                return true;
            }
        }

        try E.addSeqCost(ExistsCost, n, OpCode.Exists);
        return false;
    }

    /// True if all elements satisfy predicate
    pub fn forall(
        coll: *const Coll,
        predicate: *const Closure,
        E: *Evaluator,
    ) !bool {
        const n = coll.length();

        for (0..n) |i| {
            const elem = coll.get(i).?;
            try E.addCost(AddToEnvCost, OpCode.ForAll);
            var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
            const result = try E.evalTo(bool, &fn_env, predicate.body);

            if (!result) {
                // Short-circuit: found non-matching element
                try E.addSeqCost(ForAllCost, i + 1, OpCode.ForAll);
                return false;
            }
        }

        try E.addSeqCost(ForAllCost, n, OpCode.ForAll);
        return true;
    }

    /// Find first element satisfying predicate
    pub fn find(
        coll: *const Coll,
        predicate: *const Closure,
        E: *Evaluator,
    ) !?Value {
        const n = coll.length();

        for (0..n) |i| {
            const elem = coll.get(i).?;
            var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
            const result = try E.evalTo(bool, &fn_env, predicate.body);

            if (result) {
                return elem;
            }
        }

        return null;
    }

    /// Index of first element satisfying predicate
    pub fn indexWhere(
        coll: *const Coll,
        predicate: *const Closure,
        from: usize,
        E: *Evaluator,
    ) !i32 {
        const n = coll.length();
        const start = from; // `from` is unsigned here, unlike Scala's Int, so no clamping is needed

        for (start..n) |i| {
            const elem = coll.get(i).?;
            var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
            const result = try E.evalTo(bool, &fn_env, predicate.body);

            if (result) {
                return @intCast(i);
            }
        }

        return -1;  // Not found
    }
};

Slicing Operations

Slice, take, and append[8]:

/// Slicing operations
const CollSlicing = struct {

    /// First n elements
    pub fn take(coll: *const Coll, n: usize, E: *Evaluator) !*Coll {
        if (n == 0) return coll.builder.emptyColl(coll.elem_type);
        if (n >= coll.length()) return coll;

        try E.addSeqCost(SliceCost, n, OpCode.Slice);
        return coll.builder.fromSlice(coll, 0, n);
    }

    /// Elements from index `from` until `until`
    pub fn slice(
        coll: *const Coll,
        from: usize,
        until: usize,
        E: *Evaluator,
    ) !*Coll {
        const actual_from = @min(from, coll.length());
        const actual_until = @min(until, coll.length());
        const len = if (actual_until > actual_from) actual_until - actual_from else 0;

        try E.addSeqCost(SliceCost, len, OpCode.Slice);
        return coll.builder.fromSlice(coll, actual_from, actual_until);
    }

    /// Concatenate collections
    pub fn append(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
        if (coll.length() == 0) return other;
        if (other.length() == 0) return coll;

        const total = coll.length() + other.length();
        try E.addSeqCost(AppendCost, total, OpCode.Append);

        var result = try E.allocator.alloc(Value, total);
        for (0..coll.length()) |i| {
            result[i] = coll.get(i).?;
        }
        for (0..other.length()) |i| {
            result[coll.length() + i] = other.get(i).?;
        }

        return coll.builder.fromArray(result, coll.elem_type);
    }

    /// Replace slice with patch
    pub fn patch(
        coll: *const Coll,
        from: usize,
        replacement: *const Coll,
        replaced: usize,
        E: *Evaluator,
    ) !*Coll {
        const before = try CollSlicing.slice(coll, 0, from, E);
        const after = try CollSlicing.slice(coll, from + replaced, coll.length(), E);
        const temp = try CollSlicing.append(before, replacement, E);
        return CollSlicing.append(temp, after, E);
    }

    /// Replace single element
    pub fn updated(
        coll: *const Coll,
        index: usize,
        elem: Value,
        E: *Evaluator,
    ) !*Coll {
        if (index >= coll.length()) return error.IndexOutOfBounds;

        var result = try E.allocator.alloc(Value, coll.length());
        for (0..coll.length()) |i| {
            result[i] = if (i == index) elem else coll.get(i).?;
        }

        return coll.builder.fromArray(result, coll.elem_type);
    }
};

Structure-of-Arrays: PairColl

Optimized representation for collections of pairs[9][10]:

Structure-of-Arrays vs Array-of-Structures
─────────────────────────────────────────────────────

Array-of-Structures (standard):
┌────────────────────────────────────────────────────┐
│ [(L0,R0), (L1,R1), (L2,R2), (L3,R3), (L4,R4)]      │
│                                                    │
│ Memory: L0 R0 L1 R1 L2 R2 L3 R3 L4 R4              │
│ Issue: Cache unfriendly when accessing only Ls     │
└────────────────────────────────────────────────────┘

Structure-of-Arrays (PairColl):
┌────────────────────────────────────────────────────┐
│ ls: [L0, L1, L2, L3, L4]                           │
│ rs: [R0, R1, R2, R3, R4]                           │
│                                                    │
│ Memory: L0 L1 L2 L3 L4 | R0 R1 R2 R3 R4            │
│ Benefit: Cache friendly, O(1) unzip                │
└────────────────────────────────────────────────────┘

/// Optimized pair collection (Structure-of-Arrays)
const PairColl = struct {
    ls: *Coll,  // Left components
    rs: *Coll,  // Right components
    builder: *CollBuilder,

    pub fn length(self: *const PairColl) usize {
        return @min(self.ls.length(), self.rs.length());
    }

    /// Element at index returns tuple
    pub fn get(self: *const PairColl, i: usize) ?Value {
        const l = self.ls.get(i) orelse return null;
        const r = self.rs.get(i) orelse return null;
        return Value.tuple(.{ l, r });
    }

    /// O(1) unzip - just return components
    pub fn unzip(self: *const PairColl) struct { *Coll, *Coll } {
        return .{ self.ls, self.rs };
    }

    /// Map only left components
    pub fn mapFirst(
        self: *const PairColl,
        mapper: *const Closure,
        E: *Evaluator,
    ) !*PairColl {
        const mapped_ls = try CollTransforms.map(self.ls, mapper, E);
        return self.builder.pairColl(mapped_ls, self.rs);
    }

    /// Map only right components
    pub fn mapSecond(
        self: *const PairColl,
        mapper: *const Closure,
        E: *Evaluator,
    ) !*PairColl {
        const mapped_rs = try CollTransforms.map(self.rs, mapper, E);
        return self.builder.pairColl(self.ls, mapped_rs);
    }

    /// Slice maintains structure-of-arrays
    pub fn slice(
        self: *const PairColl,
        from: usize,
        until: usize,
        E: *Evaluator,
    ) !*PairColl {
        const sliced_ls = try CollSlicing.slice(self.ls, from, until, E);
        const sliced_rs = try CollSlicing.slice(self.rs, from, until, E);
        return self.builder.pairColl(sliced_ls, sliced_rs);
    }

    /// Append maintains structure
    pub fn append(
        self: *const PairColl,
        other: *const PairColl,
        E: *Evaluator,
    ) !*PairColl {
        const combined_ls = try CollSlicing.append(self.ls, other.ls, E);
        const combined_rs = try CollSlicing.append(self.rs, other.rs, E);
        return self.builder.pairColl(combined_ls, combined_rs);
    }
};

CollBuilder

Factory for creating collections[11][12]:

/// Factory for creating collections
const CollBuilder = struct {
    allocator: Allocator,

    /// Create pair collection from two collections
    pub fn pairColl(
        self: *CollBuilder,
        ls: *Coll,
        rs: *Coll,
    ) *PairColl {
        // Handle length mismatch by using minimum
        const result = self.allocator.create(PairColl) catch unreachable;
        result.* = .{
            .ls = ls,
            .rs = rs,
            .builder = self,
        };
        return result;
    }

    /// Create collection from array
    pub fn fromArray(
        self: *CollBuilder,
        items: []const Value,
        elem_type: SType,
    ) *Coll {
        // Enforce size limit
        if (items.len > MAX_ARRAY_LENGTH) {
            @panic("Collection size exceeds maximum");
        }

        const result = self.allocator.create(Coll) catch unreachable;

        // Special handling for pairs → PairColl
        if (elem_type == .s_tuple and elem_type.s_tuple.items.len == 2) {
            const ls = self.allocator.alloc(Value, items.len) catch unreachable;
            const rs = self.allocator.alloc(Value, items.len) catch unreachable;
            for (items, 0..) |item, i| {
                ls[i] = item.tuple[0];
                rs[i] = item.tuple[1];
            }
            result.* = .{
                .data = .{ .pair = .{
                    .ls = self.fromArray(ls, elem_type.s_tuple.items[0]),
                    .rs = self.fromArray(rs, elem_type.s_tuple.items[1]),
                } },
                .elem_type = elem_type,
                .builder = self,
            };
        } else {
            result.* = .{
                .data = .{ .array = .{ .items = items } },
                .elem_type = elem_type,
                .builder = self,
            };
        }
        return result;
    }

    /// Create collection of n copies of value
    pub fn replicate(
        self: *CollBuilder,
        n: usize,
        value: Value,
        elem_type: SType,
    ) *Coll {
        var items = self.allocator.alloc(Value, n) catch unreachable;
        for (items) |*item| {
            item.* = value;
        }
        return self.fromArray(items, elem_type);
    }

    /// Create empty collection
    pub fn emptyColl(self: *CollBuilder, elem_type: SType) *Coll {
        return self.fromArray(&[_]Value{}, elem_type);
    }

    /// Split pair collection into two collections
    pub fn unzip(self: *CollBuilder, coll: *const Coll) struct { *Coll, *Coll } {
        switch (coll.data) {
            .pair => |p| {
                // O(1) for PairColl
                return .{ p.ls, p.rs };
            },
            .array => |a| {
                // O(n) for regular collection - must materialize
                const n = a.items.len;
                var ls = self.allocator.alloc(Value, n) catch unreachable;
                var rs = self.allocator.alloc(Value, n) catch unreachable;
                for (a.items, 0..) |item, i| {
                    ls[i] = item.tuple[0];
                    rs[i] = item.tuple[1];
                }
                const elem_type = coll.elem_type.s_tuple;
                return .{
                    self.fromArray(ls, elem_type.items[0]),
                    self.fromArray(rs, elem_type.items[1]),
                };
            },
        }
    }

    /// Element-wise XOR of byte arrays
    pub fn xor(self: *CollBuilder, left: *const Coll, right: *const Coll) *Coll {
        const n = @min(left.length(), right.length());
        const result = self.allocator.alloc(Value, n) catch unreachable;
        for (0..n) |i| {
            const l = left.get(i).?.byte;
            const r = right.get(i).?.byte;
            result[i] = Value{ .byte = l ^ r };
        }
        return self.fromArray(result, .byte);
    }
};

/// Maximum collection size (DoS protection)
const MAX_ARRAY_LENGTH: usize = 100_000;
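
The truncation rule in xor above (the result runs over the shorter of the two inputs) is easy to check independently. A minimal sketch in Python rather than the book's Zig, purely so it is trivially runnable; `coll_xor` is an illustrative helper, not an SDK function:

```python
def coll_xor(left: bytes, right: bytes) -> bytes:
    """Element-wise XOR, truncated to the shorter input (the @min above)."""
    n = min(len(left), len(right))
    return bytes(a ^ b for a, b in zip(left[:n], right[:n]))

# A 3-byte input XORed with a 2-byte input yields 2 bytes
print(coll_xor(b"\x0f\xf0\xaa", b"\xff\xff").hex())  # -> f00f
```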

Rust Collection Representation

The Rust implementation uses a different approach [13][14]:

/// Rust-style Collection enum
const RustCollection = union(enum) {
    /// Special representation for boolean constants (bit-packed)
    bool_constants: []const bool,
    /// Collection of expressions
    exprs: struct {
        elem_type: SType,
        items: []const *Expr,
    },

    pub fn tpe(self: RustCollection) SType {
        return switch (self) {
            .bool_constants => SType.collOf(.boolean),
            .exprs => |e| SType.collOf(e.elem_type),
        };
    }

    pub fn opCode(self: RustCollection) OpCode {
        return switch (self) {
            .bool_constants => OpCode.CollOfBoolConst,
            .exprs => OpCode.Coll,
        };
    }
};

/// Rust collection serialization
fn serializeCollection(coll: RustCollection, writer: anytype) !void {
    switch (coll) {
        .bool_constants => |bools| {
            try writer.writeInt(u16, @intCast(bools.len), .big);
            try writeBits(writer, bools);  // Bit-packed
        },
        .exprs => |e| {
            try writer.writeInt(u16, @intCast(e.items.len), .big);
            try serializeSType(e.elem_type, writer);
            for (e.items) |item| {
                try serializeExpr(item, writer);
            }
        },
    }
}

Cost Model

Collection Operation Costs
─────────────────────────────────────────────────────

Operation       │ Cost Type    │ Formula
────────────────┼──────────────┼──────────────────────
length          │ Fixed        │ 10
apply(i)        │ Fixed        │ 10
get(i)          │ Fixed        │ 10
map(f)          │ PerItem      │ 10 + ⌈n/10⌉ × 5
filter(p)       │ PerItem      │ 20 + ⌈n/10⌉ × 5
fold(z, op)     │ PerItem      │ 10 + ⌈n/10⌉ × 5
exists(p)       │ PerItem      │ 10 + ⌈k/10⌉ × 5  (k=items checked)
forall(p)       │ PerItem      │ 10 + ⌈k/10⌉ × 5  (k=items checked)
slice(from,to)  │ PerItem      │ 10 + ⌈len/10⌉ × 2
append(other)   │ PerItem      │ 20 + ⌈(n+m)/10⌉ × 2
zip(other)      │ Fixed        │ 10 (structural)
unzip           │ Fixed        │ 10 (PairColl), PerItem (array)
flatMap(f)      │ Dynamic      │ depends on result sizes
─────────────────────────────────────────────────────

Where: n = collection size, k = items processed before short-circuit
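
To make the formulas concrete, here is a quick arithmetic check, in Python rather than the book's Zig for standalone runnability; `per_item_cost` is a hypothetical helper mirroring the table, not an SDK API:

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, n: int) -> int:
    """Cost = base + ceil(n / chunk_size) * per_chunk, per the table above."""
    return base + math.ceil(n / chunk_size) * per_chunk

# map(f) over 100 elements: 10 + ceil(100/10) * 5 = 60
print(per_item_cost(10, 5, 10, 100))  # -> 60

# exists(p) short-circuiting after 3 items: 10 + ceil(3/10) * 5 = 15
print(per_item_cost(10, 5, 10, 3))    # -> 15
```

Note how short-circuiting keeps exists/forall cheap: only the k items actually examined are charged, not the full collection.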

Set Operations

Distinct, union, intersection [15]:

/// Set-like operations on collections
const CollSetOps = struct {

    /// Remove duplicates, preserving first occurrences
    pub fn distinct(coll: *const Coll, E: *Evaluator) !*Coll {
        var seen = std.AutoHashMap(Value, void).init(E.allocator);
        var result = std.ArrayList(Value).init(E.allocator);

        for (0..coll.length()) |i| {
            const elem = coll.get(i).?;
            if (!seen.contains(elem)) {
                try seen.put(elem, {});
                try result.append(elem);
            }
        }

        return coll.builder.fromArray(result.items, coll.elem_type);
    }

    /// Union preserving order (set semantics)
    pub fn unionSet(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
        var seen = std.AutoHashMap(Value, void).init(E.allocator);
        var result = std.ArrayList(Value).init(E.allocator);

        // Add all from first collection
        for (0..coll.length()) |i| {
            const elem = coll.get(i).?;
            if (!seen.contains(elem)) {
                try seen.put(elem, {});
                try result.append(elem);
            }
        }

        // Add unseen from second collection
        for (0..other.length()) |i| {
            const elem = other.get(i).?;
            if (!seen.contains(elem)) {
                try seen.put(elem, {});
                try result.append(elem);
            }
        }

        return coll.builder.fromArray(result.items, coll.elem_type);
    }

    /// Multiset intersection
    pub fn intersect(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
        // Count occurrences in other
        var counts = std.AutoHashMap(Value, usize).init(E.allocator);
        for (0..other.length()) |i| {
            const elem = other.get(i).?;
            const entry = try counts.getOrPut(elem);
            if (!entry.found_existing) {
                entry.value_ptr.* = 0;
            }
            entry.value_ptr.* += 1;
        }

        // Collect elements that exist in other
        var result = std.ArrayList(Value).init(E.allocator);
        for (0..coll.length()) |i| {
            const elem = coll.get(i).?;
            // getPtr (not get): mutate the stored count so each match is consumed
            if (counts.getPtr(elem)) |count| {
                if (count.* > 0) {
                    try result.append(elem);
                    count.* -= 1;
                }
            }
        }

        return coll.builder.fromArray(result.items, coll.elem_type);
    }
};
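
The semantics are easiest to see on small inputs. A hedged Python mirror of distinct and the multiset-style intersect (illustrative only; the authoritative behavior is in the Scala set-operation sources cited below):

```python
from collections import Counter

def distinct(xs):
    """Keep first occurrences only, preserving order."""
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def intersect(xs, ys):
    """Multiset intersection: each element of ys can be matched at most once."""
    counts = Counter(ys)
    out = []
    for x in xs:
        if counts[x] > 0:
            out.append(x)
            counts[x] -= 1
    return out

print(distinct([1, 2, 1, 3, 2]))           # -> [1, 2, 3]
print(intersect([1, 1, 2, 3], [1, 2, 2]))  # -> [1, 2]
```

The multiset behavior is why `intersect([1, 1, 2, 3], [1, 2, 2])` keeps only one 1: the single 1 in the second collection is consumed by the first match.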

Summary

  • Coll[T] is immutable, indexed, deterministic
  • CollOverArray wraps arrays with specialized primitive support
  • PairColl uses Structure-of-Arrays for O(1) unzip
  • CollBuilder creates collections with automatic pair optimization
  • Short-circuit evaluation for exists/forall reduces costs
  • Size limit (100K elements) prevents DoS attacks
  • All operations have defined costs for gas calculation

Next: Chapter 21: AVL+ Trees

[1] Scala: Colls.scala:12-50 (Coll trait)
[2] Rust: collection.rs:21-32 (Collection enum)
[3] Scala: Colls.scala:50-100 (core operations)
[4] Rust: coll_by_index.rs (ByIndex)
[5] Scala: CollsOverArrays.scala:30-50 (map, filter)
[6] Rust: coll_map.rs:17-62 (Map struct)
[8] Scala: CollsOverArrays.scala:50-80 (slice, append)
[9] Scala: Colls.scala:150-180 (PairColl trait)
[10] Scala: CollsOverArrays.scala:200-280 (PairOfCols)
[11] Scala: Colls.scala:180-220 (CollBuilder trait)
[12] Scala: CollsOverArrays.scala:300-400 (CollOverArrayBuilder)
[13] Rust: collection.rs:34-56 (Collection::new)
[14] Rust: collection.rs:100-136 (serialization)
[15] Scala: CollsOverArrays.scala:100-150 (set operations)

Chapter 21: AVL+ Trees

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 10 for BLAKE2b256 hashing used in node digests
  • Chapter 20 for collection operations that AVL trees extend
  • Familiarity with binary search tree concepts and balancing

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the prover-verifier architecture for authenticated dictionaries
  • Implement the AvlTreeData and ADDigest structures storing 33-byte commitments
  • Use operation flags to control insert/update/remove permissions
  • Apply proof-based verification for tree operations (contains, get, insert, update, remove)
  • Calculate operation costs based on proof length and tree height

Authenticated Dictionary Model

AVL+ trees provide authenticated key-value storage [1][2]:

Prover-Verifier Architecture
─────────────────────────────────────────────────────

OFF-CHAIN (Prover - holds full tree):
┌─────────────────────────────────────────────────────┐
│                 BatchAVLProver                      │
│  ┌─────────────────────────────────────────────────┐│
│  │           Complete Tree Structure               ││
│  │                    [H]                          ││
│  │                   /   \                         ││
│  │                 [D]   [L]                       ││
│  │                /   \ /   \                      ││
│  │              [B] [F][J] [N]                     ││
│  └─────────────────────────────────────────────────┘│
│  • Performs operations                              │
│  • Generates proofs                                 │
│  • Maintains full state                             │
└─────────────────────────│───────────────────────────┘
                          │ proof bytes
                          ▼
ON-CHAIN (Verifier - holds only digest):
┌─────────────────────────────────────────────────────┐
│               CAvlTreeVerifier                      │
│  ┌─────────────────────────────────────────────────┐│
│  │  Digest: [32-byte root hash][height byte]       ││
│  │          ═══════════════════════════════        ││
│  │          (33 bytes total)                       ││
│  └─────────────────────────────────────────────────┘│
│  • Verifies proof bytes                             │
│  • Returns operation results                        │
│  • Rejects invalid proofs                           │
└─────────────────────────────────────────────────────┘

AvlTreeData Structure

Core data type for authenticated trees [3][4]:

/// Authenticated tree data (stored on-chain)
const AvlTreeData = struct {
    /// Root hash + height (33 bytes total)
    digest: ADDigest,
    /// Permitted operations
    tree_flags: AvlTreeFlags,
    /// Fixed key length (all keys same size)
    /// Note: In Ergo, this is always 32 bytes (Blake2b256 hash)
    key_length: u32,
    /// Optional fixed value length
    value_length_opt: ?u32,

    pub const DIGEST_SIZE: usize = 33; // 32-byte hash + 1-byte height

    pub fn fromDigest(digest: []const u8) AvlTreeData {
        return .{
            .digest = ADDigest.fromSlice(digest),
            .tree_flags = AvlTreeFlags.allOperationsAllowed(),
            .key_length = 32, // Ergo: always 32 bytes (Blake2b256 hash)
            .value_length_opt = null,
        };
    }
};

/// 33-byte authenticated digest
const ADDigest = struct {
    /// 32-byte BLAKE2b256 root hash
    root_hash: [32]u8,
    /// Tree height (0-255)
    height: u8,

    pub fn fromSlice(bytes: []const u8) ADDigest {
        var result: ADDigest = undefined;
        @memcpy(&result.root_hash, bytes[0..32]);
        result.height = bytes[32];
        return result;
    }

    pub fn toBytes(self: ADDigest) [33]u8 {
        var result: [33]u8 = undefined;
        @memcpy(result[0..32], &self.root_hash);
        result[32] = self.height;
        return result;
    }
};
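
The 33-byte layout is simple enough to verify by hand. A Python sketch of the split/join round trip (illustrative, not an SDK API):

```python
HASH_SIZE = 32
DIGEST_SIZE = 33  # 32-byte root hash + 1-byte height

def digest_to_bytes(root_hash: bytes, height: int) -> bytes:
    """Concatenate root hash and height byte, as ADDigest.toBytes does."""
    assert len(root_hash) == HASH_SIZE and 0 <= height <= 255
    return root_hash + bytes([height])

def digest_from_bytes(b: bytes):
    """Split a 33-byte digest back into (root_hash, height)."""
    assert len(b) == DIGEST_SIZE
    return b[:HASH_SIZE], b[HASH_SIZE]

d = digest_to_bytes(b"\xab" * 32, 7)
print(len(d))                   # -> 33
print(digest_from_bytes(d)[1])  # -> 7
```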

Operation Flags

Control which modifications are permitted [5][6]:

/// Operation permission flags (bit-packed)
const AvlTreeFlags = struct {
    flags: u8,

    /// Bit positions
    const INSERT_BIT: u8 = 0x01;
    const UPDATE_BIT: u8 = 0x02;
    const REMOVE_BIT: u8 = 0x04;

    pub fn new(insert_allowed: bool, update_allowed: bool, remove_allowed: bool) AvlTreeFlags {
        var flags: u8 = 0;
        if (insert_allowed) flags |= INSERT_BIT;
        if (update_allowed) flags |= UPDATE_BIT;
        if (remove_allowed) flags |= REMOVE_BIT;
        return .{ .flags = flags };
    }

    pub fn parse(byte: u8) AvlTreeFlags {
        return .{ .flags = byte };
    }

    pub fn serialize(self: AvlTreeFlags) u8 {
        return self.flags;
    }

    // Predefined flag combinations
    pub fn readOnly() AvlTreeFlags {
        return .{ .flags = 0x00 };
    }

    pub fn allOperationsAllowed() AvlTreeFlags {
        return .{ .flags = 0x07 };
    }

    pub fn insertOnly() AvlTreeFlags {
        return .{ .flags = 0x01 };
    }

    pub fn removeOnly() AvlTreeFlags {
        return .{ .flags = 0x04 };
    }

    // Permission checks
    pub fn insertAllowed(self: AvlTreeFlags) bool {
        return (self.flags & INSERT_BIT) != 0;
    }

    pub fn updateAllowed(self: AvlTreeFlags) bool {
        return (self.flags & UPDATE_BIT) != 0;
    }

    pub fn removeAllowed(self: AvlTreeFlags) bool {
        return (self.flags & REMOVE_BIT) != 0;
    }
};
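
The bit arithmetic is worth checking once by hand. In Python for quick verification, with bit positions taken from the struct above:

```python
INSERT_BIT, UPDATE_BIT, REMOVE_BIT = 0x01, 0x02, 0x04

def make_flags(insert: bool, update: bool, remove: bool) -> int:
    """Pack the three permissions into one byte, as AvlTreeFlags.new does."""
    flags = 0
    if insert:
        flags |= INSERT_BIT
    if update:
        flags |= UPDATE_BIT
    if remove:
        flags |= REMOVE_BIT
    return flags

print(hex(make_flags(True, True, True)))   # -> 0x7 (allOperationsAllowed)
print(hex(make_flags(True, False, True)))  # -> 0x5 (insert + remove, no update)
print(make_flags(False, False, False))     # -> 0 (readOnly)
```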

AvlTree Interface

ErgoScript interface for authenticated trees [7]:

/// AvlTree wrapper providing ErgoScript interface
const AvlTree = struct {
    data: AvlTreeData,

    /// Get 33-byte authenticated digest.
    /// Returned by value: a slice into toBytes()'s temporary would dangle.
    pub fn digest(self: *const AvlTree) [33]u8 {
        return self.data.digest.toBytes();
    }

    /// Get operation flags byte
    pub fn enabledOperations(self: *const AvlTree) u8 {
        return self.data.tree_flags.serialize();
    }

    /// Get fixed key length
    pub fn keyLength(self: *const AvlTree) i32 {
        return @intCast(self.data.key_length);
    }

    /// Get optional fixed value length
    pub fn valueLengthOpt(self: *const AvlTree) ?i32 {
        if (self.data.value_length_opt) |v| {
            return @intCast(v);
        }
        return null;
    }

    /// Permission checks
    pub fn isInsertAllowed(self: *const AvlTree) bool {
        return self.data.tree_flags.insertAllowed();
    }

    pub fn isUpdateAllowed(self: *const AvlTree) bool {
        return self.data.tree_flags.updateAllowed();
    }

    pub fn isRemoveAllowed(self: *const AvlTree) bool {
        return self.data.tree_flags.removeAllowed();
    }

    /// Create new tree with updated digest (immutable)
    pub fn updateDigest(self: *const AvlTree, new_digest: []const u8) AvlTree {
        var new_data = self.data;
        new_data.digest = ADDigest.fromSlice(new_digest);
        return .{ .data = new_data };
    }

    /// Create new tree with updated flags (immutable)
    pub fn updateOperations(self: *const AvlTree, new_flags: u8) AvlTree {
        var new_data = self.data;
        new_data.tree_flags = AvlTreeFlags.parse(new_flags);
        return .{ .data = new_data };
    }
};

Verifier Implementation

The verifier processes proofs to verify operations [8][9]:

/// AVL tree proof verifier
const AvlTreeVerifier = struct {
    /// Current state digest (None if verification failed)
    current_digest: ?ADDigest,
    /// Proof bytes to process
    proof: []const u8,
    /// Current position in proof
    proof_pos: usize,
    /// Key length
    key_length: usize,
    /// Optional value length
    value_length_opt: ?usize,

    pub fn init(tree: *const AvlTree, proof: []const u8) AvlTreeVerifier {
        return .{
            .current_digest = tree.data.digest,
            .proof = proof,
            .proof_pos = 0,
            .key_length = tree.data.key_length,
            .value_length_opt = tree.data.value_length_opt,
        };
    }

    /// Get current tree height
    pub fn treeHeight(self: *const AvlTreeVerifier) usize {
        if (self.current_digest) |d| {
            return d.height;
        }
        return 0;
    }

    /// Get current digest (null if verification failed).
    /// Returned by value: a slice into toBytes()'s temporary would dangle.
    pub fn digest(self: *const AvlTreeVerifier) ?[33]u8 {
        if (self.current_digest) |d| {
            return d.toBytes();
        }
        return null;
    }

    /// Perform lookup operation
    pub fn performLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
        if (self.current_digest == null) return error.VerificationFailed;

        // Process proof to verify key existence
        const result = try self.verifyLookup(key);
        return result;
    }

    /// Perform insert operation
    pub fn performInsert(
        self: *AvlTreeVerifier,
        key: []const u8,
        value: []const u8,
    ) !?[]const u8 {
        if (self.current_digest == null) return error.VerificationFailed;

        // Process proof to verify insertion
        const old_value = try self.verifyInsert(key, value);

        // Update digest based on proof
        self.updateDigestFromProof();

        return old_value;
    }

    /// Perform update operation
    pub fn performUpdate(
        self: *AvlTreeVerifier,
        key: []const u8,
        value: []const u8,
    ) !?[]const u8 {
        if (self.current_digest == null) return error.VerificationFailed;

        const old_value = try self.verifyUpdate(key, value);
        self.updateDigestFromProof();
        return old_value;
    }

    /// Perform remove operation
    pub fn performRemove(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
        if (self.current_digest == null) return error.VerificationFailed;

        const old_value = try self.verifyRemove(key);
        self.updateDigestFromProof();
        return old_value;
    }

    // SECURITY: Key comparisons in production must be constant-time to prevent
    // timing attacks that could leak key values. Use std.crypto.utils.timingSafeEql.
    fn verifyLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
        // NOTE: Stub - a full implementation must:
        // 1. Read node type from proof (leaf vs internal)
        // 2. Compare key with node key
        // 3. Follow proof path based on comparison result
        // 4. Verify all hashes match computed values
        // See scorex-util: BatchAVLVerifier for reference.
        _ = self;
        _ = key;
        @panic("verifyLookup not implemented - see reference implementations");
    }

    fn updateDigestFromProof(self: *AvlTreeVerifier) void {
        // Extract new digest from proof processing
        _ = self;
    }
};

Proof-Based Operations

Operations use proofs for verification [10][11]:

/// Evaluate contains operation
fn containsEval(
    tree: *const AvlTree,
    key: []const u8,
    proof: []const u8,
    E: *Evaluator,
) !bool {
    // Cost: create verifier O(proof.length)
    try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeContains);
    var verifier = AvlTreeVerifier.init(tree, proof);

    // Cost: lookup O(tree.height)
    const n_items = verifier.treeHeight();
    try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeContains);

    const result = verifier.performLookup(key) catch return false;
    return result != null;
}

/// Evaluate get operation
fn getEval(
    tree: *const AvlTree,
    key: []const u8,
    proof: []const u8,
    E: *Evaluator,
) !?[]const u8 {
    try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeGet);
    var verifier = AvlTreeVerifier.init(tree, proof);

    const n_items = verifier.treeHeight();
    try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeGet);

    return verifier.performLookup(key) catch return error.InvalidProof;
}

/// Evaluate insert operation
fn insertEval(
    tree: *const AvlTree,
    entries: []const KeyValue,
    proof: []const u8,
    E: *Evaluator,
) !?*AvlTree {
    // Check permission
    try E.addCost(IsInsertAllowedCost, OpCode.AvlTreeInsert);
    if (!tree.isInsertAllowed()) {
        return null;
    }

    try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeInsert);
    var verifier = AvlTreeVerifier.init(tree, proof);

    const n_items = @max(verifier.treeHeight(), 1);

    // Process each entry
    for (entries) |entry| {
        try E.addSeqCost(InsertCost, n_items, OpCode.AvlTreeInsert);
        _ = verifier.performInsert(entry.key, entry.value) catch return null;
    }

    // Return new tree with updated digest
    const new_digest = verifier.digest() orelse return null;
    try E.addCost(UpdateDigestCost, OpCode.AvlTreeInsert);
    // Heap-allocate: returning the address of a stack local would dangle
    const new_tree = try E.allocator.create(AvlTree);
    new_tree.* = tree.updateDigest(&new_digest);
    return new_tree;
}

/// Evaluate remove operation
fn removeEval(
    tree: *const AvlTree,
    keys: []const []const u8,
    proof: []const u8,
    E: *Evaluator,
) !?*AvlTree {
    try E.addCost(IsRemoveAllowedCost, OpCode.AvlTreeRemove);
    if (!tree.isRemoveAllowed()) {
        return null;
    }

    try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeRemove);
    var verifier = AvlTreeVerifier.init(tree, proof);

    const n_items = @max(verifier.treeHeight(), 1);

    for (keys) |key| {
        try E.addSeqCost(RemoveCost, n_items, OpCode.AvlTreeRemove);
        _ = verifier.performRemove(key) catch return null;
    }

    const new_digest = verifier.digest() orelse return null;
    try E.addCost(UpdateDigestCost, OpCode.AvlTreeRemove);
    // Heap-allocate: &tree.updateDigest(...) would point at a temporary
    const new_tree = try E.allocator.create(AvlTree);
    new_tree.* = tree.updateDigest(&new_digest);
    return new_tree;
}

const KeyValue = struct {
    key: []const u8,
    value: []const u8,
};

Cost Model

AVL tree operations have two-part costs [12]. Since AVL+ trees are balanced, the tree height is O(log n), where n is the number of entries. Proof size is also O(log n), as proofs contain one sibling hash per tree level.

AVL Tree Operation Costs
─────────────────────────────────────────────────────

Phase 1 - Create Verifier (O(proof.length)):
  base = 110, per_chunk = 20, chunk_size = 64

Phase 2 - Per Operation (O(tree.height)):
Operation     │ Base │ Per Height │ Chunk
──────────────┼──────┼────────────┼───────
Lookup        │  40  │    10      │   1
Insert        │  40  │    10      │   1
Update        │ 120  │    20      │   1
Remove        │ 100  │    15      │   1
──────────────────────────────────────────────────────

Example: Get operation on tree with height 10, proof 128 bytes
  Verifier: 110 + ⌈128/64⌉ × 20 = 110 + 2 × 20 = 150
  Lookup:   40 + 10 × 10 = 140
  Total:    290 JitCost units

The corresponding PerItemCost descriptors:

const CreateVerifierCost = PerItemCost{
    .base = JitCost{ .value = 110 },
    .per_chunk = JitCost{ .value = 20 },
    .chunk_size = 64,
};

const LookupCost = PerItemCost{
    .base = JitCost{ .value = 40 },
    .per_chunk = JitCost{ .value = 10 },
    .chunk_size = 1,
};

const InsertCost = PerItemCost{
    .base = JitCost{ .value = 40 },
    .per_chunk = JitCost{ .value = 10 },
    .chunk_size = 1,
};

const UpdateCost = PerItemCost{
    .base = JitCost{ .value = 120 },
    .per_chunk = JitCost{ .value = 20 },
    .chunk_size = 1,
};

const RemoveCost = PerItemCost{
    .base = JitCost{ .value = 100 },
    .per_chunk = JitCost{ .value = 15 },
    .chunk_size = 1,
};
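
The worked example above can be reproduced mechanically. A Python check of the two-phase cost (`per_item_cost` is a hypothetical helper mirroring the PerItemCost fields, not an SDK API):

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, n: int) -> int:
    """base + ceil(n / chunk_size) * per_chunk, as in PerItemCost."""
    return base + math.ceil(n / chunk_size) * per_chunk

# Get on a height-10 tree with a 128-byte proof:
verifier = per_item_cost(110, 20, 64, 128)  # CreateVerifierCost over proof bytes
lookup = per_item_cost(40, 10, 1, 10)       # LookupCost over tree height
print(verifier, lookup, verifier + lookup)  # -> 150 140 290
```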

Serialization

AvlTreeData serialization format [13][14]:

/// Serialize AvlTreeData
fn serializeAvlTreeData(data: *const AvlTreeData, writer: anytype) !void {
    // Digest (33 bytes)
    try writer.writeAll(&data.digest.toBytes());

    // Flags (1 byte)
    try writer.writeByte(data.tree_flags.serialize());

    // Key length (VLQ)
    try writeUInt(writer, data.key_length);

    // Optional value length
    if (data.value_length_opt) |vlen| {
        try writer.writeByte(1); // Some
        try writeUInt(writer, vlen);
    } else {
        try writer.writeByte(0); // None
    }
}

/// Parse AvlTreeData
fn parseAvlTreeData(reader: anytype) !AvlTreeData {
    // Digest (33 bytes)
    var digest_bytes: [33]u8 = undefined;
    if ((try reader.readAll(&digest_bytes)) != digest_bytes.len) return error.EndOfStream;
    const digest = ADDigest.fromSlice(&digest_bytes);

    // Flags (1 byte)
    const flags = AvlTreeFlags.parse(try reader.readByte());

    // Key length (VLQ)
    const key_length = try readUInt(reader);

    // Optional value length
    const has_value_length = (try reader.readByte()) != 0;
    const value_length_opt: ?u32 = if (has_value_length)
        try readUInt(reader)
    else
        null;

    return AvlTreeData{
        .digest = digest,
        .tree_flags = flags,
        .key_length = key_length,
        .value_length_opt = value_length_opt,
    };
}
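
To see the byte layout end to end, here is a Python sketch of the writer side. The VLQ encoder assumes the common unsigned format (7 data bits per byte, continuation bit set on all but the last); confirm byte-level details against the serializers cited below before relying on it:

```python
def write_vlq(n: int) -> bytes:
    """Unsigned VLQ: 7 data bits per byte, high bit marks continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def serialize_avl_tree_data(digest33: bytes, flags: int, key_len: int,
                            value_len=None) -> bytes:
    """Digest (33) + flags (1) + VLQ key length + optional VLQ value length."""
    assert len(digest33) == 33
    out = digest33 + bytes([flags]) + write_vlq(key_len)
    if value_len is None:
        out += b"\x00"  # None tag
    else:
        out += b"\x01" + write_vlq(value_len)  # Some tag + length
    return out

data = serialize_avl_tree_data(b"\x00" * 33, 0x07, 32)
print(len(data))  # -> 36 (33 digest + 1 flags + 1 VLQ key length + 1 option tag)
```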

Off-Chain Proof Generation

Provers generate proofs for operations:

/// Off-chain AVL tree prover (holds full tree)
const AvlProver = struct {
    /// Full tree structure
    root: ?*AvlNode,
    /// Key length
    key_length: usize,
    /// Value length (optional)
    value_length_opt: ?usize,
    /// Pending operations for batch proof
    pending_ops: std.ArrayList(Operation),
    allocator: Allocator,

    const Operation = union(enum) {
        lookup: []const u8,
        insert: struct { key: []const u8, value: []const u8 },
        update: struct { key: []const u8, value: []const u8 },
        remove: []const u8,
    };

    /// Perform insert and record for proof
    pub fn performInsert(self: *AvlProver, key: []const u8, value: []const u8) !void {
        // Actually insert into tree
        self.root = try self.insertNode(self.root, key, value);
        // Record for proof generation
        try self.pending_ops.append(.{ .insert = .{ .key = key, .value = value } });
    }

    /// Generate proof for all pending operations
    pub fn generateProof(self: *AvlProver) ![]const u8 {
        var proof_builder = ProofBuilder.init(self.allocator);

        for (self.pending_ops.items) |op| {
            switch (op) {
                .lookup => |key| try proof_builder.addLookupPath(self.root, key),
                .insert => |ins| try proof_builder.addInsertPath(self.root, ins.key),
                .update => |upd| try proof_builder.addUpdatePath(self.root, upd.key),
                .remove => |key| try proof_builder.addRemovePath(self.root, key),
            }
        }

        self.pending_ops.clearRetainingCapacity();
        return proof_builder.finish();
    }

    /// Get current tree digest
    pub fn digest(self: *const AvlProver) ADDigest {
        if (self.root) |r| {
            return computeNodeDigest(r);
        }
        return ADDigest{ .root_hash = [_]u8{0} ** 32, .height = 0 };
    }

    fn computeNodeDigest(node: *const AvlNode) ADDigest {
        // Stub: compute the BLAKE2b256 hash of the node's contents,
        // including the left and right child hashes.
        _ = node;
        @panic("computeNodeDigest not implemented");
    }
};

const AvlNode = struct {
    key: []const u8,
    value: []const u8,
    left: ?*AvlNode,
    right: ?*AvlNode,
    height: u8,
};

Key Ordering Requirement

Keys must be provided in the same order during proof generation and verification [15]:

CRITICAL: Key Ordering
─────────────────────────────────────────────────────

Proof Generation (off-chain):
  prover.performLookup(key_A)
  prover.performLookup(key_B)
  prover.performLookup(key_C)
  proof = prover.generateProof()

Verification (on-chain):
  tree.getMany([key_A, key_B, key_C], proof)  ✓ Works
  tree.getMany([key_B, key_A, key_C], proof)  ✗ Fails

The proof encodes a specific traversal path.
Different key order = different path = verification failure.
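
A toy model makes the failure mode tangible. If a proof commits to the keys in traversal order (modeled here as a running hash over the visited keys, which is NOT the real AVL+ proof encoding), any reordering changes the commitment:

```python
import hashlib

def toy_proof(keys) -> bytes:
    """Toy order-sensitive commitment; not the actual proof format."""
    h = hashlib.blake2b(digest_size=32)
    for k in keys:
        h.update(k)
    return h.digest()

proof = toy_proof([b"key_A", b"key_B", b"key_C"])
print(toy_proof([b"key_A", b"key_B", b"key_C"]) == proof)  # -> True
print(toy_proof([b"key_B", b"key_A", b"key_C"]) == proof)  # -> False
```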

Summary

  • Authenticated dictionaries store only 33-byte digest on-chain
  • Ergo key size: Always 32 bytes (Blake2b256 hash); keyLength field exists for generality
  • Prover (off-chain) holds full tree, generates proofs
  • Verifier (on-chain) verifies proofs with only digest
  • Operation flags control insert/update/remove permissions
  • Key ordering must match between proof generation and verification
  • Cost scales with proof length (verifier creation) and tree height (operations)
  • Update methods are immutable—they return new tree instances

Next: Chapter 22: Box Model

[1] Scala: AvlTreeData.scala:43-57 (AvlTreeData case class)
[2] Rust: avl_tree_data.rs:56-69 (AvlTreeData struct)
[3] Scala: AvlTreeData.scala:57 (DigestSize = 33)
[4] Rust: avl_tree_data.rs:61-62 (digest field)
[5] Scala: AvlTreeData.scala:7-36 (AvlTreeFlags)
[6] Rust: avl_tree_data.rs:10-54 (AvlTreeFlags impl)
[7] Scala: SigmaDsl.scala:547-589 (AvlTree trait)
[8] Scala: AvlTreeVerifier.scala:8-88 (AvlTreeVerifier)
[9] Scala: CAvlTreeVerifier.scala:17-45 (CAvlTreeVerifier)
[10] Scala: CErgoTreeEvaluator.scala:78-93 (contains_eval)
[11] Scala: CErgoTreeEvaluator.scala:132-164 (insert_eval)
[12] Scala: methods.scala:1498-1540 (cost info constants)
[13] Scala: AvlTreeData.scala:71-90 (serializer)
[14] Rust: avl_tree_data.rs:71-91 (SigmaSerializable impl)
[15] Scala: methods.scala:1588 (getMany key ordering caution)

Chapter 22: Box Model

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Understanding of UTXO (Unspent Transaction Output) model basics
  • Chapter 3 for ErgoTree format stored in boxes
  • Chapter 20 for collection types used in registers

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the Ergo box as the fundamental UTXO structure with extended capabilities
  • Work with the register-based data model (R0-R3 mandatory, R4-R9 optional)
  • Manage tokens—the multi-asset feature of Ergo boxes
  • Compute box IDs using Blake2b256 hashing of serialized content
  • Implement box serialization and deserialization

Box Architecture

Boxes are Ergo's state containers—the extended UTXO model [1][2]:

Box Structure
─────────────────────────────────────────────────────

┌─────────────────────────────────────────────────────┐
│                     ErgoBox                         │
├─────────────────────────────────────────────────────┤
│  box_id: [32]u8          Blake2b256(serialize(box)) │
├─────────────────────────────────────────────────────┤
│                   Mandatory Registers               │
│  ┌───────────────────────────────────────────────┐  │
│  │ R0: Long           Value in nanoERG (10⁻⁹ ERG)│  │
│  │ R1: ErgoTree       Guarding script            │  │
│  │ R2: Coll[Token]    Secondary tokens           │  │
│  │ R3: (Int, Bytes)   Creation info              │  │
│  └───────────────────────────────────────────────┘  │
├─────────────────────────────────────────────────────┤
│                Non-Mandatory Registers              │
│  ┌───────────────────────────────────────────────┐  │
│  │ R4-R9: Any         Application-defined data   │  │
│  └───────────────────────────────────────────────┘  │
├─────────────────────────────────────────────────────┤
│               Transaction Reference                 │
│  ┌───────────────────────────────────────────────┐  │
│  │ transaction_id: [32]u8    Creating tx hash    │  │
│  │ index: u16                Output index in tx  │  │
│  └───────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘

Core Box Structure

const ErgoBox = struct {
    /// Blake2b256 hash of serialized box (computed)
    box_id: BoxId,
    /// Amount in NanoErgs (R0)
    value: BoxValue,
    /// Guarding script (R1)
    ergo_tree: ErgoTree,
    /// Secondary tokens (R2), up to MAX_TOKENS_COUNT
    tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
    /// Additional registers R4-R9
    additional_registers: NonMandatoryRegisters,
    /// Block height when transaction was created (part of R3)
    creation_height: u32,
    /// Transaction that created this box (part of R3)
    transaction_id: TxId,
    /// Output index in transaction (part of R3)
    index: u16,

    /// Protocol: 255 (u8), practical: ~122 due to box size limit
    pub const MAX_TOKENS_COUNT: usize = 255;
    pub const MAX_BOX_SIZE: usize = 4096;
    pub const MAX_SCRIPT_SIZE: usize = 4096;

    /// Create new box, computing box_id from content
    pub fn init(
        value: BoxValue,
        ergo_tree: ErgoTree,
        tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
        additional_registers: NonMandatoryRegisters,
        creation_height: u32,
        transaction_id: TxId,
        index: u16,
    ) !ErgoBox {
        var box_with_zero_id = ErgoBox{
            .box_id = BoxId.zero(),
            .value = value,
            .ergo_tree = ergo_tree,
            .tokens = tokens,
            .additional_registers = additional_registers,
            .creation_height = creation_height,
            .transaction_id = transaction_id,
            .index = index,
        };
        box_with_zero_id.box_id = try box_with_zero_id.calcBoxId();
        return box_with_zero_id;
    }

    /// Compute box ID as Blake2b256 hash of serialized bytes
    fn calcBoxId(self: *const ErgoBox) !BoxId {
        const bytes = try self.sigmaSerialize();
        const hash = blake2b256(bytes);
        return BoxId{ .digest = hash };
    }

    /// Create box from candidate by adding transaction reference
    pub fn fromBoxCandidate(
        candidate: *const ErgoBoxCandidate,
        transaction_id: TxId,
        index: u16,
    ) !ErgoBox {
        return init(
            candidate.value,
            candidate.ergo_tree,
            candidate.tokens,
            candidate.additional_registers,
            candidate.creation_height,
            transaction_id,
            index,
        );
    }
};

ErgoBoxCandidate

Before confirmation, boxes exist as candidates without a transaction reference[3][4]:

/// Box before transaction confirmation (no tx reference yet)
const ErgoBoxCandidate = struct {
    /// Amount in NanoErgs
    value: BoxValue,
    /// Guarding script
    ergo_tree: ErgoTree,
    /// Secondary tokens
    tokens: ?BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT),
    /// Additional registers R4-R9
    additional_registers: NonMandatoryRegisters,
    /// Declared creation height
    creation_height: u32,

    pub fn toBox(self: *const ErgoBoxCandidate, tx_id: TxId, index: u16) !ErgoBox {
        return ErgoBox.fromBoxCandidate(self, tx_id, index);
    }
};

Register Model

Ten registers in total: four mandatory, six application-defined[5][6]:

Register Layout
─────────────────────────────────────────────────────
 ID    Type                   Purpose
─────────────────────────────────────────────────────
 R0    Long                   Value in nanoERG (10⁻⁹ ERG)
 R1    Coll[Byte]             Serialized ErgoTree
 R2    Coll[(Coll[Byte],Long)] Secondary tokens
 R3    (Int, Coll[Byte])      (height, txId ++ index)
─────────────────────────────────────────────────────
 R4    Any                    Application data
 R5    Any                    Application data
 R6    Any                    Application data
 R7    Any                    Application data
 R8    Any                    Application data
 R9    Any                    Application data
─────────────────────────────────────────────────────

Note: R4-R9 must be densely packed.
      If R6 is used, R4 and R5 must also be present.
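
The dense-packing rule can be checked in isolation. A minimal illustrative sketch in Python (the function name is ours, not from either reference implementation), mirroring the check that fromMap performs below:

```python
def check_densely_packed(regs: dict[int, object]) -> bool:
    """R4..R9 must occupy a contiguous run starting at R4."""
    if not all(4 <= r <= 9 for r in regs):
        return False
    # With n registers present, they must be exactly R4..R(4+n-1)
    return set(regs) == set(range(4, 4 + len(regs)))

assert check_densely_packed({4: 1, 5: 2, 6: 3})   # R4-R6 present: ok
assert not check_densely_packed({4: 1, 6: 3})     # R6 without R5: rejected
```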

Register ID Types

/// Register identifier (0-9)
const RegisterId = union(enum) {
    mandatory: MandatoryRegisterId,
    non_mandatory: NonMandatoryRegisterId,

    pub const R0 = RegisterId{ .mandatory = .r0 };
    pub const R1 = RegisterId{ .mandatory = .r1 };
    pub const R2 = RegisterId{ .mandatory = .r2 };
    pub const R3 = RegisterId{ .mandatory = .r3 };

    pub fn fromByte(value: u8) !RegisterId {
        if (value < 4) {
            return RegisterId{ .mandatory = @enumFromInt(value) };
        } else if (value <= 9) {
            return RegisterId{ .non_mandatory = @enumFromInt(value) };
        } else {
            return error.RegisterIdOutOfBounds;
        }
    }
};

/// Mandatory registers (R0-R3) - every box has these
const MandatoryRegisterId = enum(u8) {
    /// Monetary value in NanoErgs
    r0 = 0,
    /// Guarding script (serialized ErgoTree)
    r1 = 1,
    /// Secondary tokens
    r2 = 2,
    /// Transaction reference and creation height
    r3 = 3,
};

/// Non-mandatory registers (R4-R9) - application defined
const NonMandatoryRegisterId = enum(u8) {
    r4 = 4,
    r5 = 5,
    r6 = 6,
    r7 = 7,
    r8 = 8,
    r9 = 9,

    pub const START_INDEX: usize = 4;
    pub const END_INDEX: usize = 9;
    pub const NUM_REGS: usize = 6;
};

Non-Mandatory Registers

Densely-packed storage for R4-R9[7][8]:

const NonMandatoryRegisters = struct {
    /// Registers stored as contiguous array (R4 at index 0)
    values: []RegisterValue,
    allocator: Allocator,

    pub const MAX_SIZE: usize = NonMandatoryRegisterId.NUM_REGS;

    pub fn empty() NonMandatoryRegisters {
        return .{ .values = &.{}, .allocator = undefined };
    }

    /// Create from map, ensuring dense packing
    pub fn fromMap(
        allocator: Allocator,
        map: std.AutoHashMap(NonMandatoryRegisterId, Constant),
    ) !NonMandatoryRegisters {
        const count = map.count();
        if (count > MAX_SIZE) return error.InvalidSize;

        // Verify dense packing: R4...R(4+count-1) must all be present
        var values = try allocator.alloc(RegisterValue, count);
        var i: usize = 0;
        while (i < count) : (i += 1) {
            const reg_id: NonMandatoryRegisterId = @enumFromInt(4 + i);
            const constant = map.get(reg_id) orelse
                return error.NonDenselyPacked;
            values[i] = RegisterValue{ .parsed = constant };
        }

        return .{ .values = values, .allocator = allocator };
    }

    /// Get register by ID, returns null if not present
    pub fn get(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) ?*const RegisterValue {
        const index = @intFromEnum(reg_id) - NonMandatoryRegisterId.START_INDEX;
        if (index >= self.values.len) return null;
        return &self.values[index];
    }

    /// Get as Constant, handling parse errors
    pub fn getConstant(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) !?Constant {
        const reg_val = self.get(reg_id) orelse return null;
        return try reg_val.asConstant();
    }
};

/// Register value: either a parsed Constant or unparseable bytes
const RegisterValue = union(enum) {
    parsed: Constant,
    parsed_tuple: EvaluatedTuple,
    invalid: struct {
        bytes: []const u8,
        error_msg: []const u8,
    },

    pub fn asConstant(self: *const RegisterValue) !Constant {
        return switch (self.*) {
            .parsed => |c| c,
            .parsed_tuple => |t| t.toConstant(),
            .invalid => error.UnparseableRegister,
        };
    }
};

Box ID Computation

The box ID is the Blake2b256 hash of the serialized box content[9][10]:

const BoxId = struct {
    digest: [32]u8,

    pub const SIZE: usize = 32;

    pub fn zero() BoxId {
        return .{ .digest = [_]u8{0} ** 32 };
    }

    pub fn fromBytes(bytes: []const u8) !BoxId {
        if (bytes.len != SIZE) return error.InvalidLength;
        var result: BoxId = undefined;
        @memcpy(&result.digest, bytes);
        return result;
    }
};

/// Compute box ID from serialized box bytes
pub fn computeBoxId(box_bytes: []const u8) BoxId {
    return BoxId{ .digest = blake2b256(box_bytes) };
}

The ID includes the transaction reference, making each box unique:

Box ID Computation
─────────────────────────────────────────────────────

┌──────────────────────────────────────────────────┐
│              Serialized Box Bytes                │
├──────────────────────────────────────────────────┤
│  value (VLQ)                                     │
│  ergo_tree (bytes)                               │
│  creation_height (VLQ)                           │
│  tokens_count (u8)                               │
│  tokens[] (token_id + amount)                    │
│  registers_count (u8)                            │
│  additional_registers[]                          │
│  transaction_id (32 bytes)                       │
│  index (2 bytes, big-endian)                     │
└──────────────────────────────────────────────────┘
                        │
                        ▼
              ┌─────────────────┐
              │   Blake2b256    │
              └────────┬────────┘
                       │
                       ▼
              ┌─────────────────┐
              │  BoxId (32 B)   │
              └─────────────────┘
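
Blake2b with a 32-byte digest is available in most standard libraries, so the hash step is easy to reproduce. A quick Python sketch (illustrative only; the input below is not a real serialized box):

```python
import hashlib

def blake2b256(data: bytes) -> bytes:
    """Blake2b with a 32-byte digest, as used for box IDs."""
    return hashlib.blake2b(data, digest_size=32).digest()

# Flipping a single byte in the trailing tx reference
# yields a completely different box ID.
a = blake2b256(b"box bytes" + bytes(34))
b = blake2b256(b"box bytes" + bytes(33) + b"\x01")
assert len(a) == 32 and a != b
```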

Register Access

Get a register value with type checking[11][12]:

/// Get any register value (R0-R9)
pub fn getRegister(box: *const ErgoBox, id: RegisterId) !?Constant {
    return switch (id) {
        .mandatory => |mid| switch (mid) {
            .r0 => Constant.fromLong(box.value.as_i64()),
            .r1 => Constant.fromBytes(try box.ergo_tree.serialize()),
            .r2 => Constant.fromTokens(box.tokensRaw()),
            .r3 => Constant.fromTuple(box.creationInfo()),
        },
        .non_mandatory => |nid| try box.additional_registers.getConstant(nid),
    };
}

/// Get tokens as raw (bytes, amount) pairs
pub fn tokensRaw(box: *const ErgoBox) []const struct { []const i8, i64 } {
    const Pair = struct { []const i8, i64 };
    if (box.tokens) |tokens| {
        // Module-level allocator assumed, as elsewhere in these sketches;
        // allocation failure degrades to an empty collection.
        const result = allocator.alloc(Pair, tokens.items().len) catch return &.{};
        for (tokens.items(), 0..) |token, i| {
            result[i] = .{ token.token_id.asVecI8(), token.amount.as_i64() };
        }
        return result;
    }
    return &.{};
}

/// Get creation info as (height, txId ++ index)
pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
    var bytes: [34]u8 = undefined; // 32-byte tx_id + 2-byte index
    @memcpy(bytes[0..32], &box.transaction_id.digest);
    std.mem.writeInt(u16, bytes[32..34], box.index, .big);
    // Return the array by value: slicing the stack-local buffer here
    // would hand back a dangling pointer.
    return .{ @intCast(box.creation_height), bytes };
}
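
The layout of R3's byte component is simple enough to reproduce directly. A hypothetical Python sketch assembling txId ++ index (names are ours):

```python
def creation_info_bytes(tx_id: bytes, index: int) -> bytes:
    """R3's byte component: 32-byte tx id followed by the big-endian u16 index."""
    assert len(tx_id) == 32 and 0 <= index <= 0xFFFF
    return tx_id + index.to_bytes(2, "big")

info = creation_info_bytes(bytes(32), 5)
assert len(info) == 34
assert info[32:] == b"\x00\x05"   # index 5 in big-endian
```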

ExtractRegisterAs (AST Node)

Register access in ErgoScript compiles to ExtractRegisterAs[13][14]:

/// Box.R0 - Box.R9 operations
const ExtractRegisterAs = struct {
    /// Input box expression
    input: *const Expr,
    /// Register index (0-9)
    register_id: i8,
    /// Expected element type (wrapped in Option)
    elem_tpe: SType,

    pub const OP_CODE = OpCode.new(0x6E); // EXTRACT_REGISTER_AS

    pub fn tpe(self: *const ExtractRegisterAs) SType {
        return SType.option(self.elem_tpe);
    }

    pub fn eval(self: *const ExtractRegisterAs, env: *Env, ctx: *Context) !Value {
        const ir_box = try self.input.eval(env, ctx);
        const box = ir_box.asBox() orelse return error.TypeMismatch;

        const id = try RegisterId.fromByte(@intCast(self.register_id));

        const reg_val_opt = try box.getRegister(id);

        if (reg_val_opt) |constant| {
            // Type must match exactly
            if (!constant.tpe.equals(self.elem_tpe)) {
                return error.UnexpectedType;
            }
            return Value.some(constant.value);
        } else {
            return Value.none();
        }
    }
};

Token Representation

Tokens are (id, amount) pairs stored in R2[15][16]:

const Token = struct {
    /// 32-byte token identifier
    token_id: TokenId,
    /// Token amount (positive i64)
    amount: TokenAmount,
};

const TokenId = struct {
    digest: [32]u8,

    pub const SIZE: usize = 32;
};

const TokenAmount = struct {
    value: u64,

    pub fn as_i64(self: TokenAmount) i64 {
        return @intCast(self.value);
    }
};

/// Bounded collection of tokens (1 to MAX_TOKENS)
const BoxTokens = BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT);

Token minting rule:

Token Creation Rule
─────────────────────────────────────────────────────

A new token can ONLY be minted when:
  token_id == INPUTS(0).id   (MUST equal first input's box ID)

This is a consensus rule enforced by the protocol.
Only the first input's box ID can be used as a new token ID.
This ensures uniqueness: tokens are "born" from a specific box.

┌─────────────┐     Spend      ┌─────────────────┐
│  Input Box  │ ─────────────► │   Output Box    │
│  id: ABC123 │                │  token: ABC123  │
└─────────────┘                │  amount: 1000   │
                               └─────────────────┘
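
The minting rule reduces to a single equality check. A toy Python sketch (the function name is ours, not from any implementation):

```python
def valid_new_token_id(token_id: bytes, first_input_box_id: bytes) -> bool:
    """Consensus rule: a freshly minted token's ID must equal the first input's box ID."""
    return token_id == first_input_box_id

box_id = bytes.fromhex("ab" * 32)
assert valid_new_token_id(box_id, box_id)        # minted from the first input: ok
assert not valid_new_token_id(bytes(32), box_id) # any other ID: rejected
```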

Box Serialization

/// Serialize box with optional token ID indexing
pub fn serializeBoxWithIndexedDigests(
    box_value: BoxValue,
    ergo_tree_bytes: []const u8,
    tokens: ?BoxTokens,
    additional_registers: *const NonMandatoryRegisters,
    creation_height: u32,
    token_ids_in_tx: ?*const IndexSet(TokenId),
    writer: anytype,
) !void {
    // Value (VLQ-encoded)
    try box_value.serialize(writer);

    // ErgoTree bytes
    try writer.writeAll(ergo_tree_bytes);

    // Creation height (VLQ-encoded)
    try writeVLQ(writer, creation_height);

    // Tokens
    const token_slice = if (tokens) |t| t.items() else &[_]Token{};
    try writer.writeByte(@intCast(token_slice.len));

    for (token_slice) |token| {
        if (token_ids_in_tx) |index_set| {
            // Write index into transaction's token list
            const idx = index_set.getIndex(token.token_id) orelse
                return error.TokenNotInIndex;
            try writeVLQ(writer, @intCast(idx));
        } else {
            // Write full 32-byte token ID
            try writer.writeAll(&token.token_id.digest);
        }
        try writeVLQ(writer, token.amount.value);
    }

    // Additional registers
    try additional_registers.serialize(writer);
}

/// Full ErgoBox serialization (adds tx reference)
pub fn serializeErgoBox(box: *const ErgoBox, writer: anytype) !void {
    const ergo_tree_bytes = try box.ergo_tree.serialize();

    try serializeBoxWithIndexedDigests(
        box.value,
        ergo_tree_bytes,
        box.tokens,
        &box.additional_registers,
        box.creation_height,
        null,
        writer,
    );

    // Transaction reference
    try writer.writeAll(&box.transaction_id.digest);
    try writer.writeInt(u16, box.index, .big);
}
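
The VLQ encoding used above follows the protobuf-style varint scheme: 7 data bits per byte, least-significant group first, continuation bit on every byte except the last. A small Python sketch, assuming that scheme:

```python
def write_vlq(value: int) -> bytes:
    """Unsigned VLQ: 7 data bits per byte, least-significant group first,
    continuation bit (0x80) set on every byte except the last."""
    assert value >= 0
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

assert write_vlq(0) == b"\x00"
assert write_vlq(127) == b"\x7f"
assert write_vlq(300) == b"\xac\x02"   # same bytes as a protobuf varint
```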

Size Limits

Box Constraints
─────────────────────────────────────────────────────
Limit                    Value     Notes
─────────────────────────────────────────────────────
Max box size             4 KB      Total serialized bytes
Max tokens per box       255       Protocol limit (u8)
  (practical limit)      ~122      Due to 4KB size limit
Max registers            10        R0-R9
Max script size          4 KB      ErgoTree in R1 (part of box)
─────────────────────────────────────────────────────
const SigmaConstants = struct {
    pub const MAX_BOX_SIZE: usize = 4 * 1024;
    /// Protocol allows 255 (u8), but ~122 fit within MAX_BOX_SIZE
    pub const MAX_TOKENS_PROTOCOL: usize = 255;
    pub const MAX_TOKENS_PRACTICAL: usize = 122;
    pub const MAX_REGISTERS: usize = 10;
};
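
The ~122 practical token limit follows from simple arithmetic: each token costs at least 33 bytes (a 32-byte ID plus a one-byte minimum VLQ amount), so even an otherwise-empty 4 KB box caps out near 124, and the mandatory fields push this down to roughly 122. A quick Python check of that bound:

```python
MAX_BOX_SIZE = 4096      # bytes
TOKEN_ID_LEN = 32        # bytes
MIN_AMOUNT_LEN = 1       # VLQ-encoded amount is at least one byte

# Upper bound if the box contained nothing but tokens:
bound = MAX_BOX_SIZE // (TOKEN_ID_LEN + MIN_AMOUNT_LEN)
assert bound == 124      # mandatory fields reduce this to ~122 in practice
```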

Box Interface Methods

Methods available on the Box type[17][18]:

const BoxMethods = struct {
    /// Box.value: Long - monetary value in NanoErgs
    pub fn value(box: *const ErgoBox) i64 {
        return box.value.as_i64();
    }

    /// Box.propositionBytes: Coll[Byte] - serialized script
    pub fn propositionBytes(box: *const ErgoBox) ![]const u8 {
        return try box.ergo_tree.serialize();
    }

    /// Box.bytes: Coll[Byte] - full serialized box
    pub fn bytes(box: *const ErgoBox) ![]const u8 {
        return try box.serialize();
    }

    /// Box.bytesWithoutRef: Coll[Byte] - without tx reference
    pub fn bytesWithoutRef(box: *const ErgoBox) ![]const u8 {
        const candidate = ErgoBoxCandidate{
            .value = box.value,
            .ergo_tree = box.ergo_tree,
            .tokens = box.tokens,
            .additional_registers = box.additional_registers,
            .creation_height = box.creation_height,
        };
        return try candidate.serialize();
    }

    /// Box.id: Coll[Byte] - 32-byte Blake2b256 hash
    pub fn id(box: *const ErgoBox) []const u8 {
        return &box.box_id.digest;
    }

    /// Box.creationInfo: (Int, Coll[Byte])
    pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
        return box.creationInfo();
    }

    /// Box.tokens: Coll[(Coll[Byte], Long)]
    pub fn tokens(box: *const ErgoBox) []const Token {
        return if (box.tokens) |t| t.items() else &.{};
    }

    /// Box.getReg[T](i: Int): Option[T]
    pub fn getReg(box: *const ErgoBox, comptime T: type, index: i32) !?T {
        const id = try RegisterId.fromByte(@intCast(index));
        const constant = try box.getRegister(id) orelse return null;
        return constant.extractAs(T);
    }
};

Type-Safe Register Access

Three outcomes when accessing registers[19][20]:

Register Access Outcomes
─────────────────────────────────────────────────────

┌─────────────────────────────────────────────────┐
│             box.R4[Int]                         │
└─────────────────────────────────────────────────┘
                      │
        ┌─────────────┼─────────────┐
        ▼             ▼             ▼
┌──────────────┐ ┌──────────┐ ┌────────────────┐
│ R4 not set   │ │ R4 = Int │ │ R4 = Long      │
│              │ │          │ │ (wrong type)   │
└──────┬───────┘ └────┬─────┘ └───────┬────────┘
       │              │               │
       ▼              ▼               ▼
   None           Some(value)     ERROR!
                                  InvalidType
/// Type-safe register access with explicit error handling
pub fn extractRegisterAs(
    box: *const ErgoBox,
    register_id: i8,
    expected_type: SType,
) !?Value {
    const id = try RegisterId.fromByte(@intCast(register_id));
    const constant_opt = try box.getRegister(id);

    if (constant_opt) |constant| {
        if (!constant.tpe.equals(expected_type)) {
            return error.InvalidType;
        }
        return constant.value;
    }
    return null;
}
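
The three outcomes can be modeled directly. An illustrative Python sketch (names are ours) with a dict standing in for the register store:

```python
from typing import Any, Optional, Tuple

class InvalidTypeError(Exception):
    """Register is present but holds a different type than requested."""

def extract_register_as(regs: dict[int, Tuple[str, Any]],
                        reg_id: int, expected: str) -> Optional[Any]:
    """None if unset; the value if the type matches; raise otherwise."""
    entry = regs.get(reg_id)
    if entry is None:
        return None                      # register not set -> None
    tpe, value = entry
    if tpe != expected:
        raise InvalidTypeError(f"R{reg_id} holds {tpe}, not {expected}")
    return value                         # Some(value)

regs = {4: ("Int", 42)}                  # box with only R4 = 42: Int
assert extract_register_as(regs, 5, "Int") is None   # not set -> None
assert extract_register_as(regs, 4, "Int") == 42     # type matches -> Some(42)
```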

Summary

  • Boxes are immutable UTXO state containers with 10 registers
  • R0-R3 are mandatory (value, script, tokens, creation info)
  • R4-R9 are application-defined, must be densely packed
  • Box ID is Blake2b256 hash of serialized content including tx reference
  • Tokens are stored in R2, max 255 per box (protocol), ~122 in practice; a newly minted token's ID MUST equal the first input's box ID
  • Type-safe access with three outcomes: None, Some(value), or InvalidType
  • 4KB limit on total box size

Next: Chapter 23: Interpreter Wrappers

[6] Rust: id.rs:78-90
[7] Scala: ErgoBox.scala (additionalRegisters)
[11] Scala: CBox.scala:77-94
[13] Scala: methods.scala:1263 (SBoxMethods)
[16] Rust: ergo_box.rs:36-37 (BoxTokens)
[19] Scala: CBox.scala:20-74

Chapter 23: Interpreter Wrappers

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the interpreter hierarchy and how verifier/prover are combined
  • Describe storage rent rules for expired boxes
  • Use the Wallet API for transaction signing
  • Implement proof verification with context extensions

Interpreter Architecture

The interpreter provides a layered architecture for script evaluation and proving[1][2]:

Interpreter Hierarchy
─────────────────────────────────────────────────────

┌─────────────────────────────────────────────────────┐
│                    Verifier                         │
│  verify(tree, ctx, proof, message) -> bool          │
│  Evaluates tree, then verifies sigma protocol proof │
└────────────────────────┬────────────────────────────┘
                         │ uses
                         ▼
┌─────────────────────────────────────────────────────┐
│                    Prover                           │
│  prove(tree, ctx, message, hints) -> ProverResult   │
│  Reduces to SigmaBoolean, generates proof           │
├─────────────────────────────────────────────────────┤
│  secrets: []PrivateInput                            │
│  prove() generates commitment, response             │
└────────────────────────┬────────────────────────────┘
                         │ uses
                         ▼
┌─────────────────────────────────────────────────────┐
│               reduce_to_crypto                      │
│  Evaluates ErgoTree to SigmaBoolean                 │
│  Returns: { sigma_prop, cost, diag }                │
└─────────────────────────────────────────────────────┘

Reduction to Crypto

The core evaluation function reduces an ErgoTree to a cryptographic proposition[3][4]:

/// Result of expression reduction
const ReductionResult = struct {
    /// SigmaBoolean representing verifiable statement
    sigma_prop: SigmaBoolean,
    /// Estimated execution cost
    cost: u64,
    /// Diagnostic info (env state, pretty-printed expr)
    diag: ReductionDiagnosticInfo,
};

/// Evaluate ErgoTree to SigmaBoolean
pub fn reduceToCrypto(tree: *const ErgoTree, ctx: *const Context) !ReductionResult {
    const expr = try tree.root();
    var env = Env.empty();

    const value = try expr.eval(&env, ctx);

    const sigma_prop = switch (value) {
        .boolean => |b| SigmaBoolean.trivial(b),
        .sigma_prop => |sp| sp.value(),
        else => return error.NotSigmaProp,
    };

    return ReductionResult{
        .sigma_prop = sigma_prop,
        .cost = ctx.cost_accum.total(),
        .diag = .{
            .env = env.toStatic(),
            .pretty_printed_expr = null,
        },
    };
}

Verifier Trait

Verification executes the script and validates the proof[5][6]:

const Verifier = struct {
    /// Verify proof against ErgoTree in context
    pub fn verify(
        self: *const Verifier,
        tree: *const ErgoTree,
        ctx: *const Context,
        proof: ProofBytes,
        message: []const u8,
    ) !VerificationResult {
        // Step 1-2: Reduce to SigmaBoolean
        const reduction = try reduceToCrypto(tree, ctx);

        // Step 3: Verify proof
        const result = switch (reduction.sigma_prop) {
            .trivial_prop => |b| b,
            else => |sb| blk: {
                if (proof.isEmpty()) break :blk false;

                // Parse signature and compute challenges
                const unchecked_tree = try parseSigComputeChallenges(
                    sb,
                    proof.bytes(),
                );

                // Verify commitments match
                break :blk try checkCommitments(unchecked_tree, message);
            },
        };

        return VerificationResult{
            .result = result,
            .cost = reduction.cost,
            .diag = reduction.diag,
        };
    }
};

const VerificationResult = struct {
    /// True if proof validates
    result: bool,
    /// Execution cost
    cost: u64,
    /// Diagnostic information
    diag: ReductionDiagnosticInfo,
};

Prover Trait

The prover generates proofs for sigma propositions[7][8]:

const Prover = struct {
    /// Private inputs (secrets)
    secrets: []const PrivateInput,

    /// Generate proof for ErgoTree
    pub fn prove(
        self: *const Prover,
        tree: *const ErgoTree,
        ctx: *const Context,
        message: []const u8,
        hints: ?*const HintsBag,
    ) !ProverResult {
        // Reduce to crypto
        const reduction = try reduceToCrypto(tree, ctx);

        return switch (reduction.sigma_prop) {
            .trivial_prop => |b| if (b)
                ProverResult.empty()
            else
                error.ReducedToFalse,

            else => |sb| blk: {
                // Generate proof using sigma protocol
                const proof = try self.generateProof(sb, message, hints);
                break :blk proof;
            },
        };
    }

    /// Add secret to prover
    pub fn appendSecret(self: *Prover, secret: PrivateInput) void {
        self.secrets = append(self.secrets, secret);
    }

    /// Get public images of all secrets
    pub fn publicImages(self: *const Prover) []SigmaBoolean {
        var result: []SigmaBoolean = &.{};
        for (self.secrets) |secret| {
            result = append(result, secret.publicImage());
        }
        return result;
    }
};

ProverResult

Proof output with context extension[9][10]:

const ProverResult = struct {
    /// Serialized proof bytes
    proof: ProofBytes,
    /// User-defined context variables
    extension: ContextExtension,

    pub fn empty() ProverResult {
        return .{
            .proof = ProofBytes.empty(),
            .extension = ContextExtension.empty(),
        };
    }
};

/// Proof bytes (empty for trivial proofs)
const ProofBytes = union(enum) {
    empty: void,
    some: []const u8,

    pub fn isEmpty(self: ProofBytes) bool {
        return self == .empty;
    }

    pub fn bytes(self: ProofBytes) []const u8 {
        return switch (self) {
            .empty => &.{},
            .some => |b| b,
        };
    }
};

Wallet

The Wallet wraps a prover for transaction signing[11][12]:

const Wallet = struct {
    /// Underlying prover, owned by value (a pointer to a temporary
    /// Prover would dangle once fromSecrets returns)
    prover: Prover,

    /// Create from mnemonic phrase
    pub fn fromMnemonic(
        phrase: []const u8,
        password: []const u8,
    ) !Wallet {
        const seed = Mnemonic.toSeed(phrase, password);
        const ext_sk = try ExtSecretKey.deriveMaster(seed);
        return Wallet.fromSecrets(&.{ext_sk.secretKey()});
    }

    /// Create from secret keys
    pub fn fromSecrets(secrets: []const SecretKey) Wallet {
        var private_inputs: []PrivateInput = &.{};
        for (secrets) |sk| {
            private_inputs = append(private_inputs, PrivateInput.from(sk));
        }
        return .{
            .prover = Prover{ .secrets = private_inputs },
        };
    }

    /// Add secret to wallet
    pub fn addSecret(self: *Wallet, secret: SecretKey) void {
        self.prover.appendSecret(PrivateInput.from(secret));
    }

    /// Sign a transaction
    pub fn signTransaction(
        self: *const Wallet,
        tx_context: *const TransactionContext,
        state_context: *const ErgoStateContext,
        tx_hints: ?*const TransactionHintsBag,
    ) !Transaction {
        return signTransactionImpl(
            &self.prover,
            tx_context,
            state_context,
            tx_hints,
        );
    }

    /// Sign a reduced transaction
    pub fn signReducedTransaction(
        self: *const Wallet,
        reduced_tx: *const ReducedTransaction,
        tx_hints: ?*const TransactionHintsBag,
    ) !Transaction {
        return signReducedTransactionImpl(
            &self.prover,
            reduced_tx,
            tx_hints,
        );
    }
};

Transaction Signing

Sign all inputs, accumulating costs[13][14]:

/// Sign transaction, generating proofs for all inputs
pub fn signTransaction(
    prover: *const Prover,
    tx_context: *const TransactionContext,
    state_context: *const ErgoStateContext,
    tx_hints: ?*const TransactionHintsBag,
) !Transaction {
    const tx = tx_context.spending_tx;
    const message = try tx.bytesToSign();

    // Build context for first input
    var ctx = try makeContext(state_context, tx_context, 0);

    // Sign each input
    var inputs: []Input = &.{};
    for (tx.inputs(), 0..) |unsigned_input, idx| {
        if (idx > 0) {
            try updateContext(&ctx, tx_context, idx);
        }

        // Get hints for this input
        const hints = if (tx_hints) |h| h.allHintsForInput(idx) else null;

        // Generate proof
        const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
            return error.InputBoxNotFound;

        const prover_result = try prover.prove(
            &input_box.ergo_tree,
            &ctx,
            message,
            hints,
        );

        inputs = append(inputs, Input{
            .box_id = unsigned_input.box_id,
            .spending_proof = prover_result,
        });
    }

    return Transaction{
        .inputs = inputs,
        .data_inputs = tx.data_inputs,
        .output_candidates = tx.output_candidates,
    };
}

/// Create evaluation context for input
pub fn makeContext(
    state_ctx: *const ErgoStateContext,
    tx_ctx: *const TransactionContext,
    self_index: usize,
) !Context {
    const self_box = tx_ctx.getInputBox(
        tx_ctx.spending_tx.inputs()[self_index].box_id,
    ) orelse return error.InputBoxNotFound;

    return Context{
        .height = state_ctx.pre_header.height,
        .self_box = self_box,
        .outputs = tx_ctx.spending_tx.outputs(),
        .inputs = tx_ctx.inputBoxes(),
        .data_inputs = tx_ctx.dataBoxes(),
        .pre_header = state_ctx.pre_header,
        .headers = state_ctx.headers,
        .extension = tx_ctx.spending_tx.contextExtension(self_index),
    };
}

Transaction Hints Bag

Hints for multi-party signing protocols[15][16]:

const TransactionHintsBag = struct {
    /// Secret hints by input index
    secret_hints: std.AutoHashMap(usize, HintsBag),
    /// Public hints (commitments) by input index
    public_hints: std.AutoHashMap(usize, HintsBag),

    pub fn empty() TransactionHintsBag {
        return .{
            .secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
            .public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
        };
    }

    /// Replace all hints for an input
    pub fn replaceHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
        var public_hints: []Hint = &.{};
        var secret_hints: []Hint = &.{};

        for (hints.hints) |hint| {
            switch (hint) {
                .commitment_hint => public_hints = append(public_hints, hint),
                .secret_proven => secret_hints = append(secret_hints, hint),
            }
        }

        self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
        self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
    }

    /// Add hints for an input (appending to existing)
    pub fn addHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
        // Get existing or empty
        var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
        var existing_public = self.public_hints.get(index) orelse HintsBag.empty();

        for (hints.hints) |hint| {
            switch (hint) {
                .commitment_hint => existing_public.hints = append(existing_public.hints, hint),
                .secret_proven => existing_secret.hints = append(existing_secret.hints, hint),
            }
        }

        self.secret_hints.put(index, existing_secret) catch {};
        self.public_hints.put(index, existing_public) catch {};
    }

    /// Get all hints for input
    pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
        var hints: []Hint = &.{};

        if (self.secret_hints.get(index)) |bag| {
            for (bag.hints) |h| hints = append(hints, h);
        }
        if (self.public_hints.get(index)) |bag| {
            for (bag.hints) |h| hints = append(hints, h);
        }

        return HintsBag{ .hints = hints };
    }
};

Commitment Generation

Generate first-round commitments for distributed signing[17][18]:

/// Generate commitments for transaction inputs
pub fn generateCommitments(
    wallet: *const Wallet,
    tx_context: *const TransactionContext,
    state_context: *const ErgoStateContext,
) !TransactionHintsBag {
    // Get public keys from wallet secrets
    var public_keys: []SigmaBoolean = &.{};
    for (wallet.prover.secrets) |secret| {
        public_keys = append(public_keys, secret.publicImage());
    }

    var hints_bag = TransactionHintsBag.empty();

    for (tx_context.spending_tx.inputs(), 0..) |_, idx| {
        var ctx = try makeContext(state_context, tx_context, idx);

        const input_box = tx_context.inputBoxes()[idx];
        const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);

        // Generate commitments for this sigma proposition
        const input_hints = generateCommitmentsFor(
            &reduction.sigma_prop,
            public_keys,
        );
        hints_bag.addHintsForInput(idx, input_hints);
    }

    return hints_bag;
}

Storage Rent (Ergo-Specific)

Boxes expire after ~4 years and can then be spent by anyone[19]:

Storage Rent Rules
─────────────────────────────────────────────────────

Period: 1,051,200 blocks ≈ 4 years (at 2 min/block)

Expired Box Spending:
┌─────────────────────────────────────────────────────┐
│ IF:                                                 │
│   current_height - box.creation_height >= 1,051,200 │
│   AND proof.isEmpty()                               │
│   AND extension.contains(STORAGE_INDEX_VAR)         │
│ THEN:                                               │
│   Check recreation rules instead of script          │
└─────────────────────────────────────────────────────┘

Recreation Rules:
┌─────────────────────────────────────────────────────┐
│ output.creation_height == current_height            │
│ output.value >= box.value - storage_fee             │
│ output.R1 == box.R1  (script preserved)             │
│ output.R2 == box.R2  (tokens preserved)             │
│ output.R4-R9 == box.R4-R9  (registers preserved)    │
│                                                     │
│ storage_fee = storage_fee_factor * box.bytes.len    │
└─────────────────────────────────────────────────────┘
const StorageConstants = struct {
    /// Storage period in blocks (~4 years at 2 min/block)
    pub const STORAGE_PERIOD: u32 = 1_051_200;
    /// Context extension variable ID for storage index
    pub const STORAGE_INDEX_VAR_ID: u8 = 127;
    /// Fixed cost for storage contract evaluation
    pub const STORAGE_CONTRACT_COST: u64 = 50;
};

/// Check if expired box spending is valid
pub fn checkExpiredBox(
    box: *const ErgoBox,
    output: *const ErgoBoxCandidate,
    current_height: u32,
    storage_fee_factor: u64,
) bool {
    // Calculate storage fee
    const storage_fee = storage_fee_factor * box.serializedSize();

    // If box value <= fee, it's "dust" - always allowed
    if (box.value.as_i64() - @as(i64, @intCast(storage_fee)) <= 0) {
        return true;
    }

    // Check recreation rules
    const correct_height = output.creation_height == current_height;
    const correct_value = output.value.as_i64() >= box.value.as_i64() - @as(i64, @intCast(storage_fee));
    const correct_registers = checkRegistersPreserved(box, output);

    return correct_height and correct_value and correct_registers;
}

fn checkRegistersPreserved(box: *const ErgoBox, output: *const ErgoBoxCandidate) bool {
    // R0 (value) and R3 (reference) can change
    // R1 (script), R2 (tokens), R4-R9 must be preserved
    return eql(box.ergo_tree, output.ergo_tree) and
        eql(box.tokens, output.tokens) and
        eql(box.additional_registers, output.additional_registers);
}

Signing Errors

const TxSigningError = error{
    /// Transaction context invalid
    TransactionContextError,
    /// Prover failed on input
    ProverError,
    /// Serialization failed
    SerializationError,
    /// Signature parsing failed
    SigParsingError,
};

const ProverError = error{
    /// ErgoTree parsing failed
    ErgoTreeError,
    /// Evaluation failed
    EvalError,
    /// Script reduced to false
    ReducedToFalse,
    /// Missing witness for proof
    TreeRootIsNotReal,
    /// Secret not found for leaf
    SecretNotFound,
    /// Simulated leaf needs challenge
    SimulatedLeafWithoutChallenge,
};

Cost Tracking

Transaction costs are accumulated across inputs [20]:

const TxCostComponents = struct {
    /// Interpreter initialization (once per tx)
    pub const INTERPRETER_INIT_COST: u64 = 10_000;

    /// Calculate total transaction cost
    pub fn calculateInitialCost(
        params: *const BlockchainParameters,
        inputs_count: usize,
        data_inputs_count: usize,
        outputs_count: usize,
        token_access_cost: u64,
    ) u64 {
        return INTERPRETER_INIT_COST +
            inputs_count * params.input_cost +
            data_inputs_count * params.data_input_cost +
            outputs_count * params.output_cost +
            token_access_cost;
    }
};

Deterministic Signing

For platforms without secure random [21][22]:

/// Generate deterministic nonce from secret and message
/// Used when secure random is unavailable
pub fn generateDeterministicCommitments(
    wallet: *const Wallet,
    reduced_tx: *const ReducedTransaction,
    aux_rand: []const u8,
) !TransactionHintsBag {
    var hints_bag = TransactionHintsBag.empty();
    const message = try reduced_tx.unsigned_tx.bytesToSign();

    for (reduced_tx.reduced_inputs(), 0..) |input, idx| {
        // Deterministic nonce: H(secret || message || aux_rand)
        if (generateDeterministicCommitmentsFor(
            wallet.prover,
            &input.sigma_prop,
            message,
            aux_rand,
        )) |bag| {
            hints_bag.addHintsForInput(idx, bag);
        }
    }

    return hints_bag;
}

Summary

  • Verifier evaluates script, verifies sigma protocol proof
  • Prover reduces to SigmaBoolean, generates proof using secrets
  • Wallet wraps prover with transaction-level signing API
  • TransactionHintsBag coordinates multi-party signing
  • Storage rent allows expired boxes (~4 years) to be spent by anyone
  • Deterministic signing available for platforms without secure random
  • Cost accumulates across inputs with initial overhead

Next: Chapter 24: Transaction Validation

[2] Rust: eval.rs:1-50
[3] Scala: Interpreter.scala (reduce)
[5] Scala: Interpreter.scala (verify)
[12] Rust: wallet.rs:52-94
[15] Scala: HintsBag.scala

Chapter 24: Transaction Validation

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the two-phase validation pipeline (stateless then stateful)
  • Implement stateless validation rules (input/output counts, no duplicates)
  • Perform stateful validation with cost accumulation
  • Verify ERG and token preservation across transaction inputs and outputs

Validation Pipeline

Transaction validation occurs in two phases [1][2]:

Transaction Validation Pipeline
─────────────────────────────────────────────────────

┌─────────────────────────────────────────────────────┐
│           STATELESS VALIDATION                      │
│   (No blockchain state required)                    │
├─────────────────────────────────────────────────────┤
│  • Has inputs?         (at least 1)                 │
│  • Has outputs?        (at least 1)                 │
│  • Count limits        (≤ 32,767 each)              │
│  • No negative values  (outputs ≥ 0)                │
│  • Output sum valid    (no overflow)                │
│  • Unique inputs       (no double-spend)            │
└──────────────────────────┬──────────────────────────┘
                           │ Pass
                           ▼
┌─────────────────────────────────────────────────────┐
│           STATEFUL VALIDATION                       │
│   (Requires UTXO state and blockchain context)      │
├─────────────────────────────────────────────────────┤
│  1. Calculate initial cost                          │
│  2. Verify outputs (dust, height, size)             │
│  3. Check ERG preservation                          │
│  4. Verify asset preservation                       │
│  5. Verify input scripts (accumulate cost)          │
│  6. Check re-emission rules (EIP-27)                │
└─────────────────────────────────────────────────────┘

Transaction Structure

const Transaction = struct {
    /// Transaction ID (Blake2b256 of serialized tx without proofs)
    tx_id: TxId,
    /// Input boxes to spend (with proofs)
    inputs: TxIoVec(Input),
    /// Read-only input references (no proofs)
    data_inputs: ?TxIoVec(DataInput),
    /// Output box candidates
    output_candidates: TxIoVec(ErgoBoxCandidate),
    /// Materialized outputs (with tx_id and index)
    outputs: TxIoVec(ErgoBox),

    pub const MAX_OUTPUTS_COUNT: usize = std.math.maxInt(i16); // 32,767

    pub fn init(
        inputs: TxIoVec(Input),
        data_inputs: ?TxIoVec(DataInput),
        output_candidates: TxIoVec(ErgoBoxCandidate),
    ) !Transaction {
        // First pass: compute outputs with zero tx_id
        const zero_outputs = try output_candidates.mapIndexed(
            struct {
                fn f(idx: usize, bc: *const ErgoBoxCandidate) !ErgoBox {
                    return ErgoBox.fromBoxCandidate(bc, TxId.zero(), @intCast(idx));
                }
            }.f,
        );

        var tx = Transaction{
            .tx_id = TxId.zero(),
            .inputs = inputs,
            .data_inputs = data_inputs,
            .output_candidates = output_candidates,
            .outputs = zero_outputs,
        };

        // Compute actual tx_id
        tx.tx_id = try tx.calcTxId();

        // Update outputs with the correct tx_id. A plain loop is used
        // here because Zig's nested functions cannot capture `tx.tx_id`
        // from the enclosing scope.
        for (tx.outputs.items()) |*out| {
            out.tx_id = tx.tx_id;
        }

        return tx;
    }
};

Validation Error Types

const TxValidationError = error{
    /// Output ERG sum overflow
    OutputSumOverflow,
    /// Input ERG sum overflow
    InputSumOverflow,
    /// Same box spent twice
    DoubleSpend,
    /// ERG not preserved (inputs != outputs)
    ErgPreservationError,
    /// Token amounts not preserved
    TokenPreservationError,
    /// Output below dust threshold
    DustOutput,
    /// Creation height > current height
    InvalidHeightError,
    /// Creation height < max input height (v3+)
    MonotonicHeightError,
    /// Negative creation height (v1+)
    NegativeHeight,
    /// Box exceeds 4KB limit
    BoxSizeExceeded,
    /// Script exceeds size limit
    ScriptSizeExceeded,
    /// Script verification failed
    ReducedToFalse,
    /// Verifier error
    VerifierError,
    /// Accumulated cost exceeds block limit
    CostExceeded,
};

Stateless Validation

Checks that don't require blockchain state [3][4]:

/// Validate transaction structure without blockchain state
pub fn validateStateless(tx: *const Transaction) TxValidationError!void {
    // BoundedVec ensures 1 ≤ count ≤ 32,767, and BoxValue rejects negative
    // values at construction, so only sum overflow and input uniqueness
    // need explicit checks here

    // Check output sum doesn't overflow
    var output_sum: i64 = 0;
    for (tx.outputs.items()) |out| {
        output_sum = std.math.add(i64, output_sum, out.value.as_i64()) catch
            return error.OutputSumOverflow;
    }

    // Check no double-spend (unique inputs)
    var seen = std.AutoHashMap(BoxId, void).init(allocator);
    defer seen.deinit();

    for (tx.inputs.items()) |input| {
        // getOrPut can fail on allocation; that error is elided from the
        // fixed error set, so this sketch treats it as unreachable
        const result = seen.getOrPut(input.box_id) catch unreachable;
        if (result.found_existing) {
            return error.DoubleSpend;
        }
    }
}

Stateless Rules Table

Stateless Validation Rules
─────────────────────────────────────────────────────
Rule              Check                    Limit
─────────────────────────────────────────────────────
txNoInputs        inputs.len >= 1          min 1
txNoOutputs       outputs.len >= 1         min 1
txManyInputs      inputs.len <= MAX        32,767
txManyDataInputs  data_inputs.len <= MAX   32,767
txManyOutputs     outputs.len <= MAX       32,767
txNegativeOutput  all outputs >= 0         -
txOutputSum       sum(outputs) no overflow -
txInputsUnique    no duplicate box_ids     -
─────────────────────────────────────────────────────
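The same rules can be sketched language-neutrally. The following Rust fragment (the `Tx` shape and names are hypothetical, not the sigma-rust API) shows how the overflow and uniqueness checks compose:

```rust
use std::collections::HashSet;

/// Per the table above: counts are bounded by i16::MAX.
const MAX_COUNT: usize = 32_767;

/// Minimal hypothetical transaction shape for illustration.
struct Tx {
    input_ids: Vec<[u8; 32]>,
    output_values: Vec<i64>,
}

#[derive(Debug, PartialEq)]
enum StatelessError {
    NoInputs,
    NoOutputs,
    TooMany,
    NegativeOutput,
    OutputSumOverflow,
    DoubleSpend,
}

fn validate_stateless(tx: &Tx) -> Result<(), StatelessError> {
    if tx.input_ids.is_empty() { return Err(StatelessError::NoInputs); }
    if tx.output_values.is_empty() { return Err(StatelessError::NoOutputs); }
    if tx.input_ids.len() > MAX_COUNT || tx.output_values.len() > MAX_COUNT {
        return Err(StatelessError::TooMany);
    }
    // Sum outputs with overflow detection.
    let mut sum: i64 = 0;
    for &v in &tx.output_values {
        if v < 0 { return Err(StatelessError::NegativeOutput); }
        sum = sum.checked_add(v).ok_or(StatelessError::OutputSumOverflow)?;
    }
    // Unique inputs: a repeated box id is a double-spend.
    let mut seen = HashSet::new();
    for id in &tx.input_ids {
        if !seen.insert(id) { return Err(StatelessError::DoubleSpend); }
    }
    Ok(())
}
```

A transaction spending the same box id twice fails with `DoubleSpend` before any blockchain state is consulted.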

Stateful Validation

Requires UTXO state and blockchain context [5][6]:

/// Validate transaction against blockchain state
pub fn validateStateful(
    tx: *const Transaction,
    boxes_to_spend: []const ErgoBox,
    data_boxes: []const ErgoBox,
    state_context: *const ErgoStateContext,
    accumulated_cost: u64,
    verifier: *const Verifier,
) TxValidationError!u64 {
    const params = state_context.current_parameters;
    const max_cost = params.max_block_cost;

    // 1. Calculate initial cost
    const initial_cost = calculateInitialCost(
        tx,
        boxes_to_spend.len,
        data_boxes.len,
        params,
    );
    var current_cost = accumulated_cost + initial_cost;

    if (current_cost > max_cost) {
        return error.CostExceeded;
    }

    // 2. Verify outputs
    const max_input_height = maxCreationHeight(boxes_to_spend);
    for (tx.outputs.items()) |out| {
        try verifyOutput(out, state_context, max_input_height);
    }

    // 3. Check ERG preservation (inputs must equal outputs exactly)
    const input_sum = try sumValues(boxes_to_spend);
    const output_sum = try sumValues(tx.outputs.items());
    if (input_sum != output_sum) {
        return error.ErgPreservationError;
    }

    // 4. Verify asset preservation
    current_cost = try verifyAssets(
        tx,
        boxes_to_spend,
        state_context,
        current_cost,
    );

    // 5. Verify each input script
    for (boxes_to_spend, 0..) |box, idx| {
        current_cost = try verifyInput(
            tx,
            boxes_to_spend,
            data_boxes,
            box,
            @intCast(idx),
            state_context,
            current_cost,
            verifier,
        );
    }

    return current_cost;
}

Initial Cost Calculation

Transaction cost starts with fixed overhead [7][8]:

const CostConstants = struct {
    pub const INTERPRETER_INIT_COST: u64 = 10_000;
};

pub fn calculateInitialCost(
    tx: *const Transaction,
    inputs_count: usize,
    data_inputs_count: usize,
    params: *const BlockchainParameters,
) u64 {
    return CostConstants.INTERPRETER_INIT_COST +
        inputs_count * params.input_cost +
        data_inputs_count * params.data_input_cost +
        tx.outputs.len() * params.output_cost;
}
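With the default parameters listed in Chapter 25 (inputCost 2,000, dataInputCost 100, outputCost 100), the overhead is easy to work out by hand. A small Rust sketch of the same formula:

```rust
/// Fixed interpreter initialization cost, charged once per transaction.
const INTERPRETER_INIT_COST: u64 = 10_000;

/// Initial (pre-script) cost: fixed overhead plus per-item charges.
fn initial_cost(
    inputs: u64, data_inputs: u64, outputs: u64,
    input_cost: u64, data_input_cost: u64, output_cost: u64,
) -> u64 {
    INTERPRETER_INIT_COST
        + inputs * input_cost
        + data_inputs * data_input_cost
        + outputs * output_cost
}
```

For a transaction with 2 inputs, 1 data input, and 3 outputs at the defaults: 10,000 + 4,000 + 100 + 300 = 14,400.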

Output Verification

Each output must pass structural checks [9][10]:

pub fn verifyOutput(
    out: *const ErgoBox,
    state_context: *const ErgoStateContext,
    max_input_height: u32,
) TxValidationError!void {
    const params = state_context.current_parameters;
    const block_version = state_context.block_version;
    const current_height = state_context.current_height;

    // Dust check: value >= minimum for box size
    const min_value = BoxUtils.minimalErgoAmount(out, params);
    if (out.value.as_u64() < min_value) {
        return error.DustOutput;
    }

    // Future check: creation height <= current height
    if (out.creation_height > current_height) {
        return error.InvalidHeightError;
    }

    // Non-negative height (after v1): creation_height is serialized as a
    // signed 32-bit value, so reject anything with the sign bit set
    if (block_version > 1 and out.creation_height > std.math.maxInt(i32)) {
        return error.NegativeHeight;
    }

    // Monotonic height (after v3): output height >= max input height
    if (block_version >= 3 and out.creation_height < max_input_height) {
        return error.MonotonicHeightError;
    }

    // Size limits
    if (out.serializedSize() > ErgoBox.MAX_BOX_SIZE) {
        return error.BoxSizeExceeded;
    }
    if (out.propositionBytes().len > ErgoBox.MAX_SCRIPT_SIZE) {
        return error.ScriptSizeExceeded;
    }
}

Asset Verification

Token preservation rules [11][12]:

pub fn verifyAssets(
    tx: *const Transaction,
    boxes_to_spend: []const ErgoBox,
    state_context: *const ErgoStateContext,
    current_cost: u64,
) TxValidationError!u64 {
    // Extract input assets
    var in_assets = std.AutoHashMap(TokenId, u64).init(allocator);
    defer in_assets.deinit();

    for (boxes_to_spend) |box| {
        if (box.tokens) |tokens| {
            for (tokens.items()) |token| {
                const entry = in_assets.getOrPut(token.token_id) catch unreachable; // allocation elided
                if (entry.found_existing) {
                    entry.value_ptr.* += token.amount.value;
                } else {
                    entry.value_ptr.* = token.amount.value;
                }
            }
        }
    }

    // Extract output assets
    var out_assets = std.AutoHashMap(TokenId, u64).init(allocator);
    defer out_assets.deinit();

    for (tx.outputs.items()) |out| {
        if (out.tokens) |tokens| {
            for (tokens.items()) |token| {
                const entry = out_assets.getOrPut(token.token_id) catch unreachable; // allocation elided
                if (entry.found_existing) {
                    entry.value_ptr.* += token.amount.value;
                } else {
                    entry.value_ptr.* = token.amount.value;
                }
            }
        }
    }

    // First input box ID can mint new tokens
    const new_token_id = TokenId{ .digest = tx.inputs.items()[0].box_id.digest };

    // Verify each output token
    var iter = out_assets.iterator();
    while (iter.next()) |entry| {
        const out_id = entry.key_ptr.*;
        const out_amount = entry.value_ptr.*;

        const in_amount = in_assets.get(out_id) orelse 0;

        // Output amount may exceed input amount only when minting the
        // new token (whose id is the first input's box id); a positive
        // excess of any other token breaks preservation
        if (out_amount > in_amount) {
            if (!std.mem.eql(u8, &out_id.digest, &new_token_id.digest)) {
                return error.TokenPreservationError;
            }
        }
    }

    // Add token access cost
    const token_access_cost = calculateTokenAccessCost(
        in_assets.count(),
        out_assets.count(),
        state_context.current_parameters.token_access_cost,
    );

    return current_cost + token_access_cost;
}
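`calculateTokenAccessCost` is referenced above but not shown. A plausible simplification, charging the per-access parameter once per distinct token id on each side (the reference implementation's formula also counts token occurrences, so treat this strictly as an illustration):

```rust
/// Simplified token access cost: per-access parameter times the number
/// of distinct token ids across inputs and outputs. The authoritative
/// formula lives in the reference node implementation.
fn token_access_cost(in_distinct: u64, out_distinct: u64, per_access: u64) -> u64 {
    (in_distinct + out_distinct) * per_access
}
```

With 2 distinct input tokens, 3 distinct output tokens, and the default tokenAccessCost of 100, this adds 500 to the accumulated cost.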

Input Script Verification

The most expensive step: verifying each input's script [13][14]:

pub fn verifyInput(
    tx: *const Transaction,
    boxes_to_spend: []const ErgoBox,
    data_boxes: []const ErgoBox,
    box: *const ErgoBox,
    input_index: u16,
    state_context: *const ErgoStateContext,
    current_cost: u64,
    verifier: *const Verifier,
) TxValidationError!u64 {
    const max_cost = state_context.current_parameters.max_block_cost;
    const input = tx.inputs.items()[input_index];
    const proof = input.spending_proof;

    // Check for storage rent spending first
    const ctx = try buildContext(
        tx,
        boxes_to_spend,
        data_boxes,
        input_index,
        state_context,
        max_cost - current_cost,
    );

    if (trySpendStorageRent(&input, box, state_context, &ctx)) |_| {
        // Storage rent conditions satisfied, skip script verification
        return current_cost + StorageConstants.STORAGE_CONTRACT_COST;
    }

    // Normal script verification
    const result = verifier.verify(
        &box.ergo_tree,
        &ctx,
        proof.proof,
        tx.messageToSign(),
    ) catch return error.VerifierError;

    if (!result.result) {
        return error.ReducedToFalse;
    }

    const new_cost = current_cost + result.cost;
    if (new_cost > max_cost) {
        return error.CostExceeded;
    }

    return new_cost;
}

Context Construction

Build the evaluation context for input verification [15][16]:

pub fn buildContext(
    tx: *const Transaction,
    boxes_to_spend: []const ErgoBox,
    data_boxes: []const ErgoBox,
    input_index: u16,
    state_context: *const ErgoStateContext,
    cost_limit: u64,
) !Context {
    return Context{
        .height = state_context.pre_header.height,
        .self_box = &boxes_to_spend[input_index],
        .inputs = boxes_to_spend,
        .data_inputs = data_boxes,
        .outputs = tx.outputs.items(),
        .pre_header = &state_context.pre_header,
        .headers = state_context.headers,
        .extension = tx.contextExtension(input_index),
        .cost_limit = cost_limit,
        .tree_version = @intCast(state_context.block_version - 1),
    };
}

Storage Rent Spending

Expired boxes can be spent without script verification [17][18]:

const StorageConstants = struct {
    /// Blocks before box is eligible (~4 years)
    pub const STORAGE_PERIOD: u32 = 1_051_200;
    /// Context extension key for output index
    pub const STORAGE_EXTENSION_INDEX: u8 = 127;
    /// Cost for storage rent verification
    pub const STORAGE_CONTRACT_COST: u64 = 50;
};

pub fn trySpendStorageRent(
    input: *const Input,
    input_box: *const ErgoBox,
    state_context: *const ErgoStateContext,
    ctx: *const Context,
) ?void {
    // Must have empty proof
    if (!input.spending_proof.proof.isEmpty()) return null;

    return checkStorageRentConditions(input_box, state_context, ctx);
}

pub fn checkStorageRentConditions(
    input_box: *const ErgoBox,
    state_context: *const ErgoStateContext,
    ctx: *const Context,
) ?void {
    // Check time elapsed
    const age = ctx.pre_header.height - ctx.self_box.creation_height;
    if (age < StorageConstants.STORAGE_PERIOD) return null;

    // Get output index from context extension
    const output_idx_value = ctx.extension.values.get(
        StorageConstants.STORAGE_EXTENSION_INDEX,
    ) orelse return null;
    const output_idx = output_idx_value.extractAs(i16) orelse return null;
    if (output_idx < 0) return null;
    const idx: usize = @intCast(output_idx);
    if (idx >= ctx.outputs.len) return null;

    const output = ctx.outputs[idx];

    // Calculate storage fee
    const storage_fee = input_box.serializedSize() *
        state_context.current_parameters.storage_fee_factor;

    // Dust boxes can always be spent
    if (ctx.self_box.value.as_u64() <= storage_fee) return {};

    // Verify recreation rules
    if (output.creation_height != state_context.pre_header.height) return null;
    if (output.value.as_u64() < ctx.self_box.value.as_u64() - storage_fee) return null;

    // Registers must be preserved (except R0 value and R3 creation info)
    for (0..10) |i| {
        const reg_id = RegisterId.fromByte(@intCast(i));
        if (reg_id == .r0 or reg_id == .r3) continue;
        if (!std.meta.eql(
            ctx.self_box.getRegister(reg_id),
            output.getRegister(reg_id),
        )) return null;
    }

    return {};
}
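The value rule above has two cases worth separating: dust boxes (value not above the fee) are spendable outright, while any other expired box must be recreated carrying at least the remaining value. A hedged Rust sketch (names hypothetical):

```rust
/// Minimum value the recreated output must carry when spending an
/// expired box, or None when the box is dust (value <= fee) and can be
/// spent without recreation.
fn min_recreated_value(box_value: u64, storage_fee: u64) -> Option<u64> {
    if box_value <= storage_fee {
        None // dust: no recreation required
    } else {
        Some(box_value - storage_fee)
    }
}
```

For a 1 ERG box (1,000,000,000 nanoERG) with a 131,250,000 nanoERG fee, the recreated output must carry at least 868,750,000 nanoERG.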

Cost Accumulation Flow

Cost Accumulation
─────────────────────────────────────────────────────

Block accumulated cost (from previous txs)
    │
    ├── + INTERPRETER_INIT_COST  (10,000)
    ├── + inputs.len × inputCost
    ├── + data_inputs.len × dataInputCost
    ├── + outputs.len × outputCost
    │
    ▼
startCost
    │
    ├── Input[0] script → + scriptCost₀
    ├── Input[1] script → + scriptCost₁
    ├── ...
    ├── Input[n] script → + scriptCostₙ
    │
    ├── Token access → + tokenAccessCost
    │
    ▼
finalCost ≤ maxBlockCost

Each input verification receives remaining budget:
  ctx.cost_limit = maxBlockCost - current_cost
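The flow above is a fold over per-input script costs with a running budget check. A minimal Rust sketch (illustrative, not a node API):

```rust
/// Accumulate script costs onto the start cost, failing as soon as the
/// running total exceeds the block budget.
fn accumulate(
    start_cost: u64,
    script_costs: &[u64],
    max_block_cost: u64,
) -> Result<u64, &'static str> {
    let mut cost = start_cost;
    for &c in script_costs {
        cost = cost.checked_add(c).ok_or("cost overflow")?;
        if cost > max_block_cost {
            return Err("cost exceeded");
        }
    }
    Ok(cost)
}
```

Starting from the 14,400 initial-cost example with two scripts costing 30,000 and 45,000, the final cost is 89,400, well under the default 1,000,000 limit.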

Validation Rules Summary

Validation Rules Reference
─────────────────────────────────────────────────────
ID    Name                    Phase       Description
─────────────────────────────────────────────────────
100   txNoInputs              Stateless   ≥1 input
101   txNoOutputs             Stateless   ≥1 output
102   txManyInputs            Stateless   ≤32,767
103   txManyDataInputs        Stateless   ≤32,767
104   txManyOutputs           Stateless   ≤32,767
105   txNegativeOutput        Stateless   values ≥ 0
106   txOutputSum             Stateless   no overflow
107   txInputsUnique          Stateless   no duplicates
─────────────────────────────────────────────────────
120   txScriptValidation      Stateful    scripts pass
121   bsBlockTransactionsCost Stateful    cost in limit
122   txDust                  Stateful    min value
123   txFuture                Stateful    valid height
124   txErgPreservation       Stateful    inputs == outputs
125   txAssetsPreservation    Stateful    tokens balanced
126   txBoxSize               Stateful    ≤4KB
127   txReemission            Stateful    EIP-27 rules
─────────────────────────────────────────────────────

Complete Validation Flow

/// Full transaction validation
pub fn validateTransaction(
    tx: *const Transaction,
    utxo_state: *const UtxoState,
    state_context: *const ErgoStateContext,
    verifier: *const Verifier,
    accumulated_cost: u64,
) !u64 {
    // Phase 1: Stateless validation
    try validateStateless(tx);

    // Phase 2: Resolve input boxes
    var boxes_to_spend: []ErgoBox = &.{};
    for (tx.inputs.items()) |input| {
        const box = utxo_state.boxById(input.box_id) orelse
            return error.InputBoxNotFound;
        boxes_to_spend = append(boxes_to_spend, box);
    }

    // Phase 3: Resolve data input boxes
    var data_boxes: []ErgoBox = &.{};
    if (tx.data_inputs) |data_inputs| {
        for (data_inputs.items()) |data_input| {
            const box = utxo_state.boxById(data_input.box_id) orelse
                return error.DataInputBoxNotFound;
            data_boxes = append(data_boxes, box);
        }
    }

    // Phase 4: Stateful validation
    return validateStateful(
        tx,
        boxes_to_spend,
        data_boxes,
        state_context,
        accumulated_cost,
        verifier,
    );
}

Summary

  • Two-phase validation: Stateless (structural) then stateful (UTXO-dependent)
  • Stateless: Count limits, no negatives, no overflow, unique inputs
  • Stateful: Cost tracking, output checks, preservation rules, script verification
  • Cost accumulation: Tracks across inputs, bounded by maxBlockCost
  • Storage rent: Expired boxes (~4 years) spendable by anyone with recreation
  • Asset preservation: ERG exactly preserved (inputs == outputs), tokens can only decrease (or mint new)

Next: Chapter 25: Cost Limits and Parameters

Chapter 25: Cost Limits and Parameters

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

  • Chapter 24 for how cost limits are enforced during validation
  • Chapter 13 for JitCost and operation costs

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain Ergo's adjustable blockchain parameters and their governance
  • Describe the miner voting mechanism for parameter changes
  • Work with cost-related parameters and their default values
  • Configure validation rules and soft-fork settings

Parameter System

Ergo's blockchain parameters are adjustable through miner voting [1][2]:

Parameter Governance
─────────────────────────────────────────────────────

┌─────────────────────────────────────────────────────┐
│                  Parameters                         │
├─────────────────────────────────────────────────────┤
│  parameters_table: HashMap<Parameter, i32>          │
│  proposed_update: ValidationSettingsUpdate          │
│  height: u32                                        │
└─────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│              Parameter Types                        │
├─────────────────────────────────────────────────────┤
│  Cost:     maxBlockCost, inputCost, outputCost...   │
│  Size:     maxBlockSize, minValuePerByte            │
│  Fee:      storageFeeFactor                         │
│  Version:  blockVersion                             │
└─────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│              Voting Mechanism                       │
├─────────────────────────────────────────────────────┤
│  Miners include votes in block headers              │
│  Votes tallied over epochs (1024 blocks)            │
│  Majority (>= 90%) activates change                 │
│  Each param has min/max bounds and step size        │
└─────────────────────────────────────────────────────┘
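The threshold arithmetic is worth making concrete: at a 90% requirement over a 1024-block epoch, a change needs 921 supporting votes (integer arithmetic, rounding down). A small Rust sketch:

```rust
/// Blocks per voting epoch, per the diagram above.
const EPOCH_LENGTH: u32 = 1024;

/// Number of votes needed at a given approval percentage
/// (integer arithmetic, rounding down).
fn approval_threshold(ratio_percent: u32) -> u32 {
    EPOCH_LENGTH * ratio_percent / 100
}
```

So 1024 × 90 / 100 = 921 votes within one epoch activate the change.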

Parameter Enum

const Parameter = enum(i8) {
    /// Storage fee factor (per byte per ~4 year storage period)
    storage_fee_factor = 1,
    /// Minimum monetary value per byte of box
    min_value_per_byte = 2,
    /// Maximum block size in bytes
    max_block_size = 3,
    /// Maximum computational cost per block
    max_block_cost = 4,
    /// Cost per token access
    token_access_cost = 5,
    /// Cost per transaction input
    input_cost = 6,
    /// Cost per data input
    data_input_cost = 7,
    /// Cost per transaction output
    output_cost = 8,
    /// Sub-blocks per block (v6+)
    subblocks_per_block = 9,
    /// Soft-fork vote
    soft_fork = 120,
    /// Soft-fork votes collected
    soft_fork_votes = 121,
    /// Soft-fork starting height
    soft_fork_start_height = 122,
    /// Current block version
    block_version = 123,

    /// Negative values indicate decrease vote
    pub fn decreaseVote(self: Parameter) i8 {
        return -@intFromEnum(self);
    }
};
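A vote is a single signed byte: a positive parameter id votes to increase, its negation votes to decrease, and zero is no vote. A decoding sketch in Rust (the enum and function names are hypothetical):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum VoteKind {
    Increase,
    Decrease,
}

/// Decode a vote byte into (parameter id, direction); 0 means no vote.
/// Negation goes through i16 to avoid overflow on i8::MIN.
fn decode_vote(vote: i8) -> Option<(u8, VoteKind)> {
    match vote {
        0 => None,
        v if v > 0 => Some((v as u8, VoteKind::Increase)),
        v => Some(((-(v as i16)) as u8, VoteKind::Decrease)),
    }
}
```

For example, a vote byte of 4 asks to raise maxBlockCost (parameter 4), while −4 asks to lower it.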

Parameters Structure

const Parameters = struct {
    /// Current block height
    height: u32,
    /// Parameter ID -> value mapping
    parameters_table: std.AutoHashMap(Parameter, i32),
    /// Proposed validation settings update
    proposed_update: ValidationSettingsUpdate,

    /// Get block version
    pub fn blockVersion(self: *const Parameters) i32 {
        return self.parameters_table.get(.block_version) orelse 1;
    }

    /// Get max block cost
    pub fn maxBlockCost(self: *const Parameters) i32 {
        return self.parameters_table.get(.max_block_cost) orelse DefaultParams.MAX_BLOCK_COST;
    }

    /// Get input cost
    pub fn inputCost(self: *const Parameters) i32 {
        return self.parameters_table.get(.input_cost) orelse DefaultParams.INPUT_COST;
    }

    /// Get data input cost
    pub fn dataInputCost(self: *const Parameters) i32 {
        return self.parameters_table.get(.data_input_cost) orelse DefaultParams.DATA_INPUT_COST;
    }

    /// Get output cost
    pub fn outputCost(self: *const Parameters) i32 {
        return self.parameters_table.get(.output_cost) orelse DefaultParams.OUTPUT_COST;
    }

    /// Get token access cost
    pub fn tokenAccessCost(self: *const Parameters) i32 {
        return self.parameters_table.get(.token_access_cost) orelse DefaultParams.TOKEN_ACCESS_COST;
    }

    /// Get storage fee factor
    pub fn storageFeeFactor(self: *const Parameters) i32 {
        return self.parameters_table.get(.storage_fee_factor) orelse DefaultParams.STORAGE_FEE_FACTOR;
    }

    /// Get min value per byte
    pub fn minValuePerByte(self: *const Parameters) i32 {
        return self.parameters_table.get(.min_value_per_byte) orelse DefaultParams.MIN_VALUE_PER_BYTE;
    }

    /// Get max block size
    pub fn maxBlockSize(self: *const Parameters) i32 {
        return self.parameters_table.get(.max_block_size) orelse DefaultParams.MAX_BLOCK_SIZE;
    }
};

Default Values

const DefaultParams = struct {
    /// Cost parameters
    pub const MAX_BLOCK_COST: i32 = 1_000_000;
    pub const TOKEN_ACCESS_COST: i32 = 100;
    pub const INPUT_COST: i32 = 2_000;
    pub const DATA_INPUT_COST: i32 = 100;
    pub const OUTPUT_COST: i32 = 100;

    /// Size parameters
    pub const MAX_BLOCK_SIZE: i32 = 512 * 1024; // 512 KB
    pub const MAX_BLOCK_SIZE_MAX: i32 = 1024 * 1024; // 1 MB
    pub const MAX_BLOCK_SIZE_MIN: i32 = 16 * 1024; // 16 KB

    /// Fee parameters
    pub const STORAGE_FEE_FACTOR: i32 = 1_250_000; // 0.00125 ERG per byte per ~4 years
    pub const STORAGE_FEE_FACTOR_MAX: i32 = 2_500_000;
    pub const STORAGE_FEE_FACTOR_MIN: i32 = 0;
    pub const STORAGE_FEE_FACTOR_STEP: i32 = 25_000;

    /// Dust prevention
    pub const MIN_VALUE_PER_BYTE: i32 = 30 * 12; // 360 nanoErgs per byte
    pub const MIN_VALUE_PER_BYTE_MAX: i32 = 10_000;
    pub const MIN_VALUE_PER_BYTE_MIN: i32 = 0;
    pub const MIN_VALUE_PER_BYTE_STEP: i32 = 10;

    /// Sub-blocks (v6+)
    pub const SUBBLOCKS_PER_BLOCK: i32 = 30;
    pub const SUBBLOCKS_PER_BLOCK_MIN: i32 = 2;
    pub const SUBBLOCKS_PER_BLOCK_MAX: i32 = 2048;
    pub const SUBBLOCKS_PER_BLOCK_STEP: i32 = 1;

    /// Interpreter initialization cost
    pub const INTERPRETER_INIT_COST: i32 = 10_000;
};

/// Create default parameters
pub fn defaultParameters() Parameters {
    var table = std.AutoHashMap(Parameter, i32).init(allocator);
    table.put(.storage_fee_factor, DefaultParams.STORAGE_FEE_FACTOR) catch {};
    table.put(.min_value_per_byte, DefaultParams.MIN_VALUE_PER_BYTE) catch {};
    table.put(.token_access_cost, DefaultParams.TOKEN_ACCESS_COST) catch {};
    table.put(.input_cost, DefaultParams.INPUT_COST) catch {};
    table.put(.data_input_cost, DefaultParams.DATA_INPUT_COST) catch {};
    table.put(.output_cost, DefaultParams.OUTPUT_COST) catch {};
    table.put(.max_block_size, DefaultParams.MAX_BLOCK_SIZE) catch {};
    table.put(.max_block_cost, DefaultParams.MAX_BLOCK_COST) catch {};
    table.put(.block_version, 1) catch {};
    return Parameters{
        .height = 0,
        .parameters_table = table,
        .proposed_update = ValidationSettingsUpdate.empty(),
    };
}

Parameter Reference

Default Parameter Values
────────────────────────────────────────────────────────────────────
ID   Name                Default      Min        Max        Step
────────────────────────────────────────────────────────────────────
1    storageFeeFactor    1,250,000    0          2,500,000  25,000
2    minValuePerByte     360          0          10,000     10
3    maxBlockSize        524,288      16,384     1,048,576  1%
4    maxBlockCost        1,000,000    16,384     -          1%
5    tokenAccessCost     100          -          -          1%
6    inputCost           2,000        -          -          1%
7    dataInputCost       100          -          -          1%
8    outputCost          100          -          -          1%
9    subblocksPerBlock   30           2          2,048      1
123  blockVersion        1            1          -          -
────────────────────────────────────────────────────────────────────

Voting Mechanism

Miners vote for parameter changes in block headers [3]:

const VotingSettings = struct {
    /// Blocks per voting epoch
    pub const EPOCH_LENGTH: u32 = 1024;
    /// Required approval threshold (90%)
    pub const APPROVAL_THRESHOLD: f32 = 0.90;

    /// Check if vote count meets approval threshold
    pub fn changeApproved(self: *const VotingSettings, vote_count: u32) bool {
        const threshold: u32 = @intFromFloat(@as(f32, @floatFromInt(EPOCH_LENGTH)) * APPROVAL_THRESHOLD);
        return vote_count >= threshold;
    }
};
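The truncating threshold arithmetic is easy to cross-check. The book's examples are Zig, but as an illustrative aside, a few lines of Python reproduce the computation: 1024 × 0.90 = 921.6, which truncates to 921 votes required.

```python
# Illustrative cross-check of the epoch approval threshold described above.
EPOCH_LENGTH = 1024
APPROVAL_THRESHOLD = 0.90

def change_approved(vote_count: int) -> bool:
    # int() truncates, mirroring @intFromFloat: 1024 * 0.90 = 921.6 -> 921
    threshold = int(EPOCH_LENGTH * APPROVAL_THRESHOLD)
    return vote_count >= threshold

print(int(EPOCH_LENGTH * APPROVAL_THRESHOLD))           # 921
print(change_approved(920), change_approved(921))       # False True
```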

/// Generate votes based on targets
pub fn generateVotes(
    params: *const Parameters,
    own_targets: std.AutoHashMap(Parameter, i32),
    epoch_votes: []const struct { param: i8, count: u32 },
    vote_for_fork: bool,
) []i8 {
    var votes: []i8 = &.{};

    for (epoch_votes) |ev| {
        const param_id = ev.param;

        if (param_id == @intFromEnum(Parameter.soft_fork)) {
            if (vote_for_fork) {
                votes = append(votes, param_id);
            }
        } else if (param_id > 0) {
            // Vote for increase if current < target
            const param: Parameter = @enumFromInt(param_id);
            const current = params.parameters_table.get(param) orelse continue;
            const target = own_targets.get(param) orelse continue;
            if (target > current) {
                votes = append(votes, param_id);
            }
        } else if (param_id < 0) {
            // Vote for decrease if current > target
            const param: Parameter = @enumFromInt(-param_id);
            const current = params.parameters_table.get(param) orelse continue;
            const target = own_targets.get(param) orelse continue;
            if (target < current) {
                votes = append(votes, param_id);
            }
        }
    }

    return padVotes(votes);
}
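The sign convention above (positive id = vote to raise, negative id = vote to lower) can be seen in a toy run. This is an illustrative Python sketch, not the book's Zig; the parameter ids and default values mirror the table earlier in the chapter.

```python
# Toy run of the vote-direction rule: positive parameter id votes to raise,
# negative id votes to lower. Ids 1 and 3 are storageFeeFactor, maxBlockSize.
current = {1: 1_250_000, 3: 524_288}   # network's current values
targets = {1: 2_000_000, 3: 400_000}   # this miner's desired values

votes = []
for pid in (1, -1, 3, -3):             # candidate vote ids for both parameters
    p = abs(pid)
    if pid > 0 and targets[p] > current[p]:
        votes.append(pid)              # vote for increase
    elif pid < 0 and targets[p] < current[p]:
        votes.append(pid)              # vote for decrease

print(votes)  # [1, -3]
```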

Parameter Update Logic

Apply votes at epoch boundaries [4]:

/// Update parameters based on epoch votes
pub fn updateParams(
    params_table: std.AutoHashMap(Parameter, i32),
    epoch_votes: []const struct { param: i8, count: u32 },
    settings: *const VotingSettings,
) !std.AutoHashMap(Parameter, i32) {
    var new_table = try params_table.clone();

    for (epoch_votes) |ev| {
        const param_id = ev.param;
        if (param_id >= @intFromEnum(Parameter.soft_fork)) continue;

        const param_abs: Parameter = @enumFromInt(if (param_id < 0) -param_id else param_id);

        if (settings.changeApproved(ev.count)) {
            const current = new_table.get(param_abs) orelse continue;
            const max_val = getMaxValue(param_abs);
            const min_val = getMinValue(param_abs);
            const step = getStep(param_abs, current);

            const new_value = if (param_id > 0) blk: {
                // Increase: cap at max
                break :blk if (current < max_val) current + step else current;
            } else blk: {
                // Decrease: floor at min
                break :blk if (current > min_val) current - step else current;
            };

            new_table.put(param_abs, new_value) catch {};
        }
    }

    return new_table;
}

fn getMaxValue(param: Parameter) i32 {
    return switch (param) {
        .storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MAX,
        .min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MAX,
        .max_block_size => DefaultParams.MAX_BLOCK_SIZE_MAX,
        .subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MAX,
        else => std.math.maxInt(i32) / 2,
    };
}

fn getMinValue(param: Parameter) i32 {
    return switch (param) {
        .storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MIN,
        .min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MIN,
        .max_block_size => DefaultParams.MAX_BLOCK_SIZE_MIN,
        .max_block_cost => 16 * 1024,
        .subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MIN,
        else => 0,
    };
}

fn getStep(param: Parameter, current: i32) i32 {
    return switch (param) {
        .storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_STEP,
        .min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_STEP,
        .subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_STEP,
        else => @max(1, @divTrunc(current, 100)), // Default 1% step
    };
}
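One approved step is worth tracing with concrete numbers. The Python sketch below (illustrative only; the book's examples are Zig) applies the clamp-and-step rules above to the default maxBlockSize, which uses the default 1% step: 524,288 // 100 = 5,242, so one approved increase epoch yields 529,530.

```python
# Illustrative sketch of one approved parameter step, per the rules above.
def step_1pct(current: int) -> int:
    return max(1, current // 100)          # default 1% step, at least 1

def apply_vote(current: int, lo: int, hi: int, increase: bool) -> int:
    step = step_1pct(current)
    if increase:
        return current + step if current < hi else current
    return current - step if current > lo else current

# One approved "raise maxBlockSize" epoch: 524,288 + 5,242 = 529,530
print(apply_vote(524_288, 16_384, 1_048_576, True))
```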

Cost Calculation

Transaction cost formula [5][6]:

Cost Formula
──────────────────────────────────────────────────────────────────

totalCost = interpreterInitCost        // 10,000
          + inputs × inputCost         // inputs × 2,000
          + dataInputs × dataInputCost // dataInputs × 100
          + outputs × outputCost       // outputs × 100
          + tokenAccessCost × tokens   // varies
          + scriptExecutionCost        // varies per script

Example (2 inputs, 1 data input, 3 outputs, 50K script):
──────────────────────────────────────────────────────────────────
  10,000  interpreter init
   4,000  2 × 2,000 inputs
     100  1 × 100 data inputs
     300  3 × 100 outputs
  50,000  script execution
──────────────────────────────────────────────────────────────────
  64,400  TOTAL

/// Calculate transaction cost
pub fn calculateTransactionCost(
    params: *const Parameters,
    num_inputs: usize,
    num_data_inputs: usize,
    num_outputs: usize,
    script_cost: u64,
    token_ops: usize,
) u64 {
    const init_cost = DefaultParams.INTERPRETER_INIT_COST;
    const input_cost = params.inputCost() * @as(i32, @intCast(num_inputs));
    const data_input_cost = params.dataInputCost() * @as(i32, @intCast(num_data_inputs));
    const output_cost = params.outputCost() * @as(i32, @intCast(num_outputs));
    const token_cost = params.tokenAccessCost() * @as(i32, @intCast(token_ops));

    return @as(u64, @intCast(init_cost + input_cost + data_input_cost + output_cost + token_cost)) + script_cost;
}

/// Calculate block capacity in simple transactions
pub fn estimateBlockCapacity(params: *const Parameters) u32 {
    const max_cost = params.maxBlockCost();

    // Simple tx: 1 input (P2PK), 2 outputs, ~15K script cost
    const simple_tx_cost = DefaultParams.INTERPRETER_INIT_COST +
        params.inputCost() +
        params.outputCost() * 2 +
        15_000; // P2PK verification

    return @intCast(@divTrunc(max_cost, simple_tx_cost));
}
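Plugging in the defaults makes the capacity estimate concrete. A back-of-envelope check (Python, illustrative only) with maxBlockCost = 1,000,000, inputCost = 2,000, outputCost = 100, and ~15,000 for P2PK verification:

```python
# Back-of-envelope check of estimateBlockCapacity with default parameters.
simple_tx_cost = (
    10_000      # interpreter init
    + 2_000     # one P2PK input
    + 2 * 100   # two outputs
    + 15_000    # P2PK script verification
)
capacity = 1_000_000 // simple_tx_cost
print(simple_tx_cost, capacity)  # 27200 36
```

So a block holds on the order of 36 such simple transactions at the default cost limit.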

Block Version History

Protocol Versions
────────────────────────────────────────────────────────
Block Version   Protocol   Features
────────────────────────────────────────────────────────
1               v1         Initial mainnet
2               v5.0       Script improvements
3               v5.0.12    Height monotonicity (EIP-39)
4               v6.0       Sub-blocks, new operations
────────────────────────────────────────────────────────

Script version = block_version - 1

Validation Rules

Rules can be enabled/disabled via soft-fork [7]:

const RuleStatus = struct {
    /// Creates error from modifier details
    create_error: *const fn (InvalidModifier) Invalid,
    /// Which modifier types this rule applies to
    affected_classes: []const ModifierType,
    /// Can this rule be disabled via soft-fork?
    may_be_disabled: bool,
    /// Is this rule currently active?
    is_active: bool,
};

/// Validation rule IDs
const ValidationRules = struct {
    // Stateless (100-109)
    pub const TX_NO_INPUTS: u16 = 100;
    pub const TX_NO_OUTPUTS: u16 = 101;
    pub const TX_MANY_INPUTS: u16 = 102;
    pub const TX_MANY_DATA_INPUTS: u16 = 103;
    pub const TX_MANY_OUTPUTS: u16 = 104;
    pub const TX_NEGATIVE_OUTPUT: u16 = 105;
    pub const TX_OUTPUT_SUM: u16 = 106;
    pub const TX_INPUTS_UNIQUE: u16 = 107;
    pub const TX_POSITIVE_ASSETS: u16 = 108;
    pub const TX_ASSETS_IN_ONE_BOX: u16 = 109;

    // Stateful (111-127)
    pub const TX_DUST: u16 = 111;
    pub const TX_FUTURE: u16 = 112;
    pub const TX_BOXES_TO_SPEND: u16 = 113;
    pub const TX_DATA_BOXES: u16 = 114;
    pub const TX_INPUTS_SUM: u16 = 115;
    pub const TX_ERG_PRESERVATION: u16 = 116;
    pub const TX_ASSETS_PRESERVATION: u16 = 117;
    pub const TX_BOX_TO_SPEND: u16 = 118;
    pub const TX_SCRIPT_VALIDATION: u16 = 119;
    pub const TX_BOX_SIZE: u16 = 120;
    pub const TX_BOX_PROPOSITION_SIZE: u16 = 121;
    pub const TX_NEG_HEIGHT: u16 = 122; // v2+
    pub const TX_REEMISSION: u16 = 123; // EIP-27
    pub const TX_MONOTONIC_HEIGHT: u16 = 124; // v3+

    // Block rules (300+)
    pub const BS_BLOCK_TX_SIZE: u16 = 306;
    pub const BS_BLOCK_TX_COST: u16 = 307;
};

Rule Configurability

Rule Categories
───────────────────────────────────────────────────────────
Category              Can Disable?  Examples
───────────────────────────────────────────────────────────
Consensus Critical    No            txErgPreservation
                                    txScriptValidation
                                    txNoInputs

Soft-Forkable         Yes           txDust
                                    txBoxSize
                                    txReemission

Version-Gated         N/A           txNegHeight (v2+)
                                    txMonotonicHeight (v3+)
───────────────────────────────────────────────────────────

/// Check if rule can be disabled
pub fn mayBeDisabled(rule: u16) bool {
    return switch (rule) {
        ValidationRules.TX_DUST,
        ValidationRules.TX_BOX_SIZE,
        ValidationRules.TX_BOX_PROPOSITION_SIZE,
        ValidationRules.TX_REEMISSION,
        => true,

        // Consensus-critical rules cannot be disabled
        ValidationRules.TX_NO_INPUTS,
        ValidationRules.TX_ERG_PRESERVATION,
        ValidationRules.TX_SCRIPT_VALIDATION,
        ValidationRules.TX_ASSETS_PRESERVATION,
        => false,

        else => false,
    };
}

Parameter Serialization

Parameters are stored in block extensions [8]:

const SYSTEM_PARAMETERS_PREFIX: u8 = 0x00;
const SOFT_FORK_DISABLING_RULES_KEY: [2]u8 = .{ 0x00, 0x01 };

/// Serialize parameters to extension candidate
pub fn toExtensionCandidate(params: *const Parameters) ExtensionCandidate {
    var fields: []ExtensionField = &.{};

    // Add parameter fields
    var iter = params.parameters_table.iterator();
    while (iter.next()) |entry| {
        const key = [2]u8{ SYSTEM_PARAMETERS_PREFIX, @intFromEnum(entry.key_ptr.*) };
        var value: [4]u8 = undefined;
        std.mem.writeInt(i32, &value, entry.value_ptr.*, .big); // big-endian, host-independent
        fields = append(fields, ExtensionField{ .key = key, .value = &value });
    }

    // Add proposed update
    const update_bytes = params.proposed_update.serialize();
    fields = append(fields, ExtensionField{
        .key = SOFT_FORK_DISABLING_RULES_KEY,
        .value = update_bytes,
    });

    return ExtensionCandidate{ .fields = fields };
}

/// Parse parameters from extension
pub fn parseExtension(
    height: u32,
    extension: *const Extension,
    allocator: Allocator,
) !Parameters {
    var params_table = std.AutoHashMap(Parameter, i32).init(allocator);

    for (extension.fields) |field| {
        if (field.key[0] == SYSTEM_PARAMETERS_PREFIX and
            field.key[1] != SOFT_FORK_DISABLING_RULES_KEY[1])
        {
            const param: Parameter = @enumFromInt(field.key[1]);
            const value = std.mem.readInt(i32, field.value[0..4], .big);
            try params_table.put(param, value);
        }
    }

    var proposed_update = ValidationSettingsUpdate.empty();
    for (extension.fields) |field| {
        if (std.mem.eql(u8, &field.key, &SOFT_FORK_DISABLING_RULES_KEY)) {
            proposed_update = try ValidationSettingsUpdate.parse(field.value);
            break;
        }
    }

    return Parameters{
        .height = height,
        .parameters_table = params_table,
        .proposed_update = proposed_update,
    };
}
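The wire format assumed by the two functions above (key = system-parameter prefix byte plus parameter id, value = big-endian i32) round-trips cleanly. An illustrative Python sketch, using parameter id 1 (storageFeeFactor) as the example:

```python
# Round-trip sketch of the big-endian parameter encoding assumed above.
import struct

def encode_param(param_id: int, value: int) -> tuple[bytes, bytes]:
    return bytes([0x00, param_id]), struct.pack(">i", value)  # prefix 0x00

def decode_param(key: bytes, value: bytes) -> tuple[int, int]:
    assert key[0] == 0x00                   # system-parameter prefix
    return key[1], struct.unpack(">i", value)[0]

key, val = encode_param(1, 1_250_000)       # storageFeeFactor default
print(val.hex())                             # 001312d0
print(decode_param(key, val))                # (1, 1250000)
```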

Summary

  • Parameters adjustable via miner voting (1024-block epochs, 90% threshold)
  • Cost parameters: maxBlockCost (1M), inputCost (2K), outputCost (100)
  • Size parameters: maxBlockSize (512KB), minValuePerByte (360)
  • Fee parameters: storageFeeFactor (1.25M nanoErgs per byte per ~4 years)
  • Block version tracks protocol upgrades (script_version = block_version - 1)
  • Validation rules can be consensus-critical or soft-forkable
  • Parameters stored in block extensions, parsed at epoch boundaries

Next: Chapter 26: Wallet and Signing

Chapter 26: Wallet and Signing

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the wallet service architecture and its role in transaction signing
  • Trace the complete transaction signing flow from unsigned to signed
  • Use TransactionHintsBag for distributed multi-party signing
  • Implement box selection strategies for building transactions

Wallet Architecture

The wallet bridges high-level operations with the interpreter layer [1][2]:

Wallet Service Architecture
─────────────────────────────────────────────────────────

┌────────────────────────────────────────────────────────┐
│                   Wallet                               │
├────────────────────────────────────────────────────────┤
│  prover: Box<dyn Prover>                               │
│                                                        │
│  ├── from_mnemonic(phrase, pass) -> Wallet             │
│  ├── from_secrets([]SecretKey) -> Wallet               │
│  ├── add_secret(SecretKey)                             │
│  │                                                     │
│  ├── sign_transaction(...) -> Transaction              │
│  ├── sign_reduced_transaction(...) -> Transaction      │
│  │                                                     │
│  └── generate_commitments(...) -> TransactionHintsBag  │
└────────────────────────────────────────────────────────┘
                        │
                        │ uses
                        ▼
┌────────────────────────────────────────────────────────┐
│                    Prover                              │
│  prove(tree, ctx, message, hints) -> ProverResult      │
└────────────────────────────────────────────────────────┘

Wallet Structure

const Wallet = struct {
    /// Underlying prover (boxed trait object)
    prover: *Prover,
    allocator: Allocator,

    /// Create wallet from mnemonic phrase
    pub fn fromMnemonic(
        mnemonic_phrase: []const u8,
        mnemonic_pass: []const u8,
        allocator: Allocator,
    ) !Wallet {
        const seed = Mnemonic.toSeed(mnemonic_phrase, mnemonic_pass);
        const ext_sk = try ExtSecretKey.deriveMaster(seed);
        return Wallet.fromSecrets(&.{ext_sk.secretKey()}, allocator);
    }

    /// Create wallet from secret keys
    pub fn fromSecrets(secrets: []const SecretKey, allocator: Allocator) Wallet {
        const private_inputs = allocator.alloc(PrivateInput, secrets.len) catch unreachable;
        for (secrets, 0..) |sk, i| {
            private_inputs[i] = PrivateInput.from(sk);
        }
        return .{
            .prover = TestProver.init(private_inputs, allocator),
            .allocator = allocator,
        };
    }

    /// Add secret to wallet
    pub fn addSecret(self: *Wallet, secret: SecretKey) void {
        self.prover.appendSecret(PrivateInput.from(secret));
    }

    /// Sign a transaction
    pub fn signTransaction(
        self: *const Wallet,
        tx_context: *const TransactionContext(UnsignedTransaction),
        state_context: *const ErgoStateContext,
        tx_hints: ?*const TransactionHintsBag,
    ) !Transaction {
        return signTransactionImpl(
            self.prover,
            tx_context,
            state_context,
            tx_hints,
        );
    }

    /// Sign a reduced transaction
    pub fn signReducedTransaction(
        self: *const Wallet,
        reduced_tx: *const ReducedTransaction,
        tx_hints: ?*const TransactionHintsBag,
    ) !Transaction {
        return signReducedTransactionImpl(
            self.prover,
            reduced_tx,
            tx_hints,
        );
    }

    /// Generate commitments for distributed signing
    pub fn generateCommitments(
        self: *const Wallet,
        tx_context: *const TransactionContext(UnsignedTransaction),
        state_context: *const ErgoStateContext,
    ) !TransactionHintsBag {
        var public_keys: []SigmaBoolean = &.{};
        for (self.prover.secrets()) |secret| {
            public_keys = append(public_keys, secret.publicImage());
        }
        return generateCommitmentsImpl(tx_context, state_context, public_keys);
    }
};

Mnemonic Seed Generation

BIP-39 mnemonic to seed conversion [3][4]:

const Mnemonic = struct {
    /// PBKDF2 iterations per BIP-39
    pub const PBKDF2_ITERATIONS: u32 = 2048;
    /// Seed output length (SHA-512)
    pub const SEED_LENGTH: usize = 64;

    /// Convert mnemonic phrase to seed bytes
    pub fn toSeed(
        mnemonic_phrase: []const u8,
        mnemonic_pass: []const u8,
    ) [SEED_LENGTH]u8 {
        var seed: [SEED_LENGTH]u8 = undefined;

        // Normalize to NFKD form
        const normalized_phrase = normalizeNfkd(mnemonic_phrase);
        const normalized_pass = normalizeNfkd(mnemonic_pass);

        // Salt is "mnemonic" + password
        var salt_buf: [256]u8 = undefined;
        const salt_prefix = "mnemonic";
        @memcpy(salt_buf[0..salt_prefix.len], salt_prefix);
        @memcpy(salt_buf[salt_prefix.len..][0..normalized_pass.len], normalized_pass);
        const salt = salt_buf[0 .. salt_prefix.len + normalized_pass.len];

        // PBKDF2-HMAC-SHA512
        pbkdf2HmacSha512(
            normalized_phrase,
            salt,
            PBKDF2_ITERATIONS,
            &seed,
        );

        return seed;
    }
};
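The PBKDF2 parameters above (HMAC-SHA512, 2048 iterations, salt = "mnemonic" + passphrase) can be sanity-checked against the published BIP-39 test vector for the phrase "abandon abandon ... about" with passphrase "TREZOR". An illustrative Python check:

```python
# Sanity check of the BIP-39 seed derivation against a published test vector.
import hashlib

def to_seed(phrase: str, passphrase: str) -> bytes:
    # NFKD normalization is omitted here; this vector is plain ASCII already.
    return hashlib.pbkdf2_hmac(
        "sha512", phrase.encode(), b"mnemonic" + passphrase.encode(), 2048
    )

phrase = " ".join(["abandon"] * 11 + ["about"])
seed = to_seed(phrase, "TREZOR")
print(len(seed))          # 64
print(seed.hex()[:16])    # c55257c360c07c72
```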

Transaction Hints Bag

Manages hints for distributed signing (EIP-11) [5][6]:

const TransactionHintsBag = struct {
    /// Secret hints by input index (own commitments)
    secret_hints: std.AutoHashMap(usize, HintsBag),
    /// Public hints by input index (other signers' commitments)
    public_hints: std.AutoHashMap(usize, HintsBag),
    allocator: Allocator,

    pub fn empty(allocator: Allocator) TransactionHintsBag {
        return .{
            .secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
            .public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
            .allocator = allocator,
        };
    }

    /// Replace all hints for an input index
    pub fn replaceHintsForInput(
        self: *TransactionHintsBag,
        index: usize,
        hints_bag: HintsBag,
    ) void {
        var secret_hints: []Hint = &.{};
        var public_hints: []Hint = &.{};

        for (hints_bag.hints) |hint| {
            switch (hint) {
                .own_commitment => secret_hints = append(secret_hints, hint),
                else => public_hints = append(public_hints, hint),
            }
        }

        self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
        self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
    }

    /// Add hints for an input (accumulate with existing)
    pub fn addHintsForInput(
        self: *TransactionHintsBag,
        index: usize,
        hints_bag: HintsBag,
    ) void {
        var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
        var existing_public = self.public_hints.get(index) orelse HintsBag.empty();

        for (hints_bag.hints) |hint| {
            switch (hint) {
                .own_commitment => existing_secret.hints = append(existing_secret.hints, hint),
                else => existing_public.hints = append(existing_public.hints, hint),
            }
        }

        self.secret_hints.put(index, existing_secret) catch {};
        self.public_hints.put(index, existing_public) catch {};
    }

    /// Get all hints (secret + public) for an input
    pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
        var hints: []Hint = &.{};

        if (self.secret_hints.get(index)) |bag| {
            for (bag.hints) |h| hints = append(hints, h);
        }
        if (self.public_hints.get(index)) |bag| {
            for (bag.hints) |h| hints = append(hints, h);
        }

        return HintsBag{ .hints = hints };
    }

    /// Get only public hints (safe to share)
    pub fn publicHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
        return self.public_hints.get(index) orelse HintsBag.empty();
    }
};
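The key invariant of the bag is the partition rule: own commitments (which carry the secret randomness r) land in the secret bag, everything else in the public bag. A minimal Python model (illustrative only; the hint kind names here are simplified stand-ins):

```python
# Minimal model of the secret/public hint partitioning described above.
from collections import defaultdict

bags = {"secret": defaultdict(list), "public": defaultdict(list)}

def replace_hints_for_input(index, hints):
    # own_commitment hints stay private; all other hints are shareable
    bags["secret"][index] = [h for h in hints if h == "own_commitment"]
    bags["public"][index] = [h for h in hints if h != "own_commitment"]

replace_hints_for_input(0, ["own_commitment", "real_commitment"])
print(len(bags["secret"][0]), len(bags["public"][0]))  # 1 1
```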

Distributed Signing Protocol (EIP-11)

Distributed Signing Flow
─────────────────────────────────────────────────────

Party A                              Party B
─────────                            ─────────

1. Generate Commitments
   commitmentsA = generateCommitments()
                                     commitmentsB = generateCommitments()

2. Exchange Public Hints
   publicA ──────────────────────────►
                    ◄────────────────── publicB

3. Sign with Combined Hints
   combinedA = commitmentsA + publicB
   partialSigA = sign(tx, combinedA)
                                     combinedB = commitmentsB + publicA
                                     partialSigB = sign(tx, combinedB)

4. Extract & Complete
   partialSigA ─────────────────────►
                                     extractedHints = extractHints(partialSigA)
                                     finalTx = sign(tx, commitmentsB + extracted)

Security: Secret hints (randomness r) NEVER leave their owner.
          Only public hints (commitments g^r) are exchanged.

/// Generate commitments for all transaction inputs
pub fn generateCommitments(
    wallet: *const Wallet,
    tx_context: *const TransactionContext(UnsignedTransaction),
    state_context: *const ErgoStateContext,
) !TransactionHintsBag {
    var public_keys: []SigmaBoolean = &.{};
    for (wallet.prover.secrets()) |secret| {
        public_keys = append(public_keys, secret.publicImage());
    }

    var hints_bag = TransactionHintsBag.empty(wallet.allocator);

    for (tx_context.spending_tx.inputs.items(), 0..) |_, idx| {
        const ctx = try makeContext(state_context, tx_context, idx);
        const input_box = tx_context.inputBoxes()[idx];

        // Reduce to SigmaBoolean
        const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);

        // Generate commitments for propositions we can prove
        const input_hints = generateCommitmentsFor(
            &reduction.sigma_prop,
            public_keys,
        );
        hints_bag.addHintsForInput(idx, input_hints);
    }

    return hints_bag;
}

/// Extract hints from a partial signature
pub fn extractHints(
    tx: *const Transaction,
    real_propositions: []const SigmaBoolean,
    simulated_propositions: []const SigmaBoolean,
    boxes_to_spend: []const ErgoBox,
    data_boxes: []const ErgoBox,
    allocator: Allocator,
) TransactionHintsBag {
    var hints_bag = TransactionHintsBag.empty(allocator);

    for (tx.inputs.items(), 0..) |input, idx| {
        const proof = input.spending_proof.proof;
        if (proof.isEmpty()) continue;

        const box = boxes_to_spend[idx];
        const extracted = extractHintsFromProof(
            &box.ergo_tree,
            proof.bytes(),
            real_propositions,
            simulated_propositions,
        );
        hints_bag.addHintsForInput(idx, extracted);
    }

    return hints_bag;
}

Box Selection

Select inputs to satisfy target balance and tokens [7][8]:

const BoxSelector = struct {
    /// Selects boxes to satisfy target balance and tokens
    pub fn select(
        self: *const BoxSelector,
        allocator: Allocator,
        inputs: []const ErgoBox,
        target_balance: BoxValue,
        target_tokens: []const Token,
    ) !BoxSelection {
        var selected: []ErgoBox = &.{};
        var total_value: u64 = 0;
        var total_tokens = std.AutoHashMap(TokenId, u64).init(allocator);
        defer total_tokens.deinit();

        // First pass: select boxes until targets met
        for (inputs) |box| {
            const needed = needsMoreBoxes(
                total_value,
                &total_tokens,
                target_balance.as_u64(),
                target_tokens,
            );
            if (!needed) break;

            selected = append(selected, box);
            total_value += box.value.as_u64();

            if (box.tokens) |tokens| {
                for (tokens.items()) |token| {
                    const entry = try total_tokens.getOrPut(token.token_id);
                    if (entry.found_existing) {
                        entry.value_ptr.* += token.amount.value;
                    } else {
                        entry.value_ptr.* = token.amount.value;
                    }
                }
            }
        }

        // Check if targets met
        if (total_value < target_balance.as_u64()) {
            return error.NotEnoughCoins;
        }

        for (target_tokens) |target| {
            const have = total_tokens.get(target.token_id) orelse 0;
            if (have < target.amount.value) {
                return error.NotEnoughTokens;
            }
        }

        // Calculate change
        const change = calculateChange(
            total_value,
            &total_tokens,
            target_balance.as_u64(),
            target_tokens,
        );

        return BoxSelection{
            .boxes = try BoundedVec(ErgoBox, 1, MAX_INPUTS).fromSlice(selected),
            .change_boxes = change,
        };
    }
};

const BoxSelection = struct {
    /// Selected boxes to spend
    boxes: BoundedVec(ErgoBox, 1, MAX_INPUTS),
    /// Change boxes to create
    change_boxes: []ErgoBoxAssetsData,
};

const BoxSelectorError = error{
    NotEnoughCoins,
    NotEnoughTokens,
    TokenAmountError,
    NotEnoughCoinsForChangeBox,
    SelectedInputsOutOfBounds,
};
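The first pass of the selector is a simple greedy accumulation: keep taking boxes until the target is covered, and any surplus becomes change. A toy run (Python, illustrative only, with made-up nanoErg values):

```python
# Toy greedy first pass of box selection: accumulate until the target is met.
boxes = [500, 300, 700]        # box values in nanoErg (toy numbers)
target = 600

selected, total = [], 0
for value in boxes:
    if total >= target:        # needsMoreBoxes == false: stop early
        break
    selected.append(value)
    total += value

assert total >= target          # would otherwise be NotEnoughCoins
print(selected, total - target)  # [500, 300] 200
```

Note the third box (700) is never touched: selection stops as soon as the target is satisfied, and the 200 nanoErg surplus goes to a change box.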

Transaction Signing Flow

Transaction Signing Flow
─────────────────────────────────────────────────────

┌──────────────────────────────────────────────────┐
│ 1. User Request                                  │
│    ├── Target balance                            │
│    └── Target tokens                             │
└──────────────────────────┬───────────────────────┘
                           ▼
┌──────────────────────────────────────────────────┐
│ 2. Box Selection                                 │
│    ├── BoxSelector.select(inputs, target)        │
│    └── Returns: boxes + change                   │
└──────────────────────────┬───────────────────────┘
                           ▼
┌──────────────────────────────────────────────────┐
│ 3. Build Unsigned Transaction                    │
│    ├── inputs: selected boxes                    │
│    ├── data_inputs: read-only references         │
│    └── output_candidates: targets + change       │
└──────────────────────────┬───────────────────────┘
                           ▼
┌──────────────────────────────────────────────────┐
│ 4. Sign Transaction                              │
│    For each input:                               │
│    ├── Create Context                            │
│    ├── Get hints for input                       │
│    ├── prover.prove(tree, ctx, message, hints)   │
│    └── Accumulate cost                           │
└──────────────────────────┬───────────────────────┘
                           ▼
┌──────────────────────────────────────────────────┐
│ 5. Signed Transaction                            │
│    └── Submit to mempool                         │
└──────────────────────────────────────────────────┘

/// Sign transaction with prover
pub fn signTransaction(
    prover: *const Prover,
    tx_context: *const TransactionContext(UnsignedTransaction),
    state_context: *const ErgoStateContext,
    tx_hints: ?*const TransactionHintsBag,
) !Transaction {
    const tx = tx_context.spending_tx;
    const message = try tx.bytesToSign();

    var signed_inputs: []Input = &.{};

    for (tx.inputs.items(), 0..) |unsigned_input, idx| {
        const ctx = try makeContext(state_context, tx_context, idx);

        // Get hints for this input
        const hints = if (tx_hints) |h| h.allHintsForInput(idx) else HintsBag.empty();

        const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
            return error.InputBoxNotFound;

        // Generate proof
        const prover_result = try prover.prove(
            &input_box.ergo_tree,
            &ctx,
            message,
            &hints,
        );

        signed_inputs = append(signed_inputs, Input{
            .box_id = unsigned_input.box_id,
            .spending_proof = prover_result,
        });
    }

    return Transaction.new(
        try TxIoVec(Input).fromSlice(signed_inputs),
        tx.data_inputs,
        tx.output_candidates,
    );
}

Asset Extraction

Calculate token access costs [9]:

const ErgoBoxAssetExtractor = struct {
    pub const MAX_ASSETS_PER_BOX: usize = 255;

    /// Extract total token amounts from boxes
    pub fn extractAssets(
        allocator: Allocator,
        boxes: []const ErgoBoxCandidate,
    ) !struct { assets: std.AutoHashMap(TokenId, u64), count: usize } {
        var assets = std.AutoHashMap(TokenId, u64).init(allocator);
        var total_count: usize = 0;

        for (boxes) |box| {
            if (box.tokens) |tokens| {
                if (tokens.len() > MAX_ASSETS_PER_BOX) {
                    return error.TooManyAssetsInBox;
                }

                for (tokens.items()) |token| {
                    const entry = try assets.getOrPut(token.token_id);
                    if (entry.found_existing) {
                        entry.value_ptr.* = std.math.add(
                            u64,
                            entry.value_ptr.*,
                            token.amount.value,
                        ) catch return error.Overflow;
                    } else {
                        entry.value_ptr.* = token.amount.value;
                    }
                }
                total_count += tokens.len();
            }
        }

        return .{ .assets = assets, .count = total_count };
    }

    /// Calculate total token access cost
    pub fn totalAssetsAccessCost(
        in_assets_num: usize,
        in_assets_size: usize,
        out_assets_num: usize,
        out_assets_size: usize,
        token_access_cost: u32,
    ) u64 {
        // Cost to iterate through all tokens
        const all_assets_cost = (out_assets_num + in_assets_num) * @as(usize, token_access_cost);
        // Cost to check preservation of unique tokens
        const unique_assets_cost = (in_assets_size + out_assets_size) * @as(usize, token_access_cost);
        return @intCast(all_assets_cost + unique_assets_cost);
    }
};
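A worked example makes the two cost terms concrete. With the default tokenAccessCost = 100, a transaction with 3 input and 2 output token entries, and 2 unique token ids on each side (Python, illustrative only):

```python
# Worked example of totalAssetsAccessCost with tokenAccessCost = 100.
token_access_cost = 100
in_assets_num, out_assets_num = 3, 2        # total token entries
in_assets_size, out_assets_size = 2, 2      # unique token ids per side

all_assets_cost = (out_assets_num + in_assets_num) * token_access_cost
unique_assets_cost = (in_assets_size + out_assets_size) * token_access_cost
cost = all_assets_cost + unique_assets_cost
print(cost)  # 900
```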

Wallet Errors

const WalletError = error{
    /// Transaction signing failed
    TxSigningError,
    /// Prover failed to generate proof
    ProverError,
    /// Key derivation failed
    ExtSecretKeyError,
    /// Secret key parsing failed
    SecretKeyParsingError,
    /// Wallet not initialized
    WalletNotInitialized,
    /// Wallet locked
    WalletLocked,
    /// Wallet already unlocked
    WalletAlreadyUnlocked,
    /// Box selection failed
    BoxSelectionError,
};

Distributed Signing Example

// Party A: Generate commitments
const commitments_a = try wallet_a.generateCommitments(&tx_context, &state_context);

// Party B: Generate commitments
const commitments_b = try wallet_b.generateCommitments(&tx_context, &state_context);

// Exchange public hints (safe to share)
const public_a = commitments_a.publicHintsForInput(0);
const public_b = commitments_b.publicHintsForInput(0);

// Party A: Sign with combined hints
var combined_a = commitments_a;
combined_a.addHintsForInput(0, public_b);
const partial_sig_a = try wallet_a.signTransaction(&tx_context, &state_context, &combined_a);

// Party B: Extract hints from A's partial signature
const extracted = extractHints(
    &partial_sig_a,
    real_propositions,
    simulated_propositions,
    boxes_to_spend,
    data_boxes,
);

// Party B: Complete signing
var final_hints = commitments_b;
final_hints.addHintsForInput(0, extracted.allHintsForInput(0));
const final_tx = try wallet_b.signTransaction(&tx_context, &state_context, &final_hints);

Summary

  • Wallet wraps prover with high-level signing API
  • Mnemonic converts BIP-39 phrase to seed via PBKDF2
  • TransactionHintsBag separates secret/public hints for distributed signing
  • BoxSelector finds optimal input set for target balance/tokens
  • Distributed signing (EIP-11) exchanges commitments, never secrets
  • Asset extraction calculates token access costs

Next: Chapter 27: High-Level SDK

[3]: Scala: Mnemonic.scala

Chapter 27: High-Level SDK

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the SDK architecture layers from cryptography to transaction building
  • Use TxBuilder with the builder pattern for ergonomic transaction construction
  • Trace the reduce-then-sign pipeline for transaction signing
  • Work with TransactionContext and BoxSelection for complex transaction scenarios

SDK Architecture

The SDK provides a layered abstraction from low-level cryptography to high-level transaction building[1][2]:

SDK Layer Architecture
══════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────┐
│                     Application Layer                          │
│   TxBuilder    BoxSelector    ErgoBoxCandidateBuilder          │
├────────────────────────────────────────────────────────────────┤
│                     Wallet Layer                               │
│   Wallet    TransactionContext    TransactionHintsBag          │
├────────────────────────────────────────────────────────────────┤
│                     Reduction Layer                            │
│   reduce_tx()    ReducedTransaction    ReducedInput            │
├────────────────────────────────────────────────────────────────┤
│                     Signing Layer                              │
│   sign_transaction()    sign_reduced_transaction()             │
├────────────────────────────────────────────────────────────────┤
│                     Interpreter Layer                          │
│   Prover    Verifier    reduce_to_crypto()                     │
└────────────────────────────────────────────────────────────────┘

Transaction Builder

TxBuilder uses the builder pattern to construct unsigned transactions with validation[3][4]:

const TxBuilder = struct {
    box_selection: BoxSelection,
    data_inputs: std.ArrayList(DataInput),
    output_candidates: std.ArrayList(ErgoBoxCandidate),
    current_height: u32,
    fee_amount: BoxValue,
    change_address: Address,
    context_extensions: std.AutoHashMap(BoxId, ContextExtension),
    token_burn_permit: std.ArrayList(Token),
    allocator: Allocator,

    pub fn init(
        box_selection: BoxSelection,
        output_candidates: []const ErgoBoxCandidate,
        current_height: u32,
        fee_amount: BoxValue,
        change_address: Address,
        allocator: Allocator,
    ) !TxBuilder {
        var outputs = std.ArrayList(ErgoBoxCandidate).init(allocator);
        try outputs.appendSlice(output_candidates);

        return .{
            .box_selection = box_selection,
            .data_inputs = std.ArrayList(DataInput).init(allocator),
            .output_candidates = outputs,
            .current_height = current_height,
            .fee_amount = fee_amount,
            .change_address = change_address,
            .context_extensions = std.AutoHashMap(BoxId, ContextExtension).init(allocator),
            .token_burn_permit = std.ArrayList(Token).init(allocator),
            .allocator = allocator,
        };
    }

    pub fn deinit(self: *TxBuilder) void {
        self.data_inputs.deinit();
        self.output_candidates.deinit();
        self.context_extensions.deinit();
        self.token_burn_permit.deinit();
    }

    pub fn setDataInputs(self: *TxBuilder, data_inputs: []const DataInput) !void {
        self.data_inputs.clearRetainingCapacity();
        try self.data_inputs.appendSlice(data_inputs);
    }

    pub fn setContextExtension(self: *TxBuilder, box_id: BoxId, ext: ContextExtension) !void {
        try self.context_extensions.put(box_id, ext);
    }

    pub fn setTokenBurnPermit(self: *TxBuilder, tokens: []const Token) !void {
        self.token_burn_permit.clearRetainingCapacity();
        try self.token_burn_permit.appendSlice(tokens);
    }
};

Build Validation

Building performs comprehensive validation before creating the transaction[5][6]:

pub fn build(self: *TxBuilder) !UnsignedTransaction {
    // Validate inputs
    if (self.box_selection.boxes.items.len == 0) {
        return error.EmptyInputs;
    }
    if (self.output_candidates.items.len == 0) {
        return error.EmptyOutputs;
    }
    if (self.box_selection.boxes.items.len > std.math.maxInt(u16)) {
        return error.TooManyInputs;
    }

    // Check for duplicate inputs
    var seen = std.AutoHashMap(BoxId, void).init(self.allocator);
    defer seen.deinit();
    for (self.box_selection.boxes.items) |box| {
        const result = try seen.getOrPut(box.box_id);
        if (result.found_existing) {
            return error.DuplicateInputs;
        }
    }

    // Build output candidates with change boxes
    var all_outputs = try self.buildOutputCandidates();
    defer all_outputs.deinit();

    // Validate coin preservation
    const total_in = sumValue(self.box_selection.boxes.items);
    const total_out = sumValue(all_outputs.items);

    if (total_out > total_in) {
        return error.NotEnoughCoinsInInputs;
    }
    if (total_out < total_in) {
        return error.NotEnoughCoinsInOutputs;
    }

    // Validate token balance
    try self.validateTokenBalance(all_outputs.items);

    // Create unsigned inputs with context extensions
    var unsigned_inputs = std.ArrayList(UnsignedInput).init(self.allocator);
    for (self.box_selection.boxes.items) |box| {
        const ext = self.context_extensions.get(box.box_id) orelse
            ContextExtension.empty();
        try unsigned_inputs.append(.{
            .box_id = box.box_id,
            .extension = ext,
        });
    }

    return UnsignedTransaction{
        .inputs = try unsigned_inputs.toOwnedSlice(),
        .data_inputs = try self.data_inputs.toOwnedSlice(),
        .output_candidates = try all_outputs.toOwnedSlice(),
    };
}

fn buildOutputCandidates(self: *TxBuilder) !std.ArrayList(ErgoBoxCandidate) {
    var outputs = std.ArrayList(ErgoBoxCandidate).init(self.allocator);

    // Add user-specified outputs
    try outputs.appendSlice(self.output_candidates.items);

    // Add change boxes from selection
    const change_tree = try Contract.payToAddress(self.change_address);
    for (self.box_selection.change_boxes.items) |change| {
        var candidate = try ErgoBoxCandidateBuilder.init(
            change.value,
            change_tree,
            self.current_height,
            self.allocator,
        );
        for (change.tokens) |token| {
            try candidate.addToken(token);
        }
        try outputs.append(try candidate.build());
    }

    // Add miner fee box
    const fee_box = try newMinerFeeBox(self.fee_amount, self.current_height);
    try outputs.append(fee_box);

    return outputs;
}

Token Balance Validation

Token flow must be explicitly validated[7][8]:

fn validateTokenBalance(self: *TxBuilder, outputs: []const ErgoBoxCandidate) !void {
    const input_tokens = try sumTokens(self.box_selection.boxes.items, self.allocator);
    defer input_tokens.deinit();

    const output_tokens = try sumTokens(outputs, self.allocator);
    defer output_tokens.deinit();

    // Token minting rule: new tokens can ONLY have token_id == first_input.box_id
    // You can mint any AMOUNT of this token type, but only ONE token type per tx.
    const first_input_id = TokenId.fromBoxId(self.box_selection.boxes.items[0].box_id);

    // Separate the minted token (first_input_id) from transferred tokens;
    // any amount of the minted token id is allowed in the outputs.
    var output_without_minted = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
    defer output_without_minted.deinit();

    var iter = output_tokens.iterator();
    while (iter.next()) |entry| {
        if (!entry.key_ptr.*.eql(first_input_id)) {
            try output_without_minted.put(entry.key_ptr.*, entry.value_ptr.*);
        }
    }

    // Check all output tokens exist in inputs
    var out_iter = output_without_minted.iterator();
    while (out_iter.next()) |entry| {
        const input_amt = input_tokens.get(entry.key_ptr.*) orelse {
            return error.NotEnoughTokens;
        };
        if (input_amt < entry.value_ptr.*) {
            return error.NotEnoughTokens;
        }
    }

    // Check token burn permits
    const burned = try subtractTokens(input_tokens, output_without_minted, self.allocator);
    defer burned.deinit();

    try self.checkBurnPermit(burned);
}

fn checkBurnPermit(self: *TxBuilder, burned: std.AutoHashMap(TokenId, TokenAmount)) !void {
    // Build permit map
    var permits = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
    defer permits.deinit();
    for (self.token_burn_permit.items) |token| {
        try permits.put(token.id, token.amount);
    }

    // Every burned token must have permit
    var iter = burned.iterator();
    while (iter.next()) |entry| {
        const permit_amt = permits.get(entry.key_ptr.*) orelse {
            return error.TokenBurnPermitMissing;
        };
        if (entry.value_ptr.* > permit_amt) {
            return error.TokenBurnPermitExceeded;
        }
    }

    // Every permit must be used exactly
    var permit_iter = permits.iterator();
    while (permit_iter.next()) |entry| {
        const burned_amt = burned.get(entry.key_ptr.*) orelse {
            return error.TokenBurnPermitUnused;
        };
        if (burned_amt < entry.value_ptr.*) {
            return error.TokenBurnPermitUnused;
        }
    }
}

Box Candidate Builder

Constructs output boxes with a fluent API:

const ErgoBoxCandidateBuilder = struct {
    value: BoxValue,
    ergo_tree: ErgoTree,
    creation_height: u32,
    tokens: std.ArrayList(Token),
    registers: [6]?Constant, // R4-R9
    allocator: Allocator,

    pub fn init(
        value: BoxValue,
        ergo_tree: ErgoTree,
        creation_height: u32,
        allocator: Allocator,
    ) !ErgoBoxCandidateBuilder {
        return .{
            .value = value,
            .ergo_tree = ergo_tree,
            .creation_height = creation_height,
            .tokens = std.ArrayList(Token).init(allocator),
            .registers = [_]?Constant{null} ** 6,
            .allocator = allocator,
        };
    }

    pub fn addToken(self: *ErgoBoxCandidateBuilder, token: Token) !void {
        if (self.tokens.items.len >= MAX_TOKENS) {
            return error.TooManyTokens;
        }
        try self.tokens.append(token);
    }

    pub fn mintToken(
        self: *ErgoBoxCandidateBuilder,
        token: Token,
        name: []const u8,
        description: []const u8,
        decimals: u8,
    ) !void {
        try self.addToken(token);
        // Store metadata in R4-R6
        self.registers[0] = Constant.fromBytes(name);
        self.registers[1] = Constant.fromBytes(description);
        self.registers[2] = Constant.fromByte(decimals);
    }

    pub fn setRegister(self: *ErgoBoxCandidateBuilder, reg: RegisterId, value: Constant) void {
        const idx = @intFromEnum(reg) - 4; // R4 = 0, R5 = 1, etc.
        self.registers[idx] = value;
    }

    pub fn build(self: *ErgoBoxCandidateBuilder) !ErgoBoxCandidate {
        return ErgoBoxCandidate{
            .value = self.value,
            .ergo_tree = self.ergo_tree,
            .creation_height = self.creation_height,
            .tokens = try self.tokens.toOwnedSlice(),
            .additional_registers = self.registers,
        };
    }
};

Transaction Context

Bundles the transaction with its input boxes for signing[9][10]:

const TransactionContext = struct {
    spending_tx: UnsignedTransaction,
    input_boxes: []const ErgoBox,
    data_boxes: ?[]const ErgoBox,

    pub fn init(
        spending_tx: UnsignedTransaction,
        input_boxes: []const ErgoBox,
        data_boxes: ?[]const ErgoBox,
    ) !TransactionContext {
        // Validate input boxes match transaction inputs
        if (input_boxes.len != spending_tx.inputs.len) {
            return error.InputBoxCountMismatch;
        }

        for (spending_tx.inputs, input_boxes) |input, box| {
            if (!input.box_id.eql(box.box_id())) {
                return error.InputBoxIdMismatch;
            }
        }

        // Validate data boxes if present
        if (spending_tx.data_inputs) |data_inputs| {
            const data = data_boxes orelse return error.DataInputBoxNotFound;
            if (data.len != data_inputs.len) {
                return error.DataInputBoxCountMismatch;
            }
        }

        return .{
            .spending_tx = spending_tx,
            .input_boxes = input_boxes,
            .data_boxes = data_boxes,
        };
    }

    pub fn getInputBox(self: *const TransactionContext, box_id: BoxId) ?*const ErgoBox {
        for (self.input_boxes) |*box| {
            if (box.box_id().eql(box_id)) {
                return box;
            }
        }
        return null;
    }
};

Box Selection

Selects input boxes to satisfy output requirements[11][12]:

const BoxSelection = struct {
    boxes: std.ArrayList(ErgoBox),
    change_boxes: std.ArrayList(ErgoBoxAssets),

    const ErgoBoxAssets = struct {
        value: BoxValue,
        tokens: []const Token,
    };
};

const SimpleBoxSelector = struct {
    pub fn select(
        available: []const ErgoBox,
        target_value: BoxValue,
        target_tokens: []const Token,
        allocator: Allocator,
    ) !BoxSelection {
        var selected = std.ArrayList(ErgoBox).init(allocator);
        var total_value: u64 = 0;
        var token_sums = std.AutoHashMap(TokenId, TokenAmount).init(allocator);
        defer token_sums.deinit();

        // Greedy selection
        for (available) |box| {
            const needed = checkNeed(total_value, target_value, token_sums, target_tokens);
            if (!needed) break;

            try selected.append(box);
            total_value += box.value.as_u64();

            for (box.tokens) |token| {
                const entry = try token_sums.getOrPut(token.id);
                if (entry.found_existing) {
                    entry.value_ptr.* = try entry.value_ptr.*.checkedAdd(token.amount);
                } else {
                    entry.value_ptr.* = token.amount;
                }
            }
        }

        // Fail if the available boxes cannot cover the target value/tokens
        if (checkNeed(total_value, target_value, token_sums, target_tokens)) {
            return error.NotEnoughCoins;
        }

        // Calculate change
        var change_boxes = std.ArrayList(BoxSelection.ErgoBoxAssets).init(allocator);
        const change_value = total_value - target_value.as_u64();
        if (change_value > 0) {
            const change_tokens = try calculateChangeTokens(token_sums, target_tokens, allocator);
            try change_boxes.append(.{
                .value = BoxValue.init(change_value) catch return error.ChangeValueTooSmall,
                .tokens = change_tokens,
            });
        }

        return .{
            .boxes = selected,
            .change_boxes = change_boxes,
        };
    }
};

Reduced Transaction

Script reduction separates evaluation from signing[13][14]:

const ReducedInput = struct {
    sigma_prop: SigmaBoolean,
    cost: u64,
    extension: ContextExtension,
};

const ReducedTransaction = struct {
    unsigned_tx: UnsignedTransaction,
    reduced_inputs: []const ReducedInput,
    tx_cost: u32,

    pub fn reducedInputs(self: *const ReducedTransaction) []const ReducedInput {
        return self.reduced_inputs;
    }
};

/// Reduce transaction inputs to sigma propositions
pub fn reduceTx(
    tx_context: TransactionContext,
    state_context: *const ErgoStateContext,
    allocator: Allocator,
) !ReducedTransaction {
    var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
    var total_cost: u64 = 0;

    for (tx_context.spending_tx.inputs, 0..) |input, idx| {
        // Build evaluation context
        var ctx = try makeContext(state_context, &tx_context, idx);

        // Get input box
        const input_box = tx_context.getInputBox(input.box_id) orelse
            return error.InputBoxNotFound;

        // Reduce ErgoTree to SigmaBoolean
        const result = try reduceToCrypto(&input_box.ergo_tree, &ctx);
        total_cost += result.cost;

        try reduced_inputs.append(.{
            .sigma_prop = result.sigma_prop,
            .cost = result.cost,
            .extension = input.extension,
        });
    }

    return .{
        .unsigned_tx = tx_context.spending_tx,
        .reduced_inputs = try reduced_inputs.toOwnedSlice(),
        .tx_cost = @intCast(total_cost),
    };
}

Signing Pipeline

Signing Flow
══════════════════════════════════════════════════════════════════

┌─────────────────┐     ┌──────────────────┐     ┌───────────────┐
│ UnsignedTx      │     │ ReducedTx        │     │ SignedTx      │
│ + InputBoxes    │────▶│ (SigmaProps)     │────▶│ (Proofs)      │
│ + StateContext  │     │                  │     │               │
└─────────────────┘     └──────────────────┘     └───────────────┘
        │                       │                       │
        │  reduce_tx()          │  sign_reduced_tx()    │
        │  (needs context)      │  (context-free)       │
        ▼                       ▼                       ▼
   ┌─────────┐            ┌─────────┐              ┌─────────┐
   │ Online  │            │ Offline │              │ Verify  │
   │ Wallet  │            │ Wallet  │              │ Node    │
   └─────────┘            └─────────┘              └─────────┘

Transaction signing with optional hints[15][16]:

pub fn signTransaction(
    prover: *const Prover,
    tx_context: TransactionContext,
    state_context: *const ErgoStateContext,
    tx_hints: ?*const TransactionHintsBag,
) !Transaction {
    const message = try tx_context.spending_tx.bytesToSign();

    var signed_inputs = std.ArrayList(Input).init(prover.allocator);
    for (tx_context.spending_tx.inputs, 0..) |_, idx| {
        const signed = try signTxInput(
            prover,
            &tx_context,
            state_context,
            tx_hints,
            idx,
            message,
        );
        try signed_inputs.append(signed);
    }

    return Transaction{
        .inputs = try signed_inputs.toOwnedSlice(),
        .data_inputs = tx_context.spending_tx.data_inputs,
        .outputs = tx_context.spending_tx.output_candidates,
    };
}

pub fn signReducedTransaction(
    prover: *const Prover,
    reduced_tx: ReducedTransaction,
    tx_hints: ?*const TransactionHintsBag,
) !Transaction {
    const message = try reduced_tx.unsigned_tx.bytesToSign();

    var signed_inputs = std.ArrayList(Input).init(prover.allocator);
    for (reduced_tx.unsigned_tx.inputs, 0..) |input, idx| {
        const reduced_input = reduced_tx.reduced_inputs[idx];

        // Get hints for this input
        const hints = if (tx_hints) |bag|
            bag.allHintsForInput(idx)
        else
            HintsBag.empty();

        // Generate proof from sigma proposition
        const proof = try prover.generateProof(
            reduced_input.sigma_prop,
            message,
            &hints,
        );

        try signed_inputs.append(.{
            .box_id = input.box_id,
            .spending_proof = .{
                .proof = proof,
                .extension = reduced_input.extension,
            },
        });
    }

    return Transaction{
        .inputs = try signed_inputs.toOwnedSlice(),
        .data_inputs = reduced_tx.unsigned_tx.data_inputs,
        .outputs = reduced_tx.unsigned_tx.output_candidates,
    };
}

Miner Fee Box

Standard miner fee output:

/// Standard miner fee ErgoTree (script allowing the miner to collect the box after a height delay)
const MINERS_FEE_ERGO_TREE = [_]u8{
    0x10, 0x05, 0x04, 0x00, 0x04, 0x00, 0x0e, 0x36,
    0x10, 0x02, 0x04, 0xa0, 0x0b, 0x08, 0xcd, 0x02,
    // ... (standard miner fee script)
};

pub fn newMinerFeeBox(fee: BoxValue, creation_height: u32) !ErgoBoxCandidate {
    const tree = try ErgoTree.sigmaParse(&MINERS_FEE_ERGO_TREE);

    return ErgoBoxCandidate{
        .value = fee,
        .ergo_tree = tree,
        .creation_height = creation_height,
        .tokens = &[_]Token{},
        .additional_registers = [_]?Constant{null} ** 6,
    };
}

/// Suggested transaction fee (1.1 mERG)
pub const SUGGESTED_TX_FEE = BoxValue.init(1_100_000) catch unreachable;

Reduced Transaction Serialization

EIP-19 format for cold wallet transfer[17][18]:

const ReducedTransactionSerializer = struct {
    pub fn serialize(tx: *const ReducedTransaction, writer: anytype) !void {
        // Write message to sign (includes all tx data)
        const msg = try tx.unsigned_tx.bytesToSign();
        try writer.writeInt(u32, @intCast(msg.len), .little);
        try writer.writeAll(msg);

        // Write reduced inputs
        for (tx.reduced_inputs) |red_in| {
            try SigmaBoolean.serialize(&red_in.sigma_prop, writer);
            try writer.writeInt(u64, red_in.cost, .little);
        }

        try writer.writeInt(u32, tx.tx_cost, .little);
    }

    pub fn parse(reader: anytype, allocator: Allocator) !ReducedTransaction {
        // Read and parse message
        const msg_len = try reader.readInt(u32, .little);
        const msg = try allocator.alloc(u8, msg_len);
        try reader.readNoEof(msg);

        const tx = try Transaction.sigmaParse(msg);

        // Read reduced inputs
        var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
        for (tx.inputs) |input| {
            const sigma_prop = try SigmaBoolean.parse(reader);
            const cost = try reader.readInt(u64, .little);

            try reduced_inputs.append(.{
                .sigma_prop = sigma_prop,
                .cost = cost,
                .extension = input.spending_proof.extension,
            });
        }

        const tx_cost = try reader.readInt(u32, .little);

        return .{
            .unsigned_tx = tx.toUnsigned(),
            .reduced_inputs = try reduced_inputs.toOwnedSlice(),
            .tx_cost = tx_cost,
        };
    }
};

Cold Wallet Flow

Cold Wallet Signing
══════════════════════════════════════════════════════════════════

Online Wallet (Hot)              Cold Wallet (Air-gapped)
──────────────────────           ────────────────────────
       │                                    │
  Build Unsigned Tx                         │
       │                                    │
  reduce_tx()                               │
       │                                    │
  Serialize ReducedTx ─────────────────────▶│
  (QR code / USB)                           │
       │                               Parse ReducedTx
       │                                    │
       │                               sign_reduced_tx()
       │                               (uses secrets)
       │                                    │
       │◀──────────────────────── Serialize SignedTx
       │                          (QR code / USB)
  Broadcast Tx                              │
       │                                    │
       ▼                                    ▼
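The flow above can be sketched with the serializer from the previous section. This is an illustrative sketch, not a complete program: `tx_context`, `state_context`, `allocator`, and `cold_prover` (the air-gapped prover holding the signing secrets) are assumed to be in scope.

```zig
// Hot (online) side: build, reduce, and export the transaction.
const reduced = try reduceTx(tx_context, &state_context, allocator);
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
try ReducedTransactionSerializer.serialize(&reduced, buf.writer());
// ... transfer buf.items to the cold device via QR code or USB ...

// Cold (air-gapped) side: parse and sign using local secrets only.
var stream = std.io.fixedBufferStream(buf.items);
const parsed = try ReducedTransactionSerializer.parse(stream.reader(), allocator);
const signed_tx = try signReducedTransaction(&cold_prover, parsed, null);
// ... transfer the serialized signed_tx back to the hot side for broadcast ...
```

Note that the cold side never sees the blockchain context: `sign_reduced_tx()` needs only the sigma propositions and the message bytes.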

Complete Usage Example

pub fn buildAndSignTransaction(
    wallet: *const Wallet,
    available_boxes: []const ErgoBox,
    recipient: Address,
    amount: u64,
    state_context: *const ErgoStateContext,
    allocator: Allocator,
) !Transaction {
    const current_height = state_context.pre_header.height;

    // 1. Build output
    const recipient_tree = try Contract.payToAddress(recipient);
    var out_builder = try ErgoBoxCandidateBuilder.init(
        try BoxValue.init(amount),
        recipient_tree,
        current_height,
        allocator,
    );
    const output = try out_builder.build();

    // 2. Select inputs
    const total_needed = try BoxValue.init(amount + SUGGESTED_TX_FEE.as_u64());
    const selection = try SimpleBoxSelector.select(
        available_boxes,
        total_needed,
        &[_]Token{},
        allocator,
    );

    // 3. Build transaction
    const change_address = wallet.getP2PKAddress();
    var builder = try TxBuilder.init(
        selection,
        &[_]ErgoBoxCandidate{output},
        current_height,
        SUGGESTED_TX_FEE,
        change_address,
        allocator,
    );
    defer builder.deinit();

    const unsigned_tx = try builder.build();

    // 4. Create transaction context
    const tx_context = try TransactionContext.init(
        unsigned_tx,
        selection.boxes.items,
        null,
    );

    // 5. Sign transaction
    return wallet.signTransaction(tx_context, state_context, null);
}

Summary

  • TxBuilder constructs unsigned transactions with validation
  • BoxSelection satisfies value and token requirements
  • ErgoBoxCandidateBuilder creates output boxes with fluent API
  • TransactionContext bundles transaction with input data
  • reduce_tx() separates script evaluation from signing
  • ReducedTransaction enables air-gapped cold wallet signing
  • Token burn requires explicit permits to prevent accidents

Next: Chapter 28: Key Derivation

[1]: Scala: sdk/
[7]: Scala: AppkitProvingInterpreter.scala (token validation)
[10]: Rust: tx_context.rs
[12]: Rust: box_selector.rs

Chapter 28: Key Derivation

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

Learning Objectives

  • Understand BIP-32 hierarchical deterministic key derivation
  • Implement derivation paths and index encoding
  • Distinguish hardened from non-hardened derivation
  • Master EIP-3 key derivation for Ergo

HD Wallet Architecture

Hierarchical Deterministic (HD) wallets derive an unlimited number of keys from a single master seed[1][2]:

HD Key Derivation Tree
══════════════════════════════════════════════════════════════════

                     Master Seed (BIP-39)
                            │
                     HMAC-SHA512("Bitcoin seed", seed)
                            │
              ┌─────────────┴─────────────┐
              │                           │
         Master Key                  Chain Code
         (32 bytes)                  (32 bytes)
              │                           │
              └───────────┬───────────────┘
                          │
                   Extended Master Key
                          │
         ┌────────────────┼────────────────┐
         │                │                │
    m/44' (Purpose)  m/44'/429'      m/44'/429'/0'
         │           (Coin Type)      (Account)
         │                │                │
         ▼                ▼                ▼
    BIP-44 Keys      Ergo Keys       Account Keys

Index Types

Child indices distinguish hardened from normal derivation[3][4]:

const ChildIndex = union(enum) {
    hardened: HardenedIndex,
    normal: NormalIndex,

    const HardenedIndex = struct {
        value: u31, // 0 to 2^31-1

        pub fn toBits(self: HardenedIndex) u32 {
            return @as(u32, self.value) | HARDENED_BIT;
        }
    };

    const NormalIndex = struct {
        value: u31, // 0 to 2^31-1

        pub fn toBits(self: NormalIndex) u32 {
            return @as(u32, self.value);
        }

        pub fn next(self: NormalIndex) NormalIndex {
            return .{ .value = self.value + 1 };
        }
    };

    const HARDENED_BIT: u32 = 0x80000000; // 2^31

    pub fn hardened(i: u31) ChildIndex {
        return .{ .hardened = .{ .value = i } };
    }

    pub fn normal(i: u31) ChildIndex {
        return .{ .normal = .{ .value = i } };
    }

    pub fn toBits(self: ChildIndex) u32 {
        return switch (self) {
            .hardened => |h| h.toBits(),
            .normal => |n| n.toBits(),
        };
    }

    pub fn isHardened(self: ChildIndex) bool {
        return self == .hardened;
    }
};
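A quick check of the bit encoding, assuming the ChildIndex definition above is in scope:

```zig
test "child index bit encoding" {
    // Hardened indices set the top bit (2^31); normal indices pass through.
    try std.testing.expectEqual(@as(u32, 0x8000_0000), ChildIndex.hardened(0).toBits());
    try std.testing.expectEqual(@as(u32, 0x8000_002C), ChildIndex.hardened(44).toBits());
    try std.testing.expectEqual(@as(u32, 5), ChildIndex.normal(5).toBits());
    try std.testing.expect(ChildIndex.hardened(44).isHardened());
    try std.testing.expect(!ChildIndex.normal(44).isHardened());
}
```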

Hardened vs Normal Derivation

Derivation Security Properties
══════════════════════════════════════════════════════════════════

┌──────────────┬─────────────────┬─────────────────────────────────┐
│ Type         │ Index Range     │ Security Property               │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Normal       │ 0 to 2³¹-1      │ Public derivation possible      │
│              │ (0, 1, 2)       │ Child pubkey from parent pubkey │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Hardened     │ 2³¹ to 2³²-1    │ Requires private key            │
│              │ (0', 1', 2')    │ Prevents key leakage            │
└──────────────┴─────────────────┴─────────────────────────────────┘

Why Hardened Matters:
─────────────────────────────────────────────────────────────────
If attacker obtains:
  - Child private key (leaked)
  - Parent chain code (public in xpub)

With normal derivation: Attacker can compute parent private key!
With hardened derivation: Parent key remains secure

Derivation Path

Paths encode the key tree location[5][6]:

const DerivationPath = struct {
    indices: []const ChildIndex,

    const PURPOSE: ChildIndex = ChildIndex.hardened(44);
    const ERG_COIN_TYPE: ChildIndex = ChildIndex.hardened(429);
    const CHANGE_EXTERNAL: ChildIndex = ChildIndex.normal(0);

    /// Create EIP-3 compliant path: m/44'/429'/account'/0/address.
    /// The indices are copied into allocated memory so the returned
    /// slice outlives this function's stack frame.
    pub fn eip3(account: u31, address: u31, allocator: Allocator) !DerivationPath {
        const indices = try allocator.dupe(ChildIndex, &[_]ChildIndex{
            PURPOSE,
            ERG_COIN_TYPE,
            ChildIndex.hardened(account),
            CHANGE_EXTERNAL,
            ChildIndex.normal(address),
        });
        return .{ .indices = indices };
    }

    /// Master path (empty)
    pub fn master() DerivationPath {
        return .{ .indices = &[_]ChildIndex{} };
    }

    pub fn depth(self: *const DerivationPath) usize {
        return self.indices.len;
    }

    /// Extend path with new index
    pub fn extend(self: *const DerivationPath, index: ChildIndex, allocator: Allocator) !DerivationPath {
        var new_indices = try allocator.alloc(ChildIndex, self.indices.len + 1);
        @memcpy(new_indices[0..self.indices.len], self.indices);
        new_indices[self.indices.len] = index;
        return .{ .indices = new_indices };
    }

    /// Increment last index
    pub fn next(self: *const DerivationPath, allocator: Allocator) !DerivationPath {
        if (self.indices.len == 0) return error.EmptyPath;

        var new_indices = try allocator.dupe(ChildIndex, self.indices);
        const last = &new_indices[new_indices.len - 1];
        last.* = switch (last.*) {
            .hardened => |h| ChildIndex.hardened(h.value + 1),
            .normal => |n| ChildIndex.normal(n.value + 1),
        };
        return .{ .indices = new_indices };
    }
};

Path Parsing and Display

const PathParser = struct {
    pub fn parse(path_str: []const u8, allocator: Allocator) !DerivationPath {
        var indices = std.ArrayList(ChildIndex).init(allocator);

        var iter = std.mem.splitScalar(u8, path_str, '/');

        // First element must be 'm' or 'M'
        const master = iter.next() orelse return error.EmptyPath;
        if (!std.mem.eql(u8, master, "m") and !std.mem.eql(u8, master, "M")) {
            return error.InvalidMasterPrefix;
        }

        while (iter.next()) |segment| {
            const is_hardened = std.mem.endsWith(u8, segment, "'");
            const num_str = if (is_hardened)
                segment[0 .. segment.len - 1]
            else
                segment;

            const value = try std.fmt.parseInt(u31, num_str, 10);
            const index = if (is_hardened)
                ChildIndex.hardened(value)
            else
                ChildIndex.normal(value);

            try indices.append(index);
        }

        return .{ .indices = try indices.toOwnedSlice() };
    }

    pub fn format(path: *const DerivationPath, writer: anytype) !void {
        try writer.writeAll("m");
        for (path.indices) |index| {
            try writer.writeAll("/");
            switch (index) {
                .hardened => |h| try writer.print("{}'", .{h.value}),
                .normal => |n| try writer.print("{}", .{n.value}),
            }
        }
    }
};

EIP-3 Derivation Standard

Ergo's EIP-3 defines the derivation structure[7][8]:

EIP-3 Path Structure
══════════════════════════════════════════════════════════════════

m / 44' / 429' / account' / change / address
│    │      │        │         │        │
│    │      │        │         │        └── Address Index (normal)
│    │      │        │         └─────────── Change: 0=external, 1=internal
│    │      │        └───────────────────── Account Index (hardened)
│    │      └────────────────────────────── Coin Type: 429 (Ergo)
│    └───────────────────────────────────── Purpose: BIP-44
└────────────────────────────────────────── Master private key

Examples:
  m/44'/429'/0'/0/0   First address, first account
  m/44'/429'/0'/0/1   Second address, first account
  m/44'/429'/1'/0/0   First address, second account
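
To make the index encoding concrete, the following illustrative Python sketch (not part of the Zig codebase) shows how BIP-32 marks hardened indices by setting the top bit (0x80000000) and how an EIP-3 path renders as a string:

```python
# BIP-32 encodes a hardened index by adding 0x80000000 to the index value.
HARDENED_OFFSET = 0x80000000

def hardened(i):
    return HARDENED_OFFSET + i

def normal(i):
    return i

def eip3_path(account, address):
    # m / 44' / 429' / account' / 0 / address
    return [hardened(44), hardened(429), hardened(account),
            normal(0), normal(address)]

def to_string(indices):
    parts = ["m"]
    for idx in indices:
        if idx >= HARDENED_OFFSET:
            parts.append("%d'" % (idx - HARDENED_OFFSET))
        else:
            parts.append(str(idx))
    return "/".join(parts)

print(hex(hardened(44)))           # 0x8000002c
print(to_string(eip3_path(0, 0)))  # m/44'/429'/0'/0/0
```

Note that 44' serializes as 0x8000002C, which is why hardened and normal indices can never collide on the wire.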

Extended Secret Key

Extended keys pair key material with chain code[9][10]:

const ExtSecretKey = struct {
    key_bytes: [32]u8,      // Private key scalar
    chain_code: [32]u8,     // Chain code for derivation
    path: DerivationPath,

    const BITCOIN_SEED = "Bitcoin seed";

    /// Derive master key from seed
    pub fn deriveMaster(seed: []const u8) !ExtSecretKey {
        var hmac = HmacSha512.init(BITCOIN_SEED);
        hmac.update(seed);
        var output: [64]u8 = undefined;
        hmac.final(&output);

        return ExtSecretKey{
            .key_bytes = output[0..32].*,
            .chain_code = output[32..64].*,
            .path = DerivationPath.master(),
        };
    }

    /// Get public image (ProveDlog)
    pub fn publicImage(self: *const ExtSecretKey) ProveDlog {
        const scalar = Scalar.fromBytes(self.key_bytes);
        const point = CryptoConstants.generator.mul(scalar);
        return ProveDlog{ .h = point };
    }

    /// Get corresponding extended public key
    pub fn publicKey(self: *const ExtSecretKey) !ExtPubKey {
        return ExtPubKey{
            .key_bytes = self.publicImage().compress(),
            .chain_code = self.chain_code,
            .path = self.path,
        };
    }

    /// Zero out key material.
    /// SECURITY: secureZero performs volatile writes, preventing the
    /// compiler from optimizing the zeroing away.
    pub fn zeroSecret(self: *ExtSecretKey) void {
        std.crypto.utils.secureZero(u8, &self.key_bytes);
    }
};

Child Key Derivation

BIP-32 child derivation algorithm[11][12]:

pub fn deriveChild(parent: *const ExtSecretKey, index: ChildIndex, allocator: Allocator) !ExtSecretKey {
    var hmac = HmacSha512.init(&parent.chain_code);

    // HMAC input depends on derivation type
    switch (index) {
        .hardened => {
            // Hardened: 0x00 || parent_key (33 bytes)
            hmac.update(&[_]u8{0x00});
            hmac.update(&parent.key_bytes);
        },
        .normal => {
            // Normal: parent_public_key (33 bytes compressed)
            const pub_key = parent.publicImage().compress();
            hmac.update(&pub_key);
        },
    }

    // Append index as big-endian u32
    var index_bytes: [4]u8 = undefined;
    std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
    hmac.update(&index_bytes);

    var output: [64]u8 = undefined;
    hmac.final(&output);

    // Parse left 32 bytes as scalar
    const child_key_proto = Scalar.fromBytes(output[0..32].*);

    // Check validity (must be < group order)
    if (child_key_proto.isOverflow()) {
        return deriveChild(parent, index.next(), allocator);
    }

    // child_key = (child_key_proto + parent_key) mod n
    const parent_scalar = Scalar.fromBytes(parent.key_bytes);
    const child_scalar = child_key_proto.add(parent_scalar);

    // Check for zero (invalid)
    if (child_scalar.isZero()) {
        return deriveChild(parent, index.next(), allocator);
    }

    return ExtSecretKey{
        .key_bytes = child_scalar.toBytes(),
        .chain_code = output[32..64].*,
        .path = try parent.path.extend(index, allocator),
    };
}

/// Derive key at full path.
/// NOTE: intermediate keys are stack copies; production code should
/// zero each one after deriving its child.
pub fn derive(master: *const ExtSecretKey, path: DerivationPath, allocator: Allocator) !ExtSecretKey {
    var current = master.*;
    for (path.indices) |index| {
        current = try deriveChild(&current, index, allocator);
    }
    return current;
}

Extended Public Key

Public key derivation (non-hardened only)[13][14]:

const ExtPubKey = struct {
    key_bytes: [33]u8,      // Compressed public key
    chain_code: [32]u8,
    path: DerivationPath,

    pub fn deriveChild(parent: *const ExtPubKey, index: ChildIndex, allocator: Allocator) !ExtPubKey {
        // Cannot derive hardened children from public key
        if (index.isHardened()) {
            return error.HardenedDerivationRequiresPrivateKey;
        }

        var hmac = HmacSha512.init(&parent.chain_code);
        hmac.update(&parent.key_bytes);

        var index_bytes: [4]u8 = undefined;
        std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
        hmac.update(&index_bytes);

        var output: [64]u8 = undefined;
        hmac.final(&output);

        const child_key_proto = Scalar.fromBytes(output[0..32].*);

        if (child_key_proto.isOverflow()) {
            return deriveChild(parent, index.next(), allocator);
        }

        // child_public = point(child_key_proto) + parent_public
        const proto_point = CryptoConstants.generator.mul(child_key_proto);
        const parent_point = Point.decompress(parent.key_bytes);
        const child_point = proto_point.add(parent_point);

        if (child_point.isInfinity()) {
            return deriveChild(parent, index.next(), allocator);
        }

        return ExtPubKey{
            .key_bytes = child_point.compress(),
            .chain_code = output[32..64].*,
            .path = try parent.path.extend(index, allocator),
        };
    }
};

Mnemonic to Seed

BIP-39 seed derivation[15][16]:

const Mnemonic = struct {
    const PBKDF2_ITERATIONS: u32 = 2048;
    const SEED_LENGTH: usize = 64;

    /// Convert mnemonic phrase to seed using PBKDF2-HMAC-SHA512
    pub fn toSeed(phrase: []const u8, passphrase: []const u8) [SEED_LENGTH]u8 {
        var seed: [SEED_LENGTH]u8 = undefined;

        // Normalize using NFKD
        const normalized_phrase = normalizeNfkd(phrase);
        const normalized_pass = normalizeNfkd(passphrase);

        // Salt = "mnemonic" + passphrase
        // NOTE: the fixed buffer assumes the normalized passphrase fits
        // in 248 bytes; production code should allocate instead.
        var salt_buf: [256]u8 = undefined;
        const salt = std.fmt.bufPrint(&salt_buf, "mnemonic{s}", .{normalized_pass}) catch unreachable;

        // PBKDF2-HMAC-SHA512
        pbkdf2(
            HmacSha512,
            normalized_phrase,
            salt,
            PBKDF2_ITERATIONS,
            &seed,
        );

        return seed;
    }
};

/// Full derivation from mnemonic to key
pub fn mnemonicToKey(
    phrase: []const u8,
    passphrase: []const u8,
    path: DerivationPath,
    allocator: Allocator,
) !ExtSecretKey {
    const seed = Mnemonic.toSeed(phrase, passphrase);
    const master = try ExtSecretKey.deriveMaster(&seed);
    return derive(&master, path, allocator);
}
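
As a cross-check of the PBKDF2 step, a short Python sketch using only the standard library reproduces the well-known BIP-39 test vector for the all-"abandon" mnemonic with an empty passphrase:

```python
import hashlib

# Standard BIP-39 test mnemonic (all-zero entropy).
phrase = ("abandon abandon abandon abandon abandon abandon "
          "abandon abandon abandon abandon abandon about")
passphrase = ""

seed = hashlib.pbkdf2_hmac(
    "sha512",
    phrase.encode("utf-8"),                      # NFKD is a no-op for ASCII
    ("mnemonic" + passphrase).encode("utf-8"),   # salt = "mnemonic" + passphrase
    2048,                                        # BIP-39 iteration count
    dklen=64,
)

print(len(seed))        # 64
print(seed.hex()[:16])  # 5eb00bbddcf06908
```

The published test vector for this phrase begins 5eb00bbd…, so any implementation of `Mnemonic.toSeed` can be validated against the same inputs.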

Path Serialization

Binary format for storage/transfer[17][18]:

const DerivationPathSerializer = struct {
    pub fn serialize(path: *const DerivationPath, writer: anytype) !void {
        // Public branch flag (0x00 for private, 0x01 for public)
        try writer.writeByte(0x00);

        // Depth
        try writer.writeInt(u32, @intCast(path.indices.len), .little);

        // Each index as 4-byte big-endian
        for (path.indices) |index| {
            var bytes: [4]u8 = undefined;
            std.mem.writeInt(u32, &bytes, index.toBits(), .big);
            try writer.writeAll(&bytes);
        }
    }

    pub fn parse(reader: anytype, allocator: Allocator) !DerivationPath {
        const public_branch = try reader.readByte();
        _ = public_branch; // TODO: handle public branch

        const depth = try reader.readInt(u32, .little);

        var indices = try allocator.alloc(ChildIndex, depth);
        for (0..depth) |i| {
            var bytes: [4]u8 = undefined;
            try reader.readNoEof(&bytes);
            const bits = std.mem.readInt(u32, &bytes, .big);

            indices[i] = if (bits & 0x80000000 != 0)
                ChildIndex.hardened(@truncate(bits & 0x7FFFFFFF))
            else
                ChildIndex.normal(@truncate(bits));
        }

        return .{ .indices = indices };
    }
};
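
The wire layout mixes endianness (little-endian depth, big-endian indices), which is easy to get wrong. This illustrative Python sketch mirrors the serializer above so the byte layout can be inspected directly:

```python
import struct

HARDENED = 0x80000000

def serialize_path(indices):
    out = bytes([0x00])                     # private branch flag
    out += struct.pack("<I", len(indices))  # depth as little-endian u32
    for idx in indices:
        out += struct.pack(">I", idx)       # each index as big-endian u32
    return out

# m/44'/429'/0'/0/0
encoded = serialize_path([HARDENED + 44, HARDENED + 429, HARDENED + 0, 0, 0])

print(encoded.hex())
```

The first byte is the branch flag, bytes 1-4 are the depth (05 00 00 00), and the first index 44' appears as 80 00 00 2c.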

Watch-Only Wallet

Public key derivation enables watch-only wallets:

Watch-Only Wallet Setup
══════════════════════════════════════════════════════════════════

Full Wallet (has secrets)           Watch-Only Wallet (no secrets)
─────────────────────────           ──────────────────────────────

Master Secret Key
       │
       ├── m/44'/429'/0'            Extended Public Key
       │   (hardened account)  ───▶  at m/44'/429'/0'/0
       │          │                        │
       │          └── m/44'/429'/0'/0      ├── Address 0 public
       │              (change branch) ───▶ ├── Address 1 public
       │                   │               ├── Address 2 public
       │                   ├── 0           └── ... (can derive more)
       │                   ├── 1
       │                   └── 2           Cannot derive:
       │                                    × Account 1 keys
                                            × Hardened children
                                            × Private keys

Export at: m/44'/429'/0'/0 (parent of address keys)
Can derive: All non-hardened children (addresses)
Cannot derive: Hardened children, private keys

Usage Example

const allocator = std.heap.page_allocator;

// 1. From mnemonic to master key
const mnemonic = "abandon abandon abandon abandon abandon abandon " ++
                 "abandon abandon abandon abandon abandon about";
const seed = Mnemonic.toSeed(mnemonic, "");
var master = try ExtSecretKey.deriveMaster(&seed);
defer master.zeroSecret();

// 2. Derive first EIP-3 address key
const path = DerivationPath.eip3(0, 0); // m/44'/429'/0'/0/0
var first_key = try derive(&master, path, allocator);
defer first_key.zeroSecret();

// 3. Get public image for address
const pub_key = first_key.publicImage();

// 4. Derive next address
const next_path = try path.next(allocator);
var second_key = try derive(&master, next_path, allocator);
defer second_key.zeroSecret();

// 5. Create watch-only wallet
const watch_only_path = try PathParser.parse("m/44'/429'/0'/0", allocator);
var account_key = try derive(&master, watch_only_path, allocator);
const watch_only = try account_key.publicKey();

// 6. Derive address public keys without secrets
const addr0_pub = try watch_only.deriveChild(ChildIndex.normal(0), allocator);
const addr1_pub = try watch_only.deriveChild(ChildIndex.normal(1), allocator);

// 7. Cannot derive hardened from public key
if (watch_only.deriveChild(ChildIndex.hardened(0), allocator)) |_| {
    unreachable; // hardened derivation from a public key must fail
} else |err| {
    std.debug.assert(err == error.HardenedDerivationRequiresPrivateKey);
}

Security Considerations

Key Derivation Security
══════════════════════════════════════════════════════════════════

Attack: Child + Chain Code → Parent  ⚠️ PRACTICAL ATTACK
────────────────────────────────────────────────────────
This is NOT theoretical - a single compromised child key
(via malware, hardware fault, or insider threat) can
recover the entire account if normal derivation was used.

Given:
  - Child private key k_i
  - Parent chain code c

For NORMAL derivation:
  HMAC-SHA512(c, K_parent || i) = IL || IR
  k_i = IL + k_parent  mod n

  Attacker can compute:
  k_parent = k_i - IL  mod n  ← COMPROMISED!

For HARDENED derivation:
  HMAC-SHA512(c, 0x00 || k_parent || i) = IL || IR

  Cannot compute IL without knowing k_parent
  → Parent key remains SECURE

Recommendation:
  └── Always use hardened derivation for account/purpose levels
  └── Normal derivation only for address indices
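
The recovery algebra above can be demonstrated with concrete numbers. In this illustrative Python sketch the "compressed public key" is a stand-in 33-byte string (the arithmetic does not require a real EC point), and n is the secp256k1 group order:

```python
import hashlib
import hmac

# secp256k1 group order
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

chain_code = bytes(range(32))        # toy parent chain code
parent_pub = b"\x02" + bytes(32)     # stand-in compressed public key
k_parent = 0x1234567890ABCDEF        # toy parent private scalar
index = 0                            # normal (non-hardened) index

# Normal derivation: IL = HMAC-SHA512(c, serP(K_parent) || index)[:32]
data = parent_pub + index.to_bytes(4, "big")
il = int.from_bytes(
    hmac.new(chain_code, data, hashlib.sha512).digest()[:32], "big")
k_child = (il + k_parent) % n        # child private key

# Attacker knows k_child, chain_code, and parent_pub, so IL is
# recomputable and the parent key falls right out:
il_attacker = int.from_bytes(
    hmac.new(chain_code, data, hashlib.sha512).digest()[:32], "big")
recovered = (k_child - il_attacker) % n

print(recovered == k_parent)  # True
```

Because IL depends only on public data under normal derivation, one leaked child key plus the chain code compromises every sibling and the parent. Hardened derivation feeds the private key into the HMAC, breaking this attack.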

Summary

  • BIP-32 defines hierarchical deterministic key derivation
  • Derivation paths use notation m/44'/429'/0'/0/0
  • Hardened derivation (') requires private key; prevents key leakage
  • Normal derivation allows public key derivation from parent public key
  • EIP-3 standardizes Ergo's path: m/44'/429'/account'/change/address
  • Extended keys = key material (32 bytes) + chain code (32 bytes)
  • Watch-only wallets use extended public keys for address generation

Next: Chapter 29: Soft-Fork Mechanism

[8] Rust: derivation_path.rs:88-91 (PURPOSE, ERG, CHANGE constants)

[14] Rust: ext_pub_key.rs

[18] Rust: derivation_path.rs:235-241 (ledger_bytes)

Chapter 29: Soft-Fork Mechanism

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain version context and how script versioning enables protocol upgrades
  • Implement validation rules with configurable status (enabled, disabled, soft-fork)
  • Handle unknown opcodes gracefully to support future soft-forks
  • Describe the transition from AOT (Ahead-of-Time) to JIT (Just-in-Time) costing

Version Context Architecture

The soft-fork mechanism enables protocol upgrades without breaking consensus[1][2]:

Soft-Fork Version Architecture
══════════════════════════════════════════════════════════════════

┌─────────────────────────────────────────────────────────────────┐
│                    Block Header                                 │
│                                                                 │
│   Block Version: 1, 2, 3, 4                                     │
│                                                                 │
│   Activated Script Version = Block Version - 1                  │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                    ErgoTree Header                              │
│                                                                 │
│     7   6   5   4   3   2   1   0                               │
│   ┌───┬───┬───┬───┬───┬───┬───┬───┐                             │
│   │ M │ G │ C │ S │ Z │ V │ V │ V │                             │
│   └───┴───┴───┴───┴───┴───┴───┴───┘                             │
│   M = More bytes follow                                         │
│   G = GZIP (reserved)                                           │
│   C = Context costing (reserved)                                │
│   S = Constant segregation                                      │
│   Z = Size included                                             │
│   V = Version (0-7)                                             │
└─────────────────────────────────────────────────────────────────┘

ErgoTree Version

Script version is encoded in header bits 0-2[3][4]:

const ErgoTreeVersion = struct {
    value: u3, // 0-7

    const VERSION_MASK: u8 = 0x07;

    /// Version 0 - Initial mainnet (v3.x)
    pub const V0 = ErgoTreeVersion{ .value = 0 };
    /// Version 1 - Height monotonicity (v4.x)
    pub const V1 = ErgoTreeVersion{ .value = 1 };
    /// Version 2 - JIT interpreter (v5.x)
    pub const V2 = ErgoTreeVersion{ .value = 2 };
    /// Version 3 - Sub-blocks, new ops (v6.x)
    pub const V3 = ErgoTreeVersion{ .value = 3 };

    /// Maximum supported script version
    pub const MAX_SCRIPT_VERSION = V3;

    /// Parse version from header byte
    pub fn parseVersion(header_byte: u8) ErgoTreeVersion {
        return .{ .value = @truncate(header_byte & VERSION_MASK) };
    }

    pub fn toU8(self: ErgoTreeVersion) u8 {
        return @as(u8, self.value);
    }
};

ErgoTree Header

Header byte encoding with flags[5][6]:

const ErgoTreeHeader = struct {
    version: ErgoTreeVersion,
    is_constant_segregation: bool,
    has_size: bool,

    const CONSTANT_SEGREGATION_FLAG: u8 = 0b0001_0000;
    const HAS_SIZE_FLAG: u8 = 0b0000_1000;

    /// Parse header from byte
    pub fn parse(header_byte: u8) !ErgoTreeHeader {
        return .{
            .version = ErgoTreeVersion.parseVersion(header_byte),
            .is_constant_segregation = (header_byte & CONSTANT_SEGREGATION_FLAG) != 0,
            .has_size = (header_byte & HAS_SIZE_FLAG) != 0,
        };
    }

    /// Serialize header to byte
    pub fn serialize(self: *const ErgoTreeHeader) u8 {
        var header_byte: u8 = self.version.toU8();
        if (self.is_constant_segregation) {
            header_byte |= CONSTANT_SEGREGATION_FLAG;
        }
        if (self.has_size) {
            header_byte |= HAS_SIZE_FLAG;
        }
        return header_byte;
    }

    /// Create v0 header
    pub fn v0(constant_segregation: bool) ErgoTreeHeader {
        return .{
            .version = ErgoTreeVersion.V0,
            .is_constant_segregation = constant_segregation,
            .has_size = false,
        };
    }

    /// Create v1 header (size is mandatory)
    pub fn v1(constant_segregation: bool) ErgoTreeHeader {
        return .{
            .version = ErgoTreeVersion.V1,
            .is_constant_segregation = constant_segregation,
            .has_size = true,
        };
    }
};

Version Context

Thread-local context tracks activated and tree versions[7][8]:

const VersionContext = struct {
    activated_version: u8,
    ergo_tree_version: u8,

    /// JIT costing activation version (v5.0)
    const JIT_ACTIVATION_VERSION: u8 = 2;
    /// v6.0 soft-fork version
    const V6_SOFT_FORK_VERSION: u8 = 3;

    pub fn init(activated: u8, tree: u8) !VersionContext {
        // ergoTreeVersion must never exceed activatedVersion
        if (activated >= JIT_ACTIVATION_VERSION and tree > activated) {
            return error.InvalidVersionContext;
        }
        return .{
            .activated_version = activated,
            .ergo_tree_version = tree,
        };
    }

    /// True if JIT costing is activated (v5.0+)
    pub fn isJitActivated(self: *const VersionContext) bool {
        return self.activated_version >= JIT_ACTIVATION_VERSION;
    }

    /// True if v6.0 protocol is activated
    pub fn isV6Activated(self: *const VersionContext) bool {
        return self.activated_version >= V6_SOFT_FORK_VERSION;
    }

    /// True if v3+ ErgoTree version
    pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
        return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
    }
};

/// Thread-local version context
threadlocal var current_context: ?VersionContext = null;

pub fn withVersions(
    activated: u8,
    tree: u8,
    comptime block: fn (*const VersionContext) anyerror!void,
) !void {
    const ctx = try VersionContext.init(activated, tree);
    const prev = current_context;
    current_context = ctx;
    defer current_context = prev;
    try block(&ctx);
}

pub fn currentContext() !*const VersionContext {
    // Return a pointer into the thread-local storage itself;
    // taking the address of an `orelse` result would point at a temporary.
    if (current_context) |*ctx| return ctx;
    return error.VersionContextNotSet;
}

Version History

Protocol Version History
══════════════════════════════════════════════════════════════════

┌─────────────┬────────────────┬──────────────┬────────────────────┐
│ Block Ver   │ Script Ver     │ Protocol     │ Features           │
├─────────────┼────────────────┼──────────────┼────────────────────┤
│ 1           │ 0              │ v3.x         │ Initial mainnet    │
│ 2           │ 1              │ v4.x         │ Height monotonicity│
│ 3           │ 2              │ v5.x         │ JIT interpreter    │
│ 4           │ 3              │ v6.x         │ Sub-blocks, new ops│
└─────────────┴────────────────┴──────────────┴────────────────────┘

Relation: activated_script_version = block_version - 1
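
The table above can be encoded directly; this illustrative Python sketch captures the block-to-script version relation and the JIT activation check from `VersionContext`:

```python
# Version history table: block version -> (script version, protocol, feature)
VERSIONS = {
    1: (0, "v3.x", "Initial mainnet"),
    2: (1, "v4.x", "Height monotonicity"),
    3: (2, "v5.x", "JIT interpreter"),
    4: (3, "v6.x", "Sub-blocks, new ops"),
}

def activated_script_version(block_version):
    return block_version - 1

def is_jit_activated(activated):
    return activated >= 2   # JIT costing activates at script version 2 (v5.0)

for block_ver, (script_ver, _, _) in VERSIONS.items():
    assert activated_script_version(block_ver) == script_ver

print(is_jit_activated(activated_script_version(2)))  # False
print(is_jit_activated(activated_script_version(3)))  # True
```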

Rule Status

Validation rules have configurable status[9][10]:

const RuleStatus = union(enum) {
    /// Default: rule is active and enforced
    enabled,
    /// Rule is disabled (via voting)
    disabled,
    /// Rule replaced by new rule
    replaced: struct { new_rule_id: u16 },
    /// Rule parameters changed
    changed: struct { new_value: []const u8 },

    const StatusCode = enum(u8) {
        enabled = 1,
        disabled = 2,
        replaced = 3,
        changed = 4,
    };

    pub fn statusCode(self: RuleStatus) StatusCode {
        return switch (self) {
            .enabled => .enabled,
            .disabled => .disabled,
            .replaced => .replaced,
            .changed => .changed,
        };
    }
};

Validation Rules

Rules define validation behavior with soft-fork support[11][12]:

const ValidationRule = struct {
    id: u16,
    description: []const u8,
    soft_fork_checker: SoftForkChecker,
    checked: bool = false,

    pub fn checkRule(self: *ValidationRule, settings: *const ValidationSettings) !void {
        if (!self.checked) {
            if (settings.getStatus(self.id) == null) {
                return error.ValidationRuleNotFound;
            }
            self.checked = true;
        }
    }

    pub fn throwValidationException(
        self: *const ValidationRule,
        cause: anyerror,
        args: []const u8,
    ) ValidationError {
        return ValidationError{
            .rule = self,
            .args = args,
            .cause = cause,
        };
    }
};

const ValidationError = struct {
    rule: *const ValidationRule,
    args: []const u8,
    cause: anyerror,
};

Core Validation Rules

const ValidationRules = struct {
    const FIRST_RULE_ID: u16 = 1000;

    /// Check primitive type code is valid
    pub const CheckPrimitiveTypeCode = ValidationRule{
        .id = 1007,
        .description = "Check primitive type code is supported or added via soft-fork",
        .soft_fork_checker = .code_added,
    };

    /// Check non-primitive type code is valid
    pub const CheckTypeCode = ValidationRule{
        .id = 1008,
        .description = "Check non-primitive type code is supported or added via soft-fork",
        .soft_fork_checker = .code_added,
    };

    /// Check data can be serialized for type
    pub const CheckSerializableTypeCode = ValidationRule{
        .id = 1009,
        .description = "Check data values of type can be serialized",
        .soft_fork_checker = .when_replaced,
    };

    /// Check reader position limit
    pub const CheckPositionLimit = ValidationRule{
        .id = 1014,
        .description = "Check Reader position limit",
        .soft_fork_checker = .when_replaced,
    };
};

Soft-Fork Checkers

Detect soft-fork conditions from validation failures[13][14]:

const SoftForkChecker = enum {
    none,
    when_replaced,
    code_added,

    pub fn isSoftFork(
        self: SoftForkChecker,
        settings: *const ValidationSettings,
        rule_id: u16,
        status: RuleStatus,
        args: []const u8,
    ) bool {
        return switch (self) {
            .none => false,
            .when_replaced => switch (status) {
                .replaced => true,
                else => false,
            },
            .code_added => switch (status) {
                .changed => |c| std.mem.indexOf(u8, c.new_value, args) != null,
                else => false,
            },
        };
    }
};

Validation Settings

Configurable settings from blockchain state[15][16]:

const ValidationSettings = struct {
    rules: std.AutoHashMap(u16, struct { rule: *ValidationRule, status: RuleStatus }),

    pub fn getStatus(self: *const ValidationSettings, id: u16) ?RuleStatus {
        if (self.rules.get(id)) |entry| {
            return entry.status;
        }
        return null;
    }

    pub fn updated(self: *const ValidationSettings, id: u16, new_status: RuleStatus) !ValidationSettings {
        var new_rules = try self.rules.clone();
        if (new_rules.getPtr(id)) |entry| {
            entry.status = new_status;
        }
        return .{ .rules = new_rules };
    }

    /// Check if exception represents a soft-fork condition
    pub fn isSoftFork(self: *const ValidationSettings, ve: ValidationError) bool {
        const entry = self.rules.get(ve.rule.id) orelse return false;

        // Don't tolerate replaced v5.0 rules after v6.0 activation
        switch (entry.status) {
            .replaced => {
                const ctx = currentContext() catch return false;
                if (ctx.isV6Activated() and
                    (ve.rule.id == 1011 or ve.rule.id == 1007 or ve.rule.id == 1008))
                {
                    return false;
                }
                return true;
            },
            else => return entry.rule.soft_fork_checker.isSoftFork(
                self,
                ve.rule.id,
                entry.status,
                ve.args,
            ),
        }
    }
};

Soft-Fork Execution Wrapper

Execute code with soft-fork fallback:

/// Details of the last validation failure. Zig error codes carry no
/// payload, so the failing rule is reported through this thread-local
/// side channel alongside a generic error code.
threadlocal var last_validation_error: ?ValidationError = null;

pub fn trySoftForkable(
    comptime T: type,
    settings: *const ValidationSettings,
    when_soft_fork: T,
    context: anytype,
    comptime block: fn (@TypeOf(context)) anyerror!T,
) anyerror!T {
    return block(context) catch |err| {
        if (last_validation_error) |ve| {
            last_validation_error = null;
            if (settings.isSoftFork(ve)) {
                return when_soft_fork;
            }
        }
        return err;
    };
}

// Usage: handling unknown opcodes
fn deserializeValue(
    reader: *Reader,
    settings: *const ValidationSettings,
) !Value {
    return trySoftForkable(
        Value,
        settings,
        // Soft-fork fallback: return unit placeholder
        Value.unit,
        // Context argument (Zig functions cannot capture enclosing locals)
        reader,
        struct {
            fn f(r: *Reader) anyerror!Value {
                const op_code = try r.readByte();
                const serializer = getSerializer(op_code) orelse
                    return error.UnknownOpCode;
                return serializer.parse(r);
            }
        }.f,
    );
}

AOT to JIT Transition

Script Validation Rules Across Versions
══════════════════════════════════════════════════════════════════

Rule │ Block │ Block Type │ Script │ v4.0 Action     │ v5.0 Action
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
  1  │ 1,2   │ candidate  │ v0/v1  │ AOT-cost,verify │ AOT-cost,verify
  2  │ 1,2   │ mined      │ v0/v1  │ AOT-cost,verify │ AOT-cost,verify
  3  │ 1,2   │ candidate  │ v2     │ skip-pool-tx    │ skip-pool-tx
  4  │ 1,2   │ mined      │ v2     │ skip-reject     │ skip-reject
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
  5  │ 3     │ candidate  │ v0/v1  │ skip-pool-tx    │ JIT-verify
  6  │ 3     │ mined      │ v0/v1  │ skip-accept     │ JIT-verify
  7  │ 3     │ candidate  │ v2     │ skip-pool-tx    │ JIT-verify
  8  │ 3     │ mined      │ v2     │ skip-accept     │ JIT-verify

Actions:
  AOT-cost,verify  Cost estimation + verification using v4.0 AOT
  JIT-verify       Verification using v5.0 JIT interpreter
  skip-pool-tx     Skip mempool transaction (can't handle)
  skip-accept      Accept block without verification (trust majority)
  skip-reject      Reject transaction/block (invalid version)
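
The eight rows above reduce to a small decision function. This illustrative Python sketch encodes the table so each (release, block version, block type, script version) combination maps to its action:

```python
# Sketch of the script-validation rule table (rows 1-8 above).
def validation_action(release, block_version, block_type, script_version):
    if block_version <= 2:                 # rows 1-4: before the v5.0 soft-fork
        if script_version <= 1:
            return "AOT-cost,verify"       # rows 1-2
        if block_type == "candidate":
            return "skip-pool-tx"          # row 3: can't handle v2 scripts
        return "skip-reject"               # row 4: invalid version in mined block
    # rows 5-8: block version 3, soft-fork active
    if release == "v4.0":
        if block_type == "candidate":
            return "skip-pool-tx"          # rows 5, 7
        return "skip-accept"               # rows 6, 8: trust the majority
    return "JIT-verify"                    # v5.0 verifies everything with JIT

print(validation_action("v4.0", 1, "mined", 0))      # AOT-cost,verify
print(validation_action("v5.0", 3, "mined", 2))      # JIT-verify
print(validation_action("v4.0", 3, "mined", 1))      # skip-accept
```

The key asymmetry is that an old (v4.0) node never rejects a mined block it cannot verify after the soft-fork activates; it accepts on majority trust, which is what keeps consensus intact.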

Consensus Equivalence Properties

For safe transition between interpreter versions:

// Property 1: AOT-verify preserved between releases
// forall s:ScriptV0/V1, R4.0-AOT-verify(s) == R5.0-AOT-verify(s)

// Property 2: AOT-cost preserved
// forall s:ScriptV0/V1, R4.0-AOT-cost(s) == R5.0-AOT-cost(s)

// Property 3: JIT can replace AOT
// forall s:ScriptV0/V1, R5.0-JIT-verify(s) == R4.0-AOT-verify(s)

// Property 4: JIT cost bounded by AOT
// forall s:ScriptV0/V1, R5.0-JIT-cost(s) <= R4.0-AOT-cost(s)

// Property 5: ScriptV2 rejected before soft-fork
// forall s:ScriptV2, if not SF active => reject

Version-Aware Interpreter

pub fn verify(
    ergo_tree: *const ErgoTree,
    ctx: *const Context,
) !bool {
    const script_version = ergo_tree.header.version;
    const activated_version = ctx.activatedScriptVersion();

    // Execute with proper version context
    const version_ctx = try VersionContext.init(
        activated_version.toU8(),
        script_version.toU8(),
    );

    const prev = current_context;
    current_context = version_ctx;
    defer current_context = prev;

    // Version-specific execution
    if (version_ctx.isJitActivated()) {
        return verifyJit(ergo_tree, ctx);
    } else {
        return verifyAot(ergo_tree, ctx);
    }
}

fn verifyJit(tree: *const ErgoTree, ctx: *const Context) !bool {
    const reduced = try fullReduction(tree, ctx);
    return verifySignature(reduced, ctx.messageToSign());
}

fn verifyAot(tree: *const ErgoTree, ctx: *const Context) !bool {
    // Legacy AOT interpreter path
    const result = try aotEvaluate(tree, ctx);
    return verifySignature(result, ctx.messageToSign());
}

Block Extension Voting

Rule status changes via blockchain extension voting:

Extension Voting Flow
══════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────────┐
│                    Block Extension Section                         │
│                                                                    │
│  Key (2 bytes)    │    Value                                       │
│  ─────────────────┼─────────────────────────────────────────────── │
│  Rule ID          │    RuleStatus + data                           │
│  0x03EF (1007)    │    ChangedRule([0x5A, 0x5B])                   │
│                   │    (new opcodes 0x5A, 0x5B allowed)            │
└────────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────────────┐
│                    Voting Epoch                                    │
│                                                                    │
│  Epoch 1:    □ □ □ ■ □ ■ ■ □ ■ ■   (5/10 = 50%)              │
│  Epoch 2:    ■ ■ □ ■ ■ ■ □ ■ ■ ■   (8/10 = 80%)              │
│  Epoch 3:    ■ ■ ■ ■ ■ □ ■ ■ ■ ■   (9/10 = 90%) → ACTIVATED  │
└────────────────────────────────────────────────────────────────────┘

New Opcode Addition:
  1. Before soft-fork: Unknown opcode → ValidationException
  2. Extension update: ChangedRule(Array(newOpcode)) for rule 1011 (CheckValidOpCode)
  3. After activation: Old nodes recognize soft-fork via SoftForkWhenCodeAdded
  4. Result: Old nodes skip verification; new nodes execute new opcode
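 
The epoch tally pictured above amounts to a supermajority check. In this illustrative Python sketch the 90% threshold is for illustration only (consult the node's consensus parameters for the actual values):

```python
# Illustrative supermajority check over one voting epoch.
THRESHOLD = 0.9   # assumed activation threshold (illustrative)

def epoch_activates(votes):
    """votes: list of 0/1 per block in the epoch."""
    return sum(votes) / len(votes) >= THRESHOLD

epoch1 = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # 5/10 = 50%
epoch2 = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]   # 8/10 = 80%
epoch3 = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1]   # 9/10 = 90%

print(epoch_activates(epoch1))  # False
print(epoch_activates(epoch2))  # False
print(epoch_activates(epoch3))  # True
```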

Unknown Opcode Handling

fn handleUnknownOpcode(
    reader: *Reader,
    settings: *const ValidationSettings,
    op_code: u8,
) !Expr {
    // Check if this is a soft-fork condition
    const rule = &ValidationRules.CheckTypeCode;
    const status = settings.getStatus(rule.id) orelse return error.RuleNotFound;

    switch (status) {
        .changed => |c| {
            // Check if opcode was added via soft-fork
            if (std.mem.indexOfScalar(u8, c.new_value, op_code) != null) {
                // Soft-fork: skip remaining bytes, return placeholder
                reader.skipToEnd();
                return Expr{ .constant = Constant.unit };
            }
        },
        else => {},
    }

    // Not a soft-fork condition - fail hard
    return rule.throwValidationException(error.UnknownOpCode, &[_]u8{op_code});
}

Summary

  • ErgoTreeVersion encodes script version in 3-bit header field (0-7)
  • VersionContext tracks activated protocol and tree versions
  • RuleStatus can be Enabled, Disabled, Replaced, or Changed
  • SoftForkChecker detects soft-fork conditions from validation failures
  • trySoftForkable provides graceful fallback for unknown constructs
  • AOT→JIT transition demonstrated soft-fork for major interpreter change
  • Block extension voting enables parameter changes via miner consensus
  • Old nodes remain compatible by trusting majority on unverifiable blocks

Next: Chapter 30: Cross-Platform Support

3

Scala: ErgoTree.scala (header)

10

Rust: Not directly present in sigma-rust; validation handled at higher level

12

Rust: Validation rules embedded in deserializer implementations

14

Rust: Soft-fork handling at application level (ergo-lib)

16

Rust: parameters.rs (blockchain parameters)

Chapter 30: Cross-Platform Support

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 9 for platform-specific cryptographic implementations
  • Chapter 27 for SDK APIs that must work across platforms
  • Familiarity with build systems and compilation toolchains

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain Zig's cross-compilation architecture and target selection
  • Implement platform abstraction layers for OS-specific functionality
  • Use conditional compilation (comptime if) for target-specific code
  • Build for WASM, native, and embedded targets from a single codebase

Cross-Compilation Architecture

Zig provides native cross-compilation to any target from any host12:

Cross-Compilation Targets
══════════════════════════════════════════════════════════════════

┌─────────────────────────────────────────────────────────────────┐
│                    Host Build System                            │
│                                                                 │
│   zig build -Dtarget=<target>                                   │
└────────────────────────┬────────────────────────────────────────┘

Common Targets:
  x86_64-linux-gnu       Linux desktop/server
  aarch64-linux-gnu      ARM64 Linux (Raspberry Pi, etc.)
  x86_64-macos           macOS Intel
  aarch64-macos          macOS Apple Silicon
  x86_64-windows-gnu     Windows
  wasm32-wasi            WebAssembly with WASI
  wasm32-freestanding    WebAssembly browser

Platform Abstraction

Platform-specific code via conditional compilation:

const builtin = @import("builtin");
const std = @import("std");

const Platform = struct {
    pub const target = builtin.target;
    pub const os = target.os.tag;
    pub const arch = target.cpu.arch;

    pub const is_wasm = arch == .wasm32 or arch == .wasm64;
    pub const is_native = !is_wasm;
    pub const is_windows = os == .windows;
    pub const is_linux = os == .linux;
    pub const is_macos = os == .macos;

    /// Get platform-appropriate crypto implementation
    pub fn getCrypto() type {
        if (is_wasm) {
            return WasmCrypto;
        } else {
            return NativeCrypto;
        }
    }

    /// Get platform-appropriate allocator
    pub fn getDefaultAllocator() std.mem.Allocator {
        if (is_wasm) {
            return std.heap.wasm_allocator;
        } else if (builtin.link_libc) {
            return std.heap.c_allocator;
        } else {
            return std.heap.page_allocator;
        }
    }
};

Crypto Abstraction Layer

Platform-agnostic cryptography interface34.

SECURITY: All implementations of cryptographic operations involving secret data (scalar multiplication, HMAC with secret keys, etc.) must be constant-time to prevent timing side-channel attacks. Use libraries that guarantee constant-time behavior (e.g., libsecp256k1, Zig's std.crypto).
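One part of that rule that is easy to demonstrate portably is MAC comparison: comparing digests with ordinary `==` returns at the first mismatching byte and leaks timing. Python's `hmac.compare_digest` shows the constant-time alternative (the 64-byte values here are placeholders):

```python
import hmac

# Constant-time comparison runs in time independent of where the
# inputs differ; `==` short-circuits and leaks the mismatch position.
expected = bytes(64)                    # placeholder 64-byte MAC
received_ok = bytes(64)
received_bad = bytes([1]) + bytes(63)   # differs in the first byte

assert hmac.compare_digest(expected, received_ok) is True
assert hmac.compare_digest(expected, received_bad) is False
```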

const CryptoFacade = struct {
    const Impl = Platform.getCrypto();

    pub const SECRET_KEY_LENGTH: usize = 32;
    pub const PUBLIC_KEY_LENGTH: usize = 33;
    pub const SIGNATURE_LENGTH: usize = 64;

    /// Create new crypto context
    pub fn createContext() CryptoContext {
        return Impl.createContext();
    }

    /// Normalize point to affine coordinates
    pub fn normalizePoint(p: Ecp) Ecp {
        return Impl.normalizePoint(p);
    }

    /// Negate point (y-coordinate)
    pub fn negatePoint(p: Ecp) Ecp {
        return Impl.negatePoint(p);
    }

    /// Check if point is infinity
    pub fn isInfinityPoint(p: Ecp) bool {
        return Impl.isInfinityPoint(p);
    }

    /// Point exponentiation: p^n
    pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
        return Impl.exponentiatePoint(p, n);
    }

    /// Point multiplication (addition in EC group): p1 + p2
    pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
        return Impl.multiplyPoints(p1, p2);
    }

    /// Encode point in SEC1 form; the 33-byte return type only
    /// accommodates the compressed encoding (uncompressed is 65 bytes)
    pub fn encodePoint(p: Ecp, compressed: bool) [PUBLIC_KEY_LENGTH]u8 {
        return Impl.encodePoint(p, compressed);
    }

    /// HMAC-SHA512
    pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
        return Impl.hashHmacSha512(key, data);
    }

    /// PBKDF2-HMAC-SHA512
    pub fn generatePbkdf2Key(
        password: []const u8,
        salt: []const u8,
        iterations: u32,
    ) [64]u8 {
        return Impl.generatePbkdf2Key(password, salt, iterations);
    }

    /// Secure random bytes
    pub fn randomBytes(dest: []u8) void {
        Impl.randomBytes(dest);
    }
};

Native Crypto Implementation

Using Zig's standard library and optional C bindings56:

const NativeCrypto = struct {
    const std = @import("std");
    const crypto = std.crypto;

    pub fn createContext() CryptoContext {
        return CryptoContext.secp256k1();
    }

    pub fn normalizePoint(p: Ecp) Ecp {
        return p.normalize();
    }

    pub fn negatePoint(p: Ecp) Ecp {
        return p.negate();
    }

    pub fn isInfinityPoint(p: Ecp) bool {
        return p.isIdentity();
    }

    pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
        return p.mul(n.*);
    }

    pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
        return p1.add(p2);
    }

    pub fn encodePoint(p: Ecp, compressed: bool) [33]u8 {
        // Uncompressed SEC1 is 65 bytes and cannot fit this return
        // type; truncating it would corrupt the encoding.
        std.debug.assert(compressed);
        return p.toCompressedSec1();
    }

    pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
        const HmacSha512 = crypto.auth.hmac.sha2.HmacSha512;
        var out: [HmacSha512.mac_length]u8 = undefined;
        HmacSha512.create(&out, data, key);
        return out;
    }

    pub fn generatePbkdf2Key(
        password: []const u8,
        salt: []const u8,
        iterations: u32,
    ) [64]u8 {
        var result: [64]u8 = undefined;
        crypto.pwhash.pbkdf2(
            &result,
            password,
            salt,
            iterations,
            crypto.auth.hmac.sha2.HmacSha512,
        ) catch unreachable; // fails only on degenerate parameters
        return result;
    }

    pub fn randomBytes(dest: []u8) void {
        crypto.random.bytes(dest);
    }
};

WASM Crypto Implementation

WebAssembly-specific implementation using imports:

const WasmCrypto = struct {
    // External functions imported from JavaScript host
    extern "env" fn crypto_random_bytes(ptr: [*]u8, len: usize) void;
    extern "env" fn crypto_hmac_sha512(
        key_ptr: [*]const u8,
        key_len: usize,
        data_ptr: [*]const u8,
        data_len: usize,
        out_ptr: [*]u8,
    ) void;
    extern "env" fn crypto_secp256k1_mul(
        point_ptr: [*]const u8,
        scalar_ptr: [*]const u8,
        out_ptr: [*]u8,
    ) void;

    pub fn createContext() CryptoContext {
        return CryptoContext.secp256k1();
    }

    pub fn randomBytes(dest: []u8) void {
        crypto_random_bytes(dest.ptr, dest.len);
    }

    pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
        var result: [64]u8 = undefined;
        crypto_hmac_sha512(
            key.ptr,
            key.len,
            data.ptr,
            data.len,
            &result,
        );
        return result;
    }

    pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
        var result: [33]u8 = undefined;
        crypto_secp256k1_mul(
            &p.toCompressedSec1(),
            &n.toBytes(),
            &result,
        );
        return Ecp.fromCompressedSec1(result) catch unreachable;
    }

    // ... other operations using WASM imports
};

WASM JavaScript Host

JavaScript glue code for browser/Node.js:

// wasm_host.js - JavaScript host for WASM crypto
const crypto = require('crypto');
const secp256k1 = require('secp256k1');

const imports = {
    env: {
        crypto_random_bytes: (ptr, len) => {
            const bytes = crypto.randomBytes(len);
            const mem = new Uint8Array(wasmMemory.buffer, ptr, len);
            mem.set(bytes);
        },

        crypto_hmac_sha512: (keyPtr, keyLen, dataPtr, dataLen, outPtr) => {
            const key = new Uint8Array(wasmMemory.buffer, keyPtr, keyLen);
            const data = new Uint8Array(wasmMemory.buffer, dataPtr, dataLen);
            const hmac = crypto.createHmac('sha512', key);
            hmac.update(data);
            const result = hmac.digest();
            const out = new Uint8Array(wasmMemory.buffer, outPtr, 64);
            out.set(result);
        },

        crypto_secp256k1_mul: (pointPtr, scalarPtr, outPtr) => {
            const point = new Uint8Array(wasmMemory.buffer, pointPtr, 33);
            const scalar = new Uint8Array(wasmMemory.buffer, scalarPtr, 32);
            const result = secp256k1.publicKeyTweakMul(point, scalar, true);
            const out = new Uint8Array(wasmMemory.buffer, outPtr, 33);
            out.set(result);
        }
    }
};

Conditional Compilation

Target-specific code paths:

const builtin = @import("builtin");

pub fn getTimestamp() i64 {
    if (builtin.target.os.tag == .wasi) {
        // WASI clock_time_get
        var ts: std.os.wasi.timestamp_t = undefined;
        _ = std.os.wasi.clock_time_get(.REALTIME, 1, &ts);
        return @intCast(ts / 1_000_000_000);
    } else if (builtin.target.cpu.arch == .wasm32) {
        // Freestanding WASM - use imported function
        return wasmGetTimestamp();
    } else {
        // Native - use std
        return std.time.timestamp();
    }
}

pub fn allocate(comptime T: type, n: usize) ![]T {
    const allocator = if (Platform.is_wasm)
        std.heap.wasm_allocator
    else if (builtin.link_libc)
        std.heap.c_allocator
    else
        std.heap.page_allocator;

    return allocator.alloc(T, n);
}

Build Configuration

build.zig for multi-target builds:

const std = @import("std");

pub fn build(b: *std.Build) void {
    // Native target (default)
    const native_target = b.standardTargetOptions(.{});
    const native_optimize = b.standardOptimizeOption(.{});

    const lib = b.addStaticLibrary(.{
        .name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
        .target = native_target,
        .optimize = native_optimize,
    });

    // WASM target
    const wasm_target = b.resolveTargetQuery(.{
        .cpu_arch = .wasm32,
        .os_tag = .freestanding,
    });

    const wasm_lib = b.addStaticLibrary(.{
        .name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
        .target = wasm_target,
        .optimize = .ReleaseSmall,
    });

    // Export for JavaScript
    wasm_lib.rdynamic = true;
    wasm_lib.export_memory = true;

    // WASI target (for Node.js)
    const wasi_target = b.resolveTargetQuery(.{
        .cpu_arch = .wasm32,
        .os_tag = .wasi,
    });

    const wasi_lib = b.addExecutable(.{
        .name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
        .target = wasi_target,
        .optimize = .ReleaseSmall,
    });

    // Install all targets
    b.installArtifact(lib);
    b.installArtifact(wasm_lib);
    b.installArtifact(wasi_lib);
}

Memory Management

Platform-specific allocation strategies:

const Allocator = std.mem.Allocator;

const MemoryConfig = struct {
    /// Maximum memory for WASM (64KB pages)
    wasm_max_pages: u32 = 256, // 16MB

    /// Use arena for batch operations
    use_arena: bool = true,

    /// Pre-allocate constant pool
    constant_pool_size: usize = 4096,
};

pub fn createPlatformAllocator(config: MemoryConfig) Allocator {
    if (Platform.is_wasm) {
        // WASM linear memory: the shared wasm allocator grows memory in
        // 64KB pages; config.wasm_max_pages must be enforced by the host
        // at module instantiation.
        return std.heap.wasm_allocator;
    }
    // Native: hand out the page allocator. Callers that want batch
    // freeing wrap it in an ArenaAllocator they own -- returning the
    // allocator of a function-local arena would dangle.
    return std.heap.page_allocator;
}

Type Representation

Consistent types across platforms:

/// Platform-independent big integer
pub const BigInt = struct {
    limbs: []u64,
    positive: bool,

    pub fn fromBytes(allocator: Allocator, bytes: []const u8) !BigInt {
        // Works on all platforms
        var limbs = std.ArrayList(u64).init(allocator);
        // ... conversion logic
        return .{ .limbs = try limbs.toOwnedSlice(), .positive = true };
    }

    pub fn toBytes(self: *const BigInt, buf: []u8) []u8 {
        // Consistent byte representation
        // ... conversion logic
        return buf[0..written];
    }
};

/// Platform-independent scalar (256-bit)
pub const Scalar = struct {
    bytes: [32]u8,

    pub fn fromBigInt(n: *const BigInt) Scalar {
        var result: Scalar = undefined;
        _ = n.toBytes(&result.bytes);
        return result;
    }
};

Endianness Handling

Consistent byte order across architectures:

pub fn readU32BE(bytes: []const u8) u32 {
    return std.mem.readInt(u32, bytes[0..4], .big);
}

pub fn writeU32BE(value: u32, buf: []u8) void {
    std.mem.writeInt(u32, buf[0..4], value, .big);
}

pub fn readU64LE(bytes: []const u8) u64 {
    return std.mem.readInt(u64, bytes[0..8], .little);
}

// Serialization always uses network byte order (big-endian)
pub fn serializeInt(value: anytype, writer: anytype) !void {
    const T = @TypeOf(value);
    var buf: [@sizeOf(T)]u8 = undefined;
    std.mem.writeInt(T, &buf, value, .big);
    try writer.writeAll(&buf);
}
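The big-endian convention is easy to cross-check from any language; a small Python verification using `struct`:

```python
import struct

value = 0x01020304

big = struct.pack(">I", value)      # network byte order, as serializeInt uses
little = struct.pack("<I", value)

assert big.hex() == "01020304"      # most-significant byte first
assert little.hex() == "04030201"

# Round trip matches readU32BE above
assert struct.unpack(">I", big)[0] == value
```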

Performance Considerations

Platform Performance Characteristics
══════════════════════════════════════════════════════════════════

┌─────────────────┬───────────────────────────────────────────────┐
│ Platform        │ Characteristics                               │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (x86_64) │ ✓ SIMD acceleration (AVX2/AVX512)             │
│                 │ ✓ Hardware AES-NI                             │
│                 │ ✓ Large memory, fast allocation               │
│                 │ ✓ Multi-threaded execution                    │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (ARM64)  │ ✓ NEON SIMD                                   │
│                 │ ✓ Hardware crypto extensions                  │
│                 │ ✓ Power-efficient                             │
├─────────────────┼───────────────────────────────────────────────┤
│ WASM (browser)  │ ○ Single-threaded (mostly)                    │
│                 │ ○ Linear memory model                         │
│                 │ ✓ JIT compilation by browser                  │
│                 │ ○ No direct filesystem/network                │
├─────────────────┼───────────────────────────────────────────────┤
│ WASI (Node.js)  │ ○ Single-threaded                             │
│                 │ ✓ WASI syscalls for I/O                       │
│                 │ ✓ Sandboxed execution                         │
└─────────────────┴───────────────────────────────────────────────┘

Optimization Strategies:
  Native:   Use comptime for specialization, SIMD intrinsics
  WASM:     Minimize memory allocations, batch operations
  Both:     Profile-guided optimization, cache-friendly layouts

Testing Cross-Platform

const testing = std.testing;

test "crypto operations consistent across platforms" {
    const key = "test_key";
    const data = "test_data";

    const result = CryptoFacade.hashHmacSha512(key, data);

    // Expected value computed externally
    const expected = [_]u8{
        0x8f, 0x9d, 0x1c, // ... full 64 bytes
    };

    try testing.expectEqualSlices(u8, &expected, &result);
}
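One way to produce the "expected value computed externally" for the test above is any independent HMAC-SHA512 implementation; a Python sketch over the same inputs (the placeholder bytes in the Zig test would be replaced by this 64-byte digest):

```python
import hmac
import hashlib

# HMAC-SHA512 over the same key and data as the Zig test above
digest = hmac.new(b"test_key", b"test_data", hashlib.sha512).digest()

assert len(digest) == 64
# Deterministic: every platform must reproduce these exact bytes
assert digest == hmac.new(b"test_key", b"test_data", hashlib.sha512).digest()
```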

test "point operations" {
    const ctx = CryptoFacade.createContext();
    const g = ctx.generator;

    // g + g = 2g
    const two_g_add = CryptoFacade.multiplyPoints(g, g);
    const scalar_2 = Scalar.fromInt(2);
    const two_g_mul = CryptoFacade.exponentiatePoint(g, &scalar_2);

    try testing.expect(two_g_add.eql(two_g_mul));
}

Usage Example

Cross-platform wallet library:

const Wallet = struct {
    prover: Prover,
    allocator: Allocator,

    pub fn init(seed: []const u8) !Wallet {
        const allocator = Platform.getDefaultAllocator();

        // Platform-independent key derivation
        const master_key = CryptoFacade.generatePbkdf2Key(
            seed,
            "mnemonic",
            2048,
        );

        return .{
            .prover = try Prover.fromSeed(master_key, allocator),
            .allocator = allocator,
        };
    }

    pub fn signTransaction(
        self: *const Wallet,
        tx: *const Transaction,
    ) !SignedTransaction {
        // Works identically on all platforms
        return self.prover.sign(tx);
    }
};

// Same code runs on all targets:
// - Desktop app (native)
// - Browser extension (WASM)
// - Mobile wallet (ARM native or WASM)
// - Server-side validation (native)
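The derivation parameters in Wallet.init (PBKDF2-HMAC-SHA512, salt "mnemonic", 2048 iterations, 64-byte output) follow the BIP-39 seed-derivation convention, which every platform must reproduce byte-for-byte. A Python cross-check; the phrase is an arbitrary illustrative input:

```python
import hashlib

# BIP-39-style seed derivation with the same parameters as Wallet.init
phrase = b"abandon abandon ability"
seed = hashlib.pbkdf2_hmac("sha512", phrase, b"mnemonic", 2048, dklen=64)

assert len(seed) == 64
# Same inputs -> same seed on every platform
assert seed == hashlib.pbkdf2_hmac("sha512", phrase, b"mnemonic", 2048, dklen=64)
```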

Summary

  • Zig cross-compiles to any target from any host without external tools
  • Platform abstraction uses builtin.target for conditional compilation
  • CryptoFacade provides consistent API across native and WASM
  • WASM targets use JavaScript imports for platform-specific crypto
  • Memory management adapts to platform constraints
  • Type representation ensures consistent behavior across architectures
  • Testing verifies identical results on all platforms

Next: Chapter 31: Performance Engineering

1

Scala: CryptoFacade.scala (abstraction)

2

Rust: Platform-independent design in sigma-rust crate structure

3

Scala: Platform.scala (JVM impl)

4

Rust: sigma_protocol/ (crypto operations)

5

Scala: Platform.scala (JS impl)

6

Rust: Feature flags in Cargo.toml for optional dependencies

Chapter 31: Performance Engineering

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:

Prerequisites

  • Chapter 12 for evaluation model fundamentals that define hot paths
  • Chapter 7 for serialization patterns to optimize
  • Chapter 13 for understanding cost accounting overhead

Learning Objectives

By the end of this chapter, you will be able to:

  • Identify performance-critical paths in script interpretation
  • Apply Zig's comptime for zero-cost abstractions and type dispatch
  • Design data structures using Struct-of-Arrays (SoA) for cache efficiency
  • Use arena allocators to batch allocations and reduce overhead
  • Implement SIMD and vectorization for throughput-critical operations
  • Profile and benchmark interpreter components systematically

Performance Architecture

Script interpretation requires processing thousands of transactions per block12:

Performance Critical Paths
══════════════════════════════════════════════════════════════════

┌─────────────────────────────────────────────────────────────────┐
│                    Transaction Flow                             │
│                                                                 │
│   Block (1000+ txs)                                             │
│       │                                                         │
│       ├── Tx 1: 3 inputs × (deserialize + evaluate + verify)    │
│       ├── Tx 2: 1 input × (deserialize + evaluate + verify)     │
│       ├── Tx 3: 5 inputs × (deserialize + evaluate + verify)    │
│       └── ...                                                   │
│                                                                 │
│   Hot paths (per input):                                        │
│     • Deserialization: ~50-200 opcode parses                    │
│     • Evaluation: ~100-500 operations                           │
│     • Proof verification: 1-10 EC operations                    │
└─────────────────────────────────────────────────────────────────┘

Performance Targets:
  Deserialization:   < 100 µs per script
  Evaluation:        < 500 µs per script
  Verification:      < 2 ms per input
  Total per block:   < 5 seconds

Comptime Optimization

Zig's comptime enables zero-cost abstractions34:

/// Compile-time type dispatch eliminates runtime branching
fn evalOperation(comptime op: OpCode, args: []const Value) !Value {
    return switch (op) {
        .plus => evalPlus(args),
        .minus => evalMinus(args),
        .multiply => evalMultiply(args),
        // All branches resolved at compile time
        inline else => |o| evalGeneric(o, args),
    };
}

/// Comptime-generated lookup tables
const op_costs = blk: {
    var costs: [256]u32 = undefined;
    for (0..256) |i| {
        costs[i] = computeCost(@enumFromInt(i));
    }
    break :blk costs;
};

/// Zero-cost field access via comptime offset calculation
fn getField(comptime T: type, comptime field: []const u8, ptr: *const T) *const @TypeOf(@field(T{}, field)) {
    const offset = @offsetOf(T, field);
    const byte_ptr: [*]const u8 = @ptrCast(ptr);
    return @ptrCast(@alignCast(byte_ptr + offset));
}

Data-Oriented Design

Structure data for cache efficiency. The Array-of-Structs to Struct-of-Arrays transformation is a semantics-preserving isomorphism: Array[n](A × B) ≅ Array[n](A) × Array[n](B). Both represent the same data with identical behavior, but different memory layouts yield dramatically different cache performance:

/// Bad: Array of Structs (AoS) - poor cache locality for iteration
const ValueAoS = struct {
    tag: ValueTag,      // 1 byte
    padding: [7]u8,     // 7 bytes padding
    data: [8]u8,        // 8 bytes payload
}; // 16 bytes per value, only 9 used

/// Good: Struct of Arrays (SoA) - excellent cache locality
const ValueStore = struct {
    tags: []ValueTag,           // Packed tags
    data: [][8]u8,              // Packed payloads
    len: usize,

    /// Iterate tags without touching payload
    pub fn countType(self: *const ValueStore, target: ValueTag) usize {
        var count: usize = 0;
        for (self.tags) |tag| {
            count += @intFromBool(tag == target);
        }
        return count;
    }

    /// Access specific value
    pub fn get(self: *const ValueStore, idx: usize) Value {
        return Value.decode(self.tags[idx], self.data[idx]);
    }
};
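The isomorphism is directly checkable: packing into SoA and zipping back reconstructs the AoS exactly, while tag-only queries touch just the tags array. A Python sketch:

```python
# AoS <-> SoA round trip: pack/unpack are mutually inverse, so both
# layouts carry exactly the same information.
aos = [("int", 1), ("bool", 0), ("int", 7), ("byte", 3)]

# AoS -> SoA
tags = [tag for tag, _ in aos]
data = [d for _, d in aos]

# SoA -> AoS reconstructs the original exactly
assert list(zip(tags, data)) == aos

# Tag-only scans touch just the tags array
assert tags.count("int") == 2
```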

Memory Layout Analysis

Cache Line Utilization
══════════════════════════════════════════════════════════════════

Array of Structs (AoS):
┌──────────────────────────────────────────────────────────────────┐
│ Cache Line (64 bytes)                                            │
├──────────────────────────────────────────────────────────────────┤
│ Value[0] │ Value[1] │ Value[2] │ Value[3] │                      │
│ 16 bytes │ 16 bytes │ 16 bytes │ 16 bytes │                      │
│ T+P+D    │ T+P+D    │ T+P+D    │ T+P+D    │ Wasted               │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 6.25% cache utilization (touches only 1 byte per 16)

Struct of Arrays (SoA):
┌──────────────────────────────────────────────────────────────────┐
│ Tags Cache Line (64 bytes)                                       │
├──────────────────────────────────────────────────────────────────┤
│ T[0] T[1] T[2] ... T[63]                                         │
│ 64 tags in single cache line                                     │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 100% cache utilization (64 values per fetch)

Line fetches: 16x fewer for tag-only scans (typical measured speedup: 4x+)
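The cache-line arithmetic behind the diagram, checked directly (figures are theoretical line fetches; real speedups depend on the workload):

```python
CACHE_LINE = 64   # bytes per cache line
N = 1024          # values scanned

# AoS: a tag-only scan still pulls in every 16-byte struct
aos_lines = (N * 16) // CACHE_LINE

# SoA: tags are packed one byte each
soa_lines = (N * 1 + CACHE_LINE - 1) // CACHE_LINE

assert aos_lines == 256 and soa_lines == 16
assert aos_lines // soa_lines == 16   # 16x fewer lines fetched
```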

Arena Allocators

Batch allocations reduce overhead:

const ArenaAllocator = std.heap.ArenaAllocator;

/// Evaluation context with arena for temporary allocations
const EvalContext = struct {
    arena: ArenaAllocator,
    constants: []const Constant,
    env: Environment,

    pub fn init(backing: Allocator) EvalContext {
        return .{
            .arena = ArenaAllocator.init(backing),
            .constants = &[_]Constant{},
            .env = Environment.init(),
        };
    }

    /// All temporary allocations use arena
    pub fn allocTemp(self: *EvalContext, comptime T: type, n: usize) ![]T {
        return self.arena.allocator().alloc(T, n);
    }

    /// Single deallocation frees all temps
    pub fn reset(self: *EvalContext) void {
        _ = self.arena.reset(.retain_capacity);
    }

    pub fn deinit(self: *EvalContext) void {
        self.arena.deinit();
    }
};

/// Usage: batch evaluation without per-operation allocations
fn evaluateScript(tree: *const ErgoTree, allocator: Allocator) !Value {
    var ctx = EvalContext.init(allocator);
    defer ctx.deinit(); // single deallocation frees all temps

    var result: Value = Value.unit;
    for (tree.ops) |op| {
        result = try evalOp(op, &ctx);
    }
    // The returned value must not point into the arena owned by ctx
    return result;
}

Loop Optimization

Efficient iteration patterns:

/// Unrolled loop for fixed-size operations
fn hashBlock(data: *const [64]u8, state: *[8]u32) void {
    // Process four 32-bit words (16 bytes) per unrolled iteration;
    // four iterations cover the 64-byte block
    comptime var i: usize = 0;
    inline while (i < 64) : (i += 16) {
        const w0 = std.mem.readInt(u32, data[i..][0..4], .big);
        const w1 = std.mem.readInt(u32, data[i + 4 ..][0..4], .big);
        const w2 = std.mem.readInt(u32, data[i + 8 ..][0..4], .big);
        const w3 = std.mem.readInt(u32, data[i + 12 ..][0..4], .big);
        round(state, w0);
        round(state, w1);
        round(state, w2);
        round(state, w3);
    }
}

/// Vectorized collection operations
fn sumValues(values: []const i64) i64 {
    const Vec = @Vector(4, i64);
    var sum_vec: Vec = @splat(0);

    var i: usize = 0;
    while (i + 4 <= values.len) : (i += 4) {
        const chunk: Vec = values[i..][0..4].*;
        sum_vec += chunk;
    }

    // Reduce vector to scalar
    var sum = @reduce(.Add, sum_vec);

    // Handle remainder
    while (i < values.len) : (i += 1) {
        sum += values[i];
    }

    return sum;
}
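The vector-plus-remainder structure of sumValues decomposes the sum into lane-wise partial sums, a horizontal reduce, and a scalar tail. A Python model of the same decomposition, checking it agrees with a plain sum:

```python
def sum_chunked(values, width=4):
    lanes = [0] * width
    i = 0
    while i + width <= len(values):   # "vector" loop: lane-wise adds
        for lane in range(width):
            lanes[lane] += values[i + lane]
        i += width
    total = sum(lanes)                # horizontal reduce
    while i < len(values):            # scalar remainder
        total += values[i]
        i += 1
    return total

values = list(range(11))              # length not a multiple of 4
assert sum_chunked(values) == sum(values) == 55
```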

Memoization

Cache expensive computations56:

/// Generic memoization with comptime key type
fn Memoized(comptime K: type, comptime V: type) type {
    return struct {
        cache: std.AutoHashMap(K, V),

        const Self = @This();

        pub fn init(allocator: Allocator) Self {
            return .{ .cache = std.AutoHashMap(K, V).init(allocator) };
        }

        pub fn getOrCompute(
            self: *Self,
            key: K,
            compute: *const fn (K) V,
        ) V {
            const result = self.cache.getOrPut(key) catch unreachable;
            if (!result.found_existing) {
                result.value_ptr.* = compute(key);
            }
            return result.value_ptr.*;
        }

        pub fn reset(self: *Self) void {
            self.cache.clearRetainingCapacity();
        }
    };
}

/// Type method resolution memoization
const MethodCache = Memoized(struct { type_code: u8, method_id: u8 }, *const Method);

var method_cache: MethodCache = undefined;

fn resolveMethod(type_code: u8, method_id: u8) *const Method {
    return method_cache.getOrCompute(
        .{ .type_code = type_code, .method_id = method_id },
        computeMethod,
    );
}

String Interning

Avoid repeated string allocations:

const StringInterner = struct {
    table: std.StringHashMap([]const u8),
    arena: ArenaAllocator,

    pub fn init(allocator: Allocator) StringInterner {
        return .{
            .table = std.StringHashMap([]const u8).init(allocator),
            .arena = ArenaAllocator.init(allocator),
        };
    }

    /// Return interned string (pointer stable for lifetime)
    pub fn intern(self: *StringInterner, str: []const u8) []const u8 {
        if (self.table.get(str)) |existing| {
            return existing;
        }

        // Allocate permanent copy
        const copy = self.arena.allocator().dupe(u8, str) catch unreachable;
        self.table.put(copy, copy) catch unreachable;
        return copy;
    }
};

// Variable names are interned for fast comparison
fn lookupVar(env: *const Environment, name: []const u8) ?Value {
    const interned = global_interner.intern(name);
    return env.bindings.get(interned);
}
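Python's `sys.intern` provides the same guarantee the interner above relies on: after interning, equal strings are the same object, so lookups can compare pointers instead of bytes:

```python
import sys

# Build equal strings at runtime so they start as distinct objects
a = sys.intern("".join(["var", "_height"]))
b = sys.intern("".join(["var_", "height"]))

assert a == b
assert a is b   # identity, not just equality, after interning
```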

SIMD for Crypto

Vectorized elliptic curve operations:

const builtin = @import("builtin");

/// SIMD-accelerated field multiplication (mod p)
fn mulModP(a: *const [4]u64, b: *const [4]u64) [4]u64 {
    // Dispatch on the compile-time target; dead branches are pruned
    if (comptime builtin.target.cpu.arch.isX86()) {
        return mulModP_avx2(a, b);
    } else if (comptime builtin.target.cpu.arch.isAARCH64()) {
        return mulModP_neon(a, b);
    } else {
        return mulModP_scalar(a, b);
    }
}

fn mulModP_avx2(a: *const [4]u64, b: *const [4]u64) [4]u64 {
    // AVX2 implementation using 256-bit vectors
    const va: @Vector(4, u64) = a.*;
    const vb: @Vector(4, u64) = b.*;
    var result: [4]u64 = undefined;

    // Schoolbook multiplication of va and vb with vector operations
    // ... (optimized implementation)

    return result;
}

Profiling and Benchmarking

Built-in profiling support:

const Timer = struct {
    start: i128,

    pub fn init() Timer {
        return .{ .start = std.time.nanoTimestamp() };
    }

    pub fn elapsed(self: *const Timer) u64 {
        const now = std.time.nanoTimestamp();
        return @intCast(now - self.start);
    }
};

/// Benchmark harness
fn benchmark(
    comptime name: []const u8,
    comptime iterations: usize,
    comptime warmup: usize,
    func: *const fn () void,
) void {
    // Warmup
    for (0..warmup) |_| {
        func();
    }

    // Measure
    const timer = Timer.init();
    for (0..iterations) |_| {
        func();
    }
    const total_ns = timer.elapsed();

    const ns_per_op = total_ns / iterations;
    const ops_per_sec = @as(f64, 1_000_000_000) / @as(f64, @floatFromInt(ns_per_op));

    std.debug.print("{s}: {} ns/op ({d:.0} ops/sec)\n", .{
        name,
        ns_per_op,
        ops_per_sec,
    });
}

// Usage
test "benchmark deserialization" {
    benchmark("deserialize_script", 10000, 1000, struct {
        fn run() void {
            _ = deserialize(test_script);
        }
    }.run);
}
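The warmup-then-measure shape of the harness above is portable; a Python equivalent using `perf_counter_ns` (names are illustrative):

```python
import time

def benchmark(name, iterations, warmup, func):
    for _ in range(warmup):          # warmup: stabilize caches / JIT
        func()
    start = time.perf_counter_ns()
    for _ in range(iterations):      # measured loop
        func()
    total_ns = time.perf_counter_ns() - start
    ns_per_op = total_ns // iterations
    print(f"{name}: {ns_per_op} ns/op")
    return ns_per_op

ns = benchmark("noop", 1000, 100, lambda: None)
assert ns >= 0
```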

Memory Profiling

Track allocations in debug builds:

const DebugAllocator = struct {
    backing: Allocator,
    total_allocated: usize = 0,
    total_freed: usize = 0,
    allocation_count: usize = 0,

    pub fn allocator(self: *DebugAllocator) Allocator {
        return .{
            .ptr = self,
            .vtable = &.{
                .alloc = alloc,
                .resize = resize,
                .free = free,
            },
        };
    }

    fn alloc(ctx: *anyopaque, len: usize, ptr_align: u8, ret_addr: usize) ?[*]u8 {
        const self: *DebugAllocator = @ptrCast(@alignCast(ctx));
        self.total_allocated += len;
        self.allocation_count += 1;
        return self.backing.rawAlloc(len, ptr_align, ret_addr);
    }

    // ... other methods

    pub fn report(self: *const DebugAllocator) void {
        std.debug.print("Allocations: {}\n", .{self.allocation_count});
        std.debug.print("Total allocated: {} bytes\n", .{self.total_allocated});
        std.debug.print("Total freed: {} bytes\n", .{self.total_freed});
        std.debug.print("Leaked: {} bytes\n", .{self.total_allocated - self.total_freed});
    }
};
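
A usage sketch, assuming the `DebugAllocator` above with its elided `resize`/`free` vtable methods filled in (the counters and `report` output shown are from the struct above; the test itself is illustrative):

```zig
const std = @import("std");

test "debug allocator reporting" {
    var dbg = DebugAllocator{ .backing = std.testing.allocator };
    const a = dbg.allocator();

    const buf = try a.alloc(u8, 64);
    defer a.free(buf);

    // One allocation of 64 bytes has been tracked
    try std.testing.expectEqual(@as(usize, 1), dbg.allocation_count);
    try std.testing.expect(dbg.total_allocated >= 64);
    dbg.report();
}
```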

Performance Patterns

Optimization Decision Tree
══════════════════════════════════════════════════════════════════

Is operation in hot path?
│
├── NO → Optimize for clarity, not speed
│
└── YES → Profile first, then:
    │
    ├── CPU-bound?
    │   ├── Use comptime for dispatch
    │   ├── Unroll small loops
    │   ├── Use SIMD where applicable
    │   └── Inline critical functions
    │
    ├── Memory-bound?
    │   ├── Use SoA layout
    │   ├── Pool/arena allocators
    │   ├── Reduce allocations
    │   └── Prefetch data
    │
    └── Allocation-bound?
        ├── Arena allocators
        ├── Object pools
        ├── String interning
        └── Stack allocation where safe

Performance Checklist

When writing performance-critical code:

// ✓ Use comptime for type-level decisions
const Handler = comptime getHandler(op);

// ✓ Pre-compute lookup tables
const costs = comptime computeCostTable();

// ✓ Use SoA for iterated data
const Store = struct { tags: []Tag, values: []Value };

// ✓ Arena allocators for batch operations
var arena = ArenaAllocator.init(allocator);
defer arena.deinit();

// ✓ Inline hot functions
pub inline fn addCost(self: *CostAccum, cost: u32) !void

// ✓ Avoid allocations in tight loops
for (items) |item| {
    // Process without allocation
}

// ✓ Use vectors for parallel data
const Vec4 = @Vector(4, u64);

// ✓ Profile before optimizing
// std.debug.print("elapsed: {} ns\n", .{timer.elapsed()});

Summary

  • Comptime enables zero-cost abstractions and compile-time dispatch
  • Data-oriented design (SoA) improves cache efficiency 4x+
  • Arena allocators batch deallocations for throughput
  • Loop unrolling and SIMD accelerate hot paths
  • Memoization caches expensive computations
  • String interning reduces allocation pressure
  • Profile before optimizing—measure, don't guess

Next: Chapter 32: v6 Protocol Features

1. Scala: perf-style-guide.md (HOTSPOT patterns)

2. Rust: Performance-oriented design throughout sigma-rust crates

3. Scala: CErgoTreeEvaluator.scala (fixedCostOp)

4. Rust: eval.rs (cost tracking)

6. Rust: Memoization patterns in ergotree-interpreter

Chapter 32: v6 Protocol Features

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Prerequisites

  • Chapter 2 for the ErgoTree type system and numeric types
  • Chapter 6 for method definitions on types
  • Chapter 29 for soft-fork versioning and activation

Learning Objectives

By the end of this chapter, you will be able to:

  • Implement SUnsignedBigInt (256-bit unsigned integers) with modular arithmetic operations
  • Apply bitwise operations (AND, OR, XOR, NOT, shifts) to all numeric types
  • Use new collection manipulation methods (patch, updated, updateMany, reverse, get)
  • Understand the Autolykos2 proof-of-work algorithm and Header.checkPow
  • Serialize values using Global.serialize and decode difficulty with NBits encoding
  • Write version-aware scripts that use v6 features safely

Version Activation

ErgoTree version 3 corresponds to protocol v6.0. Features in this chapter are only available when the v6 soft-fork is activated.

Version Mapping
═══════════════════════════════════════════════════════════════════

Block Version    ErgoTree Version    Protocol    Features
─────────────────────────────────────────────────────────────────────
1-2              0-1                 v3.x-v4.x   AOT costing
3                2                   v5.x        JIT costing
4                3                   v6.x        This chapter

Version Context

const VersionContext = struct {
    activated_version: u8,
    ergo_tree_version: u8,

    pub const JIT_ACTIVATION_VERSION: u8 = 2;   // v5.0
    pub const V6_SOFT_FORK_VERSION: u8 = 3;     // v6.0

    /// True if v6.0 protocol is activated
    pub fn isV6Activated(self: *const VersionContext) bool {
        return self.activated_version >= V6_SOFT_FORK_VERSION;
    }

    /// True if current ErgoTree is v3 or later
    pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
        return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
    }

    /// Check if a v6 method can be used
    pub fn canUseV6Method(self: *const VersionContext) bool {
        return self.isV6Activated() and self.isV3OrLaterErgoTree();
    }
};
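
The gate combines both checks: a v6 method is usable only when the network has activated v6 and the script itself declares ErgoTree v3 or later. A small test against the `VersionContext` struct above makes the asymmetry explicit (a sketch; the real evaluator raises a typed error rather than returning a boolean):

```zig
const std = @import("std");

test "v6 method gating" {
    // v6 activated on-chain, but the tree is still v2: v6 methods stay unusable
    const mixed = VersionContext{ .activated_version = 3, .ergo_tree_version = 2 };
    try std.testing.expect(mixed.isV6Activated());
    try std.testing.expect(!mixed.canUseV6Method());

    // Both conditions hold: v6 methods are available
    const full = VersionContext{ .activated_version = 3, .ergo_tree_version = 3 };
    try std.testing.expect(full.canUseV6Method());
}
```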

SUnsignedBigInt Type

The SUnsignedBigInt type (type code 9) is a 256-bit unsigned integer designed for cryptographic modular arithmetic[1][2]. Unlike SBigInt, which is signed, SUnsignedBigInt guarantees non-negative values—essential for operations like modular exponentiation, where sign handling would introduce complexity and potential errors.

Type Definition

/// 256-bit unsigned integer for modular arithmetic
/// Type code: 0x09
const UnsignedBigInt256 = struct {
    /// Internal representation: 4 x 64-bit words (little-endian)
    words: [4]u64,

    pub const TYPE_CODE: u8 = 0x09;
    pub const BIT_WIDTH: usize = 256;
    pub const BYTE_WIDTH: usize = 32;

    /// Maximum value: 2^256 - 1
    pub const MAX = UnsignedBigInt256{ .words = .{
        0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
        0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
    }};

    /// Zero value
    pub const ZERO = UnsignedBigInt256{ .words = .{ 0, 0, 0, 0 }};

    /// One value
    pub const ONE = UnsignedBigInt256{ .words = .{ 1, 0, 0, 0 }};

    /// Create from bytes (big-endian)
    pub fn fromBytes(bytes: [32]u8) UnsignedBigInt256 {
        var result = UnsignedBigInt256{ .words = undefined };
        // Convert big-endian bytes to little-endian words
        inline for (0..4) |i| {
            const offset = (3 - i) * 8;
            result.words[i] = std.mem.readInt(u64, bytes[offset..][0..8], .big);
        }
        return result;
    }

    /// Convert to bytes (big-endian)
    pub fn toBytes(self: UnsignedBigInt256) [32]u8 {
        var result: [32]u8 = undefined;
        inline for (0..4) |i| {
            const offset = (3 - i) * 8;
            std.mem.writeInt(u64, result[offset..][0..8], self.words[i], .big);
        }
        return result;
    }

    /// Convert from signed BigInt (errors if negative)
    pub fn fromBigInt(bi: BigInt256) !UnsignedBigInt256 {
        if (bi.isNegative()) {
            return error.NegativeValue;
        }
        // Safe to reinterpret since non-negative
        return @bitCast(bi.abs());
    }

    /// Convert to signed BigInt (errors if > BigInt.MAX)
    pub fn toBigInt(self: UnsignedBigInt256) !BigInt256 {
        // Check if value exceeds signed max (2^255 - 1)
        if (self.words[3] & 0x8000000000000000 != 0) {
            return error.Overflow;
        }
        return @bitCast(self);
    }

    /// Comparison
    pub fn lessThan(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
        // Compare from most significant word
        var i: usize = 4;
        while (i > 0) {
            i -= 1;
            if (self.words[i] < other.words[i]) return true;
            if (self.words[i] > other.words[i]) return false;
        }
        return false; // Equal
    }

    pub fn eql(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
        return std.mem.eql(u64, &self.words, &other.words);
    }
};

Why Unsigned Matters for Cryptography

Signed integers introduce complexity in modular arithmetic:

  1. Sign bit ambiguity: In two's complement, the high bit indicates sign. For cryptographic operations on field elements, all 256 bits should represent magnitude.

  2. Modular reduction: Computing a mod m for negative a requires adjustment: (-5) mod 7 = 2, not -5. Unsigned values eliminate this edge case.

  3. Constant-time operations: Sign handling can introduce timing variations. Unsigned operations are more naturally constant-time.

  4. Field element representation: Finite field elements are inherently non-negative integers in [0, p-1].
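
Zig itself distinguishes the two remainder conventions, which makes point 2 easy to demonstrate: `@rem` truncates toward zero while `@mod` follows the divisor's sign. With unsigned operands the distinction disappears:

```zig
const std = @import("std");

test "signed remainder vs floored modulo" {
    // Truncated remainder keeps the dividend's sign...
    try std.testing.expectEqual(@as(i32, -5), @rem(@as(i32, -5), 7));
    // ...while floored modulo gives the field-element answer
    try std.testing.expectEqual(@as(i32, 2), @mod(@as(i32, -5), 7));
    // For unsigned values the two conventions agree
    try std.testing.expectEqual(@as(u32, 5), @mod(@as(u32, 5), 7));
}
```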

Serialization

const UnsignedBigInt256Serializer = struct {
    /// Serialize to variable-length big-endian bytes
    pub fn serialize(value: UnsignedBigInt256, writer: anytype) !void {
        const bytes = value.toBytes();

        // Find first non-zero byte (skip leading zeros)
        var start: usize = 0;
        while (start < 32 and bytes[start] == 0) : (start += 1) {}

        // Write length + bytes
        const len = 32 - start;
        try writer.writeInt(u8, @intCast(len), .big);
        try writer.writeAll(bytes[start..]);
    }

    /// Deserialize from variable-length big-endian bytes
    pub fn deserialize(reader: anytype) !UnsignedBigInt256 {
        const len = try reader.readInt(u8, .big);
        if (len > 32) return error.InvalidLength;

        var bytes: [32]u8 = .{0} ** 32;
        const start = 32 - len;
        try reader.readNoEof(bytes[start..]);

        return UnsignedBigInt256.fromBytes(bytes);
    }
};

Modular Arithmetic Operations

v6 adds six modular arithmetic methods to SUnsignedBigInt[3][4]:

Modular Arithmetic Methods
═══════════════════════════════════════════════════════════════════

Method          Signature                     Cost    Description
─────────────────────────────────────────────────────────────────────
mod             (UBI, UBI) → UBI              20      a mod m
modInverse      (UBI, UBI) → UBI              150     a⁻¹ mod m
plusMod         (UBI, UBI, UBI) → UBI         30      (a + b) mod m
subtractMod     (UBI, UBI, UBI) → UBI         30      (a - b) mod m
multiplyMod     (UBI, UBI, UBI) → UBI         40      (a × b) mod m
toSigned        UBI → BigInt                  10      Convert to signed

Basic Modulo Operation

/// a mod m - remainder after division
/// Cost: FixedCost(20)
pub fn mod(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
    if (m.eql(UnsignedBigInt256.ZERO)) {
        return error.DivisionByZero;
    }

    // Use schoolbook division for 256-bit values
    // Result is always < m
    return divmod(a, m).remainder;
}

Modular Inverse (Extended Euclidean Algorithm)

The modular inverse a⁻¹ mod m is the value x such that (a × x) mod m = 1. It exists only when gcd(a, m) = 1.
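
Before wading into the 256-bit version, a word-sized sketch shows the shape of the algorithm. The helper name `modInverse64` is ours, not part of any Sigma API; using `i128` intermediates absorbs the sign bookkeeping that the fixed-width implementation below must do by hand:

```zig
const std = @import("std");

// Extended Euclidean algorithm on machine words (hypothetical helper,
// for illustration only).
fn modInverse64(a: u64, m: u64) ?u64 {
    if (m == 0) return null;
    var old_r: i128 = a;
    var r: i128 = m;
    var old_s: i128 = 1;
    var s: i128 = 0;
    while (r != 0) {
        const q = @divTrunc(old_r, r);
        const next_r = old_r - q * r;
        old_r = r;
        r = next_r;
        const next_s = old_s - q * s;
        old_s = s;
        s = next_s;
    }
    if (old_r != 1) return null; // gcd(a, m) != 1: no inverse exists
    return @intCast(@mod(old_s, m)); // normalize into [0, m)
}

test "modular inverse examples" {
    // 3 * 5 = 15 ≡ 1 (mod 7)
    try std.testing.expectEqual(@as(?u64, 5), modInverse64(3, 7));
    // gcd(4, 8) = 4, so 4 has no inverse mod 8
    try std.testing.expectEqual(@as(?u64, null), modInverse64(4, 8));
}
```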

/// Extended Euclidean Algorithm
/// Returns x such that (a * x) ≡ 1 (mod m)
/// Cost: FixedCost(150) - most expensive modular operation
pub fn modInverse(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
    if (m.eql(UnsignedBigInt256.ZERO)) {
        return error.DivisionByZero;
    }
    if (a.eql(UnsignedBigInt256.ZERO)) {
        return error.NoInverse; // gcd(0, m) = m ≠ 1
    }

    // Extended Euclidean Algorithm
    // Maintains: old_s * a + old_t * m = old_r (Bézout's identity)
    var old_r = a;
    var r = m;
    var old_s = UnsignedBigInt256.ONE;
    var s = UnsignedBigInt256.ZERO;
    var old_s_negative = false;
    var s_negative = false;

    while (!r.eql(UnsignedBigInt256.ZERO)) {
        const quotient = divmod(old_r, r).quotient;

        // (old_r, r) = (r, old_r - quotient * r)
        const temp_r = r;
        const qr = multiply(quotient, r);
        if (old_r.lessThan(qr)) {
            r = subtract(qr, old_r);
        } else {
            r = subtract(old_r, qr);
        }
        old_r = temp_r;

        // (old_s, s) = (s, old_s - quotient * s)
        // Handle signed arithmetic carefully
        const temp_s = s;
        const temp_s_neg = s_negative;
        const qs = multiply(quotient, s);

        if (old_s_negative == s_negative) {
            // Same sign: subtraction
            if (old_s.lessThan(qs)) {
                s = subtract(qs, old_s);
                s_negative = !old_s_negative;
            } else {
                s = subtract(old_s, qs);
                s_negative = old_s_negative;
            }
        } else {
            // Different signs: addition
            s = add(old_s, qs);
            s_negative = old_s_negative;
        }

        old_s = temp_s;
        old_s_negative = temp_s_neg;
    }

    // Check that gcd(a, m) = 1
    if (!old_r.eql(UnsignedBigInt256.ONE)) {
        return error.NoInverse; // a and m are not coprime
    }

    // Adjust result to be positive
    if (old_s_negative) {
        return subtract(m, old_s);
    }
    return old_s;
}

Modular Addition

/// (a + b) mod m - modular addition
/// Handles overflow by using 320-bit intermediate
/// Cost: FixedCost(30)
pub fn plusMod(
    a: UnsignedBigInt256,
    b: UnsignedBigInt256,
    m: UnsignedBigInt256,
) !UnsignedBigInt256 {
    if (m.eql(UnsignedBigInt256.ZERO)) {
        return error.DivisionByZero;
    }

    // a + b can overflow 256 bits, so use 320-bit intermediate
    var sum: [5]u64 = .{ 0, 0, 0, 0, 0 };
    var carry: u64 = 0;

    for (0..4) |i| {
        const s = @as(u128, a.words[i]) + @as(u128, b.words[i]) + carry;
        sum[i] = @truncate(s);
        carry = @truncate(s >> 64);
    }
    sum[4] = carry;

    // Reduce mod m
    return reduce320(sum, m);
}

/// (a - b) mod m - modular subtraction
/// If a < b, result is m - (b - a)
/// Cost: FixedCost(30)
pub fn subtractMod(
    a: UnsignedBigInt256,
    b: UnsignedBigInt256,
    m: UnsignedBigInt256,
) !UnsignedBigInt256 {
    if (m.eql(UnsignedBigInt256.ZERO)) {
        return error.DivisionByZero;
    }

    if (a.lessThan(b)) {
        // a - b is negative: compute m - (b - a)
        const diff = subtract(b, a);
        const diff_mod = try mod(diff, m);
        if (diff_mod.eql(UnsignedBigInt256.ZERO)) {
            return UnsignedBigInt256.ZERO;
        }
        return subtract(m, diff_mod);
    } else {
        const diff = subtract(a, b);
        return mod(diff, m);
    }
}

Modular Multiplication

/// (a * b) mod m - modular multiplication
/// Uses 512-bit intermediate to handle overflow
/// Cost: FixedCost(40)
pub fn multiplyMod(
    a: UnsignedBigInt256,
    b: UnsignedBigInt256,
    m: UnsignedBigInt256,
) !UnsignedBigInt256 {
    if (m.eql(UnsignedBigInt256.ZERO)) {
        return error.DivisionByZero;
    }

    // Multiply to 512-bit result
    var product: [8]u64 = .{0} ** 8;

    for (0..4) |i| {
        var carry: u64 = 0;
        for (0..4) |j| {
            const p = @as(u128, a.words[i]) * @as(u128, b.words[j]) +
                      @as(u128, product[i + j]) + @as(u128, carry);
            product[i + j] = @truncate(p);
            carry = @truncate(p >> 64);
        }
        product[i + 4] = carry;
    }

    // Reduce 512-bit product mod m
    return reduce512(product, m);
}

Bitwise Operations

v6 adds eight bitwise methods to all numeric types (Byte, Short, Int, Long, BigInt, UnsignedBigInt)[5][6]:

Bitwise Operations
═══════════════════════════════════════════════════════════════════

Method          Signature           Cost    Description
─────────────────────────────────────────────────────────────────────
bitwiseInverse  T → T               5       ~x (NOT)
bitwiseOr       (T, T) → T          5       x | y
bitwiseAnd      (T, T) → T          5       x & y
bitwiseXor      (T, T) → T          5       x ^ y
shiftLeft       (T, Int) → T        5       x << n
shiftRight      (T, Int) → T        5       x >> n
toBytes         T → Coll[Byte]      5       Byte representation
toBits          T → Coll[Boolean]   5       Bit representation
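
The signed/unsigned split for `shiftRight` mirrors Zig's native `>>`, which is arithmetic for signed operands and logical for unsigned ones:

```zig
const std = @import("std");

test "arithmetic vs logical right shift" {
    // Signed: arithmetic shift replicates the sign bit
    const s: i8 = -8; // 0b1111_1000
    try std.testing.expectEqual(@as(i8, -2), s >> 2);

    // Unsigned: logical shift fills with zeros
    const u: u8 = 0b1111_1000;
    try std.testing.expectEqual(@as(u8, 0b0011_1110), u >> 2);
}
```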

Implementation

/// Bitwise operations for all numeric types
pub fn BitwiseOps(comptime T: type) type {
    return struct {
        /// Bitwise NOT (~x)
        /// For signed types: ~x = -x - 1 (two's complement identity)
        /// For unsigned: ~x = MAX - x
        /// Cost: FixedCost(5)
        pub fn bitwiseInverse(x: T) T {
            return ~x;
        }

        /// Bitwise OR (x | y)
        /// Cost: FixedCost(5)
        pub fn bitwiseOr(x: T, y: T) T {
            return x | y;
        }

        /// Bitwise AND (x & y)
        /// Cost: FixedCost(5)
        pub fn bitwiseAnd(x: T, y: T) T {
            return x & y;
        }

        /// Bitwise XOR (x ^ y)
        /// Cost: FixedCost(5)
        pub fn bitwiseXor(x: T, y: T) T {
            return x ^ y;
        }

        /// Left shift (x << n)
        /// Returns 0 if n >= bitwidth
        /// Cost: FixedCost(5)
        pub fn shiftLeft(x: T, n: i32) !T {
            if (n < 0) return error.NegativeShift;
            const bits = @bitSizeOf(T);
            if (n >= bits) return 0;
            return x << @intCast(n);
        }

        /// Right shift (x >> n)
        /// Arithmetic shift for signed (preserves sign)
        /// Logical shift for unsigned (fills with 0)
        /// Cost: FixedCost(5)
        pub fn shiftRight(x: T, n: i32) !T {
            if (n < 0) return error.NegativeShift;
            const bits = @bitSizeOf(T);
            if (n >= bits) {
                // For signed: return -1 if negative, 0 otherwise
                // For unsigned: return 0
                if (@typeInfo(T).Int.signedness == .signed) {
                    return if (x < 0) -1 else 0;
                }
                return 0;
            }
            return x >> @intCast(n);
        }
    };
}

/// Byte conversion for numeric types
pub fn toBytes(comptime T: type, x: T) []const u8 {
    const size = @sizeOf(T);
    var result: [size]u8 = undefined;
    std.mem.writeInt(T, &result, x, .big);
    return &result;
}

/// Bit conversion for numeric types
pub fn toBits(comptime T: type, x: T) []const bool {
    const bits = @bitSizeOf(T);
    var result: [bits]bool = undefined;
    for (0..bits) |i| {
        result[bits - 1 - i] = ((x >> @intCast(i)) & 1) == 1;
    }
    return &result;
}

BigInt Bitwise Operations

For BigInt256 and UnsignedBigInt256, bitwise operations work on the full 256-bit representation:

/// 256-bit bitwise operations
const BigIntBitwise = struct {
    /// Bitwise NOT for 256-bit unsigned
    /// ~x = MAX - x for unsigned interpretation
    pub fn bitwiseInverse(x: UnsignedBigInt256) UnsignedBigInt256 {
        return .{ .words = .{
            ~x.words[0],
            ~x.words[1],
            ~x.words[2],
            ~x.words[3],
        }};
    }

    /// Bitwise OR for 256-bit
    pub fn bitwiseOr(x: UnsignedBigInt256, y: UnsignedBigInt256) UnsignedBigInt256 {
        return .{ .words = .{
            x.words[0] | y.words[0],
            x.words[1] | y.words[1],
            x.words[2] | y.words[2],
            x.words[3] | y.words[3],
        }};
    }

    /// Left shift for 256-bit (handles cross-word shifts)
    pub fn shiftLeft(x: UnsignedBigInt256, n: i32) !UnsignedBigInt256 {
        if (n < 0) return error.NegativeShift;
        if (n >= 256) return UnsignedBigInt256.ZERO;

        const shift: u8 = @intCast(n);
        const word_shift = shift / 64;
        const bit_shift: u6 = @intCast(shift % 64);

        var result = UnsignedBigInt256.ZERO;

        if (bit_shift == 0) {
            // Word-aligned shift
            for (word_shift..4) |i| {
                result.words[i] = x.words[i - word_shift];
            }
        } else {
            // Cross-word shift
            for (word_shift..4) |i| {
                result.words[i] = x.words[i - word_shift] << bit_shift;
                if (i > word_shift) {
                    // Widen before subtracting: 64 does not fit in u6
                    result.words[i] |= x.words[i - word_shift - 1] >> @intCast(64 - @as(u7, bit_shift));
                }
            }
        }

        return result;
    }
};

Collection Methods (v6)

v6 adds seven new methods to Coll[T] for efficient collection manipulation[7][8]:

Collection Methods (v6)
═══════════════════════════════════════════════════════════════════

Method          Signature                          Cost
─────────────────────────────────────────────────────────────────────
patch           (Coll[T], Int, Coll[T], Int) → Coll[T]  PerItem(30,2,10)
updated         (Coll[T], Int, T) → Coll[T]             PerItem(20,1,10)
updateMany      (Coll[T], Coll[Int], Coll[T]) → Coll[T] PerItem(20,2,10)
reverse         Coll[T] → Coll[T]                       PerItem (append)
startsWith      (Coll[T], Coll[T]) → Boolean            PerItem (zip)
endsWith        (Coll[T], Coll[T]) → Boolean            PerItem (zip)
get             (Coll[T], Int) → Option[T]              FixedCost(14)

patch - Replace Slice

/// Replace elements from index `from`, removing `replaced` elements,
/// inserting `patch` collection in their place.
///
/// xs.patch(from, patch, replaced):
///   result = xs[0..from] ++ patch ++ xs[from+replaced..]
///
/// Cost: PerItemCost(30, 2, 10) based on xs.len + patch.len
pub fn patch(
    comptime T: type,
    xs: []const T,
    from: i32,
    patchColl: []const T,
    replaced: i32,
    allocator: Allocator,
) ![]T {
    if (from < 0) return error.IndexOutOfBounds;
    const from_idx: usize = @intCast(from);
    if (from_idx > xs.len) return error.IndexOutOfBounds;

    const replaced_count: usize = if (replaced < 0)
        0
    else
        @min(@as(usize, @intCast(replaced)), xs.len - from_idx);

    // Result length: original - replaced + patch
    const result_len = xs.len - replaced_count + patchColl.len;
    var result = try allocator.alloc(T, result_len);

    // Copy prefix [0..from]
    @memcpy(result[0..from_idx], xs[0..from_idx]);

    // Copy patch
    @memcpy(result[from_idx..][0..patchColl.len], patchColl);

    // Copy suffix [from+replaced..]
    const suffix_start = from_idx + replaced_count;
    const suffix_dest = from_idx + patchColl.len;
    @memcpy(result[suffix_dest..], xs[suffix_start..]);

    return result;
}
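
A usage sketch, assuming the `patch` function above and Zig's testing allocator:

```zig
const std = @import("std");

test "patch replaces a slice" {
    const xs = [_]u8{ 1, 2, 3, 4, 5 };
    const p = [_]u8{ 9, 9 };

    // [1,2,3,4,5].patch(1, [9,9], 2) == [1,9,9,4,5]
    const result = try patch(u8, &xs, 1, &p, 2, std.testing.allocator);
    defer std.testing.allocator.free(result);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 9, 9, 4, 5 }, result);
}
```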

updated - Single Element Update

/// Return a new collection with element at index replaced.
/// Immutable operation - original collection unchanged.
///
/// Cost: PerItemCost(20, 1, 10)
pub fn updated(
    comptime T: type,
    xs: []const T,
    idx: i32,
    value: T,
    allocator: Allocator,
) ![]T {
    if (idx < 0) return error.IndexOutOfBounds;
    const index: usize = @intCast(idx);
    if (index >= xs.len) return error.IndexOutOfBounds;

    var result = try allocator.dupe(T, xs);
    result[index] = value;
    return result;
}

updateMany - Batch Update

/// Update multiple elements at specified indices.
/// indexes and updates must have the same length.
///
/// Cost: PerItemCost(20, 2, 10)
pub fn updateMany(
    comptime T: type,
    xs: []const T,
    indexes: []const i32,
    updates: []const T,
    allocator: Allocator,
) ![]T {
    if (indexes.len != updates.len) {
        return error.LengthMismatch;
    }

    // Validate all indexes first
    for (indexes) |idx| {
        if (idx < 0) return error.IndexOutOfBounds;
        if (@as(usize, @intCast(idx)) >= xs.len) return error.IndexOutOfBounds;
    }

    var result = try allocator.dupe(T, xs);

    for (indexes, updates) |idx, val| {
        result[@intCast(idx)] = val;
    }

    return result;
}

reverse, startsWith, endsWith, get

/// Reverse collection order
/// Cost: Same as append (PerItem)
pub fn reverse(comptime T: type, xs: []const T, allocator: Allocator) ![]T {
    var result = try allocator.alloc(T, xs.len);
    for (xs, 0..) |x, i| {
        result[xs.len - 1 - i] = x;
    }
    return result;
}

/// Check if collection starts with prefix
/// Cost: Same as zip (PerItem based on prefix length)
pub fn startsWith(comptime T: type, xs: []const T, prefix: []const T) bool {
    if (prefix.len > xs.len) return false;
    return std.mem.eql(T, xs[0..prefix.len], prefix);
}

/// Check if collection ends with suffix
/// Cost: Same as zip (PerItem based on suffix length)
pub fn endsWith(comptime T: type, xs: []const T, suffix: []const T) bool {
    if (suffix.len > xs.len) return false;
    return std.mem.eql(T, xs[xs.len - suffix.len ..], suffix);
}

/// Safe element access returning Option
/// Returns null if index out of bounds (instead of error)
/// Cost: FixedCost(14)
pub fn get(comptime T: type, xs: []const T, idx: i32) ?T {
    if (idx < 0) return null;
    const index: usize = @intCast(idx);
    if (index >= xs.len) return null;
    return xs[index];
}

Autolykos2 Proof-of-Work

v6 exposes proof-of-work verification in ErgoScript through Header.checkPow() and Global.powHit()[9][10]. Autolykos2 is Ergo's memory-hard, ASIC-resistant PoW algorithm designed for fair GPU mining.

Algorithm Overview

Autolykos2 Structure
═══════════════════════════════════════════════════════════════════

Parameters:
  N = 2²⁶ ≈ 67 million     Table size (memory requirement)
  k = 32                    Elements to sum per solution
  n = 26                    log₂(N)

Memory: N × 32 bytes ≈ 2 GB table

Algorithm:
  1. Seed table from height (changes every ~1024 blocks)
  2. For each nonce attempt:
     a. Compute 32 pseudo-random indices from (msg, nonce)
     b. Sum the 32 table elements at those indices
     c. Hash (msg || nonce || sum) to get PoW hit
     d. If hit < target, solution found

Implementation

/// Autolykos2 proof-of-work algorithm constants and functions
const Autolykos2 = struct {
    /// Table size: 2^26 elements
    pub const N: u32 = 1 << 26;

    /// Elements summed per solution attempt
    pub const K: u32 = 32;

    /// Bits in N (log2(N))
    pub const N_BITS: u5 = 26;

    /// Element size in bytes
    pub const ELEMENT_SIZE: usize = 32;

    /// Total table memory requirement
    pub const TABLE_SIZE: usize = N * ELEMENT_SIZE; // ~2 GB

    /// Epoch length for table seed rotation
    pub const EPOCH_LENGTH: u32 = 1024;

    /// Compute the PoW hit value for a header
    /// Returns BigInt256 that must be < target (from nBits)
    ///
    /// Cost: ~700 JitCost (multiple Blake2b256 computations)
    pub fn powHit(
        header_without_pow: []const u8,
        nonce: u64,
        height: u32,
    ) BigInt256 {
        // Step 1: Compute message hash
        const msg = Blake2b256.hash(header_without_pow);

        // Step 2: Generate table seed from height epoch
        const seed = computeTableSeed(height);

        // Step 3: Compute k-sum of table elements
        var sum = UnsignedBigInt256.ZERO;
        var nonce_bytes: [8]u8 = undefined;
        std.mem.writeInt(u64, &nonce_bytes, nonce, .big);

        for (0..K) |i| {
            // Derive index from hash(msg || nonce || i)
            var index_input: [32 + 8 + 4]u8 = undefined;
            @memcpy(index_input[0..32], &msg);
            @memcpy(index_input[32..40], &nonce_bytes);
            std.mem.writeInt(u32, index_input[40..44], @intCast(i), .big);

            const index_hash = Blake2b256.hash(&index_input);
            const idx = std.mem.readInt(u32, index_hash[0..4], .big) % N;

            // Look up table element
            const element = computeTableElement(seed, idx);
            sum = addUnchecked(sum, element);
        }

        // Step 4: Final hash to get PoW hit
        var final_input: [32 + 8 + 32]u8 = undefined;
        @memcpy(final_input[0..32], &msg);
        @memcpy(final_input[32..40], &nonce_bytes);
        @memcpy(final_input[40..72], &sum.toBytes());

        const hit_hash = Blake2b256.hash(&final_input);
        return BigInt256.fromBytes(hit_hash);
    }

    /// Compute table seed from block height
    /// Seed changes every EPOCH_LENGTH blocks to prevent precomputation
    fn computeTableSeed(height: u32) [32]u8 {
        const epoch = height / EPOCH_LENGTH;
        var epoch_bytes: [4]u8 = undefined;
        std.mem.writeInt(u32, &epoch_bytes, epoch, .big);
        return Blake2b256.hash(&epoch_bytes);
    }

    /// Compute table element at given index
    /// This is the memory-hard part - miners must store or recompute
    fn computeTableElement(seed: [32]u8, idx: u32) UnsignedBigInt256 {
        // Element = H(seed || idx || 0) || H(seed || idx || 1) || ...
        // Combined to form 256-bit value
        var result: [32]u8 = undefined;

        for (0..4) |chunk| {
            var input: [32 + 4 + 1]u8 = undefined;
            @memcpy(input[0..32], &seed);
            std.mem.writeInt(u32, input[32..36], idx, .big);
            input[36] = @intCast(chunk);

            const chunk_hash = Blake2b256.hash(&input);
            @memcpy(result[chunk * 8 ..][0..8], chunk_hash[0..8]);
        }

        return UnsignedBigInt256.fromBytes(result);
    }
};

Header.checkPow

/// Verify that a block header satisfies the PoW difficulty requirement
///
/// checkPow() returns true iff powHit(header) < decodeNBits(header.nBits)
///
/// Cost: FixedCost(700) - approximately 2×32 hash computations
pub fn checkPow(header: Header) bool {
    // Serialize header without PoW solution
    const header_bytes = header.serializeWithoutPow();

    // Compute PoW hit
    const hit = Autolykos2.powHit(
        header_bytes,
        header.powSolutions.n, // nonce
        header.height,
    );

    // Decode difficulty target from nBits
    const target = NBits.decode(header.nBits);

    // Valid if hit < target
    return hit.lessThan(target);
}

Why Memory-Hard?

Autolykos2's memory requirement (~2GB) provides ASIC resistance:

  1. Table storage: Miners must maintain the full table in fast memory
  2. Random access: k=32 random lookups per attempt prevents caching tricks
  3. Epoch rotation: Table changes every ~1024 blocks, invalidating precomputation
  4. GPU-friendly: Memory bandwidth is the bottleneck, favoring commodity GPUs

NBits Difficulty Encoding

The nBits field in block headers uses a compact encoding for the difficulty target[11]:

NBits Format
═══════════════════════════════════════════════════════════════════

Format: 0xAABBBBBB (4 bytes)
  AA     = exponent (1 byte)
  BBBBBB = mantissa (3 bytes)

Value = mantissa × 256^(exponent - 3)

Example:
  nBits = 0x1d00ffff
  exponent = 0x1d = 29
  mantissa = 0x00ffff = 65535
  target = 65535 × 256^(29-3) = 65535 × 256^26
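
The worked example can be checked directly with Zig's arbitrary-width integers (a `u256` here, purely for illustration):

```zig
const std = @import("std");

test "nBits worked example" {
    const nbits: u32 = 0x1d00ffff;
    const exp = nbits >> 24; // 0x1d = 29
    const mantissa: u256 = nbits & 0x00FF_FFFF; // 0x00ffff = 65535
    try std.testing.expectEqual(@as(u32, 29), exp);
    try std.testing.expectEqual(@as(u256, 65535), mantissa);

    // target = mantissa × 256^(exponent - 3) = 65535 × 256^26
    const target = mantissa << @intCast((exp - 3) * 8);
    try std.testing.expectEqual(@as(u256, 65535) << (26 * 8), target);
}
```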

Implementation

const NBits = struct {
    /// Decode nBits to BigInt target
    /// Cost: FixedCost(10)
    pub fn decode(nBits: i64) BigInt256 {
        const n = @as(u32, @intCast(nBits & 0xFFFFFFFF));
        const exp: u8 = @intCast((n >> 24) & 0xFF);
        const mantissa: u32 = n & 0x00FFFFFF;

        if (exp <= 3) {
            // Small exponent: right shift mantissa
            const shift: u5 = @intCast((3 - exp) * 8);
            return BigInt256.fromInt(mantissa >> shift);
        } else {
            // Normal case: left shift mantissa
            // Widen before multiplying so a large exponent cannot overflow u8
            const shift = (@as(u32, exp) - 3) * 8;
            return BigInt256.fromInt(mantissa).shiftLeft(@intCast(shift));
        }
    }

    /// Encode BigInt to nBits
    /// Cost: FixedCost(10)
    pub fn encode(target: BigInt256) i64 {
        // Find the byte length of target
        const bytes = target.toBytes();
        var byte_len: u8 = 32;
        for (bytes) |b| {
            if (b != 0) break;
            byte_len -= 1;
        }

        if (byte_len == 0) return 0;

        // Extract top 3 bytes as mantissa
        const start = 32 - byte_len;
        var mantissa: u32 = 0;

        if (byte_len >= 3) {
            mantissa = (@as(u32, bytes[start]) << 16) |
                       (@as(u32, bytes[start + 1]) << 8) |
                       @as(u32, bytes[start + 2]);
        } else if (byte_len == 2) {
            mantissa = (@as(u32, bytes[start]) << 8) |
                       @as(u32, bytes[start + 1]);
        } else {
            mantissa = bytes[start];
        }

        // Handle sign bit in mantissa (MSB must be 0)
        if (mantissa & 0x00800000 != 0) {
            mantissa >>= 8;
            byte_len += 1;
        }

        return @as(i64, byte_len) << 24 | @as(i64, mantissa);
    }
};

Global Serialization Methods

v6 adds methods to Global for value serialization[12]:

serialize

/// Serialize any value to bytes using SigmaSerializer
/// Works for all serializable types
/// Cost: Varies by type complexity
pub fn serialize(comptime T: type, value: T, allocator: Allocator) ![]const u8 {
    var buffer = std.ArrayList(u8).init(allocator);
    errdefer buffer.deinit();

    try SigmaSerializer.serialize(T, value, buffer.writer());

    return buffer.toOwnedSlice();
}

fromBigEndianBytes

/// Deserialize numeric type from big-endian bytes
/// Generic over numeric types
/// Cost: FixedCost(5) for primitives
pub fn fromBigEndianBytes(comptime T: type, bytes: []const u8) !T {
    const size = @sizeOf(T);
    if (bytes.len != size) return error.InvalidLength;

    var arr: [size]u8 = undefined;
    @memcpy(&arr, bytes);

    return std.mem.readInt(T, &arr, .big);
}
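For comparison, the same exact-length, big-endian parse in Python, where int.from_bytes does the heavy lifting (the signed flag is our addition, to illustrate both signed and unsigned reads):

```python
def from_big_endian_bytes(data: bytes, size: int, signed: bool = True) -> int:
    """Parse a fixed-size big-endian integer; rejects wrong lengths (sketch)."""
    if len(data) != size:
        raise ValueError("invalid length")
    return int.from_bytes(data, byteorder="big", signed=signed)
```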

Cost Model for v6 Operations

v6 Operation Costs
═══════════════════════════════════════════════════════════════════

Operation                    Cost Type       Value    Notes
─────────────────────────────────────────────────────────────────────
Modular Arithmetic:
  mod(a, m)                  Fixed           20       Division
  modInverse(a, m)           Fixed           150      Extended Euclid
  plusMod(a, b, m)           Fixed           30       Add + mod
  subtractMod(a, b, m)       Fixed           30       Sub + mod
  multiplyMod(a, b, m)       Fixed           40       Mul + mod

Bitwise (all types):
  bitwiseInverse(x)          Fixed           5        Single op
  bitwiseOr(x, y)            Fixed           5        Single op
  bitwiseAnd(x, y)           Fixed           5        Single op
  bitwiseXor(x, y)           Fixed           5        Single op
  shiftLeft(x, n)            Fixed           5        Single op
  shiftRight(x, n)           Fixed           5        Single op
  toBytes(x)                 Fixed           5        Conversion
  toBits(x)                  Fixed           5        Conversion

Collections:
  patch(xs, from, p, r)      PerItem(30,2,10)        O(n)
  updated(xs, idx, v)        PerItem(20,1,10)        O(n) copy
  updateMany(xs, is, vs)     PerItem(20,2,10)        O(n)
  reverse(xs)                PerItem (append)        O(n)
  startsWith(xs, p)          PerItem (zip)           O(|p|)
  endsWith(xs, s)            PerItem (zip)           O(|s|)
  get(xs, idx)               Fixed           14      O(1)

Cryptographic:
  expUnsigned(g, k)          Fixed           900     Scalar mult
  checkPow(header)           Fixed           700     ~32 hashes
  powHit(...)                Dynamic                 Autolykos2

Serialization:
  serialize(v)               Varies                  Type-dependent
  fromBigEndianBytes(b)      Fixed           5       Simple parse
  encodeNBits(n)             Fixed           10      Encoding
  decodeNBits(n)             Fixed           10      Decoding

Migration Guide

Version Checking in Scripts

// ErgoScript: Check v6 availability
val canUseV6 = getVar[Boolean](127).getOrElse(false)

// Conditional v6 feature usage
if (canUseV6) {
  // Use v6 features
  val x: UnsignedBigInt = ...
  val inv = x.modInverse(p)
} else {
  // Fallback for pre-v6
}

When to Use v6 Features

Feature          Use When
─────────────────────────────────────────────────────────────────
SUnsignedBigInt  Cryptographic protocols requiring modular arithmetic
modInverse       Computing multiplicative inverses in finite fields
Bitwise ops      Bit manipulation, flags, compact encodings
patch/updated    Immutable collection updates in contracts
get              Safe array access without exceptions
checkPow         On-chain PoW verification for sidechains/merged mining

Backward Compatibility

  • v6 features are only available when VersionContext.isV6Activated() returns true
  • Scripts using v6 features will fail validation on pre-v6 nodes
  • Design scripts with fallback paths for pre-v6 compatibility during transition

Summary

This chapter covered the v6 protocol features that expand ErgoTree's capabilities:

  • SUnsignedBigInt provides 256-bit unsigned integers for cryptographic modular arithmetic, with six new methods (mod, modInverse, plusMod, subtractMod, multiplyMod, toSigned)

  • Bitwise operations (AND, OR, XOR, NOT, shifts) are now available on all numeric types with consistent semantics and low cost (5 JitCost each)

  • Collection methods (patch, updated, updateMany, reverse, startsWith, endsWith, get) enable efficient immutable collection manipulation

  • Autolykos2 PoW verification is exposed through Header.checkPow() and Global.powHit(), enabling on-chain proof-of-work validation

  • NBits encoding provides compact difficulty target representation with encodeNBits/decodeNBits

  • Serialization methods (Global.serialize, fromBigEndianBytes) support arbitrary value serialization

  • Cost model assigns appropriate costs to each operation, with modInverse (150) and checkPow (700) being the most expensive due to their computational complexity


Previous: Chapter 31 | Next: Appendix A

[3]: Scala: methods.scala:570-625 (SUnsignedBigIntMethods)

[5]: Scala: methods.scala (Bitwise method definitions)

[7]: Scala: Colls.scala (Collection trait)

[9]: Ergo: Autolykos PoW

[10]: Rust: Header type

[11]: Bitcoin Wiki: Difficulty

[12]: Scala: sglobal.scala (SGlobalMethods)

Appendix A: Complete Type Code Table

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Complete reference for all type codes used in ErgoTree serialization[1][2].

Type Code Ranges

Type Code Organization
══════════════════════════════════════════════════════════════════

Range          Usage
─────────────────────────────────────────────────────────────────
0x00           Reserved (invalid)
0x01-0x09      Primitive types (embeddable)
0x0A-0x0B      Reserved
0x0C           Collection type constructor
0x0D-0x17      Reserved
0x18           Nested collection (Coll[Coll[T]])
0x19-0x23      Reserved
0x24           Option type constructor
0x25-0x3B      Reserved
0x3C-0x5F      Pair type constructors
0x60           Tuple type constructor
0x61-0x6A      Object types (non-embeddable)
0x6B-0x6F      Reserved for future object types

Primitive Types (Embeddable)

Embeddable types can be used as element types in collections (Coll[T]) and options (Option[T]) and have compact type-code encodings: the element's code is combined with the constructor's code rather than requiring a separate type declaration. For example, Coll[Int] is encoded as 0x0C 0x04, where 0x04 (Int) immediately follows 0x0C (Coll).

Dec  Hex   Type             Size      Zig Type
──────────────────────────────────────────────────
1    0x01  SBoolean         1 bit     bool
2    0x02  SByte            8 bits    i8
3    0x03  SShort           16 bits   i16
4    0x04  SInt             32 bits   i32
5    0x05  SLong            64 bits   i64
6    0x06  SBigInt          256 bits  BigInt256
7    0x07  SGroupElement    33 bytes  Ecp (compressed)
8    0x08  SSigmaProp       variable  SigmaBoolean
9    0x09  SUnsignedBigInt  256 bits  UnsignedBigInt256

Object Types

Dec  Hex   Type        Description
──────────────────────────────────────────────────
97   0x61  SAny        Supertype of all types
98   0x62  SUnit       Unit type (singleton)
99   0x63  SBox        Transaction box
100  0x64  SAvlTree    Authenticated dictionary
101  0x65  SContext    Execution context
102  0x66  SString     String (ErgoScript only)
103  0x67  STypeVar    Type variable (internal)
104  0x68  SHeader     Block header
105  0x69  SPreHeader  Block pre-header
106  0x6A  SGlobal     Global object (SigmaDslBuilder)

Type Constructors

Dec  Hex   Constructor            Example              Serialized As
────────────────────────────────────────────────────────────────────
12   0x0C  SColl                  Coll[Byte]           0x0C 0x02
24   0x18  Nested SColl           Coll[Coll[Int]]      0x18 0x04
36   0x24  SOption                Option[Long]         0x24 0x05
60   0x3C  Pair (first generic)   (_, Byte)            0x3C 0x02
72   0x48  Pair (second generic)  (Int, _)             0x48 0x04
84   0x54  Pair (symmetric)       (Long, Long)         0x54 0x05
96   0x60  STuple                 (Int, Boolean, ...)  0x60 len types...

Zig Type Definition

const TypeCode = enum(u8) {
    // Primitive types
    boolean = 0x01,
    byte = 0x02,
    short = 0x03,
    int = 0x04,
    long = 0x05,
    big_int = 0x06,
    group_element = 0x07,
    sigma_prop = 0x08,
    unsigned_big_int = 0x09,

    // Type constructors
    coll = 0x0C,
    nested_coll = 0x18,
    option = 0x24,
    pair_first = 0x3C,
    pair_second = 0x48,
    pair_symmetric = 0x54,
    tuple = 0x60,

    // Object types
    any = 0x61,
    unit = 0x62,
    box = 0x63,
    avl_tree = 0x64,
    context = 0x65,
    string = 0x66,
    type_var = 0x67,
    header = 0x68,
    pre_header = 0x69,
    global = 0x6A,

    pub fn isPrimitive(self: TypeCode) bool {
        return @intFromEnum(self) >= 0x01 and @intFromEnum(self) <= 0x09;
    }

    pub fn isEmbeddable(self: TypeCode) bool {
        return self.isPrimitive();
    }

    pub fn isNumeric(self: TypeCode) bool {
        return switch (self) {
            .byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
            else => false,
        };
    }
};

Type Serialization

const SType = union(enum) {
    boolean,
    byte,
    short,
    int,
    long,
    big_int,
    group_element,
    sigma_prop,
    unsigned_big_int,
    coll: *const SType,
    option: *const SType,
    tuple: []const SType,
    box,
    avl_tree,
    context,
    header,
    pre_header,
    global,
    unit,
    any,

    pub fn typeCode(self: *const SType) u8 {
        return switch (self.*) {
            .boolean => 0x01,
            .byte => 0x02,
            .short => 0x03,
            .int => 0x04,
            .long => 0x05,
            .big_int => 0x06,
            .group_element => 0x07,
            .sigma_prop => 0x08,
            .unsigned_big_int => 0x09,
            .coll => |elem| blk: {
                if (elem.* == .coll) break :blk 0x18;
                break :blk 0x0C;
            },
            .option => 0x24,
            .tuple => 0x60,
            .box => 0x63,
            .avl_tree => 0x64,
            .context => 0x65,
            .header => 0x68,
            .pre_header => 0x69,
            .global => 0x6A,
            .unit => 0x62,
            .any => 0x61,
        };
    }
};

Encoding Rules

Type Encoding Examples
══════════════════════════════════════════════════════════════════

Simple Types:
  SInt           → [0x04]
  SBoolean       → [0x01]
  SGroupElement  → [0x07]

Collections:
  Coll[Byte]         → [0x0C, 0x02]      (coll + byte)
  Coll[Int]          → [0x0C, 0x04]      (coll + int)
  Coll[Coll[Byte]]   → [0x18, 0x02]      (nested_coll + byte)

Options:
  Option[Int]        → [0x24, 0x04]      (option + int)
  Option[Box]        → [0x24, 0x63]      (option + box)

Tuples (2 elements):
  (Int, Int)         → [0x54, 0x04]      (symmetric + int)
  (Int, Long)        → [0x48, 0x04, 0x05] (pair2 + int + long)
  (Long, Int)        → [0x3C, 0x05, 0x04] (pair1 + long + int)

Tuples (3+ elements):
  (Int, Long, Byte)  → [0x60, 0x03, 0x04, 0x05, 0x02]
                       (tuple + len + int + long + byte)
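The rules above can be reproduced mechanically. A Python sketch following this appendix's byte-listing convention (the asymmetric pair forms 0x3C/0x48 are omitted for brevity, and the helper names are ours):

```python
# Constructor codes from the tables above
COLL, NESTED_COLL, OPTION, PAIR_SYM, TUPLE = 0x0C, 0x18, 0x24, 0x54, 0x60

def encode_coll(elem: int, nested: bool = False) -> list:
    """Coll[T] -> [0x0C, T]; Coll[Coll[T]] -> [0x18, T]."""
    return [NESTED_COLL if nested else COLL, elem]

def encode_option(elem: int) -> list:
    """Option[T] -> [0x24, T]."""
    return [OPTION, elem]

def encode_tuple(elems: list) -> list:
    """Symmetric pair (T, T) -> [0x54, T]; longer tuples -> [0x60, len, ...].

    Asymmetric two-element pairs (the 0x3C/0x48 forms) are not handled here.
    """
    if len(elems) == 2 and elems[0] == elems[1]:
        return [PAIR_SYM, elems[0]]
    return [TUPLE, len(elems)] + elems
```

For instance, encode_tuple([0x04, 0x05, 0x02]) reproduces the (Int, Long, Byte) example above.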

Constants

const TypeConstants = struct {
    /// First type code for primitive types
    pub const FIRST_PRIMITIVE_TYPE: u8 = 0x01;
    /// Last type code for primitive types
    pub const LAST_PRIMITIVE_TYPE: u8 = 0x09;
    /// Maximum supported type code
    pub const MAX_TYPE_CODE: u8 = 0x6A;
    /// Last data type (can be serialized as data)
    pub const LAST_DATA_TYPE: u8 = 111;
};

Previous: Chapter 32 | Next: Appendix B

[1]: Scala: SType.scala

Appendix B: Complete Opcode Table

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Complete reference for all operation codes used in ErgoTree serialization[1][2].

Opcode Ranges

Opcode Space Organization
══════════════════════════════════════════════════════════════════

Range           Usage                           Encoding
─────────────────────────────────────────────────────────────────
0x00            Reserved (invalid)              -
0x01-0x6F       Data types (constants)          Type code directly
0x70            LastConstantCode boundary       112
0x71-0xFF       Operations                      LastConstantCode + shift

Operation Categories:
─────────────────────────────────────────────────────────────────
0x71-0x79       Variables & references          ValUse, ConstPlaceholder
0x7A-0x7E       Type conversions               Upcast, Downcast
0x7F-0x8C       Constants & tuples             True, False, Tuple
0x8F-0x98       Relations & logic              Lt, Gt, Eq, And, Or
0x99-0xA2       Arithmetic                     Plus, Minus, Multiply
0xA3-0xAC       Context access                 HEIGHT, INPUTS, OUTPUTS
0xAD-0xB8       Collection operations          Map, Filter, Fold
0xC1-0xC7       Box extraction                 ExtractAmount, ExtractId
0xCB-0xD5       Crypto & serialization         Blake2b, ProveDlog
0xD6-0xE7       Blocks & functions             ValDef, FuncValue, Apply
0xEA-0xEB       Sigma operations               SigmaAnd, SigmaOr
0xEC-0xFF       Bitwise & misc                 BitOr, BitAnd, XorOf

Zig Opcode Definition

const OpCode = enum(u8) {
    // Constants region: 0x01-0x70 (type codes)
    // Operations start at LAST_CONSTANT_CODE + 1 = 113

    // Variable references
    tagged_variable = 0x71,      // Context variable by ID
    val_use = 0x72,              // Reference to ValDef binding
    constant_placeholder = 0x73, // Segregated constant reference
    subst_constants = 0x74,      // Substitute constants in tree

    // Type conversions
    long_to_byte_array = 0x7A,
    byte_array_to_bigint = 0x7B,
    byte_array_to_long = 0x7C,
    downcast = 0x7D,
    upcast = 0x7E,

    // Primitive constants
    true_const = 0x7F,
    false_const = 0x80,
    unit_constant = 0x81,
    group_generator = 0x82,

    // Collection & tuple construction
    concrete_collection = 0x83,
    concrete_collection_bool = 0x85,
    tuple = 0x86,
    select_1 = 0x87,
    select_2 = 0x88,
    select_3 = 0x89,
    select_4 = 0x8A,
    select_5 = 0x8B,
    select_field = 0x8C,

    // Relational operations
    lt = 0x8F,
    le = 0x90,
    gt = 0x91,
    ge = 0x92,
    eq = 0x93,
    neq = 0x94,

    // Control flow & logic
    if_op = 0x95,
    and_op = 0x96,
    or_op = 0x97,
    atleast = 0x98,

    // Arithmetic
    minus = 0x99,
    plus = 0x9A,
    xor = 0x9B,
    multiply = 0x9C,
    division = 0x9D,
    modulo = 0x9E,
    exponentiate = 0x9F,
    multiply_group = 0xA0,
    min = 0xA1,
    max = 0xA2,

    // Context access
    height = 0xA3,
    inputs = 0xA4,
    outputs = 0xA5,
    last_block_utxo_root_hash = 0xA6,
    self_box = 0xA7,
    miner_pubkey = 0xAC,

    // Collection operations
    map_collection = 0xAD,
    exists = 0xAE,
    forall = 0xAF,
    fold = 0xB0,
    size_of = 0xB1,
    by_index = 0xB2,
    append = 0xB3,
    slice = 0xB4,
    filter = 0xB5,
    avl_tree = 0xB6,
    avl_tree_get = 0xB7,
    flat_map = 0xB8,

    // Box extraction
    extract_amount = 0xC1,
    extract_script_bytes = 0xC2,
    extract_bytes = 0xC3,
    extract_bytes_with_no_ref = 0xC4,
    extract_id = 0xC5,
    extract_register_as = 0xC6,
    extract_creation_info = 0xC7,

    // Cryptographic operations
    calc_blake2b256 = 0xCB,
    calc_sha256 = 0xCC,
    prove_dlog = 0xCD,
    prove_diffie_hellman_tuple = 0xCE,
    sigma_prop_is_proven = 0xCF,
    sigma_prop_bytes = 0xD0,
    bool_to_sigma_prop = 0xD1,
    trivial_prop_false = 0xD2,
    trivial_prop_true = 0xD3,

    // Deserialization
    deserialize_context = 0xD4,
    deserialize_register = 0xD5,

    // Block & function definitions
    val_def = 0xD6,
    fun_def = 0xD7,
    block_value = 0xD8,
    func_value = 0xD9,
    func_apply = 0xDA,
    property_call = 0xDB,
    method_call = 0xDC,
    global = 0xDD,

    // Option operations
    some_value = 0xDE,
    none_value = 0xDF,
    get_var = 0xE3,
    option_get = 0xE4,
    option_get_or_else = 0xE5,
    option_is_defined = 0xE6,

    // Modular arithmetic (deprecated in v5+)
    mod_q = 0xE7,
    plus_mod_q = 0xE8,
    minus_mod_q = 0xE9,

    // Sigma operations
    sigma_and = 0xEA,
    sigma_or = 0xEB,

    // Binary operations
    bin_or = 0xEC,
    bin_and = 0xED,
    decode_point = 0xEE,
    logical_not = 0xEF,
    negation = 0xF0,

    // Bitwise operations
    bit_inversion = 0xF1,
    bit_or = 0xF2,
    bit_and = 0xF3,
    bin_xor = 0xF4,
    bit_xor = 0xF5,
    bit_shift_right = 0xF6,
    bit_shift_left = 0xF7,
    bit_shift_right_zeroed = 0xF8,

    // Collection bitwise operations
    coll_shift_right = 0xF9,
    coll_shift_left = 0xFA,
    coll_shift_right_zeroed = 0xFB,
    coll_rotate_left = 0xFC,
    coll_rotate_right = 0xFD,

    // Misc
    context = 0xFE,
    xor_of = 0xFF,

    // Non-exhaustive marker: the reserved gaps in the opcode space
    // (e.g. 0x75-0x79) stay representable, so @enumFromInt on an
    // arbitrary opcode byte is well-defined.
    _,

    pub fn isConstant(code: u8) bool {
        return code >= 0x01 and code <= 0x70;
    }

    pub fn isOperation(code: u8) bool {
        return code > 0x70;
    }

    pub fn fromShift(shift: u8) OpCode {
        return @enumFromInt(0x70 + shift);
    }
};

Variable & Reference Operations

Hex   Decimal  Operation            Description
────────────────────────────────────────────────────────────
0x71  113      TaggedVariable       Reference context variable by ID
0x72  114      ValUse               Use value defined by ValDef
0x73  115      ConstantPlaceholder  Reference segregated constant
0x74  116      SubstConstants       Substitute constants in tree

Type Conversion Operations

Hex   Decimal  Operation          Description
────────────────────────────────────────────────────────────
0x7A  122      LongToByteArray    Long → Coll[Byte] (big-endian)
0x7B  123      ByteArrayToBigInt  Coll[Byte] → BigInt
0x7C  124      ByteArrayToLong    Coll[Byte] → Long
0x7D  125      Downcast           Numeric downcast (may overflow)
0x7E  126      Upcast             Numeric upcast (always safe)

Constants & Tuples

Hex        Decimal  Operation               Description
────────────────────────────────────────────────────────────
0x7F       127      True                    Boolean true constant
0x80       128      False                   Boolean false constant
0x81       129      UnitConstant            Unit () value
0x82       130      GroupGenerator          EC generator point G
0x83       131      ConcreteCollection      Coll construction
0x85       133      ConcreteCollectionBool  Optimized Coll[Boolean]
0x86       134      Tuple                   Tuple construction
0x87-0x8B  135-139  Select1-5               Tuple element access
0x8C       140      SelectField             Select by field index

Relational & Logic Operations

Hex   Decimal  Operation  Description
────────────────────────────────────────────────────────────
0x8F  143      Lt         Less than (<)
0x90  144      Le         Less or equal (≤)
0x91  145      Gt         Greater than (>)
0x92  146      Ge         Greater or equal (≥)
0x93  147      Eq         Equal (==)
0x94  148      Neq        Not equal (≠)
0x95  149      If         If-then-else
0x96  150      And        Logical AND (&&)
0x97  151      Or         Logical OR (||)
0x98  152      AtLeast    k-of-n threshold

Arithmetic Operations

Hex   Decimal  Operation      Description
────────────────────────────────────────────────────────────
0x99  153      Minus          Subtraction
0x9A  154      Plus           Addition
0x9B  155      Xor            Byte-array XOR
0x9C  156      Multiply       Multiplication
0x9D  157      Division       Integer division
0x9E  158      Modulo         Remainder
0x9F  159      Exponentiate   BigInt exponentiation
0xA0  160      MultiplyGroup  EC point multiplication
0xA1  161      Min            Minimum
0xA2  162      Max            Maximum

Context Access Operations

Hex   Decimal  Operation              Description
────────────────────────────────────────────────────────────
0xA3  163      Height                 Current block height
0xA4  164      Inputs                 Transaction inputs (INPUTS)
0xA5  165      Outputs                Transaction outputs (OUTPUTS)
0xA6  166      LastBlockUtxoRootHash  UTXO tree root hash
0xA7  167      Self                   Current box (SELF)
0xAC  172      MinerPubkey            Miner's public key
0xFE  254      Context                Context object

Collection Operations

Hex   Decimal  Operation      Description
────────────────────────────────────────────────────────────
0xAD  173      MapCollection  Transform elements
0xAE  174      Exists         Any element matches
0xAF  175      ForAll         All elements match
0xB0  176      Fold           Reduce to single value
0xB1  177      SizeOf         Collection length
0xB2  178      ByIndex        Element at index
0xB3  179      Append         Concatenate collections
0xB4  180      Slice          Extract sub-collection
0xB5  181      Filter         Keep matching elements
0xB6  182      AvlTree        AVL tree construction
0xB7  183      AvlTreeGet     AVL tree lookup
0xB8  184      FlatMap        Map and flatten

Box Extraction Operations

Hex   Decimal  Operation              Description
────────────────────────────────────────────────────────────
0xC1  193      ExtractAmount          Box.value (nanoErgs)
0xC2  194      ExtractScriptBytes     Box.propositionBytes
0xC3  195      ExtractBytes           Box.bytes (full)
0xC4  196      ExtractBytesWithNoRef  Box.bytesWithoutRef
0xC5  197      ExtractId              Box.id (32 bytes)
0xC6  198      ExtractRegisterAs      Box.Rx[T]
0xC7  199      ExtractCreationInfo    Box.creationInfo

Cryptographic Operations

Hex   Decimal  Operation          Description
────────────────────────────────────────────────────────────
0xCB  203      CalcBlake2b256     Blake2b256 hash
0xCC  204      CalcSha256         SHA-256 hash
0xCD  205      ProveDlog          DLog proposition
0xCE  206      ProveDHTuple       DHT proposition
0xCF  207      SigmaPropIsProven  Check proven
0xD0  208      SigmaPropBytes     Serialize SigmaProp
0xD1  209      BoolToSigmaProp    Bool → SigmaProp
0xD2  210      TrivialPropFalse   Always false
0xD3  211      TrivialPropTrue    Always true
0xEE  238      DecodePoint        Bytes → GroupElement

Block & Function Operations

Hex   Decimal  Operation            Description
────────────────────────────────────────────────────────────
0xD4  212      DeserializeContext   Deserialize from context
0xD5  213      DeserializeRegister  Deserialize from register
0xD6  214      ValDef               Define value binding
0xD7  215      FunDef               Define function
0xD8  216      BlockValue           Block expression { }
0xD9  217      FuncValue            Lambda expression
0xDA  218      FuncApply            Apply function
0xDB  219      PropertyCall         Property access
0xDC  220      MethodCall           Method invocation
0xDD  221      Global               Global object

Option Operations

Hex   Decimal  Operation        Description
────────────────────────────────────────────────────────────
0xDE  222      SomeValue        Some(x) construction
0xDF  223      NoneValue        None construction
0xE3  227      GetVar           Get context variable
0xE4  228      OptionGet        Option.get (may fail)
0xE5  229      OptionGetOrElse  Option.getOrElse
0xE6  230      OptionIsDefined  Option.isDefined

Sigma Operations

Hex   Decimal  Operation  Description
────────────────────────────────────────────────────────────
0xEA  234      SigmaAnd   Sigma AND (∧)
0xEB  235      SigmaOr    Sigma OR (∨)

Bitwise Operations (v6+)

Hex   Decimal  Operation            Description
────────────────────────────────────────────────────────────
0xEF  239      LogicalNot           Boolean NOT (!)
0xF0  240      Negation             Numeric negation (-x)
0xF1  241      BitInversion         Bitwise NOT (~)
0xF2  242      BitOr                Bitwise OR (|)
0xF3  243      BitAnd               Bitwise AND (&)
0xF4  244      BinXor               Binary XOR
0xF5  245      BitXor               Bitwise XOR (^)
0xF6  246      BitShiftRight        Arithmetic right shift (>>)
0xF7  247      BitShiftLeft         Left shift (<<)
0xF8  248      BitShiftRightZeroed  Logical right shift (>>>)

Collection Bitwise Operations (v6+)

Hex   Decimal  Operation             Description
────────────────────────────────────────────────────────────
0xF9  249      CollShiftRight        Collection shift right
0xFA  250      CollShiftLeft         Collection shift left
0xFB  251      CollShiftRightZeroed  Collection logical shift right
0xFC  252      CollRotateLeft        Collection rotate left
0xFD  253      CollRotateRight       Collection rotate right
0xFF  255      XorOf                 XOR of collection elements

Opcode Parsing

const OpCodeParser = struct {
    /// Parse opcode from byte, determining if constant or operation
    pub fn parse(byte: u8) ParseResult {
        if (byte == 0) return .invalid;
        if (byte <= 0x70) return .{ .constant = byte };
        return .{ .operation = @enumFromInt(byte) };
    }

    /// Check if opcode requires additional data
    pub fn hasPayload(op: OpCode) bool {
        return switch (op) {
            .val_use,
            .constant_placeholder,
            .tagged_variable,
            .extract_register_as,
            .by_index,
            .select_field,
            .method_call,
            .property_call,
            => true,
            else => false,
        };
    }

    const ParseResult = union(enum) {
        invalid,
        constant: u8,
        operation: OpCode,
    };
};
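The constant/operation split can also be sketched in Python (classification only; the function and label names are ours):

```python
LAST_DATA_TYPE = 0x6F      # 111, last type code usable as data
LAST_CONSTANT_CODE = 0x70  # 112, boundary between constants and operations

def classify_opcode(byte: int) -> str:
    """Classify a serialized byte per the opcode ranges above (sketch)."""
    if byte == 0x00:
        return "invalid"           # 0x00 is reserved
    if byte <= LAST_CONSTANT_CODE:
        return "constant"          # a type code: a constant value follows
    return "operation"             # an operation node
```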

Constants

const OpCodeConstants = struct {
    /// First valid data type code
    pub const FIRST_DATA_TYPE: u8 = 0x01;
    /// Last data type code
    pub const LAST_DATA_TYPE: u8 = 111; // 0x6F
    /// Boundary between constants and operations
    pub const LAST_CONSTANT_CODE: u8 = 112; // 0x70
    /// First operation code
    pub const FIRST_OP_CODE: u8 = 113; // 0x71
    /// Maximum opcode value
    pub const MAX_OP_CODE: u8 = 255; // 0xFF
};

Previous: Appendix A | Next: Appendix C

[1]: Scala: OpCodes.scala

Appendix C: Cost Table

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories.

Complete reference for operation costs in the JIT cost model[1][2].

Cost Model Architecture

Cost Model Structure
══════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────┐
│                       CostKind                                  │
├─────────────┬─────────────┬─────────────┬─────────────────────┤
│  FixedCost  │PerItemCost  │TypeBasedCost│   DynamicCost       │
│             │             │             │                      │
│  cost: u32  │ base: u32   │ costFunc()  │ sum of sub-costs    │
│             │ per_chunk   │ per type    │                      │
│             │ chunk_size  │             │                      │
└─────────────┴─────────────┴─────────────┴─────────────────────┘

Cost Calculation Flow:
─────────────────────────────────────────────────────────────────
                 ┌─────────────┐
                 │  Operation  │
                 └──────┬──────┘
                        │
         ┌──────────────┼──────────────┐
         ▼              ▼              ▼
    ┌─────────┐   ┌──────────┐   ┌─────────┐
    │ FixedOp │   │PerItemOp │   │TypedOp  │
    │ cost=26 │   │base=20   │   │depends  │
    └────┬────┘   │chunk=10  │   │on type  │
         │        └────┬─────┘   └────┬────┘
         │             │              │
         └─────────────┼──────────────┘
                       ▼
              ┌────────────────┐
              │CostAccumulator │
              │ accum += cost  │
              │ check < limit  │
              └────────────────┘

Zig Cost Types

const JitCost = struct {
    value: u32,

    pub fn add(self: JitCost, other: JitCost) !JitCost {
        return .{ .value = try std.math.add(u32, self.value, other.value) };
    }
};

const CostKind = union(enum) {
    fixed: FixedCost,
    per_item: PerItemCost,
    type_based: TypeBasedCost,
    dynamic,

    pub fn compute(self: CostKind, ctx: CostContext) JitCost {
        return switch (self) {
            .fixed => |f| f.cost,
            .per_item => |p| p.compute(ctx.n_items),
            .type_based => |t| t.costFunc(ctx.tpe),
            .dynamic => ctx.computed_cost,
        };
    }
};

/// Fixed cost regardless of input
const FixedCost = struct {
    cost: JitCost,
};

/// Cost proportional to collection size
const PerItemCost = struct {
    base: JitCost,
    per_chunk: JitCost,
    chunk_size: usize,

    /// totalCost = base + per_chunk * ceil(n_items / chunk_size)
    pub fn compute(self: PerItemCost, n_items: usize) JitCost {
        const chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
        return .{
            .value = self.base.value + @as(u32, @intCast(chunks)) * self.per_chunk.value,
        };
    }
};

/// Cost depends on type
const TypeBasedCost = struct {
    primitive_cost: JitCost,
    bigint_cost: JitCost,
    collection_cost: ?PerItemCost,

    pub fn costFunc(self: TypeBasedCost, tpe: SType) JitCost {
        return switch (tpe) {
            .byte, .short, .int, .long => self.primitive_cost,
            .big_int, .unsigned_big_int => self.bigint_cost,
            // A collection's element count is not part of its SType, so it
            // cannot be derived here; per-item collection costs are charged
            // separately via `collection_cost` (a PerItemCost) with the
            // actual item count known at evaluation time.
            .coll => self.primitive_cost,
            else => self.primitive_cost,
        };
    }
};
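The PerItemCost formula (total = base + per_chunk × ceil(n_items / chunk_size)) is easy to check numerically; a Python sketch:

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, n_items: int) -> int:
    """total = base + per_chunk * ceil(n_items / chunk_size)"""
    return base + per_chunk * math.ceil(n_items / chunk_size)
```

For example, Blake2b256 with (base=20, per_chunk=7, chunk_size=128) over 300 bytes charges 20 + 7 × ceil(300/128) = 20 + 7 × 3 = 41.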

Cost Accumulator

const CostAccumulator = struct {
    accum: u64,
    limit: u64,

    pub fn init(limit: u64) CostAccumulator {
        return .{ .accum = 0, .limit = limit };
    }

    pub fn add(self: *CostAccumulator, cost: JitCost) !void {
        self.accum += cost.value;
        if (self.accum > self.limit) {
            return error.CostLimitExceeded;
        }
    }

    pub fn addSeq(
        self: *CostAccumulator,
        cost: PerItemCost,
        n_items: usize,
    ) !void {
        try self.add(cost.compute(n_items));
    }

    pub fn totalCost(self: *const CostAccumulator) JitCost {
        return .{ .value = @intCast(self.accum) };
    }
};
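The accumulator's add-then-check pattern, sketched in Python for illustration (the exception name is ours):

```python
class CostLimitExceeded(Exception):
    pass

class CostAccumulator:
    """Accumulate operation costs and fail fast once the limit is passed (sketch)."""

    def __init__(self, limit: int):
        self.accum = 0
        self.limit = limit

    def add(self, cost: int) -> None:
        # Add first, then check: evaluation aborts as soon as the
        # running total exceeds the limit.
        self.accum += cost
        if self.accum > self.limit:
            raise CostLimitExceeded(f"{self.accum} > {self.limit}")
```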

Fixed Cost Operations

Operation              Cost  Description
──────────────────────────────────────────────────
ConstantPlaceholder    1     Reference segregated constant
Height                 1     Current block height
Inputs                 1     Transaction inputs
Outputs                1     Transaction outputs
LastBlockUtxoRootHash  1     UTXO root hash
Self                   1     Self box
MinerPubkey            1     Miner public key
ValUse                 5     Use defined value
TaggedVariable         5     Context variable
SomeValue              5     Option Some
NoneValue              5     Option None
SelectField            8     Select tuple field
CreateProveDlog        10    Create DLog
OptionGetOrElse        10    Option.getOrElse
OptionIsDefined        10    Option.isDefined
OptionGet              10    Option.get
ExtractAmount          10    Box value
ExtractScriptBytes     10    Proposition bytes
ExtractId              10    Box ID
Tuple                  10    Create tuple
Select1-5              12    Select tuple element
ByIndex                14    Collection access
BoolToSigmaProp        15    Bool → SigmaProp
DeserializeContext     15    Deserialize context
DeserializeRegister    15    Deserialize register
ByteArrayToLong        16    Bytes → Long
LongToByteArray        17    Long → bytes
CreateProveDHTuple     20    Create DHT
If                     20    Conditional
LogicalNot             20    Boolean NOT
Negation               20    Numeric negation
ArithOp                26    Plus, Minus, etc.
ByteArrayToBigInt      30    Bytes → BigInt
SubstConstants         30    Substitute constants
SizeOf                 30    Collection size
MultiplyGroup          40    EC point multiply
ExtractRegisterAs      50    Register access
Exponentiate           300   BigInt exponent
DecodePoint            900   Decode EC point

Per-Item Cost Operations

Operation       Base  Per Chunk  Chunk Size
────────────────────────────────────────────
CalcBlake2b256  20    7          128
CalcSha256      20    8          64
MapCollection   20    1          10
Exists          20    5          10
ForAll          20    5          10
Fold            20    1          10
Filter          20    5          10
FlatMap         20    5          10
Slice           10    2          100
Append          20    2          100
SigmaAnd        10    2          1
SigmaOr         10    2          1
AND (logical)   10    5          32
OR (logical)    10    5          32
XorOf           20    5          32
AtLeast         20    3          1
Xor (bytes)     10    2          128

Type-Based Costs

Numeric Casting

Target Type             Cost
────────────────────────────
Byte, Short, Int, Long  10
BigInt                  30
UnsignedBigInt          30

Comparison Operations

Type         Cost
────────────────────────────
Primitives   10-20
BigInt       30
Collections  PerItemCost
Tuples       Sum of components

Interpreter Overhead

Cost Type            Value   Description
──────────────────────────────────────────
interpreterInitCost  10,000  Interpreter init
inputCost            2,000   Per input
dataInputCost        100     Per data input
outputCost           100     Per output
tokenAccessCost      100     Per token

Cost Limits

Parameter        Value       Description
──────────────────────────────────────────
maxBlockCost     1,000,000   Max per block
scriptCostLimit  ~8,000,000  Single script

Zig Cost Constants

const OperationCosts = struct {
    // Context access (very cheap)
    pub const HEIGHT: FixedCost = .{ .cost = .{ .value = 1 } };
    pub const INPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
    pub const OUTPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
    pub const SELF: FixedCost = .{ .cost = .{ .value = 1 } };

    // Variable access
    pub const VAL_USE: FixedCost = .{ .cost = .{ .value = 5 } };
    pub const CONSTANT_PLACEHOLDER: FixedCost = .{ .cost = .{ .value = 1 } };

    // Arithmetic
    pub const ARITH_OP: FixedCost = .{ .cost = .{ .value = 26 } };
    pub const COMPARISON: FixedCost = .{ .cost = .{ .value = 20 } };

    // Box extraction
    pub const EXTRACT_AMOUNT: FixedCost = .{ .cost = .{ .value = 10 } };
    pub const EXTRACT_REGISTER: FixedCost = .{ .cost = .{ .value = 50 } };

    // Cryptographic
    pub const PROVE_DLOG: FixedCost = .{ .cost = .{ .value = 10 } };
    pub const PROVE_DHT: FixedCost = .{ .cost = .{ .value = 20 } };
    pub const DECODE_POINT: FixedCost = .{ .cost = .{ .value = 900 } };
    pub const MULTIPLY_GROUP: FixedCost = .{ .cost = .{ .value = 40 } };
    pub const EXPONENTIATE: FixedCost = .{ .cost = .{ .value = 300 } };

    // Hashing
    pub const BLAKE2B256: PerItemCost = .{
        .base = .{ .value = 20 },
        .per_chunk = .{ .value = 7 },
        .chunk_size = 128,
    };
    pub const SHA256: PerItemCost = .{
        .base = .{ .value = 20 },
        .per_chunk = .{ .value = 8 },
        .chunk_size = 64,
    };

    // Collection operations
    pub const MAP: PerItemCost = .{
        .base = .{ .value = 20 },
        .per_chunk = .{ .value = 1 },
        .chunk_size = 10,
    };
    pub const FILTER: PerItemCost = .{
        .base = .{ .value = 20 },
        .per_chunk = .{ .value = 5 },
        .chunk_size = 10,
    };
    pub const FOLD: PerItemCost = .{
        .base = .{ .value = 20 },
        .per_chunk = .{ .value = 1 },
        .chunk_size = 10,
    };

    // Sigma operations
    pub const SIGMA_AND: PerItemCost = .{
        .base = .{ .value = 10 },
        .per_chunk = .{ .value = 2 },
        .chunk_size = 1,
    };
    pub const SIGMA_OR: PerItemCost = .{
        .base = .{ .value = 10 },
        .per_chunk = .{ .value = 2 },
        .chunk_size = 1,
    };
};

const InterpreterCosts = struct {
    pub const INIT: u32 = 10_000;
    pub const PER_INPUT: u32 = 2_000;
    pub const PER_DATA_INPUT: u32 = 100;
    pub const PER_OUTPUT: u32 = 100;
    pub const PER_TOKEN: u32 = 100;
};

const CostLimits = struct {
    pub const MAX_BLOCK_COST: u64 = 1_000_000;
    pub const MAX_SCRIPT_COST: u64 = 8_000_000;
};

Cost Calculation Example

/// Calculate total cost for transaction verification
fn calculateTxCost(
    n_inputs: usize,
    n_data_inputs: usize,
    n_outputs: usize,
    script_costs: []const JitCost,
) u64 {
    var total: u64 = InterpreterCosts.INIT;

    total += @as(u64, n_inputs) * InterpreterCosts.PER_INPUT;
    total += @as(u64, n_data_inputs) * InterpreterCosts.PER_DATA_INPUT;
    total += @as(u64, n_outputs) * InterpreterCosts.PER_OUTPUT;

    for (script_costs) |cost| {
        total += cost.value;
    }

    return total;
}

// Example: 2 inputs, 1 data input, 3 outputs
// Base: 10,000 + 4,000 + 100 + 300 = 14,400
// Plus script costs per input
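The worked figure above can be checked directly; a Python sketch using the interpreter overhead constants from this appendix:

```python
# Interpreter overhead constants (from the Interpreter Overhead table)
INIT, PER_INPUT, PER_DATA_INPUT, PER_OUTPUT = 10_000, 2_000, 100, 100

def tx_base_cost(n_inputs: int, n_data_inputs: int, n_outputs: int) -> int:
    """Fixed verification overhead, before any script costs are added."""
    return (INIT
            + n_inputs * PER_INPUT
            + n_data_inputs * PER_DATA_INPUT
            + n_outputs * PER_OUTPUT)
```

tx_base_cost(2, 1, 3) reproduces the 14,400 figure: 10,000 + 4,000 + 100 + 300.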

Previous: Appendix B | Next: Appendix D

[1]: Scala: CostKind.scala

Appendix D: Method Reference

PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories listed in the Bibliography.

Complete reference for all methods available on each type.[1][2]

Method Organization

Method System Architecture
══════════════════════════════════════════════════════════════════

                    ┌────────────────────────┐
                    │    STypeCompanion      │
                    │  type_code: TypeCode   │
                    │  methods: []SMethod    │
                    └──────────┬─────────────┘
                               │
       ┌───────────────────────┼───────────────────────┐
       ▼                       ▼                       ▼
┌──────────────┐       ┌──────────────┐       ┌──────────────┐
│  SNumeric    │       │    SBox      │       │   SColl      │
│  methods.len │       │  methods.len │       │ methods.len  │
│     = 13     │       │     = 10     │       │    = 20+     │
└──────────────┘       └──────────────┘       └──────────────┘

Method Lookup:
─────────────────────────────────────────────────────────────────
  receiver.methodCall(type_code=99, method_id=1)
       │
       ▼
  STypeCompanion::Box.method_by_id(1)
       │
       ▼
  SMethod { name: "value", tpe: Box => Long, cost: 1 }

Zig Method Descriptors

const SMethod = struct {
    name: []const u8,
    method_id: u8,
    tpe: SFunc,
    cost_kind: CostKind,
    min_version: ?ErgoTreeVersion = null,

    pub fn isV6Only(self: *const SMethod) bool {
        return self.min_version != null and
            @intFromEnum(self.min_version.?) >= 3;
    }
};

const SFunc = struct {
    t_dom: []const SType,  // Domain (receiver + args)
    t_range: SType,        // Return type

    pub fn unary(recv: SType, ret: SType) SFunc {
        return .{ .t_dom = &[_]SType{recv}, .t_range = ret };
    }

    pub fn binary(recv: SType, arg: SType, ret: SType) SFunc {
        return .{ .t_dom = &[_]SType{ recv, arg }, .t_range = ret };
    }
};
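The lookup flow in the diagram above resolves a method from the pair (type_code, method_id). This Rust sketch is hypothetical and holds only two illustrative SBox entries (names and costs taken from the SBox table below); real type companions register their full method tables.

```rust
// Minimal sketch of method resolution by (type_code, method_id).
// Only two illustrative SBox entries are registered here.
struct SMethod {
    name: &'static str,
    method_id: u8,
    cost: u32,
}

const BOX_TYPE_CODE: u8 = 99; // 0x63, per the type-code table in Appendix E

const BOX_METHODS: &[SMethod] = &[
    SMethod { name: "value", method_id: 1, cost: 1 },
    SMethod { name: "propositionBytes", method_id: 2, cost: 10 },
];

fn method_by_id(type_code: u8, method_id: u8) -> Option<&'static SMethod> {
    let table = match type_code {
        BOX_TYPE_CODE => BOX_METHODS,
        _ => return None, // unknown type companion
    };
    table.iter().find(|m| m.method_id == method_id)
}

fn main() {
    let m = method_by_id(99, 1).unwrap();
    assert_eq!(m.name, "value");
    assert_eq!(m.cost, 1);
    assert!(method_by_id(99, 200).is_none());
}
```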

Numeric Types (SByte, SShort, SInt, SLong) [3]

| ID | Method | Signature | v5 | v6 | Cost |
|----|--------|-----------|----|----|------|
| 1 | toByte | T → Byte | ✓ | ✓ | 10 |
| 2 | toShort | T → Short | ✓ | ✓ | 10 |
| 3 | toInt | T → Int | ✓ | ✓ | 10 |
| 4 | toLong | T → Long | ✓ | ✓ | 10 |
| 5 | toBigInt | T → BigInt | ✓ | ✓ | 30 |
| 6 | toBytes | T → Coll[Byte] | - | ✓ | 5 |
| 7 | toBits | T → Coll[Boolean] | - | ✓ | 5 |
| 8 | bitwiseInverse | T → T | - | ✓ | 5 |
| 9 | bitwiseOr | (T, T) → T | - | ✓ | 5 |
| 10 | bitwiseAnd | (T, T) → T | - | ✓ | 5 |
| 11 | bitwiseXor | (T, T) → T | - | ✓ | 5 |
| 12 | shiftLeft | (T, Int) → T | - | ✓ | 5 |
| 13 | shiftRight | (T, Int) → T | - | ✓ | 5 |

SBigInt [4]

| ID | Method | Signature | v5 | v6 | Cost |
|----|--------|-----------|----|----|------|
| 1-5 | toXxx | Conversions | ✓ | ✓ | 10-30 |
| 6-13 | bitwise | Bitwise ops | - | ✓ | 5-10 |
| 14 | toUnsigned | BigInt → UnsignedBigInt | - | ✓ | 5 |
| 15 | toUnsignedMod | (BigInt, UBI) → UBI | - | ✓ | 10 |

SUnsignedBigInt (v6+) [5]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 14 | modInverse | (UBI, UBI) → UBI | 50 |
| 15 | plusMod | (UBI, UBI, UBI) → UBI | 10 |
| 16 | subtractMod | (UBI, UBI, UBI) → UBI | 10 |
| 17 | multiplyMod | (UBI, UBI, UBI) → UBI | 15 |
| 18 | mod | (UBI, UBI) → UBI | 10 |
| 19 | toSigned | UBI → BigInt | 5 |

SGroupElement [6]

| ID | Method | Signature | v5 | v6 | Cost |
|----|--------|-----------|----|----|------|
| 2 | getEncoded | GE → Coll[Byte] | ✓ | ✓ | 250 |
| 3 | exp | (GE, BigInt) → GE | ✓ | ✓ | 900 |
| 4 | multiply | (GE, GE) → GE | ✓ | ✓ | 40 |
| 5 | negate | GE → GE | ✓ | ✓ | 45 |
| 6 | expUnsigned | (GE, UBI) → GE | - | ✓ | 900 |

SSigmaProp [7]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | propBytes | SigmaProp → Coll[Byte] | 35 |
| 2 | isProven | SigmaProp → Boolean | 10 |

SBox [8]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | value | Box → Long | 1 |
| 2 | propositionBytes | Box → Coll[Byte] | 10 |
| 3 | bytes | Box → Coll[Byte] | 10 |
| 4 | bytesWithoutRef | Box → Coll[Byte] | 10 |
| 5 | id | Box → Coll[Byte] | 10 |
| 6 | creationInfo | Box → (Int, Coll[Byte]) | 10 |
| 7 | getReg[T] | (Box, Int) → Option[T] | 50 |
| 8 | tokens | Box → Coll[(Coll[Byte], Long)] | 15 |

Register Access

const BoxMethods = struct {
    // R0-R3: mandatory registers
    pub const R0 = makeRegMethod(0);  // monetary value
    pub const R1 = makeRegMethod(1);  // guard script
    pub const R2 = makeRegMethod(2);  // tokens
    pub const R3 = makeRegMethod(3);  // creation info
    // R4-R9: optional registers
    pub const R4 = makeRegMethod(4);
    pub const R5 = makeRegMethod(5);
    pub const R6 = makeRegMethod(6);
    pub const R7 = makeRegMethod(7);
    pub const R8 = makeRegMethod(8);
    pub const R9 = makeRegMethod(9);

    fn makeRegMethod(comptime idx: u8) SMethod {
        return .{
            .method_id = 7,  // getReg opcode
            .name = "R" ++ &[_]u8{'0' + idx},
            .cost_kind = .{ .fixed = .{ .cost = .{ .value = 50 } } },
        };
    }
};

SAvlTree [9]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | digest | AvlTree → Coll[Byte] | 15 |
| 2 | enabledOperations | AvlTree → Byte | 15 |
| 3 | keyLength | AvlTree → Int | 15 |
| 4 | valueLengthOpt | AvlTree → Option[Int] | 15 |
| 5 | isInsertAllowed | AvlTree → Boolean | 15 |
| 6 | isUpdateAllowed | AvlTree → Boolean | 15 |
| 7 | isRemoveAllowed | AvlTree → Boolean | 15 |
| 8 | updateOperations | (AvlTree, Byte) → AvlTree | 20 |
| 9 | contains | (AvlTree, key, proof) → Boolean | dynamic |
| 10 | get | (AvlTree, key, proof) → Option[Coll[Byte]] | dynamic |
| 11 | getMany | (AvlTree, keys, proof) → Coll[Option[...]] | dynamic |
| 12 | insert | (AvlTree, entries, proof) → Option[AvlTree] | dynamic |
| 13 | update | (AvlTree, operations, proof) → Option[AvlTree] | dynamic |
| 14 | remove | (AvlTree, keys, proof) → Option[AvlTree] | dynamic |
| 15 | updateDigest | (AvlTree, Coll[Byte]) → AvlTree | 20 |

SContext [10]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | dataInputs | Context → Coll[Box] | 15 |
| 2 | headers | Context → Coll[Header] | 15 |
| 3 | preHeader | Context → PreHeader | 10 |
| 4 | INPUTS | Context → Coll[Box] | 10 |
| 5 | OUTPUTS | Context → Coll[Box] | 10 |
| 6 | HEIGHT | Context → Int | 26 |
| 7 | SELF | Context → Box | 10 |
| 8 | selfBoxIndex | Context → Int | 20 |
| 9 | LastBlockUtxoRootHash | Context → AvlTree | 15 |
| 10 | minerPubKey | Context → Coll[Byte] | 20 |
| 11 | getVar[T] | (Context, Byte) → Option[T] | dynamic |

SHeader [11]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | id | Header → Coll[Byte] | 10 |
| 2 | version | Header → Byte | 10 |
| 3 | parentId | Header → Coll[Byte] | 10 |
| 4 | ADProofsRoot | Header → Coll[Byte] | 10 |
| 5 | stateRoot | Header → AvlTree | 10 |
| 6 | transactionsRoot | Header → Coll[Byte] | 10 |
| 7 | timestamp | Header → Long | 10 |
| 8 | nBits | Header → Long | 10 |
| 9 | height | Header → Int | 10 |
| 10 | extensionRoot | Header → Coll[Byte] | 10 |
| 11 | minerPk | Header → GroupElement | 10 |
| 12 | powOnetimePk | Header → GroupElement | 10 |
| 13 | powNonce | Header → Coll[Byte] | 10 |
| 14 | powDistance | Header → BigInt | 10 |
| 15 | votes | Header → Coll[Byte] | 10 |
| 16 | checkPow (v6+) | Header → Boolean | 500 |

SPreHeader [12]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | version | PreHeader → Byte | 10 |
| 2 | parentId | PreHeader → Coll[Byte] | 10 |
| 3 | timestamp | PreHeader → Long | 10 |
| 4 | nBits | PreHeader → Long | 10 |
| 5 | height | PreHeader → Int | 10 |
| 6 | minerPk | PreHeader → GroupElement | 10 |
| 7 | votes | PreHeader → Coll[Byte] | 10 |

SGlobal [13]

| ID | Method | Signature | v5 | v6 | Cost |
|----|--------|-----------|----|----|------|
| 1 | groupGenerator | Global → GroupElement | ✓ | ✓ | 10 |
| 2 | xor | (Coll[Byte], Coll[Byte]) → Coll[Byte] | ✓ | ✓ | PerItem |
| 3 | serialize[T] | T → Coll[Byte] | - | ✓ | dynamic |
| 4 | fromBigEndianBytes[T] | Coll[Byte] → T | - | ✓ | 10 |
| 5 | encodeNBits | BigInt → Long | - | ✓ | 20 |
| 6 | decodeNBits | Long → BigInt | - | ✓ | 20 |
| 7 | powHit | (Int, ...) → BigInt | - | ✓ | 500 |

SCollection [14]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 1 | size | Coll[T] → Int | 14 |
| 2 | apply | (Coll[T], Int) → T | 14 |
| 3 | getOrElse | (Coll[T], Int, T) → T | dynamic |
| 4 | map[R] | (Coll[T], T → R) → Coll[R] | PerItem(20,1,10) |
| 5 | exists | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 6 | fold[R] | (Coll[T], R, (R,T) → R) → R | PerItem(20,1,10) |
| 7 | forall | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 8 | slice | (Coll[T], Int, Int) → Coll[T] | PerItem(10,2,100) |
| 9 | filter | (Coll[T], T → Bool) → Coll[T] | PerItem(20,5,10) |
| 10 | append | (Coll[T], Coll[T]) → Coll[T] | PerItem(20,2,100) |
| 14 | indices | Coll[T] → Coll[Int] | PerItem(20,2,128) |
| 15 | flatMap[R] | (Coll[T], T → Coll[R]) → Coll[R] | PerItem(20,5,10) |
| 19 | patch (v6) | (Coll[T], Int, Coll[T], Int) → Coll[T] | dynamic |
| 20 | updated (v6) | (Coll[T], Int, T) → Coll[T] | 20 |
| 21 | updateMany (v6) | (Coll[T], Coll[Int], Coll[T]) → Coll[T] | PerItem |
| 26 | indexOf | (Coll[T], T, Int) → Int | PerItem(20,1,10) |
| 29 | zip[U] | (Coll[T], Coll[U]) → Coll[(T,U)] | PerItem(10,1,10) |
| 30 | reverse (v6) | Coll[T] → Coll[T] | PerItem |
| 31 | startsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 32 | endsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 33 | get (v6) | (Coll[T], Int) → Option[T] | 14 |

SOption [15]

| ID | Method | Signature | Cost |
|----|--------|-----------|------|
| 2 | isDefined | Option[T] → Boolean | 10 |
| 3 | get | Option[T] → T | 10 |
| 4 | getOrElse | (Option[T], T) → T | 10 |
| 7 | map[R] | (Option[T], T → R) → Option[R] | dynamic |
| 8 | filter | (Option[T], T → Bool) → Option[T] | dynamic |

STuple

Tuples support component access by position:

const TupleMethods = struct {
    /// Access tuple component by index (1-based like Scala)
    pub fn component(comptime idx: usize) SMethod {
        return .{
            .name = "_" ++ std.fmt.comptimePrint("{}", .{idx}),
            .method_id = @intCast(idx),
            .cost_kind = .{ .fixed = .{ .cost = .{ .value = 12 } } },
        };
    }
};

// Usage: tuple._1, tuple._2, ... up to tuple._255

Previous: Appendix C | Next: Appendix E

[1] Scala: methods.scala

[3] Scala: methods.scala (SNumericTypeMethods)

[4] Scala: methods.scala (SBigIntMethods)

[5] Scala: methods.scala (SUnsignedBigIntMethods)

[7] Scala: methods.scala (SSigmaPropMethods)

[8] Rust: sbox.rs:29-92

[9] Rust: savltree.rs

[10] Rust: scontext.rs

[11] Rust: sheader.rs

[12] Rust: spreheader.rs

[13] Rust: sglobal.rs

[14] Rust: scoll.rs:22-266

[15] Rust: soption.rs

Appendix E: Serialization Format Reference


Complete reference for ErgoTree and value serialization formats.[1][2]

Integer Encoding

VLQ (Variable-Length Quantity)

VLQ Encoding
══════════════════════════════════════════════════════════════════

Byte format: [C][D D D D D D D]
             |  |____________|
             |       |
             |       +-- 7 data bits
             +---------- Continuation bit (1 = more bytes follow)

Examples:
  0       → [0x00]                    (1 byte)
  127     → [0x7F]                    (1 byte)
  128     → [0x80, 0x01]              (2 bytes: 10000000 00000001)
  16383   → [0xFF, 0x7F]              (2 bytes)
  16384   → [0x80, 0x80, 0x01]        (3 bytes)

const VlqEncoder = struct {
    /// Encode unsigned integer as VLQ
    pub fn encodeU64(value: u64, writer: anytype) !void {
        var v = value;
        while (v >= 0x80) {
            try writer.writeByte(@as(u8, @truncate(v)) | 0x80);
            v >>= 7;
        }
        try writer.writeByte(@as(u8, @truncate(v)));
    }

    /// Decode VLQ to unsigned integer
    pub fn decodeU64(reader: anytype) !u64 {
        var result: u64 = 0;
        var shift: u32 = 0;
        while (true) {
            const byte = try reader.readByte();
            result |= @as(u64, byte & 0x7F) << @intCast(shift);
            if (byte & 0x80 == 0) break;
            shift += 7;
            // A u64 spans at most 10 VLQ bytes (shifts 0..63); a u6
            // counter would itself overflow here, hence the wider type.
            if (shift > 63) return error.VlqOverflow;
        }
        return result;
    }
};
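For cross-checking the byte examples above, here is the same codec in Rust, written over byte buffers rather than reader/writer interfaces purely for testability:

```rust
// Unsigned VLQ: 7 data bits per byte, high bit set = more bytes follow.
fn encode_u64(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8) | 0x80); // low 7 bits + continuation bit
        v >>= 7;
    }
    out.push(v as u8);
}

fn decode_u64(bytes: &[u8]) -> Option<u64> {
    let mut result: u64 = 0;
    let mut shift = 0u32;
    for &b in bytes {
        result |= u64::from(b & 0x7F) << shift;
        if b & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift > 63 {
            return None; // more than 10 bytes cannot fit in u64
        }
    }
    None // input ended with the continuation bit still set
}

fn main() {
    let mut buf = Vec::new();
    encode_u64(128, &mut buf);
    assert_eq!(buf, [0x80, 0x01]); // matches the table above
    assert_eq!(decode_u64(&buf), Some(128));
    buf.clear();
    encode_u64(16_383, &mut buf);
    assert_eq!(buf, [0xFF, 0x7F]);
}
```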

ZigZag Encoding

const ZigZag = struct {
    /// Encode signed → unsigned (small negatives stay small)
    pub fn encode32(n: i32) u32 {
        return @bitCast((n << 1) ^ (n >> 31));
    }

    pub fn encode64(n: i64) u64 {
        return @bitCast((n << 1) ^ (n >> 63));
    }

    /// Decode unsigned → signed
    pub fn decode32(n: u32) i32 {
        return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
    }

    pub fn decode64(n: u64) i64 {
        return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
    }
};

// Mapping: 0 → 0, -1 → 1, 1 → 2, -2 → 3, 2 → 4, ...
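The mapping above can be verified with a Rust version of the same transform:

```rust
// ZigZag: interleave negatives with non-negatives so small magnitudes
// encode to small unsigned values (0→0, -1→1, 1→2, -2→3, ...).
fn zigzag_encode64(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64 // arithmetic shift spreads the sign bit
}

fn zigzag_decode64(n: u64) -> i64 {
    ((n >> 1) as i64) ^ -((n & 1) as i64)
}

fn main() {
    let pairs = [(0i64, 0u64), (-1, 1), (1, 2), (-2, 3), (2, 4)];
    for (signed, unsigned) in pairs {
        assert_eq!(zigzag_encode64(signed), unsigned);
        assert_eq!(zigzag_decode64(unsigned), signed);
    }
}
```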

Type Serialization [3]

Primitive Type Codes

| Type | Dec | Hex | Zig |
|------|-----|-----|-----|
| SBoolean | 1 | 0x01 | .boolean |
| SByte | 2 | 0x02 | .byte |
| SShort | 3 | 0x03 | .short |
| SInt | 4 | 0x04 | .int |
| SLong | 5 | 0x05 | .long |
| SBigInt | 6 | 0x06 | .big_int |
| SGroupElement | 7 | 0x07 | .group_element |
| SSigmaProp | 8 | 0x08 | .sigma_prop |
| SUnsignedBigInt | 9 | 0x09 | .unsigned_big_int |

Collection Types

const TypeEncoder = struct {
    const COLL_BASE: u8 = 12;      // 0x0C
    const NESTED_COLL: u8 = 24;    // 0x18
    const OPTION_BASE: u8 = 36;    // 0x24

    /// Encode collection type
    pub fn encodeColl(elem: SType) u8 {
        if (elem.isPrimitive()) {
            return COLL_BASE + elem.typeCode();
        }
        if (elem == .coll) {
            return NESTED_COLL + elem.inner().typeCode();
        }
        // Non-embeddable: write COLL_BASE then element type separately
        return COLL_BASE;
    }
};
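Worked type-code arithmetic for the embeddable cases above (constants copied from `TypeEncoder`; `Option[T]` combines with `OPTION_BASE` the same way):

```rust
// Type-code arithmetic for embeddable element types:
//   Coll[T]       = COLL_BASE   + code(T)
//   Coll[Coll[T]] = NESTED_COLL + code(T)
//   Option[T]     = OPTION_BASE + code(T)
const COLL_BASE: u8 = 12; // 0x0C
const NESTED_COLL: u8 = 24; // 0x18
const OPTION_BASE: u8 = 36; // 0x24
const SBYTE: u8 = 2;
const SINT: u8 = 4;

fn main() {
    assert_eq!(COLL_BASE + SBYTE, 14); // Coll[Byte] = 0x0E
    assert_eq!(NESTED_COLL + SBYTE, 26); // Coll[Coll[Byte]] = 0x1A
    assert_eq!(OPTION_BASE + SINT, 40); // Option[Int] = 0x28
}
```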

Non-Embeddable Types

| Type | Dec | Hex |
|------|-----|-----|
| SBox | 99 | 0x63 |
| SAvlTree | 100 | 0x64 |
| SContext | 101 | 0x65 |
| SHeader | 104 | 0x68 |
| SPreHeader | 105 | 0x69 |
| SGlobal | 106 | 0x6A |

ErgoTree Format [4]

Header Byte

ErgoTree Header
══════════════════════════════════════════════════════════════════

Bits: [R R R][C][S][V V V]
      |_____|  |  |  |___|
         |     |  |    |
         |     |  |    +-- Version (3 bits, mask 0x07)
         |     |  +------- Size flag (0x08, 1 = size bytes present)
         |     +---------- Constant segregation (0x10, 1 = segregated)
         +---------------- Reserved (bits 5-7)

Version Mapping:
  0 → ErgoTree v0 (protocol v3.x)
  1 → ErgoTree v1 (protocol v4.x)
  2 → ErgoTree v2 (protocol v5.x, JIT costing)
  3 → ErgoTree v3 (protocol v6.x)

const ErgoTreeHeader = struct {
    version: u3,
    has_size: bool,
    constant_segregation: bool,

    pub fn parse(byte: u8) ErgoTreeHeader {
        return .{
            .version = @truncate(byte & 0x07),
            .has_size = (byte & 0x08) != 0,
            .constant_segregation = (byte & 0x10) != 0,
        };
    }

    pub fn serialize(self: ErgoTreeHeader) u8 {
        var result: u8 = self.version;
        if (self.has_size) result |= 0x08;
        if (self.constant_segregation) result |= 0x10;
        return result;
    }
};
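A Rust round trip over the header byte, using the flag masks from the reference serializers (version mask 0x07, size flag 0x08, constant-segregation flag 0x10, as in sigma-rust's `ErgoTreeHeader`):

```rust
// Header byte: version in bits 0-2, size flag 0x08, segregation flag 0x10.
fn parse(byte: u8) -> (u8, bool, bool) {
    (byte & 0x07, byte & 0x08 != 0, byte & 0x10 != 0)
}

fn serialize(version: u8, has_size: bool, const_seg: bool) -> u8 {
    let mut b = version & 0x07;
    if has_size {
        b |= 0x08;
    }
    if const_seg {
        b |= 0x10;
    }
    b
}

fn main() {
    // 0x1A = version 2 (v5.0 JIT) with size bytes and segregated constants
    assert_eq!(parse(0x1A), (2, true, true));
    assert_eq!(serialize(2, true, true), 0x1A);
    // Plain version-0 tree: header byte 0x00
    assert_eq!(parse(0x00), (0, false, false));
}
```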

Complete Structure

ErgoTree Wire Format
══════════════════════════════════════════════════════════════════

┌─────────┬──────────┬──────────────┬─────────────┬──────────────┐
│ Header  │   Size   │  Constants   │ Complexity  │    Root      │
│ 1 byte  │   VLQ    │    Array     │    VLQ      │ Expression   │
│         │(optional)│  (if C=1)    │ (optional)  │              │
└─────────┴──────────┴──────────────┴─────────────┴──────────────┘

With constant segregation (C=1):
┌─────────┬──────────┬───────────┬──────────────────────────────┐
│ Header  │ # consts │ Constants │   Root (with placeholders)   │
│         │   VLQ    │  [type +  │                              │
│         │          │  value]*  │                              │
└─────────┴──────────┴───────────┴──────────────────────────────┘

Value Serialization [5]

Primitive Values

const DataSerializer = struct {
    pub fn serialize(value: Value, writer: anytype) !void {
        switch (value) {
            .boolean => |b| try writer.writeByte(if (b) 0x01 else 0x00),
            .byte => |b| try writer.writeByte(@bitCast(b)),
            .short => |s| try VlqEncoder.encodeI16(s, writer),
            .int => |i| try VlqEncoder.encodeI32(i, writer),
            .long => |l| try VlqEncoder.encodeI64(l, writer),
            .big_int => |bi| try serializeBigInt(bi, writer),
            .group_element => |ge| try ge.serializeCompressed(writer),
            .sigma_prop => |sp| try serializeSigmaProp(sp, writer),
            .coll => |c| try serializeColl(c, writer),
            // ...
        }
    }

    fn serializeBigInt(bi: BigInt256, writer: anytype) !void {
        const bytes = bi.toBytesBigEndian();
        // Drop redundant leading zero bytes, but keep one when the next
        // byte has its high bit set: that zero byte encodes the
        // non-negative sign in the two's-complement representation.
        var start: usize = 0;
        while (start < bytes.len - 1 and
            bytes[start] == 0 and
            bytes[start + 1] & 0x80 == 0) : (start += 1)
        {}
        try writer.writeByte(@intCast(bytes.len - start));
        try writer.writeAll(bytes[start..]);
    }
};

GroupElement (SEC1 Compressed)

GroupElement Encoding (33 bytes)
══════════════════════════════════════════════════════════════════

┌────────────┬─────────────────────────────────────────────────────┐
│   Prefix   │                 X Coordinate                        │
│  (1 byte)  │                  (32 bytes)                         │
├────────────┼─────────────────────────────────────────────────────┤
│ 0x02 = Y   │                                                     │
│    even    │              Big-endian X value                     │
│ 0x03 = Y   │                                                     │
│    odd     │                                                     │
└────────────┴─────────────────────────────────────────────────────┘
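A minimal sketch of producing the 33-byte compressed form from an X coordinate and the parity of Y. The curve arithmetic itself is out of scope; `y_lsb` is a hypothetical input standing in for the least-significant bit of the point's Y coordinate.

```rust
// SEC1 compressed point: prefix 0x02 for even Y, 0x03 for odd Y,
// followed by the 32-byte big-endian X coordinate.
fn compressed_prefix(y_is_odd: bool) -> u8 {
    if y_is_odd { 0x03 } else { 0x02 }
}

fn compress(x: &[u8; 32], y_lsb: u8) -> [u8; 33] {
    let mut out = [0u8; 33];
    out[0] = compressed_prefix(y_lsb & 1 == 1);
    out[1..].copy_from_slice(x);
    out
}

fn main() {
    let x = [0x11u8; 32]; // placeholder coordinate, not a real curve point
    assert_eq!(compress(&x, 0)[0], 0x02); // even Y
    assert_eq!(compress(&x, 1)[0], 0x03); // odd Y
    assert_eq!(compress(&x, 0).len(), 33);
}
```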

SigmaProp

const SigmaPropSerializer = struct {
    const PROVE_DLOG: u8 = 0xCD;
    const PROVE_DHT: u8 = 0xCE;
    const THRESHOLD: u8 = 0x98;
    const AND: u8 = 0x96;
    const OR: u8 = 0x97;

    pub fn serialize(sp: SigmaBoolean, writer: anytype) !void {
        switch (sp) {
            .prove_dlog => |pk| {
                try writer.writeByte(PROVE_DLOG);
                try pk.serializeCompressed(writer);
            },
            .prove_dht => |dht| {
                try writer.writeByte(PROVE_DHT);
                try dht.g.serializeCompressed(writer);
                try dht.h.serializeCompressed(writer);
                try dht.u.serializeCompressed(writer);
                try dht.v.serializeCompressed(writer);
            },
            .and_conj => |children| {
                try writer.writeByte(AND);
                try VlqEncoder.encodeU64(children.len, writer);
                for (children) |child| try serialize(child, writer);
            },
            .or_conj => |children| {
                try writer.writeByte(OR);
                try VlqEncoder.encodeU64(children.len, writer);
                for (children) |child| try serialize(child, writer);
            },
            .threshold => |t| {
                try writer.writeByte(THRESHOLD);
                try VlqEncoder.encodeU64(t.k, writer);
                try VlqEncoder.encodeU64(t.children.len, writer);
                for (t.children) |child| try serialize(child, writer);
            },
        }
    }
};

Collections

const CollSerializer = struct {
    pub fn serialize(coll: Collection, writer: anytype) !void {
        try VlqEncoder.encodeU64(coll.len, writer);
        // Element type already encoded in type header
        for (coll.items) |item| {
            try DataSerializer.serialize(item, writer);
        }
    }

    /// Optimized boolean collection (bit-packed)
    pub fn serializeBoolColl(bools: []const bool, writer: anytype) !void {
        try VlqEncoder.encodeU64(bools.len, writer);
        var byte: u8 = 0;
        var bit: u3 = 0;
        for (bools) |b| {
            if (b) byte |= @as(u8, 1) << bit;
            bit +%= 1;
            if (bit == 0) {
                try writer.writeByte(byte);
                byte = 0;
            }
        }
        if (bools.len % 8 != 0) try writer.writeByte(byte);
    }
};
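The bit-packing loop above (LSB-first within each byte, length prefixed separately) can be cross-checked in Rust:

```rust
// Bit-pack booleans LSB-first within each byte, matching the Zig
// sketch above; the VLQ length prefix is written separately.
fn pack_bools(bools: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; bools.len().div_ceil(8)];
    for (i, &b) in bools.iter().enumerate() {
        if b {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

fn main() {
    // true, false, true -> 0b0000_0101
    assert_eq!(pack_bools(&[true, false, true]), [0x05]);
    // Nine flags spill into a second byte
    assert_eq!(pack_bools(&[true; 9]), [0xFF, 0x01]);
    assert!(pack_bools(&[]).is_empty());
}
```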

Expression Serialization [6]

General Pattern

const ExprSerializer = struct {
    pub fn serialize(expr: Expr, writer: anytype) !void {
        // Write opcode
        try writer.writeByte(@intFromEnum(expr.opCode()));

        // Write opcode-specific data
        switch (expr) {
            .val_use => |vu| try VlqEncoder.encodeU32(vu.id, writer),
            .constant_placeholder => |cp| {
                try VlqEncoder.encodeU32(cp.index, writer);
                try TypeEncoder.serialize(cp.tpe, writer);
            },
            .bin_op => |bo| {
                try serialize(bo.left.*, writer);
                try serialize(bo.right.*, writer);
            },
            .method_call => |mc| {
                try writer.writeByte(mc.type_code);
                try writer.writeByte(mc.method_id);
                try serialize(mc.receiver.*, writer);
                try VlqEncoder.encodeU64(mc.args.len, writer);
                for (mc.args) |arg| try serialize(arg.*, writer);
            },
            // ...
        }
    }
};

Block Expressions

Block Value Structure
══════════════════════════════════════════════════════════════════

BlockValue:
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD8   │  count   │   ValDef items      │   Result expr         │
│        │   VLQ    │                     │                       │
└────────┴──────────┴─────────────────────┴───────────────────────┘

ValDef:
┌────────┬────────┬────────────┬───────────────────────────────────┐
│ 0xD6   │   ID   │   Type     │        RHS Expression             │
│        │  VLQ   │ (optional) │                                   │
└────────┴────────┴────────────┴───────────────────────────────────┘

FuncValue (Lambda):
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD9   │ arg cnt  │  Args (ID + type)   │   Body expr           │
│        │   VLQ    │                     │                       │
└────────┴──────────┴─────────────────────┴───────────────────────┘

Size Limits [7]

| Limit | Value | Description |
|-------|-------|-------------|
| Max ErgoTree size | 4 KB | Serialized bytes |
| Max box size | 4 KB | Total serialized |
| Max constants | 255 | Per ErgoTree |
| Max registers | 10 | R0-R9 |
| Max tokens/box | 255 | Token types |
| Max BigInt bytes | 32 | 256 bits |

Deserialization

const SigmaByteReader = struct {
    reader: std.io.AnyReader,
    constant_store: []const Constant,
    version: ErgoTreeVersion,

    pub fn readVlqU64(self: *SigmaByteReader) !u64 {
        return VlqEncoder.decodeU64(self.reader);
    }

    pub fn readType(self: *SigmaByteReader) !SType {
        const code = try self.reader.readByte();
        return TypeEncoder.decode(code, self);
    }

    pub fn readExpr(self: *SigmaByteReader) !Expr {
        const opcode = try self.reader.readByte();
        if (opcode <= 0x70) {
            // Constant (type code in data region)
            return try self.readConstantWithType(opcode);
        }
        return try ExprSerializer.deserialize(@enumFromInt(opcode), self);
    }
};

Previous: Appendix D | Next: Appendix F

[1] Scala: serialization/

[3] Rust: types.rs

[4] Rust: ergo_tree.rs (header parsing)

[5] Rust: data.rs

[6] Rust: expr.rs

[7] Scala: serialization.tex (size limits)

Appendix F: Version History


Version history of ErgoScript and the SigmaState interpreter.[1][2]

Protocol Versions Overview

| Block Version | Activated Version | ErgoTree Version | Name | Release |
|---------------|-------------------|------------------|------|---------|
| 1 | 0 | 0 | Initial | Mainnet launch |
| 2 | 1 | 1 | v4.0 | 2020 |
| 3 | 2 | 2 | v5.0 (JIT) | 2022 |
| 4 | 3 | 3 | v6.0 | 2024/2025 |

Version Context

const VersionContext = struct {
    activated_version: u8,  // Protocol version on network
    ergo_tree_version: u8,  // Version of currently executing script

    pub const MAX_SUPPORTED_SCRIPT_VERSION: u8 = 3; // Supports 0, 1, 2, 3
    pub const JIT_ACTIVATION_VERSION: u8 = 2;       // v5.0 JIT activation
    pub const V6_SOFT_FORK_VERSION: u8 = 3;         // v6.0 soft-fork

    pub fn isJitActivated(self: VersionContext) bool {
        return self.activated_version >= JIT_ACTIVATION_VERSION;
    }

    pub fn isV6Activated(self: VersionContext) bool {
        return self.activated_version >= V6_SOFT_FORK_VERSION;
    }
};
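A usage sketch of the version gates in Rust, mirroring the struct above:

```rust
// Version gating sketch: activation thresholds copied from the
// VersionContext constants above.
struct VersionContext {
    activated_version: u8,
}

const JIT_ACTIVATION_VERSION: u8 = 2; // v5.0 JIT costing
const V6_SOFT_FORK_VERSION: u8 = 3; // v6.0 soft-fork

impl VersionContext {
    fn is_jit_activated(&self) -> bool {
        self.activated_version >= JIT_ACTIVATION_VERSION
    }
    fn is_v6_activated(&self) -> bool {
        self.activated_version >= V6_SOFT_FORK_VERSION
    }
}

fn main() {
    let v5 = VersionContext { activated_version: 2 };
    assert!(v5.is_jit_activated() && !v5.is_v6_activated());
    let v6 = VersionContext { activated_version: 3 };
    assert!(v6.is_jit_activated() && v6.is_v6_activated());
}
```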

Version 1 (Initial - v3.x)

ErgoTree Version: 0

Features:

  • Core ErgoScript language
  • Basic types: Boolean, Byte, Short, Int, Long, BigInt, GroupElement, SigmaProp
  • Collection operations: map, filter, fold, exists, forall
  • Sigma protocols: ProveDlog, ProveDHTuple, AND, OR, THRESHOLD
  • Box operations: value, propositionBytes, id, registers R0-R9
  • Context access: INPUTS, OUTPUTS, HEIGHT, SELF

Limitations:

  • AOT (Ahead-of-Time) interpreter only
  • Fixed cost model
  • No constant segregation required

Version 2 (v4.0)

ErgoTree Version: 1 Block Version: 2

New Features:

  • Mandatory constant segregation flag
  • Improved script validation
  • Enhanced soft-fork mechanism
  • Size flag in ErgoTree header

Changes:

  • ErgoTree header now requires size bytes when flag is set
  • Better error handling for malformed scripts

Version 3 (v5.0 - JIT)

ErgoTree Version: 2 Block Version: 3 Activated Version: 2

This was the major interpreter upgrade replacing AOT with JIT costing.

Major Changes

New Interpreter Architecture:

  • JIT (Just-In-Time) costing model
  • Data-driven evaluation via eval() methods
  • Precise cost tracking per operation
  • Profiler support for cost measurement

New Cost Model:

  • FixedCost for constant-time operations
  • PerItemCost for collection operations
  • TypeBasedCost for type-dependent costs
  • DynamicCost for complex operations

Costing Changes:

AOT: Fixed costs estimated at compile time
JIT: Actual costs computed during execution

New Operations:

  • Context.dataInputs - access data inputs
  • Context.headers - access last 10 block headers
  • Context.preHeader - access current block pre-header
  • Header type with full block header access
  • PreHeader type

Soft-Fork Infrastructure:

  • ValidationRules framework
  • Configurable rule status (enabled, disabled, replaced)
  • trySoftForkable pattern for graceful degradation

AOT to JIT Transition

The transition activated via soft-fork at a specific block height. Scripts created before JIT activation continue to validate; once JIT is active, all scripts are costed with the more accurate JIT model.

Version 4 (v6.0 - Evolution)

ErgoTree Version: 3 Block Version: 4 Activated Version: 3

This soft-fork adds significant new functionality.

New Types

SUnsignedBigInt (Type code 9):

  • 256-bit unsigned integers
  • Modular arithmetic operations
  • Conversion between signed/unsigned

New Methods

Numeric Types (Byte, Short, Int, Long, BigInt):

  • toBytes: Convert to byte array
  • toBits: Convert to boolean array
  • bitwiseInverse: Bitwise NOT
  • bitwiseOr, bitwiseAnd, bitwiseXor: Bitwise operations
  • shiftLeft, shiftRight: Bit shifting

BigInt:

  • toUnsigned: Convert to UnsignedBigInt
  • toUnsignedMod: Modular conversion

UnsignedBigInt:

  • modInverse: Modular multiplicative inverse
  • plusMod, subtractMod, multiplyMod: Modular arithmetic
  • mod: Modulo operation
  • toSigned: Convert to signed BigInt

GroupElement:

  • expUnsigned: Scalar multiplication with unsigned exponent

Header:

  • checkPow: Verify Proof-of-Work solution

Collection:

  • patch: Replace range with another collection
  • updated: Update single element
  • updateMany: Batch update elements
  • indexOf: Find element index
  • zip: Pair with another collection
  • reverse: Reverse order
  • startsWith, endsWith: Prefix/suffix checks
  • get: Safe element access returning Option

Global:

  • serialize: Serialize any value to bytes
  • fromBigEndianBytes: Decode big-endian bytes
  • encodeNBits, decodeNBits: Difficulty encoding
  • powHit: Autolykos2 PoW verification

Version Checks

fn evaluateWithVersion(ctx: *VersionContext, expr: *const Expr) !Value {
    if (ctx.isV6Activated()) {
        // Use v6 methods and features
        return try evalV6(expr);
    } else if (ctx.isJitActivated()) {
        // Use JIT costing
        return try evalJit(expr);
    } else {
        // Legacy AOT path
        return try evalAot(expr);
    }
}

Backward Compatibility

Script Compatibility

All scripts created for earlier versions continue to work:

  1. Version 0 scripts: Execute with v0 semantics
  2. Version 1 scripts: Execute with v1 semantics
  3. Version 2 scripts: Execute with JIT costing
  4. Version 3 scripts: Full v6 features available

Method Resolution by Version

fn getMethods(ctx: *const VersionContext, type_code: u8) []const SMethod {
    const container = getTypeCompanion(type_code);
    if (ctx.isV6Activated()) {
        return container.all_methods;  // All methods including v6
    }
    return container.v5_methods;  // Pre-v6 methods only
}

Soft-Fork Safety

Unknown opcodes and methods in future versions are handled gracefully:

fn checkOpCode(opcode: u8, ctx: *const VersionContext) ValidationResult {
    if (isKnownOpcode(opcode)) return .validated;
    if (ctx.isSoftForkable(opcode)) return .soft_forkable;
    return .invalid;
}

Migration Guide

For Script Authors

v5 → v6:

  • Use UnsignedBigInt for modular arithmetic (more efficient)
  • Use new collection methods (reverse, zip, etc.)
  • Use Header.checkPow for PoW verification
  • Use Global.serialize for value encoding

For Node Operators

Upgrading to v6:

  1. Update node software before activation height
  2. No action needed for existing scripts
  3. New features available after soft-fork activation

Feature Matrix

| Feature | v3.x | v4.0 | v5.0 | v6.0 |
|---------|------|------|------|------|
| Basic types | ✓ | ✓ | ✓ | ✓ |
| Sigma protocols | ✓ | ✓ | ✓ | ✓ |
| JIT costing | - | - | ✓ | ✓ |
| Data inputs | - | - | ✓ | ✓ |
| Headers access | - | - | ✓ | ✓ |
| UnsignedBigInt | - | - | - | ✓ |
| Bitwise ops | - | - | - | ✓ |
| Collection updates | - | - | - | ✓ |
| PoW verification | - | - | - | ✓ |
| Serialization | - | - | - | ✓ |

Test Coverage

Version-specific behavior is tested in:

  • LanguageSpecificationV5.scala (~9,690 lines)
  • LanguageSpecificationV6.scala (~3,081 lines)

These tests verify:

  • All operations produce expected results
  • Cost calculations are accurate
  • Version-gated features work correctly
  • Backward compatibility is maintained

Previous: Appendix E | Back to Book

[2] Rust: ergo_tree.rs (ErgoTreeVersion)

Glossary


A

AOT (Ahead-Of-Time): Costing model where script costs are calculated before execution. Used in ErgoTree versions 0-1.

AVL Tree: A self-balancing binary search tree used for authenticated dictionaries in Ergo.

B

BigInt: 256-bit signed integer type in ErgoTree.

Box: The fundamental UTXO unit in Ergo, containing value, ErgoTree script, tokens, and registers.

C

Constant Segregation: Optimization where constants are extracted from ErgoTree expressions and stored in a separate array. Enables efficient script substitution without re-serializing the expression tree.

Context: Execution environment containing blockchain state (HEIGHT, headers), transaction data (INPUTS, OUTPUTS, dataInputs), and current input information (SELF).

Cost Accumulator: Runtime tracker that sums operation costs and enforces the script cost limit.

D

Data Input: Read-only box reference in a transaction. Provides data without being spent.

DHT (Diffie-Hellman Tuple): Four-element sigma protocol proving knowledge of secret x where u = g^x and v = h^x.

DLog (Discrete Logarithm): Sigma protocol proving knowledge of discrete logarithm. Given generator g and public key h = g^x, proves knowledge of x.

E

ErgoScript: High-level smart contract language with Scala-like syntax.

ErgoTree: Serialized bytecode representation of smart contracts.

F

Fiat-Shamir Transformation: Technique to convert interactive proofs into non-interactive proofs.

G

GroupElement: An elliptic curve point on secp256k1.

H

Header: The first byte(s) of ErgoTree that specify version and format flags.

I

Interpreter: Component that evaluates ErgoTree expressions against a context to produce a SigmaBoolean result.

J

JIT (Just-In-Time): Costing model where costs are calculated during execution. Used in ErgoTree version 2+.

O

OpCode: Single-byte identifier for expression nodes in serialized ErgoTree. Values 0x01-0x70 encode constants; 0x71+ encode operations.

P

Prover: Component that generates cryptographic proofs for spending conditions.

Proposition: A statement that can be proven true or false.

S

Secp256k1: The elliptic curve used in Ergo (same as Bitcoin).

SigmaBoolean: A tree of cryptographic propositions (AND, OR, threshold, DLog, DHT).

SigmaProp: Type representing sigma-protocol propositions.

Sigma Protocol: Zero-knowledge proof system with three-move structure.

T

Type Code: Unique byte identifier for each type in ErgoTree serialization.

U

UTXO: Unspent Transaction Output model used by Ergo.

UnsignedBigInt: 256-bit unsigned integer type (added in v6).

V

Verifier: Component that verifies cryptographic proofs.

VLQ: Variable-Length Quantity encoding for unsigned integers. Uses 7 data bits per byte with continuation bit.

Z

ZigZag Encoding: Maps signed integers to unsigned: 0→0, -1→1, 1→2, -2→3, etc. Keeps small negatives small for efficient VLQ encoding.


Back to Contents

Bibliography


Primary Sources

  1. sigmastate-interpreter Repository

    • URL: https://github.com/ScorexFoundation/sigmastate-interpreter
    • Reference Scala implementation of the SigmaState interpreter
    • Key packages: sigma.ast, sigma.serialization, sigma.eval, sigma.crypto
  2. sigma-rust Repository

    • URL: https://github.com/ergoplatform/sigma-rust
    • Rust implementation of ErgoTree IR and interpreter
    • Key crates: ergotree-ir, ergotree-interpreter, ergo-lib
  3. Ergo Node Repository

    • URL: https://github.com/ergoplatform/ergo
    • Full node implementation in Scala

Specifications

  1. ErgoTree Specification (spec.pdf)

    • Location: sigmastate-interpreter/docs/spec/spec.pdf
    • Formal specification of ErgoTree format and semantics
  2. ErgoScript Language Specification (LangSpec.md)

    • Location: sigmastate-interpreter/docs/LangSpec.md
    • Informal language specification
  3. Sigma Protocols Paper (sigma.pdf)

    • Location: sigmastate-interpreter/docs/wpaper/sigma.pdf
    • Formal specification of Sigma protocols

Academic Papers

  1. Sigmastate Protocols

    • Location: sigmastate-interpreter/docs/sigmastate_protocols/sigmastate_protocols.pdf
    • Detailed protocol descriptions
  2. Ergo Whitepaper

    • Platform overview and design rationale
  3. Ergo Yellow Paper

    • Technical specification

External References

  1. Schnorr Identification Protocol

    • Schnorr, C.P. (1991). Efficient signature generation by smart cards
  2. Fiat-Shamir Heuristic

    • Fiat, A., & Shamir, A. (1986). How to prove yourself
  3. secp256k1 Curve

    • Standards for Efficient Cryptography (SEC 2)
  4. BLAKE2 Hash Function

    • https://www.blake2.net/

Back to Contents