The Sigma Book
KYA: Know Your Assumptions
This is a PRE-ALPHA version of The Sigma Book.
Before using this material, understand these critical assumptions:
Not Authoritative: This book is NOT an official specification. It is a research and educational resource derived from studying the source code.
May Contain Errors: Content has not been formally verified. Implementations based solely on this book may be incorrect.
Subject to Change: As a pre-alpha work, chapters may be incomplete, reorganized, or substantially rewritten.
Source of Truth: For authoritative information, always consult:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
- ErgoTree Specification (spec.pdf)
Verification Required: Cross-reference all claims against the actual source code before relying on them.
Use this book to learn and explore, but verify everything against the source.
Complete Technical Reference for SigmaState Interpreter
Welcome to The Sigma Book, a comprehensive technical reference covering the SigmaState interpreter, ErgoTrees, and the Sigma language. This book is written for engineers who need deep understanding of the implementation details, algorithms, and data structures behind the Ergo blockchain's smart contract system.
Code examples use idiomatic Zig 0.13+ with data-oriented design patterns, making algorithms explicit and accessible to implementers in any language.
What This Book Covers
This book provides complete documentation of:
- Specifications: Formal and informal specifications of the Sigma language, type system, and ErgoTree format
- Implementation Details: Internal algorithms and data structures from both the reference Scala implementation (sigmastate-interpreter) and Rust implementation (sigma-rust)
- Node Integration: How the Ergo node uses the interpreter for transaction validation
- Practical APIs: SDK and high-level interfaces for building applications
How to Read This Book
Prerequisites Approach
Every chapter includes an explicit Prerequisites section that lists:
- Required knowledge assumptions
- Related concepts you should understand
- Links to earlier chapters covering dependencies
This allows you to:
- Jump directly to topics of interest if you have the background
- Trace backward to fill gaps in your understanding
- Use the book as a reference rather than reading linearly
Code Examples
Code examples use Zig 0.13+ to illustrate algorithms with explicit memory management and data-oriented patterns. While not directly runnable against the Scala or Rust implementations, they demonstrate the core logic clearly.
Exercises
Each chapter concludes with exercises at three levels:
- Conceptual: Test your understanding of the material
- Implementation: Write code applying the concepts
- Analysis: Read and analyze real source code
Source Material
This book is derived from:
- sigmastate-interpreter: Reference Scala implementation (ScorexFoundation/sigmastate-interpreter)
- sigma-rust: Rust implementation (ergoplatform/sigma-rust)
- Ergo node: Full node implementation showing integration
- Formal specifications: LaTeX documents in docs/spec/
- Test suites: Language specification tests defining expected behavior
Citations use footnotes referencing both Scala and Rust source locations.
Book Structure
| Part | Focus | Depth |
|---|---|---|
| I. Foundations | Core concepts and type system | Overview |
| II. AST | Expression node catalog | Reference |
| III. Serialization | Binary format | Detailed |
| IV. Cryptography | Zero-knowledge proofs | Deep |
| V. Interpreter | Evaluation engine | Deep |
| VI. Compiler | ErgoScript compilation | Deep |
| VII. Data Structures | Collections, AVL trees, boxes | Detailed |
| VIII. Node Integration | Transaction validation | Practical |
| IX. SDK | Developer APIs | Practical |
| X. Advanced | Soft-forks, cross-platform | Specialized |
Conventions Used
// Code blocks use Zig to illustrate algorithms
const ErgoTree = struct {
header: Header,
constants: []const Constant,
root: *const Expr,
};
Note: Highlighted notes provide important context or warnings.
Footnotes: [^1]: Scala: path/to/file.scala:123 and [^2]: Rust: path/to/file.rs:456 reference source locations in both implementations.
Version Information
This book documents:
- sigmastate-interpreter: Version 6.x (with notes on v5 differences)
- sigma-rust: ergotree-ir and ergotree-interpreter crates
- Protocol versions: v0 (initial), v1 (v4.0), v2 (v5.0 JIT), v3 (v6.0)
Contributing
This book is maintained as part of the ErgoTree research project. Corrections and improvements are welcome.
Let's begin with Chapter 1: Introduction to Sigma and ErgoTree.
Chapter 1: Introduction to Sigma and ErgoTree
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic blockchain concepts (transactions, blocks, consensus)
- Understanding of the UTXO model (unspent transaction outputs)
- Familiarity with any systems programming language (C, Rust, Go, or similar)
- Public key cryptography fundamentals (key pairs, digital signatures, hash functions)
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why Sigma protocols offer advantages over traditional blockchain scripting
- Describe the relationship between ErgoScript, ErgoTree, and SigmaBoolean
- Understand the UTXO model and how scripts guard spending conditions
- Differentiate the roles of prover and verifier in transaction validation
- Identify the core components of the Sigma interpreter architecture
What is Sigma?
Traditional blockchain scripting languages like Bitcoin Script offer limited expressiveness: they support hash preimages, signature checks, and timelocks, but little else. Ethereum's EVM provides Turing completeness but at the cost of complexity, high gas fees, and limited privacy guarantees.
Sigma (Σ) protocols occupy a middle ground. They are cryptographic proof systems that enable zero-knowledge proofs of knowledge—proving you know a secret without revealing it[^1]. The name comes from the Greek letter Σ and reflects their characteristic three-move structure:
- Commitment: The prover sends a randomized commitment value
- Challenge: The verifier sends a random challenge
- Response: The prover sends a response that proves knowledge without revealing secrets
What makes Sigma protocols powerful for blockchains is their composability: you can combine them with AND, OR, and threshold operations to build complex spending conditions while preserving zero-knowledge properties.
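To make the three moves concrete, here is a toy Schnorr-style identification run in Rust over a small multiplicative group (modulus 23, subgroup of order 11). This is an illustrative sketch only: Ergo's Sigma protocols work over secp256k1 and derive the challenge from a hash (Fiat-Shamir), and all numbers below are invented for readability.

```rust
// Toy Schnorr identification: prover knows w such that h = g^w.
// Group: subgroup of order q = 11 inside Z_23*, generated by g = 2.

fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

fn main() {
    let (p, q, g) = (23u64, 11u64, 2u64); // modulus, subgroup order, generator
    let w = 7; // prover's secret
    let h = pow_mod(g, w, p); // public key h = g^w mod p

    // 1. Commitment: prover picks random k, sends a = g^k
    let k = 3;
    let a = pow_mod(g, k, p);

    // 2. Challenge: verifier sends random e
    let e = 5;

    // 3. Response: prover sends z = k + e*w (mod q); z leaks nothing about w
    let z = (k + e * w) % q;

    // Verifier accepts iff g^z == a * h^e (mod p)
    let lhs = pow_mod(g, z, p);
    let rhs = a * pow_mod(h, e, p) % p;
    assert_eq!(lhs, rhs);
    println!("accepted: g^{z} = a * h^{e} (mod {p})");
}
```

Replacing the verifier's random challenge with a hash of the commitment and message (Fiat-Shamir) turns this interactive game into the non-interactive signatures used on-chain.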
The Three Layers
┌─────────────────────────────────────┐
│ ErgoScript │ High-level language
│ (Human-readable source) │
└─────────────────┬───────────────────┘
│ Compilation
▼
┌─────────────────────────────────────┐
│ ErgoTree │ Intermediate representation
│ (Typed AST / Bytecode) │ (Serialized in UTXOs)
└─────────────────┬───────────────────┘
│ Evaluation
▼
┌─────────────────────────────────────┐
│ SigmaBoolean │ Cryptographic proposition
│ (Sigma protocol tree) │ (What needs to be proven)
└─────────────────────────────────────┘
ErgoScript
High-level, statically-typed language with Scala-like syntax[^2]:
- First-class lambdas and higher-order functions
- Call-by-value evaluation
- Local type inference
- Blocks as expressions
// Zig representation of an ErgoScript contract
const Contract = struct {
freeze_deadline: i32,
pk_owner: SigmaProp,
pub fn evaluate(self: Contract, height: i32) SigmaProp {
const deadline_passed = height > self.freeze_deadline;
return SigmaProp.and(
SigmaProp.fromBool(deadline_passed),
self.pk_owner,
);
}
};
ErgoTree
Compiled bytecode representation stored on-chain[^3][^4]:
- Typed abstract syntax tree (AST)
- Serialized as bytes in UTXOs
- Deterministically interpretable
- Version-controlled for soft-fork upgrades
const ErgoTree = struct {
header: HeaderType,
constants: []const Constant,
root: union(enum) {
parsed: SigmaPropValue,
unparsed: UnparsedTree,
},
/// Header byte layout:
/// Bit 7: Multi-byte header flag
/// Bit 6: Reserved (GZIP)
/// Bit 5: Reserved (context-dependent costing)
/// Bit 4: Constant segregation flag
/// Bit 3: Size flag
/// Bits 2-0: Version (0-7)
pub const HeaderType = packed struct(u8) {
version: u3,
has_size: bool,
constant_segregation: bool,
reserved1: bool = false,
reserved_gzip: bool = false,
multi_byte: bool = false,
};
};
SigmaBoolean
After evaluation, ErgoTree reduces to a SigmaBoolean—a tree of cryptographic propositions[^5][^6]:
const SigmaBoolean = union(enum) {
prove_dlog: ProveDlog, // Knowledge of discrete log
prove_dh_tuple: ProveDhTuple, // Diffie-Hellman tuple
cand: Cand, // Logical AND
cor: Cor, // Logical OR
cthreshold: Cthreshold, // k-of-n threshold
trivial: TrivialProp, // True/False
/// Count nodes in proposition tree
pub fn size(self: SigmaBoolean) usize {
return switch (self) {
.prove_dlog, .prove_dh_tuple, .trivial => 1,
.cand => |c| 1 + totalChildrenSize(c.children),
.cor => |c| 1 + totalChildrenSize(c.children),
.cthreshold => |c| 1 + totalChildrenSize(c.children),
};
}
// NOTE: In production, use an explicit work stack instead of recursion
// to guarantee bounded stack depth. See ZIGMA_STYLE.md.
};
const ProveDlog = struct {
/// Public key (compressed EC point, 33 bytes)
h: EcPoint,
};
const ProveDhTuple = struct {
g: EcPoint, // Generator
h: EcPoint, // Point h
u: EcPoint, // g^w
v: EcPoint, // h^w
};
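The note above recommends an explicit work stack instead of recursion. A hypothetical Rust sketch of the same node count done iteratively (the types are simplified stand-ins, not sigma-rust's actual definitions):

```rust
// Counting nodes in a SigmaBoolean tree with an explicit work stack,
// so stack depth stays bounded regardless of proposition nesting.

enum SigmaBoolean {
    ProveDlog,                          // leaf: knowledge of discrete log
    ProveDhTuple,                       // leaf: Diffie-Hellman tuple
    Trivial(bool),                      // leaf: constant true/false
    Cand(Vec<SigmaBoolean>),            // logical AND
    Cor(Vec<SigmaBoolean>),             // logical OR
    Cthreshold(u32, Vec<SigmaBoolean>), // k-of-n threshold
}

fn size(root: &SigmaBoolean) -> usize {
    let mut count = 0;
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        count += 1;
        match node {
            SigmaBoolean::Cand(cs) | SigmaBoolean::Cor(cs) => stack.extend(cs.iter()),
            SigmaBoolean::Cthreshold(_, cs) => stack.extend(cs.iter()),
            _ => {} // leaves contribute only themselves
        }
    }
    count
}

fn main() {
    // (pk_a AND pk_b) OR true  =>  5 nodes total
    let prop = SigmaBoolean::Cor(vec![
        SigmaBoolean::Cand(vec![SigmaBoolean::ProveDlog, SigmaBoolean::ProveDlog]),
        SigmaBoolean::Trivial(true),
    ]);
    println!("{}", size(&prop)); // 5
}
```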
The UTXO Model
Ergo extends the UTXO (Unspent Transaction Output) model pioneered by Bitcoin. Instead of simple locking scripts, Ergo uses boxes—rich data structures that contain value, tokens, and arbitrary typed data:
┌─────────────────────────────────────────┐
│ Box │
├─────────────────────────────────────────┤
│ R0: value (i64 nanoERGs) │ ← Computed registers
│ R1: ergoTree (spending condition) │
│ R2: tokens (asset list) │
│ R3: creationInfo (height, txId, idx) │
├─────────────────────────────────────────┤
│ R4-R9: additional registers ───────────┼──► User-defined data
│ (optional, typed constants) │ (up to 6 registers)
└─────────────────────────────────────────┘
│
┌──────────────┴──────────────┐
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────┐
│ Token │ │ Register (R4-R9) │
├─────────────────────────┤ ├─────────────────────────┤
│ id: [32]u8 (token ID) │ │ value: Constant │
│ amount: i64 │ │ (any SType value) │
└─────────────────────────┘ └─────────────────────────┘
Registers R0–R3 are computed from box fields and always present. Registers R4–R9 are optional and can store any typed value—integers, byte arrays, group elements, or even nested collections.
const Box = struct {
value: i64, // nanoERGs (R0)
ergo_tree: ErgoTree, // Spending condition (R1)
tokens: []const Token, // Additional assets (R2)
creation_height: u32, // Part of creation info (R3)
tx_id: [32]u8, // Part of creation info (R3)
output_index: u16, // Part of creation info (R3)
additional_registers: [6]?Constant, // R4-R9 (user-defined, optional)
pub fn id(self: *const Box) [32]u8 {
// Blake2b256(tx_id || output_index || serialized_content)
var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
hasher.update(&self.tx_id);
hasher.update(std.mem.asBytes(&self.output_index));
// ... serialize and hash content
return hasher.finalResult();
}
};
// NOTE: R0-R3 are computed from box fields; only R4-R9 are stored explicitly.
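To make the computed-vs-stored split concrete, here is a hypothetical register accessor in Rust. The `ErgoBox` and `Value` types are simplified stand-ins (R2/R3 are omitted for brevity), not the actual sigma-rust API:

```rust
// R0-R1 are derived from box fields on demand; R4-R9 are looked up in storage.

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Long(i64),
    Bytes(Vec<u8>),
}

struct ErgoBox {
    value: i64,                               // nanoERGs
    ergo_tree_bytes: Vec<u8>,                 // serialized spending condition
    additional_registers: [Option<Value>; 6], // R4-R9, stored explicitly
}

impl ErgoBox {
    fn register(&self, index: usize) -> Option<Value> {
        match index {
            0 => Some(Value::Long(self.value)),                 // R0: computed
            1 => Some(Value::Bytes(self.ergo_tree_bytes.clone())), // R1: computed
            4..=9 => self.additional_registers[index - 4].clone(), // stored
            _ => None, // R2/R3 omitted in this sketch; R10+ do not exist
        }
    }
}

fn main() {
    let mut regs: [Option<Value>; 6] = [None, None, None, None, None, None];
    regs[0] = Some(Value::Long(42)); // stored as R4
    let example = ErgoBox {
        value: 1_000_000,
        ergo_tree_bytes: vec![0x00],
        additional_registers: regs,
    };
    assert_eq!(example.register(0), Some(Value::Long(1_000_000))); // computed R0
    assert_eq!(example.register(4), Some(Value::Long(42)));        // stored R4
    assert_eq!(example.register(5), None);                         // empty optional
}
```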
The Prover/Verifier Model
Both parties perform the same deterministic reduction: given the ErgoTree and the transaction context, each reduces the script to the same SigmaBoolean proposition. They then diverge: the prover uses its secrets to produce a non-interactive proof via the Fiat-Shamir transform, while the verifier checks that proof against the proposition without learning the secrets.
PROVER VERIFIER
┌──────────────┐ ┌──────────────┐
│ Secrets │ │ │
│ (private │ │ Context │
│ keys) │ │ │
└──────┬───────┘ └──────┬───────┘
│ │
ErgoTree ─────►│ ErgoTree ─────►│
│ │
┌──────▼───────┐ ┌──────▼───────┐
│ Reduction │ │ Reduction │
│ (same as │ │ (same as │
│ verifier) │ │ prover) │
└──────┬───────┘ └──────┬───────┘
│ │
SigmaBoolean SigmaBoolean
│ │
┌──────▼───────┐ ┌──────▼───────┐
│ Signing │───────Proof───────►│ Verify │
│ (Fiat-Shamir)│ │ Signature │
└──────────────┘ └──────┬───────┘
│
true / false
Prover
The prover holds the secret keys and works in two steps: first reduce the ErgoTree to a SigmaBoolean, then sign the transaction message by running the Sigma protocol non-interactively, deriving the challenge from a hash (Fiat-Shamir):
const Prover = struct {
secrets: []const SecretKey,
pub fn prove(
self: *const Prover,
ergo_tree: *const ErgoTree,
context: *const Context,
) !Proof {
// 1. Reduce to SigmaBoolean
const sigma_bool = try Evaluator.reduce(ergo_tree, context);
// 2. Generate proof using Fiat-Shamir
return try self.generateProof(sigma_bool, context.message);
}
};
Verifier
The verifier repeats the same reduction, tracking accumulated cost against a limit to bound execution, then checks the proof against the resulting SigmaBoolean. It accepts or rejects without learning anything about the prover's secrets:
const Verifier = struct {
cost_limit: u64,
pub fn verify(
self: *const Verifier,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: *const Proof,
) !bool {
// 1. Reduce with cost tracking
var cost: u64 = 0;
const sigma_bool = try Evaluator.reduceWithCost(
ergo_tree, context, &cost, self.cost_limit,
);
// 2. Verify signature
return try verifySignature(sigma_bool, proof, context.message);
}
};
Why Sigma Protocols?
Consider what Bitcoin Script can express: "This output can be spent if you provide a valid signature for public key X." This covers most payment scenarios but falls short for more sophisticated applications.
Sigma protocols enable a fundamentally richer set of spending conditions:
| Feature | What It Enables | Example Use Case |
|---|---|---|
| Composable ZK Proofs | AND, OR, threshold combinations of conditions | Multi-party escrow with complex release logic |
| Ring Signatures | Prove you're one of N signers without revealing which | Anonymous voting, whistleblower systems |
| Threshold Signatures | Require k-of-n parties to sign | DAO governance, cold storage recovery |
| Zero-Knowledge Privacy | Prove statements without revealing underlying data | Private auctions, confidential identity verification |
The key insight is that Sigma protocols can be composed while preserving their zero-knowledge properties. An OR composition of two Sigma proofs reveals that the prover knows one of two secrets—but not which one.
// OR composition hides actual signer
const ring_signature = SigmaBoolean{
.cor = .{
.children = &[_]SigmaBoolean{
.{ .prove_dlog = pk_alice },
.{ .prove_dlog = pk_bob },
.{ .prove_dlog = pk_carol },
},
},
};
// Proof reveals ONE signed, but not which
Repository Structure
| Module | Purpose |
|---|---|
| core | Cryptographic primitives, base types |
| data | ErgoTree, AST nodes, serialization |
| interpreter | Evaluation engine, Sigma protocols |
| parsers | ErgoScript parser |
| sc | Compiler with IR optimization |
| sdk | High-level transaction APIs |
Key Design Principles
The Sigma interpreter is built around four core principles that make it suitable for blockchain consensus:
Determinism
Every operation must produce identical results for identical inputs, regardless of platform or implementation. This means no floating-point arithmetic, no uninitialized memory, and careful handling of hash map iteration order. Without determinism, nodes would disagree on transaction validity.
Bounded Execution
Every script must complete within a predictable cost limit. The interpreter tracks three resource categories:
- Computational operations: arithmetic, comparisons, function calls
- Memory allocations: collections, tuples, intermediate values
- Cryptographic operations: EC point multiplication, signature verification
Scripts exceeding the cost limit fail validation, preventing denial-of-service attacks.
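A minimal sketch of such a cost accumulator, assuming invented per-operation costs (real implementations use calibrated cost tables per opcode):

```rust
// Bounded-execution sketch: charge cost units as evaluation proceeds,
// and fail the script the moment the budget is exhausted.

struct CostAccumulator {
    used: u64,
    limit: u64,
}

#[derive(Debug, PartialEq)]
struct CostLimitExceeded;

impl CostAccumulator {
    fn new(limit: u64) -> Self {
        Self { used: 0, limit }
    }

    /// Charge `cost` units; saturating add avoids overflow wraparound.
    fn add(&mut self, cost: u64) -> Result<(), CostLimitExceeded> {
        self.used = self.used.saturating_add(cost);
        if self.used > self.limit {
            Err(CostLimitExceeded)
        } else {
            Ok(())
        }
    }
}

fn main() {
    let mut acc = CostAccumulator::new(100); // budget: 100 invented units
    assert!(acc.add(60).is_ok());  // e.g. arithmetic operations
    assert!(acc.add(30).is_ok());  // e.g. collection allocation
    assert!(acc.add(20).is_err()); // e.g. EC operation pushes past the limit
}
```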
Soft-Fork Compatibility
ErgoTree includes version information in its header. When nodes encounter unknown opcodes (from future protocol versions), they can handle them gracefully rather than rejecting the entire block. This enables protocol upgrades without hard forks.
Cross-Platform Consistency
The specification must be implementable identically across different platforms. Reference implementations exist for:
- JVM (Scala): The original sigmastate-interpreter
- JavaScript (Scala.js): Browser and Node.js environments
- Native (Rust): sigma-rust for performance-critical applications[^7]
Summary
This chapter introduced the fundamental concepts of the Sigma protocol ecosystem:
- Sigma protocols are three-move cryptographic proofs that enable zero-knowledge proofs of knowledge, with the crucial property of composability
- ErgoScript is a high-level, statically-typed language that compiles to ErgoTree bytecode
- ErgoTree is a serialized AST stored in UTXO boxes that evaluates to SigmaBoolean propositions
- SigmaBoolean represents cryptographic conditions (discrete log proofs, Diffie-Hellman tuples) combined with AND, OR, and threshold logic
- The prover generates zero-knowledge proofs; the verifier checks them without learning secrets
- The system is designed for blockchain consensus: deterministic, bounded, soft-fork compatible, and cross-platform
In the following chapters, we'll dive deep into each layer—starting with the type system that makes ErgoTree's static guarantees possible.
Next: Chapter 2: Type System
[^1]: Sigma protocols are interactive proof systems with the special "honest-verifier zero-knowledge" property.
[^2]: Scala: LangSpec.md:57-80
[^3]: Scala: ErgoTree.scala:24-88
[^4]: Rust: tree_header.rs:10-32
[^5]: Scala: SigmaBoolean.scala:12-21
[^6]: Rust: sigma_boolean.rs:34-80
[^7]: Rust implementation: sigma-rust crate at ergotree-ir/, ergotree-interpreter/
Chapter 2: Type System
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic type system concepts (static vs dynamic typing, generic types)
- Understanding of binary serialization concepts
- Prior chapters: Chapter 1
Learning Objectives
By the end of this chapter, you will be able to:
- Identify all ErgoTree primitive types and their numeric ranges
- Understand why type codes exist and how they enable compact serialization
- Explain the "embeddable" type concept and its efficiency benefits
- Construct collection, option, tuple, and function types
- Recognize version-specific type additions (v6 and beyond)
Type System Overview
Every value in ErgoTree has a statically-known type. Unlike dynamically-typed languages where types are checked at runtime, ErgoTree's type system catches errors at compile time—before the script ever reaches the blockchain.
- Static typing: All types known at compile time, enabling early error detection
- Type inference: The compiler automatically deduces types in most cases
- Generic types: Collections and options parameterized over element types
- Type codes: Each type has a unique numeric code enabling compact binary serialization
Understanding type codes is essential because they directly affect how data is serialized on-chain. The type system is carefully designed so that common types serialize to single bytes, minimizing transaction size.
/// Base type descriptor
const SType = union(enum) {
// Primitives (embeddable, codes 1-9)
boolean,
byte,
short,
int,
long,
big_int,
group_element,
sigma_prop,
unsigned_big_int, // v6+
// Compound types
coll: *const SType,
option: *const SType,
tuple: []const SType,
func: SFunc,
// Object types (codes 99-106)
box,
avl_tree,
context,
header,
pre_header,
global,
// Special
unit,
any,
type_var: []const u8,
pub fn typeCode(self: SType) u8 {
return switch (self) {
.boolean => 1,
.byte => 2,
.short => 3,
.int => 4,
.long => 5,
.big_int => 6,
.group_element => 7,
.sigma_prop => 8,
.unsigned_big_int => 9,
.coll => 12,
.option => 36,
.tuple => 96,
.box => 99,
.avl_tree => 100,
.context => 101,
.header => 104,
.pre_header => 105,
.global => 106,
else => 0,
};
}
pub fn isEmbeddable(self: SType) bool {
return self.typeCode() >= 1 and self.typeCode() <= 9;
}
pub fn isNumeric(self: SType) bool {
return switch (self) {
.byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
else => false,
};
}
};
Type Hierarchy
SType
│
┌──────────────────────┼──────────────────────┐
│ │ │
SEmbeddable SCollection SOption
│ (elemType) (elemType)
┌────┴────┬─────────────────┐
│ │ │
SNumericType SBoolean SGroupElement
│ SSigmaProp
│
├── SByte (code 2)
├── SShort (code 3)
├── SInt (code 4)
├── SLong (code 5)
├── SBigInt (code 6)
└── SUnsignedBigInt (code 9, v6+)
Object Types (non-embeddable):
SBox(99), SAvlTree(100), SContext(101),
SHeader(104), SPreHeader(105), SGlobal(106)
Primitive Types
Numeric Types
All numeric types support conversion via upcast (widening) and downcast (narrowing, throws on overflow)[^3][^4]:
| Type | Code | Size | Range |
|---|---|---|---|
| SByte | 2 | 8-bit | -128 to 127 |
| SShort | 3 | 16-bit | -32,768 to 32,767 |
| SInt | 4 | 32-bit | ±2.1 billion |
| SLong | 5 | 64-bit | ±9.2 quintillion |
| SBigInt | 6 | 256-bit | Signed arbitrary |
| SUnsignedBigInt | 9 | 256-bit | Unsigned (v6+) |
const SNumericType = struct {
type_code: u8,
numeric_index: u8, // 0=Byte, 1=Short, 2=Int, 3=Long, 4=BigInt, 5=UBigInt
/// Ordering: Byte < Short < Int < Long < BigInt < UnsignedBigInt
pub fn canUpcastTo(self: SNumericType, target: SNumericType) bool {
return self.numeric_index <= target.numeric_index;
}
/// Downcast with overflow check
pub fn downcast(comptime T: type, value: anytype) !T {
const min = std.math.minInt(T);
const max = std.math.maxInt(T);
if (value < min or value > max) {
return error.ArithmeticOverflow;
}
return @intCast(value);
}
};
// Type instances
const SByte = SNumericType{ .type_code = 2, .numeric_index = 0 };
const SShort = SNumericType{ .type_code = 3, .numeric_index = 1 };
const SInt = SNumericType{ .type_code = 4, .numeric_index = 2 };
const SLong = SNumericType{ .type_code = 5, .numeric_index = 3 };
const SBigInt = SNumericType{ .type_code = 6, .numeric_index = 4 };
const SUnsignedBigInt = SNumericType{ .type_code = 9, .numeric_index = 5 };
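The same upcast/downcast behavior can be sketched in Rust with the standard `TryFrom` conversions; `downcast_to_short` below is a hypothetical helper, not an API from either implementation:

```rust
// Widening (upcast) always succeeds; narrowing (downcast) fails on overflow,
// mirroring the checked-downcast semantics described above.

use std::convert::TryFrom;

fn downcast_to_short(value: i64) -> Result<i16, String> {
    i16::try_from(value)
        .map_err(|_| format!("ArithmeticOverflow: {value} out of i16 range"))
}

fn main() {
    // Upcast: Byte -> Long is always safe.
    let b: i8 = 100;
    let widened: i64 = i64::from(b);
    assert_eq!(widened, 100);

    // Downcast: Long -> Short succeeds only when the value fits.
    assert_eq!(downcast_to_short(1234), Ok(1234i16));
    assert!(downcast_to_short(40_000).is_err()); // > i16::MAX (32767)
}
```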
Boolean Type
const SBoolean = struct {
pub const type_code: u8 = 1;
pub const is_embeddable = true;
};
Cryptographic Types
GroupElement — Point on secp256k1 curve (33 bytes compressed)[^5]:
const SGroupElement = struct {
pub const type_code: u8 = 7;
/// 33 bytes: 1-byte prefix (0x02/0x03) + 32-byte X coordinate
pub const SERIALIZED_SIZE = 33;
};
SigmaProp — Cryptographic proposition (required return type)[^6]:
const SSigmaProp = struct {
pub const type_code: u8 = 8;
/// Maximum serialized size
pub const MAX_SIZE_BYTES: usize = 1024;
};
Type Codes
Type code space partitioning[^7]:
| Range | Description |
|---|---|
| 1-9 | Primitive embeddable types |
| 10-11 | Reserved |
| 12-23 | Coll[T] (T primitive) |
| 24-35 | Coll[Coll[T]] |
| 36-47 | Option[T] |
| 48-59 | Option[Coll[T]] |
| 60+ | Other types |
const TypeCodes = struct {
// Primitives
pub const BOOLEAN: u8 = 1;
pub const BYTE: u8 = 2;
pub const SHORT: u8 = 3;
pub const INT: u8 = 4;
pub const LONG: u8 = 5;
pub const BIGINT: u8 = 6;
pub const GROUP_ELEMENT: u8 = 7;
pub const SIGMA_PROP: u8 = 8;
pub const UNSIGNED_BIGINT: u8 = 9;
// Type constructor bases
pub const PRIM_RANGE: u8 = 12; // MaxPrimTypeCode + 1
pub const COLL_BASE: u8 = 12;
pub const NESTED_COLL_BASE: u8 = 24;
pub const OPTION_BASE: u8 = 36;
pub const OPTION_COLL_BASE: u8 = 48;
// Object types
pub const TUPLE: u8 = 96;
pub const ANY: u8 = 97;
pub const UNIT: u8 = 98;
pub const BOX: u8 = 99;
pub const AVL_TREE: u8 = 100;
pub const CONTEXT: u8 = 101;
pub const HEADER: u8 = 104;
pub const PREHEADER: u8 = 105;
pub const GLOBAL: u8 = 106;
};
Embeddable Types
The type system's most elegant optimization is the concept of embeddable types. These nine primitive types (codes 1–9) can be "embedded" directly into type constructor codes, allowing common composite types to serialize as a single byte.
Consider Coll[Int] (a collection of integers). Without embedding, this would require two bytes: one for "Collection" and one for "Int". With embedding, it serializes as a single byte: 12 + 4 = 16. This matters because type information appears frequently in serialized ErgoTrees—every constant, every expression result has a type.
The embedding formula is simple[^8]:
/// Embed primitive type code into constructor
pub fn embedType(type_constr_base: u8, prim_type_code: u8) u8 {
return type_constr_base + prim_type_code;
}
// Examples:
// Coll[Byte] = 12 + 2 = 14
// Coll[Int] = 12 + 4 = 16
// Option[Long] = 36 + 5 = 41
// Option[Coll[Byte]] = 48 + 2 = 50
| Type | Code | Coll[T] | Option[T] |
|---|---|---|---|
| Boolean | 1 | 13 | 37 |
| Byte | 2 | 14 | 38 |
| Short | 3 | 15 | 39 |
| Int | 4 | 16 | 40 |
| Long | 5 | 17 | 41 |
| BigInt | 6 | 18 | 42 |
| GroupElement | 7 | 19 | 43 |
| SigmaProp | 8 | 20 | 44 |
| UnsignedBigInt | 9 | 21 | 45 |
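The table above can be checked mechanically. Here is a small Rust sketch of the embedding arithmetic and its inverse; the range boundaries mirror the type-code table, and `decode` is a hypothetical helper for illustration:

```rust
// Embedding: a primitive code (1-9) is added to a constructor base, so a
// common composite type serializes as a single byte. Decoding reverses it.

const COLL_BASE: u8 = 12;
const OPTION_BASE: u8 = 36;
const OPTION_COLL_BASE: u8 = 48;

fn embed(constructor_base: u8, prim_code: u8) -> u8 {
    constructor_base + prim_code
}

/// Classify a single-byte type code by which embedded range it falls into.
/// Bare constructor codes (12, 24, 36, 48) are followed by an explicit
/// element type and fall through to "other" here.
fn decode(code: u8) -> &'static str {
    match code {
        1..=9 => "primitive",
        13..=21 => "Coll[primitive]",
        25..=33 => "Coll[Coll[primitive]]",
        37..=45 => "Option[primitive]",
        49..=57 => "Option[Coll[primitive]]",
        _ => "other",
    }
}

fn main() {
    assert_eq!(embed(COLL_BASE, 4), 16);        // Coll[Int]
    assert_eq!(embed(OPTION_BASE, 5), 41);      // Option[Long]
    assert_eq!(embed(OPTION_COLL_BASE, 2), 50); // Option[Coll[Byte]]
    assert_eq!(decode(16), "Coll[primitive]");
    assert_eq!(decode(41), "Option[primitive]");
}
```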
Collection Types
Collections are homogeneous sequences[^9][^10]:
const SCollection = struct {
elem_type: *const SType,
pub fn typeCode(self: SCollection) u8 {
if (self.elem_type.isEmbeddable()) {
return TypeCodes.COLL_BASE + self.elem_type.typeCode();
}
return TypeCodes.COLL_BASE; // Followed by element type
}
};
// Pre-defined collection types (avoid allocation)
const SByteArray = SCollection{ .elem_type = &SType.byte };
const SIntArray = SCollection{ .elem_type = &SType.int };
const SBooleanArray = SCollection{ .elem_type = &SType.boolean };
const SBoxArray = SCollection{ .elem_type = &SType.box };
Option Types
Optional values[^11]:
const SOption = struct {
elem_type: *const SType,
pub fn typeCode(self: SOption) u8 {
if (self.elem_type.isEmbeddable()) {
return TypeCodes.OPTION_BASE + self.elem_type.typeCode();
}
return TypeCodes.OPTION_BASE;
}
};
// Pre-defined option types
const SByteOption = SOption{ .elem_type = &SType.byte };
const SIntOption = SOption{ .elem_type = &SType.int };
const SLongOption = SOption{ .elem_type = &SType.long };
const SBoxOption = SOption{ .elem_type = &SType.box };
Tuple Types
Heterogeneous fixed-size sequences:
const STuple = struct {
items: []const SType,
pub const type_code: u8 = 96;
pub fn pair(left: SType, right: SType) STuple {
return STuple{ .items = &[_]SType{ left, right } };
}
};
Function Types
Function signatures for lambdas and methods:
const SFunc = struct {
t_dom: []const SType, // Domain (argument types)
t_range: *const SType, // Range (return type)
tpe_params: []const STypeVar, // Generic type parameters
pub const type_code: u8 = 246;
};
// Example: (Int) => Boolean
const intToBool = SFunc{
.t_dom = &[_]SType{SType.int},
.t_range = &SType.boolean,
.tpe_params = &[_]STypeVar{},
};
Object Types
| Type | Code | Description |
|---|---|---|
| SBox | 99 | UTXO with value, script, tokens, registers |
| SAvlTree | 100 | Authenticated dictionary (Merkle proofs) |
| SContext | 101 | Transaction context |
| SHeader | 104 | Block header |
| SPreHeader | 105 | Pre-solved block header |
| SGlobal | 106 | Global operations |
Type Variables
Used internally by the compiler for generic methods (never serialized)[^12]:
const STypeVar = struct {
name: []const u8,
// Standard type variables
pub const T = STypeVar{ .name = "T" };
pub const R = STypeVar{ .name = "R" };
pub const K = STypeVar{ .name = "K" };
pub const V = STypeVar{ .name = "V" };
pub const IV = STypeVar{ .name = "IV" }; // Input Value
pub const OV = STypeVar{ .name = "OV" }; // Output Value
};
Version Differences
v6 additions[^13]:
- SUnsignedBigInt (type code 9)
- Bitwise operations on numeric types
- Additional numeric methods (toBytes, toBits, shifts)
pub fn allPredefTypes(version: ErgoTreeVersion) []const SType {
const v5_types = &[_]SType{
.boolean, .byte, .short, .int, .long, .big_int,
.context, .global, .header, .pre_header, .avl_tree,
.group_element, .sigma_prop, .box, .unit, .any,
};
if (version.value >= 3) { // v6+
return v5_types ++ &[_]SType{.unsigned_big_int};
}
return v5_types;
}
Complete Type Code Reference
| Type | Code | Embeddable |
|---|---|---|
| Boolean | 1 | Yes |
| Byte | 2 | Yes |
| Short | 3 | Yes |
| Int | 4 | Yes |
| Long | 5 | Yes |
| BigInt | 6 | Yes |
| GroupElement | 7 | Yes |
| SigmaProp | 8 | Yes |
| UnsignedBigInt | 9 | Yes |
| Coll[T] | 12 | Constructor |
| Option[T] | 36 | Constructor |
| Tuple | 96 | No |
| Any | 97 | No |
| Unit | 98 | No |
| Box | 99 | No |
| AvlTree | 100 | No |
| Context | 101 | No |
| Header | 104 | No |
| PreHeader | 105 | No |
| Global | 106 | No |
Summary
This chapter covered ErgoTree's type system, which provides the foundation for type-safe script execution:
- Type codes (unique numeric identifiers) enable compact binary serialization—critical for on-chain storage efficiency
- Embeddable types (codes 1–9) combine with type constructors using a clever arithmetic encoding, reducing common types to single bytes
- Numeric types form an ordered hierarchy (Byte < Short < Int < Long < BigInt) with safe upcasting and checked downcasting
- SigmaProp is the required return type for all ErgoScript contracts—it represents the cryptographic proposition that must be proven
- Object types (Box, Context, Header) provide access to blockchain state during script execution
- Version 6 introduces SUnsignedBigInt and additional numeric operations for greater expressiveness
The type system ensures that scripts are well-formed before execution, preventing runtime type errors that could cause consensus failures. In the next chapter, we'll see how these types are organized into the ErgoTree structure—the actual format stored on-chain.
Next: Chapter 3: ErgoTree Structure
[^1]: Scala: SType.scala:17-61
[^2]: Rust: stype.rs:27-76
[^3]: Scala: SType.scala:395-575 (numeric type definitions)
[^4]: Rust: snumeric.rs:12-37 (method IDs)
[^5]: Scala: SType.scala (SGroupElement definition)
[^6]: Scala: SType.scala (SSigmaProp definition)
[^7]: Scala: SType.scala:320-332 (type code ranges)
[^8]: Scala: SType.scala:305-313 (SEmbeddable trait)
[^9]: Scala: SType.scala:743-799 (SCollection)
[^10]: Rust: scoll.rs
[^11]: Scala: SType.scala:691-741 (SOption)
[^12]: Scala: SType.scala:67-95 (type variables)
[^13]: Scala: SType.scala:105-128 (version differences)
Chapter 3: ErgoTree Structure
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Binary representation concepts (bits, bytes, bitwise operations)
- Variable-Length Quantity (VLQ) encoding—a method for encoding integers using a variable number of bytes
- Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations of code structure
- Prior chapters: Chapter 1 for the three-layer architecture, Chapter 2 for type codes used in serialization
Learning Objectives
By the end of this chapter, you will be able to:
- Parse and interpret ErgoTree header bytes, extracting version and feature flags
- Explain how constant segregation enables template sharing and caching optimizations
- Describe the version mechanism and how it enables soft-fork protocol upgrades
- Read and write the complete ErgoTree binary format
ErgoTree Overview
When you write an ErgoScript contract, the compiler transforms it into ErgoTree—a compact binary format designed for blockchain storage and deterministic execution. Every UTXO box contains an ErgoTree that defines its spending conditions.
ErgoTree is specifically designed to be[^1][^2]:
- Self-sufficient: Contains everything needed for evaluation (no external dependencies)
- Compact: Optimized binary encoding minimizes on-chain storage
- Forward-compatible: Version mechanism enables protocol upgrades without hard forks
- Deterministic: Same bytes always produce the same evaluation result
The structure consists of:
- Header byte — Format version and feature flags
- Size field (optional) — Total size for fast skipping
- Constants array (optional) — Extracted constants for template sharing
- Root expression — The actual script logic, returning SigmaProp
const ErgoTree = struct {
header: HeaderType,
constants: []const Constant,
root: union(enum) {
parsed: SigmaPropValue,
unparsed: UnparsedTree,
},
proposition_bytes: ?[]const u8,
pub fn bytes(self: *ErgoTree, allocator: Allocator) ![]const u8 {
    if (self.proposition_bytes) |b| return b;
    return try serialize(self, allocator);
}
pub fn bytesHex(self: *ErgoTree, allocator: Allocator) ![]u8 {
    const b = try self.bytes(allocator);
    return std.fmt.allocPrint(allocator, "{}", .{std.fmt.fmtSliceHexLower(b)});
}
};
Header Format
The first byte uses a bit-field format [3][4]:
7 6 5 4 3 2 1 0
┌──┬──┬──┬──┬──┬──┬──┬──┐
│ │ │ │ │ │ │ │ │
└──┴──┴──┴──┴──┴──┴──┴──┘
│ │ │ │ │ └──┴──┴── Version (bits 0-2)
│ │ │ │ └─────────── Size flag (bit 3)
│ │ │ └────────────── Constant segregation (bit 4)
│ │ └───────────────── Reserved (bit 5, must be 0)
│ └──────────────────── Reserved for GZIP (bit 6, must be 0)
└─────────────────────── Extended header (bit 7)
const HeaderType = packed struct(u8) {
version: u3, // bits 0-2
has_size: bool, // bit 3
constant_segregation: bool, // bit 4
reserved1: bool = false, // bit 5
reserved_gzip: bool = false, // bit 6
multi_byte: bool = false, // bit 7
pub const VERSION_MASK: u8 = 0x07;
pub const SIZE_FLAG: u8 = 0x08;
pub const CONST_SEG_FLAG: u8 = 0x10;
pub fn fromByte(byte: u8) HeaderType {
return @bitCast(byte);
}
pub fn toByte(self: HeaderType) u8 {
return @bitCast(self);
}
pub fn v0(constant_segregation: bool) HeaderType {
return .{
.version = 0,
.has_size = false,
.constant_segregation = constant_segregation,
};
}
pub fn v1(constant_segregation: bool) HeaderType {
return .{
.version = 1,
.has_size = true, // Required for v1+
.constant_segregation = constant_segregation,
};
}
};
Common Header Values
| Byte | Binary | Meaning |
|---|---|---|
| 0x00 | 00000000 | v0, no segregation, no size |
| 0x08 | 00001000 | v0, no segregation, with size |
| 0x10 | 00010000 | v0, constant segregation, no size |
| 0x18 | 00011000 | v0, constant segregation, with size |
| 0x09 | 00001001 | v1, with size (required) |
| 0x19 | 00011001 | v1, constant segregation, with size |
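These rows can be checked mechanically. A small Python sketch (the masks mirror the `VERSION_MASK`, `SIZE_FLAG`, and `CONST_SEG_FLAG` constants above; the function name is ours, used only for illustration):

```python
def decode_header(byte: int) -> tuple[int, bool, bool]:
    """Decode an ErgoTree header byte into (version, has_size, constant_segregation)."""
    version = byte & 0x07          # VERSION_MASK: bits 0-2
    has_size = bool(byte & 0x08)   # SIZE_FLAG: bit 3
    const_seg = bool(byte & 0x10)  # CONST_SEG_FLAG: bit 4
    return version, has_size, const_seg

assert decode_header(0x00) == (0, False, False)  # v0, plain
assert decode_header(0x18) == (0, True, True)    # v0, segregation + size
assert decode_header(0x19) == (1, True, True)    # v1, segregation + size
```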
Binary Format
┌──────────────────────────────────────────────────────────────────┐
│ ErgoTree │
├─────────┬─────────────┬──────────────────┬───────────────────────┤
│ Header │ [Size] │ [Constants] │ Root Expression │
│ 1 byte │ VLQ (opt) │ Array (opt) │ Serialized tree │
└─────────┴─────────────┴──────────────────┴───────────────────────┘
If header bit 3 is set (hasSize):
Size = VLQ-encoded size of (Constants + Root Expression)
If header bit 4 is set (isConstantSegregation):
Constants = VLQ count + Array of serialized constants
Root Expression = Serialized expression tree (SigmaPropValue)
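The size and constant-count fields use unsigned VLQ encoding: seven payload bits per byte, with the high bit as a continuation flag. A minimal Python sketch of the scheme (illustrative only, not the reference serializer):

```python
def vlq_encode(n: int) -> bytes:
    """Encode an unsigned integer as VLQ: 7 data bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0x00))
        if n == 0:
            return bytes(out)

def vlq_decode(data: bytes) -> int:
    """Decode an unsigned VLQ value from the start of `data`."""
    result, shift = 0, 0
    for b in data:
        result |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            break
    return result

assert vlq_encode(127) == b"\x7f"          # fits in one byte
assert vlq_encode(128) == b"\x80\x01"      # continuation bit set on first byte
assert vlq_decode(vlq_encode(300)) == 300  # round trip
```

Note that signed values in ErgoTree serialization additionally use ZigZag mapping before VLQ; the sketch above covers only the unsigned case used for sizes and counts.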
const ErgoTreeSerializer = struct {
pub fn deserialize(allocator: Allocator, reader: anytype) !ErgoTree {
// 1. Read header byte
const header = HeaderType.fromByte(try reader.readByte());
// 2. Read extended header if bit 7 set
if (header.multi_byte) {
// VLQ continuation - read additional bytes
_ = try readVlqExtension(reader);
}
// 3. Read size if flag set
var tree_size: ?u32 = null;
if (header.has_size) {
tree_size = try readVlq(reader);
}
// 4. Read constants if segregation enabled
var constants: []Constant = &.{};
if (header.constant_segregation) {
const count = try readVlq(reader);
// Bounds check: prevent DoS via excessive allocation
const MAX_CONSTANTS: u32 = 4096;
if (count > MAX_CONSTANTS) {
return error.TooManyConstants;
}
constants = try allocator.alloc(Constant, count);
for (constants) |*c| {
c.* = try Constant.deserialize(reader);
}
}
// NOTE: In production, use a pre-allocated pool instead of dynamic
// allocation during deserialization. See ZIGMA_STYLE.md.
// 5. Read root expression
const root = try Expr.deserialize(reader);
return ErgoTree{
.header = header,
.constants = constants,
.root = .{ .parsed = root },
.proposition_bytes = null,
};
}
};
Constant Segregation
Constant segregation is an optimization technique that extracts literal values from the expression tree and stores them in a separate array [5]. The expression tree then references these constants via placeholder indices. This seemingly simple change enables several powerful optimizations:
Without segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x00 │
│ root: AND(GT(HEIGHT, IntConstant(100)), pk) │
└─────────────────────────────────────────────────┘
With segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x10 │
│ constants: [IntConstant(100)] │
│ root: AND(GT(HEIGHT, Placeholder(0)), pk) │
└─────────────────────────────────────────────────┘
Benefits:
- Template sharing: Same template, different constants
- Caching: Templates cached for repeated evaluation
- Substitution: Constants replaced without re-parsing
/// Substitute ConstantPlaceholder nodes with actual constants.
/// Sketch: re-allocation of child nodes is elided for brevity.
pub fn substConstants(
    root: *const Expr,
    constants: []const Constant,
) Expr {
    return switch (root.*) {
        .constant_placeholder => |ph| .{
            .constant = constants[ph.index],
        },
        // `and` is a Zig keyword, so the union field is written @"and"
        .@"and" => |a| .{
            .@"and" = .{
                .left = substConstants(a.left, constants),
                .right = substConstants(a.right, constants),
            },
        },
        // ... other node types
        else => root.*,
    };
}
// NOTE: In production, use an iterative approach with an explicit work stack
// to guarantee bounded stack depth and prevent stack overflow on deep trees.
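The same substitution can be exercised on a toy AST in a few lines of Python, with tuples standing in for tree nodes (a model of the sketch above, not the reference implementation):

```python
def subst_constants(node, constants):
    """Replace ('placeholder', i) nodes with ('constant', constants[i])."""
    kind = node[0]
    if kind == "placeholder":
        return ("constant", constants[node[1]])
    if kind == "and":
        return ("and",
                subst_constants(node[1], constants),
                subst_constants(node[2], constants))
    return node  # leaves and unhandled node kinds pass through unchanged

tree = ("and", ("placeholder", 0), ("constant", True))
out = subst_constants(tree, [100])
assert out == ("and", ("constant", 100), ("constant", True))
```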
Version Mechanism
The ErgoTree version field (bits 0-2) enables soft-fork protocol upgrades without breaking consensus [6]. Each ErgoTree version corresponds to a minimum required block version in the Ergo protocol—nodes running older protocol versions will skip validation of scripts with newer ErgoTree versions rather than rejecting them as invalid.
| ErgoTree Version | Min Block Version | Key Features |
|---|---|---|
| v0 | 1 | Original format with Ahead-of-Time (AOT) costing calculated during compilation |
| v1 | 2 | Just-in-Time (JIT) costing calculated during execution; size field required |
| v2 | 3 | Extended operations and new opcodes |
| v3 | 4 | UnsignedBigInt type and enhanced collection methods |
The size field became mandatory in v1 to support forward compatibility—nodes can skip over scripts they cannot fully parse by reading the size and advancing past the unknown content.
pub fn setVersionBits(header: HeaderType, version: u3) HeaderType {
var h = header;
h.version = version;
// Size flag required for version > 0
if (version > 0) {
h.has_size = true;
}
return h;
}
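The skip path itself can be modeled in a few lines of Python (hypothetical names; a single-byte VLQ size is assumed for brevity): a node parses the header, sees a version it does not support, and uses the size field to step over the body instead of failing.

```python
MAX_SUPPORTED_VERSION = 1  # hypothetical: highest ErgoTree version this node parses

def read_script(data: bytes):
    """Sketch: if the tree version is newer than supported, use the mandatory
    size field to skip the body rather than rejecting the script."""
    header = data[0]
    version = header & 0x07
    has_size = bool(header & 0x08)
    if version > MAX_SUPPORTED_VERSION:
        assert has_size, "v1+ trees must carry a size field"
        size = data[1]  # simplified: single-byte VLQ
        body = data[2:2 + size]
        return ("unparsed", body)  # preserved for later, not rejected
    return ("parsed", data[1:])

kind, body = read_script(bytes([0x0A, 0x03, 1, 2, 3]))  # header 0x0A = version 2
assert kind == "unparsed" and body == bytes([1, 2, 3])
```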
Unparsed Trees
When a node encounters an ErgoTree with an unknown opcode—typically from a newer protocol version—deserialization fails. Rather than rejecting the transaction entirely, the raw bytes are preserved as an "unparsed tree" [7]. This design is critical for soft-fork compatibility: older nodes can process blocks containing newer script versions without understanding their contents.
const UnparsedTree = struct {
bytes: []const u8,
err: DeserializationError,
};
/// Convert to proposition, handling unparsed case
pub fn toProposition(self: *const ErgoTree, replace_constants: bool) !SigmaPropValue {
return switch (self.root) {
.parsed => |tree| blk: {
if (replace_constants and self.constants.len > 0) {
break :blk substConstants(tree, self.constants);
}
break :blk tree;
},
.unparsed => |u| return u.err,
};
}
Creating ErgoTrees
pub fn fromProposition(allocator: Allocator, prop: SigmaPropValue) !ErgoTree {
    return fromPropositionWithHeader(allocator, HeaderType.v0(false), prop);
}
pub fn fromPropositionWithHeader(
    allocator: Allocator,
    header: HeaderType,
    prop: SigmaPropValue,
) !ErgoTree {
    // Simple constants don't need segregation
    if (prop == .sigma_prop_constant) {
        return withoutSegregation(header, prop);
    }
    // Complex expressions benefit from segregation
    return withSegregation(allocator, header, prop);
}
fn withSegregation(allocator: Allocator, header: HeaderType, prop: SigmaPropValue) !ErgoTree {
    var constants = std.ArrayList(Constant).init(allocator);
    const segregated = try extractConstants(prop, &constants);
    return ErgoTree{
        .header = .{
            .version = header.version,
            .has_size = header.has_size,
            .constant_segregation = true,
        },
        .constants = try constants.toOwnedSlice(),
        .root = .{ .parsed = segregated },
        .proposition_bytes = null,
    };
}
Template Extraction
The template is the serialized root expression with constants left as placeholders, capturing the script's structure independently of its constant values:
pub fn template(self: *const ErgoTree, allocator: Allocator) ![]const u8 {
    // Serialize root with placeholders (no constant substitution)
    var buf = std.ArrayList(u8).init(allocator);
    try self.root.parsed.serialize(buf.writer());
    return try buf.toOwnedSlice();
}
Templates are useful for:
- Identifying script patterns regardless of constants
- Contract template matching
- Caching deserialized templates
Properties
const ErgoTree = struct {
// ... fields ...
/// Returns true if tree contains deserialization operations
pub fn hasDeserialize(self: *const ErgoTree) bool {
return switch (self.root) {
.parsed => |p| containsDeserializeOp(p),
.unparsed => false,
};
}
/// Returns true if tree uses blockchain context
pub fn isUsingBlockchainContext(self: *const ErgoTree) bool {
return switch (self.root) {
.parsed => |p| containsContextOp(p),
.unparsed => false,
};
}
/// Convert to SigmaBoolean if simple proposition
pub fn toSigmaBooleanOpt(self: *const ErgoTree) ?SigmaBoolean {
const prop = self.toProposition(self.header.constant_segregation) catch return null;
return switch (prop) {
.sigma_prop_constant => |c| c.value,
else => null,
};
}
};
Summary
This chapter covered the complete ErgoTree binary format—the serialized representation of smart contracts stored in every UTXO box:
- ErgoTree is a self-sufficient serialized contract format containing everything needed for evaluation without external dependencies
- The header byte uses a bit-field layout: version (bits 0-2), size flag (bit 3), constant segregation flag (bit 4), with reserved bits for future extensions
- Constant segregation (bit 4) extracts literal values into a separate array, enabling template sharing, caching, and runtime substitution without re-parsing
- The version mechanism enables soft-fork protocol upgrades—newer ErgoTree versions are skipped by older nodes rather than causing consensus failures
- ErgoTree versions 1+ require the size flag, allowing nodes to skip past unknown content
- UnparsedTree preserves raw bytes when deserialization fails, maintaining block validity even with unknown opcodes
- Simple cryptographic propositions can be extracted as SigmaBoolean values for direct signature verification
Next: Chapter 4: Value Nodes
1. Scala: ErgoTree.scala:24-80
2. Rust: ergo_tree.rs:33-41
3. Scala: ErgoTree.scala:227-270
4. Rust: tree_header.rs:10-32
5. Scala: ErgoTree.scala:307-322
6. Scala: ErgoTree.scala:263-305
7. Scala: ErgoTree.scala:19-22
Chapter 4: Value Nodes
Prerequisites
- Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations where each node represents a language construct
- Tree traversal techniques (depth-first evaluation)
- Prior chapters: Chapter 2 for the type system that governs value types, Chapter 3 for how values are serialized in ErgoTree
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the `Value` base type and its role as the foundation for all ErgoTree expression nodes
- Distinguish between different constant value types (primitives, cryptographic, collections)
- Describe how the `eval` method implements the evaluation semantics for each node type
- Work with compound values including collections and tuples
The Value Base Type
ErgoTree is fundamentally an expression tree where every node produces a typed value. The `Value` base type defines the common interface that all expression nodes share—a type annotation, an opcode for serialization, and an evaluation method that computes the result [1][2].
/// Base type for all ErgoTree expression nodes
const Value = struct {
tpe: SType,
op_code: OpCode,
/// Evaluate this node in the given environment
pub fn eval(self: *const Value, env: *const DataEnv, evaluator: *Evaluator) !Any {
// Default: must be overridden
return error.NotImplemented;
}
/// Add fixed cost to accumulator
pub fn addCost(self: *const Value, evaluator: *Evaluator, cost: FixedCost) void {
evaluator.addCost(cost, self.op_code);
}
/// Add per-item cost for known iteration count
pub fn addSeqCost(
self: *const Value,
evaluator: *Evaluator,
cost: PerItemCost,
n_items: usize,
) void {
evaluator.addSeqCost(cost, n_items, self.op_code);
}
};
Value Hierarchy
Value
├── Constant
│ ├── BooleanConstant (TrueLeaf, FalseLeaf)
│ ├── ByteConstant, ShortConstant, IntConstant, LongConstant
│ ├── BigIntConstant, UnsignedBigIntConstant (v6+)
│ ├── GroupElementConstant
│ ├── SigmaPropConstant
│ ├── CollectionConstant
│ └── UnitConstant
├── ConstantPlaceholder
├── Tuple
├── ConcreteCollection
├── SigmaPropValue
│ ├── BoolToSigmaProp
│ ├── CreateProveDlog
│ ├── CreateProveDHTuple
│ ├── SigmaAnd
│ └── SigmaOr
└── Transformer (collection operations)
├── AND, OR, XorOf
├── Map, Filter, Fold
└── Exists, ForAll
The hierarchy divides into several major categories:
- Constants hold literal values known at compile time
- ConstantPlaceholder references segregated constants by index (see Chapter 3)
- Compound values (Tuple, ConcreteCollection) combine multiple values
- SigmaPropValue nodes produce cryptographic propositions for signing
- Transformers perform operations on collections
Constant Values
Constants are pre-evaluated values embedded in the tree [3][4]:
const Constant = struct {
tpe: SType,
value: Literal,
pub const COST = FixedCost{ .value = 5 }; // JitCost units
pub fn eval(self: *const Constant, env: *const DataEnv, E: *Evaluator) Any {
E.addCost(COST, OpCode.Constant);
return self.value.toAny();
}
};
/// Literal values for constants
const Literal = union(enum) {
boolean: bool,
byte: i8,
short: i16,
int: i32,
long: i64,
big_int: BigInt256,
unsigned_big_int: UnsignedBigInt256,
group_element: EcPoint,
sigma_prop: SigmaProp,
coll: Collection,
tuple: []const Literal,
unit: void,
pub fn toAny(self: Literal) Any {
return switch (self) {
.boolean => |b| .{ .boolean = b },
.int => |i| .{ .int = i },
// ... other cases
};
}
};
Primitive Constant Factories
pub fn intConstant(value: i32) Constant {
return .{
.tpe = SType.int,
.value = .{ .int = value },
};
}
pub fn longConstant(value: i64) Constant {
return .{
.tpe = SType.long,
.value = .{ .long = value },
};
}
pub fn byteArrayConstant(bytes: []const u8) Constant {
return .{
.tpe = .{ .coll = &SType.byte },
.value = .{ .coll = .{ .bytes = bytes } },
};
}
Boolean Singletons
Boolean has special singleton instances for efficiency [5]:
pub const TrueLeaf = Constant{
.tpe = SType.boolean,
.value = .{ .boolean = true },
};
pub const FalseLeaf = Constant{
.tpe = SType.boolean,
.value = .{ .boolean = false },
};
pub fn booleanConstant(v: bool) *const Constant {
return if (v) &TrueLeaf else &FalseLeaf;
}
Cryptographic Constants
pub fn groupElementConstant(point: EcPoint) Constant {
return .{
.tpe = SType.group_element,
.value = .{ .group_element = point },
};
}
pub fn sigmaPropConstant(prop: SigmaProp) Constant {
return .{
.tpe = SType.sigma_prop,
.value = .{ .sigma_prop = prop },
};
}
/// Group generator - base point G of secp256k1
pub const GroupGenerator = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) GroupElement {
E.addCost(COST, OpCode.GroupGenerator);
return crypto.SECP256K1_GENERATOR;
}
};
Constant Placeholders
When constant segregation is enabled (Chapter 3), placeholders replace inline constants with index references into the constants array [6][7]. This separation enables template caching—the same expression tree structure can be reused with different constant values. Placeholder evaluation costs less than inline constants because the constant data has already been parsed and validated during ErgoTree deserialization.
const ConstantPlaceholder = struct {
index: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 1 }; // Cheaper than Constant
pub fn eval(self: *const ConstantPlaceholder, _: *const DataEnv, E: *Evaluator) !Any {
// Bounds check first (prevents out-of-bounds access)
if (self.index >= E.constants.len) {
return error.ConstantIndexOutOfBounds;
}
const c = E.constants[self.index];
E.addCost(COST, OpCode.ConstantPlaceholder);
// Type check
if (c.tpe != self.tpe) {
return error.TypeMismatch;
}
return c.value.toAny();
}
};
Collection Values
ErgoTree supports two kinds of collection nodes, optimized for different use cases:
CollectionConstant
For collections where all elements are known at compile time, CollectionConstant stores the values directly. This enables efficient serialization and avoids evaluation overhead for static data like byte arrays and fixed integer sequences.
const CollectionConstant = struct {
elem_type: SType,
items: union(enum) {
bytes: []const u8,
ints: []const i32,
longs: []const i64,
bools: []const bool,
any: []const Literal,
},
pub fn tpe(self: *const CollectionConstant) SType {
return .{ .coll = &self.elem_type };
}
};
ConcreteCollection
When collection elements are computed expressions rather than literals, ConcreteCollection holds references to sub-expression nodes [8]. Each element is evaluated at runtime, making this suitable for dynamically constructed collections.
const ConcreteCollection = struct {
items: []const *Value,
elem_type: SType,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const ConcreteCollection, env: *const DataEnv, E: *Evaluator) ![]Any {
E.addCost(COST, OpCode.ConcreteCollection);
var result = try E.allocator.alloc(Any, self.items.len);
for (self.items, 0..) |item, i| {
result[i] = try item.eval(env, E);
}
return result;
}
};
// NOTE: In production, use a pre-allocated value pool to avoid dynamic
// allocation during evaluation. See ZIGMA_STYLE.md memory management section.
Tuple Values
Heterogeneous fixed-size sequences [9]:
const Tuple = struct {
items: []const *Value,
pub const COST = FixedCost{ .value = 15 };
pub fn tpe(self: *const Tuple, allocator: Allocator) !STuple {
    const types = try allocator.alloc(SType, self.items.len);
    for (self.items, 0..) |item, i| {
        types[i] = item.tpe;
    }
    return STuple{ .items = types };
}
pub fn eval(self: *const Tuple, env: *const DataEnv, E: *Evaluator) !TupleValue {
// Note: v5.0 only supports pairs (2 elements)
if (self.items.len != 2) {
return error.InvalidTupleSize;
}
const x = try self.items[0].eval(env, E);
const y = try self.items[1].eval(env, E);
E.addCost(COST, OpCode.Tuple);
return .{ x, y };
}
};
Sigma Proposition Values
BoolToSigmaProp
Converts a boolean to a cryptographic proposition [10]:
const BoolToSigmaProp = struct {
input: *Value, // Must be boolean
pub const COST = FixedCost{ .value = 15 };
pub fn eval(self: *const BoolToSigmaProp, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const v = try self.input.eval(env, E);
E.addCost(COST, OpCode.BoolToSigmaProp);
return SigmaProp.fromBool(v.boolean);
}
};
CreateProveDlog
Creates a discrete log proposition (standard public key) [11]:
const CreateProveDlog = struct {
input: *Value, // GroupElement
pub const COST = FixedCost{ .value = 10 };
pub fn eval(self: *const CreateProveDlog, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const point = try self.input.eval(env, E);
E.addCost(COST, OpCode.ProveDlog);
return SigmaProp{
.prove_dlog = ProveDlog{ .h = point.group_element },
};
}
};
CreateProveDHTuple
Creates Diffie-Hellman tuple proposition:
const CreateProveDHTuple = struct {
g: *Value,
h: *Value,
u: *Value,
v: *Value,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const CreateProveDHTuple, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const g_val = try self.g.eval(env, E);
const h_val = try self.h.eval(env, E);
const u_val = try self.u.eval(env, E);
const v_val = try self.v.eval(env, E);
E.addCost(COST, OpCode.ProveDHTuple);
return SigmaProp{
.prove_dh_tuple = ProveDhTuple{
.g = g_val.group_element,
.h = h_val.group_element,
.u = u_val.group_element,
.v = v_val.group_element,
},
};
}
};
SigmaAnd / SigmaOr
Combine sigma propositions [12]:
const SigmaAnd = struct {
items: []const *Value, // SigmaPropValues
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 2,
.chunk_size = 1,
};
pub fn eval(self: *const SigmaAnd, env: *const DataEnv, E: *Evaluator) !SigmaProp {
var props = try E.allocator.alloc(SigmaProp, self.items.len);
for (self.items, 0..) |item, i| {
props[i] = (try item.eval(env, E)).sigma_prop;
}
E.addSeqCost(COST, self.items.len, OpCode.SigmaAnd);
return SigmaProp{ .cand = Cand{ .children = props } };
}
};
const SigmaOr = struct {
items: []const *Value,
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 2,
.chunk_size = 1,
};
pub fn eval(self: *const SigmaOr, env: *const DataEnv, E: *Evaluator) !SigmaProp {
var props = try E.allocator.alloc(SigmaProp, self.items.len);
for (self.items, 0..) |item, i| {
props[i] = (try item.eval(env, E)).sigma_prop;
}
E.addSeqCost(COST, self.items.len, OpCode.SigmaOr);
return SigmaProp{ .cor = Cor{ .children = props } };
}
};
Logical Operations
AND / OR with Short-Circuit
Boolean operations support short-circuit evaluation [13]:
const AND = struct {
input: *Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const AND, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = true;
var i: usize = 0;
// Short-circuit: stop on first false
while (i < items.len and result) {
result = result and items[i];
i += 1;
}
// Cost based on actual items processed
E.addSeqCost(COST, i, OpCode.And);
return result;
}
};
const OR = struct {
input: *Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const OR, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = false;
var i: usize = 0;
// Short-circuit: stop on first true
while (i < items.len and !result) {
result = result or items[i];
i += 1;
}
E.addSeqCost(COST, i, OpCode.Or);
return result;
}
};
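The cost consequence of short-circuiting can be seen by counting processed items; a toy Python model of the loop above (names are ours):

```python
def and_with_count(items):
    """Short-circuit AND that also reports how many items were examined,
    mirroring how the evaluator charges only for processed elements."""
    processed = 0
    for b in items:
        processed += 1
        if not b:
            return False, processed  # stop at the first false
    return True, processed

assert and_with_count([True, False, True, True]) == (False, 2)  # 2 of 4 items charged
assert and_with_count([True, True]) == (True, 2)                # full scan when all true
```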
XorOf
XOR over boolean collection:
const XorOf = struct {
input: *Value,
pub const COST = PerItemCost{
.base = 20,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const XorOf, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = false;
for (items) |b| {
result = result != b; // XOR
}
E.addSeqCost(COST, items.len, OpCode.XorOf);
return result;
}
};
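XorOf is true exactly when an odd number of inputs are true; a direct Python model of the fold above:

```python
def xor_of(items):
    """Fold XOR over a boolean collection: true iff an odd number are true."""
    result = False
    for b in items:
        result = result != b  # != on booleans is XOR
    return result

assert xor_of([True, True, True]) is True  # odd count of trues
assert xor_of([True, True]) is False       # even count
assert xor_of([]) is False                 # empty collection
```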
Cost Summary
| Operation | Cost Type | Value |
|---|---|---|
| Constant | Fixed | 5 |
| ConstantPlaceholder | Fixed | 1 |
| Tuple | Fixed | 15 |
| BoolToSigmaProp | Fixed | 15 |
| CreateProveDlog | Fixed | 10 |
| CreateProveDHTuple | Fixed | 20 |
| GroupGenerator | Fixed | 10 |
| AND/OR | PerItem | base=10, chunk=5/32 |
| SigmaAnd/SigmaOr | PerItem | base=10, chunk=2/1 |
| XorOf | PerItem | base=20, chunk=5/32 |
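Assuming the per-item formula is base + per_chunk * ceil(n_items / chunk_size) (an assumption to verify against the reference PerItemCost), the table rows compose like this:

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, n_items: int) -> int:
    """Chunked per-item cost: a base fee plus a per-chunk fee for every started
    chunk of n_items. The formula is an assumption -- cross-check against the
    reference implementation's PerItemCost before relying on it."""
    return base + per_chunk * math.ceil(n_items / chunk_size)

# AND over 100 booleans (base=10, chunk=5/32): 10 + 5 * ceil(100/32) = 30
assert per_item_cost(10, 5, 32, 100) == 30
# SigmaAnd over 3 children (base=10, chunk=2/1): 10 + 2 * 3 = 16
assert per_item_cost(10, 2, 1, 3) == 16
```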
Summary
This chapter introduced the value node hierarchy that forms the foundation of ErgoTree's expression tree:
- `Value` is the base type for all ErgoTree expression nodes, defining the common interface of type, opcode, and evaluation method
- Every value carries type information (`tpe`) used for static type checking and cost information used for bounded execution
- Constants are pre-evaluated literals embedded in the tree; `ConstantPlaceholder` provides indirection to segregated constants for template sharing
- Collection values come in two forms: `CollectionConstant` for static data and `ConcreteCollection` for computed elements
- Sigma proposition values (`CreateProveDlog`, `CreateProveDHTuple`, `SigmaAnd`, `SigmaOr`) produce cryptographic propositions that require zero-knowledge proofs
- Boolean operations (`AND`, `OR`) support short-circuit evaluation, charging costs only for elements actually processed
- The `eval` method on each value type implements its evaluation semantics, transforming the AST node into a runtime value
Next: Chapter 5: Operations and Opcodes
1. Scala: values.scala:30-165
2. Rust: expr.rs:1-80
3. Scala: values.scala:305-398
4. Rust: constant.rs:51-58
5. Scala: values.scala (TrueLeaf, FalseLeaf)
6. Scala: values.scala:400-422
7. Rust: constant_placeholder.rs
8. Scala: values.scala (ConcreteCollection)
9. Scala: values.scala:771-810
10. Scala: trees.scala:28-57
11. Scala: trees.scala (CreateProveDlog)
12. Scala: trees.scala (SigmaAnd, SigmaOr)
13. Scala: trees.scala:186-299
Chapter 5: Operations and Opcodes
Prerequisites
- Understanding of bytecode as numeric instruction encodings
- Single-byte vs multi-byte encoding trade-offs
- Prior chapters: Chapter 4 for value node types, Chapter 2 for type codes that occupy the lower opcode range
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the opcode encoding scheme and why constants share space with operations
- Navigate the complete opcode space (0x00-0xFF) and identify operation categories
- Describe the three cost descriptor types (`FixedCost`, `PerItemCost`, `TypeBasedCost`)
- Understand how short-circuit evaluation affects cost calculation
Opcode Encoding Scheme
Every ErgoTree operation is identified by a single-byte opcode [1][2]:
Opcode Space Layout:
┌────────────┬───────────────────────────────────────────┐
│ 0x00       │ Reserved (Undefined)                      │
├────────────┼───────────────────────────────────────────┤
│ 0x01-0x6F  │ Data type codes (LastDataType = 111)      │
├────────────┼───────────────────────────────────────────┤
│ 0x70       │ LastConstantCode (LastDataType + 1)       │
├────────────┼───────────────────────────────────────────┤
│ 0x71-0xFF  │ Operation codes (newOpCode 1-143)         │
└────────────┴───────────────────────────────────────────┘
This layout is an optimization: constant values in the range 0x01-0x70 encode their type code directly as the opcode, saving one byte per constant in the serialized tree. The type code simultaneously identifies both what the value is and how to deserialize it. Operations occupy the upper range (0x71-0xFF), providing 143 distinct operation codes.
const OpCode = struct {
value: u8,
pub const FIRST_DATA_TYPE: u8 = 1;
pub const LAST_DATA_TYPE: u8 = 111;
pub const LAST_CONSTANT_CODE: u8 = 112; // LAST_DATA_TYPE + 1
pub fn new(shift: u8) OpCode {
return .{ .value = LAST_CONSTANT_CODE + shift };
}
pub fn isConstant(byte: u8) bool {
return byte >= FIRST_DATA_TYPE and byte <= LAST_CONSTANT_CODE;
}
};
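A quick numeric cross-check of the scheme, mirroring new and isConstant above:

```python
LAST_CONSTANT_CODE = 112  # 0x70; LAST_DATA_TYPE (111) + 1

def new_op_code(shift: int) -> int:
    """Operation codes are offsets above the constant range."""
    return LAST_CONSTANT_CODE + shift

def is_constant(byte: int) -> bool:
    """Bytes 0x01-0x70 encode constants (type code doubles as opcode)."""
    return 1 <= byte <= LAST_CONSTANT_CODE

assert new_op_code(1) == 0x71    # TaggedVariable, first operation
assert new_op_code(143) == 0xFF  # XorOf, last operation
assert is_constant(0x70) and not is_constant(0x71)
```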
Opcode Definitions
const OpCodes = struct {
// Variables (0x71-0x74)
pub const TaggedVariable = OpCode.new(1); // 113
pub const ValUse = OpCode.new(2); // 114
pub const ConstantPlaceholder = OpCode.new(3); // 115
pub const SubstConstants = OpCode.new(4); // 116
// Conversions (0x7A-0x7E)
pub const LongToByteArray = OpCode.new(10); // 122
pub const ByteArrayToBigInt = OpCode.new(11); // 123
pub const ByteArrayToLong = OpCode.new(12); // 124
pub const Downcast = OpCode.new(13); // 125
pub const Upcast = OpCode.new(14); // 126
// Literals (0x7F-0x86)
pub const True = OpCode.new(15); // 127
pub const False = OpCode.new(16); // 128
pub const UnitConstant = OpCode.new(17); // 129
pub const GroupGenerator = OpCode.new(18); // 130
pub const Coll = OpCode.new(19); // 131
pub const CollOfBoolConst = OpCode.new(21); // 133
pub const Tuple = OpCode.new(22); // 134
// Tuple access (0x87-0x8C)
pub const Select1 = OpCode.new(23); // 135
pub const Select2 = OpCode.new(24); // 136
pub const Select3 = OpCode.new(25); // 137
pub const Select4 = OpCode.new(26); // 138
pub const Select5 = OpCode.new(27); // 139
pub const SelectField = OpCode.new(28); // 140
// Relations (0x8F-0x98)
pub const Lt = OpCode.new(31); // 143
pub const Le = OpCode.new(32); // 144
pub const Gt = OpCode.new(33); // 145
pub const Ge = OpCode.new(34); // 146
pub const Eq = OpCode.new(35); // 147
pub const Neq = OpCode.new(36); // 148
pub const If = OpCode.new(37); // 149
pub const And = OpCode.new(38); // 150
pub const Or = OpCode.new(39); // 151
pub const AtLeast = OpCode.new(40); // 152
// Arithmetic (0x99-0xA2)
pub const Minus = OpCode.new(41); // 153
pub const Plus = OpCode.new(42); // 154
pub const Xor = OpCode.new(43); // 155
pub const Multiply = OpCode.new(44); // 156
pub const Division = OpCode.new(45); // 157
pub const Modulo = OpCode.new(46); // 158
pub const Exponentiate = OpCode.new(47); // 159
pub const MultiplyGroup = OpCode.new(48); // 160
pub const Min = OpCode.new(49); // 161
pub const Max = OpCode.new(50); // 162
// Context (0xA3-0xAC)
pub const Height = OpCode.new(51); // 163
pub const Inputs = OpCode.new(52); // 164
pub const Outputs = OpCode.new(53); // 165
pub const LastBlockUtxoRootHash = OpCode.new(54); // 166
pub const Self = OpCode.new(55); // 167
pub const MinerPubkey = OpCode.new(60); // 172
// Collections (0xAD-0xB8)
pub const Map = OpCode.new(61); // 173
pub const Exists = OpCode.new(62); // 174
pub const ForAll = OpCode.new(63); // 175
pub const Fold = OpCode.new(64); // 176
pub const SizeOf = OpCode.new(65); // 177
pub const ByIndex = OpCode.new(66); // 178
pub const Append = OpCode.new(67); // 179
pub const Slice = OpCode.new(68); // 180
pub const Filter = OpCode.new(69); // 181
pub const AvlTree = OpCode.new(70); // 182
pub const FlatMap = OpCode.new(72); // 184
// Box access (0xC1-0xC7)
pub const ExtractAmount = OpCode.new(81); // 193
pub const ExtractScriptBytes = OpCode.new(82); // 194
pub const ExtractBytes = OpCode.new(83); // 195
pub const ExtractBytesWithNoRef = OpCode.new(84); // 196
pub const ExtractId = OpCode.new(85); // 197
pub const ExtractRegisterAs = OpCode.new(86); // 198
pub const ExtractCreationInfo = OpCode.new(87); // 199
// Crypto (0xCB-0xD3)
pub const CalcBlake2b256 = OpCode.new(91); // 203
pub const CalcSha256 = OpCode.new(92); // 204
pub const ProveDlog = OpCode.new(93); // 205
pub const ProveDHTuple = OpCode.new(94); // 206
pub const SigmaPropBytes = OpCode.new(96); // 208
pub const BoolToSigmaProp = OpCode.new(97); // 209
pub const TrivialFalse = OpCode.new(98); // 210
pub const TrivialTrue = OpCode.new(99); // 211
// Blocks (0xD4-0xDD)
pub const DeserializeContext = OpCode.new(100); // 212
pub const DeserializeRegister = OpCode.new(101); // 213
pub const ValDef = OpCode.new(102); // 214
pub const FunDef = OpCode.new(103); // 215
pub const BlockValue = OpCode.new(104); // 216
pub const FuncValue = OpCode.new(105); // 217
pub const FuncApply = OpCode.new(106); // 218
pub const PropertyCall = OpCode.new(107); // 219
pub const MethodCall = OpCode.new(108); // 220
pub const Global = OpCode.new(109); // 221
// Options (0xDE-0xE6)
pub const SomeValue = OpCode.new(110); // 222
pub const NoneValue = OpCode.new(111); // 223
pub const GetVar = OpCode.new(115); // 227
pub const OptionGet = OpCode.new(116); // 228
pub const OptionGetOrElse = OpCode.new(117); // 229
pub const OptionIsDefined = OpCode.new(118); // 230
// Sigma props (0xEA-0xED)
pub const SigmaAnd = OpCode.new(122); // 234
pub const SigmaOr = OpCode.new(123); // 235
pub const BinOr = OpCode.new(124); // 236
pub const BinAnd = OpCode.new(125); // 237
// Bitwise (0xEE-0xF8)
pub const DecodePoint = OpCode.new(126); // 238
pub const LogicalNot = OpCode.new(127); // 239
pub const Negation = OpCode.new(128); // 240
pub const BitInversion = OpCode.new(129); // 241
pub const BitOr = OpCode.new(130); // 242
pub const BitAnd = OpCode.new(131); // 243
pub const BinXor = OpCode.new(132); // 244
pub const BitXor = OpCode.new(133); // 245
pub const BitShiftRight = OpCode.new(134); // 246
pub const BitShiftLeft = OpCode.new(135); // 247
pub const BitShiftRightZeroed = OpCode.new(136); // 248
// Special (0xFE-0xFF)
pub const Context = OpCode.new(142); // 254
pub const XorOf = OpCode.new(143); // 255
};
Opcode Categories Summary
| Category | Range | Count | Description |
|---|---|---|---|
| Variables | 113-116 | 4 | Variable references, placeholders |
| Conversions | 122-126 | 5 | Type conversions |
| Literals | 127-134 | 8 | Boolean, unit, collections |
| Tuple access | 135-140 | 6 | Field selection |
| Relations | 143-152 | 10 | Comparisons, conditionals |
| Arithmetic | 153-162 | 10 | Math operations |
| Context | 163-172 | 6 | Transaction context |
| Collections | 173-184 | 10 | Collection operations |
| Box access | 193-199 | 7 | Box property access |
| Crypto | 203-211 | 9 | Hashing, sigma props |
| Blocks | 212-221 | 10 | Definitions, lambdas |
| Options | 222-230 | 7 | Option operations |
| Sigma props | 234-237 | 4 | Sigma composition |
| Bitwise | 238-248 | 11 | Bit operations |
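The ranges above can be collapsed into a single classification helper. The sketch below is this book's own illustration, not code from either reference implementation; the `Category` enum is invented here, and the gaps between ranges (reserved codes) map to `null`:

```zig
const Category = enum {
    constant, variables, conversions, literals, tuple_access, relations,
    arithmetic, context, collections, box_access, crypto, blocks,
    options, sigma_props, bitwise, special,
};

/// Classify a raw serialized byte using the ranges from the table above.
fn categoryOf(code: u8) ?Category {
    return switch (code) {
        1...112 => .constant, // constant type codes (0x01-0x70)
        113...116 => .variables,
        122...126 => .conversions,
        127...134 => .literals,
        135...140 => .tuple_access,
        143...152 => .relations,
        153...162 => .arithmetic,
        163...172 => .context,
        173...184 => .collections,
        193...199 => .box_access,
        203...211 => .crypto,
        212...221 => .blocks,
        222...230 => .options,
        234...237 => .sigma_props,
        238...248 => .bitwise,
        254, 255 => .special,
        else => null, // reserved / unassigned
    };
}
```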
Arithmetic Operations
Arithmetic operations use type-based costing34:
const ArithOp = struct {
op_code: OpCode,
left: *const Value,
right: *const Value,
pub fn eval(self: *const ArithOp, env: *const DataEnv, E: *Evaluator) !Any {
const x = try self.left.eval(env, E);
const y = try self.right.eval(env, E);
const cost = switch (self.left.tpe) {
.big_int, .unsigned_big_int => 30, // worst case; per-op BigInt costs vary (see table below)
else => 15,
};
E.addCost(FixedCost{ .value = cost }, self.op_code);
return switch (self.op_code.value) {
OpCodes.Plus.value => arithPlus(x, y, self.left.tpe),
OpCodes.Minus.value => arithMinus(x, y, self.left.tpe),
OpCodes.Multiply.value => arithMultiply(x, y, self.left.tpe),
OpCodes.Division.value => arithDivision(x, y, self.left.tpe),
OpCodes.Modulo.value => arithModulo(x, y, self.left.tpe),
OpCodes.Min.value => arithMin(x, y, self.left.tpe),
OpCodes.Max.value => arithMax(x, y, self.left.tpe),
else => error.UnknownOpcode,
};
}
};
fn arithPlus(x: Any, y: Any, tpe: SType) !Any {
// NOTE: ErgoTree arithmetic is exact, not wrapping: both reference
// implementations fail evaluation on overflow (Math.addExact in Scala,
// checked_add in Rust), so wrapping operators like +% would be incorrect here.
return switch (tpe) {
.byte => .{ .byte = try checkedAdd(i8, x.byte, y.byte) },
.short => .{ .short = try checkedAdd(i16, x.short, y.short) },
.int => .{ .int = try checkedAdd(i32, x.int, y.int) },
.long => .{ .long = try checkedAdd(i64, x.long, y.long) },
.big_int => .{ .big_int = try x.big_int.add(y.big_int) }, // 256-bit overflow also fails
else => unreachable,
};
}
fn checkedAdd(comptime T: type, a: T, b: T) !T {
const res = @addWithOverflow(a, b);
if (res[1] != 0) return error.ArithmeticOverflow;
return res[0];
}
Arithmetic Cost Table
| Operation | Primitive Cost | BigInt Cost |
|---|---|---|
| Plus (+) | 15 | 20 |
| Minus (-) | 15 | 20 |
| Multiply (*) | 15 | 30 |
| Division (/) | 15 | 30 |
| Modulo (%) | 15 | 30 |
| Min/Max | 15 | 20 |
Relation Operations
Comparison operations5:
const Relation = struct {
op_code: OpCode,
left: *const Value,
right: *const Value,
pub fn eval(self: *const Relation, env: *const DataEnv, E: *Evaluator) !bool {
const lv = try self.left.eval(env, E);
const rv = try self.right.eval(env, E);
const cost: u32 = switch (self.op_code.value) {
OpCodes.Eq.value, OpCodes.Neq.value => 3, // Equality cheap
else => 15, // Ordering comparisons
};
E.addCost(FixedCost{ .value = cost }, self.op_code);
return switch (self.op_code.value) {
OpCodes.Lt.value => compare(lv, rv, self.left.tpe) < 0,
OpCodes.Le.value => compare(lv, rv, self.left.tpe) <= 0,
OpCodes.Gt.value => compare(lv, rv, self.left.tpe) > 0,
OpCodes.Ge.value => compare(lv, rv, self.left.tpe) >= 0,
OpCodes.Eq.value => equalValues(lv, rv),
OpCodes.Neq.value => !equalValues(lv, rv),
else => error.UnknownOpcode,
};
}
};
Logical Operations
Short-circuit evaluation with per-item cost6:
const LogicalAnd = struct {
input: *const Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const LogicalAnd, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = true;
var i: usize = 0;
// Short-circuit: stop on first false
while (i < items.len and result) : (i += 1) {
result = result and items[i];
}
// Cost based on actual items processed
E.addSeqCost(COST, i, OpCodes.And);
return result;
}
};
const BinaryAnd = struct {
left: *const Value,
right: *const Value,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const BinaryAnd, env: *const DataEnv, E: *Evaluator) !bool {
const l = try self.left.eval(env, E);
E.addCost(COST, OpCodes.BinAnd);
// Short-circuit: don't evaluate right if left is false
if (!l.boolean) return false;
return (try self.right.eval(env, E)).boolean;
}
};
Cost Descriptors
Every operation has an associated cost that the interpreter accumulates during evaluation. If the total cost exceeds the block limit, execution fails—this prevents denial-of-service attacks via expensive computations. Three cost descriptor types model different operation characteristics7:
/// Fixed cost regardless of input
const FixedCost = struct {
value: u32, // JitCost units
};
/// Cost scales with input size
const PerItemCost = struct {
base: u32, // Fixed overhead
per_chunk: u32, // Cost per chunk
chunk_size: u32, // Items per chunk
pub fn calculate(self: PerItemCost, n_items: usize) u32 {
const chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
return self.base + @as(u32, @intCast(chunks)) * self.per_chunk;
}
};
/// Cost depends on operand type
const TypeBasedCost = struct {
primitive_cost: u32,
big_int_cost: u32,
pub fn forType(self: TypeBasedCost, tpe: SType) u32 {
return switch (tpe) {
.big_int, .unsigned_big_int => self.big_int_cost,
else => self.primitive_cost,
};
}
};
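To make the chunked formula concrete, here is a worked check of `PerItemCost.calculate` using the Blake2b256 descriptor values that appear later in this chapter (base 117, per-chunk 1, chunk size 128). This is illustrative test code, assuming `std` is imported and the `PerItemCost` definition above is in scope:

```zig
test "PerItemCost rounds partial chunks up" {
    const blake_cost = PerItemCost{ .base = 117, .per_chunk = 1, .chunk_size = 128 };
    // 0 items -> 0 chunks -> base cost only
    try std.testing.expectEqual(@as(u32, 117), blake_cost.calculate(0));
    // 1..128 items -> 1 chunk
    try std.testing.expectEqual(@as(u32, 118), blake_cost.calculate(128));
    // 129 items -> 2 chunks: a partial chunk costs as much as a full one
    try std.testing.expectEqual(@as(u32, 119), blake_cost.calculate(129));
}
```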
Context Operations
Access transaction context8:
const ContextOps = struct {
pub const Height = struct {
pub const COST = FixedCost{ .value = 26 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) i32 {
E.addCost(COST, OpCodes.Height);
return E.context.pre_header.height;
}
};
pub const Inputs = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
E.addCost(COST, OpCodes.Inputs);
return E.context.inputs;
}
};
pub const Outputs = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
E.addCost(COST, OpCodes.Outputs);
return E.context.outputs;
}
};
pub const SelfBox = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) *const Box {
E.addCost(COST, OpCodes.Self);
return E.context.self_box;
}
};
};
Box Property Access
Extract box properties9:
const ExtractAmount = struct {
box: *const Value,
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractAmount, env: *const DataEnv, E: *Evaluator) !i64 {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractAmount);
return b.box.value;
}
};
const ExtractId = struct {
box: *const Value,
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractId, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractId);
return b.box.id();
}
};
const ExtractRegisterAs = struct {
box: *const Value,
register_id: u4, // 0-9
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractRegisterAs, env: *const DataEnv, E: *Evaluator) !?Constant {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractRegisterAs);
return b.box.registers[self.register_id];
}
};
Cryptographic Operations
Hash and sigma prop operations10:
const CalcBlake2b256 = struct {
input: *const Value, // Coll[Byte]
pub const COST = PerItemCost{
.base = 117,
.per_chunk = 1,
.chunk_size = 128,
};
pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const bytes = try self.input.eval(env, E);
E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcBlake2b256);
var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
hasher.update(bytes.coll.bytes);
return hasher.finalResult();
}
};
const CalcSha256 = struct {
input: *const Value,
pub const COST = PerItemCost{
.base = 79,
.per_chunk = 1,
.chunk_size = 64,
};
pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const bytes = try self.input.eval(env, E);
E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcSha256);
var hasher = std.crypto.hash.sha2.Sha256.init(.{});
hasher.update(bytes.coll.bytes);
return hasher.finalResult();
}
};
Summary
This chapter detailed the opcode encoding scheme that gives each ErgoTree operation a unique byte identifier:
- Opcode space is split between constant type codes (0x01-0x70) and operation codes (0x71-0xFF), with constants using their type code directly to save one byte per value
- Operation categories group related functionality: variables, conversions, relations, arithmetic, context access, collections, box properties, cryptography, blocks, options, sigma propositions, and bitwise operations
- Cost descriptors come in three types: `FixedCost` for constant-time operations, `PerItemCost` for operations that scale with input size, and `TypeBasedCost` for operations where BigInt is more expensive than primitive types
- Short-circuit evaluation in logical operations (`AND`, `OR`, `BinaryAnd`, `BinaryOr`) stops early when the result is determined, with costs calculated based on actual items processed
- Context operations provide access to transaction data: `HEIGHT`, `INPUTS`, `OUTPUTS`, the `SELF` box, and the miner public key
Next: Chapter 6: Methods on Types
Scala: OpCodes.scala
Rust: op_code.rs:10-100
Scala: trees.scala:704-827
Rust: bin_op.rs
Scala: trees.scala:908-1100
Scala: trees.scala (AND, OR)
Scala: CostKind.scala
Scala: trees.scala (context operations)
Scala: trees.scala (box accessors)
Scala: trees.scala (crypto operations)
Chapter 6: Methods on Types
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Understanding of method dispatch—how method calls are resolved to specific implementations based on the receiver type
- Familiarity with type hierarchies and how types can share common method interfaces
- Prior chapters: Chapter 2 for type codes used in method resolution, Chapter 5 for operations vs methods distinction
Learning Objectives
By the end of this chapter, you will be able to:
- Explain how methods are organized via `MethodsContainer` and resolved by type code and method ID
- Use methods on numeric, collection, box, and cryptographic types
- Describe the method resolution process from `MethodCall` to method implementation
- Access transaction context and blockchain state through context methods
Method Architecture
While Chapter 5 covered standalone operations (arithmetic, comparisons, etc.), ErgoTree also supports methods—operations that belong to specific types. The distinction matters for serialization: operations use opcodes directly, while method calls serialize a type code, method ID, and arguments. This design allows types to have rich APIs without consuming the limited opcode space.
Methods are organized through a MethodsContainer system that groups related methods by their receiver type12:
Method Organization
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ STypeCompanion │
│ (type_code: u8, methods: []const SMethodDesc) │
└───────────────────────┬────────────────────────────────────────┘
│
┌────────────────┼────────────────┬───────────────────┐
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SNumeric │ │ SBox │ │ SColl │ │ SContext │
│ TYPE_CODE=2-6│ │ TYPE_CODE=99 │ │ TYPE_CODE=12 │ │ TYPE_CODE=101│
├──────────────┤ ├──────────────┤ ├──────────────┤ ├──────────────┤
│ toByte (1) │ │ value (1) │ │ size (1) │ │ dataInputs(1)│
│ toShort (2) │ │ propBytes(2) │ │ getOrElse(2) │ │ headers (2) │
│ toInt (3) │ │ bytes (3) │ │ map (3) │ │ preHeader(3) │
│ toLong (4) │ │ id (5) │ │ exists (4) │ │ INPUTS (4) │
│ toBigInt (5) │ │ getReg (7) │ │ forall (5) │ │ OUTPUTS (5) │
│ toBytes (6) │ │ tokens (8) │ │ fold (6) │ │ HEIGHT (6) │
│ ... │ │ ... │ │ ... │ │ ... │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
Method Resolution:
MethodCall(receiver, type_code=99, method_id=1)
│
▼
resolveMethod(99, 1) → SBoxMethods.VALUE
│
▼
method.eval(receiver, args, evaluator)
const MethodsContainer = struct {
type_code: u8,
methods: []const SMethod,
pub fn getMethodById(self: *const MethodsContainer, method_id: u8) ?*const SMethod {
for (self.methods) |*m| {
if (m.method_id == method_id) return m;
}
return null;
}
pub fn getMethodByName(self: *const MethodsContainer, name: []const u8) ?*const SMethod {
for (self.methods) |*m| {
if (std.mem.eql(u8, m.name, name)) return m;
}
return null;
}
};
const SMethod = struct {
obj_type: STypeCompanion,
name: []const u8,
method_id: u8,
tpe: SFunc,
cost_kind: CostKind,
min_version: ?ErgoTreeVersion = null, // v6+ methods
pub fn eval(
self: *const SMethod,
receiver: Any,
args: []const Any,
E: *Evaluator,
) !Any {
// Method dispatch by type_code and method_id
return try evalMethod(
self.obj_type.type_code,
self.method_id,
receiver,
args,
E,
);
}
};
Available Method Containers
| Type | Container | Method Count |
|---|---|---|
| Byte, Short, Int, Long | SNumericMethods | 13 |
| BigInt | SBigIntMethods | 13 |
| UnsignedBigInt (v6+) | SUnsignedBigIntMethods | 13 |
| Boolean | SBooleanMethods | 0 |
| GroupElement | SGroupElementMethods | 4 |
| SigmaProp | SSigmaPropMethods | 2 |
| Box | SBoxMethods | 10 |
| Coll[T] | SCollectionMethods | 20 |
| Option[T] | SOptionMethods | 4 |
| Context | SContextMethods | 12 |
| Header | SHeaderMethods | 16 |
| PreHeader | SPreHeaderMethods | 8 |
| AvlTree | SAvlTreeMethods | 9 |
| Global | SGlobalMethods | 4 |
Numeric Type Methods
All numeric types share common methods34:
const SNumericMethods = struct {
pub const TYPE_CODE = 0; // Varies by actual type
// Conversion methods (v5+)
pub const TO_BYTE = SMethod{
.method_id = 1,
.name = "toByte",
.tpe = SFunc.unary(.{ .type_var = "T" }, .byte),
.cost_kind = .{ .type_based = .{ .primitive = 5, .big_int = 10 } },
};
pub const TO_SHORT = SMethod{ .method_id = 2, .name = "toShort", ... };
pub const TO_INT = SMethod{ .method_id = 3, .name = "toInt", ... };
pub const TO_LONG = SMethod{ .method_id = 4, .name = "toLong", ... };
pub const TO_BIGINT = SMethod{ .method_id = 5, .name = "toBigInt", ... };
// Binary representation (v6+)
pub const TO_BYTES = SMethod{
.method_id = 6,
.name = "toBytes",
.tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 5 },
.min_version = .v3, // v6
};
pub const TO_BITS = SMethod{
.method_id = 7,
.name = "toBits",
.tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.boolean }),
.cost_kind = .{ .fixed = 5 },
.min_version = .v3,
};
// Bitwise operations (v6+)
pub const BITWISE_INVERSE = SMethod{ .method_id = 8, .name = "bitwiseInverse", ... };
pub const BITWISE_OR = SMethod{ .method_id = 9, .name = "bitwiseOr", ... };
pub const BITWISE_AND = SMethod{ .method_id = 10, .name = "bitwiseAnd", ... };
pub const BITWISE_XOR = SMethod{ .method_id = 11, .name = "bitwiseXor", ... };
pub const SHIFT_LEFT = SMethod{ .method_id = 12, .name = "shiftLeft", ... };
pub const SHIFT_RIGHT = SMethod{ .method_id = 13, .name = "shiftRight", ... };
};
Numeric Method Summary
| ID | Method | Signature | v5 | v6 | Description |
|---|---|---|---|---|---|
| 1 | toByte | T => Byte | ✓ | ✓ | Convert (fails on overflow) |
| 2 | toShort | T => Short | ✓ | ✓ | Convert (fails on overflow) |
| 3 | toInt | T => Int | ✓ | ✓ | Convert (fails on overflow) |
| 4 | toLong | T => Long | ✓ | ✓ | Convert (fails on overflow) |
| 5 | toBigInt | T => BigInt | ✓ | ✓ | Convert (always safe) |
| 6 | toBytes | T => Coll[Byte] | - | ✓ | Big-endian bytes |
| 7 | toBits | T => Coll[Bool] | - | ✓ | Bit representation |
| 8 | bitwiseInverse | T => T | - | ✓ | Bitwise NOT |
| 9 | bitwiseOr | (T,T) => T | - | ✓ | Bitwise OR |
| 10 | bitwiseAnd | (T,T) => T | - | ✓ | Bitwise AND |
| 11 | bitwiseXor | (T,T) => T | - | ✓ | Bitwise XOR |
| 12 | shiftLeft | (T,Int) => T | - | ✓ | Left shift |
| 13 | shiftRight | (T,Int) => T | - | ✓ | Arithmetic right shift |
Collection Methods
Collections have the richest method set56:
const SCollectionMethods = struct {
pub const TYPE_CODE = 12;
// Basic access
pub const SIZE = SMethod{
.method_id = 1,
.name = "size",
.tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .int),
.cost_kind = .{ .fixed = 14 },
};
pub const GET_OR_ELSE = SMethod{
.method_id = 2,
.name = "getOrElse",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "T" } },
.int,
.{ .type_var = "T" },
}, .{ .type_var = "T" }),
.cost_kind = .dynamic,
};
// Transformation
pub const MAP = SMethod{
.method_id = 3,
.name = "map",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .{ .type_var = "OV" }),
}, .{ .coll = .{ .type_var = "OV" } }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FILTER = SMethod{
.method_id = 8,
.name = "filter",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .{ .coll = .{ .type_var = "IV" } }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FOLD = SMethod{
.method_id = 6,
.name = "fold",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
.{ .type_var = "OV" },
SFunc.binary(.{ .type_var = "OV" }, .{ .type_var = "IV" }, .{ .type_var = "OV" }),
}, .{ .type_var = "OV" }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
// Predicates
pub const EXISTS = SMethod{
.method_id = 4,
.name = "exists",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .boolean),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FORALL = SMethod{
.method_id = 5,
.name = "forall",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .boolean),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
// Combination
pub const APPEND = SMethod{
.method_id = 9,
.name = "append",
.tpe = SFunc.binary(
.{ .coll = .{ .type_var = "IV" } },
.{ .coll = .{ .type_var = "IV" } },
.{ .coll = .{ .type_var = "IV" } },
),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 100 } },
};
pub const SLICE = SMethod{
.method_id = 10,
.name = "slice",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
.int,
.int,
}, .{ .coll = .{ .type_var = "IV" } }),
.cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 2, .chunk_size = 100 } },
};
pub const ZIP = SMethod{
.method_id = 29,
.name = "zip",
.cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 1, .chunk_size = 10 } },
};
// Index operations
pub const INDICES = SMethod{
.method_id = 14,
.name = "indices",
.tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .{ .coll = &SType.int }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 128 } },
};
pub const INDEX_OF = SMethod{
.method_id = 26,
.name = "indexOf",
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
};
Collection Method Summary
| ID | Method | v5 | v6 | Description |
|---|---|---|---|---|
| 1 | size | ✓ | ✓ | Number of elements |
| 2 | getOrElse | ✓ | ✓ | Element with default |
| 3 | map | ✓ | ✓ | Transform elements |
| 4 | exists | ✓ | ✓ | Any match predicate |
| 5 | forall | ✓ | ✓ | All match predicate |
| 6 | fold | ✓ | ✓ | Reduce to single value |
| 7 | apply | ✓ | ✓ | Element at index (panics if OOB) |
| 8 | filter | ✓ | ✓ | Keep matching elements |
| 9 | append | ✓ | ✓ | Concatenate |
| 10 | slice | ✓ | ✓ | Extract range |
| 14 | indices | ✓ | ✓ | Range 0..size-1 |
| 15 | flatMap | ✓ | ✓ | Map and flatten |
| 19 | patch | ✓ | ✓ | Replace range |
| 20 | updated | ✓ | ✓ | Replace at index |
| 21 | updateMany | ✓ | ✓ | Batch update |
| 26 | indexOf | ✓ | ✓ | Find element index |
| 29 | zip | ✓ | ✓ | Pair with other collection |
| 30 | reverse | - | ✓ | Reverse order |
| 31 | startsWith | - | ✓ | Prefix match |
| 32 | endsWith | - | ✓ | Suffix match |
| 33 | get | - | ✓ | Safe element access (returns Option) |
Box Methods
const SBoxMethods = struct {
pub const TYPE_CODE = 99;
pub const VALUE = SMethod{
.method_id = 1,
.name = "value",
.tpe = SFunc.unary(.box, .long),
.cost_kind = .{ .fixed = 1 },
};
pub const PROPOSITION_BYTES = SMethod{
.method_id = 2,
.name = "propositionBytes",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const BYTES = SMethod{
.method_id = 3,
.name = "bytes",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const ID = SMethod{
.method_id = 5,
.name = "id",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const CREATION_INFO = SMethod{
.method_id = 6,
.name = "creationInfo",
.tpe = SFunc.unary(.box, .{ .tuple = &[_]SType{ .int, .{ .coll = &SType.byte } } }),
.cost_kind = .{ .fixed = 10 },
};
pub const TOKENS = SMethod{
.method_id = 8,
.name = "tokens",
.tpe = SFunc.unary(.box, .{
.coll = &.{ .tuple = &[_]SType{ .{ .coll = &SType.byte }, .long } },
}),
.cost_kind = .{ .fixed = 15 },
};
// Register access: R0-R9
pub const R0 = SMethod{ .method_id = 10, .name = "R0", ... };
pub const R1 = SMethod{ .method_id = 11, .name = "R1", ... };
pub const R2 = SMethod{ .method_id = 12, .name = "R2", ... };
pub const R3 = SMethod{ .method_id = 13, .name = "R3", ... };
pub const R4 = SMethod{ .method_id = 14, .name = "R4", ... };
pub const R5 = SMethod{ .method_id = 15, .name = "R5", ... };
pub const R6 = SMethod{ .method_id = 16, .name = "R6", ... };
pub const R7 = SMethod{ .method_id = 17, .name = "R7", ... };
pub const R8 = SMethod{ .method_id = 18, .name = "R8", ... };
pub const R9 = SMethod{ .method_id = 19, .name = "R9", ... };
};
Context Methods
Access transaction context910:
const SContextMethods = struct {
pub const TYPE_CODE = 101;
pub const DATA_INPUTS = SMethod{
.method_id = 1,
.name = "dataInputs",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 15 },
};
pub const HEADERS = SMethod{
.method_id = 2,
.name = "headers",
.tpe = SFunc.unary(.context, .{ .coll = &SType.header }),
.cost_kind = .{ .fixed = 15 },
};
pub const PRE_HEADER = SMethod{
.method_id = 3,
.name = "preHeader",
.tpe = SFunc.unary(.context, .pre_header),
.cost_kind = .{ .fixed = 10 },
};
pub const INPUTS = SMethod{
.method_id = 4,
.name = "INPUTS",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 10 },
};
pub const OUTPUTS = SMethod{
.method_id = 5,
.name = "OUTPUTS",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 10 },
};
pub const HEIGHT = SMethod{
.method_id = 6,
.name = "HEIGHT",
.tpe = SFunc.unary(.context, .int),
.cost_kind = .{ .fixed = 26 },
};
pub const SELF = SMethod{
.method_id = 7,
.name = "SELF",
.tpe = SFunc.unary(.context, .box),
.cost_kind = .{ .fixed = 10 },
};
pub const GET_VAR = SMethod{
.method_id = 8,
.name = "getVar",
.tpe = SFunc.new(&[_]SType{ .context, .byte }, .{ .option = .{ .type_var = "T" } }),
.cost_kind = .dynamic,
};
};
GroupElement Methods
Elliptic curve operations1112:
const SGroupElementMethods = struct {
pub const TYPE_CODE = 7;
pub const GET_ENCODED = SMethod{
.method_id = 2,
.name = "getEncoded",
.tpe = SFunc.unary(.group_element, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 250 },
};
pub const EXP = SMethod{
.method_id = 3,
.name = "exp",
.tpe = SFunc.binary(.group_element, .big_int, .group_element),
.cost_kind = .{ .fixed = 900 },
};
pub const MULTIPLY = SMethod{
.method_id = 4,
.name = "multiply",
.tpe = SFunc.binary(.group_element, .group_element, .group_element),
.cost_kind = .{ .fixed = 40 },
};
pub const NEGATE = SMethod{
.method_id = 5,
.name = "negate",
.tpe = SFunc.unary(.group_element, .group_element),
.cost_kind = .{ .fixed = 45 },
};
};
SigmaProp Methods
Cryptographic proposition operations13:
const SSigmaPropMethods = struct {
pub const TYPE_CODE = 8;
pub const PROP_BYTES = SMethod{
.method_id = 1,
.name = "propBytes",
.tpe = SFunc.unary(.sigma_prop, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 35 },
};
pub const IS_PROVEN = SMethod{
.method_id = 2,
.name = "isProven",
.tpe = SFunc.unary(.sigma_prop, .boolean),
.cost_kind = .{ .fixed = 10 },
.deprecated = true, // Use in scripts only
};
};
Method Resolution
Method lookup by type code and method ID:
pub fn resolveMethod(type_code: u8, method_id: u8) ?*const SMethod {
const container = switch (type_code) {
2, 3, 4, 5 => &SNumericMethods, // Byte, Short, Int, Long
6 => &SBigIntMethods,
7 => &SGroupElementMethods,
8 => &SSigmaPropMethods,
9 => &SUnsignedBigIntMethods, // v6+
12...23 => &SCollectionMethods, // Coll[T]
36...47 => &SOptionMethods, // Option[T]
99 => &SBoxMethods,
100 => &SAvlTreeMethods,
101 => &SContextMethods,
104 => &SHeaderMethods,
105 => &SPreHeaderMethods,
106 => &SGlobalMethods,
else => return null,
};
return container.getMethodById(method_id);
}
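A hypothetical lookup, assuming the containers sketched above are fully populated (`SBoxMethods` with `VALUE` as method 1) and `std` is imported:

```zig
test "resolve Box.value by type code and method id" {
    // Box has type code 99; method 1 is `value`
    const m = resolveMethod(99, 1) orelse return error.MethodNotFound;
    try std.testing.expectEqualStrings("value", m.name);
    try std.testing.expectEqual(@as(u8, 1), m.method_id);
}
```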
Method Call Evaluation
const MethodCall = struct {
receiver_type: SType,
method: *const SMethod,
receiver: *const Value,
args: []const *Value,
pub fn eval(self: *const MethodCall, env: *const DataEnv, E: *Evaluator) !Any {
// Evaluate receiver
const recv = try self.receiver.eval(env, E);
// Evaluate arguments
var arg_values = try E.allocator.alloc(Any, self.args.len);
for (self.args, 0..) |arg, i| {
arg_values[i] = try arg.eval(env, E);
}
// Add cost
E.addCost(self.method.cost_kind, self.method.op_code);
// Dispatch to method implementation
return try self.method.eval(recv, arg_values, E);
}
};
Summary
This chapter covered the method system that extends ErgoTree types with rich APIs:
- `MethodsContainer` organizes methods per type, with each method having a unique ID (1-255) within its container
- Method resolution uses the receiver's type code and the method ID to locate the implementation, avoiding opcode space consumption
- Numeric methods provide type conversions (`toByte`, `toInt`, `toLong`, `toBigInt`) shared across all numeric types, with v6 adding bitwise operations and byte representation
- Collection methods form the richest API with transformation (`map`, `filter`, `fold`), predicates (`exists`, `forall`), and combination operations (`append`, `slice`, `zip`)
- Box methods access UTXO properties: `value` (nanoERGs), `tokens`, `propositionBytes`, and registers R0-R9
- Context methods provide access to transaction data: `INPUTS`, `OUTPUTS`, `HEIGHT`, `SELF`, `dataInputs`, `headers`, and context variables via `getVar`
- Cryptographic methods on `GroupElement` support elliptic curve operations (`exp`, `multiply`, `negate`), and `SigmaProp` provides `propBytes` for serialization
Next: Chapter 7: Serialization Framework
Scala: methods.scala
Rust: smethod.rs:36-99 (SMethod, SMethodDesc)
Scala: methods.scala:232-500
Rust: snumeric.rs
Scala: methods.scala:805-1260
Rust: scoll.rs:22-266 (METHOD_DESC, method IDs)
Scala: methods.scala (SBoxMethods)
Rust: sbox.rs:29-92 (VALUE_METHOD, GET_REG_METHOD, TOKENS_METHOD)
Scala: methods.scala (SContextMethods)
Rust: scontext.rs
Scala: methods.scala (SGroupElementMethods)
Rust: sgroup_elem.rs
Scala: methods.scala (SSigmaPropMethods)
Chapter 7: Serialization Framework
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Binary encoding concepts (bits, bytes, big-endian vs little-endian)
- Familiarity with variable-length encoding techniques and their space-efficiency trade-offs
- Prior chapters: Chapter 2 for type codes, Chapter 5 for opcodes
Learning Objectives
By the end of this chapter, you will be able to:
- Explain VLQ (Variable-Length Quantity) encoding and how it achieves compact integer representation
- Describe ZigZag encoding and why it improves VLQ efficiency for signed integers
- Implement type serialization using the type code embedding scheme
- Use `SigmaByteReader` and `SigmaByteWriter` for type-aware serialization
Serialization Architecture
Blockchain storage is expensive—every byte of an ErgoTree increases transaction fees and network bandwidth. The serialization framework therefore prioritizes compactness while maintaining determinism (identical inputs must produce identical outputs across all implementations). The system uses a layered design where each layer handles a specific concern12:
┌─────────────────────────────────────────────────┐
│ Application Layer │
│ (ErgoTree, Box, Transaction) │
├─────────────────────────────────────────────────┤
│ Value Serializers │
│ (ConstantSerializer, MethodCall) │
├─────────────────────────────────────────────────┤
│ SigmaByteReader/Writer │
│ (Type-aware, constant store) │
├─────────────────────────────────────────────────┤
│ VLQ Encoding Layer │
│ (Variable-length integers) │
├─────────────────────────────────────────────────┤
│ Byte Buffer I/O │
│ (Raw read/write operations) │
└─────────────────────────────────────────────────┘
Base Serializer Interface
const SigmaSerializer = struct {
pub const MAX_PROPOSITION_SIZE: usize = 4096;
pub const MAX_TREE_DEPTH: u32 = 110;
pub fn toBytes(comptime T: type, obj: T, allocator: Allocator) ![]u8 {
var list = std.ArrayList(u8).init(allocator);
var writer = SigmaByteWriter.init(&list);
try T.serialize(obj, &writer);
return list.toOwnedSlice();
}
pub fn fromBytes(comptime T: type, bytes: []const u8) !T {
var reader = SigmaByteReader.init(bytes);
return try T.deserialize(&reader);
}
};
VLQ Encoding
Variable-Length Quantity (VLQ) represents integers compactly34:
Value Range Bytes Format
─────────────────────────────────────────────────
0 - 127 1 0xxxxxxx
128 - 16,383 2 1xxxxxxx 0xxxxxxx
16,384 - 2,097,151 3 1xxxxxxx 1xxxxxxx 0xxxxxxx
2,097,152 - 268,435,455 4 1xxxxxxx 1xxxxxxx 1xxxxxxx 0xxxxxxx
... ... ...
Each byte uses 7 bits for data; MSB is continuation flag:
- `0` = final byte
- `1` = more bytes follow
VLQ Implementation
const VlqEncoder = struct {
/// Write unsigned integer using VLQ encoding
pub fn putUInt(writer: anytype, value: u64) !void {
var v = value;
while (v >= 0x80) { // more than 7 bits remain
try writer.writeByte(@intCast((v & 0x7F) | 0x80));
v >>= 7;
}
try writer.writeByte(@intCast(v & 0x7F));
}
/// Read unsigned integer using VLQ decoding
/// Maximum 10 bytes for u64 (ceil(64/7) = 10)
pub fn getUInt(reader: anytype) !u64 {
var result: u64 = 0;
var shift: u7 = 0; // u7: shift reaches 70 before the loop exits, which would overflow u6
while (shift < 70) : (shift += 7) { // at most ceil(64/7) = 10 bytes
const b = try reader.readByte();
result |= @as(u64, b & 0x7F) << @intCast(shift);
if ((b & 0x80) == 0) return result;
}
return error.VlqTooLong; // 10 bytes read without a terminating byte
}
};
// NOTE: In production, VLQ decoding should use compile-time assertions to
// verify max byte counts. See ZIGMA_STYLE.md for bounded iteration patterns.
VLQ Size by Value Range
Unsigned Value Bytes
─────────────────────────────────
0 - 127 1
128 - 16,383 2
16,384 - 2,097,151 3
2,097,152 - 268M 4
268M - 34B 5
34B - 4T 6
4T - 562T 7
562T - 72P 8
72P - 9E 9
> 9E 10
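Tracing `putUInt` on the value 300 makes the layout concrete: the low 7 bits go out first with the continuation flag set, then the remaining bits in a final byte with the flag clear. This is illustrative test code, assuming `std` is imported and `VlqEncoder` above is in scope:

```zig
// 300 = 0b0000_0010_0010_1100
//   low 7 bits  = 010_1100 (0x2C) -> written first, MSB set:   0xAC
//   next 7 bits = 000_0010 (0x02) -> written last,  MSB clear: 0x02
test "VLQ encoding of 300" {
    var buf = std.ArrayList(u8).init(std.testing.allocator);
    defer buf.deinit();
    try VlqEncoder.putUInt(buf.writer(), 300);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0xAC, 0x02 }, buf.items);
}
```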
ZigZag Encoding
VLQ assumes non-negative values—it encodes the magnitude directly. For signed integers like -1, the two's complement representation has all high bits set, resulting in maximum VLQ length. ZigZag encoding solves this by mapping signed values to unsigned in a way that preserves magnitude: small positive and negative numbers both produce small unsigned values56:
Signed ZigZag Encoded
────────────────────────
0 0
-1 1
1 2
-2 3
2 4
-3 5
n          2n        (n >= 0)
-n         2n-1      (n > 0)
ZigZag Implementation
const ZigZag = struct {
/// Encode signed 32-bit to unsigned
pub fn encode32(n: i32) u64 {
// Work in unsigned 32-bit: sign-extending to i64 before bitcasting would
// corrupt the result whenever the shift carries into the sign bit.
const u: u32 = @bitCast(n);
const sign: u32 = @bitCast(n >> 31); // arithmetic right shift replicates sign bit
return (u << 1) ^ sign;
}
/// Decode unsigned back to signed 32-bit
pub fn decode32(n: u64) i32 {
const v: u32 = @intCast(n);
return @as(i32, @intCast(v >> 1)) ^ -@as(i32, @intCast(v & 1));
}
/// Encode signed 64-bit to unsigned
pub fn encode64(n: i64) u64 {
return @bitCast((n << 1) ^ (n >> 63));
}
/// Decode unsigned back to signed 64-bit
pub fn decode64(n: u64) i64 {
return @as(i64, @intCast(n >> 1)) ^ -@as(i64, @intCast(n & 1));
}
};
ZigZag ensures small-magnitude signed values use few bytes:
Value ZigZag VLQ Bytes
─────────────────────────────
0 0 1
-1 1 1
1 2 1
-64 127 1
64 128 2
-65 129 2
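The mapping and its round-trip can be verified mechanically. A Python sketch (the 64-bit mask is an assumption to emulate fixed-width wrap-around, since Python integers are arbitrary-precision):

```python
MASK64 = (1 << 64) - 1

def zigzag_encode(n: int) -> int:
    # (n << 1) ^ (n >> 63): the arithmetic shift replicates the sign bit
    return ((n << 1) ^ (n >> 63)) & MASK64

def zigzag_decode(u: int) -> int:
    return (u >> 1) ^ -(u & 1)

assert [zigzag_encode(n) for n in (0, -1, 1, -2, 2)] == [0, 1, 2, 3, 4]
assert zigzag_encode(-64) == 127 and zigzag_encode(64) == 128
assert all(zigzag_decode(zigzag_encode(n)) == n for n in range(-1000, 1000))
```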
SigmaByteWriter
The writer handles type-aware serialization with cost tracking[7][8]:
const SigmaByteWriter = struct {
buffer: *std.ArrayList(u8),
constant_store: ?*ConstantStore,
tree_version: ErgoTreeVersion,
pub fn init(buffer: *std.ArrayList(u8)) SigmaByteWriter {
return .{
.buffer = buffer,
.constant_store = null,
.tree_version = .v0,
};
}
/// Write single byte
pub fn putByte(self: *SigmaByteWriter, b: u8) !void {
try self.buffer.append(b);
}
/// Write byte slice
pub fn putBytes(self: *SigmaByteWriter, bytes: []const u8) !void {
try self.buffer.appendSlice(bytes);
}
/// Write unsigned integer (VLQ encoded)
pub fn putUInt(self: *SigmaByteWriter, value: u64) !void {
try VlqEncoder.putUInt(self.buffer.writer(), value);
}
/// Write signed short (ZigZag + VLQ)
pub fn putShort(self: *SigmaByteWriter, value: i16) !void {
try self.putUInt(ZigZag.encode32(value));
}
/// Write signed int (ZigZag + VLQ)
pub fn putInt(self: *SigmaByteWriter, value: i32) !void {
try self.putUInt(ZigZag.encode32(value));
}
/// Write signed long (ZigZag + VLQ)
pub fn putLong(self: *SigmaByteWriter, value: i64) !void {
try self.putUInt(ZigZag.encode64(value));
}
/// Write type descriptor
pub fn putType(self: *SigmaByteWriter, tpe: SType) !void {
try TypeSerializer.serialize(tpe, self);
}
/// Write value with optional constant extraction
pub fn putValue(self: *SigmaByteWriter, value: *const Value) !void {
if (self.constant_store) |store| {
if (value.isConstant()) {
const idx = try store.put(value.asConstant());
try self.putByte(OpCode.ConstantPlaceholder.value);
try self.putUInt(idx);
return;
}
}
try ValueSerializer.serialize(value, self);
}
};
SigmaByteReader
The reader provides type-aware deserialization[9][10]:
const SigmaByteReader = struct {
data: []const u8,
pos: usize,
constant_store: ConstantStore,
substitute_placeholders: bool,
val_def_type_store: ValDefTypeStore,
tree_version: ErgoTreeVersion,
pub fn init(data: []const u8) SigmaByteReader {
return .{
.data = data,
.pos = 0,
.constant_store = ConstantStore.empty(),
.substitute_placeholders = false,
.val_def_type_store = ValDefTypeStore.init(),
.tree_version = .v0,
};
}
pub fn initWithStore(data: []const u8, store: ConstantStore) SigmaByteReader {
var reader = init(data);
reader.constant_store = store;
reader.substitute_placeholders = true;
return reader;
}
/// Read single byte
pub fn getByte(self: *SigmaByteReader) !u8 {
if (self.pos >= self.data.len) return error.EndOfStream;
const b = self.data[self.pos];
self.pos += 1;
return b;
}
/// Read byte slice
pub fn getBytes(self: *SigmaByteReader, n: usize) ![]const u8 {
if (self.pos + n > self.data.len) return error.EndOfStream;
const slice = self.data[self.pos..][0..n];
self.pos += n;
return slice;
}
/// Read unsigned integer (VLQ)
pub fn getUInt(self: *SigmaByteReader) !u64 {
return VlqEncoder.getUInt(self);
}
/// Read signed short (VLQ + ZigZag)
pub fn getShort(self: *SigmaByteReader) !i16 {
const v = try self.getUInt();
const decoded = ZigZag.decode32(v);
// Reject values outside i16 range rather than panicking on @intCast
if (decoded < std.math.minInt(i16) or decoded > std.math.maxInt(i16)) return error.OutOfRange;
return @intCast(decoded);
}
/// Read signed int (VLQ + ZigZag)
pub fn getInt(self: *SigmaByteReader) !i32 {
return ZigZag.decode32(try self.getUInt());
}
/// Read signed long (VLQ + ZigZag)
pub fn getLong(self: *SigmaByteReader) !i64 {
return ZigZag.decode64(try self.getUInt());
}
/// Read type descriptor
pub fn getType(self: *SigmaByteReader) !SType {
return TypeSerializer.deserialize(self);
}
/// Read value expression
pub fn getValue(self: *SigmaByteReader) !*Value {
return ValueSerializer.deserialize(self);
}
/// Remaining bytes available
pub fn remaining(self: *const SigmaByteReader) usize {
return self.data.len - self.pos;
}
// Reader interface for VLQ
pub fn readByte(self: *SigmaByteReader) !u8 {
return self.getByte();
}
};
Constant Store
Manages constants during ErgoTree serialization[11]:
const ConstantStore = struct {
constants: []const Constant,
extracted: std.ArrayList(Constant),
pub fn empty() ConstantStore {
return .{
.constants = &.{},
.extracted = undefined, // never appended to; use init() when extracting
};
}
pub fn init(constants: []const Constant, allocator: Allocator) ConstantStore {
return .{
.constants = constants,
.extracted = std.ArrayList(Constant).init(allocator),
};
}
/// Get constant by index
pub fn get(self: *const ConstantStore, index: usize) !Constant {
if (index >= self.constants.len) return error.IndexOutOfBounds;
return self.constants[index];
}
/// Store constant during extraction, return index
pub fn put(self: *ConstantStore, c: Constant) !u32 {
const idx = self.extracted.items.len;
try self.extracted.append(c);
return @intCast(idx);
}
};
Type Serialization
Types use a compact encoding scheme based on type codes[12][13]:
Type Code Space
───────────────────────────────────────────────────────────
1-11 Primitive embeddable types
12-23 Coll[primitive] (12 + primCode)
24-35 Coll[Coll[primitive]] (24 + primCode)
36-47 Option[primitive] (36 + primCode)
48-59 Option[Coll[primitive]] (48 + primCode)
60-71 (primitive, T2) pairs (60 + primCode)
72-83 (T1, primitive) pairs (72 + primCode)
84-95 (primitive, primitive) (84 + primCode) symmetric
96 Tuple (generic)
97-106 Object types (Any, Unit, Box, ...)
112 SFunc (v6+)
Type Code Constants
const TypeCode = struct {
value: u8,
// Primitive types (embeddable)
pub const BOOLEAN: u8 = 1;
pub const BYTE: u8 = 2;
pub const SHORT: u8 = 3;
pub const INT: u8 = 4;
pub const LONG: u8 = 5;
pub const BIGINT: u8 = 6;
pub const GROUP_ELEMENT: u8 = 7;
pub const SIGMA_PROP: u8 = 8;
pub const UNSIGNED_BIGINT: u8 = 9;
// Type constructor bases
pub const MAX_PRIM: u8 = 11;
pub const PRIM_RANGE: u8 = 12; // MAX_PRIM + 1
pub const COLL: u8 = 12;
pub const NESTED_COLL: u8 = 24;
pub const OPTION: u8 = 36;
pub const OPTION_COLL: u8 = 48;
pub const TUPLE_PAIR1: u8 = 60;
pub const TUPLE_PAIR2: u8 = 72;
pub const TUPLE_SYMMETRIC: u8 = 84;
pub const TUPLE: u8 = 96;
// Object types
pub const ANY: u8 = 97;
pub const UNIT: u8 = 98;
pub const BOX: u8 = 99;
pub const AVL_TREE: u8 = 100;
pub const CONTEXT: u8 = 101;
pub const STRING: u8 = 102;
pub const TYPE_VAR: u8 = 103;
pub const HEADER: u8 = 104;
pub const PRE_HEADER: u8 = 105;
pub const GLOBAL: u8 = 106;
pub const FUNC: u8 = 112;
/// Embed primitive type into container code
pub fn embed(container_base: u8, prim_code: u8) u8 {
return container_base + prim_code;
}
/// Extract container and primitive from combined code
pub fn unpack(code: u8) struct { container: ?u8, primitive: ?u8 } {
if (code >= TUPLE) return .{ .container = null, .primitive = null };
const container_id = (code / PRIM_RANGE) * PRIM_RANGE;
const type_id = code % PRIM_RANGE;
return .{
.container = if (container_id == 0) null else container_id,
.primitive = if (type_id == 0) null else type_id,
};
}
};
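The decomposition rule in unpack can be checked against the code-space table; a minimal Python re-derivation (constants copied from above):

```python
PRIM_RANGE = 12   # MAX_PRIM + 1
TUPLE = 96

def unpack(code: int):
    # Mirrors TypeCode.unpack: split a combined code into container base
    # and embedded primitive code
    if code >= TUPLE:
        return (None, None)
    container = (code // PRIM_RANGE) * PRIM_RANGE
    prim = code % PRIM_RANGE
    return (container or None, prim or None)

assert unpack(4) == (None, 4)    # bare SInt
assert unpack(16) == (12, 4)     # Coll[Int]    = COLL base + INT
assert unpack(41) == (36, 5)     # Option[Long] = OPTION base + LONG
assert unpack(12) == (12, None)  # Coll whose element type follows separately
```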
Type Serializer
const TypeSerializer = struct {
pub fn serialize(tpe: SType, w: *SigmaByteWriter) !void {
switch (tpe) {
// Primitives - single byte
.boolean => try w.putByte(TypeCode.BOOLEAN),
.byte => try w.putByte(TypeCode.BYTE),
.short => try w.putByte(TypeCode.SHORT),
.int => try w.putByte(TypeCode.INT),
.long => try w.putByte(TypeCode.LONG),
.big_int => try w.putByte(TypeCode.BIGINT),
.group_element => try w.putByte(TypeCode.GROUP_ELEMENT),
.sigma_prop => try w.putByte(TypeCode.SIGMA_PROP),
.unsigned_big_int => try w.putByte(TypeCode.UNSIGNED_BIGINT),
// Object types
.box => try w.putByte(TypeCode.BOX),
.avl_tree => try w.putByte(TypeCode.AVL_TREE),
.context => try w.putByte(TypeCode.CONTEXT),
.header => try w.putByte(TypeCode.HEADER),
.pre_header => try w.putByte(TypeCode.PRE_HEADER),
.global => try w.putByte(TypeCode.GLOBAL),
.unit => try w.putByte(TypeCode.UNIT),
.any => try w.putByte(TypeCode.ANY),
// Collections
.coll => |elem| {
if (elem.isEmbeddable()) {
// Single byte: Coll[primitive]
try w.putByte(TypeCode.embed(TypeCode.COLL, elem.typeCode()));
} else if (elem.* == .coll) {
const inner = elem.coll;
if (inner.isEmbeddable()) {
// Single byte: Coll[Coll[primitive]]
try w.putByte(TypeCode.embed(TypeCode.NESTED_COLL, inner.typeCode()));
} else {
try w.putByte(TypeCode.COLL);
try serialize(elem.*, w);
}
} else {
try w.putByte(TypeCode.COLL);
try serialize(elem.*, w);
}
},
// Options
.option => |elem| {
if (elem.isEmbeddable()) {
try w.putByte(TypeCode.embed(TypeCode.OPTION, elem.typeCode()));
} else if (elem.* == .coll) {
const inner = elem.coll;
if (inner.isEmbeddable()) {
try w.putByte(TypeCode.embed(TypeCode.OPTION_COLL, inner.typeCode()));
} else {
try w.putByte(TypeCode.OPTION);
try serialize(elem.*, w);
}
} else {
try w.putByte(TypeCode.OPTION);
try serialize(elem.*, w);
}
},
// Tuples (pairs)
.tuple => |items| {
if (items.len == 2) {
try serializePair(items[0], items[1], w);
} else {
try w.putByte(TypeCode.TUPLE);
try w.putByte(@intCast(items.len));
for (items) |item| {
try serialize(item, w);
}
}
},
// Functions (v6+)
.func => |f| {
try w.putByte(TypeCode.FUNC);
try w.putByte(@intCast(f.t_dom.len));
for (f.t_dom) |arg| try serialize(arg, w);
try serialize(f.t_range.*, w);
try w.putByte(@intCast(f.tpe_params.len));
for (f.tpe_params) |p| {
try w.putByte(TypeCode.TYPE_VAR);
try w.putBytes(p.name);
}
},
else => return error.UnsupportedType,
}
}
fn serializePair(t1: SType, t2: SType, w: *SigmaByteWriter) !void {
const e1 = t1.isEmbeddable();
const e2 = t2.isEmbeddable();
if (e1 and e2 and std.meta.eql(t1, t2)) {
// Symmetric pair: (Int, Int)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_SYMMETRIC, t1.typeCode()));
} else if (e1) {
// First is primitive: (Int, T)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR1, t1.typeCode()));
try serialize(t2, w);
} else if (e2) {
// Second is primitive: (T, Int)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR2, t2.typeCode()));
try serialize(t1, w);
} else {
// Both non-primitive
try w.putByte(TypeCode.TUPLE_PAIR1);
try serialize(t1, w);
try serialize(t2, w);
}
}
pub fn deserialize(r: *SigmaByteReader) !SType {
const c = try r.getByte();
return parseWithTag(r, c);
}
fn parseWithTag(r: *SigmaByteReader, c: u8) !SType {
if (c < TypeCode.TUPLE) {
const unpacked = TypeCode.unpack(c);
const elem_type = if (unpacked.primitive) |p|
try getEmbeddableType(p, r.tree_version)
else
try deserialize(r);
if (unpacked.container) |container| {
// NOTE: the &elem_type pointers below are illustrative; a real
// implementation must give child types stable lifetimes (e.g. arena-allocate)
return switch (container) {
TypeCode.COLL => .{ .coll = &elem_type },
TypeCode.NESTED_COLL => .{ .coll = &SType{ .coll = &elem_type } },
TypeCode.OPTION => .{ .option = &elem_type },
TypeCode.OPTION_COLL => .{ .option = &SType{ .coll = &elem_type } },
TypeCode.TUPLE_PAIR1 => blk: {
const t2 = try deserialize(r);
break :blk .{ .tuple = &[_]SType{ elem_type, t2 } };
},
TypeCode.TUPLE_PAIR2 => blk: {
const t1 = try deserialize(r);
break :blk .{ .tuple = &[_]SType{ t1, elem_type } };
},
TypeCode.TUPLE_SYMMETRIC => .{ .tuple = &[_]SType{ elem_type, elem_type } },
else => return error.InvalidTypeCode,
};
}
return elem_type;
}
return switch (c) {
TypeCode.TUPLE => blk: {
const len = try r.getByte();
var items: [8]SType = undefined; // illustrative fixed buffer
if (len > items.len) return error.TupleTooLong;
for (0..len) |i| items[i] = try deserialize(r);
break :blk .{ .tuple = items[0..len] };
},
TypeCode.ANY => .any,
TypeCode.UNIT => .unit,
TypeCode.BOX => .box,
TypeCode.AVL_TREE => .avl_tree,
TypeCode.CONTEXT => .context,
TypeCode.HEADER => .header,
TypeCode.PRE_HEADER => .pre_header,
TypeCode.GLOBAL => .global,
TypeCode.FUNC => blk: {
if (r.tree_version.value < 3) return error.UnsupportedVersion;
const dom_len = try r.getByte();
var t_dom: [255]SType = undefined;
for (0..dom_len) |i| t_dom[i] = try deserialize(r);
const t_range = try deserialize(r);
// ... parse tpe_params
break :blk .{ .func = undefined }; // Simplified
},
else => error.InvalidTypeCode,
};
}
fn getEmbeddableType(code: u8, version: ErgoTreeVersion) !SType {
return switch (code) {
TypeCode.BOOLEAN => .boolean,
TypeCode.BYTE => .byte,
TypeCode.SHORT => .short,
TypeCode.INT => .int,
TypeCode.LONG => .long,
TypeCode.BIGINT => .big_int,
TypeCode.GROUP_ELEMENT => .group_element,
TypeCode.SIGMA_PROP => .sigma_prop,
TypeCode.UNSIGNED_BIGINT => blk: {
if (version.value < 3) return error.UnsupportedVersion;
break :blk .unsigned_big_int;
},
else => error.InvalidTypeCode,
};
}
};
Encoding Examples
Example: Encode 300 as VLQ
300 = 0x12C = 0b100101100
Step 1: Take low 7 bits, set continuation: 0x2C | 0x80 = 0xAC
Step 2: Shift right 7: 300 >> 7 = 2
Step 3: Take low 7 bits, no continuation: 0x02
Result: [0xAC, 0x02]
Example: Encode -5 as ZigZag + VLQ
ZigZag(-5) = (-5 << 1) ^ (-5 >> 31)
= -10 ^ -1
= 9
VLQ(9) = [0x09] (fits in 7 bits)
Example: Serialize Coll[Int]
Coll[Int] → single byte
→ TypeCode.COLL + TypeCode.INT
→ 12 + 4 = 16 = 0x10
Example: Serialize (Int, Long)
(Int, Long) → TUPLE_PAIR1 + INT, then Long
→ 60 + 4 = 64, then 5
→ [0x40, 0x05]
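These four examples can be re-derived end to end; a self-contained Python check (function names are illustrative):

```python
def vlq(value: int) -> bytes:
    # VLQ: 7 data bits per byte, high bit is the continuation flag
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)

def zigzag32(n: int) -> int:
    # ZigZag for 32-bit signed values, masked to emulate fixed width
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF

assert vlq(300) == bytes([0xAC, 0x02])      # Example 1
assert vlq(zigzag32(-5)) == bytes([0x09])   # Example 2
assert 12 + 4 == 0x10                       # Example 3: Coll[Int]
assert (60 + 4, 5) == (0x40, 0x05)          # Example 4: (Int, Long)
```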
Summary
This chapter covered the serialization framework that enables compact, deterministic encoding of ErgoTree structures:
- VLQ (Variable-Length Quantity) encoding represents integers using 7 data bits per byte with a continuation flag, achieving compact representation where small values use fewer bytes
- ZigZag encoding transforms signed integers to unsigned before VLQ encoding, ensuring small-magnitude values (positive or negative) remain compact
- Type code embedding packs common type patterns (like Coll[Int] or Option[Long]) into single bytes by combining container and primitive codes
- SigmaByteWriter provides type-aware serialization with optional constant extraction for segregated constant trees
- SigmaByteReader manages deserialization state including constant stores for placeholder resolution and version tracking
- The type code space (0-112) is partitioned to enable single-byte encoding for primitives, nested collections, options, and pairs
Next: Chapter 8: Value Serializers
Scala: SigmaSerializer.scala:24-60
Rust: serializable.rs
Scala: VLQByteBufferWriter.scala
Rust: vlq_encode.rs:94-112
Scala: (via scorex-util ZigZag implementation)
Rust: zig_zag_encode.rs:12-40
Scala: SigmaByteWriter.scala
Scala: SigmaByteReader.scala
Rust: constant_store.rs
Scala: TypeSerializer.scala
Rust: types.rs:18-160
Chapter 8: Value Serializers
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 7 for VLQ encoding, type serialization, and SigmaByteReader/SigmaByteWriter
- Chapter 4 for the Value hierarchy and expression node types
- Chapter 5 for the opcode space and operation categories
Learning Objectives
By the end of this chapter, you will be able to:
- Explain opcode-based serialization dispatch and how it enables extensibility
- Implement value serializers following common patterns (binary, unary, nullary, collection)
- Describe constant extraction and placeholder substitution for segregated constant trees
- Handle type inference during deserialization using ValDefTypeStore
Serialization Architecture
Chapter 7 covered the low-level encoding primitives (VLQ, ZigZag, type codes). This chapter builds on that foundation to show how entire expression trees are serialized. The key insight is that each expression's opcode determines its serialization format, enabling a registry-based dispatch pattern that scales to hundreds of operation types[1][2].
Expression Serialization Flow
─────────────────────────────────────────────────────────
┌─────────────────┐
│ Expression │
└────────┬────────┘
│
┌────────────┴────────────┐
│ Is Constant? │
└────────────┬────────────┘
┌─────┴─────┐
│ Yes │ No
▼ ▼
┌───────────────┐ ┌───────────────┐
│ Extract to │ │ Get OpCode │
│ Store or │ │ Write OpCode │
│ Write Inline │ │ Serialize Body│
└───────────────┘ └───────────────┘
Serializer Registry
All serializers are registered in a sparse array indexed by opcode[3][4]:
const ValueSerializer = struct {
/// Sparse array of serializers indexed by opcode
serializers: [256]?*const Serializer,
pub fn init() ValueSerializer {
var self = ValueSerializer{ .serializers = [_]?*const Serializer{null} ** 256 };
// Constants
self.register(OpCode.Constant, &ConstantSerializer);
self.register(OpCode.ConstantPlaceholder, &ConstantPlaceholderSerializer);
// Tuples
self.register(OpCode.Tuple, &TupleSerializer);
self.register(OpCode.SelectField, &SelectFieldSerializer);
// Relations
self.register(OpCode.GT, &BinOpSerializer);
self.register(OpCode.GE, &BinOpSerializer);
self.register(OpCode.LT, &BinOpSerializer);
self.register(OpCode.LE, &BinOpSerializer);
self.register(OpCode.EQ, &BinOpSerializer);
self.register(OpCode.NEQ, &BinOpSerializer);
// Logical
self.register(OpCode.BinAnd, &BinOpSerializer);
self.register(OpCode.BinOr, &BinOpSerializer);
self.register(OpCode.BinXor, &BinOpSerializer);
// Arithmetic
self.register(OpCode.Plus, &BinOpSerializer);
self.register(OpCode.Minus, &BinOpSerializer);
self.register(OpCode.Multiply, &BinOpSerializer);
self.register(OpCode.Division, &BinOpSerializer);
self.register(OpCode.Modulo, &BinOpSerializer);
// Context
self.register(OpCode.Height, &NullarySerializer);
self.register(OpCode.Self, &NullarySerializer);
self.register(OpCode.Inputs, &NullarySerializer);
self.register(OpCode.Outputs, &NullarySerializer);
self.register(OpCode.Context, &NullarySerializer);
self.register(OpCode.Global, &NullarySerializer);
// Collections
self.register(OpCode.Coll, &CollectionSerializer);
self.register(OpCode.CollBoolConst, &BoolCollectionSerializer);
self.register(OpCode.Map, &MapSerializer);
self.register(OpCode.Filter, &FilterSerializer);
self.register(OpCode.Fold, &FoldSerializer);
// Method calls
self.register(OpCode.PropertyCall, &PropertyCallSerializer);
self.register(OpCode.MethodCall, &MethodCallSerializer);
return self;
}
fn register(self: *ValueSerializer, opcode: OpCode, serializer: *const Serializer) void {
self.serializers[opcode.value] = serializer;
}
pub fn getSerializer(self: *const ValueSerializer, opcode: OpCode) !*const Serializer {
return self.serializers[opcode.value] orelse error.UnknownOpCode;
}
};
Serialization Dispatch
Serialize Expression
pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
switch (expr.*) {
.constant => |c| {
if (w.constant_store) |store| {
// Extract constant to store, write placeholder
const idx = try store.put(c);
try w.putByte(OpCode.ConstantPlaceholder.value);
try w.putUInt(idx);
} else {
// Write constant inline (type + value)
try ConstantSerializer.serialize(c, w);
}
},
else => {
const opcode = expr.opCode();
try w.putByte(opcode.value); // Write opcode first
const ser = registry.getSerializer(opcode) catch return error.UnknownOpCode;
try ser.serialize(expr, w); // Then serialize body
},
}
}
Deserialize Expression
pub fn deserialize(r: *SigmaByteReader) !Expr {
const tag = try r.getByte();
// Look-ahead: constants have type codes 1-112
if (tag <= OpCode.LAST_CONSTANT_CODE) {
return .{ .constant = try ConstantSerializer.deserializeWithTag(r, tag) };
}
const opcode = OpCode{ .value = tag };
const ser = registry.getSerializer(opcode) catch {
return error.UnknownOpCode;
};
return ser.deserialize(r);
}
Constant Serialization
Constants are serialized as type followed by value[5][6]:
const ConstantSerializer = struct {
pub fn serialize(c: Constant, w: *SigmaByteWriter) !void {
try TypeSerializer.serialize(c.tpe, w); // 1. Type
try DataSerializer.serialize(c.value, c.tpe, w); // 2. Value
}
pub fn deserialize(r: *SigmaByteReader) !Constant {
const tag = try r.getByte();
return deserializeWithTag(r, tag);
}
pub fn deserializeWithTag(r: *SigmaByteReader, tag: u8) !Constant {
const tpe = try TypeSerializer.parseWithTag(r, tag);
const value = try DataSerializer.deserialize(tpe, r);
return Constant{ .tpe = tpe, .value = value };
}
};
Constant Placeholder
When constant segregation is enabled, constants become placeholders[7]:
const ConstantPlaceholderSerializer = struct {
pub fn serialize(ph: ConstantPlaceholder, w: *SigmaByteWriter) !void {
try w.putUInt(ph.index); // Just the index
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
if (r.substitute_placeholders) {
// Return actual constant from store
const c = try r.constant_store.get(@intCast(id));
return .{ .constant = c };
} else {
// Return placeholder (for template extraction)
const tpe = (try r.constant_store.get(@intCast(id))).tpe;
return .{ .constant_placeholder = .{ .index = @intCast(id), .tpe = tpe } };
}
}
};
Common Serializer Patterns
BinOp Serializer (Two Arguments)
For binary operations like arithmetic and comparisons[8]:
const BinOpSerializer = struct {
pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
const binop = expr.asBinOp();
try ValueSerializer.serialize(binop.left, w); // Left operand
try ValueSerializer.serialize(binop.right, w); // Right operand
}
pub fn deserialize(r: *SigmaByteReader, kind: BinOp.Kind) !Expr {
const left = try ValueSerializer.deserialize(r);
const right = try ValueSerializer.deserialize(r);
return .{ .bin_op = .{
.kind = kind,
.left = &left,
.right = &right,
} };
}
};
Unary Serializer (One Argument)
For single-input transformations:
const UnarySerializer = struct {
pub fn serialize(input: *const Expr, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(input, w);
}
pub fn deserialize(r: *SigmaByteReader) !*const Expr {
return try ValueSerializer.deserialize(r);
}
};
Nullary Serializer (No Body)
For singletons where opcode is sufficient:
const NullarySerializer = struct {
pub fn serialize(_: *const Expr, _: *SigmaByteWriter) !void {
// Nothing to write - opcode is enough
}
pub fn deserialize(r: *SigmaByteReader, opcode: OpCode) !Expr {
_ = r;
return switch (opcode) {
.Height => .{ .global_var = .height },
.Self => .{ .global_var = .self_box },
.Inputs => .{ .global_var = .inputs },
.Outputs => .{ .global_var = .outputs },
.Context => .context,
.Global => .global,
else => error.InvalidOpCode,
};
}
};
Collection Serializers
ConcreteCollection
For collections of expressions[9]:
const CollectionSerializer = struct {
const MAX_COLLECTION_ITEMS: u16 = 4096; // DoS protection
pub fn serialize(coll: *const Collection, w: *SigmaByteWriter) !void {
try w.putUShort(@intCast(coll.items.len)); // Count
try TypeSerializer.serialize(coll.elem_type, w); // Element type
for (coll.items) |item| {
try ValueSerializer.serialize(item, w); // Each item
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUShort();
if (count > MAX_COLLECTION_ITEMS) return error.CollectionTooLarge;
const elem_type = try TypeSerializer.deserialize(r);
var items = try r.allocator.alloc(*Expr, count);
for (0..count) |i| {
items[i] = try ValueSerializer.deserialize(r);
}
return .{ .collection = .{
.elem_type = elem_type,
.items = items,
} };
}
};
// NOTE: In production, use a pre-allocated expression pool instead of
// dynamic allocation during deserialization. See ZIGMA_STYLE.md.
Boolean Collection Constant
Compact serialization for Coll[Boolean] constants:
const BoolCollectionSerializer = struct {
pub fn serialize(bools: []const bool, w: *SigmaByteWriter) !void {
try w.putUShort(@intCast(bools.len));
// Pack into bits
const byte_count = (bools.len + 7) / 8;
var i: usize = 0;
for (0..byte_count) |_| {
var byte: u8 = 0;
for (0..8) |bit| {
if (i < bools.len and bools[i]) {
byte |= @as(u8, 1) << @intCast(bit);
}
i += 1;
}
try w.putByte(byte);
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUShort();
const byte_count = (count + 7) / 8;
var bools = try r.allocator.alloc(bool, count);
var i: usize = 0;
for (0..byte_count) |_| {
const byte = try r.getByte();
for (0..8) |bit| {
if (i >= count) break;
bools[i] = (byte >> @intCast(bit)) & 1 == 1;
i += 1;
}
}
return .{ .coll_bool_const = bools };
}
};
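The LSB-first bit-packing layout round-trips cleanly; the same packing sketched in Python (helper names are illustrative):

```python
def pack_bools(bools):
    # One bit per element, least-significant bit first within each byte
    out = bytearray((len(bools) + 7) // 8)
    for i, b in enumerate(bools):
        if b:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def unpack_bools(data, count):
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]

bits = [True, False, True, True, False, False, False, False, True]
packed = pack_bools(bits)
assert packed == bytes([0b00001101, 0b00000001])  # 9 bools -> 2 bytes
assert unpack_bools(packed, len(bits)) == bits
```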
Map/Filter/Fold
Higher-order collection operations:
const MapSerializer = struct {
pub fn serialize(m: *const Map, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(m.input, w); // Collection
try ValueSerializer.serialize(m.mapper, w); // Function
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const input = try ValueSerializer.deserialize(r);
const mapper = try ValueSerializer.deserialize(r);
return .{ .map = .{ .input = &input, .mapper = &mapper } };
}
};
const FoldSerializer = struct {
pub fn serialize(f: *const Fold, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(f.input, w); // Collection
try ValueSerializer.serialize(f.zero, w); // Initial value
try ValueSerializer.serialize(f.folder, w); // Fold function
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const input = try ValueSerializer.deserialize(r);
const zero = try ValueSerializer.deserialize(r);
const folder = try ValueSerializer.deserialize(r);
return .{ .fold = .{
.input = &input,
.zero = &zero,
.folder = &folder,
} };
}
};
Block and Function Serializers
BlockValue
For blocks with local definitions[10]:
const BlockValueSerializer = struct {
pub fn serialize(block: *const BlockValue, w: *SigmaByteWriter) !void {
try w.putUInt(block.items.len); // Definition count
for (block.items) |item| {
try ValueSerializer.serialize(item, w); // Each definition
}
try ValueSerializer.serialize(block.result, w); // Result expression
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUInt();
var items = try r.allocator.alloc(*Expr, @intCast(count));
for (0..count) |i| {
items[i] = try ValueSerializer.deserialize(r);
}
const result = try ValueSerializer.deserialize(r);
return .{ .block_value = .{ .items = items, .result = &result } };
}
};
FuncValue
For lambda functions:
const FuncValueSerializer = struct {
pub fn serialize(func: *const FuncValue, w: *SigmaByteWriter) !void {
try w.putUInt(func.args.len); // Argument count
for (func.args) |arg| {
try w.putUInt(arg.id); // Argument id
try TypeSerializer.serialize(arg.tpe, w); // Argument type
}
try ValueSerializer.serialize(func.body, w); // Body
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const arg_count = try r.getUInt();
var args = try r.allocator.alloc(FuncArg, @intCast(arg_count));
for (0..arg_count) |i| {
const id = try r.getUInt();
const tpe = try TypeSerializer.deserialize(r);
// Store type for ValUse resolution
r.val_def_type_store.put(@intCast(id), tpe);
args[i] = .{ .id = @intCast(id), .tpe = tpe };
}
const body = try ValueSerializer.deserialize(r);
return .{ .func_value = .{ .args = args, .body = &body } };
}
};
ValDef / ValUse
Variable definitions and references:
const ValDefSerializer = struct {
pub fn serialize(vd: *const ValDef, w: *SigmaByteWriter) !void {
try w.putUInt(vd.id);
try TypeSerializer.serialize(vd.tpe, w);
try ValueSerializer.serialize(vd.rhs, w);
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
const tpe = try TypeSerializer.deserialize(r);
// Store for ValUse resolution
r.val_def_type_store.put(@intCast(id), tpe);
const rhs = try ValueSerializer.deserialize(r);
return .{ .val_def = .{ .id = @intCast(id), .tpe = tpe, .rhs = &rhs } };
}
};
const ValUseSerializer = struct {
pub fn serialize(vu: *const ValUse, w: *SigmaByteWriter) !void {
try w.putUInt(vu.id);
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
// Lookup type from earlier ValDef
const tpe = r.val_def_type_store.get(@intCast(id)) orelse
return error.UndefinedVariable;
return .{ .val_use = .{ .id = @intCast(id), .tpe = tpe } };
}
};
MethodCall Serializer
Method calls require type and method ID lookup[11][12]:
const MethodCallSerializer = struct {
pub fn serialize(mc: *const MethodCall, w: *SigmaByteWriter) !void {
try w.putByte(mc.method.obj_type.typeId()); // Type ID
try w.putByte(mc.method.method_id); // Method ID
try ValueSerializer.serialize(mc.obj, w); // Receiver
try w.putUInt(mc.args.len); // Arg count
for (mc.args) |arg| {
try ValueSerializer.serialize(arg, w); // Each argument
}
// Explicit type arguments (for generic methods)
for (mc.method.explicit_type_args) |tvar| {
const tpe = mc.type_subst.get(tvar) orelse continue;
try TypeSerializer.serialize(tpe, w);
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const type_id = try r.getByte();
const method_id = try r.getByte();
const obj = try ValueSerializer.deserialize(r);
const arg_count = try r.getUInt();
var args = try r.allocator.alloc(*Expr, @intCast(arg_count));
for (0..arg_count) |i| {
args[i] = try ValueSerializer.deserialize(r);
}
// Lookup method by type and method ID
const method = try SMethod.fromIds(type_id, method_id);
// Check version compatibility
if (r.tree_version.value < method.min_version.value) {
return error.MethodNotAvailable;
}
// Read type arguments
var type_args = std.AutoHashMap(STypeVar, SType).init(r.allocator);
for (method.explicit_type_args) |tvar| {
const tpe = try TypeSerializer.deserialize(r);
try type_args.put(tvar, tpe);
}
return .{ .method_call = .{
.obj = &obj,
.method = method,
.args = args,
.type_subst = type_args,
} };
}
};
Serializer Summary Table
OpCode Range Category Serializer Pattern
────────────────────────────────────────────────────────────
1-112 Constants Type + Value inline
113 ConstPlaceholder Index only
114-120 Global vars Nullary (opcode only)
121-130 Unary ops Single child
131-150 Binary ops Left + Right
151-160 Collection ops Input + Function
161-170 Block/Func Items + Body
171-180 Method calls TypeId + MethodId + Args
Summary
This chapter covered the value serialization system that transforms ErgoTree expression trees to and from bytes:
- Opcode dispatch enables extensible serialization—the first byte of each expression determines which serializer handles the remaining bytes, allowing O(1) lookup via a sparse registry array
- Constant extraction supports two modes: inline serialization (type + value) when constant segregation is disabled, or placeholder indices when segregation is enabled for template sharing
- Common serializer patterns reduce code duplication: BinOpSerializer handles all two-argument operations, UnarySerializer handles single-input transformations, and NullarySerializer handles singletons where the opcode alone is sufficient
- Collection serializers include bounds checking to prevent DoS attacks from maliciously large collections during deserialization
- Type inference via ValDefTypeStore tracks variable types as ValDef nodes are deserialized, allowing ValUse nodes to recover their types without storing them redundantly
- Method call serialization includes type ID, method ID, and version checking to ensure compatibility with the ErgoTree version being deserialized
Next: Chapter 9: Elliptic Curve Cryptography
Scala: ValueSerializer.scala:65-95
Rust: expr.rs:83-203
Scala: ValueSerializer.scala:50-182
Rust: expr.rs:215-298
Scala: ConstantSerializer.scala
Rust: constant.rs:9-29
Rust: constant_placeholder.rs
Rust: bin_op.rs
Scala: BlockValueSerializer.scala
Scala: MethodCallSerializer.scala
Rust: method_call.rs:19-60
Chapter 9: Elliptic Curve Cryptography
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic finite field arithmetic: operations modulo a prime p, multiplicative inverses
- Public key cryptography concepts: key pairs, discrete logarithm problem
- Understanding of elliptic curves as sets of points satisfying y² = x³ + ax + b over a finite field
- Prior chapters: Chapter 2 for the GroupElement type
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why secp256k1 was chosen for Sigma protocols and describe its key parameters
- Implement the discrete logarithm group interface: exponentiate, multiply, inverse
- Encode and decode group elements using compressed SEC1 format (33 bytes)
- Translate between multiplicative group notation (used in Sigma protocols) and additive notation (used in libraries)
The Secp256k1 Curve
Sigma protocols use secp256k1—the same elliptic curve as Bitcoin and Ethereum12. This choice provides several benefits: widespread library support, extensive security analysis, and compatibility with existing blockchain infrastructure. The curve offers 128-bit security (meaning the best known attack requires approximately 2^128 operations) while using 256-bit keys.
Curve Definition
The curve is defined by:
y² = x³ + 7 (mod p)
where:
p = 2²⁵⁶ - 2³² - 977 (field characteristic)
n = group order (number of points)
G = generator point (base point)
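These parameters are plain integers, so they can be sanity-checked directly. The following Python snippet (illustrative; the Gx/Gy values are the standard SEC 2 generator coordinates, not defined in the text above) verifies that the generator lies on the curve:

```python
# Sanity check of the secp256k1 parameters quoted above
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# The generator must satisfy the curve equation y^2 = x^3 + 7 (mod p)
assert (Gy * Gy - (Gx**3 + 7)) % p == 0
# Both the field prime and the group order are 256-bit values
assert p.bit_length() == 256 and n.bit_length() == 256
```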
Cryptographic Constants
const CryptoConstants = struct {
/// Encoded group element size in bytes (compressed)
pub const ENCODED_GROUP_ELEMENT_LENGTH: usize = 33;
/// Group size in bits
pub const GROUP_SIZE_BITS: u32 = 256;
/// Challenge size for Sigma protocols
/// Must be < GROUP_SIZE_BITS for security
pub const SOUNDNESS_BITS: u32 = 192;
/// Group order (number of curve points)
/// n = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
pub const GROUP_ORDER: [32]u8 = .{
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE,
0xBA, 0xAE, 0xDC, 0xE6, 0xAF, 0x48, 0xA0, 0x3B,
0xBF, 0xD2, 0x5E, 0x8C, 0xD0, 0x36, 0x41, 0x41,
};
/// Field characteristic
/// p = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
pub const FIELD_PRIME: [32]u8 = .{
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFC, 0x2F,
};
comptime {
// Security constraint: 2^soundnessBits < groupOrder
std.debug.assert(SOUNDNESS_BITS < GROUP_SIZE_BITS);
}
};
Group Element Representation
EcPoint Structure
const EcPoint = struct {
/// Compressed encoding size
pub const GROUP_SIZE: usize = 33;
/// Internal representation (projective coordinates)
x: FieldElement,
y: FieldElement,
z: FieldElement,
/// Identity element (point at infinity)
pub const IDENTITY = EcPoint{
.x = FieldElement.zero(),
.y = FieldElement.one(),
.z = FieldElement.zero(),
};
/// Generator point G
pub const GENERATOR = init: {
// secp256k1 generator coordinates
const gx = FieldElement.fromHex(
"79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
);
const gy = FieldElement.fromHex(
"483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8"
);
break :init EcPoint{ .x = gx, .y = gy, .z = FieldElement.one() };
};
/// Check if this is the identity (infinity) point
pub fn isIdentity(self: *const EcPoint) bool {
return self.z.isZero();
}
/// Convert to affine coordinates
pub fn toAffine(self: *const EcPoint) struct { x: FieldElement, y: FieldElement } {
if (self.isIdentity()) return .{ .x = FieldElement.zero(), .y = FieldElement.zero() };
const z_inv = self.z.inverse();
return .{
.x = self.x.mul(z_inv),
.y = self.y.mul(z_inv),
};
}
};
Group Operations
The discrete logarithm group interface provides standard operations34:
Operation Notation Description
─────────────────────────────────────────────────────
Exponentiate g^x Scalar multiplication
Multiply g * h Point addition
Inverse g^(-1) Point negation
Identity 1 Point at infinity
Generator g Base point G
Group Interface
SECURITY: The exponentiate (scalar multiplication) operation must be implemented in constant time when the scalar is secret (e.g., private keys, nonces). Variable-time implementations leak secret bits through timing side channels. Use audited libraries such as libsecp256k1 or Zig's std.crypto.ecc.
const DlogGroup = struct {
/// The generator point
pub fn generator() EcPoint {
return EcPoint.GENERATOR;
}
/// The identity element (point at infinity)
pub fn identity() EcPoint {
return EcPoint.IDENTITY;
}
/// Check if point is identity
pub fn isIdentity(point: *const EcPoint) bool {
return point.isIdentity();
}
/// Exponentiate: base^exponent (scalar multiplication)
pub fn exponentiate(base: *const EcPoint, exponent: *const Scalar) EcPoint {
if (base.isIdentity()) return base.*;
// Handle negative exponents
var exp = exponent.*;
if (exp.isNegative()) {
exp = exp.mod(CryptoConstants.GROUP_ORDER);
}
return scalarMul(base, &exp);
}
/// Multiply two group elements: g1 * g2 (point addition)
pub fn multiply(g1: *const EcPoint, g2: *const EcPoint) EcPoint {
return pointAdd(g1, g2);
}
/// Compute inverse: g^(-1) (point negation)
pub fn inverse(point: *const EcPoint) EcPoint {
return EcPoint{
.x = point.x,
.y = point.y.negate(),
.z = point.z,
};
}
/// Create random group element
pub fn randomElement(rng: std.rand.Random) EcPoint {
const scalar = Scalar.random(rng);
return exponentiate(&EcPoint.GENERATOR, &scalar);
}
};
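To make the interface concrete, here is a tiny affine-coordinate model in Python with plain integers (strictly illustrative and variable-time; a real implementation uses constant-time projective arithmetic as noted above). The function names mirror the DlogGroup interface:

```python
p = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """multiply(g1, g2): point addition; None is the identity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None                                        # g * g^(-1) = 1
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p   # tangent slope
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p  # chord slope
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def exponentiate(P, k):
    """exponentiate(base, k): double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def inverse(P):
    """inverse(g) = g^(-1): point negation."""
    return (P[0], (-P[1]) % p)

assert add(G, G) == exponentiate(G, 2)   # g * g == g^2
assert add(G, inverse(G)) is None        # g * g^(-1) == identity
```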
Notation Translation
Sigma protocols use multiplicative notation while underlying libraries often use additive5:
Sigma (multiplicative) Library (additive) Operation
──────────────────────────────────────────────────────────
g * h g + h Point addition
g^n n * g Scalar multiplication
g^(-1) -g Point negation
1 (identity) O (origin) Point at infinity
/// Wrapper translating multiplicative to additive notation
const MultiplicativeGroup = struct {
/// Multiply in multiplicative notation = Add in additive
pub fn mul(a: *const EcPoint, b: *const EcPoint) EcPoint {
return pointAdd(a, b);
}
/// Exponentiate in multiplicative = Scalar multiply in additive
pub fn exp(base: *const EcPoint, scalar: *const Scalar) EcPoint {
return scalarMul(base, scalar);
}
/// Inverse in multiplicative = Negate in additive
pub fn inv(p: *const EcPoint) EcPoint {
return pointNegate(p);
}
};
Point Encoding
Group elements use compressed SEC1 encoding (33 bytes)67:
Compressed Point Format (33 bytes)
────────────────────────────────────────────────────
┌──────────┬────────────────────────────────────────┐
│ Byte 0 │ Bytes 1-32 │
├──────────┼────────────────────────────────────────┤
│ 0x02 │ X coordinate (32 bytes, big-end) │ Y is even
│ 0x03 │ X coordinate (32 bytes, big-end) │ Y is odd
│ 0x00 │ 32 zero bytes │ Identity
└──────────┴────────────────────────────────────────┘
Serialization Implementation
const GroupElementSerializer = struct {
const ENCODING_SIZE: usize = 33;
/// Identity encoding (33 zero bytes)
const IDENTITY_ENCODING = [_]u8{0} ** ENCODING_SIZE;
pub fn serialize(point: *const EcPoint, writer: anytype) !void {
if (point.isIdentity()) {
try writer.writeAll(&IDENTITY_ENCODING);
return;
}
const affine = point.toAffine();
// Determine sign byte from Y coordinate parity
const y_bytes = affine.y.toBytes();
const sign_byte: u8 = if (y_bytes[31] & 1 == 0) 0x02 else 0x03;
// Write sign byte + X coordinate
try writer.writeByte(sign_byte);
try writer.writeAll(&affine.x.toBytes());
}
pub fn deserialize(reader: anytype) !EcPoint {
var buf: [ENCODING_SIZE]u8 = undefined;
try reader.readNoEof(&buf);
if (buf[0] == 0) {
// Check all zeros for identity
for (buf[1..]) |b| {
if (b != 0) return error.InvalidEncoding;
}
return EcPoint.IDENTITY;
}
if (buf[0] != 0x02 and buf[0] != 0x03) {
return error.InvalidPrefix;
}
// Recover Y from X using curve equation: y² = x³ + 7
const x = FieldElement.fromBytes(buf[1..33]);
const y_squared = x.cube().add(FieldElement.fromInt(7));
var y = y_squared.sqrt() orelse return error.NotOnCurve;
// Choose correct Y based on sign byte
const y_is_odd = y.toBytes()[31] & 1 == 1;
if ((buf[0] == 0x02) == y_is_odd) {
y = y.negate();
}
const point = EcPoint{ .x = x, .y = y, .z = FieldElement.one() };
// CRITICAL: Validate point is on curve and in correct subgroup
// This prevents invalid curve attacks. See ZIGMA_STYLE.md.
// if (!point.isOnCurve()) return error.NotOnCurve;
// if (!point.isInSubgroup()) return error.InvalidSubgroup;
return point;
}
};
Why Compressed Encoding?
Format Size Content
────────────────────────────────────────────────────
Compressed 33 B Sign (1) + X (32)
Uncompressed 65 B 0x04 (1) + X (32) + Y (32)
Savings 49% Y recovered from curve equation
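The decoding path can be sanity-checked in Python: because p ≡ 3 (mod 4) for secp256k1, the square root in the curve equation is a single modular exponentiation. This sketch is illustrative only and omits the subgroup validation discussed above:

```python
# Compressed SEC1 round-trip (illustrative, not a hardened decoder)
p = 2**256 - 2**32 - 977
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def compress(x, y):
    # Sign byte from Y parity, then 32-byte big-endian X
    return bytes([0x02 if y % 2 == 0 else 0x03]) + x.to_bytes(32, "big")

def decompress(buf):
    assert len(buf) == 33 and buf[0] in (0x02, 0x03)
    x = int.from_bytes(buf[1:], "big")
    # Since p % 4 == 3, sqrt(a) = a^((p+1)/4) mod p when a is a square
    y = pow(x**3 + 7, (p + 1) // 4, p)
    if (y * y - (x**3 + 7)) % p != 0:
        raise ValueError("not on curve")
    if (y % 2 == 0) != (buf[0] == 0x02):   # wrong parity: take p - y
        y = p - y
    return x, y

encoded = compress(Gx, Gy)
assert len(encoded) == 33 and encoded[0] == 0x02   # Gy is even
assert decompress(encoded) == (Gx, Gy)             # round-trip
```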
Coordinate Systems
Affine vs Projective
Libraries use projective coordinates internally for efficiency:
Coordinate System Representation Division Required
──────────────────────────────────────────────────────────
Affine (x, y) Per operation
Projective (X, Y, Z) Only at end
x = X/Z
y = Y/Z
Normalization
/// Normalize point to affine coordinates
/// Required before: encoding, comparison, coordinate access
pub fn normalize(point: *const EcPoint) EcPoint {
if (point.isIdentity()) return point.*;
// Projective convention used here (matching toAffine): x = X/Z, y = Y/Z.
// Jacobian coordinates would instead divide by Z^2 and Z^3.
const z_inv = point.z.inverse();
return EcPoint{
.x = point.x.mul(z_inv),
.y = point.y.mul(z_inv),
.z = FieldElement.one(),
};
}
Random Scalar Generation
Secure random scalars for key generation8:
const Scalar = struct {
bytes: [32]u8,
/// Generate random scalar in [1, n-1] where n is group order
pub fn random(rng: std.rand.Random) Scalar {
while (true) {
var bytes: [32]u8 = undefined;
rng.bytes(&bytes);
// Ensure scalar < group order
if (lessThan(&bytes, &CryptoConstants.GROUP_ORDER)) {
// Ensure non-zero
var is_zero = true;
for (bytes) |b| {
if (b != 0) { is_zero = false; break; }
}
if (!is_zero) {
return Scalar{ .bytes = bytes };
}
}
}
}
/// Compare scalars as big-endian integers: a < b
fn lessThan(a: *const [32]u8, b: *const [32]u8) bool {
// NOTE: This simplified version is NOT constant-time (it exits early).
// In production, track "less than" and "still equal" flags without
// branching on secret data, e.g.:
//   var lt: u1 = 0;
//   var eq: u1 = 1;
//   for (a.*, b.*) |ai, bi| {
//       lt |= eq & @intFromBool(ai < bi);
//       eq &= @intFromBool(ai == bi);
//   }
//   return lt == 1;
// See ZIGMA_STYLE.md for constant-time crypto requirements.
for (a.*, b.*) |ai, bi| {
if (ai < bi) return true;
if (ai > bi) return false;
}
return false;
}
};
Security Properties
Discrete Logarithm Assumption
The security relies on the hardness of the DLP9:
Given: g (generator), h = g^x (public key)
Find: x (secret key)
Best known attack: ~2^128 operations for secp256k1
Soundness Parameter
The SOUNDNESS_BITS = 192 determines:
- Challenge size in Sigma protocols
- Security level against malicious provers
- Constraint: 2^192 < n (group order)
comptime {
// Verify soundness constraint
// 2^soundnessBits must be less than group order
// Group order ≈ 2^256, so 192 < 256 satisfies this
std.debug.assert(CryptoConstants.SOUNDNESS_BITS < 256);
}
Summary
This chapter covered the elliptic curve cryptography foundation that underlies all Sigma protocol operations:
- secp256k1 (y² = x³ + 7) provides the mathematical foundation for Sigma protocols, chosen for its security properties and widespread support in Bitcoin and Ethereum tooling
- Group elements are encoded as 33 bytes using compressed SEC1 format—a sign byte (0x02 or 0x03 based on Y coordinate parity) followed by the 32-byte X coordinate
- Multiplicative notation used in Sigma protocol literature (g^x, g·h) maps to additive operations in typical EC libraries (scalar multiplication, point addition)
- SOUNDNESS_BITS = 192 determines the challenge size in Sigma protocols and must be less than the group order's bit length for security
- The DlogGroup interface provides exponentiate (scalar multiplication), multiply (point addition), inverse (point negation), and identity (point at infinity)
- Projective coordinates (X, Y, Z) avoid expensive field inversions during computation; conversion to affine coordinates is required only for encoding and comparison
Next: Chapter 10: Hash Functions
Scala: CryptoConstants.scala
Rust: ec_point.rs:41-51
Scala: DlogGroup.scala
Rust: dlog_group.rs:39-84
Scala: Platform.scala:217-225
Scala: GroupElementSerializer.scala
Rust: ec_point.rs:120-146
Rust: dlog_group.rs:40-43
Scala: CryptoConstants.scala:70-75
Chapter 10: Hash Functions
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Cryptographic hash function properties: collision resistance, preimage resistance, deterministic output
- Understanding of message authentication codes (MACs) and their role in key derivation
- Prior chapters: Chapter 9 for the cryptographic context, Chapter 5 for opcode-based operations
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why BLAKE2b256 is the primary hash function in Ergo and when SHA-256 is used
- Implement hash operations with per-item costing based on block size
- Describe Fiat-Shamir challenge generation and why challenges are truncated to 192 bits
- Use HMAC-SHA512 for BIP32/BIP39 key derivation
Hash Functions in Sigma
Hash functions are fundamental to blockchain security—they provide integrity guarantees, enable content addressing, and transform interactive proofs into non-interactive ones via the Fiat-Shamir heuristic. The Sigma protocol uses two primary hash functions, each optimized for different use cases12:
Hash Function Uses
─────────────────────────────────────────────────────
Purpose Function Output
─────────────────────────────────────────────────────
Script hashing blake2b256() 32 bytes
External compat sha256() 32 bytes
Challenge gen Fiat-Shamir 24 bytes (truncated)
Box identification blake2b256() 32 bytes
Key derivation HMAC-SHA512 64 bytes
BLAKE2b256
The primary hash function for Ergo—faster than SHA-256 on 64-bit platforms, at a comparable security level34.
Implementation
const Blake2b256 = struct {
/// Output size in bytes
pub const DIGEST_SIZE: usize = 32;
/// Block size for cost calculation
pub const BLOCK_SIZE: usize = 128;
state: [8]u64,
buf: [BLOCK_SIZE]u8,
buf_len: usize,
total_len: u128,
const IV: [8]u64 = .{
0x6a09e667f3bcc908, 0xbb67ae8584caa73b,
0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
0x510e527fade682d1, 0x9b05688c2b3e6c1f,
0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
};
pub fn init() Blake2b256 {
var self = Blake2b256{
.state = IV,
.buf = undefined,
.buf_len = 0,
.total_len = 0,
};
// Parameter block XOR (digest length, fanout, depth)
self.state[0] ^= 0x01010000 ^ DIGEST_SIZE;
return self;
}
pub fn update(self: *Blake2b256, data: []const u8) void {
var offset: usize = 0;
// Fill buffer if partially full; compress only when more data follows,
// since BLAKE2b must hold back the final block for the last-block flag
if (self.buf_len > 0 and self.buf_len + data.len > BLOCK_SIZE) {
const fill = BLOCK_SIZE - self.buf_len;
@memcpy(self.buf[self.buf_len..][0..fill], data[0..fill]);
self.compress(false);
self.buf_len = 0;
offset = fill;
}
// Process full blocks, holding back the last one when nothing follows it
while (data.len - offset > BLOCK_SIZE) {
@memcpy(&self.buf, data[offset..][0..BLOCK_SIZE]);
self.compress(false);
offset += BLOCK_SIZE;
}
// Buffer the remainder (may be a complete block awaiting finalization)
const remaining = data.len - offset;
if (remaining > 0) {
@memcpy(self.buf[self.buf_len..][0..remaining], data[offset..][0..remaining]);
self.buf_len += remaining;
}
self.total_len += data.len;
}
pub fn final(self: *Blake2b256) [DIGEST_SIZE]u8 {
// Pad with zeros
@memset(self.buf[self.buf_len..], 0);
self.compress(true); // Final block
var result: [DIGEST_SIZE]u8 = undefined;
for (self.state[0..4], 0..) |s, i| {
@memcpy(result[i * 8 ..][0..8], &std.mem.toBytes(std.mem.nativeToLittle(u64, s)));
}
return result;
}
fn compress(self: *Blake2b256, is_final: bool) void {
// BLAKE2b compression function
// ... (standard BLAKE2b round function)
_ = is_final;
}
/// One-shot hash
pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
var hasher = init();
hasher.update(data);
return hasher.final();
}
};
AST Node
const CalcBlake2b256 = struct {
input: *const Expr, // Coll[Byte]
pub const OP_CODE = OpCode.new(87);
pub const COST = PerItemCost{
.base = JitCost{ .value = 20 },
.per_chunk = JitCost{ .value = 7 },
.chunk_size = 128,
};
pub fn tpe(_: *const CalcBlake2b256) SType {
return .{ .coll = &SType.byte };
}
pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
const input_bytes = try self.input.eval(env, E);
const coll = input_bytes.coll.bytes;
// Add cost based on input length
E.addSeqCost(COST, coll.len, OP_CODE);
const result = Blake2b256.hash(coll);
return try E.allocator.dupe(u8, &result);
}
};
SHA-256
Available for external system compatibility (Bitcoin, etc.)56.
Implementation
const Sha256 = struct {
pub const DIGEST_SIZE: usize = 32;
pub const BLOCK_SIZE: usize = 64;
state: [8]u32,
buf: [BLOCK_SIZE]u8,
buf_len: usize,
total_len: u64,
const K: [64]u32 = .{
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
// ... remaining round constants
};
const H0: [8]u32 = .{
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
};
pub fn init() Sha256 {
return .{
.state = H0,
.buf = undefined,
.buf_len = 0,
.total_len = 0,
};
}
pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
var hasher = init();
hasher.update(data);
return hasher.final();
}
// ... update, final, compress methods
};
AST Node
const CalcSha256 = struct {
input: *const Expr,
pub const OP_CODE = OpCode.new(88);
/// SHA-256 is more expensive than BLAKE2b
pub const COST = PerItemCost{
.base = JitCost{ .value = 80 },
.per_chunk = JitCost{ .value = 8 },
.chunk_size = 64,
};
pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
const input_bytes = try self.input.eval(env, E);
const coll = input_bytes.coll.bytes;
E.addSeqCost(COST, coll.len, OP_CODE);
const result = Sha256.hash(coll);
return try E.allocator.dupe(u8, &result);
}
};
Cost Comparison
Hash Function Costs
─────────────────────────────────────────────────────
Base Per Chunk Chunk Size
─────────────────────────────────────────────────────
BLAKE2b256 20 7 128 bytes
SHA-256 80 8 64 bytes
─────────────────────────────────────────────────────
Cost Formula: total = base + ceil(len / chunk_size) * per_chunk
Example: 200-byte Input
BLAKE2b256:
chunks = ceil(200 / 128) = 2
cost = 20 + 2 * 7 = 34
SHA-256:
chunks = ceil(200 / 64) = 4
cost = 80 + 4 * 8 = 112
Ratio: SHA-256 is ~3.3x more expensive
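The cost formula is plain arithmetic, so the 200-byte example can be checked in a few lines of Python:

```python
import math

def hash_cost(base, per_chunk, chunk_size, length):
    """total = base + ceil(length / chunk_size) * per_chunk"""
    return base + math.ceil(length / chunk_size) * per_chunk

blake = hash_cost(base=20, per_chunk=7, chunk_size=128, length=200)  # 20 + 2*7
sha = hash_cost(base=80, per_chunk=8, chunk_size=64, length=200)     # 80 + 4*8
assert blake == 34
assert sha == 112
print(f"ratio: {sha / blake:.1f}x")   # ~3.3x
```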
Fiat-Shamir Hash
Internal hash for Sigma protocol challenge generation78:
const FiatShamir = struct {
/// Soundness bits (192 = 24 bytes)
pub const SOUNDNESS_BITS: u32 = 192;
pub const SOUNDNESS_BYTES: usize = SOUNDNESS_BITS / 8; // 24
/// Fiat-Shamir hash function
/// Returns first 24 bytes of BLAKE2b256 hash
pub fn hashFn(input: []const u8) [SOUNDNESS_BYTES]u8 {
const full_hash = Blake2b256.hash(input);
var result: [SOUNDNESS_BYTES]u8 = undefined;
@memcpy(&result, full_hash[0..SOUNDNESS_BYTES]);
return result;
}
};
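Python's hashlib exposes BLAKE2b with a configurable digest size, so the truncation step can be mirrored directly (illustrative sketch of the truncation idea; the input bytes are arbitrary):

```python
import hashlib

SOUNDNESS_BYTES = 192 // 8   # 24

def fiat_shamir_hash(data: bytes) -> bytes:
    """First 24 bytes of the 32-byte BLAKE2b256 digest."""
    full = hashlib.blake2b(data, digest_size=32).digest()
    return full[:SOUNDNESS_BYTES]

challenge = fiat_shamir_hash(b"serialized proof tree")
assert len(challenge) == 24
```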
Why 192 Bits?
The truncation to 192 bits is not arbitrary9:
Security Constraints
─────────────────────────────────────────────────────
1. Challenge must be unpredictable to cheating prover
2. Threshold signatures use GF(2^192) polynomials
3. Must satisfy: 2^soundnessBits < group_order
4. Group order ≈ 2^256, so 192 < 256 works
comptime {
// This constraint is critical for security
std.debug.assert(FiatShamir.SOUNDNESS_BITS < CryptoConstants.GROUP_SIZE_BITS);
}
Fiat-Shamir Tree Serialization
The challenge is computed from a serialized proof tree10:
const FiatShamirTreeSerializer = struct {
const INTERNAL_NODE_PREFIX: u8 = 0;
const LEAF_PREFIX: u8 = 1;
pub fn serialize(tree: *const ProofTree, writer: anytype) !void {
switch (tree.*) {
.leaf => |leaf| {
try writer.writeByte(LEAF_PREFIX);
// Serialize proposition as ErgoTree
const prop_bytes = try leaf.proposition.toErgoTreeBytes();
try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
try writer.writeAll(prop_bytes);
// Serialize commitment
const commitment = leaf.commitment orelse
return error.EmptyCommitment;
try writer.writeInt(i16, @intCast(commitment.len), .big);
try writer.writeAll(commitment);
},
.conjecture => |conj| {
try writer.writeByte(INTERNAL_NODE_PREFIX);
try writer.writeByte(@intFromEnum(conj.conj_type));
// Threshold k for CTHRESHOLD
if (conj.conj_type == .cthreshold) {
try writer.writeByte(conj.k);
}
try writer.writeInt(i16, @intCast(conj.children.len), .big);
for (conj.children) |child| {
try serialize(child, writer);
}
},
}
}
};
HMAC-SHA512
For BIP32/BIP39 key derivation11:
const HmacSha512 = struct {
pub const DIGEST_SIZE: usize = 64;
pub const BLOCK_SIZE: usize = 128;
inner: Sha512,
outer: Sha512,
pub fn init(key: []const u8) HmacSha512 {
var padded_key: [BLOCK_SIZE]u8 = [_]u8{0} ** BLOCK_SIZE;
if (key.len > BLOCK_SIZE) {
const hashed = Sha512.hash(key);
@memcpy(padded_key[0..64], &hashed);
} else {
@memcpy(padded_key[0..key.len], key);
}
// Inner padding (0x36)
var inner_pad: [BLOCK_SIZE]u8 = undefined;
for (padded_key, 0..) |b, i| {
inner_pad[i] = b ^ 0x36;
}
// Outer padding (0x5c)
var outer_pad: [BLOCK_SIZE]u8 = undefined;
for (padded_key, 0..) |b, i| {
outer_pad[i] = b ^ 0x5c;
}
var self = HmacSha512{
.inner = Sha512.init(),
.outer = Sha512.init(),
};
self.inner.update(&inner_pad);
self.outer.update(&outer_pad);
return self;
}
pub fn update(self: *HmacSha512, data: []const u8) void {
self.inner.update(data);
}
pub fn final(self: *HmacSha512) [DIGEST_SIZE]u8 {
const inner_hash = self.inner.final();
self.outer.update(&inner_hash);
return self.outer.final();
}
pub fn hash(key: []const u8, data: []const u8) [DIGEST_SIZE]u8 {
var hmac = init(key);
hmac.update(data);
return hmac.final();
}
};
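The inner/outer construction above follows the standard HMAC definition, which can be cross-checked against Python's standard library (illustrative; key and message are arbitrary):

```python
import hashlib
import hmac

BLOCK_SIZE = 128  # SHA-512 block size

def hmac_sha512(key: bytes, data: bytes) -> bytes:
    """HMAC = H((K ^ opad) || H((K ^ ipad) || m)), as in the sketch above."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha512(key).digest()     # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")       # then zero-padded to a block
    inner = hashlib.sha512(bytes(b ^ 0x36 for b in key) + data).digest()
    return hashlib.sha512(bytes(b ^ 0x5C for b in key) + inner).digest()

# Cross-check the construction against the standard library
msg = b"key derivation data"
assert hmac_sha512(b"Bitcoin seed", msg) == \
    hmac.new(b"Bitcoin seed", msg, hashlib.sha512).digest()
```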
Key Derivation Constants
const KeyDerivation = struct {
/// BIP39 HMAC key
pub const BITCOIN_SEED = "Bitcoin seed";
/// PBKDF2 iterations for BIP39
pub const PBKDF2_ITERATIONS: u32 = 2048;
/// Derived key length in bits (512 bits = 64 bytes)
pub const PBKDF2_KEY_LENGTH: u32 = 512;
};
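These constants plug directly into PBKDF2-HMAC-SHA512, as in BIP39 seed derivation. A minimal sketch (the mnemonic string is an arbitrary illustrative value; BIP39 uses "mnemonic" + passphrase as the salt):

```python
import hashlib

mnemonic = b"example mnemonic words"
passphrase = b""
# BIP39: PBKDF2-HMAC-SHA512, 2048 iterations, 512-bit (64-byte) output
seed = hashlib.pbkdf2_hmac("sha512", mnemonic,
                           b"mnemonic" + passphrase,
                           2048, dklen=64)
assert len(seed) == 64
```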
Box ID Computation
Box IDs are BLAKE2b256 hashes of box content:
pub fn computeBoxId(box_bytes: []const u8) [32]u8 {
return Blake2b256.hash(box_bytes);
}
Summary
This chapter covered the hash functions that provide cryptographic integrity throughout the Sigma protocol:
- BLAKE2b256 is the primary hash function—approximately 3x cheaper than SHA-256 due to its larger block size (128 bytes vs 64 bytes) and optimized design
- SHA-256 is available for external system compatibility (Bitcoin scripts, cross-chain verification)
- Fiat-Shamir challenge generation uses BLAKE2b256 truncated to 192 bits, matching the threshold signature polynomial field size while satisfying the constraint 2^192 < group_order
- Per-item costing calculates hash cost as
base + ceil(input_length / block_size) * per_chunk, accurately reflecting the computational work - HMAC-SHA512 provides key derivation for BIP32/BIP39 wallet compatibility, using the standard "Bitcoin seed" key
- Box IDs are computed as BLAKE2b256 hashes of serialized box content, providing content-addressable identification
Next: Chapter 11: Sigma Protocols
Scala: CryptoFunctions.scala
Rust: hash.rs:5-26
Scala: trees.scala (CalcBlake2b256)
Rust: calc_blake2b256.rs:14-47
Scala: trees.scala (CalcSha256)
Rust: calc_sha256.rs
Scala: CryptoFunctions.scala:hashFn
Rust: fiat_shamir.rs:70-76
Scala: CryptoConstants.scala:70-75
Rust: fiat_shamir.rs:116-203
Scala: HmacSHA512.scala
Chapter 11: Sigma Protocols
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 9 for elliptic curve operations and the discrete logarithm problem
- Chapter 10 for Fiat-Shamir hash generation
- Understanding of zero-knowledge proofs: proving knowledge without revealing secrets
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the three-move Sigma protocol structure (commitment, challenge, response)
- Implement the Schnorr (DLog) protocol for proving knowledge of a discrete logarithm
- Describe the Diffie-Hellman Tuple protocol for proving equality of discrete logs
- Compose protocols using AND, OR, and THRESHOLD operations
- Apply the Fiat-Shamir transformation to convert interactive proofs to non-interactive
Sigma Protocol Structure
Sigma (Σ) protocols are the cryptographic foundation that makes Ergo's smart contracts possible. Named for their characteristic three-move "sigma-shaped" structure, they enable a prover to convince a verifier that they know a secret without revealing anything about that secret—the defining property of zero-knowledge proofs.
A Sigma protocol is a three-move interactive proof12:
Sigma Protocol Flow
─────────────────────────────────────────────────────
Prover (P) Verifier (V)
│ │
│ ──────── a (commitment) ───────> │
│ │
│ <─────── e (challenge) ───────── │
│ │
│ ──────── z (response) ─────────> │
│ │
│ Verify(a, e, z)?
Message Types
/// First message: prover's commitment
const FirstProverMessage = union(enum) {
dlog: FirstDlogProverMessage,
dht: FirstDhtProverMessage,
pub fn bytes(self: FirstProverMessage) []const u8 {
return switch (self) {
.dlog => |m| m.a.serialize(),
.dht => |m| m.a.serialize() ++ m.b.serialize(),
};
}
};
/// Second message: prover's response
const SecondProverMessage = union(enum) {
dlog: SecondDlogProverMessage,
dht: SecondDhtProverMessage,
};
/// Challenge from verifier (192 bits = 24 bytes)
const Challenge = [FiatShamir.SOUNDNESS_BYTES]u8;
Schnorr Protocol (Discrete Log)
Proves knowledge of secret w such that h = g^w34:
Schnorr Protocol Steps
─────────────────────────────────────────────────────
Given: g (generator), h = g^w (public key), w (secret)
Step Message Computation
─────────────────────────────────────────────────────
1. Commit a r ← random, a = g^r
2. Challenge e Verifier sends random e
3. Response z z = r + e·w (mod q)
4. Verify ✓ g^z = a · h^e
Implementation
const DlogProverInput = struct {
/// Secret scalar w in [0, q-1]
w: Scalar,
/// Compute public image h = g^w
pub fn publicImage(self: *const DlogProverInput) ProveDlog {
const g = DlogGroup.generator();
const h = DlogGroup.exponentiate(&g, &self.w);
return ProveDlog{ .h = h };
}
/// Generate random secret
pub fn random(rng: std.rand.Random) DlogProverInput {
return .{ .w = Scalar.random(rng) };
}
};
/// First message: commitment a = g^r
const FirstDlogProverMessage = struct {
a: EcPoint,
pub fn bytes(self: *const FirstDlogProverMessage) [33]u8 {
return GroupElementSerializer.serialize(&self.a);
}
};
/// Second message: response z
const SecondDlogProverMessage = struct {
z: Scalar,
};
Prover Steps
const DlogProver = struct {
/// Step 1: Generate commitment (real proof)
pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, msg: FirstDlogProverMessage } {
const r = Scalar.random(rng);
const g = DlogGroup.generator();
const a = DlogGroup.exponentiate(&g, &r);
return .{ .r = r, .msg = .{ .a = a } };
}
/// Step 3: Compute response z = r + e·w (mod q)
pub fn secondMessage(
private_input: *const DlogProverInput,
r: Scalar,
challenge: *const Challenge,
) SecondDlogProverMessage {
const e = Scalar.fromBytes(challenge);
const ew = e.mul(&private_input.w); // e * w mod q
const z = r.add(&ew); // r + ew mod q
return .{ .z = z };
}
};
Simulation
For OR composition, we need to simulate proofs without knowing the secret5:
/// Simulate transcript without knowing secret
/// Given challenge e, produce valid-looking (a, z)
pub fn simulate(
public_input: *const ProveDlog,
challenge: *const Challenge,
rng: std.rand.Random,
) struct { first: FirstDlogProverMessage, second: SecondDlogProverMessage } {
// SAMPLE random z
const z = Scalar.random(rng);
// COMPUTE a = g^z · h^(-e)
// This satisfies verification equation: g^z = a · h^e
const e = Scalar.fromBytes(challenge);
const minus_e = e.negate();
const g = DlogGroup.generator();
const h = public_input.h;
const g_to_z = DlogGroup.exponentiate(&g, &z);
const h_to_minus_e = DlogGroup.exponentiate(&h, &minus_e);
const a = DlogGroup.multiply(&g_to_z, &h_to_minus_e);
return .{
.first = .{ .a = a },
.second = .{ .z = z },
};
}
Verification (Commitment Reconstruction)
/// Verify: reconstruct a from z and e, check equality
/// g^z = a · h^e => a = g^z / h^e
pub fn computeCommitment(
proposition: *const ProveDlog,
challenge: *const Challenge,
second_msg: *const SecondDlogProverMessage,
) EcPoint {
const g = DlogGroup.generator();
const h = proposition.h;
const e = Scalar.fromBytes(challenge);
const g_to_z = DlogGroup.exponentiate(&g, &second_msg.z);
const h_to_e = DlogGroup.exponentiate(&h, &e);
const h_to_e_inv = DlogGroup.inverse(&h_to_e);
return DlogGroup.multiply(&g_to_z, &h_to_e_inv);
}
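An honest run of the whole protocol can be checked numerically with plain Python integers (illustrative only: the secret, nonce, and challenge are fixed constants here, which would be fatal in a real prover):

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Affine point addition; None is the identity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Double-and-add scalar multiplication (g^k in multiplicative notation)."""
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

w = 0xC0FFEE                  # secret
h = mul(w, G)                 # public image h = g^w
r = 0xDEADBEEF                # prover's nonce
a = mul(r, G)                 # 1. commitment a = g^r
e = 0x1234567890ABCDEF        # 2. challenge (normally 192 random bits)
z = (r + e * w) % n           # 3. response z = r + e*w (mod q)
assert mul(z, G) == add(a, mul(e, h))   # 4. verify: g^z == a * h^e
```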
Diffie-Hellman Tuple Protocol
Proves knowledge of w such that u = g^w AND v = h^w67:
DHT Protocol: Prove (u, v) share the same discrete log
Given: g, h (generators), u = g^w, v = h^w (public tuple)
Step Message Computation
─────────────────────────────────────────────────────
1. Commit (a, b) r ← random, a = g^r, b = h^r
2. Challenge e Verifier sends random e
3. Response z z = r + e·w (mod q)
4. Verify ✓ g^z = a·u^e AND h^z = b·v^e
Implementation
const ProveDhTuple = struct {
g: EcPoint,
h: EcPoint,
u: EcPoint, // u = g^w
v: EcPoint, // v = h^w
pub const OP_CODE = OpCode.ProveDiffieHellmanTuple;
};
const FirstDhtProverMessage = struct {
a: EcPoint, // a = g^r
b: EcPoint, // b = h^r
pub fn bytes(self: *const FirstDhtProverMessage) [66]u8 {
var result: [66]u8 = undefined;
@memcpy(result[0..33], &GroupElementSerializer.serialize(&self.a));
@memcpy(result[33..66], &GroupElementSerializer.serialize(&self.b));
return result;
}
};
const DhtProver = struct {
pub fn firstMessage(
public_input: *const ProveDhTuple,
rng: std.rand.Random,
) struct { r: Scalar, msg: FirstDhtProverMessage } {
const r = Scalar.random(rng);
const a = DlogGroup.exponentiate(&public_input.g, &r);
const b = DlogGroup.exponentiate(&public_input.h, &r);
return .{ .r = r, .msg = .{ .a = a, .b = b } };
}
};
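The key property—one response z satisfying both verification equations—can be checked numerically as well (illustrative; w, r, e are arbitrary fixed values, and h is just some second non-identity point):

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

h = mul(7, G)                    # a second generator
w, r, e = 42, 99, 5              # secret, nonce, challenge
u, v = mul(w, G), mul(w, h)      # public tuple: same exponent w for both
a, b = mul(r, G), mul(r, h)      # commitment pair (a, b)
z = (r + e * w) % n              # single response
assert mul(z, G) == add(a, mul(e, u))   # g^z = a * u^e
assert mul(z, h) == add(b, mul(e, v))   # h^z = b * v^e
```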
SigmaBoolean Proposition Types
Propositions form a tree structure89:
const SigmaBoolean = union(enum) {
/// Leaf: prove knowledge of discrete log
prove_dlog: ProveDlog,
/// Leaf: prove DHT equality
prove_dh_tuple: ProveDhTuple,
/// Conjunction: all children must be proven
cand: Cand,
/// Disjunction: at least one child proven
cor: Cor,
/// Threshold: k-of-n children proven
cthreshold: Cthreshold,
/// Trivially true
trivial_true: void,
/// Trivially false
trivial_false: void,
pub fn opCode(self: SigmaBoolean) OpCode {
return switch (self) {
.prove_dlog => OpCode.ProveDlog,
.prove_dh_tuple => OpCode.ProveDiffieHellmanTuple,
.cand => OpCode.SigmaAnd,
.cor => OpCode.SigmaOr,
.cthreshold => OpCode.Atleast,
else => OpCode.Constant,
};
}
/// Count nodes in proposition tree
pub fn size(self: SigmaBoolean) usize {
return switch (self) {
.cand => |c| 1 + sumChildSizes(c.children),
.cor => |c| 1 + sumChildSizes(c.children),
.cthreshold => |c| 1 + sumChildSizes(c.children),
else => 1,
};
}
};
const ProveDlog = struct {
h: EcPoint, // Public key h = g^w
pub const OP_CODE = OpCode.ProveDlog;
};
const Cand = struct {
children: []const SigmaBoolean,
};
const Cor = struct {
children: []const SigmaBoolean,
};
const Cthreshold = struct {
k: u8, // Threshold
children: []const SigmaBoolean,
};
Protocol Composition
AND Composition
All children share the same challenge10:
Challenge e
│
┌─────┴─────┐
│ │
σ₁(e) σ₂(e)
real real
/// AND: prove all children with same challenge
fn proveAnd(
children: []const *SigmaBoolean,
secrets: []const PrivateInput,
challenge: *const Challenge,
) ![]ProofNode {
var proofs = try allocator.alloc(ProofNode, children.len);
for (children, secrets, 0..) |child, secret, i| {
proofs[i] = proveReal(child, secret, challenge);
}
return proofs;
}
OR Composition
At least one child is real; others are simulated11:
Challenge e
│
┌─────┴─────┐
│ │
σ₁(e₁) σ₂(e₂)
REAL SIMULATED
Constraint: e₁ ⊕ e₂ = e (XOR)
/// OR: one real proof, rest simulated
/// Challenges must XOR to root challenge
fn proveOr(
children: []const *SigmaBoolean,
real_index: usize,
secret: PrivateInput,
challenge: *const Challenge,
rng: std.rand.Random,
) ![]ProofNode {
var proofs = try allocator.alloc(ProofNode, children.len);
var challenge_sum: Challenge = [_]u8{0} ** FiatShamir.SOUNDNESS_BYTES;
// First: generate simulated proofs with random challenges
for (children, 0..) |child, i| {
if (i != real_index) {
var sim_challenge: Challenge = undefined;
rng.bytes(&sim_challenge);
proofs[i] = simulate(child, &sim_challenge, rng);
xorChallenge(&challenge_sum, &sim_challenge);
}
}
// Derive real challenge: e_real = e ⊕ (sum of simulated challenges)
var real_challenge: Challenge = undefined;
for (0..FiatShamir.SOUNDNESS_BYTES) |i| {
real_challenge[i] = challenge[i] ^ challenge_sum[i];
}
proofs[real_index] = proveReal(children[real_index], secret, &real_challenge);
return proofs;
}
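The XOR bookkeeping can be checked in isolation. Below is a standalone Rust sketch (Rust chosen to echo the sigma-rust reference; every name in it is illustrative, not library API). It verifies the invariant that the real child's challenge, XORed with all simulated challenges, reconstructs the root challenge:

```rust
const SOUNDNESS_BYTES: usize = 24; // 192-bit challenges, as above
type Challenge = [u8; SOUNDNESS_BYTES];

fn xor_into(acc: &mut Challenge, c: &Challenge) {
    for i in 0..SOUNDNESS_BYTES {
        acc[i] ^= c[i];
    }
}

/// Derive the challenge the real child must answer, given the root
/// challenge and the randomly chosen simulated challenges.
fn real_challenge(root: &Challenge, simulated: &[Challenge]) -> Challenge {
    let mut e = *root;
    for c in simulated {
        xor_into(&mut e, c);
    }
    e
}

fn main() {
    let root: Challenge = [0xAB; SOUNDNESS_BYTES];
    let sims = [[0x11; SOUNDNESS_BYTES], [0x22; SOUNDNESS_BYTES]];
    let real = real_challenge(&root, &sims);
    // Verifier-side check: all child challenges XOR back to the root.
    let mut acc = real;
    for c in &sims {
        xor_into(&mut acc, c);
    }
    assert_eq!(acc, root);
}
```

Because XOR is its own inverse, deriving the real challenge and checking the constraint are the same fold over the child challenges.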
THRESHOLD (k-of-n)
Uses polynomial interpolation over GF(2^192) [12]:
Threshold k-of-n Challenge Distribution
─────────────────────────────────────────────────────
- Construct polynomial p(x) of degree n-k
- p(0) = e (root challenge)
- Each child i gets challenge p(i)
- k real children, (n-k) simulated
const GF2_192 = struct {
/// 192 bits = 3 × 64-bit words
words: [3]u64,
pub fn fromChallenge(c: *const Challenge) GF2_192 {
var self = GF2_192{ .words = .{ 0, 0, 0 } };
// Load 24 bytes into 3 words (only 192 bits used)
@memcpy(std.mem.asBytes(&self.words[0])[0..8], c[0..8]);
@memcpy(std.mem.asBytes(&self.words[1])[0..8], c[8..16]);
@memcpy(std.mem.asBytes(&self.words[2])[0..8], c[16..24]);
return self;
}
pub fn add(a: *const GF2_192, b: *const GF2_192) GF2_192 {
// Addition in GF(2^192) is XOR
return .{ .words = .{
a.words[0] ^ b.words[0],
a.words[1] ^ b.words[1],
a.words[2] ^ b.words[2],
} };
}
// Multiplication uses polynomial representation with reduction
// NOTE: This is a stub. Full implementation requires:
// 1. Carry-less multiplication of 192-bit polynomials
// 2. Reduction modulo irreducible polynomial x^192 + x^7 + x^2 + x + 1
// See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192.rs
pub fn mul(a: *const GF2_192, b: *const GF2_192) GF2_192 {
_ = a;
_ = b;
@compileError("GF2_192.mul() not implemented - see reference implementations");
}
};
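Since `mul` above is left as a stub, the one field operation that can be checked immediately is addition, which in any characteristic-2 field is plain XOR. A standalone Rust sketch (illustrative names, mirroring the three-word layout above):

```rust
/// 192-bit field element as three 64-bit words, as in the struct above.
type Gf192 = [u64; 3];

/// Addition in GF(2^192): word-wise XOR, no carries.
fn gf_add(a: &Gf192, b: &Gf192) -> Gf192 {
    [a[0] ^ b[0], a[1] ^ b[1], a[2] ^ b[2]]
}

fn main() {
    let a: Gf192 = [0xDEAD, 0xBEEF, 0x1234];
    let b: Gf192 = [0x1111, 0x2222, 0x3333];
    // Characteristic 2: every element is its own additive inverse.
    assert_eq!(gf_add(&a, &a), [0, 0, 0]);
    // Commutative, with additive identity 0.
    assert_eq!(gf_add(&a, &b), gf_add(&b, &a));
    assert_eq!(gf_add(&a, &[0, 0, 0]), a);
}
```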
const GF2_192_Poly = struct {
coefficients: []GF2_192,
degree: usize,
/// Evaluate polynomial at point x using Horner's method.
/// NOTE: mulByByte (multiply by the field element with byte value x)
/// is assumed here; see the reference implementations.
pub fn evaluate(self: *const GF2_192_Poly, x: u8) GF2_192 {
var result = self.coefficients[self.degree];
var i = self.degree;
while (i > 0) {
i -= 1;
result = GF2_192.add(&GF2_192.mulByByte(&result, x), &self.coefficients[i]);
}
return result;
}
/// Lagrange interpolation through given points
/// NOTE: Stub - full implementation requires GF2_192 arithmetic
/// See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192_poly.rs
pub fn interpolate(
points: []const u8,
values: []const GF2_192,
value_at_0: GF2_192,
) GF2_192_Poly {
// Construct the unique polynomial of degree at most n passing
// through the n given points plus the extra point p(0) = value_at_0
_ = points;
_ = values;
_ = value_at_0;
@compileError("GF2_192_Poly.interpolate() not implemented");
}
};
// NOTE: In production, all scalar operations (add, mul, negate) must be
// constant-time to prevent timing side-channel attacks. See ZIGMA_STYLE.md.
Mathematical correctness of GF(2^192) polynomial interpolation:
The threshold k-of-n scheme uses polynomial interpolation over the finite field GF(2^192):
Field properties: GF(2^192) is a finite field where addition is XOR and multiplication is polynomial multiplication modulo an irreducible polynomial (x^192 + x^7 + x^2 + x + 1). All arithmetic is well-defined and invertible.
Lagrange interpolation: Given any k distinct points (x₁, y₁), ..., (xₖ, yₖ), there exists a unique polynomial p(x) of degree at most k-1 passing through all points. This is constructed using Lagrange basis polynomials.
Challenge distribution: The prover constructs a polynomial of degree n-k with p(0) = root_challenge. Simulated children's challenges are assigned randomly, and the polynomial is interpolated through these points. Real children receive challenges p(i) computed from this polynomial.
Security: The 192-bit field size matches SOUNDNESS_BITS, ensuring that a cheating prover (who knows fewer than k secrets) succeeds with probability at most 2^-192—cryptographically negligible.
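The challenge-distribution idea can be demonstrated end to end in a toy model. The sketch below is Rust, with a small prime field mod 251 standing in for GF(2^192); all values are made up for illustration. It walks through a 2-of-3 threshold: the polynomial has degree n-k = 1, is pinned by p(0) = e plus the one simulated child's random challenge, and the verifier's check is that all challenges lie on that single line:

```rust
const P: i64 = 251; // toy prime field standing in for GF(2^192)

/// Modular inverse via Fermat's little theorem (P is prime).
fn inv(a: i64) -> i64 {
    let mut r = 1i64;
    let mut base = a.rem_euclid(P);
    let mut exp = P - 2;
    while exp > 0 {
        if exp & 1 == 1 {
            r = r * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    r
}

fn main() {
    // 2-of-3: degree n - k = 1 polynomial, fixed by p(0) = e and the
    // simulated child 3's random challenge y3.
    let e: i64 = 200; // root challenge
    let y3: i64 = 77; // random challenge assigned to simulated child 3
    let a1 = (y3 - e).rem_euclid(P) * inv(3) % P; // slope through (0,e),(3,y3)
    let p = |x: i64| (e + a1 * x).rem_euclid(P);
    let (y1, y2) = (p(1), p(2)); // challenges for the two real children
    // Verifier: (0,e), (1,y1), (2,y2), (3,y3) are collinear mod P.
    assert_eq!((y1 - e).rem_euclid(P), a1);
    assert_eq!((y2 - y1).rem_euclid(P), a1);
    assert_eq!((y3 - y2).rem_euclid(P), a1);
}
```

A prover knowing fewer than k secrets would have to fix more than n-k challenges before seeing e, which over-determines the degree-(n-k) polynomial; this is the soundness argument in miniature.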
Proof Trees
Track proof state during proving [13]:
const UnprovenTree = union(enum) {
leaf: UnprovenLeaf,
conjecture: UnprovenConjecture,
};
const UnprovenLeaf = struct {
proposition: SigmaBoolean,
position: NodePosition,
simulated: bool,
commitment_opt: ?FirstProverMessage,
randomness_opt: ?Scalar,
challenge_opt: ?Challenge,
};
const UnprovenConjecture = struct {
conj_type: enum { and_, or_, threshold }, // `and` is a Zig keyword, hence `and_`
children: []UnprovenTree,
position: NodePosition,
simulated: bool,
challenge_opt: ?Challenge,
k: ?u8, // For threshold
polynomial_opt: ?GF2_192_Poly,
};
/// Position in tree: "0-2-1" means root → child 2 → child 1
const NodePosition = struct {
positions: []const usize,
pub fn child(self: NodePosition, idx: usize) NodePosition {
// NOTE: `++` works only on comptime-known slices; a runtime
// implementation would copy into an allocated or fixed buffer.
return .{ .positions = self.positions ++ &[_]usize{idx} };
}
pub const CRYPTO_PREFIX = NodePosition{ .positions = &.{0} };
};
Fiat-Shamir Transformation
Convert interactive to non-interactive by deriving the challenge from a hash [14]:
/// Derive challenge from tree serialization
pub fn fiatShamirChallenge(allocator: Allocator, tree: *const ProofTree) !Challenge {
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
try fiatShamirSerialize(tree, buf.writer());
return FiatShamir.hashFn(buf.items);
}
fn fiatShamirSerialize(tree: *const ProofTree, writer: anytype) !void {
const INTERNAL_PREFIX: u8 = 0;
const LEAF_PREFIX: u8 = 1;
switch (tree.*) {
.leaf => |leaf| {
try writer.writeByte(LEAF_PREFIX);
// Proposition bytes
const prop_bytes = try leaf.proposition.toErgoTreeBytes();
try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
try writer.writeAll(prop_bytes);
// Commitment bytes
const commitment = leaf.commitment_opt orelse return error.NoCommitment;
const comm_bytes = commitment.bytes();
try writer.writeInt(i16, @intCast(comm_bytes.len), .big);
try writer.writeAll(comm_bytes);
},
.conjecture => |conj| {
try writer.writeByte(INTERNAL_PREFIX);
try writer.writeByte(@intFromEnum(conj.conj_type));
if (conj.k) |k| try writer.writeByte(k);
try writer.writeInt(i16, @intCast(conj.children.len), .big);
for (conj.children) |*child| {
try fiatShamirSerialize(child, writer);
}
},
}
}
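The essential properties of the transformation, determinism and sensitivity to every serialized byte, can be sketched with a toy hash. In this Rust sketch, `DefaultHasher` merely stands in for the Blake2b-based `FiatShamir.hashFn`, and the byte strings are made up:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy Fiat-Shamir: the challenge is a hash of the serialized proof tree.
fn fiat_shamir_challenge(serialized_tree: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    serialized_tree.hash(&mut h);
    h.finish()
}

fn main() {
    let tree_a: &[u8] = b"1|prop|commitment";
    let tree_b: &[u8] = b"1|prop|commitment'";
    // Deterministic: the verifier recomputes the identical challenge.
    assert_eq!(fiat_shamir_challenge(tree_a), fiat_shamir_challenge(tree_a));
    // Any change to a commitment yields a different challenge.
    assert_ne!(fiat_shamir_challenge(tree_a), fiat_shamir_challenge(tree_b));
}
```

This is why the serialization above must cover propositions and commitments byte for byte: anything left out could be altered by a cheating prover without affecting the derived challenge.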
Security Properties
Security Properties
─────────────────────────────────────────────────────
Property Meaning
─────────────────────────────────────────────────────
Completeness Honest prover always convinces
Soundness Cheater succeeds with prob ≤ 2^-192
Zero-Knowledge Proof reveals nothing about secret
Special Sound. Two transcripts extract secret
Summary
This chapter covered Sigma protocols—the zero-knowledge proof system that forms the cryptographic core of Ergo's smart contracts:
- Sigma protocols use a three-move structure: the prover sends a commitment, receives a challenge, and responds with a value that proves knowledge without revealing the secret
- Schnorr (DLog) protocol proves knowledge of a discrete logarithm: given h = g^w, prove knowledge of w without revealing it
- Diffie-Hellman Tuple protocol proves equality of discrete logs across different bases: given u = g^w and v = h^w, prove that u and v share the same discrete log
- AND composition applies the same challenge to all children—all must be proven
- OR composition distributes challenges via XOR constraint—only one child needs a real proof, others are simulated
- THRESHOLD (k-of-n) uses GF(2^192) polynomial interpolation to distribute challenges, requiring k real proofs
- Simulation generates valid-looking transcripts without knowing secrets, enabling OR and threshold compositions
- Fiat-Shamir transformation makes interactive protocols non-interactive by deriving the challenge from a hash of the commitments
Next: Chapter 12: Evaluation Model
Scala: SigmaProtocolFunctions.scala
Rust: sigma_protocol.rs
Scala: DLogProtocol.scala
Rust: dlog_protocol.rs:10-47
Rust: dlog_protocol.rs:73-93
Rust: dht_protocol.rs
Scala: SigmaBoolean.scala
Rust: sigma_boolean.rs:31-96
Rust: cand.rs
Rust: cor.rs
Scala: GF2_192_Poly.scala
Rust: unproven_tree.rs
Rust: fiat_shamir.rs
Chapter 12: Evaluation Model
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 4 for the AST structure and Value hierarchy
- Chapter 5 for opcodes and cost descriptors
- Chapter 3 for ErgoTree format and constant segregation
Learning Objectives
By the end of this chapter, you will be able to:
- Explain direct-style big-step interpretation and why it suits ErgoTree evaluation
- Implement `eval` dispatch for AST node types (constants, variables, functions, operations)
- Work with the `Env` environment structure for variable binding and closure capture
- Track accumulated costs during evaluation to enforce resource limits
Evaluation Architecture
The Sigma interpreter transforms an ErgoTree expression into a SigmaBoolean proposition that can be proven or verified. This "reduction" process uses direct-style big-step evaluation—each expression immediately returns its result value rather than producing intermediate steps. This approach is simpler than continuation-passing style while still supporting the necessary features: lexical closures, short-circuit evaluation, and cost tracking [1][2].
Evaluation Flow
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ ErgoTreeEvaluator │
├──────────────────────────────────────────────────┤
│ context: Context (SELF, INPUTS, OUTPUTS) │
│ constants: []Const (segregated constants) │
│ cost_accum: CostAcc (tracks execution cost) │
│ env: Env (variable bindings) │
└───────────────────────┬──────────────────────────┘
│
│ eval(expr)
▼
┌──────────────────────────────────────────────────┐
│ AST Traversal │
│ │
│ Expr.eval(env, ctx) │
│ │ │
│ ├── Evaluate children │
│ ├── Add operation cost │
│ ├── Perform operation │
│ └── Return result Value │
└──────────────────────────────────────────────────┘
Evaluator Structure
const Evaluator = struct {
context: *const Context,
constants: []const Constant,
cost_accum: CostAccumulator,
allocator: Allocator,
pub fn init(
context: *const Context,
constants: []const Constant,
cost_limit: JitCost,
allocator: Allocator,
) Evaluator {
return .{
.context = context,
.constants = constants,
.cost_accum = CostAccumulator.init(cost_limit),
.allocator = allocator,
};
}
/// Evaluate expression in given environment
pub fn eval(self: *Evaluator, env: *const Env, expr: *const Expr) !Value {
return expr.eval(env, self);
}
/// Evaluate to specific type
pub fn evalTo(
self: *Evaluator,
comptime T: type,
env: *const Env,
expr: *const Expr,
) !T {
const result = try self.eval(env, expr);
return result.as(T) orelse error.TypeMismatch;
}
/// Add fixed cost
pub fn addCost(self: *Evaluator, cost: FixedCost, op: OpCode) !void {
try self.cost_accum.add(cost.value, op);
}
/// Add per-item cost
pub fn addSeqCost(self: *Evaluator, cost: PerItemCost, n_items: usize, op: OpCode) !void {
// chunks = ceil(n_items / chunk_size), minimum 1 (matches Chapter 13's PerItemCost)
const n_chunks: i32 = if (n_items == 0) 1 else @intCast((n_items - 1) / cost.chunk_size + 1);
const total = cost.base.value + n_chunks * cost.per_chunk.value;
try self.cost_accum.add(total, op);
}
};
Environment (Variable Binding)
The `Env` maps variable IDs to computed values [3][4]:
const Env = struct {
/// HashMap from variable ID to value
bindings: std.AutoHashMap(u32, Value),
allocator: Allocator,
pub fn init(allocator: Allocator) Env {
return .{
.bindings = std.AutoHashMap(u32, Value).init(allocator),
.allocator = allocator,
};
}
/// Look up variable by ID
pub fn get(self: *const Env, val_id: u32) ?Value {
return self.bindings.get(val_id);
}
/// Create new environment with additional binding
/// NOTE: This implementation clones the HashMap on every extend() call.
/// In production, use a pre-allocated binding stack with O(1) extend/pop:
/// bindings: [MAX_BINDINGS]Binding (pre-allocated)
/// stack_ptr: usize (grows/shrinks without allocation)
/// See ZIGMA_STYLE.md for zero-allocation evaluation patterns.
pub fn extend(self: *const Env, val_id: u32, value: Value) !Env {
var new_env = Env{
.bindings = try self.bindings.clone(),
.allocator = self.allocator,
};
try new_env.bindings.put(val_id, value);
return new_env;
}
/// Create new environment with multiple bindings
pub fn extendMany(self: *const Env, bindings: []const struct { id: u32, val: Value }) !Env {
var new_env = Env{
.bindings = try self.bindings.clone(),
.allocator = self.allocator,
};
for (bindings) |b| {
try new_env.bindings.put(b.id, b.val);
}
return new_env;
}
};
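The functional-update semantics are worth pinning down: `extend()` never mutates the parent environment, so an outer scope still sees its old bindings after an inner scope shadows them. A standalone Rust sketch of just this behavior (illustrative types; values are `i64` for brevity):

```rust
use std::collections::HashMap;

#[derive(Clone, Default)]
struct Env {
    bindings: HashMap<u32, i64>,
}

impl Env {
    fn get(&self, id: u32) -> Option<i64> {
        self.bindings.get(&id).copied()
    }
    /// Functional update: clone, then insert (parent is untouched).
    fn extend(&self, id: u32, v: i64) -> Env {
        let mut e = self.clone();
        e.bindings.insert(id, v);
        e
    }
}

fn main() {
    let outer = Env::default().extend(1, 10);
    let inner = outer.extend(2, 20).extend(1, 99); // shadows id 1
    assert_eq!(outer.get(1), Some(10)); // outer scope unchanged
    assert_eq!(inner.get(1), Some(99));
    assert_eq!(inner.get(2), Some(20));
    assert_eq!(outer.get(2), None);
}
```

As the note in the Zig code says, cloning per `extend()` is the simple semantics-first version; a production evaluator would use a binding stack with O(1) push/pop.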
Expression Dispatch
Each expression type implements `eval` [5][6]:
const Expr = union(enum) {
constant: Constant,
const_placeholder: ConstantPlaceholder,
val_use: ValUse,
block_value: BlockValue,
func_value: FuncValue,
apply: Apply,
if_op: If,
bin_op: BinOp,
// ... other expression types
/// Evaluate expression recursively
/// NOTE: This recursive approach is clear for learning but uses the call
/// stack. In production, use an explicit work stack to:
/// 1. Guarantee bounded stack depth (no stack overflow)
/// 2. Enable O(1) reset between transactions
/// See ZIGMA_STYLE.md for iterative evaluation patterns.
pub fn eval(self: *const Expr, env: *const Env, E: *Evaluator) !Value {
return switch (self.*) {
.constant => |c| c.eval(env, E),
.const_placeholder => |cp| cp.eval(env, E),
.val_use => |vu| vu.eval(env, E),
.block_value => |bv| bv.eval(env, E),
.func_value => |fv| fv.eval(env, E),
.apply => |a| a.eval(env, E),
.if_op => |i| i.eval(env, E),
.bin_op => |b| b.eval(env, E),
// ... dispatch to other eval implementations
};
}
};
Constant Evaluation
Constants return their value with fixed cost [7]:
const Constant = struct {
tpe: SType,
value: Literal,
pub const COST = FixedCost{ .value = 5 };
pub fn eval(self: *const Constant, _: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.Constant);
return Value.fromLiteral(self.value);
}
};
const ConstantPlaceholder = struct {
index: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 1 };
pub fn eval(self: *const ConstantPlaceholder, _: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.ConstantPlaceholder);
if (self.index >= E.constants.len) {
return error.IndexOutOfBounds;
}
const c = E.constants[self.index];
return Value.fromLiteral(c.value);
}
};
Variable Access
`ValUse` looks up variables in the environment [8]:
const ValUse = struct {
val_id: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 5 };
pub fn eval(self: *const ValUse, env: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.ValUse);
return env.get(self.val_id) orelse error.UndefinedVariable;
}
};
Block Evaluation
Blocks introduce variable bindings [9]:
const BlockValue = struct {
items: []const ValDef,
result: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 1 },
.per_chunk = JitCost{ .value = 1 },
.chunk_size = 1,
};
pub fn eval(self: *const BlockValue, env: *const Env, E: *Evaluator) !Value {
try E.addSeqCost(COST, self.items.len, OpCode.BlockValue);
var cur_env = env.*;
for (self.items) |item| {
// Evaluate right-hand side
const rhs_val = try item.rhs.eval(&cur_env, E);
// Extend environment with new binding
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.FuncValue);
cur_env = try cur_env.extend(item.id, rhs_val);
}
// Evaluate result in extended environment
return self.result.eval(&cur_env, E);
}
};
const ValDef = struct {
id: u32,
tpe: SType,
rhs: *const Expr,
};
Lambda Functions
`FuncValue` creates closures [10]:
const FuncValue = struct {
args: []const FuncArg,
body: *const Expr,
pub const COST = FixedCost{ .value = 10 };
pub const ADD_TO_ENV_COST = FixedCost{ .value = 5 };
pub fn eval(self: *const FuncValue, env: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.FuncValue);
// Create closure capturing current environment
return Value{
.closure = .{
.captured_env = env.*,
.args = self.args,
.body = self.body,
},
};
}
};
const FuncArg = struct {
id: u32,
tpe: SType,
};
const Apply = struct {
func: *const Expr,
args: *const Expr,
pub fn eval(self: *const Apply, env: *const Env, E: *Evaluator) !Value {
// Evaluate function
const func_val = try self.func.eval(env, E);
const closure = func_val.closure;
// Evaluate argument
const arg_val = try self.args.eval(env, E);
// Extend closure's captured env with argument binding
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Apply);
var new_env = try closure.captured_env.extend(closure.args[0].id, arg_val);
// Evaluate body in new environment
return closure.body.eval(&new_env, E);
}
};
Conditional Evaluation
`If` uses short-circuit semantics [11]:
const If = struct {
condition: *const Expr,
true_branch: *const Expr,
false_branch: *const Expr,
pub const COST = FixedCost{ .value = 10 };
pub fn eval(self: *const If, env: *const Env, E: *Evaluator) !Value {
// Evaluate condition
const cond = try E.evalTo(bool, env, self.condition);
try E.addCost(COST, OpCode.If);
// Only evaluate taken branch (short-circuit)
if (cond) {
return self.true_branch.eval(env, E);
} else {
return self.false_branch.eval(env, E);
}
}
};
Collection Operations
Map, filter, and fold evaluate with per-item costs [12]:
const Map = struct {
input: *const Expr,
mapper: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
pub fn eval(self: *const Map, env: *const Env, E: *Evaluator) !Value {
const input_coll = try E.evalTo(Collection, env, self.input);
const mapper_fn = try E.evalTo(Closure, env, self.mapper);
try E.addSeqCost(COST, input_coll.len, OpCode.Map);
var result = try E.allocator.alloc(Value, input_coll.len);
for (input_coll.items, 0..) |item, i| {
// Apply mapper to each element
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Map);
var fn_env = try mapper_fn.captured_env.extend(mapper_fn.args[0].id, item);
result[i] = try mapper_fn.body.eval(&fn_env, E);
}
return Value{ .coll = .{ .items = result } };
}
};
const Fold = struct {
input: *const Expr,
zero: *const Expr,
folder: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
pub fn eval(self: *const Fold, env: *const Env, E: *Evaluator) !Value {
const input_coll = try E.evalTo(Collection, env, self.input);
const zero_val = try self.zero.eval(env, E);
const folder_fn = try E.evalTo(Closure, env, self.folder);
try E.addSeqCost(COST, input_coll.len, OpCode.Fold);
var accum = zero_val;
for (input_coll.items) |item| {
// folder takes (accum, item)
const tuple = Value{ .tuple = .{ accum, item } };
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Fold);
var fn_env = try folder_fn.captured_env.extend(folder_fn.args[0].id, tuple);
accum = try folder_fn.body.eval(&fn_env, E);
}
return accum;
}
};
Binary Operations
const BinOp = struct {
kind: Kind,
left: *const Expr,
right: *const Expr,
const Kind = enum {
plus, minus, multiply, divide, modulo,
gt, ge, lt, le, eq, neq,
bin_and, bin_or, bin_xor,
};
pub fn eval(self: *const BinOp, env: *const Env, E: *Evaluator) !Value {
const left_val = try self.left.eval(env, E);
const right_val = try self.right.eval(env, E);
return switch (self.kind) {
.plus => try evalPlus(left_val, right_val, E),
.minus => try evalMinus(left_val, right_val, E),
.gt => try evalGt(left_val, right_val, E),
// ... other operations
};
}
fn evalPlus(left: Value, right: Value, E: *Evaluator) !Value {
try E.addCost(ArithOp.PLUS_COST, OpCode.Plus);
return switch (left) {
.int => |l| Value{ .int = try std.math.add(i32, l, right.int) },
.long => |l| Value{ .long = try std.math.add(i64, l, right.long) },
else => error.TypeMismatch,
};
}
};
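The `std.math.add` calls above fail on overflow rather than wrapping. The same semantics in a standalone Rust sketch (hypothetical `Value`/`eval_plus` names; Rust's `checked_add` plays the role of `std.math.add`):

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Int(i32),
    Long(i64),
}

/// Overflow-checked addition, as in evalPlus: overflow is an error,
/// never silent wraparound.
fn eval_plus(l: &Value, r: &Value) -> Result<Value, &'static str> {
    match (l, r) {
        (Value::Int(a), Value::Int(b)) => {
            a.checked_add(*b).map(Value::Int).ok_or("ArithmeticOverflow")
        }
        (Value::Long(a), Value::Long(b)) => {
            a.checked_add(*b).map(Value::Long).ok_or("ArithmeticOverflow")
        }
        _ => Err("TypeMismatch"),
    }
}

fn main() {
    assert_eq!(eval_plus(&Value::Int(2), &Value::Int(3)), Ok(Value::Int(5)));
    assert_eq!(
        eval_plus(&Value::Int(i32::MAX), &Value::Int(1)),
        Err("ArithmeticOverflow")
    );
    assert_eq!(eval_plus(&Value::Int(1), &Value::Long(1)), Err("TypeMismatch"));
}
```

Deterministic failure on overflow matters for consensus: every node must reject the same script for the same reason.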
Top-Level Evaluation
Reduce ErgoTree to SigmaBoolean:
pub fn reduceToSigmaBoolean(
ergo_tree: *const ErgoTree,
context: *const Context,
cost_limit: JitCost,
allocator: Allocator,
) !struct { prop: SigmaBoolean, cost: JitCost } {
var evaluator = Evaluator.init(
context,
ergo_tree.constants,
cost_limit,
allocator,
);
const empty_env = Env.init(allocator);
const result = try evaluator.eval(&empty_env, ergo_tree.root);
const sigma_prop = result.asSigmaProp() orelse
return error.NotSigmaProp;
return .{
.prop = sigma_prop.sigma_boolean,
.cost = evaluator.cost_accum.totalCost(),
};
}
Summary
This chapter covered the evaluation model that transforms ErgoTree expressions into SigmaBoolean propositions:
- Direct-style big-step interpretation evaluates expressions recursively, with each node immediately returning its result value
- `Env` maps variable IDs to values using immutable functional updates—each `extend()` creates a new environment with additional bindings
- Each AST node implements an `eval()` method that returns a `Value` and accumulates execution cost
- `BlockValue` extends the environment with `ValDef` bindings, enabling local variable definitions
- `FuncValue` creates closures that capture the current environment, enabling lexical scoping
- `If` implements short-circuit evaluation—only the taken branch is evaluated, reducing unnecessary computation and cost
- Collection operations (`Map`, `Filter`, `Fold`) have per-item costs reflecting their iteration over elements
- Top-level reduction produces a `SigmaBoolean` proposition that the prover/verifier can then handle cryptographically
Next: Chapter 13: Cost Model
Scala: CErgoTreeEvaluator.scala
Rust: eval.rs:1-100
Scala: ErgoTreeEvaluator.scala (DataEnv)
Rust: env.rs
Scala: values.scala (eval methods)
Rust: expr.rs:14-100
Scala: values.scala (ConstantNode.eval)
Rust: val_use.rs
Rust: block.rs
Rust: func_value.rs
Rust: if_op.rs
Rust: coll_map.rs, coll_fold.rs
Chapter 13: Cost Model
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 12 for the evaluation architecture and how costs are accumulated during eval
- Chapter 5 for operation categories and cost descriptor types
- Basic computational complexity: understanding of constant-time vs linear-time operations
Learning Objectives
By the end of this chapter, you will be able to:
- Explain JitCost scaling (10x) and conversion to/from block costs
- Apply the three cost descriptor types: `FixedCost`, `PerItemCost`, and `TypeBasedCost`
- Implement cost accumulation with limit enforcement to prevent denial-of-service attacks
- Use cost tracing to analyze script execution costs
Cost Model Purpose
Unlike Turing-complete smart contract platforms that can enter infinite loops, ErgoTree scripts must terminate within bounded resources. The cost model assigns a computational cost to every operation, accumulating these costs during evaluation. If the accumulated cost exceeds the block limit, execution fails—this guarantees that all scripts terminate and prevents attackers from crafting expensive scripts that slow down block validation.
ErgoTree scripts execute in a resource-constrained environment [1][2]:
Cost Model Guarantees
─────────────────────────────────────────────────────
1. DoS Protection Expensive scripts blocked
2. Predictable Time Miners estimate validation
3. Fair Pricing Users pay for resources
4. Bounded Verify All scripts terminate
JitCost: The Cost Unit
JitCost provides 10x finer granularity than block costs [3]:
const JitCost = struct {
value: i32,
pub const SCALE_FACTOR: i32 = 10;
/// Add with overflow protection
pub fn add(self: JitCost, other: JitCost) !JitCost {
const result = @addWithOverflow(self.value, other.value);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Multiply with overflow protection
pub fn mul(self: JitCost, n: i32) !JitCost {
const result = @mulWithOverflow(self.value, n);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Divide by integer
pub fn div(self: JitCost, n: i32) JitCost {
return .{ .value = @divTrunc(self.value, n) };
}
/// Convert to block cost (truncating division; fromBlockCost is the
/// exact inverse only for multiples of SCALE_FACTOR)
pub fn toBlockCost(self: JitCost) i32 {
return @divTrunc(self.value, SCALE_FACTOR);
}
/// Create from block cost
pub fn fromBlockCost(block_cost: i32) !JitCost {
const result = @mulWithOverflow(block_cost, SCALE_FACTOR);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Comparison
pub fn gt(self: JitCost, other: JitCost) bool {
return self.value > other.value;
}
};
Cost Scaling
Cost Scales
─────────────────────────────────────────────────────
JitCost (internal) ─────────────────> Block Cost
÷ 10
Example:
JitCost(50) ──────────────────────> 5 block units
JitCost(123) ──────────────────────> 12 block units
Block Cost (external) ─────────────────> JitCost
                        × 10
The 10x scaling provides:
- Finer granularity for internal calculations
- Integer arithmetic (no floating point)
- Overflow protection via checked operations
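The scaling arithmetic is small enough to check directly. A Rust sketch of the two conversions (illustrative free functions; `checked_mul` mirrors the overflow-checked `fromBlockCost`):

```rust
const SCALE_FACTOR: i32 = 10;

/// JitCost -> block cost: truncating division.
fn to_block_cost(jit: i32) -> i32 {
    jit / SCALE_FACTOR
}

/// Block cost -> JitCost: checked multiplication (None on overflow).
fn from_block_cost(block: i32) -> Option<i32> {
    block.checked_mul(SCALE_FACTOR)
}

fn main() {
    assert_eq!(to_block_cost(50), 5);
    assert_eq!(to_block_cost(123), 12); // truncation: 123 / 10 = 12
    assert_eq!(from_block_cost(12), Some(120));
    assert_eq!(from_block_cost(i32::MAX), None); // overflow rejected
}
```

Note the round trip is lossy: JitCost(123) becomes 12 block units, which converts back to JitCost(120).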
Cost Kind Descriptors
Cost descriptors define how operations are costed [4][5]:
const CostKind = union(enum) {
fixed: FixedCost,
per_item: PerItemCost,
type_based: TypeBasedCost,
dynamic: void,
};
/// Constant time operations
const FixedCost = struct {
cost: JitCost,
};
/// Linear operations with chunking
const PerItemCost = struct {
base_cost: JitCost,
per_chunk_cost: JitCost,
chunk_size: u32,
/// Compute number of chunks for n items
pub fn chunks(self: PerItemCost, n_items: usize) usize {
if (n_items == 0) return 1;
return (n_items - 1) / self.chunk_size + 1;
}
/// Compute total cost for n items
pub fn cost(self: PerItemCost, n_items: usize) !JitCost {
const n_chunks = self.chunks(n_items);
const chunk_cost = try self.per_chunk_cost.mul(@intCast(n_chunks));
return self.base_cost.add(chunk_cost);
}
};
/// Type-dependent operations
const TypeBasedCost = struct {
cost_fn: *const fn (SType) JitCost,
};
FixedCost Operations
| Operation | Cost | Description |
|---|---|---|
| Constant | 5 | Return constant value |
| ConstantPlaceholder | 1 | Lookup segregated constant |
| ValUse | 5 | Variable lookup |
| If | 10 | Conditional branch |
| SelectField | 10 | Tuple field access |
| SizeOf | 14 | Get collection size |
PerItemCost Operations
| Operation | Base | Per Chunk | Chunk Size |
|---|---|---|---|
| blake2b256 | 20 | 7 | 128 bytes |
| sha256 | 80 | 8 | 64 bytes |
| Append | 20 | 2 | 10 items |
| Filter | 20 | 2 | 10 items |
| Map | 20 | 2 | 10 items |
| Fold | 20 | 2 | 10 items |
Cost Formula
PerItemCost Formula
─────────────────────────────────────────────────────
total = baseCost + ceil(nItems / chunkSize) × perChunkCost
Example: Map over 50 elements
chunks = ceil(50 / 10) = 5
cost = 20 + 5 × 2 = 30 JitCost units
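The formula can be verified executably. A Rust sketch of `PerItemCost` (field names follow the Zig struct above; the assertions reproduce the Map example here and the blake2b256 table entry):

```rust
struct PerItemCost {
    base_cost: u64,
    per_chunk_cost: u64,
    chunk_size: u64,
}

impl PerItemCost {
    /// ceil(n_items / chunk_size), with a minimum of one chunk.
    fn chunks(&self, n_items: u64) -> u64 {
        if n_items == 0 { 1 } else { (n_items - 1) / self.chunk_size + 1 }
    }
    fn cost(&self, n_items: u64) -> u64 {
        self.base_cost + self.chunks(n_items) * self.per_chunk_cost
    }
}

fn main() {
    let map = PerItemCost { base_cost: 20, per_chunk_cost: 2, chunk_size: 10 };
    assert_eq!(map.chunks(50), 5);
    assert_eq!(map.cost(50), 30); // 20 + 5 × 2, as in the example above
    let blake = PerItemCost { base_cost: 20, per_chunk_cost: 7, chunk_size: 128 };
    assert_eq!(blake.cost(256), 34); // 20 + 2 × 7
}
```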
Type-Based Costs
Operations with type-dependent complexity [6]:
/// Numeric cast cost depends on target type
const NumericCastCost = struct {
pub fn costFunc(target_type: SType) JitCost {
return switch (target_type) {
.s_big_int, .s_unsigned_big_int => .{ .value = 30 },
else => .{ .value = 10 }, // Byte, Short, Int, Long
};
}
};
/// Equality cost depends on operand types
const EqualityCost = struct {
pub fn costFunc(tpe: SType) JitCost {
return switch (tpe) {
.s_byte, .s_short, .s_int, .s_long => .{ .value = 3 },
.s_big_int => .{ .value = 6 },
.s_group_element => .{ .value = 172 },
.s_coll => |elem| blk: {
// Recursive: base + per-element
const elem_cost = costFunc(elem.*);
break :blk .{ .value = 10 + elem_cost.value };
},
else => .{ .value = 10 },
};
}
};
Cost Items: Tracing
Cost items record individual contributions for debugging [7][8]:
const CostItem = union(enum) {
fixed: FixedCostItem,
seq: SeqCostItem,
type_based: TypeBasedCostItem,
pub fn opName(self: CostItem) []const u8 {
return switch (self) {
.fixed => |f| f.op_desc.name,
.seq => |s| s.op_desc.name,
.type_based => |t| t.op_desc.name,
};
}
pub fn cost(self: CostItem) JitCost {
return switch (self) {
.fixed => |f| f.cost_kind.cost,
.seq => |s| s.cost_kind.cost(s.n_items) catch .{ .value = 0 },
.type_based => |t| t.cost_kind.cost_fn(t.tpe),
};
}
};
const FixedCostItem = struct {
op_desc: OperationDesc,
cost_kind: FixedCost,
};
const SeqCostItem = struct {
op_desc: OperationDesc,
cost_kind: PerItemCost,
n_items: usize,
pub fn chunks(self: SeqCostItem) usize {
return self.cost_kind.chunks(self.n_items);
}
};
const TypeBasedCostItem = struct {
op_desc: OperationDesc,
cost_kind: TypeBasedCost,
tpe: SType,
};
Cost Accumulator
Tracks costs during evaluation with limit enforcement [9][10]:
const CostCounter = struct {
initial_cost: JitCost,
current_cost: JitCost,
pub fn init(initial: JitCost) CostCounter {
return .{
.initial_cost = initial,
.current_cost = initial,
};
}
pub fn add(self: *CostCounter, cost: JitCost) !void {
self.current_cost = try self.current_cost.add(cost);
}
pub fn reset(self: *CostCounter) void {
self.current_cost = self.initial_cost;
}
};
const CostAccumulator = struct {
scope_stack: std.ArrayList(Scope),
cost_limit: ?JitCost,
allocator: Allocator,
const Scope = struct {
counter: CostCounter,
child_result: i32 = 0,
pub fn add(self: *Scope, cost: JitCost) !void {
try self.counter.add(cost);
}
pub fn currentCost(self: *const Scope) JitCost {
return self.counter.current_cost;
}
};
pub fn init(
allocator: Allocator,
initial_cost: JitCost,
cost_limit: ?JitCost,
) CostAccumulator {
var stack = std.ArrayList(Scope).init(allocator);
stack.append(.{ .counter = CostCounter.init(initial_cost) }) catch unreachable;
return .{
.scope_stack = stack,
.cost_limit = cost_limit,
.allocator = allocator,
};
}
pub fn currentScope(self: *CostAccumulator) *Scope {
return &self.scope_stack.items[self.scope_stack.items.len - 1];
}
/// Add cost, checking limit
pub fn add(self: *CostAccumulator, cost: JitCost) !void {
try self.currentScope().add(cost);
if (self.cost_limit) |limit| {
const accumulated = self.currentScope().currentCost();
if (accumulated.gt(limit)) {
return error.CostLimitExceeded;
}
}
}
/// Total accumulated cost
pub fn totalCost(self: *const CostAccumulator) JitCost {
return self.scope_stack.items[self.scope_stack.items.len - 1].counter.current_cost;
}
pub fn reset(self: *CostAccumulator) void {
self.scope_stack.clearRetainingCapacity();
self.scope_stack.append(.{
.counter = CostCounter.init(.{ .value = 0 }),
}) catch unreachable;
}
};
Cost Limit Enforcement
Cost Accumulation Flow
─────────────────────────────────────────────────────
Each operation:
1. Compute operation cost
2. Call accumulator.add(opCost)
3. Check: accumulatedCost > limit?
Yes → return CostLimitExceeded
No → continue execution
At the end:
totalCost = accumulator.totalCost()
blockCost = totalCost.toBlockCost()
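The flow above fits in a few lines. A minimal Rust sketch of an accumulator with limit enforcement (illustrative; costs are plain `u64` and errors are strings, whereas the real accumulator uses `JitCost` and typed errors):

```rust
struct CostAccumulator {
    current: u64,
    limit: Option<u64>,
}

impl CostAccumulator {
    /// Add a cost, failing on overflow or when the limit is exceeded.
    fn add(&mut self, cost: u64) -> Result<(), &'static str> {
        self.current = self.current.checked_add(cost).ok_or("CostOverflow")?;
        if let Some(limit) = self.limit {
            if self.current > limit {
                return Err("CostLimitExceeded");
            }
        }
        Ok(())
    }
}

fn main() {
    let mut acc = CostAccumulator { current: 0, limit: Some(100) };
    assert!(acc.add(60).is_ok());
    assert!(acc.add(40).is_ok()); // landing exactly on the limit is allowed
    assert_eq!(acc.add(1), Err("CostLimitExceeded"));
}
```

Note the check is strictly greater-than, matching the `gt` comparison used when the accumulator checks its limit.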
Evaluator Cost Methods
The evaluator provides methods to add costs [11][12]:
const Evaluator = struct {
cost_accum: CostAccumulator,
cost_trace: ?std.ArrayList(CostItem),
profiler: ?*Profiler,
// ... other fields
/// Add fixed cost
pub fn addCost(self: *Evaluator, cost_kind: FixedCost, op_desc: OperationDesc) !void {
try self.cost_accum.add(cost_kind.cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.fixed = .{ .op_desc = op_desc, .cost_kind = cost_kind },
});
}
}
/// Add fixed cost and execute block
pub fn addFixedCost(
self: *Evaluator,
comptime T: type,
cost_kind: FixedCost,
op_desc: OperationDesc,
block: *const fn (*Evaluator) anyerror!T,
) !T {
if (self.profiler) |prof| {
const start = std.time.nanoTimestamp();
try self.cost_accum.add(cost_kind.cost);
const result = try block(self);
const end = std.time.nanoTimestamp();
prof.addTiming(op_desc, end - start);
return result;
} else {
try self.cost_accum.add(cost_kind.cost);
return block(self);
}
}
/// Add per-item cost for known count
pub fn addSeqCost(
self: *Evaluator,
cost_kind: PerItemCost,
n_items: usize,
op_desc: OperationDesc,
) !void {
const cost = try cost_kind.cost(n_items);
try self.cost_accum.add(cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.seq = .{
.op_desc = op_desc,
.cost_kind = cost_kind,
.n_items = n_items,
},
});
}
}
/// Add type-based cost
pub fn addTypeBasedCost(
self: *Evaluator,
cost_kind: TypeBasedCost,
tpe: SType,
op_desc: OperationDesc,
) !void {
const cost = cost_kind.cost_fn(tpe);
try self.cost_accum.add(cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.type_based = .{
.op_desc = op_desc,
.cost_kind = cost_kind,
.tpe = tpe,
},
});
}
}
};
PowHit (Autolykos2) Cost
Special cost computation for Autolykos2 mining[13]:
const PowHitCost = struct {
/// Cost of custom Autolykos2 hash function
pub fn cost(
k: u32, // k-sum problem inputs
msg: []const u8, // message to hash
nonce: []const u8, // padding for PoW output
h: []const u8, // block height padding
) JitCost {
const chunk_size = CalcBlake2b256.COST.chunk_size;
const per_chunk = CalcBlake2b256.COST.per_chunk_cost.value;
const base_cost: i32 = 500;
// The heaviest part: k + 1 Blake2b256 invocations
const input_len = msg.len + nonce.len + h.len;
const chunks_per_hash = input_len / chunk_size + 1;
const total_cost = base_cost + @as(i32, @intCast(k + 1)) *
@as(i32, @intCast(chunks_per_hash)) * per_chunk;
return .{ .value = total_cost };
}
};
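Walking the formula with concrete (illustrative) input sizes, using the CalcBlake2b256 constants from this chapter (chunk_size=128, per_chunk_cost=7) and the base cost of 500:

```python
# Worked example of the PowHit cost formula above. Constants come from
# CalcBlake2b256 in this chapter; the input sizes are illustrative.

CHUNK_SIZE = 128
PER_CHUNK = 7
BASE_COST = 500

def pow_hit_cost(k: int, msg_len: int, nonce_len: int, h_len: int) -> int:
    input_len = msg_len + nonce_len + h_len
    chunks_per_hash = input_len // CHUNK_SIZE + 1
    # The heaviest part: k + 1 Blake2b256 invocations
    return BASE_COST + (k + 1) * chunks_per_hash * PER_CHUNK

# k=32, 32-byte message, 8-byte nonce, 4-byte height padding:
# input_len = 44 -> 1 chunk -> 500 + 33 * 1 * 7 = 731
print(pow_hit_cost(32, 32, 8, 4))  # 731
```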
Operation Cost Constants
Defined in operation companion structs:
const Constant = struct {
pub const COST = FixedCost{ .cost = .{ .value = 5 } };
// ...
};
const ValUse = struct {
pub const COST = FixedCost{ .cost = .{ .value = 5 } };
// ...
};
const If = struct {
pub const COST = FixedCost{ .cost = .{ .value = 10 } };
// ...
};
const MapCollection = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 20 },
.per_chunk_cost = .{ .value = 2 },
.chunk_size = 10,
};
// ...
};
const CalcBlake2b256 = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 20 },
.per_chunk_cost = .{ .value = 7 },
.chunk_size = 128,
};
// ...
};
const CalcSha256 = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 80 },
.per_chunk_cost = .{ .value = 8 },
.chunk_size = 64,
};
// ...
};
Cost Tracing Output
Example trace from evaluating a script:
Cost Trace
─────────────────────────────────────────────────────
Constant : 5
ValUse : 5
ByIndex : 30
Constant : 5
MapCollection[10] : 22 (base=20, chunks=1)
Filter[5] : 22 (base=20, chunks=1)
blake2b256[256] : 34 (base=20, chunks=2)
─────────────────────────────────────────────────────
Total JitCost : 123
Block Cost : 12
Complete Evaluation with Costing
pub fn evaluateWithCost(
ergo_tree: *const ErgoTree,
context: *const Context,
cost_limit: JitCost,
allocator: Allocator,
) !struct { result: SigmaBoolean, cost: JitCost } {
const cost_accum = CostAccumulator.init(
allocator,
.{ .value = 0 },
cost_limit,
);
var evaluator = Evaluator{
.context = context,
.constants = ergo_tree.constants,
.cost_accum = cost_accum,
.cost_trace = null,
.profiler = null,
.allocator = allocator,
};
const empty_env = Env.init(allocator);
const result = try evaluator.eval(&empty_env, ergo_tree.root);
const sigma_prop = result.asSigmaProp() orelse
return error.NotSigmaProp;
return .{
.result = sigma_prop.sigma_boolean,
.cost = evaluator.cost_accum.totalCost(),
};
}
Summary
This chapter covered the cost model that ensures all ErgoTree scripts terminate within bounded resources:
- JitCost uses 10x scaling from block costs, providing finer granularity for internal calculations while maintaining integer arithmetic without floating point
- FixedCost applies to constant-time operations like variable access (cost = 5) and conditionals (cost = 10)
- PerItemCost models operations that scale with input size using the formula baseCost + ceil(n/chunkSize) × perChunkCost; this applies to collection operations and hash functions
- TypeBasedCost handles operations whose cost depends on operand type; BigInt operations are more expensive than primitive integer operations
- CostAccumulator tracks accumulated costs during evaluation and checks against the limit after each operation; exceeding the limit immediately fails evaluation
- CostItem types (FixedCostItem, SeqCostItem, TypeBasedCostItem) enable detailed cost tracing for debugging and optimization
- The PowHit cost function handles the special case of Autolykos2 mining operations
Next: Chapter 14: Verifier Implementation
2. Scala: JitCost.scala:3-7
3. Rust: cost_accum.rs:1-12
4. Scala: JitCost.scala:9-36
5. Scala: CostKind.scala:10-55
6. Rust: costs.rs:1-24
7. Scala: CostKind.scala:60-66
8. Scala: CostItem.scala:3-78
9. Rust: cost_accum.rs:13-17
10. Scala: CostAccumulator.scala:7-79
11. Rust: cost_accum.rs:19-43
12. Rust: eval.rs:130-160
13. Scala: CostKind.scala:71-88
Chapter 14: Verifier Implementation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 11 for Sigma protocol verification and Fiat-Shamir transformation
- Chapter 12 for ErgoTree reduction to SigmaBoolean
- Chapter 13 for cost accumulation during verification
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the complete verification flow from ErgoTree to boolean result
- Implement verify() and fullReduction() methods
- Handle soft-fork conditions gracefully to maintain network compatibility
- Verify cryptographic signatures using Fiat-Shamir commitment reconstruction
- Estimate verification cost before performing expensive cryptographic operations
Verification Overview
Verification is the counterpart to proving: given an ErgoTree, a transaction context, and a cryptographic proof, the verifier determines whether the proof is valid. This process happens for every input box in every transaction—efficient verification is critical for blockchain throughput.
The verification proceeds in two phases: first reduce the ErgoTree to a SigmaBoolean proposition (using the evaluator from Chapter 12), then verify that the cryptographic proof satisfies that proposition[1][2].
Verification Pipeline
─────────────────────────────────────────────────────
Input: ErgoTree + Context + Proof + Message
┌──────────────────────────────────────────────────┐
│ 1. REDUCTION PHASE │
│ │
│ ErgoTree ────> propositionFromErgoTree() │
│ │ │
│ ▼ │
│ SigmaPropValue │
│ │ │
│ ▼ │
│ fullReduction() │
│ │ │
│ ▼ │
│ SigmaBoolean + Cost │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ 2. VERIFICATION PHASE │
│ │
│ TrueProp ────> return (true, cost) │
│ FalseProp ────> return (false, cost) │
│ │
│ Otherwise: │
│ estimateCryptoVerifyCost() │
│ │ │
│ ▼ │
│ verifySignature() ────> boolean result │
└──────────────────────────────────────────────────┘
Output: (verified: bool, total_cost: u64)
Verification Result
const VerificationResult = struct {
/// Result of SigmaProp verification
result: bool,
/// Estimated cost of contract execution
cost: u64,
/// Diagnostic information
diag: ReductionDiagnosticInfo,
};
const ReductionResult = struct {
/// SigmaBoolean proposition
sigma_prop: SigmaBoolean,
/// Accumulated cost (block scale)
cost: u64,
/// Diagnostic info
diag: ReductionDiagnosticInfo,
};
const ReductionDiagnosticInfo = struct {
/// Environment after evaluation
env: Env,
/// Pretty-printed expression
pretty_printed_expr: ?[]const u8,
};
Verifier Trait
The base verifier interface[3][4]:
const Verifier = struct {
const Self = @This();
/// Cost per byte for deserialization
pub const COST_PER_BYTE_DESERIALIZED: i32 = 2;
/// Cost per tree byte for substitution
pub const COST_PER_TREE_BYTE: i32 = 2;
/// Verify an ErgoTree in context with proof
pub fn verify(
self: *const Self,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: ProofBytes,
message: []const u8,
) VerifierError!VerificationResult {
// Reduce to SigmaBoolean
const reduction = try reduceToCrypto(ergo_tree, context);
const result: bool = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) {
break :blk false;
}
// Verifier Steps 1-3: Parse proof
const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
// Verifier Steps 4-6: Check commitments
break :blk try checkCommitments(unchecked, message);
},
};
return .{
.result = result,
.cost = reduction.cost,
.diag = reduction.diag,
};
}
};
The verify() Method
Complete verification entry point[5]:
pub fn verify(
env: ScriptEnv,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: []const u8,
message: []const u8,
) VerifierError!VerificationResult {
// Check soft-fork condition first
if (try checkSoftForkCondition(ergo_tree, context)) |soft_fork_result| {
return soft_fork_result;
}
// REDUCTION PHASE
const reduced = try fullReduction(ergo_tree, context, env);
// VERIFICATION PHASE
return switch (reduced.sigma_prop) {
.true_prop => .{ .result = true, .cost = reduced.cost, .diag = reduced.diag },
.false_prop => .{ .result = false, .cost = reduced.cost, .diag = reduced.diag },
else => |sb| blk: {
// Non-trivial proposition: verify cryptographic proof
const full_cost = try addCryptoCost(sb, reduced.cost, context.cost_limit);
const ok = verifySignature(sb, message, proof) catch false;
break :blk .{
.result = ok,
.cost = full_cost,
.diag = reduced.diag,
};
},
};
}
Full Reduction
Reduces ErgoTree to SigmaBoolean with cost tracking[6][7]:
pub fn fullReduction(
ergo_tree: *const ErgoTree,
context: *const Context,
env: ScriptEnv,
) ReducerError!ReductionResult {
// Extract proposition from ErgoTree
const prop = try propositionFromErgoTree(ergo_tree, context);
// Fast path: SigmaProp constant
if (prop == .sigma_prop_constant) {
const sb = prop.sigma_prop_constant.toSigmaBoolean();
const eval_cost = SigmaPropConstant.COST.cost.toBlockCost();
const res_cost = try addCostChecked(context.init_cost, eval_cost, context.cost_limit);
return .{
.sigma_prop = sb,
.cost = res_cost,
.diag = .{ .env = context.env, .pretty_printed_expr = null },
};
}
// No DeserializeContext: direct evaluation
if (!ergo_tree.hasDeserialize()) {
return evalToCrypto(context, ergo_tree);
}
// Has DeserializeContext: special handling
return reductionWithDeserialize(ergo_tree, prop, context, env);
}
fn propositionFromErgoTree(
ergo_tree: *const ErgoTree,
context: *const Context,
) PropositionError!SigmaPropValue {
return switch (ergo_tree.root) {
.parsed => |tree| ergo_tree.toProposition(ergo_tree.header.constant_segregation),
.unparsed => |u| blk: {
if (context.validation_settings.isSoftFork(u.err)) {
// Soft-fork: return true (accept)
break :blk SigmaPropValue.true_sigma_prop;
}
// Hard error
return error.UnparsedErgoTree;
},
};
}
Signature Verification
Implements Verifier Steps 4-6 of the Sigma protocol[8][9]:
/// Verify a signature on message for given proposition
pub fn verifySignature(
sigma_tree: SigmaBoolean,
message: []const u8,
signature: []const u8,
) VerifierError!bool {
return switch (sigma_tree) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (signature.len == 0) {
break :blk false;
}
// Verifier Steps 1-3: Parse proof
const unchecked = try parseAndComputeChallenges(&sb, signature);
// Verifier Steps 4-6: Check commitments
break :blk try checkCommitments(unchecked, message);
},
};
}
/// Verifier Steps 4-6: Check commitments match Fiat-Shamir challenge
fn checkCommitments(
sp: UncheckedTree,
message: []const u8,
) VerifierError!bool {
// Verifier Step 4: Compute commitments from challenges and responses
const new_root = computeCommitments(sp);
// Steps 5-6: Serialize tree for Fiat-Shamir
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
try fiatShamirTreeToBytes(&new_root, buf.writer());
try buf.appendSlice(message);
// Compute expected challenge
const expected_challenge = fiatShamirHashFn(buf.items);
// Compare with actual challenge
// NOTE: In production, use constant-time comparison for challenge bytes
// to prevent timing side-channels: std.crypto.utils.timingSafeEql
const actual_challenge = new_root.challenge();
return std.mem.eql(u8, &actual_challenge, &expected_challenge);
}
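The final check can be sketched in Python. Two assumptions are flagged here: Blake2b truncated to a 24-byte (192-bit) challenge stands in for fiatShamirHashFn, and a plain byte string stands in for the real fiatShamirTreeToBytes serialization, which is elided.

```python
import hashlib
import hmac

# Sketch of the verifier's final Fiat-Shamir check (Steps 5-6).
# Assumptions: Blake2b truncated to 24 bytes stands in for
# fiatShamirHashFn; `tree_bytes` stands in for the real serialization.

CHALLENGE_LEN = 24  # 192-bit soundness parameter

def fiat_shamir_challenge(tree_bytes: bytes, message: bytes) -> bytes:
    digest = hashlib.blake2b(tree_bytes + message, digest_size=32).digest()
    return digest[:CHALLENGE_LEN]

def check_challenge(actual: bytes, tree_bytes: bytes, message: bytes) -> bool:
    expected = fiat_shamir_challenge(tree_bytes, message)
    # Constant-time comparison avoids timing side-channels
    return hmac.compare_digest(actual, expected)

tree = b"serialized-proof-tree"
msg = b"transaction-bytes"
challenge = fiat_shamir_challenge(tree, msg)
print(check_challenge(challenge, tree, msg))  # True
tampered = bytes([challenge[0] ^ 1]) + challenge[1:]
print(check_challenge(tampered, tree, msg))  # False
```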
Computing Commitments
Verifier Step 4: Reconstruct commitments from challenges and responses[10][11]:
/// For every leaf, compute commitment from challenge and response
pub fn computeCommitments(sp: UncheckedTree) UncheckedTree {
return switch (sp) {
.unchecked_leaf => |leaf| switch (leaf) {
.unchecked_schnorr => |sn| blk: {
// Reconstruct: a = g^z / h^e
const a = DlogProver.computeCommitment(
&sn.proposition,
&sn.challenge,
&sn.second_message,
);
break :blk UncheckedTree{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = sn.proposition,
.challenge = sn.challenge,
.second_message = sn.second_message,
.commitment_opt = FirstDlogProverMessage{ .a = a },
},
},
};
},
.unchecked_dh_tuple => |dh| blk: {
// Reconstruct both commitments
const commitment = DhTupleProver.computeCommitment(
&dh.proposition,
&dh.challenge,
&dh.second_message,
);
break :blk UncheckedTree{
.unchecked_leaf = .{
.unchecked_dh_tuple = .{
.proposition = dh.proposition,
.challenge = dh.challenge,
.second_message = dh.second_message,
.commitment_opt = commitment,
},
},
};
},
},
.unchecked_conjecture => |conj| blk: {
// Recursively process children
// Allocation failure treated as fatal in this sketch
const new_children = allocator.alloc(UncheckedTree, conj.children.len) catch unreachable;
for (conj.children, 0..) |child, i| {
new_children[i] = computeCommitments(child);
}
break :blk conj.withChildren(new_children);
},
};
}
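The reconstruction works because the Schnorr response satisfies z = r + e·x, so g^z = a · h^e and therefore a = g^z · (h^e)^(-1). A toy Python demonstration in a small prime-field multiplicative group; the real implementation works over the secp256k1 curve, and every number here is illustrative:

```python
# Toy demonstration of Schnorr commitment reconstruction (a = g^z / h^e)
# in a small multiplicative group mod a prime. Illustrates the algebra
# only; the actual code uses secp256k1 point arithmetic.

p = 467           # small prime modulus (toy parameter)
g = 2             # generator (toy parameter)
x = 127           # prover's secret
h = pow(g, x, p)  # public key h = g^x

r = 45            # prover's commitment randomness
a = pow(g, r, p)  # commitment a = g^r
e = 11            # challenge
z = (r + e * x) % (p - 1)  # response z = r + e*x (exponents mod group order)

# Verifier reconstructs the commitment from (h, e, z) alone:
h_e_inv = pow(pow(h, e, p), p - 2, p)  # (h^e)^-1 via Fermat's little theorem
a_reconstructed = (pow(g, z, p) * h_e_inv) % p
print(a_reconstructed == a)  # True
```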
Crypto Verification Cost
Estimate cost before performing expensive operations[12]:
const VerificationCosts = struct {
/// Cost for Schnorr commitment computation
pub const COMPUTE_COMMITMENTS_SCHNORR = FixedCost{ .cost = .{ .value = 3400 } };
/// Cost for DHT commitment computation
pub const COMPUTE_COMMITMENTS_DHT = FixedCost{ .cost = .{ .value = 6450 } };
/// Total Schnorr verification cost
pub const PROVE_DLOG_VERIFICATION: JitCost = blk: {
const parse = ParseChallenge_ProveDlog.COST.cost;
const compute = COMPUTE_COMMITMENTS_SCHNORR.cost;
const serialize = ToBytes_Schnorr.COST.cost;
break :blk (parse.add(compute) catch unreachable).add(serialize) catch unreachable;
};
/// Total DHT verification cost
pub const PROVE_DHT_VERIFICATION: JitCost = blk: {
const parse = ParseChallenge_ProveDHT.COST.cost;
const compute = COMPUTE_COMMITMENTS_DHT.cost;
const serialize = ToBytes_DHT.COST.cost;
break :blk (parse.add(compute) catch unreachable).add(serialize) catch unreachable;
};
};
/// Estimate verification cost without performing crypto
pub fn estimateCryptoVerifyCost(sb: SigmaBoolean) JitCost {
return switch (sb) {
.prove_dlog => VerificationCosts.PROVE_DLOG_VERIFICATION,
.prove_dh_tuple => VerificationCosts.PROVE_DHT_VERIFICATION,
.c_and => |and_node| blk: {
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (and_node.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk node_cost.add(children_cost) catch unreachable;
},
.c_or => |or_node| blk: {
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (or_node.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk node_cost.add(children_cost) catch unreachable;
},
.c_threshold => |th| blk: {
const n_children = th.children.len;
const n_coefs = n_children - th.k;
const parse_cost = ParsePolynomial.COST.cost(@intCast(n_coefs)) catch unreachable;
const eval_cost = (EvaluatePolynomial.COST.cost(@intCast(n_coefs)) catch unreachable).mul(@intCast(n_children)) catch unreachable;
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (th.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk parse_cost.add(eval_cost).add(node_cost).add(children_cost) catch unreachable;
},
else => JitCost{ .value = 0 }, // Trivial proposition
};
}
/// Add crypto cost to accumulated cost
fn addCryptoCost(
sigma_prop: SigmaBoolean,
base_cost: u64,
cost_limit: u64,
) CostError!u64 {
const crypto_cost = estimateCryptoVerifyCost(sigma_prop).toBlockCost();
return addCostChecked(base_cost, crypto_cost, cost_limit);
}
Soft-Fork Handling
Handle unrecognized script versions gracefully[13]:
/// Check for soft-fork condition
fn checkSoftForkCondition(
ergo_tree: *const ErgoTree,
context: *const Context,
) VerifierError!?VerificationResult {
if (context.activated_script_version > MAX_SUPPORTED_SCRIPT_VERSION) {
// Protocol version exceeds interpreter capabilities
if (ergo_tree.header.version > MAX_SUPPORTED_SCRIPT_VERSION) {
// Cannot verify: accept and rely on 90% upgraded nodes
return .{
.result = true,
.cost = context.init_cost,
.diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
};
}
// Can verify despite protocol upgrade
} else {
// Activated version within supported range
if (ergo_tree.header.version > context.activated_script_version) {
// ErgoTree version too high
return error.ErgoTreeVersionTooHigh;
}
}
return null; // Proceed normally
}
/// Soft-fork reduction result: accept as true
fn whenSoftForkReductionResult(cost: u64) ReductionResult {
return .{
.sigma_prop = .{ .trivial_prop = true },
.cost = cost,
.diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
};
}
DeserializeContext Handling
Scripts may contain deserialization operations[14]:
fn reductionWithDeserialize(
ergo_tree: *const ErgoTree,
prop: SigmaPropValue,
context: *const Context,
env: ScriptEnv,
) ReducerError!ReductionResult {
// Add cost for deserialization substitution
const tree_bytes = ergo_tree.bytes();
const deserialize_cost = @as(i64, @intCast(tree_bytes.len)) * COST_PER_TREE_BYTE;
const curr_cost = try addCostChecked(context.init_cost, deserialize_cost, context.cost_limit);
var context1 = context.*;
context1.init_cost = curr_cost;
// Substitute DeserializeContext nodes
const prop_tree = try applyDeserializeContext(&context1, prop);
// Reduce the substituted tree
return reduceToCrypto(&context1, prop_tree);
}
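The surcharge is plain per-byte arithmetic. A Python sketch with an assumed initial cost and limit (both values illustrative):

```python
# Sketch of the per-byte deserialization surcharge above
# (COST_PER_TREE_BYTE = 2); the initial cost and limit are assumed.

COST_PER_TREE_BYTE = 2

def add_cost_checked(current: int, delta: int, limit: int) -> int:
    total = current + delta
    if total > limit:
        raise RuntimeError("CostLimitExceeded")
    return total

tree_len = 250  # serialized ErgoTree size in bytes (illustrative)
deserialize_cost = tree_len * COST_PER_TREE_BYTE  # 500
print(add_cost_checked(10_000, deserialize_cost, limit=1_000_000))  # 10500
```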
Complete Verification Flow
verify(ergoTree, context, proof, message)
─────────────────────────────────────────────────────
Step 1: checkSoftForkCondition()
│
├─ activated > MaxSupported AND script > MaxSupported
│ └─> return (true, initCost) [soft-fork accept]
│
├─ script.version > activated
│ └─> throw ErgoTreeVersionTooHigh
│
└─ Otherwise: proceed
│
▼
Step 2: fullReduction()
│
├─ propositionFromErgoTree()
│ └─ Handle unparsed trees
│
├─ SigmaPropConstant
│ └─> Extract directly
│
├─ No DeserializeContext
│ └─> evalToCrypto()
│
└─ Has DeserializeContext
└─> reductionWithDeserialize()
│
▼
ReductionResult(sigmaBoolean, cost)
│
▼
Step 3: Check result
│
├─ TrueProp ────> return (true, cost)
├─ FalseProp ────> return (false, cost)
└─ Non-trivial ────> continue
│
▼
Step 4: addCryptoCost()
│
└─ Estimate without crypto ops
│
▼
Step 5: verifySignature()
│
├─ parseAndComputeChallenges()
│ └─ Parse proof bytes
│
├─ computeCommitments()
│ └─ Reconstruct commitments
│
├─ fiatShamirTreeToBytes()
│ └─ Serialize tree
│
└─ fiatShamirHashFn()
└─ Compute expected challenge
│
▼
Step 6: Return (result, totalCost)
Verifier Errors
const VerifierError = error{
/// Failed to parse ErgoTree
ErgoTreeError,
/// Failed to evaluate ErgoTree
EvalError,
/// Signature parsing error
SigParsingError,
/// Fiat-Shamir serialization error
FiatShamirTreeSerializationError,
/// Cost limit exceeded
CostLimitExceeded,
/// ErgoTree version too high
ErgoTreeVersionTooHigh,
/// Cannot parse unparsed tree
UnparsedErgoTree,
};
Test Verifier
Simple verifier implementation for testing[15]:
const TestVerifier = struct {
const Self = @This();
pub fn verify(
self: *const Self,
tree: *const ErgoTree,
ctx: *const Context,
proof: ProofBytes,
message: []const u8,
) VerifierError!VerificationResult {
_ = self;
const reduction = try reduceToCrypto(tree, ctx);
const result: bool = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) {
break :blk false;
}
const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
break :blk try checkCommitments(unchecked, message);
},
};
return .{
.result = result,
.cost = 0, // Test verifier doesn't track cost
.diag = reduction.diag,
};
}
};
Summary
This chapter covered the verifier implementation that validates Sigma proofs:
- Verification proceeds in two phases: reduction (ErgoTree → SigmaBoolean) and cryptographic verification (proof checking)
- fullReduction() evaluates the ErgoTree to a SigmaBoolean proposition while tracking costs
- verifySignature() implements Verifier Steps 4-6: parse proof bytes, compute expected commitments from challenges and responses, then verify via Fiat-Shamir hash
- Soft-fork handling accepts scripts with unrecognized versions or opcodes, enabling protocol upgrades without network splits
- Cost estimation predicts cryptographic verification cost before performing expensive EC operations, failing early if the limit would be exceeded
- Commitment reconstruction (computeCommitments) derives the prover's commitments from the challenges and responses, which must match the Fiat-Shamir challenge
- DeserializeContext nodes are substituted with their deserialized values before reduction begins
Next: Chapter 15: Prover Implementation
1. Scala: Interpreter.scala:30-100
2. Rust: verifier.rs:27-52
3. Scala: Interpreter.scala:78-92
4. Rust: verifier.rs:55-88
5. Scala: Interpreter.scala:132-167
6. Scala: Interpreter.scala:196-239
7. Rust: eval.rs:130-160
8. Scala: Interpreter.scala:282-298
9. Rust: verifier.rs:91-125
10. Scala: Interpreter.scala:324-347
11. Rust: verifier.rs:127-163
12. Scala: Interpreter.scala:362-408
13. Scala: Interpreter.scala:450-472
14. Scala: Interpreter.scala:492-517
15. Rust: verifier.rs:166-168
Chapter 15: Prover Implementation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 11 for Sigma protocol structure, simulation, and Fiat-Shamir
- Chapter 12 for ErgoTree reduction to SigmaBoolean
- Chapter 14 for understanding what the verifier expects
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the 10-step proving algorithm from SigmaBoolean to serialized proof
- Work with the UnprovenTree data structure and its transformations
- Explain challenge flow through AND, OR, and THRESHOLD compositions
- Use the hint system for distributed multi-party signing
- Serialize proofs in the compact format expected by verifiers
Prover Overview
The prover is the counterpart to the verifier: given an ErgoTree, a transaction context, and the necessary secret keys, it generates a cryptographic proof that the verifier will accept. The proving algorithm is significantly more complex than verification because it must handle composite propositions (AND/OR/THRESHOLD) by generating simulated transcripts for children the prover cannot prove, while maintaining the zero-knowledge property that simulated and real transcripts are indistinguishable.
The prover generates cryptographic proofs for sigma propositions through a multi-phase algorithm[1][2]:
Proving Pipeline
─────────────────────────────────────────────────────
Step 0: SigmaBoolean ─────> convertToUnproven()
│
▼
Step 1: Mark real nodes (bottom-up)
│
▼
Step 2: Check root is real (abort if simulated)
│
▼
Step 3: Polish simulated (top-down)
│
▼
Steps 4-6: Simulate/Commit
- Assign challenges to simulated children
- Simulate simulated leaves
- Compute commitments for real leaves
│
▼
Step 7: Serialize for Fiat-Shamir
│
▼
Step 8: Compute root challenge = H(tree || message)
│
▼
Step 9: Compute real challenges and responses
│
▼
Step 10: Serialize proof bytes
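Step 4's challenge distribution at a real OR node can be made concrete: simulated children receive fresh random challenges, and the single real child receives whatever value makes all children XOR to the parent's challenge, as in the standard sigma-protocol OR composition (challenges here are 24-byte GF(2^192) elements, added by XOR). A Python sketch:

```python
import secrets
from functools import reduce

# Sketch of OR challenge splitting (Step 4 for a real OR node), assuming
# 24-byte challenges combined by XOR: children's challenges must XOR
# to the parent's challenge so the verifier's check passes.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_or_challenge(root: bytes, n_simulated: int):
    # Random challenges for the simulated children
    sim = [secrets.token_bytes(len(root)) for _ in range(n_simulated)]
    # The real child's challenge makes the XOR of all children equal root
    real = reduce(xor, sim, root)
    return real, sim

root = secrets.token_bytes(24)
real, sim = split_or_challenge(root, n_simulated=2)
# Invariant the verifier relies on:
assert reduce(xor, sim, real) == root
```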
Tree Data Structures
Node Position
Position encodes the path from the root[3]:
const NodePosition = struct {
/// Position bytes (e.g., [0, 2, 1] for "0-2-1")
positions: []const u8,
pub const CRYPTO_TREE_PREFIX: NodePosition = .{ .positions = &[_]u8{0} };
pub fn child(self: NodePosition, idx: usize, allocator: Allocator) !NodePosition {
const new_pos = try allocator.alloc(u8, self.positions.len + 1);
@memcpy(new_pos[0..self.positions.len], self.positions);
new_pos[self.positions.len] = @intCast(idx);
return .{ .positions = new_pos };
}
};
Position Encoding
─────────────────────────────────────────────────────
0 (root)
/ | \
/ | \
0-0 0-1 0-2 (children)
/|
/ |
0-2-0 0-2-1 (grandchildren)
Prefix "0" = crypto-tree (vs "1" = ErgoTree)
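A position is just the parent's path with the child index appended. A tiny Python sketch, with tuples standing in for the byte slices:

```python
# Sketch of NodePosition.child: append the child index to the parent's
# path. The "0" prefix marks the crypto tree, per the diagram above.

CRYPTO_TREE_PREFIX = (0,)

def child(position: tuple, idx: int) -> tuple:
    return position + (idx,)

root = CRYPTO_TREE_PREFIX
grandchild = child(child(root, 2), 1)
print("-".join(map(str, grandchild)))  # 0-2-1
```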
Unproven Tree
During proving, the tree undergoes transformations[4][5]:
const UnprovenTree = union(enum) {
unproven_leaf: UnprovenLeaf,
unproven_conjecture: UnprovenConjecture,
pub fn isReal(self: UnprovenTree) bool {
return !self.simulated();
}
pub fn simulated(self: UnprovenTree) bool {
return switch (self) {
.unproven_leaf => |l| l.simulated,
.unproven_conjecture => |c| c.simulated(),
};
}
pub fn withChallenge(self: UnprovenTree, challenge: Challenge) UnprovenTree {
return switch (self) {
.unproven_leaf => |l| .{ .unproven_leaf = l.withChallenge(challenge) },
.unproven_conjecture => |c| .{ .unproven_conjecture = c.withChallenge(challenge) },
};
}
pub fn withSimulated(self: UnprovenTree, sim: bool) UnprovenTree {
return switch (self) {
.unproven_leaf => |l| .{ .unproven_leaf = l.withSimulated(sim) },
.unproven_conjecture => |c| .{ .unproven_conjecture = c.withSimulated(sim) },
};
}
};
Unproven Leaf Nodes
const UnprovenLeaf = union(enum) {
unproven_schnorr: UnprovenSchnorr,
unproven_dh_tuple: UnprovenDhTuple,
// ... accessor methods
};
const UnprovenSchnorr = struct {
proposition: ProveDlog,
commitment_opt: ?FirstDlogProverMessage,
randomness_opt: ?Scalar, // Secret r for commitment
challenge_opt: ?Challenge,
simulated: bool,
position: NodePosition,
pub fn withChallenge(self: UnprovenSchnorr, c: Challenge) UnprovenSchnorr {
return .{
.proposition = self.proposition,
.commitment_opt = self.commitment_opt,
.randomness_opt = self.randomness_opt,
.challenge_opt = c,
.simulated = self.simulated,
.position = self.position,
};
}
pub fn withSimulated(self: UnprovenSchnorr, sim: bool) UnprovenSchnorr {
return .{
.proposition = self.proposition,
.commitment_opt = self.commitment_opt,
.randomness_opt = self.randomness_opt,
.challenge_opt = self.challenge_opt,
.simulated = sim,
.position = self.position,
};
}
};
const UnprovenDhTuple = struct {
proposition: ProveDhTuple,
commitment_opt: ?FirstDhTupleProverMessage,
randomness_opt: ?Scalar,
challenge_opt: ?Challenge,
simulated: bool,
position: NodePosition,
};
Unproven Conjecture Nodes
const UnprovenConjecture = union(enum) {
cand_unproven: CandUnproven,
cor_unproven: CorUnproven,
cthreshold_unproven: CthresholdUnproven,
pub fn simulated(self: UnprovenConjecture) bool {
return switch (self) {
.cand_unproven => |c| c.simulated,
.cor_unproven => |c| c.simulated,
.cthreshold_unproven => |c| c.simulated,
};
}
pub fn children(self: UnprovenConjecture) []ProofTree {
return switch (self) {
.cand_unproven => |c| c.children,
.cor_unproven => |c| c.children,
.cthreshold_unproven => |c| c.children,
};
}
};
const CandUnproven = struct {
proposition: Cand,
challenge_opt: ?Challenge,
simulated: bool,
children: []ProofTree,
position: NodePosition,
};
const CorUnproven = struct {
proposition: Cor,
challenge_opt: ?Challenge,
simulated: bool,
children: []ProofTree,
position: NodePosition,
};
const CthresholdUnproven = struct {
proposition: Cthreshold,
challenge_opt: ?Challenge,
simulated: bool,
k: u8, // Threshold
children: []ProofTree,
polynomial_opt: ?Gf2_192Poly, // For challenge distribution
position: NodePosition,
};
The Proving Algorithm
Prover Trait
const Prover = struct {
secrets: []const PrivateInput,
pub fn prove(
self: *const Prover,
tree: *const ErgoTree,
ctx: *const Context,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!ProverResult {
const reduction = try reduceToCrypto(tree, ctx);
const proof = try self.generateProof(
reduction.sigma_prop,
message,
hints_bag,
);
return .{
.proof = proof,
.extension = ctx.extension,
};
}
pub fn generateProof(
self: *const Prover,
sigma_bool: SigmaBoolean,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!ProofBytes {
return switch (sigma_bool) {
.trivial_prop => |b| blk: {
if (b) break :blk ProofBytes.empty();
return error.ReducedToFalse;
},
else => |sb| blk: {
const unproven = try convertToUnproven(sb);
const unchecked = try proveToUnchecked(self, unproven, message, hints_bag);
break :blk serializeSig(unchecked);
},
};
}
};
Step 0: Convert to Unproven
Transform SigmaBoolean to UnprovenTree[6]:
fn convertToUnproven(sigma_tree: SigmaBoolean) ProverError!UnprovenTree {
return switch (sigma_tree) {
.c_and => |and_node| blk: {
const children = try allocator.alloc(ProofTree, and_node.children.len);
for (and_node.children, 0..) |child, i| {
children[i] = .{ .unproven_tree = try convertToUnproven(child) };
}
break :blk .{
.unproven_conjecture = .{
.cand_unproven = .{
.proposition = and_node,
.challenge_opt = null,
.simulated = false,
.children = children,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
};
},
.c_or => |or_node| blk: {
// Similar conversion for OR
// ...
},
.c_threshold => |th| blk: {
// Similar conversion for THRESHOLD
// ...
},
.prove_dlog => |pk| .{
.unproven_leaf = .{
.unproven_schnorr = .{
.proposition = pk,
.commitment_opt = null,
.randomness_opt = null,
.challenge_opt = null,
.simulated = false,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
},
.prove_dh_tuple => |dht| .{
.unproven_leaf = .{
.unproven_dh_tuple = .{
.proposition = dht,
.commitment_opt = null,
.randomness_opt = null,
.challenge_opt = null,
.simulated = false,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
},
else => error.Unexpected,
};
}
Step 1: Mark Real Nodes
Bottom-up traversal to mark what the prover can prove[7][8]:
fn markReal(
prover: *const Prover,
tree: UnprovenTree,
hints_bag: *const HintsBag,
) ProverError!UnprovenTree {
return rewriteBottomUp(tree, struct {
fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_leaf => |leaf| blk: {
// Leaf is real if prover has secret OR hint shows knowledge
const secret_known = hints.realImages().contains(leaf.proposition()) or
p.hasSecretFor(leaf.proposition());
break :blk leaf.withSimulated(!secret_known);
},
.unproven_conjecture => |conj| switch (conj) {
.cand_unproven => |cand| blk: {
// AND is real only if ALL children are real
const simulated = anyChildSimulated(cand.children);
break :blk cand.withSimulated(simulated);
},
.cor_unproven => |cor| blk: {
// OR is real if AT LEAST ONE child is real
const simulated = allChildrenSimulated(cor.children);
break :blk cor.withSimulated(simulated);
},
.cthreshold_unproven => |ct| blk: {
// THRESHOLD(k) is real if AT LEAST k children are real
const real_count = countRealChildren(ct.children);
break :blk ct.withSimulated(real_count < ct.k);
},
},
},
else => null,
};
}
}.transform, prover, hints_bag);
}
Step 2: Check Root
fn proveToUnchecked(
prover: *const Prover,
unproven: UnprovenTree,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!UncheckedTree {
// Step 1
const step1 = try markReal(prover, unproven, hints_bag);
// Step 2: If root is simulated, prover cannot prove
if (!step1.isReal()) {
return error.TreeRootIsNotReal;
}
// Steps 3-9...
}
Step 3: Polish Simulated
Top-down traversal to ensure correct structure[9]:
fn polishSimulated(tree: UnprovenTree) ProverError!UnprovenTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_conjecture => |conj| switch (conj) {
.cand_unproven => |cand| blk: {
// Simulated AND: all children simulated
if (cand.simulated) {
break :blk cand.withChildren(
markAllChildrenSimulated(cand.children),
);
}
break :blk cand;
},
.cor_unproven => |cor| blk: {
if (cor.simulated) {
// Simulated OR: all children simulated
break :blk cor.withChildren(
markAllChildrenSimulated(cor.children),
);
} else {
// Real OR: keep ONE child real, mark rest simulated
break :blk makeCorChildrenSimulated(cor);
}
},
.cthreshold_unproven => |ct| blk: {
if (ct.simulated) {
break :blk ct.withChildren(
markAllChildrenSimulated(ct.children),
);
} else {
// Real THRESHOLD(k): keep only k children real
break :blk makeThresholdChildrenSimulated(ct);
}
},
},
else => null,
},
else => null,
};
}
}.transform);
}
fn makeCorChildrenSimulated(cor: CorUnproven) CorUnproven {
// Find first real child, mark all others simulated
var found_real = false;
const new_children = try allocator.alloc(ProofTree, cor.children.len);
for (cor.children, 0..) |child, i| {
const ut = child.unproven_tree;
if (ut.isReal() and !found_real) {
new_children[i] = child;
found_real = true;
} else if (ut.isReal()) {
new_children[i] = ut.withSimulated(true);
} else {
new_children[i] = child;
}
}
return cor.withChildren(new_children);
}
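The first-real-wins rule above is language-agnostic. A minimal Python sketch, with the function name and boolean-flag representation invented for illustration (neither implementation represents children this way):

```python
def polish_or_children(real_flags):
    """Given one real/simulated flag per child of a real OR node,
    keep only the FIRST real child real and simulate the rest."""
    kept = False
    out = []
    for is_real in real_flags:
        if is_real and not kept:
            out.append(True)   # the one child we will actually prove
            kept = True
        else:
            out.append(False)  # simulated
    return out

# Two provable children: only the first stays real.
assert polish_or_children([False, True, True]) == [False, True, False]
```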
Steps 4-6: Simulate and Commit
Combined traversal for challenges, simulation, and commitments [10][11]:
fn simulateAndCommit(
tree: UnprovenTree,
hints_bag: *const HintsBag,
rng: std.rand.Random,
) ProverError!ProofTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree, hints: *const HintsBag, random: std.rand.Random) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
// Step 4: Real conjecture assigns random challenges to simulated children
.unproven_conjecture => |conj| blk: {
if (conj.isReal()) {
break :blk assignChallengesFromRealParent(conj, random);
} else {
break :blk propagateChallengeToSimulatedChildren(conj, random);
}
},
// Steps 5-6: Simulate or commit at leaves
.unproven_leaf => |leaf| blk: {
if (leaf.simulated()) {
// Step 5: Simulate
break :blk simulateLeaf(leaf);
} else {
// Step 6: Compute commitment
break :blk commitLeaf(leaf, hints, random);
}
},
},
else => null,
};
}
}.transform, hints_bag, rng);
}
/// Simulate a leaf: pick random z, compute commitment backwards
fn simulateLeaf(leaf: UnprovenLeaf) ProverError!UncheckedTree {
return switch (leaf) {
.unproven_schnorr => |us| blk: {
const challenge = us.challenge_opt orelse return error.SimulatedLeafWithoutChallenge;
const sim = DlogProver.simulate(us.proposition, challenge);
break :blk .{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = us.proposition,
.commitment_opt = sim.first_message,
.challenge = challenge,
.second_message = sim.second_message,
},
},
};
},
.unproven_dh_tuple => |ud| blk: {
// Similar for DHT
},
};
}
/// Commit at a real leaf: pick random r, compute a = g^r
///
/// SECURITY: The randomness `r` MUST come from a cryptographically secure source:
/// - Use a CSPRNG (e.g., OS-provided /dev/urandom, std.crypto.random)
/// - For platforms without secure random, use deterministic nonce generation
/// (RFC 6979 style: r = HMAC(secret_key, message))
/// - NEVER reuse nonces: reusing r with different messages reveals the secret key
fn commitLeaf(
leaf: UnprovenLeaf,
hints: *const HintsBag,
rng: std.rand.Random,
) UnprovenTree {
return switch (leaf) {
.unproven_schnorr => |us| blk: {
// Check hints first
if (hints.findCommitment(us.position)) |hint| {
break :blk us.withCommitment(hint.commitment);
}
// Generate fresh commitment
const first = DlogProver.firstMessage(rng);
break :blk .{
.unproven_leaf = .{
.unproven_schnorr = .{
.proposition = us.proposition,
.commitment_opt = first.message,
.randomness_opt = first.r,
.challenge_opt = null,
.simulated = false,
.position = us.position,
},
},
};
},
// Similar for DHT
};
}
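The RFC 6979-style fallback mentioned in the security comment can be sketched in a few lines of Python. This is a simplified illustration, not the derivation either implementation uses: real RFC 6979 runs an iterated HMAC-DRBG loop, and the modulus `Q` here (secp256k1's group order) is chosen only as a realistic example:

```python
import hashlib
import hmac

# Order of secp256k1 (illustrative modulus only).
Q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def deterministic_nonce(secret_key: int, message: bytes) -> int:
    """r = HMAC-SHA256(secret_key, message) mod Q: the same (key, message)
    pair always yields the same r, so a nonce is never reused across
    DIFFERENT messages -- the reuse pattern that leaks the secret key."""
    key = secret_key.to_bytes(32, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % Q

r1 = deterministic_nonce(12345, b"tx-a")
assert r1 == deterministic_nonce(12345, b"tx-a")  # reproducible
assert r1 != deterministic_nonce(12345, b"tx-b")  # bound to the message
```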
Steps 7-8: Fiat-Shamir
Serialize tree and compute root challenge [12]:
fn computeRootChallenge(tree: ProofTree, message: []const u8) ProverError!Challenge {
// Step 7: Serialize tree structure + propositions + commitments
var buf = std.ArrayList(u8).init(allocator);
try fiatShamirTreeToBytes(&tree, buf.writer());
// Step 8: Append message and hash
try buf.appendSlice(message);
return fiatShamirHashFn(buf.items);
}
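Steps 7-8 reduce to a single hash over the concatenation. A Python sketch; Ergo's sigma protocols use Blake2b-256 with 192-bit (24-byte) challenges, but treat the constant and the truncation below as assumptions to verify against the source:

```python
import hashlib

SOUNDNESS_BYTES = 24  # 192-bit challenge

def root_challenge(tree_bytes: bytes, message: bytes) -> bytes:
    """Fiat-Shamir: hash the serialized proof tree followed by the message,
    truncated to the soundness parameter."""
    digest = hashlib.blake2b(tree_bytes + message, digest_size=32).digest()
    return digest[:SOUNDNESS_BYTES]

c = root_challenge(b"<serialized tree>", b"<tx bytes>")
assert len(c) == SOUNDNESS_BYTES
# Any change to the commitments or the message changes the challenge:
assert c != root_challenge(b"<serialized tree>", b"<other tx>")
```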
Step 9: Compute Real Challenges and Responses
Top-down traversal for real nodes [13][14]:
fn proving(
prover: *const Prover,
tree: ProofTree,
hints_bag: *const HintsBag,
) ProverError!ProofTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_conjecture => |conj| blk: {
if (!conj.isReal()) break :blk null;
break :blk switch (conj) {
.cand_unproven => |cand| cand: {
// Real AND: all children get the same challenge
const challenge = cand.challenge_opt.?;
break :cand cand.withChildren(
propagateChallenge(cand.children, challenge),
);
},
.cor_unproven => |cor| cor: {
// Real OR: the real child's challenge is the XOR of the
// root challenge and all simulated children's challenges
const root_challenge = cor.challenge_opt.?;
const xored = xorChallenges(root_challenge, cor.children);
break :cor cor.withRealChildChallenge(xored);
},
.cthreshold_unproven => |ct|
// Real THRESHOLD: polynomial interpolation over GF(2^192)
computeThresholdChallenges(ct),
};
},
.unproven_leaf => |leaf| blk: {
if (!leaf.isReal()) break :blk null;
// Compute response z = r + e*w mod q
const challenge = leaf.challenge_opt orelse
return error.RealUnprovenTreeWithoutChallenge;
break :blk switch (leaf) {
.unproven_schnorr => |us| schnorr: {
const secret = p.findSecret(us.proposition) orelse {
// No secret of our own: fall back to a real-proof hint
// supplied by another signer (distributed signing)
if (hints.findRealProof(us.position)) |proof| {
break :schnorr .{ .unchecked_leaf = proof.unchecked_tree.unchecked_leaf };
}
return error.SecretNotFound;
};
const z = DlogProver.secondMessage(
secret,
us.randomness_opt.?,
challenge,
);
break :schnorr .{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = us.proposition,
.commitment_opt = null,
.challenge = challenge,
.second_message = z,
},
},
};
},
// Similar for DHT
};
},
},
else => null,
};
}
}.transform, prover, hints_bag);
}
Step 10: Serialize Proof
fn serializeSig(tree: UncheckedTree) ProofBytes {
var buf = std.ArrayList(u8).init(allocator);
var w = SigmaByteWriter.init(buf.writer());
sigWriteBytes(&tree, &w, true);
return .{ .bytes = buf.items };
}
fn sigWriteBytes(node: *const UncheckedTree, w: *SigmaByteWriter, write_challenge: bool) void {
if (write_challenge) {
w.writeBytes(&node.challenge());
}
switch (node.*) {
.unchecked_leaf => |leaf| switch (leaf) {
.unchecked_schnorr => |us| {
w.writeBytes(&us.second_message.z.toBytes());
},
.unchecked_dh_tuple => |dh| {
w.writeBytes(&dh.second_message.z.toBytes());
},
},
.unchecked_conjecture => |conj| switch (conj) {
.cand_unchecked => |cand| {
// Children's challenges equal parent's - don't write
for (cand.children) |child| {
sigWriteBytes(&child, w, false);
}
},
.cor_unchecked => |cor| {
// Write all except last (computed via XOR)
for (cor.children[0 .. cor.children.len - 1]) |child| {
sigWriteBytes(&child, w, true);
}
sigWriteBytes(&cor.children[cor.children.len - 1], w, false);
},
.cthreshold_unchecked => |ct| {
// Write polynomial coefficients
w.writeBytes(ct.polynomial.toBytes(false));
for (ct.children) |child| {
sigWriteBytes(&child, w, false);
}
},
},
};
}
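Omitting one OR child's challenge is sound because of the XOR constraint: the verifier can always reconstruct it from the root challenge and the challenges that were written. A small Python illustration (helper names invented for the example):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def last_or_challenge(root: bytes, written: list) -> bytes:
    """Reconstruct the omitted last-child challenge of an OR node from
    the root challenge and the child challenges that WERE serialized."""
    acc = root
    for c in written:
        acc = xor_bytes(acc, c)
    return acc

root = bytes.fromhex("aabbcc")
c1, c2 = bytes.fromhex("010203"), bytes.fromhex("102030")
c3 = last_or_challenge(root, [c1, c2])
# The OR constraint (XOR of all child challenges == root) now holds:
assert xor_bytes(xor_bytes(c1, c2), c3) == root
```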
Response Computation
Schnorr Response
const DlogProver = struct {
/// First message: a = g^r
pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, message: FirstDlogProverMessage } {
const r = Scalar.random(rng);
const a = DlogGroup.exponentiate(&DlogGroup.generator(), &r);
return .{ .r = r, .message = .{ .a = a } };
}
/// Second message: z = r + e*w mod q
pub fn secondMessage(
private_key: DlogProverInput,
r: Scalar,
challenge: Challenge,
) SecondDlogProverMessage {
const e = Scalar.fromBytes(&challenge.bytes);
const z = r.add(e.mul(private_key.w));
return .{ .z = z };
}
/// Simulation: pick random z, compute a = g^z * h^(-e)
pub fn simulate(
proposition: ProveDlog,
challenge: Challenge,
rng: std.rand.Random,
) struct { first_message: FirstDlogProverMessage, second_message: SecondDlogProverMessage } {
const z = Scalar.random(rng);
const e = Scalar.fromBytes(&challenge.bytes);
const minus_e = e.negate();
const gz = DlogGroup.exponentiate(&DlogGroup.generator(), &z);
const h_neg_e = DlogGroup.exponentiate(&proposition.h, &minus_e);
const a = gz.multiply(&h_neg_e);
return .{
.first_message = .{ .a = a },
.second_message = .{ .z = z },
};
}
};
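The real/simulated symmetry can be checked concretely in a toy group. The Python sketch below uses a tiny prime-order subgroup (deliberately insecure parameters, for arithmetic only); both kinds of transcript satisfy the same verification equation g^z = a·h^e:

```python
# Toy parameters: g = 4 generates the order-q subgroup of Z_p^* (q | p-1).
p, q, g = 383, 191, 4
w = 57                 # secret key
h = pow(g, w, p)       # public key h = g^w

def prove(w, e, r):
    """Real transcript: commit a = g^r, respond z = r + e*w mod q."""
    return pow(g, r, p), (r + e * w) % q

def simulate(h, e, z):
    """Simulated transcript: pick z first, compute a = g^z * h^(-e)."""
    a = pow(g, z, p) * pow(pow(h, e, p), -1, p) % p
    return a, z

def verify(h, a, e, z):
    """Accept iff g^z == a * h^e (mod p)."""
    return pow(g, z, p) == a * pow(h, e, p) % p

a, z = prove(w, e=101, r=33)
assert verify(h, a, 101, z)        # real proof verifies
a, z = simulate(h, e=101, z=150)
assert verify(h, a, 101, z)        # ...and so does a simulated one
```

A verifier given only (a, e, z) cannot tell which branch produced the transcript; this is exactly what lets the prover fake the children it has no secrets for.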
Hint System
Hint Types
For distributed signing [15]:
const Hint = union(enum) {
real_secret_proof: RealSecretProof,
simulated_secret_proof: SimulatedSecretProof,
own_commitment: OwnCommitment,
real_commitment: RealCommitment,
simulated_commitment: SimulatedCommitment,
};
const RealSecretProof = struct {
image: SigmaBoolean,
challenge: Challenge,
unchecked_tree: UncheckedTree,
position: NodePosition,
};
const OwnCommitment = struct {
image: SigmaBoolean,
secret_randomness: Scalar, // PRIVATE - NEVER share!
commitment: FirstProverMessage,
position: NodePosition,
};
// SECURITY: OwnCommitment contains secret randomness (r). NEVER send
// OwnCommitment to other parties - only send RealCommitment (public part).
// Leaking r allows computing secret key w = (z - r) / e.
const RealCommitment = struct {
image: SigmaBoolean,
commitment: FirstProverMessage,
position: NodePosition,
};
const HintsBag = struct {
hints: []const Hint,
pub fn realImages(self: *const HintsBag) []const SigmaBoolean {
// Collect public images from real proofs and commitments
}
pub fn findCommitment(self: *const HintsBag, pos: NodePosition) ?CommitmentHint {
for (self.hints) |hint| {
switch (hint) {
// The two variants carry different payload types, so match them separately
.own_commitment => |c| if (c.position.eql(pos)) return c,
.real_commitment => |c| if (c.position.eql(pos)) return c,
else => {},
}
}
return null;
}
pub fn findRealProof(self: *const HintsBag, pos: NodePosition) ?RealSecretProof {
for (self.hints) |hint| {
if (hint == .real_secret_proof and hint.real_secret_proof.position.eql(pos)) {
return hint.real_secret_proof;
}
}
return null;
}
};
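Why `OwnCommitment` must never leave the prover: the formula in the comment above, w = (z − r)/e, can be checked directly. A toy Python verification (q is just a small prime standing in for the group order):

```python
q = 191                        # toy prime group order (insecure, arithmetic only)
w, r, e = 57, 33, 101          # secret key, commitment randomness, challenge
z = (r + e * w) % q            # the response published in the proof

# Anyone who learns r can solve z = r + e*w for the secret key:
w_recovered = (z - r) * pow(e, -1, q) % q
assert w_recovered == w
```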
Distributed Signing Protocol
Distributed Signing (2-of-2 AND)
─────────────────────────────────────────────────────
Round 1: Generate commitments
Party 1 (sk1) ─────> OwnCommitment(pk1, r1, g^r1)
Party 2 (sk2) ─────> OwnCommitment(pk2, r2, g^r2)
Exchange: Share RealCommitment (NOT OwnCommitment!)
Party 1 ─────> RealCommitment(pk1, g^r1) ─────> Party 2
Party 2 ─────> RealCommitment(pk2, g^r2) ─────> Party 1
Round 2: Sign sequentially
Party 1:
combined = hints1 ++ RealCommitment(pk2)
partialProof = prove(tree, msg, combined)
Extract hints from partial:
hintsFromProof = bagForMultisig(partialProof, ...)
Party 2:
combined = hints2 ++ hintsFromProof
finalProof = prove(tree, msg, combined)
Prover Errors
const ProverError = error{
ErgoTreeError,
EvalError,
Gf2_192Error,
ReducedToFalse,
TreeRootIsNotReal,
SimulatedLeafWithoutChallenge,
RealUnprovenTreeWithoutChallenge,
SecretNotFound,
Unexpected,
FiatShamirTreeSerializationError,
};
Summary
This chapter covered the prover implementation that generates Sigma proofs:
The prover transforms a sigma-tree through a 10-step algorithm:
- Convert to unproven: Transform SigmaBoolean to UnprovenTree data structure
- Mark real (bottom-up): Identify which nodes the prover has secrets for
- Check root: Fail if the root is simulated (prover cannot prove)
- Polish simulated (top-down): Ensure OR keeps only one real child, THRESHOLD keeps exactly k
- Simulate and commit: Assign challenges to simulated children, generate commitments for real leaves
- Fiat-Shamir serialization: Serialize tree structure and commitments
- Compute root challenge: Hash serialized tree with message
- Prove (top-down): Distribute challenges and compute responses for real nodes
- Serialize proof: Output compact format
Key design principles:
- Zero-knowledge: Simulated transcripts are computationally indistinguishable from real ones
- Challenge flow depends on composition: AND propagates same challenge to all; OR uses XOR constraint; THRESHOLD uses polynomial interpolation over GF(2^192)
- Hint system enables distributed signing: parties exchange commitments (never secret randomness), then sign sequentially
Next: Chapter 16: ErgoScript Parser
Rust: prover.rs:1-100
Rust: unproven_tree.rs (NodePosition)
Scala: UnprovenTree.scala
Rust: unproven_tree.rs:27-88
Rust: prover.rs (convert_to_unproven)
Scala: ProverInterpreter.scala (markReal)
Rust: prover.rs:243-305
Rust: prover.rs:367-400
Scala: ProverInterpreter.scala (simulateAndCommit)
Rust: prover.rs (simulate_and_commit)
Rust: fiat_shamir.rs
Scala: ProverInterpreter.scala (proving)
Rust: prover.rs (proving)
Rust: hint.rs
Chapter 16: ErgoScript Parser
Prerequisites
- Chapter 4 for AST node types that the parser produces
- Chapter 2 for type syntax parsing
- Familiarity with parsing concepts: tokenization, recursive descent, operator precedence
Learning Objectives
By the end of this chapter, you will be able to:
- Explain parser combinator and Pratt parsing techniques used in ErgoScript
- Navigate the parser module structure (lexer, grammar, expressions, types)
- Implement operator precedence using binding power
- Trace expression parsing from ErgoScript source to untyped AST
- Handle source position tracking for meaningful error messages
Parser Architecture
ErgoScript source code is transformed into an AST through lexing and parsing [1][2]:
Parsing Pipeline
─────────────────────────────────────────────────────
Source Code
│
▼
┌──────────────────────────────────────────────────┐
│ LEXER │
│ │
│ Characters ─────> Tokens │
│ "val x = 1 + 2" │
│ ─────> [ValKw, Ident("x"), Eq, Int(1), │
│ Plus, Int(2)] │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ PARSER │
│ │
│ Tokens ─────> AST │
│ Grammar rules, precedence, associativity │
│ ─────> ValDef("x", BinOp(Int(1), +, Int(2))) │
└──────────────────────────────────────────────────┘
│
▼
Untyped AST (SValue)
Lexer (Tokenizer)
Converts the character stream to tokens [3]:
const TokenKind = enum {
// Literals
int_number,
long_number,
string_literal,
// Keywords
val_kw,
def_kw,
if_kw,
else_kw,
true_kw,
false_kw,
// Operators
plus,
minus,
star,
slash,
percent,
eq,
neq,
lt,
gt,
le,
ge,
and_and,
or_or,
bang,
// Punctuation
l_paren,
r_paren,
l_brace,
r_brace,
l_bracket,
r_bracket,
dot,
comma,
colon,
semicolon,
arrow,
// Identifiers
ident,
// Special
whitespace,
comment,
eof,
err,
};
const Token = struct {
kind: TokenKind,
text: []const u8,
range: Range,
};
const Range = struct {
start: usize,
end: usize,
};
Lexer Implementation
const Lexer = struct {
source: []const u8,
pos: usize,
pub fn init(source: []const u8) Lexer {
return .{ .source = source, .pos = 0 };
}
pub fn nextToken(self: *Lexer) Token {
self.skipWhitespaceAndComments();
if (self.pos >= self.source.len) {
return .{ .kind = .eof, .text = "", .range = .{ .start = self.pos, .end = self.pos } };
}
const start = self.pos;
const c = self.source[self.pos];
// Single-character tokens
const single_char_token: ?TokenKind = switch (c) {
'(' => .l_paren,
')' => .r_paren,
'{' => .l_brace,
'}' => .r_brace,
'[' => .l_bracket,
']' => .r_bracket,
'.' => .dot,
',' => .comma,
':' => .colon,
';' => .semicolon,
'+' => .plus,
'-' => .minus,
'*' => .star,
'/' => .slash,
'%' => .percent,
else => null,
};
if (single_char_token) |kind| {
self.pos += 1;
return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
// Multi-character tokens
if (c == '=' and self.peek(1) == '=') {
self.pos += 2;
return .{ .kind = .eq, .text = "==", .range = .{ .start = start, .end = self.pos } };
}
if (c == '=' and self.peek(1) == '>') {
self.pos += 2;
return .{ .kind = .arrow, .text = "=>", .range = .{ .start = start, .end = self.pos } };
}
if (c == '&' and self.peek(1) == '&') {
self.pos += 2;
return .{ .kind = .and_and, .text = "&&", .range = .{ .start = start, .end = self.pos } };
}
// ... the remaining operators (!=, <, <=, >, >=, ||, !) follow the
// same single- and two-character patterns
// Numbers
if (std.ascii.isDigit(c)) {
return self.scanNumber(start);
}
// Identifiers and keywords
if (std.ascii.isAlphabetic(c) or c == '_') {
return self.scanIdentifier(start);
}
// Unknown character
self.pos += 1;
return .{ .kind = .err, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
fn scanIdentifier(self: *Lexer, start: usize) Token {
while (self.pos < self.source.len) {
const c = self.source[self.pos];
if (std.ascii.isAlphanumeric(c) or c == '_') {
self.pos += 1;
} else {
break;
}
}
const text = self.source[start..self.pos];
const kind: TokenKind = if (keywords.get(text)) |kw| kw else .ident;
return .{ .kind = kind, .text = text, .range = .{ .start = start, .end = self.pos } };
}
fn scanNumber(self: *Lexer, start: usize) Token {
// Check for hex
if (self.source[self.pos] == '0' and self.pos + 1 < self.source.len and
(self.source[self.pos + 1] == 'x' or self.source[self.pos + 1] == 'X'))
{
self.pos += 2;
while (self.pos < self.source.len and std.ascii.isHex(self.source[self.pos])) {
self.pos += 1;
}
} else {
while (self.pos < self.source.len and std.ascii.isDigit(self.source[self.pos])) {
self.pos += 1;
}
}
// Check for L suffix (long)
var kind: TokenKind = .int_number;
if (self.pos < self.source.len and (self.source[self.pos] == 'L' or self.source[self.pos] == 'l')) {
kind = .long_number;
self.pos += 1;
}
return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
const keywords = std.ComptimeStringMap(TokenKind, .{
.{ "val", .val_kw },
.{ "def", .def_kw },
.{ "if", .if_kw },
.{ "else", .else_kw },
.{ "true", .true_kw },
.{ "false", .false_kw },
});
};
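The same longest-match logic fits in a few lines of Python. A regex-based sketch: token names loosely mirror `TokenKind`, and the `assign` token for a bare `=` is added here for the example (it is not in the enum above):

```python
import re

# Alternation order matters: two-character operators before their prefixes.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<long_number>\d+[lL]) | (?P<int_number>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<arrow>=>) | (?P<eq>==) | (?P<and_and>&&)
  | (?P<assign>=) | (?P<plus>\+) | (?P<star>\*)
""", re.VERBOSE)

KEYWORDS = {"val", "def", "if", "else", "true", "false"}

def tokenize(src):
    out, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:                      # unknown character: error token
            out.append(("err", src[pos]))
            pos += 1
            continue
        if m.lastgroup != "ws":            # whitespace is skipped, not emitted
            kind, text = m.lastgroup, m.group()
            if kind == "ident" and text in KEYWORDS:
                kind = text + "_kw"        # 'val' -> val_kw, etc.
            out.append((kind, text))
        pos = m.end()
    return out

assert tokenize("val x = 1 + 2") == [
    ("val_kw", "val"), ("ident", "x"), ("assign", "="),
    ("int_number", "1"), ("plus", "+"), ("int_number", "2"),
]
```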
Parser Structure
Event-based parser using markers [4][5]:
const Event = union(enum) {
start_node: SyntaxKind,
// Forward reference created by Marker.precede, resolved when building the tree
start_node_at: usize,
add_token,
finish_node,
err: ParseError,
placeholder,
};
const Parser = struct {
source: Source,
events: std.ArrayList(Event),
expected_kinds: std.ArrayList(TokenKind),
allocator: Allocator,
pub fn init(allocator: Allocator, tokens: []const Token) Parser {
return .{
.source = Source.init(tokens),
.events = std.ArrayList(Event).init(allocator),
.expected_kinds = std.ArrayList(TokenKind).init(allocator),
.allocator = allocator,
};
}
pub fn parse(self: *Parser) []Event {
grammar.root(self);
return self.events.toOwnedSlice();
}
fn start(self: *Parser) Marker {
const pos = self.events.items.len;
try self.events.append(.placeholder);
return Marker.init(pos);
}
fn at(self: *Parser, kind: TokenKind) bool {
try self.expected_kinds.append(kind);
return self.peek() == kind;
}
fn bump(self: *Parser) void {
self.expected_kinds.clearRetainingCapacity();
_ = self.source.nextToken();
try self.events.append(.add_token);
}
fn expect(self: *Parser, kind: TokenKind) void {
if (self.at(kind)) {
self.bump();
} else {
self.err();
}
}
};
const Marker = struct {
pos: usize,
pub fn init(pos: usize) Marker {
return .{ .pos = pos };
}
pub fn complete(self: Marker, p: *Parser, kind: SyntaxKind) CompletedMarker {
p.events.items[self.pos] = .{ .start_node = kind };
try p.events.append(.finish_node);
return .{ .pos = self.pos };
}
pub fn precede(self: Marker, p: *Parser) Marker {
const new_marker = p.start();
p.events.items[self.pos] = .{ .start_node_at = new_marker.pos };
return new_marker;
}
};
Pratt Parsing (Binding Power)
Expression parsing uses Pratt parsing for operator precedence [6][7]. This technique, introduced by Vaughan Pratt in 1973 ("Top Down Operator Precedence"), elegantly handles operator precedence and associativity through numeric "binding power" values:
Binding Power Concept
─────────────────────────────────────────────────────
Expression: A + B * C
Power: 3 3 5 5
The * has higher binding power, holds B and C tighter.
Result: A + (B * C)
Associativity via asymmetric power:
Expression: A + B + C
Power: (3,4) (3,4)
Right power slightly higher than left → after the first +, the
right operand is parsed with minimum power 4, so the second +
(left power 3) cannot continue it: left associativity.
Result: (A + B) + C
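The loop-and-recurse shape of a Pratt parser is compact enough to show whole. A Python sketch over pre-tokenized input, using the chapter's binding powers for +, -, *, / (atoms are ints; prefix and paren handling omitted):

```python
# Binding powers matching the chapter's table: +,- bind (9,10); *,/ bind (11,12).
BP = {"+": (9, 10), "-": (9, 10), "*": (11, 12), "/": (11, 12)}

def parse(tokens, min_bp=0):
    """Pratt core: returns ints for atoms, (op, lhs, rhs) tuples for nodes.
    Consumes `tokens` (a list) in place."""
    lhs = tokens.pop(0)                 # atom; prefix handling omitted
    while tokens and tokens[0] in BP:
        left_bp, right_bp = BP[tokens[0]]
        if left_bp < min_bp:            # operator binds too loosely: stop here
            break
        op = tokens.pop(0)
        rhs = parse(tokens, right_bp)   # right operand needs power >= right_bp
        lhs = (op, lhs, rhs)
    return lhs

assert parse([1, "+", 2, "*", 3]) == ("+", 1, ("*", 2, 3))  # precedence
assert parse([1, "-", 2, "-", 3]) == ("-", ("-", 1, 2), 3)  # left associativity
```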
Expression Grammar
const grammar = struct {
pub fn root(p: *Parser) CompletedMarker {
const m = p.start();
while (!p.atEnd()) {
stmt(p);
}
return m.complete(p, .root);
}
pub fn expr(p: *Parser) ?CompletedMarker {
return exprBindingPower(p, 0);
}
/// Pratt parser core
fn exprBindingPower(p: *Parser, min_bp: u8) ?CompletedMarker {
// Renamed from `lhs` to avoid shadowing the `lhs` helper function below
var lhs_marker = lhs(p) orelse return null;
while (true) {
const op: ?BinaryOp = blk: {
if (p.at(.plus)) break :blk .add;
if (p.at(.minus)) break :blk .sub;
if (p.at(.star)) break :blk .mul;
if (p.at(.slash)) break :blk .div;
if (p.at(.percent)) break :blk .mod;
if (p.at(.lt)) break :blk .lt;
if (p.at(.gt)) break :blk .gt;
if (p.at(.le)) break :blk .le;
if (p.at(.ge)) break :blk .ge;
if (p.at(.eq)) break :blk .eq;
if (p.at(.neq)) break :blk .neq;
if (p.at(.and_and)) break :blk .and_;
if (p.at(.or_or)) break :blk .or_;
break :blk null;
};
if (op == null) break;
const bp = op.?.bindingPower();
if (bp.left < min_bp) break;
// Consume operator
p.bump();
// Parse right operand with right binding power
const m = lhs_marker.precede(p);
const parsed_rhs = exprBindingPower(p, bp.right) != null;
lhs_marker = m.complete(p, .infix_expr);
if (!parsed_rhs) break;
}
return lhs_marker;
}
/// Left-hand side (atoms and prefix expressions)
fn lhs(p: *Parser) ?CompletedMarker {
if (p.at(.int_number)) return intNumber(p);
if (p.at(.long_number)) return longNumber(p);
if (p.at(.ident)) return ident(p);
if (p.at(.true_kw) or p.at(.false_kw)) return boolLiteral(p);
if (p.at(.minus) or p.at(.bang)) return prefixExpr(p);
if (p.at(.l_paren)) return parenExpr(p);
if (p.at(.l_brace)) return blockExpr(p);
if (p.at(.if_kw)) return ifExpr(p);
p.err();
return null;
}
fn intNumber(p: *Parser) CompletedMarker {
const m = p.start();
p.bump();
return m.complete(p, .int_literal);
}
fn ident(p: *Parser) CompletedMarker {
const m = p.start();
p.bump();
return m.complete(p, .ident);
}
fn prefixExpr(p: *Parser) ?CompletedMarker {
const m = p.start();
const op_bp = UnaryOp.fromToken(p.peek()).?.bindingPower();
p.bump(); // operator
_ = exprBindingPower(p, op_bp.right);
return m.complete(p, .prefix_expr);
}
fn parenExpr(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.l_paren);
_ = expr(p);
p.expect(.r_paren);
return m.complete(p, .paren_expr);
}
fn ifExpr(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.if_kw);
p.expect(.l_paren);
_ = expr(p);
p.expect(.r_paren);
_ = expr(p);
if (p.at(.else_kw)) {
p.bump();
_ = expr(p);
}
return m.complete(p, .if_expr);
}
};
Binary Operators
const BinaryOp = enum {
add,
sub,
mul,
div,
mod,
lt,
gt,
le,
ge,
eq,
neq,
and_,
or_,
const BindingPower = struct { left: u8, right: u8 };
pub fn bindingPower(self: BinaryOp) BindingPower {
return switch (self) {
.or_ => .{ .left = 1, .right = 2 }, // ||
.and_ => .{ .left = 3, .right = 4 }, // &&
.eq, .neq => .{ .left = 5, .right = 6 }, // ==, !=
.lt, .gt, .le, .ge => .{ .left = 7, .right = 8 },
.add, .sub => .{ .left = 9, .right = 10 },
.mul, .div, .mod => .{ .left = 11, .right = 12 },
};
}
};
const UnaryOp = enum {
neg,
not,
pub fn bindingPower(self: UnaryOp) struct { right: u8 } {
return switch (self) {
.neg, .not => .{ .right = 13 }, // Higher than all binary
};
}
};
Operator Precedence Table
Operator Precedence (lowest to highest)
─────────────────────────────────────────────────────
1-2 || Logical OR
3-4 && Logical AND
5-6 == != Equality
7-8 < > <= >= Comparison
9-10 + - Addition, Subtraction
11-12 * / % Multiplication, Division
13 - ! ~ Prefix (unary)
14 . () Postfix (method call, index)
Type Parsing
const TypeParser = struct {
const predef_types = std.ComptimeStringMap(SType, .{
.{ "Boolean", .s_boolean },
.{ "Byte", .s_byte },
.{ "Short", .s_short },
.{ "Int", .s_int },
.{ "Long", .s_long },
.{ "BigInt", .s_big_int },
.{ "GroupElement", .s_group_element },
.{ "SigmaProp", .s_sigma_prop },
.{ "Box", .s_box },
.{ "AvlTree", .s_avl_tree },
.{ "Context", .s_context },
.{ "Header", .s_header },
.{ "PreHeader", .s_pre_header },
.{ "Unit", .s_unit },
});
pub fn parseType(p: *Parser) ?SType {
// Parse a base type first, then check for `=>`; the original
// arrow check was unreachable because both base-type branches
// returned early (and a bare recursive call would not terminate).
const base = parseBaseType(p) orelse return null;
// Function type: T1 => T2
if (p.at(.arrow)) {
p.bump();
const range = parseType(p) orelse return null;
return .{ .s_func = .{ .args = &[_]SType{base}, .ret = range } };
}
return base;
}
fn parseBaseType(p: *Parser) ?SType {
if (p.at(.ident)) {
const name = p.currentText();
// Check predefined types
if (predef_types.get(name)) |t| {
p.bump();
return t;
}
// Generic types: Coll[T], Option[T]
p.bump();
if (p.at(.l_bracket)) {
p.bump();
const inner = parseType(p) orelse return null;
p.expect(.r_bracket);
if (std.mem.eql(u8, name, "Coll")) {
return .{ .s_coll = inner };
} else if (std.mem.eql(u8, name, "Option")) {
return .{ .s_option = inner };
}
}
// Type variable
return .{ .s_type_var = name };
}
// Tuple type: (T1, T2, ...)
if (p.at(.l_paren)) {
p.bump();
var items = std.ArrayList(SType).init(p.allocator);
while (!p.at(.r_paren)) {
const t = parseType(p) orelse return null;
try items.append(t);
if (!p.at(.r_paren)) p.expect(.comma);
}
p.expect(.r_paren);
return .{ .s_tuple = items.toOwnedSlice() };
}
return null;
}
};
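The type grammar (predefined names, `Coll[T]`/`Option[T]`, `T1 => T2`) can be exercised with a throwaway Python parser. Tuple types are omitted and the nested-tuple output format is invented for the example:

```python
def parse_type(src):
    """Parse a type string into nested tuples, e.g.
    'Coll[Byte] => SigmaProp' -> ('func', ('coll', 'Byte'), 'SigmaProp')."""
    toks = src.replace("[", " [ ").replace("]", " ] ").replace("=>", " => ").split()
    t, rest = _type(toks)
    assert not rest, f"trailing tokens: {rest}"
    return t

def _type(toks):
    base, toks = _base(toks)
    if toks and toks[0] == "=>":              # function type: T1 => T2
        ret, toks = _type(toks[1:])
        return ("func", base, ret), toks
    return base, toks

def _base(toks):
    name, toks = toks[0], toks[1:]
    if toks and toks[0] == "[":               # generic: Coll[T], Option[T]
        inner, toks = _type(toks[1:])
        assert toks[0] == "]", "expected ']'"
        return (name.lower(), inner), toks[1:]
    return name, toks

assert parse_type("Int") == "Int"
assert parse_type("Option[Coll[Byte]]") == ("option", ("coll", "Byte"))
assert parse_type("Coll[Byte] => SigmaProp") == ("func", ("coll", "Byte"), "SigmaProp")
```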
Statement Parsing
fn stmt(p: *Parser) ?CompletedMarker {
if (p.at(.val_kw)) {
return valDef(p);
}
if (p.at(.def_kw)) {
return defDef(p);
}
return expr(p);
}
fn valDef(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.val_kw);
p.expect(.ident);
// Optional type annotation
if (p.at(.colon)) {
p.bump();
_ = TypeParser.parseType(p);
}
p.expect(.eq);
_ = expr(p);
return m.complete(p, .val_def);
}
fn defDef(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.def_kw);
p.expect(.ident);
// Parameters
if (p.at(.l_paren)) {
p.bump();
while (!p.at(.r_paren)) {
p.expect(.ident);
p.expect(.colon);
_ = TypeParser.parseType(p);
if (!p.at(.r_paren)) p.expect(.comma);
}
p.expect(.r_paren);
}
// Return type
if (p.at(.colon)) {
p.bump();
_ = TypeParser.parseType(p);
}
p.expect(.eq);
_ = expr(p);
return m.complete(p, .def_def);
}
Source Position Tracking
Every AST node carries source position for error messages [8]:
const SourceContext = struct {
index: usize,
line: u32,
column: u32,
source_line: []const u8,
pub fn fromIndex(index: usize, source: []const u8) SourceContext {
var line: u32 = 1;
var col: u32 = 1;
var line_start: usize = 0;
for (source[0..index], 0..) |c, i| {
if (c == '\n') {
line += 1;
col = 1;
line_start = i + 1;
} else {
col += 1;
}
}
// Find end of current line
var line_end = index;
while (line_end < source.len and source[line_end] != '\n') {
line_end += 1;
}
return .{
.index = index,
.line = line,
.column = col,
.source_line = source[line_start..line_end],
};
}
};
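The same scan transcribes directly to Python (1-based line and column, plus extraction of the offending line's text):

```python
def source_context(index, source):
    """Map a byte offset to 1-based (line, column) plus the line's text."""
    before = source[:index]
    line = before.count("\n") + 1
    line_start = before.rfind("\n") + 1     # 0 when on the first line
    column = index - line_start + 1
    line_end = source.find("\n", index)
    if line_end == -1:                      # last line may lack a newline
        line_end = len(source)
    return line, column, source[line_start:line_end]

src = "val x = 1\nval y = ?\n"
assert source_context(src.index("?"), src) == (2, 9, "val y = ?")
```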
const ParseError = struct {
expected: []const TokenKind,
found: ?TokenKind,
span: Range,
pub fn format(self: ParseError, ctx: SourceContext) []const u8 {
// Format error message with source context
}
};
Syntax Tree Construction
Events are converted into a concrete syntax tree [9]:
const SyntaxKind = enum {
// Nodes
root,
val_def,
def_def,
if_expr,
block_expr,
infix_expr,
prefix_expr,
paren_expr,
lambda_expr,
apply_expr,
select_expr,
// Literals
int_literal,
long_literal,
bool_literal,
string_literal,
ident,
// Error
err,
};
const SyntaxNode = struct {
kind: SyntaxKind,
range: Range,
children: []SyntaxNode,
text: ?[]const u8,
};
fn buildTree(events: []const Event, tokens: []const Token) SyntaxNode {
var builder = TreeBuilder.init();
for (events) |event| {
switch (event) {
.start_node => |kind| builder.startNode(kind),
.start_node_at => |pos| builder.startNodeAt(pos), // forward ref from precede
.add_token => builder.addToken(tokens[builder.token_idx]),
.finish_node => builder.finishNode(),
.err => |e| builder.addError(e),
.placeholder => {},
}
}
return builder.finish();
}
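The replay loop can be demonstrated with plain tuples standing in for `SyntaxNode`. A simplified sketch that ignores `start_node_at` forward references and error events:

```python
def build_tree(events, tokens):
    """Replay parser events into a (kind, children) tree; tokens are
    consumed left to right by add_token events."""
    stack = [("sentinel", [])]
    next_token = 0
    for ev in events:
        if ev[0] == "start_node":
            stack.append((ev[1], []))          # open a node
        elif ev[0] == "add_token":
            stack[-1][1].append(tokens[next_token])
            next_token += 1
        elif ev[0] == "finish_node":
            node = stack.pop()                 # close it, attach to parent
            stack[-1][1].append(node)
    return stack[0][1][0]

events = [("start_node", "val_def"), ("add_token",), ("add_token",),
          ("finish_node",)]
tokens = [("val_kw", "val"), ("ident", "x")]
assert build_tree(events, tokens) == ("val_def", [("val_kw", "val"), ("ident", "x")])
```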
Parsing Example
Input: "val x = 1 + 2 * 3"
Tokens:
[val_kw, ident("x"), eq, int(1), plus, int(2), star, int(3)]
Events:
start_node(val_def)
add_token(val_kw)
add_token(ident)
add_token(eq)
start_node(infix_expr) // 1 + (2 * 3)
add_token(int) // 1
add_token(plus)
start_node(infix_expr) // 2 * 3
add_token(int) // 2
add_token(star)
add_token(int) // 3
finish_node
finish_node
finish_node
AST:
ValDef
name: "x"
rhs: InfixExpr(+)
lhs: IntLiteral(1)
rhs: InfixExpr(*)
lhs: IntLiteral(2)
rhs: IntLiteral(3)
Error Recovery
const RECOVERY_SET = [_]TokenKind{ .val_kw, .def_kw, .r_brace };
fn err(p: *Parser) void {
const current = p.source.peekToken();
const range = if (current) |t| t.range else p.source.lastTokenRange();
try p.events.append(.{
.err = .{
.expected = p.expected_kinds.toOwnedSlice(),
.found = if (current) |t| t.kind else null,
.span = range,
},
});
// Skip tokens until recovery point
if (!p.atSet(&RECOVERY_SET) and !p.atEnd()) {
const m = p.start();
p.bump();
_ = m.complete(p, .err);
}
}
Summary
- Lexer converts characters to tokens with position tracking
- Parser uses event-based architecture with markers
- Pratt parsing handles operator precedence via binding power
- Left associativity: right power slightly higher than left
- Source positions enable accurate error messages
- Error recovery skips to synchronization points
- Output is untyped AST; semantic analysis comes next
Next: Chapter 17: Semantic Analysis
Scala: SigmaParser.scala
Rust: parser.rs
Rust: lexer.rs
Scala: Basic.scala
Rust: marker.rs
Scala: Exprs.scala
Rust: expr.rs:1-60
Scala: SourceContext.scala
Rust: sink.rs
Chapter 17: Semantic Analysis
Prerequisites
- Chapter 16 for the untyped AST structure
- Chapter 2 for type codes and type compatibility rules
- Familiarity with type inference concepts: type variables, unification, constraint solving
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the two-phase semantic analysis: name binding followed by type inference
- Implement name resolution for globals, environment variables, and local definitions
- Apply the type unification algorithm to infer types and detect mismatches
- Describe method resolution and how method calls are lowered to direct operations
- Trace type inference for complex expressions involving generics and collections
Semantic Analysis Overview
After parsing, the untyped ErgoScript AST passes through two phases [1][2]:
Semantic Analysis Pipeline
─────────────────────────────────────────────────────
Source Code
│
▼
┌──────────────────────────────────────────────────┐
│ PARSE │
│ │
│ Untyped AST │
│ - Identifiers have NoType │
│ - References are unresolved strings │
│ - Operators are symbolic │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ BIND │
│ │
│ Resolve names: │
│ - Global constants (HEIGHT, SELF, INPUTS) │
│ - Environment variables │
│ - Predefined functions │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ TYPE │
│ │
│ Assign types: │
│ - Infer expression types │
│ - Resolve method calls │
│ - Unify generic types │
│ - Check type consistency │
└──────────────────────────────────────────────────┘
│
▼
Typed AST (ready for IR)
Phase 1: Name Binding
The binder resolves identifiers to their definitions [3][4]:
const BinderError = struct {
msg: []const u8,
span: Range,
pub fn prettyDesc(self: BinderError, source: []const u8) []const u8 {
// Format error with source context
}
};
const GlobalVars = enum {
height,
self_,
inputs,
outputs,
context,
global,
miner_pubkey,
last_block_utxo_root_hash,
pub fn tpe(self: GlobalVars) SType {
return switch (self) {
.height => .s_int,
.self_ => .s_box,
.inputs => .{ .s_coll = .s_box },
.outputs => .{ .s_coll = .s_box },
.context => .s_context,
.global => .s_global,
.miner_pubkey => .{ .s_coll = .s_byte },
.last_block_utxo_root_hash => .s_avl_tree,
};
}
};
const Binder = struct {
env: ScriptEnv,
allocator: Allocator,
pub fn init(allocator: Allocator, env: ScriptEnv) Binder {
return .{ .env = env, .allocator = allocator };
}
pub fn bind(self: *const Binder, expr: Expr) BinderError!Expr {
return self.rewrite(expr);
}
fn rewrite(self: *const Binder, expr: Expr) BinderError!Expr {
return switch (expr.kind) {
.ident => |name| blk: {
// Check environment first
if (self.env.get(name)) |value| {
break :blk liftToConstant(value, expr.span);
}
// Check global variables
if (resolveGlobal(name)) |global| {
break :blk .{
.kind = .{ .global_vars = global },
.span = expr.span,
.tpe = global.tpe(),
};
}
// Leave unresolved for typer
break :blk expr;
},
.binary => |bin| blk: {
// Allocate the child nodes and store the rewritten subtrees in them
// (the originals were allocated but never initialized)
const left = try self.allocator.create(Expr);
left.* = try self.rewrite(bin.lhs.*);
const right = try self.allocator.create(Expr);
right.* = try self.rewrite(bin.rhs.*);
break :blk .{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = left,
.rhs = right,
} },
.span = expr.span,
.tpe = expr.tpe,
};
},
.block => |block| blk: {
var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);
for (block.bindings, 0..) |binding, i| {
const rhs = try self.rewrite(binding.rhs.*);
new_bindings[i] = .{
.name = binding.name,
.tpe = rhs.tpe orelse binding.tpe,
.rhs = rhs,
};
}
const body = try self.rewrite(block.body.*);
break :blk .{
.kind = .{ .block = .{
.bindings = new_bindings,
.body = body,
} },
.span = expr.span,
.tpe = body.tpe,
};
},
.lambda => |lam| blk: {
const body = try self.rewrite(lam.body.*);
break :blk .{
.kind = .{ .lambda = .{
.args = lam.args,
.body = body,
} },
.span = expr.span,
.tpe = expr.tpe,
};
},
else => expr,
};
}
fn resolveGlobal(name: []const u8) ?GlobalVars {
const globals = std.StaticStringMap(GlobalVars).initComptime(.{
.{ "HEIGHT", .height },
.{ "SELF", .self_ },
.{ "INPUTS", .inputs },
.{ "OUTPUTS", .outputs },
.{ "CONTEXT", .context },
.{ "Global", .global },
.{ "MinerPubkey", .miner_pubkey },
.{ "LastBlockUtxoRootHash", .last_block_utxo_root_hash },
});
return globals.get(name);
}
fn liftToConstant(value: anytype, span: Range) Expr {
const T = @TypeOf(value);
return .{
.kind = .{ .literal = switch (T) {
i32 => .{ .int = value },
i64 => .{ .long = value },
bool => .{ .bool_ = value },
else => @compileError("unsupported type"),
} },
.span = span,
.tpe = SType.fromNative(T),
};
}
};
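The binder's resolution order (script environment first, then the global table, otherwise leave the identifier for the typer) can be sketched as a small runnable model. This is an illustrative, language-agnostic sketch, not any implementation's API; the names `bind_ident`, `GLOBALS`, and the tuple encoding are hypothetical.

```python
# Hypothetical model of identifier resolution in the binder.
# Tuples stand in for AST nodes: ("Constant", v), ("Global", name, type), ("Ident", name).

GLOBALS = {"HEIGHT": "Int", "SELF": "Box", "INPUTS": "Coll[Box]",
           "OUTPUTS": "Coll[Box]"}

def bind_ident(name, env):
    if name in env:                        # lift env value to a constant
        return ("Constant", env[name])
    if name in GLOBALS:                    # resolve to a typed global node
        return ("Global", name, GLOBALS[name])
    return ("Ident", name)                 # unresolved: the typer reports it

env = {"minValue": 1000000}
print(bind_ident("minValue", env))   # ('Constant', 1000000)
print(bind_ident("HEIGHT", env))     # ('Global', 'HEIGHT', 'Int')
print(bind_ident("foo", env))        # ('Ident', 'foo')
```

Note that an unresolved identifier is not an error at this stage: the binder leaves it in place, and the typer raises "Cannot assign type for variable" if no binding appears later.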
Global Constants
Built-in Global Constants
─────────────────────────────────────────────────────
Name Type Description
─────────────────────────────────────────────────────
HEIGHT Int Current block height
SELF Box Current box being spent
INPUTS Coll[Box] Transaction inputs
OUTPUTS Coll[Box] Transaction outputs
CONTEXT Context Execution context
MinerPubkey Coll[Byte] Miner's public key
LastBlockUtxoRootHash AvlTree UTXO digest
Phase 2: Type Inference
The typer assigns types to all expressions[5][6]:
const TyperError = struct {
msg: []const u8,
span: Range,
};
const TypeEnv = std.StringHashMap(SType);
const Typer = struct {
predef_env: TypeEnv,
lower_method_calls: bool,
allocator: Allocator,
pub fn init(allocator: Allocator, type_env: TypeEnv, lower: bool) Typer {
var env = TypeEnv.init(allocator);
// Add predefined function types
env.put("min", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
env.put("max", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
// Merge with provided env
var it = type_env.iterator();
while (it.next()) |entry| {
env.put(entry.key_ptr.*, entry.value_ptr.*) catch {};
}
return .{
.predef_env = env,
.lower_method_calls = lower,
.allocator = allocator,
};
}
pub fn typecheck(self: *Typer, bound: Expr) TyperError!Expr {
const typed = try self.assignType(&self.predef_env, bound);
if (typed.tpe == null) {
return TyperError{
.msg = "No type assigned to expression",
.span = bound.span,
};
}
return typed;
}
fn assignType(self: *Typer, env: *const TypeEnv, expr: Expr) TyperError!Expr {
return switch (expr.kind) {
// Identifier: lookup in environment
.ident => |name| blk: {
if (env.get(name)) |t| {
break :blk .{
.kind = expr.kind,
.span = expr.span,
.tpe = t,
};
}
return TyperError{
.msg = "Cannot assign type for variable",
.span = expr.span,
};
},
// Global variables already typed
.global_vars => |g| .{
.kind = expr.kind,
.span = expr.span,
.tpe = g.tpe(),
},
// Block: extend environment with each binding
// NOTE: In production, use a binding stack instead of cloning HashMap
// for each scope. See ZIGMA_STYLE.md for zero-allocation patterns.
.block => |block| blk: {
var cur_env = try env.clone();
var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);
for (block.bindings, 0..) |binding, i| {
const rhs = try self.assignType(&cur_env, binding.rhs.*);
try cur_env.put(binding.name, rhs.tpe.?);
new_bindings[i] = .{
.name = binding.name,
.tpe = rhs.tpe.?,
.rhs = rhs,
};
}
const body = try self.assignType(&cur_env, block.body.*);
break :blk .{
.kind = .{ .block = .{
.bindings = new_bindings,
.body = body,
} },
.span = expr.span,
.tpe = body.tpe,
};
},
// Binary: type operands, check compatibility
.binary => |bin| blk: {
const left = try self.assignType(env, bin.lhs.*);
const right = try self.assignType(env, bin.rhs.*);
const result_type = try inferBinaryType(
bin.op,
left.tpe.?,
right.tpe.?,
);
break :blk .{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = left,
.rhs = right,
} },
.span = expr.span,
.tpe = result_type,
};
},
// If: check condition is Boolean, branches have same type
.if_ => |if_expr| blk: {
const cond = try self.assignType(env, if_expr.cond.*);
const then_ = try self.assignType(env, if_expr.then_.*);
const else_ = try self.assignType(env, if_expr.else_.*);
if (cond.tpe.? != .s_boolean) {
return TyperError{
.msg = "Condition must be Boolean",
.span = cond.span,
};
}
if (!typesEqual(then_.tpe.?, else_.tpe.?)) {
return TyperError{
.msg = "Branches must have same type",
.span = expr.span,
};
}
break :blk .{
.kind = .{ .if_ = .{
.cond = cond,
.then_ = then_,
.else_ = else_,
} },
.span = expr.span,
.tpe = then_.tpe,
};
},
// Lambda: check argument types, type body
.lambda => |lam| blk: {
var lambda_env = try env.clone();
for (lam.args) |arg| {
if (arg.tpe == .no_type) {
return TyperError{
.msg = "Lambda argument must have explicit type",
.span = expr.span,
};
}
try lambda_env.put(arg.name, arg.tpe);
}
const body = try self.assignType(&lambda_env, lam.body.*);
var arg_types = try self.allocator.alloc(SType, lam.args.len);
for (lam.args, 0..) |arg, i| arg_types[i] = arg.tpe;
const func_type = SType{
.s_func = .{
.args = arg_types,
.ret = body.tpe.?,
},
};
break :blk .{
.kind = .{ .lambda = .{
.args = lam.args,
.body = body,
} },
.span = expr.span,
.tpe = func_type,
};
},
// Method call: type receiver, resolve method, unify types
.select => |sel| try self.typeSelect(env, sel, expr.span),
.apply => |app| try self.typeApply(env, app, expr.span),
// Literals already typed
.literal => |lit| .{
.kind = expr.kind,
.span = expr.span,
.tpe = switch (lit) {
.int => .s_int,
.long => .s_long,
.bool_ => .s_boolean,
.string => .{ .s_coll = .s_byte },
},
},
else => expr,
};
}
};
Binary Operation Type Inference
fn inferBinaryType(op: BinaryOp, left: SType, right: SType) error{TypeMismatch}!SType {
return switch (op) {
// Arithmetic: operands must be same numeric type
.plus, .minus, .multiply, .divide, .modulo => blk: {
if (!left.isNumeric() or !right.isNumeric()) {
return error.TypeMismatch;
}
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk left;
},
// Comparison: operands must be same type, result is Boolean
.lt, .gt, .le, .ge => blk: {
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk .s_boolean;
},
// Equality: operands must be same type
.eq, .neq => blk: {
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk .s_boolean;
},
// Logical: Boolean operands
.and_, .or_ => blk: {
if (left == .s_boolean and right == .s_boolean) {
break :blk .s_boolean;
}
// SigmaProp operations
if (left == .s_sigma_prop and right == .s_sigma_prop) {
break :blk .s_sigma_prop;
}
// Mixed: SigmaProp with Boolean
if ((left == .s_sigma_prop and right == .s_boolean) or
(left == .s_boolean and right == .s_sigma_prop))
{
break :blk .s_boolean;
}
return error.TypeMismatch;
},
// Bitwise: numeric operands
.bit_and, .bit_or, .bit_xor => blk: {
if (!left.isNumeric() or !right.isNumeric()) {
return error.TypeMismatch;
}
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk left;
},
};
}
Type Unification
Finds a substitution making two types equal[7]:
const TypeSubst = std.StringHashMap(SType);
fn unifyTypes(allocator: Allocator, t1: SType, t2: SType) ?TypeSubst {
var subst = TypeSubst.init(allocator);
return switch (t1) {
// Type variable matches anything
.s_type_var => |name| blk: {
subst.put(name, t2) catch return null;
break :blk subst;
},
// Collection types: unify element types
.s_coll => |elem1| switch (t2) {
.s_coll => |elem2| unifyTypes(allocator, elem1, elem2),
else => null,
},
// Option types: unify element types
.s_option => |elem1| switch (t2) {
.s_option => |elem2| unifyTypes(allocator, elem1, elem2),
else => null,
},
// Tuple types: unify element-wise
.s_tuple => |items1| switch (t2) {
.s_tuple => |items2| blk: {
if (items1.len != items2.len) break :blk null;
for (items1, items2) |i1, i2| {
const sub = unifyTypes(allocator, i1, i2) orelse break :blk null;
subst = mergeSubst(subst, sub) orelse break :blk null;
}
break :blk subst;
},
else => null,
},
// Function types: unify domain and range
.s_func => |f1| switch (t2) {
.s_func => |f2| blk: {
if (f1.args.len != f2.args.len) break :blk null;
for (f1.args, f2.args) |a1, a2| {
const sub = unifyTypes(allocator, a1, a2) orelse break :blk null;
subst = mergeSubst(subst, sub) orelse break :blk null;
}
const ret_sub = unifyTypes(allocator, f1.ret, f2.ret) orelse break :blk null;
break :blk mergeSubst(subst, ret_sub);
},
else => null,
},
// Boolean can unify with SigmaProp (implicit conversion)
.s_boolean => switch (t2) {
.s_sigma_prop, .s_boolean => subst,
else => null,
},
// SAny matches anything
.s_any => subst,
// Primitive types must match exactly
else => if (typesEqual(t1, t2)) subst else null,
};
}
fn applySubst(allocator: Allocator, tpe: SType, subst: TypeSubst) SType {
return switch (tpe) {
.s_type_var => |name| subst.get(name) orelse tpe,
.s_coll => |elem| .{ .s_coll = applySubst(allocator, elem, subst) },
.s_option => |elem| .{ .s_option = applySubst(allocator, elem, subst) },
.s_tuple => |items| blk: {
const new_items = allocator.alloc(SType, items.len) catch unreachable;
for (items, 0..) |t, i| new_items[i] = applySubst(allocator, t, subst);
break :blk .{ .s_tuple = new_items };
},
.s_func => |f| blk: {
const new_args = allocator.alloc(SType, f.args.len) catch unreachable;
for (f.args, 0..) |t, i| new_args[i] = applySubst(allocator, t, subst);
break :blk .{
.s_func = .{
.args = new_args,
.ret = applySubst(allocator, f.ret, subst),
},
};
},
else => tpe,
};
}
fn mergeSubst(s1: TypeSubst, s2: TypeSubst) ?TypeSubst {
var result = s1.clone() catch return null;
var it = s2.iterator();
while (it.next()) |entry| {
if (result.get(entry.key_ptr.*)) |existing| {
if (!typesEqual(existing, entry.value_ptr.*)) {
return null; // Conflict
}
} else {
result.put(entry.key_ptr.*, entry.value_ptr.*) catch return null;
}
}
return result;
}
Unification Example
Generic Method Specialization
─────────────────────────────────────────────────────
coll.map(f) where:
- coll: Coll[Byte]
- map type: (Coll[T], T => R) => Coll[R]
- f: Byte => Int
Step 1: Unify Coll[T] with Coll[Byte]
Result: {T → Byte}
Step 2: Unify (T => R) with (Byte => Int)
T already bound to Byte ✓
Result: {T → Byte, R → Int}
Step 3: Apply substitution to result type
Coll[R] → Coll[Int]
Final: map specialized to (Coll[Byte], Byte => Int) => Coll[Int]
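The specialization steps above can be reproduced with a toy unifier. This is an illustrative sketch, not the actual implementation: types are encoded as plain tuples like `("Coll", "Byte")`, and single-letter uppercase strings stand in for type variables.

```python
# Hypothetical toy unifier mirroring the walkthrough above.

def unify(t1, t2, subst=None):
    """Return a substitution {var: type} making t1 equal to t2, or None."""
    subst = dict(subst or {})
    if isinstance(t1, str) and len(t1) == 1 and t1.isupper():  # type variable
        if t1 in subst and subst[t1] != t2:
            return None                     # conflicting binding
        subst[t1] = t2
        return subst
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if t1[0] != t2[0] or len(t1) != len(t2):
            return None                     # different type constructors
        for a, b in zip(t1[1:], t2[1:]):    # unify component-wise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if t1 == t2 else None

def apply_subst(t, subst):
    """Specialize a type by replacing bound type variables."""
    if isinstance(t, str):
        return subst.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(x, subst) for x in t[1:])
    return t

# map: (Coll[T], T => R) => Coll[R] applied to Coll[Byte] and Byte => Int
s = unify(("Coll", "T"), ("Coll", "Byte"))
s = unify(("Func", "T", "R"), ("Func", "Byte", "Int"), s)
print(s)                              # {'T': 'Byte', 'R': 'Int'}
print(apply_subst(("Coll", "R"), s))  # ('Coll', 'Int')
```

The conflict check inside the type-variable case plays the role of `mergeSubst`: a variable already bound to a different type makes unification fail.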
Method Resolution
Methods are looked up in the type's methods container[8]:
const MethodsContainer = struct {
const methods_by_type = std.StaticStringMap([]const MethodInfo).initComptime(.{
.{ "SBox", &box_methods },
.{ "SColl", &coll_methods },
.{ "SContext", &context_methods },
// ...
});
pub fn getMethod(tpe: SType, name: []const u8) ?MethodInfo {
const type_name = tpe.typeName();
if (methods_by_type.get(type_name)) |methods| {
for (methods) |m| {
if (std.mem.eql(u8, m.name, name)) {
return m;
}
}
}
return null;
}
};
const MethodInfo = struct {
name: []const u8,
stype: SType,
ir_builder: ?*const fn (Expr, []const Expr) Expr,
};
const box_methods = [_]MethodInfo{
.{ .name = "value", .stype = .s_long, .ir_builder = null },
.{ .name = "propositionBytes", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
.{ .name = "id", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
.{ .name = "tokens", .stype = .{ .s_coll = .{ .s_tuple = &[_]SType{
.{ .s_coll = .s_byte }, .s_long,
} } }, .ir_builder = null },
// ...
};
const coll_methods = [_]MethodInfo{
.{ .name = "size", .stype = .s_int, .ir_builder = &buildSizeOf },
.{ .name = "map", .stype = .{ .s_func = .{
.args = &[_]SType{ .{ .s_type_var = "T" }, .{ .s_func = .{
.args = &[_]SType{.{ .s_type_var = "T" }},
.ret = .{ .s_type_var = "R" },
} } },
.ret = .{ .s_coll = .{ .s_type_var = "R" } },
} }, .ir_builder = &buildMapCollection },
// ...
};
Method Lowering
When lower_method_calls = true, method calls become IR nodes[9]:
fn typeSelect(
self: *Typer,
env: *const TypeEnv,
sel: SelectExpr,
span: Range,
) TyperError!Expr {
const receiver = try self.assignType(env, sel.obj.*);
const receiver_type = receiver.tpe.?;
const method = MethodsContainer.getMethod(receiver_type, sel.field) orelse {
return TyperError{
.msg = "Method not found",
.span = span,
};
};
// Specialize generic method type
const specialized = specializeMethod(method.stype, receiver_type);
// Lower to IR node if builder available
if (method.ir_builder) |builder| {
if (self.lower_method_calls) {
return builder(receiver, &[_]Expr{});
}
}
// Keep as method call
return .{
.kind = .{ .select = .{
.obj = receiver,
.field = sel.field,
} },
.span = span,
.tpe = specialized,
};
}
fn buildSizeOf(receiver: Expr, _: []const Expr) Expr {
return .{
.kind = .{ .size_of = receiver },
.span = receiver.span,
.tpe = .s_int,
};
}
fn buildMapCollection(receiver: Expr, args: []const Expr) Expr {
return .{
.kind = .{ .map = .{
.input = receiver,
.mapper = args[0],
} },
.span = receiver.span,
.tpe = args[0].tpe.?.s_func.ret,
};
}
MIR Lowering
After typing, HIR lowers to MIR (typed IR)[10]:
const MirLoweringError = struct {
msg: []const u8,
span: Range,
};
pub fn lower(hir_expr: hir.Expr) MirLoweringError!mir.Expr {
const mir_expr: mir.Expr = switch (hir_expr.kind) {
.global_vars => |g| switch (g) {
.height => mir.GlobalVars.height.toExpr(),
.self_ => mir.GlobalVars.self_.toExpr(),
// ...
},
.ident => return MirLoweringError{
.msg = "Unresolved identifier",
.span = hir_expr.span,
},
.binary => |bin| blk: {
const left = try lower(bin.lhs.*);
const right = try lower(bin.rhs.*);
break :blk mir.BinOp{
.kind = bin.op.toMirOp(),
.left = left,
.right = right,
}.toExpr();
},
.literal => |lit| switch (lit) {
.int => |v| mir.Constant{ .int = v }.toExpr(),
.long => |v| mir.Constant{ .long = v }.toExpr(),
.bool_ => |v| (if (v) mir.TrueLeaf else mir.FalseLeaf).toExpr(),
},
// ...
};
// Verify types match
const hir_tpe = hir_expr.tpe orelse return MirLoweringError{
.msg = "Missing type for HIR expression",
.span = hir_expr.span,
};
if (!typesEqual(mir_expr.tpe(), hir_tpe)) {
return MirLoweringError{
.msg = "Type mismatch after lowering",
.span = hir_expr.span,
};
}
return mir_expr;
}
Complete Compilation Flow
pub fn compile(source: []const u8, env: ScriptEnv) CompileError!mir.Expr {
// 1. Parse
const tokens = Lexer.init(source).tokenize();
const events = Parser.init(tokens).parse();
const ast = buildTree(events, tokens);
// 2. Lower to HIR (named to avoid shadowing the `hir` module)
const hir_expr = try hir.lower(ast);
// 3. Bind
const binder = Binder.init(allocator, env);
const bound = try binder.bind(hir_expr);
// 4. Type (`typecheck` takes *Typer, so the binding must be mutable)
var typer = Typer.init(allocator, TypeEnv.init(allocator), true);
const typed = try typer.typecheck(bound);
// 5. Lower to MIR
const mir_expr = try mir.lower(typed);
return mir_expr;
}
Error Messages
Error Types
─────────────────────────────────────────────────────
BinderError:
- "Variable x already defined"
- "Cannot lift value to constant"
TyperError:
- "Cannot assign type for variable 'foo'"
- "Condition must be Boolean, got Int"
- "Branches must have same type: Int vs Long"
- "Method 'bar' not found in type Box"
MirLoweringError:
- "Unresolved identifier"
- "Type mismatch after lowering"
Summary
Semantic analysis consists of two phases:
Binding (Binder):
- Resolves global names (HEIGHT, SELF, etc.)
- Lifts environment values to constants
- Uses bottom-up tree rewriting
Typing (Typer):
- Assigns types to all expressions
- Resolves method calls via MethodsContainer
- Unifies generic types with concrete types
- Optionally lowers method calls to IR nodes
- Checks type consistency
Key algorithms:
- Type unification: Find substitution making types equal
- Substitution application: Specialize generic types
- Method resolution: Look up methods in type's container
Next: Chapter 18: Intermediate Representation
1. Scala: SigmaBinder.scala
2. Rust: binder.rs
3. Scala: SigmaBinder.scala:30-100
4. Rust: binder.rs:26-61
5. Scala: SigmaTyper.scala
6. Rust: type_infer.rs
7. Scala: package.scala (unifyTypes)
8. Scala: SRMethod.scala
9. Scala: SigmaTyper.scala:200-280
10. Rust: lower.rs:29-76
Chapter 18: Intermediate Representation (IR)
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 17 for the typed AST that feeds into IR construction
- Chapter 5 for operation codes that IR nodes map to
- Understanding of compiler optimization concepts: CSE, dead code elimination
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the graph-based IR design using the Def/Ref pattern
- Implement common subexpression elimination (CSE) via hash-consing
- Apply graph rewriting for algebraic simplifications
- Trace the AST → Graph IR → Optimized Tree transformations
IR Architecture Overview
The Scala compiler uses a sophisticated graph-based IR for optimization[1][2]. The Rust compiler uses a simpler direct HIR→MIR pipeline[3].
Compilation Pipelines
─────────────────────────────────────────────────────
Scala (Graph IR):
┌─────────┐ GraphBuilding ┌──────────┐ TreeBuilding ┌──────────┐
│ Typed │ ─────────────────>│ Graph IR │ ─────────────────>│ Optimized│
│ AST │ (+ CSE) │ (Def/Ref)│ (ValDef min) │ ErgoTree │
└─────────┘ └──────────┘ └──────────┘
│
│ DefRewriting
│ (algebraic simplifications)
▼
Rust (Direct):
┌─────────┐ Lower ┌──────────┐ Lower ┌──────────┐ Check ┌──────────┐
│ HIR │ ─────────> │ Bound │ ─────────> │ Typed │ ─────────>│ MIR/ │
│ (parse) │ │ HIR │ │ HIR │ │ ErgoTree │
└─────────┘ └──────────┘ └──────────┘ └──────────┘
The Def/Ref Pattern
The core IR abstraction uses definitions (nodes) and references (edges)[4][5]:
/// Reference to a definition (graph edge)
/// Like a pointer but with type information
const Sym = u32; // Symbol ID
/// Type descriptor for IR values
const Elem = struct {
stype: SType,
source_type: ?*const std.builtin.Type,
};
/// Base type for all graph nodes
const Node = struct {
/// Unique ID assigned on creation
node_id: u32,
/// Cached dependencies (other nodes this one uses)
deps: ?[]const Sym,
/// Cached hash for structural equality
hash_code: u32,
pub fn getDeps(self: *const Node) []const Sym {
if (self.deps) |d| return d;
// Computed lazily from node contents
return computeDeps(self);
}
};
/// Definition of a computation (graph node)
const Def = struct {
node: Node,
/// Type of the result value
result_type: Elem,
/// Reference to this definition (created lazily)
self_ref: ?Sym,
pub fn self(d: *Def, ctx: *IRContext) Sym {
if (d.self_ref) |s| return s;
const sym = ctx.freshSym(d);
d.self_ref = sym;
return sym;
}
};
IR Context
The IR context manages the graph and provides CSE[6][7]:
const IRContext = struct {
allocator: Allocator,
/// Counter for unique node IDs
id_counter: u32,
/// Global definitions: Def hash → Sym
/// This enables CSE through hash-consing
global_defs: std.HashMap(*const Def, Sym, DefHashContext, 80),
/// Sym → Def mapping
sym_to_def: std.AutoHashMap(Sym, *const Def),
pub fn init(allocator: Allocator) IRContext {
return .{
.allocator = allocator,
.id_counter = 0,
.global_defs = std.HashMap(*const Def, Sym, DefHashContext, 80).init(allocator),
.sym_to_def = std.AutoHashMap(Sym, *const Def).init(allocator),
};
}
/// Generate fresh symbol ID
pub fn freshSym(self: *IRContext, def: *const Def) Sym {
const id = self.id_counter;
self.id_counter += 1;
self.sym_to_def.put(id, def) catch unreachable;
return id;
}
/// Create or reuse existing definition (CSE)
pub fn reifyObject(self: *IRContext, d: *Def) Sym {
return self.findOrCreateDefinition(d);
}
/// Hash-consing: lookup by structural equality
fn findOrCreateDefinition(self: *IRContext, d: *Def) Sym {
if (self.global_defs.get(d)) |existing_sym| {
// Reuse existing definition
return existing_sym;
}
// Register new definition
const sym = d.self(self);
self.global_defs.put(d, sym) catch unreachable;
return sym;
}
};
/// Hash context for structural equality of definitions
const DefHashContext = struct {
pub fn hash(_: DefHashContext, def: *const Def) u64 {
// Hash based on node type and contents (structural)
return def.node.hash_code;
}
pub fn eql(_: DefHashContext, a: *const Def, b: *const Def) bool {
// Structural equality of definitions
return structuralEqual(a, b);
}
};
Common Subexpression Elimination
CSE is achieved automatically through hash-consing[8]:
CSE Through Hash-Consing
─────────────────────────────────────────────────────
Source:
val a = SELF.value
val b = SELF.value // Same computation!
a + b
Step 1: Build graph for SELF.value
s1 = Self
s2 = MethodCall(s1, "value") → stored in global_defs
Step 2: Build graph for second SELF.value
s1 = Self → already exists, reuse
s2 = MethodCall(s1, "value") → lookup in global_defs
→ found! return existing s2
Step 3: Build addition
s3 = Plus(s2, s2) → both operands point to s2
Result: Single computation of SELF.value
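The hash-consing mechanism in this walkthrough can be modeled in a few lines. This is an illustrative sketch, not the real `IRContext` API: plain tuples stand in for structural equality of definitions, and symbol IDs are just counters.

```python
# Hypothetical hash-consing model: structurally equal definitions are
# interned, so building the same computation twice yields the same symbol.

class IRContext:
    def __init__(self):
        self.global_defs = {}   # structural key -> symbol id
        self.next_sym = 0

    def reify(self, key):
        """Return the existing symbol for `key`, or intern a new one (CSE)."""
        if key in self.global_defs:
            return self.global_defs[key]
        sym = self.next_sym
        self.next_sym += 1
        self.global_defs[key] = sym
        return sym

ctx = IRContext()
s_self = ctx.reify(("Self",))
a = ctx.reify(("MethodCall", s_self, "value"))   # val a = SELF.value
b = ctx.reify(("MethodCall", s_self, "value"))   # val b = SELF.value (reused!)
plus = ctx.reify(("Plus", a, b))
print(a == b)          # True: a single node serves both bindings
print(ctx.next_sym)    # 3 nodes total: Self, SELF.value, Plus
```

Because `reify` is the only way to create a node, duplicate subexpressions can never enter the graph; CSE is a structural invariant rather than a separate pass.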
/// Build graph from typed AST
const GraphBuilder = struct {
ctx: *IRContext,
env: std.StringHashMap(Sym),
pub fn buildGraph(self: *GraphBuilder, expr: *const TypedExpr) !Sym {
return switch (expr.kind) {
.constant => |c| self.buildConstant(c),
.val_use => |name| self.env.get(name) orelse error.UndefinedVariable,
.block => |b| self.buildBlock(b),
.bin_op => |op| self.buildBinOp(op),
.method_call => |mc| self.buildMethodCall(mc),
.if_expr => |i| self.buildIf(i),
.func_value => |f| self.buildLambda(f),
.apply => |a| self.buildApply(a),
};
}
fn buildConstant(self: *GraphBuilder, c: Constant) Sym {
const def = self.ctx.allocator.create(ConstDef) catch unreachable;
def.* = .{ .value = c };
// CSE: if same constant exists, reuse it
return self.ctx.reifyObject(&def.base);
}
fn buildBinOp(self: *GraphBuilder, op: *const BinOp) !Sym {
const left_sym = try self.buildGraph(op.left);
const right_sym = try self.buildGraph(op.right);
const def = self.ctx.allocator.create(BinOpDef) catch unreachable;
def.* = .{
.op = op.kind,
.left = left_sym,
.right = right_sym,
};
// CSE: reuse if same operation on same operands exists
return self.ctx.reifyObject(&def.base);
}
fn buildMethodCall(self: *GraphBuilder, mc: *const MethodCall) !Sym {
const receiver_sym = try self.buildGraph(mc.receiver);
var arg_syms = try self.ctx.allocator.alloc(Sym, mc.args.len);
for (mc.args, 0..) |arg, i| {
arg_syms[i] = try self.buildGraph(arg);
}
const def = self.ctx.allocator.create(MethodCallDef) catch unreachable;
def.* = .{
.receiver = receiver_sym,
.method = mc.method,
.args = arg_syms,
};
// CSE: reuse if identical method call exists
return self.ctx.reifyObject(&def.base);
}
};
Graph Rewriting
Algebraic simplifications are applied as rewrite rules[9][10]:
/// Rewriting rules for optimization
const DefRewriter = struct {
ctx: *IRContext,
/// Called on each new definition
/// Returns replacement Sym or null for no rewrite
pub fn rewriteDef(self: *DefRewriter, d: *const Def) ?Sym {
return switch (d.kind()) {
.coll_length => self.rewriteLength(d.as(CollLengthDef)),
.coll_map => self.rewriteMap(d.as(CollMapDef)),
.coll_zip => self.rewriteZip(d.as(CollZipDef)),
.option_get_or_else => self.rewriteGetOrElse(d.as(OptionGetOrElseDef)),
else => null,
};
}
/// xs.map(f).length => xs.length
fn rewriteLength(self: *DefRewriter, len_def: *const CollLengthDef) ?Sym {
const input = self.ctx.getDef(len_def.input);
return switch (input.kind()) {
.coll_map => |map_def| {
// Rule: xs.map(f).length => xs.length
return self.makeLength(map_def.input);
},
.coll_replicate => |rep_def| {
// Rule: replicate(len, v).length => len
return rep_def.length;
},
.const_coll => |coll_def| {
// Rule: Const(coll).length => coll.length
return self.makeConstant(.{ .int = @intCast(coll_def.items.len) });
},
.coll_from_items => |items_def| {
// Rule: Coll(items).length => items.length
return self.makeConstant(.{ .int = @intCast(items_def.items.len) });
},
else => null,
};
}
/// xs.map(identity) => xs
/// xs.map(f).map(g) => xs.map(x => g(f(x)))
fn rewriteMap(self: *DefRewriter, map_def: *const CollMapDef) ?Sym {
const mapper = self.ctx.getDef(map_def.mapper);
// Rule: xs.map(identity) => xs
if (isIdentityLambda(mapper)) {
return map_def.input;
}
const input = self.ctx.getDef(map_def.input);
return switch (input.kind()) {
.coll_replicate => |rep_def| {
// Rule: replicate(l, v).map(f) => replicate(l, f(v))
const applied = self.makeApply(map_def.mapper, rep_def.value);
return self.makeReplicate(rep_def.length, applied);
},
.coll_map => |inner_map| {
// Rule: xs.map(f).map(g) => xs.map(x => g(f(x)))
const composed = self.composeLambdas(inner_map.mapper, map_def.mapper);
return self.makeMap(inner_map.input, composed);
},
else => null,
};
}
/// replicate(l, x).zip(replicate(l, y)) => replicate(l, (x, y))
fn rewriteZip(self: *DefRewriter, zip_def: *const CollZipDef) ?Sym {
const left = self.ctx.getDef(zip_def.left);
const right = self.ctx.getDef(zip_def.right);
if (left.kind() == .coll_replicate and right.kind() == .coll_replicate) {
const rep_l = left.as(CollReplicateDef);
const rep_r = right.as(CollReplicateDef);
// Check same length and builder
if (rep_l.length == rep_r.length and rep_l.builder == rep_r.builder) {
const pair = self.makePair(rep_l.value, rep_r.value);
return self.makeReplicate(rep_l.length, pair);
}
}
return null;
}
/// Some(x).getOrElse(d) => x
fn rewriteGetOrElse(self: *DefRewriter, def: *const OptionGetOrElseDef) ?Sym {
const opt = self.ctx.getDef(def.option);
if (opt.kind() == .option_const) {
const opt_const = opt.as(OptionConstDef);
if (opt_const.value) |v| {
return self.liftValue(v);
}
}
return null;
}
};
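The map rules above can be exercised on a toy term representation. This is an illustrative sketch only: the tuple encoding and the `rewrite_map` helper are hypothetical stand-ins for the graph-based rewriter, and functions are composed directly rather than by building a composed lambda node.

```python
# Toy rewriter for two of the rules above:
#   xs.map(identity)  => xs
#   xs.map(f).map(g)  => xs.map(g . f)   (map fusion)

IDENTITY = lambda x: x

def compose(f, g):
    return lambda x: g(f(x))

def rewrite_map(term):
    """Rewrite ("Map", xs, f) terms bottom-up."""
    if not (isinstance(term, tuple) and term[0] == "Map"):
        return term
    xs, f = rewrite_map(term[1]), term[2]
    if f is IDENTITY:
        return xs                                   # identity elimination
    if isinstance(xs, tuple) and xs[0] == "Map":
        inner_xs, inner_f = xs[1], xs[2]
        return ("Map", inner_xs, compose(inner_f, f))  # fuse the two maps
    return ("Map", xs, f)

term = ("Map", ("Map", ("Coll", [1, 2, 3]), lambda x: x + 1), lambda x: x * 2)
fused = rewrite_map(term)
print(fused[1])                            # ('Coll', [1, 2, 3]): one map left
print([fused[2](x) for x in [1, 2, 3]])    # [4, 6, 8] = (x + 1) * 2
```

Fusion matters here because a single `map` over the collection serializes to a smaller ErgoTree and avoids materializing the intermediate collection at evaluation time.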
Sigma-Specific Rewrites
Special optimizations for Sigma propositions[11]:
/// Sigma-specific rewriting rules
const SigmaRewriter = struct {
ctx: *IRContext,
pub fn rewriteSigma(self: *SigmaRewriter, d: *const Def) ?Sym {
return switch (d.kind()) {
.sigma_prop_is_valid => self.rewriteIsValid(d),
.sigma_prop_from_bool => self.rewriteSigmaProp(d),
.all_of => self.rewriteAllOf(d),
.any_of => self.rewriteAnyOf(d),
else => null,
};
}
/// sigmaProp(sp.isValid) => sp
fn rewriteIsValid(self: *SigmaRewriter, d: *const Def) ?Sym {
const is_valid = d.as(SigmaIsValidDef);
const inner = self.ctx.getDef(is_valid.prop);
if (inner.kind() == .sigma_prop_from_bool) {
const from_bool = inner.as(SigmaPropFromBoolDef);
// Check if the bool is another isValid
const bool_def = self.ctx.getDef(from_bool.bool_expr);
if (bool_def.kind() == .sigma_prop_is_valid) {
return bool_def.as(SigmaIsValidDef).prop;
}
}
return null;
}
/// sigmaProp(b).isValid => b
fn rewriteSigmaProp(self: *SigmaRewriter, d: *const Def) ?Sym {
_ = d;
_ = self;
// This rewrite is handled in rewriteIsValid
return null;
}
/// allOf(Coll(b1, ..., sp1.isValid, ...)) =>
/// (allOf(Coll(b1, ...)) && allZK(sp1, ...)).isValid
fn rewriteAllOf(self: *SigmaRewriter, d: *const Def) ?Sym {
const all_of = d.as(AllOfDef);
const items = self.extractItems(all_of.input) orelse return null;
var bools = std.ArrayList(Sym).init(self.ctx.allocator);
var sigmas = std.ArrayList(Sym).init(self.ctx.allocator);
for (items) |item| {
const item_def = self.ctx.getDef(item);
if (item_def.kind() == .sigma_prop_is_valid) {
const is_valid = item_def.as(SigmaIsValidDef);
sigmas.append(is_valid.prop) catch unreachable;
} else {
bools.append(item) catch unreachable;
}
}
if (sigmas.items.len == 0) return null;
// Build: (allOf(bools) && allZK(sigmas)).isValid
const zk_all = self.makeAllZK(sigmas.items);
if (bools.items.len == 0) {
return self.makeIsValid(zk_all);
}
const bool_all = self.makeSigmaProp(self.makeAllOf(bools.items));
const combined = self.makeSigmaAnd(bool_all, zk_all);
return self.makeIsValid(combined);
}
};
Tree Building
Transform the optimized graph back to ErgoTree[12][13]:
/// Transform graph IR to ErgoTree
const TreeBuilder = struct {
ctx: *IRContext,
/// Maps symbols to ValDef IDs
env: std.AutoHashMap(Sym, struct { id: u32, tpe: SType }),
/// Current ValDef ID counter
def_id: u32,
allocator: Allocator,
pub fn buildTree(self: *TreeBuilder, root: Sym) !*Expr {
// Compute usage counts to minimize ValDefs
const usage = self.computeUsageCounts(root);
// Build topological schedule
const schedule = self.buildSchedule(root);
// Process nodes, introducing ValDefs only for multi-use
var val_defs = std.ArrayList(ValDef).init(self.allocator);
for (schedule) |sym| {
if (usage.get(sym).? > 1) {
// Multi-use node: create ValDef
const rhs = try self.buildValue(sym);
const tpe = self.ctx.getDef(sym).result_type.stype;
try val_defs.append(.{
.id = self.def_id,
.tpe = tpe,
.rhs = rhs,
});
try self.env.put(sym, .{ .id = self.def_id, .tpe = tpe });
self.def_id += 1;
}
}
// Build result expression
const result = try self.buildValue(root);
// Wrap in block if we have ValDefs
if (val_defs.items.len == 0) {
return result;
}
return self.makeBlock(val_defs.items, result);
}
fn buildValue(self: *TreeBuilder, sym: Sym) !*Expr {
// Check if already bound in environment
if (self.env.get(sym)) |binding| {
return self.makeValUse(binding.id, binding.tpe);
}
const def = self.ctx.getDef(sym);
return switch (def.kind()) {
.constant => |c| self.makeConstant(c),
.context_prop => |prop| self.buildContextProp(prop),
.method_call => |mc| self.buildMethodCall(mc),
.bin_op => |op| self.buildBinOp(op),
.lambda => |lam| self.buildLambda(lam),
.apply => |app| self.buildApply(app),
.if_then_else => |ite| self.buildIf(ite),
else => error.UnhandledDefKind,
};
}
fn computeUsageCounts(self: *TreeBuilder, root: Sym) std.AutoHashMap(Sym, u32) {
var counts = std.AutoHashMap(Sym, u32).init(self.allocator);
self.countUsagesRecursive(root, &counts);
return counts;
}
fn countUsagesRecursive(self: *TreeBuilder, sym: Sym, counts: *std.AutoHashMap(Sym, u32)) void {
const current = counts.get(sym) orelse 0;
counts.put(sym, current + 1) catch unreachable;
// Only traverse dependencies on first visit
if (current == 0) {
const def = self.ctx.getDef(sym);
for (def.node.getDeps()) |dep| {
self.countUsagesRecursive(dep, counts);
}
}
}
fn buildSchedule(self: *TreeBuilder, root: Sym) []const Sym {
// Topological sort via DFS
var visited = std.AutoHashMap(Sym, void).init(self.allocator);
var schedule = std.ArrayList(Sym).init(self.allocator);
self.dfs(root, &visited, &schedule);
return schedule.items;
}
fn dfs(self: *TreeBuilder, sym: Sym, visited: *std.AutoHashMap(Sym, void), schedule: *std.ArrayList(Sym)) void {
if (visited.contains(sym)) return;
visited.put(sym, {}) catch unreachable;
const def = self.ctx.getDef(sym);
for (def.node.getDeps()) |dep| {
self.dfs(dep, visited, schedule);
}
schedule.append(sym) catch unreachable;
}
};
Operation Translation
Map IR operations to ErgoTree nodes[14]:
/// Recognize arithmetic operations
fn translateArithOp(op: BinOpKind) ?OpCode {
return switch (op) {
.plus => OpCode.Plus,
.minus => OpCode.Minus,
.multiply => OpCode.Multiply,
.divide => OpCode.Division,
.modulo => OpCode.Modulo,
.min => OpCode.Min,
.max => OpCode.Max,
else => null,
};
}
/// Recognize comparison operations
fn translateRelationOp(op: BinOpKind) ?*const fn (*Expr, *Expr) *Expr {
return switch (op) {
.eq => makeEQ,
.neq => makeNEQ,
.gt => makeGT,
.lt => makeLT,
.ge => makeGE,
.le => makeLE,
else => null,
};
}
/// Recognize context properties
fn translateContextProp(prop: ContextProperty) *Expr {
return switch (prop) {
.height => &expr_height,
.inputs => &expr_inputs,
.outputs => &expr_outputs,
.self => &expr_self,
};
}
/// Internal definitions should not become ValDefs
fn isInternalDef(def: *const Def) bool {
return switch (def.kind()) {
.sigma_dsl_builder, .coll_builder => true,
else => false,
};
}
Rust HIR (Alternative Approach)
The Rust compiler uses a simpler tree-based HIR without graph IR[15][16]:
/// Rust-style HIR expression
const HirExpr = struct {
kind: ExprKind,
span: TextRange,
tpe: ?SType,
const ExprKind = union(enum) {
ident: []const u8,
binary: Binary,
global_vars: GlobalVars,
literal: Literal,
};
const Binary = struct {
op: Spanned(BinaryOp),
lhs: *HirExpr,
rhs: *HirExpr,
};
const GlobalVars = enum {
height,
};
const Literal = union(enum) {
int: i32,
long: i64,
};
};
/// Rewrite HIR expressions (simpler than graph rewriting)
fn rewrite(
allocator: Allocator,
e: HirExpr,
f: *const fn (*const HirExpr) ?HirExpr,
) !HirExpr {
// Apply rewrite function
const rewritten = f(&e) orelse e;
// Recursively rewrite children, heap-allocating them so the
// returned pointers do not dangle into this stack frame
return switch (rewritten.kind) {
.binary => |bin| blk: {
const new_lhs = try allocator.create(HirExpr);
new_lhs.* = try rewrite(allocator, bin.lhs.*, f);
const new_rhs = try allocator.create(HirExpr);
new_rhs.* = try rewrite(allocator, bin.rhs.*, f);
break :blk HirExpr{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = new_lhs,
.rhs = new_rhs,
} },
.span = rewritten.span,
.tpe = rewritten.tpe,
};
},
else => rewritten,
};
}
CSE Example Walkthrough
Source:
─────────────────────────────────────────────────────
{
val x = SELF.value
val y = SELF.value // Duplicate!
val z = OUTPUTS(0).value
x + y > z
}
After GraphBuilding (with CSE):
─────────────────────────────────────────────────────
s1 = Context.SELF
s2 = s1.value // Single node for both x and y
s3 = Context.OUTPUTS
s4 = s3.apply(0)
s5 = s4.value
s6 = Plus(s2, s2) // x + y = s2 + s2
s7 = GT(s6, s5)
After TreeBuilding (ValDef minimization):
─────────────────────────────────────────────────────
{
val v1 = SELF.value // s2 used twice → ValDef
GT(Plus(v1, v1), OUTPUTS(0).value)
}
Nodes s1, s3, s4, s5 have single use → inlined
Node s2 has multiple uses → ValDef introduced
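The walkthrough above can be sketched as a hash-consing table: structurally equal definitions map to the same symbol, so the second `SELF.value` lookup returns the node created for the first. This is a minimal Rust sketch; the `Def` shape and the symbol type are illustrative, not the Scala implementation's actual types.

```rust
use std::collections::HashMap;

/// Illustrative definition shape: an opcode name plus argument symbols.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Def {
    op: &'static str,
    args: Vec<u32>,
}

/// Hash-consing table: structurally equal Defs share one symbol.
struct GraphBuilder {
    defs: HashMap<Def, u32>,
    next_sym: u32,
}

impl GraphBuilder {
    fn new() -> Self {
        GraphBuilder { defs: HashMap::new(), next_sym: 0 }
    }

    /// Return the existing symbol for `def`, or allocate a new one.
    fn intern(&mut self, def: Def) -> u32 {
        if let Some(&sym) = self.defs.get(&def) {
            return sym; // CSE hit: reuse the existing node
        }
        let sym = self.next_sym;
        self.next_sym += 1;
        self.defs.insert(def, sym);
        sym
    }
}
```

Interning `ExtractAmount(SELF)` twice yields the same symbol, which is exactly why `val x` and `val y` collapse to one node `s2` in the example.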
Summary
- Def/Ref pattern separates computation definitions from references
- Hash-consing enables automatic CSE—structurally equal nodes share identity
- Graph rewriting applies algebraic simplifications (map fusion, etc.)
- TreeBuilding transforms graph back to ErgoTree with minimal ValDefs
- Usage counting determines which nodes need ValDef bindings
- Scala uses full graph IR; Rust uses simpler tree-based HIR
- IR optimizations reduce serialized ErgoTree size
- Not part of consensus—compiler-only optimization
Next: Chapter 19: Compiler Pipeline
1. Scala: IRContext.scala
2. Scala: Base.scala:17-200 (Node, Def, Ref)
3. Rust: compiler.rs:59-76 (compile pipeline)
4. Scala: Base.scala:100-160 (Def trait)
5. Rust: hir.rs:32-37 (Expr struct)
6. Scala: IRContext.scala:28-50 (cake pattern)
7. Rust: compiler.rs:78-87 (compile_hir)
8. Scala: GraphBuilding.scala:28-35 (CSE documentation)
9. Scala: IRContext.scala:105-150 (rewriteDef)
10. Rust: rewrite.rs:10-29 (rewrite function)
11. Scala: GraphBuilding.scala:75-120 (HasSigmas, AllOf)
12. Scala: TreeBuilding.scala:21-50 (TreeBuilding trait)
13. Scala: TreeBuilding.scala:60-100 (IsArithOp, IsRelationOp)
14. Scala: TreeBuilding.scala:100-140 (IsContextProperty)
15. Rust: hir.rs:146-167 (ExprKind enum)
16. Rust: hir.rs:61-94 (Expr::lower)
Chapter 19: Compiler Pipeline
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 16 for parsing ErgoScript to AST
- Chapter 17 for name binding and type inference
- Chapter 18 for IR optimization passes
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the complete compilation pipeline from ErgoScript source to ErgoTree bytecode
- Use the SigmaCompiler API to compile scripts programmatically
- Explain method call lowering strategies and when direct operations are used
- Configure compiler settings for different networks (mainnet vs testnet)
Pipeline Architecture
The ErgoScript compiler transforms source code through multiple phases[1][2]:
Compilation Pipeline
─────────────────────────────────────────────────────
Source: "sigmaProp(SELF.value > 1000L)"
│
│ (1) Parse
▼
┌─────────────────────────────────────────────────────┐
│ Untyped AST │
│ Apply(Ident("sigmaProp"), [GT(Select(...), ...)]) │
└─────────────────────────────────────────────────────┘
│
│ (2) Bind
▼
┌─────────────────────────────────────────────────────┐
│ Bound AST (names resolved) │
│ Apply(SigmaPropFunc, [GT(Self.value, 1000L)]) │
└─────────────────────────────────────────────────────┘
│
│ (3) Typecheck
▼
┌─────────────────────────────────────────────────────┐
│ Typed AST │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L)) │
│ :: SSigmaProp │
└─────────────────────────────────────────────────────┘
│
│ (4) BuildGraph (Scala only)
▼
┌─────────────────────────────────────────────────────┐
│ Graph IR (CSE applied) │
│ s1=Self, s2=s1.value, s3=1000L, s4=GT(s2,s3) │
│ s5=sigmaProp(s4) │
└─────────────────────────────────────────────────────┘
│
│ (5) BuildTree / Lower to MIR
▼
┌─────────────────────────────────────────────────────┐
│ ErgoTree │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L)) │
└─────────────────────────────────────────────────────┘
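The phases above compose as fallible stages, each consuming the previous stage's output; any failing phase aborts compilation. A minimal Rust sketch of that shape (stage names and payload types here are placeholders, not the real compiler's types):

```rust
/// Placeholder payloads for each phase's output.
struct Ast(String);
struct BoundAst(String);
struct TypedAst(String);
struct ErgoTree(String);

fn parse(src: &str) -> Result<Ast, String> {
    Ok(Ast(format!("parsed({src})")))
}
fn bind(ast: Ast) -> Result<BoundAst, String> {
    Ok(BoundAst(format!("bound({})", ast.0)))
}
fn typecheck(ast: BoundAst) -> Result<TypedAst, String> {
    Ok(TypedAst(format!("typed({})", ast.0)))
}
fn build_tree(ast: TypedAst) -> Result<ErgoTree, String> {
    Ok(ErgoTree(format!("tree({})", ast.0)))
}

/// Phases chained with `?`: the first error short-circuits the pipeline.
fn compile(src: &str) -> Result<ErgoTree, String> {
    let parsed = parse(src)?;
    let bound = bind(parsed)?;
    let typed = typecheck(bound)?;
    build_tree(typed)
}
```

The Scala pipeline inserts a graph-building stage between typecheck and tree building; the Rust pipeline lowers directly, but the staged, early-abort structure is the same.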
Compiler Settings
Configuration controls optimization and network behavior[3]:
const CompilerSettings = struct {
/// Network prefix for address decoding (mainnet=0, testnet=16)
network_prefix: u8,
/// Whether to lower MethodCall to direct nodes
lower_method_calls: bool,
/// Builder for creating ErgoTree nodes
builder: *const SigmaBuilder,
pub fn mainnet() CompilerSettings {
return .{
.network_prefix = 0x00,
.lower_method_calls = true,
.builder = &TransformingSigmaBuilder,
};
}
pub fn testnet() CompilerSettings {
return .{
.network_prefix = 0x10,
.lower_method_calls = true,
.builder = &TransformingSigmaBuilder,
};
}
};
SigmaCompiler Implementation
The compiler orchestrates all phases[4][5]:
const SigmaCompiler = struct {
settings: CompilerSettings,
allocator: Allocator,
pub fn init(settings: CompilerSettings, allocator: Allocator) SigmaCompiler {
return .{
.settings = settings,
.allocator = allocator,
};
}
/// Phase 1: Parse source to AST
pub fn parse(self: *const SigmaCompiler, source: []const u8) !*Expr {
var parser = Parser.init(source, self.allocator);
return parser.parseExpr() catch return error.ParserError;
}
/// Phases 2-3: Bind and typecheck
pub fn typecheck(
self: *const SigmaCompiler,
env: *const ScriptEnv,
parsed: *const Expr,
) !*TypedExpr {
// Phase 2: Bind names
const predef_registry = PredefinedFuncRegistry.init(self.settings.builder);
var binder = Binder.init(env, self.settings.builder, self.settings.network_prefix, &predef_registry);
const bound = try binder.bind(parsed);
// Phase 3: Type inference and checking
const type_env = env.collectTypes();
var typer = Typer.init(
self.settings.builder,
&predef_registry,
type_env,
self.settings.lower_method_calls,
);
return typer.typecheck(bound);
}
/// Full compilation: all phases
pub fn compile(
self: *const SigmaCompiler,
env: *const ScriptEnv,
source: []const u8,
ir_ctx: *IRContext,
) !CompilerResult {
const parsed = try self.parse(source);
const typed = try self.typecheck(env, parsed);
return self.compileTyped(env, typed, ir_ctx, source);
}
/// Phases 4-5: Graph building and tree building
fn compileTyped(
self: *const SigmaCompiler,
env: *const ScriptEnv,
typed: *const TypedExpr,
ir_ctx: *IRContext,
source: []const u8,
) !CompilerResult {
// Create placeholder constants for type parameters
var placeholders_env = env.clone();
var idx: u32 = 0;
var iter = env.typeParams();
while (iter.next()) |entry| {
const placeholder = ConstantPlaceholder{
.index = idx,
.tpe = entry.value,
};
try placeholders_env.put(entry.key, .{ .placeholder = placeholder });
idx += 1;
}
// Phase 4: Build graph (CSE)
var graph_builder = GraphBuilder.init(ir_ctx, &placeholders_env);
const compiled_graph = try graph_builder.buildGraph(typed);
// Phase 5: Build tree (ValDef minimization)
var tree_builder = TreeBuilder.init(ir_ctx, self.allocator);
const compiled_tree = try tree_builder.buildTree(compiled_graph);
return CompilerResult{
.env = env,
.source = source,
.compiled_graph = compiled_graph,
.ergo_tree = compiled_tree,
};
}
};
/// Result of compilation
const CompilerResult = struct {
env: *const ScriptEnv,
source: []const u8,
compiled_graph: Sym,
ergo_tree: *Expr,
};
Rust Compiler Pipeline
The Rust implementation uses a direct pipeline without graph IR[6][7]:
/// Rust-style direct compilation pipeline
const RustCompiler = struct {
allocator: Allocator,
/// Compile source to ErgoTree expression
pub fn compileExpr(
self: *const RustCompiler,
source: []const u8,
env: ScriptEnv,
) !*MirExpr {
// Parse to CST, then lower to HIR
const hir = try self.compileHir(source);
// Bind names in HIR
var binder = Binder.init(env);
const bound = try binder.bind(hir);
// Assign types
const typed = try assignType(bound);
// Lower to MIR (ErgoTree IR)
const mir = try lowerToMir(typed);
// Type check MIR
return try typeCheck(mir);
}
/// Compile to full ErgoTree
pub fn compile(
self: *const RustCompiler,
source: []const u8,
env: ScriptEnv,
) !ErgoTree {
const expr = try self.compileExpr(source, env);
return ErgoTree.fromExpr(expr);
}
fn compileHir(self: *const RustCompiler, source: []const u8) !*HirExpr {
var parser = Parser.init(source);
const parse_result = parser.parse();
if (parse_result.errors.len > 0) {
return error.ParseError;
}
const syntax = parse_result.syntax();
const root = AstRoot.cast(syntax) orelse return error.InvalidRoot;
return hirLower(root);
}
};
Method Call Lowering
Lowering transforms generic MethodCall to compact direct nodes[8][9]:
Method Call Lowering
─────────────────────────────────────────────────────
Before lowering (MethodCall - 3+ bytes):
MethodCall(xs, CollMethods.MapMethod, [f], {})
After lowering (MapCollection - 1 byte):
MapCollection(xs, f)
Size savings: 2+ bytes per operation
/// Method call lowering during typing
const MethodCallLowerer = struct {
builder: *const SigmaBuilder,
lower_enabled: bool,
/// Try to lower MethodCall to direct node
pub fn tryLower(
self: *const MethodCallLowerer,
obj: *const Expr,
method: *const SMethod,
args: []const *const Expr,
subst: TypeSubst,
) ?*Expr {
if (!self.lower_enabled) return null;
// Check if method has IR builder
const ir_builder = method.ir_info.ir_builder orelse return null;
// Try to apply the builder
return ir_builder.build(self.builder, obj, method, args, subst);
}
/// Unlower: convert direct nodes back to MethodCall (for display)
pub fn unlower(self: *const MethodCallLowerer, expr: *const Expr) *Expr {
return switch (expr.kind) {
.multiply_group => |mg| self.builder.makeMethodCall(
mg.left,
&SGroupElementMethods.multiply_method,
&[_]*const Expr{mg.right},
),
.exponentiate => |exp| self.builder.makeMethodCall(
exp.base,
&SGroupElementMethods.exponentiate_method,
&[_]*const Expr{exp.exponent},
),
.map_collection => |mc| self.builder.makeMethodCall(
mc.input,
&SCollectionMethods.map_method.withConcreteTypes(.{
.tIV = mc.input.tpe.elemType(),
.tOV = mc.mapper.tpe.resultType(),
}),
&[_]*const Expr{mc.mapper},
),
.fold => |f| self.builder.makeMethodCall(
f.input,
&SCollectionMethods.fold_method.withConcreteTypes(.{
.tIV = f.input.tpe.elemType(),
.tOV = f.zero.tpe,
}),
&[_]*const Expr{ f.zero, f.folder },
),
.for_all => |fa| self.builder.makeMethodCall(
fa.input,
&SCollectionMethods.forall_method.withConcreteTypes(.{
.tIV = fa.input.tpe.elemType(),
}),
&[_]*const Expr{fa.predicate},
),
.exists => |ex| self.builder.makeMethodCall(
ex.input,
&SCollectionMethods.exists_method.withConcreteTypes(.{
.tIV = ex.input.tpe.elemType(),
}),
&[_]*const Expr{ex.predicate},
),
else => expr,
};
}
};
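The lowering direction can be summarized as a partial mapping from method names to direct opcodes; methods without a dedicated node stay as generic MethodCall. A hedged Rust sketch of that decision (the method and opcode names are illustrative; the real implementations consult per-method IR builders, not a string table):

```rust
/// Illustrative direct-node opcodes for lowerable collection methods.
#[derive(Debug, PartialEq)]
enum DirectOp {
    MapCollection,
    Fold,
    ForAll,
    Exists,
}

/// Partial lowering table: `None` means the call stays a MethodCall.
fn lower_coll_method(method_name: &str) -> Option<DirectOp> {
    match method_name {
        "map" => Some(DirectOp::MapCollection),
        "fold" => Some(DirectOp::Fold),
        "forall" => Some(DirectOp::ForAll),
        "exists" => Some(DirectOp::Exists),
        // No dedicated node: keep the generic (larger) MethodCall encoding
        _ => None,
    }
}
```

The `unlower` direction shown above is the inverse mapping, used when a compact node must be displayed or re-typed as its originating method call.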
Type Inference
Type assignment propagates and unifies types[10][11]:
const Typer = struct {
builder: *const SigmaBuilder,
predef_registry: *const PredefinedFuncRegistry,
type_env: std.StringHashMap(SType),
lower_method_calls: bool,
/// Assign types to bound expression
pub fn typecheck(self: *Typer, bound: *const Expr) !*TypedExpr {
return self.assignType(self.type_env, bound);
}
fn assignType(self: *Typer, env: std.StringHashMap(SType), expr: *const Expr) !*TypedExpr {
return switch (expr.kind) {
.block => |b| self.typecheckBlock(env, b),
.tuple => |t| self.typecheckTuple(env, t),
.ident => |id| self.typecheckIdent(env, id),
.select => |s| self.typecheckSelect(env, s),
.apply => |a| self.typecheckApply(env, a),
.lambda => |l| self.typecheckLambda(env, l),
.if_expr => |i| self.typecheckIf(env, i),
.constant => |c| self.makeTyped(c, c.tpe),
else => error.UnsupportedExpr,
};
}
fn typecheckBlock(self: *Typer, env: std.StringHashMap(SType), block: *const Block) !*TypedExpr {
var cur_env = try env.clone();
for (block.items) |val_def| {
if (cur_env.contains(val_def.name)) {
return error.DuplicateVariable;
}
const rhs_typed = try self.assignType(cur_env, val_def.rhs);
try cur_env.put(val_def.name, rhs_typed.tpe);
}
const result_typed = try self.assignType(cur_env, block.result);
return self.builder.makeBlock(block.items, result_typed);
}
fn typecheckSelect(self: *Typer, env: std.StringHashMap(SType), sel: *const Select) !*TypedExpr {
const obj_typed = try self.assignType(env, sel.obj);
const method = MethodsContainer.getMethod(obj_typed.tpe, sel.field) orelse
return error.MethodNotFound;
// Unify method receiver type with object type
const subst = unifyTypes(method.stype.domain[0], obj_typed.tpe) orelse
return error.TypeMismatch;
const result_type = applySubst(method.stype.range, subst);
// Try to lower if it's a property access (no args)
if (self.lower_method_calls) {
if (method.ir_info.ir_builder) |ir_builder| {
if (ir_builder.buildProperty(self.builder, obj_typed, method)) |lowered| {
return lowered;
}
}
}
return self.builder.makeSelect(obj_typed, sel.field, result_type);
}
};
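typecheckSelect above relies on unifyTypes and applySubst, which the listing leaves undefined. A minimal Rust sketch of first-order unification over a toy type language can fill that gap; this simplification binds type variables to anything and requires concrete constructors to match, and it omits the occurs check and binding-consistency checks a real unifier needs (the SType machinery in the actual implementations is richer):

```rust
use std::collections::HashMap;

/// Simplified type language: a variable or a concrete constructor with args.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(String),
    Con(String, Vec<Ty>),
}

/// Unify `expected` against `actual`, extending `subst`; None on mismatch.
fn unify(expected: &Ty, actual: &Ty, subst: &mut HashMap<String, Ty>) -> Option<()> {
    match (expected, actual) {
        (Ty::Var(v), t) => {
            // Naive binding: no occurs check, later bindings overwrite
            subst.insert(v.clone(), t.clone());
            Some(())
        }
        (Ty::Con(a, xs), Ty::Con(b, ys)) if a == b && xs.len() == ys.len() => {
            for (x, y) in xs.iter().zip(ys) {
                unify(x, y, subst)?;
            }
            Some(())
        }
        _ => None,
    }
}

/// Replace variables in `t` by their bindings in `subst`.
fn apply_subst(t: &Ty, subst: &HashMap<String, Ty>) -> Ty {
    match t {
        Ty::Var(v) => subst.get(v).cloned().unwrap_or_else(|| t.clone()),
        Ty::Con(c, args) => Ty::Con(
            c.clone(),
            args.iter().map(|a| apply_subst(a, subst)).collect(),
        ),
    }
}
```

Unifying a method's declared receiver type `Coll[IV]` against a concrete `Coll[Box]` yields the substitution `IV → Box`, which is then applied to the method's result type, mirroring the flow in typecheckSelect.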
Error Handling
Each phase produces specific errors[12]:
const CompileError = union(enum) {
parse_error: ParseError,
hir_lowering_error: HirLoweringError,
binder_error: BinderError,
type_error: TypeInferenceError,
mir_lowering_error: MirLoweringError,
type_check_error: TypeCheckError,
ergo_tree_error: ErgoTreeError,
/// Render a human-readable description (uses a module-level allocator for formatting)
pub fn prettyDesc(self: CompileError, source: []const u8) []const u8 {
return switch (self) {
.parse_error => |e| e.prettyDesc(source),
.hir_lowering_error => |e| e.prettyDesc(source),
.binder_error => |e| e.prettyDesc(source),
.type_error => |e| e.prettyDesc(source),
.mir_lowering_error => |e| e.prettyDesc(source),
.type_check_error => |e| e.prettyDesc(),
.ergo_tree_error => |e| std.fmt.allocPrint(
allocator,
"ErgoTree error: {any}",
.{e},
) catch "format error",
};
}
};
/// Parse error with source location (formatting uses the module-level allocator)
const ParseError = struct {
message: []const u8,
span: TextRange,
expected: []const TokenKind,
found: ?TokenKind,
pub fn prettyDesc(self: ParseError, source: []const u8) []const u8 {
const line_info = getLineInfo(source, self.span.start);
return std.fmt.allocPrint(allocator,
"error: {s}\nline: {d}\n{s}\n{s}",
.{
self.message,
line_info.line_num,
line_info.line_text,
makeUnderline(line_info, self.span),
},
) catch "format error";
}
};
Predefined Functions Registry
Built-in functions are registered for name resolution[13]:
const PredefinedFuncRegistry = struct {
funcs: std.StringHashMap(PredefinedFunc),
builder: *const SigmaBuilder,
pub fn init(builder: *const SigmaBuilder) PredefinedFuncRegistry {
var self = PredefinedFuncRegistry{
.funcs = std.StringHashMap(PredefinedFunc).init(allocator), // module-level allocator
.builder = builder,
};
self.registerAll();
return self;
}
fn registerAll(self: *PredefinedFuncRegistry) void {
// Boolean operations
self.register("allOf", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
.ir_builder = AllOfIrBuilder,
});
self.register("anyOf", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
.ir_builder = AnyOfIrBuilder,
});
// Sigma operations
self.register("sigmaProp", .{
.tpe = SFunc.init(&[_]SType{.boolean}, .sigma_prop),
.ir_builder = SigmaPropIrBuilder,
});
self.register("atLeast", .{
.tpe = SFunc.init(&[_]SType{ .int, SType.collOf(.sigma_prop) }, .sigma_prop),
.ir_builder = AtLeastIrBuilder,
});
self.register("allZK", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
.ir_builder = AllZKIrBuilder,
});
self.register("anyZK", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
.ir_builder = AnyZKIrBuilder,
});
// Cryptographic
self.register("proveDlog", .{
.tpe = SFunc.init(&[_]SType{.group_element}, .sigma_prop),
.ir_builder = ProveDlogIrBuilder,
});
self.register("proveDHTuple", .{
.tpe = SFunc.init(&[_]SType{
.group_element,
.group_element,
.group_element,
.group_element,
}, .sigma_prop),
.ir_builder = ProveDHTupleIrBuilder,
});
// Hash functions
self.register("blake2b256", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
.ir_builder = Blake2b256IrBuilder,
});
self.register("sha256", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
.ir_builder = Sha256IrBuilder,
});
// Global
self.register("groupGenerator", .{
.tpe = SFunc.init(&[_]SType{}, .group_element),
.ir_builder = GroupGeneratorIrBuilder,
});
}
fn register(self: *PredefinedFuncRegistry, name: []const u8, func: PredefinedFunc) void {
self.funcs.put(name, func) catch unreachable;
}
};
Compilation Example
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = gpa.allocator();
// Setup compiler
const settings = CompilerSettings.testnet();
const compiler = SigmaCompiler.init(settings, allocator);
var ir_ctx = IRContext.init(allocator);
// Source code
const source =
\\{
\\ val deadline = 100000
\\ val pk = PK("9fRusAarL1KkrWQVsxSRVYnvWxaAT2A96cKtNn9tvPh5XUCTgGi")
\\ sigmaProp(HEIGHT > deadline) && pk
\\}
;
// Compile
const env = ScriptEnv.empty();
const result = try compiler.compile(&env, source, &ir_ctx);
// Access results
std.debug.print("Source: {s}\n", .{result.source});
std.debug.print("ErgoTree: {any}\n", .{result.ergo_tree});
std.debug.print("Type: {any}\n", .{result.ergo_tree.tpe});
// Serialize
const ergo_tree = try ErgoTree.fromSigmaProp(result.ergo_tree);
const bytes = try ergo_tree.toBytes(allocator);
std.debug.print("Bytes: {x}\n", .{std.fmt.fmtSliceHexLower(bytes)});
}
Compilation Flow Detail
Detailed Phase Transitions
─────────────────────────────────────────────────────
Source: "OUTPUTS.exists({ (b: Box) => b.value > 100L })"
Phase 1 - Parse:
Apply(
Select(Ident("OUTPUTS"), "exists"),
[Lambda(["b": Box], GT(Select(Ident("b"), "value"), 100L))]
)
Phase 2 - Bind:
Apply(
Select(Context.OUTPUTS, ExistsMethod),
[Lambda([b: SBox], GT(Select(ValUse(b), "value"), 100L))]
)
Phase 3 - Typecheck:
Exists(
input: Outputs :: SColl[SBox],
predicate: Lambda(
args: [(0, SBox)],
body: GT(
ExtractAmount(ValUse(0, SBox)) :: SLong,
LongConstant(100) :: SLong
) :: SBoolean
) :: SFunc[SBox, SBoolean]
) :: SBoolean
Phase 4 - BuildGraph (if using Scala IR):
s1 = Context.OUTPUTS
s2 = Lambda(args=[(0,SBox)], body=s3)
s3 = GT(s4, s5)
s4 = ValUse(0).value // ExtractAmount
s5 = 100L
s6 = Exists(s1, s2)
Phase 5 - BuildTree:
Exists(
Outputs,
FuncValue(
[(1, SBox)],
GT(ExtractAmount(ValUse(1, SBox)), LongConstant(100))
)
)
Summary
- 5-phase pipeline: Parse → Bind → Typecheck → BuildGraph → BuildTree
- Method lowering transforms MethodCall (3+ bytes) to direct nodes (1 byte)
- Scala uses graph IR for CSE optimization; Rust uses direct HIR→MIR
- Type inference propagates and unifies types through the AST
- Predefined registry resolves built-in function names
- Error handling provides detailed source-location diagnostics
- Compiler is development-time only—interpreter uses serialized ErgoTree
Next: Chapter 20: Collections
1. Scala: SigmaCompiler.scala:51-100 (SigmaCompiler class)
2. Rust: lib.rs:16-27 (module structure)
3. Scala: SigmaCompiler.scala:15-25 (CompilerSettings)
4. Scala: SigmaCompiler.scala:55-95 (compile methods)
5. Rust: compiler.rs:59-76 (compile_expr)
6. Rust: compiler.rs:73-76 (compile)
7. Rust: compiler.rs:78-87 (compile_hir)
8. Scala: SigmaCompiler.scala:105-150 (unlowerMethodCalls)
9. Scala: SigmaTyper.scala:30-45 (processGlobalMethod)
10. Scala: SigmaTyper.scala:50-100 (assignType)
11. Rust: type_infer.rs:25-49 (assign_type)
12. Rust: compiler.rs:23-55 (CompileError)
13. Scala: SigmaPredef.scala (PredefinedFuncRegistry)
Chapter 20: Collections
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 2 for Coll[T] type and type parameters
- Chapter 5 for collection operation opcodes
- Chapter 12 for how collection operations are evaluated
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the Coll[T] interface and its core operations (map, filter, fold, etc.)
- Implement array-backed collections with bounds checking
- Describe the Structure-of-Arrays optimization for pair collections
- Use CollBuilder for creating and manipulating collections
- Understand cost implications of collection operations
Collection Architecture
Collections in ErgoScript are immutable, indexed sequences[1][2]:
Collection Architecture
─────────────────────────────────────────────────────
Coll[T]
│
┌─────────────┴─────────────┐
│ │
CollOverArray[T] PairColl[L,R]
(standard array) (structure-of-arrays)
│ │
│ ┌──────┴──────┐
Array[T] Coll[L] Coll[R]
(left) (right)
Coll[T] Interface
Core collection interface with specialized operations[3][4]:
/// Immutable indexed collection
const Coll = struct {
data: CollData,
elem_type: SType,
builder: *CollBuilder,
const CollData = union(enum) {
/// Standard array-backed collection
array: ArrayColl,
/// Optimized pair collection
pair: PairCollData,
};
/// Number of elements
pub fn length(self: *const Coll) usize {
return switch (self.data) {
.array => |a| a.items.len,
.pair => |p| @min(p.ls.length(), p.rs.length()),
};
}
pub fn size(self: *const Coll) usize {
return self.length();
}
pub fn isEmpty(self: *const Coll) bool {
return self.length() == 0;
}
/// Element at index (0-based)
pub fn get(self: *const Coll, i: usize) ?Value {
if (i >= self.length()) return null;
return switch (self.data) {
.array => |a| a.items[i],
.pair => |p| Value.tuple(.{ p.ls.get(i).?, p.rs.get(i).? }),
};
}
/// Element at index with default
pub fn getOrElse(self: *const Coll, i: usize, default: Value) Value {
return self.get(i) orelse default;
}
/// Element access (throws on out of bounds)
pub fn apply(self: *const Coll, i: usize) !Value {
return self.get(i) orelse error.IndexOutOfBounds;
}
};
Transformation Operations
Map, filter, and fold with cost tracking[5][6]:
/// Collection transformation operations
const CollTransforms = struct {
/// Apply function to each element
pub fn map(
coll: *const Coll,
mapper: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
try E.addSeqCost(MapCost, n, OpCode.Map);
const result = try E.allocator.alloc(Value, n);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Map);
var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
result[i] = try mapper.body.eval(&fn_env, E);
}
return coll.builder.fromArray(result, mapper.result_type);
}
/// Select elements satisfying predicate
pub fn filter(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
try E.addSeqCost(FilterCost, n, OpCode.Filter);
var result = std.ArrayList(Value).init(E.allocator);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Filter);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const keep = try E.evalTo(bool, &fn_env, predicate.body);
if (keep) {
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Left-associative fold
pub fn foldLeft(
coll: *const Coll,
zero: Value,
folder: *const Closure,
E: *Evaluator,
) !Value {
const n = coll.length();
try E.addSeqCost(FoldCost, n, OpCode.Fold);
var accum = zero;
for (0..n) |i| {
const elem = coll.get(i).?;
const tuple = Value.tuple(.{ accum, elem });
try E.addCost(AddToEnvCost, OpCode.Fold);
var fn_env = try folder.captured_env.extend(folder.args[0].id, tuple);
accum = try folder.body.eval(&fn_env, E);
}
return accum;
}
/// Flatten nested collections
pub fn flatMap(
coll: *const Coll,
mapper: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
var result = std.ArrayList(Value).init(E.allocator);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.FlatMap);
var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
const inner_coll = try E.evalTo(*Coll, &fn_env, mapper.body);
for (0..inner_coll.length()) |j| {
try result.append(inner_coll.get(j).?);
}
}
return coll.builder.fromArray(result.items, mapper.result_type);
}
};
const MapCost = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
const FilterCost = PerItemCost{
.base = JitCost{ .value = 20 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
const FoldCost = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
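The PerItemCost constants above imply a chunked formula: the charge for n items is the base cost plus the per-chunk cost for each started chunk. A small Rust sketch of that accounting (the values mirror the constants above; the formula shape follows the sigmastate PerItemCost design, but verify the exact rounding against the source):

```rust
/// Per-item cost: base + per_chunk * ceil(n_items / chunk_size).
struct PerItemCost {
    base: u64,
    per_chunk: u64,
    chunk_size: u64,
}

impl PerItemCost {
    fn cost(&self, n_items: u64) -> u64 {
        // Number of started chunks: ceil(n_items / chunk_size)
        let chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
        self.base + self.per_chunk * chunks
    }
}
```

With the MapCost values (base 10, per_chunk 5, chunk_size 10), 10 items cost the same as 1 item, and the charge steps up only when an 11th item starts a new chunk.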
Predicate Operations
Exists and forall with short-circuit evaluation[7]. Note: short-circuiting means execution time varies with collection contents. This is acceptable in blockchain contexts where data is public, but it would be a timing side channel if collections contained secrets.
/// Predicate operations (short-circuit)
const CollPredicates = struct {
/// True if any element satisfies predicate
pub fn exists(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !bool {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Exists);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
// Short-circuit: found matching element
try E.addSeqCost(ExistsCost, i + 1, OpCode.Exists);
return true;
}
}
try E.addSeqCost(ExistsCost, n, OpCode.Exists);
return false;
}
/// True if all elements satisfy predicate
pub fn forall(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !bool {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.ForAll);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (!result) {
// Short-circuit: found non-matching element
try E.addSeqCost(ForAllCost, i + 1, OpCode.ForAll);
return false;
}
}
try E.addSeqCost(ForAllCost, n, OpCode.ForAll);
return true;
}
/// Find first element satisfying predicate
pub fn find(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !?Value {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
return elem;
}
}
return null;
}
/// Index of first element satisfying predicate
pub fn indexWhere(
coll: *const Coll,
predicate: *const Closure,
from: usize,
E: *Evaluator,
) !i32 {
const n = coll.length();
const start = from; // `from` is unsigned; no lower clamp needed
for (start..n) |i| {
const elem = coll.get(i).?;
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
return @intCast(i);
}
}
return -1; // Not found
}
};
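The short-circuit cost accounting in exists can be checked with a small sketch: when the scan stops at index i, only i + 1 iterations are charged, so the charge depends on where the first match sits. A hedged Rust sketch, modeling cost as a plain visited-item counter rather than the real JitCost machinery:

```rust
/// exists with short-circuit: returns (result, items_charged).
fn exists_with_cost(items: &[i64], pred: impl Fn(i64) -> bool) -> (bool, usize) {
    for (i, &x) in items.iter().enumerate() {
        if pred(x) {
            // Short-circuit: charged only for the elements actually visited
            return (true, i + 1);
        }
    }
    // No match: the full scan is charged
    (false, items.len())
}
```

A match at the second element charges 2 items regardless of collection length; forall is symmetric, stopping at the first non-matching element.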
Slicing Operations
Slice, take, and append[8]:
/// Slicing operations
const CollSlicing = struct {
/// First n elements
pub fn take(coll: *const Coll, n: usize, E: *Evaluator) !*Coll {
if (n == 0) return coll.builder.emptyColl(coll.elem_type);
if (n >= coll.length()) return coll;
try E.addSeqCost(SliceCost, n, OpCode.Slice);
return coll.builder.fromSlice(coll, 0, n);
}
/// Elements from index `from` until `until`
pub fn slice(
coll: *const Coll,
from: usize,
until: usize,
E: *Evaluator,
) !*Coll {
const actual_from = @min(from, coll.length());
const actual_until = @min(until, coll.length());
const len = if (actual_until > actual_from) actual_until - actual_from else 0;
try E.addSeqCost(SliceCost, len, OpCode.Slice);
return coll.builder.fromSlice(coll, actual_from, actual_until);
}
/// Concatenate collections
pub fn append(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
if (coll.length() == 0) return other;
if (other.length() == 0) return coll;
const total = coll.length() + other.length();
try E.addSeqCost(AppendCost, total, OpCode.Append);
const result = try E.allocator.alloc(Value, total);
for (0..coll.length()) |i| {
result[i] = coll.get(i).?;
}
for (0..other.length()) |i| {
result[coll.length() + i] = other.get(i).?;
}
return coll.builder.fromArray(result, coll.elem_type);
}
/// Replace slice with patch
pub fn patch(
coll: *const Coll,
from: usize,
replacement: *const Coll,
replaced: usize,
E: *Evaluator,
) !*Coll {
const before = try CollSlicing.slice(coll, 0, from, E);
const after = try CollSlicing.slice(coll, from + replaced, coll.length(), E);
const temp = try CollSlicing.append(before, replacement, E);
return CollSlicing.append(temp, after, E);
}
/// Replace single element
pub fn updated(
coll: *const Coll,
index: usize,
elem: Value,
E: *Evaluator,
) !*Coll {
if (index >= coll.length()) return error.IndexOutOfBounds;
const result = try E.allocator.alloc(Value, coll.length());
for (0..coll.length()) |i| {
result[i] = if (i == index) elem else coll.get(i).?;
}
return coll.builder.fromArray(result, coll.elem_type);
}
};
Structure-of-Arrays: PairColl
Optimized representation for collections of pairs[9][10]:
Structure-of-Arrays vs Array-of-Structures
─────────────────────────────────────────────────────
Array-of-Structures (standard):
┌────────────────────────────────────────────────────┐
│ [(L0,R0), (L1,R1), (L2,R2), (L3,R3), (L4,R4)] │
│ │
│ Memory: L0 R0 L1 R1 L2 R2 L3 R3 L4 R4 │
│ Issue: Cache unfriendly when accessing only Ls │
└────────────────────────────────────────────────────┘
Structure-of-Arrays (PairColl):
┌────────────────────────────────────────────────────┐
│ ls: [L0, L1, L2, L3, L4] │
│ rs: [R0, R1, R2, R3, R4] │
│ │
│ Memory: L0 L1 L2 L3 L4 | R0 R1 R2 R3 R4 │
│ Benefit: Cache friendly, O(1) unzip │
└────────────────────────────────────────────────────┘
/// Optimized pair collection (Structure-of-Arrays)
const PairColl = struct {
ls: *Coll, // Left components
rs: *Coll, // Right components
builder: *CollBuilder,
pub fn length(self: *const PairColl) usize {
return @min(self.ls.length(), self.rs.length());
}
/// Element at index returns tuple
pub fn get(self: *const PairColl, i: usize) ?Value {
const l = self.ls.get(i) orelse return null;
const r = self.rs.get(i) orelse return null;
return Value.tuple(.{ l, r });
}
/// O(1) unzip - just return components
pub fn unzip(self: *const PairColl) struct { *Coll, *Coll } {
return .{ self.ls, self.rs };
}
/// Map only left components
pub fn mapFirst(
self: *const PairColl,
mapper: *const Closure,
E: *Evaluator,
) !*PairColl {
const mapped_ls = try CollTransforms.map(self.ls, mapper, E);
return self.builder.pairColl(mapped_ls, self.rs);
}
/// Map only right components
pub fn mapSecond(
self: *const PairColl,
mapper: *const Closure,
E: *Evaluator,
) !*PairColl {
const mapped_rs = try CollTransforms.map(self.rs, mapper, E);
return self.builder.pairColl(self.ls, mapped_rs);
}
/// Slice maintains structure-of-arrays
pub fn slice(
self: *const PairColl,
from: usize,
until: usize,
E: *Evaluator,
) !*PairColl {
const sliced_ls = try CollSlicing.slice(self.ls, from, until, E);
const sliced_rs = try CollSlicing.slice(self.rs, from, until, E);
return self.builder.pairColl(sliced_ls, sliced_rs);
}
/// Append maintains structure
pub fn append(
self: *const PairColl,
other: *const PairColl,
E: *Evaluator,
) !*PairColl {
const combined_ls = try CollSlicing.append(self.ls, other.ls, E);
const combined_rs = try CollSlicing.append(self.rs, other.rs, E);
return self.builder.pairColl(combined_ls, combined_rs);
}
};
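The O(1) unzip claim follows directly from the representation: the components are already stored as two separate arrays, so unzip just hands them back. A minimal Rust sketch of the structure-of-arrays idea (a simplification of PairColl; the real collections are polymorphic over element types, not fixed to integers):

```rust
/// Structure-of-arrays pair collection: components stored separately.
struct PairColl {
    ls: Vec<i64>,
    rs: Vec<i64>,
}

impl PairColl {
    /// Length is the shorter of the two component arrays.
    fn len(&self) -> usize {
        self.ls.len().min(self.rs.len())
    }

    /// Zipped view at index i: the tuple is materialized on demand.
    fn get(&self, i: usize) -> Option<(i64, i64)> {
        if i >= self.len() {
            return None;
        }
        Some((self.ls[i], self.rs[i]))
    }

    /// O(1) unzip: no copying, just return the component arrays.
    fn unzip(self) -> (Vec<i64>, Vec<i64>) {
        (self.ls, self.rs)
    }
}
```

Contrast with the array-of-structures layout, where unzip must walk every pair and copy its halves into two fresh arrays, an O(n) materialization.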
CollBuilder
Factory for creating collections[11][12]:
/// Factory for creating collections
const CollBuilder = struct {
allocator: Allocator,
/// Create pair collection from two collections
pub fn pairColl(
self: *CollBuilder,
ls: *Coll,
rs: *Coll,
) *PairColl {
// Handle length mismatch by using minimum
const result = self.allocator.create(PairColl) catch unreachable;
result.* = .{
.ls = ls,
.rs = rs,
.builder = self,
};
return result;
}
/// Create collection from array
pub fn fromArray(
self: *CollBuilder,
items: []const Value,
elem_type: SType,
) *Coll {
// Enforce size limit
if (items.len > MAX_ARRAY_LENGTH) {
@panic("Collection size exceeds maximum");
}
const result = self.allocator.create(Coll) catch unreachable;
// Special handling for pairs → PairColl
if (elem_type == .s_tuple and elem_type.s_tuple.items.len == 2) {
const ls = self.allocator.alloc(Value, items.len) catch unreachable;
const rs = self.allocator.alloc(Value, items.len) catch unreachable;
for (items, 0..) |item, i| {
ls[i] = item.tuple[0];
rs[i] = item.tuple[1];
}
result.* = .{
.data = .{ .pair = .{
.ls = self.fromArray(ls, elem_type.s_tuple.items[0]),
.rs = self.fromArray(rs, elem_type.s_tuple.items[1]),
} },
.elem_type = elem_type,
.builder = self,
};
} else {
result.* = .{
.data = .{ .array = .{ .items = items } },
.elem_type = elem_type,
.builder = self,
};
}
return result;
}
/// Create collection of n copies of value
pub fn replicate(
self: *CollBuilder,
n: usize,
value: Value,
elem_type: SType,
) *Coll {
var items = self.allocator.alloc(Value, n) catch unreachable;
for (items) |*item| {
item.* = value;
}
return self.fromArray(items, elem_type);
}
/// Create empty collection
pub fn emptyColl(self: *CollBuilder, elem_type: SType) *Coll {
return self.fromArray(&[_]Value{}, elem_type);
}
/// Split pair collection into two collections
pub fn unzip(self: *CollBuilder, coll: *const Coll) struct { *Coll, *Coll } {
switch (coll.data) {
.pair => |p| {
// O(1) for PairColl
return .{ p.ls, p.rs };
},
.array => |a| {
// O(n) for regular collection - must materialize
const n = a.items.len;
var ls = self.allocator.alloc(Value, n) catch unreachable;
var rs = self.allocator.alloc(Value, n) catch unreachable;
for (a.items, 0..) |item, i| {
ls[i] = item.tuple[0];
rs[i] = item.tuple[1];
}
const elem_type = coll.elem_type.s_tuple;
return .{
self.fromArray(ls, elem_type.items[0]),
self.fromArray(rs, elem_type.items[1]),
};
},
}
}
/// Element-wise XOR of byte arrays
pub fn xor(self: *CollBuilder, left: *const Coll, right: *const Coll) *Coll {
const n = @min(left.length(), right.length());
var result = self.allocator.alloc(Value, n) catch unreachable;
for (0..n) |i| {
const l = left.get(i).?.byte;
const r = right.get(i).?.byte;
result[i] = Value{ .byte = l ^ r };
}
return self.fromArray(result, .byte);
}
};
/// Maximum collection size (DoS protection)
const MAX_ARRAY_LENGTH: usize = 100_000;
Rust Collection Representation
The Rust implementation uses a different approach[13][14]:
/// Rust-style Collection enum
const RustCollection = union(enum) {
/// Special representation for boolean constants (bit-packed)
bool_constants: []const bool,
/// Collection of expressions
exprs: struct {
elem_type: SType,
items: []const *Expr,
},
pub fn tpe(self: RustCollection) SType {
return switch (self) {
.bool_constants => SType.collOf(.boolean),
.exprs => |e| SType.collOf(e.elem_type),
};
}
pub fn opCode(self: RustCollection) OpCode {
return switch (self) {
.bool_constants => OpCode.CollOfBoolConst,
.exprs => OpCode.Coll,
};
}
};
/// Rust collection serialization
fn serializeCollection(coll: RustCollection, writer: anytype) !void {
switch (coll) {
.bool_constants => |bools| {
try writer.writeInt(u16, @intCast(bools.len), .big);
try writeBits(writer, bools); // Bit-packed
},
.exprs => |e| {
try writer.writeInt(u16, @intCast(e.items.len), .big);
try serializeSType(e.elem_type, writer);
for (e.items) |item| {
try serializeExpr(item, writer);
}
},
}
}
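The serializer above calls a writeBits helper that is not shown. One plausible packing scheme (least-significant bit first within each byte) looks like this in Rust; the actual bit order used by sigma-rust should be verified before relying on it.

```rust
/// Pack a slice of bools into bytes, LSB-first within each byte.
/// Illustrative only: confirm the bit order against the sigma-rust
/// collection serializer before depending on this layout.
pub fn write_bits(bools: &[bool]) -> Vec<u8> {
    // One byte holds 8 flags; round the byte count up.
    let mut out = vec![0u8; (bools.len() + 7) / 8];
    for (i, &b) in bools.iter().enumerate() {
        if b {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}
```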
Cost Model
Collection Operation Costs
─────────────────────────────────────────────────────
Operation │ Cost Type │ Formula
────────────────┼──────────────┼──────────────────────
length │ Fixed │ 10
apply(i) │ Fixed │ 10
get(i) │ Fixed │ 10
map(f) │ PerItem │ 10 + ⌈n/10⌉ × 5
filter(p) │ PerItem │ 20 + ⌈n/10⌉ × 5
fold(z, op) │ PerItem │ 10 + ⌈n/10⌉ × 5
exists(p) │ PerItem │ 10 + ⌈k/10⌉ × 5 (k=items checked)
forall(p) │ PerItem │ 10 + ⌈k/10⌉ × 5 (k=items checked)
slice(from,to) │ PerItem │ 10 + ⌈len/10⌉ × 2
append(other) │ PerItem │ 20 + ⌈(n+m)/10⌉ × 2
zip(other) │ Fixed │ 10 (structural)
unzip │ Fixed │ 10 (PairColl), PerItem (array)
flatMap(f) │ Dynamic │ depends on result sizes
─────────────────────────────────────────────────────
Where: n = collection size, k = items processed before short-circuit
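The PerItem formulas in the table all share one shape: a base charge plus a per-chunk charge over ⌈n/chunk⌉ chunks. A minimal Rust sketch of that calculation (the helper name is illustrative, not the interpreter's API):

```rust
/// Per-item cost: base + ceil(n / chunk) * per_chunk, mirroring the
/// PerItem rows in the table above.
pub fn per_item_cost(base: u64, per_chunk: u64, chunk: u64, n: u64) -> u64 {
    // Integer ceiling division: (n + chunk - 1) / chunk
    base + ((n + chunk - 1) / chunk) * per_chunk
}
```

For example, map over 100 items costs 10 + ⌈100/10⌉ × 5 = 60, and exists that short-circuits after 25 items costs 10 + ⌈25/10⌉ × 5 = 25.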
Set Operations
Distinct, union, intersection[15]:
/// Set-like operations on collections
const CollSetOps = struct {
/// Remove duplicates, preserving first occurrences
pub fn distinct(coll: *const Coll, E: *Evaluator) !*Coll {
var seen = std.AutoHashMap(Value, void).init(E.allocator);
var result = std.ArrayList(Value).init(E.allocator);
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Union preserving order (set semantics)
pub fn unionSet(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
var seen = std.AutoHashMap(Value, void).init(E.allocator);
var result = std.ArrayList(Value).init(E.allocator);
// Add all from first collection
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
// Add unseen from second collection
for (0..other.length()) |i| {
const elem = other.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Multiset intersection
pub fn intersect(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
// Count occurrences in other
var counts = std.AutoHashMap(Value, usize).init(E.allocator);
for (0..other.length()) |i| {
const elem = other.get(i).?;
const entry = try counts.getOrPut(elem);
if (!entry.found_existing) {
entry.value_ptr.* = 0;
}
entry.value_ptr.* += 1;
}
// Collect elements that exist in other
var result = std.ArrayList(Value).init(E.allocator);
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (counts.getPtr(elem)) |count| { // getPtr: mutate the map entry in place
if (count.* > 0) {
try result.append(elem);
count.* -= 1;
}
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
};
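The multiset semantics above can be sketched compactly in Rust. The key detail is that decrementing a count requires a mutable reference into the map, not a copied-out value:

```rust
use std::collections::HashMap;

/// Multiset intersection preserving the first collection's order:
/// each element is kept at most as many times as it occurs in `other`.
pub fn intersect(coll: &[i32], other: &[i32]) -> Vec<i32> {
    // Count occurrences in `other`.
    let mut counts: HashMap<i32, usize> = HashMap::new();
    for &e in other {
        *counts.entry(e).or_insert(0) += 1;
    }
    // Keep elements of `coll` while their budget in `other` lasts.
    coll.iter()
        .copied()
        .filter(|e| match counts.get_mut(e) {
            Some(c) if *c > 0 => {
                *c -= 1;
                true
            }
            _ => false,
        })
        .collect()
}
```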
Summary
- Coll[T] is immutable, indexed, deterministic
- CollOverArray wraps arrays with specialized primitive support
- PairColl uses Structure-of-Arrays for O(1) unzip
- CollBuilder creates collections with automatic pair optimization
- Short-circuit evaluation for exists/forall reduces costs
- Size limit (100K elements) prevents DoS attacks
- All operations have defined costs for gas calculation
Next: Chapter 21: AVL+ Trees
[1] Scala: Colls.scala:12-50 (Coll trait)
[2] Rust: collection.rs:21-32 (Collection enum)
[3] Scala: Colls.scala:50-100 (core operations)
[4] Rust: coll_by_index.rs (ByIndex)
[5] Scala: CollsOverArrays.scala:30-50 (map, filter)
[6] Rust: coll_map.rs:17-62 (Map struct)
[7] Rust: coll_exists.rs, coll_forall.rs
[8] Scala: CollsOverArrays.scala:50-80 (slice, append)
[9] Scala: Colls.scala:150-180 (PairColl trait)
[10] Scala: CollsOverArrays.scala:200-280 (PairOfCols)
[11] Scala: Colls.scala:180-220 (CollBuilder trait)
[12] Scala: CollsOverArrays.scala:300-400 (CollOverArrayBuilder)
[13] Rust: collection.rs:34-56 (Collection::new)
[14] Rust: collection.rs:100-136 (serialization)
[15] Scala: CollsOverArrays.scala:100-150 (set operations)
Chapter 21: AVL+ Trees
Prerequisites
- Chapter 10 for BLAKE2b256 hashing used in node digests
- Chapter 20 for collection operations that AVL trees extend
- Familiarity with binary search tree concepts and balancing
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the prover-verifier architecture for authenticated dictionaries
- Implement the AvlTreeData and ADDigest structures storing 33-byte commitments
- Use operation flags to control insert/update/remove permissions
- Apply proof-based verification for tree operations (contains, get, insert, update, remove)
- Calculate operation costs based on proof length and tree height
Authenticated Dictionary Model
AVL+ trees provide authenticated key-value storage[1][2]:
Prover-Verifier Architecture
─────────────────────────────────────────────────────
OFF-CHAIN (Prover - holds full tree):
┌─────────────────────────────────────────────────────┐
│ BatchAVLProver │
│ ┌─────────────────────────────────────────────────┐│
│ │ Complete Tree Structure ││
│ │ [H] ││
│ │ / \ ││
│ │ [D] [L] ││
│ │ / \ / \ ││
│ │ [B] [F][J] [N] ││
│ └─────────────────────────────────────────────────┘│
│ • Performs operations │
│ • Generates proofs │
│ • Maintains full state │
└─────────────────────────│───────────────────────────┘
│ proof bytes
▼
ON-CHAIN (Verifier - holds only digest):
┌─────────────────────────────────────────────────────┐
│ CAvlTreeVerifier │
│ ┌─────────────────────────────────────────────────┐│
│ │ Digest: [32-byte root hash][height byte] ││
│ │ ═══════════════════════════════ ││
│ │ (33 bytes total) ││
│ └─────────────────────────────────────────────────┘│
│ • Verifies proof bytes │
│ • Returns operation results │
│ • Rejects invalid proofs │
└─────────────────────────────────────────────────────┘
AvlTreeData Structure
Core data type for authenticated trees[3][4]:
/// Authenticated tree data (stored on-chain)
const AvlTreeData = struct {
/// Root hash + height (33 bytes total)
digest: ADDigest,
/// Permitted operations
tree_flags: AvlTreeFlags,
/// Fixed key length (all keys same size)
/// Note: In Ergo, this is always 32 bytes (Blake2b256 hash)
key_length: u32,
/// Optional fixed value length
value_length_opt: ?u32,
pub const DIGEST_SIZE: usize = 33; // 32-byte hash + 1-byte height
pub fn fromDigest(digest: []const u8) AvlTreeData {
return .{
.digest = ADDigest.fromSlice(digest),
.tree_flags = AvlTreeFlags.allOperationsAllowed(),
.key_length = 32, // Ergo: always 32 bytes (Blake2b256 hash)
.value_length_opt = null,
};
}
};
/// 33-byte authenticated digest
const ADDigest = struct {
/// 32-byte BLAKE2b256 root hash
root_hash: [32]u8,
/// Tree height (0-255)
height: u8,
pub fn fromSlice(bytes: []const u8) ADDigest {
var result: ADDigest = undefined;
@memcpy(&result.root_hash, bytes[0..32]);
result.height = bytes[32];
return result;
}
pub fn toBytes(self: ADDigest) [33]u8 {
var result: [33]u8 = undefined;
@memcpy(result[0..32], &self.root_hash);
result[32] = self.height;
return result;
}
};
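The 33-byte layout is simple enough to pin down with a roundtrip sketch in Rust (the struct name is illustrative):

```rust
/// 33-byte AVL digest: 32-byte BLAKE2b256 root hash followed by a
/// one-byte tree height.
pub struct AdDigest {
    pub root_hash: [u8; 32],
    pub height: u8,
}

impl AdDigest {
    pub fn from_bytes(bytes: &[u8; 33]) -> Self {
        let mut root_hash = [0u8; 32];
        root_hash.copy_from_slice(&bytes[..32]);
        Self { root_hash, height: bytes[32] }
    }

    pub fn to_bytes(&self) -> [u8; 33] {
        let mut out = [0u8; 33];
        out[..32].copy_from_slice(&self.root_hash);
        out[32] = self.height;
        out
    }
}
```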
Operation Flags
Control which modifications are permitted[5][6]:
/// Operation permission flags (bit-packed)
const AvlTreeFlags = struct {
flags: u8,
/// Bit positions
const INSERT_BIT: u8 = 0x01;
const UPDATE_BIT: u8 = 0x02;
const REMOVE_BIT: u8 = 0x04;
pub fn new(insert_allowed: bool, update_allowed: bool, remove_allowed: bool) AvlTreeFlags {
var flags: u8 = 0;
if (insert_allowed) flags |= INSERT_BIT;
if (update_allowed) flags |= UPDATE_BIT;
if (remove_allowed) flags |= REMOVE_BIT;
return .{ .flags = flags };
}
pub fn parse(byte: u8) AvlTreeFlags {
return .{ .flags = byte };
}
pub fn serialize(self: AvlTreeFlags) u8 {
return self.flags;
}
// Predefined flag combinations
pub fn readOnly() AvlTreeFlags {
return .{ .flags = 0x00 };
}
pub fn allOperationsAllowed() AvlTreeFlags {
return .{ .flags = 0x07 };
}
pub fn insertOnly() AvlTreeFlags {
return .{ .flags = 0x01 };
}
pub fn removeOnly() AvlTreeFlags {
return .{ .flags = 0x04 };
}
// Permission checks
pub fn insertAllowed(self: AvlTreeFlags) bool {
return (self.flags & INSERT_BIT) != 0;
}
pub fn updateAllowed(self: AvlTreeFlags) bool {
return (self.flags & UPDATE_BIT) != 0;
}
pub fn removeAllowed(self: AvlTreeFlags) bool {
return (self.flags & REMOVE_BIT) != 0;
}
};
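The flag byte encoding can be condensed to a single expression. A Rust sketch reproducing the predefined combinations above (function name is illustrative):

```rust
/// AVL tree permission flags packed into one byte:
/// bit 0 = insert, bit 1 = update, bit 2 = remove.
pub fn pack_flags(insert: bool, update: bool, remove: bool) -> u8 {
    (insert as u8) | ((update as u8) << 1) | ((remove as u8) << 2)
}
```

This reproduces readOnly = 0x00, insertOnly = 0x01, removeOnly = 0x04, and allOperationsAllowed = 0x07.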
AvlTree Interface
ErgoScript interface for authenticated trees[7]:
/// AvlTree wrapper providing ErgoScript interface
const AvlTree = struct {
data: AvlTreeData,
/// Get 33-byte authenticated digest (returned by value; returning a
/// slice into a stack temporary would dangle)
pub fn digest(self: *const AvlTree) [33]u8 {
return self.data.digest.toBytes();
}
/// Get operation flags byte
pub fn enabledOperations(self: *const AvlTree) u8 {
return self.data.tree_flags.serialize();
}
/// Get fixed key length
pub fn keyLength(self: *const AvlTree) i32 {
return @intCast(self.data.key_length);
}
/// Get optional fixed value length
pub fn valueLengthOpt(self: *const AvlTree) ?i32 {
if (self.data.value_length_opt) |v| {
return @intCast(v);
}
return null;
}
/// Permission checks
pub fn isInsertAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.insertAllowed();
}
pub fn isUpdateAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.updateAllowed();
}
pub fn isRemoveAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.removeAllowed();
}
/// Create new tree with updated digest (immutable)
pub fn updateDigest(self: *const AvlTree, new_digest: []const u8) AvlTree {
var new_data = self.data;
new_data.digest = ADDigest.fromSlice(new_digest);
return .{ .data = new_data };
}
/// Create new tree with updated flags (immutable)
pub fn updateOperations(self: *const AvlTree, new_flags: u8) AvlTree {
var new_data = self.data;
new_data.tree_flags = AvlTreeFlags.parse(new_flags);
return .{ .data = new_data };
}
};
Verifier Implementation
The verifier processes proofs to verify operations[8][9]:
/// AVL tree proof verifier
const AvlTreeVerifier = struct {
/// Current state digest (null if verification failed)
current_digest: ?ADDigest,
/// Proof bytes to process
proof: []const u8,
/// Current position in proof
proof_pos: usize,
/// Key length
key_length: usize,
/// Optional value length
value_length_opt: ?usize,
/// Stable storage for the bytes returned by `digest()`
digest_bytes: [33]u8 = undefined,
pub fn init(tree: *const AvlTree, proof: []const u8) AvlTreeVerifier {
return .{
.current_digest = tree.data.digest,
.proof = proof,
.proof_pos = 0,
.key_length = tree.data.key_length,
.value_length_opt = tree.data.value_length_opt,
};
}
/// Get current tree height
pub fn treeHeight(self: *const AvlTreeVerifier) usize {
if (self.current_digest) |d| {
return d.height;
}
return 0;
}
/// Get current digest (null if verification failed).
/// Copies into verifier-owned storage: returning `&d.toBytes()`
/// directly would be a dangling pointer to a stack temporary.
pub fn digest(self: *AvlTreeVerifier) ?[]const u8 {
if (self.current_digest) |d| {
self.digest_bytes = d.toBytes();
return &self.digest_bytes;
}
return null;
}
/// Perform lookup operation
pub fn performLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
// Process proof to verify key existence
const result = try self.verifyLookup(key);
return result;
}
/// Perform insert operation
pub fn performInsert(
self: *AvlTreeVerifier,
key: []const u8,
value: []const u8,
) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
// Process proof to verify insertion
const old_value = try self.verifyInsert(key, value);
// Update digest based on proof
self.updateDigestFromProof();
return old_value;
}
/// Perform update operation
pub fn performUpdate(
self: *AvlTreeVerifier,
key: []const u8,
value: []const u8,
) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
const old_value = try self.verifyUpdate(key, value);
self.updateDigestFromProof();
return old_value;
}
/// Perform remove operation
pub fn performRemove(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
const old_value = try self.verifyRemove(key);
self.updateDigestFromProof();
return old_value;
}
fn verifyLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
// NOTE: Stub - full implementation requires:
// 1. Read node type from proof (leaf vs internal)
// 2. Compare key with node key
// 3. Follow proof path based on comparison result
// 4. Verify all hashes match computed values
// See scorex-util: BatchAVLVerifier for reference.
_ = self;
_ = key;
return error.NotImplemented;
}
// SECURITY: Key comparisons in production must be constant-time to prevent
// timing attacks that could leak key values. Use std.crypto.utils.timingSafeEql.
fn updateDigestFromProof(self: *AvlTreeVerifier) void {
// Extract new digest from proof processing
_ = self;
}
};
Proof-Based Operations
Operations use proofs for verification[10][11]:
/// Evaluate contains operation
fn containsEval(
tree: *const AvlTree,
key: []const u8,
proof: []const u8,
E: *Evaluator,
) !bool {
// Cost: create verifier O(proof.length)
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeContains);
var verifier = AvlTreeVerifier.init(tree, proof);
// Cost: lookup O(tree.height)
const n_items = verifier.treeHeight();
try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeContains);
const result = verifier.performLookup(key) catch return false;
return result != null;
}
/// Evaluate get operation
fn getEval(
tree: *const AvlTree,
key: []const u8,
proof: []const u8,
E: *Evaluator,
) !?[]const u8 {
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeGet);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = verifier.treeHeight();
try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeGet);
return verifier.performLookup(key) catch return error.InvalidProof;
}
/// Evaluate insert operation
fn insertEval(
tree: *const AvlTree,
entries: []const KeyValue,
proof: []const u8,
E: *Evaluator,
) !?AvlTree {
// Check permission
try E.addCost(IsInsertAllowedCost, OpCode.AvlTreeInsert);
if (!tree.isInsertAllowed()) {
return null;
}
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeInsert);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = @max(verifier.treeHeight(), 1);
// Process each entry
for (entries) |entry| {
try E.addSeqCost(InsertCost, n_items, OpCode.AvlTreeInsert);
_ = verifier.performInsert(entry.key, entry.value) catch return null;
}
// Return new tree with updated digest, by value: returning a
// pointer to a stack local would dangle
const new_digest = verifier.digest() orelse return null;
try E.addCost(UpdateDigestCost, OpCode.AvlTreeInsert);
return tree.updateDigest(new_digest);
}
/// Evaluate remove operation
fn removeEval(
tree: *const AvlTree,
keys: []const []const u8,
proof: []const u8,
E: *Evaluator,
) !?AvlTree {
try E.addCost(IsRemoveAllowedCost, OpCode.AvlTreeRemove);
if (!tree.isRemoveAllowed()) {
return null;
}
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeRemove);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = @max(verifier.treeHeight(), 1);
for (keys) |key| {
try E.addSeqCost(RemoveCost, n_items, OpCode.AvlTreeRemove);
_ = verifier.performRemove(key) catch return null;
}
const new_digest = verifier.digest() orelse return null;
try E.addCost(UpdateDigestCost, OpCode.AvlTreeRemove);
// Return by value to avoid a dangling pointer to a temporary
return tree.updateDigest(new_digest);
}
const KeyValue = struct {
key: []const u8,
value: []const u8,
};
Cost Model
AVL tree operations have two-part costs[12]. Since AVL+ trees are balanced, the tree height is O(log n) where n is the number of entries. Proof size is also O(log n) as proofs contain one sibling hash per tree level.
AVL Tree Operation Costs
─────────────────────────────────────────────────────
Phase 1 - Create Verifier (O(proof.length)):
base = 110, per_chunk = 20, chunk_size = 64
Phase 2 - Per Operation (O(tree.height)):
Operation │ Base │ Per Height │ Chunk
──────────────┼──────┼────────────┼───────
Lookup │ 40 │ 10 │ 1
Insert │ 40 │ 10 │ 1
Update │ 120 │ 20 │ 1
Remove │ 100 │ 15 │ 1
──────────────────────────────────────────────────────
Example: Get operation on tree with height 10, proof 128 bytes
Verifier: 110 + ⌈128/64⌉ × 20 = 110 + 2 × 20 = 150
Lookup: 40 + 10 × 10 = 140
Total: 290 JitCost units
const CreateVerifierCost = PerItemCost{
.base = JitCost{ .value = 110 },
.per_chunk = JitCost{ .value = 20 },
.chunk_size = 64,
};
const LookupCost = PerItemCost{
.base = JitCost{ .value = 40 },
.per_chunk = JitCost{ .value = 10 },
.chunk_size = 1,
};
const InsertCost = PerItemCost{
.base = JitCost{ .value = 40 },
.per_chunk = JitCost{ .value = 10 },
.chunk_size = 1,
};
const UpdateCost = PerItemCost{
.base = JitCost{ .value = 120 },
.per_chunk = JitCost{ .value = 20 },
.chunk_size = 1,
};
const RemoveCost = PerItemCost{
.base = JitCost{ .value = 100 },
.per_chunk = JitCost{ .value = 15 },
.chunk_size = 1,
};
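These constants feed the same base-plus-chunks formula used throughout the cost model. A Rust sketch that reproduces the worked example above (290 JitCost units for a get on a height-10 tree with a 128-byte proof); the struct name mirrors the Zig sketch and is illustrative:

```rust
/// Mirror of the PerItemCost constants above.
pub struct PerItemCost {
    pub base: u64,
    pub per_chunk: u64,
    pub chunk_size: u64,
}

impl PerItemCost {
    /// base + ceil(n_items / chunk_size) * per_chunk
    pub fn cost(&self, n_items: u64) -> u64 {
        let chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
        self.base + chunks * self.per_chunk
    }
}
```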
Serialization
AvlTreeData serialization format[13][14]:
/// Serialize AvlTreeData
fn serializeAvlTreeData(data: *const AvlTreeData, writer: anytype) !void {
// Digest (33 bytes)
try writer.writeAll(&data.digest.toBytes());
// Flags (1 byte)
try writer.writeByte(data.tree_flags.serialize());
// Key length (VLQ)
try writeUInt(writer, data.key_length);
// Optional value length
if (data.value_length_opt) |vlen| {
try writer.writeByte(1); // Some
try writeUInt(writer, vlen);
} else {
try writer.writeByte(0); // None
}
}
/// Parse AvlTreeData
fn parseAvlTreeData(reader: anytype) !AvlTreeData {
// Digest (33 bytes)
var digest_bytes: [33]u8 = undefined;
if ((try reader.readAll(&digest_bytes)) != 33) return error.UnexpectedEndOfStream;
const digest = ADDigest.fromSlice(&digest_bytes);
// Flags (1 byte)
const flags = AvlTreeFlags.parse(try reader.readByte());
// Key length (VLQ)
const key_length = try readUInt(reader);
// Optional value length
const has_value_length = (try reader.readByte()) != 0;
const value_length_opt: ?u32 = if (has_value_length)
try readUInt(reader)
else
null;
return AvlTreeData{
.digest = digest,
.tree_flags = flags,
.key_length = key_length,
.value_length_opt = value_length_opt,
};
}
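The writeUInt/readUInt helpers used above are assumed to be the usual 7-bits-per-byte VLQ encoding (low bits first, continuation bit set on all but the last byte). A Rust sketch of the write side; verify the exact variant against the reference serializers before relying on it:

```rust
/// Unsigned VLQ: emit 7 data bits per byte, least-significant group
/// first, with the high bit set on every byte except the last.
pub fn write_uint(mut v: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte); // last byte: continuation bit clear
            break;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}
```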
Off-Chain Proof Generation
Provers generate proofs for operations:
/// Off-chain AVL tree prover (holds full tree)
const AvlProver = struct {
/// Full tree structure
root: ?*AvlNode,
/// Key length
key_length: usize,
/// Value length (optional)
value_length_opt: ?usize,
/// Pending operations for batch proof
pending_ops: std.ArrayList(Operation),
allocator: Allocator,
const Operation = union(enum) {
lookup: []const u8,
insert: struct { key: []const u8, value: []const u8 },
update: struct { key: []const u8, value: []const u8 },
remove: []const u8,
};
/// Perform insert and record for proof
pub fn performInsert(self: *AvlProver, key: []const u8, value: []const u8) !void {
// Actually insert into tree
self.root = try self.insertNode(self.root, key, value);
// Record for proof generation
try self.pending_ops.append(.{ .insert = .{ .key = key, .value = value } });
}
/// Generate proof for all pending operations
pub fn generateProof(self: *AvlProver) ![]const u8 {
var proof_builder = ProofBuilder.init(self.allocator);
for (self.pending_ops.items) |op| {
switch (op) {
.lookup => |key| try proof_builder.addLookupPath(self.root, key),
.insert => |ins| try proof_builder.addInsertPath(self.root, ins.key),
.update => |upd| try proof_builder.addUpdatePath(self.root, upd.key),
.remove => |key| try proof_builder.addRemovePath(self.root, key),
}
}
self.pending_ops.clearRetainingCapacity();
return proof_builder.finish();
}
/// Get current tree digest
pub fn digest(self: *const AvlProver) ADDigest {
if (self.root) |r| {
return computeNodeDigest(r);
}
// Sketch: the reference prover defines a specific empty-tree
// digest; all zeros is a placeholder, not the real value
return ADDigest{ .root_hash = [_]u8{0} ** 32, .height = 0 };
}
fn computeNodeDigest(node: *const AvlNode) ADDigest {
// Stub: compute the BLAKE2b256 hash of the node contents,
// including the left and right child digests
_ = node;
@panic("computeNodeDigest not implemented");
}
};
const AvlNode = struct {
key: []const u8,
value: []const u8,
left: ?*AvlNode,
right: ?*AvlNode,
height: u8,
};
Key Ordering Requirement
Keys must be provided in the same order during proof generation and verification[15]:
CRITICAL: Key Ordering
─────────────────────────────────────────────────────
Proof Generation (off-chain):
prover.performLookup(key_A)
prover.performLookup(key_B)
prover.performLookup(key_C)
proof = prover.generateProof()
Verification (on-chain):
tree.getMany([key_A, key_B, key_C], proof) ✓ Works
tree.getMany([key_B, key_A, key_C], proof) ✗ Fails
The proof encodes a specific traversal path.
Different key order = different path = verification failure.
Summary
- Authenticated dictionaries store only 33-byte digest on-chain
- Ergo key size: always 32 bytes (Blake2b256 hash); the keyLength field exists for generality
- Prover (off-chain) holds full tree, generates proofs
- Verifier (on-chain) verifies proofs with only digest
- Operation flags control insert/update/remove permissions
- Key ordering must match between proof generation and verification
- Cost scales with proof length (verifier creation) and tree height (operations)
- All methods are immutable—return new tree instances
Next: Chapter 22: Box Model
[1] Scala: AvlTreeData.scala:43-57 (AvlTreeData case class)
[2] Rust: avl_tree_data.rs:56-69 (AvlTreeData struct)
[3] Scala: AvlTreeData.scala:57 (DigestSize = 33)
[4] Rust: avl_tree_data.rs:61-62 (digest field)
[5] Scala: AvlTreeData.scala:7-36 (AvlTreeFlags)
[6] Rust: avl_tree_data.rs:10-54 (AvlTreeFlags impl)
[7] Scala: SigmaDsl.scala:547-589 (AvlTree trait)
[8] Scala: AvlTreeVerifier.scala:8-88 (AvlTreeVerifier)
[9] Scala: CAvlTreeVerifier.scala:17-45 (CAvlTreeVerifier)
[10] Scala: CErgoTreeEvaluator.scala:78-93 (contains_eval)
[11] Scala: CErgoTreeEvaluator.scala:132-164 (insert_eval)
[12] Scala: methods.scala:1498-1540 (cost info constants)
[13] Scala: AvlTreeData.scala:71-90 (serializer)
[14] Rust: avl_tree_data.rs:71-91 (SigmaSerializable impl)
[15] Scala: methods.scala:1588 (getMany key ordering caution)
Chapter 22: Box Model
Prerequisites
- Understanding of UTXO (Unspent Transaction Output) model basics
- Chapter 3 for ErgoTree format stored in boxes
- Chapter 20 for collection types used in registers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the Ergo box as the fundamental UTXO structure with extended capabilities
- Work with the register-based data model (R0-R3 mandatory, R4-R9 optional)
- Manage tokens—the multi-asset feature of Ergo boxes
- Compute box IDs using Blake2b256 hashing of serialized content
- Implement box serialization and deserialization
Box Architecture
Boxes are Ergo's state containers—the extended UTXO model[1][2]:
Box Structure
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ ErgoBox │
├─────────────────────────────────────────────────────┤
│ box_id: [32]u8 Blake2b256(serialize(box)) │
├─────────────────────────────────────────────────────┤
│ Mandatory Registers │
│ ┌───────────────────────────────────────────────┐ │
│ │ R0: Long Value in nanoERG (10⁻⁹ ERG)│ │
│ │ R1: ErgoTree Guarding script │ │
│ │ R2: Coll[Token] Secondary tokens │ │
│ │ R3: (Int, Bytes) Creation info │ │
│ └───────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Non-Mandatory Registers │
│ ┌───────────────────────────────────────────────┐ │
│ │ R4-R9: Any Application-defined data │ │
│ └───────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Transaction Reference │
│ ┌───────────────────────────────────────────────┐ │
│ │ transaction_id: [32]u8 Creating tx hash │ │
│ │ index: u16 Output index in tx │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Core Box Structure
const ErgoBox = struct {
/// Blake2b256 hash of serialized box (computed)
box_id: BoxId,
/// Amount in NanoErgs (R0)
value: BoxValue,
/// Guarding script (R1)
ergo_tree: ErgoTree,
/// Secondary tokens (R2), up to MAX_TOKENS_COUNT
tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
/// Additional registers R4-R9
additional_registers: NonMandatoryRegisters,
/// Block height when transaction was created (part of R3)
creation_height: u32,
/// Transaction that created this box (part of R3)
transaction_id: TxId,
/// Output index in transaction (part of R3)
index: u16,
/// Protocol: 255 (u8), practical: ~122 due to box size limit
pub const MAX_TOKENS_COUNT: usize = 255;
pub const MAX_BOX_SIZE: usize = 4096;
pub const MAX_SCRIPT_SIZE: usize = 4096;
/// Create new box, computing box_id from content
pub fn init(
value: BoxValue,
ergo_tree: ErgoTree,
tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
additional_registers: NonMandatoryRegisters,
creation_height: u32,
transaction_id: TxId,
index: u16,
) !ErgoBox {
var box_with_zero_id = ErgoBox{
.box_id = BoxId.zero(),
.value = value,
.ergo_tree = ergo_tree,
.tokens = tokens,
.additional_registers = additional_registers,
.creation_height = creation_height,
.transaction_id = transaction_id,
.index = index,
};
box_with_zero_id.box_id = try box_with_zero_id.calcBoxId();
return box_with_zero_id;
}
/// Compute box ID as Blake2b256 hash of serialized bytes
fn calcBoxId(self: *const ErgoBox) !BoxId {
const bytes = try self.sigmaSerialize();
const hash = blake2b256(bytes);
return BoxId{ .digest = hash };
}
/// Create box from candidate by adding transaction reference
pub fn fromBoxCandidate(
candidate: *const ErgoBoxCandidate,
transaction_id: TxId,
index: u16,
) !ErgoBox {
return init(
candidate.value,
candidate.ergo_tree,
candidate.tokens,
candidate.additional_registers,
candidate.creation_height,
transaction_id,
index,
);
}
};
ErgoBoxCandidate
Before confirmation, boxes exist as candidates without transaction reference[3][4]:
/// Box before transaction confirmation (no tx reference yet)
const ErgoBoxCandidate = struct {
/// Amount in NanoErgs
value: BoxValue,
/// Guarding script
ergo_tree: ErgoTree,
/// Secondary tokens
tokens: ?BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT),
/// Additional registers R4-R9
additional_registers: NonMandatoryRegisters,
/// Declared creation height
creation_height: u32,
pub fn toBox(self: *const ErgoBoxCandidate, tx_id: TxId, index: u16) !ErgoBox {
return ErgoBox.fromBoxCandidate(self, tx_id, index);
}
};
Register Model
Ten registers total—four mandatory, six application-defined[5][6]:
Register Layout
─────────────────────────────────────────────────────
ID Type Purpose
─────────────────────────────────────────────────────
R0 Long Value in nanoERG (10⁻⁹ ERG)
R1 Coll[Byte] Serialized ErgoTree
R2 Coll[(Coll[Byte],Long)] Secondary tokens
R3 (Int, Coll[Byte]) (height, txId ++ index)
─────────────────────────────────────────────────────
R4 Any Application data
R5 Any Application data
R6 Any Application data
R7 Any Application data
R8 Any Application data
R9 Any Application data
─────────────────────────────────────────────────────
Note: R4-R9 must be densely packed.
If R6 is used, R4 and R5 must also be present.
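The dense-packing rule is easy to state as a predicate: a sorted list of present non-mandatory register ids must be exactly the prefix 4, 5, 6, ... A Rust sketch (function name is illustrative):

```rust
/// R4-R9 must be densely packed: if register k is present, all of
/// R4..k must be present too. Given the present register ids in
/// ascending order (each in 4..=9), denseness means the list is
/// exactly the prefix 4, 5, 6, ...
pub fn is_densely_packed(present_ids: &[u8]) -> bool {
    present_ids
        .iter()
        .enumerate()
        .all(|(i, &id)| id as usize == 4 + i)
}
```

For example, using only R4 and R6 while skipping R5 violates the rule.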
Register ID Types
/// Register identifier (0-9)
const RegisterId = union(enum) {
mandatory: MandatoryRegisterId,
non_mandatory: NonMandatoryRegisterId,
pub const R0 = RegisterId{ .mandatory = .r0 };
pub const R1 = RegisterId{ .mandatory = .r1 };
pub const R2 = RegisterId{ .mandatory = .r2 };
pub const R3 = RegisterId{ .mandatory = .r3 };
pub fn fromByte(value: u8) !RegisterId {
if (value < 4) {
return RegisterId{ .mandatory = @enumFromInt(value) };
} else if (value <= 9) {
return RegisterId{ .non_mandatory = @enumFromInt(value) };
} else {
return error.RegisterIdOutOfBounds;
}
}
};
/// Mandatory registers (R0-R3) - every box has these
const MandatoryRegisterId = enum(u8) {
/// Monetary value in NanoErgs
r0 = 0,
/// Guarding script (serialized ErgoTree)
r1 = 1,
/// Secondary tokens
r2 = 2,
/// Transaction reference and creation height
r3 = 3,
};
/// Non-mandatory registers (R4-R9) - application defined
const NonMandatoryRegisterId = enum(u8) {
r4 = 4,
r5 = 5,
r6 = 6,
r7 = 7,
r8 = 8,
r9 = 9,
pub const START_INDEX: usize = 4;
pub const END_INDEX: usize = 9;
pub const NUM_REGS: usize = 6;
};
Non-Mandatory Registers
Densely-packed storage for R4-R9[7][8]:
const NonMandatoryRegisters = struct {
/// Registers stored as contiguous array (R4 at index 0)
values: []RegisterValue,
allocator: Allocator,
pub const MAX_SIZE: usize = NonMandatoryRegisterId.NUM_REGS;
pub fn empty() NonMandatoryRegisters {
return .{ .values = &.{}, .allocator = undefined };
}
/// Create from map, ensuring dense packing
pub fn fromMap(
allocator: Allocator,
map: std.AutoHashMap(NonMandatoryRegisterId, Constant),
) !NonMandatoryRegisters {
const count = map.count();
if (count > MAX_SIZE) return error.InvalidSize;
// Verify dense packing: R4...R(4+count-1) must all be present
var values = try allocator.alloc(RegisterValue, count);
var i: usize = 0;
while (i < count) : (i += 1) {
const reg_id: NonMandatoryRegisterId = @enumFromInt(4 + i);
const constant = map.get(reg_id) orelse
return error.NonDenselyPacked;
values[i] = RegisterValue{ .parsed = constant };
}
return .{ .values = values, .allocator = allocator };
}
/// Get register by ID, returns null if not present
pub fn get(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) ?*const RegisterValue {
const index = @intFromEnum(reg_id) - NonMandatoryRegisterId.START_INDEX;
if (index >= self.values.len) return null;
return &self.values[index];
}
/// Get as Constant, handling parse errors
pub fn getConstant(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) !?Constant {
const reg_val = self.get(reg_id) orelse return null;
return try reg_val.asConstant();
}
};
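As a usage sketch (the `allocator` and `Constant.fromLong` names are assumed from surrounding context), populating R4 and R6 while skipping R5 violates dense packing:

```zig
test "non-dense registers are rejected" {
    const allocator = std.testing.allocator;
    var map = std.AutoHashMap(NonMandatoryRegisterId, Constant).init(allocator);
    defer map.deinit();
    try map.put(.r4, Constant.fromLong(1));
    try map.put(.r6, Constant.fromLong(2)); // R5 is missing
    // fromMap walks R4..R(4+count-1), so the gap at R5 surfaces as an error
    try std.testing.expectError(
        error.NonDenselyPacked,
        NonMandatoryRegisters.fromMap(allocator, map),
    );
}
```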
/// Register value—either parsed Constant or unparseable bytes
const RegisterValue = union(enum) {
parsed: Constant,
parsed_tuple: EvaluatedTuple,
invalid: struct {
bytes: []const u8,
error_msg: []const u8,
},
pub fn asConstant(self: *const RegisterValue) !Constant {
return switch (self.*) {
.parsed => |c| c,
.parsed_tuple => |t| t.toConstant(),
.invalid => error.UnparseableRegister,
};
}
};
Box ID Computation
Box ID is the Blake2b256 hash of the serialized box content:
const BoxId = struct {
digest: [32]u8,
pub const SIZE: usize = 32;
pub fn zero() BoxId {
return .{ .digest = [_]u8{0} ** 32 };
}
pub fn fromBytes(bytes: []const u8) !BoxId {
if (bytes.len != SIZE) return error.InvalidLength;
var result: BoxId = undefined;
@memcpy(&result.digest, bytes);
return result;
}
};
/// Compute box ID from serialized box bytes
pub fn computeBoxId(box_bytes: []const u8) BoxId {
return BoxId{ .digest = blake2b256(box_bytes) };
}
The ID includes transaction reference, making each box unique:
Box ID Computation
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ Serialized Box Bytes │
├──────────────────────────────────────────────────┤
│ value (VLQ) │
│ ergo_tree (bytes) │
│ creation_height (VLQ) │
│ tokens_count (u8) │
│ tokens[] (token_id + amount) │
│ registers_count (u8) │
│ additional_registers[] │
│ transaction_id (32 bytes) │
│ index (2 bytes, big-endian) │
└──────────────────────────────────────────────────┘
│
▼
┌─────────────────┐
│ Blake2b256 │
└────────┬────────┘
│
▼
┌─────────────────┐
│ BoxId (32 B) │
└─────────────────┘
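The blake2b256 helper used by computeBoxId can be sketched with Zig's standard library (a minimal sketch; the authoritative implementations live in the Scala and Rust sources):

```zig
/// Blake2b with a 256-bit digest, as used for box IDs
fn blake2b256(bytes: []const u8) [32]u8 {
    var digest: [32]u8 = undefined;
    std.crypto.hash.blake2.Blake2b256.hash(bytes, &digest, .{});
    return digest;
}
```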
Register Access
Get register value with type checking:
/// Get any register value (R0-R9)
pub fn getRegister(box: *const ErgoBox, id: RegisterId) !?Constant {
return switch (id) {
.mandatory => |mid| switch (mid) {
.r0 => Constant.fromLong(box.value.as_i64()),
.r1 => Constant.fromBytes(try box.ergo_tree.serialize()),
.r2 => Constant.fromTokens(try box.tokensRaw()),
.r3 => Constant.fromTuple(box.creationInfo()),
},
.non_mandatory => |nid| try box.additional_registers.getConstant(nid),
};
}
/// Raw (bytes, amount) token pair
const RawToken = struct { []const i8, i64 };
/// Get tokens as raw (bytes, amount) pairs; caller owns the result
pub fn tokensRaw(box: *const ErgoBox) ![]const RawToken {
const tokens = box.tokens orelse return &.{};
const result = try allocator.alloc(RawToken, tokens.items().len);
for (tokens.items(), 0..) |token, i| {
result[i] = .{ token.token_id.asVecI8(), token.amount.as_i64() };
}
return result;
}
/// Get creation info as (height, txId ++ index); bytes are returned
/// by value so no pointer into stack memory escapes
pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
var bytes: [34]u8 = undefined; // 32-byte tx_id + 2-byte big-endian index
@memcpy(bytes[0..32], &box.transaction_id.digest);
std.mem.writeInt(u16, bytes[32..34], box.index, .big);
return .{
@intCast(box.creation_height),
bytes,
};
}
ExtractRegisterAs (AST Node)
Register access in ErgoScript compiles to ExtractRegisterAs:
/// Box.R0 - Box.R9 operations
const ExtractRegisterAs = struct {
/// Input box expression
input: *const Expr,
/// Register index (0-9)
register_id: i8,
/// Expected element type (wrapped in Option)
elem_tpe: SType,
pub const OP_CODE = OpCode.new(0x6E); // EXTRACT_REGISTER_AS
pub fn tpe(self: *const ExtractRegisterAs) SType {
return SType.option(self.elem_tpe);
}
pub fn eval(self: *const ExtractRegisterAs, env: *Env, ctx: *Context) !Value {
const ir_box = try self.input.eval(env, ctx);
const box = ir_box.asBox() orelse return error.TypeMismatch;
const id = RegisterId.fromByte(@intCast(self.register_id)) catch
return error.RegisterIdOutOfBounds;
const reg_val_opt = try box.getRegister(id);
if (reg_val_opt) |constant| {
// Type must match exactly
if (!constant.tpe.equals(self.elem_tpe)) {
return error.UnexpectedType;
}
return Value.some(constant.value);
} else {
return Value.none();
}
}
};
Token Representation
Tokens are (id, amount) pairs stored in R2:
const Token = struct {
/// 32-byte token identifier
token_id: TokenId,
/// Token amount (positive i64)
amount: TokenAmount,
};
const TokenId = struct {
digest: [32]u8,
pub const SIZE: usize = 32;
};
const TokenAmount = struct {
value: u64,
pub fn as_i64(self: TokenAmount) i64 {
return @intCast(self.value);
}
};
/// Bounded collection of tokens (1 to MAX_TOKENS)
const BoxTokens = BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT);
Token minting rule:
Token Creation Rule
─────────────────────────────────────────────────────
A new token can ONLY be minted when:
token_id == INPUTS(0).id (MUST equal first input's box ID)
This is a consensus rule enforced by the protocol.
Only the first input's box ID can be used as a new token ID.
This ensures uniqueness: tokens are "born" from a specific box.
┌─────────────┐ Spend ┌─────────────────┐
│ Input Box │ ─────────────► │ Output Box │
│ id: ABC123 │ │ token: ABC123 │
└─────────────┘ │ amount: 1000 │
└─────────────────┘
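The minting rule reduces to a single comparison. The helper below is a sketch (the name and wiring are assumptions; the actual rule is enforced during stateful validation by the node):

```zig
/// A token absent from all inputs may only be minted if its ID equals
/// the first input's box ID (consensus rule sketch)
fn isValidMint(first_input_id: BoxId, new_token_id: TokenId) bool {
    return std.mem.eql(u8, &first_input_id.digest, &new_token_id.digest);
}
```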
Box Serialization
/// Serialize box with optional token ID indexing
pub fn serializeBoxWithIndexedDigests(
box_value: BoxValue,
ergo_tree_bytes: []const u8,
tokens: ?BoxTokens,
additional_registers: *const NonMandatoryRegisters,
creation_height: u32,
token_ids_in_tx: ?*const IndexSet(TokenId),
writer: anytype,
) !void {
// Value (VLQ-encoded)
try box_value.serialize(writer);
// ErgoTree bytes
try writer.writeAll(ergo_tree_bytes);
// Creation height (VLQ-encoded)
try writeVLQ(writer, creation_height);
// Tokens
const token_slice = if (tokens) |t| t.items() else &[_]Token{};
try writer.writeByte(@intCast(token_slice.len));
for (token_slice) |token| {
if (token_ids_in_tx) |index_set| {
// Write index into transaction's token list
const idx = index_set.getIndex(token.token_id) orelse
return error.TokenNotInIndex;
try writeVLQ(writer, @intCast(idx));
} else {
// Write full 32-byte token ID
try writer.writeAll(&token.token_id.digest);
}
try writeVLQ(writer, token.amount.value);
}
// Additional registers
try additional_registers.serialize(writer);
}
/// Full ErgoBox serialization (adds tx reference)
pub fn serializeErgoBox(box: *const ErgoBox, writer: anytype) !void {
const ergo_tree_bytes = try box.ergo_tree.serialize();
try serializeBoxWithIndexedDigests(
box.value,
ergo_tree_bytes,
box.tokens,
&box.additional_registers,
box.creation_height,
null,
writer,
);
// Transaction reference
try writer.writeAll(&box.transaction_id.digest);
try writer.writeInt(u16, box.index, .big);
}
Size Limits
Box Constraints
─────────────────────────────────────────────────────
Limit Value Notes
─────────────────────────────────────────────────────
Max box size 4 KB Total serialized bytes
Max tokens per box 255 Protocol limit (u8)
(practical limit) ~122 Due to 4KB size limit
Max registers 10 R0-R9
Max script size 4 KB ErgoTree in R1 (part of box)
─────────────────────────────────────────────────────
const SigmaConstants = struct {
pub const MAX_BOX_SIZE: usize = 4 * 1024;
/// Protocol allows 255 (u8), but ~122 fit within MAX_BOX_SIZE
pub const MAX_TOKENS_PROTOCOL: usize = 255;
pub const MAX_TOKENS_PRACTICAL: usize = 122;
pub const MAX_REGISTERS: usize = 10;
};
Box Interface Methods
Methods available on the Box type:
const BoxMethods = struct {
/// Box.value: Long - monetary value in NanoErgs
pub fn value(box: *const ErgoBox) i64 {
return box.value.as_i64();
}
/// Box.propositionBytes: Coll[Byte] - serialized script
pub fn propositionBytes(box: *const ErgoBox) ![]const u8 {
return try box.ergo_tree.serialize();
}
/// Box.bytes: Coll[Byte] - full serialized box
pub fn bytes(box: *const ErgoBox) ![]const u8 {
return try box.serialize();
}
/// Box.bytesWithoutRef: Coll[Byte] - without tx reference
pub fn bytesWithoutRef(box: *const ErgoBox) ![]const u8 {
const candidate = ErgoBoxCandidate{
.value = box.value,
.ergo_tree = box.ergo_tree,
.tokens = box.tokens,
.additional_registers = box.additional_registers,
.creation_height = box.creation_height,
};
return try candidate.serialize();
}
/// Box.id: Coll[Byte] - 32-byte Blake2b256 hash
pub fn id(box: *const ErgoBox) []const u8 {
return &box.box_id.digest;
}
/// Box.creationInfo: (Int, Coll[Byte])
pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
return box.creationInfo();
}
/// Box.tokens: Coll[(Coll[Byte], Long)]
pub fn tokens(box: *const ErgoBox) []const Token {
return if (box.tokens) |t| t.items() else &.{};
}
/// Box.getReg[T](i: Int): Option[T]
pub fn getReg(box: *const ErgoBox, comptime T: type, index: i32) !?T {
const id = try RegisterId.fromByte(@intCast(index));
const constant = try box.getRegister(id) orelse return null;
return constant.extractAs(T);
}
};
Type-Safe Register Access
Three outcomes when accessing registers:
Register Access Outcomes
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────┐
│ box.R4[Int] │
└─────────────────────────────────────────────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────┐ ┌────────────────┐
│ R4 not set │ │ R4 = Int │ │ R4 = Long │
│ │ │ │ │ (wrong type) │
└──────┬───────┘ └────┬─────┘ └───────┬────────┘
│ │ │
▼ ▼ ▼
None Some(value) ERROR!
InvalidType
/// Type-safe register access with explicit error handling
pub fn extractRegisterAs(
box: *const ErgoBox,
register_id: i8,
expected_type: SType,
) !?Value {
const id = try RegisterId.fromByte(@intCast(register_id));
const constant_opt = try box.getRegister(id);
if (constant_opt) |constant| {
if (!constant.tpe.equals(expected_type)) {
return error.InvalidType;
}
return constant.value;
}
return null;
}
Summary
- Boxes are immutable UTXO state containers with 10 registers
- R0-R3 are mandatory (value, script, tokens, creation info)
- R4-R9 are application-defined, must be densely packed
- Box ID is Blake2b256 hash of serialized content including tx reference
- Tokens stored in R2, max 255 per box (protocol), ~122 practical; token ID MUST equal first input's box ID
- Type-safe access with three outcomes: None, Some(value), or InvalidType
- 4KB limit on total box size
Next: Chapter 23: Interpreter Wrappers
Scala: ErgoBox.scala:50-59
Rust: ergo_box.rs:38-80
Scala: ErgoBoxCandidate.scala:36-41
Rust: ergo_box.rs:225-248
Scala: ErgoBox.scala:154-168
Rust: id.rs:78-90
Scala: ErgoBox.scala (additionalRegisters)
Rust: register.rs:27-91
Scala: ErgoBox.scala:72-73
Rust: ergo_box.rs:149-153
Scala: CBox.scala:77-94
Rust: ergo_box.rs:156-168
Scala: methods.scala:1263 (SBoxMethods)
Rust: extract_reg_as.rs:18-57
Scala: ErgoBox.scala:119-130
Rust: ergo_box.rs:36-37 (BoxTokens)
Scala: SigmaDsl.scala:414-536
Rust: ergo_box.rs:120-198
Scala: CBox.scala:20-74
Rust: extract_reg_as.rs:15-47
Chapter 23: Interpreter Wrappers
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 14 for verifier implementation
- Chapter 15 for prover implementation
- Chapter 22 for box structure and registers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the interpreter hierarchy and how verifier/prover are combined
- Describe storage rent rules for expired boxes
- Use the Wallet API for transaction signing
- Implement proof verification with context extensions
Interpreter Architecture
The interpreter provides a layered architecture for script evaluation and proving:
Interpreter Hierarchy
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ Verifier │
│ verify(tree, ctx, proof, message) -> bool │
│ Evaluates tree, then verifies sigma protocol proof │
└────────────────────────┬────────────────────────────┘
│ uses
▼
┌─────────────────────────────────────────────────────┐
│ Prover │
│ prove(tree, ctx, message, hints) -> ProverResult │
│ Reduces to SigmaBoolean, generates proof │
├─────────────────────────────────────────────────────┤
│ secrets: []PrivateInput │
│ prove() generates commitment, response │
└────────────────────────┬────────────────────────────┘
│ uses
▼
┌─────────────────────────────────────────────────────┐
│ reduce_to_crypto │
│ Evaluates ErgoTree to SigmaBoolean │
│ Returns: { sigma_prop, cost, diag } │
└─────────────────────────────────────────────────────┘
Reduction to Crypto
The core evaluation function reduces an ErgoTree to a cryptographic proposition:
/// Result of expression reduction
const ReductionResult = struct {
/// SigmaBoolean representing verifiable statement
sigma_prop: SigmaBoolean,
/// Estimated execution cost
cost: u64,
/// Diagnostic info (env state, pretty-printed expr)
diag: ReductionDiagnosticInfo,
};
/// Evaluate ErgoTree to SigmaBoolean
pub fn reduceToCrypto(tree: *const ErgoTree, ctx: *const Context) !ReductionResult {
const expr = try tree.root();
var env = Env.empty();
const value = try expr.eval(&env, ctx);
const sigma_prop = switch (value) {
.boolean => |b| SigmaBoolean.trivial(b),
.sigma_prop => |sp| sp.value(),
else => return error.NotSigmaProp,
};
return ReductionResult{
.sigma_prop = sigma_prop,
.cost = ctx.cost_accum.total(),
.diag = .{
.env = env.toStatic(),
.pretty_printed_expr = null,
},
};
}
Verifier Trait
Verification executes the script and validates the proof:
const Verifier = struct {
/// Verify proof against ErgoTree in context
pub fn verify(
self: *const Verifier,
tree: *const ErgoTree,
ctx: *const Context,
proof: ProofBytes,
message: []const u8,
) !VerificationResult {
// Step 1-2: Reduce to SigmaBoolean
const reduction = try reduceToCrypto(tree, ctx);
// Step 3: Verify proof
const result = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) break :blk false;
// Parse signature and compute challenges
const unchecked_tree = try parseSigComputeChallenges(
sb,
proof.bytes(),
);
// Verify commitments match
break :blk try checkCommitments(unchecked_tree, message);
},
};
return VerificationResult{
.result = result,
.cost = reduction.cost,
.diag = reduction.diag,
};
}
};
const VerificationResult = struct {
/// True if proof validates
result: bool,
/// Execution cost
cost: u64,
/// Diagnostic information
diag: ReductionDiagnosticInfo,
};
Prover Trait
The prover generates proofs for sigma propositions:
const Prover = struct {
/// Private inputs (secrets)
secrets: []const PrivateInput,
/// Generate proof for ErgoTree
pub fn prove(
self: *const Prover,
tree: *const ErgoTree,
ctx: *const Context,
message: []const u8,
hints: ?*const HintsBag,
) !ProverResult {
// Reduce to crypto
const reduction = try reduceToCrypto(tree, ctx);
return switch (reduction.sigma_prop) {
.trivial_prop => |b| if (b)
ProverResult.empty()
else
error.ReducedToFalse,
else => |sb| blk: {
// Generate proof using sigma protocol
const proof = try self.generateProof(sb, message, hints);
break :blk proof;
},
};
}
/// Add secret to prover
pub fn appendSecret(self: *Prover, secret: PrivateInput) void {
self.secrets = append(self.secrets, secret);
}
/// Get public images of all secrets
pub fn publicImages(self: *const Prover) []SigmaBoolean {
var result: []SigmaBoolean = &.{};
for (self.secrets) |secret| {
result = append(result, secret.publicImage());
}
return result;
}
};
ProverResult
Proof output with context extension:
const ProverResult = struct {
/// Serialized proof bytes
proof: ProofBytes,
/// User-defined context variables
extension: ContextExtension,
pub fn empty() ProverResult {
return .{
.proof = ProofBytes.empty(),
.extension = ContextExtension.empty(),
};
}
};
/// Proof bytes (empty for trivial proofs)
const ProofBytes = union(enum) {
empty: void,
some: []const u8,
pub fn isEmpty(self: ProofBytes) bool {
return self == .empty;
}
pub fn bytes(self: ProofBytes) []const u8 {
return switch (self) {
.empty => &.{},
.some => |b| b,
};
}
};
Wallet
The Wallet wraps a prover for transaction signing:
const Wallet = struct {
/// Underlying prover (held by value; a sketch that returned a pointer
/// to a temporary Prover would dangle)
prover: Prover,
/// Create from mnemonic phrase
pub fn fromMnemonic(
phrase: []const u8,
password: []const u8,
) !Wallet {
const seed = Mnemonic.toSeed(phrase, password);
const ext_sk = try ExtSecretKey.deriveMaster(seed);
return Wallet.fromSecrets(&.{ext_sk.secretKey()});
}
/// Create from secret keys
pub fn fromSecrets(secrets: []const SecretKey) Wallet {
var private_inputs: []PrivateInput = &.{};
for (secrets) |sk| {
private_inputs = append(private_inputs, PrivateInput.from(sk));
}
return .{
.prover = Prover{ .secrets = private_inputs },
};
}
/// Add secret to wallet
pub fn addSecret(self: *Wallet, secret: SecretKey) void {
self.prover.appendSecret(PrivateInput.from(secret));
}
/// Sign a transaction
pub fn signTransaction(
self: *const Wallet,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signTransactionImpl(
&self.prover,
tx_context,
state_context,
tx_hints,
);
}
/// Sign a reduced transaction
pub fn signReducedTransaction(
self: *const Wallet,
reduced_tx: *const ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signReducedTransactionImpl(
&self.prover,
reduced_tx,
tx_hints,
);
}
};
Transaction Signing
Sign all inputs, accumulating costs:
/// Sign transaction, generating proofs for all inputs
pub fn signTransaction(
prover: *const Prover,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const tx = tx_context.spending_tx;
const message = try tx.bytesToSign();
// Build context for first input
var ctx = try makeContext(state_context, tx_context, 0);
// Sign each input
var inputs: []Input = &.{};
for (tx.inputs(), 0..) |unsigned_input, idx| {
if (idx > 0) {
try updateContext(&ctx, tx_context, idx);
}
// Get hints for this input
const hints = if (tx_hints) |h| h.allHintsForInput(idx) else null;
// Generate proof
const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
return error.InputBoxNotFound;
const prover_result = try prover.prove(
&input_box.ergo_tree,
&ctx,
message,
hints,
);
inputs = append(inputs, Input{
.box_id = unsigned_input.box_id,
.spending_proof = prover_result,
});
}
return Transaction{
.inputs = inputs,
.data_inputs = tx.data_inputs,
.output_candidates = tx.output_candidates,
};
}
/// Create evaluation context for input
pub fn makeContext(
state_ctx: *const ErgoStateContext,
tx_ctx: *const TransactionContext,
self_index: usize,
) !Context {
const self_box = tx_ctx.getInputBox(
tx_ctx.spending_tx.inputs()[self_index].box_id,
) orelse return error.InputBoxNotFound;
return Context{
.height = state_ctx.pre_header.height,
.self_box = self_box,
.outputs = tx_ctx.spending_tx.outputs(),
.inputs = tx_ctx.inputBoxes(),
.data_inputs = tx_ctx.dataBoxes(),
.pre_header = state_ctx.pre_header,
.headers = state_ctx.headers,
.extension = tx_ctx.spending_tx.contextExtension(self_index),
};
}
Transaction Hints Bag
Hints for multi-party signing protocols:
const TransactionHintsBag = struct {
/// Secret hints by input index
secret_hints: std.AutoHashMap(usize, HintsBag),
/// Public hints (commitments) by input index
public_hints: std.AutoHashMap(usize, HintsBag),
pub fn empty() TransactionHintsBag {
return .{
.secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
};
}
/// Replace all hints for an input
pub fn replaceHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
var public_hints: []Hint = &.{};
var secret_hints: []Hint = &.{};
for (hints.hints) |hint| {
switch (hint) {
.commitment_hint => public_hints = append(public_hints, hint),
.secret_proven => secret_hints = append(secret_hints, hint),
}
}
self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
}
/// Add hints for an input (appending to existing)
pub fn addHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
// Get existing or empty
var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
var existing_public = self.public_hints.get(index) orelse HintsBag.empty();
for (hints.hints) |hint| {
switch (hint) {
.commitment_hint => existing_public.hints = append(existing_public.hints, hint),
.secret_proven => existing_secret.hints = append(existing_secret.hints, hint),
}
}
self.secret_hints.put(index, existing_secret) catch {};
self.public_hints.put(index, existing_public) catch {};
}
/// Get all hints for input
pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
var hints: []Hint = &.{};
if (self.secret_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
if (self.public_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
return HintsBag{ .hints = hints };
}
};
Commitment Generation
Generate first-round commitments for distributed signing:
/// Generate commitments for transaction inputs
pub fn generateCommitments(
wallet: *const Wallet,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
// Get public keys from wallet secrets
var public_keys: []SigmaBoolean = &.{};
for (wallet.prover.secrets) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
var hints_bag = TransactionHintsBag.empty();
for (tx_context.spending_tx.inputs(), 0..) |_, idx| {
var ctx = try makeContext(state_context, tx_context, idx);
const input_box = tx_context.inputBoxes()[idx];
const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);
// Generate commitments for this sigma proposition
const input_hints = generateCommitmentsFor(
&reduction.sigma_prop,
public_keys,
);
hints_bag.addHintsForInput(idx, input_hints);
}
return hints_bag;
}
Storage Rent (Ergo-Specific)
Boxes expire after roughly 4 years and can then be spent by anyone:
Storage Rent Rules
─────────────────────────────────────────────────────
Period: 1,051,200 blocks ≈ 4 years (at 2 min/block)
Expired Box Spending:
┌─────────────────────────────────────────────────────┐
│ IF: │
│ current_height - box.creation_height >= 1,051,200 │
│ AND proof.isEmpty() │
│ AND extension.contains(STORAGE_INDEX_VAR) │
│ THEN: │
│ Check recreation rules instead of script │
└─────────────────────────────────────────────────────┘
Recreation Rules:
┌─────────────────────────────────────────────────────┐
│ output.creation_height == current_height │
│ output.value >= box.value - storage_fee │
│ output.R1 == box.R1 (script preserved) │
│ output.R2 == box.R2 (tokens preserved) │
│ output.R4-R9 == box.R4-R9 (registers preserved) │
│ │
│ storage_fee = storage_fee_factor * box.bytes.len │
└─────────────────────────────────────────────────────┘
const StorageConstants = struct {
/// Storage period in blocks (~4 years at 2 min/block)
pub const STORAGE_PERIOD: u32 = 1_051_200;
/// Context extension variable ID for storage index
pub const STORAGE_INDEX_VAR_ID: u8 = 127;
/// Fixed cost for storage contract evaluation
pub const STORAGE_CONTRACT_COST: u64 = 50;
};
/// Check if expired box spending is valid
pub fn checkExpiredBox(
box: *const ErgoBox,
output: *const ErgoBoxCandidate,
current_height: u32,
storage_fee_factor: u64,
) bool {
// Calculate storage fee
const storage_fee = storage_fee_factor * box.serializedSize();
// If box value <= fee, it's "dust" - always allowed
if (box.value.as_i64() - @as(i64, @intCast(storage_fee)) <= 0) {
return true;
}
// Check recreation rules
const correct_height = output.creation_height == current_height;
const correct_value = output.value.as_i64() >= box.value.as_i64() - @as(i64, @intCast(storage_fee));
const correct_registers = checkRegistersPreserved(box, output);
return correct_height and correct_value and correct_registers;
}
fn checkRegistersPreserved(box: *const ErgoBox, output: *const ErgoBoxCandidate) bool {
// R0 (value) and R3 (reference) can change
// R1 (script), R2 (tokens), R4-R9 must be preserved
return eql(box.ergo_tree, output.ergo_tree) and
eql(box.tokens, output.tokens) and
eql(box.additional_registers, output.additional_registers);
}
Signing Errors
const TxSigningError = error{
/// Transaction context invalid
TransactionContextError,
/// Prover failed on input
ProverError,
/// Serialization failed
SerializationError,
/// Signature parsing failed
SigParsingError,
};
const ProverError = error{
/// ErgoTree parsing failed
ErgoTreeError,
/// Evaluation failed
EvalError,
/// Script reduced to false
ReducedToFalse,
/// Missing witness for proof
TreeRootIsNotReal,
/// Secret not found for leaf
SecretNotFound,
/// Simulated leaf needs challenge
SimulatedLeafWithoutChallenge,
};
Cost Tracking
Transaction costs are accumulated across inputs:
const TxCostComponents = struct {
/// Interpreter initialization (once per tx)
pub const INTERPRETER_INIT_COST: u64 = 10_000;
/// Calculate total transaction cost
pub fn calculateInitialCost(
params: *const BlockchainParameters,
inputs_count: usize,
data_inputs_count: usize,
outputs_count: usize,
token_access_cost: u64,
) u64 {
return INTERPRETER_INIT_COST +
inputs_count * params.input_cost +
data_inputs_count * params.data_input_cost +
outputs_count * params.output_cost +
token_access_cost;
}
};
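As a worked example with illustrative parameter values (these are NOT the actual network parameters; consult the node sources for real values), the formula above can be exercised directly:

```zig
test "initial cost accumulation (illustrative parameters)" {
    const params = BlockchainParameters{
        .input_cost = 2_000,
        .data_input_cost = 100,
        .output_cost = 100,
        .max_block_cost = 1_000_000,
    };
    // 2 inputs, 1 data input, 3 outputs, no token accesses:
    // 10_000 + 2*2_000 + 1*100 + 3*100 = 14_400
    const cost = TxCostComponents.calculateInitialCost(&params, 2, 1, 3, 0);
    try std.testing.expectEqual(@as(u64, 14_400), cost);
}
```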
Deterministic Signing
For platforms without a secure random source:
/// Generate deterministic nonce from secret and message
/// Used when secure random is unavailable
pub fn generateDeterministicCommitments(
wallet: *const Wallet,
reduced_tx: *const ReducedTransaction,
aux_rand: []const u8,
) !TransactionHintsBag {
var hints_bag = TransactionHintsBag.empty();
const message = try reduced_tx.unsigned_tx.bytesToSign();
for (reduced_tx.reduced_inputs(), 0..) |input, idx| {
// Deterministic nonce: H(secret || message || aux_rand)
if (generateDeterministicCommitmentsFor(
wallet.prover,
&input.sigma_prop,
message,
aux_rand,
)) |bag| {
hints_bag.addHintsForInput(idx, bag);
}
}
return hints_bag;
}
Summary
- Verifier evaluates script, verifies sigma protocol proof
- Prover reduces to SigmaBoolean, generates proof using secrets
- Wallet wraps prover with transaction-level signing API
- TransactionHintsBag coordinates multi-party signing
- Storage rent allows expired boxes (~4 years) to be spent by anyone
- Deterministic signing available for platforms without secure random
- Cost accumulates across inputs with initial overhead
Next: Chapter 24: Transaction Validation
Scala: ErgoLikeInterpreter.scala
Rust: eval.rs:1-50
Scala: Interpreter.scala (reduce)
Rust: eval.rs:129-160
Scala: Interpreter.scala (verify)
Rust: verifier.rs:55-88
Scala: ProverInterpreter.scala
Rust: prover.rs:57-96
Scala: ProverResult.scala
Rust: prover_result.rs:14-50
Scala: ErgoProvingInterpreter.scala
Rust: wallet.rs:52-94
Rust: signing.rs:143-180
Scala: HintsBag.scala
Rust: wallet.rs:259-347
Rust: wallet.rs:124-158
Scala: ErgoInterpreter.scala:42-55
Scala: ErgoInterpreter.scala:93-96
Rust: wallet.rs:182-209
Rust: deterministic.rs
Chapter 24: Transaction Validation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 14 for script verification
- Chapter 22 for box structure and tokens
- Chapter 23 for interpreter wrappers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the two-phase validation pipeline (stateless then stateful)
- Implement stateless validation rules (input/output counts, no duplicates)
- Perform stateful validation with cost accumulation
- Verify ERG and token preservation across transaction inputs and outputs
Validation Pipeline
Transaction validation occurs in two phases:
Transaction Validation Pipeline
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ STATELESS VALIDATION │
│ (No blockchain state required) │
├─────────────────────────────────────────────────────┤
│ • Has inputs? (at least 1) │
│ • Has outputs? (at least 1) │
│ • Count limits (≤ 32,767 each) │
│ • No negative values (outputs ≥ 0) │
│ • Output sum valid (no overflow) │
│ • Unique inputs (no double-spend) │
└──────────────────────────┬──────────────────────────┘
│ Pass
▼
┌─────────────────────────────────────────────────────┐
│ STATEFUL VALIDATION │
│ (Requires UTXO state and blockchain context) │
├─────────────────────────────────────────────────────┤
│ 1. Calculate initial cost │
│ 2. Verify outputs (dust, height, size) │
│ 3. Check ERG preservation │
│ 4. Verify asset preservation │
│ 5. Verify input scripts (accumulate cost) │
│ 6. Check re-emission rules (EIP-27) │
└─────────────────────────────────────────────────────┘
Transaction Structure
const Transaction = struct {
/// Transaction ID (Blake2b256 of serialized tx without proofs)
tx_id: TxId,
/// Input boxes to spend (with proofs)
inputs: TxIoVec(Input),
/// Read-only input references (no proofs)
data_inputs: ?TxIoVec(DataInput),
/// Output box candidates
output_candidates: TxIoVec(ErgoBoxCandidate),
/// Materialized outputs (with tx_id and index)
outputs: TxIoVec(ErgoBox),
pub const MAX_OUTPUTS_COUNT: usize = std.math.maxInt(u16);
pub fn init(
inputs: TxIoVec(Input),
data_inputs: ?TxIoVec(DataInput),
output_candidates: TxIoVec(ErgoBoxCandidate),
) !Transaction {
// First pass: compute outputs with zero tx_id
const zero_outputs = try output_candidates.mapIndexed(
struct {
fn f(idx: usize, bc: *const ErgoBoxCandidate) !ErgoBox {
return ErgoBox.fromBoxCandidate(bc, TxId.zero(), @intCast(idx));
}
}.f,
);
var tx = Transaction{
.tx_id = TxId.zero(),
.inputs = inputs,
.data_inputs = data_inputs,
.output_candidates = output_candidates,
.outputs = zero_outputs,
};
// Compute actual tx_id
tx.tx_id = try tx.calcTxId();
// Update outputs with the real tx_id. A plain loop is used here
// because Zig nested functions cannot capture the runtime value tx.tx_id.
for (tx.outputs.items(), 0..) |*out, idx| {
out.* = ErgoBox.fromBoxCandidate(&output_candidates.items()[idx], tx.tx_id, @intCast(idx));
}
return tx;
}
};
Validation Error Types
const TxValidationError = error{
/// Output ERG sum overflow
OutputSumOverflow,
/// Input ERG sum overflow
InputSumOverflow,
/// Same box spent twice
DoubleSpend,
/// ERG not preserved (inputs != outputs)
ErgPreservationError,
/// Token amounts not preserved
TokenPreservationError,
/// Output below dust threshold
DustOutput,
/// Creation height > current height
InvalidHeightError,
/// Creation height < max input height (v3+)
MonotonicHeightError,
/// Negative creation height (v1+)
NegativeHeight,
/// Box exceeds 4KB limit
BoxSizeExceeded,
/// Script exceeds size limit
ScriptSizeExceeded,
/// Script verification failed
ReducedToFalse,
/// Verifier error
VerifierError,
/// Accumulated cost exceeds the block limit
CostExceeded,
};
Stateless Validation
Checks that don't require blockchain state:
/// Validate transaction structure without blockchain state
pub fn validateStateless(tx: *const Transaction) TxValidationError!void {
// BoundedVec ensures 1 ≤ count ≤ 32767, so no explicit checks needed
// Check output sum doesn't overflow
var output_sum: i64 = 0;
for (tx.outputs.items()) |out| {
output_sum = std.math.add(i64, output_sum, out.value.as_i64()) catch
return error.OutputSumOverflow;
}
// Check no double-spend (unique inputs).
// A quadratic scan keeps this sketch allocation-free; inputs are
// bounded at 32,767, and real implementations use a hash set.
const inputs = tx.inputs.items();
for (inputs, 0..) |input, i| {
for (inputs[0..i]) |prev| {
if (std.meta.eql(input.box_id, prev.box_id)) {
return error.DoubleSpend;
}
}
}
}
Stateless Rules Table
Stateless Validation Rules
─────────────────────────────────────────────────────
Rule Check Limit
─────────────────────────────────────────────────────
txNoInputs inputs.len >= 1 min 1
txNoOutputs outputs.len >= 1 min 1
txManyInputs inputs.len <= MAX 32,767
txManyDataInputs data_inputs.len <= MAX 32,767
txManyOutputs outputs.len <= MAX 32,767
txNegativeOutput all outputs >= 0 -
txOutputSum sum(outputs) no overflow -
txInputsUnique no duplicate box_ids -
─────────────────────────────────────────────────────
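The table above can be cross-checked with a small, illustrative Python sketch. This is not the real implementation: plain dicts stand in for the box and input types, and the 32,767 bound is simply i16 max.

```python
I64_MAX = 2**63 - 1  # nanoERG values are signed 64-bit

def validate_stateless(inputs, outputs):
    """Structural checks mirroring the stateless rules table (sketch)."""
    if not (1 <= len(inputs) <= 32767):
        raise ValueError("input count out of bounds")
    if not (1 <= len(outputs) <= 32767):
        raise ValueError("output count out of bounds")
    total = 0
    for out in outputs:
        if out["value"] < 0:
            raise ValueError("negative output value")
        total += out["value"]
        if total > I64_MAX:
            raise ValueError("output sum overflow")
    box_ids = [inp["box_id"] for inp in inputs]
    if len(set(box_ids)) != len(box_ids):
        raise ValueError("duplicate input box id")

# A minimal well-formed transaction passes all checks:
validate_stateless([{"box_id": "a"}], [{"value": 1000}])
```

Note that the checks are purely structural: no UTXO lookup is needed, so they can run before any state access.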
Stateful Validation
Requires UTXO state and blockchain context [5][6]:
/// Validate transaction against blockchain state
pub fn validateStateful(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
state_context: *const ErgoStateContext,
accumulated_cost: u64,
verifier: *const Verifier,
) TxValidationError!u64 {
const params = state_context.current_parameters;
const max_cost = params.max_block_cost;
// 1. Calculate initial cost
const initial_cost = calculateInitialCost(
tx,
boxes_to_spend.len,
data_boxes.len,
params,
);
var current_cost = accumulated_cost + initial_cost;
if (current_cost > max_cost) {
return error.CostExceeded;
}
// 2. Verify outputs
const max_input_height = maxCreationHeight(boxes_to_spend);
for (tx.outputs.items()) |out| {
try verifyOutput(out, state_context, max_input_height);
}
// 3. Check ERG preservation (inputs must equal outputs exactly)
const input_sum = try sumValues(boxes_to_spend);
const output_sum = try sumValues(tx.outputs.items());
if (input_sum != output_sum) {
return error.ErgPreservationError;
}
// 4. Verify asset preservation
current_cost = try verifyAssets(
tx,
boxes_to_spend,
state_context,
current_cost,
);
// 5. Verify each input script
for (boxes_to_spend, 0..) |box, idx| {
current_cost = try verifyInput(
tx,
boxes_to_spend,
data_boxes,
box,
@intCast(idx),
state_context,
current_cost,
verifier,
);
}
return current_cost;
}
Initial Cost Calculation
Transaction cost starts with fixed overhead [7][8]:
const CostConstants = struct {
pub const INTERPRETER_INIT_COST: u64 = 10_000;
};
pub fn calculateInitialCost(
tx: *const Transaction,
inputs_count: usize,
data_inputs_count: usize,
params: *const BlockchainParameters,
) u64 {
return CostConstants.INTERPRETER_INIT_COST +
inputs_count * params.input_cost +
data_inputs_count * params.data_input_cost +
tx.outputs.len() * params.output_cost;
}
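With the mainnet defaults listed in the next chapter (inputCost = 2,000, dataInputCost = 100, outputCost = 100, init cost = 10,000), the fixed overhead works out as simple arithmetic. An illustrative Python version:

```python
INTERPRETER_INIT_COST = 10_000
INPUT_COST = 2_000
DATA_INPUT_COST = 100
OUTPUT_COST = 100

def initial_cost(n_inputs, n_data_inputs, n_outputs):
    # Mirrors calculateInitialCost: fixed init cost plus per-item costs
    return (INTERPRETER_INIT_COST
            + n_inputs * INPUT_COST
            + n_data_inputs * DATA_INPUT_COST
            + n_outputs * OUTPUT_COST)

print(initial_cost(2, 1, 3))  # 10_000 + 4_000 + 100 + 300 = 14_400
```

This cost is charged before any script runs, so a transaction can fail on cost alone without touching the interpreter.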
Output Verification
Each output must pass structural checks [9][10]:
pub fn verifyOutput(
out: *const ErgoBox,
state_context: *const ErgoStateContext,
max_input_height: u32,
) TxValidationError!void {
const params = state_context.current_parameters;
const block_version = state_context.block_version;
const current_height = state_context.current_height;
// Dust check: value >= minimum for box size
const min_value = BoxUtils.minimalErgoAmount(out, params);
if (out.value.as_u64() < min_value) {
return error.DustOutput;
}
// Future check: creation height <= current height
if (out.creation_height > current_height) {
return error.InvalidHeightError;
}
// Non-negative height (after v1). creation_height is signed in the
// serialized form; with an unsigned in-memory type, this check is
// enforced at deserialization instead.
if (block_version > 1 and out.creation_height < 0) {
return error.NegativeHeight;
}
// Monotonic height (after v3): output height >= max input height
if (block_version >= 3 and out.creation_height < max_input_height) {
return error.MonotonicHeightError;
}
// Size limits
if (out.serializedSize() > ErgoBox.MAX_BOX_SIZE) {
return error.BoxSizeExceeded;
}
if (out.propositionBytes().len > ErgoBox.MAX_SCRIPT_SIZE) {
return error.ScriptSizeExceeded;
}
}
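The dust check scales with serialized box size. A hedged sketch of the relationship, assuming the minimum is simply size × minValuePerByte (360 nanoERG per byte is the default from the parameter table in the next chapter; the real computation lives in BoxUtils.minimalErgoAmount):

```python
MIN_VALUE_PER_BYTE = 360  # default minValuePerByte parameter, nanoERG

def minimal_erg_amount(serialized_size):
    # Minimum value a box must carry to avoid the DustOutput error (sketch)
    return serialized_size * MIN_VALUE_PER_BYTE

def is_dust(value, serialized_size):
    return value < minimal_erg_amount(serialized_size)

# A ~105-byte box needs at least 105 * 360 = 37,800 nanoERG
assert minimal_erg_amount(105) == 37_800
assert is_dust(10_000, 105)
assert not is_dust(100_000, 105)
```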
Asset Verification
pub fn verifyAssets(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
state_context: *const ErgoStateContext,
current_cost: u64,
) TxValidationError!u64 {
// Extract input assets
var in_assets = std.AutoHashMap(TokenId, u64).init(allocator);
defer in_assets.deinit();
for (boxes_to_spend) |box| {
if (box.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = in_assets.getOrPut(token.token_id) catch unreachable; // sketch: assume allocation succeeds
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// Extract output assets
var out_assets = std.AutoHashMap(TokenId, u64).init(allocator);
defer out_assets.deinit();
for (tx.outputs.items()) |out| {
if (out.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = out_assets.getOrPut(token.token_id) catch unreachable; // sketch: assume allocation succeeds
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// First input box ID can mint new tokens
const new_token_id = TokenId{ .digest = tx.inputs.items()[0].box_id.digest };
// Verify each output token
var iter = out_assets.iterator();
while (iter.next()) |entry| {
const out_id = entry.key_ptr.*;
const out_amount = entry.value_ptr.*;
const in_amount = in_assets.get(out_id) orelse 0;
// Output amount <= input amount OR it's a new token
if (out_amount > in_amount) {
if (!std.mem.eql(u8, &out_id.digest, &new_token_id.digest) or out_amount == 0) {
return error.TokenPreservationError;
}
}
}
// Add token access cost
const token_access_cost = calculateTokenAccessCost(
in_assets.count(),
out_assets.count(),
state_context.current_parameters.token_access_cost,
);
return current_cost + token_access_cost;
}
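The token-balancing rule reduces to: for every output token, the output amount must not exceed the input amount unless the token id equals the first input's box id (a mint). An illustrative Python restatement, with dicts keyed by token id standing in for the real maps:

```python
def check_token_preservation(in_assets, out_assets, first_input_box_id):
    """in_assets / out_assets: dict token_id -> total amount (sketch)."""
    for token_id, out_amount in out_assets.items():
        in_amount = in_assets.get(token_id, 0)
        # Exceeding the input amount is only allowed for the mintable id
        if out_amount > in_amount and token_id != first_input_box_id:
            raise ValueError(f"token {token_id!r} not preserved")

# Burning is allowed; minting only under the first input's box id:
check_token_preservation({"t1": 10}, {"t1": 4}, first_input_box_id="b0")  # burn ok
check_token_preservation({}, {"b0": 1_000}, first_input_box_id="b0")      # mint ok
try:
    check_token_preservation({}, {"t2": 5}, first_input_box_id="b0")
except ValueError:
    pass  # minting under any other id is rejected
```

Using the first input's box id as the mintable token id guarantees global uniqueness, since a box can only be spent once.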
Input Script Verification
The most expensive step is verifying each input's script [13][14]:
pub fn verifyInput(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
box: *const ErgoBox,
input_index: u16,
state_context: *const ErgoStateContext,
current_cost: u64,
verifier: *const Verifier,
) TxValidationError!u64 {
const max_cost = state_context.current_parameters.max_block_cost;
const input = tx.inputs.items()[input_index];
const proof = input.spending_proof;
// Check for storage rent spending first
const ctx = try buildContext(
tx,
boxes_to_spend,
data_boxes,
input_index,
state_context,
max_cost - current_cost,
);
if (trySpendStorageRent(&input, box, state_context, &ctx)) |_| {
// Storage rent conditions satisfied, skip script verification
return current_cost + StorageConstants.STORAGE_CONTRACT_COST;
}
// Normal script verification
const result = verifier.verify(
&box.ergo_tree,
&ctx,
proof.proof,
tx.messageToSign(),
) catch return error.VerifierError;
if (!result.result) {
return error.ReducedToFalse;
}
const new_cost = current_cost + result.cost;
if (new_cost > max_cost) {
return error.CostExceeded;
}
return new_cost;
}
Context Construction
Build the evaluation context for input verification [15][16]:
pub fn buildContext(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
input_index: u16,
state_context: *const ErgoStateContext,
cost_limit: u64,
) !Context {
return Context{
.height = state_context.pre_header.height,
.self_box = &boxes_to_spend[input_index],
.inputs = boxes_to_spend,
.data_inputs = data_boxes,
.outputs = tx.outputs.items(),
.pre_header = &state_context.pre_header,
.headers = state_context.headers,
.extension = tx.contextExtension(input_index),
.cost_limit = cost_limit,
.tree_version = @intCast(state_context.block_version - 1),
};
}
Storage Rent Spending
Expired boxes can be spent without script verification [17][18]:
const StorageConstants = struct {
/// Blocks before box is eligible (~4 years)
pub const STORAGE_PERIOD: u32 = 1_051_200;
/// Context extension key for output index
pub const STORAGE_EXTENSION_INDEX: u8 = 127;
/// Cost for storage rent verification
pub const STORAGE_CONTRACT_COST: u64 = 50;
};
pub fn trySpendStorageRent(
input: *const Input,
input_box: *const ErgoBox,
state_context: *const ErgoStateContext,
ctx: *const Context,
) ?void {
// Must have empty proof
if (!input.spending_proof.proof.isEmpty()) return null;
return checkStorageRentConditions(input_box, state_context, ctx);
}
pub fn checkStorageRentConditions(
input_box: *const ErgoBox,
state_context: *const ErgoStateContext,
ctx: *const Context,
) ?void {
// Check time elapsed
const age = ctx.pre_header.height - ctx.self_box.creation_height;
if (age < StorageConstants.STORAGE_PERIOD) return null;
// Get output index from context extension
const output_idx_value = ctx.extension.values.get(
StorageConstants.STORAGE_EXTENSION_INDEX,
) orelse return null;
const output_idx = output_idx_value.extractAs(i16) orelse return null;
const output = ctx.outputs[@intCast(output_idx)];
// Calculate storage fee
const storage_fee = input_box.serializedSize() *
state_context.parameters.storage_fee_factor;
// Dust boxes can always be spent
if (ctx.self_box.value.as_u64() <= storage_fee) return {};
// Verify recreation rules
if (output.creation_height != state_context.pre_header.height) return null;
if (output.value.as_u64() < ctx.self_box.value.as_u64() - storage_fee) return null;
// Registers must be preserved (except R0 value and R3 creation info)
for (0..10) |i| {
const reg_id = RegisterId.fromByte(@intCast(i));
if (reg_id == .r0 or reg_id == .r3) continue;
if (!std.meta.eql(
ctx.self_box.getRegister(reg_id),
output.getRegister(reg_id),
)) return null;
}
return {};
}
Cost Accumulation Flow
Cost Accumulation
─────────────────────────────────────────────────────
Block accumulated cost (from previous txs)
│
├── + INTERPRETER_INIT_COST (10,000)
├── + inputs.len × inputCost
├── + data_inputs.len × dataInputCost
├── + outputs.len × outputCost
│
▼
startCost
│
├── Input[0] script → + scriptCost₀
├── Input[1] script → + scriptCost₁
├── ...
├── Input[n] script → + scriptCostₙ
│
├── Token access → + tokenAccessCost
│
▼
finalCost ≤ maxBlockCost
Each input verification receives remaining budget:
ctx.cost_limit = maxBlockCost - current_cost
Validation Rules Summary
Validation Rules Reference
─────────────────────────────────────────────────────
ID Name Phase Description
─────────────────────────────────────────────────────
100 txNoInputs Stateless ≥1 input
101 txNoOutputs Stateless ≥1 output
102 txManyInputs Stateless ≤32,767
103 txManyDataInputs Stateless ≤32,767
104 txManyOutputs Stateless ≤32,767
105 txNegativeOutput Stateless values ≥ 0
106 txOutputSum Stateless no overflow
107 txInputsUnique Stateless no duplicates
─────────────────────────────────────────────────────
120 txScriptValidation Stateful scripts pass
121 bsBlockTransactionsCost Stateful cost in limit
122 txDust Stateful min value
123 txFuture Stateful valid height
124 txErgPreservation Stateful inputs == outputs
125 txAssetsPreservation Stateful tokens balanced
126 txBoxSize Stateful ≤4KB
127 txReemission Stateful EIP-27 rules
─────────────────────────────────────────────────────
Complete Validation Flow
/// Full transaction validation
pub fn validateTransaction(
tx: *const Transaction,
utxo_state: *const UtxoState,
state_context: *const ErgoStateContext,
verifier: *const Verifier,
accumulated_cost: u64,
) !u64 {
// Phase 1: Stateless validation
try validateStateless(tx);
// Note: box resolution below uses a hypothetical append helper on slices;
// a real implementation would use std.ArrayList
// Phase 2: Resolve input boxes
var boxes_to_spend: []ErgoBox = &.{};
for (tx.inputs.items()) |input| {
const box = utxo_state.boxById(input.box_id) orelse
return error.InputBoxNotFound;
boxes_to_spend = append(boxes_to_spend, box);
}
// Phase 3: Resolve data input boxes
var data_boxes: []ErgoBox = &.{};
if (tx.data_inputs) |data_inputs| {
for (data_inputs.items()) |data_input| {
const box = utxo_state.boxById(data_input.box_id) orelse
return error.DataInputBoxNotFound;
data_boxes = append(data_boxes, box);
}
}
// Phase 4: Stateful validation
return validateStateful(
tx,
boxes_to_spend,
data_boxes,
state_context,
accumulated_cost,
verifier,
);
}
Summary
- Two-phase validation: Stateless (structural) then stateful (UTXO-dependent)
- Stateless: Count limits, no negatives, no overflow, unique inputs
- Stateful: Cost tracking, output checks, preservation rules, script verification
- Cost accumulation: Tracks across inputs, bounded by maxBlockCost
- Storage rent: Expired boxes (~4 years) spendable by anyone with recreation
- Asset preservation: ERG exactly preserved (inputs == outputs), tokens can only decrease (or mint new)
Next: Chapter 25: Cost Limits and Parameters
Scala: ErgoTransaction.scala:57-64
Rust: transaction.rs:60-96
Scala: ErgoTransaction.scala:91-115
Rust: transaction.rs:200-300
Scala: ErgoInterpreter.scala:93-96
Rust: signing.rs:143-180
Scala: ErgoContext.scala:12-29
Rust: signing.rs:46-116
Scala: ErgoInterpreter.scala:42-55
Rust: storage_rent.rs:12-78
Chapter 25: Cost Limits and Parameters
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 24 for how cost limits are enforced during validation
- Chapter 13 for JitCost and operation costs
Learning Objectives
By the end of this chapter, you will be able to:
- Explain Ergo's adjustable blockchain parameters and their governance
- Describe the miner voting mechanism for parameter changes
- Work with cost-related parameters and their default values
- Configure validation rules and soft-fork settings
Parameter System
Ergo's blockchain parameters are adjustable through miner voting [1][2]:
Parameter Governance
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ Parameters │
├─────────────────────────────────────────────────────┤
│ parameters_table: HashMap<Parameter, i32> │
│ proposed_update: ValidationSettingsUpdate │
│ height: u32 │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Parameter Types │
├─────────────────────────────────────────────────────┤
│ Cost: maxBlockCost, inputCost, outputCost... │
│ Size: maxBlockSize, minValuePerByte │
│ Fee: storageFeeFactor │
│ Version: blockVersion │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Voting Mechanism │
├─────────────────────────────────────────────────────┤
│ Miners include votes in block headers │
│ Votes tallied over epochs (1024 blocks) │
│ Majority (>= 90%) activates change │
│ Each param has min/max bounds and step size │
└─────────────────────────────────────────────────────┘
Parameter Enum
const Parameter = enum(i8) {
/// Storage fee factor (per byte per ~4 year storage period)
storage_fee_factor = 1,
/// Minimum monetary value per byte of box
min_value_per_byte = 2,
/// Maximum block size in bytes
max_block_size = 3,
/// Maximum computational cost per block
max_block_cost = 4,
/// Cost per token access
token_access_cost = 5,
/// Cost per transaction input
input_cost = 6,
/// Cost per data input
data_input_cost = 7,
/// Cost per transaction output
output_cost = 8,
/// Sub-blocks per block (v6+)
subblocks_per_block = 9,
/// Soft-fork vote
soft_fork = 120,
/// Soft-fork votes collected
soft_fork_votes = 121,
/// Soft-fork starting height
soft_fork_start_height = 122,
/// Current block version
block_version = 123,
/// Negative values indicate decrease vote
pub fn decreaseVote(self: Parameter) i8 {
return -@intFromEnum(self);
}
};
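Vote direction is encoded in the sign of the parameter id, as decreaseVote shows. A quick illustrative round-trip of that encoding in Python (parameter ids taken from the enum above):

```python
STORAGE_FEE_FACTOR = 1  # parameter ids from the enum above
MAX_BLOCK_COST = 4

def encode_vote(param_id, increase):
    # Positive id votes to raise the parameter, negative id to lower it
    return param_id if increase else -param_id

def decode_vote(vote):
    # Returns (parameter id, is_increase)
    return abs(vote), vote > 0

assert encode_vote(MAX_BLOCK_COST, increase=False) == -4
assert decode_vote(-4) == (MAX_BLOCK_COST, False)
```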
Parameters Structure
const Parameters = struct {
/// Current block height
height: u32,
/// Parameter ID -> value mapping
parameters_table: std.AutoHashMap(Parameter, i32),
/// Proposed validation settings update
proposed_update: ValidationSettingsUpdate,
/// Get block version
pub fn blockVersion(self: *const Parameters) i32 {
return self.parameters_table.get(.block_version) orelse 1;
}
/// Get max block cost
pub fn maxBlockCost(self: *const Parameters) i32 {
return self.parameters_table.get(.max_block_cost) orelse DefaultParams.MAX_BLOCK_COST;
}
/// Get input cost
pub fn inputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.input_cost) orelse DefaultParams.INPUT_COST;
}
/// Get data input cost
pub fn dataInputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.data_input_cost) orelse DefaultParams.DATA_INPUT_COST;
}
/// Get output cost
pub fn outputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.output_cost) orelse DefaultParams.OUTPUT_COST;
}
/// Get token access cost
pub fn tokenAccessCost(self: *const Parameters) i32 {
return self.parameters_table.get(.token_access_cost) orelse DefaultParams.TOKEN_ACCESS_COST;
}
/// Get storage fee factor
pub fn storageFeeFactor(self: *const Parameters) i32 {
return self.parameters_table.get(.storage_fee_factor) orelse DefaultParams.STORAGE_FEE_FACTOR;
}
/// Get min value per byte
pub fn minValuePerByte(self: *const Parameters) i32 {
return self.parameters_table.get(.min_value_per_byte) orelse DefaultParams.MIN_VALUE_PER_BYTE;
}
/// Get max block size
pub fn maxBlockSize(self: *const Parameters) i32 {
return self.parameters_table.get(.max_block_size) orelse DefaultParams.MAX_BLOCK_SIZE;
}
};
Default Values
const DefaultParams = struct {
/// Cost parameters
pub const MAX_BLOCK_COST: i32 = 1_000_000;
pub const TOKEN_ACCESS_COST: i32 = 100;
pub const INPUT_COST: i32 = 2_000;
pub const DATA_INPUT_COST: i32 = 100;
pub const OUTPUT_COST: i32 = 100;
/// Size parameters
pub const MAX_BLOCK_SIZE: i32 = 512 * 1024; // 512 KB
pub const MAX_BLOCK_SIZE_MAX: i32 = 1024 * 1024; // 1 MB
pub const MAX_BLOCK_SIZE_MIN: i32 = 16 * 1024; // 16 KB
/// Fee parameters
pub const STORAGE_FEE_FACTOR: i32 = 1_250_000; // 0.00125 ERG per byte per ~4 years
pub const STORAGE_FEE_FACTOR_MAX: i32 = 2_500_000;
pub const STORAGE_FEE_FACTOR_MIN: i32 = 0;
pub const STORAGE_FEE_FACTOR_STEP: i32 = 25_000;
/// Dust prevention
pub const MIN_VALUE_PER_BYTE: i32 = 30 * 12; // 360 nanoErgs per byte
pub const MIN_VALUE_PER_BYTE_MAX: i32 = 10_000;
pub const MIN_VALUE_PER_BYTE_MIN: i32 = 0;
pub const MIN_VALUE_PER_BYTE_STEP: i32 = 10;
/// Sub-blocks (v6+)
pub const SUBBLOCKS_PER_BLOCK: i32 = 30;
pub const SUBBLOCKS_PER_BLOCK_MIN: i32 = 2;
pub const SUBBLOCKS_PER_BLOCK_MAX: i32 = 2048;
pub const SUBBLOCKS_PER_BLOCK_STEP: i32 = 1;
/// Interpreter initialization cost
pub const INTERPRETER_INIT_COST: i32 = 10_000;
};
/// Create default parameters
pub fn defaultParameters() Parameters {
var table = std.AutoHashMap(Parameter, i32).init(allocator);
table.put(.storage_fee_factor, DefaultParams.STORAGE_FEE_FACTOR) catch {};
table.put(.min_value_per_byte, DefaultParams.MIN_VALUE_PER_BYTE) catch {};
table.put(.token_access_cost, DefaultParams.TOKEN_ACCESS_COST) catch {};
table.put(.input_cost, DefaultParams.INPUT_COST) catch {};
table.put(.data_input_cost, DefaultParams.DATA_INPUT_COST) catch {};
table.put(.output_cost, DefaultParams.OUTPUT_COST) catch {};
table.put(.max_block_size, DefaultParams.MAX_BLOCK_SIZE) catch {};
table.put(.max_block_cost, DefaultParams.MAX_BLOCK_COST) catch {};
table.put(.block_version, 1) catch {};
return Parameters{
.height = 0,
.parameters_table = table,
.proposed_update = ValidationSettingsUpdate.empty(),
};
}
Parameter Reference
Default Parameter Values
────────────────────────────────────────────────────────────────────
ID Name Default Min Max Step
────────────────────────────────────────────────────────────────────
1 storageFeeFactor 1,250,000 0 2,500,000 25,000
2 minValuePerByte 360 0 10,000 10
3 maxBlockSize 524,288 16,384 1,048,576 1%
4 maxBlockCost 1,000,000 16,384 - 1%
5 tokenAccessCost 100 - - 1%
6 inputCost 2,000 - - 1%
7 dataInputCost 100 - - 1%
8 outputCost 100 - - 1%
9 subblocksPerBlock 30 2 2,048 1
123 blockVersion 1 1 - -
────────────────────────────────────────────────────────────────────
Voting Mechanism
Miners vote for parameter changes in block headers [3]:
const VotingSettings = struct {
/// Blocks per voting epoch
pub const EPOCH_LENGTH: u32 = 1024;
/// Required approval threshold (90%)
pub const APPROVAL_THRESHOLD: f32 = 0.90;
/// Check if vote count meets approval threshold
pub fn changeApproved(self: *const VotingSettings, vote_count: u32) bool {
const threshold = @as(u32, @intFromFloat(@as(f32, @floatFromInt(EPOCH_LENGTH)) * APPROVAL_THRESHOLD));
return vote_count >= threshold;
}
};
/// Generate votes based on targets
pub fn generateVotes(
params: *const Parameters,
own_targets: std.AutoHashMap(Parameter, i32),
epoch_votes: []const struct { param: i8, count: u32 },
vote_for_fork: bool,
) []i8 {
var votes: []i8 = &.{};
for (epoch_votes) |ev| {
const param_id = ev.param;
if (param_id == @intFromEnum(Parameter.soft_fork)) {
if (vote_for_fork) {
votes = append(votes, param_id);
}
} else if (param_id > 0) {
// Vote for increase if current < target
const param: Parameter = @enumFromInt(param_id);
const current = params.parameters_table.get(param) orelse continue;
const target = own_targets.get(param) orelse continue;
if (target > current) {
votes = append(votes, param_id);
}
} else if (param_id < 0) {
// Vote for decrease if current > target
const param: Parameter = @enumFromInt(-param_id);
const current = params.parameters_table.get(param) orelse continue;
const target = own_targets.get(param) orelse continue;
if (target < current) {
votes = append(votes, param_id);
}
}
}
return padVotes(votes);
}
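With a 1,024-block epoch and a 90% threshold, 921 supporting votes are required, since the threshold is the truncated product. An illustrative check:

```python
EPOCH_LENGTH = 1024
APPROVAL_THRESHOLD = 0.90

def change_approved(vote_count):
    # Mirrors VotingSettings.changeApproved: truncated 90% of the epoch
    return vote_count >= int(EPOCH_LENGTH * APPROVAL_THRESHOLD)

assert int(EPOCH_LENGTH * APPROVAL_THRESHOLD) == 921
assert change_approved(921)
assert not change_approved(920)
```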
Parameter Update Logic
Apply votes at epoch boundaries [4]:
/// Update parameters based on epoch votes
pub fn updateParams(
params_table: std.AutoHashMap(Parameter, i32),
epoch_votes: []const struct { param: i8, count: u32 },
settings: *const VotingSettings,
) std.AutoHashMap(Parameter, i32) {
var new_table = params_table.clone() catch unreachable; // sketch: assume allocation succeeds
for (epoch_votes) |ev| {
const param_id = ev.param;
if (param_id >= @intFromEnum(Parameter.soft_fork)) continue;
const param_abs: Parameter = @enumFromInt(if (param_id < 0) -param_id else param_id);
if (settings.changeApproved(ev.count)) {
const current = new_table.get(param_abs) orelse continue;
const max_val = getMaxValue(param_abs);
const min_val = getMinValue(param_abs);
const step = getStep(param_abs, current);
const new_value = if (param_id > 0) blk: {
// Increase: cap at max
break :blk if (current < max_val) current + step else current;
} else blk: {
// Decrease: floor at min
break :blk if (current > min_val) current - step else current;
};
new_table.put(param_abs, new_value) catch {};
}
}
return new_table;
}
fn getMaxValue(param: Parameter) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MAX,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MAX,
.max_block_size => DefaultParams.MAX_BLOCK_SIZE_MAX,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MAX,
else => std.math.maxInt(i32) / 2,
};
}
fn getMinValue(param: Parameter) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MIN,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MIN,
.max_block_size => DefaultParams.MAX_BLOCK_SIZE_MIN,
.max_block_cost => 16 * 1024,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MIN,
else => 0,
};
}
fn getStep(param: Parameter, current: i32) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_STEP,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_STEP,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_STEP,
else => @max(1, @divTrunc(current, 100)), // Default 1% step
};
}
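Parameters without an explicit step move by roughly 1% of their current value per approved epoch, as getStep's fallback shows. An illustrative sketch of one update round under that rule:

```python
def default_step(current):
    # Mirrors getStep's fallback: 1% of the current value, at least 1
    return max(1, current // 100)

def apply_vote(current, increase, min_val, max_val, step):
    # Mirrors updateParams: move one step, clamped at the bound
    if increase:
        return current + step if current < max_val else current
    return current - step if current > min_val else current

# maxBlockCost rising from its default by one approved epoch:
cur = 1_000_000
nxt = apply_vote(cur, True, 16 * 1024, (2**31 - 1) // 2, default_step(cur))
assert nxt == 1_010_000
```

The percentage step means large parameters adjust quickly in absolute terms while small ones move slowly, without per-parameter tuning.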
Cost Calculation
Cost Formula
──────────────────────────────────────────────────────────────────
totalCost = interpreterInitCost // 10,000
+ inputs × inputCost // inputs × 2,000
+ dataInputs × dataInputCost // dataInputs × 100
+ outputs × outputCost // outputs × 100
+ tokenAccessCost × tokens // varies
+ scriptExecutionCost // varies per script
Example (2 inputs, 1 data input, 3 outputs, 50K script):
──────────────────────────────────────────────────────────────────
10,000 interpreter init
4,000 2 × 2,000 inputs
100 1 × 100 data inputs
300 3 × 100 outputs
50,000 script execution
──────────────────────────────────────────────────────────────────
64,400 TOTAL
/// Calculate transaction cost
pub fn calculateTransactionCost(
params: *const Parameters,
num_inputs: usize,
num_data_inputs: usize,
num_outputs: usize,
script_cost: u64,
token_ops: usize,
) u64 {
const init_cost = DefaultParams.INTERPRETER_INIT_COST;
const input_cost = params.inputCost() * @as(i32, @intCast(num_inputs));
const data_input_cost = params.dataInputCost() * @as(i32, @intCast(num_data_inputs));
const output_cost = params.outputCost() * @as(i32, @intCast(num_outputs));
const token_cost = params.tokenAccessCost() * @as(i32, @intCast(token_ops));
return @as(u64, @intCast(init_cost + input_cost + data_input_cost + output_cost + token_cost)) + script_cost;
}
/// Calculate block capacity in simple transactions
pub fn estimateBlockCapacity(params: *const Parameters) u32 {
const max_cost = params.maxBlockCost();
// Simple tx: 1 input (P2PK), 2 outputs, ~15K script cost
const simple_tx_cost = DefaultParams.INTERPRETER_INIT_COST +
params.inputCost() +
params.outputCost() * 2 +
15_000; // P2PK verification
return @intCast(@divTrunc(max_cost, simple_tx_cost));
}
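Plugging in the defaults, estimateBlockCapacity suggests roughly 36 simple P2PK transactions per 1,000,000-cost block. Illustrative arithmetic (the 15,000 script cost is the same rough P2PK estimate used above):

```python
INTERPRETER_INIT_COST = 10_000
INPUT_COST = 2_000
OUTPUT_COST = 100
P2PK_SCRIPT_COST = 15_000  # rough per-input verification estimate
MAX_BLOCK_COST = 1_000_000

# Simple tx: 1 P2PK input, 2 outputs
simple_tx_cost = INTERPRETER_INIT_COST + INPUT_COST + 2 * OUTPUT_COST + P2PK_SCRIPT_COST
capacity = MAX_BLOCK_COST // simple_tx_cost
print(simple_tx_cost, capacity)  # 27200 36
```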
Block Version History
Protocol Versions
────────────────────────────────────────────────────────
Block Version Protocol Features
────────────────────────────────────────────────────────
1 v1 Initial mainnet
2 v5.0 Script improvements
3 v5.0.12 Height monotonicity (EIP-39)
4 v6.0 Sub-blocks, new operations
────────────────────────────────────────────────────────
Script version = block_version - 1
Validation Rules
Rules can be enabled/disabled via soft-fork [7]:
const RuleStatus = struct {
/// Creates error from modifier details
create_error: fn (InvalidModifier) Invalid,
/// Which modifier types this rule applies to
affected_classes: []const ModifierType,
/// Can this rule be disabled via soft-fork?
may_be_disabled: bool,
/// Is this rule currently active?
is_active: bool,
};
/// Validation rule IDs
const ValidationRules = struct {
// Stateless (100-109)
pub const TX_NO_INPUTS: u16 = 100;
pub const TX_NO_OUTPUTS: u16 = 101;
pub const TX_MANY_INPUTS: u16 = 102;
pub const TX_MANY_DATA_INPUTS: u16 = 103;
pub const TX_MANY_OUTPUTS: u16 = 104;
pub const TX_NEGATIVE_OUTPUT: u16 = 105;
pub const TX_OUTPUT_SUM: u16 = 106;
pub const TX_INPUTS_UNIQUE: u16 = 107;
pub const TX_POSITIVE_ASSETS: u16 = 108;
pub const TX_ASSETS_IN_ONE_BOX: u16 = 109;
// Stateful (111-127)
pub const TX_DUST: u16 = 111;
pub const TX_FUTURE: u16 = 112;
pub const TX_BOXES_TO_SPEND: u16 = 113;
pub const TX_DATA_BOXES: u16 = 114;
pub const TX_INPUTS_SUM: u16 = 115;
pub const TX_ERG_PRESERVATION: u16 = 116;
pub const TX_ASSETS_PRESERVATION: u16 = 117;
pub const TX_BOX_TO_SPEND: u16 = 118;
pub const TX_SCRIPT_VALIDATION: u16 = 119;
pub const TX_BOX_SIZE: u16 = 120;
pub const TX_BOX_PROPOSITION_SIZE: u16 = 121;
pub const TX_NEG_HEIGHT: u16 = 122; // v2+
pub const TX_REEMISSION: u16 = 123; // EIP-27
pub const TX_MONOTONIC_HEIGHT: u16 = 124; // v3+
// Block rules (300+)
pub const BS_BLOCK_TX_SIZE: u16 = 306;
pub const BS_BLOCK_TX_COST: u16 = 307;
};
Rule Configurability
Rule Categories
───────────────────────────────────────────────────────────
Category Can Disable? Examples
───────────────────────────────────────────────────────────
Consensus Critical No txErgPreservation
txScriptValidation
txNoInputs
Soft-Forkable Yes txDust
txBoxSize
txReemission
Version-Gated N/A txNegHeight (v2+)
txMonotonicHeight (v3+)
───────────────────────────────────────────────────────────
/// Check if rule can be disabled
pub fn mayBeDisabled(rule: u16) bool {
return switch (rule) {
ValidationRules.TX_DUST,
ValidationRules.TX_BOX_SIZE,
ValidationRules.TX_BOX_PROPOSITION_SIZE,
ValidationRules.TX_REEMISSION,
=> true,
// Consensus-critical rules cannot be disabled
ValidationRules.TX_NO_INPUTS,
ValidationRules.TX_ERG_PRESERVATION,
ValidationRules.TX_SCRIPT_VALIDATION,
ValidationRules.TX_ASSETS_PRESERVATION,
=> false,
else => false,
};
}
Parameter Serialization
Parameters are stored in block extensions [8]:
const SYSTEM_PARAMETERS_PREFIX: u8 = 0x00;
const SOFT_FORK_DISABLING_RULES_KEY: [2]u8 = .{ 0x00, 0x01 };
/// Serialize parameters to extension candidate
pub fn toExtensionCandidate(params: *const Parameters) ExtensionCandidate {
var fields: []ExtensionField = &.{};
// Add parameter fields
var iter = params.parameters_table.iterator();
while (iter.next()) |entry| {
const key = [2]u8{ SYSTEM_PARAMETERS_PREFIX, @intFromEnum(entry.key_ptr.*) };
const value = std.mem.toBytes(std.mem.nativeToBig(i32, entry.value_ptr.*));
fields = append(fields, ExtensionField{ .key = key, .value = &value });
}
// Add proposed update
const update_bytes = params.proposed_update.serialize();
fields = append(fields, ExtensionField{
.key = SOFT_FORK_DISABLING_RULES_KEY,
.value = update_bytes,
});
return ExtensionCandidate{ .fields = fields };
}
/// Parse parameters from extension
pub fn parseExtension(height: u32, extension: *const Extension) !Parameters {
var params_table = std.AutoHashMap(Parameter, i32).init(allocator);
for (extension.fields) |field| {
if (field.key[0] == SYSTEM_PARAMETERS_PREFIX and
field.key[1] != SOFT_FORK_DISABLING_RULES_KEY[1])
{
const param: Parameter = @enumFromInt(field.key[1]);
const value = std.mem.readInt(i32, field.value[0..4], .big);
try params_table.put(param, value);
}
}
var proposed_update = ValidationSettingsUpdate.empty();
for (extension.fields) |field| {
if (std.mem.eql(u8, &field.key, &SOFT_FORK_DISABLING_RULES_KEY)) {
proposed_update = try ValidationSettingsUpdate.parse(field.value);
break;
}
}
return Parameters{
.height = height,
.parameters_table = params_table,
.proposed_update = proposed_update,
};
}
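The key/value layout above can be sketched with Python's struct module: a two-byte key (the 0x00 system prefix plus the parameter id) and a big-endian signed 32-bit value. An illustrative round-trip:

```python
import struct

SYSTEM_PARAMETERS_PREFIX = 0x00

def encode_param(param_id, value):
    # Two-byte key: system prefix, then the parameter id
    key = bytes([SYSTEM_PARAMETERS_PREFIX, param_id])
    return key, struct.pack(">i", value)  # big-endian signed 32-bit

def decode_param(key, value_bytes):
    assert key[0] == SYSTEM_PARAMETERS_PREFIX
    return key[1], struct.unpack(">i", value_bytes)[0]

key, val = encode_param(4, 1_000_000)  # maxBlockCost
assert decode_param(key, val) == (4, 1_000_000)
assert val == b"\x00\x0f\x42\x40"
```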
Summary
- Parameters adjustable via miner voting (1024-block epochs, 90% threshold)
- Cost parameters: maxBlockCost (1M), inputCost (2K), outputCost (100)
- Size parameters: maxBlockSize (512KB), minValuePerByte (360)
- Fee parameters: storageFeeFactor (1.25M nanoErgs per byte per ~4 years)
- Block version tracks protocol upgrades (script_version = block_version - 1)
- Validation rules can be consensus-critical or soft-forkable
- Parameters stored in block extensions, parsed at epoch boundaries
Next: Chapter 26: Wallet and Signing
Scala: Parameters.scala:23-26
Rust: parameters.rs:8-27
Scala: Parameters.scala:190-217
Scala: Parameters.scala:159-183
Rust: parameters.rs:62-77
Scala: Parameters.scala:220-228
Chapter 26: Wallet and Signing
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 15 for proof generation
- Chapter 23 for interpreter integration
- Chapter 11 for hint system and distributed signing
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the wallet service architecture and its role in transaction signing
- Trace the complete transaction signing flow from unsigned to signed
- Use TransactionHintsBag for distributed multi-party signing
- Implement box selection strategies for building transactions
Wallet Architecture
The wallet bridges high-level operations with the interpreter layer [1][2]:
Wallet Service Architecture
─────────────────────────────────────────────────────────
┌────────────────────────────────────────────────────────┐
│ Wallet │
├────────────────────────────────────────────────────────┤
│ prover: Box<dyn Prover> │
│ │
│ ├── from_mnemonic(phrase, pass) -> Wallet │
│ ├── from_secrets([]SecretKey) -> Wallet │
│ ├── add_secret(SecretKey) │
│ │ │
│ ├── sign_transaction(...) -> Transaction │
│ ├── sign_reduced_transaction(...) -> Transaction │
│ │ │
│ └── generate_commitments(...) -> TransactionHintsBag │
└────────────────────────────────────────────────────────┘
│
│ uses
▼
┌────────────────────────────────────────────────────────┐
│ Prover │
│ prove(tree, ctx, message, hints) -> ProverResult │
└────────────────────────────────────────────────────────┘
Wallet Structure
const Wallet = struct {
/// Underlying prover (boxed trait object)
prover: *Prover,
allocator: Allocator,
/// Create wallet from mnemonic phrase
pub fn fromMnemonic(
mnemonic_phrase: []const u8,
mnemonic_pass: []const u8,
allocator: Allocator,
) !Wallet {
const seed = Mnemonic.toSeed(mnemonic_phrase, mnemonic_pass);
const ext_sk = try ExtSecretKey.deriveMaster(seed);
return Wallet.fromSecrets(&.{ext_sk.secretKey()}, allocator);
}
/// Create wallet from secret keys
pub fn fromSecrets(secrets: []const SecretKey, allocator: Allocator) !Wallet {
const private_inputs = try allocator.alloc(PrivateInput, secrets.len);
for (secrets, 0..) |sk, i| {
private_inputs[i] = PrivateInput.from(sk);
}
return .{
.prover = TestProver.init(private_inputs, allocator),
.allocator = allocator,
};
}
/// Add secret to wallet
pub fn addSecret(self: *Wallet, secret: SecretKey) void {
self.prover.appendSecret(PrivateInput.from(secret));
}
/// Sign a transaction
pub fn signTransaction(
self: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signTransactionImpl(
self.prover,
tx_context,
state_context,
tx_hints,
);
}
/// Sign a reduced transaction
pub fn signReducedTransaction(
self: *const Wallet,
reduced_tx: *const ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signReducedTransactionImpl(
self.prover,
reduced_tx,
tx_hints,
);
}
/// Generate commitments for distributed signing
pub fn generateCommitments(
self: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
var public_keys: []SigmaBoolean = &.{};
for (self.prover.secrets()) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
return generateCommitmentsImpl(tx_context, state_context, public_keys);
}
};
Mnemonic Seed Generation
BIP-39 mnemonic to seed conversion:
const Mnemonic = struct {
/// PBKDF2 iterations per BIP-39
pub const PBKDF2_ITERATIONS: u32 = 2048;
/// Seed output length (SHA-512)
pub const SEED_LENGTH: usize = 64;
/// Convert mnemonic phrase to seed bytes
pub fn toSeed(
mnemonic_phrase: []const u8,
mnemonic_pass: []const u8,
) [SEED_LENGTH]u8 {
var seed: [SEED_LENGTH]u8 = undefined;
// Normalize to NFKD form
const normalized_phrase = normalizeNfkd(mnemonic_phrase);
const normalized_pass = normalizeNfkd(mnemonic_pass);
// Salt is "mnemonic" + password
var salt_buf: [256]u8 = undefined;
const salt_prefix = "mnemonic";
@memcpy(salt_buf[0..salt_prefix.len], salt_prefix);
@memcpy(salt_buf[salt_prefix.len..][0..normalized_pass.len], normalized_pass);
const salt = salt_buf[0 .. salt_prefix.len + normalized_pass.len];
// PBKDF2-HMAC-SHA512
pbkdf2HmacSha512(
normalized_phrase,
salt,
PBKDF2_ITERATIONS,
&seed,
);
return seed;
}
};
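Because the derivation is fully specified by BIP-39, it can be cross-checked in a few lines of Python against the standard published test vector (`hashlib.pbkdf2_hmac` implements PBKDF2-HMAC-SHA512; this is a language-neutral sketch, not part of any Ergo SDK):

```python
import hashlib
import unicodedata

def mnemonic_to_seed(phrase: str, passphrase: str = "") -> bytes:
    """BIP-39 seed: PBKDF2-HMAC-SHA512, 2048 rounds, salt = "mnemonic" + passphrase."""
    norm = lambda s: unicodedata.normalize("NFKD", s)  # normalize to NFKD form
    return hashlib.pbkdf2_hmac(
        "sha512",
        norm(phrase).encode("utf-8"),
        ("mnemonic" + norm(passphrase)).encode("utf-8"),
        2048,       # PBKDF2_ITERATIONS
        dklen=64,   # SEED_LENGTH
    )

# Standard BIP-39 test vector #1 (passphrase "TREZOR")
phrase = "abandon " * 11 + "about"
seed = mnemonic_to_seed(phrase, "TREZOR")
```

The derived seed matches the vector published with BIP-39, which is a useful sanity check for any reimplementation.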
Transaction Hints Bag
Manages hints for distributed signing (EIP-11):
const TransactionHintsBag = struct {
/// Secret hints by input index (own commitments)
secret_hints: std.AutoHashMap(usize, HintsBag),
/// Public hints by input index (other signers' commitments)
public_hints: std.AutoHashMap(usize, HintsBag),
allocator: Allocator,
pub fn empty(allocator: Allocator) TransactionHintsBag {
return .{
.secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.allocator = allocator,
};
}
/// Replace all hints for an input index
pub fn replaceHintsForInput(
self: *TransactionHintsBag,
index: usize,
hints_bag: HintsBag,
) void {
var secret_hints: []Hint = &.{};
var public_hints: []Hint = &.{};
for (hints_bag.hints) |hint| {
switch (hint) {
.own_commitment => secret_hints = append(secret_hints, hint),
else => public_hints = append(public_hints, hint),
}
}
self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
}
/// Add hints for an input (accumulate with existing)
pub fn addHintsForInput(
self: *TransactionHintsBag,
index: usize,
hints_bag: HintsBag,
) void {
var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
var existing_public = self.public_hints.get(index) orelse HintsBag.empty();
for (hints_bag.hints) |hint| {
switch (hint) {
.own_commitment => existing_secret.hints = append(existing_secret.hints, hint),
else => existing_public.hints = append(existing_public.hints, hint),
}
}
self.secret_hints.put(index, existing_secret) catch {};
self.public_hints.put(index, existing_public) catch {};
}
/// Get all hints (secret + public) for an input
pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
var hints: []Hint = &.{};
if (self.secret_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
if (self.public_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
return HintsBag{ .hints = hints };
}
/// Get only public hints (safe to share)
pub fn publicHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
return self.public_hints.get(index) orelse HintsBag.empty();
}
};
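The essential invariant is the secret/public partition: own-commitment hints stay local, everything else may be shared. That partition can be modeled in a few lines of Python (hint kinds and class names here are illustrative, not an actual API):

```python
def partition_hints(hints):
    """Split (kind, payload) hints into (secret, public) by kind."""
    secret = [h for h in hints if h[0] == "own_commitment"]
    public = [h for h in hints if h[0] != "own_commitment"]
    return secret, public

class TxHintsBag:
    """Toy model of TransactionHintsBag: hints keyed by input index."""
    def __init__(self):
        self.secret = {}  # input index -> secret hints (never shared)
        self.public = {}  # input index -> public hints (safe to share)

    def add_hints_for_input(self, idx, hints):
        s, p = partition_hints(hints)
        self.secret.setdefault(idx, []).extend(s)
        self.public.setdefault(idx, []).extend(p)

    def all_hints_for_input(self, idx):
        return self.secret.get(idx, []) + self.public.get(idx, [])

    def public_hints_for_input(self, idx):
        return self.public.get(idx, [])

bag = TxHintsBag()
bag.add_hints_for_input(0, [("own_commitment", "r"), ("real_commitment", "g^r")])
```

Only `public_hints_for_input` output should ever cross a trust boundary; `all_hints_for_input` is what the local prover consumes.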
Distributed Signing Protocol (EIP-11)
Distributed Signing Flow
─────────────────────────────────────────────────────
Party A Party B
───────── ─────────
1. Generate Commitments
commitmentsA = generateCommitments()
commitmentsB = generateCommitments()
2. Exchange Public Hints
publicA ──────────────────────────►
◄────────────────── publicB
3. Sign with Combined Hints
combinedA = commitmentsA + publicB
partialSigA = sign(tx, combinedA)
combinedB = commitmentsB + publicA
partialSigB = sign(tx, combinedB)
4. Extract & Complete
partialSigA ─────────────────────►
extractedHints = extractHints(partialSigA)
finalTx = sign(tx, commitmentsB + extracted)
Security: Secret hints (randomness r) NEVER leave their owner.
Only public hints (commitments g^r) are exchanged.
/// Generate commitments for all transaction inputs
pub fn generateCommitments(
wallet: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
var public_keys: []SigmaBoolean = &.{};
for (wallet.prover.secrets()) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
var hints_bag = TransactionHintsBag.empty(wallet.allocator);
for (tx_context.spending_tx.inputs.items(), 0..) |_, idx| {
const ctx = try makeContext(state_context, tx_context, idx);
const input_box = tx_context.inputBoxes()[idx];
// Reduce to SigmaBoolean
const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);
// Generate commitments for propositions we can prove
const input_hints = generateCommitmentsFor(
&reduction.sigma_prop,
public_keys,
);
hints_bag.addHintsForInput(idx, input_hints);
}
return hints_bag;
}
/// Extract hints from a partial signature
pub fn extractHints(
tx: *const Transaction,
real_propositions: []const SigmaBoolean,
simulated_propositions: []const SigmaBoolean,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
allocator: Allocator,
) TransactionHintsBag {
var hints_bag = TransactionHintsBag.empty(allocator);
for (tx.inputs.items(), 0..) |input, idx| {
const proof = input.spending_proof.proof;
if (proof.isEmpty()) continue;
const box = boxes_to_spend[idx];
const extracted = extractHintsFromProof(
&box.ergo_tree,
proof.bytes(),
real_propositions,
simulated_propositions,
);
hints_bag.addHintsForInput(idx, extracted);
}
return hints_bag;
}
Box Selection
Select inputs to satisfy target balance and tokens:
const BoxSelector = struct {
/// Selects boxes to satisfy target balance and tokens
pub fn select(
self: *const BoxSelector,
inputs: []const ErgoBox,
target_balance: BoxValue,
target_tokens: []const Token,
allocator: Allocator,
) !BoxSelection {
var selected: []ErgoBox = &.{};
var total_value: u64 = 0;
var total_tokens = std.AutoHashMap(TokenId, u64).init(allocator);
defer total_tokens.deinit();
// First pass: select boxes until targets met
for (inputs) |box| {
const needed = needsMoreBoxes(
total_value,
&total_tokens,
target_balance.as_u64(),
target_tokens,
);
if (!needed) break;
selected = append(selected, box);
total_value += box.value.as_u64();
if (box.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = try total_tokens.getOrPut(token.token_id);
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// Check if targets met
if (total_value < target_balance.as_u64()) {
return error.NotEnoughCoins;
}
for (target_tokens) |target| {
const have = total_tokens.get(target.token_id) orelse 0;
if (have < target.amount.value) {
return error.NotEnoughTokens;
}
}
// Calculate change
const change = calculateChange(
total_value,
&total_tokens,
target_balance.as_u64(),
target_tokens,
);
return BoxSelection{
.boxes = try BoundedVec(ErgoBox, 1, MAX_INPUTS).fromSlice(selected),
.change_boxes = change,
};
}
};
const BoxSelection = struct {
/// Selected boxes to spend
boxes: BoundedVec(ErgoBox, 1, MAX_INPUTS),
/// Change boxes to create
change_boxes: []ErgoBoxAssetsData,
};
const BoxSelectorError = error{
NotEnoughCoins,
NotEnoughTokens,
TokenAmountError,
NotEnoughCoinsForChangeBox,
SelectedInputsOutOfBounds,
};
Transaction Signing Flow
Transaction Signing Flow
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ 1. User Request │
│ ├── Target balance │
│ └── Target tokens │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 2. Box Selection │
│ ├── BoxSelector.select(inputs, target) │
│ └── Returns: boxes + change │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 3. Build Unsigned Transaction │
│ ├── inputs: selected boxes │
│ ├── data_inputs: read-only references │
│ └── output_candidates: targets + change │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 4. Sign Transaction │
│ For each input: │
│ ├── Create Context │
│ ├── Get hints for input │
│ ├── prover.prove(tree, ctx, message, hints) │
│ └── Accumulate cost │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 5. Signed Transaction │
│ └── Submit to mempool │
└──────────────────────────────────────────────────┘
/// Sign transaction with prover
pub fn signTransaction(
prover: *const Prover,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const tx = tx_context.spending_tx;
const message = try tx.bytesToSign();
var signed_inputs: []Input = &.{};
for (tx.inputs.items(), 0..) |unsigned_input, idx| {
const ctx = try makeContext(state_context, tx_context, idx);
// Get hints for this input
const hints = if (tx_hints) |h| h.allHintsForInput(idx) else HintsBag.empty();
const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
return error.InputBoxNotFound;
// Generate proof
const prover_result = try prover.prove(
&input_box.ergo_tree,
&ctx,
message,
&hints,
);
signed_inputs = append(signed_inputs, Input{
.box_id = unsigned_input.box_id,
.spending_proof = prover_result,
});
}
return Transaction.new(
try TxIoVec(Input).fromSlice(signed_inputs),
tx.data_inputs,
tx.output_candidates,
);
}
Asset Extraction
Calculate token access costs:
const ErgoBoxAssetExtractor = struct {
pub const MAX_ASSETS_PER_BOX: usize = 255;
/// Extract total token amounts from boxes
pub fn extractAssets(
boxes: []const ErgoBoxCandidate,
allocator: Allocator,
) !struct { assets: std.AutoHashMap(TokenId, u64), count: usize } {
var assets = std.AutoHashMap(TokenId, u64).init(allocator);
var total_count: usize = 0;
for (boxes) |box| {
if (box.tokens) |tokens| {
if (tokens.len() > MAX_ASSETS_PER_BOX) {
return error.TooManyAssetsInBox;
}
for (tokens.items()) |token| {
const entry = try assets.getOrPut(token.token_id);
if (entry.found_existing) {
entry.value_ptr.* = std.math.add(
u64,
entry.value_ptr.*,
token.amount.value,
) catch return error.Overflow;
} else {
entry.value_ptr.* = token.amount.value;
}
}
total_count += tokens.len();
}
}
return .{ .assets = assets, .count = total_count };
}
/// Calculate total token access cost
pub fn totalAssetsAccessCost(
in_assets_num: usize,
in_assets_size: usize,
out_assets_num: usize,
out_assets_size: usize,
token_access_cost: u32,
) u64 {
// Cost to iterate through all tokens
const all_assets_cost = @as(u64, out_assets_num + in_assets_num) * token_access_cost;
// Cost to check preservation of unique tokens
const unique_assets_cost = @as(u64, in_assets_size + out_assets_size) * token_access_cost;
return all_assets_cost + unique_assets_cost;
}
};
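The cost formula is simple enough to check directly; a Python transcription of the function above (parameter names mirror the Zig version: `*_num` counts token occurrences, `*_unique` counts distinct token ids):

```python
def total_assets_access_cost(in_num, in_unique, out_num, out_unique,
                             token_access_cost):
    """Token access cost: one charge per occurrence plus one per unique id."""
    all_assets = (out_num + in_num) * token_access_cost       # iterate all tokens
    unique_assets = (in_unique + out_unique) * token_access_cost  # preservation checks
    return all_assets + unique_assets

# 3 input occurrences (2 unique) + 4 output occurrences (2 unique), cost 100 each
cost = total_assets_access_cost(3, 2, 4, 2, 100)
```

With `token_access_cost = 100`, 7 occurrences and 4 unique ids give 700 + 400 = 1100.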
Wallet Errors
const WalletError = error{
/// Transaction signing failed
TxSigningError,
/// Prover failed to generate proof
ProverError,
/// Key derivation failed
ExtSecretKeyError,
/// Secret key parsing failed
SecretKeyParsingError,
/// Wallet not initialized
WalletNotInitialized,
/// Wallet locked
WalletLocked,
/// Wallet already unlocked
WalletAlreadyUnlocked,
/// Box selection failed
BoxSelectionError,
};
Distributed Signing Example
// Party A: Generate commitments
const commitments_a = try wallet_a.generateCommitments(&tx_context, &state_context);
// Party B: Generate commitments
const commitments_b = try wallet_b.generateCommitments(&tx_context, &state_context);
// Exchange public hints (safe to share)
const public_a = commitments_a.publicHintsForInput(0);
const public_b = commitments_b.publicHintsForInput(0);
// Party A: Sign with combined hints
var combined_a = commitments_a;
combined_a.addHintsForInput(0, public_b);
const partial_sig_a = try wallet_a.signTransaction(&tx_context, &state_context, &combined_a);
// Party B: Extract hints from A's partial signature
const extracted = extractHints(
&partial_sig_a,
real_propositions,
simulated_propositions,
boxes_to_spend,
data_boxes,
);
// Party B: Complete signing
var final_hints = commitments_b;
final_hints.addHintsForInput(0, extracted.allHintsForInput(0));
const final_tx = try wallet_b.signTransaction(&tx_context, &state_context, &final_hints);
Summary
- Wallet wraps prover with high-level signing API
- Mnemonic converts BIP-39 phrase to seed via PBKDF2
- TransactionHintsBag separates secret/public hints for distributed signing
- BoxSelector finds optimal input set for target balance/tokens
- Distributed signing (EIP-11) exchanges commitments, never secrets
- Asset extraction calculates token access costs
Next: Chapter 27: High-Level SDK
Rust: wallet.rs:52-93
Scala: Mnemonic.scala
Rust: mnemonic.rs:20-37
Rust: wallet.rs:259-347
Scala: BoxSelector.scala
Rust: box_selector.rs:34-46
Chapter 27: High-Level SDK
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 24 for transaction structure and validation
- Chapter 15 for proof generation
- Chapter 26 for wallet integration
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the SDK architecture layers from cryptography to transaction building
- Use TxBuilder with the builder pattern for ergonomic transaction construction
- Trace the reduce-then-sign pipeline for transaction signing
- Work with TransactionContext and BoxSelection for complex transaction scenarios
SDK Architecture
The SDK provides a layered abstraction from low-level cryptography to high-level transaction building:
SDK Layer Architecture
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ Application Layer │
│ TxBuilder BoxSelector ErgoBoxCandidateBuilder │
├────────────────────────────────────────────────────────────────┤
│ Wallet Layer │
│ Wallet TransactionContext TransactionHintsBag │
├────────────────────────────────────────────────────────────────┤
│ Reduction Layer │
│ reduce_tx() ReducedTransaction ReducedInput │
├────────────────────────────────────────────────────────────────┤
│ Signing Layer │
│ sign_transaction() sign_reduced_transaction() │
├────────────────────────────────────────────────────────────────┤
│ Interpreter Layer │
│ Prover Verifier reduce_to_crypto() │
└────────────────────────────────────────────────────────────────┘
Transaction Builder
The builder pattern constructs unsigned transactions with validation:
const TxBuilder = struct {
box_selection: BoxSelection,
data_inputs: std.ArrayList(DataInput),
output_candidates: std.ArrayList(ErgoBoxCandidate),
current_height: u32,
fee_amount: BoxValue,
change_address: Address,
context_extensions: std.AutoHashMap(BoxId, ContextExtension),
token_burn_permit: std.ArrayList(Token),
allocator: Allocator,
pub fn init(
box_selection: BoxSelection,
output_candidates: []const ErgoBoxCandidate,
current_height: u32,
fee_amount: BoxValue,
change_address: Address,
allocator: Allocator,
) !TxBuilder {
var outputs = std.ArrayList(ErgoBoxCandidate).init(allocator);
try outputs.appendSlice(output_candidates);
return .{
.box_selection = box_selection,
.data_inputs = std.ArrayList(DataInput).init(allocator),
.output_candidates = outputs,
.current_height = current_height,
.fee_amount = fee_amount,
.change_address = change_address,
.context_extensions = std.AutoHashMap(BoxId, ContextExtension).init(allocator),
.token_burn_permit = std.ArrayList(Token).init(allocator),
.allocator = allocator,
};
}
pub fn deinit(self: *TxBuilder) void {
self.data_inputs.deinit();
self.output_candidates.deinit();
self.context_extensions.deinit();
self.token_burn_permit.deinit();
}
pub fn setDataInputs(self: *TxBuilder, data_inputs: []const DataInput) !void {
self.data_inputs.clearRetainingCapacity();
try self.data_inputs.appendSlice(data_inputs);
}
pub fn setContextExtension(self: *TxBuilder, box_id: BoxId, ext: ContextExtension) !void {
try self.context_extensions.put(box_id, ext);
}
pub fn setTokenBurnPermit(self: *TxBuilder, tokens: []const Token) !void {
self.token_burn_permit.clearRetainingCapacity();
try self.token_burn_permit.appendSlice(tokens);
}
};
Build Validation
Building performs comprehensive validation before creating the transaction:
pub fn build(self: *TxBuilder) !UnsignedTransaction {
// Validate inputs
if (self.box_selection.boxes.items.len == 0) {
return error.EmptyInputs;
}
if (self.output_candidates.items.len == 0) {
return error.EmptyOutputs;
}
if (self.box_selection.boxes.items.len > std.math.maxInt(u16)) {
return error.TooManyInputs;
}
// Check for duplicate inputs
var seen = std.AutoHashMap(BoxId, void).init(self.allocator);
defer seen.deinit();
for (self.box_selection.boxes.items) |box| {
const result = try seen.getOrPut(box.box_id);
if (result.found_existing) {
return error.DuplicateInputs;
}
}
// Build output candidates with change boxes
var all_outputs = try self.buildOutputCandidates();
defer all_outputs.deinit();
// Validate coin preservation
const total_in = sumValue(self.box_selection.boxes.items);
const total_out = sumValue(all_outputs.items);
if (total_out > total_in) {
return error.NotEnoughCoinsInInputs;
}
if (total_out < total_in) {
return error.NotEnoughCoinsInOutputs;
}
// Validate token balance
try self.validateTokenBalance(all_outputs.items);
// Create unsigned inputs with context extensions
var unsigned_inputs = std.ArrayList(UnsignedInput).init(self.allocator);
for (self.box_selection.boxes.items) |box| {
const ext = self.context_extensions.get(box.box_id) orelse
ContextExtension.empty();
try unsigned_inputs.append(.{
.box_id = box.box_id,
.extension = ext,
});
}
return UnsignedTransaction{
.inputs = try unsigned_inputs.toOwnedSlice(),
.data_inputs = try self.data_inputs.toOwnedSlice(),
.output_candidates = try all_outputs.toOwnedSlice(),
};
}
fn buildOutputCandidates(self: *TxBuilder) !std.ArrayList(ErgoBoxCandidate) {
var outputs = std.ArrayList(ErgoBoxCandidate).init(self.allocator);
// Add user-specified outputs
try outputs.appendSlice(self.output_candidates.items);
// Add change boxes from selection
const change_tree = try Contract.payToAddress(self.change_address);
for (self.box_selection.change_boxes.items) |change| {
var candidate = try ErgoBoxCandidateBuilder.init(
change.value,
change_tree,
self.current_height,
self.allocator,
);
for (change.tokens) |token| {
try candidate.addToken(token);
}
try outputs.append(try candidate.build());
}
// Add miner fee box
const fee_box = try newMinerFeeBox(self.fee_amount, self.current_height);
try outputs.append(fee_box);
return outputs;
}
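The structural checks in build() can be condensed into a language-neutral sketch, covering only the validation (no transaction construction; error strings mirror the Zig error names and are otherwise illustrative):

```python
def validate_build(input_ids, input_values, output_values):
    """Build-time checks: non-empty sides, unique inputs, exact coin balance."""
    if not input_ids:
        raise ValueError("EmptyInputs")
    if not output_values:
        raise ValueError("EmptyOutputs")
    # Duplicate inputs would double-spend the same box
    if len(set(input_ids)) != len(input_ids):
        raise ValueError("DuplicateInputs")
    # Coin preservation: outputs (including fee and change) must exactly
    # equal inputs -- both directions are errors
    total_in, total_out = sum(input_values), sum(output_values)
    if total_out > total_in:
        raise ValueError("NotEnoughCoinsInInputs")
    if total_out < total_in:
        raise ValueError("NotEnoughCoinsInOutputs")

validate_build(["a", "b"], [100, 50], [140, 10])  # balanced: passes
```

Note that both inequalities are rejected: an unsigned transaction must balance exactly once change and fee boxes are added.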
Token Balance Validation
Token flow must be explicitly validated:
fn validateTokenBalance(self: *TxBuilder, outputs: []const ErgoBoxCandidate) !void {
var input_tokens = try sumTokens(self.box_selection.boxes.items, self.allocator);
defer input_tokens.deinit();
var output_tokens = try sumTokens(outputs, self.allocator);
defer output_tokens.deinit();
// Token minting rule: new tokens can ONLY have token_id == first_input.box_id
// You can mint any AMOUNT of this token type, but only ONE token type per tx.
const first_input_id = TokenId.fromBoxId(self.box_selection.boxes.items[0].box_id);
// Separate minted tokens (first_input_id) from transferred tokens
var has_minted_token = false;
var output_without_minted = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
defer output_without_minted.deinit();
var iter = output_tokens.iterator();
while (iter.next()) |entry| {
if (entry.key_ptr.*.eql(first_input_id)) {
has_minted_token = true;
// Note: any amount is allowed for the minted token
} else {
try output_without_minted.put(entry.key_ptr.*, entry.value_ptr.*);
}
}
_ = has_minted_token; // Minting itself needs no amount check: any amount of the newly minted token is allowed
// Check all output tokens exist in inputs
var out_iter = output_without_minted.iterator();
while (out_iter.next()) |entry| {
const input_amt = input_tokens.get(entry.key_ptr.*) orelse {
return error.NotEnoughTokens;
};
if (input_amt < entry.value_ptr.*) {
return error.NotEnoughTokens;
}
}
// Check token burn permits
var burned = try subtractTokens(input_tokens, output_without_minted, self.allocator);
defer burned.deinit();
try self.checkBurnPermit(burned);
}
fn checkBurnPermit(self: *TxBuilder, burned: std.AutoHashMap(TokenId, TokenAmount)) !void {
// Build permit map
var permits = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
defer permits.deinit();
for (self.token_burn_permit.items) |token| {
try permits.put(token.id, token.amount);
}
// Every burned token must have permit
var iter = burned.iterator();
while (iter.next()) |entry| {
const permit_amt = permits.get(entry.key_ptr.*) orelse {
return error.TokenBurnPermitMissing;
};
if (entry.value_ptr.* > permit_amt) {
return error.TokenBurnPermitExceeded;
}
}
// Every permit must be used exactly
var permit_iter = permits.iterator();
while (permit_iter.next()) |entry| {
const burned_amt = burned.get(entry.key_ptr.*) orelse {
return error.TokenBurnPermitUnused;
};
if (burned_amt < entry.value_ptr.*) {
return error.TokenBurnPermitUnused;
}
}
}
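The burn-permit rules reduce to a small sketch: burned = inputs minus outputs per token, every burned amount needs a permit at least that large, and every permit must be fully consumed (token maps are plain dicts here; error strings mirror the Zig error names):

```python
def check_burn_permits(input_tokens, output_tokens, permits):
    """Explicit burn accounting: permits must cover burns exactly."""
    # Tokens present in inputs but missing (or reduced) in outputs are burned
    burned = {}
    for tid, amt in input_tokens.items():
        diff = amt - output_tokens.get(tid, 0)
        if diff > 0:
            burned[tid] = diff
    # Every burned token must have a sufficiently large permit
    for tid, amt in burned.items():
        if tid not in permits:
            raise ValueError("TokenBurnPermitMissing")
        if amt > permits[tid]:
            raise ValueError("TokenBurnPermitExceeded")
    # Every permit must be used in full (guards against accidental burns)
    for tid, amt in permits.items():
        if burned.get(tid, 0) < amt:
            raise ValueError("TokenBurnPermitUnused")

check_burn_permits({"a": 10}, {"a": 7}, {"a": 3})  # burn 3 with permit 3: ok
```

The unused-permit check is what makes burning opt-in in both directions: you cannot burn without a permit, and a permit you do not use is itself an error.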
Box Candidate Builder
Constructs output boxes with fluent API:
const ErgoBoxCandidateBuilder = struct {
value: BoxValue,
ergo_tree: ErgoTree,
creation_height: u32,
tokens: std.ArrayList(Token),
registers: [6]?Constant, // R4-R9
allocator: Allocator,
pub fn init(
value: BoxValue,
ergo_tree: ErgoTree,
creation_height: u32,
allocator: Allocator,
) !ErgoBoxCandidateBuilder {
return .{
.value = value,
.ergo_tree = ergo_tree,
.creation_height = creation_height,
.tokens = std.ArrayList(Token).init(allocator),
.registers = [_]?Constant{null} ** 6,
.allocator = allocator,
};
}
pub fn addToken(self: *ErgoBoxCandidateBuilder, token: Token) !void {
if (self.tokens.items.len >= MAX_TOKENS) {
return error.TooManyTokens;
}
try self.tokens.append(token);
}
pub fn mintToken(
self: *ErgoBoxCandidateBuilder,
token: Token,
name: []const u8,
description: []const u8,
decimals: u8,
) !void {
try self.addToken(token);
// Store metadata in R4-R6
self.registers[0] = Constant.fromBytes(name);
self.registers[1] = Constant.fromBytes(description);
self.registers[2] = Constant.fromByte(decimals);
}
pub fn setRegister(self: *ErgoBoxCandidateBuilder, reg: RegisterId, value: Constant) void {
const idx = @intFromEnum(reg) - 4; // R4 = 0, R5 = 1, etc.
self.registers[idx] = value;
}
pub fn build(self: *ErgoBoxCandidateBuilder) !ErgoBoxCandidate {
return ErgoBoxCandidate{
.value = self.value,
.ergo_tree = self.ergo_tree,
.creation_height = self.creation_height,
.tokens = try self.tokens.toOwnedSlice(),
.additional_registers = self.registers,
};
}
};
Transaction Context
Bundles transaction with input boxes for signing:
const TransactionContext = struct {
spending_tx: UnsignedTransaction,
input_boxes: []const ErgoBox,
data_boxes: ?[]const ErgoBox,
pub fn init(
spending_tx: UnsignedTransaction,
input_boxes: []const ErgoBox,
data_boxes: ?[]const ErgoBox,
) !TransactionContext {
// Validate input boxes match transaction inputs
if (input_boxes.len != spending_tx.inputs.len) {
return error.InputBoxCountMismatch;
}
for (spending_tx.inputs, input_boxes) |input, box| {
if (!input.box_id.eql(box.box_id())) {
return error.InputBoxIdMismatch;
}
}
// Validate data boxes if present
if (spending_tx.data_inputs) |data_inputs| {
const data = data_boxes orelse return error.DataInputBoxNotFound;
if (data.len != data_inputs.len) {
return error.DataInputBoxCountMismatch;
}
}
return .{
.spending_tx = spending_tx,
.input_boxes = input_boxes,
.data_boxes = data_boxes,
};
}
pub fn getInputBox(self: *const TransactionContext, box_id: BoxId) ?*const ErgoBox {
for (self.input_boxes) |*box| {
if (box.box_id().eql(box_id)) {
return box;
}
}
return null;
}
};
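The consistency checks in init() can be sketched independently of the box types (box ids modeled as strings; error strings mirror the Zig error names):

```python
def validate_tx_context(input_ids, input_box_ids, data_input_count, data_box_count):
    """One resolved box per input, ids matching pairwise, one box per data input."""
    if len(input_box_ids) != len(input_ids):
        raise ValueError("InputBoxCountMismatch")
    for want, got in zip(input_ids, input_box_ids):
        if want != got:
            raise ValueError("InputBoxIdMismatch")
    # Data boxes are only required when the tx declares data inputs
    if data_input_count and data_box_count != data_input_count:
        raise ValueError("DataInputBoxCountMismatch")

validate_tx_context(["b1", "b2"], ["b1", "b2"], 0, 0)  # consistent: passes
```

Doing these checks at construction time means the signing loop can index input boxes positionally without re-validating.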
Box Selection
Selects input boxes to satisfy output requirements:
const BoxSelection = struct {
boxes: std.ArrayList(ErgoBox),
change_boxes: std.ArrayList(ErgoBoxAssets),
const ErgoBoxAssets = struct {
value: BoxValue,
tokens: []const Token,
};
};
const SimpleBoxSelector = struct {
pub fn select(
available: []const ErgoBox,
target_value: BoxValue,
target_tokens: []const Token,
allocator: Allocator,
) !BoxSelection {
var selected = std.ArrayList(ErgoBox).init(allocator);
var total_value: u64 = 0;
var token_sums = std.AutoHashMap(TokenId, TokenAmount).init(allocator);
defer token_sums.deinit();
// Greedy selection
for (available) |box| {
const needed = checkNeed(total_value, target_value, token_sums, target_tokens);
if (!needed) break;
try selected.append(box);
total_value += box.value.as_u64();
for (box.tokens) |token| {
const entry = try token_sums.getOrPut(token.id);
if (entry.found_existing) {
entry.value_ptr.* = try entry.value_ptr.*.checkedAdd(token.amount);
} else {
entry.value_ptr.* = token.amount;
}
}
}
// Ensure the target was actually reached before computing change
if (total_value < target_value.as_u64()) {
return error.NotEnoughCoins;
}
// Calculate change
var change_boxes = std.ArrayList(BoxSelection.ErgoBoxAssets).init(allocator);
const change_value = total_value - target_value.as_u64();
if (change_value > 0) {
const change_tokens = try calculateChangeTokens(token_sums, target_tokens, allocator);
try change_boxes.append(.{
.value = BoxValue.init(change_value) catch return error.ChangeValueTooSmall,
.tokens = change_tokens,
});
}
return .{
.boxes = selected,
.change_boxes = change_boxes,
};
}
};
Reduced Transaction
Script reduction separates evaluation from signing:
const ReducedInput = struct {
sigma_prop: SigmaBoolean,
cost: u64,
extension: ContextExtension,
};
const ReducedTransaction = struct {
unsigned_tx: UnsignedTransaction,
reduced_inputs: []const ReducedInput,
tx_cost: u32,
pub fn reducedInputs(self: *const ReducedTransaction) []const ReducedInput {
return self.reduced_inputs;
}
};
/// Reduce transaction inputs to sigma propositions
pub fn reduceTx(
tx_context: TransactionContext,
state_context: *const ErgoStateContext,
allocator: Allocator,
) !ReducedTransaction {
var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
for (tx_context.spending_tx.inputs, 0..) |input, idx| {
// Build evaluation context
var ctx = try makeContext(state_context, &tx_context, idx);
// Get input box
const input_box = tx_context.getInputBox(input.box_id) orelse
return error.InputBoxNotFound;
// Reduce ErgoTree to SigmaBoolean
const result = try reduceToCrypto(&input_box.ergo_tree, &ctx);
try reduced_inputs.append(.{
.sigma_prop = result.sigma_prop,
.cost = result.cost,
.extension = input.extension,
});
}
return .{
.unsigned_tx = tx_context.spending_tx,
.reduced_inputs = try reduced_inputs.toOwnedSlice(),
.tx_cost = 0,
};
}
Signing Pipeline
Signing Flow
══════════════════════════════════════════════════════════════════
┌─────────────────┐ ┌──────────────────┐ ┌───────────────┐
│ UnsignedTx │ │ ReducedTx │ │ SignedTx │
│ + InputBoxes │────▶│ (SigmaProps) │────▶│ (Proofs) │
│ + StateContext │ │ │ │ │
└─────────────────┘ └──────────────────┘ └───────────────┘
│ │ │
│ reduce_tx() │ sign_reduced_tx() │
│ (needs context) │ (context-free) │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Online │ │ Offline │ │ Verify │
│ Wallet │ │ Wallet │ │ Node │
└─────────┘ └─────────┘ └─────────┘
Transaction signing with optional hints:
pub fn signTransaction(
prover: *const Prover,
tx_context: TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const message = try tx_context.spending_tx.bytesToSign();
var signed_inputs = std.ArrayList(Input).init(prover.allocator);
for (tx_context.spending_tx.inputs, 0..) |_, idx| {
const signed = try signTxInput(
prover,
&tx_context,
state_context,
tx_hints,
idx,
message,
);
try signed_inputs.append(signed);
}
return Transaction{
.inputs = try signed_inputs.toOwnedSlice(),
.data_inputs = tx_context.spending_tx.data_inputs,
.outputs = tx_context.spending_tx.output_candidates,
};
}
pub fn signReducedTransaction(
prover: *const Prover,
reduced_tx: ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const message = try reduced_tx.unsigned_tx.bytesToSign();
var signed_inputs = std.ArrayList(Input).init(prover.allocator);
for (reduced_tx.unsigned_tx.inputs, 0..) |input, idx| {
const reduced_input = reduced_tx.reduced_inputs[idx];
// Get hints for this input
const hints = if (tx_hints) |bag|
bag.allHintsForInput(idx)
else
HintsBag.empty();
// Generate proof from sigma proposition
const proof = try prover.generateProof(
reduced_input.sigma_prop,
message,
&hints,
);
try signed_inputs.append(.{
.box_id = input.box_id,
.spending_proof = .{
.proof = proof,
.extension = reduced_input.extension,
},
});
}
return Transaction{
.inputs = try signed_inputs.toOwnedSlice(),
.data_inputs = reduced_tx.unsigned_tx.data_inputs,
.outputs = reduced_tx.unsigned_tx.output_candidates,
};
}
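The key property is the split: reduction needs the chain context, while signing a reduced transaction needs only secrets. A toy Python model makes the dependency explicit (propositions are stand-in strings, not real sigma trees; all names are illustrative):

```python
def reduce_tx(input_scripts, chain_ctx):
    """Online step: evaluate each input's script against the context,
    leaving only the proposition that must be proven."""
    return [chain_ctx["resolve"](script) for script in input_scripts]

def sign_reduced(reduced, secrets):
    """Offline step: produce a proof per proposition using only secrets --
    no blockchain context is consulted here."""
    proofs = []
    for prop in reduced:
        if prop not in secrets:
            raise ValueError("ProverError")
        proofs.append(("proof", prop))
    return proofs

# "resolve" stands in for reduce_to_crypto: script + context -> proposition
ctx = {"resolve": lambda script: script.split(":")[1]}
reduced = reduce_tx(["pk:alice", "pk:bob"], ctx)   # hot wallet, needs context
proofs = sign_reduced(reduced, {"alice", "bob"})   # cold wallet, context-free
```

This is exactly why a ReducedTransaction can cross an air gap: everything context-dependent was resolved before serialization.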
Miner Fee Box
Standard miner fee output:
/// Standard miner fee ErgoTree (spendable only by a miner, after a height delay)
const MINERS_FEE_ERGO_TREE = [_]u8{
0x10, 0x05, 0x04, 0x00, 0x04, 0x00, 0x0e, 0x36,
0x10, 0x02, 0x04, 0xa0, 0x0b, 0x08, 0xcd, 0x02,
// ... (standard miner fee script)
};
pub fn newMinerFeeBox(fee: BoxValue, creation_height: u32) !ErgoBoxCandidate {
const tree = try ErgoTree.sigmaParse(&MINERS_FEE_ERGO_TREE);
return ErgoBoxCandidate{
.value = fee,
.ergo_tree = tree,
.creation_height = creation_height,
.tokens = &[_]Token{},
.additional_registers = [_]?Constant{null} ** 6,
};
}
/// Suggested transaction fee (1.1 mERG)
pub const SUGGESTED_TX_FEE = BoxValue.init(1_100_000) catch unreachable;
Reduced Transaction Serialization
EIP-19 format for cold wallet transfer:
const ReducedTransactionSerializer = struct {
pub fn serialize(tx: *const ReducedTransaction, writer: anytype) !void {
// Write message to sign (includes all tx data)
const msg = try tx.unsigned_tx.bytesToSign();
try writer.writeInt(u32, @intCast(msg.len), .little);
try writer.writeAll(msg);
// Write reduced inputs
for (tx.reduced_inputs) |red_in| {
try SigmaBoolean.serialize(&red_in.sigma_prop, writer);
try writer.writeInt(u64, red_in.cost, .little);
}
try writer.writeInt(u32, tx.tx_cost, .little);
}
pub fn parse(reader: anytype, allocator: Allocator) !ReducedTransaction {
// Read and parse message
const msg_len = try reader.readInt(u32, .little);
const msg = try allocator.alloc(u8, msg_len);
try reader.readNoEof(msg);
const tx = try Transaction.sigmaParse(msg);
// Read reduced inputs
var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
for (tx.inputs) |input| {
const sigma_prop = try SigmaBoolean.parse(reader);
const cost = try reader.readInt(u64, .little);
try reduced_inputs.append(.{
.sigma_prop = sigma_prop,
.cost = cost,
.extension = input.spending_proof.extension,
});
}
const tx_cost = try reader.readInt(u32, .little);
return .{
.unsigned_tx = tx.toUnsigned(),
.reduced_inputs = try reduced_inputs.toOwnedSlice(),
.tx_cost = tx_cost,
};
}
};
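The framing above round-trips cleanly. Here is a minimal Python sketch of the same layout (length-prefixed message, per-input proposition and cost, trailing total cost). The u32 length prefix on each proposition is a simplification standing in for SigmaBoolean's own serializer, so this is not the exact EIP-19 wire encoding; it only illustrates the structure.

```python
import struct
from io import BytesIO

def serialize_reduced(msg: bytes, inputs: list, tx_cost: int) -> bytes:
    """Frame: u32 msg length, msg, then (sigma_prop, u64 cost) per input, u32 tx cost."""
    out = BytesIO()
    out.write(struct.pack("<I", len(msg)))
    out.write(msg)
    for prop, cost in inputs:
        # Length prefix stands in for SigmaBoolean's self-delimiting encoding
        out.write(struct.pack("<I", len(prop)))
        out.write(prop)
        out.write(struct.pack("<Q", cost))
    out.write(struct.pack("<I", tx_cost))
    return out.getvalue()

def parse_reduced(data: bytes):
    buf = BytesIO(data)
    (msg_len,) = struct.unpack("<I", buf.read(4))
    msg = buf.read(msg_len)
    inputs = []
    # In the real format the input count comes from the parsed transaction;
    # here we read until only the trailing u32 tx cost remains.
    while len(data) - buf.tell() > 4:
        (plen,) = struct.unpack("<I", buf.read(4))
        prop = buf.read(plen)
        (cost,) = struct.unpack("<Q", buf.read(8))
        inputs.append((prop, cost))
    (tx_cost,) = struct.unpack("<I", buf.read(4))
    return msg, inputs, tx_cost
```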
Cold Wallet Flow
Cold Wallet Signing
══════════════════════════════════════════════════════════════════
Online Wallet (Hot) Cold Wallet (Air-gapped)
────────────────────── ────────────────────────
│ │
Build Unsigned Tx │
│ │
reduce_tx() │
│ │
Serialize ReducedTx ─────────────────────▶│
(QR code / USB) │
│ Parse ReducedTx
│ │
│ sign_reduced_tx()
│ (uses secrets)
│ │
│◀──────────────────────── Serialize SignedTx
│ (QR code / USB)
Broadcast Tx │
│ │
▼ ▼
Complete Usage Example
pub fn buildAndSignTransaction(
wallet: *const Wallet,
available_boxes: []const ErgoBox,
recipient: Address,
amount: u64,
state_context: *const ErgoStateContext,
allocator: Allocator,
) !Transaction {
const current_height = state_context.pre_header.height;
// 1. Build output
const recipient_tree = try Contract.payToAddress(recipient);
var out_builder = try ErgoBoxCandidateBuilder.init(
try BoxValue.init(amount),
recipient_tree,
current_height,
allocator,
);
const output = try out_builder.build();
// 2. Select inputs
const total_needed = try BoxValue.init(amount + SUGGESTED_TX_FEE.as_u64());
const selection = try SimpleBoxSelector.select(
available_boxes,
total_needed,
&[_]Token{},
allocator,
);
// 3. Build transaction
const change_address = wallet.getP2PKAddress();
var builder = try TxBuilder.init(
selection,
&[_]ErgoBoxCandidate{output},
current_height,
SUGGESTED_TX_FEE,
change_address,
allocator,
);
defer builder.deinit();
const unsigned_tx = try builder.build();
// 4. Create transaction context
const tx_context = try TransactionContext.init(
unsigned_tx,
selection.boxes.items,
null,
);
// 5. Sign transaction
return wallet.signTransaction(tx_context, state_context, null);
}
Summary
- TxBuilder constructs unsigned transactions with validation
- BoxSelection satisfies value and token requirements
- ErgoBoxCandidateBuilder creates output boxes with fluent API
- TransactionContext bundles transaction with input data
- reduce_tx() separates script evaluation from signing
- ReducedTransaction enables air-gapped cold wallet signing
- Token burn requires explicit permits to prevent accidents
Next: Chapter 28: Key Derivation
Scala: sdk/
Rust: wallet.rs:52-244
Rust: tx_builder.rs:41-78
Rust: tx_builder.rs:144-258
Scala: AppkitProvingInterpreter.scala (token validation)
Rust: tx_builder.rs:214-243
Scala: Transactions.scala:17-46
Rust: tx_context.rs
Scala: BoxSelectionResult.scala
Rust: box_selector.rs
Rust: reduced.rs:25-67
Rust: signing.rs:143-168
Rust: reduced.rs:108-154
Chapter 28: Key Derivation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Elliptic curve cryptography (Chapter 9)
- Hash functions (Chapter 10)
- High-level SDK (Chapter 27)
Learning Objectives
- Understand BIP-32 hierarchical deterministic key derivation
- Implement derivation paths and index encoding
- Distinguish hardened from non-hardened derivation
- Master EIP-3 key derivation for Ergo
HD Wallet Architecture
Hierarchical Deterministic (HD) wallets derive unlimited keys from a single master seed:
HD Key Derivation Tree
══════════════════════════════════════════════════════════════════
Master Seed (BIP-39)
│
HMAC-SHA512("Bitcoin seed", seed)
│
┌─────────────┴─────────────┐
│ │
Master Key Chain Code
(32 bytes) (32 bytes)
│ │
└───────────┬───────────────┘
│
Extended Master Key
│
┌────────────────┼────────────────┐
│ │ │
m/44' (Purpose) m/44'/429' m/44'/429'/0'
│ (Coin Type) (Account)
│ │ │
▼ ▼ ▼
BIP-44 Keys Ergo Keys Account Keys
Index Types
Child indices distinguish hardened from normal derivation:
const ChildIndex = union(enum) {
hardened: HardenedIndex,
normal: NormalIndex,
const HardenedIndex = struct {
value: u31, // 0 to 2^31-1
pub fn toBits(self: HardenedIndex) u32 {
return @as(u32, self.value) | HARDENED_BIT;
}
};
const NormalIndex = struct {
value: u31, // 0 to 2^31-1
pub fn toBits(self: NormalIndex) u32 {
return @as(u32, self.value);
}
pub fn next(self: NormalIndex) NormalIndex {
return .{ .value = self.value + 1 };
}
};
const HARDENED_BIT: u32 = 0x80000000; // 2^31
pub fn hardened(i: u31) ChildIndex {
return .{ .hardened = .{ .value = i } };
}
pub fn normal(i: u31) ChildIndex {
return .{ .normal = .{ .value = i } };
}
pub fn toBits(self: ChildIndex) u32 {
return switch (self) {
.hardened => |h| h.toBits(),
.normal => |n| n.toBits(),
};
}
pub fn isHardened(self: ChildIndex) bool {
return self == .hardened;
}
/// Next index at the same hardness (used to retry when a derived key is invalid)
pub fn next(self: ChildIndex) ChildIndex {
return switch (self) {
.hardened => |h| hardened(h.value + 1),
.normal => |n| normal(n.value + 1),
};
}
};
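The bit-level encoding is easy to sanity-check outside Zig. A few lines of Python reproduce the toBits mapping: hardened and normal indices share the low 31 bits and differ only in the top bit.

```python
HARDENED_BIT = 0x80000000  # 2**31

def to_bits(value: int, hardened: bool) -> int:
    """Pack a 31-bit index into the 32-bit BIP-32 serialization form."""
    assert 0 <= value < HARDENED_BIT
    return value | HARDENED_BIT if hardened else value

def from_bits(bits: int):
    """Recover (value, hardened) from the 32-bit form."""
    return bits & 0x7FFFFFFF, bool(bits & HARDENED_BIT)

# 44' (hardened) and plain 44 differ only in the top bit
print(hex(to_bits(44, True)))   # 0x8000002c
print(hex(to_bits(44, False)))  # 0x2c
```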
Hardened vs Normal Derivation
Derivation Security Properties
══════════════════════════════════════════════════════════════════
┌──────────────┬─────────────────┬─────────────────────────────────┐
│ Type │ Index Range │ Security Property │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Normal │ 0 to 2³¹-1 │ Public derivation possible │
│ │ (0, 1, 2) │ Child pubkey from parent pubkey │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Hardened │ 2³¹ to 2³²-1 │ Requires private key │
│ │ (0', 1', 2') │ Prevents key leakage │
└──────────────┴─────────────────┴─────────────────────────────────┘
Why Hardened Matters:
─────────────────────────────────────────────────────────────────
If attacker obtains:
- Child private key (leaked)
- Parent chain code (public in xpub)
With normal derivation: Attacker can compute parent private key!
With hardened derivation: Parent key remains secure
Derivation Path
Paths encode the key tree location:
const DerivationPath = struct {
indices: []const ChildIndex,
const PURPOSE: ChildIndex = ChildIndex.hardened(44);
const ERG_COIN_TYPE: ChildIndex = ChildIndex.hardened(429);
const CHANGE_EXTERNAL: ChildIndex = ChildIndex.normal(0);
/// Create EIP-3 compliant path: m/44'/429'/account'/0/address
pub fn eip3(account: u31, address: u31) DerivationPath {
return .{
.indices = &[_]ChildIndex{
PURPOSE,
ERG_COIN_TYPE,
ChildIndex.hardened(account),
CHANGE_EXTERNAL,
ChildIndex.normal(address),
},
};
}
/// Master path (empty)
pub fn master() DerivationPath {
return .{ .indices = &[_]ChildIndex{} };
}
pub fn depth(self: *const DerivationPath) usize {
return self.indices.len;
}
/// Extend path with new index
pub fn extend(self: *const DerivationPath, index: ChildIndex, allocator: Allocator) !DerivationPath {
var new_indices = try allocator.alloc(ChildIndex, self.indices.len + 1);
@memcpy(new_indices[0..self.indices.len], self.indices);
new_indices[self.indices.len] = index;
return .{ .indices = new_indices };
}
/// Increment last index
pub fn next(self: *const DerivationPath, allocator: Allocator) !DerivationPath {
if (self.indices.len == 0) return error.EmptyPath;
var new_indices = try allocator.dupe(ChildIndex, self.indices);
const last = &new_indices[new_indices.len - 1];
last.* = switch (last.*) {
.hardened => |h| ChildIndex.hardened(h.value + 1),
.normal => |n| ChildIndex.normal(n.value + 1),
};
return .{ .indices = new_indices };
}
};
Path Parsing and Display
const PathParser = struct {
pub fn parse(path_str: []const u8, allocator: Allocator) !DerivationPath {
var indices = std.ArrayList(ChildIndex).init(allocator);
var iter = std.mem.splitScalar(u8, path_str, '/');
// First element must be 'm' or 'M'
const master = iter.next() orelse return error.EmptyPath;
if (!std.mem.eql(u8, master, "m") and !std.mem.eql(u8, master, "M")) {
return error.InvalidMasterPrefix;
}
while (iter.next()) |segment| {
const is_hardened = std.mem.endsWith(u8, segment, "'");
const num_str = if (is_hardened)
segment[0 .. segment.len - 1]
else
segment;
const value = try std.fmt.parseInt(u31, num_str, 10);
const index = if (is_hardened)
ChildIndex.hardened(value)
else
ChildIndex.normal(value);
try indices.append(index);
}
return .{ .indices = try indices.toOwnedSlice() };
}
pub fn format(path: *const DerivationPath, writer: anytype) !void {
try writer.writeAll("m");
for (path.indices) |index| {
try writer.writeAll("/");
switch (index) {
.hardened => |h| try writer.print("{}'", .{h.value}),
.normal => |n| try writer.print("{}", .{n.value}),
}
}
}
};
EIP-3 Derivation Standard
Ergo's EIP-3 defines the derivation structure:
EIP-3 Path Structure
══════════════════════════════════════════════════════════════════
m / 44' / 429' / account' / change / address
│ │ │ │ │ │
│ │ │ │ │ └── Address Index (normal)
│ │ │ │ └─────────── Change: 0=external, 1=internal
│ │ │ └───────────────────── Account Index (hardened)
│ │ └────────────────────────────── Coin Type: 429 (Ergo)
│ └───────────────────────────────────── Purpose: BIP-44
└────────────────────────────────────────── Master private key
Examples:
m/44'/429'/0'/0/0 First address, first account
m/44'/429'/0'/0/1 Second address, first account
m/44'/429'/1'/0/0 First address, second account
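The path parser and formatter shown earlier are mutually inverse. A compact Python equivalent makes the round-trip property easy to test; tuples of (value, hardened) stand in for ChildIndex:

```python
def parse_path(s: str):
    """Parse "m/44'/429'/0'/0/0" into a list of (index, hardened) pairs."""
    head, *segments = s.split("/")
    if head not in ("m", "M"):
        raise ValueError("path must start with m")
    out = []
    for seg in segments:
        hardened = seg.endswith("'")
        out.append((int(seg.rstrip("'")), hardened))
    return out

def format_path(indices) -> str:
    """Format (index, hardened) pairs back into path notation."""
    parts = ["m"]
    for value, hardened in indices:
        parts.append(f"{value}'" if hardened else str(value))
    return "/".join(parts)

eip3 = [(44, True), (429, True), (0, True), (0, False), (0, False)]
print(format_path(eip3))  # m/44'/429'/0'/0/0
```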
Extended Secret Key
Extended keys pair key material with chain code:
const ExtSecretKey = struct {
key_bytes: [32]u8, // Private key scalar
chain_code: [32]u8, // Chain code for derivation
path: DerivationPath,
const BITCOIN_SEED = "Bitcoin seed";
/// Derive master key from seed
pub fn deriveMaster(seed: []const u8) !ExtSecretKey {
var hmac = HmacSha512.init(BITCOIN_SEED);
hmac.update(seed);
var output: [64]u8 = undefined;
hmac.final(&output);
return ExtSecretKey{
.key_bytes = output[0..32].*,
.chain_code = output[32..64].*,
.path = DerivationPath.master(),
};
}
/// Get public image (ProveDlog)
pub fn publicImage(self: *const ExtSecretKey) ProveDlog {
const scalar = Scalar.fromBytes(self.key_bytes);
const point = CryptoConstants.generator.mul(scalar);
return ProveDlog{ .h = point };
}
/// Get corresponding extended public key
pub fn publicKey(self: *const ExtSecretKey) !ExtPubKey {
return ExtPubKey{
.key_bytes = self.publicImage().compress(),
.chain_code = self.chain_code,
.path = self.path,
};
}
/// Zero out key material
/// SECURITY: In production, use volatile write or std.crypto.utils.secureZero
/// to prevent compiler optimization from eliding the zeroing.
pub fn zeroSecret(self: *ExtSecretKey) void {
std.crypto.utils.secureZero(u8, &self.key_bytes);
}
};
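deriveMaster is nothing more than a single HMAC-SHA512 keyed with the ASCII string "Bitcoin seed", split into key and chain code halves. This can be checked directly against BIP-32 test vector 1 in a few lines of Python:

```python
import hmac
import hashlib

def derive_master(seed: bytes):
    """Left half is the master key scalar, right half the chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]

# BIP-32 test vector 1: seed 000102030405060708090a0b0c0d0e0f
key, chain = derive_master(bytes.fromhex("000102030405060708090a0b0c0d0e0f"))
print(key.hex())    # e8f32e72...
print(chain.hex())  # 873dff81...
```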
Child Key Derivation
BIP-32 child derivation algorithm:
pub fn deriveChild(parent: *const ExtSecretKey, index: ChildIndex, allocator: Allocator) !ExtSecretKey {
var hmac = HmacSha512.init(&parent.chain_code);
// HMAC input depends on derivation type
switch (index) {
.hardened => {
// Hardened: 0x00 || parent_key (33 bytes)
hmac.update(&[_]u8{0x00});
hmac.update(&parent.key_bytes);
},
.normal => {
// Normal: parent_public_key (33 bytes compressed)
const pub_key = parent.publicImage().compress();
hmac.update(&pub_key);
},
}
// Append index as big-endian u32
var index_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
hmac.update(&index_bytes);
var output: [64]u8 = undefined;
hmac.final(&output);
// Parse left 32 bytes as scalar
const child_key_proto = Scalar.fromBytes(output[0..32].*);
// Check validity (must be < group order)
if (child_key_proto.isOverflow()) {
return deriveChild(parent, index.next(), allocator);
}
// child_key = (child_key_proto + parent_key) mod n
const parent_scalar = Scalar.fromBytes(parent.key_bytes);
const child_scalar = child_key_proto.add(parent_scalar);
// Check for zero (invalid)
if (child_scalar.isZero()) {
return deriveChild(parent, index.next(), allocator);
}
return ExtSecretKey{
.key_bytes = child_scalar.toBytes(),
.chain_code = output[32..64].*,
.path = try parent.path.extend(index, allocator),
};
}
/// Derive key at full path
pub fn derive(master: *const ExtSecretKey, path: DerivationPath, allocator: Allocator) !ExtSecretKey {
var current = master.*;
for (path.indices) |index| {
current = try deriveChild(&current, index, allocator);
}
return current;
}
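For hardened children the whole derivation is HMAC plus scalar addition modulo the secp256k1 group order — no elliptic-curve operations are needed. The sketch below reproduces the m/0' step of BIP-32 test vector 1; the overflow and zero retry paths are reduced to an assert for brevity:

```python
import hmac
import hashlib

# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_BIT = 0x80000000

def derive_hardened(parent_key: bytes, chain_code: bytes, index: int):
    """One hardened BIP-32 step: HMAC(cc, 0x00 || k_par || i), then add mod N."""
    data = b"\x00" + parent_key + (index | HARDENED_BIT).to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    il = int.from_bytes(digest[:32], "big")
    child = (il + int.from_bytes(parent_key, "big")) % N
    assert 0 < il < N and child != 0  # real code retries with the next index
    return child.to_bytes(32, "big"), digest[32:]

# BIP-32 test vector 1 master key and chain code
master = bytes.fromhex("e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35")
cc = bytes.fromhex("873dff81c02f525623fd1fe5167eac3a55a049de3d314bb42ee227ffed37d508")
child_key, child_cc = derive_hardened(master, cc, 0)
print(child_key.hex())  # edb2e14f... (test vector 1, m/0')
```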
Extended Public Key
Public key derivation (non-hardened only):
const ExtPubKey = struct {
key_bytes: [33]u8, // Compressed public key
chain_code: [32]u8,
path: DerivationPath,
pub fn deriveChild(parent: *const ExtPubKey, index: ChildIndex, allocator: Allocator) !ExtPubKey {
// Cannot derive hardened children from public key
if (index.isHardened()) {
return error.HardenedDerivationRequiresPrivateKey;
}
var hmac = HmacSha512.init(&parent.chain_code);
hmac.update(&parent.key_bytes);
var index_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
hmac.update(&index_bytes);
var output: [64]u8 = undefined;
hmac.final(&output);
const child_key_proto = Scalar.fromBytes(output[0..32].*);
if (child_key_proto.isOverflow()) {
return deriveChild(parent, index.next(), allocator);
}
// child_public = point(child_key_proto) + parent_public
const proto_point = CryptoConstants.generator.mul(child_key_proto);
const parent_point = Point.decompress(parent.key_bytes);
const child_point = proto_point.add(parent_point);
if (child_point.isInfinity()) {
return deriveChild(parent, index.next(), allocator);
}
return ExtPubKey{
.key_bytes = child_point.compress(),
.chain_code = output[32..64].*,
.path = try parent.path.extend(index, allocator),
};
}
};
Mnemonic to Seed
const Mnemonic = struct {
const PBKDF2_ITERATIONS: u32 = 2048;
const SEED_LENGTH: usize = 64;
/// Convert mnemonic phrase to seed using PBKDF2-HMAC-SHA512
pub fn toSeed(phrase: []const u8, passphrase: []const u8) [SEED_LENGTH]u8 {
var seed: [SEED_LENGTH]u8 = undefined;
// Normalize using NFKD
const normalized_phrase = normalizeNfkd(phrase);
const normalized_pass = normalizeNfkd(passphrase);
// Salt = "mnemonic" + passphrase
var salt_buf: [256]u8 = undefined;
const salt = std.fmt.bufPrint(&salt_buf, "mnemonic{s}", .{normalized_pass}) catch unreachable;
// PBKDF2-HMAC-SHA512
pbkdf2(
HmacSha512,
normalized_phrase,
salt,
PBKDF2_ITERATIONS,
&seed,
);
return seed;
}
};
/// Full derivation from mnemonic to key
pub fn mnemonicToKey(
phrase: []const u8,
passphrase: []const u8,
path: DerivationPath,
allocator: Allocator,
) !ExtSecretKey {
const seed = Mnemonic.toSeed(phrase, passphrase);
const master = try ExtSecretKey.deriveMaster(&seed);
return derive(&master, path, allocator);
}
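toSeed maps directly onto the standard-library PBKDF2 in most languages. In Python, hashlib.pbkdf2_hmac reproduces the BIP-39 test vector for the well-known all-abandon phrase (empty passphrase):

```python
import hashlib
import unicodedata

def mnemonic_to_seed(phrase: str, passphrase: str = "") -> bytes:
    """BIP-39: PBKDF2-HMAC-SHA512, 2048 iterations, salt = "mnemonic" + passphrase."""
    phrase_n = unicodedata.normalize("NFKD", phrase)
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode()
    return hashlib.pbkdf2_hmac("sha512", phrase_n.encode(), salt, 2048, dklen=64)

phrase = "abandon " * 11 + "about"
seed = mnemonic_to_seed(phrase)
print(seed.hex())  # 5eb00bbd...
```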
Path Serialization
Binary format for storage/transfer:
const DerivationPathSerializer = struct {
pub fn serialize(path: *const DerivationPath, writer: anytype) !void {
// Public branch flag (0x00 for private, 0x01 for public)
try writer.writeByte(0x00);
// Depth
try writer.writeInt(u32, @intCast(path.indices.len), .little);
// Each index as 4-byte big-endian
for (path.indices) |index| {
var bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &bytes, index.toBits(), .big);
try writer.writeAll(&bytes);
}
}
pub fn parse(reader: anytype, allocator: Allocator) !DerivationPath {
const public_branch = try reader.readByte();
_ = public_branch; // TODO: handle public branch
const depth = try reader.readInt(u32, .little);
var indices = try allocator.alloc(ChildIndex, depth);
for (0..depth) |i| {
var bytes: [4]u8 = undefined;
try reader.readNoEof(&bytes);
const bits = std.mem.readInt(u32, &bytes, .big);
indices[i] = if (bits & 0x80000000 != 0)
ChildIndex.hardened(@truncate(bits & 0x7FFFFFFF))
else
ChildIndex.normal(@truncate(bits));
}
return .{ .indices = indices };
}
};
Watch-Only Wallet
Public key derivation enables watch-only wallets:
Watch-Only Wallet Setup
══════════════════════════════════════════════════════════════════
Full Wallet (has secrets) Watch-Only Wallet (no secrets)
───────────────────────── ──────────────────────────────
Master Secret Key
│
├── m/44'/429'/0' Extended Public Key
│ (hardened account) ───▶ at m/44'/429'/0'/0
│ │ │
│ └── m/44'/429'/0'/0 ├── Address 0 public
│ (change branch) ───▶ ├── Address 1 public
│ │ ├── Address 2 public
│ ├── 0 └── ... (can derive more)
│ ├── 1
│ └── 2 Cannot derive:
│ × Account 1 keys
× Hardened children
× Private keys
Export at: m/44'/429'/0'/0 (parent of address keys)
Can derive: All non-hardened children (addresses)
Cannot derive: Hardened children, private keys
Usage Example
const allocator = std.heap.page_allocator;
// 1. From mnemonic to master key
const mnemonic = "abandon abandon abandon abandon abandon abandon " ++
"abandon abandon abandon abandon abandon about";
const seed = Mnemonic.toSeed(mnemonic, "");
var master = try ExtSecretKey.deriveMaster(&seed);
defer master.zeroSecret();
// 2. Derive first EIP-3 address key
const path = DerivationPath.eip3(0, 0); // m/44'/429'/0'/0/0
var first_key = try derive(&master, path, allocator);
defer first_key.zeroSecret();
// 3. Get public image for address
const pub_key = first_key.publicImage();
// 4. Derive next address
const next_path = try path.next(allocator);
var second_key = try derive(&master, next_path, allocator);
defer second_key.zeroSecret();
// 5. Create watch-only wallet
const watch_only_path = try PathParser.parse("m/44'/429'/0'/0", allocator);
var account_key = try derive(&master, watch_only_path, allocator);
const watch_only = try account_key.publicKey();
// 6. Derive address public keys without secrets
const addr0_pub = try watch_only.deriveChild(ChildIndex.normal(0), allocator);
const addr1_pub = try watch_only.deriveChild(ChildIndex.normal(1), allocator);
// 7. Cannot derive hardened from public key
_ = watch_only.deriveChild(ChildIndex.hardened(0), allocator) catch |err| {
std.debug.assert(err == error.HardenedDerivationRequiresPrivateKey);
};
Security Considerations
Key Derivation Security
══════════════════════════════════════════════════════════════════
Attack: Child + Chain Code → Parent ⚠️ PRACTICAL ATTACK
────────────────────────────────────────────────────────
This is NOT theoretical - a single compromised child key
(via malware, hardware fault, or insider threat) can
recover the entire account if normal derivation was used.
Given:
- Child private key k_i
- Parent chain code c
For NORMAL derivation:
HMAC-SHA512(c, K_parent || i) = IL || IR
k_i = IL + k_parent mod n
Attacker can compute:
k_parent = k_i - IL mod n ← COMPROMISED!
For HARDENED derivation:
HMAC-SHA512(c, 0x00 || k_parent || i) = IL || IR
Cannot compute IL without knowing k_parent
→ Parent key remains SECURE
Recommendation:
└── Always use hardened derivation for account/purpose levels
└── Normal derivation only for address indices
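The recovery step is plain modular arithmetic. The sketch below uses made-up numbers for the parent key and IL (in a real attack, IL is recomputed from the parent public key, chain code, and child index via HMAC-SHA512), but the algebra is exactly the one shown above:

```python
# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Hypothetical values for illustration only
k_parent = 0x1111222233334444
il = 0xAAAABBBBCCCCDDDD  # attacker recomputes this from xpub + chain code + index

k_child = (il + k_parent) % N    # normal derivation: child = IL + parent mod n
recovered = (k_child - il) % N   # attacker's computation: parent = child - IL mod n
assert recovered == k_parent
print("parent key recovered from leaked child key")
```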
Summary
- BIP-32 defines hierarchical deterministic key derivation
- Derivation paths use the notation m/44'/429'/0'/0/0
- Hardened derivation (') requires the private key and prevents key leakage
- Normal derivation allows deriving child public keys from the parent public key
- EIP-3 standardizes Ergo's path: m/44'/429'/account'/change/address
- Extended keys = key material (32 bytes) + chain code (32 bytes)
- Watch-only wallets use extended public keys for address generation
Next: Chapter 29: Soft-Fork Mechanism
Scala: ExtendedSecretKey.scala
Rust: ext_secret_key.rs:29-37
Scala: Index.scala:5-16
Scala: DerivationPath.scala:10-29
Scala: Constants.scala:31-36
Rust: derivation_path.rs:88-91 (PURPOSE, ERG, CHANGE constants)
Rust: ext_secret_key.rs:60-112
Rust: ext_pub_key.rs
Scala: JavaHelpers.scala:282-301
Rust: mnemonic.rs:20-37
Scala: DerivationPath.scala:133-147
Rust: derivation_path.rs:235-241 (ledger_bytes)
Chapter 29: Soft-Fork Mechanism
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 3 for ErgoTree version field and header format
- Chapter 7 for serialization framework
- Chapter 24 for validation rules
Learning Objectives
By the end of this chapter, you will be able to:
- Explain version context and how script versioning enables protocol upgrades
- Implement validation rules with configurable status (enabled, disabled, soft-fork)
- Handle unknown opcodes gracefully to support future soft-forks
- Describe the transition from AOT (Ahead-of-Time) to JIT (Just-in-Time) costing
Version Context Architecture
The soft-fork mechanism enables protocol upgrades without breaking consensus:
Soft-Fork Version Architecture
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Block Header │
│ │
│ Block Version: 1, 2, 3, 4 │
│ │
│ Activated Script Version = Block Version - 1 │
└────────────────────────┬────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ErgoTree Header │
│ │
│ 7 6 5 4 3 2 1 0 │
│ ├───┼───┼───┼───┼───┼───┼───┤ │
│ │ M │ G │ C │ S │ Z │ V │ V │ V │
│ └───┴───┴───┴───┴───┴───┴───┘ │
│ M = More bytes follow │
│ G = GZIP (reserved) │
│ C = Context costing (reserved) │
│ S = Constant segregation │
│ Z = Size included │
│ V = Version (0-7) │
└─────────────────────────────────────────────────────────────────┘
ErgoTree Version
Script version is encoded in header bits 0-2:
const ErgoTreeVersion = struct {
value: u3, // 0-7
const VERSION_MASK: u8 = 0x07;
/// Version 0 - Initial mainnet (v3.x)
pub const V0 = ErgoTreeVersion{ .value = 0 };
/// Version 1 - Height monotonicity (v4.x)
pub const V1 = ErgoTreeVersion{ .value = 1 };
/// Version 2 - JIT interpreter (v5.x)
pub const V2 = ErgoTreeVersion{ .value = 2 };
/// Version 3 - Sub-blocks, new ops (v6.x)
pub const V3 = ErgoTreeVersion{ .value = 3 };
/// Maximum supported script version
pub const MAX_SCRIPT_VERSION = V3;
/// Parse version from header byte
pub fn parseVersion(header_byte: u8) ErgoTreeVersion {
return .{ .value = @truncate(header_byte & VERSION_MASK) };
}
pub fn toU8(self: ErgoTreeVersion) u8 {
return @as(u8, self.value);
}
};
ErgoTree Header
Header byte encoding with flags:
const ErgoTreeHeader = struct {
version: ErgoTreeVersion,
is_constant_segregation: bool,
has_size: bool,
const CONSTANT_SEGREGATION_FLAG: u8 = 0b0001_0000;
const HAS_SIZE_FLAG: u8 = 0b0000_1000;
/// Parse header from byte
pub fn parse(header_byte: u8) !ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.parseVersion(header_byte),
.is_constant_segregation = (header_byte & CONSTANT_SEGREGATION_FLAG) != 0,
.has_size = (header_byte & HAS_SIZE_FLAG) != 0,
};
}
/// Serialize header to byte
pub fn serialize(self: *const ErgoTreeHeader) u8 {
var header_byte: u8 = self.version.toU8();
if (self.is_constant_segregation) {
header_byte |= CONSTANT_SEGREGATION_FLAG;
}
if (self.has_size) {
header_byte |= HAS_SIZE_FLAG;
}
return header_byte;
}
/// Create v0 header
pub fn v0(constant_segregation: bool) ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.V0,
.is_constant_segregation = constant_segregation,
.has_size = false,
};
}
/// Create v1 header (size is mandatory)
pub fn v1(constant_segregation: bool) ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.V1,
.is_constant_segregation = constant_segregation,
.has_size = true,
};
}
};
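The header bit layout can be sanity-checked in a few lines of Python. Note that the miner-fee tree in Chapter 27 begins with byte 0x10: version 0 with segregated constants and no size field.

```python
VERSION_MASK = 0x07
CONST_SEGREGATION_FLAG = 0x10
HAS_SIZE_FLAG = 0x08

def parse_header(b: int) -> dict:
    """Decode an ErgoTree header byte into its version and flags."""
    return {
        "version": b & VERSION_MASK,
        "const_segregation": bool(b & CONST_SEGREGATION_FLAG),
        "has_size": bool(b & HAS_SIZE_FLAG),
    }

print(parse_header(0x10))  # {'version': 0, 'const_segregation': True, 'has_size': False}
# v1 trees must carry the size flag: 0b0001_1001 = 0x19
print(parse_header(0x19))  # {'version': 1, 'const_segregation': True, 'has_size': True}
```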
Version Context
Thread-local context tracks activated and tree versions:
const VersionContext = struct {
activated_version: u8,
ergo_tree_version: u8,
/// JIT costing activation version (v5.0)
const JIT_ACTIVATION_VERSION: u8 = 2;
/// v6.0 soft-fork version
const V6_SOFT_FORK_VERSION: u8 = 3;
pub fn init(activated: u8, tree: u8) !VersionContext {
// ergoTreeVersion must never exceed activatedVersion
if (activated >= JIT_ACTIVATION_VERSION and tree > activated) {
return error.InvalidVersionContext;
}
return .{
.activated_version = activated,
.ergo_tree_version = tree,
};
}
/// True if JIT costing is activated (v5.0+)
pub fn isJitActivated(self: *const VersionContext) bool {
return self.activated_version >= JIT_ACTIVATION_VERSION;
}
/// True if v6.0 protocol is activated
pub fn isV6Activated(self: *const VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
/// True if v3+ ErgoTree version
pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
}
};
/// Thread-local version context
threadlocal var current_context: ?VersionContext = null;
pub fn withVersions(
activated: u8,
tree: u8,
comptime block: fn (*VersionContext) anyerror!void,
) !void {
var ctx = try VersionContext.init(activated, tree);
const prev = current_context;
current_context = ctx;
defer current_context = prev;
try block(&ctx);
}
pub fn currentContext() !VersionContext {
return current_context orelse error.VersionContextNotSet;
}
Version History
Protocol Version History
══════════════════════════════════════════════════════════════════
┌─────────────┬────────────────┬──────────────┬────────────────────┐
│ Block Ver │ Script Ver │ Protocol │ Features │
├─────────────┼────────────────┼──────────────┼────────────────────┤
│ 1 │ 0 │ v3.x │ Initial mainnet │
│ 2 │ 1 │ v4.x │ Height monotonicity│
│ 3 │ 2 │ v5.x │ JIT interpreter │
│ 4 │ 3 │ v6.x │ Sub-blocks, new ops│
└─────────────┴────────────────┴──────────────┴────────────────────┘
Relation: activated_script_version = block_version - 1
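The version relation and the activation thresholds reduce to two small checks, sketched here in Python against the table above:

```python
JIT_ACTIVATION_VERSION = 2  # script v2 == protocol v5.0
V6_SOFT_FORK_VERSION = 3    # script v3 == protocol v6.0

def activated_script_version(block_version: int) -> int:
    """Activated script version is always one less than the block version."""
    return block_version - 1

def is_jit_activated(block_version: int) -> bool:
    return activated_script_version(block_version) >= JIT_ACTIVATION_VERSION

for bv, proto in [(1, "v3.x"), (2, "v4.x"), (3, "v5.x"), (4, "v6.x")]:
    sv = activated_script_version(bv)
    print(f"block {bv} -> script {sv} ({proto}), JIT: {is_jit_activated(bv)}")
```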
Rule Status
Validation rules have configurable status:
const RuleStatus = union(enum) {
/// Default: rule is active and enforced
enabled,
/// Rule is disabled (via voting)
disabled,
/// Rule replaced by new rule
replaced: struct { new_rule_id: u16 },
/// Rule parameters changed
changed: struct { new_value: []const u8 },
const StatusCode = enum(u8) {
enabled = 1,
disabled = 2,
replaced = 3,
changed = 4,
};
pub fn statusCode(self: RuleStatus) StatusCode {
return switch (self) {
.enabled => .enabled,
.disabled => .disabled,
.replaced => .replaced,
.changed => .changed,
};
}
};
Validation Rules
Rules define validation behavior with soft-fork support:
const ValidationRule = struct {
id: u16,
description: []const u8,
soft_fork_checker: SoftForkChecker,
checked: bool = false,
pub fn checkRule(self: *ValidationRule, settings: *const ValidationSettings) !void {
if (!self.checked) {
if (settings.getStatus(self.id) == null) {
return error.ValidationRuleNotFound;
}
self.checked = true;
}
}
pub fn throwValidationException(
self: *const ValidationRule,
cause: anyerror,
args: []const u8,
) ValidationError {
return ValidationError{
.rule = self,
.args = args,
.cause = cause,
};
}
};
const ValidationError = struct {
rule: *const ValidationRule,
args: []const u8,
cause: anyerror,
};
Core Validation Rules
const ValidationRules = struct {
const FIRST_RULE_ID: u16 = 1000;
/// Check primitive type code is valid
pub const CheckPrimitiveTypeCode = ValidationRule{
.id = 1007,
.description = "Check primitive type code is supported or added via soft-fork",
.soft_fork_checker = .code_added,
};
/// Check non-primitive type code is valid
pub const CheckTypeCode = ValidationRule{
.id = 1008,
.description = "Check non-primitive type code is supported or added via soft-fork",
.soft_fork_checker = .code_added,
};
/// Check data can be serialized for type
pub const CheckSerializableTypeCode = ValidationRule{
.id = 1009,
.description = "Check data values of type can be serialized",
.soft_fork_checker = .when_replaced,
};
/// Check reader position limit
pub const CheckPositionLimit = ValidationRule{
.id = 1014,
.description = "Check Reader position limit",
.soft_fork_checker = .when_replaced,
};
};
Soft-Fork Checkers
Detect soft-fork conditions from validation failures:
const SoftForkChecker = enum {
none,
when_replaced,
code_added,
pub fn isSoftFork(
self: SoftForkChecker,
settings: *const ValidationSettings,
rule_id: u16,
status: RuleStatus,
args: []const u8,
) bool {
return switch (self) {
.none => false,
.when_replaced => switch (status) {
.replaced => true,
else => false,
},
.code_added => switch (status) {
.changed => |c| std.mem.indexOf(u8, c.new_value, args) != null,
else => false,
},
};
}
};
Validation Settings
Configurable settings from blockchain state1516:
const ValidationSettings = struct {
rules: std.AutoHashMap(u16, struct { rule: *ValidationRule, status: RuleStatus }),
pub fn getStatus(self: *const ValidationSettings, id: u16) ?RuleStatus {
if (self.rules.get(id)) |entry| {
return entry.status;
}
return null;
}
pub fn updated(self: *const ValidationSettings, id: u16, new_status: RuleStatus) !ValidationSettings {
var new_rules = try self.rules.clone();
if (new_rules.getPtr(id)) |entry| {
entry.status = new_status;
}
return .{ .rules = new_rules };
}
/// Check if exception represents a soft-fork condition
pub fn isSoftFork(self: *const ValidationSettings, ve: ValidationError) bool {
const entry = self.rules.get(ve.rule.id) orelse return false;
// Don't tolerate replaced v5.0 rules after v6.0 activation
switch (entry.status) {
.replaced => {
const ctx = currentContext() catch return false;
if (ctx.isV6Activated() and
(ve.rule.id == 1011 or ve.rule.id == 1007 or ve.rule.id == 1008))
{
return false;
}
return true;
},
else => return entry.rule.soft_fork_checker.isSoftFork(
self,
ve.rule.id,
entry.status,
ve.args,
),
}
}
};
Soft-Fork Execution Wrapper
Execute code with soft-fork fallback:
pub fn trySoftForkable(
comptime T: type,
settings: *const ValidationSettings,
when_soft_fork: T,
context: anytype,
block: anytype, // fn (@TypeOf(context)) anyerror!T
last_error: *const ?ValidationError,
) !T {
return block(context) catch |err| {
// Zig error values carry no payload, so the failing rule's details
// are reported out-of-band through `last_error` (set by the block
// before it returns an error).
if (last_error.*) |ve| {
if (settings.isSoftFork(ve)) {
return when_soft_fork;
}
}
return err;
};
}
// Usage: handling unknown opcodes
fn deserializeValue(
reader: *Reader,
settings: *const ValidationSettings,
last_error: *const ?ValidationError,
) !Value {
return trySoftForkable(
Value,
settings,
// Soft-fork fallback: return unit placeholder
Value.unit,
reader,
struct {
fn parse(r: *Reader) anyerror!Value {
const op_code = try r.readByte();
const serializer = getSerializer(op_code) orelse
return error.UnknownOpCode;
return serializer.parse(r);
}
}.parse,
last_error,
);
}
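The control flow is easier to see in a language whose exceptions carry payloads. This Python sketch uses a hypothetical Settings type and mirrors the extension-voting example later in this chapter, where rule 1007 has codes 0x5A/0x5B voted in:

```python
class ValidationError(Exception):
    """Carries the failing rule id and the offending argument (e.g. an opcode)."""
    def __init__(self, rule_id: int, arg: int):
        self.rule_id, self.arg = rule_id, arg

class Settings:
    """Hypothetical settings: a set of type/op codes added via soft-fork vote."""
    def __init__(self, allowed_new_codes):
        self.allowed = allowed_new_codes
    def is_soft_fork(self, ve: ValidationError) -> bool:
        # CheckPrimitiveTypeCode-style rule: tolerate codes added by voting
        return ve.rule_id == 1007 and ve.arg in self.allowed

def try_soft_forkable(settings, when_soft_fork, block):
    """Run block(); swallow validation failures recognized as soft-fork."""
    try:
        return block()
    except ValidationError as ve:
        if settings.is_soft_fork(ve):
            return when_soft_fork
        raise

settings = Settings(allowed_new_codes={0x5A, 0x5B})

def parse_unknown():
    raise ValidationError(1007, 0x5A)  # code 0x5A unknown to this node

result = try_soft_forkable(settings, "unit-placeholder", parse_unknown)
print(result)  # unit-placeholder
```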
AOT to JIT Transition
Script Validation Rules Across Versions
══════════════════════════════════════════════════════════════════
Rule │ Block │ Block Type │ Script │ v4.0 Action │ v5.0 Action
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
1 │ 1,2 │ candidate │ v0/v1 │ AOT-cost,verify │ AOT-cost,verify
2 │ 1,2 │ mined │ v0/v1 │ AOT-cost,verify │ AOT-cost,verify
3 │ 1,2 │ candidate │ v2 │ skip-pool-tx │ skip-pool-tx
4 │ 1,2 │ mined │ v2 │ skip-reject │ skip-reject
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
5 │ 3 │ candidate │ v0/v1 │ skip-pool-tx │ JIT-verify
6 │ 3 │ mined │ v0/v1 │ skip-accept │ JIT-verify
7 │ 3 │ candidate │ v2 │ skip-pool-tx │ JIT-verify
8 │ 3 │ mined │ v2 │ skip-accept │ JIT-verify
Actions:
AOT-cost,verify Cost estimation + verification using v4.0 AOT
JIT-verify Verification using v5.0 JIT interpreter
skip-pool-tx Skip mempool transaction (can't handle)
skip-accept Accept block without verification (trust majority)
skip-reject Reject transaction/block (invalid version)
Consensus Equivalence Properties
For safe transition between interpreter versions:
// Property 1: AOT-verify preserved between releases
// forall s:ScriptV0/V1, R4.0-AOT-verify(s) == R5.0-AOT-verify(s)
// Property 2: AOT-cost preserved
// forall s:ScriptV0/V1, R4.0-AOT-cost(s) == R5.0-AOT-cost(s)
// Property 3: JIT can replace AOT
// forall s:ScriptV0/V1, R5.0-JIT-verify(s) == R4.0-AOT-verify(s)
// Property 4: JIT cost bounded by AOT
// forall s:ScriptV0/V1, R5.0-JIT-cost(s) <= R4.0-AOT-cost(s)
// Property 5: ScriptV2 rejected before soft-fork
// forall s:ScriptV2, if not SF active => reject
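These properties lend themselves to differential testing. The sketch below checks Property 3 directly; it assumes a hypothetical `corpus` fixture of pre-v2 scripts plus the `verifyAot`/`verifyJit` entry points shown later in this chapter:

```zig
const std = @import("std");

test "Property 3: JIT matches AOT on v0/v1 scripts" {
    // `corpus` is an assumed fixture of (tree, ctx) pairs
    for (corpus.v0_v1_scripts) |s| {
        const aot = try verifyAot(&s.tree, &s.ctx);
        const jit = try verifyJit(&s.tree, &s.ctx);
        // Verification outcomes must be identical for consensus safety
        try std.testing.expectEqual(aot, jit);
    }
}
```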
Version-Aware Interpreter
pub fn verify(
ergo_tree: *const ErgoTree,
ctx: *const Context,
) !bool {
const script_version = ergo_tree.header.version;
const activated_version = ctx.activatedScriptVersion();
// Execute with proper version context
var version_ctx = try VersionContext.init(
activated_version.toU8(),
script_version.toU8(),
);
const prev = current_context;
current_context = version_ctx;
defer current_context = prev;
// Version-specific execution
if (version_ctx.isJitActivated()) {
return verifyJit(ergo_tree, ctx);
} else {
return verifyAot(ergo_tree, ctx);
}
}
fn verifyJit(tree: *const ErgoTree, ctx: *const Context) !bool {
const reduced = try fullReduction(tree, ctx);
return verifySignature(reduced, ctx.messageToSign());
}
fn verifyAot(tree: *const ErgoTree, ctx: *const Context) !bool {
// Legacy AOT interpreter path
const result = try aotEvaluate(tree, ctx);
return verifySignature(result, ctx.messageToSign());
}
Block Extension Voting
Rule status changes via blockchain extension voting:
Extension Voting Flow
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────────┐
│ Block Extension Section │
│ │
│ Key (2 bytes) │ Value │
│ ─────────────────┼─────────────────────────────────────────────── │
│ Rule ID │ RuleStatus + data │
│ 0x03EF (1007) │ ChangedRule([0x5A, 0x5B]) │
│ │ (new opcodes 0x5A, 0x5B allowed) │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Voting Epoch │
│ │
│ Epoch 1: □ □ □ ■ □ ■ ■ □ ■ ■ (5/10 = 50%) │
│ Epoch 2: ■ ■ □ ■ ■ ■ □ ■ ■ ■ (8/10 = 80%) │
│ Epoch 3: ■ ■ ■ ■ ■ □ ■ ■ ■ ■ (9/10 = 90%) → ACTIVATED │
└────────────────────────────────────────────────────────────────────┘
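The activation threshold in the diagram reduces to a small tally check. This is an illustrative sketch, not the node's actual voting logic; all names are assumptions:

```zig
const VoteTally = struct {
    votes_for: u32,
    epoch_length: u32,

    /// True once votes reach the threshold (90% in the diagram above).
    /// Integer arithmetic avoids floating-point rounding at the boundary.
    pub fn isActivated(self: VoteTally, threshold_percent: u32) bool {
        return self.votes_for * 100 >= self.epoch_length * threshold_percent;
    }
};
```

With epoch 3 above, `.{ .votes_for = 9, .epoch_length = 10 }` activates at a 90% threshold, while epoch 2's 8/10 does not.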
New Opcode Addition:
1. Before soft-fork: Unknown opcode → ValidationException
2. Extension update: ChangedRule(Array(newOpcode)) for rule 1001
3. After activation: Old nodes recognize soft-fork via SoftForkWhenCodeAdded
4. Result: Old nodes skip verification; new nodes execute new opcode
Unknown Opcode Handling
fn handleUnknownOpcode(
reader: *Reader,
settings: *const ValidationSettings,
op_code: u8,
) !Expr {
    // Check if this is a soft-fork condition for the opcode rule
    const rule = &ValidationRules.CheckValidOpCode;
const status = settings.getStatus(rule.id) orelse return error.RuleNotFound;
switch (status) {
.changed => |c| {
// Check if opcode was added via soft-fork
if (std.mem.indexOfScalar(u8, c.new_value, op_code) != null) {
// Soft-fork: skip remaining bytes, return placeholder
reader.skipToEnd();
return Expr{ .constant = Constant.unit };
}
},
else => {},
}
// Not a soft-fork condition - fail hard
return rule.throwValidationException(error.UnknownOpCode, &[_]u8{op_code});
}
Summary
- ErgoTreeVersion encodes script version in 3-bit header field (0-7)
- VersionContext tracks activated protocol and tree versions
- RuleStatus can be Enabled, Disabled, Replaced, or Changed
- SoftForkChecker detects soft-fork conditions from validation failures
- trySoftForkable provides graceful fallback for unknown constructs
- AOT→JIT transition demonstrated soft-fork for major interpreter change
- Block extension voting enables parameter changes via miner consensus
- Old nodes remain compatible by trusting majority on unverifiable blocks
Next: Chapter 30: Cross-Platform Support
Scala: VersionContext.scala:17-35
Rust: context.rs:46-53
Scala: ErgoTree.scala (header)
Rust: tree_header.rs:122-145
Scala: ErgoTree.scala:57-84
Rust: tree_header.rs:27-109
Scala: VersionContext.scala:47-56
Rust: context.rs:12-54
Scala: RuleStatus.scala:4-53
Rust: Not directly present in sigma-rust; validation handled at higher level
Scala: ValidationRules.scala:13-51
Rust: Validation rules embedded in deserializer implementations
Scala: SoftForkChecker.scala:4-42
Rust: Soft-fork handling at application level (ergo-lib)
Rust: parameters.rs (blockchain parameters)
Chapter 30: Cross-Platform Support
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 9 for platform-specific cryptographic implementations
- Chapter 27 for SDK APIs that must work across platforms
- Familiarity with build systems and compilation toolchains
Learning Objectives
By the end of this chapter, you will be able to:
- Explain Zig's cross-compilation architecture and target selection
- Implement platform abstraction layers for OS-specific functionality
- Use conditional compilation (comptime branching on builtin.target) for target-specific code
- Build for WASM, native, and embedded targets from a single codebase
Cross-Compilation Architecture
Zig provides native cross-compilation to any target from any host[1][2]:
Cross-Compilation Targets
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Host Build System │
│ │
│ zig build -Dtarget=<target> │
└────────────────────────┬────────────────────────────────────────┘
Common Targets:
x86_64-linux-gnu Linux desktop/server
aarch64-linux-gnu ARM64 Linux (Raspberry Pi, etc.)
x86_64-macos macOS Intel
aarch64-macos macOS Apple Silicon
x86_64-windows-gnu Windows
wasm32-wasi WebAssembly with WASI
wasm32-freestanding WebAssembly browser
Platform Abstraction
Platform-specific code via conditional compilation:
const builtin = @import("builtin");
const std = @import("std");
const Platform = struct {
pub const target = builtin.target;
pub const os = target.os.tag;
pub const arch = target.cpu.arch;
pub const is_wasm = arch == .wasm32 or arch == .wasm64;
pub const is_native = !is_wasm;
pub const is_windows = os == .windows;
pub const is_linux = os == .linux;
pub const is_macos = os == .macos;
/// Get platform-appropriate crypto implementation
pub fn getCrypto() type {
if (is_wasm) {
return WasmCrypto;
} else {
return NativeCrypto;
}
}
/// Get platform-appropriate allocator
pub fn getDefaultAllocator() std.mem.Allocator {
if (is_wasm) {
return std.heap.wasm_allocator;
} else {
return std.heap.c_allocator;
}
}
};
Crypto Abstraction Layer
Platform-agnostic cryptography interface[3][4].
SECURITY: All implementations of cryptographic operations involving secret data (scalar multiplication, HMAC with secret keys, etc.) must be constant-time to prevent timing side-channel attacks. Use libraries that guarantee constant-time behavior (e.g., libsecp256k1, Zig's std.crypto).
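As a concrete illustration of the constant-time requirement, MAC comparison must not short-circuit on the first mismatching byte. Zig's standard library provides a timing-safe comparison; the helper name here is ours:

```zig
const std = @import("std");

/// Compare two HMAC-SHA512 tags without data-dependent branches.
/// A plain std.mem.eql would leak the mismatch position via timing.
fn macEquals(a: *const [64]u8, b: *const [64]u8) bool {
    return std.crypto.utils.timingSafeEql([64]u8, a.*, b.*);
}
```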
const CryptoFacade = struct {
const Impl = Platform.getCrypto();
pub const SECRET_KEY_LENGTH: usize = 32;
pub const PUBLIC_KEY_LENGTH: usize = 33;
pub const SIGNATURE_LENGTH: usize = 64;
/// Create new crypto context
pub fn createContext() CryptoContext {
return Impl.createContext();
}
/// Normalize point to affine coordinates
pub fn normalizePoint(p: Ecp) Ecp {
return Impl.normalizePoint(p);
}
/// Negate point (y-coordinate)
pub fn negatePoint(p: Ecp) Ecp {
return Impl.negatePoint(p);
}
/// Check if point is infinity
pub fn isInfinityPoint(p: Ecp) bool {
return Impl.isInfinityPoint(p);
}
/// Point exponentiation: p^n
pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
return Impl.exponentiatePoint(p, n);
}
/// Point multiplication (addition in EC group): p1 + p2
pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
return Impl.multiplyPoints(p1, p2);
}
/// Encode point (compressed or uncompressed)
pub fn encodePoint(p: Ecp, compressed: bool) [PUBLIC_KEY_LENGTH]u8 {
return Impl.encodePoint(p, compressed);
}
/// HMAC-SHA512
pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
return Impl.hashHmacSha512(key, data);
}
/// PBKDF2-HMAC-SHA512
pub fn generatePbkdf2Key(
password: []const u8,
salt: []const u8,
iterations: u32,
) [64]u8 {
return Impl.generatePbkdf2Key(password, salt, iterations);
}
/// Secure random bytes
pub fn randomBytes(dest: []u8) void {
Impl.randomBytes(dest);
}
};
Native Crypto Implementation
Using Zig's standard library and optional C bindings[5][6]:
const NativeCrypto = struct {
const std = @import("std");
const crypto = std.crypto;
pub fn createContext() CryptoContext {
return CryptoContext.secp256k1();
}
pub fn normalizePoint(p: Ecp) Ecp {
return p.normalize();
}
pub fn negatePoint(p: Ecp) Ecp {
return p.negate();
}
pub fn isInfinityPoint(p: Ecp) bool {
return p.isIdentity();
}
pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
return p.mul(n.*);
}
pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
return p1.add(p2);
}
    pub fn encodePoint(p: Ecp, compressed: bool) [33]u8 {
        // Only compressed SEC1 (33 bytes) fits the fixed return type;
        // uncompressed SEC1 is 65 bytes and needs a separate API.
        std.debug.assert(compressed);
        return p.toCompressedSec1();
    }
    pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
        const HmacSha512 = crypto.auth.hmac.sha2.HmacSha512;
        var out: [HmacSha512.mac_length]u8 = undefined;
        HmacSha512.create(&out, data, key);
        return out;
    }
    pub fn generatePbkdf2Key(
        password: []const u8,
        salt: []const u8,
        iterations: u32,
    ) [64]u8 {
        var result: [64]u8 = undefined;
        crypto.pwhash.pbkdf2(
            &result,
            password,
            salt,
            iterations,
            crypto.auth.hmac.sha2.HmacSha512,
        ) catch unreachable; // parameters are fixed and valid
        return result;
    }
pub fn randomBytes(dest: []u8) void {
crypto.random.bytes(dest);
}
};
WASM Crypto Implementation
WebAssembly-specific implementation using imports:
const WasmCrypto = struct {
// External functions imported from JavaScript host
extern "env" fn crypto_random_bytes(ptr: [*]u8, len: usize) void;
extern "env" fn crypto_hmac_sha512(
key_ptr: [*]const u8,
key_len: usize,
data_ptr: [*]const u8,
data_len: usize,
out_ptr: [*]u8,
) void;
extern "env" fn crypto_secp256k1_mul(
point_ptr: [*]const u8,
scalar_ptr: [*]const u8,
out_ptr: [*]u8,
) void;
pub fn createContext() CryptoContext {
return CryptoContext.secp256k1();
}
pub fn randomBytes(dest: []u8) void {
crypto_random_bytes(dest.ptr, dest.len);
}
pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
var result: [64]u8 = undefined;
crypto_hmac_sha512(
key.ptr,
key.len,
data.ptr,
data.len,
&result,
);
return result;
}
    pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
        // Copy into locals first: Zig forbids taking the address
        // of a temporary function result.
        const point_bytes = p.toCompressedSec1();
        const scalar_bytes = n.toBytes();
        var result: [33]u8 = undefined;
        crypto_secp256k1_mul(&point_bytes, &scalar_bytes, &result);
        return Ecp.fromCompressedSec1(result) catch unreachable;
    }
// ... other operations using WASM imports
};
WASM JavaScript Host
JavaScript glue code for browser/Node.js:
// wasm_host.js - JavaScript host for WASM crypto
const crypto = require('crypto');
const secp256k1 = require('secp256k1');
const imports = {
env: {
crypto_random_bytes: (ptr, len) => {
const bytes = crypto.randomBytes(len);
const mem = new Uint8Array(wasmMemory.buffer, ptr, len);
mem.set(bytes);
},
crypto_hmac_sha512: (keyPtr, keyLen, dataPtr, dataLen, outPtr) => {
const key = new Uint8Array(wasmMemory.buffer, keyPtr, keyLen);
const data = new Uint8Array(wasmMemory.buffer, dataPtr, dataLen);
const hmac = crypto.createHmac('sha512', key);
hmac.update(data);
const result = hmac.digest();
const out = new Uint8Array(wasmMemory.buffer, outPtr, 64);
out.set(result);
},
crypto_secp256k1_mul: (pointPtr, scalarPtr, outPtr) => {
const point = new Uint8Array(wasmMemory.buffer, pointPtr, 33);
const scalar = new Uint8Array(wasmMemory.buffer, scalarPtr, 32);
const result = secp256k1.publicKeyTweakMul(point, scalar, true);
const out = new Uint8Array(wasmMemory.buffer, outPtr, 33);
out.set(result);
}
}
};
Conditional Compilation
Target-specific code paths:
const builtin = @import("builtin");
pub fn getTimestamp() i64 {
if (builtin.target.os.tag == .wasi) {
// WASI clock_time_get
var ts: std.os.wasi.timestamp_t = undefined;
_ = std.os.wasi.clock_time_get(.REALTIME, 1, &ts);
return @intCast(ts / 1_000_000_000);
} else if (builtin.target.cpu.arch == .wasm32) {
// Freestanding WASM - use imported function
return wasmGetTimestamp();
} else {
// Native - use std
return std.time.timestamp();
}
}
pub fn allocate(comptime T: type, n: usize) ![]T {
const allocator = if (Platform.is_wasm)
std.heap.wasm_allocator
else if (builtin.link_libc)
std.heap.c_allocator
else
std.heap.page_allocator;
return allocator.alloc(T, n);
}
Build Configuration
build.zig for multi-target builds:
const std = @import("std");
pub fn build(b: *std.Build) void {
// Native target (default)
const native_target = b.standardTargetOptions(.{});
const native_optimize = b.standardOptimizeOption(.{});
const lib = b.addStaticLibrary(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = native_target,
.optimize = native_optimize,
});
// WASM target
const wasm_target = b.resolveTargetQuery(.{
.cpu_arch = .wasm32,
.os_tag = .freestanding,
});
const wasm_lib = b.addStaticLibrary(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = wasm_target,
.optimize = .ReleaseSmall,
});
// Export for JavaScript
wasm_lib.rdynamic = true;
wasm_lib.export_memory = true;
// WASI target (for Node.js)
const wasi_target = b.resolveTargetQuery(.{
.cpu_arch = .wasm32,
.os_tag = .wasi,
});
const wasi_lib = b.addExecutable(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = wasi_target,
.optimize = .ReleaseSmall,
});
// Install all targets
b.installArtifact(lib);
b.installArtifact(wasm_lib);
b.installArtifact(wasi_lib);
}
Memory Management
Platform-specific allocation strategies:
const Allocator = std.mem.Allocator;
const MemoryConfig = struct {
/// Maximum memory for WASM (64KB pages)
wasm_max_pages: u32 = 256, // 16MB
/// Use arena for batch operations
use_arena: bool = true,
/// Pre-allocate constant pool
constant_pool_size: usize = 4096,
};
pub fn createPlatformAllocator(
    config: MemoryConfig,
    arena_storage: *std.heap.ArenaAllocator,
) Allocator {
    if (Platform.is_wasm) {
        // WASM uses linear memory with explicit growth; the embedder
        // enforces the page cap from config.wasm_max_pages.
        return std.heap.wasm_allocator;
    }
    if (config.use_arena) {
        // The arena must live in caller-owned storage: returning
        // arena.allocator() from a stack-local arena would dangle.
        arena_storage.* = std.heap.ArenaAllocator.init(std.heap.page_allocator);
        return arena_storage.allocator();
    }
    return std.heap.page_allocator;
}
Type Representation
Consistent types across platforms:
/// Platform-independent big integer
pub const BigInt = struct {
limbs: []u64,
positive: bool,
    pub fn fromBytes(allocator: Allocator, bytes: []const u8) !BigInt {
        // Works on all platforms; caller supplies the allocator
        var limbs = std.ArrayList(u64).init(allocator);
        _ = bytes; // ... conversion logic consumes bytes
        return .{ .limbs = limbs.items, .positive = true };
    }
pub fn toBytes(self: *const BigInt, buf: []u8) []u8 {
// Consistent byte representation
// ... conversion logic
return buf[0..written];
}
};
/// Platform-independent scalar (256-bit)
pub const Scalar = struct {
bytes: [32]u8,
pub fn fromBigInt(n: *const BigInt) Scalar {
var result: Scalar = undefined;
_ = n.toBytes(&result.bytes);
return result;
}
};
Endianness Handling
Consistent byte order across architectures:
pub fn readU32BE(bytes: []const u8) u32 {
return std.mem.readInt(u32, bytes[0..4], .big);
}
pub fn writeU32BE(value: u32, buf: []u8) void {
std.mem.writeInt(u32, buf[0..4], value, .big);
}
pub fn readU64LE(bytes: []const u8) u64 {
return std.mem.readInt(u64, bytes[0..8], .little);
}
// Serialization always uses network byte order (big-endian)
pub fn serializeInt(value: anytype, writer: anytype) !void {
const T = @TypeOf(value);
var buf: [@sizeOf(T)]u8 = undefined;
std.mem.writeInt(T, &buf, value, .big);
try writer.writeAll(&buf);
}
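The inverse direction follows the same convention. A hedged sketch of the matching reader, written against any `std.io` reader type:

```zig
const std = @import("std");

/// Mirror of serializeInt: read a fixed-width integer in network
/// byte order (big-endian), independent of host endianness.
pub fn deserializeInt(comptime T: type, reader: anytype) !T {
    var buf: [@sizeOf(T)]u8 = undefined;
    try reader.readNoEof(&buf);
    return std.mem.readInt(T, &buf, .big);
}
```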
Performance Considerations
Platform Performance Characteristics
══════════════════════════════════════════════════════════════════
┌─────────────────┬───────────────────────────────────────────────┐
│ Platform │ Characteristics │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (x86_64) │ ✓ SIMD acceleration (AVX2/AVX512) │
│ │ ✓ Hardware AES-NI │
│ │ ✓ Large memory, fast allocation │
│ │ ✓ Multi-threaded execution │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (ARM64) │ ✓ NEON SIMD │
│ │ ✓ Hardware crypto extensions │
│ │ ✓ Power-efficient │
├─────────────────┼───────────────────────────────────────────────┤
│ WASM (browser) │ ○ Single-threaded (mostly) │
│ │ ○ Linear memory model │
│ │ ✓ JIT compilation by browser │
│ │ ○ No direct filesystem/network │
├─────────────────┼───────────────────────────────────────────────┤
│ WASI (Node.js) │ ○ Single-threaded │
│ │ ✓ WASI syscalls for I/O │
│ │ ✓ Sandboxed execution │
└─────────────────┴───────────────────────────────────────────────┘
Optimization Strategies:
Native: Use comptime for specialization, SIMD intrinsics
WASM: Minimize memory allocations, batch operations
Both: Profile-guided optimization, cache-friendly layouts
Testing Cross-Platform
const testing = std.testing;
test "crypto operations consistent across platforms" {
const key = "test_key";
const data = "test_data";
const result = CryptoFacade.hashHmacSha512(key, data);
// Expected value computed externally
const expected = [_]u8{
0x8f, 0x9d, 0x1c, // ... full 64 bytes
};
try testing.expectEqualSlices(u8, &expected, &result);
}
test "point operations" {
const ctx = CryptoFacade.createContext();
const g = ctx.generator;
// g + g = 2g
const two_g_add = CryptoFacade.multiplyPoints(g, g);
const scalar_2 = Scalar.fromInt(2);
const two_g_mul = CryptoFacade.exponentiatePoint(g, &scalar_2);
try testing.expect(two_g_add.eql(two_g_mul));
}
Usage Example
Cross-platform wallet library:
const Wallet = struct {
prover: Prover,
allocator: Allocator,
pub fn init(seed: []const u8) !Wallet {
const allocator = Platform.getDefaultAllocator();
// Platform-independent key derivation
const master_key = CryptoFacade.generatePbkdf2Key(
seed,
"mnemonic",
2048,
);
return .{
.prover = try Prover.fromSeed(master_key, allocator),
.allocator = allocator,
};
}
pub fn signTransaction(
self: *const Wallet,
tx: *const Transaction,
) !SignedTransaction {
// Works identically on all platforms
return self.prover.sign(tx);
}
};
// Same code runs on all targets:
// - Desktop app (native)
// - Browser extension (WASM)
// - Mobile wallet (ARM native or WASM)
// - Server-side validation (native)
Summary
- Zig cross-compiles to any target from any host without external tools
- Platform abstraction uses
builtin.targetfor conditional compilation - CryptoFacade provides consistent API across native and WASM
- WASM targets use JavaScript imports for platform-specific crypto
- Memory management adapts to platform constraints
- Type representation ensures consistent behavior across architectures
- Testing verifies identical results on all platforms
Next: Chapter 31: Performance Engineering
Scala: CryptoFacade.scala (abstraction)
Rust: Platform-independent design in sigma-rust crate structure
Scala: Platform.scala (JVM impl)
Rust: sigma_protocol/ (crypto operations)
Scala: Platform.scala (JS impl)
Rust: Feature flags in Cargo.toml for optional dependencies
Chapter 31: Performance Engineering
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 12 for evaluation model fundamentals that define hot paths
- Chapter 7 for serialization patterns to optimize
- Chapter 13 for understanding cost accounting overhead
Learning Objectives
By the end of this chapter, you will be able to:
- Identify performance-critical paths in script interpretation
- Apply Zig's comptime for zero-cost abstractions and type dispatch
- Design data structures using Struct-of-Arrays (SoA) for cache efficiency
- Use arena allocators to batch allocations and reduce overhead
- Implement SIMD and vectorization for throughput-critical operations
- Profile and benchmark interpreter components systematically
Performance Architecture
Script interpretation requires processing thousands of transactions per block[1][2]:
Performance Critical Paths
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Transaction Flow │
│ │
│ Block (1000+ txs) │
│ │ │
│ ├── Tx 1: 3 inputs × (deserialize + evaluate + verify) │
│ ├── Tx 2: 1 input × (deserialize + evaluate + verify) │
│ ├── Tx 3: 5 inputs × (deserialize + evaluate + verify) │
│ └── ... │
│ │
│ Hot paths (per input): │
│ • Deserialization: ~50-200 opcode parses │
│ • Evaluation: ~100-500 operations │
│ • Proof verification: 1-10 EC operations │
└─────────────────────────────────────────────────────────────────┘
Performance Targets:
Deserialization: < 100 µs per script
Evaluation: < 500 µs per script
Verification: < 2 ms per input
Total per block: < 5 seconds
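The per-block target can be enforced with a running budget. A minimal sketch, with the struct name, field layout, and error name all assumed for illustration:

```zig
const std = @import("std");

const BlockBudget = struct {
    /// 5-second per-block target from the table above
    const max_block_ns: u64 = 5 * std.time.ns_per_s;
    spent_ns: u64 = 0,

    /// Accumulate per-input time; fail once the block budget is spent
    pub fn charge(self: *BlockBudget, elapsed_ns: u64) !void {
        self.spent_ns += elapsed_ns;
        if (self.spent_ns > max_block_ns) return error.BlockBudgetExceeded;
    }
};
```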
Comptime Optimization
Zig's comptime enables zero-cost abstractions[3][4]:
/// Compile-time type dispatch eliminates runtime branching
fn evalOperation(comptime op: OpCode, args: []const Value) !Value {
return switch (op) {
.plus => evalPlus(args),
.minus => evalMinus(args),
.multiply => evalMultiply(args),
// All branches resolved at compile time
inline else => |o| evalGeneric(o, args),
};
}
/// Comptime-generated lookup tables
const op_costs = blk: {
var costs: [256]u32 = undefined;
for (0..256) |i| {
costs[i] = computeCost(@enumFromInt(i));
}
break :blk costs;
};
/// Zero-cost field access: @field resolves to a direct member access,
/// with the offset computed entirely at compile time
fn getField(ptr: anytype, comptime field: []const u8) @TypeOf(&@field(ptr.*, field)) {
    return &@field(ptr.*, field);
}
Data-Oriented Design
Structure data for cache efficiency. The Array-of-Structs to Struct-of-Arrays transformation is a semantics-preserving isomorphism: Array[n](A × B) ≅ Array[n](A) × Array[n](B). Both represent the same data with identical behavior, but different memory layouts yield dramatically different cache performance:
/// Bad: Array of Structs (AoS) - poor cache locality for iteration
const ValueAoS = struct {
tag: ValueTag, // 1 byte
padding: [7]u8, // 7 bytes padding
data: [8]u8, // 8 bytes payload
}; // 16 bytes per value, only 9 used
/// Good: Struct of Arrays (SoA) - excellent cache locality
const ValueStore = struct {
tags: []ValueTag, // Packed tags
data: [][8]u8, // Packed payloads
len: usize,
/// Iterate tags without touching payload
pub fn countType(self: *const ValueStore, target: ValueTag) usize {
var count: usize = 0;
for (self.tags) |tag| {
count += @intFromBool(tag == target);
}
return count;
}
/// Access specific value
pub fn get(self: *const ValueStore, idx: usize) Value {
return Value.decode(self.tags[idx], self.data[idx]);
}
};
Memory Layout Analysis
Cache Line Utilization
══════════════════════════════════════════════════════════════════
Array of Structs (AoS):
┌──────────────────────────────────────────────────────────────────┐
│ Cache Line (64 bytes) │
├──────────────────────────────────────────────────────────────────┤
│ Value[0] │ Value[1] │ Value[2] │ Value[3] │ │
│ 16 bytes │ 16 bytes │ 16 bytes │ 16 bytes │ │
│ T+P+D │ T+P+D │ T+P+D │ T+P+D │ Wasted │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 25% cache utilization (touches only 1 byte per 16)
Struct of Arrays (SoA):
┌──────────────────────────────────────────────────────────────────┐
│ Tags Cache Line (64 bytes) │
├──────────────────────────────────────────────────────────────────┤
│ T[0] T[1] T[2] ... T[63] │
│ 64 tags in single cache line │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 100% cache utilization (64 values per fetch)
Speedup: ~4x for tag-only operations
Arena Allocators
Batch allocations reduce overhead:
const ArenaAllocator = std.heap.ArenaAllocator;
/// Evaluation context with arena for temporary allocations
const EvalContext = struct {
arena: ArenaAllocator,
constants: []const Constant,
env: Environment,
pub fn init(backing: Allocator) EvalContext {
return .{
.arena = ArenaAllocator.init(backing),
.constants = &[_]Constant{},
.env = Environment.init(),
};
}
/// All temporary allocations use arena
pub fn allocTemp(self: *EvalContext, comptime T: type, n: usize) ![]T {
return self.arena.allocator().alloc(T, n);
}
/// Single deallocation frees all temps
pub fn reset(self: *EvalContext) void {
_ = self.arena.reset(.retain_capacity);
}
pub fn deinit(self: *EvalContext) void {
self.arena.deinit();
}
};
/// Usage: batch evaluation without per-operation allocations
fn evaluateScript(tree: *const ErgoTree, allocator: Allocator) !Value {
var ctx = EvalContext.init(allocator);
defer ctx.deinit();
for (tree.ops) |op| {
try evalOp(op, &ctx);
}
ctx.reset(); // Free all temps at once
return ctx.result;
}
Loop Optimization
Efficient iteration patterns:
/// Unrolled loop for fixed-size operations
fn hashBlock(data: *const [64]u8, state: *[8]u32) void {
    // Process the 16 message words, four per unrolled iteration
    // (step by 16 bytes so the four u32 reads do not overlap)
    comptime var i: usize = 0;
    inline while (i < 64) : (i += 16) {
        const w0 = std.mem.readInt(u32, data[i..][0..4], .big);
        const w1 = std.mem.readInt(u32, data[i + 4 ..][0..4], .big);
        const w2 = std.mem.readInt(u32, data[i + 8 ..][0..4], .big);
        const w3 = std.mem.readInt(u32, data[i + 12 ..][0..4], .big);
        round(state, w0);
        round(state, w1);
        round(state, w2);
        round(state, w3);
    }
}
/// Vectorized collection operations
fn sumValues(values: []const i64) i64 {
const Vec = @Vector(4, i64);
var sum_vec: Vec = @splat(0);
var i: usize = 0;
while (i + 4 <= values.len) : (i += 4) {
const chunk: Vec = values[i..][0..4].*;
sum_vec += chunk;
}
// Reduce vector to scalar
var sum = @reduce(.Add, sum_vec);
// Handle remainder
while (i < values.len) : (i += 1) {
sum += values[i];
}
return sum;
}
Memoization
Cache expensive computations[5][6]:
/// Generic memoization with comptime key type
fn Memoized(comptime K: type, comptime V: type) type {
return struct {
cache: std.AutoHashMap(K, V),
const Self = @This();
pub fn init(allocator: Allocator) Self {
return .{ .cache = std.AutoHashMap(K, V).init(allocator) };
}
pub fn getOrCompute(
self: *Self,
key: K,
compute: *const fn (K) V,
) V {
const result = self.cache.getOrPut(key) catch unreachable;
if (!result.found_existing) {
result.value_ptr.* = compute(key);
}
return result.value_ptr.*;
}
pub fn reset(self: *Self) void {
self.cache.clearRetainingCapacity();
}
};
}
/// Type method resolution memoization
const MethodCache = Memoized(struct { type_code: u8, method_id: u8 }, *const Method);
var method_cache: MethodCache = undefined;
fn resolveMethod(type_code: u8, method_id: u8) *const Method {
return method_cache.getOrCompute(
.{ .type_code = type_code, .method_id = method_id },
computeMethod,
);
}
String Interning
Avoid repeated string allocations:
const StringInterner = struct {
table: std.StringHashMap([]const u8),
arena: ArenaAllocator,
pub fn init(allocator: Allocator) StringInterner {
return .{
.table = std.StringHashMap([]const u8).init(allocator),
.arena = ArenaAllocator.init(allocator),
};
}
/// Return interned string (pointer stable for lifetime)
pub fn intern(self: *StringInterner, str: []const u8) []const u8 {
if (self.table.get(str)) |existing| {
return existing;
}
// Allocate permanent copy
const copy = self.arena.allocator().dupe(u8, str) catch unreachable;
self.table.put(copy, copy) catch unreachable;
return copy;
}
};
// Variable names are interned for fast comparison
fn lookupVar(env: *const Environment, name: []const u8) ?Value {
const interned = global_interner.intern(name);
return env.bindings.get(interned);
}
SIMD for Crypto
Vectorized elliptic curve operations:
/// SIMD-accelerated field multiplication (mod p)
fn mulModP(a: *const [4]u64, b: *const [4]u64) [4]u64 {
    // Dispatch resolved at compile time; dead branches are eliminated
    const arch = @import("builtin").target.cpu.arch;
    if (comptime arch.isX86()) {
        return mulModP_avx2(a, b);
    } else if (comptime arch.isAARCH64()) {
        return mulModP_neon(a, b);
    } else {
        return mulModP_scalar(a, b);
    }
}
fn mulModP_avx2(a: *const [4]u64, b: *const [4]u64) [4]u64 {
// AVX2 implementation using 256-bit vectors
const va: @Vector(4, u64) = a.*;
const vb: @Vector(4, u64) = b.*;
// Schoolbook multiplication with vector operations
// ... (optimized implementation)
return result;
}
Profiling and Benchmarking
Built-in profiling support:
const Timer = struct {
start: i128,
pub fn init() Timer {
return .{ .start = std.time.nanoTimestamp() };
}
pub fn elapsed(self: *const Timer) u64 {
const now = std.time.nanoTimestamp();
return @intCast(now - self.start);
}
};
/// Benchmark harness
fn benchmark(
comptime name: []const u8,
comptime iterations: usize,
comptime warmup: usize,
func: *const fn () void,
) void {
// Warmup
for (0..warmup) |_| {
func();
}
// Measure
const timer = Timer.init();
for (0..iterations) |_| {
func();
}
const total_ns = timer.elapsed();
const ns_per_op = total_ns / iterations;
const ops_per_sec = @as(f64, 1_000_000_000) / @as(f64, @floatFromInt(ns_per_op));
std.debug.print("{s}: {} ns/op ({d:.0} ops/sec)\n", .{
name,
ns_per_op,
ops_per_sec,
});
}
// Usage
test "benchmark deserialization" {
benchmark("deserialize_script", 10000, 1000, struct {
fn run() void {
_ = deserialize(test_script);
}
}.run);
}
Memory Profiling
Track allocations in debug builds:
const DebugAllocator = struct {
backing: Allocator,
total_allocated: usize = 0,
total_freed: usize = 0,
allocation_count: usize = 0,
pub fn allocator(self: *DebugAllocator) Allocator {
return .{
.ptr = self,
.vtable = &.{
.alloc = alloc,
.resize = resize,
.free = free,
},
};
}
fn alloc(ctx: *anyopaque, len: usize, ptr_align: u8, ret_addr: usize) ?[*]u8 {
const self: *DebugAllocator = @ptrCast(@alignCast(ctx));
self.total_allocated += len;
self.allocation_count += 1;
return self.backing.rawAlloc(len, ptr_align, ret_addr);
}
// ... other methods
pub fn report(self: *const DebugAllocator) void {
std.debug.print("Allocations: {}\n", .{self.allocation_count});
std.debug.print("Total allocated: {} bytes\n", .{self.total_allocated});
std.debug.print("Total freed: {} bytes\n", .{self.total_freed});
std.debug.print("Leaked: {} bytes\n", .{self.total_allocated - self.total_freed});
}
};
Performance Patterns
Optimization Decision Tree
══════════════════════════════════════════════════════════════════
Is operation in hot path?
│
├── NO → Optimize for clarity, not speed
│
└── YES → Profile first, then:
│
├── CPU-bound?
│ ├── Use comptime for dispatch
│ ├── Unroll small loops
│ ├── Use SIMD where applicable
│ └── Inline critical functions
│
├── Memory-bound?
│ ├── Use SoA layout
│ ├── Pool/arena allocators
│ ├── Reduce allocations
│ └── Prefetch data
│
└── Allocation-bound?
├── Arena allocators
├── Object pools
├── String interning
└── Stack allocation where safe
Performance Checklist
When writing performance-critical code:
// ✓ Use comptime for type-level decisions
const Handler = comptime getHandler(op);
// ✓ Pre-compute lookup tables
const costs = comptime computeCostTable();
// ✓ Use SoA for iterated data
const Store = struct { tags: []Tag, values: []Value };
// ✓ Arena allocators for batch operations
var arena = ArenaAllocator.init(allocator);
defer arena.deinit();
// ✓ Inline hot functions
pub inline fn addCost(self: *CostAccum, cost: u32) !void
// ✓ Avoid allocations in tight loops
for (items) |item| {
// Process without allocation
}
// ✓ Use vectors for parallel data
const Vec4 = @Vector(4, u64);
// ✓ Profile before optimizing
// std.debug.print("elapsed: {} ns\n", .{timer.elapsed()});
Summary
- Comptime enables zero-cost abstractions and compile-time dispatch
- Data-oriented design (SoA) improves cache efficiency 4x+
- Arena allocators batch deallocations for throughput
- Loop unrolling and SIMD accelerate hot paths
- Memoization caches expensive computations
- String interning reduces allocation pressure
- Profile first before optimizing—measure, don't guess
Next: Chapter 32: v6 Protocol Features
Scala: perf-style-guide.md (HOTSPOT patterns)
Rust: Performance-oriented design throughout sigma-rust crates
Scala: MemoizedFunc.scala
Rust: Memoization patterns in ergotree-interpreter
Scala: CErgoTreeEvaluator.scala (fixedCostOp)
Rust: eval.rs (cost tracking)
Chapter 32: v6 Protocol Features
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 2 for the ErgoTree type system and numeric types
- Chapter 6 for method definitions on types
- Chapter 29 for soft-fork versioning and activation
Learning Objectives
By the end of this chapter, you will be able to:
- Implement SUnsignedBigInt (256-bit unsigned integers) with modular arithmetic operations
- Apply bitwise operations (AND, OR, XOR, NOT, shifts) to all numeric types
- Use new collection manipulation methods (patch, updated, updateMany, reverse, get)
- Understand the Autolykos2 proof-of-work algorithm and Header.checkPow
- Serialize values using Global.serialize and decode difficulty with NBits encoding
- Write version-aware scripts that use v6 features safely
Version Activation
ErgoTree version 3 corresponds to protocol v6.0. Features in this chapter are only available when the v6 soft-fork is activated.
Version Mapping
═══════════════════════════════════════════════════════════════════
Block Version ErgoTree Version Protocol Features
─────────────────────────────────────────────────────────────────────
1-2 0-1 v3.x-v4.x AOT costing
3 2 v5.x JIT costing
4 3 v6.x This chapter
Version Context
const VersionContext = struct {
activated_version: u8,
ergo_tree_version: u8,
pub const JIT_ACTIVATION_VERSION: u8 = 2; // v5.0
pub const V6_SOFT_FORK_VERSION: u8 = 3; // v6.0
/// True if v6.0 protocol is activated
pub fn isV6Activated(self: *const VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
/// True if current ErgoTree is v3 or later
pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
}
/// Check if a v6 method can be used
pub fn canUseV6Method(self: *const VersionContext) bool {
return self.isV6Activated() and self.isV3OrLaterErgoTree();
}
};
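The gating rule above is a pure predicate over two version numbers, so it is easy to sanity-check outside Zig. A quick Python sketch (illustrative only; the constant and function names mirror the struct above, they are not an interpreter API):

```python
# Mirrors VersionContext above (illustrative, not a real API).
V6_SOFT_FORK_VERSION = 3  # ErgoTree v3 / protocol v6.0

def can_use_v6_method(activated_version: int, ergo_tree_version: int) -> bool:
    """v6 methods require BOTH an activated v6 protocol AND a v3+ ErgoTree."""
    return (activated_version >= V6_SOFT_FORK_VERSION
            and ergo_tree_version >= V6_SOFT_FORK_VERSION)

print(can_use_v6_method(3, 3))  # True: v6 active and v3 tree
print(can_use_v6_method(3, 2))  # False: old tree on a new protocol
print(can_use_v6_method(2, 3))  # False: v3 tree before activation
```

Both conditions matter: a v3 tree submitted before activation must fail, and an old tree must not silently gain v6 methods after activation.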
SUnsignedBigInt Type
The SUnsignedBigInt type (type code 9) is a 256-bit unsigned integer designed for cryptographic modular arithmetic [1, 2]. Unlike SBigInt which is signed, SUnsignedBigInt guarantees non-negative values, which is essential for operations like modular exponentiation where sign handling would introduce complexity and potential errors.
Type Definition
/// 256-bit unsigned integer for modular arithmetic
/// Type code: 0x09
const UnsignedBigInt256 = struct {
/// Internal representation: 4 x 64-bit words (little-endian)
words: [4]u64,
pub const TYPE_CODE: u8 = 0x09;
pub const BIT_WIDTH: usize = 256;
pub const BYTE_WIDTH: usize = 32;
/// Maximum value: 2^256 - 1
pub const MAX = UnsignedBigInt256{ .words = .{
0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
}};
/// Zero value
pub const ZERO = UnsignedBigInt256{ .words = .{ 0, 0, 0, 0 }};
/// One value
pub const ONE = UnsignedBigInt256{ .words = .{ 1, 0, 0, 0 }};
/// Create from bytes (big-endian)
pub fn fromBytes(bytes: [32]u8) UnsignedBigInt256 {
var result = UnsignedBigInt256{ .words = undefined };
// Convert big-endian bytes to little-endian words
inline for (0..4) |i| {
const offset = (3 - i) * 8;
result.words[i] = std.mem.readInt(u64, bytes[offset..][0..8], .big);
}
return result;
}
/// Convert to bytes (big-endian)
pub fn toBytes(self: UnsignedBigInt256) [32]u8 {
var result: [32]u8 = undefined;
inline for (0..4) |i| {
const offset = (3 - i) * 8;
std.mem.writeInt(u64, result[offset..][0..8], self.words[i], .big);
}
return result;
}
/// Convert from signed BigInt (errors if negative)
pub fn fromBigInt(bi: BigInt256) !UnsignedBigInt256 {
if (bi.isNegative()) {
return error.NegativeValue;
}
// Safe to reinterpret since non-negative
return @bitCast(bi.abs());
}
/// Convert to signed BigInt (errors if > BigInt.MAX)
pub fn toBigInt(self: UnsignedBigInt256) !BigInt256 {
// Check if value exceeds signed max (2^255 - 1)
if (self.words[3] & 0x8000000000000000 != 0) {
return error.Overflow;
}
return @bitCast(self);
}
/// Comparison
pub fn lessThan(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
// Compare from most significant word
var i: usize = 4;
while (i > 0) {
i -= 1;
if (self.words[i] < other.words[i]) return true;
if (self.words[i] > other.words[i]) return false;
}
return false; // Equal
}
pub fn eql(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
return std.mem.eql(u64, &self.words, &other.words);
}
};
Why Unsigned Matters for Cryptography
Signed integers introduce complexity in modular arithmetic:
- Sign bit ambiguity: In two's complement, the high bit indicates sign. For cryptographic operations on field elements, all 256 bits should represent magnitude.
- Modular reduction: Computing `a mod m` for negative `a` requires adjustment: `(-5) mod 7 = 2`, not `-5`. Unsigned values eliminate this edge case.
- Constant-time operations: Sign handling can introduce timing variations. Unsigned operations are more naturally constant-time.
- Field element representation: Finite field elements are inherently non-negative integers in `[0, p-1]`.
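The modular-reduction point is easy to see concretely. Python's `%` is floored modulo and already returns a value in `[0, m)` for a positive modulus, while C-style truncated remainder (`math.fmod`) follows the dividend's sign:

```python
import math

print((-5) % 7)                    # 2    (floored modulo: result in [0, 7))
print(math.fmod(-5.0, 7.0))        # -5.0 (truncated remainder: follows dividend)
print(5 % 7, math.fmod(5.0, 7.0))  # 5 5.0 (non-negative inputs: both agree)
```

With unsigned operands the two conventions coincide, which is exactly the edge case SUnsignedBigInt removes.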
Serialization
const UnsignedBigInt256Serializer = struct {
/// Serialize to variable-length big-endian bytes
pub fn serialize(value: UnsignedBigInt256, writer: anytype) !void {
const bytes = value.toBytes();
// Find first non-zero byte (skip leading zeros)
var start: usize = 0;
while (start < 32 and bytes[start] == 0) : (start += 1) {}
// Write length + bytes
const len = 32 - start;
try writer.writeInt(u8, @intCast(len), .big);
try writer.writeAll(bytes[start..]);
}
/// Deserialize from variable-length big-endian bytes
pub fn deserialize(reader: anytype) !UnsignedBigInt256 {
const len = try reader.readInt(u8, .big);
if (len > 32) return error.InvalidLength;
var bytes: [32]u8 = .{0} ** 32;
const start = 32 - len;
try reader.readNoEof(bytes[start..]);
return UnsignedBigInt256.fromBytes(bytes);
}
};
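The length-prefixed, leading-zero-stripped format round-trips cleanly. A Python model of the same wire format (the helper names are illustrative, not interpreter APIs):

```python
def serialize_ubi(value: int) -> bytes:
    """Length byte, then big-endian magnitude with leading zeros stripped."""
    assert 0 <= value < 2**256
    raw = value.to_bytes(32, "big").lstrip(b"\x00")
    return bytes([len(raw)]) + raw

def deserialize_ubi(data: bytes) -> int:
    length = data[0]
    assert length <= 32, "invalid length"
    return int.from_bytes(data[1:1 + length], "big")

for v in (0, 1, 255, 2**255, 2**256 - 1):
    assert deserialize_ubi(serialize_ubi(v)) == v
print(serialize_ubi(0))  # b'\x00' -- zero encodes as a bare zero-length byte
```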
Modular Arithmetic Operations
v6 adds six modular arithmetic methods to SUnsignedBigInt [3, 4]:
Modular Arithmetic Methods
═══════════════════════════════════════════════════════════════════
Method Signature Cost Description
─────────────────────────────────────────────────────────────────────
mod (UBI, UBI) → UBI 20 a mod m
modInverse (UBI, UBI) → UBI 150 a⁻¹ mod m
plusMod (UBI, UBI, UBI) → UBI 30 (a + b) mod m
subtractMod (UBI, UBI, UBI) → UBI 30 (a - b) mod m
multiplyMod (UBI, UBI, UBI) → UBI 40 (a × b) mod m
toSigned UBI → BigInt 10 Convert to signed
Basic Modulo Operation
/// a mod m - remainder after division
/// Cost: FixedCost(20)
pub fn mod(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// Use schoolbook division for 256-bit values
// Result is always < m
return divmod(a, m).remainder;
}
Modular Inverse (Extended Euclidean Algorithm)
The modular inverse a⁻¹ mod m is the value x such that (a × x) mod m = 1. It exists only when gcd(a, m) = 1.
/// Extended Euclidean Algorithm
/// Returns x such that (a * x) ≡ 1 (mod m)
/// Cost: FixedCost(150) - most expensive modular operation
pub fn modInverse(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
if (a.eql(UnsignedBigInt256.ZERO)) {
return error.NoInverse; // gcd(0, m) = m ≠ 1
}
// Extended Euclidean Algorithm
// Maintains: old_s * a + old_t * m = old_r (Bézout's identity)
var old_r = a;
var r = m;
var old_s = UnsignedBigInt256.ONE;
var s = UnsignedBigInt256.ZERO;
var old_s_negative = false;
var s_negative = false;
while (!r.eql(UnsignedBigInt256.ZERO)) {
const quotient = divmod(old_r, r).quotient;
// (old_r, r) = (r, old_r - quotient * r)
const temp_r = r;
const qr = multiply(quotient, r);
if (old_r.lessThan(qr)) {
r = subtract(qr, old_r);
} else {
r = subtract(old_r, qr);
}
old_r = temp_r;
// (old_s, s) = (s, old_s - quotient * s)
// Handle signed arithmetic carefully
const temp_s = s;
const temp_s_neg = s_negative;
const qs = multiply(quotient, s);
if (old_s_negative == s_negative) {
// Same sign: subtraction
if (old_s.lessThan(qs)) {
s = subtract(qs, old_s);
s_negative = !old_s_negative;
} else {
s = subtract(old_s, qs);
s_negative = old_s_negative;
}
} else {
// Different signs: addition
s = add(old_s, qs);
s_negative = old_s_negative;
}
old_s = temp_s;
old_s_negative = temp_s_neg;
}
// Check that gcd(a, m) = 1
if (!old_r.eql(UnsignedBigInt256.ONE)) {
return error.NoInverse; // a and m are not coprime
}
// Adjust result to be positive
if (old_s_negative) {
return subtract(m, old_s);
}
return old_s;
}
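The magnitude-plus-sign-flag bookkeeping is the error-prone part of the sketch above. A compact Python cross-check (using native signed ints instead of the flag) can be validated against the built-in `pow(a, -1, m)`; `mod_inverse` is an illustrative name, not an interpreter API:

```python
def mod_inverse(a: int, m: int) -> int:
    """Extended Euclidean algorithm; returns x with (a * x) % m == 1."""
    if m == 0:
        raise ZeroDivisionError("zero modulus")
    old_r, r = a, m
    old_s, s = 1, 0                 # Bezout coefficient for a
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("no inverse: gcd(a, m) != 1")
    return old_s % m                # normalize into [0, m)

p = 2**255 - 19                     # the Curve25519 field prime
for a in (2, 3, 12345, p - 1):
    assert mod_inverse(a, p) == pow(a, -1, p)  # Python 3.8+ modular inverse
print("extended Euclid agrees with pow(a, -1, p)")
```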
Modular Addition
/// (a + b) mod m - modular addition
/// Handles overflow by using 320-bit intermediate
/// Cost: FixedCost(30)
pub fn plusMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// a + b can overflow 256 bits, so use 320-bit intermediate
var sum: [5]u64 = .{ 0, 0, 0, 0, 0 };
var carry: u64 = 0;
for (0..4) |i| {
const s = @as(u128, a.words[i]) + @as(u128, b.words[i]) + carry;
sum[i] = @truncate(s);
carry = @truncate(s >> 64);
}
sum[4] = carry;
// Reduce mod m
return reduce320(sum, m);
}
/// (a - b) mod m - modular subtraction
/// If a < b, result is m - (b - a)
/// Cost: FixedCost(30)
pub fn subtractMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
if (a.lessThan(b)) {
// a - b is negative: compute m - (b - a)
const diff = subtract(b, a);
const diff_mod = try mod(diff, m);
if (diff_mod.eql(UnsignedBigInt256.ZERO)) {
return UnsignedBigInt256.ZERO;
}
return subtract(m, diff_mod);
} else {
const diff = subtract(a, b);
return mod(diff, m);
}
}
Modular Multiplication
/// (a * b) mod m - modular multiplication
/// Uses 512-bit intermediate to handle overflow
/// Cost: FixedCost(40)
pub fn multiplyMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// Multiply to 512-bit result
var product: [8]u64 = .{0} ** 8;
for (0..4) |i| {
var carry: u64 = 0;
for (0..4) |j| {
const p = @as(u128, a.words[i]) * @as(u128, b.words[j]) +
@as(u128, product[i + j]) + @as(u128, carry);
product[i + j] = @truncate(p);
carry = @truncate(p >> 64);
}
product[i + 4] = carry;
}
// Reduce 512-bit product mod m
return reduce512(product, m);
}
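Because Python integers are arbitrary precision, the intended semantics of the three ternary operations fit on one line each, which makes them handy oracles when testing the word-level Zig versions above:

```python
def plus_mod(a, b, m):      return (a + b) % m  # sum may exceed 256 bits
def subtract_mod(a, b, m):  return (a - b) % m  # floored % absorbs a < b
def multiply_mod(a, b, m):  return (a * b) % m  # product may need 512 bits

m = 2**255 - 19
a, b = 2**256 - 1, 2**256 - 2        # near the top of the 256-bit range
assert plus_mod(a, b, m) == ((a % m) + (b % m)) % m
assert subtract_mod(3, 5, 7) == 5    # (3 - 5) mod 7 = -2 mod 7 = 5
assert multiply_mod(a, b, m) == (a % m) * (b % m) % m
print("modular identities hold")
```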
Bitwise Operations
v6 adds eight bitwise methods to all numeric types (Byte, Short, Int, Long, BigInt, UnsignedBigInt) [5, 6]:
Bitwise Operations
═══════════════════════════════════════════════════════════════════
Method Signature Cost Description
─────────────────────────────────────────────────────────────────────
bitwiseInverse T → T 5 ~x (NOT)
bitwiseOr (T, T) → T 5 x | y
bitwiseAnd (T, T) → T 5 x & y
bitwiseXor (T, T) → T 5 x ^ y
shiftLeft (T, Int) → T 5 x << n
shiftRight (T, Int) → T 5 x >> n
toBytes T → Coll[Byte] 5 Byte representation
toBits T → Coll[Boolean] 5 Bit representation
Implementation
/// Bitwise operations for all numeric types
pub fn BitwiseOps(comptime T: type) type {
return struct {
/// Bitwise NOT (~x)
/// For signed types: ~x = -x - 1 (two's complement identity)
/// For unsigned: ~x = MAX - x
/// Cost: FixedCost(5)
pub fn bitwiseInverse(x: T) T {
return ~x;
}
/// Bitwise OR (x | y)
/// Cost: FixedCost(5)
pub fn bitwiseOr(x: T, y: T) T {
return x | y;
}
/// Bitwise AND (x & y)
/// Cost: FixedCost(5)
pub fn bitwiseAnd(x: T, y: T) T {
return x & y;
}
/// Bitwise XOR (x ^ y)
/// Cost: FixedCost(5)
pub fn bitwiseXor(x: T, y: T) T {
return x ^ y;
}
/// Left shift (x << n)
/// Returns 0 if n >= bitwidth
/// Cost: FixedCost(5)
pub fn shiftLeft(x: T, n: i32) !T {
if (n < 0) return error.NegativeShift;
const bits = @bitSizeOf(T);
if (n >= bits) return 0;
return x << @intCast(n);
}
/// Right shift (x >> n)
/// Arithmetic shift for signed (preserves sign)
/// Logical shift for unsigned (fills with 0)
/// Cost: FixedCost(5)
pub fn shiftRight(x: T, n: i32) !T {
if (n < 0) return error.NegativeShift;
const bits = @bitSizeOf(T);
if (n >= bits) {
// For signed: return -1 if negative, 0 otherwise
// For unsigned: return 0
if (@typeInfo(T).Int.signedness == .signed) {
return if (x < 0) -1 else 0;
}
return 0;
}
return x >> @intCast(n);
}
};
}
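The over-shift rules (n >= bit width) are where implementations diverge, so they deserve explicit tests. Python's `>>` is arithmetic for negative ints, which lets a small model reproduce the rules above; `shift_right` here is an illustrative helper, not an interpreter API:

```python
def shift_right(x: int, n: int, bits: int, signed: bool) -> int:
    """Mirror shiftRight edge cases: arithmetic for signed, logical for unsigned."""
    if n < 0:
        raise ValueError("negative shift")
    if n >= bits:                    # over-shift: saturate
        return -1 if (signed and x < 0) else 0
    return x >> n                    # Python >> is arithmetic on negatives

assert shift_right(-8, 1, 32, True) == -4    # sign-preserving shift
assert shift_right(-8, 100, 32, True) == -1  # negative over-shift -> all ones
assert shift_right(8, 100, 32, True) == 0    # positive over-shift -> zero
assert shift_right(8, 100, 32, False) == 0   # unsigned over-shift -> zero
print("over-shift edge cases check out")
```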
/// Byte conversion for numeric types (big-endian)
/// Returns the array by value; returning `&result` would hand back
/// a pointer to a stack local that is invalid after the function returns.
pub fn toBytes(comptime T: type, x: T) [@sizeOf(T)]u8 {
var result: [@sizeOf(T)]u8 = undefined;
std.mem.writeInt(T, &result, x, .big);
return result;
}
/// Bit conversion for numeric types (most significant bit first)
/// Also returned by value for the same reason.
pub fn toBits(comptime T: type, x: T) [@bitSizeOf(T)]bool {
const bits = @bitSizeOf(T);
var result: [bits]bool = undefined;
for (0..bits) |i| {
result[bits - 1 - i] = ((x >> @intCast(i)) & 1) == 1;
}
return result;
}
BigInt Bitwise Operations
For BigInt256 and UnsignedBigInt256, bitwise operations work on the full 256-bit representation:
/// 256-bit bitwise operations
const BigIntBitwise = struct {
/// Bitwise NOT for 256-bit unsigned
/// ~x = MAX - x for unsigned interpretation
pub fn bitwiseInverse(x: UnsignedBigInt256) UnsignedBigInt256 {
return .{ .words = .{
~x.words[0],
~x.words[1],
~x.words[2],
~x.words[3],
}};
}
/// Bitwise OR for 256-bit
pub fn bitwiseOr(x: UnsignedBigInt256, y: UnsignedBigInt256) UnsignedBigInt256 {
return .{ .words = .{
x.words[0] | y.words[0],
x.words[1] | y.words[1],
x.words[2] | y.words[2],
x.words[3] | y.words[3],
}};
}
/// Left shift for 256-bit (handles cross-word shifts)
pub fn shiftLeft(x: UnsignedBigInt256, n: i32) !UnsignedBigInt256 {
if (n < 0) return error.NegativeShift;
if (n >= 256) return UnsignedBigInt256.ZERO;
const shift: u8 = @intCast(n);
const word_shift = shift / 64;
const bit_shift: u6 = @intCast(shift % 64);
var result = UnsignedBigInt256.ZERO;
if (bit_shift == 0) {
// Word-aligned shift
for (word_shift..4) |i| {
result.words[i] = x.words[i - word_shift];
}
} else {
// Cross-word shift
for (word_shift..4) |i| {
result.words[i] = x.words[i - word_shift] << bit_shift;
if (i > word_shift) {
result.words[i] |= x.words[i - word_shift - 1] >> (64 - bit_shift);
}
}
}
return result;
}
};
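Cross-word shifts are easy to get off by one. Comparing the word-level routine against Python's direct big-int shift over random inputs catches such bugs; the helper names here are illustrative:

```python
import random

MASK64 = 0xFFFFFFFFFFFFFFFF

def shift_left_words(words, n):
    """Little-endian 4x64-word left shift, mirroring the Zig cross-word logic."""
    if n >= 256:
        return [0, 0, 0, 0]
    word_shift, bit_shift = divmod(n, 64)
    out = [0, 0, 0, 0]
    for i in range(word_shift, 4):
        out[i] = (words[i - word_shift] << bit_shift) & MASK64
        if bit_shift and i > word_shift:
            out[i] |= words[i - word_shift - 1] >> (64 - bit_shift)
    return out

def to_int(words):
    """words[0] is the least significant 64-bit limb."""
    return sum(w << (64 * i) for i, w in enumerate(words))

random.seed(1)
for _ in range(200):
    x = random.getrandbits(256)
    n = random.randrange(300)
    limbs = [(x >> (64 * i)) & MASK64 for i in range(4)]
    assert to_int(shift_left_words(limbs, n)) == (x << n) % 2**256
print("word-level shift matches direct 256-bit shift")
```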
Collection Methods (v6)
v6 adds seven new methods to Coll[T] for efficient collection manipulation [7, 8]:
Collection Methods (v6)
═══════════════════════════════════════════════════════════════════
Method Signature Cost
─────────────────────────────────────────────────────────────────────
patch (Coll[T], Int, Coll[T], Int) → Coll[T] PerItem(30,2,10)
updated (Coll[T], Int, T) → Coll[T] PerItem(20,1,10)
updateMany (Coll[T], Coll[Int], Coll[T]) → Coll[T] PerItem(20,2,10)
reverse Coll[T] → Coll[T] PerItem (append)
startsWith (Coll[T], Coll[T]) → Boolean PerItem (zip)
endsWith (Coll[T], Coll[T]) → Boolean PerItem (zip)
get (Coll[T], Int) → Option[T] FixedCost(14)
patch - Replace Slice
/// Replace elements from index `from`, removing `replaced` elements,
/// inserting `patch` collection in their place.
///
/// xs.patch(from, patch, replaced):
/// result = xs[0..from] ++ patch ++ xs[from+replaced..]
///
/// Cost: PerItemCost(30, 2, 10) based on xs.len + patch.len
pub fn patch(
comptime T: type,
xs: []const T,
from: i32,
patchColl: []const T,
replaced: i32,
allocator: Allocator,
) ![]T {
if (from < 0) return error.IndexOutOfBounds;
const from_idx: usize = @intCast(from);
if (from_idx > xs.len) return error.IndexOutOfBounds;
const replaced_count: usize = if (replaced < 0)
0
else
@min(@as(usize, @intCast(replaced)), xs.len - from_idx);
// Result length: original - replaced + patch
const result_len = xs.len - replaced_count + patchColl.len;
var result = try allocator.alloc(T, result_len);
// Copy prefix [0..from]
@memcpy(result[0..from_idx], xs[0..from_idx]);
// Copy patch
@memcpy(result[from_idx..][0..patchColl.len], patchColl);
// Copy suffix [from+replaced..]
const suffix_start = from_idx + replaced_count;
const suffix_dest = from_idx + patchColl.len;
@memcpy(result[suffix_dest..], xs[suffix_start..]);
return result;
}
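patch is equivalent to a slice-splice, which Python states in one expression and which makes the boundary cases (append at the end, pure deletion) easy to verify:

```python
def patch(xs, start, patch_coll, replaced):
    """xs[:start] ++ patch_coll ++ xs[start + replaced:], clamping `replaced`."""
    if not (0 <= start <= len(xs)):
        raise IndexError("from out of bounds")
    replaced = max(0, min(replaced, len(xs) - start))
    return xs[:start] + patch_coll + xs[start + replaced:]

assert patch([1, 2, 3, 4, 5], 1, [9, 9], 2) == [1, 9, 9, 4, 5]
assert patch([1, 2, 3], 3, [7], 0) == [1, 2, 3, 7]   # insert at the end
assert patch([1, 2, 3], 0, [], 2) == [3]             # pure deletion
print("patch slice-splice semantics hold")
```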
updated - Single Element Update
/// Return a new collection with element at index replaced.
/// Immutable operation - original collection unchanged.
///
/// Cost: PerItemCost(20, 1, 10)
pub fn updated(
comptime T: type,
xs: []const T,
idx: i32,
value: T,
allocator: Allocator,
) ![]T {
if (idx < 0) return error.IndexOutOfBounds;
const index: usize = @intCast(idx);
if (index >= xs.len) return error.IndexOutOfBounds;
var result = try allocator.dupe(T, xs);
result[index] = value;
return result;
}
updateMany - Batch Update
/// Update multiple elements at specified indices.
/// indexes and updates must have the same length.
///
/// Cost: PerItemCost(20, 2, 10)
pub fn updateMany(
comptime T: type,
xs: []const T,
indexes: []const i32,
updates: []const T,
allocator: Allocator,
) ![]T {
if (indexes.len != updates.len) {
return error.LengthMismatch;
}
// Validate all indexes first
for (indexes) |idx| {
if (idx < 0) return error.IndexOutOfBounds;
if (@as(usize, @intCast(idx)) >= xs.len) return error.IndexOutOfBounds;
}
var result = try allocator.dupe(T, xs);
for (indexes, updates) |idx, val| {
result[@intCast(idx)] = val;
}
return result;
}
reverse, startsWith, endsWith, get
/// Reverse collection order
/// Cost: Same as append (PerItem)
pub fn reverse(comptime T: type, xs: []const T, allocator: Allocator) ![]T {
var result = try allocator.alloc(T, xs.len);
for (xs, 0..) |x, i| {
result[xs.len - 1 - i] = x;
}
return result;
}
/// Check if collection starts with prefix
/// Cost: Same as zip (PerItem based on prefix length)
pub fn startsWith(comptime T: type, xs: []const T, prefix: []const T) bool {
if (prefix.len > xs.len) return false;
return std.mem.eql(T, xs[0..prefix.len], prefix);
}
/// Check if collection ends with suffix
/// Cost: Same as zip (PerItem based on suffix length)
pub fn endsWith(comptime T: type, xs: []const T, suffix: []const T) bool {
if (suffix.len > xs.len) return false;
return std.mem.eql(T, xs[xs.len - suffix.len ..], suffix);
}
/// Safe element access returning Option
/// Returns null if index out of bounds (instead of error)
/// Cost: FixedCost(14)
pub fn get(comptime T: type, xs: []const T, idx: i32) ?T {
if (idx < 0) return null;
const index: usize = @intCast(idx);
if (index >= xs.len) return null;
return xs[index];
}
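The Option-returning accessor and the prefix/suffix checks translate directly to Python, which is useful for quick property tests; `get` here is an illustrative stand-in, not an interpreter API:

```python
def get(xs, idx):
    """Option-style access: None instead of an error when out of bounds."""
    return xs[idx] if 0 <= idx < len(xs) else None

xs = [10, 20, 30]
assert get(xs, 1) == 20
assert get(xs, -1) is None       # negative index is out of bounds here
assert get(xs, 3) is None
# startsWith / endsWith reduce to slice equality:
assert xs[:2] == [10, 20]        # startsWith([10, 20])
assert xs[-2:] == [20, 30]       # endsWith([20, 30])
print("safe access and prefix/suffix checks hold")
```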
Autolykos2 Proof-of-Work
v6 exposes proof-of-work verification in ErgoScript through Header.checkPow() and Global.powHit() [9, 10]. Autolykos2 is Ergo's memory-hard, ASIC-resistant PoW algorithm designed for fair GPU mining.
Algorithm Overview
Autolykos2 Structure
═══════════════════════════════════════════════════════════════════
Parameters:
N = 2²⁶ ≈ 67 million Table size (memory requirement)
k = 32 Elements to sum per solution
n = 26 log₂(N)
Memory: N × 32 bytes ≈ 2 GB table
Algorithm:
1. Seed table from height (changes every ~1024 blocks)
2. For each nonce attempt:
a. Compute 32 pseudo-random indices from (msg, nonce)
b. Sum the 32 table elements at those indices
c. Hash (msg || nonce || sum) to get PoW hit
d. If hit < target, solution found
Implementation
/// Autolykos2 proof-of-work algorithm constants and functions
const Autolykos2 = struct {
/// Table size: 2^26 elements
pub const N: u32 = 1 << 26;
/// Elements summed per solution attempt
pub const K: u32 = 32;
/// Bits in N (log2(N))
pub const N_BITS: u5 = 26;
/// Element size in bytes
pub const ELEMENT_SIZE: usize = 32;
/// Total table memory requirement
pub const TABLE_SIZE: usize = N * ELEMENT_SIZE; // ~2 GB
/// Epoch length for table seed rotation
pub const EPOCH_LENGTH: u32 = 1024;
/// Compute the PoW hit value for a header
/// Returns BigInt256 that must be < target (from nBits)
///
/// Cost: ~700 JitCost (multiple Blake2b256 computations)
pub fn powHit(
header_without_pow: []const u8,
nonce: u64,
height: u32,
) BigInt256 {
// Step 1: Compute message hash
const msg = Blake2b256.hash(header_without_pow);
// Step 2: Generate table seed from height epoch
const seed = computeTableSeed(height);
// Step 3: Compute k-sum of table elements
var sum = UnsignedBigInt256.ZERO;
var nonce_bytes: [8]u8 = undefined;
std.mem.writeInt(u64, &nonce_bytes, nonce, .big);
for (0..K) |i| {
// Derive index from hash(msg || nonce || i)
var index_input: [32 + 8 + 4]u8 = undefined;
@memcpy(index_input[0..32], &msg);
@memcpy(index_input[32..40], &nonce_bytes);
std.mem.writeInt(u32, index_input[40..44], @intCast(i), .big);
const index_hash = Blake2b256.hash(&index_input);
const idx = std.mem.readInt(u32, index_hash[0..4], .big) % N;
// Look up table element
const element = computeTableElement(seed, idx);
sum = addUnchecked(sum, element);
}
// Step 4: Final hash to get PoW hit
var final_input: [32 + 8 + 32]u8 = undefined;
const sum_bytes = sum.toBytes(); // bind to a local; taking &sum.toBytes() directly is not a valid lvalue
@memcpy(final_input[0..32], &msg);
@memcpy(final_input[32..40], &nonce_bytes);
@memcpy(final_input[40..72], &sum_bytes);
const hit_hash = Blake2b256.hash(&final_input);
return BigInt256.fromBytes(hit_hash);
}
/// Compute table seed from block height
/// Seed changes every EPOCH_LENGTH blocks to prevent precomputation
fn computeTableSeed(height: u32) [32]u8 {
const epoch = height / EPOCH_LENGTH;
var epoch_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &epoch_bytes, epoch, .big);
return Blake2b256.hash(&epoch_bytes);
}
/// Compute table element at given index
/// This is the memory-hard part - miners must store or recompute
fn computeTableElement(seed: [32]u8, idx: u32) UnsignedBigInt256 {
// Element = H(seed || idx || 0) || H(seed || idx || 1) || ...
// Combined to form 256-bit value
var result: [32]u8 = undefined;
for (0..4) |chunk| {
var input: [32 + 4 + 1]u8 = undefined;
@memcpy(input[0..32], &seed);
std.mem.writeInt(u32, input[32..36], idx, .big);
input[36] = @intCast(chunk);
const chunk_hash = Blake2b256.hash(&input);
@memcpy(result[chunk * 8 ..][0..8], chunk_hash[0..8]);
}
return UnsignedBigInt256.fromBytes(result);
}
};
Header.checkPow
/// Verify that a block header satisfies the PoW difficulty requirement
///
/// checkPow() returns true iff powHit(header) < decodeNBits(header.nBits)
///
/// Cost: FixedCost(700) - approximately 2×32 hash computations
pub fn checkPow(header: Header) bool {
// Serialize header without PoW solution
const header_bytes = header.serializeWithoutPow();
// Compute PoW hit
const hit = Autolykos2.powHit(
header_bytes,
header.powSolutions.n, // nonce
header.height,
);
// Decode difficulty target from nBits
const target = NBits.decode(header.nBits);
// Valid if hit < target
return hit.lessThan(target);
}
Why Memory-Hard?
Autolykos2's memory requirement (~2GB) provides ASIC resistance:
- Table storage: Miners must maintain the full table in fast memory
- Random access: k=32 random lookups per attempt prevents caching tricks
- Epoch rotation: Table changes every ~1024 blocks, invalidating precomputation
- GPU-friendly: Memory bandwidth is the bottleneck, favoring commodity GPUs
NBits Difficulty Encoding
The nBits field in block headers uses a compact encoding for the difficulty target [11]:
NBits Format
═══════════════════════════════════════════════════════════════════
Format: 0xAABBBBBB (4 bytes)
AA = exponent (1 byte)
BBBBBB = mantissa (3 bytes)
Value = mantissa × 256^(exponent - 3)
Example:
nBits = 0x1d00ffff
exponent = 0x1d = 29
mantissa = 0x00ffff = 65535
target = 65535 × 256^(29-3) = 65535 × 256^26
Implementation
const NBits = struct {
/// Decode nBits to BigInt target
/// Cost: FixedCost(10)
pub fn decode(nBits: i64) BigInt256 {
const n = @as(u32, @intCast(nBits & 0xFFFFFFFF));
const exp: u8 = @intCast((n >> 24) & 0xFF);
const mantissa: u32 = n & 0x00FFFFFF;
if (exp <= 3) {
// Small exponent: right shift mantissa
const shift = (3 - exp) * 8;
return BigInt256.fromInt(mantissa >> @intCast(shift));
} else {
// Normal case: left shift mantissa
const shift = (exp - 3) * 8;
return BigInt256.fromInt(mantissa).shiftLeft(shift);
}
}
/// Encode BigInt to nBits
/// Cost: FixedCost(10)
pub fn encode(target: BigInt256) i64 {
// Find the byte length of target
const bytes = target.toBytes();
var byte_len: u8 = 32;
for (bytes) |b| {
if (b != 0) break;
byte_len -= 1;
}
if (byte_len == 0) return 0;
// Extract top 3 bytes as mantissa
const start = 32 - byte_len;
var mantissa: u32 = 0;
if (byte_len >= 3) {
mantissa = (@as(u32, bytes[start]) << 16) |
(@as(u32, bytes[start + 1]) << 8) |
@as(u32, bytes[start + 2]);
} else if (byte_len == 2) {
mantissa = (@as(u32, bytes[start]) << 8) |
@as(u32, bytes[start + 1]);
} else {
mantissa = bytes[start];
}
// Handle sign bit in mantissa (MSB must be 0)
if (mantissa & 0x00800000 != 0) {
mantissa >>= 8;
byte_len += 1;
}
return @as(i64, byte_len) << 24 | @as(i64, mantissa);
}
};
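The encoding is lossy (only three mantissa bytes survive), but it round-trips on already-encoded targets. A Python model of both directions, including the sign-bit normalization, reproduces the worked 0x1d00ffff example above:

```python
def decode_nbits(nbits: int) -> int:
    """mantissa * 256^(exponent - 3), with right shift for small exponents."""
    exp = (nbits >> 24) & 0xFF
    mantissa = nbits & 0x00FFFFFF
    if exp <= 3:
        return mantissa >> (8 * (3 - exp))
    return mantissa << (8 * (exp - 3))

def encode_nbits(target: int) -> int:
    """Keep the top 3 bytes as mantissa, byte length as exponent."""
    if target == 0:
        return 0
    byte_len = (target.bit_length() + 7) // 8
    if byte_len <= 3:
        mantissa = target << (8 * (3 - byte_len))
    else:
        mantissa = target >> (8 * (byte_len - 3))
    if mantissa & 0x00800000:        # top mantissa bit is reserved for sign
        mantissa >>= 8
        byte_len += 1
    return (byte_len << 24) | mantissa

assert decode_nbits(0x1D00FFFF) == 0xFFFF * 256**26   # the worked example
assert encode_nbits(decode_nbits(0x1D00FFFF)) == 0x1D00FFFF
print("nBits decode/encode round-trip OK")
```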
Global Serialization Methods
v6 adds methods to Global for value serialization [12]:
serialize
/// Serialize any value to bytes using SigmaSerializer
/// Works for all serializable types
/// Cost: Varies by type complexity
pub fn serialize(comptime T: type, value: T, allocator: Allocator) ![]const u8 {
var buffer = std.ArrayList(u8).init(allocator); // allocator is now a parameter, not an implicit global
errdefer buffer.deinit();
try SigmaSerializer.serialize(T, value, buffer.writer());
return buffer.toOwnedSlice();
}
fromBigEndianBytes
/// Deserialize numeric type from big-endian bytes
/// Generic over numeric types
/// Cost: FixedCost(5) for primitives
pub fn fromBigEndianBytes(comptime T: type, bytes: []const u8) !T {
const size = @sizeOf(T);
if (bytes.len != size) return error.InvalidLength;
var arr: [size]u8 = undefined;
@memcpy(&arr, bytes);
return std.mem.readInt(T, &arr, .big);
}
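For fixed-width integers this is exactly big-endian byte decoding, which Python exposes directly and which makes convenient spot checks for test vectors:

```python
# Big-endian parsing of fixed-width integers, unsigned and signed views:
assert int.from_bytes(b"\x01\x00", "big") == 256
assert int.from_bytes(b"\xff\xff", "big", signed=True) == -1
# And the inverse direction, for building test vectors:
assert (2**31 - 1).to_bytes(4, "big") == b"\x7f\xff\xff\xff"
print("big-endian round-trip checks pass")
```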
Cost Model for v6 Operations
v6 Operation Costs
═══════════════════════════════════════════════════════════════════
Operation Cost Type Value Notes
─────────────────────────────────────────────────────────────────────
Modular Arithmetic:
mod(a, m) Fixed 20 Division
modInverse(a, m) Fixed 150 Extended Euclid
plusMod(a, b, m) Fixed 30 Add + mod
subtractMod(a, b, m) Fixed 30 Sub + mod
multiplyMod(a, b, m) Fixed 40 Mul + mod
Bitwise (all types):
bitwiseInverse(x) Fixed 5 Single op
bitwiseOr(x, y) Fixed 5 Single op
bitwiseAnd(x, y) Fixed 5 Single op
bitwiseXor(x, y) Fixed 5 Single op
shiftLeft(x, n) Fixed 5 Single op
shiftRight(x, n) Fixed 5 Single op
toBytes(x) Fixed 5 Conversion
toBits(x) Fixed 5 Conversion
Collections:
patch(xs, from, p, r) PerItem(30,2,10) O(n)
updated(xs, idx, v) PerItem(20,1,10) O(n) copy
updateMany(xs, is, vs) PerItem(20,2,10) O(n)
reverse(xs) PerItem (append) O(n)
startsWith(xs, p) PerItem (zip) O(|p|)
endsWith(xs, s) PerItem (zip) O(|s|)
get(xs, idx) Fixed 14 O(1)
Cryptographic:
expUnsigned(g, k) Fixed 900 Scalar mult
checkPow(header) Fixed 700 ~32 hashes
powHit(...) Dynamic Autolykos2
Serialization:
serialize(v) Varies Type-dependent
fromBigEndianBytes(b) Fixed 5 Simple parse
encodeNBits(n) Fixed 10 Encoding
decodeNBits(n) Fixed 10 Decoding
Migration Guide
Version Checking in Scripts
// ErgoScript: Check v6 availability
val canUseV6 = getVar[Boolean](127).getOrElse(false)
// Conditional v6 feature usage
if (canUseV6) {
// Use v6 features
val x: UnsignedBigInt = ...
val inv = x.modInverse(p)
} else {
// Fallback for pre-v6
}
When to Use v6 Features
| Feature | Use When |
|---|---|
| SUnsignedBigInt | Cryptographic protocols requiring modular arithmetic |
| modInverse | Computing multiplicative inverses in finite fields |
| Bitwise ops | Bit manipulation, flags, compact encodings |
| patch/updated | Immutable collection updates in contracts |
| get | Safe array access without exceptions |
| checkPow | On-chain PoW verification for sidechains/merged mining |
Backward Compatibility
- v6 features are only available when VersionContext.isV6Activated() returns true
- Scripts using v6 features will fail validation on pre-v6 nodes
- Design scripts with fallback paths for pre-v6 compatibility during transition
Summary
This chapter covered the v6 protocol features that expand ErgoTree's capabilities:
- SUnsignedBigInt provides 256-bit unsigned integers for cryptographic modular arithmetic, with six new methods (mod, modInverse, plusMod, subtractMod, multiplyMod, toSigned)
- Bitwise operations (AND, OR, XOR, NOT, shifts) are now available on all numeric types with consistent semantics and low cost (5 JitCost each)
- Collection methods (patch, updated, updateMany, reverse, startsWith, endsWith, get) enable efficient immutable collection manipulation
- Autolykos2 PoW verification is exposed through Header.checkPow() and Global.powHit(), enabling on-chain proof-of-work validation
- NBits encoding provides compact difficulty target representation with encodeNBits/decodeNBits
- Serialization methods (Global.serialize, fromBigEndianBytes) support arbitrary value serialization
- The cost model assigns a cost to each operation, with modInverse (150) and checkPow (700) among the most expensive due to their computational complexity
Previous: Chapter 31 | Next: Appendix A
1. Scala: CUnsignedBigInt.scala
2. Rust: unsignedbigint256.rs
3. Scala: methods.scala:570-625 (SUnsignedBigIntMethods)
4. Rust: snumeric.rs:381-491
5. Scala: methods.scala (Bitwise method definitions)
6. Rust: snumeric.rs:73-264
7. Scala: Colls.scala (Collection trait)
8. Rust: scoll.rs:140-266
9. Ergo: Autolykos PoW
10. Rust: Header type
11. Bitcoin Wiki: Difficulty
12. Scala: sglobal.scala (SGlobalMethods)
Appendix A: Complete Type Code Table
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for all type codes used in ErgoTree serialization [1, 2].
Type Code Ranges
Type Code Organization
══════════════════════════════════════════════════════════════════
Range Usage
─────────────────────────────────────────────────────────────────
0x00 Reserved (invalid)
0x01-0x09 Primitive types (embeddable)
0x0A-0x0B Reserved
0x0C Collection type constructor
0x0D-0x17 Reserved
0x18 Nested collection (Coll[Coll[T]])
0x19-0x23 Reserved
0x24 Option type constructor
0x25-0x3B Reserved
0x3C-0x5F Pair type constructors
0x60 Tuple type constructor
0x61-0x6A Object types (non-embeddable)
0x6B-0x6F Reserved for future object types
Primitive Types (Embeddable)
Embeddable types can be used as element types in collections (Coll[T], Option[T]) and have compact type code encodings: an embeddable argument is folded into the constructor's type code by addition rather than serialized separately. For example, Coll[Int] is encoded as the single byte 0x0C + 0x04 = 0x10, and Coll[Byte] as 0x0C + 0x02 = 0x0E.
| Dec | Hex | Type | Size | Zig Type |
|---|---|---|---|---|
| 1 | 0x01 | SBoolean | 1 bit | bool |
| 2 | 0x02 | SByte | 8 bits | i8 |
| 3 | 0x03 | SShort | 16 bits | i16 |
| 4 | 0x04 | SInt | 32 bits | i32 |
| 5 | 0x05 | SLong | 64 bits | i64 |
| 6 | 0x06 | SBigInt | 256 bits | BigInt256 |
| 7 | 0x07 | SGroupElement | 33 bytes | Ecp (compressed) |
| 8 | 0x08 | SSigmaProp | variable | SigmaBoolean |
| 9 | 0x09 | SUnsignedBigInt | 256 bits | UnsignedBigInt256 |
Object Types
| Dec | Hex | Type | Description |
|---|---|---|---|
| 97 | 0x61 | SAny | Supertype of all types |
| 98 | 0x62 | SUnit | Unit type (singleton) |
| 99 | 0x63 | SBox | Transaction box |
| 100 | 0x64 | SAvlTree | Authenticated dictionary |
| 101 | 0x65 | SContext | Execution context |
| 102 | 0x66 | SString | String (ErgoScript only) |
| 103 | 0x67 | STypeVar | Type variable (internal) |
| 104 | 0x68 | SHeader | Block header |
| 105 | 0x69 | SPreHeader | Block pre-header |
| 106 | 0x6A | SGlobal | Global object (SigmaDslBuilder) |
Type Constructors
| Dec | Hex | Constructor | Example | Serialized As |
|---|---|---|---|---|
| 12 | 0x0C | SColl | Coll[Byte] | 0x0C + 0x02 = 0x0E |
| 24 | 0x18 | Nested SColl | Coll[Coll[Int]] | 0x18 + 0x04 = 0x1C |
| 36 | 0x24 | SOption | Option[Long] | 0x24 + 0x05 = 0x29 |
| 60 | 0x3C | Pair (first embeddable) | (Byte, _) | 0x3C + 0x02, then second type |
| 72 | 0x48 | Pair (second embeddable) | (_, Int) | 0x48 + 0x04, then first type |
| 84 | 0x54 | Pair (symmetric) | (Long, Long) | 0x54 + 0x05 = 0x59 |
| 96 | 0x60 | STuple | (Int, Boolean, ...) | 0x60, then length, then types |
Zig Type Definition
const TypeCode = enum(u8) {
// Primitive types
boolean = 0x01,
byte = 0x02,
short = 0x03,
int = 0x04,
long = 0x05,
big_int = 0x06,
group_element = 0x07,
sigma_prop = 0x08,
unsigned_big_int = 0x09,
// Type constructors
coll = 0x0C,
nested_coll = 0x18,
option = 0x24,
pair_first = 0x3C,
pair_second = 0x48,
pair_symmetric = 0x54,
tuple = 0x60,
// Object types
any = 0x61,
unit = 0x62,
box = 0x63,
avl_tree = 0x64,
context = 0x65,
string = 0x66,
type_var = 0x67,
header = 0x68,
pre_header = 0x69,
global = 0x6A,
pub fn isPrimitive(self: TypeCode) bool {
return @intFromEnum(self) >= 0x01 and @intFromEnum(self) <= 0x09;
}
pub fn isEmbeddable(self: TypeCode) bool {
return self.isPrimitive();
}
pub fn isNumeric(self: TypeCode) bool {
return switch (self) {
.byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
else => false,
};
}
};
Type Serialization
const SType = union(enum) {
boolean,
byte,
short,
int,
long,
big_int,
group_element,
sigma_prop,
unsigned_big_int,
coll: *const SType,
option: *const SType,
tuple: []const SType,
box,
avl_tree,
context,
header,
pre_header,
global,
unit,
any,
/// Base constructor code. During serialization, an embeddable element
/// code is added to this base to form the final byte (see Encoding Rules).
pub fn typeCode(self: *const SType) u8 {
return switch (self.*) {
.boolean => 0x01,
.byte => 0x02,
.short => 0x03,
.int => 0x04,
.long => 0x05,
.big_int => 0x06,
.group_element => 0x07,
.sigma_prop => 0x08,
.unsigned_big_int => 0x09,
.coll => |elem| blk: {
if (elem.* == .coll) break :blk 0x18;
break :blk 0x0C;
},
.option => 0x24,
.tuple => 0x60,
.box => 0x63,
.avl_tree => 0x64,
.context => 0x65,
.header => 0x68,
.pre_header => 0x69,
.global => 0x6A,
.unit => 0x62,
.any => 0x61,
};
}
};
Encoding Rules
Type Encoding Examples
══════════════════════════════════════════════════════════════════
Simple Types:
SInt → [0x04]
SBoolean → [0x01]
SGroupElement → [0x07]
Collections (embedded element code is added to the base code):
Coll[Byte] → [0x0E] (0x0C + 0x02)
Coll[Int] → [0x10] (0x0C + 0x04)
Coll[Coll[Byte]] → [0x1A] (0x18 + 0x02)
Coll[Box] → [0x0C, 0x63] (non-embeddable element follows)
Options:
Option[Int] → [0x28] (0x24 + 0x04)
Option[Box] → [0x24, 0x63] (non-embeddable element follows)
Tuples (2 elements):
(Int, Int) → [0x58] (0x54 + 0x04, symmetric)
(Int, Long) → [0x40, 0x05] (0x3C + 0x04, then Long)
(Long, Int) → [0x41, 0x04] (0x3C + 0x05, then Int)
(Box, Int) → [0x4C, 0x63] (0x48 + 0x04, then Box)
Tuples (3+ elements):
(Int, Long, Byte) → [0x60, 0x03, 0x04, 0x05, 0x02]
(tuple + len + int + long + byte)
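The embedding rule above can be cross-checked with a small Python sketch. This is not part of any implementation; the type representation (plain ints for atomic codes, tagged tuples for constructors) and all helper names are ours, and the sketch mirrors the serializer logic described above under those assumptions.

```python
# Hypothetical mini-encoder for ErgoTree type codes, mirroring the
# embedding rule: embeddable element codes are ADDED to the base
# constructor code; non-embeddable elements follow as separate bytes.
COLL, NESTED_COLL, OPTION, OPTION_COLL = 0x0C, 0x18, 0x24, 0x30
PAIR1, PAIR2, PAIR_SYM, TUPLE = 0x3C, 0x48, 0x54, 0x60

def embeddable(t):
    # Primitive (embeddable) codes occupy 0x01-0x09
    return isinstance(t, int) and 0x01 <= t <= 0x09

def encode(t):
    """t is an int (atomic code) or ("coll"/"option"/"tuple", payload)."""
    if isinstance(t, int):
        return [t]
    kind, payload = t
    if kind == "coll":
        if embeddable(payload):
            return [COLL + payload]            # Coll[Byte] -> 0x0E
        if isinstance(payload, tuple) and payload[0] == "coll" and embeddable(payload[1]):
            return [NESTED_COLL + payload[1]]  # Coll[Coll[Byte]] -> 0x1A
        return [COLL] + encode(payload)        # Coll[Box] -> 0x0C 0x63
    if kind == "option":
        if embeddable(payload):
            return [OPTION + payload]          # Option[Int] -> 0x28
        if isinstance(payload, tuple) and payload[0] == "coll" and embeddable(payload[1]):
            return [OPTION_COLL + payload[1]]  # Option[Coll[Byte]] -> 0x32
        return [OPTION] + encode(payload)
    # kind == "tuple"
    items = payload
    if len(items) == 2:
        a, b = items
        if embeddable(a) and a == b:
            return [PAIR_SYM + a]              # (Int, Int) -> 0x58
        if embeddable(a):
            return [PAIR1 + a] + encode(b)     # (Int, Long) -> 0x40 0x05
        if embeddable(b):
            return [PAIR2 + b] + encode(a)     # (Box, Int) -> 0x4C 0x63
        return [PAIR1] + encode(a) + encode(b) # generic pair
    return [TUPLE, len(items)] + [c for i in items for c in encode(i)]
```

For instance, `encode(("coll", 0x02))` yields `[0x0E]`, the familiar Coll[Byte] prefix seen in serialized register values.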
Constants
const TypeConstants = struct {
/// First type code for primitive types
pub const FIRST_PRIMITIVE_TYPE: u8 = 0x01;
/// Last type code for primitive types
pub const LAST_PRIMITIVE_TYPE: u8 = 0x09;
/// Maximum supported type code
pub const MAX_TYPE_CODE: u8 = 0x6A;
/// Last data type (can be serialized as data)
pub const LAST_DATA_TYPE: u8 = 111; // 0x6F
};
Previous: Chapter 31 | Next: Appendix B
Scala: SType.scala
Rust: stype.rs:28-76
Appendix B: Complete Opcode Table
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for all operation codes used in ErgoTree serialization.
Opcode Ranges
Opcode Space Organization
══════════════════════════════════════════════════════════════════
Range Usage Encoding
─────────────────────────────────────────────────────────────────
0x00 Reserved (invalid) -
0x01-0x6F Data types (constants) Type code directly
0x70 LastConstantCode boundary 112
0x71-0xFF Operations LastConstantCode + shift
Operation Categories:
─────────────────────────────────────────────────────────────────
0x71-0x79 Variables & references ValUse, ConstPlaceholder
0x7A-0x7E Type conversions Upcast, Downcast
0x7F-0x8C Constants & tuples True, False, Tuple
0x8F-0x98 Relations & logic Lt, Gt, Eq, And, Or
0x99-0xA2 Arithmetic Plus, Minus, Multiply
0xA3-0xAC Context access HEIGHT, INPUTS, OUTPUTS
0xAD-0xB8 Collection operations Map, Filter, Fold
0xC1-0xC7 Box extraction ExtractAmount, ExtractId
0xCB-0xD5 Crypto & serialization Blake2b, ProveDlog
0xD6-0xE6 Blocks, functions & options ValDef, FuncValue, GetVar
0xE7-0xE9 Modular arithmetic (deprecated) ModQ, PlusModQ
0xEA-0xEB Sigma operations SigmaAnd, SigmaOr
0xEC-0xFF Bitwise & misc BitOr, BitAnd, XorOf
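The constant/operation split above can be cross-checked with a few lines of Python. This is a behavioral sketch, not implementation code: a serialized byte of 0x00 is invalid, a byte up to LastConstantCode (0x70) begins a constant whose type code is that byte, and anything above is an operation.

```python
# Classify a leading serialized byte per the opcode space layout above.
LAST_CONSTANT_CODE = 0x70  # 112, boundary between constants and operations

def classify(byte):
    if byte == 0x00:
        return "invalid"
    if byte <= LAST_CONSTANT_CODE:
        return "constant"   # byte is a data type code starting a constant
    return "operation"

assert classify(0x04) == "constant"    # SInt constant follows
assert classify(0x93) == "operation"   # Eq
assert classify(0x00) == "invalid"
```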
Zig Opcode Definition
const OpCode = enum(u8) {
// Data type codes occupy 0x01-0x6F; 0x70 is LastConstantCode
// Operations start at LAST_CONSTANT_CODE + 1 = 113 (0x71)
// Variable references
tagged_variable = 0x71, // Context variable by ID
val_use = 0x72, // Reference to ValDef binding
constant_placeholder = 0x73, // Segregated constant reference
subst_constants = 0x74, // Substitute constants in tree
// Type conversions
long_to_byte_array = 0x7A,
byte_array_to_bigint = 0x7B,
byte_array_to_long = 0x7C,
downcast = 0x7D,
upcast = 0x7E,
// Primitive constants
true_const = 0x7F,
false_const = 0x80,
unit_constant = 0x81,
group_generator = 0x82,
// Collection & tuple construction
concrete_collection = 0x83,
concrete_collection_bool = 0x85,
tuple = 0x86,
select_1 = 0x87,
select_2 = 0x88,
select_3 = 0x89,
select_4 = 0x8A,
select_5 = 0x8B,
select_field = 0x8C,
// Relational operations
lt = 0x8F,
le = 0x90,
gt = 0x91,
ge = 0x92,
eq = 0x93,
neq = 0x94,
// Control flow & logic
if_op = 0x95,
and_op = 0x96,
or_op = 0x97,
atleast = 0x98,
// Arithmetic
minus = 0x99,
plus = 0x9A,
xor = 0x9B,
multiply = 0x9C,
division = 0x9D,
modulo = 0x9E,
exponentiate = 0x9F,
multiply_group = 0xA0,
min = 0xA1,
max = 0xA2,
// Context access
height = 0xA3,
inputs = 0xA4,
outputs = 0xA5,
last_block_utxo_root_hash = 0xA6,
self_box = 0xA7,
miner_pubkey = 0xAC,
// Collection operations
map_collection = 0xAD,
exists = 0xAE,
forall = 0xAF,
fold = 0xB0,
size_of = 0xB1,
by_index = 0xB2,
append = 0xB3,
slice = 0xB4,
filter = 0xB5,
avl_tree = 0xB6,
avl_tree_get = 0xB7,
flat_map = 0xB8,
// Box extraction
extract_amount = 0xC1,
extract_script_bytes = 0xC2,
extract_bytes = 0xC3,
extract_bytes_with_no_ref = 0xC4,
extract_id = 0xC5,
extract_register_as = 0xC6,
extract_creation_info = 0xC7,
// Cryptographic operations
calc_blake2b256 = 0xCB,
calc_sha256 = 0xCC,
prove_dlog = 0xCD,
prove_diffie_hellman_tuple = 0xCE,
sigma_prop_is_proven = 0xCF,
sigma_prop_bytes = 0xD0,
bool_to_sigma_prop = 0xD1,
trivial_prop_false = 0xD2,
trivial_prop_true = 0xD3,
// Deserialization
deserialize_context = 0xD4,
deserialize_register = 0xD5,
// Block & function definitions
val_def = 0xD6,
fun_def = 0xD7,
block_value = 0xD8,
func_value = 0xD9,
func_apply = 0xDA,
property_call = 0xDB,
method_call = 0xDC,
global = 0xDD,
// Option operations
some_value = 0xDE,
none_value = 0xDF,
get_var = 0xE3,
option_get = 0xE4,
option_get_or_else = 0xE5,
option_is_defined = 0xE6,
// Modular arithmetic (deprecated in v5+)
mod_q = 0xE7,
plus_mod_q = 0xE8,
minus_mod_q = 0xE9,
// Sigma operations
sigma_and = 0xEA,
sigma_or = 0xEB,
// Binary operations
bin_or = 0xEC,
bin_and = 0xED,
decode_point = 0xEE,
logical_not = 0xEF,
negation = 0xF0,
// Bitwise operations
bit_inversion = 0xF1,
bit_or = 0xF2,
bit_and = 0xF3,
bin_xor = 0xF4,
bit_xor = 0xF5,
bit_shift_right = 0xF6,
bit_shift_left = 0xF7,
bit_shift_right_zeroed = 0xF8,
// Collection bitwise operations
coll_shift_right = 0xF9,
coll_shift_left = 0xFA,
coll_shift_right_zeroed = 0xFB,
coll_rotate_left = 0xFC,
coll_rotate_right = 0xFD,
// Misc
context = 0xFE,
xor_of = 0xFF,
pub fn isConstant(code: u8) bool {
return code >= 0x01 and code <= 0x70;
}
pub fn isOperation(code: u8) bool {
return code > 0x70;
}
pub fn fromShift(shift: u8) OpCode {
return @enumFromInt(0x70 + shift);
}
};
Variable & Reference Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x71 | 113 | TaggedVariable | Reference context variable by ID |
| 0x72 | 114 | ValUse | Use value defined by ValDef |
| 0x73 | 115 | ConstantPlaceholder | Reference segregated constant |
| 0x74 | 116 | SubstConstants | Substitute constants in tree |
Type Conversion Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x7A | 122 | LongToByteArray | Long → Coll[Byte] (big-endian) |
| 0x7B | 123 | ByteArrayToBigInt | Coll[Byte] → BigInt |
| 0x7C | 124 | ByteArrayToLong | Coll[Byte] → Long |
| 0x7D | 125 | Downcast | Numeric downcast (may overflow) |
| 0x7E | 126 | Upcast | Numeric upcast (always safe) |
Constants & Tuples
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x7F | 127 | True | Boolean true constant |
| 0x80 | 128 | False | Boolean false constant |
| 0x81 | 129 | UnitConstant | Unit () value |
| 0x82 | 130 | GroupGenerator | EC generator point G |
| 0x83 | 131 | ConcreteCollection | Coll construction |
| 0x85 | 133 | ConcreteCollectionBool | Optimized Coll[Boolean] |
| 0x86 | 134 | Tuple | Tuple construction |
| 0x87-0x8B | 135-139 | Select1-5 | Tuple element access |
| 0x8C | 140 | SelectField | Select by field index |
Relational & Logic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x8F | 143 | Lt | Less than (<) |
| 0x90 | 144 | Le | Less or equal (≤) |
| 0x91 | 145 | Gt | Greater than (>) |
| 0x92 | 146 | Ge | Greater or equal (≥) |
| 0x93 | 147 | Eq | Equal (==) |
| 0x94 | 148 | Neq | Not equal (≠) |
| 0x95 | 149 | If | If-then-else |
| 0x96 | 150 | And | Logical AND (&&) |
| 0x97 | 151 | Or | Logical OR (||) |
| 0x98 | 152 | AtLeast | k-of-n threshold |
Arithmetic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x99 | 153 | Minus | Subtraction |
| 0x9A | 154 | Plus | Addition |
| 0x9B | 155 | Xor | Byte-array XOR |
| 0x9C | 156 | Multiply | Multiplication |
| 0x9D | 157 | Division | Integer division |
| 0x9E | 158 | Modulo | Remainder |
| 0x9F | 159 | Exponentiate | BigInt exponentiation |
| 0xA0 | 160 | MultiplyGroup | EC point multiplication |
| 0xA1 | 161 | Min | Minimum |
| 0xA2 | 162 | Max | Maximum |
Context Access Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xA3 | 163 | Height | Current block height |
| 0xA4 | 164 | Inputs | Transaction inputs (INPUTS) |
| 0xA5 | 165 | Outputs | Transaction outputs (OUTPUTS) |
| 0xA6 | 166 | LastBlockUtxoRootHash | UTXO tree root hash |
| 0xA7 | 167 | Self | Current box (SELF) |
| 0xAC | 172 | MinerPubkey | Miner's public key |
| 0xFE | 254 | Context | Context object |
Collection Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xAD | 173 | MapCollection | Transform elements |
| 0xAE | 174 | Exists | Any element matches |
| 0xAF | 175 | ForAll | All elements match |
| 0xB0 | 176 | Fold | Reduce to single value |
| 0xB1 | 177 | SizeOf | Collection length |
| 0xB2 | 178 | ByIndex | Element at index |
| 0xB3 | 179 | Append | Concatenate collections |
| 0xB4 | 180 | Slice | Extract sub-collection |
| 0xB5 | 181 | Filter | Keep matching elements |
| 0xB6 | 182 | AvlTree | AVL tree construction |
| 0xB7 | 183 | AvlTreeGet | AVL tree lookup |
| 0xB8 | 184 | FlatMap | Map and flatten |
Box Extraction Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xC1 | 193 | ExtractAmount | Box.value (nanoErgs) |
| 0xC2 | 194 | ExtractScriptBytes | Box.propositionBytes |
| 0xC3 | 195 | ExtractBytes | Box.bytes (full) |
| 0xC4 | 196 | ExtractBytesWithNoRef | Box.bytesWithoutRef |
| 0xC5 | 197 | ExtractId | Box.id (32 bytes) |
| 0xC6 | 198 | ExtractRegisterAs | Box.Rx[T] |
| 0xC7 | 199 | ExtractCreationInfo | Box.creationInfo |
Cryptographic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xCB | 203 | CalcBlake2b256 | Blake2b256 hash |
| 0xCC | 204 | CalcSha256 | SHA-256 hash |
| 0xCD | 205 | ProveDlog | DLog proposition |
| 0xCE | 206 | ProveDHTuple | DHT proposition |
| 0xCF | 207 | SigmaPropIsProven | Check proven |
| 0xD0 | 208 | SigmaPropBytes | Serialize SigmaProp |
| 0xD1 | 209 | BoolToSigmaProp | Bool → SigmaProp |
| 0xD2 | 210 | TrivialPropFalse | Always false |
| 0xD3 | 211 | TrivialPropTrue | Always true |
| 0xEE | 238 | DecodePoint | Bytes → GroupElement |
Block & Function Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xD4 | 212 | DeserializeContext | Deserialize from context |
| 0xD5 | 213 | DeserializeRegister | Deserialize from register |
| 0xD6 | 214 | ValDef | Define value binding |
| 0xD7 | 215 | FunDef | Define function |
| 0xD8 | 216 | BlockValue | Block expression { } |
| 0xD9 | 217 | FuncValue | Lambda expression |
| 0xDA | 218 | FuncApply | Apply function |
| 0xDB | 219 | PropertyCall | Property access |
| 0xDC | 220 | MethodCall | Method invocation |
| 0xDD | 221 | Global | Global object |
Option Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xDE | 222 | SomeValue | Some(x) construction |
| 0xDF | 223 | NoneValue | None construction |
| 0xE3 | 227 | GetVar | Get context variable |
| 0xE4 | 228 | OptionGet | Option.get (may fail) |
| 0xE5 | 229 | OptionGetOrElse | Option.getOrElse |
| 0xE6 | 230 | OptionIsDefined | Option.isDefined |
Sigma Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xEA | 234 | SigmaAnd | Sigma AND (∧) |
| 0xEB | 235 | SigmaOr | Sigma OR (∨) |
Logical, Negation & Bitwise Operations (bit-level ops v6+)
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xEF | 239 | LogicalNot | Boolean NOT (!) |
| 0xF0 | 240 | Negation | Numeric negation (-x) |
| 0xF1 | 241 | BitInversion | Bitwise NOT (~) |
| 0xF2 | 242 | BitOr | Bitwise OR (|) |
| 0xF3 | 243 | BitAnd | Bitwise AND (&) |
| 0xF4 | 244 | BinXor | Binary XOR |
| 0xF5 | 245 | BitXor | Bitwise XOR (^) |
| 0xF6 | 246 | BitShiftRight | Arithmetic right shift (>>) |
| 0xF7 | 247 | BitShiftLeft | Left shift (<<) |
| 0xF8 | 248 | BitShiftRightZeroed | Logical right shift (>>>) |
Collection Bitwise Operations (v6+)
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xF9 | 249 | CollShiftRight | Collection shift right |
| 0xFA | 250 | CollShiftLeft | Collection shift left |
| 0xFB | 251 | CollShiftRightZeroed | Collection logical shift right |
| 0xFC | 252 | CollRotateLeft | Collection rotate left |
| 0xFD | 253 | CollRotateRight | Collection rotate right |
| 0xFF | 255 | XorOf | XOR of collection elements |
Opcode Parsing
const OpCodeParser = struct {
/// Parse opcode from byte, determining if constant or operation
pub fn parse(byte: u8) ParseResult {
if (byte == 0) return .invalid;
if (byte <= 0x70) return .{ .constant = byte };
return .{ .operation = @enumFromInt(byte) };
}
/// Check if opcode requires additional data
pub fn hasPayload(op: OpCode) bool {
return switch (op) {
.val_use,
.constant_placeholder,
.tagged_variable,
.extract_register_as,
.by_index,
.select_field,
.method_call,
.property_call,
=> true,
else => false,
};
}
const ParseResult = union(enum) {
invalid,
constant: u8,
operation: OpCode,
};
};
Constants
const OpCodeConstants = struct {
/// First valid data type code
pub const FIRST_DATA_TYPE: u8 = 0x01;
/// Last data type code
pub const LAST_DATA_TYPE: u8 = 111; // 0x6F
/// Boundary between constants and operations
pub const LAST_CONSTANT_CODE: u8 = 112; // 0x70
/// First operation code
pub const FIRST_OP_CODE: u8 = 113; // 0x71
/// Maximum opcode value
pub const MAX_OP_CODE: u8 = 255; // 0xFF
};
Previous: Appendix A | Next: Appendix C
Scala: OpCodes.scala
Rust: op_code.rs:14-203
Appendix C: Cost Table
Complete reference for operation costs in the JIT cost model.
Cost Model Architecture
Cost Model Structure
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ CostKind │
├─────────────┬─────────────┬─────────────┬─────────────────────┤
│ FixedCost │PerItemCost │TypeBasedCost│ DynamicCost │
│ │ │ │ │
│ cost: u32 │ base: u32 │ costFunc() │ sum of sub-costs │
│ │ per_chunk │ per type │ │
│ │ chunk_size │ │ │
└─────────────┴─────────────┴─────────────┴─────────────────────┘
Cost Calculation Flow:
─────────────────────────────────────────────────────────────────
┌─────────────┐
│ Operation │
└──────┬──────┘
│
┌──────────────┼──────────────┐
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│ FixedOp │ │PerItemOp │ │TypedOp │
│ cost=26 │ │base=20 │ │depends │
└────┬────┘ │chunk=10 │ │on type │
│ └────┬─────┘ └────┬────┘
│ │ │
└─────────────┼──────────────┘
▼
┌────────────────┐
│CostAccumulator │
│ accum += cost │
│ check < limit │
└────────────────┘
Zig Cost Types
const JitCost = struct {
value: u32,
pub fn add(self: JitCost, other: JitCost) !JitCost {
return .{ .value = try std.math.add(u32, self.value, other.value) };
}
};
const CostKind = union(enum) {
fixed: FixedCost,
per_item: PerItemCost,
type_based: TypeBasedCost,
dynamic,
pub fn compute(self: CostKind, ctx: CostContext) JitCost {
return switch (self) {
.fixed => |f| f.cost,
.per_item => |p| p.compute(ctx.n_items),
.type_based => |t| t.costFunc(ctx.tpe),
.dynamic => ctx.computed_cost,
};
}
};
/// Fixed cost regardless of input
const FixedCost = struct {
cost: JitCost,
};
/// Cost proportional to collection size
const PerItemCost = struct {
base: JitCost,
per_chunk: JitCost,
chunk_size: usize,
/// totalCost = base + per_chunk * ceil(n_items / chunk_size)
pub fn compute(self: PerItemCost, n_items: usize) JitCost {
const chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
return .{
.value = self.base.value + @as(u32, @intCast(chunks)) * self.per_chunk.value,
};
}
};
/// Cost depends on type
const TypeBasedCost = struct {
primitive_cost: JitCost,
bigint_cost: JitCost,
collection_cost: ?PerItemCost,
pub fn costFunc(self: TypeBasedCost, tpe: SType) JitCost {
return switch (tpe) {
.byte, .short, .int, .long => self.primitive_cost,
.big_int, .unsigned_big_int => self.bigint_cost,
.coll => |elem| if (self.collection_cost) |c|
c.compute(elem.len)
else
self.primitive_cost,
else => self.primitive_cost,
};
}
};
Cost Accumulator
const CostAccumulator = struct {
accum: u64,
limit: u64,
pub fn init(limit: u64) CostAccumulator {
return .{ .accum = 0, .limit = limit };
}
pub fn add(self: *CostAccumulator, cost: JitCost) !void {
self.accum += cost.value;
if (self.accum > self.limit) {
return error.CostLimitExceeded;
}
}
pub fn addSeq(
self: *CostAccumulator,
cost: PerItemCost,
n_items: usize,
) !void {
try self.add(cost.compute(n_items));
}
pub fn totalCost(self: *const CostAccumulator) JitCost {
return .{ .value = @intCast(self.accum) };
}
};
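The accumulator's behavior (add costs, fail as soon as the running total exceeds the limit) is easy to experiment with in a few lines of Python. This is a minimal analogue of the Zig struct above, not any implementation's API.

```python
# Minimal Python analogue of CostAccumulator: accumulate costs and
# raise once the running total exceeds the configured limit.
class CostAccumulator:
    def __init__(self, limit):
        self.accum = 0
        self.limit = limit

    def add(self, cost):
        self.accum += cost
        if self.accum > self.limit:
            raise RuntimeError("cost limit exceeded")

acc = CostAccumulator(limit=100)
acc.add(60)
acc.add(40)        # total == limit: still allowed
try:
    acc.add(1)     # pushes past the limit
    exceeded = False
except RuntimeError:
    exceeded = True
assert exceeded
```

Note that, as in the Zig version, hitting the limit exactly is allowed; only exceeding it aborts evaluation.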
Fixed Cost Operations
| Operation | Cost | Description |
|---|---|---|
| ConstantPlaceholder | 1 | Reference segregated constant |
| Height | 1 | Current block height |
| Inputs | 1 | Transaction inputs |
| Outputs | 1 | Transaction outputs |
| LastBlockUtxoRootHash | 1 | UTXO root hash |
| Self | 1 | Self box |
| MinerPubkey | 1 | Miner public key |
| ValUse | 5 | Use defined value |
| TaggedVariable | 5 | Context variable |
| SomeValue | 5 | Option Some |
| NoneValue | 5 | Option None |
| SelectField | 8 | Select tuple field |
| CreateProveDlog | 10 | Create DLog |
| OptionGetOrElse | 10 | Option.getOrElse |
| OptionIsDefined | 10 | Option.isDefined |
| OptionGet | 10 | Option.get |
| ExtractAmount | 10 | Box value |
| ExtractScriptBytes | 10 | Proposition bytes |
| ExtractId | 10 | Box ID |
| Tuple | 10 | Create tuple |
| Select1-5 | 12 | Select tuple element |
| ByIndex | 14 | Collection access |
| BoolToSigmaProp | 15 | Bool → SigmaProp |
| DeserializeContext | 15 | Deserialize context |
| DeserializeRegister | 15 | Deserialize register |
| ByteArrayToLong | 16 | Bytes → Long |
| LongToByteArray | 17 | Long → bytes |
| CreateProveDHTuple | 20 | Create DHT |
| If | 20 | Conditional |
| LogicalNot | 20 | Boolean NOT |
| Negation | 20 | Numeric negation |
| ArithOp | 26 | Plus, Minus, etc. |
| ByteArrayToBigInt | 30 | Bytes → BigInt |
| SubstConstants | 30 | Substitute constants |
| SizeOf | 30 | Collection size |
| MultiplyGroup | 40 | EC point multiply |
| ExtractRegisterAs | 50 | Register access |
| Exponentiate | 300 | BigInt exponent |
| DecodePoint | 900 | Decode EC point |
Per-Item Cost Operations
| Operation | Base | Per Chunk | Chunk Size |
|---|---|---|---|
| CalcBlake2b256 | 20 | 7 | 128 |
| CalcSha256 | 20 | 8 | 64 |
| MapCollection | 20 | 1 | 10 |
| Exists | 20 | 5 | 10 |
| ForAll | 20 | 5 | 10 |
| Fold | 20 | 1 | 10 |
| Filter | 20 | 5 | 10 |
| FlatMap | 20 | 5 | 10 |
| Slice | 10 | 2 | 100 |
| Append | 20 | 2 | 100 |
| SigmaAnd | 10 | 2 | 1 |
| SigmaOr | 10 | 2 | 1 |
| AND (logical) | 10 | 5 | 32 |
| OR (logical) | 10 | 5 | 32 |
| XorOf | 20 | 5 | 32 |
| AtLeast | 20 | 3 | 1 |
| Xor (bytes) | 10 | 2 | 128 |
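The per-item formula (totalCost = base + per_chunk * ceil(n_items / chunk_size)) can be worked through with rows from the table above. The helper below is a sketch for checking the arithmetic, not implementation code.

```python
# Worked examples of the PerItemCost formula using table rows above.
def per_item_cost(base, per_chunk, chunk_size, n_items):
    chunks = -(-n_items // chunk_size)   # ceiling division
    return base + per_chunk * chunks

# CalcBlake2b256 over 300 bytes: ceil(300/128) = 3 chunks -> 20 + 7*3 = 41
assert per_item_cost(20, 7, 128, 300) == 41
# CalcSha256 over exactly one 64-byte chunk -> 20 + 8*1 = 28
assert per_item_cost(20, 8, 64, 64) == 28
# Filter over 25 items: ceil(25/10) = 3 chunks -> 20 + 5*3 = 35
assert per_item_cost(20, 5, 10, 25) == 35
```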
Type-Based Costs
Numeric Casting
| Target Type | Cost |
|---|---|
| Byte, Short, Int, Long | 10 |
| BigInt | 30 |
| UnsignedBigInt | 30 |
Comparison Operations
| Type | Cost |
|---|---|
| Primitives | 10-20 |
| BigInt | 30 |
| Collections | PerItemCost |
| Tuples | Sum of components |
Interpreter Overhead
| Cost Type | Value | Description |
|---|---|---|
| interpreterInitCost | 10,000 | Interpreter init |
| inputCost | 2,000 | Per input |
| dataInputCost | 100 | Per data input |
| outputCost | 100 | Per output |
| tokenAccessCost | 100 | Per token |
Cost Limits
| Parameter | Value | Description |
|---|---|---|
| maxBlockCost | 1,000,000 | Max per block |
| scriptCostLimit | ~8,000,000 | Single script |
Zig Cost Constants
const OperationCosts = struct {
// Context access (very cheap)
pub const HEIGHT: FixedCost = .{ .cost = .{ .value = 1 } };
pub const INPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
pub const OUTPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
pub const SELF: FixedCost = .{ .cost = .{ .value = 1 } };
// Variable access
pub const VAL_USE: FixedCost = .{ .cost = .{ .value = 5 } };
pub const CONSTANT_PLACEHOLDER: FixedCost = .{ .cost = .{ .value = 1 } };
// Arithmetic
pub const ARITH_OP: FixedCost = .{ .cost = .{ .value = 26 } };
pub const COMPARISON: FixedCost = .{ .cost = .{ .value = 20 } };
// Box extraction
pub const EXTRACT_AMOUNT: FixedCost = .{ .cost = .{ .value = 10 } };
pub const EXTRACT_REGISTER: FixedCost = .{ .cost = .{ .value = 50 } };
// Cryptographic
pub const PROVE_DLOG: FixedCost = .{ .cost = .{ .value = 10 } };
pub const PROVE_DHT: FixedCost = .{ .cost = .{ .value = 20 } };
pub const DECODE_POINT: FixedCost = .{ .cost = .{ .value = 900 } };
pub const MULTIPLY_GROUP: FixedCost = .{ .cost = .{ .value = 40 } };
pub const EXPONENTIATE: FixedCost = .{ .cost = .{ .value = 300 } };
// Hashing
pub const BLAKE2B256: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 7 },
.chunk_size = 128,
};
pub const SHA256: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 8 },
.chunk_size = 64,
};
// Collection operations
pub const MAP: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 1 },
.chunk_size = 10,
};
pub const FILTER: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 5 },
.chunk_size = 10,
};
pub const FOLD: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 1 },
.chunk_size = 10,
};
// Sigma operations
pub const SIGMA_AND: PerItemCost = .{
.base = .{ .value = 10 },
.per_chunk = .{ .value = 2 },
.chunk_size = 1,
};
pub const SIGMA_OR: PerItemCost = .{
.base = .{ .value = 10 },
.per_chunk = .{ .value = 2 },
.chunk_size = 1,
};
};
const InterpreterCosts = struct {
pub const INIT: u32 = 10_000;
pub const PER_INPUT: u32 = 2_000;
pub const PER_DATA_INPUT: u32 = 100;
pub const PER_OUTPUT: u32 = 100;
pub const PER_TOKEN: u32 = 100;
};
const CostLimits = struct {
pub const MAX_BLOCK_COST: u64 = 1_000_000;
pub const MAX_SCRIPT_COST: u64 = 8_000_000;
};
Cost Calculation Example
/// Calculate total cost for transaction verification
fn calculateTxCost(
n_inputs: usize,
n_data_inputs: usize,
n_outputs: usize,
script_costs: []const JitCost,
) u64 {
var total: u64 = InterpreterCosts.INIT;
total += @as(u64, n_inputs) * InterpreterCosts.PER_INPUT;
total += @as(u64, n_data_inputs) * InterpreterCosts.PER_DATA_INPUT;
total += @as(u64, n_outputs) * InterpreterCosts.PER_OUTPUT;
for (script_costs) |cost| {
total += cost.value;
}
return total;
}
// Example: 2 inputs, 1 data input, 3 outputs
// Base: 10,000 + 4,000 + 100 + 300 = 14,400
// Plus script costs per input
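The arithmetic in the example can be cross-checked in Python using the interpreter overhead constants from the table above.

```python
# Cross-check of the worked example: base transaction cost from
# interpreter overhead constants (script costs excluded).
INIT, PER_INPUT, PER_DATA_INPUT, PER_OUTPUT = 10_000, 2_000, 100, 100

def tx_base_cost(n_inputs, n_data_inputs, n_outputs):
    return (INIT
            + n_inputs * PER_INPUT
            + n_data_inputs * PER_DATA_INPUT
            + n_outputs * PER_OUTPUT)

# 2 inputs, 1 data input, 3 outputs -> 10,000 + 4,000 + 100 + 300
assert tx_base_cost(2, 1, 3) == 14_400
```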
Previous: Appendix B | Next: Appendix D
Scala: CostKind.scala
Rust: cost_accum.rs:7-43
Appendix D: Method Reference
Complete reference for all methods available on each type.
Method Organization
Method System Architecture
══════════════════════════════════════════════════════════════════
┌────────────────────────┐
│ STypeCompanion │
│ type_code: TypeCode │
│ methods: []SMethod │
└──────────┬─────────────┘
│
┌───────────────────────┼───────────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SNumeric │ │ SBox │ │ SColl │
│ methods.len │ │ methods.len │ │ methods.len │
│ = 13 │ │ = 10 │ │ = 20+ │
└──────────────┘ └──────────────┘ └──────────────┘
Method Lookup:
─────────────────────────────────────────────────────────────────
receiver.methodCall(type_code=99, method_id=1)
│
▼
STypeCompanion::Box.method_by_id(1)
│
▼
SMethod { name: "value", tpe: Box => Long, cost: 1 }
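The lookup step above amounts to a map from (type_code, method_id) pairs to method descriptors. The Python sketch below models that table with a handful of illustrative entries whose names and costs are taken from the method tables later in this appendix; the `METHODS` dict and `resolve` helper are ours, not any implementation's API.

```python
# Sketch of (type_code, method_id) -> method descriptor resolution.
# Entries are illustrative: (name, cost) per the method tables below.
METHODS = {
    (0x63, 1): ("value", 1),     # SBox.value
    (0x63, 7): ("getReg", 50),   # SBox.getReg
    (0x68, 9): ("height", 10),   # SHeader.height
}

def resolve(type_code, method_id):
    method = METHODS.get((type_code, method_id))
    if method is None:
        raise ValueError(f"unknown method {method_id} on type 0x{type_code:02X}")
    return method

name, cost = resolve(0x63, 7)
assert name == "getReg" and cost == 50
```

An unknown (type_code, method_id) pair is a deserialization error, which the sketch models by raising.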
Zig Method Descriptors
const SMethod = struct {
name: []const u8,
method_id: u8,
tpe: SFunc,
cost_kind: CostKind,
min_version: ?ErgoTreeVersion = null,
pub fn isV6Only(self: *const SMethod) bool {
return self.min_version != null and
@intFromEnum(self.min_version.?) >= 3;
}
};
const SFunc = struct {
t_dom: []const SType, // Domain (receiver + args)
t_range: SType, // Return type
pub fn unary(recv: SType, ret: SType) SFunc {
return .{ .t_dom = &[_]SType{recv}, .t_range = ret };
}
pub fn binary(recv: SType, arg: SType, ret: SType) SFunc {
return .{ .t_dom = &[_]SType{ recv, arg }, .t_range = ret };
}
};
Numeric Types (SByte, SShort, SInt, SLong)
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1 | toByte | T → Byte | ✓ | ✓ | 10 |
| 2 | toShort | T → Short | ✓ | ✓ | 10 |
| 3 | toInt | T → Int | ✓ | ✓ | 10 |
| 4 | toLong | T → Long | ✓ | ✓ | 10 |
| 5 | toBigInt | T → BigInt | ✓ | ✓ | 30 |
| 6 | toBytes | T → Coll[Byte] | - | ✓ | 5 |
| 7 | toBits | T → Coll[Boolean] | - | ✓ | 5 |
| 8 | bitwiseInverse | T → T | - | ✓ | 5 |
| 9 | bitwiseOr | (T, T) → T | - | ✓ | 5 |
| 10 | bitwiseAnd | (T, T) → T | - | ✓ | 5 |
| 11 | bitwiseXor | (T, T) → T | - | ✓ | 5 |
| 12 | shiftLeft | (T, Int) → T | - | ✓ | 5 |
| 13 | shiftRight | (T, Int) → T | - | ✓ | 5 |
SBigInt
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1-5 | toXxx | Conversions | ✓ | ✓ | 10-30 |
| 6-13 | bitwise | Bitwise ops | - | ✓ | 5-10 |
| 14 | toUnsigned | BigInt → UnsignedBigInt | - | ✓ | 5 |
| 15 | toUnsignedMod | (BigInt, UBI) → UBI | - | ✓ | 10 |
SUnsignedBigInt (v6+)
| ID | Method | Signature | Cost |
|---|---|---|---|
| 14 | modInverse | (UBI, UBI) → UBI | 50 |
| 15 | plusMod | (UBI, UBI, UBI) → UBI | 10 |
| 16 | subtractMod | (UBI, UBI, UBI) → UBI | 10 |
| 17 | multiplyMod | (UBI, UBI, UBI) → UBI | 15 |
| 18 | mod | (UBI, UBI) → UBI | 10 |
| 19 | toSigned | UBI → BigInt | 5 |
SGroupElement
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 2 | getEncoded | GE → Coll[Byte] | ✓ | ✓ | 250 |
| 3 | exp | (GE, BigInt) → GE | ✓ | ✓ | 900 |
| 4 | multiply | (GE, GE) → GE | ✓ | ✓ | 40 |
| 5 | negate | GE → GE | ✓ | ✓ | 45 |
| 6 | expUnsigned | (GE, UBI) → GE | - | ✓ | 900 |
SSigmaProp
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | propBytes | SigmaProp → Coll[Byte] | 35 |
| 2 | isProven | SigmaProp → Boolean | 10 |
SBox
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | value | Box → Long | 1 |
| 2 | propositionBytes | Box → Coll[Byte] | 10 |
| 3 | bytes | Box → Coll[Byte] | 10 |
| 4 | bytesWithoutRef | Box → Coll[Byte] | 10 |
| 5 | id | Box → Coll[Byte] | 10 |
| 6 | creationInfo | Box → (Int, Coll[Byte]) | 10 |
| 7 | getReg[T] | (Box, Int) → Option[T] | 50 |
| 8 | tokens | Box → Coll[(Coll[Byte], Long)] | 15 |
Register Access
const BoxMethods = struct {
// R0-R3: mandatory registers
pub const R0 = makeRegMethod(0); // monetary value
pub const R1 = makeRegMethod(1); // guard script
pub const R2 = makeRegMethod(2); // tokens
pub const R3 = makeRegMethod(3); // creation info
// R4-R9: optional registers
pub const R4 = makeRegMethod(4);
pub const R5 = makeRegMethod(5);
pub const R6 = makeRegMethod(6);
pub const R7 = makeRegMethod(7);
pub const R8 = makeRegMethod(8);
pub const R9 = makeRegMethod(9);
fn makeRegMethod(comptime idx: u8) SMethod {
return .{
.name = std.fmt.comptimePrint("R{d}", .{idx}),
.method_id = 7, // getReg method id
.tpe = SFunc.unary(.box, .any), // Box => Option[T]; element type checked at call site
.cost_kind = .{ .fixed = .{ .cost = .{ .value = 50 } } },
};
}
};
SAvlTree
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | digest | AvlTree → Coll[Byte] | 15 |
| 2 | enabledOperations | AvlTree → Byte | 15 |
| 3 | keyLength | AvlTree → Int | 15 |
| 4 | valueLengthOpt | AvlTree → Option[Int] | 15 |
| 5 | isInsertAllowed | AvlTree → Boolean | 15 |
| 6 | isUpdateAllowed | AvlTree → Boolean | 15 |
| 7 | isRemoveAllowed | AvlTree → Boolean | 15 |
| 8 | updateOperations | (AvlTree, Byte) → AvlTree | 20 |
| 9 | contains | (AvlTree, key, proof) → Boolean | dynamic |
| 10 | get | (AvlTree, key, proof) → Option[Coll[Byte]] | dynamic |
| 11 | getMany | (AvlTree, keys, proof) → Coll[Option[...]] | dynamic |
| 12 | insert | (AvlTree, entries, proof) → Option[AvlTree] | dynamic |
| 13 | update | (AvlTree, operations, proof) → Option[AvlTree] | dynamic |
| 14 | remove | (AvlTree, keys, proof) → Option[AvlTree] | dynamic |
| 15 | updateDigest | (AvlTree, Coll[Byte]) → AvlTree | 20 |
SContext
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | dataInputs | Context → Coll[Box] | 15 |
| 2 | headers | Context → Coll[Header] | 15 |
| 3 | preHeader | Context → PreHeader | 10 |
| 4 | INPUTS | Context → Coll[Box] | 10 |
| 5 | OUTPUTS | Context → Coll[Box] | 10 |
| 6 | HEIGHT | Context → Int | 26 |
| 7 | SELF | Context → Box | 10 |
| 8 | selfBoxIndex | Context → Int | 20 |
| 9 | LastBlockUtxoRootHash | Context → AvlTree | 15 |
| 10 | minerPubKey | Context → Coll[Byte] | 20 |
| 11 | getVar[T] | (Context, Byte) → Option[T] | dynamic |
SHeader
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | id | Header → Coll[Byte] | 10 |
| 2 | version | Header → Byte | 10 |
| 3 | parentId | Header → Coll[Byte] | 10 |
| 4 | ADProofsRoot | Header → Coll[Byte] | 10 |
| 5 | stateRoot | Header → AvlTree | 10 |
| 6 | transactionsRoot | Header → Coll[Byte] | 10 |
| 7 | timestamp | Header → Long | 10 |
| 8 | nBits | Header → Long | 10 |
| 9 | height | Header → Int | 10 |
| 10 | extensionRoot | Header → Coll[Byte] | 10 |
| 11 | minerPk | Header → GroupElement | 10 |
| 12 | powOnetimePk | Header → GroupElement | 10 |
| 13 | powNonce | Header → Coll[Byte] | 10 |
| 14 | powDistance | Header → BigInt | 10 |
| 15 | votes | Header → Coll[Byte] | 10 |
| 16 | checkPow | Header → Boolean (v6+) | 500 |
SPreHeader
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | version | PreHeader → Byte | 10 |
| 2 | parentId | PreHeader → Coll[Byte] | 10 |
| 3 | timestamp | PreHeader → Long | 10 |
| 4 | nBits | PreHeader → Long | 10 |
| 5 | height | PreHeader → Int | 10 |
| 6 | minerPk | PreHeader → GroupElement | 10 |
| 7 | votes | PreHeader → Coll[Byte] | 10 |
SGlobal
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1 | groupGenerator | Global → GroupElement | ✓ | ✓ | 10 |
| 2 | xor | (Coll[Byte], Coll[Byte]) → Coll[Byte] | ✓ | ✓ | PerItem |
| 3 | serialize[T] | T → Coll[Byte] | - | ✓ | dynamic |
| 4 | fromBigEndianBytes[T] | Coll[Byte] → T | - | ✓ | 10 |
| 5 | encodeNBits | BigInt → Long | - | ✓ | 20 |
| 6 | decodeNBits | Long → BigInt | - | ✓ | 20 |
| 7 | powHit | (Int, ...) → BigInt | - | ✓ | 500 |
SCollection
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | size | Coll[T] → Int | 14 |
| 2 | apply | (Coll[T], Int) → T | 14 |
| 3 | getOrElse | (Coll[T], Int, T) → T | dynamic |
| 4 | map[R] | (Coll[T], T → R) → Coll[R] | PerItem(20,1,10) |
| 5 | exists | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 6 | fold[R] | (Coll[T], R, (R,T) → R) → R | PerItem(20,1,10) |
| 7 | forall | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 8 | slice | (Coll[T], Int, Int) → Coll[T] | PerItem(10,2,100) |
| 9 | filter | (Coll[T], T → Bool) → Coll[T] | PerItem(20,5,10) |
| 10 | append | (Coll[T], Coll[T]) → Coll[T] | PerItem(20,2,100) |
| 14 | indices | Coll[T] → Coll[Int] | PerItem(20,2,128) |
| 15 | flatMap[R] | (Coll[T], T → Coll[R]) → Coll[R] | PerItem(20,5,10) |
| 19 | patch (v6) | (Coll[T], Int, Coll[T], Int) → Coll[T] | dynamic |
| 20 | updated (v6) | (Coll[T], Int, T) → Coll[T] | 20 |
| 21 | updateMany (v6) | (Coll[T], Coll[Int], Coll[T]) → Coll[T] | PerItem |
| 26 | indexOf | (Coll[T], T, Int) → Int | PerItem(20,1,10) |
| 29 | zip[U] | (Coll[T], Coll[U]) → Coll[(T,U)] | PerItem(10,1,10) |
| 30 | reverse (v6) | Coll[T] → Coll[T] | PerItem |
| 31 | startsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 32 | endsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 33 | get (v6) | (Coll[T], Int) → Option[T] | 14 |
SOption
| ID | Method | Signature | Cost |
|---|---|---|---|
| 2 | isDefined | Option[T] → Boolean | 10 |
| 3 | get | Option[T] → T | 10 |
| 4 | getOrElse | (Option[T], T) → T | 10 |
| 7 | map[R] | (Option[T], T → R) → Option[R] | dynamic |
| 8 | filter | (Option[T], T → Bool) → Option[T] | dynamic |
STuple
Tuples support component access by position:
const TupleMethods = struct {
/// Access tuple component by index (1-based like Scala)
pub fn component(comptime idx: usize) SMethod {
return .{
.name = "_" ++ std.fmt.comptimePrint("{}", .{idx}),
.method_id = @intCast(idx),
.cost_kind = .{ .fixed = .{ .cost = .{ .value = 12 } } },
};
}
};
// Usage: tuple._1, tuple._2, ... up to tuple._255
Sources:
- Scala: methods.scala
- Rust: smethod.rs:36-99
- Scala: methods.scala (SNumericTypeMethods)
- Scala: methods.scala (SBigIntMethods)
- Scala: methods.scala (SUnsignedBigIntMethods)
- Rust: sgroup_elem.rs
- Scala: methods.scala (SSigmaPropMethods)
- Rust: sbox.rs:29-92
- Rust: savltree.rs
- Rust: scontext.rs
- Rust: sheader.rs
- Rust: spreheader.rs
- Rust: sglobal.rs
- Rust: scoll.rs:22-266
- Rust: soption.rs
Appendix E: Serialization Format Reference
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for ErgoTree and value serialization formats.
Integer Encoding
VLQ (Variable-Length Quantity)
VLQ Encoding
══════════════════════════════════════════════════════════════════
Byte format: [C][D D D D D D D]
| |____________|
| |
| +-- 7 data bits
+---------- Continuation bit (1 = more bytes follow)
Examples:
0 → [0x00] (1 byte)
127 → [0x7F] (1 byte)
128 → [0x80, 0x01] (2 bytes: 10000000 00000001)
16383 → [0xFF, 0x7F] (2 bytes)
16384 → [0x80, 0x80, 0x01] (3 bytes)
const VlqEncoder = struct {
/// Encode unsigned integer as VLQ
pub fn encodeU64(value: u64, writer: anytype) !void {
var v = value;
while (v >= 0x80) {
try writer.writeByte(@as(u8, @truncate(v)) | 0x80);
v >>= 7;
}
try writer.writeByte(@as(u8, @truncate(v)));
}
    /// Decode VLQ to unsigned integer
    pub fn decodeU64(reader: anytype) !u64 {
        var result: u64 = 0;
        var shift: u32 = 0;
        while (true) {
            const byte = try reader.readByte();
            // More than ten 7-bit groups cannot fit in a u64.
            if (shift >= 64) return error.VlqOverflow;
            result |= @as(u64, byte & 0x7F) << @intCast(shift);
            if (byte & 0x80 == 0) break;
            shift += 7;
        }
        return result;
    }
};
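The worked examples above can be checked directly against the `VlqEncoder` sketch with Zig's built-in test runner (using a fixed-buffer stream from the standard library):

```zig
const std = @import("std");

test "VLQ encoding matches the worked examples" {
    var buf: [10]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);
    try VlqEncoder.encodeU64(128, fbs.writer());
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x80, 0x01 }, fbs.getWritten());
    fbs.reset();
    try VlqEncoder.encodeU64(16384, fbs.writer());
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x80, 0x80, 0x01 }, fbs.getWritten());
}
```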
ZigZag Encoding
const ZigZag = struct {
/// Encode signed → unsigned (small negatives stay small)
pub fn encode32(n: i32) u32 {
return @bitCast((n << 1) ^ (n >> 31));
}
pub fn encode64(n: i64) u64 {
return @bitCast((n << 1) ^ (n >> 63));
}
/// Decode unsigned → signed
pub fn decode32(n: u32) i32 {
return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
}
pub fn decode64(n: u64) i64 {
return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
}
};
// Mapping: 0 → 0, -1 → 1, 1 → 2, -2 → 3, 2 → 4, ...
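The mapping in the comment above can be spot-checked against the `ZigZag` sketch:

```zig
const std = @import("std");

test "ZigZag mapping: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3" {
    try std.testing.expectEqual(@as(u32, 0), ZigZag.encode32(0));
    try std.testing.expectEqual(@as(u32, 1), ZigZag.encode32(-1));
    try std.testing.expectEqual(@as(u32, 2), ZigZag.encode32(1));
    try std.testing.expectEqual(@as(u32, 3), ZigZag.encode32(-2));
    // Decode inverts the mapping.
    try std.testing.expectEqual(@as(i32, -2), ZigZag.decode32(3));
}
```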
Type Serialization
Primitive Type Codes
| Type | Dec | Hex | Zig |
|---|---|---|---|
| SBoolean | 1 | 0x01 | .boolean |
| SByte | 2 | 0x02 | .byte |
| SShort | 3 | 0x03 | .short |
| SInt | 4 | 0x04 | .int |
| SLong | 5 | 0x05 | .long |
| SBigInt | 6 | 0x06 | .big_int |
| SGroupElement | 7 | 0x07 | .group_element |
| SSigmaProp | 8 | 0x08 | .sigma_prop |
| SUnsignedBigInt | 9 | 0x09 | .unsigned_big_int |
Collection Types
const TypeEncoder = struct {
const COLL_BASE: u8 = 12; // 0x0C
const NESTED_COLL: u8 = 24; // 0x18
const OPTION_BASE: u8 = 36; // 0x24
/// Encode collection type
pub fn encodeColl(elem: SType) u8 {
if (elem.isPrimitive()) {
return COLL_BASE + elem.typeCode();
}
if (elem == .coll) {
return NESTED_COLL + elem.inner().typeCode();
}
// Non-embeddable: write COLL_BASE then element type separately
return COLL_BASE;
}
};
Non-Embeddable Types
| Type | Dec | Hex |
|---|---|---|
| SBox | 99 | 0x63 |
| SAvlTree | 100 | 0x64 |
| SContext | 101 | 0x65 |
| SHeader | 104 | 0x68 |
| SPreHeader | 105 | 0x69 |
| SGlobal | 106 | 0x6A |
ErgoTree Format
Header Byte
ErgoTree Header
══════════════════════════════════════════════════════════════════
Bits: [R][R][R][C][S][V V V]
       |_______|  |  |  |___|
           |      |  |    |
           |      |  |    +-- Version (3 bits, 0-7)
           |      |  +------- Size flag (1 = size bytes present)
           |      +---------- Constant segregation (1 = segregated)
           +----------------- Reserved (3 bits)
Version Mapping:
0 → ErgoTree v0 (protocol v3.x)
1 → ErgoTree v1 (protocol v4.x)
2 → ErgoTree v2 (protocol v5.x, JIT costing)
3 → ErgoTree v3 (protocol v6.x)
const ErgoTreeHeader = struct {
    version: u3,
    has_size: bool,
    constant_segregation: bool,
    pub fn parse(byte: u8) ErgoTreeHeader {
        return .{
            .version = @truncate(byte), // low 3 bits (mask 0x07)
            .has_size = (byte & 0x08) != 0,
            .constant_segregation = (byte & 0x10) != 0,
        };
    }
    pub fn serialize(self: ErgoTreeHeader) u8 {
        var result: u8 = self.version;
        if (self.has_size) result |= 0x08;
        if (self.constant_segregation) result |= 0x10;
        return result;
    }
};
Complete Structure
ErgoTree Wire Format
══════════════════════════════════════════════════════════════════
┌─────────┬──────────┬──────────────┬──────────────────────────────┐
│ Header  │ Size     │ Constants    │ Root                         │
│ 1 byte  │ VLQ      │ Array        │ Expression                   │
│         │(optional)│ (if C=1)     │                              │
└─────────┴──────────┴──────────────┴──────────────────────────────┘
With constant segregation (C=1):
┌─────────┬──────────┬───────────┬──────────────────────────────┐
│ Header │ # consts │ Constants │ Root (with placeholders) │
│ │ VLQ │ [type + │ │
│ │ │ value]* │ │
└─────────┴──────────┴───────────┴──────────────────────────────┘
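Reading this layout back can be sketched as a top-level parse routine, built on the ErgoTreeHeader sketch above and the SigmaByteReader sketch later in this appendix. `ParsedTree` and `readConstant` are hypothetical names introduced only for this sketch:

```zig
const std = @import("std");

/// Minimal parsed-tree shape for this sketch (the real structures carry more fields).
const ParsedTree = struct {
    header: ErgoTreeHeader,
    size: ?u64,
    constants: []Constant,
    root: Expr,
};

fn parseErgoTree(allocator: std.mem.Allocator, r: *SigmaByteReader) !ParsedTree {
    const header = ErgoTreeHeader.parse(try r.reader.readByte());
    // Optional size lets a validator bound or skip the tree without parsing it.
    const size: ?u64 = if (header.has_size) try r.readVlqU64() else null;
    const constants: []Constant = if (header.constant_segregation) blk: {
        const n: usize = @intCast(try r.readVlqU64());
        const cs = try allocator.alloc(Constant, n);
        for (cs) |*c| c.* = try readConstant(r); // hypothetical helper
        break :blk cs;
    } else try allocator.alloc(Constant, 0);
    const root = try r.readExpr();
    return .{ .header = header, .size = size, .constants = constants, .root = root };
}
```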
Value Serialization
Primitive Values
const DataSerializer = struct {
pub fn serialize(value: Value, writer: anytype) !void {
switch (value) {
.boolean => |b| try writer.writeByte(if (b) 0x01 else 0x00),
.byte => |b| try writer.writeByte(@bitCast(b)),
.short => |s| try VlqEncoder.encodeI16(s, writer),
.int => |i| try VlqEncoder.encodeI32(i, writer),
.long => |l| try VlqEncoder.encodeI64(l, writer),
.big_int => |bi| try serializeBigInt(bi, writer),
.group_element => |ge| try ge.serializeCompressed(writer),
.sigma_prop => |sp| try serializeSigmaProp(sp, writer),
.coll => |c| try serializeColl(c, writer),
// ...
}
}
    fn serializeBigInt(bi: BigInt256, writer: anytype) !void {
        const bytes = bi.toBytesBigEndian();
        // Minimal two's-complement form: drop redundant leading 0x00 bytes,
        // but keep one when the next byte has its high bit set, otherwise
        // the value would decode as negative.
        var start: usize = 0;
        while (start < bytes.len - 1 and bytes[start] == 0 and bytes[start + 1] < 0x80) : (start += 1) {}
        try VlqEncoder.encodeU64(bytes.len - start, writer);
        try writer.writeAll(bytes[start..]);
    }
};
GroupElement (SEC1 Compressed)
GroupElement Encoding (33 bytes)
══════════════════════════════════════════════════════════════════
┌────────────┬─────────────────────────────────────────────────────┐
│ Prefix │ X Coordinate │
│ (1 byte) │ (32 bytes) │
├────────────┼─────────────────────────────────────────────────────┤
│ 0x02 = Y │ │
│ even │ Big-endian X value │
│ 0x03 = Y │ │
│ odd │ │
└────────────┴─────────────────────────────────────────────────────┘
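A minimal sketch of the compressed encoding follows. `Point`, `yIsEven`, and `xBytesBigEndian` are hypothetical stand-ins for a secp256k1 library; the identity element needs special handling not shown here:

```zig
/// SEC1 compressed point encoding: parity prefix + 32-byte big-endian X.
fn serializeCompressed(p: Point, writer: anytype) !void {
    // Prefix encodes the parity of Y: 0x02 for even, 0x03 for odd.
    const prefix: u8 = if (p.yIsEven()) 0x02 else 0x03;
    try writer.writeByte(prefix);
    const x = p.xBytesBigEndian(); // [32]u8, big-endian X coordinate
    try writer.writeAll(&x);
}
```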
SigmaProp
const SigmaPropSerializer = struct {
const PROVE_DLOG: u8 = 0xCD;
const PROVE_DHT: u8 = 0xCE;
const THRESHOLD: u8 = 0x98;
const AND: u8 = 0x96;
const OR: u8 = 0x97;
pub fn serialize(sp: SigmaBoolean, writer: anytype) !void {
switch (sp) {
.prove_dlog => |pk| {
try writer.writeByte(PROVE_DLOG);
try pk.serializeCompressed(writer);
},
.prove_dht => |dht| {
try writer.writeByte(PROVE_DHT);
try dht.g.serializeCompressed(writer);
try dht.h.serializeCompressed(writer);
try dht.u.serializeCompressed(writer);
try dht.v.serializeCompressed(writer);
},
.and_conj => |children| {
try writer.writeByte(AND);
try VlqEncoder.encodeU64(children.len, writer);
for (children) |child| try serialize(child, writer);
},
.or_conj => |children| {
try writer.writeByte(OR);
try VlqEncoder.encodeU64(children.len, writer);
for (children) |child| try serialize(child, writer);
},
.threshold => |t| {
try writer.writeByte(THRESHOLD);
try VlqEncoder.encodeU64(t.k, writer);
try VlqEncoder.encodeU64(t.children.len, writer);
for (t.children) |child| try serialize(child, writer);
},
}
}
};
Collections
const CollSerializer = struct {
pub fn serialize(coll: Collection, writer: anytype) !void {
try VlqEncoder.encodeU64(coll.len, writer);
// Element type already encoded in type header
for (coll.items) |item| {
try DataSerializer.serialize(item, writer);
}
}
/// Optimized boolean collection (bit-packed)
pub fn serializeBoolColl(bools: []const bool, writer: anytype) !void {
try VlqEncoder.encodeU64(bools.len, writer);
var byte: u8 = 0;
var bit: u3 = 0;
for (bools) |b| {
if (b) byte |= @as(u8, 1) << bit;
bit +%= 1;
if (bit == 0) {
try writer.writeByte(byte);
byte = 0;
}
}
if (bools.len % 8 != 0) try writer.writeByte(byte);
}
};
Expression Serialization
General Pattern
const ExprSerializer = struct {
pub fn serialize(expr: Expr, writer: anytype) !void {
// Write opcode
try writer.writeByte(@intFromEnum(expr.opCode()));
// Write opcode-specific data
switch (expr) {
.val_use => |vu| try VlqEncoder.encodeU32(vu.id, writer),
.constant_placeholder => |cp| {
try VlqEncoder.encodeU32(cp.index, writer);
try TypeEncoder.serialize(cp.tpe, writer);
},
.bin_op => |bo| {
try serialize(bo.left.*, writer);
try serialize(bo.right.*, writer);
},
.method_call => |mc| {
try writer.writeByte(mc.type_code);
try writer.writeByte(mc.method_id);
try serialize(mc.receiver.*, writer);
try VlqEncoder.encodeU64(mc.args.len, writer);
for (mc.args) |arg| try serialize(arg.*, writer);
},
// ...
}
}
};
Block Expressions
Block Value Structure
══════════════════════════════════════════════════════════════════
BlockValue:
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD8 │ count │ ValDef items │ Result expr │
│ │ VLQ │ │ │
└────────┴──────────┴─────────────────────┴───────────────────────┘
ValDef:
┌────────┬────────┬────────────┬───────────────────────────────────┐
│ 0xD6 │ ID │ Type │ RHS Expression │
│ │ VLQ │ (optional) │ │
└────────┴────────┴────────────┴───────────────────────────────────┘
FuncValue (Lambda):
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD9 │ arg cnt │ Args (ID + type) │ Body expr │
│ │ VLQ │ │ │
└────────┴──────────┴─────────────────────┴───────────────────────┘
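Under this layout, BlockValue serialization can be sketched as follows, reusing the VlqEncoder and ExprSerializer sketches from this appendix. `BlockValue` is a hypothetical struct with `items: []Expr` and `result: *Expr`; the opcode is taken from the diagram above:

```zig
/// Sketch: opcode, VLQ item count, each ValDef item, then the result expression.
fn serializeBlockValue(bv: BlockValue, writer: anytype) !void {
    try writer.writeByte(0xD8); // BlockValue opcode (per the diagram)
    try VlqEncoder.encodeU64(bv.items.len, writer);
    for (bv.items) |item| try ExprSerializer.serialize(item, writer);
    try ExprSerializer.serialize(bv.result.*, writer);
}
```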
Size Limits
| Limit | Value | Description |
|---|---|---|
| Max ErgoTree size | 4 KB | Serialized bytes |
| Max box size | 4 KB | Total serialized |
| Max constants | 255 | Per ErgoTree |
| Max registers | 10 | R0-R9 |
| Max tokens/box | 255 | Token types |
| Max BigInt bytes | 32 | 256 bits |
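These limits are typically enforced up front, before any expression parsing. A minimal sketch, with values taken from the table above (error names are illustrative):

```zig
const Limits = struct {
    pub const max_ergo_tree_bytes: usize = 4096; // 4 KB serialized
    pub const max_box_bytes: usize = 4096;
    pub const max_constants: usize = 255;
};

fn checkTreeLimits(tree_bytes: []const u8, constant_count: usize) !void {
    if (tree_bytes.len > Limits.max_ergo_tree_bytes) return error.ErgoTreeTooLarge;
    if (constant_count > Limits.max_constants) return error.TooManyConstants;
}
```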
Deserialization
const SigmaByteReader = struct {
    reader: std.io.AnyReader,
constant_store: []const Constant,
version: ErgoTreeVersion,
pub fn readVlqU64(self: *SigmaByteReader) !u64 {
return VlqEncoder.decodeU64(self.reader);
}
pub fn readType(self: *SigmaByteReader) !SType {
const code = try self.reader.readByte();
return TypeEncoder.decode(code, self);
}
pub fn readExpr(self: *SigmaByteReader) !Expr {
const opcode = try self.reader.readByte();
if (opcode <= 0x70) {
// Constant (type code in data region)
return try self.readConstantWithType(opcode);
}
return try ExprSerializer.deserialize(@enumFromInt(opcode), self);
}
};
Sources:
- Scala: serialization/
- Rust: types.rs
- Rust: ergo_tree.rs (header parsing)
- Rust: data.rs
- Rust: expr.rs
- Scala: serialization.tex (size limits)
Appendix F: Version History
Version history of ErgoScript and the SigmaState interpreter.
Protocol Versions Overview
| Block Version | Activated Version | ErgoTree Version | Name | Release |
|---|---|---|---|---|
| 1 | 0 | 0 | Initial | Mainnet launch |
| 2 | 1 | 1 | v4.0 | 2020 |
| 3 | 2 | 2 | v5.0 (JIT) | 2022 |
| 4 | 3 | 3 | v6.0 | 2024/2025 |
Version Context
const VersionContext = struct {
activated_version: u8, // Protocol version on network
ergo_tree_version: u8, // Version of currently executing script
pub const MAX_SUPPORTED_SCRIPT_VERSION: u8 = 3; // Supports 0, 1, 2, 3
pub const JIT_ACTIVATION_VERSION: u8 = 2; // v5.0 JIT activation
pub const V6_SOFT_FORK_VERSION: u8 = 3; // v6.0 soft-fork
pub fn isJitActivated(self: VersionContext) bool {
return self.activated_version >= JIT_ACTIVATION_VERSION;
}
pub fn isV6Activated(self: VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
};
Version 1 (Initial - v3.x)
ErgoTree Version: 0
Features:
- Core ErgoScript language
- Basic types: Boolean, Byte, Short, Int, Long, BigInt, GroupElement, SigmaProp
- Collection operations: map, filter, fold, exists, forall
- Sigma protocols: ProveDlog, ProveDHTuple, AND, OR, THRESHOLD
- Box operations: value, propositionBytes, id, registers R0-R9
- Context access: INPUTS, OUTPUTS, HEIGHT, SELF
Limitations:
- AOT (Ahead-of-Time) interpreter only
- Fixed cost model
- No constant segregation required
Version 2 (v4.0)
ErgoTree Version: 1 | Block Version: 2
New Features:
- Mandatory constant segregation flag
- Improved script validation
- Enhanced soft-fork mechanism
- Size flag in ErgoTree header
Changes:
- ErgoTree header now requires size bytes when flag is set
- Better error handling for malformed scripts
Version 3 (v5.0 - JIT)
ErgoTree Version: 2 | Block Version: 3 | Activated Version: 2
This was the major interpreter upgrade replacing AOT with JIT costing.
Major Changes
New Interpreter Architecture:
- JIT (Just-In-Time) costing model
- Data-driven evaluation via eval() methods
- Precise cost tracking per operation
- Profiler support for cost measurement
New Cost Model:
- FixedCost for constant-time operations
- PerItemCost for collection operations
- TypeBasedCost for type-dependent costs
- DynamicCost for complex operations
Costing Changes:
AOT: Fixed costs estimated at compile time
JIT: Actual costs computed during execution
New Operations:
- Context.dataInputs: access data inputs
- Context.headers: access the last 10 block headers
- Context.preHeader: access the current block pre-header
- Header type with full block header access
- PreHeader type
Soft-Fork Infrastructure:
- ValidationRules framework
- Configurable rule status (enabled, disabled, replaced)
- trySoftForkable pattern for graceful degradation
AOT to JIT Transition
The transition activated at the v5.0 soft-fork height (block version 3). Scripts created before JIT activation continue to validate with unchanged results, while new scripts benefit from more accurate costing.
Version 4 (v6.0 - Evolution)
ErgoTree Version: 3 | Block Version: 4 | Activated Version: 3
This soft-fork adds significant new functionality.
New Types
SUnsignedBigInt (Type code 9):
- 256-bit unsigned integers
- Modular arithmetic operations
- Conversion between signed/unsigned
New Methods
Numeric Types (Byte, Short, Int, Long, BigInt):
- toBytes: convert to byte array
- toBits: convert to boolean array
- bitwiseInverse: bitwise NOT
- bitwiseOr, bitwiseAnd, bitwiseXor: bitwise operations
- shiftLeft, shiftRight: bit shifting
BigInt:
- toUnsigned: convert to UnsignedBigInt
- toUnsignedMod: modular conversion
UnsignedBigInt:
- modInverse: modular multiplicative inverse
- plusMod, subtractMod, multiplyMod: modular arithmetic
- mod: modulo operation
- toSigned: convert to signed BigInt
GroupElement:
- expUnsigned: scalar multiplication with unsigned exponent
Header:
- checkPow: verify the Proof-of-Work solution
Collection:
- patch: replace a range with another collection
- updated: update a single element
- updateMany: batch-update elements
- indexOf: find element index
- zip: pair with another collection
- reverse: reverse order
- startsWith, endsWith: prefix/suffix checks
- get: safe element access returning Option
Global:
- serialize: serialize any value to bytes
- fromBigEndianBytes: decode big-endian bytes
- encodeNBits, decodeNBits: difficulty encoding
- powHit: Autolykos2 PoW verification
Version Checks
fn evaluateWithVersion(ctx: *VersionContext, expr: *const Expr) !Value {
if (ctx.isV6Activated()) {
// Use v6 methods and features
return try evalV6(expr);
} else if (ctx.isJitActivated()) {
// Use JIT costing
return try evalJit(expr);
} else {
// Legacy AOT path
return try evalAot(expr);
}
}
Backward Compatibility
Script Compatibility
All scripts created for earlier versions continue to work:
- Version 0 scripts: Execute with v0 semantics
- Version 1 scripts: Execute with v1 semantics
- Version 2 scripts: Execute with JIT costing
- Version 3 scripts: Full v6 features available
Method Resolution by Version
fn getMethods(ctx: *const VersionContext, type_code: u8) []const SMethod {
const container = getTypeCompanion(type_code);
if (ctx.isV6Activated()) {
return container.all_methods; // All methods including v6
}
return container.v5_methods; // Pre-v6 methods only
}
Soft-Fork Safety
Unknown opcodes and methods in future versions are handled gracefully:
fn checkOpCode(opcode: u8, ctx: *const VersionContext) ValidationResult {
if (isKnownOpcode(opcode)) return .validated;
if (ctx.isSoftForkable(opcode)) return .soft_forkable;
return .invalid;
}
Migration Guide
For Script Authors
v5 → v6:
- Use UnsignedBigInt for modular arithmetic (more efficient)
- Use new collection methods (reverse, zip, etc.)
- Use Header.checkPow for PoW verification
- Use Global.serialize for value encoding
For Node Operators
Upgrading to v6:
- Update node software before activation height
- No action needed for existing scripts
- New features available after soft-fork activation
Feature Matrix
| Feature | v3.x | v4.0 | v5.0 | v6.0 |
|---|---|---|---|---|
| Basic types | ✓ | ✓ | ✓ | ✓ |
| Sigma protocols | ✓ | ✓ | ✓ | ✓ |
| JIT costing | - | - | ✓ | ✓ |
| Data inputs | - | - | ✓ | ✓ |
| Headers access | - | - | ✓ | ✓ |
| UnsignedBigInt | - | - | - | ✓ |
| Bitwise ops | - | - | - | ✓ |
| Collection updates | - | - | - | ✓ |
| PoW verification | - | - | - | ✓ |
| Serialization | - | - | - | ✓ |
Test Coverage
Version-specific behavior is tested in:
- LanguageSpecificationV5.scala (~9,690 lines)
- LanguageSpecificationV6.scala (~3,081 lines)
These tests verify:
- All operations produce expected results
- Cost calculations are accurate
- Version-gated features work correctly
- Backward compatibility is maintained
Sources:
- Scala: VersionContext.scala
- Rust: ergo_tree.rs (ErgoTreeVersion)
Glossary
A
AOT (Ahead-Of-Time): Costing model where script costs are calculated before execution. Used in ErgoTree versions 0-1.
AVL Tree: A self-balancing binary search tree used for authenticated dictionaries in Ergo.
B
BigInt: 256-bit signed integer type in ErgoTree.
Box: The fundamental UTXO unit in Ergo, containing value, ErgoTree script, tokens, and registers.
C
Constant Segregation: Optimization where constants are extracted from ErgoTree expressions and stored in a separate array. Enables efficient script substitution without re-serializing the expression tree.
Context: Execution environment containing blockchain state (HEIGHT, headers), transaction data (INPUTS, OUTPUTS, dataInputs), and current input information (SELF).
Cost Accumulator: Runtime tracker that sums operation costs and enforces the script cost limit.
D
Data Input: Read-only box reference in a transaction. Provides data without being spent.
DHT (Diffie-Hellman Tuple): Four-element sigma protocol proving knowledge of secret x where u = g^x and v = h^x.
DLog (Discrete Logarithm): Sigma protocol proving knowledge of discrete logarithm. Given generator g and public key h = g^x, proves knowledge of x.
E
ErgoScript: High-level smart contract language with Scala-like syntax.
ErgoTree: Serialized bytecode representation of smart contracts.
F
Fiat-Shamir Transformation: Technique to convert interactive proofs into non-interactive proofs.
G
GroupElement: An elliptic curve point on secp256k1.
H
Header: The first byte(s) of ErgoTree that specify version and format flags.
I
Interpreter: Component that evaluates ErgoTree expressions against a context to produce a SigmaBoolean result.
J
JIT (Just-In-Time): Costing model where costs are calculated during execution. Used in ErgoTree version 2+.
O
OpCode: Single-byte identifier for expression nodes in serialized ErgoTree. Values 0x01-0x70 encode constants; 0x71+ encode operations.
P
Prover: Component that generates cryptographic proofs for spending conditions.
Proposition: A statement that can be proven true or false.
S
Secp256k1: The elliptic curve used in Ergo (same as Bitcoin).
SigmaBoolean: A tree of cryptographic propositions (AND, OR, threshold, DLog, DHT).
SigmaProp: Type representing sigma-protocol propositions.
Sigma Protocol: Zero-knowledge proof system with three-move structure.
T
Type Code: Unique byte identifier for each type in ErgoTree serialization.
U
UTXO: Unspent Transaction Output model used by Ergo.
UnsignedBigInt: 256-bit unsigned integer type (added in v6).
V
Verifier: Component that verifies cryptographic proofs.
VLQ: Variable-Length Quantity encoding for unsigned integers. Uses 7 data bits per byte with continuation bit.
Z
ZigZag Encoding: Maps signed integers to unsigned: 0→0, -1→1, 1→2, -2→3, etc. Keeps small negatives small for efficient VLQ encoding.
Bibliography
Primary Sources
- sigmastate-interpreter (https://github.com/ScorexFoundation/sigmastate-interpreter): reference Scala implementation of the SigmaState interpreter. Key packages: sigma.ast, sigma.serialization, sigma.eval, sigma.crypto.
- sigma-rust (https://github.com/ergoplatform/sigma-rust): Rust implementation of the ErgoTree IR and interpreter. Key crates: ergotree-ir, ergotree-interpreter, ergo-lib.
- ergo (https://github.com/ergoplatform/ergo): full node implementation in Scala.
Specifications
- ErgoTree Specification (sigmastate-interpreter/docs/spec/spec.pdf): formal specification of the ErgoTree format and semantics.
- ErgoScript Language Specification (sigmastate-interpreter/docs/LangSpec.md): informal language specification.
- Sigma Protocols Paper (sigmastate-interpreter/docs/wpaper/sigma.pdf): formal specification of Sigma protocols.
Academic Papers
- Sigmastate Protocols (sigmastate-interpreter/docs/sigmastate_protocols/sigmastate_protocols.pdf): detailed protocol descriptions.
- Ergo Whitepaper: platform overview and design rationale.
- Ergo Yellow Paper: technical specification.
External References
- Schnorr, C.P. (1991). Efficient signature generation by smart cards. (Schnorr identification protocol)
- Fiat, A., & Shamir, A. (1986). How to prove yourself. (Fiat-Shamir heuristic)
- Standards for Efficient Cryptography, SEC 2: the secp256k1 curve.
- BLAKE2 hash function: https://www.blake2.net/