The Sigma Book
KYA: Know Your Assumptions
This is a PRE-ALPHA version of The Sigma Book.
Before using this material, understand these critical assumptions:
Not Authoritative: This book is NOT an official specification. It is a research and educational resource derived from studying the source code.
May Contain Errors: Content has not been formally verified. Implementations based solely on this book may be incorrect.
Subject to Change: As a pre-alpha work, chapters may be incomplete, reorganized, or substantially rewritten.
Source of Truth: For authoritative information, always consult:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
- ErgoTree Specification (spec.pdf)
Verification Required: Cross-reference all claims against the actual source code before relying on them.
Use this book to learn and explore, but verify everything against the source.
Complete Technical Reference for SigmaState Interpreter
Welcome to The Sigma Book, a comprehensive technical reference covering the SigmaState interpreter, ErgoTrees, and the Sigma language. This book is written for engineers who need deep understanding of the implementation details, algorithms, and data structures behind the Ergo blockchain's smart contract system.
Code examples use idiomatic Zig 0.13+ with data-oriented design patterns, making algorithms explicit and accessible to implementers in any language.
What This Book Covers
This book provides complete documentation of:
- Specifications: Formal and informal specifications of the Sigma language, type system, and ErgoTree format
- Implementation Details: Internal algorithms and data structures from both the reference Scala implementation (sigmastate-interpreter) and Rust implementation (sigma-rust)
- Node Integration: How the Ergo node uses the interpreter for transaction validation
- Practical APIs: SDK and high-level interfaces for building applications
How to Read This Book
Prerequisites Approach
Every chapter includes an explicit Prerequisites section that lists:
- Required knowledge assumptions
- Related concepts you should understand
- Links to earlier chapters covering dependencies
This allows you to:
- Jump directly to topics of interest if you have the background
- Trace backward to fill gaps in your understanding
- Use the book as a reference rather than reading linearly
Code Examples
Code examples use Zig 0.13+ to illustrate algorithms with explicit memory management and data-oriented patterns. While not directly runnable against the Scala or Rust implementations, they demonstrate the core logic clearly.
Exercises
Each chapter concludes with exercises at three levels:
- Conceptual: Test your understanding of the material
- Implementation: Write code applying the concepts
- Analysis: Read and analyze real source code
Source Material
This book is derived from:
- sigmastate-interpreter: Reference Scala implementation (ScorexFoundation/sigmastate-interpreter)
- sigma-rust: Rust implementation (ergoplatform/sigma-rust)
- Ergo node: Full node implementation showing integration
- Formal specifications: LaTeX documents in docs/spec/
- Test suites: Language specification tests defining expected behavior
Citations use footnotes referencing both Scala and Rust source locations.
Book Structure
| Part | Focus | Depth |
|---|---|---|
| I. Foundations | Core concepts and type system | Overview |
| II. AST | Expression node catalog | Reference |
| III. Serialization | Binary format | Detailed |
| IV. Cryptography | Zero-knowledge proofs | Deep |
| V. Interpreter | Evaluation engine | Deep |
| VI. Compiler | ErgoScript compilation | Deep |
| VII. Data Structures | Collections, AVL trees, boxes | Detailed |
| VIII. Node Integration | Transaction validation | Practical |
| IX. SDK | Developer APIs | Practical |
| X. Advanced | Soft-forks, cross-platform | Specialized |
Conventions Used
// Code blocks use Zig to illustrate algorithms
const ErgoTree = struct {
header: Header,
constants: []const Constant,
root: *const Expr,
};
Note: Highlighted notes provide important context or warnings.
Footnotes: [^1]: Scala: path/to/file.scala:123 and [^2]: Rust: path/to/file.rs:456 reference source locations in both implementations.
Version Information
This book documents:
- sigmastate-interpreter: Version 6.x (with notes on v5 differences)
- sigma-rust: ergotree-ir and ergotree-interpreter crates
- Protocol versions: v0 (initial), v1 (v4.0), v2 (v5.0 JIT), v3 (v6.0)
Contributing
This book is maintained as part of the ErgoTree research project. Corrections and improvements are welcome.
Let's begin with Chapter 1: Introduction to Sigma and ErgoTree.
Chapter 1: Introduction to Sigma and ErgoTree
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic blockchain concepts (transactions, blocks, consensus)
- Understanding of the UTXO model (unspent transaction outputs)
- Familiarity with any systems programming language (C, Rust, Go, or similar)
- Public key cryptography fundamentals (key pairs, digital signatures, hash functions)
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why Sigma protocols offer advantages over traditional blockchain scripting
- Describe the relationship between ErgoScript, ErgoTree, and SigmaBoolean
- Understand the UTXO model and how scripts guard spending conditions
- Differentiate the roles of prover and verifier in transaction validation
- Identify the core components of the Sigma interpreter architecture
What is Sigma?
Traditional blockchain scripting languages like Bitcoin Script offer limited expressiveness: they support hash preimages, signature checks, and timelocks, but little else. Ethereum's EVM provides Turing completeness but at the cost of complexity, high gas fees, and limited privacy guarantees.
Sigma (Σ) protocols occupy a middle ground. They are cryptographic proof systems that enable zero-knowledge proofs of knowledge—proving you know a secret without revealing it[^1]. The name comes from the Greek letter Σ and reflects their characteristic three-move structure:
- Commitment: The prover sends a randomized commitment value
- Challenge: The verifier sends a random challenge
- Response: The prover sends a response that proves knowledge without revealing secrets
What makes Sigma protocols powerful for blockchains is their composability: you can combine them with AND, OR, and threshold operations to build complex spending conditions while preserving zero-knowledge properties.
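To make the three moves concrete, here is a toy Schnorr-style identification run in Rust over a small multiplicative group (modulus 23, subgroup of order 11). This is an illustrative sketch only: Ergo's Sigma protocols work over secp256k1 and derive the challenge from a hash (Fiat-Shamir), and all numbers below are invented for readability.

```rust
// Toy Schnorr identification: prover knows w such that h = g^w.
// Group: subgroup of order q = 11 inside Z_23*, generated by g = 2.

fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

fn main() {
    let (p, q, g) = (23u64, 11u64, 2u64); // modulus, subgroup order, generator
    let w = 7; // prover's secret
    let h = pow_mod(g, w, p); // public key h = g^w mod p

    // 1. Commitment: prover picks random k, sends a = g^k
    let k = 3;
    let a = pow_mod(g, k, p);

    // 2. Challenge: verifier sends random e
    let e = 5;

    // 3. Response: prover sends z = k + e*w (mod q); z leaks nothing about w
    let z = (k + e * w) % q;

    // Verifier accepts iff g^z == a * h^e (mod p)
    let lhs = pow_mod(g, z, p);
    let rhs = a * pow_mod(h, e, p) % p;
    assert_eq!(lhs, rhs);
    println!("accepted: g^{z} = a * h^{e} (mod {p})");
}
```

Replacing the verifier's random challenge with a hash of the commitment and message (Fiat-Shamir) turns this interactive game into the non-interactive signatures used on-chain.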
The Three Layers
┌─────────────────────────────────────┐
│ ErgoScript │ High-level language
│ (Human-readable source) │
└─────────────────┬───────────────────┘
│ Compilation
▼
┌─────────────────────────────────────┐
│ ErgoTree │ Intermediate representation
│ (Typed AST / Bytecode) │ (Serialized in UTXOs)
└─────────────────┬───────────────────┘
│ Evaluation
▼
┌─────────────────────────────────────┐
│ SigmaBoolean │ Cryptographic proposition
│ (Sigma protocol tree) │ (What needs to be proven)
└─────────────────────────────────────┘
ErgoScript
High-level, statically-typed language with Scala-like syntax[^2]:
- First-class lambdas and higher-order functions
- Call-by-value evaluation
- Local type inference
- Blocks as expressions
// Zig representation of an ErgoScript contract
const Contract = struct {
freeze_deadline: i32,
pk_owner: SigmaProp,
pub fn evaluate(self: Contract, height: i32) SigmaProp {
const deadline_passed = height > self.freeze_deadline;
return SigmaProp.and(
SigmaProp.fromBool(deadline_passed),
self.pk_owner,
);
}
};
ErgoTree
Compiled bytecode representation stored on-chain[^3][^4]:
- Typed abstract syntax tree (AST)
- Serialized as bytes in UTXOs
- Deterministically interpretable
- Version-controlled for soft-fork upgrades
const ErgoTree = struct {
header: HeaderType,
constants: []const Constant,
root: union(enum) {
parsed: SigmaPropValue,
unparsed: UnparsedTree,
},
/// Header byte layout:
/// Bit 7: Multi-byte header flag
/// Bit 6: Reserved (GZIP)
/// Bit 5: Reserved (context-dependent costing)
/// Bit 4: Constant segregation flag
/// Bit 3: Size flag
/// Bits 2-0: Version (0-7)
pub const HeaderType = packed struct(u8) {
version: u3,
has_size: bool,
constant_segregation: bool,
reserved1: bool = false,
reserved_gzip: bool = false,
multi_byte: bool = false,
};
};
SigmaBoolean
After evaluation, ErgoTree reduces to a SigmaBoolean—a tree of cryptographic propositions[^5][^6]:
const SigmaBoolean = union(enum) {
prove_dlog: ProveDlog, // Knowledge of discrete log
prove_dh_tuple: ProveDhTuple, // Diffie-Hellman tuple
cand: Cand, // Logical AND
cor: Cor, // Logical OR
cthreshold: Cthreshold, // k-of-n threshold
trivial: TrivialProp, // True/False
/// Count nodes in proposition tree
pub fn size(self: SigmaBoolean) usize {
return switch (self) {
.prove_dlog, .prove_dh_tuple, .trivial => 1,
.cand => |c| 1 + totalChildrenSize(c.children),
.cor => |c| 1 + totalChildrenSize(c.children),
.cthreshold => |c| 1 + totalChildrenSize(c.children),
};
}
// NOTE: In production, use an explicit work stack instead of recursion
// to guarantee bounded stack depth. See ZIGMA_STYLE.md.
};
const ProveDlog = struct {
/// Public key (compressed EC point, 33 bytes)
h: EcPoint,
};
const ProveDhTuple = struct {
g: EcPoint, // Generator
h: EcPoint, // Point h
u: EcPoint, // g^w
v: EcPoint, // h^w
};
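The note above recommends an explicit work stack instead of recursion. A hypothetical Rust sketch of the same node count done iteratively (the types are simplified stand-ins, not sigma-rust's actual definitions):

```rust
// Counting nodes in a SigmaBoolean tree with an explicit work stack,
// so stack depth stays bounded regardless of proposition nesting.

enum SigmaBoolean {
    ProveDlog,                          // leaf: knowledge of discrete log
    ProveDhTuple,                       // leaf: Diffie-Hellman tuple
    Trivial(bool),                      // leaf: constant true/false
    Cand(Vec<SigmaBoolean>),            // logical AND
    Cor(Vec<SigmaBoolean>),             // logical OR
    Cthreshold(u32, Vec<SigmaBoolean>), // k-of-n threshold
}

fn size(root: &SigmaBoolean) -> usize {
    let mut count = 0;
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        count += 1;
        match node {
            SigmaBoolean::Cand(cs) | SigmaBoolean::Cor(cs) => stack.extend(cs.iter()),
            SigmaBoolean::Cthreshold(_, cs) => stack.extend(cs.iter()),
            _ => {} // leaves contribute only themselves
        }
    }
    count
}

fn main() {
    // (pk_a AND pk_b) OR true  =>  5 nodes total
    let prop = SigmaBoolean::Cor(vec![
        SigmaBoolean::Cand(vec![SigmaBoolean::ProveDlog, SigmaBoolean::ProveDlog]),
        SigmaBoolean::Trivial(true),
    ]);
    println!("{}", size(&prop)); // 5
}
```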
The UTXO Model
Ergo extends the UTXO (Unspent Transaction Output) model pioneered by Bitcoin. Instead of simple locking scripts, Ergo uses boxes—rich data structures that contain value, tokens, and arbitrary typed data:
┌─────────────────────────────────────────┐
│ Box │
├─────────────────────────────────────────┤
│ R0: value (i64 nanoERGs) │ ← Computed registers
│ R1: ergoTree (spending condition) │
│ R2: tokens (asset list) │
│ R3: creationInfo (height, txId, idx) │
├─────────────────────────────────────────┤
│ R4-R9: additional registers ───────────┼──► User-defined data
│ (optional, typed constants) │ (up to 6 registers)
└─────────────────────────────────────────┘
│
┌──────────────┴──────────────┐
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────┐
│ Token │ │ Register (R4-R9) │
├─────────────────────────┤ ├─────────────────────────┤
│ id: [32]u8 (token ID) │ │ value: Constant │
│ amount: i64 │ │ (any SType value) │
└─────────────────────────┘ └─────────────────────────┘
Registers R0–R3 are computed from box fields and always present. Registers R4–R9 are optional and can store any typed value—integers, byte arrays, group elements, or even nested collections.
const Box = struct {
value: i64, // nanoERGs (R0)
ergo_tree: ErgoTree, // Spending condition (R1)
tokens: []const Token, // Additional assets (R2)
creation_height: u32, // Part of creation info (R3)
tx_id: [32]u8, // Part of creation info (R3)
output_index: u16, // Part of creation info (R3)
additional_registers: [6]?Constant, // R4-R9 (user-defined, optional)
pub fn id(self: *const Box) [32]u8 {
// Blake2b256(tx_id || output_index || serialized_content)
var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
hasher.update(&self.tx_id);
hasher.update(std.mem.asBytes(&self.output_index));
// ... serialize and hash content
return hasher.finalResult();
}
};
// NOTE: R0-R3 are computed from box fields; only R4-R9 are stored explicitly.
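To make the computed-vs-stored split concrete, here is a hypothetical register accessor in Rust. The `ErgoBox` and `Value` types are simplified stand-ins (R2/R3 are omitted for brevity), not the actual sigma-rust API:

```rust
// R0-R1 are derived from box fields on demand; R4-R9 are looked up in storage.

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Long(i64),
    Bytes(Vec<u8>),
}

struct ErgoBox {
    value: i64,                               // nanoERGs
    ergo_tree_bytes: Vec<u8>,                 // serialized spending condition
    additional_registers: [Option<Value>; 6], // R4-R9, stored explicitly
}

impl ErgoBox {
    fn register(&self, index: usize) -> Option<Value> {
        match index {
            0 => Some(Value::Long(self.value)),                 // R0: computed
            1 => Some(Value::Bytes(self.ergo_tree_bytes.clone())), // R1: computed
            4..=9 => self.additional_registers[index - 4].clone(), // stored
            _ => None, // R2/R3 omitted in this sketch; R10+ do not exist
        }
    }
}

fn main() {
    let mut regs: [Option<Value>; 6] = [None, None, None, None, None, None];
    regs[0] = Some(Value::Long(42)); // stored as R4
    let example = ErgoBox {
        value: 1_000_000,
        ergo_tree_bytes: vec![0x00],
        additional_registers: regs,
    };
    assert_eq!(example.register(0), Some(Value::Long(1_000_000))); // computed R0
    assert_eq!(example.register(4), Some(Value::Long(42)));        // stored R4
    assert_eq!(example.register(5), None);                         // empty optional
}
```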
The Prover/Verifier Model
Both parties perform the same deterministic reduction: given the ErgoTree and the transaction context, each reduces the script to the same SigmaBoolean proposition. They then diverge: the prover uses its secrets to produce a non-interactive proof via the Fiat-Shamir transform, while the verifier checks that proof against the proposition without learning the secrets.
PROVER VERIFIER
┌──────────────┐ ┌──────────────┐
│ Secrets │ │ │
│ (private │ │ Context │
│ keys) │ │ │
└──────┬───────┘ └──────┬───────┘
│ │
ErgoTree ─────►│ ErgoTree ─────►│
│ │
┌──────▼───────┐ ┌──────▼───────┐
│ Reduction │ │ Reduction │
│ (same as │ │ (same as │
│ verifier) │ │ prover) │
└──────┬───────┘ └──────┬───────┘
│ │
SigmaBoolean SigmaBoolean
│ │
┌──────▼───────┐ ┌──────▼───────┐
│ Signing │───────Proof───────►│ Verify │
│ (Fiat-Shamir)│ │ Signature │
└──────────────┘ └──────┬───────┘
│
true / false
Prover
The prover holds the secret keys and works in two steps: first reduce the ErgoTree to a SigmaBoolean, then sign the transaction message by running the Sigma protocol non-interactively, deriving the challenge from a hash (Fiat-Shamir):
const Prover = struct {
secrets: []const SecretKey,
pub fn prove(
self: *const Prover,
ergo_tree: *const ErgoTree,
context: *const Context,
) !Proof {
// 1. Reduce to SigmaBoolean
const sigma_bool = try Evaluator.reduce(ergo_tree, context);
// 2. Generate proof using Fiat-Shamir
return try self.generateProof(sigma_bool, context.message);
}
};
Verifier
The verifier repeats the same reduction, tracking accumulated cost against a limit to bound execution, then checks the proof against the resulting SigmaBoolean. It accepts or rejects without learning anything about the prover's secrets:
const Verifier = struct {
cost_limit: u64,
pub fn verify(
self: *const Verifier,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: *const Proof,
) !bool {
// 1. Reduce with cost tracking
var cost: u64 = 0;
const sigma_bool = try Evaluator.reduceWithCost(
ergo_tree, context, &cost, self.cost_limit,
);
// 2. Verify signature
return try verifySignature(sigma_bool, proof, context.message);
}
};
Why Sigma Protocols?
Consider what Bitcoin Script can express: "This output can be spent if you provide a valid signature for public key X." This covers most payment scenarios but falls short for more sophisticated applications.
Sigma protocols enable a fundamentally richer set of spending conditions:
| Feature | What It Enables | Example Use Case |
|---|---|---|
| Composable ZK Proofs | AND, OR, threshold combinations of conditions | Multi-party escrow with complex release logic |
| Ring Signatures | Prove you're one of N signers without revealing which | Anonymous voting, whistleblower systems |
| Threshold Signatures | Require k-of-n parties to sign | DAO governance, cold storage recovery |
| Zero-Knowledge Privacy | Prove statements without revealing underlying data | Private auctions, confidential identity verification |
The key insight is that Sigma protocols can be composed while preserving their zero-knowledge properties. An OR composition of two Sigma proofs reveals that the prover knows one of two secrets—but not which one.
// OR composition hides actual signer
const ring_signature = SigmaBoolean{
.cor = .{
.children = &[_]SigmaBoolean{
.{ .prove_dlog = pk_alice },
.{ .prove_dlog = pk_bob },
.{ .prove_dlog = pk_carol },
},
},
};
// Proof reveals ONE signed, but not which
Repository Structure
| Module | Purpose |
|---|---|
| core | Cryptographic primitives, base types |
| data | ErgoTree, AST nodes, serialization |
| interpreter | Evaluation engine, Sigma protocols |
| parsers | ErgoScript parser |
| sc | Compiler with IR optimization |
| sdk | High-level transaction APIs |
Key Design Principles
The Sigma interpreter is built around four core principles that make it suitable for blockchain consensus:
Determinism
Every operation must produce identical results for identical inputs, regardless of platform or implementation. This means no floating-point arithmetic, no uninitialized memory, and careful handling of hash map iteration order. Without determinism, nodes would disagree on transaction validity.
Bounded Execution
Every script must complete within a predictable cost limit. The interpreter tracks three resource categories:
- Computational operations: arithmetic, comparisons, function calls
- Memory allocations: collections, tuples, intermediate values
- Cryptographic operations: EC point multiplication, signature verification
Scripts exceeding the cost limit fail validation, preventing denial-of-service attacks.
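A minimal sketch of such a cost accumulator, assuming invented per-operation costs (real implementations use calibrated cost tables per opcode):

```rust
// Bounded-execution sketch: charge cost units as evaluation proceeds,
// and fail the script the moment the budget is exhausted.

struct CostAccumulator {
    used: u64,
    limit: u64,
}

#[derive(Debug, PartialEq)]
struct CostLimitExceeded;

impl CostAccumulator {
    fn new(limit: u64) -> Self {
        Self { used: 0, limit }
    }

    /// Charge `cost` units; saturating add avoids overflow wraparound.
    fn add(&mut self, cost: u64) -> Result<(), CostLimitExceeded> {
        self.used = self.used.saturating_add(cost);
        if self.used > self.limit {
            Err(CostLimitExceeded)
        } else {
            Ok(())
        }
    }
}

fn main() {
    let mut acc = CostAccumulator::new(100); // budget: 100 invented units
    assert!(acc.add(60).is_ok());  // e.g. arithmetic operations
    assert!(acc.add(30).is_ok());  // e.g. collection allocation
    assert!(acc.add(20).is_err()); // e.g. EC operation pushes past the limit
}
```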
Soft-Fork Compatibility
ErgoTree includes version information in its header. When nodes encounter unknown opcodes (from future protocol versions), they can handle them gracefully rather than rejecting the entire block. This enables protocol upgrades without hard forks.
Cross-Platform Consistency
The specification must be implementable identically across different platforms. Reference implementations exist for:
- JVM (Scala): The original sigmastate-interpreter
- JavaScript (Scala.js): Browser and Node.js environments
- Native (Rust): sigma-rust for performance-critical applications[^7]
Summary
This chapter introduced the fundamental concepts of the Sigma protocol ecosystem:
- Sigma protocols are three-move cryptographic proofs that enable zero-knowledge proofs of knowledge, with the crucial property of composability
- ErgoScript is a high-level, statically-typed language that compiles to ErgoTree bytecode
- ErgoTree is a serialized AST stored in UTXO boxes that evaluates to SigmaBoolean propositions
- SigmaBoolean represents cryptographic conditions (discrete log proofs, Diffie-Hellman tuples) combined with AND, OR, and threshold logic
- The prover generates zero-knowledge proofs; the verifier checks them without learning secrets
- The system is designed for blockchain consensus: deterministic, bounded, soft-fork compatible, and cross-platform
In the following chapters, we'll dive deep into each layer—starting with the type system that makes ErgoTree's static guarantees possible.
Next: Chapter 2: Type System
[^1]: Sigma protocols are interactive proof systems with the special "honest-verifier zero-knowledge" property.
[^2]: Scala: LangSpec.md:57-80
[^3]: Scala: ErgoTree.scala:24-88
[^4]: Rust: tree_header.rs:10-32
[^5]: Scala: SigmaBoolean.scala:12-21
[^6]: Rust: sigma_boolean.rs:34-80
[^7]: Rust implementation: sigma-rust crate at ergotree-ir/, ergotree-interpreter/
Chapter 2: Type System
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic type system concepts (static vs dynamic typing, generic types)
- Understanding of binary serialization concepts
- Prior chapters: Chapter 1
Learning Objectives
By the end of this chapter, you will be able to:
- Identify all ErgoTree primitive types and their numeric ranges
- Understand why type codes exist and how they enable compact serialization
- Explain the "embeddable" type concept and its efficiency benefits
- Construct collection, option, tuple, and function types
- Recognize version-specific type additions (v6 and beyond)
Type System Overview
Every value in ErgoTree has a statically-known type. Unlike dynamically-typed languages where types are checked at runtime, ErgoTree's type system catches errors at compile time—before the script ever reaches the blockchain.
- Static typing: All types known at compile time, enabling early error detection
- Type inference: The compiler automatically deduces types in most cases
- Generic types: Collections and options parameterized over element types
- Type codes: Each type has a unique numeric code enabling compact binary serialization
Understanding type codes is essential because they directly affect how data is serialized on-chain. The type system is carefully designed so that common types serialize to single bytes, minimizing transaction size.
/// Base type descriptor
const SType = union(enum) {
// Primitives (embeddable, codes 1-9)
boolean,
byte,
short,
int,
long,
big_int,
group_element,
sigma_prop,
unsigned_big_int, // v6+
// Compound types
coll: *const SType,
option: *const SType,
tuple: []const SType,
func: SFunc,
// Object types (codes 99-106)
box,
avl_tree,
context,
header,
pre_header,
global,
// Special
unit,
any,
type_var: []const u8,
pub fn typeCode(self: SType) u8 {
return switch (self) {
.boolean => 1,
.byte => 2,
.short => 3,
.int => 4,
.long => 5,
.big_int => 6,
.group_element => 7,
.sigma_prop => 8,
.unsigned_big_int => 9,
.coll => 12,
.option => 36,
.tuple => 96,
.box => 99,
.avl_tree => 100,
.context => 101,
.header => 104,
.pre_header => 105,
.global => 106,
else => 0,
};
}
pub fn isEmbeddable(self: SType) bool {
return self.typeCode() >= 1 and self.typeCode() <= 9;
}
pub fn isNumeric(self: SType) bool {
return switch (self) {
.byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
else => false,
};
}
};
Type Hierarchy
SType
│
┌──────────────────────┼──────────────────────┐
│ │ │
SEmbeddable SCollection SOption
│ (elemType) (elemType)
┌────┴────┬─────────────────┐
│ │ │
SNumericType SBoolean SGroupElement
│ SSigmaProp
│
├── SByte (code 2)
├── SShort (code 3)
├── SInt (code 4)
├── SLong (code 5)
├── SBigInt (code 6)
└── SUnsignedBigInt (code 9, v6+)
Object Types (non-embeddable):
SBox(99), SAvlTree(100), SContext(101),
SHeader(104), SPreHeader(105), SGlobal(106)
Primitive Types
Numeric Types
All numeric types support conversion via upcast (widening) and downcast (narrowing, throws on overflow)[^3][^4]:
| Type | Code | Size | Range |
|---|---|---|---|
| SByte | 2 | 8-bit | -128 to 127 |
| SShort | 3 | 16-bit | -32,768 to 32,767 |
| SInt | 4 | 32-bit | ±2.1 billion |
| SLong | 5 | 64-bit | ±9.2 quintillion |
| SBigInt | 6 | 256-bit | Signed arbitrary |
| SUnsignedBigInt | 9 | 256-bit | Unsigned (v6+) |
const SNumericType = struct {
type_code: u8,
numeric_index: u8, // 0=Byte, 1=Short, 2=Int, 3=Long, 4=BigInt, 5=UBigInt
/// Ordering: Byte < Short < Int < Long < BigInt < UnsignedBigInt
pub fn canUpcastTo(self: SNumericType, target: SNumericType) bool {
return self.numeric_index <= target.numeric_index;
}
/// Downcast with overflow check
pub fn downcast(comptime T: type, value: anytype) !T {
const min = std.math.minInt(T);
const max = std.math.maxInt(T);
if (value < min or value > max) {
return error.ArithmeticOverflow;
}
return @intCast(value);
}
};
// Type instances
const SByte = SNumericType{ .type_code = 2, .numeric_index = 0 };
const SShort = SNumericType{ .type_code = 3, .numeric_index = 1 };
const SInt = SNumericType{ .type_code = 4, .numeric_index = 2 };
const SLong = SNumericType{ .type_code = 5, .numeric_index = 3 };
const SBigInt = SNumericType{ .type_code = 6, .numeric_index = 4 };
const SUnsignedBigInt = SNumericType{ .type_code = 9, .numeric_index = 5 };
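The same upcast/downcast behavior can be sketched in Rust with the standard `TryFrom` conversions; `downcast_to_short` below is a hypothetical helper, not an API from either implementation:

```rust
// Widening (upcast) always succeeds; narrowing (downcast) fails on overflow,
// mirroring the checked-downcast semantics described above.

use std::convert::TryFrom;

fn downcast_to_short(value: i64) -> Result<i16, String> {
    i16::try_from(value)
        .map_err(|_| format!("ArithmeticOverflow: {value} out of i16 range"))
}

fn main() {
    // Upcast: Byte -> Long is always safe.
    let b: i8 = 100;
    let widened: i64 = i64::from(b);
    assert_eq!(widened, 100);

    // Downcast: Long -> Short succeeds only when the value fits.
    assert_eq!(downcast_to_short(1234), Ok(1234i16));
    assert!(downcast_to_short(40_000).is_err()); // > i16::MAX (32767)
}
```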
Boolean Type
const SBoolean = struct {
pub const type_code: u8 = 1;
pub const is_embeddable = true;
};
Cryptographic Types
GroupElement — Point on secp256k1 curve (33 bytes compressed)[^5]:
const SGroupElement = struct {
pub const type_code: u8 = 7;
/// 33 bytes: 1-byte prefix (0x02/0x03) + 32-byte X coordinate
pub const SERIALIZED_SIZE = 33;
};
SigmaProp — Cryptographic proposition (required return type)[^6]:
const SSigmaProp = struct {
pub const type_code: u8 = 8;
/// Maximum serialized size
pub const MAX_SIZE_BYTES: usize = 1024;
};
Type Codes
Type code space partitioning[^7]:
| Range | Description |
|---|---|
| 1-9 | Primitive embeddable types |
| 10-11 | Reserved |
| 12-23 | Coll[T] (T primitive) |
| 24-35 | Coll[Coll[T]] |
| 36-47 | Option[T] |
| 48-59 | Option[Coll[T]] |
| 60+ | Other types |
const TypeCodes = struct {
// Primitives
pub const BOOLEAN: u8 = 1;
pub const BYTE: u8 = 2;
pub const SHORT: u8 = 3;
pub const INT: u8 = 4;
pub const LONG: u8 = 5;
pub const BIGINT: u8 = 6;
pub const GROUP_ELEMENT: u8 = 7;
pub const SIGMA_PROP: u8 = 8;
pub const UNSIGNED_BIGINT: u8 = 9;
// Type constructor bases
pub const PRIM_RANGE: u8 = 12; // MaxPrimTypeCode + 1
pub const COLL_BASE: u8 = 12;
pub const NESTED_COLL_BASE: u8 = 24;
pub const OPTION_BASE: u8 = 36;
pub const OPTION_COLL_BASE: u8 = 48;
// Object types
pub const TUPLE: u8 = 96;
pub const ANY: u8 = 97;
pub const UNIT: u8 = 98;
pub const BOX: u8 = 99;
pub const AVL_TREE: u8 = 100;
pub const CONTEXT: u8 = 101;
pub const HEADER: u8 = 104;
pub const PREHEADER: u8 = 105;
pub const GLOBAL: u8 = 106;
};
Embeddable Types
The type system's most elegant optimization is the concept of embeddable types. These nine primitive types (codes 1–9) can be "embedded" directly into type constructor codes, allowing common composite types to serialize as a single byte.
Consider Coll[Int] (a collection of integers). Without embedding, this would require two bytes: one for "Collection" and one for "Int". With embedding, it serializes as a single byte: 12 + 4 = 16. This matters because type information appears frequently in serialized ErgoTrees—every constant, every expression result has a type.
The embedding formula is simple[^8]:
/// Embed primitive type code into constructor
pub fn embedType(type_constr_base: u8, prim_type_code: u8) u8 {
return type_constr_base + prim_type_code;
}
// Examples:
// Coll[Byte] = 12 + 2 = 14
// Coll[Int] = 12 + 4 = 16
// Option[Long] = 36 + 5 = 41
// Option[Coll[Byte]] = 48 + 2 = 50
| Type | Code | Coll[T] | Option[T] |
|---|---|---|---|
| Boolean | 1 | 13 | 37 |
| Byte | 2 | 14 | 38 |
| Short | 3 | 15 | 39 |
| Int | 4 | 16 | 40 |
| Long | 5 | 17 | 41 |
| BigInt | 6 | 18 | 42 |
| GroupElement | 7 | 19 | 43 |
| SigmaProp | 8 | 20 | 44 |
| UnsignedBigInt | 9 | 21 | 45 |
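The table above can be checked mechanically. Here is a small Rust sketch of the embedding arithmetic and its inverse; the range boundaries mirror the type-code table, and `decode` is a hypothetical helper for illustration:

```rust
// Embedding: a primitive code (1-9) is added to a constructor base, so a
// common composite type serializes as a single byte. Decoding reverses it.

const COLL_BASE: u8 = 12;
const OPTION_BASE: u8 = 36;
const OPTION_COLL_BASE: u8 = 48;

fn embed(constructor_base: u8, prim_code: u8) -> u8 {
    constructor_base + prim_code
}

/// Classify a single-byte type code by which embedded range it falls into.
/// Bare constructor codes (12, 24, 36, 48) are followed by an explicit
/// element type and fall through to "other" here.
fn decode(code: u8) -> &'static str {
    match code {
        1..=9 => "primitive",
        13..=21 => "Coll[primitive]",
        25..=33 => "Coll[Coll[primitive]]",
        37..=45 => "Option[primitive]",
        49..=57 => "Option[Coll[primitive]]",
        _ => "other",
    }
}

fn main() {
    assert_eq!(embed(COLL_BASE, 4), 16);        // Coll[Int]
    assert_eq!(embed(OPTION_BASE, 5), 41);      // Option[Long]
    assert_eq!(embed(OPTION_COLL_BASE, 2), 50); // Option[Coll[Byte]]
    assert_eq!(decode(16), "Coll[primitive]");
    assert_eq!(decode(41), "Option[primitive]");
}
```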
Collection Types
Collections are homogeneous sequences[^9][^10]:
const SCollection = struct {
elem_type: *const SType,
pub fn typeCode(self: SCollection) u8 {
if (self.elem_type.isEmbeddable()) {
return TypeCodes.COLL_BASE + self.elem_type.typeCode();
}
return TypeCodes.COLL_BASE; // Followed by element type
}
};
// Pre-defined collection types (avoid allocation)
const SByteArray = SCollection{ .elem_type = &SType.byte };
const SIntArray = SCollection{ .elem_type = &SType.int };
const SBooleanArray = SCollection{ .elem_type = &SType.boolean };
const SBoxArray = SCollection{ .elem_type = &SType.box };
Option Types
Optional values[^11]:
const SOption = struct {
elem_type: *const SType,
pub fn typeCode(self: SOption) u8 {
if (self.elem_type.isEmbeddable()) {
return TypeCodes.OPTION_BASE + self.elem_type.typeCode();
}
return TypeCodes.OPTION_BASE;
}
};
// Pre-defined option types
const SByteOption = SOption{ .elem_type = &SType.byte };
const SIntOption = SOption{ .elem_type = &SType.int };
const SLongOption = SOption{ .elem_type = &SType.long };
const SBoxOption = SOption{ .elem_type = &SType.box };
Tuple Types
Heterogeneous fixed-size sequences:
const STuple = struct {
items: []const SType,
pub const type_code: u8 = 96;
pub fn pair(left: SType, right: SType) STuple {
return STuple{ .items = &[_]SType{ left, right } };
}
};
Function Types
Function signatures for lambdas and methods:
const SFunc = struct {
t_dom: []const SType, // Domain (argument types)
t_range: *const SType, // Range (return type)
tpe_params: []const STypeVar, // Generic type parameters
pub const type_code: u8 = 246;
};
// Example: (Int) => Boolean
const intToBool = SFunc{
.t_dom = &[_]SType{SType.int},
.t_range = &SType.boolean,
.tpe_params = &[_]STypeVar{},
};
Object Types
| Type | Code | Description |
|---|---|---|
| SBox | 99 | UTXO with value, script, tokens, registers |
| SAvlTree | 100 | Authenticated dictionary (Merkle proofs) |
| SContext | 101 | Transaction context |
| SHeader | 104 | Block header |
| SPreHeader | 105 | Pre-solved block header |
| SGlobal | 106 | Global operations |
Type Variables
Used internally by the compiler for generic methods (never serialized)[^12]:
const STypeVar = struct {
name: []const u8,
// Standard type variables
pub const T = STypeVar{ .name = "T" };
pub const R = STypeVar{ .name = "R" };
pub const K = STypeVar{ .name = "K" };
pub const V = STypeVar{ .name = "V" };
pub const IV = STypeVar{ .name = "IV" }; // Input Value
pub const OV = STypeVar{ .name = "OV" }; // Output Value
};
Version Differences
v6 additions[^13]:
- SUnsignedBigInt (type code 9)
- Bitwise operations on numeric types
- Additional numeric methods (toBytes, toBits, shifts)
pub fn allPredefTypes(version: ErgoTreeVersion) []const SType {
const v5_types = &[_]SType{
.boolean, .byte, .short, .int, .long, .big_int,
.context, .global, .header, .pre_header, .avl_tree,
.group_element, .sigma_prop, .box, .unit, .any,
};
if (version.value >= 3) { // v6+
return v5_types ++ &[_]SType{.unsigned_big_int};
}
return v5_types;
}
Complete Type Code Reference
| Type | Code | Embeddable |
|---|---|---|
| Boolean | 1 | Yes |
| Byte | 2 | Yes |
| Short | 3 | Yes |
| Int | 4 | Yes |
| Long | 5 | Yes |
| BigInt | 6 | Yes |
| GroupElement | 7 | Yes |
| SigmaProp | 8 | Yes |
| UnsignedBigInt | 9 | Yes |
| Coll[T] | 12 | Constructor |
| Option[T] | 36 | Constructor |
| Tuple | 96 | No |
| Any | 97 | No |
| Unit | 98 | No |
| Box | 99 | No |
| AvlTree | 100 | No |
| Context | 101 | No |
| Header | 104 | No |
| PreHeader | 105 | No |
| Global | 106 | No |
Summary
This chapter covered ErgoTree's type system, which provides the foundation for type-safe script execution:
- Type codes (unique numeric identifiers) enable compact binary serialization—critical for on-chain storage efficiency
- Embeddable types (codes 1–9) combine with type constructors using a clever arithmetic encoding, reducing common types to single bytes
- Numeric types form an ordered hierarchy (Byte < Short < Int < Long < BigInt) with safe upcasting and checked downcasting
- SigmaProp is the required return type for all ErgoScript contracts—it represents the cryptographic proposition that must be proven
- Object types (Box, Context, Header) provide access to blockchain state during script execution
- Version 6 introduces SUnsignedBigInt and additional numeric operations for greater expressiveness
The type system ensures that scripts are well-formed before execution, preventing runtime type errors that could cause consensus failures. In the next chapter, we'll see how these types are organized into the ErgoTree structure—the actual format stored on-chain.
Next: Chapter 3: ErgoTree Structure
[^1]: Scala: SType.scala:17-61
[^2]: Rust: stype.rs:27-76
[^3]: Scala: SType.scala:395-575 (numeric type definitions)
[^4]: Rust: snumeric.rs:12-37 (method IDs)
[^5]: Scala: SType.scala (SGroupElement definition)
[^6]: Scala: SType.scala (SSigmaProp definition)
[^7]: Scala: SType.scala:320-332 (type code ranges)
[^8]: Scala: SType.scala:305-313 (SEmbeddable trait)
[^9]: Scala: SType.scala:743-799 (SCollection)
[^10]: Rust: scoll.rs
[^11]: Scala: SType.scala:691-741 (SOption)
[^12]: Scala: SType.scala:67-95 (type variables)
[^13]: Scala: SType.scala:105-128 (version differences)
Chapter 3: ErgoTree Structure
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Binary representation concepts (bits, bytes, bitwise operations)
- Variable-Length Quantity (VLQ) encoding—a method for encoding integers using a variable number of bytes
- Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations of code structure
- Prior chapters: Chapter 1 for the three-layer architecture, Chapter 2 for type codes used in serialization
Learning Objectives
By the end of this chapter, you will be able to:
- Parse and interpret ErgoTree header bytes, extracting version and feature flags
- Explain how constant segregation enables template sharing and caching optimizations
- Describe the version mechanism and how it enables soft-fork protocol upgrades
- Read and write the complete ErgoTree binary format
ErgoTree Overview
When you write an ErgoScript contract, the compiler transforms it into ErgoTree—a compact binary format designed for blockchain storage and deterministic execution. Every UTXO box contains an ErgoTree that defines its spending conditions.
ErgoTree is specifically designed to be[^1][^2]:
- Self-sufficient: Contains everything needed for evaluation (no external dependencies)
- Compact: Optimized binary encoding minimizes on-chain storage
- Forward-compatible: Version mechanism enables protocol upgrades without hard forks
- Deterministic: Same bytes always produce the same evaluation result
The structure consists of:
- Header byte — Format version and feature flags
- Size field (optional) — Total size for fast skipping
- Constants array (optional) — Extracted constants for template sharing
- Root expression — The actual script logic, returning SigmaProp
const ErgoTree = struct {
header: HeaderType,
constants: []const Constant,
root: union(enum) {
parsed: SigmaPropValue,
unparsed: UnparsedTree,
},
proposition_bytes: ?[]const u8,
pub fn bytes(self: *ErgoTree, allocator: Allocator) ![]const u8 {
    if (self.proposition_bytes) |b| return b;
    return try serialize(self, allocator);
}
pub fn bytesHex(self: *ErgoTree, allocator: Allocator) ![]u8 {
    const b = try self.bytes(allocator);
    return std.fmt.allocPrint(allocator, "{}", .{std.fmt.fmtSliceHexLower(b)});
}
};
Header Format
The first byte uses a bit-field format [3][4]:
7 6 5 4 3 2 1 0
┌──┬──┬──┬──┬──┬──┬──┬──┐
│ │ │ │ │ │ │ │ │
└──┴──┴──┴──┴──┴──┴──┴──┘
│ │ │ │ │ └──┴──┴── Version (bits 0-2)
│ │ │ │ └─────────── Size flag (bit 3)
│ │ │ └────────────── Constant segregation (bit 4)
│ │ └───────────────── Reserved (bit 5, must be 0)
│ └──────────────────── Reserved for GZIP (bit 6, must be 0)
└─────────────────────── Extended header (bit 7)
const HeaderType = packed struct(u8) {
version: u3, // bits 0-2
has_size: bool, // bit 3
constant_segregation: bool, // bit 4
reserved1: bool = false, // bit 5
reserved_gzip: bool = false, // bit 6
multi_byte: bool = false, // bit 7
pub const VERSION_MASK: u8 = 0x07;
pub const SIZE_FLAG: u8 = 0x08;
pub const CONST_SEG_FLAG: u8 = 0x10;
pub fn fromByte(byte: u8) HeaderType {
return @bitCast(byte);
}
pub fn toByte(self: HeaderType) u8 {
return @bitCast(self);
}
pub fn v0(constant_segregation: bool) HeaderType {
return .{
.version = 0,
.has_size = false,
.constant_segregation = constant_segregation,
};
}
pub fn v1(constant_segregation: bool) HeaderType {
return .{
.version = 1,
.has_size = true, // Required for v1+
.constant_segregation = constant_segregation,
};
}
};
Common Header Values
| Byte | Binary | Meaning |
|---|---|---|
| 0x00 | 00000000 | v0, no segregation, no size |
| 0x08 | 00001000 | v0, no segregation, with size |
| 0x10 | 00010000 | v0, constant segregation, no size |
| 0x18 | 00011000 | v0, constant segregation, with size |
| 0x09 | 00001001 | v1, with size (required) |
| 0x19 | 00011001 | v1, constant segregation, with size |
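These rows can be checked mechanically. A small Python sketch (the masks mirror the `VERSION_MASK`, `SIZE_FLAG`, and `CONST_SEG_FLAG` constants above; the function name is ours, used only for illustration):

```python
def decode_header(byte: int) -> tuple[int, bool, bool]:
    """Decode an ErgoTree header byte into (version, has_size, constant_segregation)."""
    version = byte & 0x07          # VERSION_MASK: bits 0-2
    has_size = bool(byte & 0x08)   # SIZE_FLAG: bit 3
    const_seg = bool(byte & 0x10)  # CONST_SEG_FLAG: bit 4
    return version, has_size, const_seg

assert decode_header(0x00) == (0, False, False)  # v0, plain
assert decode_header(0x18) == (0, True, True)    # v0, segregation + size
assert decode_header(0x19) == (1, True, True)    # v1, segregation + size
```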
Binary Format
┌──────────────────────────────────────────────────────────────────┐
│ ErgoTree │
├─────────┬─────────────┬──────────────────┬───────────────────────┤
│ Header │ [Size] │ [Constants] │ Root Expression │
│ 1 byte │ VLQ (opt) │ Array (opt) │ Serialized tree │
└─────────┴─────────────┴──────────────────┴───────────────────────┘
If header bit 3 is set (hasSize):
Size = VLQ-encoded size of (Constants + Root Expression)
If header bit 4 is set (isConstantSegregation):
Constants = VLQ count + Array of serialized constants
Root Expression = Serialized expression tree (SigmaPropValue)
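The size and constant-count fields use unsigned VLQ encoding: seven payload bits per byte, with the high bit as a continuation flag. A minimal Python sketch of the scheme (illustrative only, not the reference serializer):

```python
def vlq_encode(n: int) -> bytes:
    """Encode an unsigned integer as VLQ: 7 data bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        out.append(b | (0x80 if n else 0x00))
        if n == 0:
            return bytes(out)

def vlq_decode(data: bytes) -> int:
    """Decode an unsigned VLQ value from the start of `data`."""
    result, shift = 0, 0
    for b in data:
        result |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            break
    return result

assert vlq_encode(127) == b"\x7f"          # fits in one byte
assert vlq_encode(128) == b"\x80\x01"      # continuation bit set on first byte
assert vlq_decode(vlq_encode(300)) == 300  # round trip
```

Note that signed values in ErgoTree serialization additionally use ZigZag mapping before VLQ; the sketch above covers only the unsigned case used for sizes and counts.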
const ErgoTreeSerializer = struct {
pub fn deserialize(allocator: Allocator, reader: anytype) !ErgoTree {
// 1. Read header byte
const header = HeaderType.fromByte(try reader.readByte());
// 2. Read extended header if bit 7 set
if (header.multi_byte) {
// VLQ continuation - read additional bytes
_ = try readVlqExtension(reader);
}
// 3. Read size if flag set
var tree_size: ?u32 = null;
if (header.has_size) {
tree_size = try readVlq(reader);
}
// 4. Read constants if segregation enabled
var constants: []Constant = &.{};
if (header.constant_segregation) {
const count = try readVlq(reader);
// Bounds check: prevent DoS via excessive allocation
const MAX_CONSTANTS: u32 = 4096;
if (count > MAX_CONSTANTS) {
return error.TooManyConstants;
}
constants = try allocator.alloc(Constant, count);
for (constants) |*c| {
c.* = try Constant.deserialize(reader);
}
}
// NOTE: In production, use a pre-allocated pool instead of dynamic
// allocation during deserialization. See ZIGMA_STYLE.md.
// 5. Read root expression
const root = try Expr.deserialize(reader);
return ErgoTree{
.header = header,
.constants = constants,
.root = .{ .parsed = root },
.proposition_bytes = null,
};
}
};
Constant Segregation
Constant segregation is an optimization technique that extracts literal values from the expression tree and stores them in a separate array [5]. The expression tree then references these constants via placeholder indices. This seemingly simple change enables several powerful optimizations:
Without segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x00 │
│ root: AND(GT(HEIGHT, IntConstant(100)), pk) │
└─────────────────────────────────────────────────┘
With segregation:
┌─────────────────────────────────────────────────┐
│ header: 0x10 │
│ constants: [IntConstant(100)] │
│ root: AND(GT(HEIGHT, Placeholder(0)), pk) │
└─────────────────────────────────────────────────┘
Benefits:
- Template sharing: Same template, different constants
- Caching: Templates cached for repeated evaluation
- Substitution: Constants replaced without re-parsing
/// Substitute ConstantPlaceholder nodes with actual constants.
/// Sketch: re-allocation of child nodes is elided for brevity.
pub fn substConstants(
    root: *const Expr,
    constants: []const Constant,
) Expr {
    return switch (root.*) {
        .constant_placeholder => |ph| .{
            .constant = constants[ph.index],
        },
        // `and` is a Zig keyword, so the union field is written @"and"
        .@"and" => |a| .{
            .@"and" = .{
                .left = substConstants(a.left, constants),
                .right = substConstants(a.right, constants),
            },
        },
        // ... other node types
        else => root.*,
    };
}
// NOTE: In production, use an iterative approach with an explicit work stack
// to guarantee bounded stack depth and prevent stack overflow on deep trees.
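The same substitution can be exercised on a toy AST in a few lines of Python, with tuples standing in for tree nodes (a model of the sketch above, not the reference implementation):

```python
def subst_constants(node, constants):
    """Replace ('placeholder', i) nodes with ('constant', constants[i])."""
    kind = node[0]
    if kind == "placeholder":
        return ("constant", constants[node[1]])
    if kind == "and":
        return ("and",
                subst_constants(node[1], constants),
                subst_constants(node[2], constants))
    return node  # leaves and unhandled node kinds pass through unchanged

tree = ("and", ("placeholder", 0), ("constant", True))
out = subst_constants(tree, [100])
assert out == ("and", ("constant", 100), ("constant", True))
```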
Version Mechanism
The ErgoTree version field (bits 0-2) enables soft-fork protocol upgrades without breaking consensus [6]. Each ErgoTree version corresponds to a minimum required block version in the Ergo protocol—nodes running older protocol versions will skip validation of scripts with newer ErgoTree versions rather than rejecting them as invalid.
| ErgoTree Version | Min Block Version | Key Features |
|---|---|---|
| v0 | 1 | Original format with Ahead-of-Time (AOT) costing calculated during compilation |
| v1 | 2 | Just-in-Time (JIT) costing calculated during execution; size field required |
| v2 | 3 | Extended operations and new opcodes |
| v3 | 4 | UnsignedBigInt type and enhanced collection methods |
The size field became mandatory in v1 to support forward compatibility—nodes can skip over scripts they cannot fully parse by reading the size and advancing past the unknown content.
pub fn setVersionBits(header: HeaderType, version: u3) HeaderType {
var h = header;
h.version = version;
// Size flag required for version > 0
if (version > 0) {
h.has_size = true;
}
return h;
}
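The skip path itself can be modeled in a few lines of Python (hypothetical names; a single-byte VLQ size is assumed for brevity): a node parses the header, sees a version it does not support, and uses the size field to step over the body instead of failing.

```python
MAX_SUPPORTED_VERSION = 1  # hypothetical: highest ErgoTree version this node parses

def read_script(data: bytes):
    """Sketch: if the tree version is newer than supported, use the mandatory
    size field to skip the body rather than rejecting the script."""
    header = data[0]
    version = header & 0x07
    has_size = bool(header & 0x08)
    if version > MAX_SUPPORTED_VERSION:
        assert has_size, "v1+ trees must carry a size field"
        size = data[1]  # simplified: single-byte VLQ
        body = data[2:2 + size]
        return ("unparsed", body)  # preserved for later, not rejected
    return ("parsed", data[1:])

kind, body = read_script(bytes([0x0A, 0x03, 1, 2, 3]))  # header 0x0A = version 2
assert kind == "unparsed" and body == bytes([1, 2, 3])
```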
Unparsed Trees
When a node encounters an ErgoTree with an unknown opcode—typically from a newer protocol version—deserialization fails. Rather than rejecting the transaction entirely, the raw bytes are preserved as an "unparsed tree" [7]. This design is critical for soft-fork compatibility: older nodes can process blocks containing newer script versions without understanding their contents.
const UnparsedTree = struct {
bytes: []const u8,
err: DeserializationError,
};
/// Convert to proposition, handling unparsed case
pub fn toProposition(self: *const ErgoTree, replace_constants: bool) !SigmaPropValue {
return switch (self.root) {
.parsed => |tree| blk: {
if (replace_constants and self.constants.len > 0) {
break :blk substConstants(tree, self.constants);
}
break :blk tree;
},
.unparsed => |u| return u.err,
};
}
Creating ErgoTrees
pub fn fromProposition(allocator: Allocator, prop: SigmaPropValue) !ErgoTree {
    return fromPropositionWithHeader(allocator, HeaderType.v0(false), prop);
}
pub fn fromPropositionWithHeader(
    allocator: Allocator,
    header: HeaderType,
    prop: SigmaPropValue,
) !ErgoTree {
    // Simple constants don't need segregation
    if (prop == .sigma_prop_constant) {
        return withoutSegregation(header, prop);
    }
    // Complex expressions benefit from segregation
    return withSegregation(allocator, header, prop);
}
fn withSegregation(allocator: Allocator, header: HeaderType, prop: SigmaPropValue) !ErgoTree {
    var constants = std.ArrayList(Constant).init(allocator);
    const segregated = try extractConstants(prop, &constants);
    return ErgoTree{
        .header = .{
            .version = header.version,
            .has_size = header.has_size,
            .constant_segregation = true,
        },
        .constants = try constants.toOwnedSlice(),
        .root = .{ .parsed = segregated },
        .proposition_bytes = null,
    };
}
Template Extraction
The template is the serialized root expression with constants left as placeholders, capturing the script's structure independently of its constant values:
pub fn template(self: *const ErgoTree, allocator: Allocator) ![]const u8 {
    // Serialize root with placeholders (no constant substitution)
    var buf = std.ArrayList(u8).init(allocator);
    try self.root.parsed.serialize(buf.writer());
    return try buf.toOwnedSlice();
}
Templates are useful for:
- Identifying script patterns regardless of constants
- Contract template matching
- Caching deserialized templates
Properties
const ErgoTree = struct {
// ... fields ...
/// Returns true if tree contains deserialization operations
pub fn hasDeserialize(self: *const ErgoTree) bool {
return switch (self.root) {
.parsed => |p| containsDeserializeOp(p),
.unparsed => false,
};
}
/// Returns true if tree uses blockchain context
pub fn isUsingBlockchainContext(self: *const ErgoTree) bool {
return switch (self.root) {
.parsed => |p| containsContextOp(p),
.unparsed => false,
};
}
/// Convert to SigmaBoolean if simple proposition
pub fn toSigmaBooleanOpt(self: *const ErgoTree) ?SigmaBoolean {
const prop = self.toProposition(self.header.constant_segregation) catch return null;
return switch (prop) {
.sigma_prop_constant => |c| c.value,
else => null,
};
}
};
Summary
This chapter covered the complete ErgoTree binary format—the serialized representation of smart contracts stored in every UTXO box:
- ErgoTree is a self-sufficient serialized contract format containing everything needed for evaluation without external dependencies
- The header byte uses a bit-field layout: version (bits 0-2), size flag (bit 3), constant segregation flag (bit 4), with reserved bits for future extensions
- Constant segregation (bit 4) extracts literal values into a separate array, enabling template sharing, caching, and runtime substitution without re-parsing
- The version mechanism enables soft-fork protocol upgrades—newer ErgoTree versions are skipped by older nodes rather than causing consensus failures
- ErgoTree versions 1+ require the size flag, allowing nodes to skip past unknown content
- UnparsedTree preserves raw bytes when deserialization fails, maintaining block validity even with unknown opcodes
- Simple cryptographic propositions can be extracted as SigmaBoolean values for direct signature verification
Next: Chapter 4: Value Nodes
1. Scala: ErgoTree.scala:24-80
2. Rust: ergo_tree.rs:33-41
3. Scala: ErgoTree.scala:227-270
4. Rust: tree_header.rs:10-32
5. Scala: ErgoTree.scala:307-322
6. Scala: ErgoTree.scala:263-305
7. Scala: ErgoTree.scala:19-22
Chapter 4: Value Nodes
Prerequisites
- Understanding of Abstract Syntax Trees (ASTs) as hierarchical representations where each node represents a language construct
- Tree traversal techniques (depth-first evaluation)
- Prior chapters: Chapter 2 for the type system that governs value types, Chapter 3 for how values are serialized in ErgoTree
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the `Value` base type and its role as the foundation for all ErgoTree expression nodes
- Distinguish between different constant value types (primitives, cryptographic, collections)
- Describe how the `eval` method implements the evaluation semantics for each node type
- Work with compound values including collections and tuples
The Value Base Type
ErgoTree is fundamentally an expression tree where every node produces a typed value. The `Value` base type defines the common interface that all expression nodes share—a type annotation, an opcode for serialization, and an evaluation method that computes the result [1][2].
/// Base type for all ErgoTree expression nodes
const Value = struct {
tpe: SType,
op_code: OpCode,
/// Evaluate this node in the given environment
pub fn eval(self: *const Value, env: *const DataEnv, evaluator: *Evaluator) !Any {
// Default: must be overridden
return error.NotImplemented;
}
/// Add fixed cost to accumulator
pub fn addCost(self: *const Value, evaluator: *Evaluator, cost: FixedCost) void {
evaluator.addCost(cost, self.op_code);
}
/// Add per-item cost for known iteration count
pub fn addSeqCost(
self: *const Value,
evaluator: *Evaluator,
cost: PerItemCost,
n_items: usize,
) void {
evaluator.addSeqCost(cost, n_items, self.op_code);
}
};
Value Hierarchy
Value
├── Constant
│ ├── BooleanConstant (TrueLeaf, FalseLeaf)
│ ├── ByteConstant, ShortConstant, IntConstant, LongConstant
│ ├── BigIntConstant, UnsignedBigIntConstant (v6+)
│ ├── GroupElementConstant
│ ├── SigmaPropConstant
│ ├── CollectionConstant
│ └── UnitConstant
├── ConstantPlaceholder
├── Tuple
├── ConcreteCollection
├── SigmaPropValue
│ ├── BoolToSigmaProp
│ ├── CreateProveDlog
│ ├── CreateProveDHTuple
│ ├── SigmaAnd
│ └── SigmaOr
└── Transformer (collection operations)
├── AND, OR, XorOf
├── Map, Filter, Fold
└── Exists, ForAll
The hierarchy divides into several major categories:
- Constants hold literal values known at compile time
- ConstantPlaceholder references segregated constants by index (see Chapter 3)
- Compound values (Tuple, ConcreteCollection) combine multiple values
- SigmaPropValue nodes produce cryptographic propositions for signing
- Transformers perform operations on collections
Constant Values
Constants are pre-evaluated values embedded in the tree [3][4]:
const Constant = struct {
tpe: SType,
value: Literal,
pub const COST = FixedCost{ .value = 5 }; // JitCost units
pub fn eval(self: *const Constant, env: *const DataEnv, E: *Evaluator) Any {
E.addCost(COST, OpCode.Constant);
return self.value.toAny();
}
};
/// Literal values for constants
const Literal = union(enum) {
boolean: bool,
byte: i8,
short: i16,
int: i32,
long: i64,
big_int: BigInt256,
unsigned_big_int: UnsignedBigInt256,
group_element: EcPoint,
sigma_prop: SigmaProp,
coll: Collection,
tuple: []const Literal,
unit: void,
pub fn toAny(self: Literal) Any {
return switch (self) {
.boolean => |b| .{ .boolean = b },
.int => |i| .{ .int = i },
// ... other cases
};
}
};
Primitive Constant Factories
pub fn intConstant(value: i32) Constant {
return .{
.tpe = SType.int,
.value = .{ .int = value },
};
}
pub fn longConstant(value: i64) Constant {
return .{
.tpe = SType.long,
.value = .{ .long = value },
};
}
pub fn byteArrayConstant(bytes: []const u8) Constant {
return .{
.tpe = .{ .coll = &SType.byte },
.value = .{ .coll = .{ .bytes = bytes } },
};
}
Boolean Singletons
Boolean has special singleton instances for efficiency [5]:
pub const TrueLeaf = Constant{
.tpe = SType.boolean,
.value = .{ .boolean = true },
};
pub const FalseLeaf = Constant{
.tpe = SType.boolean,
.value = .{ .boolean = false },
};
pub fn booleanConstant(v: bool) *const Constant {
return if (v) &TrueLeaf else &FalseLeaf;
}
Cryptographic Constants
pub fn groupElementConstant(point: EcPoint) Constant {
return .{
.tpe = SType.group_element,
.value = .{ .group_element = point },
};
}
pub fn sigmaPropConstant(prop: SigmaProp) Constant {
return .{
.tpe = SType.sigma_prop,
.value = .{ .sigma_prop = prop },
};
}
/// Group generator - base point G of secp256k1
pub const GroupGenerator = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) GroupElement {
E.addCost(COST, OpCode.GroupGenerator);
return crypto.SECP256K1_GENERATOR;
}
};
Constant Placeholders
When constant segregation is enabled (Chapter 3), placeholders replace inline constants with index references into the constants array [6][7]. This separation enables template caching—the same expression tree structure can be reused with different constant values. Placeholder evaluation costs less than inline constants because the constant data has already been parsed and validated during ErgoTree deserialization.
const ConstantPlaceholder = struct {
index: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 1 }; // Cheaper than Constant
pub fn eval(self: *const ConstantPlaceholder, _: *const DataEnv, E: *Evaluator) !Any {
// Bounds check first (prevents out-of-bounds access)
if (self.index >= E.constants.len) {
return error.ConstantIndexOutOfBounds;
}
const c = E.constants[self.index];
E.addCost(COST, OpCode.ConstantPlaceholder);
// Type check
if (c.tpe != self.tpe) {
return error.TypeMismatch;
}
return c.value.toAny();
}
};
Collection Values
ErgoTree supports two kinds of collection nodes, optimized for different use cases:
CollectionConstant
For collections where all elements are known at compile time, CollectionConstant stores the values directly. This enables efficient serialization and avoids evaluation overhead for static data like byte arrays and fixed integer sequences.
const CollectionConstant = struct {
elem_type: SType,
items: union(enum) {
bytes: []const u8,
ints: []const i32,
longs: []const i64,
bools: []const bool,
any: []const Literal,
},
pub fn tpe(self: *const CollectionConstant) SType {
return .{ .coll = &self.elem_type };
}
};
ConcreteCollection
When collection elements are computed expressions rather than literals, ConcreteCollection holds references to sub-expression nodes [8]. Each element is evaluated at runtime, making this suitable for dynamically constructed collections.
const ConcreteCollection = struct {
items: []const *Value,
elem_type: SType,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const ConcreteCollection, env: *const DataEnv, E: *Evaluator) ![]Any {
E.addCost(COST, OpCode.ConcreteCollection);
var result = try E.allocator.alloc(Any, self.items.len);
for (self.items, 0..) |item, i| {
result[i] = try item.eval(env, E);
}
return result;
}
};
// NOTE: In production, use a pre-allocated value pool to avoid dynamic
// allocation during evaluation. See ZIGMA_STYLE.md memory management section.
Tuple Values
Heterogeneous fixed-size sequences [9]:
const Tuple = struct {
items: []const *Value,
pub const COST = FixedCost{ .value = 15 };
pub fn tpe(self: *const Tuple, allocator: Allocator) !STuple {
    const types = try allocator.alloc(SType, self.items.len);
    for (self.items, 0..) |item, i| {
        types[i] = item.tpe;
    }
    return STuple{ .items = types };
}
pub fn eval(self: *const Tuple, env: *const DataEnv, E: *Evaluator) !TupleValue {
// Note: v5.0 only supports pairs (2 elements)
if (self.items.len != 2) {
return error.InvalidTupleSize;
}
const x = try self.items[0].eval(env, E);
const y = try self.items[1].eval(env, E);
E.addCost(COST, OpCode.Tuple);
return .{ x, y };
}
};
Sigma Proposition Values
BoolToSigmaProp
Converts a boolean to a cryptographic proposition [10]:
const BoolToSigmaProp = struct {
input: *Value, // Must be boolean
pub const COST = FixedCost{ .value = 15 };
pub fn eval(self: *const BoolToSigmaProp, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const v = try self.input.eval(env, E);
E.addCost(COST, OpCode.BoolToSigmaProp);
return SigmaProp.fromBool(v.boolean);
}
};
CreateProveDlog
Creates a discrete log proposition (standard public key) [11]:
const CreateProveDlog = struct {
input: *Value, // GroupElement
pub const COST = FixedCost{ .value = 10 };
pub fn eval(self: *const CreateProveDlog, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const point = try self.input.eval(env, E);
E.addCost(COST, OpCode.ProveDlog);
return SigmaProp{
.prove_dlog = ProveDlog{ .h = point.group_element },
};
}
};
CreateProveDHTuple
Creates Diffie-Hellman tuple proposition:
const CreateProveDHTuple = struct {
g: *Value,
h: *Value,
u: *Value,
v: *Value,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const CreateProveDHTuple, env: *const DataEnv, E: *Evaluator) !SigmaProp {
const g_val = try self.g.eval(env, E);
const h_val = try self.h.eval(env, E);
const u_val = try self.u.eval(env, E);
const v_val = try self.v.eval(env, E);
E.addCost(COST, OpCode.ProveDHTuple);
return SigmaProp{
.prove_dh_tuple = ProveDhTuple{
.g = g_val.group_element,
.h = h_val.group_element,
.u = u_val.group_element,
.v = v_val.group_element,
},
};
}
};
SigmaAnd / SigmaOr
Combine sigma propositions [12]:
const SigmaAnd = struct {
items: []const *Value, // SigmaPropValues
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 2,
.chunk_size = 1,
};
pub fn eval(self: *const SigmaAnd, env: *const DataEnv, E: *Evaluator) !SigmaProp {
var props = try E.allocator.alloc(SigmaProp, self.items.len);
for (self.items, 0..) |item, i| {
props[i] = (try item.eval(env, E)).sigma_prop;
}
E.addSeqCost(COST, self.items.len, OpCode.SigmaAnd);
return SigmaProp{ .cand = Cand{ .children = props } };
}
};
const SigmaOr = struct {
items: []const *Value,
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 2,
.chunk_size = 1,
};
pub fn eval(self: *const SigmaOr, env: *const DataEnv, E: *Evaluator) !SigmaProp {
var props = try E.allocator.alloc(SigmaProp, self.items.len);
for (self.items, 0..) |item, i| {
props[i] = (try item.eval(env, E)).sigma_prop;
}
E.addSeqCost(COST, self.items.len, OpCode.SigmaOr);
return SigmaProp{ .cor = Cor{ .children = props } };
}
};
Logical Operations
AND / OR with Short-Circuit
Boolean operations support short-circuit evaluation [13]:
const AND = struct {
input: *Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const AND, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = true;
var i: usize = 0;
// Short-circuit: stop on first false
while (i < items.len and result) {
result = result and items[i];
i += 1;
}
// Cost based on actual items processed
E.addSeqCost(COST, i, OpCode.And);
return result;
}
};
const OR = struct {
input: *Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const OR, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = false;
var i: usize = 0;
// Short-circuit: stop on first true
while (i < items.len and !result) {
result = result or items[i];
i += 1;
}
E.addSeqCost(COST, i, OpCode.Or);
return result;
}
};
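The cost consequence of short-circuiting can be seen by counting processed items; a toy Python model of the loop above (names are ours):

```python
def and_with_count(items):
    """Short-circuit AND that also reports how many items were examined,
    mirroring how the evaluator charges only for processed elements."""
    processed = 0
    for b in items:
        processed += 1
        if not b:
            return False, processed  # stop at the first false
    return True, processed

assert and_with_count([True, False, True, True]) == (False, 2)  # 2 of 4 items charged
assert and_with_count([True, True]) == (True, 2)                # full scan when all true
```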
XorOf
XOR over boolean collection:
const XorOf = struct {
input: *Value,
pub const COST = PerItemCost{
.base = 20,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const XorOf, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = false;
for (items) |b| {
result = result != b; // XOR
}
E.addSeqCost(COST, items.len, OpCode.XorOf);
return result;
}
};
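XorOf is true exactly when an odd number of inputs are true; a direct Python model of the fold above:

```python
def xor_of(items):
    """Fold XOR over a boolean collection: true iff an odd number are true."""
    result = False
    for b in items:
        result = result != b  # != on booleans is XOR
    return result

assert xor_of([True, True, True]) is True  # odd count of trues
assert xor_of([True, True]) is False       # even count
assert xor_of([]) is False                 # empty collection
```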
Cost Summary
| Operation | Cost Type | Value |
|---|---|---|
| Constant | Fixed | 5 |
| ConstantPlaceholder | Fixed | 1 |
| Tuple | Fixed | 15 |
| BoolToSigmaProp | Fixed | 15 |
| CreateProveDlog | Fixed | 10 |
| CreateProveDHTuple | Fixed | 20 |
| GroupGenerator | Fixed | 10 |
| AND/OR | PerItem | base=10, chunk=5/32 |
| SigmaAnd/SigmaOr | PerItem | base=10, chunk=2/1 |
| XorOf | PerItem | base=20, chunk=5/32 |
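Assuming the per-item formula is base + per_chunk * ceil(n_items / chunk_size) (an assumption to verify against the reference PerItemCost), the table rows compose like this:

```python
import math

def per_item_cost(base: int, per_chunk: int, chunk_size: int, n_items: int) -> int:
    """Chunked per-item cost: a base fee plus a per-chunk fee for every started
    chunk of n_items. The formula is an assumption -- cross-check against the
    reference implementation's PerItemCost before relying on it."""
    return base + per_chunk * math.ceil(n_items / chunk_size)

# AND over 100 booleans (base=10, chunk=5/32): 10 + 5 * ceil(100/32) = 30
assert per_item_cost(10, 5, 32, 100) == 30
# SigmaAnd over 3 children (base=10, chunk=2/1): 10 + 2 * 3 = 16
assert per_item_cost(10, 2, 1, 3) == 16
```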
Summary
This chapter introduced the value node hierarchy that forms the foundation of ErgoTree's expression tree:
- `Value` is the base type for all ErgoTree expression nodes, defining the common interface of type, opcode, and evaluation method
- Every value carries type information (`tpe`) used for static type checking and cost information used for bounded execution
- Constants are pre-evaluated literals embedded in the tree; `ConstantPlaceholder` provides indirection to segregated constants for template sharing
- Collection values come in two forms: `CollectionConstant` for static data and `ConcreteCollection` for computed elements
- Sigma proposition values (`CreateProveDlog`, `CreateProveDHTuple`, `SigmaAnd`, `SigmaOr`) produce cryptographic propositions that require zero-knowledge proofs
- Boolean operations (`AND`, `OR`) support short-circuit evaluation, charging costs only for elements actually processed
- The `eval` method on each value type implements its evaluation semantics, transforming the AST node into a runtime value
Next: Chapter 5: Operations and Opcodes
1. Scala: values.scala:30-165
2. Rust: expr.rs:1-80
3. Scala: values.scala:305-398
4. Rust: constant.rs:51-58
5. Scala: values.scala (TrueLeaf, FalseLeaf)
6. Scala: values.scala:400-422
7. Rust: constant_placeholder.rs
8. Scala: values.scala (ConcreteCollection)
9. Scala: values.scala:771-810
10. Scala: trees.scala:28-57
11. Scala: trees.scala (CreateProveDlog)
12. Scala: trees.scala (SigmaAnd, SigmaOr)
13. Scala: trees.scala:186-299
Chapter 5: Operations and Opcodes
Prerequisites
- Understanding of bytecode as numeric instruction encodings
- Single-byte vs multi-byte encoding trade-offs
- Prior chapters: Chapter 4 for value node types, Chapter 2 for type codes that occupy the lower opcode range
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the opcode encoding scheme and why constants share space with operations
- Navigate the complete opcode space (0x00-0xFF) and identify operation categories
- Describe the three cost descriptor types (`FixedCost`, `PerItemCost`, `TypeBasedCost`)
- Understand how short-circuit evaluation affects cost calculation
Opcode Encoding Scheme
Every ErgoTree operation is identified by a single-byte opcode [1][2]:
Opcode Space Layout:
┌────────────┬───────────────────────────────────────────┐
│ 0x00       │ Reserved (Undefined)                      │
├────────────┼───────────────────────────────────────────┤
│ 0x01-0x6F  │ Data type codes (LastDataType = 111)      │
├────────────┼───────────────────────────────────────────┤
│ 0x70       │ LastConstantCode (LastDataType + 1)       │
├────────────┼───────────────────────────────────────────┤
│ 0x71-0xFF  │ Operation codes (newOpCode 1-143)         │
└────────────┴───────────────────────────────────────────┘
This layout is an optimization: constant values in the range 0x01-0x70 encode their type code directly as the opcode, saving one byte per constant in the serialized tree. The type code simultaneously identifies both what the value is and how to deserialize it. Operations occupy the upper range (0x71-0xFF), providing 143 distinct operation codes.
const OpCode = struct {
value: u8,
pub const FIRST_DATA_TYPE: u8 = 1;
pub const LAST_DATA_TYPE: u8 = 111;
pub const LAST_CONSTANT_CODE: u8 = 112; // LAST_DATA_TYPE + 1
pub fn new(shift: u8) OpCode {
return .{ .value = LAST_CONSTANT_CODE + shift };
}
pub fn isConstant(byte: u8) bool {
return byte >= FIRST_DATA_TYPE and byte <= LAST_CONSTANT_CODE;
}
};
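A quick numeric cross-check of the scheme, mirroring new and isConstant above:

```python
LAST_CONSTANT_CODE = 112  # 0x70; LAST_DATA_TYPE (111) + 1

def new_op_code(shift: int) -> int:
    """Operation codes are offsets above the constant range."""
    return LAST_CONSTANT_CODE + shift

def is_constant(byte: int) -> bool:
    """Bytes 0x01-0x70 encode constants (type code doubles as opcode)."""
    return 1 <= byte <= LAST_CONSTANT_CODE

assert new_op_code(1) == 0x71    # TaggedVariable, first operation
assert new_op_code(143) == 0xFF  # XorOf, last operation
assert is_constant(0x70) and not is_constant(0x71)
```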
Opcode Definitions
const OpCodes = struct {
// Variables (0x71-0x74)
pub const TaggedVariable = OpCode.new(1); // 113
pub const ValUse = OpCode.new(2); // 114
pub const ConstantPlaceholder = OpCode.new(3); // 115
pub const SubstConstants = OpCode.new(4); // 116
// Conversions (0x7A-0x7E)
pub const LongToByteArray = OpCode.new(10); // 122
pub const ByteArrayToBigInt = OpCode.new(11); // 123
pub const ByteArrayToLong = OpCode.new(12); // 124
pub const Downcast = OpCode.new(13); // 125
pub const Upcast = OpCode.new(14); // 126
// Literals (0x7F-0x86)
pub const True = OpCode.new(15); // 127
pub const False = OpCode.new(16); // 128
pub const UnitConstant = OpCode.new(17); // 129
pub const GroupGenerator = OpCode.new(18); // 130
pub const Coll = OpCode.new(19); // 131
pub const CollOfBoolConst = OpCode.new(21); // 133
pub const Tuple = OpCode.new(22); // 134
// Tuple access (0x87-0x8C)
pub const Select1 = OpCode.new(23); // 135
pub const Select2 = OpCode.new(24); // 136
pub const Select3 = OpCode.new(25); // 137
pub const Select4 = OpCode.new(26); // 138
pub const Select5 = OpCode.new(27); // 139
pub const SelectField = OpCode.new(28); // 140
// Relations (0x8F-0x98)
pub const Lt = OpCode.new(31); // 143
pub const Le = OpCode.new(32); // 144
pub const Gt = OpCode.new(33); // 145
pub const Ge = OpCode.new(34); // 146
pub const Eq = OpCode.new(35); // 147
pub const Neq = OpCode.new(36); // 148
pub const If = OpCode.new(37); // 149
pub const And = OpCode.new(38); // 150
pub const Or = OpCode.new(39); // 151
pub const AtLeast = OpCode.new(40); // 152
// Arithmetic (0x99-0xA2)
pub const Minus = OpCode.new(41); // 153
pub const Plus = OpCode.new(42); // 154
pub const Xor = OpCode.new(43); // 155
pub const Multiply = OpCode.new(44); // 156
pub const Division = OpCode.new(45); // 157
pub const Modulo = OpCode.new(46); // 158
pub const Exponentiate = OpCode.new(47); // 159
pub const MultiplyGroup = OpCode.new(48); // 160
pub const Min = OpCode.new(49); // 161
pub const Max = OpCode.new(50); // 162
// Context (0xA3-0xAC)
pub const Height = OpCode.new(51); // 163
pub const Inputs = OpCode.new(52); // 164
pub const Outputs = OpCode.new(53); // 165
pub const LastBlockUtxoRootHash = OpCode.new(54); // 166
pub const Self = OpCode.new(55); // 167
pub const MinerPubkey = OpCode.new(60); // 172
// Collections (0xAD-0xB8)
pub const Map = OpCode.new(61); // 173
pub const Exists = OpCode.new(62); // 174
pub const ForAll = OpCode.new(63); // 175
pub const Fold = OpCode.new(64); // 176
pub const SizeOf = OpCode.new(65); // 177
pub const ByIndex = OpCode.new(66); // 178
pub const Append = OpCode.new(67); // 179
pub const Slice = OpCode.new(68); // 180
pub const Filter = OpCode.new(69); // 181
pub const AvlTree = OpCode.new(70); // 182
pub const FlatMap = OpCode.new(72); // 184
// Box access (0xC1-0xC7)
pub const ExtractAmount = OpCode.new(81); // 193
pub const ExtractScriptBytes = OpCode.new(82); // 194
pub const ExtractBytes = OpCode.new(83); // 195
pub const ExtractBytesWithNoRef = OpCode.new(84); // 196
pub const ExtractId = OpCode.new(85); // 197
pub const ExtractRegisterAs = OpCode.new(86); // 198
pub const ExtractCreationInfo = OpCode.new(87); // 199
// Crypto (0xCB-0xD3)
pub const CalcBlake2b256 = OpCode.new(91); // 203
pub const CalcSha256 = OpCode.new(92); // 204
pub const ProveDlog = OpCode.new(93); // 205
pub const ProveDHTuple = OpCode.new(94); // 206
pub const SigmaPropBytes = OpCode.new(96); // 208
pub const BoolToSigmaProp = OpCode.new(97); // 209
pub const TrivialFalse = OpCode.new(98); // 210
pub const TrivialTrue = OpCode.new(99); // 211
// Blocks (0xD4-0xDD)
pub const DeserializeContext = OpCode.new(100); // 212
pub const DeserializeRegister = OpCode.new(101); // 213
pub const ValDef = OpCode.new(102); // 214
pub const FunDef = OpCode.new(103); // 215
pub const BlockValue = OpCode.new(104); // 216
pub const FuncValue = OpCode.new(105); // 217
pub const FuncApply = OpCode.new(106); // 218
pub const PropertyCall = OpCode.new(107); // 219
pub const MethodCall = OpCode.new(108); // 220
pub const Global = OpCode.new(109); // 221
// Options (0xDE-0xE6)
pub const SomeValue = OpCode.new(110); // 222
pub const NoneValue = OpCode.new(111); // 223
pub const GetVar = OpCode.new(115); // 227
pub const OptionGet = OpCode.new(116); // 228
pub const OptionGetOrElse = OpCode.new(117); // 229
pub const OptionIsDefined = OpCode.new(118); // 230
// Sigma props (0xEA-0xED)
pub const SigmaAnd = OpCode.new(122); // 234
pub const SigmaOr = OpCode.new(123); // 235
pub const BinOr = OpCode.new(124); // 236
pub const BinAnd = OpCode.new(125); // 237
// Bitwise (0xEE-0xF8)
pub const DecodePoint = OpCode.new(126); // 238
pub const LogicalNot = OpCode.new(127); // 239
pub const Negation = OpCode.new(128); // 240
pub const BitInversion = OpCode.new(129); // 241
pub const BitOr = OpCode.new(130); // 242
pub const BitAnd = OpCode.new(131); // 243
pub const BinXor = OpCode.new(132); // 244
pub const BitXor = OpCode.new(133); // 245
pub const BitShiftRight = OpCode.new(134); // 246
pub const BitShiftLeft = OpCode.new(135); // 247
pub const BitShiftRightZeroed = OpCode.new(136); // 248
// Special (0xFE-0xFF)
pub const Context = OpCode.new(142); // 254
pub const XorOf = OpCode.new(143); // 255
};
Opcode Categories Summary
| Category | Range | Count | Description |
|---|---|---|---|
| Variables | 113-116 | 4 | Variable references, placeholders |
| Conversions | 122-126 | 5 | Type conversions |
| Literals | 127-134 | 8 | Boolean, unit, collections |
| Tuple access | 135-140 | 6 | Field selection |
| Relations | 143-152 | 10 | Comparisons, conditionals |
| Arithmetic | 153-162 | 10 | Math operations |
| Context | 163-172 | 6 | Transaction context |
| Collections | 173-184 | 10 | Collection operations |
| Box access | 193-199 | 7 | Box property access |
| Crypto | 203-211 | 9 | Hashing, sigma props |
| Blocks | 212-221 | 10 | Definitions, lambdas |
| Options | 222-230 | 7 | Option operations |
| Sigma props | 234-237 | 4 | Sigma composition |
| Bitwise | 238-248 | 11 | Bit operations |
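The ranges above can be collapsed into a single classification helper. The sketch below is this book's own illustration, not code from either reference implementation; the `Category` enum is invented here, and the gaps between ranges (reserved codes) map to `null`:

```zig
const Category = enum {
    constant, variables, conversions, literals, tuple_access, relations,
    arithmetic, context, collections, box_access, crypto, blocks,
    options, sigma_props, bitwise, special,
};

/// Classify a raw serialized byte using the ranges from the table above.
fn categoryOf(code: u8) ?Category {
    return switch (code) {
        1...112 => .constant, // constant type codes (0x01-0x70)
        113...116 => .variables,
        122...126 => .conversions,
        127...134 => .literals,
        135...140 => .tuple_access,
        143...152 => .relations,
        153...162 => .arithmetic,
        163...172 => .context,
        173...184 => .collections,
        193...199 => .box_access,
        203...211 => .crypto,
        212...221 => .blocks,
        222...230 => .options,
        234...237 => .sigma_props,
        238...248 => .bitwise,
        254, 255 => .special,
        else => null, // reserved / unassigned
    };
}
```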
Arithmetic Operations
Arithmetic operations use type-based costing34:
const ArithOp = struct {
op_code: OpCode,
left: *const Value,
right: *const Value,
pub fn eval(self: *const ArithOp, env: *const DataEnv, E: *Evaluator) !Any {
const x = try self.left.eval(env, E);
const y = try self.right.eval(env, E);
const cost = switch (self.left.tpe) {
.big_int, .unsigned_big_int => 30, // worst case; per-op BigInt costs vary (see table below)
else => 15,
};
E.addCost(FixedCost{ .value = cost }, self.op_code);
return switch (self.op_code.value) {
OpCodes.Plus.value => arithPlus(x, y, self.left.tpe),
OpCodes.Minus.value => arithMinus(x, y, self.left.tpe),
OpCodes.Multiply.value => arithMultiply(x, y, self.left.tpe),
OpCodes.Division.value => arithDivision(x, y, self.left.tpe),
OpCodes.Modulo.value => arithModulo(x, y, self.left.tpe),
OpCodes.Min.value => arithMin(x, y, self.left.tpe),
OpCodes.Max.value => arithMax(x, y, self.left.tpe),
else => error.UnknownOpcode,
};
}
};
fn arithPlus(x: Any, y: Any, tpe: SType) !Any {
// NOTE: ErgoTree arithmetic is exact, not wrapping: both reference
// implementations fail evaluation on overflow (Math.addExact in Scala,
// checked_add in Rust), so wrapping operators like +% would be incorrect here.
return switch (tpe) {
.byte => .{ .byte = try checkedAdd(i8, x.byte, y.byte) },
.short => .{ .short = try checkedAdd(i16, x.short, y.short) },
.int => .{ .int = try checkedAdd(i32, x.int, y.int) },
.long => .{ .long = try checkedAdd(i64, x.long, y.long) },
.big_int => .{ .big_int = try x.big_int.add(y.big_int) }, // 256-bit overflow also fails
else => unreachable,
};
}
fn checkedAdd(comptime T: type, a: T, b: T) !T {
const res = @addWithOverflow(a, b);
if (res[1] != 0) return error.ArithmeticOverflow;
return res[0];
}
Arithmetic Cost Table
| Operation | Primitive Cost | BigInt Cost |
|---|---|---|
| Plus (+) | 15 | 20 |
| Minus (-) | 15 | 20 |
| Multiply (*) | 15 | 30 |
| Division (/) | 15 | 30 |
| Modulo (%) | 15 | 30 |
| Min/Max | 15 | 20 |
Relation Operations
Comparison operations5:
const Relation = struct {
op_code: OpCode,
left: *const Value,
right: *const Value,
pub fn eval(self: *const Relation, env: *const DataEnv, E: *Evaluator) !bool {
const lv = try self.left.eval(env, E);
const rv = try self.right.eval(env, E);
const cost: u32 = switch (self.op_code.value) {
OpCodes.Eq.value, OpCodes.Neq.value => 3, // Equality cheap
else => 15, // Ordering comparisons
};
E.addCost(FixedCost{ .value = cost }, self.op_code);
return switch (self.op_code.value) {
OpCodes.Lt.value => compare(lv, rv, self.left.tpe) < 0,
OpCodes.Le.value => compare(lv, rv, self.left.tpe) <= 0,
OpCodes.Gt.value => compare(lv, rv, self.left.tpe) > 0,
OpCodes.Ge.value => compare(lv, rv, self.left.tpe) >= 0,
OpCodes.Eq.value => equalValues(lv, rv),
OpCodes.Neq.value => !equalValues(lv, rv),
else => error.UnknownOpcode,
};
}
};
Logical Operations
Short-circuit evaluation with per-item cost6:
const LogicalAnd = struct {
input: *const Value, // Collection[Boolean]
pub const COST = PerItemCost{
.base = 10,
.per_chunk = 5,
.chunk_size = 32,
};
pub fn eval(self: *const LogicalAnd, env: *const DataEnv, E: *Evaluator) !bool {
const coll = try self.input.eval(env, E);
const items = coll.coll.bools;
var result = true;
var i: usize = 0;
// Short-circuit: stop on first false
while (i < items.len and result) : (i += 1) {
result = result and items[i];
}
// Cost based on actual items processed
E.addSeqCost(COST, i, OpCodes.And);
return result;
}
};
const BinaryAnd = struct {
left: *const Value,
right: *const Value,
pub const COST = FixedCost{ .value = 20 };
pub fn eval(self: *const BinaryAnd, env: *const DataEnv, E: *Evaluator) !bool {
const l = try self.left.eval(env, E);
E.addCost(COST, OpCodes.BinAnd);
// Short-circuit: don't evaluate right if left is false
if (!l.boolean) return false;
return (try self.right.eval(env, E)).boolean;
}
};
Cost Descriptors
Every operation has an associated cost that the interpreter accumulates during evaluation. If the total cost exceeds the block limit, execution fails—this prevents denial-of-service attacks via expensive computations. Three cost descriptor types model different operation characteristics7:
/// Fixed cost regardless of input
const FixedCost = struct {
value: u32, // JitCost units
};
/// Cost scales with input size
const PerItemCost = struct {
base: u32, // Fixed overhead
per_chunk: u32, // Cost per chunk
chunk_size: u32, // Items per chunk
pub fn calculate(self: PerItemCost, n_items: usize) u32 {
const chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
return self.base + @as(u32, @intCast(chunks)) * self.per_chunk;
}
};
/// Cost depends on operand type
const TypeBasedCost = struct {
primitive_cost: u32,
big_int_cost: u32,
pub fn forType(self: TypeBasedCost, tpe: SType) u32 {
return switch (tpe) {
.big_int, .unsigned_big_int => self.big_int_cost,
else => self.primitive_cost,
};
}
};
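To make the chunked formula concrete, here is a worked check of `PerItemCost.calculate` using the Blake2b256 descriptor values that appear later in this chapter (base 117, per-chunk 1, chunk size 128). This is illustrative test code, assuming `std` is imported and the `PerItemCost` definition above is in scope:

```zig
test "PerItemCost rounds partial chunks up" {
    const blake_cost = PerItemCost{ .base = 117, .per_chunk = 1, .chunk_size = 128 };
    // 0 items -> 0 chunks -> base cost only
    try std.testing.expectEqual(@as(u32, 117), blake_cost.calculate(0));
    // 1..128 items -> 1 chunk
    try std.testing.expectEqual(@as(u32, 118), blake_cost.calculate(128));
    // 129 items -> 2 chunks: a partial chunk costs as much as a full one
    try std.testing.expectEqual(@as(u32, 119), blake_cost.calculate(129));
}
```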
Context Operations
Access transaction context8:
const ContextOps = struct {
pub const Height = struct {
pub const COST = FixedCost{ .value = 26 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) i32 {
E.addCost(COST, OpCodes.Height);
return E.context.pre_header.height;
}
};
pub const Inputs = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
E.addCost(COST, OpCodes.Inputs);
return E.context.inputs;
}
};
pub const Outputs = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) []const Box {
E.addCost(COST, OpCodes.Outputs);
return E.context.outputs;
}
};
pub const SelfBox = struct {
pub const COST = FixedCost{ .value = 10 };
pub fn eval(_: *const @This(), _: *const DataEnv, E: *Evaluator) *const Box {
E.addCost(COST, OpCodes.Self);
return E.context.self_box;
}
};
};
Box Property Access
Extract box properties9:
const ExtractAmount = struct {
box: *const Value,
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractAmount, env: *const DataEnv, E: *Evaluator) !i64 {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractAmount);
return b.box.value;
}
};
const ExtractId = struct {
box: *const Value,
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractId, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractId);
return b.box.id();
}
};
const ExtractRegisterAs = struct {
box: *const Value,
register_id: u4, // 0-9
pub const COST = FixedCost{ .value = 12 };
pub fn eval(self: *const ExtractRegisterAs, env: *const DataEnv, E: *Evaluator) !?Constant {
const b = try self.box.eval(env, E);
E.addCost(COST, OpCodes.ExtractRegisterAs);
return b.box.registers[self.register_id];
}
};
Cryptographic Operations
Hash and sigma prop operations10:
const CalcBlake2b256 = struct {
input: *const Value, // Coll[Byte]
pub const COST = PerItemCost{
.base = 117,
.per_chunk = 1,
.chunk_size = 128,
};
pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const bytes = try self.input.eval(env, E);
E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcBlake2b256);
var hasher = std.crypto.hash.blake2.Blake2b256.init(.{});
hasher.update(bytes.coll.bytes);
return hasher.finalResult();
}
};
const CalcSha256 = struct {
input: *const Value,
pub const COST = PerItemCost{
.base = 79,
.per_chunk = 1,
.chunk_size = 64,
};
pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![32]u8 {
const bytes = try self.input.eval(env, E);
E.addSeqCost(COST, bytes.coll.bytes.len, OpCodes.CalcSha256);
var hasher = std.crypto.hash.sha2.Sha256.init(.{});
hasher.update(bytes.coll.bytes);
return hasher.finalResult();
}
};
Summary
This chapter detailed the opcode encoding scheme that gives each ErgoTree operation a unique byte identifier:
- Opcode space is split between constant type codes (0x01-0x70) and operation codes (0x71-0xFF), with constants using their type code directly to save one byte per value
- Operation categories group related functionality: variables, conversions, relations, arithmetic, context access, collections, box properties, cryptography, blocks, options, sigma propositions, and bitwise operations
- Cost descriptors come in three types: `FixedCost` for constant-time operations, `PerItemCost` for operations that scale with input size, and `TypeBasedCost` for operations where BigInt is more expensive than primitive types
- Short-circuit evaluation in logical operations (`AND`, `OR`, `BinaryAnd`, `BinaryOr`) stops early when the result is determined, with costs calculated based on actual items processed
- Context operations provide access to transaction data: `HEIGHT`, `INPUTS`, `OUTPUTS`, the `SELF` box, and the miner public key
Next: Chapter 6: Methods on Types
Scala: OpCodes.scala
Rust: op_code.rs:10-100
Scala: trees.scala:704-827
Rust: bin_op.rs
Scala: trees.scala:908-1100
Scala: trees.scala (AND, OR)
Scala: CostKind.scala
Scala: trees.scala (context operations)
Scala: trees.scala (box accessors)
Scala: trees.scala (crypto operations)
Chapter 6: Methods on Types
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Understanding of method dispatch—how method calls are resolved to specific implementations based on the receiver type
- Familiarity with type hierarchies and how types can share common method interfaces
- Prior chapters: Chapter 2 for type codes used in method resolution, Chapter 5 for operations vs methods distinction
Learning Objectives
By the end of this chapter, you will be able to:
- Explain how methods are organized via `MethodsContainer` and resolved by type code and method ID
- Use methods on numeric, collection, box, and cryptographic types
- Describe the method resolution process from `MethodCall` to method implementation
- Access transaction context and blockchain state through context methods
Method Architecture
While Chapter 5 covered standalone operations (arithmetic, comparisons, etc.), ErgoTree also supports methods—operations that belong to specific types. The distinction matters for serialization: operations use opcodes directly, while method calls serialize a type code, method ID, and arguments. This design allows types to have rich APIs without consuming the limited opcode space.
Methods are organized through a MethodsContainer system that groups related methods by their receiver type12:
Method Organization
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ STypeCompanion │
│ (type_code: u8, methods: []const SMethodDesc) │
└───────────────────────┬────────────────────────────────────────┘
│
┌────────────────┼────────────────┬───────────────────┐
▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SNumeric │ │ SBox │ │ SColl │ │ SContext │
│ TYPE_CODE=2-6│ │ TYPE_CODE=99 │ │ TYPE_CODE=12 │ │ TYPE_CODE=101│
├──────────────┤ ├──────────────┤ ├──────────────┤ ├──────────────┤
│ toByte (1) │ │ value (1) │ │ size (1) │ │ dataInputs(1)│
│ toShort (2) │ │ propBytes(2) │ │ getOrElse(2) │ │ headers (2) │
│ toInt (3) │ │ bytes (3) │ │ map (3) │ │ preHeader(3) │
│ toLong (4) │ │ id (5) │ │ exists (4) │ │ INPUTS (4) │
│ toBigInt (5) │ │ getReg (7) │ │ forall (5) │ │ OUTPUTS (5) │
│ toBytes (6) │ │ tokens (8) │ │ fold (6) │ │ HEIGHT (6) │
│ ... │ │ ... │ │ ... │ │ ... │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
Method Resolution:
MethodCall(receiver, type_code=99, method_id=1)
│
▼
resolveMethod(99, 1) → SBoxMethods.VALUE
│
▼
method.eval(receiver, args, evaluator)
const MethodsContainer = struct {
type_code: u8,
methods: []const SMethod,
pub fn getMethodById(self: *const MethodsContainer, method_id: u8) ?*const SMethod {
for (self.methods) |*m| {
if (m.method_id == method_id) return m;
}
return null;
}
pub fn getMethodByName(self: *const MethodsContainer, name: []const u8) ?*const SMethod {
for (self.methods) |*m| {
if (std.mem.eql(u8, m.name, name)) return m;
}
return null;
}
};
const SMethod = struct {
obj_type: STypeCompanion,
name: []const u8,
method_id: u8,
tpe: SFunc,
cost_kind: CostKind,
min_version: ?ErgoTreeVersion = null, // v6+ methods
pub fn eval(
self: *const SMethod,
receiver: Any,
args: []const Any,
E: *Evaluator,
) !Any {
// Method dispatch by type_code and method_id
return try evalMethod(
self.obj_type.type_code,
self.method_id,
receiver,
args,
E,
);
}
};
Available Method Containers
| Type | Container | Method Count |
|---|---|---|
| Byte, Short, Int, Long | SNumericMethods | 13 |
| BigInt | SBigIntMethods | 13 |
| UnsignedBigInt (v6+) | SUnsignedBigIntMethods | 13 |
| Boolean | SBooleanMethods | 0 |
| GroupElement | SGroupElementMethods | 4 |
| SigmaProp | SSigmaPropMethods | 2 |
| Box | SBoxMethods | 10 |
| Coll[T] | SCollectionMethods | 20 |
| Option[T] | SOptionMethods | 4 |
| Context | SContextMethods | 12 |
| Header | SHeaderMethods | 16 |
| PreHeader | SPreHeaderMethods | 8 |
| AvlTree | SAvlTreeMethods | 9 |
| Global | SGlobalMethods | 4 |
Numeric Type Methods
All numeric types share common methods34:
const SNumericMethods = struct {
pub const TYPE_CODE = 0; // Varies by actual type
// Conversion methods (v5+)
pub const TO_BYTE = SMethod{
.method_id = 1,
.name = "toByte",
.tpe = SFunc.unary(.{ .type_var = "T" }, .byte),
.cost_kind = .{ .type_based = .{ .primitive = 5, .big_int = 10 } },
};
pub const TO_SHORT = SMethod{ .method_id = 2, .name = "toShort", ... };
pub const TO_INT = SMethod{ .method_id = 3, .name = "toInt", ... };
pub const TO_LONG = SMethod{ .method_id = 4, .name = "toLong", ... };
pub const TO_BIGINT = SMethod{ .method_id = 5, .name = "toBigInt", ... };
// Binary representation (v6+)
pub const TO_BYTES = SMethod{
.method_id = 6,
.name = "toBytes",
.tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 5 },
.min_version = .v3, // v6
};
pub const TO_BITS = SMethod{
.method_id = 7,
.name = "toBits",
.tpe = SFunc.unary(.{ .type_var = "T" }, .{ .coll = &SType.boolean }),
.cost_kind = .{ .fixed = 5 },
.min_version = .v3,
};
// Bitwise operations (v6+)
pub const BITWISE_INVERSE = SMethod{ .method_id = 8, .name = "bitwiseInverse", ... };
pub const BITWISE_OR = SMethod{ .method_id = 9, .name = "bitwiseOr", ... };
pub const BITWISE_AND = SMethod{ .method_id = 10, .name = "bitwiseAnd", ... };
pub const BITWISE_XOR = SMethod{ .method_id = 11, .name = "bitwiseXor", ... };
pub const SHIFT_LEFT = SMethod{ .method_id = 12, .name = "shiftLeft", ... };
pub const SHIFT_RIGHT = SMethod{ .method_id = 13, .name = "shiftRight", ... };
};
Numeric Method Summary
| ID | Method | Signature | v5 | v6 | Description |
|---|---|---|---|---|---|
| 1 | toByte | T => Byte | ✓ | ✓ | Convert (fails on overflow) |
| 2 | toShort | T => Short | ✓ | ✓ | Convert (fails on overflow) |
| 3 | toInt | T => Int | ✓ | ✓ | Convert (fails on overflow) |
| 4 | toLong | T => Long | ✓ | ✓ | Convert (fails on overflow) |
| 5 | toBigInt | T => BigInt | ✓ | ✓ | Convert (always safe) |
| 6 | toBytes | T => Coll[Byte] | - | ✓ | Big-endian bytes |
| 7 | toBits | T => Coll[Bool] | - | ✓ | Bit representation |
| 8 | bitwiseInverse | T => T | - | ✓ | Bitwise NOT |
| 9 | bitwiseOr | (T,T) => T | - | ✓ | Bitwise OR |
| 10 | bitwiseAnd | (T,T) => T | - | ✓ | Bitwise AND |
| 11 | bitwiseXor | (T,T) => T | - | ✓ | Bitwise XOR |
| 12 | shiftLeft | (T,Int) => T | - | ✓ | Left shift |
| 13 | shiftRight | (T,Int) => T | - | ✓ | Arithmetic right shift |
Collection Methods
Collections have the richest method set56:
const SCollectionMethods = struct {
pub const TYPE_CODE = 12;
// Basic access
pub const SIZE = SMethod{
.method_id = 1,
.name = "size",
.tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .int),
.cost_kind = .{ .fixed = 14 },
};
pub const GET_OR_ELSE = SMethod{
.method_id = 2,
.name = "getOrElse",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "T" } },
.int,
.{ .type_var = "T" },
}, .{ .type_var = "T" }),
.cost_kind = .dynamic,
};
// Transformation
pub const MAP = SMethod{
.method_id = 3,
.name = "map",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .{ .type_var = "OV" }),
}, .{ .coll = .{ .type_var = "OV" } }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FILTER = SMethod{
.method_id = 8,
.name = "filter",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .{ .coll = .{ .type_var = "IV" } }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FOLD = SMethod{
.method_id = 6,
.name = "fold",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
.{ .type_var = "OV" },
SFunc.binary(.{ .type_var = "OV" }, .{ .type_var = "IV" }, .{ .type_var = "OV" }),
}, .{ .type_var = "OV" }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
// Predicates
pub const EXISTS = SMethod{
.method_id = 4,
.name = "exists",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .boolean),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
pub const FORALL = SMethod{
.method_id = 5,
.name = "forall",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
SFunc.unary(.{ .type_var = "IV" }, .boolean),
}, .boolean),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
// Combination
pub const APPEND = SMethod{
.method_id = 9,
.name = "append",
.tpe = SFunc.binary(
.{ .coll = .{ .type_var = "IV" } },
.{ .coll = .{ .type_var = "IV" } },
.{ .coll = .{ .type_var = "IV" } },
),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 100 } },
};
pub const SLICE = SMethod{
.method_id = 10,
.name = "slice",
.tpe = SFunc.new(&[_]SType{
.{ .coll = .{ .type_var = "IV" } },
.int,
.int,
}, .{ .coll = .{ .type_var = "IV" } }),
.cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 2, .chunk_size = 100 } },
};
pub const ZIP = SMethod{
.method_id = 29,
.name = "zip",
.cost_kind = .{ .per_item = .{ .base = 10, .per_chunk = 1, .chunk_size = 10 } },
};
// Index operations
pub const INDICES = SMethod{
.method_id = 14,
.name = "indices",
.tpe = SFunc.unary(.{ .coll = .{ .type_var = "T" } }, .{ .coll = &SType.int }),
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 2, .chunk_size = 128 } },
};
pub const INDEX_OF = SMethod{
.method_id = 26,
.name = "indexOf",
.cost_kind = .{ .per_item = .{ .base = 20, .per_chunk = 1, .chunk_size = 10 } },
};
};
Collection Method Summary
| ID | Method | v5 | v6 | Description |
|---|---|---|---|---|
| 1 | size | ✓ | ✓ | Number of elements |
| 2 | getOrElse | ✓ | ✓ | Element with default |
| 3 | map | ✓ | ✓ | Transform elements |
| 4 | exists | ✓ | ✓ | Any match predicate |
| 5 | forall | ✓ | ✓ | All match predicate |
| 6 | fold | ✓ | ✓ | Reduce to single value |
| 7 | apply | ✓ | ✓ | Element at index (panics if OOB) |
| 8 | filter | ✓ | ✓ | Keep matching elements |
| 9 | append | ✓ | ✓ | Concatenate |
| 10 | slice | ✓ | ✓ | Extract range |
| 14 | indices | ✓ | ✓ | Range 0..size-1 |
| 15 | flatMap | ✓ | ✓ | Map and flatten |
| 19 | patch | ✓ | ✓ | Replace range |
| 20 | updated | ✓ | ✓ | Replace at index |
| 21 | updateMany | ✓ | ✓ | Batch update |
| 26 | indexOf | ✓ | ✓ | Find element index |
| 29 | zip | ✓ | ✓ | Pair with other collection |
| 30 | reverse | - | ✓ | Reverse order |
| 31 | startsWith | - | ✓ | Prefix match |
| 32 | endsWith | - | ✓ | Suffix match |
| 33 | get | - | ✓ | Safe element access (returns Option) |
Box Methods
const SBoxMethods = struct {
pub const TYPE_CODE = 99;
pub const VALUE = SMethod{
.method_id = 1,
.name = "value",
.tpe = SFunc.unary(.box, .long),
.cost_kind = .{ .fixed = 1 },
};
pub const PROPOSITION_BYTES = SMethod{
.method_id = 2,
.name = "propositionBytes",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const BYTES = SMethod{
.method_id = 3,
.name = "bytes",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const ID = SMethod{
.method_id = 5,
.name = "id",
.tpe = SFunc.unary(.box, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 10 },
};
pub const CREATION_INFO = SMethod{
.method_id = 6,
.name = "creationInfo",
.tpe = SFunc.unary(.box, .{ .tuple = &[_]SType{ .int, .{ .coll = &SType.byte } } }),
.cost_kind = .{ .fixed = 10 },
};
pub const TOKENS = SMethod{
.method_id = 8,
.name = "tokens",
.tpe = SFunc.unary(.box, .{
.coll = &.{ .tuple = &[_]SType{ .{ .coll = &SType.byte }, .long } },
}),
.cost_kind = .{ .fixed = 15 },
};
// Register access: R0-R9
pub const R0 = SMethod{ .method_id = 10, .name = "R0", ... };
pub const R1 = SMethod{ .method_id = 11, .name = "R1", ... };
pub const R2 = SMethod{ .method_id = 12, .name = "R2", ... };
pub const R3 = SMethod{ .method_id = 13, .name = "R3", ... };
pub const R4 = SMethod{ .method_id = 14, .name = "R4", ... };
pub const R5 = SMethod{ .method_id = 15, .name = "R5", ... };
pub const R6 = SMethod{ .method_id = 16, .name = "R6", ... };
pub const R7 = SMethod{ .method_id = 17, .name = "R7", ... };
pub const R8 = SMethod{ .method_id = 18, .name = "R8", ... };
pub const R9 = SMethod{ .method_id = 19, .name = "R9", ... };
};
Context Methods
Access transaction context910:
const SContextMethods = struct {
pub const TYPE_CODE = 101;
pub const DATA_INPUTS = SMethod{
.method_id = 1,
.name = "dataInputs",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 15 },
};
pub const HEADERS = SMethod{
.method_id = 2,
.name = "headers",
.tpe = SFunc.unary(.context, .{ .coll = &SType.header }),
.cost_kind = .{ .fixed = 15 },
};
pub const PRE_HEADER = SMethod{
.method_id = 3,
.name = "preHeader",
.tpe = SFunc.unary(.context, .pre_header),
.cost_kind = .{ .fixed = 10 },
};
pub const INPUTS = SMethod{
.method_id = 4,
.name = "INPUTS",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 10 },
};
pub const OUTPUTS = SMethod{
.method_id = 5,
.name = "OUTPUTS",
.tpe = SFunc.unary(.context, .{ .coll = &SType.box }),
.cost_kind = .{ .fixed = 10 },
};
pub const HEIGHT = SMethod{
.method_id = 6,
.name = "HEIGHT",
.tpe = SFunc.unary(.context, .int),
.cost_kind = .{ .fixed = 26 },
};
pub const SELF = SMethod{
.method_id = 7,
.name = "SELF",
.tpe = SFunc.unary(.context, .box),
.cost_kind = .{ .fixed = 10 },
};
pub const GET_VAR = SMethod{
.method_id = 8,
.name = "getVar",
.tpe = SFunc.new(&[_]SType{ .context, .byte }, .{ .option = .{ .type_var = "T" } }),
.cost_kind = .dynamic,
};
};
GroupElement Methods
Elliptic curve operations1112:
const SGroupElementMethods = struct {
pub const TYPE_CODE = 7;
pub const GET_ENCODED = SMethod{
.method_id = 2,
.name = "getEncoded",
.tpe = SFunc.unary(.group_element, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 250 },
};
pub const EXP = SMethod{
.method_id = 3,
.name = "exp",
.tpe = SFunc.binary(.group_element, .big_int, .group_element),
.cost_kind = .{ .fixed = 900 },
};
pub const MULTIPLY = SMethod{
.method_id = 4,
.name = "multiply",
.tpe = SFunc.binary(.group_element, .group_element, .group_element),
.cost_kind = .{ .fixed = 40 },
};
pub const NEGATE = SMethod{
.method_id = 5,
.name = "negate",
.tpe = SFunc.unary(.group_element, .group_element),
.cost_kind = .{ .fixed = 45 },
};
};
SigmaProp Methods
Cryptographic proposition operations13:
const SSigmaPropMethods = struct {
pub const TYPE_CODE = 8;
pub const PROP_BYTES = SMethod{
.method_id = 1,
.name = "propBytes",
.tpe = SFunc.unary(.sigma_prop, .{ .coll = &SType.byte }),
.cost_kind = .{ .fixed = 35 },
};
pub const IS_PROVEN = SMethod{
.method_id = 2,
.name = "isProven",
.tpe = SFunc.unary(.sigma_prop, .boolean),
.cost_kind = .{ .fixed = 10 },
.deprecated = true, // Use in scripts only
};
};
Method Resolution
Method lookup by type code and method ID:
pub fn resolveMethod(type_code: u8, method_id: u8) ?*const SMethod {
const container = switch (type_code) {
2, 3, 4, 5 => &SNumericMethods, // Byte, Short, Int, Long
6 => &SBigIntMethods,
7 => &SGroupElementMethods,
8 => &SSigmaPropMethods,
9 => &SUnsignedBigIntMethods, // v6+
12...23 => &SCollectionMethods, // Coll[T]
36...47 => &SOptionMethods, // Option[T]
99 => &SBoxMethods,
100 => &SAvlTreeMethods,
101 => &SContextMethods,
104 => &SHeaderMethods,
105 => &SPreHeaderMethods,
106 => &SGlobalMethods,
else => return null,
};
return container.getMethodById(method_id);
}
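A hypothetical lookup, assuming the containers sketched above are fully populated (`SBoxMethods` with `VALUE` as method 1) and `std` is imported:

```zig
test "resolve Box.value by type code and method id" {
    // Box has type code 99; method 1 is `value`
    const m = resolveMethod(99, 1) orelse return error.MethodNotFound;
    try std.testing.expectEqualStrings("value", m.name);
    try std.testing.expectEqual(@as(u8, 1), m.method_id);
}
```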
Method Call Evaluation
const MethodCall = struct {
receiver_type: SType,
method: *const SMethod,
receiver: *const Value,
args: []const *Value,
pub fn eval(self: *const MethodCall, env: *const DataEnv, E: *Evaluator) !Any {
// Evaluate receiver
const recv = try self.receiver.eval(env, E);
// Evaluate arguments
var arg_values = try E.allocator.alloc(Any, self.args.len);
for (self.args, 0..) |arg, i| {
arg_values[i] = try arg.eval(env, E);
}
// Add cost
E.addCost(self.method.cost_kind, self.method.op_code);
// Dispatch to method implementation
return try self.method.eval(recv, arg_values, E);
}
};
Summary
This chapter covered the method system that extends ErgoTree types with rich APIs:
- `MethodsContainer` organizes methods per type, with each method having a unique ID (1-255) within its container
- Method resolution uses the receiver's type code and the method ID to locate the implementation, avoiding opcode space consumption
- Numeric methods provide type conversions (`toByte`, `toInt`, `toLong`, `toBigInt`) shared across all numeric types, with v6 adding bitwise operations and byte representation
- Collection methods form the richest API with transformation (`map`, `filter`, `fold`), predicates (`exists`, `forall`), and combination operations (`append`, `slice`, `zip`)
- Box methods access UTXO properties: `value` (nanoERGs), `tokens`, `propositionBytes`, and registers R0-R9
- Context methods provide access to transaction data: `INPUTS`, `OUTPUTS`, `HEIGHT`, `SELF`, `dataInputs`, `headers`, and context variables via `getVar`
- Cryptographic methods on `GroupElement` support elliptic curve operations (`exp`, `multiply`, `negate`), and `SigmaProp` provides `propBytes` for serialization
Next: Chapter 7: Serialization Framework
Scala: methods.scala
Rust: smethod.rs:36-99 (SMethod, SMethodDesc)
Scala: methods.scala:232-500
Rust: snumeric.rs
Scala: methods.scala:805-1260
Rust: scoll.rs:22-266 (METHOD_DESC, method IDs)
Scala: methods.scala (SBoxMethods)
Rust: sbox.rs:29-92 (VALUE_METHOD, GET_REG_METHOD, TOKENS_METHOD)
Scala: methods.scala (SContextMethods)
Rust: scontext.rs
Scala: methods.scala (SGroupElementMethods)
Rust: sgroup_elem.rs
Scala: methods.scala (SSigmaPropMethods)
Chapter 7: Serialization Framework
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Binary encoding concepts (bits, bytes, big-endian vs little-endian)
- Familiarity with variable-length encoding techniques and their space-efficiency trade-offs
- Prior chapters: Chapter 2 for type codes, Chapter 5 for opcodes
Learning Objectives
By the end of this chapter, you will be able to:
- Explain VLQ (Variable-Length Quantity) encoding and how it achieves compact integer representation
- Describe ZigZag encoding and why it improves VLQ efficiency for signed integers
- Implement type serialization using the type code embedding scheme
- Use `SigmaByteReader` and `SigmaByteWriter` for type-aware serialization
Serialization Architecture
Blockchain storage is expensive—every byte of an ErgoTree increases transaction fees and network bandwidth. The serialization framework therefore prioritizes compactness while maintaining determinism (identical inputs must produce identical outputs across all implementations). The system uses a layered design where each layer handles a specific concern12:
┌─────────────────────────────────────────────────┐
│ Application Layer │
│ (ErgoTree, Box, Transaction) │
├─────────────────────────────────────────────────┤
│ Value Serializers │
│ (ConstantSerializer, MethodCall) │
├─────────────────────────────────────────────────┤
│ SigmaByteReader/Writer │
│ (Type-aware, constant store) │
├─────────────────────────────────────────────────┤
│ VLQ Encoding Layer │
│ (Variable-length integers) │
├─────────────────────────────────────────────────┤
│ Byte Buffer I/O │
│ (Raw read/write operations) │
└─────────────────────────────────────────────────┘
Base Serializer Interface
const SigmaSerializer = struct {
pub const MAX_PROPOSITION_SIZE: usize = 4096;
pub const MAX_TREE_DEPTH: u32 = 110;
pub fn toBytes(comptime T: type, obj: T, allocator: Allocator) ![]u8 {
var list = std.ArrayList(u8).init(allocator);
var writer = SigmaByteWriter.init(&list);
try T.serialize(obj, &writer);
return list.toOwnedSlice();
}
pub fn fromBytes(comptime T: type, bytes: []const u8) !T {
var reader = SigmaByteReader.init(bytes);
return try T.deserialize(&reader);
}
};
VLQ Encoding
Variable-Length Quantity (VLQ) represents integers compactly34:
Value Range Bytes Format
─────────────────────────────────────────────────
0 - 127 1 0xxxxxxx
128 - 16,383 2 1xxxxxxx 0xxxxxxx
16,384 - 2,097,151 3 1xxxxxxx 1xxxxxxx 0xxxxxxx
2,097,152 - 268,435,455 4 1xxxxxxx 1xxxxxxx 1xxxxxxx 0xxxxxxx
... ... ...
Each byte uses 7 bits for data; MSB is continuation flag:
- `0` = final byte
- `1` = more bytes follow
VLQ Implementation
const VlqEncoder = struct {
/// Write unsigned integer using VLQ encoding
pub fn putUInt(writer: anytype, value: u64) !void {
var v = value;
while (v >= 0x80) { // more than 7 bits remain
try writer.writeByte(@intCast((v & 0x7F) | 0x80));
v >>= 7;
}
try writer.writeByte(@intCast(v & 0x7F));
}
/// Read unsigned integer using VLQ decoding
/// Maximum 10 bytes for u64 (ceil(64/7) = 10)
pub fn getUInt(reader: anytype) !u64 {
var result: u64 = 0;
var shift: u7 = 0; // u7: shift reaches 70 before the loop exits, which would overflow u6
while (shift < 70) : (shift += 7) { // at most ceil(64/7) = 10 bytes
const b = try reader.readByte();
result |= @as(u64, b & 0x7F) << @intCast(shift);
if ((b & 0x80) == 0) return result;
}
return error.VlqTooLong; // 10 bytes read without a terminating byte
}
};
// NOTE: In production, VLQ decoding should use compile-time assertions to
// verify max byte counts. See ZIGMA_STYLE.md for bounded iteration patterns.
VLQ Size by Value Range
Unsigned Value Bytes
─────────────────────────────────
0 - 127 1
128 - 16,383 2
16,384 - 2,097,151 3
2,097,152 - 268M 4
268M - 34B 5
34B - 4T 6
4T - 562T 7
562T - 72P 8
72P - 9E 9
> 9E 10
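Tracing `putUInt` on the value 300 makes the layout concrete: the low 7 bits go out first with the continuation flag set, then the remaining bits in a final byte with the flag clear. This is illustrative test code, assuming `std` is imported and `VlqEncoder` above is in scope:

```zig
// 300 = 0b0000_0010_0010_1100
//   low 7 bits  = 010_1100 (0x2C) -> written first, MSB set:   0xAC
//   next 7 bits = 000_0010 (0x02) -> written last,  MSB clear: 0x02
test "VLQ encoding of 300" {
    var buf = std.ArrayList(u8).init(std.testing.allocator);
    defer buf.deinit();
    try VlqEncoder.putUInt(buf.writer(), 300);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0xAC, 0x02 }, buf.items);
}
```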
ZigZag Encoding
VLQ assumes non-negative values—it encodes the magnitude directly. For signed integers like -1, the two's complement representation has all high bits set, resulting in maximum VLQ length. ZigZag encoding solves this by mapping signed values to unsigned in a way that preserves magnitude: small positive and negative numbers both produce small unsigned values56:
Signed ZigZag Encoded
────────────────────────
0 0
-1 1
1 2
-2 3
2 4
-3 5
n          2n        (n >= 0)
-n         2n-1      (n > 0)
ZigZag Implementation
const ZigZag = struct {
/// Encode signed 32-bit to unsigned
pub fn encode32(n: i32) u64 {
// Work in unsigned 32-bit: sign-extending to i64 before bitcasting would
// corrupt the result whenever the shift carries into the sign bit.
const u: u32 = @bitCast(n);
const sign: u32 = @bitCast(n >> 31); // arithmetic right shift replicates sign bit
return (u << 1) ^ sign;
}
/// Decode unsigned back to signed 32-bit
pub fn decode32(n: u64) i32 {
const v: u32 = @intCast(n);
return @as(i32, @intCast(v >> 1)) ^ -@as(i32, @intCast(v & 1));
}
/// Encode signed 64-bit to unsigned
pub fn encode64(n: i64) u64 {
return @bitCast((n << 1) ^ (n >> 63));
}
/// Decode unsigned back to signed 64-bit
pub fn decode64(n: u64) i64 {
return @as(i64, @intCast(n >> 1)) ^ -@as(i64, @intCast(n & 1));
}
};
ZigZag ensures small-magnitude signed values use few bytes:
Value ZigZag VLQ Bytes
─────────────────────────────
0 0 1
-1 1 1
1 2 1
-64 127 1
64 128 2
-65 129 2
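The mapping and its round-trip can be verified mechanically. A Python sketch (the 64-bit mask is an assumption to emulate fixed-width wrap-around, since Python integers are arbitrary-precision):

```python
MASK64 = (1 << 64) - 1

def zigzag_encode(n: int) -> int:
    # (n << 1) ^ (n >> 63): the arithmetic shift replicates the sign bit
    return ((n << 1) ^ (n >> 63)) & MASK64

def zigzag_decode(u: int) -> int:
    return (u >> 1) ^ -(u & 1)

assert [zigzag_encode(n) for n in (0, -1, 1, -2, 2)] == [0, 1, 2, 3, 4]
assert zigzag_encode(-64) == 127 and zigzag_encode(64) == 128
assert all(zigzag_decode(zigzag_encode(n)) == n for n in range(-1000, 1000))
```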
SigmaByteWriter
The writer handles type-aware serialization with cost tracking[7][8]:
const SigmaByteWriter = struct {
buffer: *std.ArrayList(u8),
constant_store: ?*ConstantStore,
tree_version: ErgoTreeVersion,
pub fn init(buffer: *std.ArrayList(u8)) SigmaByteWriter {
return .{
.buffer = buffer,
.constant_store = null,
.tree_version = .v0,
};
}
/// Write single byte
pub fn putByte(self: *SigmaByteWriter, b: u8) !void {
try self.buffer.append(b);
}
/// Write byte slice
pub fn putBytes(self: *SigmaByteWriter, bytes: []const u8) !void {
try self.buffer.appendSlice(bytes);
}
/// Write unsigned integer (VLQ encoded)
pub fn putUInt(self: *SigmaByteWriter, value: u64) !void {
try VlqEncoder.putUInt(self.buffer.writer(), value);
}
/// Write signed short (ZigZag + VLQ)
pub fn putShort(self: *SigmaByteWriter, value: i16) !void {
try self.putUInt(ZigZag.encode32(value));
}
/// Write signed int (ZigZag + VLQ)
pub fn putInt(self: *SigmaByteWriter, value: i32) !void {
try self.putUInt(ZigZag.encode32(value));
}
/// Write signed long (ZigZag + VLQ)
pub fn putLong(self: *SigmaByteWriter, value: i64) !void {
try self.putUInt(ZigZag.encode64(value));
}
/// Write type descriptor
pub fn putType(self: *SigmaByteWriter, tpe: SType) !void {
try TypeSerializer.serialize(tpe, self);
}
/// Write value with optional constant extraction
pub fn putValue(self: *SigmaByteWriter, value: *const Value) !void {
if (self.constant_store) |store| {
if (value.isConstant()) {
const idx = try store.put(value.asConstant());
try self.putByte(OpCode.ConstantPlaceholder.value);
try self.putUInt(idx);
return;
}
}
try ValueSerializer.serialize(value, self);
}
};
SigmaByteReader
The reader provides type-aware deserialization[9][10]:
const SigmaByteReader = struct {
data: []const u8,
pos: usize,
constant_store: ConstantStore,
substitute_placeholders: bool,
val_def_type_store: ValDefTypeStore,
tree_version: ErgoTreeVersion,
pub fn init(data: []const u8) SigmaByteReader {
return .{
.data = data,
.pos = 0,
.constant_store = ConstantStore.empty(),
.substitute_placeholders = false,
.val_def_type_store = ValDefTypeStore.init(),
.tree_version = .v0,
};
}
pub fn initWithStore(data: []const u8, store: ConstantStore) SigmaByteReader {
var reader = init(data);
reader.constant_store = store;
reader.substitute_placeholders = true;
return reader;
}
/// Read single byte
pub fn getByte(self: *SigmaByteReader) !u8 {
if (self.pos >= self.data.len) return error.EndOfStream;
const b = self.data[self.pos];
self.pos += 1;
return b;
}
/// Read byte slice
pub fn getBytes(self: *SigmaByteReader, n: usize) ![]const u8 {
if (self.pos + n > self.data.len) return error.EndOfStream;
const slice = self.data[self.pos..][0..n];
self.pos += n;
return slice;
}
/// Read unsigned integer (VLQ)
pub fn getUInt(self: *SigmaByteReader) !u64 {
return VlqEncoder.getUInt(self);
}
/// Read signed short (VLQ + ZigZag)
pub fn getShort(self: *SigmaByteReader) !i16 {
const v = try self.getUInt();
const decoded = ZigZag.decode32(v);
// Reject values outside i16 range rather than panicking on @intCast
if (decoded < std.math.minInt(i16) or decoded > std.math.maxInt(i16)) return error.OutOfRange;
return @intCast(decoded);
}
/// Read signed int (VLQ + ZigZag)
pub fn getInt(self: *SigmaByteReader) !i32 {
return ZigZag.decode32(try self.getUInt());
}
/// Read signed long (VLQ + ZigZag)
pub fn getLong(self: *SigmaByteReader) !i64 {
return ZigZag.decode64(try self.getUInt());
}
/// Read type descriptor
pub fn getType(self: *SigmaByteReader) !SType {
return TypeSerializer.deserialize(self);
}
/// Read value expression
pub fn getValue(self: *SigmaByteReader) !*Value {
return ValueSerializer.deserialize(self);
}
/// Remaining bytes available
pub fn remaining(self: *const SigmaByteReader) usize {
return self.data.len - self.pos;
}
// Reader interface for VLQ
pub fn readByte(self: *SigmaByteReader) !u8 {
return self.getByte();
}
};
Constant Store
Manages constants during ErgoTree serialization[11]:
const ConstantStore = struct {
constants: []const Constant,
extracted: std.ArrayList(Constant),
pub fn empty() ConstantStore {
return .{
.constants = &.{},
.extracted = undefined, // never appended to; use init() when extracting
};
}
pub fn init(constants: []const Constant, allocator: Allocator) ConstantStore {
return .{
.constants = constants,
.extracted = std.ArrayList(Constant).init(allocator),
};
}
/// Get constant by index
pub fn get(self: *const ConstantStore, index: usize) !Constant {
if (index >= self.constants.len) return error.IndexOutOfBounds;
return self.constants[index];
}
/// Store constant during extraction, return index
pub fn put(self: *ConstantStore, c: Constant) !u32 {
const idx = self.extracted.items.len;
try self.extracted.append(c);
return @intCast(idx);
}
};
Type Serialization
Types use a compact encoding scheme based on type codes[12][13]:
Type Code Space
───────────────────────────────────────────────────────────
1-11 Primitive embeddable types
12-23 Coll[primitive] (12 + primCode)
24-35 Coll[Coll[primitive]] (24 + primCode)
36-47 Option[primitive] (36 + primCode)
48-59 Option[Coll[primitive]] (48 + primCode)
60-71 (primitive, T2) pairs (60 + primCode)
72-83 (T1, primitive) pairs (72 + primCode)
84-95 (primitive, primitive) (84 + primCode) symmetric
96 Tuple (generic)
97-106 Object types (Any, Unit, Box, ...)
112 SFunc (v6+)
Type Code Constants
const TypeCode = struct {
value: u8,
// Primitive types (embeddable)
pub const BOOLEAN: u8 = 1;
pub const BYTE: u8 = 2;
pub const SHORT: u8 = 3;
pub const INT: u8 = 4;
pub const LONG: u8 = 5;
pub const BIGINT: u8 = 6;
pub const GROUP_ELEMENT: u8 = 7;
pub const SIGMA_PROP: u8 = 8;
pub const UNSIGNED_BIGINT: u8 = 9;
// Type constructor bases
pub const MAX_PRIM: u8 = 11;
pub const PRIM_RANGE: u8 = 12; // MAX_PRIM + 1
pub const COLL: u8 = 12;
pub const NESTED_COLL: u8 = 24;
pub const OPTION: u8 = 36;
pub const OPTION_COLL: u8 = 48;
pub const TUPLE_PAIR1: u8 = 60;
pub const TUPLE_PAIR2: u8 = 72;
pub const TUPLE_SYMMETRIC: u8 = 84;
pub const TUPLE: u8 = 96;
// Object types
pub const ANY: u8 = 97;
pub const UNIT: u8 = 98;
pub const BOX: u8 = 99;
pub const AVL_TREE: u8 = 100;
pub const CONTEXT: u8 = 101;
pub const STRING: u8 = 102;
pub const TYPE_VAR: u8 = 103;
pub const HEADER: u8 = 104;
pub const PRE_HEADER: u8 = 105;
pub const GLOBAL: u8 = 106;
pub const FUNC: u8 = 112;
/// Embed primitive type into container code
pub fn embed(container_base: u8, prim_code: u8) u8 {
return container_base + prim_code;
}
/// Extract container and primitive from combined code
pub fn unpack(code: u8) struct { container: ?u8, primitive: ?u8 } {
if (code >= TUPLE) return .{ .container = null, .primitive = null };
const container_id = (code / PRIM_RANGE) * PRIM_RANGE;
const type_id = code % PRIM_RANGE;
return .{
.container = if (container_id == 0) null else container_id,
.primitive = if (type_id == 0) null else type_id,
};
}
};
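The decomposition rule in unpack can be checked against the code-space table; a minimal Python re-derivation (constants copied from above):

```python
PRIM_RANGE = 12   # MAX_PRIM + 1
TUPLE = 96

def unpack(code: int):
    # Mirrors TypeCode.unpack: split a combined code into container base
    # and embedded primitive code
    if code >= TUPLE:
        return (None, None)
    container = (code // PRIM_RANGE) * PRIM_RANGE
    prim = code % PRIM_RANGE
    return (container or None, prim or None)

assert unpack(4) == (None, 4)    # bare SInt
assert unpack(16) == (12, 4)     # Coll[Int]    = COLL base + INT
assert unpack(41) == (36, 5)     # Option[Long] = OPTION base + LONG
assert unpack(12) == (12, None)  # Coll whose element type follows separately
```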
Type Serializer
const TypeSerializer = struct {
pub fn serialize(tpe: SType, w: *SigmaByteWriter) !void {
switch (tpe) {
// Primitives - single byte
.boolean => try w.putByte(TypeCode.BOOLEAN),
.byte => try w.putByte(TypeCode.BYTE),
.short => try w.putByte(TypeCode.SHORT),
.int => try w.putByte(TypeCode.INT),
.long => try w.putByte(TypeCode.LONG),
.big_int => try w.putByte(TypeCode.BIGINT),
.group_element => try w.putByte(TypeCode.GROUP_ELEMENT),
.sigma_prop => try w.putByte(TypeCode.SIGMA_PROP),
.unsigned_big_int => try w.putByte(TypeCode.UNSIGNED_BIGINT),
// Object types
.box => try w.putByte(TypeCode.BOX),
.avl_tree => try w.putByte(TypeCode.AVL_TREE),
.context => try w.putByte(TypeCode.CONTEXT),
.header => try w.putByte(TypeCode.HEADER),
.pre_header => try w.putByte(TypeCode.PRE_HEADER),
.global => try w.putByte(TypeCode.GLOBAL),
.unit => try w.putByte(TypeCode.UNIT),
.any => try w.putByte(TypeCode.ANY),
// Collections
.coll => |elem| {
if (elem.isEmbeddable()) {
// Single byte: Coll[primitive]
try w.putByte(TypeCode.embed(TypeCode.COLL, elem.typeCode()));
} else if (elem.* == .coll) {
const inner = elem.coll;
if (inner.isEmbeddable()) {
// Single byte: Coll[Coll[primitive]]
try w.putByte(TypeCode.embed(TypeCode.NESTED_COLL, inner.typeCode()));
} else {
try w.putByte(TypeCode.COLL);
try serialize(elem.*, w);
}
} else {
try w.putByte(TypeCode.COLL);
try serialize(elem.*, w);
}
},
// Options
.option => |elem| {
if (elem.isEmbeddable()) {
try w.putByte(TypeCode.embed(TypeCode.OPTION, elem.typeCode()));
} else if (elem.* == .coll) {
const inner = elem.coll;
if (inner.isEmbeddable()) {
try w.putByte(TypeCode.embed(TypeCode.OPTION_COLL, inner.typeCode()));
} else {
try w.putByte(TypeCode.OPTION);
try serialize(elem.*, w);
}
} else {
try w.putByte(TypeCode.OPTION);
try serialize(elem.*, w);
}
},
// Tuples (pairs)
.tuple => |items| {
if (items.len == 2) {
try serializePair(items[0], items[1], w);
} else {
try w.putByte(TypeCode.TUPLE);
try w.putByte(@intCast(items.len));
for (items) |item| {
try serialize(item, w);
}
}
},
// Functions (v6+)
.func => |f| {
try w.putByte(TypeCode.FUNC);
try w.putByte(@intCast(f.t_dom.len));
for (f.t_dom) |arg| try serialize(arg, w);
try serialize(f.t_range.*, w);
try w.putByte(@intCast(f.tpe_params.len));
for (f.tpe_params) |p| {
try w.putByte(TypeCode.TYPE_VAR);
try w.putBytes(p.name);
}
},
else => return error.UnsupportedType,
}
}
fn serializePair(t1: SType, t2: SType, w: *SigmaByteWriter) !void {
const e1 = t1.isEmbeddable();
const e2 = t2.isEmbeddable();
if (e1 and e2 and std.meta.eql(t1, t2)) {
// Symmetric pair: (Int, Int)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_SYMMETRIC, t1.typeCode()));
} else if (e1) {
// First is primitive: (Int, T)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR1, t1.typeCode()));
try serialize(t2, w);
} else if (e2) {
// Second is primitive: (T, Int)
try w.putByte(TypeCode.embed(TypeCode.TUPLE_PAIR2, t2.typeCode()));
try serialize(t1, w);
} else {
// Both non-primitive
try w.putByte(TypeCode.TUPLE_PAIR1);
try serialize(t1, w);
try serialize(t2, w);
}
}
pub fn deserialize(r: *SigmaByteReader) !SType {
const c = try r.getByte();
return parseWithTag(r, c);
}
fn parseWithTag(r: *SigmaByteReader, c: u8) !SType {
if (c < TypeCode.TUPLE) {
const unpacked = TypeCode.unpack(c);
const elem_type = if (unpacked.primitive) |p|
try getEmbeddableType(p, r.tree_version)
else
try deserialize(r);
if (unpacked.container) |container| {
// NOTE: the &elem_type pointers below are illustrative; a real
// implementation must give child types stable lifetimes (e.g. arena-allocate)
return switch (container) {
TypeCode.COLL => .{ .coll = &elem_type },
TypeCode.NESTED_COLL => .{ .coll = &SType{ .coll = &elem_type } },
TypeCode.OPTION => .{ .option = &elem_type },
TypeCode.OPTION_COLL => .{ .option = &SType{ .coll = &elem_type } },
TypeCode.TUPLE_PAIR1 => blk: {
const t2 = try deserialize(r);
break :blk .{ .tuple = &[_]SType{ elem_type, t2 } };
},
TypeCode.TUPLE_PAIR2 => blk: {
const t1 = try deserialize(r);
break :blk .{ .tuple = &[_]SType{ t1, elem_type } };
},
TypeCode.TUPLE_SYMMETRIC => .{ .tuple = &[_]SType{ elem_type, elem_type } },
else => return error.InvalidTypeCode,
};
}
return elem_type;
}
return switch (c) {
TypeCode.TUPLE => blk: {
const len = try r.getByte();
var items: [8]SType = undefined; // illustrative fixed buffer
if (len > items.len) return error.TupleTooLong;
for (0..len) |i| items[i] = try deserialize(r);
break :blk .{ .tuple = items[0..len] };
},
TypeCode.ANY => .any,
TypeCode.UNIT => .unit,
TypeCode.BOX => .box,
TypeCode.AVL_TREE => .avl_tree,
TypeCode.CONTEXT => .context,
TypeCode.HEADER => .header,
TypeCode.PRE_HEADER => .pre_header,
TypeCode.GLOBAL => .global,
TypeCode.FUNC => blk: {
if (r.tree_version.value < 3) return error.UnsupportedVersion;
const dom_len = try r.getByte();
var t_dom: [255]SType = undefined;
for (0..dom_len) |i| t_dom[i] = try deserialize(r);
const t_range = try deserialize(r);
// ... parse tpe_params
break :blk .{ .func = undefined }; // Simplified
},
else => error.InvalidTypeCode,
};
}
fn getEmbeddableType(code: u8, version: ErgoTreeVersion) !SType {
return switch (code) {
TypeCode.BOOLEAN => .boolean,
TypeCode.BYTE => .byte,
TypeCode.SHORT => .short,
TypeCode.INT => .int,
TypeCode.LONG => .long,
TypeCode.BIGINT => .big_int,
TypeCode.GROUP_ELEMENT => .group_element,
TypeCode.SIGMA_PROP => .sigma_prop,
TypeCode.UNSIGNED_BIGINT => blk: {
if (version.value < 3) return error.UnsupportedVersion;
break :blk .unsigned_big_int;
},
else => error.InvalidTypeCode,
};
}
};
Encoding Examples
Example: Encode 300 as VLQ
300 = 0x12C = 0b100101100
Step 1: Take low 7 bits, set continuation: 0x2C | 0x80 = 0xAC
Step 2: Shift right 7: 300 >> 7 = 2
Step 3: Take low 7 bits, no continuation: 0x02
Result: [0xAC, 0x02]
Example: Encode -5 as ZigZag + VLQ
ZigZag(-5) = (-5 << 1) ^ (-5 >> 31)
= -10 ^ -1
= 9
VLQ(9) = [0x09] (fits in 7 bits)
Example: Serialize Coll[Int]
Coll[Int] → single byte
→ TypeCode.COLL + TypeCode.INT
→ 12 + 4 = 16 = 0x10
Example: Serialize (Int, Long)
(Int, Long) → TUPLE_PAIR1 + INT, then Long
→ 60 + 4 = 64, then 5
→ [0x40, 0x05]
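These four examples can be re-derived end to end; a self-contained Python check (function names are illustrative):

```python
def vlq(value: int) -> bytes:
    # VLQ: 7 data bits per byte, high bit is the continuation flag
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)

def zigzag32(n: int) -> int:
    # ZigZag for 32-bit signed values, masked to emulate fixed width
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF

assert vlq(300) == bytes([0xAC, 0x02])      # Example 1
assert vlq(zigzag32(-5)) == bytes([0x09])   # Example 2
assert 12 + 4 == 0x10                       # Example 3: Coll[Int]
assert (60 + 4, 5) == (0x40, 0x05)          # Example 4: (Int, Long)
```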
Summary
This chapter covered the serialization framework that enables compact, deterministic encoding of ErgoTree structures:
- VLQ (Variable-Length Quantity) encoding represents integers using 7 data bits per byte with a continuation flag, achieving compact representation where small values use fewer bytes
- ZigZag encoding transforms signed integers to unsigned before VLQ encoding, ensuring small-magnitude values (positive or negative) remain compact
- Type code embedding packs common type patterns (like Coll[Int] or Option[Long]) into single bytes by combining container and primitive codes
- SigmaByteWriter provides type-aware serialization with optional constant extraction for segregated constant trees
- SigmaByteReader manages deserialization state including constant stores for placeholder resolution and version tracking
- The type code space (0-112) is partitioned to enable single-byte encoding for primitives, nested collections, options, and pairs
Next: Chapter 8: Value Serializers
Scala: SigmaSerializer.scala:24-60
Rust: serializable.rs
Scala: VLQByteBufferWriter.scala
Rust: vlq_encode.rs:94-112
Scala: (via scorex-util ZigZag implementation)
Rust: zig_zag_encode.rs:12-40
Scala: SigmaByteWriter.scala
Scala: SigmaByteReader.scala
Rust: constant_store.rs
Scala: TypeSerializer.scala
Rust: types.rs:18-160
Chapter 8: Value Serializers
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 7 for VLQ encoding, type serialization, and SigmaByteReader/SigmaByteWriter
- Chapter 4 for the Value hierarchy and expression node types
- Chapter 5 for the opcode space and operation categories
Learning Objectives
By the end of this chapter, you will be able to:
- Explain opcode-based serialization dispatch and how it enables extensibility
- Implement value serializers following common patterns (binary, unary, nullary, collection)
- Describe constant extraction and placeholder substitution for segregated constant trees
- Handle type inference during deserialization using ValDefTypeStore
Serialization Architecture
Chapter 7 covered the low-level encoding primitives (VLQ, ZigZag, type codes). This chapter builds on that foundation to show how entire expression trees are serialized. The key insight is that each expression's opcode determines its serialization format, enabling a registry-based dispatch pattern that scales to hundreds of operation types[1][2].
Expression Serialization Flow
─────────────────────────────────────────────────────────
┌─────────────────┐
│ Expression │
└────────┬────────┘
│
┌────────────┴────────────┐
│ Is Constant? │
└────────────┬────────────┘
┌─────┴─────┐
│ Yes │ No
▼ ▼
┌───────────────┐ ┌───────────────┐
│ Extract to │ │ Get OpCode │
│ Store or │ │ Write OpCode │
│ Write Inline │ │ Serialize Body│
└───────────────┘ └───────────────┘
Serializer Registry
All serializers are registered in a sparse array indexed by opcode[3][4]:
const ValueSerializer = struct {
/// Sparse array of serializers indexed by opcode
serializers: [256]?*const Serializer,
pub fn init() ValueSerializer {
var self = ValueSerializer{ .serializers = [_]?*const Serializer{null} ** 256 };
// Constants
self.register(OpCode.Constant, &ConstantSerializer);
self.register(OpCode.ConstantPlaceholder, &ConstantPlaceholderSerializer);
// Tuples
self.register(OpCode.Tuple, &TupleSerializer);
self.register(OpCode.SelectField, &SelectFieldSerializer);
// Relations
self.register(OpCode.GT, &BinOpSerializer);
self.register(OpCode.GE, &BinOpSerializer);
self.register(OpCode.LT, &BinOpSerializer);
self.register(OpCode.LE, &BinOpSerializer);
self.register(OpCode.EQ, &BinOpSerializer);
self.register(OpCode.NEQ, &BinOpSerializer);
// Logical
self.register(OpCode.BinAnd, &BinOpSerializer);
self.register(OpCode.BinOr, &BinOpSerializer);
self.register(OpCode.BinXor, &BinOpSerializer);
// Arithmetic
self.register(OpCode.Plus, &BinOpSerializer);
self.register(OpCode.Minus, &BinOpSerializer);
self.register(OpCode.Multiply, &BinOpSerializer);
self.register(OpCode.Division, &BinOpSerializer);
self.register(OpCode.Modulo, &BinOpSerializer);
// Context
self.register(OpCode.Height, &NullarySerializer);
self.register(OpCode.Self, &NullarySerializer);
self.register(OpCode.Inputs, &NullarySerializer);
self.register(OpCode.Outputs, &NullarySerializer);
self.register(OpCode.Context, &NullarySerializer);
self.register(OpCode.Global, &NullarySerializer);
// Collections
self.register(OpCode.Coll, &CollectionSerializer);
self.register(OpCode.CollBoolConst, &BoolCollectionSerializer);
self.register(OpCode.Map, &MapSerializer);
self.register(OpCode.Filter, &FilterSerializer);
self.register(OpCode.Fold, &FoldSerializer);
// Method calls
self.register(OpCode.PropertyCall, &PropertyCallSerializer);
self.register(OpCode.MethodCall, &MethodCallSerializer);
return self;
}
fn register(self: *ValueSerializer, opcode: OpCode, serializer: *const Serializer) void {
self.serializers[opcode.value] = serializer;
}
pub fn getSerializer(self: *const ValueSerializer, opcode: OpCode) !*const Serializer {
return self.serializers[opcode.value] orelse error.UnknownOpCode;
}
};
Serialization Dispatch
Serialize Expression
pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
switch (expr.*) {
.constant => |c| {
if (w.constant_store) |store| {
// Extract constant to store, write placeholder
const idx = try store.put(c);
try w.putByte(OpCode.ConstantPlaceholder.value);
try w.putUInt(idx);
} else {
// Write constant inline (type + value)
try ConstantSerializer.serialize(c, w);
}
},
else => {
const opcode = expr.opCode();
try w.putByte(opcode.value); // Write opcode first
const ser = registry.getSerializer(opcode) catch return error.UnknownOpCode;
try ser.serialize(expr, w); // Then serialize body
},
}
}
Deserialize Expression
pub fn deserialize(r: *SigmaByteReader) !Expr {
const tag = try r.getByte();
// Look-ahead: constants have type codes 1-112
if (tag <= OpCode.LAST_CONSTANT_CODE) {
return .{ .constant = try ConstantSerializer.deserializeWithTag(r, tag) };
}
const opcode = OpCode{ .value = tag };
const ser = registry.getSerializer(opcode) catch {
return error.UnknownOpCode;
};
return ser.deserialize(r);
}
Constant Serialization
Constants are serialized as type followed by value[5][6]:
const ConstantSerializer = struct {
pub fn serialize(c: Constant, w: *SigmaByteWriter) !void {
try TypeSerializer.serialize(c.tpe, w); // 1. Type
try DataSerializer.serialize(c.value, c.tpe, w); // 2. Value
}
pub fn deserialize(r: *SigmaByteReader) !Constant {
const tag = try r.getByte();
return deserializeWithTag(r, tag);
}
pub fn deserializeWithTag(r: *SigmaByteReader, tag: u8) !Constant {
const tpe = try TypeSerializer.parseWithTag(r, tag);
const value = try DataSerializer.deserialize(tpe, r);
return Constant{ .tpe = tpe, .value = value };
}
};
Constant Placeholder
When constant segregation is enabled, constants become placeholders[7]:
const ConstantPlaceholderSerializer = struct {
pub fn serialize(ph: ConstantPlaceholder, w: *SigmaByteWriter) !void {
try w.putUInt(ph.index); // Just the index
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
if (r.substitute_placeholders) {
// Return actual constant from store
const c = try r.constant_store.get(@intCast(id));
return .{ .constant = c };
} else {
// Return placeholder (for template extraction)
const tpe = (try r.constant_store.get(@intCast(id))).tpe;
return .{ .constant_placeholder = .{ .index = @intCast(id), .tpe = tpe } };
}
}
};
Common Serializer Patterns
BinOp Serializer (Two Arguments)
For binary operations like arithmetic and comparisons[8]:
const BinOpSerializer = struct {
pub fn serialize(expr: *const Expr, w: *SigmaByteWriter) !void {
const binop = expr.asBinOp();
try ValueSerializer.serialize(binop.left, w); // Left operand
try ValueSerializer.serialize(binop.right, w); // Right operand
}
pub fn deserialize(r: *SigmaByteReader, kind: BinOp.Kind) !Expr {
const left = try ValueSerializer.deserialize(r);
const right = try ValueSerializer.deserialize(r);
return .{ .bin_op = .{
.kind = kind,
.left = &left,
.right = &right,
} };
}
};
Unary Serializer (One Argument)
For single-input transformations:
const UnarySerializer = struct {
pub fn serialize(input: *const Expr, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(input, w);
}
pub fn deserialize(r: *SigmaByteReader) !*const Expr {
return try ValueSerializer.deserialize(r);
}
};
Nullary Serializer (No Body)
For singletons where opcode is sufficient:
const NullarySerializer = struct {
pub fn serialize(_: *const Expr, _: *SigmaByteWriter) !void {
// Nothing to write - opcode is enough
}
pub fn deserialize(r: *SigmaByteReader, opcode: OpCode) !Expr {
_ = r;
return switch (opcode) {
.Height => .{ .global_var = .height },
.Self => .{ .global_var = .self_box },
.Inputs => .{ .global_var = .inputs },
.Outputs => .{ .global_var = .outputs },
.Context => .context,
.Global => .global,
else => error.InvalidOpCode,
};
}
};
Collection Serializers
ConcreteCollection
For collections of expressions[9]:
const CollectionSerializer = struct {
const MAX_COLLECTION_ITEMS: u16 = 4096; // DoS protection
pub fn serialize(coll: *const Collection, w: *SigmaByteWriter) !void {
try w.putUShort(@intCast(coll.items.len)); // Count
try TypeSerializer.serialize(coll.elem_type, w); // Element type
for (coll.items) |item| {
try ValueSerializer.serialize(item, w); // Each item
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUShort();
if (count > MAX_COLLECTION_ITEMS) return error.CollectionTooLarge;
const elem_type = try TypeSerializer.deserialize(r);
var items = try r.allocator.alloc(*Expr, count);
for (0..count) |i| {
items[i] = try ValueSerializer.deserialize(r);
}
return .{ .collection = .{
.elem_type = elem_type,
.items = items,
} };
}
};
// NOTE: In production, use a pre-allocated expression pool instead of
// dynamic allocation during deserialization. See ZIGMA_STYLE.md.
Boolean Collection Constant
Compact serialization for Coll[Boolean] constants:
const BoolCollectionSerializer = struct {
pub fn serialize(bools: []const bool, w: *SigmaByteWriter) !void {
try w.putUShort(@intCast(bools.len));
// Pack into bits
const byte_count = (bools.len + 7) / 8;
var i: usize = 0;
for (0..byte_count) |_| {
var byte: u8 = 0;
for (0..8) |bit| {
if (i < bools.len and bools[i]) {
byte |= @as(u8, 1) << @intCast(bit);
}
i += 1;
}
try w.putByte(byte);
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUShort();
const byte_count = (count + 7) / 8;
var bools = try r.allocator.alloc(bool, count);
var i: usize = 0;
for (0..byte_count) |_| {
const byte = try r.getByte();
for (0..8) |bit| {
if (i >= count) break;
bools[i] = (byte >> @intCast(bit)) & 1 == 1;
i += 1;
}
}
return .{ .coll_bool_const = bools };
}
};
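The LSB-first bit-packing layout round-trips cleanly; the same packing sketched in Python (helper names are illustrative):

```python
def pack_bools(bools):
    # One bit per element, least-significant bit first within each byte
    out = bytearray((len(bools) + 7) // 8)
    for i, b in enumerate(bools):
        if b:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def unpack_bools(data, count):
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]

bits = [True, False, True, True, False, False, False, False, True]
packed = pack_bools(bits)
assert packed == bytes([0b00001101, 0b00000001])  # 9 bools -> 2 bytes
assert unpack_bools(packed, len(bits)) == bits
```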
Map/Filter/Fold
Higher-order collection operations:
const MapSerializer = struct {
pub fn serialize(m: *const Map, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(m.input, w); // Collection
try ValueSerializer.serialize(m.mapper, w); // Function
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const input = try ValueSerializer.deserialize(r);
const mapper = try ValueSerializer.deserialize(r);
return .{ .map = .{ .input = &input, .mapper = &mapper } };
}
};
const FoldSerializer = struct {
pub fn serialize(f: *const Fold, w: *SigmaByteWriter) !void {
try ValueSerializer.serialize(f.input, w); // Collection
try ValueSerializer.serialize(f.zero, w); // Initial value
try ValueSerializer.serialize(f.folder, w); // Fold function
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const input = try ValueSerializer.deserialize(r);
const zero = try ValueSerializer.deserialize(r);
const folder = try ValueSerializer.deserialize(r);
return .{ .fold = .{
.input = &input,
.zero = &zero,
.folder = &folder,
} };
}
};
Block and Function Serializers
BlockValue
For blocks with local definitions[10]:
const BlockValueSerializer = struct {
pub fn serialize(block: *const BlockValue, w: *SigmaByteWriter) !void {
try w.putUInt(block.items.len); // Definition count
for (block.items) |item| {
try ValueSerializer.serialize(item, w); // Each definition
}
try ValueSerializer.serialize(block.result, w); // Result expression
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const count = try r.getUInt();
var items = try r.allocator.alloc(*Expr, @intCast(count));
for (0..count) |i| {
items[i] = try ValueSerializer.deserialize(r);
}
const result = try ValueSerializer.deserialize(r);
return .{ .block_value = .{ .items = items, .result = &result } };
}
};
FuncValue
For lambda functions:
const FuncValueSerializer = struct {
pub fn serialize(func: *const FuncValue, w: *SigmaByteWriter) !void {
try w.putUInt(func.args.len); // Argument count
for (func.args) |arg| {
try w.putUInt(arg.id); // Argument id
try TypeSerializer.serialize(arg.tpe, w); // Argument type
}
try ValueSerializer.serialize(func.body, w); // Body
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const arg_count = try r.getUInt();
var args = try r.allocator.alloc(FuncArg, @intCast(arg_count));
for (0..arg_count) |i| {
const id = try r.getUInt();
const tpe = try TypeSerializer.deserialize(r);
// Store type for ValUse resolution
r.val_def_type_store.put(@intCast(id), tpe);
args[i] = .{ .id = @intCast(id), .tpe = tpe };
}
const body = try ValueSerializer.deserialize(r);
return .{ .func_value = .{ .args = args, .body = &body } };
}
};
ValDef / ValUse
Variable definitions and references:
const ValDefSerializer = struct {
pub fn serialize(vd: *const ValDef, w: *SigmaByteWriter) !void {
try w.putUInt(vd.id);
try TypeSerializer.serialize(vd.tpe, w);
try ValueSerializer.serialize(vd.rhs, w);
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
const tpe = try TypeSerializer.deserialize(r);
// Store for ValUse resolution
r.val_def_type_store.put(@intCast(id), tpe);
const rhs = try ValueSerializer.deserialize(r);
return .{ .val_def = .{ .id = @intCast(id), .tpe = tpe, .rhs = &rhs } };
}
};
const ValUseSerializer = struct {
pub fn serialize(vu: *const ValUse, w: *SigmaByteWriter) !void {
try w.putUInt(vu.id);
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const id = try r.getUInt();
// Lookup type from earlier ValDef
const tpe = r.val_def_type_store.get(@intCast(id)) orelse
return error.UndefinedVariable;
return .{ .val_use = .{ .id = @intCast(id), .tpe = tpe } };
}
};
MethodCall Serializer
Method calls require type and method ID lookup[11][12]:
const MethodCallSerializer = struct {
pub fn serialize(mc: *const MethodCall, w: *SigmaByteWriter) !void {
try w.putByte(mc.method.obj_type.typeId()); // Type ID
try w.putByte(mc.method.method_id); // Method ID
try ValueSerializer.serialize(mc.obj, w); // Receiver
try w.putUInt(mc.args.len); // Arg count
for (mc.args) |arg| {
try ValueSerializer.serialize(arg, w); // Each argument
}
// Explicit type arguments (for generic methods)
for (mc.method.explicit_type_args) |tvar| {
const tpe = mc.type_subst.get(tvar) orelse continue;
try TypeSerializer.serialize(tpe, w);
}
}
pub fn deserialize(r: *SigmaByteReader) !Expr {
const type_id = try r.getByte();
const method_id = try r.getByte();
const obj = try ValueSerializer.deserialize(r);
const arg_count = try r.getUInt();
var args = try r.allocator.alloc(*Expr, @intCast(arg_count));
for (0..arg_count) |i| {
args[i] = try ValueSerializer.deserialize(r);
}
// Lookup method by type and method ID
const method = try SMethod.fromIds(type_id, method_id);
// Check version compatibility
if (r.tree_version.value < method.min_version.value) {
return error.MethodNotAvailable;
}
// Read type arguments
var type_args = std.AutoHashMap(STypeVar, SType).init(r.allocator);
for (method.explicit_type_args) |tvar| {
const tpe = try TypeSerializer.deserialize(r);
try type_args.put(tvar, tpe);
}
return .{ .method_call = .{
.obj = &obj,
.method = method,
.args = args,
.type_subst = type_args,
} };
}
};
Serializer Summary Table
OpCode Range Category Serializer Pattern
────────────────────────────────────────────────────────────
1-112 Constants Type + Value inline
113 ConstPlaceholder Index only
114-120 Global vars Nullary (opcode only)
121-130 Unary ops Single child
131-150 Binary ops Left + Right
151-160 Collection ops Input + Function
161-170 Block/Func Items + Body
171-180 Method calls TypeId + MethodId + Args
Summary
This chapter covered the value serialization system that transforms ErgoTree expression trees to and from bytes:
- Opcode dispatch enables extensible serialization—the first byte of each expression determines which serializer handles the remaining bytes, allowing O(1) lookup via a sparse registry array
- Constant extraction supports two modes: inline serialization (type + value) when constant segregation is disabled, or placeholder indices when segregation is enabled for template sharing
- Common serializer patterns reduce code duplication: BinOpSerializer handles all two-argument operations, UnarySerializer handles single-input transformations, and NullarySerializer handles singletons where the opcode alone is sufficient
- Collection serializers include bounds checking to prevent DoS attacks from maliciously large collections during deserialization
- Type inference via ValDefTypeStore tracks variable types as ValDef nodes are deserialized, allowing ValUse nodes to recover their types without storing them redundantly
- Method call serialization includes type ID, method ID, and version checking to ensure compatibility with the ErgoTree version being deserialized
Next: Chapter 9: Elliptic Curve Cryptography
Scala: ValueSerializer.scala:65-95
Rust: expr.rs:83-203
Scala: ValueSerializer.scala:50-182
Rust: expr.rs:215-298
Scala: ConstantSerializer.scala
Rust: constant.rs:9-29
Rust: constant_placeholder.rs
Rust: bin_op.rs
Scala: BlockValueSerializer.scala
Scala: MethodCallSerializer.scala
Rust: method_call.rs:19-60
Chapter 9: Elliptic Curve Cryptography
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Basic finite field arithmetic: operations modulo a prime p, multiplicative inverses
- Public key cryptography concepts: key pairs, discrete logarithm problem
- Understanding of elliptic curves as sets of points satisfying y² = x³ + ax + b over a finite field
- Prior chapters: Chapter 2 for the GroupElement type
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why secp256k1 was chosen for Sigma protocols and describe its key parameters
- Implement the discrete logarithm group interface: exponentiate, multiply, inverse
- Encode and decode group elements using compressed SEC1 format (33 bytes)
- Translate between multiplicative group notation (used in Sigma protocols) and additive notation (used in libraries)
The Secp256k1 Curve
Sigma protocols use secp256k1—the same elliptic curve as Bitcoin and Ethereum12. This choice provides several benefits: widespread library support, extensive security analysis, and compatibility with existing blockchain infrastructure. The curve offers 128-bit security (meaning the best known attack requires approximately 2^128 operations) while using 256-bit keys.
Curve Definition
The curve is defined by:
y² = x³ + 7 (mod p)
where:
p = 2²⁵⁶ - 2³² - 977 (field characteristic)
n = group order (number of points)
G = generator point (base point)
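These parameters are plain integers, so they can be sanity-checked directly. The following Python snippet (illustrative; the Gx/Gy values are the standard SEC 2 generator coordinates, not defined in the text above) verifies that the generator lies on the curve:

```python
# Sanity check of the secp256k1 parameters quoted above
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# The generator must satisfy the curve equation y^2 = x^3 + 7 (mod p)
assert (Gy * Gy - (Gx**3 + 7)) % p == 0
# Both the field prime and the group order are 256-bit values
assert p.bit_length() == 256 and n.bit_length() == 256
```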
Cryptographic Constants
const CryptoConstants = struct {
/// Encoded group element size in bytes (compressed)
pub const ENCODED_GROUP_ELEMENT_LENGTH: usize = 33;
/// Group size in bits
pub const GROUP_SIZE_BITS: u32 = 256;
/// Challenge size for Sigma protocols
/// Must be < GROUP_SIZE_BITS for security
pub const SOUNDNESS_BITS: u32 = 192;
/// Group order (number of curve points)
/// n = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
pub const GROUP_ORDER: [32]u8 = .{
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE,
0xBA, 0xAE, 0xDC, 0xE6, 0xAF, 0x48, 0xA0, 0x3B,
0xBF, 0xD2, 0x5E, 0x8C, 0xD0, 0x36, 0x41, 0x41,
};
/// Field characteristic
/// p = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
pub const FIELD_PRIME: [32]u8 = .{
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFC, 0x2F,
};
comptime {
// Security constraint: 2^soundnessBits < groupOrder
std.debug.assert(SOUNDNESS_BITS < GROUP_SIZE_BITS);
}
};
Group Element Representation
EcPoint Structure
const EcPoint = struct {
/// Compressed encoding size
pub const GROUP_SIZE: usize = 33;
/// Internal representation (projective coordinates)
x: FieldElement,
y: FieldElement,
z: FieldElement,
/// Identity element (point at infinity)
pub const IDENTITY = EcPoint{
.x = FieldElement.zero(),
.y = FieldElement.one(),
.z = FieldElement.zero(),
};
/// Generator point G
pub const GENERATOR = init: {
// secp256k1 generator coordinates
const gx = FieldElement.fromHex(
"79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
);
const gy = FieldElement.fromHex(
"483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8"
);
break :init EcPoint{ .x = gx, .y = gy, .z = FieldElement.one() };
};
/// Check if this is the identity (infinity) point
pub fn isIdentity(self: *const EcPoint) bool {
return self.z.isZero();
}
/// Convert to affine coordinates
pub fn toAffine(self: *const EcPoint) struct { x: FieldElement, y: FieldElement } {
if (self.isIdentity()) return .{ .x = FieldElement.zero(), .y = FieldElement.zero() };
const z_inv = self.z.inverse();
return .{
.x = self.x.mul(z_inv),
.y = self.y.mul(z_inv),
};
}
};
Group Operations
The discrete logarithm group interface provides standard operations34:
Operation Notation Description
─────────────────────────────────────────────────────
Exponentiate g^x Scalar multiplication
Multiply g * h Point addition
Inverse g^(-1) Point negation
Identity 1 Point at infinity
Generator g Base point G
Group Interface
SECURITY: The exponentiate (scalar multiplication) operation must be implemented in constant time when the scalar is secret (e.g., private keys, nonces). Variable-time implementations leak secret bits through timing side channels. Use audited libraries such as libsecp256k1 or Zig's std.crypto.ecc.
const DlogGroup = struct {
/// The generator point
pub fn generator() EcPoint {
return EcPoint.GENERATOR;
}
/// The identity element (point at infinity)
pub fn identity() EcPoint {
return EcPoint.IDENTITY;
}
/// Check if point is identity
pub fn isIdentity(point: *const EcPoint) bool {
return point.isIdentity();
}
/// Exponentiate: base^exponent (scalar multiplication)
pub fn exponentiate(base: *const EcPoint, exponent: *const Scalar) EcPoint {
if (base.isIdentity()) return base.*;
// Handle negative exponents
var exp = exponent.*;
if (exp.isNegative()) {
exp = exp.mod(CryptoConstants.GROUP_ORDER);
}
return scalarMul(base, &exp);
}
/// Multiply two group elements: g1 * g2 (point addition)
pub fn multiply(g1: *const EcPoint, g2: *const EcPoint) EcPoint {
return pointAdd(g1, g2);
}
/// Compute inverse: g^(-1) (point negation)
pub fn inverse(point: *const EcPoint) EcPoint {
return EcPoint{
.x = point.x,
.y = point.y.negate(),
.z = point.z,
};
}
/// Create random group element
pub fn randomElement(rng: std.rand.Random) EcPoint {
const scalar = Scalar.random(rng);
return exponentiate(&EcPoint.GENERATOR, &scalar);
}
};
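To make the interface concrete, here is a tiny affine-coordinate model in Python with plain integers (strictly illustrative and variable-time; a real implementation uses constant-time projective arithmetic as noted above). The function names mirror the DlogGroup interface:

```python
p = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """multiply(g1, g2): point addition; None is the identity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None                                        # g * g^(-1) = 1
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p   # tangent slope
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p  # chord slope
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def exponentiate(P, k):
    """exponentiate(base, k): double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def inverse(P):
    """inverse(g) = g^(-1): point negation."""
    return (P[0], (-P[1]) % p)

assert add(G, G) == exponentiate(G, 2)   # g * g == g^2
assert add(G, inverse(G)) is None        # g * g^(-1) == identity
```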
Notation Translation
Sigma protocols use multiplicative notation while underlying libraries often use additive5:
Sigma (multiplicative) Library (additive) Operation
──────────────────────────────────────────────────────────
g * h g + h Point addition
g^n n * g Scalar multiplication
g^(-1) -g Point negation
1 (identity) O (origin) Point at infinity
/// Wrapper translating multiplicative to additive notation
const MultiplicativeGroup = struct {
/// Multiply in multiplicative notation = Add in additive
pub fn mul(a: *const EcPoint, b: *const EcPoint) EcPoint {
return pointAdd(a, b);
}
/// Exponentiate in multiplicative = Scalar multiply in additive
pub fn exp(base: *const EcPoint, scalar: *const Scalar) EcPoint {
return scalarMul(base, scalar);
}
/// Inverse in multiplicative = Negate in additive
pub fn inv(p: *const EcPoint) EcPoint {
return pointNegate(p);
}
};
Point Encoding
Group elements use compressed SEC1 encoding (33 bytes)67:
Compressed Point Format (33 bytes)
────────────────────────────────────────────────────
┌──────────┬────────────────────────────────────────┐
│ Byte 0 │ Bytes 1-32 │
├──────────┼────────────────────────────────────────┤
│ 0x02 │ X coordinate (32 bytes, big-end) │ Y is even
│ 0x03 │ X coordinate (32 bytes, big-end) │ Y is odd
│ 0x00 │ 32 zero bytes │ Identity
└──────────┴────────────────────────────────────────┘
Serialization Implementation
const GroupElementSerializer = struct {
const ENCODING_SIZE: usize = 33;
/// Identity encoding (33 zero bytes)
const IDENTITY_ENCODING = [_]u8{0} ** ENCODING_SIZE;
pub fn serialize(point: *const EcPoint, writer: anytype) !void {
if (point.isIdentity()) {
try writer.writeAll(&IDENTITY_ENCODING);
return;
}
const affine = point.toAffine();
// Determine sign byte from Y coordinate parity
const y_bytes = affine.y.toBytes();
const sign_byte: u8 = if (y_bytes[31] & 1 == 0) 0x02 else 0x03;
// Write sign byte + X coordinate
try writer.writeByte(sign_byte);
try writer.writeAll(&affine.x.toBytes());
}
pub fn deserialize(reader: anytype) !EcPoint {
var buf: [ENCODING_SIZE]u8 = undefined;
try reader.readNoEof(&buf);
if (buf[0] == 0) {
// Check all zeros for identity
for (buf[1..]) |b| {
if (b != 0) return error.InvalidEncoding;
}
return EcPoint.IDENTITY;
}
if (buf[0] != 0x02 and buf[0] != 0x03) {
return error.InvalidPrefix;
}
// Recover Y from X using curve equation: y² = x³ + 7
const x = FieldElement.fromBytes(buf[1..33]);
const y_squared = x.cube().add(FieldElement.fromInt(7));
var y = y_squared.sqrt() orelse return error.NotOnCurve;
// Choose correct Y based on sign byte
const y_is_odd = y.toBytes()[31] & 1 == 1;
if ((buf[0] == 0x02) == y_is_odd) {
y = y.negate();
}
const point = EcPoint{ .x = x, .y = y, .z = FieldElement.one() };
// CRITICAL: Validate point is on curve and in correct subgroup
// This prevents invalid curve attacks. See ZIGMA_STYLE.md.
// if (!point.isOnCurve()) return error.NotOnCurve;
// if (!point.isInSubgroup()) return error.InvalidSubgroup;
return point;
}
};
Why Compressed Encoding?
Format Size Content
────────────────────────────────────────────────────
Compressed 33 B Sign (1) + X (32)
Uncompressed 65 B 0x04 (1) + X (32) + Y (32)
Savings 49% Y recovered from curve equation
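The decoding path can be sanity-checked in Python: because p ≡ 3 (mod 4) for secp256k1, the square root in the curve equation is a single modular exponentiation. This sketch is illustrative only and omits the subgroup validation discussed above:

```python
# Compressed SEC1 round-trip (illustrative, not a hardened decoder)
p = 2**256 - 2**32 - 977
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def compress(x, y):
    # Sign byte from Y parity, then 32-byte big-endian X
    return bytes([0x02 if y % 2 == 0 else 0x03]) + x.to_bytes(32, "big")

def decompress(buf):
    assert len(buf) == 33 and buf[0] in (0x02, 0x03)
    x = int.from_bytes(buf[1:], "big")
    # Since p % 4 == 3, sqrt(a) = a^((p+1)/4) mod p when a is a square
    y = pow(x**3 + 7, (p + 1) // 4, p)
    if (y * y - (x**3 + 7)) % p != 0:
        raise ValueError("not on curve")
    if (y % 2 == 0) != (buf[0] == 0x02):   # wrong parity: take p - y
        y = p - y
    return x, y

encoded = compress(Gx, Gy)
assert len(encoded) == 33 and encoded[0] == 0x02   # Gy is even
assert decompress(encoded) == (Gx, Gy)             # round-trip
```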
Coordinate Systems
Affine vs Projective
Libraries use projective coordinates internally for efficiency:
Coordinate System Representation Division Required
──────────────────────────────────────────────────────────
Affine (x, y) Per operation
Projective (X, Y, Z) Only at end
x = X/Z
y = Y/Z
Normalization
/// Normalize point to affine coordinates
/// Required before: encoding, comparison, coordinate access
pub fn normalize(point: *const EcPoint) EcPoint {
if (point.isIdentity()) return point.*;
// Projective convention used here (matching toAffine): x = X/Z, y = Y/Z.
// Jacobian coordinates would instead divide by Z^2 and Z^3.
const z_inv = point.z.inverse();
return EcPoint{
.x = point.x.mul(z_inv),
.y = point.y.mul(z_inv),
.z = FieldElement.one(),
};
}
Random Scalar Generation
Secure random scalars for key generation8:
const Scalar = struct {
bytes: [32]u8,
/// Generate random scalar in [1, n-1] where n is group order
pub fn random(rng: std.rand.Random) Scalar {
while (true) {
var bytes: [32]u8 = undefined;
rng.bytes(&bytes);
// Ensure scalar < group order
if (lessThan(&bytes, &CryptoConstants.GROUP_ORDER)) {
// Ensure non-zero
var is_zero = true;
for (bytes) |b| {
if (b != 0) { is_zero = false; break; }
}
if (!is_zero) {
return Scalar{ .bytes = bytes };
}
}
}
}
/// Compare scalars as big-endian integers: a < b
fn lessThan(a: *const [32]u8, b: *const [32]u8) bool {
// NOTE: This simplified version is NOT constant-time (it exits early).
// In production, track "less than" and "still equal" flags without
// branching on secret data, e.g.:
//   var lt: u1 = 0;
//   var eq: u1 = 1;
//   for (a.*, b.*) |ai, bi| {
//       lt |= eq & @intFromBool(ai < bi);
//       eq &= @intFromBool(ai == bi);
//   }
//   return lt == 1;
// See ZIGMA_STYLE.md for constant-time crypto requirements.
for (a.*, b.*) |ai, bi| {
if (ai < bi) return true;
if (ai > bi) return false;
}
return false;
}
};
Security Properties
Discrete Logarithm Assumption
The security relies on the hardness of the DLP9:
Given: g (generator), h = g^x (public key)
Find: x (secret key)
Best known attack: ~2^128 operations for secp256k1
Soundness Parameter
The SOUNDNESS_BITS = 192 determines:
- Challenge size in Sigma protocols
- Security level against malicious provers
- Constraint: 2^192 < n (group order)
comptime {
// Verify soundness constraint
// 2^soundnessBits must be less than group order
// Group order ≈ 2^256, so 192 < 256 satisfies this
std.debug.assert(CryptoConstants.SOUNDNESS_BITS < 256);
}
Summary
This chapter covered the elliptic curve cryptography foundation that underlies all Sigma protocol operations:
- secp256k1 (y² = x³ + 7) provides the mathematical foundation for Sigma protocols, chosen for its security properties and widespread support in Bitcoin and Ethereum tooling
- Group elements are encoded as 33 bytes using compressed SEC1 format—a sign byte (0x02 or 0x03 based on Y coordinate parity) followed by the 32-byte X coordinate
- Multiplicative notation used in Sigma protocol literature (g^x, g·h) maps to additive operations in typical EC libraries (scalar multiplication, point addition)
- SOUNDNESS_BITS = 192 determines the challenge size in Sigma protocols and must be less than the group order's bit length for security
- The DlogGroup interface provides exponentiate (scalar multiplication), multiply (point addition), inverse (point negation), and identity (point at infinity)
- Projective coordinates (X, Y, Z) avoid expensive field inversions during computation; conversion to affine coordinates is required only for encoding and comparison
Next: Chapter 10: Hash Functions
Scala: CryptoConstants.scala
Rust: ec_point.rs:41-51
Scala: DlogGroup.scala
Rust: dlog_group.rs:39-84
Scala: Platform.scala:217-225
Scala: GroupElementSerializer.scala
Rust: ec_point.rs:120-146
Rust: dlog_group.rs:40-43
Scala: CryptoConstants.scala:70-75
Chapter 10: Hash Functions
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Cryptographic hash function properties: collision resistance, preimage resistance, deterministic output
- Understanding of message authentication codes (MACs) and their role in key derivation
- Prior chapters: Chapter 9 for the cryptographic context, Chapter 5 for opcode-based operations
Learning Objectives
By the end of this chapter, you will be able to:
- Explain why BLAKE2b256 is the primary hash function in Ergo and when SHA-256 is used
- Implement hash operations with per-item costing based on block size
- Describe Fiat-Shamir challenge generation and why challenges are truncated to 192 bits
- Use HMAC-SHA512 for BIP32/BIP39 key derivation
Hash Functions in Sigma
Hash functions are fundamental to blockchain security—they provide integrity guarantees, enable content addressing, and transform interactive proofs into non-interactive ones via the Fiat-Shamir heuristic. The Sigma protocol uses two primary hash functions, each optimized for different use cases12:
Hash Function Uses
─────────────────────────────────────────────────────
Purpose Function Output
─────────────────────────────────────────────────────
Script hashing blake2b256() 32 bytes
External compat sha256() 32 bytes
Challenge gen Fiat-Shamir 24 bytes (truncated)
Box identification blake2b256() 32 bytes
Key derivation HMAC-SHA512 64 bytes
BLAKE2b256
The primary hash function for Ergo—faster than SHA-256 on 64-bit platforms, at a comparable security level34.
Implementation
const Blake2b256 = struct {
/// Output size in bytes
pub const DIGEST_SIZE: usize = 32;
/// Block size for cost calculation
pub const BLOCK_SIZE: usize = 128;
state: [8]u64,
buf: [BLOCK_SIZE]u8,
buf_len: usize,
total_len: u128,
const IV: [8]u64 = .{
0x6a09e667f3bcc908, 0xbb67ae8584caa73b,
0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
0x510e527fade682d1, 0x9b05688c2b3e6c1f,
0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
};
pub fn init() Blake2b256 {
var self = Blake2b256{
.state = IV,
.buf = undefined,
.buf_len = 0,
.total_len = 0,
};
// Parameter block XOR (digest length, fanout, depth)
self.state[0] ^= 0x01010000 ^ DIGEST_SIZE;
return self;
}
pub fn update(self: *Blake2b256, data: []const u8) void {
var offset: usize = 0;
// Fill buffer if partially full; compress only when more data follows,
// since BLAKE2b must hold back the final block for the last-block flag
if (self.buf_len > 0 and self.buf_len + data.len > BLOCK_SIZE) {
const fill = BLOCK_SIZE - self.buf_len;
@memcpy(self.buf[self.buf_len..][0..fill], data[0..fill]);
self.compress(false);
self.buf_len = 0;
offset = fill;
}
// Process full blocks, holding back the last one when nothing follows it
while (data.len - offset > BLOCK_SIZE) {
@memcpy(&self.buf, data[offset..][0..BLOCK_SIZE]);
self.compress(false);
offset += BLOCK_SIZE;
}
// Buffer the remainder (may be a complete block awaiting finalization)
const remaining = data.len - offset;
if (remaining > 0) {
@memcpy(self.buf[self.buf_len..][0..remaining], data[offset..][0..remaining]);
self.buf_len += remaining;
}
self.total_len += data.len;
}
pub fn final(self: *Blake2b256) [DIGEST_SIZE]u8 {
// Pad with zeros
@memset(self.buf[self.buf_len..], 0);
self.compress(true); // Final block
var result: [DIGEST_SIZE]u8 = undefined;
for (self.state[0..4], 0..) |s, i| {
@memcpy(result[i * 8 ..][0..8], &std.mem.toBytes(std.mem.nativeToLittle(u64, s)));
}
return result;
}
fn compress(self: *Blake2b256, is_final: bool) void {
// BLAKE2b compression function
// ... (standard BLAKE2b round function)
_ = is_final;
}
/// One-shot hash
pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
var hasher = init();
hasher.update(data);
return hasher.final();
}
};
AST Node
const CalcBlake2b256 = struct {
input: *const Expr, // Coll[Byte]
pub const OP_CODE = OpCode.new(87);
pub const COST = PerItemCost{
.base = JitCost{ .value = 20 },
.per_chunk = JitCost{ .value = 7 },
.chunk_size = 128,
};
pub fn tpe(_: *const CalcBlake2b256) SType {
return .{ .coll = &SType.byte };
}
pub fn eval(self: *const CalcBlake2b256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
const input_bytes = try self.input.eval(env, E);
const coll = input_bytes.coll.bytes;
// Add cost based on input length
E.addSeqCost(COST, coll.len, OP_CODE);
const result = Blake2b256.hash(coll);
return try E.allocator.dupe(u8, &result);
}
};
SHA-256
Available for external system compatibility (Bitcoin, etc.)56.
Implementation
const Sha256 = struct {
pub const DIGEST_SIZE: usize = 32;
pub const BLOCK_SIZE: usize = 64;
state: [8]u32,
buf: [BLOCK_SIZE]u8,
buf_len: usize,
total_len: u64,
const K: [64]u32 = .{
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
// ... remaining round constants
};
const H0: [8]u32 = .{
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
};
pub fn init() Sha256 {
return .{
.state = H0,
.buf = undefined,
.buf_len = 0,
.total_len = 0,
};
}
pub fn hash(data: []const u8) [DIGEST_SIZE]u8 {
var hasher = init();
hasher.update(data);
return hasher.final();
}
// ... update, final, compress methods
};
AST Node
const CalcSha256 = struct {
input: *const Expr,
pub const OP_CODE = OpCode.new(88);
/// SHA-256 is more expensive than BLAKE2b
pub const COST = PerItemCost{
.base = JitCost{ .value = 80 },
.per_chunk = JitCost{ .value = 8 },
.chunk_size = 64,
};
pub fn eval(self: *const CalcSha256, env: *const DataEnv, E: *Evaluator) ![]const u8 {
const input_bytes = try self.input.eval(env, E);
const coll = input_bytes.coll.bytes;
E.addSeqCost(COST, coll.len, OP_CODE);
const result = Sha256.hash(coll);
return try E.allocator.dupe(u8, &result);
}
};
Cost Comparison
Hash Function Costs
─────────────────────────────────────────────────────
Base Per Chunk Chunk Size
─────────────────────────────────────────────────────
BLAKE2b256 20 7 128 bytes
SHA-256 80 8 64 bytes
─────────────────────────────────────────────────────
Cost Formula: total = base + ceil(len / chunk_size) * per_chunk
Example: 200-byte Input
BLAKE2b256:
chunks = ceil(200 / 128) = 2
cost = 20 + 2 * 7 = 34
SHA-256:
chunks = ceil(200 / 64) = 4
cost = 80 + 4 * 8 = 112
Ratio: SHA-256 is ~3.3x more expensive
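The cost formula is plain arithmetic, so the 200-byte example can be checked in a few lines of Python:

```python
import math

def hash_cost(base, per_chunk, chunk_size, length):
    """total = base + ceil(length / chunk_size) * per_chunk"""
    return base + math.ceil(length / chunk_size) * per_chunk

blake = hash_cost(base=20, per_chunk=7, chunk_size=128, length=200)  # 20 + 2*7
sha = hash_cost(base=80, per_chunk=8, chunk_size=64, length=200)     # 80 + 4*8
assert blake == 34
assert sha == 112
print(f"ratio: {sha / blake:.1f}x")   # ~3.3x
```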
Fiat-Shamir Hash
Internal hash for Sigma protocol challenge generation78:
const FiatShamir = struct {
/// Soundness bits (192 = 24 bytes)
pub const SOUNDNESS_BITS: u32 = 192;
pub const SOUNDNESS_BYTES: usize = SOUNDNESS_BITS / 8; // 24
/// Fiat-Shamir hash function
/// Returns first 24 bytes of BLAKE2b256 hash
pub fn hashFn(input: []const u8) [SOUNDNESS_BYTES]u8 {
const full_hash = Blake2b256.hash(input);
var result: [SOUNDNESS_BYTES]u8 = undefined;
@memcpy(&result, full_hash[0..SOUNDNESS_BYTES]);
return result;
}
};
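Python's hashlib exposes BLAKE2b with a configurable digest size, so the truncation step can be mirrored directly (illustrative sketch of the truncation idea; the input bytes are arbitrary):

```python
import hashlib

SOUNDNESS_BYTES = 192 // 8   # 24

def fiat_shamir_hash(data: bytes) -> bytes:
    """First 24 bytes of the 32-byte BLAKE2b256 digest."""
    full = hashlib.blake2b(data, digest_size=32).digest()
    return full[:SOUNDNESS_BYTES]

challenge = fiat_shamir_hash(b"serialized proof tree")
assert len(challenge) == 24
```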
Why 192 Bits?
The truncation to 192 bits is not arbitrary9:
Security Constraints
─────────────────────────────────────────────────────
1. Challenge must be unpredictable to cheating prover
2. Threshold signatures use GF(2^192) polynomials
3. Must satisfy: 2^soundnessBits < group_order
4. Group order ≈ 2^256, so 192 < 256 works
comptime {
// This constraint is critical for security
std.debug.assert(FiatShamir.SOUNDNESS_BITS < CryptoConstants.GROUP_SIZE_BITS);
}
Fiat-Shamir Tree Serialization
The challenge is computed from a serialized proof tree10:
const FiatShamirTreeSerializer = struct {
const INTERNAL_NODE_PREFIX: u8 = 0;
const LEAF_PREFIX: u8 = 1;
pub fn serialize(tree: *const ProofTree, writer: anytype) !void {
switch (tree.*) {
.leaf => |leaf| {
try writer.writeByte(LEAF_PREFIX);
// Serialize proposition as ErgoTree
const prop_bytes = try leaf.proposition.toErgoTreeBytes();
try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
try writer.writeAll(prop_bytes);
// Serialize commitment
const commitment = leaf.commitment orelse
return error.EmptyCommitment;
try writer.writeInt(i16, @intCast(commitment.len), .big);
try writer.writeAll(commitment);
},
.conjecture => |conj| {
try writer.writeByte(INTERNAL_NODE_PREFIX);
try writer.writeByte(@intFromEnum(conj.conj_type));
// Threshold k for CTHRESHOLD
if (conj.conj_type == .cthreshold) {
try writer.writeByte(conj.k);
}
try writer.writeInt(i16, @intCast(conj.children.len), .big);
for (conj.children) |child| {
try serialize(child, writer);
}
},
}
}
};
HMAC-SHA512
For BIP32/BIP39 key derivation11:
const HmacSha512 = struct {
pub const DIGEST_SIZE: usize = 64;
pub const BLOCK_SIZE: usize = 128;
inner: Sha512,
outer: Sha512,
pub fn init(key: []const u8) HmacSha512 {
var padded_key: [BLOCK_SIZE]u8 = [_]u8{0} ** BLOCK_SIZE;
if (key.len > BLOCK_SIZE) {
const hashed = Sha512.hash(key);
@memcpy(padded_key[0..64], &hashed);
} else {
@memcpy(padded_key[0..key.len], key);
}
// Inner padding (0x36)
var inner_pad: [BLOCK_SIZE]u8 = undefined;
for (padded_key, 0..) |b, i| {
inner_pad[i] = b ^ 0x36;
}
// Outer padding (0x5c)
var outer_pad: [BLOCK_SIZE]u8 = undefined;
for (padded_key, 0..) |b, i| {
outer_pad[i] = b ^ 0x5c;
}
var self = HmacSha512{
.inner = Sha512.init(),
.outer = Sha512.init(),
};
self.inner.update(&inner_pad);
self.outer.update(&outer_pad);
return self;
}
pub fn update(self: *HmacSha512, data: []const u8) void {
self.inner.update(data);
}
pub fn final(self: *HmacSha512) [DIGEST_SIZE]u8 {
const inner_hash = self.inner.final();
self.outer.update(&inner_hash);
return self.outer.final();
}
pub fn hash(key: []const u8, data: []const u8) [DIGEST_SIZE]u8 {
var hmac = init(key);
hmac.update(data);
return hmac.final();
}
};
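The inner/outer construction above follows the standard HMAC definition, which can be cross-checked against Python's standard library (illustrative; key and message are arbitrary):

```python
import hashlib
import hmac

BLOCK_SIZE = 128  # SHA-512 block size

def hmac_sha512(key: bytes, data: bytes) -> bytes:
    """HMAC = H((K ^ opad) || H((K ^ ipad) || m)), as in the sketch above."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha512(key).digest()     # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")       # then zero-padded to a block
    inner = hashlib.sha512(bytes(b ^ 0x36 for b in key) + data).digest()
    return hashlib.sha512(bytes(b ^ 0x5C for b in key) + inner).digest()

# Cross-check the construction against the standard library
msg = b"key derivation data"
assert hmac_sha512(b"Bitcoin seed", msg) == \
    hmac.new(b"Bitcoin seed", msg, hashlib.sha512).digest()
```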
Key Derivation Constants
const KeyDerivation = struct {
/// BIP39 HMAC key
pub const BITCOIN_SEED = "Bitcoin seed";
/// PBKDF2 iterations for BIP39
pub const PBKDF2_ITERATIONS: u32 = 2048;
/// Derived key length in bits (512 bits = 64 bytes)
pub const PBKDF2_KEY_LENGTH: u32 = 512;
};
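These constants plug directly into PBKDF2-HMAC-SHA512, as in BIP39 seed derivation. A minimal sketch (the mnemonic string is an arbitrary illustrative value; BIP39 uses "mnemonic" + passphrase as the salt):

```python
import hashlib

mnemonic = b"example mnemonic words"
passphrase = b""
# BIP39: PBKDF2-HMAC-SHA512, 2048 iterations, 512-bit (64-byte) output
seed = hashlib.pbkdf2_hmac("sha512", mnemonic,
                           b"mnemonic" + passphrase,
                           2048, dklen=64)
assert len(seed) == 64
```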
Box ID Computation
Box IDs are BLAKE2b256 hashes of box content:
pub fn computeBoxId(box_bytes: []const u8) [32]u8 {
return Blake2b256.hash(box_bytes);
}
Summary
This chapter covered the hash functions that provide cryptographic integrity throughout the Sigma protocol:
- BLAKE2b256 is the primary hash function—approximately 3x cheaper than SHA-256 due to its larger block size (128 bytes vs 64 bytes) and optimized design
- SHA-256 is available for external system compatibility (Bitcoin scripts, cross-chain verification)
- Fiat-Shamir challenge generation uses BLAKE2b256 truncated to 192 bits, matching the threshold signature polynomial field size while satisfying the constraint 2^192 < group_order
- Per-item costing calculates hash cost as
base + ceil(input_length / block_size) * per_chunk, accurately reflecting the computational work - HMAC-SHA512 provides key derivation for BIP32/BIP39 wallet compatibility, using the standard "Bitcoin seed" key
- Box IDs are computed as BLAKE2b256 hashes of serialized box content, providing content-addressable identification
Next: Chapter 11: Sigma Protocols
Scala: CryptoFunctions.scala
Rust: hash.rs:5-26
Scala: trees.scala (CalcBlake2b256)
Rust: calc_blake2b256.rs:14-47
Scala: trees.scala (CalcSha256)
Rust: calc_sha256.rs
Scala: CryptoFunctions.scala:hashFn
Rust: fiat_shamir.rs:70-76
Scala: CryptoConstants.scala:70-75
Rust: fiat_shamir.rs:116-203
Scala: HmacSHA512.scala
Chapter 11: Sigma Protocols
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 9 for elliptic curve operations and the discrete logarithm problem
- Chapter 10 for Fiat-Shamir hash generation
- Understanding of zero-knowledge proofs: proving knowledge without revealing secrets
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the three-move Sigma protocol structure (commitment, challenge, response)
- Implement the Schnorr (DLog) protocol for proving knowledge of a discrete logarithm
- Describe the Diffie-Hellman Tuple protocol for proving equality of discrete logs
- Compose protocols using AND, OR, and THRESHOLD operations
- Apply the Fiat-Shamir transformation to convert interactive proofs to non-interactive
Sigma Protocol Structure
Sigma (Σ) protocols are the cryptographic foundation that makes Ergo's smart contracts possible. Named for their characteristic three-move "sigma-shaped" structure, they enable a prover to convince a verifier that they know a secret without revealing anything about that secret—the defining property of zero-knowledge proofs.
A Sigma protocol is a three-move interactive proof12:
Sigma Protocol Flow
─────────────────────────────────────────────────────
Prover (P) Verifier (V)
│ │
│ ──────── a (commitment) ───────> │
│ │
│ <─────── e (challenge) ───────── │
│ │
│ ──────── z (response) ─────────> │
│ │
│ Verify(a, e, z)?
Message Types
/// First message: prover's commitment
const FirstProverMessage = union(enum) {
dlog: FirstDlogProverMessage,
dht: FirstDhtProverMessage,
pub fn bytes(self: FirstProverMessage) []const u8 {
return switch (self) {
.dlog => |m| m.a.serialize(),
.dht => |m| m.a.serialize() ++ m.b.serialize(),
};
}
};
/// Second message: prover's response
const SecondProverMessage = union(enum) {
dlog: SecondDlogProverMessage,
dht: SecondDhtProverMessage,
};
/// Challenge from verifier (192 bits = 24 bytes)
const Challenge = [FiatShamir.SOUNDNESS_BYTES]u8;
Schnorr Protocol (Discrete Log)
Proves knowledge of secret w such that h = g^w34:
Schnorr Protocol Steps
─────────────────────────────────────────────────────
Given: g (generator), h = g^w (public key), w (secret)
Step Message Computation
─────────────────────────────────────────────────────
1. Commit a r ← random, a = g^r
2. Challenge e Verifier sends random e
3. Response z z = r + e·w (mod q)
4. Verify ✓ g^z = a · h^e
Implementation
const DlogProverInput = struct {
/// Secret scalar w in [0, q-1]
w: Scalar,
/// Compute public image h = g^w
pub fn publicImage(self: *const DlogProverInput) ProveDlog {
const g = DlogGroup.generator();
const h = DlogGroup.exponentiate(&g, &self.w);
return ProveDlog{ .h = h };
}
/// Generate random secret
pub fn random(rng: std.rand.Random) DlogProverInput {
return .{ .w = Scalar.random(rng) };
}
};
/// First message: commitment a = g^r
const FirstDlogProverMessage = struct {
a: EcPoint,
pub fn bytes(self: *const FirstDlogProverMessage) [33]u8 {
return GroupElementSerializer.serialize(&self.a);
}
};
/// Second message: response z
const SecondDlogProverMessage = struct {
z: Scalar,
};
Prover Steps
const DlogProver = struct {
/// Step 1: Generate commitment (real proof)
pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, msg: FirstDlogProverMessage } {
const r = Scalar.random(rng);
const g = DlogGroup.generator();
const a = DlogGroup.exponentiate(&g, &r);
return .{ .r = r, .msg = .{ .a = a } };
}
/// Step 3: Compute response z = r + e·w (mod q)
pub fn secondMessage(
private_input: *const DlogProverInput,
r: Scalar,
challenge: *const Challenge,
) SecondDlogProverMessage {
const e = Scalar.fromBytes(challenge);
const ew = e.mul(&private_input.w); // e * w mod q
const z = r.add(&ew); // r + ew mod q
return .{ .z = z };
}
};
Simulation
For OR composition, we need to simulate proofs without knowing the secret5:
/// Simulate transcript without knowing secret
/// Given challenge e, produce valid-looking (a, z)
pub fn simulate(
public_input: *const ProveDlog,
challenge: *const Challenge,
rng: std.rand.Random,
) struct { first: FirstDlogProverMessage, second: SecondDlogProverMessage } {
// SAMPLE random z
const z = Scalar.random(rng);
// COMPUTE a = g^z · h^(-e)
// This satisfies verification equation: g^z = a · h^e
const e = Scalar.fromBytes(challenge);
const minus_e = e.negate();
const g = DlogGroup.generator();
const h = public_input.h;
const g_to_z = DlogGroup.exponentiate(&g, &z);
const h_to_minus_e = DlogGroup.exponentiate(&h, &minus_e);
const a = DlogGroup.multiply(&g_to_z, &h_to_minus_e);
return .{
.first = .{ .a = a },
.second = .{ .z = z },
};
}
Verification (Commitment Reconstruction)
/// Verify: reconstruct a from z and e, check equality
/// g^z = a · h^e => a = g^z / h^e
pub fn computeCommitment(
proposition: *const ProveDlog,
challenge: *const Challenge,
second_msg: *const SecondDlogProverMessage,
) EcPoint {
const g = DlogGroup.generator();
const h = proposition.h;
const e = Scalar.fromBytes(challenge);
const g_to_z = DlogGroup.exponentiate(&g, &second_msg.z);
const h_to_e = DlogGroup.exponentiate(&h, &e);
const h_to_e_inv = DlogGroup.inverse(&h_to_e);
return DlogGroup.multiply(&g_to_z, &h_to_e_inv);
}
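An honest run of the whole protocol can be checked numerically with plain Python integers (illustrative only: the secret, nonce, and challenge are fixed constants here, which would be fatal in a real prover):

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    """Affine point addition; None is the identity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    """Double-and-add scalar multiplication (g^k in multiplicative notation)."""
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

w = 0xC0FFEE                  # secret
h = mul(w, G)                 # public image h = g^w
r = 0xDEADBEEF                # prover's nonce
a = mul(r, G)                 # 1. commitment a = g^r
e = 0x1234567890ABCDEF        # 2. challenge (normally 192 random bits)
z = (r + e * w) % n           # 3. response z = r + e*w (mod q)
assert mul(z, G) == add(a, mul(e, h))   # 4. verify: g^z == a * h^e
```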
Diffie-Hellman Tuple Protocol
Proves knowledge of w such that u = g^w AND v = h^w67:
DHT Protocol: Prove (u, v) share the same discrete log
Given: g, h (generators), u = g^w, v = h^w (public tuple)
Step Message Computation
─────────────────────────────────────────────────────
1. Commit (a, b) r ← random, a = g^r, b = h^r
2. Challenge e Verifier sends random e
3. Response z z = r + e·w (mod q)
4. Verify ✓ g^z = a·u^e AND h^z = b·v^e
Implementation
const ProveDhTuple = struct {
g: EcPoint,
h: EcPoint,
u: EcPoint, // u = g^w
v: EcPoint, // v = h^w
pub const OP_CODE = OpCode.ProveDiffieHellmanTuple;
};
const FirstDhtProverMessage = struct {
a: EcPoint, // a = g^r
b: EcPoint, // b = h^r
pub fn bytes(self: *const FirstDhtProverMessage) [66]u8 {
var result: [66]u8 = undefined;
@memcpy(result[0..33], &GroupElementSerializer.serialize(&self.a));
@memcpy(result[33..66], &GroupElementSerializer.serialize(&self.b));
return result;
}
};
const DhtProver = struct {
pub fn firstMessage(
public_input: *const ProveDhTuple,
rng: std.rand.Random,
) struct { r: Scalar, msg: FirstDhtProverMessage } {
const r = Scalar.random(rng);
const a = DlogGroup.exponentiate(&public_input.g, &r);
const b = DlogGroup.exponentiate(&public_input.h, &r);
return .{ .r = r, .msg = .{ .a = a, .b = b } };
}
};
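The key property—one response z satisfying both verification equations—can be checked numerically as well (illustrative; w, r, e are arbitrary fixed values, and h is just some second non-identity point):

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

h = mul(7, G)                    # a second generator
w, r, e = 42, 99, 5              # secret, nonce, challenge
u, v = mul(w, G), mul(w, h)      # public tuple: same exponent w for both
a, b = mul(r, G), mul(r, h)      # commitment pair (a, b)
z = (r + e * w) % n              # single response
assert mul(z, G) == add(a, mul(e, u))   # g^z = a * u^e
assert mul(z, h) == add(b, mul(e, v))   # h^z = b * v^e
```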
SigmaBoolean Proposition Types
Propositions form a tree structure89:
const SigmaBoolean = union(enum) {
/// Leaf: prove knowledge of discrete log
prove_dlog: ProveDlog,
/// Leaf: prove DHT equality
prove_dh_tuple: ProveDhTuple,
/// Conjunction: all children must be proven
cand: Cand,
/// Disjunction: at least one child proven
cor: Cor,
/// Threshold: k-of-n children proven
cthreshold: Cthreshold,
/// Trivially true
trivial_true: void,
/// Trivially false
trivial_false: void,
pub fn opCode(self: SigmaBoolean) OpCode {
return switch (self) {
.prove_dlog => OpCode.ProveDlog,
.prove_dh_tuple => OpCode.ProveDiffieHellmanTuple,
.cand => OpCode.SigmaAnd,
.cor => OpCode.SigmaOr,
.cthreshold => OpCode.Atleast,
else => OpCode.Constant,
};
}
/// Count nodes in proposition tree
pub fn size(self: SigmaBoolean) usize {
return switch (self) {
.cand => |c| 1 + sumChildSizes(c.children),
.cor => |c| 1 + sumChildSizes(c.children),
.cthreshold => |c| 1 + sumChildSizes(c.children),
else => 1,
};
}
};
const ProveDlog = struct {
h: EcPoint, // Public key h = g^w
pub const OP_CODE = OpCode.ProveDlog;
};
const Cand = struct {
children: []const SigmaBoolean,
};
const Cor = struct {
children: []const SigmaBoolean,
};
const Cthreshold = struct {
k: u8, // Threshold
children: []const SigmaBoolean,
};
Protocol Composition
AND Composition
All children share the same challenge10:
Challenge e
│
┌─────┴─────┐
│ │
σ₁(e) σ₂(e)
real real
/// AND: prove all children with same challenge
fn proveAnd(
children: []const *SigmaBoolean,
secrets: []const PrivateInput,
challenge: *const Challenge,
) ![]ProofNode {
var proofs = try allocator.alloc(ProofNode, children.len);
for (children, secrets, 0..) |child, secret, i| {
proofs[i] = proveReal(child, secret, challenge);
}
return proofs;
}
OR Composition
At least one child is real; others are simulated11:
Challenge e
│
┌─────┴─────┐
│ │
σ₁(e₁) σ₂(e₂)
REAL SIMULATED
Constraint: e₁ ⊕ e₂ = e (XOR)
/// OR: one real proof, rest simulated
/// Challenges must XOR to root challenge
fn proveOr(
children: []const *SigmaBoolean,
real_index: usize,
secret: PrivateInput,
challenge: *const Challenge,
rng: std.rand.Random,
) ![]ProofNode {
var proofs = try allocator.alloc(ProofNode, children.len);
var challenge_sum: Challenge = [_]u8{0} ** FiatShamir.SOUNDNESS_BYTES;
// First: generate simulated proofs with random challenges
for (children, 0..) |child, i| {
if (i != real_index) {
var sim_challenge: Challenge = undefined;
rng.bytes(&sim_challenge);
proofs[i] = simulate(child, &sim_challenge, rng);
xorChallenge(&challenge_sum, &sim_challenge);
}
}
// Derive real challenge: e_real = e ⊕ (sum of simulated challenges)
var real_challenge: Challenge = undefined;
for (0..FiatShamir.SOUNDNESS_BYTES) |i| {
real_challenge[i] = challenge[i] ^ challenge_sum[i];
}
proofs[real_index] = proveReal(children[real_index], secret, &real_challenge);
return proofs;
}
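The XOR bookkeeping can be checked in isolation. Below is a standalone Rust sketch (Rust chosen to echo the sigma-rust reference; every name in it is illustrative, not library API). It verifies the invariant that the real child's challenge, XORed with all simulated challenges, reconstructs the root challenge:

```rust
const SOUNDNESS_BYTES: usize = 24; // 192-bit challenges, as above
type Challenge = [u8; SOUNDNESS_BYTES];

fn xor_into(acc: &mut Challenge, c: &Challenge) {
    for i in 0..SOUNDNESS_BYTES {
        acc[i] ^= c[i];
    }
}

/// Derive the challenge the real child must answer, given the root
/// challenge and the randomly chosen simulated challenges.
fn real_challenge(root: &Challenge, simulated: &[Challenge]) -> Challenge {
    let mut e = *root;
    for c in simulated {
        xor_into(&mut e, c);
    }
    e
}

fn main() {
    let root: Challenge = [0xAB; SOUNDNESS_BYTES];
    let sims = [[0x11; SOUNDNESS_BYTES], [0x22; SOUNDNESS_BYTES]];
    let real = real_challenge(&root, &sims);
    // Verifier-side check: all child challenges XOR back to the root.
    let mut acc = real;
    for c in &sims {
        xor_into(&mut acc, c);
    }
    assert_eq!(acc, root);
}
```

Because XOR is its own inverse, deriving the real challenge and checking the constraint are the same fold over the child challenges.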
THRESHOLD (k-of-n)
Uses polynomial interpolation over GF(2^192) [12]:
Threshold k-of-n Challenge Distribution
─────────────────────────────────────────────────────
- Construct polynomial p(x) of degree n-k
- p(0) = e (root challenge)
- Each child i gets challenge p(i)
- k real children, (n-k) simulated
const GF2_192 = struct {
/// 192 bits = 3 × 64-bit words
words: [3]u64,
pub fn fromChallenge(c: *const Challenge) GF2_192 {
var self = GF2_192{ .words = .{ 0, 0, 0 } };
// Load 24 bytes into 3 words (only 192 bits used)
@memcpy(std.mem.asBytes(&self.words[0])[0..8], c[0..8]);
@memcpy(std.mem.asBytes(&self.words[1])[0..8], c[8..16]);
@memcpy(std.mem.asBytes(&self.words[2])[0..8], c[16..24]);
return self;
}
pub fn add(a: *const GF2_192, b: *const GF2_192) GF2_192 {
// Addition in GF(2^192) is XOR
return .{ .words = .{
a.words[0] ^ b.words[0],
a.words[1] ^ b.words[1],
a.words[2] ^ b.words[2],
} };
}
// Multiplication uses polynomial representation with reduction
// NOTE: This is a stub. Full implementation requires:
// 1. Carry-less multiplication of 192-bit polynomials
// 2. Reduction modulo irreducible polynomial x^192 + x^7 + x^2 + x + 1
// See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192.rs
pub fn mul(a: *const GF2_192, b: *const GF2_192) GF2_192 {
_ = a;
_ = b;
@compileError("GF2_192.mul() not implemented - see reference implementations");
}
};
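Since `mul` above is left as a stub, the one field operation that can be checked immediately is addition, which in any characteristic-2 field is plain XOR. A standalone Rust sketch (illustrative names, mirroring the three-word layout above):

```rust
/// 192-bit field element as three 64-bit words, as in the struct above.
type Gf192 = [u64; 3];

/// Addition in GF(2^192): word-wise XOR, no carries.
fn gf_add(a: &Gf192, b: &Gf192) -> Gf192 {
    [a[0] ^ b[0], a[1] ^ b[1], a[2] ^ b[2]]
}

fn main() {
    let a: Gf192 = [0xDEAD, 0xBEEF, 0x1234];
    let b: Gf192 = [0x1111, 0x2222, 0x3333];
    // Characteristic 2: every element is its own additive inverse.
    assert_eq!(gf_add(&a, &a), [0, 0, 0]);
    // Commutative, with additive identity 0.
    assert_eq!(gf_add(&a, &b), gf_add(&b, &a));
    assert_eq!(gf_add(&a, &[0, 0, 0]), a);
}
```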
const GF2_192_Poly = struct {
coefficients: []GF2_192,
degree: usize,
/// Evaluate polynomial at point x using Horner's method.
/// NOTE: mulByByte (multiply by the field element with byte value x)
/// is assumed here; see the reference implementations.
pub fn evaluate(self: *const GF2_192_Poly, x: u8) GF2_192 {
var result = self.coefficients[self.degree];
var i = self.degree;
while (i > 0) {
i -= 1;
result = GF2_192.add(&GF2_192.mulByByte(&result, x), &self.coefficients[i]);
}
return result;
}
/// Lagrange interpolation through given points
/// NOTE: Stub - full implementation requires GF2_192 arithmetic
/// See sigma-rust: ergotree-interpreter/src/sigma_protocol/gf2_192_poly.rs
pub fn interpolate(
points: []const u8,
values: []const GF2_192,
value_at_0: GF2_192,
) GF2_192_Poly {
// Construct the unique polynomial of degree at most n passing
// through the n given points plus the extra point p(0) = value_at_0
_ = points;
_ = values;
_ = value_at_0;
@compileError("GF2_192_Poly.interpolate() not implemented");
}
};
// NOTE: In production, all scalar operations (add, mul, negate) must be
// constant-time to prevent timing side-channel attacks. See ZIGMA_STYLE.md.
Mathematical correctness of GF(2^192) polynomial interpolation:
The threshold k-of-n scheme uses polynomial interpolation over the finite field GF(2^192):
Field properties: GF(2^192) is a finite field where addition is XOR and multiplication is polynomial multiplication modulo an irreducible polynomial (x^192 + x^7 + x^2 + x + 1). All arithmetic is well-defined and invertible.
Lagrange interpolation: Given any k distinct points (x₁, y₁), ..., (xₖ, yₖ), there exists a unique polynomial p(x) of degree at most k-1 passing through all points. This is constructed using Lagrange basis polynomials.
Challenge distribution: The prover constructs a polynomial of degree n-k with p(0) = root_challenge. Simulated children's challenges are assigned randomly, and the polynomial is interpolated through these points. Real children receive challenges p(i) computed from this polynomial.
Security: The 192-bit field size matches SOUNDNESS_BITS, ensuring that a cheating prover (who knows fewer than k secrets) succeeds with probability at most 2^-192—cryptographically negligible.
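The challenge-distribution idea can be demonstrated end to end in a toy model. The sketch below is Rust, with a small prime field mod 251 standing in for GF(2^192); all values are made up for illustration. It walks through a 2-of-3 threshold: the polynomial has degree n-k = 1, is pinned by p(0) = e plus the one simulated child's random challenge, and the verifier's check is that all challenges lie on that single line:

```rust
const P: i64 = 251; // toy prime field standing in for GF(2^192)

/// Modular inverse via Fermat's little theorem (P is prime).
fn inv(a: i64) -> i64 {
    let mut r = 1i64;
    let mut base = a.rem_euclid(P);
    let mut exp = P - 2;
    while exp > 0 {
        if exp & 1 == 1 {
            r = r * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    r
}

fn main() {
    // 2-of-3: degree n - k = 1 polynomial, fixed by p(0) = e and the
    // simulated child 3's random challenge y3.
    let e: i64 = 200; // root challenge
    let y3: i64 = 77; // random challenge assigned to simulated child 3
    let a1 = (y3 - e).rem_euclid(P) * inv(3) % P; // slope through (0,e),(3,y3)
    let p = |x: i64| (e + a1 * x).rem_euclid(P);
    let (y1, y2) = (p(1), p(2)); // challenges for the two real children
    // Verifier: (0,e), (1,y1), (2,y2), (3,y3) are collinear mod P.
    assert_eq!((y1 - e).rem_euclid(P), a1);
    assert_eq!((y2 - y1).rem_euclid(P), a1);
    assert_eq!((y3 - y2).rem_euclid(P), a1);
}
```

A prover knowing fewer than k secrets would have to fix more than n-k challenges before seeing e, which over-determines the degree-(n-k) polynomial; this is the soundness argument in miniature.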
Proof Trees
Track proof state during proving [13]:
const UnprovenTree = union(enum) {
leaf: UnprovenLeaf,
conjecture: UnprovenConjecture,
};
const UnprovenLeaf = struct {
proposition: SigmaBoolean,
position: NodePosition,
simulated: bool,
commitment_opt: ?FirstProverMessage,
randomness_opt: ?Scalar,
challenge_opt: ?Challenge,
};
const UnprovenConjecture = struct {
conj_type: enum { and_, or_, threshold }, // `and` is a Zig keyword, hence `and_`
children: []UnprovenTree,
position: NodePosition,
simulated: bool,
challenge_opt: ?Challenge,
k: ?u8, // For threshold
polynomial_opt: ?GF2_192_Poly,
};
/// Position in tree: "0-2-1" means root → child 2 → child 1
const NodePosition = struct {
positions: []const usize,
pub fn child(self: NodePosition, idx: usize) NodePosition {
// NOTE: `++` works only on comptime-known slices; a runtime
// implementation would copy into an allocated or fixed buffer.
return .{ .positions = self.positions ++ &[_]usize{idx} };
}
pub const CRYPTO_PREFIX = NodePosition{ .positions = &.{0} };
};
Fiat-Shamir Transformation
Convert interactive to non-interactive by deriving the challenge from a hash [14]:
/// Derive challenge from tree serialization
pub fn fiatShamirChallenge(allocator: Allocator, tree: *const ProofTree) !Challenge {
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
try fiatShamirSerialize(tree, buf.writer());
return FiatShamir.hashFn(buf.items);
}
fn fiatShamirSerialize(tree: *const ProofTree, writer: anytype) !void {
const INTERNAL_PREFIX: u8 = 0;
const LEAF_PREFIX: u8 = 1;
switch (tree.*) {
.leaf => |leaf| {
try writer.writeByte(LEAF_PREFIX);
// Proposition bytes
const prop_bytes = try leaf.proposition.toErgoTreeBytes();
try writer.writeInt(i16, @intCast(prop_bytes.len), .big);
try writer.writeAll(prop_bytes);
// Commitment bytes
const commitment = leaf.commitment_opt orelse return error.NoCommitment;
const comm_bytes = commitment.bytes();
try writer.writeInt(i16, @intCast(comm_bytes.len), .big);
try writer.writeAll(comm_bytes);
},
.conjecture => |conj| {
try writer.writeByte(INTERNAL_PREFIX);
try writer.writeByte(@intFromEnum(conj.conj_type));
if (conj.k) |k| try writer.writeByte(k);
try writer.writeInt(i16, @intCast(conj.children.len), .big);
for (conj.children) |*child| {
try fiatShamirSerialize(child, writer);
}
},
}
}
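The essential properties of the transformation, determinism and sensitivity to every serialized byte, can be sketched with a toy hash. In this Rust sketch, `DefaultHasher` merely stands in for the Blake2b-based `FiatShamir.hashFn`, and the byte strings are made up:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy Fiat-Shamir: the challenge is a hash of the serialized proof tree.
fn fiat_shamir_challenge(serialized_tree: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    serialized_tree.hash(&mut h);
    h.finish()
}

fn main() {
    let tree_a: &[u8] = b"1|prop|commitment";
    let tree_b: &[u8] = b"1|prop|commitment'";
    // Deterministic: the verifier recomputes the identical challenge.
    assert_eq!(fiat_shamir_challenge(tree_a), fiat_shamir_challenge(tree_a));
    // Any change to a commitment yields a different challenge.
    assert_ne!(fiat_shamir_challenge(tree_a), fiat_shamir_challenge(tree_b));
}
```

This is why the serialization above must cover propositions and commitments byte for byte: anything left out could be altered by a cheating prover without affecting the derived challenge.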
Security Properties
Security Properties
─────────────────────────────────────────────────────
Property Meaning
─────────────────────────────────────────────────────
Completeness Honest prover always convinces
Soundness Cheater succeeds with prob ≤ 2^-192
Zero-Knowledge Proof reveals nothing about secret
Special Sound. Two transcripts extract secret
Summary
This chapter covered Sigma protocols—the zero-knowledge proof system that forms the cryptographic core of Ergo's smart contracts:
- Sigma protocols use a three-move structure: the prover sends a commitment, receives a challenge, and responds with a value that proves knowledge without revealing the secret
- Schnorr (DLog) protocol proves knowledge of a discrete logarithm: given h = g^w, prove knowledge of w without revealing it
- Diffie-Hellman Tuple protocol proves equality of discrete logs across different bases: given u = g^w and v = h^w, prove that u and v share the same discrete log
- AND composition applies the same challenge to all children—all must be proven
- OR composition distributes challenges via XOR constraint—only one child needs a real proof, others are simulated
- THRESHOLD (k-of-n) uses GF(2^192) polynomial interpolation to distribute challenges, requiring k real proofs
- Simulation generates valid-looking transcripts without knowing secrets, enabling OR and threshold compositions
- Fiat-Shamir transformation makes interactive protocols non-interactive by deriving the challenge from a hash of the commitments
Next: Chapter 12: Evaluation Model
Scala: SigmaProtocolFunctions.scala
Rust: sigma_protocol.rs
Scala: DLogProtocol.scala
Rust: dlog_protocol.rs:10-47
Rust: dlog_protocol.rs:73-93
Rust: dht_protocol.rs
Scala: SigmaBoolean.scala
Rust: sigma_boolean.rs:31-96
Rust: cand.rs
Rust: cor.rs
Scala: GF2_192_Poly.scala
Rust: unproven_tree.rs
Rust: fiat_shamir.rs
Chapter 12: Evaluation Model
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 4 for the AST structure and Value hierarchy
- Chapter 5 for opcodes and cost descriptors
- Chapter 3 for ErgoTree format and constant segregation
Learning Objectives
By the end of this chapter, you will be able to:
- Explain direct-style big-step interpretation and why it suits ErgoTree evaluation
- Implement `eval` dispatch for AST node types (constants, variables, functions, operations)
- Work with the `Env` environment structure for variable binding and closure capture
- Track accumulated costs during evaluation to enforce resource limits
Evaluation Architecture
The Sigma interpreter transforms an ErgoTree expression into a SigmaBoolean proposition that can be proven or verified. This "reduction" process uses direct-style big-step evaluation—each expression immediately returns its result value rather than producing intermediate steps. This approach is simpler than continuation-passing style while still supporting the necessary features: lexical closures, short-circuit evaluation, and cost tracking [1][2].
Evaluation Flow
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ ErgoTreeEvaluator │
├──────────────────────────────────────────────────┤
│ context: Context (SELF, INPUTS, OUTPUTS) │
│ constants: []Const (segregated constants) │
│ cost_accum: CostAcc (tracks execution cost) │
│ env: Env (variable bindings) │
└───────────────────────┬──────────────────────────┘
│
│ eval(expr)
▼
┌──────────────────────────────────────────────────┐
│ AST Traversal │
│ │
│ Expr.eval(env, ctx) │
│ │ │
│ ├── Evaluate children │
│ ├── Add operation cost │
│ ├── Perform operation │
│ └── Return result Value │
└──────────────────────────────────────────────────┘
Evaluator Structure
const Evaluator = struct {
context: *const Context,
constants: []const Constant,
cost_accum: CostAccumulator,
allocator: Allocator,
pub fn init(
context: *const Context,
constants: []const Constant,
cost_limit: JitCost,
allocator: Allocator,
) Evaluator {
return .{
.context = context,
.constants = constants,
.cost_accum = CostAccumulator.init(cost_limit),
.allocator = allocator,
};
}
/// Evaluate expression in given environment
pub fn eval(self: *Evaluator, env: *const Env, expr: *const Expr) !Value {
return expr.eval(env, self);
}
/// Evaluate to specific type
pub fn evalTo(
self: *Evaluator,
comptime T: type,
env: *const Env,
expr: *const Expr,
) !T {
const result = try self.eval(env, expr);
return result.as(T) orelse error.TypeMismatch;
}
/// Add fixed cost
pub fn addCost(self: *Evaluator, cost: FixedCost, op: OpCode) !void {
try self.cost_accum.add(cost.value, op);
}
/// Add per-item cost
pub fn addSeqCost(self: *Evaluator, cost: PerItemCost, n_items: usize, op: OpCode) !void {
// chunks = ceil(n_items / chunk_size), minimum 1 (matches Chapter 13's PerItemCost)
const n_chunks: i32 = if (n_items == 0) 1 else @intCast((n_items - 1) / cost.chunk_size + 1);
const total = cost.base.value + n_chunks * cost.per_chunk.value;
try self.cost_accum.add(total, op);
}
};
Environment (Variable Binding)
The `Env` maps variable IDs to computed values [3][4]:
const Env = struct {
/// HashMap from variable ID to value
bindings: std.AutoHashMap(u32, Value),
allocator: Allocator,
pub fn init(allocator: Allocator) Env {
return .{
.bindings = std.AutoHashMap(u32, Value).init(allocator),
.allocator = allocator,
};
}
/// Look up variable by ID
pub fn get(self: *const Env, val_id: u32) ?Value {
return self.bindings.get(val_id);
}
/// Create new environment with additional binding
/// NOTE: This implementation clones the HashMap on every extend() call.
/// In production, use a pre-allocated binding stack with O(1) extend/pop:
/// bindings: [MAX_BINDINGS]Binding (pre-allocated)
/// stack_ptr: usize (grows/shrinks without allocation)
/// See ZIGMA_STYLE.md for zero-allocation evaluation patterns.
pub fn extend(self: *const Env, val_id: u32, value: Value) !Env {
var new_env = Env{
.bindings = try self.bindings.clone(),
.allocator = self.allocator,
};
try new_env.bindings.put(val_id, value);
return new_env;
}
/// Create new environment with multiple bindings
pub fn extendMany(self: *const Env, bindings: []const struct { id: u32, val: Value }) !Env {
var new_env = Env{
.bindings = try self.bindings.clone(),
.allocator = self.allocator,
};
for (bindings) |b| {
try new_env.bindings.put(b.id, b.val);
}
return new_env;
}
};
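The functional-update semantics are worth pinning down: `extend()` never mutates the parent environment, so an outer scope still sees its old bindings after an inner scope shadows them. A standalone Rust sketch of just this behavior (illustrative types; values are `i64` for brevity):

```rust
use std::collections::HashMap;

#[derive(Clone, Default)]
struct Env {
    bindings: HashMap<u32, i64>,
}

impl Env {
    fn get(&self, id: u32) -> Option<i64> {
        self.bindings.get(&id).copied()
    }
    /// Functional update: clone, then insert (parent is untouched).
    fn extend(&self, id: u32, v: i64) -> Env {
        let mut e = self.clone();
        e.bindings.insert(id, v);
        e
    }
}

fn main() {
    let outer = Env::default().extend(1, 10);
    let inner = outer.extend(2, 20).extend(1, 99); // shadows id 1
    assert_eq!(outer.get(1), Some(10)); // outer scope unchanged
    assert_eq!(inner.get(1), Some(99));
    assert_eq!(inner.get(2), Some(20));
    assert_eq!(outer.get(2), None);
}
```

As the note in the Zig code says, cloning per `extend()` is the simple semantics-first version; a production evaluator would use a binding stack with O(1) push/pop.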
Expression Dispatch
Each expression type implements `eval` [5][6]:
const Expr = union(enum) {
constant: Constant,
const_placeholder: ConstantPlaceholder,
val_use: ValUse,
block_value: BlockValue,
func_value: FuncValue,
apply: Apply,
if_op: If,
bin_op: BinOp,
// ... other expression types
/// Evaluate expression recursively
/// NOTE: This recursive approach is clear for learning but uses the call
/// stack. In production, use an explicit work stack to:
/// 1. Guarantee bounded stack depth (no stack overflow)
/// 2. Enable O(1) reset between transactions
/// See ZIGMA_STYLE.md for iterative evaluation patterns.
pub fn eval(self: *const Expr, env: *const Env, E: *Evaluator) !Value {
return switch (self.*) {
.constant => |c| c.eval(env, E),
.const_placeholder => |cp| cp.eval(env, E),
.val_use => |vu| vu.eval(env, E),
.block_value => |bv| bv.eval(env, E),
.func_value => |fv| fv.eval(env, E),
.apply => |a| a.eval(env, E),
.if_op => |i| i.eval(env, E),
.bin_op => |b| b.eval(env, E),
// ... dispatch to other eval implementations
};
}
};
Constant Evaluation
Constants return their value with fixed cost [7]:
const Constant = struct {
tpe: SType,
value: Literal,
pub const COST = FixedCost{ .value = 5 };
pub fn eval(self: *const Constant, _: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.Constant);
return Value.fromLiteral(self.value);
}
};
const ConstantPlaceholder = struct {
index: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 1 };
pub fn eval(self: *const ConstantPlaceholder, _: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.ConstantPlaceholder);
if (self.index >= E.constants.len) {
return error.IndexOutOfBounds;
}
const c = E.constants[self.index];
return Value.fromLiteral(c.value);
}
};
Variable Access
`ValUse` looks up variables in the environment [8]:
const ValUse = struct {
val_id: u32,
tpe: SType,
pub const COST = FixedCost{ .value = 5 };
pub fn eval(self: *const ValUse, env: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.ValUse);
return env.get(self.val_id) orelse error.UndefinedVariable;
}
};
Block Evaluation
Blocks introduce variable bindings [9]:
const BlockValue = struct {
items: []const ValDef,
result: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 1 },
.per_chunk = JitCost{ .value = 1 },
.chunk_size = 1,
};
pub fn eval(self: *const BlockValue, env: *const Env, E: *Evaluator) !Value {
try E.addSeqCost(COST, self.items.len, OpCode.BlockValue);
var cur_env = env.*;
for (self.items) |item| {
// Evaluate right-hand side
const rhs_val = try item.rhs.eval(&cur_env, E);
// Extend environment with new binding
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.FuncValue);
cur_env = try cur_env.extend(item.id, rhs_val);
}
// Evaluate result in extended environment
return self.result.eval(&cur_env, E);
}
};
const ValDef = struct {
id: u32,
tpe: SType,
rhs: *const Expr,
};
Lambda Functions
`FuncValue` creates closures [10]:
const FuncValue = struct {
args: []const FuncArg,
body: *const Expr,
pub const COST = FixedCost{ .value = 10 };
pub const ADD_TO_ENV_COST = FixedCost{ .value = 5 };
pub fn eval(self: *const FuncValue, env: *const Env, E: *Evaluator) !Value {
try E.addCost(COST, OpCode.FuncValue);
// Create closure capturing current environment
return Value{
.closure = .{
.captured_env = env.*,
.args = self.args,
.body = self.body,
},
};
}
};
const FuncArg = struct {
id: u32,
tpe: SType,
};
const Apply = struct {
func: *const Expr,
args: *const Expr,
pub fn eval(self: *const Apply, env: *const Env, E: *Evaluator) !Value {
// Evaluate function
const func_val = try self.func.eval(env, E);
const closure = func_val.closure;
// Evaluate argument
const arg_val = try self.args.eval(env, E);
// Extend closure's captured env with argument binding
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Apply);
var new_env = try closure.captured_env.extend(closure.args[0].id, arg_val);
// Evaluate body in new environment
return closure.body.eval(&new_env, E);
}
};
Conditional Evaluation
`If` uses short-circuit semantics [11]:
const If = struct {
condition: *const Expr,
true_branch: *const Expr,
false_branch: *const Expr,
pub const COST = FixedCost{ .value = 10 };
pub fn eval(self: *const If, env: *const Env, E: *Evaluator) !Value {
// Evaluate condition
const cond = try E.evalTo(bool, env, self.condition);
try E.addCost(COST, OpCode.If);
// Only evaluate taken branch (short-circuit)
if (cond) {
return self.true_branch.eval(env, E);
} else {
return self.false_branch.eval(env, E);
}
}
};
Collection Operations
Map, filter, and fold evaluate with per-item costs [12]:
const Map = struct {
input: *const Expr,
mapper: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
pub fn eval(self: *const Map, env: *const Env, E: *Evaluator) !Value {
const input_coll = try E.evalTo(Collection, env, self.input);
const mapper_fn = try E.evalTo(Closure, env, self.mapper);
try E.addSeqCost(COST, input_coll.len, OpCode.Map);
var result = try E.allocator.alloc(Value, input_coll.len);
for (input_coll.items, 0..) |item, i| {
// Apply mapper to each element
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Map);
var fn_env = try mapper_fn.captured_env.extend(mapper_fn.args[0].id, item);
result[i] = try mapper_fn.body.eval(&fn_env, E);
}
return Value{ .coll = .{ .items = result } };
}
};
const Fold = struct {
input: *const Expr,
zero: *const Expr,
folder: *const Expr,
pub const COST = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
pub fn eval(self: *const Fold, env: *const Env, E: *Evaluator) !Value {
const input_coll = try E.evalTo(Collection, env, self.input);
const zero_val = try self.zero.eval(env, E);
const folder_fn = try E.evalTo(Closure, env, self.folder);
try E.addSeqCost(COST, input_coll.len, OpCode.Fold);
var accum = zero_val;
for (input_coll.items) |item| {
// folder takes (accum, item)
const tuple = Value{ .tuple = .{ accum, item } };
try E.addCost(FuncValue.ADD_TO_ENV_COST, OpCode.Fold);
var fn_env = try folder_fn.captured_env.extend(folder_fn.args[0].id, tuple);
accum = try folder_fn.body.eval(&fn_env, E);
}
return accum;
}
};
Binary Operations
const BinOp = struct {
kind: Kind,
left: *const Expr,
right: *const Expr,
const Kind = enum {
plus, minus, multiply, divide, modulo,
gt, ge, lt, le, eq, neq,
bin_and, bin_or, bin_xor,
};
pub fn eval(self: *const BinOp, env: *const Env, E: *Evaluator) !Value {
const left_val = try self.left.eval(env, E);
const right_val = try self.right.eval(env, E);
return switch (self.kind) {
.plus => try evalPlus(left_val, right_val, E),
.minus => try evalMinus(left_val, right_val, E),
.gt => try evalGt(left_val, right_val, E),
// ... other operations
};
}
fn evalPlus(left: Value, right: Value, E: *Evaluator) !Value {
try E.addCost(ArithOp.PLUS_COST, OpCode.Plus);
return switch (left) {
.int => |l| Value{ .int = try std.math.add(i32, l, right.int) },
.long => |l| Value{ .long = try std.math.add(i64, l, right.long) },
else => error.TypeMismatch,
};
}
};
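The `std.math.add` calls above fail on overflow rather than wrapping. The same semantics in a standalone Rust sketch (hypothetical `Value`/`eval_plus` names; Rust's `checked_add` plays the role of `std.math.add`):

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Int(i32),
    Long(i64),
}

/// Overflow-checked addition, as in evalPlus: overflow is an error,
/// never silent wraparound.
fn eval_plus(l: &Value, r: &Value) -> Result<Value, &'static str> {
    match (l, r) {
        (Value::Int(a), Value::Int(b)) => {
            a.checked_add(*b).map(Value::Int).ok_or("ArithmeticOverflow")
        }
        (Value::Long(a), Value::Long(b)) => {
            a.checked_add(*b).map(Value::Long).ok_or("ArithmeticOverflow")
        }
        _ => Err("TypeMismatch"),
    }
}

fn main() {
    assert_eq!(eval_plus(&Value::Int(2), &Value::Int(3)), Ok(Value::Int(5)));
    assert_eq!(
        eval_plus(&Value::Int(i32::MAX), &Value::Int(1)),
        Err("ArithmeticOverflow")
    );
    assert_eq!(eval_plus(&Value::Int(1), &Value::Long(1)), Err("TypeMismatch"));
}
```

Deterministic failure on overflow matters for consensus: every node must reject the same script for the same reason.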
Top-Level Evaluation
Reduce ErgoTree to SigmaBoolean:
pub fn reduceToSigmaBoolean(
ergo_tree: *const ErgoTree,
context: *const Context,
cost_limit: JitCost,
allocator: Allocator,
) !struct { prop: SigmaBoolean, cost: JitCost } {
var evaluator = Evaluator.init(
context,
ergo_tree.constants,
cost_limit,
allocator,
);
const empty_env = Env.init(allocator);
const result = try evaluator.eval(&empty_env, ergo_tree.root);
const sigma_prop = result.asSigmaProp() orelse
return error.NotSigmaProp;
return .{
.prop = sigma_prop.sigma_boolean,
.cost = evaluator.cost_accum.totalCost(),
};
}
Summary
This chapter covered the evaluation model that transforms ErgoTree expressions into SigmaBoolean propositions:
- Direct-style big-step interpretation evaluates expressions recursively, with each node immediately returning its result value
- `Env` maps variable IDs to values using immutable functional updates—each `extend()` creates a new environment with additional bindings
- Each AST node implements an `eval()` method that returns a `Value` and accumulates execution cost
- `BlockValue` extends the environment with `ValDef` bindings, enabling local variable definitions
- `FuncValue` creates closures that capture the current environment, enabling lexical scoping
- `If` implements short-circuit evaluation—only the taken branch is evaluated, reducing unnecessary computation and cost
- Collection operations (`Map`, `Filter`, `Fold`) have per-item costs reflecting their iteration over elements
- Top-level reduction produces a `SigmaBoolean` proposition that the prover/verifier can then handle cryptographically
Next: Chapter 13: Cost Model
Scala: CErgoTreeEvaluator.scala
Rust: eval.rs:1-100
Scala: ErgoTreeEvaluator.scala (DataEnv)
Rust: env.rs
Scala: values.scala (eval methods)
Rust: expr.rs:14-100
Scala: values.scala (ConstantNode.eval)
Rust: val_use.rs
Rust: block.rs
Rust: func_value.rs
Rust: if_op.rs
Rust: coll_map.rs, coll_fold.rs
Chapter 13: Cost Model
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 12 for the evaluation architecture and how costs are accumulated during eval
- Chapter 5 for operation categories and cost descriptor types
- Basic computational complexity: understanding of constant-time vs linear-time operations
Learning Objectives
By the end of this chapter, you will be able to:
- Explain JitCost scaling (10x) and conversion to/from block costs
- Apply the three cost descriptor types: `FixedCost`, `PerItemCost`, and `TypeBasedCost`
- Implement cost accumulation with limit enforcement to prevent denial-of-service attacks
- Use cost tracing to analyze script execution costs
Cost Model Purpose
Unlike Turing-complete smart contract platforms that can enter infinite loops, ErgoTree scripts must terminate within bounded resources. The cost model assigns a computational cost to every operation, accumulating these costs during evaluation. If the accumulated cost exceeds the block limit, execution fails—this guarantees that all scripts terminate and prevents attackers from crafting expensive scripts that slow down block validation.
ErgoTree scripts execute in a resource-constrained environment [1][2]:
Cost Model Guarantees
─────────────────────────────────────────────────────
1. DoS Protection Expensive scripts blocked
2. Predictable Time Miners estimate validation
3. Fair Pricing Users pay for resources
4. Bounded Verify All scripts terminate
JitCost: The Cost Unit
JitCost provides 10x finer granularity than block costs [3]:
const JitCost = struct {
value: i32,
pub const SCALE_FACTOR: i32 = 10;
/// Add with overflow protection
pub fn add(self: JitCost, other: JitCost) !JitCost {
const result = @addWithOverflow(self.value, other.value);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Multiply with overflow protection
pub fn mul(self: JitCost, n: i32) !JitCost {
const result = @mulWithOverflow(self.value, n);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Divide by integer
pub fn div(self: JitCost, n: i32) JitCost {
return .{ .value = @divTrunc(self.value, n) };
}
/// Convert to block cost (truncating division; fromBlockCost is the
/// exact inverse only for multiples of SCALE_FACTOR)
pub fn toBlockCost(self: JitCost) i32 {
return @divTrunc(self.value, SCALE_FACTOR);
}
/// Create from block cost
pub fn fromBlockCost(block_cost: i32) !JitCost {
const result = @mulWithOverflow(block_cost, SCALE_FACTOR);
if (result[1] != 0) return error.CostOverflow;
return .{ .value = result[0] };
}
/// Comparison
pub fn gt(self: JitCost, other: JitCost) bool {
return self.value > other.value;
}
};
Cost Scaling
Cost Scales
─────────────────────────────────────────────────────
JitCost (internal) ─────────────────> Block Cost
÷ 10
Example:
JitCost(50) ──────────────────────> 5 block units
JitCost(123) ──────────────────────> 12 block units
Block Cost (external) ─────────────────> JitCost
                        × 10
The 10x scaling provides:
- Finer granularity for internal calculations
- Integer arithmetic (no floating point)
- Overflow protection via checked operations
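The scaling arithmetic is small enough to check directly. A Rust sketch of the two conversions (illustrative free functions; `checked_mul` mirrors the overflow-checked `fromBlockCost`):

```rust
const SCALE_FACTOR: i32 = 10;

/// JitCost -> block cost: truncating division.
fn to_block_cost(jit: i32) -> i32 {
    jit / SCALE_FACTOR
}

/// Block cost -> JitCost: checked multiplication (None on overflow).
fn from_block_cost(block: i32) -> Option<i32> {
    block.checked_mul(SCALE_FACTOR)
}

fn main() {
    assert_eq!(to_block_cost(50), 5);
    assert_eq!(to_block_cost(123), 12); // truncation: 123 / 10 = 12
    assert_eq!(from_block_cost(12), Some(120));
    assert_eq!(from_block_cost(i32::MAX), None); // overflow rejected
}
```

Note the round trip is lossy: JitCost(123) becomes 12 block units, which converts back to JitCost(120).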
Cost Kind Descriptors
Cost descriptors define how operations are costed [4][5]:
const CostKind = union(enum) {
fixed: FixedCost,
per_item: PerItemCost,
type_based: TypeBasedCost,
dynamic: void,
};
/// Constant time operations
const FixedCost = struct {
cost: JitCost,
};
/// Linear operations with chunking
const PerItemCost = struct {
base_cost: JitCost,
per_chunk_cost: JitCost,
chunk_size: u32,
/// Compute number of chunks for n items
pub fn chunks(self: PerItemCost, n_items: usize) usize {
if (n_items == 0) return 1;
return (n_items - 1) / self.chunk_size + 1;
}
/// Compute total cost for n items
pub fn cost(self: PerItemCost, n_items: usize) !JitCost {
const n_chunks = self.chunks(n_items);
const chunk_cost = try self.per_chunk_cost.mul(@intCast(n_chunks));
return self.base_cost.add(chunk_cost);
}
};
/// Type-dependent operations
const TypeBasedCost = struct {
cost_fn: *const fn (SType) JitCost,
};
FixedCost Operations
| Operation | Cost | Description |
|---|---|---|
| Constant | 5 | Return constant value |
| ConstantPlaceholder | 1 | Lookup segregated constant |
| ValUse | 5 | Variable lookup |
| If | 10 | Conditional branch |
| SelectField | 10 | Tuple field access |
| SizeOf | 14 | Get collection size |
PerItemCost Operations
| Operation | Base | Per Chunk | Chunk Size |
|---|---|---|---|
| blake2b256 | 20 | 7 | 128 bytes |
| sha256 | 80 | 8 | 64 bytes |
| Append | 20 | 2 | 10 items |
| Filter | 20 | 2 | 10 items |
| Map | 20 | 2 | 10 items |
| Fold | 20 | 2 | 10 items |
Cost Formula
PerItemCost Formula
─────────────────────────────────────────────────────
total = baseCost + ceil(nItems / chunkSize) × perChunkCost
Example: Map over 50 elements
chunks = ceil(50 / 10) = 5
cost = 20 + 5 × 2 = 30 JitCost units
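The formula can be verified executably. A Rust sketch of `PerItemCost` (field names follow the Zig struct above; the assertions reproduce the Map example here and the blake2b256 table entry):

```rust
struct PerItemCost {
    base_cost: u64,
    per_chunk_cost: u64,
    chunk_size: u64,
}

impl PerItemCost {
    /// ceil(n_items / chunk_size), with a minimum of one chunk.
    fn chunks(&self, n_items: u64) -> u64 {
        if n_items == 0 { 1 } else { (n_items - 1) / self.chunk_size + 1 }
    }
    fn cost(&self, n_items: u64) -> u64 {
        self.base_cost + self.chunks(n_items) * self.per_chunk_cost
    }
}

fn main() {
    let map = PerItemCost { base_cost: 20, per_chunk_cost: 2, chunk_size: 10 };
    assert_eq!(map.chunks(50), 5);
    assert_eq!(map.cost(50), 30); // 20 + 5 × 2, as in the example above
    let blake = PerItemCost { base_cost: 20, per_chunk_cost: 7, chunk_size: 128 };
    assert_eq!(blake.cost(256), 34); // 20 + 2 × 7
}
```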
Type-Based Costs
Operations with type-dependent complexity [6]:
/// Numeric cast cost depends on target type
const NumericCastCost = struct {
pub fn costFunc(target_type: SType) JitCost {
return switch (target_type) {
.s_big_int, .s_unsigned_big_int => .{ .value = 30 },
else => .{ .value = 10 }, // Byte, Short, Int, Long
};
}
};
/// Equality cost depends on operand types
const EqualityCost = struct {
pub fn costFunc(tpe: SType) JitCost {
return switch (tpe) {
.s_byte, .s_short, .s_int, .s_long => .{ .value = 3 },
.s_big_int => .{ .value = 6 },
.s_group_element => .{ .value = 172 },
.s_coll => |elem| blk: {
// Recursive: base + per-element
const elem_cost = costFunc(elem.*);
break :blk .{ .value = 10 + elem_cost.value };
},
else => .{ .value = 10 },
};
}
};
Cost Items: Tracing
Cost items record individual contributions for debugging [7][8]:
const CostItem = union(enum) {
fixed: FixedCostItem,
seq: SeqCostItem,
type_based: TypeBasedCostItem,
pub fn opName(self: CostItem) []const u8 {
return switch (self) {
.fixed => |f| f.op_desc.name,
.seq => |s| s.op_desc.name,
.type_based => |t| t.op_desc.name,
};
}
pub fn cost(self: CostItem) JitCost {
return switch (self) {
.fixed => |f| f.cost_kind.cost,
.seq => |s| s.cost_kind.cost(s.n_items) catch .{ .value = 0 },
.type_based => |t| t.cost_kind.cost_fn(t.tpe),
};
}
};
const FixedCostItem = struct {
op_desc: OperationDesc,
cost_kind: FixedCost,
};
const SeqCostItem = struct {
op_desc: OperationDesc,
cost_kind: PerItemCost,
n_items: usize,
pub fn chunks(self: SeqCostItem) usize {
return self.cost_kind.chunks(self.n_items);
}
};
const TypeBasedCostItem = struct {
op_desc: OperationDesc,
cost_kind: TypeBasedCost,
tpe: SType,
};
Cost Accumulator
Tracks costs during evaluation with limit enforcement [9][10]:
const CostCounter = struct {
initial_cost: JitCost,
current_cost: JitCost,
pub fn init(initial: JitCost) CostCounter {
return .{
.initial_cost = initial,
.current_cost = initial,
};
}
pub fn add(self: *CostCounter, cost: JitCost) !void {
self.current_cost = try self.current_cost.add(cost);
}
pub fn reset(self: *CostCounter) void {
self.current_cost = self.initial_cost;
}
};
const CostAccumulator = struct {
scope_stack: std.ArrayList(Scope),
cost_limit: ?JitCost,
allocator: Allocator,
const Scope = struct {
counter: CostCounter,
child_result: i32 = 0,
pub fn add(self: *Scope, cost: JitCost) !void {
try self.counter.add(cost);
}
pub fn currentCost(self: *const Scope) JitCost {
return self.counter.current_cost;
}
};
pub fn init(
allocator: Allocator,
initial_cost: JitCost,
cost_limit: ?JitCost,
) CostAccumulator {
var stack = std.ArrayList(Scope).init(allocator);
stack.append(.{ .counter = CostCounter.init(initial_cost) }) catch unreachable;
return .{
.scope_stack = stack,
.cost_limit = cost_limit,
.allocator = allocator,
};
}
pub fn currentScope(self: *CostAccumulator) *Scope {
return &self.scope_stack.items[self.scope_stack.items.len - 1];
}
/// Add cost, checking limit
pub fn add(self: *CostAccumulator, cost: JitCost) !void {
try self.currentScope().add(cost);
if (self.cost_limit) |limit| {
const accumulated = self.currentScope().currentCost();
if (accumulated.gt(limit)) {
return error.CostLimitExceeded;
}
}
}
/// Total accumulated cost
pub fn totalCost(self: *const CostAccumulator) JitCost {
return self.scope_stack.items[self.scope_stack.items.len - 1].counter.current_cost;
}
pub fn reset(self: *CostAccumulator) void {
self.scope_stack.clearRetainingCapacity();
self.scope_stack.append(.{
.counter = CostCounter.init(.{ .value = 0 }),
}) catch unreachable;
}
};
Cost Limit Enforcement
Cost Accumulation Flow
─────────────────────────────────────────────────────
Each operation:
1. Compute operation cost
2. Call accumulator.add(opCost)
3. Check: accumulatedCost > limit?
Yes → return CostLimitExceeded
No → continue execution
At the end:
totalCost = accumulator.totalCost()
blockCost = totalCost.toBlockCost()
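The flow above fits in a few lines. A minimal Rust sketch of an accumulator with limit enforcement (illustrative; costs are plain `u64` and errors are strings, whereas the real accumulator uses `JitCost` and typed errors):

```rust
struct CostAccumulator {
    current: u64,
    limit: Option<u64>,
}

impl CostAccumulator {
    /// Add a cost, failing on overflow or when the limit is exceeded.
    fn add(&mut self, cost: u64) -> Result<(), &'static str> {
        self.current = self.current.checked_add(cost).ok_or("CostOverflow")?;
        if let Some(limit) = self.limit {
            if self.current > limit {
                return Err("CostLimitExceeded");
            }
        }
        Ok(())
    }
}

fn main() {
    let mut acc = CostAccumulator { current: 0, limit: Some(100) };
    assert!(acc.add(60).is_ok());
    assert!(acc.add(40).is_ok()); // landing exactly on the limit is allowed
    assert_eq!(acc.add(1), Err("CostLimitExceeded"));
}
```

Note the check is strictly greater-than, matching the `gt` comparison used when the accumulator checks its limit.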
Evaluator Cost Methods
The evaluator provides methods to add costs [11][12]:
const Evaluator = struct {
cost_accum: CostAccumulator,
cost_trace: ?std.ArrayList(CostItem),
profiler: ?*Profiler,
// ... other fields
/// Add fixed cost
pub fn addCost(self: *Evaluator, cost_kind: FixedCost, op_desc: OperationDesc) !void {
try self.cost_accum.add(cost_kind.cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.fixed = .{ .op_desc = op_desc, .cost_kind = cost_kind },
});
}
}
/// Add fixed cost and execute block
pub fn addFixedCost(
self: *Evaluator,
comptime T: type,
cost_kind: FixedCost,
op_desc: OperationDesc,
block: *const fn (*Evaluator) anyerror!T,
) !T {
if (self.profiler) |prof| {
const start = std.time.nanoTimestamp();
try self.cost_accum.add(cost_kind.cost);
const result = try block(self);
const end = std.time.nanoTimestamp();
prof.addTiming(op_desc, end - start);
return result;
} else {
try self.cost_accum.add(cost_kind.cost);
return block(self);
}
}
/// Add per-item cost for known count
pub fn addSeqCost(
self: *Evaluator,
cost_kind: PerItemCost,
n_items: usize,
op_desc: OperationDesc,
) !void {
const cost = try cost_kind.cost(n_items);
try self.cost_accum.add(cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.seq = .{
.op_desc = op_desc,
.cost_kind = cost_kind,
.n_items = n_items,
},
});
}
}
/// Add type-based cost
pub fn addTypeBasedCost(
self: *Evaluator,
cost_kind: TypeBasedCost,
tpe: SType,
op_desc: OperationDesc,
) !void {
const cost = cost_kind.cost_fn(tpe);
try self.cost_accum.add(cost);
if (self.cost_trace) |*trace| {
try trace.append(.{
.type_based = .{
.op_desc = op_desc,
.cost_kind = cost_kind,
.tpe = tpe,
},
});
}
}
};
PowHit (Autolykos2) Cost
Special cost computation for Autolykos2 mining[13]:
const PowHitCost = struct {
/// Cost of custom Autolykos2 hash function
pub fn cost(
k: u32, // k-sum problem inputs
msg: []const u8, // message to hash
nonce: []const u8, // padding for PoW output
h: []const u8, // block height padding
) JitCost {
const chunk_size = CalcBlake2b256.COST.chunk_size;
const per_chunk = CalcBlake2b256.COST.per_chunk_cost.value;
const base_cost: i32 = 500;
// The heaviest part: k + 1 Blake2b256 invocations
const input_len = msg.len + nonce.len + h.len;
const chunks_per_hash = input_len / chunk_size + 1;
const total_cost = base_cost + @as(i32, @intCast(k + 1)) *
@as(i32, @intCast(chunks_per_hash)) * per_chunk;
return .{ .value = total_cost };
}
};
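Walking the formula with concrete (illustrative) input sizes, using the CalcBlake2b256 constants from this chapter (chunk_size=128, per_chunk_cost=7) and the base cost of 500:

```python
# Worked example of the PowHit cost formula above. Constants come from
# CalcBlake2b256 in this chapter; the input sizes are illustrative.

CHUNK_SIZE = 128
PER_CHUNK = 7
BASE_COST = 500

def pow_hit_cost(k: int, msg_len: int, nonce_len: int, h_len: int) -> int:
    input_len = msg_len + nonce_len + h_len
    chunks_per_hash = input_len // CHUNK_SIZE + 1
    # The heaviest part: k + 1 Blake2b256 invocations
    return BASE_COST + (k + 1) * chunks_per_hash * PER_CHUNK

# k=32, 32-byte message, 8-byte nonce, 4-byte height padding:
# input_len = 44 -> 1 chunk -> 500 + 33 * 1 * 7 = 731
print(pow_hit_cost(32, 32, 8, 4))  # 731
```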
Operation Cost Constants
Defined in operation companion structs:
const Constant = struct {
pub const COST = FixedCost{ .cost = .{ .value = 5 } };
// ...
};
const ValUse = struct {
pub const COST = FixedCost{ .cost = .{ .value = 5 } };
// ...
};
const If = struct {
pub const COST = FixedCost{ .cost = .{ .value = 10 } };
// ...
};
const MapCollection = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 20 },
.per_chunk_cost = .{ .value = 2 },
.chunk_size = 10,
};
// ...
};
const CalcBlake2b256 = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 20 },
.per_chunk_cost = .{ .value = 7 },
.chunk_size = 128,
};
// ...
};
const CalcSha256 = struct {
pub const COST = PerItemCost{
.base_cost = .{ .value = 80 },
.per_chunk_cost = .{ .value = 8 },
.chunk_size = 64,
};
// ...
};
Cost Tracing Output
Example trace from evaluating a script:
Cost Trace
─────────────────────────────────────────────────────
Constant : 5
ValUse : 5
ByIndex : 30
Constant : 5
MapCollection[10] : 22 (base=20, chunks=1)
Filter[5] : 22 (base=20, chunks=1)
blake2b256[256] : 34 (base=20, chunks=2)
─────────────────────────────────────────────────────
Total JitCost : 123
Block Cost : 12
Complete Evaluation with Costing
pub fn evaluateWithCost(
ergo_tree: *const ErgoTree,
context: *const Context,
cost_limit: JitCost,
allocator: Allocator,
) !struct { result: SigmaBoolean, cost: JitCost } {
const cost_accum = CostAccumulator.init(
allocator,
.{ .value = 0 },
cost_limit,
);
var evaluator = Evaluator{
.context = context,
.constants = ergo_tree.constants,
.cost_accum = cost_accum,
.cost_trace = null,
.profiler = null,
.allocator = allocator,
};
const empty_env = Env.init(allocator);
const result = try evaluator.eval(&empty_env, ergo_tree.root);
const sigma_prop = result.asSigmaProp() orelse
return error.NotSigmaProp;
return .{
.result = sigma_prop.sigma_boolean,
.cost = evaluator.cost_accum.totalCost(),
};
}
Summary
This chapter covered the cost model that ensures all ErgoTree scripts terminate within bounded resources:
- JitCost uses 10x scaling from block costs, providing finer granularity for internal calculations while maintaining integer arithmetic without floating point
- FixedCost applies to constant-time operations like variable access (cost = 5) and conditionals (cost = 10)
- PerItemCost models operations that scale with input size using the formula baseCost + ceil(n/chunkSize) × perChunkCost; this applies to collection operations and hash functions
- TypeBasedCost handles operations whose cost depends on operand type; BigInt operations are more expensive than primitive integer operations
- CostAccumulator tracks accumulated costs during evaluation and checks against the limit after each operation; exceeding the limit immediately fails evaluation
- CostItem types (FixedCostItem, SeqCostItem, TypeBasedCostItem) enable detailed cost tracing for debugging and optimization
- The PowHit cost function handles the special case of Autolykos2 mining operations
Next: Chapter 14: Verifier Implementation
2. Scala: JitCost.scala:3-7
3. Rust: cost_accum.rs:1-12
4. Scala: JitCost.scala:9-36
5. Scala: CostKind.scala:10-55
6. Rust: costs.rs:1-24
7. Scala: CostKind.scala:60-66
8. Scala: CostItem.scala:3-78
9. Rust: cost_accum.rs:13-17
10. Scala: CostAccumulator.scala:7-79
11. Rust: cost_accum.rs:19-43
12. Rust: eval.rs:130-160
13. Scala: CostKind.scala:71-88
Chapter 14: Verifier Implementation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 11 for Sigma protocol verification and Fiat-Shamir transformation
- Chapter 12 for ErgoTree reduction to SigmaBoolean
- Chapter 13 for cost accumulation during verification
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the complete verification flow from ErgoTree to boolean result
- Implement verify() and fullReduction() methods
- Handle soft-fork conditions gracefully to maintain network compatibility
- Verify cryptographic signatures using Fiat-Shamir commitment reconstruction
- Estimate verification cost before performing expensive cryptographic operations
Verification Overview
Verification is the counterpart to proving: given an ErgoTree, a transaction context, and a cryptographic proof, the verifier determines whether the proof is valid. This process happens for every input box in every transaction—efficient verification is critical for blockchain throughput.
The verification proceeds in two phases: first reduce the ErgoTree to a SigmaBoolean proposition (using the evaluator from Chapter 12), then verify that the cryptographic proof satisfies that proposition[1][2].
Verification Pipeline
─────────────────────────────────────────────────────
Input: ErgoTree + Context + Proof + Message
┌──────────────────────────────────────────────────┐
│ 1. REDUCTION PHASE │
│ │
│ ErgoTree ────> propositionFromErgoTree() │
│ │ │
│ ▼ │
│ SigmaPropValue │
│ │ │
│ ▼ │
│ fullReduction() │
│ │ │
│ ▼ │
│ SigmaBoolean + Cost │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ 2. VERIFICATION PHASE │
│ │
│ TrueProp ────> return (true, cost) │
│ FalseProp ────> return (false, cost) │
│ │
│ Otherwise: │
│ estimateCryptoVerifyCost() │
│ │ │
│ ▼ │
│ verifySignature() ────> boolean result │
└──────────────────────────────────────────────────┘
Output: (verified: bool, total_cost: u64)
Verification Result
const VerificationResult = struct {
/// Result of SigmaProp verification
result: bool,
/// Estimated cost of contract execution
cost: u64,
/// Diagnostic information
diag: ReductionDiagnosticInfo,
};
const ReductionResult = struct {
/// SigmaBoolean proposition
sigma_prop: SigmaBoolean,
/// Accumulated cost (block scale)
cost: u64,
/// Diagnostic info
diag: ReductionDiagnosticInfo,
};
const ReductionDiagnosticInfo = struct {
/// Environment after evaluation
env: Env,
/// Pretty-printed expression
pretty_printed_expr: ?[]const u8,
};
Verifier Trait
The base verifier interface[3][4]:
const Verifier = struct {
const Self = @This();
/// Cost per byte for deserialization
pub const COST_PER_BYTE_DESERIALIZED: i32 = 2;
/// Cost per tree byte for substitution
pub const COST_PER_TREE_BYTE: i32 = 2;
/// Verify an ErgoTree in context with proof
pub fn verify(
self: *const Self,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: ProofBytes,
message: []const u8,
) VerifierError!VerificationResult {
// Reduce to SigmaBoolean
const reduction = try reduceToCrypto(ergo_tree, context);
const result: bool = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) {
break :blk false;
}
// Verifier Steps 1-3: Parse proof
const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
// Verifier Steps 4-6: Check commitments
break :blk try checkCommitments(unchecked, message);
},
};
return .{
.result = result,
.cost = reduction.cost,
.diag = reduction.diag,
};
}
};
The verify() Method
Complete verification entry point[5]:
pub fn verify(
env: ScriptEnv,
ergo_tree: *const ErgoTree,
context: *const Context,
proof: []const u8,
message: []const u8,
) VerifierError!VerificationResult {
// Check soft-fork condition first
if (try checkSoftForkCondition(ergo_tree, context)) |soft_fork_result| {
return soft_fork_result;
}
// REDUCTION PHASE
const reduced = try fullReduction(ergo_tree, context, env);
// VERIFICATION PHASE
return switch (reduced.sigma_prop) {
.true_prop => .{ .result = true, .cost = reduced.cost, .diag = reduced.diag },
.false_prop => .{ .result = false, .cost = reduced.cost, .diag = reduced.diag },
else => |sb| blk: {
// Non-trivial proposition: verify cryptographic proof
const full_cost = try addCryptoCost(sb, reduced.cost, context.cost_limit);
const ok = verifySignature(sb, message, proof) catch false;
break :blk .{
.result = ok,
.cost = full_cost,
.diag = reduced.diag,
};
},
};
}
Full Reduction
Reduces ErgoTree to SigmaBoolean with cost tracking[6][7]:
pub fn fullReduction(
ergo_tree: *const ErgoTree,
context: *const Context,
env: ScriptEnv,
) ReducerError!ReductionResult {
// Extract proposition from ErgoTree
const prop = try propositionFromErgoTree(ergo_tree, context);
// Fast path: SigmaProp constant
if (prop == .sigma_prop_constant) {
const sb = prop.sigma_prop_constant.toSigmaBoolean();
const eval_cost = SigmaPropConstant.COST.cost.toBlockCost();
const res_cost = try addCostChecked(context.init_cost, eval_cost, context.cost_limit);
return .{
.sigma_prop = sb,
.cost = res_cost,
.diag = .{ .env = context.env, .pretty_printed_expr = null },
};
}
// No DeserializeContext: direct evaluation
if (!ergo_tree.hasDeserialize()) {
return evalToCrypto(context, ergo_tree);
}
// Has DeserializeContext: special handling
return reductionWithDeserialize(ergo_tree, prop, context, env);
}
fn propositionFromErgoTree(
ergo_tree: *const ErgoTree,
context: *const Context,
) PropositionError!SigmaPropValue {
return switch (ergo_tree.root) {
.parsed => |tree| ergo_tree.toProposition(ergo_tree.header.constant_segregation),
.unparsed => |u| blk: {
if (context.validation_settings.isSoftFork(u.err)) {
// Soft-fork: return true (accept)
break :blk SigmaPropValue.true_sigma_prop;
}
// Hard error
return error.UnparsedErgoTree;
},
};
}
Signature Verification
Implements Verifier Steps 4-6 of the Sigma protocol[8][9]:
/// Verify a signature on message for given proposition
pub fn verifySignature(
sigma_tree: SigmaBoolean,
message: []const u8,
signature: []const u8,
) VerifierError!bool {
return switch (sigma_tree) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (signature.len == 0) {
break :blk false;
}
// Verifier Steps 1-3: Parse proof
const unchecked = try parseAndComputeChallenges(&sb, signature);
// Verifier Steps 4-6: Check commitments
break :blk try checkCommitments(unchecked, message);
},
};
}
/// Verifier Steps 4-6: Check commitments match Fiat-Shamir challenge
fn checkCommitments(
sp: UncheckedTree,
message: []const u8,
) VerifierError!bool {
// Verifier Step 4: Compute commitments from challenges and responses
const new_root = computeCommitments(sp);
// Steps 5-6: Serialize tree for Fiat-Shamir
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
try fiatShamirTreeToBytes(&new_root, buf.writer());
try buf.appendSlice(message);
// Compute expected challenge
const expected_challenge = fiatShamirHashFn(buf.items);
// Compare with actual challenge
// NOTE: In production, use constant-time comparison for challenge bytes
// to prevent timing side-channels: std.crypto.utils.timingSafeEql
const actual_challenge = new_root.challenge();
return std.mem.eql(u8, &actual_challenge, &expected_challenge);
}
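The final check can be sketched in Python. Two assumptions are flagged here: Blake2b truncated to a 24-byte (192-bit) challenge stands in for fiatShamirHashFn, and a plain byte string stands in for the real fiatShamirTreeToBytes serialization, which is elided.

```python
import hashlib
import hmac

# Sketch of the verifier's final Fiat-Shamir check (Steps 5-6).
# Assumptions: Blake2b truncated to 24 bytes stands in for
# fiatShamirHashFn; `tree_bytes` stands in for the real serialization.

CHALLENGE_LEN = 24  # 192-bit soundness parameter

def fiat_shamir_challenge(tree_bytes: bytes, message: bytes) -> bytes:
    digest = hashlib.blake2b(tree_bytes + message, digest_size=32).digest()
    return digest[:CHALLENGE_LEN]

def check_challenge(actual: bytes, tree_bytes: bytes, message: bytes) -> bool:
    expected = fiat_shamir_challenge(tree_bytes, message)
    # Constant-time comparison avoids timing side-channels
    return hmac.compare_digest(actual, expected)

tree = b"serialized-proof-tree"
msg = b"transaction-bytes"
challenge = fiat_shamir_challenge(tree, msg)
print(check_challenge(challenge, tree, msg))  # True
tampered = bytes([challenge[0] ^ 1]) + challenge[1:]
print(check_challenge(tampered, tree, msg))  # False
```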
Computing Commitments
Verifier Step 4: Reconstruct commitments from challenges and responses[10][11]:
/// For every leaf, compute commitment from challenge and response
pub fn computeCommitments(sp: UncheckedTree) UncheckedTree {
return switch (sp) {
.unchecked_leaf => |leaf| switch (leaf) {
.unchecked_schnorr => |sn| blk: {
// Reconstruct: a = g^z / h^e
const a = DlogProver.computeCommitment(
&sn.proposition,
&sn.challenge,
&sn.second_message,
);
break :blk UncheckedTree{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = sn.proposition,
.challenge = sn.challenge,
.second_message = sn.second_message,
.commitment_opt = FirstDlogProverMessage{ .a = a },
},
},
};
},
.unchecked_dh_tuple => |dh| blk: {
// Reconstruct both commitments
const commitment = DhTupleProver.computeCommitment(
&dh.proposition,
&dh.challenge,
&dh.second_message,
);
break :blk UncheckedTree{
.unchecked_leaf = .{
.unchecked_dh_tuple = .{
.proposition = dh.proposition,
.challenge = dh.challenge,
.second_message = dh.second_message,
.commitment_opt = commitment,
},
},
};
},
},
.unchecked_conjecture => |conj| blk: {
// Recursively process children
// Allocation failure treated as fatal in this sketch
const new_children = allocator.alloc(UncheckedTree, conj.children.len) catch unreachable;
for (conj.children, 0..) |child, i| {
new_children[i] = computeCommitments(child);
}
break :blk conj.withChildren(new_children);
},
};
}
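The reconstruction works because the Schnorr response satisfies z = r + e·x, so g^z = a · h^e and therefore a = g^z · (h^e)^(-1). A toy Python demonstration in a small prime-field multiplicative group; the real implementation works over the secp256k1 curve, and every number here is illustrative:

```python
# Toy demonstration of Schnorr commitment reconstruction (a = g^z / h^e)
# in a small multiplicative group mod a prime. Illustrates the algebra
# only; the actual code uses secp256k1 point arithmetic.

p = 467           # small prime modulus (toy parameter)
g = 2             # generator (toy parameter)
x = 127           # prover's secret
h = pow(g, x, p)  # public key h = g^x

r = 45            # prover's commitment randomness
a = pow(g, r, p)  # commitment a = g^r
e = 11            # challenge
z = (r + e * x) % (p - 1)  # response z = r + e*x (exponents mod group order)

# Verifier reconstructs the commitment from (h, e, z) alone:
h_e_inv = pow(pow(h, e, p), p - 2, p)  # (h^e)^-1 via Fermat's little theorem
a_reconstructed = (pow(g, z, p) * h_e_inv) % p
print(a_reconstructed == a)  # True
```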
Crypto Verification Cost
Estimate cost before performing expensive operations[12]:
const VerificationCosts = struct {
/// Cost for Schnorr commitment computation
pub const COMPUTE_COMMITMENTS_SCHNORR = FixedCost{ .cost = .{ .value = 3400 } };
/// Cost for DHT commitment computation
pub const COMPUTE_COMMITMENTS_DHT = FixedCost{ .cost = .{ .value = 6450 } };
/// Total Schnorr verification cost
pub const PROVE_DLOG_VERIFICATION: JitCost = blk: {
const parse = ParseChallenge_ProveDlog.COST.cost;
const compute = COMPUTE_COMMITMENTS_SCHNORR.cost;
const serialize = ToBytes_Schnorr.COST.cost;
break :blk (parse.add(compute) catch unreachable).add(serialize) catch unreachable;
};
/// Total DHT verification cost
pub const PROVE_DHT_VERIFICATION: JitCost = blk: {
const parse = ParseChallenge_ProveDHT.COST.cost;
const compute = COMPUTE_COMMITMENTS_DHT.cost;
const serialize = ToBytes_DHT.COST.cost;
break :blk (parse.add(compute) catch unreachable).add(serialize) catch unreachable;
};
};
/// Estimate verification cost without performing crypto
pub fn estimateCryptoVerifyCost(sb: SigmaBoolean) JitCost {
return switch (sb) {
.prove_dlog => VerificationCosts.PROVE_DLOG_VERIFICATION,
.prove_dh_tuple => VerificationCosts.PROVE_DHT_VERIFICATION,
.c_and => |and_node| blk: {
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (and_node.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk node_cost.add(children_cost) catch unreachable;
},
.c_or => |or_node| blk: {
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (or_node.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk node_cost.add(children_cost) catch unreachable;
},
.c_threshold => |th| blk: {
const n_children = th.children.len;
const n_coefs = n_children - th.k;
const parse_cost = ParsePolynomial.COST.cost(@intCast(n_coefs)) catch unreachable;
const eval_cost = (EvaluatePolynomial.COST.cost(@intCast(n_coefs)) catch unreachable).mul(@intCast(n_children)) catch unreachable;
const node_cost = ToBytes_ProofTreeConjecture.COST.cost;
var children_cost = JitCost{ .value = 0 };
for (th.children) |child| {
children_cost = children_cost.add(estimateCryptoVerifyCost(child)) catch unreachable;
}
break :blk parse_cost.add(eval_cost).add(node_cost).add(children_cost) catch unreachable;
},
else => JitCost{ .value = 0 }, // Trivial proposition
};
}
/// Add crypto cost to accumulated cost
fn addCryptoCost(
sigma_prop: SigmaBoolean,
base_cost: u64,
cost_limit: u64,
) CostError!u64 {
const crypto_cost = estimateCryptoVerifyCost(sigma_prop).toBlockCost();
return addCostChecked(base_cost, crypto_cost, cost_limit);
}
Soft-Fork Handling
Handle unrecognized script versions gracefully[13]:
/// Check for soft-fork condition
fn checkSoftForkCondition(
ergo_tree: *const ErgoTree,
context: *const Context,
) VerifierError!?VerificationResult {
if (context.activated_script_version > MAX_SUPPORTED_SCRIPT_VERSION) {
// Protocol version exceeds interpreter capabilities
if (ergo_tree.header.version > MAX_SUPPORTED_SCRIPT_VERSION) {
// Cannot verify: accept and rely on 90% upgraded nodes
return .{
.result = true,
.cost = context.init_cost,
.diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
};
}
// Can verify despite protocol upgrade
} else {
// Activated version within supported range
if (ergo_tree.header.version > context.activated_script_version) {
// ErgoTree version too high
return error.ErgoTreeVersionTooHigh;
}
}
return null; // Proceed normally
}
/// Soft-fork reduction result: accept as true
fn whenSoftForkReductionResult(cost: u64) ReductionResult {
return .{
.sigma_prop = .{ .trivial_prop = true },
.cost = cost,
.diag = .{ .env = Env.empty(), .pretty_printed_expr = null },
};
}
DeserializeContext Handling
Scripts may contain deserialization operations[14]:
fn reductionWithDeserialize(
ergo_tree: *const ErgoTree,
prop: SigmaPropValue,
context: *const Context,
env: ScriptEnv,
) ReducerError!ReductionResult {
// Add cost for deserialization substitution
const tree_bytes = ergo_tree.bytes();
const deserialize_cost = @as(i64, @intCast(tree_bytes.len)) * COST_PER_TREE_BYTE;
const curr_cost = try addCostChecked(context.init_cost, deserialize_cost, context.cost_limit);
var context1 = context.*;
context1.init_cost = curr_cost;
// Substitute DeserializeContext nodes
const prop_tree = try applyDeserializeContext(&context1, prop);
// Reduce the substituted tree
return reduceToCrypto(&context1, prop_tree);
}
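The surcharge is plain per-byte arithmetic. A Python sketch with an assumed initial cost and limit (both values illustrative):

```python
# Sketch of the per-byte deserialization surcharge above
# (COST_PER_TREE_BYTE = 2); the initial cost and limit are assumed.

COST_PER_TREE_BYTE = 2

def add_cost_checked(current: int, delta: int, limit: int) -> int:
    total = current + delta
    if total > limit:
        raise RuntimeError("CostLimitExceeded")
    return total

tree_len = 250  # serialized ErgoTree size in bytes (illustrative)
deserialize_cost = tree_len * COST_PER_TREE_BYTE  # 500
print(add_cost_checked(10_000, deserialize_cost, limit=1_000_000))  # 10500
```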
Complete Verification Flow
verify(ergoTree, context, proof, message)
─────────────────────────────────────────────────────
Step 1: checkSoftForkCondition()
│
├─ activated > MaxSupported AND script > MaxSupported
│ └─> return (true, initCost) [soft-fork accept]
│
├─ script.version > activated
│ └─> throw ErgoTreeVersionTooHigh
│
└─ Otherwise: proceed
│
▼
Step 2: fullReduction()
│
├─ propositionFromErgoTree()
│ └─ Handle unparsed trees
│
├─ SigmaPropConstant
│ └─> Extract directly
│
├─ No DeserializeContext
│ └─> evalToCrypto()
│
└─ Has DeserializeContext
└─> reductionWithDeserialize()
│
▼
ReductionResult(sigmaBoolean, cost)
│
▼
Step 3: Check result
│
├─ TrueProp ────> return (true, cost)
├─ FalseProp ────> return (false, cost)
└─ Non-trivial ────> continue
│
▼
Step 4: addCryptoCost()
│
└─ Estimate without crypto ops
│
▼
Step 5: verifySignature()
│
├─ parseAndComputeChallenges()
│ └─ Parse proof bytes
│
├─ computeCommitments()
│ └─ Reconstruct commitments
│
├─ fiatShamirTreeToBytes()
│ └─ Serialize tree
│
└─ fiatShamirHashFn()
└─ Compute expected challenge
│
▼
Step 6: Return (result, totalCost)
Verifier Errors
const VerifierError = error{
/// Failed to parse ErgoTree
ErgoTreeError,
/// Failed to evaluate ErgoTree
EvalError,
/// Signature parsing error
SigParsingError,
/// Fiat-Shamir serialization error
FiatShamirTreeSerializationError,
/// Cost limit exceeded
CostLimitExceeded,
/// ErgoTree version too high
ErgoTreeVersionTooHigh,
/// Cannot parse unparsed tree
UnparsedErgoTree,
};
Test Verifier
Simple verifier implementation for testing[15]:
const TestVerifier = struct {
const Self = @This();
pub fn verify(
self: *const Self,
tree: *const ErgoTree,
ctx: *const Context,
proof: ProofBytes,
message: []const u8,
) VerifierError!VerificationResult {
_ = self;
const reduction = try reduceToCrypto(tree, ctx);
const result: bool = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) {
break :blk false;
}
const unchecked = try parseAndComputeChallenges(&sb, proof.bytes());
break :blk try checkCommitments(unchecked, message);
},
};
return .{
.result = result,
.cost = 0, // Test verifier doesn't track cost
.diag = reduction.diag,
};
}
};
Summary
This chapter covered the verifier implementation that validates Sigma proofs:
- Verification proceeds in two phases: reduction (ErgoTree → SigmaBoolean) and cryptographic verification (proof checking)
- fullReduction() evaluates the ErgoTree to a SigmaBoolean proposition while tracking costs
- verifySignature() implements Verifier Steps 4-6: parse proof bytes, compute expected commitments from challenges and responses, then verify via Fiat-Shamir hash
- Soft-fork handling accepts scripts with unrecognized versions or opcodes, enabling protocol upgrades without network splits
- Cost estimation predicts cryptographic verification cost before performing expensive EC operations, failing early if the limit would be exceeded
- Commitment reconstruction (computeCommitments) derives the prover's commitments from the challenges and responses, which must match the Fiat-Shamir challenge
- DeserializeContext nodes are substituted with their deserialized values before reduction begins
Next: Chapter 15: Prover Implementation
1. Scala: Interpreter.scala:30-100
2. Rust: verifier.rs:27-52
3. Scala: Interpreter.scala:78-92
4. Rust: verifier.rs:55-88
5. Scala: Interpreter.scala:132-167
6. Scala: Interpreter.scala:196-239
7. Rust: eval.rs:130-160
8. Scala: Interpreter.scala:282-298
9. Rust: verifier.rs:91-125
10. Scala: Interpreter.scala:324-347
11. Rust: verifier.rs:127-163
12. Scala: Interpreter.scala:362-408
13. Scala: Interpreter.scala:450-472
14. Scala: Interpreter.scala:492-517
15. Rust: verifier.rs:166-168
Chapter 15: Prover Implementation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 11 for Sigma protocol structure, simulation, and Fiat-Shamir
- Chapter 12 for ErgoTree reduction to SigmaBoolean
- Chapter 14 for understanding what the verifier expects
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the 10-step proving algorithm from SigmaBoolean to serialized proof
- Work with the UnprovenTree data structure and its transformations
- Explain challenge flow through AND, OR, and THRESHOLD compositions
- Use the hint system for distributed multi-party signing
- Serialize proofs in the compact format expected by verifiers
Prover Overview
The prover is the counterpart to the verifier: given an ErgoTree, a transaction context, and the necessary secret keys, it generates a cryptographic proof that the verifier will accept. The proving algorithm is significantly more complex than verification because it must handle composite propositions (AND/OR/THRESHOLD) by generating simulated transcripts for children the prover cannot prove, while maintaining the zero-knowledge property that simulated and real transcripts are indistinguishable.
The prover generates cryptographic proofs for sigma propositions through a multi-phase algorithm[1][2]:
Proving Pipeline
─────────────────────────────────────────────────────
Step 0: SigmaBoolean ─────> convertToUnproven()
│
▼
Step 1: Mark real nodes (bottom-up)
│
▼
Step 2: Check root is real (abort if simulated)
│
▼
Step 3: Polish simulated (top-down)
│
▼
Steps 4-6: Simulate/Commit
- Assign challenges to simulated children
- Simulate simulated leaves
- Compute commitments for real leaves
│
▼
Step 7: Serialize for Fiat-Shamir
│
▼
Step 8: Compute root challenge = H(tree || message)
│
▼
Step 9: Compute real challenges and responses
│
▼
Step 10: Serialize proof bytes
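Step 4's challenge distribution at a real OR node can be made concrete: simulated children receive fresh random challenges, and the single real child receives whatever value makes all children XOR to the parent's challenge, as in the standard sigma-protocol OR composition (challenges here are 24-byte GF(2^192) elements, added by XOR). A Python sketch:

```python
import secrets
from functools import reduce

# Sketch of OR challenge splitting (Step 4 for a real OR node), assuming
# 24-byte challenges combined by XOR: children's challenges must XOR
# to the parent's challenge so the verifier's check passes.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_or_challenge(root: bytes, n_simulated: int):
    # Random challenges for the simulated children
    sim = [secrets.token_bytes(len(root)) for _ in range(n_simulated)]
    # The real child's challenge makes the XOR of all children equal root
    real = reduce(xor, sim, root)
    return real, sim

root = secrets.token_bytes(24)
real, sim = split_or_challenge(root, n_simulated=2)
# Invariant the verifier relies on:
assert reduce(xor, sim, real) == root
```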
Tree Data Structures
Node Position
Position encodes the path from the root[3]:
const NodePosition = struct {
/// Position bytes (e.g., [0, 2, 1] for "0-2-1")
positions: []const u8,
pub const CRYPTO_TREE_PREFIX: NodePosition = .{ .positions = &[_]u8{0} };
pub fn child(self: NodePosition, idx: usize, allocator: Allocator) !NodePosition {
const new_pos = try allocator.alloc(u8, self.positions.len + 1);
@memcpy(new_pos[0..self.positions.len], self.positions);
new_pos[self.positions.len] = @intCast(idx);
return .{ .positions = new_pos };
}
};
Position Encoding
─────────────────────────────────────────────────────
0 (root)
/ | \
/ | \
0-0 0-1 0-2 (children)
/|
/ |
0-2-0 0-2-1 (grandchildren)
Prefix "0" = crypto-tree (vs "1" = ErgoTree)
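A position is just the parent's path with the child index appended. A tiny Python sketch, with tuples standing in for the byte slices:

```python
# Sketch of NodePosition.child: append the child index to the parent's
# path. The "0" prefix marks the crypto tree, per the diagram above.

CRYPTO_TREE_PREFIX = (0,)

def child(position: tuple, idx: int) -> tuple:
    return position + (idx,)

root = CRYPTO_TREE_PREFIX
grandchild = child(child(root, 2), 1)
print("-".join(map(str, grandchild)))  # 0-2-1
```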
Unproven Tree
During proving, the tree undergoes transformations[4][5]:
const UnprovenTree = union(enum) {
unproven_leaf: UnprovenLeaf,
unproven_conjecture: UnprovenConjecture,
pub fn isReal(self: UnprovenTree) bool {
return !self.simulated();
}
pub fn simulated(self: UnprovenTree) bool {
return switch (self) {
.unproven_leaf => |l| l.simulated,
.unproven_conjecture => |c| c.simulated(),
};
}
pub fn withChallenge(self: UnprovenTree, challenge: Challenge) UnprovenTree {
return switch (self) {
.unproven_leaf => |l| .{ .unproven_leaf = l.withChallenge(challenge) },
.unproven_conjecture => |c| .{ .unproven_conjecture = c.withChallenge(challenge) },
};
}
pub fn withSimulated(self: UnprovenTree, sim: bool) UnprovenTree {
return switch (self) {
.unproven_leaf => |l| .{ .unproven_leaf = l.withSimulated(sim) },
.unproven_conjecture => |c| .{ .unproven_conjecture = c.withSimulated(sim) },
};
}
};
Unproven Leaf Nodes
const UnprovenLeaf = union(enum) {
unproven_schnorr: UnprovenSchnorr,
unproven_dh_tuple: UnprovenDhTuple,
// ... accessor methods
};
const UnprovenSchnorr = struct {
proposition: ProveDlog,
commitment_opt: ?FirstDlogProverMessage,
randomness_opt: ?Scalar, // Secret r for commitment
challenge_opt: ?Challenge,
simulated: bool,
position: NodePosition,
pub fn withChallenge(self: UnprovenSchnorr, c: Challenge) UnprovenSchnorr {
return .{
.proposition = self.proposition,
.commitment_opt = self.commitment_opt,
.randomness_opt = self.randomness_opt,
.challenge_opt = c,
.simulated = self.simulated,
.position = self.position,
};
}
pub fn withSimulated(self: UnprovenSchnorr, sim: bool) UnprovenSchnorr {
return .{
.proposition = self.proposition,
.commitment_opt = self.commitment_opt,
.randomness_opt = self.randomness_opt,
.challenge_opt = self.challenge_opt,
.simulated = sim,
.position = self.position,
};
}
};
const UnprovenDhTuple = struct {
proposition: ProveDhTuple,
commitment_opt: ?FirstDhTupleProverMessage,
randomness_opt: ?Scalar,
challenge_opt: ?Challenge,
simulated: bool,
position: NodePosition,
};
Unproven Conjecture Nodes
const UnprovenConjecture = union(enum) {
cand_unproven: CandUnproven,
cor_unproven: CorUnproven,
cthreshold_unproven: CthresholdUnproven,
pub fn simulated(self: UnprovenConjecture) bool {
return switch (self) {
.cand_unproven => |c| c.simulated,
.cor_unproven => |c| c.simulated,
.cthreshold_unproven => |c| c.simulated,
};
}
pub fn children(self: UnprovenConjecture) []ProofTree {
return switch (self) {
.cand_unproven => |c| c.children,
.cor_unproven => |c| c.children,
.cthreshold_unproven => |c| c.children,
};
}
};
const CandUnproven = struct {
proposition: Cand,
challenge_opt: ?Challenge,
simulated: bool,
children: []ProofTree,
position: NodePosition,
};
const CorUnproven = struct {
proposition: Cor,
challenge_opt: ?Challenge,
simulated: bool,
children: []ProofTree,
position: NodePosition,
};
const CthresholdUnproven = struct {
proposition: Cthreshold,
challenge_opt: ?Challenge,
simulated: bool,
k: u8, // Threshold
children: []ProofTree,
polynomial_opt: ?Gf2_192Poly, // For challenge distribution
position: NodePosition,
};
The Proving Algorithm
Prover Trait
const Prover = struct {
secrets: []const PrivateInput,
pub fn prove(
self: *const Prover,
tree: *const ErgoTree,
ctx: *const Context,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!ProverResult {
const reduction = try reduceToCrypto(tree, ctx);
const proof = try self.generateProof(
reduction.sigma_prop,
message,
hints_bag,
);
return .{
.proof = proof,
.extension = ctx.extension,
};
}
pub fn generateProof(
self: *const Prover,
sigma_bool: SigmaBoolean,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!ProofBytes {
return switch (sigma_bool) {
.trivial_prop => |b| blk: {
if (b) break :blk ProofBytes.empty();
return error.ReducedToFalse;
},
else => |sb| blk: {
const unproven = try convertToUnproven(sb);
const unchecked = try proveToUnchecked(self, unproven, message, hints_bag);
break :blk serializeSig(unchecked);
},
};
}
};
Step 0: Convert to Unproven
Transform SigmaBoolean to UnprovenTree[6]:
fn convertToUnproven(sigma_tree: SigmaBoolean) ProverError!UnprovenTree {
return switch (sigma_tree) {
.c_and => |and_node| blk: {
const children = try allocator.alloc(ProofTree, and_node.children.len);
for (and_node.children, 0..) |child, i| {
children[i] = .{ .unproven_tree = try convertToUnproven(child) };
}
break :blk .{
.unproven_conjecture = .{
.cand_unproven = .{
.proposition = and_node,
.challenge_opt = null,
.simulated = false,
.children = children,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
};
},
.c_or => |or_node| blk: {
// Similar conversion for OR
// ...
},
.c_threshold => |th| blk: {
// Similar conversion for THRESHOLD
// ...
},
.prove_dlog => |pk| .{
.unproven_leaf = .{
.unproven_schnorr = .{
.proposition = pk,
.commitment_opt = null,
.randomness_opt = null,
.challenge_opt = null,
.simulated = false,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
},
.prove_dh_tuple => |dht| .{
.unproven_leaf = .{
.unproven_dh_tuple = .{
.proposition = dht,
.commitment_opt = null,
.randomness_opt = null,
.challenge_opt = null,
.simulated = false,
.position = NodePosition.CRYPTO_TREE_PREFIX,
},
},
},
else => error.Unexpected,
};
}
Step 1: Mark Real Nodes
Bottom-up traversal to mark what the prover can prove[7][8]:
fn markReal(
prover: *const Prover,
tree: UnprovenTree,
hints_bag: *const HintsBag,
) ProverError!UnprovenTree {
return rewriteBottomUp(tree, struct {
fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_leaf => |leaf| blk: {
// Leaf is real if prover has secret OR hint shows knowledge
const secret_known = hints.realImages().contains(leaf.proposition()) or
p.hasSecretFor(leaf.proposition());
break :blk leaf.withSimulated(!secret_known);
},
.unproven_conjecture => |conj| switch (conj) {
.cand_unproven => |cand| blk: {
// AND is real only if ALL children are real
const simulated = anyChildSimulated(cand.children);
break :blk cand.withSimulated(simulated);
},
.cor_unproven => |cor| blk: {
// OR is real if AT LEAST ONE child is real
const simulated = allChildrenSimulated(cor.children);
break :blk cor.withSimulated(simulated);
},
.cthreshold_unproven => |ct| blk: {
// THRESHOLD(k) is real if AT LEAST k children are real
const real_count = countRealChildren(ct.children);
break :blk ct.withSimulated(real_count < ct.k);
},
},
},
else => null,
};
}
}.transform, prover, hints_bag);
}
Step 2: Check Root
fn proveToUnchecked(
prover: *const Prover,
unproven: UnprovenTree,
message: []const u8,
hints_bag: *const HintsBag,
) ProverError!UncheckedTree {
// Step 1
const step1 = try markReal(prover, unproven, hints_bag);
// Step 2: If root is simulated, prover cannot prove
if (!step1.isReal()) {
return error.TreeRootIsNotReal;
}
// Steps 3-9...
}
Step 3: Polish Simulated
Top-down traversal to ensure correct structure[9]:
fn polishSimulated(tree: UnprovenTree) ProverError!UnprovenTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_conjecture => |conj| switch (conj) {
.cand_unproven => |cand| blk: {
// Simulated AND: all children simulated
if (cand.simulated) {
break :blk cand.withChildren(
markAllChildrenSimulated(cand.children),
);
}
break :blk cand;
},
.cor_unproven => |cor| blk: {
if (cor.simulated) {
// Simulated OR: all children simulated
break :blk cor.withChildren(
markAllChildrenSimulated(cor.children),
);
} else {
// Real OR: keep ONE child real, mark rest simulated
break :blk makeCorChildrenSimulated(cor);
}
},
.cthreshold_unproven => |ct| blk: {
if (ct.simulated) {
break :blk ct.withChildren(
markAllChildrenSimulated(ct.children),
);
} else {
// Real THRESHOLD(k): keep only k children real
break :blk makeThresholdChildrenSimulated(ct);
}
},
},
else => null,
},
else => null,
};
}
}.transform);
}
fn makeCorChildrenSimulated(cor: CorUnproven) CorUnproven {
// Find first real child, mark all others simulated
var found_real = false;
const new_children = try allocator.alloc(ProofTree, cor.children.len);
for (cor.children, 0..) |child, i| {
const ut = child.unproven_tree;
if (ut.isReal() and !found_real) {
new_children[i] = child;
found_real = true;
} else if (ut.isReal()) {
new_children[i] = ut.withSimulated(true);
} else {
new_children[i] = child;
}
}
return cor.withChildren(new_children);
}
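The first-real-wins rule above is language-agnostic. A minimal Python sketch, with the function name and boolean-flag representation invented for illustration (neither implementation represents children this way):

```python
def polish_or_children(real_flags):
    """Given one real/simulated flag per child of a real OR node,
    keep only the FIRST real child real and simulate the rest."""
    kept = False
    out = []
    for is_real in real_flags:
        if is_real and not kept:
            out.append(True)   # the one child we will actually prove
            kept = True
        else:
            out.append(False)  # simulated
    return out

# Two provable children: only the first stays real.
assert polish_or_children([False, True, True]) == [False, True, False]
```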
Steps 4-6: Simulate and Commit
Combined traversal for challenges, simulation, and commitments [10][11]:
fn simulateAndCommit(
tree: UnprovenTree,
hints_bag: *const HintsBag,
rng: std.rand.Random,
) ProverError!ProofTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree, hints: *const HintsBag, random: std.rand.Random) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
// Step 4: Real conjecture assigns random challenges to simulated children
.unproven_conjecture => |conj| blk: {
if (conj.isReal()) {
break :blk assignChallengesFromRealParent(conj, random);
} else {
break :blk propagateChallengeToSimulatedChildren(conj, random);
}
},
// Steps 5-6: Simulate or commit at leaves
.unproven_leaf => |leaf| blk: {
if (leaf.simulated()) {
// Step 5: Simulate
break :blk simulateLeaf(leaf);
} else {
// Step 6: Compute commitment
break :blk commitLeaf(leaf, hints, random);
}
},
},
else => null,
};
}
}.transform, hints_bag, rng);
}
/// Simulate a leaf: pick random z, compute commitment backwards
fn simulateLeaf(leaf: UnprovenLeaf) ProverError!UncheckedTree {
return switch (leaf) {
.unproven_schnorr => |us| blk: {
const challenge = us.challenge_opt orelse return error.SimulatedLeafWithoutChallenge;
const sim = DlogProver.simulate(us.proposition, challenge);
break :blk .{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = us.proposition,
.commitment_opt = sim.first_message,
.challenge = challenge,
.second_message = sim.second_message,
},
},
};
},
.unproven_dh_tuple => |ud| blk: {
// Similar for DHT
},
};
}
/// Commit at a real leaf: pick random r, compute a = g^r
///
/// SECURITY: The randomness `r` MUST come from a cryptographically secure source:
/// - Use a CSPRNG (e.g., OS-provided /dev/urandom, std.crypto.random)
/// - For platforms without secure random, use deterministic nonce generation
/// (RFC 6979 style: r = HMAC(secret_key, message))
/// - NEVER reuse nonces: reusing r with different messages reveals the secret key
fn commitLeaf(
leaf: UnprovenLeaf,
hints: *const HintsBag,
rng: std.rand.Random,
) UnprovenTree {
return switch (leaf) {
.unproven_schnorr => |us| blk: {
// Check hints first
if (hints.findCommitment(us.position)) |hint| {
break :blk us.withCommitment(hint.commitment);
}
// Generate fresh commitment
const first = DlogProver.firstMessage(rng);
break :blk .{
.unproven_leaf = .{
.unproven_schnorr = .{
.proposition = us.proposition,
.commitment_opt = first.message,
.randomness_opt = first.r,
.challenge_opt = null,
.simulated = false,
.position = us.position,
},
},
};
},
// Similar for DHT
};
}
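The RFC 6979-style fallback mentioned in the security comment can be sketched in a few lines of Python. This is a simplified illustration, not the derivation either implementation uses: real RFC 6979 runs an iterated HMAC-DRBG loop, and the modulus `Q` here (secp256k1's group order) is chosen only as a realistic example:

```python
import hashlib
import hmac

# Order of secp256k1 (illustrative modulus only).
Q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def deterministic_nonce(secret_key: int, message: bytes) -> int:
    """r = HMAC-SHA256(secret_key, message) mod Q: the same (key, message)
    pair always yields the same r, so a nonce is never reused across
    DIFFERENT messages -- the reuse pattern that leaks the secret key."""
    key = secret_key.to_bytes(32, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % Q

r1 = deterministic_nonce(12345, b"tx-a")
assert r1 == deterministic_nonce(12345, b"tx-a")  # reproducible
assert r1 != deterministic_nonce(12345, b"tx-b")  # bound to the message
```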
Steps 7-8: Fiat-Shamir
Serialize tree and compute root challenge [12]:
fn computeRootChallenge(tree: ProofTree, message: []const u8) ProverError!Challenge {
// Step 7: Serialize tree structure + propositions + commitments
var buf = std.ArrayList(u8).init(allocator);
try fiatShamirTreeToBytes(&tree, buf.writer());
// Step 8: Append message and hash
try buf.appendSlice(message);
return fiatShamirHashFn(buf.items);
}
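Steps 7-8 reduce to a single hash over the concatenation. A Python sketch; Ergo's sigma protocols use Blake2b-256 with 192-bit (24-byte) challenges, but treat the constant and the truncation below as assumptions to verify against the source:

```python
import hashlib

SOUNDNESS_BYTES = 24  # 192-bit challenge

def root_challenge(tree_bytes: bytes, message: bytes) -> bytes:
    """Fiat-Shamir: hash the serialized proof tree followed by the message,
    truncated to the soundness parameter."""
    digest = hashlib.blake2b(tree_bytes + message, digest_size=32).digest()
    return digest[:SOUNDNESS_BYTES]

c = root_challenge(b"<serialized tree>", b"<tx bytes>")
assert len(c) == SOUNDNESS_BYTES
# Any change to the commitments or the message changes the challenge:
assert c != root_challenge(b"<serialized tree>", b"<other tx>")
```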
Step 9: Compute Real Challenges and Responses
Top-down traversal for real nodes [13][14]:
fn proving(
prover: *const Prover,
tree: ProofTree,
hints_bag: *const HintsBag,
) ProverError!ProofTree {
return rewriteTopDown(tree, struct {
fn transform(node: ProofTree, p: *const Prover, hints: *const HintsBag) ?ProofTree {
return switch (node) {
.unproven_tree => |ut| switch (ut) {
.unproven_conjecture => |conj| blk: {
if (!conj.isReal()) break :blk null;
break :blk switch (conj) {
.cand_unproven => |cand| cand: {
// Real AND: all children get the same challenge
const challenge = cand.challenge_opt.?;
break :cand cand.withChildren(
propagateChallenge(cand.children, challenge),
);
},
.cor_unproven => |cor| cor: {
// Real OR: the real child's challenge is the XOR of the
// root challenge and all simulated children's challenges
const root_challenge = cor.challenge_opt.?;
const xored = xorChallenges(root_challenge, cor.children);
break :cor cor.withRealChildChallenge(xored);
},
.cthreshold_unproven => |ct|
// Real THRESHOLD: polynomial interpolation over GF(2^192)
computeThresholdChallenges(ct),
};
},
.unproven_leaf => |leaf| blk: {
if (!leaf.isReal()) break :blk null;
// Compute response z = r + e*w mod q
const challenge = leaf.challenge_opt orelse
return error.RealUnprovenTreeWithoutChallenge;
break :blk switch (leaf) {
.unproven_schnorr => |us| schnorr: {
const secret = p.findSecret(us.proposition) orelse {
// No secret of our own: fall back to a real-proof hint
// supplied by another signer (distributed signing)
if (hints.findRealProof(us.position)) |proof| {
break :schnorr .{ .unchecked_leaf = proof.unchecked_tree.unchecked_leaf };
}
return error.SecretNotFound;
};
const z = DlogProver.secondMessage(
secret,
us.randomness_opt.?,
challenge,
);
break :schnorr .{
.unchecked_leaf = .{
.unchecked_schnorr = .{
.proposition = us.proposition,
.commitment_opt = null,
.challenge = challenge,
.second_message = z,
},
},
};
},
// Similar for DHT
};
},
},
else => null,
};
}
}.transform, prover, hints_bag);
}
Step 10: Serialize Proof
fn serializeSig(tree: UncheckedTree) ProofBytes {
var buf = std.ArrayList(u8).init(allocator);
var w = SigmaByteWriter.init(buf.writer());
sigWriteBytes(&tree, &w, true);
return .{ .bytes = buf.items };
}
fn sigWriteBytes(node: *const UncheckedTree, w: *SigmaByteWriter, write_challenge: bool) void {
if (write_challenge) {
w.writeBytes(&node.challenge());
}
switch (node.*) {
.unchecked_leaf => |leaf| switch (leaf) {
.unchecked_schnorr => |us| {
w.writeBytes(&us.second_message.z.toBytes());
},
.unchecked_dh_tuple => |dh| {
w.writeBytes(&dh.second_message.z.toBytes());
},
},
.unchecked_conjecture => |conj| switch (conj) {
.cand_unchecked => |cand| {
// Children's challenges equal parent's - don't write
for (cand.children) |child| {
sigWriteBytes(&child, w, false);
}
},
.cor_unchecked => |cor| {
// Write all except last (computed via XOR)
for (cor.children[0 .. cor.children.len - 1]) |child| {
sigWriteBytes(&child, w, true);
}
sigWriteBytes(&cor.children[cor.children.len - 1], w, false);
},
.cthreshold_unchecked => |ct| {
// Write polynomial coefficients
w.writeBytes(ct.polynomial.toBytes(false));
for (ct.children) |child| {
sigWriteBytes(&child, w, false);
}
},
},
};
}
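Omitting one OR child's challenge is sound because of the XOR constraint: the verifier can always reconstruct it from the root challenge and the challenges that were written. A small Python illustration (helper names invented for the example):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def last_or_challenge(root: bytes, written: list) -> bytes:
    """Reconstruct the omitted last-child challenge of an OR node from
    the root challenge and the child challenges that WERE serialized."""
    acc = root
    for c in written:
        acc = xor_bytes(acc, c)
    return acc

root = bytes.fromhex("aabbcc")
c1, c2 = bytes.fromhex("010203"), bytes.fromhex("102030")
c3 = last_or_challenge(root, [c1, c2])
# The OR constraint (XOR of all child challenges == root) now holds:
assert xor_bytes(xor_bytes(c1, c2), c3) == root
```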
Response Computation
Schnorr Response
const DlogProver = struct {
/// First message: a = g^r
pub fn firstMessage(rng: std.rand.Random) struct { r: Scalar, message: FirstDlogProverMessage } {
const r = Scalar.random(rng);
const a = DlogGroup.exponentiate(&DlogGroup.generator(), &r);
return .{ .r = r, .message = .{ .a = a } };
}
/// Second message: z = r + e*w mod q
pub fn secondMessage(
private_key: DlogProverInput,
r: Scalar,
challenge: Challenge,
) SecondDlogProverMessage {
const e = Scalar.fromBytes(&challenge.bytes);
const z = r.add(e.mul(private_key.w));
return .{ .z = z };
}
/// Simulation: pick random z, compute a = g^z * h^(-e)
pub fn simulate(
proposition: ProveDlog,
challenge: Challenge,
rng: std.rand.Random,
) struct { first_message: FirstDlogProverMessage, second_message: SecondDlogProverMessage } {
const z = Scalar.random(rng);
const e = Scalar.fromBytes(&challenge.bytes);
const minus_e = e.negate();
const gz = DlogGroup.exponentiate(&DlogGroup.generator(), &z);
const h_neg_e = DlogGroup.exponentiate(&proposition.h, &minus_e);
const a = gz.multiply(&h_neg_e);
return .{
.first_message = .{ .a = a },
.second_message = .{ .z = z },
};
}
};
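The real/simulated symmetry can be checked concretely in a toy group. The Python sketch below uses a tiny prime-order subgroup (deliberately insecure parameters, for arithmetic only); both kinds of transcript satisfy the same verification equation g^z = a·h^e:

```python
# Toy parameters: g = 4 generates the order-q subgroup of Z_p^* (q | p-1).
p, q, g = 383, 191, 4
w = 57                 # secret key
h = pow(g, w, p)       # public key h = g^w

def prove(w, e, r):
    """Real transcript: commit a = g^r, respond z = r + e*w mod q."""
    return pow(g, r, p), (r + e * w) % q

def simulate(h, e, z):
    """Simulated transcript: pick z first, compute a = g^z * h^(-e)."""
    a = pow(g, z, p) * pow(pow(h, e, p), -1, p) % p
    return a, z

def verify(h, a, e, z):
    """Accept iff g^z == a * h^e (mod p)."""
    return pow(g, z, p) == a * pow(h, e, p) % p

a, z = prove(w, e=101, r=33)
assert verify(h, a, 101, z)        # real proof verifies
a, z = simulate(h, e=101, z=150)
assert verify(h, a, 101, z)        # ...and so does a simulated one
```

A verifier given only (a, e, z) cannot tell which branch produced the transcript; this is exactly what lets the prover fake the children it has no secrets for.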
Hint System
Hint Types
For distributed signing [15]:
const Hint = union(enum) {
real_secret_proof: RealSecretProof,
simulated_secret_proof: SimulatedSecretProof,
own_commitment: OwnCommitment,
real_commitment: RealCommitment,
simulated_commitment: SimulatedCommitment,
};
const RealSecretProof = struct {
image: SigmaBoolean,
challenge: Challenge,
unchecked_tree: UncheckedTree,
position: NodePosition,
};
const OwnCommitment = struct {
image: SigmaBoolean,
secret_randomness: Scalar, // PRIVATE - NEVER share!
commitment: FirstProverMessage,
position: NodePosition,
};
// SECURITY: OwnCommitment contains secret randomness (r). NEVER send
// OwnCommitment to other parties - only send RealCommitment (public part).
// Leaking r allows computing secret key w = (z - r) / e.
const RealCommitment = struct {
image: SigmaBoolean,
commitment: FirstProverMessage,
position: NodePosition,
};
const HintsBag = struct {
hints: []const Hint,
pub fn realImages(self: *const HintsBag) []const SigmaBoolean {
// Collect public images from real proofs and commitments
}
pub fn findCommitment(self: *const HintsBag, pos: NodePosition) ?CommitmentHint {
for (self.hints) |hint| {
switch (hint) {
// The two variants carry different payload types, so match them separately
.own_commitment => |c| if (c.position.eql(pos)) return c,
.real_commitment => |c| if (c.position.eql(pos)) return c,
else => {},
}
}
return null;
}
pub fn findRealProof(self: *const HintsBag, pos: NodePosition) ?RealSecretProof {
for (self.hints) |hint| {
if (hint == .real_secret_proof and hint.real_secret_proof.position.eql(pos)) {
return hint.real_secret_proof;
}
}
return null;
}
};
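Why `OwnCommitment` must never leave the prover: the formula in the comment above, w = (z − r)/e, can be checked directly. A toy Python verification (q is just a small prime standing in for the group order):

```python
q = 191                        # toy prime group order (insecure, arithmetic only)
w, r, e = 57, 33, 101          # secret key, commitment randomness, challenge
z = (r + e * w) % q            # the response published in the proof

# Anyone who learns r can solve z = r + e*w for the secret key:
w_recovered = (z - r) * pow(e, -1, q) % q
assert w_recovered == w
```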
Distributed Signing Protocol
Distributed Signing (2-of-2 AND)
─────────────────────────────────────────────────────
Round 1: Generate commitments
Party 1 (sk1) ─────> OwnCommitment(pk1, r1, g^r1)
Party 2 (sk2) ─────> OwnCommitment(pk2, r2, g^r2)
Exchange: Share RealCommitment (NOT OwnCommitment!)
Party 1 ─────> RealCommitment(pk1, g^r1) ─────> Party 2
Party 2 ─────> RealCommitment(pk2, g^r2) ─────> Party 1
Round 2: Sign sequentially
Party 1:
combined = hints1 ++ RealCommitment(pk2)
partialProof = prove(tree, msg, combined)
Extract hints from partial:
hintsFromProof = bagForMultisig(partialProof, ...)
Party 2:
combined = hints2 ++ hintsFromProof
finalProof = prove(tree, msg, combined)
Prover Errors
const ProverError = error{
ErgoTreeError,
EvalError,
Gf2_192Error,
ReducedToFalse,
TreeRootIsNotReal,
SimulatedLeafWithoutChallenge,
RealUnprovenTreeWithoutChallenge,
SecretNotFound,
Unexpected,
FiatShamirTreeSerializationError,
};
Summary
This chapter covered the prover implementation that generates Sigma proofs:
The prover transforms a sigma-tree through a 10-step algorithm:
- Convert to unproven: Transform SigmaBoolean to UnprovenTree data structure
- Mark real (bottom-up): Identify which nodes the prover has secrets for
- Check root: Fail if the root is simulated (prover cannot prove)
- Polish simulated (top-down): Ensure OR keeps only one real child, THRESHOLD keeps exactly k
- Simulate and commit: Assign challenges to simulated children, generate commitments for real leaves
- Fiat-Shamir serialization: Serialize tree structure and commitments
- Compute root challenge: Hash serialized tree with message
- Prove (top-down): Distribute challenges and compute responses for real nodes
- Serialize proof: Output compact format
Key design principles:
- Zero-knowledge: Simulated transcripts are computationally indistinguishable from real ones
- Challenge flow depends on composition: AND propagates same challenge to all; OR uses XOR constraint; THRESHOLD uses polynomial interpolation over GF(2^192)
- Hint system enables distributed signing: parties exchange commitments (never secret randomness), then sign sequentially
Next: Chapter 16: ErgoScript Parser
Rust: prover.rs:1-100
Rust: unproven_tree.rs (NodePosition)
Scala: UnprovenTree.scala
Rust: unproven_tree.rs:27-88
Rust: prover.rs (convert_to_unproven)
Scala: ProverInterpreter.scala (markReal)
Rust: prover.rs:243-305
Rust: prover.rs:367-400
Scala: ProverInterpreter.scala (simulateAndCommit)
Rust: prover.rs (simulate_and_commit)
Rust: fiat_shamir.rs
Scala: ProverInterpreter.scala (proving)
Rust: prover.rs (proving)
Rust: hint.rs
Chapter 16: ErgoScript Parser
Prerequisites
- Chapter 4 for AST node types that the parser produces
- Chapter 2 for type syntax parsing
- Familiarity with parsing concepts: tokenization, recursive descent, operator precedence
Learning Objectives
By the end of this chapter, you will be able to:
- Explain parser combinator and Pratt parsing techniques used in ErgoScript
- Navigate the parser module structure (lexer, grammar, expressions, types)
- Implement operator precedence using binding power
- Trace expression parsing from ErgoScript source to untyped AST
- Handle source position tracking for meaningful error messages
Parser Architecture
ErgoScript source code is transformed into an AST through lexing and parsing [1][2]:
Parsing Pipeline
─────────────────────────────────────────────────────
Source Code
│
▼
┌──────────────────────────────────────────────────┐
│ LEXER │
│ │
│ Characters ─────> Tokens │
│ "val x = 1 + 2" │
│ ─────> [ValKw, Ident("x"), Eq, Int(1), │
│ Plus, Int(2)] │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ PARSER │
│ │
│ Tokens ─────> AST │
│ Grammar rules, precedence, associativity │
│ ─────> ValDef("x", BinOp(Int(1), +, Int(2))) │
└──────────────────────────────────────────────────┘
│
▼
Untyped AST (SValue)
Lexer (Tokenizer)
Converts the character stream to tokens [3]:
const TokenKind = enum {
// Literals
int_number,
long_number,
string_literal,
// Keywords
val_kw,
def_kw,
if_kw,
else_kw,
true_kw,
false_kw,
// Operators
plus,
minus,
star,
slash,
percent,
eq,
neq,
lt,
gt,
le,
ge,
and_and,
or_or,
bang,
// Punctuation
l_paren,
r_paren,
l_brace,
r_brace,
l_bracket,
r_bracket,
dot,
comma,
colon,
semicolon,
arrow,
// Identifiers
ident,
// Special
whitespace,
comment,
eof,
err,
};
const Token = struct {
kind: TokenKind,
text: []const u8,
range: Range,
};
const Range = struct {
start: usize,
end: usize,
};
Lexer Implementation
const Lexer = struct {
source: []const u8,
pos: usize,
pub fn init(source: []const u8) Lexer {
return .{ .source = source, .pos = 0 };
}
pub fn nextToken(self: *Lexer) Token {
self.skipWhitespaceAndComments();
if (self.pos >= self.source.len) {
return .{ .kind = .eof, .text = "", .range = .{ .start = self.pos, .end = self.pos } };
}
const start = self.pos;
const c = self.source[self.pos];
// Single-character tokens
const single_char_token: ?TokenKind = switch (c) {
'(' => .l_paren,
')' => .r_paren,
'{' => .l_brace,
'}' => .r_brace,
'[' => .l_bracket,
']' => .r_bracket,
'.' => .dot,
',' => .comma,
':' => .colon,
';' => .semicolon,
'+' => .plus,
'-' => .minus,
'*' => .star,
'/' => .slash,
'%' => .percent,
else => null,
};
if (single_char_token) |kind| {
self.pos += 1;
return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
// Multi-character tokens
if (c == '=' and self.peek(1) == '=') {
self.pos += 2;
return .{ .kind = .eq, .text = "==", .range = .{ .start = start, .end = self.pos } };
}
if (c == '=' and self.peek(1) == '>') {
self.pos += 2;
return .{ .kind = .arrow, .text = "=>", .range = .{ .start = start, .end = self.pos } };
}
if (c == '&' and self.peek(1) == '&') {
self.pos += 2;
return .{ .kind = .and_and, .text = "&&", .range = .{ .start = start, .end = self.pos } };
}
// ... the remaining operators (!=, <, <=, >, >=, ||, !) follow the
// same single- and two-character patterns
// Numbers
if (std.ascii.isDigit(c)) {
return self.scanNumber(start);
}
// Identifiers and keywords
if (std.ascii.isAlphabetic(c) or c == '_') {
return self.scanIdentifier(start);
}
// Unknown character
self.pos += 1;
return .{ .kind = .err, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
fn scanIdentifier(self: *Lexer, start: usize) Token {
while (self.pos < self.source.len) {
const c = self.source[self.pos];
if (std.ascii.isAlphanumeric(c) or c == '_') {
self.pos += 1;
} else {
break;
}
}
const text = self.source[start..self.pos];
const kind: TokenKind = if (keywords.get(text)) |kw| kw else .ident;
return .{ .kind = kind, .text = text, .range = .{ .start = start, .end = self.pos } };
}
fn scanNumber(self: *Lexer, start: usize) Token {
// Check for hex
if (self.source[self.pos] == '0' and self.pos + 1 < self.source.len and
(self.source[self.pos + 1] == 'x' or self.source[self.pos + 1] == 'X'))
{
self.pos += 2;
while (self.pos < self.source.len and std.ascii.isHex(self.source[self.pos])) {
self.pos += 1;
}
} else {
while (self.pos < self.source.len and std.ascii.isDigit(self.source[self.pos])) {
self.pos += 1;
}
}
// Check for L suffix (long)
var kind: TokenKind = .int_number;
if (self.pos < self.source.len and (self.source[self.pos] == 'L' or self.source[self.pos] == 'l')) {
kind = .long_number;
self.pos += 1;
}
return .{ .kind = kind, .text = self.source[start..self.pos], .range = .{ .start = start, .end = self.pos } };
}
const keywords = std.ComptimeStringMap(TokenKind, .{
.{ "val", .val_kw },
.{ "def", .def_kw },
.{ "if", .if_kw },
.{ "else", .else_kw },
.{ "true", .true_kw },
.{ "false", .false_kw },
});
};
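The same longest-match logic fits in a few lines of Python. A regex-based sketch: token names loosely mirror `TokenKind`, and the `assign` token for a bare `=` is added here for the example (it is not in the enum above):

```python
import re

# Alternation order matters: two-character operators before their prefixes.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<long_number>\d+[lL]) | (?P<int_number>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<arrow>=>) | (?P<eq>==) | (?P<and_and>&&)
  | (?P<assign>=) | (?P<plus>\+) | (?P<star>\*)
""", re.VERBOSE)

KEYWORDS = {"val", "def", "if", "else", "true", "false"}

def tokenize(src):
    out, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:                      # unknown character: error token
            out.append(("err", src[pos]))
            pos += 1
            continue
        if m.lastgroup != "ws":            # whitespace is skipped, not emitted
            kind, text = m.lastgroup, m.group()
            if kind == "ident" and text in KEYWORDS:
                kind = text + "_kw"        # 'val' -> val_kw, etc.
            out.append((kind, text))
        pos = m.end()
    return out

assert tokenize("val x = 1 + 2") == [
    ("val_kw", "val"), ("ident", "x"), ("assign", "="),
    ("int_number", "1"), ("plus", "+"), ("int_number", "2"),
]
```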
Parser Structure
Event-based parser using markers [4][5]:
const Event = union(enum) {
start_node: SyntaxKind,
// Forward reference created by Marker.precede, resolved when building the tree
start_node_at: usize,
add_token,
finish_node,
err: ParseError,
placeholder,
};
const Parser = struct {
source: Source,
events: std.ArrayList(Event),
expected_kinds: std.ArrayList(TokenKind),
allocator: Allocator,
pub fn init(allocator: Allocator, tokens: []const Token) Parser {
return .{
.source = Source.init(tokens),
.events = std.ArrayList(Event).init(allocator),
.expected_kinds = std.ArrayList(TokenKind).init(allocator),
.allocator = allocator,
};
}
pub fn parse(self: *Parser) []Event {
grammar.root(self);
return self.events.toOwnedSlice();
}
fn start(self: *Parser) Marker {
const pos = self.events.items.len;
try self.events.append(.placeholder);
return Marker.init(pos);
}
fn at(self: *Parser, kind: TokenKind) bool {
try self.expected_kinds.append(kind);
return self.peek() == kind;
}
fn bump(self: *Parser) void {
self.expected_kinds.clearRetainingCapacity();
_ = self.source.nextToken();
try self.events.append(.add_token);
}
fn expect(self: *Parser, kind: TokenKind) void {
if (self.at(kind)) {
self.bump();
} else {
self.err();
}
}
};
const Marker = struct {
pos: usize,
pub fn init(pos: usize) Marker {
return .{ .pos = pos };
}
pub fn complete(self: Marker, p: *Parser, kind: SyntaxKind) CompletedMarker {
p.events.items[self.pos] = .{ .start_node = kind };
try p.events.append(.finish_node);
return .{ .pos = self.pos };
}
pub fn precede(self: Marker, p: *Parser) Marker {
const new_marker = p.start();
p.events.items[self.pos] = .{ .start_node_at = new_marker.pos };
return new_marker;
}
};
Pratt Parsing (Binding Power)
Expression parsing uses Pratt parsing for operator precedence [6][7]. This technique, introduced by Vaughan Pratt in 1973 ("Top Down Operator Precedence"), elegantly handles operator precedence and associativity through numeric "binding power" values:
Binding Power Concept
─────────────────────────────────────────────────────
Expression: A + B * C
Power: 3 3 5 5
The * has higher binding power, holds B and C tighter.
Result: A + (B * C)
Associativity via asymmetric power:
Expression: A + B + C
Power: (3,4) (3,4)
Right power slightly higher than left → after the first +, the
right operand is parsed with minimum power 4, so the second +
(left power 3) cannot continue it: left associativity.
Result: (A + B) + C
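The loop-and-recurse shape of a Pratt parser is compact enough to show whole. A Python sketch over pre-tokenized input, using the chapter's binding powers for +, -, *, / (atoms are ints; prefix and paren handling omitted):

```python
# Binding powers matching the chapter's table: +,- bind (9,10); *,/ bind (11,12).
BP = {"+": (9, 10), "-": (9, 10), "*": (11, 12), "/": (11, 12)}

def parse(tokens, min_bp=0):
    """Pratt core: returns ints for atoms, (op, lhs, rhs) tuples for nodes.
    Consumes `tokens` (a list) in place."""
    lhs = tokens.pop(0)                 # atom; prefix handling omitted
    while tokens and tokens[0] in BP:
        left_bp, right_bp = BP[tokens[0]]
        if left_bp < min_bp:            # operator binds too loosely: stop here
            break
        op = tokens.pop(0)
        rhs = parse(tokens, right_bp)   # right operand needs power >= right_bp
        lhs = (op, lhs, rhs)
    return lhs

assert parse([1, "+", 2, "*", 3]) == ("+", 1, ("*", 2, 3))  # precedence
assert parse([1, "-", 2, "-", 3]) == ("-", ("-", 1, 2), 3)  # left associativity
```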
Expression Grammar
const grammar = struct {
pub fn root(p: *Parser) CompletedMarker {
const m = p.start();
while (!p.atEnd()) {
stmt(p);
}
return m.complete(p, .root);
}
pub fn expr(p: *Parser) ?CompletedMarker {
return exprBindingPower(p, 0);
}
/// Pratt parser core
fn exprBindingPower(p: *Parser, min_bp: u8) ?CompletedMarker {
// Renamed from `lhs` to avoid shadowing the `lhs` helper function below
var lhs_marker = lhs(p) orelse return null;
while (true) {
const op: ?BinaryOp = blk: {
if (p.at(.plus)) break :blk .add;
if (p.at(.minus)) break :blk .sub;
if (p.at(.star)) break :blk .mul;
if (p.at(.slash)) break :blk .div;
if (p.at(.percent)) break :blk .mod;
if (p.at(.lt)) break :blk .lt;
if (p.at(.gt)) break :blk .gt;
if (p.at(.le)) break :blk .le;
if (p.at(.ge)) break :blk .ge;
if (p.at(.eq)) break :blk .eq;
if (p.at(.neq)) break :blk .neq;
if (p.at(.and_and)) break :blk .and_;
if (p.at(.or_or)) break :blk .or_;
break :blk null;
};
if (op == null) break;
const bp = op.?.bindingPower();
if (bp.left < min_bp) break;
// Consume operator
p.bump();
// Parse right operand with right binding power
const m = lhs_marker.precede(p);
const parsed_rhs = exprBindingPower(p, bp.right) != null;
lhs_marker = m.complete(p, .infix_expr);
if (!parsed_rhs) break;
}
return lhs_marker;
}
/// Left-hand side (atoms and prefix expressions)
fn lhs(p: *Parser) ?CompletedMarker {
if (p.at(.int_number)) return intNumber(p);
if (p.at(.long_number)) return longNumber(p);
if (p.at(.ident)) return ident(p);
if (p.at(.true_kw) or p.at(.false_kw)) return boolLiteral(p);
if (p.at(.minus) or p.at(.bang)) return prefixExpr(p);
if (p.at(.l_paren)) return parenExpr(p);
if (p.at(.l_brace)) return blockExpr(p);
if (p.at(.if_kw)) return ifExpr(p);
p.err();
return null;
}
fn intNumber(p: *Parser) CompletedMarker {
const m = p.start();
p.bump();
return m.complete(p, .int_literal);
}
fn ident(p: *Parser) CompletedMarker {
const m = p.start();
p.bump();
return m.complete(p, .ident);
}
fn prefixExpr(p: *Parser) ?CompletedMarker {
const m = p.start();
const op_bp = UnaryOp.fromToken(p.peek()).?.bindingPower();
p.bump(); // operator
_ = exprBindingPower(p, op_bp.right);
return m.complete(p, .prefix_expr);
}
fn parenExpr(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.l_paren);
_ = expr(p);
p.expect(.r_paren);
return m.complete(p, .paren_expr);
}
fn ifExpr(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.if_kw);
p.expect(.l_paren);
_ = expr(p);
p.expect(.r_paren);
_ = expr(p);
if (p.at(.else_kw)) {
p.bump();
_ = expr(p);
}
return m.complete(p, .if_expr);
}
};
Binary Operators
const BinaryOp = enum {
add,
sub,
mul,
div,
mod,
lt,
gt,
le,
ge,
eq,
neq,
and_,
or_,
const BindingPower = struct { left: u8, right: u8 };
pub fn bindingPower(self: BinaryOp) BindingPower {
return switch (self) {
.or_ => .{ .left = 1, .right = 2 }, // ||
.and_ => .{ .left = 3, .right = 4 }, // &&
.eq, .neq => .{ .left = 5, .right = 6 }, // ==, !=
.lt, .gt, .le, .ge => .{ .left = 7, .right = 8 },
.add, .sub => .{ .left = 9, .right = 10 },
.mul, .div, .mod => .{ .left = 11, .right = 12 },
};
}
};
const UnaryOp = enum {
neg,
not,
pub fn bindingPower(self: UnaryOp) struct { right: u8 } {
return switch (self) {
.neg, .not => .{ .right = 13 }, // Higher than all binary
};
}
};
Operator Precedence Table
Operator Precedence (lowest to highest)
─────────────────────────────────────────────────────
1-2 || Logical OR
3-4 && Logical AND
5-6 == != Equality
7-8 < > <= >= Comparison
9-10 + - Addition, Subtraction
11-12 * / % Multiplication, Division
13 - ! ~ Prefix (unary)
14 . () Postfix (method call, index)
Type Parsing
const TypeParser = struct {
const predef_types = std.ComptimeStringMap(SType, .{
.{ "Boolean", .s_boolean },
.{ "Byte", .s_byte },
.{ "Short", .s_short },
.{ "Int", .s_int },
.{ "Long", .s_long },
.{ "BigInt", .s_big_int },
.{ "GroupElement", .s_group_element },
.{ "SigmaProp", .s_sigma_prop },
.{ "Box", .s_box },
.{ "AvlTree", .s_avl_tree },
.{ "Context", .s_context },
.{ "Header", .s_header },
.{ "PreHeader", .s_pre_header },
.{ "Unit", .s_unit },
});
pub fn parseType(p: *Parser) ?SType {
// Parse a base type first, then check for `=>`; the original
// arrow check was unreachable because both base-type branches
// returned early (and a bare recursive call would not terminate).
const base = parseBaseType(p) orelse return null;
// Function type: T1 => T2
if (p.at(.arrow)) {
p.bump();
const range = parseType(p) orelse return null;
return .{ .s_func = .{ .args = &[_]SType{base}, .ret = range } };
}
return base;
}
fn parseBaseType(p: *Parser) ?SType {
if (p.at(.ident)) {
const name = p.currentText();
// Check predefined types
if (predef_types.get(name)) |t| {
p.bump();
return t;
}
// Generic types: Coll[T], Option[T]
p.bump();
if (p.at(.l_bracket)) {
p.bump();
const inner = parseType(p) orelse return null;
p.expect(.r_bracket);
if (std.mem.eql(u8, name, "Coll")) {
return .{ .s_coll = inner };
} else if (std.mem.eql(u8, name, "Option")) {
return .{ .s_option = inner };
}
}
// Type variable
return .{ .s_type_var = name };
}
// Tuple type: (T1, T2, ...)
if (p.at(.l_paren)) {
p.bump();
var items = std.ArrayList(SType).init(p.allocator);
while (!p.at(.r_paren)) {
const t = parseType(p) orelse return null;
try items.append(t);
if (!p.at(.r_paren)) p.expect(.comma);
}
p.expect(.r_paren);
return .{ .s_tuple = items.toOwnedSlice() };
}
return null;
}
};
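The type grammar (predefined names, `Coll[T]`/`Option[T]`, `T1 => T2`) can be exercised with a throwaway Python parser. Tuple types are omitted and the nested-tuple output format is invented for the example:

```python
def parse_type(src):
    """Parse a type string into nested tuples, e.g.
    'Coll[Byte] => SigmaProp' -> ('func', ('coll', 'Byte'), 'SigmaProp')."""
    toks = src.replace("[", " [ ").replace("]", " ] ").replace("=>", " => ").split()
    t, rest = _type(toks)
    assert not rest, f"trailing tokens: {rest}"
    return t

def _type(toks):
    base, toks = _base(toks)
    if toks and toks[0] == "=>":              # function type: T1 => T2
        ret, toks = _type(toks[1:])
        return ("func", base, ret), toks
    return base, toks

def _base(toks):
    name, toks = toks[0], toks[1:]
    if toks and toks[0] == "[":               # generic: Coll[T], Option[T]
        inner, toks = _type(toks[1:])
        assert toks[0] == "]", "expected ']'"
        return (name.lower(), inner), toks[1:]
    return name, toks

assert parse_type("Int") == "Int"
assert parse_type("Option[Coll[Byte]]") == ("option", ("coll", "Byte"))
assert parse_type("Coll[Byte] => SigmaProp") == ("func", ("coll", "Byte"), "SigmaProp")
```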
Statement Parsing
fn stmt(p: *Parser) ?CompletedMarker {
if (p.at(.val_kw)) {
return valDef(p);
}
if (p.at(.def_kw)) {
return defDef(p);
}
return expr(p);
}
fn valDef(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.val_kw);
p.expect(.ident);
// Optional type annotation
if (p.at(.colon)) {
p.bump();
_ = TypeParser.parseType(p);
}
p.expect(.eq);
_ = expr(p);
return m.complete(p, .val_def);
}
fn defDef(p: *Parser) CompletedMarker {
const m = p.start();
p.expect(.def_kw);
p.expect(.ident);
// Parameters
if (p.at(.l_paren)) {
p.bump();
while (!p.at(.r_paren)) {
p.expect(.ident);
p.expect(.colon);
_ = TypeParser.parseType(p);
if (!p.at(.r_paren)) p.expect(.comma);
}
p.expect(.r_paren);
}
// Return type
if (p.at(.colon)) {
p.bump();
_ = TypeParser.parseType(p);
}
p.expect(.eq);
_ = expr(p);
return m.complete(p, .def_def);
}
Source Position Tracking
Every AST node carries source position for error messages [8]:
const SourceContext = struct {
index: usize,
line: u32,
column: u32,
source_line: []const u8,
pub fn fromIndex(index: usize, source: []const u8) SourceContext {
var line: u32 = 1;
var col: u32 = 1;
var line_start: usize = 0;
for (source[0..index], 0..) |c, i| {
if (c == '\n') {
line += 1;
col = 1;
line_start = i + 1;
} else {
col += 1;
}
}
// Find end of current line
var line_end = index;
while (line_end < source.len and source[line_end] != '\n') {
line_end += 1;
}
return .{
.index = index,
.line = line,
.column = col,
.source_line = source[line_start..line_end],
};
}
};
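The same scan transcribes directly to Python (1-based line and column, plus extraction of the offending line's text):

```python
def source_context(index, source):
    """Map a byte offset to 1-based (line, column) plus the line's text."""
    before = source[:index]
    line = before.count("\n") + 1
    line_start = before.rfind("\n") + 1     # 0 when on the first line
    column = index - line_start + 1
    line_end = source.find("\n", index)
    if line_end == -1:                      # last line may lack a newline
        line_end = len(source)
    return line, column, source[line_start:line_end]

src = "val x = 1\nval y = ?\n"
assert source_context(src.index("?"), src) == (2, 9, "val y = ?")
```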
const ParseError = struct {
expected: []const TokenKind,
found: ?TokenKind,
span: Range,
pub fn format(self: ParseError, ctx: SourceContext) []const u8 {
// Format error message with source context
}
};
Syntax Tree Construction
Events are converted into a concrete syntax tree [9]:
const SyntaxKind = enum {
// Nodes
root,
val_def,
def_def,
if_expr,
block_expr,
infix_expr,
prefix_expr,
paren_expr,
lambda_expr,
apply_expr,
select_expr,
// Literals
int_literal,
long_literal,
bool_literal,
string_literal,
ident,
// Error
err,
};
const SyntaxNode = struct {
kind: SyntaxKind,
range: Range,
children: []SyntaxNode,
text: ?[]const u8,
};
fn buildTree(events: []const Event, tokens: []const Token) SyntaxNode {
var builder = TreeBuilder.init();
for (events) |event| {
switch (event) {
.start_node => |kind| builder.startNode(kind),
.start_node_at => |pos| builder.startNodeAt(pos), // forward ref from precede
.add_token => builder.addToken(tokens[builder.token_idx]),
.finish_node => builder.finishNode(),
.err => |e| builder.addError(e),
.placeholder => {},
}
}
return builder.finish();
}
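The replay loop can be demonstrated with plain tuples standing in for `SyntaxNode`. A simplified sketch that ignores `start_node_at` forward references and error events:

```python
def build_tree(events, tokens):
    """Replay parser events into a (kind, children) tree; tokens are
    consumed left to right by add_token events."""
    stack = [("sentinel", [])]
    next_token = 0
    for ev in events:
        if ev[0] == "start_node":
            stack.append((ev[1], []))          # open a node
        elif ev[0] == "add_token":
            stack[-1][1].append(tokens[next_token])
            next_token += 1
        elif ev[0] == "finish_node":
            node = stack.pop()                 # close it, attach to parent
            stack[-1][1].append(node)
    return stack[0][1][0]

events = [("start_node", "val_def"), ("add_token",), ("add_token",),
          ("finish_node",)]
tokens = [("val_kw", "val"), ("ident", "x")]
assert build_tree(events, tokens) == ("val_def", [("val_kw", "val"), ("ident", "x")])
```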
Parsing Example
Input: "val x = 1 + 2 * 3"
Tokens:
[val_kw, ident("x"), eq, int(1), plus, int(2), star, int(3)]
Events:
start_node(val_def)
add_token(val_kw)
add_token(ident)
add_token(eq)
start_node(infix_expr) // 1 + (2 * 3)
add_token(int) // 1
add_token(plus)
start_node(infix_expr) // 2 * 3
add_token(int) // 2
add_token(star)
add_token(int) // 3
finish_node
finish_node
finish_node
AST:
ValDef
name: "x"
rhs: InfixExpr(+)
lhs: IntLiteral(1)
rhs: InfixExpr(*)
lhs: IntLiteral(2)
rhs: IntLiteral(3)
Error Recovery
const RECOVERY_SET = [_]TokenKind{ .val_kw, .def_kw, .r_brace };
fn err(p: *Parser) void {
const current = p.source.peekToken();
const range = if (current) |t| t.range else p.source.lastTokenRange();
try p.events.append(.{
.err = .{
.expected = p.expected_kinds.toOwnedSlice(),
.found = if (current) |t| t.kind else null,
.span = range,
},
});
// Skip tokens until recovery point
if (!p.atSet(&RECOVERY_SET) and !p.atEnd()) {
const m = p.start();
p.bump();
_ = m.complete(p, .err);
}
}
Summary
- Lexer converts characters to tokens with position tracking
- Parser uses event-based architecture with markers
- Pratt parsing handles operator precedence via binding power
- Left associativity: right power slightly higher than left
- Source positions enable accurate error messages
- Error recovery skips to synchronization points
- Output is untyped AST; semantic analysis comes next
Next: Chapter 17: Semantic Analysis
Scala: SigmaParser.scala
Rust: parser.rs
Rust: lexer.rs
Scala: Basic.scala
Rust: marker.rs
Scala: Exprs.scala
Rust: expr.rs:1-60
Scala: SourceContext.scala
Rust: sink.rs
Chapter 17: Semantic Analysis
Prerequisites
- Chapter 16 for the untyped AST structure
- Chapter 2 for type codes and type compatibility rules
- Familiarity with type inference concepts: type variables, unification, constraint solving
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the two-phase semantic analysis: name binding followed by type inference
- Implement name resolution for globals, environment variables, and local definitions
- Apply the type unification algorithm to infer types and detect mismatches
- Describe method resolution and how method calls are lowered to direct operations
- Trace type inference for complex expressions involving generics and collections
Semantic Analysis Overview
After parsing, the untyped ErgoScript AST passes through two phases [1][2]:
Semantic Analysis Pipeline
─────────────────────────────────────────────────────
Source Code
│
▼
┌──────────────────────────────────────────────────┐
│ PARSE │
│ │
│ Untyped AST │
│ - Identifiers have NoType │
│ - References are unresolved strings │
│ - Operators are symbolic │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ BIND │
│ │
│ Resolve names: │
│ - Global constants (HEIGHT, SELF, INPUTS) │
│ - Environment variables │
│ - Predefined functions │
└──────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────┐
│ TYPE │
│ │
│ Assign types: │
│ - Infer expression types │
│ - Resolve method calls │
│ - Unify generic types │
│ - Check type consistency │
└──────────────────────────────────────────────────┘
│
▼
Typed AST (ready for IR)
Phase 1: Name Binding
The binder resolves identifiers to their definitions [3][4]:
const BinderError = struct {
msg: []const u8,
span: Range,
pub fn prettyDesc(self: BinderError, source: []const u8) []const u8 {
// Format error with source context
}
};
const GlobalVars = enum {
height,
self_,
inputs,
outputs,
context,
global,
miner_pubkey,
last_block_utxo_root_hash,
pub fn tpe(self: GlobalVars) SType {
return switch (self) {
.height => .s_int,
.self_ => .s_box,
.inputs => .{ .s_coll = .s_box },
.outputs => .{ .s_coll = .s_box },
.context => .s_context,
.global => .s_global,
.miner_pubkey => .{ .s_coll = .s_byte },
.last_block_utxo_root_hash => .s_avl_tree,
};
}
};
const Binder = struct {
env: ScriptEnv,
allocator: Allocator,
pub fn init(allocator: Allocator, env: ScriptEnv) Binder {
return .{ .env = env, .allocator = allocator };
}
pub fn bind(self: *const Binder, expr: Expr) BinderError!Expr {
return self.rewrite(expr);
}
fn rewrite(self: *const Binder, expr: Expr) BinderError!Expr {
return switch (expr.kind) {
.ident => |name| blk: {
// Check environment first
if (self.env.get(name)) |value| {
break :blk liftToConstant(value, expr.span);
}
// Check global variables
if (resolveGlobal(name)) |global| {
break :blk .{
.kind = .{ .global_vars = global },
.span = expr.span,
.tpe = global.tpe(),
};
}
// Leave unresolved for typer
break :blk expr;
},
.binary => |bin| blk: {
// Allocate the child nodes and store the rewritten subtrees in them
// (the originals were allocated but never initialized)
const left = try self.allocator.create(Expr);
left.* = try self.rewrite(bin.lhs.*);
const right = try self.allocator.create(Expr);
right.* = try self.rewrite(bin.rhs.*);
break :blk .{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = left,
.rhs = right,
} },
.span = expr.span,
.tpe = expr.tpe,
};
},
.block => |block| blk: {
var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);
for (block.bindings, 0..) |binding, i| {
const rhs = try self.rewrite(binding.rhs.*);
new_bindings[i] = .{
.name = binding.name,
.tpe = rhs.tpe orelse binding.tpe,
.rhs = rhs,
};
}
const body = try self.rewrite(block.body.*);
break :blk .{
.kind = .{ .block = .{
.bindings = new_bindings,
.body = body,
} },
.span = expr.span,
.tpe = body.tpe,
};
},
.lambda => |lam| blk: {
const body = try self.rewrite(lam.body.*);
break :blk .{
.kind = .{ .lambda = .{
.args = lam.args,
.body = body,
} },
.span = expr.span,
.tpe = expr.tpe,
};
},
else => expr,
};
}
fn resolveGlobal(name: []const u8) ?GlobalVars {
const globals = std.StaticStringMap(GlobalVars).initComptime(.{
.{ "HEIGHT", .height },
.{ "SELF", .self_ },
.{ "INPUTS", .inputs },
.{ "OUTPUTS", .outputs },
.{ "CONTEXT", .context },
.{ "Global", .global },
.{ "MinerPubkey", .miner_pubkey },
.{ "LastBlockUtxoRootHash", .last_block_utxo_root_hash },
});
return globals.get(name);
}
fn liftToConstant(value: anytype, span: Range) Expr {
const T = @TypeOf(value);
return .{
.kind = .{ .literal = switch (T) {
i32 => .{ .int = value },
i64 => .{ .long = value },
bool => .{ .bool_ = value },
else => @compileError("unsupported type"),
} },
.span = span,
.tpe = SType.fromNative(T),
};
}
};
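The binder's resolution order (script environment first, then the global table, otherwise leave the identifier for the typer) can be sketched as a small runnable model. This is an illustrative, language-agnostic sketch, not any implementation's API; the names `bind_ident`, `GLOBALS`, and the tuple encoding are hypothetical.

```python
# Hypothetical model of identifier resolution in the binder.
# Tuples stand in for AST nodes: ("Constant", v), ("Global", name, type), ("Ident", name).

GLOBALS = {"HEIGHT": "Int", "SELF": "Box", "INPUTS": "Coll[Box]",
           "OUTPUTS": "Coll[Box]"}

def bind_ident(name, env):
    if name in env:                        # lift env value to a constant
        return ("Constant", env[name])
    if name in GLOBALS:                    # resolve to a typed global node
        return ("Global", name, GLOBALS[name])
    return ("Ident", name)                 # unresolved: the typer reports it

env = {"minValue": 1000000}
print(bind_ident("minValue", env))   # ('Constant', 1000000)
print(bind_ident("HEIGHT", env))     # ('Global', 'HEIGHT', 'Int')
print(bind_ident("foo", env))        # ('Ident', 'foo')
```

Note that an unresolved identifier is not an error at this stage: the binder leaves it in place, and the typer raises "Cannot assign type for variable" if no binding appears later.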
Global Constants
Built-in Global Constants
─────────────────────────────────────────────────────
Name Type Description
─────────────────────────────────────────────────────
HEIGHT Int Current block height
SELF Box Current box being spent
INPUTS Coll[Box] Transaction inputs
OUTPUTS Coll[Box] Transaction outputs
CONTEXT Context Execution context
MinerPubkey Coll[Byte] Miner's public key
LastBlockUtxoRootHash AvlTree UTXO digest
Phase 2: Type Inference
The typer assigns types to all expressions[5][6]:
const TyperError = struct {
msg: []const u8,
span: Range,
};
const TypeEnv = std.StringHashMap(SType);
const Typer = struct {
predef_env: TypeEnv,
lower_method_calls: bool,
allocator: Allocator,
pub fn init(allocator: Allocator, type_env: TypeEnv, lower: bool) Typer {
var env = TypeEnv.init(allocator);
// Add predefined function types
env.put("min", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
env.put("max", .{ .s_func = .{ .args = &[_]SType{ .s_int, .s_int }, .ret = .s_int } }) catch {};
// Merge with provided env
var it = type_env.iterator();
while (it.next()) |entry| {
env.put(entry.key_ptr.*, entry.value_ptr.*) catch {};
}
return .{
.predef_env = env,
.lower_method_calls = lower,
.allocator = allocator,
};
}
pub fn typecheck(self: *Typer, bound: Expr) TyperError!Expr {
const typed = try self.assignType(&self.predef_env, bound);
if (typed.tpe == null) {
return TyperError{
.msg = "No type assigned to expression",
.span = bound.span,
};
}
return typed;
}
fn assignType(self: *Typer, env: *const TypeEnv, expr: Expr) TyperError!Expr {
return switch (expr.kind) {
// Identifier: lookup in environment
.ident => |name| blk: {
if (env.get(name)) |t| {
break :blk .{
.kind = expr.kind,
.span = expr.span,
.tpe = t,
};
}
return TyperError{
.msg = "Cannot assign type for variable",
.span = expr.span,
};
},
// Global variables already typed
.global_vars => |g| .{
.kind = expr.kind,
.span = expr.span,
.tpe = g.tpe(),
},
// Block: extend environment with each binding
// NOTE: In production, use a binding stack instead of cloning HashMap
// for each scope. See ZIGMA_STYLE.md for zero-allocation patterns.
.block => |block| blk: {
var cur_env = try env.clone();
var new_bindings = try self.allocator.alloc(ValDef, block.bindings.len);
for (block.bindings, 0..) |binding, i| {
const rhs = try self.assignType(&cur_env, binding.rhs.*);
try cur_env.put(binding.name, rhs.tpe.?);
new_bindings[i] = .{
.name = binding.name,
.tpe = rhs.tpe.?,
.rhs = rhs,
};
}
const body = try self.assignType(&cur_env, block.body.*);
break :blk .{
.kind = .{ .block = .{
.bindings = new_bindings,
.body = body,
} },
.span = expr.span,
.tpe = body.tpe,
};
},
// Binary: type operands, check compatibility
.binary => |bin| blk: {
const left = try self.assignType(env, bin.lhs.*);
const right = try self.assignType(env, bin.rhs.*);
const result_type = try inferBinaryType(
bin.op,
left.tpe.?,
right.tpe.?,
);
break :blk .{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = left,
.rhs = right,
} },
.span = expr.span,
.tpe = result_type,
};
},
// If: check condition is Boolean, branches have same type
.if_ => |if_expr| blk: {
const cond = try self.assignType(env, if_expr.cond.*);
const then_ = try self.assignType(env, if_expr.then_.*);
const else_ = try self.assignType(env, if_expr.else_.*);
if (cond.tpe.? != .s_boolean) {
return TyperError{
.msg = "Condition must be Boolean",
.span = cond.span,
};
}
if (!typesEqual(then_.tpe.?, else_.tpe.?)) {
return TyperError{
.msg = "Branches must have same type",
.span = expr.span,
};
}
break :blk .{
.kind = .{ .if_ = .{
.cond = cond,
.then_ = then_,
.else_ = else_,
} },
.span = expr.span,
.tpe = then_.tpe,
};
},
// Lambda: check argument types, type body
.lambda => |lam| blk: {
var lambda_env = try env.clone();
for (lam.args) |arg| {
if (arg.tpe == .no_type) {
return TyperError{
.msg = "Lambda argument must have explicit type",
.span = expr.span,
};
}
try lambda_env.put(arg.name, arg.tpe);
}
const body = try self.assignType(&lambda_env, lam.body.*);
var arg_types = try self.allocator.alloc(SType, lam.args.len);
for (lam.args, 0..) |arg, i| arg_types[i] = arg.tpe;
const func_type = SType{
.s_func = .{
.args = arg_types,
.ret = body.tpe.?,
},
};
break :blk .{
.kind = .{ .lambda = .{
.args = lam.args,
.body = body,
} },
.span = expr.span,
.tpe = func_type,
};
},
// Method call: type receiver, resolve method, unify types
.select => |sel| try self.typeSelect(env, sel, expr.span),
.apply => |app| try self.typeApply(env, app, expr.span),
// Literals already typed
.literal => |lit| .{
.kind = expr.kind,
.span = expr.span,
.tpe = switch (lit) {
.int => .s_int,
.long => .s_long,
.bool_ => .s_boolean,
.string => .{ .s_coll = .s_byte },
},
},
else => expr,
};
}
};
Binary Operation Type Inference
fn inferBinaryType(op: BinaryOp, left: SType, right: SType) error{TypeMismatch}!SType {
return switch (op) {
// Arithmetic: operands must be same numeric type
.plus, .minus, .multiply, .divide, .modulo => blk: {
if (!left.isNumeric() or !right.isNumeric()) {
return error.TypeMismatch;
}
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk left;
},
// Comparison: operands must be same type, result is Boolean
.lt, .gt, .le, .ge => blk: {
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk .s_boolean;
},
// Equality: operands must be same type
.eq, .neq => blk: {
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk .s_boolean;
},
// Logical: Boolean operands
.and_, .or_ => blk: {
if (left == .s_boolean and right == .s_boolean) {
break :blk .s_boolean;
}
// SigmaProp operations
if (left == .s_sigma_prop and right == .s_sigma_prop) {
break :blk .s_sigma_prop;
}
// Mixed: SigmaProp with Boolean
if ((left == .s_sigma_prop and right == .s_boolean) or
(left == .s_boolean and right == .s_sigma_prop))
{
break :blk .s_boolean;
}
return error.TypeMismatch;
},
// Bitwise: numeric operands
.bit_and, .bit_or, .bit_xor => blk: {
if (!left.isNumeric() or !right.isNumeric()) {
return error.TypeMismatch;
}
if (!typesEqual(left, right)) {
return error.TypeMismatch;
}
break :blk left;
},
};
}
Type Unification
Finds a substitution making two types equal[7]:
const TypeSubst = std.StringHashMap(SType);
fn unifyTypes(allocator: Allocator, t1: SType, t2: SType) ?TypeSubst {
var subst = TypeSubst.init(allocator);
return switch (t1) {
// Type variable matches anything
.s_type_var => |name| blk: {
subst.put(name, t2) catch return null;
break :blk subst;
},
// Collection types: unify element types
.s_coll => |elem1| switch (t2) {
.s_coll => |elem2| unifyTypes(allocator, elem1, elem2),
else => null,
},
// Option types: unify element types
.s_option => |elem1| switch (t2) {
.s_option => |elem2| unifyTypes(allocator, elem1, elem2),
else => null,
},
// Tuple types: unify element-wise
.s_tuple => |items1| switch (t2) {
.s_tuple => |items2| blk: {
if (items1.len != items2.len) break :blk null;
for (items1, items2) |i1, i2| {
const sub = unifyTypes(allocator, i1, i2) orelse break :blk null;
subst = mergeSubst(subst, sub) orelse break :blk null;
}
break :blk subst;
},
else => null,
},
// Function types: unify domain and range
.s_func => |f1| switch (t2) {
.s_func => |f2| blk: {
if (f1.args.len != f2.args.len) break :blk null;
for (f1.args, f2.args) |a1, a2| {
const sub = unifyTypes(allocator, a1, a2) orelse break :blk null;
subst = mergeSubst(subst, sub) orelse break :blk null;
}
const ret_sub = unifyTypes(allocator, f1.ret, f2.ret) orelse break :blk null;
break :blk mergeSubst(subst, ret_sub);
},
else => null,
},
// Boolean can unify with SigmaProp (implicit conversion)
.s_boolean => switch (t2) {
.s_sigma_prop, .s_boolean => subst,
else => null,
},
// SAny matches anything
.s_any => subst,
// Primitive types must match exactly
else => if (typesEqual(t1, t2)) subst else null,
};
}
fn applySubst(allocator: Allocator, tpe: SType, subst: TypeSubst) SType {
return switch (tpe) {
.s_type_var => |name| subst.get(name) orelse tpe,
.s_coll => |elem| .{ .s_coll = applySubst(allocator, elem, subst) },
.s_option => |elem| .{ .s_option = applySubst(allocator, elem, subst) },
.s_tuple => |items| blk: {
const new_items = allocator.alloc(SType, items.len) catch unreachable;
for (items, 0..) |t, i| new_items[i] = applySubst(allocator, t, subst);
break :blk .{ .s_tuple = new_items };
},
.s_func => |f| blk: {
const new_args = allocator.alloc(SType, f.args.len) catch unreachable;
for (f.args, 0..) |t, i| new_args[i] = applySubst(allocator, t, subst);
break :blk .{
.s_func = .{
.args = new_args,
.ret = applySubst(allocator, f.ret, subst),
},
};
},
else => tpe,
};
}
fn mergeSubst(s1: TypeSubst, s2: TypeSubst) ?TypeSubst {
var result = s1.clone() catch return null;
var it = s2.iterator();
while (it.next()) |entry| {
if (result.get(entry.key_ptr.*)) |existing| {
if (!typesEqual(existing, entry.value_ptr.*)) {
return null; // Conflict
}
} else {
result.put(entry.key_ptr.*, entry.value_ptr.*) catch return null;
}
}
return result;
}
Unification Example
Generic Method Specialization
─────────────────────────────────────────────────────
coll.map(f) where:
- coll: Coll[Byte]
- map type: (Coll[T], T => R) => Coll[R]
- f: Byte => Int
Step 1: Unify Coll[T] with Coll[Byte]
Result: {T → Byte}
Step 2: Unify (T => R) with (Byte => Int)
T already bound to Byte ✓
Result: {T → Byte, R → Int}
Step 3: Apply substitution to result type
Coll[R] → Coll[Int]
Final: map specialized to (Coll[Byte], Byte => Int) => Coll[Int]
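The specialization steps above can be reproduced with a toy unifier. This is an illustrative sketch, not the actual implementation: types are encoded as plain tuples like `("Coll", "Byte")`, and single-letter uppercase strings stand in for type variables.

```python
# Hypothetical toy unifier mirroring the walkthrough above.

def unify(t1, t2, subst=None):
    """Return a substitution {var: type} making t1 equal to t2, or None."""
    subst = dict(subst or {})
    if isinstance(t1, str) and len(t1) == 1 and t1.isupper():  # type variable
        if t1 in subst and subst[t1] != t2:
            return None                     # conflicting binding
        subst[t1] = t2
        return subst
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if t1[0] != t2[0] or len(t1) != len(t2):
            return None                     # different type constructors
        for a, b in zip(t1[1:], t2[1:]):    # unify component-wise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if t1 == t2 else None

def apply_subst(t, subst):
    """Specialize a type by replacing bound type variables."""
    if isinstance(t, str):
        return subst.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(x, subst) for x in t[1:])
    return t

# map: (Coll[T], T => R) => Coll[R] applied to Coll[Byte] and Byte => Int
s = unify(("Coll", "T"), ("Coll", "Byte"))
s = unify(("Func", "T", "R"), ("Func", "Byte", "Int"), s)
print(s)                              # {'T': 'Byte', 'R': 'Int'}
print(apply_subst(("Coll", "R"), s))  # ('Coll', 'Int')
```

The conflict check inside the type-variable case plays the role of `mergeSubst`: a variable already bound to a different type makes unification fail.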
Method Resolution
Methods are looked up in the type's methods container[8]:
const MethodsContainer = struct {
const methods_by_type = std.StaticStringMap([]const MethodInfo).initComptime(.{
.{ "SBox", &box_methods },
.{ "SColl", &coll_methods },
.{ "SContext", &context_methods },
// ...
});
pub fn getMethod(tpe: SType, name: []const u8) ?MethodInfo {
const type_name = tpe.typeName();
if (methods_by_type.get(type_name)) |methods| {
for (methods) |m| {
if (std.mem.eql(u8, m.name, name)) {
return m;
}
}
}
return null;
}
};
const MethodInfo = struct {
name: []const u8,
stype: SType,
ir_builder: ?*const fn (Expr, []const Expr) Expr,
};
const box_methods = [_]MethodInfo{
.{ .name = "value", .stype = .s_long, .ir_builder = null },
.{ .name = "propositionBytes", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
.{ .name = "id", .stype = .{ .s_coll = .s_byte }, .ir_builder = null },
.{ .name = "tokens", .stype = .{ .s_coll = .{ .s_tuple = &[_]SType{
.{ .s_coll = .s_byte }, .s_long,
} } }, .ir_builder = null },
// ...
};
const coll_methods = [_]MethodInfo{
.{ .name = "size", .stype = .s_int, .ir_builder = &buildSizeOf },
.{ .name = "map", .stype = .{ .s_func = .{
.args = &[_]SType{ .{ .s_type_var = "T" }, .{ .s_func = .{
.args = &[_]SType{.{ .s_type_var = "T" }},
.ret = .{ .s_type_var = "R" },
} } },
.ret = .{ .s_coll = .{ .s_type_var = "R" } },
} }, .ir_builder = &buildMapCollection },
// ...
};
Method Lowering
When lower_method_calls = true, method calls become IR nodes[9]:
fn typeSelect(
self: *Typer,
env: *const TypeEnv,
sel: SelectExpr,
span: Range,
) TyperError!Expr {
const receiver = try self.assignType(env, sel.obj.*);
const receiver_type = receiver.tpe.?;
const method = MethodsContainer.getMethod(receiver_type, sel.field) orelse {
return TyperError{
.msg = "Method not found",
.span = span,
};
};
// Specialize generic method type
const specialized = specializeMethod(method.stype, receiver_type);
// Lower to IR node if builder available
if (method.ir_builder) |builder| {
if (self.lower_method_calls) {
return builder(receiver, &[_]Expr{});
}
}
// Keep as method call
return .{
.kind = .{ .select = .{
.obj = receiver,
.field = sel.field,
} },
.span = span,
.tpe = specialized,
};
}
fn buildSizeOf(receiver: Expr, _: []const Expr) Expr {
return .{
.kind = .{ .size_of = receiver },
.span = receiver.span,
.tpe = .s_int,
};
}
fn buildMapCollection(receiver: Expr, args: []const Expr) Expr {
return .{
.kind = .{ .map = .{
.input = receiver,
.mapper = args[0],
} },
.span = receiver.span,
.tpe = args[0].tpe.?.s_func.ret,
};
}
MIR Lowering
After typing, HIR lowers to MIR (typed IR)[10]:
const MirLoweringError = struct {
msg: []const u8,
span: Range,
};
pub fn lower(hir_expr: hir.Expr) MirLoweringError!mir.Expr {
const mir_expr: mir.Expr = switch (hir_expr.kind) {
.global_vars => |g| switch (g) {
.height => mir.GlobalVars.height.toExpr(),
.self_ => mir.GlobalVars.self_.toExpr(),
// ...
},
.ident => return MirLoweringError{
.msg = "Unresolved identifier",
.span = hir_expr.span,
},
.binary => |bin| blk: {
const left = try lower(bin.lhs.*);
const right = try lower(bin.rhs.*);
break :blk mir.BinOp{
.kind = bin.op.toMirOp(),
.left = left,
.right = right,
}.toExpr();
},
.literal => |lit| switch (lit) {
.int => |v| mir.Constant{ .int = v }.toExpr(),
.long => |v| mir.Constant{ .long = v }.toExpr(),
.bool_ => |v| (if (v) mir.TrueLeaf else mir.FalseLeaf).toExpr(),
},
// ...
};
// Verify types match
const hir_tpe = hir_expr.tpe orelse return MirLoweringError{
.msg = "Missing type for HIR expression",
.span = hir_expr.span,
};
if (!typesEqual(mir_expr.tpe(), hir_tpe)) {
return MirLoweringError{
.msg = "Type mismatch after lowering",
.span = hir_expr.span,
};
}
return mir_expr;
}
Complete Compilation Flow
pub fn compile(source: []const u8, env: ScriptEnv) CompileError!mir.Expr {
// 1. Parse
const tokens = Lexer.init(source).tokenize();
const events = Parser.init(tokens).parse();
const ast = buildTree(events, tokens);
// 2. Lower to HIR (named to avoid shadowing the `hir` module)
const hir_expr = try hir.lower(ast);
// 3. Bind
const binder = Binder.init(allocator, env);
const bound = try binder.bind(hir_expr);
// 4. Type (`typecheck` takes *Typer, so the binding must be mutable)
var typer = Typer.init(allocator, TypeEnv.init(allocator), true);
const typed = try typer.typecheck(bound);
// 5. Lower to MIR
const mir_expr = try mir.lower(typed);
return mir_expr;
}
Error Messages
Error Types
─────────────────────────────────────────────────────
BinderError:
- "Variable x already defined"
- "Cannot lift value to constant"
TyperError:
- "Cannot assign type for variable 'foo'"
- "Condition must be Boolean, got Int"
- "Branches must have same type: Int vs Long"
- "Method 'bar' not found in type Box"
MirLoweringError:
- "Unresolved identifier"
- "Type mismatch after lowering"
Summary
Semantic analysis consists of two phases:
Binding (Binder):
- Resolves global names (HEIGHT, SELF, etc.)
- Lifts environment values to constants
- Uses bottom-up tree rewriting
Typing (Typer):
- Assigns types to all expressions
- Resolves method calls via MethodsContainer
- Unifies generic types with concrete types
- Optionally lowers method calls to IR nodes
- Checks type consistency
Key algorithms:
- Type unification: Find substitution making types equal
- Substitution application: Specialize generic types
- Method resolution: Look up methods in type's container
Next: Chapter 18: Intermediate Representation
1. Scala: SigmaBinder.scala
2. Rust: binder.rs
3. Scala: SigmaBinder.scala:30-100
4. Rust: binder.rs:26-61
5. Scala: SigmaTyper.scala
6. Rust: type_infer.rs
7. Scala: package.scala (unifyTypes)
8. Scala: SRMethod.scala
9. Scala: SigmaTyper.scala:200-280
10. Rust: lower.rs:29-76
Chapter 18: Intermediate Representation (IR)
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 17 for the typed AST that feeds into IR construction
- Chapter 5 for operation codes that IR nodes map to
- Understanding of compiler optimization concepts: CSE, dead code elimination
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the graph-based IR design using the Def/Ref pattern
- Implement common subexpression elimination (CSE) via hash-consing
- Apply graph rewriting for algebraic simplifications
- Trace the AST → Graph IR → Optimized Tree transformations
IR Architecture Overview
The Scala compiler uses a sophisticated graph-based IR for optimization[1][2]. The Rust compiler uses a simpler direct HIR→MIR pipeline[3].
Compilation Pipelines
─────────────────────────────────────────────────────
Scala (Graph IR):
┌─────────┐ GraphBuilding ┌──────────┐ TreeBuilding ┌──────────┐
│ Typed │ ─────────────────>│ Graph IR │ ─────────────────>│ Optimized│
│ AST │ (+ CSE) │ (Def/Ref)│ (ValDef min) │ ErgoTree │
└─────────┘ └──────────┘ └──────────┘
│
│ DefRewriting
│ (algebraic simplifications)
▼
Rust (Direct):
┌─────────┐ Lower ┌──────────┐ Lower ┌──────────┐ Check ┌──────────┐
│ HIR │ ─────────> │ Bound │ ─────────> │ Typed │ ─────────>│ MIR/ │
│ (parse) │ │ HIR │ │ HIR │ │ ErgoTree │
└─────────┘ └──────────┘ └──────────┘ └──────────┘
The Def/Ref Pattern
The core IR abstraction uses definitions (nodes) and references (edges)[4][5]:
/// Reference to a definition (graph edge)
/// Like a pointer but with type information
const Sym = u32; // Symbol ID
/// Type descriptor for IR values
const Elem = struct {
stype: SType,
source_type: ?*const std.builtin.Type,
};
/// Base type for all graph nodes
const Node = struct {
/// Unique ID assigned on creation
node_id: u32,
/// Cached dependencies (other nodes this one uses)
deps: ?[]const Sym,
/// Cached hash for structural equality
hash_code: u32,
pub fn getDeps(self: *const Node) []const Sym {
if (self.deps) |d| return d;
// Computed lazily from node contents
return computeDeps(self);
}
};
/// Definition of a computation (graph node)
const Def = struct {
node: Node,
/// Type of the result value
result_type: Elem,
/// Reference to this definition (created lazily)
self_ref: ?Sym,
pub fn self(d: *Def, ctx: *IRContext) Sym {
if (d.self_ref) |s| return s;
const sym = ctx.freshSym(d);
d.self_ref = sym;
return sym;
}
};
IR Context
The IR context manages the graph and provides CSE[6][7]:
const IRContext = struct {
allocator: Allocator,
/// Counter for unique node IDs
id_counter: u32,
/// Global definitions: Def hash → Sym
/// This enables CSE through hash-consing
global_defs: std.HashMap(*const Def, Sym, DefHashContext, 80),
/// Sym → Def mapping
sym_to_def: std.AutoHashMap(Sym, *const Def),
pub fn init(allocator: Allocator) IRContext {
return .{
.allocator = allocator,
.id_counter = 0,
.global_defs = std.HashMap(*const Def, Sym, DefHashContext, 80).init(allocator),
.sym_to_def = std.AutoHashMap(Sym, *const Def).init(allocator),
};
}
/// Generate fresh symbol ID
pub fn freshSym(self: *IRContext, def: *const Def) Sym {
const id = self.id_counter;
self.id_counter += 1;
self.sym_to_def.put(id, def) catch unreachable;
return id;
}
/// Create or reuse existing definition (CSE)
pub fn reifyObject(self: *IRContext, d: *Def) Sym {
return self.findOrCreateDefinition(d);
}
/// Hash-consing: lookup by structural equality
fn findOrCreateDefinition(self: *IRContext, d: *Def) Sym {
if (self.global_defs.get(d)) |existing_sym| {
// Reuse existing definition
return existing_sym;
}
// Register new definition
const sym = d.self(self);
self.global_defs.put(d, sym) catch unreachable;
return sym;
}
};
/// Hash context for structural equality of definitions
const DefHashContext = struct {
pub fn hash(_: DefHashContext, def: *const Def) u64 {
// Hash based on node type and contents (structural)
return def.node.hash_code;
}
pub fn eql(_: DefHashContext, a: *const Def, b: *const Def) bool {
// Structural equality of definitions
return structuralEqual(a, b);
}
};
Common Subexpression Elimination
CSE is achieved automatically through hash-consing[8]:
CSE Through Hash-Consing
─────────────────────────────────────────────────────
Source:
val a = SELF.value
val b = SELF.value // Same computation!
a + b
Step 1: Build graph for SELF.value
s1 = Self
s2 = MethodCall(s1, "value") → stored in global_defs
Step 2: Build graph for second SELF.value
s1 = Self → already exists, reuse
s2 = MethodCall(s1, "value") → lookup in global_defs
→ found! return existing s2
Step 3: Build addition
s3 = Plus(s2, s2) → both operands point to s2
Result: Single computation of SELF.value
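The hash-consing mechanism in this walkthrough can be modeled in a few lines. This is an illustrative sketch, not the real `IRContext` API: plain tuples stand in for structural equality of definitions, and symbol IDs are just counters.

```python
# Hypothetical hash-consing model: structurally equal definitions are
# interned, so building the same computation twice yields the same symbol.

class IRContext:
    def __init__(self):
        self.global_defs = {}   # structural key -> symbol id
        self.next_sym = 0

    def reify(self, key):
        """Return the existing symbol for `key`, or intern a new one (CSE)."""
        if key in self.global_defs:
            return self.global_defs[key]
        sym = self.next_sym
        self.next_sym += 1
        self.global_defs[key] = sym
        return sym

ctx = IRContext()
s_self = ctx.reify(("Self",))
a = ctx.reify(("MethodCall", s_self, "value"))   # val a = SELF.value
b = ctx.reify(("MethodCall", s_self, "value"))   # val b = SELF.value (reused!)
plus = ctx.reify(("Plus", a, b))
print(a == b)          # True: a single node serves both bindings
print(ctx.next_sym)    # 3 nodes total: Self, SELF.value, Plus
```

Because `reify` is the only way to create a node, duplicate subexpressions can never enter the graph; CSE is a structural invariant rather than a separate pass.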
/// Build graph from typed AST
const GraphBuilder = struct {
ctx: *IRContext,
env: std.StringHashMap(Sym),
pub fn buildGraph(self: *GraphBuilder, expr: *const TypedExpr) !Sym {
return switch (expr.kind) {
.constant => |c| self.buildConstant(c),
.val_use => |name| self.env.get(name) orelse error.UndefinedVariable,
.block => |b| self.buildBlock(b),
.bin_op => |op| self.buildBinOp(op),
.method_call => |mc| self.buildMethodCall(mc),
.if_expr => |i| self.buildIf(i),
.func_value => |f| self.buildLambda(f),
.apply => |a| self.buildApply(a),
};
}
fn buildConstant(self: *GraphBuilder, c: Constant) Sym {
const def = self.ctx.allocator.create(ConstDef) catch unreachable;
def.* = .{ .value = c };
// CSE: if same constant exists, reuse it
return self.ctx.reifyObject(&def.base);
}
fn buildBinOp(self: *GraphBuilder, op: *const BinOp) !Sym {
const left_sym = try self.buildGraph(op.left);
const right_sym = try self.buildGraph(op.right);
const def = self.ctx.allocator.create(BinOpDef) catch unreachable;
def.* = .{
.op = op.kind,
.left = left_sym,
.right = right_sym,
};
// CSE: reuse if same operation on same operands exists
return self.ctx.reifyObject(&def.base);
}
fn buildMethodCall(self: *GraphBuilder, mc: *const MethodCall) !Sym {
const receiver_sym = try self.buildGraph(mc.receiver);
var arg_syms = try self.ctx.allocator.alloc(Sym, mc.args.len);
for (mc.args, 0..) |arg, i| {
arg_syms[i] = try self.buildGraph(arg);
}
const def = self.ctx.allocator.create(MethodCallDef) catch unreachable;
def.* = .{
.receiver = receiver_sym,
.method = mc.method,
.args = arg_syms,
};
// CSE: reuse if identical method call exists
return self.ctx.reifyObject(&def.base);
}
};
Graph Rewriting
Algebraic simplifications are applied as rewrite rules[9][10]:
/// Rewriting rules for optimization
const DefRewriter = struct {
ctx: *IRContext,
/// Called on each new definition
/// Returns replacement Sym or null for no rewrite
pub fn rewriteDef(self: *DefRewriter, d: *const Def) ?Sym {
return switch (d.kind()) {
.coll_length => self.rewriteLength(d.as(CollLengthDef)),
.coll_map => self.rewriteMap(d.as(CollMapDef)),
.coll_zip => self.rewriteZip(d.as(CollZipDef)),
.option_get_or_else => self.rewriteGetOrElse(d.as(OptionGetOrElseDef)),
else => null,
};
}
/// xs.map(f).length => xs.length
fn rewriteLength(self: *DefRewriter, len_def: *const CollLengthDef) ?Sym {
const input = self.ctx.getDef(len_def.input);
return switch (input.kind()) {
.coll_map => |map_def| {
// Rule: xs.map(f).length => xs.length
return self.makeLength(map_def.input);
},
.coll_replicate => |rep_def| {
// Rule: replicate(len, v).length => len
return rep_def.length;
},
.const_coll => |coll_def| {
// Rule: Const(coll).length => coll.length
return self.makeConstant(.{ .int = @intCast(coll_def.items.len) });
},
.coll_from_items => |items_def| {
// Rule: Coll(items).length => items.length
return self.makeConstant(.{ .int = @intCast(items_def.items.len) });
},
else => null,
};
}
/// xs.map(identity) => xs
/// xs.map(f).map(g) => xs.map(x => g(f(x)))
fn rewriteMap(self: *DefRewriter, map_def: *const CollMapDef) ?Sym {
const mapper = self.ctx.getDef(map_def.mapper);
// Rule: xs.map(identity) => xs
if (isIdentityLambda(mapper)) {
return map_def.input;
}
const input = self.ctx.getDef(map_def.input);
return switch (input.kind()) {
.coll_replicate => |rep_def| {
// Rule: replicate(l, v).map(f) => replicate(l, f(v))
const applied = self.makeApply(map_def.mapper, rep_def.value);
return self.makeReplicate(rep_def.length, applied);
},
.coll_map => |inner_map| {
// Rule: xs.map(f).map(g) => xs.map(x => g(f(x)))
const composed = self.composeLambdas(inner_map.mapper, map_def.mapper);
return self.makeMap(inner_map.input, composed);
},
else => null,
};
}
/// replicate(l, x).zip(replicate(l, y)) => replicate(l, (x, y))
fn rewriteZip(self: *DefRewriter, zip_def: *const CollZipDef) ?Sym {
const left = self.ctx.getDef(zip_def.left);
const right = self.ctx.getDef(zip_def.right);
if (left.kind() == .coll_replicate and right.kind() == .coll_replicate) {
const rep_l = left.as(CollReplicateDef);
const rep_r = right.as(CollReplicateDef);
// Check same length and builder
if (rep_l.length == rep_r.length and rep_l.builder == rep_r.builder) {
const pair = self.makePair(rep_l.value, rep_r.value);
return self.makeReplicate(rep_l.length, pair);
}
}
return null;
}
/// Some(x).getOrElse(d) => x
fn rewriteGetOrElse(self: *DefRewriter, def: *const OptionGetOrElseDef) ?Sym {
const opt = self.ctx.getDef(def.option);
if (opt.kind() == .option_const) {
const opt_const = opt.as(OptionConstDef);
if (opt_const.value) |v| {
return self.liftValue(v);
}
}
return null;
}
};
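The map rules above can be exercised on a toy term representation. This is an illustrative sketch only: the tuple encoding and the `rewrite_map` helper are hypothetical stand-ins for the graph-based rewriter, and functions are composed directly rather than by building a composed lambda node.

```python
# Toy rewriter for two of the rules above:
#   xs.map(identity)  => xs
#   xs.map(f).map(g)  => xs.map(g . f)   (map fusion)

IDENTITY = lambda x: x

def compose(f, g):
    return lambda x: g(f(x))

def rewrite_map(term):
    """Rewrite ("Map", xs, f) terms bottom-up."""
    if not (isinstance(term, tuple) and term[0] == "Map"):
        return term
    xs, f = rewrite_map(term[1]), term[2]
    if f is IDENTITY:
        return xs                                   # identity elimination
    if isinstance(xs, tuple) and xs[0] == "Map":
        inner_xs, inner_f = xs[1], xs[2]
        return ("Map", inner_xs, compose(inner_f, f))  # fuse the two maps
    return ("Map", xs, f)

term = ("Map", ("Map", ("Coll", [1, 2, 3]), lambda x: x + 1), lambda x: x * 2)
fused = rewrite_map(term)
print(fused[1])                            # ('Coll', [1, 2, 3]): one map left
print([fused[2](x) for x in [1, 2, 3]])    # [4, 6, 8] = (x + 1) * 2
```

Fusion matters here because a single `map` over the collection serializes to a smaller ErgoTree and avoids materializing the intermediate collection at evaluation time.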
Sigma-Specific Rewrites
Special optimizations for Sigma propositions[11]:
/// Sigma-specific rewriting rules
const SigmaRewriter = struct {
ctx: *IRContext,
pub fn rewriteSigma(self: *SigmaRewriter, d: *const Def) ?Sym {
return switch (d.kind()) {
.sigma_prop_is_valid => self.rewriteIsValid(d),
.sigma_prop_from_bool => self.rewriteSigmaProp(d),
.all_of => self.rewriteAllOf(d),
.any_of => self.rewriteAnyOf(d),
else => null,
};
}
/// sigmaProp(sp.isValid) => sp
fn rewriteIsValid(self: *SigmaRewriter, d: *const Def) ?Sym {
const is_valid = d.as(SigmaIsValidDef);
const inner = self.ctx.getDef(is_valid.prop);
if (inner.kind() == .sigma_prop_from_bool) {
const from_bool = inner.as(SigmaPropFromBoolDef);
// Check if the bool is another isValid
const bool_def = self.ctx.getDef(from_bool.bool_expr);
if (bool_def.kind() == .sigma_prop_is_valid) {
return bool_def.as(SigmaIsValidDef).prop;
}
}
return null;
}
/// sigmaProp(b).isValid => b
fn rewriteSigmaProp(self: *SigmaRewriter, d: *const Def) ?Sym {
_ = d;
_ = self;
// This rewrite is handled in rewriteIsValid
return null;
}
/// allOf(Coll(b1, ..., sp1.isValid, ...)) =>
/// (allOf(Coll(b1, ...)) && allZK(sp1, ...)).isValid
fn rewriteAllOf(self: *SigmaRewriter, d: *const Def) ?Sym {
const all_of = d.as(AllOfDef);
const items = self.extractItems(all_of.input) orelse return null;
var bools = std.ArrayList(Sym).init(self.ctx.allocator);
var sigmas = std.ArrayList(Sym).init(self.ctx.allocator);
for (items) |item| {
const item_def = self.ctx.getDef(item);
if (item_def.kind() == .sigma_prop_is_valid) {
const is_valid = item_def.as(SigmaIsValidDef);
sigmas.append(is_valid.prop) catch unreachable;
} else {
bools.append(item) catch unreachable;
}
}
if (sigmas.items.len == 0) return null;
// Build: (allOf(bools) && allZK(sigmas)).isValid
const zk_all = self.makeAllZK(sigmas.items);
if (bools.items.len == 0) {
return self.makeIsValid(zk_all);
}
const bool_all = self.makeSigmaProp(self.makeAllOf(bools.items));
const combined = self.makeSigmaAnd(bool_all, zk_all);
return self.makeIsValid(combined);
}
};
Tree Building
Transform the optimized graph back to ErgoTree[12][13]:
/// Transform graph IR to ErgoTree
const TreeBuilder = struct {
ctx: *IRContext,
/// Maps symbols to ValDef IDs
env: std.AutoHashMap(Sym, struct { id: u32, tpe: SType }),
/// Current ValDef ID counter
def_id: u32,
allocator: Allocator,
pub fn buildTree(self: *TreeBuilder, root: Sym) !*Expr {
// Compute usage counts to minimize ValDefs
const usage = self.computeUsageCounts(root);
// Build topological schedule
const schedule = self.buildSchedule(root);
// Process nodes, introducing ValDefs only for multi-use
var val_defs = std.ArrayList(ValDef).init(self.allocator);
for (schedule) |sym| {
if (usage.get(sym).? > 1) {
// Multi-use node: create ValDef
const rhs = try self.buildValue(sym);
const tpe = self.ctx.getDef(sym).result_type.stype;
try val_defs.append(.{
.id = self.def_id,
.tpe = tpe,
.rhs = rhs,
});
try self.env.put(sym, .{ .id = self.def_id, .tpe = tpe });
self.def_id += 1;
}
}
// Build result expression
const result = try self.buildValue(root);
// Wrap in block if we have ValDefs
if (val_defs.items.len == 0) {
return result;
}
return self.makeBlock(val_defs.items, result);
}
fn buildValue(self: *TreeBuilder, sym: Sym) !*Expr {
// Check if already bound in environment
if (self.env.get(sym)) |binding| {
return self.makeValUse(binding.id, binding.tpe);
}
const def = self.ctx.getDef(sym);
return switch (def.kind()) {
.constant => |c| self.makeConstant(c),
.context_prop => |prop| self.buildContextProp(prop),
.method_call => |mc| self.buildMethodCall(mc),
.bin_op => |op| self.buildBinOp(op),
.lambda => |lam| self.buildLambda(lam),
.apply => |app| self.buildApply(app),
.if_then_else => |ite| self.buildIf(ite),
else => error.UnhandledDefKind,
};
}
fn computeUsageCounts(self: *TreeBuilder, root: Sym) std.AutoHashMap(Sym, u32) {
var counts = std.AutoHashMap(Sym, u32).init(self.allocator);
self.countUsagesRecursive(root, &counts);
return counts;
}
fn countUsagesRecursive(self: *TreeBuilder, sym: Sym, counts: *std.AutoHashMap(Sym, u32)) void {
const current = counts.get(sym) orelse 0;
counts.put(sym, current + 1) catch unreachable;
// Only traverse dependencies on first visit
if (current == 0) {
const def = self.ctx.getDef(sym);
for (def.node.getDeps()) |dep| {
self.countUsagesRecursive(dep, counts);
}
}
}
fn buildSchedule(self: *TreeBuilder, root: Sym) []const Sym {
// Topological sort via DFS
var visited = std.AutoHashMap(Sym, void).init(self.allocator);
var schedule = std.ArrayList(Sym).init(self.allocator);
self.dfs(root, &visited, &schedule);
return schedule.items;
}
fn dfs(self: *TreeBuilder, sym: Sym, visited: *std.AutoHashMap(Sym, void), schedule: *std.ArrayList(Sym)) void {
if (visited.contains(sym)) return;
visited.put(sym, {}) catch unreachable;
const def = self.ctx.getDef(sym);
for (def.node.getDeps()) |dep| {
self.dfs(dep, visited, schedule);
}
schedule.append(sym) catch unreachable;
}
};
Operation Translation
Map IR operations to ErgoTree nodes[14]:
/// Recognize arithmetic operations
fn translateArithOp(op: BinOpKind) ?OpCode {
return switch (op) {
.plus => OpCode.Plus,
.minus => OpCode.Minus,
.multiply => OpCode.Multiply,
.divide => OpCode.Division,
.modulo => OpCode.Modulo,
.min => OpCode.Min,
.max => OpCode.Max,
else => null,
};
}
/// Recognize comparison operations
fn translateRelationOp(op: BinOpKind) ?*const fn (*Expr, *Expr) *Expr {
return switch (op) {
.eq => makeEQ,
.neq => makeNEQ,
.gt => makeGT,
.lt => makeLT,
.ge => makeGE,
.le => makeLE,
else => null,
};
}
/// Recognize context properties
fn translateContextProp(prop: ContextProperty) *Expr {
return switch (prop) {
.height => &expr_height,
.inputs => &expr_inputs,
.outputs => &expr_outputs,
.self => &expr_self,
};
}
/// Internal definitions should not become ValDefs
fn isInternalDef(def: *const Def) bool {
return switch (def.kind()) {
.sigma_dsl_builder, .coll_builder => true,
else => false,
};
}
Rust HIR (Alternative Approach)
The Rust compiler uses a simpler tree-based HIR without graph IR[15][16]:
/// Rust-style HIR expression
const HirExpr = struct {
kind: ExprKind,
span: TextRange,
tpe: ?SType,
const ExprKind = union(enum) {
ident: []const u8,
binary: Binary,
global_vars: GlobalVars,
literal: Literal,
};
const Binary = struct {
op: Spanned(BinaryOp),
lhs: *HirExpr,
rhs: *HirExpr,
};
const GlobalVars = enum {
height,
};
const Literal = union(enum) {
int: i32,
long: i64,
};
};
/// Rewrite HIR expressions (simpler than graph rewriting)
fn rewrite(
allocator: Allocator,
e: HirExpr,
f: *const fn (*const HirExpr) ?HirExpr,
) !HirExpr {
// Apply rewrite function
const rewritten = f(&e) orelse e;
// Recursively rewrite children, heap-allocating them so the
// returned pointers do not dangle into this stack frame
return switch (rewritten.kind) {
.binary => |bin| blk: {
const new_lhs = try allocator.create(HirExpr);
new_lhs.* = try rewrite(allocator, bin.lhs.*, f);
const new_rhs = try allocator.create(HirExpr);
new_rhs.* = try rewrite(allocator, bin.rhs.*, f);
break :blk HirExpr{
.kind = .{ .binary = .{
.op = bin.op,
.lhs = new_lhs,
.rhs = new_rhs,
} },
.span = rewritten.span,
.tpe = rewritten.tpe,
};
},
else => rewritten,
};
}
CSE Example Walkthrough
Source:
─────────────────────────────────────────────────────
{
val x = SELF.value
val y = SELF.value // Duplicate!
val z = OUTPUTS(0).value
x + y > z
}
After GraphBuilding (with CSE):
─────────────────────────────────────────────────────
s1 = Context.SELF
s2 = s1.value // Single node for both x and y
s3 = Context.OUTPUTS
s4 = s3.apply(0)
s5 = s4.value
s6 = Plus(s2, s2) // x + y = s2 + s2
s7 = GT(s6, s5)
After TreeBuilding (ValDef minimization):
─────────────────────────────────────────────────────
{
val v1 = SELF.value // s2 used twice → ValDef
GT(Plus(v1, v1), OUTPUTS(0).value)
}
Nodes s1, s3, s4, s5 have single use → inlined
Node s2 has multiple uses → ValDef introduced
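The walkthrough above can be sketched as a hash-consing table: structurally equal definitions map to the same symbol, so the second `SELF.value` lookup returns the node created for the first. This is a minimal Rust sketch; the `Def` shape and the symbol type are illustrative, not the Scala implementation's actual types.

```rust
use std::collections::HashMap;

/// Illustrative definition shape: an opcode name plus argument symbols.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Def {
    op: &'static str,
    args: Vec<u32>,
}

/// Hash-consing table: structurally equal Defs share one symbol.
struct GraphBuilder {
    defs: HashMap<Def, u32>,
    next_sym: u32,
}

impl GraphBuilder {
    fn new() -> Self {
        GraphBuilder { defs: HashMap::new(), next_sym: 0 }
    }

    /// Return the existing symbol for `def`, or allocate a new one.
    fn intern(&mut self, def: Def) -> u32 {
        if let Some(&sym) = self.defs.get(&def) {
            return sym; // CSE hit: reuse the existing node
        }
        let sym = self.next_sym;
        self.next_sym += 1;
        self.defs.insert(def, sym);
        sym
    }
}
```

Interning `ExtractAmount(SELF)` twice yields the same symbol, which is exactly why `val x` and `val y` collapse to one node `s2` in the example.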
Summary
- Def/Ref pattern separates computation definitions from references
- Hash-consing enables automatic CSE—structurally equal nodes share identity
- Graph rewriting applies algebraic simplifications (map fusion, etc.)
- TreeBuilding transforms graph back to ErgoTree with minimal ValDefs
- Usage counting determines which nodes need ValDef bindings
- Scala uses full graph IR; Rust uses simpler tree-based HIR
- IR optimizations reduce serialized ErgoTree size
- Not part of consensus—compiler-only optimization
Next: Chapter 19: Compiler Pipeline
1. Scala: IRContext.scala
2. Scala: Base.scala:17-200 (Node, Def, Ref)
3. Rust: compiler.rs:59-76 (compile pipeline)
4. Scala: Base.scala:100-160 (Def trait)
5. Rust: hir.rs:32-37 (Expr struct)
6. Scala: IRContext.scala:28-50 (cake pattern)
7. Rust: compiler.rs:78-87 (compile_hir)
8. Scala: GraphBuilding.scala:28-35 (CSE documentation)
9. Scala: IRContext.scala:105-150 (rewriteDef)
10. Rust: rewrite.rs:10-29 (rewrite function)
11. Scala: GraphBuilding.scala:75-120 (HasSigmas, AllOf)
12. Scala: TreeBuilding.scala:21-50 (TreeBuilding trait)
13. Scala: TreeBuilding.scala:60-100 (IsArithOp, IsRelationOp)
14. Scala: TreeBuilding.scala:100-140 (IsContextProperty)
15. Rust: hir.rs:146-167 (ExprKind enum)
16. Rust: hir.rs:61-94 (Expr::lower)
Chapter 19: Compiler Pipeline
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 16 for parsing ErgoScript to AST
- Chapter 17 for name binding and type inference
- Chapter 18 for IR optimization passes
Learning Objectives
By the end of this chapter, you will be able to:
- Trace the complete compilation pipeline from ErgoScript source to ErgoTree bytecode
- Use the SigmaCompiler API to compile scripts programmatically
- Explain method call lowering strategies and when direct operations are used
- Configure compiler settings for different networks (mainnet vs testnet)
Pipeline Architecture
The ErgoScript compiler transforms source code through multiple phases[1][2]:
Compilation Pipeline
─────────────────────────────────────────────────────
Source: "sigmaProp(SELF.value > 1000L)"
│
│ (1) Parse
▼
┌─────────────────────────────────────────────────────┐
│ Untyped AST │
│ Apply(Ident("sigmaProp"), [GT(Select(...), ...)]) │
└─────────────────────────────────────────────────────┘
│
│ (2) Bind
▼
┌─────────────────────────────────────────────────────┐
│ Bound AST (names resolved) │
│ Apply(SigmaPropFunc, [GT(Self.value, 1000L)]) │
└─────────────────────────────────────────────────────┘
│
│ (3) Typecheck
▼
┌─────────────────────────────────────────────────────┐
│ Typed AST │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L)) │
│ :: SSigmaProp │
└─────────────────────────────────────────────────────┘
│
│ (4) BuildGraph (Scala only)
▼
┌─────────────────────────────────────────────────────┐
│ Graph IR (CSE applied) │
│ s1=Self, s2=s1.value, s3=1000L, s4=GT(s2,s3) │
│ s5=sigmaProp(s4) │
└─────────────────────────────────────────────────────┘
│
│ (5) BuildTree / Lower to MIR
▼
┌─────────────────────────────────────────────────────┐
│ ErgoTree │
│ BoolToSigmaProp(GT(ExtractAmount(Self), 1000L)) │
└─────────────────────────────────────────────────────┘
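The phases above compose as fallible stages, each consuming the previous stage's output; any failing phase aborts compilation. A minimal Rust sketch of that shape (stage names and payload types here are placeholders, not the real compiler's types):

```rust
/// Placeholder payloads for each phase's output.
struct Ast(String);
struct BoundAst(String);
struct TypedAst(String);
struct ErgoTree(String);

fn parse(src: &str) -> Result<Ast, String> {
    Ok(Ast(format!("parsed({src})")))
}
fn bind(ast: Ast) -> Result<BoundAst, String> {
    Ok(BoundAst(format!("bound({})", ast.0)))
}
fn typecheck(ast: BoundAst) -> Result<TypedAst, String> {
    Ok(TypedAst(format!("typed({})", ast.0)))
}
fn build_tree(ast: TypedAst) -> Result<ErgoTree, String> {
    Ok(ErgoTree(format!("tree({})", ast.0)))
}

/// Phases chained with `?`: the first error short-circuits the pipeline.
fn compile(src: &str) -> Result<ErgoTree, String> {
    let parsed = parse(src)?;
    let bound = bind(parsed)?;
    let typed = typecheck(bound)?;
    build_tree(typed)
}
```

The Scala pipeline inserts a graph-building stage between typecheck and tree building; the Rust pipeline lowers directly, but the staged, early-abort structure is the same.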
Compiler Settings
Configuration controls optimization and network behavior[3]:
const CompilerSettings = struct {
/// Network prefix for address decoding (mainnet=0, testnet=16)
network_prefix: u8,
/// Whether to lower MethodCall to direct nodes
lower_method_calls: bool,
/// Builder for creating ErgoTree nodes
builder: *const SigmaBuilder,
pub fn mainnet() CompilerSettings {
return .{
.network_prefix = 0x00,
.lower_method_calls = true,
.builder = &TransformingSigmaBuilder,
};
}
pub fn testnet() CompilerSettings {
return .{
.network_prefix = 0x10,
.lower_method_calls = true,
.builder = &TransformingSigmaBuilder,
};
}
};
SigmaCompiler Implementation
The compiler orchestrates all phases[4][5]:
const SigmaCompiler = struct {
settings: CompilerSettings,
allocator: Allocator,
pub fn init(settings: CompilerSettings, allocator: Allocator) SigmaCompiler {
return .{
.settings = settings,
.allocator = allocator,
};
}
/// Phase 1: Parse source to AST
pub fn parse(self: *const SigmaCompiler, source: []const u8) !*Expr {
var parser = Parser.init(source, self.allocator);
return parser.parseExpr() catch return error.ParserError;
}
/// Phases 2-3: Bind and typecheck
pub fn typecheck(
self: *const SigmaCompiler,
env: *const ScriptEnv,
parsed: *const Expr,
) !*TypedExpr {
// Phase 2: Bind names
const predef_registry = PredefinedFuncRegistry.init(self.settings.builder);
var binder = Binder.init(env, self.settings.builder, self.settings.network_prefix, &predef_registry);
const bound = try binder.bind(parsed);
// Phase 3: Type inference and checking
const type_env = env.collectTypes();
var typer = Typer.init(
self.settings.builder,
&predef_registry,
type_env,
self.settings.lower_method_calls,
);
return typer.typecheck(bound);
}
/// Full compilation: all phases
pub fn compile(
self: *const SigmaCompiler,
env: *const ScriptEnv,
source: []const u8,
ir_ctx: *IRContext,
) !CompilerResult {
const parsed = try self.parse(source);
const typed = try self.typecheck(env, parsed);
return self.compileTyped(env, typed, ir_ctx, source);
}
/// Phases 4-5: Graph building and tree building
fn compileTyped(
self: *const SigmaCompiler,
env: *const ScriptEnv,
typed: *const TypedExpr,
ir_ctx: *IRContext,
source: []const u8,
) !CompilerResult {
// Create placeholder constants for type parameters
var placeholders_env = env.clone();
var idx: u32 = 0;
var iter = env.typeParams();
while (iter.next()) |entry| {
const placeholder = ConstantPlaceholder{
.index = idx,
.tpe = entry.value,
};
try placeholders_env.put(entry.key, .{ .placeholder = placeholder });
idx += 1;
}
// Phase 4: Build graph (CSE)
var graph_builder = GraphBuilder.init(ir_ctx, &placeholders_env);
const compiled_graph = try graph_builder.buildGraph(typed);
// Phase 5: Build tree (ValDef minimization)
var tree_builder = TreeBuilder.init(ir_ctx, self.allocator);
const compiled_tree = try tree_builder.buildTree(compiled_graph);
return CompilerResult{
.env = env,
.source = source,
.compiled_graph = compiled_graph,
.ergo_tree = compiled_tree,
};
}
};
/// Result of compilation
const CompilerResult = struct {
env: *const ScriptEnv,
source: []const u8,
compiled_graph: Sym,
ergo_tree: *Expr,
};
Rust Compiler Pipeline
The Rust implementation uses a direct pipeline without graph IR[6][7]:
/// Rust-style direct compilation pipeline
const RustCompiler = struct {
allocator: Allocator,
/// Compile source to ErgoTree expression
pub fn compileExpr(
self: *const RustCompiler,
source: []const u8,
env: ScriptEnv,
) !*MirExpr {
// Parse to CST, then lower to HIR
const hir = try self.compileHir(source);
// Bind names in HIR
var binder = Binder.init(env);
const bound = try binder.bind(hir);
// Assign types
const typed = try assignType(bound);
// Lower to MIR (ErgoTree IR)
const mir = try lowerToMir(typed);
// Type check MIR
return try typeCheck(mir);
}
/// Compile to full ErgoTree
pub fn compile(
self: *const RustCompiler,
source: []const u8,
env: ScriptEnv,
) !ErgoTree {
const expr = try self.compileExpr(source, env);
return ErgoTree.fromExpr(expr);
}
fn compileHir(self: *const RustCompiler, source: []const u8) !*HirExpr {
var parser = Parser.init(source);
const parse_result = parser.parse();
if (parse_result.errors.len > 0) {
return error.ParseError;
}
const syntax = parse_result.syntax();
const root = AstRoot.cast(syntax) orelse return error.InvalidRoot;
return hirLower(root);
}
};
Method Call Lowering
Lowering transforms generic MethodCall to compact direct nodes[8][9]:
Method Call Lowering
─────────────────────────────────────────────────────
Before lowering (MethodCall - 3+ bytes):
MethodCall(xs, CollMethods.MapMethod, [f], {})
After lowering (MapCollection - 1 byte):
MapCollection(xs, f)
Size savings: 2+ bytes per operation
/// Method call lowering during typing
const MethodCallLowerer = struct {
builder: *const SigmaBuilder,
lower_enabled: bool,
/// Try to lower MethodCall to direct node
pub fn tryLower(
self: *const MethodCallLowerer,
obj: *const Expr,
method: *const SMethod,
args: []const *const Expr,
subst: TypeSubst,
) ?*Expr {
if (!self.lower_enabled) return null;
// Check if method has IR builder
const ir_builder = method.ir_info.ir_builder orelse return null;
// Try to apply the builder
return ir_builder.build(self.builder, obj, method, args, subst);
}
/// Unlower: convert direct nodes back to MethodCall (for display)
pub fn unlower(self: *const MethodCallLowerer, expr: *const Expr) *Expr {
return switch (expr.kind) {
.multiply_group => |mg| self.builder.makeMethodCall(
mg.left,
&SGroupElementMethods.multiply_method,
&[_]*const Expr{mg.right},
),
.exponentiate => |exp| self.builder.makeMethodCall(
exp.base,
&SGroupElementMethods.exponentiate_method,
&[_]*const Expr{exp.exponent},
),
.map_collection => |mc| self.builder.makeMethodCall(
mc.input,
&SCollectionMethods.map_method.withConcreteTypes(.{
.tIV = mc.input.tpe.elemType(),
.tOV = mc.mapper.tpe.resultType(),
}),
&[_]*const Expr{mc.mapper},
),
.fold => |f| self.builder.makeMethodCall(
f.input,
&SCollectionMethods.fold_method.withConcreteTypes(.{
.tIV = f.input.tpe.elemType(),
.tOV = f.zero.tpe,
}),
&[_]*const Expr{ f.zero, f.folder },
),
.for_all => |fa| self.builder.makeMethodCall(
fa.input,
&SCollectionMethods.forall_method.withConcreteTypes(.{
.tIV = fa.input.tpe.elemType(),
}),
&[_]*const Expr{fa.predicate},
),
.exists => |ex| self.builder.makeMethodCall(
ex.input,
&SCollectionMethods.exists_method.withConcreteTypes(.{
.tIV = ex.input.tpe.elemType(),
}),
&[_]*const Expr{ex.predicate},
),
else => expr,
};
}
};
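The lowering direction can be summarized as a partial mapping from method names to direct opcodes; methods without a dedicated node stay as generic MethodCall. A hedged Rust sketch of that decision (the method and opcode names are illustrative; the real implementations consult per-method IR builders, not a string table):

```rust
/// Illustrative direct-node opcodes for lowerable collection methods.
#[derive(Debug, PartialEq)]
enum DirectOp {
    MapCollection,
    Fold,
    ForAll,
    Exists,
}

/// Partial lowering table: `None` means the call stays a MethodCall.
fn lower_coll_method(method_name: &str) -> Option<DirectOp> {
    match method_name {
        "map" => Some(DirectOp::MapCollection),
        "fold" => Some(DirectOp::Fold),
        "forall" => Some(DirectOp::ForAll),
        "exists" => Some(DirectOp::Exists),
        // No dedicated node: keep the generic (larger) MethodCall encoding
        _ => None,
    }
}
```

The `unlower` direction shown above is the inverse mapping, used when a compact node must be displayed or re-typed as its originating method call.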
Type Inference
Type assignment propagates and unifies types[10][11]:
const Typer = struct {
builder: *const SigmaBuilder,
predef_registry: *const PredefinedFuncRegistry,
type_env: std.StringHashMap(SType),
lower_method_calls: bool,
/// Assign types to bound expression
pub fn typecheck(self: *Typer, bound: *const Expr) !*TypedExpr {
return self.assignType(self.type_env, bound);
}
fn assignType(self: *Typer, env: std.StringHashMap(SType), expr: *const Expr) !*TypedExpr {
return switch (expr.kind) {
.block => |b| self.typecheckBlock(env, b),
.tuple => |t| self.typecheckTuple(env, t),
.ident => |id| self.typecheckIdent(env, id),
.select => |s| self.typecheckSelect(env, s),
.apply => |a| self.typecheckApply(env, a),
.lambda => |l| self.typecheckLambda(env, l),
.if_expr => |i| self.typecheckIf(env, i),
.constant => |c| self.makeTyped(c, c.tpe),
else => error.UnsupportedExpr,
};
}
fn typecheckBlock(self: *Typer, env: std.StringHashMap(SType), block: *const Block) !*TypedExpr {
var cur_env = try env.clone();
for (block.items) |val_def| {
if (cur_env.contains(val_def.name)) {
return error.DuplicateVariable;
}
const rhs_typed = try self.assignType(cur_env, val_def.rhs);
try cur_env.put(val_def.name, rhs_typed.tpe);
}
const result_typed = try self.assignType(cur_env, block.result);
return self.builder.makeBlock(block.items, result_typed);
}
fn typecheckSelect(self: *Typer, env: std.StringHashMap(SType), sel: *const Select) !*TypedExpr {
const obj_typed = try self.assignType(env, sel.obj);
const method = MethodsContainer.getMethod(obj_typed.tpe, sel.field) orelse
return error.MethodNotFound;
// Unify method receiver type with object type
const subst = unifyTypes(method.stype.domain[0], obj_typed.tpe) orelse
return error.TypeMismatch;
const result_type = applySubst(method.stype.range, subst);
// Try to lower if it's a property access (no args)
if (self.lower_method_calls) {
if (method.ir_info.ir_builder) |ir_builder| {
if (ir_builder.buildProperty(self.builder, obj_typed, method)) |lowered| {
return lowered;
}
}
}
return self.builder.makeSelect(obj_typed, sel.field, result_type);
}
};
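typecheckSelect above relies on unifyTypes and applySubst, which the listing leaves undefined. A minimal Rust sketch of first-order unification over a toy type language can fill that gap; this simplification binds type variables to anything and requires concrete constructors to match, and it omits the occurs check and binding-consistency checks a real unifier needs (the SType machinery in the actual implementations is richer):

```rust
use std::collections::HashMap;

/// Simplified type language: a variable or a concrete constructor with args.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(String),
    Con(String, Vec<Ty>),
}

/// Unify `expected` against `actual`, extending `subst`; None on mismatch.
fn unify(expected: &Ty, actual: &Ty, subst: &mut HashMap<String, Ty>) -> Option<()> {
    match (expected, actual) {
        (Ty::Var(v), t) => {
            // Naive binding: no occurs check, later bindings overwrite
            subst.insert(v.clone(), t.clone());
            Some(())
        }
        (Ty::Con(a, xs), Ty::Con(b, ys)) if a == b && xs.len() == ys.len() => {
            for (x, y) in xs.iter().zip(ys) {
                unify(x, y, subst)?;
            }
            Some(())
        }
        _ => None,
    }
}

/// Replace variables in `t` by their bindings in `subst`.
fn apply_subst(t: &Ty, subst: &HashMap<String, Ty>) -> Ty {
    match t {
        Ty::Var(v) => subst.get(v).cloned().unwrap_or_else(|| t.clone()),
        Ty::Con(c, args) => Ty::Con(
            c.clone(),
            args.iter().map(|a| apply_subst(a, subst)).collect(),
        ),
    }
}
```

Unifying a method's declared receiver type `Coll[IV]` against a concrete `Coll[Box]` yields the substitution `IV → Box`, which is then applied to the method's result type, mirroring the flow in typecheckSelect.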
Error Handling
Each phase produces specific errors[12]:
const CompileError = union(enum) {
parse_error: ParseError,
hir_lowering_error: HirLoweringError,
binder_error: BinderError,
type_error: TypeInferenceError,
mir_lowering_error: MirLoweringError,
type_check_error: TypeCheckError,
ergo_tree_error: ErgoTreeError,
/// Render a human-readable description (uses a module-level allocator for formatting)
pub fn prettyDesc(self: CompileError, source: []const u8) []const u8 {
return switch (self) {
.parse_error => |e| e.prettyDesc(source),
.hir_lowering_error => |e| e.prettyDesc(source),
.binder_error => |e| e.prettyDesc(source),
.type_error => |e| e.prettyDesc(source),
.mir_lowering_error => |e| e.prettyDesc(source),
.type_check_error => |e| e.prettyDesc(),
.ergo_tree_error => |e| std.fmt.allocPrint(
allocator,
"ErgoTree error: {any}",
.{e},
) catch "format error",
};
}
};
/// Parse error with source location (formatting uses the module-level allocator)
const ParseError = struct {
message: []const u8,
span: TextRange,
expected: []const TokenKind,
found: ?TokenKind,
pub fn prettyDesc(self: ParseError, source: []const u8) []const u8 {
const line_info = getLineInfo(source, self.span.start);
return std.fmt.allocPrint(allocator,
"error: {s}\nline: {d}\n{s}\n{s}",
.{
self.message,
line_info.line_num,
line_info.line_text,
makeUnderline(line_info, self.span),
},
) catch "format error";
}
};
Predefined Functions Registry
Built-in functions are registered for name resolution[13]:
const PredefinedFuncRegistry = struct {
funcs: std.StringHashMap(PredefinedFunc),
builder: *const SigmaBuilder,
pub fn init(builder: *const SigmaBuilder) PredefinedFuncRegistry {
var self = PredefinedFuncRegistry{
.funcs = std.StringHashMap(PredefinedFunc).init(allocator), // module-level allocator
.builder = builder,
};
self.registerAll();
return self;
}
fn registerAll(self: *PredefinedFuncRegistry) void {
// Boolean operations
self.register("allOf", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
.ir_builder = AllOfIrBuilder,
});
self.register("anyOf", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.boolean)}, .boolean),
.ir_builder = AnyOfIrBuilder,
});
// Sigma operations
self.register("sigmaProp", .{
.tpe = SFunc.init(&[_]SType{.boolean}, .sigma_prop),
.ir_builder = SigmaPropIrBuilder,
});
self.register("atLeast", .{
.tpe = SFunc.init(&[_]SType{ .int, SType.collOf(.sigma_prop) }, .sigma_prop),
.ir_builder = AtLeastIrBuilder,
});
self.register("allZK", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
.ir_builder = AllZKIrBuilder,
});
self.register("anyZK", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.sigma_prop)}, .sigma_prop),
.ir_builder = AnyZKIrBuilder,
});
// Cryptographic
self.register("proveDlog", .{
.tpe = SFunc.init(&[_]SType{.group_element}, .sigma_prop),
.ir_builder = ProveDlogIrBuilder,
});
self.register("proveDHTuple", .{
.tpe = SFunc.init(&[_]SType{
.group_element,
.group_element,
.group_element,
.group_element,
}, .sigma_prop),
.ir_builder = ProveDHTupleIrBuilder,
});
// Hash functions
self.register("blake2b256", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
.ir_builder = Blake2b256IrBuilder,
});
self.register("sha256", .{
.tpe = SFunc.init(&[_]SType{SType.collOf(.byte)}, SType.collOf(.byte)),
.ir_builder = Sha256IrBuilder,
});
// Global
self.register("groupGenerator", .{
.tpe = SFunc.init(&[_]SType{}, .group_element),
.ir_builder = GroupGeneratorIrBuilder,
});
}
fn register(self: *PredefinedFuncRegistry, name: []const u8, func: PredefinedFunc) void {
self.funcs.put(name, func) catch unreachable;
}
};
Compilation Example
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = gpa.allocator();
// Setup compiler
const settings = CompilerSettings.testnet();
const compiler = SigmaCompiler.init(settings, allocator);
var ir_ctx = IRContext.init(allocator);
// Source code
const source =
\\{
\\ val deadline = 100000
\\ val pk = PK("9fRusAarL1KkrWQVsxSRVYnvWxaAT2A96cKtNn9tvPh5XUCTgGi")
\\ sigmaProp(HEIGHT > deadline) && pk
\\}
;
// Compile
const env = ScriptEnv.empty();
const result = try compiler.compile(&env, source, &ir_ctx);
// Access results
std.debug.print("Source: {s}\n", .{result.source});
std.debug.print("ErgoTree: {any}\n", .{result.ergo_tree});
std.debug.print("Type: {any}\n", .{result.ergo_tree.tpe});
// Serialize
const ergo_tree = try ErgoTree.fromSigmaProp(result.ergo_tree);
const bytes = try ergo_tree.toBytes(allocator);
std.debug.print("Bytes: {x}\n", .{std.fmt.fmtSliceHexLower(bytes)});
}
Compilation Flow Detail
Detailed Phase Transitions
─────────────────────────────────────────────────────
Source: "OUTPUTS.exists({ (b: Box) => b.value > 100L })"
Phase 1 - Parse:
Apply(
Select(Ident("OUTPUTS"), "exists"),
[Lambda(["b": Box], GT(Select(Ident("b"), "value"), 100L))]
)
Phase 2 - Bind:
Apply(
Select(Context.OUTPUTS, ExistsMethod),
[Lambda([b: SBox], GT(Select(ValUse(b), "value"), 100L))]
)
Phase 3 - Typecheck:
Exists(
input: Outputs :: SColl[SBox],
predicate: Lambda(
args: [(0, SBox)],
body: GT(
ExtractAmount(ValUse(0, SBox)) :: SLong,
LongConstant(100) :: SLong
) :: SBoolean
) :: SFunc[SBox, SBoolean]
) :: SBoolean
Phase 4 - BuildGraph (if using Scala IR):
s1 = Context.OUTPUTS
s2 = Lambda(args=[(0,SBox)], body=s3)
s3 = GT(s4, s5)
s4 = ValUse(0).value // ExtractAmount
s5 = 100L
s6 = Exists(s1, s2)
Phase 5 - BuildTree:
Exists(
Outputs,
FuncValue(
[(1, SBox)],
GT(ExtractAmount(ValUse(1, SBox)), LongConstant(100))
)
)
Summary
- 5-phase pipeline: Parse → Bind → Typecheck → BuildGraph → BuildTree
- Method lowering transforms MethodCall (3+ bytes) to direct nodes (1 byte)
- Scala uses graph IR for CSE optimization; Rust uses direct HIR→MIR
- Type inference propagates and unifies types through the AST
- Predefined registry resolves built-in function names
- Error handling provides detailed source-location diagnostics
- Compiler is development-time only—interpreter uses serialized ErgoTree
Next: Chapter 20: Collections
1. Scala: SigmaCompiler.scala:51-100 (SigmaCompiler class)
2. Rust: lib.rs:16-27 (module structure)
3. Scala: SigmaCompiler.scala:15-25 (CompilerSettings)
4. Scala: SigmaCompiler.scala:55-95 (compile methods)
5. Rust: compiler.rs:59-76 (compile_expr)
6. Rust: compiler.rs:73-76 (compile)
7. Rust: compiler.rs:78-87 (compile_hir)
8. Scala: SigmaCompiler.scala:105-150 (unlowerMethodCalls)
9. Scala: SigmaTyper.scala:30-45 (processGlobalMethod)
10. Scala: SigmaTyper.scala:50-100 (assignType)
11. Rust: type_infer.rs:25-49 (assign_type)
12. Rust: compiler.rs:23-55 (CompileError)
13. Scala: SigmaPredef.scala (PredefinedFuncRegistry)
Chapter 20: Collections
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 2 for Coll[T] type and type parameters
- Chapter 5 for collection operation opcodes
- Chapter 12 for how collection operations are evaluated
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the Coll[T] interface and its core operations (map, filter, fold, etc.)
- Implement array-backed collections with bounds checking
- Describe the Structure-of-Arrays optimization for pair collections
- Use CollBuilder for creating and manipulating collections
- Understand cost implications of collection operations
Collection Architecture
Collections in ErgoScript are immutable, indexed sequences[1][2]:
Collection Architecture
─────────────────────────────────────────────────────
Coll[T]
│
┌─────────────┴─────────────┐
│ │
CollOverArray[T] PairColl[L,R]
(standard array) (structure-of-arrays)
│ │
│ ┌──────┴──────┐
Array[T] Coll[L] Coll[R]
(left) (right)
Coll[T] Interface
Core collection interface with specialized operations[3][4]:
/// Immutable indexed collection
const Coll = struct {
data: CollData,
elem_type: SType,
builder: *CollBuilder,
const CollData = union(enum) {
/// Standard array-backed collection
array: ArrayColl,
/// Optimized pair collection
pair: PairCollData,
};
/// Number of elements
pub fn length(self: *const Coll) usize {
return switch (self.data) {
.array => |a| a.items.len,
.pair => |p| @min(p.ls.length(), p.rs.length()),
};
}
pub fn size(self: *const Coll) usize {
return self.length();
}
pub fn isEmpty(self: *const Coll) bool {
return self.length() == 0;
}
/// Element at index (0-based)
pub fn get(self: *const Coll, i: usize) ?Value {
if (i >= self.length()) return null;
return switch (self.data) {
.array => |a| a.items[i],
.pair => |p| Value.tuple(.{ p.ls.get(i).?, p.rs.get(i).? }),
};
}
/// Element at index with default
pub fn getOrElse(self: *const Coll, i: usize, default: Value) Value {
return self.get(i) orelse default;
}
/// Element access (throws on out of bounds)
pub fn apply(self: *const Coll, i: usize) !Value {
return self.get(i) orelse error.IndexOutOfBounds;
}
};
Transformation Operations
Map, filter, and fold with cost tracking[5][6]:
/// Collection transformation operations
const CollTransforms = struct {
/// Apply function to each element
pub fn map(
coll: *const Coll,
mapper: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
try E.addSeqCost(MapCost, n, OpCode.Map);
const result = try E.allocator.alloc(Value, n);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Map);
var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
result[i] = try mapper.body.eval(&fn_env, E);
}
return coll.builder.fromArray(result, mapper.result_type);
}
/// Select elements satisfying predicate
pub fn filter(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
try E.addSeqCost(FilterCost, n, OpCode.Filter);
var result = std.ArrayList(Value).init(E.allocator);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Filter);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const keep = try E.evalTo(bool, &fn_env, predicate.body);
if (keep) {
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Left-associative fold
pub fn foldLeft(
coll: *const Coll,
zero: Value,
folder: *const Closure,
E: *Evaluator,
) !Value {
const n = coll.length();
try E.addSeqCost(FoldCost, n, OpCode.Fold);
var accum = zero;
for (0..n) |i| {
const elem = coll.get(i).?;
const tuple = Value.tuple(.{ accum, elem });
try E.addCost(AddToEnvCost, OpCode.Fold);
var fn_env = try folder.captured_env.extend(folder.args[0].id, tuple);
accum = try folder.body.eval(&fn_env, E);
}
return accum;
}
/// Flatten nested collections
pub fn flatMap(
coll: *const Coll,
mapper: *const Closure,
E: *Evaluator,
) !*Coll {
const n = coll.length();
var result = std.ArrayList(Value).init(E.allocator);
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.FlatMap);
var fn_env = try mapper.captured_env.extend(mapper.args[0].id, elem);
const inner_coll = try E.evalTo(*Coll, &fn_env, mapper.body);
for (0..inner_coll.length()) |j| {
try result.append(inner_coll.get(j).?);
}
}
return coll.builder.fromArray(result.items, mapper.result_type);
}
};
const MapCost = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
const FilterCost = PerItemCost{
.base = JitCost{ .value = 20 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
const FoldCost = PerItemCost{
.base = JitCost{ .value = 10 },
.per_chunk = JitCost{ .value = 5 },
.chunk_size = 10,
};
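The PerItemCost constants above imply a chunked formula: the charge for n items is the base cost plus the per-chunk cost for each started chunk. A small Rust sketch of that accounting (the values mirror the constants above; the formula shape follows the sigmastate PerItemCost design, but verify the exact rounding against the source):

```rust
/// Per-item cost: base + per_chunk * ceil(n_items / chunk_size).
struct PerItemCost {
    base: u64,
    per_chunk: u64,
    chunk_size: u64,
}

impl PerItemCost {
    fn cost(&self, n_items: u64) -> u64 {
        // Number of started chunks: ceil(n_items / chunk_size)
        let chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
        self.base + self.per_chunk * chunks
    }
}
```

With the MapCost values (base 10, per_chunk 5, chunk_size 10), 10 items cost the same as 1 item, and the charge steps up only when an 11th item starts a new chunk.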
Predicate Operations
Exists and forall with short-circuit evaluation[7]. Note: short-circuiting means execution time varies with collection contents. This is acceptable in blockchain contexts where data is public, but it would be a timing side channel if collections contained secrets.
/// Predicate operations (short-circuit)
const CollPredicates = struct {
/// True if any element satisfies predicate
pub fn exists(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !bool {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.Exists);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
// Short-circuit: found matching element
try E.addSeqCost(ExistsCost, i + 1, OpCode.Exists);
return true;
}
}
try E.addSeqCost(ExistsCost, n, OpCode.Exists);
return false;
}
/// True if all elements satisfy predicate
pub fn forall(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !bool {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
try E.addCost(AddToEnvCost, OpCode.ForAll);
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (!result) {
// Short-circuit: found non-matching element
try E.addSeqCost(ForAllCost, i + 1, OpCode.ForAll);
return false;
}
}
try E.addSeqCost(ForAllCost, n, OpCode.ForAll);
return true;
}
/// Find first element satisfying predicate
pub fn find(
coll: *const Coll,
predicate: *const Closure,
E: *Evaluator,
) !?Value {
const n = coll.length();
for (0..n) |i| {
const elem = coll.get(i).?;
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
return elem;
}
}
return null;
}
/// Index of first element satisfying predicate
pub fn indexWhere(
coll: *const Coll,
predicate: *const Closure,
from: usize,
E: *Evaluator,
) !i32 {
const n = coll.length();
const start = from; // `from` is unsigned; no lower clamp needed
for (start..n) |i| {
const elem = coll.get(i).?;
var fn_env = try predicate.captured_env.extend(predicate.args[0].id, elem);
const result = try E.evalTo(bool, &fn_env, predicate.body);
if (result) {
return @intCast(i);
}
}
return -1; // Not found
}
};
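The short-circuit cost accounting in exists can be checked with a small sketch: when the scan stops at index i, only i + 1 iterations are charged, so the charge depends on where the first match sits. A hedged Rust sketch, modeling cost as a plain visited-item counter rather than the real JitCost machinery:

```rust
/// exists with short-circuit: returns (result, items_charged).
fn exists_with_cost(items: &[i64], pred: impl Fn(i64) -> bool) -> (bool, usize) {
    for (i, &x) in items.iter().enumerate() {
        if pred(x) {
            // Short-circuit: charged only for the elements actually visited
            return (true, i + 1);
        }
    }
    // No match: the full scan is charged
    (false, items.len())
}
```

A match at the second element charges 2 items regardless of collection length; forall is symmetric, stopping at the first non-matching element.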
Slicing Operations
Slice, take, and append[8]:
/// Slicing operations
const CollSlicing = struct {
/// First n elements
pub fn take(coll: *const Coll, n: usize, E: *Evaluator) !*Coll {
if (n == 0) return coll.builder.emptyColl(coll.elem_type);
if (n >= coll.length()) return coll;
try E.addSeqCost(SliceCost, n, OpCode.Slice);
return coll.builder.fromSlice(coll, 0, n);
}
/// Elements from index `from` until `until`
pub fn slice(
coll: *const Coll,
from: usize,
until: usize,
E: *Evaluator,
) !*Coll {
const actual_from = @min(from, coll.length());
const actual_until = @min(until, coll.length());
const len = if (actual_until > actual_from) actual_until - actual_from else 0;
try E.addSeqCost(SliceCost, len, OpCode.Slice);
return coll.builder.fromSlice(coll, actual_from, actual_until);
}
/// Concatenate collections
pub fn append(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
if (coll.length() == 0) return other;
if (other.length() == 0) return coll;
const total = coll.length() + other.length();
try E.addSeqCost(AppendCost, total, OpCode.Append);
const result = try E.allocator.alloc(Value, total);
for (0..coll.length()) |i| {
result[i] = coll.get(i).?;
}
for (0..other.length()) |i| {
result[coll.length() + i] = other.get(i).?;
}
return coll.builder.fromArray(result, coll.elem_type);
}
/// Replace slice with patch
pub fn patch(
coll: *const Coll,
from: usize,
replacement: *const Coll,
replaced: usize,
E: *Evaluator,
) !*Coll {
const before = try CollSlicing.slice(coll, 0, from, E);
const after = try CollSlicing.slice(coll, from + replaced, coll.length(), E);
const temp = try CollSlicing.append(before, replacement, E);
return CollSlicing.append(temp, after, E);
}
/// Replace single element
pub fn updated(
coll: *const Coll,
index: usize,
elem: Value,
E: *Evaluator,
) !*Coll {
if (index >= coll.length()) return error.IndexOutOfBounds;
const result = try E.allocator.alloc(Value, coll.length());
for (0..coll.length()) |i| {
result[i] = if (i == index) elem else coll.get(i).?;
}
return coll.builder.fromArray(result, coll.elem_type);
}
};
Structure-of-Arrays: PairColl
Optimized representation for collections of pairs[9][10]:
Structure-of-Arrays vs Array-of-Structures
─────────────────────────────────────────────────────
Array-of-Structures (standard):
┌────────────────────────────────────────────────────┐
│ [(L0,R0), (L1,R1), (L2,R2), (L3,R3), (L4,R4)] │
│ │
│ Memory: L0 R0 L1 R1 L2 R2 L3 R3 L4 R4 │
│ Issue: Cache unfriendly when accessing only Ls │
└────────────────────────────────────────────────────┘
Structure-of-Arrays (PairColl):
┌────────────────────────────────────────────────────┐
│ ls: [L0, L1, L2, L3, L4] │
│ rs: [R0, R1, R2, R3, R4] │
│ │
│ Memory: L0 L1 L2 L3 L4 | R0 R1 R2 R3 R4 │
│ Benefit: Cache friendly, O(1) unzip │
└────────────────────────────────────────────────────┘
/// Optimized pair collection (Structure-of-Arrays)
const PairColl = struct {
ls: *Coll, // Left components
rs: *Coll, // Right components
builder: *CollBuilder,
pub fn length(self: *const PairColl) usize {
return @min(self.ls.length(), self.rs.length());
}
/// Element at index returns tuple
pub fn get(self: *const PairColl, i: usize) ?Value {
const l = self.ls.get(i) orelse return null;
const r = self.rs.get(i) orelse return null;
return Value.tuple(.{ l, r });
}
/// O(1) unzip - just return components
pub fn unzip(self: *const PairColl) struct { *Coll, *Coll } {
return .{ self.ls, self.rs };
}
/// Map only left components
pub fn mapFirst(
self: *const PairColl,
mapper: *const Closure,
E: *Evaluator,
) !*PairColl {
const mapped_ls = try CollTransforms.map(self.ls, mapper, E);
return self.builder.pairColl(mapped_ls, self.rs);
}
/// Map only right components
pub fn mapSecond(
self: *const PairColl,
mapper: *const Closure,
E: *Evaluator,
) !*PairColl {
const mapped_rs = try CollTransforms.map(self.rs, mapper, E);
return self.builder.pairColl(self.ls, mapped_rs);
}
/// Slice maintains structure-of-arrays
pub fn slice(
self: *const PairColl,
from: usize,
until: usize,
E: *Evaluator,
) !*PairColl {
const sliced_ls = try CollSlicing.slice(self.ls, from, until, E);
const sliced_rs = try CollSlicing.slice(self.rs, from, until, E);
return self.builder.pairColl(sliced_ls, sliced_rs);
}
/// Append maintains structure
pub fn append(
self: *const PairColl,
other: *const PairColl,
E: *Evaluator,
) !*PairColl {
const combined_ls = try CollSlicing.append(self.ls, other.ls, E);
const combined_rs = try CollSlicing.append(self.rs, other.rs, E);
return self.builder.pairColl(combined_ls, combined_rs);
}
};
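The O(1) unzip claim follows directly from the representation: the components are already stored as two separate arrays, so unzip just hands them back. A minimal Rust sketch of the structure-of-arrays idea (a simplification of PairColl; the real collections are polymorphic over element types, not fixed to integers):

```rust
/// Structure-of-arrays pair collection: components stored separately.
struct PairColl {
    ls: Vec<i64>,
    rs: Vec<i64>,
}

impl PairColl {
    /// Length is the shorter of the two component arrays.
    fn len(&self) -> usize {
        self.ls.len().min(self.rs.len())
    }

    /// Zipped view at index i: the tuple is materialized on demand.
    fn get(&self, i: usize) -> Option<(i64, i64)> {
        if i >= self.len() {
            return None;
        }
        Some((self.ls[i], self.rs[i]))
    }

    /// O(1) unzip: no copying, just return the component arrays.
    fn unzip(self) -> (Vec<i64>, Vec<i64>) {
        (self.ls, self.rs)
    }
}
```

Contrast with the array-of-structures layout, where unzip must walk every pair and copy its halves into two fresh arrays, an O(n) materialization.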
CollBuilder
Factory for creating collections[11][12]:
/// Factory for creating collections
const CollBuilder = struct {
allocator: Allocator,
/// Create pair collection from two collections
pub fn pairColl(
self: *CollBuilder,
ls: *Coll,
rs: *Coll,
) *PairColl {
// Handle length mismatch by using minimum
const result = self.allocator.create(PairColl) catch unreachable;
result.* = .{
.ls = ls,
.rs = rs,
.builder = self,
};
return result;
}
/// Create collection from array
pub fn fromArray(
self: *CollBuilder,
items: []const Value,
elem_type: SType,
) *Coll {
// Enforce size limit
if (items.len > MAX_ARRAY_LENGTH) {
@panic("Collection size exceeds maximum");
}
const result = self.allocator.create(Coll) catch unreachable;
// Special handling for pairs → PairColl
if (elem_type == .s_tuple and elem_type.s_tuple.items.len == 2) {
const ls = self.allocator.alloc(Value, items.len) catch unreachable;
const rs = self.allocator.alloc(Value, items.len) catch unreachable;
for (items, 0..) |item, i| {
ls[i] = item.tuple[0];
rs[i] = item.tuple[1];
}
result.* = .{
.data = .{ .pair = .{
.ls = self.fromArray(ls, elem_type.s_tuple.items[0]),
.rs = self.fromArray(rs, elem_type.s_tuple.items[1]),
} },
.elem_type = elem_type,
.builder = self,
};
} else {
result.* = .{
.data = .{ .array = .{ .items = items } },
.elem_type = elem_type,
.builder = self,
};
}
return result;
}
/// Create collection of n copies of value
pub fn replicate(
self: *CollBuilder,
n: usize,
value: Value,
elem_type: SType,
) *Coll {
var items = self.allocator.alloc(Value, n) catch unreachable;
for (items) |*item| {
item.* = value;
}
return self.fromArray(items, elem_type);
}
/// Create empty collection
pub fn emptyColl(self: *CollBuilder, elem_type: SType) *Coll {
return self.fromArray(&[_]Value{}, elem_type);
}
/// Split pair collection into two collections
pub fn unzip(self: *CollBuilder, coll: *const Coll) struct { *Coll, *Coll } {
switch (coll.data) {
.pair => |p| {
// O(1) for PairColl
return .{ p.ls, p.rs };
},
.array => |a| {
// O(n) for regular collection - must materialize
const n = a.items.len;
var ls = self.allocator.alloc(Value, n) catch unreachable;
var rs = self.allocator.alloc(Value, n) catch unreachable;
for (a.items, 0..) |item, i| {
ls[i] = item.tuple[0];
rs[i] = item.tuple[1];
}
const elem_type = coll.elem_type.s_tuple;
return .{
self.fromArray(ls, elem_type.items[0]),
self.fromArray(rs, elem_type.items[1]),
};
},
}
}
/// Element-wise XOR of byte arrays
pub fn xor(self: *CollBuilder, left: *const Coll, right: *const Coll) *Coll {
const n = @min(left.length(), right.length());
var result = self.allocator.alloc(Value, n) catch unreachable;
for (0..n) |i| {
const l = left.get(i).?.byte;
const r = right.get(i).?.byte;
result[i] = Value{ .byte = l ^ r };
}
return self.fromArray(result, .byte);
}
};
/// Maximum collection size (DoS protection)
const MAX_ARRAY_LENGTH: usize = 100_000;
Rust Collection Representation
The Rust implementation uses a different approach[13][14]:
/// Rust-style Collection enum
const RustCollection = union(enum) {
/// Special representation for boolean constants (bit-packed)
bool_constants: []const bool,
/// Collection of expressions
exprs: struct {
elem_type: SType,
items: []const *Expr,
},
pub fn tpe(self: RustCollection) SType {
return switch (self) {
.bool_constants => SType.collOf(.boolean),
.exprs => |e| SType.collOf(e.elem_type),
};
}
pub fn opCode(self: RustCollection) OpCode {
return switch (self) {
.bool_constants => OpCode.CollOfBoolConst,
.exprs => OpCode.Coll,
};
}
};
/// Rust collection serialization
fn serializeCollection(coll: RustCollection, writer: anytype) !void {
switch (coll) {
.bool_constants => |bools| {
try writer.writeInt(u16, @intCast(bools.len), .big);
try writeBits(writer, bools); // Bit-packed
},
.exprs => |e| {
try writer.writeInt(u16, @intCast(e.items.len), .big);
try serializeSType(e.elem_type, writer);
for (e.items) |item| {
try serializeExpr(item, writer);
}
},
}
}
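The serializer above calls a writeBits helper that is not shown. One plausible packing scheme (least-significant bit first within each byte) looks like this in Rust; the actual bit order used by sigma-rust should be verified before relying on it.

```rust
/// Pack a slice of bools into bytes, LSB-first within each byte.
/// Illustrative only: confirm the bit order against the sigma-rust
/// collection serializer before depending on this layout.
pub fn write_bits(bools: &[bool]) -> Vec<u8> {
    // One byte holds 8 flags; round the byte count up.
    let mut out = vec![0u8; (bools.len() + 7) / 8];
    for (i, &b) in bools.iter().enumerate() {
        if b {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}
```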
Cost Model
Collection Operation Costs
─────────────────────────────────────────────────────
Operation │ Cost Type │ Formula
────────────────┼──────────────┼──────────────────────
length │ Fixed │ 10
apply(i) │ Fixed │ 10
get(i) │ Fixed │ 10
map(f) │ PerItem │ 10 + ⌈n/10⌉ × 5
filter(p) │ PerItem │ 20 + ⌈n/10⌉ × 5
fold(z, op) │ PerItem │ 10 + ⌈n/10⌉ × 5
exists(p) │ PerItem │ 10 + ⌈k/10⌉ × 5 (k=items checked)
forall(p) │ PerItem │ 10 + ⌈k/10⌉ × 5 (k=items checked)
slice(from,to) │ PerItem │ 10 + ⌈len/10⌉ × 2
append(other) │ PerItem │ 20 + ⌈(n+m)/10⌉ × 2
zip(other) │ Fixed │ 10 (structural)
unzip │ Fixed │ 10 (PairColl), PerItem (array)
flatMap(f) │ Dynamic │ depends on result sizes
─────────────────────────────────────────────────────
Where: n = collection size, k = items processed before short-circuit
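The PerItem formulas in the table all share one shape: a base charge plus a per-chunk charge over ⌈n/chunk⌉ chunks. A minimal Rust sketch of that calculation (the helper name is illustrative, not the interpreter's API):

```rust
/// Per-item cost: base + ceil(n / chunk) * per_chunk, mirroring the
/// PerItem rows in the table above.
pub fn per_item_cost(base: u64, per_chunk: u64, chunk: u64, n: u64) -> u64 {
    // Integer ceiling division: (n + chunk - 1) / chunk
    base + ((n + chunk - 1) / chunk) * per_chunk
}
```

For example, map over 100 items costs 10 + ⌈100/10⌉ × 5 = 60, and exists that short-circuits after 25 items costs 10 + ⌈25/10⌉ × 5 = 25.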
Set Operations
Distinct, union, intersection[15]:
/// Set-like operations on collections
const CollSetOps = struct {
/// Remove duplicates, preserving first occurrences
pub fn distinct(coll: *const Coll, E: *Evaluator) !*Coll {
var seen = std.AutoHashMap(Value, void).init(E.allocator);
var result = std.ArrayList(Value).init(E.allocator);
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Union preserving order (set semantics)
pub fn unionSet(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
var seen = std.AutoHashMap(Value, void).init(E.allocator);
var result = std.ArrayList(Value).init(E.allocator);
// Add all from first collection
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
// Add unseen from second collection
for (0..other.length()) |i| {
const elem = other.get(i).?;
if (!seen.contains(elem)) {
try seen.put(elem, {});
try result.append(elem);
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
/// Multiset intersection
pub fn intersect(coll: *const Coll, other: *const Coll, E: *Evaluator) !*Coll {
// Count occurrences in other
var counts = std.AutoHashMap(Value, usize).init(E.allocator);
for (0..other.length()) |i| {
const elem = other.get(i).?;
const entry = try counts.getOrPut(elem);
if (!entry.found_existing) {
entry.value_ptr.* = 0;
}
entry.value_ptr.* += 1;
}
// Collect elements that exist in other
var result = std.ArrayList(Value).init(E.allocator);
for (0..coll.length()) |i| {
const elem = coll.get(i).?;
if (counts.getPtr(elem)) |count| { // getPtr: mutate the map entry in place
if (count.* > 0) {
try result.append(elem);
count.* -= 1;
}
}
}
return coll.builder.fromArray(result.items, coll.elem_type);
}
};
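The multiset semantics above can be sketched compactly in Rust. The key detail is that decrementing a count requires a mutable reference into the map, not a copied-out value:

```rust
use std::collections::HashMap;

/// Multiset intersection preserving the first collection's order:
/// each element is kept at most as many times as it occurs in `other`.
pub fn intersect(coll: &[i32], other: &[i32]) -> Vec<i32> {
    // Count occurrences in `other`.
    let mut counts: HashMap<i32, usize> = HashMap::new();
    for &e in other {
        *counts.entry(e).or_insert(0) += 1;
    }
    // Keep elements of `coll` while their budget in `other` lasts.
    coll.iter()
        .copied()
        .filter(|e| match counts.get_mut(e) {
            Some(c) if *c > 0 => {
                *c -= 1;
                true
            }
            _ => false,
        })
        .collect()
}
```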
Summary
- Coll[T] is immutable, indexed, deterministic
- CollOverArray wraps arrays with specialized primitive support
- PairColl uses Structure-of-Arrays for O(1) unzip
- CollBuilder creates collections with automatic pair optimization
- Short-circuit evaluation for exists/forall reduces costs
- Size limit (100K elements) prevents DoS attacks
- All operations have defined costs for gas calculation
Next: Chapter 21: AVL+ Trees
[1] Scala: Colls.scala:12-50 (Coll trait)
[2] Rust: collection.rs:21-32 (Collection enum)
[3] Scala: Colls.scala:50-100 (core operations)
[4] Rust: coll_by_index.rs (ByIndex)
[5] Scala: CollsOverArrays.scala:30-50 (map, filter)
[6] Rust: coll_map.rs:17-62 (Map struct)
[7] Rust: coll_exists.rs, coll_forall.rs
[8] Scala: CollsOverArrays.scala:50-80 (slice, append)
[9] Scala: Colls.scala:150-180 (PairColl trait)
[10] Scala: CollsOverArrays.scala:200-280 (PairOfCols)
[11] Scala: Colls.scala:180-220 (CollBuilder trait)
[12] Scala: CollsOverArrays.scala:300-400 (CollOverArrayBuilder)
[13] Rust: collection.rs:34-56 (Collection::new)
[14] Rust: collection.rs:100-136 (serialization)
[15] Scala: CollsOverArrays.scala:100-150 (set operations)
Chapter 21: AVL+ Trees
Prerequisites
- Chapter 10 for BLAKE2b256 hashing used in node digests
- Chapter 20 for collection operations that AVL trees extend
- Familiarity with binary search tree concepts and balancing
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the prover-verifier architecture for authenticated dictionaries
- Implement the AvlTreeData and ADDigest structures storing 33-byte commitments
- Use operation flags to control insert/update/remove permissions
- Apply proof-based verification for tree operations (contains, get, insert, update, remove)
- Calculate operation costs based on proof length and tree height
Authenticated Dictionary Model
AVL+ trees provide authenticated key-value storage[1][2]:
Prover-Verifier Architecture
─────────────────────────────────────────────────────
OFF-CHAIN (Prover - holds full tree):
┌─────────────────────────────────────────────────────┐
│ BatchAVLProver │
│ ┌─────────────────────────────────────────────────┐│
│ │ Complete Tree Structure ││
│ │ [H] ││
│ │ / \ ││
│ │ [D] [L] ││
│ │ / \ / \ ││
│ │ [B] [F][J] [N] ││
│ └─────────────────────────────────────────────────┘│
│ • Performs operations │
│ • Generates proofs │
│ • Maintains full state │
└─────────────────────────│───────────────────────────┘
│ proof bytes
▼
ON-CHAIN (Verifier - holds only digest):
┌─────────────────────────────────────────────────────┐
│ CAvlTreeVerifier │
│ ┌─────────────────────────────────────────────────┐│
│ │ Digest: [32-byte root hash][height byte] ││
│ │ ═══════════════════════════════ ││
│ │ (33 bytes total) ││
│ └─────────────────────────────────────────────────┘│
│ • Verifies proof bytes │
│ • Returns operation results │
│ • Rejects invalid proofs │
└─────────────────────────────────────────────────────┘
AvlTreeData Structure
Core data type for authenticated trees[3][4]:
/// Authenticated tree data (stored on-chain)
const AvlTreeData = struct {
/// Root hash + height (33 bytes total)
digest: ADDigest,
/// Permitted operations
tree_flags: AvlTreeFlags,
/// Fixed key length (all keys same size)
/// Note: In Ergo, this is always 32 bytes (Blake2b256 hash)
key_length: u32,
/// Optional fixed value length
value_length_opt: ?u32,
pub const DIGEST_SIZE: usize = 33; // 32-byte hash + 1-byte height
pub fn fromDigest(digest: []const u8) AvlTreeData {
return .{
.digest = ADDigest.fromSlice(digest),
.tree_flags = AvlTreeFlags.allOperationsAllowed(),
.key_length = 32, // Ergo: always 32 bytes (Blake2b256 hash)
.value_length_opt = null,
};
}
};
/// 33-byte authenticated digest
const ADDigest = struct {
/// 32-byte BLAKE2b256 root hash
root_hash: [32]u8,
/// Tree height (0-255)
height: u8,
pub fn fromSlice(bytes: []const u8) ADDigest {
var result: ADDigest = undefined;
@memcpy(&result.root_hash, bytes[0..32]);
result.height = bytes[32];
return result;
}
pub fn toBytes(self: ADDigest) [33]u8 {
var result: [33]u8 = undefined;
@memcpy(result[0..32], &self.root_hash);
result[32] = self.height;
return result;
}
};
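The 33-byte layout is simple enough to pin down with a roundtrip sketch in Rust (the struct name is illustrative):

```rust
/// 33-byte AVL digest: 32-byte BLAKE2b256 root hash followed by a
/// one-byte tree height.
pub struct AdDigest {
    pub root_hash: [u8; 32],
    pub height: u8,
}

impl AdDigest {
    pub fn from_bytes(bytes: &[u8; 33]) -> Self {
        let mut root_hash = [0u8; 32];
        root_hash.copy_from_slice(&bytes[..32]);
        Self { root_hash, height: bytes[32] }
    }

    pub fn to_bytes(&self) -> [u8; 33] {
        let mut out = [0u8; 33];
        out[..32].copy_from_slice(&self.root_hash);
        out[32] = self.height;
        out
    }
}
```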
Operation Flags
Control which modifications are permitted[5][6]:
/// Operation permission flags (bit-packed)
const AvlTreeFlags = struct {
flags: u8,
/// Bit positions
const INSERT_BIT: u8 = 0x01;
const UPDATE_BIT: u8 = 0x02;
const REMOVE_BIT: u8 = 0x04;
pub fn new(insert_allowed: bool, update_allowed: bool, remove_allowed: bool) AvlTreeFlags {
var flags: u8 = 0;
if (insert_allowed) flags |= INSERT_BIT;
if (update_allowed) flags |= UPDATE_BIT;
if (remove_allowed) flags |= REMOVE_BIT;
return .{ .flags = flags };
}
pub fn parse(byte: u8) AvlTreeFlags {
return .{ .flags = byte };
}
pub fn serialize(self: AvlTreeFlags) u8 {
return self.flags;
}
// Predefined flag combinations
pub fn readOnly() AvlTreeFlags {
return .{ .flags = 0x00 };
}
pub fn allOperationsAllowed() AvlTreeFlags {
return .{ .flags = 0x07 };
}
pub fn insertOnly() AvlTreeFlags {
return .{ .flags = 0x01 };
}
pub fn removeOnly() AvlTreeFlags {
return .{ .flags = 0x04 };
}
// Permission checks
pub fn insertAllowed(self: AvlTreeFlags) bool {
return (self.flags & INSERT_BIT) != 0;
}
pub fn updateAllowed(self: AvlTreeFlags) bool {
return (self.flags & UPDATE_BIT) != 0;
}
pub fn removeAllowed(self: AvlTreeFlags) bool {
return (self.flags & REMOVE_BIT) != 0;
}
};
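The flag byte encoding can be condensed to a single expression. A Rust sketch reproducing the predefined combinations above (function name is illustrative):

```rust
/// AVL tree permission flags packed into one byte:
/// bit 0 = insert, bit 1 = update, bit 2 = remove.
pub fn pack_flags(insert: bool, update: bool, remove: bool) -> u8 {
    (insert as u8) | ((update as u8) << 1) | ((remove as u8) << 2)
}
```

This reproduces readOnly = 0x00, insertOnly = 0x01, removeOnly = 0x04, and allOperationsAllowed = 0x07.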
AvlTree Interface
ErgoScript interface for authenticated trees[7]:
/// AvlTree wrapper providing ErgoScript interface
const AvlTree = struct {
data: AvlTreeData,
/// Get 33-byte authenticated digest (returned by value; returning a
/// slice into a stack temporary would dangle)
pub fn digest(self: *const AvlTree) [33]u8 {
return self.data.digest.toBytes();
}
/// Get operation flags byte
pub fn enabledOperations(self: *const AvlTree) u8 {
return self.data.tree_flags.serialize();
}
/// Get fixed key length
pub fn keyLength(self: *const AvlTree) i32 {
return @intCast(self.data.key_length);
}
/// Get optional fixed value length
pub fn valueLengthOpt(self: *const AvlTree) ?i32 {
if (self.data.value_length_opt) |v| {
return @intCast(v);
}
return null;
}
/// Permission checks
pub fn isInsertAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.insertAllowed();
}
pub fn isUpdateAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.updateAllowed();
}
pub fn isRemoveAllowed(self: *const AvlTree) bool {
return self.data.tree_flags.removeAllowed();
}
/// Create new tree with updated digest (immutable)
pub fn updateDigest(self: *const AvlTree, new_digest: []const u8) AvlTree {
var new_data = self.data;
new_data.digest = ADDigest.fromSlice(new_digest);
return .{ .data = new_data };
}
/// Create new tree with updated flags (immutable)
pub fn updateOperations(self: *const AvlTree, new_flags: u8) AvlTree {
var new_data = self.data;
new_data.tree_flags = AvlTreeFlags.parse(new_flags);
return .{ .data = new_data };
}
};
Verifier Implementation
The verifier processes proofs to verify operations[8][9]:
/// AVL tree proof verifier
const AvlTreeVerifier = struct {
/// Current state digest (null if verification failed)
current_digest: ?ADDigest,
/// Proof bytes to process
proof: []const u8,
/// Current position in proof
proof_pos: usize,
/// Key length
key_length: usize,
/// Optional value length
value_length_opt: ?usize,
/// Stable storage for the bytes returned by `digest()`
digest_bytes: [33]u8 = undefined,
pub fn init(tree: *const AvlTree, proof: []const u8) AvlTreeVerifier {
return .{
.current_digest = tree.data.digest,
.proof = proof,
.proof_pos = 0,
.key_length = tree.data.key_length,
.value_length_opt = tree.data.value_length_opt,
};
}
/// Get current tree height
pub fn treeHeight(self: *const AvlTreeVerifier) usize {
if (self.current_digest) |d| {
return d.height;
}
return 0;
}
/// Get current digest (null if verification failed).
/// Copies into verifier-owned storage: returning `&d.toBytes()`
/// directly would be a dangling pointer to a stack temporary.
pub fn digest(self: *AvlTreeVerifier) ?[]const u8 {
if (self.current_digest) |d| {
self.digest_bytes = d.toBytes();
return &self.digest_bytes;
}
return null;
}
/// Perform lookup operation
pub fn performLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
// Process proof to verify key existence
const result = try self.verifyLookup(key);
return result;
}
/// Perform insert operation
pub fn performInsert(
self: *AvlTreeVerifier,
key: []const u8,
value: []const u8,
) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
// Process proof to verify insertion
const old_value = try self.verifyInsert(key, value);
// Update digest based on proof
self.updateDigestFromProof();
return old_value;
}
/// Perform update operation
pub fn performUpdate(
self: *AvlTreeVerifier,
key: []const u8,
value: []const u8,
) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
const old_value = try self.verifyUpdate(key, value);
self.updateDigestFromProof();
return old_value;
}
/// Perform remove operation
pub fn performRemove(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
if (self.current_digest == null) return error.VerificationFailed;
const old_value = try self.verifyRemove(key);
self.updateDigestFromProof();
return old_value;
}
fn verifyLookup(self: *AvlTreeVerifier, key: []const u8) !?[]const u8 {
// NOTE: Stub - full implementation requires:
// 1. Read node type from proof (leaf vs internal)
// 2. Compare key with node key
// 3. Follow proof path based on comparison result
// 4. Verify all hashes match computed values
// See scorex-util: BatchAVLVerifier for reference.
_ = self;
_ = key;
return error.NotImplemented;
}
// SECURITY: Key comparisons in production must be constant-time to prevent
// timing attacks that could leak key values. Use std.crypto.utils.timingSafeEql.
fn updateDigestFromProof(self: *AvlTreeVerifier) void {
// Extract new digest from proof processing
_ = self;
}
};
Proof-Based Operations
Operations use proofs for verification[10][11]:
/// Evaluate contains operation
fn containsEval(
tree: *const AvlTree,
key: []const u8,
proof: []const u8,
E: *Evaluator,
) !bool {
// Cost: create verifier O(proof.length)
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeContains);
var verifier = AvlTreeVerifier.init(tree, proof);
// Cost: lookup O(tree.height)
const n_items = verifier.treeHeight();
try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeContains);
const result = verifier.performLookup(key) catch return false;
return result != null;
}
/// Evaluate get operation
fn getEval(
tree: *const AvlTree,
key: []const u8,
proof: []const u8,
E: *Evaluator,
) !?[]const u8 {
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeGet);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = verifier.treeHeight();
try E.addSeqCost(LookupCost, n_items, OpCode.AvlTreeGet);
return verifier.performLookup(key) catch return error.InvalidProof;
}
/// Evaluate insert operation
fn insertEval(
tree: *const AvlTree,
entries: []const KeyValue,
proof: []const u8,
E: *Evaluator,
) !?AvlTree {
// Check permission
try E.addCost(IsInsertAllowedCost, OpCode.AvlTreeInsert);
if (!tree.isInsertAllowed()) {
return null;
}
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeInsert);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = @max(verifier.treeHeight(), 1);
// Process each entry
for (entries) |entry| {
try E.addSeqCost(InsertCost, n_items, OpCode.AvlTreeInsert);
_ = verifier.performInsert(entry.key, entry.value) catch return null;
}
// Return new tree with updated digest, by value: returning a
// pointer to a stack local would dangle
const new_digest = verifier.digest() orelse return null;
try E.addCost(UpdateDigestCost, OpCode.AvlTreeInsert);
return tree.updateDigest(new_digest);
}
/// Evaluate remove operation
fn removeEval(
tree: *const AvlTree,
keys: []const []const u8,
proof: []const u8,
E: *Evaluator,
) !?AvlTree {
try E.addCost(IsRemoveAllowedCost, OpCode.AvlTreeRemove);
if (!tree.isRemoveAllowed()) {
return null;
}
try E.addSeqCost(CreateVerifierCost, proof.len, OpCode.AvlTreeRemove);
var verifier = AvlTreeVerifier.init(tree, proof);
const n_items = @max(verifier.treeHeight(), 1);
for (keys) |key| {
try E.addSeqCost(RemoveCost, n_items, OpCode.AvlTreeRemove);
_ = verifier.performRemove(key) catch return null;
}
const new_digest = verifier.digest() orelse return null;
try E.addCost(UpdateDigestCost, OpCode.AvlTreeRemove);
// Return by value to avoid a dangling pointer to a temporary
return tree.updateDigest(new_digest);
}
const KeyValue = struct {
key: []const u8,
value: []const u8,
};
Cost Model
AVL tree operations have two-part costs[12]. Since AVL+ trees are balanced, the tree height is O(log n) where n is the number of entries. Proof size is also O(log n) as proofs contain one sibling hash per tree level.
AVL Tree Operation Costs
─────────────────────────────────────────────────────
Phase 1 - Create Verifier (O(proof.length)):
base = 110, per_chunk = 20, chunk_size = 64
Phase 2 - Per Operation (O(tree.height)):
Operation │ Base │ Per Height │ Chunk
──────────────┼──────┼────────────┼───────
Lookup │ 40 │ 10 │ 1
Insert │ 40 │ 10 │ 1
Update │ 120 │ 20 │ 1
Remove │ 100 │ 15 │ 1
──────────────────────────────────────────────────────
Example: Get operation on tree with height 10, proof 128 bytes
Verifier: 110 + ⌈128/64⌉ × 20 = 110 + 2 × 20 = 150
Lookup: 40 + 10 × 10 = 140
Total: 290 JitCost units
const CreateVerifierCost = PerItemCost{
.base = JitCost{ .value = 110 },
.per_chunk = JitCost{ .value = 20 },
.chunk_size = 64,
};
const LookupCost = PerItemCost{
.base = JitCost{ .value = 40 },
.per_chunk = JitCost{ .value = 10 },
.chunk_size = 1,
};
const InsertCost = PerItemCost{
.base = JitCost{ .value = 40 },
.per_chunk = JitCost{ .value = 10 },
.chunk_size = 1,
};
const UpdateCost = PerItemCost{
.base = JitCost{ .value = 120 },
.per_chunk = JitCost{ .value = 20 },
.chunk_size = 1,
};
const RemoveCost = PerItemCost{
.base = JitCost{ .value = 100 },
.per_chunk = JitCost{ .value = 15 },
.chunk_size = 1,
};
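These constants feed the same base-plus-chunks formula used throughout the cost model. A Rust sketch that reproduces the worked example above (290 JitCost units for a get on a height-10 tree with a 128-byte proof); the struct name mirrors the Zig sketch and is illustrative:

```rust
/// Mirror of the PerItemCost constants above.
pub struct PerItemCost {
    pub base: u64,
    pub per_chunk: u64,
    pub chunk_size: u64,
}

impl PerItemCost {
    /// base + ceil(n_items / chunk_size) * per_chunk
    pub fn cost(&self, n_items: u64) -> u64 {
        let chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
        self.base + chunks * self.per_chunk
    }
}
```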
Serialization
AvlTreeData serialization format[13][14]:
/// Serialize AvlTreeData
fn serializeAvlTreeData(data: *const AvlTreeData, writer: anytype) !void {
// Digest (33 bytes)
try writer.writeAll(&data.digest.toBytes());
// Flags (1 byte)
try writer.writeByte(data.tree_flags.serialize());
// Key length (VLQ)
try writeUInt(writer, data.key_length);
// Optional value length
if (data.value_length_opt) |vlen| {
try writer.writeByte(1); // Some
try writeUInt(writer, vlen);
} else {
try writer.writeByte(0); // None
}
}
/// Parse AvlTreeData
fn parseAvlTreeData(reader: anytype) !AvlTreeData {
// Digest (33 bytes)
var digest_bytes: [33]u8 = undefined;
if ((try reader.readAll(&digest_bytes)) != 33) return error.UnexpectedEndOfStream;
const digest = ADDigest.fromSlice(&digest_bytes);
// Flags (1 byte)
const flags = AvlTreeFlags.parse(try reader.readByte());
// Key length (VLQ)
const key_length = try readUInt(reader);
// Optional value length
const has_value_length = (try reader.readByte()) != 0;
const value_length_opt: ?u32 = if (has_value_length)
try readUInt(reader)
else
null;
return AvlTreeData{
.digest = digest,
.tree_flags = flags,
.key_length = key_length,
.value_length_opt = value_length_opt,
};
}
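The writeUInt/readUInt helpers used above are assumed to be the usual 7-bits-per-byte VLQ encoding (low bits first, continuation bit set on all but the last byte). A Rust sketch of the write side; verify the exact variant against the reference serializers before relying on it:

```rust
/// Unsigned VLQ: emit 7 data bits per byte, least-significant group
/// first, with the high bit set on every byte except the last.
pub fn write_uint(mut v: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte); // last byte: continuation bit clear
            break;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}
```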
Off-Chain Proof Generation
Provers generate proofs for operations:
/// Off-chain AVL tree prover (holds full tree)
const AvlProver = struct {
/// Full tree structure
root: ?*AvlNode,
/// Key length
key_length: usize,
/// Value length (optional)
value_length_opt: ?usize,
/// Pending operations for batch proof
pending_ops: std.ArrayList(Operation),
allocator: Allocator,
const Operation = union(enum) {
lookup: []const u8,
insert: struct { key: []const u8, value: []const u8 },
update: struct { key: []const u8, value: []const u8 },
remove: []const u8,
};
/// Perform insert and record for proof
pub fn performInsert(self: *AvlProver, key: []const u8, value: []const u8) !void {
// Actually insert into tree
self.root = try self.insertNode(self.root, key, value);
// Record for proof generation
try self.pending_ops.append(.{ .insert = .{ .key = key, .value = value } });
}
/// Generate proof for all pending operations
pub fn generateProof(self: *AvlProver) ![]const u8 {
var proof_builder = ProofBuilder.init(self.allocator);
for (self.pending_ops.items) |op| {
switch (op) {
.lookup => |key| try proof_builder.addLookupPath(self.root, key),
.insert => |ins| try proof_builder.addInsertPath(self.root, ins.key),
.update => |upd| try proof_builder.addUpdatePath(self.root, upd.key),
.remove => |key| try proof_builder.addRemovePath(self.root, key),
}
}
self.pending_ops.clearRetainingCapacity();
return proof_builder.finish();
}
/// Get current tree digest
pub fn digest(self: *const AvlProver) ADDigest {
if (self.root) |r| {
return computeNodeDigest(r);
}
// Sketch: the reference prover defines a specific empty-tree
// digest; all zeros is a placeholder, not the real value
return ADDigest{ .root_hash = [_]u8{0} ** 32, .height = 0 };
}
fn computeNodeDigest(node: *const AvlNode) ADDigest {
// Stub: compute the BLAKE2b256 hash of the node contents,
// including the left and right child digests
_ = node;
@panic("computeNodeDigest not implemented");
}
};
const AvlNode = struct {
key: []const u8,
value: []const u8,
left: ?*AvlNode,
right: ?*AvlNode,
height: u8,
};
Key Ordering Requirement
Keys must be provided in the same order during proof generation and verification[15]:
CRITICAL: Key Ordering
─────────────────────────────────────────────────────
Proof Generation (off-chain):
prover.performLookup(key_A)
prover.performLookup(key_B)
prover.performLookup(key_C)
proof = prover.generateProof()
Verification (on-chain):
tree.getMany([key_A, key_B, key_C], proof) ✓ Works
tree.getMany([key_B, key_A, key_C], proof) ✗ Fails
The proof encodes a specific traversal path.
Different key order = different path = verification failure.
Summary
- Authenticated dictionaries store only 33-byte digest on-chain
- Ergo key size: always 32 bytes (Blake2b256 hash); the keyLength field exists for generality
- Prover (off-chain) holds full tree, generates proofs
- Verifier (on-chain) verifies proofs with only digest
- Operation flags control insert/update/remove permissions
- Key ordering must match between proof generation and verification
- Cost scales with proof length (verifier creation) and tree height (operations)
- All methods are immutable—return new tree instances
Next: Chapter 22: Box Model
[1] Scala: AvlTreeData.scala:43-57 (AvlTreeData case class)
[2] Rust: avl_tree_data.rs:56-69 (AvlTreeData struct)
[3] Scala: AvlTreeData.scala:57 (DigestSize = 33)
[4] Rust: avl_tree_data.rs:61-62 (digest field)
[5] Scala: AvlTreeData.scala:7-36 (AvlTreeFlags)
[6] Rust: avl_tree_data.rs:10-54 (AvlTreeFlags impl)
[7] Scala: SigmaDsl.scala:547-589 (AvlTree trait)
[8] Scala: AvlTreeVerifier.scala:8-88 (AvlTreeVerifier)
[9] Scala: CAvlTreeVerifier.scala:17-45 (CAvlTreeVerifier)
[10] Scala: CErgoTreeEvaluator.scala:78-93 (contains_eval)
[11] Scala: CErgoTreeEvaluator.scala:132-164 (insert_eval)
[12] Scala: methods.scala:1498-1540 (cost info constants)
[13] Scala: AvlTreeData.scala:71-90 (serializer)
[14] Rust: avl_tree_data.rs:71-91 (SigmaSerializable impl)
[15] Scala: methods.scala:1588 (getMany key ordering caution)
Chapter 22: Box Model
Prerequisites
- Understanding of UTXO (Unspent Transaction Output) model basics
- Chapter 3 for ErgoTree format stored in boxes
- Chapter 20 for collection types used in registers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the Ergo box as the fundamental UTXO structure with extended capabilities
- Work with the register-based data model (R0-R3 mandatory, R4-R9 optional)
- Manage tokens—the multi-asset feature of Ergo boxes
- Compute box IDs using Blake2b256 hashing of serialized content
- Implement box serialization and deserialization
Box Architecture
Boxes are Ergo's state containers—the extended UTXO model[1][2]:
Box Structure
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ ErgoBox │
├─────────────────────────────────────────────────────┤
│ box_id: [32]u8 Blake2b256(serialize(box)) │
├─────────────────────────────────────────────────────┤
│ Mandatory Registers │
│ ┌───────────────────────────────────────────────┐ │
│ │ R0: Long Value in nanoERG (10⁻⁹ ERG)│ │
│ │ R1: ErgoTree Guarding script │ │
│ │ R2: Coll[Token] Secondary tokens │ │
│ │ R3: (Int, Bytes) Creation info │ │
│ └───────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Non-Mandatory Registers │
│ ┌───────────────────────────────────────────────┐ │
│ │ R4-R9: Any Application-defined data │ │
│ └───────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ Transaction Reference │
│ ┌───────────────────────────────────────────────┐ │
│ │ transaction_id: [32]u8 Creating tx hash │ │
│ │ index: u16 Output index in tx │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Core Box Structure
const ErgoBox = struct {
/// Blake2b256 hash of serialized box (computed)
box_id: BoxId,
/// Amount in NanoErgs (R0)
value: BoxValue,
/// Guarding script (R1)
ergo_tree: ErgoTree,
/// Secondary tokens (R2), up to MAX_TOKENS_COUNT
tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
/// Additional registers R4-R9
additional_registers: NonMandatoryRegisters,
/// Block height when transaction was created (part of R3)
creation_height: u32,
/// Transaction that created this box (part of R3)
transaction_id: TxId,
/// Output index in transaction (part of R3)
index: u16,
/// Protocol: 255 (u8), practical: ~122 due to box size limit
pub const MAX_TOKENS_COUNT: usize = 255;
pub const MAX_BOX_SIZE: usize = 4096;
pub const MAX_SCRIPT_SIZE: usize = 4096;
/// Create new box, computing box_id from content
pub fn init(
value: BoxValue,
ergo_tree: ErgoTree,
tokens: ?BoundedVec(Token, 1, MAX_TOKENS_COUNT),
additional_registers: NonMandatoryRegisters,
creation_height: u32,
transaction_id: TxId,
index: u16,
) !ErgoBox {
var box_with_zero_id = ErgoBox{
.box_id = BoxId.zero(),
.value = value,
.ergo_tree = ergo_tree,
.tokens = tokens,
.additional_registers = additional_registers,
.creation_height = creation_height,
.transaction_id = transaction_id,
.index = index,
};
box_with_zero_id.box_id = try box_with_zero_id.calcBoxId();
return box_with_zero_id;
}
/// Compute box ID as Blake2b256 hash of serialized bytes
fn calcBoxId(self: *const ErgoBox) !BoxId {
const bytes = try self.sigmaSerialize();
const hash = blake2b256(bytes);
return BoxId{ .digest = hash };
}
/// Create box from candidate by adding transaction reference
pub fn fromBoxCandidate(
candidate: *const ErgoBoxCandidate,
transaction_id: TxId,
index: u16,
) !ErgoBox {
return init(
candidate.value,
candidate.ergo_tree,
candidate.tokens,
candidate.additional_registers,
candidate.creation_height,
transaction_id,
index,
);
}
};
ErgoBoxCandidate
Before confirmation, boxes exist as candidates without transaction reference[3][4]:
/// Box before transaction confirmation (no tx reference yet)
const ErgoBoxCandidate = struct {
/// Amount in NanoErgs
value: BoxValue,
/// Guarding script
ergo_tree: ErgoTree,
/// Secondary tokens
tokens: ?BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT),
/// Additional registers R4-R9
additional_registers: NonMandatoryRegisters,
/// Declared creation height
creation_height: u32,
pub fn toBox(self: *const ErgoBoxCandidate, tx_id: TxId, index: u16) !ErgoBox {
return ErgoBox.fromBoxCandidate(self, tx_id, index);
}
};
Register Model
Ten registers total—four mandatory, six application-defined[5][6]:
Register Layout
─────────────────────────────────────────────────────
ID Type Purpose
─────────────────────────────────────────────────────
R0 Long Value in nanoERG (10⁻⁹ ERG)
R1 Coll[Byte] Serialized ErgoTree
R2 Coll[(Coll[Byte],Long)] Secondary tokens
R3 (Int, Coll[Byte]) (height, txId ++ index)
─────────────────────────────────────────────────────
R4 Any Application data
R5 Any Application data
R6 Any Application data
R7 Any Application data
R8 Any Application data
R9 Any Application data
─────────────────────────────────────────────────────
Note: R4-R9 must be densely packed.
If R6 is used, R4 and R5 must also be present.
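The dense-packing rule is easy to state as a predicate: a sorted list of present non-mandatory register ids must be exactly the prefix 4, 5, 6, ... A Rust sketch (function name is illustrative):

```rust
/// R4-R9 must be densely packed: if register k is present, all of
/// R4..k must be present too. Given the present register ids in
/// ascending order (each in 4..=9), denseness means the list is
/// exactly the prefix 4, 5, 6, ...
pub fn is_densely_packed(present_ids: &[u8]) -> bool {
    present_ids
        .iter()
        .enumerate()
        .all(|(i, &id)| id as usize == 4 + i)
}
```

For example, using only R4 and R6 while skipping R5 violates the rule.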
Register ID Types
/// Register identifier (0-9)
const RegisterId = union(enum) {
mandatory: MandatoryRegisterId,
non_mandatory: NonMandatoryRegisterId,
pub const R0 = RegisterId{ .mandatory = .r0 };
pub const R1 = RegisterId{ .mandatory = .r1 };
pub const R2 = RegisterId{ .mandatory = .r2 };
pub const R3 = RegisterId{ .mandatory = .r3 };
pub fn fromByte(value: u8) !RegisterId {
if (value < 4) {
return RegisterId{ .mandatory = @enumFromInt(value) };
} else if (value <= 9) {
return RegisterId{ .non_mandatory = @enumFromInt(value) };
} else {
return error.RegisterIdOutOfBounds;
}
}
};
/// Mandatory registers (R0-R3) - every box has these
const MandatoryRegisterId = enum(u8) {
/// Monetary value in NanoErgs
r0 = 0,
/// Guarding script (serialized ErgoTree)
r1 = 1,
/// Secondary tokens
r2 = 2,
/// Transaction reference and creation height
r3 = 3,
};
/// Non-mandatory registers (R4-R9) - application defined
const NonMandatoryRegisterId = enum(u8) {
r4 = 4,
r5 = 5,
r6 = 6,
r7 = 7,
r8 = 8,
r9 = 9,
pub const START_INDEX: usize = 4;
pub const END_INDEX: usize = 9;
pub const NUM_REGS: usize = 6;
};
Non-Mandatory Registers
Densely-packed storage for R4-R9[7][8]:
const NonMandatoryRegisters = struct {
/// Registers stored as contiguous array (R4 at index 0)
values: []RegisterValue,
allocator: Allocator,
pub const MAX_SIZE: usize = NonMandatoryRegisterId.NUM_REGS;
pub fn empty() NonMandatoryRegisters {
return .{ .values = &.{}, .allocator = undefined };
}
/// Create from map, ensuring dense packing
pub fn fromMap(
allocator: Allocator,
map: std.AutoHashMap(NonMandatoryRegisterId, Constant),
) !NonMandatoryRegisters {
const count = map.count();
if (count > MAX_SIZE) return error.InvalidSize;
// Verify dense packing: R4...R(4+count-1) must all be present
var values = try allocator.alloc(RegisterValue, count);
var i: usize = 0;
while (i < count) : (i += 1) {
const reg_id: NonMandatoryRegisterId = @enumFromInt(4 + i);
const constant = map.get(reg_id) orelse
return error.NonDenselyPacked;
values[i] = RegisterValue{ .parsed = constant };
}
return .{ .values = values, .allocator = allocator };
}
/// Get register by ID, returns null if not present
pub fn get(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) ?*const RegisterValue {
const index = @intFromEnum(reg_id) - NonMandatoryRegisterId.START_INDEX;
if (index >= self.values.len) return null;
return &self.values[index];
}
/// Get as Constant, handling parse errors
pub fn getConstant(self: *const NonMandatoryRegisters, reg_id: NonMandatoryRegisterId) !?Constant {
const reg_val = self.get(reg_id) orelse return null;
return try reg_val.asConstant();
}
};
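As a usage sketch (the `allocator` and `Constant.fromLong` names are assumed from surrounding context), populating R4 and R6 while skipping R5 violates dense packing:

```zig
test "non-dense registers are rejected" {
    const allocator = std.testing.allocator;
    var map = std.AutoHashMap(NonMandatoryRegisterId, Constant).init(allocator);
    defer map.deinit();
    try map.put(.r4, Constant.fromLong(1));
    try map.put(.r6, Constant.fromLong(2)); // R5 is missing
    // fromMap walks R4..R(4+count-1), so the gap at R5 surfaces as an error
    try std.testing.expectError(
        error.NonDenselyPacked,
        NonMandatoryRegisters.fromMap(allocator, map),
    );
}
```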
/// Register value—either parsed Constant or unparseable bytes
const RegisterValue = union(enum) {
parsed: Constant,
parsed_tuple: EvaluatedTuple,
invalid: struct {
bytes: []const u8,
error_msg: []const u8,
},
pub fn asConstant(self: *const RegisterValue) !Constant {
return switch (self.*) {
.parsed => |c| c,
.parsed_tuple => |t| t.toConstant(),
.invalid => error.UnparseableRegister,
};
}
};
Box ID Computation
Box ID is the Blake2b256 hash of the serialized box content:
const BoxId = struct {
digest: [32]u8,
pub const SIZE: usize = 32;
pub fn zero() BoxId {
return .{ .digest = [_]u8{0} ** 32 };
}
pub fn fromBytes(bytes: []const u8) !BoxId {
if (bytes.len != SIZE) return error.InvalidLength;
var result: BoxId = undefined;
@memcpy(&result.digest, bytes);
return result;
}
};
/// Compute box ID from serialized box bytes
pub fn computeBoxId(box_bytes: []const u8) BoxId {
return BoxId{ .digest = blake2b256(box_bytes) };
}
The ID includes transaction reference, making each box unique:
Box ID Computation
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ Serialized Box Bytes │
├──────────────────────────────────────────────────┤
│ value (VLQ) │
│ ergo_tree (bytes) │
│ creation_height (VLQ) │
│ tokens_count (u8) │
│ tokens[] (token_id + amount) │
│ registers_count (u8) │
│ additional_registers[] │
│ transaction_id (32 bytes) │
│ index (2 bytes, big-endian) │
└──────────────────────────────────────────────────┘
│
▼
┌─────────────────┐
│ Blake2b256 │
└────────┬────────┘
│
▼
┌─────────────────┐
│ BoxId (32 B) │
└─────────────────┘
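The blake2b256 helper used by computeBoxId can be sketched with Zig's standard library (a minimal sketch; the authoritative implementations live in the Scala and Rust sources):

```zig
/// Blake2b with a 256-bit digest, as used for box IDs
fn blake2b256(bytes: []const u8) [32]u8 {
    var digest: [32]u8 = undefined;
    std.crypto.hash.blake2.Blake2b256.hash(bytes, &digest, .{});
    return digest;
}
```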
Register Access
Get register value with type checking:
/// Get any register value (R0-R9)
pub fn getRegister(box: *const ErgoBox, id: RegisterId) !?Constant {
return switch (id) {
.mandatory => |mid| switch (mid) {
.r0 => Constant.fromLong(box.value.as_i64()),
.r1 => Constant.fromBytes(try box.ergo_tree.serialize()),
.r2 => Constant.fromTokens(try box.tokensRaw()),
.r3 => Constant.fromTuple(box.creationInfo()),
},
.non_mandatory => |nid| try box.additional_registers.getConstant(nid),
};
}
/// Raw (bytes, amount) token pair
const RawToken = struct { []const i8, i64 };
/// Get tokens as raw (bytes, amount) pairs; caller owns the result
pub fn tokensRaw(box: *const ErgoBox) ![]const RawToken {
const tokens = box.tokens orelse return &.{};
const result = try allocator.alloc(RawToken, tokens.items().len);
for (tokens.items(), 0..) |token, i| {
result[i] = .{ token.token_id.asVecI8(), token.amount.as_i64() };
}
return result;
}
/// Get creation info as (height, txId ++ index); bytes are returned
/// by value so no pointer into stack memory escapes
pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
var bytes: [34]u8 = undefined; // 32-byte tx_id + 2-byte big-endian index
@memcpy(bytes[0..32], &box.transaction_id.digest);
std.mem.writeInt(u16, bytes[32..34], box.index, .big);
return .{
@intCast(box.creation_height),
bytes,
};
}
ExtractRegisterAs (AST Node)
Register access in ErgoScript compiles to ExtractRegisterAs:
/// Box.R0 - Box.R9 operations
const ExtractRegisterAs = struct {
/// Input box expression
input: *const Expr,
/// Register index (0-9)
register_id: i8,
/// Expected element type (wrapped in Option)
elem_tpe: SType,
pub const OP_CODE = OpCode.new(0x6E); // EXTRACT_REGISTER_AS
pub fn tpe(self: *const ExtractRegisterAs) SType {
return SType.option(self.elem_tpe);
}
pub fn eval(self: *const ExtractRegisterAs, env: *Env, ctx: *Context) !Value {
const ir_box = try self.input.eval(env, ctx);
const box = ir_box.asBox() orelse return error.TypeMismatch;
const id = RegisterId.fromByte(@intCast(self.register_id)) catch
return error.RegisterIdOutOfBounds;
const reg_val_opt = try box.getRegister(id);
if (reg_val_opt) |constant| {
// Type must match exactly
if (!constant.tpe.equals(self.elem_tpe)) {
return error.UnexpectedType;
}
return Value.some(constant.value);
} else {
return Value.none();
}
}
};
Token Representation
Tokens are (id, amount) pairs stored in R2:
const Token = struct {
/// 32-byte token identifier
token_id: TokenId,
/// Token amount (positive i64)
amount: TokenAmount,
};
const TokenId = struct {
digest: [32]u8,
pub const SIZE: usize = 32;
};
const TokenAmount = struct {
value: u64,
pub fn as_i64(self: TokenAmount) i64 {
return @intCast(self.value);
}
};
/// Bounded collection of tokens (1 to MAX_TOKENS)
const BoxTokens = BoundedVec(Token, 1, ErgoBox.MAX_TOKENS_COUNT);
Token minting rule:
Token Creation Rule
─────────────────────────────────────────────────────
A new token can ONLY be minted when:
token_id == INPUTS(0).id (MUST equal first input's box ID)
This is a consensus rule enforced by the protocol.
Only the first input's box ID can be used as a new token ID.
This ensures uniqueness: tokens are "born" from a specific box.
┌─────────────┐ Spend ┌─────────────────┐
│ Input Box │ ─────────────► │ Output Box │
│ id: ABC123 │ │ token: ABC123 │
└─────────────┘ │ amount: 1000 │
└─────────────────┘
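The minting rule reduces to a single comparison. The helper below is a sketch (the name and wiring are assumptions; the actual rule is enforced during stateful validation by the node):

```zig
/// A token absent from all inputs may only be minted if its ID equals
/// the first input's box ID (consensus rule sketch)
fn isValidMint(first_input_id: BoxId, new_token_id: TokenId) bool {
    return std.mem.eql(u8, &first_input_id.digest, &new_token_id.digest);
}
```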
Box Serialization
/// Serialize box with optional token ID indexing
pub fn serializeBoxWithIndexedDigests(
box_value: BoxValue,
ergo_tree_bytes: []const u8,
tokens: ?BoxTokens,
additional_registers: *const NonMandatoryRegisters,
creation_height: u32,
token_ids_in_tx: ?*const IndexSet(TokenId),
writer: anytype,
) !void {
// Value (VLQ-encoded)
try box_value.serialize(writer);
// ErgoTree bytes
try writer.writeAll(ergo_tree_bytes);
// Creation height (VLQ-encoded)
try writeVLQ(writer, creation_height);
// Tokens
const token_slice = if (tokens) |t| t.items() else &[_]Token{};
try writer.writeByte(@intCast(token_slice.len));
for (token_slice) |token| {
if (token_ids_in_tx) |index_set| {
// Write index into transaction's token list
const idx = index_set.getIndex(token.token_id) orelse
return error.TokenNotInIndex;
try writeVLQ(writer, @intCast(idx));
} else {
// Write full 32-byte token ID
try writer.writeAll(&token.token_id.digest);
}
try writeVLQ(writer, token.amount.value);
}
// Additional registers
try additional_registers.serialize(writer);
}
/// Full ErgoBox serialization (adds tx reference)
pub fn serializeErgoBox(box: *const ErgoBox, writer: anytype) !void {
const ergo_tree_bytes = try box.ergo_tree.serialize();
try serializeBoxWithIndexedDigests(
box.value,
ergo_tree_bytes,
box.tokens,
&box.additional_registers,
box.creation_height,
null,
writer,
);
// Transaction reference
try writer.writeAll(&box.transaction_id.digest);
try writer.writeInt(u16, box.index, .big);
}
Size Limits
Box Constraints
─────────────────────────────────────────────────────
Limit Value Notes
─────────────────────────────────────────────────────
Max box size 4 KB Total serialized bytes
Max tokens per box 255 Protocol limit (u8)
(practical limit) ~122 Due to 4KB size limit
Max registers 10 R0-R9
Max script size 4 KB ErgoTree in R1 (part of box)
─────────────────────────────────────────────────────
const SigmaConstants = struct {
pub const MAX_BOX_SIZE: usize = 4 * 1024;
/// Protocol allows 255 (u8), but ~122 fit within MAX_BOX_SIZE
pub const MAX_TOKENS_PROTOCOL: usize = 255;
pub const MAX_TOKENS_PRACTICAL: usize = 122;
pub const MAX_REGISTERS: usize = 10;
};
Box Interface Methods
Methods available on the Box type:
const BoxMethods = struct {
/// Box.value: Long - monetary value in NanoErgs
pub fn value(box: *const ErgoBox) i64 {
return box.value.as_i64();
}
/// Box.propositionBytes: Coll[Byte] - serialized script
pub fn propositionBytes(box: *const ErgoBox) ![]const u8 {
return try box.ergo_tree.serialize();
}
/// Box.bytes: Coll[Byte] - full serialized box
pub fn bytes(box: *const ErgoBox) ![]const u8 {
return try box.serialize();
}
/// Box.bytesWithoutRef: Coll[Byte] - without tx reference
pub fn bytesWithoutRef(box: *const ErgoBox) ![]const u8 {
const candidate = ErgoBoxCandidate{
.value = box.value,
.ergo_tree = box.ergo_tree,
.tokens = box.tokens,
.additional_registers = box.additional_registers,
.creation_height = box.creation_height,
};
return try candidate.serialize();
}
/// Box.id: Coll[Byte] - 32-byte Blake2b256 hash
pub fn id(box: *const ErgoBox) []const u8 {
return &box.box_id.digest;
}
/// Box.creationInfo: (Int, Coll[Byte])
pub fn creationInfo(box: *const ErgoBox) struct { i32, [34]u8 } {
return box.creationInfo();
}
/// Box.tokens: Coll[(Coll[Byte], Long)]
pub fn tokens(box: *const ErgoBox) []const Token {
return if (box.tokens) |t| t.items() else &.{};
}
/// Box.getReg[T](i: Int): Option[T]
pub fn getReg(box: *const ErgoBox, comptime T: type, index: i32) !?T {
const id = try RegisterId.fromByte(@intCast(index));
const constant = try box.getRegister(id) orelse return null;
return constant.extractAs(T);
}
};
Type-Safe Register Access
Three outcomes when accessing registers:
Register Access Outcomes
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────┐
│ box.R4[Int] │
└─────────────────────────────────────────────────┘
│
┌─────────────┼─────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────┐ ┌────────────────┐
│ R4 not set │ │ R4 = Int │ │ R4 = Long │
│ │ │ │ │ (wrong type) │
└──────┬───────┘ └────┬─────┘ └───────┬────────┘
│ │ │
▼ ▼ ▼
None Some(value) ERROR!
InvalidType
/// Type-safe register access with explicit error handling
pub fn extractRegisterAs(
box: *const ErgoBox,
register_id: i8,
expected_type: SType,
) !?Value {
const id = try RegisterId.fromByte(@intCast(register_id));
const constant_opt = try box.getRegister(id);
if (constant_opt) |constant| {
if (!constant.tpe.equals(expected_type)) {
return error.InvalidType;
}
return constant.value;
}
return null;
}
Summary
- Boxes are immutable UTXO state containers with 10 registers
- R0-R3 are mandatory (value, script, tokens, creation info)
- R4-R9 are application-defined, must be densely packed
- Box ID is Blake2b256 hash of serialized content including tx reference
- Tokens stored in R2, max 255 per box (protocol), ~122 practical; token ID MUST equal first input's box ID
- Type-safe access with three outcomes: None, Some(value), or InvalidType
- 4KB limit on total box size
Next: Chapter 23: Interpreter Wrappers
Scala: ErgoBox.scala:50-59
Rust: ergo_box.rs:38-80
Scala: ErgoBoxCandidate.scala:36-41
Rust: ergo_box.rs:225-248
Scala: ErgoBox.scala:154-168
Rust: id.rs:78-90
Scala: ErgoBox.scala (additionalRegisters)
Rust: register.rs:27-91
Scala: ErgoBox.scala:72-73
Rust: ergo_box.rs:149-153
Scala: CBox.scala:77-94
Rust: ergo_box.rs:156-168
Scala: methods.scala:1263 (SBoxMethods)
Rust: extract_reg_as.rs:18-57
Scala: ErgoBox.scala:119-130
Rust: ergo_box.rs:36-37 (BoxTokens)
Scala: SigmaDsl.scala:414-536
Rust: ergo_box.rs:120-198
Scala: CBox.scala:20-74
Rust: extract_reg_as.rs:15-47
Chapter 23: Interpreter Wrappers
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 14 for verifier implementation
- Chapter 15 for prover implementation
- Chapter 22 for box structure and registers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the interpreter hierarchy and how verifier/prover are combined
- Describe storage rent rules for expired boxes
- Use the Wallet API for transaction signing
- Implement proof verification with context extensions
Interpreter Architecture
The interpreter provides a layered architecture for script evaluation and proving:
Interpreter Hierarchy
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ Verifier │
│ verify(tree, ctx, proof, message) -> bool │
│ Evaluates tree, then verifies sigma protocol proof │
└────────────────────────┬────────────────────────────┘
│ uses
▼
┌─────────────────────────────────────────────────────┐
│ Prover │
│ prove(tree, ctx, message, hints) -> ProverResult │
│ Reduces to SigmaBoolean, generates proof │
├─────────────────────────────────────────────────────┤
│ secrets: []PrivateInput │
│ prove() generates commitment, response │
└────────────────────────┬────────────────────────────┘
│ uses
▼
┌─────────────────────────────────────────────────────┐
│ reduce_to_crypto │
│ Evaluates ErgoTree to SigmaBoolean │
│ Returns: { sigma_prop, cost, diag } │
└─────────────────────────────────────────────────────┘
Reduction to Crypto
The core evaluation function reduces an ErgoTree to a cryptographic proposition:
/// Result of expression reduction
const ReductionResult = struct {
/// SigmaBoolean representing verifiable statement
sigma_prop: SigmaBoolean,
/// Estimated execution cost
cost: u64,
/// Diagnostic info (env state, pretty-printed expr)
diag: ReductionDiagnosticInfo,
};
/// Evaluate ErgoTree to SigmaBoolean
pub fn reduceToCrypto(tree: *const ErgoTree, ctx: *const Context) !ReductionResult {
const expr = try tree.root();
var env = Env.empty();
const value = try expr.eval(&env, ctx);
const sigma_prop = switch (value) {
.boolean => |b| SigmaBoolean.trivial(b),
.sigma_prop => |sp| sp.value(),
else => return error.NotSigmaProp,
};
return ReductionResult{
.sigma_prop = sigma_prop,
.cost = ctx.cost_accum.total(),
.diag = .{
.env = env.toStatic(),
.pretty_printed_expr = null,
},
};
}
Verifier Trait
Verification executes the script and validates the proof:
const Verifier = struct {
/// Verify proof against ErgoTree in context
pub fn verify(
self: *const Verifier,
tree: *const ErgoTree,
ctx: *const Context,
proof: ProofBytes,
message: []const u8,
) !VerificationResult {
// Step 1-2: Reduce to SigmaBoolean
const reduction = try reduceToCrypto(tree, ctx);
// Step 3: Verify proof
const result = switch (reduction.sigma_prop) {
.trivial_prop => |b| b,
else => |sb| blk: {
if (proof.isEmpty()) break :blk false;
// Parse signature and compute challenges
const unchecked_tree = try parseSigComputeChallenges(
sb,
proof.bytes(),
);
// Verify commitments match
break :blk try checkCommitments(unchecked_tree, message);
},
};
return VerificationResult{
.result = result,
.cost = reduction.cost,
.diag = reduction.diag,
};
}
};
const VerificationResult = struct {
/// True if proof validates
result: bool,
/// Execution cost
cost: u64,
/// Diagnostic information
diag: ReductionDiagnosticInfo,
};
Prover Trait
The prover generates proofs for sigma propositions:
const Prover = struct {
/// Private inputs (secrets)
secrets: []const PrivateInput,
/// Generate proof for ErgoTree
pub fn prove(
self: *const Prover,
tree: *const ErgoTree,
ctx: *const Context,
message: []const u8,
hints: ?*const HintsBag,
) !ProverResult {
// Reduce to crypto
const reduction = try reduceToCrypto(tree, ctx);
return switch (reduction.sigma_prop) {
.trivial_prop => |b| if (b)
ProverResult.empty()
else
error.ReducedToFalse,
else => |sb| blk: {
// Generate proof using sigma protocol
const proof = try self.generateProof(sb, message, hints);
break :blk proof;
},
};
}
/// Add secret to prover
pub fn appendSecret(self: *Prover, secret: PrivateInput) void {
self.secrets = append(self.secrets, secret);
}
/// Get public images of all secrets
pub fn publicImages(self: *const Prover) []SigmaBoolean {
var result: []SigmaBoolean = &.{};
for (self.secrets) |secret| {
result = append(result, secret.publicImage());
}
return result;
}
};
ProverResult
Proof output with context extension:
const ProverResult = struct {
/// Serialized proof bytes
proof: ProofBytes,
/// User-defined context variables
extension: ContextExtension,
pub fn empty() ProverResult {
return .{
.proof = ProofBytes.empty(),
.extension = ContextExtension.empty(),
};
}
};
/// Proof bytes (empty for trivial proofs)
const ProofBytes = union(enum) {
empty: void,
some: []const u8,
pub fn isEmpty(self: ProofBytes) bool {
return self == .empty;
}
pub fn bytes(self: ProofBytes) []const u8 {
return switch (self) {
.empty => &.{},
.some => |b| b,
};
}
};
Wallet
The Wallet wraps a prover for transaction signing:
const Wallet = struct {
/// Underlying prover (held by value; a sketch that returned a pointer
/// to a temporary Prover would dangle)
prover: Prover,
/// Create from mnemonic phrase
pub fn fromMnemonic(
phrase: []const u8,
password: []const u8,
) !Wallet {
const seed = Mnemonic.toSeed(phrase, password);
const ext_sk = try ExtSecretKey.deriveMaster(seed);
return Wallet.fromSecrets(&.{ext_sk.secretKey()});
}
/// Create from secret keys
pub fn fromSecrets(secrets: []const SecretKey) Wallet {
var private_inputs: []PrivateInput = &.{};
for (secrets) |sk| {
private_inputs = append(private_inputs, PrivateInput.from(sk));
}
return .{
.prover = Prover{ .secrets = private_inputs },
};
}
/// Add secret to wallet
pub fn addSecret(self: *Wallet, secret: SecretKey) void {
self.prover.appendSecret(PrivateInput.from(secret));
}
/// Sign a transaction
pub fn signTransaction(
self: *const Wallet,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signTransactionImpl(
&self.prover,
tx_context,
state_context,
tx_hints,
);
}
/// Sign a reduced transaction
pub fn signReducedTransaction(
self: *const Wallet,
reduced_tx: *const ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signReducedTransactionImpl(
&self.prover,
reduced_tx,
tx_hints,
);
}
};
Transaction Signing
Sign all inputs, accumulating costs:
/// Sign transaction, generating proofs for all inputs
pub fn signTransaction(
prover: *const Prover,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const tx = tx_context.spending_tx;
const message = try tx.bytesToSign();
// Build context for first input
var ctx = try makeContext(state_context, tx_context, 0);
// Sign each input
var inputs: []Input = &.{};
for (tx.inputs(), 0..) |unsigned_input, idx| {
if (idx > 0) {
try updateContext(&ctx, tx_context, idx);
}
// Get hints for this input
const hints = if (tx_hints) |h| h.allHintsForInput(idx) else null;
// Generate proof
const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
return error.InputBoxNotFound;
const prover_result = try prover.prove(
&input_box.ergo_tree,
&ctx,
message,
hints,
);
inputs = append(inputs, Input{
.box_id = unsigned_input.box_id,
.spending_proof = prover_result,
});
}
return Transaction{
.inputs = inputs,
.data_inputs = tx.data_inputs,
.output_candidates = tx.output_candidates,
};
}
/// Create evaluation context for input
pub fn makeContext(
state_ctx: *const ErgoStateContext,
tx_ctx: *const TransactionContext,
self_index: usize,
) !Context {
const self_box = tx_ctx.getInputBox(
tx_ctx.spending_tx.inputs()[self_index].box_id,
) orelse return error.InputBoxNotFound;
return Context{
.height = state_ctx.pre_header.height,
.self_box = self_box,
.outputs = tx_ctx.spending_tx.outputs(),
.inputs = tx_ctx.inputBoxes(),
.data_inputs = tx_ctx.dataBoxes(),
.pre_header = state_ctx.pre_header,
.headers = state_ctx.headers,
.extension = tx_ctx.spending_tx.contextExtension(self_index),
};
}
Transaction Hints Bag
Hints for multi-party signing protocols:
const TransactionHintsBag = struct {
/// Secret hints by input index
secret_hints: std.AutoHashMap(usize, HintsBag),
/// Public hints (commitments) by input index
public_hints: std.AutoHashMap(usize, HintsBag),
pub fn empty() TransactionHintsBag {
return .{
.secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
};
}
/// Replace all hints for an input
pub fn replaceHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
var public_hints: []Hint = &.{};
var secret_hints: []Hint = &.{};
for (hints.hints) |hint| {
switch (hint) {
.commitment_hint => public_hints = append(public_hints, hint),
.secret_proven => secret_hints = append(secret_hints, hint),
}
}
self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
}
/// Add hints for an input (appending to existing)
pub fn addHintsForInput(self: *TransactionHintsBag, index: usize, hints: HintsBag) void {
// Get existing or empty
var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
var existing_public = self.public_hints.get(index) orelse HintsBag.empty();
for (hints.hints) |hint| {
switch (hint) {
.commitment_hint => existing_public.hints = append(existing_public.hints, hint),
.secret_proven => existing_secret.hints = append(existing_secret.hints, hint),
}
}
self.secret_hints.put(index, existing_secret) catch {};
self.public_hints.put(index, existing_public) catch {};
}
/// Get all hints for input
pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
var hints: []Hint = &.{};
if (self.secret_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
if (self.public_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
return HintsBag{ .hints = hints };
}
};
Commitment Generation
Generate first-round commitments for distributed signing:
/// Generate commitments for transaction inputs
pub fn generateCommitments(
wallet: *const Wallet,
tx_context: *const TransactionContext,
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
// Get public keys from wallet secrets
var public_keys: []SigmaBoolean = &.{};
for (wallet.prover.secrets) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
var hints_bag = TransactionHintsBag.empty();
for (tx_context.spending_tx.inputs(), 0..) |_, idx| {
var ctx = try makeContext(state_context, tx_context, idx);
const input_box = tx_context.inputBoxes()[idx];
const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);
// Generate commitments for this sigma proposition
const input_hints = generateCommitmentsFor(
&reduction.sigma_prop,
public_keys,
);
hints_bag.addHintsForInput(idx, input_hints);
}
return hints_bag;
}
Storage Rent (Ergo-Specific)
Boxes expire after roughly 4 years and can then be spent by anyone:
Storage Rent Rules
─────────────────────────────────────────────────────
Period: 1,051,200 blocks ≈ 4 years (at 2 min/block)
Expired Box Spending:
┌─────────────────────────────────────────────────────┐
│ IF: │
│ current_height - box.creation_height >= 1,051,200 │
│ AND proof.isEmpty() │
│ AND extension.contains(STORAGE_INDEX_VAR) │
│ THEN: │
│ Check recreation rules instead of script │
└─────────────────────────────────────────────────────┘
Recreation Rules:
┌─────────────────────────────────────────────────────┐
│ output.creation_height == current_height │
│ output.value >= box.value - storage_fee │
│ output.R1 == box.R1 (script preserved) │
│ output.R2 == box.R2 (tokens preserved) │
│ output.R4-R9 == box.R4-R9 (registers preserved) │
│ │
│ storage_fee = storage_fee_factor * box.bytes.len │
└─────────────────────────────────────────────────────┘
const StorageConstants = struct {
/// Storage period in blocks (~4 years at 2 min/block)
pub const STORAGE_PERIOD: u32 = 1_051_200;
/// Context extension variable ID for storage index
pub const STORAGE_INDEX_VAR_ID: u8 = 127;
/// Fixed cost for storage contract evaluation
pub const STORAGE_CONTRACT_COST: u64 = 50;
};
/// Check if expired box spending is valid
pub fn checkExpiredBox(
box: *const ErgoBox,
output: *const ErgoBoxCandidate,
current_height: u32,
storage_fee_factor: u64,
) bool {
// Calculate storage fee
const storage_fee = storage_fee_factor * box.serializedSize();
// If box value <= fee, it's "dust" - always allowed
if (box.value.as_i64() - @as(i64, @intCast(storage_fee)) <= 0) {
return true;
}
// Check recreation rules
const correct_height = output.creation_height == current_height;
const correct_value = output.value.as_i64() >= box.value.as_i64() - @as(i64, @intCast(storage_fee));
const correct_registers = checkRegistersPreserved(box, output);
return correct_height and correct_value and correct_registers;
}
fn checkRegistersPreserved(box: *const ErgoBox, output: *const ErgoBoxCandidate) bool {
// R0 (value) and R3 (reference) can change
// R1 (script), R2 (tokens), R4-R9 must be preserved
return eql(box.ergo_tree, output.ergo_tree) and
eql(box.tokens, output.tokens) and
eql(box.additional_registers, output.additional_registers);
}
Signing Errors
const TxSigningError = error{
/// Transaction context invalid
TransactionContextError,
/// Prover failed on input
ProverError,
/// Serialization failed
SerializationError,
/// Signature parsing failed
SigParsingError,
};
const ProverError = error{
/// ErgoTree parsing failed
ErgoTreeError,
/// Evaluation failed
EvalError,
/// Script reduced to false
ReducedToFalse,
/// Missing witness for proof
TreeRootIsNotReal,
/// Secret not found for leaf
SecretNotFound,
/// Simulated leaf needs challenge
SimulatedLeafWithoutChallenge,
};
Cost Tracking
Transaction costs are accumulated across inputs:
const TxCostComponents = struct {
/// Interpreter initialization (once per tx)
pub const INTERPRETER_INIT_COST: u64 = 10_000;
/// Calculate total transaction cost
pub fn calculateInitialCost(
params: *const BlockchainParameters,
inputs_count: usize,
data_inputs_count: usize,
outputs_count: usize,
token_access_cost: u64,
) u64 {
return INTERPRETER_INIT_COST +
inputs_count * params.input_cost +
data_inputs_count * params.data_input_cost +
outputs_count * params.output_cost +
token_access_cost;
}
};
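As a worked example with illustrative parameter values (these are NOT the actual network parameters; consult the node sources for real values), the formula above can be exercised directly:

```zig
test "initial cost accumulation (illustrative parameters)" {
    const params = BlockchainParameters{
        .input_cost = 2_000,
        .data_input_cost = 100,
        .output_cost = 100,
        .max_block_cost = 1_000_000,
    };
    // 2 inputs, 1 data input, 3 outputs, no token accesses:
    // 10_000 + 2*2_000 + 1*100 + 3*100 = 14_400
    const cost = TxCostComponents.calculateInitialCost(&params, 2, 1, 3, 0);
    try std.testing.expectEqual(@as(u64, 14_400), cost);
}
```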
Deterministic Signing
For platforms without a secure random source:
/// Generate deterministic nonce from secret and message
/// Used when secure random is unavailable
pub fn generateDeterministicCommitments(
wallet: *const Wallet,
reduced_tx: *const ReducedTransaction,
aux_rand: []const u8,
) !TransactionHintsBag {
var hints_bag = TransactionHintsBag.empty();
const message = try reduced_tx.unsigned_tx.bytesToSign();
for (reduced_tx.reduced_inputs(), 0..) |input, idx| {
// Deterministic nonce: H(secret || message || aux_rand)
if (generateDeterministicCommitmentsFor(
wallet.prover,
&input.sigma_prop,
message,
aux_rand,
)) |bag| {
hints_bag.addHintsForInput(idx, bag);
}
}
return hints_bag;
}
Summary
- Verifier evaluates script, verifies sigma protocol proof
- Prover reduces to SigmaBoolean, generates proof using secrets
- Wallet wraps prover with transaction-level signing API
- TransactionHintsBag coordinates multi-party signing
- Storage rent allows expired boxes (~4 years) to be spent by anyone
- Deterministic signing available for platforms without secure random
- Cost accumulates across inputs with initial overhead
Next: Chapter 24: Transaction Validation
Scala: ErgoLikeInterpreter.scala
Rust: eval.rs:1-50
Scala: Interpreter.scala (reduce)
Rust: eval.rs:129-160
Scala: Interpreter.scala (verify)
Rust: verifier.rs:55-88
Scala: ProverInterpreter.scala
Rust: prover.rs:57-96
Scala: ProverResult.scala
Rust: prover_result.rs:14-50
Scala: ErgoProvingInterpreter.scala
Rust: wallet.rs:52-94
Rust: signing.rs:143-180
Scala: HintsBag.scala
Rust: wallet.rs:259-347
Rust: wallet.rs:124-158
Scala: ErgoInterpreter.scala:42-55
Scala: ErgoInterpreter.scala:93-96
Rust: wallet.rs:182-209
Rust: deterministic.rs
Chapter 24: Transaction Validation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 14 for script verification
- Chapter 22 for box structure and tokens
- Chapter 23 for interpreter wrappers
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the two-phase validation pipeline (stateless then stateful)
- Implement stateless validation rules (input/output counts, no duplicates)
- Perform stateful validation with cost accumulation
- Verify ERG and token preservation across transaction inputs and outputs
Validation Pipeline
Transaction validation occurs in two phases:
Transaction Validation Pipeline
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ STATELESS VALIDATION │
│ (No blockchain state required) │
├─────────────────────────────────────────────────────┤
│ • Has inputs? (at least 1) │
│ • Has outputs? (at least 1) │
│ • Count limits (≤ 32,767 each) │
│ • No negative values (outputs ≥ 0) │
│ • Output sum valid (no overflow) │
│ • Unique inputs (no double-spend) │
└──────────────────────────┬──────────────────────────┘
│ Pass
▼
┌─────────────────────────────────────────────────────┐
│ STATEFUL VALIDATION │
│ (Requires UTXO state and blockchain context) │
├─────────────────────────────────────────────────────┤
│ 1. Calculate initial cost │
│ 2. Verify outputs (dust, height, size) │
│ 3. Check ERG preservation │
│ 4. Verify asset preservation │
│ 5. Verify input scripts (accumulate cost) │
│ 6. Check re-emission rules (EIP-27) │
└─────────────────────────────────────────────────────┘
Transaction Structure
const Transaction = struct {
/// Transaction ID (Blake2b256 of serialized tx without proofs)
tx_id: TxId,
/// Input boxes to spend (with proofs)
inputs: TxIoVec(Input),
/// Read-only input references (no proofs)
data_inputs: ?TxIoVec(DataInput),
/// Output box candidates
output_candidates: TxIoVec(ErgoBoxCandidate),
/// Materialized outputs (with tx_id and index)
outputs: TxIoVec(ErgoBox),
pub const MAX_OUTPUTS_COUNT: usize = std.math.maxInt(u16);
pub fn init(
inputs: TxIoVec(Input),
data_inputs: ?TxIoVec(DataInput),
output_candidates: TxIoVec(ErgoBoxCandidate),
) !Transaction {
// First pass: compute outputs with zero tx_id
const zero_outputs = try output_candidates.mapIndexed(
struct {
fn f(idx: usize, bc: *const ErgoBoxCandidate) !ErgoBox {
return ErgoBox.fromBoxCandidate(bc, TxId.zero(), @intCast(idx));
}
}.f,
);
var tx = Transaction{
.tx_id = TxId.zero(),
.inputs = inputs,
.data_inputs = data_inputs,
.output_candidates = output_candidates,
.outputs = zero_outputs,
};
// Compute actual tx_id
tx.tx_id = try tx.calcTxId();
// Update outputs with the real tx_id. A plain loop is used here
// because Zig nested functions cannot capture the runtime value tx.tx_id.
for (tx.outputs.items(), 0..) |*out, idx| {
out.* = ErgoBox.fromBoxCandidate(&output_candidates.items()[idx], tx.tx_id, @intCast(idx));
}
return tx;
}
};
Validation Error Types
const TxValidationError = error{
/// Output ERG sum overflow
OutputSumOverflow,
/// Input ERG sum overflow
InputSumOverflow,
/// Same box spent twice
DoubleSpend,
/// ERG not preserved (inputs != outputs)
ErgPreservationError,
/// Token amounts not preserved
TokenPreservationError,
/// Output below dust threshold
DustOutput,
/// Creation height > current height
InvalidHeightError,
/// Creation height < max input height (v3+)
MonotonicHeightError,
/// Negative creation height (v1+)
NegativeHeight,
/// Box exceeds 4KB limit
BoxSizeExceeded,
/// Script exceeds size limit
ScriptSizeExceeded,
/// Script verification failed
ReducedToFalse,
/// Verifier error
VerifierError,
/// Accumulated cost exceeds the block limit
CostExceeded,
};
Stateless Validation
Checks that don't require blockchain state:
/// Validate transaction structure without blockchain state
pub fn validateStateless(tx: *const Transaction) TxValidationError!void {
// BoundedVec ensures 1 ≤ count ≤ 32767, so no explicit checks needed
// Check output sum doesn't overflow
var output_sum: i64 = 0;
for (tx.outputs.items()) |out| {
output_sum = std.math.add(i64, output_sum, out.value.as_i64()) catch
return error.OutputSumOverflow;
}
// Check no double-spend (unique inputs).
// A quadratic scan keeps this sketch allocation-free; inputs are
// bounded at 32,767, and real implementations use a hash set.
const inputs = tx.inputs.items();
for (inputs, 0..) |input, i| {
for (inputs[0..i]) |prev| {
if (std.meta.eql(input.box_id, prev.box_id)) {
return error.DoubleSpend;
}
}
}
}
Stateless Rules Table
Stateless Validation Rules
─────────────────────────────────────────────────────
Rule Check Limit
─────────────────────────────────────────────────────
txNoInputs inputs.len >= 1 min 1
txNoOutputs outputs.len >= 1 min 1
txManyInputs inputs.len <= MAX 32,767
txManyDataInputs data_inputs.len <= MAX 32,767
txManyOutputs outputs.len <= MAX 32,767
txNegativeOutput all outputs >= 0 -
txOutputSum sum(outputs) no overflow -
txInputsUnique no duplicate box_ids -
─────────────────────────────────────────────────────
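The table above can be cross-checked with a small, illustrative Python sketch. This is not the real implementation: plain dicts stand in for the box and input types, and the 32,767 bound is simply i16 max.

```python
I64_MAX = 2**63 - 1  # nanoERG values are signed 64-bit

def validate_stateless(inputs, outputs):
    """Structural checks mirroring the stateless rules table (sketch)."""
    if not (1 <= len(inputs) <= 32767):
        raise ValueError("input count out of bounds")
    if not (1 <= len(outputs) <= 32767):
        raise ValueError("output count out of bounds")
    total = 0
    for out in outputs:
        if out["value"] < 0:
            raise ValueError("negative output value")
        total += out["value"]
        if total > I64_MAX:
            raise ValueError("output sum overflow")
    box_ids = [inp["box_id"] for inp in inputs]
    if len(set(box_ids)) != len(box_ids):
        raise ValueError("duplicate input box id")

# A minimal well-formed transaction passes all checks:
validate_stateless([{"box_id": "a"}], [{"value": 1000}])
```

Note that the checks are purely structural: no UTXO lookup is needed, so they can run before any state access.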
Stateful Validation
Requires UTXO state and blockchain context [5][6]:
/// Validate transaction against blockchain state
pub fn validateStateful(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
state_context: *const ErgoStateContext,
accumulated_cost: u64,
verifier: *const Verifier,
) TxValidationError!u64 {
const params = state_context.current_parameters;
const max_cost = params.max_block_cost;
// 1. Calculate initial cost
const initial_cost = calculateInitialCost(
tx,
boxes_to_spend.len,
data_boxes.len,
params,
);
var current_cost = accumulated_cost + initial_cost;
if (current_cost > max_cost) {
return error.CostExceeded;
}
// 2. Verify outputs
const max_input_height = maxCreationHeight(boxes_to_spend);
for (tx.outputs.items()) |out| {
try verifyOutput(out, state_context, max_input_height);
}
// 3. Check ERG preservation (inputs must equal outputs exactly)
const input_sum = try sumValues(boxes_to_spend);
const output_sum = try sumValues(tx.outputs.items());
if (input_sum != output_sum) {
return error.ErgPreservationError;
}
// 4. Verify asset preservation
current_cost = try verifyAssets(
tx,
boxes_to_spend,
state_context,
current_cost,
);
// 5. Verify each input script
for (boxes_to_spend, 0..) |box, idx| {
current_cost = try verifyInput(
tx,
boxes_to_spend,
data_boxes,
box,
@intCast(idx),
state_context,
current_cost,
verifier,
);
}
return current_cost;
}
Initial Cost Calculation
Transaction cost starts with fixed overhead [7][8]:
const CostConstants = struct {
pub const INTERPRETER_INIT_COST: u64 = 10_000;
};
pub fn calculateInitialCost(
tx: *const Transaction,
inputs_count: usize,
data_inputs_count: usize,
params: *const BlockchainParameters,
) u64 {
return CostConstants.INTERPRETER_INIT_COST +
inputs_count * params.input_cost +
data_inputs_count * params.data_input_cost +
tx.outputs.len() * params.output_cost;
}
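With the mainnet defaults listed in the next chapter (inputCost = 2,000, dataInputCost = 100, outputCost = 100, init cost = 10,000), the fixed overhead works out as simple arithmetic. An illustrative Python version:

```python
INTERPRETER_INIT_COST = 10_000
INPUT_COST = 2_000
DATA_INPUT_COST = 100
OUTPUT_COST = 100

def initial_cost(n_inputs, n_data_inputs, n_outputs):
    # Mirrors calculateInitialCost: fixed init cost plus per-item costs
    return (INTERPRETER_INIT_COST
            + n_inputs * INPUT_COST
            + n_data_inputs * DATA_INPUT_COST
            + n_outputs * OUTPUT_COST)

print(initial_cost(2, 1, 3))  # 10_000 + 4_000 + 100 + 300 = 14_400
```

This cost is charged before any script runs, so a transaction can fail on cost alone without touching the interpreter.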
Output Verification
Each output must pass structural checks [9][10]:
pub fn verifyOutput(
out: *const ErgoBox,
state_context: *const ErgoStateContext,
max_input_height: u32,
) TxValidationError!void {
const params = state_context.current_parameters;
const block_version = state_context.block_version;
const current_height = state_context.current_height;
// Dust check: value >= minimum for box size
const min_value = BoxUtils.minimalErgoAmount(out, params);
if (out.value.as_u64() < min_value) {
return error.DustOutput;
}
// Future check: creation height <= current height
if (out.creation_height > current_height) {
return error.InvalidHeightError;
}
// Non-negative height (after v1). creation_height is signed in the
// serialized form; with an unsigned in-memory type, this check is
// enforced at deserialization instead.
if (block_version > 1 and out.creation_height < 0) {
return error.NegativeHeight;
}
// Monotonic height (after v3): output height >= max input height
if (block_version >= 3 and out.creation_height < max_input_height) {
return error.MonotonicHeightError;
}
// Size limits
if (out.serializedSize() > ErgoBox.MAX_BOX_SIZE) {
return error.BoxSizeExceeded;
}
if (out.propositionBytes().len > ErgoBox.MAX_SCRIPT_SIZE) {
return error.ScriptSizeExceeded;
}
}
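The dust check scales with serialized box size. A hedged sketch of the relationship, assuming the minimum is simply size × minValuePerByte (360 nanoERG per byte is the default from the parameter table in the next chapter; the real computation lives in BoxUtils.minimalErgoAmount):

```python
MIN_VALUE_PER_BYTE = 360  # default minValuePerByte parameter, nanoERG

def minimal_erg_amount(serialized_size):
    # Minimum value a box must carry to avoid the DustOutput error (sketch)
    return serialized_size * MIN_VALUE_PER_BYTE

def is_dust(value, serialized_size):
    return value < minimal_erg_amount(serialized_size)

# A ~105-byte box needs at least 105 * 360 = 37,800 nanoERG
assert minimal_erg_amount(105) == 37_800
assert is_dust(10_000, 105)
assert not is_dust(100_000, 105)
```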
Asset Verification
pub fn verifyAssets(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
state_context: *const ErgoStateContext,
current_cost: u64,
) TxValidationError!u64 {
// Extract input assets
var in_assets = std.AutoHashMap(TokenId, u64).init(allocator);
defer in_assets.deinit();
for (boxes_to_spend) |box| {
if (box.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = in_assets.getOrPut(token.token_id) catch unreachable; // sketch: assume allocation succeeds
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// Extract output assets
var out_assets = std.AutoHashMap(TokenId, u64).init(allocator);
defer out_assets.deinit();
for (tx.outputs.items()) |out| {
if (out.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = out_assets.getOrPut(token.token_id) catch unreachable; // sketch: assume allocation succeeds
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// First input box ID can mint new tokens
const new_token_id = TokenId{ .digest = tx.inputs.items()[0].box_id.digest };
// Verify each output token
var iter = out_assets.iterator();
while (iter.next()) |entry| {
const out_id = entry.key_ptr.*;
const out_amount = entry.value_ptr.*;
const in_amount = in_assets.get(out_id) orelse 0;
// Output amount <= input amount OR it's a new token
if (out_amount > in_amount) {
if (!std.mem.eql(u8, &out_id.digest, &new_token_id.digest) or out_amount == 0) {
return error.TokenPreservationError;
}
}
}
// Add token access cost
const token_access_cost = calculateTokenAccessCost(
in_assets.count(),
out_assets.count(),
state_context.current_parameters.token_access_cost,
);
return current_cost + token_access_cost;
}
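The token-balancing rule reduces to: for every output token, the output amount must not exceed the input amount unless the token id equals the first input's box id (a mint). An illustrative Python restatement, with dicts keyed by token id standing in for the real maps:

```python
def check_token_preservation(in_assets, out_assets, first_input_box_id):
    """in_assets / out_assets: dict token_id -> total amount (sketch)."""
    for token_id, out_amount in out_assets.items():
        in_amount = in_assets.get(token_id, 0)
        # Exceeding the input amount is only allowed for the mintable id
        if out_amount > in_amount and token_id != first_input_box_id:
            raise ValueError(f"token {token_id!r} not preserved")

# Burning is allowed; minting only under the first input's box id:
check_token_preservation({"t1": 10}, {"t1": 4}, first_input_box_id="b0")  # burn ok
check_token_preservation({}, {"b0": 1_000}, first_input_box_id="b0")      # mint ok
try:
    check_token_preservation({}, {"t2": 5}, first_input_box_id="b0")
except ValueError:
    pass  # minting under any other id is rejected
```

Using the first input's box id as the mintable token id guarantees global uniqueness, since a box can only be spent once.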
Input Script Verification
The most expensive step is verifying each input's script [13][14]:
pub fn verifyInput(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
box: *const ErgoBox,
input_index: u16,
state_context: *const ErgoStateContext,
current_cost: u64,
verifier: *const Verifier,
) TxValidationError!u64 {
const max_cost = state_context.current_parameters.max_block_cost;
const input = tx.inputs.items()[input_index];
const proof = input.spending_proof;
// Check for storage rent spending first
const ctx = try buildContext(
tx,
boxes_to_spend,
data_boxes,
input_index,
state_context,
max_cost - current_cost,
);
if (trySpendStorageRent(&input, box, state_context, &ctx)) |_| {
// Storage rent conditions satisfied, skip script verification
return current_cost + StorageConstants.STORAGE_CONTRACT_COST;
}
// Normal script verification
const result = verifier.verify(
&box.ergo_tree,
&ctx,
proof.proof,
tx.messageToSign(),
) catch return error.VerifierError;
if (!result.result) {
return error.ReducedToFalse;
}
const new_cost = current_cost + result.cost;
if (new_cost > max_cost) {
return error.CostExceeded;
}
return new_cost;
}
Context Construction
Build the evaluation context for input verification [15][16]:
pub fn buildContext(
tx: *const Transaction,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
input_index: u16,
state_context: *const ErgoStateContext,
cost_limit: u64,
) !Context {
return Context{
.height = state_context.pre_header.height,
.self_box = &boxes_to_spend[input_index],
.inputs = boxes_to_spend,
.data_inputs = data_boxes,
.outputs = tx.outputs.items(),
.pre_header = &state_context.pre_header,
.headers = state_context.headers,
.extension = tx.contextExtension(input_index),
.cost_limit = cost_limit,
.tree_version = @intCast(state_context.block_version - 1),
};
}
Storage Rent Spending
Expired boxes can be spent without script verification [17][18]:
const StorageConstants = struct {
/// Blocks before box is eligible (~4 years)
pub const STORAGE_PERIOD: u32 = 1_051_200;
/// Context extension key for output index
pub const STORAGE_EXTENSION_INDEX: u8 = 127;
/// Cost for storage rent verification
pub const STORAGE_CONTRACT_COST: u64 = 50;
};
pub fn trySpendStorageRent(
input: *const Input,
input_box: *const ErgoBox,
state_context: *const ErgoStateContext,
ctx: *const Context,
) ?void {
// Must have empty proof
if (!input.spending_proof.proof.isEmpty()) return null;
return checkStorageRentConditions(input_box, state_context, ctx);
}
pub fn checkStorageRentConditions(
input_box: *const ErgoBox,
state_context: *const ErgoStateContext,
ctx: *const Context,
) ?void {
// Check time elapsed
const age = ctx.pre_header.height - ctx.self_box.creation_height;
if (age < StorageConstants.STORAGE_PERIOD) return null;
// Get output index from context extension
const output_idx_value = ctx.extension.values.get(
StorageConstants.STORAGE_EXTENSION_INDEX,
) orelse return null;
const output_idx = output_idx_value.extractAs(i16) orelse return null;
const output = ctx.outputs[@intCast(output_idx)];
// Calculate storage fee
const storage_fee = input_box.serializedSize() *
state_context.parameters.storage_fee_factor;
// Dust boxes can always be spent
if (ctx.self_box.value.as_u64() <= storage_fee) return {};
// Verify recreation rules
if (output.creation_height != state_context.pre_header.height) return null;
if (output.value.as_u64() < ctx.self_box.value.as_u64() - storage_fee) return null;
// Registers must be preserved (except R0 value and R3 creation info)
for (0..10) |i| {
const reg_id = RegisterId.fromByte(@intCast(i));
if (reg_id == .r0 or reg_id == .r3) continue;
if (!std.meta.eql(
ctx.self_box.getRegister(reg_id),
output.getRegister(reg_id),
)) return null;
}
return {};
}
Cost Accumulation Flow
Cost Accumulation
─────────────────────────────────────────────────────
Block accumulated cost (from previous txs)
│
├── + INTERPRETER_INIT_COST (10,000)
├── + inputs.len × inputCost
├── + data_inputs.len × dataInputCost
├── + outputs.len × outputCost
│
▼
startCost
│
├── Input[0] script → + scriptCost₀
├── Input[1] script → + scriptCost₁
├── ...
├── Input[n] script → + scriptCostₙ
│
├── Token access → + tokenAccessCost
│
▼
finalCost ≤ maxBlockCost
Each input verification receives remaining budget:
ctx.cost_limit = maxBlockCost - current_cost
Validation Rules Summary
Validation Rules Reference
─────────────────────────────────────────────────────
ID Name Phase Description
─────────────────────────────────────────────────────
100 txNoInputs Stateless ≥1 input
101 txNoOutputs Stateless ≥1 output
102 txManyInputs Stateless ≤32,767
103 txManyDataInputs Stateless ≤32,767
104 txManyOutputs Stateless ≤32,767
105 txNegativeOutput Stateless values ≥ 0
106 txOutputSum Stateless no overflow
107 txInputsUnique Stateless no duplicates
─────────────────────────────────────────────────────
120 txScriptValidation Stateful scripts pass
121 bsBlockTransactionsCost Stateful cost in limit
122 txDust Stateful min value
123 txFuture Stateful valid height
124 txErgPreservation Stateful inputs == outputs
125 txAssetsPreservation Stateful tokens balanced
126 txBoxSize Stateful ≤4KB
127 txReemission Stateful EIP-27 rules
─────────────────────────────────────────────────────
Complete Validation Flow
/// Full transaction validation
pub fn validateTransaction(
tx: *const Transaction,
utxo_state: *const UtxoState,
state_context: *const ErgoStateContext,
verifier: *const Verifier,
accumulated_cost: u64,
) !u64 {
// Phase 1: Stateless validation
try validateStateless(tx);
// Note: box resolution below uses a hypothetical append helper on slices;
// a real implementation would use std.ArrayList
// Phase 2: Resolve input boxes
var boxes_to_spend: []ErgoBox = &.{};
for (tx.inputs.items()) |input| {
const box = utxo_state.boxById(input.box_id) orelse
return error.InputBoxNotFound;
boxes_to_spend = append(boxes_to_spend, box);
}
// Phase 3: Resolve data input boxes
var data_boxes: []ErgoBox = &.{};
if (tx.data_inputs) |data_inputs| {
for (data_inputs.items()) |data_input| {
const box = utxo_state.boxById(data_input.box_id) orelse
return error.DataInputBoxNotFound;
data_boxes = append(data_boxes, box);
}
}
// Phase 4: Stateful validation
return validateStateful(
tx,
boxes_to_spend,
data_boxes,
state_context,
accumulated_cost,
verifier,
);
}
Summary
- Two-phase validation: Stateless (structural) then stateful (UTXO-dependent)
- Stateless: Count limits, no negatives, no overflow, unique inputs
- Stateful: Cost tracking, output checks, preservation rules, script verification
- Cost accumulation: Tracks across inputs, bounded by maxBlockCost
- Storage rent: Expired boxes (~4 years) spendable by anyone with recreation
- Asset preservation: ERG exactly preserved (inputs == outputs), tokens can only decrease (or mint new)
Next: Chapter 25: Cost Limits and Parameters
Scala: ErgoTransaction.scala:57-64
Rust: transaction.rs:60-96
Scala: ErgoTransaction.scala:91-115
Rust: transaction.rs:200-300
Scala: ErgoInterpreter.scala:93-96
Rust: signing.rs:143-180
Scala: ErgoContext.scala:12-29
Rust: signing.rs:46-116
Scala: ErgoInterpreter.scala:42-55
Rust: storage_rent.rs:12-78
Chapter 25: Cost Limits and Parameters
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 24 for how cost limits are enforced during validation
- Chapter 13 for JitCost and operation costs
Learning Objectives
By the end of this chapter, you will be able to:
- Explain Ergo's adjustable blockchain parameters and their governance
- Describe the miner voting mechanism for parameter changes
- Work with cost-related parameters and their default values
- Configure validation rules and soft-fork settings
Parameter System
Ergo's blockchain parameters are adjustable through miner voting [1][2]:
Parameter Governance
─────────────────────────────────────────────────────
┌─────────────────────────────────────────────────────┐
│ Parameters │
├─────────────────────────────────────────────────────┤
│ parameters_table: HashMap<Parameter, i32> │
│ proposed_update: ValidationSettingsUpdate │
│ height: u32 │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Parameter Types │
├─────────────────────────────────────────────────────┤
│ Cost: maxBlockCost, inputCost, outputCost... │
│ Size: maxBlockSize, minValuePerByte │
│ Fee: storageFeeFactor │
│ Version: blockVersion │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ Voting Mechanism │
├─────────────────────────────────────────────────────┤
│ Miners include votes in block headers │
│ Votes tallied over epochs (1024 blocks) │
│ Majority (>= 90%) activates change │
│ Each param has min/max bounds and step size │
└─────────────────────────────────────────────────────┘
Parameter Enum
const Parameter = enum(i8) {
/// Storage fee factor (per byte per ~4 year storage period)
storage_fee_factor = 1,
/// Minimum monetary value per byte of box
min_value_per_byte = 2,
/// Maximum block size in bytes
max_block_size = 3,
/// Maximum computational cost per block
max_block_cost = 4,
/// Cost per token access
token_access_cost = 5,
/// Cost per transaction input
input_cost = 6,
/// Cost per data input
data_input_cost = 7,
/// Cost per transaction output
output_cost = 8,
/// Sub-blocks per block (v6+)
subblocks_per_block = 9,
/// Soft-fork vote
soft_fork = 120,
/// Soft-fork votes collected
soft_fork_votes = 121,
/// Soft-fork starting height
soft_fork_start_height = 122,
/// Current block version
block_version = 123,
/// Negative values indicate decrease vote
pub fn decreaseVote(self: Parameter) i8 {
return -@intFromEnum(self);
}
};
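Vote direction is encoded in the sign of the parameter id, as decreaseVote shows. A quick illustrative round-trip of that encoding in Python (parameter ids taken from the enum above):

```python
STORAGE_FEE_FACTOR = 1  # parameter ids from the enum above
MAX_BLOCK_COST = 4

def encode_vote(param_id, increase):
    # Positive id votes to raise the parameter, negative id to lower it
    return param_id if increase else -param_id

def decode_vote(vote):
    # Returns (parameter id, is_increase)
    return abs(vote), vote > 0

assert encode_vote(MAX_BLOCK_COST, increase=False) == -4
assert decode_vote(-4) == (MAX_BLOCK_COST, False)
```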
Parameters Structure
const Parameters = struct {
/// Current block height
height: u32,
/// Parameter ID -> value mapping
parameters_table: std.AutoHashMap(Parameter, i32),
/// Proposed validation settings update
proposed_update: ValidationSettingsUpdate,
/// Get block version
pub fn blockVersion(self: *const Parameters) i32 {
return self.parameters_table.get(.block_version) orelse 1;
}
/// Get max block cost
pub fn maxBlockCost(self: *const Parameters) i32 {
return self.parameters_table.get(.max_block_cost) orelse DefaultParams.MAX_BLOCK_COST;
}
/// Get input cost
pub fn inputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.input_cost) orelse DefaultParams.INPUT_COST;
}
/// Get data input cost
pub fn dataInputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.data_input_cost) orelse DefaultParams.DATA_INPUT_COST;
}
/// Get output cost
pub fn outputCost(self: *const Parameters) i32 {
return self.parameters_table.get(.output_cost) orelse DefaultParams.OUTPUT_COST;
}
/// Get token access cost
pub fn tokenAccessCost(self: *const Parameters) i32 {
return self.parameters_table.get(.token_access_cost) orelse DefaultParams.TOKEN_ACCESS_COST;
}
/// Get storage fee factor
pub fn storageFeeFactor(self: *const Parameters) i32 {
return self.parameters_table.get(.storage_fee_factor) orelse DefaultParams.STORAGE_FEE_FACTOR;
}
/// Get min value per byte
pub fn minValuePerByte(self: *const Parameters) i32 {
return self.parameters_table.get(.min_value_per_byte) orelse DefaultParams.MIN_VALUE_PER_BYTE;
}
/// Get max block size
pub fn maxBlockSize(self: *const Parameters) i32 {
return self.parameters_table.get(.max_block_size) orelse DefaultParams.MAX_BLOCK_SIZE;
}
};
Default Values
const DefaultParams = struct {
/// Cost parameters
pub const MAX_BLOCK_COST: i32 = 1_000_000;
pub const TOKEN_ACCESS_COST: i32 = 100;
pub const INPUT_COST: i32 = 2_000;
pub const DATA_INPUT_COST: i32 = 100;
pub const OUTPUT_COST: i32 = 100;
/// Size parameters
pub const MAX_BLOCK_SIZE: i32 = 512 * 1024; // 512 KB
pub const MAX_BLOCK_SIZE_MAX: i32 = 1024 * 1024; // 1 MB
pub const MAX_BLOCK_SIZE_MIN: i32 = 16 * 1024; // 16 KB
/// Fee parameters
pub const STORAGE_FEE_FACTOR: i32 = 1_250_000; // 0.00125 ERG per byte per ~4 years
pub const STORAGE_FEE_FACTOR_MAX: i32 = 2_500_000;
pub const STORAGE_FEE_FACTOR_MIN: i32 = 0;
pub const STORAGE_FEE_FACTOR_STEP: i32 = 25_000;
/// Dust prevention
pub const MIN_VALUE_PER_BYTE: i32 = 30 * 12; // 360 nanoErgs per byte
pub const MIN_VALUE_PER_BYTE_MAX: i32 = 10_000;
pub const MIN_VALUE_PER_BYTE_MIN: i32 = 0;
pub const MIN_VALUE_PER_BYTE_STEP: i32 = 10;
/// Sub-blocks (v6+)
pub const SUBBLOCKS_PER_BLOCK: i32 = 30;
pub const SUBBLOCKS_PER_BLOCK_MIN: i32 = 2;
pub const SUBBLOCKS_PER_BLOCK_MAX: i32 = 2048;
pub const SUBBLOCKS_PER_BLOCK_STEP: i32 = 1;
/// Interpreter initialization cost
pub const INTERPRETER_INIT_COST: i32 = 10_000;
};
/// Create default parameters
pub fn defaultParameters() Parameters {
var table = std.AutoHashMap(Parameter, i32).init(allocator);
table.put(.storage_fee_factor, DefaultParams.STORAGE_FEE_FACTOR) catch {};
table.put(.min_value_per_byte, DefaultParams.MIN_VALUE_PER_BYTE) catch {};
table.put(.token_access_cost, DefaultParams.TOKEN_ACCESS_COST) catch {};
table.put(.input_cost, DefaultParams.INPUT_COST) catch {};
table.put(.data_input_cost, DefaultParams.DATA_INPUT_COST) catch {};
table.put(.output_cost, DefaultParams.OUTPUT_COST) catch {};
table.put(.max_block_size, DefaultParams.MAX_BLOCK_SIZE) catch {};
table.put(.max_block_cost, DefaultParams.MAX_BLOCK_COST) catch {};
table.put(.block_version, 1) catch {};
return Parameters{
.height = 0,
.parameters_table = table,
.proposed_update = ValidationSettingsUpdate.empty(),
};
}
Parameter Reference
Default Parameter Values
────────────────────────────────────────────────────────────────────
ID Name Default Min Max Step
────────────────────────────────────────────────────────────────────
1 storageFeeFactor 1,250,000 0 2,500,000 25,000
2 minValuePerByte 360 0 10,000 10
3 maxBlockSize 524,288 16,384 1,048,576 1%
4 maxBlockCost 1,000,000 16,384 - 1%
5 tokenAccessCost 100 - - 1%
6 inputCost 2,000 - - 1%
7 dataInputCost 100 - - 1%
8 outputCost 100 - - 1%
9 subblocksPerBlock 30 2 2,048 1
123 blockVersion 1 1 - -
────────────────────────────────────────────────────────────────────
Voting Mechanism
Miners vote for parameter changes in block headers [3]:
const VotingSettings = struct {
/// Blocks per voting epoch
pub const EPOCH_LENGTH: u32 = 1024;
/// Required approval threshold (90%)
pub const APPROVAL_THRESHOLD: f32 = 0.90;
/// Check if vote count meets approval threshold
pub fn changeApproved(self: *const VotingSettings, vote_count: u32) bool {
const threshold = @as(u32, @intFromFloat(@as(f32, @floatFromInt(EPOCH_LENGTH)) * APPROVAL_THRESHOLD));
return vote_count >= threshold;
}
};
/// Generate votes based on targets
pub fn generateVotes(
params: *const Parameters,
own_targets: std.AutoHashMap(Parameter, i32),
epoch_votes: []const struct { param: i8, count: u32 },
vote_for_fork: bool,
) []i8 {
var votes: []i8 = &.{};
for (epoch_votes) |ev| {
const param_id = ev.param;
if (param_id == @intFromEnum(Parameter.soft_fork)) {
if (vote_for_fork) {
votes = append(votes, param_id);
}
} else if (param_id > 0) {
// Vote for increase if current < target
const param: Parameter = @enumFromInt(param_id);
const current = params.parameters_table.get(param) orelse continue;
const target = own_targets.get(param) orelse continue;
if (target > current) {
votes = append(votes, param_id);
}
} else if (param_id < 0) {
// Vote for decrease if current > target
const param: Parameter = @enumFromInt(-param_id);
const current = params.parameters_table.get(param) orelse continue;
const target = own_targets.get(param) orelse continue;
if (target < current) {
votes = append(votes, param_id);
}
}
}
return padVotes(votes);
}
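With a 1,024-block epoch and a 90% threshold, 921 supporting votes are required, since the threshold is the truncated product. An illustrative check:

```python
EPOCH_LENGTH = 1024
APPROVAL_THRESHOLD = 0.90

def change_approved(vote_count):
    # Mirrors VotingSettings.changeApproved: truncated 90% of the epoch
    return vote_count >= int(EPOCH_LENGTH * APPROVAL_THRESHOLD)

assert int(EPOCH_LENGTH * APPROVAL_THRESHOLD) == 921
assert change_approved(921)
assert not change_approved(920)
```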
Parameter Update Logic
Apply votes at epoch boundaries [4]:
/// Update parameters based on epoch votes
pub fn updateParams(
params_table: std.AutoHashMap(Parameter, i32),
epoch_votes: []const struct { param: i8, count: u32 },
settings: *const VotingSettings,
) std.AutoHashMap(Parameter, i32) {
var new_table = params_table.clone() catch unreachable; // sketch: assume allocation succeeds
for (epoch_votes) |ev| {
const param_id = ev.param;
if (param_id >= @intFromEnum(Parameter.soft_fork)) continue;
const param_abs: Parameter = @enumFromInt(if (param_id < 0) -param_id else param_id);
if (settings.changeApproved(ev.count)) {
const current = new_table.get(param_abs) orelse continue;
const max_val = getMaxValue(param_abs);
const min_val = getMinValue(param_abs);
const step = getStep(param_abs, current);
const new_value = if (param_id > 0) blk: {
// Increase: cap at max
break :blk if (current < max_val) current + step else current;
} else blk: {
// Decrease: floor at min
break :blk if (current > min_val) current - step else current;
};
new_table.put(param_abs, new_value) catch {};
}
}
return new_table;
}
fn getMaxValue(param: Parameter) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MAX,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MAX,
.max_block_size => DefaultParams.MAX_BLOCK_SIZE_MAX,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MAX,
else => std.math.maxInt(i32) / 2,
};
}
fn getMinValue(param: Parameter) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_MIN,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_MIN,
.max_block_size => DefaultParams.MAX_BLOCK_SIZE_MIN,
.max_block_cost => 16 * 1024,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_MIN,
else => 0,
};
}
fn getStep(param: Parameter, current: i32) i32 {
return switch (param) {
.storage_fee_factor => DefaultParams.STORAGE_FEE_FACTOR_STEP,
.min_value_per_byte => DefaultParams.MIN_VALUE_PER_BYTE_STEP,
.subblocks_per_block => DefaultParams.SUBBLOCKS_PER_BLOCK_STEP,
else => @max(1, @divTrunc(current, 100)), // Default 1% step
};
}
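Parameters without an explicit step move by roughly 1% of their current value per approved epoch, as getStep's fallback shows. An illustrative sketch of one update round under that rule:

```python
def default_step(current):
    # Mirrors getStep's fallback: 1% of the current value, at least 1
    return max(1, current // 100)

def apply_vote(current, increase, min_val, max_val, step):
    # Mirrors updateParams: move one step, clamped at the bound
    if increase:
        return current + step if current < max_val else current
    return current - step if current > min_val else current

# maxBlockCost rising from its default by one approved epoch:
cur = 1_000_000
nxt = apply_vote(cur, True, 16 * 1024, (2**31 - 1) // 2, default_step(cur))
assert nxt == 1_010_000
```

The percentage step means large parameters adjust quickly in absolute terms while small ones move slowly, without per-parameter tuning.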
Cost Calculation
Cost Formula
──────────────────────────────────────────────────────────────────
totalCost = interpreterInitCost // 10,000
+ inputs × inputCost // inputs × 2,000
+ dataInputs × dataInputCost // dataInputs × 100
+ outputs × outputCost // outputs × 100
+ tokenAccessCost × tokens // varies
+ scriptExecutionCost // varies per script
Example (2 inputs, 1 data input, 3 outputs, 50K script):
──────────────────────────────────────────────────────────────────
10,000 interpreter init
4,000 2 × 2,000 inputs
100 1 × 100 data inputs
300 3 × 100 outputs
50,000 script execution
──────────────────────────────────────────────────────────────────
64,400 TOTAL
/// Calculate transaction cost
pub fn calculateTransactionCost(
params: *const Parameters,
num_inputs: usize,
num_data_inputs: usize,
num_outputs: usize,
script_cost: u64,
token_ops: usize,
) u64 {
const init_cost = DefaultParams.INTERPRETER_INIT_COST;
const input_cost = params.inputCost() * @as(i32, @intCast(num_inputs));
const data_input_cost = params.dataInputCost() * @as(i32, @intCast(num_data_inputs));
const output_cost = params.outputCost() * @as(i32, @intCast(num_outputs));
const token_cost = params.tokenAccessCost() * @as(i32, @intCast(token_ops));
return @as(u64, @intCast(init_cost + input_cost + data_input_cost + output_cost + token_cost)) + script_cost;
}
/// Calculate block capacity in simple transactions
pub fn estimateBlockCapacity(params: *const Parameters) u32 {
const max_cost = params.maxBlockCost();
// Simple tx: 1 input (P2PK), 2 outputs, ~15K script cost
const simple_tx_cost = DefaultParams.INTERPRETER_INIT_COST +
params.inputCost() +
params.outputCost() * 2 +
15_000; // P2PK verification
return @intCast(@divTrunc(max_cost, simple_tx_cost));
}
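Plugging in the defaults, estimateBlockCapacity suggests roughly 36 simple P2PK transactions per 1,000,000-cost block. Illustrative arithmetic (the 15,000 script cost is the same rough P2PK estimate used above):

```python
INTERPRETER_INIT_COST = 10_000
INPUT_COST = 2_000
OUTPUT_COST = 100
P2PK_SCRIPT_COST = 15_000  # rough per-input verification estimate
MAX_BLOCK_COST = 1_000_000

# Simple tx: 1 P2PK input, 2 outputs
simple_tx_cost = INTERPRETER_INIT_COST + INPUT_COST + 2 * OUTPUT_COST + P2PK_SCRIPT_COST
capacity = MAX_BLOCK_COST // simple_tx_cost
print(simple_tx_cost, capacity)  # 27200 36
```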
Block Version History
Protocol Versions
────────────────────────────────────────────────────────
Block Version Protocol Features
────────────────────────────────────────────────────────
1 v1 Initial mainnet
2 v5.0 Script improvements
3 v5.0.12 Height monotonicity (EIP-39)
4 v6.0 Sub-blocks, new operations
────────────────────────────────────────────────────────
Script version = block_version - 1
Validation Rules
Rules can be enabled/disabled via soft-fork [7]:
const RuleStatus = struct {
/// Creates error from modifier details
create_error: fn (InvalidModifier) Invalid,
/// Which modifier types this rule applies to
affected_classes: []const ModifierType,
/// Can this rule be disabled via soft-fork?
may_be_disabled: bool,
/// Is this rule currently active?
is_active: bool,
};
/// Validation rule IDs
const ValidationRules = struct {
// Stateless (100-109)
pub const TX_NO_INPUTS: u16 = 100;
pub const TX_NO_OUTPUTS: u16 = 101;
pub const TX_MANY_INPUTS: u16 = 102;
pub const TX_MANY_DATA_INPUTS: u16 = 103;
pub const TX_MANY_OUTPUTS: u16 = 104;
pub const TX_NEGATIVE_OUTPUT: u16 = 105;
pub const TX_OUTPUT_SUM: u16 = 106;
pub const TX_INPUTS_UNIQUE: u16 = 107;
pub const TX_POSITIVE_ASSETS: u16 = 108;
pub const TX_ASSETS_IN_ONE_BOX: u16 = 109;
// Stateful (111-127)
pub const TX_DUST: u16 = 111;
pub const TX_FUTURE: u16 = 112;
pub const TX_BOXES_TO_SPEND: u16 = 113;
pub const TX_DATA_BOXES: u16 = 114;
pub const TX_INPUTS_SUM: u16 = 115;
pub const TX_ERG_PRESERVATION: u16 = 116;
pub const TX_ASSETS_PRESERVATION: u16 = 117;
pub const TX_BOX_TO_SPEND: u16 = 118;
pub const TX_SCRIPT_VALIDATION: u16 = 119;
pub const TX_BOX_SIZE: u16 = 120;
pub const TX_BOX_PROPOSITION_SIZE: u16 = 121;
pub const TX_NEG_HEIGHT: u16 = 122; // v2+
pub const TX_REEMISSION: u16 = 123; // EIP-27
pub const TX_MONOTONIC_HEIGHT: u16 = 124; // v3+
// Block rules (300+)
pub const BS_BLOCK_TX_SIZE: u16 = 306;
pub const BS_BLOCK_TX_COST: u16 = 307;
};
Rule Configurability
Rule Categories
───────────────────────────────────────────────────────────
Category Can Disable? Examples
───────────────────────────────────────────────────────────
Consensus Critical No txErgPreservation
txScriptValidation
txNoInputs
Soft-Forkable Yes txDust
txBoxSize
txReemission
Version-Gated N/A txNegHeight (v2+)
txMonotonicHeight (v3+)
───────────────────────────────────────────────────────────
/// Check if rule can be disabled
pub fn mayBeDisabled(rule: u16) bool {
return switch (rule) {
ValidationRules.TX_DUST,
ValidationRules.TX_BOX_SIZE,
ValidationRules.TX_BOX_PROPOSITION_SIZE,
ValidationRules.TX_REEMISSION,
=> true,
// Consensus-critical rules cannot be disabled
ValidationRules.TX_NO_INPUTS,
ValidationRules.TX_ERG_PRESERVATION,
ValidationRules.TX_SCRIPT_VALIDATION,
ValidationRules.TX_ASSETS_PRESERVATION,
=> false,
else => false,
};
}
Parameter Serialization
Parameters are stored in block extensions [8]:
const SYSTEM_PARAMETERS_PREFIX: u8 = 0x00;
const SOFT_FORK_DISABLING_RULES_KEY: [2]u8 = .{ 0x00, 0x01 };
/// Serialize parameters to extension candidate
pub fn toExtensionCandidate(params: *const Parameters) ExtensionCandidate {
var fields: []ExtensionField = &.{};
// Add parameter fields
var iter = params.parameters_table.iterator();
while (iter.next()) |entry| {
const key = [2]u8{ SYSTEM_PARAMETERS_PREFIX, @intFromEnum(entry.key_ptr.*) };
const value = std.mem.toBytes(std.mem.nativeToBig(i32, entry.value_ptr.*));
fields = append(fields, ExtensionField{ .key = key, .value = &value });
}
// Add proposed update
const update_bytes = params.proposed_update.serialize();
fields = append(fields, ExtensionField{
.key = SOFT_FORK_DISABLING_RULES_KEY,
.value = update_bytes,
});
return ExtensionCandidate{ .fields = fields };
}
/// Parse parameters from extension
pub fn parseExtension(height: u32, extension: *const Extension) !Parameters {
var params_table = std.AutoHashMap(Parameter, i32).init(allocator);
for (extension.fields) |field| {
if (field.key[0] == SYSTEM_PARAMETERS_PREFIX and
field.key[1] != SOFT_FORK_DISABLING_RULES_KEY[1])
{
const param: Parameter = @enumFromInt(field.key[1]);
const value = std.mem.readInt(i32, field.value[0..4], .big);
try params_table.put(param, value);
}
}
var proposed_update = ValidationSettingsUpdate.empty();
for (extension.fields) |field| {
if (std.mem.eql(u8, &field.key, &SOFT_FORK_DISABLING_RULES_KEY)) {
proposed_update = try ValidationSettingsUpdate.parse(field.value);
break;
}
}
return Parameters{
.height = height,
.parameters_table = params_table,
.proposed_update = proposed_update,
};
}
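The key/value layout above can be sketched with Python's struct module: a two-byte key (the 0x00 system prefix plus the parameter id) and a big-endian signed 32-bit value. An illustrative round-trip:

```python
import struct

SYSTEM_PARAMETERS_PREFIX = 0x00

def encode_param(param_id, value):
    # Two-byte key: system prefix, then the parameter id
    key = bytes([SYSTEM_PARAMETERS_PREFIX, param_id])
    return key, struct.pack(">i", value)  # big-endian signed 32-bit

def decode_param(key, value_bytes):
    assert key[0] == SYSTEM_PARAMETERS_PREFIX
    return key[1], struct.unpack(">i", value_bytes)[0]

key, val = encode_param(4, 1_000_000)  # maxBlockCost
assert decode_param(key, val) == (4, 1_000_000)
assert val == b"\x00\x0f\x42\x40"
```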
Summary
- Parameters adjustable via miner voting (1024-block epochs, 90% threshold)
- Cost parameters: maxBlockCost (1M), inputCost (2K), outputCost (100)
- Size parameters: maxBlockSize (512KB), minValuePerByte (360)
- Fee parameters: storageFeeFactor (1.25M nanoErgs per byte per ~4 years)
- Block version tracks protocol upgrades (script_version = block_version - 1)
- Validation rules can be consensus-critical or soft-forkable
- Parameters stored in block extensions, parsed at epoch boundaries
Next: Chapter 26: Wallet and Signing
Scala: Parameters.scala:23-26
Rust: parameters.rs:8-27
Scala: Parameters.scala:190-217
Scala: Parameters.scala:159-183
Rust: parameters.rs:62-77
Scala: Parameters.scala:220-228
Chapter 26: Wallet and Signing
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 15 for proof generation
- Chapter 23 for interpreter integration
- Chapter 11 for hint system and distributed signing
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the wallet service architecture and its role in transaction signing
- Trace the complete transaction signing flow from unsigned to signed
- Use TransactionHintsBag for distributed multi-party signing
- Implement box selection strategies for building transactions
Wallet Architecture
The wallet bridges high-level operations with the interpreter layer [1][2]:
Wallet Service Architecture
─────────────────────────────────────────────────────────
┌────────────────────────────────────────────────────────┐
│ Wallet │
├────────────────────────────────────────────────────────┤
│ prover: Box<dyn Prover> │
│ │
│ ├── from_mnemonic(phrase, pass) -> Wallet │
│ ├── from_secrets([]SecretKey) -> Wallet │
│ ├── add_secret(SecretKey) │
│ │ │
│ ├── sign_transaction(...) -> Transaction │
│ ├── sign_reduced_transaction(...) -> Transaction │
│ │ │
│ └── generate_commitments(...) -> TransactionHintsBag │
└────────────────────────────────────────────────────────┘
│
│ uses
▼
┌────────────────────────────────────────────────────────┐
│ Prover │
│ prove(tree, ctx, message, hints) -> ProverResult │
└────────────────────────────────────────────────────────┘
Wallet Structure
const Wallet = struct {
/// Underlying prover (boxed trait object)
prover: *Prover,
allocator: Allocator,
/// Create wallet from mnemonic phrase
pub fn fromMnemonic(
mnemonic_phrase: []const u8,
mnemonic_pass: []const u8,
allocator: Allocator,
) !Wallet {
const seed = Mnemonic.toSeed(mnemonic_phrase, mnemonic_pass);
const ext_sk = try ExtSecretKey.deriveMaster(seed);
return Wallet.fromSecrets(&.{ext_sk.secretKey()}, allocator);
}
/// Create wallet from secret keys
pub fn fromSecrets(secrets: []const SecretKey, allocator: Allocator) !Wallet {
const private_inputs = try allocator.alloc(PrivateInput, secrets.len);
for (secrets, 0..) |sk, i| {
private_inputs[i] = PrivateInput.from(sk);
}
return .{
.prover = TestProver.init(private_inputs, allocator),
.allocator = allocator,
};
}
/// Add secret to wallet
pub fn addSecret(self: *Wallet, secret: SecretKey) void {
self.prover.appendSecret(PrivateInput.from(secret));
}
/// Sign a transaction
pub fn signTransaction(
self: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signTransactionImpl(
self.prover,
tx_context,
state_context,
tx_hints,
);
}
/// Sign a reduced transaction
pub fn signReducedTransaction(
self: *const Wallet,
reduced_tx: *const ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
return signReducedTransactionImpl(
self.prover,
reduced_tx,
tx_hints,
);
}
/// Generate commitments for distributed signing
pub fn generateCommitments(
self: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
var public_keys: []SigmaBoolean = &.{};
for (self.prover.secrets()) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
return generateCommitmentsImpl(tx_context, state_context, public_keys);
}
};
Mnemonic Seed Generation
BIP-39 mnemonic to seed conversion:
const Mnemonic = struct {
/// PBKDF2 iterations per BIP-39
pub const PBKDF2_ITERATIONS: u32 = 2048;
/// Seed output length (SHA-512)
pub const SEED_LENGTH: usize = 64;
/// Convert mnemonic phrase to seed bytes
pub fn toSeed(
mnemonic_phrase: []const u8,
mnemonic_pass: []const u8,
) [SEED_LENGTH]u8 {
var seed: [SEED_LENGTH]u8 = undefined;
// Normalize to NFKD form
const normalized_phrase = normalizeNfkd(mnemonic_phrase);
const normalized_pass = normalizeNfkd(mnemonic_pass);
// Salt is "mnemonic" + password
var salt_buf: [256]u8 = undefined;
const salt_prefix = "mnemonic";
@memcpy(salt_buf[0..salt_prefix.len], salt_prefix);
@memcpy(salt_buf[salt_prefix.len..][0..normalized_pass.len], normalized_pass);
const salt = salt_buf[0 .. salt_prefix.len + normalized_pass.len];
// PBKDF2-HMAC-SHA512
pbkdf2HmacSha512(
normalized_phrase,
salt,
PBKDF2_ITERATIONS,
&seed,
);
return seed;
}
};
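Because the derivation is fully specified by BIP-39, it can be cross-checked in a few lines of Python against the standard published test vector (`hashlib.pbkdf2_hmac` implements PBKDF2-HMAC-SHA512; this is a language-neutral sketch, not part of any Ergo SDK):

```python
import hashlib
import unicodedata

def mnemonic_to_seed(phrase: str, passphrase: str = "") -> bytes:
    """BIP-39 seed: PBKDF2-HMAC-SHA512, 2048 rounds, salt = "mnemonic" + passphrase."""
    norm = lambda s: unicodedata.normalize("NFKD", s)  # normalize to NFKD form
    return hashlib.pbkdf2_hmac(
        "sha512",
        norm(phrase).encode("utf-8"),
        ("mnemonic" + norm(passphrase)).encode("utf-8"),
        2048,       # PBKDF2_ITERATIONS
        dklen=64,   # SEED_LENGTH
    )

# Standard BIP-39 test vector #1 (passphrase "TREZOR")
phrase = "abandon " * 11 + "about"
seed = mnemonic_to_seed(phrase, "TREZOR")
```

The derived seed matches the vector published with BIP-39, which is a useful sanity check for any reimplementation.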
Transaction Hints Bag
Manages hints for distributed signing (EIP-11):
const TransactionHintsBag = struct {
/// Secret hints by input index (own commitments)
secret_hints: std.AutoHashMap(usize, HintsBag),
/// Public hints by input index (other signers' commitments)
public_hints: std.AutoHashMap(usize, HintsBag),
allocator: Allocator,
pub fn empty(allocator: Allocator) TransactionHintsBag {
return .{
.secret_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.public_hints = std.AutoHashMap(usize, HintsBag).init(allocator),
.allocator = allocator,
};
}
/// Replace all hints for an input index
pub fn replaceHintsForInput(
self: *TransactionHintsBag,
index: usize,
hints_bag: HintsBag,
) void {
var secret_hints: []Hint = &.{};
var public_hints: []Hint = &.{};
for (hints_bag.hints) |hint| {
switch (hint) {
.own_commitment => secret_hints = append(secret_hints, hint),
else => public_hints = append(public_hints, hint),
}
}
self.secret_hints.put(index, HintsBag{ .hints = secret_hints }) catch {};
self.public_hints.put(index, HintsBag{ .hints = public_hints }) catch {};
}
/// Add hints for an input (accumulate with existing)
pub fn addHintsForInput(
self: *TransactionHintsBag,
index: usize,
hints_bag: HintsBag,
) void {
var existing_secret = self.secret_hints.get(index) orelse HintsBag.empty();
var existing_public = self.public_hints.get(index) orelse HintsBag.empty();
for (hints_bag.hints) |hint| {
switch (hint) {
.own_commitment => existing_secret.hints = append(existing_secret.hints, hint),
else => existing_public.hints = append(existing_public.hints, hint),
}
}
self.secret_hints.put(index, existing_secret) catch {};
self.public_hints.put(index, existing_public) catch {};
}
/// Get all hints (secret + public) for an input
pub fn allHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
var hints: []Hint = &.{};
if (self.secret_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
if (self.public_hints.get(index)) |bag| {
for (bag.hints) |h| hints = append(hints, h);
}
return HintsBag{ .hints = hints };
}
/// Get only public hints (safe to share)
pub fn publicHintsForInput(self: *const TransactionHintsBag, index: usize) HintsBag {
return self.public_hints.get(index) orelse HintsBag.empty();
}
};
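The essential invariant is the secret/public partition: own-commitment hints stay local, everything else may be shared. That partition can be modeled in a few lines of Python (hint kinds and class names here are illustrative, not an actual API):

```python
def partition_hints(hints):
    """Split (kind, payload) hints into (secret, public) by kind."""
    secret = [h for h in hints if h[0] == "own_commitment"]
    public = [h for h in hints if h[0] != "own_commitment"]
    return secret, public

class TxHintsBag:
    """Toy model of TransactionHintsBag: hints keyed by input index."""
    def __init__(self):
        self.secret = {}  # input index -> secret hints (never shared)
        self.public = {}  # input index -> public hints (safe to share)

    def add_hints_for_input(self, idx, hints):
        s, p = partition_hints(hints)
        self.secret.setdefault(idx, []).extend(s)
        self.public.setdefault(idx, []).extend(p)

    def all_hints_for_input(self, idx):
        return self.secret.get(idx, []) + self.public.get(idx, [])

    def public_hints_for_input(self, idx):
        return self.public.get(idx, [])

bag = TxHintsBag()
bag.add_hints_for_input(0, [("own_commitment", "r"), ("real_commitment", "g^r")])
```

Only `public_hints_for_input` output should ever cross a trust boundary; `all_hints_for_input` is what the local prover consumes.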
Distributed Signing Protocol (EIP-11)
Distributed Signing Flow
─────────────────────────────────────────────────────
Party A Party B
───────── ─────────
1. Generate Commitments
commitmentsA = generateCommitments()
commitmentsB = generateCommitments()
2. Exchange Public Hints
publicA ──────────────────────────►
◄────────────────── publicB
3. Sign with Combined Hints
combinedA = commitmentsA + publicB
partialSigA = sign(tx, combinedA)
combinedB = commitmentsB + publicA
partialSigB = sign(tx, combinedB)
4. Extract & Complete
partialSigA ─────────────────────►
extractedHints = extractHints(partialSigA)
finalTx = sign(tx, commitmentsB + extracted)
Security: Secret hints (randomness r) NEVER leave their owner.
Only public hints (commitments g^r) are exchanged.
/// Generate commitments for all transaction inputs
pub fn generateCommitments(
wallet: *const Wallet,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
) !TransactionHintsBag {
var public_keys: []SigmaBoolean = &.{};
for (wallet.prover.secrets()) |secret| {
public_keys = append(public_keys, secret.publicImage());
}
var hints_bag = TransactionHintsBag.empty(wallet.allocator);
for (tx_context.spending_tx.inputs.items(), 0..) |_, idx| {
const ctx = try makeContext(state_context, tx_context, idx);
const input_box = tx_context.inputBoxes()[idx];
// Reduce to SigmaBoolean
const reduction = try reduceToCrypto(&input_box.ergo_tree, &ctx);
// Generate commitments for propositions we can prove
const input_hints = generateCommitmentsFor(
&reduction.sigma_prop,
public_keys,
);
hints_bag.addHintsForInput(idx, input_hints);
}
return hints_bag;
}
/// Extract hints from a partial signature
pub fn extractHints(
tx: *const Transaction,
real_propositions: []const SigmaBoolean,
simulated_propositions: []const SigmaBoolean,
boxes_to_spend: []const ErgoBox,
data_boxes: []const ErgoBox,
allocator: Allocator,
) TransactionHintsBag {
var hints_bag = TransactionHintsBag.empty(allocator);
for (tx.inputs.items(), 0..) |input, idx| {
const proof = input.spending_proof.proof;
if (proof.isEmpty()) continue;
const box = boxes_to_spend[idx];
const extracted = extractHintsFromProof(
&box.ergo_tree,
proof.bytes(),
real_propositions,
simulated_propositions,
);
hints_bag.addHintsForInput(idx, extracted);
}
return hints_bag;
}
Box Selection
Select inputs to satisfy target balance and tokens:
const BoxSelector = struct {
/// Selects boxes to satisfy target balance and tokens
pub fn select(
self: *const BoxSelector,
inputs: []const ErgoBox,
target_balance: BoxValue,
target_tokens: []const Token,
allocator: Allocator,
) !BoxSelection {
var selected: []ErgoBox = &.{};
var total_value: u64 = 0;
var total_tokens = std.AutoHashMap(TokenId, u64).init(allocator);
defer total_tokens.deinit();
// First pass: select boxes until targets met
for (inputs) |box| {
const needed = needsMoreBoxes(
total_value,
&total_tokens,
target_balance.as_u64(),
target_tokens,
);
if (!needed) break;
selected = append(selected, box);
total_value += box.value.as_u64();
if (box.tokens) |tokens| {
for (tokens.items()) |token| {
const entry = try total_tokens.getOrPut(token.token_id);
if (entry.found_existing) {
entry.value_ptr.* += token.amount.value;
} else {
entry.value_ptr.* = token.amount.value;
}
}
}
}
// Check if targets met
if (total_value < target_balance.as_u64()) {
return error.NotEnoughCoins;
}
for (target_tokens) |target| {
const have = total_tokens.get(target.token_id) orelse 0;
if (have < target.amount.value) {
return error.NotEnoughTokens;
}
}
// Calculate change
const change = calculateChange(
total_value,
&total_tokens,
target_balance.as_u64(),
target_tokens,
);
return BoxSelection{
.boxes = try BoundedVec(ErgoBox, 1, MAX_INPUTS).fromSlice(selected),
.change_boxes = change,
};
}
};
const BoxSelection = struct {
/// Selected boxes to spend
boxes: BoundedVec(ErgoBox, 1, MAX_INPUTS),
/// Change boxes to create
change_boxes: []ErgoBoxAssetsData,
};
const BoxSelectorError = error{
NotEnoughCoins,
NotEnoughTokens,
TokenAmountError,
NotEnoughCoinsForChangeBox,
SelectedInputsOutOfBounds,
};
Transaction Signing Flow
Transaction Signing Flow
─────────────────────────────────────────────────────
┌──────────────────────────────────────────────────┐
│ 1. User Request │
│ ├── Target balance │
│ └── Target tokens │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 2. Box Selection │
│ ├── BoxSelector.select(inputs, target) │
│ └── Returns: boxes + change │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 3. Build Unsigned Transaction │
│ ├── inputs: selected boxes │
│ ├── data_inputs: read-only references │
│ └── output_candidates: targets + change │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 4. Sign Transaction │
│ For each input: │
│ ├── Create Context │
│ ├── Get hints for input │
│ ├── prover.prove(tree, ctx, message, hints) │
│ └── Accumulate cost │
└──────────────────────────┬───────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ 5. Signed Transaction │
│ └── Submit to mempool │
└──────────────────────────────────────────────────┘
/// Sign transaction with prover
pub fn signTransaction(
prover: *const Prover,
tx_context: *const TransactionContext(UnsignedTransaction),
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const tx = tx_context.spending_tx;
const message = try tx.bytesToSign();
var signed_inputs: []Input = &.{};
for (tx.inputs.items(), 0..) |unsigned_input, idx| {
const ctx = try makeContext(state_context, tx_context, idx);
// Get hints for this input
const hints = if (tx_hints) |h| h.allHintsForInput(idx) else HintsBag.empty();
const input_box = tx_context.getInputBox(unsigned_input.box_id) orelse
return error.InputBoxNotFound;
// Generate proof
const prover_result = try prover.prove(
&input_box.ergo_tree,
&ctx,
message,
&hints,
);
signed_inputs = append(signed_inputs, Input{
.box_id = unsigned_input.box_id,
.spending_proof = prover_result,
});
}
return Transaction.new(
try TxIoVec(Input).fromSlice(signed_inputs),
tx.data_inputs,
tx.output_candidates,
);
}
Asset Extraction
Calculate token access costs:
const ErgoBoxAssetExtractor = struct {
pub const MAX_ASSETS_PER_BOX: usize = 255;
/// Extract total token amounts from boxes
pub fn extractAssets(
boxes: []const ErgoBoxCandidate,
allocator: Allocator,
) !struct { assets: std.AutoHashMap(TokenId, u64), count: usize } {
var assets = std.AutoHashMap(TokenId, u64).init(allocator);
var total_count: usize = 0;
for (boxes) |box| {
if (box.tokens) |tokens| {
if (tokens.len() > MAX_ASSETS_PER_BOX) {
return error.TooManyAssetsInBox;
}
for (tokens.items()) |token| {
const entry = try assets.getOrPut(token.token_id);
if (entry.found_existing) {
entry.value_ptr.* = std.math.add(
u64,
entry.value_ptr.*,
token.amount.value,
) catch return error.Overflow;
} else {
entry.value_ptr.* = token.amount.value;
}
}
total_count += tokens.len();
}
}
return .{ .assets = assets, .count = total_count };
}
/// Calculate total token access cost
pub fn totalAssetsAccessCost(
in_assets_num: usize,
in_assets_size: usize,
out_assets_num: usize,
out_assets_size: usize,
token_access_cost: u32,
) u64 {
// Cost to iterate through all tokens
const all_assets_cost = @as(u64, out_assets_num + in_assets_num) * token_access_cost;
// Cost to check preservation of unique tokens
const unique_assets_cost = @as(u64, in_assets_size + out_assets_size) * token_access_cost;
return all_assets_cost + unique_assets_cost;
}
};
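The cost formula is simple enough to check directly; a Python transcription of the function above (parameter names mirror the Zig version: `*_num` counts token occurrences, `*_unique` counts distinct token ids):

```python
def total_assets_access_cost(in_num, in_unique, out_num, out_unique,
                             token_access_cost):
    """Token access cost: one charge per occurrence plus one per unique id."""
    all_assets = (out_num + in_num) * token_access_cost       # iterate all tokens
    unique_assets = (in_unique + out_unique) * token_access_cost  # preservation checks
    return all_assets + unique_assets

# 3 input occurrences (2 unique) + 4 output occurrences (2 unique), cost 100 each
cost = total_assets_access_cost(3, 2, 4, 2, 100)
```

With `token_access_cost = 100`, 7 occurrences and 4 unique ids give 700 + 400 = 1100.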
Wallet Errors
const WalletError = error{
/// Transaction signing failed
TxSigningError,
/// Prover failed to generate proof
ProverError,
/// Key derivation failed
ExtSecretKeyError,
/// Secret key parsing failed
SecretKeyParsingError,
/// Wallet not initialized
WalletNotInitialized,
/// Wallet locked
WalletLocked,
/// Wallet already unlocked
WalletAlreadyUnlocked,
/// Box selection failed
BoxSelectionError,
};
Distributed Signing Example
// Party A: Generate commitments
const commitments_a = try wallet_a.generateCommitments(&tx_context, &state_context);
// Party B: Generate commitments
const commitments_b = try wallet_b.generateCommitments(&tx_context, &state_context);
// Exchange public hints (safe to share)
const public_a = commitments_a.publicHintsForInput(0);
const public_b = commitments_b.publicHintsForInput(0);
// Party A: Sign with combined hints
var combined_a = commitments_a;
combined_a.addHintsForInput(0, public_b);
const partial_sig_a = try wallet_a.signTransaction(&tx_context, &state_context, &combined_a);
// Party B: Extract hints from A's partial signature
const extracted = extractHints(
&partial_sig_a,
real_propositions,
simulated_propositions,
boxes_to_spend,
data_boxes,
);
// Party B: Complete signing
var final_hints = commitments_b;
final_hints.addHintsForInput(0, extracted.allHintsForInput(0));
const final_tx = try wallet_b.signTransaction(&tx_context, &state_context, &final_hints);
Summary
- Wallet wraps prover with high-level signing API
- Mnemonic converts BIP-39 phrase to seed via PBKDF2
- TransactionHintsBag separates secret/public hints for distributed signing
- BoxSelector finds optimal input set for target balance/tokens
- Distributed signing (EIP-11) exchanges commitments, never secrets
- Asset extraction calculates token access costs
Next: Chapter 27: High-Level SDK
Rust: wallet.rs:52-93
Scala: Mnemonic.scala
Rust: mnemonic.rs:20-37
Rust: wallet.rs:259-347
Scala: BoxSelector.scala
Rust: box_selector.rs:34-46
Chapter 27: High-Level SDK
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 24 for transaction structure and validation
- Chapter 15 for proof generation
- Chapter 26 for wallet integration
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the SDK architecture layers from cryptography to transaction building
- Use TxBuilder with the builder pattern for ergonomic transaction construction
- Trace the reduce-then-sign pipeline for transaction signing
- Work with TransactionContext and BoxSelection for complex transaction scenarios
SDK Architecture
The SDK provides a layered abstraction from low-level cryptography to high-level transaction building:
SDK Layer Architecture
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ Application Layer │
│ TxBuilder BoxSelector ErgoBoxCandidateBuilder │
├────────────────────────────────────────────────────────────────┤
│ Wallet Layer │
│ Wallet TransactionContext TransactionHintsBag │
├────────────────────────────────────────────────────────────────┤
│ Reduction Layer │
│ reduce_tx() ReducedTransaction ReducedInput │
├────────────────────────────────────────────────────────────────┤
│ Signing Layer │
│ sign_transaction() sign_reduced_transaction() │
├────────────────────────────────────────────────────────────────┤
│ Interpreter Layer │
│ Prover Verifier reduce_to_crypto() │
└────────────────────────────────────────────────────────────────┘
Transaction Builder
The builder pattern constructs unsigned transactions with validation:
const TxBuilder = struct {
box_selection: BoxSelection,
data_inputs: std.ArrayList(DataInput),
output_candidates: std.ArrayList(ErgoBoxCandidate),
current_height: u32,
fee_amount: BoxValue,
change_address: Address,
context_extensions: std.AutoHashMap(BoxId, ContextExtension),
token_burn_permit: std.ArrayList(Token),
allocator: Allocator,
pub fn init(
box_selection: BoxSelection,
output_candidates: []const ErgoBoxCandidate,
current_height: u32,
fee_amount: BoxValue,
change_address: Address,
allocator: Allocator,
) !TxBuilder {
var outputs = std.ArrayList(ErgoBoxCandidate).init(allocator);
try outputs.appendSlice(output_candidates);
return .{
.box_selection = box_selection,
.data_inputs = std.ArrayList(DataInput).init(allocator),
.output_candidates = outputs,
.current_height = current_height,
.fee_amount = fee_amount,
.change_address = change_address,
.context_extensions = std.AutoHashMap(BoxId, ContextExtension).init(allocator),
.token_burn_permit = std.ArrayList(Token).init(allocator),
.allocator = allocator,
};
}
pub fn deinit(self: *TxBuilder) void {
self.data_inputs.deinit();
self.output_candidates.deinit();
self.context_extensions.deinit();
self.token_burn_permit.deinit();
}
pub fn setDataInputs(self: *TxBuilder, data_inputs: []const DataInput) !void {
self.data_inputs.clearRetainingCapacity();
try self.data_inputs.appendSlice(data_inputs);
}
pub fn setContextExtension(self: *TxBuilder, box_id: BoxId, ext: ContextExtension) !void {
try self.context_extensions.put(box_id, ext);
}
pub fn setTokenBurnPermit(self: *TxBuilder, tokens: []const Token) !void {
self.token_burn_permit.clearRetainingCapacity();
try self.token_burn_permit.appendSlice(tokens);
}
};
Build Validation
Building performs comprehensive validation before creating the transaction:
pub fn build(self: *TxBuilder) !UnsignedTransaction {
// Validate inputs
if (self.box_selection.boxes.items.len == 0) {
return error.EmptyInputs;
}
if (self.output_candidates.items.len == 0) {
return error.EmptyOutputs;
}
if (self.box_selection.boxes.items.len > std.math.maxInt(u16)) {
return error.TooManyInputs;
}
// Check for duplicate inputs
var seen = std.AutoHashMap(BoxId, void).init(self.allocator);
defer seen.deinit();
for (self.box_selection.boxes.items) |box| {
const result = try seen.getOrPut(box.box_id);
if (result.found_existing) {
return error.DuplicateInputs;
}
}
// Build output candidates with change boxes
var all_outputs = try self.buildOutputCandidates();
defer all_outputs.deinit();
// Validate coin preservation
const total_in = sumValue(self.box_selection.boxes.items);
const total_out = sumValue(all_outputs.items);
if (total_out > total_in) {
return error.NotEnoughCoinsInInputs;
}
if (total_out < total_in) {
return error.NotEnoughCoinsInOutputs;
}
// Validate token balance
try self.validateTokenBalance(all_outputs.items);
// Create unsigned inputs with context extensions
var unsigned_inputs = std.ArrayList(UnsignedInput).init(self.allocator);
for (self.box_selection.boxes.items) |box| {
const ext = self.context_extensions.get(box.box_id) orelse
ContextExtension.empty();
try unsigned_inputs.append(.{
.box_id = box.box_id,
.extension = ext,
});
}
return UnsignedTransaction{
.inputs = try unsigned_inputs.toOwnedSlice(),
.data_inputs = try self.data_inputs.toOwnedSlice(),
.output_candidates = try all_outputs.toOwnedSlice(),
};
}
fn buildOutputCandidates(self: *TxBuilder) !std.ArrayList(ErgoBoxCandidate) {
var outputs = std.ArrayList(ErgoBoxCandidate).init(self.allocator);
// Add user-specified outputs
try outputs.appendSlice(self.output_candidates.items);
// Add change boxes from selection
const change_tree = try Contract.payToAddress(self.change_address);
for (self.box_selection.change_boxes.items) |change| {
var candidate = try ErgoBoxCandidateBuilder.init(
change.value,
change_tree,
self.current_height,
self.allocator,
);
for (change.tokens) |token| {
try candidate.addToken(token);
}
try outputs.append(try candidate.build());
}
// Add miner fee box
const fee_box = try newMinerFeeBox(self.fee_amount, self.current_height);
try outputs.append(fee_box);
return outputs;
}
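The structural checks in build() can be condensed into a language-neutral sketch, covering only the validation (no transaction construction; error strings mirror the Zig error names and are otherwise illustrative):

```python
def validate_build(input_ids, input_values, output_values):
    """Build-time checks: non-empty sides, unique inputs, exact coin balance."""
    if not input_ids:
        raise ValueError("EmptyInputs")
    if not output_values:
        raise ValueError("EmptyOutputs")
    # Duplicate inputs would double-spend the same box
    if len(set(input_ids)) != len(input_ids):
        raise ValueError("DuplicateInputs")
    # Coin preservation: outputs (including fee and change) must exactly
    # equal inputs -- both directions are errors
    total_in, total_out = sum(input_values), sum(output_values)
    if total_out > total_in:
        raise ValueError("NotEnoughCoinsInInputs")
    if total_out < total_in:
        raise ValueError("NotEnoughCoinsInOutputs")

validate_build(["a", "b"], [100, 50], [140, 10])  # balanced: passes
```

Note that both inequalities are rejected: an unsigned transaction must balance exactly once change and fee boxes are added.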
Token Balance Validation
Token flow must be explicitly validated:
fn validateTokenBalance(self: *TxBuilder, outputs: []const ErgoBoxCandidate) !void {
var input_tokens = try sumTokens(self.box_selection.boxes.items, self.allocator);
defer input_tokens.deinit();
var output_tokens = try sumTokens(outputs, self.allocator);
defer output_tokens.deinit();
// Token minting rule: new tokens can ONLY have token_id == first_input.box_id
// You can mint any AMOUNT of this token type, but only ONE token type per tx.
const first_input_id = TokenId.fromBoxId(self.box_selection.boxes.items[0].box_id);
// Separate minted tokens (first_input_id) from transferred tokens
var has_minted_token = false;
var output_without_minted = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
defer output_without_minted.deinit();
var iter = output_tokens.iterator();
while (iter.next()) |entry| {
if (entry.key_ptr.*.eql(first_input_id)) {
has_minted_token = true;
// Note: any amount is allowed for the minted token
} else {
try output_without_minted.put(entry.key_ptr.*, entry.value_ptr.*);
}
}
_ = has_minted_token; // Minting itself needs no amount check: any amount of the newly minted token is allowed
// Check all output tokens exist in inputs
var out_iter = output_without_minted.iterator();
while (out_iter.next()) |entry| {
const input_amt = input_tokens.get(entry.key_ptr.*) orelse {
return error.NotEnoughTokens;
};
if (input_amt < entry.value_ptr.*) {
return error.NotEnoughTokens;
}
}
// Check token burn permits
var burned = try subtractTokens(input_tokens, output_without_minted, self.allocator);
defer burned.deinit();
try self.checkBurnPermit(burned);
}
fn checkBurnPermit(self: *TxBuilder, burned: std.AutoHashMap(TokenId, TokenAmount)) !void {
// Build permit map
var permits = std.AutoHashMap(TokenId, TokenAmount).init(self.allocator);
defer permits.deinit();
for (self.token_burn_permit.items) |token| {
try permits.put(token.id, token.amount);
}
// Every burned token must have permit
var iter = burned.iterator();
while (iter.next()) |entry| {
const permit_amt = permits.get(entry.key_ptr.*) orelse {
return error.TokenBurnPermitMissing;
};
if (entry.value_ptr.* > permit_amt) {
return error.TokenBurnPermitExceeded;
}
}
// Every permit must be used exactly
var permit_iter = permits.iterator();
while (permit_iter.next()) |entry| {
const burned_amt = burned.get(entry.key_ptr.*) orelse {
return error.TokenBurnPermitUnused;
};
if (burned_amt < entry.value_ptr.*) {
return error.TokenBurnPermitUnused;
}
}
}
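The burn-permit rules reduce to a small sketch: burned = inputs minus outputs per token, every burned amount needs a permit at least that large, and every permit must be fully consumed (token maps are plain dicts here; error strings mirror the Zig error names):

```python
def check_burn_permits(input_tokens, output_tokens, permits):
    """Explicit burn accounting: permits must cover burns exactly."""
    # Tokens present in inputs but missing (or reduced) in outputs are burned
    burned = {}
    for tid, amt in input_tokens.items():
        diff = amt - output_tokens.get(tid, 0)
        if diff > 0:
            burned[tid] = diff
    # Every burned token must have a sufficiently large permit
    for tid, amt in burned.items():
        if tid not in permits:
            raise ValueError("TokenBurnPermitMissing")
        if amt > permits[tid]:
            raise ValueError("TokenBurnPermitExceeded")
    # Every permit must be used in full (guards against accidental burns)
    for tid, amt in permits.items():
        if burned.get(tid, 0) < amt:
            raise ValueError("TokenBurnPermitUnused")

check_burn_permits({"a": 10}, {"a": 7}, {"a": 3})  # burn 3 with permit 3: ok
```

The unused-permit check is what makes burning opt-in in both directions: you cannot burn without a permit, and a permit you do not use is itself an error.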
Box Candidate Builder
Constructs output boxes with fluent API:
const ErgoBoxCandidateBuilder = struct {
value: BoxValue,
ergo_tree: ErgoTree,
creation_height: u32,
tokens: std.ArrayList(Token),
registers: [6]?Constant, // R4-R9
allocator: Allocator,
pub fn init(
value: BoxValue,
ergo_tree: ErgoTree,
creation_height: u32,
allocator: Allocator,
) !ErgoBoxCandidateBuilder {
return .{
.value = value,
.ergo_tree = ergo_tree,
.creation_height = creation_height,
.tokens = std.ArrayList(Token).init(allocator),
.registers = [_]?Constant{null} ** 6,
.allocator = allocator,
};
}
pub fn addToken(self: *ErgoBoxCandidateBuilder, token: Token) !void {
if (self.tokens.items.len >= MAX_TOKENS) {
return error.TooManyTokens;
}
try self.tokens.append(token);
}
pub fn mintToken(
self: *ErgoBoxCandidateBuilder,
token: Token,
name: []const u8,
description: []const u8,
decimals: u8,
) !void {
try self.addToken(token);
// Store metadata in R4-R6
self.registers[0] = Constant.fromBytes(name);
self.registers[1] = Constant.fromBytes(description);
self.registers[2] = Constant.fromByte(decimals);
}
pub fn setRegister(self: *ErgoBoxCandidateBuilder, reg: RegisterId, value: Constant) void {
const idx = @intFromEnum(reg) - 4; // R4 = 0, R5 = 1, etc.
self.registers[idx] = value;
}
pub fn build(self: *ErgoBoxCandidateBuilder) !ErgoBoxCandidate {
return ErgoBoxCandidate{
.value = self.value,
.ergo_tree = self.ergo_tree,
.creation_height = self.creation_height,
.tokens = try self.tokens.toOwnedSlice(),
.additional_registers = self.registers,
};
}
};
Transaction Context
Bundles transaction with input boxes for signing:
const TransactionContext = struct {
spending_tx: UnsignedTransaction,
input_boxes: []const ErgoBox,
data_boxes: ?[]const ErgoBox,
pub fn init(
spending_tx: UnsignedTransaction,
input_boxes: []const ErgoBox,
data_boxes: ?[]const ErgoBox,
) !TransactionContext {
// Validate input boxes match transaction inputs
if (input_boxes.len != spending_tx.inputs.len) {
return error.InputBoxCountMismatch;
}
for (spending_tx.inputs, input_boxes) |input, box| {
if (!input.box_id.eql(box.box_id())) {
return error.InputBoxIdMismatch;
}
}
// Validate data boxes if present
if (spending_tx.data_inputs) |data_inputs| {
const data = data_boxes orelse return error.DataInputBoxNotFound;
if (data.len != data_inputs.len) {
return error.DataInputBoxCountMismatch;
}
}
return .{
.spending_tx = spending_tx,
.input_boxes = input_boxes,
.data_boxes = data_boxes,
};
}
pub fn getInputBox(self: *const TransactionContext, box_id: BoxId) ?*const ErgoBox {
for (self.input_boxes) |*box| {
if (box.box_id().eql(box_id)) {
return box;
}
}
return null;
}
};
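The consistency checks in init() can be sketched independently of the box types (box ids modeled as strings; error strings mirror the Zig error names):

```python
def validate_tx_context(input_ids, input_box_ids, data_input_count, data_box_count):
    """One resolved box per input, ids matching pairwise, one box per data input."""
    if len(input_box_ids) != len(input_ids):
        raise ValueError("InputBoxCountMismatch")
    for want, got in zip(input_ids, input_box_ids):
        if want != got:
            raise ValueError("InputBoxIdMismatch")
    # Data boxes are only required when the tx declares data inputs
    if data_input_count and data_box_count != data_input_count:
        raise ValueError("DataInputBoxCountMismatch")

validate_tx_context(["b1", "b2"], ["b1", "b2"], 0, 0)  # consistent: passes
```

Doing these checks at construction time means the signing loop can index input boxes positionally without re-validating.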
Box Selection
Selects input boxes to satisfy output requirements:
const BoxSelection = struct {
boxes: std.ArrayList(ErgoBox),
change_boxes: std.ArrayList(ErgoBoxAssets),
const ErgoBoxAssets = struct {
value: BoxValue,
tokens: []const Token,
};
};
const SimpleBoxSelector = struct {
pub fn select(
available: []const ErgoBox,
target_value: BoxValue,
target_tokens: []const Token,
allocator: Allocator,
) !BoxSelection {
var selected = std.ArrayList(ErgoBox).init(allocator);
var total_value: u64 = 0;
var token_sums = std.AutoHashMap(TokenId, TokenAmount).init(allocator);
defer token_sums.deinit();
// Greedy selection
for (available) |box| {
const needed = checkNeed(total_value, target_value, token_sums, target_tokens);
if (!needed) break;
try selected.append(box);
total_value += box.value.as_u64();
for (box.tokens) |token| {
const entry = try token_sums.getOrPut(token.id);
if (entry.found_existing) {
entry.value_ptr.* = try entry.value_ptr.*.checkedAdd(token.amount);
} else {
entry.value_ptr.* = token.amount;
}
}
}
// Ensure the target was actually reached before computing change
if (total_value < target_value.as_u64()) {
return error.NotEnoughCoins;
}
// Calculate change
var change_boxes = std.ArrayList(BoxSelection.ErgoBoxAssets).init(allocator);
const change_value = total_value - target_value.as_u64();
if (change_value > 0) {
const change_tokens = try calculateChangeTokens(token_sums, target_tokens, allocator);
try change_boxes.append(.{
.value = BoxValue.init(change_value) catch return error.ChangeValueTooSmall,
.tokens = change_tokens,
});
}
return .{
.boxes = selected,
.change_boxes = change_boxes,
};
}
};
Reduced Transaction
Script reduction separates evaluation from signing:
const ReducedInput = struct {
sigma_prop: SigmaBoolean,
cost: u64,
extension: ContextExtension,
};
const ReducedTransaction = struct {
unsigned_tx: UnsignedTransaction,
reduced_inputs: []const ReducedInput,
tx_cost: u32,
pub fn reducedInputs(self: *const ReducedTransaction) []const ReducedInput {
return self.reduced_inputs;
}
};
/// Reduce transaction inputs to sigma propositions
pub fn reduceTx(
tx_context: TransactionContext,
state_context: *const ErgoStateContext,
allocator: Allocator,
) !ReducedTransaction {
var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
for (tx_context.spending_tx.inputs, 0..) |input, idx| {
// Build evaluation context
var ctx = try makeContext(state_context, &tx_context, idx);
// Get input box
const input_box = tx_context.getInputBox(input.box_id) orelse
return error.InputBoxNotFound;
// Reduce ErgoTree to SigmaBoolean
const result = try reduceToCrypto(&input_box.ergo_tree, &ctx);
try reduced_inputs.append(.{
.sigma_prop = result.sigma_prop,
.cost = result.cost,
.extension = input.extension,
});
}
return .{
.unsigned_tx = tx_context.spending_tx,
.reduced_inputs = try reduced_inputs.toOwnedSlice(),
.tx_cost = 0,
};
}
Signing Pipeline
Signing Flow
══════════════════════════════════════════════════════════════════
┌─────────────────┐ ┌──────────────────┐ ┌───────────────┐
│ UnsignedTx │ │ ReducedTx │ │ SignedTx │
│ + InputBoxes │────▶│ (SigmaProps) │────▶│ (Proofs) │
│ + StateContext │ │ │ │ │
└─────────────────┘ └──────────────────┘ └───────────────┘
│ │ │
│ reduce_tx() │ sign_reduced_tx() │
│ (needs context) │ (context-free) │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Online │ │ Offline │ │ Verify │
│ Wallet │ │ Wallet │ │ Node │
└─────────┘ └─────────┘ └─────────┘
Transaction signing with optional hints:
pub fn signTransaction(
prover: *const Prover,
tx_context: TransactionContext,
state_context: *const ErgoStateContext,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const message = try tx_context.spending_tx.bytesToSign();
var signed_inputs = std.ArrayList(Input).init(prover.allocator);
for (tx_context.spending_tx.inputs, 0..) |_, idx| {
const signed = try signTxInput(
prover,
&tx_context,
state_context,
tx_hints,
idx,
message,
);
try signed_inputs.append(signed);
}
return Transaction{
.inputs = try signed_inputs.toOwnedSlice(),
.data_inputs = tx_context.spending_tx.data_inputs,
.outputs = tx_context.spending_tx.output_candidates,
};
}
pub fn signReducedTransaction(
prover: *const Prover,
reduced_tx: ReducedTransaction,
tx_hints: ?*const TransactionHintsBag,
) !Transaction {
const message = try reduced_tx.unsigned_tx.bytesToSign();
var signed_inputs = std.ArrayList(Input).init(prover.allocator);
for (reduced_tx.unsigned_tx.inputs, 0..) |input, idx| {
const reduced_input = reduced_tx.reduced_inputs[idx];
// Get hints for this input
const hints = if (tx_hints) |bag|
bag.allHintsForInput(idx)
else
HintsBag.empty();
// Generate proof from sigma proposition
const proof = try prover.generateProof(
reduced_input.sigma_prop,
message,
&hints,
);
try signed_inputs.append(.{
.box_id = input.box_id,
.spending_proof = .{
.proof = proof,
.extension = reduced_input.extension,
},
});
}
return Transaction{
.inputs = try signed_inputs.toOwnedSlice(),
.data_inputs = reduced_tx.unsigned_tx.data_inputs,
.outputs = reduced_tx.unsigned_tx.output_candidates,
};
}
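The key property is the split: reduction needs the chain context, while signing a reduced transaction needs only secrets. A toy Python model makes the dependency explicit (propositions are stand-in strings, not real sigma trees; all names are illustrative):

```python
def reduce_tx(input_scripts, chain_ctx):
    """Online step: evaluate each input's script against the context,
    leaving only the proposition that must be proven."""
    return [chain_ctx["resolve"](script) for script in input_scripts]

def sign_reduced(reduced, secrets):
    """Offline step: produce a proof per proposition using only secrets --
    no blockchain context is consulted here."""
    proofs = []
    for prop in reduced:
        if prop not in secrets:
            raise ValueError("ProverError")
        proofs.append(("proof", prop))
    return proofs

# "resolve" stands in for reduce_to_crypto: script + context -> proposition
ctx = {"resolve": lambda script: script.split(":")[1]}
reduced = reduce_tx(["pk:alice", "pk:bob"], ctx)   # hot wallet, needs context
proofs = sign_reduced(reduced, {"alice", "bob"})   # cold wallet, context-free
```

This is exactly why a ReducedTransaction can cross an air gap: everything context-dependent was resolved before serialization.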
Miner Fee Box
Standard miner fee output:
/// Standard miner fee ErgoTree (spendable only by a miner, after a height delay)
const MINERS_FEE_ERGO_TREE = [_]u8{
0x10, 0x05, 0x04, 0x00, 0x04, 0x00, 0x0e, 0x36,
0x10, 0x02, 0x04, 0xa0, 0x0b, 0x08, 0xcd, 0x02,
// ... (standard miner fee script)
};
pub fn newMinerFeeBox(fee: BoxValue, creation_height: u32) !ErgoBoxCandidate {
const tree = try ErgoTree.sigmaParse(&MINERS_FEE_ERGO_TREE);
return ErgoBoxCandidate{
.value = fee,
.ergo_tree = tree,
.creation_height = creation_height,
.tokens = &[_]Token{},
.additional_registers = [_]?Constant{null} ** 6,
};
}
/// Suggested transaction fee (1.1 mERG)
pub const SUGGESTED_TX_FEE = BoxValue.init(1_100_000) catch unreachable;
Reduced Transaction Serialization
EIP-19 format for cold wallet transfer:
const ReducedTransactionSerializer = struct {
pub fn serialize(tx: *const ReducedTransaction, writer: anytype) !void {
// Write message to sign (includes all tx data)
const msg = try tx.unsigned_tx.bytesToSign();
try writer.writeInt(u32, @intCast(msg.len), .little);
try writer.writeAll(msg);
// Write reduced inputs
for (tx.reduced_inputs) |red_in| {
try SigmaBoolean.serialize(&red_in.sigma_prop, writer);
try writer.writeInt(u64, red_in.cost, .little);
}
try writer.writeInt(u32, tx.tx_cost, .little);
}
pub fn parse(reader: anytype, allocator: Allocator) !ReducedTransaction {
// Read and parse message
const msg_len = try reader.readInt(u32, .little);
const msg = try allocator.alloc(u8, msg_len);
try reader.readNoEof(msg);
const tx = try Transaction.sigmaParse(msg);
// Read reduced inputs
var reduced_inputs = std.ArrayList(ReducedInput).init(allocator);
for (tx.inputs) |input| {
const sigma_prop = try SigmaBoolean.parse(reader);
const cost = try reader.readInt(u64, .little);
try reduced_inputs.append(.{
.sigma_prop = sigma_prop,
.cost = cost,
.extension = input.spending_proof.extension,
});
}
const tx_cost = try reader.readInt(u32, .little);
return .{
.unsigned_tx = tx.toUnsigned(),
.reduced_inputs = try reduced_inputs.toOwnedSlice(),
.tx_cost = tx_cost,
};
}
};
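The framing above round-trips cleanly. Here is a minimal Python sketch of the same layout (length-prefixed message, per-input proposition and cost, trailing total cost). The u32 length prefix on each proposition is a simplification standing in for SigmaBoolean's own serializer, so this is not the exact EIP-19 wire encoding; it only illustrates the structure.

```python
import struct
from io import BytesIO

def serialize_reduced(msg: bytes, inputs: list, tx_cost: int) -> bytes:
    """Frame: u32 msg length, msg, then (sigma_prop, u64 cost) per input, u32 tx cost."""
    out = BytesIO()
    out.write(struct.pack("<I", len(msg)))
    out.write(msg)
    for prop, cost in inputs:
        # Length prefix stands in for SigmaBoolean's self-delimiting encoding
        out.write(struct.pack("<I", len(prop)))
        out.write(prop)
        out.write(struct.pack("<Q", cost))
    out.write(struct.pack("<I", tx_cost))
    return out.getvalue()

def parse_reduced(data: bytes):
    buf = BytesIO(data)
    (msg_len,) = struct.unpack("<I", buf.read(4))
    msg = buf.read(msg_len)
    inputs = []
    # In the real format the input count comes from the parsed transaction;
    # here we read until only the trailing u32 tx cost remains.
    while len(data) - buf.tell() > 4:
        (plen,) = struct.unpack("<I", buf.read(4))
        prop = buf.read(plen)
        (cost,) = struct.unpack("<Q", buf.read(8))
        inputs.append((prop, cost))
    (tx_cost,) = struct.unpack("<I", buf.read(4))
    return msg, inputs, tx_cost
```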
Cold Wallet Flow
Cold Wallet Signing
══════════════════════════════════════════════════════════════════
Online Wallet (Hot) Cold Wallet (Air-gapped)
────────────────────── ────────────────────────
│ │
Build Unsigned Tx │
│ │
reduce_tx() │
│ │
Serialize ReducedTx ─────────────────────▶│
(QR code / USB) │
│ Parse ReducedTx
│ │
│ sign_reduced_tx()
│ (uses secrets)
│ │
│◀──────────────────────── Serialize SignedTx
│ (QR code / USB)
Broadcast Tx │
│ │
▼ ▼
Complete Usage Example
pub fn buildAndSignTransaction(
wallet: *const Wallet,
available_boxes: []const ErgoBox,
recipient: Address,
amount: u64,
state_context: *const ErgoStateContext,
allocator: Allocator,
) !Transaction {
const current_height = state_context.pre_header.height;
// 1. Build output
const recipient_tree = try Contract.payToAddress(recipient);
var out_builder = try ErgoBoxCandidateBuilder.init(
try BoxValue.init(amount),
recipient_tree,
current_height,
allocator,
);
const output = try out_builder.build();
// 2. Select inputs
const total_needed = try BoxValue.init(amount + SUGGESTED_TX_FEE.as_u64());
const selection = try SimpleBoxSelector.select(
available_boxes,
total_needed,
&[_]Token{},
allocator,
);
// 3. Build transaction
const change_address = wallet.getP2PKAddress();
var builder = try TxBuilder.init(
selection,
&[_]ErgoBoxCandidate{output},
current_height,
SUGGESTED_TX_FEE,
change_address,
allocator,
);
defer builder.deinit();
const unsigned_tx = try builder.build();
// 4. Create transaction context
const tx_context = try TransactionContext.init(
unsigned_tx,
selection.boxes.items,
null,
);
// 5. Sign transaction
return wallet.signTransaction(tx_context, state_context, null);
}
Summary
- TxBuilder constructs unsigned transactions with validation
- BoxSelection satisfies value and token requirements
- ErgoBoxCandidateBuilder creates output boxes with fluent API
- TransactionContext bundles transaction with input data
- reduce_tx() separates script evaluation from signing
- ReducedTransaction enables air-gapped cold wallet signing
- Token burn requires explicit permits to prevent accidents
Next: Chapter 28: Key Derivation
Scala: sdk/
Rust: wallet.rs:52-244
Rust: tx_builder.rs:41-78
Rust: tx_builder.rs:144-258
Scala: AppkitProvingInterpreter.scala (token validation)
Rust: tx_builder.rs:214-243
Scala: Transactions.scala:17-46
Rust: tx_context.rs
Scala: BoxSelectionResult.scala
Rust: box_selector.rs
Rust: reduced.rs:25-67
Rust: signing.rs:143-168
Rust: reduced.rs:108-154
Chapter 28: Key Derivation
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Elliptic curve cryptography (Chapter 9)
- Hash functions (Chapter 10)
- High-level SDK (Chapter 27)
Learning Objectives
- Understand BIP-32 hierarchical deterministic key derivation
- Implement derivation paths and index encoding
- Distinguish hardened from non-hardened derivation
- Master EIP-3 key derivation for Ergo
HD Wallet Architecture
Hierarchical Deterministic (HD) wallets derive unlimited keys from a single master seed:
HD Key Derivation Tree
══════════════════════════════════════════════════════════════════
Master Seed (BIP-39)
│
HMAC-SHA512("Bitcoin seed", seed)
│
┌─────────────┴─────────────┐
│ │
Master Key Chain Code
(32 bytes) (32 bytes)
│ │
└───────────┬───────────────┘
│
Extended Master Key
│
┌────────────────┼────────────────┐
│ │ │
m/44' (Purpose) m/44'/429' m/44'/429'/0'
│ (Coin Type) (Account)
│ │ │
▼ ▼ ▼
BIP-44 Keys Ergo Keys Account Keys
Index Types
Child indices distinguish hardened from normal derivation:
const ChildIndex = union(enum) {
hardened: HardenedIndex,
normal: NormalIndex,
const HardenedIndex = struct {
value: u31, // 0 to 2^31-1
pub fn toBits(self: HardenedIndex) u32 {
return @as(u32, self.value) | HARDENED_BIT;
}
};
const NormalIndex = struct {
value: u31, // 0 to 2^31-1
pub fn toBits(self: NormalIndex) u32 {
return @as(u32, self.value);
}
pub fn next(self: NormalIndex) NormalIndex {
return .{ .value = self.value + 1 };
}
};
const HARDENED_BIT: u32 = 0x80000000; // 2^31
pub fn hardened(i: u31) ChildIndex {
return .{ .hardened = .{ .value = i } };
}
pub fn normal(i: u31) ChildIndex {
return .{ .normal = .{ .value = i } };
}
pub fn toBits(self: ChildIndex) u32 {
return switch (self) {
.hardened => |h| h.toBits(),
.normal => |n| n.toBits(),
};
}
pub fn isHardened(self: ChildIndex) bool {
return self == .hardened;
}
/// Next index at the same hardness (used to retry when a derived key is invalid)
pub fn next(self: ChildIndex) ChildIndex {
return switch (self) {
.hardened => |h| hardened(h.value + 1),
.normal => |n| normal(n.value + 1),
};
}
};
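The bit-level encoding is easy to sanity-check outside Zig. A few lines of Python reproduce the toBits mapping: hardened and normal indices share the low 31 bits and differ only in the top bit.

```python
HARDENED_BIT = 0x80000000  # 2**31

def to_bits(value: int, hardened: bool) -> int:
    """Pack a 31-bit index into the 32-bit BIP-32 serialization form."""
    assert 0 <= value < HARDENED_BIT
    return value | HARDENED_BIT if hardened else value

def from_bits(bits: int):
    """Recover (value, hardened) from the 32-bit form."""
    return bits & 0x7FFFFFFF, bool(bits & HARDENED_BIT)

# 44' (hardened) and plain 44 differ only in the top bit
print(hex(to_bits(44, True)))   # 0x8000002c
print(hex(to_bits(44, False)))  # 0x2c
```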
Hardened vs Normal Derivation
Derivation Security Properties
══════════════════════════════════════════════════════════════════
┌──────────────┬─────────────────┬─────────────────────────────────┐
│ Type │ Index Range │ Security Property │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Normal │ 0 to 2³¹-1 │ Public derivation possible │
│ │ (0, 1, 2) │ Child pubkey from parent pubkey │
├──────────────┼─────────────────┼─────────────────────────────────┤
│ Hardened │ 2³¹ to 2³²-1 │ Requires private key │
│ │ (0', 1', 2') │ Prevents key leakage │
└──────────────┴─────────────────┴─────────────────────────────────┘
Why Hardened Matters:
─────────────────────────────────────────────────────────────────
If attacker obtains:
- Child private key (leaked)
- Parent chain code (public in xpub)
With normal derivation: Attacker can compute parent private key!
With hardened derivation: Parent key remains secure
Derivation Path
Paths encode the key tree location:
const DerivationPath = struct {
indices: []const ChildIndex,
const PURPOSE: ChildIndex = ChildIndex.hardened(44);
const ERG_COIN_TYPE: ChildIndex = ChildIndex.hardened(429);
const CHANGE_EXTERNAL: ChildIndex = ChildIndex.normal(0);
/// Create EIP-3 compliant path: m/44'/429'/account'/0/address
pub fn eip3(account: u31, address: u31) DerivationPath {
return .{
.indices = &[_]ChildIndex{
PURPOSE,
ERG_COIN_TYPE,
ChildIndex.hardened(account),
CHANGE_EXTERNAL,
ChildIndex.normal(address),
},
};
}
/// Master path (empty)
pub fn master() DerivationPath {
return .{ .indices = &[_]ChildIndex{} };
}
pub fn depth(self: *const DerivationPath) usize {
return self.indices.len;
}
/// Extend path with new index
pub fn extend(self: *const DerivationPath, index: ChildIndex, allocator: Allocator) !DerivationPath {
var new_indices = try allocator.alloc(ChildIndex, self.indices.len + 1);
@memcpy(new_indices[0..self.indices.len], self.indices);
new_indices[self.indices.len] = index;
return .{ .indices = new_indices };
}
/// Increment last index
pub fn next(self: *const DerivationPath, allocator: Allocator) !DerivationPath {
if (self.indices.len == 0) return error.EmptyPath;
var new_indices = try allocator.dupe(ChildIndex, self.indices);
const last = &new_indices[new_indices.len - 1];
last.* = switch (last.*) {
.hardened => |h| ChildIndex.hardened(h.value + 1),
.normal => |n| ChildIndex.normal(n.value + 1),
};
return .{ .indices = new_indices };
}
};
Path Parsing and Display
const PathParser = struct {
pub fn parse(path_str: []const u8, allocator: Allocator) !DerivationPath {
var indices = std.ArrayList(ChildIndex).init(allocator);
var iter = std.mem.splitScalar(u8, path_str, '/');
// First element must be 'm' or 'M'
const master = iter.next() orelse return error.EmptyPath;
if (!std.mem.eql(u8, master, "m") and !std.mem.eql(u8, master, "M")) {
return error.InvalidMasterPrefix;
}
while (iter.next()) |segment| {
const is_hardened = std.mem.endsWith(u8, segment, "'");
const num_str = if (is_hardened)
segment[0 .. segment.len - 1]
else
segment;
const value = try std.fmt.parseInt(u31, num_str, 10);
const index = if (is_hardened)
ChildIndex.hardened(value)
else
ChildIndex.normal(value);
try indices.append(index);
}
return .{ .indices = try indices.toOwnedSlice() };
}
pub fn format(path: *const DerivationPath, writer: anytype) !void {
try writer.writeAll("m");
for (path.indices) |index| {
try writer.writeAll("/");
switch (index) {
.hardened => |h| try writer.print("{}'", .{h.value}),
.normal => |n| try writer.print("{}", .{n.value}),
}
}
}
};
EIP-3 Derivation Standard
Ergo's EIP-3 defines the derivation structure:
EIP-3 Path Structure
══════════════════════════════════════════════════════════════════
m / 44' / 429' / account' / change / address
│ │ │ │ │ │
│ │ │ │ │ └── Address Index (normal)
│ │ │ │ └─────────── Change: 0=external, 1=internal
│ │ │ └───────────────────── Account Index (hardened)
│ │ └────────────────────────────── Coin Type: 429 (Ergo)
│ └───────────────────────────────────── Purpose: BIP-44
└────────────────────────────────────────── Master private key
Examples:
m/44'/429'/0'/0/0 First address, first account
m/44'/429'/0'/0/1 Second address, first account
m/44'/429'/1'/0/0 First address, second account
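The path parser and formatter shown earlier are mutually inverse. A compact Python equivalent makes the round-trip property easy to test; tuples of (value, hardened) stand in for ChildIndex:

```python
def parse_path(s: str):
    """Parse "m/44'/429'/0'/0/0" into a list of (index, hardened) pairs."""
    head, *segments = s.split("/")
    if head not in ("m", "M"):
        raise ValueError("path must start with m")
    out = []
    for seg in segments:
        hardened = seg.endswith("'")
        out.append((int(seg.rstrip("'")), hardened))
    return out

def format_path(indices) -> str:
    """Format (index, hardened) pairs back into path notation."""
    parts = ["m"]
    for value, hardened in indices:
        parts.append(f"{value}'" if hardened else str(value))
    return "/".join(parts)

eip3 = [(44, True), (429, True), (0, True), (0, False), (0, False)]
print(format_path(eip3))  # m/44'/429'/0'/0/0
```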
Extended Secret Key
Extended keys pair key material with chain code:
const ExtSecretKey = struct {
key_bytes: [32]u8, // Private key scalar
chain_code: [32]u8, // Chain code for derivation
path: DerivationPath,
const BITCOIN_SEED = "Bitcoin seed";
/// Derive master key from seed
pub fn deriveMaster(seed: []const u8) !ExtSecretKey {
var hmac = HmacSha512.init(BITCOIN_SEED);
hmac.update(seed);
var output: [64]u8 = undefined;
hmac.final(&output);
return ExtSecretKey{
.key_bytes = output[0..32].*,
.chain_code = output[32..64].*,
.path = DerivationPath.master(),
};
}
/// Get public image (ProveDlog)
pub fn publicImage(self: *const ExtSecretKey) ProveDlog {
const scalar = Scalar.fromBytes(self.key_bytes);
const point = CryptoConstants.generator.mul(scalar);
return ProveDlog{ .h = point };
}
/// Get corresponding extended public key
pub fn publicKey(self: *const ExtSecretKey) !ExtPubKey {
return ExtPubKey{
.key_bytes = self.publicImage().compress(),
.chain_code = self.chain_code,
.path = self.path,
};
}
/// Zero out key material
/// SECURITY: In production, use volatile write or std.crypto.utils.secureZero
/// to prevent compiler optimization from eliding the zeroing.
pub fn zeroSecret(self: *ExtSecretKey) void {
std.crypto.utils.secureZero(u8, &self.key_bytes);
}
};
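deriveMaster is nothing more than a single HMAC-SHA512 keyed with the ASCII string "Bitcoin seed", split into key and chain code halves. This can be checked directly against BIP-32 test vector 1 in a few lines of Python:

```python
import hmac
import hashlib

def derive_master(seed: bytes):
    """Left half is the master key scalar, right half the chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]

# BIP-32 test vector 1: seed 000102030405060708090a0b0c0d0e0f
key, chain = derive_master(bytes.fromhex("000102030405060708090a0b0c0d0e0f"))
print(key.hex())    # e8f32e72...
print(chain.hex())  # 873dff81...
```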
Child Key Derivation
BIP-32 child derivation algorithm:
pub fn deriveChild(parent: *const ExtSecretKey, index: ChildIndex, allocator: Allocator) !ExtSecretKey {
var hmac = HmacSha512.init(&parent.chain_code);
// HMAC input depends on derivation type
switch (index) {
.hardened => {
// Hardened: 0x00 || parent_key (33 bytes)
hmac.update(&[_]u8{0x00});
hmac.update(&parent.key_bytes);
},
.normal => {
// Normal: parent_public_key (33 bytes compressed)
const pub_key = parent.publicImage().compress();
hmac.update(&pub_key);
},
}
// Append index as big-endian u32
var index_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
hmac.update(&index_bytes);
var output: [64]u8 = undefined;
hmac.final(&output);
// Parse left 32 bytes as scalar
const child_key_proto = Scalar.fromBytes(output[0..32].*);
// Check validity (must be < group order)
if (child_key_proto.isOverflow()) {
return deriveChild(parent, index.next(), allocator);
}
// child_key = (child_key_proto + parent_key) mod n
const parent_scalar = Scalar.fromBytes(parent.key_bytes);
const child_scalar = child_key_proto.add(parent_scalar);
// Check for zero (invalid)
if (child_scalar.isZero()) {
return deriveChild(parent, index.next(), allocator);
}
return ExtSecretKey{
.key_bytes = child_scalar.toBytes(),
.chain_code = output[32..64].*,
.path = try parent.path.extend(index, allocator),
};
}
/// Derive key at full path
pub fn derive(master: *const ExtSecretKey, path: DerivationPath, allocator: Allocator) !ExtSecretKey {
var current = master.*;
for (path.indices) |index| {
current = try deriveChild(&current, index, allocator);
}
return current;
}
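For hardened children the whole derivation is HMAC plus scalar addition modulo the secp256k1 group order — no elliptic-curve operations are needed. The sketch below reproduces the m/0' step of BIP-32 test vector 1; the overflow and zero retry paths are reduced to an assert for brevity:

```python
import hmac
import hashlib

# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_BIT = 0x80000000

def derive_hardened(parent_key: bytes, chain_code: bytes, index: int):
    """One hardened BIP-32 step: HMAC(cc, 0x00 || k_par || i), then add mod N."""
    data = b"\x00" + parent_key + (index | HARDENED_BIT).to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    il = int.from_bytes(digest[:32], "big")
    child = (il + int.from_bytes(parent_key, "big")) % N
    assert 0 < il < N and child != 0  # real code retries with the next index
    return child.to_bytes(32, "big"), digest[32:]

# BIP-32 test vector 1 master key and chain code
master = bytes.fromhex("e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35")
cc = bytes.fromhex("873dff81c02f525623fd1fe5167eac3a55a049de3d314bb42ee227ffed37d508")
child_key, child_cc = derive_hardened(master, cc, 0)
print(child_key.hex())  # edb2e14f... (test vector 1, m/0')
```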
Extended Public Key
Public key derivation (non-hardened only):
const ExtPubKey = struct {
key_bytes: [33]u8, // Compressed public key
chain_code: [32]u8,
path: DerivationPath,
pub fn deriveChild(parent: *const ExtPubKey, index: ChildIndex, allocator: Allocator) !ExtPubKey {
// Cannot derive hardened children from public key
if (index.isHardened()) {
return error.HardenedDerivationRequiresPrivateKey;
}
var hmac = HmacSha512.init(&parent.chain_code);
hmac.update(&parent.key_bytes);
var index_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &index_bytes, index.toBits(), .big);
hmac.update(&index_bytes);
var output: [64]u8 = undefined;
hmac.final(&output);
const child_key_proto = Scalar.fromBytes(output[0..32].*);
if (child_key_proto.isOverflow()) {
return deriveChild(parent, index.next(), allocator);
}
// child_public = point(child_key_proto) + parent_public
const proto_point = CryptoConstants.generator.mul(child_key_proto);
const parent_point = Point.decompress(parent.key_bytes);
const child_point = proto_point.add(parent_point);
if (child_point.isInfinity()) {
return deriveChild(parent, index.next(), allocator);
}
return ExtPubKey{
.key_bytes = child_point.compress(),
.chain_code = output[32..64].*,
.path = try parent.path.extend(index, allocator),
};
}
};
Mnemonic to Seed
const Mnemonic = struct {
const PBKDF2_ITERATIONS: u32 = 2048;
const SEED_LENGTH: usize = 64;
/// Convert mnemonic phrase to seed using PBKDF2-HMAC-SHA512
pub fn toSeed(phrase: []const u8, passphrase: []const u8) [SEED_LENGTH]u8 {
var seed: [SEED_LENGTH]u8 = undefined;
// Normalize using NFKD
const normalized_phrase = normalizeNfkd(phrase);
const normalized_pass = normalizeNfkd(passphrase);
// Salt = "mnemonic" + passphrase
var salt_buf: [256]u8 = undefined;
const salt = std.fmt.bufPrint(&salt_buf, "mnemonic{s}", .{normalized_pass}) catch unreachable;
// PBKDF2-HMAC-SHA512
pbkdf2(
HmacSha512,
normalized_phrase,
salt,
PBKDF2_ITERATIONS,
&seed,
);
return seed;
}
};
/// Full derivation from mnemonic to key
pub fn mnemonicToKey(
phrase: []const u8,
passphrase: []const u8,
path: DerivationPath,
allocator: Allocator,
) !ExtSecretKey {
const seed = Mnemonic.toSeed(phrase, passphrase);
const master = try ExtSecretKey.deriveMaster(&seed);
return derive(&master, path, allocator);
}
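toSeed maps directly onto the standard-library PBKDF2 in most languages. In Python, hashlib.pbkdf2_hmac reproduces the BIP-39 test vector for the well-known all-abandon phrase (empty passphrase):

```python
import hashlib
import unicodedata

def mnemonic_to_seed(phrase: str, passphrase: str = "") -> bytes:
    """BIP-39: PBKDF2-HMAC-SHA512, 2048 iterations, salt = "mnemonic" + passphrase."""
    phrase_n = unicodedata.normalize("NFKD", phrase)
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode()
    return hashlib.pbkdf2_hmac("sha512", phrase_n.encode(), salt, 2048, dklen=64)

phrase = "abandon " * 11 + "about"
seed = mnemonic_to_seed(phrase)
print(seed.hex())  # 5eb00bbd...
```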
Path Serialization
Binary format for storage/transfer:
const DerivationPathSerializer = struct {
pub fn serialize(path: *const DerivationPath, writer: anytype) !void {
// Public branch flag (0x00 for private, 0x01 for public)
try writer.writeByte(0x00);
// Depth
try writer.writeInt(u32, @intCast(path.indices.len), .little);
// Each index as 4-byte big-endian
for (path.indices) |index| {
var bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &bytes, index.toBits(), .big);
try writer.writeAll(&bytes);
}
}
pub fn parse(reader: anytype, allocator: Allocator) !DerivationPath {
const public_branch = try reader.readByte();
_ = public_branch; // TODO: handle public branch
const depth = try reader.readInt(u32, .little);
var indices = try allocator.alloc(ChildIndex, depth);
for (0..depth) |i| {
var bytes: [4]u8 = undefined;
try reader.readNoEof(&bytes);
const bits = std.mem.readInt(u32, &bytes, .big);
indices[i] = if (bits & 0x80000000 != 0)
ChildIndex.hardened(@truncate(bits & 0x7FFFFFFF))
else
ChildIndex.normal(@truncate(bits));
}
return .{ .indices = indices };
}
};
Watch-Only Wallet
Public key derivation enables watch-only wallets:
Watch-Only Wallet Setup
══════════════════════════════════════════════════════════════════
Full Wallet (has secrets) Watch-Only Wallet (no secrets)
───────────────────────── ──────────────────────────────
Master Secret Key
│
├── m/44'/429'/0' Extended Public Key
│ (hardened account) ───▶ at m/44'/429'/0'/0
│ │ │
│ └── m/44'/429'/0'/0 ├── Address 0 public
│ (change branch) ───▶ ├── Address 1 public
│ │ ├── Address 2 public
│ ├── 0 └── ... (can derive more)
│ ├── 1
│ └── 2 Cannot derive:
│ × Account 1 keys
× Hardened children
× Private keys
Export at: m/44'/429'/0'/0 (parent of address keys)
Can derive: All non-hardened children (addresses)
Cannot derive: Hardened children, private keys
Usage Example
const allocator = std.heap.page_allocator;
// 1. From mnemonic to master key
const mnemonic = "abandon abandon abandon abandon abandon abandon " ++
"abandon abandon abandon abandon abandon about";
const seed = Mnemonic.toSeed(mnemonic, "");
var master = try ExtSecretKey.deriveMaster(&seed);
defer master.zeroSecret();
// 2. Derive first EIP-3 address key
const path = DerivationPath.eip3(0, 0); // m/44'/429'/0'/0/0
var first_key = try derive(&master, path, allocator);
defer first_key.zeroSecret();
// 3. Get public image for address
const pub_key = first_key.publicImage();
// 4. Derive next address
const next_path = try path.next(allocator);
var second_key = try derive(&master, next_path, allocator);
defer second_key.zeroSecret();
// 5. Create watch-only wallet
const watch_only_path = try PathParser.parse("m/44'/429'/0'/0", allocator);
var account_key = try derive(&master, watch_only_path, allocator);
const watch_only = try account_key.publicKey();
// 6. Derive address public keys without secrets
const addr0_pub = try watch_only.deriveChild(ChildIndex.normal(0), allocator);
const addr1_pub = try watch_only.deriveChild(ChildIndex.normal(1), allocator);
// 7. Cannot derive hardened from public key
_ = watch_only.deriveChild(ChildIndex.hardened(0), allocator) catch |err| {
std.debug.assert(err == error.HardenedDerivationRequiresPrivateKey);
};
Security Considerations
Key Derivation Security
══════════════════════════════════════════════════════════════════
Attack: Child + Chain Code → Parent ⚠️ PRACTICAL ATTACK
────────────────────────────────────────────────────────
This is NOT theoretical - a single compromised child key
(via malware, hardware fault, or insider threat) can
recover the entire account if normal derivation was used.
Given:
- Child private key k_i
- Parent chain code c
For NORMAL derivation:
HMAC-SHA512(c, K_parent || i) = IL || IR
k_i = IL + k_parent mod n
Attacker can compute:
k_parent = k_i - IL mod n ← COMPROMISED!
For HARDENED derivation:
HMAC-SHA512(c, 0x00 || k_parent || i) = IL || IR
Cannot compute IL without knowing k_parent
→ Parent key remains SECURE
Recommendation:
└── Always use hardened derivation for account/purpose levels
└── Normal derivation only for address indices
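The recovery step is plain modular arithmetic. The sketch below uses made-up numbers for the parent key and IL (in a real attack, IL is recomputed from the parent public key, chain code, and child index via HMAC-SHA512), but the algebra is exactly the one shown above:

```python
# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Hypothetical values for illustration only
k_parent = 0x1111222233334444
il = 0xAAAABBBBCCCCDDDD  # attacker recomputes this from xpub + chain code + index

k_child = (il + k_parent) % N    # normal derivation: child = IL + parent mod n
recovered = (k_child - il) % N   # attacker's computation: parent = child - IL mod n
assert recovered == k_parent
print("parent key recovered from leaked child key")
```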
Summary
- BIP-32 defines hierarchical deterministic key derivation
- Derivation paths use the notation m/44'/429'/0'/0/0
- Hardened derivation (') requires the private key and prevents key leakage
- Normal derivation allows deriving child public keys from the parent public key
- EIP-3 standardizes Ergo's path: m/44'/429'/account'/change/address
- Extended keys = key material (32 bytes) + chain code (32 bytes)
- Watch-only wallets use extended public keys for address generation
Next: Chapter 29: Soft-Fork Mechanism
Scala: ExtendedSecretKey.scala
Rust: ext_secret_key.rs:29-37
Scala: Index.scala:5-16
Scala: DerivationPath.scala:10-29
Scala: Constants.scala:31-36
Rust: derivation_path.rs:88-91 (PURPOSE, ERG, CHANGE constants)
Rust: ext_secret_key.rs:60-112
Rust: ext_pub_key.rs
Scala: JavaHelpers.scala:282-301
Rust: mnemonic.rs:20-37
Scala: DerivationPath.scala:133-147
Rust: derivation_path.rs:235-241 (ledger_bytes)
Chapter 29: Soft-Fork Mechanism
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 3 for ErgoTree version field and header format
- Chapter 7 for serialization framework
- Chapter 24 for validation rules
Learning Objectives
By the end of this chapter, you will be able to:
- Explain version context and how script versioning enables protocol upgrades
- Implement validation rules with configurable status (enabled, disabled, soft-fork)
- Handle unknown opcodes gracefully to support future soft-forks
- Describe the transition from AOT (Ahead-of-Time) to JIT (Just-in-Time) costing
Version Context Architecture
The soft-fork mechanism enables protocol upgrades without breaking consensus:
Soft-Fork Version Architecture
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Block Header │
│ │
│ Block Version: 1, 2, 3, 4 │
│ │
│ Activated Script Version = Block Version - 1 │
└────────────────────────┬────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ErgoTree Header │
│ │
│ 7 6 5 4 3 2 1 0 │
│ ├───┼───┼───┼───┼───┼───┼───┤ │
│ │ M │ G │ C │ S │ Z │ V │ V │ V │
│ └───┴───┴───┴───┴───┴───┴───┘ │
│ M = More bytes follow │
│ G = GZIP (reserved) │
│ C = Context costing (reserved) │
│ S = Constant segregation │
│ Z = Size included │
│ V = Version (0-7) │
└─────────────────────────────────────────────────────────────────┘
ErgoTree Version
Script version is encoded in header bits 0-2:
const ErgoTreeVersion = struct {
value: u3, // 0-7
const VERSION_MASK: u8 = 0x07;
/// Version 0 - Initial mainnet (v3.x)
pub const V0 = ErgoTreeVersion{ .value = 0 };
/// Version 1 - Height monotonicity (v4.x)
pub const V1 = ErgoTreeVersion{ .value = 1 };
/// Version 2 - JIT interpreter (v5.x)
pub const V2 = ErgoTreeVersion{ .value = 2 };
/// Version 3 - Sub-blocks, new ops (v6.x)
pub const V3 = ErgoTreeVersion{ .value = 3 };
/// Maximum supported script version
pub const MAX_SCRIPT_VERSION = V3;
/// Parse version from header byte
pub fn parseVersion(header_byte: u8) ErgoTreeVersion {
return .{ .value = @truncate(header_byte & VERSION_MASK) };
}
pub fn toU8(self: ErgoTreeVersion) u8 {
return @as(u8, self.value);
}
};
ErgoTree Header
Header byte encoding with flags:
const ErgoTreeHeader = struct {
version: ErgoTreeVersion,
is_constant_segregation: bool,
has_size: bool,
const CONSTANT_SEGREGATION_FLAG: u8 = 0b0001_0000;
const HAS_SIZE_FLAG: u8 = 0b0000_1000;
/// Parse header from byte
pub fn parse(header_byte: u8) !ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.parseVersion(header_byte),
.is_constant_segregation = (header_byte & CONSTANT_SEGREGATION_FLAG) != 0,
.has_size = (header_byte & HAS_SIZE_FLAG) != 0,
};
}
/// Serialize header to byte
pub fn serialize(self: *const ErgoTreeHeader) u8 {
var header_byte: u8 = self.version.toU8();
if (self.is_constant_segregation) {
header_byte |= CONSTANT_SEGREGATION_FLAG;
}
if (self.has_size) {
header_byte |= HAS_SIZE_FLAG;
}
return header_byte;
}
/// Create v0 header
pub fn v0(constant_segregation: bool) ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.V0,
.is_constant_segregation = constant_segregation,
.has_size = false,
};
}
/// Create v1 header (size is mandatory)
pub fn v1(constant_segregation: bool) ErgoTreeHeader {
return .{
.version = ErgoTreeVersion.V1,
.is_constant_segregation = constant_segregation,
.has_size = true,
};
}
};
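The header bit layout can be sanity-checked in a few lines of Python. Note that the miner-fee tree in Chapter 27 begins with byte 0x10: version 0 with segregated constants and no size field.

```python
VERSION_MASK = 0x07
CONST_SEGREGATION_FLAG = 0x10
HAS_SIZE_FLAG = 0x08

def parse_header(b: int) -> dict:
    """Decode an ErgoTree header byte into its version and flags."""
    return {
        "version": b & VERSION_MASK,
        "const_segregation": bool(b & CONST_SEGREGATION_FLAG),
        "has_size": bool(b & HAS_SIZE_FLAG),
    }

print(parse_header(0x10))  # {'version': 0, 'const_segregation': True, 'has_size': False}
# v1 trees must carry the size flag: 0b0001_1001 = 0x19
print(parse_header(0x19))  # {'version': 1, 'const_segregation': True, 'has_size': True}
```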
Version Context
Thread-local context tracks activated and tree versions:
const VersionContext = struct {
activated_version: u8,
ergo_tree_version: u8,
/// JIT costing activation version (v5.0)
const JIT_ACTIVATION_VERSION: u8 = 2;
/// v6.0 soft-fork version
const V6_SOFT_FORK_VERSION: u8 = 3;
pub fn init(activated: u8, tree: u8) !VersionContext {
// ergoTreeVersion must never exceed activatedVersion
if (activated >= JIT_ACTIVATION_VERSION and tree > activated) {
return error.InvalidVersionContext;
}
return .{
.activated_version = activated,
.ergo_tree_version = tree,
};
}
/// True if JIT costing is activated (v5.0+)
pub fn isJitActivated(self: *const VersionContext) bool {
return self.activated_version >= JIT_ACTIVATION_VERSION;
}
/// True if v6.0 protocol is activated
pub fn isV6Activated(self: *const VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
/// True if v3+ ErgoTree version
pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
}
};
/// Thread-local version context
threadlocal var current_context: ?VersionContext = null;
pub fn withVersions(
activated: u8,
tree: u8,
comptime block: fn (*VersionContext) anyerror!void,
) !void {
var ctx = try VersionContext.init(activated, tree);
const prev = current_context;
current_context = ctx;
defer current_context = prev;
try block(&ctx);
}
pub fn currentContext() !VersionContext {
return current_context orelse error.VersionContextNotSet;
}
Version History
Protocol Version History
══════════════════════════════════════════════════════════════════
┌─────────────┬────────────────┬──────────────┬────────────────────┐
│ Block Ver │ Script Ver │ Protocol │ Features │
├─────────────┼────────────────┼──────────────┼────────────────────┤
│ 1 │ 0 │ v3.x │ Initial mainnet │
│ 2 │ 1 │ v4.x │ Height monotonicity│
│ 3 │ 2 │ v5.x │ JIT interpreter │
│ 4 │ 3 │ v6.x │ Sub-blocks, new ops│
└─────────────┴────────────────┴──────────────┴────────────────────┘
Relation: activated_script_version = block_version - 1
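The version relation and the activation thresholds reduce to two small checks, sketched here in Python against the table above:

```python
JIT_ACTIVATION_VERSION = 2  # script v2 == protocol v5.0
V6_SOFT_FORK_VERSION = 3    # script v3 == protocol v6.0

def activated_script_version(block_version: int) -> int:
    """Activated script version is always one less than the block version."""
    return block_version - 1

def is_jit_activated(block_version: int) -> bool:
    return activated_script_version(block_version) >= JIT_ACTIVATION_VERSION

for bv, proto in [(1, "v3.x"), (2, "v4.x"), (3, "v5.x"), (4, "v6.x")]:
    sv = activated_script_version(bv)
    print(f"block {bv} -> script {sv} ({proto}), JIT: {is_jit_activated(bv)}")
```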
Rule Status
Validation rules have configurable status:
const RuleStatus = union(enum) {
/// Default: rule is active and enforced
enabled,
/// Rule is disabled (via voting)
disabled,
/// Rule replaced by new rule
replaced: struct { new_rule_id: u16 },
/// Rule parameters changed
changed: struct { new_value: []const u8 },
const StatusCode = enum(u8) {
enabled = 1,
disabled = 2,
replaced = 3,
changed = 4,
};
pub fn statusCode(self: RuleStatus) StatusCode {
return switch (self) {
.enabled => .enabled,
.disabled => .disabled,
.replaced => .replaced,
.changed => .changed,
};
}
};
Validation Rules
Rules define validation behavior with soft-fork support:
const ValidationRule = struct {
id: u16,
description: []const u8,
soft_fork_checker: SoftForkChecker,
checked: bool = false,
pub fn checkRule(self: *ValidationRule, settings: *const ValidationSettings) !void {
if (!self.checked) {
if (settings.getStatus(self.id) == null) {
return error.ValidationRuleNotFound;
}
self.checked = true;
}
}
pub fn throwValidationException(
self: *const ValidationRule,
cause: anyerror,
args: []const u8,
) ValidationError {
return ValidationError{
.rule = self,
.args = args,
.cause = cause,
};
}
};
const ValidationError = struct {
rule: *const ValidationRule,
args: []const u8,
cause: anyerror,
};
Core Validation Rules
const ValidationRules = struct {
const FIRST_RULE_ID: u16 = 1000;
/// Check primitive type code is valid
pub const CheckPrimitiveTypeCode = ValidationRule{
.id = 1007,
.description = "Check primitive type code is supported or added via soft-fork",
.soft_fork_checker = .code_added,
};
/// Check non-primitive type code is valid
pub const CheckTypeCode = ValidationRule{
.id = 1008,
.description = "Check non-primitive type code is supported or added via soft-fork",
.soft_fork_checker = .code_added,
};
/// Check data can be serialized for type
pub const CheckSerializableTypeCode = ValidationRule{
.id = 1009,
.description = "Check data values of type can be serialized",
.soft_fork_checker = .when_replaced,
};
/// Check reader position limit
pub const CheckPositionLimit = ValidationRule{
.id = 1014,
.description = "Check Reader position limit",
.soft_fork_checker = .when_replaced,
};
};
Soft-Fork Checkers
Detect soft-fork conditions from validation failures:
const SoftForkChecker = enum {
none,
when_replaced,
code_added,
pub fn isSoftFork(
self: SoftForkChecker,
settings: *const ValidationSettings,
rule_id: u16,
status: RuleStatus,
args: []const u8,
) bool {
return switch (self) {
.none => false,
.when_replaced => switch (status) {
.replaced => true,
else => false,
},
.code_added => switch (status) {
.changed => |c| std.mem.indexOf(u8, c.new_value, args) != null,
else => false,
},
};
}
};
Validation Settings
Configurable settings from blockchain state1516:
const ValidationSettings = struct {
rules: std.AutoHashMap(u16, struct { rule: *ValidationRule, status: RuleStatus }),
pub fn getStatus(self: *const ValidationSettings, id: u16) ?RuleStatus {
if (self.rules.get(id)) |entry| {
return entry.status;
}
return null;
}
pub fn updated(self: *const ValidationSettings, id: u16, new_status: RuleStatus) !ValidationSettings {
var new_rules = try self.rules.clone();
if (new_rules.getPtr(id)) |entry| {
entry.status = new_status;
}
return .{ .rules = new_rules };
}
/// Check if exception represents a soft-fork condition
pub fn isSoftFork(self: *const ValidationSettings, ve: ValidationError) bool {
const entry = self.rules.get(ve.rule.id) orelse return false;
// Don't tolerate replaced v5.0 rules after v6.0 activation
switch (entry.status) {
.replaced => {
const ctx = currentContext() catch return false;
if (ctx.isV6Activated() and
(ve.rule.id == 1011 or ve.rule.id == 1007 or ve.rule.id == 1008))
{
return false;
}
return true;
},
else => return entry.rule.soft_fork_checker.isSoftFork(
self,
ve.rule.id,
entry.status,
ve.args,
),
}
}
};
Soft-Fork Execution Wrapper
Execute code with soft-fork fallback:
pub fn trySoftForkable(
comptime T: type,
settings: *const ValidationSettings,
when_soft_fork: T,
context: anytype,
block: anytype, // fn (@TypeOf(context)) anyerror!T
last_error: *const ?ValidationError,
) !T {
return block(context) catch |err| {
// Zig error values carry no payload, so the failing rule's details
// are reported out-of-band through `last_error` (set by the block
// before it returns an error).
if (last_error.*) |ve| {
if (settings.isSoftFork(ve)) {
return when_soft_fork;
}
}
return err;
};
}
// Usage: handling unknown opcodes
fn deserializeValue(
reader: *Reader,
settings: *const ValidationSettings,
last_error: *const ?ValidationError,
) !Value {
return trySoftForkable(
Value,
settings,
// Soft-fork fallback: return unit placeholder
Value.unit,
reader,
struct {
fn parse(r: *Reader) anyerror!Value {
const op_code = try r.readByte();
const serializer = getSerializer(op_code) orelse
return error.UnknownOpCode;
return serializer.parse(r);
}
}.parse,
last_error,
);
}
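The control flow is easier to see in a language whose exceptions carry payloads. This Python sketch uses a hypothetical Settings type and mirrors the extension-voting example later in this chapter, where rule 1007 has codes 0x5A/0x5B voted in:

```python
class ValidationError(Exception):
    """Carries the failing rule id and the offending argument (e.g. an opcode)."""
    def __init__(self, rule_id: int, arg: int):
        self.rule_id, self.arg = rule_id, arg

class Settings:
    """Hypothetical settings: a set of type/op codes added via soft-fork vote."""
    def __init__(self, allowed_new_codes):
        self.allowed = allowed_new_codes
    def is_soft_fork(self, ve: ValidationError) -> bool:
        # CheckPrimitiveTypeCode-style rule: tolerate codes added by voting
        return ve.rule_id == 1007 and ve.arg in self.allowed

def try_soft_forkable(settings, when_soft_fork, block):
    """Run block(); swallow validation failures recognized as soft-fork."""
    try:
        return block()
    except ValidationError as ve:
        if settings.is_soft_fork(ve):
            return when_soft_fork
        raise

settings = Settings(allowed_new_codes={0x5A, 0x5B})

def parse_unknown():
    raise ValidationError(1007, 0x5A)  # code 0x5A unknown to this node

result = try_soft_forkable(settings, "unit-placeholder", parse_unknown)
print(result)  # unit-placeholder
```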
AOT to JIT Transition
Script Validation Rules Across Versions
══════════════════════════════════════════════════════════════════
Rule │ Block │ Block Type │ Script │ v4.0 Action │ v5.0 Action
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
1 │ 1,2 │ candidate │ v0/v1 │ AOT-cost,verify │ AOT-cost,verify
2 │ 1,2 │ mined │ v0/v1 │ AOT-cost,verify │ AOT-cost,verify
3 │ 1,2 │ candidate │ v2 │ skip-pool-tx │ skip-pool-tx
4 │ 1,2 │ mined │ v2 │ skip-reject │ skip-reject
─────┼───────┼────────────┼────────┼─────────────────┼─────────────
5 │ 3 │ candidate │ v0/v1 │ skip-pool-tx │ JIT-verify
6 │ 3 │ mined │ v0/v1 │ skip-accept │ JIT-verify
7 │ 3 │ candidate │ v2 │ skip-pool-tx │ JIT-verify
8 │ 3 │ mined │ v2 │ skip-accept │ JIT-verify
Actions:
AOT-cost,verify Cost estimation + verification using v4.0 AOT
JIT-verify Verification using v5.0 JIT interpreter
skip-pool-tx Skip mempool transaction (can't handle)
skip-accept Accept block without verification (trust majority)
skip-reject Reject transaction/block (invalid version)
Consensus Equivalence Properties
For safe transition between interpreter versions:
// Property 1: AOT-verify preserved between releases
// forall s:ScriptV0/V1, R4.0-AOT-verify(s) == R5.0-AOT-verify(s)
// Property 2: AOT-cost preserved
// forall s:ScriptV0/V1, R4.0-AOT-cost(s) == R5.0-AOT-cost(s)
// Property 3: JIT can replace AOT
// forall s:ScriptV0/V1, R5.0-JIT-verify(s) == R4.0-AOT-verify(s)
// Property 4: JIT cost bounded by AOT
// forall s:ScriptV0/V1, R5.0-JIT-cost(s) <= R4.0-AOT-cost(s)
// Property 5: ScriptV2 rejected before soft-fork
// forall s:ScriptV2, if not SF active => reject
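These properties lend themselves to differential testing. The sketch below checks Property 3 directly; it assumes a hypothetical `corpus` fixture of pre-v2 scripts plus the `verifyAot`/`verifyJit` entry points shown later in this chapter:

```zig
const std = @import("std");

test "Property 3: JIT matches AOT on v0/v1 scripts" {
    // `corpus` is an assumed fixture of (tree, ctx) pairs
    for (corpus.v0_v1_scripts) |s| {
        const aot = try verifyAot(&s.tree, &s.ctx);
        const jit = try verifyJit(&s.tree, &s.ctx);
        // Verification outcomes must be identical for consensus safety
        try std.testing.expectEqual(aot, jit);
    }
}
```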
Version-Aware Interpreter
pub fn verify(
ergo_tree: *const ErgoTree,
ctx: *const Context,
) !bool {
const script_version = ergo_tree.header.version;
const activated_version = ctx.activatedScriptVersion();
// Execute with proper version context
var version_ctx = try VersionContext.init(
activated_version.toU8(),
script_version.toU8(),
);
const prev = current_context;
current_context = version_ctx;
defer current_context = prev;
// Version-specific execution
if (version_ctx.isJitActivated()) {
return verifyJit(ergo_tree, ctx);
} else {
return verifyAot(ergo_tree, ctx);
}
}
fn verifyJit(tree: *const ErgoTree, ctx: *const Context) !bool {
const reduced = try fullReduction(tree, ctx);
return verifySignature(reduced, ctx.messageToSign());
}
fn verifyAot(tree: *const ErgoTree, ctx: *const Context) !bool {
// Legacy AOT interpreter path
const result = try aotEvaluate(tree, ctx);
return verifySignature(result, ctx.messageToSign());
}
Block Extension Voting
Rule status changes via blockchain extension voting:
Extension Voting Flow
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────────┐
│ Block Extension Section │
│ │
│ Key (2 bytes) │ Value │
│ ─────────────────┼─────────────────────────────────────────────── │
│ Rule ID │ RuleStatus + data │
│ 0x03EF (1007) │ ChangedRule([0x5A, 0x5B]) │
│ │ (new opcodes 0x5A, 0x5B allowed) │
└────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ Voting Epoch │
│ │
│ Epoch 1: □ □ □ ■ □ ■ ■ □ ■ ■ (5/10 = 50%) │
│ Epoch 2: ■ ■ □ ■ ■ ■ □ ■ ■ ■ (8/10 = 80%) │
│ Epoch 3: ■ ■ ■ ■ ■ □ ■ ■ ■ ■ (9/10 = 90%) → ACTIVATED │
└────────────────────────────────────────────────────────────────────┘
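The activation threshold in the diagram reduces to a small tally check. This is an illustrative sketch, not the node's actual voting logic; all names are assumptions:

```zig
const VoteTally = struct {
    votes_for: u32,
    epoch_length: u32,

    /// True once votes reach the threshold (90% in the diagram above).
    /// Integer arithmetic avoids floating-point rounding at the boundary.
    pub fn isActivated(self: VoteTally, threshold_percent: u32) bool {
        return self.votes_for * 100 >= self.epoch_length * threshold_percent;
    }
};
```

With epoch 3 above, `.{ .votes_for = 9, .epoch_length = 10 }` activates at a 90% threshold, while epoch 2's 8/10 does not.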
New Opcode Addition:
1. Before soft-fork: Unknown opcode → ValidationException
2. Extension update: ChangedRule(Array(newOpcode)) for rule 1001
3. After activation: Old nodes recognize soft-fork via SoftForkWhenCodeAdded
4. Result: Old nodes skip verification; new nodes execute new opcode
Unknown Opcode Handling
fn handleUnknownOpcode(
reader: *Reader,
settings: *const ValidationSettings,
op_code: u8,
) !Expr {
    // Check if this is a soft-fork condition for the opcode rule
    const rule = &ValidationRules.CheckValidOpCode;
const status = settings.getStatus(rule.id) orelse return error.RuleNotFound;
switch (status) {
.changed => |c| {
// Check if opcode was added via soft-fork
if (std.mem.indexOfScalar(u8, c.new_value, op_code) != null) {
// Soft-fork: skip remaining bytes, return placeholder
reader.skipToEnd();
return Expr{ .constant = Constant.unit };
}
},
else => {},
}
// Not a soft-fork condition - fail hard
return rule.throwValidationException(error.UnknownOpCode, &[_]u8{op_code});
}
Summary
- ErgoTreeVersion encodes script version in 3-bit header field (0-7)
- VersionContext tracks activated protocol and tree versions
- RuleStatus can be Enabled, Disabled, Replaced, or Changed
- SoftForkChecker detects soft-fork conditions from validation failures
- trySoftForkable provides graceful fallback for unknown constructs
- AOT→JIT transition demonstrated soft-fork for major interpreter change
- Block extension voting enables parameter changes via miner consensus
- Old nodes remain compatible by trusting majority on unverifiable blocks
Next: Chapter 30: Cross-Platform Support
Scala: VersionContext.scala:17-35
Rust: context.rs:46-53
Scala: ErgoTree.scala (header)
Rust: tree_header.rs:122-145
Scala: ErgoTree.scala:57-84
Rust: tree_header.rs:27-109
Scala: VersionContext.scala:47-56
Rust: context.rs:12-54
Scala: RuleStatus.scala:4-53
Rust: Not directly present in sigma-rust; validation handled at higher level
Scala: ValidationRules.scala:13-51
Rust: Validation rules embedded in deserializer implementations
Scala: SoftForkChecker.scala:4-42
Rust: Soft-fork handling at application level (ergo-lib)
Rust: parameters.rs (blockchain parameters)
Chapter 30: Cross-Platform Support
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 9 for platform-specific cryptographic implementations
- Chapter 27 for SDK APIs that must work across platforms
- Familiarity with build systems and compilation toolchains
Learning Objectives
By the end of this chapter, you will be able to:
- Explain Zig's cross-compilation architecture and target selection
- Implement platform abstraction layers for OS-specific functionality
- Use conditional compilation (comptime branching on builtin.target) for target-specific code
- Build for WASM, native, and embedded targets from a single codebase
Cross-Compilation Architecture
Zig provides native cross-compilation to any target from any host[1][2]:
Cross-Compilation Targets
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Host Build System │
│ │
│ zig build -Dtarget=<target> │
└────────────────────────┬────────────────────────────────────────┘
Common Targets:
x86_64-linux-gnu Linux desktop/server
aarch64-linux-gnu ARM64 Linux (Raspberry Pi, etc.)
x86_64-macos macOS Intel
aarch64-macos macOS Apple Silicon
x86_64-windows-gnu Windows
wasm32-wasi WebAssembly with WASI
wasm32-freestanding WebAssembly browser
Platform Abstraction
Platform-specific code via conditional compilation:
const builtin = @import("builtin");
const std = @import("std");
const Platform = struct {
pub const target = builtin.target;
pub const os = target.os.tag;
pub const arch = target.cpu.arch;
pub const is_wasm = arch == .wasm32 or arch == .wasm64;
pub const is_native = !is_wasm;
pub const is_windows = os == .windows;
pub const is_linux = os == .linux;
pub const is_macos = os == .macos;
/// Get platform-appropriate crypto implementation
pub fn getCrypto() type {
if (is_wasm) {
return WasmCrypto;
} else {
return NativeCrypto;
}
}
/// Get platform-appropriate allocator
pub fn getDefaultAllocator() std.mem.Allocator {
if (is_wasm) {
return std.heap.wasm_allocator;
} else {
return std.heap.c_allocator;
}
}
};
Crypto Abstraction Layer
Platform-agnostic cryptography interface[3][4].
SECURITY: All implementations of cryptographic operations involving secret data (scalar multiplication, HMAC with secret keys, etc.) must be constant-time to prevent timing side-channel attacks. Use libraries that guarantee constant-time behavior (e.g., libsecp256k1, Zig's std.crypto).
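As a concrete illustration of the constant-time requirement, MAC comparison must not short-circuit on the first mismatching byte. Zig's standard library provides a timing-safe comparison; the helper name here is ours:

```zig
const std = @import("std");

/// Compare two HMAC-SHA512 tags without data-dependent branches.
/// A plain std.mem.eql would leak the mismatch position via timing.
fn macEquals(a: *const [64]u8, b: *const [64]u8) bool {
    return std.crypto.utils.timingSafeEql([64]u8, a.*, b.*);
}
```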
const CryptoFacade = struct {
const Impl = Platform.getCrypto();
pub const SECRET_KEY_LENGTH: usize = 32;
pub const PUBLIC_KEY_LENGTH: usize = 33;
pub const SIGNATURE_LENGTH: usize = 64;
/// Create new crypto context
pub fn createContext() CryptoContext {
return Impl.createContext();
}
/// Normalize point to affine coordinates
pub fn normalizePoint(p: Ecp) Ecp {
return Impl.normalizePoint(p);
}
/// Negate point (y-coordinate)
pub fn negatePoint(p: Ecp) Ecp {
return Impl.negatePoint(p);
}
/// Check if point is infinity
pub fn isInfinityPoint(p: Ecp) bool {
return Impl.isInfinityPoint(p);
}
/// Point exponentiation: p^n
pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
return Impl.exponentiatePoint(p, n);
}
/// Point multiplication (addition in EC group): p1 + p2
pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
return Impl.multiplyPoints(p1, p2);
}
/// Encode point (compressed or uncompressed)
pub fn encodePoint(p: Ecp, compressed: bool) [PUBLIC_KEY_LENGTH]u8 {
return Impl.encodePoint(p, compressed);
}
/// HMAC-SHA512
pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
return Impl.hashHmacSha512(key, data);
}
/// PBKDF2-HMAC-SHA512
pub fn generatePbkdf2Key(
password: []const u8,
salt: []const u8,
iterations: u32,
) [64]u8 {
return Impl.generatePbkdf2Key(password, salt, iterations);
}
/// Secure random bytes
pub fn randomBytes(dest: []u8) void {
Impl.randomBytes(dest);
}
};
Native Crypto Implementation
Using Zig's standard library and optional C bindings[5][6]:
const NativeCrypto = struct {
const std = @import("std");
const crypto = std.crypto;
pub fn createContext() CryptoContext {
return CryptoContext.secp256k1();
}
pub fn normalizePoint(p: Ecp) Ecp {
return p.normalize();
}
pub fn negatePoint(p: Ecp) Ecp {
return p.negate();
}
pub fn isInfinityPoint(p: Ecp) bool {
return p.isIdentity();
}
pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
return p.mul(n.*);
}
pub fn multiplyPoints(p1: Ecp, p2: Ecp) Ecp {
return p1.add(p2);
}
    pub fn encodePoint(p: Ecp, compressed: bool) [33]u8 {
        // Only compressed SEC1 (33 bytes) fits the fixed return type;
        // uncompressed SEC1 is 65 bytes and needs a separate API.
        std.debug.assert(compressed);
        return p.toCompressedSec1();
    }
    pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
        const HmacSha512 = crypto.auth.hmac.sha2.HmacSha512;
        var out: [HmacSha512.mac_length]u8 = undefined;
        HmacSha512.create(&out, data, key);
        return out;
    }
    pub fn generatePbkdf2Key(
        password: []const u8,
        salt: []const u8,
        iterations: u32,
    ) [64]u8 {
        var result: [64]u8 = undefined;
        crypto.pwhash.pbkdf2(
            &result,
            password,
            salt,
            iterations,
            crypto.auth.hmac.sha2.HmacSha512,
        ) catch unreachable; // parameters are fixed and valid
        return result;
    }
pub fn randomBytes(dest: []u8) void {
crypto.random.bytes(dest);
}
};
WASM Crypto Implementation
WebAssembly-specific implementation using imports:
const WasmCrypto = struct {
// External functions imported from JavaScript host
extern "env" fn crypto_random_bytes(ptr: [*]u8, len: usize) void;
extern "env" fn crypto_hmac_sha512(
key_ptr: [*]const u8,
key_len: usize,
data_ptr: [*]const u8,
data_len: usize,
out_ptr: [*]u8,
) void;
extern "env" fn crypto_secp256k1_mul(
point_ptr: [*]const u8,
scalar_ptr: [*]const u8,
out_ptr: [*]u8,
) void;
pub fn createContext() CryptoContext {
return CryptoContext.secp256k1();
}
pub fn randomBytes(dest: []u8) void {
crypto_random_bytes(dest.ptr, dest.len);
}
pub fn hashHmacSha512(key: []const u8, data: []const u8) [64]u8 {
var result: [64]u8 = undefined;
crypto_hmac_sha512(
key.ptr,
key.len,
data.ptr,
data.len,
&result,
);
return result;
}
    pub fn exponentiatePoint(p: Ecp, n: *const Scalar) Ecp {
        // Copy into locals first: Zig forbids taking the address
        // of a temporary function result.
        const point_bytes = p.toCompressedSec1();
        const scalar_bytes = n.toBytes();
        var result: [33]u8 = undefined;
        crypto_secp256k1_mul(&point_bytes, &scalar_bytes, &result);
        return Ecp.fromCompressedSec1(result) catch unreachable;
    }
// ... other operations using WASM imports
};
WASM JavaScript Host
JavaScript glue code for browser/Node.js:
// wasm_host.js - JavaScript host for WASM crypto
const crypto = require('crypto');
const secp256k1 = require('secp256k1');
const imports = {
env: {
crypto_random_bytes: (ptr, len) => {
const bytes = crypto.randomBytes(len);
const mem = new Uint8Array(wasmMemory.buffer, ptr, len);
mem.set(bytes);
},
crypto_hmac_sha512: (keyPtr, keyLen, dataPtr, dataLen, outPtr) => {
const key = new Uint8Array(wasmMemory.buffer, keyPtr, keyLen);
const data = new Uint8Array(wasmMemory.buffer, dataPtr, dataLen);
const hmac = crypto.createHmac('sha512', key);
hmac.update(data);
const result = hmac.digest();
const out = new Uint8Array(wasmMemory.buffer, outPtr, 64);
out.set(result);
},
crypto_secp256k1_mul: (pointPtr, scalarPtr, outPtr) => {
const point = new Uint8Array(wasmMemory.buffer, pointPtr, 33);
const scalar = new Uint8Array(wasmMemory.buffer, scalarPtr, 32);
const result = secp256k1.publicKeyTweakMul(point, scalar, true);
const out = new Uint8Array(wasmMemory.buffer, outPtr, 33);
out.set(result);
}
}
};
Conditional Compilation
Target-specific code paths:
const builtin = @import("builtin");
pub fn getTimestamp() i64 {
if (builtin.target.os.tag == .wasi) {
// WASI clock_time_get
var ts: std.os.wasi.timestamp_t = undefined;
_ = std.os.wasi.clock_time_get(.REALTIME, 1, &ts);
return @intCast(ts / 1_000_000_000);
} else if (builtin.target.cpu.arch == .wasm32) {
// Freestanding WASM - use imported function
return wasmGetTimestamp();
} else {
// Native - use std
return std.time.timestamp();
}
}
pub fn allocate(comptime T: type, n: usize) ![]T {
const allocator = if (Platform.is_wasm)
std.heap.wasm_allocator
else if (builtin.link_libc)
std.heap.c_allocator
else
std.heap.page_allocator;
return allocator.alloc(T, n);
}
Build Configuration
build.zig for multi-target builds:
const std = @import("std");
pub fn build(b: *std.Build) void {
// Native target (default)
const native_target = b.standardTargetOptions(.{});
const native_optimize = b.standardOptimizeOption(.{});
const lib = b.addStaticLibrary(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = native_target,
.optimize = native_optimize,
});
// WASM target
const wasm_target = b.resolveTargetQuery(.{
.cpu_arch = .wasm32,
.os_tag = .freestanding,
});
const wasm_lib = b.addStaticLibrary(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = wasm_target,
.optimize = .ReleaseSmall,
});
// Export for JavaScript
wasm_lib.rdynamic = true;
wasm_lib.export_memory = true;
// WASI target (for Node.js)
const wasi_target = b.resolveTargetQuery(.{
.cpu_arch = .wasm32,
.os_tag = .wasi,
});
const wasi_lib = b.addExecutable(.{
.name = "ergotree",
        .root_source_file = b.path("src/main.zig"),
.target = wasi_target,
.optimize = .ReleaseSmall,
});
// Install all targets
b.installArtifact(lib);
b.installArtifact(wasm_lib);
b.installArtifact(wasi_lib);
}
Memory Management
Platform-specific allocation strategies:
const Allocator = std.mem.Allocator;
const MemoryConfig = struct {
/// Maximum memory for WASM (64KB pages)
wasm_max_pages: u32 = 256, // 16MB
/// Use arena for batch operations
use_arena: bool = true,
/// Pre-allocate constant pool
constant_pool_size: usize = 4096,
};
pub fn createPlatformAllocator(
    config: MemoryConfig,
    arena_storage: *std.heap.ArenaAllocator,
) Allocator {
    if (Platform.is_wasm) {
        // WASM uses linear memory with explicit growth; the embedder
        // enforces the page cap from config.wasm_max_pages.
        return std.heap.wasm_allocator;
    }
    if (config.use_arena) {
        // The arena must live in caller-owned storage: returning
        // arena.allocator() from a stack-local arena would dangle.
        arena_storage.* = std.heap.ArenaAllocator.init(std.heap.page_allocator);
        return arena_storage.allocator();
    }
    return std.heap.page_allocator;
}
Type Representation
Consistent types across platforms:
/// Platform-independent big integer
pub const BigInt = struct {
limbs: []u64,
positive: bool,
    pub fn fromBytes(allocator: Allocator, bytes: []const u8) !BigInt {
        // Works on all platforms; caller supplies the allocator
        var limbs = std.ArrayList(u64).init(allocator);
        _ = bytes; // ... conversion logic consumes bytes
        return .{ .limbs = limbs.items, .positive = true };
    }
pub fn toBytes(self: *const BigInt, buf: []u8) []u8 {
// Consistent byte representation
// ... conversion logic
return buf[0..written];
}
};
/// Platform-independent scalar (256-bit)
pub const Scalar = struct {
bytes: [32]u8,
pub fn fromBigInt(n: *const BigInt) Scalar {
var result: Scalar = undefined;
_ = n.toBytes(&result.bytes);
return result;
}
};
Endianness Handling
Consistent byte order across architectures:
pub fn readU32BE(bytes: []const u8) u32 {
return std.mem.readInt(u32, bytes[0..4], .big);
}
pub fn writeU32BE(value: u32, buf: []u8) void {
std.mem.writeInt(u32, buf[0..4], value, .big);
}
pub fn readU64LE(bytes: []const u8) u64 {
return std.mem.readInt(u64, bytes[0..8], .little);
}
// Serialization always uses network byte order (big-endian)
pub fn serializeInt(value: anytype, writer: anytype) !void {
const T = @TypeOf(value);
var buf: [@sizeOf(T)]u8 = undefined;
std.mem.writeInt(T, &buf, value, .big);
try writer.writeAll(&buf);
}
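The inverse direction follows the same convention. A hedged sketch of the matching reader, written against any `std.io` reader type:

```zig
const std = @import("std");

/// Mirror of serializeInt: read a fixed-width integer in network
/// byte order (big-endian), independent of host endianness.
pub fn deserializeInt(comptime T: type, reader: anytype) !T {
    var buf: [@sizeOf(T)]u8 = undefined;
    try reader.readNoEof(&buf);
    return std.mem.readInt(T, &buf, .big);
}
```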
Performance Considerations
Platform Performance Characteristics
══════════════════════════════════════════════════════════════════
┌─────────────────┬───────────────────────────────────────────────┐
│ Platform │ Characteristics │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (x86_64) │ ✓ SIMD acceleration (AVX2/AVX512) │
│ │ ✓ Hardware AES-NI │
│ │ ✓ Large memory, fast allocation │
│ │ ✓ Multi-threaded execution │
├─────────────────┼───────────────────────────────────────────────┤
│ Native (ARM64) │ ✓ NEON SIMD │
│ │ ✓ Hardware crypto extensions │
│ │ ✓ Power-efficient │
├─────────────────┼───────────────────────────────────────────────┤
│ WASM (browser) │ ○ Single-threaded (mostly) │
│ │ ○ Linear memory model │
│ │ ✓ JIT compilation by browser │
│ │ ○ No direct filesystem/network │
├─────────────────┼───────────────────────────────────────────────┤
│ WASI (Node.js) │ ○ Single-threaded │
│ │ ✓ WASI syscalls for I/O │
│ │ ✓ Sandboxed execution │
└─────────────────┴───────────────────────────────────────────────┘
Optimization Strategies:
Native: Use comptime for specialization, SIMD intrinsics
WASM: Minimize memory allocations, batch operations
Both: Profile-guided optimization, cache-friendly layouts
Testing Cross-Platform
const testing = std.testing;
test "crypto operations consistent across platforms" {
const key = "test_key";
const data = "test_data";
const result = CryptoFacade.hashHmacSha512(key, data);
// Expected value computed externally
const expected = [_]u8{
0x8f, 0x9d, 0x1c, // ... full 64 bytes
};
try testing.expectEqualSlices(u8, &expected, &result);
}
test "point operations" {
const ctx = CryptoFacade.createContext();
const g = ctx.generator;
// g + g = 2g
const two_g_add = CryptoFacade.multiplyPoints(g, g);
const scalar_2 = Scalar.fromInt(2);
const two_g_mul = CryptoFacade.exponentiatePoint(g, &scalar_2);
try testing.expect(two_g_add.eql(two_g_mul));
}
Usage Example
Cross-platform wallet library:
const Wallet = struct {
prover: Prover,
allocator: Allocator,
pub fn init(seed: []const u8) !Wallet {
const allocator = Platform.getDefaultAllocator();
// Platform-independent key derivation
const master_key = CryptoFacade.generatePbkdf2Key(
seed,
"mnemonic",
2048,
);
return .{
.prover = try Prover.fromSeed(master_key, allocator),
.allocator = allocator,
};
}
pub fn signTransaction(
self: *const Wallet,
tx: *const Transaction,
) !SignedTransaction {
// Works identically on all platforms
return self.prover.sign(tx);
}
};
// Same code runs on all targets:
// - Desktop app (native)
// - Browser extension (WASM)
// - Mobile wallet (ARM native or WASM)
// - Server-side validation (native)
Summary
- Zig cross-compiles to any target from any host without external tools
- Platform abstraction uses
builtin.targetfor conditional compilation - CryptoFacade provides consistent API across native and WASM
- WASM targets use JavaScript imports for platform-specific crypto
- Memory management adapts to platform constraints
- Type representation ensures consistent behavior across architectures
- Testing verifies identical results on all platforms
Next: Chapter 31: Performance Engineering
Scala: CryptoFacade.scala (abstraction)
Rust: Platform-independent design in sigma-rust crate structure
Scala: Platform.scala (JVM impl)
Rust: sigma_protocol/ (crypto operations)
Scala: Platform.scala (JS impl)
Rust: Feature flags in Cargo.toml for optional dependencies
Chapter 31: Performance Engineering
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 12 for evaluation model fundamentals that define hot paths
- Chapter 7 for serialization patterns to optimize
- Chapter 13 for understanding cost accounting overhead
Learning Objectives
By the end of this chapter, you will be able to:
- Identify performance-critical paths in script interpretation
- Apply Zig's comptime for zero-cost abstractions and type dispatch
- Design data structures using Struct-of-Arrays (SoA) for cache efficiency
- Use arena allocators to batch allocations and reduce overhead
- Implement SIMD and vectorization for throughput-critical operations
- Profile and benchmark interpreter components systematically
Performance Architecture
Script interpretation requires processing thousands of transactions per block[1][2]:
Performance Critical Paths
══════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┐
│ Transaction Flow │
│ │
│ Block (1000+ txs) │
│ │ │
│ ├── Tx 1: 3 inputs × (deserialize + evaluate + verify) │
│ ├── Tx 2: 1 input × (deserialize + evaluate + verify) │
│ ├── Tx 3: 5 inputs × (deserialize + evaluate + verify) │
│ └── ... │
│ │
│ Hot paths (per input): │
│ • Deserialization: ~50-200 opcode parses │
│ • Evaluation: ~100-500 operations │
│ • Proof verification: 1-10 EC operations │
└─────────────────────────────────────────────────────────────────┘
Performance Targets:
Deserialization: < 100 µs per script
Evaluation: < 500 µs per script
Verification: < 2 ms per input
Total per block: < 5 seconds
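The per-block target can be enforced with a running budget. A minimal sketch, with the struct name, field layout, and error name all assumed for illustration:

```zig
const std = @import("std");

const BlockBudget = struct {
    /// 5-second per-block target from the table above
    const max_block_ns: u64 = 5 * std.time.ns_per_s;
    spent_ns: u64 = 0,

    /// Accumulate per-input time; fail once the block budget is spent
    pub fn charge(self: *BlockBudget, elapsed_ns: u64) !void {
        self.spent_ns += elapsed_ns;
        if (self.spent_ns > max_block_ns) return error.BlockBudgetExceeded;
    }
};
```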
Comptime Optimization
Zig's comptime enables zero-cost abstractions[3][4]:
/// Compile-time type dispatch eliminates runtime branching
fn evalOperation(comptime op: OpCode, args: []const Value) !Value {
return switch (op) {
.plus => evalPlus(args),
.minus => evalMinus(args),
.multiply => evalMultiply(args),
// All branches resolved at compile time
inline else => |o| evalGeneric(o, args),
};
}
/// Comptime-generated lookup tables
const op_costs = blk: {
var costs: [256]u32 = undefined;
for (0..256) |i| {
costs[i] = computeCost(@enumFromInt(i));
}
break :blk costs;
};
/// Zero-cost field access: @field resolves to a direct member access,
/// with the offset computed entirely at compile time
fn getField(ptr: anytype, comptime field: []const u8) @TypeOf(&@field(ptr.*, field)) {
    return &@field(ptr.*, field);
}
Data-Oriented Design
Structure data for cache efficiency. The Array-of-Structs to Struct-of-Arrays transformation is a semantics-preserving isomorphism: Array[n](A × B) ≅ Array[n](A) × Array[n](B). Both represent the same data with identical behavior, but different memory layouts yield dramatically different cache performance:
/// Bad: Array of Structs (AoS) - poor cache locality for iteration
const ValueAoS = struct {
tag: ValueTag, // 1 byte
padding: [7]u8, // 7 bytes padding
data: [8]u8, // 8 bytes payload
}; // 16 bytes per value, only 9 used
/// Good: Struct of Arrays (SoA) - excellent cache locality
const ValueStore = struct {
tags: []ValueTag, // Packed tags
data: [][8]u8, // Packed payloads
len: usize,
/// Iterate tags without touching payload
pub fn countType(self: *const ValueStore, target: ValueTag) usize {
var count: usize = 0;
for (self.tags) |tag| {
count += @intFromBool(tag == target);
}
return count;
}
/// Access specific value
pub fn get(self: *const ValueStore, idx: usize) Value {
return Value.decode(self.tags[idx], self.data[idx]);
}
};
Memory Layout Analysis
Cache Line Utilization
══════════════════════════════════════════════════════════════════
Array of Structs (AoS):
┌──────────────────────────────────────────────────────────────────┐
│ Cache Line (64 bytes) │
├──────────────────────────────────────────────────────────────────┤
│ Value[0] │ Value[1] │ Value[2] │ Value[3] │ │
│ 16 bytes │ 16 bytes │ 16 bytes │ 16 bytes │ │
│ T+P+D │ T+P+D │ T+P+D │ T+P+D │ Wasted │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 25% cache utilization (touches only 1 byte per 16)
Struct of Arrays (SoA):
┌──────────────────────────────────────────────────────────────────┐
│ Tags Cache Line (64 bytes) │
├──────────────────────────────────────────────────────────────────┤
│ T[0] T[1] T[2] ... T[63] │
│ 64 tags in single cache line │
└──────────────────────────────────────────────────────────────────┘
Tag iteration: 100% cache utilization (64 values per fetch)
Speedup: ~4x for tag-only operations
Arena Allocators
Batch allocations reduce overhead:
const ArenaAllocator = std.heap.ArenaAllocator;
/// Evaluation context with arena for temporary allocations
const EvalContext = struct {
arena: ArenaAllocator,
constants: []const Constant,
env: Environment,
pub fn init(backing: Allocator) EvalContext {
return .{
.arena = ArenaAllocator.init(backing),
.constants = &[_]Constant{},
.env = Environment.init(),
};
}
/// All temporary allocations use arena
pub fn allocTemp(self: *EvalContext, comptime T: type, n: usize) ![]T {
return self.arena.allocator().alloc(T, n);
}
/// Single deallocation frees all temps
pub fn reset(self: *EvalContext) void {
_ = self.arena.reset(.retain_capacity);
}
pub fn deinit(self: *EvalContext) void {
self.arena.deinit();
}
};
/// Usage: batch evaluation without per-operation allocations
fn evaluateScript(tree: *const ErgoTree, allocator: Allocator) !Value {
var ctx = EvalContext.init(allocator);
defer ctx.deinit();
for (tree.ops) |op| {
try evalOp(op, &ctx);
}
ctx.reset(); // Free all temps at once
return ctx.result;
}
Loop Optimization
Efficient iteration patterns:
/// Unrolled loop for fixed-size operations
fn hashBlock(data: *const [64]u8, state: *[8]u32) void {
    // Process the 16 message words, four per unrolled iteration
    // (step by 16 bytes so the four u32 reads do not overlap)
    comptime var i: usize = 0;
    inline while (i < 64) : (i += 16) {
        const w0 = std.mem.readInt(u32, data[i..][0..4], .big);
        const w1 = std.mem.readInt(u32, data[i + 4 ..][0..4], .big);
        const w2 = std.mem.readInt(u32, data[i + 8 ..][0..4], .big);
        const w3 = std.mem.readInt(u32, data[i + 12 ..][0..4], .big);
        round(state, w0);
        round(state, w1);
        round(state, w2);
        round(state, w3);
    }
}
/// Vectorized collection operations
fn sumValues(values: []const i64) i64 {
const Vec = @Vector(4, i64);
var sum_vec: Vec = @splat(0);
var i: usize = 0;
while (i + 4 <= values.len) : (i += 4) {
const chunk: Vec = values[i..][0..4].*;
sum_vec += chunk;
}
// Reduce vector to scalar
var sum = @reduce(.Add, sum_vec);
// Handle remainder
while (i < values.len) : (i += 1) {
sum += values[i];
}
return sum;
}
Memoization
Cache expensive computations[5][6]:
/// Generic memoization with comptime key type
fn Memoized(comptime K: type, comptime V: type) type {
return struct {
cache: std.AutoHashMap(K, V),
const Self = @This();
pub fn init(allocator: Allocator) Self {
return .{ .cache = std.AutoHashMap(K, V).init(allocator) };
}
pub fn getOrCompute(
self: *Self,
key: K,
compute: *const fn (K) V,
) V {
const result = self.cache.getOrPut(key) catch unreachable;
if (!result.found_existing) {
result.value_ptr.* = compute(key);
}
return result.value_ptr.*;
}
pub fn reset(self: *Self) void {
self.cache.clearRetainingCapacity();
}
};
}
/// Type method resolution memoization
const MethodCache = Memoized(struct { type_code: u8, method_id: u8 }, *const Method);
var method_cache: MethodCache = undefined;
fn resolveMethod(type_code: u8, method_id: u8) *const Method {
return method_cache.getOrCompute(
.{ .type_code = type_code, .method_id = method_id },
computeMethod,
);
}
String Interning
Avoid repeated string allocations:
const StringInterner = struct {
table: std.StringHashMap([]const u8),
arena: ArenaAllocator,
pub fn init(allocator: Allocator) StringInterner {
return .{
.table = std.StringHashMap([]const u8).init(allocator),
.arena = ArenaAllocator.init(allocator),
};
}
/// Return interned string (pointer stable for lifetime)
pub fn intern(self: *StringInterner, str: []const u8) []const u8 {
if (self.table.get(str)) |existing| {
return existing;
}
// Allocate permanent copy
const copy = self.arena.allocator().dupe(u8, str) catch unreachable;
self.table.put(copy, copy) catch unreachable;
return copy;
}
};
// Variable names are interned for fast comparison
fn lookupVar(env: *const Environment, name: []const u8) ?Value {
const interned = global_interner.intern(name);
return env.bindings.get(interned);
}
SIMD for Crypto
Vectorized elliptic curve operations:
/// SIMD-accelerated field multiplication (mod p)
fn mulModP(a: *const [4]u64, b: *const [4]u64) [4]u64 {
    // Dispatch resolved at compile time; dead branches are eliminated
    const arch = @import("builtin").target.cpu.arch;
    if (comptime arch.isX86()) {
        return mulModP_avx2(a, b);
    } else if (comptime arch.isAARCH64()) {
        return mulModP_neon(a, b);
    } else {
        return mulModP_scalar(a, b);
    }
}
fn mulModP_avx2(a: *const [4]u64, b: *const [4]u64) [4]u64 {
// AVX2 implementation using 256-bit vectors
const va: @Vector(4, u64) = a.*;
const vb: @Vector(4, u64) = b.*;
// Schoolbook multiplication with vector operations
// ... (optimized implementation)
return result;
}
Profiling and Benchmarking
Built-in profiling support:
const Timer = struct {
start: i128,
pub fn init() Timer {
return .{ .start = std.time.nanoTimestamp() };
}
pub fn elapsed(self: *const Timer) u64 {
const now = std.time.nanoTimestamp();
return @intCast(now - self.start);
}
};
/// Benchmark harness
fn benchmark(
comptime name: []const u8,
comptime iterations: usize,
comptime warmup: usize,
func: *const fn () void,
) void {
// Warmup
for (0..warmup) |_| {
func();
}
// Measure
const timer = Timer.init();
for (0..iterations) |_| {
func();
}
const total_ns = timer.elapsed();
const ns_per_op = total_ns / iterations;
const ops_per_sec = @as(f64, 1_000_000_000) / @as(f64, @floatFromInt(ns_per_op));
std.debug.print("{s}: {} ns/op ({d:.0} ops/sec)\n", .{
name,
ns_per_op,
ops_per_sec,
});
}
// Usage
test "benchmark deserialization" {
benchmark("deserialize_script", 10000, 1000, struct {
fn run() void {
_ = deserialize(test_script);
}
}.run);
}
Memory Profiling
Track allocations in debug builds:
const DebugAllocator = struct {
backing: Allocator,
total_allocated: usize = 0,
total_freed: usize = 0,
allocation_count: usize = 0,
pub fn allocator(self: *DebugAllocator) Allocator {
return .{
.ptr = self,
.vtable = &.{
.alloc = alloc,
.resize = resize,
.free = free,
},
};
}
fn alloc(ctx: *anyopaque, len: usize, ptr_align: u8, ret_addr: usize) ?[*]u8 {
const self: *DebugAllocator = @ptrCast(@alignCast(ctx));
self.total_allocated += len;
self.allocation_count += 1;
return self.backing.rawAlloc(len, ptr_align, ret_addr);
}
// ... other methods
pub fn report(self: *const DebugAllocator) void {
std.debug.print("Allocations: {}\n", .{self.allocation_count});
std.debug.print("Total allocated: {} bytes\n", .{self.total_allocated});
std.debug.print("Total freed: {} bytes\n", .{self.total_freed});
std.debug.print("Leaked: {} bytes\n", .{self.total_allocated - self.total_freed});
}
};
Performance Patterns
Optimization Decision Tree
══════════════════════════════════════════════════════════════════
Is operation in hot path?
│
├── NO → Optimize for clarity, not speed
│
└── YES → Profile first, then:
│
├── CPU-bound?
│ ├── Use comptime for dispatch
│ ├── Unroll small loops
│ ├── Use SIMD where applicable
│ └── Inline critical functions
│
├── Memory-bound?
│ ├── Use SoA layout
│ ├── Pool/arena allocators
│ ├── Reduce allocations
│ └── Prefetch data
│
└── Allocation-bound?
├── Arena allocators
├── Object pools
├── String interning
└── Stack allocation where safe
Performance Checklist
When writing performance-critical code:
// ✓ Use comptime for type-level decisions
const Handler = comptime getHandler(op);
// ✓ Pre-compute lookup tables
const costs = comptime computeCostTable();
// ✓ Use SoA for iterated data
const Store = struct { tags: []Tag, values: []Value };
// ✓ Arena allocators for batch operations
var arena = ArenaAllocator.init(allocator);
defer arena.deinit();
// ✓ Inline hot functions
pub inline fn addCost(self: *CostAccum, cost: u32) !void
// ✓ Avoid allocations in tight loops
for (items) |item| {
// Process without allocation
}
// ✓ Use vectors for parallel data
const Vec4 = @Vector(4, u64);
// ✓ Profile before optimizing
// std.debug.print("elapsed: {} ns\n", .{timer.elapsed()});
Summary
- Comptime enables zero-cost abstractions and compile-time dispatch
- Data-oriented design (SoA) improves cache efficiency 4x+
- Arena allocators batch deallocations for throughput
- Loop unrolling and SIMD accelerate hot paths
- Memoization caches expensive computations
- String interning reduces allocation pressure
- Profile first before optimizing—measure, don't guess
Next: Chapter 32: v6 Protocol Features
Scala: perf-style-guide.md (HOTSPOT patterns)
Rust: Performance-oriented design throughout sigma-rust crates
Scala: MemoizedFunc.scala
Rust: Memoization patterns in ergotree-interpreter
Scala: CErgoTreeEvaluator.scala (fixedCostOp)
Rust: eval.rs (cost tracking)
Chapter 32: v6 Protocol Features
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Prerequisites
- Chapter 2 for the ErgoTree type system and numeric types
- Chapter 6 for method definitions on types
- Chapter 29 for soft-fork versioning and activation
Learning Objectives
By the end of this chapter, you will be able to:
- Implement SUnsignedBigInt (256-bit unsigned integers) with modular arithmetic operations
- Apply bitwise operations (AND, OR, XOR, NOT, shifts) to all numeric types
- Use new collection manipulation methods (patch, updated, updateMany, reverse, get)
- Understand the Autolykos2 proof-of-work algorithm and Header.checkPow
- Serialize values using Global.serialize and decode difficulty with NBits encoding
- Write version-aware scripts that use v6 features safely
Version Activation
ErgoTree version 3 corresponds to protocol v6.0. Features in this chapter are only available when the v6 soft-fork is activated.
Version Mapping
═══════════════════════════════════════════════════════════════════
Block Version ErgoTree Version Protocol Features
─────────────────────────────────────────────────────────────────────
1-2 0-1 v3.x-v4.x AOT costing
3 2 v5.x JIT costing
4 3 v6.x This chapter
Version Context
const VersionContext = struct {
activated_version: u8,
ergo_tree_version: u8,
pub const JIT_ACTIVATION_VERSION: u8 = 2; // v5.0
pub const V6_SOFT_FORK_VERSION: u8 = 3; // v6.0
/// True if v6.0 protocol is activated
pub fn isV6Activated(self: *const VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
/// True if current ErgoTree is v3 or later
pub fn isV3OrLaterErgoTree(self: *const VersionContext) bool {
return self.ergo_tree_version >= V6_SOFT_FORK_VERSION;
}
/// Check if a v6 method can be used
pub fn canUseV6Method(self: *const VersionContext) bool {
return self.isV6Activated() and self.isV3OrLaterErgoTree();
}
};
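The gating rule above is a pure predicate over two version numbers, so it is easy to sanity-check outside Zig. A quick Python sketch (illustrative only; the constant and function names mirror the struct above, they are not an interpreter API):

```python
# Mirrors VersionContext above (illustrative, not a real API).
V6_SOFT_FORK_VERSION = 3  # ErgoTree v3 / protocol v6.0

def can_use_v6_method(activated_version: int, ergo_tree_version: int) -> bool:
    """v6 methods require BOTH an activated v6 protocol AND a v3+ ErgoTree."""
    return (activated_version >= V6_SOFT_FORK_VERSION
            and ergo_tree_version >= V6_SOFT_FORK_VERSION)

print(can_use_v6_method(3, 3))  # True: v6 active and v3 tree
print(can_use_v6_method(3, 2))  # False: old tree on a new protocol
print(can_use_v6_method(2, 3))  # False: v3 tree before activation
```

Both conditions matter: a v3 tree submitted before activation must fail, and an old tree must not silently gain v6 methods after activation.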
SUnsignedBigInt Type
The SUnsignedBigInt type (type code 9) is a 256-bit unsigned integer designed for cryptographic modular arithmetic [1, 2]. Unlike SBigInt which is signed, SUnsignedBigInt guarantees non-negative values, which is essential for operations like modular exponentiation where sign handling would introduce complexity and potential errors.
Type Definition
/// 256-bit unsigned integer for modular arithmetic
/// Type code: 0x09
const UnsignedBigInt256 = struct {
/// Internal representation: 4 x 64-bit words (little-endian)
words: [4]u64,
pub const TYPE_CODE: u8 = 0x09;
pub const BIT_WIDTH: usize = 256;
pub const BYTE_WIDTH: usize = 32;
/// Maximum value: 2^256 - 1
pub const MAX = UnsignedBigInt256{ .words = .{
0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFF,
}};
/// Zero value
pub const ZERO = UnsignedBigInt256{ .words = .{ 0, 0, 0, 0 }};
/// One value
pub const ONE = UnsignedBigInt256{ .words = .{ 1, 0, 0, 0 }};
/// Create from bytes (big-endian)
pub fn fromBytes(bytes: [32]u8) UnsignedBigInt256 {
var result = UnsignedBigInt256{ .words = undefined };
// Convert big-endian bytes to little-endian words
inline for (0..4) |i| {
const offset = (3 - i) * 8;
result.words[i] = std.mem.readInt(u64, bytes[offset..][0..8], .big);
}
return result;
}
/// Convert to bytes (big-endian)
pub fn toBytes(self: UnsignedBigInt256) [32]u8 {
var result: [32]u8 = undefined;
inline for (0..4) |i| {
const offset = (3 - i) * 8;
std.mem.writeInt(u64, result[offset..][0..8], self.words[i], .big);
}
return result;
}
/// Convert from signed BigInt (errors if negative)
pub fn fromBigInt(bi: BigInt256) !UnsignedBigInt256 {
if (bi.isNegative()) {
return error.NegativeValue;
}
// Safe to reinterpret since non-negative
return @bitCast(bi.abs());
}
/// Convert to signed BigInt (errors if > BigInt.MAX)
pub fn toBigInt(self: UnsignedBigInt256) !BigInt256 {
// Check if value exceeds signed max (2^255 - 1)
if (self.words[3] & 0x8000000000000000 != 0) {
return error.Overflow;
}
return @bitCast(self);
}
/// Comparison
pub fn lessThan(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
// Compare from most significant word
var i: usize = 4;
while (i > 0) {
i -= 1;
if (self.words[i] < other.words[i]) return true;
if (self.words[i] > other.words[i]) return false;
}
return false; // Equal
}
pub fn eql(self: UnsignedBigInt256, other: UnsignedBigInt256) bool {
return std.mem.eql(u64, &self.words, &other.words);
}
};
Why Unsigned Matters for Cryptography
Signed integers introduce complexity in modular arithmetic:
- Sign bit ambiguity: In two's complement, the high bit indicates sign. For cryptographic operations on field elements, all 256 bits should represent magnitude.
- Modular reduction: Computing `a mod m` for negative `a` requires adjustment: `(-5) mod 7 = 2`, not `-5`. Unsigned values eliminate this edge case.
- Constant-time operations: Sign handling can introduce timing variations. Unsigned operations are more naturally constant-time.
- Field element representation: Finite field elements are inherently non-negative integers in `[0, p-1]`.
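The modular-reduction point is easy to see concretely. Python's `%` is floored modulo and already returns a value in `[0, m)` for a positive modulus, while C-style truncated remainder (`math.fmod`) follows the dividend's sign:

```python
import math

print((-5) % 7)                    # 2    (floored modulo: result in [0, 7))
print(math.fmod(-5.0, 7.0))        # -5.0 (truncated remainder: follows dividend)
print(5 % 7, math.fmod(5.0, 7.0))  # 5 5.0 (non-negative inputs: both agree)
```

With unsigned operands the two conventions coincide, which is exactly the edge case SUnsignedBigInt removes.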
Serialization
const UnsignedBigInt256Serializer = struct {
/// Serialize to variable-length big-endian bytes
pub fn serialize(value: UnsignedBigInt256, writer: anytype) !void {
const bytes = value.toBytes();
// Find first non-zero byte (skip leading zeros)
var start: usize = 0;
while (start < 32 and bytes[start] == 0) : (start += 1) {}
// Write length + bytes
const len = 32 - start;
try writer.writeInt(u8, @intCast(len), .big);
try writer.writeAll(bytes[start..]);
}
/// Deserialize from variable-length big-endian bytes
pub fn deserialize(reader: anytype) !UnsignedBigInt256 {
const len = try reader.readInt(u8, .big);
if (len > 32) return error.InvalidLength;
var bytes: [32]u8 = .{0} ** 32;
const start = 32 - len;
try reader.readNoEof(bytes[start..]);
return UnsignedBigInt256.fromBytes(bytes);
}
};
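The length-prefixed, leading-zero-stripped format round-trips cleanly. A Python model of the same wire format (the helper names are illustrative, not interpreter APIs):

```python
def serialize_ubi(value: int) -> bytes:
    """Length byte, then big-endian magnitude with leading zeros stripped."""
    assert 0 <= value < 2**256
    raw = value.to_bytes(32, "big").lstrip(b"\x00")
    return bytes([len(raw)]) + raw

def deserialize_ubi(data: bytes) -> int:
    length = data[0]
    assert length <= 32, "invalid length"
    return int.from_bytes(data[1:1 + length], "big")

for v in (0, 1, 255, 2**255, 2**256 - 1):
    assert deserialize_ubi(serialize_ubi(v)) == v
print(serialize_ubi(0))  # b'\x00' -- zero encodes as a bare zero-length byte
```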
Modular Arithmetic Operations
v6 adds six modular arithmetic methods to SUnsignedBigInt [3, 4]:
Modular Arithmetic Methods
═══════════════════════════════════════════════════════════════════
Method Signature Cost Description
─────────────────────────────────────────────────────────────────────
mod (UBI, UBI) → UBI 20 a mod m
modInverse (UBI, UBI) → UBI 150 a⁻¹ mod m
plusMod (UBI, UBI, UBI) → UBI 30 (a + b) mod m
subtractMod (UBI, UBI, UBI) → UBI 30 (a - b) mod m
multiplyMod (UBI, UBI, UBI) → UBI 40 (a × b) mod m
toSigned UBI → BigInt 10 Convert to signed
Basic Modulo Operation
/// a mod m - remainder after division
/// Cost: FixedCost(20)
pub fn mod(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// Use schoolbook division for 256-bit values
// Result is always < m
return divmod(a, m).remainder;
}
Modular Inverse (Extended Euclidean Algorithm)
The modular inverse a⁻¹ mod m is the value x such that (a × x) mod m = 1. It exists only when gcd(a, m) = 1.
/// Extended Euclidean Algorithm
/// Returns x such that (a * x) ≡ 1 (mod m)
/// Cost: FixedCost(150) - most expensive modular operation
pub fn modInverse(a: UnsignedBigInt256, m: UnsignedBigInt256) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
if (a.eql(UnsignedBigInt256.ZERO)) {
return error.NoInverse; // gcd(0, m) = m ≠ 1
}
// Extended Euclidean Algorithm
// Maintains: old_s * a + old_t * m = old_r (Bézout's identity)
var old_r = a;
var r = m;
var old_s = UnsignedBigInt256.ONE;
var s = UnsignedBigInt256.ZERO;
var old_s_negative = false;
var s_negative = false;
while (!r.eql(UnsignedBigInt256.ZERO)) {
const quotient = divmod(old_r, r).quotient;
// (old_r, r) = (r, old_r - quotient * r)
const temp_r = r;
const qr = multiply(quotient, r);
if (old_r.lessThan(qr)) {
r = subtract(qr, old_r);
} else {
r = subtract(old_r, qr);
}
old_r = temp_r;
// (old_s, s) = (s, old_s - quotient * s)
// Handle signed arithmetic carefully
const temp_s = s;
const temp_s_neg = s_negative;
const qs = multiply(quotient, s);
if (old_s_negative == s_negative) {
// Same sign: subtraction
if (old_s.lessThan(qs)) {
s = subtract(qs, old_s);
s_negative = !old_s_negative;
} else {
s = subtract(old_s, qs);
s_negative = old_s_negative;
}
} else {
// Different signs: addition
s = add(old_s, qs);
s_negative = old_s_negative;
}
old_s = temp_s;
old_s_negative = temp_s_neg;
}
// Check that gcd(a, m) = 1
if (!old_r.eql(UnsignedBigInt256.ONE)) {
return error.NoInverse; // a and m are not coprime
}
// Adjust result to be positive
if (old_s_negative) {
return subtract(m, old_s);
}
return old_s;
}
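The magnitude-plus-sign-flag bookkeeping is the error-prone part of the sketch above. A compact Python cross-check (using native signed ints instead of the flag) can be validated against the built-in `pow(a, -1, m)`; `mod_inverse` is an illustrative name, not an interpreter API:

```python
def mod_inverse(a: int, m: int) -> int:
    """Extended Euclidean algorithm; returns x with (a * x) % m == 1."""
    if m == 0:
        raise ZeroDivisionError("zero modulus")
    old_r, r = a, m
    old_s, s = 1, 0                 # Bezout coefficient for a
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("no inverse: gcd(a, m) != 1")
    return old_s % m                # normalize into [0, m)

p = 2**255 - 19                     # the Curve25519 field prime
for a in (2, 3, 12345, p - 1):
    assert mod_inverse(a, p) == pow(a, -1, p)  # Python 3.8+ modular inverse
print("extended Euclid agrees with pow(a, -1, p)")
```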
Modular Addition
/// (a + b) mod m - modular addition
/// Handles overflow by using 320-bit intermediate
/// Cost: FixedCost(30)
pub fn plusMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// a + b can overflow 256 bits, so use 320-bit intermediate
var sum: [5]u64 = .{ 0, 0, 0, 0, 0 };
var carry: u64 = 0;
for (0..4) |i| {
const s = @as(u128, a.words[i]) + @as(u128, b.words[i]) + carry;
sum[i] = @truncate(s);
carry = @truncate(s >> 64);
}
sum[4] = carry;
// Reduce mod m
return reduce320(sum, m);
}
/// (a - b) mod m - modular subtraction
/// If a < b, result is m - (b - a)
/// Cost: FixedCost(30)
pub fn subtractMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
if (a.lessThan(b)) {
// a - b is negative: compute m - (b - a)
const diff = subtract(b, a);
const diff_mod = try mod(diff, m);
if (diff_mod.eql(UnsignedBigInt256.ZERO)) {
return UnsignedBigInt256.ZERO;
}
return subtract(m, diff_mod);
} else {
const diff = subtract(a, b);
return mod(diff, m);
}
}
Modular Multiplication
/// (a * b) mod m - modular multiplication
/// Uses 512-bit intermediate to handle overflow
/// Cost: FixedCost(40)
pub fn multiplyMod(
a: UnsignedBigInt256,
b: UnsignedBigInt256,
m: UnsignedBigInt256,
) !UnsignedBigInt256 {
if (m.eql(UnsignedBigInt256.ZERO)) {
return error.DivisionByZero;
}
// Multiply to 512-bit result
var product: [8]u64 = .{0} ** 8;
for (0..4) |i| {
var carry: u64 = 0;
for (0..4) |j| {
const p = @as(u128, a.words[i]) * @as(u128, b.words[j]) +
@as(u128, product[i + j]) + @as(u128, carry);
product[i + j] = @truncate(p);
carry = @truncate(p >> 64);
}
product[i + 4] = carry;
}
// Reduce 512-bit product mod m
return reduce512(product, m);
}
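Because Python integers are arbitrary precision, the intended semantics of the three ternary operations fit on one line each, which makes them handy oracles when testing the word-level Zig versions above:

```python
def plus_mod(a, b, m):      return (a + b) % m  # sum may exceed 256 bits
def subtract_mod(a, b, m):  return (a - b) % m  # floored % absorbs a < b
def multiply_mod(a, b, m):  return (a * b) % m  # product may need 512 bits

m = 2**255 - 19
a, b = 2**256 - 1, 2**256 - 2        # near the top of the 256-bit range
assert plus_mod(a, b, m) == ((a % m) + (b % m)) % m
assert subtract_mod(3, 5, 7) == 5    # (3 - 5) mod 7 = -2 mod 7 = 5
assert multiply_mod(a, b, m) == (a % m) * (b % m) % m
print("modular identities hold")
```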
Bitwise Operations
v6 adds eight bitwise methods to all numeric types (Byte, Short, Int, Long, BigInt, UnsignedBigInt) [5, 6]:
Bitwise Operations
═══════════════════════════════════════════════════════════════════
Method Signature Cost Description
─────────────────────────────────────────────────────────────────────
bitwiseInverse T → T 5 ~x (NOT)
bitwiseOr (T, T) → T 5 x | y
bitwiseAnd (T, T) → T 5 x & y
bitwiseXor (T, T) → T 5 x ^ y
shiftLeft (T, Int) → T 5 x << n
shiftRight (T, Int) → T 5 x >> n
toBytes T → Coll[Byte] 5 Byte representation
toBits T → Coll[Boolean] 5 Bit representation
Implementation
/// Bitwise operations for all numeric types
pub fn BitwiseOps(comptime T: type) type {
return struct {
/// Bitwise NOT (~x)
/// For signed types: ~x = -x - 1 (two's complement identity)
/// For unsigned: ~x = MAX - x
/// Cost: FixedCost(5)
pub fn bitwiseInverse(x: T) T {
return ~x;
}
/// Bitwise OR (x | y)
/// Cost: FixedCost(5)
pub fn bitwiseOr(x: T, y: T) T {
return x | y;
}
/// Bitwise AND (x & y)
/// Cost: FixedCost(5)
pub fn bitwiseAnd(x: T, y: T) T {
return x & y;
}
/// Bitwise XOR (x ^ y)
/// Cost: FixedCost(5)
pub fn bitwiseXor(x: T, y: T) T {
return x ^ y;
}
/// Left shift (x << n)
/// Returns 0 if n >= bitwidth
/// Cost: FixedCost(5)
pub fn shiftLeft(x: T, n: i32) !T {
if (n < 0) return error.NegativeShift;
const bits = @bitSizeOf(T);
if (n >= bits) return 0;
return x << @intCast(n);
}
/// Right shift (x >> n)
/// Arithmetic shift for signed (preserves sign)
/// Logical shift for unsigned (fills with 0)
/// Cost: FixedCost(5)
pub fn shiftRight(x: T, n: i32) !T {
if (n < 0) return error.NegativeShift;
const bits = @bitSizeOf(T);
if (n >= bits) {
// For signed: return -1 if negative, 0 otherwise
// For unsigned: return 0
if (@typeInfo(T).Int.signedness == .signed) {
return if (x < 0) -1 else 0;
}
return 0;
}
return x >> @intCast(n);
}
};
}
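The over-shift rules (n >= bit width) are where implementations diverge, so they deserve explicit tests. Python's `>>` is arithmetic for negative ints, which lets a small model reproduce the rules above; `shift_right` here is an illustrative helper, not an interpreter API:

```python
def shift_right(x: int, n: int, bits: int, signed: bool) -> int:
    """Mirror shiftRight edge cases: arithmetic for signed, logical for unsigned."""
    if n < 0:
        raise ValueError("negative shift")
    if n >= bits:                    # over-shift: saturate
        return -1 if (signed and x < 0) else 0
    return x >> n                    # Python >> is arithmetic on negatives

assert shift_right(-8, 1, 32, True) == -4    # sign-preserving shift
assert shift_right(-8, 100, 32, True) == -1  # negative over-shift -> all ones
assert shift_right(8, 100, 32, True) == 0    # positive over-shift -> zero
assert shift_right(8, 100, 32, False) == 0   # unsigned over-shift -> zero
print("over-shift edge cases check out")
```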
/// Byte conversion for numeric types (big-endian)
/// Returns the array by value; returning `&result` would hand back
/// a pointer to a stack local that is invalid after the function returns.
pub fn toBytes(comptime T: type, x: T) [@sizeOf(T)]u8 {
var result: [@sizeOf(T)]u8 = undefined;
std.mem.writeInt(T, &result, x, .big);
return result;
}
/// Bit conversion for numeric types (most significant bit first)
/// Also returned by value for the same reason.
pub fn toBits(comptime T: type, x: T) [@bitSizeOf(T)]bool {
const bits = @bitSizeOf(T);
var result: [bits]bool = undefined;
for (0..bits) |i| {
result[bits - 1 - i] = ((x >> @intCast(i)) & 1) == 1;
}
return result;
}
BigInt Bitwise Operations
For BigInt256 and UnsignedBigInt256, bitwise operations work on the full 256-bit representation:
/// 256-bit bitwise operations
const BigIntBitwise = struct {
/// Bitwise NOT for 256-bit unsigned
/// ~x = MAX - x for unsigned interpretation
pub fn bitwiseInverse(x: UnsignedBigInt256) UnsignedBigInt256 {
return .{ .words = .{
~x.words[0],
~x.words[1],
~x.words[2],
~x.words[3],
}};
}
/// Bitwise OR for 256-bit
pub fn bitwiseOr(x: UnsignedBigInt256, y: UnsignedBigInt256) UnsignedBigInt256 {
return .{ .words = .{
x.words[0] | y.words[0],
x.words[1] | y.words[1],
x.words[2] | y.words[2],
x.words[3] | y.words[3],
}};
}
/// Left shift for 256-bit (handles cross-word shifts)
pub fn shiftLeft(x: UnsignedBigInt256, n: i32) !UnsignedBigInt256 {
if (n < 0) return error.NegativeShift;
if (n >= 256) return UnsignedBigInt256.ZERO;
const shift: u8 = @intCast(n);
const word_shift = shift / 64;
const bit_shift: u6 = @intCast(shift % 64);
var result = UnsignedBigInt256.ZERO;
if (bit_shift == 0) {
// Word-aligned shift
for (word_shift..4) |i| {
result.words[i] = x.words[i - word_shift];
}
} else {
// Cross-word shift
for (word_shift..4) |i| {
result.words[i] = x.words[i - word_shift] << bit_shift;
if (i > word_shift) {
result.words[i] |= x.words[i - word_shift - 1] >> (64 - bit_shift);
}
}
}
return result;
}
};
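Cross-word shifts are easy to get off by one. Comparing the word-level routine against Python's direct big-int shift over random inputs catches such bugs; the helper names here are illustrative:

```python
import random

MASK64 = 0xFFFFFFFFFFFFFFFF

def shift_left_words(words, n):
    """Little-endian 4x64-word left shift, mirroring the Zig cross-word logic."""
    if n >= 256:
        return [0, 0, 0, 0]
    word_shift, bit_shift = divmod(n, 64)
    out = [0, 0, 0, 0]
    for i in range(word_shift, 4):
        out[i] = (words[i - word_shift] << bit_shift) & MASK64
        if bit_shift and i > word_shift:
            out[i] |= words[i - word_shift - 1] >> (64 - bit_shift)
    return out

def to_int(words):
    """words[0] is the least significant 64-bit limb."""
    return sum(w << (64 * i) for i, w in enumerate(words))

random.seed(1)
for _ in range(200):
    x = random.getrandbits(256)
    n = random.randrange(300)
    limbs = [(x >> (64 * i)) & MASK64 for i in range(4)]
    assert to_int(shift_left_words(limbs, n)) == (x << n) % 2**256
print("word-level shift matches direct 256-bit shift")
```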
Collection Methods (v6)
v6 adds seven new methods to Coll[T] for efficient collection manipulation [7, 8]:
Collection Methods (v6)
═══════════════════════════════════════════════════════════════════
Method Signature Cost
─────────────────────────────────────────────────────────────────────
patch (Coll[T], Int, Coll[T], Int) → Coll[T] PerItem(30,2,10)
updated (Coll[T], Int, T) → Coll[T] PerItem(20,1,10)
updateMany (Coll[T], Coll[Int], Coll[T]) → Coll[T] PerItem(20,2,10)
reverse Coll[T] → Coll[T] PerItem (append)
startsWith (Coll[T], Coll[T]) → Boolean PerItem (zip)
endsWith (Coll[T], Coll[T]) → Boolean PerItem (zip)
get (Coll[T], Int) → Option[T] FixedCost(14)
patch - Replace Slice
/// Replace elements from index `from`, removing `replaced` elements,
/// inserting `patch` collection in their place.
///
/// xs.patch(from, patch, replaced):
/// result = xs[0..from] ++ patch ++ xs[from+replaced..]
///
/// Cost: PerItemCost(30, 2, 10) based on xs.len + patch.len
pub fn patch(
comptime T: type,
xs: []const T,
from: i32,
patchColl: []const T,
replaced: i32,
allocator: Allocator,
) ![]T {
if (from < 0) return error.IndexOutOfBounds;
const from_idx: usize = @intCast(from);
if (from_idx > xs.len) return error.IndexOutOfBounds;
const replaced_count: usize = if (replaced < 0)
0
else
@min(@as(usize, @intCast(replaced)), xs.len - from_idx);
// Result length: original - replaced + patch
const result_len = xs.len - replaced_count + patchColl.len;
var result = try allocator.alloc(T, result_len);
// Copy prefix [0..from]
@memcpy(result[0..from_idx], xs[0..from_idx]);
// Copy patch
@memcpy(result[from_idx..][0..patchColl.len], patchColl);
// Copy suffix [from+replaced..]
const suffix_start = from_idx + replaced_count;
const suffix_dest = from_idx + patchColl.len;
@memcpy(result[suffix_dest..], xs[suffix_start..]);
return result;
}
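patch is equivalent to a slice-splice, which Python states in one expression and which makes the boundary cases (append at the end, pure deletion) easy to verify:

```python
def patch(xs, start, patch_coll, replaced):
    """xs[:start] ++ patch_coll ++ xs[start + replaced:], clamping `replaced`."""
    if not (0 <= start <= len(xs)):
        raise IndexError("from out of bounds")
    replaced = max(0, min(replaced, len(xs) - start))
    return xs[:start] + patch_coll + xs[start + replaced:]

assert patch([1, 2, 3, 4, 5], 1, [9, 9], 2) == [1, 9, 9, 4, 5]
assert patch([1, 2, 3], 3, [7], 0) == [1, 2, 3, 7]   # insert at the end
assert patch([1, 2, 3], 0, [], 2) == [3]             # pure deletion
print("patch slice-splice semantics hold")
```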
updated - Single Element Update
/// Return a new collection with element at index replaced.
/// Immutable operation - original collection unchanged.
///
/// Cost: PerItemCost(20, 1, 10)
pub fn updated(
comptime T: type,
xs: []const T,
idx: i32,
value: T,
allocator: Allocator,
) ![]T {
if (idx < 0) return error.IndexOutOfBounds;
const index: usize = @intCast(idx);
if (index >= xs.len) return error.IndexOutOfBounds;
var result = try allocator.dupe(T, xs);
result[index] = value;
return result;
}
updateMany - Batch Update
/// Update multiple elements at specified indices.
/// indexes and updates must have the same length.
///
/// Cost: PerItemCost(20, 2, 10)
pub fn updateMany(
comptime T: type,
xs: []const T,
indexes: []const i32,
updates: []const T,
allocator: Allocator,
) ![]T {
if (indexes.len != updates.len) {
return error.LengthMismatch;
}
// Validate all indexes first
for (indexes) |idx| {
if (idx < 0) return error.IndexOutOfBounds;
if (@as(usize, @intCast(idx)) >= xs.len) return error.IndexOutOfBounds;
}
var result = try allocator.dupe(T, xs);
for (indexes, updates) |idx, val| {
result[@intCast(idx)] = val;
}
return result;
}
reverse, startsWith, endsWith, get
/// Reverse collection order
/// Cost: Same as append (PerItem)
pub fn reverse(comptime T: type, xs: []const T, allocator: Allocator) ![]T {
var result = try allocator.alloc(T, xs.len);
for (xs, 0..) |x, i| {
result[xs.len - 1 - i] = x;
}
return result;
}
/// Check if collection starts with prefix
/// Cost: Same as zip (PerItem based on prefix length)
pub fn startsWith(comptime T: type, xs: []const T, prefix: []const T) bool {
if (prefix.len > xs.len) return false;
return std.mem.eql(T, xs[0..prefix.len], prefix);
}
/// Check if collection ends with suffix
/// Cost: Same as zip (PerItem based on suffix length)
pub fn endsWith(comptime T: type, xs: []const T, suffix: []const T) bool {
if (suffix.len > xs.len) return false;
return std.mem.eql(T, xs[xs.len - suffix.len ..], suffix);
}
/// Safe element access returning Option
/// Returns null if index out of bounds (instead of error)
/// Cost: FixedCost(14)
pub fn get(comptime T: type, xs: []const T, idx: i32) ?T {
if (idx < 0) return null;
const index: usize = @intCast(idx);
if (index >= xs.len) return null;
return xs[index];
}
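The Option-returning accessor and the prefix/suffix checks translate directly to Python, which is useful for quick property tests; `get` here is an illustrative stand-in, not an interpreter API:

```python
def get(xs, idx):
    """Option-style access: None instead of an error when out of bounds."""
    return xs[idx] if 0 <= idx < len(xs) else None

xs = [10, 20, 30]
assert get(xs, 1) == 20
assert get(xs, -1) is None       # negative index is out of bounds here
assert get(xs, 3) is None
# startsWith / endsWith reduce to slice equality:
assert xs[:2] == [10, 20]        # startsWith([10, 20])
assert xs[-2:] == [20, 30]       # endsWith([20, 30])
print("safe access and prefix/suffix checks hold")
```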
Autolykos2 Proof-of-Work
v6 exposes proof-of-work verification in ErgoScript through Header.checkPow() and Global.powHit() [9, 10]. Autolykos2 is Ergo's memory-hard, ASIC-resistant PoW algorithm designed for fair GPU mining.
Algorithm Overview
Autolykos2 Structure
═══════════════════════════════════════════════════════════════════
Parameters:
N = 2²⁶ ≈ 67 million Table size (memory requirement)
k = 32 Elements to sum per solution
n = 26 log₂(N)
Memory: N × 32 bytes ≈ 2 GB table
Algorithm:
1. Seed table from height (changes every ~1024 blocks)
2. For each nonce attempt:
a. Compute 32 pseudo-random indices from (msg, nonce)
b. Sum the 32 table elements at those indices
c. Hash (msg || nonce || sum) to get PoW hit
d. If hit < target, solution found
Implementation
/// Autolykos2 proof-of-work algorithm constants and functions
const Autolykos2 = struct {
/// Table size: 2^26 elements
pub const N: u32 = 1 << 26;
/// Elements summed per solution attempt
pub const K: u32 = 32;
/// Bits in N (log2(N))
pub const N_BITS: u5 = 26;
/// Element size in bytes
pub const ELEMENT_SIZE: usize = 32;
/// Total table memory requirement
pub const TABLE_SIZE: usize = N * ELEMENT_SIZE; // ~2 GB
/// Epoch length for table seed rotation
pub const EPOCH_LENGTH: u32 = 1024;
/// Compute the PoW hit value for a header
/// Returns BigInt256 that must be < target (from nBits)
///
/// Cost: ~700 JitCost (multiple Blake2b256 computations)
pub fn powHit(
header_without_pow: []const u8,
nonce: u64,
height: u32,
) BigInt256 {
// Step 1: Compute message hash
const msg = Blake2b256.hash(header_without_pow);
// Step 2: Generate table seed from height epoch
const seed = computeTableSeed(height);
// Step 3: Compute k-sum of table elements
var sum = UnsignedBigInt256.ZERO;
var nonce_bytes: [8]u8 = undefined;
std.mem.writeInt(u64, &nonce_bytes, nonce, .big);
for (0..K) |i| {
// Derive index from hash(msg || nonce || i)
var index_input: [32 + 8 + 4]u8 = undefined;
@memcpy(index_input[0..32], &msg);
@memcpy(index_input[32..40], &nonce_bytes);
std.mem.writeInt(u32, index_input[40..44], @intCast(i), .big);
const index_hash = Blake2b256.hash(&index_input);
const idx = std.mem.readInt(u32, index_hash[0..4], .big) % N;
// Look up table element
const element = computeTableElement(seed, idx);
sum = addUnchecked(sum, element);
}
// Step 4: Final hash to get PoW hit
var final_input: [32 + 8 + 32]u8 = undefined;
const sum_bytes = sum.toBytes(); // bind to a local; taking &sum.toBytes() directly is not a valid lvalue
@memcpy(final_input[0..32], &msg);
@memcpy(final_input[32..40], &nonce_bytes);
@memcpy(final_input[40..72], &sum_bytes);
const hit_hash = Blake2b256.hash(&final_input);
return BigInt256.fromBytes(hit_hash);
}
/// Compute table seed from block height
/// Seed changes every EPOCH_LENGTH blocks to prevent precomputation
fn computeTableSeed(height: u32) [32]u8 {
const epoch = height / EPOCH_LENGTH;
var epoch_bytes: [4]u8 = undefined;
std.mem.writeInt(u32, &epoch_bytes, epoch, .big);
return Blake2b256.hash(&epoch_bytes);
}
/// Compute table element at given index
/// This is the memory-hard part - miners must store or recompute
fn computeTableElement(seed: [32]u8, idx: u32) UnsignedBigInt256 {
// Element = H(seed || idx || 0) || H(seed || idx || 1) || ...
// Combined to form 256-bit value
var result: [32]u8 = undefined;
for (0..4) |chunk| {
var input: [32 + 4 + 1]u8 = undefined;
@memcpy(input[0..32], &seed);
std.mem.writeInt(u32, input[32..36], idx, .big);
input[36] = @intCast(chunk);
const chunk_hash = Blake2b256.hash(&input);
@memcpy(result[chunk * 8 ..][0..8], chunk_hash[0..8]);
}
return UnsignedBigInt256.fromBytes(result);
}
};
Header.checkPow
/// Verify that a block header satisfies the PoW difficulty requirement
///
/// checkPow() returns true iff powHit(header) < decodeNBits(header.nBits)
///
/// Cost: FixedCost(700) - approximately 2×32 hash computations
pub fn checkPow(header: Header) bool {
// Serialize header without PoW solution
const header_bytes = header.serializeWithoutPow();
// Compute PoW hit
const hit = Autolykos2.powHit(
header_bytes,
header.powSolutions.n, // nonce
header.height,
);
// Decode difficulty target from nBits
const target = NBits.decode(header.nBits);
// Valid if hit < target
return hit.lessThan(target);
}
Why Memory-Hard?
Autolykos2's memory requirement (~2GB) provides ASIC resistance:
- Table storage: Miners must maintain the full table in fast memory
- Random access: k=32 random lookups per attempt prevents caching tricks
- Epoch rotation: Table changes every ~1024 blocks, invalidating precomputation
- GPU-friendly: Memory bandwidth is the bottleneck, favoring commodity GPUs
NBits Difficulty Encoding
The nBits field in block headers uses a compact encoding for the difficulty target [11]:
NBits Format
═══════════════════════════════════════════════════════════════════
Format: 0xAABBBBBB (4 bytes)
AA = exponent (1 byte)
BBBBBB = mantissa (3 bytes)
Value = mantissa × 256^(exponent - 3)
Example:
nBits = 0x1d00ffff
exponent = 0x1d = 29
mantissa = 0x00ffff = 65535
target = 65535 × 256^(29-3) = 65535 × 256^26
Implementation
const NBits = struct {
/// Decode nBits to BigInt target
/// Cost: FixedCost(10)
pub fn decode(nBits: i64) BigInt256 {
const n = @as(u32, @intCast(nBits & 0xFFFFFFFF));
const exp: u8 = @intCast((n >> 24) & 0xFF);
const mantissa: u32 = n & 0x00FFFFFF;
if (exp <= 3) {
// Small exponent: right shift mantissa
const shift = (3 - exp) * 8;
return BigInt256.fromInt(mantissa >> @intCast(shift));
} else {
// Normal case: left shift mantissa
const shift = (exp - 3) * 8;
return BigInt256.fromInt(mantissa).shiftLeft(shift);
}
}
/// Encode BigInt to nBits
/// Cost: FixedCost(10)
pub fn encode(target: BigInt256) i64 {
// Find the byte length of target
const bytes = target.toBytes();
var byte_len: u8 = 32;
for (bytes) |b| {
if (b != 0) break;
byte_len -= 1;
}
if (byte_len == 0) return 0;
// Extract top 3 bytes as mantissa
const start = 32 - byte_len;
var mantissa: u32 = 0;
if (byte_len >= 3) {
mantissa = (@as(u32, bytes[start]) << 16) |
(@as(u32, bytes[start + 1]) << 8) |
@as(u32, bytes[start + 2]);
} else if (byte_len == 2) {
mantissa = (@as(u32, bytes[start]) << 8) |
@as(u32, bytes[start + 1]);
} else {
mantissa = bytes[start];
}
// Handle sign bit in mantissa (MSB must be 0)
if (mantissa & 0x00800000 != 0) {
mantissa >>= 8;
byte_len += 1;
}
return @as(i64, byte_len) << 24 | @as(i64, mantissa);
}
};
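The encoding is lossy (only three mantissa bytes survive), but it round-trips on already-encoded targets. A Python model of both directions, including the sign-bit normalization, reproduces the worked 0x1d00ffff example above:

```python
def decode_nbits(nbits: int) -> int:
    """mantissa * 256^(exponent - 3), with right shift for small exponents."""
    exp = (nbits >> 24) & 0xFF
    mantissa = nbits & 0x00FFFFFF
    if exp <= 3:
        return mantissa >> (8 * (3 - exp))
    return mantissa << (8 * (exp - 3))

def encode_nbits(target: int) -> int:
    """Keep the top 3 bytes as mantissa, byte length as exponent."""
    if target == 0:
        return 0
    byte_len = (target.bit_length() + 7) // 8
    if byte_len <= 3:
        mantissa = target << (8 * (3 - byte_len))
    else:
        mantissa = target >> (8 * (byte_len - 3))
    if mantissa & 0x00800000:        # top mantissa bit is reserved for sign
        mantissa >>= 8
        byte_len += 1
    return (byte_len << 24) | mantissa

assert decode_nbits(0x1D00FFFF) == 0xFFFF * 256**26   # the worked example
assert encode_nbits(decode_nbits(0x1D00FFFF)) == 0x1D00FFFF
print("nBits decode/encode round-trip OK")
```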
Global Serialization Methods
v6 adds methods to Global for value serialization [12]:
serialize
/// Serialize any value to bytes using SigmaSerializer
/// Works for all serializable types
/// Cost: Varies by type complexity
pub fn serialize(comptime T: type, value: T, allocator: Allocator) ![]const u8 {
var buffer = std.ArrayList(u8).init(allocator); // allocator is now a parameter, not an implicit global
errdefer buffer.deinit();
try SigmaSerializer.serialize(T, value, buffer.writer());
return buffer.toOwnedSlice();
}
fromBigEndianBytes
/// Deserialize numeric type from big-endian bytes
/// Generic over numeric types
/// Cost: FixedCost(5) for primitives
pub fn fromBigEndianBytes(comptime T: type, bytes: []const u8) !T {
const size = @sizeOf(T);
if (bytes.len != size) return error.InvalidLength;
var arr: [size]u8 = undefined;
@memcpy(&arr, bytes);
return std.mem.readInt(T, &arr, .big);
}
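For fixed-width integers this is exactly big-endian byte decoding, which Python exposes directly and which makes convenient spot checks for test vectors:

```python
# Big-endian parsing of fixed-width integers, unsigned and signed views:
assert int.from_bytes(b"\x01\x00", "big") == 256
assert int.from_bytes(b"\xff\xff", "big", signed=True) == -1
# And the inverse direction, for building test vectors:
assert (2**31 - 1).to_bytes(4, "big") == b"\x7f\xff\xff\xff"
print("big-endian round-trip checks pass")
```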
Cost Model for v6 Operations
v6 Operation Costs
═══════════════════════════════════════════════════════════════════
Operation Cost Type Value Notes
─────────────────────────────────────────────────────────────────────
Modular Arithmetic:
mod(a, m) Fixed 20 Division
modInverse(a, m) Fixed 150 Extended Euclid
plusMod(a, b, m) Fixed 30 Add + mod
subtractMod(a, b, m) Fixed 30 Sub + mod
multiplyMod(a, b, m) Fixed 40 Mul + mod
Bitwise (all types):
bitwiseInverse(x) Fixed 5 Single op
bitwiseOr(x, y) Fixed 5 Single op
bitwiseAnd(x, y) Fixed 5 Single op
bitwiseXor(x, y) Fixed 5 Single op
shiftLeft(x, n) Fixed 5 Single op
shiftRight(x, n) Fixed 5 Single op
toBytes(x) Fixed 5 Conversion
toBits(x) Fixed 5 Conversion
Collections:
patch(xs, from, p, r) PerItem(30,2,10) O(n)
updated(xs, idx, v) PerItem(20,1,10) O(n) copy
updateMany(xs, is, vs) PerItem(20,2,10) O(n)
reverse(xs) PerItem (append) O(n)
startsWith(xs, p) PerItem (zip) O(|p|)
endsWith(xs, s) PerItem (zip) O(|s|)
get(xs, idx) Fixed 14 O(1)
Cryptographic:
expUnsigned(g, k) Fixed 900 Scalar mult
checkPow(header) Fixed 700 ~32 hashes
powHit(...) Dynamic Autolykos2
Serialization:
serialize(v) Varies Type-dependent
fromBigEndianBytes(b) Fixed 5 Simple parse
encodeNBits(n) Fixed 10 Encoding
decodeNBits(n) Fixed 10 Decoding
Migration Guide
Version Checking in Scripts
// ErgoScript: Check v6 availability
val canUseV6 = getVar[Boolean](127).getOrElse(false)
// Conditional v6 feature usage
if (canUseV6) {
// Use v6 features
val x: UnsignedBigInt = ...
val inv = x.modInverse(p)
} else {
// Fallback for pre-v6
}
When to Use v6 Features
| Feature | Use When |
|---|---|
| SUnsignedBigInt | Cryptographic protocols requiring modular arithmetic |
| modInverse | Computing multiplicative inverses in finite fields |
| Bitwise ops | Bit manipulation, flags, compact encodings |
| patch/updated | Immutable collection updates in contracts |
| get | Safe array access without exceptions |
| checkPow | On-chain PoW verification for sidechains/merged mining |
Backward Compatibility
- v6 features are only available when VersionContext.isV6Activated() returns true
- Scripts using v6 features will fail validation on pre-v6 nodes
- Design scripts with fallback paths for pre-v6 compatibility during transition
Summary
This chapter covered the v6 protocol features that expand ErgoTree's capabilities:
- SUnsignedBigInt provides 256-bit unsigned integers for cryptographic modular arithmetic, with six new methods (mod, modInverse, plusMod, subtractMod, multiplyMod, toSigned)
- Bitwise operations (AND, OR, XOR, NOT, shifts) are now available on all numeric types with consistent semantics and low cost (5 JitCost each)
- Collection methods (patch, updated, updateMany, reverse, startsWith, endsWith, get) enable efficient immutable collection manipulation
- Autolykos2 PoW verification is exposed through Header.checkPow() and Global.powHit(), enabling on-chain proof-of-work validation
- NBits encoding provides compact difficulty target representation with encodeNBits/decodeNBits
- Serialization methods (Global.serialize, fromBigEndianBytes) support arbitrary value serialization
- The cost model assigns a cost to each operation, with modInverse (150) and checkPow (700) among the most expensive due to their computational complexity
Previous: Chapter 31 | Next: Appendix A
1. Scala: CUnsignedBigInt.scala
2. Rust: unsignedbigint256.rs
3. Scala: methods.scala:570-625 (SUnsignedBigIntMethods)
4. Rust: snumeric.rs:381-491
5. Scala: methods.scala (Bitwise method definitions)
6. Rust: snumeric.rs:73-264
7. Scala: Colls.scala (Collection trait)
8. Rust: scoll.rs:140-266
9. Ergo: Autolykos PoW
10. Rust: Header type
11. Bitcoin Wiki: Difficulty
12. Scala: sglobal.scala (SGlobalMethods)
Appendix A: Complete Type Code Table
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for all type codes used in ErgoTree serialization [1, 2].
Type Code Ranges
Type Code Organization
══════════════════════════════════════════════════════════════════
Range Usage
─────────────────────────────────────────────────────────────────
0x00 Reserved (invalid)
0x01-0x09 Primitive types (embeddable)
0x0A-0x0B Reserved
0x0C Collection type constructor
0x0D-0x17 Reserved
0x18 Nested collection (Coll[Coll[T]])
0x19-0x23 Reserved
0x24 Option type constructor
0x25-0x3B Reserved
0x3C-0x5F Pair type constructors
0x60 Tuple type constructor
0x61-0x6A Object types (non-embeddable)
0x6B-0x6F Reserved for future object types
Primitive Types (Embeddable)
Embeddable types can be used as element types in collections (Coll[T], Option[T]) and have compact type code encodings: an embeddable argument is folded into the constructor's type code by addition rather than serialized separately. For example, Coll[Int] is encoded as the single byte 0x0C + 0x04 = 0x10, and Coll[Byte] as 0x0C + 0x02 = 0x0E.
| Dec | Hex | Type | Size | Zig Type |
|---|---|---|---|---|
| 1 | 0x01 | SBoolean | 1 bit | bool |
| 2 | 0x02 | SByte | 8 bits | i8 |
| 3 | 0x03 | SShort | 16 bits | i16 |
| 4 | 0x04 | SInt | 32 bits | i32 |
| 5 | 0x05 | SLong | 64 bits | i64 |
| 6 | 0x06 | SBigInt | 256 bits | BigInt256 |
| 7 | 0x07 | SGroupElement | 33 bytes | Ecp (compressed) |
| 8 | 0x08 | SSigmaProp | variable | SigmaBoolean |
| 9 | 0x09 | SUnsignedBigInt | 256 bits | UnsignedBigInt256 |
Object Types
| Dec | Hex | Type | Description |
|---|---|---|---|
| 97 | 0x61 | SAny | Supertype of all types |
| 98 | 0x62 | SUnit | Unit type (singleton) |
| 99 | 0x63 | SBox | Transaction box |
| 100 | 0x64 | SAvlTree | Authenticated dictionary |
| 101 | 0x65 | SContext | Execution context |
| 102 | 0x66 | SString | String (ErgoScript only) |
| 103 | 0x67 | STypeVar | Type variable (internal) |
| 104 | 0x68 | SHeader | Block header |
| 105 | 0x69 | SPreHeader | Block pre-header |
| 106 | 0x6A | SGlobal | Global object (SigmaDslBuilder) |
Type Constructors
| Dec | Hex | Constructor | Example | Serialized As |
|---|---|---|---|---|
| 12 | 0x0C | SColl | Coll[Byte] | 0x0C + 0x02 = 0x0E |
| 24 | 0x18 | Nested SColl | Coll[Coll[Int]] | 0x18 + 0x04 = 0x1C |
| 36 | 0x24 | SOption | Option[Long] | 0x24 + 0x05 = 0x29 |
| 60 | 0x3C | Pair (first embeddable) | (Byte, _) | 0x3C + 0x02, then second type |
| 72 | 0x48 | Pair (second embeddable) | (_, Int) | 0x48 + 0x04, then first type |
| 84 | 0x54 | Pair (symmetric) | (Long, Long) | 0x54 + 0x05 = 0x59 |
| 96 | 0x60 | STuple | (Int, Boolean, ...) | 0x60, then length, then types |
Zig Type Definition
const TypeCode = enum(u8) {
// Primitive types
boolean = 0x01,
byte = 0x02,
short = 0x03,
int = 0x04,
long = 0x05,
big_int = 0x06,
group_element = 0x07,
sigma_prop = 0x08,
unsigned_big_int = 0x09,
// Type constructors
coll = 0x0C,
nested_coll = 0x18,
option = 0x24,
pair_first = 0x3C,
pair_second = 0x48,
pair_symmetric = 0x54,
tuple = 0x60,
// Object types
any = 0x61,
unit = 0x62,
box = 0x63,
avl_tree = 0x64,
context = 0x65,
string = 0x66,
type_var = 0x67,
header = 0x68,
pre_header = 0x69,
global = 0x6A,
pub fn isPrimitive(self: TypeCode) bool {
return @intFromEnum(self) >= 0x01 and @intFromEnum(self) <= 0x09;
}
pub fn isEmbeddable(self: TypeCode) bool {
return self.isPrimitive();
}
pub fn isNumeric(self: TypeCode) bool {
return switch (self) {
.byte, .short, .int, .long, .big_int, .unsigned_big_int => true,
else => false,
};
}
};
Type Serialization
const SType = union(enum) {
boolean,
byte,
short,
int,
long,
big_int,
group_element,
sigma_prop,
unsigned_big_int,
coll: *const SType,
option: *const SType,
tuple: []const SType,
box,
avl_tree,
context,
header,
pre_header,
global,
unit,
any,
/// Base constructor code. During serialization, an embeddable element
/// code is added to this base to form the final byte (see Encoding Rules).
pub fn typeCode(self: *const SType) u8 {
return switch (self.*) {
.boolean => 0x01,
.byte => 0x02,
.short => 0x03,
.int => 0x04,
.long => 0x05,
.big_int => 0x06,
.group_element => 0x07,
.sigma_prop => 0x08,
.unsigned_big_int => 0x09,
.coll => |elem| blk: {
if (elem.* == .coll) break :blk 0x18;
break :blk 0x0C;
},
.option => 0x24,
.tuple => 0x60,
.box => 0x63,
.avl_tree => 0x64,
.context => 0x65,
.header => 0x68,
.pre_header => 0x69,
.global => 0x6A,
.unit => 0x62,
.any => 0x61,
};
}
};
Encoding Rules
Type Encoding Examples
══════════════════════════════════════════════════════════════════
Simple Types:
SInt → [0x04]
SBoolean → [0x01]
SGroupElement → [0x07]
Collections (embedded element code is added to the base code):
Coll[Byte] → [0x0E] (0x0C + 0x02)
Coll[Int] → [0x10] (0x0C + 0x04)
Coll[Coll[Byte]] → [0x1A] (0x18 + 0x02)
Coll[Box] → [0x0C, 0x63] (non-embeddable element follows)
Options:
Option[Int] → [0x28] (0x24 + 0x04)
Option[Box] → [0x24, 0x63] (non-embeddable element follows)
Tuples (2 elements):
(Int, Int) → [0x58] (0x54 + 0x04, symmetric)
(Int, Long) → [0x40, 0x05] (0x3C + 0x04, then Long)
(Long, Int) → [0x41, 0x04] (0x3C + 0x05, then Int)
(Box, Int) → [0x4C, 0x63] (0x48 + 0x04, then Box)
Tuples (3+ elements):
(Int, Long, Byte) → [0x60, 0x03, 0x04, 0x05, 0x02]
(tuple + len + int + long + byte)
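The embedding rule above can be cross-checked with a small Python sketch. This is not part of any implementation; the type representation (plain ints for atomic codes, tagged tuples for constructors) and all helper names are ours, and the sketch mirrors the serializer logic described above under those assumptions.

```python
# Hypothetical mini-encoder for ErgoTree type codes, mirroring the
# embedding rule: embeddable element codes are ADDED to the base
# constructor code; non-embeddable elements follow as separate bytes.
COLL, NESTED_COLL, OPTION, OPTION_COLL = 0x0C, 0x18, 0x24, 0x30
PAIR1, PAIR2, PAIR_SYM, TUPLE = 0x3C, 0x48, 0x54, 0x60

def embeddable(t):
    # Primitive (embeddable) codes occupy 0x01-0x09
    return isinstance(t, int) and 0x01 <= t <= 0x09

def encode(t):
    """t is an int (atomic code) or ("coll"/"option"/"tuple", payload)."""
    if isinstance(t, int):
        return [t]
    kind, payload = t
    if kind == "coll":
        if embeddable(payload):
            return [COLL + payload]            # Coll[Byte] -> 0x0E
        if isinstance(payload, tuple) and payload[0] == "coll" and embeddable(payload[1]):
            return [NESTED_COLL + payload[1]]  # Coll[Coll[Byte]] -> 0x1A
        return [COLL] + encode(payload)        # Coll[Box] -> 0x0C 0x63
    if kind == "option":
        if embeddable(payload):
            return [OPTION + payload]          # Option[Int] -> 0x28
        if isinstance(payload, tuple) and payload[0] == "coll" and embeddable(payload[1]):
            return [OPTION_COLL + payload[1]]  # Option[Coll[Byte]] -> 0x32
        return [OPTION] + encode(payload)
    # kind == "tuple"
    items = payload
    if len(items) == 2:
        a, b = items
        if embeddable(a) and a == b:
            return [PAIR_SYM + a]              # (Int, Int) -> 0x58
        if embeddable(a):
            return [PAIR1 + a] + encode(b)     # (Int, Long) -> 0x40 0x05
        if embeddable(b):
            return [PAIR2 + b] + encode(a)     # (Box, Int) -> 0x4C 0x63
        return [PAIR1] + encode(a) + encode(b) # generic pair
    return [TUPLE, len(items)] + [c for i in items for c in encode(i)]
```

For instance, `encode(("coll", 0x02))` yields `[0x0E]`, the familiar Coll[Byte] prefix seen in serialized register values.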
Constants
const TypeConstants = struct {
/// First type code for primitive types
pub const FIRST_PRIMITIVE_TYPE: u8 = 0x01;
/// Last type code for primitive types
pub const LAST_PRIMITIVE_TYPE: u8 = 0x09;
/// Maximum supported type code
pub const MAX_TYPE_CODE: u8 = 0x6A;
/// Last data type (can be serialized as data)
pub const LAST_DATA_TYPE: u8 = 111; // 0x6F
};
Previous: Chapter 31 | Next: Appendix B
Scala: SType.scala
Rust: stype.rs:28-76
Appendix B: Complete Opcode Table
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for all operation codes used in ErgoTree serialization.
Opcode Ranges
Opcode Space Organization
══════════════════════════════════════════════════════════════════
Range Usage Encoding
─────────────────────────────────────────────────────────────────
0x00 Reserved (invalid) -
0x01-0x6F Data types (constants) Type code directly
0x70 LastConstantCode boundary 112
0x71-0xFF Operations LastConstantCode + shift
Operation Categories:
─────────────────────────────────────────────────────────────────
0x71-0x79 Variables & references ValUse, ConstPlaceholder
0x7A-0x7E Type conversions Upcast, Downcast
0x7F-0x8C Constants & tuples True, False, Tuple
0x8F-0x98 Relations & logic Lt, Gt, Eq, And, Or
0x99-0xA2 Arithmetic Plus, Minus, Multiply
0xA3-0xAC Context access HEIGHT, INPUTS, OUTPUTS
0xAD-0xB8 Collection operations Map, Filter, Fold
0xC1-0xC7 Box extraction ExtractAmount, ExtractId
0xCB-0xD5 Crypto & serialization Blake2b, ProveDlog
0xD6-0xE6 Blocks, functions & options ValDef, FuncValue, GetVar
0xE7-0xE9 Modular arithmetic (deprecated) ModQ, PlusModQ
0xEA-0xEB Sigma operations SigmaAnd, SigmaOr
0xEC-0xFF Bitwise & misc BitOr, BitAnd, XorOf
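The constant/operation split above can be cross-checked with a few lines of Python. This is a behavioral sketch, not implementation code: a serialized byte of 0x00 is invalid, a byte up to LastConstantCode (0x70) begins a constant whose type code is that byte, and anything above is an operation.

```python
# Classify a leading serialized byte per the opcode space layout above.
LAST_CONSTANT_CODE = 0x70  # 112, boundary between constants and operations

def classify(byte):
    if byte == 0x00:
        return "invalid"
    if byte <= LAST_CONSTANT_CODE:
        return "constant"   # byte is a data type code starting a constant
    return "operation"

assert classify(0x04) == "constant"    # SInt constant follows
assert classify(0x93) == "operation"   # Eq
assert classify(0x00) == "invalid"
```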
Zig Opcode Definition
const OpCode = enum(u8) {
// Data type codes occupy 0x01-0x6F; 0x70 is LastConstantCode
// Operations start at LAST_CONSTANT_CODE + 1 = 113 (0x71)
// Variable references
tagged_variable = 0x71, // Context variable by ID
val_use = 0x72, // Reference to ValDef binding
constant_placeholder = 0x73, // Segregated constant reference
subst_constants = 0x74, // Substitute constants in tree
// Type conversions
long_to_byte_array = 0x7A,
byte_array_to_bigint = 0x7B,
byte_array_to_long = 0x7C,
downcast = 0x7D,
upcast = 0x7E,
// Primitive constants
true_const = 0x7F,
false_const = 0x80,
unit_constant = 0x81,
group_generator = 0x82,
// Collection & tuple construction
concrete_collection = 0x83,
concrete_collection_bool = 0x85,
tuple = 0x86,
select_1 = 0x87,
select_2 = 0x88,
select_3 = 0x89,
select_4 = 0x8A,
select_5 = 0x8B,
select_field = 0x8C,
// Relational operations
lt = 0x8F,
le = 0x90,
gt = 0x91,
ge = 0x92,
eq = 0x93,
neq = 0x94,
// Control flow & logic
if_op = 0x95,
and_op = 0x96,
or_op = 0x97,
atleast = 0x98,
// Arithmetic
minus = 0x99,
plus = 0x9A,
xor = 0x9B,
multiply = 0x9C,
division = 0x9D,
modulo = 0x9E,
exponentiate = 0x9F,
multiply_group = 0xA0,
min = 0xA1,
max = 0xA2,
// Context access
height = 0xA3,
inputs = 0xA4,
outputs = 0xA5,
last_block_utxo_root_hash = 0xA6,
self_box = 0xA7,
miner_pubkey = 0xAC,
// Collection operations
map_collection = 0xAD,
exists = 0xAE,
forall = 0xAF,
fold = 0xB0,
size_of = 0xB1,
by_index = 0xB2,
append = 0xB3,
slice = 0xB4,
filter = 0xB5,
avl_tree = 0xB6,
avl_tree_get = 0xB7,
flat_map = 0xB8,
// Box extraction
extract_amount = 0xC1,
extract_script_bytes = 0xC2,
extract_bytes = 0xC3,
extract_bytes_with_no_ref = 0xC4,
extract_id = 0xC5,
extract_register_as = 0xC6,
extract_creation_info = 0xC7,
// Cryptographic operations
calc_blake2b256 = 0xCB,
calc_sha256 = 0xCC,
prove_dlog = 0xCD,
prove_diffie_hellman_tuple = 0xCE,
sigma_prop_is_proven = 0xCF,
sigma_prop_bytes = 0xD0,
bool_to_sigma_prop = 0xD1,
trivial_prop_false = 0xD2,
trivial_prop_true = 0xD3,
// Deserialization
deserialize_context = 0xD4,
deserialize_register = 0xD5,
// Block & function definitions
val_def = 0xD6,
fun_def = 0xD7,
block_value = 0xD8,
func_value = 0xD9,
func_apply = 0xDA,
property_call = 0xDB,
method_call = 0xDC,
global = 0xDD,
// Option operations
some_value = 0xDE,
none_value = 0xDF,
get_var = 0xE3,
option_get = 0xE4,
option_get_or_else = 0xE5,
option_is_defined = 0xE6,
// Modular arithmetic (deprecated in v5+)
mod_q = 0xE7,
plus_mod_q = 0xE8,
minus_mod_q = 0xE9,
// Sigma operations
sigma_and = 0xEA,
sigma_or = 0xEB,
// Binary operations
bin_or = 0xEC,
bin_and = 0xED,
decode_point = 0xEE,
logical_not = 0xEF,
negation = 0xF0,
// Bitwise operations
bit_inversion = 0xF1,
bit_or = 0xF2,
bit_and = 0xF3,
bin_xor = 0xF4,
bit_xor = 0xF5,
bit_shift_right = 0xF6,
bit_shift_left = 0xF7,
bit_shift_right_zeroed = 0xF8,
// Collection bitwise operations
coll_shift_right = 0xF9,
coll_shift_left = 0xFA,
coll_shift_right_zeroed = 0xFB,
coll_rotate_left = 0xFC,
coll_rotate_right = 0xFD,
// Misc
context = 0xFE,
xor_of = 0xFF,
pub fn isConstant(code: u8) bool {
return code >= 0x01 and code <= 0x70;
}
pub fn isOperation(code: u8) bool {
return code > 0x70;
}
pub fn fromShift(shift: u8) OpCode {
return @enumFromInt(0x70 + shift);
}
};
Variable & Reference Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x71 | 113 | TaggedVariable | Reference context variable by ID |
| 0x72 | 114 | ValUse | Use value defined by ValDef |
| 0x73 | 115 | ConstantPlaceholder | Reference segregated constant |
| 0x74 | 116 | SubstConstants | Substitute constants in tree |
Type Conversion Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x7A | 122 | LongToByteArray | Long → Coll[Byte] (big-endian) |
| 0x7B | 123 | ByteArrayToBigInt | Coll[Byte] → BigInt |
| 0x7C | 124 | ByteArrayToLong | Coll[Byte] → Long |
| 0x7D | 125 | Downcast | Numeric downcast (may overflow) |
| 0x7E | 126 | Upcast | Numeric upcast (always safe) |
Constants & Tuples
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x7F | 127 | True | Boolean true constant |
| 0x80 | 128 | False | Boolean false constant |
| 0x81 | 129 | UnitConstant | Unit () value |
| 0x82 | 130 | GroupGenerator | EC generator point G |
| 0x83 | 131 | ConcreteCollection | Coll construction |
| 0x85 | 133 | ConcreteCollectionBool | Optimized Coll[Boolean] |
| 0x86 | 134 | Tuple | Tuple construction |
| 0x87-0x8B | 135-139 | Select1-5 | Tuple element access |
| 0x8C | 140 | SelectField | Select by field index |
Relational & Logic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x8F | 143 | Lt | Less than (<) |
| 0x90 | 144 | Le | Less or equal (≤) |
| 0x91 | 145 | Gt | Greater than (>) |
| 0x92 | 146 | Ge | Greater or equal (≥) |
| 0x93 | 147 | Eq | Equal (==) |
| 0x94 | 148 | Neq | Not equal (≠) |
| 0x95 | 149 | If | If-then-else |
| 0x96 | 150 | And | Logical AND (&&) |
| 0x97 | 151 | Or | Logical OR (||) |
| 0x98 | 152 | AtLeast | k-of-n threshold |
Arithmetic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0x99 | 153 | Minus | Subtraction |
| 0x9A | 154 | Plus | Addition |
| 0x9B | 155 | Xor | Byte-array XOR |
| 0x9C | 156 | Multiply | Multiplication |
| 0x9D | 157 | Division | Integer division |
| 0x9E | 158 | Modulo | Remainder |
| 0x9F | 159 | Exponentiate | BigInt exponentiation |
| 0xA0 | 160 | MultiplyGroup | EC point multiplication |
| 0xA1 | 161 | Min | Minimum |
| 0xA2 | 162 | Max | Maximum |
Context Access Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xA3 | 163 | Height | Current block height |
| 0xA4 | 164 | Inputs | Transaction inputs (INPUTS) |
| 0xA5 | 165 | Outputs | Transaction outputs (OUTPUTS) |
| 0xA6 | 166 | LastBlockUtxoRootHash | UTXO tree root hash |
| 0xA7 | 167 | Self | Current box (SELF) |
| 0xAC | 172 | MinerPubkey | Miner's public key |
| 0xFE | 254 | Context | Context object |
Collection Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xAD | 173 | MapCollection | Transform elements |
| 0xAE | 174 | Exists | Any element matches |
| 0xAF | 175 | ForAll | All elements match |
| 0xB0 | 176 | Fold | Reduce to single value |
| 0xB1 | 177 | SizeOf | Collection length |
| 0xB2 | 178 | ByIndex | Element at index |
| 0xB3 | 179 | Append | Concatenate collections |
| 0xB4 | 180 | Slice | Extract sub-collection |
| 0xB5 | 181 | Filter | Keep matching elements |
| 0xB6 | 182 | AvlTree | AVL tree construction |
| 0xB7 | 183 | AvlTreeGet | AVL tree lookup |
| 0xB8 | 184 | FlatMap | Map and flatten |
Box Extraction Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xC1 | 193 | ExtractAmount | Box.value (nanoErgs) |
| 0xC2 | 194 | ExtractScriptBytes | Box.propositionBytes |
| 0xC3 | 195 | ExtractBytes | Box.bytes (full) |
| 0xC4 | 196 | ExtractBytesWithNoRef | Box.bytesWithoutRef |
| 0xC5 | 197 | ExtractId | Box.id (32 bytes) |
| 0xC6 | 198 | ExtractRegisterAs | Box.Rx[T] |
| 0xC7 | 199 | ExtractCreationInfo | Box.creationInfo |
Cryptographic Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xCB | 203 | CalcBlake2b256 | Blake2b256 hash |
| 0xCC | 204 | CalcSha256 | SHA-256 hash |
| 0xCD | 205 | ProveDlog | DLog proposition |
| 0xCE | 206 | ProveDHTuple | DHT proposition |
| 0xCF | 207 | SigmaPropIsProven | Check proven |
| 0xD0 | 208 | SigmaPropBytes | Serialize SigmaProp |
| 0xD1 | 209 | BoolToSigmaProp | Bool → SigmaProp |
| 0xD2 | 210 | TrivialPropFalse | Always false |
| 0xD3 | 211 | TrivialPropTrue | Always true |
| 0xEE | 238 | DecodePoint | Bytes → GroupElement |
Block & Function Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xD4 | 212 | DeserializeContext | Deserialize from context |
| 0xD5 | 213 | DeserializeRegister | Deserialize from register |
| 0xD6 | 214 | ValDef | Define value binding |
| 0xD7 | 215 | FunDef | Define function |
| 0xD8 | 216 | BlockValue | Block expression { } |
| 0xD9 | 217 | FuncValue | Lambda expression |
| 0xDA | 218 | FuncApply | Apply function |
| 0xDB | 219 | PropertyCall | Property access |
| 0xDC | 220 | MethodCall | Method invocation |
| 0xDD | 221 | Global | Global object |
Option Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xDE | 222 | SomeValue | Some(x) construction |
| 0xDF | 223 | NoneValue | None construction |
| 0xE3 | 227 | GetVar | Get context variable |
| 0xE4 | 228 | OptionGet | Option.get (may fail) |
| 0xE5 | 229 | OptionGetOrElse | Option.getOrElse |
| 0xE6 | 230 | OptionIsDefined | Option.isDefined |
Sigma Operations
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xEA | 234 | SigmaAnd | Sigma AND (∧) |
| 0xEB | 235 | SigmaOr | Sigma OR (∨) |
Logical, Negation & Bitwise Operations (bit-level ops v6+)
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xEF | 239 | LogicalNot | Boolean NOT (!) |
| 0xF0 | 240 | Negation | Numeric negation (-x) |
| 0xF1 | 241 | BitInversion | Bitwise NOT (~) |
| 0xF2 | 242 | BitOr | Bitwise OR (|) |
| 0xF3 | 243 | BitAnd | Bitwise AND (&) |
| 0xF4 | 244 | BinXor | Binary XOR |
| 0xF5 | 245 | BitXor | Bitwise XOR (^) |
| 0xF6 | 246 | BitShiftRight | Arithmetic right shift (>>) |
| 0xF7 | 247 | BitShiftLeft | Left shift (<<) |
| 0xF8 | 248 | BitShiftRightZeroed | Logical right shift (>>>) |
Collection Bitwise Operations (v6+)
| Hex | Decimal | Operation | Description |
|---|---|---|---|
| 0xF9 | 249 | CollShiftRight | Collection shift right |
| 0xFA | 250 | CollShiftLeft | Collection shift left |
| 0xFB | 251 | CollShiftRightZeroed | Collection logical shift right |
| 0xFC | 252 | CollRotateLeft | Collection rotate left |
| 0xFD | 253 | CollRotateRight | Collection rotate right |
| 0xFF | 255 | XorOf | XOR of collection elements |
Opcode Parsing
const OpCodeParser = struct {
/// Parse opcode from byte, determining if constant or operation
pub fn parse(byte: u8) ParseResult {
if (byte == 0) return .invalid;
if (byte <= 0x70) return .{ .constant = byte };
return .{ .operation = @enumFromInt(byte) };
}
/// Check if opcode requires additional data
pub fn hasPayload(op: OpCode) bool {
return switch (op) {
.val_use,
.constant_placeholder,
.tagged_variable,
.extract_register_as,
.by_index,
.select_field,
.method_call,
.property_call,
=> true,
else => false,
};
}
const ParseResult = union(enum) {
invalid,
constant: u8,
operation: OpCode,
};
};
Constants
const OpCodeConstants = struct {
/// First valid data type code
pub const FIRST_DATA_TYPE: u8 = 0x01;
/// Last data type code
pub const LAST_DATA_TYPE: u8 = 111; // 0x6F
/// Boundary between constants and operations
pub const LAST_CONSTANT_CODE: u8 = 112; // 0x70
/// First operation code
pub const FIRST_OP_CODE: u8 = 113; // 0x71
/// Maximum opcode value
pub const MAX_OP_CODE: u8 = 255; // 0xFF
};
Previous: Appendix A | Next: Appendix C
Scala: OpCodes.scala
Rust: op_code.rs:14-203
Appendix C: Cost Table
Complete reference for operation costs in the JIT cost model.
Cost Model Architecture
Cost Model Structure
══════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────┐
│ CostKind │
├─────────────┬─────────────┬─────────────┬─────────────────────┤
│ FixedCost │PerItemCost │TypeBasedCost│ DynamicCost │
│ │ │ │ │
│ cost: u32 │ base: u32 │ costFunc() │ sum of sub-costs │
│ │ per_chunk │ per type │ │
│ │ chunk_size │ │ │
└─────────────┴─────────────┴─────────────┴─────────────────────┘
Cost Calculation Flow:
─────────────────────────────────────────────────────────────────
┌─────────────┐
│ Operation │
└──────┬──────┘
│
┌──────────────┼──────────────┐
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│ FixedOp │ │PerItemOp │ │TypedOp │
│ cost=26 │ │base=20 │ │depends │
└────┬────┘ │chunk=10 │ │on type │
│ └────┬─────┘ └────┬────┘
│ │ │
└─────────────┼──────────────┘
▼
┌────────────────┐
│CostAccumulator │
│ accum += cost │
│ check < limit │
└────────────────┘
Zig Cost Types
const JitCost = struct {
value: u32,
pub fn add(self: JitCost, other: JitCost) !JitCost {
return .{ .value = try std.math.add(u32, self.value, other.value) };
}
};
const CostKind = union(enum) {
fixed: FixedCost,
per_item: PerItemCost,
type_based: TypeBasedCost,
dynamic,
pub fn compute(self: CostKind, ctx: CostContext) JitCost {
return switch (self) {
.fixed => |f| f.cost,
.per_item => |p| p.compute(ctx.n_items),
.type_based => |t| t.costFunc(ctx.tpe),
.dynamic => ctx.computed_cost,
};
}
};
/// Fixed cost regardless of input
const FixedCost = struct {
cost: JitCost,
};
/// Cost proportional to collection size
const PerItemCost = struct {
base: JitCost,
per_chunk: JitCost,
chunk_size: usize,
/// totalCost = base + per_chunk * ceil(n_items / chunk_size)
pub fn compute(self: PerItemCost, n_items: usize) JitCost {
const chunks = (n_items + self.chunk_size - 1) / self.chunk_size;
return .{
.value = self.base.value + @as(u32, @intCast(chunks)) * self.per_chunk.value,
};
}
};
/// Cost depends on type
const TypeBasedCost = struct {
primitive_cost: JitCost,
bigint_cost: JitCost,
collection_cost: ?PerItemCost,
pub fn costFunc(self: TypeBasedCost, tpe: SType) JitCost {
return switch (tpe) {
.byte, .short, .int, .long => self.primitive_cost,
.big_int, .unsigned_big_int => self.bigint_cost,
.coll => |elem| if (self.collection_cost) |c|
c.compute(elem.len)
else
self.primitive_cost,
else => self.primitive_cost,
};
}
};
Cost Accumulator
const CostAccumulator = struct {
accum: u64,
limit: u64,
pub fn init(limit: u64) CostAccumulator {
return .{ .accum = 0, .limit = limit };
}
pub fn add(self: *CostAccumulator, cost: JitCost) !void {
self.accum += cost.value;
if (self.accum > self.limit) {
return error.CostLimitExceeded;
}
}
pub fn addSeq(
self: *CostAccumulator,
cost: PerItemCost,
n_items: usize,
) !void {
try self.add(cost.compute(n_items));
}
pub fn totalCost(self: *const CostAccumulator) JitCost {
return .{ .value = @intCast(self.accum) };
}
};
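The accumulator's behavior (add costs, fail as soon as the running total exceeds the limit) is easy to experiment with in a few lines of Python. This is a minimal analogue of the Zig struct above, not any implementation's API.

```python
# Minimal Python analogue of CostAccumulator: accumulate costs and
# raise once the running total exceeds the configured limit.
class CostAccumulator:
    def __init__(self, limit):
        self.accum = 0
        self.limit = limit

    def add(self, cost):
        self.accum += cost
        if self.accum > self.limit:
            raise RuntimeError("cost limit exceeded")

acc = CostAccumulator(limit=100)
acc.add(60)
acc.add(40)        # total == limit: still allowed
try:
    acc.add(1)     # pushes past the limit
    exceeded = False
except RuntimeError:
    exceeded = True
assert exceeded
```

Note that, as in the Zig version, hitting the limit exactly is allowed; only exceeding it aborts evaluation.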
Fixed Cost Operations
| Operation | Cost | Description |
|---|---|---|
| ConstantPlaceholder | 1 | Reference segregated constant |
| Height | 1 | Current block height |
| Inputs | 1 | Transaction inputs |
| Outputs | 1 | Transaction outputs |
| LastBlockUtxoRootHash | 1 | UTXO root hash |
| Self | 1 | Self box |
| MinerPubkey | 1 | Miner public key |
| ValUse | 5 | Use defined value |
| TaggedVariable | 5 | Context variable |
| SomeValue | 5 | Option Some |
| NoneValue | 5 | Option None |
| SelectField | 8 | Select tuple field |
| CreateProveDlog | 10 | Create DLog |
| OptionGetOrElse | 10 | Option.getOrElse |
| OptionIsDefined | 10 | Option.isDefined |
| OptionGet | 10 | Option.get |
| ExtractAmount | 10 | Box value |
| ExtractScriptBytes | 10 | Proposition bytes |
| ExtractId | 10 | Box ID |
| Tuple | 10 | Create tuple |
| Select1-5 | 12 | Select tuple element |
| ByIndex | 14 | Collection access |
| BoolToSigmaProp | 15 | Bool → SigmaProp |
| DeserializeContext | 15 | Deserialize context |
| DeserializeRegister | 15 | Deserialize register |
| ByteArrayToLong | 16 | Bytes → Long |
| LongToByteArray | 17 | Long → bytes |
| CreateProveDHTuple | 20 | Create DHT |
| If | 20 | Conditional |
| LogicalNot | 20 | Boolean NOT |
| Negation | 20 | Numeric negation |
| ArithOp | 26 | Plus, Minus, etc. |
| ByteArrayToBigInt | 30 | Bytes → BigInt |
| SubstConstants | 30 | Substitute constants |
| SizeOf | 30 | Collection size |
| MultiplyGroup | 40 | EC point multiply |
| ExtractRegisterAs | 50 | Register access |
| Exponentiate | 300 | BigInt exponent |
| DecodePoint | 900 | Decode EC point |
Per-Item Cost Operations
| Operation | Base | Per Chunk | Chunk Size |
|---|---|---|---|
| CalcBlake2b256 | 20 | 7 | 128 |
| CalcSha256 | 20 | 8 | 64 |
| MapCollection | 20 | 1 | 10 |
| Exists | 20 | 5 | 10 |
| ForAll | 20 | 5 | 10 |
| Fold | 20 | 1 | 10 |
| Filter | 20 | 5 | 10 |
| FlatMap | 20 | 5 | 10 |
| Slice | 10 | 2 | 100 |
| Append | 20 | 2 | 100 |
| SigmaAnd | 10 | 2 | 1 |
| SigmaOr | 10 | 2 | 1 |
| AND (logical) | 10 | 5 | 32 |
| OR (logical) | 10 | 5 | 32 |
| XorOf | 20 | 5 | 32 |
| AtLeast | 20 | 3 | 1 |
| Xor (bytes) | 10 | 2 | 128 |
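The per-item formula (totalCost = base + per_chunk * ceil(n_items / chunk_size)) can be worked through with rows from the table above. The helper below is a sketch for checking the arithmetic, not implementation code.

```python
# Worked examples of the PerItemCost formula using table rows above.
def per_item_cost(base, per_chunk, chunk_size, n_items):
    chunks = -(-n_items // chunk_size)   # ceiling division
    return base + per_chunk * chunks

# CalcBlake2b256 over 300 bytes: ceil(300/128) = 3 chunks -> 20 + 7*3 = 41
assert per_item_cost(20, 7, 128, 300) == 41
# CalcSha256 over exactly one 64-byte chunk -> 20 + 8*1 = 28
assert per_item_cost(20, 8, 64, 64) == 28
# Filter over 25 items: ceil(25/10) = 3 chunks -> 20 + 5*3 = 35
assert per_item_cost(20, 5, 10, 25) == 35
```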
Type-Based Costs
Numeric Casting
| Target Type | Cost |
|---|---|
| Byte, Short, Int, Long | 10 |
| BigInt | 30 |
| UnsignedBigInt | 30 |
Comparison Operations
| Type | Cost |
|---|---|
| Primitives | 10-20 |
| BigInt | 30 |
| Collections | PerItemCost |
| Tuples | Sum of components |
Interpreter Overhead
| Cost Type | Value | Description |
|---|---|---|
| interpreterInitCost | 10,000 | Interpreter init |
| inputCost | 2,000 | Per input |
| dataInputCost | 100 | Per data input |
| outputCost | 100 | Per output |
| tokenAccessCost | 100 | Per token |
Cost Limits
| Parameter | Value | Description |
|---|---|---|
| maxBlockCost | 1,000,000 | Max per block |
| scriptCostLimit | ~8,000,000 | Single script |
Zig Cost Constants
const OperationCosts = struct {
// Context access (very cheap)
pub const HEIGHT: FixedCost = .{ .cost = .{ .value = 1 } };
pub const INPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
pub const OUTPUTS: FixedCost = .{ .cost = .{ .value = 1 } };
pub const SELF: FixedCost = .{ .cost = .{ .value = 1 } };
// Variable access
pub const VAL_USE: FixedCost = .{ .cost = .{ .value = 5 } };
pub const CONSTANT_PLACEHOLDER: FixedCost = .{ .cost = .{ .value = 1 } };
// Arithmetic
pub const ARITH_OP: FixedCost = .{ .cost = .{ .value = 26 } };
pub const COMPARISON: FixedCost = .{ .cost = .{ .value = 20 } };
// Box extraction
pub const EXTRACT_AMOUNT: FixedCost = .{ .cost = .{ .value = 10 } };
pub const EXTRACT_REGISTER: FixedCost = .{ .cost = .{ .value = 50 } };
// Cryptographic
pub const PROVE_DLOG: FixedCost = .{ .cost = .{ .value = 10 } };
pub const PROVE_DHT: FixedCost = .{ .cost = .{ .value = 20 } };
pub const DECODE_POINT: FixedCost = .{ .cost = .{ .value = 900 } };
pub const MULTIPLY_GROUP: FixedCost = .{ .cost = .{ .value = 40 } };
pub const EXPONENTIATE: FixedCost = .{ .cost = .{ .value = 300 } };
// Hashing
pub const BLAKE2B256: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 7 },
.chunk_size = 128,
};
pub const SHA256: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 8 },
.chunk_size = 64,
};
// Collection operations
pub const MAP: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 1 },
.chunk_size = 10,
};
pub const FILTER: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 5 },
.chunk_size = 10,
};
pub const FOLD: PerItemCost = .{
.base = .{ .value = 20 },
.per_chunk = .{ .value = 1 },
.chunk_size = 10,
};
// Sigma operations
pub const SIGMA_AND: PerItemCost = .{
.base = .{ .value = 10 },
.per_chunk = .{ .value = 2 },
.chunk_size = 1,
};
pub const SIGMA_OR: PerItemCost = .{
.base = .{ .value = 10 },
.per_chunk = .{ .value = 2 },
.chunk_size = 1,
};
};
const InterpreterCosts = struct {
pub const INIT: u32 = 10_000;
pub const PER_INPUT: u32 = 2_000;
pub const PER_DATA_INPUT: u32 = 100;
pub const PER_OUTPUT: u32 = 100;
pub const PER_TOKEN: u32 = 100;
};
const CostLimits = struct {
pub const MAX_BLOCK_COST: u64 = 1_000_000;
pub const MAX_SCRIPT_COST: u64 = 8_000_000;
};
Cost Calculation Example
/// Calculate total cost for transaction verification
fn calculateTxCost(
n_inputs: usize,
n_data_inputs: usize,
n_outputs: usize,
script_costs: []const JitCost,
) u64 {
var total: u64 = InterpreterCosts.INIT;
total += @as(u64, n_inputs) * InterpreterCosts.PER_INPUT;
total += @as(u64, n_data_inputs) * InterpreterCosts.PER_DATA_INPUT;
total += @as(u64, n_outputs) * InterpreterCosts.PER_OUTPUT;
for (script_costs) |cost| {
total += cost.value;
}
return total;
}
// Example: 2 inputs, 1 data input, 3 outputs
// Base: 10,000 + 4,000 + 100 + 300 = 14,400
// Plus script costs per input
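The arithmetic in the example can be cross-checked in Python using the interpreter overhead constants from the table above.

```python
# Cross-check of the worked example: base transaction cost from
# interpreter overhead constants (script costs excluded).
INIT, PER_INPUT, PER_DATA_INPUT, PER_OUTPUT = 10_000, 2_000, 100, 100

def tx_base_cost(n_inputs, n_data_inputs, n_outputs):
    return (INIT
            + n_inputs * PER_INPUT
            + n_data_inputs * PER_DATA_INPUT
            + n_outputs * PER_OUTPUT)

# 2 inputs, 1 data input, 3 outputs -> 10,000 + 4,000 + 100 + 300
assert tx_base_cost(2, 1, 3) == 14_400
```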
Previous: Appendix B | Next: Appendix D
Scala: CostKind.scala
Rust: cost_accum.rs:7-43
Appendix D: Method Reference
Complete reference for all methods available on each type.
Method Organization
Method System Architecture
══════════════════════════════════════════════════════════════════
┌────────────────────────┐
│ STypeCompanion │
│ type_code: TypeCode │
│ methods: []SMethod │
└──────────┬─────────────┘
│
┌───────────────────────┼───────────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SNumeric │ │ SBox │ │ SColl │
│ methods.len │ │ methods.len │ │ methods.len │
│ = 13 │ │ = 10 │ │ = 20+ │
└──────────────┘ └──────────────┘ └──────────────┘
Method Lookup:
─────────────────────────────────────────────────────────────────
receiver.methodCall(type_code=99, method_id=1)
│
▼
STypeCompanion::Box.method_by_id(1)
│
▼
SMethod { name: "value", tpe: Box => Long, cost: 1 }
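The lookup step above amounts to a map from (type_code, method_id) pairs to method descriptors. The Python sketch below models that table with a handful of illustrative entries whose names and costs are taken from the method tables later in this appendix; the `METHODS` dict and `resolve` helper are ours, not any implementation's API.

```python
# Sketch of (type_code, method_id) -> method descriptor resolution.
# Entries are illustrative: (name, cost) per the method tables below.
METHODS = {
    (0x63, 1): ("value", 1),     # SBox.value
    (0x63, 7): ("getReg", 50),   # SBox.getReg
    (0x68, 9): ("height", 10),   # SHeader.height
}

def resolve(type_code, method_id):
    method = METHODS.get((type_code, method_id))
    if method is None:
        raise ValueError(f"unknown method {method_id} on type 0x{type_code:02X}")
    return method

name, cost = resolve(0x63, 7)
assert name == "getReg" and cost == 50
```

An unknown (type_code, method_id) pair is a deserialization error, which the sketch models by raising.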
Zig Method Descriptors
const SMethod = struct {
name: []const u8,
method_id: u8,
tpe: SFunc,
cost_kind: CostKind,
min_version: ?ErgoTreeVersion = null,
pub fn isV6Only(self: *const SMethod) bool {
return self.min_version != null and
@intFromEnum(self.min_version.?) >= 3;
}
};
const SFunc = struct {
t_dom: []const SType, // Domain (receiver + args)
t_range: SType, // Return type
pub fn unary(recv: SType, ret: SType) SFunc {
return .{ .t_dom = &[_]SType{recv}, .t_range = ret };
}
pub fn binary(recv: SType, arg: SType, ret: SType) SFunc {
return .{ .t_dom = &[_]SType{ recv, arg }, .t_range = ret };
}
};
Numeric Types (SByte, SShort, SInt, SLong)
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1 | toByte | T → Byte | ✓ | ✓ | 10 |
| 2 | toShort | T → Short | ✓ | ✓ | 10 |
| 3 | toInt | T → Int | ✓ | ✓ | 10 |
| 4 | toLong | T → Long | ✓ | ✓ | 10 |
| 5 | toBigInt | T → BigInt | ✓ | ✓ | 30 |
| 6 | toBytes | T → Coll[Byte] | - | ✓ | 5 |
| 7 | toBits | T → Coll[Boolean] | - | ✓ | 5 |
| 8 | bitwiseInverse | T → T | - | ✓ | 5 |
| 9 | bitwiseOr | (T, T) → T | - | ✓ | 5 |
| 10 | bitwiseAnd | (T, T) → T | - | ✓ | 5 |
| 11 | bitwiseXor | (T, T) → T | - | ✓ | 5 |
| 12 | shiftLeft | (T, Int) → T | - | ✓ | 5 |
| 13 | shiftRight | (T, Int) → T | - | ✓ | 5 |
SBigInt
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1-5 | toXxx | Conversions | ✓ | ✓ | 10-30 |
| 6-13 | bitwise | Bitwise ops | - | ✓ | 5-10 |
| 14 | toUnsigned | BigInt → UnsignedBigInt | - | ✓ | 5 |
| 15 | toUnsignedMod | (BigInt, UBI) → UBI | - | ✓ | 10 |
SUnsignedBigInt (v6+)
| ID | Method | Signature | Cost |
|---|---|---|---|
| 14 | modInverse | (UBI, UBI) → UBI | 50 |
| 15 | plusMod | (UBI, UBI, UBI) → UBI | 10 |
| 16 | subtractMod | (UBI, UBI, UBI) → UBI | 10 |
| 17 | multiplyMod | (UBI, UBI, UBI) → UBI | 15 |
| 18 | mod | (UBI, UBI) → UBI | 10 |
| 19 | toSigned | UBI → BigInt | 5 |
SGroupElement
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 2 | getEncoded | GE → Coll[Byte] | ✓ | ✓ | 250 |
| 3 | exp | (GE, BigInt) → GE | ✓ | ✓ | 900 |
| 4 | multiply | (GE, GE) → GE | ✓ | ✓ | 40 |
| 5 | negate | GE → GE | ✓ | ✓ | 45 |
| 6 | expUnsigned | (GE, UBI) → GE | - | ✓ | 900 |
SSigmaProp
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | propBytes | SigmaProp → Coll[Byte] | 35 |
| 2 | isProven | SigmaProp → Boolean | 10 |
SBox
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | value | Box → Long | 1 |
| 2 | propositionBytes | Box → Coll[Byte] | 10 |
| 3 | bytes | Box → Coll[Byte] | 10 |
| 4 | bytesWithoutRef | Box → Coll[Byte] | 10 |
| 5 | id | Box → Coll[Byte] | 10 |
| 6 | creationInfo | Box → (Int, Coll[Byte]) | 10 |
| 7 | getReg[T] | (Box, Int) → Option[T] | 50 |
| 8 | tokens | Box → Coll[(Coll[Byte], Long)] | 15 |
Register Access
const BoxMethods = struct {
// R0-R3: mandatory registers
pub const R0 = makeRegMethod(0); // monetary value
pub const R1 = makeRegMethod(1); // guard script
pub const R2 = makeRegMethod(2); // tokens
pub const R3 = makeRegMethod(3); // creation info
// R4-R9: optional registers
pub const R4 = makeRegMethod(4);
pub const R5 = makeRegMethod(5);
pub const R6 = makeRegMethod(6);
pub const R7 = makeRegMethod(7);
pub const R8 = makeRegMethod(8);
pub const R9 = makeRegMethod(9);
fn makeRegMethod(comptime idx: u8) SMethod {
return .{
.name = std.fmt.comptimePrint("R{d}", .{idx}),
.method_id = 7, // getReg method id
.tpe = SFunc.unary(.box, .any), // Box => Option[T]; element type checked at call site
.cost_kind = .{ .fixed = .{ .cost = .{ .value = 50 } } },
};
}
};
SAvlTree
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | digest | AvlTree → Coll[Byte] | 15 |
| 2 | enabledOperations | AvlTree → Byte | 15 |
| 3 | keyLength | AvlTree → Int | 15 |
| 4 | valueLengthOpt | AvlTree → Option[Int] | 15 |
| 5 | isInsertAllowed | AvlTree → Boolean | 15 |
| 6 | isUpdateAllowed | AvlTree → Boolean | 15 |
| 7 | isRemoveAllowed | AvlTree → Boolean | 15 |
| 8 | updateOperations | (AvlTree, Byte) → AvlTree | 20 |
| 9 | contains | (AvlTree, key, proof) → Boolean | dynamic |
| 10 | get | (AvlTree, key, proof) → Option[Coll[Byte]] | dynamic |
| 11 | getMany | (AvlTree, keys, proof) → Coll[Option[...]] | dynamic |
| 12 | insert | (AvlTree, entries, proof) → Option[AvlTree] | dynamic |
| 13 | update | (AvlTree, operations, proof) → Option[AvlTree] | dynamic |
| 14 | remove | (AvlTree, keys, proof) → Option[AvlTree] | dynamic |
| 15 | updateDigest | (AvlTree, Coll[Byte]) → AvlTree | 20 |
SContext
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | dataInputs | Context → Coll[Box] | 15 |
| 2 | headers | Context → Coll[Header] | 15 |
| 3 | preHeader | Context → PreHeader | 10 |
| 4 | INPUTS | Context → Coll[Box] | 10 |
| 5 | OUTPUTS | Context → Coll[Box] | 10 |
| 6 | HEIGHT | Context → Int | 26 |
| 7 | SELF | Context → Box | 10 |
| 8 | selfBoxIndex | Context → Int | 20 |
| 9 | LastBlockUtxoRootHash | Context → AvlTree | 15 |
| 10 | minerPubKey | Context → Coll[Byte] | 20 |
| 11 | getVar[T] | (Context, Byte) → Option[T] | dynamic |
SHeader
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | id | Header → Coll[Byte] | 10 |
| 2 | version | Header → Byte | 10 |
| 3 | parentId | Header → Coll[Byte] | 10 |
| 4 | ADProofsRoot | Header → Coll[Byte] | 10 |
| 5 | stateRoot | Header → AvlTree | 10 |
| 6 | transactionsRoot | Header → Coll[Byte] | 10 |
| 7 | timestamp | Header → Long | 10 |
| 8 | nBits | Header → Long | 10 |
| 9 | height | Header → Int | 10 |
| 10 | extensionRoot | Header → Coll[Byte] | 10 |
| 11 | minerPk | Header → GroupElement | 10 |
| 12 | powOnetimePk | Header → GroupElement | 10 |
| 13 | powNonce | Header → Coll[Byte] | 10 |
| 14 | powDistance | Header → BigInt | 10 |
| 15 | votes | Header → Coll[Byte] | 10 |
| 16 | checkPow | Header → Boolean (v6+) | 500 |
SPreHeader
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | version | PreHeader → Byte | 10 |
| 2 | parentId | PreHeader → Coll[Byte] | 10 |
| 3 | timestamp | PreHeader → Long | 10 |
| 4 | nBits | PreHeader → Long | 10 |
| 5 | height | PreHeader → Int | 10 |
| 6 | minerPk | PreHeader → GroupElement | 10 |
| 7 | votes | PreHeader → Coll[Byte] | 10 |
SGlobal
| ID | Method | Signature | v5 | v6 | Cost |
|---|---|---|---|---|---|
| 1 | groupGenerator | Global → GroupElement | ✓ | ✓ | 10 |
| 2 | xor | (Coll[Byte], Coll[Byte]) → Coll[Byte] | ✓ | ✓ | PerItem |
| 3 | serialize[T] | T → Coll[Byte] | - | ✓ | dynamic |
| 4 | fromBigEndianBytes[T] | Coll[Byte] → T | - | ✓ | 10 |
| 5 | encodeNBits | BigInt → Long | - | ✓ | 20 |
| 6 | decodeNBits | Long → BigInt | - | ✓ | 20 |
| 7 | powHit | (Int, ...) → BigInt | - | ✓ | 500 |
SCollection
| ID | Method | Signature | Cost |
|---|---|---|---|
| 1 | size | Coll[T] → Int | 14 |
| 2 | apply | (Coll[T], Int) → T | 14 |
| 3 | getOrElse | (Coll[T], Int, T) → T | dynamic |
| 4 | map[R] | (Coll[T], T → R) → Coll[R] | PerItem(20,1,10) |
| 5 | exists | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 6 | fold[R] | (Coll[T], R, (R,T) → R) → R | PerItem(20,1,10) |
| 7 | forall | (Coll[T], T → Bool) → Bool | PerItem(20,5,10) |
| 8 | slice | (Coll[T], Int, Int) → Coll[T] | PerItem(10,2,100) |
| 9 | filter | (Coll[T], T → Bool) → Coll[T] | PerItem(20,5,10) |
| 10 | append | (Coll[T], Coll[T]) → Coll[T] | PerItem(20,2,100) |
| 14 | indices | Coll[T] → Coll[Int] | PerItem(20,2,128) |
| 15 | flatMap[R] | (Coll[T], T → Coll[R]) → Coll[R] | PerItem(20,5,10) |
| 19 | patch (v6) | (Coll[T], Int, Coll[T], Int) → Coll[T] | dynamic |
| 20 | updated (v6) | (Coll[T], Int, T) → Coll[T] | 20 |
| 21 | updateMany (v6) | (Coll[T], Coll[Int], Coll[T]) → Coll[T] | PerItem |
| 26 | indexOf | (Coll[T], T, Int) → Int | PerItem(20,1,10) |
| 29 | zip[U] | (Coll[T], Coll[U]) → Coll[(T,U)] | PerItem(10,1,10) |
| 30 | reverse (v6) | Coll[T] → Coll[T] | PerItem |
| 31 | startsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 32 | endsWith (v6) | (Coll[T], Coll[T]) → Boolean | PerItem |
| 33 | get (v6) | (Coll[T], Int) → Option[T] | 14 |
SOption
| ID | Method | Signature | Cost |
|---|---|---|---|
| 2 | isDefined | Option[T] → Boolean | 10 |
| 3 | get | Option[T] → T | 10 |
| 4 | getOrElse | (Option[T], T) → T | 10 |
| 7 | map[R] | (Option[T], T → R) → Option[R] | dynamic |
| 8 | filter | (Option[T], T → Bool) → Option[T] | dynamic |
STuple
Tuples support component access by position:
const TupleMethods = struct {
/// Access tuple component by index (1-based like Scala)
pub fn component(comptime idx: usize) SMethod {
return .{
.name = "_" ++ std.fmt.comptimePrint("{}", .{idx}),
.method_id = @intCast(idx),
.cost_kind = .{ .fixed = .{ .cost = .{ .value = 12 } } },
};
}
};
// Usage: tuple._1, tuple._2, ... up to tuple._255
Sources:
- Scala: methods.scala
- Rust: smethod.rs:36-99
- Scala: methods.scala (SNumericTypeMethods)
- Scala: methods.scala (SBigIntMethods)
- Scala: methods.scala (SUnsignedBigIntMethods)
- Rust: sgroup_elem.rs
- Scala: methods.scala (SSigmaPropMethods)
- Rust: sbox.rs:29-92
- Rust: savltree.rs
- Rust: scontext.rs
- Rust: sheader.rs
- Rust: spreheader.rs
- Rust: sglobal.rs
- Rust: scoll.rs:22-266
- Rust: soption.rs
Appendix E: Serialization Format Reference
PRE-ALPHA WARNING: This is a pre-alpha version of The Sigma Book. Content may be incomplete, inaccurate, or subject to change. Do not use as a source of truth. For authoritative information, consult the official repositories:
- sigmastate-interpreter — Reference Scala implementation
- sigma-rust — Rust implementation
- ergo — Ergo node
Complete reference for ErgoTree and value serialization formats.
Integer Encoding
VLQ (Variable-Length Quantity)
VLQ Encoding
══════════════════════════════════════════════════════════════════
Byte format: [C][D D D D D D D]
| |____________|
| |
| +-- 7 data bits
+---------- Continuation bit (1 = more bytes follow)
Examples:
0 → [0x00] (1 byte)
127 → [0x7F] (1 byte)
128 → [0x80, 0x01] (2 bytes: 10000000 00000001)
16383 → [0xFF, 0x7F] (2 bytes)
16384 → [0x80, 0x80, 0x01] (3 bytes)
const VlqEncoder = struct {
/// Encode unsigned integer as VLQ
pub fn encodeU64(value: u64, writer: anytype) !void {
var v = value;
while (v >= 0x80) {
try writer.writeByte(@as(u8, @truncate(v)) | 0x80);
v >>= 7;
}
try writer.writeByte(@as(u8, @truncate(v)));
}
    /// Decode VLQ to unsigned integer
    pub fn decodeU64(reader: anytype) !u64 {
        var result: u64 = 0;
        var shift: u32 = 0;
        while (true) {
            const byte = try reader.readByte();
            // More than ten 7-bit groups cannot fit in a u64.
            if (shift >= 64) return error.VlqOverflow;
            result |= @as(u64, byte & 0x7F) << @intCast(shift);
            if (byte & 0x80 == 0) break;
            shift += 7;
        }
        return result;
    }
};
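The worked examples above can be checked directly against the `VlqEncoder` sketch with Zig's built-in test runner (using a fixed-buffer stream from the standard library):

```zig
const std = @import("std");

test "VLQ encoding matches the worked examples" {
    var buf: [10]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);
    try VlqEncoder.encodeU64(128, fbs.writer());
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x80, 0x01 }, fbs.getWritten());
    fbs.reset();
    try VlqEncoder.encodeU64(16384, fbs.writer());
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x80, 0x80, 0x01 }, fbs.getWritten());
}
```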
ZigZag Encoding
const ZigZag = struct {
/// Encode signed → unsigned (small negatives stay small)
pub fn encode32(n: i32) u32 {
return @bitCast((n << 1) ^ (n >> 31));
}
pub fn encode64(n: i64) u64 {
return @bitCast((n << 1) ^ (n >> 63));
}
/// Decode unsigned → signed
pub fn decode32(n: u32) i32 {
return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
}
pub fn decode64(n: u64) i64 {
return @bitCast((n >> 1) ^ (~(n & 1) +% 1));
}
};
// Mapping: 0 → 0, -1 → 1, 1 → 2, -2 → 3, 2 → 4, ...
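The mapping in the comment above can be spot-checked against the `ZigZag` sketch:

```zig
const std = @import("std");

test "ZigZag mapping: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3" {
    try std.testing.expectEqual(@as(u32, 0), ZigZag.encode32(0));
    try std.testing.expectEqual(@as(u32, 1), ZigZag.encode32(-1));
    try std.testing.expectEqual(@as(u32, 2), ZigZag.encode32(1));
    try std.testing.expectEqual(@as(u32, 3), ZigZag.encode32(-2));
    // Decode inverts the mapping.
    try std.testing.expectEqual(@as(i32, -2), ZigZag.decode32(3));
}
```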
Type Serialization
Primitive Type Codes
| Type | Dec | Hex | Zig |
|---|---|---|---|
| SBoolean | 1 | 0x01 | .boolean |
| SByte | 2 | 0x02 | .byte |
| SShort | 3 | 0x03 | .short |
| SInt | 4 | 0x04 | .int |
| SLong | 5 | 0x05 | .long |
| SBigInt | 6 | 0x06 | .big_int |
| SGroupElement | 7 | 0x07 | .group_element |
| SSigmaProp | 8 | 0x08 | .sigma_prop |
| SUnsignedBigInt | 9 | 0x09 | .unsigned_big_int |
Collection Types
const TypeEncoder = struct {
const COLL_BASE: u8 = 12; // 0x0C
const NESTED_COLL: u8 = 24; // 0x18
const OPTION_BASE: u8 = 36; // 0x24
/// Encode collection type
pub fn encodeColl(elem: SType) u8 {
if (elem.isPrimitive()) {
return COLL_BASE + elem.typeCode();
}
if (elem == .coll) {
return NESTED_COLL + elem.inner().typeCode();
}
// Non-embeddable: write COLL_BASE then element type separately
return COLL_BASE;
}
};
Non-Embeddable Types
| Type | Dec | Hex |
|---|---|---|
| SBox | 99 | 0x63 |
| SAvlTree | 100 | 0x64 |
| SContext | 101 | 0x65 |
| SHeader | 104 | 0x68 |
| SPreHeader | 105 | 0x69 |
| SGlobal | 106 | 0x6A |
ErgoTree Format
Header Byte
ErgoTree Header
══════════════════════════════════════════════════════════════════
Bits: [R][R][R][C][S][V V V]
       |_______|  |  |  |___|
           |      |  |    |
           |      |  |    +-- Version (3 bits, 0-7)
           |      |  +------- Size flag (1 = size bytes present)
           |      +---------- Constant segregation (1 = segregated)
           +----------------- Reserved (3 bits)
Version Mapping:
0 → ErgoTree v0 (protocol v3.x)
1 → ErgoTree v1 (protocol v4.x)
2 → ErgoTree v2 (protocol v5.x, JIT costing)
3 → ErgoTree v3 (protocol v6.x)
const ErgoTreeHeader = struct {
    version: u3,
    has_size: bool,
    constant_segregation: bool,
    pub fn parse(byte: u8) ErgoTreeHeader {
        return .{
            .version = @truncate(byte), // low 3 bits (mask 0x07)
            .has_size = (byte & 0x08) != 0,
            .constant_segregation = (byte & 0x10) != 0,
        };
    }
    pub fn serialize(self: ErgoTreeHeader) u8 {
        var result: u8 = self.version;
        if (self.has_size) result |= 0x08;
        if (self.constant_segregation) result |= 0x10;
        return result;
    }
};
Complete Structure
ErgoTree Wire Format
══════════════════════════════════════════════════════════════════
┌─────────┬──────────┬──────────────┬──────────────────────────────┐
│ Header  │ Size     │ Constants    │ Root                         │
│ 1 byte  │ VLQ      │ Array        │ Expression                   │
│         │(optional)│ (if C=1)     │                              │
└─────────┴──────────┴──────────────┴──────────────────────────────┘
With constant segregation (C=1):
┌─────────┬──────────┬───────────┬──────────────────────────────┐
│ Header │ # consts │ Constants │ Root (with placeholders) │
│ │ VLQ │ [type + │ │
│ │ │ value]* │ │
└─────────┴──────────┴───────────┴──────────────────────────────┘
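Reading this layout back can be sketched as a top-level parse routine, built on the ErgoTreeHeader sketch above and the SigmaByteReader sketch later in this appendix. `ParsedTree` and `readConstant` are hypothetical names introduced only for this sketch:

```zig
const std = @import("std");

/// Minimal parsed-tree shape for this sketch (the real structures carry more fields).
const ParsedTree = struct {
    header: ErgoTreeHeader,
    size: ?u64,
    constants: []Constant,
    root: Expr,
};

fn parseErgoTree(allocator: std.mem.Allocator, r: *SigmaByteReader) !ParsedTree {
    const header = ErgoTreeHeader.parse(try r.reader.readByte());
    // Optional size lets a validator bound or skip the tree without parsing it.
    const size: ?u64 = if (header.has_size) try r.readVlqU64() else null;
    const constants: []Constant = if (header.constant_segregation) blk: {
        const n: usize = @intCast(try r.readVlqU64());
        const cs = try allocator.alloc(Constant, n);
        for (cs) |*c| c.* = try readConstant(r); // hypothetical helper
        break :blk cs;
    } else try allocator.alloc(Constant, 0);
    const root = try r.readExpr();
    return .{ .header = header, .size = size, .constants = constants, .root = root };
}
```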
Value Serialization
Primitive Values
const DataSerializer = struct {
pub fn serialize(value: Value, writer: anytype) !void {
switch (value) {
.boolean => |b| try writer.writeByte(if (b) 0x01 else 0x00),
.byte => |b| try writer.writeByte(@bitCast(b)),
.short => |s| try VlqEncoder.encodeI16(s, writer),
.int => |i| try VlqEncoder.encodeI32(i, writer),
.long => |l| try VlqEncoder.encodeI64(l, writer),
.big_int => |bi| try serializeBigInt(bi, writer),
.group_element => |ge| try ge.serializeCompressed(writer),
.sigma_prop => |sp| try serializeSigmaProp(sp, writer),
.coll => |c| try serializeColl(c, writer),
// ...
}
}
    fn serializeBigInt(bi: BigInt256, writer: anytype) !void {
        const bytes = bi.toBytesBigEndian();
        // Minimal two's-complement form: drop redundant leading 0x00 bytes,
        // but keep one when the next byte has its high bit set, otherwise
        // the value would decode as negative.
        var start: usize = 0;
        while (start < bytes.len - 1 and bytes[start] == 0 and bytes[start + 1] < 0x80) : (start += 1) {}
        try VlqEncoder.encodeU64(bytes.len - start, writer);
        try writer.writeAll(bytes[start..]);
    }
};
GroupElement (SEC1 Compressed)
GroupElement Encoding (33 bytes)
══════════════════════════════════════════════════════════════════
┌────────────┬─────────────────────────────────────────────────────┐
│ Prefix │ X Coordinate │
│ (1 byte) │ (32 bytes) │
├────────────┼─────────────────────────────────────────────────────┤
│ 0x02 = Y │ │
│ even │ Big-endian X value │
│ 0x03 = Y │ │
│ odd │ │
└────────────┴─────────────────────────────────────────────────────┘
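A minimal sketch of the compressed encoding follows. `Point`, `yIsEven`, and `xBytesBigEndian` are hypothetical stand-ins for a secp256k1 library; the identity element needs special handling not shown here:

```zig
/// SEC1 compressed point encoding: parity prefix + 32-byte big-endian X.
fn serializeCompressed(p: Point, writer: anytype) !void {
    // Prefix encodes the parity of Y: 0x02 for even, 0x03 for odd.
    const prefix: u8 = if (p.yIsEven()) 0x02 else 0x03;
    try writer.writeByte(prefix);
    const x = p.xBytesBigEndian(); // [32]u8, big-endian X coordinate
    try writer.writeAll(&x);
}
```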
SigmaProp
const SigmaPropSerializer = struct {
const PROVE_DLOG: u8 = 0xCD;
const PROVE_DHT: u8 = 0xCE;
const THRESHOLD: u8 = 0x98;
const AND: u8 = 0x96;
const OR: u8 = 0x97;
pub fn serialize(sp: SigmaBoolean, writer: anytype) !void {
switch (sp) {
.prove_dlog => |pk| {
try writer.writeByte(PROVE_DLOG);
try pk.serializeCompressed(writer);
},
.prove_dht => |dht| {
try writer.writeByte(PROVE_DHT);
try dht.g.serializeCompressed(writer);
try dht.h.serializeCompressed(writer);
try dht.u.serializeCompressed(writer);
try dht.v.serializeCompressed(writer);
},
.and_conj => |children| {
try writer.writeByte(AND);
try VlqEncoder.encodeU64(children.len, writer);
for (children) |child| try serialize(child, writer);
},
.or_conj => |children| {
try writer.writeByte(OR);
try VlqEncoder.encodeU64(children.len, writer);
for (children) |child| try serialize(child, writer);
},
.threshold => |t| {
try writer.writeByte(THRESHOLD);
try VlqEncoder.encodeU64(t.k, writer);
try VlqEncoder.encodeU64(t.children.len, writer);
for (t.children) |child| try serialize(child, writer);
},
}
}
};
Collections
const CollSerializer = struct {
pub fn serialize(coll: Collection, writer: anytype) !void {
try VlqEncoder.encodeU64(coll.len, writer);
// Element type already encoded in type header
for (coll.items) |item| {
try DataSerializer.serialize(item, writer);
}
}
/// Optimized boolean collection (bit-packed)
pub fn serializeBoolColl(bools: []const bool, writer: anytype) !void {
try VlqEncoder.encodeU64(bools.len, writer);
var byte: u8 = 0;
var bit: u3 = 0;
for (bools) |b| {
if (b) byte |= @as(u8, 1) << bit;
bit +%= 1;
if (bit == 0) {
try writer.writeByte(byte);
byte = 0;
}
}
if (bools.len % 8 != 0) try writer.writeByte(byte);
}
};
Expression Serialization
General Pattern
const ExprSerializer = struct {
pub fn serialize(expr: Expr, writer: anytype) !void {
// Write opcode
try writer.writeByte(@intFromEnum(expr.opCode()));
// Write opcode-specific data
switch (expr) {
.val_use => |vu| try VlqEncoder.encodeU32(vu.id, writer),
.constant_placeholder => |cp| {
try VlqEncoder.encodeU32(cp.index, writer);
try TypeEncoder.serialize(cp.tpe, writer);
},
.bin_op => |bo| {
try serialize(bo.left.*, writer);
try serialize(bo.right.*, writer);
},
.method_call => |mc| {
try writer.writeByte(mc.type_code);
try writer.writeByte(mc.method_id);
try serialize(mc.receiver.*, writer);
try VlqEncoder.encodeU64(mc.args.len, writer);
for (mc.args) |arg| try serialize(arg.*, writer);
},
// ...
}
}
};
Block Expressions
Block Value Structure
══════════════════════════════════════════════════════════════════
BlockValue:
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD8 │ count │ ValDef items │ Result expr │
│ │ VLQ │ │ │
└────────┴──────────┴─────────────────────┴───────────────────────┘
ValDef:
┌────────┬────────┬────────────┬───────────────────────────────────┐
│ 0xD6 │ ID │ Type │ RHS Expression │
│ │ VLQ │ (optional) │ │
└────────┴────────┴────────────┴───────────────────────────────────┘
FuncValue (Lambda):
┌────────┬──────────┬─────────────────────┬───────────────────────┐
│ 0xD9 │ arg cnt │ Args (ID + type) │ Body expr │
│ │ VLQ │ │ │
└────────┴──────────┴─────────────────────┴───────────────────────┘
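Under this layout, BlockValue serialization can be sketched as follows, reusing the VlqEncoder and ExprSerializer sketches from this appendix. `BlockValue` is a hypothetical struct with `items: []Expr` and `result: *Expr`; the opcode is taken from the diagram above:

```zig
/// Sketch: opcode, VLQ item count, each ValDef item, then the result expression.
fn serializeBlockValue(bv: BlockValue, writer: anytype) !void {
    try writer.writeByte(0xD8); // BlockValue opcode (per the diagram)
    try VlqEncoder.encodeU64(bv.items.len, writer);
    for (bv.items) |item| try ExprSerializer.serialize(item, writer);
    try ExprSerializer.serialize(bv.result.*, writer);
}
```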
Size Limits
| Limit | Value | Description |
|---|---|---|
| Max ErgoTree size | 4 KB | Serialized bytes |
| Max box size | 4 KB | Total serialized |
| Max constants | 255 | Per ErgoTree |
| Max registers | 10 | R0-R9 |
| Max tokens/box | 255 | Token types |
| Max BigInt bytes | 32 | 256 bits |
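These limits are typically enforced up front, before any expression parsing. A minimal sketch, with values taken from the table above (error names are illustrative):

```zig
const Limits = struct {
    pub const max_ergo_tree_bytes: usize = 4096; // 4 KB serialized
    pub const max_box_bytes: usize = 4096;
    pub const max_constants: usize = 255;
};

fn checkTreeLimits(tree_bytes: []const u8, constant_count: usize) !void {
    if (tree_bytes.len > Limits.max_ergo_tree_bytes) return error.ErgoTreeTooLarge;
    if (constant_count > Limits.max_constants) return error.TooManyConstants;
}
```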
Deserialization
const SigmaByteReader = struct {
    reader: std.io.AnyReader,
constant_store: []const Constant,
version: ErgoTreeVersion,
pub fn readVlqU64(self: *SigmaByteReader) !u64 {
return VlqEncoder.decodeU64(self.reader);
}
pub fn readType(self: *SigmaByteReader) !SType {
const code = try self.reader.readByte();
return TypeEncoder.decode(code, self);
}
pub fn readExpr(self: *SigmaByteReader) !Expr {
const opcode = try self.reader.readByte();
if (opcode <= 0x70) {
// Constant (type code in data region)
return try self.readConstantWithType(opcode);
}
return try ExprSerializer.deserialize(@enumFromInt(opcode), self);
}
};
Sources:
- Scala: serialization/
- Rust: types.rs
- Rust: ergo_tree.rs (header parsing)
- Rust: data.rs
- Rust: expr.rs
- Scala: serialization.tex (size limits)
Appendix F: Version History
Version history of ErgoScript and the SigmaState interpreter.
Protocol Versions Overview
| Block Version | Activated Version | ErgoTree Version | Name | Release |
|---|---|---|---|---|
| 1 | 0 | 0 | Initial | Mainnet launch |
| 2 | 1 | 1 | v4.0 | 2020 |
| 3 | 2 | 2 | v5.0 (JIT) | 2022 |
| 4 | 3 | 3 | v6.0 | 2024/2025 |
Version Context
const VersionContext = struct {
activated_version: u8, // Protocol version on network
ergo_tree_version: u8, // Version of currently executing script
pub const MAX_SUPPORTED_SCRIPT_VERSION: u8 = 3; // Supports 0, 1, 2, 3
pub const JIT_ACTIVATION_VERSION: u8 = 2; // v5.0 JIT activation
pub const V6_SOFT_FORK_VERSION: u8 = 3; // v6.0 soft-fork
pub fn isJitActivated(self: VersionContext) bool {
return self.activated_version >= JIT_ACTIVATION_VERSION;
}
pub fn isV6Activated(self: VersionContext) bool {
return self.activated_version >= V6_SOFT_FORK_VERSION;
}
};
Version 1 (Initial - v3.x)
ErgoTree Version: 0
Features:
- Core ErgoScript language
- Basic types: Boolean, Byte, Short, Int, Long, BigInt, GroupElement, SigmaProp
- Collection operations: map, filter, fold, exists, forall
- Sigma protocols: ProveDlog, ProveDHTuple, AND, OR, THRESHOLD
- Box operations: value, propositionBytes, id, registers R0-R9
- Context access: INPUTS, OUTPUTS, HEIGHT, SELF
Limitations:
- AOT (Ahead-of-Time) interpreter only
- Fixed cost model
- No constant segregation required
Version 2 (v4.0)
ErgoTree Version: 1 | Block Version: 2
New Features:
- Mandatory constant segregation flag
- Improved script validation
- Enhanced soft-fork mechanism
- Size flag in ErgoTree header
Changes:
- ErgoTree header now requires size bytes when flag is set
- Better error handling for malformed scripts
Version 3 (v5.0 - JIT)
ErgoTree Version: 2 | Block Version: 3 | Activated Version: 2
This was the major interpreter upgrade replacing AOT with JIT costing.
Major Changes
New Interpreter Architecture:
- JIT (Just-In-Time) costing model
- Data-driven evaluation via eval() methods
- Precise cost tracking per operation
- Profiler support for cost measurement
New Cost Model:
- FixedCost for constant-time operations
- PerItemCost for collection operations
- TypeBasedCost for type-dependent costs
- DynamicCost for complex operations
Costing Changes:
AOT: Fixed costs estimated at compile time
JIT: Actual costs computed during execution
New Operations:
- Context.dataInputs: access data inputs
- Context.headers: access the last 10 block headers
- Context.preHeader: access the current block pre-header
- Header type with full block header access
- PreHeader type
Soft-Fork Infrastructure:
- ValidationRules framework
- Configurable rule status (enabled, disabled, replaced)
- trySoftForkable pattern for graceful degradation
AOT to JIT Transition
The transition activated at the v5.0 soft-fork height (block version 3). Scripts created before JIT activation continue to validate with unchanged results, while new scripts benefit from more accurate costing.
Version 4 (v6.0 - Evolution)
ErgoTree Version: 3 | Block Version: 4 | Activated Version: 3
This soft-fork adds significant new functionality.
New Types
SUnsignedBigInt (Type code 9):
- 256-bit unsigned integers
- Modular arithmetic operations
- Conversion between signed/unsigned
New Methods
Numeric Types (Byte, Short, Int, Long, BigInt):
- toBytes: convert to byte array
- toBits: convert to boolean array
- bitwiseInverse: bitwise NOT
- bitwiseOr, bitwiseAnd, bitwiseXor: bitwise operations
- shiftLeft, shiftRight: bit shifting
BigInt:
- toUnsigned: convert to UnsignedBigInt
- toUnsignedMod: modular conversion
UnsignedBigInt:
- modInverse: modular multiplicative inverse
- plusMod, subtractMod, multiplyMod: modular arithmetic
- mod: modulo operation
- toSigned: convert to signed BigInt
GroupElement:
- expUnsigned: scalar multiplication with unsigned exponent
Header:
- checkPow: verify the Proof-of-Work solution
Collection:
- patch: replace a range with another collection
- updated: update a single element
- updateMany: batch-update elements
- indexOf: find element index
- zip: pair with another collection
- reverse: reverse order
- startsWith, endsWith: prefix/suffix checks
- get: safe element access returning Option
Global:
- serialize: serialize any value to bytes
- fromBigEndianBytes: decode big-endian bytes
- encodeNBits, decodeNBits: difficulty encoding
- powHit: Autolykos2 PoW verification
Version Checks
fn evaluateWithVersion(ctx: *VersionContext, expr: *const Expr) !Value {
if (ctx.isV6Activated()) {
// Use v6 methods and features
return try evalV6(expr);
} else if (ctx.isJitActivated()) {
// Use JIT costing
return try evalJit(expr);
} else {
// Legacy AOT path
return try evalAot(expr);
}
}
Backward Compatibility
Script Compatibility
All scripts created for earlier versions continue to work:
- Version 0 scripts: Execute with v0 semantics
- Version 1 scripts: Execute with v1 semantics
- Version 2 scripts: Execute with JIT costing
- Version 3 scripts: Full v6 features available
Method Resolution by Version
fn getMethods(ctx: *const VersionContext, type_code: u8) []const SMethod {
const container = getTypeCompanion(type_code);
if (ctx.isV6Activated()) {
return container.all_methods; // All methods including v6
}
return container.v5_methods; // Pre-v6 methods only
}
Soft-Fork Safety
Unknown opcodes and methods in future versions are handled gracefully:
fn checkOpCode(opcode: u8, ctx: *const VersionContext) ValidationResult {
if (isKnownOpcode(opcode)) return .validated;
if (ctx.isSoftForkable(opcode)) return .soft_forkable;
return .invalid;
}
Migration Guide
For Script Authors
v5 → v6:
- Use UnsignedBigInt for modular arithmetic (more efficient)
- Use new collection methods (reverse, zip, etc.)
- Use Header.checkPow for PoW verification
- Use Global.serialize for value encoding
For Node Operators
Upgrading to v6:
- Update node software before activation height
- No action needed for existing scripts
- New features available after soft-fork activation
Feature Matrix
| Feature | v3.x | v4.0 | v5.0 | v6.0 |
|---|---|---|---|---|
| Basic types | ✓ | ✓ | ✓ | ✓ |
| Sigma protocols | ✓ | ✓ | ✓ | ✓ |
| JIT costing | - | - | ✓ | ✓ |
| Data inputs | - | - | ✓ | ✓ |
| Headers access | - | - | ✓ | ✓ |
| UnsignedBigInt | - | - | - | ✓ |
| Bitwise ops | - | - | - | ✓ |
| Collection updates | - | - | - | ✓ |
| PoW verification | - | - | - | ✓ |
| Serialization | - | - | - | ✓ |
Test Coverage
Version-specific behavior is tested in:
- LanguageSpecificationV5.scala (~9,690 lines)
- LanguageSpecificationV6.scala (~3,081 lines)
These tests verify:
- All operations produce expected results
- Cost calculations are accurate
- Version-gated features work correctly
- Backward compatibility is maintained
Sources:
- Scala: VersionContext.scala
- Rust: ergo_tree.rs (ErgoTreeVersion)
Glossary
A
AOT (Ahead-Of-Time): Costing model where script costs are calculated before execution. Used in ErgoTree versions 0-1.
AVL Tree: A self-balancing binary search tree used for authenticated dictionaries in Ergo.
B
BigInt: 256-bit signed integer type in ErgoTree.
Box: The fundamental UTXO unit in Ergo, containing value, ErgoTree script, tokens, and registers.
C
Constant Segregation: Optimization where constants are extracted from ErgoTree expressions and stored in a separate array. Enables efficient script substitution without re-serializing the expression tree.
Context: Execution environment containing blockchain state (HEIGHT, headers), transaction data (INPUTS, OUTPUTS, dataInputs), and current input information (SELF).
Cost Accumulator: Runtime tracker that sums operation costs and enforces the script cost limit.
D
Data Input: Read-only box reference in a transaction. Provides data without being spent.
DHT (Diffie-Hellman Tuple): Four-element sigma protocol proving knowledge of secret x where u = g^x and v = h^x.
DLog (Discrete Logarithm): Sigma protocol proving knowledge of discrete logarithm. Given generator g and public key h = g^x, proves knowledge of x.
E
ErgoScript: High-level smart contract language with Scala-like syntax.
ErgoTree: Serialized bytecode representation of smart contracts.
F
Fiat-Shamir Transformation: Technique to convert interactive proofs into non-interactive proofs.
G
GroupElement: An elliptic curve point on secp256k1.
H
Header: The first byte(s) of ErgoTree that specify version and format flags.
I
Interpreter: Component that evaluates ErgoTree expressions against a context to produce a SigmaBoolean result.
J
JIT (Just-In-Time): Costing model where costs are calculated during execution. Used in ErgoTree version 2+.
O
OpCode: Single-byte identifier for expression nodes in serialized ErgoTree. Values 0x01-0x70 encode constants; 0x71+ encode operations.
P
Prover: Component that generates cryptographic proofs for spending conditions.
Proposition: A statement that can be proven true or false.
S
Secp256k1: The elliptic curve used in Ergo (same as Bitcoin).
SigmaBoolean: A tree of cryptographic propositions (AND, OR, threshold, DLog, DHT).
SigmaProp: Type representing sigma-protocol propositions.
Sigma Protocol: Zero-knowledge proof system with three-move structure.
T
Type Code: Unique byte identifier for each type in ErgoTree serialization.
U
UTXO: Unspent Transaction Output model used by Ergo.
UnsignedBigInt: 256-bit unsigned integer type (added in v6).
V
Verifier: Component that verifies cryptographic proofs.
VLQ: Variable-Length Quantity encoding for unsigned integers. Uses 7 data bits per byte with continuation bit.
Z
ZigZag Encoding: Maps signed integers to unsigned: 0→0, -1→1, 1→2, -2→3, etc. Keeps small negatives small for efficient VLQ encoding.
Bibliography
Primary Sources
- sigmastate-interpreter (https://github.com/ScorexFoundation/sigmastate-interpreter): reference Scala implementation of the SigmaState interpreter. Key packages: sigma.ast, sigma.serialization, sigma.eval, sigma.crypto.
- sigma-rust (https://github.com/ergoplatform/sigma-rust): Rust implementation of the ErgoTree IR and interpreter. Key crates: ergotree-ir, ergotree-interpreter, ergo-lib.
- ergo (https://github.com/ergoplatform/ergo): full node implementation in Scala.
Specifications
- ErgoTree Specification (sigmastate-interpreter/docs/spec/spec.pdf): formal specification of the ErgoTree format and semantics.
- ErgoScript Language Specification (sigmastate-interpreter/docs/LangSpec.md): informal language specification.
- Sigma Protocols Paper (sigmastate-interpreter/docs/wpaper/sigma.pdf): formal specification of Sigma protocols.
Academic Papers
- Sigmastate Protocols (sigmastate-interpreter/docs/sigmastate_protocols/sigmastate_protocols.pdf): detailed protocol descriptions.
- Ergo Whitepaper: platform overview and design rationale.
- Ergo Yellow Paper: technical specification.
External References
- Schnorr, C.P. (1991). Efficient signature generation by smart cards. (Schnorr identification protocol)
- Fiat, A., & Shamir, A. (1986). How to prove yourself. (Fiat-Shamir heuristic)
- Standards for Efficient Cryptography, SEC 2: the secp256k1 curve.
- BLAKE2 hash function: https://www.blake2.net/