Introduction to the Cairo Virtual Machine
Welcome to the documentation for the Cairo VM.
First, let us clarify what is a Virtual Machine (VM)? A VM is a digital version of a physical computer, that can run programs and operating systems, store data, connect to networks, and do other computing functions.
This documentation is designed to provide developers with a fundamental understanding of the Cairo VM's architecture and capabilities, empowering them to build and optimize applications on Starknet effectively. This documentation will be divided into four main parts, each of which we will briefly describe in this introduction.
The Cairo Virtual Machine (CVM), at the heart of the Starknet network, is an advanced computational framework specifically designed to execute Casm bytecode generated by the Cairo compiler. Also, the Cairo VM is able to generate cryptographic proofs, these proofs will be posted on a settlement layer such as Ethereum, offering Starknet the same level of security as the underlying layer.
VM Layout
The Cairo Virtual Machine optimizes its operations around a unique structure of registers and memory to facilitate the efficient execution of Casm bytecode.
Registers
Unlike traditional architectures where registers are separate storage units, in Cairo, these registers are pointers that facilitate access and management of memory during program execution.
General-Purpose Registers: These are hypothetical in Cairo, where instead of traditional CPU-bound registers all operations are directly performed on memory cells. This approach aligns with Cairo’s optimization strategy which aims to minimize the number of trace cells in the Algebraic Intermediate Representation (AIR). In Cairo, registers such as the Program Counter (pc), Allocation Pointer (ap), and Frame Pointer (fp) do not store data themselves.
Pointers
Program Counter (pc) points to the current instruction in memory that the VM is executing. Allocation Pointer (ap) identifies the first free memory cell not yet used by the program, crucial for dynamic memory management. Frame Pointer (fp) points to the beginning of the stack frame of the current function, managing the execution context of function calls.
Memory
Cairo employs a nondeterministic read-only memory model. Unlike typical random-access memory (RAM) used in most computing systems where any memory cell can be read or written at any time, Cairo's memory model only allows memory to be written once and restricts subsequent modifications. In addition, this memory model is particularly efficient for proof generation, as it requires fewer trace cells per memory access.
Cairo's memory model also introduces the idea of "public memory," a crucial feature for validating data integrity between the prover and verifier. This feature enables the prover to prove that specific memory cells contain certain values without exposing the entire memory's content.
This feature is crucial for operations like loading bytecode, passing arguments, and returning values from programs because it ensures data integrity and security. By proving that specific memory cells contain certain values without exposing the entire memory's content, it prevents unauthorized access to data and protects against data manipulation. Additionally, it increases efficiency as it allows the Cairo VM to directly access required data instead of scanning through the entire memory.
Basic Instructions
Cairo's instruction set is designed to facilitate the verification of computations within its unique programming environment. The foundational aspect of Cairo instructions revolves around assertions of equality, which form the basis of most operations within the virtual machine.
Assert Statements
A primary form of instruction in Cairo is the assert statement, which verifies that two values or computations are equal. For example, the instruction [ap] = [ap - 1] * [fp], ap++
; not only performs multiplication but also asserts that the result is equal to the value at the current free memory cell [ap]
, before incrementing [ap]
with ap++
.
Pointer Management
The increment of the allocation pointer (ap++) is an important standalone operation in many Cairo instructions. It prepares the virtual machine to utilize the next cell in memory for future operations, thereby ensuring a seamless flow of computation.
Control Flow Instructions
Conditional and Unconditional Jumps: These instructions manage the flow of execution, allowing the program to branch or loop based on conditions or unconditionally, respectively.
Cal
and Ret
for Call and Return
Similar to functions in traditional programming, these instructions manage the execution of subroutines within Cairo code. The call instruction directs the machine to execute a block of code at another location, and return concludes that execution, resuming control at the point following the original call. These control structures are pivotal for creating complex, logical structures within Cairo programs, enabling more dynamic and versatile applications than would be possible with linear execution paths alone.
Advanced Syntax and Operations
In addition to fundamental arithmetic and control flow operations, Cairo provides advanced features for greater code complexity and efficiency. It supports complex operations such as double dereferencing, which allows values to be used as memory addresses, and the direct modification of values using immediate values and offsets, which provides flexibility in memory operations.
Builtins
Builtins in Cairo are predefined and optimized low-level execution units that are added to the Cairo CPU board to perform predefined calculations that are expensive to perform in Cairo. Communication between the CPU and embedded functionalities occurs through memory-mapped I/O. Each builtin is assigned a contiguous region of memory and applies specific restrictions on the data found in that area.
Hints
Cairo language introduces "prover hints" or simply "hints" which are snippets of code strategically inserted between Cairo instructions to facilitate additional computations necessary for the prover's operations. Unique to the prover's process, these hints can be authored in any programming language, as their execution is not required to be verified, not needed to be included in the proof submitted to the verifier. As a result, when the Cairo Runner encounters a hint before a Cairo instruction, it executes the hint first, potentially initializing memory cells before proceeding with the Cairo instruction.
Hints are written in Python in the Cairo VM currently in production, providing a powerful tool for the prover to execute operations invisible to the verifier, hence, this invisibility is crucial as it allows the hint to perform complex tasks without making them part of the program's formal execution trace, thereby not counting towards the total number of execution steps in the Cairo VM. Moreover, while hints offer enhanced capabilities in writing Cairo code, their use within Starknet contracts should be avoided unless the hints are specifically approved or whitelisted by Starknet.
A "hint" in Cairo is a block of any language executed just before the subsequent Cairo instruction, interacting directly with the program’s variables or memory. To incorporate a hint, simply write Python code inside the %{ ... %}
markers, this feature allows programmers to leverage Python's extensive functionalities within Cairo programs, enhancing their capability to manage complex computations efficiently and securely.
Hints in Cairo are useful when the cost of verifying a computed value is cheaper than finding the value itself, they are particularly valuable for solving equations or optimizing computational tasks.
Cairo Runner
The Cairo Runner, a specialized computer program, executes compiled Cairo programs. Unlike conventional programs, these include nondeterministic code due to Cairo's acceptance of nondeterminism. For instance, when computing the square root of a number, the Cairo Runner considers both possible outcomes (+Y and -Y). It then selects the solution that aligns with the following program instructions.
The output of the Cairo Runner comprises two main parts: an accepting input for the Cairo nondeterministic machine and a corresponding input for the Cairo deterministic machine, which serves as the witness for the nondeterministic machine. Additionally, the Runner may return a failure if the execution results in a contradiction or if it cannot compute the value of a memory cell due to insufficient hints.
This process is supported by "hints," which are special instructions that the Cairo Runner utilizes to resolve nondeterministic outcomes that cannot be directly inferred. These hints, typically written in Python, are crucial for the execution of certain Cairo programs where additional information is necessary to determine the course of execution.