On the design of based ZK rollups over Kaspa's UTXO-based DAG consensus

This post aims to provide a possible picture of how based zk (zero-knowledge) rollups can be designed to operate over Kaspa’s UTXO-based L1 (see @hashdag’s post for broader context). This is by no means a final design, but rather the accumulation of several months of discussions with @hashdag, @reshmem and Ilia@starkware, presented here to align the core R&D discussion in the Kaspa community around common ground before finalizing (the L1 part of) the design.

I’m making a deliberate effort in this post to use the already established jargon and language of the broader smart-contract (SC) community. To that end, I suggest that unfamiliar readers review documents such as eth-zk-rollups and the nested links therein before proceeding with this post.

Brief description of based zk rollups

A zk rollup is an L2 scaling solution where L2 operators use succinct zk validity proofs to prove correct smart-contract execution without requiring L1 validators to run the full computations themselves. A based zk rollup is a rollup design where the L2 is committed to the operations submitted to L1 as data blobs (payloads in Kaspa’s jargon) and cannot censor them or manipulate the order dictated by L1.

Based zk rollups as an L1 ↔ L2 protocol

Since based zk rollups represent an interaction between the base layer (L1) and the rollup layer (L2), the L1 must expose certain functionalities to support rollup operations. Conceptually, this defines a protocol between L1 validators and L2 provers.

Logical functionalities expected from L1:

  1. Aggregate rollup transactions: L1 aggregates user-submitted data blobs in the order they are included in the DAG, providing reliable anchoring for L2 validity proofs.

  2. Verify proof submissions: L2 operators submit zk proofs that confirm correct processing of these transactions, ensuring L2 execution follows the agreed protocol and updating the L2 state commitments stored on L1.

  3. Entry/Exit of L1 funds: The protocol must enable deposits and withdrawals of native L1 funds to and from L2, ensuring consistent state and authorized spending.

Key point: L1 acting as the zk proof verifier (point 2) is crucial for enabling the native currency (KAS) to serve as collateral for L2 financial activity, underscoring the importance of point 3. Additional benefits of having L1 verify the proofs include establishing a clear, single point of verification (enforced by L1 consensus rather than requiring each interested party to perform it individually) and providing proof of state commitment for new L2 validators.

VM-based vs. UTXO-based L1

When the L1 is a fully-fledged smart contract VM, the establishment of this protocol is straightforward: the rollup designer (i) publishes a core contract on L1 with a set of rules for L1 validators to follow when interacting with the rollup; and (ii) specifies a program hash (PROG) that the L2 provers are required to prove the execution of (via a zk proof, ZKP). This two-sided interplay establishes a well-defined commitment of the rollup to its users.

For UTXO/scripting-based L1s like Kaspa, a more embedded approach is required in order to support various rollup designs in the most generic and flexible way.

Technical design of the L1 side of the protocol in UTXO systems

To begin with, I present a simplified design which assumes a single rollup managed by a single state commitment on L1. Following this minimum-viable design, I discuss the implications of relaxing these assumptions.

Kaspa DAG preliminaries

A block B \in G has a selected parent \in parents(B), as chosen by GHOSTDAG. The mergeset of B is defined as past(B) \setminus past(\text{selected parent of } B). The block inherits the ordering from its selected parent and appends its mergeset in some consensus-agreed topological order. Block C is considered a chain block from B's pov if there’s a path of selected-parent links from B to C (which means the DAG ordering of B is an extension of C's DAG ordering).
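To make these definitions concrete, here is a minimal Rust sketch of the mergeset computation. The Dag trait and Hash type are illustrative placeholders, not actual node APIs:

 use std::collections::HashSet;

 type Hash = [u8; 32];

 /// Hypothetical DAG accessor (illustration only, not an actual node API)
 trait Dag {
     /// All blocks in past(B)
     fn past(&self, block: Hash) -> HashSet<Hash>;
     /// The parent chosen by GHOSTDAG
     fn selected_parent(&self, block: Hash) -> Hash;
 }

 /// mergeset(B) = past(B) \ past(selected_parent(B));
 /// note that the selected parent itself is a member of this set
 fn mergeset(dag: &impl Dag, block: Hash) -> HashSet<Hash> {
     let sp = dag.selected_parent(block);
     let sp_past = dag.past(sp);
     dag.past(block)
         .into_iter()
         .filter(|b| !sp_past.contains(b))
         .collect()
 }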

Detailed protocol design

Recursive DAG ordering commitments:

A new header field, called ordered_history_merkle_root, will be introduced to commit to the full, ordered transaction history. The purpose of this field is to maintain a recursive commitment to the complete sequence of (rollup) transactions. The leftmost leaf of the underlying Merkle tree will contain the selected parent’s ordered_history_merkle_root, thereby recursively referencing the entire ordered history. The remaining tree leaves correspond to the ordered sequence of mergeset transactions, as induced by the mergeset block order.
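For illustration, here is a sketch of how this header field could be computed. The merkle_root helper and the hash_pair parameter are stand-ins; the concrete hash function is deliberately left open (see the closing remark of this post):

 type Hash = [u8; 32];

 /// Plain binary Merkle root over `leaves`, parameterized by a node-hash
 /// function (odd nodes are promoted as-is; padding rules are a detail
 /// left open here)
 fn merkle_root(leaves: &[Hash], hash_pair: &impl Fn(&Hash, &Hash) -> Hash) -> Hash {
     assert!(!leaves.is_empty());
     let mut level = leaves.to_vec();
     while level.len() > 1 {
         level = level
             .chunks(2)
             .map(|c| if c.len() == 2 { hash_pair(&c[0], &c[1]) } else { c[0] })
             .collect();
     }
     level[0]
 }

 /// The leftmost leaf recursively commits to the selected parent's root;
 /// the remaining leaves are the mergeset transactions in consensus order
 fn ordered_history_merkle_root(
     selected_parent_root: Hash,
     ordered_mergeset_txs: &[Hash],
     hash_pair: &impl Fn(&Hash, &Hash) -> Hash,
 ) -> Hash {
     let mut leaves = vec![selected_parent_root];
     leaves.extend_from_slice(ordered_mergeset_txs);
     merkle_root(&leaves, hash_pair)
 }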

ZK-related opcodes:
  • OpZkVerify: An opcode that accepts public proof arguments and verifies the correctness of the zk proof (our final design might decompose this opcode into more basic cryptographic primitives; however, this is out of scope for this post and will be continued by @reshmem).
  • OpChainBlockHistoryRoot: An opcode that provides access to the ordered_history_merkle_root field of a previous chain block. This opcode expects a block hash as an argument and fails if the block has been pruned or is not a chain block from the perspective of the merging block executing the script. It will be used to supply valid anchor points to which the ZKP can prove execution.
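A rough sketch of the intended semantics of the second opcode follows; the ScriptCtx trait is a placeholder for whatever introspection context the script engine will expose:

 type Hash = [u8; 32];

 enum ScriptError {
     InvalidAnchor, // pruned, or not a chain block from the executing block's pov
 }

 /// Placeholder for the script-engine context (illustration only)
 trait ScriptCtx {
     fn is_pruned(&self, block: Hash) -> bool;
     /// Is `block` reachable via selected-parent links from the merging
     /// block executing this script?
     fn is_chain_ancestor(&self, block: Hash) -> bool;
     fn ordered_history_merkle_root(&self, block: Hash) -> Hash;
 }

 /// OpChainBlockHistoryRoot: takes a block hash, returns its history root,
 /// and fails on an invalid anchor
 fn op_chain_block_history_root(
     ctx: &impl ScriptCtx,
     block_hash: Hash,
 ) -> Result<Hash, ScriptError> {
     if ctx.is_pruned(block_hash) || !ctx.is_chain_ancestor(block_hash) {
         return Err(ScriptError::InvalidAnchor);
     }
     Ok(ctx.ordered_history_merkle_root(block_hash))
 }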

State-commitment UTXO:

The UTXO responsible for managing the rollup state will appear as an ordinary UTXO from L1’s perspective, with no special handling or differentiation at the base layer. The UTXO spk (script_public_key) will be of type p2sh (pay to script hash), which will represent the hash of a more complex structure. Specifically, it will be the hash of the following pre-image:

  1. PROG (the hash of the permanent program L2 is obligated to execute)
  2. state_commitment (the L2 state commitment)
  3. history_merkle_root (the ordered_history_merkle_root from L1’s header, representing the point in the DAG ordering up to which L1 transactions have been processed to produce the corresponding L2 state commitment)
  4. Additional auxiliary data required to verify a ZKP (e.g., a well-known verification key)
  5. The remaining execution script (will be specified below)
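For concreteness, here is a sketch of how such a p2sh spk could be composed from the five preimage items. The field names, serialization and the script_hash function are mine and purely illustrative:

 type Hash = [u8; 32];

 /// Items 1-5 of the preimage (names illustrative)
 struct StateCommitmentPreimage {
     prog: Hash,                // 1. hash of the permanent L2 program
     state_commitment: Hash,    // 2. the L2 state commitment
     history_merkle_root: Hash, // 3. the anchored ordered_history_merkle_root
     verification_aux: Vec<u8>, // 4. auxiliary ZKP verification data
     execution_script: Vec<u8>, // 5. the remaining execution script
 }

 impl StateCommitmentPreimage {
     /// spk = hash of the serialized preimage, where `script_hash` stands in
     /// for whatever hash L1 p2sh processing applies
     fn to_p2sh_spk(&self, script_hash: &impl Fn(&[u8]) -> Hash) -> Hash {
         let mut bytes = Vec::new();
         bytes.extend_from_slice(&self.prog);
         bytes.extend_from_slice(&self.state_commitment);
         bytes.extend_from_slice(&self.history_merkle_root);
         bytes.extend_from_slice(&self.verification_aux);
         bytes.extend_from_slice(&self.execution_script);
         script_hash(&bytes)
     }
 }

Note that any change to the state commitment or the history root yields a brand-new spk, so each state transition produces a fresh p2sh address.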

Proof transaction:

In its minimal form, a proof transaction consists of an incoming state-commitment UTXO, an outgoing updated state-commitment UTXO and a signature revealing the pre-images and the ZKP.

Assuming such in/out UTXOs and a signature script sig, the following pseudocode outlines the script execution required to verify the validity of this signature (and logically verify the L2 state transition):

 //
 // All data is extracted from sig
 // Some operations below might use new KIP10 introspection opcodes
 //
 // Skipping the part where the script itself is proven to be in the preimage 
 // (which is standard p2sh processing) 
 //

 // Prove preimages
 show (prog_in , commit_in , hr_in , ...) is the preimage of in.spk
 show (prog_out, commit_out, hr_out, ...) is the preimage of out.spk
 verify prog_in == prog_out // verify prog is preserved
 
 // Verify L1 history-root anchoring
 extract block_hash from sig // the chain block we are claiming execution to
 hr_ref <- OpChainBlockHistoryRoot( block_hash ) // fails if anchoring is invalid
 verify hr_ref == hr_out

 // Verify the proof
 extract zkp from sig
 OpZkVerify( proof: zkp, proof_pub_inputs: [commit_in, commit_out, hr_in, hr_out]) 
 // ^ omitting prog and other auxiliary verification data

L2 program semantics:

In order to provide the desired based rollup guarantees, the execution specified by the (publicly known) L2 PROG must strictly adhere to the following rules:

  • Reveal (through private program inputs) the full tree diff claimed to be processed: T(hr_out) \setminus T(hr_in)
  • Execute the identified transactions in order, without any additions or removals

Observe that the first rule, combined with the OpChainBlockHistoryRoot call within the script, ensures that the state commitment always advances to a valid state:

  • The script verifies that hr_out is a valid chain block history commitment.
  • The PROG verifies that hr_out recursively references hr_in and, as a result, must be an extension of it.
    • This verification by PROG is enforced by L1 through OpZkVerify.
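A sketch of the statement PROG must encode under these rules is given below; all types and helpers are placeholders, since the real PROG is whatever program hash the rollup designer publishes:

 type Hash = [u8; 32];

 /// Private witness: the tree diff T(hr_out) \ T(hr_in), i.e. the new
 /// transactions plus the Merkle data needed to re-derive hr_out from hr_in
 struct HistoryDiff {
     ordered_txs: Vec<Vec<u8>>, // rollup payloads in DAG order
     // ...Merkle authentication data omitted
 }

 /// Placeholder L2 state machine
 trait L2State: Sized {
     fn load(commitment: Hash) -> Self;
     fn apply_tx(self, tx: &[u8]) -> Self;
     fn commitment(&self) -> Hash;
 }

 /// Re-derive the history root implied by extending hr_in with the diff
 /// (folds the diff's Merkle data over the hr_in anchor; body omitted)
 fn recompute_root(hr_in: Hash, diff: &HistoryDiff) -> Hash {
     unimplemented!("illustration only")
 }

 /// The statement PROG proves, with (commit_in, commit_out, hr_in, hr_out)
 /// as public inputs and `diff` as the private witness
 fn prog<S: L2State>(commit_in: Hash, commit_out: Hash, hr_in: Hash, hr_out: Hash, diff: &HistoryDiff) {
     // Rule 1: hr_out recursively extends hr_in through the revealed diff
     assert_eq!(recompute_root(hr_in, diff), hr_out);

     // Rule 2: execute exactly the identified transactions, in order
     let mut state = S::load(commit_in);
     for tx in &diff.ordered_txs {
         state = state.apply_tx(tx);
     }
     assert_eq!(state.commitment(), commit_out);
 }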

Operational flow:
  1. Tx_1, Tx_2, ..., Tx_n with data payloads are submitted to the DAG, included in blocks, and accepted by chain blocks C_1, C_2, ..., C_n, respectively.
  2. The transaction hashes are embedded into the ordered_history_merkle_root fields of the corresponding headers, enforced as part of L1 consensus validation.
  3. An L2 prover chooses to prove execution up to block C_i.
  4. A proof transaction referencing the initial state-commitment UTXO and producing a new state-commitment UTXO (encoding the new history root ordered_history_merkle_root(C_i)) is created.
  5. The proof is validated by L1, and the new state-commitment UTXO replaces the previous one in the UTXO set, recording the L2 state transition on L1.

Soundness and flexibility of the proposed design

The design presented supports fully based zk rollups by embedding ordered history Merkle roots into block headers and introducing new zk-related opcodes into Kaspa’s script engine. The history roots provide modular and complete evidence of DAG ordering, enabling provers to select proof granularity over any span of consecutive chain blocks (while still processing whole mergesets at a time). The combination of L1 script opcodes and L2 PROG programmability provides substantial flexibility for rollup design.

Points for subsequent discussion

Despite its focus on the most “benign” design, this post is already becoming rather long, so I’ll wrap up by briefly outlining some key “zoom-in” points for further discussion and refinement.

  1. Uniqueness of the state-commitment UTXO
    1.1 Challenge: Proving that a specific UTXO represents “the” well-known authorized state commitment for a given rollup. Although state transitions do not require such proof, it is important, for instance, for proving L2 state to a newly syncing L2 node.
    1.2 Solution Direction: L2 source code can define a “genesis” state (encoded in the initial UTXO), and zk validity proofs can recursively attest that the current state originated from that genesis.

  2. Entry/Exit of L1 funds
    2.1 Allow deposits to static addresses representing the L2.
    2.2 Allow withdrawals back to L1 (as additional proof transaction outputs).
    2.3 The N-to-const problem: Address the challenges arising from local limits on transaction size and the potentially many outputs resulting from a batched proof operation.

  3. Extension to many rollups ¹
    3.1 Requirement/Desire: Each L2 rollup prover should only need to execute O(\text{rollup activity}) within their PROG proof execution.
    3.2 Solution Direction: Manage the L1 history Merkle tree by grouping by rollup and further dividing into activity/inactivity branches. Note: Applying the grouping recursively might result in long-term storage requirements per rollup, which has broader implications.

  4. Multiple state commitments per rollup
    4.1 Challenge: Allowing L1 to manage multiple state commitments for a single rollup in order to balance scalability and validity (allowing different provers to partially advance independent segments/logic zones of L2 state).
    4.2 Solution Direction: Implement partitioned state commitments on L1, representing dynamic cuts of the L2 state tree. If taken to the extreme, this solution could allow a user to solely control their own L2 account via a dedicated state commitment on L1.

Another major aspect not discussed in this post is the zero-knowledge technology stack to be supported and its implications for L1 components (e.g., the hash function used to construct the Merkle history trees). Leaving this part to @reshmem, @aspect and others for full follow-up dives.

[¹] The introduction of many rollups (or subnets in Kaspa’s jargon) touches on conceptual topics such as L2 state fragmentation and atomic composability, which are beyond the scope of this post and were preliminarily discussed in @hashdag’s post. Here, I’m referring merely to the technical definitions and consequences.


I know that for zkEVM, a single account holds the balances of all users in the rollup. It looks like this design makes that account a UTXO which contains a commitment, in the form of a Merkle root, to all the current balances of existing accounts in the rollup.

Re 1: Highly cogent and reasonable.

Re 2: Will the entire tree be posted on-chain or just the root of the tree? If it is just the root, how do users obtain their branch of the tree so that they can exit permissionlessly whenever they want?

Re 3: Not sure how you are going to do that based on your description. It sounds like some form of clustering. My personal thought is that I’m not sure this is a high-priority issue (do you need that many rollups on Kaspa, or is having one that works more important?).

Re 4: Not sure if I understood correctly, but if the concern is about scalability and validity, I think fully distributed ZKPs may be something interesting to think about. Such schemes distribute proof generation across multiple machines and require minimal communication among them; they would allow ZKP generation in zk rollups and zkEVMs to be distributed among multiple participants, much like mining pools. Participants may share the reward for proof generation, akin to miners in PoW chains like Kaspa. This also relates to 3, since grouping may then be unnecessary.

One more thing: do we have a role like an aggregator (such as one in CKB)?


In the basic design - yes, the UTXO will contain a single state root representing all existing rollup accounts.

Re 2. Only the root. Any user operation on their L2 account will mutate the root, and the prover reporting this root mutation (via a proof tx) will be obligated to apply the outcome on L1 in the form of an additional tx output going to the user’s L1 address. This will be enforced as part of the PROG.
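To make this exit obligation concrete, here is a tiny sketch: every withdrawal computed by the L2 execution must appear as an actual output of the proof tx, checked inside PROG with the proof tx outputs supplied as public inputs (all names illustrative):

 /// A withdrawal computed by L2 execution (illustrative shape)
 struct Withdrawal {
     l1_spk: Vec<u8>, // the user's L1 address script
     amount: u64,
 }

 /// Inside PROG: every claimed withdrawal must be matched by an actual
 /// proof tx output (outputs given as (spk, amount) pairs)
 fn verify_exit_outputs(withdrawals: &[Withdrawal], proof_tx_outputs: &[(Vec<u8>, u64)]) {
     for w in withdrawals {
         assert!(proof_tx_outputs
             .iter()
             .any(|(spk, amt)| *spk == w.l1_spk && *amt == w.amount));
     }
 }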

Re 3. As mentioned, the above description applies to the basic design. In a more advanced/dynamic design, we can have the L2 state committed to L1 in multiple, fragmented state commitments—where at the extreme, a single account can have its own L1 state-commitment UTXO.
My mental picture of this is keeping the image of L2 state as a single tree, but drawing a tree cut representing the subtree roots which are reported to L1 (see image)

Notes:

  • Such L2 split strategies can be static (as in scenario 2) or dynamic (scenario 3), in which case they will be supported via an L2 “detach” command which locally breaks a commitment into its tree-child commitments (on L1 this will look like a proof tx with a single incoming UTXO representing the previous subtree root and multiple output UTXO entries representing its children)
  • In scenario 3, accounts 13, 14 can be controlled solely by their owner, perhaps even requiring that each operation be provided with an immediate inline ZKP. Think of krc20 token holders performing a send: the transaction can consume the state-commitment UTXOs of both the sender and recipient and output the updated UTXOs with the updated state commitments, where the signature is an inline ZKP (credit: @reshmem, @aspect).
  • Of course there are numerous subtleties and complexities to this dynamic design which I’m neglecting here, a major one being “how to prove a subtree state commitment can be advanced without needing to execute unrelated rollup transactions (for showing none of them touched this part of the state)?”. This requires some form of “exclusion proof” or an explicit way to state read/write dependencies (cc: @aspect).

Re 4. I totally agree, I think proof generation should be a distributed effort. Imho it can be a coordinated collective effort, and even centralized to some degree, since decentralization and censorship-resistance are enforced by L1 due to the based design guarantees.

Re the final remark. Afaiu the aggregator role in CKB is exactly the non-based part where an L2 operator is required to collect off-chain batches, hence it’s irrelevant to this based design. That being said, there’s no way to enforce such a thing from L1 (unless you define a single METAPROG which all rollups must extend)—the point is to allow based rollups and to make them the default way to go, not to forbid non-based approaches.


This has been an interesting read so far. I’m still trying to catch up to be able to properly parse and understand the content fully. For now, I do have some preliminary questions based on my first reading here:

  1. Entry/exit funds - will these use p2sh exclusively? Or p2pk for entry and exit but p2sh for state transitions?

  2. What does a state_commitment look like?

  3. Regarding the use of p2sh here referencing a single state commitment - does this mean that each state transition will necessarily use a different p2sh address each time? I’m wondering about the impact of this on services like the explorer, which will accumulate many address entries that are each used just once.

  4. With the statement: “The proof is validated by L1, and the new state-commitment UTXO replaces the previous one in the UTXO set, recording the L2 state transition on L1.” - what happens during a re-org in L1? This will regularly happen around the tips. How is the L2 tolerant to such re-orgs?

  5. The post focuses on state transitions, but I’m curious about how much value would actually move in each transaction from the old p2sh to the new p2sh? Each tx will also incur a fee. What are your thoughts on the amounts moved/paid on the base layer per L2 state transition?


I’ll try to address the questions to some degree, though I think some of the answers will be fully clarified only after in-depth follow-up posts.

Note: for simplicity, my points below still assume the “benign” design with a single state-commitment UTXO per rollup instance, but can easily be extended to the multiple-commitment scheme described in my previous comment.

Re 1. First - this area is definitely still not finalized and you (or any other reader) should join the brainstorming. My current thinking is as follows: Think of L2 (a rollup instance) as a “virtual” wallet which might own many UTXOs on L1. All spks (addresses) for this wallet will be p2sh; however, some will be dynamic and some static. As you mention in point 3, the state-commitment UTXO is an ever-changing, dynamic p2sh address. In addition to that, there will be a set of static p2sh addresses which can be derived from PROG ¹. The static addresses will be used by users for deposit/entry. Spending these UTXOs must be done through a proof transaction, and the spending sig must reveal the preimage script, which will delegate verification to the primary input sig of the proof tx (the one containing the ZKP) ².

Re 2. The L2 state_commitment would usually be a Merkle root of its state (e.g., the root of a Patricia tree). But our L1 design should not rely on specific assumptions about it.

Re 3: As mentioned, yes, it means a different p2sh address each time. Explorers will learn how to track these addresses and treat them as a continuous entity by following the specific L2 encoding.

Re 4. Great question. This is precisely why OpChainBlockHistoryRoot ³ is set to fail if the block hash isn’t a chain block from the pov of the executing merging block. If a reorg invalidates a previously used anchoring chain block, then the proof tx using that anchor will be invalidated as well (when executed through the new chain), thus effectively “unspending” the spent state-commitment UTXO. This means that following a reorg, L2 provers will need to resubmit proofs proving execution according to the order dictated by the new chain segment. Your mental picture here should be a proof chain following the DAG selected chain. Note that if done correctly, L2 provers can reuse the hierarchical zk proofs used to compose the previous proof chain; i.e., a reorg does not necessarily mean full re-computation of all reversed proofs.

Re 5. It can be the full amount deposited to L2, all concentrated in the “dynamic” state-commitment UTXO. I don’t see this as an issue.

[¹] These addresses can incorporate KIP10-style additive schemes for improved management of the L2 UTXO subset and for compliance with KIP9.
[²] The advanced reader might notice that this scheme requires the “Uniqueness of the state-commitment UTXO” property I mentioned at the end of the post.
[³] Unrelated note on OpChainBlockHistoryRoot: the chain depth we allow access to here will affect the syncing of new L1 nodes. We will need to sync a chain segment of that length below the pruning point in order for the syncee to be able to process all transactions above the pruning point deterministically.

It took me a while to find this ancient post on the Bitcoin Forum: Storing UTXOs in a Balanced Merkle Tree (zero-trust nodes with O(1)-storage).

I think it answers the basic questions of storing UTXOs, and its #7 reply (starting with “Node-deletion is not the inverse of node-insertion”) is related to your Re 1 (addressing the entry/exit question) above.


@michaelsutton then I think the biggest thing this post doesn’t address is the “how to be based” question? I can think of executing a zkSNARK verification program pre-configured with a verification key (like a SNARK VM). It could be run and implemented in a really simple way, with only two challenge scenarios and just 3 instructions (ADD, SUB and MULKAS), running any zk-SNARK generator lib targeting Groth16, for example.

Certainly a bridge that allows users to transfer assets between L1 and L2 would be needed. That should also easily solve the problem of deposits and withdrawals. Two types of proofs may be needed: one for transaction inclusion and another for the state transition of the bridge SC on the L2 side. For both, a zkSNARK Groth16 proof could be a good fit, and proof recursion should be supported to allow Plonky2 and STARK proofs if needed.

But if a role of operator is introduced, then I guess this is again non-based, right? Since the based approach may be challenging and take some time, and given you say “not to forbid non-based approaches”, will Kaspa welcome third-party teams planning a non-based rollup on Kaspa? [Question 1]

Also, what happens if the target block with the target TX is invalidated by a reorg? The OpChainBlockHistoryRoot failure looks like a revert. Practically, for an unconfirmed transaction, does the transaction remain in the mempool and get reprocessed? Would this end up with a new TX hash? How might we be notified of this new TX? For a confirmed (multiple confirmations) transaction (though this may be unlikely), will it be marked as reverted? I guess the answer is yes, as you mentioned resubmitting. Could this potentially lead to a situation in which TX1 is confirmed on L1 while another TX2 is later confirmed on L1 on the canonical chain, which means TX2 is final but TX1 is reverted after TX2 is submitted? [Question 2]

An additional thought on reorgs: is it possible to figure this out by just running experiments? I have seen people do this for OP Stack: set up an L2 replica, bridge OAS from L1 to L2 using three different accounts (A, B, C), fork L1 and roll up L2 on the miner’s side, transfer, merge L1’s minority chain with the majority, then set up another replica, and just see what happens to the accounts. Would it be possible to figure out answers to such questions empirically, so that different L2s will at least know what to expect? [Question 3]