Skip to content

Constructing simple SMTs

Understanding the finer details of how the Storage SM operates requires a good grasp of the way the zkProver’s Sparse Merkle Trees (SMTs) are constructed. This document explains how these SMTs are built.

Consider key-value pair based binary SMTs. The focus here is on explaining how to construct an SMT that represents a given set of key-value pairs. And, for the sake of simplicity, we assume 8-bit key-lengths.

A NULL or empty SMT has a zero root. That is, it has no key and no value recorded in it. Similarly, a zero node or NULL node refers to a node that carries no value.

A binary SMT with one key-value pair

A binary SMT with a single key-value pair (Ka,Va), is built as follows.

Suppose that Ka=11010110. In order to build a binary SMT with this single key-value (Ka,Va),

  1. One computes the hash H(Va) of the value Va,

  2. Sets the leaf La:=H(Va),

  3. Sets the sibling leaf as a NULL leaf, simply represented as “0”,

  4. Computes the root as roota0=H(La0), with the leaf La on the left because the lsb(Ka)=0. That is, between the two edges leading up to the root, the leaf La is on the left edge, while the NULL leaf “0” is on the right.

See the below figure for the SMT representing the single key-value pair (Ka,Va), where Ka=11010110.

A Single key-value pair SMT

Note that the last nodes in binary SMT branches are generally either leaves or zero-nodes.

In the case where the least-significant bit, lsb of Ka is 1, the SMT with a single key-value pair (Ka,Va) would be a mirror image of what is seen in Figure 3. And its root, root0a=H(0La)roota0 because H is a collision-resistant hash function.

This example also explains why we need a zero node. Since all trees used in our design are binary SMTs, a zero node is used as a default sibling for computing the parent node. This helps to differentiate between roots (also between parent nodes) because a root node actually identifies an SMT. Therefore, it is crucial to distinguish between roota0=H(La0) and root0a=H(0La) because they represent two distinct trees.

Binary SMTs with two key-value pairs

Consider now SMTs with two key-value pairs, (Ka,Va) and (Kb,Vb).

There are three distinct cases of how corresponding SMTs can be built, each determined by the keys, Ka and Kb.

Case 1

The keys are such that the lsb(Ka)=0 and the lsb(Kb)=1.

Suppose that the keys are given as Ka=11010110 and Kb=11010101.

To build a binary SMT with this two key-values, (Ka,Va) and (Kb,Vb),

  1. One computes the hashes, H(Va) and H(Vb) of the values, Va and Vb , respectively,

  2. Sets the leaves, La:=H(Va) and Lb:=H(Vb),

  3. Checks if the lsb(Ka) differs from the lsb(Kb),

  4. Since the lsb(Ka)=0 and the lsb(Kb)=1, it means the two leaves can be siblings,

  5. One can then compute the root as, rootab=H(LaLb).

Note that the leaf La is on the left because the lsb(Ka)=0, but Lb is on the right because the lsb(Kb)=1. That is, between the two edges leading up to the rootab, the leaf La must be on the edge from the left, while Lb is on the edge from the right.

See the below figure for the SMT representing the two key-value pairs (Ka,Va) and (Kb,Vb), where Ka=11010110 and Kb=11010101.

Two key-value pairs SMT - Case 1

Case 2

Both keys end with the same key-bit. That is, the lsb(Ka)=lsb(Kb), but their second least-significant bits differ.

Suppose that the two keys are given as Ka=11010100 and Kb=11010110.

To build a binary SMT with this two key-values, (Ka,Va) and (Kb,Vb);

  1. One computes the hashes, H(Va) and H(Vb) of the values, Va and Vb , respectively.

  2. Sets the leaves, La:=H(Va) and Lb:=H(Vb).

  3. Checks if the lsb(Ka) differs from the lsb(Kb). Since the lsb(Ka)=0 and the lsb(Kb)=0, it means the two leaves cannot be siblings at this position because it would otherwise mean they share the same tree-address 0, which is not allowed.

  4. One continues to check if the second least-significant bits of Ka and Kb differ. Since the second lsb(Ka)=0 and the second lsb(Kb)=1, it means the two leaves La and Lb can be siblings at their respective tree-addresses, 00 and 10.

  5. Next is to compute the hash H(LaLb) and set it as the branch Bab:=H(LaLb) at the tree-address 0. Note that the leaf La is on the left because the second lsb(Ka)=0, while Lb is on the right because the second lsb(Kb)=1.

  6. The branch Bab:=H(LaLb) needs a sibling. Since all the values, Va and Vb, are already represented in the tree at La and Lb, respectively, one therefore sets a NULL leaf “0” as the sibling leaf to Bab.

  7. As a result, it is possible to compute the root as, rootab0=H(Bab0). Note that, the branch Bab is on the left because the lsb(Ka)=0, and 0 must therefore be on the right. That is, between the two edges leading up to the rootab0, the branch Bab must be on the edge from the left, while 0 is on the edge from the right.

See the below figure depicting the SMT representing the two key-value pairs (Ka,Va) and (Kb,Vb), where Ka=11010100 and Kb=11010110.

Two key-value pairs SMT - Case 2

Case 3

The first two least-significant bits of both keys are the same, but their third least-significant bits differ.

Suppose that the two keys are given as Ka=11011000 and Kb=10010100. The process for building a binary SMT with these two key-values, (Ka,Va) and (Kb,Vb) is the same as in Case 2;

  1. One computes the hashes, H(Va) and H(Vb) of the values, Va and Vb , respectively.

  2. Sets the leaves, La:=H(Va) and Lb:=H(Vb).

  3. Checks if the lsb(Ka) differs from the lsb(Kb). Since the lsb(Ka)=0 and the lsb(Kb)=0, it means the two leaves cannot be siblings at this position as it would otherwise mean they share the same tree-address 0, which is not allowed.

  4. Next verifier continues to check if the second least-significant bits of Ka and Kb differ. Since the second lsb(Ka)=0 and the second lsb(Kb)=0, it means the two leaves cannot be siblings at this position, because it would otherwise mean they share the same tree-address 00, which is not allowed.

  5. Once again he checks if the third least-significant bits of Ka and Kb differ. Since the third lsb(Ka)=0 and the third lsb(Kb)=1, it means the two leaves La and Lb can be siblings at their respective tree-addresses, 000 and 100.

  6. One then computes the hash H(LaLb), and sets it as the branch Bab:=H(LaLb) at the tree-address 00. The leaf La is on the left because the third lsb(Ka)=0, while Lb is on the right because the third lsb(Kb)=1.

  7. The branch Bab:=H(LaLb) needs a sibling. Since all the values, Va and Vb , are already represented in the tree at La and Lb, respectively, one therefore sets a NULL leaf “0” as the sibling leaf to Bab.

  8. One can now compute the hash H(Bab0), and set it as the branch Bab0:=H(Bab0) at the tree-address 0. The hash is computed with the branch Bab on the left because the second lsb of both keys, Ka and Kb, equals 0. Therefore the NULL leaf “0” must be on the right as an argument to the hash.

  9. The branch Bab0:=H(Bab0) also needs a sibling. For the same reason given above, one sets a NULL leaf “0” as the sibling leaf to Bab0.

  10. Now, one is able to compute the root as rootab00=H(Bab00). Note that the hash is computed with the branch Bab0 on the left because the lsb of both keys, Ka and Kb, equals 0. That is, between the two edges leading up to the rootab00, the branch Bab0 must be on the edge from the left, while “0” is on the edge from the right.

See the below figure depicting the SMT representing the two key-value pairs (Ka,Va) and (Kb,Vb), where Ka=11011000 and Kb=10010100.

Two key-value pairs SMT - Case 3

There are several other SMTs of two key-value pairs (Kx,Vx) and (Kz,Vz) that can be constructed depending on how long the strings of the common least-significant bits between Kx and Kz are.

In general, when building an SMT, leaves of key-value pairs with the same least-significant key-bits share the same navigational path only until any of the corresponding key-bits differ. These common strings of key-bits dictate where the leaf storing the corresponding value is located in the tree.


Last update: January 17, 2024
Authors: avenbreaks