Constructing simple SMTs
Understanding the finer details of how the Storage SM operates requires a good grasp of the way the zkProver’s Sparse Merkle Trees (SMTs) are constructed. This document explains how these SMTs are built.
Consider key-value pair based binary SMTs. The focus here is on explaining how to construct an SMT that represents a given set of key-value pairs. And, for the sake of simplicity, we assume 8-bit key-lengths.
A NULL or empty SMT has a zero root. That is, it has no key and no value recorded in it. Similarly, a zero node or NULL node refers to a node that carries no value.
A binary SMT with one key-value pair¶
A binary SMT with a single key-value pair
Suppose that
-
One computes the hash
of the value , -
Sets the leaf
, -
Sets the sibling leaf as a NULL leaf, simply represented as “
”, -
Computes the root as
, with the leaf on the left because the . That is, between the two edges leading up to the root, the leaf is on the left edge, while the NULL leaf “ ” is on the right.
See the below figure for the SMT representing the single key-value pair
Note that the last nodes in binary SMT branches are generally either leaves or zero-nodes.
In the case where the least-significant bit, lsb of
This example also explains why we need a zero node. Since all trees used in our design are binary SMTs, a zero node is used as a default sibling for computing the parent node. This helps to differentiate between roots (also between parent nodes) because a root node actually identifies an SMT. Therefore, it is crucial to distinguish between
Binary SMTs with two key-value pairs¶
Consider now SMTs with two key-value pairs,
There are three distinct cases of how corresponding SMTs can be built, each determined by the keys,
Case 1¶
The keys are such that the
Suppose that the keys are given as
To build a binary SMT with this two key-values,
-
One computes the hashes,
and of the values, and , respectively, -
Sets the leaves,
and , -
Checks if the
differs from the , -
Since the
and the , it means the two leaves can be siblings, -
One can then compute the root as,
.
Note that the leaf
See the below figure for the SMT representing the two key-value pairs
Case 2¶
Both keys end with the same key-bit. That is, the
Suppose that the two keys are given as
To build a binary SMT with this two key-values,
-
One computes the hashes,
and of the values, and , respectively. -
Sets the leaves,
and . -
Checks if the
differs from the . Since the and the , it means the two leaves cannot be siblings at this position because it would otherwise mean they share the same tree-address0
, which is not allowed. -
One continues to check if the second least-significant bits of
and differ. Since the and the , it means the two leaves and can be siblings at their respective tree-addresses,00
and10
. -
Next is to compute the hash
and set it as the branch at the tree-address0
. Note that the leaf is on the left because the , while is on the right because the . -
The branch
needs a sibling. Since all the values, and , are already represented in the tree at and , respectively, one therefore sets a NULL leaf “ ” as the sibling leaf to . -
As a result, it is possible to compute the root as,
. Note that, the branch is on the left because the , and must therefore be on the right. That is, between the two edges leading up to the , the branch must be on the edge from the left, while is on the edge from the right.
See the below figure depicting the SMT representing the two key-value pairs
Case 3¶
The first two least-significant bits of both keys are the same, but their third least-significant bits differ.
Suppose that the two keys are given as
-
One computes the hashes,
and of the values, and , respectively. -
Sets the leaves,
and . -
Checks if the
differs from the . Since the and the , it means the two leaves cannot be siblings at this position as it would otherwise mean they share the same tree-address0
, which is not allowed. -
Next verifier continues to check if the second least-significant bits of
and differ. Since the and the , it means the two leaves cannot be siblings at this position, because it would otherwise mean they share the same tree-address00
, which is not allowed. -
Once again he checks if the third least-significant bits of
and differ. Since the and the , it means the two leaves and can be siblings at their respective tree-addresses,000
and100
. -
One then computes the hash
, and sets it as the branch at the tree-address00
. The leaf is on the left because the third , while is on the right because the third . -
The branch
needs a sibling. Since all the values, and , are already represented in the tree at and , respectively, one therefore sets a NULL leaf “ ” as the sibling leaf to . -
One can now compute the hash
, and set it as the branch at the tree-address0
. The hash is computed with the branch on the left because the second lsb of both keys, and , equals . Therefore the NULL leaf “ ” must be on the right as an argument to the hash. -
The branch
also needs a sibling. For the same reason given above, one sets a NULL leaf “ ” as the sibling leaf to . -
Now, one is able to compute the root as
. Note that the hash is computed with the branch on the left because the lsb of both keys, and , equals . That is, between the two edges leading up to the , the branch must be on the edge from the left, while “ ” is on the edge from the right.
See the below figure depicting the SMT representing the two key-value pairs
There are several other SMTs of two key-value pairs
In general, when building an SMT, leaves of key-value pairs with the same least-significant key-bits share the same navigational path only until any of the corresponding key-bits differ. These common strings of key-bits dictate where the leaf storing the corresponding value is located in the tree.