Skip to content

Merkle trees

zkProver’s data is stored in the form of a special Sparse Merkle Tree (SMT), which is a tree that combines the concept of a Merkle Tree and that of a Patricia tree. The design is based on how the Sparse Merkle Trees are constructed and how they store keys and values.

Tip

The content of this document is rather elementary. Experienced developers can fast-forward to later sections and only refer back as the need arises.

A typical Merkle tree has leaves, branches and a root. A leaf is a node with no child-nodes, while a branch is a node with child-nodes. The root is therefore the node with no parent-node.

See the below figure for an example of how a hash function H is used to create a Merkle tree recording eight values;

Va,Vb,Vc,Vd,Ve,Vf,Vg,Vh

A Merkle Tree Example

Firstly, each leaf is nothing but the hash value H(Vi) of a particular value Vi, where i is an element of the index-set {a,b,c,d,e,f,g,h}.

Secondly, the branch nodes are computed as follows;

Bab=H(H(Va)H(Vb)),Bcd=H(H(Vc)H(Vd)),Bef=H(H(Ve)H(Vf)), Bgh=H(H(Vg)H(Vh)),Babcd=H(BabBcd),andBefgh=H(BefBgh).   

Thirdly, the root is computed as roota..h=H(BabcdBefgh).

Leaves that share a parent-node are called siblings. The same terminology applies to branches. For example, Bab and Bcd are sibling branches because they are branches of the same parent, Babcd. Similarly, Befgh and Babcd are sibling branches.

Keys and navigating a Merkle tree

Note the bits used to label edges in figure above. The strings of these bits are called Keys, and they indicated where a node is located in a Merkle Tree. Since these keys uniquely locate nodes, they are used to navigate from the root to the leaves (and backwards).

Suppose one is given the key-value pair (Kd,Vd), where the key is Kd=10010110.

In order to locate the key-value pair (Kd,Vd) in the Merkle depicted in the figure above, the key Kd is read bit-by-bit from the right-most bit to the left-most bit. While traversing the tree from the root downwards,

  • a zero-key-bit “0” means “follow the edge going to the left”,

  • a key-bit “1” means “follow the edge going to the right”.

Since Kd=10010110, as follows:

  1. Read the least-significant bit of Kd, which is 0, hence traverse the tree to the left, and reach Babcd.
  2. Then read the second significant key-bit, which is “1” in this case. So take the edge going to the right, reaching Bcd.
  3. Again, read the next key-bit, which is “1”, hence follow the edge going to the right, reaching the leaf H(Vd).

Since H(Vd) is a leaf and not a branch, and the navigation was correctly done with respect to the given key Kd, the H(Vd) must be the leaf storing the value Vd.

One can similarly climb the tree, going in the reverse direction, by using the key-bits of the given key in the reverse order. That is, starting with the last key-bit used to reach the leaf and ending with the least-significant bit of the key.

The tree-address of the value Vx, herein refers to the position of the leaf Lx:=H(Vx), denoted by the key-bits used to reach Ld but in the reverse order.

In the above example, the tree-address of Vd is 011.

A Merkle proof example

Merkle Trees can be used as commitment schemes. And here is an example that follows the (key,value)-pair approach used in the zkProver. Consider the Merkle Tree shown in the figure above.

If the prover has committed to a value Vf by appending a new leaf H(Vf) to the Merkle Tree as depicted in the above figure, he must then avail the following information, to enable verification of his claim;

  1. The Merkle root, denoted by roota..h,
  2. The value Vf, and
  3. The siblings; H(Ve), Bgh and Babcd.

Instead of searching through all hash values stored in the tree, the verifier uses only a few hash values of relevant siblings. That is, three siblings in this case.

The verifier then checks the prover’s claim by computing the Merkle root as follows;

  1. He computes H(Vf), which is the hash of the value Vf.

  2. Then uses the sibling H(Ve) to compute H(H(Ve)H(Vf))=:B~ef, which should be the same as the branch node Bef.

  3. Next, he computes H(B~efBgh)=:B~efgh, corresponding to the branch node Befgh.

  4. Now, uses H(BabcdB~efgh)=:root~a..h.

The Merkle proof is concluded by checking whether root~ah equals to the publicly known root roota..h.

Note

The symbol tilde denoted ~, is used throughout the document to indicate that the computed value, ~, still needs to be checked, or tested to be true.


Last update: January 17, 2024
Authors: avenbreaks