 Merkle Tree
		Merkle Tree
The concept of a Merkle Tree as a binary authentication tree was introduced in [MER79].
- A Merkle Tree is a binary authentication tree built over a set of $n$ values, $Y_1 \ldots Y_n$.
- These $Y$ may be chunks of data we need to authenticate (perhaps in a Bitcoin chain), or OTS public keys we may want to publish as part of a larger scheme.
- The parent of each leaf $Y_i$ is the hash of that value, $H(Y_i)$.
- Each node in the binary tree is the hash of the values of both its children. 
	- node = H(leftchild || rightchild)
 
- The value of the root node is published as a public authentication value.
Merkle Tree with $n=8$

Authenticating the leaf values
- If we know all the values, $Y_1 \ldots Y_n$, we can reconstruct the tree and compare the resulting root value with the published value.
- But this means an end user needs to store all the values, $Y_1 \ldots Y_n$
- What if we just need to verify one value, say $Y_5$?
- To do this we actually only need the values of the nodes at $C$, $6$ and $1$ (shown shaded in the diagram above).
- Along with the value of $Y_5$ we can use these 3 values to compute the root value and compare to the published value.
- We call this an authentication path.
- Note we require the value of exactly one node at each level to do this.
- That means if we have $n$ leaves, the authentication path will consist of $\log_2(n)$ values, e.g. $\log_2(8) = 3$
Example Merkle Tree
- 
Here is a simple example of a Merkle tree. To make the representation of the tree manageable, we cheat and simplify the hash function so it only outputs a digest value of 4 bytes. This is obviously not secure, but it makes displaying the results less cumbersome, and still shows the underlying concept. 
- 
In this example, we have eight key values, $Y_0,\ldots,Y_7$, each 4 bytes long represented in hexadecimal. (Note index-0 this time) 
- 
The leaf nodes (7-E) contain the 4-byte hash of the keys. 
- 
Each internal node contains the hash of its left and right child node value. 

- 
The authentication path for $Y_4$ is C $\rightarrow$ 6 $\rightarrow$ 1. 
- 
The path from the root to the leaf with index 4 (binary 100) is shown in blue. The index value in binary is interpreted bit-by-bit, going to the left child if the bit is zero and to the right child if the bit is one. 
- 
The authentication path consists of the siblings of the nodes on the path to the leaf. 
Calculations for example Merkle tree
Our special "weak" hash function $H$ is defined in Python as follows. (Note that the hash digests are computed over the decoded binary values of the hex strings.)
import hashlib
def H(val: str) -> str:
    """Weak hash function returning hex-encoded 4-byte hash."""
    return hashlib.sha256(bytes.fromhex(val)).hexdigest()[:8]
Some sample calculations.
B = H(Y_4) = H(f7d5e02e) = 59bb626f 5 = H(B||C) = H(59bb626f || 804c9bdb) = 7890c8b6 2 = H(5||6) = H(7890c8b6 || e090b2ce) = 009a4350 root, 0 = H(1||2) = H(076f83f6 || 009a4350) = 03583268
- The authentication path for leaf $Y_4=\texttt{f7d5e02e}$ is
- $C = \texttt{804c9bdb}$
- $6 = \texttt{e090b2ce}$
- $1 = \texttt{076f83f6}$
 
- 
The path can be verified by showing that
	- $H(1 \parallel H(H(H(Y_4)\parallel C) \| 6)) = \text{root}$
- There is a neat way to do this using binary numbers shown in the Python code below.
 
Python code
# Required root value in Merkle tree
root = "03583268"
# Authentication path
authpath = [
"804c9bdb",
"e090b2ce",
"076f83f6",
]
print("authpath=", [x for x in authpath])
# Known key value
sk = "f7d5e02e" # Y_4
print(f"key={sk}")
# Compute leaf value above sk
idx = 4  # 0-based index of Y_4
node = H(sk)  # leaf node
for ap in authpath:
    print(f"idx={idx}")
    # Get last bit of child idx
    lastbit = idx & 1
    # Move up to next level
    idx >>= 1
    print(f"child node={node}")
    print(f"authpath={ap}")
    # Last bit of child determines left/right order of authpath
    if (lastbit):
        print(f"m1,m2={ap} {node}") 
        node = H(ap + node)
    else:
        print(f"m1,m2={node} {ap}") 
        node = H(node + ap)
    print(f"node={node}")
print(f"Required root={root}")
assert(root == node)
Output:
authpath= ['804c9bdb', 'e090b2ce', '076f83f6'] key=f7d5e02e idx=4 child node=59bb626f authpath=804c9bdb m1,m2=59bb626f 804c9bdb node=7890c8b6 idx=2 child node=7890c8b6 authpath=e090b2ce m1,m2=7890c8b6 e090b2ce node=009a4350 idx=1 child node=009a4350 authpath=076f83f6 m1,m2=076f83f6 009a4350 node=03583268 Required root=03583268
Benefits of using a Merkle Tree
- 
The issuer can publish a single root value as a "trusted" authentication value for the $n$ leaves. 
- 
The issuer can publish a single leaf value along with its authentication path. 
- 
A verifier can recompute and check that the authentication path results in the "trusted" root value. 
A warning about Merkle trees
- 
See Bitcoin vulnerability (CVE-2012-2459) a vulnerability if the number of hashes in the list at a given level is odd. 
- 
Hint: Merkle trees work most efficiently (and safely) if they are perfect.
In a perfect tree, all internal nodes have two children and all leaves are at the same level. 
A perfect Merkle tree of height $h$ has exactly $2^h$ leaf nodes.
This avoids the Bitcoin vulnerability above.
- Some authors use the word complete, but not all definitions of "complete" mean "perfect" as described above.
 
| << previous: Winternitz One-Time Signature (WOTS) | Contents | next: Basic Merkle Signature Scheme >> | 
Rate this page
Contact us
To comment on this page or to contact us, please send us a message.
This page first published 17 March 2023. Last updated 9 September 2025.


