Merkle Mountain Range Multi Proofs

Do not index

Merkle mountain ranges are an improvement over conventional merkle trees for growing, potentially unbounded lists. Where conventional merkle tree constructions over growing lists prove very inefficient to compute, as all nodes in the tree must be recomputed. Merkle mountain ranges amortise this cost by growing subtrees incrementally and merging subtrees at the same height, rather than growing the full tree.

You can observe how the tree grows by looking at the order of the nodes.

Merkle mountain ranges, true to it’s name, are a type of merkle tree that is composed of perfectly balanced merkle trees (ie each sub tree has leaves = a power of 2) such that it does indeed look like a mountain range. This construction provides a few benefits:

Smaller proof sizes when compared to conventional merkle trees for very large sets of data (especially for more recent leaves in the tree).

Better insertion complexity, where the insertion complexity of conventional merkle trees is .

First we define the termial function as the sum:

Where can be evaluated through the function:

Let’s recall the properties of merkle trees:

Where:

height of the tree

number of leaves in the tree.

For merkle trees, the number of proof nodes for a single item proof is defined as . In order to understand the improvements that mmr’s bring to the table, lets consider the number of leaves present in both trees to get a maximum proof size of nodes. Note that the number of proof items needed for an item in a merkle tree = height of the merkle tree.

Since an mmr is itself composed of perfectly balanced binary trees with decreasing heights. The total number of leaves in an mmr can be expressed as , where is the total number of leaves in the first subtree. We can therefore derive the maximum number of leaves in an mmr where the height of the first subtree is using the function:

Whereas for a conventional merkle tree, whose height is the total number of leaves is:

So now, we see how mmrs enable more efficient for merkle proof sizes on much larger data sets than conventional merkle trees.

MMR Multi Proofs

Our approach to verifying mmr multi proofs will be to regard each sub tree as an isolated merkle tree, using the -index model defined in my previous article. The -index of each node in an mmr, will be the distance from the left-most node in each subtree.

We can describe each sub tree as a standalone merkle tree.

Given this model, we can re-use the calculate_merkle_multi_root function defined in my previous article to verify mmr multi proofs using the algorithm:

pub fn calculate_mmr_root(
    mut leaves: Vec<(u64, usize, Hash)>, // mmr_index, k_index, node_hash
    mmr_size: u64,
    mut proof_iter: Vec<Hash>,
) -> Hash {
    let peaks = get_peaks(mmr_size);
    let mut peak_roots = vec![];

    for peak in peaks {
        let mut leaves: Vec<_> = take_while(&mut leaves, |(pos, _, _)| *pos <= peak);

        match leaves.len() {
            1 if leaves[0].0 == peak => {
                // this is a peak root.
                peak_roots.push(leaves.pop().unwrap().2);
            }
            0 => {
                // the next proof item is a peak
                if let Some(peak) = proof_iter.pop() {
                    peak_roots.push(peak)
                } else {
                    break;
                }
            }
            _ => {
                let leaves = leaves
                    .into_iter()
                    .map(|(_, index, leaf)| {
                        (index, leaf)
                    })
                    .collect::<Vec<_>>();

                let height = pos_to_height(peak);
                let mut current_layer: Vec<_> = leaves.iter().map(|(i, _)| *i).collect();
                let mut sub_tree: Vec<Vec<_>> = vec![];

                for i in 0..height {
                    let siblings = sibling_indices(current_layer.clone());
                    let diff = difference(&siblings, &current_layer);
                    if diff.len() == 0 {
                        // fill the remaining layers
                        sub_tree.extend((i..height).map(|_| vec![]));
                        break;
                    }

                    let len = diff.len();
                    let layer = diff.into_iter().zip(proof_iter.drain(..len)).collect();
                    sub_tree.push(layer);
                    current_layer = parent_indices(siblings);
                }

                // insert the leaves at the base layer.
                sub_tree[0].extend(&leaves);
                sub_tree[0].sort_by(|a, b| a.0.cmp(&b.0));

                let peak_root = calculate_merkle_multi_root(sub_tree);
                peak_roots.push(peak_root);
            }
        };
    }

    // bagging peaks from right to left via hash(right, left).
    while peak_roots.len() > 1 {
        let right_peak = peak_roots.pop().unwrap();
        let left_peak = peak_roots.pop().unwrap();
        let mut buf = vec![];
        buf.extend(&right_peak);
        buf.extend(&left_peak);

        peak_roots.push(keccak256(&buf));
    }
    peak_roots.pop().unwrap()
}

References

P. Todd. Merkle Mountain Ranges. GitHub, 2012. https://opentimestamps.org.

Donald Knuth in The Art of Computer Programming.

(Third edition, Volume 1, page 48).