Understanding Van Emde Boas Trees (vEB Trees)
Van Emde Boas tree provides O(log log U) time complexity for insert, delete, search, successor, and predecessor operations on integer sets.
Understanding Van Emde Boas Trees (vEB Trees)
A van Emde Boas (vEB) tree is a specialized tree data structure for storing a set of integers from a bounded universe.
It supports operations like insert, delete, find (membership), successor, and predecessor extremely efficiently – in O(log log U) time (where U is the universe size), and min/max queries in O(1) time.
In contrast, a balanced binary search tree or a binary heap needs O(log n) time for these operations.
The vEB tree achieves this speed by recursively clustering the universe, at the cost of O(U) space in the naive form.
Core Structure
The key idea of a vEB tree is to recursively divide the universe of keys into √U clusters of size √U each (for U = 2^m).
Each vEB node (representing a range [0..U-1]) stores:
minandmax: the minimum and maximum keys in the subtree.- Clusters: an array of √U child pointers, each covering a subrange.
- Summary: a vEB structure of size √U that keeps track of which clusters are non-empty.
ASCII Representation
vEB Node (Universe U)
┌───────────────────────────┐
│ min, max │
│ summary (vEB of size √U) │
│ cluster[0 ... √U-1] │
└───────────────────────────┘
A number x is split into:
high(x) = floor(x / √U) --> Cluster index
low(x) = x mod √U --> Offset inside the cluster
Example for U=16 (√U=4):
x = 14
high(14) = 3 (cluster index, covers [12..15])
low(14) = 2 (offset inside cluster)
We maintain a dynamic set of integers from a fixed universe of size U (say integers from 0 … U-1), and it supports operations like:
- insert(x)
- delete(x)
- member(x) / contains(x)
- successor(x) → next larger element
- predecessor(x) → next smaller element
- find-min, find-max
The remarkable property is:
- All these operations run in O(log log U) time (much faster than balanced BSTs like red-black trees, which are O(log n)).
The tradeoff is:
- It only works when the universe size is bounded and a power of 2 (your code enforces this).
- It can take O(U) space in the naive form (though improved versions reduce this).
Operations
The magic behind vEB is recursive decomposition of the universe.
Suppose your universe is size U = 16. Numbers are from 0 … 15.
- Split the numbers into a high part and a low part:
- Think of each number as a 2-digit number in base √U.
- Example: with U=16, √U = 4. So each number in binary (4 bits) is split into:
- high(x) = top 2 bits → which "cluster" the number goes to
- low(x) = bottom 2 bits → the position inside that cluster
x = 13 = (1101)b
high(x) = 11 = 3
low(x) = 01 = 1
- A vEB(U) is composed of:
- A summary structure (a vEB(√U)) that keeps track of which clusters are non-empty.
- An array of clusters, each a vEB(√U), holding numbers whose high part is the same.
vEB(16)
├── summary: vEB(4) // which clusters are active
├── clusters[0] : vEB(4)
├── clusters[1] : vEB(4)
├── clusters[2] : vEB(4)
└── clusters[3] : vEB(4)
- Min and Max are cached at the root for O(1) access.
Insertion
Steps to insert x into a vEB tree T:
- If tree empty → set
min = max = x. - If x < min → swap(x, min).
- If x > max → set max = x.
- Recurse into cluster[high(x)] with low(x).
- If the cluster was empty, insert high(x) into summary.
ASCII Example (Insert 7 into U=16 tree)
Cluster index = high(7) = 1
Offset = low(7) = 3
Before Insert:
summary: { }
cluster[1]: empty
After Insert:
summary: {1}
cluster[1]: contains {3} (represents 7)
Successor
- If x < min → successor = min.
- Else, check cluster[high(x)]:
- If local max > low(x), recurse in same cluster.
- Else, find next cluster index in summary and return its min.
ASCII intuition:
summary: {1, 3}
cluster[1] → has 5,7
cluster[3] → has 14
successor(7):
→ cluster[1] max = 7 (not > 7)
→ go to summary, next cluster = 3
→ answer = 12 + cluster[3].min = 14
Deletion
Steps:
- If only one element → empty the tree.
- If deleting min → replace it with next element from summary+cluster.
- Delete recursively from cluster.
- If a cluster empties → remove its index from summary.
Why O(log log U)?
- Each operation descends recursively through clusters.
- At each step, the universe shrinks from U to √U.
- Recursion depth = log log U.
- Each step does O(1) work (constant time splits, min/max updates).
- Total = O(log log U).
Worked Example
Insert sequence into U=16: [2, 5, 7, 9].
Insert 2 → min=2, max=2
Insert 5 → cluster[1]={1}, summary={1}, min=2, max=5
Insert 7 → cluster[1]={1,3}, summary={1}, min=2, max=7
Insert 9 → cluster[2]={1}, summary={1,2}, min=2, max=9
ASCII View:
summary: {1,2}
cluster[1]: {5,7}
cluster[2]: {9}
cluster[0], cluster[3]: empty
Performance Comparison
| Data Structure | Insert/Search/Delete | Min/Max | Space |
|----------------|----------------------|---------|----------|
| BST (balanced) | O(log n) | O(log n)| O(n) |
| Binary Heap | O(log n) | O(1) | O(n) |
| vEB Tree | O(log log U) | O(1) | O(U) |
Code Sketch
class VEBTree:
def __init__(self, u):
self.u = u
self.min = None
self.max = None
if u > 2:
self.summary = VEBTree(int(u**0.5))
self.cluster = [VEBTree(int(u**0.5)) for _ in range(int(u**0.5))]
else:
self.summary = None
self.cluster = []
def insert(T, x):
if T.min is None:
T.min = T.max = x
return
if x < T.min:
x, T.min = T.min, x
if x > T.max:
T.max = x
if T.u > 2:
i = x // int(T.u**0.5)
j = x % int(T.u**0.5)
insert(T.cluster[i], j)
if T.cluster[i].min == T.cluster[i].max == j:
insert(T.summary, i)
Conclusion
The vEB tree achieves O(log log U) operations by recursively structuring the universe.
It is faster than BSTs and heaps in theory, but consumes O(U) space.
It’s ideal when the universe size is not excessively large compared to the number of elements.
Rubygems for vEB Tree: https://rubygems.org/gems/veb_tree
Source Code: Github
Share this post
Comments (0)
Loading comments...
Leave a Comment