0% complete
6 min left
Keep reading...
Go back

Understanding Van Emde Boas Trees (vEB Trees)

October 03, 2025 | 12:00 AM
6 min read
by Abhinav PandeyFeatured

Van Emde Boas tree provides O(log log U) time complexity for insert, delete, search, successor, and predecessor operations on integer sets.

Understanding Van Emde Boas Trees (vEB Trees)

A van Emde Boas (vEB) tree is a specialized tree data structure for storing a set of integers from a bounded universe.
It supports operations like insert, delete, find (membership), successor, and predecessor extremely efficiently – in O(log log U) time (where U is the universe size), and min/max queries in O(1) time.
In contrast, a balanced binary search tree or a binary heap needs O(log n) time for these operations.
The vEB tree achieves this speed by recursively clustering the universe, at the cost of O(U) space in the naive form.


Core Structure

The key idea of a vEB tree is to recursively divide the universe of keys into √U clusters of size √U each (for U = 2^m).

Each vEB node (representing a range [0..U-1]) stores:

  • min and max: the minimum and maximum keys in the subtree.
  • Clusters: an array of √U child pointers, each covering a subrange.
  • Summary: a vEB structure of size √U that keeps track of which clusters are non-empty.

ASCII Representation

 vEB Node (Universe U)
 ┌───────────────────────────┐
 │ min, max                  │
 │ summary (vEB of size √U)  │
 │ cluster[0 ... √U-1]       │
 └───────────────────────────┘

A number x is split into:

 high(x) = floor(x / √U)   --> Cluster index
 low(x)  = x mod √U        --> Offset inside the cluster

Example for U=16 (√U=4):

 x = 14
 high(14) = 3  (cluster index, covers [12..15])
 low(14)  = 2  (offset inside cluster)

We maintain a dynamic set of integers from a fixed universe of size U (say integers from 0 … U-1), and it supports operations like:

  • insert(x)
  • delete(x)
  • member(x) / contains(x)
  • successor(x) → next larger element
  • predecessor(x) → next smaller element
  • find-min, find-max

The remarkable property is:

  • All these operations run in O(log log U) time (much faster than balanced BSTs like red-black trees, which are O(log n)).

The tradeoff is:

  • It only works when the universe size is bounded and a power of 2 (your code enforces this).
  • It can take O(U) space in the naive form (though improved versions reduce this).

Operations

The magic behind vEB is recursive decomposition of the universe.

Suppose your universe is size U = 16. Numbers are from 0 … 15.

  1. Split the numbers into a high part and a low part:
    • Think of each number as a 2-digit number in base √U.
    • Example: with U=16, √U = 4. So each number in binary (4 bits) is split into:
    • high(x) = top 2 bits → which "cluster" the number goes to
    • low(x) = bottom 2 bits → the position inside that cluster
    x = 13 = (1101)b
    high(x) = 11 = 3
    low(x)  = 01 = 1
  1. A vEB(U) is composed of:
    • A summary structure (a vEB(√U)) that keeps track of which clusters are non-empty.
    • An array of clusters, each a vEB(√U), holding numbers whose high part is the same.
    vEB(16)
    ├── summary: vEB(4)   // which clusters are active
    ├── clusters[0] : vEB(4)
    ├── clusters[1] : vEB(4)
    ├── clusters[2] : vEB(4)
    └── clusters[3] : vEB(4)
  1. Min and Max are cached at the root for O(1) access.

Insertion

Steps to insert x into a vEB tree T:

  1. If tree empty → set min = max = x.
  2. If x < min → swap(x, min).
  3. If x > max → set max = x.
  4. Recurse into cluster[high(x)] with low(x).
  5. If the cluster was empty, insert high(x) into summary.

ASCII Example (Insert 7 into U=16 tree)

 Cluster index = high(7) = 1
 Offset        = low(7)  = 3

 Before Insert:
 summary: { }
 cluster[1]: empty

 After Insert:
 summary: {1}
 cluster[1]: contains {3} (represents 7)

Successor

  1. If x < min → successor = min.
  2. Else, check cluster[high(x)]:
    • If local max > low(x), recurse in same cluster.
    • Else, find next cluster index in summary and return its min.

ASCII intuition:

 summary: {1, 3}
 cluster[1] → has 5,7
 cluster[3] → has 14

 successor(7):
   → cluster[1] max = 7 (not > 7)
   → go to summary, next cluster = 3
   → answer = 12 + cluster[3].min = 14

Deletion

Steps:

  1. If only one element → empty the tree.
  2. If deleting min → replace it with next element from summary+cluster.
  3. Delete recursively from cluster.
  4. If a cluster empties → remove its index from summary.

Why O(log log U)?

  • Each operation descends recursively through clusters.
  • At each step, the universe shrinks from U to √U.
  • Recursion depth = log log U.
  • Each step does O(1) work (constant time splits, min/max updates).
  • Total = O(log log U).

Worked Example

Insert sequence into U=16: [2, 5, 7, 9].

 Insert 2 → min=2, max=2
 Insert 5 → cluster[1]={1}, summary={1}, min=2, max=5
 Insert 7 → cluster[1]={1,3}, summary={1}, min=2, max=7
 Insert 9 → cluster[2]={1}, summary={1,2}, min=2, max=9

ASCII View:

 summary: {1,2}
 cluster[1]: {5,7}
 cluster[2]: {9}
 cluster[0], cluster[3]: empty

Performance Comparison

| Data Structure | Insert/Search/Delete | Min/Max | Space    |
|----------------|----------------------|---------|----------|
| BST (balanced) | O(log n)             | O(log n)| O(n)     |
| Binary Heap    | O(log n)             | O(1)    | O(n)     |
| vEB Tree       | O(log log U)         | O(1)    | O(U)     |

Code Sketch

class VEBTree:
    def __init__(self, u):
        self.u = u
        self.min = None
        self.max = None
        if u > 2:
            self.summary = VEBTree(int(u**0.5))
            self.cluster = [VEBTree(int(u**0.5)) for _ in range(int(u**0.5))]
        else:
            self.summary = None
            self.cluster = []

def insert(T, x):
    if T.min is None:
        T.min = T.max = x
        return
    if x < T.min:
        x, T.min = T.min, x
    if x > T.max:
        T.max = x
    if T.u > 2:
        i = x // int(T.u**0.5)
        j = x % int(T.u**0.5)
        insert(T.cluster[i], j)
        if T.cluster[i].min == T.cluster[i].max == j:
            insert(T.summary, i)

Conclusion

The vEB tree achieves O(log log U) operations by recursively structuring the universe.
It is faster than BSTs and heaps in theory, but consumes O(U) space.
It’s ideal when the universe size is not excessively large compared to the number of elements.

Rubygems for vEB Tree: https://rubygems.org/gems/veb_tree

Source Code: Github

Share this post

Comments (0)

Leave a Comment

Loading comments...