Disjoint Set Union (DSU) Data Structure with Practical Examples

Introduction
Operations in DSU
DSU Optimizations
- Path Compression
- Union by Size / Rank
Implementation of DSU in C
Time Complexity of DSU
Finding Connected Components in a Graph
Applications of DSU
Run C Programming Online Compiler

Introduction

The Disjoint Set Union (DSU), also known as Union-Find, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It supports two main operations efficiently: finding the representative of a set and merging two sets. This structure is widely used in various computer science applications such as network connectivity, image processing, and Kruskal’s algorithm for finding Minimum Spanning Trees (MST).

Operations in `DSU`

We represent DSU as a forest of trees, where each tree corresponds to one set, and the root of the tree acts as the representative or leader of that set. In this structure, every element points to its parent, and the root is the ultimate parent for all elements in the set. This hierarchical organization allows efficient management of the sets and enables quick determination of the representative element through traversal to the root. It supports three primary operations: Initializing a set, finding representative of a set and union of two sets. These operations are fundamental in managing the disjoint sets.

Initializing a set

This operation initializes a new set containing the single element v. It sets the parent of v to v itself, indicating that v is the representative (or leader) of its own set. The parent array is used to keep track of the representative of each set.

Implementation in C:


void make_set(int v) {
    parent[v] = v; // Each element is its own parent
}

Finding representative

This operation returns the representative (or leader) of the set containing element v. It can be used to check if two elements belong to the same set.

Implementation in C:


int find_set(int v) {
    if (parent[v] != v) {
        return find_set(parent[v]);
    }
    return v;
}

Union of two sets

This operation merges the sets containing elements a and b. It finds the representatives of both sets, if the roots are different connects one set to the other, effectively merging them into a single set.

Implementation in C:


void union_sets(int a, int b) {
    int rootA = find_set(a);
    int rootB = find_set(b);

    if (rootA != rootB) {
        parent[rootB] = rootA;
    }
}

Time Complexity

make_set(v): O(1)
find_set(v): In the worst case, O(n) if the tree is completely unbalanced.
union_sets(a, b): In the worst case, O(n) if the tree is completely unbalanced.

`DSU` Optimizations

Path Compression

This optimization technique flattens the structure of the tree whenever Find is called by making every node on the path point directly to the root. This helps in speeding up future Find operations, making them nearly constant time on average. Without path compression, the time complexity of Find can be as bad as O(n), but with path compression, it becomes O(log n).

Implementation in C:


int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]); // Path compression
    }
    return parent[x];
}

Union by Size / Rank

In the previous union implementation, the second tree always got attached to the first one. It can lead to trees containing chains of length O(n) . This optimization ensures that the smaller tree is always attached under the root of the larger tree. The rank array keeps track of the depth of the trees to decide which tree to merge under which. This prevents the trees from becoming too tall, ensuring that the Find operation remains efficient. To use this optimization we need to introduce new array size which keeps the size of the components.

Implementation in C:


void union_sets(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b) {
        if (size[a] < size[b]) {
            // Swap to ensure that we attach the smaller tree to the larger one
            int temp = a;
            a = b;
            b = temp;
        }
        parent[b] = a;
        size[a] += size[b];
    }
}

Implementation of `DSU` in C


#include <stdio.h>
#include <stdlib.h>

#define MAXN 1000  // Adjust MAXN based on your needs

// Global arrays for parent and size
int parent[MAXN];
int size[MAXN];

// Function to initialize the DSU structure
void init_dsu(int n) {
    for (int i = 0; i < n; ++i) {
        parent[i] = i;
        size[i] = 1;
    }
}

// Find with path compression
int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]); // Path compression
    }
    return parent[x];
}

// Union by size
void union_sets(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b) {
        if (size[a] < size[b]) {
            // Swap to ensure that we attach the smaller tree to the larger one
            int temp = a;
            a = b;
            b = temp;
        }
        parent[b] = a;
        size[a] += size[b];
    }
}

// Example usage
int main() {
    int n = 10; // Number of elements, adjust as needed
    init_dsu(n);

    // Example union operations
    union_sets(1, 2);
    union_sets(2, 3);
    union_sets(4, 5);

    // Example find operations
    printf("Find(1): %d\n", find(1));
    printf("Find(2): %d\n", find(2));
    printf("Find(4): %d\n", find(4));

    return 0;
}

Output:
Find(1): 1
Find(2): 1
Find(4): 4

Time Complexity of `DSU`

When combining the optimizations of path compression with union by size or rank in a Disjoint Set Union (DSU) data structure, we achieve nearly constant time complexity for queries. Specifically, the amortized time complexity is O(α(n)), where α(n) is the inverse Ackermann function. This function grows extremely slowly, so it does not exceed 4 for all practical values of n.
Amortized complexity refers to the average time per operation when considering a sequence of multiple operations. This means that while a single operation may take longer in some cases, the average time across many operations is very small. Without path compression, only using union by size or rank, the time complexity per query is O(log n).

Finding Connected Components in a Graph

One of the primary applications of the Disjoint Set Union (DSU) or Union-Find data structure is to manage and query connected components in a graph. Let’s consider the following problem,

we are given edges of a graph. We need to answer whether vertices a and b are in the same connected component.

With DSU, we efficiently answer this in almost linear time by:

Adding vertices and edges to the graph.
Using union operations to merge connected components.
Using find operations to check if two vertices are in the same component.


#include <stdio.h>
#include <stdlib.h>

// Define a maximum number of vertices
#define MAXN 1000  

// Global arrays for parent and size
int parent[MAXN];
int size[MAXN];

// Function to initialize the DSU structure
void init_dsu(int n) {
    for (int i = 0; i < n; ++i) {
        parent[i] = i;
        size[i] = 1;
    }
}

// Function to swap two integers
void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Find with path compression
int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]); // Path compression
    }
    return parent[x];
}

// Union by size
void union_sets(int a, int b) {
    a = find(a);
    b = find(b);
    if (a != b) {
        if (size[a] < size[b]) {
            swap(&a, &b); // Swap to ensure we attach the smaller tree to the larger one
        }
        parent[b] = a;
        size[a] += size[b];
    }
}

// Function to check if two vertices are in the same connected component
int are_connected(int a, int b) {
    return find(a) == find(b);
}

// Example usage
int main() {
    int n = 10; // Number of vertices, adjust as needed
    init_dsu(n);

    // Adding edges
    union_sets(1, 2);
    union_sets(2, 3);
    union_sets(4, 5);
    union_sets(6, 7);
    union_sets(5, 6);

    // Querying if vertices are in the same connected component
    printf("Are 1 and 3 connected? %s\n", are_connected(1, 3) ? "Yes" : "No");
    printf("Are 1 and 4 connected? %s\n", are_connected(1, 4) ? "Yes" : "No");
    printf("Are 4 and 7 connected? %s\n", are_connected(4, 7) ? "Yes" : "No");

    return 0;
}

Output:
Are 1 and 3 connected? Yes
Are 1 and 4 connected? No
Are 4 and 7 connected? Yes

Applications of `DSU`

Finding Connected Components in a Graph : DSU helps efficiently track and merge connected components as edges are added. It can answer queries about whether two vertices belong to the same connected component.
Kruskal’s Minimum Spanning Tree Algorithm : DSU is used to detect cycles while adding edges in Kruskal’s algorithm. It ensures that adding an edge does not create a cycle, thus maintaining a forest that eventually becomes the MST.
Dynamic Connectivity in Networks : In network management, DSU can handle dynamic connectivity queries, allowing real-time updates and checks on the connectivity status of nodes.
Image Processing : DSU can be used to label connected components in a binary image, identifying and grouping pixels that are part of the same connected region.
Network Design and Optimization : DSU is used in network design algorithms to manage and optimize the connectivity of various components, ensuring efficient design and minimal cost while keeping components connected.
Tarjan’s LCA Algorithm : Tarjan’s algorithm uses DSU to efficiently answer multiple LCA queries after preprocessing the tree.

Run C Programming Online Compiler

To make your learning more effective, exercise the coding examples in the text editor below.

Run C programming online

Previous
Single-Source Shortest Paths Algorithm - Dijkstra's Algorithm

Next
Introduction to Binary Search with Practical Examples

Data Structure and Algorithm

Disjoint Set Union (DSU) Data Structure with Practical Examples

¶Introduction

¶Operations in DSU

¶Initializing a set

¶Finding representative

¶Union of two sets

¶DSU Optimizations

¶Path Compression

¶Union by Size / Rank

¶Implementation of DSU in C

¶Time Complexity of DSU

¶Finding Connected Components in a Graph

¶Applications of DSU

¶Run C Programming Online Compiler

All Tutorials in this playlist

Introduction to Linked List Data Structure with Practical Examples

Introduction to Queue Data Structure with Practical Examples

Introduction to Stack Data Structure with Practical Examples

Introduction to Set Data Structure with Practical Examples

Introduction to Map Data Structure with Practical Examples

Priority Queue Data Structure with Practical Examples

Introduction to Trie Data Structure with Practical Examples

Segment Tree Data Structure with Practical Examples Part 01

Introduction to Graph Data Structure with Practical Examples

Depth First Search - DFS Algorithm with Practical Examples

Breadth First Search - BFS Algorithm with Practical Examples

Single-Source Shortest Paths Algorithm - Dijkstra's Algorithm

Introduction to Tree Data Structure with Practical Examples

Disjoint Set Union (DSU) Data Structure with Practical Examples

Kruskal's Algorithm for Finding the Minimum Spanning Tree in a Graph

Introduction to Binary Search with Practical Examples

Sliding Window Technique with Two Practical Examples

The Two Pointers Method with Three Practical Examples

Popular Tutorials

Vue JS 3 - Complete Advance CRUD Example with Vuex, Vue-Validate, Pagination, Searching and Everything

Laravel Eloquent Vs DB Query Builder [Performance and other statistics]

Laravel Role Permission Management System Full Example with Source Code

Start WordPress Plugin Development with React JS easily in just few steps

Laravel Membership Management Full Project Demo with Source Codes

Numerical Methods for Engineers EBook & Solution Download - Download PDF

Laravel Ecommerce Complete Web Application with Admin Panel

Connect your React app with WebSocket to get real time Data

#AJAX With Laravel API and More - Learn Laravel Beyond the Limit

How to redirect back in Laravel with Form Input and many possible ways

Categories

Tags

Introduction

Operations in `DSU`

Initializing a set

Finding representative

Union of two sets

`DSU` Optimizations

Path Compression

Union by Size / Rank

Implementation of `DSU` in C

Time Complexity of `DSU`

Finding Connected Components in a Graph

Applications of `DSU`

Run C Programming Online Compiler