Introduction to Tree Data Structure with Practical Examples

Have you ever wondered how your computer neatly organizes all your files and folders? It’s not just random magic; there’s a clever system behind it called a tree data structure. Each file has its own unique path connecting it to others. This structure makes it super easy to find and manage your files, whether you’re looking for a document or a photo album. But the usefulness of tree data structures doesn’t stop there; they’re also essential in organizing things like database indexing, data compression and even in everyday tasks like navigating through menus on your phone.

Tree Data Structure

A tree data structure is a non-linear, acyclic, hierarchical organization of data, consisting of nodes connected by edges. Unlike linear data structures, trees arrange data in a manner that mirrors natural hierarchies, facilitating rapid data retrieval through various tree traversal algorithms such as inorder, preorder, and postorder traversal.

Serving as the backbone of numerous sophisticated algorithms, the tree data structure and its variants, including the binary search tree, binary tree, and complete binary tree, play pivotal roles in a wide array of real-world applications. These applications span from database management to network algorithms like the minimum spanning tree (MST) and are indispensable in fields requiring hierarchical data representation.

Binary Tree

Fundamental Tree Terminologies

  1. Node: A fundamental unit of a tree data structure that contains data.
  2. Edge: The connection between two nodes in a tree.
  3. Level: The level of a node represents the distance of the node from the root node, with the root node being at level 0.
  4. Subtree: A subtree is a smaller tree structure that is formed by any node in the tree along with all of its descendants.
  5. Parent: A parent node is a node that comes before another node in the tree structure and serves as the immediate predecessor of the node in question.
  6. Child: A child node is a node that directly follows another node in the tree. These are the immediate successors of their respective parent nodes.
  7. Root Node: At the very top of the tree hierarchy, there exists a special node called the root node. This node does not have any parent nodes and serves as the starting point for traversing the entire tree.
  8. Leaf Node: Leaf nodes are nodes that do not have any child nodes. These nodes mark the end points of branches in the tree structure.
  9. Ancestor: An ancestor of a node refers to any node that precedes it along the path from the root node to that specific node.
  10. Descendant: A descendant is any node that can be reached by following a path from a particular node.

Types of Tree Data Structure

There are several types of tree data structures, each with its own characteristics and applications. Some common types include:

  1. Binary Tree: A binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. This structure is widely used in computer science for various applications such as binary search trees and expression trees.

  2. Binary Search Tree (BST): A binary search tree is a type of binary tree in which the nodes are arranged in a specific order. For any node in the tree, all nodes in its left subtree have values less than the node’s value, and all nodes in its right subtree have values greater than the node’s value. BSTs are efficient for searching, inserting, and deleting elements.

  3. AVL Tree: An AVL tree is a self-balancing binary search tree in which the heights of the two child subtrees of any node differ by at most one. This balancing ensures that the tree remains relatively balanced, leading to improved performance for search, insertion, and deletion operations.

  4. B-Tree: A B-tree is a balanced tree data structure that maintains sorted data and allows for efficient insertion, deletion, and search operations. B-trees are used in databases and file systems where large amounts of data need to be stored and accessed efficiently.

  5. Red-Black Tree: A red-black tree is a type of binary search tree with additional properties that ensure it remains balanced. These properties include coloring nodes either red or black and enforcing rules that maintain the balance of the tree during insertion and deletion operations. Red-black trees are used in various applications, including implementing sets and maps in programming languages and databases.

  6. Trie: A trie, also known as a prefix tree, is a tree data structure used for storing a dynamic set of keys where each node represents a common prefix of its descendants’ keys. Tries are commonly used in applications involving string manipulation and information retrieval, such as autocomplete features in text editors and search engines.

Implementation of a Tree in C++

#include <iostream>
#include <vector>

using namespace std;

// Function to add an edge between two nodes in the tree
void addEdge(vector<vector<int>>& tree, int parent, int child) {
    tree[parent].push_back(child);
}

// Function to perform a depth-first traversal of the tree
void dfs(const vector<vector<int>>& tree, int node) {
    cout << node << " ";

    for (int child : tree[node]) {
        dfs(tree, child);
    }
}

int main() {
    // Define the tree using adjacency lists
    vector<vector<int>> tree(6); // Assuming 6 nodes in the tree
    // Add edges to connect the nodes
    addEdge(tree, 0, 1);
    addEdge(tree, 0, 2);
    addEdge(tree, 1, 3);
    addEdge(tree, 1, 4);
    addEdge(tree, 2, 5);
    // Perform depth-first traversal starting from the root (node 0)
    cout << "Depth-First Traversal:" << endl;
    dfs(tree, 0);
    cout << endl;
    return 0;
}

Output:
Depth-First Traversal:
0 1 3 4 2 5

In the adjacency list representation:

  1. Node Representation: Each node in the tree is represented by an index in the vector. This index corresponds to the node’s position in the vector. For example, if a node has an index i, its adjacency list is stored at tree[i].
  2. Adjacency Lists: Each element of the vector (tree[i]) is itself a list or vector. This list contains the indices of nodes that are adjacent to the node represented by index i. In a tree, these adjacent nodes are typically the children of the node represented by index i.
  3. Adding Edges: To add an edge between two nodes in the tree, you simply append the index of the child node to the adjacency list of the parent node.
  4. Traversal: Traversing the tree using adjacency lists is typically done recursively. You start at the root node and explore its adjacency list, visiting each adjacent node (child), and recursively exploring their adjacency lists. This representation is memory-efficient, especially for sparse trees where nodes have relatively few children. It also allows for efficient traversal and exploration of the tree structure.

Tree Traversal Methods:

  1. Preorder Traversal: This method explores the current node before its child nodes. It’s commonly employed for tasks like creating a copy of the tree or generating prefix expressions from an expression tree.
#include <iostream>

using namespace std;

// Definition of a binary tree node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Function to perform preorder traversal
void preOrderTraversal(TreeNode* root) {
    if (root == nullptr) return;

    // Visit the current node
    cout << root->data << " ";

    // Traverse the left subtree
    preOrderTraversal(root->left);

    // Traverse the right subtree
    preOrderTraversal(root->right);
}

int main() {
    // Construct the tree
    TreeNode* root = new TreeNode(1);
    root->left = new TreeNode(2);
    root->right = new TreeNode(3);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(5);
    root->right->left = new TreeNode(6);
    root->right->right = new TreeNode(7);

    // Perform preorder traversal
    cout << "Preorder traversal: ";
    preOrderTraversal(root);
    cout << endl;

    // Clean up memory
    delete root->left->left;
    delete root->left->right;
    delete root->left;
    delete root->right->left;
    delete root->right->right;
    delete root->right;
    delete root;

    return 0;
}

Output:
Preorder traversal: 1 2 4 5 3 6 7

  1. Inorder Traversal: In this approach, all nodes in the left subtree are visited first, followed by the current node, and then all nodes in the right subtree. In binary search trees, this results in nodes being visited in ascending order, making it ideal for tasks such as sorting.
#include <iostream>
#include <vector>

using namespace std;

// Definition of a binary tree node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Function to perform in-order traversal
void inOrderTraversal(TreeNode* root) {
    if (root == nullptr) return;

    // Traverse the left subtree
    inOrderTraversal(root->left);

    // Visit the current node
    cout << root->data << " ";

    // Traverse the right subtree
    inOrderTraversal(root->right);
}

int main() {

    // Construct the tree from the adjacency list
    TreeNode* root = new TreeNode(1);
    root->left = new TreeNode(2);
    root->right = new TreeNode(3);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(5);
    root->right->left = new TreeNode(6);
    root->right->right = new TreeNode(7);

    // Perform in-order traversal
    cout << "In-order traversal: ";
    inOrderTraversal(root);
    cout << endl;
	
	    // Clean up memory
    delete root->left->left;
    delete root->left->right;
    delete root->left;
    delete root->right->left;
    delete root->right->right;
    delete root->right;
    delete root;

    return 0;
}

Output:
In-order traversal: 4 2 5 1 6 3 7

  1. Postorder Traversal: Nodes are visited only after all their descendants have been visited. This method is useful for operations like deleting the tree or performing garbage collection.
#include <iostream>

using namespace std;

// Definition of a binary tree node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Function to perform postorder traversal
void postOrderTraversal(TreeNode* root) {
    if (root == nullptr) return;

    // Traverse the left subtree
    postOrderTraversal(root->left);

    // Traverse the right subtree
    postOrderTraversal(root->right);

    // Visit the current node
    cout << root->data << " ";
}

int main() {
    // Construct the tree
    TreeNode* root = new TreeNode(1);
    root->left = new TreeNode(2);
    root->right = new TreeNode(3);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(5);
    root->right->left = new TreeNode(6);
    root->right->right = new TreeNode(7);

    // Perform postorder traversal
    cout << "Postorder traversal: ";
    postOrderTraversal(root);
    cout << endl;

    // Clean up memory
    delete root->left->left;
    delete root->left->right;
    delete root->left;
    delete root->right->left;
    delete root->right->right;
    delete root->right;
    delete root;

    return 0;
}

Output:
Postorder traversal: 4 5 2 6 7 3 1

  1. Level Order Traversal: This BFS variant entails visiting all nodes at one depth level before moving to the next level. It employs a FIFO (First-In-First-Out) queue to ensure that nodes are processed in the order they are encountered.
#include <iostream>
#include <queue>

using namespace std;

// Definition of a binary tree node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;

    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Function to perform level order traversal
void levelOrderTraversal(TreeNode* root) {
    if (root == nullptr) return;

    // Create a queue for level order traversal
    queue<TreeNode*> q;
    q.push(root);

    while (!q.empty()) {
        TreeNode* current = q.front();
        q.pop();

        // Visit the current node
        cout << current->data << " ";

        // Enqueue the left child if exists
        if (current->left != nullptr) {
            q.push(current->left);
        }

        // Enqueue the right child if exists
        if (current->right != nullptr) {
            q.push(current->right);
        }
    }
}

int main() {
    // Construct the tree
    TreeNode* root = new TreeNode(1);
    root->left = new TreeNode(2);
    root->right = new TreeNode(3);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(5);
    root->right->left = new TreeNode(6);
    root->right->right = new TreeNode(7);

    // Perform level order traversal
    cout << "Level order traversal: ";
    levelOrderTraversal(root);
    cout << endl;

    // Clean up memory
    delete root->left->left;
    delete root->left->right;
    delete root->left;
    delete root->right->left;
    delete root->right->right;
    delete root->right;
    delete root;

    return 0;
}

Output:
Level order traversal: 1 2 3 4 5 6 7

Applications of Tree Data Structures

  1. File System Organization: Trees structure the directory hierarchy in computer file systems, allowing for organized and efficient file storage and retrieval
  2. Data Compression: Huffman coding uses tree structures to encode data, reducing the size of data files by assigning shorter codes to more frequently used characters
  3. Abstract Syntax Trees (AST): Compilers use ASTs to parse and convert source code into an intermediate, tree-based representation, facilitating the subsequent stages of code generation and optimization.
  4. Parsing Mechanisms: In compilers and interpreters, parsing trees analyze grammatical structure, significantly aiding in the programming language translation process.
  5. Game Playing Algorithms: In AI, decision trees predict possible moves in games like chess, optimizing decision-making processes by evaluating potential future states.
  6. Networking Algorithms: Trees underpin algorithms for network routing and topology discovery, crucial for data packet management and pathfinding in network infrastructures.

Run C Programming Online Compiler

To make your learning more effective, exercise the coding examples in the text editor below.

Run C programming online

Previous
Single-Source Shortest Paths Algorithm - Dijkstra's Algorithm
Next
Introduction to Binary Search with Practical Examples