Scapegoat Trees In C++ - C++ Programming Tutorial
C++ Course / Data Structures / Scapegoat Trees In C++

Scapegoat Trees In C++

BLUF: Mastering Scapegoat Trees In C++ is a critical step in becoming a proficient C++ developer. This lesson provides a deep dive into the syntax, performance considerations, and real-world applications of this concept.
Key Performance Insight: Scapegoat Trees In C++

C++ is renowned for its efficiency. Learn how Scapegoat Trees In C++ enables low-level control and high-performance computing in the tutorial below.

Scapegoat trees are self-adjusting binary search trees designed to uphold effectiveness in their functions like adding, removing, and finding elements. This is achieved by reconstructing subtrees whenever they lose balance. In contrast to AVL or Red-Black trees that perform rotations immediately after each operation to keep the tree balanced, Scapegoat Trees opt for a more lenient balancing approach, allowing the tree to remain unbalanced for extended periods.

This indicates that instead of instantly readjusting an excessively unbalanced tree, the tree permits the imbalance to escalate. When the imbalance surpasses a specified threshold deep within the tree, a subtree is reconstructed. By implementing the α-balance standard (a criteria for identifying imbalance), the tree guarantees that its height stays logarithmic. This feature aids in sustaining effective tasks such as searching, inserting, and removing elements, particularly in scenarios involving fluctuating and uncertain datasets.

Key Characteristics:

Several key characteristics of Scapegoat Trees in C++ are as follows:

  • Binary Search Tree (BST): A Scapegoat Tree follows standard BST properties. For any node, the left subtree contains values smaller than the node's value, and the right subtree contains larger values. This structure allows efficient searching, whose time complexity is proportional to the height of the tree.
  • Balance Criterion: Scapegoat trees maintain balance according to a criterion controlled by a parameter α. The criterion ensures that in each subtree, neither side becomes significantly larger than the other. If at any time this condition is ever violated, such as, there is an imbalance in the subtree, corrective action is done on the subtree concerned. It does not rely on rotations as in AVL or Red-Black trees ; instead, the tree builds entire subtrees afresh whenever there is an imbalance problem. It keeps the height of the tree logarithmic in the number of nodes.
  • α-Balance: A node is said to be unbalanced in a scapegoat tree if one of its subtrees has its size more than a fraction of the total size of the subtree rooted at that node. If the imbalance is detected, the tree finds a "scapegoat" node (the deepest node that is unbalanced) and rebuilds the subtree rooted at that node.
  • Key Operations:

Key functionalities of Scapegoat Trees in C++ include:

1. Insertion:

A new node can be added to a Scapegoat Tree initially, similar to how it's done in a standard Binary Search Tree (BST). The node is placed correctly by comparing it with other nodes in the tree. Following insertion, the tree evaluates if any nodes have become unbalanced by checking the depth of the newly added node against a depth proportional to log(n), where (n) represents the total number of nodes. If the actual depth exceeds this threshold, the tree identifies the deepest unbalanced node (known as the scapegoat) and proceeds to reconstruct the subtree rooted at that node to regain equilibrium.

2. Deletion:

Deletion in a Scapegoat Tree follows a process akin to deleting a node in a standard Binary Search Tree (BST). The target node is eliminated, and adjustments are made to retain the properties of a BST. Unlike insertions, deletions do not prompt an instant subtree reconstruction. Instead, the tree is rebuilt only when the count of nodes remaining in the tree post-deletions drops below a specified threshold fraction of the maximum tree size. This approach effectively controls the height of the tree within logarithmic limits, even with numerous deletions.

3. Find Scapegoat:

A scapegoat is the specific node identified as having an imbalance. Upon insertion, if the depth of this node surpasses the logarithmic limit, the tree retraces its steps from the newly added node back to the root. During this process, the Size of the subtrees at every node is evaluated. The initial node that fails to meet the α-balance criteria is referred to as the scapegoat. This scapegoat node is then designated as the root of the subtree that requires reconstruction.

4. Rebuild:

Once a sacrificial node is identified, the subtree rooted at that node is reconstructed into an optimally balanced subtree. Typically, this reconstruction involves gathering all nodes within the subtree, arranging them in order, and then constructing a new balanced tree. By performing this reconstruction, it guarantees that the overall height of the tree will not exceed a logarithmic proportion to n. Therefore, any potential performance issues caused by imbalanced nodes are effectively mitigated.

Example:

Let's consider a scenario to demonstrate the Scapegoat Trees implementation in C++.

Example

#include <iostream>
#include <vector>
#include <cmath>

template <typename T>
class ScapegoatTree {
private:
    struct Node {
        T key;
        Node* left;
        Node* right;
        int size;
        Node(T k) : key(k), left(nullptr), right(nullptr), size(1) {}
    };
    Node* root;
    int maxSize;
    const double alpha; 
    // Calculate the Size of a given node's subtree
    int size(Node* node) {
        return node ? node->size : 0;
    }
    // Update the Size of a node based on its children
    void updateSize(Node* node) {
        if (node) {
            node->size = 1 + size(node->left) + size(node->right);
        }
    }
    // Check if a node is alpha-balanced
    bool isBalanced(Node* node) {
        return size(node->left) <= alpha * size(node) && size(node->right) <= alpha * size(node);
    }
    // Perform an in-order traversal to collect nodes in a subtree
    void flatten(Node* node, std::vector<Node*>& nodeList) {
        if (!node) return;
        flatten(node->left, nodeList);
        nodeList.push_back(node);
        flatten(node->right, nodeList);
    }
    // Rebuild a subtree into a balanced BST from a sorted list of nodes
    Node* buildBalancedTree(std::vector<Node*>& nodeList, int start, int end) {
        if (start > end) return nullptr;
        int mid = (start + end) / 2;
        Node* node = nodeList[mid];
        node->left = buildBalancedTree(nodeList, start, mid - 1);
        node->right = buildBalancedTree(nodeList, mid + 1, end);
        updateSize(node);
        return node;
    }
    // Rebuild the subtree rooted at the given node
    Node* rebuild(Node* node) {
        std::vector<Node*> nodeList;
        flatten(node, nodeList);
        return buildBalancedTree(nodeList, 0, nodeList.size() - 1);
    }
    // Recursively insert a node and return the scapegoat node if unbalanced
    Node* insert(Node* node, T key, Node*& scapegoat) {
        if (!node) return new Node(key);
        if (key < node->key) {
            node->left = insert(node->left, key, scapegoat);
        } else {
            node->right = insert(node->right, key, scapegoat);
        }
        updateSize(node);
        if (!isBalanced(node)) {
            scapegoat = node;
        }
        return node;
    }
    // Recursively search for a node with a given key
    Node* search(Node* node, T key) {
        if (!node || node->key == key) return node;
        if (key < node->key) return search(node->left, key);
        return search(node->right, key);
    }
    // Find the minimum node in a subtree
    Node* findMin(Node* node) {
        while (node->left) node = node->left;
        return node;
    }
    // Recursively delete a node and return the updated root
    Node* deleteNode(Node* node, T key) {
        if (!node) return nullptr;
        if (key < node->key) {
            node->left = deleteNode(node->left, key);
        } else if (key > node->key) {
            node->right = deleteNode(node->right, key);
        } else {
            // Node found
            if (!node->left) {
                Node* rightChild = node->right;
                delete node;
                return rightChild;
            } else if (!node->right) {
                Node* leftChild = node->left;
                delete node;
                return leftChild;
            } else {
                Node* minRight = findMin(node->right);
                node->key = minRight->key;
                node->right = deleteNode(node->right, minRight->key);
            }
        }
        updateSize(node);
        return node;
    }
    // Recursively find the depth of the inserted node
    int depth(Node* node, T key, int currentDepth) {
        if (!node) return -1;
        if (node->key == key) return currentDepth;
        if (key < node->key) return depth(node->left, key, currentDepth + 1);
        return depth(node->right, key, currentDepth + 1);
    }
    // Calculate the logarithmic bound for the height
    int logBound(int n) {
        return std::floor(log(n) / log(1.0 / alpha));
    }
    // Private in-order traversal method for printing
    void inOrder(Node* node) {
        if (!node) return;
        inOrder(node->left);
        std::cout << node->key << " ";
        inOrder(node->right);
    }

public:
    ScapegoatTree(double alpha = 2.0 / 3.0) : root(nullptr), maxSize(0), alpha(alpha) {}
    // Insert a key into the scapegoat tree
    void insert(T key) {
        Node* scapegoat = nullptr;
        root = insert(root, key, scapegoat);
        if (!scapegoat) {
            int currentSize = size(root);
            if (currentSize > maxSize) {
                maxSize = currentSize;
            }
        } else {
            if (depth(root, key, 0) > logBound(maxSize)) {
                root = rebuild(scapegoat);
            }
        }
    }
    // Search for a key in the tree
    bool search(T key) {
        return search(root, key) != nullptr;
    }
    // Delete a key from the tree
    void remove(T key) {
        root = deleteNode(root, key);
        if (size(root) < alpha * maxSize) {
            root = rebuild(root);
            maxSize = size(root);
        }
    }
    // Public method to print the tree (in-order traversal)
    void printTree() {
        inOrder(root);
        std::cout << std::endl;
    }
};
int main() {
    ScapegoatTree<int> tree;
    tree.insert(10);
    tree.insert(20);
    tree.insert(30);
    tree.insert(40);
    tree.insert(50);
    tree.insert(60);
    tree.insert(70);
    std::cout << "Tree after insertions: ";
    tree.printTree();
    std::cout << "Searching for 30: " << (tree.search(30) ? "Found" : "Not Found") << std::endl;
    std::cout << "Searching for 100: " << (tree.search(100) ? "Found" : "Not Found") << std::endl;
    tree.remove(30);
    std::cout << "Tree after deleting 30: ";
    tree.printTree();
    return 0;
}

Output:

Output

Tree after insertions: 30 40 50 60 70 
Searching for 30: Found
Searching for 100: Not Found
Tree after deleting 30: 40 50 60 70

Explanation:

This Scapegoat Tree implementation in C++ serves as a self-adjusting Binary Search Tree (BST) that ensures equilibrium by periodically reconstructing imbalanced subtrees. The implementation includes essential elements such as the node structure, adding new nodes, searching for specific nodes, removing nodes, and reconstructing subtrees as needed.

Node Structure:

The structure Node defines every individual node in the tree. It contains a key value, references to its left and right child nodes, and an integer size indicating the total number of nodes in the subtree that has this node as its root. The size attribute is essential for evaluating the subtree's balance. By applying the α-balance rule for assessing balance and reconstructing any imbalanced subtrees, it guarantees the tree's height stays logarithmic. Consequently, operations like searching, inserting, and removing elements are typically effective, especially when dealing with changing and unpredictable datasets.

Tree Structure:

The primary class ScapegoatTree holds essential attributes such as the topmost node (root), the maximum capacity (maxSize) that the tree has attained, and the equilibrium parameter (alpha), commonly configured to 2/3. This parameter determines the point at which a sub-tree is deemed unbalanced. Initially, the root is set as nullptr in an empty tree, and the maxSize is set to zero upon instantiation.

Insertion Operation:

The insert method initially executes a standard BST insertion process. Once the new node is added, the algorithm evaluates if any node along the path from the recently inserted node to the root has become unbalanced by utilizing the isBalanced function. An unbalanced node is identified when either its left or right subtree exceeds a specific fraction (determined by alpha) of the overall size of the subtree.

If there is a disparity, the system identifies the most deeply unbalanced node, also known as the scapegoat. The addition process ensures an average time complexity of O(logn) as reconstruction is a rare event triggered only by imbalances.

Search Operation:

The search method is simple to understand. It navigates through the tree in a manner similar to a standard BST in order to locate the node that holds the specified key. If the key is located, the method returns the corresponding node; if not, it returns nullptr.

Deletion Operation:

The remove method adheres to the conventional BST deletion guidelines, which involve replacing a node with its in-order predecessor or successor (in cases with two children) or deleting the node and reattaching its child. Unlike the insertion operation, deleting a node does not automatically initiate a restructuring process. Instead, the algorithm verifies the tree's size, and if it falls below a certain threshold relative to the maximum size (specifically, less than alpha * maxSize), the entire tree undergoes a rebuilding process to maintain balance.

Subtree Rebuilding:

The process of reconstructing the subtree is essential for preserving the equilibrium of the Scapegoat Tree. This process entails gathering all the nodes within the subtree by employing the flatten method and then rearranging them to form a tree that is perfectly balanced. By doing so, it guarantees that any imbalanced subtrees are rectified with minimal impact on the overall structure of the tree.

Complexity Analysis:

Time Complexity:

Insertion:

  • Insertion in a Scapegoat Tree begins similarly to insertion in a Binary Search Tree (BST), which takes O(logn) in the average case if the tree is balanced. However, after insertion, the tree may become unbalanced, triggering a scapegoat detection and a subtree rebuild.
  • Finding the scapegoat involves traversing the path from the inserted node to the root, which takes O(logn).
  • Amortized complexity: The tree rebuilds are infrequent. Although rebuilding a subtree takes O(k), it happens only after several insertions, and the amortized cost of insertion remains O(logn). This is because the cost of the occasional rebuild is spread across many operations.

Deletion:

  • Deletion follows the standard BST deletion procedure, which is O(logn) if the tree is balanced.
  • Unlike insertion, deletion doesn't trigger an immediate rebuild. Instead, a rebuild is triggered only when the tree's size falls below a threshold relative to the maximum size reached. If triggered, rebuilding the entire tree takes O(n), but again, this is infrequent.
  • Amortized complexity: Like insertion, deletion has an amortized time complexity of O(logn).
  • Querying within a Scapegoat Tree mirrors the process of a typical BST search, with an efficiency of O(logn) when the tree is in equilibrium.
  • Space Complexity:

  • The space complexity is O(n) because each node in the tree takes constant space, and there are n nodes in the tree.
  • Additionally, during rebuilding, a temporary array to hold the nodes of a subtree is used, which may take O(k) space, but k≤n. Thus, the overall space complexity is O(n).

Input Required

This code uses input(). Please provide values below:

Logic Practice
Install Logic Practice
Add to home screen for a faster app-like experience