H Index Ii In C++ - C++ Programming Tutorial
C++ Course / Miscellaneous / H Index Ii In C++

H Index Ii In C++

BLUF: Mastering H Index Ii In C++ is a critical step in becoming a proficient C++ developer. This lesson provides a deep dive into the syntax, performance considerations, and real-world applications of this concept.
Key Performance Insight: H Index Ii In C++

C++ is renowned for its efficiency. Learn how H Index Ii In C++ enables low-level control and high-performance computing in the tutorial below.

The H-Index II conundrum in C++ presents a twist on the traditional H-Index challenge, tailored for application with an arranged array. The H-Index serves as a quantitative measure for evaluating a researcher's output and influence through citations. The objective remains to determine the maximum value, h, representing the threshold where the researcher possesses at minimum h publications each garnering h citations.

Problem Overview:

In a sorted array of citations, containing non-negative integers representing the number of citations a researcher's paper has received, the goal is to calculate the researcher's H-Index. This index is the maximum value, denoted as h, where there are at least h papers with h citations or more. The array is arranged in ascending order.

Approach 1: Simple Approach

Implementation:

Example

#include <vector>
#include <iostream>
int hIndex(const std::vector<int>& citations) {
    int n = citations.size();
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        int h = n - mid; // Number of papers with at least 'citations[mid]' citations
        if (citations[mid] == h) {
            return h; // Found the exact H-Index
        } else if (citations[mid] < h) {
            low = mid + 1; // Move to the right to find a larger H-Index
        } else {
            high = mid - 1; // Move to the left to find a smaller H-Index
        }
    }
    // If no exact match is found, the H-Index is n - low
    return n - low;
}
int main() {
    std::vector<int> citations = {0, 1, 3, 5, 6};
    int result = hIndex(citations);
    std::cout << "The H-Index is: " << result << std::endl;
    return 0;
}

Output:

Output

The H-Index is: 3

Explanation:

Step 1: Entering Data and Initialization

  • The issue provides us with an ordered array of integers, where each integer indicates the number of citations a research paper has garnered. The size of the array, denoted as n, informs us about the total number of research papers. Our objective is to determine the maximum value of h, ensuring that at least h papers have acquired h citations or more.
  • Commencing with the establishment of two pointers: low and high. The low pointer is positioned at the start of the array (index 0), while the high pointer is positioned at the conclusion of the array (index n-1). These pointers delineate the scope within which we will seek the correct h value.

Now, in this step, we implement a binary search method to effectively reduce the search range for the potential H-Index value. The strategy involves determining the midpoint within the list, denoted as 'mid', by averaging the values of low and high. This operation indicates a specific paper located at the mid position along with its corresponding citation count.

In addition to identifying the paper at the mid position, we also assess the number of papers that possess citations equal to or exceeding the citations of this mid paper. This evaluation is carried out by calculating the difference between the total number of papers 'n' and the mid position. This calculation reveals the count of papers that come after the mid position, including the mid paper itself.

Step 3: Condition Checking

  • After calculating the number of papers that meet the citation threshold, we check whether it matches the number of citations of the paper at mid. This forms the crux of the solution:
  • If the number of documents n - mid equals the citation count of the paper at mid, we have found the H-Index. This is because the number of documents with at least h citations is precisely equal to the citation count, satisfying the definition of H-Index.
  • If the number of papers with sufficient citations is less than the citation count, it means we need to search the left side of the list. In this case, we move the high pointer to the left to narrow the search.
  • If the number of papers with sufficient citations is greater than the citation count, it means we need to look at the right side of the list. Therefore, we move the low pointer to the right, expanding our search in that direction.

The process of binary search progresses by iteratively refining the search boundaries through adjustments to the low and high indices according to the comparison outcomes. This refinement continues until the pointers converge, signaling the end of the search. In cases where an exact match is not located, the final outcome is based on the count of papers that satisfy the h-index criteria at the position indicated by the low pointer.

At the conclusion of the binary search process, the H-Index is determined by the point of convergence in the search. In particular, the difference between the total number of papers (n) and the lower bound (low) indicates the highest count of papers having citations equal to or exceeding this value, thereby defining the H-Index.

Time Complexity:

  • Binary Search: At each step, the search space is halved, meaning that instead of scanning all n elements in the list, the algorithm performs approximately log n steps (where n is the number of papers).
  • Since in each step, we only compare the mid-point element with its corresponding citation value, the total time complexity is O(log n).

Space Analysis:

  • This algorithm operates with a fixed amount of additional space, exclusively for storing variables such as low, high, mid, and a handful of other ephemeral values. It does not employ any supplementary data structures such as arrays or lists.
  • Consequently, the space complexity remains O(1), indicating constant space utilization that remains unaffected by input size variations.
  • Approach 2: Using Linear Scan Approach

Given that the array is sorted, a straightforward method involves traversing the array once to compute the H-Index directly. While this technique prioritizes simplicity, it trades off efficiency by sidestepping the intricacies of binary search.

Implementation:

Example

#include <vector>
#include <iostream>
int hIndex(const std::vector<int>& citations) {
    int n = citations.size();
    // Traverse the array from left to right
    for (int i = 0; i < n; ++i) {
        // Calculate the number of papers with at least citations[i] citations
        int h = n - i;
        // If the current citation count is greater than or equal to h
        if (citations[i] >= h) {
            return h; // This is the H-Index
        }
    }
    // If no valid H-Index was found, return 0
    return 0;
}
int main() {
    std::vector<int> citations = {0, 1, 3, 5, 6};
    int result = hIndex(citations);
    std::cout << "The H-Index is: " << result << std::endl;
    return 0;
}

Output:

Output

The H-Index is: 3

Explanation:

Understanding the Input:

  • You have been provided with a collection of references for each article, arranged in increasing order. This indicates that the papers with lower citation counts come first before those with higher citation counts.

Traversal Technique:

  • This method requires iterating through the list starting from the initial point and moving towards the final element. During this process, you will calculate the count of papers with a minimum specified number of citations. The objective is to identify the maximum value, denoted as h, that fulfills the criteria of having no less than h papers with a minimum of h citations.

Determine the Count of Papers with Adequate Citations:

  • At every index within the list, analyze the citation value specified. This value denotes the minimum citations needed for papers to meet the criteria.
  • Derive the total papers that possess citations equal to or surpassing this value. Given the sorted nature of the list, this can be calculated as the total document count minus the current index. To illustrate, when positioned at index i, the number of papers from index i to the list's conclusion (inclusive of the paper at index i) is n - i.

Check the Status:

  • Analyze the existing citation value against the total count of documents with a minimum number of citations.
  • In cases where the citation count at the present position is equal to or exceeds the required number of papers with at least that many citations, it may indicate a potential H-Index. Provide this value as the output.

Continue Iterating:

  • In case the condition is not satisfied, proceed to the next index and iterate the procedure. Keep doing this until you encounter an acceptable H-Index or complete scanning the list.

In scenarios where the entire list has been traversed without discovering a valid H-Index, it indicates the absence of a sufficient number of papers with h citations. Therefore, in such instances, the function should return 0 as the calculated H-Index value.

Complexity analysis:

Time Complexity:

In the Linear Scan Method, you iterate through the complete set of citations just once. Each iteration consists of basic mathematical calculations and comparisons, resulting in a time complexity that scales linearly with the quantity of citations.

Space Complexity:

O(1): This method involves utilizing a fixed amount of additional space. It necessitates only a small number of variables to monitor the present index and the outcome. No further data structures or memory allocations are necessary beyond this.

Approach 3: Using Counting Sort-Based Approach

This method bears resemblance to a counting sort, in which you tally the number of papers with a particular count of citations and then aggregate these tallies to determine the H-Index. This technique is effective when citation values are relatively modest in size.

Implementation:

Example

#include <vector>
#include <iostream>
using namespace std;
int hIndex(const vector<int>& citations) {
    int n = citations.size();
    // Step 1: Create a count array of size n+1
    vector<int> count(n + 1, 0);
    // Step 2: Populate the count array
    for (int citation : citations) {
        if (citation >= n) {
            count[n]++; // Cap citations larger than n to n
        } else {
            count[citation]++;
        }
    }
    // Step 3: Accumulate counts from high to low
    int total = 0;
    for (int i = n; i >= 0; --i) {
        total += count[i]; // Accumulate the number of papers with citations >= i
        if (total >= i) {
            return i; // The first time total >= i is the H-Index
        }
    }
    return 0; // If no valid H-Index is found
}
int main() {
    vector<int> citations = {0, 1, 3, 5, 6};
    int result = hIndex(citations);
    cout << "The H-Index is: " << result << endl;
    return 0;
}

Output:

Output

The H-Index is: 3

Explanation:

Given a list of citation counts for multiple papers, whether sorted or unsorted, with a total of n documents, the initial step involves setting up a counting array. This array, known as the count array, assigns each index to signify the frequency of papers with a particular citation count. By doing so, you establish a mechanism to monitor the occurrence of each citation value effectively.

Counting Citation Occurrences:

  • In the list of citations, for each citation, increase the value in the count array at the position that matches the number of citations.
  • If a paper has more citations than n (which means more citations than the total number of papers), you cap its count at n because having more citations than papers doesn't affect the H-Index.
  • For example, if you have n papers, create a count array of size n + 1. If a paper has five citations, increment the value at index 5. If a paper has more than n citations, increment the last index (n).

Accumulating Counts from High to Low:

  • Once the count array is populated, we need to figure out how many papers have at least i citations for each possible i.
  • To do this, start at the largest index (i.e., the maximum possible citation count, which is n) and move downwards.
  • For each index, accumulate the number of papers with that many or more citations. This helps track how many papers have at least a certain number of citations.
  • As you move from high citation values to low, you're essentially adding up the number of papers that meet or exceed that citation count.

H-Index Requirement:

The objective is to determine the maximum value of i where there are at least i papers with i citations or more. Therefore, while incrementing the counts, verify if the total count (i.e., the quantity of papers with citations equal to or exceeding i) is greater than or equal to i.

Once this criterion is met, i represents your H-Index.

If there is no index where the count of papers with at least i citations matches or exceeds i, then the H-Index is considered as 0. This scenario occurs when none of the papers has the required number of citations to fulfill the criteria for an H-Index.

Complexity analysis:

Time Complexity:

O(n): The Counting Sort-Based Method iterates through the list of citations to fill up the count array initially, followed by another iteration through the count array to calculate the H-Index.

Filling up the count array requires a time complexity of O(n), with n representing the quantity of papers.

Calculating the totals to establish the H-Index also requires O(n) time as the count array's size is n + 1.

Thus, the total time complexity is O(n).

Space Complexity:

The space complexity for O(n) is primarily influenced by the count array, sized at n + 1, representing the quantity of papers. As there are no extra data structures employed, the space complexity persists as linear concerning the quantity of records.

Input Required

This code uses input(). Please provide values below:

Logic Practice
Install Logic Practice
Add to home screen for a faster app-like experience