In this guide, we will explore various methods in C++ to calculate the number of substrings where the occurrence of any single character is odd.
Character subsets or sequences that are consecutive within a string are known as substrings.
It is now essential to determine the quantity of substrings in this scenario that contain no more than a single odd character in their frequency. Let's explore the most efficient approach to address this scenario.
Let's try to comprehend this problem by examining a couple of examples.
Input:
s = "ksjssjkk"
Output:
Explanation: As the frequency of the characters in the given string is given below-
- k → 3
- s → 3
- j → 2
Substrings that include a maximum of one character and appear at odd intervals can now be
- Take each character: 'k', 's', 'j', 's', 's', 'j', 'k', 'k' = 8
- Take 2 letters at a time: 'ss', 'kk' = 2
- Take 3 letters at a time: 'sjs', 'jss', 'ssj', 'jkk' = 4
- Take 4 letters at a time: 'jssj' = 1
- Take 5 letters at a time: 'sjssj', 'jssjk', 'ssjkk' = 3
- Take 6 letters at a time: 'jssjkk' = 1
- Take 7 letters at a time: 'ksjssjk', 'sjssjkk' = 2
- Take 8 letters at a time: No string
Now, by summing the quantity of substrings, we obtain (8 + 2 + 4 + 1 + 3 + 1 + 2) = 21.
Input:
s = "ab"
Output:
In this scenario, you will extract two substrings: 'a' and 'b'.
Problem Statement:
Let's analyze the problem and pinpoint a viable solution. It's essential to locate substrings within the string where, on average, a single letter occurs an odd number of times, ensuring that there is only one letter with an odd frequency.
Solution-1: Brute Force Solution
This technique is straightforward to understand. In this method, we solely run loops and consistently check for any individual letter appearing an odd number of times. If found, the specific substring will be added to the final result.
Example:
#include <bits/stdc++.h>
using namespace std;
bool checkValid(string s){
int n = s.size();
int Freq[26] = {0};
int oddFreq = 0;
for (int i = 0; i < n; i++) {
Freq[s[i] - 'a']++;
}
for (int i = 0; i < 26; i++) {
if (Freq[i] % 2 != 0)
oddFreq++;
}
if (oddFreq <= 1) return true;
else return false;
}
int Helper(string s){
int n = s.size();
int output = 0;
for(int i = 0; i < n; i++) {
for (int j = i; j < n; j++) {
string S = s.substr(i, j - i + 1);
if (checkValid(S)) output++;
}
}
return output;
}
int main(){
string s = "ksjssjkk" ;
int output = Helper(s);
cout << "The number of substrings with the frequency of at most one character as Odd is: " << output;
return 0;
}
Output:
Complexity Analysis:
Time complexity:
The time complexity of the helper function is (n^3), where 'n' represents the length of the string. On the other hand, the checkValid function's execution time is denoted by O(n).
Space complexity:
O(1); The provided code does not involve any variables stored within a data structure.
Solution 2: Optimized Solution using Bit-masking
Bit-masking
Bitmasking is a technique employed to impose a mask on a value in order to retain, change, or adjust a specific part of the provided data. This mask serves to determine the bits that should be retained or discarded from a binary number. Through various bitwise operations, bitmasking can be utilized to filter a value and indicate the subsets of a set.
Approach:
We utilize a bitmask to determine the characters that have an odd frequency of appearance. A hashmap is employed to retain previously encountered bitmasks. Post each iteration, we increment hashmap[bitmask] to indicate our recognition of this specific bitmask. By adding m[mask] to the output, we sum up the substrings with an even count of characters used. For instances where a single character appears an odd number of times, the tally is incremented by adding m[mask^ (1\<j)] to the output.
Example:
#include <bits/stdc++.h>
using namespace std;
int Helper(string s){
unordered_map<int, int> m;
m[0] = 1;
int mask = 0;
int output = 0;
for (int i = 0; i < s.size(); ++i) {
mask ^= (1 << (s[i] - 'a'));
output += m[mask];
for (int j = 0; j < 26; ++j) {
output += m[mask ^ (1 << j)];
}
m[mask]++;
}
return output;
}
int main(){
string s = "ksjssjkk" ;
int output = Helper(s);
cout << "The number of substrings with the frequency of at most one character as Odd is: " << output;
return 0;
}
Output:
Complexity Analysis:
Time complexity:
The time complexity of this algorithm is O(n*26), where 'n' represents the length of the string. Each character in the string is compared against 26 characters for a check.
Space complexity:
O(1): The sole data structure employed was a map, requiring O(26) space, approximately equal to O(1) space complexity.
Conclusion:
Initially, we will utilize a basic technique involving loops to achieve the intended outcome. While this approach is straightforward to understand, it does come with the drawback of demanding a significant amount of processing time for execution. Alternatively, we can assess the algorithm's time complexity more efficiently by utilizing a distinct strategy referred to as bitmasking in conjunction with hashmaps. The utilization of bitmasking in this context was rather unconventional, resulting in a notable reduction of the time complexity from O(n^3) to O(n). This write-up delves into the fundamentals and practical implementation of bitmasking.