The Bitap Algorithm , known as the Shift-Or Algorithm , is a string-searching algorithm that efficiently performs approximate string matching. It is particularly useful for finding a pattern within a text when there might be errors or variations in the pattern. The bitmap algorithm was introduced by John Colquhoun and Gaston Gonnet in 1980 .
Example:
Let us take a C++ Program to Implement the Bitap Algorithm for String Matching.
#include <string>
#include <map>
#include <iostream>
using namespace std;
int bitmap_search(string t, string p) {
int m = p.length();
long p_mask[300];
long A = ~1;
if (m == 0)
return -1;
if (m >63) {
cout<<"Pattern is too long!";//if pattern is too long
return -1;
}
for (int i = 0; i <= 299; ++i)
p_mask[i] = ~0;
for (int i = 0; i < m; ++i)
p_mask[p[i]] &= ~(1L << i);
for (int i = 0; i < t.length(); ++i) {
A |= p_mask[t[i]];
A <<= 1;
if ((A & (1L << m)) == 0)
return i - m + 1;
}
return -1;
}
void findPattern(string t, string p) {
int position = bitmap_search(t, p);//initialize the position with the function bitmap_search
if (position == -1)
cout << "\nNo Match\n";
else
cout << "\nPattern found at position : " << position;
}
int main(int argc, char **argv) {
cout << "Enter Text:\n";
string t;
cin >>t;
cout << "Enter Pattern:\n";
string p;
cin >>p;
findPattern(t, p);
}
Output:
Explanation:
This program implements the Bitap Algorithm for String Matching. The variables present in the program are 't', which represents the input text, 'p' is the input pattern, 'm' represents the length of the pattern, 'p_mask' represents an array of bit masks for each character in the pattern, 'A' is a variable used for bitwise operations, 'position' represents the position where the pattern is found in the text.
The function 'bitmapsearch' takes the string 't' and pattern 'p' and returns the position of the pattern in the text or -1 if not found. This function uses bit manipulation to search for the pattern in the text. It creates bit masks for the character in the pattern and then performs bitwise operations to identify the position of the pattern in the text. Another function named the find pattern function takes the string 't' as input text and string 'p' as the pattern and displays whether the pattern is found and its position. This function calls the bitmapsearch function and prints the result. If the pattern is found, it displays the position otherwise and prints 'No Match' .
The main function takes two strings, representing the user's input string and pattern. After that, calls findpattern function to search for the pattern and display the result.
Example 2:
Let us take another way to implement Bitap Algorithm for String Matching in C++.
#include <iostream>
#include <bitset>
#include <vector>
using namespace std;
const int ALPHABET_SIZE = 256;
class Bitap {
public:
Bitap(const string& pattern) : pattern_(pattern), patternLength_(pattern.length()) {
preprocess();
}
void search(const string& text) {
int textLength = text.length();
bitset<ALPHABET_SIZE> R(1);
for (int i = 0; i < textLength; ++i) {
R = ((R << 1) & textMasks_[text[i]]) | bitset<ALPHABET_SIZE>(1);
if (R.test(patternLength_ - 1)) {
cout << "Pattern found at index " << i - patternLength_ + 1 << endl;
}
}
}
private:
void preprocess() {
textMasks_.resize(ALPHABET_SIZE, 0);
for (int i = 0; i < patternLength_; ++i) {
textMasks_[pattern_[i]] |= bitset<ALPHABET_SIZE>(1) << i;
}
}
string pattern_;
int patternLength_;
vector<bitset<ALPHABET_SIZE>> textMasks_;
};
int main() {
string text, pattern;
cout << "Enter the text: ";
getline(cin, text);
cout << "Enter the pattern: ";
getline(cin, pattern);
Bitap bitap(pattern);
bitap.search(text);
return 0;
}
Output:
Explanation:
This program implements the Bitap Algorithm for string matching. It efficiently searches for a pattern within a text, providing approximate string-matching capabilities. This program has a Bitap class containing a constructor that accepts the patterns and calculates their length. The functions present in the class are the Preprocess function and Search functions. Preprocess function creates a bit masks for each character in the pattern, stored in the textMasks_vector . The Search function uses bitwise operations to find the pattern in the text and prints the positions where the pattern is found. The main function takes the input for the text and pattern from the user and displays the positions where the pattern is found in the text. This main function will create an instance or object of the Bitmap class with the provided pattern, ask the user for input, and call the Search function of the Bitap class to find and print the positions where the pattern is found in the text.