Find Pattern In Infinite Binary Stream I In C++ - C++ Programming Tutorial
C++ Course / File Handling / Find Pattern In Infinite Binary Stream I In C++

Find Pattern In Infinite Binary Stream I In C++

BLUF: Mastering Find Pattern In Infinite Binary Stream I In C++ is a critical step in becoming a proficient C++ developer. This lesson provides a deep dive into the syntax, performance considerations, and real-world applications of this concept.
Key Performance Insight: Find Pattern In Infinite Binary Stream I In C++

C++ is renowned for its efficiency. Learn how Find Pattern In Infinite Binary Stream I In C++ enables low-level control and high-performance computing in the tutorial below.

Identifying a pattern within an endless binary stream is a key principle in computer science and data handling. This process requires locating a particular series of binary bits within an unlimited flow of binary information that could persist without an end.

In numerous practical scenarios, data streams in continuously, for instance, monitoring network traffic, analyzing sensor data, or processing financial market information. Detecting patterns or anomalies within the incoming data flow efficiently is essential in such cases.

The difficulty in this scenario is managing the endless flow of data while also detecting specific patterns within it. Due to the impracticality of storing or analyzing the entire infinite stream simultaneously, algorithms need to be created to operate gradually, handling data as it comes or in smaller portions.

Different methods can be utilized to tackle this issue. One popular strategy involves the application of sliding window algorithms. These algorithms analyze a set window size of the data stream sequentially, shifting the window across the stream to identify matches with the specified pattern. Additional techniques include the utilization of probabilistic data structures, hashing functions, or finite automata.

Effective algorithms for identifying patterns in endless data streams are crucial for instant data analysis, monitoring, and decision-making across various fields. They necessitate thorough evaluation of computational efficiency, memory utilization, and the capability to process ongoing data input seamlessly.

Approach-1: Sliding window

The Sliding Window Algorithm is a method used to effectively search for a particular pattern in a binary sequence. ```

include <iostream>

include <string>

include <vector>

using namespace std;

// Function to find occurrences of a pattern in a binary stream

vector<int> findPatternInBinaryStream(const string& binaryStream, const string& pattern) {

vector<int> occurrences; // Store indices of pattern occurrences

int streamSize = binaryStream.size; // Size of the binary stream

int patternSize = pattern.size; // Size of the pattern

// Loop through the stream

for (int i = 0; i <= streamSize - patternSize; ++i) {

bool found = true; // Flag to indicate if the pattern is found at the current index

// Check if the pattern matches from the current index

for (int j = 0; j < patternSize; ++j) {

if (binaryStream[i + j] != pattern[j]) {

found = false; // Pattern not found at current index

break;

}

}

// If the pattern is found, record the occurrence

if (found) {

occurrences.push_back(i);

}

}

return occurrences;

}

int main {

string binaryStream = "1011101110101010101110111010101010111010101010101"; // Binary stream

string pattern = "101"; // Pattern to search for

// Find occurrences of the pattern in the binary stream

vector<int> result = findPatternInBinaryStream(binaryStream, pattern);

// Output results

if (!result.empty) {

cout << "Pattern found at indices: ";

for (int index : result) {

cout << index << " ";

}

cout << endl;

} else {

cout << "Pattern not found in the binary stream." << endl;

}

return 0;

}

Example

include <iostream>

include <string>

include <vector>

using namespace std;

// Function to find occurrences of a pattern in a binary stream

vector<int> findPatternInBinaryStream(const string& binaryStream, const string& pattern) {

vector<int> occurrences; // Store indices of pattern occurrences

int streamSize = binaryStream.size; // Size of the binary stream

int patternSize = pattern.size; // Size of the pattern

// Loop through the stream

for (int i = 0; i <= streamSize - patternSize; ++i) {

bool found = true; // Flag to indicate if the pattern is found at the current index

// Check if the pattern matches from the current index

for (int j = 0; j < patternSize; ++j) {

if (binaryStream[i + j] != pattern[j]) {

found = false; // Pattern not found at current index

break;

}

}

// If the pattern is found, record the occurrence

if (found) {

occurrences.push_back(i);

}

}

return occurrences;

}

int main {

string binaryStream = "1011101110101010101110111010101010111010101010101"; // Binary stream

string pattern = "101"; // Pattern to search for

// Find occurrences of the pattern in the binary stream

vector<int> result = findPatternInBinaryStream(binaryStream, pattern);

// Output results

if (!result.empty) {

cout << "Pattern found at indices: ";

for (int index : result) {

cout << index << " ";

}

cout << endl;

} else {

cout << "Pattern not found in the binary stream." << endl;

}

return 0;

}

Example


Output:

Pattern found at indices: 0 4 8 10 12 14 16 20 24 26 28 30 32 36 38 40 42 44 46

Example


### Explanation:

- Header Inclusions:

The code includes necessary header files such as <iostream>, <string>, and <vector>. These headers are required for input/output operations , string manipulation, and vector container usage, respectively.

- Namespace Declaration:

The declaration "using namespace std;" grants access to elements within the standard namespace, enhancing code clarity by eliminating the requirement for explicit namespace specification.

- Definition of Function: findPatternInBinaryStream:

This function accepts two arguments: binaryStream, which denotes the binary stream to be searched, and pattern, which indicates the specific sequence to locate within the stream. It outputs a vector of integers (vector<int>) that holds the positions of each instance where the pattern is found in the stream.

The function loops through the binary stream, beginning at index 0 and continuing until streamSize - patternSize to thoroughly analyze the entire stream. During each iteration, it validates if there is a possible match with the pattern starting from the current position.

If a match is identified, the function appends the index of that occurrence to the occurrences vector. Ultimately, the function outputs the occurrences vector that holds the indices of all instances of the pattern within the binary stream.

- Primary Function:

The primary function acts as the starting point of the program. It sets up the binary stream and the target pattern for searching. The function then invokes the findPatternInBinaryStream function, passing the binary stream and pattern as inputs, and saves the outcome in the result vector. After analyzing the result, it displays either the positions of where the pattern appears or a notification stating that the pattern was not located within the stream.

- Process:

When the program is run, it scans the binary stream for instances of the specified pattern. If it locates one or multiple instances, it displays the indices where the pattern is present. In case the pattern is not detected within the stream, a notification is printed to indicate that the pattern was not located.

This script offers a simple implementation of the sliding window technique for pattern matching in binary streams. It effectively handles the stream data and detects instances of the designated pattern, showcasing a key method commonly employed in string matching and data analysis tasks.

### Complexity Analysis:

Time Complexity:

The time complexity of the sliding window technique is established by the quantity of comparisons required to verify the presence of the pattern within the binary stream.

Outer Loop:

The external iteration covers the full binary sequence, starting from index 0 up to streamSize - patternSize. This particular iteration occurs n-m+1 times, with n representing the binary stream's size and m representing the pattern's size. Consequently, the time complexity of this outer iteration is O(n-m+1), which simplifies to O(n) when m is significantly less than n.

Inner Loop:

The internal loop evaluates every character of the pattern against the corresponding characters within the current binary stream window. This loop cycles through the pattern, which is of a length m.

Therefore, the time complexity of the internal loop is O(m). By aggregating the time complexities of both the outer and inner loops, the total time complexity of the sliding window technique amounts to O(n⋅m).

This is due to the fact that, in the most unfavorable situation, every character of the pattern must be matched against every character within every window of the binary stream.

Space Complexity:

The space efficiency of the sliding window method is mainly influenced by the extra memory needed to record the occurrences of the pattern in the binary stream.

Occurrences Vector:

The vector for occurrences keeps track of where the pattern appears in the binary stream. If every index in the stream matches the pattern, then the vector's space requirement can be as large as O(n), with n representing the stream's size.

Additional Space:

Besides the frequency vector, the algorithm requires a fixed quantity of extra space for variables and temporary storage. Therefore, the additional spatial complexity remains at O(1).

## Approach-2: Finite Automaton Approach

The method of utilizing a Finite Automaton to detect patterns within an endless binary stream consists of creating a finite automaton that embodies the specific pattern intended for detection. During the process, the binary stream is handled in a step-by-step manner, with the automaton transitioning according to the binary digits being received. Upon the automaton reaching a state of acceptance, it indicates successful matching of the complete pattern, and the detection of the pattern is then noted.

Finite automata serve as computational constructs comprising states and transitions triggered by input symbols. When a finite automaton is built to portray a specific pattern, the pattern matching process gains predictability and effectiveness. This technique is especially apt for scenarios where the pattern remains constant and is pre-established.

Mapping the pattern to a collection of states and transitions is a fundamental aspect of building a finite automaton. Every state signifies a partial match of the pattern achieved at that point, while transitions between states symbolize potential expansions of the pattern depending on the subsequent binary digit in the sequence. Certain states are identified as accepting states, denoting successful matching of the complete pattern.

### Program:

include <iostream>

include <string>

include <unordered_map>

include <vector>

using namespace std;

class FiniteAutomaton {

private:

vector<vector<int>> transitions; // Transition table

vector<bool> isAccepting; // Indicates accepting states

int currentState; // Current state of the automaton

public:

FiniteAutomaton(const string& pattern) {

int m = pattern.size; // Size of the pattern

// Initialize transition table with -1 (no transition)

transitions.assign(m + 1, vector<int>(2, -1));

isAccepting.assign(m + 1, false);

// Construct the finite automaton

for (int i = 0; i < m; ++i) {

transitionsi - '0'] = i + 1; // Transition to next state on pattern digit

}

isAccepting[m] = true; // Last state is accepting

currentState = 0; // Start at initial state

}

// Process the next binary digit in the stream

void processNextDigit(char digit) {

int next = digit - '0';

if (transitionscurrentState != -1) {

currentState = transitionscurrentState; // Transition to next stage

if (isAccepting[currentState]) {

cout << "Pattern found at position " << currentState << endl;

}

} else {

currentState = 0; // Reset to initial state

}

}

};

void findPatternInBinaryStream(const string& binaryStream, const string& pattern) {

FiniteAutomaton automaton(pattern);

int n = binaryStream.size; // Size of the binary stream

for (int i = 0; i < n; ++i) {

automaton.processNextDigit(binaryStream[i]);

}

}

int main {

string binaryStream = "1001010110101"; // Example binary stream

string pattern = "101"; // Pattern to search for

findPatternInBinaryStream(binaryStream, pattern);

return 0;

}

Example


Output:

Pattern found at position 3

Pattern found at position 3

Example


### Explanation:

Finite Automaton Class:

- Creation:

The FiniteAutomaton class is responsible for modeling the finite automaton utilized in pattern matching tasks. It accepts the pattern as an input and generates the automaton structure according to the provided pattern.

- Transition Table:

The transitions of the automaton are recorded in a transition matrix, structured as a two-dimensional vector. Every row within the matrix corresponds to a specific state, while each column signifies a potential binary digit (0 or 1). The values within the transitions specify the subsequent state to move to, determined by the present state and input digit.

- States That Accept:

The automaton designates specific states as accepting states, signaling the completion of matching the pattern. The isAccepting array maintains a record of the accepting states.

- Proceed to the Next Digit:

The processNextDigit function retrieves the subsequent binary digit from the stream and modifies the state of the automaton accordingly. It advances to the subsequent state by considering the current state and the input digit, adhering to the transitions specified in the transition table.

If the automaton reaches a state of acceptance, it signifies that the complete pattern has been successfully matched, and the occurrence is duly noted.

Main Function:

- Pattern Matching:

Within the primary function, the binary data stream and the specific pattern to be located are defined. The findPatternInBinaryStream method is invoked to scan for instances of the pattern within the stream by employing the Finite Automaton technique.

Execution Specifics:

- Optimization:

The Finite Automaton technique offers effective pattern matching with a time complexity that scales linearly with the size of both the binary stream and the pattern. Each binary digit in the stream undergoes quick, constant-time operations, rendering this method ideal for handling extensive streams.

- Deterministic Execution:

The machine functions in a deterministic manner, ensuring that the switch from one state to another is solely dictated by the current state and the input digit. This predictable behavior guarantees efficient pattern matching without requiring any backtracking.

While processing the binary stream, the automaton changes states based on the binary digits it receives. Upon identifying a valid transition, the automaton progresses to the next state in sequence. Once it reaches an accepting state, the current position in the stream captures the pattern.

### Complexity Analysis:

Time Complexity:

Construction of Finite Automaton:

Building the finite automaton includes setting up the transition table and identifying the accepting states according to the given pattern. This procedure involves traversing through the pattern, leading to a time complexity of O(m), where m represents the size of the pattern.

Processing Binary Stream:

Analyzing every individual binary digit within the sequence requires a consistent amount of computational steps, mainly focused on referencing the transition table and modifying the current state. With the stream containing n digits, correlating to the stream's size, the computational complexity of handling the complete stream is O(n). By merging the creation of the finite automaton and the handling of the binary stream, the comprehensive computational complexity of the Finite Automaton strategy amounts to O(m+n).

This linear computational complexity ensures that the method remains effective when dealing with extensive binary streams and patterns.

Space Complexity:

Transition Table:

The transition table stores state transitions, with space needed based on the number of states and alphabet size (binary digits). In the most unfavorable scenario, there are m+1 states (where m represents pattern size), and each state includes transitions for both binary digits. Consequently, the space complexity for the transition table is O(2m) or O(m).

Accepting States:

The vector named isAccepting holds data regarding the states that are considered accepting states. With a total of m+1 states, the space complexity linked to this vector amounts to O(m).

Additional Space:

Alternative variables, like the present condition of the automaton, demand consistent space, resulting in an increased space complexity of O(1). By considering the space needed for the transition table, accepting states, and extra variables, the total space complexity of the Finite Automaton method amounts to O(m).

## Approach-3: Using Z Algorithm

The Z Algorithm is an efficient pattern matching algorithm that operates in linear time. It preprocesses the pattern to generate a "Z array" and subsequently performs fast searches for matches within the binary stream. This algorithm is well-suited for handling larger patterns and provides effective pattern matching capabilities with a time complexity of O(n+m), where n represents the stream's size and m indicates the pattern's size.

### Program:

include <iostream>

include <string>

include <vector>

using namespace std;

// Function to construct the Z array for a given pattern

vector<int> constructZArray(const string& pattern) {

int m = pattern.size;

vector<int> Z(m, 0); // Initialize Z array with zeros

int L = 0, R = 0; // Left and righcpp tutorialers for Z box

for (int i = 1; i < m; ++i) {

// Case 1: i is outside the current Z box

if (i > R) {

L = R = i;

while (R < m && pattern[R - L] == pattern[R])

R++;

Z[i] = R - L;

R--;

}

// Case 2: i is inside the current Z box

else {

int k = i - L;

// If the value does not stretch outside the Z box, use the previous value

if (Z[k] < R - i + 1)

Z[i] = Z[k];

// Otherwise, update Z array from R onwards

else {

L = i;

while (R < m && pattern[R - L] == pattern[R])

R++;

Z[i] = R - L;

R--;

}

}

}

return Z;

}

// Function to search for pattern matches in the binary stream using Z Algorithm

void searchPatternInBinaryStream(const string& binaryStream, const string& pattern) {

int n = binaryStream.size;

int m = pattern.size;

// Construct a Z array for the pattern

vector<int> Z = constructZArray(pattern);

for (int i = 0; i <= n - m; ++i) {

// Check if the substring matches the pattern

int j;

for (j = 0; j < m; ++j) {

if (binaryStream[i + j] != pattern[j])

break;

}

// If all characters match, print the index of the occurrence

if (j == m)

cout << "Pattern found at index " << i << endl;

}

}

int main {

string binaryStream = "1011001101001110110"; // Example binary stream

string pattern = "10110"; // Pattern to search for

searchPatternInBinaryStream(binaryStream, pattern);

return 0;

}

Example


Output:

Pattern found at index 0

Pattern found at index 14

Example


### Explanation:

- Z Array Construction:

The function constructZArray handles the task of creating the Z array associated with a specific pattern. The Z array, also known as the Z-function, holds information about the longest matching substrings that start at each position within the pattern, matching the prefix of the pattern. This process involves iterating through each position i within the pattern and calculating the length of the matching substring at that particular position.

If the index i exceeds the current limit of the Z box (i.e., i>R), it compares characters starting from i until a difference is found or the pattern's end is reached. The number of matching characters is saved in Z[i], and the boundaries of the Z box, denoted as L and R, are adjusted accordingly.

When i is situated within the existing Z container, it computes the appropriate value from the prior Z container. If the computed value does not surpass the right limit of the current Z box, it simply assigns that value to Z[i]. However, if it does exceed the boundary, it initiates character evaluations beginning from R until a disparity arises or the conclusion of the pattern is achieved, adjusting Z[i] as needed.

- Exploration for Correspondences in Binary Stream:

The searchPatternInBinaryStream function processes the Z array that has been built and searches through the binary stream to locate instances of the specified pattern. It loops through each position i within the stream, verifying if the substring beginning at i corresponds to the pattern through character comparisons. Upon discovering a match, it displays the index where the occurrence was found.

### Complexity Analysis:

Time Complexity:

Z Array Construction:

Creating the Z array requires looping through each position within the pattern and examining individual characters. This procedure has a time complexity of O(m), with 'm' representing the pattern's length.

Searching for Matches in Binary Stream:

Iterating through the binary stream to identify matches requires checking each position against the specified pattern. This operation is time-consuming, with a complexity of O(n⋅m), where n represents the stream's size and m denotes the pattern's size.

The eventual time complexity of the Z Algorithm amounts to O(m+n⋅m). However, due to the usual scenario where m is smaller than n for longer patterns, it is often streamlined to O(n+m).

Space Complexity:

Z Array:

The Z array necessitates memory that scales proportionally with the length of the pattern m. Hence, its spatial complexity is O(m).

Additional Space:

The extra memory utilized in the algorithm is insignificant in comparison to the dimensions of the pattern and the data stream. It primarily comprises of variables for looping and checking, necessitating a fixed amount of memory. Therefore, the spatial complexity is O(1).

Combining the storage needed for the Z array and extra variables, the total space complexity of the Z Algorithm amounts to O(m+1), which can be simplified to O(m).

Input Required

This code uses input(). Please provide values below:

Logic Practice
Install Logic Practice
Add to home screen for a faster app-like experience