Introduction:
The Cooley-Tukey Fast Fourier Transform (FFT) technique is a commonly employed and effective approach for calculating the discrete Fourier transform (DFT) of a series or array of complex numbers. Developed by J.W. Cooley and John Tukey in 1965, this algorithm has established itself as a key method in various signal processing and numerical analysis scenarios.
The fundamental concept of the Cooley-Tukey FFT algorithm involves breaking down a DFT of a composite size N = N1 * N2 into smaller DFTs of sizes N1 and N2 recursively, incorporating periodic twiddle factors. This iterative breakdown process persists until reaching the fundamental scenario of a DFT of size 2. The crucial realization is that by leveraging the periodic nature of twiddle factors and reusing calculations through recursion, the computational workload can be notably diminished.
Approach 1: Using Recursive Implementation:
Implement the Cooley-Tukey Fast Fourier Transform (FFT) method through a recursive approach, dividing the calculations into smaller tasks until reaching the base scenario. Utilizing complex numbers enables the representation of input and interim results, facilitating the management of both real and imaginary parts throughout the iterative procedure.
Program:
#include <iostream>
#include <complex>
#include <cmath>
#include <vector>
typedef std::complex<double> Complex;
void fft(std::vector<Complex>& a, bool invert) {
int n = a.size();
if (n <= 1) return;
// Divide the array into even and odd parts
std::vector<Complex> a0(n / 2), a1(n / 2);
for (int i = 0, j = 0; i < n; i += 2, ++j) {
a0[j] = a[i];
a1[j] = a[i + 1];
}
// Recursively compute FFT on even and odd parts
fft(a0, invert);
fft(a1, invert);
// Combine the results
double angle = (invert ? 2 : -2) * M_PI / n;
Complex w(1), wn(cos(angle), sin(angle));
for (int i = 0; i < n / 2; ++i) {
Complex t = w * a1[i];
a[i] = a0[i] + t;
a[i + n / 2] = a0[i] - t;
if (invert) {
a[i] /= 2;
a[i + n / 2] /= 2;
}
w *= wn;
}
}
// Wrapper function for convenience
void cooleyTukeyFFT(std::vector<Complex>& a) {
fft(a, false);
}
// Wrapper function for convenience
void inverseCooleyTukeyFFT(std::vector<Complex>& a) {
fft(a, true);
}
int main() {
// Example usage:
std::vector<Complex> input = {1, 2, 3, 4};
// Compute FFT
cooleyTukeyFFT(input);
// Output FFT result
for (const auto& val : input) {
std::cout << val << " ";
}
std::cout << std::endl;
// Compute inverse FFT
inverseCooleyTukeyFFT(input);
// Output inverse FFT result
for (const auto& val : input) {
std::cout << val << " ";
}
std::cout << std::endl;
return 0;
}
Output:
(10,0) (-2,2) (-2,0) (-2,-2)
(1,0) (2,5.72119e-18) (3,0) (4,-5.72119e-18)
Explanation:
Ensure to Add Required Headers:
The code begins by including necessary C++ standard library headers. These headers provide functionality for working with complex numbers (<complex>), mathematical operations (<cmath>), and input/output (<iostream>).
- Define Complex Number Type:
A typedef statement is used to define a complex number type (Complex) based on the std::complex<double> type from the <complex> header. This type represents numbers with both real and imaginary parts.
- FFT Function:
The fft function is created to execute the Cooley-Tukey Fast Fourier Transform in a recursive manner.
It requires a complex number vector (a) as input and a boolean variable (invert) to specify whether to calculate the forward FFT or its reverse counterpart.
The function partitions the input data into segments containing even and odd elements, performs Fast Fourier Transform (FFT) recursively on these segments, and merges the outcomes using periodic twiddle factors.
- Wrapper Functions:
Two enveloping functions, cooleyTukeyFFT and inverseCooleyTukeyFFT, offer a user-friendly way to calculate the Fast Fourier Transform (FFT) and its reverse counterpart. These functions invoke the fft function with the necessary arguments.
- Primary Function:
The primary function illustrates how to use the FFT functions on a sample vector (input).
It calculates the Fast Fourier Transform, displays the outcome, calculates the inverse Fast Fourier Transform, and shows the result of the inverse FFT.
- Displaying on the Console:
The program's output is presented through std::cout statements. The revised code incorporates the essential #include <iostream> directive to address compilation issues linked to std::cout usage.
- Sequence of Execution:
The process begins with importing essential libraries, specifying the data type for complex numbers, and defining various functions.
Within the primary function, an example vector is employed to showcase the computation of the Fast Fourier Transform (FFT). Both the FFT calculation and its corresponding inverse operation are executed, with the outcomes being displayed on the console.
Complexity analysis:
Time Complexity:
The Cooley-Tukey Fast Fourier Transform (FFT) algorithm achieves a time complexity of O(n log n) as a result of its recursive divide-and-conquer strategy.
At every recursion level, there are O(n) computations, with the recursion depth being logarithmic(n) due to the input being halved iteratively.
Space Complexity:
The algorithm's space complexity amounts to O(n log n) as a result of the recursion stack.
During every iteration, the algorithm needs extra memory for temporary arrays like a0 and a1, with the highest recursion level being log(n).
The overall space complexity is O(n log n).
Approach 2: Using Iterative Implementation
Implement the Cooley-Tukey Fast Fourier Transform (FFT) algorithm iteratively using a loop-based strategy. Break down the calculation into distinct stages, where each stage handles clusters of butterfly operations via pairwise calculations. Employ a loop to cycle through these stages, enhancing the algorithm for streamlined computation. This iterative approach offers a step-by-step advancement through FFT stages, enhancing cache performance and minimizing memory access complexities.
Program:
#include <iostream>
#include <complex>
#include <cmath>
#include <vector>
typedef std::complex<double> Complex;
void iterativeCooleyTukeyFFT(std::vector<Complex>& a) {
int n = a.size();
int levels = log2(n);
for (int s = 1; s <= levels; ++s) {
int m = 1 << s; // 2^s
int halfM = m / 2;
Complex wm = exp(Complex(0, -2.0 * M_PI / m));
for (int k = 0; k < n; k += m) {
Complex w = 1.0;
for (int j = 0; j < halfM; ++j) {
Complex t = w * a[k + j + halfM];
Complex u = a[k + j];
a[k + j] = u + t;
a[k + j + halfM] = u - t;
w *= wm;
}
}
}
}
int main() {
// Example usage:
std::vector<Complex> input = {1, 2, 3, 4};
// Compute FFT iteratively
iterativeCooleyTukeyFFT(input);
// Output FFT result
for (const auto& val : input) {
std::cout << val << " ";
}
std::cout << std::endl;
return 0;
}
Output:
(10,0) (-1,1) (-4,0) (-1,-1)
Explanation:
Make sure to incorporate essential headers:
The code begins by including necessary C++ standard library headers, such as <iostream> for input/output and <cmath> for mathematical functions.
- Define Complex Number Type:
A typedef statement is used to define a complex number type (Complex) based on the std::complex<double> type from the <complex> header. This type represents numbers with both real and imaginary parts.
- Iterative Cooley-Tukey FFT Function:
The iterativeCooleyTukeyFFT function executes the Cooley-Tukey Fast Fourier Transform in an iterative manner.
Iterating through various phases of the FFT algorithm is achieved by employing a loop. Each phase is responsible for managing clusters of butterflies, with the loop parameters dictating the dimensions and arrangement of these clusters.
- Loop Over Stages:
The outer iteration, governed by the variable s, loops through the various phases of the FFT procedure. The count of phases is established by calculating the base 2 logarithm of the input size.
- Iterate Through Butterfly Groups:
Within every phase, the internal loop, managed by the variable k, cycles through the clusters of butterflies. Every cluster signifies a collection of pairs of complex numbers that are subjected to calculations.
- Calculation of Twiddle Factors:
Within the nested loop, the procedure computes the twiddle factor (wm) and applies it to execute butterfly computations on pairs of complex numbers in each cluster.
- Displaying the Result on the Console:
The primary function showcases how the iterative Fast Fourier Transform (FFT) function is applied to a sample input vector called "input".
The Fast Fourier Transform outcome is subsequently displayed on the console using std::cout.
- Sequence of Operations:
The process begins with importing essential libraries, specifying the data type for complex numbers, and defining various functions.
Within the main function, an example vector is employed to showcase the iterative calculation of the Fast Fourier Transform (FFT). The resultant FFT output is then calculated and displayed on the console.
Complexity analysis:
Time Complexity:
The time efficiency of the Cooley-Tukey FFT algorithm implemented iteratively is on the order of O(n log n).
The procedure goes through log(n) phases, and during each phase, it executes O(n) tasks (butterfly computations).
Space Complexity:
The space complexity is O(1) auxiliary space.
The iterative method requires a fixed amount of extra space, independent of the input size. The main data structure that undergoes changes is the input array.
Approach 3: Using Bit-Reversal Permutation
Incorporating a bit-reversal permutation within the Cooley-Tukey FFT algorithm serves to reorganize the input data, thereby boosting cache locality and reducing memory access patterns. By rearranging elements to align with the memory structure, this method enhances data retrieval efficiency during FFT calculations. The utilization of bit-reversal permutation offers a practical approach to enhance memory access patterns, thus increasing the algorithm's compatibility with contemporary computer architectures.
Program:
#include <iostream>
#include <complex>
#include <cmath>
#include <vector>
typedef std::complex<double> Complex;
// Function to perform bit-reversal permutation
void bitReversePermutation(std::vector<Complex>& a) {
int n = a.size();
int bits = log2(n);
for (int i = 0; i < n; ++i) {
int reversed = 0;
for (int j = 0; j < bits; ++j) {
reversed |= ((i >> j) & 1) << (bits - 1 - j);
}
if (reversed > i) {
std::swap(a[i], a[reversed]);
}
}
}
// Iterative Cooley-Tukey FFT with bit-reversal permutation
void iterativeCooleyTukeyFFT(std::vector<Complex>& a) {
bitReversePermutation(a);
int n = a.size();
int levels = log2(n);
for (int s = 1; s <= levels; ++s) {
int m = 1 << s; // 2^s
int halfM = m / 2;
Complex wm = exp(Complex(0, -2.0 * M_PI / m));
for (int k = 0; k < n; k += m) {
Complex w = 1.0;
for (int j = 0; j < halfM; ++j) {
Complex t = w * a[k + j + halfM];
Complex u = a[k + j];
a[k + j] = u + t;
a[k + j + halfM] = u - t;
w *= wm;
}
}
}
}
int main() {
// Example usage:
std::vector<Complex> input = {1, 2, 3, 4, 5, 6, 7, 8};
// Perform bit-reversal permutation
bitReversePermutation(input);
// Compute FFT iteratively with bit-reversed input
iterativeCooleyTukeyFFT(input);
// Output FFT result
for (const auto& val : input) {
std::cout << val << " ";
}
std::cout << std::endl;
return 0;
}
Output:
(36,0) (-1,2.41421) (-4,4) (-1,0.414214) (-16,0) (-1,-0.414214) (-4,-4) (-1,-2.41421)
Explanation:
- In Bit-Reversal Permutation:
The function bitReversePermutation reorganizes the provided data by employing a bit-reversal permutation technique.
It loops over the input array, exchanging elements according to their bit-reversed positions, and guarantees that exchanges occur just once to prevent duplication.
- Iterative Cooley-Tukey Fast Fourier Transform with Bit-Reversal:
The iterativeCooleyTukeyFFT function merges the bit-reversal reordering and the Cooley-Tukey Fast Fourier Transform (FFT) technique.
It initially employs the bit-reversal permutation on the input array utilizing the bitReversePermutation function.
Next, it sequentially applies the Cooley-Tukey Fast Fourier Transform (FFT) method to the input data that has been reversed in bits.
- Main Task:
In the primary function, a demonstration vector (input) is given as an illustrative example.
The function bitReversePermutation is invoked to reorganize the input information in a reversed order based on bits.
The iterativeCooleyTukeyFFT function is employed to calculate the Fast Fourier Transform (FFT) on the data that has been bit-reversed.
Finally, the outcome is displayed on the console.
- Sequence of Execution:
The process begins with executing a bit-reversal permutation on the provided input data.
Following this step, the iterative Cooley-Tukey Fast Fourier Transform (FFT) technique is then implemented on the input that has been reversed in terms of its bits.
The outcome is displayed on the console.
- Significance of Bit-Reversal:
The bit-reversal permutation is utilized to optimize cache performance and minimize memory access patterns while executing the FFT algorithm.
It reorganizes the input information to match the memory layout, which can lead to enhanced performance.
- Improving Cache Locality:
Integrating bit-reversal functionality with Fast Fourier Transform (FFT) calculations is designed to enhance the efficiency of cache usage by improving memory access patterns, thereby leading to more effective processing.
Complexity analysis:
Time Complexity:
Bit-Reversal Permutation (bitReversePermutation):
The computational efficiency of the bit-reversal permutation is O(n log n), with n representing the dimension of the input array.
The function loops over every item in the array, executing a bitwise reversal operation on each item, with a time complexity of O(log n).
Iterative Cooley-Tukey Fast Fourier Transform (iterativeCooleyTukeyFFT):
The computational efficiency of the Cooley-Tukey Fast Fourier Transform (FFT) algorithm is classified as O(n log n).
It consists of log(n) phases, with each one performing n/2 tasks (butterfly calculations).
Main Function:
Executing the primary function requires invoking both the permutation for bit-reversal and the iterative Cooley-Tukey Fast Fourier Transform (FFT).
The primary function's time complexity is mainly influenced by the Cooley-Tukey Fast Fourier Transform, resulting in a time complexity of O(n log n).
Space Complexity:
Bit-Reversal Permutation (bitReversePermutation):
The bit-reversal permutation operates within the same memory space, altering the original input vector directly.
The space complexity remains O(1) as it only requires a fixed amount of extra space.
Iterative Cooley-Tukey Fast Fourier Transform (iterativeCooleyTukeyFFT):
The auxiliary space complexity of the iterative Cooley-Tukey Fast Fourier Transform is O(1).
It alters the input array directly without the need for extra space that grows proportionally with the input size.
Main Function:
The primary function has a space complexity of O(1) in total, since it does not require substantial extra space beyond the input vector.
Approach 4: Parallelization using Multi-threading
Implementing parallelism in the Cooley-Tukey FFT algorithm involves distributing the computation across multiple threads through a multi-threading strategy. This technique divides the workload into concurrent tasks, enabling simultaneous execution on distinct processing units. Widely used libraries such as OpenMP or std::thread can be utilized to create parallel algorithms, improving the overall efficiency of the process.
Program:
#include <iostream>
#include <complex>
#include <cmath>
#include <vector>
#include <omp.h>
typedef std::complex<double> Complex;
// Recursive FFT function (same as before)
void fft(std::vector<Complex>& a, bool invert) {
int n = a.size();
if (n <= 1) return;
std::vector<Complex> a0(n / 2), a1(n / 2);
for (int i = 0, j = 0; i < n; i += 2, ++j) {
a0[j] = a[i];
a1[j] = a[i + 1];
}
fft(a0, invert);
fft(a1, invert);
double angle = (invert ? 2 : -2) * M_PI / n;
Complex w(1), wn(cos(angle), sin(angle));
for (int i = 0; i < n / 2; ++i) {
Complex t = w * a1[i];
a[i] = a0[i] + t;
a[i + n / 2] = a0[i] - t;
if (invert) {
a[i] /= 2;
a[i + n / 2] /= 2;
}
w *= wn;
}
}
// Parallel FFT function using OpenMP
void parallelCooleyTukeyFFT(std::vector<Complex>& a) {
#pragma omp parallel
{
#pragma omp single nowait
fft(a, false);
}
}
int main() {
// Example usage:
std::vector<Complex> input = {1, 2, 3, 4};
// Compute FFT in parallel
parallelCooleyTukeyFFT(input);
// Output FFT result
for (const auto& val : input) {
std::cout << val << " ";
}
std::cout << std::endl;
return 0;
}
Output:
(10,0) (-2,2) (-2,0) (-2,-2)
Explanation:
The recursive FFT (Fast Fourier Transform) function (fft):
The fft function represents a recursive version of the Cooley-Tukey Fast Fourier Transform (FFT) algorithm.
It requires an array of imaginary numbers (a) and a boolean variable (invert) to determine the FFT direction (forward or reverse).
The function divides the FFT calculation recursively into smaller tasks until it reaches the fundamental case.
- Concurrent Fast Fourier Transform Function (parallelCooleyTukeyFFT):
The parallelCooleyTukeyFFT function is an implementation of the Cooley-Tukey FFT algorithm that has been parallelized with OpenMP directives.
It employs #pragma omp parallel to form a group of threads for concurrent processing.
The #pragma omp single nowait directive guarantees that the recursive Fast Fourier Transform (FFT) is carried out by a solitary thread without delay, allowing the remaining threads to proceed with their tasks.
The nowait clause enables threads to progress autonomously without being dependent on the completion of a single thread.
- Primary Function:
In the primary function, a demonstration vector (input) is given as an instance input.
The parallelCooleyTukeyFFT function is invoked to calculate the Fast Fourier Transform (FFT) concurrently with OpenMP.
- Sequence of Operations:
The process begins with importing essential libraries and declaring the data type for complex numbers.
Within the primary function, a demonstration vector is employed, triggering the execution of the parallel Fast Fourier Transform (FFT) function.
The parallel FFT function leverages multiple threads to execute the FFT calculation simultaneously.
Complexity analysis:
Time Complexity:
Recursive FFT Function (fft):
The time complexity of the recursive Fast Fourier Transform (FFT) algorithm is O(n log n), where n represents the dimension of the input array. This complexity remains consistent with the unparallelized implementation.
The Fast Fourier Transform (FFT) algorithm consists of logarithmic(n) phases, with each phase requiring n/2 computations (known as butterfly computations).
Parallel FFT Function (parallelCooleyTukeyFFT):
The time complexity of the parallel Fast Fourier Transform (FFT) function remains O(n log n).
While parallelization can allocate tasks among various threads, the core time complexity of the FFT algorithm stays consistent.
Main Function:
The program's time complexity is O(n log n) primarily due to the concurrent Fast Fourier Transform (FFT) computation.
Space Complexity:
Recursive FFT Function (fft):
The space efficiency of the recursive Fast Fourier Transform (FFT) function amounts to O(log n) as a result of the recursion stack.
At every recursive level, new arrays (a0 and a1) are generated, which are subsequently recycled, leading to the dominance of the recursion stack in terms of space complexity.
Parallel FFT Function (parallelCooleyTukeyFFT):
The parallel Fast Fourier Transform function maintains an equivalent space complexity to the non-parallelized iteration.
Each thread functions on an individual segment of the input array, while they all access the common memory area, maintaining an O(log n) space complexity as a result of recursive operations.
Main Function:
The program's space complexity is O(log n) primarily because of the recursion stack utilized in the FFT function.