This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| #include <benchmark/benchmark.h> | |
| #include <cstdlib> | |
| #include <new> | |
| #include <cstddef> | |
| #include <vector> | |
| #include <stdexcept> | |
| #include <cassert> | |
| #include <algorithm> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works to Compute Prefix Sum (Scan) Using OpenMP | |
| Prefix Sum (Scan) Definition: | |
| Given an array: | |
| A = [a0, a1, a2, a3, ...] | |
| Prefix Sum array P is: | |
| P[i] = a0 + a1 + a2 + ... + ai | |
| Unlike simple sum/count, prefix sum has DATA DEPENDENCIES: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works for Merge Sort Using OpenMP | |
| Merge Sort follows the DIVIDE-AND-CONQUER paradigm: | |
| 1. Divide the array into two halves | |
| 2. Recursively sort the left half | |
| 3. Recursively sort the right half | |
| 4. Merge the two sorted halves |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works for Matrix Multiplication Using OpenMP | |
| Problem: | |
| Given two matrices: | |
| A (N × N) and B (N × N), | |
| compute matrix C = A × B. | |
| Each element: | |
| C[i][j] = Σ (A[i][k] × B[k][j]) for k = 0 to N-1 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works to Count Occurrences Using OpenMP (Linear Search) | |
| OpenMP internally performs the following steps to parallelize the counting process: | |
| 1. Creates a team of threads (fork) | |
| Suppose the system has 8 cores → OpenMP creates threads: | |
| T0, T1, T2, T3, T4, T5, T6, T7 | |
| 2. Divides loop iterations among threads (work-sharing) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works for Breadth-First Search (BFS) | |
| (Level-by-Level Parallel BFS Using OpenMP) | |
| Problem: | |
| Given an unweighted graph and a source vertex, | |
| perform Breadth-First Search (BFS) and compute the | |
| distance (level) of each vertex from the source. | |
| --------------------------------------------------- |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works for All-Pairs Distance Computation | |
| (using Floyd–Warshall Algorithm with OpenMP) | |
| Problem: | |
| Given a weighted graph with V vertices, compute the shortest distance | |
| between every pair of vertices. | |
| We use the Floyd–Warshall algorithm: | |
| for k = 0 to V-1 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /* | |
| How Parallel Computing Works to Calculate the Sum of an Array Using OpenMP | |
| OpenMP internally does 5 things to achieve parallelism in the sum calculation: | |
| 1. Creates a team of threads (fork) | |
| If the machine has 8 cores, OpenMP might create 8 threads: | |
| T0, T1, T2, T3, T4, T5, T6, T7 | |
| This is the "fork" part of the fork-join model. |