
Tanmoy Giri iamtgiri

iamtgiri / cpp-custom-allocators.cpp
Created February 21, 2026 20:10
A small C++ experiment comparing malloc/free with pool, arena, and hybrid custom allocators, focusing on performance when allocating and deallocating large numbers of small objects.
#include <benchmark/benchmark.h>
#include <cstdlib>
#include <new>
#include <cstddef>
#include <vector>
#include <stdexcept>
#include <cassert>
#include <algorithm>
iamtgiri / parallel_prefix_sum.cpp
Created January 4, 2026 06:38
Sequential vs Parallel Computing to Compute Prefix Sum (Scan) Using OpenMP
/*
How Parallel Computing Works to Compute Prefix Sum (Scan) Using OpenMP
Prefix Sum (Scan) Definition:
Given an array:
A = [a0, a1, a2, a3, ...]
Prefix Sum array P is:
P[i] = a0 + a1 + a2 + ... + ai
Unlike simple sum/count, prefix sum has DATA DEPENDENCIES:
iamtgiri / parallel_merge_sort.cpp
Created January 4, 2026 06:37
Sequential vs Parallel Computing for Merge Sort Using OpenMP
/*
How Parallel Computing Works for Merge Sort Using OpenMP
Merge Sort follows the DIVIDE-AND-CONQUER paradigm:
1. Divide the array into two halves
2. Recursively sort the left half
3. Recursively sort the right half
4. Merge the two sorted halves
iamtgiri / parallel_matmul.cpp
Created January 4, 2026 06:36
Sequential vs Parallel Computing for Matrix Multiplication Using OpenMP
/*
How Parallel Computing Works for Matrix Multiplication Using OpenMP
Problem:
Given two matrices:
A (N × N) and B (N × N),
compute matrix C = A × B.
Each element:
C[i][j] = Σ (A[i][k] × B[k][j]) for k = 0 to N-1
iamtgiri / parallel_count.cpp
Created January 4, 2026 06:35
Sequential vs Parallel Computing to Count Occurrences Using OpenMP (Linear Search)
/*
How Parallel Computing Works to Count Occurrences Using OpenMP (Linear Search)
OpenMP internally performs the following steps to parallelize the counting process:
1. Creates a team of threads (fork)
Suppose the system has 8 cores → OpenMP creates threads:
T0, T1, T2, T3, T4, T5, T6, T7
2. Divides loop iterations among threads (work-sharing)
iamtgiri / parallel_bfs.cpp
Created January 4, 2026 06:34
Sequential vs Parallel Computing for Breadth-First Search (BFS)
/*
How Parallel Computing Works for Breadth-First Search (BFS)
(Level-by-Level Parallel BFS Using OpenMP)
Problem:
Given an unweighted graph and a source vertex,
perform Breadth-First Search (BFS) and compute the
distance (level) of each vertex from the source.
---------------------------------------------------
iamtgiri / parallel_apsp.cpp
Created January 4, 2026 06:31
Sequential and Parallel Computing for All-Pairs Distance Computation
/*
How Parallel Computing Works for All-Pairs Distance Computation
(using Floyd–Warshall Algorithm with OpenMP)
Problem:
Given a weighted graph with V vertices, compute the shortest distance
between every pair of vertices.
We use the Floyd–Warshall algorithm:
for k = 0 to V-1
iamtgiri / parallel_sum.cpp
Created December 11, 2025 09:14
Parallel vs Sequential Array Sum Using OpenMP in C++
/*
How Parallel Computing Works to Calculate the Sum of an Array Using OpenMP
OpenMP internally does 5 things to achieve parallelism in the sum calculation:
1. Creates a team of threads (fork)
If the machine has 8 cores, OpenMP might create 8 threads:
T0, T1, T2, T3, T4, T5, T6, T7
This is the "fork" part of the fork-join model.