Skip to content

Instantly share code, notes, and snippets.

@sebbbi
sebbbi / FastUniformLoadWithWaveOps.txt
Last active January 11, 2026 06:16
Fast uniform load with wave ops (up to 64x speedup)
In shader programming, you often run into a problem where you want to iterate an array in memory over all pixels in a compute shader
group (tile). Tiled deferred lighting is the most common case. 8x8 tile loops over a light list culled for that tile.
Simplified HLSL code looks like this:
Buffer<float4> lightDatas;
Texture2D<uint2> lightStartCounts;
RWTexture2D<float4> output;
[numthreads(8, 8, 1)]
@nadavrot
nadavrot / Matrix.md
Last active December 12, 2025 16:19
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of

@sebbbi
sebbbi / fast_spheres.txt
Created February 18, 2018 19:31
Fast way to render lots of spheres
Setup:
1. Index buffer containing N quads (each 2 triangles), where N is the max amount of spheres. Repeating pattern of {0,1,2,1,3,2} + K*4.
2. No vertex buffer.
Render N*2 triangles, where N is the number of spheres you have.
Vertex shader:
1. Sphere index = N/4 (N = SV_VertexId)
2. Quad coord: Q = float2(N%2, (N%4)/2) * 2.0 - 1.0
3. Transform sphere center -> pos
@mbinna
mbinna / effective_modern_cmake.md
Last active January 9, 2026 18:10
Effective Modern CMake

Effective Modern CMake

Getting Started

For a brief user-level introduction to CMake, watch C++ Weekly, Episode 78, Intro to CMake by Jason Turner. LLVM’s CMake Primer provides a good high-level introduction to the CMake syntax. Go read it now.

After that, watch Mathieu Ropert’s CppCon 2017 talk Using Modern CMake Patterns to Enforce a Good Modular Design (slides). It provides a thorough explanation of what modern CMake is and why it is so much better than “old school” CMake. The modular design ideas in this talk are based on the book [Large-Scale C++ Software Design](https://www.amazon.de/Large-Scale-Soft

@bkaradzic
bkaradzic / orthodoxc++.md
Last active December 25, 2025 23:58
Orthodox C++

Orthodox C++

This article has been updated and is available here.

@dwilliamson
dwilliamson / Doc.md
Last active April 23, 2023 14:17
Minimal Code Generation for STL-like Containers

This is my little Christmas-break experiment trying to (among other things) reduce the amount of generated code for containers.

THIS CODE WILL CONTAIN BUGS AND IS ONLY PRESENTED AS AN EXAMPLE.

The C++ STL is still an undesirable library for many reasons I have extolled in the past. But it's also a good library. Demons lie in this here debate and I have no interest in revisiting it right now.

The goals that I have achieved with this approach are:

@patriciogonzalezvivo
patriciogonzalezvivo / GLSL-Noise.md
Last active January 9, 2026 17:15
GLSL Noise Algorithms

Please consider using http://lygia.xyz instead of copy/pasting this functions. It expand suport for voronoi, voronoise, fbm, noise, worley, noise, derivatives and much more, through simple file dependencies. Take a look to https://github.com/patriciogonzalezvivo/lygia/tree/main/generative

Generic 1,2,3 Noise

float rand(float n){return fract(sin(n) * 43758.5453123);}

float noise(float p){
	float fl = floor(p);
  float fc = fract(p);