Go ↔ C++ Interop with cgo: A Progressive Course

Audience: Go engineers who want to call C++ safely and efficiently.

Outcome: You’ll be able to wrap real C++ libraries behind C ABIs, build and link them with Go, ship cross-platform binaries, and avoid common interop footguns.

Time: 20–30 hours with labs.

Prerequisites

  • Solid Go proficiency (goroutines, memory model, error handling)
  • Basic C knowledge (pointers, structs, headers)
  • No C++ expertise required (introduced as needed)

Toolchain Checklist

  • Go 1.21+ (for runtime.Pinner)
  • A C++17-capable compiler
    • Linux: gcc/g++ or clang/clang++
    • macOS: Xcode Command Line Tools or Homebrew llvm
    • Windows: MinGW-w64 (simplest for cgo) or MSVC (advanced)
  • gdb/lldb, Valgrind (Linux; macOS support lags behind current releases), perf (Linux), Instruments (macOS)

Course Layout

  • Lesson 1: Why Interop? The Big Picture
  • Lesson 2: cgo Fundamentals & C Interop Primer
  • Lesson 3: C++ from Go: The C Wrapper Pattern
  • Lesson 4: Building & Linking C++ Code
  • Lesson 5: Memory & Safety Deep Dive
  • Lesson 6: Advanced Patterns & Optimization
  • Lesson 7: Debugging & Profiling
  • Final Project: Build and ship a production-grade Go package over a C++ library (with full code)

Lesson 1: Why Interop? The Big Picture

Goal: Understand when to use Go/C++ interop and what you’re trading.

Key concepts:

  • Performance-critical subsystems (DSP, image/video, physics, crypto)
  • Tapping mature/legacy C++ libraries (OpenCV, Eigen, Box2D)
  • Tradeoffs
    • Pros: performance, ecosystem leverage, reuse domain code
    • Cons: build complexity, safety pitfalls, harder debugging, portability friction
  • Alternatives
    • Rewrite in Go (safer, simpler; often slower)
    • Expose C++ via a local service (gRPC/REST) to decouple
    • Bind via a C shim (recommended) vs. direct C++ ABI (not supported)

Lab (discussion + design):

  • Pick a Go project idea that needs fast math (e.g., vector math, matrix ops).
  • Identify 2–3 candidate C++ libs (e.g., Eigen for linear algebra, a small physics lib).
  • For each: assess
    • License compatibility
    • Build complexity (dependencies, toolchains)
    • Thread-safety claims
    • Minimal surface area to expose

Safety checklist:

  • Keep the boundary small. Wrap only what you must.
  • Prefer simple POD types (ints, floats, plain structs) across the boundary.
  • Assume C++ isn’t thread-safe unless docs scream otherwise.

Lesson 2: cgo Fundamentals & C Interop Primer

Goal: Bridge Go and C (prerequisite for C++).

Key concepts:

  • import "C" and inline preamble
  • Data mapping
    • Go int, float64 ↔ C.int, C.double (be explicit)
    • Strings: C.CString/C.free, C.GoString
    • Buffers: C.CBytes, C.GoBytes
  • Ownership
    • Go memory is GC-managed; C/C++ memory is not—free your stuff!
  • Rule of pointers
    • Don’t pass pointers to Go memory into C if C will store them or use them after the call returns.
    • Passing Go memory temporarily for in-call use is OK if it doesn’t contain Go pointers and C won’t retain it.

Lab 2A: Call C sqrt from Go

Files:

  • math_c.h, math_c.c
  • main.go

Starter code:

// math_c.h
#ifndef MATH_C_H
#define MATH_C_H
double c_sqrt(double x);
#endif
// math_c.c
#include <math.h>
#include "math_c.h"
double c_sqrt(double x) { return sqrt(x); }
// main.go
package main

/*
#cgo CFLAGS: -O2
#cgo LDFLAGS: -lm
#include "math_c.h"
*/
import "C"
import "fmt"

func main() {
    x := 9.0
    y := C.c_sqrt(C.double(x))
    fmt.Println("sqrt(9) =", float64(y))
}

Lab 2B: Pass Go slices to C without retaining

  • Create a C function that scales a float array in place.
  • In Go, pass a []float32 to C only for the duration of the call.
// scale.h
#ifndef SCALE_H
#define SCALE_H
#include <stddef.h>
void scale_in_place(float *buf, size_t n, float s);
#endif
// scale.c
#include "scale.h"
void scale_in_place(float *buf, size_t n, float s) {
    for (size_t i = 0; i < n; i++) buf[i] *= s;
}
// scale.go
package main

/*
#cgo CFLAGS: -O2
#include "scale.h"
*/
import "C"
import (
    "fmt"
    "unsafe"
)

func main() {
    data := []float32{1, 2, 3, 4}
    // Safe: C uses the pointer only during the call and the element type has no Go pointers.
    C.scale_in_place((*C.float)(unsafe.Pointer(&data[0])), C.size_t(len(data)), C.float(2))
    fmt.Println(data) // [2 4 6 8]
}

Safety notes:

  • Never let C store a Go pointer for later use.
  • Strings: free C strings created with C.CString using C.free.

Lesson 3: C++ from Go: The C Wrapper Pattern

Goal: Safely expose C++ to Go via a stable C ABI.

Why: cgo only knows the C ABI. C++ name mangling, exceptions, templates → not ABI-stable.

Core pattern:

  • C++ class encapsulated behind opaque handle (void*)
  • extern "C" wrapper functions present a C-compatible API
  • Lifecycle is explicit in the wrapper (New/Delete)
  • Exceptions are caught in C++ and converted to status codes

Example: Wrap a simple C++ Vector3D

// vector3d.hpp (C++)
#pragma once
#include <cmath>
struct Vector3D {
    double x, y, z;
    Vector3D(double x, double y, double z): x(x), y(y), z(z) {}
    double length() const { return std::sqrt(x*x + y*y + z*z); }
    void add(const Vector3D& o) { x += o.x; y += o.y; z += o.z; }
};
// vec_wrapper.h (C ABI)
#pragma once
#ifdef __cplusplus
extern "C" {
#endif

typedef void* VecHandle;

VecHandle Vec_New(double x, double y, double z);
void      Vec_Delete(VecHandle v);
double    Vec_Length(VecHandle v);
void      Vec_Add(VecHandle v, VecHandle other);

#ifdef __cplusplus
}
#endif
// vec_wrapper.cpp
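#include "vec_wrapper.h"
#include "vector3d.hpp"
#include <new> // std::nothrow

extern "C" {
    VecHandle Vec_New(double x, double y, double z) {
        // nothrow: a failed allocation yields nullptr instead of throwing
        // std::bad_alloc across the C boundary.
        return new (std::nothrow) Vector3D(x, y, z);
    }
    void Vec_Delete(VecHandle v) {
        delete static_cast<Vector3D*>(v); // deleting nullptr is a no-op
    }
    double Vec_Length(VecHandle v) {
        if (!v) return 0.0; // closed or never-created handle
        return static_cast<Vector3D*>(v)->length();
    }
    void Vec_Add(VecHandle v, VecHandle other) {
        if (!v || !other) return;
        static_cast<Vector3D*>(v)->add(*static_cast<Vector3D*>(other));
    }
}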

Go wrapper:

// vec.go
package vec

/*
#cgo CXXFLAGS: -std=c++17 -O2 -fPIC
#cgo linux LDFLAGS: -lstdc++
#cgo darwin LDFLAGS: -lc++
#cgo windows LDFLAGS: -lstdc++
#include "vec_wrapper.h"
*/
import "C"
import (
    "runtime"
)

type Vec struct{ h C.VecHandle }

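func New(x, y, z float64) *Vec {
    v := &Vec{h: C.Vec_New(C.double(x), C.double(y), C.double(z))}
    // Backstop only; callers should still call Close.
    runtime.SetFinalizer(v, func(vv *Vec) { vv.Close() })
    return v
}

// Close deletes the C++ object. It is safe to call more than once.
func (v *Vec) Close() {
    if v.h != nil {
        C.Vec_Delete(v.h)
        v.h = nil
        runtime.SetFinalizer(v, nil)
    }
}

// runtime.KeepAlive stops the finalizer from deleting the C++ object while a
// call that uses its handle is still running.
func (v *Vec) Length() float64 {
    l := float64(C.Vec_Length(v.h))
    runtime.KeepAlive(v)
    return l
}

func (v *Vec) Add(o *Vec) {
    C.Vec_Add(v.h, o.h)
    runtime.KeepAlive(v)
    runtime.KeepAlive(o)
}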

Lab:

  • Extend Vector3D with dot and cross functions; expose via wrapper; call from Go; add tests.

Safety tips:

  • Always delete heap-allocated C++ objects; use Go finalizers as a backstop, but offer Close for explicit control.
  • Never let C++ exceptions cross the C boundary.
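
The "finalizer as a backstop, Close for control" advice can be sketched without any C at all. In the example below, Handle and its free callback are hypothetical stand-ins: in a real wrapper, free would call something like C.Vec_Delete on the stored handle.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Handle stands in for a Go type that owns a C++ object. The free callback is
// a placeholder for the real delete call in the C wrapper.
type Handle struct {
	once sync.Once
	free func()
}

// NewHandle registers a finalizer as a safety net for callers that forget Close.
func NewHandle(free func()) *Handle {
	h := &Handle{free: free}
	// Backstop only: finalizers run at an unspecified time, and may not run
	// at all before the program exits.
	runtime.SetFinalizer(h, func(h *Handle) { h.Close() })
	return h
}

// Close releases the native object exactly once, however often it is called.
func (h *Handle) Close() {
	h.once.Do(func() {
		runtime.SetFinalizer(h, nil) // nothing left for the finalizer to do
		h.free()
	})
}

func main() {
	freed := 0
	h := NewHandle(func() { freed++ })
	h.Close()
	h.Close() // second call is a no-op
	fmt.Println("free calls:", freed) // free calls: 1
}
```

Using sync.Once also makes Close safe when two goroutines race to call it, which a plain nil check on the handle does not.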

Lesson 4: Building & Linking C++ Code

Goal: Compile and link C++ with Go on all platforms.

Key concepts:

  • Build modes
    • Direct compile: Place .cc/.cpp/.c files in the Go package; go build compiles and links them automatically.
    • Prebuilt static (.a) or shared (.so/.dylib/.dll) libraries; link with #cgo LDFLAGS.
  • #cgo directives
    • CXXFLAGS/CFLAGS: language standard, include paths, optimization
    • LDFLAGS: library search paths (-L...), libraries (-lmycpp), C++ stdlib (-lstdc++ or -lc++)
  • Platform differences
    • Linux: -lstdc++
    • macOS: -lc++
    • Windows (MinGW): -lstdc++

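Example (linking a prebuilt library; ${SRCDIR} expands to the package directory, so the paths do not depend on where go build runs):

/*
#cgo CPPFLAGS: -I${SRCDIR}/cpp/include
#cgo CXXFLAGS: -std=c++17 -O2 -fPIC
#cgo LDFLAGS: -L${SRCDIR}/cpp/lib -lmycpp
#include "math_wrapper.h"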
*/
import "C"

Common link errors:

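  • undefined reference: wrong library order (a library must come after the objects that use it) or a missing -lstdc++/-lc++, which matters most when the package links a prebuilt C++ library but contains no C++ sources of its own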
  • duplicate symbols: mixing static and direct .cpp compile
  • ABI mismatch: using a different C++ standard library or compiler version; pin your toolchain.

Lab:

  • Build a small C++ static lib (libmylib.a), link it into Go via #cgo LDFLAGS, call one function.
  • Switch between static and shared builds; verify ldd/otool -L.

Lesson 5: Memory & Safety Deep Dive

Goal: Avoid crashes, UBs, leaks; handle errors properly.

Key concepts:

  • Ownership:
    • Anything new/malloc in C/C++ must have a corresponding delete/free exposed in the wrapper.
    • Prefer not to return ownership to Go unless necessary; keep lifecycle well-defined.
  • Go GC vs. C++ destructors:
    • Finalizers are non-deterministic; expose Close() and document it.
  • Pointer passing:
    • Go -> C temporary is OK when not retained and element type has no Go pointers.
    • If C must hold a pointer, allocate the buffer on the C/C++ heap.
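  • Pinning (Go 1.21+):
    • cgo already keeps a Go pointer that is passed directly as a call argument valid for the duration of that call. runtime.Pinner covers the remaining cases: it pins a Go object until Unpin is called, so a pointer to it can be stored in C memory or inside other Go memory that is passed to C.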
  • Exceptions:
    • Never throw across C; catch in C++ and convert to error codes and last-error strings.
  • Thread safety:
    • CGO calls run on system threads; do not assume re-entrancy in C++ code.
    • If the lib is thread-affine, use runtime.LockOSThread.

Pattern: Exception-safe C wrapper

// err_wrap.hpp
#pragma once
#include <string>
#include <exception>

inline thread_local std::string g_last_error;

inline const char* set_last_error(const char* msg) {
    g_last_error = msg ? msg : "unknown error";
    return g_last_error.c_str();
}
// err_wrap_c.h
#pragma once
#ifdef __cplusplus
extern "C" {
#endif

const char* last_error_message(void);

#ifdef __cplusplus
}
#endif
// err_wrap.cpp
#include "err_wrap.hpp"
extern "C" {
const char* last_error_message(void) { return g_last_error.c_str(); }
}
// safe_wrapper.cpp
#include "err_wrap.hpp"
#include <stdexcept>

extern "C" {
int might_fail(int x) {
    try {
        if (x < 0) throw std::runtime_error("x must be non-negative");
        return x * 2;
    } catch (const std::exception& e) {
        set_last_error(e.what());
        return -1;
    } catch (...) {
        set_last_error("unknown exception");
        return -1;
    }
}
}

Go side:

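/*
int might_fail(int x);
const char* last_error_message(void);
*/
import "C"
import (
    "errors"
    "runtime"
)

func MightFail(x int) (int, error) {
    // The error string is thread-local on the C++ side, and a goroutine can
    // move to another OS thread between two cgo calls. Lock the thread so the
    // failing call and the lookup see the same string.
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()
    r := int(C.might_fail(C.int(x)))
    if r < 0 {
        return 0, errors.New(C.GoString(C.last_error_message()))
    }
    return r, nil
}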

Pinning example:

// Fill a Go-allocated []float32 in C without copies, with pinning.
package main

/*
#include <stddef.h>
void fill_f32(float *dst, size_t n) {
    for (size_t i=0; i<n; i++) dst[i] = (float)i;
}
*/
import "C"
import (
    "fmt"
    "runtime"
    "unsafe"
)

func main() {
    n := 8
    buf := make([]float32, n)
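    // cgo already keeps a pointer passed directly as an argument valid for the
    // whole call, so the Pinner is redundant here; it only demonstrates the API.
    var p runtime.Pinner
    p.Pin(&buf[0])
    C.fill_f32((*C.float)(unsafe.Pointer(&buf[0])), C.size_t(n))
    p.Unpin()
    runtime.KeepAlive(buf) // ensure buf lives through the cgo call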
    fmt.Println(buf)
}

Lab:

  • Write a C++ function that throws on invalid input; wrap it with status/last_error; call from Go; assert error.
  • Use Pinner to pass a []byte buffer to C++; fill with a pattern; verify contents.

Safety checklist:

  • Validate handles in C wrappers; return errors on null.
  • Return sizes with size_t; check overflows at the boundary.
  • Always document who frees what.
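
The "check overflows at the boundary" item can be sketched in plain Go. The helpers below (ToCInt32, ToSize) are hypothetical names for this example; they return Go fixed-width types so the code runs without cgo, and in a real wrapper the results would be converted with C.int(...) and C.size_t(...). The sketch assumes a platform where C int is 32 bits.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// ErrRange reports a value that does not fit the C type at the boundary.
var ErrRange = errors.New("value out of range for C type")

// ToCInt32 checks that n fits in a 32-bit C int before it is converted.
func ToCInt32(n int) (int32, error) {
	if n < math.MinInt32 || n > math.MaxInt32 {
		return 0, ErrRange
	}
	return int32(n), nil
}

// ToSize rejects negative lengths, which would otherwise wrap around to a
// huge unsigned value when converted to size_t.
func ToSize(n int) (uint64, error) {
	if n < 0 {
		return 0, ErrRange
	}
	return uint64(n), nil
}

func main() {
	_, err := ToSize(-1)
	fmt.Println(err) // value out of range for C type
	v, _ := ToCInt32(42)
	fmt.Println(v) // 42
}
```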

Lesson 6: Advanced Patterns & Optimization

Goal: Production-grade interop patterns.

Key concepts:

  • Zero-copy data
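    • Share buffers by letting C++ write into a Go-provided buffer (cgo keeps a directly passed pointer valid for the duration of the call).
    • Or have C++ allocate and Go create a slice view via unsafe.Slice, but then Go must not append to the slice or use it after the buffer is freed, and C must manage the lifetime.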
  • Minimize cgo overhead
    • Batch work: process arrays/chunks, not per-element calls.
    • Avoid chatty APIs.
  • Async interop
    • It’s okay to call into C++ from goroutines if the C++ code is thread-safe.
    • For thread-affine libs, funnel calls via a single goroutine + LockOSThread.
  • Build automation
    • go:generate, Makefiles, or CMake invoked from go generate
  • Sanitizers (dev only)
    • -fsanitize=address,undefined to catch memory bugs
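
The "single goroutine + LockOSThread" funnel for thread-affine libraries can be sketched in plain Go. Affine, NewAffine, Do, and Stop are hypothetical names for this example; the closures passed to Do are where the thread-affine C++ calls would go.

```go
package main

import (
	"fmt"
	"runtime"
)

// Affine runs every submitted function on one dedicated OS thread.
type Affine struct {
	jobs chan func()
}

// NewAffine starts the worker goroutine and ties it to its OS thread.
func NewAffine() *Affine {
	a := &Affine{jobs: make(chan func())}
	go func() {
		runtime.LockOSThread() // this goroutine now owns its OS thread
		defer runtime.UnlockOSThread()
		for job := range a.jobs {
			job()
		}
	}()
	return a
}

// Do runs fn on the dedicated thread and waits for it to finish.
func (a *Affine) Do(fn func()) {
	done := make(chan struct{})
	a.jobs <- func() {
		defer close(done)
		fn()
	}
	<-done
}

// Stop ends the worker. Do must not be called afterwards.
func (a *Affine) Stop() { close(a.jobs) }

func main() {
	a := NewAffine()
	defer a.Stop()
	total := 0
	for i := 1; i <= 3; i++ {
		a.Do(func() { total += i }) // placeholder for a thread-affine C++ call
	}
	fmt.Println(total) // 6
}
```

Because Do blocks until the job has run, callers can share results through captured variables without extra locking. The cost is that all native calls are serialized.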

Example: Batch image blur (toy)

// blur.h
#pragma once
#include <stddef.h>
#ifdef __cplusplus
extern "C" {
#endif
// In-place 1D box blur over n bytes. Returns 0 on success, -1 on invalid arguments.
int box_blur_u8(unsigned char* data, size_t n, int radius);
#ifdef __cplusplus
}
#endif
// blur.cpp
#include "blur.h"
#include <vector>
int box_blur_u8(unsigned char* data, size_t n, int radius) {
    if (!data || n == 0 || radius < 0) return -1;
    if (radius == 0) return 0;
    std::vector<unsigned char> tmp(n);
    for (size_t i = 0; i < n; i++) {
        size_t s = (i > (size_t)radius) ? i - (size_t)radius : 0;
        size_t e = (i + (size_t)radius + 1 < n) ? i + (size_t)radius + 1 : n;
        unsigned int sum = 0;
        for (size_t k = s; k < e; k++) sum += data[k];
        tmp[i] = (unsigned char)(sum / (unsigned int)(e - s));
    }
    for (size_t i = 0; i < n; i++) data[i] = tmp[i];
    return 0;
}
// blur.go
package blur

/*
#cgo CXXFLAGS: -O3
#include "blur.h"
*/
import "C"
import (
    "errors"
    "runtime"
    "unsafe"
)

func BoxBlurInPlace(buf []byte, radius int) error {
    if len(buf) == 0 { return nil }
    var p runtime.Pinner
    p.Pin(&buf[0])
    rc := C.box_blur_u8((*C.uchar)(unsafe.Pointer(&buf[0])), C.size_t(len(buf)), C.int(radius))
    p.Unpin()
    runtime.KeepAlive(buf)
    if rc != 0 {
        return errors.New("box_blur: invalid args")
    }
    return nil
}

Lab:

  • Benchmark: (1) naive loop in Go, (2) cgo blur in one call, (3) cgo blur in many tiny calls. Compare.
  • Implement a worker pool that dispatches large chunks to a thread-safe C++ function from goroutines.

Lesson 7: Debugging & Profiling

Goal: Diagnose interop issues confidently.

Techniques:

  • Crash/segfaults
    • Rebuild the Go side without optimizations or inlining so frames and variables are easier to follow: go build -gcflags="all=-N -l"
    • Use lldb/gdb; bt to inspect C++ stack frames
  • Valgrind (Linux; macOS support lags behind current releases)
    • Detect leaks, invalid reads/writes on the C++ side
  • Sanitizers (dev only)
    • #cgo CXXFLAGS: -fsanitize=address -fno-omit-frame-pointer
    • #cgo LDFLAGS: -fsanitize=address
  • Logging
    • Add verbose logging in C++ wrappers; return error codes not crashes.
  • Profiling
    • perf or Instruments to see where native time is spent
    • pprof on Go side to quantify cgo call overhead
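
On the Go side, one way to put numbers on cgo usage is to combine a CPU profile with the runtime's cgo call counter. The sketch below uses only the standard library (runtime/pprof and runtime.NumCgoCall); the Profile helper is a hypothetical name. Note that the counter is process-wide, so calls made by other goroutines are included.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// Profile runs work under the CPU profiler. It returns the number of cgo calls
// the whole process made while work ran, plus the raw profile bytes, which can
// be written to a file and opened with "go tool pprof".
func Profile(work func()) (cgoCalls int64, profile []byte, err error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, nil, err
	}
	before := runtime.NumCgoCall()
	work()
	after := runtime.NumCgoCall()
	pprof.StopCPUProfile() // flushes the profile into buf
	return after - before, buf.Bytes(), nil
}

func main() {
	calls, prof, err := Profile(func() {
		// Placeholder workload; a real measurement would call the cgo wrapper here.
		sum := 0
		for i := 0; i < 1_000_000; i++ {
			sum += i
		}
		_ = sum
	})
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Println("cgo calls:", calls, "profile bytes:", len(prof))
}
```

Dividing the wall time spent in the wrapper by the call count gives a rough per-call cost, which is usually enough to decide whether an API needs batching.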

Lab:

  • Introduce a dangling pointer bug in C++; replicate crash from Go; fix it by proper ownership.
  • Use Valgrind to detect a memory leak; add missing delete/free.

Final Project: TinyPhysics — a Go package wrapping a small C++ physics engine

Overview:

  • A tiny 2D particle world with gravity, damping, and boundary bounces.
  • Safe C ABI with opaque handles.
  • Idiomatic Go API with explicit Close, finalizer as backup.
  • Cross-platform build via direct .cpp compilation.
  • Tests and a benchmark.

Directory layout:

tinyphysics/
  go.mod
  physics/
    world.hpp
    world.cpp
    world_wrapper.h
    world_wrapper.cpp
    physics.go
    physics_test.go
    bench_test.go

go.mod:

module example.com/tinyphysics

go 1.21

C++ engine (world.hpp):

// physics/world.hpp
#pragma once
#include <vector>
#include <stdexcept>
#include <algorithm>

struct Particle {
    float x, y;
    float vx, vy;
    float r;
};

class World {
public:
    World(size_t capacity, float width, float height)
      : width_(width), height_(height), gravity_x_(0), gravity_y_(100.0f), damping_(0.99f) {
        particles_.reserve(capacity);
        if (width <= 0 || height <= 0) throw std::invalid_argument("invalid dimensions");
    }

    void set_gravity(float gx, float gy) { gravity_x_ = gx; gravity_y_ = gy; }
    void set_damping(float d) {
        if (d <= 0 || d > 1.0f) throw std::invalid_argument("invalid damping");
        damping_ = d;
    }

    int add(float x, float y, float vx, float vy, float r) {
        if (r <= 0) throw std::invalid_argument("radius must be positive");
        Particle p{x,y,vx,vy,r};
        particles_.push_back(p);
        return (int)particles_.size();
    }

    void step(float dt, int substeps) {
        if (dt <= 0 || substeps <= 0) throw std::invalid_argument("dt/substeps must be > 0");
        float sdt = dt / (float)substeps;
        for (int s = 0; s < substeps; s++) {
            for (auto& p : particles_) {
                p.vx += gravity_x_ * sdt;
                p.vy += gravity_y_ * sdt;
                p.x += p.vx * sdt;
                p.y += p.vy * sdt;

                // collide with bounds
                if (p.x - p.r < 0)  { p.x = p.r;     p.vx = -p.vx * damping_; }
                if (p.x + p.r > width_)  { p.x = width_ - p.r;  p.vx = -p.vx * damping_; }
                if (p.y - p.r < 0)  { p.y = p.r;     p.vy = -p.vy * damping_; }
                if (p.y + p.r > height_) { p.y = height_ - p.r; p.vy = -p.vy * damping_; }
            }
        }
    }

    size_t count() const { return particles_.size(); }

    // Writes [x0,y0,x1,y1,...] into out; returns number of floats written (2*N).
    size_t positions(float* out, size_t out_len) const {
        size_t need = particles_.size() * 2;
        if (!out || out_len < need) throw std::invalid_argument("buffer too small");
        size_t j = 0;
        for (const auto& p : particles_) {
            out[j++] = p.x; out[j++] = p.y;
        }
        return need;
    }

private:
    float width_, height_;
    float gravity_x_, gravity_y_;
    float damping_;
    std::vector<Particle> particles_;
};

C ABI (world_wrapper.h):

// physics/world_wrapper.h
#pragma once
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef void* PhysWorld;

const char* phys_last_error(void);

PhysWorld phys_new_world(size_t capacity, float width, float height);
void      phys_delete_world(PhysWorld w);

int       phys_world_set_gravity(PhysWorld w, float gx, float gy);  // 0 ok, -1 err
int       phys_world_set_damping(PhysWorld w, float d);             // 0 ok, -1 err

int       phys_world_add(PhysWorld w, float x, float y, float vx, float vy, float r); // returns count or -1
int       phys_world_step(PhysWorld w, float dt, int substeps); // 0 ok, -1 err

int       phys_world_count(PhysWorld w); // >=0 or -1 on error

// Writes positions into out[0:out_len], returns number of floats written or -1 on error
int       phys_world_positions(PhysWorld w, float* out, size_t out_len);

#ifdef __cplusplus
}
#endif

Wrapper implementation (world_wrapper.cpp):

// physics/world_wrapper.cpp
#include "world_wrapper.h"
#include "world.hpp"
#include <string>
#include <exception>

static thread_local std::string g_last_err;

static int set_err(const char* msg) {
    g_last_err = msg ? msg : "unknown error";
    return -1;
}

extern "C" {

const char* phys_last_error(void) { return g_last_err.c_str(); }

PhysWorld phys_new_world(size_t capacity, float width, float height) {
    try {
        return new World(capacity, width, height);
    } catch (const std::exception& e) {
        set_err(e.what());
        return nullptr;
    } catch (...) {
        set_err("unknown exception");
        return nullptr;
    }
}

void phys_delete_world(PhysWorld w) {
    if (w) delete static_cast<World*>(w);
}

int phys_world_set_gravity(PhysWorld w, float gx, float gy) {
    if (!w) return set_err("null world");
    try {
        static_cast<World*>(w)->set_gravity(gx, gy);
        return 0;
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

int phys_world_set_damping(PhysWorld w, float d) {
    if (!w) return set_err("null world");
    try {
        static_cast<World*>(w)->set_damping(d);
        return 0;
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

int phys_world_add(PhysWorld w, float x, float y, float vx, float vy, float r) {
    if (!w) return set_err("null world");
    try {
        return static_cast<int>(static_cast<World*>(w)->add(x,y,vx,vy,r));
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

int phys_world_step(PhysWorld w, float dt, int substeps) {
    if (!w) return set_err("null world");
    try {
        static_cast<World*>(w)->step(dt, substeps);
        return 0;
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

int phys_world_count(PhysWorld w) {
    if (!w) return set_err("null world");
    try {
        return static_cast<int>(static_cast<World*>(w)->count());
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

int phys_world_positions(PhysWorld w, float* out, size_t out_len) {
    if (!w) return set_err("null world");
    try {
        size_t wrote = static_cast<World*>(w)->positions(out, out_len);
        return static_cast<int>(wrote);
    } catch (const std::exception& e) {
        return set_err(e.what());
    } catch (...) {
        return set_err("unknown exception");
    }
}

} // extern "C"

Go wrapper (physics.go):

// physics/physics.go
package physics

/*
#cgo CXXFLAGS: -std=c++17 -O2 -fPIC
#cgo linux LDFLAGS: -lstdc++
#cgo darwin LDFLAGS: -lc++
#cgo windows LDFLAGS: -lstdc++
#include "world_wrapper.h"
*/
import "C"
import (
	"errors"
	"runtime"
	"unsafe"
)

type World struct {
	h C.PhysWorld
}

// lastError copies the thread-local C++ error string into a Go error.
// It must run on the OS thread that made the failing call, so every caller
// holds runtime.LockOSThread across both.
func lastError() error {
	return errors.New(C.GoString(C.phys_last_error()))
}

// New creates a world with given capacity and bounds (width x height).
func New(capacity int, width, height float32) (*World, error) {
	if capacity < 0 {
		return nil, errors.New("capacity must be non-negative")
	}
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	w := &World{h: C.phys_new_world(C.size_t(capacity), C.float(width), C.float(height))}
	if w.h == nil {
		return nil, lastError()
	}
	runtime.SetFinalizer(w, func(w *World) { _ = w.Close() })
	return w, nil
}

// Close frees the C++ world. It is safe to call more than once.
func (w *World) Close() error {
	if w.h != nil {
		C.phys_delete_world(w.h)
		w.h = nil
		runtime.SetFinalizer(w, nil)
	}
	return nil
}

// status runs one C call on a locked OS thread and maps a negative result to
// the C++ error. Keeping w alive stops the finalizer from freeing the handle
// while the call is still running.
func (w *World) status(call func(h C.PhysWorld) C.int) (int, error) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	defer runtime.KeepAlive(w)
	rc := int(call(w.h))
	if rc < 0 {
		return 0, lastError()
	}
	return rc, nil
}

func (w *World) SetGravity(gx, gy float32) error {
	_, err := w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_set_gravity(h, C.float(gx), C.float(gy))
	})
	return err
}

func (w *World) SetDamping(d float32) error {
	_, err := w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_set_damping(h, C.float(d))
	})
	return err
}

// Add inserts a particle and returns the new particle count.
func (w *World) Add(x, y, vx, vy, r float32) (int, error) {
	return w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_add(h, C.float(x), C.float(y), C.float(vx), C.float(vy), C.float(r))
	})
}

func (w *World) Step(dt float32, substeps int) error {
	_, err := w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_step(h, C.float(dt), C.int(substeps))
	})
	return err
}

func (w *World) Count() (int, error) {
	return w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_count(h)
	})
}

// Positions returns an x/y interleaved slice of length 2*Count.
// The data is copied into a Go-allocated slice in one cgo call.
func (w *World) Positions() ([]float32, error) {
	n, err := w.Count()
	if err != nil {
		return nil, err
	}
	out := make([]float32, n*2)
	if len(out) == 0 {
		return out, nil
	}
	// C fills out during the call only, and out holds no Go pointers. cgo keeps
	// a directly passed pointer valid for that long, so the Pinner is not
	// required here; it is kept to show the API.
	var p runtime.Pinner
	p.Pin(&out[0])
	defer p.Unpin()
	wrote, err := w.status(func(h C.PhysWorld) C.int {
		return C.phys_world_positions(h, (*C.float)(unsafe.Pointer(&out[0])), C.size_t(len(out)))
	})
	if err != nil {
		return nil, err
	}
	return out[:wrote], nil
}

Tests (physics_test.go):

package physics

import "testing"

func TestWorldBasic(t *testing.T) {
	w, err := New(128, 800, 600)
	if err != nil {
		t.Fatal(err)
	}
	defer w.Close()

	if err := w.SetGravity(0, 200); err != nil {
		t.Fatal(err)
	}
	if err := w.SetDamping(0.98); err != nil {
		t.Fatal(err)
	}
	for i := 0; i < 5; i++ {
		if _, err := w.Add(float32(100+20*i), 100, 30, 0, 5); err != nil {
			t.Fatal(err)
		}
	}
	if err := w.Step(0.016, 4); err != nil {
		t.Fatal(err)
	}
	pos, err := w.Positions()
	if err != nil {
		t.Fatal(err)
	}
	if len(pos) != 10 {
		t.Fatalf("expected 10 floats (5 particles), got %d", len(pos))
	}
}

Benchmark (bench_test.go):

package physics

import "testing"

func BenchmarkStep(b *testing.B) {
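	w, err := New(10_000, 1920, 1080)
	if err != nil {
		b.Fatal(err)
	}
	defer w.Close()
	for i := 0; i < 10_000; i++ {
		if _, err := w.Add(float32(i%1920), float32(i%1080), 0, 0, 2); err != nil {
			b.Fatal(err)
		}
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := w.Step(0.016, 2); err != nil {
			b.Fatal(err)
		}
	}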
}

Build and run:

  • go test ./...

Cross-platform notes:

  • Linux: requires libstdc++ (installed with gcc/g++)
  • macOS: links against libc++
  • Windows:
    • MinGW-w64 recommended: set CC and CXX to mingw compilers
    • Example: set CC=x86_64-w64-mingw32-gcc, CXX=x86_64-w64-mingw32-g++

Safety highlights:

  • Opaque handle; no C++ types cross the boundary.
  • All exceptions caught and converted to status/last-error.
  • No Go pointers are retained by C++; positions buffer is used only during call.
  • Explicit Close with finalizer as a safety net.

Critical Best Practices

  1. Never call C++ directly from Go; always use a C-compatible wrapper (extern "C").
  2. Assume C++ is not thread-safe unless docs guarantee it.
  3. Minimize cgo calls; batch your work.
  4. Validate inputs and handles in C wrappers; return error codes and set last error.
  5. Prefer direct compile of .cpp files or static linking for simpler deployment.
  6. Document ownership. Every new/malloc must have a delete/free path.
  7. Don’t pass Go pointers into C if C will store them. Use C-allocated memory for long-lived buffers.
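  8. For short-lived writes into Go memory, pass the pointer directly as a call argument and use runtime.KeepAlive on any owner with a finalizer; reach for runtime.Pinner when C stores the pointer or receives it nested inside another structure.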
  9. Never let C++ exceptions cross the C boundary.
  10. Automate builds (Makefile/go:generate/CMake) and test on all target platforms.

Appendix: Handy Snippets

Thread-affine calls from Go:

runtime.LockOSThread()
defer runtime.UnlockOSThread()
// call into a thread-affine C++ API here

Enable ASan (development):

/*
#cgo CXXFLAGS: -fsanitize=address -fno-omit-frame-pointer
#cgo LDFLAGS: -fsanitize=address
*/
import "C"

Create Go slice view over C memory (dangerous; the C allocation must outlive every use of the slice):

// Requires #include <stdlib.h> in the cgo preamble and Go 1.17+ for unsafe.Slice.
// n is the buffer length in bytes.
ptr := C.malloc(C.size_t(n))
defer C.free(ptr)
// Convert to []byte without copying:
buf := unsafe.Slice((*byte)(ptr), n)
// Use buf; do not append to it; do not retain it after free; do not let C reallocate while Go holds it.

You made it! You’ve got the patterns, the pitfalls, and a complete working project. From here, you can wrap larger libraries (e.g., a subset of OpenCV or Box2D) by applying the same C ABI + error-handling patterns and batch-oriented APIs.
