@genbtc
Created December 22, 2025 15:19
chatGPT-packed_simd-portable-simd.md

How is Rust's packed_simd killed when portable-simd can't replace it, in the context of Mozilla Firefox 115?

Good question — and this is where the real technical wall is. Short answer first, then the concrete blockers.

packed_simd wasn’t “killed” on technical merit; it was superseded by portable_simd (std::simd), which is not a drop-in replacement for what Firefox 115 is doing with packed_simd. The problem is not ideology or conservatism: it is API shape, guarantees, and codegen assumptions.

I’ll be precise and stay high-signal.


The core mismatch (one paragraph summary)

packed_simd exposed a value-level, fixed-width SIMD API with predictable lowering that Firefox-era code relies on. portable_simd (std::simd) is a const-generic API that is still unstable (it sits behind the nightly-only portable_simd feature gate), and its semantics and optimizer behavior do not match what encoding_rs and rand are written around at the Rust version ESR 115 targets.

You don’t lose SIMD, but you lose the invariants the code assumes.


Why packed_simd “just worked” here

packed_simd gave Firefox three things std::simd does not (yet) guarantee:

1️⃣ Fixed lane width with zero feature negotiation

Example used by encoding_rs:

use packed_simd::u8x16;

This means:

  • Exactly 16 lanes
  • Always vectorized
  • No runtime dispatch
  • No feature probing
  • No monomorphization explosion

The code is written assuming one concrete type.


2️⃣ Unstable features unlocked via RUSTC_BOOTSTRAP

Firefox does this intentionally:

RUSTC_BOOTSTRAP=encoding_rs,packed_simd

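That lets the listed crates use nightly-only features on a stable toolchain, which allows:

  • use of unstable SIMD intrinsics
  • bypassing unstable feature gates without a nightly compiler
  • consistent codegen across platforms

std::simd does not remove this requirement: it is itself gated behind the unstable portable_simd feature.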


3️⃣ Known codegen behavior

This matters more than people admit.

Firefox relies on:

  • vector loads not scalarized
  • predictable mask lowering
  • stable alignment behavior

packed_simd was essentially a compiler co-design experiment. std::simd is not.


Why portable_simd is not a drop-in replacement

Let’s go through the real blockers.


Blocker 1: Lane-count types are const-generic, not concrete

In std::simd:

use core::simd::Simd;

type U8x16 = Simd<u8, 16>;

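Sounds fine, and core::simd even exports a u8x16 alias for this exact type, but the const-generic design still:

  • Requires a compiler with mature const generics support
  • Routes operations through generic trait impls, which adds monomorphization work
  • Breaks code that assumes packed_simd's concrete types and layout
  • Changes trait resolution and inlining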

encoding_rs is not written generically. It assumes monomorphic SIMD.
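To make the "const-generic versus concrete" distinction tangible, here is a minimal sketch on stable Rust. `Lanes` is a hypothetical stand-in for `Simd<T, N>` (the real type needs a nightly compiler), and `all_below_0x80` is an illustrative name, not an encoding_rs function. It shows that an alias such as `U8x16` pins one fixed instantiation, so a non-generic function written against it is compiled once.

```rust
/// Hypothetical stand-in for `core::simd::Simd<T, N>`: a const-generic
/// lane container. A plain array is used so this compiles on stable Rust.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes<T, const N: usize>(pub [T; N]);

impl<T: Copy, const N: usize> Lanes<T, N> {
    /// Broadcast one value to every lane.
    pub fn splat(value: T) -> Self {
        Lanes([value; N])
    }

    /// Number of lanes, known at compile time.
    pub const fn len(&self) -> usize {
        N
    }
}

/// The alias names exactly one instantiation: 16 lanes of `u8`.
pub type U8x16 = Lanes<u8, 16>;

/// A non-generic function written against the alias. It is compiled once,
/// for this single lane count and element type.
pub fn all_below_0x80(v: U8x16) -> bool {
    v.0.iter().all(|&b| b < 0x80)
}
```

The generic `impl` block is where the extra monomorphization comes from: every distinct `(T, N)` pair used anywhere in the program gets its own copy of `splat` and `len`, whereas code that only ever touches `U8x16` instantiates one.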


Blocker 2: Mask semantics differ

packed_simd:

let mask: m8x16 = a.eq(b);
  • mask is a vector
  • can be bitcast
  • can be reduced cheaply

std::simd:

let mask = a.simd_eq(b); // Mask<i8, 16>
  • opaque mask type
  • different reduction APIs
  • backend-dependent lowering
  • harder to force optimal codegen

Encoding hot loops depend on this.
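For a sense of what "the mask is a vector that can be reduced cheaply" means at the instruction level, here is a minimal sketch using stable `std::arch` intrinsics rather than packed_simd or std::simd (both need unstable features). The function name `eq_bitmask` is hypothetical. On x86_64 the compare yields a full 128-bit vector of 0x00/0xFF lanes, and one further instruction packs it into 16 bits; other architectures take a scalar fallback with the same result.

```rust
/// Compare two 16-byte blocks lane by lane and pack the result into a
/// bitmask: bit `i` is set when `a[i] == b[i]`.
#[cfg(target_arch = "x86_64")]
pub fn eq_bitmask(a: &[u8; 16], b: &[u8; 16]) -> u16 {
    use std::arch::x86_64::{__m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8};
    // SAFETY: SSE2 is part of the baseline on mainstream x86_64 targets,
    // and both pointers reference 16 readable bytes (unaligned loads).
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // The mask is itself a vector: 0xFF in equal lanes, 0x00 elsewhere.
        let mask_vector = _mm_cmpeq_epi8(va, vb);
        // One instruction gathers the top bit of each lane into an integer.
        _mm_movemask_epi8(mask_vector) as u16
    }
}

/// Scalar fallback with the same semantics for other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn eq_bitmask(a: &[u8; 16], b: &[u8; 16]) -> u16 {
    let mut bits = 0u16;
    for i in 0..16 {
        if a[i] == b[i] {
            bits |= 1 << i;
        }
    }
    bits
}
```

A hot loop typically tests the returned integer against zero or counts its trailing zeros to find the first interesting byte, which is why the exact lowering of the mask matters for throughput.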


Blocker 3: Target-feature dispatch is explicit (and viral)

To use instructions beyond the target's baseline, std::simd code needs either:

  • #[target_feature(enable = "...")]
  • or runtime feature detection

Both are poison for Firefox 115 because:

  • functions get duplicated
  • inlining behavior changes
  • build times explode
  • perf cliffs appear on some CPUs

Firefox intentionally avoids per-function dispatch here.
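The duplication described above can be sketched with stable Rust alone. The function names here are hypothetical and are not taken from Firefox, encoding_rs, or std::simd; the point is only the shape: one public entry point, one runtime check, and a second copy of the body compiled with extra target features.

```rust
/// Public entry point: picks an implementation at runtime.
pub fn count_non_ascii(bytes: &[u8]) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: the runtime check above guarantees AVX2 is available.
            return unsafe { count_non_ascii_avx2(bytes) };
        }
    }
    count_non_ascii_scalar(bytes)
}

/// Second copy of the same logic, compiled with AVX2 enabled so the
/// optimizer may use wider vectors. It cannot be inlined into callers
/// that lack the feature, which is where the inlining changes come from.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_non_ascii_avx2(bytes: &[u8]) -> usize {
    count_non_ascii_scalar(bytes)
}

/// Baseline implementation, also used as the body of the AVX2 copy.
#[inline]
fn count_non_ascii_scalar(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| b >= 0x80).count()
}
```

Every function on the hot path needs the same treatment, or the feature has to be enabled globally at build time instead; that is the "viral" part.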


Blocker 4: std::simd was unstable and incomplete at ESR 115’s MSRV

At the Rust version Firefox 115 is pinned to, std::simd was nightly-only and had:

  • missing reductions
  • missing casts
  • missing shuffle ops
  • partial operator coverage

You end up:

  • reimplementing ops manually
  • or falling back to scalar paths

Either way, performance parity is gone.


Blocker 5: rand uses SIMD differently than encoding_rs

rand’s SIMD paths:

  • rely on bulk sampling
  • assume cheap lane extraction
  • use packed arithmetic

std::simd makes these patterns:

  • more verbose
  • harder to optimize
  • more fragile to compiler changes

This isn’t theoretical — multiple upstream rand PRs stalled on this.


Why Mozilla hasn’t done it (this matters)

Mozilla could afford to do it if it were easy.

They haven’t because:

  • perf regressions were measurable
  • cross-arch behavior diverged
  • CI signal-to-noise was bad
  • ESR stability matters more than API purity

This isn’t inertia — it’s engineering cost.


The only way portable_simd becomes viable without perf loss

All of these must be true:

  1. Higher MSRV (newer Rust)
  2. Explicitly monomorphic SIMD types
  3. No runtime dispatch
  4. Equivalent mask lowering
  5. Benchmark-validated parity

Firefox 115 does not satisfy all of these at once.


If you still want to try (the honest answer)

The only credible approach is:

  • Write a compat shim:

    pub struct u8x16(Simd<u8, 16>);
  • Re-expose packed_simd-like API

  • Force #[inline(always)]

  • Avoid masks where possible

  • Pin target features globally

That is weeks of work, not days — and it still may regress.
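As a rough idea of the shim's shape, here is a minimal sketch. It is not a working port: the inner representation is a plain `[u8; 16]` standing in for `Simd<u8, 16>` so that it compiles on stable Rust, and only a handful of packed_simd-style methods (`splat`, `from_slice_unaligned`, `eq`, `any`, `all`) are re-exposed. A real shim would forward each method to the corresponding std::simd operation and would need benchmarking.

```rust
/// Shim type with a packed_simd-style name. The array is a stand-in for
/// `Simd<u8, 16>`, which would require a nightly compiler.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug)]
pub struct u8x16([u8; 16]);

/// Shim mask type: one boolean per lane (stand-in for a real SIMD mask).
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug)]
pub struct m8x16([bool; 16]);

impl u8x16 {
    /// Broadcast one byte to all 16 lanes.
    #[inline(always)]
    pub fn splat(value: u8) -> Self {
        u8x16([value; 16])
    }

    /// Load the first 16 bytes of `slice`, with no alignment requirement.
    /// Panics if `slice` holds fewer than 16 bytes.
    #[inline(always)]
    pub fn from_slice_unaligned(slice: &[u8]) -> Self {
        let mut lanes = [0u8; 16];
        lanes.copy_from_slice(&slice[..16]);
        u8x16(lanes)
    }

    /// Lane-wise equality, returning a mask instead of a single bool.
    #[inline(always)]
    pub fn eq(self, other: Self) -> m8x16 {
        let mut out = [false; 16];
        for i in 0..16 {
            out[i] = self.0[i] == other.0[i];
        }
        m8x16(out)
    }
}

impl m8x16 {
    /// True if at least one lane is set.
    #[inline(always)]
    pub fn any(self) -> bool {
        self.0.iter().any(|&lane| lane)
    }

    /// True if every lane is set.
    #[inline(always)]
    pub fn all(self) -> bool {
        self.0.iter().all(|&lane| lane)
    }
}
```

Call sites written against packed_simd, such as `chunk.eq(u8x16::splat(b',')).any()`, would then compile unchanged against the shim, which is the whole point: the rewrite is confined to one module rather than spread across every hot loop.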


Bottom line (no hedging)

  • portable_simd is not immature
  • It is architecturally different
  • Firefox 115’s SIMD code assumes properties it does not guarantee
  • Using it without regressions requires an upstream-scale rewrite

You’re not wrong to want to move forward — but the wall you’re hitting is real, not superstition.


If you want, next I can:

  • Sketch a minimal compatibility layer design
  • Identify exact functions where parity breaks
  • Estimate realistic engineering time
  • Or help you decide where to accept controlled regressions vs keep SIMD