Skip to content

Instantly share code, notes, and snippets.

@blinkinglight
Created February 22, 2026 18:40
Show Gist options
  • Select an option

  • Save blinkinglight/4328fb378e387317bd9fdb084a91a6f5 to your computer and use it in GitHub Desktop.

Select an option

Save blinkinglight/4328fb378e387317bd9fdb084a91a6f5 to your computer and use it in GitHub Desktop.
Error in user YAML: (<unknown>): mapping values are not allowed in this context at line 2 column 58
---
name: nats
description: Practical, production-minded NATS skill pack: Core NATS + JetStream + Go (nats.go) + Micro (service framework). Includes mental models, subject design, reliability patterns, security, ops, and copy‑paste Go snippets.
---

NATS SKILL

This is a high-signal field guide for building real systems with NATS:

  • Core NATS for low-latency messaging (pub/sub, request-reply, queue groups).
  • JetStream for persistence, replay, work-queues, KV, Object Store.
  • nats.go idioms for Go clients.
  • micro (service framework) for discoverable, observable services.

Sources referenced while building this: NATS docs (docs.nats.io), nats.go repo, nats.go/jetstream, nats.go/micro, and example assets.


0) The 30-second mental model

NATS is subject-based messaging: you publish to a subject (a dot-separated string), and subscribers who expressed interest in that subject get the message. No brokers of “topics with partitions” here—just very fast routing by subject.

JetStream adds a persistence layer: streams store messages for subjects; consumers track delivery/acks and provide at-least-once semantics.


1) Core NATS essentials

1.1 Subjects (design it like an API)

Subjects are your public contract. Good subject design will save your future self from archaeology.

Naming style (common and scalable):

  • domain.entity.action (commands)
  • domain.entity.event (events)
  • domain.entity.query or domain.entity.get (queries)

Examples:

  • Commands: billing.invoice.create
  • Events: billing.invoice.created
  • Queries: billing.invoice.get

Wildcards:

  • * matches one token: billing.*.created
  • > matches the rest: billing.>

Rule of thumb: subjects are cheap; changing them is expensive.

1.2 Pub/Sub

  • Use core pub/sub for signals, cache invalidation, fanout notifications, low-latency eventing where losing a message is acceptable or handled elsewhere.

Go:

nc, _ := nats.Connect(nats.DefaultURL)
sub, _ := nc.Subscribe("billing.invoice.created", func(m *nats.Msg) {
    // handle
})
_ = sub
nc.Publish("billing.invoice.created", []byte("hello"))

1.3 Request-Reply (RPC-ish, but don’t cosplay HTTP)

  • Request-reply is the “distributed function call” pattern.
  • In NATS, it’s typically implemented by the client publishing a request with a reply inbox.

Go:

msg, err := nc.Request("billing.invoice.get", []byte("id=123"), 2*time.Second)
if err != nil { /* timeout, no responders, etc */ }
_ = msg

Rules that keep you sane:

  • Always set a timeout.
  • Keep payloads small; send IDs not blobs.
  • Make handlers idempotent.

1.4 Queue groups (load balancing)

Multiple subscribers can share a queue group name to load-balance work.

nc.QueueSubscribe("work.resize_image", "workers", func(m *nats.Msg) {
    // only one worker in the group gets each msg
})

1.5 Inbox subjects (private replies)

NATS clients create unique inbox subjects for replies. Don’t invent your own unless you have a reason.

1.6 Message size, latency, and backpressure

Core NATS is designed for speed, not infinite buffering.

  • If you need durability or replay or work queues with retries → JetStream.
  • If you need slow consumers, consider pull-based consumption in JetStream.

2) JetStream fundamentals

JetStream = persistence + replay + delivery tracking.

2.1 Streams

A Stream is a message store for one or more subjects.

  • Messages published to matching subjects are captured.
  • Retention policy and limits define when data is evicted.

Key ideas:

  • Choose stream boundaries intentionally (by domain, by retention needs).
  • Prefer JetStream publish APIs when you care that a message was stored (they return a storage ack).

2.2 Consumers

A Consumer is a “view” of a stream + delivery state.

  • Pull or push.
  • Durable (named, stateful) or ephemeral (short-lived).
  • Ack policy + max deliveries decide redelivery behavior.

Practical default:

  • For work queues / slow processing: Pull consumer + explicit acks.
  • For fast fanout: push can be fine.

2.3 Acks, NAKs, and retries

At-least-once means duplicates can happen.

  • Ack when done.
  • Nak if you want redelivery sooner.
  • Term / “do not redeliver” patterns exist depending on API.

Design your processing to be idempotent.

2.4 Work queues vs event streams

Two very different beasts:

Event stream (many consumers):

  • You want multiple independent consumers to see the same events.
  • Use retention and delivery policies to support replay.

Work queue (one consumer group shares work):

  • You want each job processed once (at-least-once, with retries).
  • Use JetStream with a consumer strategy that ensures single delivery per job.

2.5 KV store (JetStream abstraction)

KV is a stream-backed key/value store with revision history and watch. Use it for:

  • Feature flags, leader hints, small shared state, coordination. Not for:
  • Large blobs or high-write hot keys.

Key patterns:

  • Watchers for reactive systems.
  • CAS (compare-and-set) for optimistic locking.
  • Create as a mutex primitive (pessimistic lock with TTL/discipline).

2.6 Object Store (JetStream abstraction)

Object Store stores large blobs by chunking. Use it for:

  • Artifacts, files, bundles, configs, model snapshots.

3) nats.go (Go client) patterns you’ll actually use

3.1 Connect safely

opts := []nats.Option{
    nats.Name("svc.billing"),
    nats.Timeout(5 * time.Second),
    nats.ReconnectWait(500 * time.Millisecond),
    nats.MaxReconnects(-1),
    nats.DisconnectErrHandler(func(_ *nats.Conn, err error) { /* log */ }),
    nats.ReconnectHandler(func(nc *nats.Conn) { /* log */ }),
    nats.ClosedHandler(func(nc *nats.Conn) { /* log */ }),
}

nc, err := nats.Connect("nats://127.0.0.1:4222", opts...)
if err != nil { panic(err) }

3.2 Core subscription lifecycle

Be explicit about draining on shutdown:

// on shutdown:
nc.Drain() // lets pending messages flush
nc.Close()

3.3 Prefer structured payloads

Pick an envelope:

  • JSON for human-debuggable.
  • Protobuf for strict schemas.

Always include:

  • id (unique message id)
  • correlation_id
  • created_at
  • type (event/command name)

This is not “enterprisey”; it’s “debuggable”.


4) JetStream in Go: use the modern jetstream package

nats.go has:

  • Older JetStream API via nc.JetStream().
  • Newer, cleaner API in github.com/nats-io/nats.go/jetstream.

4.1 Create a JetStream context

import (
  "github.com/nats-io/nats.go"
  "github.com/nats-io/nats.go/jetstream"
)

nc, _ := nats.Connect(nats.DefaultURL)
js, _ := jetstream.New(nc)

4.2 Create a stream

_, err := js.CreateStream(context.Background(), jetstream.StreamConfig{
    Name:     "ORDERS",
    Subjects: []string{"orders.>"},
    Storage:  jetstream.FileStorage,
    Retention: jetstream.LimitsPolicy,
    MaxAge:   24 * time.Hour,
})
if err != nil { /* handle */ }

4.3 Publish with ack (durable ingest)

ack, err := js.Publish(context.Background(), "orders.created", []byte("..."))
if err != nil { /* not stored */ }
_ = ack

4.4 Pull consume (worker pattern)

c, _ := js.CreateOrUpdateConsumer(context.Background(), "ORDERS", jetstream.ConsumerConfig{
    Durable:       "worker",
    AckPolicy:     jetstream.AckExplicitPolicy,
    MaxDeliver:    10,
    FilterSubject: "orders.created",
})

for {
    msgs, err := c.Fetch(10, jetstream.FetchMaxWait(2*time.Second))
    if err != nil { continue }

    for msg := range msgs.Messages() {
        // process
        if err := process(msg.Data()); err != nil {
            _ = msg.Nak() // retry
            continue
        }
        _ = msg.Ack()
    }
}

Why pull is loved: you control flow and avoid being drowned by push.

4.5 KV in Go

kv, _ := js.CreateOrUpdateKeyValue(context.Background(), jetstream.KeyValueConfig{
    Bucket:  "CONFIG",
    History: 5,
    TTL:     0,
})

_, _ = kv.Put(context.Background(), "flags/new_ui", []byte("true"))
entry, _ := kv.Get(context.Background(), "flags/new_ui")
_ = entry

w, _ := kv.Watch(context.Background(), "flags.*")
for update := range w.Updates() {
    if update == nil { continue }
    // react
}

4.6 Object Store in Go

obj, _ := js.CreateOrUpdateObjectStore(context.Background(), jetstream.ObjectStoreConfig{
    Bucket: "ARTIFACTS",
})

// Put bytes
_, _ = obj.PutBytes(context.Background(), "builds/app.tar.gz", data)

// Get
r, _ := obj.Get(context.Background(), "builds/app.tar.gz")
defer r.Close()

5) Microservices with nats.go/micro

The micro package helps you build services that are:

  • queue-grouped by default (load management)
  • discoverable
  • monitorable

5.1 Minimal service

import "github.com/nats-io/nats.go/micro"

svc, err := micro.AddService(nc, micro.Config{
    Name:    "billing",
    Version: "1.0.0",
})
if err != nil { panic(err) }

err = svc.AddEndpoint("get_invoice", micro.HandlerFunc(func(req micro.Request) {
    // req.Data()
    req.Respond([]byte("ok"))
}), micro.WithEndpointSubject("billing.invoice.get"))
if err != nil { panic(err) }

// block / run
select {}

5.2 Groups, subjects, and sane hierarchies

Group endpoints under consistent subject prefixes:

  • billing.invoice.*
  • billing.payment.*

5.3 Discovery and monitoring mindset

Micro isn’t magic; it standardizes patterns.

  • Make one service = one bounded context.
  • Publish events separately (often via JetStream) rather than trying to turn every endpoint into an RPC.

6) Tooling: nats and nsc

6.1 nats CLI (daily driver)

Use it for:

  • Pub/sub, request/reply debugging
  • JetStream stream/consumer/KV/ObjectStore management
  • Server checks and cheat sheets

Examples:

  • nats pub foo "hi"
  • nats sub foo
  • nats request svc.ping ""
  • nats stream ls
  • nats consumer ls ORDERS
  • nats kv ls
  • nats kv watch CONFIG

Also: export/import stream configs via JSON to treat them like IaC.

6.2 nsc for JWT/NKey security

Use nsc to manage:

  • Operators, Accounts, Users
  • NKey/JWT credentials

If you’re building a multi-tenant platform: spend time here. This is where “secure by default” happens.


7) Security skill: from dev box to zero-trust

7.1 The ladder

  1. Local dev: no auth (fine for localhost)
  2. Basic auth: users in server config
  3. TLS everywhere
  4. NKeys + JWT (operator/account/user) with nsc

7.2 JWT/NKey structure

  • Operator controls the universe.
  • Accounts isolate tenants.
  • Users belong to accounts.

This model enables one NATS infrastructure serving many isolated entities with optional exports/imports.

7.3 Practical advice

  • Rotate creds like you rotate keys, because they are keys.
  • Keep operator/account keys offline where possible.
  • Use short-lived creds when you can (and automate renewal).

8) Ops skill: running NATS like an adult

8.1 Monitoring endpoints

NATS servers expose a lightweight HTTP monitoring port with endpoints like:

  • /varz, /connz, /routez, /gatewayz, /leafz, /subsz

Use them to:

  • understand load
  • find slow consumers
  • validate cluster topology

8.2 Topology: cluster, leafnodes, gateways

  • Cluster: scale + HA within a logical group.
  • Leafnodes: connect edge clusters to a core (great for IoT/branch setups).
  • Gateways: connect independent clusters across regions/accounts.

8.3 JetStream sizing basics

  • Memory vs file storage: file for durable, memory for speed.
  • Know your retention: time-based, size-based, interest-based.
  • Plan for consumer backlogs and replay storms.

9) Reliability patterns (the part that makes systems not explode)

9.1 Idempotency (mandatory)

At-least-once delivery means duplicates happen.

Common strategies:

  • store last processed message ID per consumer
  • use natural idempotency keys (order ID)
  • transactional outbox → publish → consumer de-dup

9.2 The “command + event” split

  • Commands: request to change state (often request-reply).
  • Events: facts about what happened (pub/sub; often persisted).

This fits CQRS/event-sourcing designs nicely:

  • Commands go to a service (micro endpoint or request handler)
  • Events are published to JetStream and projected by other services

9.3 Backpressure

  • Core NATS: keep handlers fast; use worker pools.
  • JetStream: prefer pull consumers for slow/variable workloads.

9.4 Poison messages

After max deliveries, emit an advisory / alert and park the message:

  • send to a DLQ stream (dead letter)
  • store failed payload + error + stack + correlation id

9.5 Ordering realities

  • Don’t assume global ordering across subjects.
  • If you need ordering, design for it (single subject partitioning, per-aggregate subjects, or sequence numbers).

10) Subject taxonomy that scales (copy/paste idea)

Per bounded context:

  • Commands (to one handler): ctx.cmd.<aggregate>.<action>
  • Events (fanout): ctx.evt.<aggregate>.<event>
  • Queries (RPC-ish): ctx.qry.<readmodel>.<name>

Example:

  • billing.cmd.invoice.create
  • billing.evt.invoice.created
  • billing.qry.invoice.by_id

If you later bridge systems (Kafka, HTTP, MQTT), you’ll thank yourself.


11) “Docs map” (curated, not exhaustive)

NATS docs are huge; here’s the map you actually traverse:

Core NATS

  • Subjects + wildcards
  • Pub/Sub
  • Request-Reply
  • Queue groups

JetStream

  • Streams (storage + retention)
  • Consumers (delivery, ack, pull/push)
  • KV store
  • Object store

Security

  • Auth overview
  • JWT/NKeys model (operator/account/user)
  • NSC workflow

Ops/Deployment

  • Running nats-server
  • Monitoring endpoints
  • Leafnodes/gateways

Go client

  • nats.go README
  • jetstream package docs
  • micro package docs
  • nats.go examples / package example tests

12) Common mistakes (and how not to do them)

  1. Using Core NATS for durable work queues

    • Fix: use JetStream pull consumers.
  2. No idempotency

    • Fix: treat duplicates as normal; add de-dup.
  3. Subject chaos

    • Fix: design subjects like an API; version if needed.
  4. Push consumer overload

    • Fix: pull-based consumption; tune batch + max wait.
  5. Security bolted on at the end

    • Fix: adopt creds/TLS early; automate nsc.
  6. Monitoring added after the incident

    • Fix: enable monitoring port, keep dashboards/alerts.

13) Reference snippets: quick recipes

13.1 Fanout events (Core → JetStream)

Pattern: publish event to a subject that is captured by a stream.

  • Producers use JetStream publish ack.
  • Consumers each have their own durable consumer to replay.

13.2 Work queue (JetStream)

  • Stream: jobs.>
  • Consumer: durable pull, explicit ack, max deliver, backoff.

13.3 Cache invalidation (Core)

  • Publish cache.invalidate.user.123 events; subscribers purge local caches.

13.4 “Signal down / events up” UI

  • UI sends commands via request-reply.
  • Backend publishes progress/events on subjects including correlation id.

14) Where to look for examples

  • nats.go repo: API usage and idioms
  • nats.go jetstream README: CRUD streams/consumers + consume patterns
  • nats.go micro README + package examples
  • NATS By Example: broad, practical patterns across languages

15) Skill checklist

You’re “NATS-competent” when you can:

  • Design a subject taxonomy without painting yourself into a corner.
  • Pick Core vs JetStream intentionally.
  • Implement pull consumers with explicit acks + retries + poison handling.
  • Use KV watch + CAS safely.
  • Build a micro service with discovery + monitoring.
  • Secure with creds/TLS and understand the JWT/NKey hierarchy.
  • Operate a cluster with monitoring, leafnodes/gateways (as needed).

End of SKILL.md.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment