---
name: nats
description: Practical, production-minded NATS skill pack: Core NATS + JetStream + Go (nats.go) + Micro (service framework). Includes mental models, subject design, reliability patterns, security, ops, and copy‑paste Go snippets.
---
This is a high-signal field guide for building real systems with NATS:
- Core NATS for low-latency messaging (pub/sub, request-reply, queue groups).
- JetStream for persistence, replay, work-queues, KV, Object Store.
- nats.go idioms for Go clients.
- micro (service framework) for discoverable, observable services.
Sources referenced while building this: NATS docs (docs.nats.io), nats.go repo, nats.go/jetstream, nats.go/micro, and example assets.
NATS is subject-based messaging: you publish to a subject (a dot-separated string), and subscribers who expressed interest in that subject get the message. No brokers of “topics with partitions” here—just very fast routing by subject.
JetStream adds a persistence layer: streams store messages for subjects; consumers track delivery/acks and provide at-least-once semantics.
Subjects are your public contract. Good subject design will save your future self from archaeology.
Naming style (common and scalable):
domain.entity.action(commands)domain.entity.event(events)domain.entity.queryordomain.entity.get(queries)
Examples:
- Commands:
billing.invoice.create - Events:
billing.invoice.created - Queries:
billing.invoice.get
Wildcards:
*matches one token:billing.*.created>matches the rest:billing.>
Rule of thumb: subjects are cheap; changing them is expensive.
- Use core pub/sub for signals, cache invalidation, fanout notifications, low-latency eventing where losing a message is acceptable or handled elsewhere.
Go:
nc, _ := nats.Connect(nats.DefaultURL)
sub, _ := nc.Subscribe("billing.invoice.created", func(m *nats.Msg) {
// handle
})
_ = sub
nc.Publish("billing.invoice.created", []byte("hello"))- Request-reply is the “distributed function call” pattern.
- In NATS, it’s typically implemented by the client publishing a request with a reply inbox.
Go:
msg, err := nc.Request("billing.invoice.get", []byte("id=123"), 2*time.Second)
if err != nil { /* timeout, no responders, etc */ }
_ = msgRules that keep you sane:
- Always set a timeout.
- Keep payloads small; send IDs not blobs.
- Make handlers idempotent.
Multiple subscribers can share a queue group name to load-balance work.
nc.QueueSubscribe("work.resize_image", "workers", func(m *nats.Msg) {
// only one worker in the group gets each msg
})NATS clients create unique inbox subjects for replies. Don’t invent your own unless you have a reason.
Core NATS is designed for speed, not infinite buffering.
- If you need durability or replay or work queues with retries → JetStream.
- If you need slow consumers, consider pull-based consumption in JetStream.
JetStream = persistence + replay + delivery tracking.
A Stream is a message store for one or more subjects.
- Messages published to matching subjects are captured.
- Retention policy and limits define when data is evicted.
Key ideas:
- Choose stream boundaries intentionally (by domain, by retention needs).
- Prefer JetStream publish APIs when you care that a message was stored (they return a storage ack).
A Consumer is a “view” of a stream + delivery state.
- Pull or push.
- Durable (named, stateful) or ephemeral (short-lived).
- Ack policy + max deliveries decide redelivery behavior.
Practical default:
- For work queues / slow processing: Pull consumer + explicit acks.
- For fast fanout: push can be fine.
At-least-once means duplicates can happen.
- Ack when done.
- Nak if you want redelivery sooner.
- Term / “do not redeliver” patterns exist depending on API.
Design your processing to be idempotent.
Two very different beasts:
Event stream (many consumers):
- You want multiple independent consumers to see the same events.
- Use retention and delivery policies to support replay.
Work queue (one consumer group shares work):
- You want each job processed once (at-least-once, with retries).
- Use JetStream with a consumer strategy that ensures single delivery per job.
KV is a stream-backed key/value store with revision history and watch. Use it for:
- Feature flags, leader hints, small shared state, coordination. Not for:
- Large blobs or high-write hot keys.
Key patterns:
- Watchers for reactive systems.
- CAS (compare-and-set) for optimistic locking.
- Create as a mutex primitive (pessimistic lock with TTL/discipline).
Object Store stores large blobs by chunking. Use it for:
- Artifacts, files, bundles, configs, model snapshots.
opts := []nats.Option{
nats.Name("svc.billing"),
nats.Timeout(5 * time.Second),
nats.ReconnectWait(500 * time.Millisecond),
nats.MaxReconnects(-1),
nats.DisconnectErrHandler(func(_ *nats.Conn, err error) { /* log */ }),
nats.ReconnectHandler(func(nc *nats.Conn) { /* log */ }),
nats.ClosedHandler(func(nc *nats.Conn) { /* log */ }),
}
nc, err := nats.Connect("nats://127.0.0.1:4222", opts...)
if err != nil { panic(err) }Be explicit about draining on shutdown:
// on shutdown:
nc.Drain() // lets pending messages flush
nc.Close()Pick an envelope:
- JSON for human-debuggable.
- Protobuf for strict schemas.
Always include:
id(unique message id)correlation_idcreated_attype(event/command name)
This is not “enterprisey”; it’s “debuggable”.
nats.go has:
- Older JetStream API via
nc.JetStream(). - Newer, cleaner API in
github.com/nats-io/nats.go/jetstream.
import (
"github.com/nats-io/nats.go"
"github.com/nats-io/nats.go/jetstream"
)
nc, _ := nats.Connect(nats.DefaultURL)
js, _ := jetstream.New(nc)_, err := js.CreateStream(context.Background(), jetstream.StreamConfig{
Name: "ORDERS",
Subjects: []string{"orders.>"},
Storage: jetstream.FileStorage,
Retention: jetstream.LimitsPolicy,
MaxAge: 24 * time.Hour,
})
if err != nil { /* handle */ }ack, err := js.Publish(context.Background(), "orders.created", []byte("..."))
if err != nil { /* not stored */ }
_ = ackc, _ := js.CreateOrUpdateConsumer(context.Background(), "ORDERS", jetstream.ConsumerConfig{
Durable: "worker",
AckPolicy: jetstream.AckExplicitPolicy,
MaxDeliver: 10,
FilterSubject: "orders.created",
})
for {
msgs, err := c.Fetch(10, jetstream.FetchMaxWait(2*time.Second))
if err != nil { continue }
for msg := range msgs.Messages() {
// process
if err := process(msg.Data()); err != nil {
_ = msg.Nak() // retry
continue
}
_ = msg.Ack()
}
}Why pull is loved: you control flow and avoid being drowned by push.
kv, _ := js.CreateOrUpdateKeyValue(context.Background(), jetstream.KeyValueConfig{
Bucket: "CONFIG",
History: 5,
TTL: 0,
})
_, _ = kv.Put(context.Background(), "flags/new_ui", []byte("true"))
entry, _ := kv.Get(context.Background(), "flags/new_ui")
_ = entry
w, _ := kv.Watch(context.Background(), "flags.*")
for update := range w.Updates() {
if update == nil { continue }
// react
}obj, _ := js.CreateOrUpdateObjectStore(context.Background(), jetstream.ObjectStoreConfig{
Bucket: "ARTIFACTS",
})
// Put bytes
_, _ = obj.PutBytes(context.Background(), "builds/app.tar.gz", data)
// Get
r, _ := obj.Get(context.Background(), "builds/app.tar.gz")
defer r.Close()The micro package helps you build services that are:
- queue-grouped by default (load management)
- discoverable
- monitorable
import "github.com/nats-io/nats.go/micro"
svc, err := micro.AddService(nc, micro.Config{
Name: "billing",
Version: "1.0.0",
})
if err != nil { panic(err) }
err = svc.AddEndpoint("get_invoice", micro.HandlerFunc(func(req micro.Request) {
// req.Data()
req.Respond([]byte("ok"))
}), micro.WithEndpointSubject("billing.invoice.get"))
if err != nil { panic(err) }
// block / run
select {}Group endpoints under consistent subject prefixes:
billing.invoice.*billing.payment.*
Micro isn’t magic; it standardizes patterns.
- Make one service = one bounded context.
- Publish events separately (often via JetStream) rather than trying to turn every endpoint into an RPC.
Use it for:
- Pub/sub, request/reply debugging
- JetStream stream/consumer/KV/ObjectStore management
- Server checks and cheat sheets
Examples:
nats pub foo "hi"nats sub foonats request svc.ping ""nats stream lsnats consumer ls ORDERSnats kv lsnats kv watch CONFIG
Also: export/import stream configs via JSON to treat them like IaC.
Use nsc to manage:
- Operators, Accounts, Users
- NKey/JWT credentials
If you’re building a multi-tenant platform: spend time here. This is where “secure by default” happens.
- Local dev: no auth (fine for localhost)
- Basic auth: users in server config
- TLS everywhere
- NKeys + JWT (operator/account/user) with
nsc
- Operator controls the universe.
- Accounts isolate tenants.
- Users belong to accounts.
This model enables one NATS infrastructure serving many isolated entities with optional exports/imports.
- Rotate creds like you rotate keys, because they are keys.
- Keep operator/account keys offline where possible.
- Use short-lived creds when you can (and automate renewal).
NATS servers expose a lightweight HTTP monitoring port with endpoints like:
/varz,/connz,/routez,/gatewayz,/leafz,/subsz
Use them to:
- understand load
- find slow consumers
- validate cluster topology
- Cluster: scale + HA within a logical group.
- Leafnodes: connect edge clusters to a core (great for IoT/branch setups).
- Gateways: connect independent clusters across regions/accounts.
- Memory vs file storage: file for durable, memory for speed.
- Know your retention: time-based, size-based, interest-based.
- Plan for consumer backlogs and replay storms.
At-least-once delivery means duplicates happen.
Common strategies:
- store last processed message ID per consumer
- use natural idempotency keys (order ID)
- transactional outbox → publish → consumer de-dup
- Commands: request to change state (often request-reply).
- Events: facts about what happened (pub/sub; often persisted).
This fits CQRS/event-sourcing designs nicely:
- Commands go to a service (micro endpoint or request handler)
- Events are published to JetStream and projected by other services
- Core NATS: keep handlers fast; use worker pools.
- JetStream: prefer pull consumers for slow/variable workloads.
After max deliveries, emit an advisory / alert and park the message:
- send to a DLQ stream (dead letter)
- store failed payload + error + stack + correlation id
- Don’t assume global ordering across subjects.
- If you need ordering, design for it (single subject partitioning, per-aggregate subjects, or sequence numbers).
Per bounded context:
- Commands (to one handler):
ctx.cmd.<aggregate>.<action> - Events (fanout):
ctx.evt.<aggregate>.<event> - Queries (RPC-ish):
ctx.qry.<readmodel>.<name>
Example:
billing.cmd.invoice.createbilling.evt.invoice.createdbilling.qry.invoice.by_id
If you later bridge systems (Kafka, HTTP, MQTT), you’ll thank yourself.
NATS docs are huge; here’s the map you actually traverse:
Core NATS
- Subjects + wildcards
- Pub/Sub
- Request-Reply
- Queue groups
JetStream
- Streams (storage + retention)
- Consumers (delivery, ack, pull/push)
- KV store
- Object store
Security
- Auth overview
- JWT/NKeys model (operator/account/user)
- NSC workflow
Ops/Deployment
- Running nats-server
- Monitoring endpoints
- Leafnodes/gateways
Go client
- nats.go README
jetstreampackage docsmicropackage docs- nats.go examples / package example tests
-
Using Core NATS for durable work queues
- Fix: use JetStream pull consumers.
-
No idempotency
- Fix: treat duplicates as normal; add de-dup.
-
Subject chaos
- Fix: design subjects like an API; version if needed.
-
Push consumer overload
- Fix: pull-based consumption; tune batch + max wait.
-
Security bolted on at the end
- Fix: adopt creds/TLS early; automate
nsc.
- Fix: adopt creds/TLS early; automate
-
Monitoring added after the incident
- Fix: enable monitoring port, keep dashboards/alerts.
Pattern: publish event to a subject that is captured by a stream.
- Producers use JetStream publish ack.
- Consumers each have their own durable consumer to replay.
- Stream:
jobs.> - Consumer: durable pull, explicit ack, max deliver, backoff.
- Publish
cache.invalidate.user.123events; subscribers purge local caches.
- UI sends commands via request-reply.
- Backend publishes progress/events on subjects including correlation id.
- nats.go repo: API usage and idioms
- nats.go
jetstreamREADME: CRUD streams/consumers + consume patterns - nats.go
microREADME + package examples - NATS By Example: broad, practical patterns across languages
You’re “NATS-competent” when you can:
- Design a subject taxonomy without painting yourself into a corner.
- Pick Core vs JetStream intentionally.
- Implement pull consumers with explicit acks + retries + poison handling.
- Use KV watch + CAS safely.
- Build a micro service with discovery + monitoring.
- Secure with creds/TLS and understand the JWT/NKey hierarchy.
- Operate a cluster with monitoring, leafnodes/gateways (as needed).
End of SKILL.md.