Goal: standardize how you think, communicate, and de-risk architecture so you can drive bigger outcomes with less friction.
If you can’t communicate the design, you don’t have a design.
- C4 model
- System Context → Container → Component (Code level usually optional)
- Deployment topology
- Where things run: VNets/subnets, private endpoints, gateways, AKS/App Service, etc.
- Sequence diagrams
- 1–2 critical flows: auth, create order, payment, async processing
- Data-flow diagrams
- PII, secrets, trust boundaries (doubles as security input)
- Runtime view
- Sync calls, async events, retries, circuit breakers, timeouts
- Diagrams as code: Mermaid / PlantUML in repo, PR-reviewed
- If you want C4-first tooling: Structurizr
- 1-page architecture memo
- Problem, constraints, NFRs, options, decision, risks, rollout plan
- ADRs (Architecture Decision Records)
- One ADR per meaningful choice (storage, messaging, tenancy, authN/authZ, etc.)
- Reliability, latency, throughput, scalability, security, cost, operability
- Ability to state: “We prioritize X over Y” + why
- Timeouts, retries, idempotency, backpressure, rate limiting
- Consistency models, eventual consistency, ordering guarantees
- Caching (what/where + invalidation)
- Multi-region tradeoffs (and when not worth it)
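To make the timeout/retry bullet concrete, here is a minimal sketch of retries with capped exponential backoff and full jitter. The names `TransientError` and `call_with_retries` and all default values are illustrative assumptions, not part of any specific library; retrying is only safe when the operation itself is idempotent.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.1,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `operation` on TransientError with capped exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the failure to the caller
            # Cap the exponential growth, then sleep a random time in [0, cap]
            # so that many clients do not retry in lockstep.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable only so the behavior can be tested without waiting. In a real design the retry budget should also fit inside the caller's overall timeout.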
You should be able to:
- Choose modular monolith vs microservices using explicit tradeoffs
- Define service boundaries (domain boundaries + team boundaries)
- Handle hard parts: data ownership, reporting, distributed transactions, versioning, testing
You should be strong at:
- Event storming / domain discovery
- Bounded contexts + context mapping
- Ubiquitous language (model matches business, not the DB)
You should be able to decide and design:
- When to use events vs commands
- Delivery semantics: at-least-once, dedupe, idempotency
- Outbox pattern, schema evolution, consumer versioning
- Observability for async flows (correlation IDs, tracing, DLQ strategy)
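As a rough illustration of at-least-once delivery plus dedupe, the sketch below shows an idempotent consumer that ignores redelivered events. The `Event` fields and the `PaymentConsumer` class are hypothetical, and the in-memory set stands in for what would normally be a durable store updated in the same transaction as the side effect.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Event:
    event_id: str        # unique per logical event; used as the dedupe key
    correlation_id: str  # carried along so async hops can be traced together
    order_id: str
    amount_cents: int


@dataclass
class PaymentConsumer:
    # Assumption: in production these would live in a durable store and be
    # updated atomically together, not held in process memory.
    processed_ids: Set[str] = field(default_factory=set)
    balances: Dict[str, int] = field(default_factory=dict)

    def handle(self, event: Event) -> bool:
        """Apply the event once; return False if it was a duplicate delivery."""
        if event.event_id in self.processed_ids:
            return False  # redelivery: acknowledge without repeating the side effect
        current = self.balances.get(event.order_id, 0)
        self.balances[event.order_id] = current + event.amount_cents
        self.processed_ids.add(event.event_id)
        return True
```

Delivering the same `Event` twice leaves the balance unchanged after the first call, which is what lets the broker redeliver safely.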
Anchor: Azure Well-Architected Framework.
Minimum Azure SA competency:
- Identity: Entra ID, Managed Identity, RBAC, Conditional Access assumptions
- Network: hub-spoke, private endpoints, egress control, DNS, WAF
- Compute: App Service vs AKS vs Functions vs Container Apps (decisioning)
- Data: Azure SQL vs Cosmos DB vs Storage; backups, RPO/RTO
- Messaging: Service Bus vs Event Grid (pick based on guarantees needed)
- IaC: Bicep or Terraform (pick one and go deep)
- Observability: App Insights + Logs + distributed tracing standards
- Governance: Azure Policy, naming/tagging, cost controls
You should be able to answer:
- “How will we know it’s failing?”
- “How do we recover?”
- “What happens at 10x traffic?”
- “How do we deploy without fear?”
Key concepts:
- SLO/SLI thinking
- Runbooks, incident response, game days
- Load testing strategy + capacity model
- Resilience patterns: bulkheads, circuit breakers, queue-based load leveling
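Of the resilience patterns above, the circuit breaker is the easiest to show in a few lines. This is a simplified sketch with assumed names and thresholds (`CircuitBreaker`, `failure_threshold`, `reset_timeout_s`). It is single-threaded and counts consecutive failures, whereas production libraries typically track failure rates over a window.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0                       # consecutive failures seen
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("dependency marked unhealthy; failing fast")
            # Reset timeout elapsed: half-open, let this one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast while the circuit is open protects both the caller's threads and the struggling dependency, which is also why it pairs naturally with timeouts and bulkheads.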
Minimum:
- Lightweight threat modeling
- OWASP Top 10 awareness (API, auth, logging, secrets)
- Data classification: PII handling, retention, encryption, auditing
You should be strong at:
- Cross-team alignment (decisions multiple teams live with)
- Stakeholder management (product, security, platform, finance)
- Driving clarity under ambiguity (tradeoffs + decision log)
- Mentoring + raising engineering standards
- Saying “no” with a better alternative
Assume 4–6 hrs/week. Deliverables are what make it stick.
Outputs
- C4 diagrams for one real system: Context + Container + one Component view
- 2 sequence diagrams (one sync, one async)
- 1-page architecture memo template + first memo written
Outputs
- Bounded context map + integration contracts (events/APIs)
- Written decision: modular monolith vs microservices (memo + ADR)
- Service ownership doc: data ownership + APIs/events + versioning policy
Outputs
- Event catalog: names, schemas, producers/consumers, ordering needs
- Reliability design: retries, idempotency, DLQ policy, reprocessing plan
- Observability standard: correlation IDs + tracing across async flows
Outputs
- Well-Architected review: top 10 risks + remediation plan
- Cost model: biggest cost drivers + scaling assumptions
- Runbook + incident checklist for top 3 failure modes
- Fundamentals of Software Architecture (Richards/Ford)
- Quality attributes, tradeoffs, evaluating architectures
- Software Architecture: The Hard Parts (Ford/Richards/Sadalage/Dehghani)
- Decision frameworks for distributed systems + non-obvious tradeoffs
- Designing Data-Intensive Applications (Kleppmann)
- Storage, replication, partitioning, streams, consistency (how systems behave)
- Building Microservices (Sam Newman)
- When microservices help, and what they break
- Microservices Patterns (Chris Richardson)
- Patterns: decomposition, saga/outbox, communication, testing
- Designing Event-Driven Systems (Ben Stopford)
- Event streaming + EDA patterns and why they matter
- Domain-Driven Design (Eric Evans)
- Strategic DDD and domain modeling mindset
- Domain-Driven Design Distilled (Vaughn Vernon)
- Short, actionable DDD primer (high ROI/time)
- Release It! (Michael Nygard)
- Production failure patterns + how to design for survival
- Threat Modeling (Adam Shostack)
- Repeatable way to reason about security during design
- The Staff Engineer’s Path (Tanya Reilly)
- How senior ICs lead, set standards, and operate without authority
- Staff Engineer (Will Larson)
- Staff/Principal archetypes and how to be effective in each
- If you meant The Pragmatic Programmer (Hunt/Thomas): timeless craft advice
- If you meant The Pragmatic Engineer (Gergely Orosz): that is a newsletter/brand, not a book; his book is The Software Engineer’s Guidebook
For any serious design, you should be able to produce:
- C4 diagrams + 1–2 sequence diagrams
- NFR list with priorities + measurable targets
- Key decisions (ADRs) + explicit tradeoffs
- Failure modes + resilience plan
- Security posture (light threat model)
- Cost posture (main drivers + scaling assumptions)
- Delivery plan (migration/rollout, testing, observability)
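One way to turn "measurable targets" into something checkable is error-budget arithmetic for an availability SLO. The function names and the 30-day window below are assumptions for illustration only.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an availability SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1, e.g. 0.999")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures
```

For example, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, and 250 failures out of 1,000,000 requests would have spent about a quarter of that budget.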