in_depth
Business Features
- Admin Portal:
Developed an Admin Portal for the support team to streamline high-frequency operations.
- Implemented role-based access control (RBAC) with a dedicated “Support Read-Only” role.
- Integrated a Metabase analytics dashboard for storing and executing read-only SQL queries.
- Built an AWS S3–backed ingestion pipeline for processing Excel (XLSX/CSV) datasets.
- Gated non-user-facing features behind feature flags.
- Used AWS EventBridge to orchestrate on-demand and scheduled task execution through separate code execution pipelines.
- Reduced JIRA ticket volume by 70% and improved SLA compliance by 60%.
- App Catalog:
Designed and developed an App Catalog ticketing platform for managing application access and support requests across the organization.
- Integrated workflow automation to intelligently route requests through designated approvers, reducing manual coordination and ensuring compliance.
- Implemented configurable single- and multi-step sequential approval workflows with custom approve/reject rules (see the sketch after this list).
- Integrated Webhook notifications to external systems with robust error handling for Webhook Delivery Failure scenarios.
- Designed escalation management, including Escalation Path routing for overdue approvals.
- Leveraged PostgreSQL, Sidekiq for background job processing, and AWS S3/EventBridge for asset storage and asynchronous workflow triggers.
- Drove higher adoption by keeping application-related requests within the App Catalog instead of diverting them to the organization’s JIRA.
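
A minimal sketch of the sequential approval chain, written in Python for illustration (the production service was Rails/Sidekiq-based); step names, statuses, and the single-rejection-halts rule shown here are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class ApprovalStep:
    approver: str
    status: str = "pending"   # pending | approved | rejected

@dataclass
class ApprovalRequest:
    steps: list[ApprovalStep] = field(default_factory=list)

    def current_step(self):
        # The chain is sequential: only the first pending step may act.
        return next((s for s in self.steps if s.status == "pending"), None)

    def decide(self, approver: str, approve: bool) -> str:
        step = self.current_step()
        if step is None or step.approver != approver:
            raise ValueError("not this approver's turn")
        step.status = "approved" if approve else "rejected"
        if not approve:
            return "rejected"          # one rejection halts the whole chain
        return "approved" if self.current_step() is None else "in_progress"

request = ApprovalRequest(steps=[ApprovalStep("manager"), ApprovalStep("it_owner")])
print(request.decide("manager", True))    # -> in_progress
print(request.decide("it_owner", True))   # -> approved
```

In the real platform these steps were configurable records, which is what makes the same engine serve both single- and multi-step flows.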
- Smart Contracts:
Built a GPT-3.5-powered Smart Contracts system with automated clause extraction, compliance alerts, spend benchmarking, and savings insights.
- Used pretrained large language models to classify clauses (SLAs, termination terms) and extract critical dates, obligations, and pricing tables with >90% precision.
- Event-driven alerts via Sidekiq CRON jobs and AWS EventBridge, triggering notifications for renewals (30/60/90-day windows), spend anomalies, or non-compliant terms.
- Benchmarking engine comparing rates against our historical spend data in Snowflake and third-party APIs (e.g., Spend Intelligence).
- Generated savings recommendations via aggregated time-series analysis (Python/Pandas) and outlier detection with DBSCAN clustering (see the sketch after this list).
- UI dashboards visualizing contract health (burn rate, utilization) and benchmark gaps.
- The underlying LLM was the GPT-3.5 base model (trained on ~500B tokens) with a 4K-token context window and a per-request cost of <$0.002.
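
A minimal sketch of the DBSCAN-based spend-anomaly step; the spend figures and the eps/min_samples values are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Monthly spend per contract (synthetic numbers for illustration).
spend = np.array([[1000], [1020], [980], [995], [4800], [1010]])

# DBSCAN labels points without enough dense neighbors as noise (-1);
# those noise points are treated as spend anomalies.
labels = DBSCAN(eps=100, min_samples=3).fit_predict(spend)
anomalies = spend[labels == -1].ravel()
print(anomalies)  # -> [4800]
```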
- Product Sentiments:
An automated survey tool that analyzes user sentiment toward apps via targeted feedback, helping identify unused or inefficient tools.
- Enables admins to launch targeted email campaigns to assess user sentiment about specific apps.
- Workspace owners select apps based on spend data (overlapping/expensive) or SSO logs (unused).
- Users receive personalized survey links via email to provide feedback.
- Aggregates responses into an interactive dashboard showing trends and suggestions.
- Plots app sentiment on a 4-quadrant grid (e.g., "High Cost vs. Low Satisfaction") for prioritization (see the sketch after this list).
- Helped organizations cut costs (unused apps) and improve ROI (high-value tools).
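
A minimal sketch of the 4-quadrant bucketing using median cutoffs; the app names, costs, and satisfaction scores are made-up illustration data:

```python
import pandas as pd

# Hypothetical per-app aggregates; the real inputs came from spend
# reports and survey responses.
apps = pd.DataFrame({
    "app": ["Zoom", "Figma", "LegacyCRM", "Notion"],
    "annual_cost": [50_000, 30_000, 80_000, 10_000],
    "satisfaction": [4.2, 4.6, 2.1, 3.9],
})

cost_cutoff = apps["annual_cost"].median()
sat_cutoff = apps["satisfaction"].median()

def quadrant(row):
    cost = "High Cost" if row["annual_cost"] >= cost_cutoff else "Low Cost"
    sat = "High Satisfaction" if row["satisfaction"] >= sat_cutoff else "Low Satisfaction"
    return f"{cost} / {sat}"

apps["quadrant"] = apps.apply(quadrant, axis=1)
print(apps[["app", "quadrant"]])  # "High Cost / Low Satisfaction" = retire candidates
```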
Technical Features
- SAML:
Implemented SAML 2.0 (Security Assertion Markup Language) for enterprise-grade SSO.
- Launched with Okta (v1) – Enabled enterprise SSO via SAML 2.0, later expanded to Azure AD, OneLogin, and custom IdPs.
- Challenges included varied XML formats, certificate rotations, and strict NameID requirements, all of which caused integration hurdles.
- Testing and debugging were painful due to ACS URL mismatches across environments; we relied exclusively on network tunneling to expose local endpoints (a minimal ACS-validation sketch follows this list).
- In v2, rolled out automated user provisioning using SCIM flows.
- Post-acquisition: consolidated IdPs under Auth0, migrated SAML customers, established cross-domain federated identity, and managed sessions across the products.
- Scaled to 150+ customers; cut support tickets by 90% and sped up integrations from 1 hour to 10 minutes.
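
A minimal sketch of an ACS endpoint using python3-saml (the actual service may have used a different SAML library); the route, the settings location, and the request mapping are assumptions:

```python
from flask import Flask, request
from onelogin.saml2.auth import OneLogin_Saml2_Auth

app = Flask(__name__)

def build_auth(req):
    # python3-saml reads IdP/SP settings (entity IDs, certs, the ACS URL)
    # from a settings.json under the given base path.
    return OneLogin_Saml2_Auth({
        "https": "on" if req.scheme == "https" else "off",
        "http_host": req.host,
        "script_name": req.path,
        "get_data": req.args.copy(),
        "post_data": req.form.copy(),
    }, custom_base_path=".")

@app.route("/saml/acs", methods=["POST"])
def acs():
    auth = build_auth(request)
    auth.process_response()  # validates signature, audience, and NameID
    if auth.get_errors() or not auth.is_authenticated():
        return "SAML validation failed", 401
    return f"Logged in as {auth.get_nameid()}"
```

This is exactly where the ACS URL mismatches bite: the IdP posts to the URL registered in its metadata, so local testing needs a tunnel that serves that URL.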
- Multi-Tenant & Microservices:
- Architected multi-tenant microservices using Django, PostgreSQL schema isolation, and a Node API Gateway, implementing tenant-aware routing via subdomains.
- Designed a schema-per-tenant architecture leveraging PostgreSQL's CREATE SCHEMA and Django-Tenants, developing middleware for automatic search_path switching (see the sketch after this list).
- Implemented event-driven communication using Apache Kafka with tenant ID headers, enabling asynchronous processing while maintaining schema isolation
- Optimized database performance with PgBouncer connection pooling & schema-aware Django ORM extensions, achieving 2ms schema switches and 30% faster tenant-specific queries
- The Node API Gateway handled service discovery, tenant routing, schema injection, authentication, upload capabilities, and Elasticsearch housekeeping.
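
A minimal sketch of the search_path-switching middleware, similar in spirit to what Django-Tenants provides; the subdomain-to-schema naming convention here is an assumption:

```python
from django.db import connection

class TenantSchemaMiddleware:
    """Switch the Postgres search_path per request based on subdomain."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        subdomain = request.get_host().split(".")[0]
        schema = f"tenant_{subdomain}"  # hypothetical naming convention
        with connection.cursor() as cursor:
            # Postgres accepts a quoted string literal for search_path;
            # a production version must also reset it when the pooled
            # connection is returned, or the schema leaks across requests.
            cursor.execute("SET search_path TO %s, public", [schema])
        return self.get_response(request)
```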
- Table Saw:
A tool that dumps a referentially intact, minimal subset of a Postgres database, with custom query selection and PII masking.
- Long-running production queries (debugging, reports, load-testing) required full DB restores or read-only replicas, slowing workflows.
- Built an open-source tool to extract minimal, referentially intact subsets of Postgres data instead of full dumps.
- Used topological sorting to auto-include all parent/child records via FK relationships from any seed row (see the sketch after this list).
- Enabled targeted, data-driven debugging (e.g., a single customer workspace) without multi-TB restores.
- Handled dumps up to ~50GB before hitting VM memory limits.
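
A minimal sketch of the dump-ordering step: Kahn's topological sort over the FK graph so parent tables restore before children. Table names are illustrative; a real implementation would read the FK graph from pg_catalog:

```python
from collections import defaultdict, deque

# FK graph: child table -> parent tables it references.
fk_parents = {
    "orders": ["customers"],
    "order_items": ["orders", "products"],
    "customers": [],
    "products": [],
}

def dump_order(tables):
    """Topologically sort tables so parents are restored before children."""
    indegree = {t: len(fk_parents[t]) for t in tables}
    children = defaultdict(list)
    for child, parents in fk_parents.items():
        for parent in parents:
            children[parent].append(child)
    queue = deque(t for t, d in indegree.items() if d == 0)
    order = []
    while queue:
        table = queue.popleft()
        order.append(table)
        for child in children[table]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order

print(dump_order(list(fk_parents)))
# -> ['customers', 'products', 'orders', 'order_items']
```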
- Email Server:
Linux-based email server using Postfix (MTA) and Dovecot (IMAP/POP3) with TLS encryption for secure SMTP relay.
- Configured NodeMailer for programmatic email sending, integrating OAuth2 and SMTP authentication.
- Implemented SPF, DKIM, DMARC, and Reverse DNS (PTR) to ensure inbox placement (reduced spam rate from ~50% to <5%).
- Monitored sender reputation using Google Postmaster Tools and MXToolbox to maintain high deliverability.
- Developed a queue-based scheduling system using Redis/BullMQ to delay emails and send them at predefined times (see the sketch after this list).
- Engineered an email retraction feature (for unread emails) via IMAP IDLE tracking and custom API hooks.
- Optimized Postfix with rate limiting, connection pooling, and failover SMTP relays (AWS SES backup).
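
A minimal sketch of the delayed-send queue using a Redis sorted set scored by send time; BullMQ (used in production) implements the same pattern, and the key name and payload shape here are assumptions:

```python
import json
import time
import redis

r = redis.Redis()

def schedule_email(payload: dict, send_at: float):
    # Score each pending email by its Unix send time.
    r.zadd("scheduled_emails", {json.dumps(payload): send_at})

def drain_due_emails():
    # A worker polls for everything whose score has come due.
    now = time.time()
    for raw in r.zrangebyscore("scheduled_emails", 0, now):
        if r.zrem("scheduled_emails", raw):  # atomic claim; skip if another worker won
            email = json.loads(raw)
            print("sending", email["to"])    # hand off to Postfix/NodeMailer here

schedule_email({"to": "user@example.com", "subject": "hi"}, time.time() + 3600)
```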
- Resume Parser & Ranking:
- Built parallel parsers using an Apache Tika OCR microservice for scanned PDFs (92% text recovery).
- Extracted key fields (skills, experience, education) via rule-based matching and NER (spaCy/Stanford NLP).
- Created TF-IDF and Word2Vec embeddings for semantic similarity between resumes and job descriptions (see the ranking sketch after this list).
- Added handcrafted features (years of experience, skill overlap, education tier) for ML modeling.
- Experimented with Logistic Regression, Random Forests, and XGBoost (Bayesian hyperparameter tuning) to rank resumes by JD fit.
- Fine-tuned weights for tech vs. non-tech roles (e.g., heavier skill weighting for engineering jobs).
- Incorporated hard filters (e.g., "Must have: Python") to auto-reject mismatches.
- Achieved ~84% precision in top-5 shortlisting via cross-validation (human-annotated dataset).
- Addressed the sparse-data challenge via synthetic oversampling of niche roles.
- Reduced bias by anonymizing resumes (removing names/gender cues) during ranking.
- Served predictions via a Flask API with caching (Redis) for batch processing.
- Designed Kafka topics with a 3-partition architecture for load balancing.
- Implemented idempotent consumers for resume/JD processing with exactly-once semantics (see the consumer sketch after this list).
- Scaled to 50 docs/minute using Kafka Connect S3 sink
- Integrated with AWS S3 for resume storage and Airflow for scheduled JD updates.
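
A minimal sketch of the TF-IDF ranking step; the job description and resume snippets are made-up illustration data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

job_description = "Senior Python engineer with Django, PostgreSQL and AWS"
resumes = [
    "5 years Python and Django, deployed services on AWS",
    "Java developer, Spring Boot, Oracle",
]

# Fit TF-IDF on the JD plus resumes, then rank resumes by cosine
# similarity to the JD vector.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([job_description] + resumes)
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
for score, resume in sorted(zip(scores, resumes), reverse=True):
    print(f"{score:.2f}  {resume}")
```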
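
And a minimal sketch of an idempotent consumer using kafka-python with a Redis SETNX dedup key; the topic, group, and key scheme are assumptions:

```python
import redis
from kafka import KafkaConsumer  # kafka-python

def process_resume(raw: bytes):
    ...  # parse + index; stub for illustration

r = redis.Redis()
consumer = KafkaConsumer(
    "resume-uploads",
    bootstrap_servers="localhost:9092",
    group_id="resume-parser",
    enable_auto_commit=False,
)

# Messages are assumed keyed by document id.
for message in consumer:
    doc_id = message.key.decode()
    # First writer wins; a redelivered message finds the key and is skipped,
    # which gives effectively-once processing on top of at-least-once delivery.
    if r.set(f"processed:{doc_id}", 1, nx=True, ex=86400):
        process_resume(message.value)
    consumer.commit()
```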
- Performance Tuning:
- Stack Upgrade:
- Upgraded an 8-year-old Ruby/Rails monolith using iterative, zero-downtime strategies.
- Replaced Unicorn with Puma (thread-safe scaling) and transitioned from the Asset Pipeline to Webpacker.
- Rolled out compatibility tweaks for native extensions and deprecated methods/arguments, keeping code changes minimal.
- Hardware / VM
- Upgraded to NVMe SSDs for disk I/O-bound workloads -> Reduced CPU wait states -> Disk IOPS increased from 15k to 22k (random read/write)
- Enabled huge pages (2MB) for memory-intensive apps -> Improved TLB hit rate -> CPU utilization dropped from 85% to <70% sustained
- Switched to ARM-based Graviton3 instances -> Better price-performance -> Cost per 1000 requests reduced by 35%
- Infrastructure (AWS)
- Implemented spot instances for batch processing -> Cost savings -> EC2 costs decreased by 60% for non-critical workloads
- Right-sized RDS to r6gd.2xlarge -> Balanced memory/CPU -> Query throughput increased from 1.2k to 2.1k QPS
- Configured VPC flow logs -> Identified network bottlenecks -> Cross-AZ traffic reduced by 40%
- Docker
- Multi-stage builds -> Smaller images -> Image size reduced from 1.8GB to 450MB
- Set CPU limits (4 cores) -> Prevented noisy neighbors -> Container throttling events dropped from 12/hr to 0
- Switched to distroless base images -> Reduced attack surface -> CVE vulnerabilities decreased by 90%
- Language (Ruby)
- Enabled YJIT -> Faster execution -> Median request latency improved from 48ms to 29ms
- Tuned GC (RUBY_GC_HEAP_GROWTH_MAX_SLOTS=300k) -> Fewer GC pauses -> GC time per request reduced from 8ms to 3ms
- Adopted jemalloc -> Less fragmentation -> RSS memory stabilized at 1.2GB (was fluctuating 1-2GB)
- Database (PostgreSQL)
- Added partial indexes (WHERE status='active') -> Faster queries -> SELECT latency (p95) dropped from 120ms to 45ms (see the sketch after this sub-list)
- Tuned autovacuum (autovacuum_vacuum_scale_factor=0.1) -> Fewer dead tuples -> Vacuum runs decreased from 20/day to 5/day
- Enabled parallel queries (max_parallel_workers=8) -> Improved analytics -> COUNT(*) runtime reduced from 12s to 3.2s
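
A minimal sketch of applying such a partial index from Python via psycopg2; the table and column names are illustrative:

```python
import psycopg2

conn = psycopg2.connect("dbname=app")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run inside a transaction
with conn.cursor() as cur:
    # The index covers only active rows, so it stays small and hot.
    cur.execute(
        "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_subscriptions_active "
        "ON subscriptions (account_id) WHERE status = 'active'"
    )
    # Verify the planner actually uses it for the hot-path query.
    cur.execute(
        "EXPLAIN SELECT * FROM subscriptions "
        "WHERE account_id = 42 AND status = 'active'"
    )
    for line, in cur.fetchall():
        print(line)
```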
- Server (Puma)
- Adjusted workers:threads (4:8 -> 2:16) -> Better throughput -> Requests/sec increased from 850 to 1,100
- Enabled socket activation -> Zero-downtime restarts -> Deployment downtime reduced from 8s to 0s
- Set worker timeout (worker_timeout=30) -> Killed hung workers -> 5xx errors decreased by 75%
- Framework (Rails)
- Russian doll caching -> Fewer DB hits -> Cache hit rate improved from 65% to 92%
- Optimized ActiveRecord (pluck vs. select) -> Less memory -> Allocations/request dropped from 45k to 12k objects
- Enabled bootsnap -> Faster boots -> Application startup reduced from 12s to 4s
- Background Jobs (Sidekiq)
- Weighted queues (critical=5, default=1) -> Priority handling -> Critical job latency (p95) improved from 8s to 1.2s
- Set job expiration (30m) -> Redis memory control -> Redis memory usage stabilized at 800MB (was spiking to 2GB)
- Added idempotency keys -> Fewer duplicates -> Duplicate jobs dropped from 5% to 0.1%
- GraphQL
- Persisted queries -> Smaller payloads -> Network traffic reduced by 40%
- Dataloader batching -> Eliminated N+1 -> Resolver calls/query decreased from 32 to 5
- Query complexity limits (max_depth=10) -> Blocked expensive queries -> Timeout errors dropped by 90%
- Frontend (Angular)
- AOT compilation -> Faster rendering -> First Contentful Paint improved from 2.1s to 1.3s
- Lazy-loaded modules -> Smaller bundles -> Main.js size reduced from 1.4MB to 580KB
- OnPush change detection -> Less CPU usage -> Animation jank decreased from 12% to 2% frames dropped
- Build & Deployment
- Parallelized RSpec (--jobs 8) -> Faster CI -> Test suite runtime reduced from 18m to 6m
- Cached node_modules -> Less rebuilds -> Docker build time decreased from 5m to 90s
- Canary deployments (5% traffic) -> Safer releases -> Rollback rate dropped from 8% to 1%
- Scalability
- Read replicas -> Offloaded primary DB -> Primary DB CPU reduced from 80% to 45%
- HPA (CPU=70%) -> Auto-scaling pods -> Peak traffic capacity increased from 1k to 5k RPS
- Database sharding (by region) -> Reduced contention -> Write latency (p99) improved from 250ms to 90ms
- Observability
- Distributed tracing -> Faster debugging -> MTTR (Mean Time to Repair) reduced from 47m to 12m
- SLO-based alerts -> Fewer false positives -> Alert volume decreased by 70%
- Log sampling (10%) -> Cost control -> CloudWatch costs reduced by $1,200/month
Non-Tech Features:
- PR Review
- Hackathon
- BiWeekly Tech Talks
- Conferences
Drives:
- Better Service Design
- Sidekiq Pro Adoption
- NewRelic Adoption
- EC2 to ECS
- DB to S3 using AWS Glue