@tonylampada
Created January 26, 2026 21:52

Production Readiness Report - Support Debug Website (Roboflow)

Production Readiness Report - Support Debug Website

Author: Bar Shimshon (Support Team)
Project: roboflow-support
Review Date: 2026-01-26
Reviewer: Tony França / Jarbas (AI Assistant)


Executive Summary

The Support Debug Website is an internal Next.js application created by a support team member to help with debugging and customer investigations. While the initiative is commendable, the application has critical security vulnerabilities and architectural issues that must be resolved before it can be considered production-ready.

Verdict: ❌ NOT PRODUCTION READY


Requirements Analysis

1. Stateless Backend ⚠️ NEEDS IMPROVEMENT

Requirement: Backend must not persist data to local filesystem. All state should be in external storage.

Findings:

  • ✅ Production uses GCS via src/lib/storage.ts abstraction
  • ⚠️ Terminal sessions stored in-memory (src/lib/terminal-session.ts) - lost on restart
  • ⚠️ Audit logs written to local filesystem in dev mode

Files affected:

  • src/lib/storage.ts - Good abstraction, switches between GCS (prod) and filesystem (dev)
  • src/lib/terminal-session.ts - In-memory Map for session storage

Recommendation: Migrate terminal sessions to Redis or Firestore for persistence across restarts.
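
One way to prepare for that migration, sketched under assumptions: put the in-memory Map behind a small storage interface so that a Redis or Firestore backend can later replace it without touching callers. The TerminalSession fields, the SessionStore interface, and the InMemorySessionStore class are hypothetical names; the real session shape in src/lib/terminal-session.ts is not shown in this report.

```typescript
// Hypothetical session shape; the real fields live in src/lib/terminal-session.ts.
interface TerminalSession {
  id: string;
  userEmail: string;
  expiresAt: number; // epoch milliseconds
}

// Storage contract that a Map, Redis, or Firestore backend could all satisfy.
interface SessionStore {
  get(id: string): Promise<TerminalSession | undefined>;
  set(session: TerminalSession): Promise<void>;
  delete(id: string): Promise<void>;
}

// Dev-only backend: same behavior as today (state is lost on restart).
class InMemorySessionStore implements SessionStore {
  private readonly sessions = new Map<string, TerminalSession>();

  async get(id: string): Promise<TerminalSession | undefined> {
    const session = this.sessions.get(id);
    if (session && session.expiresAt <= Date.now()) {
      this.sessions.delete(id); // drop expired sessions lazily
      return undefined;
    }
    return session;
  }

  async set(session: TerminalSession): Promise<void> {
    this.sessions.set(session.id, session);
  }

  async delete(id: string): Promise<void> {
    this.sessions.delete(id);
  }
}
```

A production backend would implement the same three methods against the external store, and route handlers would depend only on SessionStore.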


2. Secrets Management ⚠️ NEEDS IMPROVEMENT

Requirement: Secrets must come from environment variables. Sensitive secrets should use Secret Manager.

Findings:

  • ✅ API keys loaded from environment variables
  • ⚠️ Not using Google Secret Manager
  • ❌ Hardcoded fallback secret in terminal-session.ts:
    const JWT_SECRET = process.env.TERMINAL_JWT_SECRET || 'your-secret-key-change-in-production';

Files affected:

  • src/lib/terminal-session.ts - Hardcoded fallback secret
  • src/lib/auth.ts - Uses env vars correctly
  • src/lib/intercom.ts, src/lib/linear.ts, src/lib/claude.ts - Use env vars correctly

Recommendation:

  1. Remove all hardcoded fallback secrets
  2. Migrate sensitive secrets to Google Secret Manager
  3. Use @google-cloud/secret-manager client library
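
As an illustration of the first recommendation, a minimal sketch of a fail-fast lookup that replaces the hardcoded fallback: the process refuses to start when the secret is missing instead of silently signing tokens with a known string. The requireSecret helper and the Env type are hypothetical names, not part of the reviewed codebase.

```typescript
// Environment shape; process.env is compatible with this type.
type Env = Record<string, string | undefined>;

// Returns the secret or throws, so a missing value is caught at startup.
function requireSecret(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

// Possible use in src/lib/terminal-session.ts (assumption, not the current code):
//   const JWT_SECRET = requireSecret(process.env, "TERMINAL_JWT_SECRET");
```

The same helper keeps working if the value is later injected from Secret Manager into the environment at deploy time.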

3. Production Data Access ✅ OK

Requirement: All access to production GCS/Firestore must go through the backend, not directly from frontend.

Findings:

  • ✅ All GCS operations go through API routes
  • ✅ All Firestore operations go through API routes
  • ✅ Frontend makes fetch calls to /api/* endpoints
  • ✅ No direct client-side GCP SDK usage

Files verified:

  • src/lib/storage.ts - Server-side only
  • src/lib/firestore.ts - Server-side only
  • src/app/api/* - All data access is server-side

4. No gcloud CLI Usage ❌ CRITICAL

Requirement: Backend must NOT use gcloud CLI commands. Must use client libraries.

Findings:

  • ❌ VIOLATION: Multiple files use child_process.exec() to run gcloud commands

Affected files:

  1. src/lib/gcp-auth.ts:
    const { stdout } = await execAsync('gcloud auth print-access-token');
    await execAsync(`gcloud config set project ${projectId}`);
  2. src/app/api/gcs-scan/route.ts:
    const { stdout } = await execAsync(`gcloud storage ls "gs://${bucket}/${prefix}**"`);
  3. src/app/api/dataset-recovery/route.ts:
    const { stdout } = await execAsync(`gcloud storage ls ...`);

Recommendation:

  1. Replace gcloud auth print-access-token with Application Default Credentials or Workload Identity
  2. Replace gcloud storage ls with @google-cloud/storage client library
  3. Remove all child_process.exec() calls for GCP operations
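
The second recommendation could look roughly like the sketch below. It lists object names by prefix through an object that exposes getFiles({ prefix }), which is how the Bucket class in @google-cloud/storage is called; the BucketLike interface and listObjectNames function are hypothetical names introduced here so the example runs without the library installed. Because the prefix travels as request data, no shell ever interprets it.

```typescript
// Minimal structural view of a bucket; intended to match the shape of
// Bucket.getFiles() in @google-cloud/storage (first tuple element is the file list).
interface BucketLike {
  getFiles(options: { prefix: string }): Promise<[Array<{ name: string }>, ...unknown[]]>;
}

// Lists every object whose name starts with `prefix`, like `gcloud storage ls "gs://bucket/prefix**"`.
async function listObjectNames(bucket: BucketLike, prefix: string): Promise<string[]> {
  const [files] = await bucket.getFiles({ prefix });
  return files.map((file) => file.name);
}

// Wiring with the real client (assumption, not executed here):
//   import { Storage } from "@google-cloud/storage";
//   const storage = new Storage(); // picks up Application Default Credentials
//   const names = await listObjectNames(storage.bucket(bucketName), prefix);
```

With Application Default Credentials the client library obtains tokens itself, so the gcloud auth print-access-token call in gcp-auth.ts would no longer be needed for storage access.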

5. Logging ⚠️ NEEDS IMPROVEMENT

Requirement: Logs should go to Google Cloud Logging with structured format, not console.log.

Findings:

  • ⚠️ 112 occurrences of console.log across the codebase
  • ⚠️ 67 occurrences of console.error
  • ❌ No structured logging
  • ❌ No Google Cloud Logging integration

Recommendation:

  1. Integrate @google-cloud/logging or use structured stdout logging
  2. Use log levels (info, warn, error) consistently
  3. Add request correlation IDs for tracing
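
A minimal sketch of the structured stdout option, assuming the service keeps running on Cloud Run, where JSON lines written to stdout are ingested by Cloud Logging and the severity and message fields are recognized. The formatLogEntry and logInfo helpers and the correlationId field name are hypothetical; correlationId would appear as an ordinary payload field.

```typescript
type Severity = "INFO" | "WARNING" | "ERROR";

// Builds one JSON log line; extra fields are merged first so they cannot
// overwrite severity, message, or correlationId.
function formatLogEntry(
  severity: Severity,
  message: string,
  correlationId: string,
  extra: Record<string, unknown> = {},
): string {
  return JSON.stringify({ ...extra, severity, message, correlationId });
}

// Thin wrapper: console.log is used only as the stdout transport for the JSON line.
function logInfo(message: string, correlationId: string, extra: Record<string, unknown> = {}): void {
  console.log(formatLogEntry("INFO", message, correlationId, extra));
}
```

A route handler could generate one correlation ID per request and pass it to every log call, which makes all lines for that request searchable together.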

6. Authentication ⚠️ NEEDS IMPROVEMENT

Requirement: Must use Google SSO. Must have allowlist for who can access (not just @roboflow.com domain).

Findings:

  • ✅ Google OAuth SSO implemented via NextAuth.js
  • ✅ Domain restriction to @roboflow.com in src/lib/auth.ts:
    if (!email || !email.endsWith('@roboflow.com')) {
      return false; // Deny sign-in
    }
  • ❌ NO USER ALLOWLIST - Any @roboflow.com email can access
  • ⚠️ Audit logging exists but is basic

Recommendation:

  1. Implement user allowlist (config file or database)
  2. Add role-based access control if needed
  3. Consider using Google Groups for access management
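
The allowlist check could be sketched as follows, layered on top of the existing domain restriction. The buildAllowlist and isAllowedUser helpers and the example addresses are hypothetical; where the list is stored (config file, database, or Google Group lookup) is left open.

```typescript
// Normalizes configured addresses once so lookups are case-insensitive.
function buildAllowlist(emails: readonly string[]): ReadonlySet<string> {
  return new Set(emails.map((email) => email.trim().toLowerCase()));
}

// Allows sign-in only for @roboflow.com addresses that are explicitly listed.
function isAllowedUser(email: string | null | undefined, allowlist: ReadonlySet<string>): boolean {
  if (!email) return false;
  const normalized = email.trim().toLowerCase();
  return normalized.endsWith("@roboflow.com") && allowlist.has(normalized);
}

// Possible use inside the NextAuth signIn callback in src/lib/auth.ts (assumption):
//   return isAllowedUser(user.email, ALLOWED_USERS);
```

Keeping the domain check alongside the allowlist means a misconfigured list entry from another domain still cannot sign in.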

7. Automated Deployment ✅ OK

Requirement: CI/CD must be configured for automated deployments.

Findings:

  • ✅ cloudbuild.yaml present and configured
  • ✅ Builds Docker image and deploys to Cloud Run
  • ✅ Service account configured

File: cloudbuild.yaml

steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/support-dashboard', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/support-dashboard']
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    args: ['gcloud', 'run', 'deploy', ...]

8. Security ❌ CRITICAL - COMMAND INJECTION VULNERABILITIES

Requirement: No security vulnerabilities, especially in user input handling.

Findings:

❌ CRITICAL: Command Injection in gcs-scan/route.ts

// Line 67 - User input directly in shell command
const { stdout } = await execAsync(`gcloud storage ls "gs://${bucket}/${prefix}**"`);

Attack vector:

  • prefix comes from request body without sanitization
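  • Attacker can inject "; rm -rf / # (closes the quoted argument and starts a new command) or $(malicious_command) (expanded by the shell even inside double quotes)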

❌ CRITICAL: Command Injection in dataset-recovery/route.ts

const { stdout } = await execAsync(`gcloud storage ls "gs://${bucket}/${prefix}"`);

Same vulnerability pattern.

❌ CRITICAL: Command Injection in jetson-diag/route.ts

await execAsync(`tar -xzf ${tempFile} -C ${tempDir}`);

If the tempFile or tempDir path is influenced by user input, this is exploitable.

Recommendation:

  1. IMMEDIATELY remove all exec() calls with user input
  2. Use client libraries instead of CLI commands
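  3. If external commands are absolutely necessary, pass arguments as an array via execFile() or spawn() (which do not invoke a shell by default) instead of interpolating them into a command string; treat escaping libraries as a last resort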
  4. Implement input validation and sanitization
  5. Consider using a security linter (e.g., eslint-plugin-security)
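
For the cases where an external program is still needed (such as tar in jetson-diag/route.ts), a hedged sketch of the pattern: validate input against a strict character allowlist and call the program with an argument array through Node's execFile, which does not spawn a shell by default. The assertSafePrefix and extractArchive helpers, the permitted character set, and the length limit are assumptions for illustration, not rules taken from the reviewed code.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Assumed policy: letters, digits, underscore, hyphen, dot, and slash only.
const SAFE_PREFIX = /^[A-Za-z0-9_\-./]*$/;

// Rejects anything outside the allowlist instead of trying to escape it.
function assertSafePrefix(prefix: string): string {
  if (prefix.length > 1024 || !SAFE_PREFIX.test(prefix)) {
    throw new Error("Invalid prefix");
  }
  return prefix;
}

// Each argument reaches tar as a single argv entry, so quotes, semicolons,
// and $(...) in a path are never interpreted as shell syntax.
async function extractArchive(tempFile: string, tempDir: string): Promise<void> {
  await execFileAsync("tar", ["-xzf", tempFile, "-C", tempDir]);
}
```

Validation is defense in depth here; the primary fix remains removing string-built shell commands altogether.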

Summary Table

Requirement | Status | Priority
Stateless Backend | ⚠️ | Medium
Secrets Management | ⚠️ | Medium
Production Data Access | ✅ | -
No gcloud CLI | ❌ | Critical
Logging | ⚠️ | Low
Authentication Allowlist | ⚠️ | High
Automated Deployment | ✅ | -
Security (Command Injection) | ❌ | Critical

Action Items (Priority Order)

Critical (Must fix before any production use)

  1. Remove command injection vulnerabilities

    • Replace all exec() calls with client libraries
    • Files: gcs-scan/route.ts, dataset-recovery/route.ts, jetson-diag/route.ts, gcp-auth.ts
  2. Remove gcloud CLI usage

    • Use @google-cloud/storage for GCS operations
    • Use Application Default Credentials for auth

High Priority

  1. Implement user allowlist
    • Add config for allowed users/groups
    • Don't rely solely on domain restriction

Medium Priority

  1. Remove hardcoded secrets

    • Remove fallback in terminal-session.ts
    • Consider migrating to Secret Manager
  2. Migrate terminal sessions to persistent storage

    • Use Redis or Firestore

Low Priority

  1. Implement structured logging
    • Integrate Cloud Logging
    • Add correlation IDs

Conclusion

The Support Debug Website shows good initiative and has some solid architectural decisions (storage abstraction, NextAuth.js for auth, CI/CD setup). However, the critical security vulnerabilities (command injection) and architectural violations (gcloud CLI in backend) make it unsuitable for production in its current state.

The most urgent fixes are:

  1. Remove all command injection vectors
  2. Replace gcloud CLI with client libraries
  3. Implement user allowlist

Once these are addressed, a follow-up review should be conducted before production deployment.
