@g-leech
Created December 15, 2025 19:05

ASAT

  • "AGI Safety & Alignment"
    • amplified oversight
    • interpretability
    • ASAT eng (automated alignment research)
    • Causal Incentives Working Group

Frontier Safety

  • Risk Assessment (evals, threat models, the framework)
  • Mitigations (e.g. banning accounts, refusal training, jailbreak robustness)
  • Loss of Control (control, alignment evals)

Gemini Safety

Voices of All in Alignment

AGI Safety Council

Responsibility and Safety Council
