Aaron Batilo abatilo

@abatilo
abatilo / deepspeed_train.py
Last active December 29, 2025 01:28
DeepSpeed + Hugging Face Transformers Training Script for Spin Tutorial
#!/usr/bin/env python3
"""
Hugging Face Transformers Distributed Training Example for Spin
Trains a GPT-2 model on WikiText-2 using distributed data parallel.
Designed to run on multiple nodes via Spin's SyncSet orchestration.
"""
import os
@abatilo
abatilo / train.py
Created December 28, 2025 23:25
Tiny Shakespeare DDP Training Script for Spin Tutorial
#!/usr/bin/env python3
"""
Tiny Shakespeare DDP Training Example for Spin
Minimal causal-masked Transformer trained on Tiny Shakespeare using PyTorch DDP.
Uses character-level tokenization (~65 unique characters) for simplicity.
Designed to run on multiple nodes via Spin's SyncSet orchestration.
Usage with spinctl:
# Push this script to your SyncSet pods
@abatilo
abatilo / interprocess.md
Created October 6, 2024 05:01 — forked from jphsd/interprocess.md
Interprocess channels in Go by using named pipes

How to use channels across different processes in Go

A couple of code samples to show how a named pipe can be used to extend Go's channel paradigm for use between different processes running on a system.

  • interprocess1.go details a single byte channel.
  • interprocess2.go details a channel that passes slices of bytes.
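
As a rough illustration of the idea (this is not the code from interprocess1.go or interprocess2.go), a channel can be bridged over a named pipe by pumping values into the FIFO on one side and back out of it on the other. The sketch below assumes a Unix-like system where `syscall.Mkfifo` is available and newline-delimited string messages. The helper names `pumpToPipe`, `pumpFromPipe`, and `roundTrip` are hypothetical. For brevity both ends run in one process; in real use each end would live in a separate program that shares the FIFO path.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// pumpToPipe writes every string received on in to the FIFO at path,
// one message per line. Opening a FIFO for writing blocks until some
// reader has opened the other end.
func pumpToPipe(path string, in <-chan string) error {
	f, err := os.OpenFile(path, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	for msg := range in {
		if _, err := fmt.Fprintln(f, msg); err != nil {
			return err
		}
	}
	return nil
}

// pumpFromPipe reads lines from the FIFO at path and sends them on out.
// It closes out once every writer has closed its end (EOF) or on error.
func pumpFromPipe(path string, out chan<- string) error {
	defer close(out)
	f, err := os.OpenFile(path, os.O_RDONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		out <- sc.Text()
	}
	return sc.Err()
}

// roundTrip (hypothetical demo helper) creates a FIFO in a temporary
// directory, sends msgs through it, and returns what the reader saw.
// Error handling is deliberately minimal.
func roundTrip(msgs []string) ([]string, error) {
	dir, err := os.MkdirTemp("", "fifo-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "chan.fifo")
	if err := syscall.Mkfifo(path, 0600); err != nil {
		return nil, err
	}

	// Pre-fill and close the input channel so the writer never waits on us.
	in := make(chan string, len(msgs))
	for _, m := range msgs {
		in <- m
	}
	close(in)
	out := make(chan string)

	writeErr := make(chan error, 1)
	readErr := make(chan error, 1)
	go func() { writeErr <- pumpToPipe(path, in) }()
	go func() { readErr <- pumpFromPipe(path, out) }()

	var got []string
	for m := range out {
		got = append(got, m)
	}
	if err := <-readErr; err != nil {
		return nil, err
	}
	if err := <-writeErr; err != nil {
		return nil, err
	}
	return got, nil
}
```

Called from a `main` function, `roundTrip([]string{"a", "b"})` hands back the same two strings in order, because a FIFO preserves byte order. Messages that themselves contain newlines would need a different framing, such as a length prefix.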

Note that opening a write channel will return two channels -

@abatilo
abatilo / Dockerfile
Last active June 1, 2024 18:23
Start a training process with sshd - https://sliceofexperiments.com
FROM ubuntu:22.04
# Install sshd and xz to unzip the s6-overlay tarball
RUN <<EOF
apt-get update;
apt-get install -yq openssh-server xz-utils;
EOF
# Install s6-overlay
ADD https://github.com/just-containers/s6-overlay/releases/download/v3.1.6.2/s6-overlay-noarch.tar.xz /tmp
export ZSH="$HOME/.oh-my-zsh"
ZSH_THEME="pygmalion"
plugins=(
mise
fzf
git
kubectl
)
@abatilo
abatilo / cmd.go
Created April 2, 2023 21:42
Build index and search index with OpenAI
package gogptindex
import (
	"context"
	_ "embed"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"os/signal"
apiVersion: apps.kruise.io/v1alpha1
kind: ImagePullJob
metadata:
  name: sudokurace
  namespace: kruise-system
spec:
  image: '911907402684.dkr.ecr.us-west-2.amazonaws.com/sudokurace:' # Tag set by kustomization.yaml
  pullSecrets:
    # Must match https://gist.github.com/abatilo/6b287265d541d06da567893c1522999f#file-imagepulljob-tf-L3
    - kruise-ecr-token
locals {
  kruise_ecr_token_updater_service_account = "kruise-ecr-token-updater"
  kruise_ecr_token_secret_name             = "kruise-ecr-token"
  kruise_ecr_token_updater_script          = <<EOF
ECR_TOKEN=`aws ecr get-login-password --region $${AWS_REGION}`
NAMESPACE_NAME=${kubernetes_namespace.kruise_system.metadata[0].name}
kubectl delete secret --ignore-not-found $DOCKER_SECRET_NAME -n $NAMESPACE_NAME
kubectl create secret docker-registry $DOCKER_SECRET_NAME \
@abatilo
abatilo / openkruise.tf
Created March 7, 2023 00:00
Terraform for installing OpenKruise with AWS ECR credentials
locals {
  kruise_ecr_token_updater_service_account = "kruise-ecr-token-updater"
  kruise_ecr_token_secret_name             = "kruise-ecr-token"
  kruise_ecr_token_updater_script          = <<EOF
ECR_TOKEN=`aws ecr get-login-password --region $${AWS_REGION}`
NAMESPACE_NAME=${kubernetes_namespace.kruise_system.metadata[0].name}
kubectl delete secret --ignore-not-found $DOCKER_SECRET_NAME -n $NAMESPACE_NAME
kubectl create secret docker-registry $DOCKER_SECRET_NAME \
@abatilo
abatilo / ratelimit.yml
Created October 9, 2022 20:58
Example of getting app rate limits
name: Print rate limits
on:
  push:
jobs:
  rate-limit:
    runs-on: ubuntu-latest
    steps:
      - name: Generate token
        id: generate_token