@citizenrich
Last active December 30, 2025 17:59
Python prompt.

Python coding tips for Claude Code, etc.

  • Keep explanations brief and to the point.
  • I am mostly familiar with Bash, Stata, Python, and Go.
  • Do not provide code snippets and examples of code unless asked.

Looks

  • Do not use emojis.

Frontend

  • Prefer Cloudflare Pages and HTMX + Tailwind CSS.
  • Avoid npm/webpack/JS builds.

Infrastructure

  • Cloudflare (domains, R2, D1, Workers, Containers, Pages, AI Gateway, Queues, KV, Calls/RealtimeKit).
  • Use D1 unless it hits limitations or the project needs Postgres-specific features (JSONB, FTS, PostGIS).
  • Cloudflare Workers support Python, but consider whether TypeScript Workers are more reliable.
  • When creating R2 buckets, enable Data Catalog explicitly (not default).
  • Multiple cloud accounts/domains may exist across providers.
  • For low latency, prefer WebRTC (for audio/video), WebSockets, SSE, or UDP over REST.

Python coding

  • Use Python 3.12.
  • uv, uv workspaces, ty (LSP), pytest, ruff, type hints on all functions.
  • FastAPI + SQLModel (Pydantic + SQLAlchemy combined) for APIs and databases.
  • Prefer serverless approaches that work in Containers or Workers over virtual machines.
  • Always try SQLModel first and fall back to Python dataclasses if necessary.
  • Use exchange_calendars for trading holidays by exchange.
  • Concise comments explaining intent. Tests with mocks/fixtures. 4-space indent.
  • Be mindful of timezones; Modal.com uses UTC/GMT.
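
The timezone point can be sketched with the standard library alone. This is a minimal illustration, not a prescribed pattern: the helper name `to_utc` and the fixed UTC-05:00 offset are assumptions made up for the example, and real code would normally use `zoneinfo.ZoneInfo` so daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone


def to_utc(moment: datetime) -> datetime:
    """Normalize an aware datetime to UTC; reject naive values instead of guessing."""
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("naive datetime: attach a timezone before converting")
    return moment.astimezone(timezone.utc)


# Hypothetical fixed offset, used only so the example has no tz database dependency.
EXAMPLE_OFFSET = timezone(timedelta(hours=-5))

local_close = datetime(2025, 1, 15, 16, 0, tzinfo=EXAMPLE_OFFSET)
utc_close = to_utc(local_close)
print(utc_close.isoformat())  # 2025-01-15T21:00:00+00:00
```

Storing and comparing timestamps in UTC, and converting to local time only at the edges, keeps results consistent with services that run in UTC.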

Python analytics

  • Use gymnasium, not OpenAI gym (deprecated).
  • Prefer polars over pandas; use numpy for numerical compute.
  • Store tabular data as parquet (zstd compression if available).
  • DuckDB for analytical SQL queries on parquet/polars. dbt for SQL.
  • Prefer hardware-accelerated libraries: polars (SIMD), numpy (BLAS/LAPACK), cuDF/RAPIDS (GPU).
  • Prefer Prefect for orchestration and workflows over Airflow.

Python performance

  • Prefer async/await for I/O-bound work (HTTP, DB, file).
  • Use multiprocessing or ProcessPoolExecutor for CPU-bound (GIL bypass).
  • Profile first (cProfile, py-spy, scalene) — don't guess bottlenecks.
  • Batch external calls (DB queries, API requests) — N+1 is the common killer.
  • Connection pooling for databases and HTTP (httpx, asyncpg, sqlalchemy pool).
  • Use generators/iterators for large datasets — avoid materializing full lists.
  • Prefer list/dict comprehensions over explicit append loops for simple transformations: usually faster and clearer.
  • NumPy/polars vectorization over Python loops for numerical work.
  • functools.lru_cache or @cache for expensive pure functions.
  • Use slots=True on dataclasses with many instances (reduces per-instance memory by roughly 40%, depending on the fields).
  • Avoid repeated attribute lookups in tight loops — assign to local variable.
  • f-strings are fastest for string formatting.
  • orjson for JSON (10-20× faster, handles serialize + parse).
  • pysimdjson only if parsing large (>1MB) JSON is the bottleneck.
  • uvloop as event loop for async workloads (2-4× faster than default).
  • Struct packing for binary protocols — struct module or msgspec.
  • Memory views for zero-copy buffer manipulation.

LLMs

  • Use massive.com (formerly Polygon.io). There is already a running R2 bucket ('warrenbucket') with historical data.
  • GPU Service: Modal.com (primary) + RunPod.io (backup)
  • Use huggingface.co
  • Vector DB: use Qdrant locally. Keep in mind a potential sync to Cloudflare Vectorize in later iterations.
  • Use a versioned, compatible embedding library that can be used in both local development and Cloudflare Vectorize.
  • Use wandb for logging and debugging language model inputs, outputs, and traces.

Panel data and regression approaches

  • Prefer linear probability model over logit; always use robust (HC) standard errors.
  • Panel fixed effects: use linearmodels (PanelOLS) -- better than statsmodels for clustered SEs and IV.
  • Mixed effects (statsmodels.MixedLM): hierarchical data with random intercepts/slopes (repeated measures, nested structures like students within schools). Use when group-level variance matters or predicting for new groups.
  • Use fixed effects over mixed effects when within-group variation is what identifies causal effects.
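
The fixed-versus-mixed distinction can be written out as a short sketch in standard textbook notation. The symbols are assumptions introduced here, not taken from the notes: i indexes groups, t indexes observations within a group, and bars denote group means.

```latex
% Fixed effects: group intercepts \alpha_i are unrestricted and may correlate with x_{it}
y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}

% Within transformation: subtracting group means removes \alpha_i,
% so \beta is identified from within-group variation only
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i)

% Random-intercept (mixed) model: the group effect u_i is a draw from a distribution
y_{it} = \mu + x_{it}'\beta + u_i + \varepsilon_{it}, \qquad u_i \sim N(0, \sigma_u^2)
```

The mixed model needs u_i to be uncorrelated with x_{it} for a consistent estimate of \beta, which is why fixed effects are the safer choice when that assumption is doubtful. In exchange, the mixed model estimates \sigma_u^2 and so can predict for groups not in the sample.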