TED Vortex (Teodor-Eugen Duțulescu) 0-vortex

:shipit:
looking out for #0
@disler
disler / README_MINIMAL_PROMPT_CHAINABLE.md
Last active July 27, 2025 06:29
Minimal Prompt Chainables - Zero LLM Library Sequential Prompt Chaining & Prompt Fusion

Minimal Prompt Chainables

Sequential prompt chaining in one method with context and output back-referencing.
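To make the idea concrete, here is a minimal sketch of that pattern (names are assumed for illustration; this is not the gist's actual MinimalChainable API): each prompt template can reference shared context variables and the outputs of earlier prompts, and the chain runs the prompts in order.

# Hypothetical sketch of sequential prompt chaining with output back-referencing.
# prompt_llm is any callable that takes a prompt string and returns the model's reply.
def run_chain(prompt_llm, context, prompts):
    outputs = []
    for template in prompts:
        prompt = template
        for key, value in context.items():
            # fill {{key}} placeholders from the shared context
            prompt = prompt.replace("{{" + key + "}}", str(value))
        for i, previous in enumerate(outputs):
            # back-reference earlier outputs via {{output[0]}}, {{output[1]}}, ...
            prompt = prompt.replace("{{output[" + str(i) + "]}}", previous)
        outputs.append(prompt_llm(prompt))
    return outputs

For example, run_chain(llm, {"topic": "GPU inference"}, ["Summarize {{topic}} in one paragraph.", "List three follow-up questions about: {{output[0]}}"]) feeds the first answer into the second prompt.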

Files

  • main.py - start here - full example using MinimalChainable from chain.py to build a sequential prompt chain
  • chain.py - contains zero library minimal prompt chain class
  • chain_test.py - tests for chain.py, you can ignore this
  • requirements.py - python requirements

Setup

@thesamesam
thesamesam / xz-backdoor.md
Last active December 25, 2025 23:58
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and believed to be accurate, but, as said, we don't yet know everything about what's going on.

Update: I've disabled comments as of 2025-01-26 so that, a year on, people don't get notifications every time someone suggests a correction. Folks are still free to email corrections, of course.

Background

@Artefact2
Artefact2 / README.md
Last active December 31, 2025 05:44
GGUF quantizations overview

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggml-org/llama.cpp#5962

In the meantime, use the largest quantization that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.
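If none of the published quantizations fit, a rough sketch of producing a smaller one with llama.cpp's quantization tool (the binary is named quantize or llama-quantize depending on the llama.cpp version; file names here are illustrative):

./llama-quantize model-f16.gguf model-Q4_K_S.gguf Q4_K_S   # quantize an f16 GGUF down to Q4_K_S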

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

@0-vortex
0-vortex / llama2-mac-gpu.sh
Created July 21, 2023 19:40 — forked from adrienbrault/llama2-mac-gpu.sh
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build it
LLAMA_METAL=1 make
# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
wget "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}"
@adrienbrault
adrienbrault / llama2-mac-gpu.sh
Last active April 8, 2025 13:49
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM. UPDATE: see https://twitter.com/simonw/status/1691495807319674880?s=20
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build it
make clean
LLAMA_METAL=1 make
# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
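The gist preview is truncated here; the lines below are a hedged sketch of the remaining steps (the run flags are assumptions to verify against ./main --help in your llama.cpp checkout), not the gist's verbatim content.

wget "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}"
# Run an interactive prompt with Metal GPU offload
./main -m "./${MODEL}" -t 8 -n 128 -ngl 1 -p "Hello, how are you?"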
@altryne
altryne / requirements.txt
Created July 9, 2023 06:45
GPT-4 code interpreter requirements.txt
['absl-py==1.4.0',
'affine==2.4.0',
'aiohttp==3.8.1',
'aiosignal==1.3.1',
'analytics-python==1.4.post1',
'anyio==3.7.1',
'anytree==2.8.0',
'argcomplete==1.10.3',
'argon2-cffi-bindings==21.2.0',
'argon2-cffi==21.3.0',
# Source: https://gist.github.com/vfarcic/78c1d2a87baf31512b87a2254194b11c
###############################################################
# How To Create A Complete Internal Developer Platform (IDP)? #
# https://youtu.be/Rg98GoEHBd4 #
###############################################################
# Additional Info:
# - DevOps MUST Build Internal Developer Platform (IDP): https://youtu.be/j5i00z3QXyU
# - How To Create A "Proper" CLI With Shell And Charm Gum: https://youtu.be/U8zCHA-9VLA
@Hellisotherpeople
Hellisotherpeople / blog.md
Last active December 27, 2025 05:31
You probably don't know how to do Prompt Engineering, let me educate you.

You probably don't know how to do Prompt Engineering

(This post could also be titled "Features missing from most LLM front-ends that should exist")

Apologies for the snarky title, but there has been a huge amount of discussion around so-called "Prompt Engineering" these past few months on all kinds of platforms. Much of it comes from individuals peddling an awful lot of "Prompting" and very little "Engineering".

Most of these discussions are little more than users finding that writing more creative and complicated prompts can help them solve a task that a simpler prompt could not. I claim this is not Prompt Engineering. This is not to say that crafting good prompts is easy, but it does not involve any kind of sophisticated modification to the general "template" of a prompt.

Others, who I think do deserve to call themselves "Prompt Engineers" (and an awful lot more than that), have been writing about and utilizing the rich new eco-system

@sapphi-red
sapphi-red / vite-4.3-perf.md
Last active February 12, 2024 13:33
Vite 4.3 performance (2)

Terms

  • times
    • start up time: the time from "the command is executed" to "the load event is triggered in the browser".
    • root HMR time: the time from "the root file is changed" to "that file is executed in the browser".
    • leaf HMR time: the time from "the leaf file is changed" to "that file is executed in the browser".
  • cold/hot start
    • cold start: the dependency optimization cache is deleted before each run
    • hot start: the dependency optimization cache already exists before each run

Summary

@levihuayuzhang
levihuayuzhang / alarm-m1-vm-build.md
Created January 2, 2023 07:20
Arch Linux ARM build for M1 (Apple Silicon) VMs

Arch Linux ARM build for M1 (Apple Silicon) VMs

This guide is for building your own Arch Linux ARM VM image and running it in QEMU, UTM, Parallels...

Preparations in Linux

What you need in the Linux phase (a rough sketch of how these tools fit together follows the list):

1. qemu-img
2. fdisk
3. kpartx
4. bsdtar
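
A rough sketch of how these tools are typically used for this kind of build (illustrative sizes, file names, and mount points, not the guide's exact steps):

qemu-img create -f raw alarm.img 8G                                # create a blank raw disk image for the VM
fdisk alarm.img                                                    # partition it (boot + root)
sudo kpartx -av alarm.img                                          # map the partitions as /dev/mapper loop devices
sudo bsdtar -xpf ArchLinuxARM-aarch64-latest.tar.gz -C /mnt/root   # unpack the Arch Linux ARM rootfs onto the mounted root partition (/mnt/root is an assumed mount point)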