
@Basten7
Basten7 / ggml-metal-optimized-4.m
Created August 11, 2025 08:24
Evolution of a new Metal 3 backend for llama.cpp
#import "ggml-metal.h"
#import "ggml-impl.h"
#import "ggml-backend-impl.h"
#import "ggml-metal-impl.h"
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

import argparse
import math
import mlx.core as mx
import mlx.nn as nn
from tqdm import tqdm
from mlx_lm.utils import load
from pathlib import Path
def eval_ppl(model, data, batch_size=32):
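The preview above cuts off at the `eval_ppl` signature. For context, perplexity is the exponential of the mean per-token negative log-likelihood over the evaluation set. A minimal pure-Python sketch of that arithmetic (the helper name `perplexity` and the uniform-model example are illustrative, not taken from the gist):

```python
import math

def perplexity(token_nlls):
    # Perplexity is exp of the average per-token
    # negative log-likelihood over the evaluation set.
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative example: a uniform model over a 4-symbol vocabulary
# assigns every token an NLL of log(4), so perplexity comes out
# at (approximately) 4.
nlls = [math.log(4)] * 10
print(perplexity(nlls))
```

In the gist's setting, the per-token NLLs would come from cross-entropy losses computed with `mlx` over batches of `data`, accumulated before the final exponential.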
@awni
awni / mlx_distributed_deepseek.md
Last active January 9, 2026 05:35
Run DeepSeek R1 or V3 with MLX Distributed

Setup

On every machine in the cluster, install openmpi and mlx-lm:

conda install conda-forge::openmpi
pip install -U mlx-lm

Next, download the pipeline-parallel run script to the same path on every machine: