This is a comparison between https://github.com/shisa-ai/ja-mt-bench-harness, which aims to be faithful to the original JA MT-Bench, and the version used in Swallow Evaluation Instruct v202510: https://github.com/swallow-llm/swallow-evaluation-instruct/releases/tag/v202510
Both frameworks use an OpenAI-compatible API, but they run and score JA MT‑Bench in materially different ways. The FastChat-based harness stays close to the original MT‑Bench pipeline (question file layout, judge prompts, and single-sample judging), while Swallow's lighteval task intentionally modifies the evaluation: Japanese-enforced judge prompts, a Japanese system prompt for model generation, multi-sample averaging (N=5), output truncation by character length, a different judge model, and additional metrics. These differences alone can easily move scores by multiple points.
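As a rough illustration of the structural difference (not code from either repo), the sketch below contrasts single-sample judging with Swallow-style multi-sample averaging plus character truncation. The function names `generate_answer` and `judge_score` are placeholders, and `max_chars` is an illustrative value, not the actual limit used by Swallow.

```python
# Conceptual sketch only; the real harnesses use FastChat's answer/judgment
# scripts and Swallow's lighteval task code respectively.
from statistics import mean


def score_single_sample(question: str, generate_answer, judge_score) -> float:
    """Original MT-Bench style: one answer per question, one judge call."""
    answer = generate_answer(question)                # default system prompt
    return judge_score(question, answer)              # judge sees the full answer


def score_multi_sample(
    question: str,
    generate_answer,
    judge_score,
    n_samples: int = 5,       # Swallow averages over N=5 generations
    max_chars: int = 2048,    # illustrative cap; the actual limit differs
) -> float:
    """Swallow-style: N generations, character truncation, mean judge score."""
    scores = []
    for _ in range(n_samples):
        answer = generate_answer(question)            # Japanese system prompt in Swallow
        answer = answer[:max_chars]                   # truncate by character length
        scores.append(judge_score(question, answer))  # Japanese-enforced judge prompt
    return mean(scores)
```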
Key takeaways:
- Prompting and judging are different (a language constraint in the judge prompts and a Japanese system prompt for generation in Swallow's version)