Dhruvil Patel dhruvilp

πŸ’­
πŸ‘¨β€πŸ’» working on something really cool
View GitHub Profile
@dhruvilp
dhruvilp / microgpt.py
Created February 12, 2026 01:36 — forked from karpathy/microgpt.py
microgpt
"""
The most atomic way to train and inference a GPT in pure, dependency-free Python.
This file is the complete algorithm.
Everything else is just efficiency.
@karpathy
"""
import os # os.path.exists
import math # math.log, math.exp
@dhruvilp
dhruvilp / Dockerfile
Last active November 10, 2025 05:17
vllm docling granite model
# Use an AWS Deep Learning Container (DLC) or a vLLM-specific image as the base.
# Ensure the base image has the necessary CUDA drivers and PyTorch.
# Pin a specific tag that matches your CUDA version rather than :latest if needed.
FROM vllm/vllm-openai:latest
# Copy the pre-downloaded model weights into the container image
# (the source path must live inside the Docker build context).
COPY /mnt/models/granite-docling-258M /app/local_model
WORKDIR /app
# The entrypoint command will use the local directory path for the --model argument.
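A minimal sketch of how that serving command could be supplied, assuming the vllm/vllm-openai base image keeps its default OpenAI-compatible server entrypoint (the served model name below is an assumption):
# Assumption: the base image's entrypoint already launches the vLLM OpenAI server,
# so CMD only appends its arguments.
CMD ["--model", "/app/local_model", "--served-model-name", "granite-docling-258m"]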
@dhruvilp
dhruvilp / notes-2.md
Created November 5, 2025 03:24
gpt-oss-20b-fine-tuning-q3-max-part-1

Here's a complete, battle-tested, end-to-end script designed specifically for fine-tuning the MXFP4-quantized MoE GPT-oss-20B model on your 4×A10G setup (96 GB VRAM total). It leverages QLoRA for memory efficiency while handling MXFP4 quantization properly; a minimal setup sketch follows the script header below.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Fine-tune MXFP4-quantized MoE GPT-oss-20B with QLoRA
Hardware: 4Γ— NVIDIA A10G (24GB VRAM each)
Key Tech: bitsandbytes (MXFP4), PEFT (QLoRA), FlashAttention-2, DeepSpeed ZeRO-3
"""
@dhruvilp
dhruvilp / quant-gpt-oss.py
Last active October 15, 2025 20:44
quant gpt oss local
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import time
model_path = './gpt-oss-model-local'
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",              # assumed; the preview is truncated here
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
)
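Continuing the snippet, a hedged sketch of loading the quantized model and timing a generation; the prompt and generation settings are illustrative, not from the gist:

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quantization_config,
    device_map="auto",
)

prompt = "Explain quantization in one sentence."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.time()
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(f"generation took {time.time() - start:.1f}s")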
@dhruvilp
dhruvilp / webpageloader.py
Created October 6, 2025 21:06
crawl4ai web page loader
import asyncio
import json
import os
from base64 import b64decode
from typing import List, Dict, Optional, Any
from pydantic import BaseModel, Field
from crawl4ai import (
    AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode,
    JsonCssExtractionStrategy, LLMExtractionStrategy, LLMConfig,
)  # further imports in the full gist are cut off in this preview
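A minimal usage sketch with the pieces imported above; the URL and settings are illustrative, not from the gist:

async def fetch_markdown(url):
    # Launch a headless browser session and fetch one page as markdown.
    browser_cfg = BrowserConfig(headless=True)
    run_cfg = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
    async with AsyncWebCrawler(config=browser_cfg) as crawler:
        result = await crawler.arun(url=url, config=run_cfg)
        return result.markdown

if __name__ == "__main__":
    print(asyncio.run(fetch_markdown("https://example.com")))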
@dhruvilp
dhruvilp / t_to_sb.txt
Created March 25, 2025 03:49
Tomcat to Spring Boot
(Hugging Face Inference Providers playground, Fireworks provider, text generation, model: deepseek-ai/DeepSeek-V3-0324)

How can I convert an app running on a Tomcat 8 (Catalina) server to a Spring Boot app with JDK 17?
@dhruvilp
dhruvilp / thinking_tokens.py
Created February 18, 2025 16:01 — forked from zainhas/thinking_tokens.py
Extract ONLY thinking tokens from DeepSeek-R1
import os
from together import Together

# Read the key from the environment so the snippet runs as-is.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])

question = "Which is larger 9.9 or 9.11?"

# Stop at the closing think tag so only the reasoning tokens are returned.
thought = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": question}],
    stop=["</think>"],
)
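Because generation stops at the closing </think> tag, the returned content holds only the model's reasoning. Reading it follows the standard OpenAI-style response shape:

# The completion stops at '</think>', so this is the chain-of-thought text only.
thinking = thought.choices[0].message.content
print(thinking)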
@dhruvilp
dhruvilp / aimlapi-starter.py
Created November 18, 2024 15:14
AIML API Code Snippet
# pip install openai
import os
from openai import OpenAI

# Prefer an environment variable (name assumed) over hard-coding the key.
aiml_api_key = os.getenv("AIML_API_KEY", "<YOUR_AIML_API_KEY>")

client = OpenAI(
    api_key=aiml_api_key,
    base_url="https://api.aimlapi.com",
)
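A short usage sketch against this client; the model id is an assumption, since the available names depend on the AIML API catalog:

# Hypothetical model id; substitute one exposed by your AIML API plan.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)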
"age";"job";"marital";"education";"default";"housing";"loan";"contact";"month";"day_of_week";"duration";"campaign";"pdays";"previous";"poutcome";"emp.var.rate";"cons.price.idx";"cons.conf.idx";"euribor3m";"nr.employed";"y"
56;"housemaid";"married";"basic.4y";"no";"no";"no";"telephone";"may";"mon";261;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
57;"services";"married";"high.school";"unknown";"no";"no";"telephone";"may";"mon";149;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
37;"services";"married";"high.school";"no";"yes";"no";"telephone";"may";"mon";226;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
40;"admin.";"married";"basic.6y";"no";"no";"no";"telephone";"may";"mon";151;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
56;"services";"married";"high.school";"no";"no";"yes";"telephone";"may";"mon";307;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
45;"services";"married";"basic.9y";"unknown";"no";"no";"telephone";"may";"mon";198;1;999;0;"nonexistent";1.1;93.994;-36.4
@dhruvilp
dhruvilp / info.txt
Created August 15, 2023 20:45
SaaS Stack
If you're building a SaaS in 2023:
β—† framework: Next.js
β—† ui: @shadcn/ui + TailwindCSS
β—† redis/queues: Upstash
β—† time-series data & charts: Tinybird + Tremor
β—† ORM: Prisma
β—† auth: NextAuth.js
β—† database: PlanetScale
β—† emails: Resend