@criminact
Created April 9, 2024 20:44
ConvertingToGGUF.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"gpuType": "T4",
"authorship_tag": "ABX9TyN6YmLxl7AaDpfIrSBX2iBk",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/criminact/db4434b4da7c7d6099b10514fdf19863/convertingtogguf.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"1. Clone the llama.cpp repository"
],
"metadata": {
"id": "RA3Nsy7uMP8V"
}
},
{
"cell_type": "code",
"source": [
"!git clone https://github.com/ggerganov/llama.cpp.git"
],
"metadata": {
"id": "EfZsBnXFxnOr"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"2. Install required packages"
],
"metadata": {
"id": "wynqpTvzMkl0"
}
},
{
"cell_type": "code",
"source": [
"!pip install -r llama.cpp/requirements.txt"
],
"metadata": {
"id": "F8_BIUqiMVzN"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!pip install huggingface_hub"
],
"metadata": {
"id": "nwiUDHe1SGqK"
},
"execution_count": null,
"outputs": []
},
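{
"cell_type": "markdown",
"source": [
"Optional (an assumption, only needed if the model repo is gated on Hugging Face): authenticate with an access token before downloading."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Only needed for gated repos: log in with a Hugging Face access token\n",
"from huggingface_hub import notebook_login\n",
"notebook_login()"
],
"metadata": {},
"execution_count": null,
"outputs": []
},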
{
"cell_type": "markdown",
"source": [
"3. Download the model weights to a local directory"
],
"metadata": {
"id": "i6W8FfDWOOW3"
}
},
{
"cell_type": "code",
"source": [
"from huggingface_hub import snapshot_download\n",
"\n",
"model_id = \"mistralai/Mistral-7B-Instruct-v0.1\"\n",
"snapshot_download(repo_id=model_id, local_dir=\"mistral-hf\", local_dir_use_symlinks=False, revision=\"main\")"
],
"metadata": {
"id": "hfyKhHyFOG-I"
},
"execution_count": null,
"outputs": []
},
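{
"cell_type": "markdown",
"source": [
"Optional sanity check (a sketch, assuming the download above succeeded): list the local directory to confirm the weight shards, tokenizer, and config files are present."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"!ls -lh mistral-hf"
],
"metadata": {},
"execution_count": null,
"outputs": []
},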
{
"cell_type": "markdown",
"source": [
"4. Convert the Hugging Face weights to GGUF. This generates a ggml-model-f16.gguf file inside the local model directory"
],
"metadata": {
"id": "it0JyJB2Vl6q"
}
},
{
"cell_type": "code",
"source": [
"!python3 llama.cpp/convert-hf-to-gguf.py /content/mistral-hf --outfile /content/mistral-hf/ggml-model-f16.gguf --outtype f16"
],
"metadata": {
"id": "JjsJiwnpNREp"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"5. Build llama.cpp to obtain the `quantize` binary used for further quantization (Q2, Q3, Q4, Q5)"
],
"metadata": {
"id": "X-JMfeudV92-"
}
},
{
"cell_type": "code",
"source": [
"!cd llama.cpp && make"
],
"metadata": {
"id": "DRk1zhRvV17h"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!llama.cpp/quantize /content/mistral-hf/ggml-model-f16.gguf /content/mistral-hf/ggml-model-q4_km.gguf Q4_K_M"
],
"metadata": {
"id": "56rdiW8sYu1M"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!llama.cpp/quantize /content/mistral-hf/ggml-model-f16.gguf /content/mistral-hf/ggml-model-q5_km.gguf Q5_K_M"
],
"metadata": {
"id": "2d-Hwa-pkB9L"
},
"execution_count": null,
"outputs": []
},
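{
"cell_type": "markdown",
"source": [
"Optional: compare the file sizes of the FP16 and quantized GGUF models (assumes the conversion and quantization cells above ran to completion)."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"!ls -lh /content/mistral-hf/*.gguf"
],
"metadata": {},
"execution_count": null,
"outputs": []
},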
{
"cell_type": "code",
"source": [
"!llama.cpp/main -m /content/mistral-hf/ggml-model-q4_km.gguf -p \"Building a website can be done in 10 simple steps:\\nStep 1:\" -n 400 -e"
],
"metadata": {
"id": "GnUksYwslIlG"
},
"execution_count": null,
"outputs": []
}
]
}