Install and Run DeepSeek-R1 Locally with Ollama on Arch Linux

Installing the DeepSeek Model with Ollama on Arch Linux

This guide provides a step-by-step process to install and run the DeepSeek model using Ollama on Arch Linux.

Prerequisites

Ensure your system meets the following requirements for running the DeepSeek model:

Hardware Requirements

Before running the DeepSeek-R1 model, make sure your hardware meets the specifications for the model you wish to run. Here’s an overview of the different versions and their hardware requirements:

Full-Scale Models

These models require substantial computational resources, particularly in terms of VRAM and a multi-GPU setup.

  • DeepSeek-R1-Zero: 671B parameters, ~1,342 GB VRAM, Multi-GPU setup (e.g., NVIDIA A100 80GB x16)
  • DeepSeek-R1: 671B parameters, ~1,342 GB VRAM, Multi-GPU setup (e.g., NVIDIA A100 80GB x16)

Distilled Models

These models are optimized for lower resource usage while maintaining strong performance.

  • DeepSeek-R1-Distill-Qwen-1.5B: 1.5B parameters, ~3.5 GB VRAM, NVIDIA RTX 3060 12GB or higher
  • DeepSeek-R1-Distill-Qwen-7B: 7B parameters, ~16 GB VRAM, NVIDIA RTX 4080 16GB or higher
  • DeepSeek-R1-Distill-Llama-8B: 8B parameters, ~18 GB VRAM, NVIDIA RTX 4080 16GB or higher

Quantized Models (4-bit)

Quantized models reduce memory usage and computational load, making them suitable for less powerful hardware.

  • DeepSeek-R1-Distill-Qwen-1.5B: 1.5B parameters, ~1 GB VRAM, NVIDIA RTX 3050 8GB or higher
  • DeepSeek-R1-Distill-Qwen-7B: 7B parameters, ~4 GB VRAM, NVIDIA RTX 3060 12GB or higher
  • DeepSeek-R1-Distill-Qwen-14B: 14B parameters, ~8 GB VRAM, NVIDIA RTX 4080 16GB or higher

For a detailed list of models and their specifications, refer to the DeepSeek Model Library.
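If you are unsure how much VRAM your GPU has, you can query it with nvidia-smi once the driver from Step 1 is installed. This is just a quick sanity check, assuming a single NVIDIA GPU:

# Show each GPU's name and its total/used VRAM
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv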

Step 1: Install NVIDIA Graphics Drivers

If you don't have the NVIDIA drivers installed, use the following commands to install the necessary drivers:

  1. Install the nvidia-dkms package together with the headers for your kernel (e.g. linux-headers for the stock kernel), which DKMS needs to build the module. Use a full upgrade (-Syu) rather than a bare -Sy to avoid a partial upgrade:

    sudo pacman -Syu nvidia-dkms linux-headers
  2. Regenerate the initramfs:

    sudo mkinitcpio -P
  3. After installation, verify that the system recognizes your NVIDIA GPU:

    nvidia-smi

If successful, the nvidia-smi command will display your GPU's information.
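If nvidia-smi instead reports that it cannot communicate with the driver, check whether the kernel module is actually loaded; after a fresh driver install, a reboot usually resolves this:

# List loaded NVIDIA kernel modules; empty output means the module is not loaded yet
lsmod | grep nvidia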

Step 2: Install Ollama

To install Ollama, run the following command:

curl -fsSL https://ollama.com/install.sh | sh

Once the installation is complete, verify it by checking the installed version:

ollama --version

This command should return the version of Ollama installed on your system.
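The install script normally sets Ollama up as a systemd service (assumed here to be named ollama, as created by the official installer). If it is not already running, you can start it and enable it at boot:

# Start the Ollama service now and enable it on boot
sudo systemctl enable --now ollama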

Step 3: Run the DeepSeek Model

Ollama supports various versions of the DeepSeek model. Choose a version that aligns with your hardware capabilities.

To run the DeepSeek model, execute the following command (adjust the model version as needed):

ollama run deepseek-r1:1.5b

This command will initiate the model run. Ensure your system meets the hardware requirements for the selected model to avoid performance issues.

Once the model is running, type any query or request you wish to ask DeepSeek, then press Enter to see its response.
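A few related Ollama CLI commands are handy at this point (the exact model tags available depend on the Ollama library):

# Download a model without starting a chat session
ollama pull deepseek-r1:7b

# List the models stored locally
ollama list

Inside the interactive session, type /bye to exit.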

Activate Ollama API

Ensure Ollama is running:

sudo systemctl status ollama

Next, create or edit the Ollama systemd service:

sudo vi /etc/systemd/system/ollama.service

Under the [Service] section, add the following environment variable so the API accepts requests from any origin:

[Service]
Environment="OLLAMA_ORIGINS=*"
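
Note that OLLAMA_ORIGINS only relaxes the origin (CORS) check. If you also want the API reachable from other machines rather than just localhost, Ollama additionally honours an OLLAMA_HOST variable for the bind address; a minimal sketch, assuming you keep the default port 11434:

[Service]
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_HOST=0.0.0.0:11434"

Skip the OLLAMA_HOST line if local access is all you need.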

Reload the systemd configuration so the change is picked up, then restart Ollama:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Verify the API is activated by checking the response:

curl localhost:11434

You should see:

Ollama is running

Test the API by sending a POST request that asks a question:

curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Iran?"
      }
    ]
  }'

The response will look similar to this (IDs and timestamps will differ):

{"id":"chatcmpl-460","object":"chat.completion","created":1740145272,"model":"deepseek-r1:1.5b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"\u003cthink\u003e\n\n\u003c/think\u003e\n\nThe capital of Iran is Tehran."},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":12,"total_tokens":22}}

This confirms that the API is active and responding to queries. You can now interact with the DeepSeek model using any programming language that supports HTTP requests.
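Since the endpoint returns JSON, you can extract just the assistant's reply directly on the command line. A small sketch using jq (install it with pacman if needed), matching the response format shown above:

# Ask a question and print only the assistant's answer
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-r1:1.5b", "messages": [{"role": "user", "content": "What is the capital of Iran?"}]}' \
  | jq -r '.choices[0].message.content'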
