This guide provides a step-by-step process to install and run the DeepSeek model using Ollama on Arch Linux.
Installing and running DeepSeek involves the following steps:
- Install NVIDIA Graphics Drivers (for hardware acceleration)
- Install Ollama
- Run the DeepSeek Model
- Activate Ollama API
Before running the DeepSeek-R1 model, make sure your hardware meets the specifications for the model you wish to run. Here’s an overview of the different versions and their hardware requirements:
The full-scale models require substantial computational resources, particularly VRAM, and a multi-GPU setup.
- DeepSeek-R1-Zero: 671B parameters, ~1,342 GB VRAM, Multi-GPU setup (e.g., NVIDIA A100 80GB x16)
- DeepSeek-R1: 671B parameters, ~1,342 GB VRAM, Multi-GPU setup (e.g., NVIDIA A100 80GB x16)
The distilled models are optimized for lower resource usage while maintaining strong performance.
- DeepSeek-R1-Distill-Qwen-1.5B: 1.5B parameters, ~3.5 GB VRAM, NVIDIA RTX 3060 12GB or higher
- DeepSeek-R1-Distill-Qwen-7B: 7B parameters, ~16 GB VRAM, NVIDIA RTX 4080 16GB or higher
- DeepSeek-R1-Distill-Llama-8B: 8B parameters, ~18 GB VRAM, NVIDIA RTX 4080 16GB or higher
Quantized models reduce memory usage and computational load, making them suitable for less powerful hardware.
- DeepSeek-R1-Distill-Qwen-1.5B: 1.5B parameters, ~1 GB VRAM, NVIDIA RTX 3050 8GB or higher
- DeepSeek-R1-Distill-Qwen-7B: 7B parameters, ~4 GB VRAM, NVIDIA RTX 3060 12GB or higher
- DeepSeek-R1-Distill-Qwen-14B: 14B parameters, ~8 GB VRAM, NVIDIA RTX 4080 16GB or higher
For a detailed list of models and their specifications, refer to the DeepSeek Model Library.
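As a quick sanity check before picking a model, you can ask the GPU itself how much VRAM it has; this uses nvidia-smi, which comes with the NVIDIA drivers installed in the next step:
nvidia-smi --query-gpu=name,memory.total --format=csv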
If you don't have the NVIDIA drivers installed, use the following commands to install the necessary drivers:
- Install the nvidia-dkms package, along with the headers for your installed kernel (DKMS needs them to build the module; linux-headers matches the default linux kernel):
sudo pacman -S nvidia-dkms linux-headers
- Regenerate the initramfs:
sudo mkinitcpio -P
- Reboot so the new driver modules are loaded, then verify that the system recognizes your NVIDIA GPU:
nvidia-smi
If successful, the nvidia-smi command will display your GPU's information.
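If nvidia-smi reports an error instead, a quick follow-up check is whether the nvidia kernel module actually got loaded after the rebuild:
lsmod | grep nvidia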
To install Ollama, run the following command:
curl -fsSL https://ollama.com/install.sh | sh
Once the installation is complete, verify it by checking the installed version:
ollama --version
This command should return the version of Ollama installed on your system.
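The install script normally also registers Ollama as a systemd service on Arch; you can confirm it came up with:
systemctl is-active ollama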
Ollama supports various versions of the DeepSeek model. Choose a version that aligns with your hardware capabilities.
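For example, you can download a larger distilled variant ahead of time without starting a chat; the deepseek-r1:7b tag here follows the naming used in the Ollama model library:
ollama pull deepseek-r1:7b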
To run the DeepSeek model, execute the following command (adjust the model version as needed):
ollama run deepseek-r1:1.5b
This command pulls the model on first use and starts an interactive chat session. Ensure your system meets the hardware requirements for the selected model to avoid performance issues.
Once the model is running, type any query or request you wish to ask DeepSeek, then press Enter to see the response (type /bye to exit the session).
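For scripted, non-interactive use, ollama run also accepts the prompt directly as an argument and prints the reply to standard output:
ollama run deepseek-r1:1.5b "What is the capital of Iran?"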
To use the model over HTTP, activate the Ollama API. First, ensure Ollama is running:
sudo systemctl status ollama
Next, create or edit the Ollama systemd service:
sudo vi /etc/systemd/system/ollama.service
Under the [Service] section, add the following environment variable so the API accepts cross-origin requests from any origin:
[Service]
Environment="OLLAMA_ORIGINS=*"
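Note that OLLAMA_ORIGINS only relaxes the CORS check for browser-based clients. If you also want to reach the API from other machines on your network, Ollama honors the OLLAMA_HOST variable as well; binding to all interfaces is a common, if less locked-down, choice:
Environment="OLLAMA_HOST=0.0.0.0"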
Reload the systemd units so the change is picked up, then restart Ollama:
sudo systemctl daemon-reload
sudo systemctl restart ollama
Verify the API is activated by checking the response:
curl localhost:11434
You should see:
Ollama is running
Test the API by sending a POST request to ask a question:
curl -X POST http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-r1:1.5b",
"messages": [
{
"role": "user",
"content": "What is the capital of Iran?"
}
]
}'
The response will look similar to:
{"id":"chatcmpl-460","object":"chat.completion","created":1740145272,"model":"deepseek-r1:1.5b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"\u003cthink\u003e\n\n\u003c/think\u003e\n\nThe capital of Iran is Tehran."},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":12,"total_tokens":22}}This confirms that the API is active and responding to queries. You can now interact with the DeepSeek model using any programming language that supports HTTP requests.