@jxsl13
Created February 11, 2026 20:52

Fix: Intel Arc A310 /dev/dri/renderD128 Randomly Disappearing on Proxmox

GPU: Intel Corporation DG2 [Arc A310] [8086:56a6] (rev 05)
Platform: Proxmox VE 8.x+

The Problem

The /dev/dri/renderD128 device randomly vanishes, breaking hardware transcoding (Jellyfin, Plex, etc.) and any other GPU-dependent workloads.

Root Causes

  1. GuC/HuC Firmware Crash – Arc GPUs rely heavily on the GuC (command submission) and HuC (media) firmware. If the firmware fails to load or crashes, the device disappears.
  2. PCIe Power Management – ASPM puts the PCIe link into a low-power state (L0s/L1) that the GPU sometimes fails to wake from.
  3. GPU Display Power States – The i915 driver's display power-saving states (DC states) let the GPU enter deep sleep.

The Fix (Step by Step)

1. Check Kernel Version (6.2+ required, 6.5+ recommended)

uname -r

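If the kernel is too old, install a newer one (on Proxmox VE 8 the kernel packages are named proxmox-kernel-*, not pve-kernel-*):

apt update && apt install proxmox-kernel-6.8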
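The version check in this step can also be scripted. The following is a minimal sketch, assuming GNU `sort -V` is available (it is on Debian-based Proxmox); the function name `version_at_least` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# version_at_least CURRENT MINIMUM
# Succeeds (exit status 0) when CURRENT >= MINIMUM.
# Hypothetical helper; relies on GNU sort's -V (version sort).
version_at_least() {
    current=${1%%-*}   # drop a "-4-pve" style suffix, keep "6.8.12"
    minimum=$2
    lowest=$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

if version_at_least "$(uname -r)" 6.2; then
    echo "kernel is new enough for DG2"
else
    echo "kernel is older than 6.2, update first"
fi
```

Version sort matters here because a plain string comparison would rank 6.10 below 6.2.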

2. Force-load the i915 Module on Boot

echo "i915" > /etc/modules-load.d/i915.conf

3. Set i915 Module Options

cat > /etc/modprobe.d/i915.conf << 'EOF'
options i915 enable_guc=3 enable_dc=0 force_probe=56a6
EOF

Note

Replace 56a6 with your GPU's PCI device ID (the part after 8086: in the square brackets). Find it with: lspci -nn | grep -Ei 'vga|display'

Option            Purpose
enable_guc=3      Enables GuC submission and HuC firmware loading (required for Arc GPUs to function properly)
enable_dc=0       Disables all display power saving states (prevents GPU deep sleep)
force_probe=56a6  Forces the i915 driver to bind to the specific DG2 chip (kernels 6.2+ probe DG2 by default, so there it should be redundant but harmless)
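After the reboot in step 7 it can be useful to confirm that the driver actually picked these options up. A minimal sketch, assuming the i915 module exposes its parameters under /sys/module/i915/parameters (readable by root only); the function name `print_i915_params` is hypothetical.

```shell
#!/bin/sh
# print_i915_params [DIR]
# Prints the three options from this guide as name=value lines.
# DIR defaults to the sysfs parameter directory of the loaded i915 module.
print_i915_params() {
    dir=${1:-/sys/module/i915/parameters}
    for name in enable_guc enable_dc force_probe; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            # Module not loaded, or not running as root.
            printf '%s=<unreadable>\n' "$name"
        fi
    done
}

print_i915_params
```

If the values do not match /etc/modprobe.d/i915.conf, the initramfs was probably not rebuilt before rebooting.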

4. Disable PCIe ASPM

Edit /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"

Note

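If you pass the GPU through to a VM (PCI passthrough), also enable the IOMMU: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_aspm=off". LXC containers share the host's /dev/dri nodes and do not need the IOMMU options.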

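Then update GRUB:

update-grub

Hosts that boot with systemd-boot instead of GRUB (some ZFS-on-root UEFI installs) read the kernel command line from /etc/kernel/cmdline; add pcie_aspm=off there and run proxmox-boot-tool refresh.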
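Once the host has rebooted, the running kernel's command line shows whether the parameter was really applied. A small sketch that reads /proc/cmdline; the helper name `cmdline_has` is an assumption for illustration.

```shell
#!/bin/sh
# cmdline_has PARAM [FILE]
# Succeeds when PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline.
cmdline_has() {
    param=$1
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 2
    # One word per line, then an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

if cmdline_has pcie_aspm=off; then
    echo "pcie_aspm=off is active"
else
    echo "pcie_aspm=off is NOT on the running kernel's command line"
fi
```

Matching whole words avoids a false positive from a longer parameter that merely contains the same text.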

5. Add udev Rules for Stable Permissions

cat > /etc/udev/rules.d/99-intel-arc.rules << 'EOF'
KERNEL=="renderD128", GROUP="render", MODE="0666"
KERNEL=="card*", GROUP="video", MODE="0666"
EOF

6. Verify Firmware is Present

ls /lib/firmware/i915/dg2_*

Expected output (exact versions may differ; there should be at least one DMC, GuC, and HuC file):

/lib/firmware/i915/dg2_dmc_ver2_08.bin
/lib/firmware/i915/dg2_guc_70.bin
/lib/firmware/i915/dg2_huc_gsc.bin

These are shipped with Proxmox's pve-firmware package. Do NOT install firmware-misc-nonfree — it will try to remove critical Proxmox packages (proxmox-ve, pve-firmware, kernel packages).
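Because the exact file versions vary between pve-firmware releases, a check by firmware kind is more robust than comparing file names. A minimal sketch, assuming the dg2_<kind>_* naming shown above; `missing_dg2_firmware` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_dg2_firmware [DIR]
# Prints one line per firmware kind (dmc, guc, huc) that has no
# dg2_<kind>_* file in DIR. Prints nothing when all three are present.
missing_dg2_firmware() {
    dir=${1:-/lib/firmware/i915}
    for kind in dmc guc huc; do
        found=no
        for f in "$dir"/dg2_"$kind"_*; do
            if [ -e "$f" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "$kind"
        fi
    done
}

missing=$(missing_dg2_firmware)
if [ -z "$missing" ]; then
    echo "all DG2 firmware kinds present"
else
    echo "missing DG2 firmware kinds: $missing"
fi
```

If anything is reported missing, reinstalling or updating pve-firmware is the safer route, for the reason given above.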

7. Rebuild initramfs & Reboot

update-initramfs -u -k all
reboot

Post-Reboot Verification

# Check device exists
ls -la /dev/dri/

# Verify driver is loaded
lsmod | grep i915

# Check GuC/HuC firmware loaded successfully
dmesg | grep -i "guc\|huc"
# Expected: "GuC firmware ... loaded" and "HuC firmware ... authenticated"

# Inspect GPU details (replace 56a6 with your device ID)
lspci -v -d 8086:56a6

# Test hardware video acceleration
vainfo --display drm --device /dev/dri/renderD128
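The commands above assume the Arc card is renderD128. On a host that also has an integrated GPU, the card may instead appear as renderD129 or higher. The sketch below looks the node up by PCI ID, assuming the usual sysfs layout in which /sys/class/drm/renderD*/device/uevent contains a PCI_ID=VVVV:DDDD line in upper-case hex; `find_render_node` is a made-up name.

```shell
#!/bin/sh
# find_render_node PCI_ID [DRM_CLASS_DIR]
# Prints the name of the first render node (e.g. renderD128) whose PCI
# device matches PCI_ID (vendor:device, e.g. 8086:56a6).
# Returns 1 when no node matches.
find_render_node() {
    want=$(printf '%s' "$1" | tr 'a-f' 'A-F')   # uevent uses upper-case hex
    base=${2:-/sys/class/drm}
    for node in "$base"/renderD*; do
        [ -r "$node/device/uevent" ] || continue
        if grep -qxF "PCI_ID=$want" "$node/device/uevent"; then
            basename "$node"
            return 0
        fi
    done
    return 1
}

find_render_node 8086:56a6 || echo "no render node found for 8086:56a6"
```

If the result is not renderD128, substitute the printed name in the vainfo command, the udev rule, and the watchdog script.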

How It Works: Three Layers of Protection

Layer 1 (PCIe Bus):      pcie_aspm=off      → Bus stays awake
Layer 2 (GPU Hardware):  enable_dc=0        → GPU stays awake
Layer 3 (GPU Driver):    enable_guc=3 +     → Driver works correctly
                         force_probe=56a6     and always detects GPU

Boot sequence with fixes applied:

BIOS → GRUB → Kernel → initramfs → i915 driver → /dev/dri/renderD128 ✅
        │        │         │            │
        │        │         │            ├─ enable_guc=3     → GPU command submission active
        │        │         │            ├─ enable_dc=0      → No deep sleep
        │        │         │            └─ force_probe=56a6 → GPU forcefully recognized
        │        │         │
        │        │         └─ Loads i915 + firmware + options
        │        │
        │        └─ pcie_aspm=off → PCIe bus stays active
        │
        └─ Kernel parameters passed

Trade-off: The GPU will consume slightly more power at idle (~5-10W), but it won't disappear anymore.
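Layer 1 only helps if ASPM really ends up disabled on the GPU's link. As far as the kernel's parameter documentation describes it, pcie_aspm=off tells the kernel to leave the firmware's ASPM configuration untouched rather than to switch ASPM off actively, so checking the link state is worthwhile. The sketch below extracts the ASPM field from the LnkCtl line of `lspci -vv` output (run as root); the helper name `aspm_state` is hypothetical and the parsing assumes the usual "LnkCtl: ASPM ...;" layout.

```shell
#!/bin/sh
# aspm_state
# Reads `lspci -vv` output on stdin and prints the ASPM field of each
# LnkCtl line, e.g. "Disabled" or "L1 Enabled".
# LnkCtl2/LnkCtl3 lines are ignored because the pattern requires "LnkCtl:".
aspm_state() {
    sed -n 's/.*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Usage on the host (replace 56a6 with your device ID):
#   lspci -vv -d 8086:56a6 | aspm_state
```

If this still reports L0s or L1 as enabled with pcie_aspm=off set, disabling ASPM in the BIOS/UEFI setup is the likely next step.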

Optional: Watchdog Script (Last Resort)

If the issue persists despite all the above fixes, add a watchdog that automatically reloads the driver:

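cat > /usr/local/bin/gpu-watchdog.sh << 'EOF'
#!/bin/bash
# Reload the i915 driver if the render node has vanished.
if [ ! -e /dev/dri/renderD128 ]; then
    logger -t gpu-watchdog "renderD128 missing, reloading i915"
    # modprobe -r fails while any process still holds the module open.
    if modprobe -r i915 && sleep 2 && modprobe i915; then
        logger -t gpu-watchdog "i915 reloaded"
    else
        logger -t gpu-watchdog "i915 reload failed, a reboot may be needed"
    fi
fi
EOF
chmod +x /usr/local/bin/gpu-watchdog.sh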

# Run every 5 minutes via cron
echo "*/5 * * * * root /usr/local/bin/gpu-watchdog.sh" > /etc/cron.d/gpu-watchdog

Files Modified (Summary)

File                                   Purpose
/etc/modules-load.d/i915.conf          Force-load i915 module at boot
/etc/modprobe.d/i915.conf              i915 driver options (GuC, no sleep, force probe)
/etc/default/grub                      Disable PCIe ASPM via kernel parameter
/etc/udev/rules.d/99-intel-arc.rules   Stable device permissions