@daryltucker
Created December 17, 2025 22:57
NVIDIA Kernel 470.256.02 Linux Kernel 6.12.43+deb13-amd64 DKMS Fixes

Patching NVIDIA 470.256.02 for Linux Kernel 6.12 -> 6.13 Transition (Debian 13)

This assumes you already have the following patches applied and that DKMS was building the driver before the kernel update:

0001-Fix-conftest-to-ignore-implicit-function-declaration.patch
0002-Fix-conftest-to-use-a-short-wchar_t.patch
0003-Fix-conftest-to-use-nv_drm_gem_vmap-which-has-the-se.patch
nvidia-470xx-fix-gcc-15.patch
kernel-6.10.patch
kernel-6.12.patch

Target Environment

  • OS: Linux (Debian 13 / Trixie / Sid)
  • Kernel: 6.12.x / 6.13.x
  • Compiler: GCC 14.x
  • Driver: NVIDIA 470.256.02 (Legacy)

Identifiable Errors

You will likely hit one or more of the following blockers:

  1. Missing Symbol Table for RDMA: make: *** [/usr/src/ofa_kernel/Module.symvers] Error 1
  2. Symbol Pollution / Duplicate Exports: ERROR: modpost: net/vmw_vsock/vsock: 'vsock_addr_validate' exported twice.
  3. Compiler Mismatch: The compiler used to compile the kernel (gcc 14) does not match the current compiler.
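
To check which of these blockers a failed build actually hit, the DKMS build log can be scanned for the three messages above. This is a minimal sketch: the function name and the tag names it prints are made up for illustration, and the log path in the usage note is only the location DKMS typically uses.

```shell
#!/bin/sh
# classify_dkms_log (hypothetical helper): read a build log on stdin and
# print one tag per blocker from the list above that appears in it.
classify_dkms_log() {
    log=$(cat)
    # Each case matches a literal substring anywhere in the log text.
    case $log in *'ofa_kernel/Module.symvers'*) echo rdma-symvers ;; esac
    case $log in *'exported twice'*) echo duplicate-export ;; esac
    case $log in *'does not match the current compiler'*) echo cc-mismatch ;; esac
}
```

Assuming the usual DKMS layout, it could be run as `classify_dkms_log < /var/lib/dkms/nvidia/470.256.02/build/make.log`.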

The "In-House" Success Workflow

1. Align the Toolchain

Ensure the kernel and the driver are using the same GCC version.

  • Edit /etc/dkms/framework.conf or the driver's dkms.conf:
    export CC="gcc-14"
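
Before pinning CC, it can help to confirm which GCC major version built the running kernel. The sketch below assumes the Debian-style compiler name (for example `gcc-14`) that appears in /proc/version; the function name is hypothetical and other distributions may format this string differently.

```shell
#!/bin/sh
# kernel_gcc_major (hypothetical helper): read a /proc/version-style string
# on stdin and print the major version taken from a "gcc-NN" compiler name.
kernel_gcc_major() {
    sed -n 's/.*gcc-\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

On the target environment described above, `kernel_gcc_major < /proc/version` would be expected to print the major version to use in `CC="gcc-<major>"`.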

2. Prune Unused Enterprise Modules

The nvidia-peermem module (RDMA/InfiniBand) is the primary source of ofa_kernel errors.

  • Remove the directory: rm -rf /usr/src/nvidia-470.256.02/nvidia-peermem
  • Edit dkms.conf: Remove or comment out all entries for BUILT_MODULE_NAME[4] and DEST_MODULE_LOCATION[4] (or those which mention peermem).
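
The dkms.conf edit can be scripted rather than done by hand. The sketch below looks up whichever index is assigned to nvidia-peermem instead of assuming it is 4, then comments out every entry that uses that index. The function name is hypothetical, and the entry format (`BUILT_MODULE_NAME[N]="nvidia-peermem"`) is an assumption about how the driver's dkms.conf is written; a `.bak` copy is kept.

```shell
#!/bin/sh
# comment_out_peermem (hypothetical helper): comment out all dkms.conf
# array entries that share an index with the nvidia-peermem module.
comment_out_peermem() {
    conf=$1
    # Find the index N in BUILT_MODULE_NAME[N]="nvidia-peermem".
    idx=$(sed -n 's/^BUILT_MODULE_NAME\[\([0-9]*\)]="\{0,1\}nvidia-peermem.*/\1/p' "$conf" | head -n 1)
    # Nothing to do if the module is not listed (or already commented out).
    [ -n "$idx" ] || return 0
    # Prefix every uncommented NAME[N]= line with '#', keeping a .bak copy.
    sed -i.bak "/^[A-Z_]*\[$idx]=/s/^/#/" "$conf"
}
```

It would be invoked as root, for example `comment_out_peermem /usr/src/nvidia-470.256.02/dkms.conf`.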

3. Sanitize the Build Logic

Prevent the NVIDIA conftest.sh from searching for external RDMA headers.

  • Edit /usr/src/nvidia-470.256.02/conftest.sh: Locate and comment out lines setting MLNX_OFED_KERNEL_DIR or searching for /usr/src/ofa_kernel.
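
To find the lines to comment out, a grep for the two strings named above is usually enough. This is a small sketch with a hypothetical function name; it only lists matching lines with their line numbers and changes nothing.

```shell
#!/bin/sh
# find_ofed_lines (hypothetical helper): print "line-number:text" for every
# line of the given file that mentions the OFED variable or directory.
find_ofed_lines() {
    grep -n -e 'MLNX_OFED_KERNEL_DIR' -e '/usr/src/ofa_kernel' "$1" || true
}
```

For example, `find_ofed_lines /usr/src/nvidia-470.256.02/conftest.sh` lists the candidate lines so each can be reviewed before it is commented out.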

4. Fix Symbol Pollution (The modpost Hammer)

To keep modpost's "exported twice" reports for networking/virt symbols from failing the build, add KBUILD_MODPOST_WARN=1 to the make invocation:

  • Update the MAKE command in dkms.conf:
    MAKE[0]="'make' -j`nproc` NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=${kernelver} IGNORE_CC_MISMATCH='1' KBUILD_MODPOST_WARN='1' modules"

5. Execute Clean Build

Flush the DKMS tree to remove stale artifacts and trigger the build:

sudo dkms remove nvidia/470.256.02 --all
sudo CC=gcc-14 dkms install nvidia/470.256.02
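
After the install, `dkms status` should list the module as installed for the running kernel. The check below is a sketch: the function name is hypothetical, and it assumes the status line contains the driver version, the kernel version, and the text `: installed`, which may differ between DKMS releases.

```shell
#!/bin/sh
# nvidia_dkms_installed (hypothetical helper): read `dkms status` output on
# stdin and succeed if driver 470.256.02 is installed for kernel $1.
nvidia_dkms_installed() {
    grep '470\.256\.02' | grep -F -- "$1" | grep -q ': installed'
}
```

A typical call would be `dkms status | nvidia_dkms_installed "$(uname -r)" && echo OK`.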

Technical Insight: Why This Works

The 470 series is a legacy driver branch. Removing the nvidia-peermem directory and its associated conftest logic removes the dependency on ofa_kernel, which is what produces the Module.symvers error. Setting KBUILD_MODPOST_WARN=1 asks modpost to warn rather than fail, which allowed the build to complete here even though the Debian header metadata contains duplicate symbol definitions, a common situation during the transition to kernel 6.13.
