@cpburnz
Last active April 17, 2026 17:05
Strix Halo (AMD Ryzen AI Max+ 395) mixed eGPUs Guide

Here's a guide for setting up mixed eGPUs via USB4/Thunderbolt on a Strix Halo (AMD Ryzen AI Max+ 395) machine. It assumes CachyOS (an Arch Linux derivative) with the Limine bootloader. I did not test this with a desktop environment or display manager installed, nor did I try the video outputs on the eGPUs. My use case is running this machine as an LLM server.

Hardware:

  • Computer:
    • Minisforum MS-S1 MAX AI
  • eGPU docks:
    • Minisforum DEG2
    • Razer Core X V2
  • GPUs:
    • AMD Radeon RX 7900 XTX
    • NVIDIA GeForce RTX 5090

MS-S1

I updated the BIOS from 1.03 to 1.06 because the release notes indicated that 1.05 fixed some Thunderbolt issues. I followed the official guide "SHWSA_1.06_260104B/Update BIOS guide.docx" (downloaded from the Product Help page) together with the write-up "Minisforum MS-S1 Max BIOS Update from Linux (No Windows Required)".

I had to disable ReBAR in the BIOS for the NVIDIA card to work.

The two USB4 v2 ports on the back of the machine do not seem to behave identically. One port only provides a prefetchable bridge window, which is enough for the AMD card but not for the NVIDIA card. The other port provides a non-prefetchable bridge window, which the NVIDIA card requires. If you see dmesg errors like the following, this might be the cause. On the MS-S1 the fix was to swap the USB4 ports the two eGPUs were plugged into:

[    6.744373] pci 0000:6a:00.0: BAR 1 [mem 0x9000000000-0x900fffffff 64bit pref]: assigned
[    6.744393] pci 0000:6a:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: can't assign; no space
[    6.744394] pci 0000:6a:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: failed to assign
[    6.744394] pci 0000:6a:00.0: BAR 0 [mem size 0x04000000]: can't assign; no space
[    6.744395] pci 0000:6a:00.0: BAR 0 [mem size 0x04000000]: failed to assign
[    6.744395] pci 0000:6a:00.0: BAR 3 [mem 0x9010000000-0x9011ffffff 64bit pref]: assigned
[    6.744410] pci 0000:6a:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: can't assign; no space
[    6.744411] pci 0000:6a:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: failed to assign
[    6.744412] pci 0000:6a:00.0: ROM [mem size 0x00080000 pref]: can't assign; no space
[    6.744412] pci 0000:6a:00.0: ROM [mem size 0x00080000 pref]: failed to assign
[    6.744413] pci 0000:6a:00.0: VF BAR 0 [mem size 0x00040000 64bit pref]: can't assign; no space
[    6.744413] pci 0000:6a:00.0: VF BAR 0 [mem size 0x00040000 64bit pref]: failed to assign
[    6.744415] pci 0000:6a:00.0: BAR 5 [io  size 0x0080]: can't assign; no space
[    6.744416] pci 0000:6a:00.0: BAR 5 [io  size 0x0080]: failed to assign
[    6.744416] pci 0000:6a:00.0: BAR 1 [mem 0x9000000000-0x900fffffff 64bit pref]: releasing
[    6.744417] pci 0000:6a:00.0: BAR 3 [mem 0x9010000000-0x9011ffffff 64bit pref]: releasing
...
[    6.834725] nvidia 0000:6a:00.0: enabling device (0000 -> 0002)
[    6.834750] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR0 is 0M @ 0x0 (PCI:0000:6a:00.0)
[    6.834764] nvidia 0000:6a:00.0: probe with driver nvidia failed with error -1
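
To check quickly whether a boot hit this problem, the kernel log can be filtered for the failure messages shown above. This is a minimal sketch: find_bar_failures is a hypothetical helper name, and the patterns are copied from this log excerpt, so other kernel versions may word the messages differently.

```shell
# Print every PCI resource-assignment failure found in a kernel log on stdin.
# The patterns mirror the dmesg excerpt above; they are not an exhaustive list.
find_bar_failures() {
    grep -E "pci [0-9a-f:.]+: .*(can't assign; no space|failed to assign)|NVRM: BAR[0-9] is 0M"
}

# Usage on a live system (reading the kernel log may require root):
#   sudo dmesg | find_bar_failures
```

No output means none of these messages were logged. It does not prove the bridge windows are sized correctly.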

RX 7900 XTX

The NITRO+ AMD Radeon RX 7900 XTX Vapor-X 24GB has a little BIOS switch that needs to be set to the middle position to allow software control of power settings.

Kernel Boot Parameters

Add the "iommu=pt", "pci=realloc" and "pcie_ports=native" kernel boot parameters. For Limine, add the following to /etc/default/limine:

KERNEL_CMDLINE[default]+="iommu=pt pci=realloc pcie_ports=native"

Regenerate /boot/limine.conf:

sudo limine-mkinitcpio
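
After rebooting, it is worth confirming that the running kernel actually received the three parameters. The sketch below compares a command line string against the list from this section. missing_boot_params is a hypothetical helper name.

```shell
# Print each expected parameter that is absent from the given kernel
# command line. Prints nothing when all three are present.
missing_boot_params() {
    cmdline=" $1 "
    for param in iommu=pt pci=realloc pcie_ports=native; do
        case "$cmdline" in
            *" $param "*) ;;                  # present as a whole word
            *) printf '%s\n' "$param" ;;      # missing
        esac
    done
}

# Usage after rebooting:
#   missing_boot_params "$(cat /proc/cmdline)"
```

Any parameter it prints did not reach the kernel, which usually means the bootloader configuration was not regenerated.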

AMD eGPU via Thunderbolt

Make sure the thunderbolt module is loaded before the AMD driver. Create /etc/mkinitcpio.conf.d/99-amd-egpu.conf with the contents:

MODULES+=(thunderbolt amdgpu)

Rebuild initramfs:

sudo limine-mkinitcpio

The Thunderbolt device has to be authorized before it can be used. Find it using boltctl:

$ boltctl
 ● Micro Computer (HK) Tech. Ltd. TBGAA
   ├─ type:          peripheral
   ├─ name:          TBGAA
   ├─ vendor:        Micro Computer (HK) Tech. Ltd.
   ├─ uuid:          34158780-0082-ee01-ffff-ffffffffffff
   ├─ generation:    USB4
   ├─ status:        connected
   │  ├─ domain:     a7a30000-0003-ea39-ffff-ffffffffffff
   │  ├─ rx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  ├─ tx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  └─ authflags:  none
   ├─ connected:     Thu 09 Apr 2026 02:45:08 PM UTC
   └─ stored:        no

Then enroll the device using its UUID (here, 34158780-0082-ee01-ffff-ffffffffffff):

sudo boltctl enroll --policy=auto 34158780-0082-ee01-ffff-ffffffffffff
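
With more than one dock attached, picking the UUID out of the boltctl listing by eye gets tedious. The following sketch prints the UUID of every device whose status is not "authorized". unauthorized_uuids is a hypothetical helper name, and the parsing assumes the layout shown above, where the uuid line precedes the status line for each device.

```shell
# Read `boltctl` output on stdin and print the UUID of each device whose
# status is anything other than "authorized".
unauthorized_uuids() {
    awk '
        / uuid:/   { uuid = $NF }                              # remember the latest UUID
        / status:/ { if ($NF != "authorized") print uuid }     # report it if not yet authorized
    '
}

# Usage:
#   boltctl | unauthorized_uuids
```

Each UUID it prints can then be passed to the enroll command above.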

Disable MES because it fails on USB4. Create /etc/modprobe.d/amd-egpu.conf with the contents:

options amdgpu mes=0
softdep amdgpu pre: thunderbolt
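
To confirm the option took effect, the value of a loaded module's parameter can be read back from sysfs. This sketch assumes amdgpu exposes mes as a readable parameter under /sys/module/amdgpu/parameters. module_param is a hypothetical helper name. Because amdgpu is loaded from the initramfs in this setup, the initramfs probably needs rebuilding (sudo limine-mkinitcpio) before the new options apply. That assumes the mkinitcpio configuration copies /etc/modprobe.d into the image.

```shell
# Print the current value of a loaded kernel module's parameter.
# $1 = module name, $2 = parameter name,
# $3 = optional sysfs root override (defaults to /sys).
module_param() {
    cat "${3:-/sys}/module/$1/parameters/$2"
}

# Usage after rebooting:
#   module_param amdgpu mes
```

A printed value of 0 indicates that MES is disabled.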

Disable runtime power management because resuming the GPU frequently (always?) fails. Find the eGPU PCI vendor and device codes (format is [VENDOR:DEVICE]):

$ lspci -nn | grep -i vga | grep -i radeon
97:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX/7900 GRE/7900M] [1002:744c] (rev c8)

Here, [1002:744c] is the vendor and device ID pair for the AMD card. Create /etc/udev/rules.d/99-amd-egpu-pm.rules with the following contents, substituting the vendor and device codes for your card (keep the leading 0x):

ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x1002", ATTR{device}=="0x744c", ATTR{power/control}="on"
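
Typing the rule by hand makes it easy to drop the 0x prefix or swap the two codes. As a small sketch, the rule can be generated from the VENDOR:DEVICE pair that lspci prints. pm_rule is a hypothetical helper name.

```shell
# Turn a "vendor:device" pair as printed by `lspci -nn` (for example
# 1002:744c) into the udev rule that keeps runtime power management off.
pm_rule() {
    vendor=${1%%:*}     # text before the colon
    device=${1##*:}     # text after the colon
    printf 'ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x%s", ATTR{device}=="0x%s", ATTR{power/control}="on"\n' \
        "$vendor" "$device"
}

# Usage:
#   pm_rule 1002:744c | sudo tee /etc/udev/rules.d/99-amd-egpu-pm.rules
#   sudo udevadm control --reload-rules
```

The rule fires on the bind event, so it applies the next time the driver binds to the device, for example after a reboot.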

NVIDIA eGPU via Thunderbolt

Load the nvidia module after thunderbolt. A modprobe.d rule is required for nvidia-dkms. Create /etc/modprobe.d/nvidia-egpu.conf with the contents:

softdep nvidia pre: thunderbolt

The Thunderbolt device has to be authorized before it can be used. Find it using boltctl:

$ boltctl
 ● Razer Core X V2
   ├─ type:          peripheral
   ├─ name:          Core X V2
   ├─ vendor:        Razer
   ├─ uuid:          8ab48780-00a3-29aa-ffff-ffffffffffff
   ├─ generation:    USB4
   ├─ status:        connected
   │  ├─ domain:     a7a30000-0003-ea39-ffff-ffffffffffff
   │  ├─ rx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  ├─ tx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  └─ authflags:  none
   ├─ connected:     Thu 09 Apr 2026 03:00:58 PM UTC
   └─ stored:        no

 ● Micro Computer (HK) Tech. Ltd. TBGAA
   ├─ type:          peripheral
   ├─ name:          TBGAA
   ├─ vendor:        Micro Computer (HK) Tech. Ltd.
   ├─ uuid:          34158780-0082-ee01-ffff-ffffffffffff
   ├─ generation:    USB4
   ├─ status:        authorized
   │  ├─ domain:     a7a30000-0003-ea39-ffff-ffffffffffff
   │  ├─ rx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  ├─ tx speed:   80 Gb/s = 2 lanes * 40 Gb/s
   │  └─ authflags:  none
   ├─ authorized:    Thu 09 Apr 2026 02:58:42 PM UTC
   ├─ connected:     Thu 09 Apr 2026 02:58:37 PM UTC
   └─ stored:        Thu 09 Apr 2026 02:51:32 PM UTC
      ├─ policy:     auto
      └─ key:        no

Then enroll the device using its UUID (here, 8ab48780-00a3-29aa-ffff-ffffffffffff):

sudo boltctl enroll --policy=auto 8ab48780-00a3-29aa-ffff-ffffffffffff

Disable runtime power management because resuming the GPU frequently (always?) fails. Find the eGPU PCI vendor and device codes (format is [VENDOR:DEVICE]):

$ lspci -nn | grep -i vga | grep -i nvidia
6a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)

Here, [10de:2b85] is the vendor and device ID pair for the NVIDIA card. Create /etc/udev/rules.d/99-nvidia-egpu-pm.rules with the following contents, substituting the vendor and device codes for your card (keep the leading 0x):

ACTION=="bind", SUBSYSTEM=="pci", DRIVERS=="nvidia", ATTR{vendor}=="0x10de", ATTR{device}=="0x2b85", ATTR{power/control}="on"
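
To confirm the rule was applied, the runtime power management attributes of the device can be read from sysfs. This is a sketch: pm_state is a hypothetical helper name, and the PCI address in the usage line is the one from the lspci output above with the 0000 domain prefixed.

```shell
# Print the runtime power management policy and state of a PCI device.
# $1 = full PCI address (domain:bus:device.function),
# $2 = optional sysfs root override (defaults to /sys).
pm_state() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    printf 'control=%s status=%s\n' \
        "$(cat "$dev/power/control")" \
        "$(cat "$dev/power/runtime_status")"
}

# Usage:
#   pm_state 0000:6a:00.0
```

With the rule in effect, control should read "on" rather than "auto", which keeps the device from being runtime-suspended.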

If you encounter kernel panics or spontaneous reboots when trying to use the NVIDIA card, and errors such as these appear in dmesg or journalctl -k -b -1:

NVRM: Xid (PCI:0000:95:00): 79, pid=25835, name=nvidia-smi, GPU has fallen off the bus.
NVRM: GPU 0000:95:00.0: GPU has fallen off the bus.
NVRM: GPU0 _kgspRpcRecvPoll: GSP RM heartbeat timed out
NVRM: GPU0 _kgspRpcRecvPoll: LibOS heartbeat timed out
NVRM: GPU0 tmrGetTimeEx_GH100: NVRM-RC: Consistently Bad TimeLo value ffffffff
NVRM: Xid (PCI:0000:95:00): 154, GPU recovery action changed from 0x0 (None) to 0x2 (Node Reboot Required)

Try adding this line to /etc/modprobe.d/nvidia-egpu.conf:

options nvidia NVreg_EnableGpuFirmware=0
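
After reloading the driver or rebooting, it can help to check whether the driver actually picked the option up. The sketch below assumes the driver lists its registry settings as "Name: value" lines in /proc/driver/nvidia/params. gsp_setting is a hypothetical helper name.

```shell
# Read NVIDIA driver parameters on stdin ("Name: value" lines) and print
# the value of EnableGpuFirmware, ignoring similarly named entries.
gsp_setting() {
    awk -F': *' '$1 == "EnableGpuFirmware" { print $2 }'
}

# Usage:
#   gsp_setting < /proc/driver/nvidia/params
```

A value of 0 suggests the option was applied. Any other value suggests the driver did not pick it up.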