Here's a guide for setting up mixed eGPUs via USB4/Thunderbolt on a Strix Halo (AMD Ryzen AI Max+ 395) machine. This guide assumes CachyOS (an Arch Linux derivative) with the Limine bootloader. I did not test this with a desktop environment installed, nor did I attempt to use the video ports on the eGPUs. My use case was to use this machine as an LLM server.
Hardware:
- Computer:
  - Minisforum MS-S1 MAX AI
- eGPU docks:
  - Minisforum DEG2
  - Razer Core X V2
- GPUs:
  - AMD Radeon RX 7900 XTX
  - NVIDIA GeForce RTX 5090
I updated the BIOS from 1.03 to 1.06 because the release notes indicated that 1.05 fixed some Thunderbolt issues. I followed the official guide "SHWSA_1.06_260104B/Update BIOS guide.docx" (downloaded from the Product Help page) and the guide "Minisforum MS-S1 Max BIOS Update from Linux (No Windows Required)".
I had to disable ReBAR in the BIOS for the NVIDIA card to work.
There seems to be a quirk with the two USB4 v2 ports on the back of the machine. One port only supports a prefetchable bridge window, which the AMD card is fine with but the NVIDIA card is not. The other port supports a non-prefetchable bridge window, which the NVIDIA card requires. If you see dmesg errors like the following, that might be the cause; on the MS-S1 the solution was to swap the USB4 ports:
[ 6.744373] pci 0000:6a:00.0: BAR 1 [mem 0x9000000000-0x900fffffff 64bit pref]: assigned
[ 6.744393] pci 0000:6a:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: can't assign; no space
[ 6.744394] pci 0000:6a:00.0: VF BAR 2 [mem size 0x10000000 64bit pref]: failed to assign
[ 6.744394] pci 0000:6a:00.0: BAR 0 [mem size 0x04000000]: can't assign; no space
[ 6.744395] pci 0000:6a:00.0: BAR 0 [mem size 0x04000000]: failed to assign
[ 6.744395] pci 0000:6a:00.0: BAR 3 [mem 0x9010000000-0x9011ffffff 64bit pref]: assigned
[ 6.744410] pci 0000:6a:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: can't assign; no space
[ 6.744411] pci 0000:6a:00.0: VF BAR 4 [mem size 0x02000000 64bit pref]: failed to assign
[ 6.744412] pci 0000:6a:00.0: ROM [mem size 0x00080000 pref]: can't assign; no space
[ 6.744412] pci 0000:6a:00.0: ROM [mem size 0x00080000 pref]: failed to assign
[ 6.744413] pci 0000:6a:00.0: VF BAR 0 [mem size 0x00040000 64bit pref]: can't assign; no space
[ 6.744413] pci 0000:6a:00.0: VF BAR 0 [mem size 0x00040000 64bit pref]: failed to assign
[ 6.744415] pci 0000:6a:00.0: BAR 5 [io size 0x0080]: can't assign; no space
[ 6.744416] pci 0000:6a:00.0: BAR 5 [io size 0x0080]: failed to assign
[ 6.744416] pci 0000:6a:00.0: BAR 1 [mem 0x9000000000-0x900fffffff 64bit pref]: releasing
[ 6.744417] pci 0000:6a:00.0: BAR 3 [mem 0x9010000000-0x9011ffffff 64bit pref]: releasing
...
[ 6.834725] nvidia 0000:6a:00.0: enabling device (0000 -> 0002)
[ 6.834750] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:6a:00.0)
[ 6.834764] nvidia 0000:6a:00.0: probe with driver nvidia failed with error -1
The NITRO+ AMD Radeon RX 7900 XTX Vapor-X 24GB has a little BIOS switch that needs to be set to the middle position to allow software control of power settings.
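To check quickly whether the BAR assignment failures from the dmesg excerpt above are present on your own machine, something like the following could help. It is only a sketch: the function name bar_failures is made up for illustration, and it assumes the kernel log lines have the same "pci <address>: ... failed to assign" shape as in the excerpt.

```shell
# Count "failed to assign" BAR errors per PCI device in kernel log text read from stdin.
# Prints one line per device in the form "<pci-address> <count>".
# bar_failures is a hypothetical helper name, not an existing tool.
bar_failures() {
    grep 'failed to assign' |
        sed -n 's/.*pci \([0-9a-f:.]*\): .*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage on a live system (reading the kernel log may require root):
#   sudo dmesg | bar_failures
```

If the eGPU's address shows up in the output, the port swap described above is worth trying before touching any driver settings.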
Add the "iommu=pt", "pci=realloc" and "pcie_ports=native" kernel boot parameters. For Limine, add the following to /etc/default/limine:
KERNEL_CMDLINE[default]+="iommu=pt pci=realloc pcie_ports=native"
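Once the boot configuration has been regenerated and the machine rebooted, it can be worth confirming that the parameters actually reached the running kernel. A minimal sketch, assuming a POSIX shell; missing_kernel_params is a hypothetical helper name:

```shell
# Print every expected kernel parameter that is absent from a command line string.
# $1: kernel command line text; remaining arguments: parameters to look for.
# missing_kernel_params is a hypothetical helper name.
missing_kernel_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                 # present as a whole word
            *) printf '%s\n' "$param" ;;     # not found
        esac
    done
}

# Usage after rebooting (no output means all three are present):
#   missing_kernel_params "$(cat /proc/cmdline)" iommu=pt pci=realloc pcie_ports=native
```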
Regenerate /boot/limine.conf:
sudo limine-mkinitcpio
Make sure the thunderbolt module is loaded before the AMD driver. Create /etc/mkinitcpio.conf.d/99-amd-egpu.conf with the contents:
MODULES+=(thunderbolt amdgpu)
Rebuild initramfs:
sudo limine-mkinitcpio
The Thunderbolt device has to be authorized before it can be used. Find it using boltctl:
$ boltctl
● Micro Computer (HK) Tech. Ltd. TBGAA
├─ type: peripheral
├─ name: TBGAA
├─ vendor: Micro Computer (HK) Tech. Ltd.
├─ uuid: 34158780-0082-ee01-ffff-ffffffffffff
├─ generation: USB4
├─ status: connected
│ ├─ domain: a7a30000-0003-ea39-ffff-ffffffffffff
│ ├─ rx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ ├─ tx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ └─ authflags: none
├─ connected: Thu 09 Apr 2026 02:45:08 PM UTC
└─ stored: no
And enroll the device using its uuid (I want to enroll 34158780-0082-ee01-ffff-ffffffffffff):
sudo boltctl enroll --policy=auto 34158780-0082-ee01-ffff-ffffffffffff
Disable MES because it fails on USB4. Create /etc/modprobe.d/amd-egpu.conf with the contents:
options amdgpu mes=0
softdep amdgpu pre: thunderbolt
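To confirm after the next boot that the mes=0 option was picked up, the module's parameters can be read back from sysfs. This is a sketch under the assumption that amdgpu exposes the parameter read-only at /sys/module/amdgpu/parameters/mes; module_param is a hypothetical helper name.

```shell
# Print the current value of a kernel module parameter, or return non-zero
# if the parameter file is not readable.
# $1: module name, $2: parameter name, $3: sysfs module root (defaults to /sys/module).
# module_param is a hypothetical helper name.
module_param() {
    file="${3:-/sys/module}/$1/parameters/$2"
    [ -r "$file" ] || return 1
    cat "$file"
}

# Usage (should print 0 if the modprobe.d option was applied):
#   module_param amdgpu mes
```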
Disable runtime power management because resuming the GPU frequently (always?) fails. Find the eGPU PCI vendor and device codes (format is [VENDOR:DEVICE]):
$ lspci -nn | grep -i vga | grep -i radeon
97:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900 XTX/7900 GRE/7900M] [1002:744c] (rev c8)
The [1002:744c] is the vendor and device for the AMD card. Create /etc/udev/rules.d/99-amd-egpu-pm.rules and substitute the correct vendor and device codes (keep the leading 0x):
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x1002", ATTR{device}=="0x744c", ATTR{power/control}="on"
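Substituting the codes by hand is easy to get wrong, so the rule can also be generated from the lspci line. The sketch below assumes the [VENDOR:DEVICE] pair is the last bracketed hex pair on each line, as in the output above; pm_rules_from_lspci is a hypothetical helper name. It emits the rule format shown above, without a DRIVERS match.

```shell
# Read `lspci -nn` lines on stdin and print one udev rule per line that
# contains a [vvvv:dddd] pair, using the last such pair as vendor and device.
# pm_rules_from_lspci is a hypothetical helper name.
pm_rules_from_lspci() {
    sed -n 's|.*\[\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\)\].*|ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x\1", ATTR{device}=="0x\2", ATTR{power/control}="on"|p'
}

# Usage (review the output before writing it to the rules file):
#   lspci -nn | grep -i vga | grep -i radeon | pm_rules_from_lspci
```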
Load the nvidia module after thunderbolt. A modprobe.d rule is required for nvidia-dkms. Create /etc/modprobe.d/nvidia-egpu.conf with the contents:
softdep nvidia pre: thunderbolt
The Thunderbolt device has to be authorized before it can be used. Find it using boltctl:
$ boltctl
● Razer Core X V2
├─ type: peripheral
├─ name: Core X V2
├─ vendor: Razer
├─ uuid: 8ab48780-00a3-29aa-ffff-ffffffffffff
├─ generation: USB4
├─ status: connected
│ ├─ domain: a7a30000-0003-ea39-ffff-ffffffffffff
│ ├─ rx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ ├─ tx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ └─ authflags: none
├─ connected: Thu 09 Apr 2026 03:00:58 PM UTC
└─ stored: no
● Micro Computer (HK) Tech. Ltd. TBGAA
├─ type: peripheral
├─ name: TBGAA
├─ vendor: Micro Computer (HK) Tech. Ltd.
├─ uuid: 34158780-0082-ee01-ffff-ffffffffffff
├─ generation: USB4
├─ status: authorized
│ ├─ domain: a7a30000-0003-ea39-ffff-ffffffffffff
│ ├─ rx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ ├─ tx speed: 80 Gb/s = 2 lanes * 40 Gb/s
│ └─ authflags: none
├─ authorized: Thu 09 Apr 2026 02:58:42 PM UTC
├─ connected: Thu 09 Apr 2026 02:58:37 PM UTC
└─ stored: Thu 09 Apr 2026 02:51:32 PM UTC
├─ policy: auto
└─ key: no
And enroll the device using its uuid (I want to enroll 8ab48780-00a3-29aa-ffff-ffffffffffff):
sudo boltctl enroll --policy=auto 8ab48780-00a3-29aa-ffff-ffffffffffff
Disable runtime power management because resuming the GPU frequently (always?) fails. Find the eGPU PCI vendor and device codes (format is [VENDOR:DEVICE]):
$ lspci -nn | grep -i vga | grep -i nvidia
6a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
The [10de:2b85] is the vendor and device for the NVIDIA card. Create /etc/udev/rules.d/99-nvidia-egpu-pm.rules and substitute the correct vendor and device codes (keep the leading 0x):
ACTION=="bind", SUBSYSTEM=="pci", DRIVERS=="nvidia", ATTR{vendor}=="0x10de", ATTR{device}=="0x2b85", ATTR{power/control}="on"
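After a reboot (or after the driver binds again), the effect of the power management rules can be checked by reading power/control for the matching devices in sysfs. A minimal sketch; pm_control_for is a hypothetical helper name, and the optional third argument exists only so the function can be pointed at a different directory.

```shell
# Print "<pci-address> <power/control value>" for every PCI device whose
# vendor and device IDs match.
# $1: vendor (e.g. 0x10de), $2: device (e.g. 0x2b85),
# $3: sysfs PCI device directory (defaults to /sys/bus/pci/devices).
# pm_control_for is a hypothetical helper name.
pm_control_for() {
    root=${3:-/sys/bus/pci/devices}
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "$1" ] && [ "$(cat "$dev/device")" = "$2" ]; then
            printf '%s %s\n' "${dev##*/}" "$(cat "$dev/power/control")"
        fi
    done
}

# Usage (the value should be "on" rather than "auto" once the rule has applied):
#   pm_control_for 0x10de 0x2b85
#   pm_control_for 0x1002 0x744c
```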
If you encounter kernel panics or auto reboots when trying to use the NVIDIA card, and errors such as these appear in dmesg/journalctl -k -b -1:
NVRM: Xid (PCI:0000:95:00): 79, pid=25835, name=nvidia-smi, GPU has fallen off the bus.
NVRM: GPU 0000:95:00.0: GPU has fallen off the bus.
NVRM: GPU0 _kgspRpcRecvPoll: GSP RM heartbeat timed out
NVRM: GPU0 _kgspRpcRecvPoll: LibOS heartbeat timed out
NVRM: GPU0 tmrGetTimeEx_GH100: NVRM-RC: Consistently Bad TimeLo value ffffffff
NVRM: Xid (PCI:0000:95:00): 154, GPU recovery action changed from 0x0 (None) to 0x2 (Node Reboot Required)
Try adding this line to /etc/modprobe.d/nvidia-egpu.conf:
options nvidia NVreg_EnableGpuFirmware=0
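To see at a glance which Xid errors a previous boot logged, the kernel log can be summarized. This is a sketch that assumes the "NVRM: Xid (PCI:...): <code>," line format shown above; xid_summary is a hypothetical helper name.

```shell
# Read kernel log text on stdin and print each distinct NVIDIA Xid code
# with the number of times it appears, sorted by code.
# xid_summary is a hypothetical helper name.
xid_summary() {
    sed -n 's/.*NVRM: Xid (PCI:[^)]*): \([0-9][0-9]*\),.*/\1/p' |
        sort -n | uniq -c | awk '{ print "Xid " $2 ": " $1 }'
}

# Usage for the previous boot:
#   journalctl -k -b -1 | xid_summary
```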