External GPU
On computers equipped with Thunderbolt 3+ or USB4, it is possible to attach a desktop-grade external graphics card (eGPU) using a GPU enclosure. eGPU.io is a good resource with a buyer's guide and a community forum. While some manual configuration (shown below) is needed for most modes of operation, Linux support for eGPUs is generally good.
Installation
Thunderbolt
The eGPU enclosure's Thunderbolt device may need to be authorized first after plugging it in (depending on your BIOS/UEFI firmware configuration). Follow Thunderbolt#User device authorization. If successful, the external graphics card should show up in lspci:
$ lspci | grep -E 'VGA|3D'
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07)   # internal GPU
1a:10.3 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050] (rev a1)   # external GPU
Depending on your computer, its firmware, and the enclosure firmware, Thunderbolt will limit host <-> eGPU bandwidth to some extent due to the number of PCIe lanes and the OPI mode:
# dmesg | grep PCIe
[19888.928225] pci 0000:1a:10.3: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 0000:05:01.0 (capable of 126.016 Gb/s with 8.0 GT/s PCIe x16 link)
Drivers
A driver compatible with your GPU model should be installed (see the driver-specific subsections below).
If installed successfully, lspci -k should show that a driver has been associated with your card:
$ lspci -k
1a:10.3 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050] (rev a1)
        Subsystem: NVIDIA Corporation GP107 [GeForce GTX 1050]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
AMDGPU
Note that the AMDGPU driver (with either Thunderbolt or USB4) might in some cases set the wrong pcie_gen_cap option and fall back to PCIe gen 1.1 speed, with possibly serious performance issues. In this case the proper value can be set as a module option (see Kernel module#Using files in /etc/modprobe.d/) or even passed as a Kernel parameter:
/etc/modprobe.d/amd-egpu-pcie-speed.conf
options amdgpu pcie_gen_cap=0x40000
This will set PCIe gen 3 speed. A full list of options can be found in amd_pcie.h.
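The same value can also be passed on the kernel command line as amdgpu.pcie_gen_cap=0x40000. To verify the negotiated link speed afterwards, the PCIe sysfs attributes can be inspected (a minimal sketch; substitute your eGPU's PCI address from lspci):
$ cat /sys/bus/pci/devices/0000:1a:10.3/current_link_speed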
NVIDIA
For NVIDIA eGPUs, on some systems you may need to early load the thunderbolt kernel module to ensure it is loaded before nvidia_drm.
- If you use mkinitcpio initramfs, follow mkinitcpio#MODULES to add modules (see the sketch after this list).
- If you use Booster, follow Booster#Early module loading.
- If you use dracut, follow dracut#Early kernel module loading.
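For example, with mkinitcpio, thunderbolt can be listed before the NVIDIA modules in the MODULES array (a minimal sketch; include only the modules you actually early load), then regenerate the initramfs with mkinitcpio -P:
/etc/mkinitcpio.conf
MODULES=(thunderbolt nvidia nvidia_modeset nvidia_uvm nvidia_drm)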
Compute-only workloads
Right after completing the installation steps, compute-only workloads like GPGPU#CUDA that do not need to display anything should work without any extra configuration. The nvidia-smi utility (provided by the nvidia-utils package) should work with the proprietary NVIDIA driver. The proprietary NVIDIA NVENC/NVDEC should also work (without OpenGL interop).
This use case should also fully support hotplug. Hot-unplug should be possible as well (likely depending on the drivers used). On NVIDIA, an active nvidia-persistenced is expected to prevent clean hot-unplug.
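If a clean hot-unplug fails on NVIDIA for this reason, the persistence daemon can be stopped beforehand (assuming it runs as the usual systemd service):
# systemctl stop nvidia-persistenced.service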
Xorg
Multiple setups combining internal (iGPU) and external (eGPU) cards are possible, each with its own advantages and disadvantages.
Xorg rendered on eGPU, PRIME display offload to iGPU
- Most programs that make use of the GPU run out-of-the-box on the eGPU: glxinfo/glxgears, eglinfo/eglgears_x11, NVENC/NVDEC (including OpenGL interop).
- Xorg only starts with the eGPU plugged in.
- Monitors attached to the eGPU work out-of-the-box; PRIME display offload can be used for monitors attached to the iGPU (i.e. the internal laptop screen).
The main articles are PRIME#Discrete card as primary GPU and PRIME#Reverse PRIME. This is also documented in the NVIDIA driver docs, Chapter 33. Offloading Graphics Display with RandR 1.4.
Use an Xorg configuration snippet like this one:
/etc/X11/xorg.conf.d/80-egpu-primary-igpu-offload.conf
Section "Device" Identifier "Device0" Driver "nvidia" BusID "PCI:26:16:3" # Edit according to lspci, translate from hex to decimal. Option "AllowExternalGpus" "True" # Required for proprietary NVIDIA driver. EndSection Section "Module" # Load modesetting module for the iGPU, which should show up in XrandR 1.4 as a provider. Load "modesetting" EndSection
Note that hexadecimal 1a:10.3 translates to decimal 26:16:3 for the xorg.conf snippet. Do not specify ServerLayout and Screen sections, as these are inferred automatically. The first Device defined will be considered primary.
To validate this setup, use xrandr --listproviders, which should display:
$ xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x1b8 cap: 0x1, Source Output crtcs: 4 outputs: 4 associated providers: 0 name:NVIDIA-0
Provider 1: id: 0x1f3 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 3 outputs: 5 associated providers: 0 name:modesetting
To output to the internal laptop screen and/or other monitors attached to the iGPU, RandR 1.4 PRIME display offload can be used, with the names taken from the above xrandr --listproviders output:
$ xrandr --setprovideroutputsource modesetting NVIDIA-0 && xrandr --auto
xrandr --auto is optional and may be substituted by any RandR-based display configuration tool; its presence prevents an all-screens-black situation. You may want to run this command before a display manager shows the login prompt or before the desktop environment starts, see Xrandr#Configuration and Xinit.
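For example, a minimal ~/.xinitrc sketch that applies the offload before the session starts (the session command is illustrative):
~/.xinitrc
xrandr --setprovideroutputsource modesetting NVIDIA-0
xrandr --auto
exec startplasma-x11  # Replace with your session command.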
Vulkan may enumerate GPUs independently of Xorg, so in order to run, for example, vkcube in this setup, one may need to pass the --gpu_number 1 option. Alternatively, a layer to reorder GPUs during enumeration can be activated with the same effect: __NV_PRIME_RENDER_OFFLOAD=1 vkcube, or equivalently prime-run vkcube.
If you use optimus-manager, set the BusId of the eGPU in the appropriate file for your mode and graphics card in /etc/optimus-manager/xorg/.
Xorg rendered on iGPU, PRIME render offload to eGPU
- Programs are rendered on the iGPU by default, but PRIME render offload can be used to render them on the eGPU.
- Xorg starts even with the eGPU disconnected, but render/display offload will not work until it is restarted.
- Monitors attached to the iGPU (i.e. the internal laptop screen) work out-of-the-box; PRIME display offload can be used for monitors attached to the eGPU.
The main article is PRIME#PRIME GPU offloading. This is also documented in the NVIDIA driver docs, Chapter 34. PRIME Render Offload.
With many discrete GPU drivers, this mode should be the default without any manual Xorg configuration. If that does not work, or if you use the proprietary NVIDIA drivers, use the following:
/etc/X11/xorg.conf.d/80-igpu-primary-egpu-offload.conf
Section "Device" Identifier "Device0" Driver "modesetting" EndSection Section "Device" Identifier "Device1" Driver "nvidia" BusID "PCI:26:16:3" # Edit according to lspci, translate from hex to decimal. Option "AllowExternalGpus" "True" # Required for proprietary NVIDIA driver. EndSection
To validate this setup, use xrandr --listproviders, which should display:
$ xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x47 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 3 outputs: 5 associated providers: 0 name:modesetting
Provider 1: id: 0x24a cap: 0x2, Sink Output crtcs: 4 outputs: 4 associated providers: 0 name:NVIDIA-G0
To render some_program on the eGPU, PRIME render offload can be used:
- for proprietary NVIDIA drivers:
$ __NV_PRIME_RENDER_OFFLOAD=1 __VK_LAYER_NV_optimus=NVIDIA_only __GLX_VENDOR_LIBRARY_NAME=nvidia some_program
- for proprietary NVIDIA drivers (convenience wrapper):
$ prime-run some_program
- for open-source drivers:
$ DRI_PRIME=1 some_program
To output to monitors connected to the eGPU, RandR 1.4 PRIME display offload can again be used:
$ xrandr --setprovideroutputsource NVIDIA-G0 modesetting && xrandr --auto
NVIDIA drivers 460.27.04+ implement an optimization for a special case of combined render and display offloads:
- Added support for “Reverse PRIME Bypass”, an optimization that bypasses the bandwidth overhead of PRIME Render Offload and PRIME Display Offload in conditions where a render offload application is fullscreen, unredirected, and visible only on a given NVIDIA-driven PRIME Display Offload output. Use of the optimization is reported in the X log when verbose logging is enabled in the X server.
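To check whether the optimization is actually used, the X server can be started with verbose logging enabled, for example (a sketch; -logverbose is a standard Xorg option):
$ startx -- -logverbose 6
Then search the Xorg log for the corresponding message.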
Separate Xorg instance for eGPU
The main article is Nvidia-xrun#External GPU setup.
Known issues with eGPUs on Xorg
- Hotplug is not supported with most discrete GPU Xorg drivers: the eGPU needs to be plugged in when Xorg starts. Logging out and back in should suffice to restart Xorg.
- Hot-unplug is not supported at all: doing so leads to system instability or outright freezes (as acknowledged in the NVIDIA docs).
Wayland
Wayland support for eGPUs (or multiple GPUs in general) is much less tested, but it should work with even less manual configuration.
Note that explicit GPU hotplug support is needed in the Wayland compositor, but most compositors already have some level of support:
- KDE's kwin: https://invent.kde.org/plasma/kwin/-/merge_requests/811
- GNOME's Mutter: https://gitlab.gnome.org/GNOME/mutter/-/issues/17, https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1562
- wlroots: https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/1278
For open-source drivers, DRI offloading works fine:
$ DRI_PRIME=1 some_program
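With more than two GPUs, Mesa also accepts a PCI tag to select a specific device, for example (using the example bus address from earlier; note the underscores):
$ DRI_PRIME=pci-0000_1a_10_3 some_program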
Some projects, such as all-ways-egpu, aim to provide more efficient methods for GPU selection under Wayland.
Hotplugging an NVIDIA eGPU
It is possible to hotplug an eGPU when using Wayland and use the PRIME feature. NVIDIA already has a solid PRIME implementation for dGPUs, and it works the same way for eGPUs.
First, make sure that no program uses the NVIDIA modules. EGL programs tend to use 1 MB of dGPU memory per program, even if they run on the iGPU, as can be seen in nvidia-smi. To avoid this, set __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json as an environment variable. The best place for that is /etc/environment.d/50_mesa.conf.
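For example:
/etc/environment.d/50_mesa.conf
__EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json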
Then unload the NVIDIA modules:
# rmmod nvidia_uvm
# rmmod nvidia_drm
# rmmod nvidia_modeset
# rmmod nvidia
Once the NVIDIA modules are no longer loaded, you can connect the external GPU. When the GPU has initialized, load the NVIDIA modules again with modprobe nvidia-drm or modprobe nvidia-current-drm; the naming depends on whether the modules come from the NVIDIA website or from the package manager. In some cases (for example, with the GIGABYTE AORUS GAMING BOX) the eGPU does not work with the proprietary modules, so you might need to load the open-source ones: modprobe nvidia-current-open-drm.
Once the modules are successfully loaded, the PRIME feature will work, but since we set the __EGL_VENDOR_LIBRARY_FILENAMES variable to use Mesa, we need to set __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/10_nvidia.json before starting a program. The full string will look like:
__GLX_VENDOR_LIBRARY_NAME=nvidia __NV_PRIME_RENDER_OFFLOAD=1 __VK_LAYER_NV_optimus=NVIDIA_only __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/10_nvidia.json %command%
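Outside of a game launcher, %command% is simply replaced by the program to run, for example (glxinfo is illustrative):
$ __GLX_VENDOR_LIBRARY_NAME=nvidia __NV_PRIME_RENDER_OFFLOAD=1 __VK_LAYER_NV_optimus=NVIDIA_only __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/10_nvidia.json glxinfo | grep "OpenGL vendor"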
For GNOME users: you might need to patch switcheroo-control to include __EGL_VENDOR_LIBRARY_FILENAMES in its list of environment variables. This allows programs to run on the eGPU naturally via right click and "Launch using Dedicated Graphics Card", but that is beyond the scope of this article.
Hotplugging an NVIDIA eGPU and temporarily disabling the NVIDIA dGPU
If you have an iGPU and an NVIDIA dGPU and want to connect an NVIDIA eGPU, you will encounter a conflict where graphics render only on the dGPU, no matter what you do. To solve this, temporarily disable the dGPU so that the NVIDIA driver does not notice it. The best way to do that is to override its driver.
First, unload the NVIDIA modules:
# rmmod nvidia_uvm
# rmmod nvidia_drm
# rmmod nvidia_modeset
# rmmod nvidia
Then, override the dGPU driver with the driverctlAUR utility. In this example, 0000:01:00.0 is the address of the dGPU; it can be found with the lspci utility.
# driverctl --nosave set-override 0000:01:00.0 vfio-pci
It is important to use the --nosave parameter to prevent driverctl from overriding the driver on boot. This is useful in case something goes wrong: a simple reboot cleans everything up.
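The currently active overrides can be listed at any time with the list-overrides subcommand:
# driverctl list-overrides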
With the dGPU disabled, you can load the kernel modules with modprobe nvidia-drm and then check whether nvidia-smi shows one or two GPUs.
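For example, nvidia-smi -L lists the GPUs visible to the driver; with the dGPU overridden, only the eGPU should appear:
$ nvidia-smi -L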
Bringing the dGPU back is tricky because it is unintuitive. First, unload the NVIDIA modules, unplug the eGPU, and then run this series of commands:
# modprobe nvidia-current
# driverctl --nosave unset-override 0000:01:00.0
# modprobe nvidia-current
# driverctl --nosave unset-override 0000:01:00.0
# modprobe nvidia-current-modeset
# modprobe nvidia-current-drm
It is strange that the first two commands need to be run twice, but otherwise the dGPU does not come back. One of them will error once, but that is not critical.