Install Custom CUDA Version¶
This how-to explains how to install a custom CUDA version inside a UbiOps deployment using a ubiops.yaml file, which can be part of your environment package. This is useful if you require a different CUDA version than the ones readily available on UbiOps.
ubiops.yaml¶
You can start with a plain UbiOps base environment of any Ubuntu or Python combination. To avoid version conflicts, do not perform these steps on a base environment with CUDA pre-installed. All steps necessary to install CUDA on top of this base environment are specified in a ubiops.yaml file. This file is interpreted by UbiOps and can be used to install additional (OS-level) software packages and to specify system environment variables. More information about the ubiops.yaml file can be found in the documentation.
A ubiops.yaml follows the template below:

```yaml
environment_variables:
apt:
  keys:
    urls:
  sources:
    items:
  packages:
```
First, we need to specify where the different CUDA versions can be downloaded from. We also need to add the repository's .pub key to the apt keys; this key is necessary to verify and install the CUDA packages from that source.
The UbiOps infrastructure uses an x86_64 architecture. If we take a base environment with Ubuntu 22.04 (a Debian-based distribution), we can use the following links (check out the NVIDIA CUDA repository links when you want to build on top of a different base environment):
```yaml
environment_variables:
apt:
  keys:
    urls:
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
  sources:
    items:
      - deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /
```
Trailing slash

Note the space and the trailing slash at the end of the sources item. Both are necessary for apt to parse the repository line correctly.
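If you build on a different base environment, only the distribution segment of the repository URLs changes. As an illustration, for an Ubuntu 20.04 base environment the section would look like the following (the key filename is assumed to be the same; verify it in the repository index before use):

```yaml
apt:
  keys:
    urls:
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
  sources:
    items:
      - deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /
```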
With the locations of the packages specified, we can add the packages we actually want to install. You can add individual packages, or simply install the full toolkit. Installing the complete CUDA toolkit is the easiest method, as it automatically installs all the necessary packages in one go. Cherry-picking relevant packages can decrease the size of your resulting custom environment, though.
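As a sketch of the cherry-picking approach, you could list individual meta-packages from the repository instead of the full toolkit. The package names below are examples; check the repository for the exact packages that match your needs:

```yaml
apt:
  packages:
    - cuda-runtime-12-0    # runtime libraries only
    - cuda-compiler-12-0   # nvcc and related build tools
```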
For this tutorial, we will install the CUDA 12.0 toolkit.
The following code will install the CUDA toolkit and add the correct paths to the corresponding environment variables:
```yaml
environment_variables:
  - PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
  - LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:${LD_LIBRARY_PATH}
apt:
  keys:
    urls:
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
  sources:
    items:
      - deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 /
  packages:
    - cuda-toolkit-12-0
```
With this ubiops.yaml, CUDA 12.0 will be installed and can be used in our deployment. As a quick check, run the nvidia-smi command to verify that the CUDA version is correct.
The output will look similar to the following:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.203.03   Driver Version: 450.203.03   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   46C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
```
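If you want to check the reported CUDA version programmatically rather than by eye, a small parser over the nvidia-smi header works. This is a minimal sketch, not part of the UbiOps API; the `parse_cuda_version` helper and the sample header string are illustrative:

```python
import re
from typing import Optional

def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Extract the 'CUDA Version' field from nvidia-smi header output."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", smi_output)
    return match.group(1) if match else None

# Example header line as printed by nvidia-smi:
header = "| NVIDIA-SMI 450.203.03   Driver Version: 450.203.03   CUDA Version: 12.0 |"
print(parse_cuda_version(header))  # -> 12.0
```

Note that nvidia-smi reports the CUDA version supported by the driver; the toolkit installed via ubiops.yaml is what your deployment code compiles and links against.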
Running Commands

See the how-to on running terminal commands for more information on how to run commands in a deployment.
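As a minimal sketch of running such a check from deployment code (assuming a Python deployment; the `run_command` helper below is illustrative, not a UbiOps API):

```python
import subprocess

def run_command(cmd):
    """Run a terminal command and return its stdout as text; raises on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

# Inside a deployment's __init__ you might, for example, log the GPU status:
# print(run_command(["nvidia-smi"]))
```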