Install Custom CUDA Version
This how-to explains how to install a custom CUDA version inside a UbiOps deployment using the ubiops.yaml file. This is useful if you need a CUDA version other than the ones readily available on UbiOps.
Ubiops.yaml
Everything is done in the ubiops.yaml file, which is used to install additional (OS-level) software packages and to specify system environment variables. More information about the ubiops.yaml file can be found in the documentation.
We start from the following blank structure:
environment_variables:
apt:
  keys:
    urls:
  sources:
    items:
  packages:
First, we need to specify the repository from which the desired CUDA version can be downloaded. Then we need to add the repository's keys to the apt keys. apt needs these keys to verify and install the CUDA packages from that source.
Since the UbiOps infrastructure uses an x86_64 architecture with Ubuntu 20.04 (a Debian-based distribution), the following links can be used:
environment_variables:
apt:
  keys:
    urls:
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
  sources:
    items:
      - deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /
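Trailing slash
Note the space followed by the trailing slash at the end of the sources item. Both are required: apt reads the lone slash as an exact path, which is how a repository without separate suite and component names is declared.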
With the package location specified, we can install the CUDA packages that we want to use. Individual packages can be added, or the complete toolkit can be installed. Installing the complete CUDA toolkit is the easiest method, as it installs all the necessary packages in one go. This tutorial installs the CUDA 11.0 toolkit, to align with a CUDA version that is readily available on UbiOps. This way, the pros and cons of selecting a readily available CUDA version and of installing a custom CUDA version can be compared.
The following code will install the CUDA toolkit and add the correct paths to the corresponding environment variables:
environment_variables:
  - PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
  - LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:${LD_LIBRARY_PATH}
apt:
  keys:
    urls:
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
      - https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
  sources:
    items:
      - deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /
  packages:
    - cuda-toolkit-11-0
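With this ubiops.yaml, CUDA 11.0 will be installed and can be used in our deployment. As a quick check, the nvidia-smi command can be run inside the deployment to look at the CUDA Version field. Note that this field shows the CUDA version supported by the installed GPU driver; nvcc --version reports the version of the installed toolkit itself.
Output similar to the following can be expected: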
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.203.03   Driver Version: 450.203.03   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   46C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
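If the check needs to be automated, the CUDA Version field can be read from the command output in Python. The sketch below is only an illustration and not part of any UbiOps tooling: the helper names parse_cuda_version and read_cuda_version are hypothetical, and it assumes that nvidia-smi prints a header containing "CUDA Version: X.Y", as in the output above.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Return the 'CUDA Version' value from nvidia-smi output, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def read_cuda_version() -> Optional[str]:
    """Run nvidia-smi and parse its header; return None if the command is missing."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    print("CUDA version reported by nvidia-smi:", read_cuda_version())
```

Returning None instead of raising keeps the helper usable on instances without a GPU, where nvidia-smi may not be present.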
Running Commands
See the how-to on running terminal commands for more information on how to run commands in a deployment.
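The installed toolkit can also be inspected from Python code running in the deployment, by asking nvcc for its release. This is a minimal sketch under stated assumptions: toolkit_status and parse_nvcc_release are hypothetical helper names, /usr/local/cuda/bin is the directory added to PATH in the ubiops.yaml above, and nvcc is assumed to print a line containing "release X.Y".

```python
import os
import re
import shutil
import subprocess
from typing import Optional

# Directory that the ubiops.yaml above adds to PATH.
CUDA_BIN = "/usr/local/cuda/bin"


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract 'X.Y' from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def toolkit_status(expected: str = "11.0") -> dict:
    """Collect a few facts about the CUDA toolkit visible to this process."""
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    nvcc = shutil.which("nvcc")
    release = None
    if nvcc is not None:
        result = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, check=False
        )
        release = parse_nvcc_release(result.stdout)
    return {
        "cuda_bin_on_path": CUDA_BIN in path_entries,
        "nvcc_path": nvcc,
        "toolkit_release": release,
        "matches_expected": release == expected,
    }


if __name__ == "__main__":
    print(toolkit_status())
```

Calling toolkit_status() once when the deployment starts and logging the result is one way to notice early that the environment variables or the package installation did not take effect.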