
GPU acceleration

UbiOps has support for GPU deployments, but this feature is not enabled for customers by default. Please contact sales for more information.

In order to use GPUs, the following is needed:

You can also find some examples of deployments that use GPUs on UbiOps:

Cold start time

GPU deployments are usually quite large. It therefore takes more time than usual (on the order of 30 seconds or more) to scale from zero to one instance. After the first request, subsequent requests are much faster, until the instance shuts down again when the maximum idle time of the deployment version is reached. You may want to set the minimum number of instances to a value higher than zero to prevent scaling down to zero instances. Of course, this has an impact on costs, as instances will then always be running.
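If you want to configure this programmatically, the sketch below raises the minimum number of instances of an existing deployment version with the UbiOps Python client. The project, deployment, and version names are placeholders; consult the client reference for the exact fields your UbiOps version supports.

import ubiops

# Connect to the UbiOps API (the token is a placeholder).
configuration = ubiops.Configuration(api_key={"Authorization": "Token <YOUR_API_TOKEN>"})
api = ubiops.CoreApi(ubiops.ApiClient(configuration))

# Keep at least one instance warm to avoid cold starts.
# Note: a non-zero minimum means an instance is always running (and billed).
api.deployment_versions_update(
    project_name="my-project",            # placeholder
    deployment_name="my-gpu-deployment",  # placeholder
    version="v1",                         # placeholder
    data=ubiops.DeploymentVersionUpdate(minimum_instances=1),
)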

CUDA version

UbiOps provides base environments with CUDA already installed. They can be selected when creating a deployment version. For example, it is possible to select the base environment ubuntu22-04-python3-10-cuda11-7-1, which contains Python 3.10 and CUDA 11.7.1 and is based on an Ubuntu 22.04 base image. Using the available environments with CUDA drivers is encouraged, but not required. Alternatively, a ubiops.yaml file can be used to install a custom version of CUDA on an environment without CUDA installed, such as python3-10. Using a base environment with CUDA pre-installed typically reduces build time and improves cold start time compared to using ubiops.yaml.
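As a sketch of how such an environment can be selected programmatically, the snippet below creates a deployment version with this CUDA base environment through the UbiOps Python client. It assumes the DeploymentVersionCreate object accepts an environment field, which may differ per client version; the project and deployment names are placeholders.

import ubiops

configuration = ubiops.Configuration(api_key={"Authorization": "Token <YOUR_API_TOKEN>"})
api = ubiops.CoreApi(ubiops.ApiClient(configuration))

# Create a version that runs on the Ubuntu 22.04 + Python 3.10 + CUDA 11.7.1
# base environment.
api.deployment_versions_create(
    project_name="my-project",            # placeholder
    deployment_name="my-gpu-deployment",  # placeholder
    data=ubiops.DeploymentVersionCreate(
        version="v1",                     # placeholder
        environment="ubuntu22-04-python3-10-cuda11-7-1",
    ),
)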

GPU utilization

You can check whether your GPU is being utilized by inspecting the logs. The peak and average GPU utilization of a run are logged after each run. In case of a multi-GPU set-up, the scores are printed for all GPUs.
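If you want to inspect utilization yourself from inside your deployment code, one option is to query nvidia-smi, assuming it is available on the instance (as is typical when a GPU is attached). A minimal sketch:

import subprocess

def log_gpu_utilization():
    # Query utilization and memory usage for all visible GPUs.
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,utilization.gpu,memory.used,memory.total",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    # One line per GPU, e.g. "0, 87 %, 10240 MiB, 16384 MiB".
    for line in result.stdout.strip().splitlines():
        print(f"GPU status: {line}")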

Here we will highlight how to run two of the most popular machine learning libraries on GPUs.

TensorFlow and CUDA

TensorFlow has been compiled against CUDA by default from tensorflow==2.0.0 onwards. Therefore, choosing an environment with CUDA installed and adding tensorflow to your requirements.txt is sufficient to allow your models and tensors to be loaded onto the GPU. However, not all TensorFlow versions are compatible with all CUDA versions. See the TensorFlow CUDA compatibility matrix to find out which TensorFlow version is compatible with which CUDA versions.

For example, TensorFlow version 2.11.0 requires CUDA 11.2 and Python 3.7-3.11. To get this to work on UbiOps, you can select the environment 'Python 3.9 + CUDA 11.2.2', which has the tag python3-9-cuda11-2-2.
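To confirm at runtime that TensorFlow actually sees the GPU, you can list the physical GPU devices when your deployment initializes. A minimal sketch of what this could look like in a deployment.py:

import tensorflow as tf

class Deployment:
    def __init__(self, base_directory, context):
        # List the GPUs TensorFlow can see; an empty list means
        # TensorFlow will silently fall back to the CPU.
        gpus = tf.config.list_physical_devices("GPU")
        print(f"TensorFlow detected {len(gpus)} GPU(s): {gpus}")

    def request(self, data):
        # Operations executed here are placed on the GPU automatically
        # when one is available.
        return data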

PyTorch and CUDA

PyTorch has not been compiled against CUDA by default. You are required to explicitly install a PyTorch version that has been compiled against CUDA. The list of available CUDA-compiled PyTorch versions can be found here. Following the example code from the PyTorch documentation, we can install a CUDA-compiled version of PyTorch by updating your deployment package in two steps:

Add the pip repository with CUDA-compiled versions of PyTorch to your pip index by adding the following line to your ubiops.yaml:

environment_variables:
- PIP_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cu117

Then, instruct your deployment to install a package from this pip repository by adding a specific CUDA-compiled version of PyTorch to your requirements.txt. For example:

torch==1.13.0+cu117

This torch version is compatible with the environment 'Ubuntu 22.04 + Python 3.10 + CUDA 11.7.1', which has the tag ubuntu22-04-python3-10-cuda11-7-1.
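Once the build completes, you can verify inside your deployment that the CUDA-compiled build was installed and move your model to the GPU. A minimal sketch using the standard torch API (the linear layer is just a placeholder model):

import torch

# Verify that the CUDA-compiled build was installed and a GPU is visible.
print(f"torch version: {torch.__version__}")   # e.g. 1.13.0+cu117
print(f"CUDA available: {torch.cuda.is_available()}")

# Select the GPU when available, falling back to CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move a placeholder model and its inputs to the selected device.
model = torch.nn.Linear(16, 4).to(device)
inputs = torch.randn(1, 16, device=device)
outputs = model(inputs)
print(f"Output computed on: {outputs.device}")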

Tips and Tricks

Most frameworks offer possibilities to check whether the framework has access to your GPU. For PyTorch, you can run torch.cuda.is_available(); for TensorFlow, you can run tf.config.list_physical_devices('GPU') to see if the framework detects a GPU (see the examples above).

We have a variety of combinations of Python and CUDA versions available. In case you require a CUDA version that we do not offer, you can use the ubiops.yaml to instruct the installation of the CUDA version of your choice. See this how-to on how to install your custom CUDA version.
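As an illustration of the mechanism only (the exact repositories and package names depend on the CUDA version you need, so treat this as a hypothetical sketch rather than the how-to's exact recipe), a ubiops.yaml on a non-CUDA base environment could look like:

# Hypothetical sketch: install a CUDA toolkit via apt on a non-CUDA environment.
# The package name below is illustrative; see the how-to for the exact
# repositories and packages for your CUDA version.
apt:
  packages:
    - nvidia-cuda-toolkit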