Blob (file) handling

A blob is any type of file that can be used in deployments and pipelines, either provided as input or produced as output. It can be used to handle images, data files or other file objects in requests.

Managing blobs

Blobs are managed on a project basis and identified by a unique blob ID. This means that a single blob file can be re-used in multiple requests in the same project, or used as intermediate storage for passing files between deployments in a pipeline.

Blobs are uploaded, listed and downloaded in UbiOps using an API request, either directly or using the CLI or Python client library. When making a request using the WebApp, the blob can be uploaded while preparing the request.

Using blobs in deployments

Blobs are available in UbiOps as a structured data type. To use blobs as input, a deployment must therefore be defined with structured input. Similarly, to output blobs, a deployment must be defined with structured output.

You can select the blob data type for in- or output fields when defining the deployment. For each blob field, one blob can be used per request.

When making a request, blobs are passed to the blob fields of a deployment by passing the unique ID of an existing blob. This means that prior to making the request either a new blob needs to be uploaded, or an already existing blob should be selected.

When using the WebApp, uploading the blob and passing its ID in the request are handled automatically in the background. When using the API, the blob is uploaded with a separate API request before the deployment request is made, for example with the UbiOps Python client library:

# core_api is an authenticated UbiOps API client instance
response = core_api.blobs_create(project_name='your-project', file='test.txt')
blob_id = response.id
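
Putting the upload and the request together, the flow might look like the sketch below. It assumes `core_api` is an authenticated client and that deployment requests are created with `deployment_requests_create`, a call that is not shown on this page, so check the client reference for its exact signature. `run_with_blob`, `FakeCoreApi`, and the deployment name are hypothetical and exist only so the sketch runs offline.

```python
from types import SimpleNamespace


def run_with_blob(core_api, project_name, deployment_name, file_path, field="blob_1"):
    """Upload a file as a blob, then pass its ID to a deployment request."""
    blob = core_api.blobs_create(project_name=project_name, file=file_path)
    # The blob field receives the blob ID, not the file contents.
    # Assumption: deployment_requests_create accepts these keyword arguments.
    return core_api.deployment_requests_create(
        project_name=project_name,
        deployment_name=deployment_name,
        data={field: blob.id},
    )


class FakeCoreApi:
    """Hypothetical stand-in for the real client, used only for a dry run."""

    def blobs_create(self, project_name, file):
        return SimpleNamespace(id="ce59da5a-a667-4961-a4ee-59a2914d4741")

    def deployment_requests_create(self, project_name, deployment_name, data):
        # Echo the request data back so the caller can inspect what was sent.
        return SimpleNamespace(data=data)


response = run_with_blob(FakeCoreApi(), "your-project", "your-deployment", "test.txt")
print(response.data)
```

With a real client, the same function would be called with the authenticated `core_api` object instead of the fake one.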

Blobs are also retrieved by their blob ID. In the WebApp, a download button is available after a request completes. In code, a blob is downloaded through the API, for example with the UbiOps Python client library:

blob_id = 'c3af3e5d-ae06-4b00-8cf9-c96439a4f16d'
with core_api.blobs_get(project_name='your-project', blob_id=blob_id) as response:
    filename = response.getfilename()
    with open(filename, 'wb') as f:
        f.write(response.read())

See platform limits for limitations on the usage of blobs.

Handling blob input in a deployment

UbiOps makes sure that input blobs given in the request are automatically made available on the file system of the deployment during the request.

The blob ID initially passed while creating the request is replaced by the absolute path where the blob can be accessed.

For example, for a request made with the following request data containing two blobs identified by their blob IDs:

{
    "blob_1": "ce59da5a-a667-4961-a4ee-59a2914d4741",
    "blob_2": "1b3bd7a3-f3d8-4d38-b330-de497dd5edb8"
}

The data object available inside the request method of the deployment class would look as follows:

{
    "blob_1": "/home/deployment/blobs/ce59da5a-a667-4961-a4ee-59a2914d4741/original-filename-1.png",
    "blob_2": "/home/deployment/blobs/1b3bd7a3-f3d8-4d38-b330-de497dd5edb8/original-filename-2.png"
}
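
If the deployment needs the blob ID or the original filename, both can be read back from the path. The sketch below assumes the directory layout shown in the example above (`.../blobs/<blob ID>/<original filename>`) and is not a documented guarantee; `split_blob_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def split_blob_path(path):
    """Return (blob_id, filename) from a blob path.

    Assumes the layout .../blobs/<blob ID>/<original filename>.
    """
    p = PurePosixPath(path)
    return p.parent.name, p.name


blob_id, filename = split_blob_path(
    "/home/deployment/blobs/ce59da5a-a667-4961-a4ee-59a2914d4741/original-filename-1.png"
)
print(blob_id, filename)
```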

The file can be used as a normal file in the request method of the deployment, for example, in Python:

class Deployment:
    def request(self, data):
        with open(data['blob_1'], "rb") as f:
            # Read the blob contents; process them as needed
            contents = f.read()
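
Because the deployment only ever sees a file path, the request method can be tried out locally by writing a file to disk and passing its path in place of the blob. The sketch below does this with a temporary directory. The output fields `filename` and `size` are hypothetical and would need to match the output fields defined for a real deployment.

```python
import os
import tempfile


class Deployment:
    def request(self, data):
        # data["blob_1"] is an absolute file path, not a blob ID.
        path = data["blob_1"]
        with open(path, "rb") as f:
            contents = f.read()
        return {"filename": os.path.basename(path), "size": len(contents)}


# Local stand-in for the platform: write a file and pass its path as the blob field.
with tempfile.TemporaryDirectory() as blob_dir:
    blob_path = os.path.join(blob_dir, "original-filename-1.png")
    with open(blob_path, "wb") as f:
        f.write(b"\x89PNG fake image bytes")
    result = Deployment().request({"blob_1": blob_path})

print(result)
```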

Blob example

For another example of using a blob inside a deployment, see the MNIST quickstart example.

Handling blob output in a deployment

For blob outputs in a deployment, UbiOps uploads the file to the project's blob storage and returns a blob ID in the response of the request.

For fields of type blob, the request method should return the absolute path to a file. For example, in Python, for a deployment with a blob output field output_blob:

class Deployment:
    def request(self, data):
        with open("/tmp/file.txt", "w") as f:
            f.write("test")

        return {
            "output_blob": "/tmp/file.txt"
        }

The processed response of the request will then look as follows:

{
    "output_blob": "c05edd1f-c03e-4ae2-89b0-4fc851c92e95"
}

The original file.txt file can then be retrieved by downloading the blob with ID c05edd1f-c03e-4ae2-89b0-4fc851c92e95.
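
Blob input and blob output can be combined in one deployment, which is the pattern used when files are passed between deployments in a pipeline. The sketch below is a minimal illustration that can be run locally. The field names `input_blob` and `output_blob` and the upper-casing step are assumptions chosen for the example.

```python
import os
import tempfile


class Deployment:
    def request(self, data):
        # The input blob arrives as an absolute path to a local file.
        with open(data["input_blob"], "r") as f:
            text = f.read()

        # Write the result to a fresh temporary directory.
        out_path = os.path.join(tempfile.mkdtemp(), "upper.txt")
        with open(out_path, "w") as f:
            f.write(text.upper())

        # Return the absolute path; on the platform it is replaced by a blob ID.
        return {"output_blob": out_path}


# Local dry run with a file standing in for the input blob.
in_path = os.path.join(tempfile.mkdtemp(), "input.txt")
with open(in_path, "w") as f:
    f.write("test")

result = Deployment().request({"input_blob": in_path})
with open(result["output_blob"]) as f:
    output_text = f.read()
print(output_text)
```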

Blob time-to-live (TTL)

By default, blobs remain available for 72 hours after uploading and are then deleted automatically. During that time, a blob can be re-used or downloaded an unlimited number of times.

You can customize the time a blob remains available by specifying the time-to-live (TTL) for the blob. This is done by defining the blob-ttl header in the API request when uploading the blob. This header indicates the time in seconds for the blob to remain available after uploading.

The minimum time-to-live value is 900 seconds (15 minutes), while the maximum is 259200 seconds (3 days).
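
A TTL value can be checked against these limits before uploading. The helper below is a hypothetical sketch, not part of the UbiOps client: it validates the value and builds the blob-ttl header as a dictionary. How the header is attached to the upload request depends on the HTTP client or library in use.

```python
MIN_BLOB_TTL = 900       # 15 minutes, in seconds
MAX_BLOB_TTL = 259200    # 3 days, in seconds


def blob_ttl_header(ttl_seconds):
    """Return the blob-ttl header for an upload, or raise if out of range."""
    if not MIN_BLOB_TTL <= ttl_seconds <= MAX_BLOB_TTL:
        raise ValueError(
            f"blob-ttl must be between {MIN_BLOB_TTL} and {MAX_BLOB_TTL} seconds, "
            f"got {ttl_seconds}"
        )
    # Header values are sent as strings.
    return {"blob-ttl": str(ttl_seconds)}


headers = blob_ttl_header(24 * 60 * 60)  # keep the blob for one day
print(headers)
```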

If desired, a blob can be deleted at any moment before the time-to-live expires using the API.

Managing access to blobs

Users and service accounts can be given permissions to blobs on a project level. The following default roles are available:

  • blob-viewer - can list and download blobs.
  • blob-editor - can list, download, and upload blobs.
  • blob-admin - can list, download, upload, and delete blobs.

These permissions can only be granted to a user for the entire project, not per individual blob.
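
The three default roles can be summarized as a lookup from role to allowed actions. The mapping below restates the list above as data. The `is_allowed` helper and the action names are hypothetical and are not part of the UbiOps API.

```python
# Actions granted by each default blob role, as listed above.
ROLE_ACTIONS = {
    "blob-viewer": {"list", "download"},
    "blob-editor": {"list", "download", "upload"},
    "blob-admin": {"list", "download", "upload", "delete"},
}


def is_allowed(role, action):
    """Return True if the given default role includes the action."""
    return action in ROLE_ACTIONS.get(role, set())


print(is_allowed("blob-editor", "upload"))
print(is_allowed("blob-editor", "delete"))
```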

Blob access privileges

Because blob permissions can only be granted at the project level, they apply to every blob in the project. This includes blobs used in deployments or pipelines for which the user has no direct permission.

Learn more about permissions in UbiOps on Permissions and roles.