Using files in deployments

UbiOps supports files as input or output of both deployments and pipelines. When defining a deployment, you can select the file or array of files data type for its input and output fields. Both data types can be used to handle images, data files or other file objects in requests.

Passing files in a request

When making a request, you pass a file to a file field of a deployment by providing the file URI of an existing file. Before making the request, you therefore need to either upload a new file or select a file that already exists. The file URI should be formatted as ubiops-file://{bucket-name}/{file_name}.
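As a quick illustration of the URI format above, the following sketch builds and splits file URIs. The helper names make_file_uri and split_file_uri are hypothetical and not part of the UbiOps client library; they only encode the ubiops-file://{bucket-name}/{file_name} pattern described here.

```python
from typing import Tuple

FILE_URI_PREFIX = "ubiops-file://"


def make_file_uri(bucket_name: str, file_name: str) -> str:
    """Build a file URI from a bucket name and a file name (hypothetical helper)."""
    return f"{FILE_URI_PREFIX}{bucket_name}/{file_name}"


def split_file_uri(file_uri: str) -> Tuple[str, str]:
    """Split a file URI into (bucket_name, file_name) (hypothetical helper)."""
    if not file_uri.startswith(FILE_URI_PREFIX):
        raise ValueError(f"Not a UbiOps file URI: {file_uri!r}")
    # Everything up to the first slash is the bucket; the rest is the file name,
    # which may itself contain slashes (folders inside the bucket).
    bucket_name, _, file_name = file_uri[len(FILE_URI_PREFIX):].partition("/")
    if not bucket_name or not file_name:
        raise ValueError(f"Missing bucket or file name in: {file_uri!r}")
    return bucket_name, file_name


print(make_file_uri("default", "file_example.png"))
print(split_file_uri("ubiops-file://example-bucket/request-output/file.txt"))
```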

Upload file

When using the API, CLI or client library, you need to upload the file with one or more separate API requests before making the deployment request.
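A minimal sketch of the upload step with the Python client library. It relies on the client library's ubiops.utils.upload_file helper, which is assumed here to upload a local file and return its file URI; check the argument names against your installed client version. The wrapper name upload_input_file is hypothetical.

```python
import os


def upload_input_file(client, project_name, file_path, bucket_name="default"):
    """Upload a local file and return its file URI (hypothetical wrapper).

    `client` is assumed to be an authenticated ubiops.ApiClient.
    """
    # Imported lazily so this sketch can be loaded without the client library installed.
    import ubiops

    # Assumption: upload_file returns a URI such as 'ubiops-file://default/<file_name>'.
    return ubiops.utils.upload_file(
        client=client,
        project_name=project_name,
        file_path=file_path,
        bucket_name=bucket_name,
        file_name=os.path.basename(file_path),
    )
```

The returned value is the file URI to place in the request data, for example file_uri = upload_input_file(client, project_name, 'path/to/file.png').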

Define request data with file

If you want to use a file as request data, you need to define it as follows:

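# file_uri is the URI of an uploaded file, e.g. 'ubiops-file://default/file_example.png'
data = {'file_input': file_uri}

# api_client is assumed to be a configured ubiops.CoreApi instance
api_client.deployment_version_requests_create(
    project_name=project_name,
    deployment_name='file-example-deployment',
    version='v1',
    data=data
)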

Handling file input in the deployment code

UbiOps makes sure that input files given in the request are automatically made available on the file system of the deployment during the request. The file URI initially passed while creating the request is replaced by the absolute path where the file can be accessed.

For example, consider a request made with the following request data, which contains a file identified by a file URI:

{
    "file": "ubiops-file://default/file_example.png"
}

When a file is passed as input to a deployment, UbiOps adds the file to the local storage of the deployment. The file URI in the data dictionary is replaced by the local file path, so the file can be opened like any other file in the deployment code. For example:

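class Deployment:
    def request(self, data):
        # data['file'] holds the absolute local path, not the original file URI
        with open(data['file'], "rb") as f:
            content = f.read()  # Process the file contents here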

Full file example

For a full example of using a file inside a deployment, see the MNIST quickstart example.

Handling file output in the deployment code

For file outputs in a deployment, UbiOps uploads the file to the project file storage and returns a file URI in the response of the request.

For fields of type file, the request method should return the path to the file. For example, in Python, for a deployment with a file output field output_file:

class Deployment:
    def request(self, data):
        with open("/tmp/file.txt", "w") as f:
            f.write("test")

        return {
            "output_file": "/tmp/file.txt"
        }

The snippet above stores the output file in the default bucket. To store it in a different bucket, return a dictionary for the field that contains the bucket name as well as the file path. You can also pass the optional bucket_file parameter, which specifies exactly where to store the file in your bucket. This is useful when you want to store the output in a specific folder.

class Deployment:
    def request(self, data):
        with open("/tmp/file.txt", "w") as f:
            f.write("test")

        return {
            "output_file": {
                "file": "/tmp/file.txt",
                "bucket": "example-bucket",
                "bucket_file": "request-output/file.txt"
            }
        }

The processed response of the request will then look as follows:

{
    "output_file": "ubiops-file://example-bucket/request-output/file.txt"
}

Using a different bucket as default

It is possible to set a different bucket as the default for a deployment (version). You can do so by setting the SYS_DEFAULT_BUCKET environment variable to the name of the desired bucket. The deployment will then use this bucket instead of default when no bucket name is specified in the code.

You can also have an array of files as output, like in the code snippet below:

class Deployment:
    def request(self, data):
        # Local path of the input file, reused for both outputs
        input_path = data["input"]

        output_1 = {
            "file": input_path,
            "bucket_file": "file/path/1.txt"
        }
        output_2 = {
            "file": input_path,
            "bucket_file": "file/path/2.txt"
        }
        return {"output": [output_1, output_2]}
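When the number of output files varies, the list can be built in a loop. The sketch below is one way to do that, assuming an array of files output field named output as in the snippet above. The file_output helper is hypothetical and only assembles the dictionary form described earlier, leaving out bucket and bucket_file when they are not given.

```python
import os
import tempfile


def file_output(path, bucket=None, bucket_file=None):
    """Build the dictionary form of a file output (hypothetical helper)."""
    output = {"file": path}
    if bucket is not None:
        output["bucket"] = bucket
    if bucket_file is not None:
        output["bucket_file"] = bucket_file
    return output


class Deployment:
    def request(self, data):
        outputs = []
        # Scratch directory for this sketch; any writable location works.
        work_dir = tempfile.mkdtemp()
        for index, text in enumerate(["first", "second"], start=1):
            path = os.path.join(work_dir, f"{index}.txt")
            with open(path, "w") as f:
                f.write(text)
            # One dictionary per file, each with its own destination in the bucket
            outputs.append(file_output(path, bucket_file=f"file/path/{index}.txt"))
        return {"output": outputs}
```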

Files in a bucket can be listed with the Python client library. The two examples below differ only in how the client is authenticated.

Environment variables

import ubiops

core_api = ubiops.CoreApi()

project_name = 'project_name_example' # str
bucket_name = 'bucket_name_example' # str
prefix = 'prefix_example' # str (optional, prefix to filter files)
delimiter = 'delimiter_example' # str (optional, delimiter used with prefix to emulate hierarchy to filter files)
continuation_token = 'continuation_token_example' # str (optional, a token that indicates the start point of the returned files)
limit = 56 # int (optional, the maximum number of files returned, default is 100)

# List files
api_response = core_api.files_list(
    project_name,
    bucket_name,
    prefix=prefix,
    delimiter=delimiter,
    continuation_token=continuation_token,
    limit=limit
)
print(api_response)

# Close the connection
core_api.api_client.close()

Authorization parameters

import ubiops

configuration = ubiops.Configuration()
# Configure API token authorization
configuration.api_key['Authorization'] = "Token <YOUR_API_TOKEN>"
# Defining the host is optional; it defaults to "https://api.ubiops.com/v2.1"
configuration.host = "https://api.ubiops.com/v2.1"

api_client = ubiops.ApiClient(configuration)
core_api = ubiops.CoreApi(api_client)

project_name = 'project_name_example' # str
bucket_name = 'bucket_name_example' # str
prefix = 'prefix_example' # str (optional, prefix to filter files)
delimiter = 'delimiter_example' # str (optional, delimiter used with prefix to emulate hierarchy to filter files)
continuation_token = 'continuation_token_example' # str (optional, a token that indicates the start point of the returned files)
limit = 56 # int (optional, the maximum number of files returned, default is 100)

# List files
api_response = core_api.files_list(
    project_name,
    bucket_name,
    prefix=prefix,
    delimiter=delimiter,
    continuation_token=continuation_token,
    limit=limit
)
print(api_response)

# Close the connection
api_client.close()

Ensuring stateless request handling

Files that are generated during the runtime of a request or training run are stored on the disk of the instance that handles it. One instance can handle multiple requests consecutively. To ensure that files generated during a request do not interfere with other requests, store new files in the /home/deployment/files folder. This is the directory where input files are added, and files inside this folder are removed after every request. This also keeps disk usage from building up when a single instance handles multiple requests in a row. The same cleanup of the /home/deployment/files directory applies to training runs.
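A minimal sketch of writing intermediate files to that folder from deployment code. The helper request_file_path and the integer output field n_bytes are hypothetical; the fallback to the system temp directory is only there so the sketch also runs outside a UbiOps instance, where /home/deployment/files may not exist.

```python
import os
import tempfile

# Folder that is cleaned after every request, as described above
FILES_DIR = "/home/deployment/files"


def request_file_path(file_name, files_dir=FILES_DIR):
    """Return a path for a per-request file (hypothetical helper)."""
    usable = os.path.isdir(files_dir) and os.access(files_dir, os.W_OK)
    # Fall back to the system temp directory when the folder is not available
    base_dir = files_dir if usable else tempfile.gettempdir()
    return os.path.join(base_dir, file_name)


class Deployment:
    def request(self, data):
        path = request_file_path("intermediate.txt")
        with open(path, "w") as f:
            f.write("intermediate result")
        # The file is only needed during this request; report its size as output
        return {"n_bytes": os.path.getsize(path)}
```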