
Manage files within your deployment

This how-to shows how to set up a deployment so that it checks whether a file exists in a storage bucket, and downloads it from its original source only if necessary (read more below). This is useful because the local storage of a deployment is not persistent: it is lost when the instance shuts down. If you download a model from an online repository, you would otherwise need to download it again every time an instance boots up. Downloading files from your bucket can also be faster than downloading them from an online repository.

UbiOps supports the use of buckets to organize your files. You can connect UbiOps to your own storage bucket or create a storage bucket directly on UbiOps. UbiOps provides a default bucket, which is accessible to every project member, deployment or pipeline in the project with read and write access (keep in mind that service users need to be granted permissions to use this bucket). Buckets have an automatic deletion policy, which sets the time after which files are deleted.

The code snippet below shows the initialization method of a deployment that downloads and runs a model from Hugging Face, along with a tokenizer. The code does two things:

  1. Checks whether or not a file is in your storage bucket, which is useful if you have automatic deletion enabled.
  2. Downloads the file from Hugging Face and uploads it to your bucket if it is not there yet.

The initialization method runs only when the deployment instance starts up and can be used, for example, to download or load a model. This way the model does not have to be downloaded every time you want to run it.
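
To illustrate this, here is a minimal sketch of a deployment class. The `load_model` function and the `LOAD_COUNT` counter are hypothetical stand-ins for an expensive download and are not part of UbiOps; only the `__init__(self, base_directory, context)` and `request(self, data)` methods follow the structure of a UbiOps deployment.

```python
# Hypothetical stand-in for an expensive model download; counts its calls.
LOAD_COUNT = 0


def load_model():
    global LOAD_COUNT
    LOAD_COUNT += 1
    # A trivial "model" that upper-cases its input.
    return str.upper


class Deployment:

    def __init__(self, base_directory, context):
        # Runs once, when the deployment instance starts up.
        self.model = load_model()

    def request(self, data):
        # Runs for every request and reuses the model loaded in __init__.
        return {"output": self.model(data["input"])}


deployment = Deployment(base_directory=".", context={})
results = [deployment.request({"input": text}) for text in ("a", "b", "c")]
```

Three requests are handled here, but `load_model` is called only once, when the instance is created.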

The first thing to do is initialize the UbiOps client library:

import os
import shutil

import ubiops
from transformers import AutoTokenizer, BertForMaskedLM


class Deployment:

    def __init__(self, base_directory, context):
        """
        Initialisation method for the deployment. It can for example be used for loading modules that have to be kept in
        memory or setting up connections. Load your external deployment files (such as pickles or .h5 files) here.
        :param str base_directory: absolute path to the directory where the deployment.py file is located
        :param dict context: a dictionary containing details of the deployment that might be useful in your code.
        """

        # Initialize the UbiOps client library
        configuration = ubiops.Configuration(host="https://api.ubiops.com/v2.1")
        configuration.api_key['Authorization'] = os.environ['UBIOPS_API_TOKEN']
        client = ubiops.ApiClient(configuration)
        api_client = ubiops.CoreApi(client)
        project_name = context["project"]

        tok_fn = "bert-base-uncased-tok"
        model_fn = "bert-base-uncased-model"
After that we can check whether the file that we need, in this case a tokenizer, already exists in the bucket by trying to download it:
        try:
            ubiops.utils.download_file(
                client,
                project_name,
                bucket_name="default", 
                file_name=f"{tok_fn}.zip",
                output_path=".",
                stream=True,
                chunk_size=8192
            )

            shutil.unpack_archive(f"{tok_fn}.zip", f"./{tok_fn}", 'zip')
            print("Token file loaded from object storage")
            self.tokenizer = AutoTokenizer.from_pretrained(f"./{tok_fn}")

        except Exception as e:
            print(e)
            print("Tokenizer does not exist. Downloading from Hugging Face")
Now we need to tell UbiOps what to do when the file does not exist. In this case we download it from Hugging Face and store it in our default bucket:

            self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

            self.tokenizer.save_pretrained(f"./{tok_fn}")
            tok_dir = shutil.make_archive(tok_fn, 'zip', tok_fn)
            ubiops.utils.upload_file(client, project_name, f"{tok_fn}.zip", 'default')
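
The archive step relies on how `shutil.make_archive` and `shutil.unpack_archive` lay out their files. The following standalone sketch shows the round trip using only the standard library. The temporary directory and the `vocab.txt` file with its contents are stand-ins for the directory written by `save_pretrained`, chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path

# Work in a throwaway directory so the sketch does not touch your project files.
workdir = Path(tempfile.mkdtemp())

# Stand-in for the directory written by save_pretrained().
source_dir = workdir / "bert-base-uncased-tok"
source_dir.mkdir()
(source_dir / "vocab.txt").write_text("[PAD]\n[UNK]\n")

# make_archive(base_name, format, root_dir) appends ".zip" to base_name and
# stores paths relative to root_dir, so the archive has no top-level folder.
archive_path = shutil.make_archive(str(source_dir), "zip", root_dir=str(source_dir))

# Because of that, the target directory has to be named when unpacking.
restored_dir = workdir / "restored"
shutil.unpack_archive(archive_path, str(restored_dir), "zip")

restored_files = sorted(p.name for p in restored_dir.iterdir())
```

This is why the deployment code unpacks `f"{tok_fn}.zip"` into `f"./{tok_fn}"` rather than into the current directory: the files inside the archive are stored without their parent folder.
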
Below you can find the same process but for a model instead of a tokenizer:
        try:
            ubiops.utils.download_file(
                client,
                project_name,
                bucket_name='default', 
                file_name=f"{model_fn}.zip",
                output_path='.',
                stream=True,
                chunk_size=8192
            )

            shutil.unpack_archive(f"{model_fn}.zip", f"./{model_fn}", 'zip')
            print("Model file loaded from object storage")
            self.model = BertForMaskedLM.from_pretrained(f"./{model_fn}")

        except Exception as e:
            print(e)
            print("Model does not exist. Downloading from Hugging Face")

            self.model = BertForMaskedLM.from_pretrained("bert-base-uncased")
            self.model.save_pretrained(f"./{model_fn}")

            print("Storing model on UbiOps")
            model_dir = shutil.make_archive(model_fn, 'zip', model_fn)
            ubiops.utils.upload_file(client, project_name, f"{model_fn}.zip", "default")
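
The tokenizer and model blocks above follow the same pattern: try the bucket first, fall back to the original source, then store the result in the bucket. As a rough sketch, that pattern could be factored into a single helper. Everything in this example is hypothetical and for illustration only: the helper name `load_with_bucket_cache`, its four callables, and the in-memory `bucket` and `local` objects that stand in for the storage bucket and the local disk. In a real deployment the callables would wrap `ubiops.utils.download_file`, `from_pretrained`, `save_pretrained` and `ubiops.utils.upload_file`.

```python
def load_with_bucket_cache(name, fetch_from_bucket, fetch_from_source,
                           store_in_bucket, load_local):
    """Try the bucket first; fall back to the source and cache the result.

    All four callables are hypothetical hooks supplied by the caller.
    """
    try:
        fetch_from_bucket(name)
        origin = "bucket"
    except Exception as error:
        print(f"{name} not loaded from bucket ({error}); downloading from source")
        fetch_from_source(name)
        store_in_bucket(name)
        origin = "source"
    return load_local(name), origin


# In-memory stand-ins for the storage bucket and the local disk.
bucket = set()
local = {}


def fetch_from_bucket(name):
    if name not in bucket:
        raise FileNotFoundError(name)
    local[name] = f"{name} (from bucket)"


def fetch_from_source(name):
    local[name] = f"{name} (from source)"


def store_in_bucket(name):
    bucket.add(name)


hooks = dict(
    fetch_from_bucket=fetch_from_bucket,
    fetch_from_source=fetch_from_source,
    store_in_bucket=store_in_bucket,
    load_local=local.get,
)

# The first call misses the bucket and falls back to the source.
first = load_with_bucket_cache("tok", **hooks)
# The second call finds the file that the first call stored.
second = load_with_bucket_cache("tok", **hooks)
```

Like the code above, this sketch treats any exception as "file not in bucket". In practice you may prefer to catch a narrower exception, so that unrelated failures such as a corrupt archive are not hidden behind a fresh download.
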
It is also possible to store files permanently in UbiOps, and to connect UbiOps to an existing storage bucket that you are already using.