Download model files from your bucket into your deployment¶
In a previous how-to we explained how you can download (model) files from Huggingface and upload them to your UbiOps bucket. This how-to explains how to download those files from your UbiOps storage bucket into your deployment.
Downloading files from your UbiOps storage bucket has some advantages, as opposed to downloading the model's files directly from Huggingface. The first one is that you aren't dependent on Huggingface being up, and their rate limits. Furthermore, you're able to run your model in an air gapped environment, by allowing you to keep the entire solution contained within UbiOps.
To run this workflow on UbiOps, you need to perform the following steps:
- Create environment variables to establish a connection with your UbiOps environment and your deployment
- Download the model files into your deployment from your UbiOps Storage bucket, using the
utils
module.
The code snippets below are all part of the __init__ method of a deployment.py. If you scroll further down, you can find an example of what an entire deployment.py may look like after applying this method.
1. Create the environment variables¶
To download files from our UbiOps Storage bucket, we need to create environment variables from our UbiOps API token, and the name of the bucket where the model's files are stored. We'll also create an environment variable for the directory to the model's files, the value of which is equal to the directory of your bucket in which your model files are located. If you followed along with the uploading howto , the directory you'll need to pass is the one you used for the dir
parameter in your training run:
Click here to see how you can create environment variables using the UbiOps Client Library
First, we define some parameters:API_TOKEN = "<ENTER YOUR UBIOPS API TOKEN HERE>"
PROJECT_NAME = "<ENTER YOUR PROJECT NAME HERE>"
DIR = "<ENTER THE DIRECTORY TO YOUR MODEL FILeS HERE>"
BUCKET_NAME = "<ENTER THE BUCKET NAME HERE>"
DEPLOYMENT_NAME = "<ENTER THE DEPLOYMENT YOU WANT TO USE THE MODEL IN HERE>"
DEPLOYMENT_VERSION = "v1"
import ubiops
import os
# Initialize client library
configuration = ubiops.Configuration(host="https://api.ubiops.com/v2.1")
configuration.api_key["Authorization"] = API_TOKEN
# Establish a connection
client = ubiops.ApiClient(configuration)
api = ubiops.CoreApi(client)
print(api.projects_get(PROJECT_NAME))
# Environment variable for the API token:
api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=DEPLOYMENT_NAME,
version=DEPLOYMENT_VERSION,
data=ubiops.EnvironmentVariableCreate(
name=UBIOPS_API_TOKEN, value=API_TOKEN, secret=True
),
)
# Environment variable for the directory:
api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=DEPLOYMENT_NAME,
version=DEPLOYMENT_VERSION,
data=ubiops.EnvironmentVariableCreate(
name="DIR", value=DIR, secret=False
),
)
# Environment variable for the bucket where the model files are stored:
api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=DEPLOYMENT_NAME,
version=DEPLOYMENT_VERSION,
data=ubiops.EnvironmentVariableCreate(
name="BUCKET_NAME", value=BUCKET_NAME, secret=False
),
)
2. Download the model files into your deployment¶
Now we can start writing the code for downloading the files into your deployment itself. First, we need to read out the model-related environment variables and tokens:
# Read the environment variables created in step 1. A missing variable raises
# KeyError, which makes the deployment fail fast at start-up.
UBIOPS_API_TOKEN=os.environ["UBIOPS_API_TOKEN"]
DIR = os.environ["DIR"]
BUCKET_NAME = os.environ["BUCKET_NAME"]
# The project name is supplied by UbiOps via the deployment's `context` dict.
PROJECT_NAME = context["project"]
# Authenticate against the UbiOps API from inside the running deployment.
configuration = ubiops.Configuration()
configuration.api_key['Authorization'] = UBIOPS_API_TOKEN
configuration.host = "https://api.ubiops.com/v2.1"
api_client = ubiops.ApiClient(configuration)
api = ubiops.CoreApi(api_client)
Now we can start with downloading the model into the deployment:
# List every file stored under the model's directory prefix in the bucket.
model_files = api.files_list(PROJECT_NAME, BUCKET_NAME, prefix = DIR)
print(f"Found the following model files for model {DIR}: {model_files} ")

# Download the files next to the deployment code, inside `base_directory`.
model_dir = os.path.join(base_directory, DIR)
# exist_ok=True: don't crash if the directory survives a warm restart.
os.makedirs(model_dir, exist_ok=True)
print(f"Downloading model files to: {model_dir}")
for model_file in model_files.files:
    ubiops.utils.download_file(api_client, PROJECT_NAME, BUCKET_NAME, file_name = model_file.file, output_path=model_dir, stream = True)

# Log what actually landed on disk, to ease debugging of partial downloads.
downloaded_files = os.listdir(model_dir)
print("Files in model_dir:")
for file_name in downloaded_files:
    print(file_name)
# BUG FIX: the original print was missing the f-prefix, so it printed the
# literal text "{model_dir}" instead of the actual path.
print(f"Loading model & tokenizer from {model_dir}")
Then we load the model and tokenizer, using the AutoModelForCausalLM.from_pretrained and AutoTokenizer.from_pretrained functions from the transformers library:

self.tokenizer = AutoTokenizer.from_pretrained(model_dir,
device_map='auto'
)
self.model = AutoModelForCausalLM.from_pretrained(model_dir,
torch_dtype=torch.float16,
device_map='auto',
use_safetensors=True
)
Below you can see an example of all the code snippets above integrated into the __init__
statement of a deployment.
Click here to see an example of a deployment.py
# Standard library
import os
import shutil

# Third-party
import torch
import ubiops
# BUG FIX: the original line ended in a trailing comma after `pipeline`,
# which is a SyntaxError in an unparenthesized `from ... import` statement.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
class Deployment:
    def __init__(self, base_directory, context):
        """
        Initialisation method for the deployment. Any code inside this method
        will execute when the deployment starts up. Here it downloads the model
        files from the UbiOps bucket and loads the model, tokenizer and
        generation pipeline into memory.

        :param str base_directory: path to the deployment code directory
        :param dict context: deployment metadata supplied by UbiOps
            (the project name is read from the "project" key)
        """
        print("Initialising deployment")

        # Read out model-related environment variables and tokens
        UBIOPS_API_TOKEN = os.environ["UBIOPS_API_TOKEN"]
        DIR = os.environ["DIR"]
        BUCKET_NAME = os.environ["BUCKET_NAME"]
        PROJECT_NAME = context["project"]
        self.REPETITION_PENALTY = float(os.environ.get('REPETITION_PENALTY', 1.15))
        # BUG FIX: max_new_tokens must be an int; the original cast this to
        # float, which transformers' generate() rejects. The inner float()
        # keeps values like "256.0" in the env var working.
        self.MAX_RESPONSE_LENGTH = int(float(os.environ.get('MAX_RESPONSE_LENGTH', 256)))

        # Connect to the UbiOps API
        configuration = ubiops.Configuration()
        configuration.api_key['Authorization'] = UBIOPS_API_TOKEN
        configuration.host = "https://api.ubiops.com/v2.1"
        api_client = ubiops.ApiClient(configuration)
        api = ubiops.CoreApi(api_client)

        # Download model files from bucket
        model_files = api.files_list(PROJECT_NAME, BUCKET_NAME, prefix=DIR)
        print(f"Found the following model files for model {DIR}: {model_files} ")
        model_dir = os.path.join(base_directory, DIR)
        # exist_ok=True: don't crash if the directory survives a warm restart.
        os.makedirs(model_dir, exist_ok=True)
        print(f"Downloading model files to: {model_dir}")
        for model_file in model_files.files:
            ubiops.utils.download_file(
                api_client,
                PROJECT_NAME,
                BUCKET_NAME,
                file_name=model_file.file,
                output_path=model_dir,
                stream=True,
            )
        # Log what landed on disk, to ease debugging of partial downloads.
        print("Files in model_dir:")
        for file_name in os.listdir(model_dir):
            print(file_name)

        # BUG FIX: the original print was missing the f-prefix, so it printed
        # the literal text "{model_dir}" instead of the actual path.
        print(f"Loading model & tokenizer from {model_dir}")
        self.tokenizer = AutoTokenizer.from_pretrained(model_dir,
                                                       device_map='auto'
                                                       )
        self.model = AutoModelForCausalLM.from_pretrained(model_dir,
                                                          torch_dtype=torch.float16,
                                                          device_map='auto',
                                                          use_safetensors=True
                                                          )
        self.pipe = pipeline(
            os.environ.get("PIPELINE_TASK", "text-generation"),
            model=self.model,
            tokenizer=self.tokenizer,
            return_full_text=False,
        )
        print("Model loaded")

        # Set default prompt generation variables
        self.messages = [
            {"role": "system", "content": "{system_prompt}"},
            {"role": "user", "content": "{user_prompt}"},
        ]
        # Stop generation on the model's EOS token or on Llama-3-style
        # end-of-turn token <|eot_id|>.
        self.terminators = [
            self.pipe.tokenizer.eos_token_id,
            self.pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>")
        ]
        self.default_config = {
            'do_sample': True,
            'max_new_tokens': self.MAX_RESPONSE_LENGTH,
            'temperature': 0.6,
        }

    def request(self, data):
        """
        Method for deployment requests, called separately for each individual request.

        :param dict data: the request input
        :return dict: the request output (empty placeholder in this template)
        """
        print("Processing request")
        # Here we set our output parameters in the form of a JSON
        return {}