Manage files within your deployment
This how-to shows how to set up a deployment so that it checks whether a file exists in a storage bucket and downloads it if necessary (read more below). This is useful because the local storage of your deployment is not persistent: it is lost when the instance shuts down. If you want to use a model from an online repository, you would otherwise need to download it every time you boot up an instance. Downloading files from your bucket can also be faster than downloading them from an online repository.
UbiOps supports the use of buckets to organize your files. You can connect UbiOps to your own storage bucket or create a storage bucket directly on UbiOps. UbiOps provides a default bucket, which is accessible to every project member, deployment or pipeline in the project with read and write access (keep in mind that service users need to be granted permissions to use this bucket). Buckets have an automatic deletion policy, which sets the time after which files are deleted.
The code snippet below shows the initialization method of a deployment that downloads and runs a model from the Hugging Face library, along with a tokenizer. The code does two things:
- Checks whether or not a file is in your storage bucket, which is useful if you have automatic deletion enabled.
- Downloads the file from Hugging Face and uploads it to your bucket if it is not in your bucket yet.
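The two steps above follow a check-then-fetch pattern. As a minimal sketch of that pattern, independent of UbiOps, the helper below takes three callables and uses a plain dictionary as a stand-in for the bucket. The names `load_or_fetch`, `load`, `fetch` and `store` are hypothetical and exist only for this illustration.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_fetch(
    load_from_bucket: Callable[[], T],
    fetch_from_source: Callable[[], T],
    store_in_bucket: Callable[[T], None],
) -> T:
    """Return the artifact from the bucket, or fetch it from the source and store it."""
    try:
        return load_from_bucket()
    except Exception as error:
        print(f"Not found in bucket ({error!r}); fetching from source")
    artifact = fetch_from_source()
    store_in_bucket(artifact)
    return artifact


# Stand-in for a storage bucket (assumption: a dict instead of real object storage)
bucket = {}
fetch_calls = []


def load():
    return bucket["model"]  # raises KeyError while the file is missing


def fetch():
    fetch_calls.append("model")  # record every download from the source
    return "model-weights"


def store(artifact):
    bucket["model"] = artifact


first = load_or_fetch(load, fetch, store)   # bucket is empty: fetches and stores
second = load_or_fetch(load, fetch, store)  # bucket is filled: no second fetch
```

In a deployment, the three callables would wrap the bucket download, the Hugging Face download and the bucket upload respectively.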
The initialization method runs only when the deployment instance starts up and can be used, for example, to (down)load a model. This way the model does not have to be downloaded every time you want to run it.
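To make the difference between start-up and request handling concrete, here is a small local sketch. It assumes the usual deployment layout of a `Deployment` class with an `__init__` and a `request` method. The "model" is a placeholder function, and the loop at the bottom only simulates what the platform does; it is not how requests are actually dispatched.

```python
class Deployment:
    def __init__(self, base_directory, context):
        # Runs once per instance start: do the expensive (down)loading here
        self.load_calls = 0
        self.model = self._load_model()

    def _load_model(self):
        # Placeholder for downloading and loading a real model (assumption)
        self.load_calls += 1
        return lambda text: text.upper()

    def request(self, data):
        # Runs for every request and reuses the model loaded at start-up
        return {"output": self.model(data["input"])}


# Simulate one instance start followed by several requests
deployment = Deployment(base_directory=".", context={"project": "example-project"})
results = [deployment.request({"input": word}) for word in ["a", "b", "c"]]
```

However many requests are handled, the loading step runs only once per instance.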
The first thing to do is initialize the UbiOps client library:
import os
import shutil

import ubiops
from transformers import AutoTokenizer, BertForMaskedLM


class Deployment:

    def __init__(self, base_directory, context):
        """
        Initialisation method for the deployment. It can for example be used for loading modules that have to be kept in
        memory or setting up connections. Load your external deployment files (such as pickles or .h5 files) here.

        :param str base_directory: absolute path to the directory where the deployment.py file is located
        :param dict context: a dictionary containing details of the deployment that might be useful in your code.
        """

        # Initialize UbiOps client library (expects the UBIOPS_API_TOKEN environment variable to be set)
        configuration = ubiops.Configuration(host="https://api.ubiops.com/v2.1")
        configuration.api_key['Authorization'] = os.environ['UBIOPS_API_TOKEN']
        client = ubiops.ApiClient(configuration)
        api_client = ubiops.CoreApi(client)
        project_name = context["project"]

        # Base names used for the zip files in the bucket and for the local directories
        tok_fn = "bert-base-uncased-tok"
        model_fn = "bert-base-uncased-model"

        # Try to load the tokenizer from the default bucket first
        try:
            ubiops.utils.download_file(
                client,
                project_name,
                bucket_name="default",
                file_name=f"{tok_fn}.zip",
                output_path=".",
                stream=True,
                chunk_size=8192
            )
            shutil.unpack_archive(f"{tok_fn}.zip", f"./{tok_fn}", 'zip')
            print("Token file loaded from object storage")
            self.tokenizer = AutoTokenizer.from_pretrained(f"./{tok_fn}")
        except Exception as e:
            print(e)
            print("Tokenizer does not exist. Downloading from Hugging Face")
            # Download the tokenizer, zip it and upload it to the default bucket
            self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
            self.tokenizer.save_pretrained(f"./{tok_fn}")
            tok_dir = shutil.make_archive(tok_fn, 'zip', tok_fn)
            ubiops.utils.upload_file(client, project_name, f"{tok_fn}.zip", 'default')

        # Try to load the model from the default bucket first
        try:
            ubiops.utils.download_file(
                client,
                project_name,
                bucket_name='default',
                file_name=f"{model_fn}.zip",
                output_path='.',
                stream=True,
                chunk_size=8192
            )
            shutil.unpack_archive(f"{model_fn}.zip", f"./{model_fn}", 'zip')
            print("Model file loaded from object storage")
            self.model = BertForMaskedLM.from_pretrained(f"./{model_fn}")
        except Exception as e:
            print(e)
            print("Model does not exist. Downloading from Hugging Face")
            # Download the model, zip it and upload it to the default bucket
            self.model = BertForMaskedLM.from_pretrained("bert-base-uncased")
            self.model.save_pretrained(f"./{model_fn}")
            print("Storing model on UbiOps")
            model_dir = shutil.make_archive(model_fn, 'zip', model_fn)
            ubiops.utils.upload_file(client, project_name, f"{model_fn}.zip", "default")
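The tokenizer and the model are each stored in the bucket as a single zip file, so the code packs a directory before uploading and unpacks it after downloading. This round trip can be tried locally with only the standard library. In the sketch below, the directory `example-model` and its `config.json` are made-up stand-ins for what `save_pretrained` would write, and no upload or download takes place.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    work = Path(workdir)

    # Stand-in for save_pretrained(): a directory holding one model file
    model_dir = work / "example-model"
    model_dir.mkdir()
    (model_dir / "config.json").write_text('{"hidden_size": 8}')

    # Pack the contents of the directory into example-model.zip next to it
    archive_path = shutil.make_archive(
        str(work / "example-model"), "zip", root_dir=str(model_dir)
    )

    # Unpack into a fresh directory, as a deployment would after downloading
    restored_dir = work / "restored"
    shutil.unpack_archive(archive_path, str(restored_dir), "zip")

    archive_name = Path(archive_path).name
    restored_files = sorted(path.name for path in restored_dir.iterdir())
    restored_text = (restored_dir / "config.json").read_text()
```

Because `root_dir` points at the directory itself, the archive holds the files directly rather than a nested `example-model` folder, which is why unpacking into a directory of the same name restores the original layout.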