
MLFlow example


On this page we will show you the following:

How to perform hyperparameter tuning and experiment tracking with MLFlow, and how to deploy the best-performing model to UbiOps. The model used in this tutorial predicts the quality of a wine from its features. It is based on the example from the MLFlow documentation.

If you download and run this entire notebook after filling in your access token, the mlflow deployment will be deployed to your UbiOps environment. You can thus check your environment after running to explore. You can also check the individual steps in this notebook to see what we did exactly and how you can adapt it to your own use case.

We recommend running the cells one by one, as some cells can take a few minutes to finish. You can also run everything in one go; just allow a few minutes for the deployment to build.

Installing the required packages

We will use several packages to create our model and deploy it.

!pip install pandas
!pip install numpy
!pip install scikit-learn
!pip install mlflow
First, download the training script, some training data, and the MLFlow project file, which is used to automate the execution of the training script.

import os 
import requests 

# Create the folder, without failing if it already exists from an earlier run
os.makedirs('wine-model', exist_ok=True)

r = requests.get('https://storage.googleapis.com/ubiops/data/Integration%20with%20other%20tools/mlflow-example/wine-model/MLproject')  
with open('wine-model/MLproject', 'wb') as f:
    f.write(r.content)

r = requests.get('https://storage.googleapis.com/ubiops/data/Integration%20with%20other%20tools/mlflow-example/wine-model/train.py')  
with open('wine-model/train.py', 'wb') as f:
    f.write(r.content)

r = requests.get('https://storage.googleapis.com/ubiops/data/Integration%20with%20other%20tools/mlflow-example/wine-model/wine-quality.csv')  
with open('wine-model/wine-quality.csv', 'wb') as f:
    f.write(r.content)
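
The three downloads above repeat the same steps and do not check whether a request succeeded. As a minimal sketch, assuming you are fine with the standard library instead of requests, the hypothetical helper `download_files` below loops over the file names and lets `urlopen` raise an error when a download fails, so a failed request does not silently write an error page to disk. The names `BASE_URL`, `FILES` and `download_files` are illustrative and not part of the original notebook.

```python
import os
import shutil
import urllib.request

# Same location as the URLs used above, split into a base and file names.
BASE_URL = ('https://storage.googleapis.com/ubiops/data/'
            'Integration%20with%20other%20tools/mlflow-example/wine-model')
FILES = ['MLproject', 'train.py', 'wine-quality.csv']


def download_files(base_url, filenames, target_dir):
    """Download each file from base_url into target_dir and return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for name in filenames:
        destination = os.path.join(target_dir, name)
        # urlopen raises urllib.error.HTTPError or URLError if the download fails
        with urllib.request.urlopen(f'{base_url}/{name}') as response:
            with open(destination, 'wb') as f:
                shutil.copyfileobj(response, f)
        saved.append(destination)
    return saved
```

Calling `download_files(BASE_URL, FILES, 'wine-model')` would fetch the same three files as the cell above.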

Finding the optimal parameters

We can do this in one of two ways:

  • Manually via the command line
  • Programmatically using python

We will use the latter in this example because it can be automated and takes less time.

Testing best parameters

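We can use the mlflow package to run the training script with a list of candidate settings and see which performs best. If a parameter is left out of a dictionary, such as l1_ratio in the last entry, MLFlow uses the default declared for it in the MLproject file, if one is declared.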

parameters = [
    {'alpha': 0.3, 'l1_ratio': 0.1},
    {'alpha': 0.2, 'l1_ratio': 0.7},
    {'alpha': 0.4, 'l1_ratio': 0.2},
    {'alpha': 0.5, 'l1_ratio': 0.7},
    {'alpha': 0.1, 'l1_ratio': 0.9},
    {'alpha': 0.2, 'l1_ratio': 0.2},
    {'alpha': 0.7},
]

model_location = 'wine-model'
import mlflow

for param in parameters:
    print(f'Running with param = {param}')
    res = mlflow.run(model_location, parameters=param, use_conda=False)
    print(f'status={res.get_status()}')
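
The parameter list above is written out by hand. As a minimal sketch, assuming you want to try every combination of a few candidate values instead, the hypothetical helper `build_parameter_grid` below builds such a list with `itertools.product`. The candidate values are illustrative only and are not tuned for this model.

```python
from itertools import product


def build_parameter_grid(alphas, l1_ratios):
    """Return one parameter dictionary per (alpha, l1_ratio) combination."""
    return [
        {'alpha': alpha, 'l1_ratio': l1_ratio}
        for alpha, l1_ratio in product(alphas, l1_ratios)
    ]


# Illustrative candidate values (assumption, not taken from the tutorial)
grid = build_parameter_grid(alphas=[0.1, 0.5], l1_ratios=[0.2, 0.7])
print(len(grid))  # 4 combinations
```

The resulting `grid` has the same shape as `parameters`, so it could be passed through the same loop. Keep in mind that the number of runs grows with the product of the list lengths.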

Comparing the results

Start a terminal session and run the command below in the mlflow-example folder. Then open the MLFlow UI in your browser (by default at http://localhost:5000).

mlflow ui

Comparing runs

Selecting the optimal run

After the runs have finished, you can view each run together with its metrics and compare them to find the best configuration for your use case.

In this example we want the model with the lowest root mean square error (RMSE). The cell below finds the ID of that run and copies the trained model into our deployment folder.

from shutil import copyfile
import pandas as pd
import os

# Read the runs with an RMSE below 1 from MLFlow into a pandas DataFrame
df = mlflow.search_runs(filter_string="metrics.rmse < 1")

# Fetch the run with the lowest RMSE
run = df.loc[df['metrics.rmse'].idxmin()]
run_id = run['run_id']

print(f'The optimal run id is {run_id}')
print(f'It had the parameters: alpha={run["params.alpha"]}, l1_ratio={run["params.l1_ratio"]}')
print(f'And RMSE: {run["metrics.rmse"]}')


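# Make sure the deployment folder exists before copying the model into it
os.makedirs('mlflow_deployment_package', exist_ok=True)
copyfile(f'mlruns/0/{run_id}/artifacts/model/model.pkl', 'mlflow_deployment_package/model.pkl')
print('Model copied to the deployment!')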
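
The selection above fails with an error when no run passes the filter. As a minimal sketch of the same "lowest RMSE wins" logic with an explicit check, the hypothetical helper `select_best_run` below works on plain dictionaries, such as the ones pandas returns from `df.to_dict('records')`. The run IDs and RMSE values in the example are made up for illustration and are not results from this tutorial.

```python
import math


def select_best_run(runs, metric='metrics.rmse'):
    """Return the run with the lowest value for metric, ignoring missing or NaN values."""
    candidates = [
        run for run in runs
        if run.get(metric) is not None and not math.isnan(run[metric])
    ]
    if not candidates:
        raise ValueError(f'No run reported a value for {metric}')
    return min(candidates, key=lambda run: run[metric])


# Made-up records in the column layout used above (assumption)
example_runs = [
    {'run_id': 'run-a', 'metrics.rmse': 0.82},
    {'run_id': 'run-b', 'metrics.rmse': 0.74},
    {'run_id': 'run-c', 'metrics.rmse': float('nan')},
]
best = select_best_run(example_runs)
print(best['run_id'])  # run-b
```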

Deployment steps

Now that we have the optimal model copied into our deployment folder, we will deploy it to our UbiOps environment.

API_TOKEN = "<INSERT API_TOKEN WITH PROJECT EDITOR RIGHTS>" # Make sure this is in the format "Token token-code"
PROJECT_NAME = "<INSERT PROJECT NAME IN YOUR ACCOUNT>"
DEPLOYMENT_NAME = 'mlflow-deployment'
DEPLOYMENT_VERSION = 'v1'

# Import all necessary libraries
import shutil
import ubiops

client = ubiops.ApiClient(ubiops.Configuration(api_key={'Authorization': API_TOKEN}, 
                                               host='https://api.ubiops.com/v2.1'))
api = ubiops.CoreApi(client)
# Create the deployment folder, without failing if it already exists
os.makedirs("mlflow_deployment_package", exist_ok=True)
%%writefile mlflow_deployment_package/deployment.py
"""
The file containing the deployment code is required to be called 'deployment.py' and should contain the 'Deployment'
class and 'request' method.
"""

import os
import pickle
import pandas as pd




class Deployment:

    def __init__(self, base_directory, context):
        """
        Initialisation method for the deployment. It can for example be used for loading modules that have to be kept in
        memory or setting up connections. Load your external model files (such as pickles or .h5 files) here.
        """

        print("Initialising the model")

        # Load the model relative to the deployment package folder, not the working directory
        model_file = os.path.join(base_directory, "model.pkl")
        with open(model_file, 'rb') as f:
            self.model = pickle.load(f)


    def request(self, data):
        """
        Method for deployment requests, called separately for each individual request.
        """
        print('Loading data')
        input_data = pd.read_csv(data['data'])

        print("Prediction being made")
        prediction = self.model.predict(input_data)

        # Writing the prediction to a csv for further use
        print('Writing prediction to csv')
        pd.DataFrame(prediction).to_csv('prediction.csv', header=['quality'], index_label='index')

        return {
            "prediction": 'prediction.csv',
        }
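
As the `os.path.join(base_directory, "model.pkl")` line suggests, files that ship with the deployment package should be opened relative to `base_directory` rather than the current working directory. The sketch below illustrates that pattern in isolation. The helper name `load_pickle` is hypothetical, and a plain dictionary stands in for the trained model, on the assumption that path handling is the same for any picklable object.

```python
import os
import pickle
import tempfile


def load_pickle(base_directory, filename='model.pkl'):
    """Load a pickled object stored in base_directory."""
    model_file = os.path.join(base_directory, filename)
    with open(model_file, 'rb') as f:
        return pickle.load(f)


# A temporary folder plays the role of the deployment package (assumption)
package_dir = tempfile.mkdtemp()
with open(os.path.join(package_dir, 'model.pkl'), 'wb') as f:
    pickle.dump({'alpha': 0.5, 'l1_ratio': 0.1}, f)

# Works regardless of the current working directory
loaded = load_pickle(package_dir)
print(loaded)
```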
%%writefile mlflow_deployment_package/requirements.txt

pandas==1.4.2
scikit-learn==1.0.2

Create Deployment

# Create the deployment
deployment_template = ubiops.DeploymentCreate(
    name=DEPLOYMENT_NAME,
    description='MLFlow deployment',
    input_type='structured',
    output_type='structured',
    input_fields=[
        {'name':'data', 'data_type':'file'},
    ],
    output_fields=[
        {'name':'prediction', 'data_type':'file'},
    ],
    labels={'demo': 'mlflow-tutorial'}
)

api.deployments_create(
    project_name=PROJECT_NAME,
    data=deployment_template
)

# Create the version
version_template = ubiops.DeploymentVersionCreate(
    version=DEPLOYMENT_VERSION,
    environment='python3.9',
    instance_type='512mb',
    minimum_instances=0,
    maximum_instances=1,
    maximum_idle_time=1800, # = 30 minutes
    request_retention_mode='none' # We don't need to store the requests for this deployment    
)

api.deployment_versions_create(
    project_name=PROJECT_NAME,
    deployment_name=DEPLOYMENT_NAME,
    data=version_template
)

# Zip the deployment package
shutil.make_archive('mlflow_deployment_package', 'zip', '.', 'mlflow_deployment_package')

# Upload the zipped deployment package
file_upload_result = api.revisions_file_upload(
    project_name=PROJECT_NAME,
    deployment_name=DEPLOYMENT_NAME,
    version=DEPLOYMENT_VERSION,
    file='mlflow_deployment_package.zip'
)
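
The `shutil.make_archive` call above stores the files under a top-level `mlflow_deployment_package/` folder inside the zip. As a minimal sketch for checking what an archive contains before uploading it, the example below zips placeholder files in a temporary directory and lists the entries. The helper names `zip_package` and `list_archive` are hypothetical, and the placeholder files stand in for the real deployment code.

```python
import os
import shutil
import tempfile
import zipfile


def zip_package(root_dir, package_dir):
    """Zip package_dir (relative to root_dir) and return the archive path."""
    archive_base = os.path.join(root_dir, package_dir)
    return shutil.make_archive(archive_base, 'zip', root_dir, package_dir)


def list_archive(archive_path):
    """Return the names of all entries in a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


# Build a throwaway package with placeholder files (assumption: contents do not matter here)
workspace = tempfile.mkdtemp()
os.makedirs(os.path.join(workspace, 'mlflow_deployment_package'))
for name in ('deployment.py', 'requirements.txt'):
    with open(os.path.join(workspace, 'mlflow_deployment_package', name), 'w') as f:
        f.write('# placeholder\n')

archive_path = zip_package(workspace, 'mlflow_deployment_package')
for entry in list_archive(archive_path):
    print(entry)
```

A missing `deployment.py` or `model.pkl` in this listing is easier to spot locally than after the upload.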
Note: This notebook runs on Python 3.9 and uses UbiOps Client Library 3.15.0.

Making a request and exploring further

You can now go to the Web App and look at what you have just built in the user interface. If you want, you can create a request to the mlflow deployment using "dummy_data_to_predict.csv". The CSV you send must contain the same feature columns the model was trained on.

So there we have it! We have used MLFlow to train a machine learning model with several sets of hyperparameters. Then we selected the best model and deployed it to UbiOps. You can base your own deployments on this notebook: adapt the code in the deployment package, alter the input and output fields as you wish, and you should be good to go.

For any questions, feel free to reach out to us via the customer service portal: https://ubiops.atlassian.net/servicedesk/customer/portals