Python Scikit learn and UbiOps¶
Download link for necessary files: Scikit example files.
In this example we will show you the following:
-
How to make a training pipeline in UbiOps which preprocesses the data and trains and tests a KNN model using scikit
-
How to make a production pipeline in UbiOps which takes in new data, processes it and feeds it to a trained model for prediction/classification
For this example we will use a diabetes dataset from Kaggle to create a KNN classifier. This classifier predicts if someone will have diabetes or not. Link to original dataset: https://www.kaggle.com/uciml/pima-indians-diabetes-database The dataset can also be found in this folder under the name 'diabetes.csv'.
The training pipeline¶
The training pipeline takes the diabetes data set as input and trains a KNN classifier. It is made up of the following deployments:
Model | Function |
---|---|
data-preprocessor | Cleans the input data |
model-training | Trains a KNN classifier to predict diabetes |
The pipeline looks like this:
With the following inputs and outputs:
Model | Input | Output |
---|---|---|
data-preprocessor | data: file , training: file | cleaned_data: file , target_data: file |
model-training | cleaned_data: file, target_data: file | trained_model: file, model_score: double |
The production pipeline¶
The production pipeline takes in new data, cleans it and feeds it to a trained KNN classifier. It is made up of the following deployments:
Model | Function |
---|---|
data-preprocessor | Cleans the input data |
knn-model | Predicts diabetes instances |
The pipeline looks like this:
With the following inputs and outputs:
Model | Input | Output |
---|---|---|
data-preprocessor | data: file, training: file | cleaned_data: file, target_data: file |
knn-model | data: file, data_cleaning_artifact: file | prediction: file, prediction_diabetes_instances : integer |
How does it work?¶
Step 1: Login to your UbiOps account at https://app.ubiops.com/ and create an API token with project editor rights. To do so, click on Permissions in the navigation panel, and then click on API tokens. Click on [+]Add token to create a new token.
Give your new token a name, save the token in safe place and assign the 'project-editor' role to the token. The role can be assigned on project level.
Step 2: Download the scikit-deployment folder and open scikit_deployment.ipynb
. In the notebook you will find a space to enter your API token and the name of your project in UbiOps. Paste the saved API token in the notebook in the indicated spot and enter the name of the project in your UbiOps environment. This project name can be found in the top of your screen in the WebApp. In the image in step 1 the project name is example.
Step 3: Run the Jupyter notebook scikit_deployment.ipynb
and everything will be automatically deployed to your UbiOps environment! Afterwards you can explore the code in the notebook or explore the application in the WebApp.