Metrics

UbiOps generates metrics based on activity in the platform, and you can also track your own custom metrics. This allows you to monitor the performance of deployments in production, or to investigate performance metrics during training runs.

The recommended way to access the metrics is through the WebApp (available at https://app.ubiops.com for our SaaS solution). However, it is also possible to access the metrics via the Python Client Library, the CLI and our API if you want to port them to your own dashboard.

Have a look at our Swagger page to see all available metrics endpoints.

In the WebApp, metrics can be found under Monitoring in the sidebar. By default this shows an overview of project-level metrics. You can filter metrics by date, or view metrics for specific objects.

Many objects in UbiOps (like deployment versions, pipeline versions and training runs) have a dedicated Metrics page as well. On these pages you can see all the metrics for that specific object.

Deployment version metrics

The following metrics are available by default.

Default deployment metrics

| Metric¹ | Unit | Description |
|---|---|---|
| deployments.credits | credits (float) | Usage of credits² |
| deployments.instances | instances (float) | Average number of active deployment instances |
| deployments.input_volume | bytes (int) | Volume of incoming data in bytes |
| deployments.output_volume | bytes (int) | Volume of outgoing data in bytes |
| deployments.request_duration | seconds (float) | Average time in seconds for a request to complete |
| deployments.memory_utilization | bytes (int) | Peak memory used during a request |
| deployments.requests | requests (int) | Number of requests made to the object |
| deployments.failed_requests | requests (int) | Number of failed requests made to the object |
| deployments.express_queue_size | items (int) | Average number of queued express requests |
| deployments.batch_queue_size | items (int) | Average number of queued batch requests |
| deployments.express_queue_time³ | seconds (float) | Average time in seconds for an express request to start processing |
| deployments.batch_queue_time³ | seconds (float) | Average time in seconds for a batch request to start processing |
| deployments.network_in | bytes (int) | Inbound network traffic for a deployment version |
| deployments.network_out | bytes (int) | Outbound network traffic for a deployment version |
| deployments.instance_start_time | seconds (float) | Average duration from instance creation to start time |

Default pipeline metrics

| Metric¹ | Unit | Description |
|---|---|---|
| pipelines.requests | requests (int) | Number of requests made to the object |
| pipelines.failed_requests | requests (int) | Number of failed requests made to the object |
| pipelines.request_duration | seconds (float) | Average time in seconds for a pipeline request to complete |
| pipelines.input_volume | bytes (int) | Volume of incoming data in bytes |
| pipelines.output_volume | bytes (int) | Volume of outgoing data in bytes |
| pipelines.object_requests | requests (int) | Number of requests made to deployments in a pipeline |
| pipelines.object_failed_requests | requests (int) | Number of failed requests made to deployments in a pipeline |

Token and user specific metrics

It is possible to see metrics generated by a specific user or service user (API token). To view this data, go to the Monitoring page, click Add object and select the user or service user whose data you want to view.

Custom metrics

It is possible to create and track custom metrics in UbiOps as well. This is especially helpful for tracking model-specific metrics like accuracy or precision.

Defining a new custom metric

To start tracking a new custom metric, it first needs to be defined in UbiOps. You can do so by navigating to Monitoring > Metrics > Create custom metric in the WebApp. When creating a new custom metric you can configure the following fields (a sketch of defining the same metric via the client library follows this list):

  • Name: the name of your custom metric. It always has to begin with the custom. prefix (for example custom.accuracy). This name will also be used for the titles of the associated graphs later
  • Description (optional): the description of your custom metric.
  • Unit (optional): the unit of measurement for your custom metric (e.g. seconds)
  • Metric type: how this metric should be processed. We support Gauge and Delta metrics. For a gauge metric, the value measures a specific instant in time. For example, metrics measuring CPU utilization are gauge metrics; each point records the CPU utilization at the time of measurement. For a delta metric, the value measures the change in a time interval. For example, metrics measuring request counts are delta metrics; each value records how many requests were received after the start time, up to and including the end time.
  • Metric level (referred to as labels in the API): on what level do you plan to store the metric? In the WebApp you can select either deployment level, pipeline level, or training run.
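
If you prefer to define a metric programmatically, something along the following lines should work with the Python Client Library. This is a minimal sketch: it assumes the generated metrics_create endpoint and MetricCreate model, and field names or accepted values may differ slightly, so check the Swagger page for the authoritative schema.

import ubiops

# Assumed setup: an API token with permissions on the project
configuration = ubiops.Configuration()
configuration.api_key['Authorization'] = "Token <YOUR_API_TOKEN>"
api = ubiops.CoreApi(ubiops.ApiClient(configuration))

# Define a gauge metric on deployment level (field names assumed from the API schema)
api.metrics_create(
    project_name="your_project_name",
    data=ubiops.MetricCreate(
        name="custom.accuracy",           # must start with the custom. prefix
        description="Model accuracy per request",
        unit="fraction",
        metric_type="gauge",              # "gauge" or "delta"
        labels=["deployment_version_id"]  # the level the metric is stored on
    )
)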

Custom metric labels

If you do not intend to tie your custom metric to deployments, pipelines or training runs, it is also possible to pass different labels. This can only be done when working with the client library or API directly.
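
For instance, using the MetricClient that is introduced in the next section, you could log a value against a self-chosen label. The label name model_name below is purely illustrative, and assumes the metric was defined with that label:

from ubiops.utils.metrics import MetricClient

metric_client = MetricClient(project_name="your_project_name")
metric_client.start()

# "model_name" is a hypothetical, self-chosen label; the metric must have been
# defined with this label instead of a deployment, pipeline or training run level
metric_client.log_metric(
    metric_name="custom.your_metric_name",
    labels={"model_name": "my-model-v1"},
    value=0.93
)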

Logging data to your custom metric

You can easily log data to your custom metric with the UbiOps Python Client Library. You just need to initialize and start a MetricClient instance, and log values with the log_metric function.

from ubiops.utils.metrics import MetricClient

metric_client = MetricClient(project_name="your_project_name")
metric_client.start()
metric_client.log_metric(
    metric_name="custom.your_metric_name",
    labels={"deployment_version_id": "your_deployment_version_id"},  # can also be a pipeline_version_id or request_id
    value=your_metric_value
)

Typically you'll want to log metrics from within a training run, a deployment, or pipeline. Below you can find example code snippets for each case. Each code snippet assumes that you have already defined the custom metric beforehand, as described in the previous section.

Logging from a training run:

from ubiops.utils.metrics import MetricClient

def train(training_data, parameters, context):
    training_run_id = context["id"]
    project_name = context["project"]
    metric_client = MetricClient(project_name=project_name)
    metric_client.start()
    # <YOUR TRAINING CODE>

    example_value = 0
    metric_client.log_metric(
        metric_name="custom.example",  # make sure you created a metric with this name beforehand
        labels={"request_id": training_run_id},
        value=example_value
    )

Logging from a deployment:

from ubiops.utils.metrics import MetricClient

class Deployment:
    def __init__(self, base_directory, context):
        self.metric_client = MetricClient(project_name=context["project"])
        self.metric_client.start()
        self.context = context

    def request(self, data, context):
        example_value = 0
        self.metric_client.log_metric(
            metric_name="custom.example",  # make sure you created a metric with this name beforehand
            labels={"deployment_version_id": self.context["version_id"]},
            value=example_value
        )

Logging on pipeline level, from a deployment running inside a pipeline:

from ubiops.utils.metrics import MetricClient

class Deployment:
    def __init__(self, base_directory, context):
        self.metric_client = MetricClient(project_name=context["project"])
        self.metric_client.start()
        self.context = context

    def request(self, data, context):
        example_value = 0
        if "pipeline_version_id" in context:
            print("Deployment was called as part of a pipeline, logging metric on pipeline level.")
            self.metric_client.log_metric(
                metric_name="custom.example",  # make sure you created a metric with this name beforehand
                labels={"pipeline_version_id": context["pipeline_version_id"]},
                value=example_value
            )

A maximum of one value will be registered per minute

Metrics are aggregated per minute by the UbiOps API. If you log multiple values within the same minute with the client library, the value stored for that minute is the average of those values for a gauge metric, or their sum for a delta metric.
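
As a small illustration using the MetricClient from above, assuming all three calls below land within the same minute (the stored values in the comments follow from the aggregation rule, not from the code itself):

from ubiops.utils.metrics import MetricClient

metric_client = MetricClient(project_name="your_project_name")
metric_client.start()

# Three values logged within the same minute
for value in [2, 4, 6]:
    metric_client.log_metric(
        metric_name="custom.example",
        labels={"deployment_version_id": "your_deployment_version_id"},
        value=value
    )

# Stored value for this minute:
#   gauge metric -> average of 2, 4 and 6 = 4
#   delta metric -> sum of 2, 4 and 6 = 12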

For step-by-step examples of working with training metrics, have a look at the how-to's in our documentation.

Viewing data of your custom metrics

You can view the data of your custom metrics in the WebApp. If you want to inspect the custom metrics of a specific object, navigate to the Metrics tab of that object. For deployments and pipelines you can find this tab on the version details page; for training runs it is on the training run details page. On these Metrics tabs you will find a toggle that switches the view between the default and the custom metrics of that object.

If you want to view the data of multiple objects at once, you can also navigate to the general monitoring page (Monitoring > General). Here you can add objects to the view via the Add object button. You can customize this page to show only the graphs you are interested in. To add a graph, click the Add graph button and select the metrics you want to see (default or custom). To remove one, click the three dots in the top right of the graph and select Remove graph.

Getting data from the API

If you want to get the data for your metrics directly from the API, that's also possible. Below you can find an example code snippet showing how. Adjust metric to the name of the metric you want to retrieve data for; metric names are visible in the WebApp under Monitoring > Metrics. It's also possible to add labels (for example deployment_version_id:id) to get more granular data.

import ubiops

# Authenticate with an API token
configuration = ubiops.Configuration()
configuration.api_key['Authorization'] = "Token <YOUR_API_TOKEN>"

api_client = ubiops.ApiClient(configuration)
api = ubiops.CoreApi(api_client)

api.time_series_data_list(
    project_name="your_project_name",
    metric="desired_metric_name",
    start_date="2023-07-19 12:00:00",
    end_date="2023-07-19 12:03:00",
    # labels="deployment_version_id:your_deployment_version_id"  # optional, for more granular data
)

  1. Deployment and pipeline metrics are sampled every 60 seconds. 

  2. credits is the amount of credits used by the active instance. The credit usage ratio differs per instance type. 

  3. If your deployment version is not showing the express_queue_time and batch_queue_time metrics, you may need to re-upload your deployment package to start making use of this feature.