Metrics
UbiOps generates metrics based on activity in the platform and it is also possible to track your own custom metrics in UbiOps. This allows you to monitor the performance of deployments in production, or to investigate performance metrics during training runs.
The recommended way to access the metrics is through the WebApp (available at https://app.ubiops.com for our SaaS solution). However, it is also possible to access the metrics via the Python Client Library, the CLI and our API if you want to port them to your own dashboard.
Have a look at our Swagger page to see all available metrics endpoints.
In the WebApp metrics can be found under Monitoring in the sidebar. By default this will show an overview of project level metrics. It is possible to filter metrics on date, or to view metrics for specific objects.
Many objects in UbiOps (like deployment versions, pipeline versions and training runs) have a dedicated Metrics page as well. On these pages you can see all the metrics for that specific object.
The following metrics are available by default.
Default deployment metrics
Metric | Unit | Description |
---|---|---|
deployments.credits¹ | credits (float) | Usage of credits |
deployments.instances | instances (float) | Average number of active deployment instances |
deployments.input_volume | bytes (int) | Volume of incoming data in bytes |
deployments.output_volume | bytes (int) | Volume of outgoing data in bytes |
deployments.request_duration | seconds (float) | Average time in seconds for a request to complete |
deployments.memory_utilization | bytes (int) | Peak memory used during a request |
deployments.requests | requests (int) | Number of requests made to the object |
deployments.failed_requests | requests (int) | Number of failed requests made to the object |
deployments.express_queue_size | items (int) | Average number of queued express requests |
deployments.batch_queue_size | items (int) | Average number of queued batch requests |
deployments.express_queue_time² | seconds (float) | Average time in seconds for an express request to start processing |
deployments.batch_queue_time² | seconds (float) | Average time in seconds for a batch request to start processing |
deployments.network_in | bytes (int) | Inbound network traffic for a deployment version |
deployments.network_out | bytes (int) | Outbound network traffic for a deployment version |
deployments.instance_start_time | seconds (float) | Average duration from instance creation to start time |
Default pipeline metrics
Metric | Unit | Description |
---|---|---|
pipelines.requests | requests (int) | Number of requests made to the object |
pipelines.failed_requests | requests (int) | Number of failed requests made to the object |
pipelines.request_duration | seconds (float) | Average time in seconds for a pipeline request to complete |
pipelines.input_volume | bytes (int) | Volume of incoming data in bytes |
pipelines.output_volume | bytes (int) | Volume of outgoing data in bytes |
pipelines.object_requests | requests (int) | Number of requests made to deployments in a pipeline |
pipelines.object_failed_requests | requests (int) | Number of failed requests made to deployments in a pipeline |
Token and user specific metrics
It is possible to see metrics generated by a specific user or a service user (API token). To view this data, go to the Monitoring page, click Add object and select the user or service user whose data you want to view.
Custom metrics
It is possible to create and track custom metrics in UbiOps as well. This is especially helpful for tracking model specific metrics like accuracy or precision.
Defining a new custom metric
To start tracking a new custom metric, it first needs to be defined in UbiOps. You can do so by navigating to Monitoring > Metrics > Create custom metric in the WebApp. To create a new custom metric you can configure the following fields:
- Name: the name of your custom metric. It always has to begin with `custom.`. This is the name that will be used for the titles of associated graphs later.
- Description (optional): the description of your custom metric.
- Unit (optional): the unit of measurement for your custom metric (e.g. seconds).
- Metric type: how this metric should be processed. We support gauge and delta metrics. For a gauge metric, the value measures a specific instant in time. For example, metrics measuring CPU utilization are gauge metrics; each point records the CPU utilization at the time of measurement. For a delta metric, the value measures the change over a time interval. For example, metrics measuring request counts are delta metrics; each value records how many requests were received after the start time, up to and including the end time.
- Metric level (referred to as `labels` in the API): the level at which you plan to store the metric. In the WebApp you can select either deployment level, pipeline level, or training run.
Custom metric labels
If you do not intend to tie your custom metric to deployments, pipelines or training runs, it is also possible to pass different labels. This can only be done when working with the client library or API directly.
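Because a metric name that does not start with `custom.` will be rejected, it can be convenient to validate a definition before sending it to UbiOps. The sketch below is purely illustrative and not part of the UbiOps client library; the function name and return shape are our own.

```python
# Illustrative sketch (not part of the UbiOps client library): client-side
# validation of a custom metric definition before creating it in UbiOps.
ALLOWED_METRIC_TYPES = {"gauge", "delta"}


def validate_metric_definition(name: str, metric_type: str) -> list:
    """Return a list of problems; an empty list means the definition looks valid."""
    problems = []
    if not name.startswith("custom."):
        problems.append("metric name must begin with 'custom.'")
    if metric_type not in ALLOWED_METRIC_TYPES:
        problems.append(f"metric type must be one of {sorted(ALLOWED_METRIC_TYPES)}")
    return problems


print(validate_metric_definition("custom.accuracy", "gauge"))  # []
print(validate_metric_definition("accuracy", "counter"))       # two problems
```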
Logging data to your custom metric
You can easily log data to your custom metric with the UbiOps Python Client Library. You just need to initialize and start a `MetricClient` instance, and log values with the `log_metric` function.

```python
from ubiops.utils.metrics import MetricClient

metric_client = MetricClient(project_name="your_project_name")
metric_client.start()

metric_client.log_metric(
    metric_name="custom.your_metric_name",
    labels={"deployment_version_id": "your_deployment_version_id"},  # Can also be a pipeline_version_id or request_id
    value=your_metric_value,
)
```
Typically you'll want to log metrics from within a training run, a deployment, or a pipeline. Below you can find example code snippets for each case. Each code snippet assumes that you have already defined the custom metric beforehand, as described in the previous section.

Logging from a training run:

```python
from ubiops.utils.metrics import MetricClient


def train(training_data, parameters, context):
    training_run_id = context["id"]
    project_name = context["project"]

    metric_client = MetricClient(project_name=project_name)
    metric_client.start()

    # <YOUR TRAINING CODE>
    example_value = 0

    metric_client.log_metric(
        metric_name="custom.example",  # Make sure you created a metric with this name beforehand
        labels={"request_id": training_run_id},
        value=example_value,
    )
```
Logging from a deployment:

```python
from ubiops.utils.metrics import MetricClient


class Deployment:
    def __init__(self, base_directory, context):
        self.metric_client = MetricClient(project_name=context["project"])
        self.metric_client.start()
        self.context = context

    def request(self, data, context):
        example_value = 0

        self.metric_client.log_metric(
            metric_name="custom.example",  # Make sure you created a metric with this name beforehand
            labels={"deployment_version_id": self.context["version_id"]},
            value=example_value,
        )
```
Logging from a deployment running inside a pipeline:

```python
from ubiops.utils.metrics import MetricClient


class Deployment:
    def __init__(self, base_directory, context):
        self.metric_client = MetricClient(project_name=context["project"])
        self.metric_client.start()
        self.context = context

    def request(self, data, context):
        example_value = 0

        if "pipeline_version_id" in context:
            print("Deployment was called as part of a pipeline, logging metric on pipeline level.")
            self.metric_client.log_metric(
                metric_name="custom.example",  # Make sure you created a metric with this name beforehand
                labels={"pipeline_version_id": context["pipeline_version_id"]},
                value=example_value,
            )
```
A maximum of one value will be registered per minute
Metrics are aggregated per minute by the UbiOps API. If you log multiple values within a minute with the client library, the value logged for that minute will either be an average (if the metric is a Gauge metric), or a sum (if the metric is a Delta metric), of those values.
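The per-minute aggregation described above can be sketched in plain Python. This mimics the documented behaviour (average for gauge, sum for delta) and is not the actual UbiOps implementation:

```python
# Sketch of per-minute aggregation: within one minute, gauge metrics are
# averaged and delta metrics are summed. This mimics the documented
# behaviour; it is not the actual UbiOps implementation.
def aggregate_minute(values, metric_type):
    if metric_type == "gauge":
        return sum(values) / len(values)  # average of the logged values
    if metric_type == "delta":
        return sum(values)                # sum of the logged values
    raise ValueError(f"unknown metric type: {metric_type}")


# Three values logged within the same minute:
values = [10, 20, 30]
print(aggregate_minute(values, "gauge"))  # 20.0
print(aggregate_minute(values, "delta"))  # 60
```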
For step-by-step examples of working with custom metrics, have a look at the following how-to guides:
- Tracking custom metrics for your training run
- Tracking custom metrics for your deployment
- Tracking custom metrics for your pipeline
Viewing data of your custom metrics
You can view the data of your custom metrics in the WebApp. To inspect the custom metrics of a specific object, navigate to the Metrics tab of that object. For deployments and pipelines you can find this tab on the version details page; for training runs, on the training run details page. On these Metrics tabs, a toggle lets you switch the view between the default metrics and the custom metrics of that object.
In case you want to view the data of multiple objects at once, you can also navigate to the general monitoring page (Monitoring > General). Here you can add objects to the view via the Add object button. You can customize this page to only show the graphs you are interested in. To add a graph, click the Add graph button and select the metrics you want to see (default or custom). To remove one, click on the three little dots in the top right of the graph and select Remove graph.
Getting data from the API
If you want to get the data for your metrics directly from the API, that's also possible. Below you can find an example code snippet showing how. Adjust the metric to the name of the metric you want to retrieve data for. Metric names are visible in the WebApp when navigating to Monitoring > Metrics. It's also possible to add labels (for example `deployment_version_id:id`) to get more granular data.

```python
import ubiops

# Configure the client with an API token that has access to your project
configuration = ubiops.Configuration()
configuration.api_key["Authorization"] = "Token <YOUR_API_TOKEN>"

api_client = ubiops.ApiClient(configuration)
api = ubiops.CoreApi(api_client)

api.time_series_data_list(
    project_name="your_project_name",
    metric="desired_metric_name",
    start_date="2023-07-19 12:00:00",
    end_date="2023-07-19 12:03:00",
)
```
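Once you have the raw data points, you can post-process them yourself. The sketch below assumes each data point is a dict with a `value` key; this shape is an assumption, so inspect the actual response of `time_series_data_list` in your environment to confirm the field names.

```python
# Hypothetical post-processing of time series data points. The field name
# "value" is an assumption; inspect the actual API response to confirm.
def summarize(data_points):
    values = [p["value"] for p in data_points]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


sample = [{"value": 2.0}, {"value": 4.0}, {"value": 6.0}]
print(summarize(sample))  # {'count': 3, 'min': 2.0, 'max': 6.0, 'mean': 4.0}
```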
¹ `credits` is the amount of credits used by the active instance. The ratio will differ based on the instance type.

² If your deployment version is not showing the `express_queue_time` and `batch_queue_time` metrics, you may need to re-upload your deployment package to start making use of this feature.