Implement custom input guardrails on UbiOps¶
This notebook shows an example of how to implement custom input guardrails in a UbiOps pipeline. Input guardrails are mechanisms designed to filter and validate user input before it reaches the LLM. The input request can then either be blocked, or adjusted, to steer the LLM response in a certain direction.
Different input guardrail mechanisms exist: you can use a high-throughput LLM to classify a request, or simply apply a regex. As an example, we will guide you through a simple regex-based guardrail.
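The regex approach boils down to a word-boundary match. As a standalone sketch (not part of the deployment code below), a case-insensitive pattern flags whole-word occurrences of a forbidden term without matching substrings:

```python
import re

# Word-boundary, case-insensitive pattern: matches "apple" as a whole word,
# but not substrings such as "pineapple" or "applesauce".
FORBIDDEN = re.compile(r"\bapple\b", re.IGNORECASE)

def contains_forbidden_word(prompt: str) -> bool:
    return FORBIDDEN.search(prompt) is not None

print(contains_forbidden_word("I ate an Apple!"))   # True
print(contains_forbidden_word("I love pineapple"))  # False
```

The deployment below uses exactly this kind of pattern, but inserts a steering system message instead of blocking the request outright.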
We will implement a pipeline that connects two deployments. One deployment applies the regex filter; the second proxies the request to an LLM, whose response is streamed back.
The pipeline can be called with OpenAI-compatible input bodies. In the next release of UbiOps, the pipeline will be exposed via an OpenAI-compatible chat/completions endpoint.
This solution can be set up in your UbiOps environment in four steps:

1. Establish a connection with your UbiOps environment
2. Create the deployment for the input guardrailing
3. Create the deployment for the proxy LLM, using environment variables to configure the connection
4. Create a pipeline that combines the two deployments created in steps 2 and 3
1) Connecting with the UbiOps API client¶
To use the UbiOps API from our notebook, we need to install the UbiOps Python client library.
%pip install --upgrade ubiops
To set up a connection with the UbiOps platform API we need the name of your UbiOps project and an API token with project-editor permissions.
Once you have your project name and API token, paste them below in the following cell before running.
import json
import ubiops
API_TOKEN = "<UBIOPS_API_TOKEN>" # Used to create the deployments and pipeline, make sure this is in the format "Token token-code"
API_HOST_URL = "<API_HOST_URL>" # Standard UbiOps API URL is 'https://api.ubiops.com/v2.1', your URL may differ depending on your environment
PROJECT_NAME = "<PROJECT_NAME>" # Fill in your project name here
configuration = ubiops.Configuration()
configuration.api_key["Authorization"] = API_TOKEN
configuration.host = API_HOST_URL
api_client = ubiops.ApiClient(configuration)
api = ubiops.api.CoreApi(api_client)
print(api.service_status())
You will also need to be able to send requests to an LLM that accepts OpenAI-compatible chat completion requests. You can use an external provider, such as OpenAI or Mistral, or configure such an LLM on UbiOps yourself.
BASE_URL = "<BASE_URL>" # The base URL of your LLM. If hosted on UbiOps, use f"{API_HOST_URL}/projects/{PROJECT_NAME}/openai-compatible/v1/"
MODEL_NAME = "<MODEL_NAME>" # The name of your LLM model. If hosted on UbiOps, use f"ubiops-deployment/{DEPLOYMENT_NAME}//<the name of your model>"
API_KEY = "<API_KEY>" # Used to create requests within the proxy deployment. If hosted on UbiOps, use a valid API Token with at least `deployment-request-user` permissions to request the deployment, but without the 'Token ' prefix
Create the deployments for the pipeline¶
Now that we have established a connection with our UbiOps environment, we can start creating our deployment packages. Each package consists of up to two files:

- The deployment.py, which is where we define the actual code to run
- The requirements.txt, which contains additional dependencies that our code needs to run properly

These deployment packages will be zipped and uploaded to UbiOps, after which we will add them to a pipeline. The pipeline will consist of two deployments:

- One deployment applies the input guardrail
- The other proxies requests to the LLM
GUARDRAIL_DEPLOYMENT_NAME = "filter-apple-guardrail"
GUARDRAIL_DEPLOYMENT_PACKAGE_DIR = "input_guardrail_deployment_package"
PROXY_LLM_DEPLOYMENT_NAME = "proxy-llm"
PROXY_LLM_DEPLOYMENT_PACKAGE_DIR = "proxy_llm_deployment_package"
2) Create the input guardrail deployment¶
This deployment adds a simple input guardrail before messages reach the main LLM. It checks if the user mentions the word "apple" and, if so, inserts a system message instructing them to talk about other fruits instead. It also validates that the input is properly formatted JSON with a "messages" list. If not, a public error is returned to the end-user.
%mkdir {GUARDRAIL_DEPLOYMENT_PACKAGE_DIR}
First we create the deployment.py:
%%writefile {GUARDRAIL_DEPLOYMENT_PACKAGE_DIR}/deployment.py
import re
import json


class Deployment:
    def __init__(self):
        self.guard = SimpleWordChecker()

    def request(self, data: str) -> str:
        # Load the OpenAI-compatible input body
        try:
            parsed = json.loads(data)
        except json.JSONDecodeError:
            raise PublicError("Invalid JSON: Could not parse request body.")

        # Validate required structure
        if "messages" not in parsed or not isinstance(parsed["messages"], list):
            raise PublicError("Invalid input: 'messages' key must be present and must be a list.")

        updated = self.guard.check_for_apple(parsed)

        # Dump the response back as a string
        return json.dumps(updated)


class SimpleWordChecker:
    def check_for_apple(self, body):
        '''
        Checks if the last user message contains the word "apple";
        if so, adds a new system message for the LLM.
        Returns the body of messages.
        '''
        messages = body["messages"]
        for i in reversed(range(len(messages))):
            if messages[i].get("role") == "user":
                # Find the last user message and check for the forbidden word
                if re.search(r"\bapple\b", messages[i].get("content", ""), re.IGNORECASE):
                    messages.insert(i + 1, {
                        "role": "system",
                        "content": "The user used the word apple. Using the word apple is forbidden. Instruct the user to talk about other fruits instead."
                    })
                # We only need to check the last user's message, so stop after this
                break
        return body


class PublicError(Exception):
    '''
    Raise a public error message to the user,
    which is visible from the request overview page in UbiOps.
    '''
    def __init__(self, public_error_message):
        super().__init__()
        self.public_error_message = public_error_message
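Before uploading, the core transformation can be sanity-checked locally. The sketch below mirrors the insertion logic from the deployment.py above, with an abbreviated system message; UbiOps instantiates the Deployment class for you, so this is purely a local illustration:

```python
import re

# Abbreviated system message; the deployment above uses a longer instruction.
WARNING = {"role": "system",
           "content": "Using the word apple is forbidden. Steer the user toward other fruits."}

def insert_warning_after_last_user(messages):
    # Mirror of the guardrail: walk backwards to the last user message
    # and, if it mentions "apple", insert the system message right after it.
    for i in reversed(range(len(messages))):
        if messages[i].get("role") == "user":
            if re.search(r"\bapple\b", messages[i].get("content", ""), re.IGNORECASE):
                messages.insert(i + 1, dict(WARNING))
            break
    return messages

out = insert_warning_after_last_user([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I ate an apple!"},
])
print([m["role"] for m in out])  # ['system', 'user', 'system']
```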
Now we create the deployment¶
For the deployment we will specify the in- and output for the model as type plain, to support OpenAI-compatible input:
deployment_template = ubiops.DeploymentCreate(
    name=GUARDRAIL_DEPLOYMENT_NAME,
    description="An example deployment that checks if a user used the word apple and instructs the LLM to make the "
                "end-user aware of this.",
    input_type="plain",
    output_type="plain",
    labels={"type": "input-guardrail"},
)
guardrail_deployment = api.deployments_create(
project_name=PROJECT_NAME, data=deployment_template
)
print(guardrail_deployment)
And finally we create the version¶
Each deployment can have multiple versions. The version of a deployment defines the coding environment, instance type (CPU or GPU) & size, and other settings. We will set minimum_instances to 1 to warrant a fast response time. The code is simple and will not consume a lot of resources, so we select the smallest instance type group available.
⚠️ Warning: toggle minimum_instances back to 0 after this tutorial to save resources
version_template = ubiops.DeploymentVersionCreate(
    version="v1",
    environment="python3-13",
    instance_type_group_name="256 MB + 0.0625 vCPU",
    minimum_instances=1,
    maximum_instances=1,
    maximum_idle_time=10,
    instance_processes=1,
    request_retention_mode="full",  # Input/output of requests will be stored
)
version = api.deployment_versions_create(
project_name=PROJECT_NAME, deployment_name=GUARDRAIL_DEPLOYMENT_NAME, data=version_template
)
print(version)
Then we zip the deployment package and upload it to UbiOps (this process can take 5 to 10 minutes).
import shutil
shutil.make_archive(GUARDRAIL_DEPLOYMENT_PACKAGE_DIR, "zip", ".", GUARDRAIL_DEPLOYMENT_PACKAGE_DIR)
file_upload_result = api.revisions_file_upload(
project_name=PROJECT_NAME,
deployment_name=GUARDRAIL_DEPLOYMENT_NAME,
version="v1",
file=f"{GUARDRAIL_DEPLOYMENT_PACKAGE_DIR}.zip",
)
ubiops.utils.wait_for_deployment_version(
client=api.api_client,
project_name=PROJECT_NAME,
deployment_name=GUARDRAIL_DEPLOYMENT_NAME,
version="v1",
revision_id=file_upload_result.revision,
)
3) Create the proxy LLM deployment¶
Next we will create the deployment that proxies a request to an external LLM by hitting its v1/chat/completions endpoint. The deployment functions simply as a passthrough, although you're free to add custom logic. The workflow for creating this deployment is similar to that of the previous deployment, except we now add the openai Python package to the environment, and use environment variables to specify the LLM to which we will send requests.
%mkdir {PROXY_LLM_DEPLOYMENT_PACKAGE_DIR}
Create the deployment.py:
%%writefile {PROXY_LLM_DEPLOYMENT_PACKAGE_DIR}/deployment.py
import os
import json

from openai import OpenAI


class Deployment:
    def __init__(self, base_directory, context):
        print("Initializing OpenAI-compatible Deployment")
        try:
            self.base_url = os.environ["BASE_URL"]
            self.model_name = os.environ["MODEL_NAME"]
            self.api_key = os.environ["API_KEY"]  # You might want to add this or similar
        except KeyError as e:
            raise Exception(f"Missing required environment variable: {e}")

        # Initialize OpenAI client
        self.client = OpenAI(
            api_key=self.api_key,
            base_url=self.base_url
        )
        self.context = context

    def request(self, data, context):
        print("Processing request for OpenAI-compatible Deployment")
        try:
            input_data = json.loads(data)
        except (TypeError, ValueError):
            raise PublicError("Invalid JSON: Could not parse request body.")

        input_data["model"] = self.model_name

        # Optional: Include usage info for tracking tokens
        is_streaming = input_data.get("stream", False)
        if is_streaming:
            input_data["stream_options"] = {"include_usage": True}

        try:
            print(f"Sending request to model {input_data['model']}")
            response = self.client.chat.completions.create(**input_data)
        except Exception as e:
            raise RuntimeError("Failed to call model") from e

        if is_streaming:
            streaming_callback = context["streaming_update"]
            full_response = []
            for partial_response in response:
                chunk_dump = partial_response.model_dump()
                streaming_callback(json.dumps(chunk_dump))
                full_response.append(chunk_dump)
            return json.dumps(full_response)
        else:
            full_response = response.model_dump()
            return json.dumps(full_response)


class PublicError(Exception):
    def __init__(self, public_error_message):
        super().__init__()
        self.public_error_message = public_error_message
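In the streaming branch above, UbiOps passes a streaming_update callback via context, and each chunk is forwarded to it as JSON. The sketch below simulates that flow outside UbiOps with a stand-in iterator and callback (the chunk dictionaries are illustrative, not real OpenAI objects):

```python
import json

def stream_to_callback(response_iter, streaming_callback):
    # Mirrors the streaming branch of the request method: forward every
    # chunk to the callback as JSON, then return the accumulated list.
    full_response = []
    for chunk in response_iter:
        streaming_callback(json.dumps(chunk))
        full_response.append(chunk)
    return json.dumps(full_response)

# Stand-in chunks shaped like OpenAI chat completion chunks (illustrative)
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
]
received = []  # Collects what UbiOps would push to the client
result = stream_to_callback(chunks, received.append)
print(len(received), json.loads(result)[1]["choices"][0]["delta"]["content"])  # 2 lo
```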
Then the requirements.txt:
%%writefile {PROXY_LLM_DEPLOYMENT_PACKAGE_DIR}/requirements.txt
openai
Create a deployment¶
Again, we will use input and output type plain:
llm_template = ubiops.DeploymentCreate(
name=PROXY_LLM_DEPLOYMENT_NAME,
description="A deployment that proxies requests to an OpenAI-compatible server",
input_type="plain",
output_type="plain",
labels={"type": "llm-proxy"},
)
llm_deployment = api.deployments_create(project_name=PROJECT_NAME, data=llm_template)
print(llm_deployment)
And create a version for the deployment. We will use a slightly larger instance type, and multiple processes per instance, to ensure that the LLM proxy can handle multiple requests concurrently:
version_template = ubiops.DeploymentVersionCreate(
    version="v1",
    environment="python3-13",
    instance_type_group_name="512 MB + 0.125 vCPU",
    maximum_instances=1,
    minimum_instances=1,
    maximum_idle_time=10,
    instance_processes=5,
    request_retention_mode="full",  # Input/output of requests will be stored
)
version = api.deployment_versions_create(
project_name=PROJECT_NAME, deployment_name=PROXY_LLM_DEPLOYMENT_NAME, data=version_template
)
print(version)
Now we need to create environment variables that allow the proxy deployment to request an LLM.
api_response = api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=PROXY_LLM_DEPLOYMENT_NAME,
version="v1",
data=ubiops.EnvironmentVariableCreate(name="BASE_URL", value=BASE_URL, secret=False),
)
api_response = api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=PROXY_LLM_DEPLOYMENT_NAME,
version="v1",
data=ubiops.EnvironmentVariableCreate(name="MODEL_NAME", value=MODEL_NAME, secret=False),
)
api_response = api.deployment_version_environment_variables_create(
project_name=PROJECT_NAME,
deployment_name=PROXY_LLM_DEPLOYMENT_NAME,
version="v1",
data=ubiops.EnvironmentVariableCreate(name="API_KEY", value=API_KEY, secret=True),
)
Zip & upload the files to UbiOps (this process can take 5 to 10 minutes).
import shutil
shutil.make_archive(PROXY_LLM_DEPLOYMENT_PACKAGE_DIR, "zip", ".", PROXY_LLM_DEPLOYMENT_PACKAGE_DIR)
file_upload_result = api.revisions_file_upload(
project_name=PROJECT_NAME,
deployment_name=PROXY_LLM_DEPLOYMENT_NAME,
version="v1",
file=f"{PROXY_LLM_DEPLOYMENT_PACKAGE_DIR}.zip",
)
ubiops.utils.wait_for_deployment_version(
client=api.api_client,
project_name=PROJECT_NAME,
deployment_name=PROXY_LLM_DEPLOYMENT_NAME,
version="v1",
revision_id=file_upload_result.revision,
)
4) Create a pipeline and pipeline version¶
Now we create a pipeline that orchestrates the workflow between the deployments above. When a request is made to this pipeline, the first deployment checks the last user message for forbidden words, then passes the messages through to the LLM to generate an answer.
For a pipeline you will have to define an input & output and create a version, just as with a deployment. In addition, we will also need to define the objects (i.e., the deployments) and how to orchestrate the workflow (i.e., how to attach the objects to each other).
First we create the pipeline:
PIPELINE_NAME = "guardrail-pipeline-demo"
PIPELINE_VERSION = "v1"
pipeline_template = ubiops.PipelineCreate(
name=PIPELINE_NAME,
description="A pipeline that applies an input guardrail",
input_type="plain",
output_type="plain"
)
api.pipelines_create(project_name=PROJECT_NAME, data=pipeline_template)
Then we define the objects, and how to attach the objects together:
# Define the two objects to be used in the pipeline
objects = [
# input-guardrail
{
"name": GUARDRAIL_DEPLOYMENT_NAME,
"reference_name": GUARDRAIL_DEPLOYMENT_NAME,
"version": "v1",
},
# LLM-model
{
"name": PROXY_LLM_DEPLOYMENT_NAME,
"reference_name": PROXY_LLM_DEPLOYMENT_NAME,
"version": "v1"
},
]
attachments = [
# start --> input-guardrail
{
"destination_name": GUARDRAIL_DEPLOYMENT_NAME,
"sources": [
{
"source_name": "pipeline_start",
"mapping": [],
}
],
},
# input-guardrail --> LLM
{
"destination_name": PROXY_LLM_DEPLOYMENT_NAME,
"sources": [
{
"source_name": GUARDRAIL_DEPLOYMENT_NAME,
"mapping": []
}
],
},
# LLM --> pipeline end
{
"destination_name": "pipeline_end",
"sources": [
{
"source_name": PROXY_LLM_DEPLOYMENT_NAME,
"mapping": [],
}
],
},
]
And finally we create a version for this pipeline. Note that we are adding labels, so that the solution will be returned when using the /models endpoint:
pipeline_template = ubiops.PipelineVersionCreate(
    version=PIPELINE_VERSION,
    request_retention_mode="full",
    objects=objects,
    attachments=attachments,
    labels={"openai-model-names": "llm-apple-input-guardrail", "openai-compatible": "true"},
)
api.pipeline_versions_create(
project_name=PROJECT_NAME, pipeline_name=PIPELINE_NAME, data=pipeline_template
)
And there you have it!¶
We have now set up input guardrails on UbiOps. If you want, you can use the code block below to create a request to your newly created pipeline.
data = """
{
    "messages": [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": "I ate an apple!",
            "role": "user"
        }
    ],
    "stream": false
}
"""
response = api.pipeline_requests_create(
project_name=PROJECT_NAME,
pipeline_name=PIPELINE_NAME,
data=data
)
print(response)
The textual response of the LLM can be read as follows:
import json
json.loads(response.result)["choices"][0]["message"]["content"]
An example response would be:
"""I see you mentioned a certain fruit that starts with "A". Let's try something different. How about we talk about bananas, oranges, or grapes instead? Which one of those fruits do you like?"""
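When "stream" is set to true instead, the proxy deployment's final result is a JSON list of all chunks. Assuming OpenAI-style chunks, where the text lives under choices[0].delta.content, the streamed text can be reassembled like this (the example chunk list is illustrative):

```python
import json

def reassemble(chunks_json: str) -> str:
    # Concatenate the delta contents of all streamed chunks.
    pieces = []
    for chunk in json.loads(chunks_json):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces)

# Illustrative chunk list, shaped like the proxy deployment's streamed result
example = json.dumps([
    {"choices": [{"delta": {"content": "Let's talk "}}]},
    {"choices": [{"delta": {"content": "about bananas."}}]},
    {"choices": [], "usage": {"total_tokens": 42}},  # final usage-only chunk
])
print(reassemble(example))  # Let's talk about bananas.
```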
You can also send requests via the openai library:
from openai import OpenAI
client = OpenAI(
    api_key=API_TOKEN[6:] if API_TOKEN.startswith("Token ") else API_TOKEN,
    base_url=f"{API_HOST_URL}/projects/{PROJECT_NAME}/openai-compatible/v1",
)
Then fetch all models available within your project:
models = client.models.list()
print(models)
response = client.chat.completions.create(
    model="llm-apple-input-guardrail",  # The model name set via the pipeline's "openai-model-names" label
    **json.loads(data)
)
print(response)
5) Cleanup¶
Finally, let's close our connection to UbiOps:
api_client.close()
This tutorial serves just as an example. Feel free to reach out to our support portal if you want to discuss your setup in more detail.