OpenAI-compatibility¶
UbiOps deployments can be used to run LLM servers. These deployments can be linked together in UbiOps pipelines, for example to build RAG applications or complex agentic workflows, combining different models.
UbiOps provides functionality to interface with deployments and pipelines using the OpenAI format, allowing for easier integrations with frameworks that use this protocol.
OpenAI-compatible clients make use of the following fields:
- `base_url` - Specifies the project or server where the deployment is hosted.
- `api_key` - Your authentication token for accessing the deployment.
- `model` - Specifies which deployment or pipeline (and optionally which version) to send requests to. This identifies the specific model hosted inside the deployment.
On this page, we will show how you can construct these fields using UbiOps concepts, and provide examples on how to set up the OpenAI-compatible client, and send requests to it.
OpenAI API Reference
For a complete overview of the OpenAI format, see the OpenAI documentation.
Supported OpenAI endpoints
The following endpoints are supported for OpenAI requests:
- `/v1/chat/completions`
- `/v1/embeddings`
- `/v1/models`
Format¶
For models hosted by OpenAI, the `base_url` is `https://api.openai.com/v1` and an example `model` would be `gpt-4.5-preview-2025-02-27`. For models hosted on UbiOps, these values can be constructed as follows:
- `base_url`: `https://{UBIOPS_API_HOST}/projects/{PROJECT_NAME}/openai-compatible/v1/`
    - `UBIOPS_API_HOST`: The URL of the UbiOps API, e.g. `api.ubiops.com/v2.1`.
    - `PROJECT_NAME`: The name of the project in which the deployment is created.
    - Example: `https://api.ubiops.com/v2.1/projects/my-project/openai-compatible/v1/`
- `model`: `ubiops-deployment/{DEPLOYMENT_NAME}/{VERSION_NAME}/{MODEL_NAME}`
    - `DEPLOYMENT_NAME`: The name of the deployment.
    - `VERSION_NAME`: The name of the deployment version. Leave blank when requesting the default version.
    - `MODEL_NAME`: The name of your model as it is named within your deployment code. For vLLM, the default value is the name of the directory or Hugging Face repository referenced when loading the model.
    - A full `model` value can then e.g. be `ubiops-deployment/my-deployment/my-version/my-model`, or `ubiops-deployment/my-deployment//google/gemma-3-27b-it` for the default version of a deployment. Note that the double `//` ensures that the default version of the deployment is requested.
OpenAI Deployment Format
Note that you are responsible for ensuring that the deployment can accept the OpenAI format.
For an example of setting up such a deployment that can be requested with an OpenAI client, see the vLLM Deployment tutorial.
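Putting the format together in code: a minimal sketch of constructing both values with string formatting (all names below are placeholders):

```python
ubiops_api_host = "api.ubiops.com/v2.1"
project_name = "my-project"
deployment_name = "my-deployment"
version_name = ""  # blank requests the default version
model_name = "my-model"

base_url = f"https://{ubiops_api_host}/projects/{project_name}/openai-compatible/v1/"
model = f"ubiops-deployment/{deployment_name}/{version_name}/{model_name}"
# -> ubiops-deployment/my-deployment//my-model (default version)
```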
Authentication¶
For authentication to the UbiOps API via the OpenAI client, you can use a standard UbiOps API token (e.g. `"Token a1b2c3d4e5f6"`). However, the OpenAI client doesn't accept token prefixes, so the `Token` prefix must be removed. The provided `api_key` should therefore look like e.g. `"a1b2c3d4e5f6"`. See the Python example for more information.
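For example, with a dummy token for illustration:

```python
ubiops_token = "Token a1b2c3d4e5f6"
# Strip the "Token " prefix, which the OpenAI client does not accept
api_key = ubiops_token.removeprefix("Token ")  # Python 3.9+, yields "a1b2c3d4e5f6"
```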
Send requests¶
We will show two examples of sending requests to a UbiOps deployment version via the `/v1/chat/completions` endpoint using the OpenAI format: the first uses the OpenAI Python client, the second plain curl. The `/v1/chat/completions` endpoint accepts JSON payloads, so the input/output configurations of your deployment or pipeline should be set to plain.
Parsing of the model value
Note that the `model` value contains both deployment name information and the actual name of a model that you serve in your deployment code. UbiOps parses the `model` value, uses the first parts to determine which deployment or pipeline to route the request to, and then adds the last part (`model_name`) as a key to your input payload. For example, for a `model` value of `ubiops-deployment/my-deployment//smollm`, the key-value pair `{"model": "smollm"}` is added to your input payload. This information can be processed within your deployment, as sketched below.
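For illustration, a minimal sketch of deployment code that reads this injected key, assuming the standard UbiOps deployment layout with a `Deployment` class and a `request` method (the model names here are hypothetical):

```python
import json


class Deployment:
    def __init__(self, base_directory, context):
        # Hypothetical model names; they must match the last part of the
        # OpenAI `model` value that clients send.
        self.available_models = ["smollm", "gemma"]

    def request(self, data):
        # With a plain input configuration the payload may arrive as a JSON
        # string; parse it defensively in case it is already a dict.
        payload = json.loads(data) if isinstance(data, str) else data
        # UbiOps injected e.g. {"model": "smollm"} for a request with
        # model value ubiops-deployment/my-deployment//smollm
        model_name = payload.get("model")
        if model_name not in self.available_models:
            raise ValueError(f"Unknown model: {model_name}")
        # ...generate a chat completion with the selected model here...
        return json.dumps({"model": model_name})
```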
Python¶
from openai import OpenAI

ubiops_token = "Token ..."
project_name = "openai-project"
deployment_name = "openai-deployment"
version_name = ""  # Send request to the default version
model_name = "smollm"

# Generate the OpenAI client
client = OpenAI(
    api_key=ubiops_token[6:] if ubiops_token.startswith("Token ") else ubiops_token,
    base_url=f"https://api.ubiops.com/v2.1/projects/{project_name}/openai-compatible/v1"
)

# Send a (streaming) request to the deployment
response = client.chat.completions.create(
    model=f"ubiops-deployment/{deployment_name}/{version_name}/{model_name}",
    messages=[{"role": "user", "content": "Can you tell me more about UbiOps in exactly two lines"}],
    stream=True
)
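Since `stream=True` is set, the response is an iterator over chunks; the streamed tokens can, for example, be printed as they arrive:

```python
for chunk in response:
    # delta.content can be None for role or metadata chunks
    print(chunk.choices[0].delta.content or "", end="")
```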
Curl¶
curl -X POST "https://api.ubiops.com/v2.1/projects/<project-name>/openai-compatible/v1/chat/completions" \
  -H "Authorization: Token ..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ubiops-deployment/<deployment-name>/<version-name>/<model-name>",
    "messages": [{"role": "user", "content": "Can you tell me more about UbiOps in exactly two lines"}],
    "stream": false
  }'
Listing available models¶
To list all models that are available to your API token, you can use the `/models` endpoint of the OpenAI API. The complete URL for this endpoint is `{base_url}/models`.
This will return a list of all models that conform to the following requirements:

- The deployment version must have the following labels:
    - `openai-compatible`: `true`
    - `openai-model-names`: `model-1;model-2;...;model-n`
- The UbiOps API token used must have sufficient permissions to list the deployment version.

If you host multiple models within the same deployment, the `openai-model-names` label should be a semicolon-separated list of model names. As explained before, UbiOps parses the `model` value: it takes the last part and adds a key-value pair, for example `{"model": "smollm"}`, to your request payload. Make sure the names specified in your labels match the names of the models that you host in your deployment code.
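As an illustration, these labels can be set on an existing deployment version with the UbiOps Python client; a sketch with placeholder names (check the client library docs for details):

```python
import ubiops

# Placeholder token and host
configuration = ubiops.Configuration(host="https://api.ubiops.com/v2.1")
configuration.api_key["Authorization"] = "Token ..."

core_api = ubiops.CoreApi(ubiops.ApiClient(configuration))
core_api.deployment_versions_update(
    project_name="openai-project",
    deployment_name="openai-deployment",
    version="v1",
    data=ubiops.DeploymentVersionUpdate(
        labels={
            "openai-compatible": "true",
            # Hypothetical model names; match your deployment code
            "openai-model-names": "smollm;gemma",
        }
    ),
)
```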
Sending a request to the `/models` endpoint returns the following format:
{
"object": "list",
"data": [
{
"id": "{UBIOPS_MODEL_ID}",
"object": "model",
"created": "{UNIX TIMESTAMP}",
"owned_by": "{UBIOPS_ORGANIZATION_NAME}"
},
{
"...": "..."
}
]
}
- The `id` field contains the model ID in the format specified in the Format section.
- The `object` field will always be `model`.
- The `created` field contains the UNIX timestamp of the creation time of the deployment version.
- The `owned_by` field contains the name of the UbiOps organization in which the deployment (version) is present.
Return example¶
{
"object": "list",
"data": [
{
"id": "ubiops-deployment/deployment-1/v1/llama-3-3",
"object": "model",
"created": 1686935002,
"owned_by": "organization"
},
{
"id": "ubiops-deployment/deployment-1/v2/model-id-1",
"object": "model",
"created": 1686935002,
"owned_by": "organization"
},
{
"id": "ubiops-deployment/deployment-2/v1/model-id-2",
"object": "model",
"created": 1686935002,
"owned_by": "organization"
}
]
}
Request examples¶
To list the models that are available for your API token using the `/models` endpoint, we will show two examples: one using the OpenAI Python client and the other using curl.
Python¶
from openai import OpenAI

ubiops_token = "Token ..."
project_name = "openai-project"

# Generate the OpenAI client
client = OpenAI(
    api_key=ubiops_token[6:] if ubiops_token.startswith("Token ") else ubiops_token,
    base_url=f"https://api.ubiops.com/v2.1/projects/{project_name}/openai-compatible/v1"
)

result = client.models.list()
print(result)
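The returned object can also be iterated directly, for example to print only the model IDs:

```python
for model in client.models.list():
    print(model.id)  # e.g. ubiops-deployment/deployment-1/v1/llama-3-3
```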
Curl¶
curl -X GET "https://api.ubiops.com/v2.1/projects/<project-name>/openai-compatible/v1/models" \
  -H "Authorization: Token ..." \
  -H "Content-Type: application/json"
Getting started¶
We provide several tutorials that show how to use the OpenAI format: