Migration guide
This guide helps you update H2O MLOps when moving between versions. It outlines migration steps for each version upgrade and focuses on changes that require updates to your code, configuration, or workflows.
For detailed changes to the H2O MLOps Python client, see the Python client migration guide.
HT scorer runtime 1.7.x to 2.0.x
This section applies to you if you run HydrogenTorch (HT) models on the HT scorer runtime and your client code parses scoring responses. Specifically, you are affected if your code does any of the following:
- Expects a single row in the response that contains the entire batch as one JSON blob
- Indexes into nested prediction arrays (for example, `blob["predictions"][row_index]`)
- Unwraps an extra list layer for single-row responses (for example, `blob["predictions"][0]`)
What changed
Runtime 2.0.x adds output parsing that splits the legacy single-blob HT response into per-row JSON strings. This change is required for the batching feature introduced in 2.0.x and for row count validation in the scoring API. For more information, see Scoring runtimes.
All currently released HT models are affected because they produce a single JSON blob for the entire batch. Future HT model versions will produce per-row JSON strings natively, at which point the runtime parsing becomes a no-op passthrough.
Multi-row requests
Runtime 1.7.x returns 1 row containing a single JSON blob for the entire batch:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [[0.3, 0.3, 0.4], [0.5, 0.3, 0.2], [0.1, 0.8, 0.1]], \"labels\": [\"neg\", \"neu\", \"pos\"]}"]
]
}
Runtime 2.0.x returns N rows, one per input row, with flat per-row predictions:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [0.3, 0.3, 0.4], \"labels\": [\"neg\", \"neu\", \"pos\"]}"],
["{\"predictions\": [0.5, 0.3, 0.2], \"labels\": [\"neg\", \"neu\", \"pos\"]}"],
["{\"predictions\": [0.1, 0.8, 0.1], \"labels\": [\"neg\", \"neu\", \"pos\"]}"]
]
}
Single-row requests
Runtime 1.7.x returns 1 row with nested predictions (outer list wrapping):
{
"fields": ["output"],
"rows": [
["{\"predictions\": [[0.9, 0.1]], \"labels\": [\"pos\", \"neg\"]}"]
]
}
Runtime 2.0.x returns 1 row with flat predictions, consistent with the multi-row format:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [0.9, 0.1], \"labels\": [\"pos\", \"neg\"]}"]
]
}
Migration steps
Multi-row requests
Update your client code to read one row per input instead of indexing into a single blob:
Before (runtime 1.7.x):
# One blob returned — manually index predictions by row
import json

response = score(rows)
blob = json.loads(response["rows"][0][0])  # single row always
row_0_predictions = blob["predictions"][0]
row_1_predictions = blob["predictions"][1]
After (runtime 2.0.x):
# One row returned per input — each row is already split
import json

response = score(rows)
row_0 = json.loads(response["rows"][0][0])
row_0_predictions = row_0["predictions"]
row_1 = json.loads(response["rows"][1][0])
row_1_predictions = row_1["predictions"]
Single-row requests
Before (runtime 1.7.x):
# Single row returned — predictions are nested in an extra list layer
import json

response = score(rows)
blob = json.loads(response["rows"][0][0])
predictions = blob["predictions"][0]  # unwrap the extra list layer
After (runtime 2.0.x):
# Single row returned — predictions are flat
import json

response = score(rows)
row_0 = json.loads(response["rows"][0][0])
predictions = row_0["predictions"]  # no extra list layer to unwrap
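If your client must work against both runtime versions during a rollout, one option is to normalize the response shape at parse time. The helper below is a minimal sketch (the function name is invented, not part of the MLOps client): it treats a single response row whose predictions are nested lists as the legacy 1.7.x blob and splits it, and passes 2.0.x per-row responses through unchanged. Note the detection is a heuristic; a 2.0.x model whose per-row prediction is itself a list of lists would be mis-detected.

```python
import json


def normalize_ht_rows(response):
    """Return one parsed dict per input row for both 1.7.x and 2.0.x responses."""
    parsed = [json.loads(row[0]) for row in response["rows"]]
    if len(parsed) == 1:
        blob = parsed[0]
        preds = blob.get("predictions", [])
        # Legacy 1.7.x shape: one row whose predictions nest one list per input row.
        if preds and isinstance(preds[0], list):
            return [{"predictions": p, "labels": blob.get("labels")} for p in preds]
    return parsed
```

With this in place, downstream code can always read `result[i]["predictions"]` regardless of which runtime produced the response.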
Why this change was made
- Batching support: Runtime 2.0.x introduces request batching where multiple requests are merged into a single model call and results are split back by row offset. The old single-blob format made this incompatible — the runtime couldn't determine which predictions belonged to which request without parsing model-specific JSON.
- Row count validation: The scoring API validates that the number of output rows matches the number of input rows. The old single-blob format (1 output row for N input rows) would fail this validation.
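The row-offset bookkeeping that the per-row format enables can be illustrated with a small sketch (a hypothetical helper, not runtime code): merged batch output rows are sliced back into one group per original request using each request's row count.

```python
def split_rows_by_request(merged_rows, request_sizes):
    """Slice merged batch output back into per-request row groups by offset."""
    groups, offset = [], 0
    for size in request_sizes:
        groups.append(merged_rows[offset:offset + size])
        offset += size
    if offset != len(merged_rows):
        # This is exactly the check the old single-blob format could not satisfy.
        raise ValueError("output row count does not match summed request sizes")
    return groups
```

The final check mirrors the scoring API's row count validation: with one output row per input row, the split is unambiguous.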
From 1.0 to 1.1
Runtimes
Removal of MLflow Runtime for Python 3.9
The MLflow Runtime for Python 3.9 is no longer available. Python 3.9 has reached end of life. Update your deployments to use an MLflow Runtime based on Python 3.10 or later.
Removal of Java MOJO Runtime
H2O MLOps version 1.1 removes the Java MOJO Runtime, as previously announced in version 0.67.0.
The following Java-based runtimes are no longer available:
- Driverless AI MOJO Scorer (`dai_mojo_runtime`)
- Driverless AI MOJO Scorer - Shapley original only (`mojo_runtime_shapley_original`)
- Driverless AI MOJO Scorer - Shapley transformed only (`mojo_runtime_shapley_transformed`)
- Driverless AI MOJO Scorer - Shapley all (`mojo_runtime_shapley_all`)
Migrate to the MOJO Scorer (dai-mojo-scorer), which replaces both the Java and C++ MOJO runtimes. The MOJO Scorer supports all Shapley contribution types and accepts a wider range of algorithms, including BERT, GrowNet, and TensorFlow models.
- To deploy BERT, GrowNet, or TensorFlow models, link the experiment from Driverless AI. Manually uploaded artifacts for these model types are not supported.
- The `H2O_SCORER_WORKERS` environment variable is no longer used. To tune scoring performance for `dai-mojo-scorer`, use `SCORING_CONCURRENCY` and `BATCH_WORKERS` instead. For details, see Concurrency and performance tuning.
H2O-3 MOJO models are now scored using the H2O-3 MOJO Scorer (h2o3-mojo-scorer). For full runtime details, configuration, and migration steps, see Scoring runtimes.
Migrating Driverless AI MOJO deployments
- Switch the runtime to MOJO Scorer (`dai-mojo-scorer`).
- Set the `MODEL_PATH` environment variable to your MOJO pipeline model directory and ensure the `DRIVERLESS_AI_LICENSE_KEY` environment variable is configured.
- Shapley contributions and prediction intervals are auto-detected per model — no additional configuration is needed.
- Replace `H2O_SCORER_WORKERS` with `SCORING_CONCURRENCY` and `BATCH_WORKERS`. For details, see Concurrency and performance tuning.
- The REST API endpoints (`/model/score`, `/model/contribution`, etc.) are unchanged.
Migrating H2O-3 MOJO deployments
- Switch the runtime to H2O-3 MOJO Scorer (`h2o3-mojo-scorer`).
- Set the `SCORER_MOJO_PATH` environment variable (or `-Dmojo.path` system property) to point to your `.mojo` file.
- If you need Shapley contributions, set `SHAPLEY_ENABLE=true` or configure `SHAPLEY_TYPES_ENABLED`. For details, see H2O-3 MOJO Scorer configuration.
- The REST API endpoints are unchanged.
API compatibility
Both dai-mojo-scorer and h2o3-mojo-scorer implement the same REST API as the legacy Java MOJO runtime. The endpoints, request/response formats, and OpenAPI specification are unchanged. Your existing client integrations work without modification.
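Because the REST API is unchanged, existing scoring calls keep working. As an illustration only, a request body can be assembled in the row-based shape used throughout this guide (a hedged sketch: the helper name is invented, and the stringified cells follow the string-typed rows shown in the response examples, not a confirmed request schema):

```python
import json


def build_score_request(fields, rows):
    """Serialize a row-based payload: {"fields": [...], "rows": [[...]]}.

    Cells are stringified, mirroring the string-typed rows in the examples.
    """
    return json.dumps(
        {"fields": list(fields), "rows": [[str(cell) for cell in row] for row in rows]}
    )
```

The resulting body can then be POSTed to the unchanged `/model/score` endpoint with your usual HTTP client.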
Hydrogen Torch scorer runtime response format changes
Upgrading the HydrogenTorch scorer runtime from 1.7.x to 2.0.x introduces breaking changes to the scoring API response format. All currently released HT models are affected.
Multi-row requests (N > 1 input rows)
Runtime 1.7.x returns one row containing a single JSON blob with the entire batch:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [[0.3, 0.3, 0.4], [0.5, 0.3, 0.2]], \"labels\": [\"neg\", \"neu\", \"pos\"]}"]
]
}
Runtime 2.0.x returns N rows, one per input row, with flat per-row predictions:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [0.3, 0.3, 0.4], \"labels\": [\"neg\", \"neu\", \"pos\"]}"],
["{\"predictions\": [0.5, 0.3, 0.2], \"labels\": [\"neg\", \"neu\", \"pos\"]}"]
]
}
Single-row requests (1 input row)
Runtime 1.7.x returns nested predictions with outer list wrapping:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [[0.9, 0.1]], \"labels\": [\"pos\", \"neg\"]}"]
]
}
Runtime 2.0.x returns flat predictions:
{
"fields": ["output"],
"rows": [
["{\"predictions\": [0.9, 0.1], \"labels\": [\"pos\", \"neg\"]}"]
]
}
Migrating client code
Update response parsing from:
# Old (runtime 1.7.x): one blob, manually index by row
import json

response = score(rows)
blob = json.loads(response["rows"][0][0])  # single row always
row_0_predictions = blob["predictions"][0]
row_1_predictions = blob["predictions"][1]
To:
# New (runtime 2.0.x): one row per input, already split
import json

response = score(rows)
row_0 = json.loads(response["rows"][0][0])
row_0_predictions = row_0["predictions"]
row_1 = json.loads(response["rows"][1][0])
row_1_predictions = row_1["predictions"]
Header forwarding changes in runtimes 2.0.x
Runtimes 2.0.x introduces a new Go+Worker architecture that changes how HTTP request headers are accessed in model code. The Python HTTP server (FastAPI/Gunicorn) no longer handles requests directly. Models that access HTTP headers from the request context must be updated.
Before (runtimes 1.x — no longer works in 2.0.x):
# In MLflow pyfunc model predict()
token = request.headers.get("Authorization", "")
Deployment configuration:
H2O_SCORER_AUTHORIZATION_HEADER="Authorization"
After (runtimes 2.0.x):
from h2o_scorer_core.context import request_headers
# In MLflow pyfunc model predict()
headers = request_headers.get({})
token = headers.get("Authorization", "")
Deployment configuration:
H2O_SCORER_FORWARD_HEADERS="Authorization"
Key differences
| Aspect | Runtimes 1.x | Runtimes 2.0.x |
|---|---|---|
| HTTP server | FastAPI / Gunicorn (Python) | Go HTTP server + Python worker (Unix socket IPC) |
| Environment variable | H2O_SCORER_AUTHORIZATION_HEADER | H2O_SCORER_FORWARD_HEADERS |
| Multiple headers | No (single header) | Yes (comma-separated) |
| Access in model code | FastAPI/Starlette request context | h2o_scorer_core.context.request_headers ContextVar |
Only explicitly listed headers are forwarded. Per-request tracing headers (x-request-id, traceparent, etc.) are automatically excluded to avoid degrading batching performance.
HTTP request header forwarding
Starting with H2O MLOps version 1.1 (which ships with runtimes 2.0.x), the scoring runtime HTTP server has changed from a Python-based stack (FastAPI/Gunicorn) to a Go HTTP server.
Scope
This change affects you if you:
- Use the `H2O_SCORER_AUTHORIZATION_HEADER` environment variable
- Have MLflow pyfunc models that read HTTP headers during scoring (via FastAPI/Starlette request context)
- Rely on forwarded auth tokens to call external APIs from within model code
Pyfunc models that read HTTP headers through the FastAPI Request object or rely on the H2O_SCORER_AUTHORIZATION_HEADER environment variable no longer receive headers in runtimes 2.0.x. You must update your model code and environment configuration to use the new header forwarding mechanism described below.
Before (runtimes 1.x)
# Accessing headers via FastAPI request context
from starlette.requests import Request

def predict(self, context, model_input):
    request: Request = context.request
    token = request.headers.get("Authorization", "")
    ...
# Configuring a single header to forward
H2O_SCORER_AUTHORIZATION_HEADER="Authorization"
After (runtimes 2.0.x)
from h2o_scorer_core.context import request_headers

def predict(self, context, model_input):
    headers = request_headers.get({})  # dict[str, str]
    token = headers.get("Authorization", "")
    ...
Deployment configuration
If you're configuring the runtime container, replace H2O_SCORER_AUTHORIZATION_HEADER with H2O_SCORER_FORWARD_HEADERS:
# Configuring headers to forward (comma-separated)
H2O_SCORER_FORWARD_HEADERS="Authorization,X-Custom-Token"
H2O_SCORER_FORWARD_HEADERS accepts a comma-separated list of header names. The runtime automatically blocks internal tracing headers (for example, x-request-id, traceparent) from being forwarded to the model code, regardless of what you configure in this variable.
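The forwarding rules can be approximated with a short sketch (illustrative only, not the runtime's actual implementation; the blocked set here is limited to the two tracing headers named above):

```python
BLOCKED_TRACING_HEADERS = {"x-request-id", "traceparent"}


def forwardable_headers(env_value, incoming):
    """Select the headers that would reach model code, matched case-insensitively."""
    allowed = {h.strip().lower() for h in env_value.split(",") if h.strip()}
    allowed -= BLOCKED_TRACING_HEADERS  # tracing headers are always excluded
    return {name: value for name, value in incoming.items() if name.lower() in allowed}
```

Even if a tracing header is listed in the environment variable, it is dropped before reaching model code.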
The following table summarizes the key differences between runtimes 1.x and 2.0.x:
| Aspect | Runtimes 1.x | Runtimes 2.0.x |
|---|---|---|
| HTTP server | FastAPI/Gunicorn (Python) | Go HTTP server + Python worker (Unix socket) |
| Header access mechanism | FastAPI/Starlette Request object | h2o_scorer_core.context.request_headers ContextVar |
| Configuration env var | H2O_SCORER_AUTHORIZATION_HEADER (single header) | H2O_SCORER_FORWARD_HEADERS (comma-separated list) |
| Supports multiple headers | No (single header) | Yes (comma-separated) |
| Batching safety | N/A | Automatic batch-key partitioning by header values |
Batch scoring Secure Store integration
Starting with H2O MLOps version 1.1, secret fields in batch scoring source and sink configurations must reference Secure Store Secret IDs rather than raw sensitive values.
Using secrets in batch scoring
- Create secrets in the Secure Store.
- Reference Secret IDs in your batch scoring source or sink configuration.
This requirement applies to both the UI and the Python client.
Removal of pre-AuthZ server AuthZ capability from the gRPC gateway
The pre-AuthZ server authorization capability has been removed from the gRPC gateway. The configuration options apiGateway.config.userJwtClaim and apiGateway.config.allowedUserRoles, which previously allowed you to configure role and group values for accessing the MLOps API Gateway, have been removed.
To achieve the same functionality, define an AuthZ policy that denies all MLOps actions across all workspaces.
From 0.70.0 to 1.0.0
Workspace integration
MLOps 1.0.0 is integrated with the Workspace service. All projects have been migrated to Workspaces, and both the user interface and Python client have been updated accordingly.
Python client
The legacy, automatically generated Python client is no longer compatible with MLOps 1.0.0. Only h2o-mlops 1.4.0 and higher is supported. Migrate your workflows to the new Python client.
To migrate from version 1.3.x to 1.4.x, see the Python client migration guide from v1.3.x to v1.4.x.
Removal of Wave UI
Starting from H2O MLOps version 1.0.0, the legacy Wave-based user interface is no longer available. The official and supported MLOps user interface is now part of the H2O AI Cloud user interface. The Admin Analytics Wave app has also been removed, and its capabilities have been migrated to the new interface.
Helm chart changes
In the Affinity and Tolerations configuration, the field `matchExpression` has been replaced by `matchExpressions`.
The option `apiGateway.authorization.enabled` has been removed, as authorization is now always enabled on the API Gateway. If this option was set to `false` in a previous MLOps version, make sure `apiGateway.authorization.allowedUserRoles` remains set to `[]`, which is the default value of that option.
Monitoring setup changes
Starting from H2O MLOps version 1.0.0, the legacy monitoring setup has been removed and replaced with a new configuration method using the MonitoringOptions class in the Python client. Monitoring is disabled by default and must be explicitly enabled during or after deployment.
Kafka sink configuration is now more granular and can be set on a per-model basis.
For more information, see Monitoring setup.
Migrate from old monitoring to new monitoring
The new monitoring system differs from the legacy setup. To migrate legacy monitoring data and existing deployments, MLOps 1.0.0 includes a migration job. You must explicitly enable this job during or after installation using Helm chart parameters.
Migration is optional. New monitoring works for newly created deployments without migration. However, deployments created before MLOps 1.0.0 do not have monitoring enabled, and historical scoring metrics are unavailable unless you run the migration. To enable the migration, set the following Helm parameters:
Before you run the migration, scale all deployments with monitoring enabled to at least 1 replica. The migration requires active pods to transfer monitoring data and configurations. Deployments scaled to 0 are not migrated.
global:
components:
influxdb:
enabled: true
superset:
enabled: true
mlops:
config:
models:
monitoringEnabled: true
monitoringMigrationEnabled: true
To enable the migration, you must also enable the new monitoring. You cannot enable only the migration.
Hash security option changes
Starting from H2O MLOps version 1.0.0, hash-based security options require you to provide the passphrase directly. The hashing is now handled automatically in the backend.
Make sure to store the passphrase in a secure location, as you won't be able to retrieve it after it's submitted.
From 0.69.x to 0.70.0
Transition from Scoring Client to native batch scoring
Starting from H2O MLOps version 0.70, batch scoring functionality has been natively integrated into H2O MLOps, replacing the H2O MLOps Scoring Client. The native batch scoring implementation is available through the official H2O MLOps Python client.
For added convenience, batch scoring can also be performed through the new H2O MLOps UI. Contact H2O.ai support if you need guidance with this migration.
Workload identity and IAM authentication
Starting from H2O MLOps version 0.70, workload identity and IAM authentication will be managed using the github.com/h2oai/go-pkg/database/postgres/v2 library for the mlops-storage, mlops-telemetry, and mlops-deployer components.
Update the connection strings for these components to match the formats shown in the examples below:
Example of the mlops-storage and mlops-telemetry database connection string:
storage_db_connection_string = "postgres://${var.mlops_db_username}@${var.mlops_db_address}:5432/${var.mlops_storage_db}?aws_iam_auth_enabled=true&aws_iam_auth_region=${var.aws_region}&aws_iam_auth_user=${var.mlops_db_username}&aws_iam_auth_endpoint=${var.mlops_db_address}:5432"
Example of the mlops-deployer database connection string:
deployment_db_connection_string = "postgres://${var.mlops_deployment_db_address}/${var.mlops_deployment_db_name}?sslmode=${var.db_connection_ssl_mode}&user=${urlencode(var.mlops_deployment_db_username)}&password=${urlencode(var.mlops_deployment_db_password)}"
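The `urlencode` calls in the deployer example matter whenever credentials contain reserved URL characters. The Python-side equivalent is `urllib.parse.quote`; the sketch below (a hypothetical helper with placeholder values, not the Terraform interpolation above) builds a string in the same shape:

```python
from urllib.parse import quote


def deployer_conn_string(address, db, ssl_mode, user, password):
    """Build a deployer-style connection string with URL-encoded credentials."""
    return (
        f"postgres://{address}/{db}?sslmode={ssl_mode}"
        f"&user={quote(user, safe='')}&password={quote(password, safe='')}"
    )
```

Without the encoding, characters such as `@`, `&`, or spaces in a password would corrupt the query string.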
Removal of mTLS
mTLS is no longer managed by Kubernetes jobs. For environments requiring mTLS communication between MLOps services, this should now be handled by a service mesh solution (such as Istio).
Previous versions used SPIFFE for service-to-service authentication. Version 0.70+ now uses service account tokens instead.
Migration steps
If your deployment requires mTLS:
- Remove any existing Kubernetes job configurations for mTLS.
- Implement a service mesh solution to manage mTLS between services.
- Configure your service mesh to handle the TLS certificate management.
Changes in Helm
- Config `tls` has been removed.
- Config `storage.tls` has been removed.
- Config `deployer.tls` has been removed.
- Config `ingest.tls` has been removed.
- Config `apiGateway.tls` has been removed.
- Config `monitoringAppBackend.tls` has been removed.
- Config `deployer.telemetry.auth.tlsEnabled` has been removed.
- Config `telemetry.serverSecurityEnabled` has been removed.
- Config `storage.auth.service` has been added:

service:
  # -- Issuer URL for service authentication.
  # In general, this should be set to whatever the "issuer" field is in the cluster's
  # OIDC discovery document.
  # `kubectl get --raw /.well-known/openid-configuration` can be used to retrieve
  # the discovery document.
  issuerURL: "https://kubernetes.default.svc"
  # -- Configures whether to validate the issuer URL for service authentication.
  # The issuer of some service account variants does not need to be the issuer
  # specified by the issuerURL.
  validateIssuer: false
  # -- Configures whether to use the Kubernetes HTTP client with TLS and token for issuer discovery
  # and downloading the signing keys.
  # Disable this when the issuer is not the Kubernetes API server.
  useKubernetesHTTPClient: true
Removal of support for older H2O Driverless AI versions
In MLOps version 0.70, support for H2O Driverless AI versions 1.10.6.3 and earlier has been discontinued. This change affects the following versions:
- 1.10.5-cuda11.2.2
- 1.10.5.1-cuda11.2.2
- 1.10.6-cuda11.2.2
- 1.10.6.1-cuda11.2.2
- 1.10.6.2-cuda11.2.2
- 1.10.6.3-cuda11.2.2
Removal of Pickle Runtime
Starting from H2O MLOps version 0.70, the Pickle Runtime has been removed.
Removal of environment from Python Client and UI
Starting with MLOps version 0.70.0, the environment feature has been removed from the user perspective in both the Python client and the UI. This change does not apply to the backend, and environment-related functionalities remain intact.
Changes in the UI
- Users no longer need to select an environment (e.g., PROD or DEV) when creating a deployment.
- The environment now defaults internally to `PROD`.
- Environment-related details are no longer visible in the UI.
Changes in the Python Client
- The `environments` property of the `MLOpsProject` instance is no longer available starting from client version 1.3.0.
- The environment now defaults internally to `PROD`.
- When using the updated client, make the following adjustments in your code. Here, `project` refers to an instance of `MLOpsProject`, and `client` refers to an instance of `h2o_mlops.Client`:
  - Replace `project.environments.get(uid).deployments` with `project.deployments`.
  - Replace `project.environments.get(uid).endpoints` with `project.endpoints`.
  - Replace `project.environments.get(uid).allowed_affinities` with `client.allowed_affinities`.
  - Replace `project.environments.get(uid).allowed_tolerations` with `client.allowed_tolerations`.

With these adjustments, your code remains compatible with both the updated and older versions of the MLOps backend.
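If the same script must run against both old and new client versions during the transition, a small shim can pick the right access path (a hypothetical helper; `environment_uid` stands for whatever UID your legacy code already passed to `environments.get`):

```python
def get_deployments(project, environment_uid=None):
    """Return deployments via the new flat API, falling back to the legacy environments API."""
    if hasattr(project, "environments") and environment_uid is not None:
        # Old client (< 1.3.0): deployments hang off an environment object.
        return project.environments.get(environment_uid).deployments
    # New client: deployments are available directly on the project.
    return project.deployments
```

The same pattern applies to `endpoints`, `allowed_affinities`, and `allowed_tolerations`.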
From 0.68.x to 0.69.0
MLOps runtimes
All runtime images must be updated to at least version 1.5.3, which was released with H2O MLOps version 0.69.0.
Starting with version 0.69.0, all runtime images must always be updated to the runtime images released with the corresponding H2O MLOps version. For example, H2O MLOps 0.69.1 was released with runtime images v1.5.4, so all deployment images must be updated to runtimes v1.5.4.
MLOps storage
Starting with MLOps version 0.69.0, only blob storages are supported as the backend. Support for other storage options has been discontinued. This change impacts the configuration parameters used in the MLOps Helm charts.
Changes to storage configuration parameters
storage:
persistence:
# All parameters under this section are no longer supported.
cloudPersistence:
# The 'enabled' parameter was removed since cloudPersistence is now the only option supported.
enabled:
pvcMigration:
# All parameters under this section are no longer supported.
After upgrading to MLOps 0.69.0, you can safely delete your existing PVC that was used as the storage backend prior to the 0.68.0 release. Perform this step manually to prevent any unintended data loss.
PBKDF2 hash support
H2O MLOps v0.69.0 now supports the PBKDF2 passphrase hash algorithm for more secure hashing. Note the following details:
- The PBKDF2 hash should follow the format `pbkdf2:<hashFunc>:<iterations>$<salt>$<hash>`.
- The `salt` and `hash` should be base64 encoded.
- PBKDF2 hashing replaces bcrypt when creating deployments with the Passphrase (Stored hashed) security option.
- The Passphrase (Stored hashed) security option is listed as an available option in the Create Deployment panel dropdown only if `PASSPHRASE_HASH_TYPE_PBKDF2` is included under `securityOptions.activated` in the `values.yaml`. Having `PASSPHRASE_HASH_TYPE_BCRYPT` is neither sufficient nor required.
- Older deployments created with bcrypt hashing remain accessible without requiring any additional configuration.
From 0.67.x to 0.68.0
(Optional) Vertical Pod Autoscaler (VPA) support
MLOps version 0.68.0 introduces Vertical Pod Autoscaler (VPA) support for the Deployer. Note that VPA activation is optional and performed upon request. VPA allows dynamic scaling of CPU and memory resources based on application usage, improving resource efficiency and optimizing costs.
Before activating VPA in MLOps, verify that VPA is supported in the cluster and that the VPA CRDs and controllers are up and running alongside the Metrics Server.
For more information, see the Installation section of the VPA GitHub README and the Metrics Server installation instructions.
Note: For a list of known limitations, see the Known limitations section of the VPA GitHub README.
Key changes
- VPA Resource Specifications: Added VPA resource specification logic to the Scoring Apps and App Composer, allowing for the dynamic adjustment of their resource limits based on real-time demand.
- API Updates: New API logic has been added for specifying and validating VPA resources.
- New VPA Utility Functions: Implemented utility methods for creating and managing VPA resources, including validation and resource quantity handling.
- Deprecated Function Removal: Removed the deprecated Fabric8 createOrReplace usage in the Scoring Apps.
Removal of HT runtime based on Python 3.8
The Hydrogen Torch (HT) runtime based on Python 3.8, which was available by default in MLOps version 0.67.x, has been removed as of MLOps version 0.68.0. However, you can still use this runtime by registering it through extra runtimes.
The following requirements need to be met so that the runtime registered through extra runtimes is also visible in the UI:
- The `mlflow/flavors/python_function/loader_module` must match `mlflow.pyfunc.model`.
- The runtime name must adhere to this pattern: `(python-scorer_hydrogen_torch_)(\w*)(38)(\w*)`.
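You can check a candidate runtime name against that pattern before registering it. The example names below are hypothetical; only the pattern itself comes from the requirements above:

```python
import re

# The naming pattern required for the runtime to be visible in the UI.
HT_38_NAME = re.compile(r"(python-scorer_hydrogen_torch_)(\w*)(38)(\w*)")


def is_visible_ht38_runtime(name):
    """True if the runtime name fully matches the required pattern."""
    return HT_38_NAME.fullmatch(name) is not None
```

Note that `fullmatch` is used so that trailing characters outside `\w` (such as dots or dashes) cause the check to fail.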
Configure maximum number of Kubernetes replicas
With H2O MLOps v0.68.0, you can configure the maximum number of Kubernetes replicas that can be specified when creating a new deployment. To do this, update maxDeploymentReplicas in the values.yaml file (charts/mlops/values.yaml). By default, the maxDeploymentReplicas value is set to 5.
Removal of MLflow runtimes based on Python 3.8
MLflow runtimes based on Python 3.8 have been removed in MLOps version 0.68.0. Python 3.8 has officially reached end of life as of October 07, 2024.
Pickle runtime based on Python 3.12
MLOps version 0.68.0 introduces a pickle runtime using Python 3.12. Choose one of the following options:
- Update your models to work with Python 3.12.
- If you cannot update your models, you can configure the original pickle runtime based on Python 3.8.18 during MLOps installation by replacing the `pickle-3.12.7` image with `pickle-3.8.18`.
Deployment of MLOps Telemetry as a long-running microservice
In MLOps version 0.67 and earlier, the MLOps telemetry component was configured as a cron job within the MLOps storage component in the Helm configuration. Starting with MLOps version 0.68, the MLOps telemetry component must be deployed as a separate long-running microservice that publishes event data at scheduled intervals.
To migrate from MLOps version 0.67 to 0.68:
- Remove the cron job configuration from the MLOps `storage` component in the Helm configuration.
- Implement it as a separate `telemetry` component within Helm.
Helm values must be set as follows:
# Telemetry Configurations
telemetry = {
enabled = true
image = {
repository = "h2oai-modelstorage-telemetry${local.shared_services_repository_suffix}"
tag = local.component_version.mlops_telemetry_version
}
replicaCount = 1
nodeSelector = {
"hac.h2o.ai/provisioner" = "karpenter"
}
tolerations = [
{
key = "type"
operator = "Equal"
value = "cpu-consolidation"
effect = "NoSchedule"
}
]
podSecurityContext = {
enabled = true
}
containerSecurityContext = {
enabled = true
}
serviceAccount = {
name = "hac-mlops-storage-telemetry-service-account"
}
serverAddress = "hac-telemetry-service.telemetry.svc.cluster.local:80"
config = {
logLevel = "error"
}
}
Scheduler routine for MLOps Telemetry
MLOps version 0.68.0 introduces the `SCHEDULER_INTERVAL_SECONDS` environment variable to run the scheduler routine inside the application itself, replacing the use of a cron job. As a result, MLOps Telemetry is deployed as a long-running deployment in the Kubernetes cluster that publishes event data at scheduled intervals. The default value is as follows:
SCHEDULER_INTERVAL_SECONDS=300
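The behavior described above amounts to a simple in-process loop. As an illustrative sketch only (not the telemetry service's actual code), the interval is read once from the environment with the documented default of 300 seconds:

```python
import os
import time


def run_scheduler(publish, stop_after=None):
    """Call `publish` every SCHEDULER_INTERVAL_SECONDS seconds (default 300)."""
    interval = int(os.environ.get("SCHEDULER_INTERVAL_SECONDS", "300"))
    runs = 0
    while stop_after is None or runs < stop_after:
        publish()  # publish the pending telemetry event data
        runs += 1
        if stop_after is not None and runs >= stop_after:
            break
        time.sleep(interval)
    return runs
```

The `stop_after` parameter exists only to make the sketch testable; a real long-running service would loop until terminated.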
Restructured environment security options
Environment-related security options are now configured in a different way. Prior to v0.68.0, security options were specified using their corresponding numerical values. For example:
securityOptions: [1,2,3]
From v0.68.0 onwards, activated security options are configured in the values.yaml file (charts/mlops/values.yaml) using the security option name. For example:
securityOptions:
activated:
- .......
- "AUTHORIZATION_PROTOCOL_OIDC"
- .......
You can also set the default security option in the values.yaml file (charts/mlops/values.yaml) using the security option name. The default option serves as the security setting applied in the UI when creating a deployment, and it must be part of the Activated Security Options List.
securityOptions:
activated:
- .......
- "PASSPHRASE_HASH_TYPE_PLAINTEXT"
- .......
default: "PASSPHRASE_HASH_TYPE_PLAINTEXT"
The following security options are supported in v0.68.0:
- DISABLED: No security options are activated.
- PASSPHRASE_HASH_TYPE_PLAINTEXT: Passphrase hash type is plaintext.
- PASSPHRASE_HASH_TYPE_BCRYPT: Passphrase hash type is bcrypt.
- AUTHORIZATION_PROTOCOL_OIDC: OIDC authorization protocol is activated.
- The Activated Security Options List cannot be empty.
- The default option must be part of the Activated Security Options List.
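Those two constraints, together with the set of supported option names, can be expressed as a quick configuration check (an illustrative sketch, not part of the Helm chart tooling):

```python
# The security options supported in v0.68.0, as listed above.
VALID_OPTIONS = {
    "DISABLED",
    "PASSPHRASE_HASH_TYPE_PLAINTEXT",
    "PASSPHRASE_HASH_TYPE_BCRYPT",
    "AUTHORIZATION_PROTOCOL_OIDC",
}


def validate_security_options(activated, default):
    """Enforce: non-empty activated list, known names, default is activated."""
    if not activated:
        raise ValueError("the activated security options list cannot be empty")
    unknown = set(activated) - VALID_OPTIONS
    if unknown:
        raise ValueError(f"unknown security options: {sorted(unknown)}")
    if default not in activated:
        raise ValueError("the default option must be part of the activated list")
```

Running such a check before applying the chart catches misconfigured `securityOptions` blocks early.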
From v0.68.0 onwards, the way to create a deployment with No Security via API call also differs from previous versions. This change includes the following modifications to the h2o-mlops Python Client:
- `security_options` is now a required field for the `create_single` method of the `MLOpsScoringDeployments` class.
- To ensure backward compatibility, v0.68.0 includes a new attribute for the `SecurityOptions` class, called `disabled_security`. This attribute handles the No Security case when set to `True`, instead of treating `None` or `SecurityOptions()` as No Security.
- Users of MLOps assembly v0.68.0 or above must set `disabled_security=True` to use the No Security option. For users on older versions, No Security mode can be accessed by using `SecurityOptions` with default values.
Helm changes
- As of version 0.68.0, the `ENABLE_USER_EXTERNALID_UPDATE` environment variable has been removed from storage, as it is no longer necessary.
- `deploymentEnvironment.corsOrigin` has been removed. Use `global.cors.allowedOrigin` instead.
Default deployment security option
As of version 0.68.0, the default security option for deployment is PASSPHRASE_HASH_TYPE_PLAINTEXT. Prior to this version, deployments were not secured by default.
Cloud migration information: MLOps storage
Starting with version 0.68.0, H2O MLOps no longer supports PVCs for storage, transitioning instead to cloud blob storage. MLOps storage supports blob storage from all three major cloud providers—AWS, Azure, and GCP—as well as Minio for on-premises installations. Consequently, all existing data must be migrated from the PVC to blob storage during the upgrade to MLOps 0.68.0. All data migration steps are handled by MLOps when MLOps storage is deployed in MIGRATE mode, and no manual user intervention is needed. End users shouldn't experience any downtime or data loss while the migration is in progress.
Installation instructions
Deploy storage in MIGRATE mode
Note: Only follow the instructions in this section if MLOps storage was previously deployed with LOCAL mode using a Kubernetes PVC as the storage.
For AWS environments with S3
IAM auth is used to access the bucket. The following annotation must be set on the storage service account:
eks.amazonaws.com/role-arn: <iam-role-arn>

Helm values must be set as follows:
storage:
serviceAccount:
create: true
annotations: {
eks.amazonaws.com/role-arn: <iam-role-arn>
}
persistence:
enabled: true
cloudPersistence:
enabled: true
url: s3://<bucket-name>?region=<bucket-region>&prefix=<optional-prefix>
pvcMigration:
enabled: true
cloudProvider: s3
bucketName: <bucket-name>
region: <bucket-region>
prefix: <optional-prefix>
For GCP environments with Google Cloud Storage
Workload identity is used to access the bucket. The following annotation must be set on the storage service account:
iam.gke.io/gcp-service-account: <service_account_email>
Helm values must be set as follows:
```yaml
storage:
  serviceAccount:
    create: true
    annotations:
      iam.gke.io/gcp-service-account: <service_account_email>
  persistence:
    enabled: true
    cloudPersistence:
      enabled: true
      url: gs://<bucket-name>
    pvcMigration:
      enabled: true
      cloudProvider: gcs
      bucketName: <bucket-name>
      region: <bucket-region>
```
For Azure environments with Azure Blob Storage
Workload Identity is used to access the bucket. The following annotation must be set on the storage service account:
azure.workload.identity/client-id: <client-id>
The following label must be set on the storage pods (service and migrator job):
azure.workload.identity/use: "true"
Helm values must be set as follows:
```yaml
storage:
  serviceAccount:
    create: true
    annotations:
      azure.workload.identity/client-id: <client-id>
  extraPodLabels:
    azure.workload.identity/use: "true"
  persistence:
    enabled: true
    cloudPersistence:
      enabled: true
      url: azblob://<bucket-name>
    pvcMigration:
      enabled: true
      cloudProvider: azureblob
      bucketName: <bucket-name>
      region: <bucket-region>
      accountName: <storage-account-name>
```
For on-premises environments with Minio
```yaml
storage:
  persistence:
    enabled: true
    cloudPersistence:
      enabled: true
      url: s3://<minio-bucket-name>?endpoint=<minio-url>&region=<minio-region>&hostname_immutable=true
      access_key_id: <minio-access-key-id>
      secret_access_key: <minio-secret-access-key>
    pvcMigration:
      enabled: true
      cloudProvider: minio
      bucketName: <bucket-name>
      region: <minio-region>
      endpoint: <minio-url>
      access_key_id: <minio-access-key-id>
      secret_access_key: <minio-secret-access-key>
```
From 0.66.1 to 0.67.0
Announcement: Java MOJO Runtime removal
The Java MOJO Runtime was removed in H2O MLOps version 1.1; version 0.68.0 was the last release to include it. For migration steps, see Removal of Java MOJO Runtime. The C++ MOJO runtime has also been replaced by `dai-mojo-scorer`.
Scoring runtimes
- MLflow runtime images are now twice as large, so deployments of these runtimes can take longer due to longer image pull times.
- Runtimes for DAI 1.10.4.3 and older are removed as of MLOps version 0.67.0.
- MLflow runtimes support Python 3.8 and later starting with MLOps version 0.67.0.
For more information on scoring runtimes in H2O MLOps, see Scoring runtimes.
Python client
Starting with version 0.67.0, the official Python client of H2O MLOps is h2o-mlops. The minimum Python version required for the client is Python 3.9.
Built on top of the legacy Python client, h2o-mlops retains all previous functionality. You can continue to access the legacy client's features through h2o-mlops as needed.
Note that users of the legacy client can switch to the new Python client (h2o-mlops) by importing h2o-mlops before using any features of the legacy client. This switch can be made without modifying any existing code or import statements.
Removal of Conda from Wave app
With the removal of Conda as of MLOps version 0.67.0, third-party models can no longer be uploaded to the MLOps frontend using serialized Pickle files. However, you can still upload models from frameworks like scikit-learn, PyTorch, XGBoost, LightGBM, and TensorFlow using MLflow packaged files.
Monitoring data retention
- Starting with version 0.67.0, a per-project data retention duration can be set for monitoring data stored in InfluxDB. To enable this feature, set the `MONITOR_INFLUXDB_PER_PROJECT_DATA_RETENTION_PERIOD` environment variable on the deployer with a valid duration string. The minimum retention period is `1h` and the maximum is `INF`. If `MONITOR_INFLUXDB_PER_PROJECT_DATA_RETENTION_PERIOD` is not set, `INF` is the default.
- The `monitor_influxdb_per_project_data_retention_period` value is exposed in the H2O MLOps Helm charts to set `MONITOR_INFLUXDB_PER_PROJECT_DATA_RETENTION_PERIOD` for the deployer.
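For example, a 30-day retention period might be configured in the Helm values as follows (a sketch; the exact placement of the key in the chart may differ in your version):

```yaml
# Hypothetical sketch: 720h = 30 days. Valid values range from 1h to INF.
# The nesting under `deployer` is an assumption about the chart layout.
deployer:
  monitor_influxdb_per_project_data_retention_period: 720h
```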
Emissary
H2O MLOps has switched from Emissary to the Gateway API:
- Emissary's CRDs are no longer used.
- Deployments are mapped to HTTP routes using the Gateway API's HTTPRoute CRD.
- The Gateway API is implemented with Envoy Gateway.
- (Breaking change) The Gateway API doesn't support custom error responses. This means that if a deployment is scaled down, the following custom error body is no longer displayed: `Deployment is scaled down to zero replicas. Please increase the number of replicas to use the deployment.` For more information, see Custom error responses.
- (Breaking change) If a deployment is scaled down, error code 500 is returned instead of 503.
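For reference, a minimal HTTPRoute for a deployment endpoint might look like the following sketch. The names, namespace, path, and backend service here are illustrative assumptions, not values generated by MLOps:

```yaml
# Hypothetical sketch of a Gateway API HTTPRoute (not MLOps-generated).
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-deployment-route   # assumed name
  namespace: h2o-mlops        # assumed namespace
spec:
  parentRefs:
    - name: mlops-gateway     # assumed Gateway name
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /<deployment-id>
      backendRefs:
        - name: <deployment-service>
          port: 8080
```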
Other changes
- External model registry is removed as of version 0.67.0.