Monitoring setup
This guide shows you how to configure and manage monitoring for your deployments and batch scoring jobs using the H2O MLOps Python client.
Follow the steps below to define monitored columns, optionally set up Kafka integration, enable monitoring for either deployments or batch jobs, and view the collected monitoring data.
Step 1: Define input and output columns
To enable monitoring in H2O MLOps, you must specify the input and output columns to monitor. You can do this in one of the following ways:
Manual configuration
You can manually define the monitored columns using the MonitoringOptions class:
from h2o_mlops.options import (
BaselineData,
Column,
MissingValues,
MonitoringOptions,
NumericalAggregate,
)
from h2o_mlops.types import ColumnLogicalType
options = MonitoringOptions(
enabled=True,
input_columns=[
Column(
name="age",
logical_type=ColumnLogicalType.NUMERICAL,
),
],
output_columns=[
Column(
name="quantity",
logical_type=ColumnLogicalType.NUMERICAL,
is_model_output=True,
)
],
baseline_data=[
BaselineData(
column_name="AGE",
logical_type=ColumnLogicalType.NUMERICAL,
numerical_aggregate=NumericalAggregate(
bin_edges=[
float("-inf"),
22.0,
23.0,
25.0,
26.0,
28.0,
30.0,
31.0,
float("inf"),
],
bin_count=[0, 1, 3, 1, 2, 2, 3, 3],
mean_value=27.266666666666666,
standard_deviation=3.2396354880199243,
min_value=22.0,
max_value=31.0,
sum_value=409.0,
),
categorical_aggregate=None,
missing_values=MissingValues(row_count=0),
),
],
)
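Before passing MonitoringOptions to a deployment, it can help to sanity-check the baseline histogram by hand. The helper below is a hypothetical sketch (not part of the h2o_mlops API) that verifies the invariants the example above follows: n bin edges imply n - 1 bin counts, and the edges are sorted in ascending order.

```python
# Hypothetical sanity check (not part of the h2o_mlops API): a baseline
# histogram with n bin edges must carry exactly n - 1 bin counts, and the
# edges must be sorted; -inf/+inf outer edges ensure every value lands in a bin.
def check_baseline_histogram(bin_edges: list[float], bin_count: list[int]) -> None:
    if len(bin_count) != len(bin_edges) - 1:
        raise ValueError(
            f"expected {len(bin_edges) - 1} bin counts, got {len(bin_count)}"
        )
    if sorted(bin_edges) != bin_edges:
        raise ValueError("bin edges must be sorted in ascending order")


# The baseline from the example above: 9 edges -> 8 bins, so it passes.
check_baseline_histogram(
    [float("-inf"), 22.0, 23.0, 25.0, 26.0, 28.0, 30.0, 31.0, float("inf")],
    [0, 1, 3, 1, 2, 2, 3, 3],
)
```

A mismatch here would otherwise surface later as an invalid baseline on the server side, so it is cheaper to catch before deployment.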
Automatic configuration
Alternatively, you can configure monitoring automatically. This method calculates the baseline from a data source using PySpark.
from h2o_mlops.types import ColumnLogicalType
from h2o_mlops.utils.monitoring import (
Format,
get_spark_session,
prepare_monitoring_options_from_data_frame,
read_source,
)
session = get_spark_session()
baseline_data_frame = read_source(
spark=session,
source_data="file:///datasets/categorical_data.csv",
source_format=Format.CSV,
)
# You can override the logical type of a column, for example an ID column
logical_type_overrides = {
"id": ColumnLogicalType.ID,
}
# The experiment is optional; when provided, its schema is used to discover the proper logical types for monitoring
options = prepare_monitoring_options_from_data_frame(
data_frame=baseline_data_frame,
logical_type_overrides=logical_type_overrides,
experiment=experiment,
)
options.enabled = True
Step 2 (optional): Kafka integration for raw scoring logs
If Kafka is enabled in your environment, you can export raw scoring request and response data to it. You can use a global topic or specify a custom topic per deployment.
options.kafka_topic = "test"
Step 3: Edit baseline and columns before deployment or batch job creation
You can modify the automatically detected baseline and monitored columns before deployment or batch job creation if the detection was inaccurate.
To modify the logical type of an existing column:
options.input_columns[0].logical_type = ColumnLogicalType.CATEGORICAL
To replace an entire column definition:
options.input_columns[0] = Column(
name="width",
logical_type=ColumnLogicalType.NUMERICAL,
)
Step 4: Configure monitoring
Option A: Configure monitoring for deployment
You can deploy a model with monitoring enabled, or enable or disable monitoring after deployment.
Deploy with monitoring enabled
To deploy with monitoring enabled:
deployment = workspace.deployments.create(
name="demo-deployment",
composition_options=[comp_opts],
mode=DeploymentModeType.SINGLE_MODEL,
monitoring_options=options,
security_options=sec_opt,
)
Enable or disable monitoring after deployment
You can enable or disable monitoring after deployment as long as the monitored columns were provided at deployment time. If they weren't, you must first define them through monitoring_options.
To disable monitoring if it was already configured:
options = deployment.monitoring_options
options.enabled = False
deployment.update(monitoring_options=options)
To enable monitoring when it wasn’t configured at deployment time:
First, define the monitored columns using manual or automatic configuration.
For more information, see Step 1: Define input and output columns. Then:
options = deployment.monitoring_options
options.enabled = True
deployment.update(monitoring_options=options)
Option B: Configure monitoring for batch scoring jobs
To create a batch scoring job with monitoring enabled, use the monitoring_options parameter configured in Steps 1-3:
job = workspace.batch_scoring_jobs.create(
source=source,
sink=sink,
model=model,
scoring_runtime=scoring_runtime,
name="Test job",
monitoring_options=options, # Use the options configured in Steps 1-3
)
The monitoring_options parameter accepts the same MonitoringOptions object that you configured in Step 1: Define input and output columns, whether using manual or automatic configuration.
For detailed batch scoring configuration including source and sink setup, see Batch scoring.
Step 5: View monitoring data
After you configure and enable monitoring, you can view monitoring data through baseline aggregates and scoring aggregates.
View baseline aggregates
Baseline aggregates represent the reference distribution of your data. Use them to compare against scoring aggregates for drift detection.
Count baseline aggregates
Get the total number of baseline aggregates:
Input:
deployment.monitoring.baseline_aggregates.count()
Output:
2
List baseline aggregates
List all baseline aggregates:
Input:
aggregates = deployment.monitoring.baseline_aggregates.list()
aggregates
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
1 | quantity | NUMERICAL | True | b2c3d4e5-f6a7-8901-bcde-f12345678901
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 baseline aggregates.
- You can iterate over aggregates directly.
Filter baseline aggregates
Use the list() method with key-value arguments to filter the baseline aggregates.
Input:
deployment.monitoring.baseline_aggregates.list(column="age")
This returns a list of matching baseline aggregates as a table.
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
Retrieve a baseline aggregate
To retrieve a specific baseline aggregate, use indexing on the list:
Input:
aggregate = deployment.monitoring.baseline_aggregates.list()[0]
aggregate
Output:
<class 'h2o_mlops._monitoring.MLOpsBaselineAggregate(
uid='a1b2c3d4-e5f6-7890-abcd-ef1234567890',
column='age',
logical_type=<ColumnLogicalType.NUMERICAL>,
is_model_output=False,
)'>
You can also retrieve a specific baseline aggregate from the list returned by list() using filtering and indexing.
For example, aggregate = deployment.monitoring.baseline_aggregates.list(column="age")[0].
Baseline aggregate properties
A baseline aggregate has the following main properties:
- uid: The unique identifier of the baseline aggregate.
- column: The name of the monitored column (string).
- logical_type: The logical type of the monitored column (for example, ColumnLogicalType.NUMERICAL or ColumnLogicalType.CATEGORICAL). In the list() table output, this property appears under the type column header.
- is_model_output: Whether the column is a model output (True or False).
- numerical_aggregate: The numerical distribution data, including bin edges, bin counts, mean, standard deviation, min, max, and sum. Present when the column type is numerical.
- categorical_aggregate: The categorical distribution data. Present when the column type is categorical.
- text_aggregate: The text distribution data. Present when the column type is text.
- missing_values: Information about missing values, including the row count.
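When a column is numerical, numerical_aggregate carries the histogram data you can work with directly. As an illustrative sketch (the helper below is not part of the h2o_mlops API), you can normalize the raw bin counts into per-bin fractions so that baselines and scoring windows with different row counts become directly comparable:

```python
# Hypothetical helper (not part of the h2o_mlops API): turn a numerical
# aggregate's raw bin counts into per-bin fractions of the total row count.
def bin_fractions(bin_count: list[int]) -> list[float]:
    total = sum(bin_count)
    if total == 0:
        return [0.0] * len(bin_count)
    return [count / total for count in bin_count]


# Bin counts from the Step 1 baseline for the "age" column (15 rows total);
# fractions[1] is the share of rows in the (22.0, 23.0] bin.
fractions = bin_fractions([0, 1, 3, 1, 2, 2, 3, 3])
```

The same normalization applies to a scoring aggregate's bin counts, which is the usual first step before comparing the two distributions.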
View scoring aggregates
Scoring aggregates capture the distribution of data observed during model scoring. Compare them against baseline aggregates to detect data drift.
Count scoring aggregates
Get the total number of scoring aggregates:
Input:
deployment.monitoring.scoring_aggregates.count()
Output:
5
List scoring aggregates
List all scoring aggregates:
Input:
aggregates = deployment.monitoring.scoring_aggregates.list()
aggregates
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
1 | 2025-07-17 03:45:00 PM | quantity | NUMERICAL | True | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | d4e5f6a7-b8c9-0123-defa-234567890123
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 scoring aggregates.
- You can iterate over aggregates directly.
Filter scoring aggregates
Use the list() method with key-value arguments to filter the scoring aggregates.
Input:
deployment.monitoring.scoring_aggregates.list(column="age")
This returns a list of matching scoring aggregates as a table.
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
Retrieve a scoring aggregate
To retrieve a specific scoring aggregate, use indexing on the list:
Input:
aggregate = deployment.monitoring.scoring_aggregates.list()[0]
aggregate
Output:
<class 'h2o_mlops._monitoring.MLOpsScoringAggregate(
uid='c3d4e5f6-a7b8-9012-cdef-123456789012',
column='age',
logical_type=<ColumnLogicalType.NUMERICAL>,
is_model_output=False,
timestamp=datetime.datetime(2025, 7, 17, 15, 45, tzinfo=tzutc()),
aggregation_window='PT5M',
)'>
You can also retrieve a specific scoring aggregate from the list returned by list() using filtering and indexing.
For example, aggregate = deployment.monitoring.scoring_aggregates.list(column="age")[0].
Scoring aggregate properties
A scoring aggregate has the following main properties:
- uid: The unique identifier of the scoring aggregate.
- column: The name of the monitored column (string).
- logical_type: The logical type of the monitored column (for example, ColumnLogicalType.NUMERICAL or ColumnLogicalType.CATEGORICAL). In the list() table output, this property appears under the type column header.
- is_model_output: Whether the column is a model output (True or False).
- experiment_uid: The unique identifier of the experiment associated with this scoring aggregate.
- numerical_aggregate: The numerical distribution data, including bin edges, bin counts, mean, standard deviation, min, max, and sum. Present when the column type is numerical.
- categorical_aggregate: The categorical distribution data. Present when the column type is categorical.
- text_aggregate: The text distribution data. Present when the column type is text.
- missing_values: Information about missing values, including the row count.
- timestamp: The start time of the aggregation window.
- aggregation_window: The duration of the aggregation window, expressed in ISO 8601 duration format (for example, PT5M for 5 minutes).
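One common way to turn a baseline/scoring pair into a drift score is the population stability index (PSI). The helper below is a hypothetical sketch, not an h2o_mlops API; it assumes the two histograms share the same bin edges, which holds when the scoring aggregate was binned against the baseline for the same column.

```python
import math


# Hypothetical drift check (not part of the h2o_mlops API): population
# stability index (PSI) over two histograms that share the same bin edges,
# e.g. the bin counts from a baseline aggregate and a scoring aggregate.
def population_stability_index(
    baseline_counts: list[int],
    scoring_counts: list[int],
    epsilon: float = 1e-6,
) -> float:
    baseline_total = sum(baseline_counts)
    scoring_total = sum(scoring_counts)
    psi = 0.0
    for b, s in zip(baseline_counts, scoring_counts):
        # epsilon floors empty bins to avoid log(0)
        p = max(b / baseline_total, epsilon)
        q = max(s / scoring_total, epsilon)
        psi += (q - p) * math.log(q / p)
    return psi


# Identical distributions give a PSI of 0; a common rule of thumb treats
# PSI above roughly 0.2 as significant drift.
baseline = [0, 1, 3, 1, 2, 2, 3, 3]
assert population_stability_index(baseline, baseline) < 1e-12
```

In practice you would feed it the bin counts read from the numerical_aggregate of a baseline aggregate and of each scoring aggregate for the same column, one aggregation window at a time.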
View monitoring data for batch scoring jobs
Batch scoring jobs expose the same monitoring property as deployments. After you create a batch scoring job with monitoring enabled, you can access baseline and scoring aggregates through job.monitoring.
Count baseline aggregates for batch scoring jobs
Get the total number of baseline aggregates for a batch scoring job:
Input:
job.monitoring.baseline_aggregates.count()
Output:
2
List baseline aggregates for batch scoring jobs
List all baseline aggregates for a batch scoring job:
Input:
aggregates = job.monitoring.baseline_aggregates.list()
aggregates
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
1 | quantity | NUMERICAL | True | b2c3d4e5-f6a7-8901-bcde-f12345678901
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 baseline aggregates.
- You can iterate over aggregates directly.
Count scoring aggregates for batch scoring jobs
Get the total number of scoring aggregates for a batch scoring job:
Input:
job.monitoring.scoring_aggregates.count()
Output:
5
List scoring aggregates for batch scoring jobs
List all scoring aggregates for a batch scoring job:
Input:
aggregates = job.monitoring.scoring_aggregates.list()
aggregates
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
1 | 2025-07-17 03:45:00 PM | quantity | NUMERICAL | True | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | d4e5f6a7-b8c9-0123-defa-234567890123
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 scoring aggregates.
- You can iterate over aggregates directly.
All filtering, retrieval, and property access methods described in the View baseline aggregates and View scoring aggregates sections work identically for batch scoring jobs. Replace deployment.monitoring with job.monitoring in the code examples.
For example, to filter baseline aggregates by column:
job.monitoring.baseline_aggregates.list(column="age")
Step 6: Delete monitoring data
Deleting monitoring data permanently removes all collected data, including baselines, for the deployment. You cannot undo this action.
To delete all monitoring data for a deployment:
deployment.monitoring.delete()
Deleting monitoring data is available for deployments only. Batch scoring jobs do not support this operation.