Monitoring setup
This guide shows you how to configure and manage monitoring for your deployments and batch scoring jobs using the H2O MLOps Python client.
Follow the steps below to define monitored columns, optionally set up Kafka integration, enable monitoring for either deployments or batch jobs, and view the collected monitoring data.
Step 1: Define input and output columns
To enable monitoring in H2O MLOps, you must specify the input and output columns to monitor. You can do this in one of the following ways:
Manual configuration
You can manually define the monitored columns using the MonitoringOptions class:
from h2o_mlops.options import (
BaselineData,
Column,
MissingValues,
MonitoringOptions,
NumericalAggregate,
)
from h2o_mlops.types import ColumnLogicalType
options = MonitoringOptions(
enabled=True,
input_columns=[
Column(
name="age",
logical_type=ColumnLogicalType.NUMERICAL,
),
],
output_columns=[
Column(
name="quantity",
logical_type=ColumnLogicalType.NUMERICAL,
is_model_output=True,
)
],
baseline_data=[
BaselineData(
column_name="AGE",
logical_type=ColumnLogicalType.NUMERICAL,
numerical_aggregate=NumericalAggregate(
bin_edges=[
float("-inf"),
22.0,
23.0,
25.0,
26.0,
28.0,
30.0,
31.0,
float("inf"),
],
bin_count=[0, 1, 3, 1, 2, 2, 3, 3],
mean_value=27.266666666666666,
standard_deviation=3.2396354880199243,
min_value=22.0,
max_value=31.0,
sum_value=409.0,
),
categorical_aggregate=None,
missing_values=MissingValues(row_count=0),
),
],
)
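Before passing MonitoringOptions to a deployment, it can help to sanity-check the baseline histogram by hand. The helper below is a hypothetical sketch (not part of the h2o_mlops API) that verifies the invariants the example above follows: n bin edges imply n - 1 bin counts, and the edges are sorted in ascending order.

```python
# Hypothetical sanity check (not part of the h2o_mlops API): a baseline
# histogram with n bin edges must carry exactly n - 1 bin counts, and the
# edges must be sorted; -inf/+inf outer edges ensure every value lands in a bin.
def check_baseline_histogram(bin_edges: list[float], bin_count: list[int]) -> None:
    if len(bin_count) != len(bin_edges) - 1:
        raise ValueError(
            f"expected {len(bin_edges) - 1} bin counts, got {len(bin_count)}"
        )
    if sorted(bin_edges) != bin_edges:
        raise ValueError("bin edges must be sorted in ascending order")


# The baseline from the example above: 9 edges -> 8 bins, so it passes.
check_baseline_histogram(
    [float("-inf"), 22.0, 23.0, 25.0, 26.0, 28.0, 30.0, 31.0, float("inf")],
    [0, 1, 3, 1, 2, 2, 3, 3],
)
```

A mismatch here would otherwise surface later as an invalid baseline on the server side, so it is cheaper to catch before deployment.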
Automatic configuration
Alternatively, you can configure monitoring automatically. This method calculates the baseline from a data source using PySpark.
from h2o_mlops.types import ColumnLogicalType
from h2o_mlops.utils.monitoring import (
Format,
get_spark_session,
prepare_monitoring_options_from_data_frame,
read_source,
)
session = get_spark_session()
baseline_data_frame = read_source(
spark=session,
source_data="file:///datasets/categorical_data.csv",
source_format=Format.CSV,
)
# You can override the logical type of a column, for example an ID column
logical_type_overrides = {
"id": ColumnLogicalType.ID,
}
# The experiment is optional; when provided, its schema is used to discover the proper logical types for monitoring
options = prepare_monitoring_options_from_data_frame(
data_frame=baseline_data_frame,
logical_type_overrides=logical_type_overrides,
experiment=experiment,
)
options.enabled = True
Step 2 (optional): Kafka integration for raw scoring logs
If Kafka is enabled in your environment, you can export raw scoring request and response data to it. You can use a global topic or specify a custom topic per deployment.
options.kafka_topic = "test"
Step 3: Edit baseline and columns before deployment or batch job creation
You can modify the automatically detected baseline and monitored columns before deployment or batch job creation if the detection was inaccurate.
To modify the logical type of an existing column:
options.input_columns[0].logical_type = ColumnLogicalType.CATEGORICAL
To replace an entire column definition:
options.input_columns[0] = Column(
name="width",
logical_type=ColumnLogicalType.NUMERICAL,
)
Step 4: Configure monitoring
Option A: Configure monitoring for deployment
You can deploy a model with monitoring enabled, or enable or disable monitoring after deployment.
Deploy with monitoring enabled
To deploy with monitoring enabled:
deployment = workspace.deployments.create(
name="demo-deployment",
composition_options=[comp_opts],
mode=DeploymentModeType.SINGLE_MODEL,
monitoring_options=options,
security_options=sec_opt,
)
Enable or disable monitoring after deployment
You can enable or disable monitoring after deployment as long as the monitored columns were provided at deployment time. If they weren't, you must first define them through monitoring_options.
To disable monitoring if it was already configured:
options = deployment.monitoring_options
options.enabled = False
deployment.update(monitoring_options=options)
To enable monitoring when it wasn’t configured at deployment time:
First, define the monitored columns using manual or automatic configuration.
For more information, see Step 1: Define input and output columns. Then:
options = deployment.monitoring_options
options.enabled = True
deployment.update(monitoring_options=options)
Option B: Configure monitoring for batch scoring jobs
To create a batch scoring job with monitoring enabled, use the monitoring_options parameter configured in Steps 1-3:
job = workspace.batch_scoring_jobs.create(
source=source,
sink=sink,
model=model,
scoring_runtime=scoring_runtime,
name="Test job",
monitoring_options=options, # Use the options configured in Steps 1-3
)
The monitoring_options parameter accepts the same MonitoringOptions object that you configured in Step 1: Define input and output columns, whether using manual or automatic configuration.
For detailed batch scoring configuration including source and sink setup, see Batch scoring.
Step 5: View monitoring data
After you configure and enable monitoring, you can view monitoring data through baseline aggregates and scoring aggregates.
View baseline aggregates
Baseline aggregates represent the reference distribution of your data. Use them to compare against scoring aggregates for drift detection.
Count baseline aggregates
Get the total number of baseline aggregates:
Input:
deployment.monitoring.baseline_aggregates.count()
Output:
2
List baseline aggregates
List all baseline aggregates:
Input:
aggregates = deployment.monitoring.baseline_aggregates.list()
aggregates
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
1 | quantity | NUMERICAL | True | b2c3d4e5-f6a7-8901-bcde-f12345678901
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 baseline aggregates.
- You can iterate over aggregates directly.
Filter baseline aggregates
Use the list() method with key-value arguments to filter the baseline aggregates.
Input:
deployment.monitoring.baseline_aggregates.list(column="age")
This returns a list of matching baseline aggregates as a table.
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
Retrieve a baseline aggregate
To retrieve a specific baseline aggregate, use indexing on the list:
Input:
aggregate = deployment.monitoring.baseline_aggregates.list()[0]
aggregate
Output:
<class 'h2o_mlops._monitoring.MLOpsBaselineAggregate(
uid='a1b2c3d4-e5f6-7890-abcd-ef1234567890',
column='age',
logical_type=<ColumnLogicalType.NUMERICAL>,
is_model_output=False,
)'>
You can also retrieve a specific baseline aggregate from the list returned by list() using filtering and indexing.
For example, aggregate = deployment.monitoring.baseline_aggregates.list(column="age")[0].
Baseline aggregate properties
A baseline aggregate has the following main properties:
- uid: The unique identifier of the baseline aggregate.
- column: The name of the monitored column (string).
- logical_type: The logical type of the monitored column (for example, ColumnLogicalType.NUMERICAL or ColumnLogicalType.CATEGORICAL). In the list() table output, this property appears under the type column header.
- is_model_output: Whether the column is a model output (True or False).
- numerical_aggregate: The numerical distribution data, including bin edges, bin counts, mean, standard deviation, min, max, and sum. Present when the column type is numerical.
- categorical_aggregate: The categorical distribution data. Present when the column type is categorical.
- text_aggregate: The text distribution data. Present when the column type is text.
- missing_values: Information about missing values, including the row count.
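When a column is numerical, numerical_aggregate carries the histogram data you can work with directly. As an illustrative sketch (the helper below is not part of the h2o_mlops API), you can normalize the raw bin counts into per-bin fractions so that baselines and scoring windows with different row counts become directly comparable:

```python
# Hypothetical helper (not part of the h2o_mlops API): turn a numerical
# aggregate's raw bin counts into per-bin fractions of the total row count.
def bin_fractions(bin_count: list[int]) -> list[float]:
    total = sum(bin_count)
    if total == 0:
        return [0.0] * len(bin_count)
    return [count / total for count in bin_count]


# Bin counts from the Step 1 baseline for the "age" column (15 rows total);
# fractions[1] is the share of rows in the (22.0, 23.0] bin.
fractions = bin_fractions([0, 1, 3, 1, 2, 2, 3, 3])
```

The same normalization applies to a scoring aggregate's bin counts, which is the usual first step before comparing the two distributions.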
View scoring aggregates
Scoring aggregates capture the distribution of data observed during model scoring. Compare them against baseline aggregates to detect data drift.
Count scoring aggregates
Get the total number of scoring aggregates:
Input:
deployment.monitoring.scoring_aggregates.count()
Output:
5
List scoring aggregates
List all scoring aggregates:
Input:
aggregates = deployment.monitoring.scoring_aggregates.list()
aggregates
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
1 | 2025-07-17 03:45:00 PM | quantity | NUMERICAL | True | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | d4e5f6a7-b8c9-0123-defa-234567890123
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 scoring aggregates.
- You can iterate over aggregates directly.
Filter scoring aggregates
Use the list() method with key-value arguments to filter the scoring aggregates.
Input:
deployment.monitoring.scoring_aggregates.list(column="age")
This returns a list of matching scoring aggregates as a table.
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
Retrieve a scoring aggregate
To retrieve a specific scoring aggregate, use indexing on the list:
Input:
aggregate = deployment.monitoring.scoring_aggregates.list()[0]
aggregate
Output:
<class 'h2o_mlops._monitoring.MLOpsScoringAggregate(
uid='c3d4e5f6-a7b8-9012-cdef-123456789012',
column='age',
logical_type=<ColumnLogicalType.NUMERICAL>,
is_model_output=False,
timestamp=datetime.datetime(2025, 7, 17, 15, 45, tzinfo=tzutc()),
aggregation_window='PT5M',
)'>
You can also retrieve a specific scoring aggregate from the list returned by list() using filtering and indexing.
For example, aggregate = deployment.monitoring.scoring_aggregates.list(column="age")[0].
Scoring aggregate properties
A scoring aggregate has the following main properties:
- uid: The unique identifier of the scoring aggregate.
- column: The name of the monitored column (string).
- logical_type: The logical type of the monitored column (for example, ColumnLogicalType.NUMERICAL or ColumnLogicalType.CATEGORICAL). In the list() table output, this property appears under the type column header.
- is_model_output: Whether the column is a model output (True or False).
- experiment_uid: The unique identifier of the experiment associated with this scoring aggregate.
- numerical_aggregate: The numerical distribution data, including bin edges, bin counts, mean, standard deviation, min, max, and sum. Present when the column type is numerical.
- categorical_aggregate: The categorical distribution data. Present when the column type is categorical.
- text_aggregate: The text distribution data. Present when the column type is text.
- missing_values: Information about missing values, including the row count.
- timestamp: The start time of the aggregation window.
- aggregation_window: The duration of the aggregation window, expressed in ISO 8601 duration format (for example, PT5M for 5 minutes).
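One common way to turn a baseline/scoring pair into a drift score is the population stability index (PSI). The helper below is a hypothetical sketch, not an h2o_mlops API; it assumes the two histograms share the same bin edges, which holds when the scoring aggregate was binned against the baseline for the same column.

```python
import math


# Hypothetical drift check (not part of the h2o_mlops API): population
# stability index (PSI) over two histograms that share the same bin edges,
# e.g. the bin counts from a baseline aggregate and a scoring aggregate.
def population_stability_index(
    baseline_counts: list[int],
    scoring_counts: list[int],
    epsilon: float = 1e-6,
) -> float:
    baseline_total = sum(baseline_counts)
    scoring_total = sum(scoring_counts)
    psi = 0.0
    for b, s in zip(baseline_counts, scoring_counts):
        # epsilon floors empty bins to avoid log(0)
        p = max(b / baseline_total, epsilon)
        q = max(s / scoring_total, epsilon)
        psi += (q - p) * math.log(q / p)
    return psi


# Identical distributions give a PSI of 0; a common rule of thumb treats
# PSI above roughly 0.2 as significant drift.
baseline = [0, 1, 3, 1, 2, 2, 3, 3]
assert population_stability_index(baseline, baseline) < 1e-12
```

In practice you would feed it the bin counts read from the numerical_aggregate of a baseline aggregate and of each scoring aggregate for the same column, one aggregation window at a time.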
View monitoring data for batch scoring jobs
Batch scoring jobs expose the same monitoring property as deployments. After you create a batch scoring job with monitoring enabled, you can access baseline and scoring aggregates through job.monitoring.
Count baseline aggregates for batch scoring jobs
Get the total number of baseline aggregates for a batch scoring job:
Input:
job.monitoring.baseline_aggregates.count()
Output:
2
List baseline aggregates for batch scoring jobs
List all baseline aggregates for a batch scoring job:
Input:
aggregates = job.monitoring.baseline_aggregates.list()
aggregates
Output:
| column | type | is_model_output | uid
---+----------+-------------+-------------------+--------------------------------------
0 | age | NUMERICAL | False | a1b2c3d4-e5f6-7890-abcd-ef1234567890
1 | quantity | NUMERICAL | True | b2c3d4e5-f6a7-8901-bcde-f12345678901
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 baseline aggregates.
- You can iterate over aggregates directly.
Count scoring aggregates for batch scoring jobs
Get the total number of scoring aggregates for a batch scoring job:
Input:
job.monitoring.scoring_aggregates.count()
Output:
5
List scoring aggregates for batch scoring jobs
List all scoring aggregates for a batch scoring job:
Input:
aggregates = job.monitoring.scoring_aggregates.list()
aggregates
Output:
| timestamp | column | type | is_model_output | experiment_uid | uid
---+-------------------------+----------+-------------+-------------------+--------------------------------------+--------------------------------------
0 | 2025-07-17 03:45:00 PM | age | NUMERICAL | False | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | c3d4e5f6-a7b8-9012-cdef-123456789012
1 | 2025-07-17 03:45:00 PM | quantity | NUMERICAL | True | d9a47c99-c66c-4ff9-b2b6-30faf5f413ef | d4e5f6a7-b8c9-0123-defa-234567890123
- The output of the list() method is displayed in a formatted table view. By default, only the first 50 rows are displayed.
- Calling len(aggregates) returns the total number of rows the list contains, not just the number currently displayed.
- To customize the number of rows displayed, call the show() method with the n argument. For example, aggregates.show(n=100) displays up to 100 scoring aggregates.
- You can iterate over aggregates directly.
All filtering, retrieval, and property access methods described in the View baseline aggregates and View scoring aggregates sections work identically for batch scoring jobs. Replace deployment.monitoring with job.monitoring in the code examples.
For example, to filter baseline aggregates by column:
job.monitoring.baseline_aggregates.list(column="age")
Step 6: Delete monitoring data
Deleting monitoring data permanently removes all collected data, including baselines, for the deployment. You cannot undo this action.
To delete all monitoring data for a deployment:
deployment.monitoring.delete()
Deleting monitoring data is available for deployments only. Batch scoring jobs do not support this operation.