gcp.vertex-ai-batch-prediction-job
GCP Vertex AI Batch Prediction Job Resource
Vertex AI Batch Prediction Jobs are used to run batch inference workloads on machine learning models at scale.
- example:
List all Batch Prediction Jobs in specific locations:
policies:
  - name: vertexai-batch-jobs-inventory
    resource: gcp.vertex-ai-batch-prediction-job
    query:
      - location: us-central1
      - location: us-east1
- example:
Find batch prediction jobs still running more than 24 days after creation (value_type: age compares in days):
policies:
  - name: vertexai-batch-jobs-long-running
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      - type: value
        key: state
        value: JOB_STATE_RUNNING
      - type: value
        key: createTime
        value_type: age
        op: greater-than
        value: 24
- example:
Find failed batch prediction jobs:
policies:
  - name: vertexai-batch-jobs-failed
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      - type: value
        key: state
        value: JOB_STATE_FAILED
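The state and age filters shown above can be combined in other ways. The following is a minimal sketch, not a policy taken from the upstream examples: it reports jobs that finished successfully and were created more than 90 days ago. The policy name and the 90-day threshold are illustrative assumptions, and JOB_STATE_SUCCEEDED is taken from the Vertex AI JobState enum rather than from this page.

```yaml
policies:
  # Hypothetical policy name; 90 is an illustrative threshold in days.
  - name: vertexai-batch-jobs-old-succeeded
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      # Both filters must match (filters are ANDed by default).
      - type: value
        key: state
        value: JOB_STATE_SUCCEEDED   # assumed state name from the Vertex AI JobState enum
      - type: value
        key: createTime
        value_type: age              # age is compared in days
        op: greater-than
        value: 90
```

With no actions block, a policy like this only reports the matching jobs, which makes it a reasonable first step before attaching a delete action.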
Filters
metrics
Supports metrics filters on resources.
All resources that have Cloud Monitoring metrics are supported.
Docs on Cloud Monitoring metrics
Google Supported Metrics https://cloud.google.com/monitoring/api/metrics_gcp
Custom Metrics https://cloud.google.com/monitoring/api/v3/metric-model#intro-custom-metrics
- name: firewall-hit-count
  resource: gcp.firewall
  filters:
    - type: metrics
      name: firewallinsights.googleapis.com/subnet/firewall_hit_count
      aligner: ALIGN_COUNT
      days: 14
      value: 1
      op: greater-than
The period-start key controls how the metric window is aligned.
With the default, auto, the window is computed relative to the current time.
Setting it to start-of-day aligns the window to full UTC calendar days:
it begins at 00:00:00 UTC and ends at 00:00:00 UTC of the current day.
- name: instance-low-cpu-last-full-day
  resource: gcp.instance
  filters:
    - type: metrics
      name: compute.googleapis.com/instance/cpu/utilization
      aligner: ALIGN_MEAN
      days: 1
      value: 0.05
      op: less-than
      period-start: start-of-day
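To make the difference between the two period-start modes concrete, here is a sketch that places both side by side. The run time of 2024-05-10T15:30:00Z is an assumption chosen for illustration, and the windows in the comments follow from the description above rather than from observed output. The first policy name is hypothetical.

```yaml
policies:
  # Assumed run time for illustration: 2024-05-10T15:30:00Z, with days: 1.
  - name: instance-low-cpu-trailing-day      # hypothetical policy name
    resource: gcp.instance
    filters:
      - type: metrics
        name: compute.googleapis.com/instance/cpu/utilization
        aligner: ALIGN_MEAN
        days: 1
        value: 0.05
        op: less-than
        # Window relative to now:
        # 2024-05-09T15:30Z to 2024-05-10T15:30Z
        period-start: auto
  - name: instance-low-cpu-last-full-day
    resource: gcp.instance
    filters:
      - type: metrics
        name: compute.googleapis.com/instance/cpu/utilization
        aligner: ALIGN_MEAN
        days: 1
        value: 0.05
        op: less-than
        # Window aligned to full UTC days:
        # 2024-05-09T00:00Z to 2024-05-10T00:00Z
        period-start: start-of-day
```

The start-of-day form gives the same window no matter when during the day the policy runs, which may be preferable when results are compared across runs.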
properties:
  aligner:
    enum:
    - ALIGN_NONE
    - ALIGN_DELTA
    - ALIGN_RATE
    - ALIGN_INTERPOLATE
    - ALIGN_MIN
    - ALIGN_MAX
    - ALIGN_MEAN
    - ALIGN_COUNT
    - ALIGN_SUM
    - REDUCE_COUNT_FALSE
    - ALIGN_STDDEV
    - ALIGN_COUNT_TRUE
    - ALIGN_COUNT_FALSE
    - ALIGN_FRACTION_TRUE
    - ALIGN_PERCENTILE_99
    - ALIGN_PERCENTILE_95
    - ALIGN_PERCENTILE_50
    - ALIGN_PERCENTILE_05
    - ALIGN_PERCENT_CHANG
    type: string
  days:
    type: number
  filter:
    type: string
  group-by-fields:
    items:
      type: string
    type: array
  metric-key:
    type: string
  missing-value:
    type: number
  name:
    type: string
  op:
    enum:
    - eq
    - equal
    - ne
    - not-equal
    - gt
    - greater-than
    - ge
    - gte
    - le
    - lte
    - lt
    - less-than
    - glob
    - regex
    - regex-case
    - in
    - ni
    - not-in
    - contains
    - difference
    - intersect
    - mod
    type: string
  period-start:
    enum:
    - auto
    - start-of-day
    type: string
  reducer:
    enum:
    - REDUCE_NONE
    - REDUCE_MEAN
    - REDUCE_MIN
    - REDUCE_MAX
    - REDUCE_MEAN
    - REDUCE_SUM
    - REDUCE_STDDEV
    - REDUCE_COUNT
    - REDUCE_COUNT_TRUE
    - REDUCE_COUNT_FALSE
    - REDUCE_FRACTION_TRUE
    - REDUCE_PERCENTILE_99
    - REDUCE_PERCENTILE_95
    - REDUCE_PERCENTILE_50
    - REDUCE_PERCENTILE_05
    type: string
  type:
    enum:
    - metrics
  value:
    type: number
required:
- value
- name
- op
Permissions - monitoring.timeSeries.list
Actions
delete
Delete Vertex AI Batch Prediction Jobs
Deletes a Vertex AI Batch Prediction Job. Deletion is asynchronous: the API returns a long-running operation and the job is deleted in the background.
Warning: This permanently deletes the batch prediction job and its metadata. Job results in Cloud Storage are not affected.
- example:
Delete failed batch prediction jobs:
policies:
  - name: delete-failed-batch-jobs
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      - type: value
        key: state
        value: JOB_STATE_FAILED
    actions:
      - type: delete
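Because deletion is permanent, it can help to narrow the filter before attaching the action. The sketch below is one possible variation, not an upstream example: it deletes only failed or cancelled jobs created more than 30 days ago. The policy name and the 30-day retention window are illustrative assumptions, and JOB_STATE_CANCELLED is taken from the Vertex AI JobState enum.

```yaml
policies:
  # Hypothetical policy name; 30 is an illustrative retention period in days.
  - name: delete-stale-unsuccessful-batch-jobs
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      # Match either terminal, unsuccessful state.
      - type: value
        key: state
        op: in
        value:
          - JOB_STATE_FAILED
          - JOB_STATE_CANCELLED    # assumed state name from the Vertex AI JobState enum
      # Keep recent failures around for debugging.
      - type: value
        key: createTime
        value_type: age
        op: greater-than
        value: 30
    actions:
      - type: delete
```

Running such a policy with the Cloud Custodian dry-run option first shows which jobs would match without deleting anything.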
properties:
  type:
    enum:
    - delete
required:
- type
Permissions - aiplatform.batchPredictionJobs.delete
stop
Stop (Cancel) Vertex AI Batch Prediction Jobs
Cancels a running Vertex AI Batch Prediction Job. This is useful for cost control and incident response when jobs are running longer than expected or consuming unexpected resources.
Note: Only jobs in JOB_STATE_RUNNING or JOB_STATE_PENDING can be cancelled. Completed, failed, or already cancelled jobs cannot be cancelled.
- example:
Cancel batch prediction jobs still running more than 24 days after creation (value_type: age compares in days):
policies:
  - name: cancel-long-running-batch-jobs
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      - type: value
        key: state
        value: JOB_STATE_RUNNING
      - type: value
        key: createTime
        value_type: age
        op: greater-than
        value: 24
    actions:
      - type: stop
- example:
Cancel all running batch jobs (emergency cost control):
policies:
  - name: emergency-cancel-all-batch-jobs
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      - type: value
        key: state
        value: JOB_STATE_RUNNING
    actions:
      - type: stop
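Since only jobs in JOB_STATE_RUNNING or JOB_STATE_PENDING can be cancelled, a policy meant to halt all cancellable work may want to match both states. A minimal sketch using the in operator from the value filter follows; the policy name is hypothetical.

```yaml
policies:
  # Hypothetical policy name.
  - name: cancel-running-and-pending-batch-jobs
    resource: gcp.vertex-ai-batch-prediction-job
    filters:
      # Match every state in which a job can still be cancelled.
      - type: value
        key: state
        op: in
        value:
          - JOB_STATE_RUNNING
          - JOB_STATE_PENDING
    actions:
      - type: stop
```

Filtering on state up front keeps the action from being attempted on completed, failed, or already cancelled jobs.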
properties:
  type:
    enum:
    - stop
required:
- type
Permissions - aiplatform.batchPredictionJobs.cancel
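The permissions listed on this page could be bundled into a custom IAM role for the identity that runs Cloud Custodian. The following is a sketch of a role definition file in the format accepted by gcloud iam roles create --file. The title and description are hypothetical, and aiplatform.batchPredictionJobs.list is an assumption (it is not listed on this page, but listing jobs is presumably needed to enumerate the resource).

```yaml
# Hypothetical custom role definition; adjust the title and stage as needed.
title: Custodian Vertex AI Batch Prediction Jobs
description: Permissions used by the gcp.vertex-ai-batch-prediction-job policies
stage: GA
includedPermissions:
  # Assumption: needed to enumerate jobs; not listed on this page.
  - aiplatform.batchPredictionJobs.list
  # Listed on this page for the metrics filter.
  - monitoring.timeSeries.list
  # Listed on this page for the delete action.
  - aiplatform.batchPredictionJobs.delete
  # Listed on this page for the stop action.
  - aiplatform.batchPredictionJobs.cancel
```

Inventory-only deployments could leave out the delete and cancel permissions so the role cannot modify jobs.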