Evaluation Test Cases

Create Evaluation Test Case
POST /v2/gen-ai/evaluation_test_cases
List Evaluation Test Cases
GET /v2/gen-ai/evaluation_test_cases
List Evaluation Runs by Test Case
GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs
Retrieve Information About an Existing Evaluation Test Case
GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
Update an Evaluation Test Case
PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
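
As a rough illustration of how the five operations above map to HTTP requests, the sketch below builds the method and URL for each one without sending anything. The base URL, the `endpoint` helper, and the operation keys are assumptions made for this example; none of them appear in this section, and authentication is omitted entirely.

```python
from typing import Optional, Tuple
from urllib.parse import quote

# Assumption: this section does not state the API host; adjust as needed.
BASE_URL = "https://api.digitalocean.com"
PREFIX = "/v2/gen-ai/evaluation_test_cases"

# Hypothetical operation keys mapped to the method/path pairs listed above.
ROUTES = {
    "create": ("POST", PREFIX),
    "list": ("GET", PREFIX),
    "list_runs": ("GET", PREFIX + "/{uuid}/evaluation_runs"),
    "retrieve": ("GET", PREFIX + "/{uuid}"),
    "update": ("PUT", PREFIX + "/{uuid}"),
}


def endpoint(operation: str, test_case_uuid: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP method, full URL) for one of the listed operations."""
    method, path = ROUTES[operation]
    if "{uuid}" in path:
        if not test_case_uuid:
            raise ValueError(f"{operation!r} requires a test case UUID")
        # Percent-encode the UUID so it is safe as a single path segment.
        path = path.replace("{uuid}", quote(test_case_uuid, safe=""))
    return method, BASE_URL + path
```

The returned pair can be handed to any HTTP client; the request and response bodies for each operation are not described in this section, so they are left out here.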
Models
APIEvaluationTestCase = object { archived_at, created_at, created_by_user_email, 15 more }
archived_at: optional string
format: date-time
created_at: optional string
format: date-time
created_by_user_email: optional string
created_by_user_id: optional string
format: uint64
dataset: optional object { created_at, dataset_name, dataset_uuid, 3 more }
created_at: optional string

Time the dataset was created.

format: date-time
dataset_name: optional string

Name of the dataset.

dataset_uuid: optional string

UUID of the dataset.

file_size: optional string

The size of the uploaded dataset file in bytes.

format: uint64
has_ground_truth: optional boolean

Whether the dataset has a ground truth column.

row_count: optional number

Number of rows in the dataset.

format: int64
dataset_name: optional string
dataset_uuid: optional string
description: optional string
latest_version_number_of_runs: optional number
format: int32
metrics: optional array of APIEvaluationMetric { description, inverted, metric_name, 5 more }
description: optional string
inverted: optional boolean

If true, the metric is inverted, meaning that a lower value is better.

metric_name: optional string
metric_type: optional "METRIC_TYPE_UNSPECIFIED" or "METRIC_TYPE_GENERAL_QUALITY" or "METRIC_TYPE_RAG_AND_TOOL"
Accepts one of the following:
"METRIC_TYPE_UNSPECIFIED"
"METRIC_TYPE_GENERAL_QUALITY"
"METRIC_TYPE_RAG_AND_TOOL"
metric_uuid: optional string
metric_value_type: optional "METRIC_VALUE_TYPE_UNSPECIFIED" or "METRIC_VALUE_TYPE_NUMBER" or "METRIC_VALUE_TYPE_STRING" or "METRIC_VALUE_TYPE_PERCENTAGE"
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
range_max: optional number

The maximum value for the metric.

format: float
range_min: optional number

The minimum value for the metric.

format: float
name: optional string
star_metric: optional APIStarMetric { metric_uuid, name, success_threshold, success_threshold_pct }
metric_uuid: optional string
name: optional string
success_threshold: optional number

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float
success_threshold_pct: optional number

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
test_case_uuid: optional string
total_runs: optional number
format: int32
updated_at: optional string
format: date-time
updated_by_user_email: optional string
updated_by_user_id: optional string
format: uint64
version: optional number
format: int64
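
The schema above types several 64-bit values (for example `file_size` and `created_by_user_id`) as strings, and timestamps as `date-time` strings. The sketch below shows one way a client might convert a few of those fields into native Python values. It assumes the strings hold decimal integers and ISO 8601 timestamps; the `EvaluationTestCaseSummary` class, the helper functions, and the sample payload are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


def parse_time(value: Optional[str]) -> Optional[datetime]:
    """Parse a date-time string, treating a trailing 'Z' as UTC."""
    if not value:
        return None
    # Older Python versions reject 'Z' in fromisoformat, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_uint64(value: Optional[str]) -> Optional[int]:
    """Convert a uint64 carried as a string (assumed decimal) to int."""
    return int(value) if value else None


@dataclass
class EvaluationTestCaseSummary:
    # Hypothetical subset of APIEvaluationTestCase; every field is optional.
    test_case_uuid: Optional[str]
    name: Optional[str]
    created_at: Optional[datetime]
    dataset_file_size: Optional[int]
    total_runs: Optional[int]


def summarize(payload: Dict[str, Any]) -> EvaluationTestCaseSummary:
    """Pick a few fields out of a test case object, tolerating missing keys."""
    dataset = payload.get("dataset") or {}
    return EvaluationTestCaseSummary(
        test_case_uuid=payload.get("test_case_uuid"),
        name=payload.get("name"),
        created_at=parse_time(payload.get("created_at")),
        dataset_file_size=parse_uint64(dataset.get("file_size")),
        total_runs=payload.get("total_runs"),
    )


# Illustrative payload only; these values are invented for the example.
sample = {
    "test_case_uuid": "00000000-0000-0000-0000-000000000000",
    "name": "example-test-case",
    "created_at": "2024-01-02T03:04:05Z",
    "dataset": {"file_size": "2048", "row_count": 10},
    "total_runs": 3,
}
summary = summarize(sample)
```

Because every field in the schema is marked optional, the sketch reads each key with `.get()` and returns `None` rather than raising when a field is absent.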
APIStarMetric = object { metric_uuid, name, success_threshold, success_threshold_pct }
metric_uuid: optional string
name: optional string
success_threshold: optional number

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float
success_threshold_pct: optional number

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
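
The descriptions above say that `success_threshold` is a value the metric "must reach", and that a metric with `inverted` set treats lower values as better. The sketch below is one possible reading of how a client could combine the two. It assumes "reach" means greater than or equal for ordinary metrics and less than or equal for inverted ones. This section does not confirm that interpretation, nor does it say how `success_threshold` and `success_threshold_pct` relate, so `meets_threshold` is a hypothetical helper rather than the service's own logic.

```python
from typing import Optional


def meets_threshold(
    value: float,
    success_threshold: Optional[float],
    inverted: bool = False,
) -> Optional[bool]:
    """Return True/False if a threshold is set, or None if it is absent."""
    if success_threshold is None:
        # The field is optional, so there may be nothing to compare against.
        return None
    if inverted:
        # Assumption: for inverted metrics, lower-or-equal counts as success.
        return value <= success_threshold
    # Assumption: otherwise, higher-or-equal counts as success.
    return value >= success_threshold
```

For example, under these assumptions `meets_threshold(0.8, 0.75)` is true, while `meets_threshold(0.2, 0.1, inverted=True)` is false because the inverted metric's value sits above its threshold.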