Evaluation Test Cases
List Evaluation Test Cases
GET /v2/gen-ai/evaluation_test_cases
Create Evaluation Test Case
POST /v2/gen-ai/evaluation_test_cases
List Evaluation Runs by Test Case
GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs
Retrieve Information About an Existing Evaluation Test Case
GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
Update an Evaluation Test Case
PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
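The five routes listed above can be sketched as request builders. This is a minimal sketch rather than an official client: only the HTTP methods and path templates come from this reference. The host `https://api.example.com` is a placeholder, the bearer-token header is an assumed authentication scheme, and `ROUTES` and `build_request` are hypothetical names introduced for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"

# Method and path template for each operation, copied from the list above.
ROUTES = {
    "list_test_cases": ("GET", "/v2/gen-ai/evaluation_test_cases"),
    "create_test_case": ("POST", "/v2/gen-ai/evaluation_test_cases"),
    "list_runs_by_test_case": (
        "GET",
        "/v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs",
    ),
    "retrieve_test_case": ("GET", "/v2/gen-ai/evaluation_test_cases/{test_case_uuid}"),
    "update_test_case": ("PUT", "/v2/gen-ai/evaluation_test_cases/{test_case_uuid}"),
}


def build_request(operation, token, body=None, **path_params):
    """Build (but do not send) a urllib Request for one of the routes."""
    method, template = ROUTES[operation]
    # Percent-encode path parameters so they cannot alter the path.
    encoded = {key: quote(str(value), safe="") for key, value in path_params.items()}
    path = template.format(**encoded)
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)
```

Calling `build_request("retrieve_test_case", token, test_case_uuid=some_uuid)` returns a request object that can be passed to `urllib.request.urlopen`. Note that the run-listing route names its path parameter `evaluation_test_case_uuid`, while the retrieve and update routes use `test_case_uuid`.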
Models
APIEvaluationTestCase = object { archived_at, created_at, created_by_user_email, 15 more }
archived_at: optional string
format: date-time
created_at: optional string
format: date-time
created_by_user_email: optional string
created_by_user_id: optional string
format: uint64
dataset: optional object { created_at, dataset_name, dataset_uuid, 3 more }
created_at: optional string
Time the dataset was created.
format: date-time
dataset_name: optional string
Name of the dataset.
dataset_uuid: optional string
UUID of the dataset.
file_size: optional string
The size of the uploaded dataset file in bytes.
format: uint64
has_ground_truth: optional boolean
Whether the dataset has a ground truth column.
row_count: optional number
Number of rows in the dataset.
format: int64
dataset_name: optional string
dataset_uuid: optional string
description: optional string
latest_version_number_of_runs: optional number
format: int32
description: optional string
inverted: optional boolean
If true, the metric is inverted, meaning that a lower value is better.
metric_name: optional string
metric_type: optional "METRIC_TYPE_UNSPECIFIED" or "METRIC_TYPE_GENERAL_QUALITY" or "METRIC_TYPE_RAG_AND_TOOL"
Accepts one of the following:
"METRIC_TYPE_UNSPECIFIED"
"METRIC_TYPE_GENERAL_QUALITY"
"METRIC_TYPE_RAG_AND_TOOL"
metric_uuid: optional string
metric_value_type: optional "METRIC_VALUE_TYPE_UNSPECIFIED" or "METRIC_VALUE_TYPE_NUMBER" or "METRIC_VALUE_TYPE_STRING" or "METRIC_VALUE_TYPE_PERCENTAGE"
Accepts one of the following:
"METRIC_VALUE_TYPE_UNSPECIFIED"
"METRIC_VALUE_TYPE_NUMBER"
"METRIC_VALUE_TYPE_STRING"
"METRIC_VALUE_TYPE_PERCENTAGE"
range_max: optional number
The maximum value for the metric.
format: float
range_min: optional number
The minimum value for the metric.
format: float
name: optional string
metric_uuid: optional string
name: optional string
success_threshold: optional number
The success threshold for the star metric. This is a value that the metric must reach to be considered successful.
format: float
success_threshold_pct: optional number
The success threshold for the star metric. This is a percentage value between 0 and 100.
format: int32
test_case_uuid: optional string
total_runs: optional number
format: int32
updated_at: optional string
format: date-time
updated_by_user_email: optional string
updated_by_user_id: optional string
format: uint64
version: optional number
format: int64
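Several fields in this model are strings that carry typed values: the `date-time` fields, and `uint64` values such as `dataset.file_size`. The sketch below shows one way a consumer might convert them. It assumes the timestamps are ISO 8601 / RFC 3339 strings and that a non-empty `archived_at` means the test case is archived. Neither assumption is confirmed by this reference, and the helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse a date-time string, or return None when the field is absent."""
    if not value:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_uint64(value):
    """Convert a uint64 carried as a string into an int, or None if absent."""
    return int(value) if value not in (None, "") else None


def summarize_test_case(payload):
    """Extract a few typed fields from an APIEvaluationTestCase-shaped dict."""
    # Every field is optional, so read each one defensively.
    dataset = payload.get("dataset") or {}
    return {
        "name": payload.get("name"),
        "created_at": parse_timestamp(payload.get("created_at")),
        # Assumption: a populated archived_at marks the test case as archived.
        "is_archived": parse_timestamp(payload.get("archived_at")) is not None,
        "dataset_bytes": parse_uint64(dataset.get("file_size")),
        "dataset_rows": dataset.get("row_count"),
        "total_runs": payload.get("total_runs", 0),
    }
```

Because every field is marked optional, the helpers return `None` (or `0` for `total_runs`) instead of raising when a key is missing.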
APIStarMetric = object { metric_uuid, name, success_threshold, success_threshold_pct }
metric_uuid: optional string
name: optional string
success_threshold: optional number
The success threshold for the star metric. This is a value that the metric must reach to be considered successful.
format: float
success_threshold_pct: optional number
The success threshold for the star metric. This is a percentage value between 0 and 100.
format: int32
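The two threshold fields can be illustrated with a small check. This is a sketch under stated assumptions, not the service's own scoring logic: it reads "must reach" as greater than or equal to `success_threshold`, and it does not account for inverted metrics, where a lower value is better. Both function names are hypothetical.

```python
def meets_star_threshold(star_metric, observed_value):
    """Return True or False, or None when no success_threshold is set."""
    threshold = star_metric.get("success_threshold")
    if threshold is None:
        return None
    # Assumption: "reach" means the observed value is >= the threshold.
    return observed_value >= threshold


def is_valid_threshold_pct(star_metric):
    """Check that success_threshold_pct, if present, lies between 0 and 100."""
    pct = star_metric.get("success_threshold_pct")
    return pct is None or 0 <= pct <= 100
```

Returning `None` for a missing threshold keeps "no threshold configured" distinct from "threshold not met".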