Evaluation Test Cases

List Evaluation Test Cases

GET /v2/gen-ai/evaluation_test_cases

Create Evaluation Test Case

POST /v2/gen-ai/evaluation_test_cases

List Evaluation Runs by Test Case

GET /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs

Retrieve Information About an Existing Evaluation Test Case

GET /v2/gen-ai/evaluation_test_cases/{test_case_uuid}

Update an Evaluation Test Case

PUT /v2/gen-ai/evaluation_test_cases/{test_case_uuid}
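
As a rough illustration of how the paths above might be turned into requests, the following Python sketch fills in the path parameters and builds (but does not send) a request object. The host `https://api.digitalocean.com`, the Bearer-token header, and the helper name `build_request` are assumptions for illustration; none of them is stated in this reference.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed host; this reference lists only the paths.
BASE_URL = "https://api.digitalocean.com"


def build_request(method, path_template, token, **path_params):
    """Hypothetical helper: fill path parameters and build an unsent request."""
    # Percent-encode each path parameter so it is safe inside a URL segment.
    encoded = {name: quote(str(value), safe="") for name, value in path_params.items()}
    path = path_template.format(**encoded)
    return Request(
        BASE_URL + path,
        method=method.upper(),
        # Bearer authentication is an assumption, not taken from this page.
        headers={"Authorization": f"Bearer {token}"},
    )


list_req = build_request("get", "/v2/gen-ai/evaluation_test_cases", "TOKEN")
runs_req = build_request(
    "get",
    "/v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs",
    "TOKEN",
    evaluation_test_case_uuid="abc-123",
)
```

Passing either object to `urllib.request.urlopen` would send it. Note that the evaluation-runs path names its parameter `evaluation_test_case_uuid`, while the retrieve and update paths use `test_case_uuid`.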

Models

APIEvaluationTestCase = object { archived_at, created_at, created_by_user_email, 15 more }

archived_at: optional string

format: date-time

created_at: optional string

format: date-time

created_by_user_email: optional string

created_by_user_id: optional string

format: uint64

dataset: optional object { created_at, dataset_name, dataset_uuid, 3 more }

created_at: optional string

Time the dataset was created.

format: date-time

dataset_name: optional string

Name of the dataset.

dataset_uuid: optional string

UUID of the dataset.

file_size: optional string

The size of the uploaded dataset file, in bytes.

format: uint64

has_ground_truth: optional boolean

Whether the dataset has a ground truth column.

row_count: optional number

Number of rows in the dataset.

format: int64

dataset_name: optional string

dataset_uuid: optional string

description: optional string

latest_version_number_of_runs: optional number

format: int32

metrics: optional array of APIEvaluationMetric { description, inverted, metric_name, 5 more }

description: optional string

inverted: optional boolean

If true, the metric is inverted, meaning that a lower value is better.

metric_name: optional string

metric_type: optional "METRIC_TYPE_UNSPECIFIED" or "METRIC_TYPE_GENERAL_QUALITY" or "METRIC_TYPE_RAG_AND_TOOL"

Accepts one of the following:

"METRIC_TYPE_UNSPECIFIED"

"METRIC_TYPE_GENERAL_QUALITY"

"METRIC_TYPE_RAG_AND_TOOL"

metric_uuid: optional string

metric_value_type: optional "METRIC_VALUE_TYPE_UNSPECIFIED" or "METRIC_VALUE_TYPE_NUMBER" or "METRIC_VALUE_TYPE_STRING" or "METRIC_VALUE_TYPE_PERCENTAGE"

Accepts one of the following:

"METRIC_VALUE_TYPE_UNSPECIFIED"

"METRIC_VALUE_TYPE_NUMBER"

"METRIC_VALUE_TYPE_STRING"

"METRIC_VALUE_TYPE_PERCENTAGE"

range_max: optional number

The maximum value for the metric.

format: float

range_min: optional number

The minimum value for the metric.

format: float

star_metric: optional APIStarMetric { metric_uuid, name, success_threshold, success_threshold_pct }

metric_uuid: optional string

success_threshold: optional number

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float

success_threshold_pct: optional number

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32

test_case_uuid: optional string

total_runs: optional number

format: int32

updated_at: optional string

format: date-time

updated_by_user_email: optional string

updated_by_user_id: optional string

format: uint64

version: optional number

format: int64
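
Because every field of APIEvaluationTestCase is optional, and because `file_size` arrives as a string with a uint64 format rather than as a number, client code generally has to read the object defensively. The sketch below shows one possible way to do that in Python. The sample payload, its values, and the helper names `parse_timestamp` and `summarize_test_case` are hypothetical; only the field names come from the model above.

```python
import json
from datetime import datetime


def parse_timestamp(value):
    """Parse a date-time string, tolerating a trailing 'Z' for UTC."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_test_case(payload):
    """Hypothetical helper: pull a few fields out of an APIEvaluationTestCase dict."""
    dataset = payload.get("dataset") or {}
    size = dataset.get("file_size")
    return {
        "test_case_uuid": payload.get("test_case_uuid"),
        "created_at": parse_timestamp(payload.get("created_at")),
        # file_size is a string (uint64 format), so convert it explicitly.
        "dataset_bytes": int(size) if size is not None else None,
        "metric_names": [m.get("metric_name") for m in payload.get("metrics") or []],
    }


# Made-up sample data, for illustration only.
sample = json.loads("""
{
  "test_case_uuid": "tc-1",
  "created_at": "2024-01-02T03:04:05Z",
  "dataset": {"file_size": "2048", "row_count": 10},
  "metrics": [{"metric_name": "correctness"}, {"metric_name": "toxicity"}]
}
""")
summary = summarize_test_case(sample)
```

An empty object is handled as well: `summarize_test_case({})` yields `None` for the scalar fields and an empty list for `metric_names`.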

APIStarMetric = object { metric_uuid, name, success_threshold, success_threshold_pct }

metric_uuid: optional string

success_threshold: optional number

The success threshold for the star metric. This is a value that the metric must reach to be considered successful.

format: float

success_threshold_pct: optional number

The success threshold for the star metric. This is a percentage value between 0 and 100.

format: int32
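
To make the threshold semantics concrete, here is a minimal Python sketch of how a client might compare a metric value against `success_threshold`. It assumes that "reach" means greater than or equal to the threshold for ordinary metrics, and less than or equal to it when the metric's `inverted` flag is true (lower is better). That interpretation, and the function name `meets_threshold`, are assumptions; this reference does not spell out how the service evaluates success.

```python
def meets_threshold(value, success_threshold, inverted=False):
    """Hypothetical check: does a metric value reach the star metric threshold?

    Assumes higher is better unless `inverted` is true, in which case
    lower is better and the comparison is flipped.
    """
    if inverted:
        return value <= success_threshold
    return value >= success_threshold


# Ordinary metric: higher is better.
quality_ok = meets_threshold(0.82, 0.80)
# Inverted metric: lower is better.
toxicity_ok = meets_threshold(0.30, 0.10, inverted=True)
```

Under these assumptions `quality_ok` is true and `toxicity_ok` is false. How `success_threshold_pct` relates to `success_threshold` is not described here, so the sketch leaves it out.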