GEval Completeness
GEval Completeness Metric.
This metric is used to evaluate the completeness of the generated output using GEval.
GEvalCompletenessMetric(*args, threshold=1.0, **kwargs)
Bases: DeepEvalGEvalMetric
GEval Completeness Metric.
This metric is used to evaluate the completeness of the generated output.
Available Fields
- query (str): The input query that the response is expected to answer.
- generated_response (str): The model's generated response, whose completeness is evaluated.
- expected_response (str): The reference response against which completeness is judged.
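As an illustration, the three fields could be collected in a plain mapping like the one below. The sample values are invented, and how such a record is handed to the metric is left out because this page does not specify it.

```python
# Hypothetical sample record: the keys are the documented fields, the values are made up.
record = {
    "query": "What are the three branches of the US government?",
    "generated_response": "The legislative and executive branches.",
    "expected_response": "The legislative, executive, and judicial branches.",
}

# All three documented fields are typed as str.
assert all(isinstance(value, str) for value in record.values())
```

Here the generated response omits one item that the expected response contains, which is the kind of gap a completeness metric is meant to surface.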
Scoring
- [0, 1] (Continuous): The normalized score. The native 1-3 rubric value is stored in the rubric_score field.
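This page does not state how the native 1-3 rubric value is mapped onto [0, 1]. The following is a minimal sketch that assumes a linear min-max mapping; the function name normalize_rubric_score is hypothetical and is not part of the library.

```python
def normalize_rubric_score(rubric_score: int, low: int = 1, high: int = 3) -> float:
    """Map a native rubric value in [low, high] onto [0, 1] linearly (assumed mapping)."""
    if not low <= rubric_score <= high:
        raise ValueError(f"rubric_score must be between {low} and {high}, got {rubric_score}")
    return (rubric_score - low) / (high - low)


# Print the assumed normalized value for each native rubric level.
for value in (1, 2, 3):
    print(value, normalize_rubric_score(value))
```

Under this assumption, only a rubric value of 3 would reach the default threshold of 1.0.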
Cookbook Example
Please refer to example_geval_completeness.py in the gen-ai-sdk-cookbook repository.
Initializes the GEvalCompletenessMetric class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | `str \| None` | The name of the metric. | "completeness" |
| evaluation_params | `list[LLMTestCaseParams] \| None` | The evaluation parameters. | [INPUT, ACTUAL_OUTPUT, EXPECTED_OUTPUT] |
| model | `str \| ModelId \| BaseLMInvoker` | The model to use for the metric. | DefaultValues.MODEL |
| criteria | `str \| None` | The criteria to use for the metric. | COMPLETENESS_CRITERIA |
| evaluation_steps | `list[str] \| None` | The evaluation steps to use for the metric. | COMPLETENESS_EVALUATION_STEPS |
| rubric | `list[Rubric] \| None` | The rubric to use for the metric. | COMPLETENESS_RUBRIC |
| model_credentials | `str \| None` | The model credentials to use for the metric. Required when model is a string. | None |
| model_config | `dict[str, Any] \| None` | The model config to use for the metric. | None |
| threshold | `float` | The threshold to use for the metric. Must be between 0.0 and 1.0 inclusive. | 1.0 |
| additional_context | `str \| None` | Additional context, such as few-shot examples. | COMPLETENESS_FEW_SHOT |
| batch_status_check_interval | `float` | Time between batch status checks, in seconds. | 30.0 |
| batch_max_iterations | `int` | Maximum number of status check iterations before timeout. | 120 |
| strict_mode | `bool` | If True, binarizes the score to 1.0 or 0.0. | False |
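To make the numeric parameters concrete, here is a minimal sketch using a hypothetical stand-in class, CompletenessSettings, which is not part of the library. It mirrors the documented defaults and the documented threshold range. Two points are assumptions: that strict_mode binarizes the score at the threshold, and that the batch timeout equals the check interval multiplied by the maximum number of iterations.

```python
from dataclasses import dataclass


@dataclass
class CompletenessSettings:
    """Hypothetical stand-in mirroring a few documented parameters and their defaults."""

    threshold: float = 1.0
    strict_mode: bool = False
    batch_status_check_interval: float = 30.0
    batch_max_iterations: int = 120

    def __post_init__(self) -> None:
        # The documentation requires the threshold to lie in [0.0, 1.0] inclusive.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(f"threshold must be between 0.0 and 1.0, got {self.threshold}")

    def final_score(self, score: float) -> float:
        # Assumption: strict mode maps scores at or above the threshold to 1.0, others to 0.0.
        if self.strict_mode:
            return 1.0 if score >= self.threshold else 0.0
        return score

    def max_batch_wait_seconds(self) -> float:
        # Assumption: one interval elapses per status check, so the budget is the product.
        return self.batch_status_check_interval * self.batch_max_iterations


settings = CompletenessSettings()
print(settings.max_batch_wait_seconds())  # 3600.0
```

If that timeout assumption holds, the defaults allow up to one hour of polling before a batch run times out.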