GEval Completeness

GEval Completeness Metric.

This metric is used to evaluate the completeness of the generated output using GEval.

GEvalCompletenessMetric(*args, threshold=1.0, **kwargs)

Bases: DeepEvalGEvalMetric

This metric is used to evaluate the completeness of the generated output.

Available Fields
  • query (str): The input query that the response is meant to answer.
  • generated_response (str): The model's generated response, which is evaluated for completeness.
  • expected_response (str): The reference response against which completeness is judged.
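
As a rough illustration of the fields above, the sketch below models one evaluation record as a plain dictionary and checks that all three string fields are present. The EvalRecord type and the missing_fields helper are hypothetical names invented for this example; they are not part of the library, and the sample text is made up.

```python
from typing import TypedDict


class EvalRecord(TypedDict):
    """Hypothetical container mirroring the three documented fields."""

    query: str
    generated_response: str
    expected_response: str


# Field names taken from the "Available Fields" list.
REQUIRED_FIELDS = ("query", "generated_response", "expected_response")


def missing_fields(record: dict) -> list[str]:
    """Return the documented fields that are absent or not strings."""
    return [name for name in REQUIRED_FIELDS if not isinstance(record.get(name), str)]


# Illustrative record; the content is invented for the example.
record: EvalRecord = {
    "query": "What are the primary colors?",
    "generated_response": "Red and blue.",
    "expected_response": "Red, blue, and yellow.",
}

print(missing_fields(record))
```

Checking the record before evaluation is a design choice of this sketch, not documented behavior of the metric.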
Scoring
  • [0, 1] (Continuous): Normalized score range. The native 1-3 rubric value is stored in the rubric_score field.
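
The page does not say how the 1-3 rubric value is mapped onto [0, 1]. The sketch below assumes a simple linear (min-max) mapping, which is only one plausible choice; the function name normalize_rubric_score is hypothetical and the library's actual formula may differ.

```python
RUBRIC_MIN = 1  # lowest rubric value named in the Scoring section
RUBRIC_MAX = 3  # highest rubric value named in the Scoring section


def normalize_rubric_score(rubric_score: int) -> float:
    """Map a 1-3 rubric value onto [0, 1], ASSUMING linear min-max scaling."""
    if not RUBRIC_MIN <= rubric_score <= RUBRIC_MAX:
        raise ValueError(f"rubric_score must be in [{RUBRIC_MIN}, {RUBRIC_MAX}]")
    return (rubric_score - RUBRIC_MIN) / (RUBRIC_MAX - RUBRIC_MIN)


for value in (1, 2, 3):
    print(value, normalize_rubric_score(value))
```

Under this assumption a rubric value of 2 lands at the midpoint of the range, so only a rubric value of 3 would meet the default threshold of 1.0.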
Cookbook Example

Please refer to example_geval_completeness.py in the gen-ai-sdk-cookbook repository.

Initializes a GEvalCompletenessMetric instance.

Parameters:

Only threshold has an explicit default in the signature (1.0). The other parameters are accepted through *args and **kwargs; the generated reference marks them as required, but where a description states a default, that value applies when the parameter is omitted.

  • name (str | None): The name of the metric. Defaults to "completeness".
  • evaluation_params (list[LLMTestCaseParams] | None): The evaluation parameters. Defaults to [INPUT, ACTUAL_OUTPUT, EXPECTED_OUTPUT].
  • models (BaseLMInvoker | list[BaseLMInvoker] | None): The model invoker(s) to use for the metric.
  • criteria (str | None): The criteria to use for the metric. Defaults to COMPLETENESS_CRITERIA.
  • evaluation_steps (list[str] | None): The evaluation steps to use for the metric. Defaults to COMPLETENESS_EVALUATION_STEPS.
  • rubric (list[Rubric] | None): The rubric to use for the metric. Defaults to COMPLETENESS_RUBRIC.
  • threshold (float): The threshold to use for the metric. Must be between 0.0 and 1.0 inclusive. Defaults to 1.0.
  • additional_context (str | None): Additional context, such as few-shot examples. Defaults to COMPLETENESS_FEW_SHOT.
  • batch_status_check_interval (float): Time between batch status checks, in seconds. Defaults to 30.0.
  • batch_max_iterations (int): Maximum number of status check iterations before timeout. Defaults to 120.
  • strict_mode (bool): If True, binarizes the score to 1.0 or 0.0. Defaults to False.
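
To make the numeric parameters concrete, the sketch below mirrors the documented defaults in a small standalone class. MetricSettings is a hypothetical name and does not exist in the library. Two things are assumptions rather than documented behavior: that the worst-case batch wait is roughly the check interval multiplied by the iteration limit, and that strict mode binarizes by comparing the score with the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSettings:
    """Hypothetical holder for the documented defaults; not the library class."""

    threshold: float = 1.0
    batch_status_check_interval: float = 30.0
    batch_max_iterations: int = 120
    strict_mode: bool = False

    def __post_init__(self) -> None:
        # Documented constraint: threshold must lie in [0.0, 1.0] inclusive.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0 inclusive")

    def max_batch_wait_seconds(self) -> float:
        # ASSUMPTION: total wait is interval * iterations, ignoring the time
        # each status check itself takes.
        return self.batch_status_check_interval * self.batch_max_iterations

    def final_score(self, score: float) -> float:
        # ASSUMPTION: strict mode binarizes against the threshold; the page
        # only says the score becomes 1.0 or 0.0.
        if self.strict_mode:
            return 1.0 if score >= self.threshold else 0.0
        return score


defaults = MetricSettings()
print(defaults.max_batch_wait_seconds())  # 30.0 * 120 = 3600.0 seconds
```

With the default values, this estimate gives a batch about one hour before the iteration limit is reached.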