GEval Groundedness

GEval Groundedness Metric.

This metric is used to evaluate the groundedness of the generated output using GEval.

GEvalGroundednessMetric(*args, threshold=1.0, **kwargs)

Bases: DeepEvalGEvalMetric

GEval Groundedness Metric.

This metric is used to evaluate the groundedness of the generated output.

Available Fields
  • query (str): The user query that the generated response answers.
  • generated_response (str): The model's generated response, whose groundedness is evaluated.
  • retrieved_context (str): The retrieved context against which the generated response is checked.
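
As a minimal sketch, the three fields above can be assembled into a plain record before evaluation. The `GroundednessInput` type and the `make_input` helper are hypothetical names used only for illustration. They are not part of the library, and how the record is handed to the metric is not shown here.

```python
from typing import TypedDict


class GroundednessInput(TypedDict):
    """Hypothetical container mirroring the three documented fields."""

    query: str
    generated_response: str
    retrieved_context: str


def make_input(query: str, generated_response: str, retrieved_context: str) -> GroundednessInput:
    """Build a record and reject non-string values (all fields are documented as str)."""
    record = GroundednessInput(
        query=query,
        generated_response=generated_response,
        retrieved_context=retrieved_context,
    )
    for key, value in record.items():
        if not isinstance(value, str):
            raise TypeError(f"{key} must be a str, got {type(value).__name__}")
    return record


example = make_input(
    query="What is the capital of France?",
    generated_response="The capital of France is Paris.",
    retrieved_context="Paris is the capital and largest city of France.",
)
```
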
Scoring
  • [0, 1] (Continuous): Normalized score range. The native 1-3 rubric value is stored in the rubric_score field.
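
The exact mapping from the native 1-3 rubric value to the normalized score is not stated here. The sketch below assumes a linear min-max rescaling, which is an assumption rather than confirmed library behavior. `normalize_rubric_score` is a hypothetical helper, not a library function.

```python
def normalize_rubric_score(rubric_score: int, low: int = 1, high: int = 3) -> float:
    """Rescale a rubric value in [low, high] to [0, 1].

    ASSUMPTION: linear min-max scaling; the library may use a different mapping.
    """
    if not low <= rubric_score <= high:
        raise ValueError(f"rubric_score must be between {low} and {high}")
    return (rubric_score - low) / (high - low)


# Under this assumption, the three rubric levels map to 0.0, 0.5, and 1.0.
normalized = {level: normalize_rubric_score(level) for level in (1, 2, 3)}
```

Under this assumed mapping, only the top rubric level would reach the default threshold of 1.0.
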
Cookbook Example

Please refer to example_geval_groundedness.py in the gen-ai-sdk-cookbook repository.

Initializes the GEvalGroundednessMetric class.

Parameters:

  • name (str | None): The name of the metric. Defaults to "groundedness".
  • evaluation_params (list[LLMTestCaseParams] | None): The evaluation parameters. Defaults to [INPUT, ACTUAL_OUTPUT, RETRIEVAL_CONTEXT].
  • model (str | ModelId | BaseLMInvoker): The model to use for the metric. Defaults to DefaultValues.MODEL.
  • criteria (str | None): The criteria to use for the metric. Defaults to GROUNDEDNESS_CRITERIA.
  • evaluation_steps (list[str] | None): The evaluation steps to use for the metric. Defaults to GROUNDEDNESS_EVALUATION_STEPS.
  • rubric (list[Rubric] | None): The rubric to use for the metric. Defaults to GROUNDEDNESS_RUBRIC.
  • model_credentials (str | None): The model credentials to use for the metric. Defaults to None. Required when model is a string.
  • model_config (dict[str, Any] | None): The model config to use for the metric. Defaults to None.
  • threshold (float): The threshold to use for the metric. Must be between 0.0 and 1.0 inclusive. Defaults to 1.0.
  • additional_context (str | None): Additional context, such as few-shot examples. Defaults to GROUNDEDNESS_FEW_SHOT.
  • batch_status_check_interval (float): Time between batch status checks, in seconds. Defaults to 30.0.
  • batch_max_iterations (int): Maximum number of status check iterations before timeout. Defaults to 120.
  • strict_mode (bool): If True, binarizes the score to 1.0 or 0.0. Defaults to False.
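
A few of the parameters above can be illustrated with a standalone sketch that does not call the library. `validate_threshold`, `apply_strict_mode`, and `max_batch_wait_seconds` are hypothetical helpers. The threshold range check and the polling arithmetic follow from the parameter descriptions. The binarization rule (1.0 when the score reaches the threshold, otherwise 0.0) is an assumption, since the description only says the score becomes 1.0 or 0.0.

```python
def validate_threshold(threshold: float = 1.0) -> float:
    """Check the documented constraint: 0.0 <= threshold <= 1.0."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0 inclusive")
    return threshold


def apply_strict_mode(score: float, threshold: float, strict_mode: bool = False) -> float:
    """Return the score unchanged, or a binarized score when strict_mode is True.

    ASSUMPTION: a score at or above the threshold becomes 1.0, anything else 0.0.
    """
    if not strict_mode:
        return score
    return 1.0 if score >= threshold else 0.0


def max_batch_wait_seconds(interval: float = 30.0, max_iterations: int = 120) -> float:
    """Rough upper bound on polling time, assuming one wait of `interval` per status check."""
    return interval * max_iterations


# With the documented defaults, polling would give up after about 3600 seconds (one hour).
default_wait = max_batch_wait_seconds()
```

If batch jobs routinely take longer than this bound, raising batch_max_iterations or batch_status_check_interval would extend it proportionally.
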