GEval Groundedness

GEval Groundedness Metric.

This metric uses GEval to evaluate how well the generated output is grounded in the retrieved context.

`GEvalGroundednessMetric(*args, threshold=1.0, **kwargs)`

Available Fields

query (str): The user query that the generated response answers.
generated_response (str): The generated response whose groundedness is evaluated.
retrieved_context (str): The retrieved context against which the generated response is checked.
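As a minimal sketch of the input shape described above, the three fields can be collected in a small record before being handed to the metric. `GroundednessInput` and its `to_dict` helper are hypothetical names used for illustration only; they are not part of the SDK, and the actual evaluation call is intentionally left out.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GroundednessInput:
    """Hypothetical container mirroring the three documented fields."""

    query: str
    generated_response: str
    retrieved_context: str

    def __post_init__(self) -> None:
        # All three fields are documented as str, so reject anything else early.
        for field_name in ("query", "generated_response", "retrieved_context"):
            value = getattr(self, field_name)
            if not isinstance(value, str):
                raise TypeError(
                    f"{field_name} must be a str, got {type(value).__name__}"
                )

    def to_dict(self) -> dict:
        # Plain dict keyed by the documented field names.
        return asdict(self)


example = GroundednessInput(
    query="What is the capital of France?",
    generated_response="The capital of France is Paris.",
    retrieved_context="Paris is the capital and largest city of France.",
)
```

Validating the field types up front keeps malformed records from reaching the model invoker, where a failure would be slower and harder to diagnose.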

Scoring

[0, 1] (Continuous): Normalized score range. The native 1-3 rubric value is stored in the rubric_score field.
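The exact mapping from the 1-3 rubric value to the normalized [0, 1] score is not stated here. The sketch below assumes plain linear min-max scaling, which is one plausible reading; `normalize_rubric_score` is a hypothetical helper, not an SDK function.

```python
RUBRIC_MIN = 1  # lowest rubric value, per the documented 1-3 range
RUBRIC_MAX = 3  # highest rubric value, per the documented 1-3 range


def normalize_rubric_score(rubric_score: int) -> float:
    """Map a 1-3 rubric value onto [0, 1], ASSUMING linear min-max scaling."""
    if not RUBRIC_MIN <= rubric_score <= RUBRIC_MAX:
        raise ValueError(
            f"rubric_score must be in [{RUBRIC_MIN}, {RUBRIC_MAX}], got {rubric_score}"
        )
    return (rubric_score - RUBRIC_MIN) / (RUBRIC_MAX - RUBRIC_MIN)
```

Under this assumption a rubric value of 1 maps to 0.0, 2 maps to 0.5, and 3 maps to 1.0. If the SDK normalizes differently, the rubric_score field still holds the unscaled value.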

Cookbook Example

Please refer to example_geval_groundedness.py in the gen-ai-sdk-cookbook repository.

Initializes the GEvalGroundednessMetric class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `name` | `str \| None` | The name of the metric. Defaults to "groundedness". | `"groundedness"` |
| `evaluation_params` | `list[LLMTestCaseParams] \| None` | The evaluation parameters. Defaults to [INPUT, ACTUAL_OUTPUT, RETRIEVAL_CONTEXT]. | `[INPUT, ACTUAL_OUTPUT, RETRIEVAL_CONTEXT]` |
| `models` | `BaseLMInvoker \| list[BaseLMInvoker] \| None` | The model invoker(s) to use for the metric. | required |
| `criteria` | `str \| None` | The criteria to use for the metric. Defaults to GROUNDEDNESS_CRITERIA. | `GROUNDEDNESS_CRITERIA` |
| `evaluation_steps` | `list[str] \| None` | The evaluation steps to use for the metric. Defaults to GROUNDEDNESS_EVALUATION_STEPS. | `GROUNDEDNESS_EVALUATION_STEPS` |
| `rubric` | `list[Rubric] \| None` | The rubric to use for the metric. Defaults to GROUNDEDNESS_RUBRIC. | `GROUNDEDNESS_RUBRIC` |
| `threshold` | `float` | The threshold to use for the metric. Must be between 0.0 and 1.0 inclusive. Defaults to 1.0. | `1.0` |
| `additional_context` | `str \| None` | Additional context, such as few-shot examples. Defaults to GROUNDEDNESS_FEW_SHOT. | `GROUNDEDNESS_FEW_SHOT` |
| `batch_status_check_interval` | `float` | Time between batch status checks, in seconds. Defaults to 30.0. | `30.0` |
| `batch_max_iterations` | `int` | Maximum number of status check iterations before timeout. Defaults to 120. | `120` |
| `strict_mode` | `bool` | If True, binarizes the score to 1.0 or 0.0. Defaults to False. | `False` |
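The interaction between `threshold` and `strict_mode` can be illustrated with a short sketch. It assumes that strict mode binarizes by comparing the normalized score against the threshold; the table only says the score becomes 1.0 or 0.0, so the comparison rule is an assumption. `validate_threshold` and `apply_strict_mode` are hypothetical helpers, not SDK functions.

```python
def validate_threshold(threshold: float) -> float:
    """Enforce the documented constraint: 0.0 <= threshold <= 1.0."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be between 0.0 and 1.0, got {threshold}")
    return threshold


def apply_strict_mode(
    score: float, threshold: float = 1.0, strict_mode: bool = False
) -> float:
    """Return the score unchanged, or binarized when strict_mode is True.

    ASSUMPTION: binarization yields 1.0 when score >= threshold, else 0.0.
    """
    validate_threshold(threshold)
    if not strict_mode:
        return score
    return 1.0 if score >= threshold else 0.0
```

With the default threshold of 1.0, only a perfect normalized score survives strict mode under this assumption; lowering the threshold lets partially grounded responses pass.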