Geval generation evaluator

GEval Generation Evaluator.

`GEvalGenerationEvaluator(metrics=None, enabled_metrics=None, model=DefaultValues.MODEL, model_credentials=None, model_config=None, run_parallel=True, rule_book=None, generation_rule_engine=None, judge=None, refusal_metric=None, batch_status_check_interval=DefaultValues.BATCH_STATUS_CHECK_INTERVAL, batch_max_iterations=DefaultValues.BATCH_MAX_ITERATIONS)`

GEval Generation Evaluator.

This evaluator is used to evaluate the generation of the model.

Default expected input

query (str): The query to evaluate the generation of the model's output.
retrieved_context (str): The retrieved context to evaluate the generation of the model's output.
expected_response (str): The expected response to evaluate the generation of the model's output.
generated_response (str): The generated response to evaluate the generation of the model's output.

Attributes:

Name	Type	Description
`name`	`str`	The name of the evaluator.
`metrics`	`List[BaseMetric]`	The list of metrics to evaluate.
`run_parallel`	`bool`	Whether to run the metrics in parallel.
`rule_book`	`RuleBook \| None`	The rule book.
`generation_rule_engine`	`GenerationRuleEngine \| None`	The generation rule engine.

Initialize the GEval Generation Evaluator.

Parameters:

Name	Type	Description	Default
`metrics`	`List[BaseMetric] \| None`	The list of metrics to evaluate.	`None`
`enabled_metrics`	`List[type[BaseMetric] \| str] \| None`	The list of enabled metrics.	`None`
`model`	`str \| ModelId \| BaseLMInvoker`	The model to use for the metrics.	`MODEL`
`model_credentials`	`str \| None`	The model credentials to use for the metrics.	`None`
`model_config`	`dict[str, Any] \| None`	The model config to use for the metrics.	`None`
`run_parallel`	`bool`	Whether to run the metrics in parallel.	`True`
`rule_book`	`RuleBook \| None`	The rule book.	`None`
`generation_rule_engine`	`GenerationRuleEngine \| None`	The generation rule engine.	`None`
`judge`	`MultipleLLMAsJudge \| None`	Optional multiple LLM judge for ensemble evaluation.	`None`
`refusal_metric`	`type[BaseMetric] \| None`	The refusal metric to use. If None, the default refusal metric will be used. Defaults to GEvalRefusalMetric.	`None`
`batch_status_check_interval`	`float`	Time between batch status checks in seconds. Defaults to 30.0.	`BATCH_STATUS_CHECK_INTERVAL`
`batch_max_iterations`	`int`	Maximum number of status check iterations before timeout. Defaults to 120 (60 minutes with default interval).	`BATCH_MAX_ITERATIONS`