GEval Generation Evaluator
GEvalGenerationEvaluator(metrics=None, enabled_metrics=None, model=DefaultValues.MODEL, model_credentials=None, model_config=None, run_parallel=True, rule_book=None, generation_rule_engine=None, judge=None, refusal_metric=None)
Bases: GenerationEvaluator
This evaluator assesses the response generated by a model, using the configured metrics.
Default expected input:
- query (str): The query the model was asked to answer.
- retrieved_context (str): The context retrieved for the query.
- expected_response (str): The expected (reference) response for the query.
- generated_response (str): The response generated by the model, which is the text being evaluated.
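
As an illustration of the default input described above, here is a minimal sketch of one record with the four fields. The `GenerationEvalInput` type and the `missing_fields` helper are hypothetical names introduced for this example; they are not part of the library, and the evaluator's own validation may differ.

```python
from typing import TypedDict


class GenerationEvalInput(TypedDict):
    """Hypothetical record type mirroring the four default input fields."""

    query: str
    retrieved_context: str
    expected_response: str
    generated_response: str


REQUIRED_FIELDS = (
    "query",
    "retrieved_context",
    "expected_response",
    "generated_response",
)


def missing_fields(record: dict) -> list[str]:
    """Return the default fields that are absent or not strings (illustrative check)."""
    return [field for field in REQUIRED_FIELDS if not isinstance(record.get(field), str)]


# Example record; the values are made up for illustration.
example: GenerationEvalInput = {
    "query": "What is the capital of France?",
    "retrieved_context": "Paris is the capital and largest city of France.",
    "expected_response": "Paris",
    "generated_response": "The capital of France is Paris.",
}
```

A check like `missing_fields(example)` can be run before evaluation to catch incomplete records early.
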
Attributes:
| Name | Type | Description |
|---|---|---|
| name | `str` | The name of the evaluator. |
| metrics | `List[BaseMetric]` | The list of metrics to evaluate. |
| run_parallel | `bool` | Whether to run the metrics in parallel. |
| rule_book | `RuleBook \| None` | The rule book. |
| generation_rule_engine | `GenerationRuleEngine \| None` | The generation rule engine. |
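
The `run_parallel` attribute can be pictured with a small sketch. This is only one plausible reading, assuming metrics expose an async `evaluate` method; `StubMetric`, `run_metrics`, and the metric names used below are hypothetical stand-ins, not the library's implementation.

```python
import asyncio


class StubMetric:
    """Hypothetical stand-in for a BaseMetric; not the library's class."""

    def __init__(self, name: str, score: float) -> None:
        self.name = name
        self.score = score

    async def evaluate(self, data: dict) -> dict:
        await asyncio.sleep(0)  # placeholder for a model call
        return {self.name: self.score}


async def run_metrics(metrics: list, data: dict, run_parallel: bool = True) -> dict:
    """Run every metric on the same record and merge the per-metric results."""
    if run_parallel:
        # Schedule all metrics concurrently; gather preserves input order.
        results = await asyncio.gather(*(m.evaluate(data) for m in metrics))
    else:
        # Await each metric in turn before starting the next one.
        results = [await m.evaluate(data) for m in metrics]
    merged: dict = {}
    for result in results:
        merged.update(result)
    return merged
```

Under this assumption both modes return the same scores. The parallel mode only changes how the calls are scheduled, which matters when each metric waits on a remote model.
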
Initialize the GEval Generation Evaluator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metrics | `List[BaseMetric] \| None` | The list of metrics to evaluate. | `None` |
| enabled_metrics | `List[type[BaseMetric] \| str] \| None` | The list of enabled metrics. | `None` |
| model | `str \| ModelId \| BaseLMInvoker` | The model to use for the metrics. | `DefaultValues.MODEL` |
| model_credentials | `str \| None` | The model credentials to use for the metrics. | `None` |
| model_config | `dict[str, Any] \| None` | The model config to use for the metrics. | `None` |
| run_parallel | `bool` | Whether to run the metrics in parallel. | `True` |
| rule_book | `RuleBook \| None` | The rule book. | `None` |
| generation_rule_engine | `GenerationRuleEngine \| None` | The generation rule engine. | `None` |
| judge | `MultipleLLMAsJudge \| None` | Optional multiple LLM judge for ensemble evaluation. | `None` |
| refusal_metric | `type[BaseMetric] \| None` | The refusal metric to use. If None, the default refusal metric, GEvalRefusalMetric, is used. | `None` |
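
The type of `enabled_metrics` allows each entry to be either a metric class or a string. The sketch below shows one way such a list could be used to filter metrics, assuming strings are matched against a `name` attribute and classes via `isinstance`. The classes and the `select_metrics` helper are hypothetical; the evaluator's actual matching rules are not documented here.

```python
class BaseMetric:
    """Hypothetical stand-in for the library's BaseMetric."""

    name = "base"


class CompletenessMetric(BaseMetric):
    """Hypothetical metric used only for this illustration."""

    name = "completeness"


class GroundednessMetric(BaseMetric):
    """Hypothetical metric used only for this illustration."""

    name = "groundedness"


def select_metrics(metrics, enabled_metrics=None):
    """Keep the metrics whose class or name appears in enabled_metrics."""
    if enabled_metrics is None:
        # Assumption: None means no filtering, so every metric stays enabled.
        return list(metrics)
    selected = []
    for metric in metrics:
        for entry in enabled_metrics:
            matches_name = isinstance(entry, str) and metric.name == entry
            matches_type = isinstance(entry, type) and isinstance(metric, entry)
            if matches_name or matches_type:
                selected.append(metric)
                break  # avoid adding the same metric twice
    return selected
```

With this helper, `select_metrics(all_metrics, ["groundedness"])` and `select_metrics(all_metrics, [GroundednessMetric])` would select the same metric, which mirrors the mixed class-or-string typing of the parameter.
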