LM-based retrieval evaluator
This evaluator focuses on retrieval quality for RAG-style pipelines using:
- DeepEval contextual precision
- DeepEval contextual recall
It applies a simple rule-based combiner over precision and recall to derive an overall retrieval rating and a global explanation.
LMBasedRetrievalEvaluator(metrics=None, enabled_metrics=None, model=DefaultValues.MODEL, model_credentials=None, model_config=None, run_parallel=True, metrics_aggregator=None)
Bases: BaseEvaluator
Evaluator for LM-based retrieval quality in RAG pipelines.
This evaluator:
- Runs a configurable set of retrieval metrics (by default: DeepEval contextual precision and contextual recall)
- Combines their scores using a simple rule-based scheme to produce:
  - relevancy_rating (good / bad / incomplete)
  - score (aggregated retrieval score)
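The documentation does not show the combiner itself. As an illustration only, a minimal rule-based scheme of this kind might look like the sketch below; the function name, the 0.5 threshold, and the averaging are assumptions, not the library's actual logic:

```python
# Hypothetical sketch of a rule-based combiner over precision and recall.
# The threshold (0.5), the averaging, and the function name are illustrative
# assumptions, not the library's actual implementation.

def combine_retrieval_scores(precision: float, recall: float, threshold: float = 0.5) -> dict:
    """Derive an overall retrieval rating from per-metric scores."""
    score = (precision + recall) / 2  # simple aggregate of the two metrics
    if precision >= threshold and recall >= threshold:
        rating = "good"          # both metrics pass
    elif precision >= threshold:
        rating = "incomplete"    # relevant contexts found, but coverage is lacking
    else:
        rating = "bad"           # retrieved contexts are largely irrelevant
    return {"relevancy_rating": rating, "score": score}
```

A combiner like this keeps the rating interpretable: precision gates relevance, recall gates coverage.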
Default expected input:
- input (str): The input query to evaluate the metric.
- expected_output (str): The expected output to evaluate the metric.
- retrieved_context (str | list[str]): The list of retrieved contexts to evaluate the metric. If the retrieved context is a str, it will be converted into a list with a single element.
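For illustration, here is a payload matching these fields, with the documented str-to-list coercion sketched as a standalone helper; the helper name is an assumption, not part of the library's API:

```python
# Hypothetical illustration of the expected input fields. The normalization
# helper is an assumption based on the documented str -> list coercion.

def normalize_retrieved_context(retrieved_context):
    """Wrap a single context string into a one-element list, per the docs."""
    if isinstance(retrieved_context, str):
        return [retrieved_context]
    return list(retrieved_context)

sample_input = {
    "input": "What year was the Eiffel Tower completed?",
    "expected_output": "The Eiffel Tower was completed in 1889.",
    "retrieved_context": "The Eiffel Tower opened in 1889.",  # a bare str is allowed
}
sample_input["retrieved_context"] = normalize_retrieved_context(
    sample_input["retrieved_context"]
)
```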
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | The name of the evaluator. |
| metrics | list[BaseMetric] | The list of metrics to evaluate. |
| enabled_metrics | Sequence[type[BaseMetric] \| str] \| None | The list of metrics to enable. |
| model | str \| ModelId \| BaseLMInvoker | The model to use for the metrics. |
| model_credentials | str \| None | The model credentials to use for the metrics. |
| model_config | dict[str, Any] \| None | The model configuration to use for the metrics. |
| run_parallel | bool | Whether to run the metrics in parallel. |
Initialize the LM-based retrieval evaluator.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| metrics | Sequence[BaseMetric] \| None | Optional custom retrieval metric instances. If provided, these will be used as the base pool and may override the default metrics by name. Defaults to None. | None |
| enabled_metrics | Sequence[type[BaseMetric] \| str] \| None | Optional subset of metrics to enable from the metric pool. Each entry can be either a metric class or its name. Defaults to None. | None |
| model | str \| ModelId \| BaseLMInvoker | Model for the default DeepEval metrics. Defaults to DefaultValues.MODEL. | DefaultValues.MODEL |
| model_credentials | str \| None | Credentials for the model. Defaults to None. | None |
| model_config | dict[str, Any] \| None | Optional model configuration. Defaults to None. | None |
| run_parallel | bool | Whether to run retrieval metrics in parallel. Defaults to True. | True |
| metrics_aggregator | MetricsAggregator \| None | Aggregator for polarity-aware binary scoring. If None, a default MetricsAggregator is used. Defaults to None. | None |
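To illustrate how `enabled_metrics` entries (classes or names) could select from the metric pool, here is a hedged sketch; the class stubs and the selection helper are assumptions, not the library's code:

```python
# Hypothetical sketch mirroring the documented enabled_metrics behavior:
# each entry selects a metric from the pool either by class or by name.
# The stub classes and helper are illustrative assumptions.

class ContextualPrecision:
    name = "contextual_precision"

class ContextualRecall:
    name = "contextual_recall"

def select_enabled(pool, enabled=None):
    """Return the metrics matching the enabled classes or names; all if None."""
    if enabled is None:
        return list(pool)
    selected = []
    for metric in pool:
        for entry in enabled:
            if isinstance(entry, str):
                if metric.name == entry:  # match by metric name
                    selected.append(metric)
                    break
            elif isinstance(metric, entry):  # match by metric class
                selected.append(metric)
                break
    return selected
```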
required_fields
property
Return the union of required fields from all configured metrics.