
LM-based retrieval evaluator

This evaluator focuses on retrieval quality for RAG-style pipelines using
  • DeepEval contextual precision
  • DeepEval contextual recall

It applies a simple rule-based combiner over precision and recall to derive an overall retrieval rating and a global explanation.
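The exact combiner rules are not spelled out here; as an illustration only, a hypothetical rule-based combiner over precision and recall could look like the sketch below. The function name, thresholds, and "mean of two scores" aggregation are assumptions, not the library's actual logic.

```python
def combine_retrieval_scores(precision: float, recall: float, threshold: float = 0.5) -> dict:
    """Hypothetical rule-based combiner; the real rules may differ.

    - both metrics pass  -> "good"
    - both metrics fail  -> "bad"
    - only one passes    -> "incomplete"
    """
    precision_ok = precision >= threshold
    recall_ok = recall >= threshold
    if precision_ok and recall_ok:
        rating = "good"
    elif not precision_ok and not recall_ok:
        rating = "bad"
    else:
        rating = "incomplete"
    # Aggregated score: simple mean of the two metric scores (an assumption).
    return {"relevancy_rating": rating, "score": (precision + recall) / 2}
```

A rule table like this keeps the overall rating interpretable: each label traces back directly to which of the two underlying metrics passed.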

LMBasedRetrievalEvaluator(metrics=None, enabled_metrics=None, model=DefaultValues.MODEL, model_credentials=None, model_config=None, run_parallel=True, metrics_aggregator=None)

Bases: BaseEvaluator

Evaluator for LM-based retrieval quality in RAG pipelines.

This evaluator
  • Runs a configurable set of retrieval metrics (by default: DeepEval contextual precision and contextual recall)
  • Combines their scores using a simple rule-based scheme to produce:
    • relevancy_rating (good / bad / incomplete)
    • score (aggregated retrieval score)
Default expected input:
  • input (str): The input query used to evaluate the metrics.
  • expected_output (str): The expected output used to evaluate the metrics.
  • retrieved_context (str | list[str]): The list of retrieved contexts used to evaluate the metrics. If the retrieved context is a str, it is converted into a list with a single element.
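The str-to-list coercion described above can be sketched as follows (the function name is illustrative, not part of the library's API):

```python
def normalize_retrieved_context(retrieved_context):
    """Coerce a single context string into a one-element list.

    Mirrors the documented behavior: a str becomes [str];
    a list of strings is kept as-is.
    """
    if isinstance(retrieved_context, str):
        return [retrieved_context]
    return list(retrieved_context)
```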

Attributes:
  • name (str): The name of the evaluator.
  • metrics (list[BaseMetric]): The list of metrics to evaluate.
  • enabled_metrics (Sequence[type[BaseMetric] | str] | None): The list of metrics to enable.
  • model (str | ModelId | BaseLMInvoker): The model to use for the metrics.
  • model_credentials (str | None): The model credentials to use for the metrics.
  • model_config (dict[str, Any] | None): The model configuration to use for the metrics.
  • run_parallel (bool): Whether to run the metrics in parallel.

Initialize the LM-based retrieval evaluator.

Parameters:
  • metrics (Sequence[BaseMetric] | None): Optional custom retrieval metric instances. If provided, these are used as the base pool and may override the default metrics by name. Defaults to None.
  • enabled_metrics (Sequence[type[BaseMetric] | str] | None): Optional subset of metrics to enable from the metric pool. Each entry can be either a metric class or its name. If None, all metrics from the pool are used. Defaults to None.
  • model (str | ModelId | BaseLMInvoker): Model for the default DeepEval metrics. Defaults to DefaultValues.MODEL.
  • model_credentials (str | None): Credentials for the model, required when model is a string. Defaults to None.
  • model_config (dict[str, Any] | None): Optional model configuration. Defaults to None.
  • run_parallel (bool): Whether to run retrieval metrics in parallel. Defaults to True.
  • metrics_aggregator (MetricsAggregator | None): Aggregator for polarity-aware binary scoring. If None, a default MetricsAggregator is used. Defaults to None.
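Since enabled_metrics accepts either metric classes or metric names, the selection step presumably matches each pool entry against both forms. A minimal sketch of that matching, using stand-in classes rather than the library's real BaseMetric hierarchy:

```python
# Stand-in metric classes for illustration; the real library's classes differ.
class BaseMetric:
    name = "base"

class ContextualPrecisionMetric(BaseMetric):
    name = "contextual_precision"

class ContextualRecallMetric(BaseMetric):
    name = "contextual_recall"

def filter_metrics(pool, enabled=None):
    """Keep metrics whose class or name appears in `enabled`; None keeps all."""
    if enabled is None:
        return list(pool)
    kept = []
    for metric in pool:
        for entry in enabled:
            if isinstance(entry, str):
                # Entry given as a metric name.
                if metric.name == entry:
                    kept.append(metric)
                    break
            elif isinstance(metric, entry):
                # Entry given as a metric class.
                kept.append(metric)
                break
    return kept
```

Accepting classes and names interchangeably lets callers enable built-in metrics by class while referring to custom or overridden metrics by the name they registered.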

required_fields property

Return the union of required fields from all configured metrics.