
LM-Based Metric

This module contains the LLM-based metric class.

Authors

Douglas Raevan Faisal (douglas.r.faisal@gdplabs.id)
Surya Mahadi (made.r.s.mahadi@gdplabs.id)

References

None

LMBasedMetric(name, response_schema, prompt_builder, model=DefaultValues.MODEL, model_credentials=None, model_config=None, parse_response_fn=None, batch_status_check_interval=DefaultValues.BATCH_STATUS_CHECK_INTERVAL, batch_max_iterations=DefaultValues.BATCH_MAX_ITERATIONS)

Bases: BaseMetric

A multi-purpose LM-based metric class.

This class implements a general-purpose metric that uses a language model as the evaluator. It is configured with a response schema, a prompt builder, a model, and (optionally) model credentials and a model config, and it produces scores by prompting the model and parsing its response. A construction sketch follows the parameter list below.

Attributes:

name (str): The name of the metric.
response_schema (ResponseSchema): The response schema to use for the metric.
prompt_builder (PromptBuilder): The prompt builder to use for the metric.
model_credentials (str): The model credentials to use for the metric.
model (Union[str, ModelId, BaseLMInvoker]): The model to use for the metric.
model_config (dict[str, Any] | None): The model config to use for the metric. Defaults to an empty dictionary.

Initialize the LMBasedMetric class.

Parameters:

name (str): The name of the metric. Required.
response_schema (ResponseSchema): The response schema to use for the metric. Required.
prompt_builder (PromptBuilder): The prompt builder to use for the metric. Required.
model (Union[str, ModelId, BaseLMInvoker]): The model to use for the metric. Defaults to DefaultValues.MODEL.
model_credentials (str | None): The model credentials to use for the metric. Defaults to None.
model_config (dict[str, Any] | None): The model config to use for the metric. Defaults to an empty dictionary.
parse_response_fn (Callable[[str | LMOutput], MetricOutput] | None): The function used to parse the response from the LM. Defaults to None, in which case default_parse_response_fn is used.
batch_status_check_interval (float): Time between batch status checks in seconds. Defaults to DefaultValues.BATCH_STATUS_CHECK_INTERVAL (30.0).
batch_max_iterations (int): Maximum number of status check iterations before timing out. Defaults to DefaultValues.BATCH_MAX_ITERATIONS (120, i.e. 60 minutes with the default interval).
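
As a minimal construction sketch, the snippet below wires the pieces together. The import path, the pre-built `my_response_schema` and `my_prompt_builder` objects, and the model/credential values are assumptions for illustration; only the keyword arguments come from the signature above.

```python
# Sketch only: the import path and the pre-built schema / prompt builder
# objects are assumed, not taken from this documentation.
from gllm_evals.lm_based_metric import LMBasedMetric  # hypothetical import path

relevance_metric = LMBasedMetric(
    name="relevance",
    response_schema=my_response_schema,   # a ResponseSchema describing the LM's answer
    prompt_builder=my_prompt_builder,     # a PromptBuilder that renders the judging prompt
    model="gpt-4o-mini",                  # may also be a ModelId or a BaseLMInvoker instance
    model_credentials="sk-...",           # optional; defaults to None
    model_config={"temperature": 0.0},    # optional; defaults to an empty dictionary
    batch_status_check_interval=30.0,     # seconds between batch status polls
    batch_max_iterations=120,             # stop polling after 120 checks (60 minutes here)
)
```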

evaluate(data) async

Evaluate with custom prompt lifecycle support.

Overrides BaseMetric.evaluate() to add custom prompt application and state management. This ensures custom prompts are applied before evaluation and state is restored after.

For batch processing, uses efficient batch API when all items have the same custom prompts. Falls back to per-item processing when items have different custom prompts.

Parameters:

data (MetricInput | list[MetricInput]): Single data item or list of data items to evaluate. Required.

Returns:

MetricOutput | list[MetricOutput]: Evaluation results with scores namespaced by metric name.
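
Because evaluate() is async and accepts either a single item or a list, a call might look like the sketch below. It assumes the `relevance_metric` instance from the construction sketch above and treats a MetricInput as a plain dict whose keys match the prompt builder's template; both are assumptions, not documented behavior.

```python
import asyncio

async def main():
    item = {"query": "What is RAG?", "response": "Retrieval-augmented generation ..."}

    # Single item -> single MetricOutput, with scores namespaced by metric name.
    single_result = await relevance_metric.evaluate(item)

    # List of items -> list of MetricOutput. The efficient batch API is used when
    # all items share the same custom prompts; otherwise items are processed
    # one by one.
    batch_results = await relevance_metric.evaluate([item, item, item])
    print(single_result, batch_results)

asyncio.run(main())
```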

default_parse_response_fn(response)

Default function to parse the result of the LLM into a MetricOutput.

Assumes response contains 'score' and 'reason' or 'explanation' fields.

Parameters:

response (str | LMOutput): The response from the LLM, which can be either a string containing JSON or an LMOutput object with structured output. Required.

Returns:

MetricOutput: The parsed response as a dictionary.

Raises:

ValueError: If the response cannot be parsed or is missing required fields.
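
To illustrate the contract, the sketch below shows a custom parser with the documented Callable[[str | LMOutput], MetricOutput] shape, handling the kind of JSON payload the default parser expects ('score' plus 'reason' or 'explanation'). Treating MetricOutput as a plain dict and reading structured output from an LMOutput via a `structured_output` attribute are assumptions for illustration, not documented behavior.

```python
import json

# Sketch of a custom parse function matching the documented
# Callable[[str | LMOutput], MetricOutput] contract.
def strict_parse_response_fn(response):
    if isinstance(response, str):
        payload = json.loads(response)  # the LM is expected to return a JSON string
    else:
        # Assumed attribute name for structured output on LMOutput.
        payload = getattr(response, "structured_output", None) or {}

    if "score" not in payload:
        raise ValueError("LM response is missing the required 'score' field")

    return {
        "score": float(payload["score"]),
        # Mirror the default parser's tolerance for either field name.
        "reason": payload.get("reason") or payload.get("explanation", ""),
    }

# Pass it at construction time to override the default:
# metric = LMBasedMetric(..., parse_response_fn=strict_parse_response_fn)
```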