
Langfuse dataset

Langfuse Dataset Class.

This class wraps the Langfuse dataset class. It can load a dataset from Langfuse and add data to Langfuse.

Authors

Christina Alexandra (christina.alexandra@gdplabs.id)


LangfuseDataset(dataset, langfuse_client, dataset_name=None, expected_output_key='expected_response', mapping=None, attachments_config=None)

Bases: BaseDataset

Langfuse dataset class for the evaluator.

Attributes:

Name Type Description
dataset list[MetricInput]

The dataset to use for the evaluation.

langfuse_client Langfuse

The Langfuse client instance.

dataset_name str

The name of the dataset in Langfuse.

expected_output_key str | None

The key for expected output. Defaults to "expected_response".

mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

Initialize the LangfuseDataset class.

Parameters:

Name Type Description Default
dataset List[MetricInput]

The dataset to use for the evaluation.

required
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name Optional[str]

The name of the dataset in Langfuse.

None
expected_output_key str | None

The key for expected output. Defaults to "expected_response".

'expected_response'
mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

None
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None

convert_to_standard_dataset(expected_output_key=None, mapping=None)

Convert the dataset to standard data.

Parameters:

Name Type Description Default
expected_output_key str | None

The key for expected output. Defaults to None.

None
mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

None

Returns:

Type Description
List[MetricInput]

The converted dataset.

from_csv(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, attachments_config=None, **kwargs) staticmethod

Create a LangfuseDataset from a CSV file.

Parameters:

Name Type Description Default
path str

The path to the CSV file.

required
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name str | None

The name under which to register this dataset in Langfuse. If None, the CSV filename without its extension is used. Defaults to None.

None
dataset_description str

The description of the dataset. Defaults to an empty string.

''
metadata dict | None

Optional metadata for the dataset. Defaults to None.

None
is_append bool

If True, append items to an existing dataset. If False, create the dataset only if it does not already exist. Defaults to False.

False
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None
**kwargs Any

Additional arguments to pass to pandas read_csv.

{}

Returns:

Name Type Description
LangfuseDataset LangfuseDataset

The created dataset.
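The documented fallback for `dataset_name` (the CSV filename without its extension) can be illustrated with the standard library. This is a minimal sketch of that naming rule only, not the library's own code: `default_dataset_name` is a hypothetical helper, the file path is made up, and the commented call assumes `LangfuseDataset` and a configured `langfuse_client` are already in scope.

```python
from pathlib import Path


def default_dataset_name(path: str) -> str:
    """Hypothetical helper: return the CSV filename without its extension."""
    return Path(path).stem


name = default_dataset_name("data/qa_pairs.csv")
print(name)

# With the library available, omitting dataset_name would rely on the same default:
# dataset = LangfuseDataset.from_csv("data/qa_pairs.csv", langfuse_client=langfuse_client)
```

Only the final suffix is dropped, so a file such as `eval.v2.csv` would yield `eval.v2` under this sketch.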

from_dict(dataset, langfuse_client, dataset_name, dataset_description='', mapping=None, metadata=None, is_append=False, attachments_config=None) staticmethod

Create a LangfuseDataset from a list of MetricInput.

Parameters:

Name Type Description Default
dataset List[MetricInput]

The dataset to create.

required
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name str

The name of the dataset in Langfuse.

required
dataset_description str

The description of the dataset. Defaults to an empty string.

''
mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

None
metadata dict | None

Optional metadata for the dataset. Defaults to None.

None
is_append bool

If True, append items to an existing dataset. If False, create the dataset only if it does not already exist. Defaults to False.

False
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None

Returns:

Name Type Description
LangfuseDataset LangfuseDataset

The created dataset.
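One reading of the `is_append` flag is sketched below with an in-memory stand-in for Langfuse. The `registry` dictionary, the `upload` function, and the `query` field name are all assumptions made for illustration; the real method talks to the Langfuse API and may differ in detail, particularly in what it does when the dataset exists and `is_append` is False.

```python
from typing import Any

# Hypothetical in-memory stand-in for the datasets stored in Langfuse.
registry: dict[str, list[dict[str, Any]]] = {}


def upload(name: str, items: list[dict[str, Any]], is_append: bool = False) -> None:
    """Mimic the documented is_append behaviour on the in-memory registry."""
    if name not in registry:
        # The dataset does not exist yet: create it in both modes.
        registry[name] = list(items)
    elif is_append:
        # The dataset exists and appending was requested: add the new items.
        registry[name].extend(items)
    # Otherwise the dataset exists and is_append is False: leave it untouched.


upload("qa", [{"query": "1+1?", "expected_response": "2"}])
upload("qa", [{"query": "2+2?", "expected_response": "4"}])  # ignored: already exists
upload("qa", [{"query": "3+3?", "expected_response": "6"}], is_append=True)
print(len(registry["qa"]))
```

Under this reading, re-running an upload script with `is_append=False` does not duplicate items, while `is_append=True` always adds them.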

from_gsheets(sheet_id, worksheet_name, client_email, private_key, langfuse_client, dataset_name=None, dataset_description='', mapping=None, metadata=None, is_append=False, attachments_config=None) async staticmethod

Create a LangfuseDataset from Google Sheets.

Parameters:

Name Type Description Default
sheet_id str

The ID of the Google Sheet.

required
worksheet_name str

The name of the worksheet within the Google Sheet.

required
client_email str

The client email for Google Sheets API.

required
private_key str

Base64-encoded private key for Google Sheets API.

required
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name str | None

The name of the dataset in Langfuse.

None
dataset_description str

The description of the dataset. Defaults to an empty string.

''
mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

None
metadata dict | None

Optional metadata for the dataset. Defaults to None.

None
is_append bool

If True, append items to an existing dataset. If False, create the dataset only if it does not already exist. Defaults to False.

False
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None

Returns:

Name Type Description
LangfuseDataset LangfuseDataset

The created dataset.
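Two details of `from_gsheets` are worth a sketch: `private_key` is expected to be Base64-encoded, and the method is a coroutine that must be awaited. The example below assumes the expected value is the Base64 encoding of the PEM key text, which the reference does not state explicitly. `fake_from_gsheets` is a hypothetical stand-in so the snippet runs without network access or the library itself, and the key is a placeholder.

```python
import asyncio
import base64

# Placeholder PEM text, not a real key.
pem_key = "-----BEGIN PRIVATE KEY-----\nplaceholder\n-----END PRIVATE KEY-----\n"

# Base64-encode the key text so it can be passed as a single-line string.
encoded_key = base64.b64encode(pem_key.encode("utf-8")).decode("ascii")


async def fake_from_gsheets(sheet_id: str, worksheet_name: str, private_key: str) -> str:
    """Hypothetical stand-in for an async loader: decode the key, return a label."""
    decoded = base64.b64decode(private_key).decode("utf-8")
    if not decoded.startswith("-----BEGIN PRIVATE KEY-----"):
        raise ValueError("private_key does not decode to PEM text")
    return f"{sheet_id}/{worksheet_name}"


# A coroutine has to be awaited; asyncio.run does that from synchronous code.
result = asyncio.run(fake_from_gsheets("sheet-id", "Sheet1", encoded_key))
print(result)
```

Inside an already running event loop (for example in a notebook), `await` the call directly instead of using `asyncio.run`.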

from_jsonl(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, attachments_config=None, **kwargs) staticmethod

Create a LangfuseDataset from a JSONL file.

Parameters:

Name Type Description Default
path str

The path to the JSONL file.

required
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name str | None

The name of the dataset in Langfuse.

None
dataset_description str

The description of the dataset. Defaults to an empty string.

''
metadata dict | None

Optional metadata for the dataset. Defaults to None.

None
is_append bool

If True, append items to an existing dataset. If False, create the dataset only if it does not already exist. Defaults to False.

False
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None
**kwargs Any

Additional arguments passed to the constructor. Deprecated: use attachments_config instead.

{}

Returns:

Name Type Description
LangfuseDataset LangfuseDataset

The created dataset.
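For readers unfamiliar with JSONL, the sketch below writes and re-reads a small file with one JSON object per line. The `expected_response` key matches the documented default for `expected_output_key`; the `query` key and the file name are assumptions, since the required fields of `MetricInput` are not listed here.

```python
import json
import tempfile
from pathlib import Path

rows = [
    {"query": "What is the capital of France?", "expected_response": "Paris"},
    {"query": "What is 2 + 2?", "expected_response": "4"},
]

# Write one JSON object per line, which is what the JSONL format requires.
path = Path(tempfile.mkdtemp()) / "qa_pairs.jsonl"
with path.open("w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Read the file back to confirm that every non-empty line parses to a dict.
with path.open(encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f if line.strip()]

print(loaded == rows)

# With the library available, the file could then be passed to the documented method:
# dataset = LangfuseDataset.from_jsonl(str(path), langfuse_client=langfuse_client)
```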

from_langfuse(langfuse_client, dataset_name, mapping=None, attachments_config=None) staticmethod

Load a dataset from Langfuse.

Parameters:

Name Type Description Default
langfuse_client Langfuse

The Langfuse client instance.

required
dataset_name str

The name of the dataset in Langfuse.

required
mapping dict[str, Any] | None

Optional mapping for field keys. Defaults to None.

None
attachments_config AttachmentConfig | dict[str, Any] | None

Configuration for loading attachments. Defaults to None.

None

Returns:

Name Type Description
LangfuseDataset LangfuseDataset

The loaded dataset.

Raises:

Type Description
ValueError

If the dataset is not found or has no data.
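Because `from_langfuse` is documented to raise `ValueError` when the dataset is missing or empty, callers may want to catch that case explicitly. The sketch below shows one way to do so; `load_or_none` and `missing_dataset` are hypothetical names, and the stand-in loader only imitates the documented failure so the snippet runs without a Langfuse connection.

```python
from typing import Any, Callable, Optional


def load_or_none(loader: Callable[[], Any]) -> Optional[Any]:
    """Hypothetical helper: call a loader and return None if it raises ValueError."""
    try:
        return loader()
    except ValueError as exc:
        print(f"Dataset unavailable: {exc}")
        return None


def missing_dataset() -> Any:
    """Stand-in loader that fails the way from_langfuse is documented to fail."""
    raise ValueError("dataset 'qa' not found")


result = load_or_none(missing_dataset)

# With the library available, the loader could wrap the documented call:
# load_or_none(lambda: LangfuseDataset.from_langfuse(langfuse_client, "qa"))
```

Catching only `ValueError` keeps other failures, such as network errors, visible to the caller.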

load()

Load the dataset.

Returns:

Type Description
List[MetricInput]

The loaded dataset with proper Langfuse structure.

prepare_row_for_inference(row, dataset_name=None, dataset_item_id=None, metadata=None, **kwargs) async

Prepare row for inference by syncing with Langfuse.

This creates or syncs the dataset item in Langfuse before inference.

Parameters:

Name Type Description Default
row dict[str, Any]

The row to prepare.

required
dataset_name str | None

Name of the Langfuse dataset. Uses self._dataset_name if None.

None
dataset_item_id str | None

Optional dataset item ID.

None
metadata dict[str, Any] | None

Additional metadata.

None
**kwargs Any

Additional arguments.

{}

Returns:

Type Description
dict[str, Any]

The prepared row, potentially with item_id added.
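Since `prepare_row_for_inference` is asynchronous, several rows can be prepared concurrently. The sketch below uses a hypothetical stand-in, `fake_prepare_row`, that copies a row and attaches an `item_id`; the real method syncs the item with Langfuse and may or may not add that key. The `query` field and the generated IDs are assumptions for illustration.

```python
import asyncio
import uuid
from typing import Any, Optional


async def fake_prepare_row(
    row: dict[str, Any], dataset_item_id: Optional[str] = None
) -> dict[str, Any]:
    """Hypothetical stand-in: return a copy of the row with an item_id attached."""
    prepared = dict(row)  # copy so the caller's row is not mutated
    prepared["item_id"] = dataset_item_id or uuid.uuid4().hex
    return prepared


rows = [{"query": "1+1?"}, {"query": "2+2?"}]


async def main() -> list[dict[str, Any]]:
    # Schedule all rows at once and wait for every result, preserving order.
    return await asyncio.gather(*(fake_prepare_row(r) for r in rows))


prepared = asyncio.run(main())
print([r["query"] for r in prepared])
```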

to_standard_format()

Convert Langfuse format dataset to standard format.

Returns:

Type Description
list[MetricInput]

Dataset in standard format.

validate()

Validate the dataset.

Raises:

Type Description
ValueError

If the dataset is not a list of MetricInput or if required fields are missing.