Langfuse dataset
Langfuse Dataset Class.
This class is a wrapper around the Langfuse dataset class. It can load a dataset from Langfuse and upload data to Langfuse.
LangfuseDataset(dataset, langfuse_client, dataset_name=None, expected_output_key='expected_response', mapping=None, attachments_config=None)
Bases: BaseDataset
Langfuse dataset class for the evaluator.
Attributes:
| Name | Type | Description |
|---|---|---|
| `dataset` | `list[MetricInput]` | The dataset to use for the evaluation. |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. |
| `dataset_name` | `str` | The name of the dataset in Langfuse. |
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to "expected_response". |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. |
Initialize the LangfuseDataset class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `List[MetricInput]` | The dataset to use for the evaluation. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `Optional[str]` | The name of the dataset in Langfuse. | `None` |
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to "expected_response". | `'expected_response'` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
convert_to_standard_dataset(expected_output_key=None, mapping=None)
Convert the dataset to standard data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to None. | `None` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
Returns:
| Type | Description |
|---|---|
| `List[MetricInput]` | The converted dataset. |
from_csv(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, attachments_config=None, **kwargs)
staticmethod
Create a LangfuseDataset from a CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the CSV file. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name to register this dataset under in Langfuse. If None, defaults to the CSV filename without extension. | `None` |
| `dataset_description` | `str` | The description of the dataset. Defaults to an empty string. | `''` |
| `metadata` | `dict` | Optional metadata for the dataset. Defaults to None. | `None` |
| `is_append` | `bool` | If True, append items to the existing dataset. If False, only create the dataset if it doesn't exist. | `False` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to pandas `read_csv`. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LangfuseDataset` | `LangfuseDataset` | The created dataset. |
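When `dataset_name` is None, `from_csv` falls back to the CSV filename without its extension. The fallback can be sketched as follows, assuming it behaves like `pathlib.Path.stem`. The helper `default_dataset_name` and the file path are hypothetical and not part of the library.

```python
from __future__ import annotations

from pathlib import Path


def default_dataset_name(path: str, dataset_name: str | None = None) -> str:
    """Hypothetical helper mirroring the documented fallback for dataset_name."""
    # An explicitly supplied name always wins.
    if dataset_name is not None:
        return dataset_name
    # Otherwise use the file name without its final extension (assumption).
    return Path(path).stem


name_from_file = default_dataset_name("data/qa_pairs.csv")
name_explicit = default_dataset_name("data/qa_pairs.csv", "my-eval-set")
print(name_from_file, name_explicit)
```

Passing an explicit `dataset_name` is the safer choice when several CSV files share the same filename in different directories.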
from_dict(dataset, langfuse_client, dataset_name, dataset_description='', mapping=None, metadata=None, is_append=False, attachments_config=None)
staticmethod
Create a LangfuseDataset from a list of MetricInput.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `List[MetricInput]` | The dataset to create. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | required |
| `dataset_description` | `str` | The description of the dataset. Defaults to an empty string. | `''` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
| `metadata` | `dict` | Optional metadata for the dataset. Defaults to None. | `None` |
| `is_append` | `bool` | If True, append items to the existing dataset. If False, only create the dataset if it doesn't exist. | `False` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LangfuseDataset` | `LangfuseDataset` | The created dataset. |
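The `is_append` flag decides what happens when a dataset with the same name already exists. The sketch below models one reading of the documented behaviour with an in-memory dictionary standing in for Langfuse. It assumes that an existing dataset is left untouched when `is_append` is False; the function `sync_items`, the `store` dictionary, and the `query` key are hypothetical.

```python
from __future__ import annotations

from typing import Any


def sync_items(
    store: dict[str, list[dict[str, Any]]],
    dataset_name: str,
    items: list[dict[str, Any]],
    is_append: bool = False,
) -> None:
    """Hypothetical stand-in for the upload step; `store` plays the role of Langfuse."""
    if dataset_name not in store:
        # The dataset does not exist yet: create it with the given items.
        store[dataset_name] = list(items)
    elif is_append:
        # The dataset exists and is_append=True: add the new items to it.
        store[dataset_name].extend(items)
    # The dataset exists and is_append=False: leave it untouched (assumption).


store: dict[str, list[dict[str, Any]]] = {}
sync_items(store, "qa", [{"query": "a"}])                  # creates the dataset
sync_items(store, "qa", [{"query": "b"}])                  # ignored
sync_items(store, "qa", [{"query": "c"}], is_append=True)  # appended
print(store["qa"])
```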
from_gsheets(sheet_id, worksheet_name, client_email, private_key, langfuse_client, dataset_name=None, dataset_description='', mapping=None, metadata=None, is_append=False, attachments_config=None)
async
staticmethod
Create a LangfuseDataset from Google Sheets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sheet_id` | `str` | The ID of the Google Sheet. | required |
| `worksheet_name` | `str` | The name of the worksheet within the Google Sheet. | required |
| `client_email` | `str` | The client email for Google Sheets API. | required |
| `private_key` | `str` | Base64-encoded private key for Google Sheets API. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | `None` |
| `dataset_description` | `str` | The description of the dataset. Defaults to an empty string. | `''` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
| `metadata` | `dict` | Optional metadata for the dataset. Defaults to None. | `None` |
| `is_append` | `bool` | If True, append items to the existing dataset. If False, only create the dataset if it doesn't exist. | `False` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LangfuseDataset` | `LangfuseDataset` | The created dataset. |
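Because `private_key` must be Base64-encoded, the PEM key from a service account usually has to be encoded before it is passed in. A minimal sketch using only the standard library follows. It assumes standard Base64 over the UTF-8 PEM text, which the reference above does not spell out; the key shown is a placeholder, and the variable names are illustrative.

```python
import base64

# Placeholder PEM text; a real key comes from the service account credentials.
pem_key = "-----BEGIN PRIVATE KEY-----\nnot-a-real-key\n-----END PRIVATE KEY-----\n"

# Encode the PEM text into a single-line ASCII string.
encoded_key = base64.b64encode(pem_key.encode("utf-8")).decode("ascii")

# Decoding restores the original PEM text unchanged.
decoded_key = base64.b64decode(encoded_key).decode("utf-8")
print(decoded_key == pem_key)
```

Since `from_gsheets` is asynchronous, it has to be awaited from a coroutine or run through an event loop such as `asyncio.run`.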
from_jsonl(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, attachments_config=None, **kwargs)
staticmethod
Create a LangfuseDataset from a JSONL file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the JSONL file. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | `None` |
| `dataset_description` | `str` | The description of the dataset. Defaults to an empty string. | `''` |
| `metadata` | `dict` | Optional metadata for the dataset. Defaults to None. | `None` |
| `is_append` | `bool` | If True, append items to the existing dataset. If False, only create the dataset if it doesn't exist. | `False` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to the constructor (deprecated, use `attachments_config` instead). | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LangfuseDataset` | `LangfuseDataset` | The created dataset. |
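A JSONL file holds one JSON object per line, each object becoming one dataset row. The sketch below writes a small sample file and reads it back with the standard library, to show the shape of input `from_jsonl` expects. The reader `read_jsonl`, the file name, and the `query` key are hypothetical; `expected_response` is the default expected output key documented above. How the library itself parses the file is not stated here.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: str) -> list:
    """Hypothetical reader: parse one JSON object per non-empty line."""
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                rows.append(json.loads(line))
    return rows


sample = [
    {"query": "What is 2 + 2?", "expected_response": "4"},
    {"query": "Capital of France?", "expected_response": "Paris"},
]

with tempfile.TemporaryDirectory() as tmp:
    jsonl_path = Path(tmp) / "qa_pairs.jsonl"
    # Serialize each row on its own line.
    jsonl_path.write_text(
        "\n".join(json.dumps(row) for row in sample) + "\n", encoding="utf-8"
    )
    rows = read_jsonl(str(jsonl_path))

print(len(rows))
```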
from_langfuse(langfuse_client, dataset_name, mapping=None, attachments_config=None)
staticmethod
Load a dataset from Langfuse.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | required |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `LangfuseDataset` | `LangfuseDataset` | The loaded dataset. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the dataset is not found or has no data. |
load()
Load the dataset.
Returns:
| Type | Description |
|---|---|
| `List[MetricInput]` | The loaded dataset with proper Langfuse structure. |
prepare_row_for_inference(row, dataset_name=None, dataset_item_id=None, metadata=None, **kwargs)
async
Prepare row for inference by syncing with Langfuse.
This creates or syncs the dataset item in Langfuse before inference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `row` | `dict[str, Any]` | The row to prepare. | required |
| `dataset_name` | `str \| None` | Name of the Langfuse dataset. Uses `self._dataset_name` if None. | `None` |
| `dataset_item_id` | `str \| None` | Optional dataset item ID. | `None` |
| `metadata` | `dict[str, Any] \| None` | Additional metadata. | `None` |
| `**kwargs` | `Any` | Additional arguments. | `{}` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | The prepared row, potentially with `item_id` added. |
to_standard_format()
Convert Langfuse format dataset to standard format.
Returns:
| Type | Description |
|---|---|
| `list[MetricInput]` | Dataset in standard format. |
validate()
Validate the dataset.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the dataset is not a list of MetricInput or if required fields are missing. |
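The two failure conditions above (wrong container type, missing required fields) can be illustrated with a small stand-alone check. This is a sketch only: it assumes each `MetricInput` is dict-like, and the function `validate_rows` and the required field name `query` are hypothetical, since the reference does not list the required fields.

```python
from __future__ import annotations

from typing import Any


def validate_rows(dataset: Any, required_fields: tuple[str, ...] = ("query",)) -> None:
    """Hypothetical check in the spirit of validate(); field names are assumed."""
    if not isinstance(dataset, list):
        raise ValueError("dataset must be a list of rows")
    for index, row in enumerate(dataset):
        if not isinstance(row, dict):
            raise ValueError(f"row {index} is not a mapping")
        # Collect every required field that this row lacks.
        missing = [field for field in required_fields if field not in row]
        if missing:
            raise ValueError(f"row {index} is missing required fields: {missing}")


validate_rows([{"query": "What is 2 + 2?"}])  # passes silently

try:
    validate_rows([{"expected_response": "4"}])
except ValueError as error:
    message = str(error)
    print(message)
```

Calling the real `validate()` inside a `try`/`except ValueError` block in the same way lets a pipeline report malformed rows before any inference is run.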