Dict dataset
Dict-Based Dataset.
DictDataset(dataset, dataset_name=None, attachments_config=None)
Bases: BaseDataset
Dict-Based Dataset.
A subclass of BaseDataset that stores the dataset in memory as a list of dictionaries.
Attributes:
| Name | Type | Description |
|---|---|---|
| `dataset` | `list[dict]` | The dataset to evaluate. |

Initialize the DictDataset class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `list[MetricInput]` | The dataset to use for the evaluation. | required |
| `dataset_name` | `str \| None` | The name of the dataset. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |

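
As a rough illustration of the input the constructor expects, the sketch below builds a small list of dictionaries and gathers the three documented arguments. The field names (`query`, `expected_response`) and the `constructor_args` helper are hypothetical: this page does not define the `MetricInput` schema or the package import path, so the actual call appears only as a comment.

```python
from typing import Any, Optional

# Hypothetical rows: the keys are placeholders, not the real MetricInput schema.
rows: list[dict[str, Any]] = [
    {"query": "What is 2 + 2?", "expected_response": "4"},
    {"query": "What is the capital of France?", "expected_response": "Paris"},
]


def constructor_args(
    dataset: list[dict[str, Any]],
    dataset_name: Optional[str] = None,
    attachments_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the documented parameters and defaults."""
    return {
        "dataset": dataset,
        "dataset_name": dataset_name,
        "attachments_config": attachments_config,
    }


args = constructor_args(rows, dataset_name="toy_qa")

# With DictDataset imported from the package, the documented call would be:
# ds = DictDataset(**args)
```
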
from_csv(path, dataset_name=None, attachments_config=None, **kwargs)
classmethod
Load a dataset from a CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the CSV file. | required |
| `dataset_name` | `str \| None` | The name of the dataset. If None, defaults to filename. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to pandas read_csv. | `{}` |

Returns:
| Name | Type | Description |
|---|---|---|
| `DictDataset` | `DictDataset` | The loaded dataset. |

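
To make the row-per-record shape concrete, here is a minimal sketch that writes a tiny semicolon-delimited CSV and reads it back as a list of dictionaries. It uses the standard-library `csv` module as a stand-in for the pandas-based loader, so it runs without the package installed. The `read_rows` helper and the column names are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path: str, delimiter: str = ",") -> list[dict[str, str]]:
    """Stand-in loader: read a CSV file into one dict per row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "toy_qa.csv"
    # Hypothetical columns; the real MetricInput fields may differ.
    csv_path.write_text(
        "query;expected_response\nWhat is 2 + 2?;4\n", encoding="utf-8"
    )
    rows = read_rows(str(csv_path), delimiter=";")

# With the library imported, the documented equivalent would be:
# ds = DictDataset.from_csv("toy_qa.csv", sep=";")  # sep is forwarded to pandas read_csv
```

Because extra keyword arguments are passed through to pandas `read_csv`, options such as `sep` or `encoding` would be set on the `from_csv` call itself.
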
from_gsheets(sheet_id, worksheet_name, client_email, private_key, dataset_name=None, attachments_config=None)
async
staticmethod
Load a dataset from Google Sheets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `sheet_id` | `str` | The ID of the Google Sheet. | required |
| `worksheet_name` | `str` | The name of the worksheet within the Google Sheet. | required |
| `client_email` | `str` | The client email for Google Sheets API. | required |
| `private_key` | `str` | Base64-encoded private key for Google Sheets API. | required |
| `dataset_name` | `str \| None` | The name of the dataset. If None, defaults to worksheet_name. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |

Returns:
| Name | Type | Description |
|---|---|---|
| `DictDataset` | `DictDataset` | The loaded dataset. |

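
Two details of this loader are easy to miss: `private_key` must be Base64-encoded, and the method is `async`, so it has to be awaited. The sketch below shows both with the standard library only. It assumes the whole PEM text is what gets encoded, which this page does not confirm, and `fake_from_gsheets` is a hypothetical stand-in that only decodes the key instead of contacting Google Sheets.

```python
import asyncio
import base64

# Placeholder PEM text; a real service-account key would go here.
pem_key = "-----BEGIN PRIVATE KEY-----\nnot-a-real-key\n-----END PRIVATE KEY-----\n"

# Assumption: the Base64 form of the full PEM text is what `private_key` expects.
encoded_key = base64.b64encode(pem_key.encode("utf-8")).decode("ascii")


async def fake_from_gsheets(
    sheet_id: str, worksheet_name: str, client_email: str, private_key: str
) -> dict:
    """Hypothetical stand-in for the async loader; it only decodes the key."""
    decoded = base64.b64decode(private_key).decode("utf-8")
    return {"worksheet": worksheet_name, "key_ok": decoded.startswith("-----BEGIN")}


result = asyncio.run(
    fake_from_gsheets("sheet-id", "Sheet1", "svc@example.com", encoded_key)
)

# With the library imported, the real call would be awaited in the same way:
# ds = await DictDataset.from_gsheets(sheet_id, worksheet_name, client_email, encoded_key)
```
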
from_huggingface_hub(path_or_name, split, dataset_name=None, attachments_config=None, **kwargs)
staticmethod
Load a dataset from HuggingFace Hub.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path_or_name` | `str` | The path or name of the dataset on HuggingFace Hub. | required |
| `split` | `str` | The split of the dataset (e.g. "train", "test"). | required |
| `dataset_name` | `str \| None` | The name of the dataset. If None, defaults to path_or_name. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to datasets.load_dataset. | `{}` |

Returns:
| Name | Type | Description |
|---|---|---|
| `DictDataset` | `DictDataset` | The loaded dataset. |

from_jsonl(path, dataset_name=None, attachments_config=None, **kwargs)
classmethod
Load a dataset from a JSONL file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the JSONL file. | required |
| `dataset_name` | `str \| None` | The name of the dataset. If None, defaults to filename. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to the constructor (deprecated, use attachments_config instead). | `{}` |

Returns:
| Name | Type | Description |
|---|---|---|
| `DictDataset` | `DictDataset` | The loaded dataset. |

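
For readers unfamiliar with the format, a JSONL file holds one JSON object per line, and each line becomes one record. The following sketch writes two records and parses them back with the standard library. `read_jsonl` is a hypothetical stand-in for the loader, and the field names are placeholders rather than the real `MetricInput` schema.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: str) -> list[dict]:
    """Stand-in loader: parse one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


records = [
    {"query": "What is 2 + 2?", "expected_response": "4"},
    {"query": "What is the capital of France?", "expected_response": "Paris"},
]

with tempfile.TemporaryDirectory() as tmp:
    jsonl_path = Path(tmp) / "toy_qa.jsonl"
    # One JSON document per line, terminated by a newline.
    jsonl_path.write_text(
        "\n".join(json.dumps(record) for record in records) + "\n",
        encoding="utf-8",
    )
    rows = read_jsonl(str(jsonl_path))

# With the library imported, the documented equivalent would be:
# ds = DictDataset.from_jsonl("toy_qa.jsonl")
```
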
from_langfuse(langfuse_client, dataset_name, attachments_config=None)
staticmethod
Load a dataset from Langfuse (read-only).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `langfuse_client` | `Any` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | required |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |

Returns:
| Name | Type | Description |
|---|---|---|
| `DictDataset` | `DictDataset` | The loaded dataset. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If the dataset is not found or has no items. |

load()
Load the dataset.
Returns:
| Type | Description |
|---|---|
| `list[MetricInput]` | The loaded dataset. |

validate()
Validate the dataset.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the dataset is not a list of MetricInput. |

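
The kind of check described for `validate()` can be sketched as follows. This is not the library's implementation: `check_rows` is a hypothetical stand-in that treats each record as a plain `dict`, because this page does not define the `MetricInput` schema, and the error message is invented for the example.

```python
from typing import Any


def check_rows(dataset: Any) -> None:
    """Stand-in check: require a list whose items are all dicts."""
    if not isinstance(dataset, list) or not all(
        isinstance(row, dict) for row in dataset
    ):
        raise ValueError("dataset must be a list of dict records")


# A well-formed dataset passes silently.
check_rows([{"query": "What is 2 + 2?"}])

# A malformed dataset raises ValueError, which callers can catch.
try:
    check_rows("not a list")
except ValueError as error:
    message = str(error)
```
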