Hf dataset

Hugging Face Dataset Class.

This class is a wrapper around the Hugging Face dataset class.

Authors

Surya Mahadi (made.r.s.mahadi@gdplabs.id)

HuggingFaceDataset(dataset, dataset_name=None, attachments_config=None)

Bases: BaseDataset

Hugging Face dataset class for the evaluator.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `dataset` | `list[MetricInput]` | The dataset to use for the evaluation. |

Initialize the HuggingFaceDataset class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `Dataset` | The dataset to use for the evaluation. | required |
| `dataset_name` | `str \| None` | The name of the dataset. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |

from_hub(path_or_name, split, dataset_name=None, attachments_config=None, **kwargs) staticmethod

Create a HuggingFaceDataset by loading a Hugging Face dataset from its path or name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path_or_name` | `str` | The path or name of the dataset. | required |
| `split` | `str` | The split of the dataset. | required |
| `dataset_name` | `str \| None` | The name of the dataset. If None, falls back to `path_or_name`. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments to pass to the load function. | `{}` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `HuggingFaceDataset` | `HuggingFaceDataset` | The created dataset. |
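
The two behaviours documented for `from_hub` (the `dataset_name` fallback and the forwarding of `**kwargs`) can be illustrated with a minimal sketch. This is not the library's implementation: `SketchDataset` and `fake_load` are hypothetical stand-ins so the example runs offline, and `revision` is only an illustrative keyword argument.

```python
from typing import Any, Optional


def fake_load(path_or_name: str, split: str, **kwargs: Any) -> list:
    """Hypothetical stand-in for the real load function (no network access)."""
    # Echo the arguments back so the forwarding is easy to inspect.
    return [{"path": path_or_name, "split": split, "kwargs": kwargs}]


class SketchDataset:
    """Hypothetical stand-in for HuggingFaceDataset; not the library's code."""

    def __init__(
        self,
        dataset: list,
        dataset_name: Optional[str] = None,
        attachments_config: Optional[dict] = None,
    ) -> None:
        self.dataset = dataset
        self.dataset_name = dataset_name
        self.attachments_config = attachments_config

    @staticmethod
    def from_hub(
        path_or_name: str,
        split: str,
        dataset_name: Optional[str] = None,
        attachments_config: Optional[dict] = None,
        **kwargs: Any,
    ) -> "SketchDataset":
        # Extra keyword arguments go straight to the load function.
        rows = fake_load(path_or_name, split, **kwargs)
        # Fall back to path_or_name only when no explicit name was given.
        name = path_or_name if dataset_name is None else dataset_name
        return SketchDataset(rows, dataset_name=name, attachments_config=attachments_config)


ds = SketchDataset.from_hub("org/some-dataset", "test", revision="main")
print(ds.dataset_name)          # the fallback name
print(ds.dataset[0]["kwargs"])  # the forwarded keyword arguments
```

Checking `dataset_name is None` rather than truthiness keeps an explicitly passed empty string from being replaced by the fallback.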

from_list(dataset, dataset_name=None, attachments_config=None) staticmethod

Create a HuggingFaceDataset from a list of MetricInput.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `list[MetricInput]` | The dataset to create. | required |
| `dataset_name` | `str \| None` | The name of the dataset. Defaults to None. | `None` |
| `attachments_config` | `AttachmentConfig \| dict[str, Any] \| None` | Configuration for loading attachments. Defaults to None. | `None` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `HuggingFaceDataset` | `HuggingFaceDataset` | The created dataset. |
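
As a rough illustration of the `from_list` factory pattern, the sketch below builds a dataset object from in-memory rows and reads them back. It assumes a `MetricInput` behaves like a plain dict; `SketchDataset` and the field names `query` and `expected_response` are hypothetical and are not taken from the library.

```python
from typing import Optional

# Assumption: a MetricInput behaves like a plain dict of field names to values.
MetricInput = dict


class SketchDataset:
    """Hypothetical stand-in for HuggingFaceDataset; not the library's code."""

    def __init__(
        self,
        dataset: list,
        dataset_name: Optional[str] = None,
        attachments_config: Optional[dict] = None,
    ) -> None:
        self.dataset = dataset
        self.dataset_name = dataset_name
        self.attachments_config = attachments_config

    @staticmethod
    def from_list(
        dataset: list,
        dataset_name: Optional[str] = None,
        attachments_config: Optional[dict] = None,
    ) -> "SketchDataset":
        # Copy the list so later changes to the caller's list do not leak in.
        return SketchDataset(list(dataset), dataset_name, attachments_config)

    def load(self) -> list:
        # Return the rows held by this dataset.
        return self.dataset


rows = [{"query": "What is 2 + 2?", "expected_response": "4"}]
ds = SketchDataset.from_list(rows, dataset_name="toy")
loaded = ds.load()
```

A static factory like this keeps the constructor free to accept the wrapped dataset type while still offering a convenient entry point for plain Python lists.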

load()

Load the dataset.

Returns:

| Type | Description |
| --- | --- |
| `list[MetricInput]` | The loaded dataset. |

validate()

Validate the dataset.

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the dataset is not a list of MetricInput. |
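
The check described for `validate()` can be sketched as a standalone function. This is an assumption-laden illustration, not the library's code: `validate_rows` is a hypothetical helper, and a `MetricInput` is assumed to be dict-like.

```python
from typing import Any


def validate_rows(rows: Any) -> None:
    """Raise ValueError unless rows is a list whose items are all dicts.

    Hypothetical helper; a MetricInput is assumed to be dict-like here.
    """
    if not isinstance(rows, list):
        raise ValueError(f"Expected a list of rows, got {type(rows).__name__}.")
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"Row {index} is not a dict: {type(row).__name__}.")


# A well-formed dataset passes silently.
validate_rows([{"query": "hello"}])

# A malformed dataset is rejected with a ValueError.
try:
    validate_rows([{"query": "hello"}, "not a row"])
    rejected = False
except ValueError as error:
    rejected = True
    message = str(error)
```

Reporting the index of the offending row in the error message makes malformed entries easier to locate in larger datasets.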