Utils

Utility functions and helpers for GLLM training.

This module contains general-purpose utilities used across the GLLM training framework.

Authors
  • Alfan Dinda Rahmawan (alfan.d.rahmawan@gdplabs.id)
Reviewer
  • Muhammad Afif Al Hawari (muhammad.a.a.hawari@gdplabs.id)
References

NONE

ComponentsValidator(components=None)

Validator for training components and paths.

This class provides validation methods for models, paths, and other components required for training.

Initialize the validator with training components.

Parameters:

Name | Type | Description | Default
components | Optional[Any] | Training components to validate. | None

update_components(components)

Update the components reference.

Parameters:

Name | Type | Description | Default
components | Any | New training components to validate. | required

validate_components_initialized()

Ensure required components are initialized.

Raises:

Type | Description
RuntimeError | If components are not properly initialized.

validate_model_for_saving()

Validate that the model is ready for saving.

Raises:

Type | Description
RuntimeError | If the model is not ready for saving.
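
The checks above can be pictured with a minimal sketch. This is not the GLLM implementation: the class name SketchValidator, the error messages, and the assumption that the components object exposes the model as a `model` attribute are all hypothetical and only illustrate the documented behavior (raise RuntimeError when a precondition is not met).

```python
from types import SimpleNamespace
from typing import Any, Optional


class SketchValidator:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self, components: Optional[Any] = None) -> None:
        self.components = components

    def update_components(self, components: Any) -> None:
        # Swap in a new components reference to validate.
        self.components = components

    def validate_components_initialized(self) -> None:
        if self.components is None:
            raise RuntimeError("Training components are not initialized.")

    def validate_model_for_saving(self) -> None:
        self.validate_components_initialized()
        # Assumption: the components object exposes the model as `.model`.
        if getattr(self.components, "model", None) is None:
            raise RuntimeError("Model is not ready for saving.")


validator = SketchValidator()
try:
    validator.validate_components_initialized()
    first_error = None
except RuntimeError as exc:
    first_error = str(exc)

# After components are supplied, both checks pass without raising.
validator.update_components(SimpleNamespace(model=object()))
validator.validate_model_for_saving()
```

Raising early keeps the failure close to its cause, instead of surfacing later as an attribute error in the middle of a save.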

validate_model_path(model_path) staticmethod

Validate that the model path exists and is accessible.

Parameters:

Name | Type | Description | Default
model_path | str | Path to the model directory. | required

Raises:

Type | Description
FileNotFoundError | If the model path doesn't exist.
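
A path check of this kind might look like the sketch below. The helper name check_model_path is hypothetical and the body is an assumption based only on the description above (a directory path that must exist); the actual static method may perform additional accessibility checks.

```python
import os
import tempfile


def check_model_path(model_path: str) -> None:
    """Hypothetical helper: raise FileNotFoundError if the directory is missing."""
    if not os.path.isdir(model_path):
        raise FileNotFoundError(f"Model path does not exist: {model_path}")


with tempfile.TemporaryDirectory() as tmp_dir:
    # An existing directory passes without raising.
    check_model_path(tmp_dir)

    # A path that was never created is rejected.
    missing = os.path.join(tmp_dir, "no-such-model")
    try:
        check_model_path(missing)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
```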

FileLoggerCallback(log_path='logs/train_steps.jsonl')

Bases: TrainerCallback

Callback for logging training metrics to a JSONL file.

This callback writes training logs to a file in JSON Lines (JSONL) format, capturing step-level metrics, epoch information, and other training statistics. Entries are written incrementally as training progresses.

Attributes:

Name | Type | Description
log_path | - | Path to the JSONL log file where training metrics will be written.
f | - | File handle for the log file (opened during training, closed at end).

Initialize FileLoggerCallback.

Parameters:

Name | Type | Description | Default
log_path | - | Path to the JSONL file where training logs will be written. Defaults to "logs/train_steps.jsonl". The directory will be created if it does not exist. | 'logs/train_steps.jsonl'

on_log(args, state, control, logs=None, **kwargs)

Called every logging_steps steps during training.

Writes the current training metrics to the log file in JSONL format. Each log entry includes the global step, epoch, and all metrics from the logs dictionary.

Parameters:

Name | Type | Description | Default
args | - | Training arguments. | required
state | - | Training state containing global_step and epoch information. | required
control | - | Training control object. | required
logs | - | Dictionary containing training metrics to log. Defaults to None. | None
**kwargs | - | Additional keyword arguments. | {}
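
Because each entry is a single JSON object on its own line, the log can be read back one line at a time. The sketch below assumes the key names "step" and "epoch" for the global step and epoch; the source does not state the exact keys, so treat build_record and read_jsonl as hypothetical helpers rather than part of the module.

```python
import json
import os
import tempfile


def build_record(global_step, epoch, logs=None):
    # Assumed layout: step and epoch first, then every metric from `logs`.
    record = {"step": global_step, "epoch": epoch}
    record.update(logs or {})
    return record


def read_jsonl(path):
    # Parse one JSON object per non-empty line.
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "train_steps.jsonl")
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(build_record(10, 0.5, {"loss": 1.25})) + "\n")
        handle.write(json.dumps(build_record(20, 1.0, {"loss": 0.75})) + "\n")

    records = read_jsonl(path)
    losses = [record["loss"] for record in records]
```

Reading line by line also works while training is still running, since every line written so far is a complete JSON document.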

on_train_begin(args, state, control, **kwargs)

Called when training begins.

Opens the log file in append mode for writing training metrics.

Parameters:

Name | Type | Description | Default
args | - | Training arguments. | required
state | - | Training state. | required
control | - | Training control object. | required
**kwargs | - | Additional keyword arguments. | {}

on_train_end(args, state, control, **kwargs)

Called when training ends.

Closes the log file handle if it was opened.

Parameters:

Name | Type | Description | Default
args | - | Training arguments. | required
state | - | Training state. | required
control | - | Training control object. | required
**kwargs | - | Additional keyword arguments. | {}
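
The lifecycle described for the three hooks (open in append mode when training begins, write one line per log event, close when training ends) can be sketched as follows. JsonlLoggerSketch is a hypothetical stand-in that deliberately does not subclass TrainerCallback so it runs without extra dependencies; the record keys "step" and "epoch" and the flush after each write are assumptions, not confirmed details of FileLoggerCallback.

```python
import json
import os
import tempfile
from types import SimpleNamespace


class JsonlLoggerSketch:
    """Stand-in that mirrors the documented hooks and attributes."""

    def __init__(self, log_path="logs/train_steps.jsonl"):
        self.log_path = log_path
        self.f = None
        # Create the parent directory if it does not exist.
        log_dir = os.path.dirname(log_path)
        if log_dir:
            os.makedirs(log_dir, exist_ok=True)

    def on_train_begin(self, args, state, control, **kwargs):
        # Append mode keeps entries from earlier runs.
        self.f = open(self.log_path, "a", encoding="utf-8")

    def on_log(self, args, state, control, logs=None, **kwargs):
        if self.f is None or not logs:
            return
        record = {"step": state.global_step, "epoch": state.epoch, **logs}
        self.f.write(json.dumps(record) + "\n")
        self.f.flush()  # assumption: flush so partial runs still leave usable logs

    def on_train_end(self, args, state, control, **kwargs):
        # Close the handle only if it was opened.
        if self.f is not None:
            self.f.close()
            self.f = None


with tempfile.TemporaryDirectory() as tmp_dir:
    callback = JsonlLoggerSketch(os.path.join(tmp_dir, "logs", "train_steps.jsonl"))
    state = SimpleNamespace(global_step=5, epoch=0.25)

    callback.on_train_begin(None, state, None)
    callback.on_log(None, state, None, logs={"loss": 2.0})
    callback.on_train_end(None, state, None)

    with open(callback.log_path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    handle_closed = callback.f is None
```

Guarding on_train_end with a None check means the hook is safe to call even if training failed before the file was opened.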