Utils
Utility functions and helpers for GLLM training.
This module contains general-purpose utilities used across the GLLM training framework.
Reviewer
- Muhammad Afif Al Hawari (muhammad.a.a.hawari@gdplabs.id)
References
NONE
ComponentsValidator(components=None)
Validator for training components and paths.
This class provides validation methods for models, paths, and other components required for training.
Initialize the validator with training components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `components` | `Optional[Any]` | Training components to validate. | `None` |
update_components(components)
Update the components reference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `components` | `Any` | New training components to validate. | required |
validate_components_initialized()
Ensure required components are initialized.
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If components are not properly initialized. |
validate_model_for_saving()
Validate that model is ready for saving.
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If model is not ready for saving. |
validate_model_path(model_path) (staticmethod)
Validate that the model path exists and is accessible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_path` | `str` | Path to the model directory. | required |
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the model path doesn't exist. |
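The validation methods above can be combined into a pre-flight check before training or saving. The following is a minimal sketch, not the framework's implementation: `SketchValidator` is a hypothetical stand-in that mirrors the documented method names and exception types, and it assumes that "not initialized" simply means `components is None`.

```python
import os
import tempfile
from typing import Any, Optional


class SketchValidator:
    """Hypothetical stand-in mirroring the documented ComponentsValidator interface."""

    def __init__(self, components: Optional[Any] = None) -> None:
        self.components = components

    def update_components(self, components: Any) -> None:
        # Swap in a new components reference.
        self.components = components

    def validate_components_initialized(self) -> None:
        # Assumption: "not initialized" means no components object was set.
        if self.components is None:
            raise RuntimeError("Training components are not initialized.")

    @staticmethod
    def validate_model_path(model_path: str) -> None:
        # Documented behavior: raise FileNotFoundError for a missing path.
        if not os.path.exists(model_path):
            raise FileNotFoundError(f"Model path does not exist: {model_path}")


validator = SketchValidator()
try:
    validator.validate_components_initialized()
except RuntimeError as err:
    print(f"Not ready: {err}")

# The dictionary below is a placeholder for a real components object.
validator.update_components({"model": "placeholder"})
validator.validate_components_initialized()  # passes now

with tempfile.TemporaryDirectory() as model_dir:
    SketchValidator.validate_model_path(model_dir)  # existing directory: no error
```

Because `validate_model_path` is a static method, it can be called on the class without constructing a validator first.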
FileLoggerCallback(log_path='logs/train_steps.jsonl')
Bases: TrainerCallback
Callback for logging training metrics to a JSONL file.
This callback writes training logs to a JSON Lines (JSONL) file format, capturing step-level metrics, epoch information, and other training statistics during the training process. The logs are written incrementally as training progresses.
Attributes:
| Name | Type | Description |
|---|---|---|
| `log_path` | | Path to the JSONL log file where training metrics will be written. |
| `f` | | File handle for the log file (opened during training, closed at end). |
Initialize FileLoggerCallback.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `log_path` | | Path to the JSONL file where training logs will be written. Defaults to "logs/train_steps.jsonl". The directory will be created if it does not exist. | `'logs/train_steps.jsonl'` |
on_log(args, state, control, logs=None, **kwargs)
Called every `logging_steps` steps during training.
Writes the current training metrics to the log file in JSONL format. Each log entry includes the global step, epoch, and all metrics from the logs dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `args` | | Training arguments. | required |
| `state` | | Training state containing global_step and epoch information. | required |
| `control` | | Training control object. | required |
| `logs` | | Dictionary containing training metrics to log. Defaults to None. | `None` |
| `**kwargs` | | Additional keyword arguments. | `{}` |
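Each call to `on_log` appends one JSON object per line. The sketch below shows one plausible way to build such a record. The helper `build_log_record` and the key names `step` and `epoch` are assumptions for illustration, not the callback's confirmed output format.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict, Optional


def build_log_record(state: Any, logs: Optional[Dict[str, Any]] = None) -> str:
    """Hypothetical helper: serialize one training log entry as a JSONL line."""
    # Key names "step" and "epoch" are assumed, not taken from the source.
    record = {"step": state.global_step, "epoch": state.epoch}
    # Merge in whatever metrics the trainer reported; tolerate logs=None.
    record.update(logs or {})
    return json.dumps(record)


# SimpleNamespace stands in for the trainer state object here.
state = SimpleNamespace(global_step=10, epoch=0.5)
line = build_log_record(state, {"loss": 1.25, "learning_rate": 2e-5})
print(line)
```

Keeping one self-contained JSON object per line means a partially written log can still be parsed up to the last complete line.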
on_train_begin(args, state, control, **kwargs)
Called when training begins.
Opens the log file in append mode for writing training metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `args` | | Training arguments. | required |
| `state` | | Training state. | required |
| `control` | | Training control object. | required |
| `**kwargs` | | Additional keyword arguments. | `{}` |
on_train_end(args, state, control, **kwargs)
Called when training ends.
Closes the log file handle if it was opened.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `args` | | Training arguments. | required |
| `state` | | Training state. | required |
| `control` | | Training control object. | required |
| `**kwargs` | | Additional keyword arguments. | `{}` |
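Taken together, the three hooks form an open, append, close lifecycle around a single file handle. The sketch below imitates that lifecycle with a hypothetical `SketchFileLogger` class that does not depend on `transformers`. The record keys `step` and `epoch`, the `read_jsonl` helper, and the `None` placeholders passed for `args` and `control` are all assumptions made for illustration. The file is then read back line by line, which is how a JSONL log is typically consumed.

```python
import json
import os
import tempfile
from types import SimpleNamespace


class SketchFileLogger:
    """Hypothetical stand-in that mirrors the documented hook lifecycle."""

    def __init__(self, log_path="logs/train_steps.jsonl"):
        self.log_path = log_path
        self.f = None
        # Create the parent directory up front, as the documentation describes.
        parent = os.path.dirname(log_path)
        if parent:
            os.makedirs(parent, exist_ok=True)

    def on_train_begin(self, args, state, control, **kwargs):
        # Append mode keeps entries written by earlier runs.
        self.f = open(self.log_path, "a", encoding="utf-8")

    def on_log(self, args, state, control, logs=None, **kwargs):
        if self.f is None:
            return
        # Assumed record layout: step, epoch, then all reported metrics.
        record = {"step": state.global_step, "epoch": state.epoch, **(logs or {})}
        self.f.write(json.dumps(record) + "\n")
        self.f.flush()  # make each entry visible while training is still running

    def on_train_end(self, args, state, control, **kwargs):
        # Close the handle only if it was opened.
        if self.f is not None:
            self.f.close()
            self.f = None


def read_jsonl(path):
    """Parse a JSONL file into a list of dictionaries, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    logger = SketchFileLogger(os.path.join(tmp, "logs", "train_steps.jsonl"))
    state = SimpleNamespace(global_step=0, epoch=0.0)
    logger.on_train_begin(None, state, None)
    for step in (10, 20):
        state.global_step, state.epoch = step, step / 100
        logger.on_log(None, state, None, logs={"loss": 1.0 / step})
    logger.on_train_end(None, state, None)
    entries = read_jsonl(logger.log_path)
    print([entry["step"] for entry in entries])
```

Since the file is opened in append mode, rerunning training against the same `log_path` adds to the existing log rather than replacing it, so a fresh path per run may be preferable when runs should stay separate.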