Modality converter
Defines a base class for image-to-text converters used in Gen AI applications.
This module provides the foundation for converting images to text in various formats, including OCR, image captioning, and other image analysis tasks.
BaseModalityConverter()
Bases: ABC
An abstract base class for modality conversion used in Gen AI applications.
This class provides a foundation for building modality converter components in Gen AI applications. It supports converting between different modalities (e.g. text, images, audio, video) and can be extended to implement various types of conversion tasks like OCR, captioning, speech-to-text, text-to-speech, etc.
Initialize the base modality converter component with logging capabilities.
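As an illustration of how a subclass might plug into this interface, here is a minimal sketch. The package's import path and the fields of TextResult are not given in this reference, so the sketch defines local stand-ins for both BaseModalityConverter and TextResult. The `text` field, the `_logger` attribute, the `Utf8TextConverter` class, and the `encoding` keyword are all assumptions made for the example, not part of the documented API.

```python
from __future__ import annotations

import asyncio
import logging
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class TextResult:
    """Hypothetical stand-in; the real TextResult fields are not documented here."""

    text: str


class BaseModalityConverter(ABC):
    """Local stand-in that mirrors the documented interface, not the library source."""

    def __init__(self) -> None:
        # Assumed attribute name; the reference only says logging is set up.
        self._logger = logging.getLogger(self.__class__.__name__)

    @abstractmethod
    async def convert(self, source: str | bytes, **kwargs: Any) -> TextResult:
        raise NotImplementedError


class Utf8TextConverter(BaseModalityConverter):
    """Hypothetical converter that turns raw bytes into text by decoding them."""

    async def convert(self, source: str | bytes, **kwargs: Any) -> TextResult:
        if isinstance(source, bytes):
            # "encoding" is an illustrative keyword argument, not a documented one.
            source = source.decode(kwargs.get("encoding", "utf-8"))
        return TextResult(text=source)


# convert is a coroutine, so it has to be awaited or driven by an event loop.
result = asyncio.run(Utf8TextConverter().convert(b"hello"))
print(result.text)
```

Because convert is abstract, instantiating the base class directly raises TypeError; only subclasses that implement convert can be created.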
convert(source, **kwargs)
abstractmethod
async
Executes the modality conversion process.
This method validates the input parameters, ensuring that the required source parameter is provided and valid, and then performs the modality conversion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | str \| bytes | The source of the modality to convert. | required |
| **kwargs | Any | A dictionary of arguments required for the modality conversion process. Must include | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| TextResult | TextResult | The result of processing the modality. |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
| TypeError | If |
| NotImplementedError | If the method is not implemented in a subclass. |
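The conditions that trigger ValueError and TypeError are not spelled out in this reference. The sketch below shows one plausible way a converter could validate source before converting. It assumes TypeError for a value that is neither str nor bytes and ValueError for an empty value. The `validate_source` helper and the module-level `convert` coroutine are hypothetical and stand in for a real subclass method.

```python
from __future__ import annotations

import asyncio
from typing import Any


def validate_source(source: object) -> str | bytes:
    """Hypothetical helper: the checks below are assumptions, not documented behavior."""
    if not isinstance(source, (str, bytes)):
        raise TypeError(f"source must be str or bytes, got {type(source).__name__}")
    if len(source) == 0:
        raise ValueError("source must not be empty")
    return source


async def convert(source: Any, **kwargs: Any) -> str:
    """Stand-in for a subclass's convert: validate first, then do the conversion."""
    checked = validate_source(source)
    return checked if isinstance(checked, str) else checked.decode("utf-8")


# Exercise the happy path and both assumed failure modes.
outcomes: dict[str, str] = {}
for name, candidate in [("ok", "scan.png"), ("empty", b""), ("wrong_type", 42)]:
    try:
        asyncio.run(convert(candidate))
        outcomes[name] = "ok"
    except (TypeError, ValueError) as exc:
        outcomes[name] = type(exc).__name__

print(outcomes)
```

Validating before any expensive work keeps failures cheap and makes the error type tell the caller whether the argument had the wrong type or an unusable value.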