Modality converter

Defines a base class for image-to-text converters used in Gen AI applications.

This module provides the foundation for converting images to text in various formats, including OCR, image captioning, and other image analysis tasks.

`BaseModalityConverter()`

Bases: ABC

An abstract base class for modality conversion used in Gen AI applications.

This class provides a foundation for building modality converter components in Gen AI applications. It supports converting between different modalities (e.g. text, images, audio, video) and can be extended to implement various types of conversion tasks like OCR, captioning, speech-to-text, text-to-speech, etc.

Initialize the base modality converter component with logging capabilities.
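The abstract-class behaviour described above can be sketched as follows. This is a minimal illustration, not the library's source: `SketchModalityConverter`, `EchoConverter`, and `can_instantiate` are hypothetical names, and the `logger` attribute is an assumption about what "logging capabilities" means.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any, Union


class SketchModalityConverter(ABC):
    """Hypothetical stand-in for BaseModalityConverter (not the library's code)."""

    def __init__(self) -> None:
        # Assumption: "logging capabilities" means a logger named after the class.
        self.logger = logging.getLogger(self.__class__.__name__)

    @abstractmethod
    async def convert(self, source: Union[str, bytes], **kwargs: Any) -> Any:
        """Subclasses supply the actual conversion."""


class EchoConverter(SketchModalityConverter):
    """Toy subclass that returns its input as text."""

    async def convert(self, source: Union[str, bytes], **kwargs: Any) -> str:
        self.logger.debug("converting %d units of input", len(source))
        return source.decode("utf-8") if isinstance(source, bytes) else source


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if an abstract method blocks it."""
    try:
        cls()
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(can_instantiate(SketchModalityConverter))
    print(can_instantiate(EchoConverter))
```

Because `convert` is abstract, Python refuses to instantiate the base class directly; only a subclass that overrides `convert` can be constructed.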

`convert(source, **kwargs)` `abstractmethod` `async`

Executes the modality conversion process.

Implementations validate the input parameters and then perform the modality conversion. The required `source` parameter must be provided and valid before the conversion proceeds.

Parameters:

Name	Type	Description	Default
`source`	`str | bytes`	The source data of the modality to convert.	required
`**kwargs`	`Any`	Additional keyword arguments specific to the implementation.	`{}`

Returns:

Name	Type	Description
`TextResult`	`TextResult`	The result of processing the modality.

Raises:

Type	Description
`ValueError`	If `source` is missing or empty.
`TypeError`	If `source` is not a string or bytes.
`NotImplementedError`	If the method is not implemented in a subclass.
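A subclass honouring this contract might look like the sketch below. The import path of the real classes is not given here, so `BaseModalityConverter` and `TextResult` are redefined locally as hypothetical stand-ins; the `text` field, the `validate_source` helper, the `Utf8TextConverter` class, and the `encoding` keyword argument are all assumptions made for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class TextResult:
    """Hypothetical stand-in; the real TextResult fields are not documented here."""
    text: str


class BaseModalityConverter(ABC):
    """Hypothetical stand-in for the documented base class."""

    @abstractmethod
    async def convert(self, source: Union[str, bytes], **kwargs: Any) -> TextResult:
        raise NotImplementedError


def validate_source(source: Any) -> None:
    """Apply the documented checks: presence, type, then emptiness."""
    if source is None:
        raise ValueError("source is required")
    if not isinstance(source, (str, bytes)):
        raise TypeError("source must be str or bytes")
    if len(source) == 0:
        raise ValueError("source must not be empty")


class Utf8TextConverter(BaseModalityConverter):
    """Toy converter: decodes bytes to text and passes strings through."""

    async def convert(self, source: Union[str, bytes], **kwargs: Any) -> TextResult:
        validate_source(source)
        encoding = kwargs.get("encoding", "utf-8")  # hypothetical option
        text = source.decode(encoding) if isinstance(source, bytes) else source
        return TextResult(text=text)


if __name__ == "__main__":
    # convert is a coroutine, so it must be awaited or run in an event loop.
    result = asyncio.run(Utf8TextConverter().convert(b"hello"))
    print(result.text)
```

Since `convert` is `async`, callers inside a running event loop would use `await converter.convert(...)` rather than `asyncio.run`.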
