Caption

Schema for image captioning operations in Gen AI applications.

This module defines the data structures for representing results from image captioning operations. It provides:

1. Result class for image captions
2. Support for multiple caption types
3. Metadata storage
4. Domain knowledge integration
5. External context support through attachments

`Caption`

Bases: BaseModel

Result class for image captioning operations.

This class extends ImageToTextResult to provide a structured format for image captioning results, supporting:

- Multiple caption types (one-liner, detailed, domain-specific)
- Caption count tracking
- Metadata storage for processing details

Attributes:

Name	Type	Description
`image_one_liner`	`str`	Brief, single-sentence summary of the image. Defaults to empty string if not provided.
`image_description`	`str`	Detailed, multi-sentence description of the image. Defaults to empty string if not provided.
`domain_knowledge`	`str`	Domain-specific interpretation or context. Defaults to empty string if not provided.
`number_of_captions`	`int`	Total number of distinct captions generated. Defaults to 0 if no captions are generated.
`image_metadata`	`dict[str, Any]`	Additional information about the image such as image location.
`attachments_context`	`list[Attachment]`	Optional list of external context objects (files, bytes, or pre-processed inputs) that can enrich captioning results. Bytes are automatically converted into Attachment objects via `Attachment.from_bytes`.
`output_schema`	`str`	Output schema. Defaults to empty string if not provided.
`schema_description`	`str`	Schema description. Defaults to empty string if not provided.
`language`	`str`	Language of the captions. Defaults to "Indonesian" if not provided.
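The documented defaults can be illustrated with a small stand-in. This is only a sketch: `CaptionSketch` is a hypothetical dataclass written for this example, not the library's `Caption` class, and it assumes that `image_metadata` and `attachments_context` default to empty containers (consistent with the None handling described for the validators).

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CaptionSketch:
    """Hypothetical stand-in mirroring the documented Caption fields."""

    image_one_liner: str = ""
    image_description: str = ""
    domain_knowledge: str = ""
    number_of_captions: int = 0
    # Assumed defaults: empty containers when nothing is provided.
    image_metadata: dict[str, Any] = field(default_factory=dict)
    attachments_context: list[Any] = field(default_factory=list)
    output_schema: str = ""
    schema_description: str = ""
    language: str = "Indonesian"


# Only two fields are set; the rest fall back to the documented defaults.
caption = CaptionSketch(
    image_one_liner="A cat sleeping on a windowsill.",
    number_of_captions=1,
)
print(caption.language)
print(caption.image_description == "")
```

Fields left unset keep their defaults, so a result carrying only a one-liner still has a well-defined `image_description`, `image_metadata`, and `language`.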

`handle_none_attachments(attachments_value)`

Normalize and validate attachments_context.

This method ensures that the attachments_context field is always a list of Attachment objects. It handles multiple input cases:

- `None` -> returns an empty list
- `list[bytes]` -> converts each item into an `Attachment` via `Attachment.from_bytes`
- `list[Attachment]` -> keeps as-is
- `list[mixed]` -> normalizes supported types, raises an error on unsupported types
- any other type -> raises `TypeError`

Parameters:

Name	Type	Description	Default
`attachments_value`	`Any`	Input value provided to `attachments_context`.	required

Returns:

Type	Description
`Any`	list[Attachment]: A normalized list of `Attachment` objects.

Raises:

Type	Description
`TypeError`	If an unsupported type is provided (e.g., str, dict).
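The normalization cases listed for `handle_none_attachments` might be sketched roughly as follows. This is not the library's implementation: `Attachment` here is a hypothetical stand-in with an assumed one-argument `from_bytes`, `normalize_attachments` is an illustrative free function rather than the actual validator, and raising `TypeError` for unsupported items inside a list is an assumption.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Attachment:
    """Hypothetical stand-in for the library's Attachment class."""

    data: bytes

    @classmethod
    def from_bytes(cls, data: bytes) -> "Attachment":
        # Assumed signature; the real from_bytes may accept more arguments.
        return cls(data=data)


def normalize_attachments(attachments_value: Any) -> list[Attachment]:
    """Return a list of Attachment objects, following the documented cases."""
    if attachments_value is None:
        return []
    if not isinstance(attachments_value, list):
        # e.g. str or dict passed directly
        raise TypeError(
            f"attachments_context must be a list or None, got {type(attachments_value).__name__}"
        )
    normalized: list[Attachment] = []
    for item in attachments_value:
        if isinstance(item, Attachment):
            normalized.append(item)  # already in the target form
        elif isinstance(item, bytes):
            normalized.append(Attachment.from_bytes(item))
        else:
            # Assumption: unsupported items in a mixed list also raise TypeError.
            raise TypeError(f"Unsupported attachment type: {type(item).__name__}")
    return normalized


existing = Attachment(b"already wrapped")
result = normalize_attachments([existing, b"raw bytes"])
print(len(result))
```

In this sketch a mixed list keeps existing `Attachment` objects untouched and wraps raw bytes, so callers always receive a uniform list.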

`handle_none_metadata(metadata_value)` `classmethod`

Handle None values for `image_metadata` by substituting an empty dict.

`handle_none_number_of_captions(caption_value)` `classmethod`

Handle None values for `number_of_captions` by substituting the default value.

`handle_none_values(str_value)` `classmethod`

Handle None values by converting them to the field's default value.
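The three None-handling validators share one idea: replace `None` with the field's default and pass any other value through unchanged. A minimal sketch of that idea, using a hypothetical helper `none_to_default` that is not part of the documented API:

```python
from typing import Any, Callable


def none_to_default(value: Any, default_factory: Callable[[], Any]) -> Any:
    """Return a fresh default when value is None, otherwise value unchanged."""
    # A factory is used so mutable defaults (such as dict) are never shared.
    return default_factory() if value is None else value


# Illustrative counterparts of the documented validators (names assumed).
metadata = none_to_default(None, dict)          # image_metadata -> empty dict
caption_count = none_to_default(None, int)      # number_of_captions -> 0
one_liner = none_to_default(None, str)          # string fields -> ""
kept = none_to_default({"location": "Jakarta"}, dict)

print(metadata, caption_count, repr(one_liner), kept)
```

Only `None` is replaced here; falsy but valid values such as `0` or `""` are preserved, which is the behavior the descriptions above suggest.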
