Video caption result

Schema for video caption metadata.

`Keyframe`

Bases: BaseModel

Represents a keyframe extracted from a video segment.

Attributes:

Name	Type	Description
`time_offset`	`float`	Time within the segment where the keyframe occurs.
`caption`	`str \| None`	Text description of this specific keyframe.

Bases: BaseModel

Represents a video segment with its captions, transcripts, and keyframes.

Attributes:

Name	Type	Description
`start_time`	`float \| None`	The segment's starting time in seconds.
`end_time`	`float \| None`	The segment's ending time in seconds.
`transcripts`	`list[AudioTranscript]`	Optional list of transcripts for the segment.
`segment_caption`	`list[str]`	The single, rich description of the segment's action/plot.
`keyframes`	`list[Keyframe]`	Optional list of keyframes extracted from the segment.

Ensure segment has caption, fallback to keyframes/transcripts if needed.

Ensure all keyframes time offset is non-negative.

Ensure all transcripts time offset is non-negative.

Bases: BaseModel

Metadata for video captioning results.

Attributes:

Name	Type	Description
`video_summary`	`str`	A high-level summary of the entire video's plot, topic, or main events.
`segments`	`list[Segment]`	List of video segments with their captions and metadata.

Ensure segment end time is greater than start time.

If end time equal or lower than start time, then use next segment start time-1 as end time.

Backward-compatible model alias for video caption result payload.