Capabilities

Capability protocols for the datastore interface.

This package defines the core capability protocols that datastores can implement. Each protocol represents a specific set of functionality that a datastore can opt in to provide.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

EncryptionCapability

Bases: Protocol

Protocol defining the encryption capability interface.

This protocol defines the contract that all encryption implementations must satisfy. The EncryptionCapabilityMixin class provides a concrete implementation, but custom implementations can also satisfy this protocol without inheriting from the mixin.

Note

Encryption is an internal-only capability. Unlike the fulltext and vector capabilities, which users access via properties, encryption works transparently in the background. Users cannot access store.encryption because it is not exposed as a public property.
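
The structural-typing point above can be illustrated with a minimal sketch. It assumes a simplified `Chunk` dataclass and a locally re-declared copy of the protocol (both are stand-ins, not the library's own definitions), and it uses base64 purely as a reversible placeholder, which is an encoding and not real encryption. The class `Base64Encryption` is hypothetical and does not inherit from any mixin, yet it still satisfies the protocol.

```python
from __future__ import annotations

import base64
from dataclasses import dataclass, field, replace
from typing import Any, Callable, Protocol, runtime_checkable


@dataclass
class Chunk:
    """Hypothetical stand-in for the library's Chunk type."""

    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class EncryptionCapability(Protocol):
    """Local mirror of the documented protocol surface (an assumption)."""

    @property
    def encryption_config(self) -> set[str] | None: ...

    def encrypt_chunks(self, chunks: list[Chunk], logger: Any | None = None) -> list[Chunk]: ...

    def decrypt_chunks(self, chunks: list[Chunk], logger: Any | None = None) -> list[Chunk]: ...


class Base64Encryption:
    """Toy implementation: base64 is an encoding, NOT real encryption."""

    def __init__(self, fields: set[str] | None = None) -> None:
        self._fields = fields

    @property
    def encryption_config(self) -> set[str] | None:
        return self._fields

    def _transform(self, chunks: list[Chunk], func: Callable[[str], str]) -> list[Chunk]:
        if not self._fields:
            return chunks  # encryption disabled: return chunks unchanged
        out = []
        for chunk in chunks:
            content = func(chunk.content) if "content" in self._fields else chunk.content
            out.append(replace(chunk, content=content))
        return out

    def encrypt_chunks(self, chunks: list[Chunk], logger: Any | None = None) -> list[Chunk]:
        return self._transform(chunks, lambda s: base64.b64encode(s.encode()).decode())

    def decrypt_chunks(self, chunks: list[Chunk], logger: Any | None = None) -> list[Chunk]:
        return self._transform(chunks, lambda s: base64.b64decode(s.encode()).decode())
```

Because the protocol is checked structurally, `isinstance(Base64Encryption({"content"}), EncryptionCapability)` holds in this sketch even though no base class is shared.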

encryption_config property

Get the current encryption configuration.

Returns:

Type Description
set[str] | None

set[str] | None: Set of encrypted field names if encryption is enabled, None otherwise.

decrypt_chunks(chunks, logger=None)

Decrypt chunks if encryption is enabled.

Parameters:

Name Type Description Default
chunks list[Chunk]

List of chunks to decrypt.

required
logger Any | None

Optional logger instance for logging operations. Defaults to None.

None

Returns:

Type Description
list[Chunk]

list[Chunk]: List of decrypted chunks.

encrypt_chunks(chunks, logger=None)

Encrypt chunks if encryption is enabled.

Parameters:

Name Type Description Default
chunks list[Chunk]

List of chunks to encrypt.

required
logger Any | None

Optional logger instance for logging operations. Defaults to None.

None

Returns:

Type Description
list[Chunk]

list[Chunk]: List of encrypted chunks.

FulltextCapability

Bases: Protocol

Protocol for full-text search and document operations.

This protocol defines the interface for datastores that support CRUD operations and flexible querying mechanisms for document data.

clear(**kwargs) async

Clear all records from the datastore.

Parameters:

Name Type Description Default
**kwargs

Datastore-specific parameters.

{}

create(data, **kwargs) async

Create new records in the datastore.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Data to create (single item or collection).

required
**kwargs

Datastore-specific parameters.

{}

delete(filters=None, options=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op).

None
options QueryOptions | None

Query options for sorting and limiting deletions. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}
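
The no-op behaviour of `delete` when `filters` is None is easy to get wrong in a custom implementation, so here is a small in-memory sketch of it. Everything in it is an assumption for illustration: `Chunk` is a simplified stand-in, `InMemoryFulltextStore` is hypothetical, and a plain dict of metadata equality checks replaces the real `FilterClause` / `QueryFilter` types.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for the library's Chunk type."""

    id: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class InMemoryFulltextStore:
    """Toy store; a dict of equality checks stands in for FilterClause/QueryFilter."""

    def __init__(self) -> None:
        self._chunks: list[Chunk] = []

    @staticmethod
    def _matches(chunk: Chunk, filters: dict[str, Any]) -> bool:
        return all(chunk.metadata.get(key) == value for key, value in filters.items())

    async def create(self, data: Chunk | list[Chunk], **kwargs: Any) -> None:
        self._chunks.extend(data if isinstance(data, list) else [data])

    async def retrieve(self, filters: dict[str, Any] | None = None, **kwargs: Any) -> list[Chunk]:
        if filters is None:
            return list(self._chunks)
        return [c for c in self._chunks if self._matches(c, filters)]

    async def delete(self, filters: dict[str, Any] | None = None, **kwargs: Any) -> None:
        if filters is None:
            return  # documented behaviour: no filters means no-op, not "delete all"
        self._chunks = [c for c in self._chunks if not self._matches(c, filters)]

    async def clear(self, **kwargs: Any) -> None:
        self._chunks.clear()  # removing everything is the job of clear()


async def demo() -> tuple[int, int, int]:
    store = InMemoryFulltextStore()
    await store.create([Chunk("1", "a", {"lang": "en"}), Chunk("2", "b", {"lang": "id"})])
    await store.delete()  # no filters: nothing is removed
    after_noop = len(await store.retrieve())
    await store.delete(filters={"lang": "id"})
    after_filtered = len(await store.retrieve())
    await store.clear()
    return after_noop, after_filtered, len(await store.retrieve())
```

Running `asyncio.run(demo())` exercises the three cases in turn: an unfiltered delete, a filtered delete, and a full clear.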

retrieve(filters=None, options=None, **kwargs) async

Read records from the datastore with optional filtering.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: Query results.

retrieve_fuzzy(query, max_distance=2, filters=None, options=None, **kwargs) async

Find records that fuzzy-match the query within the given edit-distance threshold.

Parameters:

Name Type Description Default
query str

Text to fuzzy match against.

required
max_distance int

Maximum edit distance for matches (Levenshtein distance). Defaults to 2.

2
filters FilterClause | QueryFilter | None

Optional metadata filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options (limit, sorting, etc.). Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: Matched chunks ordered by relevance/distance.
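
To make the `max_distance` parameter concrete, the sketch below computes Levenshtein distance with the standard dynamic-programming recurrence and keeps only candidates within the threshold, closest first. The helper names `levenshtein` and `fuzzy_filter` are hypothetical, and a real datastore will typically delegate fuzzy matching to its search engine rather than compute distances in Python.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def fuzzy_filter(query: str, candidates: list[str], max_distance: int = 2) -> list[str]:
    """Return candidates within max_distance of the query, closest first."""
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0])  # stable sort keeps input order on ties
    return [candidate for distance, candidate in scored if distance <= max_distance]
```

With the default threshold of 2, a query of "datastore" would keep "datastor" (distance 1) but drop "database" (distance 4).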

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

GraphCapability

Bases: Protocol

Protocol for graph database operations.

This protocol defines the interface for datastores that support graph-based data operations. This includes node and relationship management as well as graph queries.

delete_node(label, identifier_key, identifier_value) async

Delete a node and its relationships.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Node identifier key.

required
identifier_value str

Node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

delete_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value) async

Delete a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

retrieve(query, parameters=None) async

Retrieve data from the graph using a specific query.

Parameters:

Name Type Description Default
query str

Query to retrieve data from the graph.

required
parameters dict[str, Any] | None

Query parameters. Defaults to None.

None

Returns:

Type Description
list[dict[str, Any]]

list[dict[str, Any]]: Query results as list of dictionaries.

upsert_node(label, identifier_key, identifier_value, properties=None) async

Create or update a node in the graph.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Key field for node identification.

required
identifier_value str

Value for node identification.

required
properties dict[str, Any] | None

Additional node properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated node information.

upsert_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value, properties=None) async

Create or update a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required
properties dict[str, Any] | None

Relationship properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated relationship information.
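
The upsert and delete semantics described for nodes and relationships can be sketched with an in-memory graph. This is an illustrative assumption only: `InMemoryGraph` is hypothetical, the returned dictionaries are invented shapes (the protocol only promises `Any`), and relationships are keyed by their endpoint identifiers and relation type.

```python
from __future__ import annotations

import asyncio
from typing import Any


class InMemoryGraph:
    """Toy graph store; return shapes are assumptions, not the library's."""

    def __init__(self) -> None:
        self.nodes: dict[tuple[str, str, str], dict[str, Any]] = {}
        self.relationships: dict[tuple[tuple[str, str], str, tuple[str, str]], dict[str, Any]] = {}

    async def upsert_node(self, label, identifier_key, identifier_value, properties=None):
        key = (label, identifier_key, identifier_value)
        # Create the node on first sight, otherwise reuse it and merge properties.
        node = self.nodes.setdefault(key, {identifier_key: identifier_value})
        node.update(properties or {})
        return node

    async def upsert_relationship(
        self, node_source_key, node_source_value, relation,
        node_target_key, node_target_value, properties=None,
    ):
        key = ((node_source_key, node_source_value), relation, (node_target_key, node_target_value))
        rel = self.relationships.setdefault(key, {})
        rel.update(properties or {})
        return rel

    async def delete_node(self, label, identifier_key, identifier_value):
        removed = self.nodes.pop((label, identifier_key, identifier_value), None)
        endpoint = (identifier_key, identifier_value)
        # A deleted node takes every relationship that touches it with it.
        dangling = [k for k in self.relationships if endpoint in (k[0], k[2])]
        for k in dangling:
            del self.relationships[k]
        return {"nodes_deleted": int(removed is not None), "relationships_deleted": len(dangling)}


async def demo():
    graph = InMemoryGraph()
    await graph.upsert_node("Person", "name", "Ada", {"born": 1815})
    node = await graph.upsert_node("Person", "name", "Ada", {"field": "math"})
    await graph.upsert_node("Person", "name", "Charles")
    await graph.upsert_relationship("name", "Ada", "KNOWS", "name", "Charles", {"since": 1833})
    await graph.upsert_relationship("name", "Ada", "KNOWS", "name", "Charles", {"since": 1834})
    rel_count = len(graph.relationships)
    result = await graph.delete_node("Person", "name", "Ada")
    return node, rel_count, result, len(graph.nodes)
```

Calling an upsert twice with the same identifiers updates the existing entry instead of creating a duplicate, which is the property the demo checks.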

HybridCapability

Bases: Protocol

Protocol for hybrid search combining different retrieval paradigms.

This protocol defines the interface for datastores that support hybrid search operations combining multiple retrieval strategies (fulltext, vector).

clear(**kwargs) async

Clear all records from the datastore.

Parameters:

Name Type Description Default
**kwargs Any

Datastore-specific parameters.

{}

create(chunks, **kwargs) async

Create chunks with automatic generation of all configured search fields.

This method automatically generates and indexes all fields required by the searches configured in with_hybrid(). For each chunk:

  1. FULLTEXT search: Indexes text content in the configured field name.
  2. VECTOR search: Generates dense embedding using the configured em_invoker and indexes it in the configured field name.
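
The two steps above can be sketched as a function that builds one indexable document per chunk. The names `build_document` and `toy_embed` are hypothetical, the search types are passed as plain strings, and the embedding function is a deterministic placeholder standing in for whatever the configured em_invoker would return.

```python
from typing import Callable


def toy_embed(text: str) -> list[float]:
    """Placeholder embedding (an assumption); a real em_invoker would call a model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def build_document(
    content: str,
    configs: list[tuple[str, str]],
    embed: Callable[[str], list[float]] = toy_embed,
) -> dict[str, object]:
    """Populate one field per configured search: raw text or a dense vector."""
    document: dict[str, object] = {}
    for search_type, field_name in configs:
        if search_type == "fulltext":
            document[field_name] = content  # indexed as text
        elif search_type == "vector":
            document[field_name] = embed(content)  # indexed as an embedding
        else:
            raise ValueError(f"unsupported search type: {search_type!r}")
    return document


doc = build_document("hello", [("fulltext", "text"), ("vector", "embedding")])
```

Here `doc` ends up with a "text" field holding the raw content and an "embedding" field holding the placeholder vector.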

Parameters:

Name Type Description Default
chunks list[Chunk]

List of chunks to create and index.

required
**kwargs Any

Datastore-specific parameters.

{}

create_from_vectors(chunks, dense_vectors=None, **kwargs) async

Create chunks with pre-computed vectors for multiple fields.

Allows indexing pre-computed vectors for multiple vector fields at once. Field names must match those configured in with_hybrid().

Parameters:

Name Type Description Default
chunks list[Chunk]

Chunks to index.

required
dense_vectors dict[str, list[tuple[Chunk, Vector]]] | None

Dict mapping field names to lists of (chunk, vector) tuples. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}
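
The shape of `dense_vectors` is easier to see in code. The sketch below assumes `Vector` is a list of floats and uses a simplified `Chunk` stand-in; `build_dense_vectors` is a hypothetical helper that pairs each chunk with its pre-computed vector for every configured field name.

```python
from __future__ import annotations

from dataclasses import dataclass

Vector = list[float]  # assumption: the library's Vector behaves like a list of floats


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for the library's Chunk type."""

    id: str
    content: str


def build_dense_vectors(
    chunks: list[Chunk], vectors_by_field: dict[str, list[Vector]]
) -> dict[str, list[tuple[Chunk, Vector]]]:
    """Pair chunks with vectors, one list of (chunk, vector) tuples per field."""
    dense_vectors: dict[str, list[tuple[Chunk, Vector]]] = {}
    for field_name, vectors in vectors_by_field.items():
        if len(vectors) != len(chunks):
            raise ValueError(f"field {field_name!r}: expected {len(chunks)} vectors, got {len(vectors)}")
        dense_vectors[field_name] = list(zip(chunks, vectors))
    return dense_vectors


chunks = [Chunk("c1", "alpha"), Chunk("c2", "beta")]
dense_vectors = build_dense_vectors(chunks, {"embedding": [[0.1, 0.2], [0.3, 0.4]]})
```

The resulting dict is what would be passed as `dense_vectors`, with keys matching the field names given to with_hybrid().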

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Note

If filters is None, no operation is performed (no-op).

retrieve(query, fusion_mode=None, filters=None, options=None, **kwargs) async

Retrieve using hybrid search combining different retrieval paradigms.

Parameters:

Name Type Description Default
query str

Query text to search with.

required
fusion_mode str | None

Fusion mode to use. Defaults to None, in which case the default fusion mode from with_hybrid() is used.

None
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: Query results ordered by relevance.
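
This page does not list the available fusion modes, so the following is only an illustration of what "fusing" ranked results can mean. It implements weighted reciprocal rank fusion, a common general-purpose technique, and should not be read as the method any particular datastore uses. The function name `weighted_rrf`, the constant `k=60`, and the use of per-search weights in this formula are all assumptions.

```python
from collections import defaultdict


def weighted_rrf(
    rankings: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[str]:
    """Fuse several ranked id lists: each hit contributes weight / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for search_name, ranked_ids in rankings.items():
        weight = weights.get(search_name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fused = weighted_rrf(
    rankings={"fulltext": ["a", "b", "c"], "vector": ["b", "c", "d"]},
    weights={"fulltext": 0.3, "vector": 0.5},
)
```

Documents that appear in both rankings ("b" and "c") accumulate score from each list, so they rise above documents found by only one search.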

retrieve_by_vectors(query=None, dense_vector=None, fusion_mode=None, filters=None, options=None, **kwargs) async

Hybrid search using pre-computed vectors.

Parameters:

Name Type Description Default
query str | None

Optional query text (for fulltext search). Defaults to None.

None
dense_vector Vector | None

Pre-computed dense vector for VECTOR search. Defaults to None.

None
fusion_mode str | None

Fusion mode to use. Defaults to None, in which case the default fusion mode from with_hybrid() is used.

None
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: Query results ordered by relevance.

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

HybridSearchType

Bases: StrEnum

Types of searches that can be combined in hybrid search.

SearchConfig

Bases: BaseModel

Configuration for a single search component in hybrid search.

Examples:

FULLTEXT search configuration:

```python
config = SearchConfig(
    search_type=HybridSearchType.FULLTEXT,
    field="text",
    weight=0.3
)
```

VECTOR search configuration:

```python
config = SearchConfig(
    search_type=HybridSearchType.VECTOR,
    field="embedding",
    em_invoker=em_invoker,
    weight=0.5
)
```

Attributes:

Name Type Description
search_type HybridSearchType

Type of search (FULLTEXT or VECTOR).

field str

Field name in the index (e.g., "text", "embedding").

weight float

Weight for this search in hybrid search. Defaults to 1.0.

em_invoker BaseEMInvoker | None

Embedding model invoker required for VECTOR type. Defaults to None.

top_k int | None

Per-search top_k limit (optional). Defaults to None.

extra_kwargs dict[str, Any]

Additional search-specific parameters. Defaults to empty dict.

validate_field_not_empty(v) classmethod

Validate that field name is not empty.

Parameters:

Name Type Description Default
v str

Field name value.

required

Returns:

Name Type Description
str str

Validated field name.

Raises:

Type Description
ValueError

If field name is empty.

validate_search_requirements()

Validate configuration based on search type.

Returns:

Name Type Description
SearchConfig SearchConfig

Validated configuration instance.

Raises:

Type Description
ValueError

If required fields are missing for the search type.

validate_top_k(v) classmethod

Validate that top_k is positive if provided.

Parameters:

Name Type Description Default
v int | None

top_k value.

required

Returns:

Type Description
int | None

int | None: Validated top_k value.

Raises:

Type Description
ValueError

If top_k is provided but not positive.
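
The three validation rules described for SearchConfig can be mirrored with the standard library alone. The sketch below is a dataclass stand-in, not the actual BaseModel: the class name `SearchConfigSketch`, the enum member values, and the error messages are assumptions, and the real class runs these checks through its validator methods instead of `__post_init__`.

```python
from __future__ import annotations

import dataclasses
from enum import Enum
from typing import Any


class HybridSearchType(str, Enum):
    FULLTEXT = "fulltext"  # member values are assumptions
    VECTOR = "vector"


@dataclasses.dataclass
class SearchConfigSketch:
    """Stdlib stand-in that mirrors the documented validation rules."""

    search_type: HybridSearchType
    field: str
    weight: float = 1.0
    em_invoker: Any | None = None
    top_k: int | None = None
    extra_kwargs: dict[str, Any] = dataclasses.field(default_factory=dict)

    def __post_init__(self) -> None:
        # Mirrors validate_field_not_empty.
        if not self.field or not self.field.strip():
            raise ValueError("field must not be empty")
        # Mirrors validate_top_k.
        if self.top_k is not None and self.top_k <= 0:
            raise ValueError("top_k must be positive when provided")
        # Mirrors validate_search_requirements: VECTOR needs an embedding invoker.
        if self.search_type is HybridSearchType.VECTOR and self.em_invoker is None:
            raise ValueError("em_invoker is required for VECTOR search")


def is_valid(**kwargs: Any) -> bool:
    """Return True if the configuration passes all checks."""
    try:
        SearchConfigSketch(**kwargs)
    except ValueError:
        return False
    return True
```

For example, a VECTOR configuration without `em_invoker` is rejected, while the same configuration with any invoker object passes.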

VectorCapability

Bases: Protocol

Protocol for vector similarity search operations.

This protocol defines the interface for datastores that support vector-based retrieval operations. This includes similarity search, ID-based lookup, and vector storage.

clear() async

Clear all records from the datastore.

create(data) async

Add chunks to the vector store with automatic embedding generation.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Single chunk or list of chunks to add.

required

create_from_vector(chunk_vectors, **kwargs) async

Add pre-computed vectors directly.

Parameters:

Name Type Description Default
chunk_vectors list[tuple[Chunk, Vector]]

List of tuples containing chunks and their corresponding vectors.

required
**kwargs Any

Datastore-specific parameters.

{}

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Note

If filters is None, no operation is performed (no-op).

ensure_index(**kwargs) async

Ensure vector index exists, creating it if necessary.

This method ensures that the vector index required for similarity search operations is created. If the index already exists, this method performs no operation (idempotent).

Parameters:

Name Type Description Default
**kwargs Any

Datastore-specific parameters for index configuration.

{}
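
Idempotence here means that repeated calls leave the store in the same state as a single call. A minimal sketch, assuming a hypothetical `IndexedStore` that records its index configuration in memory (a real implementation would ask the backing database whether the index exists):

```python
import asyncio
from typing import Any, Optional


class IndexedStore:
    """Toy store that tracks whether its vector index has been created."""

    def __init__(self) -> None:
        self.index_config: Optional[dict[str, Any]] = None
        self.build_count = 0  # counts how often the index was actually built

    async def ensure_index(self, **kwargs: Any) -> None:
        if self.index_config is not None:
            return  # index already exists: no-op
        self.index_config = dict(kwargs)
        self.build_count += 1


async def demo() -> tuple[int, Optional[dict[str, Any]]]:
    store = IndexedStore()
    await store.ensure_index(dimensions=3)
    await store.ensure_index(dimensions=8)  # ignored, the index is already there
    return store.build_count, store.index_config
```

In this sketch the second call does not rebuild or reconfigure the index; whether a real datastore validates mismatched settings on later calls is backend-specific. The `dimensions` keyword is an invented example of a datastore-specific parameter.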

retrieve(query, filters=None, options=None, **kwargs) async

Read records from the datastore using text-based similarity search with optional filtering.

Parameters:

Name Type Description Default
query str

Input text to embed and search with.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: Query results.

retrieve_by_vector(vector, filters=None, options=None, **kwargs) async

Direct vector similarity search.

Parameters:

Name Type Description Default
vector Vector

Query embedding vector.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

list[Chunk]: List of chunks ordered by similarity score.
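
As an illustration of ordering by similarity score, the sketch below ranks stored vectors against a query vector using cosine similarity. The metric is an assumption (datastores may use dot product, Euclidean distance, or an approximate index instead), and `cosine_similarity` and `rank_by_vector` are hypothetical helpers operating on plain `(id, vector)` pairs.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_vector(
    query: list[float],
    stored: list[tuple[str, list[float]]],
    limit: Optional[int] = None,
) -> list[str]:
    """Return stored ids ordered from most to least similar to the query."""
    ranked = sorted(stored, key=lambda item: cosine_similarity(query, item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:limit]]


stored = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [1.0, 1.0])]
ranking = rank_by_vector([1.0, 0.0], stored)
```

The optional `limit` plays the role that a limit in the query options would play in a real call.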

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}