Capabilities
Capability base classes and models for datastore interface.
This package defines the core capability base classes and related models used by datastores.
BaseAgenticGraphCapability
Bases: ABC
Base class for agentic graph exploration operations.
Provides read-only, context-aware graph exploration methods for AI agents. All methods enforce read-only access to prevent unintended graph mutations.
This capability is designed to be composed via multiple inheritance alongside
:class:`~gllm_datastore.core.capabilities.graph_capability.BaseGraphCapability`:

.. code-block:: python

    class Neo4jGraphCapability(BaseGraphCapability, BaseAgenticGraphCapability):
        ...

This allows consumers to type-hint against either capability independently:

.. code-block:: python

    def ingest(graph: BaseGraphCapability): ...  # CRUD only
    def explore(graph: BaseAgenticGraphCapability): ...  # exploration only
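
The composition pattern can be tried without the library installed. The following is a minimal sketch: `GraphCrud`, `GraphExplorer`, and `InMemoryGraph` are hypothetical stand-ins for the two base classes and a concrete backend, and nodes are plain dicts rather than the library's `Node` model.

```python
import asyncio
from abc import ABC, abstractmethod


class GraphCrud(ABC):
    """Hypothetical stand-in for BaseGraphCapability (CRUD side only)."""

    @abstractmethod
    async def upsert_node(self, node: dict) -> dict: ...


class GraphExplorer(ABC):
    """Hypothetical stand-in for BaseAgenticGraphCapability (read-only side)."""

    @abstractmethod
    async def search_node(self, query: str, limit: int = 10) -> list[dict]: ...


class InMemoryGraph(GraphCrud, GraphExplorer):
    """One concrete class that satisfies both interfaces."""

    def __init__(self) -> None:
        self._nodes: list[dict] = []

    async def upsert_node(self, node: dict) -> dict:
        self._nodes.append(node)
        return node

    async def search_node(self, query: str, limit: int = 10) -> list[dict]:
        needle = query.lower()
        return [n for n in self._nodes if needle in n["name"].lower()][:limit]


async def ingest(graph: GraphCrud) -> None:
    # Only the CRUD surface is visible through this type hint.
    await graph.upsert_node({"name": "Alice"})


async def explore(graph: GraphExplorer) -> list[dict]:
    # Only the exploration surface is visible through this type hint.
    return await graph.search_node("ali")


async def demo() -> list[dict]:
    graph = InMemoryGraph()
    await ingest(graph)
    return await explore(graph)
```

Passing the same object to both functions works because the concrete class inherits from both bases, while each function's signature documents which half it relies on.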
get_neighborhood(node_id=None, relationship_type=None, target_node_id=None, limit=10)
abstractmethod
async
Get graph patterns matching partial constraints.
Provide at least one of: node_id, relationship_type, or
target_node_id. Returns up to limit diverse patterns to help
understand the graph structure.
Differs from traverse_graph() in that it discovers patterns matching
given constraints without requiring a specific traversal path or starting
point.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node_id` | `str \| None` | Source node ID to filter by. Defaults to None. | `None` |
| `relationship_type` | `str \| None` | Relationship type to filter by. Defaults to None. | `None` |
| `target_node_id` | `str \| None` | Target node ID to filter by. Defaults to None. | `None` |
| `limit` | `int` | Maximum number of patterns to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Triplet]` | List of source-relationship-target triplets matching the constraints. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If none of |
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
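
To make the constraint semantics concrete, here is a small in-memory sketch of partial-pattern matching. `Triplet` is a hypothetical stand-in for the library's model, `match_patterns` is an illustrative helper rather than part of the API, and the sketch simply truncates to `limit` instead of selecting diverse patterns.

```python
from typing import NamedTuple, Optional


class Triplet(NamedTuple):
    """Hypothetical stand-in for the library's Triplet model."""

    source: str
    relationship: str
    target: str


def match_patterns(
    triplets: list[Triplet],
    node_id: Optional[str] = None,
    relationship_type: Optional[str] = None,
    target_node_id: Optional[str] = None,
    limit: int = 10,
) -> list[Triplet]:
    # At least one constraint is required, mirroring the documented contract.
    if node_id is None and relationship_type is None and target_node_id is None:
        raise ValueError(
            "Provide at least one of node_id, relationship_type, target_node_id."
        )
    # A constraint left as None acts as a wildcard for that position.
    matches = [
        t
        for t in triplets
        if (node_id is None or t.source == node_id)
        and (relationship_type is None or t.relationship == relationship_type)
        and (target_node_id is None or t.target == target_node_id)
    ]
    return matches[:limit]
```

Unlike a traversal, no starting point is required: a relationship type alone is enough to sample how that relationship is used across the graph.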
search_autocomplete(query, query_pattern, search_var, limit=10)
abstractmethod
async
Context-sensitive search constrained by a partial query pattern.
Executes a partial query pattern and searches for nodes that can be bound
to search_var. The pattern provides context for the search.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `query_pattern` | `str` | Partial query with a variable placeholder. | required |
| `search_var` | `str` | Variable name to search for (e.g., | required |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Node]` | List of nodes matching the query within the pattern context. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
search_constrained(query, position, source_node_id=None, relationship_type=None, target_node_id=None, limit=10)
abstractmethod
async
Search for items in a specific pattern position under constraints.
Builds a graph pattern with the given constraints and searches for items in the specified position (source, relationship, or target).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `position` | `SearchPosition` | What to search for: SOURCE, RELATIONSHIP, or TARGET. | required |
| `source_node_id` | `str \| None` | Constraint on the source node. Defaults to None. | `None` |
| `relationship_type` | `str \| None` | Constraint on the relationship type. Defaults to None. | `None` |
| `target_node_id` | `str \| None` | Constraint on the target node. Defaults to None. | `None` |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Node] \| list[Triplet]` | Nodes when |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
search_node(query, node_label=None, limit=10)
abstractmethod
async
Search for nodes using case-insensitive substring matching.
Searches across common properties: id, name, title,
description.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `node_label` | `str \| None` | Optional node label to filter results. Defaults to None. | `None` |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Node]` | List of matching nodes. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
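
The matching rule described above can be sketched over plain dicts. This is an illustration only: `search_nodes` and the `"label"` key are assumptions made for the example, and a real backend would normally push the filter down into its own query language.

```python
from typing import Optional

# Properties named in the description above.
SEARCHED_PROPERTIES = ("id", "name", "title", "description")


def search_nodes(
    nodes: list[dict],
    query: str,
    node_label: Optional[str] = None,
    limit: int = 10,
) -> list[dict]:
    needle = query.lower()
    matches = [
        node
        for node in nodes
        # Optional label filter; the "label" key is an assumption of this sketch.
        if (node_label is None or node.get("label") == node_label)
        # Case-insensitive substring match on any of the searched properties.
        and any(needle in str(node.get(prop, "")).lower() for prop in SEARCHED_PROPERTIES)
    ]
    return matches[:limit]
```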
search_rel_of_node(query, node_id, direction=RelationshipDirection.BOTH, limit=10)
abstractmethod
async
Search for relationship types connected to a specific node.
More context-sensitive than :meth:`search_relationship`: only returns
relationships that connect to the given node.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `node_id` | `str` | Node ID to search relationships for. | required |
| `direction` | `RelationshipDirection` | Direction to search. Defaults to :attr: | `BOTH` |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Triplet]` | List of triplets for relationships connected to the node. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
search_relationship(query, node_label=None, limit=10)
abstractmethod
async
Search for relationship types using substring matching.
Returns relationship types that exist in the graph with usage counts. Helps agents avoid hallucinating relationship type names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `node_label` | `str \| None` | Optional node label to filter relationships. Defaults to None. | `None` |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Triplet]` | List of triplets representing relationship types found in the graph. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
search_target_of_rel(query, relationship_type, source_node_id=None, limit=10)
abstractmethod
async
Search for target nodes reachable via a relationship type.
If source_node_id is provided, finds targets reachable only from
that node. Otherwise finds all nodes reachable via the relationship type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | required |
| `relationship_type` | `str` | Relationship type to traverse. | required |
| `source_node_id` | `str \| None` | Optional source node to start from. Defaults to None. | `None` |
| `limit` | `int` | Maximum number of results to return. Defaults to 10. | `10` |

Returns:

| Type | Description |
|---|---|
| `list[Node]` | List of target nodes matching the query. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This is an abstract method that must be implemented by subclasses. |
BaseFulltextCapability(encryption=None, default_batch_size=None)
Bases: DataStoreCapability
Base class for fulltext capability implementations.
Handles encryption and batching transparently. Subclasses implement internal CRUD methods that operate on plaintext data (or receive already-encrypted data when encryption is enabled).
create(data, batch_size=None, **kwargs)
async
retrieve_fuzzy(query, max_distance=2, filters=None, options=None, **kwargs)
async
Find records that fuzzy match the query within distance threshold, with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Text to fuzzy match against. | required |
| `max_distance` | `int` | Maximum edit distance for matches (e.g. Levenshtein). Defaults to 2. | `2` |
| `filters` | `FilterClause \| QueryFilter \| None` | Optional metadata filters. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options (limit, etc.). Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to subclass _retrieve_fuzzy. | `{}` |

Returns:

| Type | Description |
|---|---|
| `list[Chunk]` | Matched chunks ordered by relevance/distance, decrypted when encryption is enabled. |
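
For readers unfamiliar with edit distance, the sketch below shows what a `max_distance` threshold means when the metric is Levenshtein distance. It is an illustration of the metric only: how a given backend tokenizes content and scores matches is not specified here, and `fuzzy_filter` is a hypothetical helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution (or match)
                )
            )
        previous = current
    return previous[-1]


def fuzzy_filter(query: str, candidates: list[str], max_distance: int = 2) -> list[str]:
    # Keep candidates within the threshold, closest first (ties broken alphabetically).
    scored = sorted((levenshtein(query, c), c) for c in candidates)
    return [candidate for distance, candidate in scored if distance <= max_distance]
```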
BaseGraphCapability
Bases: ABC
Base class for graph database operations.
This base class defines the interface for datastores that support graph-based data operations. This includes node and relationship management as well as graph queries.
The upsert methods accept two calling conventions:
New-style (preferred) -- pass a Node or Edge object directly:

.. code-block:: python

    node = Node(type="Person", metadata={"name": "Alice", "age": 30})
    await capability.upsert_node(node=node)

    edge = Edge(type="KNOWS", source_id="alice-id", target_id="bob-id",
                metadata={"since": 2020})
    await capability.upsert_relationship(edge=edge)

Old-style (deprecated) -- pass individual string/dict arguments by keyword.
These still work but emit a DeprecationWarning at call time:

.. code-block:: python

    await capability.upsert_node(
        label="Person", identifier_key="name", identifier_value="Alice",
        properties={"age": 30},
    )
    await capability.upsert_relationship(
        node_source_key="name", node_source_value="Alice", relation="KNOWS",
        node_target_key="name", node_target_value="Bob",
    )
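
One way a backend might support both conventions is to normalize old-style arguments into a node object and warn along the way. The sketch below assumes this approach; the `Node` dataclass is a hypothetical stand-in, and the mapping of `identifier_key`/`identifier_value` into `metadata` is an assumption rather than the library's documented behavior.

```python
import asyncio
import warnings
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    """Hypothetical stand-in for the library's Node model."""

    type: str
    metadata: dict = field(default_factory=dict)


async def upsert_node(
    node: Optional[Node] = None,
    label: Optional[str] = None,
    identifier_key: Optional[str] = None,
    identifier_value: Optional[str] = None,
    properties: Optional[dict[str, Any]] = None,
) -> Node:
    if node is None:
        # Old-style call: warn, then normalize into a Node (assumed mapping).
        warnings.warn(
            "Passing label/identifier_key/identifier_value/properties is "
            "deprecated; pass node=Node(...) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        metadata = dict(properties or {})
        metadata[identifier_key] = identifier_value
        node = Node(type=label, metadata=metadata)
    # A real backend would write the node to the graph here.
    return node
```

Normalizing early keeps the storage logic single-path, so the deprecated arguments can later be removed without touching the write code.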
agent
property
Return the agentic exploration sub-capability.
Override this in concrete graph capability implementations that support agentic exploration.
Returns:
| Name | Type | Description |
|---|---|---|
| `BaseAgenticGraphCapability` | `BaseAgenticGraphCapability` | The agentic sub-capability. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | If this datastore does not support agentic capability. |
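
Callers that need to work with any graph backend can probe for exploration support by catching the documented exception. A minimal sketch, assuming the base property raises by default; all class names here are hypothetical stand-ins.

```python
class GraphCapabilitySketch:
    """Hypothetical stand-in for BaseGraphCapability."""

    @property
    def agent(self):
        raise NotImplementedError("This datastore does not support agentic capability.")


class ExplorerSketch:
    """Hypothetical stand-in for an agentic exploration capability."""

    def __init__(self, graph: GraphCapabilitySketch) -> None:
        self.graph = graph


class ExploringGraph(GraphCapabilitySketch):
    """A backend that opts in by overriding the property."""

    @property
    def agent(self) -> ExplorerSketch:
        return ExplorerSketch(self)


def supports_exploration(graph: GraphCapabilitySketch) -> bool:
    # hasattr() would not help here: it only swallows AttributeError.
    try:
        graph.agent
    except NotImplementedError:
        return False
    return True
```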
delete_node(label, identifier_key, identifier_value)
abstractmethod
async
Delete a node and its relationships.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `label` | `str` | Node label/type. | required |
| `identifier_key` | `str` | Node identifier key. | required |
| `identifier_value` | `str` | Node identifier value. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | Deletion result information. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
delete_nodes(properties, label=None)
abstractmethod
async
Delete all nodes matching the given properties, together with their relationships.
Unlike :meth:`delete_node`, which targets a single node by a known identifier, this
method bulk-deletes every node whose properties match all key-value pairs in
properties, optionally filtered by label.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `properties` | `dict[str, Any]` | Property key-value pairs that a node must match (e.g. | required |
| `label` | `str \| None` | Node label to restrict deletion. When | `None` |

Returns:

| Name | Type | Description |
|---|---|---|
| `int` | `int` | Number of nodes deleted. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If |
| `NotImplementedError` | If the subclass has not implemented this method, or if the backend does not support label-less bulk deletion. |
delete_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value)
abstractmethod
async
Delete a relationship between nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node_source_key` | `str` | Source node identifier key. | required |
| `node_source_value` | `str` | Source node identifier value. | required |
| `relation` | `str` | Relationship type. | required |
| `node_target_key` | `str` | Target node identifier key. | required |
| `node_target_value` | `str` | Target node identifier value. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | Deletion result information. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
query(query, parameters=None)
abstractmethod
async
Run a read-write query against the graph and return rows.
Use this for statements that mutate the graph (MERGE, CREATE,
DELETE, SET, INSERT, etc.) or for mixed read-write transactions.
For pure reads, prefer :meth:`retrieve` so the driver can route to a
read replica when available.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Graph query that may mutate state. | required |
| `parameters` | `dict[str, Any] \| None` | Query parameters. Defaults to None. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[dict[str, Any]]` | Query results as list of dictionaries. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
retrieve(query, parameters=None)
abstractmethod
async
Run a read-only query against the graph and return rows.
Implementations must execute query in a read-only context (e.g. Neo4j
execute_read). Use :meth:`query` for read-write statements such as
MERGE, CREATE or DELETE.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Read-only graph query. | required |
| `parameters` | `dict[str, Any] \| None` | Query parameters. Defaults to None. | `None` |

Returns:

| Type | Description |
|---|---|
| `list[dict[str, Any]]` | Query results as list of dictionaries. |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
upsert_node(node=None, label=None, identifier_key=None, identifier_value=None, properties=None)
abstractmethod
async
Create or update a node in the graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `node` | `Node \| None` | A | `None` |
| `label` | `str \| None` | The label of the node. Deprecated -- use | `None` |
| `identifier_key` | `str \| None` | The key of the identifier. Deprecated -- use | `None` |
| `identifier_value` | `str \| None` | The value of the identifier. Deprecated -- use | `None` |
| `properties` | `dict[str, Any] \| None` | Additional node properties. Deprecated -- use | `None` |

Returns:

| Type | Description |
|---|---|
| `Node \| None` | The upserted node, or |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
upsert_relationship(edge=None, node_source_key=None, node_source_value=None, relation=None, node_target_key=None, node_target_value=None, properties=None)
abstractmethod
async
Create or update a relationship between nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `edge` | `Edge \| None` | An | `None` |
| `node_source_key` | `str \| None` | Source node identifier key. Deprecated -- use | `None` |
| `node_source_value` | `str \| None` | Source node identifier value. Deprecated -- use | `None` |
| `relation` | `str \| None` | Relationship type. Deprecated -- use | `None` |
| `node_target_key` | `str \| None` | Target node identifier key. Deprecated -- use | `None` |
| `node_target_value` | `str \| None` | Target node identifier value. Deprecated -- use | `None` |
| `properties` | `dict[str, Any] \| None` | Relationship properties. Deprecated -- use | `None` |

Returns:

| Type | Description |
|---|---|
| `Edge \| None` | The upserted edge, or |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
BaseHybridCapability(encryption=None, default_batch_size=None)
Bases: DataStoreCapability
Base class for hybrid capability implementations.
create(chunks, batch_size=None, **kwargs)
async
Create chunks with automatic encryption and batching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunks` | `list[Chunk]` | Chunks to create and index. | required |
| `batch_size` | `int \| None` | Override batch size. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to subclass _create. | `{}` |
create_from_vector(chunks, dense_vectors=None, batch_size=None, **kwargs)
async
Create from pre-computed vectors with encryption and batching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunks` | `list[Chunk]` | Chunks to index. | required |
| `dense_vectors` | `dict[str, list[tuple[Chunk, Vector]]] \| None` | Per-field vectors. Defaults to None. | `None` |
| `batch_size` | `int \| None` | Override batch size; controls actual batching. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to subclass. | `{}` |
retrieve(query, filters=None, options=None, **kwargs)
async
Retrieve using hybrid search with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Query text to search with. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Query filters to apply. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options like limit and sorting. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional arguments passed to _retrieve. | `{}` |

Returns:

| Type | Description |
|---|---|
| `list[Chunk]` | Decrypted query results. |
retrieve_by_vector(query=None, dense_vectors=None, filters=None, options=None, **kwargs)
async
Retrieve by pre-computed vectors with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str \| None` | Optional query text. Defaults to None. | `None` |
| `dense_vectors` | `dict[str, Vector] \| None` | Field name to query vector. Defaults to None. | `None` |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to _retrieve_by_vector. | `{}` |

Returns:

| Type | Description |
|---|---|
| `list[Chunk]` | Decrypted chunks. |
BaseVectorCapability(em_invoker, encryption=None, default_batch_size=None)
Bases: DataStoreCapability
Base class for vector capability implementations.
Provides default batching/encryption flows for create, create_from_vector, retrieve, retrieve_by_vector, update, delete, and clear.
Initialize the base vector capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `em_invoker` | `BaseEMInvoker` | Embedding model invoker (required). | required |
| `encryption` | `EncryptionCapability \| None` | Encryption capability. Defaults to None. | `None` |
| `default_batch_size` | `int \| None` | Default batch size. Defaults to None. | `None` |
em_invoker
property
Return the embedding model invoker.
Returns:
| Name | Type | Description |
|---|---|---|
| `BaseEMInvoker` | `BaseEMInvoker` | The EM invoker instance. |
create(data, batch_size=None, **kwargs)
async
create_from_vector(chunk_vectors, batch_size=None, **kwargs)
async
Create from pre-computed vectors with encryption and batching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunk_vectors` | `list[tuple[Chunk, Vector]]` | Chunks and their vectors. | required |
| `batch_size` | `int \| None` | Override batch size. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to subclass. | `{}` |
ensure_index(**kwargs)
abstractmethod
async
Ensure vector index exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | `Any` | Datastore-specific parameters. | `{}` |

Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | This method is not implemented in the subclass. |
retrieve_by_vector(vector, filters=None, options=None, **kwargs)
async
Retrieve by vector with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `vector` | `Vector` | Query vector. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to _retrieve_by_vector. | `{}` |

Returns:

| Type | Description |
|---|---|
| `list[Chunk]` | Decrypted chunks. |
update(update_values, filters=None, batch_size=None, **kwargs)
async
Update records with centralized encryption and content embedding refresh.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Fields to update. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters. Defaults to None. | `None` |
| `batch_size` | `int \| None` | Optional batch size override. Defaults to DefaultBatchSize.UPDATE when not configured at request/capability level. | `None` |
| `**kwargs` | `Any` | Passed to backend-specific _update implementation. | `{}` |
DataStoreCapability(encryption=None, default_batch_size=None)
Bases: ABC
Base class for capability implementations that share encryption and batching.
Holds common state (encryption, default_batch_size) and helpers used by BaseFulltextCapability, BaseVectorCapability, and BaseHybridCapability. Subclasses should not inherit from this directly; use the specific base (BaseFulltextCapability, etc.) instead.
Initialize the data store capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `encryption` | `EncryptionCapability \| None` | Encryption capability. Defaults to None. | `None` |
| `default_batch_size` | `int \| None` | Default batch size. Defaults to None. | `None` |
clear(**kwargs)
async
Clear all records. Delegates to subclass _clear.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | `Any` | Passed to _clear. | `{}` |
delete(filters=None, options=None, **kwargs)
async
Delete records that match the given filter. Delegates to subclass _delete.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filters` | `FilterClause \| QueryFilter \| None` | Deletion criteria; only records matching this filter are removed. Passed to _delete. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options (e.g. ordering/limit for eviction-style deletes). Passed to _delete. Defaults to None. | `None` |
| `**kwargs` | `Any` | Passed to _delete. | `{}` |

Returns:

| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | Backend-specific delete metadata when available. Returns None for no-op deletes or backends that do not expose delete metadata. |
retrieve(*args, **kwargs)
async
Retrieve records with automatic decryption. Delegates to _retrieve then decrypts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | `Any` | Passed to _retrieve (signature is capability-specific). | `()` |
| `**kwargs` | `Any` | Passed to _retrieve. | `{}` |

Returns:

| Type | Description |
|---|---|
| `list[Chunk]` | Decrypted chunks. |
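
The "delegate, then decrypt" flow is a template-method pattern: the public method owns the cross-cutting step and the subclass supplies only the backend call. The sketch below illustrates the idea with hypothetical classes; chunks are plain strings and `ReversingEncryption` is a toy that reverses text, not real encryption.

```python
import asyncio
from abc import ABC, abstractmethod


class ReversingEncryption:
    """Toy stand-in for EncryptionCapability: 'decrypts' by reversing text."""

    is_enabled = True

    def decrypt_chunks(self, chunks: list[str]) -> list[str]:
        return [chunk[::-1] for chunk in chunks]


class CapabilitySketch(ABC):
    """Hypothetical stand-in for DataStoreCapability."""

    def __init__(self, encryption=None) -> None:
        self.encryption = encryption

    async def retrieve(self, *args, **kwargs) -> list[str]:
        # Public entry point: delegate to the backend hook, then decrypt.
        chunks = await self._retrieve(*args, **kwargs)
        if self.encryption is not None and self.encryption.is_enabled:
            chunks = self.encryption.decrypt_chunks(chunks)
        return chunks

    @abstractmethod
    async def _retrieve(self, *args, **kwargs) -> list[str]: ...


class StoredCiphertext(CapabilitySketch):
    """Backend hook that returns data exactly as stored."""

    async def _retrieve(self, query: str) -> list[str]:
        return ["olleh"]
```

Because decryption lives in one place, a backend author cannot forget it, and callers always receive plaintext regardless of which backend is in use.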
update(update_values, filters=None, batch_size=None, **kwargs)
async
Update records with automatic encryption of update_values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Fields to update. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters. Defaults to None. | `None` |
| `batch_size` | `int \| None` | Optional batch size override. Defaults to DefaultBatchSize.UPDATE when not configured at request/capability level. | `None` |
| `**kwargs` | `Any` | Passed to _update. | `{}` |
update_encryption(encryption)
Update the encryption capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `encryption` | `EncryptionCapability` | New encryption capability. | required |

Raises:

| Type | Description |
|---|---|
| `TypeError` | If the provided encryption is not an instance of EncryptionCapability. |
EncryptionCapability(encryptor, encrypted_fields)
Unified implementation of encryption capability.
This class provides the shared encryption and decryption logic that is identical across all backend implementations. It handles:
- Chunk content and metadata encryption/decryption
- Preparation of encrypted chunks with plaintext embeddings
- Encryption of update values
Thread Safety
This class is designed to be thread-safe when used with thread-safe encryptors. The encryptor instance passed must be thread-safe for concurrent encryption/decryption operations. Methods in this class do not perform internal synchronization - thread safety is delegated to the underlying encryptor.
Attributes:
| Name | Type | Description |
|---|---|---|
| `encryptor` | `BaseEncryptor` | The encryptor instance to use for encryption/decryption. Must be thread-safe for concurrent operations. |
| `_encrypted_fields` | `set[str]` | The set of fields to encrypt. |
Initialize the encryption capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `encryptor` | `BaseEncryptor` | The encryptor instance to use for encryption. | required |
| `encrypted_fields` | `set[str]` | The set of fields to encrypt. Supports: (1) content field: "content"; (2) metadata fields using dot notation: "metadata.secret_key", "metadata.secret_value". Example: | required |
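
The field notation can be illustrated with a short sketch that splits the configured names into a content flag and a set of metadata keys, then applies an encrypt function accordingly. `split_encrypted_fields` and `encrypt_chunk` are hypothetical helpers, chunks are plain dicts, and casting metadata values to `str` before encryption is an assumption of this example.

```python
from typing import Callable


def split_encrypted_fields(encrypted_fields: set[str]) -> tuple[bool, set[str]]:
    """Return (encrypt_content, metadata_keys) from names like 'metadata.secret_key'."""
    encrypt_content = "content" in encrypted_fields
    metadata_keys = {
        name.split(".", 1)[1] for name in encrypted_fields if name.startswith("metadata.")
    }
    return encrypt_content, metadata_keys


def encrypt_chunk(
    chunk: dict, encrypted_fields: set[str], encrypt: Callable[[str], str]
) -> dict:
    encrypt_content, metadata_keys = split_encrypted_fields(encrypted_fields)
    content = encrypt(chunk["content"]) if encrypt_content else chunk["content"]
    # Only the listed metadata keys are encrypted; others pass through untouched.
    metadata = {
        key: (encrypt(str(value)) if key in metadata_keys else value)
        for key, value in chunk["metadata"].items()
    }
    return {"content": content, "metadata": metadata}
```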
encryption_config
property
Get the current encryption configuration.
Returns:
| Type | Description |
|---|---|
| `set[str]` | Set of encrypted field names. |
is_enabled
property
Check if encryption is enabled (has configured fields).
Returns:
| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | True if encryption fields are configured, False otherwise. |
decrypt_chunks(chunks)
decrypt_field(value)
Decrypt a single field value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `str` | The encrypted value to decrypt. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `str` | `str` | Decrypted value. |
encrypt_chunks(chunks)
encrypt_embedded_chunks(chunks, em_invoker)
async
Encrypt chunks and generate embeddings from plaintext before encryption.
Generates embeddings from plaintext content to ensure embeddings represent the original content rather than encrypted ciphertext. This is used when encryption is enabled.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunks` | `list[Chunk]` | List of chunks to encrypt and generate embeddings for. | required |
| `em_invoker` | `BaseEMInvoker` | Embedding model invoker to generate embeddings. | required |

Returns:

| Type | Description |
|---|---|
| `list[tuple[Chunk, Vector]]` | List of tuples containing encrypted chunks and their corresponding vectors generated from plaintext. |
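
The ordering matters: embedding ciphertext would produce vectors unrelated to the original meaning. The toy sketch below shows the embed-then-encrypt order with hypothetical helpers; `toy_embed` (first character code and length of a non-empty string) and `toy_encrypt` (string reversal) stand in for a real embedding model and encryptor.

```python
import asyncio
from typing import Awaitable, Callable


async def toy_embed(texts: list[str]) -> list[list[float]]:
    """Hypothetical embedding for non-empty strings: [first char code, length]."""
    return [[float(ord(text[0])), float(len(text))] for text in texts]


def toy_encrypt(text: str) -> str:
    """Toy 'encryption' that reverses the string."""
    return text[::-1]


async def encrypt_embedded(
    contents: list[str],
    embed: Callable[[list[str]], Awaitable[list[list[float]]]],
    encrypt: Callable[[str], str],
) -> list[tuple[str, list[float]]]:
    vectors = await embed(contents)                   # 1. embed the plaintext
    encrypted = [encrypt(text) for text in contents]  # 2. only then encrypt
    return list(zip(encrypted, vectors))
```

Each returned pair holds ciphertext alongside a vector computed from the plaintext, so similarity search keeps working while the stored content stays encrypted.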
encrypt_field(value)
Encrypt a single field value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `str` | The value to encrypt. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `str` | `str` | Encrypted value. |
encrypt_update_values(update_values, content_field_name=CHUNK_KEYS.CONTENT)
Encrypt update values if encryption is enabled.
This method encrypts content and metadata values in update_values according to the encryption configuration. It handles type conversion for non-string values before encryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Dictionary of values to encrypt. Supports "content" and "metadata" keys. | required |
| `content_field_name` | `str` | The field name to use for content in the output. Defaults to CHUNK_KEYS.CONTENT. Useful for datastores like Elasticsearch that use "text" instead of "content". | `CONTENT` |

Returns:

| Type | Description |
|---|---|
| `dict[str, Any]` | Dictionary with encrypted values where applicable. The "content" key is mapped to content_field_name in the output. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If encryption fails for any field. |
HybridSearchType
Bases: StrEnum
Types of searches that can be combined in hybrid search.
SearchConfig
Bases: BaseModel
Configuration for a single search component in hybrid search.
Examples:
FULLTEXT search configuration:

```python
config = SearchConfig(
    search_type=HybridSearchType.FULLTEXT,
    field="text",
    weight=0.3,
)
```

VECTOR search configuration:

```python
config = SearchConfig(
    search_type=HybridSearchType.VECTOR,
    field="embedding",
    em_invoker=em_invoker,
    weight=0.5,
)
```

Attributes:
| Name | Type | Description |
|---|---|---|
| `search_type` | `HybridSearchType` | Type of search (FULLTEXT or VECTOR). |
| `field` | `str` | Field name in the index (e.g., "text", "embedding"). |
| `weight` | `float` | Weight for this search in hybrid search. Defaults to 1.0. |
| `em_invoker` | `BaseEMInvoker \| None` | Embedding model invoker required for VECTOR type. Defaults to None. |
| `top_k` | `int \| None` | Per-search top_k limit (optional). Defaults to None. |
| `extra_kwargs` | `dict[str, Any]` | Additional search-specific parameters. Defaults to empty dict. |
validate_field_not_empty(v)
classmethod
Validate that field name is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `v` | `str` | Field name value. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `str` | `str` | Validated field name. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If field name is empty. |
validate_search_requirements()
Validate configuration based on search type.
Returns:
| Name | Type | Description |
|---|---|---|
| `SearchConfig` | `SearchConfig` | Validated configuration instance. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If required fields are missing for the search type. |
validate_top_k(v)
classmethod
Validate that top_k is positive if provided.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `v` | `int \| None` | top_k value. | required |

Returns:

| Type | Description |
|---|---|
| `int \| None` | Validated top_k value. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If top_k is provided but not positive. |
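
Taken together, the three validators amount to a handful of checks at construction time. The sketch below restates them with a standard-library dataclass so it runs without Pydantic; `SearchConfigSketch`, `SearchType`, and the enum values are hypothetical, and the real model may word its errors or define emptiness differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class SearchType(str, Enum):
    """Hypothetical stand-in for HybridSearchType; the values are assumptions."""

    FULLTEXT = "fulltext"
    VECTOR = "vector"


@dataclass
class SearchConfigSketch:
    search_type: SearchType
    field: str
    weight: float = 1.0
    em_invoker: Optional[Any] = None
    top_k: Optional[int] = None

    def __post_init__(self) -> None:
        # Mirrors validate_field_not_empty.
        if not self.field:
            raise ValueError("field must not be empty")
        # Mirrors validate_top_k: None is allowed, non-positive values are not.
        if self.top_k is not None and self.top_k <= 0:
            raise ValueError("top_k must be positive when provided")
        # Mirrors validate_search_requirements for the VECTOR case.
        if self.search_type is SearchType.VECTOR and self.em_invoker is None:
            raise ValueError("em_invoker is required for VECTOR search")
```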