Data store

Elasticsearch data store with capability composition.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

ElasticsearchDataStore(index_name, client=None, url=None, cloud_id=None, api_key=None, username=None, password=None, request_timeout=DEFAULT_REQUEST_TIMEOUT)

Bases: BaseDataStore

Elasticsearch data store with multiple capability support.

This is the explicit public API for Elasticsearch. Users know they're using Elasticsearch, not a generic "elastic-like" datastore.

Attributes:

Name Type Description
engine str

Always "elasticsearch", so that the backing engine is explicitly identified.

index_name str

The name of the Elasticsearch index.

client AsyncElasticsearch

The AsyncElasticsearch client used by this data store.

Initialize the Elasticsearch data store.

Parameters:

Name Type Description Default
index_name str

The name of the Elasticsearch index to use for operations. This index name will be used for all queries and operations.

required
client AsyncElasticsearch | None

Pre-configured Elasticsearch client instance. If provided, it will be used instead of creating a new client from url/cloud_id. Must be an instance of AsyncElasticsearch. Defaults to None.

None
url str | None

The URL of the Elasticsearch server. For example, "http://localhost:9200". Either url or cloud_id must be provided if client is None. Defaults to None.

None
cloud_id str | None

The cloud ID of the Elasticsearch cluster. Used for Elastic Cloud connections. Either url or cloud_id must be provided if client is None. Defaults to None.

None
api_key str | None

The API key used for authentication. Mutually exclusive with username/password. Defaults to None.

None
username str | None

The username for basic authentication. Must be provided together with password. Mutually exclusive with api_key. Defaults to None.

None
password str | None

The password for basic authentication. Must be provided together with username. Mutually exclusive with api_key. Defaults to None.

None
request_timeout int

The request timeout in seconds. Defaults to DEFAULT_REQUEST_TIMEOUT.

DEFAULT_REQUEST_TIMEOUT

Raises:

Type Description
ValueError

If neither url nor cloud_id is provided when client is None.

TypeError

If client is provided but is not an instance of AsyncElasticsearch.
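The connection rules above can be sketched as a small standalone helper. This is a minimal sketch, not the library's implementation: `build_client_kwargs` is a hypothetical name, the timeout value of 30 seconds is a placeholder (the real DEFAULT_REQUEST_TIMEOUT is not stated here), and raising ValueError for conflicting or incomplete credentials is an assumption, since the reference only documents the ValueError for a missing url/cloud_id. The returned keys follow the keyword arguments accepted by the AsyncElasticsearch constructor in the 8.x Python client.

```python
from typing import Any, Dict, Optional

# Placeholder value; the real DEFAULT_REQUEST_TIMEOUT is not given in this reference.
DEFAULT_REQUEST_TIMEOUT = 30


def build_client_kwargs(
    url: Optional[str] = None,
    cloud_id: Optional[str] = None,
    api_key: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
    request_timeout: int = DEFAULT_REQUEST_TIMEOUT,
) -> Dict[str, Any]:
    """Map the documented arguments to AsyncElasticsearch keyword arguments."""
    if url is None and cloud_id is None:
        # Documented behaviour when no pre-configured client is supplied.
        raise ValueError("Either url or cloud_id must be provided when client is None.")
    if api_key is not None and (username is not None or password is not None):
        # Assumption: the exception type for this case is not documented.
        raise ValueError("api_key is mutually exclusive with username/password.")
    if (username is None) != (password is None):
        # Assumption: the exception type for this case is not documented.
        raise ValueError("username and password must be provided together.")

    kwargs: Dict[str, Any] = {"request_timeout": request_timeout}
    if url is not None:
        # Assumption: url takes precedence if both url and cloud_id are given.
        kwargs["hosts"] = [url]
    else:
        kwargs["cloud_id"] = cloud_id
    if api_key is not None:
        kwargs["api_key"] = api_key
    elif username is not None:
        kwargs["basic_auth"] = (username, password)
    return kwargs


local_kwargs = build_client_kwargs(
    url="http://localhost:9200", username="elastic", password="changeme"
)
```

The resulting dictionary could be unpacked into `AsyncElasticsearch(**local_kwargs)` and the client passed to the data store through the `client` parameter.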

fulltext property

Access fulltext capability if supported.

This property overrides the parent-class property solely to narrow the return type to ElasticsearchFulltextCapability for better type hinting. The lookup itself uses the parent class's logic.

Returns:

Name Type Description
ElasticsearchFulltextCapability ElasticsearchFulltextCapability

Fulltext capability handler.

Raises:

Type Description
NotSupportedException

If fulltext capability is not supported.

supported_capabilities property

Return list of currently supported capabilities.

Returns:

Type Description
list[str]

List of capability names that are supported.

vector property

Access vector capability if supported.

This property overrides the parent-class property solely to narrow the return type to ElasticsearchVectorCapability for better type hinting. The lookup itself uses the parent class's logic.

Returns:

Name Type Description
ElasticsearchVectorCapability ElasticsearchVectorCapability

Vector capability handler.

Raises:

Type Description
NotSupportedException

If vector capability is not supported.

translate_query_filter(query_filter, **kwargs) classmethod

Translate QueryFilter or FilterClause to Elasticsearch native filter syntax.

This method delegates to the ElasticsearchQueryTranslator and returns the result as a dictionary.

Parameters:

Name Type Description Default
query_filter FilterClause | QueryFilter

The filter to translate. Can be a single FilterClause or a QueryFilter with multiple clauses and logical conditions. FilterClause objects are automatically converted to QueryFilter.

required
**kwargs Any

Additional parameters (unused, kept for compatibility with base class).

{}

Returns:

Type Description
dict[str, Any] | None

The translated filter as a dictionary in Elasticsearch Query DSL syntax, or None when the filter is empty.
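The translation contract described above (a single clause is wrapped into a filter, an empty filter yields None, and the result is a Query DSL dictionary) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `Clause` and `Filter` are hypothetical stand-ins for FilterClause and QueryFilter, whose real fields are not shown in this reference, and the exact DSL produced by ElasticsearchQueryTranslator may differ. Only the `term`, `terms`, `range`, and `bool` query shapes are standard Elasticsearch Query DSL.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class Clause:
    """Hypothetical stand-in for FilterClause: one field comparison."""

    key: str
    value: Any
    operator: str = "eq"  # one of: eq, in, gt, gte, lt, lte


@dataclass
class Filter:
    """Hypothetical stand-in for QueryFilter: clauses joined by a condition."""

    clauses: List[Clause] = field(default_factory=list)
    condition: str = "and"  # "and" or "or"


def translate_clause(clause: Clause) -> Dict[str, Any]:
    """Translate one clause into a single Query DSL leaf query."""
    if clause.operator == "eq":
        return {"term": {clause.key: clause.value}}
    if clause.operator == "in":
        return {"terms": {clause.key: list(clause.value)}}
    if clause.operator in ("gt", "gte", "lt", "lte"):
        return {"range": {clause.key: {clause.operator: clause.value}}}
    raise ValueError(f"Unsupported operator: {clause.operator}")


def translate(query_filter: Union[Clause, Filter, None]) -> Optional[Dict[str, Any]]:
    """Return a bool query for the filter, or None when there is nothing to filter on."""
    if isinstance(query_filter, Clause):
        # A single clause is wrapped into a filter first.
        query_filter = Filter(clauses=[query_filter])
    if query_filter is None or not query_filter.clauses:
        return None
    parts = [translate_clause(c) for c in query_filter.clauses]
    if query_filter.condition == "or":
        return {"bool": {"should": parts, "minimum_should_match": 1}}
    return {"bool": {"filter": parts}}


single = translate(Clause("lang", "en"))
combined = translate(
    Filter([Clause("year", 2020, "gte"), Clause("tag", ["a", "b"], "in")], "or")
)
```

In this sketch "and" maps to `bool.filter` (non-scoring) and "or" maps to `bool.should` with `minimum_should_match` set to 1. That mapping is a design choice of the sketch, not a statement about the library.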

with_encryption(encryptor, fields)

Enable encryption for specified fields.

Note

Encrypted fields (content and metadata fields specified in encryption configuration) must be serializable to strings. Non-string values will be converted to strings before encryption.

Warning

When encryption is enabled for fields, some search and filter operations may be limited or broken. Encrypted fields cannot be used in filters for update or delete operations, as the filter values are not encrypted and will not match the encrypted data stored in the index. Use non-encrypted fields (like 'id') for filtering when working with encrypted data.

Parameters:

Name Type Description Default
encryptor BaseEncryptor

The encryptor instance to use. Must not be None.

required
fields set[str] | list[str]

Set or list of field names to encrypt. Must not be empty.

required

Returns:

Name Type Description
ElasticsearchDataStore ElasticsearchDataStore

Self for method chaining.

Raises:

Type Description
ValueError

If encryptor is None or fields is empty.
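The note and warning above can be demonstrated with a toy example. This is a sketch under stated assumptions, not the library's behaviour: `ToyEncryptor` is a hypothetical stand-in for a BaseEncryptor (its `encrypt`/`decrypt` method names are assumed), Base64 is used only so that stored values differ from plaintext and is not encryption, and `encrypt_fields` is a hypothetical helper that handles top-level fields only.

```python
import base64
from typing import Any, Dict, Iterable


class ToyEncryptor:
    """Hypothetical stand-in for a BaseEncryptor. Base64 is NOT real encryption."""

    def encrypt(self, plaintext: str) -> str:
        return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")

    def decrypt(self, ciphertext: str) -> str:
        return base64.b64decode(ciphertext.encode("ascii")).decode("utf-8")


def encrypt_fields(
    doc: Dict[str, Any], encryptor: ToyEncryptor, fields: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of doc with the selected fields stringified and encrypted."""
    selected = set(fields)
    if encryptor is None or not selected:
        # Mirrors the documented ValueError for a missing encryptor or empty fields.
        raise ValueError("encryptor must not be None and fields must not be empty.")
    return {
        key: encryptor.encrypt(str(value)) if key in selected else value
        for key, value in doc.items()
    }


encryptor = ToyEncryptor()
stored = encrypt_fields(
    {"id": "doc-1", "text": "hello", "page": 7}, encryptor, ["text", "page"]
)

# A filter carries the plaintext value, so it cannot match the stored ciphertext.
matches_on_text = stored["text"] == "hello"
# A non-encrypted field such as "id" still matches as usual.
matches_on_id = stored["id"] == "doc-1"
# Non-string values come back as strings after decryption.
page_after_roundtrip = encryptor.decrypt(stored["page"])
```

Here `matches_on_text` is False while `matches_on_id` is True, which is why update and delete filters should target non-encrypted fields. The decrypted page value is the string "7" rather than the integer 7, reflecting the string conversion described in the note.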

with_fulltext(index_name=None, query_field='text')

Configure fulltext capability and return datastore instance.

This method overrides the parent-class method solely for better type hinting. The configuration itself uses the parent class's logic.

Parameters:

Name Type Description Default
index_name str | None

The name of the Elasticsearch index to use for fulltext operations. If None, uses the default index_name from the datastore instance. Defaults to None.

None
query_field str

The field name to use for text content in queries. This field will be used for BM25 and other text search operations. Defaults to "text".

'text'

Returns:

Name Type Description
ElasticsearchDataStore ElasticsearchDataStore

Self for method chaining.

with_vector(em_invoker, index_name=None, query_field='text', vector_query_field='vector', retrieval_strategy=None, distance_strategy=None)

Configure vector capability and return datastore instance.

This method overrides the parent-class method solely for better type hinting. The configuration itself uses the parent class's logic.

Parameters:

Name Type Description Default
em_invoker BaseEMInvoker

The embedding model to perform vectorization.

required
index_name str | None

The name of the Elasticsearch index to use for vector operations. Defaults to None, in which case the datastore's default index_name is used.

None
query_field str

The field name for text queries. Defaults to "text".

'text'
vector_query_field str

The field name for vector queries. Defaults to "vector".

'vector'
retrieval_strategy AsyncRetrievalStrategy | None

The retrieval strategy to use. Defaults to None, in which case DenseVectorStrategy() is used.

None
distance_strategy str | None

The distance strategy for retrieval. Defaults to None.

None

Returns:

Name Type Description
Self ElasticsearchDataStore

Self for method chaining.
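The capability-composition pattern documented on this page (each `with_*` method registers a capability and returns the instance, and the matching property raises when the capability was never configured) can be sketched with a toy class. This is purely illustrative: `ToyDataStore` is a hypothetical stand-in that never contacts Elasticsearch, the local `NotSupportedException` only mirrors the name of the library's exception, and the capability names "fulltext" and "vector" in the returned list are assumed.

```python
from typing import Any, Dict, List, Optional


class NotSupportedException(Exception):
    """Local stand-in named after the library's exception."""


class ToyDataStore:
    """Hypothetical stand-in showing the capability-composition pattern only."""

    def __init__(self, index_name: str) -> None:
        self.index_name = index_name
        self._capabilities: Dict[str, Dict[str, Any]] = {}

    def with_fulltext(
        self, index_name: Optional[str] = None, query_field: str = "text"
    ) -> "ToyDataStore":
        # Fall back to the datastore's own index when none is given.
        self._capabilities["fulltext"] = {
            "index_name": index_name or self.index_name,
            "query_field": query_field,
        }
        return self  # returning self enables method chaining

    def with_vector(
        self,
        em_invoker: Any,
        index_name: Optional[str] = None,
        query_field: str = "text",
        vector_query_field: str = "vector",
    ) -> "ToyDataStore":
        self._capabilities["vector"] = {
            "em_invoker": em_invoker,
            "index_name": index_name or self.index_name,
            "query_field": query_field,
            "vector_query_field": vector_query_field,
        }
        return self

    @property
    def supported_capabilities(self) -> List[str]:
        return list(self._capabilities)

    def _require(self, name: str) -> Dict[str, Any]:
        if name not in self._capabilities:
            raise NotSupportedException(f"{name} capability is not configured")
        return self._capabilities[name]

    @property
    def fulltext(self) -> Dict[str, Any]:
        return self._require("fulltext")

    @property
    def vector(self) -> Dict[str, Any]:
        return self._require("vector")


store = ToyDataStore("docs").with_fulltext(query_field="body")
try:
    store.vector
    vector_available = True
except NotSupportedException:
    vector_available = False
```

After this runs, only the fulltext capability is registered, so `vector_available` is False. Chaining a `with_vector(...)` call onto the same instance would register the second capability without rebuilding the store.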