Core

Core abstractions and utilities for GLLM Datastore.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

FilterClause

Bases: BaseModel

Single filter criterion with operator support.

Examples:

FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT)
FilterClause(key="metadata.status", value=["active", "pending"], operator=FilterOperator.IN)

Attributes:

Name Type Description
key str

The field path to filter on (supports dot notation for nested fields).

value int | float | str | bool | list[str] | list[float] | list[int] | list[bool] | None

The value to compare against.

operator FilterOperator

The comparison operator.
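
The dot-notation key can be read as a path through nested dictionaries. The following is a minimal sketch of that lookup, assuming records are plain nested dicts. `resolve_path` is a hypothetical helper written for illustration, not part of GLLM Datastore, and each backend may resolve paths differently.

```python
from typing import Any

_MISSING = object()  # sentinel that distinguishes "absent" from a stored None


def resolve_path(record: dict[str, Any], key: str, default: Any = None) -> Any:
    """Hypothetical helper: walk a dot-separated key through nested dicts."""
    current: Any = record
    for part in key.split("."):
        if not isinstance(current, dict):
            return default  # the path goes deeper than the data does
        current = current.get(part, _MISSING)
        if current is _MISSING:
            return default  # one segment of the path is absent
    return current


record = {"content": "hello", "metadata": {"age": 30, "status": "active"}}
age = resolve_path(record, "metadata.age")
missing = resolve_path(record, "metadata.owner.name")
```

Here `age` is the nested value and `missing` falls back to the default because `owner` does not exist in the record.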

to_query_filter()

Convert FilterClause to QueryFilter.

This method enables automatic conversion of FilterClause to QueryFilter.

Example
clause = FilterClause(key="metadata.status", value="active", operator=FilterOperator.EQ)
query_filter = clause.to_query_filter()
# Results in: QueryFilter(filters=[clause], condition=FilterCondition.AND)

Returns:

Name Type Description
QueryFilter QueryFilter

A QueryFilter wrapping this FilterClause with AND condition.

FilterCondition

Bases: StrEnum

Logical conditions for combining filters.

FilterOperator

Bases: StrEnum

Operators for comparing field values.

FulltextCapability

Bases: Protocol

Protocol for full-text search and document operations.

This protocol defines the interface for datastores that support CRUD operations and flexible querying mechanisms for document data.

clear(**kwargs) async

Clear all records from the datastore.

Parameters:

Name Type Description Default
**kwargs

Datastore-specific parameters.

{}

create(data, **kwargs) async

Create new records in the datastore.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Data to create (single item or collection).

required
**kwargs

Datastore-specific parameters.

{}

delete(filters=None, options=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op).

None
options QueryOptions | None

Query options for sorting and limiting deletions. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

retrieve(filters=None, options=None, **kwargs) async

Read records from the datastore with optional filtering.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Query results.

retrieve_fuzzy(query, max_distance=2, filters=None, options=None, **kwargs) async

Find records that fuzzy-match the query within the given edit-distance threshold.

Parameters:

Name Type Description Default
query str

Text to fuzzy match against.

required
max_distance int

Maximum edit distance for matches (Levenshtein distance). Defaults to 2.

2
filters FilterClause | QueryFilter | None

Optional metadata filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options (limit, sorting, etc.). Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Matched chunks ordered by relevance/distance.
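
To make the role of `max_distance` concrete, here is a small pure-Python sketch of Levenshtein distance and a threshold filter. It only illustrates the metric named above. How a given datastore computes fuzzy matches is backend-specific, and `fuzzy_match` is a hypothetical helper, not a library function.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, computed one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def fuzzy_match(query: str, candidates: list[str], max_distance: int = 2) -> list[str]:
    """Hypothetical helper: keep candidates within max_distance, closest first."""
    scored = [(levenshtein(query, text), text) for text in candidates]
    return [text for distance, text in sorted(scored) if distance <= max_distance]


matches = fuzzy_match("datastore", ["datastore", "datastor", "database", "vector"])
```

With the default threshold of 2, "datastor" (one deletion away) is kept, while "database" (four edits away) is dropped.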

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}
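
Because the capability is a Protocol, any class with matching async methods satisfies it structurally. The sketch below shows that idea on a reduced subset of the interface. Everything in it is a hypothetical stand-in: `Chunk` is a local dataclass rather than the library type, filters are simplified to plain predicates instead of FilterClause or QueryFilter, and `SupportsFulltext` and `InMemoryStore` are illustrative names.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Protocol, Union, runtime_checkable


@dataclass
class Chunk:
    """Hypothetical stand-in for the library's Chunk type."""
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


# Simplified stand-in for FilterClause | QueryFilter: a plain predicate.
Predicate = Callable[[Chunk], bool]


@runtime_checkable
class SupportsFulltext(Protocol):
    """Reduced, hypothetical subset of the protocol documented above."""
    async def create(self, data: Union[Chunk, list[Chunk]], **kwargs: Any) -> None: ...
    async def retrieve(self, filters: Optional[Predicate] = None, **kwargs: Any) -> list[Chunk]: ...
    async def delete(self, filters: Optional[Predicate] = None, **kwargs: Any) -> None: ...
    async def clear(self, **kwargs: Any) -> None: ...


class InMemoryStore:
    """Toy implementation; it does not inherit from the protocol."""

    def __init__(self) -> None:
        self._chunks: list[Chunk] = []

    async def create(self, data, **kwargs):
        # Accept a single chunk or a list of chunks.
        self._chunks.extend(data if isinstance(data, list) else [data])

    async def retrieve(self, filters=None, **kwargs):
        return [c for c in self._chunks if filters is None or filters(c)]

    async def delete(self, filters=None, **kwargs):
        if filters is None:
            return  # documented behaviour: no filters means no-op
        self._chunks = [c for c in self._chunks if not filters(c)]

    async def clear(self, **kwargs):
        self._chunks.clear()


async def demo() -> tuple[int, int, int]:
    store = InMemoryStore()
    await store.create([Chunk("a", {"status": "active"}), Chunk("b", {"status": "deleted"})])
    await store.delete()  # no filters: nothing is removed
    after_noop = len(await store.retrieve())
    await store.delete(lambda c: c.metadata["status"] == "deleted")
    after_delete = len(await store.retrieve())
    await store.clear()
    return after_noop, after_delete, len(await store.retrieve())


counts = asyncio.run(demo())
```

The `runtime_checkable` decorator only lets `isinstance` confirm that the methods exist; it does not validate their signatures.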

GraphCapability

Bases: Protocol

Protocol for graph database operations.

This protocol defines the interface for datastores that support graph-based data operations. This includes node and relationship management as well as graph queries.

delete_node(label, identifier_key, identifier_value) async

Delete a node and its relationships.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Node identifier key.

required
identifier_value str

Node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

delete_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value) async

Delete a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

retrieve(query, parameters=None) async

Retrieve data from the graph using the given query.

Parameters:

Name Type Description Default
query str

Query to retrieve data from the graph.

required
parameters dict[str, Any] | None

Query parameters. Defaults to None.

None

Returns:

Type Description
list[dict[str, Any]]

Query results as a list of dictionaries.

upsert_node(label, identifier_key, identifier_value, properties=None) async

Create or update a node in the graph.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Key field for node identification.

required
identifier_value str

Value for node identification.

required
properties dict[str, Any] | None

Additional node properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated node information.

upsert_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value, properties=None) async

Create or update a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required
properties dict[str, Any] | None

Relationship properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated relationship information.
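
The upsert and delete semantics described above can be illustrated with a toy in-memory graph. This is a sketch under stated assumptions, not how any real graph backend works: `InMemoryGraph` is a hypothetical class, nodes are keyed by (label, identifier_key, identifier_value), upserts merge properties, and relationship endpoints are matched by key and value only because the relationship methods take no label.

```python
import asyncio
from typing import Any

NodeKey = tuple[str, str, str]            # (label, identifier_key, identifier_value)
EdgeKey = tuple[str, str, str, str, str]  # (src_key, src_value, relation, tgt_key, tgt_value)


class InMemoryGraph:
    """Hypothetical toy store illustrating upsert and cascade-delete semantics."""

    def __init__(self) -> None:
        self.nodes: dict[NodeKey, dict[str, Any]] = {}
        self.edges: dict[EdgeKey, dict[str, Any]] = {}

    async def upsert_node(self, label, identifier_key, identifier_value, properties=None):
        key = (label, identifier_key, identifier_value)
        # Create the node if absent, otherwise merge new properties into it.
        node = self.nodes.setdefault(key, {})
        node.update(properties or {})
        return node

    async def upsert_relationship(self, node_source_key, node_source_value, relation,
                                  node_target_key, node_target_value, properties=None):
        key = (node_source_key, node_source_value, relation,
               node_target_key, node_target_value)
        edge = self.edges.setdefault(key, {})
        edge.update(properties or {})
        return edge

    async def delete_node(self, label, identifier_key, identifier_value):
        removed = self.nodes.pop((label, identifier_key, identifier_value), None)
        endpoint = (identifier_key, identifier_value)
        # Also drop every relationship that starts or ends at the node.
        dangling = [k for k in self.edges if k[:2] == endpoint or k[3:] == endpoint]
        for k in dangling:
            del self.edges[k]
        return {"node_deleted": removed is not None, "relationships_deleted": len(dangling)}


async def demo():
    graph = InMemoryGraph()
    await graph.upsert_node("Person", "name", "Ada", {"role": "author"})
    await graph.upsert_node("Person", "name", "Ada", {"team": "core"})  # update, not duplicate
    await graph.upsert_node("Doc", "title", "Guide")
    await graph.upsert_relationship("name", "Ada", "WROTE", "title", "Guide")
    merged = dict(graph.nodes[("Person", "name", "Ada")])
    result = await graph.delete_node("Person", "name", "Ada")
    return merged, result, len(graph.nodes), len(graph.edges)


merged, result, node_count, edge_count = asyncio.run(demo())
```

The second `upsert_node` call updates the existing node rather than creating a duplicate, and deleting the node also removes the relationship attached to it.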

QueryFilter

Bases: BaseModel

Composite filter supporting multiple conditions and logical operators.

Attributes:

Name Type Description
filters list[FilterClause | QueryFilter]

List of filters to combine. Can include nested QueryFilter for complex logic.

condition FilterCondition

Logical operator to combine filters. Defaults to AND.

Examples:

  1. Simple AND: age > 25 AND status == "active"

     QueryFilter(
         filters=[
             FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT),
             FilterClause(key="metadata.status", value="active", operator=FilterOperator.EQ)
         ],
         condition=FilterCondition.AND
     )

  2. Complex OR: (status == "active" OR status == "pending") AND age >= 18

     QueryFilter(
         filters=[
             QueryFilter(
                 filters=[
                     FilterClause(key="metadata.status", value="active"),
                     FilterClause(key="metadata.status", value="pending")
                 ],
                 condition=FilterCondition.OR
             ),
             FilterClause(key="metadata.age", value=18, operator=FilterOperator.GTE)
         ],
         condition=FilterCondition.AND
     )

  3. NOT: NOT (status == "deleted")

     QueryFilter(
         filters=[
             FilterClause(key="metadata.status", value="deleted")
         ],
         condition=FilterCondition.NOT
     )
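
To show how nested filters combine, the sketch below evaluates a filter tree against a plain dict. It uses hypothetical stand-ins (`Clause`, `Group`, `matches`) rather than the library's models, supports only three operators, and says nothing about how a real backend translates filters into its own query language.

```python
import operator
from dataclasses import dataclass
from typing import Any, Union

# Small subset of comparison operators, enough for the examples above.
OPS = {"eq": operator.eq, "gt": operator.gt, "gte": operator.ge}


@dataclass
class Clause:
    """Hypothetical stand-in for a single filter criterion."""
    key: str
    value: Any
    op: str = "eq"


@dataclass
class Group:
    """Hypothetical stand-in for a composite filter."""
    filters: list  # Clause objects or nested Group objects
    condition: str = "and"  # "and", "or", or "not"


def lookup(record: dict, key: str) -> Any:
    """Resolve a dot-separated key in a nested dict."""
    for part in key.split("."):
        record = record[part]
    return record


def matches(node: Union[Clause, Group], record: dict) -> bool:
    if isinstance(node, Clause):
        return OPS[node.op](lookup(record, node.key), node.value)
    results = [matches(child, record) for child in node.filters]
    if node.condition == "and":
        return all(results)
    if node.condition == "or":
        return any(results)
    if node.condition == "not":
        if len(results) != 1:
            raise ValueError("NOT expects exactly one filter")
        return not results[0]
    raise ValueError(f"unknown condition: {node.condition}")


# Mirrors example 2: (status == "active" OR status == "pending") AND age >= 18
adult_in_progress = Group(
    [
        Group([Clause("metadata.status", "active"),
               Clause("metadata.status", "pending")], "or"),
        Clause("metadata.age", 18, "gte"),
    ],
    "and",
)

pending_adult = {"metadata": {"status": "pending", "age": 20}}
active_minor = {"metadata": {"status": "active", "age": 16}}
```

Evaluating `adult_in_progress` accepts `pending_adult` and rejects `active_minor`, since the inner OR group and the age clause must both hold.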

from_dicts(filter_dicts, condition=FilterCondition.AND) classmethod

Create QueryFilter from list of filter dictionaries.

Example
QueryFilter.from_dicts(
    [
        {"key": "metadata.age", "value": 25, "operator": ">"},
        {"key": "metadata.status", "value": "active"}
    ],
    condition=FilterCondition.AND
)

Parameters:

Name Type Description Default
filter_dicts list[dict[str, Any]]

List of filter dictionaries. Each dictionary contains a key, a value, and optionally an operator.

required
condition FilterCondition

Logical operator to combine filters. Defaults to AND.

AND

Returns:

Name Type Description
QueryFilter QueryFilter

Composite filter instance.

QueryOptions

Bases: BaseModel

Model for query options.

Attributes:

Name Type Description
include_fields Sequence[str] | None

The fields to include in the query result. Defaults to None.

order_by str | None

The column to order the query result by. Defaults to None.

order_desc bool

Whether to order the query result in descending order. Defaults to False.

limit int | None

The maximum number of rows to return. Must be >= 0. Defaults to None.

Example
QueryOptions(include_fields=["field1", "field2"], order_by="column1", order_desc=True, limit=10)
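
The effect of these options can be sketched on a list of plain dicts. `apply_options` is a hypothetical helper, and the order of operations (sort, then limit, then field projection) is an assumption modelled on typical query engines. Real backends apply the options in their own query layer.

```python
from typing import Any, Optional, Sequence


def apply_options(
    rows: Sequence[dict[str, Any]],
    include_fields: Optional[Sequence[str]] = None,
    order_by: Optional[str] = None,
    order_desc: bool = False,
    limit: Optional[int] = None,
) -> list[dict[str, Any]]:
    """Hypothetical helper mirroring the four documented options."""
    if limit is not None and limit < 0:
        raise ValueError("limit must be >= 0")
    result = list(rows)
    if order_by is not None:
        result.sort(key=lambda row: row[order_by], reverse=order_desc)
    if limit is not None:
        result = result[:limit]
    if include_fields is not None:
        # Keep only the requested fields that are present in each row.
        result = [{f: row[f] for f in include_fields if f in row} for row in result]
    return result


rows = [
    {"id": 1, "score": 0.2, "title": "a"},
    {"id": 2, "score": 0.9, "title": "b"},
    {"id": 3, "score": 0.5, "title": "c"},
]
top = apply_options(rows, include_fields=["id", "score"],
                    order_by="score", order_desc=True, limit=2)
```

Here `top` holds the two highest-scoring rows with `title` dropped.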

VectorCapability

Bases: Protocol

Protocol for vector similarity search operations.

This protocol defines the interface for datastores that support vector-based retrieval operations. This includes similarity search, ID-based lookup as well as vector storage.

clear() async

Clear all records from the datastore.

create(data) async

Add chunks to the vector store with automatic embedding generation.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Single chunk or list of chunks to add.

required

create_from_vector(chunk_vectors, **kwargs) async

Add pre-computed vectors directly.

Parameters:

Name Type Description Default
chunk_vectors list[tuple[Chunk, Vector]]

List of tuples containing chunks and their corresponding vectors.

required
**kwargs Any

Datastore-specific parameters.

{}

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Note

If filters is None, no operation is performed (no-op).

retrieve(query, filters=None, options=None, **kwargs) async

Read records from the datastore using text-based similarity search with optional filtering.

Parameters:

Name Type Description Default
query str

Input text to embed and search with.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Query results.

retrieve_by_vector(vector, filters=None, options=None, **kwargs) async

Direct vector similarity search.

Parameters:

Name Type Description Default
vector Vector

Query embedding vector.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

List of chunks ordered by similarity score.
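
As an illustration of ordering by similarity score, the sketch below ranks stored vectors against a query vector. Cosine similarity is an assumption made for the example, since the metric is not specified here and depends on the backend. `rank_by_vector` is a hypothetical helper, not part of the library.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_vector(
    query: Sequence[float],
    stored: list[tuple[str, Sequence[float]]],
    limit: Optional[int] = None,
) -> list[str]:
    """Hypothetical helper: return stored texts, most similar first."""
    ranked = sorted(stored, key=lambda item: cosine_similarity(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:limit]]  # a limit of None keeps everything


stored = [("cats", [1.0, 0.0]), ("dogs", [0.8, 0.6]), ("cars", [0.0, 1.0])]
top = rank_by_vector([1.0, 0.1], stored, limit=2)
```

The query vector points almost entirely along the first axis, so "cats" ranks first, "dogs" second, and "cars" is cut off by the limit.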

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

all_(key, values)

Create an ALL filter (array field contains all of the values).

This operator checks if an array field contains all of the values in the provided list. The field must be an array/list, and every value in the values list must be present as an element in the array. The array may contain additional elements.

Example

Filter for documents where the tags array contains both "python" and "javascript". This will match only if metadata.tags contains both values. For example, if metadata.tags = ["python", "javascript", "rust"], this will match. If metadata.tags = ["python", "rust"], this will not match (missing "javascript").

from gllm_datastore.core.filters import all_

filter = all_("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on (must be an array field).

required
values list

List of values. All must be present in the array.

required

Returns:

Name Type Description
FilterClause FilterClause

ALL filter.

and_(*filters)

Combine filters with AND condition.

This logical operator combines multiple filters such that all conditions must be satisfied. A document matches only if it satisfies every filter in the list.

Example

Filter for documents where status is "active" AND age is at least 18. This will match documents that satisfy both conditions simultaneously.

from gllm_datastore.core.filters import and_, eq, gte

filter = and_(eq("metadata.status", "active"), gte("metadata.age", 18))

Parameters:

Name Type Description Default
*filters FilterClause | QueryFilter

Variable number of filters to combine. All filters must match for a document to be included.

()

Returns:

Name Type Description
QueryFilter QueryFilter

Combined filter with AND condition.

any_(key, values)

Create an ANY filter (array field contains any of the values).

This operator checks if an array field contains at least one of the values in the provided list. The field must be an array/list, and at least one element from the values list must be present in the array. This is similar to checking if the arrays have any intersection.

Example

Filter for documents where the tags array contains at least one of "python" or "javascript". This will match if metadata.tags contains "python", "javascript", or both. For example, if metadata.tags = ["python", "rust"], this will match (because of "python").

from gllm_datastore.core.filters import any_

filter = any_("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on (must be an array field).

required
values list

List of values. At least one must be present in the array.

required

Returns:

Name Type Description
FilterClause FilterClause

ANY filter.

array_contains(key, value)

Create an ARRAY_CONTAINS filter (array field contains value).

This operator checks if an array field contains the specified value as an element. The field must be an array/list, and the value must be present in that array. Use this for checking array membership.

Example

Filter for documents where the tags array contains "python". This will match documents where "python" is an element in metadata.tags. For example, if metadata.tags = ["python", "javascript"], this will match.

from gllm_datastore.core.filters import array_contains

filter = array_contains("metadata.tags", "python")

Parameters:

Name Type Description Default
key str

Field path to filter on (must be an array field).

required
value Any

Value to check if it exists as an element in the array.

required

Returns:

Name Type Description
FilterClause FilterClause

ARRAY_CONTAINS filter.
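
The three array operators (all_, any_, array_contains) differ only in how many of the given values must appear in the field. A plain-Python sketch of the matching rules, using hypothetical helper names that are not part of the library:

```python
from typing import Any, Sequence


def contains_value(field: Sequence[Any], value: Any) -> bool:
    """array_contains semantics: the value is an element of the array."""
    return value in field


def contains_any(field: Sequence[Any], values: Sequence[Any]) -> bool:
    """any_ semantics: at least one of the values is an element of the array."""
    return any(value in field for value in values)


def contains_all(field: Sequence[Any], values: Sequence[Any]) -> bool:
    """all_ semantics: every value is an element; extra elements are allowed."""
    return all(value in field for value in values)


wanted = ["python", "javascript"]
partial_tags = ["python", "rust"]
full_tags = ["python", "javascript", "rust"]
```

With these inputs, `partial_tags` satisfies `contains_any` but not `contains_all`, while `full_tags` satisfies both, matching the examples given for any_ and all_.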

eq(key, value)

Create an equality filter.

This operator checks if the field value is exactly equal to the specified value. Works with strings, numbers, booleans, and other scalar types.

Example

Filter for documents where metadata.status == "active".

from gllm_datastore.core.filters import eq

filter = eq("metadata.status", "active")

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value Any

Value to compare. Matches field values exactly equal to this value.

required

Returns:

Name Type Description
FilterClause FilterClause

Equality filter.

gt(key, value)

Create a greater-than filter.

This operator checks if the field value is strictly greater than the specified value. Only works with numeric fields (int or float).

Example

Filter for documents where metadata.price > 100.

from gllm_datastore.core.filters import gt

filter = gt("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on (must be numeric).

required
value int | float

Threshold value. Matches field values greater than this.

required

Returns:

Name Type Description
FilterClause FilterClause

Greater-than filter.

gte(key, value)

Create a greater-than-or-equal filter.

This operator checks if the field value is greater than or equal to the specified value. Only works with numeric fields (int or float).

Example

Filter for documents where metadata.price >= 100.

from gllm_datastore.core.filters import gte

filter = gte("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on (must be numeric).

required
value int | float

Threshold value. Matches field values greater than or equal to this.

required

Returns:

Name Type Description
FilterClause FilterClause

Greater-than-or-equal filter.

in_(key, values)

Create an IN filter.

This operator checks if the field value is one of the values in the provided list. Works with scalar fields (string, number, boolean). The field value must exactly match one of the values in the list.

Example

Filter for documents where metadata.status in ["active", "pending"].

from gllm_datastore.core.filters import in_

filter = in_("metadata.status", ["active", "pending"])

Parameters:

Name Type Description Default
key str

Field path to filter on (must be a scalar field).

required
values list

List of possible values. Matches field values that match one of these exactly.

required

Returns:

Name Type Description
FilterClause FilterClause

IN filter.

lt(key, value)

Create a less-than filter.

This operator checks if the field value is strictly less than the specified value. Only works with numeric fields (int or float).

Example

Filter for documents where metadata.price < 100.

from gllm_datastore.core.filters import lt

filter = lt("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on (must be numeric).

required
value int | float

Threshold value. Matches field values less than this.

required

Returns:

Name Type Description
FilterClause FilterClause

Less-than filter.

lte(key, value)

Create a less-than-or-equal filter.

This operator checks if the field value is less than or equal to the specified value. Only works with numeric fields (int or float).

Example

Filter for documents where metadata.price <= 100.

from gllm_datastore.core.filters import lte

filter = lte("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on (must be numeric).

required
value int | float

Threshold value. Matches field values less than or equal to this.

required

Returns:

Name Type Description
FilterClause FilterClause

Less-than-or-equal filter.

ne(key, value)

Create a not-equal filter.

This operator checks if the field value is not equal to the specified value. Works with strings, numbers, booleans, and other scalar types.

Example

Filter for documents where metadata.status != "active".

from gllm_datastore.core.filters import ne

filter = ne("metadata.status", "active")

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value Any

Value to exclude. Matches all values except this one.

required

Returns:

Name Type Description
FilterClause FilterClause

Not-equal filter.

nin(key, values)

Create a NOT IN filter.

This operator checks if the field value is not in the provided list. Works with scalar fields (string, number, boolean). The field value must not match any of the values in the list.

Example

Filter for documents where metadata.status not in ["deleted", "archived"].

from gllm_datastore.core.filters import nin

filter = nin("metadata.status", ["deleted", "archived"])

Parameters:

Name Type Description Default
key str

Field path to filter on (must be a scalar field).

required
values list

List of excluded values. Matches field values that do not match any of these.

required

Returns:

Name Type Description
FilterClause FilterClause

NOT IN filter.

not_(filter)

Negate a filter.

This logical operator inverts the result of a filter. A document matches if it does not satisfy the specified filter condition. Useful for exclusion criteria.

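This operator accepts exactly one filter; negating several filters in a single NOT condition is not supported. Because the argument may itself be a QueryFilter, a combination built with and_ or or_ can be passed as that single filter.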

Example

Filter for documents where status is NOT "deleted". This will match all documents except those with status == "deleted". Can also be used with other operators, e.g., not_(text_contains("content", "spam")) to exclude documents containing a specific substring.

from gllm_datastore.core.filters import not_, eq

filter = not_(eq("metadata.status", "deleted"))

Parameters:

Name Type Description Default
filter FilterClause | QueryFilter

Filter to negate. Documents matching this filter will be excluded from results.

required

Returns:

Name Type Description
QueryFilter QueryFilter

Negated filter.

or_(*filters)

Combine filters with OR condition.

This logical operator combines multiple filters such that at least one condition must be satisfied. A document matches if it satisfies any of the filters in the list.

Example

Filter for documents where status is "active" OR status is "pending". This will match documents that satisfy either condition (or both).

from gllm_datastore.core.filters import or_, eq

filter = or_(eq("metadata.status", "active"), eq("metadata.status", "pending"))

Parameters:

Name Type Description Default
*filters FilterClause | QueryFilter

Variable number of filters to combine. At least one filter must match for a document to be included.

()

Returns:

Name Type Description
QueryFilter QueryFilter

Combined filter with OR condition.

text_contains(key, value)

Create a TEXT_CONTAINS filter (text field contains substring).

This operator checks if a text/string field contains the specified substring. The field must be a string, and the value must appear as a substring within that string. Use this for substring matching in text content.

Example

Filter for documents where the content field contains "machine learning". This will match documents where "machine learning" appears anywhere in the content. For example, if content = "This is about machine learning algorithms", this will match.

from gllm_datastore.core.filters import text_contains

filter = text_contains("content", "machine learning")

Parameters:

Name Type Description Default
key str

Field path to filter on (must be a string/text field).

required
value str

Substring to search for in the text.

required

Returns:

Name Type Description
FilterClause FilterClause

TEXT_CONTAINS filter.