Fulltext

In-memory implementation of fulltext search and CRUD capability.

This module provides an in-memory implementation of the FulltextCapability protocol using dictionary-based storage optimized for development and testing scenarios.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

InMemoryFulltextCapability(store=None)

In-memory implementation of FulltextCapability protocol.

This class provides document CRUD operations and flexible querying using pure Python data structures optimized for development and testing.

Attributes:

store (dict[str, Chunk]): Dictionary storing Chunk objects with their IDs as keys.

Initialize the in-memory fulltext capability.

Parameters:

store (dict[str, Any] | None): Dictionary storing Chunk objects with their IDs as keys. Defaults to None.

clear() async

Clear all records from the datastore.

create(data) async

Create new records in the datastore.

Example

Create a new chunk.

await fulltext_capability.create(Chunk(content="Test chunk", metadata={"category": "test"}))

Parameters:

data (Chunk | list[Chunk]): Data to create (single item or collection). Required.

Raises:

ValueError: If the data structure is invalid.
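The dictionary-backed storage described above can be illustrated with a minimal sketch. SketchChunk and SketchStore are hypothetical stand-ins written for this example, not the library's classes. The validation rule (reject anything that is not a chunk) and the fallback to an empty dict when no store is passed are assumptions about what "invalid data structure" and the None default might mean.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional, Union


@dataclass
class SketchChunk:
    """Hypothetical stand-in for Chunk; the real class may differ."""

    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class SketchStore:
    """Toy dict-backed store keyed by chunk ID (not the library's code)."""

    def __init__(self, store: Optional[dict[str, SketchChunk]] = None) -> None:
        # Assumption: fall back to a fresh dict when no store is supplied.
        self.store = store if store is not None else {}

    async def create(self, data: Union[SketchChunk, list[SketchChunk]]) -> None:
        # Normalise a single item into a one-element list.
        items = data if isinstance(data, list) else [data]
        # Validate everything first so a bad item leaves the store untouched.
        for item in items:
            if not isinstance(item, SketchChunk):
                raise ValueError(f"Expected SketchChunk, got {type(item).__name__}")
        for item in items:
            self.store[item.id] = item

    async def clear(self) -> None:
        self.store.clear()


async def demo() -> tuple[int, int]:
    sketch = SketchStore()
    await sketch.create(SketchChunk("Test chunk", {"category": "test"}))
    await sketch.create([SketchChunk("a"), SketchChunk("b")])
    count_after_create = len(sketch.store)
    await sketch.clear()
    return count_after_create, len(sketch.store)


counts = asyncio.run(demo())
```

Keying the dictionary by chunk ID keeps lookups and overwrites constant-time, which is usually enough for development and test fixtures.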

delete(filters=None, options=None) async

Delete records from the datastore.

Usage Example
from gllm_datastore.core.filters import filter as F

# Direct FilterClause usage
await fulltext_capability.delete(filters=F.eq("metadata.category", "tech"))

# Multiple filters
await fulltext_capability.delete(
    filters=F.and_(F.eq("metadata.category", "tech"), F.eq("metadata.status", "draft"))
)

Parameters:

filters (FilterClause | QueryFilter | None): Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.
options (QueryOptions | None): Query options for sorting and limiting deletions (for eviction-like operations). Defaults to None.

Returns:

None: This method performs deletions in place.
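The eviction-like use of sorting and limiting can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: delete_matching, its predicate, sort_key and limit parameters, and the created field are all hypothetical names. Treating a missing predicate as "match every record" is also an assumption.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]


def delete_matching(
    store: dict[str, Record],
    predicate: Optional[Callable[[Record], bool]] = None,
    sort_key: Optional[Callable[[Record], Any]] = None,
    limit: Optional[int] = None,
) -> None:
    """Delete matching records in place; sort_key and limit cap the deletion."""
    # Collect IDs first so the dict is not mutated while iterating over it.
    ids = [rid for rid, rec in store.items() if predicate is None or predicate(rec)]
    if sort_key is not None:
        ids.sort(key=lambda rid: sort_key(store[rid]))
    if limit is not None:
        ids = ids[:limit]
    for rid in ids:
        del store[rid]


store = {
    "a": {"metadata": {"category": "tech", "created": 3}},
    "b": {"metadata": {"category": "tech", "created": 1}},
    "c": {"metadata": {"category": "news", "created": 2}},
    "d": {"metadata": {"category": "tech", "created": 2}},
}

# Evict the two oldest "tech" records; everything else is kept.
delete_matching(
    store,
    predicate=lambda r: r["metadata"]["category"] == "tech",
    sort_key=lambda r: r["metadata"]["created"],
    limit=2,
)
remaining = sorted(store)
```

Sorting before applying the limit is what makes the operation behave like eviction: the limit then removes the "first" records under the chosen order rather than an arbitrary subset.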

retrieve(filters=None, options=None) async

Read records from the datastore with optional filtering.

Usage Example
from gllm_datastore.core.filters import filter as F

# Direct FilterClause usage
results = await fulltext_capability.retrieve(filters=F.eq("metadata.category", "tech"))

# Multiple filters
results = await fulltext_capability.retrieve(
    filters=F.and_(F.eq("metadata.category", "tech"), F.eq("metadata.status", "active"))
)

Parameters:

filters (FilterClause | QueryFilter | None): Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.
options (QueryOptions | None): Query options for sorting and pagination. Defaults to None.

Returns:

list[Chunk]: List of matched chunks after applying filters and options.

retrieve_fuzzy(query, max_distance=2, filters=None, options=None) async

Find records that fuzzy-match the query within the given edit-distance threshold.

Parameters:

query (str): Text to fuzzy match against. Required.
max_distance (int): Maximum edit distance for matches. Defaults to 2.
filters (FilterClause | QueryFilter | None): Optional metadata filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.
options (QueryOptions | None): Query options; only limit is used here. Defaults to None.

Returns:

list[Chunk]: Matched chunks ordered by distance (ascending), limited by options.limit.
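The threshold-and-rank behaviour can be sketched with a plain Levenshtein distance. The source does not say which distance metric is used or whether the query is compared with whole contents or individual words, so the following is a sketch under assumptions: fuzzy_rank is a hypothetical helper, matching is case-insensitive, and each text is scored by its closest whitespace-separated token.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_rank(
    query: str,
    contents: list[str],
    max_distance: int = 2,
    limit: Optional[int] = None,
) -> list[str]:
    """Return texts within max_distance of the query, closest first."""
    needle = query.lower()
    scored = []
    for text in contents:
        # Assumption: score a text by its closest token; empty text scores len(query).
        distance = min(
            (levenshtein(needle, token.lower()) for token in text.split()),
            default=len(needle),
        )
        if distance <= max_distance:
            scored.append((distance, text))
    # Stable sort: ties keep their original relative order.
    scored.sort(key=lambda pair: pair[0])
    results = [text for _, text in scored]
    return results if limit is None else results[:limit]


contents = ["python tips", "pyhton typo", "java guide", "pythons"]
ranked = fuzzy_rank("python", contents, max_distance=2)
top_two = fuzzy_rank("python", contents, max_distance=2, limit=2)
```

Note that plain Levenshtein counts an adjacent transposition ("pyhton") as two edits, so the default threshold of 2 is just enough to catch that kind of typo.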

update(update_values, filters=None) async

Update existing records in the datastore.

Example

Update selected metadata fields on the chunks that match the given filters.

from gllm_datastore.core.filters import filter as F

# Direct FilterClause usage
await fulltext_capability.update(
    update_values={"metadata": {"status": "published"}},
    filters=F.eq("metadata.category", "tech"),
)

# Multiple filters
await fulltext_capability.update(
    update_values={"metadata": {"status": "published"}},
    filters=F.and_(F.eq("metadata.status", "draft"), F.eq("metadata.category", "tech")),
)

Parameters:

update_values (dict[str, Any]): Mapping of fields to new values to apply. Required.
filters (FilterClause | QueryFilter | None): Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.
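One way to read the update_values mapping is sketched below. Whether the library merges nested dictionaries such as metadata or replaces them wholesale is not stated in this page, so the shallow merge here is an assumption; apply_update and its predicate parameter are hypothetical, and the returned count exists only to make the demo checkable.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]


def apply_update(
    store: dict[str, Record],
    update_values: dict[str, Any],
    predicate: Optional[Callable[[Record], bool]] = None,
) -> int:
    """Apply update_values to matching records in place; return how many changed."""
    updated = 0
    for record in store.values():
        if predicate is not None and not predicate(record):
            continue
        for field_name, new_value in update_values.items():
            existing = record.get(field_name)
            if isinstance(existing, dict) and isinstance(new_value, dict):
                # Assumption: dict-valued fields are merged, not replaced.
                existing.update(new_value)
            else:
                record[field_name] = new_value
        updated += 1
    return updated


store = {
    "1": {"content": "x", "metadata": {"category": "tech", "status": "draft"}},
    "2": {"content": "y", "metadata": {"category": "news", "status": "draft"}},
}

changed = apply_update(
    store,
    update_values={"metadata": {"status": "published"}},
    predicate=lambda r: r["metadata"]["category"] == "tech",
)
```

Under the merge assumption, the untouched metadata keys of a matching record (category in this demo) survive the update, which matches the intent of updating only selected metadata fields.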