Enhancement to the weaviate datatypes support (text[], object, object[]) with WeaviateDocumentIndex #1849

@vincetrep


Initial Checks

  • I have searched Google & GitHub for similar requests and couldn't find anything
  • I have read and followed the docs and still think this feature is missing

Description

The translation of docarray data types into Weaviate data types is currently limited.

At the moment, the supported data types are restricted to the ones listed here:
docarray/index/backends/weaviate.py, line 247

        default_column_config: Dict[Any, Dict[str, Any]] = field(
            default_factory=lambda: {
                np.ndarray: {},
                docarray.typing.ID: {},
                'string': {},
                'text': {},
                'int': {},
                'number': {},
                'boolean': {},
                'number[]': {},
                'blob': {},
            }
        )

line 710

        py_weaviate_type_map = {
            docarray.typing.ID: 'string',
            str: 'text',
            int: 'int',
            float: 'number',
            bool: 'boolean',
            np.ndarray: 'number[]',
            bytes: 'blob',
        }

line 197, create_schema

  • Schema creation would need to accommodate the new data types.
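As a rough illustration of what schema creation would have to emit for the new types, here is a minimal sketch that builds Weaviate property definitions as plain dicts. The helper `build_property` and the property names `tags`, `authors`, and `name` are hypothetical and are not part of docarray. The sketch assumes Weaviate's schema format, where array types are written like `text[]` and `object` / `object[]` properties declare their fields under `nestedProperties`.

```python
from typing import Any, Dict, List, Optional


def build_property(
    name: str,
    data_type: str,
    nested: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build one entry for a class's `properties` list."""
    prop: Dict[str, Any] = {"name": name, "dataType": [data_type]}
    if data_type in ("object", "object[]"):
        # Assumption: object types must describe their fields explicitly.
        if not nested:
            raise ValueError(f"{data_type} property {name!r} needs nested properties")
        prop["nestedProperties"] = nested
    return prop


# A list-of-strings property and a list-of-objects property.
tags_property = build_property("tags", "text[]")
authors_property = build_property(
    "authors",
    "object[]",
    nested=[build_property("name", "text")],
)
```

How nested fields would be derived from a docarray document class is left open here; the sketch only shows the shape of the output.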

The lists above cover fewer data types than Weaviate supports:
https://weaviate.io/developers/weaviate/config-refs/datatypes

We would like support for the text[] (list of strings), object, and object[] data types so that we can fully leverage Weaviate storage.
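One possible way to extend the type mapping is sketched below. This is not the docarray implementation: the helper `to_weaviate_type` is hypothetical, and `np.ndarray` and `docarray.typing.ID` are left out so the sketch runs on the standard library alone. It assumes `List[str]` should map to `text[]`, `dict` to `object`, and a list of dicts to `object[]`.

```python
from typing import Any, Dict, List, get_args, get_origin

# Scalar entries mirroring the existing py_weaviate_type_map
# (np.ndarray and docarray.typing.ID omitted in this sketch).
SCALAR_TYPE_MAP = {
    str: 'text',
    int: 'int',
    float: 'number',
    bool: 'boolean',
    bytes: 'blob',
}


def _is_dict_type(py_type: Any) -> bool:
    """True for `dict` and for parametrized forms such as Dict[str, Any]."""
    return py_type is dict or get_origin(py_type) is dict


def to_weaviate_type(py_type: Any) -> str:
    """Hypothetical helper: resolve a Python annotation to a Weaviate type name."""
    if get_origin(py_type) is list:
        (item_type,) = get_args(py_type)
        if item_type is str:
            return 'text[]'
        if _is_dict_type(item_type):
            return 'object[]'
        raise ValueError(f'Unsupported list item type: {item_type!r}')
    if _is_dict_type(py_type):
        return 'object'
    if py_type in SCALAR_TYPE_MAP:
        return SCALAR_TYPE_MAP[py_type]
    raise ValueError(f'Unsupported type: {py_type!r}')


text_array = to_weaviate_type(List[str])
nested_object = to_weaviate_type(Dict[str, Any])
object_array = to_weaviate_type(List[Dict[str, Any]])
```

Resolving the annotation with `get_origin` / `get_args` rather than adding more dictionary keys keeps the scalar map unchanged, which may make the change easier to review.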

One motivation is simpler data storage. Another is being able to use Weaviate's new ContainsAny and ContainsAll filters: https://weaviate.io/developers/weaviate/api/graphql/filters#filter-structure
At the moment we have to serialize our arrays as strings and use Like operators, which is not ideal and hurts query performance.

These filters essentially check conditions against arrays of values. These data types (object[] and text[]) aren't supported when indexing data with the WeaviateDocumentIndex.
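To make the difference concrete, here is a small sketch that builds both kinds of `where` filter as plain dicts, in the shape described on the Weaviate filters page linked above. The helper names and the `tags` property are hypothetical, and the exact Like pattern used in our workaround may differ from the `*value*` form assumed here.

```python
from typing import Any, Dict, List


def like_workaround_filter(prop: str, values: List[str]) -> Dict[str, Any]:
    """Current workaround: the array is stored as one string, so every
    candidate value becomes its own wildcard Like clause, OR-ed together."""
    operands = [
        {"path": [prop], "operator": "Like", "valueText": f"*{value}*"}
        for value in values
    ]
    return {"operator": "Or", "operands": operands}


def contains_any_filter(prop: str, values: List[str]) -> Dict[str, Any]:
    """With a real text[] property, a single ContainsAny clause is enough."""
    return {"path": [prop], "operator": "ContainsAny", "valueText": list(values)}


workaround = like_workaround_filter("tags", ["python", "search"])
native = contains_any_filter("tags", ["python", "search"])
```

The workaround grows by one wildcard clause per value, while the native filter stays a single clause regardless of how many values are passed.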

Affected Components
