Initial Checks
- I have searched Google & GitHub for similar requests and couldn't find anything
- I have read and followed the docs and still think this feature is missing
Description
The translation between docarray data types and Weaviate data types is limited.
Currently the supported data types are restricted to the ones listed here:
docarray/index/backends/weaviate.py line 247
default_column_config: Dict[Any, Dict[str, Any]] = field(
default_factory=lambda: {
np.ndarray: {},
docarray.typing.ID: {},
'string': {},
'text': {},
'int': {},
'number': {},
'boolean': {},
'number[]': {},
'blob': {},
}
)
line 710
py_weaviate_type_map = {
docarray.typing.ID: 'string',
str: 'text',
int: 'int',
float: 'number',
bool: 'boolean',
np.ndarray: 'number[]',
bytes: 'blob',
}
line 197, create_schema: would need to accommodate the new data types in the schema creation.
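To make the request concrete, here is a minimal sketch of how the type lookup could be extended to list and object types. It is not docarray's implementation: `to_weaviate_type`, `SCALAR_TYPE_MAP` and `ARRAY_ITEM_TYPES` are hypothetical names, and the `docarray.typing.ID` and `np.ndarray` entries are left out so the sketch runs without extra dependencies.

```python
from typing import Any, Dict, List, get_args, get_origin

# Scalar entries mirroring the py_weaviate_type_map quoted above.
# docarray.typing.ID and np.ndarray are omitted to keep this dependency-free.
SCALAR_TYPE_MAP = {
    str: 'text',
    int: 'int',
    float: 'number',
    bool: 'boolean',
    bytes: 'blob',
}

# Assumption: only these scalar types get an array form in this sketch.
ARRAY_ITEM_TYPES = (str, int, float, bool)


def is_dict_type(py_type: Any) -> bool:
    """True for plain dict and for parametrised forms such as Dict[str, Any]."""
    return py_type is dict or get_origin(py_type) is dict


def to_weaviate_type(py_type: Any) -> str:
    """Hypothetical helper: map a Python annotation to a Weaviate data type name."""
    if is_dict_type(py_type):
        return 'object'
    if get_origin(py_type) is list:
        args = get_args(py_type)
        if len(args) == 1:
            item_type = args[0]
            if is_dict_type(item_type):
                return 'object[]'
            if item_type in ARRAY_ITEM_TYPES:
                return SCALAR_TYPE_MAP[item_type] + '[]'
        raise ValueError(f'Unsupported list type: {py_type!r}')
    if py_type in SCALAR_TYPE_MAP:
        return SCALAR_TYPE_MAP[py_type]
    raise ValueError(f'Unsupported type: {py_type!r}')


print(to_weaviate_type(List[str]))             # text[]
print(to_weaviate_type(Dict[str, Any]))        # object
print(to_weaviate_type(List[Dict[str, Any]]))  # object[]
```

For `object` and `object[]`, the schema creation would presumably also have to describe the nested properties, which this sketch does not attempt.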
The lists outlined above are more limited than the supported data types in weaviate:
https://weaviate.io/developers/weaviate/config-refs/datatypes
We would like support for the text[] (list of strings), object and object[] data types in order to fully leverage Weaviate's storage.
One of the motivations is simpler data storage, and another is being able to use Weaviate's new ContainsAny and ContainsAll filters: https://weaviate.io/developers/weaviate/api/graphql/filters#filter-structure
At the moment we need to serialize our array as a string and use Like operators, which is not ideal and hurts query performance.
These filters essentially match conditions within arrays of values. These data types (object[] and text[]) aren't supported when indexing data with the WeaviateDocumentIndex.
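As an illustration of the difference, the sketch below builds the two kinds of where filter as plain dictionaries, following the filter structure described on the linked Weaviate page. The property name `tags` and both helper functions are hypothetical, and the exact value key (`valueText` here) should be checked against the Weaviate version in use.

```python
from typing import Any, Dict, List


def like_filter(prop: str, value: str) -> Dict[str, Any]:
    """Current workaround: wildcard match against an array serialized as a string."""
    return {
        'path': [prop],
        'operator': 'Like',
        'valueText': f'*{value}*',
    }


def contains_any_filter(prop: str, values: List[str]) -> Dict[str, Any]:
    """Desired form: match objects whose text[] property contains any of the values."""
    return {
        'path': [prop],
        'operator': 'ContainsAny',
        'valueText': list(values),
    }


# 'tags' is a hypothetical property name used only for illustration.
print(like_filter('tags', 'red'))
print(contains_any_filter('tags', ['red', 'blue']))
```

Besides the performance cost, the wildcard form also matches unrelated values that merely contain the substring (for example `darkred` when searching for `red`), which a filter on a real text[] property would avoid.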