draft Release Notes 0.32.1 #1579

@JoanFM

Release Note

This release contains 4 bug fixes, 1 refactoring and 2 documentation improvements.

⚙ Refactoring

Improve ElasticDocIndex logging (#1551)

More debugging logs have been added inside ElasticDocIndex.

🐞 Bug Fixes

Allow InMemoryExactNNIndex with Optional embedding tensors (#1575)

You can now index Documents whose tensor search_field is Optional. Documents whose embedding is None are skipped when running a search.

import torch
from typing import Optional

from docarray import BaseDoc, DocList
from docarray.typing import TorchTensor
from docarray.index import InMemoryExactNNIndex


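class EmbeddingDoc(BaseDoc):
    embedding: Optional[TorchTensor[768]]


# Documents with an even position get no embedding (None).
docs = DocList[EmbeddingDoc](
    [EmbeddingDoc(embedding=(torch.rand(768) if i % 2 else None)) for i in range(5)]
)
index = InMemoryExactNNIndex[EmbeddingDoc](docs)
# Only Documents with a non-None embedding are considered by the search.
index.find(torch.rand(768), search_field="embedding", limit=3)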
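The skipping behavior can be pictured with a small, library-free sketch. This is an illustration only, not how InMemoryExactNNIndex is implemented: the helper names `cosine` and `find_skipping_none` are hypothetical, and cosine similarity is assumed as the metric.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_skipping_none(
    query: Sequence[float],
    embeddings: List[Optional[Sequence[float]]],
    limit: int,
) -> List[Tuple[int, float]]:
    """Return up to `limit` (position, score) pairs, ignoring None embeddings."""
    scored = [
        (position, cosine(query, embedding))
        for position, embedding in enumerate(embeddings)
        if embedding is not None  # entries without an embedding never match
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


embeddings = [None, [1.0, 0.0], None, [0.0, 1.0], [1.0, 1.0]]
results = find_skipping_none([1.0, 0.0], embeddings, limit=3)
print([position for position, _ in results])
```

Positions 0 and 2 hold None and therefore never appear in the results, regardless of the limit.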

Safe is_subclass check (#1569)

In DocArray, especially when dealing with indexers, field types are checked in a way that leads to calls to Python's built-in issubclass function.
This call fails under some circumstances, for instance when the type being checked is a List or Tuple. Starting with this release, we use a safe version that does not fail in these cases.

This enables the following usage, which would otherwise fail:

from typing import List

from docarray import BaseDoc
from docarray.index import HnswDocumentIndex

class MyDoc(BaseDoc):
    test: List[str]

index = HnswDocumentIndex[MyDoc]()
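The general idea of a safe subclass check can be sketched with the standard library alone. This is a minimal illustration, assuming a simple try/except strategy; the name `safe_issubclass` and its body are not taken from DocArray's source and may differ from what the library actually does.

```python
from typing import Any, List, Tuple


def safe_issubclass(candidate: Any, parent: type) -> bool:
    """Like issubclass, but returns False instead of raising for non-classes."""
    try:
        return issubclass(candidate, parent)
    except TypeError:
        # Subscripted generics such as List[str] are not classes,
        # so the built-in check raises TypeError for them.
        return False


print(safe_issubclass(bool, int))
print(safe_issubclass(List[str], list))
print(safe_issubclass(Tuple[int, str], tuple))
```

Returning False for anything that is not a plain class lets the caller fall through to its handling for other field types instead of crashing during the schema check.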

Fix AnyDoc deserialization (#1571)

AnyDoc is a schema-less special Document that adapts to the schema of the data it tries to load. Previously, deserialization failed when the data contained dictionaries or lists. This is now fixed, so the following works:

from docarray.base_doc import AnyDoc, BaseDoc
from typing import Dict

class ConcreteDoc(BaseDoc):
    text: str
    tags: Dict[str, int]

doc = ConcreteDoc(text='text', tags={'type': 1})

any_doc = AnyDoc.from_protobuf(doc.to_protobuf())
assert any_doc.text == 'text'
assert any_doc.tags == {'type': 1}

dict method for Document view (#1559)

Prior to this fix, doc.dict() would return an empty dictionary if doc.is_view() == True:

from docarray import BaseDoc, DocVec

class MyDoc(BaseDoc):
    foo: int

vec = DocVec[MyDoc]([MyDoc(foo=3)])
# before
doc = vec[0]
assert doc.is_view()
print(doc.dict())
# > {}

# after
doc = vec[0]
assert doc.is_view()
print(doc.dict())
# > {'id': 'f285db406a949a7e7ab084032800f7d8', 'foo': 3}

📗 Documentation Improvements

🀟 Contributors

We would like to thank all contributors to this release:
