DocArray 0.20.1 Update

DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API.

💡 DocArray was released under the open-source Apache License 2.0 in January 2022. It is currently a sandbox project under the LF AI & Data Foundation.

DocArray is the common data layer used in all Jina AI products.
Release Note (0.20.1)

Release time: 2022-12-12 09:32:37

This release contains 2 bug fixes and 1 documentation improvement.

🐞 Bug Fixes

Make Milvus DocumentArray thread safe and suitable for pytest (#904)

This bug caused connectivity issues when multiple DocumentArrays in different threads connected to the same Milvus instance, e.g. in pytest.

This would produce an error like the following:

E1207 14:59:51.357528591    2279 fork_posix.cc:76]           Other threads are currently calling into gRPC, skipping fork() handlers
E1207 14:59:51.367985469    2279 fork_posix.cc:76]           Other threads are currently calling into gRPC, skipping fork() handlers
E1207 14:59:51.457061884    3934 ev_epoll1_linux.cc:824]     assertion failed: gpr_atm_no_barrier_load(&g_active_poller) != (gpr_atm)worker
Fatal Python error: Aborted

This fix creates a separate gRPC connection for each MilvusDocumentArray instance, circumventing the issue.
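The idea of one connection per instance can be illustrated with a small, library-free sketch. This is not the actual patch: `Connection` and `PerInstanceStore` are hypothetical stand-ins for a gRPC channel and a MilvusDocumentArray, and the unique alias scheme is an assumption made for illustration.

```python
import threading
import uuid


class Connection:
    """Hypothetical stand-in for a gRPC channel to a Milvus server."""

    def __init__(self, alias: str):
        self.alias = alias


class PerInstanceStore:
    """Hypothetical store that opens its own connection instead of sharing one."""

    def __init__(self):
        # A unique alias per instance means no two instances share a connection.
        self.connection = Connection(alias=f"conn-{uuid.uuid4().hex}")


def build_stores(n_threads: int = 4):
    """Create one store per thread and collect them in a list."""
    stores = []
    lock = threading.Lock()

    def worker():
        store = PerInstanceStore()
        with lock:  # protect the shared list, not the connections
            stores.append(store)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stores


stores = build_stores()
aliases = {s.connection.alias for s in stores}
print(len(aliases) == len(stores))
```

Because every instance owns its connection, threads never call into a shared channel, which is the situation the error log above points to.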

Restore backwards compatibility for (de)serialization (#903)

DocArray v0.20.0 broke (de)serialization backwards compatibility with earlier versions of the library, making it impossible to load DocumentArrays from v0.19.1 or earlier from disk:

# DocArray <= 0.19.1
from docarray import Document, DocumentArray

da = DocumentArray([Document() for _ in range(10)])
da.save_binary('old-da.docarray')

# DocArray == 0.20.0
from docarray import Document, DocumentArray

da = DocumentArray.load_binary('old-da.docarray')
da.extend([Document()])
print(da)

AttributeError: 'DocumentArrayInMemory' object has no attribute '_is_subindex'

This fix restores backwards compatibility by not relying on newly introduced private attributes:

# DocArray <= 0.19.1
from docarray import Document, DocumentArray

da = DocumentArray([Document() for _ in range(10)])
da.save_binary('old-da.docarray')

# DocArray == 0.20.1
from docarray import Document, DocumentArray

da = DocumentArray.load_binary('old-da.docarray')
da.extend([Document()])
print(da)

<DocumentArray (length=11) at 140683902276416>

Process finished with exit code 0
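One common way to avoid depending on a newly introduced private attribute is to read it with a default. The sketch below is a generic illustration, not the code from #903: `Store` and `restore` are hypothetical, and only the attribute name `_is_subindex` is taken from the error message above.

```python
class Store:
    """Hypothetical container standing in for a DocumentArray-like class."""

    def __init__(self):
        self.items = []
        # Private flag that only objects created by the "newer" version have.
        self._is_subindex = False

    def extend(self, new_items):
        # getattr with a default tolerates objects restored from older
        # serialized state that never stored the flag.
        is_subindex = getattr(self, "_is_subindex", False)
        self.items.extend(new_items)
        return is_subindex


def restore(state):
    """Rebuild a Store from saved attributes without calling __init__,
    which is how pickle-style deserialization typically behaves."""
    obj = Store.__new__(Store)
    obj.__dict__.update(state)
    return obj


# State written by an "older" version: there is no _is_subindex key.
old_state = {"items": [1, 2, 3]}
restored = restore(old_state)
restored.extend([4])
print(len(restored.items))
```

Accessing `restored._is_subindex` directly would raise an `AttributeError` here, which mirrors the failure seen in v0.20.0.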

📗 Documentation Improvement

🤘 Contributors

We would like to thank all contributors to this release:

Engineering Group

