Welcome to DocArray!#

DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer the multi-modal data with a Pythonic API.

🚪 Door to cross-/multi-modal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data. The foundation data structure of Jina, CLIP-as-service, DALL·E Flow, DiscoArt etc.

🧑‍🔬 Data science powerhouse: greatly accelerate data scientists’ work on embedding, k-NN matching, querying, visualizing, evaluating via Torch/TensorFlow/ONNX/PaddlePaddle on CPU/GPU.

🚡 Data in transit: optimized for network communication, ready-to-wire at anytime with fast and compressed serialization in Protobuf, bytes, base64, JSON, CSV, DataFrame. Perfect for streaming and out-of-memory data.

🔎 One-stop k-NN: Unified and consistent API for mainstream vector databases that allows nearest neighboour search including Elasticsearch, Redis, ANNLite, Qdrant, Weaviate.

👒 For modern apps: GraphQL support makes your server versatile on request and response; built-in data validation and JSON Schema (OpenAPI) help you build reliable webservices.

🐍 Pythonic experience: designed to be as easy as a Python list. If you know how to Python, you know how to DocArray. Intuitive idioms and type annotation simplify the code you write.

🛸 Integrate with IDE: pretty-print and visualization on Jupyter notebook & Google Colab; comprehensive auto-complete and type hint in PyCharm & VS Code.

Read more on why should you use DocArray and comparison to alternatives.

Jina in Jina AI neural search ecosystem


PyPI is the latest version.

Make sure you have Python 3.7+ and numpy installed on Linux/Mac/Windows:

pip install docarray

No extra dependency will be installed.

conda install -c conda-forge docarray

No extra dependency will be installed.

pip install "docarray[common]"

The following dependencies will be installed to enable the most common features:


Used in


advanced serialization


compression in seralization


push/pull to Jina Cloud


visualizing image sprites


image data-related IO


used in embedding projector of DocumentArray


used in embedding projector of DocumentArray

pip install "docarray[full]"

In addition to common, the following dependencies will be installed to enable full features:


Used in


for sparse embedding, tensors


for video processing and IO


for 3D mesh processing and IO


for GraphQL support

Alternatively, you can first do basic installation and then install missing dependencies on-demand.

pip install "docarray[full,test]"

This will install all requirements for reproducing tests on your local dev environment.

>>> import docarray
>>> docarray.__version__
>>> from docarray import Document, DocumentArray


Join Us#

DocArray is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers, solution engineers to build the next neural search ecosystem in open-source.

Index | Module Index