Annlite
One can use Annlite as the document store for DocumentArray. It is useful when one wants faster Document retrieval on embeddings, that is, faster .match() and .find() calls.
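To make "retrieval on embeddings" concrete, the sketch below ranks stored vectors by cosine distance to a query using an exhaustive scan. It is plain Python for illustration only and is not how Annlite works internally: an approximate-nearest-neighbour backend exists precisely to avoid scanning every vector. The function names `cosine_distance` and `brute_force_find` are hypothetical and not part of docarray or AnnLite.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_find(query, embeddings, limit=2):
    """Return the indices of the `limit` embeddings closest to `query`."""
    scored = [(cosine_distance(query, emb), idx) for idx, emb in enumerate(embeddings)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:limit]]


# Four toy 2-dimensional embeddings (illustrative values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
nearest = brute_force_find([1.0, 0.0], embeddings, limit=2)
print(nearest)
```

The cost of this scan grows linearly with the number of stored Documents, which is the situation an indexed store is meant to improve.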
Tip
This feature requires annlite. You can install it via pip install "docarray[annlite]".
Usage
One can instantiate a DocumentArray with Annlite storage like so:
from docarray import DocumentArray
da = DocumentArray(storage='annlite', config={'n_dim': 10})
Usage is otherwise the same as for an ordinary DocumentArray.
To access a previously persisted DocumentArray, specify the data_path in config.
from docarray import DocumentArray
da = DocumentArray(storage='annlite', config={'data_path': './data', 'n_dim': 10})
da.summary()
Note that specifying n_dim is mandatory before using Annlite as a backend for DocumentArray.
All other functions behave the same as with an in-memory DocumentArray.
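The two rules above (n_dim is always required, data_path is optional) can be captured in a small helper. This is a minimal sketch under stated assumptions: `annlite_config` and `open_store` are hypothetical names introduced here, not docarray functions, and the positive-integer check is an added safeguard rather than documented behaviour. The docarray import is deferred so the config helper can be used and tested without the package installed.

```python
def annlite_config(n_dim, data_path=None):
    """Build the config dict: n_dim is required, data_path is optional."""
    if not isinstance(n_dim, int) or n_dim <= 0:
        raise ValueError('n_dim must be a positive integer')
    config = {'n_dim': n_dim}
    if data_path is not None:
        # Omitting data_path leaves the store in a random temp folder.
        config['data_path'] = data_path
    return config


def open_store(n_dim, data_path=None):
    """Create or reopen an Annlite-backed DocumentArray."""
    # Imported lazily; requires `pip install "docarray[annlite]"`.
    from docarray import DocumentArray

    return DocumentArray(storage='annlite', config=annlite_config(n_dim, data_path))


fresh = annlite_config(10)
persisted = annlite_config(10, './data')
print(fresh, persisted)
```

Calling `open_store(10, './data')` is then equivalent to the constructor call shown earlier in this section.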
Config
The following configs can be set:
Name | Description | Default |
---|---|---|
n_dim | Number of dimensions of embeddings to be stored and retrieved | This is always required |
data_path | The data folder where the data is located | A random temp folder |
metric | Distance metric to be used during search. Can be 'cosine', 'dot' or 'euclidean' | 'cosine' |
ef_construction | The size of the dynamic list for the nearest neighbors (used during the construction) | None, defaults to the default value in the AnnLite package* |
ef_search | The size of the dynamic list for the nearest neighbors (used during the search) | None, defaults to the default value in the AnnLite package* |
max_connection | The number of bi-directional links created for every new element during construction | None, defaults to the default value in the AnnLite package* |
*You can check the default values in the AnnLite source code.
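As an illustration of how the options in the table fit together, the following sketch builds a config dict that sets every one of them. All values are hypothetical and chosen only for the example; they are not recommendations or AnnLite defaults. The constructor call is left commented out so the snippet runs without docarray installed.

```python
# Metric names as listed in the table above.
ALLOWED_METRICS = ('cosine', 'dot', 'euclidean')

# Hypothetical values, for illustration only.
config = {
    'n_dim': 128,                   # required: embedding dimensionality
    'data_path': './annlite_data',  # optional: where the data is persisted
    'metric': 'euclidean',          # optional: one of ALLOWED_METRICS
    'ef_construction': 200,         # optional: index-construction list size
    'ef_search': 50,                # optional: search-time list size
    'max_connection': 16,           # optional: links per new element
}

if config['metric'] not in ALLOWED_METRICS:
    raise ValueError(f"unsupported metric: {config['metric']!r}")

# With docarray[annlite] installed, the dict is passed as-is:
# from docarray import DocumentArray
# da = DocumentArray(storage='annlite', config=config)
```

Leaving out ef_construction, ef_search, or max_connection falls back to the AnnLite package defaults, as noted in the table.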