Protocol Documentation#

docarray.proto#

DenseNdArrayProto#

Represents a (quantized) dense n-dim array

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| buffer | bytes | | the actual array data, in bytes |
| shape | uint32 | repeated | the shape (dimensions) of the array |
| dtype | string | | the data type of the array |
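
The three fields together are enough to rebuild an array: `buffer` holds the raw bytes, `shape` says how to fold them, and `dtype` says how to read each element. The sketch below illustrates that round trip with the Python standard library only. The helper names (`pack_dense`, `unpack_dense`), the dtype strings, and the little-endian, row-major layout are assumptions made for illustration; they are not taken from the library that produces these messages.

```python
import struct
from math import prod

# Assumed mapping from a dtype string to a struct format character.
_FORMATS = {"float32": "f", "float64": "d", "int32": "i", "int64": "q"}


def pack_dense(values, shape, dtype):
    """Pack a flat, row-major list of values into the three message fields."""
    if prod(shape) != len(values):
        raise ValueError("shape does not match the number of values")
    fmt = "<%d%s" % (len(values), _FORMATS[dtype])  # assumed little-endian
    return {"buffer": struct.pack(fmt, *values), "shape": list(shape), "dtype": dtype}


def unpack_dense(msg):
    """Inverse of pack_dense: return the flat list of values."""
    count = prod(msg["shape"])
    fmt = "<%d%s" % (count, _FORMATS[msg["dtype"]])
    return list(struct.unpack(fmt, msg["buffer"]))


msg = pack_dense([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=(2, 3), dtype="float32")
flat = unpack_dense(msg)
```

Here `msg["buffer"]` is 24 bytes long (six float32 values at four bytes each), and `flat` contains the original six values in row-major order.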

DocumentArrayProto#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| docs | DocumentProto | repeated | a list of Documents |

DocumentProto#

Represents a Document

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| id | string | | a hexdigest that represents a unique document ID |
| blob | bytes | | the raw binary content of this document, often the original document as it enters Jina |
| tensor | NdArrayProto | | the ndarray of the image/audio/video document |
| text | string | | a text document |
| granularity | uint32 | | the depth of the recursive chunk structure |
| adjacency | uint32 | | the width of the recursive match structure |
| parent_id | string | | the parent id from the previous granularity |
| weight | float | | the weight of this document |
| uri | string | | a URI of the document; it can be a local file path, a remote URL starting with http or https, or a data URI |
| modality | string | | an identifier of the modality this document belongs to, used in multi-modal and cross-modal search |
| mime_type | string | | MIME type of this document; required for binary content, and can be guessed for other content |
| offset | float | | the offset of the doc |
| location | float | repeated | the position of the doc; could be the start and end index of a string, the x,y (top, left) coordinates of an image crop, or the timestamp of an audio clip |
| chunks | DocumentProto | repeated | list of the sub-documents of this document (recursive structure) |
| matches | DocumentProto | repeated | the matched documents on the same level (recursive structure) |
| embedding | NdArrayProto | | the embedding of this document |
| tags | google.protobuf.Struct | | a structured data value, consisting of fields that map to dynamically typed values |
| scores | DocumentProto.ScoresEntry | repeated | scores computed on the document; each element corresponds to a metric |
| evaluations | DocumentProto.EvaluationsEntry | repeated | evaluations performed on the document; each element corresponds to a metric |
| _metadata | google.protobuf.Struct | | system-defined meta attributes represented in a structured data value |
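
To make the recursive fields more concrete, the following sketch shows how `chunks`, `parent_id`, `granularity`, and `location` could relate when a text document is split into words. The `Doc` dataclass is a plain-Python stand-in for a handful of the fields above, not the generated protobuf class, and `split_into_chunks` is a hypothetical helper; using `uuid4().hex` for the id is likewise an assumption.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Stand-in for a few DocumentProto fields (illustration only)."""
    text: str = ""
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    parent_id: str = ""
    granularity: int = 0
    location: List[float] = field(default_factory=list)
    chunks: List["Doc"] = field(default_factory=list)


def split_into_chunks(doc: Doc) -> Doc:
    """Attach one chunk per whitespace-separated word, one level below doc."""
    cursor = 0
    for word in doc.text.split():
        start = doc.text.index(word, cursor)
        end = start + len(word)
        doc.chunks.append(
            Doc(
                text=word,
                parent_id=doc.id,                     # link back to the parent
                granularity=doc.granularity + 1,      # one level deeper
                location=[float(start), float(end)],  # start and end index in the parent text
            )
        )
        cursor = end
    return doc


root = split_into_chunks(Doc(text="hello big world"))
```

After this runs, `root` stays at granularity 0 and holds three chunks at granularity 1, each carrying the root's id as `parent_id` and its character span in `location`.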

DocumentProto.EvaluationsEntry#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string | | |
| value | NamedScoreProto | | |

DocumentProto.ScoresEntry#

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| key | string | | |
| value | NamedScoreProto | | |

NamedScoreProto#

Represents a named relevance score computed between a document and ref_id

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| value | float | | the score value |
| op_name | string | | the name of the operator/score function |
| description | string | | text description of the score |
| ref_id | string | | the score is computed between doc id and ref_id |
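
As an illustration of how these fields fit together with the `scores` map of a document (string key, NamedScoreProto value), the sketch below records one cosine-similarity score on a match. The `NamedScore` dataclass and the dictionaries are plain-Python stand-ins, and the choice of `"cosine"` as the map key is an assumption; the actual keys and operator names depend on whatever computes the score.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class NamedScore:
    """Stand-in mirroring the four NamedScoreProto fields."""
    value: float = 0.0
    op_name: str = ""
    description: str = ""
    ref_id: str = ""


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


query = {"id": "q1", "embedding": [1.0, 0.0]}
match = {"id": "m1", "embedding": [1.0, 1.0], "scores": {}}

# One entry per metric: the key names the metric, ref_id names the other document.
match["scores"]["cosine"] = NamedScore(
    value=cosine_similarity(match["embedding"], query["embedding"]),
    op_name="cosine_similarity",
    description="cosine similarity between the match and the query embedding",
    ref_id=query["id"],
)
```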

NdArrayProto#

Represents a general n-dim array, which can be either dense or sparse

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| dense | DenseNdArrayProto | | dense representation of the ndarray |
| sparse | SparseNdArrayProto | | sparse representation of the ndarray |
| cls_name | string | | the name of the ndarray class |
| parameters | google.protobuf.Struct | | |

SparseNdArrayProto#

Represents a sparse ndarray

| Field | Type | Label | Description |
| --- | --- | --- | --- |
| indices | DenseNdArrayProto | | A 2-D int64 tensor of shape [N, ndims], which specifies the indices of the elements in the sparse tensor that contain nonzero values (elements are zero-indexed) |
| values | DenseNdArrayProto | | A 1-D tensor of any type and shape [N], which supplies the values for each element in indices |
| shape | uint32 | repeated | A list of ndims unsigned 32-bit integers, which specifies the shape of the sparse tensor |
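
The indices/values/shape layout described above is the coordinate (COO) form of a sparse array. A minimal sketch of the conversion in both directions for the 2-D case follows. It uses nested Python lists where the message would carry DenseNdArrayProto values, and the function names `to_sparse` and `to_dense` are hypothetical.

```python
def to_sparse(dense_rows):
    """Convert a 2-D list of numbers to COO-style indices, values and shape."""
    n_rows = len(dense_rows)
    n_cols = len(dense_rows[0]) if dense_rows else 0
    indices, values = [], []
    for r, row in enumerate(dense_rows):
        for c, v in enumerate(row):
            if v != 0:
                indices.append([r, c])  # one [row, col] pair per nonzero: shape [N, ndims]
                values.append(v)        # the matching value: shape [N]
    return {"indices": indices, "values": values, "shape": [n_rows, n_cols]}


def to_dense(sparse):
    """Rebuild the 2-D list, filling every position not listed with zero."""
    n_rows, n_cols = sparse["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for (r, c), v in zip(sparse["indices"], sparse["values"]):
        dense[r][c] = v
    return dense


matrix = [[0, 0, 3], [4, 0, 0]]
sparse = to_sparse(matrix)
restored = to_dense(sparse)
```

For this matrix, N is 2 and ndims is 2: `indices` is `[[0, 2], [1, 0]]`, `values` is `[3, 4]`, and `shape` is `[2, 3]`.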

Scalar Value Types#

| .proto Type | Notes | C++ | Java | Python | Go | C# | PHP | Ruby |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| double | | double | double | float | float64 | double | float | Float |
| float | | float | float | float | float32 | float | float | Float |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 | uint | integer | Bignum or Fixnum (as required) |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum or Fixnum (as required) |
| sint32 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int32s. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sint64 | Uses variable-length encoding. Signed int value. These encode negative numbers more efficiently than regular int64s. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 | uint | integer | Bignum or Fixnum (as required) |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 | ulong | integer/string | Bignum |
| sfixed32 | Always four bytes. | int32 | int | int | int32 | int | integer | Bignum or Fixnum (as required) |
| sfixed64 | Always eight bytes. | int64 | long | int/long | int64 | long | integer/string | Bignum |
| bool | | bool | boolean | bool | bool | bool | boolean | TrueClass/FalseClass |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string | string | string | String (UTF-8) |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte | ByteString | string | String (ASCII-8BIT) |
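
The notes on variable-length encoding are easier to follow with the byte counts in front of you. The sketch below reimplements the base-128 varint and ZigZag mappings of the protobuf wire format in plain Python, purely to show why negative int32 values are expensive, why sint32 is not, and where fixed32 starts to win. The function names are hypothetical helpers, not part of any protobuf library, and only the value bytes are produced (no field tags).

```python
import struct


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if n < 0:
        raise ValueError("varints encode non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(n: int) -> bytes:
    """int32 value bytes: negatives are sign-extended to 64 bits before encoding."""
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(n: int) -> bytes:
    """sint32 value bytes: ZigZag-map the value, then varint-encode it."""
    return encode_varint(((n << 1) ^ (n >> 31)) & 0xFFFFFFFF)


def encode_fixed32(n: int) -> bytes:
    """fixed32 value bytes: always four bytes, little-endian."""
    return struct.pack("<I", n)


sizes = {
    "int32(-1)": len(encode_int32(-1)),            # 10 bytes
    "sint32(-1)": len(encode_sint32(-1)),          # 1 byte
    "uint32(2**28 - 1)": len(encode_varint(2**28 - 1)),  # 4 bytes
    "uint32(2**28)": len(encode_varint(2**28)),    # 5 bytes
    "fixed32(2**28)": len(encode_fixed32(2**28)),  # 4 bytes
}
```

A negative int32 always takes ten bytes because it is sign-extended to 64 bits, whereas ZigZag maps small negative numbers to small unsigned ones. From 2^28 upward a uint32 varint needs five bytes, which is the point at which fixed32 becomes the smaller encoding.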