You can freely convert between DocumentArray and pandas.DataFrame; see Serialization for details. You can also load and write CSV files with DocumentArray.
  Documents Summary

  Length                 5
  Homogenous Documents   True
  Common Attributes      ('id', 'tags')

  Attributes Summary

  Attribute   Data type   #Unique values   Has empty value
  ──────────────────────────────────────────────────────────
  id          ('str',)    5                False
  tags        ('dict',)   5                False
You can observe that each row is loaded as a Document and the columns are loaded into Document.tags.
In general, from_csv tries its best to resolve the column names of the table and map them to the corresponding Document attributes. If that attempt fails, you can always resolve the fields manually by passing a field_resolver mapping from column names to attribute names.
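To make the idea of manual field resolution concrete, here is a minimal, library-independent sketch. It is not DocArray's implementation: `resolve_row`, `KNOWN_ATTRIBUTES`, and the sample columns are hypothetical names chosen for illustration. With DocArray itself, the equivalent is assumed to be a call along the lines of `DocumentArray.from_csv(path, field_resolver={'lyrics': 'text'})`.

```python
import csv
import io

# Hypothetical sample table; the column names are made up for illustration.
CSV_TEXT = (
    "title,lyrics\n"
    "Hello,hello from the other side\n"
    "Yesterday,all my troubles seemed so far away\n"
)

# Assumed subset of Document attribute names, used by this sketch only.
KNOWN_ATTRIBUTES = {"id", "text", "uri", "mime_type"}


def resolve_row(row, field_resolver):
    """Rename columns via field_resolver; unmatched columns land in tags."""
    doc = {"tags": {}}
    for column, value in row.items():
        # Fall back to the column's own name when no mapping is given.
        name = field_resolver.get(column, column)
        if name in KNOWN_ATTRIBUTES:
            doc[name] = value
        else:
            doc["tags"][name] = value
    return doc


rows = csv.DictReader(io.StringIO(CSV_TEXT))
docs = [resolve_row(row, {"lyrics": "text"}) for row in rows]
print(docs[0])
```

Here the `lyrics` column is mapped explicitly to `text`, while `title`, which matches no known attribute in this sketch, stays in `tags`.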
Be aware that tabular data is often a poor fit for representing nested Documents. Nested Documents are therefore stored in flattened form.
If your Documents contain tags and you want to store each tag in a separate column, set flatten_tags=True when calling save_csv.
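What storing each tag in its own column means can be sketched with the standard library alone. This is an illustration under stated assumptions, not DocArray's code: Documents are modeled as plain dicts, and `flatten_tags` (the helper function) and the `tag__` column prefix are hypothetical choices for this example. In DocArray the corresponding call is assumed to be `da.save_csv(path, flatten_tags=True)`.

```python
import csv
import io

# Hypothetical Documents represented as plain dicts for this sketch.
docs = [
    {"id": "a", "text": "hello", "tags": {"lang": "en", "year": 2015}},
    {"id": "b", "text": "bonjour", "tags": {"lang": "fr"}},
]


def flatten_tags(doc, prefix="tag__"):
    """Return a flat row: regular fields plus one prefixed column per tag."""
    row = {key: value for key, value in doc.items() if key != "tags"}
    for key, value in doc.get("tags", {}).items():
        row[prefix + key] = value
    return row


rows = [flatten_tags(doc) for doc in docs]

# Union of all columns in first-seen order, so rows missing a tag stay aligned.
fieldnames = list(dict.fromkeys(name for row in rows for name in row))

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Rows that lack a given tag get an empty cell in that column, which keeps the table rectangular.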