24 changes: 11 additions & 13 deletions README.md
@@ -674,41 +674,39 @@ And to seal the deal, let us show you how easily documents slot into your FastAP
```python
import numpy as np
from fastapi import FastAPI
from httpx import AsyncClient

from docarray.base_doc import DocArrayResponse
from docarray import BaseDoc
from docarray.documents import ImageDoc
from docarray.typing import NdArray
from docarray.base_doc import DocArrayResponse


class InputDoc(BaseDoc):
    img: ImageDoc
    text: str


class OutputDoc(BaseDoc):
    embedding_clip: NdArray
    embedding_bert: NdArray


input_doc = InputDoc(img=ImageDoc(tensor=np.zeros((3, 224, 224))))

app = FastAPI()

def model_img(img: ImageTensor) -> NdArray:
    return np.zeros((100, 1))

def model_text(text: str) -> NdArray:
    return np.zeros((100, 1))

@app.post("/doc/", response_model=OutputDoc, response_class=DocArrayResponse)
@app.post("/embed/", response_model=OutputDoc, response_class=DocArrayResponse)
async def create_item(doc: InputDoc) -> OutputDoc:
    ## call my fancy model to generate the embeddings
    doc = OutputDoc(
        embedding_clip=np.zeros((100, 1)), embedding_bert=np.zeros((100, 1))
        embedding_clip=model_img(doc.img.tensor), embedding_bert=model_text(doc.text)
    )
    return doc


async with AsyncClient(app=app, base_url="http://test") as ac:

Member

why are u removing this?

    response = await ac.post("/doc/", data=input_doc.json())
    resp_doc = await ac.get("/docs")
    resp_redoc = await ac.get("/redoc")
    response = await ac.post("/embed/", data=input_doc.json())

```

Just like a vanilla Pydantic model!
18 changes: 10 additions & 8 deletions docs/user_guide/representing/array.md
@@ -256,20 +256,20 @@ This is where the custom syntax `DocList[DocType]` comes into play.
!!! note
`DocList[DocType]` creates a custom [`DocList`][docarray.array.doc_list.doc_list.DocList] that can only contain `DocType` Documents.

This syntax is inspired by more statically typed languages, and even though it might offend Python purists, we believe that it is a good user experience to think of an Array of `BaseDoc`s rather than just an array of non-homogenous `BaseDoc`s.
This syntax is inspired by more statically typed languages, and even though it might offend Python purists, we believe that it is a good user experience to think of an Array of `BaseDoc`s rather than just an array of heterogeneous `BaseDoc`s.

That said, `AnyDocArray` can also be used to create a non-homogenous `AnyDocArray`:
That said, `AnyDocArray` can also be used to create a heterogeneous `AnyDocArray`:

!!! note
The default `DocList` can be used to create a non-homogenous list of `BaseDoc`.
The default `DocList` can be used to create a heterogeneous list of `BaseDoc`.

!!! warning
`DocVec` cannot store non-homogenous `BaseDoc` and always needs the `DocVec[DocType]` syntax.
`DocVec` cannot store heterogeneous `BaseDoc` and always needs the `DocVec[DocType]` syntax.

The usage of a non-homogenous `DocList` is similar to a normal Python list but still offers DocArray functionality
The usage of a heterogeneous `DocList` is similar to a normal Python list but still offers DocArray functionality
like [serialization and sending over the wire](../sending/first_step.md). However, it won't be able to extend the API of your custom schema to the Array level.
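
To make the homogeneous versus heterogeneous distinction concrete, here is a toy sketch in plain Python of how a subscripted container in the spirit of `DocList[DocType]` could reject items of the wrong type while the bare container accepts anything. `TypedList`, `Image`, and `Text` are hypothetical names used only for illustration; this is not DocArray's implementation, and it guards `append` only.

```python
from typing import Any, ClassVar, Iterable, Optional


class TypedList(list):
    """Toy container: `TypedList[T]` only accepts instances of T (hypothetical)."""

    # None means "heterogeneous": items of any type are accepted.
    item_type: ClassVar[Optional[type]] = None

    def __class_getitem__(cls, item_type: type) -> type:
        # Build a subclass that remembers the required item type.
        name = f"{cls.__name__}[{item_type.__name__}]"
        return type(name, (cls,), {"item_type": item_type})

    def __init__(self, items: Iterable[Any] = ()) -> None:
        super().__init__()
        for item in items:
            self.append(item)

    def append(self, item: Any) -> None:
        # Only `append` is checked in this sketch; `extend`, `insert`, etc. are not.
        if self.item_type is not None and not isinstance(item, self.item_type):
            raise TypeError(
                f"expected {self.item_type.__name__}, got {type(item).__name__}"
            )
        super().append(item)


class Image:
    """Stand-in for a document schema (hypothetical)."""


class Text:
    """Stand-in for a second, unrelated schema (hypothetical)."""


homogeneous = TypedList[Image]([Image(), Image()])  # only Image allowed
heterogeneous = TypedList([Image(), Text()])        # anything allowed
```

Appending a `Text` to `homogeneous` raises `TypeError` in this sketch, while `heterogeneous` accepts it. Because the parametrized class knows its item type, it is also the natural place to hang schema-specific behaviour, which an unparametrized container cannot offer.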

Here is how you can instantiate a non-homogenous `DocList`:
Here is how you can instantiate a heterogeneous `DocList`:

```python
from docarray import BaseDoc, DocList
@@ -386,10 +386,10 @@ this means that if you call `docs.image` multiple times, under the hood you will
Let's see how it will work with `DocVec`:

```python
from docarray import DocList
from docarray import DocVec
import numpy as np

docs = DocList[ImageDoc](
docs = DocVec[ImageDoc](
    [ImageDoc(image=np.random.rand(3, 224, 224)) for _ in range(10)]
)

Expand Down Expand Up @@ -460,6 +460,8 @@ Both [`DocList`][docarray.array.doc_list.doc_list.DocList] and [`DocVec`][docarr
Using nested optional fields differs slightly between DocList and DocVec, so watch out. In a nutshell:

When accessing a nested BaseDoc:


* DocList will return a list of documents if the field is optional and a DocList if the field is not optional
* DocVec will return a DocVec if all documents are there, or None if all docs are None. No mix of docs and None allowed!
* DocVec will behave the same for a tensor field instead of a BaseDoc
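
The DocVec rule in the list above (every value present, or every value `None`, never a mix) can be sketched in plain Python. `collect_optional_column` is a hypothetical helper written for this illustration, not part of DocArray, and the choice of `ValueError` for the mixed case is an assumption.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def collect_optional_column(values: Sequence[Optional[T]]) -> Optional[List[T]]:
    """Apply an all-or-none rule to one optional field across many documents.

    Returns the values as a list if every entry is present, None if every
    entry is None, and raises ValueError when present and missing entries mix.
    """
    missing = sum(value is None for value in values)
    if missing == 0:
        return list(values)
    if missing == len(values):
        return None
    raise ValueError(
        f"{missing} of {len(values)} entries are None; a column must be all set or all None"
    )


all_present = collect_optional_column(["a", "b", "c"])
all_missing = collect_optional_column([None, None, None])
```

A column-oriented container needs a rule like this because it stores each field as a single stacked object, so a per-document `None` has nowhere to live. A list-oriented container keeps one object per document and can simply hand back a list that contains `None` entries.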