💫 Release v0.8.0

Release Note (0.8.0)

Release time: 2022-10-12 08:11:40

This release contains 3 new features, 1 performance improvement, and 1 documentation improvement.

🆕 Features

Support large ONNX model files (#828)

Before this release, ONNX model files were limited to 2 GB. We now support large ONNX models archived as zip files, in which the model is split into several smaller ONNX files, one per subgraph. As a result, all of the CLIP models can now be served via onnxruntime.
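For illustration only, such an archive is a plain zip file whose members are the smaller ONNX files; it can be inspected with Python's standard zipfile module. The archive path below is hypothetical, and the exact layout is an implementation detail of clip_server:

import zipfile

# Hypothetical path to a downloaded large-model archive
archive_path = 'ViT-H-14.zip'

with zipfile.ZipFile(archive_path) as zf:
    # Each member is a smaller ONNX file holding one subgraph of the full model
    for member in zf.namelist():
        print(member)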

Support ViT-B-32, ViT-L-14, ViT-H-14 and ViT-g-14 trained on laion-2b (#825)

Users can now serve four new CLIP models from OpenCLIP, trained on the LAION-2B dataset:

  • ViT-B-32::laion2b-s34b-b79k
  • ViT-L-14::laion2b-s32b-b82k
  • ViT-H-14::laion2b-s32b-b79k
  • ViT-g-14::laion2b-s12b-b42k

The ViT-H-14 model achieves 78.0% zero-shot top-1 accuracy on ImageNet and 73.4% zero-shot image retrieval (Recall@5) on MS COCO, making it the best-performing open-source CLIP model. To use one of the new models, simply specify its name, e.g. ViT-H-14::laion2b-s32b-b79k, in the Flow YAML. For example:

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-H-14::laion2b-s32b-b79k
      metas:
        py_modules:
          - clip_server.executors.clip_torch

Please refer to model support to see the full list of supported models.
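Once the Flow above is running, the new model can be queried through clip_client in the usual way. Below is a minimal sketch, assuming the server is reachable locally on the port configured in the YAML above; the example texts are arbitrary:

from clip_client import Client

# Address matches the port set in the Flow YAML above
c = Client('grpc://0.0.0.0:51000')

# Plain strings are accepted as well; the result is one embedding per input
embeddings = c.encode(['a photo of an apple', 'a photo of a banana'])
print(embeddings.shape)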

In-place result in clip_client; preserve output order by uuid (#815)

The clip_client module now supports in-place embedding: the embeddings returned by the CLIP server are written into the input DocumentArray instead of a newly created one. Consequently, the DocumentArray returned by a call to Client.encode has the same order as the input DocumentArray.

Note that this may be a breaking change if your code depends on Client.encode returning a new DocumentArray instance.

If you run the following code, you can verify that the input DocumentArray now contains the embeddings and that the order is unchanged.

from docarray import DocumentArray, Document
from clip_client import Client

c = Client('grpc://0.0.0.0:51000')

# Text and image inputs in various forms, collected in a single DocumentArray
da = DocumentArray(
    [
        Document(text='she smiled, with pain'),
        Document(uri='apple.png'),
        Document(uri='apple.png').load_uri_to_image_tensor(),
        Document(blob=open('apple.png', 'rb').read()),
        Document(uri='https://clip-as-service.jina.ai/_static/favicon.png'),
        Document(
            uri='data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7'
        ),
    ]
)

c.encode(da)  # embeddings are written into `da` in place, order preserved
print(da.embeddings)

🚀 Performance

Drop image content to boost latency (#824)

Calls to Client.encode no longer return the input image content along with the embeddings. Since embeddings are now written into the original DocumentArray instance, sending the images back to the client is unnecessary network traffic. As a result, the system is faster and more responsive; the actual improvement depends on image size and network bandwidth.
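The effect is easiest to observe by timing a request on your own setup. Below is a minimal sketch that reuses the client and the local apple.png image from the example above; the absolute numbers will vary with image size and bandwidth:

import time

from docarray import Document, DocumentArray
from clip_client import Client

c = Client('grpc://0.0.0.0:51000')

# A small batch of image Documents, reusing the local 'apple.png' from above
da = DocumentArray(
    [Document(uri='apple.png').load_uri_to_image_tensor() for _ in range(32)]
)

start = time.perf_counter()
c.encode(da)
print(f'encoded {len(da)} images in {time.perf_counter() - start:.2f}s')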

📗 Documentation Improvements

CLIP benchmark on zero-shot classification and retrieval tasks (#832)

We now provide benchmark results for CLIP models on zero-shot classification and retrieval tasks. This information should help users choose the best CLIP model for their specific use cases. For more details, please read the Benchmark page in the CLIP-as-Service User Guide.

馃 Contributors

We would like to thank all contributors to this release:
felix-wang (@numb3r3)
Ziniu Yu (@ZiniuYu)
Jie Fu (@jemmyshin)