mcvella/viam-moondream-vision-modal

moondream modular vision service

This module implements the rdk vision API in a viam-labs:vision:moondream-modal model.

This model uses Moondream, a tiny vision language model (VLM), to classify and query images. Inference runs on the Modal platform, so you can augment your Viam machines with serverless, cloud-based VLM capabilities.

Build and Run

To use this module, follow the Viam instructions for adding a module from the Viam Registry, then select the viam-labs:vision:moondream-modal model from the viam-labs moondream-vision module.

You will also need to sign up for a Modal account, create a workspace, and then create an API token. The Modal API token ID and secret must then be used in your module configuration.

Configure Modal API Token

In the Viam app, you will need to configure access to your Modal account by setting environment variables for this module. To do so, in CONFIGURE, click on JSON, and within the service configuration for this module, add:

      "env": {
        "MODAL_TOKEN_ID": "YOURTOKENHERE",
        "MODAL_TOKEN_SECRET": "YOURSECRETHERE"
      }
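Before starting the module, it can help to confirm that both variables are actually visible to a process. The following is a minimal sketch, not part of the module: `missing_modal_vars` is a hypothetical helper name, and the only facts it relies on are the two variable names shown above.

```python
import os
from typing import List, Mapping, Optional

# The two environment variables this README says the module needs.
REQUIRED_VARS = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")


def missing_modal_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required Modal variables that are unset or empty.

    Checks os.environ by default; pass a mapping to check something else.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_modal_vars()
    if missing:
        print("Missing Modal settings: " + ", ".join(missing))
    else:
        print("Both Modal token variables are set.")
```

An empty list means both values are present; it does not verify that the token itself is valid.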

Configure your vision service

Note

Before configuring your vision service, you must create a machine.

Navigate to the Config tab of your robot’s page in the Viam app. Click the Services subtab, then click Create service. Select the vision type, then select the viam-labs:vision:moondream-modal model. Enter a name for your vision service and click Create.

On the new service panel, copy and paste the following attribute template into your vision service's Attributes box:

{
}

Note

For more information, see Configure a Robot.

Attributes

The following attributes are available for the viam-labs:vision:moondream-modal model:

Name Type Inclusion Description

Example Configurations

{
}

API

The moondream resource provides the following methods from Viam's built-in rdk:service:vision API:

get_classifications(image=binary, count)

get_classifications_from_camera(camera_name=string, count)

Note: if using this method, any cameras you are using must be set in the depends_on array for the service configuration, for example:

      "depends_on": [
        "cam"
      ]

By default, the Moondream model will be asked the question "describe this image". If you want to ask a different question about the image, you can pass that question as the extra parameter "question". For example:

moondream.get_classifications(image, 1, extra={"question": "what is the person wearing?"})
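The same "question" extra can be passed when querying a camera directly. The sketch below wraps that call in a small helper. It assumes `vision` is a Viam Python SDK `VisionClient` (or any object with the same async `get_classifications_from_camera` method) and that the camera is named "cam", as in the depends_on example. `ask_camera` is a hypothetical name, not part of the module or the SDK.

```python
import asyncio
from typing import Any, Optional


async def ask_camera(
    vision: Any,
    camera_name: str,
    question: Optional[str] = None,
    count: int = 1,
):
    """Ask the vision service about the current frame from camera_name.

    When no question is given, the extra parameter is omitted, so the
    module falls back to its default question ("describe this image").
    """
    extra = {"question": question} if question else None
    return await vision.get_classifications_from_camera(
        camera_name, count, extra=extra
    )
```

With a connected client, usage would look like `await ask_camera(moondream, "cam", "what is the person wearing?")`. The helper returns whatever the service returns, without interpreting it.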
