moondream modular vision service

This module implements the rdk vision API in a mcvella:vision:moondream-modal model.

This model leverages the Moondream tiny vision language model for image classification and visual querying. Inference runs on the Modal serverless platform, letting you augment your Viam machines with cloud-based VLM capabilities.

Build and Run

To use this module, follow these instructions to add a module from the Viam Registry and select the mcvella:vision:moondream-modal model from the moondream-vision module.

You will also need to sign up for a Modal account, create a workspace, and then create an API token. The Modal API token ID and secret must then be used in your module configuration.

Configure Modal API Token

In the Viam app, you will need to configure access to your Modal account by setting environment variables for this module. To do so, in CONFIGURE, click on JSON, and within the service configuration for this module, add:

      "env": {
        "MODAL_TOKEN_ID": "YOURTOKENHERE",
        "MODAL_TOKEN_SECRET": "YOURSECRETHERE"
      }

Configure your vision service

Note

Before configuring your vision service, you must create a machine.

Navigate to the Config tab of your robot’s page in the Viam app. Click on the Service subtab and click Create service. Select the vision type, then select the mcvella:vision:moondream-modal model. Enter a name for your vision service and click Create.

On the new service panel, copy and paste the following attribute template into your vision service's Attributes box:

{
}

Note

For more information, see Configure a Robot.

Attributes

The following attributes are available for the mcvella:vision:moondream-modal model:

| Name | Type | Inclusion | Description |
| ---- | ---- | --------- | ----------- |
| `default_question` | string | optional | For classifications, the default question to ask about the image. Defaults to `"describe this image"`. |
| `default_class` | string | optional | For detections, the default class to detect in the image. Defaults to `"person"`. |
| `gaze_detection` | boolean | optional | If set to true, detections will be gaze detections. Defaults to `false`. |

Example Configurations

{
  "default_question": "what is the person wearing?",
  "default_class": "shoes",
  "gaze_detection": false
}

API

The moondream resource provides the following methods from Viam's built-in rdk:service:vision API:

get_classifications(image=binary, count)

get_classifications_from_camera(camera_name=string, count)

Note: if using this method, any cameras you are using must be set in the depends_on array for the service configuration, for example:

      "depends_on": [
        "cam"
      ]

By default, the Moondream model will be asked the question "describe this image". If you want to ask a different question about the image, you can pass that question as the extra parameter "question". For example:

moondream.get_classifications(image, 1, extra={"question": "what is the person wearing?"})

get_detections(image=binary)

get_detections_from_camera(camera_name=string)

Note: if using this method, any cameras you are using must be set in the depends_on array for the service configuration, for example:

      "depends_on": [
        "cam"
      ]

By default, the Moondream model will look for the class "person". If you want to detect another class, you can pass that class as the extra parameter "class". For example:

moondream.get_detections(image, extra={"class": "shoes"})

To use Moondream's "gaze detection" capabilities, either set gaze_detection to true in your service attribute config, or pass gaze_detection as true to the detections call, for example:

moondream.get_detections(image, extra={"gaze_detection": True})

If gaze_detection is activated, your detections will be returned with classes prefixed face_ and gaze_, where a matching counter suffix pairs each face with where that face is gazing.
