mcvella/viam-auto-label-filter

auto-label-filter modular resource

This module implements the RDK camera API as the mcvella:camera:auto-label-filter model.

With this model you can leverage a configured detector, as well as an optional VLM set up as a classifier (for example Moondream) to automatically capture labeled training images with bounding boxes.

For example, if you wanted to train an ML model to detect specific household pets, you could first set up a detector like Grounding Dino that knows how to spot cats and dogs. Then, you can set up a VLM classifier like Moondream, which can more specifically label dog and cat detections. You can then configure this component's attributes like:

{
  "detector": "grounding-dino",
  "classifier": "moondream",
  "dataset_name": "pets",
  "labels": [
    { "match": "white dog with spots", "label": "fido"},
    { "match": "brown furry dog", "label": "rex"},
    { "match": "black cat", "label": "onyx"},
    { "match": "calico cat", "label": "lemeaux"}
  ]
}

Now, when data capture is activated for this component, any image with a matching detection is stored in the pets dataset in Viam Data Management.

Requirements

At minimum, a CV detector must be set up in your Viam machine. It is recommended that a grounding model like Grounding Dino be used, as it can match a large number of "base" classes, and will do partial matches (like matching "person" in "person wearing glasses").
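The partial-match idea can be illustrated with a minimal Python sketch. This is not the module's actual matching code: the `Detection` class and the `candidate_matches` helper are hypothetical names, and the case-insensitive substring test is an assumption about what "partial match" means here. The 0.4 default mirrors the documented `detector_confidence_threshold`.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical stand-in for a detector result; not the Viam SDK type.
    class_name: str
    confidence: float


def candidate_matches(detections, match_phrases, threshold=0.4):
    """Pair each sufficiently confident detection with phrases containing its class name."""
    pairs = []
    for det in detections:
        # Drop detections below the confidence threshold.
        if det.confidence < threshold:
            continue
        base = det.class_name.lower()
        for phrase in match_phrases:
            # Assumed partial match: the base class appears inside the phrase.
            if base in phrase.lower():
                pairs.append((det, phrase))
    return pairs


detections = [Detection("person", 0.82), Detection("dog", 0.35), Detection("cat", 0.9)]
phrases = ["person wearing glasses", "brown furry dog", "black cat"]
result = [(d.class_name, p) for d, p in candidate_matches(detections, phrases)]
print(result)
```

In this sketch the low-confidence "dog" detection is discarded before any phrase comparison, so only the "person" and "cat" detections are paired with phrases.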

A VLM-based classifier like Moondream is not required, but if you want accurate full matches on more complex classes like "person wearing glasses" or "brown furry dog" you'll need it set up. Note that running these models can be taxing on CPUs/GPUs, so consider this when setting up data capture: depending on the hardware the VLM is running on, you may only be able to capture data at a rate of one image every 5-30 seconds.

The detector and, if used, the classifier must both be configured as dependencies of this camera model.

Note that Viam app credentials and information about the organization, location, and part are also required, as machine resources (components and services) cannot interact with the Viam app without explicit permission.
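As a rough illustration, a configuration could be checked for these required values before it is deployed. The sketch below is an assumption-laden helper, not part of the module: `REQUIRED_ATTRIBUTES` and `missing_attributes` are hypothetical names, and the list of required keys is taken from the Attributes table in this README.

```python
# Attribute names marked "Required" in this README's Attributes table.
REQUIRED_ATTRIBUTES = (
    "detector",
    "camera",
    "labels",
    "org_id",
    "location_id",
    "part_id",
    "app_api_key",
    "app_api_key_id",
)


def missing_attributes(attributes):
    """Return the required attribute names that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not attributes.get(name)]


# Example: a config that omits most of the Viam app credentials.
attributes = {
    "detector": "grounding-dino",
    "camera": "physical-camera",
    "labels": ["person without glasses"],
    "org_id": "abc123",
}
missing = missing_attributes(attributes)
print(missing)
```

An empty list means every required attribute is present. The check does not verify that the credentials themselves are valid.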

Build and run

To use this module, follow the instructions to add a module from the Viam Registry and select the mcvella:camera:auto-label-filter model.

Configuring this component

Note

Before configuring this component, you must create a machine.

Navigate to the Config tab of your machine's page in the Viam app. Click on the Components subtab and click Create component. Select the camera type, then select the mcvella:camera:auto-label-filter model. Click Add module, then enter a name for your camera and click Create.

On the new component panel, copy and paste the following attribute template into your camera’s Attributes box:

{
  "detector": "grounding-dino",
  "classifier": "moondream",
  "camera": "physical-camera",
  "labels": [
    "person without glasses",
    { "match": "person with glasses", "label": "wearing glasses" }
  ],
  "dataset_name": "glasses",
  "detector_confidence_threshold": 0.4,
  "org_id": "abc123",
  "location_id": "xyz213",
  "part_id": "mhj127",
  "app_api_key": "my_app_key",
  "app_api_key_id": "my_api_key_id"
}

Note

For more information, see Configure a Machine.

Attributes

The following attributes are available for mcvella:camera:auto-label-filter cameras:

| Name | Type | Inclusion | Description |
| --- | --- | --- | --- |
| detector | string | Required | Name of the configured detector |
| classifier | string | Optional | Name of the configured VLM classifier. It must accept "question" as an extra parameter |
| camera | string | Required | Name of the physical camera to capture images from |
| labels | list | Required | List of labels to auto-detect and label. Each entry can be a string representing the label or a dictionary in the form {"match": "thing to detect", "label": "what to label it"} |
| dataset_name | string | Optional | Name of the dataset to associate captured images with, if specified. The dataset is created if it does not yet exist |
| detector_confidence_threshold | float | Optional | Minimum confidence for detections. Defaults to 0.4 |
| org_id | string | Required | Viam organization ID in which to store data |
| location_id | string | Required | Viam location ID for which to store data |
| part_id | string | Required | Viam part ID for which to store data |
| app_api_key | string | Required | Viam app API key used to store data in the Viam cloud |
| app_api_key_id | string | Required | Viam app API key ID used to store data in the Viam cloud |
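Because `labels` accepts both plain strings and match/label dictionaries, it can help to picture the two forms reduced to a single shape. The sketch below is illustrative only: `normalize_labels` is a hypothetical helper, and treating a plain string as both the text to match and the label to apply is an assumption, not confirmed behavior of the module.

```python
def normalize_labels(labels):
    """Convert mixed label entries into (match, label) tuples."""
    normalized = []
    for entry in labels:
        if isinstance(entry, str):
            # Assumption: a plain string is both the match text and the label.
            normalized.append((entry, entry))
        elif isinstance(entry, dict) and "match" in entry and "label" in entry:
            normalized.append((entry["match"], entry["label"]))
        else:
            raise ValueError(f"unsupported label entry: {entry!r}")
    return normalized


# The labels list from the attribute template in this README.
labels = [
    "person without glasses",
    {"match": "person with glasses", "label": "wearing glasses"},
]
pairs = normalize_labels(labels)
print(pairs)
```

Rejecting malformed entries up front, rather than skipping them silently, makes a typo such as a missing "label" key visible before any images are captured.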

Next steps

To capture images with labeled bounding boxes, you must enable Viam Data Capture for your configured auto-label-filter component. Any images with matching detections are then stored in Viam Data Management. Note that due to a current limitation, this component can only capture while the machine is connected to the internet.
