adiasg/robot-data-augment


Robot Training Data Augmentation

This repo contains a plug-and-play tool to massively multiply robot training datasets by augmenting training episodes with new scenes. Currently, RLDS-formatted datasets are supported, and Open-X-Embodiment is used as the example in this repo.

Example augmentations: new interactions, replacing objects, new lighting, replacing textures.

Images from training episodes are transformed into new scenes while leaving critical visual aspects such as trajectories and object interactions unchanged. Currently, this is powered by RunwayML's Gen4-Aleph.

Tools are provided for:

  • Downloading the dataset.
  • Extracting videos from the dataset.
  • Generating new videos.
  • (coming soon) Writing new episodes back to the dataset.

Quickstart

Prerequisites

  • Docker is required - this tool is packaged as a container.
  • Set your Replicate API key in .env - the tool calls Replicate's video-to-video generative model APIs:
REPLICATE_API_TOKEN=XXXXXXXXXXX
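For example, a minimal .env can be created and sanity-checked from the shell (the token value is a placeholder - substitute your real key):

```shell
# Write a .env file holding the Replicate API key (placeholder shown).
cat > .env <<'EOF'
REPLICATE_API_TOKEN=XXXXXXXXXXX
EOF

# Sanity check before running generate_video: the key must be present,
# since `docker run --env-file .env` passes it into the container.
grep -q '^REPLICATE_API_TOKEN=' .env && echo "token configured"
```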

Build

docker build -f tool/Dockerfile -t oxe-tool .

Run

  • Make directories for inputs and outputs:
mkdir oxe-datasets videos
  • Download episodes from a sample dataset from Open-X-Embodiment:
docker run --rm \
  -v "$(pwd)/oxe-datasets:/datasets" \
  oxe-tool download_dataset \
  --dataset bridge \
  --max_episodes 50
  • Export videos:
docker run --rm \
  -v "$(pwd)/oxe-datasets:/datasets:ro" \
  -v "$(pwd)/videos:/videos" \
  oxe-tool export_video \
  --dataset bridge \
  --max_episodes 50 --fps 24 --info
  • Generate a transformed video:
docker run --rm \
  --env-file .env \
  -v "$(pwd)/videos:/videos" \
  oxe-tool generate_video \
  --dataset bridge \
  --video-name ep00021.mp4 \
  --prompt "Re-light the scene with a bright white spotlight" \
  --seed 1234

CLI Overview

Run help:

docker run --rm oxe-tool --help
docker run --rm oxe-tool download_dataset --help
docker run --rm oxe-tool export_video --help
docker run --rm oxe-tool generate_video --help

Subcommands and key options:

  • download_dataset

    • Downloads to /datasets (mounted via Docker -v)
    • --dataset (repeatable), or --datasets (comma/space-separated). If not provided, defaults to bridge.
    • --max_episodes: optional integer to limit episodes downloaded per dataset (default: download all episodes).
  • export_video

    • Reads from /datasets, writes to /videos (both mounted via Docker -v)
    • --dataset (repeatable), or --datasets (comma/space-separated). If not provided, defaults to bridge.
    • --split (default train), --max_episodes (default 5), --fps (default 24), --display_key (default image), --info
    • --image_key_choice: Pre-select image key choice (1-based index) for datasets with multiple camera views to avoid interactive prompts
    • For interactive selection in Docker, add -it flags: docker run --rm -it ...
  • generate_video

    • Reads/writes from /videos (mounted via Docker -v)
    • --dataset: dataset name (matches directory name in video structure)
    • --video-name: video filename (e.g., ep00001.mp4)
    • --prompt: text prompt for the model
    • --seed: optional integer for reproducible generations
    • Input video must be 24fps, ≤5s, ≤1MB; aspect ratio must be one of 16:9, 9:16, 4:3, 3:4, 1:1, 21:9
    • Output saved to /videos/{dataset}/generated/{video-name}_generated-{number}.mp4
    • Requires REPLICATE_API_TOKEN in the environment (e.g., --env-file .env).
  • Future commands (coming soon): augment_dataset, to write back the augmented episodes to a dataset.
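The generate_video input constraints (24fps, ≤5s, ≤1MB, fixed aspect-ratio list) can be checked locally before spending API calls. A small sketch - the aspect_ratio helper below is hypothetical and not part of the tool; in practice the width and height would come from a probe tool such as ffprobe:

```shell
# Reduce a width x height pair to its simplest W:H form so it can be
# compared against the supported ratios (16:9, 9:16, 4:3, 3:4, 1:1, 21:9).
aspect_ratio() {
  w=$1; h=$2
  a=$w; b=$h
  # Euclidean algorithm for the greatest common divisor.
  while [ "$b" -ne 0 ]; do
    t=$((a % b)); a=$b; b=$t
  done
  printf '%s:%s\n' $((w / a)) $((h / a))
}

# Width/height could be read from an exported clip with, e.g.:
#   ffprobe -v error -select_streams v:0 \
#     -show_entries stream=width,height -of csv=p=0 videos/bridge/ep00021.mp4
aspect_ratio 1280 720   # 16:9 -> supported
aspect_ratio 640 480    # 4:3  -> supported
```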

Notes

  • The tool downloads the Open-X-Embodiment dataset from the public mirror gs://gresearch/robotics/.
  • File Organization:
    • export_video creates dataset-specific subdirectories: {video-dir}/{dataset}/ep{N}.mp4
    • generate_video creates organized output: {video-dir}/{dataset}/generated/{video}_generated-{N}.mp4
    • Generated videos use automatic numbering to prevent overwrites
  • Video Requirements for AI Generation: 24fps, ≤5 seconds duration, ≤1MB file size, supported aspect ratios only
  • Video-to-Video AI Model: Uses RunwayML's Gen4-Aleph
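The automatic-numbering behavior can be sketched as follows (a hypothetical re-implementation of the naming scheme for illustration, not the tool's actual code):

```shell
# Find the next free "<stem>_generated-<N>.mp4" name in a directory,
# so existing generated videos are never overwritten.
next_generated_name() {
  dir=$1; stem=$2
  n=1
  while [ -e "$dir/${stem}_generated-$n.mp4" ]; do
    n=$((n + 1))
  done
  printf '%s_generated-%s.mp4\n' "$stem" "$n"
}

# Usage: next_generated_name videos/bridge/generated ep00021
```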

About

Augment robot training data with generative media
