This repo contains a plug-and-play tool to massively multiply robot training datasets by augmenting training episodes with new scenes. Currently, RLDS-formatted datasets are supported; Open-X-Embodiment is used as the example in this repo.
New interactions | Replacing objects | New lighting | Replacing textures
Images from training episodes are transformed into new scenes while leaving critical visual aspects, such as trajectories and object interactions, unchanged. Currently, this is powered by RunwayML's Gen4-Aleph.
Tools are provided for:
- Downloading the dataset.
- Extracting videos from the dataset.
- Generating new videos.
- (coming soon) Writing back new episodes to the dataset.
- Docker is required: this tool is packaged as a container.
- Load your Replicate API key into `.env` (video-to-video generative model APIs from Replicate are used):

  ```
  REPLICATE_API_TOKEN=XXXXXXXXXXX
  ```

- Build the image:

  ```
  docker build -f tool/Dockerfile -t oxe-tool .
  ```

- Make directories for inputs and outputs:

  ```
  mkdir oxe-datasets videos
  ```

- Download episodes from a sample dataset from Open-X-Embodiment:

  ```
  docker run --rm \
    -v "$(pwd)/oxe-datasets:/datasets" \
    oxe-tool download_dataset \
    --dataset bridge \
    --max_episodes 50
  ```

- Export videos:

  ```
  docker run --rm \
    -v "$(pwd)/oxe-datasets:/datasets:ro" \
    -v "$(pwd)/videos:/videos" \
    oxe-tool export_video \
    --dataset bridge \
    --max_episodes 50 --fps 24 --info
  ```

- Generate a transformed video:

  ```
  docker run --rm \
    --env-file .env \
    -v "$(pwd)/videos:/videos" \
    oxe-tool generate_video \
    --dataset bridge \
    --video-name ep00021.mp4 \
    --prompt "Re-light the scene with a bright white spotlight" \
    --seed 1234
  ```

Run help:

```
docker run --rm oxe-tool --help
docker run --rm oxe-tool download_dataset --help
docker run --rm oxe-tool export_video --help
docker run --rm oxe-tool generate_video --help
```

Subcommands and key options:
- `download_dataset`
  - Downloads to `/datasets` (mounted via Docker `-v`)
  - `--dataset` (repeatable) or `--datasets` (comma/space-separated); if not provided, defaults to `bridge`
  - `--max_episodes`: optional integer to limit episodes downloaded per dataset (default: download all episodes)
- `export_video`
  - Reads from `/datasets`, writes to `/videos` (both mounted via Docker `-v`)
  - `--dataset` (repeatable) or `--datasets` (comma/space-separated); if not provided, defaults to `bridge`
  - `--split` (default `train`), `--max_episodes` (default `5`), `--fps` (default `24`), `--display_key` (default `image`), `--info`
  - `--image_key_choice`: pre-select an image key (1-based index) for datasets with multiple camera views, to avoid interactive prompts
  - For interactive selection in Docker, add the `-it` flags: `docker run --rm -it ...`
- `generate_video`
  - Reads from and writes to `/videos` (mounted via Docker `-v`)
  - `--dataset`: dataset name (matches the directory name in the video structure)
  - `--video-name`: video filename (e.g., `ep00001.mp4`)
  - `--prompt`: text prompt for the model
  - `--seed`: optional integer for reproducible generations
  - Input video must be 24 fps, ≤5 s, and ≤1 MB; aspect ratio must be one of `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `21:9`
  - Output is saved to `/videos/{dataset}/generated/{video-name}_generated-{number}.mp4`
  - Requires `REPLICATE_API_TOKEN` in the environment (e.g., `--env-file .env`)
- Future commands (coming soon): `augment_dataset`, to write the augmented episodes back to a dataset.
- The tool downloads the Open-X-Embodiment datasets from the public mirror `gs://gresearch/robotics/`.
- File organization:
  - `export_video` creates dataset-specific subdirectories: `{video-dir}/{dataset}/ep{N}.mp4`
  - `generate_video` writes its output to `{video-dir}/{dataset}/generated/{video}_generated-{N}.mp4`
  - Generated videos use automatic numbering to prevent overwrites
- Video requirements for AI generation: 24 fps, ≤5 seconds duration, ≤1 MB file size, supported aspect ratios only
- Video-to-video AI model: RunwayML's Gen4-Aleph
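The input requirements can be checked before submitting a clip. Below is a minimal sketch assuming you have already extracted the clip's metadata (e.g., with `ffprobe`); `check_clip` is a hypothetical helper, not part of the tool, and "1 MB" is taken as 10^6 bytes here:

```python
from fractions import Fraction

# Aspect ratios accepted for generation, per the list above.
SUPPORTED_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "21:9"}

def check_clip(width, height, fps, duration_s, size_bytes):
    """Return a list of violated requirements (empty list = clip looks acceptable)."""
    problems = []
    if round(fps) != 24:
        problems.append(f"fps is {fps}, expected 24")
    if duration_s > 5:
        problems.append(f"duration {duration_s}s exceeds 5s")
    if size_bytes > 1_000_000:  # assumption: 1 MB = 10^6 bytes
        problems.append(f"size {size_bytes} bytes exceeds 1MB")
    ratio = Fraction(width, height)  # reduces e.g. 1280/720 to 16/9
    if f"{ratio.numerator}:{ratio.denominator}" not in SUPPORTED_RATIOS:
        problems.append(f"aspect ratio {width}x{height} not supported")
    return problems
```

For example, `check_clip(1280, 720, 24, 4.0, 800_000)` returns an empty list, while a 30 fps, 6-second clip would be flagged on both counts.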



