Vla training #10

Open · wants to merge 9 commits into master from vla-training

Conversation

@timothygao8710 commented Jul 12, 2024

In this pull request, we fine-tune the open-source Vision-Language-Action model OpenVLA to give Stompy the ability to find the optimal next move based on:

  1. a language instruction ("what action should the robot make to push the cube to the target?"), and
  2. a monocular, third-person picture of the scene (a single 512x512 RGB image capturing the robot, target, cube, and environment).

  • A move is defined in a discrete action space as a 7-DoF vector giving the robot claw's delta x, y, z, roll, yaw, pitch, and pinch.
  • A move is considered "correct" if the predicted output token matches the expected output token for all 7 DoFs; the loss function is defined similarly (see the sketch below).
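As a minimal sketch of that correctness criterion (assuming one discrete token per DoF; the function and tensor names are hypothetical, not the actual OpenVLA API):

```python
import torch

def action_correct(pred_tokens: torch.Tensor, target_tokens: torch.Tensor) -> torch.Tensor:
    """An action counts as correct only if all 7 per-DoF tokens match.

    pred_tokens, target_tokens: (batch, 7) integer tensors of discretized
    actions (dx, dy, dz, droll, dyaw, dpitch, pinch), each token in [0, 255].
    Returns a (batch,) boolean tensor.
    """
    return (pred_tokens == target_tokens).all(dim=-1)

# Batch action accuracy would then be:
# acc = action_correct(pred, target).float().mean()
```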

OpenVLA is not zero-shot: it needs to be fine-tuned for each new (environment, task) pair.

The data processing scripts take in a directory of JSON files, each containing the optimal steps for a single episode, and collate all (current_image, optimal_next_step) tuples into a .h5 dataset. Right now this data is generated via PPO, which has access to all variables in the sim.
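As a rough illustration of that collation step (the JSON schema, file layout, and HDF5 keys below are assumptions, not the repo's exact format):

```python
# Hypothetical collation script: episode["steps"], "image", and
# "optimal_action" are illustrative field names.
import glob
import json

import h5py
import numpy as np

def collate_episodes(json_dir: str, out_path: str) -> None:
    images, actions = [], []
    for path in sorted(glob.glob(f"{json_dir}/*.json")):
        with open(path) as f:
            episode = json.load(f)
        for step in episode["steps"]:
            images.append(np.array(step["image"], dtype=np.uint8))            # 512x512x3 RGB
            actions.append(np.array(step["optimal_action"], dtype=np.int64))  # 7-DoF tokens
    with h5py.File(out_path, "w") as h5:
        h5.create_dataset("images", data=np.stack(images), compression="gzip")
        h5.create_dataset("actions", data=np.stack(actions))
```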

datasets.py loads the .h5 file into a custom PyTorch dataset, which is wrapped in a DataLoader and served with a custom batch size, image transformations, tokenizer, etc. in finetune.py.
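A minimal sketch of that dataset idea, assuming the .h5 layout from the collation example above (class name, keys, and transforms are illustrative, not the actual datasets.py contents):

```python
import h5py
import torch
from torch.utils.data import DataLoader, Dataset

class PushCubeDataset(Dataset):
    def __init__(self, h5_path: str, transform=None):
        self.h5_path = h5_path
        self.transform = transform
        with h5py.File(h5_path, "r") as h5:
            self.length = len(h5["actions"])

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Open per item so the HDF5 handle isn't shared across DataLoader workers.
        with h5py.File(self.h5_path, "r") as h5:
            image = torch.from_numpy(h5["images"][idx])
            action = torch.from_numpy(h5["actions"][idx])
        if self.transform is not None:
            image = self.transform(image)
        return image, action

# In finetune.py this would be wrapped in a DataLoader, e.g.:
# loader = DataLoader(PushCubeDataset("pushcube.h5"), batch_size=8, shuffle=True)
```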

finetune.py further has configuration options for: learning_rate, use_lora, LoRA configs, num_epochs, pretrained_model_path, ...
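For a sense of the shape of that configuration (field names and defaults here are assumptions; the actual options in finetune.py may differ):

```python
# Illustrative config only, not the real finetune.py defaults.
from dataclasses import dataclass

@dataclass
class FinetuneConfig:
    pretrained_model_path: str = "openvla/openvla-7b"
    learning_rate: float = 2e-5
    num_epochs: int = 10
    batch_size: int = 8
    use_lora: bool = True
    lora_rank: int = 32
    lora_alpha: int = 64
    lora_dropout: float = 0.05
```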

Training Observations and Logs:

Our first approach didn't work because we trained on only a single camera angle; we realized we needed to cover a larger distribution for the model to actually generalize, so we randomized the camera angle and the cube and target locations.
After a lot of tweaking and tricks we got to 30% action accuracy on validation, which is decent given that the model has to pick the right action out of (# discrete tokens) ^ (# degrees of freedom) = 256^7 ≈ 7.2×10^16 options for a prediction to count as correct.
But when we tested it, it still doesn't work: it works at the start, but the moment the arm "tweaks" / makes a non-optimal move, it goes out of distribution, because the training data only ever contains optimal trajectories.
So now we're training on more data to get the arm to learn error correction.

[Screenshot 2024-07-12 at 4:02:17 PM] Overfit run: gets around 30% action accuracy on the new, randomized push-cube task.

@codekansas closed this Jul 12, 2024
@codekansas deleted the vla-training branch July 12, 2024 22:37
@codekansas restored the vla-training branch July 12, 2024 22:37
@codekansas deleted the vla-training branch July 12, 2024 22:38
@codekansas restored the vla-training branch July 12, 2024 22:38
@codekansas reopened this Jul 12, 2024
@codekansas (Member)

oops sorry

@timothygao8710 (Author)

No worries! I'm starting to eliminate my competitive programming habits and to be more mindful when I contribute to repos.

@timothygao8710 (Author)

Also, I think this VLA stuff is independent of which sim we choose to use, and of whether we want to adapt it to real life in the future; it only needs access to the image and the next best move. Maybe not putting it in stompy_live is better? I also got rid of the env_norm stuff; that could go in a separate repo too.
