Simulation for validation #7

Closed
fugokidi opened this issue Jun 24, 2024 · 16 comments

@fugokidi

Thank you for open-sourcing the VLA. I am new to robotic models and I do not have a physical robot. Is there any simulation platform to validate a VLA model? May I know how to evaluate a VLA model in the digital domain? Thank you again.

@mbreuss

mbreuss commented Jun 26, 2024

You can test models using SimplerEnv: https://github.com/simpler-env/SimplerEnv
The benchmark contains selected tasks from the Google Fractal dataset and the Berkeley Bridge dataset, both of which are part of the OXE dataset used for pretraining this policy.
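
For readers new to simulated evaluation, the basic loop is the same in any benchmark: reset the environment, feed each observation and the language instruction to the policy, step with the returned action, and count the episodes that end in success. The sketch below illustrates that loop with a toy one-dimensional environment. `ToyReachEnv`, `evaluate`, and both policies are hypothetical stand-ins, not the SimplerEnv or OpenVLA APIs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ToyReachEnv:
    """Hypothetical stand-in for a simulated task (NOT the SimplerEnv API)."""

    goal: float = 1.0
    position: float = 0.0

    def reset(self) -> Tuple[float, str]:
        # Return the initial observation and a language instruction.
        self.position = 0.0
        return self.position, "move to the goal"

    def step(self, action: float) -> Tuple[float, bool]:
        # Apply the action and report whether the task has succeeded.
        self.position += action
        return self.position, self.position >= self.goal


# A policy maps (observation, instruction) to an action.
Policy = Callable[[float, str], float]


def evaluate(policy: Policy, env: ToyReachEnv,
             episodes: int = 10, max_steps: int = 20) -> float:
    """Roll out the policy and return the fraction of successful episodes."""
    successes = 0
    for _ in range(episodes):
        obs, instruction = env.reset()
        for _ in range(max_steps):
            obs, success = env.step(policy(obs, instruction))
            if success:
                successes += 1
                break
    return successes / episodes


def constant_policy(obs: float, instruction: str) -> float:
    # Always moves toward the goal by a fixed amount.
    return 0.25


def idle_policy(obs: float, instruction: str) -> float:
    # Never moves, so it can never reach the goal.
    return 0.0
```

Swapping in a real benchmark would mean replacing `ToyReachEnv` with the benchmark's environment and the policy with a wrapper around the model; the success-rate bookkeeping stays the same.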

@fugokidi
Author

@mbreuss Thanks a lot. I will check it out right away.

@siddk

siddk commented Jun 28, 2024

Thanks @mbreuss for getting here before we did! We don't have "fully vetted" results for the SimplerEnv environments yet - if either you or @fugokidi would be interested in making a PR with that integration, we would definitely welcome it!

Otherwise, @moojink @kpertsch or I will try to get to this as soon as we can!

@fugokidi
Author

@siddk Thank you for considering simulation. I plan to work on it, but the compute at my institution is under maintenance until July 12. If no integration exists by then, I will try to contribute one. If someone can get to it sooner, please go ahead.

@DelinQu

DelinQu commented Jul 4, 2024

I just finished integrating the OpenVLA policy into SimplerEnv and ran an evaluation. The performance is not too bad, though I may have missed some implementation details, e.g., normalization and scales. Please wait for me to submit a PR. 🤗
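
To illustrate why normalization and scales matter here: a policy trained on normalized actions outputs values in roughly [-1, 1], and these have to be mapped back to the robot's action range before they are sent to the simulator. A minimal sketch of one common convention follows, assuming per-dimension 1st and 99th percentile statistics and a mask that leaves some dimensions (for example the gripper) untouched. The function name, arguments, and statistics are hypothetical, and nothing in this thread confirms that this step is the cause of any gap in results.

```python
from typing import List, Sequence


def unnormalize_action(
    normalized: Sequence[float],
    q01: Sequence[float],
    q99: Sequence[float],
    mask: Sequence[bool],
) -> List[float]:
    """Map actions from [-1, 1] back to [q01, q99] per dimension.

    Dimensions where mask is False are passed through unchanged.
    All statistics are assumed to come from the training dataset.
    """
    if not (len(normalized) == len(q01) == len(q99) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [
        0.5 * (a + 1.0) * (hi - lo) + lo if use else a
        for a, lo, hi, use in zip(normalized, q01, q99, mask)
    ]
```

Using statistics from the wrong dataset, or skipping this step, changes the magnitude of every commanded motion, which is one way an otherwise correct integration can underperform.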

@mbreuss

mbreuss commented Jul 4, 2024

@DelinQu very cool! Can you share your results for OpenVLA on the Simpler Benchmark?

@DelinQu

DelinQu commented Jul 4, 2024

Of course! I'm organizing the code and log.

@DelinQu

DelinQu commented Jul 4, 2024

The code is available at SimplerEnv-OpenVLA. I need about one day to complete all the experiments and organize the data before reporting the results. I will open the PR once all the validation is done.

@ZSL98

ZSL98 commented Jul 9, 2024

@DelinQu Cool! Could you please share a link to your results? That would be helpful.

@ZSL98

ZSL98 commented Jul 9, 2024

Oh! I found your PR here

@michaelwan11

@moojink @siddk @kpertsch Thanks for the great work! I was wondering if you have OpenVLA success rates for the SimplerEnv environments? I took a look at the results obtained by @DelinQu here: simpler-env/SimplerEnv#10 and they seem a bit low compared to the real-world results you achieve in the paper, so I was wondering if you obtain similar results to what's in the PR.

@huangmz21

Thank you very much for open-sourcing the VLA. I am new to this and am trying to run finetune.py on a single RTX 4090. I wonder whether the "max_steps" parameter is sufficient for the module but not for various datasets, so that we can stop the training manually.

@xuanlinli17

xuanlinli17 commented Aug 18, 2024

OpenVLA pull request in SIMPLER: simpler-env/SimplerEnv#10

We've run both real & sim OpenVLA-7B evaluations on Google Robot following the same setup as Simpler-Env. Please see the result table at this link

The Bridge setup is pending as there seem to be some remaining implementation issues.

@moojink

moojink commented Sep 5, 2024

Hi all,

Thank you all for your patience!

We recently added new experiments for fine-tuning OpenVLA on the LIBERO simulation benchmark (see Appendix E in v2 of the OpenVLA paper on arXiv). We also added instructions to run fine-tuned OpenVLA checkpoints in LIBERO simulation and reproduce our results in the updated README (see the LIBERO Simulation Benchmark Evaluations section). This should be sufficient for those who wish to try OpenVLA in simulation!

As for the SIMPLER simulation, we believe that the current real-to-sim distribution shift in the input images may be causing an issue in OpenVLA's performance there. We are observing very different behaviors in side-by-side comparisons between OpenVLA in SIMPLER simulated Bridge vs. OpenVLA in real-world Bridge (see this video for examples where OpenVLA fails drastically in SIMPLER simulated episodes but succeeds in the real-world counterparts for the widowx_put_eggplant_in_basket task). We aren't exactly sure why OpenVLA is less robust to the SIMPLER-Bridge distribution shifts than other methods, though the lack of data augmentations during OpenVLA's OpenX pretraining may have something to do with it (we did not apply augmentations during pretraining due to a limited compute budget and because we observed that they did not increase performance in real-world Bridge rollouts).

If you want to train/test OpenVLA on the same data distribution to observe more reliable rollout behaviors, I would personally recommend checking out our new LIBERO experiments and code described above!

-Moo Jin

@Toradus

Toradus commented Sep 5, 2024

Nice work on adding LIBERO!
Is there any way you could share the converted RLDS LIBERO data that you used for the LoRA fine-tuning?

Edit: Or could you make the RLDS builder you used public?

@moojink

moojink commented Sep 5, 2024

Hi @Toradus,

Yes! We just uploaded the modified LIBERO RLDS datasets here: https://huggingface.co/datasets/openvla/modified_libero_rlds

We added a note on how to download them via git in the README; see the command here:

git clone git@hf.co:datasets/openvla/modified_libero_rlds

These datasets are compatible with OpenVLA, Octo, and any other method that can be trained with RLDS datasets.

-Moo Jin
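
For those who prefer Python over git, the same dataset repository can probably be fetched with the `huggingface_hub` client. This is a sketch that assumes `huggingface_hub` is installed; the helper name `download_libero_rlds` and the default target directory are made up for illustration.

```python
from pathlib import Path

# Dataset repository named in the comment above.
REPO_ID = "openvla/modified_libero_rlds"


def download_libero_rlds(local_dir: str = "modified_libero_rlds") -> Path:
    """Download the dataset repository and return the local path.

    Hypothetical helper; requires network access and `huggingface_hub`.
    """
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_libero_rlds()` would place the files under `modified_libero_rlds/` in the current directory; the download is large, so check available disk space first.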
