Pure python training, evaluation and rollout documentation request. #209

Open
redzhepdx opened this issue Feb 17, 2024 · 5 comments

@redzhepdx

Hi everyone,

As a professional who has worked with a few RL frameworks in the past, I can confidently say that this is one of the cleanest, most user-friendly, and most advanced RL libraries I've encountered. In fact, I'm planning to introduce it to my team as our future RL framework, and we're excited to contribute to its development. I especially appreciate the Dreamer implementations and the informative blog posts. Amazing work!

Based on my experience with RL framework development, I have a few recommendations that could make this library even more appealing to a wider range of engineers:

Pure Python Examples:

While I understand the value of Hydra as a tool for configuration management and rapid experimentation, it can be intimidating for newcomers. To lower this barrier and encourage broader adoption, I recommend creating 3-4 pure Python documentation/tutorial examples that demonstrate training, evaluation, and rollout using the library's existing functionality. This approach has been successful in attracting large-scale users to other RL libraries.

Here are some examples that might be helpful:

Similar to: https://github.com/araffin/rl-tutorial-jnrr19/blob/sb3/1_getting_started.ipynb
A bit more complex (but valuable): https://github.com/araffin/rl-tutorial-jnrr19/blob/sb3/1_getting_started.ipynb
Integrating with Isaac Gym: https://docs.omniverse.nvidia.com/isaacsim/latest/isaac_gym_tutorials/tutorial_advanced_rl_stable_baselines.html
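
To make the request concrete, here is a minimal sketch of the train / evaluate / rollout shape such a tutorial could have. It deliberately does not use Sheep-RL's API: `CorridorEnv`, `train`, `evaluate`, and `rollout` are hypothetical names invented for illustration, and the agent is plain tabular Q-learning using only the standard library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos

    def step(self, action):
        # Action 0 moves left (clamped at the wall), action 1 moves right.
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.steps += 1
        reached_goal = self.pos == self.length - 1
        reward = 1.0 if reached_goal else -0.01
        done = reached_goal or self.steps >= self.max_steps
        return self.pos, reward, done


def greedy(q, state):
    # Pick the action with the highest Q-value (ties resolve to action 0).
    return max((0, 1), key=lambda a: q[state][a])


def rollout(env, q, epsilon=0.0, rng=None):
    """Run one episode; return a list of (state, action, reward, next_state, done)."""
    rng = rng or random.Random(0)
    state, done, transitions = env.reset(), False, []
    while not done:
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = greedy(q, state)  # exploit
        next_state, reward, done = env.step(action)
        transitions.append((state, action, reward, next_state, done))
        state = next_state
    return transitions


def train(env, episodes=500, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning: collect an episode, then update from its transitions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        for s, a, r, s2, done in rollout(env, q, epsilon, rng):
            # Simplification: a timeout is treated like a terminal state.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
    return q


def evaluate(env, q, episodes=5):
    """Average undiscounted return of the greedy policy."""
    returns = [sum(t[2] for t in rollout(env, q)) for _ in range(episodes)]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    env = CorridorEnv()
    q_table = train(env)
    print("mean greedy return:", evaluate(env, q_table))
    print("greedy actions:", [t[1] for t in rollout(env, q_table)])
```

A real tutorial would replace the toy environment and Q-table with the library's own agents and environments; the point is only that the three entry points are ordinary functions a newcomer can call from a script or notebook.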

Tips and Tricks:

As we all know, RL algorithms are sensitive to hyperparameters and often require specific techniques like action masking, observation normalization, and reward scaling to be successful on new environments. Given the library's advanced capabilities with World Models, sharing insights and best practices on these topics would be incredibly valuable to the community (including myself!). Here are some examples from other libraries:

https://stable-baselines.readthedocs.io/en/master/guide/rl_tips.html
https://maze-rl.readthedocs.io/en/latest/best_practices_and_tutorials/tricks_of_the_trade.html
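
As a rough illustration of three of the techniques mentioned above, here is a small standard-library sketch. It is not taken from Sheep-RL or from the linked guides: `RunningNorm`, `scale_reward`, and `masked_softmax` are hypothetical helper names, and the reward scaling shown (dividing by a running standard deviation of raw rewards) is a simplified variant of what libraries typically do with discounted returns.

```python
import math


class RunningNorm:
    """Running mean/variance via Welford's algorithm, for scalar inputs."""

    def __init__(self, eps=1e-8):
        self.count, self.mean, self.m2, self.eps = 0, 0.0, 0.0, eps

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Population standard deviation; fall back to 1.0 until two samples exist.
        return math.sqrt(self.m2 / self.count) if self.count > 1 else 1.0

    def normalize(self, x):
        """Observation normalization: zero mean, unit variance."""
        return (x - self.mean) / (self.std + self.eps)


def scale_reward(reward, norm):
    """Reward scaling: divide by the running std only, so the sign is preserved."""
    return reward / (norm.std + norm.eps)


def masked_softmax(logits, mask):
    """Action masking: softmax over valid actions, probability 0 for invalid ones."""
    valid = [l for l, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("at least one action must be valid")
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, after feeding a `RunningNorm` the observations 1, 2, 3, 4, calling `normalize(2.5)` returns 0.0, and `masked_softmax([1.0, 2.0, 3.0], [True, False, True])` assigns zero probability to the middle action.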

Transitioning to Hydra:

Once users become comfortable with the library's fundamentals, they'll naturally progress towards exploring scalability and advanced experimentation, which is where Hydra shines. Consider creating a separate tutorial or example notebook showcasing how to leverage Hydra and Sheep-RL's train and evaluate functionalities to achieve this transition smoothly.
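
One way such a tutorial could bridge the gap is to show how a plain Python config dict relates to dotted command-line overrides of the kind Hydra accepts. The sketch below is a toy stand-in written with the standard library, not Hydra itself: `apply_overrides` and `parse_value` are hypothetical helpers, the config keys are made up rather than taken from Sheep-RL's configs, and real Hydra has stricter rules (for example, adding a new key requires a `+` prefix).

```python
import copy


def parse_value(raw):
    """Best-effort conversion of an override string to int, float, bool, or str."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with 'a.b.c=value' style overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested dict, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return result


if __name__ == "__main__":
    # Hypothetical defaults, as a newcomer might write them in pure Python.
    base = {"env": {"id": "CartPole-v1"}, "algo": {"lr": 0.001, "total_steps": 1000}}
    cfg = apply_overrides(base, ["algo.lr=0.0003", "env.id=Pendulum-v1", "seed=42"])
    print(cfg)
```

Seeing the nested dict and the dotted overrides side by side may make the later move to Hydra's config files and command-line syntax feel less abrupt.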

I hope you find these recommendations helpful. Best of luck to the developers!

@belerico
Member

Hi @redzhepdx! Thank you for the suggestions, we really appreciate them!
We can definitely have something similar to this and this: what do you think @michele-milesi?
As for contributing, we still have to add a how to contribute.md, but if you want somewhere to start, there is an old issue about implementing DQN and its variants.
Thank you

@michele-milesi
Member

Hi there,
@belerico, yes, we can start with something similar to the two examples you mentioned.
For the environment part, I think we can try to recycle this. Or are you thinking of using a more complex environment (like this)?

@redzhepdx
Author

Hi @michele-milesi ,
I believe the complexity of the environment matters little. You can use any environment, but I would recommend something like crawler, or any of the MuJoCo or classic Gym environments, to show the framework's capabilities on decently challenging cases that anyone can still test locally.

Thanks a lot for your prompt reaction to this topic.

@verityw

verityw commented Mar 26, 2024

Is there any update on this? Would really appreciate a pure Python example to use for research, to better integrate my existing stable-baselines code with!

@michele-milesi
Member

Hi @verityw,
we are fixing a few problems we found with half-precision training. After this, we will move on to pure python examples.
Thank you for your patience.


4 participants