Notebooks to demonstrate ART (Agent Reinforcement Trainer) in practice!
Agent Task | Example Notebook | Description | Comparative Performance |
---|---|---|---|
ART•E LangGraph | 🏋️ Train agent | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon] |
MCP•RL | 🏋️ Train agent | Qwen 2.5 3B masters the NWS MCP server | [Link coming soon] |
ART•E [RULER] | 🏋️ Train agent | Qwen 2.5 7B learns to search emails using RULER | |
2048 | 🏋️ Train agent | Qwen 2.5 3B learns to play 2048 | |
Temporal Clue | 🏋️ Train agent | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
Tic Tac Toe | 🏋️ Train agent | Qwen 2.5 3B learns to play Tic Tac Toe | |
Codenames | 🏋️ Train agent | Qwen 2.5 3B learns to play Codenames | |
AutoRL [RULER] | 🏋️ Train agent | Train Qwen 2.5 7B to master any task | [Link coming soon] |
ART should work with most causal language models that are compatible with vLLM and Hugging Face Transformers, or at least the ones supported by Unsloth. Gemma 3 does not appear to be supported for the time being. If any other model isn't working for you, please let us know on Discord or open an issue on GitHub!
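As a rough pre-check before starting a training run, you can inspect a locally downloaded checkpoint's `config.json`. The sketch below is not part of ART: the helper names and the `KNOWN_UNSUPPORTED_PREFIXES` list are hypothetical. It assumes the usual Hugging Face config layout, with `model_type` and `architectures` fields. A passing result only suggests the model is a causal LM outside the known-unsupported list; it does not guarantee that ART or Unsloth can train it.

```python
import json
from pathlib import Path

# Assumption: families this README flags as unsupported, matched against the
# `model_type` prefix in a Hugging Face config.json. Hypothetical list.
KNOWN_UNSUPPORTED_PREFIXES = ("gemma3",)


def looks_like_supported_causal_lm(config: dict) -> bool:
    """Heuristic only: True if the config describes a causal LM that is not
    on the known-unsupported list. It does not prove the model will train."""
    model_type = str(config.get("model_type", "")).lower()
    if model_type.startswith(KNOWN_UNSUPPORTED_PREFIXES):
        return False
    architectures = config.get("architectures") or []
    # Causal LM classes in Transformers conventionally end in "ForCausalLM".
    return any(str(name).endswith("ForCausalLM") for name in architectures)


def check_local_model(model_dir: str) -> bool:
    """Read <model_dir>/config.json from a locally downloaded checkpoint."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return looks_like_supported_causal_lm(json.load(f))
```

For example, `check_local_model("path/to/checkpoint")` reads that directory's `config.json` and applies the heuristic. If a model passes this check but still fails to train, that is worth reporting.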
ART is in active development, and contributions are most welcome! Please see the CONTRIBUTING.md file for more information.
This repository's source code is available under the Apache-2.0 License.
ART stands on the shoulders of giants. While we owe many of the ideas and early experiments that led to ART's development to the open source RL community at large, we're especially grateful to the authors of the following projects:
Finally, thank you to our partners who've helped us test ART in the wild! We're excited to see what you all build with it.