RLtools Example: Implementing a Custom Environment

This example implements a full pendulum environment (including a visualization) in 191 lines of code (115 for the environment + 76 for the UI). You can find the implementation in include/my_pendulum.

UNIX (Linux / macOS)

git clone https://github.com/rl-tools/example
cd example
git submodule update --init external/rl_tools
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .
./my_pendulum

Windows

On Windows, use cmake --build . --config Release to build the executable. If you have Intel MKL/oneMKL installed, use the Command Prompt (not PowerShell) and initialize the MKL environment variables before running the commands:

"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
git clone https://github.com/rl-tools/example
cd example
git submodule update --init external/rl_tools
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release
Release\my_pendulum.exe

BLAS Backend & Optimizations

This example should work out of the box, but it is relatively slow if you do not activate one of the BLAS backends. To dispatch to the optimized BLAS routines, uncomment the set(RL_TOOLS_BACKEND_ENABLE_XXX ON) line in CMakeLists.txt that matches your platform, then re-run the previous cmake configuration and build steps. This should be substantially faster. To make it even faster, you can disable the evaluation episodes by enabling the BENCHMARK flag in CMakeLists.txt. You can also experiment with compiler options like -Ofast and -march=native, depending on your CPU architecture. For reference, a run takes about 600 ms on an i9-10885H on Linux (around 1.5 s on Windows, for reasons that are unclear) and about 300 ms on an Apple M3.

Experiment Tracking

This example also includes the automatic experiment tracking available through the Loop Interface. More information about the experiment tracking can be found in the docs under Experiment Tracking. To enable the visualization of trajectories for a custom environment, three functions are required (these are defined in operations_cpu.h because they depend on std::string):

Parameters => JSON: Self-explanatory
State => JSON: Self-explanatory
UI Render function string: This function uses the HTML5 Canvas rendering API and can be easily created using https://studio.rl.tools. Note that because the HTML5 Canvas drawing API is so widely used, ChatGPT is also quite good at creating render functions for different environments if you give it an example like the ones provided on https://studio.rl.tools.
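The three functions above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the RLtools API: the struct names, field names (g, max_torque, theta, theta_dot), function names, and the signature of the JavaScript render function are all assumptions made for this example. The actual signatures expected by the library are the ones in operations_cpu.h, and the expected shape of the render function is shown in the examples on https://studio.rl.tools.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for the environment's parameter and state types.
// The real types are defined in include/my_pendulum.
struct PendulumParameters {
    double g;          // gravitational acceleration (assumed field)
    double max_torque; // torque limit (assumed field)
};

struct PendulumState {
    double theta;     // pendulum angle (assumed field)
    double theta_dot; // angular velocity (assumed field)
};

// Parameters => JSON: serialize the parameters into a JSON object string.
std::string parameters_to_json(const PendulumParameters& p) {
    std::ostringstream out;
    out << "{\"g\": " << p.g << ", \"max_torque\": " << p.max_torque << "}";
    return out.str();
}

// State => JSON: serialize one state of a trajectory into a JSON object string.
std::string state_to_json(const PendulumState& s) {
    std::ostringstream out;
    out << "{\"theta\": " << s.theta << ", \"theta_dot\": " << s.theta_dot << "}";
    return out.str();
}

// UI render function string: JavaScript source returned as a C++ string.
// The function name and its arguments are placeholders; only the Canvas
// calls (beginPath, moveTo, lineTo, stroke) are standard HTML5 Canvas API.
std::string render_function_string() {
    return R"JS(
function render(ctx, state) {
    const cx = ctx.canvas.width / 2;
    const cy = ctx.canvas.height / 2;
    const length = Math.min(cx, cy) * 0.8;
    ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
    ctx.beginPath();
    ctx.moveTo(cx, cy);
    ctx.lineTo(cx + length * Math.sin(state.theta), cy - length * Math.cos(state.theta));
    ctx.stroke();
}
)JS";
}
```

Building the strings with std::ostringstream keeps the sketch dependency-free, which matches the reason the source gives for placing these functions in operations_cpu.h: they rely on std::string and therefore belong with the CPU-only operations.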

The experiment tracking and save-trajectories steps periodically record trajectories and store them as .json files. During or after training, you can run python3 -m http.server 8080 to start a local web server on http://localhost:8080 (without the port argument, http.server listens on port 8000 instead), where you can view the recorded trajectories drawn by the render function you provided.
