This repository provides the "Starting Kit" to participate in the NeurIPS 2019 Robot open-Ended Autonomous Learning (REAL) competition.
The installation requires Python 3.5+. The Starting Kit was tested on Ubuntu (>= Ubuntu 16.04), but it can also be run on other operating systems.
To install the REAL Competition Starting Kit on Linux:

- install the gym and pybullet packages:

  ```
  pip install gym pybullet pyopengl
  ```

- download the REALCompetitionStartingKit repository:

  ```
  git clone https://github.com/GOAL-Robots/REALCompetitionStartingKit.git
  ```

- install the REALCompetitionStartingKit package:

  ```
  cd REALCompetitionStartingKit
  pip install -e .
  ```
To install the REAL Competition Starting Kit on Windows, in an Anaconda environment:

- install Microsoft Visual C++ 14 (Community) from https://visualstudio.microsoft.com/visual-cpp-build-tools/

- install Anaconda for Windows from https://www.anaconda.com/distribution/#windows

- create a Python virtual environment:

  ```
  conda create -n pyenv numpy pip
  ```

- activate the virtual environment:

  ```
  conda activate pyenv
  ```

- install the gym and pybullet packages:

  ```
  pip install gym pybullet pyopengl
  ```

- download the REALCompetitionStartingKit repository:

  ```
  git clone https://github.com/GOAL-Robots/REALCompetitionStartingKit.git
  ```

- install the REALCompetitionStartingKit package:

  ```
  cd REALCompetitionStartingKit
  pip install -e .
  ```
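On either platform you can quickly check that the installation worked. The sketch below assumes that the `realcomp` package installed above registers the `REALComp-v0` environment id (the id used in the examples that follow) when it is imported:

```python
# Quick sanity check of the installation (either platform).
# Assumption: importing the realcomp package registers "REALComp-v0" with gym.
import gym
import realcomp  # noqa: F401  (the import registers the environment)

print(gym.spec("REALComp-v0"))  # should print the registered environment spec
```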
The environment is a standard gym environment and can be used on its own, as shown here:
```python
import gym
import realcomp  # assumed import that registers REALComp-v0

env = gym.make('REALComp-v0')
observation = env.reset()
reward, done = 0, False
for t in range(10):
    # Call your controller to choose an action
    action = controller.step(observation, reward, done)
    # Do the action
    observation, reward, done, _ = env.step(action)
```
where the controller is any object with a `step()` method returning an action vector. An example controller is given by this simple class:
```python
import numpy as np

class FakePolicy:
    """
    A fake controller choosing random actions
    """
    def __init__(self, action_space):
        self.action_space = action_space
        self.action = np.zeros(action_space.shape[0])

    def step(self, observation, reward, done):
        """
        Returns a vector of random values
        """
        self.action += 0.1*np.pi*np.random.randn(self.action_space.shape[0])
        return self.action
```
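For instance, continuing from the imports and the `FakePolicy` class above, the controller can be plugged into the loop shown earlier (a minimal sketch; the ten-step horizon is just for illustration):

```python
# Minimal usage sketch: drive the environment with the random FakePolicy above.
env = gym.make('REALComp-v0')
controller = FakePolicy(env.action_space)  # the controller is built from the action space

observation = env.reset()
reward, done = 0, False
for t in range(10):
    action = controller.step(observation, reward, done)
    observation, reward, done, _ = env.step(action)
```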
The environment includes a 7-DoF Kuka arm with a 2-DoF gripper, a table with 3 objects on it, and a camera looking at the table from the top. The gripper has four touch sensors on the inner part of its links.
The `action` argument passed to `env.step` must be a vector of 9 joint positions, in radians.
The first 7 joints have a range between -Pi/2 and +Pi/2.
The two gripper joints have a range between 0 and +Pi/2; they are also coupled so that the second joint will be at most twice the angle of the first one.
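For example, an arbitrary 9-dimensional vector can be clipped to these ranges before being sent to the environment. This is only a hedged sketch based on the limits listed above (the helper name is illustrative, and `env.action_space` bounds should be considered authoritative):

```python
import numpy as np

def clip_to_joint_limits(action):
    """Clip a 9-dimensional action to the joint ranges described above."""
    action = np.asarray(action, dtype=float)
    low = np.array([-np.pi / 2] * 7 + [0.0, 0.0])   # 7 arm joints, then 2 gripper joints
    high = np.array([np.pi / 2] * 9)                # all upper limits are +Pi/2
    return np.clip(action, low, high)

# Continuing from the env created above:
safe_action = clip_to_joint_limits(np.random.randn(9))
observation, reward, done, _ = env.step(safe_action)
```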
The `observation` object returned by `env.step` is a dictionary:

- `observation["joint_positions"]` is a vector containing the current angles of the 9 joints
- `observation["touch_sensors"]` is a vector containing the current touch intensity at the four touch sensors (see figure below)
- `observation["retina"]` is a 240x320x3 array with the current top camera image
- `observation["goal"]` is a 240x320x3 array with the target top camera image (all zeros except during the extrinsic phase, see the task description below)
*(Figure: the positions of the four touch sensors on the gripper.)*
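A short sketch of how these observation fields can be read after a step (continuing from the loop above; the printed shapes are what the description above implies):

```python
# Inspect the observation returned by env.step (fields listed above).
joints = observation["joint_positions"]   # 9 joint angles, in radians
touch = observation["touch_sensors"]      # 4 touch intensities
retina = observation["retina"]            # current camera image, 240x320x3
goal = observation["goal"]                # target camera image, 240x320x3

print(retina.shape, goal.shape)           # expected: (240, 320, 3) (240, 320, 3)
print("touching something:", bool((touch > 0).any()))
```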
The `reward` value returned by `env.step` is always 0.

The `done` value returned by `env.step` is set to `True` only when a phase is concluded (see below, intrinsic and extrinsic phases).
The environment can also be used as a sandbox. The realcomp_env specs explain the methods that let you read the objects, links, contacts, and other internal state. Using those methods in the final version of your controller is not permitted, but they can be useful while testing.
A complete simulation for the REAL Competition is made of two phases:
- Intrinsic phase: No goal is given and the controller can do whatever it needs to explore and learn something from the environment. This phase will last 10 million timesteps.
- Extrinsic phase: divided into trials. On each trial a goal is given and the controller must choose the actions that modify the environment so that the state corresponding to the goal is reached within 1000 timesteps (the goal image can be used to tell the two phases apart, as sketched below).
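The controller needs no extra signal to know which phase it is in: as noted above, `observation["goal"]` is all zeros during the intrinsic phase and contains the target image during extrinsic trials. Below is a hedged sketch of a controller skeleton built on that fact; the random exploration and the zero goal-directed action are placeholders, not a suggested strategy:

```python
import numpy as np

class PhaseAwareController:
    """Illustrative skeleton: switch behaviour depending on the current phase."""

    def __init__(self, action_space):
        self.action_space = action_space
        self.action = np.zeros(action_space.shape[0])

    def step(self, observation, reward, done):
        if not np.any(observation["goal"]):
            # Intrinsic phase: the goal image is all zeros, so explore freely
            # (placeholder: a random walk, as in FakePolicy above).
            self.action += 0.1 * np.pi * np.random.randn(self.action.shape[0])
        else:
            # Extrinsic phase: a goal image is given; a real controller would
            # choose actions that drive the scene towards that image
            # (placeholder: do nothing).
            self.action = np.zeros_like(self.action)
        return self.action
```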
`realcomp/task/demo.py` runs the entire simulation. Participants are expected to replace the `MyController` object in `realcomp/task/my_controller.py` with their own controller object.
Running `demo.py` also returns an extrinsic score for local evaluation.