Arenas being skipped during training #57

Open
Talha1337 opened this issue Dec 25, 2024 · 1 comment

Comments

@Talha1337

Describe the bug

When running the Training Mode code from the Launching Animal-AI guide with a configuration file containing several arenas, Animal-AI skips an arena every time a new arena is loaded. It also appears to skip arena 0 on initialisation.

To Reproduce
Steps to reproduce the behavior:
Use a YAML file with an even number of arenas, giving the even-numbered arenas a distinguishing characteristic (here, the agent starts high up at y = 5, so it is obvious an even-numbered arena was loaded because the agent begins the episode by falling):

!ArenaConfig
arenas:
  0: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 20.0, y: 0, z: 24.22989401000175}
      rotations: [0]
      sizes:
      - !Vector3 {x: 1.042745657461917, y: 1.042745657461917, z: 1.042745657461917}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 5, z: 20}
      rotations: [270]
    
  1: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 17.558805678810632, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.7303252438561296, y: 0.7303252438561296, z: 0.7303252438561296}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 0, z: 20}
      rotations: [270]
    
  2: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 16.936613438939432, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.518046959702215, y: 0.518046959702215, z: 0.518046959702215}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 5, z: 20}
      rotations: [90]
      
  3: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 16.936613438939432, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.518046959702215, y: 0.518046959702215, z: 0.518046959702215}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 0, z: 20}
      rotations: [90]

Run the Animal-AI training script with the paths replaced by the path to the YAML file and the path to the Animal-AI .exe file:

# Import the necessary libraries
import random

import torch as th
from stable_baselines3 import PPO

from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
from animalai.environment import AnimalAIEnvironment

def train_agent_single_config(configuration_file, env_path , log_bool = False, aai_seed = 2023, watch = False, num_steps = 10000, num_eval = 100):
    
    port = 5005 + random.randint(0, 1000)  # use a random port to avoid problems if a previous instance exits slowly
    
    # Create the environment and wrap it...
    aai_env = AnimalAIEnvironment( # the environment object
        seed = aai_seed, # seed for the pseudo random generators
        file_name=env_path,
        arenas_configurations=configuration_file,
        play=False, # note that this is set to False for training
        base_port=port, # the port to use for communication between python and the Unity environment
        inference=watch, # set to True if you want to watch the agent play
        useCamera=True, # set to False if you don't want to use the camera (no visual observations)
        resolution=64,
        useRayCasts=False, # set to True if you want to use raycasts
        no_graphics=False, # set to True if you don't want to use the graphics ('headless' mode)
        timescale=3
    )

    env = UnityToGymWrapper(aai_env, uint8_visual=True, allow_multiple_obs=False, flatten_branched=True) # the wrapper for the environment
    
    runname = "optional_run_name" # the name of the run, used for logging

    policy_kwargs = dict(activation_fn=th.nn.ReLU) # the policy kwargs for the PPO agent, such as the activation function
    
    model = PPO("CnnPolicy", env, policy_kwargs=policy_kwargs, verbose=1, tensorboard_log="./tensorboardLogs") 
    # verbosity level: 0 for no output, 1 for info messages (such as device or wrappers used), 2 for debug messages
    
    model.learn(num_steps, reset_num_timesteps=False)
    env.close()

# IMPORTANT! Replace the path to the application and the configuration file with the correct paths here:
env_path = r"..\WINDOWS\AAI\Animal-AI.exe"
configuration_file = r"example_batch_eval.yaml"

rewards = train_agent_single_config(configuration_file=configuration_file, env_path = env_path, watch = True, num_steps = 10000, num_eval = 3000)

Expected behavior
Arenas should load in order (0, 1, 2, 3, ...). Instead, arena 0 is skipped on initialisation and only even-numbered arenas are loaded, which can be seen from the fact that the agent begins every episode by falling (only the even-numbered arenas place the agent at y = 5). In this case the arenas load as 2, 0, 2, 0, 2, 0, ...
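
The skipping can also be checked without training at all. The sketch below is not from the guide; the episode loop and step cap are my own assumptions. It builds the same wrapped environment with inference=True and resets it a few times, taking random actions each episode, so the loaded arena can be identified by whether the agent starts by falling.

# Diagnostic sketch (not part of the guide): reset the wrapped environment a few
# times with random actions so the arena-skipping pattern can be observed.
import random

from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
from animalai.environment import AnimalAIEnvironment

env_path = r"..\WINDOWS\AAI\Animal-AI.exe"        # same paths as in the training script
configuration_file = r"example_batch_eval.yaml"

aai_env = AnimalAIEnvironment(
    seed=2023,
    file_name=env_path,
    arenas_configurations=configuration_file,
    play=False,
    base_port=5005 + random.randint(0, 1000),
    inference=True,        # watch each episode to see which arena was loaded
    useCamera=True,
    resolution=64,
    useRayCasts=False,
    no_graphics=False,
    timescale=1,
)
env = UnityToGymWrapper(aai_env, uint8_visual=True, allow_multiple_obs=False, flatten_branched=True)

for episode in range(6):        # with 4 arenas this should cycle 0, 1, 2, 3, 0, 1 if nothing is skipped
    obs = env.reset()
    done, steps = False, 0
    while not done and steps < 50:   # a few random steps per episode are enough to see the agent fall (or not)
        obs, reward, done, info = env.step(env.action_space.sample())
        steps += 1
    print(f"episode {episode} finished after {steps} steps")

env.close()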

Desktop (please complete the following information):

  • OS: Windows
  • Browser: Chrome
  • Version 131.0.6778.140

Additional context
Animal-AI version 4.1.0


Hello there! Thanks for submitting an issue. We really appreciate your time and effort in making Animal-AI better!
