Arenas being skipped during training #57

Open
Talha1337 opened this issue Dec 25, 2024 · 1 comment

Comments

@Talha1337

Describe the bug

When running the Training Mode code from the Launching Animal-AI guide with a configuration file containing several arenas, Animal-AI skips an arena every time a new arena is loaded. It also appears to skip arena 0 on initialisation.

To Reproduce
Steps to reproduce the behavior:
Use a YAML file with an even number of arenas, giving the even-numbered arenas a distinguishing characteristic (here, the agent starts high up at y = 5, so it is obvious an even-numbered arena was loaded because the agent begins the episode by falling):

!ArenaConfig
arenas:
  0: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 20.0, y: 0, z: 24.22989401000175}
      rotations: [0]
      sizes:
      - !Vector3 {x: 1.042745657461917, y: 1.042745657461917, z: 1.042745657461917}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 5, z: 20}
      rotations: [270]
    
  1: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 17.558805678810632, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.7303252438561296, y: 0.7303252438561296, z: 0.7303252438561296}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 0, z: 20}
      rotations: [270]
    
  2: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 16.936613438939432, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.518046959702215, y: 0.518046959702215, z: 0.518046959702215}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 5, z: 20}
      rotations: [90]
      
  3: !Arena
    timeLimit: 25
    items:
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: 16.936613438939432, y: 0, z: 20.0}
      rotations: [0]
      sizes:
      - !Vector3 {x: 0.518046959702215, y: 0.518046959702215, z: 0.518046959702215}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: 20, y: 0, z: 20}
      rotations: [90]

Run the Animal-AI training script with the paths replaced by the path to the YAML file and the path to the Animal-AI .exe file:

# Import the necessary libraries
import random

import torch as th
from stable_baselines3 import PPO

from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
from animalai.environment import AnimalAIEnvironment

def train_agent_single_config(configuration_file, env_path , log_bool = False, aai_seed = 2023, watch = False, num_steps = 10000, num_eval = 100):
    
    port = 5005 + random.randint(0, 1000)  # use a random port to avoid problems if a previous instance exits slowly
    
    # Create the environment and wrap it...
    aai_env = AnimalAIEnvironment( # the environment object
        seed = aai_seed, # seed for the pseudo random generators
        file_name=env_path,
        arenas_configurations=configuration_file,
        play=False, # note that this is set to False for training
        base_port=port, # the port to use for communication between python and the Unity environment
        inference=watch, # set to True if you want to watch the agent play
        useCamera=True, # set to False if you don't want to use the camera (no visual observations)
        resolution=64,
        useRayCasts=False, # set to True if you want to use raycasts
        no_graphics=False, # set to True if you don't want to use the graphics ('headless' mode)
        timescale=3
    )

    env = UnityToGymWrapper(aai_env, uint8_visual=True, allow_multiple_obs=False, flatten_branched=True) # the wrapper for the environment
    
    runname = "optional_run_name" # the name of the run, used for logging

    policy_kwargs = dict(activation_fn=th.nn.ReLU) # the policy kwargs for the PPO agent, such as the activation function
    
    model = PPO("CnnPolicy", env, policy_kwargs=policy_kwargs, verbose=1, tensorboard_log="./tensorboardLogs") 
    # verbosity level: 0 for no output, 1 for info messages (such as device or wrappers used), 2 for debug messages
    
    model.learn(num_steps, reset_num_timesteps=False)
    env.close()

# IMPORTANT! Replace the path to the application and the configuration file with the correct paths here:
env_path = r"..\WINDOWS\AAI\Animal-AI.exe"
configuration_file = r"example_batch_eval.yaml"

rewards = train_agent_single_config(configuration_file=configuration_file, env_path = env_path, watch = True, num_steps = 10000, num_eval = 3000)

Expected behavior
Arenas should load in order (0, 1, 2, 3, ...). Instead, arena 0 is skipped on initialisation and only even-numbered arenas are loaded, which can be seen from the fact that the agent begins every episode by falling (only the even-numbered arenas place the agent at y = 5). In this case the arenas load as 2, 0, 2, 0, 2, 0, ...
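
The skipping can also be checked without training at all. The sketch below is not from the guide; the episode loop and step cap are my own assumptions. It builds the same wrapped environment with inference=True and resets it a few times, taking random actions each episode, so the loaded arena can be identified by whether the agent starts by falling.

# Diagnostic sketch (not part of the guide): reset the wrapped environment a few
# times with random actions so the arena-skipping pattern can be observed.
import random

from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
from animalai.environment import AnimalAIEnvironment

env_path = r"..\WINDOWS\AAI\Animal-AI.exe"        # same paths as in the training script
configuration_file = r"example_batch_eval.yaml"

aai_env = AnimalAIEnvironment(
    seed=2023,
    file_name=env_path,
    arenas_configurations=configuration_file,
    play=False,
    base_port=5005 + random.randint(0, 1000),
    inference=True,        # watch each episode to see which arena was loaded
    useCamera=True,
    resolution=64,
    useRayCasts=False,
    no_graphics=False,
    timescale=1,
)
env = UnityToGymWrapper(aai_env, uint8_visual=True, allow_multiple_obs=False, flatten_branched=True)

for episode in range(6):        # with 4 arenas this should cycle 0, 1, 2, 3, 0, 1 if nothing is skipped
    obs = env.reset()
    done, steps = False, 0
    while not done and steps < 50:   # a few random steps per episode are enough to see the agent fall (or not)
        obs, reward, done, info = env.step(env.action_space.sample())
        steps += 1
    print(f"episode {episode} finished after {steps} steps")

env.close()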

Desktop (please complete the following information):

  • OS: Windows
  • Browser: Chrome
  • Version 131.0.6778.140

Additional context
Animal-AI version 4.1.0


Hello there! Thanks for submitting an issue. We really appreciate your time and effort in making Animal-AI better!
