diff --git a/docs/Learning-Environment-Design-Agents.md b/docs/Learning-Environment-Design-Agents.md index 38b4387f24..f47440c7d8 100644 --- a/docs/Learning-Environment-Design-Agents.md +++ b/docs/Learning-Environment-Design-Agents.md @@ -620,6 +620,7 @@ the order of the entities, so there is no need to properly "order" the entities before feeding them into the `BufferSensor`. The `BufferSensorComponent` Editor inspector has two arguments: + - `Observation Size` : This is how many floats each entities will be represented with. This number is fixed and all entities must have the same representation. For example, if the entities you want to diff --git a/docs/Learning-Environment-Examples.md b/docs/Learning-Environment-Examples.md index 22fb7fcd37..37d4ed0671 100644 --- a/docs/Learning-Environment-Examples.md +++ b/docs/Learning-Environment-Examples.md @@ -231,7 +231,7 @@ you would like to contribute environments, please see our objects around agent's forward direction (40 by 40 with 6 different categories). - Actions: - 3 continuous actions correspond to Forward Motion, Side Motion and Rotation - - 1 discrete acion branch for Laser with 2 possible actions corresponding to + - 1 discrete action branch for Laser with 2 possible actions corresponding to Shoot Laser or No Action - Visual Observations (Optional): First-person camera per-agent, plus one vector flag representing the frozen state of the agent. This scene uses a combination diff --git a/docs/ML-Agents-Overview.md b/docs/ML-Agents-Overview.md index 3680dbdc72..0bdee003f2 100644 --- a/docs/ML-Agents-Overview.md +++ b/docs/ML-Agents-Overview.md @@ -434,6 +434,7 @@ Similarly to Curiosity, Random Network Distillation (RND) is useful in sparse or reward environments as it helps the Agent explore. The RND Module is implemented following the paper [Exploration by Random Network Distillation](https://arxiv.org/abs/1810.12894). RND uses two networks: + - The first is a network with fixed random weights that takes observations as inputs and generates an encoding - The second is a network with similar architecture that is trained to predict the @@ -491,9 +492,9 @@ to the expert, the agent is incentivized to remain alive for as long as possible This can directly conflict with goal-oriented tasks like our PushBlock or Pyramids example environments where an agent must reach a goal state thus ending the episode as quickly as possible. In these cases, we strongly recommend that you -use a low strength GAIL reward signal and a sparse extrinisic signal when +use a low strength GAIL reward signal and a sparse extrinsic signal when the agent achieves the task. This way, the GAIL reward signal will guide the -agent until it discovers the extrnisic signal and will not overpower it. If the +agent until it discovers the extrinsic signal and will not overpower it. If the agent appears to be ignoring the extrinsic reward signal, you should reduce the strength of GAIL. diff --git a/docs/Migrating.md b/docs/Migrating.md index e2a71b9f97..6c2cd48143 100644 --- a/docs/Migrating.md +++ b/docs/Migrating.md @@ -21,7 +21,7 @@ from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper ## Migrating the package to version 2.x - The official version of Unity ML-Agents supports is now 2022.3 LTS. If you run - into issues, please consider deleting your project's Library folder and reponening your + into issues, please consider deleting your project's Library folder and reopening your project. 
- If you used any of the APIs that were deprecated before version 2.0, you need to use their replacement. These deprecated APIs have been removed. See the migration steps bellow for specific API replacements. @@ -130,7 +130,7 @@ values from `GetMaxBoardSize()`. ### GridSensor changes The sensor configuration has changed: -* The sensor implementation has been refactored and exsisting GridSensor created from extension package +* The sensor implementation has been refactored and existing GridSensor created from extension package will not work in newer version. Some errors might show up when loading the old sensor in the scene. You'll need to remove the old sensor and create a new GridSensor. * These parameters names have changed but still refer to the same concept in the sensor: `GridNumSide` -> `GridSize`, @@ -151,8 +151,8 @@ data type changed from `float` to `int`. The index of first detectable tag will * The observation data should be written to the input `dataBuffer` instead of creating and returning a new array. * Removed the constraint of all data required to be normalized. You should specify it in `IsDataNormalized()`. Sensors with non-normalized data cannot use PNG compression type. -* The sensor will not further encode the data recieved from `GetObjectData()` anymore. The values -recieved from `GetObjectData()` will be the observation sent to the trainer. +* The sensor will not further encode the data received from `GetObjectData()` anymore. The values +received from `GetObjectData()` will be the observation sent to the trainer. ### LSTM models from previous releases no longer supported The way that Sentis processes LSTM (recurrent neural networks) has changed. As a result, models @@ -169,7 +169,7 @@ the model using the python trainer from this release. - `VectorSensor.AddObservation(IEnumerable)` is deprecated. Use `VectorSensor.AddObservation(IList)` instead. - `ObservationWriter.AddRange()` is deprecated. Use `ObservationWriter.AddList()` instead. -- `ActuatorComponent.CreateAcuator()` is deprecated. Please use override `ActuatorComponent.CreateActuators` +- `ActuatorComponent.CreateActuator()` is deprecated. Please use override `ActuatorComponent.CreateActuators` instead. Since `ActuatorComponent.CreateActuator()` is abstract, you will still need to override it in your class until it is removed. It is only ever called if you don't override `ActuatorComponent.CreateActuators`. You can suppress the warnings by surrounding the method with the following pragma: @@ -376,7 +376,7 @@ vector observations to be used simultaneously. method names will be removed in a later release: - `InitializeAgent()` was renamed to `Initialize()` - `AgentAction()` was renamed to `OnActionReceived()` - - `AgentReset()` was renamed to `OnEpsiodeBegin()` + - `AgentReset()` was renamed to `OnEpisodeBegin()` - `Done()` was renamed to `EndEpisode()` - `GiveModel()` was renamed to `SetModel()` - The `IFloatProperties` interface has been removed. @@ -532,7 +532,7 @@ vector observations to be used simultaneously. depended on [PEP420](https://www.python.org/dev/peps/pep-0420/), which caused problems with some of our tooling such as mypy and pylint. - The official version of Unity ML-Agents supports is now 2022.3 LTS. If you run - into issues, please consider deleting your library folder and reponening your + into issues, please consider deleting your library folder and reopening your projects. You will need to install the Sentis package into your project in order to ML-Agents to compile correctly. 
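The RND hunk in `docs/ML-Agents-Overview.md` above describes the two networks in prose only. The following is a minimal PyTorch sketch of that idea, not the ML-Agents trainer implementation: the layer sizes, optimizer, and `rnd_bonus` helper are illustrative assumptions, and only the general scheme (a frozen random target network, a trained predictor, and the prediction error used as an exploration bonus) comes from the doc.

```python
# Sketch of Random Network Distillation (RND) as described in ML-Agents-Overview.md.
# Not the ML-Agents trainer code; all shapes and hyperparameters are arbitrary.
import torch
import torch.nn as nn

obs_size, enc_size = 8, 16
# First network: fixed random weights, maps observations to an encoding.
target = nn.Sequential(nn.Linear(obs_size, 32), nn.ReLU(), nn.Linear(32, enc_size))
for p in target.parameters():
    p.requires_grad = False
# Second network: same architecture, trained to predict the target's encoding.
predictor = nn.Sequential(nn.Linear(obs_size, 32), nn.ReLU(), nn.Linear(32, enc_size))
optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-3)

def rnd_bonus(obs: torch.Tensor) -> torch.Tensor:
    """Return a per-observation exploration bonus and take one predictor update step."""
    with torch.no_grad():
        target_enc = target(obs)
    error = (predictor(obs) - target_enc).pow(2).mean(dim=-1)
    optimizer.zero_grad()
    error.mean().backward()
    optimizer.step()
    # Rarely visited observations keep a large prediction error, hence a large bonus.
    return error.detach()

print(rnd_bonus(torch.randn(4, obs_size)))
```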
diff --git a/docs/Package-Settings.md b/docs/Package-Settings.md index 4bc00973c4..d796e52de2 100644 --- a/docs/Package-Settings.md +++ b/docs/Package-Settings.md @@ -9,7 +9,7 @@ You can find them at `Edit` > `Project Settings...` > `ML-Agents`. It lists out ## Create Custom Settings In order to to use your own settings for your project, you'll need to create a settings asset. -You can do this by clicking the `Create Settings Asset` buttom or clicking the gear on the top right and select `New Settings Asset...`. +You can do this by clicking the `Create Settings Asset` button or clicking the gear on the top right and select `New Settings Asset...`. The asset file can be placed anywhere in the `Asset/` folder in your project. After Creating the settings asset, you'll be able to modify the settings for your project and your settings will be saved in the asset. @@ -21,7 +21,7 @@ You can create multiple settings assets in one project. By clicking the gear on the top right you'll see all available settings listed in the drop-down menu to choose from. -This allows you to create different settings for different scenatios. For example, you can create two +This allows you to create different settings for different scenarios. For example, you can create two separate settings for training and inference, and specify which one you want to use according to what you're currently running. ![Multiple Settings](images/multiple-settings.png) diff --git a/docs/Profiling-Python.md b/docs/Profiling-Python.md index 0e69aefcd0..21bc529423 100644 --- a/docs/Profiling-Python.md +++ b/docs/Profiling-Python.md @@ -1,6 +1,6 @@ # Profiling in Python -As part of the ML-Agents Tookit, we provide a lightweight profiling system, in +As part of the ML-Agents Toolkit, we provide a lightweight profiling system, in order to identity hotspots in the training process and help spot regressions from changes. diff --git a/docs/Python-Custom-Trainer-Plugin.md b/docs/Python-Custom-Trainer-Plugin.md index 2881d7d506..4c78bfc513 100644 --- a/docs/Python-Custom-Trainer-Plugin.md +++ b/docs/Python-Custom-Trainer-Plugin.md @@ -5,7 +5,7 @@ capabilities. we introduce an extensible plugin system to define new trainers ba in `Ml-agents` Package. This will allow rerouting `mlagents-learn` CLI to custom trainers and extending the config files with hyper-parameters specific to your new trainers. We will expose a high-level extensible trainer (both on-policy, and off-policy trainers) optimizer and hyperparameter classes with documentation for the use of this plugin. For more -infromation on how python plugin system works see [Plugin interfaces](Training-Plugins.md). +information on how python plugin system works see [Plugin interfaces](Training-Plugins.md). ## Overview Model-free RL algorithms generally fall into two broad categories: on-policy and off-policy. On-policy algorithms perform updates based on data gathered from the current policy. Off-policy algorithms learn a Q function from a buffer of previous data, then use this Q function to make decisions. Off-policy algorithms have three key benefits in the context of ML-Agents: They tend to use fewer samples than on-policy as they can pull and re-use data from the buffer many times. They allow player demonstrations to be inserted in-line with RL data into the buffer, enabling new ways of doing imitation learning by streaming player data. 
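Since the `docs/Profiling-Python.md` hunk above only touches the intro sentence, here is a hedged sketch of how the profiling system it describes is typically used. The `timed` decorator, `hierarchical_timer` context manager, and `get_timer_tree` helper are assumed to be importable from `mlagents_envs.timers` as in the current package; treat the exact names and the `process_trajectory`/`training_step` functions as illustrative assumptions.

```python
# Sketch: instrumenting a hot path with the mlagents_envs profiling helpers.
from mlagents_envs.timers import hierarchical_timer, timed, get_timer_tree

@timed
def process_trajectory(trajectory):
    # Placeholder for per-trajectory work (advantage estimation, buffering, ...).
    return sum(trajectory)

def training_step(batches):
    # Nested timers show up as children of "training_step" in the timer tree.
    with hierarchical_timer("training_step"):
        for batch in batches:
            process_trajectory(batch)

training_step([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
# The accumulated timings can be inspected (or logged) as a nested dictionary.
print(get_timer_tree())
```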
diff --git a/docs/Python-Gym-API.md b/docs/Python-Gym-API.md index 50051195ed..97869899ce 100644 --- a/docs/Python-Gym-API.md +++ b/docs/Python-Gym-API.md @@ -11,7 +11,7 @@ Unity environment via Python. ## Installation -The gym wrapper is part of the `mlgents_envs` package. Please refer to the +The gym wrapper is part of the `mlagents_envs` package. Please refer to the [mlagents_envs installation instructions](ML-Agents-Envs-README.md). diff --git a/docs/Python-LLAPI-Documentation.md b/docs/Python-LLAPI-Documentation.md index 640c4ddb99..9cba2f9c07 100644 --- a/docs/Python-LLAPI-Documentation.md +++ b/docs/Python-LLAPI-Documentation.md @@ -678,7 +678,7 @@ of downloading the Unity Editor. The UnityEnvRegistry implements a Map, to access an entry of the Registry, use: ```python registry = UnityEnvRegistry() -entry = registry[] +entry = registry[] ``` An entry has the following properties : * `identifier` : Uniquely identifies this environment @@ -689,7 +689,7 @@ An entry has the following properties : To launch a Unity environment from a registry entry, use the `make` method: ```python registry = UnityEnvRegistry() -env = registry[].make() +env = registry[].make() ``` diff --git a/docs/Python-On-Off-Policy-Trainer-Documentation.md b/docs/Python-On-Off-Policy-Trainer-Documentation.md index 4f13bdf72e..e2fc7770c7 100644 --- a/docs/Python-On-Off-Policy-Trainer-Documentation.md +++ b/docs/Python-On-Off-Policy-Trainer-Documentation.md @@ -694,7 +694,7 @@ class Lesson() ``` Gathers the data of one lesson for one environment parameter including its name, -the condition that must be fullfiled for the lesson to be completed and a sampler +the condition that must be fulfilled for the lesson to be completed and a sampler for the environment parameter. If the completion_criteria is None, then this is the last lesson in the curriculum. diff --git a/docs/Python-Optimizer-Documentation.md b/docs/Python-Optimizer-Documentation.md index 9b7e1b993c..7cdfaec832 100644 --- a/docs/Python-Optimizer-Documentation.md +++ b/docs/Python-Optimizer-Documentation.md @@ -43,8 +43,8 @@ Get value estimates and memories for a trajectory, in batch form. **Arguments**: - `batch`: An AgentBuffer that consists of a trajectory. -- `next_obs`: the next observation (after the trajectory). Used for boostrapping - if this is not a termiinal trajectory. +- `next_obs`: the next observation (after the trajectory). Used for bootstrapping + if this is not a terminal trajectory. - `done`: Set true if this is a terminal trajectory. - `agent_id`: Agent ID of the agent that this trajectory belongs to. diff --git a/docs/Python-PettingZoo-API.md b/docs/Python-PettingZoo-API.md index b78c311569..2c62ed8415 100644 --- a/docs/Python-PettingZoo-API.md +++ b/docs/Python-PettingZoo-API.md @@ -7,7 +7,7 @@ interfacing with a Unity environment via Python. ## Installation and Examples -The PettingZoo wrapper is part of the `mlgents_envs` package. Please refer to the +The PettingZoo wrapper is part of the `mlagents_envs` package. Please refer to the [mlagents_envs installation instructions](ML-Agents-Envs-README.md). 
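Both wrappers mentioned above ship in `mlagents_envs`. As a quick orientation, here is a hedged sketch of the gym-style usage; the import path matches the one shown in `docs/Migrating.md` earlier in this patch, while the executable path and the episode loop are placeholder assumptions (the wrapper expects a single-behavior environment).

```python
# Sketch: driving a Unity build through the gym wrapper from mlagents_envs.
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

unity_env = UnityEnvironment(file_name="path/to/your/build")  # hypothetical executable path
env = UnityToGymWrapper(unity_env)

obs = env.reset()
for _ in range(100):
    # Random actions, just to exercise the classic gym step API.
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()
```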
[[Colab] PettingZoo Wrapper Example](https://colab.research.google.com/github/Unity-Technologies/ml-agents/blob/develop-python-api-ga/ml-agents-envs/colabs/Colab_PettingZoo.ipynb) diff --git a/docs/Readme.md b/docs/Readme.md index 7f02ec127d..c5a8e06ac4 100644 --- a/docs/Readme.md +++ b/docs/Readme.md @@ -52,6 +52,7 @@ to get started with the latest release of ML-Agents.** The table below lists all our releases, including our `main` branch which is under active development and may be unstable. A few helpful guidelines: + - The [Versioning page](Versioning.md) overviews how we manage our GitHub releases and the versioning process for each of the ML-Agents components. - The [Releases page](https://github.com/Unity-Technologies/ml-agents/releases) @@ -165,7 +166,7 @@ We have also published a series of blog posts that are relevant for ML-Agents: ### More from Unity - [Unity Sentis](https://unity.com/products/sentis) -- [Introductin Unity Muse and Sentis](https://blog.unity.com/engine-platform/introducing-unity-muse-and-unity-sentis-ai) +- [Introducing Unity Muse and Sentis](https://blog.unity.com/engine-platform/introducing-unity-muse-and-unity-sentis-ai) ## Community and Feedback diff --git a/docs/Training-ML-Agents.md b/docs/Training-ML-Agents.md index 4f6e9e9a13..9fd3f52006 100644 --- a/docs/Training-ML-Agents.md +++ b/docs/Training-ML-Agents.md @@ -413,7 +413,7 @@ Unless otherwise specified, omitting a configuration will revert it to its defau In some cases, you may want to specify a set of default configurations for your Behaviors. This may be useful, for instance, if your Behavior names are generated procedurally by the environment and not known before runtime, or if you have many Behaviors with very similar -settings. To specify a default configuraton, insert a `default_settings` section in your YAML. +settings. To specify a default configuration, insert a `default_settings` section in your YAML. This section should be formatted exactly like a configuration for a Behavior. ```yaml diff --git a/docs/Tutorial-Custom-Trainer-Plugin.md b/docs/Tutorial-Custom-Trainer-Plugin.md index aee26396a3..06e9d2bc0e 100644 --- a/docs/Tutorial-Custom-Trainer-Plugin.md +++ b/docs/Tutorial-Custom-Trainer-Plugin.md @@ -13,7 +13,7 @@ Users of the plug-in system are responsible for implementing the trainer class s Please refer to the internal [PPO implementation](../ml-agents/mlagents/trainers/ppo/trainer.py) for a complete code example. We will not provide a workable code in the document. The purpose of the tutorial is to introduce you to the core components and interfaces of our plugin framework. We use code snippets and patterns to demonstrate the control and data flow. -Your custom trainers are responsible for collecting experiences and training the models. Your custom trainer class acts like a co-ordinator to the policy and optimizer. To start implementing methods in the class, create a policy class objects from method `create_policy`: +Your custom trainers are responsible for collecting experiences and training the models. Your custom trainer class acts like a coordinator to the policy and optimizer. 
To start implementing methods in the class, create a policy class objects from method `create_policy`: ```python @@ -243,7 +243,7 @@ Before installing your custom trainer package, make sure you have `ml-agents-env pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents ``` -Install your cutom trainer package(if your package is pip installable): +Install your custom trainer package(if your package is pip installable): ```shell pip3 install your_custom_package ``` diff --git a/docs/Unity-Environment-Registry.md b/docs/Unity-Environment-Registry.md index c5caa68cbd..27f14561ed 100644 --- a/docs/Unity-Environment-Registry.md +++ b/docs/Unity-Environment-Registry.md @@ -28,7 +28,8 @@ env.close() ## Create and share your own registry -In order to share the `UnityEnvironemnt` you created, you must : +In order to share the `UnityEnvironment` you created, you must: + - [Create a Unity executable](Learning-Environment-Executable.md) of your environment for each platform (Linux, OSX and/or Windows) - Place each executable in a `zip` compressed folder - Upload each zip file online to your preferred hosting platform diff --git a/ml-agents-envs/mlagents_envs/registry/unity_env_registry.py b/ml-agents-envs/mlagents_envs/registry/unity_env_registry.py index 86bddc99bd..f0099ecf18 100644 --- a/ml-agents-envs/mlagents_envs/registry/unity_env_registry.py +++ b/ml-agents-envs/mlagents_envs/registry/unity_env_registry.py @@ -16,7 +16,7 @@ class UnityEnvRegistry(Mapping): The UnityEnvRegistry implements a Map, to access an entry of the Registry, use: ```python registry = UnityEnvRegistry() - entry = registry[] + entry = registry[] ``` An entry has the following properties : * `identifier` : Uniquely identifies this environment @@ -27,7 +27,7 @@ class UnityEnvRegistry(Mapping): To launch a Unity environment from a registry entry, use the `make` method: ```python registry = UnityEnvRegistry() - env = registry[].make() + env = registry[].make() ``` """ diff --git a/ml-agents/mlagents/trainers/optimizer/torch_optimizer.py b/ml-agents/mlagents/trainers/optimizer/torch_optimizer.py index 8cb0a6ee8c..9ee3845515 100644 --- a/ml-agents/mlagents/trainers/optimizer/torch_optimizer.py +++ b/ml-agents/mlagents/trainers/optimizer/torch_optimizer.py @@ -148,8 +148,8 @@ def get_trajectory_value_estimates( """ Get value estimates and memories for a trajectory, in batch form. :param batch: An AgentBuffer that consists of a trajectory. - :param next_obs: the next observation (after the trajectory). Used for boostrapping - if this is not a termiinal trajectory. + :param next_obs: the next observation (after the trajectory). Used for bootstrapping + if this is not a terminal trajectory. :param done: Set true if this is a terminal trajectory. :param agent_id: Agent ID of the agent that this trajectory belongs to. :returns: A Tuple of the Value Estimates as a Dict of [name, np.ndarray(trajectory_len)], diff --git a/ml-agents/mlagents/trainers/poca/optimizer_torch.py b/ml-agents/mlagents/trainers/poca/optimizer_torch.py index 4f77de4ebb..de17f3d3b2 100644 --- a/ml-agents/mlagents/trainers/poca/optimizer_torch.py +++ b/ml-agents/mlagents/trainers/poca/optimizer_torch.py @@ -565,8 +565,8 @@ def get_trajectory_and_baseline_value_estimates( """ Get value estimates, baseline estimates, and memories for a trajectory, in batch form. :param batch: An AgentBuffer that consists of a trajectory. - :param next_obs: the next observation (after the trajectory). Used for boostrapping - if this is not a termiinal trajectory. 
+ :param next_obs: the next observation (after the trajectory). Used for bootstrapping + if this is not a terminal trajectory. :param next_groupmate_obs: the next observations from other members of the group. :param done: Set true if this is a terminal trajectory. :param agent_id: Agent ID of the agent that this trajectory belongs to. diff --git a/ml-agents/mlagents/trainers/settings.py b/ml-agents/mlagents/trainers/settings.py index 1ee0accde4..9cb9a1f291 100644 --- a/ml-agents/mlagents/trainers/settings.py +++ b/ml-agents/mlagents/trainers/settings.py @@ -517,7 +517,7 @@ def need_increment( class Lesson: """ Gathers the data of one lesson for one environment parameter including its name, - the condition that must be fullfiled for the lesson to be completed and a sampler + the condition that must be fulfilled for the lesson to be completed and a sampler for the environment parameter. If the completion_criteria is None, then this is the last lesson in the curriculum. """
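The docstrings fixed in the last two hunks both hinge on the same idea: `next_obs` is only used to bootstrap the value estimate when the trajectory is not terminal. The toy sketch below is not the ML-Agents code; `discounted_returns` and the hard-coded `bootstrap_value` (which in the real optimizer would come from evaluating the critic on `next_obs`) are illustrative assumptions.

```python
# Toy illustration of bootstrapping a truncated (non-terminal) trajectory.
import numpy as np

def discounted_returns(rewards, gamma, done, bootstrap_value):
    # For a terminal trajectory there is nothing after the last step to bootstrap from.
    running = 0.0 if done else bootstrap_value
    returns = np.zeros(len(rewards), dtype=np.float32)
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

rewards = [0.0, 0.0, 1.0]
print(discounted_returns(rewards, gamma=0.99, done=True, bootstrap_value=0.5))   # terminal: bootstrap ignored
print(discounted_returns(rewards, gamma=0.99, done=False, bootstrap_value=0.5))  # truncated: bootstraps from it
```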