
Add a characteristic for solvers using action masks and make use of it in rollout #445

Open

nhuet wants to merge 3 commits into master from rollout-action-mask
Conversation

@nhuet nhuet (Contributor) commented Nov 29, 2024

  • Use it in rollout to make solvers aware of the current action mask, by calling their retrieve_applicable_actions() method.
  • Add a get_action_mask() method to domains that, by default, converts the applicable actions space into a 0-1 numpy array, provided that the action space of each agent is an EnumerableSpace (a toy sketch of these mechanics follows this list).
  • Use these new features to simplify how the RayRLlib solver handles action masking:
    • inherit from Maskable
    • no longer require FullObservable from the domain to use action
      masking, as get_action_mask() can be called without the solver knowing
      the current state (and since, in rollout, the actual domain is now
      used)
    • decide whether to use action masking directly in __init__() so that
      using_applicable_actions() can be overridden properly
    • use common functions for unwrap_obs and wrap_action in the solver and the
      wrapper environment to avoid code duplication
    • use domain.get_action_mask() to convert applicable actions into a mask
      (the method is more efficient, as it does not call get_applicable_actions()
      for each action)

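For illustration, here is a minimal, self-contained sketch of the mechanics described above, assuming a toy domain with an enumerable action space. All class names, method bodies, and the `retrieve_applicable_actions(domain)` signature are hypothetical stand-ins rather than scikit-decide's actual implementation; only the method names mirror the description.

```python
import numpy as np


class ToyDomain:
    """Toy domain with an enumerable action space and state-dependent applicability."""

    ACTIONS = ["up", "down", "left", "right"]  # elements of the enumerable action space

    def get_applicable_actions(self):
        # Pretend only these two actions are applicable in the current state.
        return {"up", "right"}

    def get_action_mask(self) -> np.ndarray:
        # Default behaviour described above: convert the applicable-actions space
        # into a 0-1 numpy array aligned with the enumerable action space.
        applicable = self.get_applicable_actions()
        return np.array([1 if a in applicable else 0 for a in self.ACTIONS], dtype=np.int8)


class ToyMaskableSolver:
    """Toy solver advertising the 'Maskable' characteristic."""

    def retrieve_applicable_actions(self, domain: ToyDomain) -> None:
        # Called by rollout so that the solver knows the current action mask.
        self.action_mask = domain.get_action_mask()

    def sample_action(self, observation):
        # Sample only among actions whose mask entry is 1.
        allowed = [a for a, m in zip(ToyDomain.ACTIONS, self.action_mask) if m]
        return allowed[0]


domain, solver = ToyDomain(), ToyMaskableSolver()
solver.retrieve_applicable_actions(domain)      # rollout step: hand the current mask to the solver
print(domain.get_action_mask())                 # -> [1 0 0 1]
print(solver.sample_action(observation=None))   # -> "up"
```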
@nhuet nhuet marked this pull request as draft December 10, 2024 16:31
@nhuet nhuet force-pushed the rollout-action-mask branch from f8826b1 to 3c73d1d on December 12, 2024 16:24
@nhuet nhuet changed the title from "Add option in rollout for sample_action kwargs (e.g. action masking)" to "Add a characteristic for solvers using action masks and make use of it in rollout" on Dec 12, 2024
@nhuet nhuet marked this pull request as ready for review December 12, 2024 16:30
Rationale for the 0-1 numpy array returned by get_action_mask(): storing only 0-1 values is more memory-efficient, and it seems to be the standard representation for action masks, at least for ray.rllib, as shown in the `action_mask_key` documentation at https://docs.ray.io/en/latest/rllib/rllib-training.html
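For context, this mask format matches what ray.rllib's action-masking support expects: the 0-1 mask travels inside a dict observation under a dedicated key (configurable via `action_mask_key`). Below is a minimal sketch of such a layout; the `true_obs` key and the observation shapes are placeholders, not scikit-decide's or rllib's actual names.

```python
import numpy as np
from gymnasium import spaces

# Hypothetical dict-observation layout: the 0-1 action mask is packed next to the
# real observation so that the RL algorithm can read it back out before sampling.
N_ACTIONS = 4  # hypothetical size of the enumerable action space

observation_space = spaces.Dict(
    {
        "action_mask": spaces.Box(low=0, high=1, shape=(N_ACTIONS,), dtype=np.int8),
        "true_obs": spaces.Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32),
    }
)


def wrap_obs(obs: np.ndarray, action_mask: np.ndarray) -> dict:
    """Pack the raw observation and the 0-1 action mask into the dict layout above."""
    return {"action_mask": action_mask, "true_obs": obs}


def unwrap_obs(wrapped: dict) -> tuple[np.ndarray, np.ndarray]:
    """Recover (raw observation, action mask) from the wrapped dict."""
    return wrapped["true_obs"], wrapped["action_mask"]


wrapped = wrap_obs(np.zeros(3, dtype=np.float32), np.array([1, 0, 0, 1], dtype=np.int8))
assert observation_space.contains(wrapped)
```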
@nhuet nhuet force-pushed the rollout-action-mask branch from 3c73d1d to 85f6c59 on December 13, 2024 12:36