Add a characteristic for solvers using action masks and make use of it in rollout #445
+573
−141
- `retrieve_applicable_actions()` method.
- `get_action_mask()` method on domains, by default converting the applicable actions space into a 0-1 numpy array, provided that the action space of each agent is an `EnumerableSpace`.
- `RayRLlib` solver handles action masking:
- `Maskable`
- `FullObservable` from the domain to use action masking, as `get_action_mask()` can be called without the solver knowing about the current state (and since, in rollout, the actual domain is now used)
- `__init__()` so that `using_applicable_actions()` can be overridden properly
- `unwrap_obs` and `wrap_action` in solver and wrapper environment to avoid code duplication
- `domain.get_action_mask()` to convert applicable actions into a mask (the method is more efficient as it does not call `get_applicable_actions()` for each action)
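
For reviewers unfamiliar with action masks, here is a minimal sketch of the idea behind converting applicable actions into a 0-1 numpy array over an enumerable action space. It is not the code from this PR: `Action`, `mask_from_applicable`, `mask_by_membership`, and `sample_masked` are hypothetical names used only for illustration. The one-pass version walks the applicable actions once, while the naive version issues one applicability query per action.

```python
from enum import Enum
from typing import Callable, Iterable, Sequence

import numpy as np


class Action(Enum):
    # Hypothetical enumerable action set, for illustration only.
    LEFT = 0
    RIGHT = 1
    UP = 2
    DOWN = 3


def mask_from_applicable(
    all_actions: Sequence[Action], applicable: Iterable[Action]
) -> np.ndarray:
    """One pass over the applicable actions: set the entry of each one to 1."""
    index_of = {action: i for i, action in enumerate(all_actions)}
    mask = np.zeros(len(all_actions), dtype=np.int8)
    for action in applicable:
        mask[index_of[action]] = 1
    return mask


def mask_by_membership(
    all_actions: Sequence[Action], is_applicable: Callable[[Action], bool]
) -> np.ndarray:
    """Naive alternative: one applicability query per action in the space."""
    return np.array([int(is_applicable(a)) for a in all_actions], dtype=np.int8)


def sample_masked(
    all_actions: Sequence[Action], mask: np.ndarray, rng: np.random.Generator
) -> Action:
    """Sample uniformly among the actions whose mask entry is non-zero."""
    allowed = np.flatnonzero(mask)
    return all_actions[int(rng.choice(allowed))]


all_actions = list(Action)
applicable = [Action.RIGHT, Action.DOWN]

mask = mask_from_applicable(all_actions, applicable)
print(mask)  # [0 1 0 1]
```

A consumer that only sees the mask (for instance during a rollout) can then restrict its choice with something like `sample_masked(all_actions, mask, np.random.default_rng(0))`, without having to know the current state itself.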