Objectives
Trains can choose to run slower than permitted. This reflects core railway domain features.
Todos
Trains have a speed that can be lower than the current maximum speed, and the simulation updates the train's speed counter accordingly. Reinterpret DO_NOTHING as keep running, STOP_MOVING as decelerate, and MOVE_FORWARD as accelerate. A configuration option selects either the acceleration/braking delta or the old behaviour.
Add a configurable penalty to the reward function that applies when a train enters a cell that is already occupied.
Add tests on
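The first two todos could be sketched roughly as follows. This is a minimal illustration, not the Flatland implementation: `Action` is a local stand-in for the three actions named above (not the real Flatland enum), and `SpeedConfig`, `next_speed`, `occupied_cell_penalty` and all numeric values are hypothetical names and defaults chosen for the example. The legacy branch is only an assumed reading of "the old behaviour".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Tuple


class Action(Enum):
    """Local stand-in for the three actions discussed in this issue."""
    DO_NOTHING = 0
    MOVE_FORWARD = 1
    STOP_MOVING = 2


@dataclass(frozen=True)
class SpeedConfig:
    """Hypothetical configuration: speed deltas, or the old behaviour."""
    acceleration_delta: float = 0.25
    braking_delta: float = 0.5
    legacy: bool = False  # assumption: True switches back to the old behaviour


def next_speed(speed: float, max_speed: float, action: Action,
               config: SpeedConfig) -> float:
    """Return the train's speed after applying one action."""
    if config.legacy:
        # Assumed old behaviour: stop at once, or run at the permitted speed.
        if action is Action.STOP_MOVING:
            return 0.0
        if action is Action.MOVE_FORWARD:
            return max_speed
        return min(speed, max_speed)
    if action is Action.MOVE_FORWARD:      # accelerate
        speed += config.acceleration_delta
    elif action is Action.STOP_MOVING:     # decelerate
        speed -= config.braking_delta
    # DO_NOTHING keeps running at the current speed.
    # Clamp to [0, max_speed] so the train never exceeds the permitted speed.
    return max(0.0, min(speed, max_speed))


def occupied_cell_penalty(target_cell: Tuple[int, int],
                          occupied_cells: Iterable[Tuple[int, int]],
                          penalty: float) -> float:
    """Reward term: -penalty if the train enters an already occupied cell."""
    return -penalty if target_cell in set(occupied_cells) else 0.0
```

With these defaults, a train running at 0.5 under a maximum of 1.0 would accelerate to 0.75 on MOVE_FORWARD, stay at 0.5 on DO_NOTHING, and brake to 0.0 on STOP_MOVING.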
Minimal testing requirements
Close following with same speed still works
Additional context
We do not implement a multi-resource reservation system. Inter-agent communication mimicking ahead reservation is the agents' task: it could be programmed out there as multi-resource allocation, in which case trains only need to be able to brake within the reserved distance (or gamble...).
Open
What about observations?
FlatlandResourceAllocator: tracks which cells are held by which agent. It also keeps track of the time of the last allocation and de-allocation (for minimum free time, aka headway). allocate_resource() allocates a list of resources to an agent only if they are free or self-held.
RollingStock encapsulates physical properties such as traction capacity for the agent; it also provides helper functions to compute train_acceleration, max_braking_acceleration and max_tractive_effort based on the current speed and the reservations ahead.
FlatlandDynamics extends MultiResourcesAllocationRailEnv by asking the agent whether it wants to move forward or not (update_movement_dynamics()) irrespective of the action passed to step (!!)
demo_flatland_dynamics takes the action from the observation - although the agent keeps track of its resources and should know whether it wants to move or not...
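To make the allocator idea in the list above concrete, here is a minimal sketch of the bookkeeping it describes. It is not the FlatlandResourceAllocator code: the class name `ResourceAllocator`, the `now` argument, the `min_free_time` parameter, the `deallocate_resource()` method and the all-or-nothing treatment of the resource list are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List


class ResourceAllocator:
    """Tracks which resource (e.g. a cell) is held by which agent."""

    def __init__(self, min_free_time: int = 0) -> None:
        self.min_free_time = min_free_time             # headway in time steps
        self.holder: Dict[Hashable, int] = {}          # resource -> agent
        self.last_allocated: Dict[Hashable, int] = {}  # resource -> time
        self.last_released: Dict[Hashable, int] = {}   # resource -> time

    def _available(self, resource: Hashable, agent: int, now: int) -> bool:
        held_by = self.holder.get(resource)
        if held_by is not None:
            return held_by == agent  # self-held resources may be re-requested
        released = self.last_released.get(resource)
        # Free, but only usable once the minimum free time has elapsed.
        return released is None or now - released >= self.min_free_time

    def allocate_resource(self, agent: int, resources: Iterable[Hashable],
                          now: int) -> bool:
        """Allocate all resources to the agent, or none of them."""
        wanted: List[Hashable] = list(resources)
        if not all(self._available(r, agent, now) for r in wanted):
            return False
        for r in wanted:
            if r not in self.holder:  # keep the time of the first allocation
                self.last_allocated[r] = now
            self.holder[r] = agent
        return True

    def deallocate_resource(self, agent: int, resources: Iterable[Hashable],
                            now: int) -> None:
        """Release the resources that are currently held by the agent."""
        for r in resources:
            if self.holder.get(r) == agent:
                del self.holder[r]
                self.last_released[r] = now
```

In this sketch a second agent requesting a list that overlaps a held cell gets nothing, and a released cell becomes available to anyone only after `min_free_time` steps.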
Discussion:
no need for new actions in the env
need for reservation ahead (aka multi-resource allocation) in the env
How and when are the decisions taken to reserve ahead? Do agents always try to reserve ahead once they would need to brake, or could they also reserve ahead more than that? Agents above are meant in the sense of the env internals; is there also an agent in the sense of a controller taking decisions?