Variable speed profiles. #111

chenkins · 2025-01-17T12:19:43Z

Objectives
Trains can choose to run slower than permitted. This reflects core railway domain features.

Todos

Trains have a speed that can be lower than the current max speed. The simulation updates the train's speed counter accordingly. Reinterpret DO_NOTHING as keep running, STOP_MOVING as decelerate, and MOVE_FORWARD as accelerate. The configuration either sets the acceleration/braking delta or selects the old behaviour.
Add a configurable penalty to the reward function, applied when a train enters a cell that is already occupied.
Add tests on
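
The reinterpretation of the actions in the first todo could look roughly like the sketch below. Everything here is hypothetical and only illustrates the idea: the `Action` enum is a stand-in for the env's actions, `SpeedConfig` and `next_speed` are made-up names, and "old behaviour" is assumed to mean that the speed jumps straight to the max speed or to zero.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    # Hypothetical stand-in for the env's actions, not the real enum.
    DO_NOTHING = 0
    MOVE_FORWARD = 1
    STOP_MOVING = 2


@dataclass(frozen=True)
class SpeedConfig:
    # Hypothetical config. A delta of None is assumed to select the old
    # behaviour: jump straight to max speed (accelerate) or to 0 (brake).
    acceleration_delta: Optional[float] = None
    braking_delta: Optional[float] = None


def next_speed(speed, max_speed, action, config):
    """Return the train's speed for the next step, never above max_speed."""
    if action is Action.MOVE_FORWARD:  # accelerate
        if config.acceleration_delta is None:
            return max_speed
        return min(max_speed, speed + config.acceleration_delta)
    if action is Action.STOP_MOVING:  # decelerate
        if config.braking_delta is None:
            return 0.0
        return max(0.0, speed - config.braking_delta)
    # DO_NOTHING: keep running, but still respect the current max speed.
    return min(speed, max_speed)
```

With `SpeedConfig(acceleration_delta=0.25, braking_delta=0.5)`, a train at speed 0.5 goes to 0.75 on MOVE_FORWARD, to 0.0 on STOP_MOVING, and stays at 0.5 on DO_NOTHING.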

Minimal testing requirements

Close following with same speed still works
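
A toy version of such a close-following check, as a sketch only: it models a one-way line of cells rather than the real env, and `step_line` is a hypothetical helper. It assumes trains are resolved front to back within a step, so a follower may enter the cell its leader vacates in that same step, and that the speed counter resets to zero on entering a cell.

```python
def step_line(positions, counters, speed):
    """Advance all trains on a one-way line by one time step.

    positions: cell index per train, leader first.
    counters: fractional progress of each train within its current cell.
    """
    new_positions, new_counters = [], []
    for pos, counter in zip(positions, counters):
        counter = min(1.0, counter + speed)
        # Trains are processed front to back, so new_positions holds the
        # already-updated cells of all trains ahead of this one.
        if counter >= 1.0 and pos + 1 not in new_positions:
            pos, counter = pos + 1, 0.0
        new_positions.append(pos)
        new_counters.append(counter)
    return new_positions, new_counters


# Two trains in adjacent cells running at the same speed.
positions, counters = [5, 4], [0.0, 0.0]
for _ in range(4):
    positions, counters = step_line(positions, counters, speed=0.5)
    assert len(set(positions)) == len(positions)  # no two trains share a cell
    assert positions[0] - positions[1] == 1       # follower keeps up
```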

Additional context
We do not implement a multi-resource reservation system. Inter-agent communication that mimics reserving ahead is the agents' task: it could be implemented there as multi-resource allocation, in which case trains only need to be able to brake within the reserved distance (or gamble...).
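
To make "brake within the reserved distance" concrete, here is a minimal sketch on the agent side. The names are hypothetical, speed is measured in cells per step, and it assumes that in each step the speed is first reduced by the braking delta and the train then travels at the reduced speed.

```python
def braking_distance(speed, braking_delta):
    """Distance (in cells) covered while braking from speed to a standstill."""
    if braking_delta <= 0:
        raise ValueError("braking_delta must be positive")
    distance = 0.0
    while speed > 0:
        # Assumption: brake first, then move at the reduced speed.
        speed = max(0.0, speed - braking_delta)
        distance += speed
    return distance


def can_stop_within(speed, braking_delta, reserved_cells):
    """True if the train can come to a stop inside its reserved cells."""
    return braking_distance(speed, braking_delta) <= reserved_cells
```

For example, a train at speed 1.0 with a braking delta of 0.25 travels 0.75 + 0.5 + 0.25 = 1.5 cells before it stops, so two reserved cells ahead are enough but one is not.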

Open

What about observations?

chenkins · 2025-02-19T09:42:38Z

Analysing @aiAdrian's design of FlatlandDynamics

ShortestPathNextStepObservation: for each agent, which of the 3 available directions (Left, Forward, Right) leads to the shortest path to its target;
MultiResourcesAllocationRailEnv
FlatlandResourceAllocator: tracks which cells are held by which agent. It also keeps track of the time of the last allocation and de-allocation (for the minimum free time, aka headway). allocate_resource() allocates a list of resources to an agent only if they are free or self-held.
DynamicAgent.get_allocated_resource() called by the env to get the desired reservations
DynamicAgent.all_resource_ok() called by the env to signal to the agent whether the reservations could be made
DynamicAgent.update_movement_dynamics() called by the env (!!) to check whether the agent wants to move forward (based on the dynamics state it keeps internally).
RollingStock encapsulates physical properties like traction capacity etc. for the agent; it also provides helper functions to compute train_acceleration, max_braking_acceleration, max_tractive_effort based on the current speed and the reservations ahead
FlatlandDynamics extends MultiResourcesAllocationRailEnv by asking the agent whether it wants to move forward or not (update_movement_dynamics()) irrespective of the action passed to step (!!)
demo_flatland_dynamics takes the action from the observation - although the agent keeps track of its resources and should know whether it wants to move or not...
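
For reference, the allocation rule described for FlatlandResourceAllocator can be sketched as below. This is not the actual class: `CellAllocator` and its methods are hypothetical, and the sketch assumes allocation is all-or-nothing for the requested list and that the headway is counted in time steps since the last release.

```python
class CellAllocator:
    """Hypothetical sketch of a cell reservation table with headway."""

    def __init__(self, min_free_time=0):
        self.min_free_time = min_free_time
        self.holder = {}       # cell -> id of the agent holding it
        self.released_at = {}  # cell -> time step of the last release

    def is_available(self, cell, agent, now):
        holder = self.holder.get(cell)
        if holder is not None:
            return holder == agent  # self-held cells can be re-requested
        released = self.released_at.get(cell)
        # A free cell must also have been free for at least the headway.
        return released is None or now - released >= self.min_free_time

    def allocate(self, agent, cells, now):
        """Reserve all cells for agent, or none if any is unavailable."""
        if not all(self.is_available(cell, agent, now) for cell in cells):
            return False
        for cell in cells:
            self.holder[cell] = agent
        return True

    def release(self, agent, cell, now):
        """Free a cell held by agent and remember when it was freed."""
        if self.holder.get(cell) == agent:
            del self.holder[cell]
            self.released_at[cell] = now
```

In this sketch the env would call `allocate()` with the cells an agent asks for and pass the boolean result back to the agent, which is roughly the role get_allocated_resource() and all_resource_ok() play in the design above.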

Discussion:

no need for new actions in the env
need for reservation ahead (aka multi-resource allocation) in the env
How and when are the decisions taken to reserve ahead? Do agents always try to reserve ahead only once they would need to brake, or could they also reserve further ahead than that? "Agents" above means agents in the sense of the env internals; is there also an agent in the sense of a controller taking decisions?

chenkins added the enhancement New feature or request label Jan 17, 2025

chenkins modified the milestones: 4.0.4, 4.0.5 Jan 17, 2025

chenkins added the help wanted label Feb 19, 2025

chenkins modified the milestones: 4.0.5, 4.0.6 Feb 19, 2025

chenkins self-assigned this Feb 26, 2025

chenkins linked a pull request Feb 26, 2025 that will close this issue

111 Variable Speed Profiles. #136

Draft

7 tasks
