Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

125/96 Episodes with malfunction for benchmarking and regression tests. 8 Policy abstraction. #131

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

chenkins
Copy link
Contributor

@chenkins chenkins commented Feb 23, 2025

Changes

  • Extract Trajectory data structure from benchmark_episodes.py.
  • Deprecate duplicate env_generator in rail_env_utils.py. Add initial observations and infos from reset() to env_generator interface
  • Add cli for generating trajectories from policy (policy abstraction)
  • Add cli for validating and evaluating trajectories
  • data model and flow documentation

Related issues

Closes #125
Closes to #8
Closes #96

Checklist

  • Tests are included for relevant behavior changes.
  • Documentation is added in the flatland-book repo for relevant behavior changes.
  • If you made important user-facing changes, describe them under the [Unreleased] tag in CHANGELOG.md.
  • New package dependencies are declared in the pyproject.toml file.
    Requirement files have been updated by running tox -e requirements.
  • Code works with all supported Python versions (3.10, 3.11 and 3.12). Checks run with all three version and are
    required to run successfully.
  • Code is formatted according to PEP 8 (an IDE like PyCharm can do this for you).
  • Technical guidelines listed in CONTRIBUTING.md are followed.

@chenkins chenkins added this to the 4.0.6 milestone Feb 23, 2025
@chenkins chenkins changed the title Extract Trajectory data structure from benchmark_episodes.py. 125 Extract Trajectory data structure from benchmark_episodes.py. Feb 23, 2025
@chenkins chenkins changed the title 125 Extract Trajectory data structure from benchmark_episodes.py. 125 Episodes with malfunction for benchmarking and regression tests. Policy abstraction. Feb 23, 2025
@chenkins chenkins marked this pull request as ready for review February 26, 2025 10:06
@chenkins chenkins requested a review from a team as a code owner February 26, 2025 10:06
@chenkins chenkins changed the title 125 Episodes with malfunction for benchmarking and regression tests. Policy abstraction. 125/96 Episodes with malfunction for benchmarking and regression tests. 8 Policy abstraction. Feb 26, 2025


class Policy:
def act(self, handle: int, observation: Any, **kwargs) -> RailEnvActions:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manuschn Not sure we want/need the handle as well? The policy's acting should be based on the observation only and not depend on the handle?

Copy link
Contributor

@aiAdrian aiAdrian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we break all python version below 3.9 !!!!
TypeError: Type subscription requires python >= 3.9

(flatland-rl-test) u216993@K57156:~/flatland/test/flatland_solver_policy$ python example/flatland_dynamics/example_flatland_dynamics.py
Traceback (most recent call last):
File "example/flatland_dynamics/example_flatland_dynamics.py", line 4, in
from environment.flatland_railway_extension.flatland_dynamics import FlatlandDynamicsEnvironment
File "/home/u216993/flatland/test/flatland_solver_policy/environment/flatland_railway_extension/flatland_dynamics.py", line 3, in
from flatland_railway_extension.FlatlandEnvironmentHelper import FlatlandEnvironmentHelper
File "/home/u216993/flatland/test/flatland_railway_extension/flatland_railway_extension/FlatlandEnvironmentHelper.py", line 6, in
from flatland.envs.malfunction_generators import MalfunctionParameters, ParamMalfunctionGen
File "/home/u216993/flatland/test/flatland-rl/flatland/envs/malfunction_generators.py", line 8, in
from flatland.envs import persistence
File "/home/u216993/flatland/test/flatland-rl/flatland/envs/persistence.py", line 10, in
from flatland.envs import rail_env
File "/home/u216993/flatland/test/flatland-rl/flatland/envs/rail_env.py", line 28, in
from flatland.utils import seeding
File "/home/u216993/flatland/test/flatland-rl/flatland/utils/seeding.py", line 97, in
HashableRandomState = Tuple[str, np.ndarray[np.uint], int, int, float]
TypeError: Type subscription requires python >= 3.9

@chenkins chenkins requested a review from a team March 6, 2025 20:11
@chenkins chenkins modified the milestones: 4.0.6, 4.0.5 Mar 10, 2025
# it's not sufficient to store random_seed, as seeding from random_seed is done
# at start of reset (before rail/line/timetable (re-)generation,
# hence np_random depends on rail/line/timetable generation
# TODO would it be better to have env_generation without reset? Conceptually, the env should be initialised with a state incl. the seed
Copy link
Contributor Author

@chenkins chenkins Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discuss/resolve TODO.
Current approach: reset calls rail/line/timetable generators -> random state at start depends on these generators.
Desireable approach (anticipated by env_generator): the env starts is already reset in initial state, therefore need to persist np_random after reset is called.

Is the current implementation self-consistent? Add test

  • save env without reset
  • save env after reset
    load both and reset w/ and w/o reset -> the result should be the same in all 4 cases. Is this enough?

============


TODO move to `flatland-book`?
Copy link
Contributor Author

@chenkins chenkins Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manuschn are these diagrams helpful ?

```

Flow Env Step
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manuschn idea is to refactor step/reset accordingly reflecting this flow structure -> #138

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add episodes with malfunction for benchmarking and regression tests. Trajectory/episode recording.
2 participants