Proposal for Code Structure Improvement Using `jax.lax.cond` #245

helpingstar · 2024-09-13T09:15:05Z

Is your feature request related to a problem? Please describe

This is a simple question regarding code style. It is not related to any bugs.

jumanji/jumanji/environments/routing/connector/env.py

Lines 184 to 198 in fd511b4

    
    timestep = jax.lax.cond(
        done | (new_state.step_count >= self.time_limit),
        lambda: termination(
            reward=reward,
            observation=observation,
            extras=extras,
        ),
        lambda: transition(
            reward=reward,
            observation=observation,
            extras=extras,
        ),
    )
    return new_state, timestep

jumanji/jumanji/environments/logic/game_2048/env.py

Lines 222 to 236 in fd511b4

    
    timestep = jax.lax.cond(
        done,
        lambda: termination(
            reward=reward,
            observation=observation,
            extras=extras,
        ),
        lambda: transition(
            reward=reward,
            observation=observation,
            extras=extras,
        ),
    )
    return state, timestep

Rather than repeatedly using lambda and duplicating variables as shown in the code above, it seems better to follow the functional style of jax.lax.cond and write it in the style of the solution code below.

It seems that there is little to no difference in performance.
If this is a minor issue, I will close it.

Describe the solution you'd like

timestep = jax.lax.cond(
    done,
    termination,
    transition,
    reward,
    observation,
    extras,
)

Describe alternatives you've considered

None

Additional context

jumanji/jumanji/environments/logic/minesweeper/env.py

Lines 178 to 185 in fd511b4

    
    next_timestep = jax.lax.cond(
        done,
        termination,
        transition,
        reward,
        next_observation,
    )
    return next_state, next_timestep

jumanji/jumanji/environments/logic/graph_coloring/env.py

Lines 202 to 209 in fd511b4

    
    timestep = lax.cond(
        done,
        termination,
        transition,
        reward,
        obs,
    )
    return next_state, timestep

jumanji/jumanji/environments/logic/rubiks_cube/env.py

Lines 169 to 176 in fd511b4

    
    next_timestep = jax.lax.cond(
        done,
        termination,
        transition,
        reward,
        next_observation,
    )
    return next_state, next_timestep

jumanji/jumanji/environments/logic/sudoku/env.py

Lines 124 to 132 in fd511b4

    
    timestep = jax.lax.cond(
        done,
        termination,
        transition,
        reward,
        observation,
    )
    return next_state, timestep

jumanji/jumanji/environments/packing/tetris/env.py

Lines 235 to 242 in fd511b4

    
    next_timestep = jax.lax.cond(
        done,
        termination,
        transition,
        reward,
        next_observation,
    )
    return next_state, next_timestep

Misc

Check for duplicate requests.


sash-a · 2024-09-13T09:22:09Z

Hi, thanks for the suggestion! Agreed, it isn't as clean as the others, but it needs to be done this way because the transition and termination functions take different arguments:

def transition(
    reward: Array,
    observation: Observation,
    discount: Optional[Array] = None,
    extras: Optional[Dict] = None,
    shape: Union[int, Sequence[int]] = (),
) -> TimeStep:

def termination(
    reward: Array,
    observation: Observation,
    extras: Optional[Dict] = None,
    shape: Union[int, Sequence[int]] = (),
) -> TimeStep:

If we simply passed unnamed arguments through the cond, as is done in the other envs, it would pass the extras as the discount to the transition branch. There isn't really a clean solution here unless JAX allows named arguments to jax.lax.cond.
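The mismatch can be reproduced with a minimal sketch. The `transition` and `termination` functions below are simplified stand-ins, not Jumanji's real helpers: they keep only the argument order from the signatures above and return plain dicts instead of `TimeStep` objects, and the concrete values are made up for illustration.

```python
import jax
import jax.numpy as jnp


# Simplified stand-ins (assumption): same argument order as the signatures
# above, but returning dicts instead of Jumanji TimeStep objects.
def transition(reward, observation, discount=None, extras=None):
    discount = jnp.ones((), jnp.float32) if discount is None else discount
    extras = jnp.zeros((), jnp.float32) if extras is None else extras
    return {"reward": reward, "observation": observation,
            "discount": discount, "extras": extras}


def termination(reward, observation, extras=None):
    extras = jnp.zeros((), jnp.float32) if extras is None else extras
    return {"reward": reward, "observation": observation,
            "discount": jnp.zeros((), jnp.float32), "extras": extras}


reward = jnp.array(1.0, dtype=jnp.float32)
observation = jnp.zeros((2,), dtype=jnp.float32)
extras = jnp.array(7.0, dtype=jnp.float32)  # made-up value that is easy to spot
done = jnp.array(False)

# Positional operands: the third operand lands in transition's `discount` slot.
positional = jax.lax.cond(done, termination, transition, reward, observation, extras)

# Keyword arguments inside lambdas: every value reaches the intended parameter.
keyword = jax.lax.cond(
    done,
    lambda: termination(reward=reward, observation=observation, extras=extras),
    lambda: transition(reward=reward, observation=observation, extras=extras),
)

print(positional["discount"], positional["extras"])  # extras was used as the discount
print(keyword["discount"], keyword["extras"])
```

In this sketch no error is raised in the positional case because the misplaced value happens to have a compatible shape and dtype, which is what makes the bug easy to miss.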

clement-bonnet · 2024-09-13T09:23:59Z

Hi, thank you for your comment and your observation. Yes, we would rather have what you suggested, i.e. a simple jax.lax.cond statement. However, as far as I know, cond statements only take positional arguments (as opposed to kwargs), and the presence of extras in Connector and Game2048, which you mentioned above, does not play well with the termination and transition functions. The transition function has an additional optional discount argument between observation and extras, which the termination function does not have. Therefore, their signatures differ in the first three positional arguments, so you can't simply use a cond statement.

clement-bonnet · 2024-09-13T09:26:14Z

Thank you @sash-a for replying so fast that I didn't see your comment! Agree with what you said.

A solution would be to move the discount argument of transition to after the extras argument, so that both functions share the same (reward, observation, extras) positional arguments. That is, unless JAX implements kwargs for cond, as you mentioned.
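A rough sketch of how the reordering could look is given here. The functions are hypothetical stand-ins that return dicts rather than Jumanji's `TimeStep`, and `make_timestep` is a made-up helper name used only for this illustration; the actual change in the library may differ.

```python
import jax
import jax.numpy as jnp


# Hypothetical reordered stand-ins: `discount` now comes after `extras`, so
# both functions agree on their first three positional parameters.
def transition(reward, observation, extras=None, discount=None):
    discount = jnp.ones((), jnp.float32) if discount is None else discount
    extras = jnp.zeros((), jnp.float32) if extras is None else extras
    return {"reward": reward, "observation": observation,
            "discount": discount, "extras": extras}


def termination(reward, observation, extras=None):
    extras = jnp.zeros((), jnp.float32) if extras is None else extras
    return {"reward": reward, "observation": observation,
            "discount": jnp.zeros((), jnp.float32), "extras": extras}


@jax.jit
def make_timestep(done, reward, observation, extras):
    # With matching positional order, the plain operand form is safe.
    return jax.lax.cond(done, termination, transition, reward, observation, extras)


reward = jnp.array(1.0, dtype=jnp.float32)
observation = jnp.zeros((2,), dtype=jnp.float32)
extras = jnp.array(7.0, dtype=jnp.float32)  # made-up value for illustration

continuing = make_timestep(jnp.array(False), reward, observation, extras)
finished = make_timestep(jnp.array(True), reward, observation, extras)
print(continuing["discount"], continuing["extras"])
print(finished["discount"], finished["extras"])
```

Under this ordering the lambda-free call matches the style already used in the other environments, at the cost of changing the positional order of `transition`.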

sash-a · 2024-09-13T09:29:53Z

Yeah, that seems like a reasonable solution.

helpingstar · 2024-09-13T12:36:52Z

I see, the cases that use the lambda form are the ones that also pass extras. Thank you both for your kind guidance. I also think @clement-bonnet's solution looks good.

Since all instances calling transition use extras as keyword arguments, it seems possible to change the function definition. However, considering potential version issues, I’m cautious about submitting a PR right away.
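The compatibility concern can be illustrated with a small sketch. `transition_old` and `transition_new` are hypothetical stand-ins that only record which parameter received which value; they are not the library's functions, and the values are made up.

```python
# Hypothetical stand-ins that only record which parameter received which value.
def transition_old(reward, observation, discount=None, extras=None):
    return {"discount": discount, "extras": extras}


def transition_new(reward, observation, extras=None, discount=None):
    return {"discount": discount, "extras": extras}


# Call sites that pass extras by keyword behave the same before and after.
kw_old = transition_old(reward=1.0, observation="obs", extras={"k": 1})
kw_new = transition_new(reward=1.0, observation="obs", extras={"k": 1})
print(kw_old == kw_new)

# A call site that passed discount positionally would silently change meaning.
pos_old = transition_old(1.0, "obs", 0.99)
pos_new = transition_new(1.0, "obs", 0.99)
print(pos_old)
print(pos_new)
```

So keyword-based callers would be unaffected by the reordering, while any downstream code passing `discount` as the third positional argument would need to be updated.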

I’m learning a lot from studying Jumanji’s code and can see the effort put into systematically building the JAX reinforcement learning environment throughout. Thank you very much for writing such excellent code.

I hope @clement-bonnet's solution is implemented (or that jax.lax.cond will support keyword arguments), and I hope there's something I can help with. I'll leave the issue open. Feel free to close it at any time!

sash-a · 2024-11-01T07:49:30Z

@helpingstar would you be interested in making a PR to fix this?

helpingstar · 2024-11-01T08:45:23Z

@sash-a I’d be glad to help with this. Is there a preferred timeline?

sash-a · 2024-11-01T08:48:31Z

No rush honestly, whenever you have time! I think @clement-bonnet's suggested fix should work well.

helpingstar · 2024-11-01T08:53:41Z

@sash-a Got it, thanks for letting me know!

helpingstar added the "enhancement" (New feature or request) label on Sep 13, 2024

sash-a added the "good first issue" (Good for newcomers) and "help wanted" (Extra attention is needed) labels on Oct 25, 2024
