Use single forward pass in shared model architectures #156
Single forward pass
Motivation:
When the shared model is applied, the forward pass is called twice: once for the policy and once for the value. The inputs to both calls are identical, so the output can be cached to improve performance.
Note: the single forward pass also affects autograd graph construction, so a significant speedup occurs during the backward pass as well.
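For illustration, here is a minimal PyTorch sketch of the idea (not the code in this PR; the class and method names are hypothetical): the shared trunk runs once during the policy call, its output stays attached to the autograd graph, and the value call reuses it.

```python
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    """Hypothetical shared model: one trunk, separate policy and value heads."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, act_dim)
        self.value_head = nn.Linear(hidden, 1)
        self._latent = None  # cached trunk output, still part of the autograd graph

    def forward_policy(self, obs: torch.Tensor) -> torch.Tensor:
        # Single trunk forward; the latent is cached for the value call below.
        self._latent = self.trunk(obs)
        return self.policy_head(self._latent)

    def forward_value(self, obs: torch.Tensor) -> torch.Tensor:
        # Reuses the cached latent instead of recomputing the trunk.
        # Fragile by construction: it assumes forward_policy was just called
        # with the same obs (the exact risk discussed in the notes below).
        return self.value_head(self._latent)
```

Because the latent is cached inside the graph rather than recomputed, backward also traverses the trunk only once, which is where the extra speedup during the backward phase comes from.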
Speed eval:
- Big neural network (units: [2048, 1024, 1024, 512])
- 3840 steps
- Running on top of the OIGE env simulation (constant for each run)
- Mixed precision = True
Quality eval:
We trained a policy for our task multiple times with each configuration. We didn't observe any statistically significant difference in the quality of the final results.
Notice: the single-pass and double-pass runs would be identical in an ideal world, but because of finite floating-point precision and a different order of gradient computation, they diverge gradually.
Note:
- This implementation is minimalistic, but it is dangerous to generalise, as it requires that the value forward pass always follow the policy forward pass.
- To make it safer, we could cache the input and check whether the next input is the same (see the sketch after this list), either by:
  - a) checking whether the two inputs are references to the same object, or
  - b) comparing the input and cached input tensors directly. This adds some computational overhead, but it is negligible compared to the time saved.
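A sketch of that safer variant, assuming PyTorch; `CachedTrunk` and `compare_values` are hypothetical names introduced here for illustration. Option (a) maps to the identity check and option (b) to `torch.equal`:

```python
import torch
import torch.nn as nn

class CachedTrunk(nn.Module):
    """Wraps a trunk and reuses its output when the same input arrives twice in a row."""

    def __init__(self, trunk: nn.Module, compare_values: bool = False):
        super().__init__()
        self.trunk = trunk
        self.compare_values = compare_values
        self._cached_input = None
        self._cached_output = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self._cached_input is not None:
            if self.compare_values:
                # (b) compare tensor values; small overhead, negligible next to
                # the trunk forward pass it avoids
                hit = torch.equal(x, self._cached_input)
            else:
                # (a) cheap check: is it literally the same tensor object?
                hit = x is self._cached_input
            if hit:
                return self._cached_output
        self._cached_input = x
        self._cached_output = self.trunk(x)
        return self._cached_output

    def reset_cache(self) -> None:
        # Call at step boundaries so a stale output (whose autograd graph has
        # already been freed by backward) is never reused.
        self._cached_input = None
        self._cached_output = None
```

One caveat with either check: the cached output is only valid within the current autograd graph, so the cache should be cleared after each backward/optimizer step.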