Fast transfer of network weights to other processes #10438
-
Hello! I am currently using a custom-built shared memory of pytrees, so all the actor processes can access the weights after the learner process updates them. However, updating the shared memory from the learner takes a noticeable amount of time (approximately 15-25% of a training step). Is there a more efficient way to store/retrieve/transfer the network's weights, given that they are already on the GPU? For example, could the weights be copied directly from the GPU to another part of the GPU (or to RAM) so the other processes can access them from there? Thanks!
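
For context, here is a minimal sketch of the kind of host-side shared-memory handoff described above, assuming a fixed pytree structure and float32 leaves; the function and buffer names (`publish_params`, `read_params`, `params_buf`) are hypothetical and not part of the original setup.

```python
# Hypothetical sketch of the host-side handoff described above: the learner
# copies its weights from the GPU into a shared-memory block that actor
# processes map. Assumes a fixed pytree structure and float32 leaves.
from multiprocessing import shared_memory

import jax
import numpy as np


def publish_params(params, shm_name="params_buf"):
    """Learner side: flatten the pytree and copy it into shared memory."""
    leaves = jax.tree_util.tree_leaves(jax.device_get(params))  # GPU -> host
    flat = np.concatenate([np.ravel(l) for l in leaves]).astype(np.float32)
    shm = shared_memory.SharedMemory(name=shm_name, create=True, size=flat.nbytes)
    np.ndarray(flat.shape, dtype=flat.dtype, buffer=shm.buf)[:] = flat
    return shm  # keep a reference so the block stays alive


def read_params(template, shm_name="params_buf"):
    """Actor side: rebuild the pytree from the shared block."""
    treedef = jax.tree_util.tree_structure(template)
    shapes = [np.shape(l) for l in jax.tree_util.tree_leaves(template)]
    total = sum(int(np.prod(s)) for s in shapes)
    shm = shared_memory.SharedMemory(name=shm_name)
    flat = np.ndarray((total,), dtype=np.float32, buffer=shm.buf)
    leaves, offset = [], 0
    for s in shapes:
        n = int(np.prod(s))
        leaves.append(flat[offset:offset + n].reshape(s).copy())
        offset += n
    return jax.tree_util.tree_unflatten(treedef, leaves)
```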
Replies: 1 comment
-
You may be interested in https://github.com/mpi4jax/mpi4jax
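
A rough sketch of how the weights could be broadcast from the learner to the actors with mpi4jax, assuming one MPI rank per process and the learner at rank 0; with a CUDA-aware MPI build the transfer can stay on the device. The exact return convention (the `(data, token)` tuple) may differ between mpi4jax versions, so treat this as an illustration rather than a definitive recipe.

```python
# Hypothetical sketch: broadcast the learner's weights (rank 0) to all
# actor ranks with mpi4jax. Launch under MPI, e.g.
#   mpirun -n 4 python broadcast_params.py
from mpi4py import MPI
import jax
import jax.numpy as jnp
import mpi4jax

comm = MPI.COMM_WORLD
rank = comm.Get_rank()


def broadcast_params(params):
    """Broadcast every leaf of a parameter pytree from rank 0 to all ranks."""
    def bcast_leaf(leaf):
        # mpi4jax operations work on single arrays and return (data, token).
        new_leaf, _ = mpi4jax.bcast(leaf, 0, comm=comm)
        return new_leaf
    return jax.tree_util.tree_map(bcast_leaf, params)


# Toy pytree standing in for the real network weights.
params = {"dense": {"w": jnp.full((128, 128), float(rank)), "b": jnp.zeros(128)}}

# After this call every rank holds rank 0's weights; with CUDA-aware MPI
# the copy never has to pass through host pickling.
params = broadcast_params(params)
print(f"rank {rank}: w[0, 0] = {params['dense']['w'][0, 0]}")
```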