I am using 1.9.0+cu111, and the error message says: argument "gather_list" must be specified on destination rank.
Also, I am confused: what is the point of gathering all the information into None in the llm master process?
As per PyTorch 1.9.0's documentation (https://pytorch.org/docs/1.9.0/distributed.html), the torch.distributed.gather_object method still takes an object_gather_list argument, so I don't see why you are getting this error.
Concerning the None: the object_gather_list argument specifies the variable into which all the obj values passed by the other processes are gathered on the destination process. A process that is only sending an obj therefore has no need to specify an object_gather_list. Conversely, the destination process (here self._llm_master_process) does not send any meaningful obj but does provide an object_gather_list, since it is receiving objects rather than sending one. You can find the destination process' code here.
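To illustrate the contract, here is a minimal sketch of how torch.distributed.gather_object is typically called (the rank layout and the gathered objects are hypothetical, not lamorel's actual code): only the destination rank pre-allocates the receive list, every other rank passes object_gather_list=None.

```python
import torch.distributed as dist

# Assumes the process group has already been initialized, e.g. via
# dist.init_process_group(backend="gloo").

obj = {"rank": dist.get_rank()}  # object this process contributes
dst = 0                          # rank that gathers everything

if dist.get_rank() == dst:
    # Destination rank: provide a list with one slot per process.
    gathered = [None for _ in range(dist.get_world_size())]
    dist.gather_object(obj, object_gather_list=gathered, dst=dst)
    # `gathered` now holds each rank's obj, ordered by rank.
else:
    # Non-destination ranks: only send, no gather list needed.
    dist.gather_object(obj, object_gather_list=None, dst=dst)
```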
In the __call_model function in lamorel/caller.py, you set object_gather_list=None. However, this is not allowed in torch.distributed.