while self._has_unfinished_sequences(this_peer_finished, synced_gpus, device=input_ids.device):
    # prepare model inputs
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
    # forward pass to get next token
    outputs = self(
        **model_inputs,
        return_dict=True,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )
    if synced_gpus and this_peer_finished:
        continue  # don't waste resources running the code we don't need
    ...
Why is this condition checked only after `outputs` has been computed? Isn't running the forward pass for a peer that has already finished a waste of resources? Could the check be moved to the beginning of the while loop?
if synced_gpus and this_peer_finished:
    continue  # don't waste resources running the code we don't need
The code comes from `GenerationMixin._sample` in transformers/generation/utils.py.
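One likely reason for the ordering, offered as an assumption and not as a statement from the maintainers: `synced_gpus` is meant for sharded setups such as DeepSpeed ZeRO-3 or FSDP, where the forward pass involves collective communication that every rank has to join. If a finished rank skipped the forward pass, the ranks still generating could block waiting for it. The toy simulation below sketches that idea with plain threads. `run_rank`, `simulate`, and the `threading.Barrier` standing in for the forward pass are all hypothetical and are not part of transformers.

```python
import threading


def run_rank(rank, steps_needed, total_steps, barrier, check_first, log):
    """Toy stand-in for one GPU's generation loop (not the real transformers code)."""
    finished = False
    for step in range(total_steps):
        if check_first and finished:
            # Hypothetical variant from the question: check before the forward pass.
            continue
        try:
            # Stand-in for the forward pass: a collective that every rank must join.
            barrier.wait(timeout=1.0)
        except threading.BrokenBarrierError:
            # A peer never showed up; a real job would hang here instead.
            log.append((rank, "stalled"))
            return
        if finished:
            # Mirrors the original ordering: skip only the post-forward work.
            continue
        if step + 1 >= steps_needed:
            finished = True
    log.append((rank, "done"))


def simulate(check_first):
    """Run two toy ranks that need a different number of decoding steps."""
    steps_needed = [1, 3]  # rank 0 finishes early, rank 1 needs more steps
    total_steps = max(steps_needed)
    barrier = threading.Barrier(len(steps_needed))
    log = []
    threads = [
        threading.Thread(
            target=run_rank,
            args=(rank, needed, total_steps, barrier, check_first, log),
        )
        for rank, needed in enumerate(steps_needed)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return dict(log)


if __name__ == "__main__":
    print("check after forward:", simulate(check_first=False))
    print("check before forward:", simulate(check_first=True))
```

In this sketch, checking after the forward pass lets both ranks complete, while checking before it leaves the slower rank waiting on a peer that never arrives. Under that assumption, the extra forward pass on a finished rank is the cost of keeping all ranks in lockstep, and the `continue` only saves the per-rank work that follows it.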