I'm getting a couple of dtype-related errors when using the MLP module in a torch.autocast block. Here's my simple wrapper of the MLP module:
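A minimal sketch of what such a wrapper might look like (hypothetical, not the original code: the MyMLP internals, the scattermoe.mlp.MLP constructor arguments, and its forward signature taking top-k routing weights and indices are all assumptions):

```python
import torch
import torch.nn as nn
from scattermoe.mlp import MLP


class MyMLP(nn.Module):
    """Hypothetical wrapper: a top-k router in front of scattermoe's MLP."""

    def __init__(self, dim, hidden_dim, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = MLP(
            input_size=dim,
            hidden_size=hidden_dim,
            activation=nn.GELU(),
            num_experts=num_experts,
            top_k=top_k,
        )

    def forward(self, x):
        # x: (num_tokens, dim); route each token to its top-k experts.
        probs = torch.softmax(self.router(x), dim=-1)
        k_weights, k_idxs = torch.topk(probs, self.top_k, dim=-1)
        return self.experts(x, k_weights, k_idxs)
```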
If I add `@torch.autocast(device_type='cuda', dtype=torch.bfloat16)` to the forward method, I get the following type mismatch on the linear layer directly after MyMLP:
```
Traceback (most recent call last):
  ...
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/container.py", line 219, in forward
    input = module(input)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got BFloat16 and Float
```
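For reference, the decorator placement in this first case looks roughly like the sketch below, written as a subclass of the hypothetical MyMLP above purely for illustration (equivalent to putting the decorator directly on MyMLP.forward; the Sequential, dimensions, and input are placeholders):

```python
class MyAutocastMLP(MyMLP):
    # Autocast is limited to this method, so only MyMLP's ops run in bfloat16.
    @torch.autocast(device_type='cuda', dtype=torch.bfloat16)
    def forward(self, x):
        return super().forward(x)


# The nn.Linear directly after MyMLP runs outside the autocast region with
# float32 weights, consistent with the BFloat16/Float mismatch reported above.
model = nn.Sequential(MyAutocastMLP(512, 2048), nn.Linear(512, 512)).cuda()
out = model(torch.randn(64, 512, device='cuda'))  # raises the RuntimeError above
```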
If I put my whole loss function in an autocast block, I get this issue later in the backward pass:
```
Traceback (most recent call last):
  ...
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/_tensor.py", line 521, in backward
    torch.autograd.backward(
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/autograd/__init__.py", line 289, in backward
    _engine_run_backward(
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/autograd/graph.py", line 769, in _engine_run_backward
    return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/torch/autograd/function.py", line 306, in apply
    return user_fn(self, *args)
  File "/home/jcbgb/anaconda3/envs/hans/lib/python3.10/site-packages/scattermoe/parallel_experts.py", line 55, in backward
    d_gates = torch.bmm(output_expanded, grad_out[:, :, None]).squeeze(-1)
RuntimeError: expected scalar type BFloat16 but found Float
```
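The second case corresponds to a training step roughly like this sketch, building on the hypothetical MyMLP above (model, loss_fn, x, target, and all dimensions are placeholder names). The forward pass and loss run under autocast, and backward() is called outside the block:

```python
model = nn.Sequential(MyMLP(512, 2048), nn.Linear(512, 512)).cuda()
loss_fn = nn.MSELoss()
x = torch.randn(64, 512, device='cuda')
target = torch.randn(64, 512, device='cuda')

with torch.autocast(device_type='cuda', dtype=torch.bfloat16):
    out = model(x)               # whole forward runs in bfloat16
    loss = loss_fn(out, target)

loss.backward()  # fails inside scattermoe's custom backward (traceback above)
```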