When I set `activate='relu'` in CSPDarknet53.py, line 35, I encountered the following RuntimeError.
(NV A100, CUDA 11.4, PyTorch 1.10.1; on another server with different versions it works fine.)
Traceback (most recent call last):
File "train.py", line 308, in <module>
Trainer(
File "train.py", line 196, in train
loss.backward()
File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12, 512, 13, 13]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Finally, I solved the problem by changing `out += residual` to `out = out + residual` in CSPDarknet53.py, line 108.
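For context, here is a minimal sketch of the failure mode (hypothetical module, not the repo's actual CSPDarknet53 code): ReLU's backward reads its own saved output, so an in-place `out += residual` mutates that saved tensor and bumps its version counter, which is exactly what the version-1-vs-2 mismatch in the traceback reports. The out-of-place add creates a new tensor and leaves the saved activation untouched.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Hypothetical residual block illustrating the in-place-add pitfall."""
    def __init__(self, channels=512):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)  # ReluBackward0 saves this output

    def forward(self, x):
        residual = x
        out = self.act(self.conv(x))
        # Buggy: `out += residual` would mutate the tensor ReluBackward0
        # saved for backward, raising the RuntimeError on loss.backward().
        out = out + residual  # fix: out-of-place add, saved output intact
        return out

x = torch.randn(12, 512, 13, 13, requires_grad=True)
block = ResidualBlock()
block(x).sum().backward()  # succeeds; with `out += residual` it would fail
```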