It appears that the latest version of the miopen-hip package does not support beta != 0 for ConvolutionBackwardData, ConvolutionBackwardWeights, and ConvolutionBackwardBias (and possibly ConvolutionForward as well?). While it is possible to work around this, doing so generally requires a temporary buffer to hold the output, followed by a saxpy post-op. Accumulating with beta is particularly useful when weights are shared or the network recurs within the same graph. In the backward-data case, whenever the same input feeds multiple subsequent layers, it is ideal if the gradient can be accumulated directly rather than computed separately and then summed.
Note that cuDNN supports beta in the equivalent functions.
Is there any planned support for this, and/or how feasible would this change be? Thanks.
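For concreteness, a minimal sketch of the workaround described above is shown below: the backward-data convolution is written to a scratch buffer with beta = 0, and the result is then accumulated into the existing gradient with a saxpy (here via hipBLAS), i.e. the dx = alpha * conv(dy, w) + beta * dx semantics that cuDNN already exposes. The function name, buffer names, and the assumption that descriptors, algorithm, and workspace are already set up are all placeholders for illustration, not MIOpen's actual API for this; error checking is omitted.

```c
#include <hip/hip_runtime.h>
#include <hipblas/hipblas.h>
#include <miopen/miopen.h>

/* Hypothetical helper illustrating the workaround: since the backward-data
 * convolution requires beta == 0, write the result to a temporary buffer,
 * then accumulate it into the existing gradient with a saxpy post-op.
 * All descriptors, the algorithm choice, and buffer sizes are assumed to be
 * set up by the caller. */
static void conv_bwd_data_accumulate(
    miopenHandle_t handle, hipblasHandle_t blas,
    miopenTensorDescriptor_t dyDesc, const void *dy,
    miopenTensorDescriptor_t wDesc, const void *w,
    miopenConvolutionDescriptor_t convDesc, miopenConvBwdDataAlgorithm_t algo,
    miopenTensorDescriptor_t dxDesc, float *dx_accum,
    size_t dx_elems, void *workspace, size_t workspace_size)
{
    const float alpha = 1.0f, beta = 0.0f, one = 1.0f;

    /* Extra temporary allocation the workaround requires. */
    float *dx_tmp = NULL;
    hipMalloc((void **)&dx_tmp, dx_elems * sizeof(float));

    /* 1. Backward-data convolution into the temporary (beta must be 0). */
    miopenConvolutionBackwardData(handle, &alpha,
                                  dyDesc, dy, wDesc, w, convDesc, algo,
                                  &beta, dxDesc, dx_tmp,
                                  workspace, workspace_size);

    /* 2. saxpy post-op: dx_accum += 1.0 * dx_tmp. */
    hipblasSaxpy(blas, (int)dx_elems, &one, dx_tmp, 1, dx_accum, 1);

    hipFree(dx_tmp);

    /* With native beta support this would collapse to a single call with
     * beta = 1 writing directly into dx_accum, as cuDNN allows. */
}
```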
I'm not able to access the link you provided in your post.
Please request the privileges necessary to access the private MIOpen repo. Ask @junliume or @JehandadKhan for details.
Is there a plan for supporting beta for ConvolutionBackward functions?
All convolutions are missing support for ALPHA and BETA. IIRC there is no specific plan (with a deadline, etc.) yet, but such a feature is needed (albeit not a showstopper), and we have it in mind.