Why does adding a denoising task speed up training? In the transformer, no attention is computed between the noised queries and the original queries, and their losses are calculated separately. So at which point do the noised queries actually influence, and accelerate, the matching of the original queries?
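To make the setup in the question concrete, here is a minimal sketch of the kind of attention mask being described. It is an illustration under my own assumptions, not the repository's code: `build_attention_mask`, `num_dn`, `num_original`, and `dn_sees_original` are hypothetical names, and the query ordering (noised queries first, original queries after) is assumed.

```python
def build_attention_mask(num_dn, num_original, dn_sees_original=False):
    """Build a square self-attention mask; True means "attention blocked".

    Assumed layout (hypothetical): indices [0, num_dn) are noised queries,
    indices [num_dn, num_dn + num_original) are original queries.
    """
    total = num_dn + num_original
    mask = [[False] * total for _ in range(total)]
    for q in range(total):
        for k in range(total):
            q_is_dn = q < num_dn
            k_is_dn = k < num_dn
            if not q_is_dn and k_is_dn:
                # Original queries never attend to noised queries.
                mask[q][k] = True
            elif q_is_dn and not k_is_dn and not dn_sees_original:
                # As described in the question, the reverse direction
                # is blocked too unless explicitly allowed.
                mask[q][k] = True
    return mask


# Print the mask for 2 noised and 3 original queries ("x" = blocked).
mask = build_attention_mask(num_dn=2, num_original=3)
for row in mask:
    print("".join("x" if blocked else "." for blocked in row))
```

With this mask the two groups exchange no information inside the attention layers, so the only thing they have in common in this sketch is the set of decoder weights that processes both groups. That is what the question is asking about.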