Any explanation for the training speed changes? #923

Answered by Giuseppe5
RyougiKukoc asked this question in Q&A

During QAT, the activation scale factors are computed from runtime statistics for the first training steps, which causes the initial slowdown. After that, the scales are converted to learned parameters, so training is faster because the statistics no longer need to be computed.
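
To make the mechanism concrete, here is a minimal pure-Python sketch of a scale that is derived from statistics for a fixed number of steps and then frozen into a stored value. It is only an illustration, not the library's implementation: the class `StatsThenParamScale`, its arguments (`collect_steps`, `percentile`, `bit_width`), and the choice of a percentile statistic are all assumptions made up for this example.

```python
class StatsThenParamScale:
    """Toy activation scale (hypothetical, not a real library class).

    For the first `collect_steps` calls the scale is computed from the
    current batch (the slow path). Afterwards a stored value is returned
    and no statistics are computed (the fast path).
    """

    def __init__(self, collect_steps=3, percentile=0.999, bit_width=8):
        self.collect_steps = collect_steps
        self.percentile = percentile
        self.int_max = 2 ** (bit_width - 1) - 1  # 127 for signed 8-bit
        self.step = 0
        self.collected = []      # scales observed during the statistics phase
        self.scale_param = None  # stands in for a learned parameter
        self.stats_calls = 0     # counts how often the slow path ran

    def _scale_from_stats(self, activations):
        # Slow path: sort the absolute values and pick a high percentile.
        self.stats_calls += 1
        ordered = sorted(abs(a) for a in activations)
        idx = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[idx] / self.int_max

    def __call__(self, activations):
        if self.step < self.collect_steps:
            scale = self._scale_from_stats(activations)
            self.collected.append(scale)
            self.step += 1
            if self.step == self.collect_steps:
                # Convert the collected statistics into a single stored value.
                # In real QAT this would become a trainable parameter that the
                # optimizer keeps updating; here it simply stays fixed.
                self.scale_param = sum(self.collected) / len(self.collected)
                self.collected.clear()
            return scale
        # Fast path: no sorting, just return the stored value.
        return self.scale_param


scaler = StatsThenParamScale(collect_steps=3)
for _ in range(10):
    scaler([0.5, -1.0, 2.0])
print(scaler.stats_calls)  # the slow path ran only during the first 3 steps
```

In this sketch the per-step cost drops once `step` reaches `collect_steps`, which mirrors the speed-up described above.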

Answer selected by RyougiKukoc