An experiment to check how freezing the embedding matrix affects the validation metric, depending on the number of non-overlapping tokens between the train and validation sets.
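
As a rough illustration of the quantity being varied, the validation tokens that never occur in the training set could be counted as in the sketch below. The function name `non_overlapping_tokens`, the whitespace tokenization, and the toy sentences are assumptions made for this example; the notebook may tokenize and count differently.

```python
def non_overlapping_tokens(train_texts, val_texts):
    """Return the validation tokens that never appear in the training texts."""
    # Whitespace tokenization is an assumption made for this sketch.
    train_vocab = {tok for text in train_texts for tok in text.split()}
    val_vocab = {tok for text in val_texts for tok in text.split()}
    # Set difference keeps only tokens seen in validation but not in training.
    return val_vocab - train_vocab


# Toy data, invented purely for illustration.
train = ["the cat sat", "the dog ran"]
val = ["the cat ran", "a bird sang"]

unseen = non_overlapping_tokens(train, val)
print(sorted(unseen))  # ['a', 'bird', 'sang']
```

With a frozen embedding matrix, the vectors for such unseen tokens stay at their initial values, while training only updates the rows of tokens that occur in the training set when the matrix is trainable.
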
The experiment was run in the Colab notebook, where you can also check the TensorBoard logs. The article gives a more detailed explanation. The main results are shown in the following plot: