pytorch implementation of the TwinBert paper (https://arxiv.org/pdf/2002.06275v1.pdf)
This notebook was created to train a Siamese Bert architecture to find similar pair of text documents. The authors of the paper have used this architecture to create a backend model for a sponsored search system. The goal was to display a list of ads that best match user’s intent.
Due to lack of data, i have trained this model on the Quora Questions Pairs Dataset (https://www.kaggle.com/c/quora-question-pairs)