
Sentence Transformer encodings #1168

Open
Gchand0249 opened this issue Sep 14, 2021 · 18 comments

Comments

@Gchand0249

Gchand0249 commented Sep 14, 2021

Hi,

We fine-tuned Sentence Transformers on our domain-specific data (similar to NLI data). It gives high cosine scores to irrelevant suggestions. We used the labels good, bad, and ok when labeling the data.

@nreimers
Member

Adding a question to your issue would be quite helpful.

@Gchand0249
Author

Yes. After extracting embeddings from SBERT, we use the cosine score to sort results. The issue is that results with a high cosine score are irrelevant, while similar results get a lower score. We are unable to figure out why this is happening.
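
For reference, a minimal sketch of the retrieval step described here, assuming a placeholder model path and placeholder sentences:

```python
from sentence_transformers import SentenceTransformer, util

# "my-finetuned-sbert" is a placeholder for the fine-tuned model directory.
model = SentenceTransformer("my-finetuned-sbert")

query = "example customer query"
candidates = ["relevant suggestion", "another suggestion", "irrelevant suggestion"]

# Encode the query and candidates, then rank candidates by cosine similarity.
query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(query_emb, cand_embs)[0]

for text, score in sorted(zip(candidates, scores.tolist()), key=lambda x: x[1], reverse=True):
    print(f"{score:.3f}  {text}")
```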

@nreimers
Member

Likely due to training the model incorrectly.

@Gchand0249
Author

Gchand0249 commented Sep 15, 2021

Thank you. Does performance depend on batch size?

Could you please elaborate on what you mean by training it incorrectly? Is it the epochs, batch size, data, or loss?

We trained for 4 epochs with batch size 16 and used SoftmaxLoss.
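
For context, a hedged sketch of what such a SoftmaxLoss setup typically looks like; the base checkpoint, label names, and sentence pairs below are placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Build an SBERT model from a base transformer with mean pooling.
word_embedding_model = models.Transformer("bert-base-uncased", max_seq_length=128)
pooling = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling])

# Sentence pairs with categorical labels, e.g. bad/ok/good.
label2id = {"bad": 0, "ok": 1, "good": 2}
train_examples = [
    InputExample(texts=["sentence A", "sentence B"], label=label2id["good"]),
    InputExample(texts=["sentence C", "sentence D"], label=label2id["bad"]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=len(label2id),
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=4, warmup_steps=100)
```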

@nreimers
Member

SoftmaxLoss is the wrong loss. Have a look at the other loss functions.

@Gchand0249
Author

Gchand0249 commented Sep 15, 2021

Thanks for your reply. Could you please suggest a preferable loss for training SBERT?

@nreimers
Member

MultipleNegativesRankingLoss or one of the triplet losses

@Gchand0249
Author

Gchand0249 commented Sep 15, 2021

Thank you @nreimers,

Could you please explain why SoftmaxLoss is the wrong loss? On the SBERT website, you mention that softmax loss was used to train SBERT on NLI data, and our data labels are similar to NLI data.

@nreimers
Member

That it works on NLI is rather a coincidence; there is no good logic behind it:
https://www.sbert.net/examples/training/nli/README.html#multiplenegativesrankingloss
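
For reference, a minimal sketch of the MultipleNegativesRankingLoss setup the linked page describes, with a placeholder starting checkpoint and placeholder data. Each training example is an (anchor, positive) pair, and the other positives in the batch act as in-batch negatives:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder starting checkpoint

# (anchor, positive) pairs; every other positive in the batch is a negative,
# so larger batch sizes generally help.
train_examples = [
    InputExample(texts=["anchor sentence 1", "matching sentence 1"]),
    InputExample(texts=["anchor sentence 2", "matching sentence 2"]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```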

@Gchand0249
Author

Thank you for your suggestion. For MultipleNegativesRankingLoss with hard negatives, the team stated: "You can also provide one or multiple hard negatives per anchor-positive pair by structuring the data like this: (a_1, p_1, n_1), (a_2, p_2, n_2). Here, n_1 is a hard negative for (a_1, p_1). The loss will use for the pair (a_i, p_i) all p_j (j != i) and all n_j as negatives."

Could you please elaborate on this statement? Does it mean the loss uses p_j and n_j as negatives for a_i?

@nreimers
Member

Yes
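
To illustrate, a sketch with placeholder data: each InputExample carries (a_i, p_i, n_i), and for anchor a_i the loss treats p_i as the positive and every other p_j plus every n_j in the batch as negatives:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

# Each example is (anchor, positive, hard negative).
train_examples = [
    InputExample(texts=["a_1 anchor", "p_1 positive", "n_1 hard negative"]),
    InputExample(texts=["a_2 anchor", "p_2 positive", "n_2 hard negative"]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
# For (a_i, p_i), all p_j (j != i) and all n_j in the batch serve as negatives.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```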

@Gchand0249
Author

Gchand0249 commented Sep 18, 2021

We did synonym expansion on the data, so in our case most of the a_i and p_j are positive. How does the loss work in this case? Won't it affect the embeddings?

@nreimers
Member

Then you have to create a custom DataLoader that ensures that a batch does not contain two entries of the same type
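
As a rough illustration of that suggestion (not the library's own API): a hypothetical batch sampler that uses a per-example group id so that no two examples from the same synonym group land in one batch:

```python
import random
from torch.utils.data import DataLoader
from sentence_transformers import InputExample

class OneExamplePerGroupBatchSampler:
    """Yields lists of dataset indices with at most one example per group id."""

    def __init__(self, group_ids, batch_size):
        self.group_ids = group_ids
        self.batch_size = batch_size

    def __iter__(self):
        indices = list(range(len(self.group_ids)))
        random.shuffle(indices)
        batch, seen = [], set()
        for idx in indices:
            gid = self.group_ids[idx]
            if gid in seen:
                continue  # skipped in this sketch; a fuller version would defer it to a later batch
            batch.append(idx)
            seen.add(gid)
            if len(batch) == self.batch_size:
                yield batch
                batch, seen = [], set()
        if batch:
            yield batch

    def __len__(self):
        # Upper bound; the actual number of batches depends on the grouping.
        return (len(self.group_ids) + self.batch_size - 1) // self.batch_size

# Hypothetical data: one group id per training example (e.g. its synonym group).
train_examples = [InputExample(texts=["anchor 1", "positive 1"]),
                  InputExample(texts=["anchor 2", "positive 2"])]
group_ids = [0, 1]

# model.fit() sets its own collate function on the DataLoader, so only the
# batch composition changes here.
train_dataloader = DataLoader(
    train_examples,
    batch_sampler=OneExamplePerGroupBatchSampler(group_ids, batch_size=16),
)
```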

@Gchand0249
Author

It is very difficult for us to identify entries of the same type, so building such a DataLoader is hard. Is it okay to go with triplet loss instead?

@nreimers
Member

Sure

@Gchand0249
Author

Thank you. Does the distance_metric in triplet loss have any impact on performance? We tried the default Euclidean and performance was not good, so we are now trying cosine.
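
For reference, a sketch of switching the TripletLoss distance metric from the default Euclidean to cosine distance; the data, checkpoint, and margin below are placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

# Each example is (anchor, positive, negative).
train_examples = [
    InputExample(texts=["anchor sentence", "positive sentence", "negative sentence"]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

train_loss = losses.TripletLoss(
    model=model,
    distance_metric=losses.TripletDistanceMetric.COSINE,
    # Cosine distance lies in [0, 2], so a margin much smaller than the
    # Euclidean default of 5 is usually appropriate.
    triplet_margin=0.5,
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```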
