Hi there, first of all, thanks for your great work!
I'm wondering what the correct approach is to train on MS MARCO in a way similar to train_bi-encoder_margin-mse.py, where both positives and negatives are sampled differently every time, but using the `SentenceTransformerTrainer` instead of the deprecated training method, so that I can use multi-GPU training and a more structured approach.
I'm also wondering what the exact procedure is for using the evaluators with accelerate or torch.distributed.
Thanks!
Hello!
Apologies for the delay, I've been working on a release.
The exact approach from that script is tricky to reproduce, because Sentence Transformers now works with `Dataset` instances, which makes it harder to fully resample them every epoch. Instead, you can now train with multiple negatives at a time (by creating a column for each; see the Loss Overview docs).
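As a rough sketch of that "one column per negative" layout (the column names and example rows here are illustrative, not prescribed by the library), the training data could look like this before being wrapped in a `datasets.Dataset` and paired with a loss that accepts extra negatives, such as `MultipleNegativesRankingLoss`:

```python
# Illustrative rows: one query, one positive, and a fixed number of
# pre-mined negatives, each in its own column. Column names are arbitrary;
# the loss consumes columns in order (anchor, positive, negatives...).
train_rows = [
    {
        "query": "what is the capital of france",
        "positive": "Paris is the capital of France.",
        "negative_1": "Lyon is a large city in France.",
        "negative_2": "Berlin is the capital of Germany.",
    },
    {
        "query": "how do solar panels work",
        "positive": "Solar panels convert sunlight into electricity.",
        "negative_1": "Wind turbines generate power from moving air.",
        "negative_2": "Coal plants burn fuel to produce steam.",
    },
]

columns = list(train_rows[0].keys())
print(columns)

# With the `datasets` library this would become:
#   train_dataset = datasets.Dataset.from_list(train_rows)
# which can then be passed to SentenceTransformerTrainer together with
# a loss that supports multiple negatives per anchor.
```

Because the negatives are baked into columns, resampling them means rebuilding the dataset, which is why the per-epoch resampling from the old script does not carry over directly.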
Regarding evaluator instances: sadly, they simply don't work well on multi-GPU right now. They only run on process 0 during training, and if you want to run an evaluator prior to training, you could guard it with `if trainer.is_local_process_zero():` so it only computes on one of the GPUs, but that won't make it any quicker.
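A minimal sketch of that guard, assuming a torch.distributed or accelerate launch that exports the usual `LOCAL_RANK` environment variable for each worker (during training you would call `trainer.is_local_process_zero()` instead; the evaluator call below is hypothetical):

```python
import os

def is_local_process_zero() -> bool:
    # torch.distributed / accelerate launchers set LOCAL_RANK per worker;
    # in a plain single-process run the variable is simply absent.
    return int(os.environ.get("LOCAL_RANK", "0")) == 0

# Run the evaluator on one process only, before training starts.
if is_local_process_zero():
    # results = dev_evaluator(model)  # e.g. an InformationRetrievalEvaluator
    pass
```

Note this only avoids redundant work on the other ranks; the evaluation itself still runs on a single GPU, so it is no faster than in a single-GPU setup.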