Thanks for the great work!
It's a very clever way to compute the embeddings beforehand and use them directly as target values during the backpropagation step.
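Just to make sure I understood the idea correctly, here is a minimal sketch of what I mean (dummy models, MSE loss, and an index-keyed cache are my own assumptions, not necessarily what this repo does):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy data: 64 "images", each paired with an index so cached targets can be looked up.
images = torch.randn(64, 3, 32, 32)
loader = DataLoader(TensorDataset(images, torch.arange(64)), batch_size=16)

teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for a frozen encoder
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for the smaller model

# 1) Precompute teacher embeddings once, before training.
teacher.eval()
cached = {}
with torch.no_grad():
    for x, idx in loader:
        for i, emb in zip(idx.tolist(), teacher(x)):
            cached[i] = emb

# 2) Train the student to regress onto the cached embeddings.
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
criterion = nn.MSELoss()
for x, idx in loader:
    target = torch.stack([cached[i] for i in idx.tolist()])
    loss = criterion(student(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```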
Questions
1. Have you done any testing to find out how well the distilled model performs compared to the original teacher model?
2. If we use Vision Transformer (ViT) models as the base, should we expect any improvement in embedding quality?
3. Instead of using the distilled model for a classification task by computing the probs, how well does it perform if we want to use the raw embeddings to rank images by cosine distance (a sketch of the ranking setup I have in mind follows below)?
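For question 3, this is the kind of retrieval setup I'm thinking of; the embedding dimension and gallery size are placeholders:

```python
import torch
import torch.nn.functional as F

# Rank gallery images against a query by cosine similarity of their raw embeddings
# (i.e. the distilled model's output before any classification head).
query_emb = torch.randn(128)           # embedding of the query image
gallery_embs = torch.randn(1000, 128)  # embeddings of the candidate images

# Cosine similarity = dot product of L2-normalized vectors.
query_emb = F.normalize(query_emb, dim=0)
gallery_embs = F.normalize(gallery_embs, dim=1)
scores = gallery_embs @ query_emb      # shape (1000,)

ranking = scores.argsort(descending=True)  # gallery indices, best match first
```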