Thanks for the great work!
It's a very clever way to compute the embeddings beforehand and use them directly as target values during the backpropagation step.
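Just to make sure I understood the idea correctly, here is a minimal sketch of what I mean (dummy models, MSE loss, and an index-keyed cache are my own assumptions, not necessarily what this repo does):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy data: 64 "images", each paired with an index so cached targets can be looked up.
images = torch.randn(64, 3, 32, 32)
loader = DataLoader(TensorDataset(images, torch.arange(64)), batch_size=16)

teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for a frozen encoder
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in for the smaller model

# 1) Precompute teacher embeddings once, before training.
teacher.eval()
cached = {}
with torch.no_grad():
    for x, idx in loader:
        for i, emb in zip(idx.tolist(), teacher(x)):
            cached[i] = emb

# 2) Train the student to regress onto the cached embeddings.
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
criterion = nn.MSELoss()
for x, idx in loader:
    target = torch.stack([cached[i] for i in idx.tolist()])
    loss = criterion(student(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```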
Questions
1. Have you done any testing to find out how well the distilled model performs compared to the original teacher model?
2. If we use Vision Transformer (ViT) models as the base, should we expect any improvement in embedding quality?
3. Instead of using the distilled model for a classification task by computing the probs, how well does it perform if we want to use the raw embeddings to rank images by cosine distance (a sketch of the ranking setup I have in mind follows below)?
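For question 3, this is the kind of retrieval setup I'm thinking of; the embedding dimension and gallery size are placeholders:

```python
import torch
import torch.nn.functional as F

# Rank gallery images against a query by cosine similarity of their raw embeddings
# (i.e. the distilled model's output before any classification head).
query_emb = torch.randn(128)           # embedding of the query image
gallery_embs = torch.randn(1000, 128)  # embeddings of the candidate images

# Cosine similarity = dot product of L2-normalized vectors.
query_emb = F.normalize(query_emb, dim=0)
gallery_embs = F.normalize(gallery_embs, dim=1)
scores = gallery_embs @ query_emb      # shape (1000,)

ranking = scores.argsort(descending=True)  # gallery indices, best match first
```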