coffeedrunkpanda/learning

Learning Transformers right from the Beginning

NLP

Start with The Illustrated Transformer. It is the most approachable guide to begin with, and it explains the architecture with plenty of visual aids.

For a deeper understanding:

Vision Transformers

Video Transformers

Now for the final level, the one that will test your limits and your understanding of the previous models: video transformers. I will confess that I am new to video transformers, but here are some suggestions. So far I have only read about the Video Swin Transformer.

Models that I have not studied yet but want to learn more about:

  • The Hugging Face implementation of TimeSformer.

  • The Torchvision implementation of MViT, another video transformer for classification and detection.

  • Swin2SR, another interesting model, which performs super-resolution on compressed images and videos.
