Summary repository for AI Summer 2022: an introduction to Transformer models, with practical applications to inferencing and training.
Presented by Vanderbilt Data Science Institute data scientists:
- Dr. Jesse Spencer-Smith, Chief Data Scientist
- Dr. Charreau Bell, Senior Data Scientist
- Umang Chaudhry, Data Scientist
The objective of these workshops is to develop foundational skills in understanding, inferencing and training Transformer models primarily using HuggingFace, an extremely user-friendly API for transformers.
To get the most out of these workshops:
- Open Colab (workbook) notebooks and actively write code along with the instructor
- Actively participate in discussion
- Actively participate in breakout rooms
- Perform homework assignments before coming to class the next day
- Relax your mind and ask questions
- Let us know how you are doing using Fastcups: https://cups.fast.ai/vanderbilt-ai-summer
- Sign up for a Google Colaboratory (Colab) account. The free account should be sufficient, but you will get more compute (and longer running times) if you sign up for Colab Pro at ~$5/month.
- Sign up for a huggingface.co account. Again, the free account should be sufficient.
- Suggested: Preview the book Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra and Thomas Wolf. If you are affiliated with Vanderbilt University, you can access this pre-print book (and any book by O’Reilly) free by logging into O'Reilly Media using your Vanderbilt email address. Vanderbilt licenses all content from O’Reilly. The book covers Transformers for purposes beyond text.
- Think about any data you might want to bring to the workshop. Also begin thinking about any projects you might want to accomplish during our month. We’ll have office hours for you to work with us to get your first project off the ground!
Mon, May 16 Introduction to Transformers, architecture, Huggingface models, datasets, spaces
Wed, May 18 Text applications / inferencing pipeline / sharing your work interactively with Gradio
Fri, May 20 Training for text / pushing to hub / custom Gradio - see https://github.com/vanderbilt-data-science/ai-summer-gradio
Mon, May 23 Audio models
Wed, May 25 Audio models / Image models
Fri, May 27 Image models
Mon, May 30 Custom models from scratch/special tokens/domain adaptation
Wed, Jun 1 Custom models from scratch/Perceiver IO
Fri, Jun 3 Research Presentation / Whirlwind tour of what's new / Next Steps
During these workshops, we'll have a number of breakout rooms where you'll work with others to discuss a topic or develop code to solve an assignment. Please screenshot or paste your results in the following Google doc:
https://docs.google.com/document/d/15deDo3TBlgue_7ueoHake-O3HoEqCZKBZOHWfmfUlFQ/edit?usp=sharing
During a live session, open https://cups.fast.ai/vanderbilt-ai-summer and click on the green, yellow, or red cup to indicate how you are doing.
Video recordings of these workshops can be found at the links below:
Note: The titles of the recordings may say "Office Hours," but they are recordings of the course sessions.
- Monday, May 16: https://vanderbilt.zoom.us/rec/share/i06HPNJBGU4qvCUvsxGgouubLD8ydFY3Tax4oxB6BaildJlsrTfjkDvdQs0sDI6F.2zkoTRMEVMQxk72v?startTime=1652708325000
- Wednesday, May 18: https://vanderbilt.zoom.us/rec/share/jGKukco6K64DzUVkiUXjANDJSeWdEhd3dMHlO5CPKw9kVHvOR51Z4uMiWzQP_FkS.Ci5h98qmKVg1zoFa?startTime=1652882017000 (note: the title of the recording is incorrect, but it is indeed the recording of our Wednesday class!)
- Friday, May 20: https://vanderbilt.zoom.us/rec/share/Yxp1ZoEcMZCEZrcfvplpWQy-i09bhZvo6rl8SAgU6_cPMvC8vl4rOzMqtmTDCfcf.EDSXtfU1cNQrKSvc?startTime=1653054808000
- Monday, May 23: https://vanderbilt.zoom.us/rec/share/9tHGPnVjhtXgD2DxfeZGhhWKbBfGBo0ZUBezEAlO-zv778uywcJQ9MuowBW_2lgZ.eMwfT-B39ulZN73Q?startTime=1653314184000
- Wednesday, May 25: https://vanderbilt.zoom.us/rec/share/08cPU-4E8NZqCEfkyaAIjLMkOjWWo7nUtQUk6xn1RIg9GKqrqIe8EE5E4oj3JDQB.3MljahnyRxiA25wJ?startTime=1653487005000
- Friday, May 27: https://vanderbilt.zoom.us/rec/share/zxQcyZsEHv2QJx45VCHnV5rVnXjFQrSCLavDWg8B-zwsw7i_v6piWf9UYUiRta1e.dom7PLc8tR7UwE-w?startTime=1653659784000
- Wednesday, June 1: https://vanderbilt.zoom.us/rec/share/_GFi0dTXKqrNt7j53-sGiqZCp0BriNYk7PXFx7Fd24xbmjHiUvYX6asMBo-5ckgg.tfLA81XoMnXw49sX?startTime=1654091752000
- Vanderbilt AI Summer Python Intro: Python Essentials for AI
A number of examples will be left to the reader. Please complete these assignments prior to coming to the next day of the course. These homework assignments are designed to augment your understanding of Python, enable you to avoid common pitfalls of programming, resolve known areas of ambiguity that often arise in our new learners, and navigate and understand common errors that Python will throw.
DGX A100 Compute Grant: https://forms.gle/2mGfEy9DB4JU2GpZ8
- VU Python for AI: The recording of the first week of AI Summer, covering the basics of Python for AI frameworks.
- A Whirlwind Tour of Python, Jake VanderPlas
- Python Data Science Handbook, Jake VanderPlas
- Programming with Python, Software Carpentries
- Introduction to Python (Datacamp Youtube Playlist)
- Introduction to Python (freeCodeCamp.org)
- Fastcups https://cups.fast.ai/vanderbilt-ai-summer (to be used synchronously during class)
- Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra and Thomas Wolf. If you are affiliated with Vanderbilt University, you can access this pre-print book (and any book by O’Reilly) free by logging into O'Reilly Media using your Vanderbilt email address. Vanderbilt licenses all content from O’Reilly. The book covers Transformers for purposes beyond text.