
NLP Course - Hugging Face 🤗

🕸 LinkedIn • 📙 Kaggle • 💻 Medium Blog • 🤗 Hugging Face


This repository contains a shorter version of the NLP Course on Hugging Face. The aim is to keep a few notes from the course, along with the code snippets that seem most important, handy in each section. Think of it as a "cheat sheet" for quickly reviewing the most important concepts in Transformers and the Hugging Face ecosystem. If you've already taken the original course (or a similar one), or have some basic knowledge of Transformers, you'll undoubtedly find it useful as a quick refresher on the various concepts.

The course covers Natural Language Processing (NLP) using libraries from the Hugging Face 🤗 ecosystem (a quick sketch follows the list below):

  • Transformers,
  • Datasets,
  • Tokenizers, and
  • Accelerate — as well as the Hugging Face Hub.
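As a first taste of the ecosystem, here is a minimal sketch of pulling a dataset from the Hub with 🤗 Datasets (the glue/sst2 dataset is just an example choice):

```python
from datasets import load_dataset

# 🤗 Datasets: download a dataset straight from the Hugging Face Hub
# ("glue"/"sst2" is an example id; any Hub dataset works the same way)
dataset = load_dataset("glue", "sst2", split="train")

print(dataset[0])  # a dict like {'sentence': ..., 'label': ..., 'idx': ...}
```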

What is NLP?

NLP is a field of linguistics and Machine Learning focused on understanding everything related to human language.

Common NLP tasks (a couple of which are sketched in code below):
  • Classifying sentences and words: sentiment analysis, email spam detection, identifying grammatical components and named entities...
  • Generating text content: auto-generating text, filling in masked words...
  • Extracting an answer from a text: question answering.
  • Generating a new sentence from an input text: translation, summarization...
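Two of these tasks, sketched with the pipeline() function from 🤗 Transformers (the default checkpoints, and therefore the exact outputs, may vary):

```python
from transformers import pipeline

# Classifying sentences: sentiment analysis with the default checkpoint
classifier = pipeline("sentiment-analysis")
print(classifier("I love this course!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]

# Filling masked words: the mask token depends on the underlying checkpoint
unmasker = pipeline("fill-mask")
print(unmasker("Paris is the <mask> of France.", top_k=1))
```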

NLP also tackles complex challenges in speech recognition and computer vision (audio transcription, image description).

Why is it challenging?

Computers don’t process information the same way humans do. A human can easily understand the meaning of a sentence or judge how similar two sentences are. For machine learning (ML) models, such tasks are harder: the text must be processed in a way that enables the model to learn from it. And because language is complex, we need to think carefully about how this processing should be done. There has been a lot of research on how to represent text, and we will look at some methods in the next chapter.
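A standard first step in that processing is tokenization: turning raw text into numeric IDs a model can consume. A minimal sketch with 🤗 Transformers (bert-base-uncased is just an example checkpoint):

```python
from transformers import AutoTokenizer

# Load the pretrained tokenizer that matches an example checkpoint
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Raw text -> subword tokens -> numeric ids the model can learn from
print(tokenizer.tokenize("NLP is challenging!"))
print(tokenizer("NLP is challenging!")["input_ids"])
```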

About the 🤗 Transformers library

The 🤗 Transformers library was created to provide a single API through which any Transformer model can be loaded, trained, and saved. The library’s main features are:

  • Ease of use: Downloading, loading, and using a state-of-the-art NLP model for inference can be done in just two lines of code (see the sketch after this list).
  • Flexibility: At their core, all models are simple PyTorch nn.Module or TensorFlow tf.keras.Model classes and can be handled like any other model in their respective machine learning (ML) frameworks.
  • Simplicity: Hardly any abstractions are made across the library. "All in one file" is a core concept: a model’s forward pass is entirely defined in a single file, so the code itself is understandable and hackable.
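A hedged sketch of the first two points (the default text-generation checkpoint is an assumption; outputs will vary):

```python
import torch
from transformers import AutoModel, pipeline

# Ease of use: a state-of-the-art model, loaded and running in two lines
generator = pipeline("text-generation")
print(generator("In this course, we will")[0]["generated_text"])

# Flexibility: under the hood, every PyTorch model is a plain nn.Module
model = AutoModel.from_pretrained("bert-base-uncased")
print(isinstance(model, torch.nn.Module))  # True
```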

Repository structure

The repository directories are organized as follows:

1. Transformer models

Title | Description | Article | Notebook
1. Transformers, what can they do? | Look at what Transformer models can do and use our first tool from the 🤗 Transformers library: the pipeline() function. | Article | Open in Colab
2. How do Transformers work? | High-level look at the architecture of Transformer models. | Article | Open in Colab
3. Encoders, Decoders, and Encoder-Decoder models | Learn more about encoder, decoder, and encoder-decoder models. | Article | Open in Colab

🛑 Disclaimer ❌:

This is by no means intended to replace the original course. If you're new to Transformers and Hugging Face, it is best to work through the original first, or at least to come with some basic knowledge of Deep Learning and NLP. As explained above, these notes and snippets are a cheat sheet for refreshing your memory on the most important concepts, not a standalone course.

Original course:

  • License: The original course is released under the permissive Apache 2 license.
  • Citation:

@misc{huggingfacecourse,
  author = {Hugging Face},
  title = {The Hugging Face Course, 2022},
  howpublished = "\url{https://huggingface.co/course}",
  year = {2022},
  note = "[Online; accessed <today>]"
}
