From e76583adb21948c5bd4b8affce85161760a49633 Mon Sep 17 00:00:00 2001 From: Ikko Eltociear Ashimine Date: Tue, 9 May 2023 18:20:45 +0900 Subject: [PATCH] Update README.md HuggingFace -> Hugging Face --- README.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 83e19b13..b9495e66 100644 --- a/README.md +++ b/README.md @@ -2,11 +2,11 @@ Hi there! -This repository contains demos I made with the [Transformers library](https://github.com/huggingface/transformers) by 🤗 HuggingFace. Currently, all of them are implemented in PyTorch. +This repository contains demos I made with the [Transformers library](https://github.com/huggingface/transformers) by 🤗 Hugging Face. Currently, all of them are implemented in PyTorch. -NOTE: if you are not familiar with HuggingFace and/or Transformers, I highly recommend to check out our [free course](https://huggingface.co/course/chapter1), which introduces you to several Transformer architectures (such as BERT, GPT-2, T5, BART, etc.), as well as an overview of the HuggingFace libraries, including [Transformers](https://github.com/huggingface/transformers), [Tokenizers](https://github.com/huggingface/tokenizers), [Datasets](https://github.com/huggingface/datasets), [Accelerate](https://github.com/huggingface/accelerate) and the [hub](https://huggingface.co/). +NOTE: if you are not familiar with Hugging Face and/or Transformers, I highly recommend to check out our [free course](https://huggingface.co/course/chapter1), which introduces you to several Transformer architectures (such as BERT, GPT-2, T5, BART, etc.), as well as an overview of the Hugging Face libraries, including [Transformers](https://github.com/huggingface/transformers), [Tokenizers](https://github.com/huggingface/tokenizers), [Datasets](https://github.com/huggingface/datasets), [Accelerate](https://github.com/huggingface/accelerate) and the [hub](https://huggingface.co/). -For an overview of the ecosystem of HuggingFace for computer vision (June 2022), refer to [this notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/HuggingFace_vision_ecosystem_overview_(June_2022).ipynb) with corresponding [video](https://www.youtube.com/watch?v=oL-xmufhZM8&t=2884s). +For an overview of the ecosystem of Hugging Face for computer vision (June 2022), refer to [this notebook](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/HuggingFace_vision_ecosystem_overview_(June_2022).ipynb) with corresponding [video](https://www.youtube.com/watch?v=oL-xmufhZM8&t=2884s). Currently, it contains the following demos: * Audio Spectrogram Transformer ([paper](https://arxiv.org/abs/2104.01778)): @@ -97,7 +97,7 @@ Currently, it contains the following demos: - performing inference with `SegformerForSemanticSegmentation` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/SegFormer/Segformer_inference_notebook.ipynb) - fine-tuning `SegformerForSemanticSegmentation` on custom data using native PyTorch [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/SegFormer/Fine_tune_SegFormer_on_custom_dataset.ipynb) * T5 ([paper](https://arxiv.org/abs/1910.10683)): - - fine-tuning `T5ForConditionalGeneration` on a Dutch summarization dataset on TPU using HuggingFace Accelerate [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/tree/master/T5) + - fine-tuning `T5ForConditionalGeneration` on a Dutch summarization dataset on TPU using Hugging Face Accelerate [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/tree/master/T5) - fine-tuning `T5ForConditionalGeneration` (CodeT5) for Ruby code summarization using PyTorch Lightning [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/T5/Fine_tune_CodeT5_for_generating_docstrings_from_Ruby_code.ipynb) * TAPAS ([paper](https://arxiv.org/abs/2004.02349)): - fine-tuning `TapasForQuestionAnswering` on the Microsoft [Sequential Question Answering (SQA)](https://www.microsoft.com/en-us/download/details.aspx?id=54253) dataset [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/TAPAS/Fine_tuning_TapasForQuestionAnswering_on_SQA.ipynb) @@ -229,7 +229,7 @@ for k,v in batch.items(): # decode the input_ids of the first example of the batch print(tokenizer.decode(batch['input_ids'][0].tolist()) ``` -- [HuggingFace Datasets](https://huggingface.co/docs/datasets/). Datasets is a library by HuggingFace that allows to easily load and process data in a very fast and memory-efficient way. It is backed by [Apache Arrow](https://arrow.apache.org/), and has cool features such as memory-mapping, which allow you to only load data into RAM when it is required. It only has deep interoperability with the [HuggingFace hub](https://huggingface.co/datasets), allowing to easily load well-known datasets as well as share your own with the community. +- [Hugging Face Datasets](https://huggingface.co/docs/datasets/). Datasets is a library by Hugging Face that allows to easily load and process data in a very fast and memory-efficient way. It is backed by [Apache Arrow](https://arrow.apache.org/), and has cool features such as memory-mapping, which allow you to only load data into RAM when it is required. It only has deep interoperability with the [Hugging Face hub](https://huggingface.co/datasets), allowing to easily load well-known datasets as well as share your own with the community. Loading a custom dataset as a Dataset object can be done as follows (you can install datasets using `pip install datasets`): ```python @@ -293,5 +293,5 @@ for epoch in range(epochs): ``` - [PyTorch Lightning (PL)](https://www.pytorchlightning.ai/). PyTorch Lightning is a framework that automates the training loop written above, by abstracting it away in a Trainer object. Users don't need to write the training loop themselves anymore, instead they can just do `trainer = Trainer()` and then `trainer.fit(model)`. The advantage is that you can start training models very quickly (hence the name lightning), as all training-related code is handled by the `Trainer` object. The disadvantage is that it may be more difficult to debug your model, as the training and evaluation is now abstracted away. -- [HuggingFace Trainer](https://huggingface.co/transformers/main_classes/trainer.html). The HuggingFace Trainer API can be seen as a framework similar to PyTorch Lightning in the sense that it also abstracts the training away using a Trainer object. However, contrary to PyTorch Lightning, it is not meant not be a general framework. Rather, it is made especially for fine-tuning Transformer-based models available in the HuggingFace Transformers library. The Trainer also has an extension called `Seq2SeqTrainer` for encoder-decoder models, such as BART, T5 and the `EncoderDecoderModel` classes. Note that all [PyTorch example scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch) of the Transformers library make use of the Trainer. -- [HuggingFace Accelerate](https://github.com/huggingface/accelerate): Accelerate is a new project, that is made for people who still want to write their own training loop (as shown above), but would like to make it work automatically irregardless of the hardware (i.e. multiple GPUs, TPU pods, mixed precision, etc.). +- [Hugging Face Trainer](https://huggingface.co/transformers/main_classes/trainer.html). The Hugging Face Trainer API can be seen as a framework similar to PyTorch Lightning in the sense that it also abstracts the training away using a Trainer object. However, contrary to PyTorch Lightning, it is not meant not be a general framework. Rather, it is made especially for fine-tuning Transformer-based models available in the Hugging Face Transformers library. The Trainer also has an extension called `Seq2SeqTrainer` for encoder-decoder models, such as BART, T5 and the `EncoderDecoderModel` classes. Note that all [PyTorch example scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch) of the Transformers library make use of the Trainer. +- [Hugging Face Accelerate](https://github.com/huggingface/accelerate): Accelerate is a new project, that is made for people who still want to write their own training loop (as shown above), but would like to make it work automatically irregardless of the hardware (i.e. multiple GPUs, TPU pods, mixed precision, etc.).