huggingface上的nlp课程的notebook
model | examlpes | task |
---|---|---|
encoder | ALBERT, BERT, DistilBERT, ELECTRA, RoBERTa | Sentence classification, named entity recognition, extractive question answering |
decoder | CTRL, GPT, GPT-2, Transformer XL | Text generation |
encoder-decoder | BART, T5, Marian, mBART | Summarization, translation, generative question answering |
https://huggingface.co/learn/nlp-course/chapter2/4?fw=pt
tokenizer | examlpes | tokenization method |
---|---|---|
Byte-level BPE | as used in GPT-2 | Subword tokenization |
WordPiece | as used in BERT | Subword tokenization |
SentencePiece or Unigram | as used in several multilingual models | Subword tokenization |