Paper
2020, EMNLP, How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
Introduction
Data augmentation has been widely used in various fields, such as NLP and computer vision, to improve the performance of deep learning models. In the vision domain, task-agnostic data augmentation is not only common but also highly effective. However, in the NLP domain, augmentation techniques are often task-specific, such as back-translation for machine translation and negative sampling for question answering and document retrieval.
Main Problem
In computer vision, task-agnostic data augmentation has been found to significantly improve performance, but the effectiveness of such techniques has not been thoroughly studied in NLP. The authors choose two methods, Easy Data Augmentation (EDA) and Back-Translation, and evaluate their effects across five classification tasks and six datasets using three pre-trained transformer models (BERT, XLNet, and RoBERTa).
Illustrative Example
Back-translation example: translating "I am happy" into German and then back into English might yield paraphrases like "I feel joy" or "I am glad."
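The round trip above can be sketched with toy word-level "translation" tables; these stand in for the real machine-translation models that back-translation actually requires, and the lexicons below are invented purely for illustration.

```python
# Toy round-trip lexicons standing in for trained EN->DE and DE->EN MT models.
EN_TO_DE = {"i": "ich", "am": "bin", "happy": "glücklich"}
DE_TO_EN = {"ich": "I", "bin": "am", "glücklich": "glad"}  # lossy reverse mapping

def translate(sentence, table):
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(table.get(w.lower(), w) for w in sentence.split())

def back_translate(sentence):
    # The lossy reverse mapping is what produces a paraphrase.
    return translate(translate(sentence, EN_TO_DE), DE_TO_EN)

print(back_translate("I am happy"))  # → I am glad
```

In practice the paraphrases come from a pair of trained MT systems rather than dictionaries, which is why back-translation yields fluent rewordings instead of word swaps.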
EDA is studied comprehensively in another issue.
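For reference, EDA combines four simple word-level operations (synonym replacement, random insertion, random swap, random deletion). A minimal sketch of three of them, using a toy synonym lexicon in place of the WordNet lookups the real EDA uses, might look like:

```python
import random

# Toy lexicon; the actual EDA implementation draws synonyms from WordNet.
SYNONYMS = {"happy": ["glad", "joyful"], "film": ["movie"]}

def synonym_replacement(words, n, rng):
    words = words[:]
    candidates = [i for i, w in enumerate(words) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        words[i] = rng.choice(SYNONYMS[words[i]])
    return words

def random_swap(words, n, rng):
    words = words[:]
    for _ in range(n):
        i, j = rng.randrange(len(words)), rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    kept = [w for w in words if rng.random() > p]
    return kept or [rng.choice(words)]  # never return an empty sentence

def eda(sentence, alpha=0.1, seed=0):
    """Return one augmented variant per operation (alpha controls edit strength)."""
    rng = random.Random(seed)
    words = sentence.split()
    n = max(1, int(alpha * len(words)))
    variants = [" ".join(op(words, n, rng)) for op in (synonym_replacement, random_swap)]
    variants.append(" ".join(random_deletion(words, alpha, rng)))
    return variants
```

Each operation perturbs the surface form while (mostly) preserving the label, which is exactly the task-agnostic property the paper puts to the test.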
Input
A sentence from one of the datasets above, together with its associated task
Output
The predicted label for the associated classification task
Motivation
The authors were motivated by the widespread success of task-agnostic data augmentation in computer vision and wanted to explore whether these techniques remain effective for pre-trained transformers in NLP. While data augmentation has been shown to improve performance in non-pre-trained models, the authors aimed to determine its impact on modern pre-trained architectures.
Related works and their gaps
The paper fills this gap by examining the effectiveness of data augmentation on pre-trained transformers. Previous studies showed that data augmentation improves performance in non-pretrained models (Zhang et al., 2015; Coulombe, 2018; Wei and Zou, 2019; Yu et al., 2018), but its utility for pre-trained models was unclear. The authors evaluate these methods across several classification tasks and data regimes, revealing inconsistent benefits for pre-trained transformers (2020 arXiv, How Effecti…).
Contribution of this paper
A comprehensive empirical evaluation of two task-agnostic data augmentation techniques (Easy Data Augmentation and Back-Translation) on pre-trained transformers across six datasets and five NLP tasks.
Evidence shows that these techniques provide only marginal and inconsistent improvements for pre-trained models like BERT, XLNet, and RoBERTa, even in low-data regimes (2020 arXiv, How Effecti…).
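The low-data evaluation can be pictured as subsampling the training set into progressively smaller regimes and then expanding each regime with augmented copies. A sketch of that protocol, where the regime sizes and the `num_aug` multiplier are illustrative assumptions rather than the paper's exact settings:

```python
import random

def make_regimes(train_set, sizes=(1000, 2500), seed=42):
    """Subsample the training set to simulate low-data regimes.

    The sizes here are hypothetical, not the paper's exact splits.
    """
    rng = random.Random(seed)
    regimes = {"full": list(train_set)}
    for k in sizes:
        if k < len(train_set):
            regimes[k] = rng.sample(list(train_set), k)
    return regimes

def augment(regime, augment_fn, num_aug=2):
    """Keep each (text, label) example and add num_aug augmented copies."""
    out = []
    for text, label in regime:
        out.append((text, label))
        for _ in range(num_aug):
            out.append((augment_fn(text), label))
    return out
```

A model is then fine-tuned once per regime, with and without augmentation, so any benefit of the extra examples should be most visible in the smallest regimes.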
Proposed methods
Not included
Experiments
Three modern pre-trained transformers: BERT, XLNet, and RoBERTa.
Datasets:
SST-2 (Socher et al., 2013) - classification task: sentiment analysis
SUBJ (Pang and Lee, 2004) - classification task: subjectivity detection
RT (Pang and Lee, 2005) - classification task: sentiment analysis
MNLI (Williams et al., 2017) - classification task: natural language inference
STS-B (Baudis et al., 2016) - regression task: semantic textual similarity
TREC (Li and Roth, 2002) - classification task: question type classification
Implementation
The authors cite the third-party implementations they used but do not release their own code:
https://github.com/google-research/bert.
https://github.com/zihangdai/xlnet.
https://github.com/huggingface/transformers
https://github.com/jasonwei20/eda_nlp
Gaps this work
The task-agnostic data augmentation techniques studied here did not consistently improve the performance of pre-trained transformers across all tasks and models, suggesting a need for augmentation methods better suited to pre-trained models. In addition, models from other families (for instance, LSTMs) were not taken into account.