2020, EMNLP, How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers? #86

Sepideh-Ahmadian opened this issue Sep 26, 2024
Paper
How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?

Introduction
Data augmentation has been widely used in various fields, such as NLP and computer vision, to improve the performance of deep learning models. In the vision domain, task-agnostic data augmentation is not only common but also highly effective. However, in the NLP domain, augmentation techniques are often task-specific, such as back-translation for machine translation and negative sampling for question answering and document retrieval.

Main Problem
In computer vision, task-agnostic data augmentation has been found to significantly improve performance, but the effectiveness of task-agnostic augmentation techniques has not been thoroughly studied in NLP. The authors choose two methods, Easy Data Augmentation (EDA) and back-translation, and evaluate their effect on five classification tasks over six datasets using three pre-trained transformer models (BERT, XLNet, and RoBERTa).

Illustrative Example
Back-translation example: translating "I am happy" to German and back into English might yield paraphrases such as "I feel joy" or "I am glad."
EDA is covered comprehensively in a separate issue.
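
For illustration, below is a minimal back-translation sketch built on the Hugging Face MarianMT checkpoints (Helsinki-NLP/opus-mt-en-de and opus-mt-de-en). The checkpoint names and the translate/back_translate helpers are assumptions made for this example; the paper does not prescribe a specific translation system.

```python
# Minimal back-translation sketch (assumed MarianMT checkpoints, not the paper's exact system).
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    # Translate a batch of sentences with a pretrained MarianMT model.
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

def back_translate(texts):
    # English -> German -> English; the round trip yields paraphrases of the input.
    german = translate(texts, "Helsinki-NLP/opus-mt-en-de")
    return translate(german, "Helsinki-NLP/opus-mt-de-en")

print(back_translate(["I am happy."]))  # e.g. "I'm happy." or a similar paraphrase
```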

Input
A sentence from one of the datasets listed below, together with its associated task

Output
The predicted label for the corresponding classification task

Motivation
The authors were motivated by the widespread success of task-agnostic data augmentation in computer vision and wanted to explore whether these techniques remain effective for pre-trained transformers in NLP. While data augmentation has been shown to improve performance in non-pre-trained models, the authors aimed to determine its impact on modern pre-trained architectures.

Related works and their gaps
The paper fills the gap by examining the effectiveness of data augmentation on pre-trained transformers. Previous studies showed that data augmentation improves performance in non-pretrained models (Zhang et al., 2015; Coulombe, 2018; Wei and Zou, 2019; Yu et al., 2018), but its utility for pre-trained models was unclear. The authors evaluate these methods across several classification tasks and data regimes, revealing inconsistent benefits for pre-trained transformers.

Contribution of this paper
A comprehensive empirical evaluation of two task-agnostic data augmentation techniques (Easy Data Augmentation and back-translation) on pre-trained transformers across 6 datasets and 5 NLP tasks.
Evidence that these techniques provide only marginal and inconsistent improvements for pre-trained models such as BERT, XLNet, and RoBERTa, even in low-data regimes.

Proposed methods
Not applicable; the paper does not propose a new method but empirically evaluates existing task-agnostic augmentation techniques.

Experiments
Three pre-trained transformer models: BERT, XLNet, and RoBERTa (a sketch of the comparison protocol follows the dataset list below).
Datasets:
SST-2 (Socher et al., 2013) - classification task: sentiment analysis
SUBJ (Pang and Lee, 2004) - classification task: subjectivity detection
RT (Pang and Lee, 2005) - classification task: sentiment analysis
MNLI (Williams et al., 2017) - classification task: natural language inference
STS-B (Baudis et al., 2016) - classification task: semantic similarity
TREC (Li and Roth, 2002) - classification task: question type classification
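
To make the evaluation protocol concrete, here is a hedged sketch of the low-data-regime comparison, assuming SST-2 from GLUE and bert-base-uncased; the subset sizes, hyperparameters, and the augment() placeholder (which would be replaced by EDA or back-translation) are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: fine-tune a pre-trained transformer on an original vs. augmented
# training subset and compare dev accuracy. Assumed setup, not the paper's code.
import numpy as np
from datasets import Dataset, load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def encode(texts, labels):
    # Tokenize raw sentences into a Dataset the Trainer can consume.
    ds = Dataset.from_dict({"text": texts, "label": labels})
    return ds.map(lambda x: tokenizer(x["text"], truncation=True,
                                      padding="max_length", max_length=128))

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

def augment(texts, labels):
    # Placeholder: substitute EDA or the back_translate() sketch above here.
    return list(texts), list(labels)

def finetune_and_eval(train_texts, train_labels, dev_ds):
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)
    args = TrainingArguments(output_dir="out", num_train_epochs=3,
                             per_device_train_batch_size=16, seed=0, report_to=[])
    trainer = Trainer(model=model, args=args,
                      train_dataset=encode(train_texts, train_labels),
                      eval_dataset=dev_ds, compute_metrics=accuracy)
    trainer.train()
    return trainer.evaluate()["eval_accuracy"]

sst2 = load_dataset("glue", "sst2")
dev = encode(sst2["validation"]["sentence"], sst2["validation"]["label"])

for n in (500, 2000):  # simulated low-data regimes (sizes are assumptions)
    subset = sst2["train"].shuffle(seed=0).select(range(n))
    texts, labels = subset["sentence"], subset["label"]
    extra_texts, extra_labels = augment(texts, labels)
    base = finetune_and_eval(texts, labels, dev)
    aug = finetune_and_eval(texts + extra_texts, labels + extra_labels, dev)
    print(f"n={n}  baseline={base:.3f}  augmented={aug:.3f}")
```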

Implementation
The authors point to the third-party code bases they built on but do not release their own implementation (a minimal EDA-style sketch follows the links below):
https://github.com/google-research/bert.
https://github.com/zihangdai/xlnet.
https://github.com/huggingface/transformers
https://github.com/jasonwei20/eda_nlp
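
As a companion to the linked eda_nlp repository, here is a minimal sketch of one EDA operation (synonym replacement) using NLTK's WordNet; it is an illustrative re-implementation under that assumption, not the repository's code or the authors' setup.

```python
# Minimal sketch of EDA-style synonym replacement (illustrative, not eda_nlp's code).
# Requires: pip install nltk; python -c "import nltk; nltk.download('wordnet')"
import random
from nltk.corpus import wordnet

def synonym_replacement(sentence, n=1, seed=0):
    # Replace up to n words that have WordNet synonyms with a random synonym.
    random.seed(seed)
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if wordnet.synsets(w)]
    random.shuffle(candidates)
    for i in candidates[:n]:
        synonyms = {l.name().replace("_", " ")
                    for s in wordnet.synsets(words[i]) for l in s.lemmas()}
        synonyms.discard(words[i])
        if synonyms:
            words[i] = random.choice(sorted(synonyms))
    return " ".join(words)

print(synonym_replacement("the film was a delightful surprise", n=2))
```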

Gaps of this work
The task-agnostic augmentation techniques studied here did not consistently improve pre-trained transformers across tasks and models, which suggests a need for augmentation methods that are better matched to pre-trained models. In addition, models from other architecture families (e.g., LSTMs) are not considered.
