Backtranslations of IMDB movie reviews for Data Augmentation Purposes

sshleifer/backtranslated-imdb

text-augmentation

Backtranslated IMDB movie reviews. Each directory is named imdb_{language_code} and mirrors the directory structure of the original IMDB dataset.
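As a rough illustration of how one of these directories could be consumed, here is a minimal sketch of a loader. It assumes the layout of the original Large Movie Review Dataset, i.e. {split}/pos/*.txt and {split}/neg/*.txt beneath each imdb_{language_code} directory; the function name load_reviews is hypothetical and not part of this repository.

```python
from pathlib import Path


def load_reviews(root, split="train"):
    """Return (text, label) pairs, with label 1 for pos and 0 for neg.

    Assumes root/<split>/pos/*.txt and root/<split>/neg/*.txt, as in the
    original IMDB dataset. Adjust if a directory is laid out differently.
    """
    examples = []
    for label_name, label in (("neg", 0), ("pos", 1)):
        label_dir = Path(root) / split / label_name
        # sorted() keeps the order stable across runs and file systems.
        for path in sorted(label_dir.glob("*.txt")):
            examples.append((path.read_text(encoding="utf-8"), label))
    return examples
```

For augmentation, the original and backtranslated examples can simply be concatenated, for instance load_reviews("imdb") + load_reviews("imdb_it"), assuming both directories exist locally.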

Backtranslating movie reviews to more languages

To backtranslate the training data through Italian, run:

python cache_backtranslations.py --imdb_dir imdb/train/ --target_language it
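To cover several languages in one go, the same command can be driven from a small wrapper. This is only a sketch: it reuses the two flags shown above and nothing else, the helper names build_command and backtranslate_all are hypothetical, and whether the script accepts any language code other than "it" is an assumption.

```python
import subprocess
import sys


def build_command(target_language, imdb_dir="imdb/train/"):
    """Build the argument list for one cache_backtranslations.py run."""
    return [
        sys.executable, "cache_backtranslations.py",
        "--imdb_dir", imdb_dir,
        "--target_language", target_language,
    ]


def backtranslate_all(languages, imdb_dir="imdb/train/"):
    """Run the script once per language code, one after another."""
    for lang in languages:
        # check=True raises CalledProcessError if a run exits with a non-zero status.
        subprocess.run(build_command(lang, imdb_dir), check=True)


# Example (the codes other than "it" are illustrative assumptions):
# backtranslate_all(["it", "es", "fr"])
```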

Backtranslating other text

To backtranslate text other than the IMDB reviews, modify cache_backtranslations.py so that it reads from and writes to your own paths.
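The kind of change involved might look like the sketch below, which walks an arbitrary directory of .txt files and writes each backtranslation to the same relative path under a new directory. It is not the code in cache_backtranslations.py: the function backtranslate_dir is hypothetical, and translate is a caller-supplied placeholder standing in for whatever translation service or model is used.

```python
from pathlib import Path


def backtranslate_dir(src_dir, dst_dir, translate, pivot="it", source="en"):
    """Backtranslate every .txt file under src_dir into dst_dir.

    translate(text, src_lang, dst_lang) -> str is supplied by the caller;
    it is a placeholder, not an API provided by this repository.
    Returns the list of files written during this call.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    written = []
    for path in sorted(src_dir.rglob("*.txt")):
        out_path = dst_dir / path.relative_to(src_dir)
        if out_path.exists():
            continue  # already cached by an earlier run
        text = path.read_text(encoding="utf-8")
        pivot_text = translate(text, source, pivot)  # e.g. en -> it
        back = translate(pivot_text, pivot, source)  # e.g. it -> en
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(back, encoding="utf-8")
        written.append(out_path)
    return written
```

Skipping files that already exist makes the run resumable, which matters when the translation backend is slow or rate limited.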
