Paraphrase Adversaries from Word Scrambling

[Original Paper] [Original Dataset] [Dataset Download]

This dataset contains 100k human-labeled pairs that highlight the importance of modeling structure, context, and word order for paraphrase identification.

All translated pairs are sourced from examples in PAWS-Wiki.

  • PAWS-Wiki Labeled (Final): contains pairs generated by both the word swapping and back translation methods. All pairs have human judgements on both paraphrasing and fluency, and they are split into Train/Dev/Test sections.

  • PAWS-Wiki Labeled (Swap-only): contains pairs that have no back translation counterparts and are therefore not included in the first set. They are nevertheless high-quality pairs with human judgements on both paraphrasing and fluency, and they can be used as an auxiliary training set.
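As a rough illustration of how the two sets might be combined, the sketch below reads tab-separated pairs and appends the Swap-only pairs to the Final training split. It assumes each file has a header row with the columns `id`, `sentence1`, `sentence2`, and `label`, and that a label of 1 marks a paraphrase; check both against the downloaded files. The names `Pair`, `read_pairs`, and `build_training_set` are hypothetical helpers, not part of this repository.

```python
import csv
from typing import Iterable, List, NamedTuple


class Pair(NamedTuple):
    sentence1: str
    sentence2: str
    label: int  # assumed encoding: 1 = paraphrase, 0 = not a paraphrase


def read_pairs(lines: Iterable[str]) -> List[Pair]:
    """Parse tab-separated rows with an assumed id/sentence1/sentence2/label header."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        Pair(row["sentence1"], row["sentence2"], int(row["label"]))
        for row in reader
    ]


def build_training_set(final_train: List[Pair], swap_only: List[Pair]) -> List[Pair]:
    """Use the Swap-only pairs as auxiliary training data next to the Final train split."""
    return list(final_train) + list(swap_only)
```

An open file handle can be passed directly as `lines`, for example `read_pairs(open(path, encoding="utf-8", newline=""))`, where `path` points at whichever split was downloaded. Dev and Test splits should be left untouched so evaluation stays comparable.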

The pairs were translated into Indonesian using the Google Translate API. The translation script is included.
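The included script is the reference for how the translation was actually done. Purely as a sketch of the general idea, the translation step can be written as a loop that sends both sentences of each pair through a translation function and keeps the label unchanged. Here `translate` is a placeholder callable that the caller supplies (for example, a thin wrapper around a Google Translate client); `translate_pairs` and the caching behaviour are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

Row = Tuple[str, str, int]  # (sentence1, sentence2, label)


def translate_pairs(
    rows: Iterable[Row],
    translate: Callable[[str], str],
) -> Iterator[Row]:
    """Translate both sentences of every pair; labels are carried over as-is."""
    cache: Dict[str, str] = {}  # avoids re-translating sentences that repeat

    def cached(text: str) -> str:
        if text not in cache:
            cache[text] = translate(text)
        return cache[text]

    for sentence1, sentence2, label in rows:
        yield cached(sentence1), cached(sentence2), label
```

Carrying labels over unchanged assumes that translation preserves whether two sentences are paraphrases, which machine translation does not guarantee for every pair.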

Citation

@misc{zhang2019paws,
      title={PAWS: Paraphrase Adversaries from Word Scrambling}, 
      author={Yuan Zhang and Jason Baldridge and Luheng He},
      year={2019},
      eprint={1904.01130},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}