The IRSD dataset is available in irsd_data.zip. This version excludes any sentence pairs from Newsela-auto.
Label indices are as follows:
- identity - 0
- rephrase - 1
- syntax split - 2
- discourse split - 3
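
When reading the label column in code, the indices above can be mapped back to their names. The following is a minimal sketch; the names `IRSD_LABELS`, `IRSD_LABEL_TO_INDEX`, and `label_name` are illustrative assumptions and are not part of the dataset or any released code.

```python
# Label indices as listed above. All identifiers here are hypothetical
# helpers for illustration, not part of the IRSD release.
IRSD_LABELS = {
    0: "identity",
    1: "rephrase",
    2: "syntax split",
    3: "discourse split",
}

# Reverse lookup from label name to index.
IRSD_LABEL_TO_INDEX = {name: index for index, name in IRSD_LABELS.items()}


def label_name(index: int) -> str:
    """Return the label name for an index, raising ValueError if unknown."""
    try:
        return IRSD_LABELS[index]
    except KeyError:
        raise ValueError(f"Unknown IRSD label index: {index}") from None
```

For example, `label_name(2)` gives `"syntax split"`, and `IRSD_LABEL_TO_INDEX["rephrase"]` gives `1`.
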
For the gold classification test set (which includes some Newsela sentences), please contact the authors after obtaining a Newsela licence.
Note: If the dataset cannot be downloaded, an alternative download is available at the following link.