mdschramm/Machine-Translation
This is the data/script package for Homework 6 of CS 124 at Stanford.

The package contains bitext data for two language pairs, French-English (fr-en) and Spanish-English (es-en). Training data are selected from the Europarl-v7 corpus, and dev/test data are from its corresponding development set.

All data have been tokenized so that each line is one sentence and all tokens are separated by spaces. Special characters such as apostrophes are escaped in a form similar to HTML special characters ('). Lines with the same line number, counted from the beginning of the files of corresponding languages, are source/target sentence pairs (i.e. a source sentence and its translation).

Baseline test: BLEU-1 score: 43.107084, BLEU-2 score: 6.324437

Baseline dev: BLEU-1 score: 42.554177, BLEU-2 score: 6.322566
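The data format described above (one sentence per line, space-separated tokens, HTML-style escapes, pairs aligned by line number) can be read with a few lines of Python. The following is a minimal sketch, not the homework's own loader: the function names `parse_line`, `read_bitext`, and `load_bitext` are hypothetical, the file paths are left as parameters because the package's file names are not listed here, and it assumes the files are UTF-8 and that the standard `html.unescape` covers the escapes used in the data.

```python
import html
from typing import Iterable, Iterator, List, Tuple

SentencePair = Tuple[List[str], List[str]]


def parse_line(line: str) -> List[str]:
    """Split one tokenized sentence on whitespace and undo HTML-style escapes."""
    # Assumption: escapes follow HTML entity syntax, so html.unescape applies.
    return [html.unescape(token) for token in line.split()]


def read_bitext(source_lines: Iterable[str],
                target_lines: Iterable[str]) -> Iterator[SentencePair]:
    """Yield (source_tokens, target_tokens) for lines with the same line number."""
    for source, target in zip(source_lines, target_lines):
        yield parse_line(source), parse_line(target)


def load_bitext(source_path: str, target_path: str) -> List[SentencePair]:
    """Read two line-aligned files (hypothetical paths) into sentence pairs."""
    # Assumption: both files are UTF-8 encoded.
    with open(source_path, encoding="utf-8") as src, \
            open(target_path, encoding="utf-8") as tgt:
        return list(read_bitext(src, tgt))
```

`read_bitext` accepts any iterables of strings, so it can be tried on in-memory lists before pointing `load_bitext` at the actual training or dev files. Note that `zip` stops at the shorter input, so a length mismatch between the two files would go unnoticed unless checked separately.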