forked from Tatoeba/tatoeba2

Official repository for main codebase for http://tatoeba.org, a multilingual sentence/translation database.


pgudlani/tatoeba2

 
 


Tatoeba

Tatoeba is a libre/free database of example sentences translated into many languages. Our goal is to create a resource for people who study languages, whether to learn them or to research them. The database is currently used:

  • As a source of example sentences by free dictionaries and language learning websites (like Jim Breen’s WWWJDIC; Jim Breen is also a member):

    • Our member CK maintains a list of free dictionary and language learning websites that use Tatoeba's corpus: http://a4esl.com/temporary/tatoeba/links.html

  • As a rich resource for language learners, who can find out how to use words or how to translate grammatical constructs and idioms.

  • For research. Example papers include:

    • Research on treebanking Japanese (Francis Bond, 栗林 孝行 [Takayuki Kuribayashi], 橋本 力 [Hashimoto Chikara] (2008) HPSGに基づくフリーな日本語ツリーバンクの構築 [A free Japanese Treebank based on HPSG]. In 14th Annual Meeting of The Association for Natural Language Processing, Tokyo),
    • Statistical machine translation (Eric Nichols, Francis Bond, Darren Scott Appling and Yuji Matsumoto (2010) Paraphrasing Training Data for Statistical Machine Translation. Journal of Natural Language Processing, 17(3), pages 101-122).

The main site currently receives about 1 million page views and 250,000 unique visitors per month, as reported by Google Analytics, and the corpus is growing steadily by 3% or more every month.
