eglantine/FrequencyWords
FrequencyWords

Repository for Frequency Word List Generator and processed files

In the early days, I hosted the generated files on OneDrive and linked to them from my blog, https://invokeit.wordpress.com/frequency-word-lists/. Going forward, both the code and the generated output are hosted on GitHub.

OpenSubtitles tokenized source

The data used to generate these lists can be found at http://opus.lingfil.uu.se/OpenSubtitles2016.php

Format of the frequency lists:

word1 number1 (number1 is the number of occurrences of word1 across all files)

word2 number2 (number2 is the number of occurrences of word2 across all files)
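
For readers who want to consume the lists, the format above can be read with a few lines of code. The following is a minimal sketch in Python, not part of this repository's generator. It assumes each line holds a word and a count separated by a single space, as in the format shown, and it skips lines that do not match. The function name `parse_frequency_lines` and the sample counts are hypothetical and used only for illustration.

```python
from typing import Dict, Iterable


def parse_frequency_lines(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'word count' lines into a dict mapping word -> count."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space so the count is always the final field.
        word, _, number = line.rpartition(" ")
        if not word or not number.isdigit():
            # Assumption: lines that do not match the format are ignored.
            continue
        counts[word] = int(number)
    return counts


# Made-up sample data in the documented "word number" format.
sample = ["you 100", "the 75", "", "to 50"]
freq = parse_frequency_lines(sample)

# The two most frequent words in the sample.
top = sorted(freq.items(), key=lambda kv: kv[1], reverse=True)[:2]
print(top)
```

Because the function accepts any iterable of strings, an open file handle for one of the generated lists can be passed in directly in place of `sample`.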

Support

If you would like to contribute to this project, you can donate using the PayPal button.

paypal

Languages

  • C# 100.0%