v0.9.0
Fixed a Hindi tokenization issue: \u200D (zero-width joiner) no longer breaks a word. Example word: http://unicode.scarfboy.com/?s=%E0%A4%B8%E0%A4%A8%E0%A5%8D%E2%80%8D%E0%A4%A4%E0%A4%BE%E0%A4%A8
Extracted the Occurrences functions into a separate file for better organization.
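
The \u200D behavior can be illustrated with a minimal Python sketch. This is not the project's actual tokenizer: the names `tokenize` and `is_word_char` are hypothetical, and the rule that letters, combining marks, and numbers make up a word is an assumption chosen for illustration. The sample word is the one encoded in the URL above.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER (Unicode category Cf)


def is_word_char(ch: str, keep_zwj: bool = True) -> bool:
    """Return True if ch should stay inside a token (hypothetical rule)."""
    if ch == ZWJ:
        return keep_zwj
    # Letters (L*), combining marks (M*), and numbers (N*) are word-internal.
    # Marks must be included, or Devanagari vowel signs and the virama
    # would split words on their own.
    return unicodedata.category(ch)[0] in ("L", "M", "N")


def tokenize(text: str, keep_zwj: bool = True) -> list[str]:
    """Split text into runs of word characters."""
    tokens: list[str] = []
    current: list[str] = []
    for ch in text:
        if is_word_char(ch, keep_zwj):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# The word from the URL: SA, NA, VIRAMA, ZWJ, TA, AA, NA
word = "\u0938\u0928\u094d\u200d\u0924\u093e\u0928"

print(len(tokenize(word)))                  # 1: the ZWJ stays inside the word
print(len(tokenize(word, keep_zwj=False)))  # 2: the ZWJ acts as a separator
```

With `keep_zwj=False` the sketch reproduces the kind of split described in the entry, with the word breaking into two tokens at the joiner; with the default, the word comes back as a single token.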