nlp-research

Getting Started with NLP using NLTK

Reference Guide : https://nlpforhackers.io/start/

Reference Guide : https://radimrehurek.com/gensim/tutorial.html

Notebook 1 (dir: gender-classifier-names):

Gender classification of names using a Decision Tree
Resource on Normalizing Data : http://simpledatamining.blogspot.com/2015/05/how-to-deal-with-mixed-data-types-when.html
1. numerical data --> normalize
2. categorical data --> one-hot encoding
3. ordinal data --> map to integer ranks, then normalize (no one-hot encoding)
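
The three rules above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the helper names (`min_max_normalize`, `one_hot`, `encode_ordinal`) and the sample columns are hypothetical, and min-max scaling is assumed as the normalization method.

```python
def min_max_normalize(values):
    """Scale numerical values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 vectors, one column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def encode_ordinal(values, order):
    """Map ordinal values to their rank in `order`, then normalize the ranks."""
    ranks = [order.index(v) for v in values]
    return min_max_normalize(ranks)


# Hypothetical sample columns, one per data type.
ages = [20, 30, 40]                    # numerical
colors = ["red", "blue", "red"]        # categorical
sizes = ["small", "large", "medium"]   # ordinal

print(min_max_normalize(ages))
print(one_hot(colors))
print(encode_ordinal(sizes, ["small", "medium", "large"]))
```

Ordinal values keep a single column because their order is meaningful, while one-hot encoding deliberately discards any ordering between categories.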

Notebook 2 (dir: nltk-inverted-indexing):

Builds a simple inverted index over input sentences using NLTK (tokenization + stopword removal + stemming/lemmatization)
Resource on Stemming vs Lemmatization : https://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html
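
The pipeline described above (tokenize, drop stopwords, stem, index) can be sketched without any dependencies. This is an illustration under stated assumptions, not the notebook's implementation: the regex tokenizer, the tiny `STOPWORDS` set, and `crude_stem` are hypothetical stand-ins for NLTK's tokenizer, stopword list, and stemmer.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; NLTK's English list is much longer.
STOPWORDS = {"a", "an", "the", "is", "are", "on", "in", "of", "and", "to"}


def crude_stem(token):
    """Strip a few common suffixes. A stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_inverted_index(sentences):
    """Map each normalized term to the sorted ids of sentences containing it."""
    index = defaultdict(set)
    for doc_id, sentence in enumerate(sentences):
        for token in re.findall(r"[a-z]+", sentence.lower()):
            if token not in STOPWORDS:
                index[crude_stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


sentences = ["The cat is sleeping on the mat", "Cats chased the dogs"]
index = build_inverted_index(sentences)
print(index["cat"])  # sentence ids that mention "cat" or "cats"
```

Because "cat" and "cats" reduce to the same term, a lookup for either form finds both sentences, which is the reason for stemming or lemmatizing before indexing.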

Notebook 3 (dir: text-classification):

Text Classification of a News Corpus using different methods to vectorize the given input
Resource explaining the different ways to vectorize text : https://monkeylearn.com/blog/beginners-guide-text-vectorization/
Possible Representations :
1. tf-idf
2. word2vec (Not Implemented)
3. skip-thought-vectors (Not Implemented)
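
As a rough illustration of the first representation, tf-idf weights can be computed by hand. The sketch below assumes one common variant (tf = count / document length, idf = log(N / df)); libraries such as scikit-learn use a smoothed and normalized form, so their numbers will differ. The function name and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df),
    which is one common variant among several.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already tokenized news headlines.
docs = [
    ["stocks", "fall", "as", "markets", "slide"],
    ["markets", "rally", "as", "stocks", "rise"],
    ["team", "wins", "final"],
]
vectors = tf_idf(docs)
print(vectors[0])
```

Terms that appear in fewer documents, such as "fall", receive a higher weight than terms shared across documents, such as "stocks".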

Notebook 4 (dir: word2vec):

Train a custom word2vec model on the OpinRank dataset (reviews of cars and hotels) using Gensim
Reference Article : http://kavita-ganesan.com/gensim-word2vec-tutorial-starter-code/#.XBjfIWloTDc
Dataset: https://github.com/kavgan/nlp-text-mining-working-examples/tree/master/word2vec
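
Gensim's `Word2Vec` trains on an iterable of token lists, one list per sentence or review. A minimal sketch of that preparation step follows; the regex tokenizer and the sample reviews are hypothetical stand-ins, not the notebook's preprocessing or the actual dataset.

```python
import re


def tokenize_reviews(lines):
    """Yield one list of lowercase word tokens per non-empty review line."""
    for line in lines:
        tokens = re.findall(r"[a-z0-9]+", line.lower())
        if tokens:
            yield tokens


# Hypothetical sample reviews standing in for the dataset file.
reviews = [
    "Great hotel, very clean rooms!",
    "",
    "The car handles well on the highway.",
]
sentences = list(tokenize_reviews(reviews))
print(sentences[0])

# With Gensim installed, a list like this can serve as the training corpus:
#   from gensim.models import Word2Vec
#   model = Word2Vec(sentences=sentences, min_count=1)
```

Materializing the generator with `list(...)` matters here because training makes several passes over the corpus, and a plain generator can only be consumed once.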

Notebook 5 (dir: word2vec):

Train a custom word2vec model on the Amazon Reviews for Sentiment Analysis (Kaggle) dataset using Gensim
Dataset: https://www.kaggle.com/bittlingmayer/amazonreviews
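
Before training on labelled reviews, the sentiment label has to be separated from the review text. The sketch below assumes each line starts with a fastText-style `__label__<id>` prefix followed by a space; that line format is an assumption and should be checked against the downloaded files. The function name and sample line are hypothetical.

```python
def split_label(line, prefix="__label__"):
    """Split a fastText-style line into (label, text).

    Returns (None, line) when the line has no label prefix.
    """
    line = line.strip()
    if not line.startswith(prefix):
        return None, line
    head, _, text = line.partition(" ")
    return head[len(prefix):], text


# Hypothetical sample line in the assumed format.
label, text = split_label("__label__2 Great product, works as described")
print(label, "|", text)
```

Only the text part would be tokenized and passed to word2vec; the label can be kept aside for a later sentiment classifier.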


kapilkalra04/nlp-research

Folders and files

Latest commit

History

Repository files navigation

nlp-research

Getting Started with NLP using NLTK

Reference Guide : https://nlpforhackers.io/start/

Reference Guide : https://radimrehurek.com/gensim/tutorial.html

Notebook 1 (dir: gender-classifier-names):

Notebook 2 (dir: nltk-inverted-indexing):

Notebook 3 (dir: text-classification):

Notebook 4 (dir: word2vec):

Notebook 5 (dir: word2vec):

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages