Implemented text analysis using machine learning models to classify movie review sentiments as positive or negative. Built using Python 3.6.1.

AbhinavThukral97/SentimentAnalysis

Movie Reviews Sentiment Analysis using machine learning

Implemented text analysis using machine learning models to classify movie review sentiments as positive or negative. Built using Python 3.6.1.

  1. Tuned CountVectorizer (unigrams) to select appropriate features/tokens, then transformed the reviews to obtain the input variable (a document-term matrix).
  2. Split the data into training and test sets with a test size of 20%.
  3. Trained the following models on the training data:
    • Naive Bayes
    • Logistic Regression
    • SVM (Support Vector Machine)
    • KNN (K Nearest Neighbors)
  4. Tested models on test data and calculated accuracy of predictions
  5. The results were as follows:
    • Naive Bayes: 98.9161849711%
    • Logistic Regression: 99.3497109827%
    • SVM: 99.0606936416%
    • KNN: 98.6994219653%
  6. Analysed the results further by examining the confusion matrix.
  7. Used the Naive Bayes model to observe the number of tokens (words) and the positivity/negativity associated with each word.
  8. Implemented searching for selected words in the pandas dataframe to analyse specific words in the feature set.
  9. Used the most accurate model (Logistic Regression) to train on the entire dataset.
  10. Took custom review inputs and predicted whether each was Positive or Negative.
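
Steps 1 to 6, 9 and 10 can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the toy reviews, the function names (`evaluate_models`, `train_final_model`, `predict_review`), the choice of `SVC(kernel="linear")` for the SVM, the default `KNeighborsClassifier`, and the `random_state` are all assumptions. It assumes scikit-learn is installed and that labels are 1 for positive and 0 for negative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy stand-in data (made up for this sketch); 1 = positive, 0 = negative.
reviews = [
    "I love this movie", "an awesome film", "brilliant acting and a great story",
    "I really liked it", "what a wonderful movie", "loved every minute",
    "great fun, awesome cast", "a beautiful, moving film", "I like this a lot",
    "truly excellent",
    "I hate this movie", "an awful film", "terrible acting and a boring story",
    "I really disliked it", "what a horrible movie", "hated every minute",
    "boring, stupid plot", "a dull, ugly film", "this movie sucks",
    "truly dreadful",
]
labels = [1] * 10 + [0] * 10


def evaluate_models(reviews, labels, test_size=0.2, random_state=0):
    """Vectorize, split 80/20, train four classifiers, and score each one."""
    # Step 1: unigram counts -> document-term matrix (built before the split,
    # following the order of the steps listed above).
    vectorizer = CountVectorizer(ngram_range=(1, 1))
    X = vectorizer.fit_transform(reviews)

    # Step 2: hold out 20% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=test_size, random_state=random_state, stratify=labels
    )

    # Step 3: the four model families named in the list (settings are assumed).
    models = {
        "Naive Bayes": MultinomialNB(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "SVM": SVC(kernel="linear"),
        "KNN": KNeighborsClassifier(),
    }

    # Steps 4 to 6: accuracy and confusion matrix on the held-out rows.
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "confusion": confusion_matrix(y_test, predicted, labels=[0, 1]),
        }
    return results


def train_final_model(reviews, labels):
    """Step 9: refit Logistic Regression on every available review."""
    vectorizer = CountVectorizer(ngram_range=(1, 1))
    model = LogisticRegression(max_iter=1000)
    model.fit(vectorizer.fit_transform(reviews), labels)
    return vectorizer, model


def predict_review(text, vectorizer, model):
    """Step 10: classify one custom review string."""
    label = model.predict(vectorizer.transform([text]))[0]
    return "Positive" if label == 1 else "Negative"


results = evaluate_models(reviews, labels)
for name, scores in results.items():
    print(name, "accuracy:", scores["accuracy"])

vectorizer, model = train_final_model(reviews, labels)
print(predict_review("I love this awesome movie", vectorizer, model))
```

On twenty toy sentences the printed accuracies say nothing about the figures reported above; the sketch only shows how the steps fit together.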

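The token analysis in steps 7 and 8 could look something like the sketch below. It is one possible reading of those steps, not the author's implementation: the helper names `token_table` and `lookup`, the `positive_ratio` column with its add-one smoothing, and the toy reviews are all hypothetical. It relies on `MultinomialNB.feature_count_`, whose rows follow the sorted class labels (0 for negative, 1 for positive here).

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB


def token_table(reviews, labels):
    """Build a per-token dataframe of counts in negative and positive reviews."""
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(reviews)
    nb = MultinomialNB().fit(X, labels)

    # Order tokens by their column index in the document-term matrix.
    tokens = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

    table = pd.DataFrame(
        {
            "token": tokens,
            "negative": nb.feature_count_[0],  # counts in label-0 reviews
            "positive": nb.feature_count_[1],  # counts in label-1 reviews
        }
    ).set_index("token")

    # Hypothetical score: above 1 leans positive, below 1 leans negative.
    table["positive_ratio"] = (table["positive"] + 1) / (table["negative"] + 1)
    return table


def lookup(table, words):
    """Return the rows for the selected words that exist in the vocabulary."""
    return table[table.index.isin(words)]


# Toy data made up for this sketch; 1 = positive, 0 = negative.
toy_reviews = ["I love this movie", "awesome awesome film",
               "I hate this movie", "this film sucks"]
toy_labels = [1, 1, 0, 0]

table = token_table(toy_reviews, toy_labels)
print(lookup(table, ["awesome", "hate"]))
```

Words that never appear in the training reviews are simply absent from the lookup result.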
Dataset Source (from Kaggle): https://inclass.kaggle.com/c/si650winter11/data
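
A possible way to read the training data into a pandas dataframe is sketched below. The file layout (one review per line, a 0/1 label and the review text separated by a tab, no header row) is an assumption about the download, as are the column names `label` and `text` and the helper name `load_reviews`; check the actual file before relying on it.

```python
import csv
import io

import pandas as pd


def load_reviews(source):
    """Read tab-separated `label<TAB>text` lines into a dataframe.

    `source` may be a file path or any file-like object.
    """
    return pd.read_csv(
        source,
        sep="\t",
        header=None,               # assumed: the file has no header row
        names=["label", "text"],   # hypothetical column names
        quoting=csv.QUOTE_NONE,    # treat quote characters in reviews as plain text
    )


# Two made-up lines in the assumed format, used in place of the real file.
sample = io.StringIO("1\tI love this movie\n0\tThis film was awful\n")
df = load_reviews(sample)
print(df)
```

Passing a path to the downloaded training file instead of the `StringIO` object would load the real reviews, provided the layout matches the assumption above.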
