- Python 3.6
- To install the required pip modules, run: pip install -r requirements.txt (or: pip install nltk scikit-learn scipy)
This sentiment analyzer uses 5 classifiers:
- Naive Bayes Classifier
- Multinomial Naive Bayes Classifier
- Bernoulli Naive Bayes Classifier
- Logistic Regression Classifier
- Linear SVC Classifier
All classifiers were trained on 10,000 short movie reviews, each labeled either positive (pos) or negative (neg). Their votes are combined in a custom Voted Classifier (VCLF), which classifies a given string according to the votes of the 5 trained classifiers.
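
The voting idea described above can be sketched as follows. This is a minimal illustration, not necessarily how this repository implements it: the class names `VotedClassifier` and `FixedClassifier` are hypothetical, and the sketch only assumes that each trained classifier exposes a `classify(features)` method. The confidence value here is simply the share of classifiers that agreed with the winning label.

```python
from collections import Counter


class VotedClassifier:
    """Majority vote over classifiers that expose classify(features).

    Hypothetical sketch; the repository's own class may differ.
    """

    def __init__(self, *classifiers):
        self._classifiers = classifiers

    def classify(self, features):
        # Collect one vote (a label such as "pos" or "neg") per classifier.
        votes = [clf.classify(features) for clf in self._classifiers]
        label, count = Counter(votes).most_common(1)[0]
        # Confidence: fraction of classifiers that voted for the winner.
        return label, count / len(votes)


class FixedClassifier:
    """Hypothetical stand-in for a trained classifier; always returns one label."""

    def __init__(self, label):
        self._label = label

    def classify(self, features):
        return self._label


# Five stand-in classifiers: four vote "pos", one votes "neg".
vclf = VotedClassifier(
    FixedClassifier("pos"),
    FixedClassifier("pos"),
    FixedClassifier("neg"),
    FixedClassifier("pos"),
    FixedClassifier("pos"),
)
print(vclf.classify({"great": True}))  # ('pos', 0.8)
```

With an odd number of classifiers and two labels, a vote can never end in a tie, which is one reason to combine 5 classifiers rather than 4.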
To use this classifier yourself, follow these steps:
- git clone https://github.com/DanielPerezJensen/sentiment-analyzer.git
- cd sentiment-analyzer
- If you want to train the classifiers on other data:
  - Put positive reviews in "sentiment_data/positive" and negative reviews in "sentiment_data/negative". Put each review on its own line.
  - python3.6 train_clf.py
  - If you get an error, create these directories: pickled/data and pickled/algorithms
- Now you can import sentiment.py in files that are in the sentiment-analyzer directory.
- Usage:
> from sentiment import sentiment
> print(sentiment("This movie was great"))
> pos 1
> print(sentiment("I have had a really bad afternoon to be honest"))
> neg 1
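
If training fails because the pickle directories are missing, they can also be created from Python instead of by hand. The sketch below is only an illustration: the helper name `ensure_pickle_dirs` is hypothetical and not part of this repository; the two directory names are the ones listed in the training step above.

```python
import os

# Directory names taken from the training step above.
PICKLE_DIRS = ("pickled/data", "pickled/algorithms")


def ensure_pickle_dirs(base="."):
    """Create the pickle directories under `base` if they do not exist yet."""
    created = []
    for rel in PICKLE_DIRS:
        path = os.path.join(base, *rel.split("/"))
        # exist_ok=True makes this safe to call repeatedly.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_pickle_dirs()` from the sentiment-analyzer directory before running train_clf.py should be enough, assuming the script writes its pickles relative to that directory.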