Twitter Semisupervised Sentiment Analysis

This repository was created for an NLP course project (April 2018).

Dependencies:

Dataset:

Semi-supervised sentiment analysis of the tweets of a single Twitter account over time.

Steps:

Collect all tweets of an account in a JSON file, with each tweet in the following format:

{
  "source": "Twitter for iPhone",
  "text": "Some text",
  "created_at": "Sun Jul 08 21:58:52 +0000 2018",
  "retweet_count": 64399,
  "favorite_count": 183994,
  "is_retweet": false,
  "id_str": "1016079192604139520"
}
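
As a minimal sketch of reading such a file, the snippet below parses tweet records and converts `created_at` into a timezone-aware `datetime`, which is convenient for grouping by time later. It assumes the file holds a JSON array of objects shaped like the one above; the inline `raw` string and the names `parse_tweets` and `created_dt` are hypothetical and not part of this repository.

```python
import json
from datetime import datetime

# Twitter's classic timestamp layout, e.g. "Sun Jul 08 21:58:52 +0000 2018".
# Note: %a and %b are locale-dependent; this assumes an English/C locale.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical stand-in for the contents of the JSON file (assumed to be an array).
raw = """
[
  {
    "source": "Twitter for iPhone",
    "text": "Some text",
    "created_at": "Sun Jul 08 21:58:52 +0000 2018",
    "retweet_count": 64399,
    "favorite_count": 183994,
    "is_retweet": false,
    "id_str": "1016079192604139520"
  }
]
"""

def parse_tweets(json_text):
    """Load tweet records and attach a parsed datetime under 'created_dt'."""
    tweets = json.loads(json_text)
    for tweet in tweets:
        tweet["created_dt"] = datetime.strptime(
            tweet["created_at"], TWITTER_TIME_FORMAT
        )
    return tweets

tweets = parse_tweets(raw)
# Drop retweets so that only the account's own wording is analysed
# (an assumption; the project may treat retweets differently).
own_tweets = [t for t in tweets if not t["is_retweet"]]
```

Keeping `id_str` as a string, as in the format above, avoids precision loss when the IDs pass through tools that treat large integers as floats.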

Use NLTK for tokenization and lemmatization.
Score each word using the AFINN word list, which rates words from -5 (very negative) to +5 (very positive).
Use scikit-learn to calculate precision, recall, and F1-score.
Use Matplotlib to plot a histogram of the sentiment scores over time.
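
The lexicon scoring step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: a regex tokenizer stands in for NLTK, `TOY_LEXICON` is a hypothetical four-word stand-in for the AFINN word list (the real scores should be read from the AFINN file), and the `month` key and all function names are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon in the AFINN style (integer scores in [-5, +5]).
# Illustrative values only; load the real AFINN word list in practice.
TOY_LEXICON = {"good": 3, "love": 3, "bad": -3, "hate": -3}

def tokenize(text):
    """Very rough tokenizer; a stand-in for NLTK tokenization and lemmatization."""
    return re.findall(r"[a-z']+", text.lower())

def score_text(text, lexicon):
    """Sum the lexicon scores of all tokens; unknown words count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))

def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def mean_score_by_month(tweets, lexicon):
    """Average the tweet scores per month, e.g. as input for a plot over time."""
    buckets = defaultdict(list)
    for tweet in tweets:
        buckets[tweet["month"]].append(score_text(tweet["text"], lexicon))
    return {month: sum(s) / len(s) for month, s in sorted(buckets.items())}

# Hypothetical sample data; 'month' would be derived from 'created_at'.
sample = [
    {"month": "2018-07", "text": "I love this, so good"},
    {"month": "2018-07", "text": "Nothing special"},
    {"month": "2018-06", "text": "I hate bad weather"},
]
monthly = mean_score_by_month(sample, TOY_LEXICON)
labels = [label(score_text(t["text"], TOY_LEXICON)) for t in sample]
```

The resulting labels could then be compared against a hand-labelled subset with scikit-learn's metrics, and the per-month averages passed to Matplotlib for plotting.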