Twitter Trends Analysis using Apache Spark on a local 2-node cluster

ashok133/Twitter-Trends-Analysis

Latest commit: 537dbff · Jan 12, 2021 (4 commits)

Twitter-Trends-Analysis

Twitter Trends Analysis using Apache Spark (PySpark) on a local 2-node cluster.

What does it do?

The notebook opens a socket stream and listens to a TCP server. The server connects to Twitter on the listener's behalf and forwards the tweets to the socket stream listener. The tweets are analysed in real time: given a trending term, the notebook scans the tweet stream and counts the number of occurrences of the term in each minute.
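The per-minute count described above could be sketched roughly as follows. This is not the notebook's code: it assumes the legacy PySpark DStream API (`pyspark.streaming`), a 60-second batch interval, and a server that sends one tweet per line. The names `count_occurrences` and `run_stream`, the host, and port 5555 are hypothetical choices for illustration.

```python
from operator import add


def count_occurrences(term: str, text: str) -> int:
    """Case-insensitive, non-overlapping count of `term` in `text`."""
    if not term:
        return 0
    return text.lower().count(term.lower())


def run_stream(term: str, host: str = "localhost", port: int = 5555) -> None:
    """Print, for each one-minute batch, how often `term` appeared.

    Host and port are assumptions; use whatever the TCP server listens on.
    """
    # Imported here so the helper above stays usable without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="TwitterTrendsSketch")
    ssc = StreamingContext(sc, 60)  # batch interval in seconds

    # Each newline-terminated line received from the server is one element.
    tweets = ssc.socketTextStream(host, port)

    # Sum the occurrences over every tweet that arrived in the batch.
    per_minute = tweets.map(lambda t: count_occurrences(term, t)).reduce(add)
    per_minute.pprint()  # a batch without tweets yields no count

    ssc.start()
    ssc.awaitTermination()
```

Calling `run_stream("spark")` while the server is running would print one total per minute. Matching is by substring, so the term "spark" is also counted inside "PySpark"; a word-boundary match may be preferable for short terms.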

  1. Jupyter notebook - twitter_feed_bda.ipynb
  2. Server broker - tweetread.py
  3. Scoured data - tweet_count.csv
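To make the broker's role concrete, here is a minimal sketch of a TCP server that forwards tweets to a single connected listener, one tweet per line. It is not the contents of `tweetread.py`: the Twitter connection is replaced by a plain iterable of strings, and `frame_tweet`, `serve_tweets`, the host, and port 5555 are hypothetical.

```python
import socket
from typing import Iterable


def frame_tweet(text: str) -> bytes:
    """Collapse whitespace so one tweet is exactly one UTF-8 line."""
    return (" ".join(text.split()) + "\n").encode("utf-8")


def serve_tweets(tweets: Iterable[str], host: str = "127.0.0.1",
                 port: int = 5555) -> None:
    """Accept one listener and send it every tweet from `tweets`.

    `tweets` stands in for whatever source the real broker reads from.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        conn, _addr = server.accept()  # blocks until the listener connects
        with conn:
            for text in tweets:
                conn.sendall(frame_tweet(text))
```

Line framing matters because a socket text stream splits its input on newlines; a tweet containing a raw line break would otherwise arrive as two separate records.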

Help guides:

  1. PySpark installation
  2. Configuring PySpark and IPython notebooks

The rest is self-explanatory.
